GROK Help with timestamp

Hello,
I have the sample data below:

10.10.0.1 - - [20/May/2025:13:02:52 +0000] "GET /website/redirect/2df4e11a-7e8f-4927-938b-7adfcc40f566 HTTP/1.1" 302 -

I cannot get any of the timestamp patterns I know to work for this date/time.
Can someone help me with that?

My pattern is as follows:

%{IPORHOST:ip_address} %{NOTSPACE:aa} %{NOTSPACE:aa} \[%{DATA:abc}\] %{GREEDYDATA:message}

How can I parse a timestamp in this date/time format?

You could try

    grok { match => { "message" => "%{HTTPD_COMMONLOG}" } }
    date { match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] }

which will produce

       "source" => {
        "address" => "10.111.158.158"
    },
    "timestamp" => "20/May/2025:13:02:52 +0000",
          "url" => {
        "original" => "/website/redirect/2df4e11a-7e8f-4927-938b-7adfcc40f566"
    },
         "http" => {
        "response" => {
            "status_code" => 302
        },
         "version" => "1.1",
         "request" => {
            "method" => "GET"
        }
    }

If you don't like those names then you could use mutate+rename to change them, or use a custom grok pattern based on the bundled HTTPD_COMMONLOG pattern.
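As a minimal sketch of the mutate+rename option, assuming the ECS-style field names shown in the output above: the target names `client_ip` and `status` are hypothetical and only there for illustration.

```
filter {
    grok { match => { "message" => "%{HTTPD_COMMONLOG}" } }
    date { match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] }

    # Rename the nested ECS-style fields to flat names.
    # "client_ip" and "status" are hypothetical names, pick your own.
    mutate {
        rename => {
            "[source][address]" => "client_ip"
            "[http][response][status_code]" => "status"
        }
    }
}
```

The rename runs after grok, so it only touches events where the pattern matched and the source fields exist.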

Many thanks for that.
I had seen some grok patterns for HTTP logs but did not think to use them.

But back to my question:
Is there a timestamp pattern that works with this date? I tested all the ones I know and none of them worked for me.

I do not understand what question you are asking.

I meant that I need help creating a grok pattern for this unusual date:
[20/May/2025:13:02:52 +0000]

I tried many of the timestamp patterns without success. Is it possible to create a grok for the year, month, etc.?

I implemented the grok as you sent it and it basically works fine,
but I have servers with a different log format, where the date comes first:
[20/May/2025:13:02:52 +0000] 10.10.0.1 TLSv1.2 ECDHE-RSA "POST /service/modell/ExternalMode HTTP/1.1" 3188

So I mean: date, IP, greedydata

and the grok pattern with:

    grok { match => { "message" => "%{HTTPD_COMMONLOG}" } }
    date { match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] }

does not work with such logs.

I think I will have to implement several groks for them, I mean:

grok { ............}
or
grok { ............}

grok can match against a list of patterns. If HTTPD_COMMONLOG does not fit then you can rearrange the parts of it, and include new patterns to match your other log format.

    grok {
        match => {
            "message" => [
                "%{HTTPD_COMMONLOG}",
                '^\[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{NOTSPACE:tlsversion} %{NOTSPACE:cipher} "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:bytes})'
            ]
        }
    }

You may want to model the second pattern on the ECS-compatible HTTPD_COMMONLOG rather than the legacy version, so that you get field names like [source][address] instead of [clientip].
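A rough sketch of what an ECS-style version of the second pattern might look like, followed by the date filter. The [source][address], [http] and [url] names follow the output shown earlier; [tls][version], [tls][cipher] and [http][response][body][bytes] are assumptions about what suits your data, not something taken from a bundled pattern for this log format.

```
filter {
    grok {
        match => {
            "message" => [
                "%{HTTPD_COMMONLOG}",
                # Date-first format; the [tls] and [body][bytes] field names are assumptions.
                '^\[%{HTTPDATE:timestamp}\] %{IPORHOST:[source][address]} %{NOTSPACE:[tls][version]} %{NOTSPACE:[tls][cipher]} "(?:%{WORD:[http][request][method]} %{NOTSPACE:[url][original]}(?: HTTP/%{NUMBER:[http][version]})?|%{DATA:rawrequest})" (?:-|%{NUMBER:[http][response][body][bytes]})'
            ]
        }
    }
    # Both patterns write to [timestamp], so one date filter covers both log formats.
    date { match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] }
}
```

Because both patterns store the date in the same field, everything downstream of grok can stay the same regardless of which format an event arrived in.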

If you want the individual parts of the date you could override the bundled HTTPDATE pattern with something like

        pattern_definitions => {
            "HTTPDATE" => "%{MONTHDAY:[ts][day]}/%{MONTH:[ts][month]}/%{YEAR:[ts][year]}:%{TIME:[ts][time]} %{INT:[ts][tz]}"
        }
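For context, here is a sketch of where that override would sit, assuming a simplified date-first pattern. The field name `rest` is hypothetical, and the [ts] names are the ones from the definition above.

```
filter {
    grok {
        # Override the bundled HTTPDATE for this grok filter only,
        # so the parts of the date are captured as separate fields.
        pattern_definitions => {
            "HTTPDATE" => "%{MONTHDAY:[ts][day]}/%{MONTH:[ts][month]}/%{YEAR:[ts][year]}:%{TIME:[ts][time]} %{INT:[ts][tz]}"
        }
        # "rest" is a hypothetical field holding everything after the date.
        match => { "message" => '^\[%{HTTPDATE:timestamp}\] %{GREEDYDATA:rest}' }
    }
    # The whole date is still captured in [timestamp], so the date filter is unchanged.
    date { match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] }
}
```

With this in place an event should carry both the full [timestamp] string and the individual [ts] fields, though it is worth checking the result with a rubydebug output on your own data.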

Many thanks for that, this grok pattern works fine,
but I have a question.

Could you please explain the grok part below? I do not understand it at all:

"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:bytes})'

Basically, I do not even understand ?: or -|

This is not a site to provide explanations of regular expressions, but I can explain some terms that might help you get started elsewhere.

| is used for alternation. So (?:-|%{NUMBER:bytes}) matches either - or a number, which is what is used for fields like byte counts in HTTP logs.

() are used to create a capture group. It captures part of the regular expression so that it can be referenced when doing a substitution. See this post for an example.

Sometimes you need to use () to surround part of a pattern (as in that alternation above), but you do not want to capture it to reference later. In that case you would use (?: and ) around the pattern to create a non-capture group.

Standard web logs will record the user request surrounded by double quotes, like this: "GET /foo/ HTTP/1.0"

The first part of the pattern you mentioned is trying to match that. Within the double quotes it first tries to match the verb and then the URI (which cannot contain a space). The HTTP/1.0 after that is actually optional, so the HTTP/%{NUMBER:httpversion} is made into a non-capture group using (?: and ) and then followed by ? which means it occurs 0 or 1 times (i.e. it is optional).

If matching the verb, URI and optional version fails then it says (using alternation) to just capture everything within the double quotes in the field [rawrequest].
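To tie those pieces together, here is the same fragment inside a grok filter with each part annotated in comments. This is only an illustrative sketch: the pattern is unanchored, so it matches the quoted request wherever it appears in the message.

```
filter {
    grok {
        # "                               literal opening double quote
        # (?:                             start of a non-capture group holding two alternatives
        #   %{WORD:verb}                  first alternative: the verb, e.g. GET
        #   %{NOTSPACE:request}           then the URI, which cannot contain a space
        #   (?: HTTP/%{NUMBER:httpversion})?
        #                                 non-capture group followed by ?, so the version is optional
        #   |                             alternation: if the first alternative fails, try the second
        #   %{DATA:rawrequest}            second alternative: everything inside the quotes
        # )                               end of the non-capture group
        # "                               literal closing double quote
        # (?:-|%{NUMBER:bytes})           either a literal - or a number
        match => { "message" => '"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" (?:-|%{NUMBER:bytes})' }
    }
}
```

For a request such as "GET /foo/ HTTP/1.0" the first alternative matches and fills [verb], [request] and [httpversion]. A malformed request inside the quotes falls through to the second alternative and ends up in [rawrequest].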

Lastly, there are many slightly different flavours of regexps -- Ruby, Java, perl, POSIX, shell, csh, ed, and many more. logstash uses Ruby regexps for grok and mutate+gsub filters.

There are places in logstash where Java regexps are used (some file paths) but the differences between Java and Ruby are unlikely to matter in those places.
