Logstash setup
Download and install
```sh
wget https://artifacts.opensearch.org/logstash/logstash-oss-with-opensearch-output-plugin-8.9.0-linux-x64.tar.gz
tar xf logstash-oss-with-opensearch-output-plugin-8.9.0-linux-x64.tar.gz
cd logstash-8.9.0/ && bin/logstash-plugin install logstash-output-opensearch
```
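To confirm the plugin is in place, you can list the installed plugins (both commands ship with the stock Logstash distribution):

```sh
bin/logstash-plugin list | grep opensearch   # should show logstash-output-opensearch
bin/logstash --version
```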
Examples
nginx logs using regular indices
```
input {
  file {
    path => "/var/log/nginx/nginx_logs*_access.log"
  }
}

filter {
  grok {
    patterns_dir => "/etc/logstash.d/patterns"
    match => { "message" => "%{NGINX_ACCESS}" }
    remove_field => ["message"]
  }
  useragent {
    source => "user_agent"
    target => "useragent"
    remove_field => "user_agent"
  }
}

output {
  opensearch {
    hosts => "https://{{ opensearch_host }}:9200"
    user => "logstash"
    password => "mypassword"
    index => "logstash-nginx-access-logs-${HOSTNAME}"
    manage_template => true
    template_overwrite => true
    template => "/etc/logstash.d/nginx_access_index_map.json"
    ssl_certificate_verification => false
  }
}
```
You can also use multiple file inputs like so:
```
input {
  file {
    path => [
      "/var/log/nginx/nginx_logs*_access.log",
      "/var/log/nginx/some_other_web*_access.log"
    ]
  }
}
...
```
Above we're using a grok pattern named NGINX_ACCESS stored in the patterns directory. An example pattern file (grok pattern files take one `NAME pattern` definition per line):
```
METHOD (OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT)
CACHED (HIT|MISS|BYPASS|EXPIRED|STALE|UPDATING|REVALIDATED)
NGINX_ACCESS "%{HTTPDATE:time_local}" client=%{IP:client} country=%{GREEDYDATA:country} method=%{METHOD:method} request="%{METHOD} %{URIPATHPARAM:request} HTTP/%{BASE16FLOAT:http_version}" request_length=%{INT:request_length} status=%{INT:status} bytes_sent=%{INT:bytes_sent} body_bytes_sent=%{INT:body_bytes_sent} referer=(%{URI:referer}|-) user_agent=%{GREEDYDATA:user_agent} upstream_addr=(%{HOSTPORT:upstream_addr}|-) upstream_status=(%{INT:upstream_status}|-) request_time=(%{ISO8601_SECOND:request_time}|-) upstream_response_time=(%{ISO8601_SECOND:upstream_response_time}|-) upstream_connect_time=(%{ISO8601_SECOND:upstream_connect_time}|-) upstream_header_time=(%{ISO8601_SECOND:upstream_header_time}|-) upstream_cache_status=(%{CACHED:upstream_cache_status}|-) is_bot=%{INT:is_bot} cookie_mbbauth_present=(%{GREEDYDATA:cookie_mbbauth_present}|-)
```
For testing the pattern use http://grokconstructor.appspot.com: copy a few log lines there and adjust the above pattern until you get a match.
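You can also test locally without OpenSearch. A minimal sketch of a throwaway pipeline (the path /tmp/grok_test.conf is just an example) that reads a log line from stdin and prints the parsed event:

```
# /tmp/grok_test.conf -- paste a log line into stdin and inspect the parsed event
input { stdin {} }

filter {
  grok {
    patterns_dir => "/etc/logstash.d/patterns"
    match => { "message" => "%{NGINX_ACCESS}" }
  }
}

output { stdout { codec => rubydebug } }
```

Run it with `bin/logstash -f /tmp/grok_test.conf`, paste a log line, and check that `_grokparsefailure` does not show up in the event's tags.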
`manage_template => true` creates the index template in OpenSearch (Elasticsearch) automatically; just make sure the logstash user has permission to create indices with the specified names. The index mapping is described in `template => "/etc/logstash.d/nginx_access_index_map.json"` and must match the fields extracted by the grok pattern above, i.e.:
{ "version" : 50001, "template" : "logstash-nginx-access*", "settings" : { "index" : { "refresh_interval" : "5s" } }, "mappings" : { "properties" : { "@timestamp" : { "type" : "date" }, "client": { "type" : "ip" }, "country" : { "type" : "keyword" }, "method" : { "type" : "keyword" }, "request" : { "type" : "keyword" }, "request_length" : { "type" : "integer" }, "status" : { "type" : "integer" }, "bytes_sent": { "type" : "integer" }, "body_bytes_sent": { "type" : "integer" }, "referer" : { "type" : "keyword" }, "useragent" : { "dynamic" : true, "properties" : { "device" : { "properties" : { "name" : { "type" : "keyword" } } }, "name" : { "type" : "keyword" }, "os" : { "properties" : { "name" : { "type" : "keyword" }, "version" : { "type" : "keyword" }, "full" : { "type" : "keyword" } } }, "version" : { "type" : "keyword" } } }, "upstream_addr" : { "type" : "keyword" }, "upstream_status" : { "type" : "keyword" }, "request_time" : { "type" : "float" } } }, "aliases" : {} }
${HOSTNAME} is an environment variable that must be defined (via .bashrc, or in the systemd unit that starts the Logstash service, etc.):
```ini
# /etc/systemd/system/logstash.service
[Unit]
Description=Massage various logs and forward them to opensearch/elasticsearch
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/logstash -f /etc/logstash.d/nginx_logs.conf
Environment="HOSTNAME={{ hostname }}"
Restart=on-failure

[Install]
WantedBy=multi-user.target
```
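Then the standard systemd workflow applies:

```sh
systemctl daemon-reload
systemctl enable --now logstash.service
journalctl -u logstash.service -f   # follow the pipeline output
```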
The `referer=(%{URI:referer}|-)` construct means the referer in this case might be empty (`-`).
nginx logs using datastreams
This is perhaps a better approach: datastreams automatically roll over to a new backing index. Change the output part of the Logstash config to this:
```
...
output {
  opensearch {
    hosts => "https://{{ opensearch_host }}:9200"
    user => "logstash"
    password => "mypassword"
    ssl_certificate_verification => false
    action => "create"
    index => "whatever-pattern-abc"
  }
}
...
```
`index` needs to be set to the name of the datastream you defined in OpenSearch, and the `action => "create"` directive must be added, since datastreams only accept create operations.
In OpenSearch you need to create a datastream, but first you need to create a template. The UI steps are below; an equivalent REST API sketch follows the list.
1. Go to Index Management > Templates > Create template
2. Add a template name, select the type “Data streams” and fill in the Time field (@timestamp in this example). The index pattern should match a name the logstash user has rights to write to; the datastream created later takes its name from this pattern.
3. You can add an alias if you want, replicas are set to 0 here to save some space.
4. In field mappings you need to map the fields sent by Logstash. The easiest way is to copy-paste the JSON from an existing index into the JSON editor, e.g. one that was created by Logstash using a regular index (see above).
5. When creating the datastream, the name must match the index pattern from step 2 above, but it doesn't have to be identical to it (so here it should be “whatever-pattern-abc” to match the Logstash config). The rest should be autofilled.
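If you prefer the REST API over the UI, the same steps look roughly like this. A minimal sketch: the template name `whatever-pattern-template` and the admin credentials are placeholders, and only a couple of fields from the mapping above are shown:

```sh
# Steps 1-4: index template with data streams enabled, 0 replicas, @timestamp as time field
curl -k -u admin:adminpassword -X PUT \
  "https://{{ opensearch_host }}:9200/_index_template/whatever-pattern-template" \
  -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["whatever-pattern-*"],
  "data_stream": { "timestamp_field": { "name": "@timestamp" } },
  "template": {
    "settings": { "number_of_replicas": 0 },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "client":     { "type": "ip" },
        "status":     { "type": "integer" }
      }
    }
  }
}'

# Step 5: create the datastream itself (the name must fall under the pattern above)
curl -k -u admin:adminpassword -X PUT \
  "https://{{ opensearch_host }}:9200/_data_stream/whatever-pattern-abc"
```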
Tested on
- logstash-8.14.3
- Opensearch 2.15