Logstash output

The Logstash output uses an internal protocol to send events directly to Logstash over TCP. Logstash provides additional parsing, transformation, and routing of data collected by Elastic Agent.

Compatibility: This output works with all compatible versions of Logstash. Refer to the Elastic Support Matrix.

This example configures a Logstash output called default in the elastic-agent.yml file:

outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"] 

The hosts setting specifies the Logstash server and the port (5044) where Logstash is configured to listen for incoming Elastic Agent connections.

To receive the events in Logstash, you also need to create a Logstash configuration pipeline. The Logstash configuration pipeline listens for incoming Elastic Agent connections, processes received events, and then sends the events to Elasticsearch.

The following example configures a Logstash pipeline that listens on port 5044 for incoming Elastic Agent connections and routes received events to Elasticsearch:

input {
  elastic_agent {
    port => 5044
    enrich => none # don't modify the events' schema at all
    # or minimal change, add only ssl and source metadata
    # enrich => [ssl_peer_metadata, source_metadata]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"] 
    data_stream => "true"
  }
}

The hosts setting specifies the Elasticsearch server and the port (9200) where Elasticsearch is running.

For more information about configuring Logstash, refer to Configuring Logstash and Elastic Agent input plugin.

Logstash output configuration settings

The logstash output supports the following settings, grouped by category. Many of these settings have sensible defaults that allow you to run Elastic Agent with minimal configuration.

Commonly used settings

Setting Description

enabled

(boolean) Enables or disables the output. If set to false, the output is disabled.

escape_html

(boolean) Configures escaping of HTML in strings. Set to true to enable escaping.

Default: false

hosts

(list) The list of known Logstash servers to connect to. If load balancing is disabled, but multiple hosts are configured, one host is selected randomly (there is no precedence). If one host becomes unreachable, another one is selected randomly.

All entries in this list can contain a port number. If no port is specified, 5044 is used.
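
As a minimal sketch of the port default described above, the following configuration lists two Logstash hosts. The hostnames are hypothetical placeholders. The first entry omits the port, so 5044 is used; the second sets a port explicitly:

```yaml
outputs:
  default:
    type: logstash
    hosts:
      - "logstash-a.example.com"       # no port given, so 5044 is used
      - "logstash-b.example.com:5045"  # explicit port
```

Because load balancing is not enabled here, one of the two hosts is selected at random.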

proxy_url

(string) The URL of the SOCKS5 proxy to use when connecting to the Logstash servers. The value must be a URL with a scheme of socks5://. The protocol used to communicate to Logstash is not based on HTTP, so you cannot use a web proxy.

If the SOCKS5 proxy server requires client authentication, embed a username and password in the URL as shown in the example.

When using a proxy, hostnames are resolved on the proxy server instead of on the client. To change this behavior, set proxy_use_local_resolver.

outputs:
  default:
    type: logstash
    hosts: ["remote-host:5044"]
    proxy_url: socks5://user:password@socks5-proxy:2233

proxy_use_local_resolver

(boolean) Determines whether Logstash hostnames are resolved locally when using a proxy. If false and a proxy is used, name resolution occurs on the proxy server.

Default: false
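
The following sketch extends the proxy_url example with local name resolution turned on. It reuses the placeholder host, proxy address, and credentials from that example:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["remote-host:5044"]
    proxy_url: socks5://user:password@socks5-proxy:2233
    # Resolve remote-host on the client instead of on the proxy server
    proxy_use_local_resolver: true
```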

Authentication settings

When sending data to a secured cluster through the logstash output, Elastic Agent can use SSL/TLS. For a list of available settings, refer to SSL/TLS, specifically the settings under Table 4, “Common configuration options” and Table 5, “Client configuration options”.

To use SSL/TLS, you must also configure the Elastic Agent input plugin for Logstash to use SSL/TLS.

For more information, refer to Configure SSL/TLS for the Logstash output.
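
As a rough sketch, an output secured with SSL/TLS might look like the following. The ssl.* setting names are assumed to match those listed in the SSL/TLS reference, and the hostname and file paths are hypothetical; check the reference before relying on them:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["logstash.example.com:5044"]                # hypothetical host
    # Assumed setting names; verify against the SSL/TLS reference
    ssl.certificate_authorities: ["/path/to/ca.crt"]   # CA that signed the Logstash certificate
    ssl.certificate: "/path/to/client.crt"             # client certificate
    ssl.key: "/path/to/client.key"                     # client private key
```

The Elastic Agent input plugin in the Logstash pipeline needs matching SSL/TLS settings for the connection to succeed.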

Memory queue settings

The memory queue keeps all events in memory.

The memory queue waits for the output to acknowledge or drop events. If the queue is full, no new events can be inserted into the memory queue. Only after the signal from the output will the queue free up space for more events to be accepted.

The memory queue is controlled by the parameters flush.min_events and flush.timeout. flush.min_events gives a limit on the number of events that can be included in a single batch, and flush.timeout specifies how long the queue should wait to completely fill an event request. If the output supports a bulk_max_size parameter, the maximum batch size will be the smaller of bulk_max_size and flush.min_events.

flush.min_events is a legacy parameter, and new configurations should prefer to control batch size with bulk_max_size. As of 8.13, there is never a performance advantage to limiting batch size with flush.min_events instead of bulk_max_size.

In synchronous mode, an event request is always filled as soon as events are available, even if there are not enough events to fill the requested batch. This is useful when latency must be minimized. To use synchronous mode, set flush.timeout to 0.

For backwards compatibility, synchronous mode can also be activated by setting flush.min_events to 0 or 1. In this case, batch size is capped at half the queue capacity.
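
A minimal sketch of synchronous mode, using the same setting names as the rest of this section. The queue size of 4096 is an arbitrary example value:

```yaml
  queue.mem.events: 4096
  # A timeout of 0 selects synchronous mode: requests are filled as soon as events are available
  queue.mem.flush.timeout: 0s
```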

In asynchronous mode, an event request will wait up to the specified timeout to try and fill the requested batch completely. If the timeout expires, the queue returns a partial batch with all available events. To use asynchronous mode, set flush.timeout to a positive duration, for example 5s.

This sample configuration forwards events to the output when there are enough events to fill the output’s request (usually controlled by bulk_max_size, and limited to at most 512 events by flush.min_events), or when events have been waiting for 5s without filling the requested size:

  queue.mem.events: 4096
  queue.mem.flush.min_events: 512
  queue.mem.flush.timeout: 5s
Setting Description

queue.mem.events

(int) The number of events the queue can store. This value should be evenly divisible by the smaller of queue.mem.flush.min_events and bulk_max_size to avoid sending partial batches to the output.

Default: 3200 events

queue.mem.flush.min_events

(int) A limit on the number of events that can be included in a single batch. flush.min_events is a legacy parameter, and new configurations should prefer to control batch size with bulk_max_size. As of 8.13, there is never a performance advantage to limiting batch size with flush.min_events instead of bulk_max_size.

Default: 1600 events

queue.mem.flush.timeout

(int) The maximum wait time for queue.mem.flush.min_events to be fulfilled. If set to 0s, events are available to the output immediately.

Default: 10s

Performance tuning settings

Settings that may affect performance.

Setting Description

backoff.init

(string) The number of seconds to wait before trying to reconnect to Logstash after a network error. After waiting backoff.init seconds, Elastic Agent tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to backoff.max. After a successful connection, the backoff timer is reset.

Default: 1s

backoff.max

(string) The maximum number of seconds to wait before attempting to connect to Logstash after a network error.

Default: 60s
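
The two backoff settings work together. The sketch below uses illustrative values, not recommendations: the first reconnect attempt happens after 2s, and the wait between later attempts grows exponentially but never exceeds 120s:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"]
    backoff.init: 2s    # wait before the first reconnect attempt
    backoff.max: 120s   # upper bound on the wait between attempts
```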

bulk_max_size

(int) The maximum number of events to bulk in a single Logstash request.

Events can be collected into batches. Elastic Agent will split batches larger than bulk_max_size into multiple batches.

Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, large batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.

Set this value to 0 to turn off the splitting of batches. When splitting is turned off, the queue determines the number of events to be contained in a batch.

Default: 2048
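
As a sketch of how bulk_max_size interacts with the memory queue, the following configuration picks a queue size that is a whole multiple of the batch size. The values are illustrative, and placing the queue.mem.* settings directly under the output is an assumption about where they belong in elastic-agent.yml:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"]
    bulk_max_size: 1024
    # 4096 = 4 x 1024, so the queue divides evenly into full batches
    queue.mem.events: 4096
```

With the default flush.min_events of 1600, the smaller of the two limits is bulk_max_size, so batches contain at most 1024 events.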

compression_level

(int) The gzip compression level. Set this value to 0 to disable compression. The compression level must be in the range of 1 (best speed) to 9 (best compression).

Increasing the compression level reduces network usage but increases CPU usage.

Default: 3

loadbalance

(boolean) If true and multiple Logstash hosts are configured, the output plugin load balances published events onto all Logstash hosts. If false, the output plugin sends all events to one host (determined at random) and switches to another host if the selected one becomes unresponsive.

With loadbalance enabled:

  • Elastic Agent reads batches of events and sends each batch to one Logstash worker dynamically, based on a work-queue shared between the outputs.
  • If a connection drops, Elastic Agent takes the disconnected Logstash worker out of its pool.
  • Elastic Agent tries to reconnect. If it succeeds, it re-adds the Logstash worker to the pool.
  • If one of the Logstash nodes is slow but "healthy", it sends a keep-alive signal until the full batch of data is processed. This prevents Elastic Agent from sending further data until it receives an acknowledgement signal back from Logstash. Elastic Agent keeps all events in memory until after that acknowledgement occurs.

Without loadbalance enabled:

  • Elastic Agent picks a random Logstash host and sends batches of events to it. Due to the random algorithm, the load on the Logstash nodes should be roughly equal.
  • In case of any errors, Elastic Agent picks another Logstash node, also at random. If a connection to a host fails, the host is retried only if there are errors on the new connection.

Default: false

Example:

outputs:
  default:
    type: logstash
    hosts: ["localhost:5044", "localhost:5045"]
    loadbalance: true

max_retries

(int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.

Set max_retries to a value less than 0 to retry until all events are published.

Default: 3
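
For example, this sketch retries indefinitely instead of dropping events after the default three attempts:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["127.0.0.1:5044"]
    max_retries: -1   # any value less than 0 retries until all events are published
```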

pipelining

(int) The number of batches to send asynchronously to Logstash while waiting for an ACK from Logstash. The output becomes blocking after the specified number of batches are written. Specify 0 to turn off pipelining.

Default: 2

slow_start

(boolean) If true, only a subset of events in a batch of events is transferred per transaction. The number of events to be sent increases up to bulk_max_size if no error is encountered. On error, the number of events per transaction is reduced again.

Default: false

timeout

(string) The number of seconds to wait for responses from the Logstash server before timing out.

Default: 30s

ttl

(string) Time to live for a connection to Logstash after which the connection will be reestablished. This setting is useful when Logstash hosts represent load balancers. Because connections to Logstash hosts are sticky, operating behind load balancers can lead to uneven load distribution across instances. Specify a TTL on the connection to achieve equal connection distribution across instances.

Default: 0 (turns off the feature)

The ttl option is not yet supported on an asynchronous Logstash client (one with the pipelining option set).
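
Given that restriction, a configuration that relies on ttl would likely also turn off pipelining. The following is a sketch under that assumption; the hostname and the 60s value are hypothetical:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["logstash-lb.example.com:5044"]   # hypothetical load balancer address
    ttl: 60s        # reestablish the connection every 60 seconds
    pipelining: 0   # turn off pipelining, since ttl is not supported with it
```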

worker

(int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, a total of six workers are started (three for each host).

Default: 1
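
The worker example above can be sketched as the following configuration, which reuses the two local hosts from the loadbalance example:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["localhost:5044", "localhost:5045"]
    loadbalance: true
    worker: 3   # three workers per host, six in total
```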