Upgrade Guide for Logstash 2.2

Hello Logstashers! In this post, we'll talk about considerations when upgrading to Logstash 2.2.0. Specifically, we'll discuss how to avoid situations where some new behavior (and a bit of soon-to-be-fixed misbehavior in the Elasticsearch output plugin) can cause a larger than desired number of file handles to be used. I will also discuss some of the new tuning considerations for 2.2.0. To help users with upgrading to 2.2.0, we've also added a new documentation section here.

The good news is that Logstash 2.2 brings with it some very significant performance improvements! The downside is that one of the changes we made to the default settings in Logstash may cause issues for some configurations. For example, we are hearing reports of Logstash using many more file handles than it previously would.

Let's dive deeper into these changes.

Worker Units

There are two types of worker in Logstash 2.2: pipeline workers (formerly filter workers) and output workers. In previous versions of Logstash, the filters and outputs ran in their own threads. In 2.2 we unified our threading model, putting both filter and output work into a single 'pipeline' worker thread type. This has given us some significant performance gains. The output worker concept remains, but rather than mapping directly to a thread, it now maps to a pool of available output objects. During execution, these are grabbed by a pipeline worker for use and returned when done.

You will find that you need to set -w higher than before, and in fact Logstash may require more total threads than before. However, this is by design! You will find that you have more threads, each doing less work. Sure, a fair number of them may be idle in I/O wait; however, the work they do is now much more efficient, with less context switching. To tune the -w setting, just keep increasing it, even to a multiple of the number of cores on your system (remember, those threads are often idle), until you see throughput go down. By default, -w is set to match the number of cores in your system.
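For example, here's a minimal sketch of tuning -w from the command line (the config path and worker count are placeholders; substitute your own):

    bin/logstash -f /etc/logstash/conf.d/my-pipeline.conf -w 8

Increase -w in steps while watching throughput, and back off once throughput stops improving.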

Batching Events

There is also a new batch size setting, -b, which you can tune in a similar way. Please read our previous post that explains these options in more detail. This batch size is now also the default flush size for the Elasticsearch output. The flush_size option on that output now only sets the maximum flush size and will no longer set the minimum flush size.
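As a rough illustration (the values, host, and config path here are hypothetical, not recommendations), you could raise the batch size on the command line and cap the bulk request size in the Elasticsearch output:

    bin/logstash -f /etc/logstash/conf.d/my-pipeline.conf -w 8 -b 250

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        flush_size => 500   # in 2.2 this is only an upper bound; -b drives the usual batch size
      }
    }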

New Pipeline and Outputs

In our performance testing we found that increasing the number of output workers to stay in lockstep with pipeline workers yielded some nice performance gains. However, for our Elasticsearch output there's an undesired behavior that stayed hidden until now: if you use multiple Elasticsearch output workers, they won't share the same backend connection pool. That means if you have 5 backend ES instances and 5 output workers, you may have up to 5 * 5 = 25 connections! If you have 40 ES nodes, the effect is amplified.

We understand the current situation isn't ideal in some cases. The fix isn't to change the Logstash core behavior, but to make the Elasticsearch output handle connections responsibly in the new pipeline model. We are targeting a new release in the 2.2 series that addresses this. We're currently working on a patch that will only require a plugin upgrade, but until that's released you'll want to stick with a relatively low number of ES output workers. The workaround in 2.2 is to explicitly set your output worker count low; try just 1 or 2 workers, as shown in the sketch below.
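Here's a minimal sketch of that workaround (the hosts are placeholders); it simply caps the workers option on the Elasticsearch output until the patched plugin is available:

    output {
      elasticsearch {
        hosts => ["es1:9200", "es2:9200"]
        workers => 1   # keep ES output workers at 1 or 2 until the connection-pooling fix ships as a plugin upgrade
      }
    }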

Feedback

I hope we've clarified some aspects of our new pipeline architecture and of upgrading to 2.2. Please let us know if you have any feedback! You can always reach us on Twitter (@elastic), on our forum, or report any problems on the GitHub issues page.