A History of Logstash Output Workers

Summary

  • If you’re having performance problems, read the new Performance Tuning Guide.
  • Logstash output workers are confusing.
  • Logstash <= 2.1 had slow and safe defaults
  • Logstash 2.2 had fast and unsafe defaults; this was mostly fine, but sometimes not
  • Logstash 2.3 went back to slow and safe defaults
  • Logstash 2.4 will let us start making plugins fast and safe by default again, but individual plugins need to be updated to use the new ‘shared’ output type.
  • If your output is the bottleneck today, manually set the output workers setting higher. This may increase resource usage and may cause issues with some plugins, but it generally works for common plugins.

A Brief Overview

This blog post is meant to clarify the output worker changes in Logstash 2.2+, and to discuss how we’ll be improving output workers in the future. Logstash 2.2 introduced the NG Pipeline, which provided a sizable performance improvement. It has also created some confusion, however. Prior to 2.2 there was a notion of Filter Workers (FWs) and Output Workers (OWs). Filter workers executed everything in the filter {} portion of the config, and each plugin in the output {} portion could use a user-definable number of OW threads. In 2.2 these were both replaced with a single concept, Pipeline Workers (PWs). Confusingly, you can still set OWs, and they still do something very important. Let’s compare and contrast.

The animation below contrasts the threading model between 2.1 and 2.2.

[Animation: thread_changes.gif]

To put it a bit more formally:

  • Logstash < 2.2
    • Filters / Outputs handle one event at a time
    • Queue between filter workers and output workers
    • Dedicated Thread per FW
    • Dedicated Thread per OW
    • Independently controllable filter/output workers
  • Logstash >= 2.2
    • Filters and outputs deal in batches rather than single events
    • No queue between filter workers and output workers, as they now process batches as one unit
    • One Pipeline Worker thread pool shared by both filters and outputs
    • OWs are no longer threads; they are shared objects that can only be used by a single Pipeline Worker at a time

The upshot of this change is that in 2.2+ the -w flag, which now sets the number of pipeline workers (PWs), controls how many concurrent threads can exist for the filter and output stages combined. The output worker setting controls how many of those threads can work on a given output simultaneously. The takeaways here are:

  • Output workers should never be set higher than the number of pipeline workers, since the PW count is the maximum number of OWs that can be in use simultaneously.
  • You may need more PWs than cores on your system, since multiple PWs can be blocked on I/O in a single OW object.
  • Increasing the number of OWs often provides a tangible benefit, but be careful: for some outputs, adding more can either slow them down or cause them to break in subtle ways.
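
The relationship between the two settings can be sketched in a few lines of Python. This is a toy model, not Logstash code: the names OutputWorker, Tracker, and run_pipeline, the batch size, and the sleep that stands in for blocking I/O are all assumptions made for illustration. Each pipeline worker thread checks an output worker object out of a fixed-size pool, uses it for one batch, and returns it, so the number of threads inside the output at once can never exceed the output worker count.

```python
import queue
import threading
import time


class Tracker:
    """Records how many threads are inside the output at the same time."""

    def __init__(self):
        self.lock = threading.Lock()
        self.active = 0
        self.peak = 0


class OutputWorker:
    """Hypothetical stand-in for one output worker (OW) object."""

    def __init__(self, tracker):
        self.tracker = tracker

    def receive_batch(self, batch):
        with self.tracker.lock:
            self.tracker.active += 1
            self.tracker.peak = max(self.tracker.peak, self.tracker.active)
        time.sleep(0.01)  # simulated blocking I/O for this batch
        with self.tracker.lock:
            self.tracker.active -= 1


def run_pipeline(pipeline_workers, output_workers, batches_per_worker=5):
    """Run the toy pipeline and return the peak concurrency seen in the output."""
    tracker = Tracker()
    pool = queue.Queue()
    for _ in range(output_workers):
        pool.put(OutputWorker(tracker))

    def pipeline_worker():
        for _ in range(batches_per_worker):
            batch = ["event"] * 10  # arbitrary batch; filters would run here
            worker = pool.get()     # blocks until an OW object is free
            try:
                worker.receive_batch(batch)
            finally:
                pool.put(worker)    # hand the OW back for other threads

    threads = [threading.Thread(target=pipeline_worker)
               for _ in range(pipeline_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return tracker.peak


if __name__ == "__main__":
    print("8 PWs, 2 OWs, peak:", run_pipeline(8, 2))
    print("2 PWs, 8 OWs, peak:", run_pipeline(2, 8))
```

In this model, run_pipeline(8, 2) can never report a peak above 2, because the remaining pipeline workers wait for a free OW object. Likewise, run_pipeline(2, 8) can never exceed 2, because only two threads exist to use the eight objects, which is why setting OWs above PWs buys nothing.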

A more in-depth exploration of the history and the evolution of pipeline architecture is available in this video by Logstash creator Jordan Sissel. We encourage you to watch this!


Why Autoscaling Output Workers in Logstash 2.2 was a Mistake

The NG pipeline was released in 2.2, a few months after 2.0 went out. In Logstash 2.0 we had switched the default protocol for the Elasticsearch output to HTTP. While the plugin was nearly identical in performance and resource utilization to the old default, the Elasticsearch Transport protocol, it needed more OWs to reach parity. We were concerned that users upgrading from 1.x to 2.x might feel this slowdown due to the defaults and not tune their OW settings. We also thought it would be generally good to set the number of OWs for every plugin equal to the number of PWs by default. We code reviewed each output we maintain for concurrency issues and set the required options on the ones that didn’t support more than one OW, which meant that the pipeline would leave these at a single OW. This turned out to have some... err... adverse effects.

The main problem with this approach is that doing post-hoc code reviews for concurrency is hard. We have a truly massive number of plugins, which made the task even harder. In the 2.2.x series we started getting bug reports related to plugins we’d missed, ones that may have been threadsafe in terms of data structures, but not in terms of logic.

We decided that it had been premature to assume that all outputs would behave well in a concurrent environment, and premature to let those bugs shake out in a minor release, so we reverted the change and set the default back to one OW. As a result, users who saw speed gains in 2.2.x might have seen speed losses in 2.3.x out of the box. Improving performance is a top priority for us, but correctness of outputs is of even greater importance. Luckily, we can achieve both with the new model we’re incorporating into the upcoming Logstash 2.4 and 5.0 series.

All this being said, if you’re having performance problems, please read the new Performance Tuning Guide. It’s an easy-to-follow set of diagnostic and remedial procedures for managing Logstash performance.

The Future

The future lies in getting rid of OWs altogether and replacing them with what we call ‘shared’ outputs. Outputs that can’t be parallelized will be the default, and will only support a single instance with synchronized access. The latest version of the ES output (currently only available in the 5.x alphas, but most likely coming to Logstash 2.4.x) is the first shared Logstash output. Shared outputs manage all their own concurrency internally as a single shared object. The new ES output will run in parallel across as many PWs as you have, with no additional configuration. There are some concurrency controls, but these must be custom defined by the plugin in a way that makes sense for its own I/O. For instance, on the Elasticsearch output plugin the maximum number of connections will be tunable (its default is very high, at 100), but aside from that there aren’t any.
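
As a rough illustration of the difference between the two kinds of output described above, here is a small Python sketch. It is not the Logstash plugin API: the class names SynchronizedOutput and SharedOutput, the send_batch method, the drive helper, and the max_connections parameter are hypothetical names chosen for this example. The first class models an output that allows one caller at a time; the second models a single shared object that lets many threads in at once and applies its own limit, here a semaphore standing in for a connection cap.

```python
import threading
import time


class SynchronizedOutput:
    """Sketch of a non-parallel output: one instance, one caller at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.sent = 0

    def send_batch(self, batch):
        with self._lock:       # every pipeline worker waits its turn here
            time.sleep(0.005)  # simulated I/O
            self.sent += len(batch)


class SharedOutput:
    """Sketch of a 'shared' output: one instance that manages its own concurrency."""

    def __init__(self, max_connections):
        # Plugin-defined control: at most max_connections batches in flight.
        self._connections = threading.BoundedSemaphore(max_connections)
        self._stats_lock = threading.Lock()
        self.sent = 0
        self.active = 0
        self.peak = 0

    def send_batch(self, batch):
        with self._connections:
            with self._stats_lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            time.sleep(0.005)  # simulated I/O, done outside the stats lock
            with self._stats_lock:
                self.active -= 1
                self.sent += len(batch)


def drive(output, pipeline_workers, batches_per_worker=4, batch_size=10):
    """Send batches from several threads to one output; return events sent."""

    def worker():
        for _ in range(batches_per_worker):
            output.send_batch(["event"] * batch_size)

    threads = [threading.Thread(target=worker) for _ in range(pipeline_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return output.sent


if __name__ == "__main__":
    shared = SharedOutput(max_connections=3)
    print("shared sent:", drive(shared, pipeline_workers=6), "peak:", shared.peak)
    print("synchronized sent:", drive(SynchronizedOutput(), pipeline_workers=6))
```

Both outputs deliver every event, but only the shared one lets several pipeline workers make progress at the same time, and the only tuning knob is the one the output itself chooses to expose.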

In the future we hope to refactor as many Logstash output plugins as possible to use the new concurrency :shared option, per Issue #5661, and to use finer-grained locking than the OW system provided. We’ll provide an update when this is done. In the meantime, we encourage our users to discuss these issues on our forum and to open GitHub issues. We are also available on Twitter and IRC to help you out!