Beats 1.0.0-beta4: lightweight log forwarding with Filebeat
We just released Beats version 1.0.0-beta4. It contains improvements for Packetbeat and Topbeat but also something new: the first version of Filebeat, our new open source log forwarder that ships to Logstash or Elasticsearch.
The goal of Filebeat is to tail logs and to ship them off servers to a central location for further processing. It is very lightweight in terms of consumed resources and has no dependencies or plugins to manage.
Based on the Logstash Forwarder code base
While Filebeat is a new project, significant parts of its code base are based on the Logstash Forwarder project, which has been used in production by many companies for years.
Because Logstash Forwarder and Logstash don’t share any code and are written in different programming languages (Go and Ruby, respectively), Logstash Forwarder was unfortunately neglected for too long by the dev community around Logstash and tended to lag behind in terms of improvements and bug fixes. A few community forks have been created, for example log-courier and a fork maintained by Etsy.
By making Logstash Forwarder a Beats project, which has a dedicated team of Go developers inside Elastic, we make sure the Elastic stack has a well-maintained log forwarder.
Over the past couple of months, we took the Logstash Forwarder code, split it into parts, replaced the more rusty bits, added tons of unit, integration and system tests and then put it back together in a Beat. Filebeat now shares a lot of code with our other Beats via libbeat, making the projects benefit from each other’s improvements.
The logs are the queue
One of the core features of the Logstash Forwarder, which we have inherited in Filebeat, is that after sending the log lines to Logstash it waits for a confirmation that the message was received. Only after this ack is received does it update the persistent registry where it remembers how far it advanced in each file.
This means that if there is a network partition between Filebeat and Logstash and the connection is temporarily lost, Filebeat simply waits for the connection to be re-established before advancing its pointer in the log files. Unless the connection cannot be established for a very long time and the logs are rotated, no log lines are lost. Similarly, if you restart Logstash, no log lines are lost. If you restart Filebeat, no log lines are lost. If there’s network congestion on the path, no log lines are lost. You get the idea.
Note that there is also a drawback to this “at-least-once” approach. Because we retransmit unacknowledged messages, log lines might be duplicated. For most people duplicate log messages are better than lost log messages, but we’ve been working on reducing this effect.
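To make the ack-then-advance idea concrete, here is a minimal sketch in Python. It is not Filebeat's code: the Registry, Receiver and ship names are hypothetical, the registry lives in memory rather than on disk, and offsets count lines rather than bytes. It simulates two faults, a batch that never arrives and a batch that arrives but whose ack is lost, to show why nothing is lost and where duplicates come from.

```python
class Registry:
    """Remembers how far each file has been shipped (in memory, for illustration)."""

    def __init__(self):
        self.offsets = {}

    def get(self, path):
        return self.offsets.get(path, 0)

    def commit(self, path, offset):
        self.offsets[path] = offset


class Receiver:
    """Stand-in for the receiving end, with scripted faults (hypothetical)."""

    def __init__(self, faults):
        self.faults = list(faults)  # one fault is consumed per send attempt
        self.lines = []

    def send(self, batch):
        fault = self.faults.pop(0) if self.faults else None
        if fault == "drop":
            return False            # batch never arrived, so no ack
        self.lines.extend(batch)    # batch arrived
        return fault != "ack_lost"  # ack may still get lost on the way back


def ship(path, lines, registry, send):
    """Send everything past the stored offset; advance the offset only on ack."""
    start = registry.get(path)
    batch = lines[start:]
    if not batch:
        return 0
    if send(batch):
        registry.commit(path, start + len(batch))
        return len(batch)
    return 0  # no ack: the offset stays put and the same lines are retried


log = ["line a", "line b", "line c"]
registry = Registry()
receiver = Receiver(["drop", "ack_lost"])

# Three attempts: dropped batch, lost ack, then a clean delivery.
results = [ship("/var/log/app.log", log, registry, receiver.send) for _ in range(3)]
```

After the three attempts the registry has advanced past all three lines and the receiver holds every line at least once. The attempt with the lost ack is the one that produces the duplicates: the receiver had already stored the batch, but the sender could not know that and sent it again.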
I'm a lumberjack and I'm ok
The protocol used between the Logstash Forwarder and Logstash already supports encryption, message batching and acknowledgements. The current version of Filebeat uses the same protocol but with a few improvements. Notably, to reduce the congestion that can result in duplicates, the batch size follows a slow-start mechanism, similar to the approach TCP takes to avoid congestion. Partial ACKs were also introduced to help in this area. To allow for a simple migration path from the Logstash Forwarder, we have created a new input plugin for Logstash, called logstash-input-beats, and we made sure it can be used in parallel with the existing input plugin used by the Forwarder.
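As a rough illustration of how slow start and partial ACKs can work together, here is a small Python sketch. The exact policy Filebeat uses is not described here, so the doubling and halving rules, the Window and publish names, and the limits are all assumptions modeled loosely on TCP slow start rather than the actual implementation.

```python
class Window:
    """Batch size that grows on full acks and shrinks otherwise (assumed policy)."""

    def __init__(self, start=1, maximum=1024):
        self.size = start
        self.maximum = maximum

    def on_full_ack(self):
        self.size = min(self.size * 2, self.maximum)

    def on_partial_or_no_ack(self):
        self.size = max(self.size // 2, 1)


def publish(events, send, window, max_rounds=1000):
    """Ship events in window-sized batches.

    `send(batch)` is a hypothetical transport call that returns how many
    events from the front of the batch were acknowledged (a partial ACK).
    Only acknowledged events are skipped on the next round, so a partial
    ACK avoids resending the part of the batch that already made it.
    """
    sent = 0
    batch_sizes = []
    for _ in range(max_rounds):  # guard against a receiver that never acks
        if sent >= len(events):
            break
        batch = events[sent:sent + window.size]
        batch_sizes.append(len(batch))
        acked = send(batch)
        sent += acked
        if acked == len(batch):
            window.on_full_ack()
        else:
            window.on_partial_or_no_ack()
    return sent, batch_sizes


def slow_receiver(batch):
    """Simulated receiver that can only take four events per batch."""
    return min(len(batch), 4)


events = list(range(20))
window = Window(start=1, maximum=8)
sent, batch_sizes = publish(events, slow_receiver, window)
```

The batch size ramps up from a single event, overshoots what the simulated receiver can handle, and backs off when only part of a batch is acknowledged. Because the sender resumes right after the last acknowledged event, the events that were already accepted are not transmitted a second time.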
Another change is making the certificate-based authentication optional so that it’s easier to use Filebeat while testing or developing.
Yes, Filebeat works on Windows
I’m glad you asked. Just like Packetbeat and Topbeat, the other Beats projects that we have so far, Filebeat supports Windows. The Logstash Forwarder already worked on Windows but there were a few issues that severely crippled its functionality on that platform. We fixed those and now we run the same tests that we have for Linux and OS X on Windows as well, so you don’t have to worry about differences in behaviour when you have a mixed environment.
The Beats packaging framework also makes it a bit easier to run Filebeat as a Windows service.
Migrate from the Logstash Forwarder
To be more like the other Beats, Filebeat uses YAML for its configuration file, rather than the JSON+comments language used by Logstash Forwarder. Since we started a new project and re-configuration is needed anyway, we took the opportunity to clean up some configuration options. For example, many of the options that Logstash Forwarder received as CLI options are now configuration file options.
Don’t worry though, it’s all quite simple, and we’ve written a fairly extensive migration guide to help you through the process.
Packetbeat and Topbeat
While the main focus during this release cycle was Filebeat, we continued to improve Packetbeat and Topbeat; the details are in their respective changelogs. Notably, both can now send data to Logstash via the same protocol that Filebeat uses, so Redis is no longer needed for the integration with Logstash. This also means that you can now easily use Logstash as a gateway to the many outputs it supports.
The (near) future
We intend to have the first GA release for all the Beats in a matter of weeks. Shortly after removing the beta label, we will add to Filebeat the three most commonly requested features: multi-line support, Windows event log support, and simple grep-like filtering based on the log message. The three new planned enhancements are tracked in this list.
We always love hearing from you, so why not give Filebeat, Packetbeat or Topbeat a try and let us know what you think on the forums?
Image credits: port, lumberjack, logs