file
- Version: 4.0.0
- Released on: 2016-10-17
- Changelog
- Compatible: 5.1.1.1, 5.0.0, 2.4.1, 2.4.0, 2.3.4
Stream events from files, normally by tailing them in a manner similar to tail -0F but optionally reading them from the beginning.
By default, each event is assumed to be one line and a line is taken to be the text before a newline character. Normally, logging will add a newline to the end of each line written. If you would like to join multiple log lines into one event, you’ll want to use the multiline codec or filter.
The plugin aims to track changing files and emit new content as it’s appended to each file. It’s not well-suited for reading a file from beginning to end and storing all of it in a single event (not even with the multiline codec or filter).
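As a rough illustration of joining several lines into one event, the sketch below attaches the multiline codec to a file input. The path and the pattern are assumptions chosen for the example (continuation lines that start with whitespace, as in many stack traces); adjust both to your own log format.

```conf
input {
  file {
    # Hypothetical path, used only for illustration.
    path => "/var/log/app/app.log"
    codec => multiline {
      # Assumption: continuation lines start with whitespace.
      pattern => "^\s"
      # Append such lines to the previous event instead of starting a new one.
      what => "previous"
    }
  }
}
```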
Reading from remote network volumes
The file input is not tested on remote filesystems such as NFS, Samba, s3fs-fuse, etc. These remote filesystems typically have behaviors that are very different from local filesystems and are therefore unlikely to work correctly when used with the file input.
Tracking of current position in watched files
The plugin keeps track of the current position in each file by recording it in a separate file named sincedb. This makes it possible to stop and restart Logstash and have it pick up where it left off without missing the lines that were added to the file while Logstash was stopped.
By default, the sincedb file is placed in the home directory of the user running Logstash with a filename based on the filename patterns being watched (i.e. the path option). Thus, changing the filename patterns will result in a new sincedb file being used and any existing current position state will be lost. If you change your patterns with any frequency it might make sense to explicitly choose a sincedb path with the sincedb_path option.
A different sincedb_path must be used for each input. If two inputs share the same path, each one overwrites the read checkpoints stored by the other, so the recorded positions become unreliable.
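A minimal sketch of two file inputs that each keep their own checkpoint file. The log paths and the sincedb locations under /var/lib/logstash are hypothetical; any writable file paths would do, as long as they differ between inputs.

```conf
input {
  file {
    path => "/var/log/nginx/*.log"                       # hypothetical path
    sincedb_path => "/var/lib/logstash/sincedb-nginx"    # hypothetical location
  }
  file {
    path => "/var/log/app/*.log"                         # hypothetical path
    sincedb_path => "/var/lib/logstash/sincedb-app"      # must differ from the one above
  }
}
```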
Sincedb files are text files with four columns:
- The inode number (or equivalent).
- The major device number of the file system (or equivalent).
- The minor device number of the file system (or equivalent).
- The current byte offset within the file.
On non-Windows systems you can obtain the inode number of a file with e.g. ls -li.
File rotation
File rotation is detected and handled by this input, regardless of whether the file is rotated via a rename or a copy operation. To support programs that write to the rotated file for some time after the rotation has taken place, include both the original filename and the rotated filename (e.g. /var/log/syslog and /var/log/syslog.1) in the filename patterns to watch (the path option). Note that the rotated filename will be treated as a new file so if start_position is set to beginning the rotated file will be reprocessed.
With the default value of start_position (end) any messages written to the end of the file between the last read operation prior to the rotation and its reopening under the new name (an interval determined by the stat_interval and discover_interval options) will not get picked up.
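The rotation setup described above might look like the following sketch, which watches both the live file and its first rotated copy. It leaves start_position at its default (end), so the caveat about messages written during the rotation window still applies.

```conf
input {
  file {
    # Watch the live file and the rotated file named in the text above.
    path => ["/var/log/syslog", "/var/log/syslog.1"]
  }
}
```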
Synopsis
This plugin supports the following configuration options:
Required configuration options:
file { path => ... }
Available configuration options:
Setting | Input type | Required | Default value |
---|---|---|---|
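add_field | hash | No | {} |
close_older | number | No | 3600 |
codec | codec | No | "plain" |
delimiter | string | No | "\n" |
discover_interval | number | No | 15 |
enable_metric | boolean | No | true |
exclude | array | No | |
id | string | No | |
ignore_older | number | No | |
max_open_files | number | No | |
path | array | Yes | |
sincedb_path | string | No | |
sincedb_write_interval | number | No | 15 |
start_position | string, one of ["beginning", "end"] | No | "end" |
stat_interval | number | No | 1 |
tags | array | No | |
type | string | No | |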
Details
close_older
- Value type is number
- Default value is 3600
The file input closes any file that was last read more than the specified number of seconds ago. The implications depend on whether a file is being tailed or read. If tailing, and there is a large time gap in incoming data, the file can be closed (allowing other files to be opened) but will be queued for reopening when new data is detected. If reading, the file will be closed close_older seconds after the last bytes were read. The default is 1 hour.
codec
- Value type is codec
- Default value is "plain"
The codec used for input data. Input codecs are a convenient method for decoding your data before it enters the input, without needing a separate filter in your Logstash pipeline.
delimiter
- Value type is string
- Default value is "\n"
Sets the line delimiter. Defaults to "\n".
discover_interval
- Value type is number
- Default value is 15
How often (in seconds) we expand the filename patterns in the path option to discover new files to watch.
enable_metric
- Value type is boolean
- Default value is true
Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.
exclude
- Value type is array
- There is no default value for this setting.
Exclusions (matched against the filename, not full path). Filename patterns are valid here, too. For example, if you have
path => "/var/log/*"
you might want to exclude gzipped files:
exclude => "*.gz"
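Put together in a full input block, the two settings could look like the sketch below. Because exclude is an array, more than one pattern can be listed; the second pattern ("*.bz2") is an assumption added for illustration.

```conf
input {
  file {
    path => "/var/log/*"
    # Patterns are matched against the filename only, not the full path.
    # "*.bz2" is an illustrative addition, not part of the original example.
    exclude => ["*.gz", "*.bz2"]
  }
}
```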
id
- Value type is string
- There is no default value for this setting.
Add a unique ID to the plugin instance. This ID is used for tracking information for a specific configuration of the plugin.
output { stdout { id => "ABC" } }
If you don’t explicitly set this variable, Logstash will generate a unique name.
ignore_older
- Value type is number
- There is no default value for this setting.
When the file input discovers a file that was last modified more than the specified number of seconds ago, the file is ignored. After its discovery, if an ignored file is modified, it is no longer ignored and any new data is read. By default, this option is disabled.
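For instance, to skip files that have not been modified within the last day, a sketch along these lines could be used. The path and the 86400-second threshold (24 hours) are assumptions for the example.

```conf
input {
  file {
    path => "/var/log/app/*.log"   # hypothetical path
    # 86400 seconds = 24 hours; files last modified before that are ignored
    # until they are modified again.
    ignore_older => 86400
  }
}
```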
max_open_files
- Value type is number
- There is no default value for this setting.
The maximum number of file handles that this input consumes at any one time. Use close_older to close some files if you need to process more files than this number. This should not be set to the maximum the OS allows, because file handles are needed for other Logstash plugins and OS processes. When this option is not set, the default of 4095, set in filewatch, applies.
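A sketch of how max_open_files and close_older might be combined when a pattern matches more files than you want open at once. The path and both numbers are illustrative assumptions, not recommendations.

```conf
input {
  file {
    path => "/var/log/containers/**/*.log"   # hypothetical path matching many files
    # Cap the number of simultaneously open file handles for this input.
    max_open_files => 512
    # Close files that have not been read for 5 minutes so that
    # other matched files get a chance to be opened.
    close_older => 300
  }
}
```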
path
- This is a required setting.
- Value type is array
- There is no default value for this setting.
The path(s) to the file(s) to use as an input.
You can use filename patterns here, such as /var/log/*.log. If you use a pattern like /var/log/**/*.log, a recursive search of /var/log will be done for all *.log files.
Paths must be absolute and cannot be relative.
You may also configure multiple paths. See an example on the Logstash configuration page.
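As a small sketch of multiple paths in one input, the array below mixes a single file, a simple pattern, and a recursive pattern. The /var/log/messages and /opt/app entries are hypothetical.

```conf
input {
  file {
    path => [
      "/var/log/messages",        # a single file (hypothetical)
      "/var/log/*.log",           # every .log file directly under /var/log
      "/opt/app/logs/**/*.log"    # recursive search below a hypothetical directory
    ]
  }
}
```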
sincedb_path
- Value type is string
- There is no default value for this setting.
Path of the sincedb database file (keeps track of the current position of monitored log files) that will be written to disk. The default will write sincedb files to some path matching $HOME/.sincedb*
NOTE: it must be a file path and not a directory path.
sincedb_write_interval
- Value type is number
- Default value is 15
How often (in seconds) to write the sincedb file with the current position of monitored log files.
start_position
- Value can be any of: beginning, end
- Default value is "end"
Choose where Logstash starts initially reading files: at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning.
This option only modifies "first contact" situations where a file is new and not seen before, i.e. files that don’t have a current position recorded in a sincedb file read by Logstash. If a file has already been seen before, this option has no effect and the position recorded in the sincedb file will be used.
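To import existing data, a configuration might look like the sketch below. The archive path and the sincedb location are hypothetical. Note that once a position has been recorded for a file, start_position no longer affects it on later runs.

```conf
input {
  file {
    path => "/data/archive/*.log"                         # hypothetical path to old logs
    # Applies only to files with no recorded position in the sincedb file.
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb-archive"   # hypothetical location
  }
}
```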
stat_interval
- Value type is number
- Default value is 1
How often (in seconds) we stat files to see if they have been modified. Increasing this interval will decrease the number of system calls we make, but increase the time to detect new log lines.
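The trade-off described above could be expressed as in the following sketch, which polls less often than the defaults. The path and both values are assumptions for illustration; higher values mean fewer system calls but a longer delay before new lines and new files are noticed.

```conf
input {
  file {
    path => "/var/log/app/*.log"   # hypothetical path
    # Check watched files for modifications every 5 seconds instead of every 1.
    stat_interval => 5
    # Re-expand the path patterns every 60 seconds instead of every 15.
    discover_interval => 60
  }
}
```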
tags
- Value type is array
- There is no default value for this setting.
Add any number of arbitrary tags to your event.
This can help with processing later.
type
- Value type is string
- There is no default value for this setting.
Add a type field to all events handled by this input.
Types are used mainly for filter activation.
The type is stored as part of the event itself, so you can also use the type to search for it in Kibana.
If you try to set a type on an event that already has one (for example when you send an event from a shipper to an indexer) then a new input will not override the existing type. A type set at the shipper stays with that event for its life even when sent to another Logstash server.
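Filter activation by type might look like the sketch below, where a conditional applies a grok filter only to events carrying a given type. The path and the type name "apache_access" are hypothetical, and the grok filter with the COMBINEDAPACHELOG pattern is just one possible filter to place inside the conditional.

```conf
input {
  file {
    path => "/var/log/apache2/access.log"   # hypothetical path
    type => "apache_access"                 # hypothetical type name
  }
}

filter {
  # Only events whose type field matches are passed to this filter.
  if [type] == "apache_access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}
```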