- Filebeat Reference: other versions:
- Overview
- Getting Started With Filebeat
- Setting up and running Filebeat
- Upgrading Filebeat
- How Filebeat works
- Configuring Filebeat
- Specify which modules to run
- Configure inputs
- Manage multiline messages
- Specify general settings
- Load external configuration files
- Configure the internal queue
- Configure the output
- Configure index lifecycle management
- Load balance the output hosts
- Specify SSL settings
- Filter and enhance the exported data
- Define processors
- Add cloud metadata
- Add fields
- Add labels
- Add the local time zone
- Add tags
- Decode CEF
- Decode CSV fields
- Decode JSON fields
- Decode Base64 fields
- Decompress gzip fields
- Community ID Network Flow Hash
- Convert
- Drop events
- Drop fields from events
- Extract array
- Keep fields from events
- Registered Domain
- Rename fields from events
- Add Kubernetes metadata
- Add Docker metadata
- Add Host metadata
- Add Observer metadata
- Dissect strings
- DNS Reverse Lookup
- Add process metadata
- Script Processor
- Timestamp
- Parse data by using ingest node
- Enrich events with geoIP information
- Configure project paths
- Configure the Kibana endpoint
- Load the Kibana dashboards
- Load the Elasticsearch index template
- Configure logging
- Use environment variables in the configuration
- Autodiscover
- YAML tips and gotchas
- Regular expression support
- HTTP Endpoint
- filebeat.reference.yml
- Beats central management
- Modules
- Modules overview
- Apache module
- Auditd module
- AWS module
- CEF module
- Cisco module
- Coredns Module
- Elasticsearch module
- Envoyproxy Module
- Google Cloud module
- haproxy module
- IBM MQ module
- Icinga module
- IIS module
- Iptables module
- Kafka module
- Kibana module
- Logstash module
- MongoDB module
- MSSQL module
- MySQL module
- nats module
- NetFlow module
- Nginx module
- Osquery module
- Palo Alto Networks module
- PostgreSQL module
- RabbitMQ module
- Redis module
- Santa module
- Suricata module
- System module
- Traefik module
- Zeek (Bro) Module
- Exported fields
- Apache fields
- Auditd fields
- AWS fields
- Beat fields
- Decode CEF processor fields fields
- CEF fields
- Cisco fields
- Cloud provider metadata fields
- Coredns fields
- Docker fields
- ECS fields
- elasticsearch fields
- Envoyproxy fields
- Google Cloud fields
- haproxy fields
- Host fields
- ibmmq fields
- Icinga fields
- IIS fields
- iptables fields
- Jolokia Discovery autodiscover provider fields
- Kafka fields
- kibana fields
- Kubernetes fields
- Log file content fields
- logstash fields
- mongodb fields
- mssql fields
- MySQL fields
- nats fields
- NetFlow fields
- NetFlow fields
- Nginx fields
- Osquery fields
- panw fields
- PostgreSQL fields
- Process fields
- RabbitMQ fields
- Redis fields
- s3 fields
- Google Santa fields
- Suricata fields
- System fields
- Traefik fields
- Zeek fields
- Monitoring Filebeat
- Securing Filebeat
- Troubleshooting
- Get help
- Debug
- Common problems
- Can’t read log files from network volumes
- Filebeat isn’t collecting lines from a file
- Too many open file handlers
- Registry file is too large
- Inode reuse causes Filebeat to skip lines
- Log rotation results in lost or duplicate events
- Open file handlers cause issues with Windows file rotation
- Filebeat is using too much CPU
- Dashboard in Kibana is breaking up data fields incorrectly
- Fields are not indexed or usable in Kibana visualizations
- Filebeat isn’t shipping the last line of a file
- Filebeat keeps open file handlers of deleted files for a long time
- Filebeat uses too much bandwidth
- Error loading config file
- Found unexpected or unknown characters
- Logstash connection doesn’t work
- @metadata is missing in Logstash
- Not sure whether to use Logstash or Beats
- SSL client fails to connect to Logstash
- Monitoring UI shows fewer Beats than expected
- Contributing to Beats
s3 input
edits3 input
editThis functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Use the s3
input to retrieve logs from S3 objects that are pointed by messages
from specific SQS queues. This input can, for example, be used to receive S3
server access logs to monitor detailed records for the requests that are made to
a bucket.
When processing a s3 object which pointed by a sqs message, if half of the set visibility timeout passed and the processing is still ongoing, then the visibility timeout of that sqs message will be reset to make sure the message does not go back to the queue in the middle of the processing. If there are errors happening during the processing of the s3 object, then the process will be stopped and the sqs message will be returned back to the queue.
filebeat.inputs: - type: s3 queue_url: https://sqs.ap-southeast-1.amazonaws.com/1234/test-s3-queue access_key_id: my-access-key secret_access_key: my-secret-access-key
The s3
input supports the following configuration options plus the
Common options described later.
queue_url
editURL of the AWS SQS queue that messages will be received from. Required.
visibility_timeout
editThe duration (in seconds) that the received messages are hidden from subsequent retrieve requests after being retrieved by a ReceiveMessage request. This value needs to be a lot bigger than filebeat collection frequency so if it took too long to read the s3 log, this sqs message will not be reprocessed. The default visibility timeout for a message is 300 seconds. The minimum is 0 seconds. The maximum is 12 hours.
aws credentials
editIn order to make AWS API calls, s3
input requires AWS credentials.Please see
AWS credentials options for more details.
AWS Permissions
editSpecific AWS permissions are required for IAM user to access SQS and S3:
s3:GetObject sqs:ReceiveMessage sqs:ChangeMessageVisibility sqs:DeleteMessage
S3 and SQS setup
editEnable bucket notification: any new object creation in S3 bucket will also create a notification through SQS. Please see create-sqs-queue-for-notification for more details.
Parallel Processing
editMultiple Filebeat instances can read from the same SQS queues at the same time. To horizontally scale processing when there are large amounts of log data flowing into an S3 bucket, you can run multiple Filebeat instances that read from the same SQS queues at the same time. No additional configuration is required.
Using SQS ensures that each message in the queue is processed only once even when multiple Filebeat instances are running in parallel. To prevent Filebeat from receiving and processing the message more than once, set the visibility timeout.
The visibility timeout begins when SQS returns a message to Filebeat. During this time, Filebeat processes and deletes the message. However, if Filebeat fails before deleting the message and your system doesn’t call the DeleteMessage action for that message before the visibility timeout expires, the message becomes visible to other Filebeat instances, and the message is received again. By default, the visibility timeout is set to 5 minutes for s3 input in Filebeat. 5 minutes is sufficient time for Filebeat to read SQS messages and process related s3 log files.
Common options
editThe following configuration options are supported by all inputs.
enabled
editUse the enabled
option to enable and disable inputs. By default, enabled is
set to true.
tags
editA list of tags that Filebeat includes in the tags
field of each published
event. Tags make it easy to select specific events in Kibana or apply
conditional filtering in Logstash. These tags will be appended to the list of
tags specified in the general configuration.
Example:
filebeat.inputs: - type: s3 . . . tags: ["json"]
fields
editOptional fields that you can specify to add additional information to the
output. For example, you might add fields that you can use for filtering log
data. Fields can be scalar values, arrays, dictionaries, or any nested
combination of these. By default, the fields that you specify here will be
grouped under a fields
sub-dictionary in the output document. To store the
custom fields as top-level fields, set the fields_under_root
option to true.
If a duplicate field is declared in the general configuration, then its value
will be overwritten by the value declared here.
filebeat.inputs: - type: s3 . . . fields: app_id: query_engine_12
fields_under_root
editIf this option is set to true, the custom
fields are stored as top-level fields in
the output document instead of being grouped under a fields
sub-dictionary. If
the custom field names conflict with other field names added by Filebeat,
then the custom fields overwrite the other fields.
processors
editA list of processors to apply to the input data.
See Filter and enhance the exported data for information about specifying processors in your config.
pipeline
editThe Ingest Node pipeline ID to set for the events generated by this input.
The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.
AWS Credentials Configuration
editTo configure AWS credentials, either put the credentials into the Filebeat configuration, or use a shared credentials file, as shown in the following examples.
Supported Formats
edit-
Use
AWS_ACCESS_KEY_ID
,AWS_SECRET_ACCESS_KEY
and/orAWS_SESSION_TOKEN
Users can either put the credentials into metricbeat module configuration or use
environment variable AWS_ACCESS_KEY_ID
, AWS_SECRET_ACCESS_KEY
and/or
AWS_SESSION_TOKEN
instead.
-
Use AWS credentials in Filebeat configuration
filebeat.inputs: - type: s3 queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue access_key_id: '<access_key_id>' secret_access_key: '<secret_access_key>' session_token: '<session_token>'
or
filebeat.inputs: - type: s3 queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue access_key_id: '${AWS_ACCESS_KEY_ID:""}' secret_access_key: '${AWS_SECRET_ACCESS_KEY:""}' session_token: '${AWS_SESSION_TOKEN:""}'
-
Use shared AWS credentials file
filebeat.inputs: - type: s3 queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue credential_profile_name: test-fb
credential_profile_name
is optional. If there is no credential_profile_name
given, the default profile will be used.
In Windows, shared credentials file is at C:\Users\<yourUserName>\.aws\credentials
.
For Linux, macOS or Unix, the file is located at ~/.aws/credentials
. Please see
Create Shared Credentials File
for more details.
AWS Credentials Types
editThere are two different types of AWS credentials can be used: access keys and temporary security credentials.
- Access keys
AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
are the two parts of access keys.
They are long-term credentials for an IAM user or the AWS account root user.
Please see
AWS Access Keys
and Secret Access Keys
for more details.
- Temporary security credentials
temporary security credentials has a limited lifetime and consists of an
access key ID, a secret access key, and a security token which typically returned
from GetSessionToken
. MFA-enabled IAM users would need to submit an MFA code
while calling GetSessionToken
. default_region
identifies the AWS Region
whose servers you want to send your first API request to by default. This is
typically the Region closest to you, but it can be any Region. Please see
Temporary Security Credentials
for more details.
sts get-session-token
AWS CLI can be used to generate temporary credentials. For example. with MFA-enabled:
aws> sts get-session-token --serial-number arn:aws:iam::1234:mfa/your-email@example.com --token-code 456789 --duration-seconds 129600
Because temporary security credentials are short term, after they expire, the user needs to generate new ones and modify the aws.yml config file with the new credentials. Unless live reloading feature is enabled for Metricbeat, the user needs to manually restart Metricbeat after updating the config file in order to continue collecting Cloudwatch metrics. This will cause data loss if the config file is not updated with new credentials before the old ones expire. For Metricbeat, we recommend users to use access keys in config file to enable aws module making AWS api calls without have to generate new temporary credentials and update the config frequently.
IAM policy is an entity that defines permissions to an object within your AWS environment. Specific permissions needs to be added into the IAM user’s policy to authorize Metricbeat to collect AWS monitoring metrics. Please see documentation under each metricset for required permissions.
On this page