AWS S3 input
Use the aws-s3 input to retrieve logs from S3 objects, either those pointed to by S3 notification events read from an SQS queue or those found by directly polling the list of objects in an S3 bucket.
SQS notification is the preferred method: polling the list of S3 objects is expensive in terms of performance and cost, and should be used only when no SQS notification can be attached to the S3 bucket. This input can, for example, be used to receive S3 access logs to monitor detailed records of the requests made to a bucket. This input also supports S3 notifications delivered from SNS to SQS.
The SQS notification method is enabled by setting the queue_url configuration value.
The S3 bucket list polling method is enabled by setting the bucket_arn configuration value.
The two values cannot be set at the same time, and at least one of them must be set.
When using the SQS notification method, this input depends on S3 notifications delivered to an SQS queue for s3:ObjectCreated:* events. You must create an SQS queue and configure S3 to publish events to the queue.
When processing an S3 object pointed to by an SQS message, if half of the configured visibility timeout has passed and processing is still ongoing, the visibility timeout of that SQS message is reset so that the message does not return to the queue in the middle of processing. If an error occurs while processing the S3 object, processing stops and the SQS message is returned to the queue.
    filebeat.inputs:
    - type: aws-s3
      queue_url: https://sqs.ap-southeast-1.amazonaws.com/1234/test-s3-queue
      credential_profile_name: elastic-beats
      expand_event_list_from_field: Records
When directly polling the list of S3 objects in an S3 bucket, the number of workers that process the listed S3 objects must be set through the number_of_workers option.
The bucket listing is polled at the interval defined by the bucket_list_interval option. The default value is 120s.
    filebeat.inputs:
    - type: aws-s3
      bucket_arn: arn:aws:s3:::test-s3-bucket
      number_of_workers: 5
      bucket_list_interval: 300s
      credential_profile_name: elastic-beats
      expand_event_list_from_field: Records
The aws-s3 input supports the following configuration options plus the Common options described later.
api_timeout
The maximum duration of an AWS API call. If a call exceeds the timeout, it is interrupted. The default AWS API timeout is 120s.
The API timeout must be longer than the sqs.wait_time value.
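As a minimal sketch of how the two timeouts relate, assuming an SQS-based input: the queue URL below is a placeholder, and the values are simply the defaults stated in this document. Any api_timeout longer than sqs.wait_time would satisfy the constraint.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL, not a real queue.
  queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
  # How long a ReceiveMessage call waits for a message (default and maximum: 20s).
  sqs.wait_time: 20s
  # Must be longer than sqs.wait_time (default: 120s).
  api_timeout: 120s
```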
buffer_size
The size in bytes of the buffer that each harvester uses when fetching a file. This only applies to non-JSON logs. The default is 16 KiB.
content_type
A standard MIME type describing the format of the object data. This can be set to override the MIME type that was given to the object when it was uploaded. For example: application/json.
encoding
The file encoding to use for reading data that contains international characters. This only applies to non-JSON logs. See encoding.
expand_event_list_from_field
If the fileset using this input expects to receive multiple messages bundled under a specific field, set expand_event_list_from_field to the name of that field. The input then splits the messages grouped under that field into separate events. For example, CloudTrail logs are in JSON format and events are found under the JSON object "Records".
    {
        "Records": [
            {
                "eventVersion": "1.07",
                "eventTime": "2019-11-14T00:51:00Z",
                "awsRegion": "us-east-1",
                "eventID": "EXAMPLE8-9621-4d00-b913-beca2EXAMPLE"
            },
            {
                "eventVersion": "1.07",
                "eventTime": "2019-11-14T00:52:00Z",
                "awsRegion": "us-east-1",
                "eventID": "EXAMPLEc-28be-486c-8928-49ce6EXAMPLE"
            }
        ]
    }
Note: When the expand_event_list_from_field parameter is given in the config, the aws-s3 input assumes the logs are in JSON format and decodes them as JSON. The content type is not checked. If a file has the "application/json" content type, expand_event_list_from_field is required to read the JSON file.
file_selectors
If the SQS queue will have events that correspond to files that Filebeat shouldn’t process, file_selectors can be used to limit the files that are downloaded. This is a list of selectors, each made up of regex and expand_event_list_from_field options. The regex should match the S3 object key in the SQS message, and the optional expand_event_list_from_field is the same as the global setting. If file_selectors is given, any global expand_event_list_from_field value is ignored in favor of the ones specified in file_selectors. Regex syntax is the same as in the Go language. Files that don’t match any of the regexes are not processed. content_type, parsers, include_s3_metadata, max_bytes, buffer_size, and encoding may also be set for each file selector.
    file_selectors:
      - regex: '/CloudTrail/'
        expand_event_list_from_field: 'Records'
      - regex: '/CloudTrail-Digest/'
      - regex: '/CloudTrail-Insight/'
        expand_event_list_from_field: 'Records'
fips_enabled
Enabling this option changes the service name from s3 to s3-fips for connecting to the correct service endpoint. For example: s3-fips.us-gov-east-1.amazonaws.com.
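A short sketch of enabling the option, assuming an SQS-based input in a GovCloud region. The queue URL is a placeholder chosen only to match the endpoint example above.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL, used only for illustration.
  queue_url: https://sqs.us-gov-east-1.amazonaws.com/123/test-queue
  # Use the s3-fips service name instead of s3 when building the endpoint.
  fips_enabled: true
```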
include_s3_metadata
This input can include S3 object metadata in the generated events for use in follow-on processing. You must specify the list of keys to include. By default none are included. If the key exists in the S3 response, it is included in the event as aws.s3.metadata.<key>, where the key name has been normalized to all lowercase.
    include_s3_metadata:
      - last-modified
      - x-amz-version-id
max_bytes
The maximum number of bytes that a single log message can have. All bytes after max_bytes are discarded and not sent. This setting is especially useful for multiline log messages, which can get large. This only applies to non-JSON logs. The default is 10 MiB.
max_number_of_messages
The maximum number of SQS messages that can be inflight at any time. Defaults to 5.
parsers
This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
This option expects a list of parsers that non-JSON logs go through.
Available parsers:
- multiline
In this example, Filebeat is reading multiline messages that consist of XML and start with the <Event> tag.
    filebeat.inputs:
    - type: aws-s3
      ...
      parsers:
        - multiline:
            pattern: "^<Event"
            negate: true
            match: after
See the available parser settings in detail below.
multiline
This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Options that control how Filebeat deals with log messages that span multiple lines. See Multiline messages for more information about configuring multiline options.
queue_url
URL of the AWS SQS queue that messages will be received from. (Required when bucket_arn is not set).
visibility_timeout
The duration that the received SQS messages are hidden from subsequent retrieve requests after being retrieved by a ReceiveMessage request. The default visibility timeout is 300s. The maximum is 12h. Filebeat will automatically reset the visibility timeout of a message after 1/2 of the duration passes to prevent a message that is still being processed from returning to the queue.
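For objects that take a long time to process, the timeout can be raised. The following is a sketch only: the queue URL is a placeholder and the 10-minute value is an arbitrary illustration, not a recommendation.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL.
  queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
  # Hide received messages for 10 minutes (default: 300s, maximum: 12h).
  # Filebeat resets the timeout itself once half of it has passed.
  visibility_timeout: 600s
  # At most 5 SQS messages in flight at any time (the documented default).
  max_number_of_messages: 5
```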
sqs.max_receive_count
The maximum number of times an SQS message should be received (retried) before deleting it. This feature prevents poison-pill messages (messages that can be received but can’t be processed) from consuming resources. The number of times a message has been received is tracked using the ApproximateReceiveCount SQS attribute. The default value is 5.
If you have configured a dead letter queue, you can set this value to -1 to disable deletion on failure.
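A sketch of the dead letter queue case described above. It assumes the dead letter queue and its redrive policy are already configured on the SQS side, and the queue URL is a placeholder.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL; assumed to have a dead letter queue attached.
  queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
  # -1 disables deletion on failure, leaving failed messages to SQS
  # and the dead letter queue.
  sqs.max_receive_count: -1
```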
sqs.notification_parsing_script.source
Inline JavaScript source code.
    sqs.notification_parsing_script.source: >
      function parse(notification) {
          // The notification arrives as a raw string, so decode it first.
          var m = JSON.parse(notification);
          var evts = [];
          var evt = new S3EventV2();
          evt.SetS3BucketName(m.bucket);
          evt.SetS3ObjectKey(m.path);
          evts.push(evt);
          return evts;
      }
sqs.notification_parsing_script.file
Path to a script file to load. Relative paths are interpreted as relative to the path.config directory. Globs are expanded.
This loads filter.js from disk.
    sqs.notification_parsing_script.file: ${path.config}/filter.js
sqs.notification_parsing_script.files
List of script files to load. The scripts are concatenated together. Relative paths are interpreted as relative to the path.config directory, and globs are expanded.
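A sketch of loading several files, written in the same standalone style as the file example. The directory and file names are hypothetical.

```yaml
sqs.notification_parsing_script.files:
  # Hypothetical file names; both are loaded and concatenated into one script.
  - ${path.config}/parsers/helpers.js
  - ${path.config}/parsers/parse.js
```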
sqs.notification_parsing_script.params
A dictionary of parameters that are passed to the register function of the script.
Parameters can be passed to the script by adding params to the config. This allows for a script to be made reusable. When using params, the code must define a register(params) function to receive the parameters.
    sqs.notification_parsing_script:
      params:
        provider: aws:s3
      source: >
        var params = {provider: ""};
        function register(scriptParams) {
            params = scriptParams;
        }
        function parse(notification) {
            // The notification arrives as a raw string, so decode it first.
            var m = JSON.parse(notification);
            var evts = [];
            var evt = new S3EventV2();
            evt.SetS3BucketName(m.bucket);
            evt.SetS3ObjectKey(m.path);
            evt.SetProvider(params.provider);
            evts.push(evt);
            return evts;
        }
sqs.notification_parsing_script.timeout
This sets an execution timeout for the parse function. When the parse function takes longer than the timeout period, the function is interrupted. You can set this option to prevent a script from running for too long (like preventing an infinite while loop). By default there is no timeout.
sqs.notification_parsing_script.max_cached_sessions
This sets the maximum number of JavaScript VM sessions that will be cached to avoid reallocation.
sqs.wait_time
The maximum duration that an SQS ReceiveMessage call should wait for a message to arrive in the queue before returning. The default value is 20s. The maximum value is 20s.
bucket_arn
ARN of the AWS S3 bucket that will be polled with list operations. (Required when queue_url is not set).
bucket_list_interval
Time interval for polling the listing of the S3 bucket. Defaults to 120s.
bucket_list_prefix
Prefix to apply to the list request to the S3 bucket. Empty by default.
number_of_workers
Number of workers that will process the S3 objects listed. (Required when bucket_arn is set).
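The four polling options can be combined as in the following sketch. The bucket ARN reuses the example from earlier in this document, and the prefix value is hypothetical.

```yaml
filebeat.inputs:
- type: aws-s3
  bucket_arn: arn:aws:s3:::test-s3-bucket
  # Required whenever bucket_arn is set.
  number_of_workers: 5
  # List the bucket every 5 minutes instead of the default 120s.
  bucket_list_interval: 300s
  # Hypothetical prefix applied to the list request.
  bucket_list_prefix: AWSLogs/
```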
aws credentials
To make AWS API calls, the aws-s3 input requires AWS credentials. Please see AWS credentials options for more details.
AWS Permissions
Specific AWS permissions are required for the IAM user to access SQS and S3 when using the SQS notifications method:
    s3:GetObject
    sqs:ReceiveMessage
    sqs:ChangeMessageVisibility
    sqs:DeleteMessage
A reduced set of S3 permissions is required for the IAM user to access S3 when polling the list of S3 bucket objects:
    s3:GetObject
    s3:ListBucket
    s3:GetBucketLocation
S3 and SQS setup
Enable bucket notification: any new object creation in the S3 bucket will also create a notification through SQS. Please see create-sqs-queue-for-notification for more details.
S3 → SNS → SQS setup
If you would like to use the bucket notification in multiple different consumers (other than Filebeat), you should use an SNS topic for the bucket notification. Please see create-SNS-topic-for-notification for more details. The SQS queue will be configured as a subscriber to the SNS topic.
Parallel Processing
When using the SQS notifications method, multiple Filebeat instances can read from the same SQS queues at the same time. To horizontally scale processing when large amounts of log data flow into an S3 bucket, run multiple Filebeat instances that read from the same SQS queues. No additional configuration is required.
Using SQS ensures that each message in the queue is processed only once, even when multiple Filebeat instances are running in parallel. To prevent Filebeat from receiving and processing a message more than once, set the visibility timeout.
The visibility timeout begins when SQS returns a message to Filebeat. During this time, Filebeat processes and deletes the message. However, if Filebeat fails before deleting the message and your system doesn’t call the DeleteMessage action for that message before the visibility timeout expires, the message becomes visible to other Filebeat instances and is received again. By default, the visibility timeout is set to 5 minutes for the aws-s3 input in Filebeat. 5 minutes is sufficient time for Filebeat to read SQS messages and process the related S3 log files.
When using the polling list of S3 bucket objects method, be aware that multiple Filebeat instances can list the same S3 bucket at the same time. Because the state of the ingested S3 objects is persisted (upon processing a single list operation) in the path.data location, and multiple Filebeat instances cannot share the same path.data, this results in repeated ingestion of the same S3 objects.
Therefore, when using the polling list of S3 bucket objects method, scaling should be vertical, with a single bigger Filebeat instance and a higher number_of_workers value.
SQS Custom Notification Parsing Script
Under some circumstances you might want to listen to events that do not follow the standard SQS notification format. To parse them, you can define a custom script that processes them and generates the required list of S3 events used to download the files.
The sqs.notification_parsing_script executes JavaScript code to process an event. It uses a pure Go implementation of ECMAScript 5.1 and has no external dependencies.
It can be configured by embedding JavaScript in your configuration file or by pointing the processor at external file(s). Only one of the options sqs.notification_parsing_script.source, sqs.notification_parsing_script.file, and sqs.notification_parsing_script.files can be set at the same time.
The script requires a parse(notification) function that receives the notification as a raw string and returns a list of S3EventV2 objects. This raw string can then be processed as needed, for example with JSON.parse(n) or with the provided XML helper, new XMLDecoder(n).
If the script defines a test() function, it is invoked when the script is loaded. Any exception thrown causes the processor to fail to load. This can be used to make assertions about the behavior of the script.
    function parse(n) {
        var m = JSON.parse(n);
        var evts = [];
        var files = m.files;
        var bucket = m.bucket;

        // Return no events when the notification has no files or no bucket.
        if (!Array.isArray(files) || (files.length == 0) || bucket == null || bucket == "") {
            return evts;
        }

        files.forEach(function(f){
            var evt = new S3EventV2();
            evt.SetS3BucketName(bucket);
            evt.SetS3ObjectKey(f.path);
            evts.push(evt);
        });

        return evts;
    }

    function test() {
        // parse() expects the raw notification string, so serialize the test input.
        var events = parse(JSON.stringify({bucket: "aBucket", files: [{path: "path/to/file"}]}));
        if (events.length !== 1) {
            throw "expecting one event";
        }
        if (events[0].S3.Bucket.Name !== "aBucket") {
            throw "expected bucket === aBucket";
        }
        if (events[0].S3.Object.Key !== "path/to/file") {
            throw "expected key === path/to/file";
        }
    }
S3EventV2 API
The S3EventV2 object returned by the parse method.
Method | Description |
---|---|
|
Returns a new Example: |
|
Sets the AWS region. Example: |
|
Sets the event name. Example: |
|
Sets the event source. Example: |
|
Sets the bucket name. Example: |
|
Sets the bucket ARN. Example: |
|
Sets the object key. Example: |
To retrieve an S3 object successfully, at least the S3.Object.Key and S3.Bucket.Name properties must be set (using the provided setters). The other properties will be used as metadata in the resulting event when available.
XMLDecoder API
To help with XML decoding, an XMLDecoder class is provided.
Example XML input:
    <catalog>
      <book seq="1">
        <author>William H. Gaddis</author>
        <title>The Recognitions</title>
        <review>One of the great seminal American novels of the 20th century.</review>
      </book>
    </catalog>
Will produce the following output:
    {
      "catalog": {
        "book": {
          "author": "William H. Gaddis",
          "review": "One of the great seminal American novels of the 20th century.",
          "seq": "1",
          "title": "The Recognitions"
        }
      }
    }
Method | Description |
---|---|
|
Returns a new Example: |
|
Causes the Decoder to prepend a hyphen ( Example: |
|
Causes the Decoder to transform all key names to lowercase. Example: |
|
Reads the XML string and returns a map containing the data. Example: |
Metrics
This input exposes metrics under the HTTP monitoring endpoint. These metrics are exposed under the /dataset path. They can be used to observe the activity of the input.
Metric | Description |
---|---|
|
Number of SQS messages received (not necessarily processed fully). |
|
Number of SQS visibility timeout extensions. |
|
Number of SQS messages inflight (gauge). |
|
Number of SQS messages returned to queue (happens on errors implicitly after visibility timeout passes). |
|
Number of SQS messages deleted. |
|
Histogram of the elapsed SQS processing times in nanoseconds (time of receipt to time of delete/return). |
|
Number of S3 objects downloaded. |
|
Number of S3 objects returned by list operations. |
|
Number of S3 objects that matched file_selectors rules. |
|
Number of S3 objects processed that were fully ACKed. |
|
Number of S3 bytes processed. |
|
Number of events created from processing S3 data. |
|
Number of S3 objects inflight (gauge). |
|
Histogram of the elapsed S3 object processing times in nanoseconds (start of download to completion of parsing). |
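The metrics above are only reachable when the HTTP monitoring endpoint is enabled. The following sketch assumes the general Filebeat HTTP endpoint settings http.enabled, http.host, and http.port, which are not described in this section; the host and port values are illustrative.

```yaml
# Top-level Filebeat settings, not part of the aws-s3 input itself.
# Assumed option names; see the HTTP endpoint documentation for details.
http.enabled: true
http.host: localhost
http.port: 5066
```

With settings like these, the input metrics would be expected at the /dataset path of that endpoint, for example http://localhost:5066/dataset.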
Common options
The following configuration options are supported by all inputs.
enabled
Use the enabled option to enable and disable inputs. By default, enabled is set to true.
tags
A list of tags that Filebeat includes in the tags field of each published event. Tags make it easy to select specific events in Kibana or apply conditional filtering in Logstash. These tags will be appended to the list of tags specified in the general configuration.
Example:
    filebeat.inputs:
    - type: aws-s3
      . . .
      tags: ["json"]
fields
Optional fields that you can specify to add additional information to the output. For example, you might add fields that you can use for filtering log data. Fields can be scalar values, arrays, dictionaries, or any nested combination of these. By default, the fields that you specify here will be grouped under a fields sub-dictionary in the output document. To store the custom fields as top-level fields, set the fields_under_root option to true. If a duplicate field is declared in the general configuration, then its value will be overwritten by the value declared here.
    filebeat.inputs:
    - type: aws-s3
      . . .
      fields:
        app_id: query_engine_12
fields_under_root
If this option is set to true, the custom fields are stored as top-level fields in the output document instead of being grouped under a fields sub-dictionary. If the custom field names conflict with other field names added by Filebeat, then the custom fields overwrite the other fields.
processors
A list of processors to apply to the input data.
See Processors for information about specifying processors in your config.
pipeline
The ingest pipeline ID to set for the events generated by this input.
The pipeline ID can also be configured in the Elasticsearch output, but this option usually results in simpler configuration files. If the pipeline is configured both in the input and output, the option from the input is used.
keep_null
If this option is set to true, fields with null values will be published in the output document. By default, keep_null is set to false.
index
If present, this formatted string overrides the index for events from this input (for elasticsearch outputs), or sets the raw_index field of the event’s metadata (for other outputs). This string can only refer to the agent name and version and the event timestamp; for access to dynamic fields, use output.elasticsearch.index or a processor.
Example value: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}" might expand to "filebeat-myindex-2019.11.01".
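Placed in an input definition, the example value above could look like the following sketch. The queue URL is a placeholder.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL.
  queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
  # Only the agent name, agent version, and event timestamp may be referenced.
  index: "%{[agent.name]}-myindex-%{+yyyy.MM.dd}"
```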
publisher_pipeline.disable_host
By default, all events contain host.name. This option can be set to true to disable the addition of this field to all events. The default value is false.
AWS Credentials Configuration
To configure AWS credentials, either put the credentials into the Filebeat configuration, or use a shared credentials file, as shown in the following examples.
Configuration parameters
- access_key_id: first part of access key.
- secret_access_key: second part of access key.
- session_token: required when using temporary security credentials.
- credential_profile_name: profile name in shared credentials file.
- shared_credential_file: directory of the shared credentials file.
- role_arn: AWS IAM Role to assume.
- endpoint: URL of the entry point for an AWS web service. Most AWS services offer a regional endpoint that can be used to make requests. The general syntax of a regional endpoint is protocol://service-code.region-code.endpoint-code. Some services, such as IAM, do not support regions. The endpoints for these services do not include a region. In the aws module, the endpoint config sets the endpoint-code part, such as amazonaws.com, amazonaws.com.cn, c2s.ic.gov, sc2s.sgov.gov.
- proxy_url: URL of the proxy to use to connect to AWS web services. The syntax is http(s)://<IP/Hostname>:<port>
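As a sketch of the last two parameters, assuming an input that reads from a queue in an AWS China region through a proxy. The queue URL, proxy host, and port are all placeholders.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL in a region that uses a non-default endpoint code.
  queue_url: https://sqs.cn-north-1.amazonaws.com.cn/123/test-queue
  # Only the endpoint-code part is set here, not a full URL.
  endpoint: amazonaws.com.cn
  # Hypothetical proxy, following the http(s)://<IP/Hostname>:<port> syntax.
  proxy_url: http://proxy.example.com:8080
```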
Supported Formats
The examples in this section refer to Metricbeat, but the credential options for authentication with AWS are the same no matter which Beat is being used.
- Use access_key_id, secret_access_key, and/or session_token
Users can either put the credentials into the Metricbeat module configuration or use the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and/or AWS_SESSION_TOKEN instead.
If running on Docker, these environment variables should be added as a part of the docker command. For example, with Metricbeat:
    $ docker run -e AWS_ACCESS_KEY_ID=abcd -e AWS_SECRET_ACCESS_KEY=abcd -d --name=metricbeat --user=root --volume="$(pwd)/metricbeat.aws.yml:/usr/share/metricbeat/metricbeat.yml:ro" docker.elastic.co/beats/metricbeat:7.11.1 metricbeat -e -E cloud.auth=elastic:1234 -E cloud.id=test-aws:1234
Sample metricbeat.aws.yml looks like:
    metricbeat.modules:
    - module: aws
      period: 5m
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}
      session_token: ${AWS_SESSION_TOKEN}
      metricsets:
        - ec2
Environment variables can also be added through a file. For example:
    $ cat env.list
    AWS_ACCESS_KEY_ID=abcd
    AWS_SECRET_ACCESS_KEY=abcd

    $ docker run --env-file env.list -d --name=metricbeat --user=root --volume="$(pwd)/metricbeat.aws.yml:/usr/share/metricbeat/metricbeat.yml:ro" docker.elastic.co/beats/metricbeat:7.11.1 metricbeat -e -E cloud.auth=elastic:1234 -E cloud.id=test-aws:1234
- Use credential_profile_name and/or shared_credential_file
If none of access_key_id, secret_access_key, and role_arn is given, Filebeat checks for credential_profile_name. If you use different credentials for different tools or applications, you can use profiles to configure multiple access keys in the same configuration file. If no credential_profile_name is given, the default profile is used.
shared_credential_file optionally specifies the directory of your shared credentials file. If it’s empty, the default directory is used. On Windows, the shared credentials file is at C:\Users\<yourUserName>\.aws\credentials. On Linux, macOS, or Unix, the file is located at ~/.aws/credentials. When running as a service, the home path depends on the user that manages the service, so the shared_credential_file parameter can be used to avoid ambiguity. Please see Create Shared Credentials File for more details.
- Use role_arn
role_arn specifies which AWS IAM role to assume for generating temporary credentials. If role_arn is given, Filebeat checks whether access keys are given. If not, Filebeat checks for a credential profile name. If neither is given, the default credential profile is used. Please make sure credentials are given under either a credential profile or access keys.
If running on Docker, the credential file needs to be provided via a volume mount. For example, with Metricbeat:
    docker run -d --name=metricbeat --user=root --volume="$(pwd)/metricbeat.aws.yml:/usr/share/metricbeat/metricbeat.yml:ro" --volume="/Users/foo/.aws/credentials:/usr/share/metricbeat/credentials:ro" docker.elastic.co/beats/metricbeat:7.11.1 metricbeat -e -E cloud.auth=elastic:1234 -E cloud.id=test-aws:1234
Sample metricbeat.aws.yml looks like:
    metricbeat.modules:
    - module: aws
      period: 5m
      credential_profile_name: elastic-beats
      shared_credential_file: /usr/share/metricbeat/credentials
      metricsets:
        - ec2
- Use AWS credentials in Filebeat configuration
    filebeat.inputs:
    - type: aws-s3
      queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
      access_key_id: '<access_key_id>'
      secret_access_key: '<secret_access_key>'
      session_token: '<session_token>'
or
    filebeat.inputs:
    - type: aws-s3
      queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
      access_key_id: '${AWS_ACCESS_KEY_ID:""}'
      secret_access_key: '${AWS_SECRET_ACCESS_KEY:""}'
      session_token: '${AWS_SESSION_TOKEN:""}'
- Use IAM role ARN
    filebeat.inputs:
    - type: aws-s3
      queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
      role_arn: arn:aws:iam::123456789012:role/test-mb
- Use shared AWS credentials file
    filebeat.inputs:
    - type: aws-s3
      queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
      credential_profile_name: test-fb
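The role_arn description above notes that assuming a role still needs base credentials from access keys or a profile. A sketch of combining the two, reusing the placeholder values from the preceding examples; the shared_credential_file path is hypothetical.

```yaml
filebeat.inputs:
- type: aws-s3
  # Placeholder queue URL and account ID.
  queue_url: https://sqs.us-east-1.amazonaws.com/123/test-queue
  # Role assumed to generate temporary credentials.
  role_arn: arn:aws:iam::123456789012:role/test-mb
  # Profile that supplies the credentials used to assume the role.
  credential_profile_name: test-fb
  # Hypothetical path; setting it avoids ambiguity when Filebeat runs as a service.
  shared_credential_file: /etc/filebeat/aws-credentials
```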
AWS Credentials Types
There are two different types of AWS credentials that can be used: access keys and temporary security credentials.
- Access keys
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the two parts of access keys. They are long-term credentials for an IAM user or the AWS account root user. Please see AWS Access Keys and Secret Access Keys for more details.
- IAM role ARN
An IAM role is an IAM identity that you can create in your account that has specific permissions that determine what the identity can and cannot do in AWS. A role does not have standard long-term credentials such as a password or access keys associated with it. Instead, when you assume a role, it provides you with temporary security credentials for your role session. The IAM role Amazon Resource Name (ARN) can be used to specify which AWS IAM role to assume to generate temporary credentials. Please see the AssumeRole API documentation for more details.
Here are the steps to set up an IAM role using the AWS CLI for Metricbeat. Please replace 123456789012 with your own account ID.
Step 1. Create the example-policy.json file to include all permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "sqs:ReceiveMessage"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "sqs:ChangeMessageVisibility",
            "Resource": "arn:aws:sqs:us-east-1:123456789012:test-fb-ks"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": "sqs:DeleteMessage",
            "Resource": "arn:aws:sqs:us-east-1:123456789012:test-fb-ks"
        },
        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "sts:AssumeRole",
                "sqs:ListQueues",
                "tag:GetResources",
                "ec2:DescribeInstances",
                "cloudwatch:GetMetricData",
                "ec2:DescribeRegions",
                "iam:ListAccountAliases",
                "sts:GetCallerIdentity",
                "cloudwatch:ListMetrics"
            ],
            "Resource": "*"
        }
    ]
}
Step 2. Create the IAM policy using the aws iam create-policy command:
$ aws iam create-policy --policy-name example-policy --policy-document file://example-policy.json
Step 3. Create the JSON file example-role-trust-policy.json that defines the trust relationship of the IAM role:
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Principal": { "AWS": "arn:aws:iam::123456789012:root" },
        "Action": "sts:AssumeRole"
    }
}
Step 4. Create the IAM role and attach the policy:
$ aws iam create-role --role-name example-role --assume-role-policy-document file://example-role-trust-policy.json
$ aws iam attach-role-policy --role-name example-role --policy-arn "arn:aws:iam::123456789012:policy/example-policy"
After these steps are done, the IAM role ARN can be used for authentication in the Metricbeat aws module.
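The account-specific values in the steps above can also be generated instead of edited by hand. The following is a minimal Python sketch under stated assumptions, not part of Beats or the AWS CLI: the helper names build_trust_policy, policy_arn, and role_arn are hypothetical, and the code only reproduces the trust policy document and ARN strings shown in the steps for a given account ID.

```python
import json


def build_trust_policy(account_id: str) -> dict:
    """Return a trust policy letting principals in the account assume the role.

    Hypothetical helper: mirrors example-role-trust-policy.json from Step 3.
    """
    # AWS account IDs are 12 ASCII digits; reject anything else early.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account ID must be exactly 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "sts:AssumeRole",
        },
    }


def policy_arn(account_id: str, policy_name: str) -> str:
    """Build the ARN passed to `aws iam attach-role-policy --policy-arn`."""
    return f"arn:aws:iam::{account_id}:policy/{policy_name}"


def role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN used in the role_arn configuration setting."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


if __name__ == "__main__":
    account = "123456789012"  # replace with your own account ID
    print(json.dumps(build_trust_policy(account), indent=4))
    print(policy_arn(account, "example-policy"))
    print(role_arn(account, "example-role"))
```

The printed trust policy can be saved as example-role-trust-policy.json, and the last line printed has the same form as the value given to role_arn in the configuration examples.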
- Temporary security credentials
Temporary security credentials have a limited lifetime and consist of an access key ID, a secret access key, and a security token, which is typically returned from GetSessionToken. MFA-enabled IAM users need to submit an MFA code while calling GetSessionToken. default_region identifies the AWS Region whose servers you want to send your first API request to by default. This is typically the Region closest to you, but it can be any Region. Please see Temporary Security Credentials for more details.
The sts get-session-token AWS CLI command can be used to generate temporary credentials. For example, with MFA enabled:
aws> sts get-session-token --serial-number arn:aws:iam::1234:mfa/your-email@example.com --token-code 456789 --duration-seconds 129600
Because temporary security credentials are short term, the user needs to generate new ones after they expire and update the aws.yml config file with the new credentials. Unless the live reloading feature is enabled for Metricbeat, the user needs to manually restart Metricbeat after updating the config file in order to continue collecting CloudWatch metrics. Data will be lost if the config file is not updated with new credentials before the old ones expire. For Metricbeat, we recommend using access keys in the config file so that the aws module can make AWS API calls without having to generate new temporary credentials and update the config frequently.
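To avoid being caught out by expiring credentials, the expiry time can be checked ahead of time. The following is a minimal Python sketch, assuming the JSON printed by get-session-token contains a Credentials object with an Expiration timestamp in ISO 8601 form that includes a UTC offset or a trailing Z. The function names and the 15-minute margin are illustrative assumptions, not Beats settings.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_expiration(sts_output: str) -> datetime:
    """Extract the expiry time from get-session-token JSON output.

    Assumes the shape {"Credentials": {"Expiration": "<ISO 8601>"}}.
    """
    stamp = json.loads(sts_output)["Credentials"]["Expiration"]
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so normalize it to an explicit UTC offset first.
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"
    return datetime.fromisoformat(stamp)


def needs_refresh(expiration, now=None, margin=timedelta(minutes=15)) -> bool:
    """Return True when the credentials expire within the safety margin."""
    now = now or datetime.now(timezone.utc)
    return expiration - now <= margin


if __name__ == "__main__":
    # Placeholder values only; real output comes from the AWS CLI.
    sample = json.dumps({
        "Credentials": {
            "AccessKeyId": "<access_key_id>",
            "SecretAccessKey": "<secret_access_key>",
            "SessionToken": "<session_token>",
            "Expiration": "2030-01-01T12:00:00Z",
        }
    })
    expires_at = parse_expiration(sample)
    print("expires at:", expires_at.isoformat())
    print("refresh now:", needs_refresh(expires_at))
```

A check like this could run on a schedule and prompt for new credentials before the old ones expire, which is the window in which the config file has to be updated to avoid a gap in collection.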
An IAM policy is an entity that defines permissions to an object within your AWS environment. Specific permissions need to be added to the IAM user’s policy to authorize Metricbeat to collect AWS monitoring metrics. Please see the documentation under each metricset for the required permissions.