Parse data using ingest node pipelines
You can configure APM Server to use an ingest node to pre-process documents before indexing them in Elasticsearch. A pipeline definition specifies the series of pipelines or processors that will transform each document in a specific way. For example, a pipeline might define one processor that removes a field, followed by another that renames a field.
Pipelines can be used to ensure data security by removing or obfuscating sensitive information. See data security for an example.
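To make the remove-then-rename example concrete, here is a minimal Python sketch. The field names (http.request.cookies, labels.env, labels.environment) and the apply_pipeline helper are hypothetical and chosen only for illustration. The helper is a toy local emulation on a flat dictionary, not how Elasticsearch executes processors; it only shows that processors run in the order they are listed.

```python
import json

# Hypothetical pipeline definition: the field names are illustrative only
# and are not taken from APM Server's default pipeline.
pipeline = {
    "description": "Remove one field, then rename another",
    "processors": [
        {"remove": {"field": "http.request.cookies"}},
        {"rename": {"field": "labels.env", "target_field": "labels.environment"}},
    ],
}


def apply_pipeline(doc, definition):
    """Toy emulation of the remove and rename processors on a flat dict.

    Assumption: documents are flat dicts keyed by dotted field names.
    Elasticsearch itself resolves dotted paths into nested objects.
    """
    result = dict(doc)  # work on a copy so the input stays untouched
    for processor in definition["processors"]:
        # Each processor entry is a single-key dict: {name: options}
        (name, options), = processor.items()
        if name == "remove":
            del result[options["field"]]
        elif name == "rename":
            result[options["target_field"]] = result.pop(options["field"])
        else:
            raise ValueError(f"unsupported processor in this sketch: {name}")
    return result


sample = {
    "http.request.cookies": "session=abc",
    "labels.env": "prod",
    "service.name": "checkout",
}
transformed = apply_pipeline(sample, pipeline)

print(json.dumps(pipeline, indent=2))  # the JSON body you would register
print(transformed)
```

The printed JSON has the same shape as a pipeline definition you would register in Elasticsearch; the transformed dictionary is only there to show the effect of each step.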
Default ingest pipeline
By default, APM Server registers the apm pipeline definition to Elasticsearch on startup.
The apm pipeline defines the following inner pipelines:
|
Adds |
|
Enriches Elastic’s JavaScript RUM Agent data by
adding user Geo-IP data to the |
|
Adds an ingest timestamp for APM events. |
|
Added in 7.7. Upgrading? See upgrading to 7.7.
To save storage, removes metadata fields. In previous versions of APM Server, this functionality was hardcoded internally. Switching metadata cleanup from an internal process to a processor allows you to keep any span metadata that is important in your architecture. |
See the complete pipeline definition by navigating to the APM Server’s home directory and then viewing ingest/pipeline/definition.json.
To disable this, or any other pipeline, set output.elasticsearch.pipeline: _none.
Custom pipelines
Using custom pipelines involves two steps:
- First, you need to register a pipeline in Elasticsearch.
- Then, the pipeline needs to be applied during data ingestion.
Register pipelines in Elasticsearch
To register a pipeline in Elasticsearch, you can either configure APM Server to register pipelines on startup, or you can manually upload a pipeline definition.
Register pipelines on APM Server startup
Automatic pipeline registration requires output.elasticsearch to be enabled and configured.
Navigate to APM Server’s home directory and find the default pipeline configuration at ingest/pipeline/definition.json.
To add, change, or remove pipelines in Elasticsearch, change the definitions in this file, and then either restart APM Server or run apm-server setup --pipelines.
By default, pipeline registration is enabled.
Manually upload pipeline definitions
You can manually upload pipeline definitions by describing them in a file.
Consider the following sample pipeline in a file named pipeline.json.
This pipeline definition converts the value of beat.name to lowercase before indexing each document.
{
  "description": "Test pipeline",
  "processors": [
    {
      "lowercase": {
        "field": "beat.name"
      }
    }
  ]
}
To register this pipeline, run:
curl -H 'Content-Type: application/json' -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d @pipeline.json
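For readers who would rather script the upload, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not part of APM Server: the helper name build_register_request is hypothetical, and the base URL assumes an unsecured Elasticsearch node on localhost:9200, as in the curl command above. The block only builds the request; the lines that actually send it are commented out because they need a reachable cluster.

```python
import json
import urllib.request


def build_register_request(pipeline_id, definition, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT request that registers an ingest pipeline.

    Hypothetical helper: it mirrors the curl command, targeting
    <base_url>/_ingest/pipeline/<pipeline_id> with a JSON body.
    """
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_ingest/pipeline/{pipeline_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Same definition as the sample pipeline.json file
definition = {
    "description": "Test pipeline",
    "processors": [{"lowercase": {"field": "beat.name"}}],
}

request = build_register_request("test-pipeline", definition)
print(request.get_method(), request.full_url)

# Sending requires a running Elasticsearch node, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

A secured cluster would also need credentials and TLS settings, which this sketch leaves out.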
Apply pipelines during data ingestion
To specify which pipelines to apply during data ingestion, add the pipeline IDs to the pipelines option under output.elasticsearch in the apm-server.yml file:
output.elasticsearch:
  pipelines:
    - pipeline: "test-pipeline"
More information and examples for applying pipelines are available in the Elasticsearch output pipeline documentation.