Custom filters
Custom filters, including ingest pipeline filters and APM agent filters, allow you to filter or redact APM data on ingestion.
Ingest pipeline filters

Ingest pipelines specify a series of processors that transform data in a specific way. Transformation happens prior to indexing, so it adds no performance overhead to the monitored application. Pipelines are a flexible and easy way to filter or obfuscate Elastic APM data.
Features of this approach:
- Filters are applied at ingestion time.
- All Elastic APM agents and fields are supported.
- Data leaves the instrumented service before it is filtered.
- There are no performance overhead implications on the instrumented service.
For a step-by-step example, refer to Tutorial: Use an ingest pipeline to redact sensitive information.
APM agent filters

Some APM agents offer a way to manipulate or drop APM events before they are sent to APM Server.
Features of this approach:
- Data is sanitized before leaving the instrumented service.
- Not supported by all Elastic APM agents.
- Potential overhead implications on the instrumented service.
Refer to the relevant agent's documentation for more information and examples (a minimal Node.js sketch follows this list):

- .NET: Filter API
- Node.js: `addFilter()`
- Python: custom processors
- Ruby: `add_filter()`
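For example, with the Node.js agent you can register a filter with `addFilter()` that runs on each payload before it is sent to APM Server. This is a minimal sketch only; the `context.request.body` field path is an assumption and may differ between agent versions:

```js
// Minimal sketch: redact a captured request body with the Node.js
// agent's addFilter() API. The context.request.body path is an
// assumption and may vary between agent versions.
const apm = require('elastic-apm-node').start()

apm.addFilter(function (payload) {
  const req = payload.context && payload.context.request
  if (req && req.body) {
    req.body = '[REDACTED]'
  }
  // Return the payload to keep the event; returning a falsy value drops it.
  return payload
})
```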
Tutorial: Use an ingest pipeline to redact sensitive information

Say you decide to capture HTTP request bodies, but quickly notice that sensitive information is being collected in the `http.request.body.original` field:

```json
{
  "email": "[email protected]",
  "password": "hunter2"
}
```
To obfuscate the passwords stored in the request body, you can use a series of ingest processors.
This tutorial uses the Ingest APIs, but it’s also possible to create a pipeline using the UI. In Kibana, go to Stack Management → Ingest Pipelines → Create pipeline → New pipeline or use the global search field.
To start, create a pipeline with a simple description and an empty array of processors:
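Working from the complete request shown at the end of this tutorial, the starting point looks like this:

```json
{
  "description": "redact http.request.body.original.password",
  "processors": []
}
```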
Add your first processor to the processors array. Because the agent captures the request body as a string, use the JSON processor to convert the original field value into a structured JSON object. Save this JSON object in a new field:
{ "json": { "field": "http.request.body.original", "target_field": "http.request.body.original_json", "ignore_failure": true } }
If `body.original_json` is not `null`, i.e., it exists, we'll redact the password with the set processor by setting the value of `body.original_json.password` to `"redacted"`:
{ "set": { "field": "http.request.body.original_json.password", "value": "redacted", "if": "ctx?.http?.request?.body?.original_json != null" } }
Use the convert processor to convert the JSON value of `body.original_json` to a string and set it as the `body.original` value:
{ "convert": { "field": "http.request.body.original_json", "target_field": "http.request.body.original", "type": "string", "if": "ctx?.http?.request?.body?.original_json != null", "ignore_failure": true } }
Finally, use the remove processor to remove the `body.original_json` field:
{ "remove": { "field": "http.request.body.original_json", "if": "ctx?.http?.request?.body?.original_json != null", "ignore_failure": true } }
Then put it all together, and use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline `apm_redacted_body_password`:
```console
PUT _ingest/pipeline/apm_redacted_body_password
{
  "description": "redact http.request.body.original.password",
  "processors": [
    {
      "json": {
        "field": "http.request.body.original",
        "target_field": "http.request.body.original_json",
        "ignore_failure": true
      }
    },
    {
      "set": {
        "field": "http.request.body.original_json.password",
        "value": "redacted",
        "if": "ctx?.http?.request?.body?.original_json != null"
      }
    },
    {
      "convert": {
        "field": "http.request.body.original_json",
        "target_field": "http.request.body.original",
        "type": "string",
        "if": "ctx?.http?.request?.body?.original_json != null",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "http.request.body.original_json",
        "if": "ctx?.http?.request?.body?.original_json != null",
        "ignore_failure": true
      }
    }
  ]
}
```
Prior to enabling this new pipeline, you can test it with the simulate pipeline API. This API allows you to run multiple documents through a pipeline to ensure it is working correctly.
The request below simulates running three different documents through the pipeline:
```console
POST _ingest/pipeline/apm_redacted_body_password/_simulate
{
  "docs": [
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """{"email": "[email protected]", "password": "hunter2"}"""
            }
          }
        }
      }
    },
    {
      "_source": {
        "some-other-field": true
      }
    },
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """["invalid json" """
            }
          }
        }
      }
    }
  ]
}
```

1. The first document contains the same sensitive data as the original example above.
2. The second document contains only an unrelated field.
3. The third document contains invalid JSON.
The API response should be similar to this:
{ "docs" : [ { "doc" : { "_source" : { "http" : { "request" : { "body" : { "original" : { "password" : "redacted", "email" : "[email protected]" } } } } } } }, { "doc" : { "_source" : { "nobody" : true } } }, { "doc" : { "_source" : { "http" : { "request" : { "body" : { "original" : """["invalid json" """ } } } } } } ] }
As expected, only the first simulated document has a redacted password field. All other documents are unaffected.
The @custom pipeline

The final step in this process is to call the newly created `apm_redacted_body_password` pipeline from the `@custom` pipeline of the data stream you wish to edit.

`@custom` pipelines are specific to each data stream and follow a similar naming convention: `<type>-<dataset>@custom`.
As a reminder, the default APM data streams are:
- Application traces: `traces-apm-<namespace>`
- RUM and iOS agent application traces: `traces-apm.rum-<namespace>`
- APM internal metrics: `metrics-apm.internal-<namespace>`
- APM transaction metrics: `metrics-apm.transaction.<metricset.interval>-<namespace>`
- APM service destination metrics: `metrics-apm.service_destination.<metricset.interval>-<namespace>`
- APM service transaction metrics: `metrics-apm.service_transaction.<metricset.interval>-<namespace>`
- APM service summary metrics: `metrics-apm.service_summary.<metricset.interval>-<namespace>`
- Application metrics: `metrics-apm.app.<service.name>-<namespace>`
- APM error/exception logging: `logs-apm.error-<namespace>`
- Applications UI logging: `logs-apm.app.<service.name>-<namespace>`
To match a custom ingest pipeline with a data stream, follow the `<type>-<dataset>@custom` template, or replace `-<namespace>` with `@custom` in the list above. For example, to target application traces, you'd create a pipeline named `traces-apm@custom`.
Use the create or update pipeline API to register the new pipeline in Elasticsearch. Name the pipeline `traces-apm@custom`:
```console
PUT _ingest/pipeline/traces-apm@custom
{
  "processors": [
    {
      "pipeline": {
        "name": "apm_redacted_body_password"
      }
    }
  ]
}
```
That’s it! Passwords will now be redacted from your APM HTTP body data.
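As an optional sanity check, you can reuse the simulate pipeline API against the new `@custom` pipeline with the same sensitive document from earlier; the redacted body should come back in the response:

```console
POST _ingest/pipeline/traces-apm@custom/_simulate
{
  "docs": [
    {
      "_source": {
        "http": {
          "request": {
            "body": {
              "original": """{"email": "[email protected]", "password": "hunter2"}"""
            }
          }
        }
      }
    }
  ]
}
```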
To learn more about ingest pipelines, see View the Elasticsearch index template.