Elena Stoeva, Yngrid Coello

Elastic Observability: Streams Data Quality and Failure Store Insights

Discover how Streams, a new AI-driven Elastic Observability feature, helps you manage data quality with a failure store, so you can monitor, troubleshoot, and retain high-quality data.

When working with observability and logging data, not all documents make it into Elasticsearch in pristine condition. Some may be dropped due to processing failures in ingest pipelines or mapping errors, while others may be partially ingested with ignored fields if a field's value is incompatible with the defined mappings. These issues can impact downstream analysis and dashboards. Streams data quality makes it easier than ever to monitor the health of your ingested data, identify potential issues, and take corrective action right from the UI. With data quality, you can now see exactly how well your Stream is performing and quickly understand whether your data quality is Good, Degraded, or Poor.
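As a concrete (and hypothetical) illustration of the "ignored fields" case, consider a plain index named quality-demo where an integer field has ignore_malformed enabled; a document with an incompatible value is still indexed, but the offending field is skipped and recorded in _ignored:

# Integer field that tolerates malformed values instead of rejecting the document
PUT quality-demo
{
  "mappings": {
    "properties": {
      "status_code": { "type": "integer", "ignore_malformed": true }
    }
  }
}

# "abc" cannot be indexed as an integer, so the field is dropped and the
# document is flagged with _ignored: ["status_code"]
POST quality-demo/_doc
{
  "status_code": "abc"
}

Without ignore_malformed, the same document would be rejected outright, which is the "failed document" path described later in this post.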

What's in data quality

At-a-glance summary

The summary card shows:

  • Degraded documents - Documents that contain the _ignored field (see Monitoring degraded documents below for more details).
  • Failed documents - Documents that were rejected at ingestion due to mapping conflicts or pipeline failures.

The overall quality score (Good, Degraded, Poor) is automatically calculated based on the percentage of degraded and failed documents.

Trends over time

The tab includes a time-series chart so you can track how degraded and failed documents are accumulating over time. Use the date picker to zoom into a specific range and understand when problems are spiking.

Quality issues table

A detailed table lists the types of issues affecting your stream. For each issue, you can:

  • See which fields are causing problems.
  • Review counts of affected documents.
  • Filter for issues that have not yet been resolved (Current issues only).
  • Open a flyout to dive deeper into the cause of the issue and learn how to fix it.

Monitoring degraded documents

A degraded document is one that contains the _ignored field, which means one or more of its fields were ignored during indexing, for example because their values didn't match the expected mappings. While the rest of the document is still indexed, a high number of degraded documents can affect query results, dashboards, and overall observability accuracy.

To help keep these issues under control, the Data quality tab provides visibility into the percentage of degraded documents in your stream.
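If you want to inspect the degraded documents themselves outside the UI, a standard exists query on the _ignored metadata field will return them. Here is a minimal sketch, assuming a stream named myStream (the same placeholder used in the queries later in this post):

# Find documents where at least one field was ignored during indexing
GET myStream/_search
{
  "query": {
    "exists": { "field": "_ignored" }
  }
}

Each matching hit includes an _ignored array listing the fields that were skipped.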

Set up a rule to stay ahead of issues

You can use the Create rule button above the Degraded docs chart to define an alert that notifies you when the percentage of degraded documents crosses a certain threshold. This makes it easy to proactively monitor for mapping mismatches and ensure your data continues to meet quality expectations.

For more information on how to configure this rule, see Degraded docs rule conditions.

Handling failed documents with the failure store

The failure store is a special index that captures documents rejected during ingestion. Instead of losing this data, the failure store retains it in a dedicated ::failures index, allowing you to inspect the problematic documents, understand what went wrong, and fix the underlying issues.
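If you want to look at the rejected documents directly, the failure store can be searched with the ::failures selector. A minimal sketch, again assuming a stream named myStream with the failure store enabled:

# Returns failure documents describing the rejected originals and their errors
GET myStream::failures/_search
{
  "size": 5
}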

In the Data quality tab, failed documents are only visible if your stream has a failure store enabled, and viewing failure store documents requires at least the read_failure_store privilege. If the failure store is not enabled, you'll see an "Enable failure store" link that opens a modal where you can turn it on and set the retention period. Enabling the failure store requires the manage_failure_store privilege on the specific data stream. For more information about failure store security, see Searching failures.
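As a sketch of what those privileges can look like in practice (the role name and stream pattern here are hypothetical), a role that can read a stream and its failure store might be defined like this:

# Hypothetical role granting read access to a stream and its failure store;
# add manage_failure_store to also allow enabling and configuring it.
PUT _security/role/stream_failures_reader
{
  "indices": [
    {
      "names": [ "myStream" ],
      "privileges": [ "read", "read_failure_store" ]
    }
  ]
}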

Once enabled, you can edit the failure store configuration or disable it at any time using the Edit button above the failed docs chart.

The failure store can also be configured in the Streams Retention tab - see this article for more information.

Technical implementation

Under the hood, the Data quality tab builds on the existing Dataset quality plugin - the same one that powers the Dataset quality page in Stack Management. However, instead of working in the context of datasets following the Data stream naming scheme, it’s now tailored specifically for streams.

To determine the quality of a stream, the UI sends three ES|QL queries to the server:

  1. All documents (including failures):
 FROM myStream, myStream::failures | STATS doc_count = COUNT(*)
  2. Failed documents only:
 FROM myStream::failures | STATS failed_doc_count = COUNT(*)
  3. Degraded documents:
 FROM myStream METADATA _ignored | WHERE _ignored IS NOT NULL | STATS degraded_doc_count = COUNT(*)
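These queries can also be reproduced outside the UI through the ES|QL query API, for example to check the failed-document count for a stream (myStream is a placeholder):

# Run the failed-documents query directly against the ES|QL endpoint
POST _query
{
  "query": "FROM myStream::failures | STATS failed_doc_count = COUNT(*)"
}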

The results of these queries are then used to calculate the percentages of failed and degraded documents. The overall data quality is determined using simple thresholds:

  • Good: Both percentages are 0%
  • Degraded: Any percentage is greater than 0% but less than 3%
  • Poor: Any percentage is above 3%

For managing the failure store, Streams uses the Update data stream options API with the failure_store parameter to configure and update the failure store settings, including enabling the store and setting the retention period.
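As an illustration, a request to enable the failure store on a stream might look roughly like the following; treat the exact body, in particular the retention block, as an assumption rather than a definitive reference:

# Enable the failure store for the data stream backing myStream
# (the lifecycle/data_retention shape below is an assumption)
PUT _data_stream/myStream/_options
{
  "failure_store": {
    "enabled": true,
    "lifecycle": {
      "data_retention": "30d"
    }
  }
}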

Why you’ll love this

The new Data quality tab gives you:

  • Visibility into ingestion problems without digging into logs
  • A clear breakdown of degraded vs. failed documents
  • Insights into which fields are ignored and why
  • Tools to capture and troubleshoot failed docs with the failure store

By surfacing data quality issues directly in the Streams UI, we’re making it easier to keep your data flowing reliably and to ensure your analytics are built on a strong foundation.

Try it out today

The data quality feature is available in Elastic Observability on Serverless, and coming soon for self-managed and Elastic Cloud users.

Sign up for an Elastic trial at cloud.elastic.co and try Elastic's Serverless offering, which lets you explore all of the Streams functionality.

For more information on Streams:

  • Read about Reimagining streams
  • Look at the Streams website
  • Read the Streams documentation
