Datafeeds

Anomaly detection jobs can analyze data that is stored in Elasticsearch or data that is sent from some other source via an API. Datafeeds retrieve data from Elasticsearch for analysis, which is the simpler and more common scenario.

You can associate only one datafeed with each anomaly detection job. The datafeed contains a query that runs at a defined interval (frequency). By default, this interval is calculated relative to the bucket span of the anomaly detection job. If you are concerned about delayed data, you can add a delay before the query runs at each interval. See Handling delayed data.
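
As a rough illustration, the body of a create datafeed request might be assembled as shown here. This is only a sketch: the helper name `build_datafeed_config`, the job ID, the index pattern, and the interval values are hypothetical. The `job_id`, `indices`, `query`, `frequency`, and `query_delay` names are datafeed properties. Nothing is sent to a cluster.

```python
import json


def build_datafeed_config(job_id, indices, frequency=None, query_delay=None):
    """Assemble a body for the create datafeeds API (PUT _ml/datafeeds/<feed_id>)."""
    config = {
        "job_id": job_id,            # the single job this datafeed is associated with
        "indices": indices,          # where the query runs
        "query": {"match_all": {}},  # illustrative query; any query DSL clause could go here
    }
    # When frequency is omitted, the interval is derived from the job's bucket span.
    if frequency is not None:
        config["frequency"] = frequency
    # An optional delay before each query runs, to allow for late-arriving data.
    if query_delay is not None:
        config["query_delay"] = query_delay
    return config


if __name__ == "__main__":
    # Hypothetical names and values, for illustration only.
    body = build_datafeed_config("example-job", ["example-logs-*"], query_delay="90s")
    print(json.dumps(body, indent=2))
```

Leaving `frequency` unset keeps the default behavior described above, so the sketch only writes the property when a value is passed.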

Datafeeds can also aggregate data before sending it to the anomaly detection job. There are some limitations, however, and aggregations should generally be used only for low cardinality data. See Aggregating data for faster performance.
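
One possible shape for the `aggregations` property of an aggregated datafeed is sketched here. It assumes a date histogram that contains a max aggregation on the time field plus a terms aggregation on a low cardinality field. The helper name, the field names, the interval, and the `size` value are all hypothetical; check Aggregating data for faster performance for the exact requirements.

```python
def build_aggregations(time_field, interval, split_field, value_field):
    """Sketch of an aggregated datafeed's `aggregations` property (assumed layout)."""
    return {
        "buckets": {
            # Buckets documents by time before they are sent to the job.
            "date_histogram": {"field": time_field, "fixed_interval": interval},
            "aggregations": {
                # Latest timestamp within each histogram bucket.
                time_field: {"max": {"field": time_field}},
                # Split on a low cardinality field, then summarize the metric.
                split_field: {
                    "terms": {"field": split_field, "size": 100},
                    "aggregations": {
                        value_field: {"avg": {"field": value_field}},
                    },
                },
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    print(build_aggregations("timestamp", "5m", "region", "response_time"))
```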

If you create anomaly detection jobs in Kibana, you must use datafeeds. When you create an anomaly detection job, you select an index pattern and Kibana configures the datafeed for you under the covers. For a description of all the datafeed properties, see the create datafeeds API.

To start retrieving data from Elasticsearch, you must start the datafeed. When you start it, you can optionally specify start and end times. If you do not specify an end time, the datafeed runs continuously. You can start and stop datafeeds in Kibana or use the start datafeeds and stop datafeeds APIs. A datafeed can be started and stopped multiple times throughout its lifecycle.
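
The start and stop calls can be sketched as follows. The helper names and the datafeed ID are hypothetical, and the functions only build the request pieces rather than contacting a cluster. The sketch passes times as ISO 8601 strings.

```python
def build_start_request(feed_id, start=None, end=None):
    """Return (method, path, body) for the start datafeeds API."""
    body = {}
    if start is not None:
        body["start"] = start
    # Omitting "end" means the datafeed keeps running continuously.
    if end is not None:
        body["end"] = end
    return "POST", f"_ml/datafeeds/{feed_id}/_start", body


def build_stop_request(feed_id):
    """Return (method, path) for the stop datafeeds API."""
    return "POST", f"_ml/datafeeds/{feed_id}/_stop"


if __name__ == "__main__":
    # Hypothetical datafeed ID and start time, for illustration only.
    print(build_start_request("datafeed-example", start="2024-01-01T00:00:00Z"))
    print(build_stop_request("datafeed-example"))
```

Because a datafeed can be started and stopped repeatedly, the same pair of requests can be reused over the datafeed's lifetime with different `start` and `end` values.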

When the Elasticsearch security features are enabled, a datafeed stores the roles that the user who created or last updated it held at that time. If the definitions of those roles are later changed, the datafeed subsequently runs with the new permissions that are associated with the roles. However, if the set of roles assigned to the user changes after the datafeed is created or updated, the datafeed continues to run with the permissions of the roles that it originally stored.

One way to update the roles that are stored within the datafeed without changing any other settings is to submit an empty JSON document ({}) to the update datafeed API.
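
A minimal sketch of that request follows. The helper name and datafeed ID are hypothetical, and nothing is sent. In practice the request would be submitted by the user whose current roles the datafeed should store.

```python
import json


def build_role_refresh_request(feed_id):
    """Return (method, path, raw_body) for an update that changes no settings."""
    # An empty JSON document leaves every datafeed property as it is.
    return "POST", f"_ml/datafeeds/{feed_id}/_update", json.dumps({})


if __name__ == "__main__":
    # Hypothetical datafeed ID, for illustration only.
    print(build_role_refresh_request("datafeed-example"))
```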

If the data that you want to analyze is not stored in Elasticsearch, you cannot use datafeeds. You can, however, send batches of data directly to the job by using the post data to jobs API.
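
A batch for the post data to jobs API could be prepared along these lines. This is a sketch under assumptions: the helper name, the job ID, and the record fields are hypothetical, and the batch is encoded as newline-delimited JSON, which is one possible way to send multiple documents. The code only builds the payload.

```python
import json


def build_post_data_payload(job_id, records):
    """Return (path, payload) for sending a batch of records to a job."""
    path = f"_ml/anomaly_detectors/{job_id}/_data"
    # One JSON document per line, with a trailing newline.
    payload = "".join(json.dumps(record) + "\n" for record in records)
    return path, payload


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    batch = [
        {"timestamp": 1704067200000, "response_time": 132.5},
        {"timestamp": 1704067260000, "response_time": 140.1},
    ]
    path, payload = build_post_data_payload("example-job", batch)
    print(path)
    print(payload, end="")
```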