Reverting to a model snapshot

Snapshots of the machine learning model for each anomaly detection job are saved frequently to an internal Elasticsearch index to ensure resilience. This makes it possible to reset the model to a previous state after a system failure or after a one-off event has changed the model significantly.

  1. In Kibana, navigate to Jobs. To open Jobs, find Machine Learning > Anomaly Detection in the main menu, or use the global search field.
  2. In the job table, locate the anomaly detection job whose model you want to revert.
  3. Open the job details and navigate to the Model Snapshots tab.

    A screenshot of a job with the Model Snapshots tab opened
  4. Select a snapshot from the list and click the Revert icon under Actions.
  5. Optional: Select whether to replay the analysis on the data in your index after the revert has been applied. If you don’t select this option, no anomaly detection results exist after the time the snapshot was taken; the job results end at the point in time of the snapshot. If you select this option, you can then select one or both of the following two options:

    • You can select whether you want the job to continue running in real time after the replay, or to just replay existing data after the snapshot.
    • You can select a time range you want to avoid during the replay by declaring a calendar event. This way, you can skip any problematic time frame that you want the anomaly detection job to avoid.
    A screenshot of a revert model snapshot flyout
  6. Click Apply.
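
When the snapshot list is long, choosing which snapshot to revert to usually means finding the most recent one taken before the problematic event. The following is a minimal sketch of that selection logic, assuming each snapshot is a dictionary with a `snapshot_id` and a `timestamp` in epoch milliseconds, similar to the entries that the get model snapshots API returns. The helper name `latest_snapshot_before` and all sample values are hypothetical.

```python
from typing import Optional


def latest_snapshot_before(snapshots: list[dict], cutoff_ms: int) -> Optional[dict]:
    """Return the newest snapshot taken strictly before cutoff_ms, or None."""
    candidates = [s for s in snapshots if s["timestamp"] < cutoff_ms]
    return max(candidates, key=lambda s: s["timestamp"], default=None)


# Hypothetical sample data; IDs and timestamps (epoch milliseconds) are made up.
sample_snapshots = [
    {"snapshot_id": "1700000000", "timestamp": 1_700_000_000_000},
    {"snapshot_id": "1700003600", "timestamp": 1_700_003_600_000},
    {"snapshot_id": "1700007200", "timestamp": 1_700_007_200_000},
]

# Assume the one-off event started at this (made-up) time.
event_start_ms = 1_700_005_000_000

chosen = latest_snapshot_before(sample_snapshots, event_start_ms)
print(chosen["snapshot_id"] if chosen else "no snapshot before the event")
```

The comparison is strict, so a snapshot saved at exactly the start of the event is not selected.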

You can use custom rules to avoid a model being updated in case of a known event you want to exclude from the analysis. Using custom rules might help you to avoid situations where you need to revert to a snapshot.
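
As an illustration, a custom rule that prevents model updates is defined on a detector with the `skip_model_update` action and one or more conditions. The sketch below builds such a detector definition as a Python dictionary and prints it as JSON. The function `mean`, the field name `responsetime`, the threshold, and the helper name `skip_model_update_rule` are assumptions chosen for the example, not values from any particular job.

```python
import json


def skip_model_update_rule(operator: str, value: float) -> dict:
    """Build a custom rule that skips model updates when the actual value meets the condition."""
    return {
        "actions": ["skip_model_update"],
        "conditions": [
            {"applies_to": "actual", "operator": operator, "value": value}
        ],
    }


# Hypothetical detector: the function, field name, and threshold are illustrative only.
detector = {
    "function": "mean",
    "field_name": "responsetime",
    "custom_rules": [skip_model_update_rule("gt", 1000.0)],
}

print(json.dumps(detector, indent=2))
```

With a rule like this in place, values above the threshold still produce results but do not influence the model, which may reduce the need to revert later.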

Alternatively, you can use the revert model snapshots API. In this case, you need to manually close the corresponding job before reverting to the saved snapshot.

  • By default, when you revert to a snapshot, all anomaly detection results are deleted for the corresponding job after the point when the snapshot was saved. If you replay the analysis, results will be re-generated based on your configuration.
  • Reverting to a snapshot does not change the data_counts values of the anomaly detection job; these values are not reverted to the earlier state.
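
The API route can be sketched as two requests: close the job, then revert to the snapshot. The example below only builds and prints the requests; it does not send them, so no cluster is needed to run it. The job ID, snapshot ID, and the helper name `revert_requests` are hypothetical. The sketch sets `delete_intervening_results` explicitly rather than relying on a default, so the intended behavior is visible in the request body.

```python
import json
from typing import Optional
from urllib.parse import quote

Request = tuple[str, str, Optional[dict]]


def revert_requests(job_id: str, snapshot_id: str,
                    delete_intervening_results: bool = True) -> list[Request]:
    """Return (method, path, body) tuples: close the job, then revert to the snapshot."""
    base = f"/_ml/anomaly_detectors/{quote(job_id, safe='')}"
    revert_path = f"{base}/model_snapshots/{quote(snapshot_id, safe='')}/_revert"
    return [
        ("POST", f"{base}/_close", None),
        ("POST", revert_path, {"delete_intervening_results": delete_intervening_results}),
    ]


# Hypothetical identifiers, for illustration only.
requests_to_send = revert_requests("my-job", "1700003600")

for method, path, body in requests_to_send:
    print(method, path)
    if body is not None:
        print(json.dumps(body))
```

To continue analysis after the revert, the job must be opened again. Sending these requests requires an authenticated HTTP client, which is left out of the sketch.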