Trained models

edit

When you use a data frame analytics job to perform classification or regression analysis, it creates a machine learning model that is trained and tested against a labeled data set. When you are satisfied with your trained model, you can use it to make predictions against new data. For example, you can use it in the processor of an ingest pipeline or in a pipeline aggregation within a search query. For more information about this process, see Introduction to supervised learning and inference for classification and regression.

In Kibana, you can view and manage your trained models in Stack Management > Alerts and Insights > Machine Learning and Machine Learning > Model Management.

Alternatively, you can use APIs like get trained models and delete trained models.

Deploying trained models

edit
Models trained by data frame analytics
edit
  1. To deploy data frame analytics model in a pipeline, navigate to Machine Learning > Model Management > Trained models in the main menu, or use the global search field in Kibana.
  2. Find the model you want to deploy in the list and click Deploy model in the Actions menu.

    The trained models UI in Kibana
  3. Create an inference pipeline to be able to use the model against new data through the pipeline. Add a name and a description or use the default values.

    Creating an inference pipeline
  4. Configure the pipeline processors or use the default settings.

    Configuring an inference processor
  5. Configure to handle ingest failures or use the default settings.
  6. (Optional) Test your pipeline by running a simulation of the pipeline to confirm it produces the anticipated results.
  7. Review the settings and click Create pipeline.

The model is deployed and ready to use through the inference pipeline.

Models trained by other methods
edit

You can also supply trained models that are not created by data frame analytics job but adhere to the appropriate JSON schema. Likewise, you can use third-party models to perform natural language processing (NLP) tasks. If you want to use these trained models in the Elastic Stack, you must store them in Elasticsearch documents by using the create trained models API. For more information about NLP models, refer to Deploy trained models.

Exporting and importing models

edit

Models trained in Elasticsearch are portable and can be transferred between clusters. This is particularly useful when models are trained in isolation from the cluster where they are used for inference. The following instructions show how to use curl and jq to export a model as JSON and import it to another cluster.

  1. Given a model name, find the model ID. You can use curl to call the get trained model API to list all models with their IDs.

    curl -s -u username:password \
      -X GET "http://localhost:9200/_ml/trained_models" \
        | jq . -C \
        | more

    If you want to show just the model IDs available, use jq to select a subset.

    curl -s -u username:password \
      -X GET "http://localhost:9200/_ml/trained_models" \
        | jq -C -r '.trained_model_configs[].model_id'
    flights1-1607953694065
    flights0-1607953585123
    lang_ident_model_1

    In this example, you are exporting the model with ID flights1-1607953694065.

  2. Using curl from the command line, again use the get trained models API to export the entire model definition and save it to a JSON file.

    curl -u username:password \
      -X GET "http://localhost:9200/_ml/trained_models/flights1-1607953694065?exclude_generated=true&include=definition&decompress_definition=false" \
        | jq '.trained_model_configs[0] | del(.model_id)' \
        > flights1.json

    A few observations:

    • Exporting models requires using curl or a similar tool that can stream the model over HTTP into a file. If you use the Kibana Console, the browser might be unresponsive due to the size of exported models.
    • Note the query parameters that are used during export. These parameters are necessary to export the model in a way that it can later be imported again and used for inference.
    • You must unnest the JSON object by one level to extract just the model definition. You must also remove the existing model ID in order to not have ID collisions when you import again. You can do these steps using jq inline or alternatively it can be done to the resulting JSON file after downloading using jq or other tools.
  3. Import the saved model using curl to upload the JSON file to the created trained model API. When you specify the URL, you can also set the model ID to something new using the last path part of the URL.

    curl -u username:password \
      -H 'Content-Type: application/json' \
      -X PUT "http://localhost:9200/_ml/trained_models/flights1-imported" \
      --data-binary @flights1.json
  • Models exported from the get trained models API are limited in size by the http.max_content_length global configuration value in Elasticsearch. The default value is 100mb and may need to be increased depending on the size of model being exported.
  • Connection timeouts can occur, for example, when model sizes are very large or your cluster is under load. If needed, you can increase timeout configurations for curl (for example, curl --max-time 600) or your client of choice.

If you also want to copy the data frame analytics job to the new cluster, you can export and import jobs in the Stack Management app in Kibana. Refer to Exporting and importing machine learning jobs.

Importing an external model to the Elastic Stack

edit

It is possible to import a model to your Elasticsearch cluster even if the model is not trained by Elastic data frame analytics. Eland supports importing models directly through its APIs. Please refer to the latest Eland documentation for more information on supported model types and other details of using Eland to import models with.