Get trained models API


Retrieves configuration information for a trained model.

This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.

Request


GET _ml/trained_models/

GET _ml/trained_models/<model_id>

GET _ml/trained_models/_all

GET _ml/trained_models/<model_id1>,<model_id2>

GET _ml/trained_models/<model_id_pattern*>

Prerequisites


If the Elasticsearch security features are enabled, you must have the following privileges:

  • cluster: monitor_ml

For more information, see Security privileges and Machine learning security privileges.

Description


You can get information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.
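
As a minimal sketch of how the request forms listed above fit together, the following Python helper joins model IDs or wildcard expressions into a single request path. The function name trained_models_path and the model IDs in the usage note are hypothetical and are not part of the API.

```python
from urllib.parse import quote


def trained_models_path(*model_ids: str) -> str:
    """Build the request path for zero or more model IDs or wildcard patterns."""
    if not model_ids:
        # No identifiers: the bare path requests all trained models.
        return "_ml/trained_models/"
    # Percent-encode each ID, but keep "*" so wildcard expressions survive.
    encoded = [quote(model_id, safe="*") for model_id in model_ids]
    # Multiple IDs or patterns are sent as one comma-separated path segment.
    return "_ml/trained_models/" + ",".join(encoded)
```

For example, trained_models_path("model-a", "model-b") yields _ml/trained_models/model-a,model-b, and trained_models_path("flights-*") yields a wildcard request.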

Path parameters

<model_id>
(Optional, string) The unique identifier of the trained model.

Query parameters

allow_no_match

(Optional, Boolean) Specifies what to do when the request:

  • Contains wildcard expressions and there are no models that match.
  • Contains the _all string or no identifiers and there are no matches.
  • Contains wildcard expressions and there are only partial matches.

The default value is true, which returns an empty array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

decompress_definition
(Optional, Boolean) Specifies whether the included model definition should be returned as a JSON map (true) or in a custom compressed format (false). Defaults to true.
exclude_generated
(Optional, Boolean) Indicates whether certain fields should be removed from the configuration on retrieval, so that the retrieved configuration is in a format that can be added to another cluster. The default value is false.
from
(Optional, integer) Skips the specified number of models. The default value is 0.
include

(Optional, string) A comma delimited string of optional fields to include in the response body. The default value is empty, indicating no optional fields are included. Valid options are:

  • definition: Includes the model definition.
  • feature_importance_baseline: Includes the baseline for feature importance values.
  • hyperparameters: Includes information about the hyperparameters used to train the model. For each hyperparameter, this consists of its value, its absolute and relative importance, and an indicator of whether it was specified by the user or tuned during hyperparameter optimization.
  • total_feature_importance: Includes the total feature importance for the training data set. The baseline and total feature importance values are returned in the metadata field in the response body.
size
(Optional, integer) Specifies the maximum number of models to obtain. The default value is 100.
tags
(Optional, string) A comma delimited string of tags. A trained model can have many tags, or none. When supplied, only trained models that contain all the supplied tags are returned.
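
The query parameters above can be combined into a query string along these lines. This is a sketch only: the helper name trained_models_query and its keyword arguments are assumptions made for illustration, and it omits any parameter that is left at its documented default.

```python
from urllib.parse import urlencode

# The include options listed in this section.
VALID_INCLUDE = {
    "definition",
    "feature_importance_baseline",
    "hyperparameters",
    "total_feature_importance",
}


def trained_models_query(include=(), tags=(), from_=0, size=100, allow_no_match=True):
    """Return a query string that carries only the non-default parameters."""
    unknown = set(include) - VALID_INCLUDE
    if unknown:
        raise ValueError(f"unsupported include options: {sorted(unknown)}")
    params = {}
    if include:
        params["include"] = ",".join(include)  # comma-delimited string
    if tags:
        params["tags"] = ",".join(tags)  # models must contain all of these tags
    if from_ != 0:
        params["from"] = from_
    if size != 100:
        params["size"] = size
    if not allow_no_match:
        params["allow_no_match"] = "false"
    # Keep commas readable instead of percent-encoding them.
    return urlencode(params, safe=",")
```

Calling trained_models_query(include=["hyperparameters"], size=10) produces include=hyperparameters&size=10, which can be appended to the request path after a question mark.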

Response body

trained_model_configs

(array) An array of trained model resources, which are sorted by the model_id value in ascending order.

Properties of trained model resources
created_by
(string) Information on the creator of the trained model.
create_time
(time units) The time when the trained model was created.
default_field_map

(object) A string to string object that contains the default field map to use when inferring against the model. For example, data frame analytics may train the model on a specific multi-field foo.keyword. The analytics job would then supply a default field map entry for "foo" : "foo.keyword".

Any field map described in the inference configuration takes precedence.

description
(string) The free-text description of the trained model.
estimated_heap_memory_usage_bytes
(integer) The estimated heap usage in bytes to keep the trained model in memory.
estimated_operations
(integer) The estimated number of operations to use the trained model.
inference_config

(object) The default configuration for inference. This can be either a regression or classification configuration. It must match the underlying definition.trained_model's target_type.

Properties of inference_config
classification

(object) Classification configuration for inference.

Properties of classification inference
num_top_classes
(integer) Specifies the number of top class predictions to return. Defaults to 0.
num_top_feature_importance_values
(integer) Specifies the maximum number of feature importance values per document. By default, it is zero and no feature importance calculation occurs.
prediction_field_type
(string) Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided, 1.0 is transformed to true and 0.0 to false.
results_field
(string) The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
top_classes_results_field
(string) Specifies the field to which the top classes are written. Defaults to top_classes.
regression

(object) Regression configuration for inference.

Properties of regression inference
num_top_feature_importance_values
(integer) Specifies the maximum number of feature importance values per document. By default, it is zero and no feature importance calculation occurs.
results_field
(string) The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
input

(object) The input field names for the model definition.

Properties of input
field_names
(array of strings) An array of input field names for the model.
license_level
(string) The license level of the trained model.
metadata

(object) An object containing metadata about the trained model. For example, models created by data frame analytics contain analysis_config and input objects.

Properties of metadata
feature_importance_baseline
(object) An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.
hyperparameters

(array) A list of the available hyperparameters, both those optimized during the fine_parameter_tuning phase and those specified by the user.

Properties of hyperparameters
absolute_importance
(double) A positive number showing how much the parameter influences the variation of the loss function. Applies to hyperparameters whose values are not specified by the user but tuned during hyperparameter optimization.
max_trees
(integer) The maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.
name
(string) Name of the hyperparameter.
relative_importance
(double) A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. Applies to hyperparameters whose values are not specified by the user but tuned during hyperparameter optimization.
supplied
(Boolean) Indicates if the hyperparameter is specified by the user (true) or optimized (false).
value
(double) The value of the hyperparameter, either optimized or specified by the user.
total_feature_importance

(array) An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

Properties of total feature importance
feature_name
(string) The feature for which this importance was calculated.
importance

(object) A collection of feature importance statistics related to the training data set for this particular feature.

Properties of feature importance
mean_magnitude
(double) The average magnitude of this feature across all the training data. This value is the average of the absolute values of the importance for this feature.
max
(integer) The maximum importance value across all the training data for this feature.
min
(integer) The minimum importance value across all the training data for this feature.
classes

(array) If the trained model is a classification model, feature importance statistics are gathered per target class value.

Properties of class feature importance
class_name
(string) The target class value. It can be a string, Boolean, or number.
importance

(object) A collection of feature importance statistics related to the training data set for this particular feature.

Properties of feature importance
mean_magnitude
(double) The average magnitude of this feature across all the training data. This value is the average of the absolute values of the importance for this feature.
max
(integer) The maximum importance value across all the training data for this feature.
min
(integer) The minimum importance value across all the training data for this feature.
model_id
(string) Identifier for the trained model.
tags
(string) A comma delimited string of tags. A trained model can have many tags, or none.
version
(string) The Elasticsearch version number in which the trained model was created.
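
To illustrate how the nested properties above relate to one another, the following sketch walks a parsed response body and ranks each model's features by mean_magnitude. The helper name rank_features is hypothetical, and the sample dictionary is hand-written for illustration with made-up model and feature names and values; it is not output from a real cluster. The sketch reads only the top-level importance object of each feature and ignores per-class statistics.

```python
def rank_features(response: dict) -> dict:
    """Map each model_id to its feature names, highest mean_magnitude first."""
    ranked = {}
    for config in response.get("trained_model_configs", []):
        metadata = config.get("metadata", {})
        # Returned only when the request includes total_feature_importance.
        entries = metadata.get("total_feature_importance", [])
        # Skip entries that carry no top-level importance object.
        scored = [entry for entry in entries if "importance" in entry]
        scored.sort(key=lambda entry: entry["importance"]["mean_magnitude"], reverse=True)
        ranked[config["model_id"]] = [entry["feature_name"] for entry in scored]
    return ranked


# Hand-written illustrative structure (hypothetical names and values).
sample = {
    "trained_model_configs": [
        {
            "model_id": "example-regression-model",
            "metadata": {
                "total_feature_importance": [
                    {"feature_name": "carrier",
                     "importance": {"mean_magnitude": 0.4, "min": -1, "max": 1}},
                    {"feature_name": "distance",
                     "importance": {"mean_magnitude": 2.5, "min": -6, "max": 7}},
                    {"feature_name": "weather",
                     "importance": {"mean_magnitude": 1.1, "min": -3, "max": 4}},
                ]
            },
        },
        # A model whose response carries no metadata at all.
        {"model_id": "example-untrained-metadata"},
    ]
}

print(rank_features(sample))
```

A model without total feature importance in its metadata maps to an empty list rather than raising an error.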

Response codes

400
If the include parameter contains definition, this code indicates that more than one model matches the ID pattern.
404 (Missing resources)
If allow_no_match is false, this code indicates that there are no resources that match the request or only partial matches for the request.
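
One way a client might act on these response codes is sketched below. The function name configs_or_error and the choice of Python exception types are assumptions for illustration; the messages simply restate the conditions described above.

```python
def configs_or_error(status: int, body: dict) -> list:
    """Return the trained model configurations or raise for a documented error code."""
    if status == 200:
        return body.get("trained_model_configs", [])
    if status == 400:
        # Documented cause: a definition was requested and several models matched.
        raise ValueError("more than one model matches the ID pattern")
    if status == 404:
        # Documented cause: allow_no_match is false and matches are missing or partial.
        raise LookupError("no resources, or only partial matches, for the request")
    raise RuntimeError(f"unexpected status code {status}")
```

With the default allow_no_match value of true, a request without matches returns an empty array, so the function returns an empty list instead of raising.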

Examples


The following example gets configuration information for all the trained models:

GET _ml/trained_models/
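
Because size defaults to 100, a cluster with more models than that needs several requests to list them all. The sketch below pages through the results with from and size. It assumes a caller-supplied fetch function (hypothetical) that sends the request with the given from and size values and returns the parsed response body; no HTTP client is implied.

```python
from typing import Callable, Iterator


def iter_trained_models(fetch: Callable[[int, int], dict], size: int = 100) -> Iterator[dict]:
    """Yield every trained model configuration, one page of `size` at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    offset = 0
    while True:
        # fetch(from, size) is expected to return the parsed response body.
        page = fetch(offset, size).get("trained_model_configs", [])
        yield from page
        if len(page) < size:
            # A short or empty page means there is nothing left to skip to.
            return
        offset += len(page)
```

Stopping on a short page avoids relying on any field beyond trained_model_configs, at the cost of one extra request when the number of models is an exact multiple of size.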