IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Create inference API

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Creates a model to perform an inference task.

Request

PUT /_inference/<task_type>/<model_id>

Prerequisites

Description

The create inference API enables you to create and configure an inference model to perform a specific inference task.

Path parameters

<model_id>
(Required, string) The unique identifier of the model.
<task_type>

(Required, string) The type of the inference task that the model will perform. Available task types:

  • sparse_embedding
  • text_embedding

Request body

service

(Required, string) The type of service supported for the specified task type. Available services:

  • elser
service_settings
(Required, object) Settings used to install the inference model. These settings are specific to the service you specified.
task_settings
(Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.

Examples

The following example shows how to create an inference model called my-elser-model that performs the sparse_embedding task using the elser service.

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

Example response:

{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
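
Once the model is created, it can be used to generate embeddings with the perform inference API, which accepts the same task type and model ID in its path. A minimal sketch, assuming the my-elser-model endpoint created above has finished deploying:

POST _inference/sparse_embedding/my-elser-model
{
  "input": "The quick brown fox jumps over the lazy dog"
}

The response contains the sparse embedding (token-weight pairs) that the ELSER model produces for the input text.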