Semantic text field type reference
This page provides reference content for the semantic_text field type, including parameter descriptions, inference endpoint configuration options, chunking behavior, update operations, querying options, and limitations.
The semantic_text field type uses default indexing settings based on the inference endpoint specified, enabling you to get started without providing additional configuration details. You can override these defaults by customizing the parameters described below.
inference_id - (Optional, string) Inference endpoint that will be used to generate embeddings for the field. If search_inference_id is specified, this inference endpoint will only be used at index time. Learn more about configuring this parameter.
Updating the inference_id parameter
You can update this parameter by using the Update mapping API. The inference endpoint can be changed if no values have been indexed yet, or if the new endpoint is compatible with the current one. When updating an inference_id, it is important to ensure the new inference endpoint produces embeddings compatible with those already indexed. This typically means using the same underlying model.
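For example, a minimal sketch of such an update (assuming an existing index like the my-index-000004 created below, and a hypothetical compatible endpoint my-new-endpoint):

PUT my-index-000004/_mapping
{
  "properties": {
    "inference_field": {
      "type": "semantic_text",
      "inference_id": "my-new-endpoint"
    }
  }
}

The update is accepted if no values have been indexed yet, or if my-new-endpoint produces embeddings compatible with those already stored.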
search_inference_id - (Optional, string) The inference endpoint that will be used to generate embeddings at query time. Use the Create inference API to create the endpoint. If not specified, the inference endpoint defined by inference_id will be used at both index and query time. You can update this parameter by using the Update mapping API.
Learn how to use dedicated endpoints for ingestion and search.
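As a sketch, a mapping that uses one endpoint at index time and another at query time (both endpoint names are hypothetical):

PUT my-index-000005
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-ingest-endpoint",
        "search_inference_id": "my-search-endpoint"
      }
    }
  }
}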
index_options - (Optional, object) Specifies the index options to override default values for the field. Currently, dense_vector and sparse_vector index options are supported. For text embeddings, index_options may match any allowed dense_vector index options.
chunking_settings - (Optional, object) Settings for chunking text into smaller passages. If specified, these will override the chunking settings set in the inference endpoint associated with inference_id. If chunking settings are updated, they will not be applied to existing documents until they are reindexed.
To completely disable chunking, use the none chunking strategy.
Important: When using the none chunking strategy, if the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an error. In contrast, the elastic and elasticsearch services will automatically truncate the input to fit within the model's limit.
The following example shows how to configure inference_id, index_options and chunking_settings for a semantic_text field type:
PUT my-index-000004
{
"mappings": {
"properties": {
"inference_field": {
"type": "semantic_text",
"inference_id": "my-text-embedding-endpoint",
"index_options": {
"dense_vector": {
"type": "int4_flat"
}
},
"chunking_settings": {
"type": "none"
}
}
}
}
}
- The inference_id of the inference endpoint to use for generating embeddings.
- Overrides default index options by specifying int4_flat quantization for dense vector embeddings.
- Disables automatic chunking by setting the chunking strategy to none.
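Chunking can also be tuned rather than disabled. The following hypothetical override uses the word chunking strategy; the max_chunk_size and overlap values are illustrative, and the authoritative list of options is in the Inference API documentation for chunking settings:

PUT my-index-000006
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint",
        "chunking_settings": {
          "type": "word",
          "max_chunk_size": 250,
          "overlap": 100
        }
      }
    }
  }
}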
Newly created indices with semantic_text fields using dense embeddings will be quantized to bbq_hnsw automatically, as long as the embeddings have a minimum of 64 dimensions.
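Setting this explicitly is a one-line index_options override, sketched here with the same hypothetical endpoint as above:

PUT my-index-000007
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint",
        "index_options": {
          "dense_vector": {
            "type": "bbq_hnsw"
          }
        }
      }
    }
  }
}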
The semantic_text field type specifies an inference endpoint identifier (inference_id) that is used to generate embeddings.
The following inference endpoint configurations are available:
- Default and preconfigured endpoints: Use semantic_text without creating an inference endpoint manually.
- ELSER on EIS: Use the ELSER model through the Elastic Inference Service.
- Custom endpoints: Create your own inference endpoint using the Create inference API to use custom models or third-party services.
If you use a custom inference endpoint through your ML node rather than through the Elastic Inference Service (EIS), the recommended method is to use dedicated endpoints for ingestion and search. If you use EIS, you don't have to set up dedicated endpoints.
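For example, a custom endpoint such as the my-text-embedding-endpoint used earlier could be created with the Create inference API. This sketch uses the built-in elasticsearch service with the .multilingual-e5-small model; substitute your own service and settings:

PUT _inference/text_embedding/my-text-embedding-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".multilingual-e5-small",
    "num_allocations": 1,
    "num_threads": 1
  }
}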
Removing an inference endpoint will cause ingestion of documents and semantic queries to fail on indices that define semantic_text fields with that inference endpoint as their inference_id. Trying to delete an inference endpoint that is used on a semantic_text field will result in an error.
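To check whether an endpoint is still referenced before removing it, the delete inference API accepts a dry_run query parameter, which reports affected semantic_text fields instead of deleting (endpoint name as in the earlier sketch):

DELETE _inference/text_embedding/my-text-embedding-endpoint?dry_run=true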
Inference endpoints have a limit on the amount of text they can process. To allow for large amounts of text to be used in semantic search, semantic_text automatically generates smaller passages if needed, called chunks.
Each chunk refers to a passage of the text and the corresponding embedding generated from it. When querying, the individual passages will be automatically searched for each document, and the most relevant passage will be used to compute a score.
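For instance, a semantic query against the field defined earlier scores each document by its most relevant chunk (the query text is illustrative):

GET my-index-000004/_search
{
  "query": {
    "semantic": {
      "field": "inference_field",
      "query": "best surfing spots"
    }
  }
}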
Chunks are stored as start and end character offsets rather than as separate text strings. These offsets point to the exact location of each chunk within the original input text.
You can pre-chunk content by providing text as arrays before indexing.
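Because my-index-000004 above uses the none chunking strategy, each array element in this illustrative document is stored as its own chunk:

PUT my-index-000004/_doc/1
{
  "inference_field": [
    "first pre-chunked passage",
    "second pre-chunked passage"
  ]
}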
Refer to the Inference API documentation for valid chunking_settings values, and to Configuring chunking to learn about the different chunking strategies.
semantic_text field types have the following limitations:
- semantic_text fields are not currently supported as elements of nested fields.
- semantic_text fields can't currently be set as part of dynamic templates.
- semantic_text fields are not supported in indices created prior to 8.11.0.
- semantic_text fields do not support Cross-Cluster Search (CCS) when ccs_minimize_roundtrips is set to false.
- semantic_text fields do not support Cross-Cluster Search (CCS) in ES|QL.
- semantic_text fields do not support Cross-Cluster Replication (CCR).
When an index contains a semantic_text field, the docs.count value returned by the _cat/indices API may be higher than the number of documents you indexed.
This occurs because semantic_text stores embeddings in nested documents, one per chunk. The _cat/indices API counts all documents in the Lucene index, including these hidden nested documents.
To count only top-level documents, excluding the nested documents that store embeddings, use one of the following APIs:
- GET /<index>/_count
- GET _cat/count/<index>
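For example, for the index created earlier on this page:

GET my-index-000004/_count

This returns only the top-level document count; the hidden nested chunk documents are excluded.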