JinaAI inference integration
Creates an inference endpoint to perform an inference task with the jinaai service.
Request
PUT /_inference/<task_type>/<inference_id>
Path parameters
- <inference_id>: (Required, string) The unique identifier of the inference endpoint.
- <task_type>: (Required, string) The type of the inference task that the model will perform. Available task types:
  - text_embedding
  - rerank
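As an illustration of how the two path parameters combine into the request path, here is a minimal sketch. The helper name inference_endpoint_path and the percent-encoding of the identifier are assumptions made for this example; they are not part of the API.

```python
from urllib.parse import quote

# Task types listed above for the jinaai service.
JINAAI_TASK_TYPES = ("text_embedding", "rerank")


def inference_endpoint_path(task_type: str, inference_id: str) -> str:
    """Return the request path for creating a jinaai inference endpoint.

    Hypothetical helper: it only assembles the path shown in the Request section.
    """
    if task_type not in JINAAI_TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    if not inference_id:
        raise ValueError("inference_id is required")
    # Percent-encode the identifier so it is safe to place in a URL path.
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"


print(inference_endpoint_path("rerank", "jinaai-rerank"))
```

Rejecting unknown task types locally gives an early error before any request is sent.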
Request body
- chunking_settings: (Optional, object) Chunking configuration object. Refer to Configuring chunking to learn more about chunking.
  - max_chunk_size: (Optional, integer) Specifies the maximum size of a chunk in words. Defaults to 250. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).
  - overlap: (Optional, integer) Only for the word chunking strategy. Specifies the number of overlapping words for chunks. Defaults to 100. This value cannot be higher than half of max_chunk_size.
  - sentence_overlap: (Optional, integer) Only for the sentence chunking strategy. Specifies the number of overlapping sentences for chunks. It can be either 1 or 0. Defaults to 1.
  - strategy: (Optional, string) Specifies the chunking strategy. It can be either sentence or word.
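The limits on chunking_settings interact (the lower bound of max_chunk_size depends on the strategy, and overlap depends on max_chunk_size), so a small client-side check can make them easier to follow. The sketch below is an assumption-laden illustration: validate_chunking_settings is a hypothetical helper, it requires an explicit strategy because no default is stated above, and the lower bound of 0 for overlap is also an assumption.

```python
def validate_chunking_settings(settings: dict) -> dict:
    """Check a chunking_settings object against the limits listed above.

    Hypothetical helper. Returns a new dict with the documented defaults
    filled in; options that belong to the other strategy are ignored.
    """
    strategy = settings.get("strategy")
    if strategy not in ("sentence", "word"):
        raise ValueError("strategy must be 'sentence' or 'word'")

    # max_chunk_size defaults to 250 words and cannot exceed 300.
    max_chunk_size = settings.get("max_chunk_size", 250)
    min_chunk_size = 20 if strategy == "sentence" else 10
    if not min_chunk_size <= max_chunk_size <= 300:
        raise ValueError(
            f"max_chunk_size must be between {min_chunk_size} and 300"
        )

    checked = {"strategy": strategy, "max_chunk_size": max_chunk_size}
    if strategy == "word":
        # overlap defaults to 100 and cannot exceed half of max_chunk_size.
        # The lower bound of 0 is an assumption of this sketch.
        overlap = settings.get("overlap", 100)
        if overlap < 0 or overlap > max_chunk_size / 2:
            raise ValueError("overlap must be between 0 and half of max_chunk_size")
        checked["overlap"] = overlap
    else:
        # sentence_overlap defaults to 1 and can only be 0 or 1.
        sentence_overlap = settings.get("sentence_overlap", 1)
        if sentence_overlap not in (0, 1):
            raise ValueError("sentence_overlap must be 0 or 1")
        checked["sentence_overlap"] = sentence_overlap
    return checked


print(validate_chunking_settings({"strategy": "word", "max_chunk_size": 120, "overlap": 40}))
```

Note that with the word strategy, the default overlap of 100 only satisfies the half-size rule when max_chunk_size is at least 200, so smaller chunk sizes need an explicit overlap.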
- service: (Required, string) The type of service supported for the specified task type. In this case, jinaai.
- service_settings: (Required, object) Settings used to install the inference model. These settings are specific to the jinaai service.
  - api_key: (Required, string) A valid API key for your JinaAI account. You can find it at https://jina.ai/embeddings/. You need to provide the API key only once, during the inference model creation. The Get inference API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.
  - rate_limit: (Optional, object) The default rate limit for the jinaai service is 2000 requests per minute for all task types. You can modify this using the requests_per_minute setting in your service settings:
      "rate_limit": { "requests_per_minute": <<number_of_requests>> }
    More information about JinaAI’s rate limits can be found at https://jina.ai/contact-sales/#rate-limit.
  service_settings for the rerank task type:
  - model_id: (Required, string) The name of the model to use for the inference task. To review the available rerank compatible models, refer to https://jina.ai/reranker.
  service_settings for the text_embedding task type:
  - model_id: (Optional, string) The name of the model to use for the inference task. To review the available text_embedding models, refer to https://jina.ai/embeddings/.
  - similarity: (Optional, string) Similarity measure. One of cosine, dot_product, l2_norm. Defaults based on the embedding_type (float → dot_product, int8/byte → cosine).
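To make the shape of service_settings concrete, the following sketch assembles the object from the fields described above and mirrors the documented similarity defaults. The function names jinaai_service_settings and default_similarity are hypothetical, and the check that requests_per_minute is positive is an assumption of this example rather than a documented constraint.

```python
import json

SIMILARITIES = ("cosine", "dot_product", "l2_norm")


def default_similarity(embedding_type: str) -> str:
    """Return the similarity default described above for an embedding_type."""
    defaults = {"float": "dot_product", "int8": "cosine", "byte": "cosine"}
    if embedding_type not in defaults:
        raise ValueError(f"unknown embedding_type: {embedding_type!r}")
    return defaults[embedding_type]


def jinaai_service_settings(api_key, model_id=None, similarity=None,
                            requests_per_minute=None) -> dict:
    """Build a service_settings object (hypothetical helper)."""
    settings = {"api_key": api_key}
    if model_id is not None:
        settings["model_id"] = model_id
    if similarity is not None:
        if similarity not in SIMILARITIES:
            raise ValueError(f"similarity must be one of {SIMILARITIES}")
        settings["similarity"] = similarity
    if requests_per_minute is not None:
        # Assumption of this sketch: a rate limit must be a positive number.
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        settings["rate_limit"] = {"requests_per_minute": requests_per_minute}
    return settings


print(json.dumps(
    jinaai_service_settings("<api_key>", model_id="jina-embeddings-v3",
                            requests_per_minute=500),
    indent=2,
))
```

Leaving out requests_per_minute omits rate_limit entirely, so the service default of 2000 requests per minute applies.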
- task_settings: (Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.
  task_settings for the rerank task type:
  - return_documents: (Optional, boolean) Specifies whether to return the document text within the results.
  - top_n: (Optional, integer) The number of most relevant documents to return. Defaults to the number of documents. If this inference endpoint is used in a text_similarity_reranker retriever query and top_n is set, it must be greater than or equal to rank_window_size in the query.
  task_settings for the text_embedding task type:
  - task: (Optional, string) Specifies the task passed to the model. Valid values are:
    - classification: use it for embeddings passed through a text classifier.
    - clustering: use it for embeddings run through a clustering algorithm.
    - ingest: use it for storing document embeddings in a vector database.
    - search: use it for storing embeddings of search queries run against a vector database to find relevant documents.
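The two task_settings variants carry different constraints: the text_embedding task accepts a fixed set of values, and the rerank top_n setting is tied to rank_window_size when the endpoint is used from a text_similarity_reranker retriever. A minimal sketch of both checks follows; embedding_task_settings and top_n_allows_window are hypothetical helper names introduced only for this example.

```python
# Valid values for the text_embedding "task" setting, as listed above.
EMBEDDING_TASKS = ("classification", "clustering", "ingest", "search")


def embedding_task_settings(task: str) -> dict:
    """Build task_settings for a text_embedding endpoint (hypothetical helper)."""
    if task not in EMBEDDING_TASKS:
        raise ValueError(f"task must be one of {EMBEDDING_TASKS}")
    return {"task": task}


def top_n_allows_window(top_n, rank_window_size: int) -> bool:
    """Check the rule that top_n, when set, must be >= rank_window_size.

    top_n=None models an endpoint created without the setting, in which
    case the rule places no restriction on the retriever.
    """
    return top_n is None or top_n >= rank_window_size


print(embedding_task_settings("ingest"))
print(top_n_allows_window(None, 50))
print(top_n_allows_window(20, 50))
```

Running the second check before issuing a retriever query is one way to catch a mismatch between an endpoint's top_n and the query's rank_window_size early.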
JinaAI service examples
The following examples demonstrate how to create inference endpoints for text_embedding and rerank tasks using the JinaAI service and use them in search requests.
First, we create the embeddings service:
resp = client.inference.put(
task_type="text_embedding",
inference_id="jinaai-embeddings",
inference_config={
"service": "jinaai",
"service_settings": {
"model_id": "jina-embeddings-v3",
"api_key": "<api_key>"
}
},
)
print(resp)
const response = await client.inference.put({
task_type: "text_embedding",
inference_id: "jinaai-embeddings",
inference_config: {
service: "jinaai",
service_settings: {
model_id: "jina-embeddings-v3",
api_key: "<api_key>",
},
},
});
console.log(response);
PUT _inference/text_embedding/jinaai-embeddings
{
"service": "jinaai",
"service_settings": {
"model_id": "jina-embeddings-v3",
"api_key": "<api_key>"
}
}
Then, we create the rerank service:
resp = client.inference.put(
task_type="rerank",
inference_id="jinaai-rerank",
inference_config={
"service": "jinaai",
"service_settings": {
"api_key": "<api_key>",
"model_id": "jina-reranker-v2-base-multilingual"
},
"task_settings": {
"top_n": 10,
"return_documents": True
}
},
)
print(resp)
const response = await client.inference.put({
task_type: "rerank",
inference_id: "jinaai-rerank",
inference_config: {
service: "jinaai",
service_settings: {
api_key: "<api_key>",
model_id: "jina-reranker-v2-base-multilingual",
},
task_settings: {
top_n: 10,
return_documents: true,
},
},
});
console.log(response);
PUT _inference/rerank/jinaai-rerank
{
"service": "jinaai",
"service_settings": {
"api_key": "<api_key>",
"model_id": "jina-reranker-v2-base-multilingual"
},
"task_settings": {
"top_n": 10,
"return_documents": true
}
}
Now we can create an index that uses the jinaai-embeddings service to index the documents.
resp = client.indices.create(
index="jinaai-index",
mappings={
"properties": {
"content": {
"type": "semantic_text",
"inference_id": "jinaai-embeddings"
}
}
},
)
print(resp)
const response = await client.indices.create({
index: "jinaai-index",
mappings: {
properties: {
content: {
type: "semantic_text",
inference_id: "jinaai-embeddings",
},
},
},
});
console.log(response);
PUT jinaai-index
{
"mappings": {
"properties": {
"content": {
"type": "semantic_text",
"inference_id": "jinaai-embeddings"
}
}
}
}
Next, we index a few documents with the bulk API:
resp = client.bulk(
index="jinaai-index",
operations=[
{
"index": {
"_index": "jinaai-index",
"_id": "1"
}
},
{
"content": "Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades."
},
{
"index": {
"_index": "jinaai-index",
"_id": "2"
}
},
{
"content": "She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. "
},
{
"index": {
"_index": "jinaai-index",
"_id": "3"
}
},
{
"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."
}
],
)
print(resp)
const response = await client.bulk({
index: "jinaai-index",
operations: [
{
index: {
_index: "jinaai-index",
_id: "1",
},
},
{
content:
"Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades.",
},
{
index: {
_index: "jinaai-index",
_id: "2",
},
},
{
content:
"She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. ",
},
{
index: {
_index: "jinaai-index",
_id: "3",
},
},
{
content:
"Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists.",
},
],
});
console.log(response);
PUT jinaai-index/_bulk
{ "index" : { "_index" : "jinaai-index", "_id" : "1" } }
{"content": "Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades."}
{ "index" : { "_index" : "jinaai-index", "_id" : "2" } }
{"content": "She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. "}
{ "index" : { "_index" : "jinaai-index", "_id" : "3" } }
{"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."}
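The console form of the bulk request is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end of the body. As a rough sketch of how such a body could be generated from a mapping of IDs to documents, consider the following; bulk_index_body is a hypothetical helper and the short document texts are placeholders, not the example documents above.

```python
import json
from typing import Mapping


def bulk_index_body(index: str, docs: Mapping[str, dict]) -> str:
    """Build an NDJSON _bulk body that indexes each document under its ID."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: which index and ID the next line should be stored under.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_index_body(
    "jinaai-index",
    {
        "1": {"content": "first placeholder document"},
        "2": {"content": "second placeholder document"},
    },
)
print(body, end="")
```

Because json.dumps never emits raw newlines inside a string, each action and each document stays on a single line, which is what the NDJSON format requires.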
Now, with the index created, we can search with and without the reranker service.
resp = client.search(
index="jinaai-index",
query={
"semantic": {
"field": "content",
"query": "who inspired taking care of the sea?"
}
},
)
print(resp)
const response = await client.search({
index: "jinaai-index",
query: {
semantic: {
field: "content",
query: "who inspired taking care of the sea?",
},
},
});
console.log(response);
GET jinaai-index/_search
{
"query": {
"semantic": {
"field": "content",
"query": "who inspired taking care of the sea?"
}
}
}
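When comparing the search with and without the reranker, it can help to reduce each response to document IDs and scores. The sketch below assumes the usual search response layout, with hits under hits.hits and each hit carrying _id and _score. The helper name hit_scores is hypothetical, and the sample response is made up purely for illustration; it is not output from the requests in this section.

```python
def hit_scores(response) -> list:
    """Return (_id, _score) pairs in the order the search response lists them."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


# Made-up response fragment, used only to show the helper's behavior.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "3", "_score": 0.9},
            {"_id": "1", "_score": 0.4},
        ]
    }
}

for doc_id, score in hit_scores(sample_response):
    print(doc_id, score)
```

Applying the same helper to both responses makes it easy to see whether the reranker changed the order of the hits.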
resp = client.search(
index="jinaai-index",
retriever={
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"semantic": {
"field": "content",
"query": "who inspired taking care of the sea?"
}
}
}
},
"field": "content",
"rank_window_size": 100,
"inference_id": "jinaai-rerank",
"inference_text": "who inspired taking care of the sea?"
}
},
)
print(resp)
const response = await client.search({
index: "jinaai-index",
retriever: {
text_similarity_reranker: {
retriever: {
standard: {
query: {
semantic: {
field: "content",
query: "who inspired taking care of the sea?",
},
},
},
},
field: "content",
rank_window_size: 100,
inference_id: "jinaai-rerank",
inference_text: "who inspired taking care of the sea?",
},
},
});
console.log(response);
POST jinaai-index/_search
{
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"semantic": {
"field": "content",
"query": "who inspired taking care of the sea?"
}
}
}
},
"field": "content",
"rank_window_size": 100,
"inference_id": "jinaai-rerank",
"inference_text": "who inspired taking care of the sea?"
}
}
}
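The reranked request repeats the field and the query text in two places: once in the inner semantic query and once in the reranker settings. A small builder, sketched here under the assumption that both places should always use the same values, keeps them in sync. The function name reranked_semantic_search is hypothetical.

```python
import json


def reranked_semantic_search(field: str, query: str, inference_id: str,
                             rank_window_size: int) -> dict:
    """Build a search body that reranks semantic query results.

    The same field and query text feed both the first-stage semantic
    query and the text_similarity_reranker settings.
    """
    return {
        "retriever": {
            "text_similarity_reranker": {
                # First stage: a standard retriever running a semantic query.
                "retriever": {
                    "standard": {
                        "query": {"semantic": {"field": field, "query": query}}
                    }
                },
                # Second stage: rerank the top rank_window_size hits.
                "field": field,
                "rank_window_size": rank_window_size,
                "inference_id": inference_id,
                "inference_text": query,
            }
        }
    }


body = reranked_semantic_search(
    field="content",
    query="who inspired taking care of the sea?",
    inference_id="jinaai-rerank",
    rank_window_size=100,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary has the same structure as the console request above, so its retriever value can be passed as the retriever argument in the Python client example.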