Semantic search using the Inference API with the OpenAI service

Learn how to use the Inference API with the OpenAI service for semantic search.

For this example, you will need:

  • An Elastic deployment with Elasticsearch 8.12 or above

  • A paid OpenAI account (the OpenAI free trial API usage is too limited to use the Inference API with the OpenAI service)

If you don't have an Elastic Cloud deployment, sign up here for a free trial.

To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.12.0 or above). Because we're using an Elastic Cloud deployment, we'll use the Cloud ID to identify our deployment.

First we need to pip install the following packages:

  • elasticsearch
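In a notebook, this is a single cell (the `!` prefix hands the line to the shell):

```python
!pip install elasticsearch
```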

Next, we need to import the modules we need. 🔐 NOTE: getpass enables us to securely prompt the user for credentials without echoing them to the terminal.
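A minimal import cell for what follows might look like this (helpers is imported here for the bulk ingest later in the notebook):

```python
from getpass import getpass

from elasticsearch import Elasticsearch, helpers
```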

Now we can instantiate the Python Elasticsearch client.

First we prompt the user for their password and Cloud ID. Then we create a client object that instantiates an instance of the Elasticsearch class.
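As a sketch, assuming basic authentication with the elastic user (an API key works just as well):

```python
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")
ELASTIC_PASSWORD = getpass("Elastic password: ")

# Create the client instance, identifying the deployment by its Cloud ID.
client = Elasticsearch(
    cloud_id=ELASTIC_CLOUD_ID,
    basic_auth=("elastic", ELASTIC_PASSWORD),
)
```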

Enable Telemetry

Knowing that you are using this notebook helps us decide where to invest our efforts to improve our products. We would like to ask you to run the following code to let us gather anonymous usage statistics. See telemetry.py for details. Thank you!

Test the Client

Before you continue, confirm that the client has connected with this test.
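For example, printing the cluster info will raise an error if the connection or credentials are wrong:

```python
print(client.info())
```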

Refer to the documentation to learn how to connect to a self-managed deployment.

Read this page to learn how to connect using API keys.

Create the inference endpoint

Let's create the inference endpoint by using the Create inference API.

You'll need an OpenAI API key, which you can find in your OpenAI account under the API keys section. A paid membership is required to complete the steps in this notebook, as the OpenAI free trial API usage is limited.
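A sketch of the call, assuming the endpoint ID my_openai_embedding_model (referenced throughout the rest of this notebook) and OpenAI's text-embedding-ada-002 embedding model:

```python
OPENAI_API_KEY = getpass("OpenAI API key: ")

client.inference.put(
    task_type="text_embedding",
    inference_id="my_openai_embedding_model",
    body={
        "service": "openai",
        "service_settings": {
            "api_key": OPENAI_API_KEY,
            # Assumed model; any OpenAI text embedding model can be used.
            "model_id": "text-embedding-ada-002",
        },
    },
)
```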

NOTE: If you use Elasticsearch 8.12, you must change inference_id in the snippet above to model_id, and inference.put to inference.put_model.

Create an ingest pipeline with an inference processor

Create an ingest pipeline with an inference processor by using the put_pipeline method. Reference the inference endpoint created above as model_id to infer against the data that is being ingested in the pipeline.
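A sketch of that call, assuming the pipeline name openai_embeddings_pipeline and the plot / plot_embedding field names used in the rest of the notebook:

```python
client.ingest.put_pipeline(
    id="openai_embeddings_pipeline",
    description="Ingest pipeline for OpenAI inference.",
    processors=[
        {
            "inference": {
                # The inference endpoint created in the previous step.
                "model_id": "my_openai_embedding_model",
                "input_output": {
                    # Embed the "plot" field...
                    "input_field": "plot",
                    # ...and store the vector in "plot_embedding".
                    "output_field": "plot_embedding",
                },
            }
        }
    ],
)
```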

Let's note a few important parameters from that API call:

  • inference: A processor that performs inference using a machine learning model.
  • model_id: Specifies the ID of the inference endpoint to be used. In this example, the inference ID is set to my_openai_embedding_model. Use the inference ID you defined when you created the inference task.
  • input_output: Specifies input and output fields.
  • input_field: Field name from which the dense_vector representation is created.
  • output_field: Field name which contains inference results.

Create index

Next, create the mapping of the destination index, that is, the index that will contain the embeddings the model creates from your input text. The destination index must have a field with the dense_vector field type to index the output of the OpenAI model.

Let's create an index named openai-movie-embeddings with the mappings we need.
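A sketch, assuming text-embedding-ada-002's 1536-dimensional vectors and wiring in the pipeline from the previous step as the index's default pipeline, so embeddings are created at ingest time:

```python
client.indices.create(
    index="openai-movie-embeddings",
    settings={"index": {"default_pipeline": "openai_embeddings_pipeline"}},
    mappings={
        "properties": {
            "plot_embedding": {
                "type": "dense_vector",
                "dims": 1536,  # Vector size of text-embedding-ada-002.
                "similarity": "dot_product",
            },
            "plot": {"type": "text"},
        }
    },
)
```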

Insert Documents

Let's insert our example dataset of 12 movies. You need a paid OpenAI account to complete this step; otherwise, the document ingest will time out due to the API request rate limits.
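A sketch of the bulk ingest, assuming a hypothetical local movies.json holding the example dataset (each document with at least title and plot fields):

```python
import json

# Hypothetical local copy of the 12-movie example dataset.
with open("movies.json") as f:
    movies = json.load(f)

operations = [
    {"_index": "openai-movie-embeddings", "_source": movie}
    for movie in movies
]

# The index's default pipeline computes the embeddings during ingest.
helpers.bulk(client, operations)
```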

Semantic search

After the dataset has been enriched with the embeddings, you can query the data using semantic search. Pass a query_vector_builder to the k-nearest neighbor (kNN) vector search API, and provide the query text and the model you have used to create the embeddings.
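A sketch of such a query (the query text and the k / num_candidates values are illustrative):

```python
response = client.search(
    index="openai-movie-embeddings",
    size=3,
    knn={
        "field": "plot_embedding",
        "query_vector_builder": {
            "text_embedding": {
                # Must match the inference_id created earlier.
                "model_id": "my_openai_embedding_model",
                "model_text": "Fighting movie",
            }
        },
        "k": 10,
        "num_candidates": 100,
    },
)

for hit in response["hits"]["hits"]:
    print(f"Score: {hit['_score']}")
    print(f"Title: {hit['_source']['title']}")
    print(f"Plot: {hit['_source']['plot']}\n")
```

Running it against the enriched index yields output like: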

Score: 0.91674197
Title: Fight Club
Plot: An insomniac office worker and a devil-may-care soapmaker form an underground fight club that evolves into something much, much more.

Score: 0.9069592
Title: Pulp Fiction
Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.

Score: 0.8992071
Title: The Dark Knight
Plot: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.

NOTE: The value of model_id in the query_vector_builder must match the value of inference_id you created in the first step.

Ready to build state-of-the-art search experiences?

Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want.

Try it yourself