In this notebook we'll build a chatbot that can respond to questions about custom data, such as an employer's policies.
The chatbot uses LangChain's ConversationalRetrievalChain and has the following capabilities:
- Answer questions asked in natural language
- Run hybrid search in Elasticsearch to find documents that answer the question
- Extract and summarize the answer using an OpenAI LLM
- Maintain conversational memory for follow-up questions
Requirements 🧰
For this example, you will need:
- An Elastic deployment
- We'll be using Elastic Cloud for this example (available with a free trial)
- OpenAI account
Use Elastic Cloud
If you don't have an Elastic Cloud deployment, follow these steps to create one.
- Go to Elastic Cloud Registration and sign up for a free trial
- Select Create Deployment and follow the instructions
Install Elasticsearch locally
If you prefer to run Elasticsearch locally, the easiest way is to use Docker. See Install Elasticsearch with Docker for instructions.
Install packages 📦
First we pip install the packages we need for this example.
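A typical install command looks like the following; the exact package list and versions are assumptions and may differ in your environment.

```shell
# Install the client libraries used in this example (versions unpinned here;
# pin them in a real project for reproducibility).
pip install -qU langchain openai elasticsearch tiktoken
```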
Initialize clients 🔌
Next we input credentials with getpass, a Python standard-library module that securely prompts for secrets without echoing them to the terminal.
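A common pattern is to check for an environment variable first and only fall back to an interactive prompt. The `prompt_secret` helper below is a hypothetical convenience wrapper, not part of the notebook's original code:

```python
import os
from getpass import getpass

def prompt_secret(var_name, prompt):
    # Prefer an already-set environment variable; otherwise prompt the user
    # without echoing the input to the terminal.
    return os.environ.get(var_name) or getpass(prompt)

# In the notebook these would be collected interactively, e.g.:
# ELASTIC_CLOUD_ID = prompt_secret("ELASTIC_CLOUD_ID", "Elastic Cloud ID: ")
# OPENAI_API_KEY = prompt_secret("OPENAI_API_KEY", "OpenAI API key: ")
```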
Load and process documents 📄
Time to load some data! We'll be using the workplace search example data, which is a list of employee documents and policies.
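Loading amounts to parsing the JSON array and pulling out each document's title and body. A minimal sketch, where `load_docs` and the `"name"`/`"content"` field names are assumptions for illustration:

```python
import json

def load_docs(raw_json):
    # The example data is a JSON array of document records; extract a
    # (title, body) pair from each one.
    records = json.loads(raw_json)
    return [(record["name"], record["content"]) for record in records]
```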
Chunk documents into passages 🪓
As we're chatting with our bot, it will run semantic searches on the index to find the relevant documents. In order for this to be accurate, we need to split the full documents into small chunks (also called passages). This way the semantic search will find the passage within a document that most likely answers our question.
We'll use LangChain's RecursiveCharacterTextSplitter and split the documents' text at 800 characters with some overlap between chunks.
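LangChain's RecursiveCharacterTextSplitter splits on natural boundaries such as paragraphs before falling back to raw characters, but the core sliding-window-with-overlap behavior can be sketched with plain Python (`chunk_text` is a hypothetical stand-in, not the LangChain API):

```python
def chunk_text(text, chunk_size=800, overlap=100):
    # Slide a fixed-size window over the text, stepping by
    # (chunk_size - overlap) so consecutive chunks share some context.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.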
Let's generate the embeddings and index the documents with them.
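Conceptually, indexing means computing a vector per chunk and searching means ranking chunks by similarity to the query vector. The toy bag-of-words "embedding" below only illustrates that flow; the notebook itself relies on a real embedding model and Elasticsearch for this step:

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words term count.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def index_chunks(chunks):
    # "Index" each chunk alongside its vector.
    return [(chunk, embed(chunk)) for chunk in chunks]

def search(index, query, k=1):
    # Rank chunks by similarity to the query and return the top k.
    ranked = sorted(index, key=lambda item: cosine(item[1], embed(query)), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```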
Chat with the chatbot 💬
Let's initialize our chatbot. We'll define Elasticsearch as a store for retrieving documents and for storing the chat session history, OpenAI as the LLM to interpret questions and summarize answers, then we'll pass these to the conversational chain.
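The wiring can be sketched as follows. This is illustrative only: it won't run without the credentials collected earlier, import paths vary across LangChain versions, and the index and session names are assumptions.

```python
# Sketch of assembling the conversational chain; index/session names and
# the ELASTIC_CLOUD_ID / ELASTIC_API_KEY variables are illustrative.
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory
from langchain.vectorstores import ElasticsearchStore

# Elasticsearch as the document store for retrieval.
vector_store = ElasticsearchStore(
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
    index_name="workplace-docs",          # illustrative index name
    embedding=OpenAIEmbeddings(),
)

# Elasticsearch again, this time as the chat-session history store.
chat_history = ElasticsearchChatMessageHistory(
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
    index="workplace-chat-history",       # illustrative index name
    session_id="demo-session",
)

memory = ConversationBufferMemory(
    chat_memory=chat_history,
    memory_key="chat_history",
    return_messages=True,
)

# OpenAI as the LLM, tied to the retriever and memory.
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(temperature=0),
    retriever=vector_store.as_retriever(),
    memory=memory,
)
```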
Now we can ask our chatbot questions!
See how the chat history is passed as context for each question.
💡 Try experimenting with other questions or after clearing the workplace data, and observe how the responses change.
Once we're done, we can clean up the chat history for this session...
... or delete the indices.
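A cleanup sketch using the official Elasticsearch Python client; it assumes the credentials from earlier and the illustrative index names used above.

```python
# Connect with the credentials collected earlier (names are assumptions).
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id=ELASTIC_CLOUD_ID, api_key=ELASTIC_API_KEY)

# Clearing just the session history would be chat_history.clear(); deleting
# the indices removes the documents and the history entirely.
es.indices.delete(
    index=["workplace-docs", "workplace-chat-history"],
    ignore_unavailable=True,
)
```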
ObjectApiResponse({'acknowledged': True})