Building AI Agents with AI SDK and Elastic

Do you keep hearing about AI agents, and aren't quite sure what they are or how they connect to Elastic? Here I dive into AI Agents, specifically covering:

  1. What is an AI agent?
  2. What problems can be solved using AI Agents?
  3. An example agent for travel planning, available here on GitHub, built using AI SDK, TypeScript and Elasticsearch.

What is an AI Agent?

An AI agent is software that can perform tasks autonomously and take actions on behalf of a human by leveraging artificial intelligence. It achieves this by combining one or more LLMs with tools (or functions) that you define to perform particular actions. Example actions in these tools could be:

  1. Extracting information from databases, sensors, APIs or search engines such as Elasticsearch.
  2. Performing complex calculations whose results can be summarized by the LLM.
  3. Making key decisions based on various data inputs quickly.
  4. Raising necessary alerts and feedback based on the response.

What can be done with them?

AI Agents could be leveraged for many different use cases in numerous domains based on the type of agent you build. Possible examples include:

  1. A utility-based agent that evaluates actions and makes recommendations to maximize overall gain, such as suggesting films and series to watch based on a person's prior viewing history.
  2. Model-based agents that make real-time decisions based on input from sensors, such as self-driving cars or automated vacuum cleaners.
  3. Learning agents that combine data and machine learning to identify patterns and exceptions in cases such as fraud detection.
  4. Utility agents that recommend investment decisions based on a person's risk appetite and existing portfolio to maximize their return. With my former finance hat on, this could expedite such decisions, provided accuracy, reputational risk and regulatory factors are carefully weighed.
  5. Simple chatbots, as seen today, that can access our account information and answer basic questions using natural language.

Example: Travel Planner

To better understand what these agents can do, and how to build one using familiar web technologies, let's walk through a simple example of a travel planner written using AI SDK, TypeScript and Elasticsearch.

Architecture

Our example comprises the following distinct elements:

  1. A tool, named weatherTool, that pulls weather data for the location specified by the user from the Weather API.
  2. A fcdoTool tool that provides the current travel status of the destination from the GOV.UK Content API.
  3. A flightTool tool that pulls the flight information from Elasticsearch using a simple query.
  4. All of the above information is then passed to the GPT-4 Turbo LLM.

Model choice

When building your first AI agent, it can be difficult to figure out which model is the right one to use. Resources such as the Hugging Face Open LLM Leaderboard are a good start, but for guidance on tool usage you can also check out the Berkeley Function-Calling Leaderboard.

In our case, AI SDK specifically recommends using models with strong tool calling capabilities such as gpt-4 or gpt-4-turbo in their Prompt Engineering documentation. Selecting the wrong model, as I found at the start of this project, can lead to the LLM not calling multiple tools in the way you expect, or even compatibility errors as you see below:

# Llama3 lacks tool support (requires 3.1 or higher)
llama3 does not support tools

# Unsupported toolChoice option to configure tool usage
AI_UnsupportedFunctionalityError: 'Unsupported tool choice type: required' functionality not supported.

Prerequisites

To run this example, please ensure the prerequisites in the repository README are actioned.
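
As a convenience, the environment variables referenced by the code throughout this post can be collected in a .env.local file along the following lines. The variable names below are taken directly from the code; check the README for the authoritative list and any additional settings:

# OpenAI provider (read by default by @ai-sdk/openai)
OPENAI_API_KEY=<your OpenAI API key>

# Weather API key used by the weather tool
WEATHER_API_KEY=<your weatherapi.com key>

# Elasticsearch connection used by the flight tool
ELASTIC_ENDPOINT=<your Elasticsearch endpoint>
ELASTIC_API_KEY=<your Elasticsearch API key>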

Basic Chat Assistant

The simplest AI agent that you can create with AI SDK will generate a response from the LLM without any additional grounding context. AI SDK supports numerous JavaScript frameworks as outlined in their documentation. However, the AI SDK UI library documentation lists varied support for React, Svelte, Vue.js and SolidJS, with many of the tutorials targeting Next.js. For this reason, our example is written with Next.js.

The basic anatomy of any AI SDK chatbot uses the useChat hook to handle requests to the backend route, by default /api/chat:

The page.tsx file contains our client-side implementation in the Chat component, including the submission, loading and error handling capabilities exposed by the useChat hook. The loading and error handling functionality is optional, but recommended to give an indication of the state of the request. Agents can take considerable time to respond when compared to simple REST calls, so it's important to keep the user updated on progress and prevent key mashing and repeated calls.

Because of the client interactivity of this component, I use the use client directive to make sure the component is considered part of the client bundle:

'use client';

import { useChat } from '@ai-sdk/react';
import Showdown from 'showdown';
import Spinner from './components/spinner';

// Converter used to render the Markdown returned by the LLM as HTML
const markdownConverter = new Showdown.Converter();

export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div className="chat__form">
      <div className="chat__messages">
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id} className="message">
              <div>
                { /* Messages with the role of *assistant* denote responses from the LLM */ }
                <div className="role">{m.role === "assistant" ? "Sorley" : "Me"}</div>
                { /* User or LLM generated content */ }
                <div className="itinerary__div" dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))}
      </div>
      {
        /* Spinner shows when awaiting a response */
        isLoading && (
          <div className="spinner__container">
            <Spinner />
            <button id="stop__button" type="button" onClick={() => stop()}>
              Stop
            </button>
          </div>
        )}
      {
        /* Show error message and retry button when something goes wrong */
        error && (
          <>
            <div className="error__container">Unable to generate a plan. Please try again later!</div>
            <button id="retry__button" type="button" onClick={() => reload()}>
              Retry
            </button>
          </>
        )}
      { /* Form using the default input and submission handler from the useChat hook */ }
      <form onSubmit={handleSubmit}>
        <input
          className="search-box__input"
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}

The Chat component maintains the user input via the input property exposed by the hook, and sends the request to the appropriate route on submission. I have used the default handleSubmit method, which will invoke the /api/chat POST route.
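
If you would rather not rely on the default route, the useChat hook also accepts an api option; a minimal sketch (the path shown here simply restates the default):

'use client';
import { useChat } from '@ai-sdk/react';

export default function Chat() {
  // Explicitly target the chat route; '/api/chat' is also the default
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });
  // ... rendering omitted
}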

The handler for this route, located in /api/chat/route.ts, initializes the connection to the gpt-4-turbo LLM using the OpenAI provider:

import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { NextResponse } from 'next/server';

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  try {
    // Generate response from the LLM using the provided model, system prompt and messages
    const result = streamText({
      model: openai('gpt-4-turbo'),
      system: 'You are a helpful assistant that returns travel itineraries',
      messages
    });

    // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
    return result.toDataStreamResponse();
  } catch (e) {
    console.error(e);
    return new NextResponse("Unable to generate a plan. Please try again later!");
  }
}

Note that the above implementation will pull the API key from the environment variable OPENAI_API_KEY by default. If you need to customize the configuration of the openai provider, use the createOpenAI method to override the provider settings.
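
For example, a customized provider instance might look something like the sketch below; the environment variable name and base URL are purely illustrative:

import { createOpenAI } from '@ai-sdk/openai';

// Create a customized provider instance rather than relying on the default
// OPENAI_API_KEY environment variable (values shown are hypothetical)
const openai = createOpenAI({
  apiKey: process.env.MY_OPENAI_API_KEY,
  baseURL: 'https://my-openai-proxy.example.com/v1',
});

// Used exactly as before: openai('gpt-4-turbo')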

With the above routes, a little help from Showdown to format the GPT Markdown output as HTML, and a bit of CSS magic in the globals.css file, we end up with a simple responsive UI that will generate an itinerary based on the user prompt:

Adding tools

Adding tools to AI agents essentially means creating custom functions that the LLM can use to enhance the response it generates. At this stage I shall add 3 new tools that the LLM can choose to use when generating an itinerary, as shown in the below diagram:

Weather tool

While the generated itinerary is a great start, we may want to add information that the LLM was not trained on, such as the weather. This leads us to write our first tool, which can be used not only as input to the LLM, but also as additional data that allows us to adapt the UI.

The created weather tool, for which the full code is shown below, takes a single parameter location that the LLM will pull from the user input. The parameters attribute accepts the parameter object using the TypeScript schema validation library Zod and ensures that the correct parameter types are passed. The description attribute allows you to define what the tool does to help the LLM decide if it wants to invoke the tool.

import { tool as createTool } from 'ai';
import { z } from 'zod';
import { WeatherResponse } from '../model/weather.model';

export const weatherTool = createTool({
  description: 'Display the weather for a holiday location',
  parameters: z.object({
    location: z.string().describe('The location to get the weather for')
  }),
  execute: async function ({ location }) {
    // While a historical forecast may be better, this example gets the next 3 days
    const url = `https://api.weatherapi.com/v1/forecast.json?q=${location}&days=3&key=${process.env.WEATHER_API_KEY}`;

    try {
      const response = await fetch(url);
      const weather: WeatherResponse = await response.json();

      return {
        location: location,
        condition: weather.current.condition.text,
        condition_image: weather.current.condition.icon,
        temperature: Math.round(weather.current.temp_c),
        feels_like_temperature: Math.round(weather.current.feelslike_c),
        humidity: weather.current.humidity
      };
    } catch (e) {
      console.error(e);
      return {
        message: 'Unable to obtain weather information',
        location: location
      };
    }
  }
});

You may have guessed that the execute attribute is where we define an asynchronous function with our desired tool logic. Specifically, the location to send to the weather API is passed to our tool function. The response is then transformed into a single JSON object that can be shown on the UI, and also used to generate the itinerary.
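
For reference, here is a minimal sketch of the WeatherResponse type the tool relies on, limited to the fields read above; the actual weatherapi.com payload, and the model in the repository, contain many more fields:

// model/weather.model.ts (sketch, covering only the fields used by weatherTool)
export interface WeatherResponse {
  current: {
    temp_c: number;
    feelslike_c: number;
    humidity: number;
    condition: {
      text: string;
      icon: string;
    };
  };
}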

Given we are only running a single tool at this stage, we don't need to consider sequential or parallel flows. It's simply a case of adding the tools property to the streamText method that handles the LLM output in the original /api/chat route:

import { weatherTool } from '@/app/ai/weather.tool';
// Other imports omitted

export const tools = {
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    system:
      'You are a helpful assistant that returns travel itineraries based on the specified location.',
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

The tool output is provided alongside the messages, which allows us to provide a more complete experience for the user. Each message contains a parts attribute; each part has a type property and, for tool calls, a toolInvocation attribute with a state property. Where these hold the values tool-invocation and result respectively, we can pull the returned results from the toolInvocation attribute and show them as we wish.

The page.tsx source is changed to show the weather summary alongside the generated itinerary:

'use client';

import { useChat } from '@ai-sdk/react';
import Image from 'next/image';
import Showdown from 'showdown';
import { Weather } from './components/weather';
import pending from '../../public/multi-cloud.svg';

// Converter used to render the Markdown returned by the LLM as HTML
const markdownConverter = new Showdown.Converter();

export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div className="chat__form">
      <div className="chat__messages">
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id} className="message">
              <div>
                { /* Messages with the role of *assistant* denote responses from the LLM */ }
                <div className="role">{m.role === "assistant" ? "Sorley" : "Me"}</div>
                { /* Tool handling */ }
                <div className="tools__summary">
                  {
                    m.parts.map(part => {
                      if (part.type === 'tool-invocation') {
                        const { toolName, toolCallId, state } = part.toolInvocation;
                        if (state === 'result') {
                          { /* Show weather results */ }
                          if (toolName === 'displayWeather') {
                            const { result } = part.toolInvocation;
                            return (
                              <div key={toolCallId}>
                                <Weather {...result} />
                              </div>
                            );
                          }
                        } else {
                          return (
                            <div key={toolCallId}>
                              {toolName === 'displayWeather' ? (
                                <div className="weather__tool">
                                  <Image src={pending} width={80} height={80} alt="Placeholder Weather" />
                                  <p className="loading__weather__message">Loading weather...</p>
                                </div>
                              ) : null}
                            </div>
                          );
                        }
                      }
                    })}
                </div>
                { /* User or LLM generated content */ }
                <div className="itinerary__div" dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))}
      </div>
      { /* Spinner and loading handling omitted */ }
      { /* Form using the default input and submission handler from the useChat hook */ }
      <form onSubmit={handleSubmit}>
        <input
          className="search-box__input"
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}

The above will provide the following output to the user:

FCDO tool

The power of AI agents is that the LLM can choose to trigger multiple tools to source relevant information when generating the response. Let's say we want to check the travel guidance for the destination country. A new tool fcdoGuidance, as per the below code, can trigger an API call to the GOV.UK Content API:

import { tool as createTool } from 'ai';
import { z } from 'zod';
import { FCDOResponse } from '../model/fco.model';

export const fcdoTool = createTool({
  description: 'Display the FCDO guidance for a destination',
  parameters: z.object({
    country: z.string().describe('The country of the location to get the guidance for')
  }),
  execute: async function ({ country }) {
    const url = `https://www.gov.uk/api/content/foreign-travel-advice/${country.toLowerCase()}`;

    try {
      const response = await fetch(url, { headers: { 'Content-Type': 'application/json' } });
      const fcoResponse: FCDOResponse = await response.json();

      const alertStatus: string = fcoResponse.details.alert_status.length == 0 ? 'Unknown' :
        fcoResponse.details.alert_status[0].replaceAll('_', ' ');

      return {
        status: alertStatus,
        url: fcoResponse.details?.document?.url
      };
    } catch (e) {
      console.error(e);
      return {
        message: 'Unable to obtain FCDO information',
        country: country
      };
    }
  }
});
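
As with the weather tool, the FCDOResponse type only needs to cover the fields the tool actually reads; a minimal sketch follows (the model in the repository and the full GOV.UK payload are richer):

// model/fco.model.ts (sketch, covering only the fields used by fcdoTool)
export interface FCDOResponse {
  details: {
    alert_status: string[];
    document?: {
      url?: string;
    };
  };
}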

You will notice that the format is very similar to the weather tool discussed previously. Indeed, to include the tool into the LLM output it's just a case of adding to the tools property and amending the prompt in the /api/chat route:

// Imports omitted

export const tools = {
  fcdoGuidance: fcdoTool,
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    system:
      "You are a helpful assistant that returns travel itineraries based on a location. " +
      "Use the current weather from the displayWeather tool to adjust the itinerary and give packing suggestions. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

Once the components showing the output for the tool are added to the page, the output for a country where travel is not advised should look something like this:

LLMs that support tool calling can choose not to call a tool unless they feel the need to. With gpt-4-turbo, both of our tools are called in parallel. However, prior attempts using llama3.1 would result in only a single tool being called, depending on the input.
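
If you need more control than the model's own judgement, streamText also accepts a toolChoice option; a sketch is shown below, bearing in mind the earlier error message showing that not every model or provider supports 'required':

const result = streamText({
  model: openai('gpt-4-turbo'),
  system: 'You are a helpful assistant that returns travel itineraries based on the specified location.',
  messages,
  maxSteps: 2,
  tools,
  // 'auto' (the default) lets the model decide, 'required' forces a tool call,
  // 'none' disables tool calls for this request
  toolChoice: 'required',
});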

Flight information tool

RAG, or Retrieval Augmented Generation, refers to software architectures where documents from a search engine or database are passed to the LLM as context to ground its response in the provided set of documents. This architecture allows the LLM to generate a more accurate response based on data it has not previously been trained on. While agentic RAG typically processes documents using a defined set of tools, often alongside vector or hybrid search, it's also possible to utilize RAG as part of a more complex flow with traditional lexical search, as we do here.

To pass the flight information alongside the other tools to the LLM, a final tool flightTool pulls outbound and inbound flights from Elasticsearch for the provided origin and destination using the Elasticsearch JavaScript client:

import { tool as createTool } from 'ai';
import { z } from 'zod';
import { Client } from '@elastic/elasticsearch';
import { SearchResponseBody } from '@elastic/elasticsearch/lib/api/types';
import { Flight } from '../model/flight.model';

const index: string = "upcoming-flight-data";

const client: Client = new Client({
  node: process.env.ELASTIC_ENDPOINT,
  auth: {
    apiKey: process.env.ELASTIC_API_KEY || "",
  },
});

// Extract the flight documents from the search response
function extractFlights(response: SearchResponseBody<Flight>): (Flight | undefined)[] {
  return response.hits.hits.map(hit => { return hit._source });
}

export const flightTool = createTool({
  description:
    "Get flight information for a given destination from Elasticsearch, both outbound and return journeys",
  parameters: z.object({
    destination: z.string().describe("The destination we are flying to"),
    origin: z
      .string()
      .describe(
        "The origin we are flying from (defaults to London if not specified)"
      ),
  }),
  execute: async function ({ destination, origin }) {
    try {
      const responses = await client.msearch({
        searches: [
          // Outbound leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  {
                    match: {
                      origin: origin,
                    },
                  },
                  {
                    match: {
                      destination: destination,
                    },
                  },
                ],
              },
            },
          },
          // Return leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  {
                    match: {
                      origin: destination,
                    },
                  },
                  {
                    match: {
                      destination: origin,
                    },
                  },
                ],
              },
            },
          },
        ],
      });

      if (responses.responses.length < 2) {
        throw new Error("Unable to obtain flight data");
      }

      return {
        outbound: extractFlights(responses.responses[0] as SearchResponseBody<Flight>),
        inbound: extractFlights(responses.responses[1] as SearchResponseBody<Flight>)
      };
    } catch (e) {
      console.error(e);
      return {
        message: "Unable to obtain flight information",
        destination: destination,
      };
    }
  },
});

This example makes use of the Multi search API to pull the outbound and inbound flights in separate searches, before pulling out the documents using the extractFlights utility method.
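
If you want to try the query locally, the upcoming-flight-data index just needs documents whose origin and destination fields match the searches above. A sketch of indexing one such document with the Elasticsearch JavaScript client follows; the extra fields are hypothetical and may not match the repository's Flight model:

import { Client } from '@elastic/elasticsearch';

const client = new Client({
  node: process.env.ELASTIC_ENDPOINT,
  auth: { apiKey: process.env.ELASTIC_API_KEY || '' },
});

// Index a single sample flight into the index queried by flightTool
await client.index({
  index: 'upcoming-flight-data',
  document: {
    origin: 'London',             // field used by the outbound search
    destination: 'Tokyo',         // field used by the outbound search
    carrier: 'Example Air',       // hypothetical field
    price: 420,                   // hypothetical field
    departure_date: '2025-07-01'  // hypothetical field
  },
});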

To use the tool output, we need to amend our prompt and tool collection once more in /api/chat/route.ts:

// Imports omitted

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

export const tools = {
  getFlights: flightTool,
  displayWeather: weatherTool,
  fcdoGuidance: fcdoTool
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    system:
      "You are a helpful assistant that returns travel itineraries based on location, the FCDO guidance from the specified tool, and the weather captured from the displayWeather tool. " +
      "Use the flight information from tool getFlights only to recommend possible flights in the itinerary. " +
      "Return an itinerary of sites to see and things to do based on the weather. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

With the final prompt, all 3 tools will be called to generate an itinerary including flight options:

Summary

If you weren't 100% sure what AI agents are, hopefully now you are! We've covered the basics using a simple travel planner example built with AI SDK, TypeScript and Elasticsearch. It would be possible to extend our planner to add other sources, allow the user to book the trip along with tours, or even generate image banners based on the location (for which support in AI SDK is currently experimental).
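
As a taste of that last idea, a rough sketch of AI SDK's experimental image generation is shown below; the API is experimental and may change between releases, and the prompt and model choice here are purely illustrative:

import { experimental_generateImage as generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';

// Generate a banner image for the itinerary location (experimental API)
const { image } = await generateImage({
  model: openai.image('dall-e-3'),
  prompt: 'A travel banner of the Tokyo skyline at sunset',
});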

If you haven't dived into the code yet, check it out here!

Resources

  1. AI SDK Core Documentation
  2. AI SDK Core > Tool Calling
  3. Elasticsearch JavaScript Client
  4. Travel Planner AI Agent | GitHub
