Learn How to Launch Production-Ready AI Products. Download Our Free Guide
January 11, 2024

How can I use LLMs to classify user intents for my chatbot?

Guest Post
Anita Kirkovska
No items found.

Creating an AI chatbot involves more than just using a model API and adding context with your data.

You should also consider the various intents users might have and how to manage them.

In this blog post, we'll outline the steps to set up and evaluate intent detection in your chatbot workflow.

Learn how successful companies build with AI

Download this practical guide and enable your teams to innovate with AI.
Get Free Copy
If you want to compare more models, check our LLM Leaderboard here or book a demo to start using Vellum Evaluations to run these tests at scale.
Read the whole analysis in the sections that follow, and sign up for our newsletter if you want to get these analyses in your inbox!
Inspired by this, we've designed Vellum to meet these needs, and now many product and engineering teams use our suite of tools—Workflows, Evaluations, and Deployments—to build agentic workflows.

Build a Production-Ready AI System

Platform for product and engineering teams to build, deploy, and improve AI products.
Learn More

LLM orchestration with Vellum

Platform for product and engineering teams to build, deploy, and improve AI products.
Learn More

What Is Intent Detection?

Intent detection is a task that identifies a user's intention or desired outcome from their query. This task is essential for AI chatbots to provide accurate and relevant responses.

A very standard intent detection workflow uses a pre-trained LLM model, a prompt with instructions and context, and a handler logic for each of the intents.

Why do you need it?

Let’s look at a real example to illustrate the need for this step in a chatbot workflow.

Imagine that you’re building a customer service chatbot for an e-commerce business. Your visitors can ask the chatbot any question. Before the chatbot responds or takes any action, it needs to accurately understand the visitor's intent.

To keep this example simple, lets say that you have defined 5 intent categories: "Order Status", "Product Information", "Payments", "Returns" and "Feedback”.

For each category, there should be a distinct step where the LLM powered chatbot, figures out the user's intent. It does this by placing the user's question into the right category. After identifying the intent, the chatbot can then take the next appropriate actions for that particular category.

Having separate steps for the prompts and intent handlers is useful because each of your intents might need to do different actions. For example: “Returns” might need to be handled by an external service/API that a handler action should call, and the handler for “Product information” might just call an LLM and a context doc to answer with text response. Also, adding too many instructions in one prompt can also influence the performance.

Identifying these intents accurately allows the chatbot to respond better, call an external API or route the query to the correct personnel for further assistance.

In the next section we show you how to implement it.

How can you build intent detection for your chatbot?

To build a reliable intent detection for your chatbot you need to cover 4 critical steps:

  • Defining intents
  • Setting up the intent detection prompt
  • Setting up handler logic prompts
  • Testing and evaluating prompts/models

We give more details on these in the following sections.

Define intents

Identify the main reasons users interact with your chatbot, like asking for help or making a purchase.

Group these reasons into categories, or 'intents', using insights from customer interactions and FAQs.

Write the intent detection prompt

After you have your intents, you should start drafting the system prompt that will be used to classify the user’s query. Make sure to give clear directions, and follow best prompt engineering practices. Here’s a simple example for the system prompt:

🤖 System Prompt: You’re a LLM that detects intent from user queries. Your task is to classify the user's intent based on their query. Below are the possible intents with brief descriptions. Use these to accurately determine the user's goal, and output only the intent topic. - Order Status: Inquiries about the current status of an order, including delivery tracking and estimated arrival times. - Product Information: Questions regarding product details, specifications, availability, or compatibility. - Payments: Queries related to making payments, payment methods, billing issues, or transaction problems. - Returns: Requests or questions about returning a product, including return policies and procedures. - Feedback: User comments, reviews, or general feedback about products, services, or experiences. - Other: Choose this if the query doesn’t fall into any of the other intents. 💬 User Query: I would like to check my last order. 🤖 Response: Order status.

* For more reliable outputs you can also consider using function calling with your models, so that you always get a structured response from the model, one that can be used to run specific functions in your code. Learn how to set it up here.

Don’t forget to add a fallback option!

Did you notice that we added a fallback intent “Other” in the system prompt?

Adding fallback prompts is essential for handling situations where the chatbot fails to understand or correctly classify a user's intent.

Fallback prompts act as a safety net to keep users engaged, even when their query isn't a clear match. They can involve clarifying questions, rephrasing the query, or offering human assistance.

Set up handler logic

For each intent, you need to develop a response mechanism, and decide if the chatbot should perform an action like calling an API to another tool/service, or to just provide a text response.

To implement the handler logic you’ll need to build a more complex LLM chain, and use other prompts to provide text responses or to call an API to perform a specific action with external tools and services.

Test and evaluate prompts accross test cases

Testing and evaluating prompts across test cases is crucial for building reliable chatbots.

Before you push it to production, you need to be sure that your intent classifiers and handlers are working properly. You should test every intent path with various user queries and evaluate the performance of different prompt and model combinations.

For example, we recently did an experiment where we used four models and few-shot prompts to classify if a customer support ticket has been resolved or not. We had around 200 test cases, and used Vellum to evaluate our configuration at scale. Below you can see how that looked like in the product.

Let us help!

Building an intent classifier is not just a one-time setup; it’s a continuous process that requires extensive evaluation and monitoring once in production.

If you’re planning to build your custom chatbot and need assistance with the setup and evaluation, we can help.

Vellum’s platform for building production LLM apps can help you build a reliable chatbot. We provide the tooling layer to experiment with prompts and models, evaluate at scale, monitor them in production, and make changes with confidence if needed.

If you’re interested, you can book a call here. You can also subscribe to our blog and stay tuned for updates from us.


Join 10,000+ developers staying up-to-date with the latest AI techniques and methods.
Thanks for joining our newsletter.
Oops! Something went wrong.
Anita Kirkovska
Linkedin's logo

Founding Growth at Vellum

Anita Kirkovska, is currently leading Growth and Content Marketing at Vellum. She is a technical marketer, with an engineering background and a sharp acumen for scaling startups. She has helped SaaS startups scale and had a successful exit from an ML company. Anita writes a lot of content on generative AI to educate business founders on best practices in the field.

About the authors

No items found.

Related posts