CONTENTS

Inline evaluation / Guardrails: Ensure good system performance at run-time

This is some text inside of a div block.

Zero-Shot vs Few-Shot prompting: A Guide with Examples

Exploring zero-shot & few-shot prompting: usage, application methods, and limits.

Author

Anita Kirkovska

Author

Dec 21, 2023

There are various techniques for improving your model's answers, including zero-shot prompting and few-shot prompting.

This guide will cover the basics of these methods, when to use them, and their limitations.

‍

What is Zero-Shot prompting?

Zero-shot prompting provides no examples and lets the model figure things out on its own. It relies solely on the model's pre-training data and training techniques to generate a response. The response may not be completely perfect but will likely be coherent.

Here’s an example prompt that we ran with GPT-4.

Prompt

Result

Note that the prompt above didn’t give any instructions to the LLM about how to classify a sentiment. This goes to show that the model understands “sentiment” and can answer this question with zero-shot prompting.

With a broad enough knowledge base and understanding of language, LLMs can generate coherent responses for a number of new tasks using zero shot prompting.

If zero-shot doesn’t work for your example, it’s recommended to use few-shot prompting.

‍

What is Few-Shot prompting?

Few-shot prompting is a method where you use a few examples in your prompt to guide language models (like GPT-4) to learn new tasks quickly. Rather than retraining an entire model from scratch, you use your context window to provide a few examples to improve the model’s performance.

With the latest models and bigger context window sizes, this technique is even more useful.

Here’s a few-shot prompt example.

Prompt

Result

This is a very simple example, but depending on your task these can get more complex for the model to understand.

In the next section, we look at two examples that are easy for humans, but more challenging for a language model to categorize.

‍

Zero-Shot vs Few-Shot prompting (with examples)

Below we showcase two complex sentiment analysis examples that might be wrongly classified with zero-shot prompting. But, if similar examples are provided in a few-shot prompt, the model will learn and will correctly classify new similar ones.

Phrase with negation

Result

This one is tricky because we used a phrase with negation and it confuses the model to assume that this statement has a neutral sentiment, where in reality the sentiment is positive.

Negative term used in a positive way

Result

Again, the model is confused because it assumed that the terrible ending of the movie was perceived as negative, when in fact it was entertaining for the user and it was perceived as positive.

By providing similar examples in a few-shot prompt, you’ll help the model understand these edge cases. This way, the model can respond with the correct sentiment the next time it sees a similar example.

However, this prompting technique doesn’t come without its limits.

‍

Limits to Few-Shot prompting?

There are cases where few-shot prompting won’t be a good fit.

Here are some examples:

When you’re dealing with a more complex reasoning task and want the model to think step by step; in this case it’s recommended that you use Chain of Thought prompting to get better results.
If you want to classify some data that has high variability and nuance; you might need to fine-tune a model, as the context window of the model might not fit all unique examples that you’d like the model to consider
In cases where you don’t want to use fine-tuning, you can use RAG-based few shot prompting. With this technique you can dynamically retrieve pre-labelled examples that are most relevant to the question at hand by referencing your proprietary data stored in a vector database.

‍

Conclusion

You now have a solid understanding on zero-shot and few-shot prompting. Both can be very useful for different tasks.

When using few-shot prompting, it’s crucial to recognize the specific challenges in your data. Providing targeted examples can significantly improve the model's accuracy.

However, it's also important to be aware of the limitations. If your data varies a lot, or you're reaching the context window limits, or facing difficulties with complex prompts, think about whether fine-tuning a custom model could work better.

Ultimately, the key lies in experimentation. Try out different prompts, and perhaps even compare different models, to discover the most effective solution for your scenario.

These techniques are your toolbox, but it's your data and your experiments that'll show you what works best. Keep tinkering, and you'll find your sweet spot!

There are various techniques for improving your model's answers, including zero-shot prompting and few-shot prompting.

This guide will cover the basics of these methods, when to use them, and their limitations.

‍

What is Zero-Shot prompting?

Here’s an example prompt that we ran with GPT-4.

Prompt

Result

With a broad enough knowledge base and understanding of language, LLMs can generate coherent responses for a number of new tasks using zero shot prompting.

If zero-shot doesn’t work for your example, it’s recommended to use few-shot prompting.

‍

What is Few-Shot prompting?

With the latest models and bigger context window sizes, this technique is even more useful.

Here’s a few-shot prompt example.

Prompt

Result

This is a very simple example, but depending on your task these can get more complex for the model to understand.

In the next section, we look at two examples that are easy for humans, but more challenging for a language model to categorize.

‍

Zero-Shot vs Few-Shot prompting (with examples)

Phrase with negation

Result

This one is tricky because we used a phrase with negation and it confuses the model to assume that this statement has a neutral sentiment, where in reality the sentiment is positive.

Negative term used in a positive way

Result

Again, the model is confused because it assumed that the terrible ending of the movie was perceived as negative, when in fact it was entertaining for the user and it was perceived as positive.

However, this prompting technique doesn’t come without its limits.

‍

Limits to Few-Shot prompting?

There are cases where few-shot prompting won’t be a good fit.

Here are some examples:

When you’re dealing with a more complex reasoning task and want the model to think step by step; in this case it’s recommended that you use Chain of Thought prompting to get better results.
If you want to classify some data that has high variability and nuance; you might need to fine-tune a model, as the context window of the model might not fit all unique examples that you’d like the model to consider
In cases where you don’t want to use fine-tuning, you can use RAG-based few shot prompting. With this technique you can dynamically retrieve pre-labelled examples that are most relevant to the question at hand by referencing your proprietary data stored in a vector database.

‍

Conclusion

You now have a solid understanding on zero-shot and few-shot prompting. Both can be very useful for different tasks.

When using few-shot prompting, it’s crucial to recognize the specific challenges in your data. Providing targeted examples can significantly improve the model's accuracy.

Ultimately, the key lies in experimentation. Try out different prompts, and perhaps even compare different models, to discover the most effective solution for your scenario.

These techniques are your toolbox, but it's your data and your experiments that'll show you what works best. Keep tinkering, and you'll find your sweet spot!

ABOUT THE AUTHOR

Anita Kirkovska

Founding Growth Lead

An AI expert with a strong ML background, specializing in GenAI and LLM education. A former Fulbright scholar, she leads Growth and Education at Vellum, helping companies build and scale AI products. She conducts LLM evaluations and writes extensively on AI best practices, empowering business leaders to drive effective AI adoption.

No items found.

Guides

August 14, 2025

•

How to write effective prompts for GPT-5

Guides

August 12, 2025

•

6 min

Partnering with Composio to Help You Build Better AI Agents

Product Updates

August 12, 2025

•

Vellum Product Update | July

Guides

August 8, 2025

•

Best practices for building AI multi agent systems

Guides

August 7, 2025

•

7 min

GPT-5 Benchmarks

Model Comparisons

August 6, 2025

•

7 min

OpenAI o3 vs gpt-oss 120b

The Best AI Tips — Direct To Your Inbox

Latest AI news, tips, and techniques

Specific tips for Your AI use cases

No spam

Oops! Something went wrong while submitting the form.

Each issue is packed with valuable resources, tools, and insights that help us stay ahead in AI development. We've discovered strategies and frameworks that boosted our efficiency by 30%, making it a must-read for anyone in the field.

Marina Trajkovska

Head of Engineering