Vellum is coming to the AI Engineering World's Fair in SF. Come visit our booth and get a live demo!

Prompt Engineering Guide for Claude Models

Learn how to prompt Claude with these 11 prompt engineering tips.

Written by
Reviewed by
No items found.

Have you tried instructing any Claude model in the same way as you would GPT-4?

Given the widespread use and familiarity with OpenAI's models, it's a common reflex.

Yet, this approach doesn't quite hit the mark with these models.

Claude models are trained with different methods/techniques, and should be instructed with specific instructions that cater to those differences. So, I looked into Anthropic's official docs, and tried to use their guidelines to improve the LLM outputs for our customers.

Turns out, Claude models, specifically Claude 3 Opus, can do even better than GPT-4 if you learn to prompt it right.

The official documentation can be a bit confusing, so this guide will show you the most useful prompt engineering techniques. We've also developed a prompt converter; just paste your GPT-4 prompt and get an adapted Claude 3 Opus version. Try the tool here.

Now let's learn how to prompt Claude.

1. Use XML tags to separate instructions from context

The Claude models have been fine-tuned to pay special attention to the structure created by XML tags, and it won’t follow any random indicators like GPT does. It’s important to use these tags to separate instructions, examples, questions, context, and input data as needed.

For example you can add text tags to wrap the input:


Summarize the main ideas from a provided text.

<text> {input input here} </text>

You can use any names you like for these tags; there are no specific or exclusive names required. What's important is the format. Just make sure to include <> and </> , and it will work fine!

2. Be direct, concise and as specific as possible

This is equally important for every large model.

You’ll need to clearly state what the model should do rather than what it should avoid. Using affirmatives like “do” instead of “don’t” will give you better results.

Provide Claude with detailed context and clearly specify which tag to use to find this information.

Here’s how we can improve the above prompt:


Summarize the main ideas from the  provided article text within the <text> tags.

<text> {input input here} </text>

3. Help Claude with the output

The biggest problem with generally all Claude models is that it tends to be very chatty in its answers. It will always start with a sentence or two prior to providing the answer, despite being instructed in the prompt to follow a specific format.

To mitigate this, you can use the Assistant message to provide the beginning of the output. This technique will ensure Claude always begins its answer the same way.

Here’s how that prompt will look like if we want Claude to follow a specific format:


Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<format>
→ idea 1
→ idea 2
→ idea 3
</format>

<text> {input input here} </text>


Assistant: 

4. Assign a role

Always assign a role. If you’re building an AI-powered writing tool, start your prompt with “You’re a content writer…”, or better yet "You're the best content writer in the world!". Using the previous technique of putting the first token in the Assistant’s response, you can also force Claude to stay in character.

For example:


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<format>
→ idea 1
→ idea 2
→ idea 3
</format>

<text> {input input here} </text>


Assistant: [Jack, the best content writer in the world] →

5. Give Claude time to think

There are some cases when it can be beneficial to explicitly instruct Claude to generate extra text where it reasons through the problem. To achieve this, you can instruct Claude to first "think through the problem" and then provide the answer. You can request that Claude outputs this process with two separate XML tags: one for the "thinking" part and another for the "answer.", like in the prompt below:


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<format>
→ idea 1
→ idea 2
→ idea 3
</format>

When you generate the answer, first think how the output should be structured and add your answer in <thinking></thinking> tags. This is a space for you to write down relevant content and will not be shown to the user. Once you are done thinking, answer the question. Put your answer inside <answer></answer> XML tags.


<text> {input input here} </text>

Here’s what the model will output if we provide some text about Biochemistry (the prompt was cut down to highlight the format of the output):


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<format>
→ idea 1
→ idea 2
→ idea 3
</format>

Assistant: 

<thinking>
Here are the 4 key ideas I would summarize from the text:
1. Biochemistry explores chemical processes in living organisms by…. 
</thinking>


<answer>
→ Biochemistry explores chemical processes…
→ It plays a vital role in health and medicine… 
</answer>

Notice that the <answer> text doesn’t start with an arbitrary sentence, so you’ll always get the expected output format in this tag. You could easily apply some data manipulation, and cut the "thinking" tags, and extract the answer.

6. Provide Examples

Few-shot prompting is probably the most effective way to get Claude to give very good answers. Including a couple of examples that might generalize well for your use case, can have high impact on the quality of the answers. The more examples you add the better the response will be, but at the cost of higher latency and tokens.


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<format>
→ idea 1
→ idea 2
→ idea 3
</format>


Here is an example on how to respond in a standard interaction:
<example>
{input examples here}
</example>


<text> {input input here} </text>

7. Let Claude say "I don't know"

To prevent hallucinations just add the phrase shown in the prompt below

Answer the following question only if you know the answer or can make a well-informed guess; otherwise tell me you don't know it.

8. Long documents before instructions

If you’re dealing with longer documents, always ask your question at the end of the prompt. For very long prompts Claude gives accent to the end of your prompt, so you need to add important instructions at the end. This is extremely important for Claude 2.1.


<doc>
{input document here}
</doc>


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

9. Think step by step

You can significantly improve the accuracy, by adding the phrase “Think step by step” that will force Claude to think step by step, and follow intermediate steps to arrive to the final answer. This is called zero-shot chain of thought prompting and we wrote more on that in this blog post.

10. Break complex tasks into steps

Claude might perform poorly at complex tasks that are composed of several subtasks. If you know who those subtasks are, you can help Claude by providing a step by step instructions. Something like:


You’re Jack, the best content writer in the world. Summarize the main ideas from the provided article text within the <text> tags, and only output the main conclusions in a 4 bulleted list. Follow the format provided below:

<text> {input input here} </text>


Please follow this steps:

1. Write a one paragraph summary for {{text}}
2. Write 4 bulleted list with the main conclusions for {{text}}


11. Prompt Chaining

If you can’t get reliable results by breaking the prompt into subtasks, you can split the tasks in different prompts. This is called prompt chaining, and is very useful at troubleshooting specific steps in your prompts.

12. Look for Relevant Sentences First

All Claude models can recall information very good across their 200K context window (they passed the "Needle in a Haystack" test with 95% accuracy). But, the models can be reluctant to answer questions based on an individual sentence in a document, especially if that sentence has been injected or is out of place.

To fix this, you can add start the Assistant message with "Here is the most relevant sentence in the context:” , instructing Claude to begin its output with that sentence. This prompt instruction achieves near complete fidelity throughout Claude 2.1’s 200K context window.


Assistant: Here is the most relevant sentence in the context:

Test-driven prompt engineering with Claude

These best practices for Claude can help you write a solid first prompt. But, how can you determine if this method is effective across a wide range of user inputs?

To build confidence in your prompt, you can follow a test-driven prompt engineering approach.

You can compile a collection of test scenarios and apply them to various configurations of your prompt and model. Continue this process until you’re satisfied with the outcome.

Remember, constant iteration is key here. Even after pushing your prompt to production, it’s critical to monitor how it’s doing against live traffic and run regression tests before deploying any changes to your prompts.

If you need help with evaluating your prompts while you’re prototyping or when they’re in production — we can help.

Vellum provides the tooling layer to experiment with prompts and models, evaluate at scale, monitor them in production, and make changes with confidence.

If you’re interested, you can book a call here. You can also subscribe to our blog to stay tuned for updates.

FAQ

What is the main difference between Claude 2 and Claude 2.1?

The primary distinction is that Claude 2.1 features a context window that is twice as large (200,000 tokens) and introduces the ability to make function calls, a functionality that was previously exclusive to OpenAI models.

In addition to that, it demonstrates better recall capabilities, hallucinates less, and has better comprehension across a very big context window.

So, Claude 2.1 is a perfect model to handle longer, more complex documents like legal docs, and Claude 2 is great at text processing suitable for many other applications.

How large is Claude's context window?

Claude 2.1 leads in context prompting capabilities, supporting a maximum context window of 200,000 tokens, the highest available among models. This amounts to roughly 500 pages of information, or the equivalent of one Harry Potter book!

Does Claude 2 by Anthropic support function calling?

Yes, but currently limited to select early access partners. With the function calling option you can pass Claude a set of tools and have Claude decide which tool to use to help you achieve your task. Some examples include:

  • Function calling for arbitrary functions
  • Search over web sources
  • Retrieval over private knowledge bases
ABOUT THE AUTHOR
Anita Kirkovska
Founding Growth Lead

An AI expert with a strong ML background, specializing in GenAI and LLM education. A former Fulbright scholar, she leads Growth and Education at Vellum, helping companies build and scale AI products. She conducts LLM evaluations and writes extensively on AI best practices, empowering business leaders to drive effective AI adoption.

ABOUT THE reviewer

No items found.
lAST UPDATED
Feb 2, 2024
share post
Expert verified
Related Posts
Product Updates
October 1, 2025
7
Vellum Product Update | September
Guides
September 30, 2025
15
A practical guide to AI automation
LLM basics
September 25, 2025
8 min
Top Low-code AI Agent Platforms for Product Managers
LLM basics
September 25, 2025
8 min
The Best AI Agent Frameworks For Developers
Product Updates
September 24, 2025
7 min
Introducing AI Apps: A new interface to interact with AI workflows
LLM basics
September 18, 2025
7 min
Top 11 low‑code AI workflow automation tools
The Best AI Tips — Direct To Your Inbox

Latest AI news, tips, and techniques

Specific tips for Your AI use cases

No spam

Oops! Something went wrong while submitting the form.

Each issue is packed with valuable resources, tools, and insights that help us stay ahead in AI development. We've discovered strategies and frameworks that boosted our efficiency by 30%, making it a must-read for anyone in the field.

Marina Trajkovska
Head of Engineering

This is just a great newsletter. The content is so helpful, even when I’m busy I read them.

Jeremy Hicks
Solutions Architect

Experiment, Evaluate, Deploy, Repeat.

AI development doesn’t end once you've defined your system. Learn how Vellum helps you manage the entire AI development lifecycle.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Build AI agents in minutes with Vellum
Build agents that take on the busywork and free up hundreds of hours. No coding needed, just start creating.

General CTA component, Use {{general-cta}}

Build AI agents in minutes with Vellum
Build agents that take on the busywork and free up hundreds of hours. No coding needed, just start creating.

General CTA component  [For enterprise], Use {{general-cta-enterprise}}

The best AI agent platform for enterprises
Production-grade rigor in one platform: prompt builder, agent sandbox, and built-in evals and monitoring so your whole org can go AI native.

[Dynamic] Ebook CTA component using the Ebook CMS filtered by name of ebook.
Use {{ebook-cta}} and add a Ebook reference in the article

Thank you!
Your submission has been received!
Oops! Something went wrong while submitting the form.
Button Text

LLM leaderboard CTA component. Use {{llm-cta}}

Check our LLM leaderboard
Compare all open-source and proprietary model across different tasks like coding, math, reasoning and others.

Case study CTA component (ROI)

40% cost reduction on AI investment
Learn how Drata’s team uses Vellum and moves fast with AI initiatives, without sacrificing accuracy and security.

Case study CTA component (cutting eng overhead) = {{coursemojo-cta}}

6+ months on engineering time saved
Learn how CourseMojo uses Vellum to enable their domain experts to collaborate on AI initiatives, reaching 10x of business growth without expanding the engineering team.

Case study CTA component (Time to value) = {{time-cta}}

100x faster time to deployment for AI agents
See how RelyHealth uses Vellum to deliver hundreds of custom healthcare agents with the speed customers expect and the reliability healthcare demands.

[Dynamic] Guide CTA component using Blog Post CMS, filtering on Guides’ names

100x faster time to deployment for AI agents
See how RelyHealth uses Vellum to deliver hundreds of custom healthcare agents with the speed customers expect and the reliability healthcare demands.
New CTA
Sorts the trigger and email categories

Dynamic template box for healthcare, Use {{healthcare}}

Start with some of these healthcare examples

SOAP Note Generation Agent
Personalized healthcare explanations of a patient-doctor match

Dynamic template box for insurance, Use {{insurance}}

Start with some of these insurance examples

AI agent for claims review and error detection
Insurance claims automation agent
Collect and analyze claim information, assess risk and verify policy details.

Dynamic template box for eCommerce, Use {{ecommerce}}

Start with some of these eCommerce examples

E-commerce shopping agent

Dynamic template box for Marketing, Use {{marketing}}

Start with some of these marketing examples

Competitor research agent
Scrape relevant case studies from competitors and extract ICP details.

Dynamic template box for Legal, Use {{legal}}

Start with some of these legal examples

PDF Data Extraction to CSV
Extract unstructured data (PDF) into a structured format (CSV).

Dynamic template box for Supply Chain/Logistics, Use {{supply}}

Start with some of these supply chain examples

Risk assessment agent for supply chain operations

Dynamic template box for Edtech, Use {{edtech}}

Start with some of these edtech examples

Turn LinkedIn Posts into Articles and Push to Notion
Convert your best Linkedin posts into long form content.

Dynamic template box for Compliance, Use {{compliance}}

Start with some of these compliance examples

No items found.

Dynamic template box for Customer Support, Use {{customer}}

Start with some of these customer support examples

Trust Center RAG Chatbot
Read from a vector database, and instantly answer questions about your security policies.

Template box, 2 random templates, Use {{templates}}

Start with some of these agents

Financial Statement Review Workflow
Extract and review financial statements and their corresponding footnotes from SEC 10-K filings.
Insurance claims automation agent
Collect and analyze claim information, assess risk and verify policy details.

Template box, 6 random templates, Use {{templates-plus}}

Build AI agents in minutes

Retail pricing optimizer agent
Analyze product data and market conditions and recommend pricing strategies.
Synthetic Dataset Generator
Generate a synthetic dataset for testing your AI engineered logic.
LinkedIn Content Planning Agent
Create a 30-day Linkedin content plan based on your goals and target audience.
E-commerce shopping agent
Agent that summarizes lengthy reports (PDF -> Summary)
Summarize all kinds of PDFs into easily digestible summaries.
PDF Data Extraction to CSV
Extract unstructured data (PDF) into a structured format (CSV).

Build AI agents in minutes for

{{industry_name}}

Competitor research agent
Scrape relevant case studies from competitors and extract ICP details.
AI agent for claims review and error detection
E-commerce shopping agent
Retail pricing optimizer agent
Analyze product data and market conditions and recommend pricing strategies.
Risk assessment agent for supply chain operations
Insurance claims automation agent
Collect and analyze claim information, assess risk and verify policy details.

Case study results overview (usually added at top of case study)

What we did:

1-click

This is some text inside of a div block.

28,000+

Separate vector databases managed per tenant.

100+

Real-world eval tests run before every release.