Learn How to Launch Production-Ready AI Products. Download Our Free Guide
November 30, 2023

Tree of Thought Prompting: What It Is and How to Use It

Guest Post
Anita Kirkovska
No items found.

Different prompting techniques can improve the results from your large language models (LLMs).

One effective technique is Tree of Thought prompting, known for its ability to handle complex reasoning tasks.

In this blog post we’ll explain the framework, provide some examples and give you advice for your use-cases.

Learn how successful companies build with AI

Download this practical guide and enable your teams to innovate with AI.
Get Free Copy
If you want to compare more models, check our LLM Leaderboard here or book a demo to start using Vellum Evaluations to run these tests at scale.
Read the whole analysis in the sections that follow, and sign up for our newsletter if you want to get these analyses in your inbox!
Inspired by this, we've designed Vellum to meet these needs, and now many product and engineering teams use our suite of tools—Workflows, Evaluations, and Deployments—to build agentic workflows.

Build a Production-Ready AI System

Platform for product and engineering teams to build, deploy, and improve AI products.
Learn More

LLM orchestration with Vellum

Platform for product and engineering teams to build, deploy, and improve AI products.
Learn More

Why do you need prompting techniques?

You can achieve a lot with simple prompts, but the quality of your results will always depend on the quality of your prompt and the information you provide.

To increase the quality of your prompt, there are a few advanced prompting techniques that can guide the LLM to provide better answers, with less hallucinations. This is very useful if you expect your LLM to solve new, unseen problems that usually need intermediate steps.

There are many prompting techniques like Few-Shot prompting or Chain of Thought prompting which we covered in another post.

Today we’ll look at Tree of Thoughts.

Tree of Thoughts (ToT) framework

The Tree of Thoughts (ToT) is inspired by the human mind's approach to solve complex reasoning tasks through trial and error. Put simply, this technique guides the LLM to explore different ideas, and reevaluate when needed, in order to provide the optimal solution.

This approach outperforms Chain of Thought prompting because it doesn’t stick to one “chain of thought” reasoning. However, it requires a lot of coding and custom algorithms to search through the layers to find the best reasoning path.

How does it work?

ToT creates a tree-like structure of ideas, where each idea is a step towards solving a problem. This approach enables the LLM to self-evaluate the intermediate “thoughts” and decide whether to continue with that path or choose another.

To perform this, the authors of the ToT framework augment the LLM with search algorithms like breadth-first search and depth-first search.


The authors demonstrated that in tasks like mathematical reasoning and creative writing, the Tree of Thoughts prompting significantly exceeds the performance of other prompting methods.

This is certainly interesting. But, can we utilize this framework in a plain and simple prompt?

Turns out we can.

Tree of Thoughts (ToT) prompting (with examples)

The Tree of Thoughts prompting (ToT) is adapted to the Tree of Thoughts framework, where the author tried to fit the whole concept in a single prompt.

The example below showcases the structure of the prompt.

It's important to know that GPT-3.5 doesn't get the right answer for the below example with Chain of Thought prompting, but it does with ToT prompting.


ToT prompt

Imagine three different experts are answering this question.
All experts will write down 1 step of their thinking,then share it with the group.
Then all experts will go on to the next step, etc.
If any expert realises they're wrong at any point then they leave.
The question is...

The response from GPT-3.5:

Expert 1: The ball is in the living room.
Expert 2: The ball is in the kitchen.
Expert 3: The ball is in the bedroom.
Expert 1: Bob carries the cup to the bedroom, so the ball must be in the cup.
Expert 2: Oh, I see my mistake. Yes, the ball is in the cup.
Expert 3: Agreed, the ball is in the cup in the bedroom.
Expert 1: Next, Bob turns the cup upside down in the bedroom.
Expert 2: After that, Bob walks to the garden and puts the cup down.
Expert 3: Therefore, the ball must have fallen out of the cup when Bob turned it upside down in the bedroom. So, the ball is in the bedroom, not in the cup anymore.
Expert 1: Oh, you're right. I made a mistake. The ball is in the bedroom, not in the cup.
Expert 2: Agreed, the ball is in the bedroom.
Expert 3: Bob then walks to the garage, so the ball remains in the bedroom. It is not in the garden or the garage.
Expert 1: Absolutely, the ball is still in the bedroom.
Expert 2: Yes, the ball hasn't moved from the bedroom.
Expert 3: Therefore, the ball is in the bedroom, not in the garden or the garage.

All three experts agree that the ball is in the bedroom.

Some limited data from the authors shows that this prompting technique does better than Chain of Thought prompting on GPT 3.5, so it’s definitely worth exploring if you want to achieve better results with a less expensive model than GPT-4.

It’s always useful to compare various techniques with different models and evaluate which one will produce the best result for your use-case.

Want to compare prompting techniques?

If you want to experiment with various prompt techniques across various models for your use-case - we can help you!

Vellum has the tooling layer to experiment with prompts and models, evaluate their quality, and make changes with confidence once in production.

You can take a look at our use-cases, or book a call to talk with someone from our team.


Join 10,000+ developers staying up-to-date with the latest AI techniques and methods.
Thanks for joining our newsletter.
Oops! Something went wrong.
Anita Kirkovska
Linkedin's logo

Founding Growth at Vellum

Anita Kirkovska, is currently leading Growth and Content Marketing at Vellum. She is a technical marketer, with an engineering background and a sharp acumen for scaling startups. She has helped SaaS startups scale and had a successful exit from an ML company. Anita writes a lot of content on generative AI to educate business founders on best practices in the field.

About the authors

No items found.

Related posts