Large language models (LLMs) are good at predicting the next word, but they struggle with problems that need step-by-step thinking.
Enter prompt engineering.
With the right prompts, you can guide a language model like GPT-4 to give better answers.
There's a whole toolkit of techniques to craft a good prompt, but when it comes to complex reasoning tasks, Chain of Thought prompting stands out as a solid option.
In this post, we'll explore what Chain of Thought prompting is, when it's the right choice, and how it stacks up against other techniques.
Chain-of-Thought (CoT) prompting is a technique that guides LLMs to follow a reasoning process when dealing with hard problems. This is done by showing the model a few examples where the step-by-step reasoning is clearly laid out. The model is then expected to follow that "chain of thought" reasoning and get to the correct answer.
Need your LLM to solve a linear equation?
Show an example and outline the intermediate steps for solving this kind of equation.
Want your LLM to craft an optimized Python function for you?
Show the intermediate steps of how a function is defined, called, and optimized.
And that’s it.
This might look very similar to few-shot prompting, but there is a significant difference.
Few-shot prompting is when you give a few examples so the language model can understand what it should do.
On the other hand, Chain-of-Thought prompting is about showing the step-by-step thinking from start to finish, which helps with “reasoning” and getting more detailed answers.
Bottom line: It's about showing the work, not just the answer.
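To make the difference concrete, here is a minimal Python sketch that builds both kinds of prompt from the same worked example. The example question, the `Q:`/`A:` layout, and the function names are assumptions made for illustration, not a required format.

```python
# Hypothetical demonstration data: one worked example of a linear equation.
EXAMPLE_QUESTION = "Solve for x: 3x + 4 = 19"
EXAMPLE_STEPS = [
    "Subtract 4 from both sides: 3x = 15.",
    "Divide both sides by 3: x = 5.",
]
EXAMPLE_ANSWER = "x = 5"


def few_shot_prompt(question: str) -> str:
    """Standard few-shot: the example shows only the final answer."""
    return (
        f"Q: {EXAMPLE_QUESTION}\n"
        f"A: {EXAMPLE_ANSWER}\n\n"
        f"Q: {question}\n"
        "A:"
    )


def cot_prompt(question: str) -> str:
    """Chain-of-Thought: the example spells out the intermediate steps first."""
    reasoning = " ".join(EXAMPLE_STEPS)
    return (
        f"Q: {EXAMPLE_QUESTION}\n"
        f"A: {reasoning} The answer is {EXAMPLE_ANSWER}.\n\n"
        f"Q: {question}\n"
        "A:"
    )
```

Calling `cot_prompt("Solve for x: 64 = 2 + 5x + 32")` returns a prompt whose example answer walks through the steps, while `few_shot_prompt` shows the same example with the bare answer only.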
CoT is ideal when your task involves complex arithmetic, commonsense, or symbolic reasoning, where the model needs to understand and follow intermediate steps to arrive at the correct answer.
On the flip side, smaller models tend to struggle with it: they can produce illogical chains of thought and end up less accurate than they are with standard prompting.
In other specific cases, you don’t even need to show the intermediate steps; you can just use Zero-Shot CoT prompting.
Zero-shot chain-of-thought (Zero-Shot-CoT) prompting involves adding "Let's think step by step" to the original prompt to guide the language model's reasoning process. This approach is particularly useful when you don't have many examples to use in the prompt.
Let's say you're trying to teach the AI about a new concept, like "quantum physics," and you want it to generate some explanations. Instead of just saying, "Explain quantum physics," you can just say "Let's think step by step: Explain quantum physics."
By including the "Let's think step by step" part, you help the AI break down complex topics into manageable pieces.
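In code, Zero-Shot-CoT amounts to a one-line prompt transformation. The sketch below is only an illustration: the `Q:`/`A:` layout and the choice to place the phrase at the start of the answer slot are assumptions, and placing it before the question, as in the example above, is another option.

```python
COT_TRIGGER = "Let's think step by step."


def zero_shot_cot(question: str, trigger: str = COT_TRIGGER) -> str:
    """Add the reasoning trigger to a question; no worked examples are needed."""
    return f"Q: {question.strip()}\nA: {trigger}"
```

For instance, `zero_shot_cot("Explain quantum physics.")` produces a two-line prompt that ends with the trigger phrase, ready to send to whichever model you use.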
And you can do this on auto-pilot.
Automatic Chain of Thought or Auto-CoT automatically generates the intermediate reasoning steps by utilizing a database of diverse questions grouped into clusters.
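Auto-CoT goes through two main stages: question clustering, which groups the questions in a dataset into clusters of similar questions, and demonstration sampling, which picks a representative question from each cluster and generates its reasoning chain with Zero-Shot-CoT.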
The process is illustrated below:
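As a rough Python sketch of the idea, assuming a toy bag-of-words embedding and a small hand-written k-means in place of real sentence embeddings, and a hypothetical `llm` callable standing in for your model call:

```python
from collections import Counter
from typing import Callable, List, Tuple

Vector = List[float]


def embed(text: str, vocab: List[str]) -> Vector:
    """Toy bag-of-words vector; a real system would use sentence embeddings."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[Vector], k: int, iterations: int = 10) -> Tuple[List[int], List[Vector]]:
    """Plain k-means; the first k vectors are the initial centroids (needs k <= len(vectors))."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def auto_cot_demos(questions: List[str], k: int, llm: Callable[[str], str]) -> List[str]:
    """Cluster the questions, then build one Zero-Shot-CoT demonstration per cluster."""
    vocab = sorted({word for q in questions for word in q.lower().split()})
    vectors = [embed(q, vocab) for q in questions]
    labels, centroids = kmeans(vectors, k)
    demos = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if not members:
            continue
        # Representative question: the cluster member closest to the centroid.
        rep = min(members, key=lambda i: sq_dist(vectors[i], centroids[c]))
        prompt = f"Q: {questions[rep]}\nA: Let's think step by step."
        demos.append(f"{prompt} {llm(prompt)}")
    return demos
```

The returned demonstrations can then be placed in front of a new question, in the same way as hand-written Chain-of-Thought examples.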
If we ask GPT-4 today to solve for x in the equation 64 = 2 + 5x + 32, it will solve it without any examples given.
This may look like a simple math problem, but at the beginning of 2023 this was a very challenging problem even for GPT-4.
These days, it seems like the model automatically provides step-by-step answers to most reasoning questions by default. Go ahead, try it!
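For reference, a step-by-step solution of the equation above looks like this (the wording of each step is ours, not model output):

```latex
\begin{align*}
64 &= 2 + 5x + 32 \\
64 &= 34 + 5x && \text{combine the constants } 2 + 32 \\
30 &= 5x && \text{subtract } 34 \text{ from both sides} \\
x &= 6 && \text{divide both sides by } 5
\end{align*}
```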
Now, just think about how much more useful an LLM can become when you give it a step-by-step guide for optimizing your code, restructuring your databases, or developing a strategy for popular games like "Minecraft."
And imagine how powerful this technique can be when scientists teach an AI to follow a detailed, step-by-step diagnostic process for complex medical conditions.
The possibilities are endless, and that’s where these techniques come in handy, especially when we introduce the “visual” element to the mix.
Multimodal Chain-of-Thought prompting uses both words and pictures to lay out the reasoning steps, guiding the LLM to show its “reasoning” and arrive at the right answer.
And if you've been following the latest AI news, you know that multimodality is coming to an LLM near you.
With Multimodal Chain-of-Thought prompting, you can lay out the reasoning task, share the images up front, and get to the answer right away.
The biggest limitation is that there is no guarantee of a correct reasoning path. We don’t really know whether the model is “reasoning” at all, so a chain of thought can lead to an incorrect answer just as it can lead to a correct one.
There are other prompting techniques as well. Self-Consistency samples several different reasoning paths for a single task and keeps the answer they agree on most often, while Tree of Thoughts (ToT) keeps something like a map of possible reasoning paths and backtracks when it heads down the wrong one.
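The voting step behind Self-Consistency is simple enough to sketch in a few lines of Python. Here `sample` is a hypothetical callable, assumed to run your model once with some randomness and return only the final answer extracted from its reasoning chain:

```python
from collections import Counter
from typing import Callable


def self_consistent_answer(prompt: str, sample: Callable[[str], str], n: int = 5) -> str:
    """Sample n answers for the same prompt and return the most common one."""
    answers = [sample(prompt).strip() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

A larger `n` gives the vote more samples to work with, at the cost of more model calls per question.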
No matter the prompt engineering technique you pick for your project, it's important to experiment, test, and understand what your end users think.
Chain of Thought (CoT) prompting tends to do better with bigger models and tricky reasoning tasks. If you're making an app and this sounds like what you need, we can help.
Vellum.ai gives you the tools to try out different Chain of Thought prompts and models, check how good they are, and tweak them easily once they're in production — no custom code needed! Request a demo for our app here, join our Discord or reach out to us at email@example.com if you have any questions!