LLM Prameter Guide

How to use Reasoning Effort?

Supported by

OpenAI

What is Reasoning Effort?

Reasoning Effort controls how much “thinking” a model does before it gives you an answer. You can set it to low, medium, or high.

Low → Quick answers, but less detailed reasoning
Medium → Balanced speed and depth (default)
High → Deep reasoning and more complete answers, but slower and more expensive

This parameter is especially useful for complex problem-solving, coding tasks, or math/logic problems.

How does Reasoning Effort work?

When a model answers, it generates hidden “reasoning tokens” (sometimes called chain-of-thought).

Without adjustment → The model uses a default amount of reasoning.
With Reasoning Effort → You decide how much effort it should spend “thinking” before finalizing the answer.

That means:

Less effort = faster, cheaper, but shallow responses
More effort = slower, costlier, but deeper and more accurate

How to use Reasoning Effort with OpenAI models

When calling an OpenAI reasoning model (o1, o3, o3-mini, GPT-5, etc.), you can set:

"reasoning_effort": "low"   // options: low, medium, high

Example:

response = client.chat.completions.create(
    model="o3-mini",
    messages=[{"role": "user", "content": "Solve this math problem: 243 / 9"}],
    reasoning_effort="high"
)

How to use Reasoning Effort with Claude models

Claude 3.7 Sonnet and later also support variable reasoning depth.
You can set the level of “thinking” tokens in a similar way via API (Anthropic docs recommend adjusting based on task complexity).

Key things to know

Defaults: Medium is default if you don’t set anything.
Trade-offs: Higher effort improves reasoning tasks (math, coding, logic puzzles), but takes longer and costs more.
Performance: Benchmarks show accuracy can improve 10–30% at high effort, depending on the model and task.
Cost: Reasoning models already cost more; “high” effort multiplies this. One study found they can be 10–74× pricier than standard models.

Example use cases

Low → Fast Q&A, customer support, summarization
Medium → Most general use cases
High → Advanced coding, multi-step math, research, reasoning benchmarks

Reasoning Effort Pricing (Conceptual)

Pricing varies by model, but here’s the pattern:

Effort Level	Latency	Accuracy	Cost Impact	When to Use
Low	Fast	Lower	Cheapest	Simple Q&A, summaries
Medium	Balanced	Good	Normal	General tasks
High	Slowest	Highest	Most Expensive	Complex problem-solving

‍

When to use Reasoning Effort?

When accuracy matters more than speed
When handling complex reasoning, math, or coding problems
When testing models against benchmarks or evals
When you want to experiment with cost/performance trade-offs

read about this parameter next:

Thinking tokens