Updated: 06 Aug 2025

LLM Cost Comparison

Compare the latest models, released after April 2024, by cost of use. We’ve pulled pricing data directly from the model providers and tracked how pricing varies with input size, output length, and model version. If you want to compare costs for your own workloads, try Vellum Evals.
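Providers quote prices per 1M tokens, with separate rates for input (prompt) and output (completion) tokens, so the cost of a single call is just a weighted sum. Below is a minimal sketch of that arithmetic, using GPT-4o’s rates from the table further down ($2.50 input / $10 output per 1M tokens); the 2,000/500 token counts are assumed example values, not benchmark data.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD for a single call, given per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m + \
           (output_tokens / 1_000_000) * output_price_per_m

# Example: a 2,000-token prompt and a 500-token completion on GPT-4o
# ($2.50 input / $10 output per 1M tokens, per the table below).
cost = request_cost(2_000, 500, 2.50, 10.00)
print(f"${cost:.4f} per request")                # $0.0100
print(f"${cost * 100_000:,.2f} per 100k calls")  # $1,000.00
```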

Fastest and most affordable models

Fastest Models (throughput)

| Model | Tokens/second |
| --- | --- |
| Llama 4 Scout | 2,600 |
| Llama 3.3 70b | 2,500 |
| Llama 3.1 70b | 2,100 |
| Llama 3.1 8b | 1,800 |
| Llama 3.1 405b | 969 |
Lowest Latency (TTFT)

| Model | Seconds to first token |
| --- | --- |
| Nova Micro | 0.30 |
| Llama 3.1 8b | 0.32 |
| Llama 4 Scout | 0.33 |
| Gemini 2.0 Flash | 0.34 |
| GPT-4o mini | 0.35 |
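Throughput and TTFT answer different questions: TTFT is how long you wait before the first token appears, while tokens per second governs how quickly the rest of the reply streams in. Here is a rough sketch that combines the two into an end-to-end estimate (total ≈ TTFT + output tokens ÷ throughput); the figures come from this page, and the 500-token reply length is an assumed example.

```python
def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Rough end-to-end latency: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_s

# TTFT and throughput figures are from this page; 500 output tokens is an assumed example.
print(f"Llama 4 Scout: ~{response_time(0.33, 2600, 500):.2f}s")  # ~0.52s
print(f"GPT-4o mini:   ~{response_time(0.35, 65, 500):.2f}s")    # ~8.04s
```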
Cheapest Models

| Model | Input (USD per 1M tokens) | Output (USD per 1M tokens) |
| --- | --- | --- |
| Nova Micro | $0.04 | $0.14 |
| Gemma 3 27b | $0.07 | $0.07 |
| Gemini 1.5 Flash | $0.075 | $0.30 |
| GPT oss 20b | $0.08 | $0.35 |
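Because input and output are priced separately, which model is cheapest depends on how your tokens split between prompt and completion: Nova Micro is cheaper on input ($0.04 vs $0.07) but pricier on output ($0.14 vs $0.07) than the flat-priced Gemma 3 27b. A small sketch of the break-even arithmetic follows; the output-share values are assumptions you would replace with your own workload mix.

```python
def blended_price_per_m(input_price: float, output_price: float,
                        output_share: float) -> float:
    """USD per 1M tokens when `output_share` of the tokens are completions."""
    return input_price * (1 - output_share) + output_price * output_share

for share in (0.1, 0.3, 0.5):  # assumed workload mixes
    nova = blended_price_per_m(0.04, 0.14, share)   # Nova Micro
    gemma = blended_price_per_m(0.07, 0.07, share)  # Gemma 3 27b (flat pricing)
    print(f"{share:.0%} output tokens: Nova Micro ${nova:.3f} vs Gemma 3 27b ${gemma:.3f}")
# Break-even is at 30% output tokens: below that Nova Micro is cheaper, above it Gemma 3 27b wins.
```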
Full LLM Cost Comparison

All prices are USD per 1M tokens; throughput is tokens per second; latency is seconds to first token. Entries marked n/a were not available.

| Model | Input ($/1M) | Output ($/1M) | Throughput (t/s) | Latency (s) |
| --- | --- | --- | --- | --- |
| GPT-5 | $1.25 | $10 | n/a | n/a |
| Claude Opus 4.1 | $15 | $75 | n/a | n/a |
| GPT oss 20b | $0.08 | $0.35 | 564 | 4 |
| GPT oss 120b | $0.15 | $0.60 | 260 | 8.1 |
| Grok 4 | n/a | n/a | 52 | 13.3 |
| Claude 4 Opus | $15 | $75 | n/a | 1.95 |
| Claude 4 Sonnet | $3 | $15 | n/a | 1.9 |
| Gemini 2.5 Flash | $0.15 | $0.60 | 200 | 0.35 |
| OpenAI o3 | $10 | $40 | 94 | 28 |
| OpenAI o4-mini | $1.10 | $4.40 | 135 | 35.3 |
| GPT-4.1 nano | $0.10 | $0.40 | n/a | n/a |
| GPT-4.1 mini | $0.40 | $1.60 | n/a | n/a |
| GPT-4.1 | $2 | $8 | n/a | n/a |
| Llama 4 Scout | $0.11 | $0.34 | 2,600 | 0.33 |
| Llama 4 Maverick | $0.20 | $0.60 | 126 | 0.45 |
| Gemma 3 27b | $0.07 | $0.07 | 59 | 0.72 |
| Grok 3 [Beta] | n/a | n/a | n/a | n/a |
| Gemini 2.5 Pro | $1.25 | $10 | 191 | 30 |
| Claude 3.7 Sonnet | $3 | $15 | 78 | 0.91 |
| GPT-4.5 | $75 | $150 | 48 | 1.25 |
| Claude 3.7 Sonnet [R] | $3 | $15 | 78 | 0.95 |
| DeepSeek-R1 | $0.55 | $2.19 | 24 | 4 |
| OpenAI o3-mini | $1.10 | $4.40 | 214 | 14 |
| OpenAI o1-mini | $3 | $12 | 220 | 11.43 |
| Qwen2.5-VL-32B | n/a | n/a | n/a | n/a |
| DeepSeek V3 0324 | $0.27 | $1.10 | 33 | 4 |
| OpenAI o1 | $15 | $60 | 100 | 30 |
| Gemini 2.0 Flash | $0.10 | $0.40 | 257 | 0.34 |
| Llama 3.3 70b | $0.59 | $0.70 | 2,500 | 0.52 |
| Nova Pro | $1 | $4 | 128 | 0.64 |
| Claude 3.5 Haiku | $0.80 | $4 | 66 | 0.88 |
| Llama 3.1 405b | $3.50 | $3.50 | 969 | 0.73 |
| GPT-4o mini | $0.15 | $0.60 | 65 | 0.35 |
| GPT-4o | $2.50 | $10 | 143 | 0.51 |
| Claude 3.5 Sonnet | $3 | $15 | 78 | 1.22 |
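Price and speed usually trade off, so it can help to read the table’s columns together: what one request costs and roughly how long it takes end to end. The sketch below combines a few rows from the table above; the 2,000-token prompt and 500-token reply are assumed example values.

```python
# (input $/1M, output $/1M, throughput t/s, latency s) pulled from the table above.
MODELS = {
    "Llama 4 Scout":    (0.11, 0.34, 2600, 0.33),
    "Gemini 2.5 Flash": (0.15, 0.60, 200, 0.35),
    "GPT-4o":           (2.50, 10.00, 143, 0.51),
    "DeepSeek V3 0324": (0.27, 1.10, 33, 4.0),
}

IN_TOKENS, OUT_TOKENS = 2_000, 500  # assumed per-request workload

for name, (p_in, p_out, tps, ttft) in MODELS.items():
    cost = IN_TOKENS / 1e6 * p_in + OUT_TOKENS / 1e6 * p_out
    secs = ttft + OUT_TOKENS / tps  # first token, then streaming
    print(f"{name:<18} ${cost:.4f}/request, ~{secs:.1f}s per response")
```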