LLM Cost Calculator

Groq API Pricing 2025

Groq runs ultra-fast inference on dedicated LPU hardware, at up to 1,000 tokens/sec, which makes it best suited to latency-sensitive production apps. Pricing is linear in token count, so costs are predictable.


Groq Model Pricing

Prices in USD per 1M tokens

| Model | Notes | Input / 1M | Output / 1M | Context (tokens) |
|---|---|---|---|---|
| Llama 3.1 8B (Groq) | Ultra-low cost; 840 tokens/sec on Groq LPU | $0.05 | $0.08 | 128,000 |
| Llama 4 Scout (Groq) | Fast LLM inference on Groq LPU; 594 tokens/sec | $0.11 | $0.34 | 128,000 |
| GPT-OSS 120B (Groq) | OpenAI open-source model on Groq; 500 tokens/sec | $0.15 | $0.60 | 128,000 |
| Qwen3 32B (Groq) | Strong mid-size model with extended context; 662 tokens/sec | $0.29 | $0.59 | 131,072 |
| Llama 3.3 70B (Groq) | Reliable 70B on Groq LPU; 394 tokens/sec | $0.59 | $0.79 | 128,000 |
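Because prices are quoted per 1M tokens, the cost of a single request is the input and output token counts scaled by their respective rates. The sketch below is only an illustration: the `PRICES` dictionary copies the figures listed above, while the function name `request_cost` and the example token counts are assumptions made for this example.

```python
# USD per 1M tokens (input, output), copied from the pricing table above.
PRICES = {
    "Llama 3.1 8B": (0.05, 0.08),
    "Llama 4 Scout": (0.11, 0.34),
    "GPT-OSS 120B": (0.15, 0.60),
    "Qwen3 32B": (0.29, 0.59),
    "Llama 3.3 70B": (0.59, 0.79),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: a 2,000-token prompt with a 500-token reply on Llama 3.3 70B.
cost = request_cost("Llama 3.3 70B", 2_000, 500)
print(f"${cost:.6f}")  # prints $0.001575
```

Output tokens cost more than input tokens on every listed model, so reply length tends to dominate the bill for short prompts.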

Estimated Monthly Cost (70% input / 30% output split)

| Model | 1M tokens/mo | 10M tokens/mo | 100M tokens/mo | 1B tokens/mo |
|---|---|---|---|---|
| Llama 3.1 8B (Groq) | $0.059 | $0.59 | $5.90 | $59.00 |
| Llama 4 Scout (Groq) | $0.179 | $1.79 | $17.90 | $179.00 |
| GPT-OSS 120B (Groq) | $0.285 | $2.85 | $28.50 | $285.00 |
| Qwen3 32B (Groq) | $0.380 | $3.80 | $38.00 | $380.00 |
| Llama 3.3 70B (Groq) | $0.650 | $6.50 | $65.00 | $650.00 |
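The monthly figures follow from a blended per-1M price: 70% of the input rate plus 30% of the output rate, multiplied by the monthly volume in millions of tokens. The following is a minimal sketch of that arithmetic, assuming the listed prices; `monthly_cost` and its `input_share` parameter are names chosen for this example, not part of any Groq tooling.

```python
# USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "Llama 3.1 8B": (0.05, 0.08),
    "Llama 4 Scout": (0.11, 0.34),
    "GPT-OSS 120B": (0.15, 0.60),
    "Qwen3 32B": (0.29, 0.59),
    "Llama 3.3 70B": (0.59, 0.79),
}


def monthly_cost(model, total_tokens, input_share=0.70):
    """Estimate monthly USD cost for a total token volume and input share."""
    input_price, output_price = PRICES[model]
    # Blended price per 1M tokens for the given input/output mix.
    blended = input_share * input_price + (1 - input_share) * output_price
    return blended * total_tokens / 1_000_000


# Reproduce the 100M tokens/mo column.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 100_000_000):.2f}")
```

Changing `input_share` shows how sensitive a workload is to the split: generation-heavy workloads move toward the higher output rate.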

Frequently Asked Questions

How much does the Groq LLM API cost?

Groq lists five models priced from $0.05 to $0.59 per 1M input tokens and from $0.08 to $0.79 per 1M output tokens. Pricing is linear in token count, and inference runs on dedicated LPU hardware at up to 1,000 tokens/sec.

Is Groq cheaper than self-hosting?

For low-volume workloads (under 100M tokens/month), cloud APIs like Groq are almost always cheaper than purchasing and maintaining GPU hardware. Use our calculator to find the exact break-even point for your usage.
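As a rough illustration of what a break-even point means here, the sketch below divides a fixed monthly self-hosting cost by the blended API price per token. It is a simplification under stated assumptions: the $2,000/month self-hosting figure is a made-up placeholder (not a quote for any real hardware), the function name `break_even_tokens` is hypothetical, and the model ignores the throughput limits of self-hosted hardware.

```python
def break_even_tokens(api_usd_per_million, self_host_usd_per_month):
    """Monthly token volume at which API spend equals a fixed self-hosting cost."""
    if api_usd_per_million <= 0:
        raise ValueError("api_usd_per_million must be positive")
    return self_host_usd_per_month / api_usd_per_million * 1_000_000


# Blended Llama 3.3 70B price from the monthly cost table: $0.650 per 1M tokens.
# The 2,000 USD/month self-hosting cost is a hypothetical placeholder.
tokens = break_even_tokens(0.650, 2_000)
print(f"{tokens / 1e9:.2f}B tokens/month")  # prints 3.08B tokens/month
```

Below the computed volume the API is the cheaper option under these assumptions; above it, the fixed self-hosting cost is spread over enough tokens to win.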