→ Groq vs Cerebras Inference: The fastest LLM inference on the planet (LPU) vs Wafer-scale chips → fastest open-model inference (often >2,000 tok/s)
→ Groq vs Together AI: The fastest LLM inference on the planet (LPU) vs Wide open-source model catalog with serverless + dedicated
→ OpenRouter vs Together AI: One API, 300+ models — including many free ones vs Wide open-source model catalog with serverless + dedicated
→ OpenRouter vs Groq: One API, 300+ models — including many free ones vs The fastest LLM inference on the planet (LPU)
→ Google AI Studio (Gemini) vs OpenRouter: Free Gemini API access — 1M-token context, multimodal vs One API, 300+ models — including many free ones
→ Cerebras Inference vs SambaNova Cloud: Wafer-scale chips → fastest open-model inference (often >2,000 tok/s) vs RDU-accelerated Llama 3 with very high tok/s
→ Mistral La Plateforme vs OpenRouter: EU-based; Mistral Small / Codestral with experimental free tier vs One API, 300+ models — including many free ones
→ GitHub Models vs OpenRouter: Free GPT-4o, Llama, Mistral via your GitHub PAT vs One API, 300+ models — including many free ones
→ Hugging Face Inference vs Together AI: 300k+ open-source models, free serverless inference vs Wide open-source model catalog with serverless + dedicated
→ Groq vs GitHub Models: The fastest LLM inference on the planet (LPU) vs Free GPT-4o, Llama, Mistral via your GitHub PAT
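A practical note on switching between the providers above: several of them (OpenRouter, Groq, Together AI) expose OpenAI-compatible chat-completions endpoints, so the same request shape works across vendors by swapping the base URL and API key. The sketch below only assembles the request rather than sending it; the model name and key are placeholders, and `build_chat_request` is a hypothetical helper, not any provider's SDK.

```python
# Minimal sketch of an OpenAI-style /chat/completions request, assuming the
# target provider advertises OpenAI API compatibility. No network call is made.
import json


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Assemble the URL, headers, and JSON body for an OpenAI-style chat call."""
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",  # same bearer scheme across these providers
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }


# Swapping the base URL switches providers without changing client code,
# e.g. "https://openrouter.ai/api/v1" for OpenRouter.
req = build_chat_request(
    "https://openrouter.ai/api/v1",
    "YOUR_KEY",  # placeholder; use your real key
    "meta-llama/llama-3.1-8b-instruct",  # example model id
    "Hello",
)
print(req["url"])  # → https://openrouter.ai/api/v1/chat/completions
```

To actually send the request, pass these pieces to any HTTP client (e.g. `requests.post(req["url"], headers=req["headers"], data=req["body"])`).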