GitHub Models — Free GPT-4o, Llama, Mistral via your GitHub PAT.
Free tier: Free for prototyping with a GitHub PAT. Strict rate limits (varies by model tier — Low/High); GPT-4o limited to a few RPM.
API is OpenAI-compatible — point your SDK at https://models.inference.ai.azure.com.
Models: gpt-4o, gpt-4o-mini, Meta-Llama-3.3-70B-Instruct, Mistral-large, Phi-3.5-MoE-instruct.
Azure AI Foundry / Azure OpenAI for production.
https://models.inference.ai.azure.com
OpenAI-compatible — works with the OpenAI SDK by overriding base_url.
curl https://models.inference.ai.azure.com/chat/completions \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello in 5 words"}]
  }'
Trying GPT-4o without an OpenAI account. Prototyping enterprise demos that will later move to Azure.
Production. Free tier rate limits are intentionally low to push paid Azure migration.
Free for prototyping with a GitHub PAT. Strict rate limits (varies by model tier — Low/High); GPT-4o limited to a few RPM. No credit card is required to sign up.
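Given the strict rate limits, prototypes hitting GitHub Models should expect HTTP 429 responses and retry with backoff. A minimal generic sketch (the retry counts and delays are illustrative defaults, not documented limits; with the OpenAI SDK you would pass `retry_on=openai.RateLimitError`):

```python
import random
import time

def with_backoff(call, retry_on=Exception, max_retries=5, base_delay=1.0):
    """Retry call() with exponential backoff and jitter on retry_on errors.

    Defaults are placeholders, not GitHub Models guidance. With the OpenAI
    SDK, retry_on=openai.RateLimitError catches HTTP 429 rate-limit errors.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the rate-limit error
            # exponential delay (base, 2*base, 4*base, ...) plus jitter
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Backoff with jitter matters here because the free-tier limits are per-minute: retrying immediately just burns more of the window.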
Most commonly used: gpt-4o, gpt-4o-mini, Meta-Llama-3.3-70B-Instruct, Mistral-large. The full current list is on the GitHub Models docs page.
Yes — point the OpenAI SDK's base URL at `https://models.inference.ai.azure.com` and pass your GitHub PAT as the API key.
Production traffic should move to Azure AI (paid). GitHub Models is explicitly prototyping-only. If your traffic is bursty or seasonal, the free tier may be enough; if you need a guaranteed SLA, upgrade.