GitHub Models — Free GPT-4o, Llama, Mistral via your GitHub PAT.
Free tier: Free for prototyping with a GitHub PAT. Strict rate limits (varies by model tier — Low/High); GPT-4o limited to a few RPM.
API is OpenAI-compatible — point your SDK at https://models.inference.ai.azure.com.
Models: gpt-4o, gpt-4o-mini, Meta-Llama-3.3-70B-Instruct, Mistral-large, Phi-3.5-MoE-instruct.
Azure AI Foundry / Azure OpenAI for production.
https://models.inference.ai.azure.com
OpenAI-compatible — works with the OpenAI SDK by overriding base_url.
curl https://models.inference.ai.azure.com/chat/completions \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello in 5 words"}]
  }'
Trying GPT-4o without an OpenAI account. Prototyping enterprise demos that will later move to Azure.
Production. Free tier rate limits are intentionally low to push paid Azure migration.
Free for prototyping with a GitHub PAT. Strict rate limits (varies by model tier — Low/High); GPT-4o limited to a few RPM. No credit card is required to sign up.
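Given the strict rate limits, prototypes hitting GitHub Models should expect HTTP 429 responses and retry with backoff. A minimal generic sketch (the retry counts and delays are illustrative defaults, not documented limits; with the OpenAI SDK you would pass `retry_on=openai.RateLimitError`):

```python
import random
import time

def with_backoff(call, retry_on=Exception, max_retries=5, base_delay=1.0):
    """Retry call() with exponential backoff and jitter on retry_on errors.

    Defaults are placeholders, not GitHub Models guidance. With the OpenAI
    SDK, retry_on=openai.RateLimitError catches HTTP 429 rate-limit errors.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the rate-limit error
            # exponential delay (base, 2*base, 4*base, ...) plus jitter
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Backoff with jitter matters here because the free-tier limits are per-minute: retrying immediately just burns more of the window.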
Most commonly used: gpt-4o, gpt-4o-mini, Meta-Llama-3.3-70B-Instruct, Mistral-large. The full current list is on the GitHub Models docs page.
Yes — point the OpenAI SDK's base URL at `https://models.inference.ai.azure.com` and pass your GitHub PAT as the API key.
Production traffic should move to Azure AI (paid). GitHub Models is explicitly prototyping-only. If your traffic is bursty or seasonal, the free tier may be enough; if you need a guaranteed SLA, upgrade.