
API Pricing & Free Tier Matrix

This is the canonical tracker for API pricing links and free-tier availability across LLM providers and API platforms.

Scope and usage

  • Focus: API access (not consumer chat subscriptions unless directly tied to API credits).
  • Purpose: provider comparison, budgeting, and long-term maintenance reference.
  • Update target: monthly, and after major provider announcements.

Status legend

  • Yes = official free tier/trial access is currently documented.
  • Partial = limited free usage exists (for example, selected models/features).
  • No = no current free trial/tier is documented.
  • Unclear = pricing/billing docs do not clearly confirm a standing free tier.

Canonical pricing matrix (last verified: 2026-03-03)

| Provider / Platform | Official links | Free tier / trial | Evidence summary |
| --- | --- | --- | --- |
| OpenAI | Docs · Pricing | No | Usage-priced API; prepaid credits required. |
| Anthropic (Claude API) | Docs · Pricing | Yes | New users receive small starter API credits. |
| Google Gemini Developer API | Docs · Pricing | Yes | Pricing tables include free-tier rows. AI Pro members receive $10/mo Cloud credits. |
| OpenRouter | Docs · Pricing | Yes | Free plan and free-model routing are documented. |
| xAI (Grok API) | Docs · Pricing | Yes | Docs mention monthly free requests/credits. |
| Z.ai (GLM API) | Docs · Pricing | Yes | New users can claim free API token packages. |
| Alibaba DashScope (Qwen APIs) | Docs · Pricing | Yes | Many models show temporary free quota periods. |
| Cohere | Docs · Pricing | Yes | Trial API keys are free and rate-limited. |
| Mistral AI | Docs · Pricing | Yes | Experiment plan supports free API testing. |
| Together AI | Docs · Pricing | No | Billing docs indicate paid credits are required. |
| Groq | Docs · Pricing | Yes | Free API plan and limits are documented. |
| Kiro | Docs · Pricing | Yes | Perpetual free tier (50 credits/mo) + 500 bonus credits. |
| Fireworks AI | Docs · Pricing | Yes | Public pricing notes starter free credits. |
| Replicate | Docs · Pricing | Partial | Some models can be run free before billing. |
| DeepSeek API | Docs · Pricing | Unclear | Granted balances are mentioned; a fixed free tier is unclear. |
| Perplexity API | Docs · Pricing | No | Purchased credits and top-up requirements documented. |
| AI21 | Docs · Pricing | Yes | Pricing page advertises free trial credits. |
| Abacus.AI | Docs · Pricing | Yes | Free trial and ChatLLM free access documented. |
| Voyage AI | Docs · Pricing | Unclear | Paid rates are clear; standing free tier not explicit. |
| Cloudflare Workers AI | Docs · Pricing | Yes | Free plan includes daily usage. |
| Hugging Face Inference Providers | Docs · Pricing | Yes | Monthly included inference credits by account tier. |
| Cerebras Inference | Docs · Pricing | Yes | Pricing references free-tier usage/credits. |
| NVIDIA API Catalog | Docs · Pricing | Yes | Starter credits are referenced publicly. |
| SambaNova Cloud | Docs · Pricing | Unclear | Public page shows paid plans; no stable free policy. |
| AWS Bedrock | Docs · Pricing | No | Metered pay-as-you-go pricing. |
| Amazon Q | Docs · Pricing | Yes | Free tier for individuals/developers is documented. |
| Azure OpenAI Service | Docs · Pricing | Unclear | Metered service; only account-level cloud credits may apply. |
| Vertex AI (Gemini via GCP) | Docs · Pricing | Unclear | Metered pricing; no persistent API free-tier statement. |
| OCI Generative AI | Docs · Pricing | Unclear | Paid rates are public; free tier not clearly documented. |
| MiniMax | Docs · Pricing | Yes | Coding Plan provides a low-cost entry tier; trial credits available. |
| Moonshot AI | Docs · Pricing | Partial | Trial credits are typically granted to new developer accounts. |

Developer Program Plans

These are the core subscription plans for developers that bundle AI access, cloud credits, and other professional benefits.

| Program / Plan | Cost | AI Access & Quotas | Cloud Credits & Benefits |
| --- | --- | --- | --- |
| Google Developer Program — Standard | Free | 10 Firebase Studio workspaces; Gemini Code Assist (Basic); Gemini CLI (60 RPM / 1000 RPD) | Monthly Google Skills credits (via GEAR); community access; private previews. |
| Google Developer Program — Premium | $24.99/mo or $299/yr | 30 Firebase Studio workspaces; Gemini Code Assist (Higher); Gemini CLI (120 RPM / 1500 RPD) | $45/mo ($550/yr) GenAI/Cloud credit; $500 bonus credit upon certification; 1 Cloud cert voucher; expert consultation. |
| Google Developer Program — Enterprise | Preview | Gemini Code Assist Enterprise; Gemini CLI (120 RPM / 2000 RPD) | $150/mo Google Cloud credit; centralized purchasing; developer sandboxes. |

Model-level quota tracker (expanded list)

This section is grouped by provider with compact four-column tables for narrower screens.

  • Verified = core limits are visible in official docs.
  • Partially verified = provider free-tier stance is verified, but model-level quotas are dynamic or not fully published.
  • Unverified = values come from community reports or account-specific observations not explicitly documented.

Code Generation Quality is subjective and treated as community-assessed, not an official benchmark metric.

Quick jump: Google Gemini · OpenAI · Anthropic Claude · Groq · Together AI · Hugging Face · Mistral AI · DeepSeek · Cohere · OpenRouter · Cerebras · xAI Grok · Kiro

Quotas are listed as context / RPM / RPD / TPM / daily token cap (RPM = requests per minute, RPD = requests per day, TPM = tokens per minute). n/p means "not published."
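When scripting against this page, those quota strings can be parsed mechanically. A minimal sketch (the field names and helpers are this page's convention, not any provider API; non-numeric placeholders such as n/p, tier, and plan become None):

```python
FIELDS = ("context", "rpm", "rpd", "tpm", "daily_cap")

def _num(value):
    """Parse '250K', '~25M', '14,400', or '1000' into an int.
    Non-numeric placeholders ('n/p', 'tier', 'plan', ...) return None."""
    v = value.lstrip("~").replace(",", "")
    mult = 1
    if v[-1:].upper() == "K":
        mult, v = 1_000, v[:-1]
    elif v[-1:].upper() == "M":
        mult, v = 1_000_000, v[:-1]
    try:
        return int(float(v) * mult)
    except ValueError:
        return None

def parse_quota(quota: str) -> dict:
    """Split a 'context / RPM / RPD / TPM / daily cap' string into named fields.
    Splitting on ' / ' (slash with spaces) keeps the literal 'n/p' marker intact."""
    parts = [p.strip() for p in quota.split(" / ")]
    return dict(zip(FIELDS, (_num(p) for p in parts)))
```

For example, parse_quota("1M / 5 / 100 / 250K / ~25M") yields rpd=100 and daily_cap=25,000,000.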

Capability tags:

  • CODE = code generation and refactoring tasks.
  • VERIFY = cross-checking, factual validation, and test review.
  • REASON = complex reasoning and multi-step planning.
  • LONGCTX = long documents, large prompts, and retrieval-heavy workflows.
  • FAST = low-latency interactions and interactive agent loops.
  • BUDGET = better free-tier value or lower-cost experimentation.
  • OPEN = open-weight/open-model ecosystem affinity.

Capability Capacity Summary (auto-generated)

These summaries are generated from the model rows on this page using scripts/update_api_pricing_capability_summary.py. Only rows with a numeric daily token cap are included in the capacity math.
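The script itself is not reproduced here, but the per-capability capacity math can be sketched as follows (ROWS and the function name are illustrative, mirroring the Google Gemini rows on this page, not the script's actual code; the ~ caps in those rows are consistent with daily cap ≈ RPD × TPM, e.g. 250 RPD × 250K TPM = 62.5M, though providers may publish caps directly):

```python
from collections import defaultdict

# Illustrative rows: (model, capability tags, known daily token cap).
# Values mirror the Google Gemini table on this page.
ROWS = [
    ("Gemini 2.5 Pro", {"CODE", "VERIFY", "REASON", "LONGCTX"}, 25_000_000),
    ("Gemini 2.5 Flash", {"CODE", "FAST", "LONGCTX", "BUDGET"}, 62_500_000),
    ("Gemini 2.5 Flash-Lite", {"FAST", "LONGCTX", "BUDGET"}, 250_000_000),
]

def capability_totals(rows):
    """Sum known daily token caps per capability tag.
    Rows without a numeric cap are skipped, as the note above states."""
    totals = defaultdict(int)
    for _model, tags, cap in rows:
        if cap is None:
            continue
        for tag in tags:
            totals[tag] += cap
    return dict(totals)
```

With these three rows, capability_totals(ROWS)["LONGCTX"] is 337,500,000, matching the 337.5M long-context total in the shortlist table below.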

Leaderboard By Capability (known daily token caps)

| Capability | Top models | Highest known daily cap | Known models |
| --- | --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M); Cerebras — Llama 4 Maverick 400B (1M) | 62.5M | 9 |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 25M | 1 |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M); Groq — GPT OSS 120B (100K) | 25M | 2 |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 250M | 3 |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 4 Maverick 400B (1M) | 250M | 9 |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 3.1 8B (1M) | 250M | 7 |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M) | 1M | 7 |

80% Shortlist (known-cap coverage)

| Capability | Models to reach >=80% of known capacity | Coverage | Total known daily cap |
| --- | --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 95.9% | 91.2M |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 100.0% | 25M |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M) | 99.6% | 25.1M |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 92.6% | 337.5M |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 98.5% | 317.1M |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 99.2% | 315.1M |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M); Groq — Llama 4 Maverick 17B (500K); Groq — Qwen3 32B (500K) | 87.0% | 4.6M |
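The >=80% shortlists follow from a simple greedy rule: take models in descending order of known daily cap until coverage crosses the threshold. A sketch under that assumption (the actual script may differ in detail):

```python
def shortlist(rows, threshold=0.8):
    """rows: (model, known daily token cap) pairs.
    Greedily pick the largest caps until at least `threshold`
    of the total known capacity is covered."""
    total = sum(cap for _, cap in rows)
    picked, covered = [], 0
    for name, cap in sorted(rows, key=lambda r: r[1], reverse=True):
        if covered >= threshold * total:
            break
        picked.append(name)
        covered += cap
    return picked, covered / total
```

With the long-context caps above (250M, 62.5M, 25M) this returns two models at 92.6% coverage, matching the table row.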

Fast Recommendation (80% rule, known-cap data)

| Goal | Recommended free-first models | Why this set |
| --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash; Google Gemini — Gemini 2.5 Pro | Reaches 95.9% of known daily capacity (91.2M total known). |
| Verification | Google Gemini — Gemini 2.5 Pro | Reaches 100.0% of known daily capacity (25M total known). |
| Reasoning | Google Gemini — Gemini 2.5 Pro | Reaches 99.6% of known daily capacity (25.1M total known). |

Google Gemini

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | 1M / 5 / 100 / 250K / ~25M | Verified | CODE VERIFY REASON LONGCTX. Account: Google. Quality: Excellent. Regional/compliance rules may affect access. |
| Gemini 2.5 Flash | 1M / 10 / 250 / 250K / ~62.5M | Verified | CODE FAST LONGCTX BUDGET. Account: Google. Quality: Very Good. Official free-tier RPM is 10 (not 15). |
| Gemini 2.5 Flash-Lite | 1M / 15 / 1000 / 250K / ~250M | Verified | FAST LONGCTX BUDGET. Account: Google. Quality: Good. Highest listed Gemini free-tier RPD. |

OpenAI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| GPT-4o | 128K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON. Account: OpenAI. Quality: Excellent. No standing public free API tier documented. |
| GPT-4o mini | 128K / tier / tier / tier / tier | Partially verified | CODE FAST BUDGET. Account: OpenAI. Quality: Very Good. Limits vary by trust tier/account status. |

Anthropic Claude

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Claude 3.5 Sonnet | 200K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON LONGCTX. Account: Anthropic console. Quality: Excellent. Tier 1 needs purchased credits. |
| Claude 3 Haiku | 200K / tier / tier / tier / tier | Partially verified | FAST BUDGET VERIFY. Account: Anthropic console. Quality: Good. Fixed global daily caps are not published. |

Groq

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 3.3 70B (llama-3.3-70b-versatile) | 128K / 30 / 1000 / 12K / 100K | Verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Very Good. Official row differs from older community numbers. |
| Llama 4 Maverick 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Good. Fast inference; limits can change between revisions. |
| Qwen3 32B | 128K / 30 / 14,400 / 6K / 500K | Partially verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Good. Model IDs/limits may shift by release. |
| Llama 4 Scout 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET. Account: Groq. 16E MoE variant optimized for low-latency tasks. |
| GPT OSS 120B | 128K / 30 / 1000 / 12K / 100K | Verified | CODE REASON. Account: Groq. High-performance open-weights model. |
| Kimi K2 (kimi-k2-0905) | 256K / n/p / n/p / n/p / n/p | Verified | LONGCTX REASON. Account: Groq. 1T-parameter MoE with 256K context. |
| Compound AI (groq/compound) | 128K / 30 / 250 / 70K / n/p | Verified | FAST REASON CODE. Account: Groq (no CC for free tier). Quality: Good. Official docs do not publish TPD. |

Kiro

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Auto (Frontier Mix) | n/p / n/p / n/p / n/p / 50 credits | Verified | CODE FAST BUDGET. Account: Kiro. Mixed agent using frontier models. Free tier includes 50 monthly credits. |

Together AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 4 Maverick | 131K / tier / tier / tier / tier | Verified | CODE REASON OPEN. Account: Together + paid credits. Quality: Very Good. No standing free trial in current docs. |
| DeepSeek V3.1 | 64K / tier / tier / tier / tier | Verified | CODE REASON OPEN. Account: Together + paid credits. Quality: Excellent. "$100 signup credits" not confirmed in current docs. |
| Mistral Small 3 | 128K / tier / tier / tier / tier | Verified | CODE OPEN. Account: Together + paid credits. Quality: Good. Limits are spend/account-tier dependent. |

Hugging Face

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Various open models | varies / provider / provider / provider / credit-based | Verified | OPEN BUDGET CODE. Account: Hugging Face. Quality: Varies. Limits depend on routed provider and plan. |
| Pro-tier routed providers | varies / higher / provider / provider / credit-based | Verified | OPEN CODE VERIFY. Account: Hugging Face Pro. Quality: Very Good. Pro includes higher monthly credits. |

Mistral AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Mistral Nemo 12B | model / plan / plan / plan / plan | Partially verified | CODE OPEN BUDGET. Account: Mistral (Experiment/Scale). Quality: Good. Free Experiment plan exists; quotas are dynamic. |
| Mistral Small 3.1 | 128K / plan / plan / plan / plan | Partially verified | CODE OPEN. Account: Mistral (Experiment/Scale). Quality: Good. Access/limits depend on plan tier. |
| Codestral | 32K / plan / plan / plan / plan | Partially verified | CODE VERIFY OPEN. Account: Mistral (Experiment/Scale). Quality: Excellent. Code-oriented with plan gating. |
| Mistral Large 3 | 128K / plan / plan / plan / plan | Partially verified | CODE REASON VERIFY. Account: Mistral (Experiment/Scale). Quality: Excellent. Most capable tier usually paid. |

DeepSeek

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| DeepSeek V3.2 (deepseek-chat) | 128K / n/p / n/p / n/p / n/p | Partially verified | CODE REASON BUDGET. Account: DeepSeek. Quality: Excellent. Pricing is public; fixed free quotas are not. |
| DeepSeek R1 / reasoner | 128K / n/p / n/p / n/p / n/p | Unverified | REASON VERIFY CODE. Account: DeepSeek. Quality: Excellent. "5M signup tokens" is not confirmed in official docs. |

Cohere

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Command R7B | 128K / 20 / ~1,000/mo / endpoint / 1,000/mo | Verified | VERIFY FAST. Account: Cohere trial key. Quality: Good. Free trial usage is heavily rate-limited. |
| Command R+ | 128K / 20 / ~1,000/mo / endpoint / 1,000/mo | Verified | VERIFY FAST CODE. Account: Cohere trial key. Quality: Very Good. Trial cap is account-wide per month. |

OpenRouter

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Qwen3 Coder 480B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Excellent. Free limits are account-plan based. |
| GPT-OSS-120B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Very Good. Free variants can rotate. |
| Llama 3.3 70B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Very Good. Free router pool is dynamic. |
| Mistral Small 3.1 (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Good. Best for low-volume testing. |
| DeepSeek R1 (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | REASON VERIFY BUDGET. Account: OpenRouter. Quality: Excellent. Current docs differ from older community RPD values. |
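OpenRouter's :free variants are addressed by suffixing the model ID. A minimal request-builder sketch (the model ID in the usage note is an example; free variants rotate as noted above, so check current availability first):

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def free_variant(model_id: str) -> str:
    """Append OpenRouter's ':free' suffix unless it is already present."""
    return model_id if model_id.endswith(":free") else model_id + ":free"

def build_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for a :free variant."""
    payload = {
        "model": free_variant(model_id),
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

For example, build_request("deepseek/deepseek-r1", "ping") targets deepseek/deepseek-r1:free; sending it with urllib.request.urlopen counts against the 50/day or 1000/day account limit.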

Cerebras

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 4 Maverick 400B | 128K (paid) / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN. Account: Cerebras. Quality: Very Good. Context/limits vary by tier and model page. |
| Qwen3 Coder 235B | 64K free, 131K paid / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN. Account: Cerebras. Quality: Excellent. Verify live limits on model page. |
| Llama 3.1 8B | 8K free, 32K paid / 30 / 14,400 / 60K / 1M | Verified | FAST BUDGET OPEN. Account: Cerebras. Quality: Good. Free-tier limits are explicitly published. |

xAI Grok

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Grok 4.1 Fast | model / credit / credit / credit / credit | Unverified | REASON VERIFY. Account: xAI. Quality: Very Good. Promotional credits may exist; fixed "$25 startup credits" not consistently documented. |

MiniMax

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| MiniMax-M2.5 (Coding Plan Starter) | 200K / 40 prompts per 5h / n/p / n/p / n/p | Verified | CODE BUDGET REASON. Account: MiniMax. Quality: Excellent. Optimized for coding. Fixed-fee subscription. |
| MiniMax-M2.5 (Pay-as-you-go) | 200K / plan / plan / plan / plan | Verified | CODE FAST. Account: MiniMax. Quality: Excellent. Competitive RMB pricing (2.1/8.4 per 1M tokens). |

Moonshot AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| moonshot-v1-128k | 128K / tier / tier / tier / tier | Partially verified | LONGCTX REASON. Account: Moonshot AI. Quality: Very Good. Known for pioneering long-context stability. |

Notes and caveats

  • Provider pricing and free-tier rules change frequently; always verify from official links before budgeting.
  • Several vendors distinguish between product-level free plans and API-level free access.
  • Account-level cloud credits (Azure/AWS/GCP) are not equivalent to a provider-specific API free tier.

Maintenance protocol

When updating this page:

  1. Validate each row against official docs/pricing pages.
  2. Update Free tier / trial, Evidence summary, and Last verified.
  3. Add providers only when official pricing and docs links are stable.
  4. Use Unclear when evidence is ambiguous.
  5. Regenerate capacity summaries with python3 scripts/update_api_pricing_capability_summary.py.

Sources / References

Contribution Metadata

  • Last reviewed: 2026-03-03
  • Confidence: medium