API Pricing & Free Tier Matrix
This is the canonical tracker for API pricing links and free-tier availability across LLM providers and API platforms.
Scope and usage
- Focus: API access (not consumer chat subscriptions unless directly tied to API credits).
- Purpose: provider comparison, budgeting, and long-term maintenance reference.
- Update target: monthly, and after major provider announcements.
Status legend
- Yes = official free tier/trial access is currently documented.
- Partial = limited free usage exists (for example, selected models/features).
- No = no current free trial/tier is documented.
- Unclear = pricing/billing docs do not clearly confirm a standing free tier.
Canonical pricing matrix (last verified: 2026-03-03)
| Provider / Platform | Official links | Free tier / trial | Evidence summary |
|---|---|---|---|
| OpenAI | Docs · Pricing | No | Usage-priced API; prepaid credits required. |
| Anthropic (Claude API) | Docs · Pricing | Yes | New users receive small starter API credits. |
| Google Gemini Developer API | Docs · Pricing | Yes | Pricing tables include free-tier rows. AI Pro members receive $10/mo Cloud credits. |
| OpenRouter | Docs · Pricing | Yes | Free plan and free-model routing are documented. |
| xAI (Grok API) | Docs · Pricing | Yes | Docs mention monthly free requests/credits. |
| Z.ai (GLM API) | Docs · Pricing | Yes | New users can claim free API token packages. |
| Alibaba DashScope (Qwen APIs) | Docs · Pricing | Yes | Many models show temporary free quota periods. |
| Cohere | Docs · Pricing | Yes | Trial API keys are free and rate-limited. |
| Mistral AI | Docs · Pricing | Yes | Experiment plan supports free API testing. |
| Together AI | Docs · Pricing | No | Billing docs indicate paid credits are required. |
| Groq | Docs · Pricing | Yes | Free API plan and limits are documented. |
| Kiro | Docs · Pricing | Yes | Perpetual free tier (50 credits/mo) + 500 bonus credits. |
| Fireworks AI | Docs · Pricing | Yes | Public pricing notes starter free credits. |
| Replicate | Docs · Pricing | Partial | Some models can be run free before billing. |
| DeepSeek API | Docs · Pricing | Unclear | Granted balances are mentioned, fixed free tier unclear. |
| Perplexity API | Docs · Pricing | No | Purchased credits and top-up requirements documented. |
| AI21 | Docs · Pricing | Yes | Pricing page advertises free trial credits. |
| Abacus.AI | Docs · Pricing | Yes | Free trial and ChatLLM free access documented. |
| Voyage AI | Docs · Pricing | Unclear | Paid rates are clear; standing free tier not explicit. |
| Cloudflare Workers AI | Docs · Pricing | Yes | Free plan includes daily usage. |
| Hugging Face Inference Providers | Docs · Pricing | Yes | Monthly included inference credits by account tier. |
| Cerebras Inference | Docs · Pricing | Yes | Pricing references free-tier usage/credits. |
| NVIDIA API Catalog | Docs · Pricing | Yes | Starter credits are referenced publicly. |
| SambaNova Cloud | Docs · Pricing | Unclear | Public page shows paid plans; no stable free policy. |
| AWS Bedrock | Docs · Pricing | No | Metered pay-as-you-go pricing. |
| Amazon Q | Docs · Pricing | Yes | Free tier for individuals/developers is documented. |
| Azure OpenAI Service | Docs · Pricing | Unclear | Metered service; only account-level cloud credits may apply. |
| Vertex AI (Gemini via GCP) | Docs · Pricing | Unclear | Metered pricing; no persistent API free-tier statement. |
| OCI Generative AI | Docs · Pricing | Unclear | Paid rates are public; free tier not clearly documented. |
| MiniMax | Docs · Pricing | Yes | Coding Plan provides a low-cost entry tier; trial credits available. |
| Moonshot AI | Docs · Pricing | Partial | Trial credits are typically granted to new developer accounts. |
Developer Program Plans
These are the core subscription plans for developers that bundle AI access, cloud credits, and other professional benefits.
| Program / Plan | Cost | AI Access & Quotas | Cloud Credits & Benefits |
|---|---|---|---|
| Google Developer Program — Standard | Free | 10 Firebase Studio workspaces; Gemini Code Assist (Basic); Gemini CLI (60 RPM / 1000 RPD) | Monthly Google Skills credits (via GEAR); community access; private previews. |
| Google Developer Program — Premium | $24.99/mo or $299/yr | 30 Firebase Studio workspaces; Gemini Code Assist (Higher); Gemini CLI (120 RPM / 1500 RPD) | $45/mo ($550/yr) GenAI/Cloud credit; $500 bonus credit upon certification; 1 Cloud cert voucher; expert consultation. |
| Google Developer Program — Enterprise | Preview | Gemini Code Assist Enterprise; Gemini CLI (120 RPM / 2000 RPD) | $150/mo Google Cloud credit; centralized purchasing; developer sandboxes. |
Model-level quota tracker (expanded list)
This section is grouped by provider with compact four-column tables for narrower screens.
Verification labels:
- Verified = core limits are visible in official docs.
- Partially verified = provider free-tier stance is verified, but model-level quotas are dynamic or not fully published.
- Unverified = values come from community reports or account-specific observations not explicitly documented.
Quality ratings in the Summary column refer to code generation quality; they are subjective and community-assessed, not an official benchmark metric.
Quick jump: Google Gemini · OpenAI · Anthropic Claude · Groq · Together AI · Hugging Face · Mistral AI · DeepSeek · Cohere · OpenRouter · Cerebras · xAI Grok · Kiro
The Quotas column format is context window / RPM (requests per minute) / RPD (requests per day) / TPM (tokens per minute) / daily token cap.
n/p means "not published."
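
The five-field format above can be read mechanically. The following is a minimal sketch, not the page's actual tooling: `parse_number` and `parse_quotas` are hypothetical helper names, and the sketch assumes fields are separated by " / " (with spaces), so that `n/p` stays intact. Any field that is not a plain number (`n/p`, `tier`, `plan`, `50 credits`, annotated values such as `128K (paid)`) is returned as `None`.

```python
import re
from typing import Dict, Optional

# Hypothetical field names for the five slash-separated positions.
FIELDS = ("context", "rpm", "rpd", "tpm", "daily_cap")
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_number(token: str) -> Optional[int]:
    """Convert '250K', '~62.5M' or '14,400' to an int; return None otherwise."""
    cleaned = token.strip().lstrip("~").replace(",", "")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KM]?)", cleaned)
    if match is None:
        return None  # e.g. 'n/p', 'tier', 'plan', '50 credits'
    value, suffix = match.groups()
    return round(float(value) * _SUFFIX[suffix])


def parse_quotas(cell: str) -> Dict[str, Optional[int]]:
    """Split a Quotas cell on ' / ' and parse each of the five fields."""
    parts = [part.strip() for part in cell.split(" / ")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {cell!r}")
    return {name: parse_number(part) for name, part in zip(FIELDS, parts)}


if __name__ == "__main__":
    print(parse_quotas("1M / 10 / 250 / 250K / ~62.5M"))
    print(parse_quotas("256K / n/p / n/p / n/p / n/p"))
```

Returning `None` instead of raising keeps non-numeric rows visible to a caller, which can then decide whether to skip them.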
Capability tags:
- CODE: code generation and refactoring tasks.
- VERIFY: cross-checking, factual validation, and test review.
- REASON: complex reasoning and multi-step planning.
- LONGCTX: long documents, large prompts, and retrieval-heavy workflows.
- FAST: low-latency interactions and interactive agent loops.
- BUDGET: better free-tier value or lower-cost experimentation.
- OPEN: open-weight/open-model ecosystem affinity.
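
In the model tables, each Summary cell begins with a run of these tags. A small sketch of pulling them out is shown here; `leading_tags` is a hypothetical helper, and the pairing of each tag with a leaderboard label in `TAG_LABELS` is inferred from the capability names used in the summary tables on this page rather than taken from the generator script.

```python
from typing import List

# Tag -> leaderboard label. The pairing is an assumption inferred from this page.
TAG_LABELS = {
    "CODE": "Coding",
    "VERIFY": "Verification",
    "REASON": "Reasoning",
    "LONGCTX": "Long-context",
    "FAST": "Low-latency",
    "BUDGET": "Budget/free-value",
    "OPEN": "Open-model ecosystem",
}


def leading_tags(summary: str) -> List[str]:
    """Return the capability tags that open a Summary cell, in order."""
    tags = []
    for word in summary.split():
        if word not in TAG_LABELS:
            break  # first non-tag word ends the tag run
        tags.append(word)
    return tags


if __name__ == "__main__":
    cell = "CODE FAST LONGCTX BUDGET Account: Google. Quality: Very Good."
    print([TAG_LABELS[tag] for tag in leading_tags(cell)])
```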
Capability Capacity Summary (auto-generated)
These summaries are generated from the model rows on this page using scripts/update_api_pricing_capability_summary.py.
Only rows with a numeric daily token cap are included in the capacity math.
Leaderboard By Capability (known daily token caps)
| Capability | Top models | Highest known daily cap | Known models |
|---|---|---|---|
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M); Cerebras — Llama 4 Maverick 400B (1M) | 62.5M | 9 |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 25M | 1 |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M); Groq — GPT OSS 120B (100K) | 25M | 2 |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 250M | 3 |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 4 Maverick 400B (1M) | 250M | 9 |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 3.1 8B (1M) | 250M | 7 |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M) | 1M | 7 |
80% Shortlist (known-cap coverage)
| Capability | Models to reach >=80% of known capacity | Coverage | Total known daily cap |
|---|---|---|---|
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 95.9% | 91.2M |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 100.0% | 25M |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M) | 99.6% | 25.1M |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 92.6% | 337.5M |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 98.5% | 317.1M |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 99.2% | 315.1M |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M); Groq — Llama 4 Maverick 17B (500K); Groq — Qwen3 32B (500K) | 87.0% | 4.6M |
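
The shortlist rows can be reproduced with a simple greedy pass: sort models by known daily cap, then add models until the running total reaches 80% of the capability's total. This is a sketch under that assumption; the generator script's actual algorithm is not shown on this page, and `shortlist` is a hypothetical name. The caps below are the nine CODE-tagged rows with numeric daily caps from the model tables.

```python
from typing import Dict, List, Tuple


def shortlist(caps: Dict[str, int], target: float = 0.80) -> Tuple[List[str], float, int]:
    """Pick the largest caps until they cover `target` of the total.

    Returns (picked model names, coverage fraction, total known cap).
    """
    total = sum(caps.values())
    picked: List[str] = []
    running = 0
    # sorted() is stable, so equal caps keep their insertion order.
    for name, cap in sorted(caps.items(), key=lambda item: item[1], reverse=True):
        if running >= target * total:
            break
        picked.append(name)
        running += cap
    return picked, running / total, total


# Daily token caps for CODE-tagged rows with numeric caps on this page.
CODING_CAPS = {
    "Gemini 2.5 Flash": 62_500_000,
    "Gemini 2.5 Pro": 25_000_000,
    "Cerebras Llama 4 Maverick 400B": 1_000_000,
    "Cerebras Qwen3 Coder 235B": 1_000_000,
    "Groq Llama 4 Maverick 17B": 500_000,
    "Groq Qwen3 32B": 500_000,
    "Groq Llama 4 Scout 17B": 500_000,
    "Groq Llama 3.3 70B": 100_000,
    "Groq GPT OSS 120B": 100_000,
}

if __name__ == "__main__":
    models, coverage, total = shortlist(CODING_CAPS)
    print(models, f"{coverage:.1%}", f"{total / 1e6:.1f}M")
```

With these inputs the sketch selects Gemini 2.5 Flash and Gemini 2.5 Pro at 95.9% of 91.2M, matching the Coding row above.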
Fast Recommendation (80% rule, known-cap data)
| Goal | Recommended free-first models | Why this set |
|---|---|---|
| Coding | Google Gemini — Gemini 2.5 Flash; Google Gemini — Gemini 2.5 Pro | Reaches 95.9% of known daily capacity (91.2M total known). |
| Verification | Google Gemini — Gemini 2.5 Pro | Reaches 100.0% of known daily capacity (25M total known). |
| Reasoning | Google Gemini — Gemini 2.5 Pro | Reaches 99.6% of known daily capacity (25.1M total known). |
Google Gemini
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Gemini 2.5 Pro | 1M / 5 / 100 / 250K / ~25M | Verified | CODE VERIFY REASON LONGCTX Account: Google. Quality: Excellent. Regional/compliance rules may affect access. |
| Gemini 2.5 Flash | 1M / 10 / 250 / 250K / ~62.5M | Verified | CODE FAST LONGCTX BUDGET Account: Google. Quality: Very Good. Official free-tier RPM is 10 (not 15). |
| Gemini 2.5 Flash-Lite | 1M / 15 / 1000 / 250K / ~250M | Verified | FAST LONGCTX BUDGET Account: Google. Quality: Good. Highest listed Gemini free-tier RPD. |
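
The approximate daily caps in the Gemini rows are numerically equal to RPD × TPM. That is an observation about the figures in this table, not a formula stated on this page, and it does not carry over to other providers (for the Groq Llama 3.3 70B row, 1000 × 12K is not 100K). A short check, with `derived_daily_cap` as a hypothetical helper:

```python
# (RPD, TPM, listed daily cap) copied from the Gemini table.
GEMINI_ROWS = {
    "Gemini 2.5 Pro": (100, 250_000, 25_000_000),
    "Gemini 2.5 Flash": (250, 250_000, 62_500_000),
    "Gemini 2.5 Flash-Lite": (1000, 250_000, 250_000_000),
}


def derived_daily_cap(rpd: int, tpm: int) -> int:
    """Assumed upper bound: every daily request spends a full minute's token budget."""
    return rpd * tpm


def mismatches(rows: dict) -> list:
    """Return the models whose listed cap differs from RPD x TPM."""
    return [
        model
        for model, (rpd, tpm, listed) in rows.items()
        if derived_daily_cap(rpd, tpm) != listed
    ]


if __name__ == "__main__":
    print(mismatches(GEMINI_ROWS))
```

Because the product is an upper bound, real daily throughput will usually be lower once RPM limits and actual request sizes are taken into account.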
OpenAI
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| GPT-4o | 128K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON Account: OpenAI. Quality: Excellent. No standing public free API tier documented. |
| GPT-4o mini | 128K / tier / tier / tier / tier | Partially verified | CODE FAST BUDGET Account: OpenAI. Quality: Very Good. Limits vary by trust tier/account status. |
Anthropic Claude
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Claude 3.5 Sonnet | 200K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON LONGCTX Account: Anthropic console. Quality: Excellent. Tier 1 needs purchased credits. |
| Claude 3 Haiku | 200K / tier / tier / tier / tier | Partially verified | FAST BUDGET VERIFY Account: Anthropic console. Quality: Good. Fixed global daily caps are not published. |
Groq
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Llama 3.3 70B (llama-3.3-70b-versatile) | 128K / 30 / 1000 / 12K / 100K | Verified | FAST CODE OPEN BUDGET Account: Groq (no CC for free tier). Quality: Very Good. Official row differs from older community numbers. |
| Llama 4 Maverick 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET Account: Groq (no CC for free tier). Quality: Good. Fast inference; revision limits can change. |
| Qwen3 32B | 128K / 30 / 14,400 / 6K / 500K | Partially verified | FAST CODE OPEN BUDGET Account: Groq (no CC for free tier). Quality: Good. Model IDs/limits may shift by release. |
| Llama 4 Scout 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET Account: Groq. 16E MoE variant optimized for low-latency tasks. |
| GPT OSS 120B | 128K / 30 / 1000 / 12K / 100K | Verified | CODE REASON Account: Groq. High performance open-weights model. |
| Kimi K2 (kimi-k2-0905) | 256K / n/p / n/p / n/p / n/p | Verified | LONGCTX REASON Account: Groq. 1T parameter MoE with 256K context. |
| Compound AI (groq/compound) | 128K / 30 / 250 / 70K / n/p | Verified | FAST REASON CODE Account: Groq (no CC for free tier). Quality: Good. Official docs do not publish TPD. |
Kiro
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Auto (Frontier Mix) | n/p / n/p / n/p / n/p / 50 credits | Verified | CODE FAST BUDGET Account: Kiro. Mixed agent using frontier models. Free tier includes 50 monthly credits. |
Together AI
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Llama 4 Maverick | 131K / tier / tier / tier / tier | Verified | CODE REASON OPEN Account: Together + paid credits. Quality: Very Good. No standing free trial in current docs. |
| DeepSeek V3.1 | 64K / tier / tier / tier / tier | Verified | CODE REASON OPEN Account: Together + paid credits. Quality: Excellent. "$100 signup credits" not confirmed in current docs. |
| Mistral Small 3 | 128K / tier / tier / tier / tier | Verified | CODE OPEN Account: Together + paid credits. Quality: Good. Limits are spend/account-tier dependent. |
Hugging Face
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Various open models | varies / provider / provider / provider / credit-based | Verified | OPEN BUDGET CODE Account: Hugging Face. Quality: Varies. Limits depend on routed provider and plan. |
| Pro-tier routed providers | varies / higher / provider / provider / credit-based | Verified | OPEN CODE VERIFY Account: Hugging Face Pro. Quality: Very Good. Pro includes higher monthly credits. |
Mistral AI
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Mistral Nemo 12B | model / plan / plan / plan / plan | Partially verified | CODE OPEN BUDGET Account: Mistral (Experiment/Scale). Quality: Good. Free Experiment plan exists; quotas are dynamic. |
| Mistral Small 3.1 | 128K / plan / plan / plan / plan | Partially verified | CODE OPEN Account: Mistral (Experiment/Scale). Quality: Good. Access/limits depend on plan tier. |
| Codestral | 32K / plan / plan / plan / plan | Partially verified | CODE VERIFY OPEN Account: Mistral (Experiment/Scale). Quality: Excellent. Code-oriented with plan gating. |
| Mistral Large 3 | 128K / plan / plan / plan / plan | Partially verified | CODE REASON VERIFY Account: Mistral (Experiment/Scale). Quality: Excellent. Most capable tier usually paid. |
DeepSeek
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| DeepSeek V3.2 (deepseek-chat) | 128K / n/p / n/p / n/p / n/p | Partially verified | CODE REASON BUDGET Account: DeepSeek. Quality: Excellent. Pricing is public; fixed free quotas are not. |
| DeepSeek R1 / reasoner | 128K / n/p / n/p / n/p / n/p | Unverified | REASON VERIFY CODE Account: DeepSeek. Quality: Excellent. "5M signup tokens" is not confirmed in official docs. |
Cohere
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Command R7B | 128K / 20 / ~1000mo / endpoint / 1000mo | Verified | VERIFY FAST Account: Cohere trial key. Quality: Good. Free trial usage is heavily rate-limited. |
| Command R+ | 128K / 20 / ~1000mo / endpoint / 1000mo | Verified | VERIFY FAST CODE Account: Cohere trial key. Quality: Very Good. Trial cap is account-wide per month. |
OpenRouter
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Qwen3 Coder 480B (:free variant when available) | model / 20 / 50d (<$10) or 1000d (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET Account: OpenRouter. Quality: Excellent. Free limits are account-plan based. |
| GPT-OSS-120B (:free variant when available) | model / 20 / 50d (<$10) or 1000d (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET Account: OpenRouter. Quality: Very Good. Free variants can rotate. |
| Llama 3.3 70B (:free variant when available) | model / 20 / 50d (<$10) or 1000d (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET Account: OpenRouter. Quality: Very Good. Free router pool is dynamic. |
| Mistral Small 3.1 (:free variant when available) | model / 20 / 50d (<$10) or 1000d (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET Account: OpenRouter. Quality: Good. Best for low-volume testing. |
| DeepSeek R1 (:free variant when available) | model / 20 / 50d (<$10) or 1000d (>= $10) / n/p / n/p | Verified | REASON VERIFY BUDGET Account: OpenRouter. Quality: Excellent. Current docs differ from older community RPD values. |
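
The RPD field in the OpenRouter rows depends on a $10 threshold. The sketch below encodes it as listed: 50 requests per day below $10, 1000 per day at $10 or more, and 20 requests per minute in both cases. The table does not say what the dollar amount measures, so treating it as credits purchased on the account is an assumption here, and `free_requests_per_day` is a hypothetical helper, not an OpenRouter API.

```python
FREE_RPM = 20                 # requests per minute, from the Quotas column
LOW_TIER_RPD = 50             # listed as "50d (<$10)"
HIGH_TIER_RPD = 1000          # listed as "1000d (>= $10)"
CREDIT_THRESHOLD_USD = 10.0   # assumed to mean credits purchased on the account


def free_requests_per_day(credits_purchased_usd: float) -> int:
    """Return the daily free-model request limit for a given credit amount."""
    if credits_purchased_usd < 0:
        raise ValueError("credits_purchased_usd must be non-negative")
    if credits_purchased_usd >= CREDIT_THRESHOLD_USD:
        return HIGH_TIER_RPD
    return LOW_TIER_RPD


if __name__ == "__main__":
    for amount in (0, 9.99, 10, 25):
        print(amount, free_requests_per_day(amount))
```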
Cerebras
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Llama 4 Maverick 400B | 128K (paid) / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN Account: Cerebras. Quality: Very Good. Context/limits vary by tier and model page. |
| Qwen3 Coder 235B | 64K free, 131K paid / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN Account: Cerebras. Quality: Excellent. Verify live limits on model page. |
| Llama 3.1 8B | 8K free, 32K paid / 30 / 14,400 / 60K / 1M | Verified | FAST BUDGET OPEN Account: Cerebras. Quality: Good. Free-tier limits are explicitly published. |
xAI Grok
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| Grok 4.1 Fast | model / credit / credit / credit / credit | Unverified | REASON VERIFY Account: xAI. Quality: Very Good. Promotional credits may exist; fixed "$25 startup credits" not consistently documented. |
MiniMax
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| MiniMax-M2.5 (Coding Plan Starter) | 200K / 40 prompts per 5h / n/p / n/p / n/p | Verified | CODE BUDGET REASON Account: MiniMax. Quality: Excellent. Optimized for coding. Fixed-fee subscription. |
| MiniMax-M2.5 (Pay-as-you-go) | 200K / plan / plan / plan / plan | Verified | CODE FAST Account: MiniMax. Quality: Excellent. Competitive RMB pricing (2.1/8.4 per 1M tokens). |
Moonshot AI
| Model | Quotas | Verification | Summary |
|---|---|---|---|
| moonshot-v1-128k | 128K / tier / tier / tier / tier | Partially verified | LONGCTX REASON Account: Moonshot AI. Quality: Very Good. Known as an early pioneer of stable long-context handling. |
Notes and caveats
- Provider pricing and free-tier rules change frequently; always verify from official links before budgeting.
- Several vendors distinguish between product-level free plans and API-level free access.
- Account-level cloud credits (Azure/AWS/GCP) are not equivalent to a provider-specific API free tier.
Maintenance protocol
When updating this page:
- Validate each row against official docs/pricing pages.
- Update `Free tier / trial`, `Evidence summary`, and `Last verified`.
- Add providers only when official pricing and docs links are stable.
- Use `Unclear` when evidence is ambiguous.
- Regenerate capacity summaries with `python3 scripts/update_api_pricing_capability_summary.py`.
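
Part of the row validation above can be automated. The following is a minimal sketch of a status check for the pricing matrix, assuming the four-column layout used on this page with the status in the third column; `split_row` and `find_bad_rows` are hypothetical helpers and are not part of the existing script.

```python
from typing import Iterable, List, Tuple

ALLOWED_STATUSES = {"Yes", "Partial", "No", "Unclear"}  # from the status legend
EXPECTED_COLUMNS = 4
STATUS_COLUMN = 2  # zero-based index of "Free tier / trial"


def split_row(line: str) -> List[str]:
    """Split one markdown table line into trimmed cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(cells: List[str]) -> bool:
    """True for the |---|---| line that follows a table header."""
    return all(cell and set(cell) <= set("-:") for cell in cells)


def find_bad_rows(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (first cell, problem) pairs for rows that break the legend."""
    table_lines = [line for line in lines if line.strip().startswith("|")]
    problems = []
    for line in table_lines[1:]:  # the first table line is the header
        cells = split_row(line)
        if is_separator(cells):
            continue
        if len(cells) != EXPECTED_COLUMNS:
            problems.append((cells[0], f"expected {EXPECTED_COLUMNS} columns"))
        elif cells[STATUS_COLUMN] not in ALLOWED_STATUSES:
            problems.append((cells[0], f"unknown status {cells[STATUS_COLUMN]!r}"))
    return problems


if __name__ == "__main__":
    sample = [
        "| Provider / Platform | Official links | Free tier / trial | Evidence summary |",
        "|---|---|---|---|",
        "| OpenAI | Docs / Pricing | No | Usage-priced API; prepaid credits required. |",
        "| Example Provider | Docs / Pricing | Maybe | Placeholder row for illustration. |",
    ]
    print(find_bad_rows(sample))
```

A check like this only catches formatting and legend violations; whether a status is still accurate has to be confirmed against the provider's own pages.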
Sources / References
- Kiro Pricing
- Abacus.AI Pricing
- Amazon Q Pricing
- OpenAI API Pricing
- OpenAI Prepaid Billing
- Anthropic Claude API Pricing
- Google Gemini API Pricing
- Google Developer Program Plans & Pricing
- OpenRouter Pricing
- OpenRouter API rate limits
- OpenRouter Free Models Router
- MiniMax Pricing
- MiniMax Coding Plan
- Moonshot AI Website
- xAI API
- xAI Models/Pricing docs
- xAI Billing
- Z.ai Docs
- Z.ai Model Pricing
- Alibaba Model Studio billing
- Alibaba Model pricing
- Cohere Pricing
- Cohere trial/production key rate limits
- Mistral Pricing
- Mistral Experiment Plan (Free API)
- Together Pricing
- Together Billing Credits
- Groq Pricing
- Groq Rate Limits
- Fireworks Pricing
- Replicate Pricing
- Replicate Billing
- DeepSeek Models & Pricing
- Perplexity API Pricing
- AI21 Pricing
- Voyage Pricing
- Cloudflare Workers AI Pricing
- Hugging Face Pricing
- Cerebras Pricing
- NVIDIA API Pricing
- SambaNova Pricing
- AWS Bedrock Pricing
- Azure OpenAI Pricing
- Vertex AI GenAI Pricing
- OCI Generative AI Pricing
- Google Gemini rate limits
- Groq rate limits
- Cerebras rate limits
- ChatGPT shared context (title only)
Contribution Metadata
- Last reviewed: 2026-03-03
- Confidence: medium