
API Pricing & Free Tier Matrix

This is the canonical tracker for API pricing links and free-tier availability across LLM providers and API platforms.

Scope and usage

  • Focus: API access (not consumer chat subscriptions unless directly tied to API credits).
  • Purpose: provider comparison, budgeting, and long-term maintenance reference.
  • Update target: monthly, and after major provider announcements.

Status legend

  • Yes = official free tier/trial access is currently documented.
  • Partial = limited free usage exists (for example, selected models/features).
  • No = no current free trial/tier is documented.
  • Unclear = pricing/billing docs do not clearly confirm a standing free tier.

Canonical pricing matrix (last verified: 2026-03-03)

| Provider / Platform | Official links | Free tier / trial | Evidence summary |
| --- | --- | --- | --- |
| OpenAI | Docs · Pricing | No | Usage-priced API; prepaid credits required. |
| Anthropic (Claude API) | Docs · Pricing | Yes | New users receive small starter API credits. |
| Google Gemini Developer API | Docs · Pricing | Yes | Pricing tables include free-tier rows. AI Pro members receive $10/mo Cloud credits. |
| OpenRouter | Docs · Pricing | Yes | Free plan and free-model routing are documented. |
| xAI (Grok API) | Docs · Pricing | Yes | Docs mention monthly free requests/credits. |
| Z.ai (GLM API) | Docs · Pricing | Yes | New users can claim free API token packages. |
| Alibaba DashScope (Qwen APIs) | Docs · Pricing | Yes | Many models show temporary free quota periods. |
| Cohere | Docs · Pricing | Yes | Trial API keys are free and rate-limited. |
| Mistral AI | Docs · Pricing | Yes | Experiment plan supports free API testing. |
| Together AI | Docs · Pricing | No | Billing docs indicate paid credits are required. |
| Groq | Docs · Pricing | Yes | Free API plan and limits are documented. |
| Kiro | Docs · Pricing | Yes | Perpetual free tier (50 credits/mo) + 500 bonus credits. |
| Fireworks AI | Docs · Pricing | Yes | Public pricing notes starter free credits. |
| Replicate | Docs · Pricing | Partial | Some models can be run free before billing. |
| DeepSeek API | Docs · Pricing | Unclear | Granted balances are mentioned; a fixed free tier is unclear. |
| Perplexity API | Docs · Pricing | No | Purchased credits and top-up requirements documented. |
| AI21 | Docs · Pricing | Yes | Pricing page advertises free trial credits. |
| Abacus.AI | Docs · Pricing | Yes | Free trial and ChatLLM free access documented. |
| Voyage AI | Docs · Pricing | Unclear | Paid rates are clear; standing free tier not explicit. |
| Cloudflare Workers AI | Docs · Pricing | Yes | Free plan includes daily usage. |
| Hugging Face Inference Providers | Docs · Pricing | Yes | Monthly included inference credits by account tier. |
| Cerebras Inference | Docs · Pricing | Yes | Pricing references free-tier usage/credits. |
| NVIDIA API Catalog | Docs · Pricing | Yes | Starter credits are referenced publicly. |
| SambaNova Cloud | Docs · Pricing | Unclear | Public page shows paid plans; no stable free policy. |
| AWS Bedrock | Docs · Pricing | No | Metered pay-as-you-go pricing. |
| Amazon Q | Docs · Pricing | Yes | Free tier for individuals/developers is documented. |
| Azure OpenAI Service | Docs · Pricing | Unclear | Metered service; only account-level cloud credits may apply. |
| Vertex AI (Gemini via GCP) | Docs · Pricing | Unclear | Metered pricing; no persistent API free-tier statement. |
| OCI Generative AI | Docs · Pricing | Unclear | Paid rates are public; free tier not clearly documented. |
| MiniMax | Docs · Pricing | Yes | Coding Plan provides a low-cost entry tier; trial credits available. |
| Moonshot AI | Docs · Pricing | Partial | Trial credits are typically granted to new developer accounts. |

Developer Program Plans

These are the core subscription plans for developers that bundle AI access, cloud credits, and other professional benefits.

| Program / Plan | Cost | AI Access & Quotas | Cloud Credits & Benefits |
| --- | --- | --- | --- |
| Google Developer Program — Standard | Free | 10 Firebase Studio workspaces; Gemini Code Assist (Basic); Gemini CLI (60 RPM / 1000 RPD) | Monthly Google Skills credits (via GEAR); community access; private previews. |
| Google Developer Program — Premium | $24.99/mo or $299/yr | 30 Firebase Studio workspaces; Gemini Code Assist (Higher); Gemini CLI (120 RPM / 1500 RPD) | $45/mo ($550/yr) GenAI/Cloud credit; $500 bonus credit upon certification; 1 Cloud cert voucher; expert consultation. |
| Google Developer Program — Enterprise | Preview | Gemini Code Assist Enterprise; Gemini CLI (120 RPM / 2000 RPD) | $150/mo Google Cloud credit; centralized purchasing; developer sandboxes. |

Model-level quota tracker (expanded list)

This section is grouped by provider with compact four-column tables for narrower screens.

  • Verified = core limits are visible in official docs.
  • Partially verified = provider free-tier stance is verified, but model-level quotas are dynamic or not fully published.
  • Unverified = values come from community reports or account-specific observations not explicitly documented.

Code Generation Quality is subjective and treated as community-assessed, not an official benchmark metric.

Quick jump: Google Gemini · OpenAI · Anthropic Claude · Groq · Together AI · Hugging Face · Mistral AI · DeepSeek · Cohere · OpenRouter · Cerebras · xAI Grok · Kiro

Quotas are listed as context / RPM / RPD / TPM / daily token cap (RPM = requests per minute, RPD = requests per day, TPM = tokens per minute). n/p means "not published."
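When scripting against this page, those quota strings can be parsed mechanically. A minimal sketch (the field names and helpers are this page's convention, not any provider API; non-numeric placeholders such as n/p, tier, and plan become None):

```python
FIELDS = ("context", "rpm", "rpd", "tpm", "daily_cap")

def _num(value):
    """Parse '250K', '~25M', '14,400', or '1000' into an int.
    Non-numeric placeholders ('n/p', 'tier', 'plan', ...) return None."""
    v = value.lstrip("~").replace(",", "")
    mult = 1
    if v[-1:].upper() == "K":
        mult, v = 1_000, v[:-1]
    elif v[-1:].upper() == "M":
        mult, v = 1_000_000, v[:-1]
    try:
        return int(float(v) * mult)
    except ValueError:
        return None

def parse_quota(quota: str) -> dict:
    """Split a 'context / RPM / RPD / TPM / daily cap' string into named fields.
    Splitting on ' / ' (slash with spaces) keeps the literal 'n/p' marker intact."""
    parts = [p.strip() for p in quota.split(" / ")]
    return dict(zip(FIELDS, (_num(p) for p in parts)))
```

For example, parse_quota("1M / 5 / 100 / 250K / ~25M") yields rpd=100 and daily_cap=25,000,000.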

Capability tags:

  • CODE = code generation and refactoring tasks.
  • VERIFY = cross-checking, factual validation, and test review.
  • REASON = complex reasoning and multi-step planning.
  • LONGCTX = long documents, large prompts, and retrieval-heavy workflows.
  • FAST = low-latency interactions and interactive agent loops.
  • BUDGET = better free-tier value or lower-cost experimentation.
  • OPEN = open-weight/open-model ecosystem affinity.

Capability Capacity Summary (auto-generated)

These summaries are generated from the model rows on this page using scripts/update_api_pricing_capability_summary.py. Only rows with a numeric daily token cap are included in the capacity math.
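The script itself is not reproduced here, but the per-capability capacity math can be sketched as follows (ROWS and the function name are illustrative, mirroring the Google Gemini rows on this page, not the script's actual code; the ~ caps in those rows are consistent with daily cap ≈ RPD × TPM, e.g. 250 RPD × 250K TPM = 62.5M, though providers may publish caps directly):

```python
from collections import defaultdict

# Illustrative rows: (model, capability tags, known daily token cap).
# Values mirror the Google Gemini table on this page.
ROWS = [
    ("Gemini 2.5 Pro", {"CODE", "VERIFY", "REASON", "LONGCTX"}, 25_000_000),
    ("Gemini 2.5 Flash", {"CODE", "FAST", "LONGCTX", "BUDGET"}, 62_500_000),
    ("Gemini 2.5 Flash-Lite", {"FAST", "LONGCTX", "BUDGET"}, 250_000_000),
]

def capability_totals(rows):
    """Sum known daily token caps per capability tag.
    Rows without a numeric cap are skipped, as the note above states."""
    totals = defaultdict(int)
    for _model, tags, cap in rows:
        if cap is None:
            continue
        for tag in tags:
            totals[tag] += cap
    return dict(totals)
```

With these three rows, capability_totals(ROWS)["LONGCTX"] is 337,500,000, matching the 337.5M long-context total in the shortlist table below.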

Leaderboard By Capability (known daily token caps)

| Capability | Top models | Highest known daily cap | Known models |
| --- | --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M); Cerebras — Llama 4 Maverick 400B (1M) | 62.5M | 9 |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 25M | 1 |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M); Groq — GPT OSS 120B (100K) | 25M | 2 |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 250M | 3 |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 4 Maverick 400B (1M) | 250M | 9 |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M); Cerebras — Llama 3.1 8B (1M) | 250M | 7 |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M) | 1M | 7 |

80% Shortlist (known-cap coverage)

| Capability | Models to reach >=80% of known capacity | Coverage | Total known daily cap |
| --- | --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash (62.5M); Google Gemini — Gemini 2.5 Pro (25M) | 95.9% | 91.2M |
| Verification | Google Gemini — Gemini 2.5 Pro (25M) | 100.0% | 25M |
| Reasoning | Google Gemini — Gemini 2.5 Pro (25M) | 99.6% | 25.1M |
| Long-context | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 92.6% | 337.5M |
| Low-latency | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 98.5% | 317.1M |
| Budget/free-value | Google Gemini — Gemini 2.5 Flash-Lite (250M); Google Gemini — Gemini 2.5 Flash (62.5M) | 99.2% | 315.1M |
| Open-model ecosystem | Cerebras — Llama 4 Maverick 400B (1M); Cerebras — Qwen3 Coder 235B (1M); Cerebras — Llama 3.1 8B (1M); Groq — Llama 4 Maverick 17B (500K); Groq — Qwen3 32B (500K) | 87.0% | 4.6M |
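The >=80% shortlists follow from a simple greedy rule: take models in descending order of known daily cap until coverage crosses the threshold. A sketch under that assumption (the actual script may differ in detail):

```python
def shortlist(rows, threshold=0.8):
    """rows: (model, known daily token cap) pairs.
    Greedily pick the largest caps until at least `threshold`
    of the total known capacity is covered."""
    total = sum(cap for _, cap in rows)
    picked, covered = [], 0
    for name, cap in sorted(rows, key=lambda r: r[1], reverse=True):
        if covered >= threshold * total:
            break
        picked.append(name)
        covered += cap
    return picked, covered / total
```

With the long-context caps above (250M, 62.5M, 25M) this returns two models at 92.6% coverage, matching the table row.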

Fast Recommendation (80% rule, known-cap data)

| Goal | Recommended free-first models | Why this set |
| --- | --- | --- |
| Coding | Google Gemini — Gemini 2.5 Flash; Google Gemini — Gemini 2.5 Pro | Reaches 95.9% of known daily capacity (91.2M total known). |
| Verification | Google Gemini — Gemini 2.5 Pro | Reaches 100.0% of known daily capacity (25M total known). |
| Reasoning | Google Gemini — Gemini 2.5 Pro | Reaches 99.6% of known daily capacity (25.1M total known). |

Google Gemini

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Gemini 2.5 Pro | 1M / 5 / 100 / 250K / ~25M | Verified | CODE VERIFY REASON LONGCTX. Account: Google. Quality: Excellent. Regional/compliance rules may affect access. |
| Gemini 2.5 Flash | 1M / 10 / 250 / 250K / ~62.5M | Verified | CODE FAST LONGCTX BUDGET. Account: Google. Quality: Very Good. Official free-tier RPM is 10 (not 15). |
| Gemini 2.5 Flash-Lite | 1M / 15 / 1000 / 250K / ~250M | Verified | FAST LONGCTX BUDGET. Account: Google. Quality: Good. Highest listed Gemini free-tier RPD. |

OpenAI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| GPT-4o | 128K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON. Account: OpenAI. Quality: Excellent. No standing public free API tier documented. |
| GPT-4o mini | 128K / tier / tier / tier / tier | Partially verified | CODE FAST BUDGET. Account: OpenAI. Quality: Very Good. Limits vary by trust tier/account status. |

Anthropic Claude

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Claude 3.5 Sonnet | 200K / tier / tier / tier / tier | Partially verified | CODE VERIFY REASON LONGCTX. Account: Anthropic console. Quality: Excellent. Tier 1 needs purchased credits. |
| Claude 3 Haiku | 200K / tier / tier / tier / tier | Partially verified | FAST BUDGET VERIFY. Account: Anthropic console. Quality: Good. Fixed global daily caps are not published. |

Groq

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 3.3 70B (llama-3.3-70b-versatile) | 128K / 30 / 1000 / 12K / 100K | Verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Very Good. Official row differs from older community numbers. |
| Llama 4 Maverick 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Good. Fast inference; limits can change between revisions. |
| Qwen3 32B | 128K / 30 / 14,400 / 6K / 500K | Partially verified | FAST CODE OPEN BUDGET. Account: Groq (no CC for free tier). Quality: Good. Model IDs/limits may shift by release. |
| Llama 4 Scout 17B | 128K / 30 / 1000 / 6K / 500K | Verified | FAST CODE OPEN BUDGET. Account: Groq. 16E MoE variant optimized for low-latency tasks. |
| GPT OSS 120B | 128K / 30 / 1000 / 12K / 100K | Verified | CODE REASON. Account: Groq. High-performance open-weights model. |
| Kimi K2 (kimi-k2-0905) | 256K / n/p / n/p / n/p / n/p | Verified | LONGCTX REASON. Account: Groq. 1T-parameter MoE with 256K context. |
| Compound AI (groq/compound) | 128K / 30 / 250 / 70K / n/p | Verified | FAST REASON CODE. Account: Groq (no CC for free tier). Quality: Good. Official docs do not publish TPD. |

Kiro

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Auto (Frontier Mix) | n/p / n/p / n/p / n/p / 50 credits | Verified | CODE FAST BUDGET. Account: Kiro. Mixed agent using frontier models. Free tier includes 50 monthly credits. |

Together AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 4 Maverick | 131K / tier / tier / tier / tier | Verified | CODE REASON OPEN. Account: Together + paid credits. Quality: Very Good. No standing free trial in current docs. |
| DeepSeek V3.1 | 64K / tier / tier / tier / tier | Verified | CODE REASON OPEN. Account: Together + paid credits. Quality: Excellent. "$100 signup credits" not confirmed in current docs. |
| Mistral Small 3 | 128K / tier / tier / tier / tier | Verified | CODE OPEN. Account: Together + paid credits. Quality: Good. Limits are spend/account-tier dependent. |

Hugging Face

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Various open models | varies / provider / provider / provider / credit-based | Verified | OPEN BUDGET CODE. Account: Hugging Face. Quality: Varies. Limits depend on routed provider and plan. |
| Pro-tier routed providers | varies / higher / provider / provider / credit-based | Verified | OPEN CODE VERIFY. Account: Hugging Face Pro. Quality: Very Good. Pro includes higher monthly credits. |

Mistral AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Mistral Nemo 12B | model / plan / plan / plan / plan | Partially verified | CODE OPEN BUDGET. Account: Mistral (Experiment/Scale). Quality: Good. Free Experiment plan exists; quotas are dynamic. |
| Mistral Small 3.1 | 128K / plan / plan / plan / plan | Partially verified | CODE OPEN. Account: Mistral (Experiment/Scale). Quality: Good. Access/limits depend on plan tier. |
| Codestral | 32K / plan / plan / plan / plan | Partially verified | CODE VERIFY OPEN. Account: Mistral (Experiment/Scale). Quality: Excellent. Code-oriented with plan gating. |
| Mistral Large 3 | 128K / plan / plan / plan / plan | Partially verified | CODE REASON VERIFY. Account: Mistral (Experiment/Scale). Quality: Excellent. Most capable tier usually paid. |

DeepSeek

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| DeepSeek V3.2 (deepseek-chat) | 128K / n/p / n/p / n/p / n/p | Partially verified | CODE REASON BUDGET. Account: DeepSeek. Quality: Excellent. Pricing is public; fixed free quotas are not. |
| DeepSeek R1 / reasoner | 128K / n/p / n/p / n/p / n/p | Unverified | REASON VERIFY CODE. Account: DeepSeek. Quality: Excellent. "5M signup tokens" is not confirmed in official docs. |

Cohere

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Command R7B | 128K / 20 / ~1,000/mo / endpoint / 1,000/mo | Verified | VERIFY FAST. Account: Cohere trial key. Quality: Good. Free trial usage is heavily rate-limited. |
| Command R+ | 128K / 20 / ~1,000/mo / endpoint / 1,000/mo | Verified | VERIFY FAST CODE. Account: Cohere trial key. Quality: Very Good. Trial cap is account-wide per month. |

OpenRouter

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Qwen3 Coder 480B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Excellent. Free limits are account-plan based. |
| GPT-OSS-120B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Very Good. Free variants can rotate. |
| Llama 3.3 70B (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Very Good. Free router pool is dynamic. |
| Mistral Small 3.1 (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | CODE OPEN BUDGET. Account: OpenRouter. Quality: Good. Best for low-volume testing. |
| DeepSeek R1 (:free variant when available) | model / 20 / 50/day (<$10) or 1000/day (>= $10) / n/p / n/p | Verified | REASON VERIFY BUDGET. Account: OpenRouter. Quality: Excellent. Current docs differ from older community RPD values. |
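OpenRouter's :free variants are addressed by suffixing the model ID. A minimal request-builder sketch (the model ID in the usage note is an example; free variants rotate as noted above, so check current availability first):

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def free_variant(model_id: str) -> str:
    """Append OpenRouter's ':free' suffix unless it is already present."""
    return model_id if model_id.endswith(":free") else model_id + ":free"

def build_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for a :free variant."""
    payload = {
        "model": free_variant(model_id),
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

For example, build_request("deepseek/deepseek-r1", "ping") targets deepseek/deepseek-r1:free; sending it with urllib.request.urlopen counts against the 50/day or 1000/day account limit.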

Cerebras

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Llama 4 Maverick 400B | 128K (paid) / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN. Account: Cerebras. Quality: Very Good. Context/limits vary by tier and model page. |
| Qwen3 Coder 235B | 64K free, 131K paid / 30 / 14,400 / 60K / 1M | Partially verified | FAST CODE OPEN. Account: Cerebras. Quality: Excellent. Verify live limits on model page. |
| Llama 3.1 8B | 8K free, 32K paid / 30 / 14,400 / 60K / 1M | Verified | FAST BUDGET OPEN. Account: Cerebras. Quality: Good. Free-tier limits are explicitly published. |

xAI Grok

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| Grok 4.1 Fast | model / credit / credit / credit / credit | Unverified | REASON VERIFY. Account: xAI. Quality: Very Good. Promotional credits may exist; fixed "$25 startup credits" not consistently documented. |

MiniMax

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| MiniMax-M2.5 (Coding Plan Starter) | 200K / 40 prompts per 5h / n/p / n/p / n/p | Verified | CODE BUDGET REASON. Account: MiniMax. Quality: Excellent. Optimized for coding. Fixed-fee subscription. |
| MiniMax-M2.5 (Pay-as-you-go) | 200K / plan / plan / plan / plan | Verified | CODE FAST. Account: MiniMax. Quality: Excellent. Competitive RMB pricing (2.1/8.4 per 1M tokens). |

Moonshot AI

| Model | Quotas | Verification | Summary |
| --- | --- | --- | --- |
| moonshot-v1-128k | 128K / tier / tier / tier / tier | Partially verified | LONGCTX REASON. Account: Moonshot AI. Quality: Very Good. Known for pioneering long-context stability. |

Notes and caveats

  • Provider pricing and free-tier rules change frequently; always verify from official links before budgeting.
  • Several vendors distinguish between product-level free plans and API-level free access.
  • Account-level cloud credits (Azure/AWS/GCP) are not equivalent to a provider-specific API free tier.

Maintenance protocol

When updating this page:

  1. Validate each row against official docs/pricing pages.
  2. Update Free tier / trial, Evidence summary, and Last verified.
  3. Add providers only when official pricing and docs links are stable.
  4. Use Unclear when evidence is ambiguous.
  5. Regenerate capacity summaries with python3 scripts/update_api_pricing_capability_summary.py.

Sources / References

Contribution Metadata

  • Last reviewed: 2026-03-03
  • Confidence: medium