What is input price per million tokens?

The simple definition

Input price per million tokens is the cost charged by an API provider for every one million tokens you send to the model. This is the cost of asking the question. Every character of your prompt, system instructions, conversation history, and attached documents is tokenised and counted. The provider multiplies your total input token count by the published rate and charges accordingly.

The unit "per million tokens" became the industry standard because individual tokens cost fractions of a cent. Expressing the price per million makes the numbers human-readable: $5.00 per million tokens is easier to reason about than $0.000005 per token.
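The arithmetic is simple enough to sketch in a few lines. The $5.00 rate below is illustrative (roughly GPT-4o-class input pricing at the time of writing), not a quote:

```python
# Cost of the input side of one request at a per-million-token rate.
PRICE_PER_MILLION = 5.00  # USD per 1M input tokens (illustrative rate)

def input_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for `tokens` input tokens at `price_per_million` USD/1M."""
    return tokens / 1_000_000 * price_per_million

# A 10,000-token prompt at $5.00 per million tokens costs about five cents.
print(f"${input_cost(10_000, PRICE_PER_MILLION):.2f}")  # $0.05
```

The division by one million is the whole trick: the published rate is per million tokens, so a single prompt almost always costs a fraction of a cent to a few cents.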

Why it matters

For applications that process large volumes of text — document analysis, classification, search indexing, customer support automation — input costs are often the dominant API expense. When you send a 10,000 token document to a model and receive a 500 token summary, 95% of your tokens (and a large share of your cost) are on the input side. At scale, a 10x difference in input price between two models with comparable quality can mean the difference between a viable product and an unprofitable one.

Input pricing also affects architecture decisions. Expensive input tokens incentivise shorter prompts, aggressive caching, and retrieval-augmented generation (sending only relevant chunks rather than full documents). Cheap input tokens make it viable to include more context, longer system prompts, and richer conversation histories. The price shapes the engineering.

The price collapse

LLM input prices have undergone one of the fastest price collapses in technology history. GPT-3 launched in June 2020 at $60.00 per million input tokens. By December 2024, DeepSeek V3 offered comparable general-purpose capability at $0.27 per million tokens — a 97% reduction in under five years. The deflation was not gradual. It came in sharp steps, each driven by a new model generation or a new competitor entering the market.

Model           Date       Input price / 1M tokens
GPT-3           Jun 2020   $60.00
GPT-3.5 Turbo   Mar 2023   $2.00
GPT-4           Mar 2023   $30.00
Claude 2        Jul 2023   $8.00
GPT-4 Turbo     Nov 2023   $10.00
GPT-4o          May 2024   $5.00
GPT-4o mini     Jul 2024   $0.15
DeepSeek V3     Dec 2024   $0.27
[Chart: Input price per million tokens — price collapse timeline. Log-scale bar chart from GPT-3 at $60.00 in June 2020 to DeepSeek V3 at $0.27 in December 2024, a 97% reduction in under five years.]

Cost calculator example

To make pricing concrete, here is a worked example. Imagine an application making 10,000 API calls per day, with an average of 500 input tokens and 200 output tokens per call. That produces 5 million input tokens and 2 million output tokens daily.

Model                                  Daily cost   Monthly cost
GPT-4o      ($5.00 in / $15.00 out)    $55.00       ~$1,650
GPT-4o mini ($0.15 in / $0.60 out)     $1.95        ~$60
DeepSeek V3 ($0.27 in / $1.10 out)     $3.55        ~$105
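The daily and monthly figures can be reproduced with a short script. The rates are copied from the example and will drift as providers reprice, so treat the output as a snapshot, not a live quote:

```python
# Daily cost for 10,000 calls/day averaging 500 input and 200 output tokens:
# 5M input tokens and 2M output tokens per day.
INPUT_MTOK, OUTPUT_MTOK = 5, 2  # millions of tokens per day

# (input price, output price) in USD per 1M tokens, as in the example above.
PRICES = {
    "GPT-4o": (5.00, 15.00),
    "GPT-4o mini": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
}

for model, (p_in, p_out) in PRICES.items():
    daily = INPUT_MTOK * p_in + OUTPUT_MTOK * p_out
    print(f"{model}: ${daily:.2f}/day, ~${daily * 30:,.0f}/month")
```

Swapping in current rates for any model in the directory gives an instant like-for-like comparison for the same workload shape.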

Same workload, same call volume, same token counts — but the monthly bill ranges from $60 to $1,650 depending on which model you choose. This is why input pricing is one of the most important attributes in the LLM directory.

Input vs output — why the difference

Output tokens are typically 3-5x more expensive than input tokens. The reason is computational: processing input tokens can be parallelised efficiently — the model reads all input tokens at once. Generating output tokens is sequential — each new token depends on all previous tokens, requiring a separate forward pass through the model for every token produced. This sequential generation is the bottleneck, and the higher output price reflects the higher compute cost per token.

For cost optimisation, this asymmetry matters. Applications that send long prompts but receive short responses (classification, extraction, scoring) are input-heavy and benefit most from low input prices. Applications that generate long-form content (writing, code generation, analysis) are output-heavy and should pay close attention to the output price. Most real workloads are a mix of both.
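A quick way to feel the asymmetry is to price two hypothetical workload shapes at the same illustrative rates ($5.00 in / $15.00 out, assumed for the example):

```python
# Blended per-call cost at illustrative rates (not a quote from any provider).
P_IN, P_OUT = 5.00, 15.00  # USD per million tokens

def call_cost(in_tokens: int, out_tokens: int) -> float:
    """USD cost of one call given its input and output token counts."""
    return (in_tokens * P_IN + out_tokens * P_OUT) / 1_000_000

# Input-heavy: classify a 4,000-token document into a 10-token label.
classification = call_cost(4_000, 10)
# Output-heavy: a 200-token prompt that generates a 2,000-token draft.
generation = call_cost(200, 2_000)

print(f"classification: ${classification:.5f}")  # input tokens dominate
print(f"generation:     ${generation:.5f}")      # output tokens dominate
```

For the classification call, over 99% of the cost sits on the input side despite output tokens being three times the price; for the generation call, the ratio flips. This is why the two prices should be weighed against your actual token mix, not compared in isolation.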

How sourc.dev tracks it

sourc.dev checks the pricing page of every provider in the LLM directory daily. When a price changes, we record the new value, the date, and the source URL. The old value is preserved in the history table — we never overwrite. This means you can see the complete pricing timeline for any model: when it launched, when it was discounted, when a cheaper successor replaced it. The timeline is the asset.

Frequently asked questions

What is the difference between input price and output price?

Input price is what you pay to send text to the model — your prompt, system instructions, conversation history, and any documents. Output price is what you pay for the text the model generates in response. Output tokens are typically 3-5x more expensive than input tokens because generating text requires sequential computation that cannot be parallelised.

How do I calculate my monthly API cost?

Estimate your daily call volume, average input tokens per call, and average output tokens per call. Multiply to get daily token volumes for input and output separately. Divide each by one million and multiply by the respective price per million tokens. Add the two together for your daily cost, then multiply by 30 for monthly. Most providers offer usage dashboards that show actual consumption.
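The steps above can be written as a single function. The example call uses the same hypothetical workload as earlier (10,000 calls/day, 500 input / 200 output tokens) at GPT-4o-style rates:

```python
# Monthly cost estimate from call volume, average token counts, and rates.
def monthly_cost(calls_per_day: int,
                 avg_in_tokens: int, avg_out_tokens: int,
                 price_in: float, price_out: float,
                 days: int = 30) -> float:
    """Estimated monthly USD cost; prices are USD per 1M tokens."""
    daily_in_mtok = calls_per_day * avg_in_tokens / 1_000_000
    daily_out_mtok = calls_per_day * avg_out_tokens / 1_000_000
    daily = daily_in_mtok * price_in + daily_out_mtok * price_out
    return daily * days

print(monthly_cost(10_000, 500, 200, 5.00, 15.00))  # 1650.0
```

Treat the result as an estimate: real bills also reflect retries, prompt growth over time, and any per-feature surcharges, so reconcile against the provider's usage dashboard.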

Why do prices vary so much between providers?

Price reflects model size, infrastructure costs, provider margin, and competitive strategy. Open-weight models like DeepSeek and Llama can be self-hosted, which puts downward pressure on hosted pricing. Frontier models command premium prices because they offer capabilities that smaller models cannot match. The market is intensely competitive and prices shift frequently.

What caused the 97% price reduction?

Three forces drove the collapse. First, hardware efficiency — newer GPU generations deliver more inference throughput per dollar. Second, model architecture improvements — techniques like grouped-query attention, mixture-of-experts, and quantisation reduce compute per token. Third, competition — open-weight models from Meta, Mistral, and DeepSeek forced proprietary providers to cut prices aggressively.

Are there hidden costs beyond token pricing?

Yes. Token pricing is the direct API cost, but total cost of ownership includes: fine-tuning costs, embedding costs for RAG pipelines, storage for conversation logs, engineering time to manage rate limits and retries, and evaluation and testing costs. Some providers charge differently for features like function calling, image inputs, or batch processing. Always check the full pricing page.

Will LLM prices continue to fall?

Historical data strongly suggests continued price deflation for equivalent capability. Every generation of hardware and model architecture delivers more performance per dollar. However, frontier models will likely continue to launch at premium prices before being undercut by the next generation. The pattern resembles semiconductor pricing: last year's cutting edge becomes this year's commodity.

See all LLMs ranked by input price →