Input price (per 1M tokens)
Cost in USD to send 1 million input tokens
What is input price (per 1m tokens)?
Input price is what you pay per million tokens sent to the model — your prompts, instructions, and context. The cheapest models on sourc.dev cost $0.075 per million input tokens (Gemini 1.5 Flash). The most expensive cost $60 (legacy GPT-3 Davinci). That is an 800× range for the same unit of work. If you are sending large documents or long system prompts, input price is the number that determines whether your application is viable at scale.
Why it matters
Here is a concrete example. You have an application that sends a 2,000-token system prompt with every request. At $3.00/1M (Claude 3.5 Sonnet), that system prompt costs $0.006 per call. At $0.075/1M (Gemini 1.5 Flash), it costs $0.00015 per call. Over 1 million calls, the difference is $5,850. Same system prompt. Same task. The input price is the number your CFO will eventually ask about. sourc.dev tracks input_price_per_1m for all published models, verified from primary sources. This feeds the sourc Value Index and Price Deflation Index.
Where models stand
Data available for 34 of 271 tracked entities.
How sourc.dev tracks this
sourc.dev tracks input price (per 1m tokens) through its automated monitoring pipeline. Data is collected on a regular schedule, compared against previous values, and any changes are recorded in the history table with full provenance — source URL, effective date, and verification timestamp. Nothing is overwritten. The pipeline ensures this attribute stays current without manual intervention.
Because individual API calls cost fractions of a cent. Quoting per million tokens makes comparison practical. A typical short prompt might use 500 tokens, costing $0.0005 at $1/M — too small to reason about without the per-million convention.
Yes, for virtually every commercial LLM API. Output generation requires more computation than reading input. The ratio varies — some models charge 3x more for output, others 5x or more.
Some providers offer reduced pricing for cached or repeated input tokens. Anthropic and OpenAI both have caching mechanisms that can reduce input costs by 50-90% for repeated context. Check provider documentation for current caching policies.