#1 of 25

Token

You're probably spending more than you need to — here's how to fix it

What is a token?

You already know what a token is. You just don't know you know.

When autocomplete on your phone suggests the next word, it's predicting the next token. When you type "I'm" instead of "I am", you may have used fewer tokens, depending on how the tokeniser splits the contraction. A token is the smallest chunk of text a language model processes. Not always a full word. Sometimes part of one. Sometimes punctuation on its own.

In English, 100 tokens is roughly 75 words — or about half this page. And you just learned what a token is by reading it.
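If you want to see that ratio on your own text, here is a minimal sketch using OpenAI's open-source tiktoken library (the tokeniser documentation cited at the bottom of this page). Other providers use their own tokenisers, so treat the counts as an estimate, not an invoice.

    # Count tokens with OpenAI's tiktoken tokeniser.
    # Other models split text differently, so this is an estimate.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    text = (
        "A token is the smallest chunk of text a language model "
        "processes. Not always a full word. Sometimes part of one."
    )

    tokens = enc.encode(text)
    words = text.split()

    print(f"{len(words)} words -> {len(tokens)} tokens")
    print(f"about {len(tokens) / len(words):.2f} tokens per word")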

The number that makes it real

Every model on sourc.dev is priced in tokens. Claude 3.5 Sonnet costs $3.00 per 1M input tokens. GPT-4o costs $5.00 per 1M. So a typical API call, a 500-word prompt plus a 300-word response, is a bit over 1,000 tokens. At the $3.00 rate that comes to roughly $0.003. About a third of a cent. Verified March 2026.

That sounds cheap. It is, per call. At scale it is the number your budget lives or dies by.
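Here is that third of a cent as arithmetic, a sketch built from the figures above. Prices change, so check the live pricing page, and note that providers bill output tokens at a separate, usually higher, rate that this simplified estimate ignores.

    # Back-of-the-envelope cost estimate from the figures above:
    # ~100 tokens per 75 English words, $3.00 per 1M input tokens.
    # Real bills also include output tokens at their own rate.
    TOKENS_PER_WORD = 100 / 75
    PRICE_PER_TOKEN = 3.00 / 1_000_000

    def estimated_cost(words: int) -> float:
        """Rough dollar cost of `words` worth of text at the input rate."""
        return words * TOKENS_PER_WORD * PRICE_PER_TOKEN

    call = estimated_cost(500 + 300)   # 500-word prompt + 300-word response
    print(f"one call: ${call:.4f}")    # roughly $0.0032, about a third of a cent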

Why this matters to you

Here is a situation that happens all the time.

You have a 100-page document. You want the model to help you with page 47. So you paste all 100 pages. The model reads all 100 pages. You pay for all 100 pages. Every call.

A 100-page document is roughly 25,000 tokens. At $3.00/1M input that is $0.075 per call. Run that 10,000 times — $750. For pages the model never needed.

Now do it right. Send only the relevant section. 2 pages. 500 tokens. Same rate. $0.0015 per call. 10,000 times — $15.

Same answer. 98% cheaper. The model did not need the other 98 pages. Now you know that.
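The same arithmetic end to end, as a sketch. The token counts are the rough figures used above.

    # Whole document vs. relevant section, at $3.00 per 1M input tokens.
    PRICE_PER_TOKEN = 3.00 / 1_000_000
    CALLS = 10_000

    scenarios = {
        "all 100 pages": 25_000,    # tokens sent per call
        "relevant 2 pages": 500,
    }

    for label, tokens in scenarios.items():
        per_call = tokens * PRICE_PER_TOKEN
        total = per_call * CALLS
        print(f"{label}: ${per_call:.4f} per call, ${total:,.2f} over {CALLS:,} calls")
    # all 100 pages: $0.0750 per call, $750.00 over 10,000 calls
    # relevant 2 pages: $0.0015 per call, $15.00 over 10,000 calls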

How to use this

Send only what the model needs to answer the question. If you are asking about a function on line 240, send that function — not the whole file. If your system prompt repeats on every call, every unnecessary word in it costs you on every request, forever. Trim it once and save across a million calls.

Precise beats thorough. Every time.
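Here is a concrete sketch of "send the function, not the file". The toy source and the function name are stand-ins for your real code; the point is that the prompt carries only the function you are asking about instead of the whole module.

    # Hypothetical sketch: put only the relevant function in the prompt.
    import re

    def extract_function(source: str, name: str) -> str:
        """Naive extraction of a top-level Python def and its indented body."""
        pattern = rf"^def {re.escape(name)}\(.*?(?=^\S|\Z)"
        match = re.search(pattern, source, flags=re.DOTALL | re.MULTILINE)
        return match.group(0) if match else source  # fall back to everything

    source = """\
    def load_config(path):
        return open(path).read()

    def parse_invoice(text):
        lines = text.splitlines()
        return {"lines": len(lines)}

    def main():
        print(parse_invoice(load_config("invoice.txt")))
    """

    snippet = extract_function(source, "parse_invoice")
    prompt = f"Explain what this function does:\n\n{snippet}"
    print(prompt)  # the prompt now holds one small function, not the whole module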

The multilingual note

Tokens are not equal across languages. English runs at a little over one token per word. Finnish, Turkish, and Arabic tokenise less efficiently, and the same meaning can cost 40–60% more tokens. If you are building for multilingual users, this is not a footnote. It is a line in your cost model.
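If multilingual traffic matters to you, measure it rather than guess. Another sketch, with tiktoken standing in for your model's actual tokeniser; the translations here are illustrative, so swap in real translations of your own prompts.

    # Compare how many tokens the same request costs in different languages.
    # Illustrative strings; use your own prompts and your provider's tokeniser.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    samples = {
        "English": "Please summarise the attached report in three bullet points.",
        "Finnish": "Tiivistä oheinen raportti kolmeen ydinkohtaan.",
        "Turkish": "Ekteki raporu üç maddede özetleyin.",
    }

    baseline = len(enc.encode(samples["English"]))
    for lang, text in samples.items():
        n = len(enc.encode(text))
        print(f"{lang}: {n} tokens ({n / baseline:.0%} of the English count)")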

Verified March 2026 · Source: OpenAI tokeniser documentation, Anthropic pricing page
