What does 70B mean
The number in the model name — once you understand it, model comparisons make more sense
When you see Llama 3.3 70B, the 70B is the parameter count. 70 billion parameters. It is not a version number. It is not a benchmark score. It is a measure of the model's size — specifically, the number of numerical values learned during training that determine how the model responds.
A parameter is a weight in a neural network: a single number, adjusted billions of times during training until the model's outputs match what good outputs should look like. The final values of all those numbers, frozen after training, are the model weights. The count of those numbers is the parameter count.
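To make the counting concrete, here is a minimal sketch, not from the source: it tallies the parameters of two linear layers with illustrative sizes (the layer dimensions are assumptions, chosen only to show how weights and biases add up into the millions).

```python
# Count the parameters of a tiny two-layer feed-forward block.
# A linear layer maps n_in inputs to n_out outputs:
#   weights: n_in * n_out numbers, plus one bias per output.
def linear_params(n_in: int, n_out: int) -> int:
    return n_in * n_out + n_out  # weight matrix + bias vector

# Illustrative layer sizes (hypothetical, for demonstration only).
layers = [(4096, 11008), (11008, 4096)]
total = sum(linear_params(n_in, n_out) for n_in, n_out in layers)
print(f"{total:,} parameters")  # roughly 90 million for just these two layers
```

A full model is hundreds of such layers stacked together; the billions in "70B" are the same arithmetic repeated at scale.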
The A-to-B that makes it concrete
GPT-3 in 2020: 175 billion parameters. Trained on approximately 45TB of text. Took weeks to train on hundreds of GPUs. Cost estimated at several million dollars. State of the art for its time.
Llama 3.3 70B in 2024: 70 billion parameters. Runs on a single high-end consumer GPU. Available as open weights. Performance competitive with models that were closed, expensive, and inaccessible three years earlier.
The number went down. The capability did not. Training efficiency improved faster than parameter counts grew.
What the number tells you — and what it does not
More parameters generally means more capability, more nuance, better reasoning. A 70B model is generally more capable than a 7B model from the same family. A 405B model is generally more capable than a 70B.
But parameter count across model families is not directly comparable. A 7B model from 2024 can outperform a 70B model from 2022. Architecture improvements, training data quality, and fine-tuning all matter as much as raw size.
The number is a useful guide within a model family. Across families, compare benchmark scores and test on your actual task.
Why this matters to you
Parameter count predicts hardware requirements for open weights models. A 7B model runs on a laptop. A 70B model needs a serious GPU. A 405B model needs multiple GPUs. If you are evaluating whether to self-host a model, the parameter count is the first number to check against your available hardware.
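The hardware check above comes down to one multiplication. This sketch uses a common rule of thumb (an assumption, not a figure from the source): bytes of memory for the weights alone equal parameter count times bytes per parameter, ignoring activations and KV cache, which add more on top.

```python
# Rough memory needed just to store the weights.
# bytes_per_param: 2 for fp16/bf16, 1 for 8-bit quantization, 0.5 for 4-bit.
def weight_gb(n_params: float, bytes_per_param: float = 2) -> float:
    return n_params * bytes_per_param / 1e9  # gigabytes

for name, n in [("7B", 7e9), ("70B", 70e9), ("405B", 405e9)]:
    print(f"{name} at fp16: ~{weight_gb(n):.0f} GB of weights")
```

At fp16, a 70B model needs roughly 140 GB for weights alone, which is why it takes a serious GPU (or aggressive quantization) while a 4-bit 7B model fits comfortably on a laptop.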
For hosted APIs, parameter count correlates loosely with price and latency. Larger models cost more and respond more slowly. Smaller models are cheaper and faster. The right size is the smallest one that does your task well.
Verified March 2026 · Source: Meta AI, Anthropic, OpenAI model documentation
# REVIEW NOTES — what to check
Voice consistency:
- Does each page open with something that earns continued reading?
- Does the "why it matters to you" section feel personal, not generic?
- Are the analogies doing work or just decorating?

Length:
- Shortest page: streaming (~350 words). Does it feel complete?
- Longest page: hallucination (~550 words). Does it earn the length?
- Any page that feels padded — flag the section, not the page

Tone:
- Positive framing holding throughout?
- Any line that feels like a warning rather than an enabler?
- Data jokes: token page (none yet — add one?), context window (earned), input price (none — correct)

Technical accuracy:
- Prices are from March 2026 sourc.dev data — verify before publish
- Benchmark scores need verification against latest entity page data
- Rate limit numbers are illustrative — replace with verified figures per model

Next steps after review:
1. Approve / adjust / rewrite marks per page
2. Identify the 3-5 where voice needs the most work
3. Revise those together before pipeline prompt is finalised
4. Pipeline prompt gets locked based on approved pages
5. Remaining 101 terms generated by pipeline, reviewed by Fredrik
*sourc-glossary-first-25-v1.md*
*March 2026 — HODLR & CO Labs*
*First draft — voice and process review before build*