Speed (tokens/sec)

Average output generation speed in tokens per second

What is speed (tokens/sec)?

Speed in tokens per second (TPS) measures how fast a model generates output tokens. Higher TPS means a complete response reaches the user sooner. Groq's custom inference hardware delivers extremely high TPS on individual requests, while cloud providers such as OpenAI and Anthropic optimise for concurrent throughput rather than individual request speed. The tradeoff is fast individual requests (Groq) versus high aggregate capacity (OpenAI, Anthropic).

Why it matters

Speed determines user experience in real-time applications and throughput in batch processing. A model generating at 100 TPS produces a 500-token response in 5 seconds (500 / 100). At 20 TPS, the same response takes 25 seconds (500 / 20). For interactive applications, anything below 30 TPS feels sluggish. sourc.dev tracks speed_tps where providers publish it.
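The arithmetic above can be reproduced with a short sketch. It assumes a steady generation rate and ignores queueing and time to first token, so real responses will usually take somewhat longer. The function name `generation_seconds` is illustrative and not part of any sourc.dev API.

```python
def generation_seconds(output_tokens: int, tps: float) -> float:
    """Seconds needed to generate output_tokens at a steady rate of tps tokens/sec.

    Assumption: constant rate, no queueing delay, no time to first token.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return output_tokens / tps


# The two cases from the text: a 500-token response at 100 TPS and at 20 TPS.
for tps in (100, 20):
    seconds = generation_seconds(500, tps)
    print(f"{tps} TPS: {seconds:.0f} s for a 500-token response")
```

The same function can be used to estimate batch runtimes by passing the total number of output tokens across all requests, provided the requests run one after another.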

Where models stand

No data available yet for this metric.

How sourc.dev tracks this

sourc.dev tracks speed (tokens/sec) through its automated monitoring pipeline. Data is collected on a regular schedule and compared against the previous values. Any change is recorded as a new row in the history table with full provenance: source URL, effective date, and verification timestamp. Nothing is overwritten, and the attribute stays current without manual intervention.
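The compare-then-append behaviour described above could look roughly like the sketch below. This is not sourc.dev's actual implementation: the `HistoryRow` schema, the `record_if_changed` function, the in-memory list standing in for the history table, and the example model name and URL are all hypothetical. Only the provenance fields (source URL, effective date, verification timestamp) and the `speed_tps` attribute name come from the text.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class HistoryRow:
    """One immutable history entry (hypothetical schema)."""
    model: str
    attribute: str
    value: float
    source_url: str
    effective_date: date
    verified_at: datetime


def record_if_changed(history, model, value, source_url, effective_date,
                      attribute="speed_tps"):
    """Append a row only when value differs from the latest recorded value.

    Returns True if a row was appended. Existing rows are never modified.
    """
    latest = next(
        (row for row in reversed(history)
         if row.model == model and row.attribute == attribute),
        None,
    )
    if latest is not None and latest.value == value:
        return False  # unchanged since the last check, nothing to record
    history.append(HistoryRow(model, attribute, value, source_url,
                              effective_date, datetime.now(timezone.utc)))
    return True


# Hypothetical model name and source URL, used only for illustration.
history = []
url = "https://example.com/specs"
record_if_changed(history, "example-model", 100.0, url, date(2025, 1, 1))
record_if_changed(history, "example-model", 100.0, url, date(2025, 2, 1))  # unchanged, skipped
record_if_changed(history, "example-model", 120.0, url, date(2025, 3, 1))
print([row.value for row in history])
```

Because rows are frozen and only ever appended, every earlier value remains available alongside its source and verification time.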

Frequently asked questions

How does sourc.dev measure speed (tokens/sec)?

sourc.dev monitors this attribute automatically through its pipeline. Every data point includes a source URL and a verification date. Changes are recorded in the history table, and nothing is overwritten.

How often is speed (tokens/sec) updated?

This attribute is monitored on a regular schedule by an automated pipeline. Changes are detected and recorded automatically.

Why does speed (tokens/sec) matter for developers?

Understanding speed (tokens/sec) helps developers make informed decisions when choosing between models and providers. Rather than relying on marketing claims, developers can base that choice on the verified, dated, source-linked data that sourc.dev provides.