Speed (tokens/sec)
Average output generation speed in tokens per second
What is speed (tokens/sec)?
Speed in tokens per second (TPS) measures how fast a model generates output. Higher TPS means a shorter wait for the complete response. Groq's custom hardware delivers very high TPS on individual requests. Cloud providers such as OpenAI and Anthropic tend to optimise for concurrent throughput rather than individual request speed. The tradeoff is fast individual requests (Groq) versus high aggregate capacity (OpenAI).
Why it matters
Speed determines user experience in real-time applications and throughput in batch processing. Generation time is roughly the number of output tokens divided by TPS: a model at 100 TPS generates a 500-token response in 5 seconds, while at 20 TPS the same response takes 25 seconds. As a rule of thumb, interactive applications start to feel sluggish below about 30 TPS. sourc.dev tracks speed_tps where providers publish it.
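The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of sourc.dev: the function names generation_seconds and measured_tps are hypothetical, and the calculation ignores time to first token and network overhead.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate how long a response takes to generate at a given TPS.

    Assumption: a constant generation rate, with no time-to-first-token
    or network overhead included.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def measured_tps(output_tokens: int, elapsed_seconds: float) -> float:
    """Derive TPS from an observed token count and wall-clock duration."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return output_tokens / elapsed_seconds


if __name__ == "__main__":
    # The two cases from the text: a 500-token response at 100 TPS and 20 TPS.
    for tps in (100, 20):
        seconds = generation_seconds(500, tps)
        print(f"{tps} TPS: {seconds:.0f} s for 500 tokens")
```

The same relationship runs in reverse: timing a response and dividing its token count by the elapsed seconds gives an observed TPS that can be compared against a published figure.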
Where models stand
No data available yet for this metric.
How sourc.dev tracks this
sourc.dev tracks speed (tokens/sec) through its automated monitoring pipeline. Data is collected on a regular schedule and compared against previous values. Any change is recorded in the history table with full provenance: source URL, effective date, and verification timestamp. Nothing is overwritten, and the attribute stays current without manual intervention.
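One way to picture the "compare, then append" behaviour described above is the sketch below. It is an assumption-laden illustration, not sourc.dev's actual pipeline or schema: HistoryRow, record_if_changed, the in-memory list standing in for the history table, and the example.com URLs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class HistoryRow:
    """Hypothetical history-table row; field names are illustrative only."""
    model: str
    attribute: str
    value: float
    source_url: str
    effective_date: date
    verified_at: datetime


def record_if_changed(
    history: List[HistoryRow],
    model: str,
    attribute: str,
    value: float,
    source_url: str,
    effective_date: date,
    verified_at: Optional[datetime] = None,
) -> bool:
    """Append a new row only when the value differs from the latest one.

    Existing rows are never modified or removed, so the list is append-only.
    Returns True if a row was appended, False if the value was unchanged.
    """
    latest = next(
        (r for r in reversed(history)
         if r.model == model and r.attribute == attribute),
        None,
    )
    if latest is not None and latest.value == value:
        return False
    history.append(
        HistoryRow(
            model=model,
            attribute=attribute,
            value=value,
            source_url=source_url,
            effective_date=effective_date,
            verified_at=verified_at or datetime.now(timezone.utc),
        )
    )
    return True


if __name__ == "__main__":
    table: List[HistoryRow] = []
    # Placeholder model name, values, and URLs; not real measurements.
    record_if_changed(table, "example-model", "speed_tps", 100.0,
                      "https://example.com/v1", date(2024, 1, 1))
    record_if_changed(table, "example-model", "speed_tps", 120.0,
                      "https://example.com/v2", date(2024, 6, 1))
    print(len(table), "rows; the earlier value is still present")
```

Because rows are frozen and only ever appended, the earlier value remains available alongside the newer one, which is the property the "nothing is overwritten" guarantee relies on.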
Understanding speed (tokens/sec) helps developers make informed decisions when choosing between models and providers. Rather than relying on marketing claims, developers can base those decisions on the verified, dated, source-linked data that sourc.dev provides.