How sourc.dev Works With Data

What makes this data different

Most AI data sources are snapshots. sourc.dev is an archive.

When a price changes on sourc.dev, the old price is not overwritten. It is preserved in an append-only history layer with a timestamp and a source link. This means every entity on the platform carries a full audit trail — you can see what changed, when it changed, and what the primary source was at the time of verification.
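The append-only idea can be illustrated with a minimal sketch. It is not the sourc.dev implementation: the class names, field names, entity name, and example URL below are all hypothetical, chosen only to show how a history that is never overwritten yields an audit trail and an "as of" lookup.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class HistoryRow:
    # Hypothetical row shape: one verified value plus its timestamp and source link.
    entity: str
    attribute: str
    value: str
    source_url: str
    verified_at: datetime


class AttributeHistory:
    """Append-only log: rows are added, never updated or deleted."""

    def __init__(self) -> None:
        self._rows: list[HistoryRow] = []

    def append(self, row: HistoryRow) -> None:
        self._rows.append(row)

    def trail(self, entity: str, attribute: str) -> list[HistoryRow]:
        """Full audit trail for one attribute, oldest first."""
        rows = [r for r in self._rows
                if r.entity == entity and r.attribute == attribute]
        return sorted(rows, key=lambda r: r.verified_at)

    def as_of(self, entity: str, attribute: str,
              when: datetime) -> Optional[HistoryRow]:
        """The row that was on record at a given moment, or None."""
        earlier = [r for r in self.trail(entity, attribute)
                   if r.verified_at <= when]
        return earlier[-1] if earlier else None


# Illustrative data only; the entity, prices, and URL are made up.
history = AttributeHistory()
history.append(HistoryRow("example-model", "input_price_usd", "3.00",
                          "https://example.com/pricing",
                          datetime(2025, 1, 10, tzinfo=timezone.utc)))
history.append(HistoryRow("example-model", "input_price_usd", "2.50",
                          "https://example.com/pricing",
                          datetime(2025, 3, 2, tzinfo=timezone.utc)))

february = datetime(2025, 2, 1, tzinfo=timezone.utc)
price_in_february = history.as_of("example-model", "input_price_usd", february)
```

Because the January row is still present after the March change, the February lookup returns the old price rather than the current one.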

This design is intentional. The archive cannot be reconstructed retroactively: any day that passes without tracking is a day of history that cannot be recovered. The historical record starts from the first verified entry, not from today.

How we verify data

Every attribute on sourc.dev requires three fields before it is published:

Source URL — a publicly accessible primary source confirming the value. Vendor documentation, official pricing pages, and published research papers are accepted. Aggregators and secondary sources are not.

Verification date — the date the value was confirmed against the source. Pricing attributes older than 90 days are flagged for re-verification automatically.

Confidence level — a machine-readable signal indicating whether the value is directly stated in the source, inferred from context, or estimated from partial data.
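The three required fields and the 90-day rule can be sketched as a small record type with two checks. This is an illustration under assumptions: the type, field, and function names are hypothetical, and "older than 90 days" is read here as strictly more than 90 days since the verification date.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Confidence(Enum):
    STATED = "stated"        # directly stated in the source
    INFERRED = "inferred"    # inferred from context
    ESTIMATED = "estimated"  # estimated from partial data


@dataclass(frozen=True)
class Attribute:
    # Hypothetical attribute shape; the three Optional fields are the
    # ones that must be filled in before publication.
    name: str
    value: Optional[str]
    source_url: Optional[str]
    verified_on: Optional[date]
    confidence: Optional[Confidence]
    is_pricing: bool = False


REVERIFY_AFTER = timedelta(days=90)  # the 90-day window from the text


def is_publishable(attr: Attribute) -> bool:
    """True only when all three verification fields are present."""
    return (bool(attr.source_url)
            and attr.verified_on is not None
            and attr.confidence is not None)


def needs_reverification(attr: Attribute, today: date) -> bool:
    """Flag pricing attributes verified more than 90 days ago."""
    if not attr.is_pricing or attr.verified_on is None:
        return False
    return today - attr.verified_on > REVERIFY_AFTER


# Illustrative record; the URL and values are made up.
price = Attribute("input_price_usd", "3.00", "https://example.com/pricing",
                  date(2025, 1, 1), Confidence.STATED, is_pricing=True)
draft = Attribute("context_window", "128000", None, None, None)
```

With these definitions, `draft` is held back because it has no source, date, or confidence level, while `price` is publishable and becomes due for re-verification once 91 days have passed.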

Unknown is published as unknown. We never publish absence of evidence as evidence of absence. A missing price is not a zero price. An unverified capability flag is not a false value.
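One way to keep "unknown" distinct from zero or false in code is to model it as an explicit missing value rather than a default. The sketch below is an assumption about how a consumer might handle such data, not a description of the sourc.dev schema; the function names and provider names are hypothetical.

```python
from typing import Optional


def cheapest(prices: dict[str, Optional[float]]) -> Optional[str]:
    """Return the provider with the lowest known price.

    Unknown prices (None) are skipped instead of being treated as 0.0,
    so a provider with no verified price can never win by default.
    """
    known = {name: p for name, p in prices.items() if p is not None}
    if not known:
        return None
    return min(known, key=lambda name: known[name])


def render_flag(supported: Optional[bool]) -> str:
    """Three display states, so an unverified flag is never shown as 'no'."""
    if supported is None:
        return "unknown"
    return "yes" if supported else "no"


# Illustrative data only.
prices = {"provider-a": 2.0, "provider-b": None}
best = cheapest(prices)
```

Had the missing price been stored as 0.0, `provider-b` would have been reported as the cheapest option on the basis of no evidence at all.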

How changes are tracked

The sourc.dev pipeline monitors tracked entities on a defined schedule. When a value changes, a new row is written to the attribute history layer. The previous value is preserved.

The /changes feed surfaces the most recent verified changes across all entities. It updates automatically when the pipeline detects a new value against a primary source.
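As a rough sketch of how a feed of recent changes could be derived from history rows, the function below pairs each row with its predecessor for the same entity and attribute, then sorts the resulting changes newest first. The row and change shapes, the function name, and the `limit` parameter are hypothetical; the real /changes feed may be produced differently.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import groupby


@dataclass(frozen=True)
class Row:
    entity: str
    attribute: str
    value: str
    verified_at: datetime


@dataclass(frozen=True)
class Change:
    entity: str
    attribute: str
    old: str
    new: str
    at: datetime


def changes_feed(rows: list[Row], limit: int = 20) -> list[Change]:
    """Most recent value changes across all entities, newest first."""
    ordered = sorted(rows, key=lambda r: (r.entity, r.attribute, r.verified_at))
    changes: list[Change] = []
    # groupby needs its input sorted by the grouping key, which it is.
    for _, group in groupby(ordered, key=lambda r: (r.entity, r.attribute)):
        series = list(group)
        for prev, curr in zip(series, series[1:]):
            if curr.value != prev.value:  # skip re-verifications with no change
                changes.append(Change(curr.entity, curr.attribute,
                                      prev.value, curr.value, curr.verified_at))
    changes.sort(key=lambda c: c.at, reverse=True)
    return changes[:limit]


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Illustrative rows only; entities and values are made up.
rows = [
    Row("example-model", "input_price_usd", "3.00", utc(2025, 1, 10)),
    Row("example-model", "input_price_usd", "2.50", utc(2025, 3, 2)),
    Row("example-tool", "free_tier", "yes", utc(2025, 2, 1)),
    Row("example-tool", "free_tier", "no", utc(2025, 4, 15)),
]
feed = changes_feed(rows)
```

The first row for each attribute produces no feed entry, since there is no earlier value to compare it with.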

This is not a changelog maintained by humans. It is a machine-generated record derived from primary source verification. The methodology that governs it is documented at /methodology.

How knowledge pages are built

The /learn pages explain each tracked metric and attribute in plain language — what it measures, why it matters, and how to interpret it. The /glossary pages define the terminology used across the platform.

Both are built from the same attribute registry that governs the data layer. Every term in a data table has a corresponding explanation. Every metric on an entity page has a corresponding learn page.

Content depth is a function of data depth. A learn page that currently shows limited data is not a finished article — it is a live document that deepens as the underlying data accumulates. When new verified attributes are added to an entity, the corresponding learn page reflects them. When a new metric enters the pipeline, a new learn page is created.

This is the content strategy: build the knowledge layer in parallel with the data layer, so that every new data point is immediately explainable, citable, and linkable.

Who this data serves

Developers

Choose models and tools with confidence. Verified pricing across providers. Tracked capability changes. Integration graphs that show what connects to what.

Investors and analysts

Build evidence-based theses on the AI infrastructure landscape. Market concentration, pricing trends, and integration velocity — all with primary source attribution.

Researchers and journalists

Cite verified, dated, source-linked data. Every data point links to its primary source. The methodology is public. The verification dates are visible.

Enterprise buyers

Evaluate vendor dependency risk, global distribution coverage, and enterprise readiness before committing infrastructure spend.

What is being built

The data layer currently powers the public entity pages at /llms, /tools, /saas, and /apis. The same underlying archive is being extended to support:

Structured data feeds — machine-readable exports for downstream use in analytics pipelines and research workflows.

Time-series dashboards — comparative views across entity types showing how value, price, and capability have moved over time.

Alert infrastructure — threshold-based notifications when significant changes are detected in tracked attributes.

Institutional data access — query access to the verified archive for organisations that need to integrate sourc.dev data into their own systems.
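For the threshold-based notifications mentioned above, a consumer-side check could look like the following sketch. The alert infrastructure is described as still being built, so nothing here reflects its actual design: the function names and the 10 percent default threshold are assumptions made for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100.0


def should_alert(old: float, new: float, threshold_pct: float = 10.0) -> bool:
    """True when the absolute change meets or exceeds the threshold."""
    return abs(percent_change(old, new)) >= threshold_pct


# Illustrative values only: a drop from 3.00 to 2.50 is about -16.7 percent.
large_move = should_alert(3.00, 2.50)
small_move = should_alert(3.00, 2.90)
```

Using the absolute value means price cuts and price increases are treated alike; a real system might want separate thresholds for each direction.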

The archive is already accumulating. The intelligence layer is being built on top of it.

The methodology

Every metric on sourc.dev has a formula, a version number, and a documented data source. Formulas only change with a version bump. Old computed values are never deleted. The full methodology — including all base metrics, computation rules, and pricing decisions — is documented at /methodology.
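The versioning rules can be sketched as a small registry in which a formula is identified by metric name and version, an existing version can never be replaced, and computed values are only ever appended. The registry class and the `blended_price` metric with its two formulas are hypothetical examples, not metrics or code from sourc.dev.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Formula:
    metric: str
    version: int
    compute: Callable[[dict[str, float]], float]


class MetricRegistry:
    def __init__(self) -> None:
        self._formulas: dict[tuple[str, int], Formula] = {}
        self._computed: list[tuple[str, int, float]] = []  # append-only

    def register(self, formula: Formula) -> None:
        """Add a formula; changing one requires a new version number."""
        key = (formula.metric, formula.version)
        if key in self._formulas:
            raise ValueError(
                f"{formula.metric} v{formula.version} exists; bump the version")
        self._formulas[key] = formula

    def latest(self, metric: str) -> Formula:
        versions = [f for (m, _), f in self._formulas.items() if m == metric]
        return max(versions, key=lambda f: f.version)

    def compute(self, metric: str, inputs: dict[str, float]) -> float:
        """Compute with the latest formula and record the result."""
        formula = self.latest(metric)
        value = formula.compute(inputs)
        self._computed.append((metric, formula.version, value))
        return value

    def computed_values(self) -> list[tuple[str, int, float]]:
        return list(self._computed)


# Hypothetical metric: v1 is a plain average, v2 weights input 3:1.
registry = MetricRegistry()
registry.register(Formula("blended_price", 1,
                          lambda p: (p["input"] + p["output"]) / 2))
first = registry.compute("blended_price", {"input": 2.0, "output": 6.0})

registry.register(Formula("blended_price", 2,
                          lambda p: (3 * p["input"] + p["output"]) / 4))
second = registry.compute("blended_price", {"input": 2.0, "output": 6.0})
```

After the version bump, the value computed under version 1 is still in the record next to the version 2 value, each tagged with the formula version that produced it.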
