tupicAcademy

Data Lineage

·article·2026-06-12

Data Lineage

Definition

The ability to trace any reported number back through every transformation to its original source: an API feed (with batch id), a document (with reference), or a manual entry (with author and attachment).

Worked Example

Dashboard: 'Service X June COGS per active user: $0.41'

$0.41
 |- = June COGS $8,610 / 21,000 active users
     |- COGS $8,610
     |   |- $6,000  AWS Cost Explorer API, batch 2026-07-02
     |   |- $1,450  Stripe fees feed, batch #5520
     |   |- $  900  Twilio invoice INV-88412 (AP #3107)
     |   |- $  260  manual entry by ops@..., receipt.pdf attached
     |- 21,000 users
         |- Google Analytics daily aggregation

Interpretation & Pitfalls

A number whose source cannot be identified is, for audit purposes, not a number — it's a rumor. Manual entries demand the strictest metadata because they are the weakest link.

In TupicFinance

Every data point carries an explicit source field (AWS, Stripe, Cloudflare, Google Analytics, manual), making the lineage walk a query, not an investigation.

share