tupicAcademy

Chapter 06 — Audit, Compliance & Data Integrity

·article·2026-06-12

Chapter 06 — Audit, Compliance & Data Integrity

"In a financial system, the question is never just 'what is the number?' — it is 'can you prove where the number came from?'"

The previous chapters defined the numbers. This chapter is about trusting them. It covers audit trails, data lineage, and a design principle that surprises many engineers: in financial systems, storing facts beats computing them.


6.1 The Audit Trail

Definition: An immutable, chronological record of every financial transaction and every change made to financial data — who did what, when, and from what previous state.

A complete audit-trail entry answers five questions:

WHO     user identity (authenticated, not self-declared)
WHAT    the operation (create / update / pay / void)
WHEN    timestamp (server-side, timezone-safe)
BEFORE  the prior state of the record
AFTER   the new state of the record

Worked Example

A cost item's amount is corrected from $1,200 to $2,100:

2026-06-10 14:32:07Z | user: finance.lead@company.com
operation: UPDATE cost_item #4821
before: { amount: 1200.00, payment_status: "UNPAID" }
after:  { amount: 2100.00, payment_status: "UNPAID" }
reason: "Vendor issued corrected invoice INV-2231-R1"

Rules of a real audit trail:

  1. Append-only. Corrections are new entries; nothing is ever deleted or edited in place.
  2. No silent edits. Every change carries an actor and, ideally, a reason.
  3. Voids, not deletes. A wrong transaction is reversed by an offsetting entry, preserving history.

Why It Matters — Beyond Compliance

  • External audit: auditors sample transactions and walk them back to source documents. No trail → qualified opinion.
  • Fraud detection: unusual edit patterns (amounts changed after payment, weekend edits by unexpected roles) are only visible if history exists.
  • Debugging: "Why did April's COGS change last Tuesday?" becomes answerable in minutes.

6.2 Audit Trail ≠ Activity Log

A common architectural mistake is dumping everything into one log. Two streams must stay separate:

Financial Audit TrailOperational Activity Log
Recordsactual financial transactions & changesplanning events: tasks created, goals re-weighted, sprints moved
Audienceauditors, compliance, financethe team itself
Retentionlong (often 7+ years by law)as useful
Mutabilitystrictly append-onlyappend-only by convention
Noise tolerancezero — every entry is evidencehigh — it's a work diary

Mixing them buries legally significant entries under thousands of "task renamed" events and makes retention policy impossible (you can't keep one part of a table for 7 years and prune the rest comfortably).


6.3 Data Lineage

Definition: The ability to trace any number in a report back through every transformation to its original source.

Worked Example — Tracing One KPI

Dashboard shows: "Service X — June COGS per active user: $0.41."

A lineage-complete system can unwind it:

$0.41
 └── = June COGS $8,610 ÷ 21,000 active users
      ├── COGS $8,610
      │    ├── $6,000  source: AWS Cost Explorer API, account 7733…, pulled 2026-07-02
      │    ├── $1,450  source: Stripe fees, automated feed, batch #5520
      │    ├── $  900  source: Twilio invoice INV-88412 (AP item #3107)
      │    └── $  260  source: manual entry by ops@…, attachment receipt.pdf
      └── 21,000 users
           └── source: Google Analytics daily aggregation, property GA-…

Every leaf is one of: an API feed (with batch id), an invoice/document (with reference), or a manual entry (with author and attachment). A number whose leaf cannot be identified is, for audit purposes, not a number — it's a rumor.

Operating principle: every data point carries a source field — AWS, Stripe, Cloudflare, Google Analytics, manual, … — and manual entries demand the strictest metadata, because they are the weakest link.


6.4 Store Facts, Don't Compute Them

Engineers are trained to avoid redundancy: "payment status can be derived — just check whether a payment transaction exists." In financial systems this instinct is usually wrong. Fields like payment_status and liability_type should be stored explicitly at the moment the business event occurs.

Why Stored Beats Computed

1. The past must not change when logic changes.

Computed approach:
   v1 logic: status = PAID if a withdrawal row exists
   v2 logic (bug fix): … if a *completed* withdrawal exists
   → Historical reports silently change after the deploy. Auditors notice.

Stored approach:
   status was written as a fact when the event happened.
   → Code changes can never rewrite history.

2. Derivations rely on joins that can break. A deleted/archived bank-transaction row would silently flip a computed status. A stored field survives.

3. Auditability. "Status = PAID, set by user X at time T" is evidence. "Status = PAID because the current code evaluates these three tables this way" is an argument.

4. Performance is a free bonus — no multi-table joins to render a list of 10,000 cost items.

The Cost of This Choice — and How to Pay It

Stored facts can drift from underlying data if writes aren't disciplined. The remedy is to make every state change transactional: the cost item update, the bank withdrawal, and the audit entry are written in one atomic operation — all or nothing.

BEGIN;
  UPDATE cost_items SET payment_status='PAID' WHERE id=4821;
  INSERT INTO cash_transactions (type='WITHDRAWAL', amount=2100.00, …);
  INSERT INTO audit_trail (…);
COMMIT;

If any step fails, none happens — the system can never half-pay.


6.5 Access Control as a Financial Control

Data integrity also depends on who can touch the data:

  • Role-based access control (RBAC): permissions follow roles, not individuals; a 7-tier hierarchy (viewer → analyst → … → superadmin) keeps the blast radius of each account small.
  • Row-level security (RLS): the database itself filters rows by tenant/project, so even a buggy API query cannot leak another tenant's ledger.
  • Default-deny: any endpoint or action not explicitly permitted is refused. Financial systems should fail closed.
  • Separation of duties: the person who enters an invoice should not be the only approver of its payment.

These are accounting controls expressed as engineering — the modern equivalent of "two signatures on every check."


6.6 In Practice — TupicFinance

TupicFinance was built around the principles of this chapter rather than retrofitted with them. Financial transactions write to a dedicated, append-only audit trail, while goals, tasks, and sprints log to a separate activity stream — the two are never mixed. Every ingested data point carries an explicit source (AWS, Stripe, Cloudflare, Google Analytics, or manual entry), making the lineage walk of §6.3 a query, not an investigation. Fields like payment_status and liability_type are stored, not derived, and written transactionally together with their bank movements. Access is governed by a seven-tier RBAC model, row-level security across the database, and default-deny middleware. In short: in TupicFinance, every number on a dashboard is designed to survive the auditor's question — "prove it."

share