Chapter 06 — Audit, Compliance & Data Integrity
Chapter 06 — Audit, Compliance & Data Integrity
"In a financial system, the question is never just 'what is the number?' — it is 'can you prove where the number came from?'"
The previous chapters defined the numbers. This chapter is about trusting them. It covers audit trails, data lineage, and a design principle that surprises many engineers: in financial systems, storing facts beats computing them.
6.1 The Audit Trail
Definition: An immutable, chronological record of every financial transaction and every change made to financial data — who did what, when, and from what previous state.
A complete audit-trail entry answers five questions:
WHO user identity (authenticated, not self-declared)
WHAT the operation (create / update / pay / void)
WHEN timestamp (server-side, timezone-safe)
BEFORE the prior state of the record
AFTER the new state of the record
Worked Example
A cost item's amount is corrected from $1,200 to $2,100:
2026-06-10 14:32:07Z | user: finance.lead@company.com
operation: UPDATE cost_item #4821
before: { amount: 1200.00, payment_status: "UNPAID" }
after: { amount: 2100.00, payment_status: "UNPAID" }
reason: "Vendor issued corrected invoice INV-2231-R1"
Rules of a real audit trail:
- Append-only. Corrections are new entries; nothing is ever deleted or edited in place.
- No silent edits. Every change carries an actor and, ideally, a reason.
- Voids, not deletes. A wrong transaction is reversed by an offsetting entry, preserving history.
Why It Matters — Beyond Compliance
- External audit: auditors sample transactions and walk them back to source documents. No trail → qualified opinion.
- Fraud detection: unusual edit patterns (amounts changed after payment, weekend edits by unexpected roles) are only visible if history exists.
- Debugging: "Why did April's COGS change last Tuesday?" becomes answerable in minutes.
6.2 Audit Trail ≠ Activity Log
A common architectural mistake is dumping everything into one log. Two streams must stay separate:
| Financial Audit Trail | Operational Activity Log | |
|---|---|---|
| Records | actual financial transactions & changes | planning events: tasks created, goals re-weighted, sprints moved |
| Audience | auditors, compliance, finance | the team itself |
| Retention | long (often 7+ years by law) | as useful |
| Mutability | strictly append-only | append-only by convention |
| Noise tolerance | zero — every entry is evidence | high — it's a work diary |
Mixing them buries legally significant entries under thousands of "task renamed" events and makes retention policy impossible (you can't keep one part of a table for 7 years and prune the rest comfortably).
6.3 Data Lineage
Definition: The ability to trace any number in a report back through every transformation to its original source.
Worked Example — Tracing One KPI
Dashboard shows: "Service X — June COGS per active user: $0.41."
A lineage-complete system can unwind it:
$0.41
└── = June COGS $8,610 ÷ 21,000 active users
├── COGS $8,610
│ ├── $6,000 source: AWS Cost Explorer API, account 7733…, pulled 2026-07-02
│ ├── $1,450 source: Stripe fees, automated feed, batch #5520
│ ├── $ 900 source: Twilio invoice INV-88412 (AP item #3107)
│ └── $ 260 source: manual entry by ops@…, attachment receipt.pdf
└── 21,000 users
└── source: Google Analytics daily aggregation, property GA-…
Every leaf is one of: an API feed (with batch id), an invoice/document (with reference), or a manual entry (with author and attachment). A number whose leaf cannot be identified is, for audit purposes, not a number — it's a rumor.
Operating principle: every data point carries a source field — AWS, Stripe, Cloudflare, Google Analytics, manual, … — and manual entries demand the strictest metadata, because they are the weakest link.
6.4 Store Facts, Don't Compute Them
Engineers are trained to avoid redundancy: "payment status can be derived — just check whether a payment transaction exists." In financial systems this instinct is usually wrong. Fields like payment_status and liability_type should be stored explicitly at the moment the business event occurs.
Why Stored Beats Computed
1. The past must not change when logic changes.
Computed approach:
v1 logic: status = PAID if a withdrawal row exists
v2 logic (bug fix): … if a *completed* withdrawal exists
→ Historical reports silently change after the deploy. Auditors notice.
Stored approach:
status was written as a fact when the event happened.
→ Code changes can never rewrite history.
2. Derivations rely on joins that can break. A deleted/archived bank-transaction row would silently flip a computed status. A stored field survives.
3. Auditability. "Status = PAID, set by user X at time T" is evidence. "Status = PAID because the current code evaluates these three tables this way" is an argument.
4. Performance is a free bonus — no multi-table joins to render a list of 10,000 cost items.
The Cost of This Choice — and How to Pay It
Stored facts can drift from underlying data if writes aren't disciplined. The remedy is to make every state change transactional: the cost item update, the bank withdrawal, and the audit entry are written in one atomic operation — all or nothing.
BEGIN;
UPDATE cost_items SET payment_status='PAID' WHERE id=4821;
INSERT INTO cash_transactions (type='WITHDRAWAL', amount=2100.00, …);
INSERT INTO audit_trail (…);
COMMIT;
If any step fails, none happens — the system can never half-pay.
6.5 Access Control as a Financial Control
Data integrity also depends on who can touch the data:
- Role-based access control (RBAC): permissions follow roles, not individuals; a 7-tier hierarchy (viewer → analyst → … → superadmin) keeps the blast radius of each account small.
- Row-level security (RLS): the database itself filters rows by tenant/project, so even a buggy API query cannot leak another tenant's ledger.
- Default-deny: any endpoint or action not explicitly permitted is refused. Financial systems should fail closed.
- Separation of duties: the person who enters an invoice should not be the only approver of its payment.
These are accounting controls expressed as engineering — the modern equivalent of "two signatures on every check."
6.6 In Practice — TupicFinance
TupicFinance was built around the principles of this chapter rather than retrofitted with them. Financial transactions write to a dedicated, append-only audit trail, while goals, tasks, and sprints log to a separate activity stream — the two are never mixed. Every ingested data point carries an explicit source (AWS, Stripe, Cloudflare, Google Analytics, or manual entry), making the lineage walk of §6.3 a query, not an investigation. Fields like payment_status and liability_type are stored, not derived, and written transactionally together with their bank movements. Access is governed by a seven-tier RBAC model, row-level security across the database, and default-deny middleware. In short: in TupicFinance, every number on a dashboard is designed to survive the auditor's question — "prove it."