tarun-elango 26810e43d0 sd text

2026-04-26 13:27:19 -04:00

Financial Systems

Financial systems are where backend engineering stops being forgiving. A normal CRUD bug might show the wrong profile picture or duplicate a notification. A financial bug can double-charge a customer, underpay a merchant, misstate revenue, break compliance rules, or create an accounting discrepancy that surfaces weeks later.

That is why payment, billing, invoicing, and ledger design show up so often in system design interviews. These systems force you to think about correctness, retries, immutability, eventual consistency, external dependencies, and operational recovery all at once.

This guide is designed for two goals:

Help you explain financial systems clearly in backend and system design interviews.
Help you build a realistic mental model of how production financial systems are actually built.

Examples in this guide are generalized from public industry patterns used by systems like Stripe, PayPal, Amazon, Uber, Netflix, Shopify, GitHub, and typical SaaS subscription platforms.

1. Big Picture: Why Financial Systems Are Different

At a high level, a financial system answers a deceptively simple question:

"Who owes whom how much money, for what reason, in what currency, at what time, and how certain are we?"

That question is harder than it looks because the answer changes over time and is often influenced by external systems you do not control.

In a normal application, the request-response cycle often feels authoritative. In financial systems, the HTTP response is usually only the beginning of the story.

1.1 Why These Systems Are Hard

Problem	Why it exists	What it forces you to design for
Money movement has real consequences	A bug affects customer trust, legal exposure, and revenue	correctness, auditability, strong operational controls
External systems are involved	banks, card networks, processors, and payment providers can be delayed or inconsistent	async workflows, reconciliation, retries
Requests are retried	clients, load balancers, and workers will retry on timeouts	idempotency, deduplication, safe replays
Truth is distributed	your DB, the processor, and the bank may disagree temporarily	state machines, eventual consistency, operational review
History matters	you cannot casually mutate financial records without losing traceability	immutable records, append-only ledgers, reversals
Regulations and contracts matter	taxes, invoices, refunds, disputes, and payouts have legal rules	compliance-aware data design
Failure is partial	auth may succeed while your app times out, or webhook may arrive late	ambiguity handling, polling, reconciliation

1.2 The Core Mental Model

A strong production design separates responsibilities instead of letting one table do everything.

Product systems decide what happened in the business domain.
Billing decides what should be charged.
Payments decide whether money collection succeeded.
Invoices define what the customer legally owes.
The ledger records financial truth as immutable movements.
Reconciliation confirms that internal truth matches external truth.
Audit logs explain who changed what and when.

If a company tries to collapse all of that into a single payments table with a status column, the system usually becomes fragile very quickly.

1.3 Non-Negotiable Design Rules

These are not optional nice-to-haves. They are survival rules.

Represent money as integers in minor units, not floating point.
Separate order state, payment state, invoice state, entitlement state, and ledger state.
Assume every network call can be retried or duplicated.
Assume every webhook can be delayed, reordered, or redelivered.
Prefer append-only financial history over mutable rows.
Build operational tools for refunds, disputes, and reconciliation from day one.
Treat external provider reports as mandatory inputs, not optional debug data.
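
The first rule is the easiest to show in code. The sketch below is illustrative only: the helper names `to_minor_units` and `split_evenly` are hypothetical, and it assumes a default currency exponent of 2 and half-up rounding, both of which are policy choices rather than universal rules.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_minor_units(amount: str, exponent: int = 2) -> int:
    """Parse a decimal string such as "19.99" into integer minor units (1999).

    `exponent` is the number of minor-unit digits (assumed 2 here; some
    currencies use 0 or 3).
    """
    scaled = Decimal(amount).scaleb(exponent)
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def split_evenly(total_minor: int, parts: int) -> list[int]:
    """Split an amount so that the parts always sum back to the total."""
    base, remainder = divmod(total_minor, parts)
    # The first `remainder` parts each absorb one leftover minor unit.
    return [base + 1 if i < remainder else base for i in range(parts)]


# Binary floating point cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2
# Integer minor units add exactly.
int_total = to_minor_units("0.10") + to_minor_units("0.20")
```

Splitting 1000 minor units three ways yields 334, 333, and 333, so no unit is lost or invented by rounding.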

1.4 End-to-End Financial Architecture

flowchart LR
	PROD[Product Events\norders seats rides storage usage] --> BILL[Billing Engine]
	PLAN[Plan Catalog\npricing discounts tax rules] --> BILL
	BILL --> INV[Invoice Engine]
	INV --> PAY[Payment Orchestrator]
	PAY --> GW[Gateway / Processor / Provider]
	GW --> RAILS[Card Network / Bank Rails / Wallet Rails]
	GW --> WEBHOOKS[Webhook / Settlement Events]
	WEBHOOKS --> PAY
	PAY --> LEDGER[Ledger Posting Engine]
	INV --> LEDGER
	PAY --> ENT[Entitlement Service]
	BILL --> ENT
	LEDGER --> RECON[Reconciliation Engine]
	GW --> REPORTS[Provider Reports]
	RAILS --> BANK[Bank Statements / Settlement Files]
	REPORTS --> RECON
	BANK --> RECON
	PAY --> AUDIT[Audit Log]
	BILL --> AUDIT
	LEDGER --> ACCOUNTING[Accounting Export / Finance Warehouse]
	RECON --> OPS[Finance Ops / Support Review]

This diagram matters because it counters the most common interview mistake: assuming that payment success automatically means every other financial system is done. In reality, payment collection, invoice generation, ledger posting, entitlements, and reconciliation are separate concerns.

2. Payment System

2.1 What a Payment System Is

A payment system is the software and infrastructure that lets a customer transfer money to a merchant or platform. In practice, it is less about "charging a card" and more about coordinating many moving parts safely.

A production payment system often needs to:

accept a payment request
authenticate or verify the payment method
call external payment providers
handle redirects or additional customer action such as 3DS
track payment state transitions over time
prevent duplicates
post financial entries
support refunds and disputes
reconcile internal state with provider and bank reports

2.2 Why Payment Systems Are Difficult

Payment systems are difficult because they combine many of the hardest parts of distributed systems:

external dependencies you do not control
user-facing latency requirements
irreversible or expensive side effects
asynchronous confirmations
legal and audit requirements
fraud and abuse pressure

If an API that sends emails times out, you can usually retry safely and move on. If a payment API times out, you often do not know whether money was already authorized, captured, declined, or is still in flight.

That ambiguity is one of the defining properties of payment engineering.

2.3 Trust, Correctness, and Money Movement Requirements

Financial systems are built around trust:

customers must trust they will not be double charged
merchants must trust they will get paid correctly
finance teams must trust balances and reports
regulators and auditors must trust the records

That leads to three core requirements:

Requirement	What it means in practice
Correctness	no duplicate charges, balanced ledger entries, accurate invoice totals
Durability	once a financial event is committed, it should not disappear
Traceability	every state change should be explainable later

2.4 High-Level Payment Actors

The exact terminology varies across providers, but the standard card ecosystem usually includes these actors.

Actor	Role	Practical note
Customer	the person or business paying	may abandon, retry, dispute, or fail authentication
Merchant	the business receiving payment	often your company or your platform merchant
Payment gateway	API layer that tokenizes or routes payment requests	often confused with processor in interviews
Payment processor	coordinates transaction processing with acquiring banks and networks	Stripe, Adyen, Braintree, PayPal, etc. can abstract this
Acquiring bank	bank that processes payments for the merchant	receives funds on merchant side
Issuing bank	bank that issued the customer's card	decides approve, decline, fraud challenge
Card network	Visa, Mastercard, AmEx, RuPay, etc.	routes authorization messages and defines settlement rules

Important interview nuance: in modern systems, one provider may abstract multiple roles. Stripe or Adyen may expose a unified API, but the underlying ecosystem still includes processors, acquirers, networks, and issuers.

2.5 High-Level Payment Flow

sequenceDiagram
	participant C as Customer
	participant UI as Merchant App
	participant M as Merchant Backend
	participant P as Payment Service
	participant G as Gateway/Processor
	participant A as Acquirer
	participant N as Card Network
	participant I as Issuer
	participant W as Webhooks

	C->>UI: Checkout and submit payment method
	UI->>M: Create order
	M->>P: Create payment intent / payment session
	P->>G: Authorize payment
	G->>A: Forward request
	A->>N: Route auth
	N->>I: Approve or decline
	I-->>N: Decision
	N-->>A: Decision
	A-->>G: Decision
	G-->>P: Initial auth result
	P-->>M: Payment pending / authorized / requires action
	M-->>UI: Show status to customer
	G-->>W: Async settlement/refund/dispute events
	W-->>P: Webhook delivery
	P->>M: Update order / trigger fulfillment / update ledger

2.6 Payment Lifecycle Overview

A payment usually goes through several phases rather than a single binary success/failure outcome.

Payment is created.
Payment method is attached or selected.
Additional customer action may be required.
Authorization may be attempted.
The merchant may capture immediately or later.
Funds clear and settle later.
Refunds or disputes may happen days or weeks later.

That is why a payment system is almost always modeled as a state machine rather than a single insert into a charges table.

2.7 Online vs Offline Payment Flows

Online payments

Online payments involve real-time network communication at transaction time.

Examples:

card-not-present ecommerce checkout
UPI or instant bank payment
wallet checkout using an external provider

Characteristics:

synchronous API call initiates the payment
immediate or near-immediate response exists
still may need async confirmation later

Offline or deferred flows

In offline or deferred flows, the full network is not always consulted in real time, or the final financial confirmation arrives later.

Examples:

transit systems with offline authorization logic
POS terminals in poor connectivity environments
invoice payments by bank transfer
cash on delivery or manual collections

Characteristics:

customer action and financial confirmation are decoupled
risk is shifted to later operational validation
reconciliation becomes more important

2.8 Synchronous vs Asynchronous Confirmation

This distinction is one of the most important concepts to explain in an interview.

Type	Meaning	Example
Synchronous confirmation	request returns an immediate initial result	card authorization approved or declined
Asynchronous confirmation	final truth arrives later via webhook, polling, file, or settlement report	ACH success later, refund completed later, dispute opened later

Real-world payment systems are often both. A card payment may synchronously return authorized, but the final fulfillment decision may still wait for provider webhooks, fraud checks, or successful capture.

2.9 Failure Handling and Retries

Payment failures are not all the same. Good systems classify them.

Failure type	Example	Correct response
User/business decline	insufficient funds, expired card, suspected stolen card	do not blindly retry; ask for new method or user action
Technical transient error	provider timeout, connection reset, temporary 5xx	retry safely with idempotency
Ambiguous outcome	request timed out after provider received it	query status, use idempotency, reconcile later
Async failure	auth succeeded, capture failed later	state machine transition plus operational handling

Best practices:

separate retriable and non-retriable failures
never retry without an idempotency strategy
preserve provider reference IDs for later lookup
expose pending states to the business system instead of guessing
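
The classification in the table above can be sketched as a small lookup. This is a minimal illustration, not any provider's real error taxonomy: the error codes and action names are hypothetical, and unknown codes are deliberately treated as ambiguous rather than guessed at.

```python
from enum import Enum


class FailureClass(Enum):
    DECLINE = "decline"              # user/business decline
    TRANSIENT = "transient"          # technical transient error
    AMBIGUOUS = "ambiguous"          # outcome unknown
    ASYNC_FAILURE = "async_failure"  # failed after an earlier success


# Hypothetical, provider-neutral error codes chosen for illustration.
CLASS_BY_CODE = {
    "insufficient_funds": FailureClass.DECLINE,
    "expired_card": FailureClass.DECLINE,
    "suspected_stolen_card": FailureClass.DECLINE,
    "connection_reset": FailureClass.TRANSIENT,
    "provider_5xx": FailureClass.TRANSIENT,
    "timeout_after_send": FailureClass.AMBIGUOUS,
    "capture_failed": FailureClass.ASYNC_FAILURE,
}

ACTION_BY_CLASS = {
    FailureClass.DECLINE: "ask_customer_for_new_method",
    FailureClass.TRANSIENT: "retry_with_same_idempotency_key",
    FailureClass.AMBIGUOUS: "query_provider_then_reconcile",
    FailureClass.ASYNC_FAILURE: "transition_state_and_alert_ops",
}


def next_action(code: str) -> str:
    """Map an error code to the next step; unknown codes count as ambiguous."""
    failure_class = CLASS_BY_CODE.get(code, FailureClass.AMBIGUOUS)
    return ACTION_BY_CLASS[failure_class]
```

Defaulting to the ambiguous path is the conservative choice: it costs an extra status lookup, while a wrong guess in either direction can cost a duplicate charge or a lost sale.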

2.10 How Payment Systems Differ from Normal CRUD Systems

Normal CRUD system	Payment system
request-response often feels authoritative	response is often provisional
updates overwrite prior state	financial history is usually append-only or versioned
duplicate requests may be harmless	duplicate requests can create duplicate charges
a row can often be edited freely	financial records may require reversal instead of mutation
internal DB is primary source of truth	external providers and banks also matter
logs are mostly for debugging	logs and audit trails are operationally and legally important

3. Payment Processing

Payment processing is the detailed mechanics of what happens after the customer initiates a payment.

3.1 Core Stages

Stage	What it means	Why it exists
Authorization	reserve funds or confirm the payment method can be charged	lets merchant validate ability to pay before finalizing
Capture	convert an authorization into an actual charge	useful when final amount or fulfillment timing is delayed
Clearing	exchange transaction details between parties	needed for network and bank processing
Settlement	actual money movement between institutions	this is when funds are truly transferred
Refund	return some or all funds after charge	needed for cancellations, service failures, policy decisions
Chargeback	issuer/cardholder dispute reverses or challenges the charge	consumer protection and fraud handling

3.2 Authorization vs Capture

This is a classic interview comparison.

Topic	Authorization	Capture
Purpose	checks or reserves available funds	actually charges the customer
Timing	often happens at checkout	may happen immediately or later
Business use case	reserve payment before shipment or service completion	take money when fulfillment is confirmed
Example	Amazon authorizes before shipment	Amazon captures when items ship
Failure impact	order can remain unfulfilled	revenue collection fails after customer thought checkout succeeded

Delayed capture is common in real systems:

Amazon captures when an item ships, not when the order is first placed.
Uber may pre-authorize an estimated ride amount and capture the final amount after trip completion.
Hotels often authorize a card at check-in and capture the final amount at check-out.

3.3 Payment State Machines

A robust payment system uses explicit state transitions instead of ad hoc boolean flags.

stateDiagram-v2
	[*] --> RequiresPaymentMethod
	RequiresPaymentMethod --> RequiresConfirmation
	RequiresConfirmation --> RequiresAction
	RequiresConfirmation --> Processing
	RequiresAction --> Processing
	Processing --> Authorized
	Processing --> Failed
	Authorized --> Captured
	Authorized --> Expired
	Captured --> Settling
	Settling --> Succeeded
	Succeeded --> PartiallyRefunded
	PartiallyRefunded --> Refunded
	Succeeded --> Refunded
	Succeeded --> ChargebackOpen
	ChargebackOpen --> ChargebackWon
	ChargebackOpen --> ChargebackLost
	Failed --> [*]
	Refunded --> [*]
	ChargebackWon --> [*]
	ChargebackLost --> [*]

The specific states vary across providers, but the idea is consistent:

payments are long-lived workflows
transitions happen over time
asynchronous events may drive later transitions
invalid transitions should be rejected explicitly
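
One way to reject invalid transitions explicitly is a transition table. The sketch below mirrors the states in the diagram above using snake_case names; the names `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `transition` are hypothetical, and a real system would also persist the transition and its cause.

```python
# Allowed transitions, copied from the state diagram above.
ALLOWED_TRANSITIONS = {
    "requires_payment_method": {"requires_confirmation"},
    "requires_confirmation": {"requires_action", "processing"},
    "requires_action": {"processing"},
    "processing": {"authorized", "failed"},
    "authorized": {"captured", "expired"},
    "captured": {"settling"},
    "settling": {"succeeded"},
    "succeeded": {"partially_refunded", "refunded", "chargeback_open"},
    "partially_refunded": {"refunded"},
    "chargeback_open": {"chargeback_won", "chargeback_lost"},
}


class InvalidTransition(Exception):
    """Raised when an event asks for a transition the table does not allow."""


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target}")
    return target
```

States with no entry in the table, such as `failed`, are terminal, so a late webhook that tries to move a failed payment to `captured` is rejected instead of silently applied.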

3.4 Stripe-Like Payment Intent Model

Stripe popularized the idea that a payment should be represented as a durable intent object rather than a one-shot charge request.

The idea is powerful because it matches reality.

A payment intent usually stores:

merchant or account context
customer context
amount and currency
payment method information or token reference
current state
whether capture is automatic or manual
metadata for order correlation
provider reference IDs

Why this model exists:

the same payment may require multiple attempts
customer authentication may interrupt the flow
the client, server, and provider all need a shared durable object to coordinate around
webhooks can safely attach later state changes to that object

In practice, a payment intent model reduces the temptation to create multiple charges for the same checkout retry.

3.5 Partial Capture

Partial capture means the merchant authorized one amount but captures only part of it.

Examples:

an order with multiple items ships in parts
a ride estimate was higher than the final fare
a restaurant pre-authorized a larger amount but settled the final bill

Design implications:

the payment object must track authorized amount, captured amount, and remaining capturable amount
ledger posting must reflect only captured funds, not full authorization
invoice and fulfillment systems must know whether a partial capture is expected behavior or a problem
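
The first design implication can be sketched as a small value object. This is an assumption-laden illustration: the `Authorization` class is hypothetical, amounts are in minor units, and it allows several captures against one authorization, which some providers do not support.

```python
from dataclasses import dataclass


@dataclass
class Authorization:
    authorized_minor: int
    captured_minor: int = 0

    @property
    def capturable_minor(self) -> int:
        """Amount that can still be captured against this authorization."""
        return self.authorized_minor - self.captured_minor

    def capture(self, amount_minor: int) -> int:
        """Capture part of the authorization and return what remains."""
        if amount_minor <= 0:
            raise ValueError("capture amount must be positive")
        if amount_minor > self.capturable_minor:
            raise ValueError("capture exceeds remaining authorized amount")
        self.captured_minor += amount_minor
        return self.capturable_minor
```

Only `captured_minor` should feed ledger posting; the uncaptured remainder is a reservation, not revenue.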

3.6 Partial Refunds

Partial refunds are extremely common.

Examples:

one item in a multi-item order is returned
a service credit covers part of the invoice
support grants a goodwill refund for part of the charge

Design implications:

never model refund as a boolean flag on payment
track cumulative refunded amount and remaining refundable amount
prevent total refunds from exceeding captured amount
ledger and accounting should reflect each refund as its own event
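
A minimal sketch of these implications, assuming amounts in minor units: each refund is its own immutable record, and the refundable balance is derived from the list of refunds rather than stored as a flag. The `Refund` and `CapturedPayment` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Refund:
    refund_id: str
    amount_minor: int
    reason: str


@dataclass
class CapturedPayment:
    captured_minor: int
    refunds: list[Refund] = field(default_factory=list)

    @property
    def refunded_minor(self) -> int:
        """Cumulative refunded amount across all refund events."""
        return sum(r.amount_minor for r in self.refunds)

    @property
    def refundable_minor(self) -> int:
        return self.captured_minor - self.refunded_minor

    def refund(self, refund_id: str, amount_minor: int, reason: str) -> Refund:
        """Record one refund event, refusing to exceed the captured amount."""
        if amount_minor <= 0:
            raise ValueError("refund amount must be positive")
        if amount_minor > self.refundable_minor:
            raise ValueError("refund exceeds remaining refundable amount")
        record = Refund(refund_id, amount_minor, reason)
        self.refunds.append(record)
        return record
```

Because every refund is a separate record, the ledger can post one entry per refund and support staff can see exactly why each one was issued.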

3.7 Delayed Capture

Delayed capture is not an edge case. It is a standard production requirement.

Why it exists:

final amount may be unknown at checkout
merchant may want to confirm inventory or fraud review first
fulfillment may happen hours or days later

Tradeoffs:

improves business control
increases complexity in payment lifecycle tracking
introduces authorization expiry risk if capture is too late

3.8 Clearing and Settlement

Developers often stop thinking at authorization, but finance systems cannot.

Clearing

Clearing is the exchange of transaction details among payment institutions. It confirms what transaction should be processed and under what rules.

Settlement

Settlement is the actual movement of funds between institutions.

Why this matters:

a payment can be authorized but not yet settled
a successful API response does not mean cash is in the merchant bank account yet
marketplace payout logic often depends on settlement confidence, not just auth success

3.9 Refunds and Chargebacks

Refunds are initiated by the merchant. Chargebacks are disputes that the cardholder usually initiates through the issuing bank.

Topic	Refund	Chargeback
Who initiates	merchant or support workflow	cardholder through issuer
Typical reason	return, cancellation, service issue	fraud, dissatisfaction, unrecognized charge
Timing	usually after capture	often days or weeks later
Control	merchant usually controls initiation	external process, evidence-based
Operational impact	customer experience and accounting adjustment	financial loss risk, dispute operations, fees

Chargebacks are a major reason payment systems need strong evidence storage, audit logs, and operational tooling.

3.10 Retry Behavior in Processing Systems

Good retry behavior requires classification.

Retry candidates:

connection timeout before receiving provider result
transient provider 5xx
webhook delivery failure
temporary downstream database outage

Do not blindly retry:

issuer declines
fraud rules blocked the payment
expired payment method
authorization window already expired

Practical pattern:

attempt payment with provider idempotency key
on timeout, mark local state as uncertain or processing
query provider by external reference if possible
wait for webhook or scheduled reconciliation if ambiguity remains

3.11 Webhooks and Event-Driven Payment Updates

Real payment platforms are event-driven because external truth arrives asynchronously.

Typical events:

payment succeeded
payment failed
payment requires action
charge captured
refund succeeded
dispute opened
payout paid

Best practices:

verify webhook signatures
store raw webhook payloads for replay and audit
deduplicate by provider event ID and business object ID
enqueue events on an internal queue for processing instead of doing all the work inline in the HTTP handler
make handlers idempotent and order-aware

Common mistake: assuming events arrive exactly once and in order. Providers often deliver at least once and may redeliver older events.
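
A minimal sketch of an idempotent, order-aware handler follows. It makes several assumptions that real providers may not satisfy: signatures are a plain HMAC-SHA256 over the raw body (each provider defines its own scheme), and every event carries a hypothetical per-object `sequence` number. When only timestamps are available, ordering needs more care.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Generic HMAC-SHA256 check; not any specific provider's scheme."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


class WebhookInbox:
    def __init__(self) -> None:
        self.seen_event_ids: set[str] = set()
        # object_id -> (sequence, status) of the newest applied event
        self.latest: dict[str, tuple[int, str]] = {}

    def handle(self, event: dict) -> str:
        """Apply an event at most once and never let an older one overwrite."""
        if event["id"] in self.seen_event_ids:
            return "duplicate"
        self.seen_event_ids.add(event["id"])
        current = self.latest.get(event["object_id"])
        if current is not None and event["sequence"] <= current[0]:
            return "stale"
        self.latest[event["object_id"]] = (event["sequence"], event["status"])
        return "applied"
```

In production the seen-ID set and latest state would live in a durable store and be updated in one transaction; the in-memory version only shows the decision logic.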

3.12 Duplicate Charge Prevention

Duplicate charge prevention is not one feature. It is multiple layers.

Layer	Example
API idempotency	same checkout request with same key returns same result
Business key uniqueness	one successful payment per order or invoice
Payment intent reuse	retries reuse the same payment object
Provider-side idempotency	send a provider idempotency key or merchant reference
Operational controls	support tools show prior attempts before allowing manual re-charge

3.13 Exactly-Once vs At-Least-Once Reality

Exactly-once delivery is mostly a marketing phrase in distributed money systems.

What you can realistically achieve is:

at-least-once delivery of events
idempotent processing of duplicates
unique business constraints that prevent duplicate final effects
reconciliation that catches anything missed

A strong interview answer explicitly says this instead of claiming "I will ensure exactly once delivery".

4. Subscriptions

Subscriptions turn one-time payment processing into a long-lived revenue engine.

4.1 What Recurring Billing Really Means

A subscription is a contract-like object that says:

who is being billed
for what plan or service
on what schedule
under what pricing rules
with what payment collection behavior

Recurring billing is not just a cron job that charges a card every month. It is a coordinated system involving:

plan catalog and pricing rules
entitlement logic
invoice generation
taxes and discounts
payment collection
retries and dunning
cancellation and plan change rules

4.2 High-Level Subscription Architecture

flowchart LR
	CAT[Plan Catalog\nprices features coupons] --> SUB[Subscription Service]
	CUST[Customer / Account] --> SUB
	SUB --> SCHED[Renewal Scheduler]
	SCHED --> BILL[Billing Engine]
	USAGE[Usage Metering] --> BILL
	BILL --> INV[Invoice Engine]
	INV --> PAY[Payment Service]
	PAY --> PSP[Payment Provider]
	PAY --> LEDGER[Ledger]
	INV --> LEDGER
	SUB --> ENT[Entitlement Service]
	PAY --> ENT
	SUB --> NOTIFY[Email / In-App Notifications]
	PAY --> NOTIFY

4.3 Monthly vs Annual Subscriptions

Model	Business benefit	Technical impact
Monthly	lower entry barrier, more flexible	more frequent renewals, more retry churn
Annual	better cash flow, lower churn, simpler gross retention story	larger invoice amounts, proration complexity, annual tax handling

A subscription system must treat cadence as a first-class configuration, not a hard-coded monthly assumption.

4.4 Trial Periods

Trials exist because product growth often needs a usage-limited or time-limited evaluation phase.

Design questions:

does the trial require a payment method up front?
what happens when the trial ends without a payment method?
when do entitlements activate and deactivate?
can the same customer create repeated free trials?

Real-world examples:

Netflix historically used trials as a conversion funnel in some markets
SaaS tools often require a card to reduce abuse and improve conversion quality

4.5 Upgrades, Downgrades, and Proration

Proration exists because customers change plans mid-cycle.

Example:

plan A costs $100 per month
plan B costs $200 per month
customer upgrades halfway through the billing cycle
the incremental charge is usually about $50 before tax and discounts

Why proration is hard:

seat counts may change multiple times in the same cycle
taxes may vary by jurisdiction
credits may need to be carried to the next invoice instead of immediately refunded
invoice presentation must stay understandable for the customer

Best practice: store billable events and proration line items explicitly rather than trying to recompute historical changes from current plan state.
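
The upgrade example above works out as follows in integer minor units. This is a sketch under stated assumptions: proration is day-based, each component is rounded down separately, and the function name `prorated_upgrade_charge` is hypothetical. Real systems may prorate by the second and round differently.

```python
def prorated_upgrade_charge(
    old_price_minor: int,
    new_price_minor: int,
    days_in_cycle: int,
    days_remaining: int,
) -> int:
    """Net charge for switching plans mid-cycle, before tax and discounts.

    A negative result means the customer is owed a credit (a downgrade).
    """
    # Credit for the unused part of the old plan.
    unused_credit = old_price_minor * days_remaining // days_in_cycle
    # Charge for the remaining part of the cycle on the new plan.
    new_charge = new_price_minor * days_remaining // days_in_cycle
    return new_charge - unused_credit


# Plan A at $100, plan B at $200, upgrade halfway through a 30-day cycle.
net_minor = prorated_upgrade_charge(10_000, 20_000, 30, 15)
```

Here the credit is 5000 and the new charge is 10000, giving the net 5000 minor units ($50) from the example. Storing the credit and the charge as two explicit line items keeps the invoice explainable later.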

4.6 Grace Periods

A grace period allows temporary continued access after a failed renewal or overdue invoice.

Why it exists:

transient card failures are common
immediate shutdown creates bad customer experience
enterprise contracts may allow time for manual payment

But grace periods need clear rules:

what features remain enabled?
how long does grace last?
which customer segments qualify?
when do collections escalate?

4.7 Failed Renewal Handling, Retries, and Dunning

Dunning is the process for recovering failed recurring payments.

Typical strategy:

attempt renewal
if failed for retriable reason, retry on scheduled intervals
notify the customer by email or in-product banners
optionally use backup payment methods
apply grace period
cancel or suspend if payment is not recovered

Important nuance: retry behavior should depend on failure reason. Retrying an insufficient-funds decline after salary day might work. Retrying a stolen-card block probably will not.
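
A reason-aware retry schedule might look like the sketch below. The decline reasons and day offsets are hypothetical placeholders, not recommended values; real schedules are usually tuned from recovery data.

```python
from datetime import date, timedelta

# Hypothetical retry offsets in days after the failed renewal, per reason.
RETRY_OFFSETS_DAYS = {
    "insufficient_funds": [3, 7, 14],
    "provider_error": [1, 2, 4],
}

# Reasons where retrying the same method is assumed to be pointless.
NON_RETRIABLE_REASONS = {"stolen_card", "expired_card"}


def retry_dates(failed_on: date, reason: str) -> list[date]:
    """Return the scheduled retry dates for a failed renewal."""
    if reason in NON_RETRIABLE_REASONS:
        return []
    # Unknown reasons get no automatic retries in this sketch.
    offsets = RETRY_OFFSETS_DAYS.get(reason, [])
    return [failed_on + timedelta(days=offset) for offset in offsets]
```

An empty schedule does not mean giving up: it means the next step is a customer notification asking for a new payment method rather than another charge attempt.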

4.8 Cancellations

Cancellation behavior should be explicit.

Common models:

immediate cancellation with immediate entitlement removal
cancel at period end
cancel and refund unused period under specific policy
pause rather than cancel

Good systems store both:

operational subscription status
effective cancellation date

Those are not always the same thing.

4.9 Scheduled Plan Changes

Many SaaS systems allow a downgrade or upgrade to take effect later.

Examples:

upgrade immediately, because customer wants more features now
downgrade at next renewal, because customer already paid for current cycle
price increase takes effect at next term for existing customers

That means you often need both:

current plan version
pending future plan version

4.10 Subscription Lifecycle Design

stateDiagram-v2
	[*] --> Trialing
	Trialing --> Active
	Trialing --> Incomplete
	Incomplete --> Active
	Active --> PastDue
	PastDue --> GracePeriod
	GracePeriod --> Active
	GracePeriod --> Paused
	GracePeriod --> Canceled
	Active --> PendingCancellation
	PendingCancellation --> Canceled
	Active --> Paused
	Paused --> Active
	Canceled --> [*]

Do not collapse all of these into one status without clear semantics. In real systems, invoice state, payment state, and entitlement state may temporarily differ.

Example:

invoice is unpaid
payment is retrying
subscription is still active because the customer is inside grace period

4.11 SaaS Subscription Architecture Patterns

Strong SaaS systems usually separate:

subscription contract management
pricing and catalog management
invoice generation
payment collection
entitlement enforcement

That separation matters because GitHub-style seat billing, Shopify merchant billing, and enterprise SaaS contracts all need different rules, but they still plug into the same general architecture.

4.12 Common Failures and Scaling Considerations

What breaks at scale:

renewing millions of subscriptions at exactly midnight UTC
large proration computations for accounts with many seat changes
delayed usage events missing the current invoice cut-off
broken entitlements because payment and subscription state were coupled too tightly

Best practices:

spread renewal schedules across time instead of one giant batch
snapshot pricing terms at subscription creation time
version plans instead of mutating them in place
derive entitlements separately, from subscription policy plus payment policy, instead of tying them to a single status
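
One simple way to spread renewal work is a deterministic offset per subscription. The sketch below is one possible approach, not a standard: the function name is hypothetical, and it assumes that shifting the renewal job by up to a day within a window is acceptable to the business.

```python
import hashlib

SECONDS_PER_DAY = 86_400


def renewal_offset_seconds(
    subscription_id: str, window_seconds: int = SECONDS_PER_DAY
) -> int:
    """Map a subscription ID to a stable offset inside the renewal window."""
    digest = hashlib.sha256(subscription_id.encode("utf-8")).digest()
    # The first 8 bytes of the hash give a stable pseudo-random integer.
    return int.from_bytes(digest[:8], "big") % window_seconds
```

Because the offset is derived from the ID, the same subscription always renews at the same point in the window, which keeps customer-visible timing predictable while flattening the midnight spike.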

5. Invoices

5.1 What an Invoice Is

An invoice is a formal statement that documents what the customer owes for specific goods or services.

In many businesses, the invoice is not just an internal billing artifact. It is a customer-facing financial document with tax, legal, and accounting implications.

5.2 Why Invoices Exist

Invoices exist because businesses need a structured record of:

what was sold
when it was sold
who owes the money
what taxes applied
when payment is due

For subscription businesses, invoices often bridge billing, finance, and customer support. For enterprise sales, they may be the main financial document the customer actually processes.

5.3 Invoice Generation

Invoice generation typically pulls from multiple sources:

subscription charges
usage aggregates
discounts and credits
taxes
currency settings
customer billing profile

Typical stages:

collect billable line items
rate usage and apply discounts
compute taxes
create draft invoice
review or finalize
collect payment or send for manual payment

5.4 Invoice Numbering

Invoice numbering sounds trivial, but it is often compliance-sensitive.

Requirements may include:

uniqueness
no accidental reuse
understandable sequence for finance operations
jurisdiction-specific numbering expectations

Practical approaches:

globally unique invoice numbers
per legal entity sequences
per country or tax registration sequences

Common mistake: using a random UUID as the only customer-visible invoice identifier. UUIDs are fine as internal IDs, but finance and customers often still need a human-usable invoice number.
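
A per-legal-entity sequence can be sketched as below. The `InvoiceNumberAllocator` class and the number format are hypothetical, and the counter is held in memory only for illustration; in production it would be a row updated inside the same database transaction that finalizes the invoice, so numbers are never reused.

```python
from collections import defaultdict


class InvoiceNumberAllocator:
    def __init__(self) -> None:
        # (legal_entity, year) -> next sequence number to hand out
        self._next: dict[tuple[str, int], int] = defaultdict(lambda: 1)

    def allocate(self, legal_entity: str, year: int) -> str:
        """Return the next human-readable invoice number for the entity."""
        key = (legal_entity, year)
        sequence = self._next[key]
        self._next[key] = sequence + 1
        return f"{legal_entity}-{year}-{sequence:06d}"
```

The invoice can still carry a UUID as its internal primary key; the allocated number is the identifier that finance teams and customers see.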

5.5 Tax Basics

A production billing system usually needs at least a working tax model.

Common concerns:

sales tax vs VAT vs GST
business location and customer location
tax-exempt customers
net price vs gross price presentation
tax per line item vs tax on total

Even if a company uses Stripe Tax, Avalara, or another external tax provider, your system still needs to store inputs, outputs, versioned tax decisions, and invoice presentation.

5.6 Invoice Finalization

Finalization is the point at which a draft invoice becomes an authoritative billing document.

Why it matters:

line items usually become locked
taxes and totals become authoritative
payment collection or customer delivery may begin
downstream accounting and reporting can depend on it

Many systems allow drafts to be edited but treat finalized invoices as immutable.

5.7 Why Invoices Often Cannot Be Edited Like Normal Records

This is a very common interview discussion point.

Once an invoice has been issued, multiple downstream systems may already rely on it:

customer procurement or AP systems
tax reporting
finance close processes
revenue reporting
external accounting exports

If you simply edit the original invoice row, you can destroy auditability and legal traceability.

Typical production approaches instead are:

void the invoice if rules allow
issue a credit note
create a replacement invoice
maintain versioned presentation while preserving immutable original financial facts

5.8 Due Dates and Payment Status Tracking

Common invoice statuses:

draft
finalized or open
paid
partially paid
overdue
uncollectible
void

These statuses are not just cosmetic. They drive collections, revenue operations, support workflows, and customer entitlements in some businesses.

5.9 Invoice vs Receipt

Topic	Invoice	Receipt
Purpose	request or document amount owed	proof that payment was received
Timing	typically before or at billing time	after payment succeeds
Business meaning	customer owes or was billed	customer already paid
Common use	subscription billing, enterprise accounts receivable	ecommerce confirmation, payment confirmation

This distinction matters because many junior designs treat them as interchangeable.

5.10 Invoice Lifecycle

flowchart LR
	D[Draft Invoice] --> F[Finalized Invoice]
	F --> O[Open / Awaiting Payment]
	O --> P[Paid]
	O --> OD[Overdue]
	OD --> U[Uncollectible]
	F --> V[Void]
	P --> R[Credit Note / Refund Adjustment]

5.11 Invoice Versioning Considerations

Versioning is often needed for presentation changes without rewriting financial history.

Examples:

corrected customer address
updated PDF template
additional display metadata for support

Good design distinguishes between:

immutable financial facts
mutable presentation metadata

5.12 Legal and Compliance Considerations

The rules vary heavily by country and business model, but common themes include:

invoice numbering discipline
retention periods
tax fields and calculations
non-editability of issued documents
legal entity identifiers

The engineering lesson is simple: invoice design is not just a UI export problem.

6. Idempotency

Idempotency is one of the most important topics in financial backend design.

6.1 Why Idempotency Is Critical

Retries are unavoidable.

They happen because:

the client timed out
the mobile network dropped
a proxy retried the request
a worker retried the job
a human clicked the checkout button twice

Without idempotency, retries turn into duplicate financial side effects.

6.2 Duplicate Payment Risks

Duplicate payment failures are expensive because they create:

customer trust loss
support costs
refund workload
dispute risk
accounting noise

That is why idempotency is a first-class design concern, not a nice API feature.

6.3 Idempotency Keys

An idempotency key is a client- or server-generated key representing one logical operation.

Example use cases:

create payment for order 123
refund payment 456 for $20
create invoice for subscription cycle 2026-04

Good idempotency key design usually includes scope:

merchant or account ID
operation type
business object reference

6.4 API Idempotency Design

Typical pattern:

client sends request with idempotency key
server checks durable idempotency store
if key exists with completed identical request, return stored result
if key exists with conflicting payload, reject
if key is new, reserve it and process the request
store final response or final effect reference

Important nuance: storing only "request seen" is often insufficient. You usually also need the final outcome or object ID.
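The pattern above can be sketched roughly as follows. This is a minimal in-memory illustration, not a production design: the `IdempotencyStore` class, its `handle` method, and the exception names are hypothetical, and a real system would back the records with a durable store and a unique constraint instead of a dict. Releasing the key on failure is also a simplifying assumption; after an ambiguous provider failure you would usually keep the key and look up the outcome instead.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


class IdempotencyConflict(Exception):
    """The same key was reused with a different payload."""


class OperationInProgress(Exception):
    """Another attempt with the same key has not finished yet."""


@dataclass
class Record:
    fingerprint: str              # hash of the request payload
    status: str                   # "in_progress" or "completed"
    result: Optional[Any] = None  # stored outcome, replayed on retries


class IdempotencyStore:
    """Hypothetical in-memory stand-in for a durable idempotency table."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    @staticmethod
    def _fingerprint(payload: dict) -> str:
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key: str, payload: dict,
               operation: Callable[[dict], Any]) -> Any:
        fp = self._fingerprint(payload)
        existing = self._records.get(key)
        if existing is not None:
            if existing.fingerprint != fp:
                raise IdempotencyConflict(key)
            if existing.status == "completed":
                return existing.result  # replay the stored outcome
            raise OperationInProgress(key)

        # Reserve the key before doing any work.
        self._records[key] = Record(fingerprint=fp, status="in_progress")
        try:
            result = operation(payload)
        except Exception:
            # Simplifying assumption: a clean failure releases the key.
            del self._records[key]
            raise
        self._records[key] = Record(fingerprint=fp, status="completed",
                                    result=result)
        return result
```

Note that the stored record holds the final result, not just a "seen" flag, so a retry with the same key and payload gets the original outcome back without running the operation again.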

6.5 Idempotency Processing Flow

flowchart TD
	REQ[Incoming Request with Idempotency Key] --> LOOKUP{Key Exists?}
	LOOKUP -->|No| RESERVE[Reserve Key as In-Progress]
	RESERVE --> EXEC[Execute Business Operation]
	EXEC --> STORE[Store Final Result / Object Reference]
	STORE --> RESP[Return Response]
	LOOKUP -->|Yes same payload| RETURN[Return Existing Result]
	LOOKUP -->|Yes different payload| CONFLICT[Reject as Idempotency Conflict]

6.6 Storage Strategies

Strategy	Pros	Cons
Relational table with unique constraint	durable, transactional, simple to reason about	can become hot under heavy traffic
Redis plus durable backing store	low latency	needs careful durability and failover semantics
Business-object uniqueness only	simpler for narrow cases	not enough for generic API retries
Workflow engine state store	good for long-running operations	heavier architecture

In financial systems, a durable relational store is often the safest default for write paths that truly move money.

6.7 Expiration Strategies

Idempotency keys should not necessarily live forever, but expiring them too quickly is dangerous.

Considerations:

retry window of clients and background jobs
provider response delays
support workflows that may replay operations
fraud or abuse risks from unbounded storage

A practical approach is:

keep keys for a bounded period such as 24 hours or several days for payment creation APIs
rely on business-level uniqueness constraints for longer-term protection

6.8 Webhook Idempotency

Webhook handlers also need idempotency.

Use multiple layers:

deduplicate provider event IDs
also make object-state transitions idempotent
reject invalid backward transitions unless explicitly allowed

Why both matter:

the same event may be delivered multiple times
different events may refer to the same payment object
events may arrive out of order
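A minimal sketch of those layers, assuming a simplified linear set of payment states: the `WebhookProcessor` class and the `STATE_RANK` ordering are hypothetical, and real payment state machines have branches (failures, refunds, disputes) that a single rank cannot express.

```python
from typing import Dict, Set

# Assumed ordering of payment states; a higher rank is further along.
STATE_RANK = {"created": 0, "authorized": 1, "captured": 2, "settled": 3}


class WebhookProcessor:
    """Hypothetical handler; state is kept in memory for illustration."""

    def __init__(self) -> None:
        self.seen_event_ids: Set[str] = set()
        self.payment_state: Dict[str, str] = {}

    def process(self, event_id: str, payment_id: str, new_state: str) -> str:
        # Layer 1: drop events the provider has already delivered.
        if event_id in self.seen_event_ids:
            return "duplicate_event"
        self.seen_event_ids.add(event_id)

        current = self.payment_state.get(payment_id, "created")
        # Layer 2: a different event carrying the same state is a no-op.
        if new_state == current:
            return "no_change"
        # Layer 3: ignore out-of-order events that would move backwards.
        if STATE_RANK[new_state] < STATE_RANK[current]:
            return "ignored_backward"

        self.payment_state[payment_id] = new_state
        return "applied"
```

Event-ID deduplication alone would not catch the second and third cases, which is why the transition check sits behind it.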

6.9 Exactly-Once Myths

Exactly-once delivery across clients, APIs, workers, providers, and webhooks is not a realistic assumption.

What you want instead:

idempotent APIs
idempotent event processing
durable business object references
reconciliation to catch misses

6.10 Designing Safe Retries

Safe retry design usually means:

deterministic operation identity
state-machine-aware transitions
provider-side idempotency support when available
ability to query by external reference after ambiguous failures

6.11 Real-World Example: Duplicate Checkout Prevention

Imagine a customer presses "Pay" twice because the page looked frozen.

A robust system uses:

one order ID
one payment intent for that order
one client-generated idempotency key per payment create or confirm action
a unique constraint that only one successful charge can attach to the order

That layered design is how systems like Stripe-integrated checkouts avoid accidental duplicate charges.
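The last layer, the unique constraint, can be illustrated with a partial unique index. This sketch uses SQLite through Python's standard `sqlite3` module only because it is self-contained; the `charges` table, its columns, and the `record_success` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE charges (
        id INTEGER PRIMARY KEY,
        order_id TEXT NOT NULL,
        status TEXT NOT NULL,
        amount_minor INTEGER NOT NULL
    )
    """
)
# Partial unique index: at most one succeeded charge per order.
conn.execute(
    "CREATE UNIQUE INDEX one_success_per_order "
    "ON charges(order_id) WHERE status = 'succeeded'"
)


def record_success(order_id: str, amount_minor: int) -> bool:
    """Insert a succeeded charge; return False if one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO charges (order_id, status, amount_minor) "
                "VALUES (?, 'succeeded', ?)",
                (order_id, amount_minor),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Even if every layer above it fails, the database refuses to record a second successful charge for the same order, which turns a silent duplicate into a visible error.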

7. Reconciliation

Reconciliation is the process of comparing internal financial records with external records to ensure they match.

7.1 What Reconciliation Is

Reconciliation answers questions like:

did every captured payment in our system appear in provider reports?
did every provider settlement correspond to a ledger entry?
did our bank receive the expected payout amount?
are refunds, fees, and chargebacks reflected correctly?

7.2 Why Reconciliation Exists

Reconciliation exists because your code is not the only actor in the system.

Even if your application logic is correct, discrepancies still happen because of:

delayed webhooks
provider bugs or temporary outages
manual operations
settlement timing differences
duplicate or missing files
currency conversion and fee differences

This is why reconciliation is mandatory even when your code is "correct".

7.3 Internal vs External Reconciliation

Type	What it compares	Example
Internal reconciliation	your own systems against each other	order amount matches invoice, payment, and ledger postings
External reconciliation	your systems against providers and banks	captured payments match processor report and bank payout

Strong platforms do both.

7.4 Payment Provider Reconciliation

Provider reconciliation usually compares internal payment records against:

provider event feeds
settlement reports
fee reports
refund reports
dispute reports

Matching keys often include:

external payment ID
merchant reference or order ID
amount and currency
settlement date

7.5 Bank Reconciliation

Bank reconciliation compares expected cash movement against actual bank statements or payout reports.

This matters because provider-level success does not always mean the merchant bank account received the exact expected cash at the expected time.

7.6 Settlement Verification

Settlement verification answers:

was a captured payment settled?
was the correct fee deducted?
did the merchant or platform receive the expected net amount?
were refunds and chargebacks netted correctly?

Marketplace platforms care deeply about this because payouts to sellers depend on correct net settlement calculations.

7.7 Reconciliation Flow

flowchart LR
	INT[Internal Orders / Payments / Ledger] --> MATCH[Reconciliation Engine]
	PR[Processor Reports] --> MATCH
	BANK[Bank Statements / Payout Files] --> MATCH
	MATCH --> OK[Matched Records]
	MATCH --> EX[Exceptions Queue]
	EX --> OPS[Finance Ops Review]
	OPS --> FIX[Adjustments / Replays / Escalation]
	FIX --> LED[Ledger Adjustment or Operational Resolution]

7.8 Mismatch Detection

Common mismatch categories:

internal record exists but provider record missing
provider record exists but internal record missing
amount mismatch
currency mismatch
status mismatch
settlement date mismatch
duplicate external events
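Several of these categories can be detected with a simple keyed comparison. The sketch below is illustrative only: `PaymentRecord`, `reconcile`, and the category names are hypothetical, records are matched on external payment ID alone, and settlement-date and duplicate-event checks are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PaymentRecord:
    external_id: str
    amount_minor: int  # amount in minor units, e.g. cents
    currency: str
    status: str


def reconcile(internal: List[PaymentRecord],
              provider: List[PaymentRecord]) -> List[Tuple[str, str]]:
    """Return (external_id, mismatch_category) pairs, sorted by ID."""
    ours: Dict[str, PaymentRecord] = {r.external_id: r for r in internal}
    theirs: Dict[str, PaymentRecord] = {r.external_id: r for r in provider}

    exceptions: List[Tuple[str, str]] = []
    for ext_id in sorted(ours.keys() | theirs.keys()):
        a, b = ours.get(ext_id), theirs.get(ext_id)
        if b is None:
            exceptions.append((ext_id, "missing_at_provider"))
        elif a is None:
            exceptions.append((ext_id, "missing_internally"))
        elif a.currency != b.currency:
            exceptions.append((ext_id, "currency_mismatch"))
        elif a.amount_minor != b.amount_minor:
            exceptions.append((ext_id, "amount_mismatch"))
        elif a.status != b.status:
            exceptions.append((ext_id, "status_mismatch"))
    return exceptions
```

Because the output is deterministic and each exception carries an explicit category, the same job can be rerun safely and its results explained to finance operations.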

7.9 Missing Transaction Handling

When records are missing, the system should not just log and forget.

Typical workflow:

detect exception automatically
classify likely cause
attempt automatic replay or status refresh if safe
escalate to finance or payment ops queue
resolve via adjustment, support action, or provider escalation

7.10 Delayed Event Handling

Reconciliation systems need time windows and tolerance for delay.

Example:

a payment was captured on day 1
settlement report arrives on day 2
bank payout arrives on day 3

If your recon job assumes all systems should match instantly, it will generate noisy false positives.
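One way to encode that tolerance is a grace window that separates "still pending" from "confirmed mismatch". The function name and the three-day default below are assumptions for illustration; real windows depend on the provider, payment method, and banking calendar.

```python
from datetime import date, timedelta
from typing import Optional


def settlement_status(captured_on: date,
                      settled_on: Optional[date],
                      today: date,
                      grace_days: int = 3) -> str:
    """Classify a captured payment against its expected settlement."""
    if settled_on is not None:
        return "matched"
    # Inside the grace window a missing settlement is expected, not an error.
    if today <= captured_on + timedelta(days=grace_days):
        return "pending"
    return "confirmed_mismatch"
```

Only `confirmed_mismatch` results would normally reach the exceptions queue, which keeps the queue free of payments that are simply still in flight.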

7.11 Operational Workflows and Manual Review

Real finance systems need exception management.

Manual review may be required for:

ambiguous provider outcomes
chargeback evidence preparation
missing bank settlement
suspicious refund patterns
mismatched currency conversions

This is a good interview point: strong financial system design includes admin tools and ops queues, not just APIs and tables.

7.12 Best Practices and Common Mistakes

Best practices:

store raw provider reports and bank files immutably
keep reconciliation jobs rerunnable
use deterministic matching rules and explainable exception categories
distinguish pending mismatch from confirmed mismatch

Common mistakes:

relying only on webhooks and skipping provider reports
silently auto-correcting discrepancies without traceability
failing to account for settlement delay windows
not exposing exception queues to operations teams

8. Ledger Systems

The ledger is the core financial memory of the platform.

8.1 What a Ledger System Is

A ledger system records money movement as financial entries between accounts.

It is the source of truth for balances, obligations, and historical financial events.

Examples:

wallet balances
merchant payable balances
platform fee revenue entries
stored credits and promotional balances
payout obligations

8.2 Why a Ledger Exists

Applications often start with a naive design like this:

user.balance = user.balance + amount

That works until you need to answer:

why is the balance this number?
which transaction changed it?
can we reverse one specific event?
can we reproduce yesterday's state?
can auditors follow the trail?

The ledger exists because balances should be derived from recorded entries, not treated as unexplained mutable facts.

8.3 Append-Only Design

Good ledgers are append-only.

That means:

new financial events create new entries
existing entries are not casually edited or deleted
corrections are represented as reversing or compensating entries

This is essential for auditability.

8.4 Immutable Financial Records

Immutability matters because historical financial truth should be reconstructable.

If a platform changes an old ledger row in place, it may make today's balance look correct while destroying the explanation of how that balance was reached.

8.5 Double-Entry Bookkeeping Basics

Double-entry bookkeeping means every financial event is recorded against at least two accounts, and the posting stays balanced: total debits equal total credits.

The core idea is simple:

money cannot appear from nowhere
money cannot disappear without an offsetting explanation

In interviews, the exact accounting treatment is less important than the principle that every financial event should create balanced entries.

8.6 Debit and Credit Concepts

Developers often struggle here because debit and credit are not just synonyms for plus and minus.

They are directions whose effect depends on account type.

Useful interview-safe intuition:

assets and expenses typically increase with debits
liabilities, equity, and revenue typically increase with credits

You do not need to present a CPA-level lecture. You do need to show that a financial event should be represented as balanced movement across accounts, not one mutable balance update.

8.7 Example Ledger Posting

Suppose a platform charges a customer $100 and keeps a $3 fee while $97 is owed to the merchant.

One possible journal representation is:

Account	Entry
Processor receivable	Debit $100
Merchant payable	Credit $97
Platform fee revenue	Credit $3

The exact chart of accounts differs by business, but the balanced-entry principle does not.
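The same posting can be expressed as data with a balance check. This is a sketch under assumptions: the `Line` type, the `validate_journal_entry` function, and the account names are hypothetical, and amounts are held as integer cents.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Line:
    account: str
    direction: str     # "debit" or "credit"
    amount_minor: int  # amount in minor units (cents)
    currency: str


def validate_journal_entry(lines: List[Line]) -> None:
    """Raise ValueError unless the entry is a balanced single-currency posting."""
    if len(lines) < 2:
        raise ValueError("a journal entry needs at least two lines")
    if len({line.currency for line in lines}) != 1:
        raise ValueError("all lines must share one currency")
    for line in lines:
        if line.direction not in ("debit", "credit"):
            raise ValueError("unknown direction: " + line.direction)
        if line.amount_minor <= 0:
            raise ValueError("line amounts must be positive")
    debits = sum(l.amount_minor for l in lines if l.direction == "debit")
    credits = sum(l.amount_minor for l in lines if l.direction == "credit")
    if debits != credits:
        raise ValueError("unbalanced entry: %d != %d" % (debits, credits))


# The $100 charge with a $3 platform fee from the table above.
entry = [
    Line("processor_receivable", "debit", 10_000, "USD"),
    Line("merchant_payable", "credit", 9_700, "USD"),
    Line("platform_fee_revenue", "credit", 300, "USD"),
]
validate_journal_entry(entry)  # passes: 10,000 debit == 9,700 + 300 credit
```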

8.8 Balances Derived from Entries

A balance table may still exist for performance, but it should usually be a materialized or derived view of ledger entries, not the only source of truth.

This is a fundamental distinction.

8.9 Ledger vs Simple Balance Table

Ledger system	Simple balance table
append-only history	latest number only
supports reconstruction and audit	hard to explain changes
supports reversals and corrections	corrections overwrite history
good for compliance and reconciliation	fragile under concurrency and debugging
more complex to build	simpler initially

8.10 Ledger Posting Flow

flowchart LR
	EV[Business Event\npayment capture refund payout] --> POST[Posting Engine]
	POST --> RULES[Posting Rules / Chart of Accounts]
	RULES --> JE[Journal Entry with Balanced Lines]
	JE --> LED[Append-Only Ledger]
	LED --> BAL[Materialized Balances]
	LED --> STMT[Statements / Reporting / Reconciliation]

8.11 Reversals Instead of Updates

If a refund happens, the ledger should usually record a new reversing or compensating event instead of editing the original capture entry.

Why this matters:

preserves history
makes reconciliation explainable
supports financial close and audit workflows
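A minimal sketch of reversal by appending, with the balance derived from entries. To stay short it tracks signed amounts per account instead of full double-entry lines, which is a deliberate simplification; the `Ledger` and `Entry` names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    entry_id: str
    account: str
    amount_minor: int        # signed change to the account balance
    reverses: Optional[str]  # ID of the entry this one reverses, if any


class Ledger:
    """Append-only list of entries; nothing is edited or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    @property
    def entries(self) -> Tuple[Entry, ...]:
        return tuple(self._entries)

    def post(self, entry_id: str, account: str, amount_minor: int) -> Entry:
        entry = Entry(entry_id, account, amount_minor, None)
        self._entries.append(entry)
        return entry

    def reverse(self, original_id: str, reversal_id: str) -> Entry:
        matches = [e for e in self._entries if e.entry_id == original_id]
        if not matches:
            raise KeyError(original_id)
        if any(e.reverses == original_id for e in self._entries):
            raise ValueError("entry already reversed: " + original_id)
        original = matches[0]
        # The correction is a new entry that links back to the original.
        entry = Entry(reversal_id, original.account,
                      -original.amount_minor, original_id)
        self._entries.append(entry)
        return entry

    def balance(self, account: str) -> int:
        # Balances are derived from entries, never stored as the only truth.
        return sum(e.amount_minor for e in self._entries
                   if e.account == account)
```

After a reversal the balance returns to its earlier value, but both entries remain, so the history still explains how the balance got there.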

8.12 Auditability and Correctness Guarantees

Good ledger systems enforce invariants such as:

journal entries must balance
account and currency must be explicit
financial periods may be locked after close
external references must be preserved

8.13 Wallet and Platform Examples

Examples where ledgers matter deeply:

Stripe Connect-like marketplace balances
PayPal wallet balances
Uber driver earnings and adjustments
Shopify merchant payouts and fee deductions
SaaS customer credit balances

8.14 Common Mistakes

Common mistakes include:

using floating point for money
mixing currencies in the same account without explicit conversion events
updating balances directly without append-only entries
letting business services write arbitrary ledger rows without a posting engine
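The first two mistakes are easy to demonstrate. The sketch below shows the floating point problem and one possible guard against currency mixing; the `Money` type is a hypothetical illustration, not a reference to any particular library.

```python
from dataclasses import dataclass

# Binary floating point cannot represent most decimal fractions exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)  # False


@dataclass(frozen=True)
class Money:
    amount_minor: int  # integer minor units, e.g. cents
    currency: str      # explicit currency code, e.g. "USD"

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            # Mixing currencies requires an explicit conversion event.
            raise ValueError("currency mismatch: %s vs %s"
                             % (self.currency, other.currency))
        return Money(self.amount_minor + other.amount_minor, self.currency)


total = Money(10, "USD") + Money(20, "USD")  # exactly 30 cents
```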

8.15 Scaling Considerations

What changes at scale:

very hot accounts may need partitioning or sharding
balances may need snapshotting for fast reads
backfills need controlled replay semantics
period close and replay rules need strong governance

Best practice: centralize posting logic so product teams do not each invent their own accounting behavior.

9. Billing System

Billing is the system that decides what should be charged, when, and under what pricing rules.

9.1 Billing System Overview

Billing is not the same thing as payments.

System	Core question
Billing	what should this customer owe?
Payments	did we successfully collect money?
Invoicing	what formal document shows the charge?
Ledger	what are the immutable financial movements?

Why billing and payments are separate:

a customer can owe money before payment happens
payment can fail while the invoice remains valid
some businesses bill monthly but collect later by wire or ACH
some charges are usage-derived long after the product event happened

9.2 Billing Architecture

flowchart LR
	PE[Product Events\nAPI calls seats storage compute] --> MTR[Metering Ingestion]
	MTR --> AGG[Usage Aggregation]
	CAT[Plan Catalog\nprice books discounts] --> RATE[Rating Engine]
	AGG --> RATE
	SUB[Subscription Contracts] --> RATE
	RATE --> INV[Invoice Engine]
	INV --> PAY[Payment Collection]
	INV --> AR[Accounts Receivable]
	PAY --> LEDGER[Ledger]
	INV --> LEDGER
	LEDGER --> ACC[Accounting Export]
	PAY --> ENT[Entitlement / Service Control]
	INV --> NOTIFY[Invoice Email / Customer Portal]

9.3 Usage-Based vs Subscription Billing

Model	Example	Technical implications
Pure subscription	Netflix monthly plan	scheduled renewals, simpler predictable invoices
Usage-based	API calls, storage, compute hours	metering, aggregation, late events, rating complexity
Seat-based	GitHub or SaaS user seats	seat snapshots, proration, entitlement sync
Hybrid	Shopify plan plus transaction fees	combine recurring and usage or fee-derived line items

9.4 Pricing Model Design

Pricing is partly a product decision and partly a systems design problem.

Technical questions pricing creates:

can pricing rules be versioned?
can invoices explain the charge simply?
can finance reconcile it?
can sales and support understand it?
can the system backfill or re-rate if needed?

Bad pricing models are often hard not because the math is hard, but because they create confusing operational behavior.

9.5 Usage Tracking

Usage tracking powers usage-based billing and parts of seat-based or entitlement billing.

9.5.1 Usage Metering Basics

Metering means recording billable events such as:

API requests
storage GB-months
compute minutes or instance-hours
messages sent
seats active during a billing window

9.5.2 Event Ingestion

Meter events usually arrive through:

synchronous product write path
async message streams
log pipelines
batch imports from service usage systems

Best practice: persist raw usage events before aggregation so billing can be recomputed if needed.

9.5.3 Usage Aggregation

Aggregation transforms raw events into billable quantities.

Examples:

total API calls by account per day
average daily active seats in cycle
total storage byte-hours converted to GB-month

Aggregation often needs a dedicated pipeline because raw event volume is too large for invoice-time computation.

9.5.4 Deduplication

Usage events are often duplicated by retries or replay.

Dedup strategies include:

event IDs unique per producer
idempotent upserts into raw event store
windowed dedup during aggregation
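The first two strategies combine naturally: key each raw event by producer and event ID, and make ingestion a no-op when the key already exists. The `RawUsageStore` class below is a hypothetical in-memory stand-in for a durable raw event store.

```python
from typing import Dict, Tuple


class RawUsageStore:
    """Stores raw usage events keyed by (producer, event_id)."""

    def __init__(self) -> None:
        self._events: Dict[Tuple[str, str], Tuple[str, int]] = {}

    def ingest(self, producer: str, event_id: str,
               account: str, quantity: int) -> bool:
        """Return True if the event was stored, False if it was a duplicate."""
        key = (producer, event_id)
        if key in self._events:
            return False  # retry or replay: already recorded
        self._events[key] = (account, quantity)
        return True

    def total(self, account: str) -> int:
        """Sum the deduplicated usage for one account."""
        return sum(qty for acct, qty in self._events.values()
                   if acct == account)
```

Scoping the key by producer matters because event IDs are only assumed to be unique within a producer, not globally.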

9.5.5 Delayed and Late-Arriving Usage

Late usage is normal.

Examples:

a region buffers logs and ships them later
mobile device usage syncs late
a downstream service republishes events after outage recovery

This forces you to choose a policy:

hold invoice finalization until watermark is reached
close invoice on time and roll late usage into next cycle
issue an adjustment invoice later

There is no universal answer. It depends on customer expectations and finance policy.

9.5.6 Billing Windows

Billing windows define which usage belongs to which cycle.

You need clear answers for:

timezone and cutoff rules
how retries across midnight are handled
how backdated corrections are posted

9.5.7 Prepaid vs Postpaid Usage

Model	Meaning	Example
Prepaid	customer buys credits or balance in advance	ad platforms, wallet systems, prepaid API credits
Postpaid	usage is measured first, billed later	cloud compute, SaaS overage billing

Prepaid systems care more about balance control and real-time enforcement. Postpaid systems care more about accurate metering and invoice correctness.

9.5.8 Accuracy Guarantees

Real systems rarely guarantee perfect exactly-once event ingestion. Instead they aim for:

durable raw event retention
deterministic aggregation logic
replay capability
reconciliation between product metrics and billable usage

9.5.9 Fraud Prevention Basics

Usage billing can be abused.

Examples:

account takeover causing huge compute spend
self-generated fake usage for promotional credit abuse
duplicated or forged meter events

Basic controls:

signed or authenticated meter producers
anomaly detection on usage spikes
spend caps and alerts
quota-based temporary throttling

9.5.10 Real-World Examples

API platforms bill on requests or tokens consumed
cloud storage bills on byte-hours or GB-months
compute platforms bill on runtime duration or instance-hours
GitHub-like SaaS products may bill on active seats or premium features enabled

9.6 Subscription Plans

9.6.1 Pricing Models

Model	Example	System implications
Flat rate	one plan, one price	easiest billing, simplest invoices
Seat-based	per user or active seat	seat counting rules, proration, entitlement sync
Tiered pricing	first 100 units one rate, next 900 another	complex rating and customer explanation
Usage-based	pay per request, GB, minute	metering pipeline required
Hybrid	base subscription plus overages	multiple billing engines meet on one invoice
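Tiered pricing is the row that most often needs a worked example. The sketch below rates a quantity against graduated tiers, where each unit is charged at the rate of the tier it falls in. The `rate_tiered` function and the specific prices are assumptions for illustration, and some businesses instead use volume pricing, where one rate applies to all units.

```python
from typing import List, Optional, Tuple

# (units_in_tier, unit_price_minor); a size of None means "all remaining units".
Tier = Tuple[Optional[int], int]


def rate_tiered(quantity: int, tiers: List[Tier]) -> int:
    """Return the charge in minor units for graduated tiered pricing."""
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    remaining = quantity
    total = 0
    for size, unit_price in tiers:
        if remaining == 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * unit_price
        remaining -= used
    if remaining > 0:
        raise ValueError("quantity exceeds the defined tiers")
    return total


# Hypothetical price book: first 100 units at 10 cents,
# next 900 at 8 cents, everything beyond that at 5 cents.
TIERS: List[Tier] = [(100, 10), (900, 8), (None, 5)]
charge = rate_tiered(250, TIERS)  # 100 * 10 + 150 * 8 = 2200 cents
```

An invoice would typically show one line per tier used, so the customer can see how the total was reached.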

9.6.2 Feature Entitlements

Feature access should not be hard-coded to plan name strings.

Good design uses:

plan version
feature flags or entitlement policies
effective date ranges

Why this matters:

marketing renames plans
enterprise exceptions happen
grandfathered customers need older feature bundles

9.6.3 Enterprise Custom Plans

Enterprise contracts often include:

custom pricing
annual commitments
manual invoicing
negotiated payment terms like net 30 or net 60
true-up charges later

This is why billing platforms need enough flexibility to support both self-serve checkout and finance-managed accounts receivable flows.

9.6.4 Discounts, Coupons, and Promotions

Discount systems need careful modeling.

Questions to answer:

percent or flat discount?
one-time or recurring?
plan-limited or account-wide?
does it apply before or after tax?
does it affect revenue reporting?

9.6.5 Grandfathered Plans

When pricing changes, many existing customers keep old terms.

Engineering implication:

do not mutate the current plan price in place and assume history will still make sense
create new plan versions
store which version each subscription is attached to

9.6.6 Plan Versioning

Plan versioning is one of the most important billing design habits.

Without it, you cannot safely explain historical invoices after pricing or feature changes.

9.7 Refunds

Refunds connect customer support, payments, billing, ledger, and fraud controls.

9.7.1 Full and Partial Refunds

Refunds can be:

full refund of the whole captured amount
partial refund of specific amount or line items
service credit instead of cash refund

9.7.2 Refund Approval Flows

Not every refund should be a direct API call.

Common approval patterns:

automated refund within policy limits
support-initiated refund with permission checks
manager approval for large amounts
finance review for old or exceptional cases

9.7.3 Refund Timing Constraints

Refund behavior depends on payment stage.

if payment was only authorized, you may be able to void instead of refund
if captured but not fully settled, provider behavior may differ
bank transfer refunds may require separate payout instructions

9.7.4 Asynchronous Refund Completion

Refunds are often asynchronous.

That means the system should track states like:

refund requested
refund submitted to provider
refund pending
refund succeeded
refund failed
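These states can be guarded with an explicit transition table. The allowed transitions below are an assumption made for illustration (for example, whether a failed refund can be resubmitted is a policy decision), and the `transition` function name is hypothetical.

```python
from typing import Dict, Set

# Assumed transition graph for the refund states listed above.
ALLOWED: Dict[str, Set[str]] = {
    "requested": {"submitted", "failed"},
    "submitted": {"pending", "succeeded", "failed"},
    "pending": {"succeeded", "failed"},
    "succeeded": set(),  # terminal
    "failed": set(),     # terminal in this sketch
}


def transition(current: str, new: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if current not in ALLOWED:
        raise ValueError("unknown state: " + current)
    if new not in ALLOWED[current]:
        raise ValueError("invalid transition: %s -> %s" % (current, new))
    return new
```

Rejecting invalid moves at this layer means a late or duplicated provider update cannot push a succeeded refund back to pending.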

9.7.5 Ledger Reversal Handling

A refund should create compensating financial entries, not erase the original charge.

That usually means:

reduce receivable or cash position as appropriate
reduce merchant payable or reverse revenue where applicable
link refund entries to original payment event

9.7.6 Abuse Prevention and Fraud Considerations

Refund systems can be abused.

Examples:

compromised support account issuing fraudulent refunds
customer requesting repeated partial refunds across channels
refunding to a payment method not tied to the original transaction

Controls:

permissioned refund roles
approval thresholds
audit logs for every refund action
anomaly detection on refund velocity

9.7.7 Accounting Implications

Refunds affect more than payment state.

They can affect:

revenue reporting
tax adjustments
merchant payable balances
customer statements

9.7.8 Safe Operational Refund Design

flowchart LR
	SUP[Support / Customer Request] --> POLICY[Refund Policy Engine]
	POLICY --> APPROVE{Needs Approval?}
	APPROVE -->|Yes| MGR[Manager / Finance Approval]
	APPROVE -->|No| REQ[Create Refund Request]
	MGR --> REQ
	REQ --> PAY[Payment Provider Refund API]
	REQ --> LED[Ledger Reversal Pending]
	PAY --> WH[Refund Webhook / Status Update]
	WH --> LED2[Ledger Reversal Finalized]
	WH --> NOTIF[Customer Notification]
	WH --> AUD[Audit Log]

9.8 Audit Logs

Audit logs are not the same as application logs.

9.8.1 Why Audit Logs Matter

Audit logs answer questions like:

who issued this refund?
who changed billing settings?
when was a plan changed?
who updated payout bank details?
who overrode a failed payment and granted access?

9.8.2 Compliance and Operational Requirements

Financially relevant actions often require durable traceability for:

internal investigations
fraud reviews
regulatory or audit requests
customer disputes

9.8.3 What Audit Logs Typically Capture

actor identity
action type
target object
before and after state where allowed
timestamp
request ID or trace ID
approval context if applicable

9.8.4 Immutable Event History

Audit logs are usually append-only and write-restricted.

Why:

if admins can casually edit audit history, the log loses its value
investigations need evidence quality, not best-effort debugging

9.8.5 Admin Action Tracking

Particularly sensitive actions include:

refunds
manual captures
invoice voids
plan price changes
bank account or payout changes
permission grants for support and finance roles

9.8.6 Financial Investigation Workflows

When something goes wrong, investigators often need to correlate:

customer-facing event
internal admin actions
payment provider events
ledger postings
reconciliation exceptions

Strong systems make that traceability possible with consistent IDs.

9.8.7 Retention Strategy and Tamper Resistance

Common design choices:

long retention for financially relevant events
restricted delete permissions
append-only storage patterns
checksums or immutable storage layers for high-sensitivity contexts
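The checksum idea can be sketched as a hash chain, where each entry's hash covers its own content plus the previous entry's hash. The `AuditLog` class is a hypothetical in-memory illustration; it makes tampering detectable but does not prevent it, so real deployments pair it with restricted write access or immutable storage.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64


def _digest(body: Dict[str, str]) -> str:
    canonical = json.dumps(body, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log where every entry is chained to the previous one."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, str]] = []

    def append(self, actor: str, action: str, target: str) -> None:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = {"actor": actor, "action": action,
                "target": target, "prev_hash": prev_hash}
        self._entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev = GENESIS_HASH
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```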

9.8.8 Application Logs vs Audit Logs

Topic	Application logs	Audit logs
Purpose	debugging and observability	accountability and investigation
Retention	often shorter	often longer
Editability	may be reprocessed or rotated freely	should be tightly controlled
Structure	often operational and noisy	structured and action-focused
Example	HTTP 500 from refund API	admin user 42 approved refund of $500

Common mistake: assuming standard service logs are enough for auditability.

10. How These Systems Connect in Real Architecture

The best way to understand the whole stack is to follow a realistic end-to-end flow.

10.1 SaaS Subscription Example

Imagine a GitHub-like SaaS product with seat-based billing and annual plans.

Customer selects a plan and seat count.
Subscription service creates the contract and stores plan version.
Invoice engine generates the first invoice, applying discount and tax.
Payment service creates a payment intent and charges the default card.
Provider returns initial result, then sends async confirmation webhook.
Ledger posts the charge and resulting receivable or revenue flows.
Entitlement service activates the plan after policy conditions are met.
Renewal scheduler later creates the next billing cycle.
Usage and seat changes may create proration line items.
Reconciliation confirms settlement and fee correctness.

10.2 Marketplace Example

Imagine a Shopify-like or Uber-like platform.

Customer pays the platform.
Payment is authorized and later captured.
Ledger records customer payment, platform fee, and merchant or driver payable.
Refunds or chargebacks later create adjusting entries.
Payout system sends net funds to merchant or driver.
Reconciliation confirms provider settlement and bank payout.

This example shows why ledgers are central in platforms that hold and distribute money between parties.

10.3 Common Architectural Separation

Good financial architectures often split into services like:

checkout or payment orchestration
provider integration adapters
invoice engine
subscription service
metering and rating pipeline
ledger or accounting posting service
reconciliation service
audit and admin tooling

This separation exists because each domain has different correctness rules, scaling patterns, and operational teams.

11. Common Interview Discussions

When interviewers ask about financial systems, they usually care less about memorizing provider jargon and more about whether you understand failure and correctness.

11.1 Questions You Should Be Ready For

How do you prevent duplicate charges?
How do you model payment state transitions?
Why do you need a ledger instead of a balance column?
How do you handle provider webhooks arriving late or twice?
Why is reconciliation needed if your service is correct?
How would you support partial refunds and chargebacks?
How do you design subscription retries and dunning?
Why are billing and payments separate systems?

11.2 Strong Talking Points

Strong answers usually include:

idempotency at API and event-processing layers
explicit state machines for long-lived payment workflows
append-only ledger entries with reversals instead of mutation
asynchronous processing and webhook handling
reconciliation against provider and bank data
operational tooling for refunds, disputes, and manual review

11.3 Weak Talking Points

Weak answers often sound like:

"I would just store payment status in a table"
"I would make sure events are exactly once"
"If the request times out, I would retry"
"The invoice can just be edited if something changes"

Those answers ignore the real complexity of money systems.

12. Common Production Mistakes

These are the mistakes that repeatedly break real systems.

Using floating point for money.
Treating an initial payment success response as final truth without waiting for async confirmation where needed.
Not separating order, invoice, payment, and ledger state.
Forgetting idempotency on write APIs and webhook consumers.
Updating financial records in place instead of using reversals or versioned models.
Skipping reconciliation because "our DB is correct".
Designing no admin tooling for refunds, disputes, and exceptions.
Hard-coding plan names and pricing instead of versioning them.
Using a balance table as the only source of truth.
Ignoring currency, tax, and settlement timing edge cases.

13. Practical Best Practices Checklist

If you want an interview answer that also sounds production-ready, these are the habits to emphasize.

represent money in minor units with explicit currency
use business IDs and provider IDs together
design state machines, not booleans
keep financial records append-only where possible
make retries safe with idempotency
verify and store raw webhook events
build reconciliation jobs from the start
maintain audit logs for sensitive operations
version pricing, plans, and invoice-affecting rules
give operations teams visibility and controls

14. Final Mental Model

If you remember only one thing, remember this:

Financial systems are not just about moving money. They are about maintaining trust under retries, ambiguity, delay, failure, and scrutiny.

A strong design usually separates:

business events
billable calculation
payment execution
legal documents
immutable financial recording
external verification
operational accountability

That is the mental model behind real payment platforms, subscription businesses, marketplaces, and SaaS billing systems.

And that is also the mental model interviewers are usually trying to detect.
