@@ -0,0 +1,528 @@
# File 1: Foundations of C++

## Learning Goals

By the end of this file, you should be able to:

- explain how C++ source code becomes a running executable
- reason about basic types, object storage, and memory layout
- distinguish stack allocation from heap allocation in practical terms
- use pointers and references without treating them as magic syntax
- debug common low-level failures with a structured mental model

This file is the foundation for the rest of the guide. If later topics like RAII, smart pointers, iterators, or multithreading feel abstract, come back here first. C++ becomes much easier once you can picture what the compiler produces and what memory actually looks like at runtime.

## Why C++ Exists

C++ sits in an unusual position among mainstream languages. It gives you high-level abstractions such as classes, templates, exceptions, and a rich standard library, but it still lets you work close to the machine.

That combination is why C++ shows up in places where both abstraction and control matter:

- game engines that need tight performance and custom memory behavior
- trading systems that care about latency and predictable execution
- databases, compilers, browsers, and storage engines that manipulate large amounts of structured data
- embedded and systems code where resource use must be explicit

The core idea is not just “fast language.” Many languages are fast in some contexts. C++ is valuable because it lets you choose where to pay for abstraction and where to avoid it.

## The Compilation Model

### Intuition

In Python or JavaScript, you can often treat “running the code” as a direct action. In C++, there is a build pipeline between the source you write and the machine code the CPU executes. Understanding that pipeline helps explain many common C++ issues:

- why header files exist
- why template code often lives in headers
- why link errors happen even when code compiles
- why build systems matter so much in large codebases

### The Big Picture

```mermaid
flowchart LR
    A[Source files .cpp] --> B[Preprocessor]
    H[Header files .h .hpp] --> B
    B --> C[Compiler]
    C --> D[Object files .o]
    D --> E[Linker]
    L[Libraries] --> E
    E --> F[Executable or shared library]
```

### Preprocessing

Before the compiler sees your program, the preprocessor handles directives such as `#include`, `#define`, `#if`, and include guards.

What this means internally:

- `#include` is essentially textual inclusion
- macros are expanded before real compilation begins
- conditional compilation can remove or include chunks of code based on flags

That is why headers can feel deceptively simple. A header is not linked in as a separate unit. Its contents are copied into each translation unit that includes it.

Example:

```cpp
// math_utils.h
int add(int a, int b);

// main.cpp
#include "math_utils.h"
```

The compiler effectively sees the declaration from the header pasted into `main.cpp` before actual parsing.

### Compilation

The compiler parses the preprocessed source, checks types, builds intermediate representations, optimizes code, and emits object files.

A `.cpp` file plus all text included into it after preprocessing becomes a translation unit.

Practical consequence:

- syntax errors, type errors, and many template errors are compilation-time issues
- each translation unit is compiled independently
- the compiler only knows what declarations are visible in that translation unit

### Linking

The linker resolves symbol references across object files and libraries.

If you declare a function in a header but forget to provide the definition in a compiled source file, compilation may succeed while linking fails.

Example:

```cpp
// declared
int compute();

// used
int main() {
    return compute();
}
```

If no compiled object file contains a matching definition of `compute`, the linker reports an unresolved symbol.
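
To make the split between declaration and definition concrete, here is a minimal single-file sketch. It assumes a hypothetical caller named `run` and an arbitrary return value of 42; in a real project the definition of `compute` would more likely sit in its own source file and be joined to the caller by the linker.

```cpp
// Declaration: enough for the compiler to type-check calls to compute().
int compute();

// A caller that compiles using only the declaration above.
// `run` is a hypothetical helper introduced for this sketch.
int run() {
    return compute();
}

// Definition: this is what the linker has to find. In a multi-file
// project it would normally live in a separate translation unit.
// Without it, compilation still succeeds but the link step fails
// with an unresolved-symbol error.
int compute() {
    return 42;
}
```

Calling `run()` from `main` works because the linker can match the call inside `run` to the one definition of `compute`.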

### Practical Usage

This model matters constantly in real systems:

- large codebases use headers to expose interfaces and source files to hide implementation
- build time can explode if headers pull in too much code
- libraries are distributed as headers plus compiled binaries or as header-only template libraries
- ABI and symbol compatibility matter when separate teams ship shared libraries

### Common Pitfalls

- confusing compile errors with link errors
- putting non-inline function definitions in headers and causing multiple definition errors
- overusing macros when constants, `constexpr`, or templates would be safer
- including large dependency trees in headers, which slows builds and increases coupling
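
The macro pitfall is easiest to see side by side with a `constexpr` function. The sketch below is illustrative only: `SQUARE_MACRO` and `square` are hypothetical names, and the macro is deliberately written without parentheses to show what textual substitution does.

```cpp
// Function-like macro: pure text substitution, no types, no scoping.
#define SQUARE_MACRO(x) x * x

// constexpr function: type-checked, scoped, and its argument is
// evaluated to a single value before the body runs.
constexpr int square(int x) {
    return x * x;
}

// SQUARE_MACRO(1 + 2) expands to 1 + 2 * 1 + 2, which is 5, not 9.
constexpr int macro_result = SQUARE_MACRO(1 + 2);
constexpr int function_result = square(1 + 2);

static_assert(macro_result == 5, "textual expansion ignores grouping");
static_assert(function_result == 9, "the function receives the value 3");
```

Adding parentheses to the macro would fix this particular case, but the function version also respects scope and types, which is why it is usually the safer default.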

## Variables, Types, and Object Storage

### Intuition

A variable in C++ is not “just a name.” It is usually a named object with a type, storage duration, alignment requirements, and a region of memory associated with it.

The type system tells both the compiler and the reader what operations are legal and how many bytes an object likely occupies.

### What a Type Really Means

A C++ type typically determines:

- size, though this can vary by platform
- alignment requirements
- how the value is interpreted in memory
- what operations are available
- construction and destruction behavior for user-defined types

Consider:

```cpp
int count = 42;
double ratio = 0.5;
char flag = 'Y';
```

These values are all just bits in memory, but the type tells the compiler how to read and manipulate those bits.

### Value vs Representation

One useful systems-level habit is to separate a value from its representation.

For example, an `int` stores a signed integer value, but underneath it is represented in binary with an implementation-defined size, usually 32 bits on modern desktop and server platforms. A pointer stores an address value, but underneath it is also just bits.

This distinction matters when you debug memory corruption. The CPU does not know “this is a tree node” in some abstract sense. It only sees instructions and bytes. The meaning comes from your program's types and the compiler's generated code.
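
One way to see the gap between value and representation is to copy an object's bytes into an integer and look at the bit pattern. The sketch below uses a hypothetical helper `bits_of` and assumes `float` is a 32-bit IEEE 754 type, which is common but not guaranteed by the language; the `static_assert` lines make that assumption explicit.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption for this sketch: float is a 32-bit IEEE 754 type.
static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "expects IEEE 754 float");

// Copies the object representation of a float into an integer so the
// raw bit pattern can be inspected. std::memcpy is the portable way to
// do this; dereferencing a cast pointer would break aliasing rules.
std::uint32_t bits_of(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

Under those assumptions, `bits_of(1.0f)` yields `0x3F800000`: the same 32 bits that would read as a large integer if interpreted as `std::uint32_t`.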

### Storage Duration

Every object in C++ has a storage duration. At a practical level, that answers: when does this object come into existence, and when does its storage stop being valid?

The main categories are:

- automatic storage duration: usually local variables created when a scope is entered
- static storage duration: global variables and `static` locals that live for the life of the program
- dynamic storage duration: objects created explicitly on the heap, typically with `new` or via allocators
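
The three categories can appear within a few lines of each other. The following sketch uses hypothetical names (`request_count`, `next_id`, `dynamic_roundtrip`) and raw `new` and `delete` purely to make the lifetimes visible.

```cpp
int request_count = 0;  // static storage duration: exists for the whole program

// Returns 1, 2, 3, ... on successive calls.
int next_id() {
    static int last_id = 0;  // static storage duration: initialized once, kept across calls
    int step = 1;            // automatic storage duration: created and destroyed per call
    last_id += step;
    return last_id;
}

// Creates a heap object, reads it, and releases it again.
int dynamic_roundtrip() {
    int* value = new int(7);  // dynamic storage duration: lives until delete
    int copy = *value;
    delete value;             // storage is released here, not at scope exit
    return copy;
}
```

Calling `next_id()` twice returns 1 and then 2, because `last_id` survives between calls while `step` is recreated each time.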

Later, RAII and smart pointers will build directly on this idea.

## Stack vs Heap

### Intuition

Beginners often memorize “stack is fast, heap is slow.” That is too shallow and often misleading.

The real difference is about lifetime management and allocation strategy.

- stack allocation is usually automatic and scoped
- heap allocation is explicit or indirect and more flexible

### Mental Model

```mermaid
flowchart TB
    A[Program starts] --> B[Call main]
    B --> C[Create stack frame for main]
    C --> D[Call function]
    D --> E[Create another stack frame]
    E --> F[Return from function]
    F --> G[Frame removed automatically]
    C --> H[Heap objects may outlive function scope]
```

### Stack Allocation

Local variables inside a function usually live on the stack, though the exact implementation is up to the compiler and optimizer.

Example:

```cpp
void process() {
    int retries = 3;
    double threshold = 0.75;
}
```

Why it exists:

- function-local state is extremely common
- scoped lifetimes are easy to manage automatically
- creation and cleanup can often be handled without a general-purpose allocator

Internally, each function call usually gets a stack frame holding return information, saved registers, and local storage. When the function returns, that frame is popped.

Practical usage:

- temporary computation state
- small fixed-size objects
- ownership that should never outlive the current scope

Pitfalls:

- returning pointers or references to local variables
- allocating very large arrays on the stack and causing stack overflow
- assuming stack layout is fixed across compilers or optimization levels
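
The first pitfall is worth seeing in code. In this sketch the broken version is left as a comment so the block stays safe to compile, and `make_label` is a hypothetical function showing the usual fix of returning by value.

```cpp
#include <string>

// BROKEN (kept as a comment on purpose): `label` has automatic storage,
// so the returned pointer would dangle as soon as the function returns.
//
//   const std::string* make_label_dangling() {
//       std::string label = "retry";
//       return &label;
//   }

// Safe alternative: return by value. The caller receives its own object,
// and compilers typically elide or move the copy.
std::string make_label(int attempt) {
    std::string label = "retry #" + std::to_string(attempt);
    return label;
}
```

Returning by value keeps the lifetime question trivial: the result belongs to the caller and lives as long as the caller's variable does.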

### Heap Allocation

Heap allocation is used when an object's lifetime must outlive a scope, when size is only known at runtime, or when ownership must be transferred across components.

Example:

```cpp
int* value = new int(42);
delete value;
```

Internally, `new` usually asks an allocator for a chunk of dynamic memory, then constructs the object in that memory. `delete` destroys the object and releases the storage.

Practical usage:

- dynamic data structures such as graphs or trees
- objects shared across subsystems
- buffers sized from runtime input

Pitfalls:

- memory leaks from forgetting `delete`
- double delete from freeing the same pointer twice
- dangling pointers after deletion
- heap fragmentation and allocator overhead in performance-sensitive systems

Important note: in modern C++, direct `new` and `delete` should be rare in application code. Prefer containers and smart pointers. You still need to understand heap behavior because the abstractions are built on top of it.
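
As a rough illustration of that advice, the sketch below expresses the same two needs (one heap object, one runtime-sized buffer) without writing `new` or `delete`. The function names `make_value` and `make_buffer` are hypothetical, and `std::make_unique` assumes C++14 or later.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Single heap object with one clear owner; it is freed automatically
// when the returned std::unique_ptr is destroyed.
std::unique_ptr<int> make_value() {
    return std::make_unique<int>(42);
}

// Runtime-sized buffer; the vector owns and frees its heap storage.
std::vector<int> make_buffer(std::size_t size) {
    return std::vector<int>(size, 0);
}
```

Both functions still allocate on the heap. The difference is that cleanup is tied to the lifetime of the returned object rather than to a `delete` someone has to remember.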

## Pointers

### Intuition

A pointer is a value whose job is to hold the address of another object. That is all. It is powerful because it lets you refer to memory indirectly.

Pointers exist because systems software constantly needs indirect access:

- linked data structures
- optional access to objects
- efficient parameter passing without copying large objects
- polymorphic behavior through base-class pointers
- interaction with operating systems, hardware, and C APIs

### Basic Form

```cpp
int score = 99;
int* ptr = &score;
```

Here:

- `score` is an `int`
- `&score` means “address of score”
- `ptr` stores that address
- `*ptr` means “the int stored at that address”

### Pointer Relationship Diagram

```mermaid
flowchart LR
    P[ptr] -->|stores address| S[score in memory]
    S --> V[99]
```

### How It Works Internally

On a 64-bit system, a pointer is commonly 8 bytes. The compiler tracks the pointed-to type because pointer arithmetic and dereferencing depend on that type.

For example, incrementing an `int*` advances by `sizeof(int)` bytes, not by 1 byte.

```cpp
int values[3] = {10, 20, 30};
int* p = values;
++p; // now points to values[1]
```

The compiler scales the increment according to the pointed-to type.
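
The scaling can be observed directly by comparing the two addresses as byte pointers. This is a small sketch with a hypothetical function name, `int_pointer_step_in_bytes`; it only inspects addresses and does not depend on a particular size of `int`.

```cpp
#include <cstddef>

// Returns the distance in bytes between p and p + 1 for an int*.
std::size_t int_pointer_step_in_bytes() {
    int values[3] = {10, 20, 30};
    int* p = values;
    int* next = p + 1;  // one element further, i.e. &values[1]

    // Viewing the same addresses as byte pointers exposes the scaling.
    const char* first_byte = reinterpret_cast<const char*>(p);
    const char* next_byte = reinterpret_cast<const char*>(next);
    return static_cast<std::size_t>(next_byte - first_byte);
}
```

The result equals `sizeof(int)` on whatever platform the code is built for, which is typically 4.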

### Practical Usage

- traversal in low-level data structures
- API boundaries that may accept nullable inputs
- efficient manipulation of contiguous buffers
- ownership and lifetime control in specialized libraries or allocators

### Common Pitfalls

- dereferencing `nullptr`
- dereferencing uninitialized pointers
- using a pointer after the object it points to has been destroyed
- confusing ownership with access: a pointer can point to something without owning it

That last point is critical. A raw pointer does not tell you who is responsible for deleting the object.
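
A common convention, sketched here under the assumption that the container owns its elements, is to use a raw pointer only as a nullable, non-owning view. The function `find_first_negative` is a hypothetical example of that convention.

```cpp
#include <vector>

// Returns a non-owning pointer to the first negative element, or nullptr
// if there is none. The vector keeps ownership of the element: the
// caller must not delete the result and must not use it after the
// vector is modified or destroyed.
const int* find_first_negative(const std::vector<int>& values) {
    for (const int& value : values) {
        if (value < 0) {
            return &value;
        }
    }
    return nullptr;
}
```

Here the pointer communicates two things a reference could not: the result may be absent, and the caller is only borrowing access. Nothing in the type enforces that, so the comment documenting the contract carries real weight.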

## References

### Intuition

A reference is an alias to an existing object. It exists to make code safer and clearer than pointer-heavy interfaces when nullability and reseating are not needed.

Example:

```cpp
void increment(int& value) {
    ++value;
}
```

### Why References Exist

Without references, you would often pass pointers just to avoid copying objects. But pointers imply optionality and manual dereferencing.

References express a stronger contract:

- this function expects a valid object
- there is no need for null checks as part of normal usage
- the alias should behave like the original object

### Internal View

At the machine level, a reference is often implemented similarly to a pointer, but the language treats it differently.

Key properties:

- must be initialized when created
- cannot be reseated to refer to another object
- usually cannot be null in well-formed code
- use normal object syntax instead of pointer syntax

```mermaid
flowchart LR
    R[ref] -->|alias of| X[x]
```
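
The "cannot be reseated" property often surprises people, because assignment through a reference looks like rebinding. The sketch below uses a hypothetical function, `assign_through_reference`, to show what actually happens.

```cpp
#include <utility>

// Returns the final values of (x, y) after assigning through a reference.
std::pair<int, int> assign_through_reference() {
    int x = 1;
    int y = 2;

    int& ref = x;  // ref is bound to x for its whole lifetime
    ref = y;       // copies y's value into x; does NOT rebind ref to y
    ref = 10;      // still writes to x

    return {x, y};
}
```

After the function runs, `x` is 10 and `y` is still 2, which confirms that every write through `ref` landed in `x`.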

### Practical Usage

- passing large objects efficiently without copying
- operator overloading and fluent APIs
- returning aliases to subobjects when lifetime is guaranteed

### Pitfalls and Misconceptions

- a reference is not an independent object, and it does not manage the lifetime of the object it refers to
- returning a reference to a local variable is still invalid
- “references are always safer than pointers” is too simplistic; pointers are the right tool when optionality, reseating, or explicit low-level behavior is required

## Const Correctness

### Intuition

`const` is one of the cheapest ways to make C++ code easier to reason about. It restricts mutation and therefore reduces the number of possible program states.

### Practical Examples

```cpp
#include <string>

void print(const std::string& name);

const int limit = 100;
```

Why it matters in real systems:

- APIs become clearer about who is allowed to modify data
- the compiler can catch accidental writes
- reviewers can reason more quickly about ownership and side effects

### Common Pitfalls

- confusing `const int* p` with `int* const p`
- using `const` inconsistently across interfaces
- assuming `const` automatically implies thread safety or deep immutability
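
The first pitfall comes down to which side of the `*` the `const` is on. The two hypothetical functions below, `read_second` and `set_to_zero`, each allow one kind of modification and reject the other; the rejected lines are left as comments so the sketch compiles.

```cpp
// Pointer to const int: the pointee cannot be changed through p,
// but p itself can be pointed somewhere else.
// Precondition for this sketch: p points at an array of at least 2 ints.
int read_second(const int* p) {
    ++p;          // OK: reseating the pointer
    // *p = 0;    // error: cannot write through a pointer to const
    return *p;
}

// Const pointer to int: the pointer is fixed, the pointee is writable.
void set_to_zero(int* const p) {
    *p = 0;       // OK: modifying the pointee
    // ++p;       // error: p itself is const
}
```

A useful reading habit is to read the declaration right to left: `const int* p` is "pointer to int that is const", while `int* const p` is "const pointer to int".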

## Arrays, Decay, and Basic Memory Layout

### Intuition

C++ inherits much of C's memory model. Arrays are contiguous blocks of elements, which is why they are fast for indexed access and cache-friendly iteration.

```cpp
int values[4] = {1, 2, 3, 4};
```

The elements are stored adjacent in memory. That contiguity is why pointer arithmetic and array indexing are closely related.

### Under the Hood

`values[i]` is conceptually equivalent to `*(values + i)`.

This is powerful, but it is also why out-of-bounds access is dangerous. C++ does not automatically check bounds for raw arrays.

### Practical Usage

- numerical buffers
- serialization code
- high-performance loops
- interop with C libraries

### Pitfalls

- array-to-pointer decay in function parameters
- buffer overflows
- assuming stack arrays automatically know their size when passed to a function
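
Decay and its workarounds can be compared in a few lines. The names `sum_raw`, `length_of`, and `sum_array` are hypothetical; the sketch shows three ways a function can receive an array and what each one knows about its size.

```cpp
#include <array>
#include <cstddef>

// Takes a pointer: any array argument decays and its length is lost,
// so the caller has to pass the element count separately.
int sum_raw(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Takes a reference to an array: N is deduced, so the size survives.
template <std::size_t N>
std::size_t length_of(const int (&)[N]) {
    return N;
}

// std::array carries its size as part of the type.
template <std::size_t N>
int sum_array(const std::array<int, N>& values) {
    int total = 0;
    for (int value : values) {
        total += value;
    }
    return total;
}
```

With `sum_raw`, passing the wrong `count` is a silent out-of-bounds read. The other two forms make that mistake impossible to express.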

In most application code, prefer `std::array` for fixed-size arrays and `std::vector` for dynamic arrays. You will still see raw arrays in systems code, embedded code, and performance-critical paths.

## A Debugging Mental Model

### Intuition

Low-level bugs in C++ often feel mysterious only when you lack a runtime model. Most of the time, they reduce to one of a few categories:

- invalid lifetime
- invalid memory access
- wrong ownership
- incorrect assumptions about object state
- data races in concurrent code

### A Useful Diagnostic Loop

When debugging a crash or corruption issue, ask these questions in order:

1. What object was accessed?
2. Was it initialized?
3. Is its lifetime still valid?
4. Who owns it?
5. Could memory nearby have been overwritten?
6. Is the failure deterministic or timing-dependent?

That checklist is more valuable than memorizing debugger buttons.

### Common Failure Modes

#### Segmentation Faults

Usually caused by dereferencing an invalid address such as:

- `nullptr`
- a dangling pointer
- a wild pointer from uninitialized memory

#### Use-After-Free

You delete an object, but some pointer or reference still points to the old address. The address may still look valid for a while, which makes this class of bug subtle.

#### Stack Corruption

Often caused by out-of-bounds writes into local arrays or incorrect pointer arithmetic.

#### Memory Leaks

The program keeps allocating memory without freeing it. In long-running services, that becomes a production issue rather than just a test annoyance.

### Practical Tools

Real C++ debugging is easier when you use tooling, not just intuition:

- compiler warnings: start with strict warnings enabled
- AddressSanitizer: catches use-after-free, buffer overflows, and more
- UndefinedBehaviorSanitizer: catches many invalid language-level operations
- Valgrind on supported platforms: useful for leaks and invalid accesses
- debugger: inspect stack frames, variables, and memory addresses

Example build flags on Clang or GCC for local debugging:

```bash
-Wall -Wextra -Wpedantic -fsanitize=address,undefined -g
```

### Misconception to Avoid

“If it only crashes sometimes, the code is almost correct.”

In C++, nondeterministic behavior is often a sign of undefined behavior, not a minor bug. Once you have UB, the optimizer and runtime can produce very different outcomes from one build or machine to another.

## Foundation Patterns That Matter Later

Several later C++ ideas are really lifetime-management patterns built on the concepts above:

- constructors and destructors manage object setup and cleanup
- RAII ties resource lifetime to scope lifetime
- smart pointers model ownership on top of heap allocation
- containers hide raw memory management while preserving performance properties
- concurrency primitives rely on precise reasoning about storage and object lifetime

If you can already picture stack frames, heap allocation, pointer indirection, and the compile-link pipeline, you are ready for object-oriented and modern C++ design.

## Interview Checkpoints

You should be able to explain these clearly in an interview without hiding behind buzzwords:

- the difference between compilation and linking
- why headers can increase build time and coupling
- what stack and heap allocation really mean in terms of lifetime
- the difference between a pointer and a reference
- what causes dangling pointers and use-after-free bugs
- why `const` improves API design and reasoning

## What Comes Next

The next file builds on these memory and lifetime foundations to explain classes, constructors, destructors, inheritance, and polymorphism. The key shift is this: C++ object-oriented features are not separate from the memory model. They are layered on top of it.
@@ -0,0 +1,551 @@
# File 2: Core Object-Oriented C++

## Learning Goals

By the end of this file, you should be able to:

- explain what a C++ class actually represents in memory and in code
- reason about constructors, destructors, and object lifetime without hand-waving
- use encapsulation and abstraction to protect invariants
- distinguish inheritance from polymorphism and understand when each is appropriate
- recognize common object-oriented mistakes that cause subtle bugs in production C++

This file builds directly on the foundations from File 1. C++ object-oriented features are not separate from the memory model. A class is still a concrete object layout plus functions that operate on it.

## Why Object-Oriented Features Exist in C++

### Intuition

As programs grow, raw functions and primitive types stop being enough. You need a way to keep data and the rules for using that data together.

That is the heart of classes in C++:

- package state with behavior
- enforce invariants at the boundary
- model domain concepts clearly
- make ownership and lifetime more explicit

In real systems, object-oriented design is less about textbook hierarchy diagrams and more about making illegal states harder to represent.

## Classes and Objects

### Intuition

A class is a user-defined type. It describes:

- what data an object holds
- what operations are allowed on that data
- what rules govern creation, use, and destruction

An object is an instance of that type occupying real storage at runtime.

### Example

```cpp
class BankAccount {
public:
    explicit BankAccount(double starting_balance)
        : balance_(starting_balance) {}

    void deposit(double amount) {
        balance_ += amount;
    }

    bool withdraw(double amount) {
        if (amount > balance_) {
            return false;
        }
        balance_ -= amount;
        return true;
    }

    double balance() const {
        return balance_;
    }

private:
    double balance_;
};
```

### What Happens Internally

At runtime, an object usually contains only its data members. Member functions are not copied into every object. They are compiled as ordinary functions that receive an implicit object parameter, available inside the function as `this`.

Conceptually, this:

```cpp
account.deposit(50.0);
```

behaves like:

```cpp
deposit(&account, 50.0);
```

That is not exact source-level syntax, but it is the right mental model.
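
The mental model can be written out as real code by placing a member function next to a free function that takes the object pointer explicitly. `Account` and `deposit_free` are hypothetical names for this sketch, and the struct is kept public so both versions can touch the same field.

```cpp
struct Account {
    double balance = 0.0;

    // Member function: the object is passed implicitly as `this`.
    void deposit(double amount) {
        this->balance += amount;
    }
};

// Free-function equivalent: the object pointer is an explicit parameter.
// This mirrors what the member function does, written out by hand.
void deposit_free(Account* self, double amount) {
    self->balance += amount;
}
```

Calling `account.deposit(50.0)` and `deposit_free(&account, 50.0)` have the same effect on `balance`, and `sizeof(Account)` is unaffected by how many member functions the type has.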

### Object Layout Mental Model

```mermaid
flowchart LR
    A[BankAccount object] --> B[balance_ : double]
    C[Member functions] --> D[operate using this pointer]
```

### Practical Usage

Classes are useful when data has invariants:

- account balances should not go negative unless explicitly allowed
- sockets should not be used after closure
- file handles must be released exactly once
- caches should hide eviction details behind a stable API

### Common Pitfalls

- making everything public and losing invariant protection
- creating “data bag” classes that do not meaningfully model behavior
- assuming classes are automatically heap-allocated; in C++, class objects can live on the stack, in static storage, or on the heap

## Access Control, Encapsulation, and Abstraction

### Intuition

Encapsulation is about protecting internal state from invalid use. Abstraction is about exposing the right conceptual interface while hiding irrelevant details.

These are related but not identical.

- encapsulation protects data and invariants
- abstraction reduces cognitive load for callers

### How It Works

Access specifiers such as `public`, `private`, and `protected` control what code may access certain members.

In the `BankAccount` example, `balance_` is private. That forces all mutations to go through functions that can enforce rules.

### Why This Matters in Real Systems

Without encapsulation, every caller can put an object into a bad state. In a large codebase, that turns local correctness into a global burden.

Good class design moves validation and lifecycle rules into one place so they are not reimplemented badly in ten different subsystems.

### Example: Protecting an Invariant

```cpp
#include <stdexcept>

class Percentage {
public:
    explicit Percentage(int value) {
        // Reject out-of-range input before the object becomes usable.
        if (value < 0 || value > 100) {
            throw std::out_of_range("percentage must be between 0 and 100");
        }
        value_ = value;
    }

    int value() const {
        return value_;
    }

private:
    int value_;
};
```

If `value_` were public, every call site would need to remember the rule. That does not scale.

### Common Misconception

“Encapsulation means getters and setters for everything.”

No. Blind getters and setters often expose implementation details without preserving invariants. The better question is: what operations make sense for this domain object?

## Constructors

### Intuition

Constructors exist because objects often need to establish a valid initial state before they can be used safely.

This is not cosmetic. In C++, an object can represent a real system resource or a nontrivial invariant. Construction is where you set that up.

### Types of Constructors

Common constructor categories include:

- default constructor: creates an object with no arguments
- parameterized constructor: creates an object with explicit setup values
- copy constructor: creates a new object from an existing object
- move constructor: transfers resources from a temporary or expiring object

Copy and move are covered in depth in File 3. For now, focus on the fact that constructors are part of an object's lifecycle contract.

### Initialization Lists

Use member initializer lists when constructing members:

```cpp
#include <string>
#include <utility>

class User {
public:
    User(std::string name, int id)
        : name_(std::move(name)), id_(id) {}

private:
    std::string name_;
    int id_;
};
```

Why they exist:

- members are constructed before the constructor body runs
- some types must be initialized rather than assigned later
- initializer lists avoid unnecessary work

Internal detail:

If you assign inside the constructor body, the member is first default-constructed and then assigned to. Initializer lists construct it directly in its final state.
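
That difference can be made measurable with a small instrumented type. Everything in this sketch is hypothetical scaffolding: `Tracked` counts its own default constructions and assignments, and the two wrapper structs initialize the same kind of member in the two ways described above.

```cpp
// Hypothetical helper type that counts how it gets initialized.
struct Tracked {
    static int default_constructions;
    static int assignments;

    Tracked() { ++default_constructions; }
    explicit Tracked(int v) : value(v) {}

    Tracked& operator=(const Tracked& other) {
        value = other.value;
        ++assignments;
        return *this;
    }

    int value = 0;
};

int Tracked::default_constructions = 0;
int Tracked::assignments = 0;

// Assigns in the body: member is default-constructed first, then assigned.
struct AssignsInBody {
    explicit AssignsInBody(int v) { member = Tracked(v); }
    Tracked member;
};

// Uses the initializer list: member is constructed directly from v.
struct UsesInitializerList {
    explicit UsesInitializerList(int v) : member(v) {}
    Tracked member;
};
```

Constructing one `AssignsInBody` bumps both counters by one, while constructing a `UsesInitializerList` leaves them unchanged. For an `int` the extra step is free, but for types that allocate it can be real work.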

### Practical Usage

- initialize references and `const` members
- pass dependencies explicitly
- guarantee a valid object immediately after construction

### Common Pitfalls

- doing too much work in constructors, especially work that can fail in complex ways
- relying on member declaration order incorrectly; members are initialized in the order they are declared in the class, not the order written in the initializer list
- forgetting `explicit` on single-argument constructors that should not allow implicit conversion

## Destructors

### Intuition

If constructors establish a valid object, destructors clean it up. They exist because C++ objects often manage resources beyond plain memory:

- file descriptors
- mutexes
- sockets
- memory buffers
- database handles

### Example

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

class FileLogger {
public:
    explicit FileLogger(const std::string& path) {
        file_ = std::fopen(path.c_str(), "a");
        if (!file_) {
            throw std::runtime_error("failed to open log file");
        }
    }

    ~FileLogger() {
        if (file_) {
            std::fclose(file_);
        }
    }

    // Non-copyable: a copy would share file_ and close it twice.
    FileLogger(const FileLogger&) = delete;
    FileLogger& operator=(const FileLogger&) = delete;

private:
    std::FILE* file_ = nullptr;
};
```

### Object Lifecycle Diagram

```mermaid
flowchart LR
    A[Storage acquired] --> B[Constructor runs]
    B --> C[Object is usable]
    C --> D[Destructor runs]
    D --> E[Storage released]
```

### Internal View

When an object goes out of scope, its destructor runs automatically. For class members, destruction happens in reverse order of construction.

This reverse unwinding is critical. It is how C++ guarantees cleanup during normal scope exit and exception propagation.
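
Reverse-order destruction is easy to confirm with objects that log their own destruction. `Noisy` and `destruction_order` are hypothetical names used only for this sketch; the same ordering rule applies to locals in a scope, as shown here, and to members of a class.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that appends to a log when it is destroyed.
class Noisy {
public:
    Noisy(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {}

    ~Noisy() { log_.push_back("destroy " + name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Creates two locals and returns the order in which they were destroyed.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Noisy first("first", log);
        Noisy second("second", log);
    }  // scope exit: second is destroyed before first
    return log;
}
```

The returned log contains `"destroy second"` followed by `"destroy first"`. The ordering exists so that a later object, which may depend on an earlier one, is always torn down while its dependency is still alive.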
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- releasing OS resources
|
||||
- flushing buffered output
|
||||
- unlocking a mutex through a guard object
|
||||
- rolling back or committing scoped transactions
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- performing work in a destructor that can throw exceptions
|
||||
- forgetting that base and member destructors run automatically
|
||||
- assuming destruction order across unrelated objects is obvious
|
||||
|
||||
## RAII: Resource Acquisition Is Initialization
|
||||
|
||||
### Intuition
|
||||
|
||||
RAII is one of the most important ideas in C++. It ties resource lifetime to object lifetime.
|
||||
|
||||
The idea is simple:
|
||||
|
||||
- acquire the resource in the constructor
|
||||
- release it in the destructor
|
||||
- let scope determine cleanup
|
||||
|
||||
This is why modern C++ code can be both expressive and safe without a garbage collector.
|
||||
|
||||
### Why It Exists
|
||||
|
||||
Manual cleanup does not scale well in the presence of:
|
||||
|
||||
- early returns
|
||||
- exceptions
|
||||
- multiple code paths
|
||||
- partial initialization
|
||||
|
||||
RAII turns cleanup into a language-level guarantee rather than a discipline you hope every engineer remembers.
|
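To make the early-return and exception cases concrete, here is a small sketch. It assumes a hypothetical `HandleGuard` type whose "resource" is just a counter of open handles; in real code the counter would be a file, lock, or socket.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard: the counter stands in for a real resource.
class HandleGuard {
public:
    explicit HandleGuard(int& open_handles) : open_handles_(open_handles) {
        ++open_handles_;  // acquire in the constructor
    }
    ~HandleGuard() {
        assert(open_handles_ > 0);
        --open_handles_;  // release in the destructor
    }

    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;

private:
    int& open_handles_;
};

// Three exits: early return, exception, and normal return.
// The guard releases the handle on every one of them.
bool process(int input, int& open_handles) {
    HandleGuard guard(open_handles);
    if (input == 0) {
        return false;  // early return
    }
    if (input < 0) {
        throw std::invalid_argument("negative input");  // exception path
    }
    return true;  // normal path
}
```

No exit path in `process` mentions cleanup, yet the handle count returns to zero on all of them.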
||||
|
||||
### Example: Mutex Lock Guard
|
||||
|
||||
```cpp
|
||||
#include <mutex>

void update(std::mutex& mutex, int& value) {
    std::lock_guard<std::mutex> lock(mutex);  // locks here, unlocks at scope exit
    ++value;
}
|
||||
```
|
||||
|
||||
The mutex is locked when `lock` is constructed and automatically unlocked when `lock` goes out of scope.
|
||||
|
||||
### Real-World Usage
|
||||
|
||||
- file wrappers
|
||||
- transaction guards
|
||||
- scoped timers
|
||||
- custom allocator guards
|
||||
- lock management
|
||||
|
||||
### Misconception to Avoid
|
||||
|
||||
“RAII is only about memory.”
|
||||
|
||||
No. RAII is about any resource that must be released reliably.
|
||||
|
||||
## Inheritance
|
||||
|
||||
### Intuition
|
||||
|
||||
Inheritance exists to model an “is-a” relationship when a derived type should be usable where a base type is expected.
|
||||
|
||||
Used well, inheritance enables substitution and shared interfaces. Used poorly, it creates fragile hierarchies and confusing coupling.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double width, double height)
        : width_(width), height_(height) {}

    double area() const override {
        return width_ * height_;
    }

private:
    double width_;
    double height_;
};
|
||||
```
|
||||
|
||||
### Internal View
|
||||
|
||||
A derived object contains a base subobject plus its own members.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Rectangle object] --> B[Shape base subobject]
|
||||
A --> C[width_]
|
||||
A --> D[height_]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- plugin interfaces
|
||||
- GUI widget hierarchies
|
||||
- polymorphic simulation entities
|
||||
- abstractions over hardware or platform-specific implementations
|
||||
|
||||
### When Not to Use It
|
||||
|
||||
If you only want code reuse, composition is often better. Inheritance should model substitutability, not just convenience.
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- deep hierarchies that are hard to reason about
|
||||
- using inheritance for implementation reuse where composition is cleaner
|
||||
- base classes that expose too many assumptions about derived classes
|
||||
- object slicing when derived objects are copied into base objects by value
|
||||
|
||||
## Polymorphism
|
||||
|
||||
### Intuition
|
||||
|
||||
Polymorphism means “same interface, different implementation.” In C++, there are two major forms:
|
||||
|
||||
- runtime polymorphism: usually through virtual functions and base-class references or pointers
|
||||
- compile-time polymorphism: usually through templates or function overloading
|
||||
|
||||
Both matter in interviews and production code, but they solve different problems.
|
||||
|
||||
## Runtime Polymorphism
|
||||
|
||||
### How It Works
|
||||
|
||||
With `virtual` functions, the call target is chosen at runtime based on the dynamic type of the object.
|
||||
|
||||
```cpp
|
||||
void print_area(const Shape& shape) {
    std::cout << shape.area() << '\n';
}
|
||||
```
|
||||
|
||||
If `shape` refers to a `Rectangle`, `Rectangle::area()` runs.
|
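A fuller sketch of the same idea, restating `Shape` and `Rectangle` so the block stands alone. `Square` and `total_area` are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}
    double area() const override { return width_ * height_; }

private:
    double width_;
    double height_;
};

// Hypothetical second shape, added only for this sketch.
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }

private:
    double side_;
};

// The caller only knows about Shape; each call dispatches on the dynamic type.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double total = 0.0;
    for (const auto& shape : shapes) {
        assert(shape != nullptr);
        total += shape->area();
    }
    return total;
}
```

`total_area` never names `Rectangle` or `Square`, so new shapes can be added without touching it.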
||||
|
||||
### Internal Mechanics
|
||||
|
||||
The exact mechanism is not specified by the C++ standard, but the common implementation model is:
|
||||
|
||||
- polymorphic objects contain a hidden pointer, often called a vptr
|
||||
- that pointer refers to a virtual function table, or vtable
|
||||
- virtual calls use the vtable to resolve the correct function at runtime
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[shape reference] --> B[Rectangle object]
|
||||
B --> C[vptr]
|
||||
C --> D[vtable]
|
||||
D --> E[Rectangle::area]
|
||||
```
|
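The vptr and vtable model can be imitated by hand with plain structs and function pointers. This is only an illustration of the idea, not what any particular compiler emits; every name below (`ShapeVTable`, `ShapeHeader`, `ManualRectangle`, `call_area`) is hypothetical.

```cpp
#include <cassert>

struct ShapeHeader;

// One table per concrete type: a slot for each "virtual" function.
struct ShapeVTable {
    double (*area)(const ShapeHeader* self);
};

// Plays the role of the hidden vptr stored in every polymorphic object.
struct ShapeHeader {
    const ShapeVTable* vptr;
};

struct ManualRectangle {
    ShapeHeader header;  // must be the first member for the cast below
    double width;
    double height;
};

double rectangle_area(const ShapeHeader* self) {
    // Valid because header is the first member of a standard-layout struct.
    const auto* rect = reinterpret_cast<const ManualRectangle*>(self);
    return rect->width * rect->height;
}

const ShapeVTable rectangle_vtable{&rectangle_area};

// A "virtual call" in this model: load the vptr, load the slot, call through it.
double call_area(const ShapeHeader* shape) {
    assert(shape != nullptr && shape->vptr != nullptr);
    return shape->vptr->area(shape);
}
```

`call_area` knows nothing about rectangles, which is the same decoupling `virtual` provides. It also shows where the costs come from: one pointer per object and an indirect call.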
||||
|
||||
### Practical Usage
|
||||
|
||||
- runtime-selected backends
|
||||
- plugin systems
|
||||
- interface-driven architecture across modules
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- extra indirection
|
||||
- usually one pointer-sized overhead per polymorphic object
|
||||
- reduced inlining opportunities in some cases
|
||||
|
||||
These costs are often acceptable, but they are not free.
|
||||
|
||||
## Compile-Time Polymorphism
|
||||
|
||||
### Intuition
|
||||
|
||||
Sometimes you want generic behavior without runtime overhead. Templates enable this by generating type-specific code at compile time.
|
||||
|
||||
```cpp
|
||||
template <typename T>
T max_value(T a, T b) {
    return a < b ? b : a;
}
|
||||
```
|
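A short usage sketch: each distinct argument type makes the compiler generate a separate instantiation of `max_value`, and any type with `operator<` qualifies. `Price` is a hypothetical user-defined type.

```cpp
#include <cassert>
#include <string>

template <typename T>
T max_value(T a, T b) {
    return a < b ? b : a;
}

// Hypothetical user-defined type: it only needs operator< to work with max_value.
struct Price {
    long cents;
};

bool operator<(const Price& lhs, const Price& rhs) {
    return lhs.cents < rhs.cents;
}

void max_value_examples() {
    assert(max_value(3, 7) == 7);                                   // max_value<int>
    assert(max_value(2.5, 1.5) == 2.5);                             // max_value<double>
    assert(max_value(std::string("ab"), std::string("b")) == "b");  // max_value<std::string>
    assert(max_value(Price{150}, Price{99}).cents == 150);          // max_value<Price>
}
```

No base class or virtual function is involved; the "interface" is simply that `a < b` compiles.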
||||
|
||||
### Why It Exists
|
||||
|
||||
The standard library relies heavily on compile-time polymorphism because it allows generic, highly optimizable code.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- STL algorithms and containers
|
||||
- numeric and serialization libraries
|
||||
- policy-based design
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- template errors can be verbose and hard to read
|
||||
- heavy template usage can increase compile times
|
||||
- overengineering generic code can make APIs harder to understand
|
||||
|
||||
## Object Slicing
|
||||
|
||||
### Intuition
|
||||
|
||||
Object slicing happens when a derived object is copied into a base object by value. The derived-specific part is discarded.
|
||||
|
||||
```cpp
|
||||
Rectangle rectangle(3.0, 4.0);
|
||||
// Shape shape = rectangle;  // would slice in general; does not compile here because Shape is abstract
|
||||
```
|
||||
|
||||
In non-abstract hierarchies, this creates a new base object that no longer behaves like the original derived object.
|
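Because `Shape` is abstract, the effect is easier to observe with a concrete hierarchy. A minimal sketch, assuming hypothetical `Animal` and `Dog` classes:

```cpp
#include <cassert>
#include <string>

// Hypothetical concrete hierarchy, used only to make slicing observable.
class Animal {
public:
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
};

class Dog : public Animal {
public:
    std::string sound() const override { return "woof"; }
};

// Pass by value: only the Animal part of the argument is copied.
std::string by_value(Animal animal) { return animal.sound(); }

// Pass by reference: the dynamic type is preserved.
std::string by_reference(const Animal& animal) { return animal.sound(); }

void slicing_demo() {
    Dog dog;
    assert(by_reference(dog) == "woof");
    assert(by_value(dog) == "...");  // the Dog part was sliced away
}
```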
||||
|
||||
### Why It Matters
|
||||
|
||||
This bug appears when engineers store polymorphic objects by value instead of via pointers or references.
|
||||
|
||||
### Rule of Thumb
|
||||
|
||||
If you want polymorphism, use references or pointers to the base type, not base objects by value.
|
||||
|
||||
## Virtual Destructors
|
||||
|
||||
### Intuition
|
||||
|
||||
If a class is meant to be used polymorphically, it usually needs a virtual destructor.
|
||||
|
||||
Why:
|
||||
|
||||
- deleting a derived object through a base pointer must run the derived destructor first
|
||||
- without a virtual destructor, that `delete` is undefined behavior; in practice the derived destructor is skipped, causing leaks or broken invariants
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
class Base {
public:
    virtual ~Base() = default;
};
|
||||
```
|
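The sketch below shows the virtual destructor doing its job when an object is destroyed through a base pointer. `Derived` and `derived_destructor_runs` are hypothetical names; the flag exists only to make the destructor call observable.

```cpp
#include <cassert>
#include <memory>

class Base {
public:
    virtual ~Base() = default;
};

// Hypothetical derived type that records when its destructor runs.
class Derived : public Base {
public:
    explicit Derived(bool& destroyed) : destroyed_(destroyed) {}
    ~Derived() override { destroyed_ = true; }

private:
    bool& destroyed_;
};

bool derived_destructor_runs() {
    bool destroyed = false;
    {
        std::unique_ptr<Base> owner = std::make_unique<Derived>(destroyed);
        assert(!destroyed);
    }  // deletes through Base*; virtual dispatch reaches ~Derived first
    return destroyed;
}
```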
||||
|
||||
### Pitfall
|
||||
|
||||
Forgetting this is a classic bug and a common interview topic, because it shows whether you understand object destruction through base interfaces.
|
||||
|
||||
## Design Guidance for Real Systems
|
||||
|
||||
The most maintainable C++ systems usually follow these patterns:
|
||||
|
||||
- small classes with clear ownership boundaries
|
||||
- composition before inheritance
|
||||
- constructors that establish valid state immediately
|
||||
- destructors that make cleanup automatic and boring
|
||||
- polymorphism only where substitution is genuinely needed
|
||||
|
||||
Good C++ OOP is less about building clever hierarchies and more about making lifecycle and resource rules obvious.
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- what a class object contains at runtime
|
||||
- why initializer lists matter
|
||||
- what RAII solves that manual cleanup does not
|
||||
- the difference between inheritance and polymorphism
|
||||
- how virtual dispatch works conceptually
|
||||
- why polymorphic base classes usually need virtual destructors
|
||||
- what object slicing is and how to avoid it
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The next file shifts from basic object lifetime to modern ownership and resource management. That is where raw pointers, smart pointers, move semantics, and the Rules of 0, 3, and 5 all fit together.
|
||||
@@ -0,0 +1,438 @@
|
||||
# File 3: Memory Management and Modern C++
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- describe ownership clearly instead of saying “the pointer points there” and stopping
|
||||
- choose between raw pointers, references, and smart pointers based on lifetime semantics
|
||||
- explain copy vs move semantics with both intuition and internal mechanics
|
||||
- apply the Rule of 0, Rule of 3, and Rule of 5 in real code
|
||||
- design resource-managing types that behave predictably under exceptions and refactoring
|
||||
|
||||
This file builds on File 1 and File 2. Once you understand lifetime, construction, and destruction, modern C++ memory management becomes a set of ownership patterns rather than a pile of features.
|
||||
|
||||
## Why Modern C++ Changed Memory Management Style
|
||||
|
||||
### Intuition
|
||||
|
||||
Older C++ code often used raw `new` and `delete` directly. That approach exposes too much manual lifetime bookkeeping to everyday code.
|
||||
|
||||
Modern C++ tries to encode ownership in types so the compiler and API design help enforce the intended lifetime model.
|
||||
|
||||
The goal is not to hide memory. It is to make ownership explicit and failure-resistant.
|
||||
|
||||
### Ownership Vocabulary
|
||||
|
||||
Before discussing smart pointers, use precise terms:
|
||||
|
||||
- owning handle: responsible for cleanup
|
||||
- non-owning handle: can access an object but does not control its lifetime
|
||||
- exclusive ownership: exactly one owner at a time
|
||||
- shared ownership: multiple owners coordinate lifetime
|
||||
- observing reference: can see an object if it still exists, but does not keep it alive
|
||||
|
||||
This vocabulary matters in interviews and code reviews because “it works” is not enough. Engineers need to know who frees the resource and when.
|
||||
|
||||
## Raw Pointers Revisited
|
||||
|
||||
### Intuition
|
||||
|
||||
A raw pointer is best treated as a non-owning access mechanism unless documentation says otherwise.
|
||||
|
||||
Why this shift matters:
|
||||
|
||||
- a raw pointer by itself does not communicate ownership clearly
|
||||
- codebases that treat raw pointers as owning create leaks and double frees
|
||||
- most modern APIs reserve raw pointers for nullable or borrowed access
|
||||
|
||||
### Good Modern Interpretation
|
||||
|
||||
Use raw pointers when you need one of these semantics:
|
||||
|
||||
- optional access to an object
|
||||
- traversal without ownership transfer
|
||||
- interoperability with C APIs or low-level subsystems
|
||||
- custom memory systems where ownership is expressed elsewhere
|
||||
|
||||
### Pitfall
|
||||
|
||||
The problem is not that raw pointers are inherently bad. The problem is that ownership encoded only in comments is fragile.
|
||||
|
||||
## `std::unique_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::unique_ptr` represents exclusive ownership. One object owns the resource, and when that owner dies, the resource is released.
|
||||
|
||||
This is the closest high-level replacement for raw owning pointers.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
auto socket = std::make_unique<Socket>(config);

if (!socket->connect()) {
    return false;
}
|
||||
```
|
||||
|
||||
No manual `delete` is needed. Cleanup happens automatically.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
A `unique_ptr<T>` usually contains:
|
||||
|
||||
- a raw pointer to `T`
|
||||
- optionally a deleter object
|
||||
|
||||
It is move-only, not copyable. That restriction is the entire point. The type system prevents accidental duplicate ownership.
|
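A minimal sketch of what move-only means at a call site. `Packet`, `enqueue`, and `transfer_demo` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical payload type.
struct Packet {
    int id;
};

static_assert(!std::is_copy_constructible<std::unique_ptr<Packet>>::value,
              "unique_ptr is move-only");

// Taking unique_ptr by value documents that the callee takes ownership.
void enqueue(std::vector<std::unique_ptr<Packet>>& queue, std::unique_ptr<Packet> packet) {
    assert(packet != nullptr);
    queue.push_back(std::move(packet));
}

int transfer_demo() {
    std::vector<std::unique_ptr<Packet>> queue;
    auto packet = std::make_unique<Packet>(Packet{42});
    // enqueue(queue, packet);          // would not compile: no copy constructor
    enqueue(queue, std::move(packet));  // explicit ownership transfer
    assert(packet == nullptr);          // a moved-from unique_ptr is guaranteed to be null
    return queue.front()->id;
}
```

The ownership transfer is visible in the source as `std::move`, so duplicate ownership cannot happen by accident.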
||||
|
||||
### Why It Exists
|
||||
|
||||
Exclusive ownership is extremely common:
|
||||
|
||||
- a service owns a cache
|
||||
- a tree node owns its children
|
||||
- a parser owns a token buffer
|
||||
- a component owns a resource handle
|
||||
|
||||
`unique_ptr` makes that ownership explicit and exception-safe.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- return heap objects from factories
|
||||
- store polymorphic objects in containers
|
||||
- model tree-shaped ownership where each node has exactly one owning parent (a DAG node with several parents has no single owner, so `unique_ptr` alone does not fit)
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- copying is not allowed, so design APIs around moving or referencing
|
||||
- do not wrap stack objects in `unique_ptr`
|
||||
- avoid calling `release()` unless you are deliberately transferring responsibility
|
||||
|
||||
## `std::shared_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::shared_ptr` represents shared ownership. The object stays alive until the last owning `shared_ptr` goes away.
|
||||
|
||||
It exists for cases where a single clear owner does not exist.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
auto session = std::make_shared<Session>(config);
|
||||
worker_pool.add(session);
|
||||
monitor.attach(session);
|
||||
```
|
||||
|
||||
Both the worker pool and monitor may extend the lifetime of the same session object.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
`shared_ptr` typically uses a control block containing:
|
||||
|
||||
- the reference count for strong owners
|
||||
- the reference count for weak observers
|
||||
- deleter and allocator information
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[shared_ptr A] --> C[Control block]
|
||||
B[shared_ptr B] --> C
|
||||
C --> D[Managed object]
|
||||
C --> E[strong count]
|
||||
C --> F[weak count]
|
||||
```
|
||||
|
||||
When the strong count reaches zero, the managed object is destroyed. The control block itself can remain until weak references are gone.
|
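The counts can be watched through `use_count()` and `expired()`. A small sketch, assuming a hypothetical `Session` type (unrelated to any real API):

```cpp
#include <cassert>
#include <memory>

// Hypothetical shared resource.
struct Session {
    int id = 7;
};

void control_block_demo() {
    auto first = std::make_shared<Session>();
    assert(first.use_count() == 1);

    std::shared_ptr<Session> second = first;  // both handles share one control block
    std::weak_ptr<Session> observer = first;  // affects only the weak count
    assert(first.use_count() == 2);

    first.reset();
    assert(second.use_count() == 1);  // object is still alive
    assert(!observer.expired());

    second.reset();                   // strong count reaches zero
    assert(observer.expired());       // object destroyed; control block remains for observer
    assert(observer.lock() == nullptr);
}
```

`use_count()` is useful for demonstrations like this one, but treat it as a debugging aid rather than something to branch on in concurrent code.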
||||
|
||||
### Practical Usage
|
||||
|
||||
- asynchronous workflows where several components may need to keep work alive
|
||||
- graph-like application objects when ownership is genuinely shared
|
||||
- callback systems where tasks may outlive the originating scope
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- more memory overhead than `unique_ptr`
|
||||
- reference counting operations add runtime cost
|
||||
- shared ownership can make program structure harder to reason about
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- using `shared_ptr` by default instead of designing clear ownership
|
||||
- creating hidden lifetime extension that makes cleanup unpredictable
|
||||
- forming cycles that prevent destruction
|
||||
|
||||
## `std::weak_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`weak_ptr` exists because sometimes you need to observe a shared object without keeping it alive.
|
||||
|
||||
The classic use case is breaking reference cycles.
|
||||
|
||||
### Example of a Cycle Problem
|
||||
|
||||
If parent and child both store `shared_ptr` to each other, neither reference count reaches zero.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
P[Parent shared_ptr] --> C[Child object]
|
||||
C --> W[weak_ptr back to parent]
|
||||
```
|
||||
|
||||
With `weak_ptr`, the child can refer back to the parent without extending the parent's lifetime.
|
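The arrangement in the diagram can be sketched as follows. `Parent`, `Child`, and the `alive` counter are hypothetical; the counter only makes destruction observable.

```cpp
#include <cassert>
#include <memory>

struct Child;

// Hypothetical tree types; alive counts live objects so cleanup can be checked.
struct Parent {
    explicit Parent(int& alive) : alive_(alive) { ++alive_; }
    ~Parent() { --alive_; }
    std::shared_ptr<Child> child;  // parent owns child
    int& alive_;
};

struct Child {
    explicit Child(int& alive) : alive_(alive) { ++alive_; }
    ~Child() { --alive_; }
    std::weak_ptr<Parent> parent;  // back-reference does not own
    int& alive_;
};

int objects_left_after_scope() {
    int alive = 0;
    {
        auto parent = std::make_shared<Parent>(alive);
        parent->child = std::make_shared<Child>(alive);
        parent->child->parent = parent;
        assert(alive == 2);
        assert(parent.use_count() == 1);  // the weak back-reference adds no strong owner
    }
    return alive;  // 0 here; with a shared_ptr back-reference both objects would leak
}
```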
||||
|
||||
### How It Works
|
||||
|
||||
`weak_ptr` points to the same control block as `shared_ptr`, but it does not contribute to the strong owner count.
|
||||
|
||||
To use the object safely, call `lock()` to obtain a temporary `shared_ptr` if the object still exists.
|
||||
|
||||
```cpp
|
||||
if (auto parent = weak_parent.lock()) {
    parent->notify();
}
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- observer patterns
|
||||
- caches of shared resources
|
||||
- parent back-references in trees or graphs
|
||||
- asynchronous callback registries
|
||||
|
||||
### Pitfall
|
||||
|
||||
Do not assume the object is still alive just because a `weak_ptr` exists. Always revalidate via `lock()`.
|
||||
|
||||
## Copy Semantics
|
||||
|
||||
### Intuition
|
||||
|
||||
Copying means making another object with the same logical value.
|
||||
|
||||
For simple types, this is straightforward. For resource-owning types, copying becomes a design decision:
|
||||
|
||||
- should both objects own independent resources?
|
||||
- should copying be forbidden?
|
||||
- should copying be expensive or cheap?
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
std::string a = "trade";
|
||||
std::string b = a; // copy
|
||||
```
|
||||
|
||||
Here, `b` becomes its own string object with its own storage.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
For resource-owning classes, a correct copy operation often requires a deep copy, not a copied raw pointer. If two objects copy the same owning raw pointer blindly, both will try to free the same resource.
|
||||
|
||||
That is why copy control exists at all.
|
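A deep copy can be sketched with a deliberately old-style owning class. `IntBuffer` is a hypothetical type written with raw `new[]` and `delete[]` only to expose the copy operations; production code would normally use `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer that manages its storage by hand.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size) : size_(size), data_(new int[size]()) {}
    ~IntBuffer() { delete[] data_; }

    // Deep copy: allocate separate storage, then copy the elements.
    IntBuffer(const IntBuffer& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy-and-swap: build the copy first, then exchange state with it.
    IntBuffer& operator=(const IntBuffer& other) {
        IntBuffer copy(other);
        std::swap(size_, copy.size_);
        std::swap(data_, copy.data_);
        return *this;  // copy's destructor frees the old storage
    }

    int& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

private:
    std::size_t size_;
    int* data_;
};
```

With the compiler-generated copy operations instead, both objects would hold the same `data_` pointer and each destructor would call `delete[]` on it.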
||||
|
||||
## Move Semantics
|
||||
|
||||
### Intuition
|
||||
|
||||
Move semantics exist because copying expensive resources is often unnecessary. If an object is temporary or no longer needed, its resources can be transferred instead of duplicated.
|
||||
|
||||
This is one of the defining features of modern C++.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
std::vector<int> build_values() {
    std::vector<int> values = {1, 2, 3, 4};
    return values;
}
|
||||
```
|
||||
|
||||
In modern C++, returning `values` is efficient because the compiler can elide copies or move the vector's internal buffer.
|
||||
|
||||
### Transfer Mental Model
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[source object owns buffer] --> B[move operation]
|
||||
B --> C[destination now owns buffer]
|
||||
B --> D[source becomes valid but unspecified]
|
||||
```
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Moves typically transfer internal pointers, handles, or buffers from one object to another and leave the source object in a valid but unspecified state.
|
||||
|
||||
That phrase is important:
|
||||
|
||||
- valid means the source can still be destroyed safely
|
||||
- unspecified means you should not rely on its old value
|
||||
|
||||
### `std::move` Is a Cast, Not a Move by Itself
|
||||
|
||||
This is a common misconception.
|
||||
|
||||
`std::move(x)` does not move anything on its own. It casts `x` to an rvalue expression, signaling that moving is allowed if an appropriate move operation exists.
|
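Both points can be checked in a few lines: a real move steals the buffer, and `std::move` on a `const` object silently falls back to a copy. The function name `move_demo` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// std::move only changes the value category of the expression.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

void move_demo() {
    std::vector<int> source = {1, 2, 3, 4};
    const int* buffer = source.data();

    std::vector<int> destination = std::move(source);  // move constructor takes the buffer
    assert(destination.data() == buffer);              // same heap storage, no element copies
    assert(destination.size() == 4);

    source = {9};  // a moved-from object can safely be given a new value
    assert(source.size() == 1);

    // On a const object the cast cannot enable a move: the copy constructor is selected.
    const std::string fixed = "trade";
    std::string copy = std::move(fixed);
    assert(fixed == "trade" && copy == "trade");
}
```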
||||
|
||||
### Practical Usage
|
||||
|
||||
- returning large objects from functions
|
||||
- transferring ownership into containers or asynchronous tasks
|
||||
- avoiding unnecessary deep copies in performance-sensitive code
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- using moved-from objects as though they still contain the old value
|
||||
- writing move operations that forget to preserve class invariants
|
||||
- overusing `std::move` on values where copy elision or normal forwarding would be better
|
||||
|
||||
## Rule of 3, Rule of 5, and Rule of 0
|
||||
|
||||
### Rule of 3
|
||||
|
||||
If your class manually defines any of these, it probably needs all three:
|
||||
|
||||
- destructor
|
||||
- copy constructor
|
||||
- copy assignment operator
|
||||
|
||||
Why:
|
||||
|
||||
If your class manages a resource manually, the defaults may perform shallow copies that break ownership.
|
||||
|
||||
### Rule of 5
|
||||
|
||||
In modern C++, move constructor and move assignment operator join the list.
|
||||
|
||||
- destructor
|
||||
- copy constructor
|
||||
- copy assignment
|
||||
- move constructor
|
||||
- move assignment
|
||||
|
||||
If you manage resources manually, you likely need to think about all five.
|
||||
|
||||
### Rule of 0
|
||||
|
||||
The best modern outcome is often the Rule of 0: do not manually write special member functions at all. Instead, compose your class from well-behaved members such as `std::string`, `std::vector`, `std::unique_ptr`, and other RAII types.
|
||||
|
||||
That lets the compiler-generated defaults behave correctly.
|
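A brief sketch of the Rule of 0 in practice. `Order` and `Connection` are hypothetical types; neither declares a destructor, copy operation, or move operation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical value type: every member already manages itself.
struct Order {
    std::string symbol;
    std::vector<int> fills;
};

// Hypothetical handle type: a move-only member makes the whole type move-only.
struct Connection {
    std::string host;
    std::unique_ptr<int> handle;
};

static_assert(std::is_copy_constructible<Order>::value, "members are copyable, so Order is");
static_assert(std::is_move_constructible<Order>::value, "moves are generated as well");
static_assert(!std::is_copy_constructible<Connection>::value, "unique_ptr member disables copying");
static_assert(std::is_move_constructible<Connection>::value, "moving still works");

void rule_of_zero_demo() {
    Order a{"ACME", {10, 20}};
    Order b = a;  // compiler-generated deep copy
    b.fills.push_back(30);
    assert(a.fills.size() == 2 && b.fills.size() == 3);

    Order c = std::move(a);  // compiler-generated move
    assert(c.symbol == "ACME");
}
```

The ownership policy of each type follows from its members, so there is no hand-written special member function to get wrong.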
||||
|
||||
### Practical Guidance
|
||||
|
||||
- prefer Rule of 0 when possible
|
||||
- use Rule of 5 only when building true resource-managing types
|
||||
- if you write one special member function, stop and consider the others
|
||||
|
||||
## Resource Management Patterns
|
||||
|
||||
## Prefer RAII Wrappers Over Manual Cleanup
|
||||
|
||||
Wrap raw resources in types that own cleanup.
|
||||
|
||||
Examples:
|
||||
|
||||
- file descriptor wrapper
|
||||
- socket wrapper
|
||||
- scoped timer
|
||||
- custom allocator arena handle
|
||||
|
||||
## Prefer Containers Over Raw Dynamic Arrays
|
||||
|
||||
Instead of:
|
||||
|
||||
```cpp
|
||||
int* data = new int[count];
|
||||
```
|
||||
|
||||
prefer:
|
||||
|
||||
```cpp
|
||||
std::vector<int> data(count);
|
||||
```
|
||||
|
||||
Why:
|
||||
|
||||
- size information stays with the data structure
|
||||
- cleanup becomes automatic
|
||||
- resizing and range-aware APIs become available
|
||||
|
||||
## Use Views for Non-Owning Access
|
||||
|
||||
Modern C++ increasingly uses non-owning views such as `std::string_view` and `std::span` to express borrowed access without copying.
|
||||
|
||||
These are powerful, but they require lifetime discipline. A view is only valid while the underlying data is alive.
|
||||
|
||||
### Example Pitfall
|
||||
|
||||
Returning `std::string_view` to a temporary `std::string` creates a dangling view.
|
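A sketch contrasting safe and dangling use of `std::string_view` (C++17). `first_word` and `view_demo` are hypothetical names; the dangling variant is left commented out because running it would be undefined behavior.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Safe: the returned view points into storage owned by the caller's argument.
std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}

// Dangling (do not do this): the std::string dies when the function returns.
// std::string_view broken() {
//     std::string local = "temporary text";
//     return local;  // view into destroyed storage
// }

void view_demo() {
    const std::string line = "hello world";  // the owner outlives the view
    std::string_view word = first_word(line);
    assert(word == "hello");
    assert(word.data() == line.data());  // no copy was made
}
```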
||||
|
||||
## Exception Safety and Ownership
|
||||
|
||||
### Intuition
|
||||
|
||||
Memory management decisions matter most when control flow becomes non-linear. Exceptions, early returns, and partial initialization are exactly where manual cleanup breaks down.
|
||||
|
||||
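RAII and smart pointers make exception-safe code much easier to write because cleanup happens automatically during stack unwinding. On their own they prevent leaks; the stronger guarantees below still depend on how you structure each operation.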
|
||||
|
||||
### Practical Levels of Safety
|
||||
|
||||
Common exception-safety language:
|
||||
|
||||
- basic guarantee: no leaks, object remains valid
|
||||
- strong guarantee: operation either succeeds fully or has no observable effect
|
||||
- no-throw guarantee: operation cannot throw
|
||||
|
||||
You do not need to recite these mechanically, but you should understand how ownership design influences them.
|
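One common way to reach the strong guarantee is to do the risky work on a copy and commit with a non-throwing swap. The sketch below assumes a hypothetical `append_all` operation that rejects negative values.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical operation: append a batch, rejecting negative values.
// Strong guarantee: work on a copy, then commit with a non-throwing swap.
void append_all(std::vector<int>& target, const std::vector<int>& batch) {
    std::vector<int> staged = target;  // may throw; target untouched
    for (int value : batch) {
        if (value < 0) {
            throw std::invalid_argument("negative value in batch");
        }
        staged.push_back(value);  // may throw; target untouched
    }
    target.swap(staged);  // commit step; does not throw with the default allocator
}

void append_all_demo() {
    std::vector<int> values = {1, 2};
    try {
        append_all(values, {3, -1});
    } catch (const std::invalid_argument&) {
        // the failed call left values exactly as it was
    }
    assert(values.size() == 2);
}
```

The price is an extra copy of `target`. Appending directly to `target` would avoid it, but a failure midway would leave a partial batch behind, which is only the basic guarantee.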
||||
|
||||
## Common Modern C++ Pitfalls
|
||||
|
||||
- using `shared_ptr` to avoid thinking about ownership
|
||||
- mixing owning raw pointers with smart pointers ambiguously
|
||||
- forming `shared_ptr` cycles
|
||||
- assuming moved-from objects retain useful values
|
||||
- exposing raw references or pointers to internal data whose lifetime is not guaranteed
|
||||
- returning views to destroyed storage
|
||||
|
||||
## Real-World Design Examples
|
||||
|
||||
### Tree Ownership
|
||||
|
||||
Let each parent own its children through `unique_ptr`, and use non-owning raw pointers or references for parent back-links and traversal, since a child never owns its parent.
|
||||
|
||||
### Shared Async Work
|
||||
|
||||
Use `shared_ptr` when multiple asynchronous callbacks must keep an object alive until all work is finished.
|
||||
|
||||
### C API Wrapping
|
||||
|
||||
Use a custom RAII wrapper or `unique_ptr` with a custom deleter for resources acquired through legacy APIs.
|
||||
|
||||
```cpp
|
||||
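#include <cstdio>
#include <memory>

using FilePtr = std::unique_ptr<std::FILE, decltype(&std::fclose)>;

// Holds nullptr if the file cannot be opened; the deleter only runs for non-null pointers.
FilePtr open_file(const char* path) {
    return FilePtr(std::fopen(path, "r"), &std::fclose);
}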
|
||||
```
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- why raw pointers are weak ownership signals
|
||||
- when `unique_ptr` is preferable to `shared_ptr`
|
||||
- how `shared_ptr` uses a control block
|
||||
- what `weak_ptr` solves
|
||||
- the difference between copy and move semantics
|
||||
- why `std::move` does not itself move anything
|
||||
- when the Rule of 0 beats the Rule of 5
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The next file focuses on the standard library, especially the containers and algorithms that most production C++ code uses every day. Many STL design choices make much more sense once you understand ownership, moves, and lifetime.
|
||||
@@ -0,0 +1,399 @@
|
||||
# File 4: STL Deep Dive
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- choose standard containers based on access patterns, not habit
|
||||
- explain how core STL containers work internally
|
||||
- understand iterator categories and invalidation rules well enough to avoid subtle bugs
|
||||
- use algorithms library functions as first-class tools rather than optional extras
|
||||
- discuss STL complexity tradeoffs in interviews and system design conversations
|
||||
|
||||
This file assumes you already understand object lifetime, move semantics, and ownership. The STL is not “just a library.” It is a design philosophy built around generic programming, iterator-based abstraction, and predictable complexity.
|
||||
|
||||
## What the STL Is Trying to Solve
|
||||
|
||||
### Intuition
|
||||
|
||||
Most programs need the same families of operations:
|
||||
|
||||
- store collections of data
|
||||
- traverse them efficiently
|
||||
- search, sort, transform, filter, and aggregate
|
||||
|
||||
The STL gives reusable building blocks for those tasks while preserving performance transparency.
|
||||
|
||||
Its core ideas are:
|
||||
|
||||
- containers own and organize data
|
||||
- iterators provide a common traversal interface
|
||||
- algorithms operate over iterator ranges instead of hardcoding container types
|
||||
|
||||
That separation is one of the most important patterns in C++.
|
||||
|
||||
## `std::vector`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::vector` is the default dynamic array in C++. It stores elements contiguously and grows as needed.
|
||||
|
||||
If you do not have a strong reason to pick something else, `vector` is often the correct first choice.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
A vector typically stores:
|
||||
|
||||
- a pointer to a contiguous heap buffer
|
||||
- its current size
|
||||
- its current capacity
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[vector object] --> B[data pointer]
|
||||
A --> C[size]
|
||||
A --> D[capacity]
|
||||
B --> E[element 0]
|
||||
B --> F[element 1]
|
||||
B --> G[element 2]
|
||||
```
|
||||
|
||||
When capacity is exceeded, vector allocates a larger buffer, moves or copies elements into it, then frees the old buffer.
|
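Growth can be observed by watching `capacity()` change. A minimal sketch; `count_reallocations` is a hypothetical helper, and the exact growth factor is left to the implementation, so only the reserved case has a fixed answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often capacity changes while appending count elements.
// Each change means a new buffer was allocated and the elements were relocated.
std::size_t count_reallocations(std::vector<int>& values, std::size_t count) {
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t before = values.capacity();
        values.push_back(static_cast<int>(i));
        if (values.capacity() != before) {
            ++reallocations;
        }
    }
    return reallocations;
}

void reserve_demo() {
    std::vector<int> reserved;
    reserved.reserve(1000);  // one allocation up front
    assert(count_reallocations(reserved, 1000) == 0);
}
```

Without `reserve`, the same loop triggers a handful of reallocations on typical implementations, and each one invalidates every pointer, reference, and iterator into the vector.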
||||
|
||||
### Why It Exists
|
||||
|
||||
Contiguous storage gives major benefits:
|
||||
|
||||
- O(1) random access
|
||||
- strong cache locality
|
||||
- easy interop with C APIs and low-level buffers
|
||||
- efficient iteration and algorithm use
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- numeric data
|
||||
- event buffers
|
||||
- parsed records
|
||||
- task queues with append-heavy patterns
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- reallocation invalidates pointers, references, and iterators to elements
|
||||
- frequent small growth can cause repeated reallocations if capacity is not reserved
|
||||
- insertion in the middle is expensive because elements after the insertion point must shift
|
||||
|
||||
### Real Advice
|
||||
|
||||
If you know approximate size up front, call `reserve()`. That is one of the highest-value micro-optimizations in ordinary C++ code.
|
||||
|
||||
## `std::deque`
|
||||
|
||||
### Intuition
|
||||
|
||||
`deque` is a double-ended queue optimized for efficient insertion and removal at both ends while still supporting indexed access.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Unlike vector, deque is typically not one contiguous buffer. It is often implemented as a segmented structure: an index array (sometimes called a map, unrelated to `std::map`) of pointers to fixed-size blocks.
|
||||
|
||||
This avoids whole-buffer reallocation for growth at the front or back.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- queue-like workloads needing both front and back operations
|
||||
- sliding window logic
|
||||
- schedulers and work-stealing structures in some implementations
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- weaker cache locality than vector
|
||||
- assumptions about contiguity are invalid
|
||||
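- iterator invalidation rules differ from vector: inserting at either end invalidates all iterators but leaves references and pointers to existing elements valid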
|
||||
|
||||
## `std::list`
|
||||
|
||||
### Intuition
|
||||
|
||||
`list` is a doubly linked list. It exists because some workloads benefit from stable iterators and cheap insertion or removal at known positions.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Each node usually stores:
|
||||
|
||||
- the element value
|
||||
- pointer to previous node
|
||||
- pointer to next node
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Node] --> B[prev]
|
||||
A --> C[value]
|
||||
A --> D[next]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
In practice, far fewer workloads need `list` than many engineers assume. It can be useful when:
|
||||
|
||||
- you already hold iterators to splice locations
|
||||
- stable node addresses matter
|
||||
- frequent insertion and erasure in the middle dominate performance and traversal locality matters less
|
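The splice case is where `list` does something a vector cannot. A short sketch with hypothetical list names `pending` and `done`:

```cpp
#include <cassert>
#include <iterator>
#include <list>

void splice_demo() {
    std::list<int> pending = {1, 2, 3};
    std::list<int> done = {10};

    auto it = std::next(pending.begin());  // iterator to the node holding 2
    const int* address = &*it;

    // Relink that node to the end of done: no element is copied or moved.
    done.splice(done.end(), pending, it);

    assert(*it == 2);         // the iterator still refers to the same node
    assert(&*it == address);  // and the element did not change address
    assert(pending.size() == 2 && done.size() == 2);
    assert(done.back() == 2);
}
```

The transfer is O(1) and leaves every other iterator in both lists untouched.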
||||
|
||||
### Common Misconception
|
||||
|
||||
“List is better for lots of inserts and deletes.”
|
||||
|
||||
Only sometimes. Pointer chasing hurts cache locality badly. In many real workloads, vector still wins despite O(n) insertion because contiguous memory is so CPU-friendly.
|
||||
|
||||
## Associative Containers: `map` and `set`
|
||||
|
||||
### Intuition
|
||||
|
||||
Ordered associative containers maintain elements in sorted order and support logarithmic lookup, insertion, and removal.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
`std::map` and `std::set` are typically implemented as balanced binary search trees, commonly red-black trees.
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
A[8] --> B[4]
|
||||
A --> C[12]
|
||||
B --> D[2]
|
||||
B --> E[6]
|
||||
C --> F[10]
|
||||
C --> G[14]
|
||||
```
|
||||
|
||||
Why this matters:
|
||||
|
||||
- elements are kept ordered
|
||||
- lookup is O(log n)
|
||||
- iterating produces sorted order
|
||||
- node-based storage keeps iterators and references to other elements valid across insertions and erasures, unlike vector
|
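Ordered lookup is what separates `map` from a hash table. The sketch below assumes a hypothetical pricing table keyed by minimum quantity and finds the largest key that does not exceed a query.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

// Hypothetical price tiers keyed by minimum quantity.
// Returns the tier whose threshold is the largest key <= quantity.
std::string tier_for(const std::map<int, std::string>& tiers, int quantity) {
    auto it = tiers.upper_bound(quantity);  // first key > quantity, O(log n)
    if (it == tiers.begin()) {
        return "none";  // quantity is below every threshold
    }
    return std::prev(it)->second;
}

void tier_demo() {
    const std::map<int, std::string> tiers{{100, "bulk"}, {1, "retail"}, {10, "wholesale"}};
    assert(tiers.begin()->first == 1);  // iteration is sorted by key, not insertion order
    assert(tier_for(tiers, 5) == "retail");
    assert(tier_for(tiers, 10) == "wholesale");
}
```

An `unordered_map` has no equivalent of `upper_bound`; a "nearest key" query there would need a full scan.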
||||
|
||||
### Practical Usage
|
||||
|
||||
- ordered dictionaries
|
||||
- interval or range logic using `lower_bound` and `upper_bound`
|
||||
- workloads where sorted traversal is part of the contract
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- higher per-element overhead than vector-based approaches
|
||||
- poorer cache locality because nodes are separately allocated
|
||||
- using `map` by default when ordered traversal is not needed
|
||||
|
||||
## Hash-Based Containers: `unordered_map` and `unordered_set`
|
||||
|
||||
### Intuition
|
||||
|
||||
Hash-based containers optimize for average-case constant-time lookup rather than ordering.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
An `unordered_map` typically uses:
|
||||
|
||||
- a bucket array
|
||||
- a hash function to choose a bucket
|
||||
- collision handling, often with chains or equivalent node structures
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[key hash] --> B[bucket array]
|
||||
B --> C[bucket 0]
|
||||
B --> D[bucket 1]
|
||||
B --> E[bucket 2]
|
||||
D --> F[node -> node]
|
||||
```
|
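Frequency counting is a typical use. A minimal sketch; `count_words` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    counts.reserve(words.size());  // size the bucket array up front to avoid rehashing
    for (const auto& word : words) {
        ++counts[word];  // operator[] value-initializes missing entries to 0
    }
    return counts;
}

void count_words_demo() {
    auto counts = count_words({"buy", "sell", "buy", "hold", "buy"});
    assert(counts.size() == 3);
    assert(counts.at("buy") == 3);
    assert(counts.find("short") == counts.end());  // find does not insert
}
```

Note that `counts[word]` inserts a default entry when the key is missing; use `find` or `at` for read-only lookups.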
||||
|
||||
### Practical Usage
|
||||
|
||||
- caches
|
||||
- symbol tables
|
||||
- frequency counting
|
||||
- routing tables or registries when order does not matter
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- average O(1) lookup, but worst-case O(n)
|
||||
- memory overhead from buckets and nodes
|
||||
- iteration order is not stable or meaningful
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- bad custom hash functions hurt performance
|
||||
- rehashing invalidates all iterators, although pointers and references to elements remain valid
|
||||
- using `unordered_map` when deterministic iteration order is important
|
||||
|
||||
## Iterators
|
||||
|
||||
### Intuition
|
||||
|
||||
Iterators generalize traversal so algorithms can work across many containers.
|
||||
|
||||
Instead of writing one sorting routine for vectors and another for arrays, algorithms operate on iterator ranges.
|
||||
|
||||
### Categories Matter
|
||||
|
||||
Different iterators support different capabilities:
|
||||
|
||||
- input iterator: read sequentially
|
||||
- forward iterator: one-way multi-pass traversal
|
||||
- bidirectional iterator: move both forward and backward
|
||||
- random-access iterator: jump in constant time
|
||||
|
||||
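This is why `std::sort` works with vector iterators but not list iterators: it requires random-access iterators, and `std::list` provides only bidirectional ones. For a list, use the member function `list::sort()` instead.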
|
||||
|
||||
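A small sketch of how the category shows up in code. The `static_assert` lines query the standard `std::iterator_traits`; the function name `sort_demo` is only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// The iterator category is a compile-time property of the iterator type.
static_assert(std::is_same_v<
    std::iterator_traits<std::vector<int>::iterator>::iterator_category,
    std::random_access_iterator_tag>);
static_assert(std::is_same_v<
    std::iterator_traits<std::list<int>::iterator>::iterator_category,
    std::bidirectional_iterator_tag>);

inline void sort_demo() {
    std::vector<int> v = {3, 1, 2};
    std::sort(v.begin(), v.end());       // OK: random-access iterators

    std::list<int> l = {3, 1, 2};
    // std::sort(l.begin(), l.end());    // does not compile: bidirectional only
    l.sort();                            // member function that relinks nodes

    assert(std::is_sorted(v.begin(), v.end()));
    assert(std::is_sorted(l.begin(), l.end()));  // needs only forward iterators
}
```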
### Mental Model
|
||||
|
||||
Think of an iterator as a generalized cursor with container-specific guarantees.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- generic algorithms over different container types
|
||||
- decoupling traversal from storage details
|
||||
- writing reusable library code
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- invalidating iterators after insertions or erasures
|
||||
- dereferencing `end()`
|
||||
- assuming all iterators support the same operations
|
||||
|
||||
## Iterator Invalidation
|
||||
|
||||
### Intuition
|
||||
|
||||
This is one of the most frequent real-world STL bug sources. The container changes, but code keeps using old iterators, references, or pointers.
|
||||
|
||||
### Practical Rules of Thumb
|
||||
|
||||
- vector reallocation invalidates all iterators, pointers, and references to elements
|
||||
- `list` insertions never invalidate iterators; erasing invalidates only iterators to the erased node
|
||||
- unordered containers invalidate all iterators when a rehash occurs, but pointers and references to elements stay valid
|
||||
|
||||
Do not rely on vague memory here. For critical code, check the container's exact guarantees.
|
||||
|
||||
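Two common ways to stay on the safe side of these rules with `vector`, as a hedged sketch (the function names are hypothetical): index instead of holding iterators while the container grows, and continue from the iterator that `erase` returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends a copy of every element to the same vector.
// An index stays meaningful across reallocation; a saved iterator would not.
inline void duplicate_all(std::vector<int>& v) {
    const std::size_t original_size = v.size();
    v.reserve(original_size * 2);            // at most one allocation up front
    for (std::size_t i = 0; i < original_size; ++i) {
        const int value = v[i];              // copy before modifying the vector
        v.push_back(value);
    }
}

// Erasing inside a loop: continue from the iterator that erase() returns.
inline void erase_negatives(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end();) {
        if (*it < 0) {
            it = v.erase(it);                // the old `it` is invalid after this call
        } else {
            ++it;
        }
    }
}

inline void invalidation_demo() {
    std::vector<int> v = {1, -2, 3};
    duplicate_all(v);
    assert((v == std::vector<int>{1, -2, 3, 1, -2, 3}));
    erase_negatives(v);
    assert((v == std::vector<int>{1, 3, 1, 3}));
}
```

The erase loop is quadratic in the worst case; the erase-remove idiom later in this file does the same job in one linear pass.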
## Algorithms Library
|
||||
|
||||
### Intuition
|
||||
|
||||
The algorithms library exists so you can express intent at a higher level than manual loops while still staying efficient.
|
||||
|
||||
Common examples include:
|
||||
|
||||
- `std::sort`
|
||||
- `std::find_if`
|
||||
- `std::transform`
|
||||
- `std::accumulate`
|
||||
- `std::lower_bound`
|
||||
- `std::remove_if`
|
||||
|
||||
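A short sketch that exercises several of the algorithms listed above on one small vector. The data and the function name `algorithms_demo` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

inline void algorithms_demo() {
    std::vector<int> values = {5, 1, 4, 2, 3};
    std::sort(values.begin(), values.end());             // 1 2 3 4 5

    // Linear search: first element greater than 3.
    auto big = std::find_if(values.begin(), values.end(),
                            [](int v) { return v > 3; });
    assert(big != values.end() && *big == 4);

    // Binary search on the sorted range: first element >= 3.
    auto pos = std::lower_bound(values.begin(), values.end(), 3);
    assert(pos - values.begin() == 2);

    // Square each element into a second, pre-sized vector.
    std::vector<int> squares(values.size());
    std::transform(values.begin(), values.end(), squares.begin(),
                   [](int v) { return v * v; });

    // The type of the initial value is the type of the accumulator.
    const int total = std::accumulate(squares.begin(), squares.end(), 0);
    assert(total == 55);                                  // 1 + 4 + 9 + 16 + 25
}
```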
### Why It Matters
|
||||
|
||||
Algorithms make code:
|
||||
|
||||
- more declarative
|
||||
- easier to review
|
||||
- easier for the compiler to optimize consistently
|
||||
- less error-prone than handwritten index manipulation
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <algorithm>
#include <vector>

std::vector<int> values = {5, 1, 4, 2, 3};
std::sort(values.begin(), values.end());  // values is now {1, 2, 3, 4, 5}
```
|
||||
|
||||
You do not need to reimplement quicksort or mergesort in production code unless the problem specifically requires it.
|
||||
|
||||
### The Erase-Remove Idiom
|
||||
|
||||
This is a classic STL pattern:
|
||||
|
||||
```cpp
values.erase(
    std::remove_if(values.begin(), values.end(), [](int v) { return v % 2 == 0; }),
    values.end());
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- `remove_if` shifts the kept elements to the front, preserving their relative order, and leaves the tail in a valid but unspecified state
|
||||
- it returns the new logical end
|
||||
- `erase` actually shrinks the container
|
||||
|
||||
Understanding this pattern signals real STL fluency. Since C++20, `std::erase_if(values, pred)` performs both steps in one call.
|
||||
|
||||
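To make the two steps visible, the idiom can be split apart and checked after each step. This is an illustrative sketch using the same sample values as the sort example; `erase_remove_demo` is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

inline void erase_remove_demo() {
    std::vector<int> values = {5, 1, 4, 2, 3};

    // Step 1: remove_if only rearranges elements; size() is unchanged.
    auto new_end = std::remove_if(values.begin(), values.end(),
                                  [](int v) { return v % 2 == 0; });
    assert(values.size() == 5);
    assert(new_end - values.begin() == 3);   // three odd values kept in front

    // Step 2: erase drops the leftover tail [new_end, end()).
    values.erase(new_end, values.end());
    assert((values == std::vector<int>{5, 1, 3}));
}
```

Forgetting step 2 is the classic mistake: the container keeps its old size and the tail holds leftover values.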
## Complexity Cheat Sheet
|
||||
|
||||
### Sequence Containers
|
||||
|
||||
| Container | Random Access | Push Back | Push Front | Insert Middle | Iterator Stability |
| --- | --- | --- | --- | --- | --- |
| `vector` | O(1) | amortized O(1) | O(n) | O(n) | all invalidated on reallocation; otherwise from the insert or erase point onward |
| `deque` | O(1) | O(1) | O(1) | O(n) | any insertion invalidates iterators; insertion at either end keeps references valid |
| `list` | O(n) | O(1) | O(1) | O(1) with iterator | stable; only iterators to erased nodes are invalidated |
|
||||
|
||||
### Associative Containers
|
||||
|
||||
| Container | Lookup | Insert | Order | Typical Internal Structure |
| --- | --- | --- | --- | --- |
| `map` | O(log n) | O(log n) | sorted | balanced tree |
| `set` | O(log n) | O(log n) | sorted | balanced tree |
| `unordered_map` | average O(1) | average O(1) | none | hash table |
| `unordered_set` | average O(1) | average O(1) | none | hash table |
|
||||
|
||||
### Why Interviews Ask About This
|
||||
|
||||
Interviewers are usually not checking if you memorized tables. They want to know whether you can choose the right structure for a workload.
|
||||
|
||||
Examples:
|
||||
|
||||
- frequent append plus indexed reads: likely `vector`
|
||||
- ordered lookup with range queries: likely `map`
|
||||
- key lookup without ordering: likely `unordered_map`
|
||||
- middle splicing with stable iterators: maybe `list`, but verify locality costs first
|
||||
|
||||
## Container Selection in Real Systems
|
||||
|
||||
### Prefer `vector` More Often Than You Think
|
||||
|
||||
Because of contiguity, vector is often the fastest general-purpose container even when its theoretical complexity looks worse than a node-based alternative.
|
||||
|
||||
### Reach for Ordered Containers When Order Matters as Part of the Contract
|
||||
|
||||
If you need sorted traversal, nearest-key queries, or stable ordering semantics, `map` earns its cost.
|
||||
|
||||
### Use Hash Containers When Key Lookup Dominates and Order Does Not Matter
|
||||
|
||||
This is common in compilers, interpreters, caches, and service registries.
|
||||
|
||||
### Avoid Cargo-Culting `list`
|
||||
|
||||
Many engineers learn linked lists academically and then overestimate their usefulness in high-performance software.
|
||||
|
||||
## Common STL Pitfalls
|
||||
|
||||
- forgetting iterator invalidation rules
|
||||
- using `operator[]` on `map` or `unordered_map` when accidental insertion is undesirable
|
||||
- choosing containers by asymptotic complexity alone and ignoring memory locality
|
||||
- copying large containers accidentally when references or moves were intended
|
||||
- assuming all algorithms work with all iterator categories
|
||||
|
||||
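The `operator[]` pitfall is easy to demonstrate. In this sketch, `limit_for` and its `fallback` parameter are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Read-only lookup that never inserts; works on a const map.
inline int limit_for(const std::map<std::string, int>& limits,
                     const std::string& key, int fallback) {
    auto it = limits.find(key);
    return it != limits.end() ? it->second : fallback;
}

inline void lookup_demo() {
    std::map<std::string, int> limits = {{"free", 10}};
    assert(limit_for(limits, "pro", 0) == 0);
    assert(limits.size() == 1);        // find() left the map alone

    const int v = limits["pro"];       // operator[] inserts {"pro", 0}
    assert(v == 0);
    assert(limits.size() == 2);        // the probe changed the container
}
```

`operator[]` is not even available on a `const` map, which is one reason passing containers by `const` reference catches this mistake at compile time.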
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- why `vector` is often the default container
|
||||
- how vector reallocation works and why `reserve()` helps
|
||||
- the internal difference between `map` and `unordered_map`
|
||||
- what iterator categories mean in practice
|
||||
- why `std::sort` requires random-access iterators
|
||||
- how the erase-remove idiom works
|
||||
- why cache locality can beat seemingly better asymptotic complexity
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The final file moves from language and library mechanics into systems-level C++: threads, locks, atomics, performance work, and the patterns that show up in production engines, compilers, and low-latency systems.
|
||||
@@ -0,0 +1,436 @@
|
||||
# File 5: Advanced and Real-World Systems
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- explain the basics of C++ concurrency without treating it as a bag of library calls
|
||||
- reason about mutexes, atomics, and condition variables in terms of correctness and performance
|
||||
- identify practical optimization levers beyond “use a faster algorithm”
|
||||
- describe where C++ fits in real systems and why teams still choose it
|
||||
- connect language features and library choices to larger architectural patterns
|
||||
|
||||
This final file builds on everything before it. Concurrency depends on lifetime correctness. Performance depends on data layout and container choice. Systems design depends on clear ownership and predictable cleanup.
|
||||
|
||||
## Why C++ Is Still Used for Real Systems
|
||||
|
||||
### Intuition
|
||||
|
||||
C++ remains relevant because many systems need a rare combination:
|
||||
|
||||
- low-level control over memory and layout
|
||||
- high performance with minimal runtime overhead
|
||||
- strong abstraction tools for large codebases
|
||||
- portability across platforms and hardware
|
||||
|
||||
If your system is sensitive to latency, memory footprint, or hardware interaction, C++ is still one of the strongest options.
|
||||
|
||||
### Common Domains
|
||||
|
||||
- game engines
|
||||
- trading systems
|
||||
- browser engines
|
||||
- compilers and developer tools
|
||||
- databases and storage engines
|
||||
- robotics and embedded platforms
|
||||
- audio, graphics, and simulation systems
|
||||
|
||||
The rest of this file focuses on the patterns those systems rely on.
|
||||
|
||||
## Threads and Concurrency Basics
|
||||
|
||||
### Intuition
|
||||
|
||||
A thread is an independent path of execution within a process. Concurrency exists because real systems often need to overlap work:
|
||||
|
||||
- serving multiple requests
|
||||
- handling I/O while computing
|
||||
- parallelizing CPU-heavy workloads
|
||||
- keeping user interfaces responsive
|
||||
|
||||
### Basic Thread Model
|
||||
|
||||
```mermaid
flowchart TB
    A[Process] --> B[Thread 1]
    A --> C[Thread 2]
    A --> D[Thread 3]
    B --> E[Shared heap]
    C --> E
    D --> E
    B --> F[Own call stack]
    C --> G[Own call stack]
    D --> H[Own call stack]
```
|
||||
|
||||
Threads in the same process usually share heap memory but have separate stacks. That makes communication possible, but it also creates the risk of races.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <iostream>
#include <thread>

void worker(int id) {
    // Lines from the two threads may appear in either order or interleaved.
    std::cout << "worker " << id << " running\n";
}

int main() {
    std::thread t1(worker, 1);
    std::thread t2(worker, 2);
    t1.join();  // wait for both threads before main returns
    t2.join();
}
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- worker pools in backend services
|
||||
- background asset loading in game engines
|
||||
- compiler pipelines that parallelize parsing or optimization passes
|
||||
- real-time analytics pipelines
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- forgetting to `join()` or `detach()` a thread; destroying a still-joinable `std::thread` calls `std::terminate`
|
||||
- accessing shared state without synchronization
|
||||
- spawning too many threads instead of using task pools
|
||||
|
||||
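One way to sidestep the shared-state pitfall is to give each thread its own output slot and combine results after joining. The following is a sketch under that assumption; `parallel_sum` is a hypothetical helper, and a production version would also handle exceptions thrown while threads are being created.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` with `workers` threads. Each thread writes only partial[w],
// so no two threads ever touch the same object.
inline long long parallel_sum(const std::vector<int>& data, std::size_t workers) {
    if (workers == 0) workers = 1;
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;
    threads.reserve(workers);
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&data, &partial, w, chunk] {
            const std::size_t begin = std::min(w * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            partial[w] = std::accumulate(first, last, 0LL);
        });
    }
    for (auto& t : threads) t.join();   // join() also makes the writes visible here
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}

inline void parallel_sum_demo() {
    std::vector<int> data(1000);
    std::iota(data.begin(), data.end(), 1);       // 1, 2, ..., 1000
    assert(parallel_sum(data, 4) == 500500);
}
```

The adjacent `partial` slots can still suffer from false sharing, which affects speed but not correctness.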
## Data Races and Memory Visibility
|
||||
|
||||
### Intuition
|
||||
|
||||
A data race happens when multiple threads access the same memory concurrently, at least one access is a write, and there is no proper synchronization.
|
||||
|
||||
In C++, data races are not just “sometimes wrong.” They are undefined behavior.
|
||||
|
||||
### Why This Matters
|
||||
|
||||
Without synchronization, the compiler and CPU are free to reorder operations in ways that break naive assumptions about “obvious” execution order.
|
||||
|
||||
Concurrency bugs often come from incorrect mental models, not missing syntax.
|
||||
|
||||
### Practical Rule
|
||||
|
||||
If shared mutable state exists, you usually need one of:
|
||||
|
||||
- a mutex
|
||||
- an atomic type
|
||||
- message passing that avoids shared mutation
|
||||
|
||||
## Mutexes and Locking
|
||||
|
||||
### Intuition
|
||||
|
||||
A mutex protects a critical section so only one thread at a time can access a shared resource.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <mutex>

class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }

    int value() const {
        // Reads need the lock too; an unlocked read would race with increment().
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so const members can lock it
    int value_ = 0;
};
```
|
||||
|
||||
### Internal View
|
||||
|
||||
The exact implementation depends on the platform, but a mutex generally coordinates access through OS or low-level runtime primitives that block or spin until ownership can be acquired safely.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- protecting queues, maps, and caches
|
||||
- guarding shared configuration or metrics
|
||||
- making compound state transitions atomic at the application level
|
||||
|
||||
### `lock_guard` vs `unique_lock`
|
||||
|
||||
`std::lock_guard` is minimal and scope-bound.
|
||||
|
||||
`std::unique_lock` is more flexible and useful when you need:
|
||||
|
||||
- deferred locking
|
||||
- manual unlock before scope end
|
||||
- compatibility with condition variables
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- holding locks for too long
|
||||
- calling external or user-defined code while holding a lock
|
||||
- locking multiple mutexes in inconsistent order and causing deadlocks
|
||||
|
||||
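For the inconsistent-lock-order pitfall, C++17's `std::scoped_lock` can acquire several mutexes at once with a deadlock-avoidance algorithm. A minimal sketch, assuming a hypothetical `Account` type and `transfer` function:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct Account {              // hypothetical type for illustration
    std::mutex mutex;
    long balance = 0;
};

inline void transfer(Account& from, Account& to, long amount) {
    if (&from == &to) return;  // never lock the same std::mutex twice
    // Both mutexes are acquired together, so transfer(a, b) and
    // transfer(b, a) can run concurrently without deadlocking.
    std::scoped_lock lock(from.mutex, to.mutex);
    from.balance -= amount;
    to.balance += amount;
}

inline void transfer_demo() {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    assert(a.balance == 1000 && b.balance == 1000);
}
```

Before C++17, the same effect is available through `std::lock` combined with `std::unique_lock` and `std::defer_lock`.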
## Condition Variables
|
||||
|
||||
### Intuition
|
||||
|
||||
A condition variable lets one thread wait until a condition becomes true while releasing the mutex during the wait.
|
||||
|
||||
This avoids wasteful busy-waiting.
|
||||
|
||||
### Producer-Consumer Model
|
||||
|
||||
```mermaid
flowchart LR
    P[Producer thread] --> Q[Shared queue]
    Q --> C[Consumer thread]
    M[Mutex] --> Q
    CV[Condition variable] --> C
```
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>

std::mutex mutex;
std::condition_variable cv;
std::queue<int> queue;
bool done = false;  // shutdown flag; set it under the mutex, then notify

void producer() {
    {
        std::lock_guard<std::mutex> lock(mutex);
        queue.push(42);
    }
    cv.notify_one();  // notify after unlocking so the waiter can take the mutex
}

void consumer() {
    std::unique_lock<std::mutex> lock(mutex);
    // wait() releases the mutex while blocked and rechecks the predicate on wake-up.
    cv.wait(lock, [] { return !queue.empty() || done; });

    if (!queue.empty()) {
        int value = queue.front();
        queue.pop();
        std::cout << value << '\n';
    }
}
```
|
||||
|
||||
### Why the Predicate Matters
|
||||
|
||||
Condition variables can wake spuriously. Always wait with a predicate or recheck the condition in a loop.
|
||||
|
||||
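The same ideas can be packaged into a small queue with explicit shutdown. This is a sketch rather than a production design: `BlockingQueue` is a hypothetical class, it is unbounded, and it carries `int` values only to stay short.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>

class BlockingQueue {         // hypothetical name, not a standard type
public:
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(value);
        }
        cv_.notify_one();
    }

    // Signals that no more values will arrive and wakes every waiter.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a value is available; returns nullopt once closed and drained.
    std::optional<int> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty() || done_; });
        if (queue_.empty()) return std::nullopt;
        int value = queue_.front();
        queue_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<int> queue_;
    bool done_ = false;
};

inline void queue_demo() {
    BlockingQueue q;
    int sum = 0;
    std::thread consumer([&] {
        while (auto v = q.pop()) sum += *v;   // loop ends when pop() returns nullopt
    });
    for (int i = 1; i <= 100; ++i) q.push(i);
    q.close();
    consumer.join();
    assert(sum == 5050);
}
```

The predicate handles both spurious wake-ups and the case where `close()` happens before the consumer starts waiting.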
## Atomics
|
||||
|
||||
### Intuition
|
||||
|
||||
Atomics provide operations on shared values that can be performed safely without a mutex for certain patterns.
|
||||
|
||||
They are powerful, but they are not a general replacement for locks.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <atomic>

std::atomic<int> requests{0};  // brace-initialization also compiles before C++17

// From any thread:
requests.fetch_add(1, std::memory_order_relaxed);
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- counters and statistics
|
||||
- lock-free flags
|
||||
- reference counts and state transitions
|
||||
- specialized low-latency data structures
|
||||
|
||||
### Common Misconception
|
||||
|
||||
“Atomics are always faster than mutexes.”
|
||||
|
||||
Not necessarily. They can reduce blocking, but they can also introduce complexity, cache contention, and hard-to-debug ordering issues.
|
||||
|
||||
### Rule of Thumb
|
||||
|
||||
Use mutexes for protecting complex invariants. Use atomics for simple shared state or carefully designed low-level structures.
|
||||
|
||||
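Two patterns where atomics are a reasonable fit, sketched with made-up function names: a statistics counter that needs atomicity but no ordering, and a flag that publishes plain data with release/acquire ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

inline void atomic_counter_demo() {
    std::atomic<int> requests{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&requests] {
            for (int i = 0; i < 10000; ++i) {
                // relaxed: the increment is atomic but orders nothing else
                requests.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : threads) t.join();
    assert(requests.load() == 40000);
}

inline void publish_demo() {
    int payload = 0;                   // plain, non-atomic data
    std::atomic<bool> ready{false};

    std::thread producer([&] {
        payload = 42;                                   // write the data first
        ready.store(true, std::memory_order_release);   // then publish it
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        assert(payload == 42);  // visible because the acquire load saw the release store
    });
    producer.join();
    consumer.join();
}
```

If the flag used relaxed ordering for both operations, reading `payload` in the consumer would be a data race.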
## Concurrency Patterns That Actually Matter
|
||||
|
||||
### Producer-Consumer Queues
|
||||
|
||||
Classic in logging pipelines, background job systems, and network servers. One set of threads produces work; another consumes it.
|
||||
|
||||
Questions to think about:
|
||||
|
||||
- bounded or unbounded queue?
|
||||
- backpressure behavior?
|
||||
- shutdown semantics?
|
||||
|
||||
### Thread Pools
|
||||
|
||||
Instead of spawning threads per task, a fixed set of worker threads pulls tasks from a queue.
|
||||
|
||||
Why it exists:
|
||||
|
||||
- thread creation is not free
|
||||
- unbounded thread growth harms latency and memory use
|
||||
- controlled scheduling improves predictability
|
||||
|
||||
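A minimal fixed-size pool might look like the sketch below. `ThreadPool` is a hypothetical class, not a standard library type; it assumes tasks do not throw and offers no way to retrieve results or bound the queue.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t workers) {
        for (std::size_t i = 0; i < workers; ++i) {
            threads_.emplace_back([this] { run(); });
        }
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();  // workers drain the queue, then exit
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;   // only reachable when stopping_
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                           // run the task outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

inline void pool_demo() {
    std::atomic<int> done{0};
    {
        ThreadPool pool(4);
        for (int i = 0; i < 100; ++i) {
            pool.submit([&done] { done.fetch_add(1, std::memory_order_relaxed); });
        }
    }   // the destructor waits for all 100 tasks
    assert(done.load() == 100);
}
```

Running each task outside the lock follows the earlier advice about not calling user code while holding a mutex.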
### Read-Mostly Data
|
||||
|
||||
Some systems have frequent reads and rare writes. In those cases, techniques such as reader-writer locks, versioned snapshots, or immutable data replacement can outperform coarse locking.
|
||||
|
||||
### Message Passing
|
||||
|
||||
Sometimes the best way to avoid synchronization bugs is to avoid shared mutable state. Passing messages between components can simplify reasoning, especially in actor-like or staged architectures.
|
||||
|
||||
## Performance Optimization in C++
|
||||
|
||||
### Intuition
|
||||
|
||||
C++ gives you performance opportunities, but it also gives you enough rope to optimize the wrong thing. The right approach is disciplined measurement.
|
||||
|
||||
### Step 1: Measure Before Changing Code
|
||||
|
||||
Use profilers, tracing, and benchmarks. Do not trust intuition alone.
|
||||
|
||||
Real performance work usually asks:
|
||||
|
||||
- is the problem CPU, memory, I/O, or lock contention?
|
||||
- is the bottleneck algorithmic or microarchitectural?
|
||||
- is latency or throughput the primary goal?
|
||||
|
||||
### Step 2: Care About Data Layout
|
||||
|
||||
Cache behavior often dominates performance.
|
||||
|
||||
Contiguous memory and compact structures usually outperform pointer-heavy designs because CPUs like predictable access patterns.
|
||||
|
||||
This is why:
|
||||
|
||||
- `vector` often beats `list`
|
||||
- structure layout matters in hot paths
|
||||
- unnecessary indirection hurts
|
||||
|
||||
### Step 3: Reduce Unnecessary Allocation
|
||||
|
||||
Heap allocation can be expensive because it involves allocator overhead, synchronization in some allocators, and worse locality.
|
||||
|
||||
Practical techniques:
|
||||
|
||||
- reserve container capacity
|
||||
- reuse buffers
|
||||
- use arenas or pools when appropriate
|
||||
- prefer stack or embedded storage for small fixed-size data when it simplifies lifetime
|
||||
|
||||
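The first two techniques can be sketched briefly. The names `reserve_demo` and `format_line` are hypothetical; the capacity behavior of `std::string::clear` noted in the comments is what common implementations do rather than a guarantee.

```cpp
#include <cassert>
#include <string>
#include <vector>

inline void reserve_demo() {
    std::vector<int> samples;
    samples.reserve(1000);                  // one allocation up front
    const int* storage = samples.data();
    for (int i = 0; i < 1000; ++i) samples.push_back(i);
    assert(samples.data() == storage);      // no reallocation happened

    samples.clear();                        // size drops to 0 ...
    assert(samples.capacity() >= 1000);     // ... but the buffer is kept for reuse
}

// Writes into a caller-owned scratch buffer instead of returning a new string.
inline void format_line(int id, std::string& out) {
    out.clear();                            // typically keeps the existing capacity
    out += "id=";
    out += std::to_string(id);
}

inline void reuse_demo() {
    std::string line;                       // allocated at most a few times in total
    for (int id = 0; id < 3; ++id) format_line(id, line);
    assert(line == "id=2");
}
```

Whether either change matters in a given program is a measurement question, in line with Step 1.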
### Step 4: Choose the Right Granularity
|
||||
|
||||
Overly fine-grained locking and overly fine-grained tasks can both destroy performance. Coordination cost can outweigh useful work.
|
||||
|
||||
### Step 5: Respect the Compiler, but Verify
|
||||
|
||||
Compilers can inline, vectorize, reorder, and eliminate copies aggressively, but only when code structure allows it. Write clear code first, then inspect profiles and generated behavior if performance truly matters.
|
||||
|
||||
## Common Performance Pitfalls
|
||||
|
||||
- optimizing before measuring
|
||||
- using node-heavy containers in hot loops without considering locality
|
||||
- creating excessive temporary allocations
|
||||
- copying large objects accidentally instead of moving or borrowing them
|
||||
- adding threads to a workload that is actually memory-bound or lock-bound
|
||||
- false sharing, where independent per-thread counters sit on the same cache line and slow each other down even though no data is logically shared
|
||||
|
||||
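A common mitigation for false sharing is to pad each per-thread counter to its own cache line. The sketch below assumes a 64-byte line, which is typical on x86-64 but not universal; `PaddedCounter` and `count_in_parallel` are illustrative names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLine = 64;   // assumption: typical cache-line size

// alignas rounds both the alignment and the size up to one full cache line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};
static_assert(sizeof(PaddedCounter) == kCacheLine);
static_assert(alignof(PaddedCounter) == kCacheLine);

inline long count_in_parallel(std::size_t workers, long per_worker) {
    std::vector<PaddedCounter> counters(workers);   // one line per worker
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&counters, w, per_worker] {
            for (long i = 0; i < per_worker; ++i) {
                counters[w].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : threads) t.join();

    long total = 0;
    for (const auto& c : counters) total += c.value.load();
    return total;
}
```

The result is the same with or without padding; only a benchmark on the target hardware shows whether the padding pays for its extra memory.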
## C++ in Real Systems
|
||||
|
||||
### Game Engines
|
||||
|
||||
Why C++ fits:
|
||||
|
||||
- control over memory layout and custom allocators
|
||||
- tight frame budgets
|
||||
- performance-sensitive math, rendering, and asset systems
|
||||
- need for portable native code across platforms
|
||||
|
||||
Common themes:
|
||||
|
||||
- entity-component systems
|
||||
- data-oriented design
|
||||
- custom resource streaming
|
||||
|
||||
### Trading Systems
|
||||
|
||||
Why C++ fits:
|
||||
|
||||
- low latency matters more than developer convenience in hot paths
|
||||
- careful control over allocations and CPU behavior
|
||||
- direct integration with network stacks and specialized hardware
|
||||
|
||||
Common themes:
|
||||
|
||||
- lock minimization
|
||||
- cache-aware data structures
|
||||
- careful measurement of tail latency
|
||||
|
||||
### Compilers and Developer Tools
|
||||
|
||||
Why C++ fits:
|
||||
|
||||
- large in-memory graph and tree structures
|
||||
- need for performance across parsing, semantic analysis, and optimization
|
||||
- portable command-line tooling
|
||||
|
||||
Common themes:
|
||||
|
||||
- arenas and bump allocators
|
||||
- ownership-aware AST design
|
||||
- string interning and symbol tables
|
||||
|
||||
## Design Patterns in C++
|
||||
|
||||
### RAII
|
||||
|
||||
In C++, RAII is more than a pattern. It is one of the language's core architectural strengths.
|
||||
|
||||
### Strategy
|
||||
|
||||
Useful when behavior varies but the call site should stay stable. This may be implemented with virtual interfaces, templates, or function objects depending on runtime vs compile-time needs.
|
||||
|
||||
### Factory
|
||||
|
||||
Useful when object creation logic is complex or ownership should be centralized. Modern C++ factories often return `unique_ptr` to make ownership explicit.
|
||||
|
||||
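A factory of this shape can be sketched as follows. The `Codec` interface, its two implementations, and the `make_codec` function are hypothetical names used only to show ownership flowing out through `unique_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Codec {                               // hypothetical interface
    virtual ~Codec() = default;              // virtual destructor for safe deletion via base
    virtual std::string name() const = 0;
};

struct GzipCodec : Codec {
    std::string name() const override { return "gzip"; }
};

struct NullCodec : Codec {
    std::string name() const override { return "none"; }
};

// Returns nullptr for unknown kinds, so callers must handle failure explicitly.
inline std::unique_ptr<Codec> make_codec(const std::string& kind) {
    if (kind == "gzip") return std::make_unique<GzipCodec>();
    if (kind == "none") return std::make_unique<NullCodec>();
    return nullptr;
}

inline void factory_demo() {
    std::unique_ptr<Codec> codec = make_codec("gzip");
    assert(codec && codec->name() == "gzip");
    assert(make_codec("lz77") == nullptr);
}   // codec is destroyed here; no manual delete
```

The caller can later convert the result to a `shared_ptr` if shared ownership turns out to be needed; the reverse conversion is not possible.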
### Observer
|
||||
|
||||
Useful for event systems, but dangerous if lifetime is not carefully managed. Weak references, scoped subscriptions, or explicit unregistering are essential.
|
||||
|
||||
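One way to tie a subscription's lifetime to its owner is to have the source hold only weak references. This is a single-threaded sketch; `EventSource` and its `Handler` alias are hypothetical, and a concurrent version would need its own synchronization.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

class EventSource {
public:
    using Handler = std::function<void(int)>;

    // The source stores a weak reference; the subscriber owns the handler.
    void subscribe(const std::shared_ptr<Handler>& handler) {
        handlers_.push_back(handler);
    }

    void emit(int value) {
        // Drop subscriptions whose owner has gone away.
        handlers_.erase(
            std::remove_if(handlers_.begin(), handlers_.end(),
                           [](const std::weak_ptr<Handler>& w) { return w.expired(); }),
            handlers_.end());
        // Iterate over a copy so a handler may subscribe during the callback.
        const auto snapshot = handlers_;
        for (const auto& w : snapshot) {
            if (auto h = w.lock()) (*h)(value);   // lock() keeps the handler alive for the call
        }
    }

private:
    std::vector<std::weak_ptr<Handler>> handlers_;
};

inline void observer_demo() {
    EventSource source;
    int seen = 0;
    {
        auto handler = std::make_shared<EventSource::Handler>(
            [&seen](int v) { seen += v; });
        source.subscribe(handler);
        source.emit(5);
        assert(seen == 5);
    }   // handler destroyed; the subscription expires with it
    source.emit(7);
    assert(seen == 5);   // the expired handler was not called
}
```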
### Pimpl
|
||||
|
||||
The pointer-to-implementation pattern hides private representation details behind an owning pointer in the public class. It reduces rebuild cost and improves ABI stability, though it adds indirection.
|
||||
|
||||
### Composition Over Inheritance
|
||||
|
||||
This is especially valuable in C++ because inheritance carries object model and lifetime implications. Composition often produces flatter, easier-to-reason-about systems.
|
||||
|
||||
## Practical Systems Mindset
|
||||
|
||||
Strong C++ engineering usually comes from asking these questions repeatedly:
|
||||
|
||||
1. Who owns this object?
|
||||
2. How long must it live?
|
||||
3. What are the synchronization rules?
|
||||
4. What is the dominant access pattern?
|
||||
5. Where is the actual bottleneck?
|
||||
6. Can the type system express the intended contract more clearly?
|
||||
|
||||
These questions connect language mechanics to system design.
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- what a data race is and why it is undefined behavior
|
||||
- when to use mutexes vs atomics
|
||||
- why condition variables require predicate-based waiting
|
||||
- how thread pools differ from thread-per-task designs
|
||||
- why memory locality affects real performance
|
||||
- where C++ still provides strong advantages in production systems
|
||||
- which design patterns map naturally to C++ and why
|
||||
|
||||
## Final Takeaway
|
||||
|
||||
C++ rewards engineers who reason from first principles: memory layout, lifetime, ownership, data access patterns, and concurrency semantics. That is why it remains a serious language for systems work and interviews alike. Once you stop treating it as a bag of syntax and start treating it as a model of how software inhabits hardware, the language becomes much more coherent.
|
||||
@@ -0,0 +1,486 @@
|
||||
# Go: Introduction and Setup
|
||||
|
||||
## Learning Objectives
|
||||
|
||||
- Understand what Go is and why it was created.
|
||||
- Build a mental model of the Go compiler, runtime, and toolchain.
|
||||
- Install Go and verify that the environment is correct.
|
||||
- Understand modules, packages, and the shape of a simple Go project.
|
||||
- Read and run a minimal Go program with confidence.
|
||||
- Recognize the kinds of systems Go is especially good at building.
|
||||
|
||||
## Why Learn Go
|
||||
|
||||
Go, also called Golang, is a statically typed compiled language designed for software that needs to be simple to read, fast to build, easy to deploy, and reliable under load.
|
||||
|
||||
At first glance, Go can look smaller than languages like Java, C++, or Rust. That is not an accident. Go was deliberately designed to remove a lot of language surface area so engineers spend less time debating style and more time shipping understandable systems.
|
||||
|
||||
### What Problem Go Was Trying to Solve
|
||||
|
||||
Go came from a practical frustration inside large software teams. The designers wanted a language that would:
|
||||
|
||||
- compile quickly even for large codebases
|
||||
- make dependency management and builds straightforward
|
||||
- support concurrency without forcing every engineer to become a threads expert
|
||||
- produce binaries that are easy to deploy in servers and containers
|
||||
- encourage code that many people can read, not just the original author
|
||||
|
||||
This makes Go especially strong in infrastructure and backend work:
|
||||
|
||||
- HTTP APIs and web services
|
||||
- reverse proxies and gateways
|
||||
- distributed systems components
|
||||
- command-line tools
|
||||
- data pipelines and background workers
|
||||
- cloud-native control planes, schedulers, and operators
|
||||
|
||||
Projects like Docker, Kubernetes, Terraform, Prometheus, and many internal backend platforms rely on Go for exactly these reasons.
|
||||
|
||||
### Why Go Feels Different
|
||||
|
||||
Many languages try to give you more expressive power by adding more features. Go often does the opposite. It removes features that create ambiguity or deep complexity.
|
||||
|
||||
For example:
|
||||
|
||||
- there are no classes in the traditional Java sense
|
||||
- inheritance is replaced by composition
|
||||
- exceptions are replaced by explicit error values
|
||||
- formatting is standardized by tooling rather than team debate
|
||||
|
||||
That tradeoff matters. Go is not trying to be the most flexible language for every programming style. It is trying to be a dependable language for teams building production systems.
|
||||
|
||||
## The Go Philosophy in Practical Terms
|
||||
|
||||
Before learning syntax, it helps to understand the values the language is optimized for.
|
||||
|
||||
### Simplicity Over Cleverness
|
||||
|
||||
Go code is meant to be read quickly. If a solution is slightly more verbose but much easier to understand, Go generally prefers that version.
|
||||
|
||||
In real systems this matters more than beginners often expect. Most production code is maintained by someone who did not originally write it. The simpler the code reads, the lower the long-term cost.
|
||||
|
||||
### Fast Feedback Loops
|
||||
|
||||
Go's toolchain is intentionally fast. Building, testing, and formatting are part of the normal workflow rather than optional extras.
|
||||
|
||||
That speed changes engineering behavior. Developers run tests more often, refactor with more confidence, and keep tighter iteration loops.
|
||||
|
||||
### Built-In Tooling Culture
|
||||
|
||||
Some ecosystems depend heavily on third-party tools for basic workflow consistency. Go bakes a large part of that workflow into the language toolchain itself.
|
||||
|
||||
Common tasks use the standard `go` command:
|
||||
|
||||
- `go run`
|
||||
- `go build`
|
||||
- `go test`
|
||||
- `go fmt`
|
||||
- `go mod`
|
||||
- `go doc`
|
||||
|
||||
This is one reason Go projects often feel operationally clean compared with ecosystems that require many layers of build tooling.
|
||||
|
||||
## Where Go Fits in a System
|
||||
|
||||
Go is not the answer to every problem. It shines in a particular band of workloads.
|
||||
|
||||
### Strong Fits
|
||||
|
||||
- backend services that need predictable performance
|
||||
- network servers handling many concurrent requests
|
||||
- tools distributed as a single binary
|
||||
- microservices that need fast startup and straightforward containerization
|
||||
- platform engineering components such as controllers, schedulers, and sidecars
|
||||
|
||||
### Weaker Fits
|
||||
|
||||
- highly dynamic scripting where a REPL-first workflow matters more than static guarantees
|
||||
- extremely low-level systems programming where full control over memory layout is critical
|
||||
- domains where advanced compile-time type programming is a core need
|
||||
|
||||
That does not mean Go cannot be used there. It means the language was optimized for another center of gravity.
|
||||
|
||||
## Mental Model: From Source Code to Running Program
|
||||
|
||||
Many beginners treat a language as just syntax. That is too shallow for systems work. You should understand the path from source files to a running process.
|
||||
|
||||
```mermaid
flowchart LR
    A[.go source files] --> B[go fmt]
    B --> C[go build]
    C --> D[Compiler]
    D --> E[Linker]
    E --> F[Single binary]
    F --> G[OS process]
    G --> H[Go runtime starts]
    H --> I[main.main executes]
```
|
||||
|
||||
### What Happens Internally
|
||||
|
||||
When you run `go build`, several important things happen:
|
||||
|
||||
1. The compiler parses and type-checks your code.
|
||||
2. It compiles packages into machine code for the target platform.
|
||||
3. The linker combines your code, the standard library, and runtime support into a binary.
|
||||
4. When the binary starts, the Go runtime initializes memory management, the scheduler, and other low-level runtime state.
|
||||
5. Your `main` package starts executing from `main.main()`.
|
||||
|
||||
This is one reason Go is attractive operationally. The result is often a single deployable binary with few moving parts.
|
||||
|
||||
### Why the Runtime Exists in a Compiled Language
|
||||
|
||||
Go is compiled, but it still has a runtime. That runtime is not a virtual machine like the JVM. It is a support layer linked into the binary.
|
||||
|
||||
It is responsible for things such as:
|
||||
|
||||
- garbage collection
|
||||
- goroutine scheduling
|
||||
- stack growth
|
||||
- map and channel internals
|
||||
- panic handling
|
||||
- parts of reflection and interface support
|
||||
|
||||
In practice, this means Go gives you a native binary while still providing higher-level language features that would be painful to implement manually.
|
||||
|
||||
## Installing Go
|
||||
|
||||
The official distribution is available from the Go project site, and package managers also work well on macOS and Linux.
|
||||
|
||||
After installation, confirm the toolchain is available:
|
||||
|
||||
```bash
go version
go env GOROOT GOPATH
```
|
||||
|
||||
### What These Values Mean
|
||||
|
||||
- `GOROOT` points to the Go installation itself.
|
||||
- `GOPATH` is the legacy workspace root. It still matters because the module cache lives under `$GOPATH/pkg/mod` and `go install` places binaries in `$GOPATH/bin` by default, but modern projects should use modules.
|
||||
|
||||
The most important shift to understand is this:
|
||||
|
||||
- old Go development often centered around `GOPATH`
|
||||
- modern Go development centers around `go.mod`
|
||||
|
||||
If you are learning Go today, think in modules first.
|
||||
|
||||
## Your First Module
|
||||
|
||||
A Go module is the unit of versioning and dependency management.
|
||||
|
||||
Create a project:
|
||||
|
||||
```bash
mkdir hello-go
cd hello-go
go mod init example.com/hello-go
```
|
||||
|
||||
This creates a `go.mod` file. That file tells Go two important things:
|
||||
|
||||
- the module path
|
||||
- the dependency set for the project
|
||||
|
||||
Example:
|
||||
|
||||
```go
module example.com/hello-go

go 1.25.0
```
|
||||
|
||||
The exact Go version may differ, but the idea is the same.
|
||||
|
||||
### Why Modules Exist
|
||||
|
||||
Without a module system, dependency versions become fragile and hard to reproduce. A module gives Go enough information to:
|
||||
|
||||
- resolve imports
|
||||
- fetch dependencies
|
||||
- build the same project consistently on other machines
|
||||
|
||||
In real backend systems, reproducible dependency state is not optional. It is part of shipping dependable software.
|
||||
|
||||
## A Minimal Go Program
|
||||
|
||||
Create `main.go`:
|
||||
|
||||
```go
package main

import "fmt"

func main() {
	fmt.Println("hello, Go")
}
```
|
||||
|
||||
Run it:
|
||||
|
||||
```bash
go run .
```
|
||||
|
||||
Build it:
|
||||
|
||||
```bash
go build .
```
|
||||
|
||||
### Read the Program Line by Line
|
||||
|
||||
`package main`
|
||||
|
||||
- Every Go file belongs to a package.
|
||||
- The special package `main` produces an executable program.
|
||||
|
||||
`import "fmt"`
|
||||
|
||||
- Packages must be imported explicitly.
|
||||
- `fmt` is part of the standard library and handles formatted I/O.
|
||||
|
||||
`func main()`
|
||||
|
||||
- Functions are declared with `func`.
|
||||
- `main` is the entry point for an executable.
|
||||
|
||||
`fmt.Println(...)`
|
||||
|
||||
- A package-qualified function call.
|
||||
- The standard library is intentionally strong, so you will use packages like `fmt`, `net/http`, `context`, `time`, and `encoding/json` constantly.
|
||||
|
||||
### Why Go Is Strict About Unused Imports and Variables
|
||||
|
||||
Go rejects unused local variables and unused imports. At first this can feel annoying. In practice it keeps code cleaner and reduces confusion while refactoring.
|
||||
|
||||
In long-lived services, that strictness is useful. It prevents stale code from quietly accumulating.
|
||||
|
||||
## Understanding Packages and Files Early
|
||||
|
||||
A common beginner mistake is to think each file is independent. In Go, files in the same folder and package are compiled together.
|
||||
|
||||
That means this is one logical package:
|
||||
|
||||
```text
|
||||
myservice/
|
||||
handlers.go
|
||||
server.go
|
||||
config.go
|
||||
```
|
||||
|
||||
if all files declare the same package name.
|
||||
|
||||
### Simple Project Shape
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[myservice] --> B[go.mod]
|
||||
A --> C[main.go]
|
||||
A --> D[internal]
|
||||
D --> E[httpapi]
|
||||
D --> F[store]
|
||||
A --> G[pkg]
|
||||
G --> H[client]
|
||||
```
|
||||
|
||||
This diagram introduces a pattern you will see often:
|
||||
|
||||
- `main.go` or `cmd/...` starts the program
|
||||
- `internal/...` holds application-private packages
|
||||
- `pkg/...` is sometimes used for reusable exported packages, though many teams avoid it unless it adds real clarity
|
||||
|
||||
Do not overcomplicate layout early. Start simple and split packages only when the structure earns its keep.
|
||||
|
||||
## Core Tooling You Should Use Immediately
|
||||
|
||||
Go learning goes faster when you treat tooling as part of the language.
|
||||
|
||||
### `go fmt`
|
||||
|
||||
```bash
|
||||
go fmt ./...
|
||||
```
|
||||
|
||||
This formats your code according to the standard Go style.
|
||||
|
||||
Why it exists:
|
||||
|
||||
- removes formatting debates
|
||||
- keeps diffs smaller and more readable
|
||||
- makes code look familiar across projects
|
||||
|
||||
### `go test`
|
||||
|
||||
```bash
|
||||
go test ./...
|
||||
```
|
||||
|
||||
This runs tests across packages.
|
||||
|
||||
Even before you know advanced testing, you should get used to this command. In Go, running the full package test set is normal, not exceptional.
|
||||
|
||||
### `go doc`
|
||||
|
||||
```bash
|
||||
go doc fmt.Println
|
||||
```
|
||||
|
||||
This helps you inspect package and symbol documentation from the command line.
|
||||
|
||||
### `go env`
|
||||
|
||||
```bash
|
||||
go env
|
||||
```
|
||||
|
||||
This prints environment details the toolchain is using. It is extremely helpful when debugging build or dependency issues.
|
||||
|
||||
## Zero Values: A Go Idea You Should Learn Early
|
||||
|
||||
Go gives every variable a default zero value.
|
||||
|
||||
Examples:
|
||||
|
||||
- `0` for integers
|
||||
- `false` for booleans
|
||||
- `""` for strings
|
||||
- `nil` for pointers, slices, maps, interfaces, channels, and function values
|
||||
|
||||
Why this exists:
|
||||
|
||||
- it reduces uninitialized-memory style bugs
|
||||
- it makes declarations cheap and predictable
|
||||
- it encourages data structures that are usable in a default state when designed well
|
||||
|
||||
Example:
|
||||
|
||||
```go
|
||||
var retries int
|
||||
var enabled bool
|
||||
var name string
|
||||
|
||||
fmt.Println(retries, enabled, name)
|
||||
```
|
||||
|
||||
In production code, zero values matter all the time. For example, a `sync.Mutex` or `bytes.Buffer` works correctly without manual initialization. That is a subtle but powerful ergonomics win.
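
A minimal sketch of that idea, assuming a hypothetical `EventLog` type (not from any library) that holds a `sync.Mutex` and a `bytes.Buffer` as plain fields. Nothing is initialized explicitly; the zero value of the struct is already usable.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// EventLog is a hypothetical type used only for illustration.
// Its zero value is usable: the mutex starts unlocked and the buffer starts empty.
type EventLog struct {
	mu  sync.Mutex
	buf bytes.Buffer
}

// Add appends one line to the log while holding the lock.
func (l *EventLog) Add(line string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.buf.WriteString(line)
	l.buf.WriteByte('\n')
}

// Contents returns everything written so far.
func (l *EventLog) Contents() string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.buf.String()
}

func main() {
	var events EventLog // no constructor, no make, no new
	events.Add("started")
	events.Add("ready")
	fmt.Print(events.Contents())
}
```

Methods use a pointer receiver here because a struct containing a `sync.Mutex` should not be copied after first use.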
|
||||
|
||||
## The Standard Library Is Part of the Language Experience
|
||||
|
||||
One reason Go feels productive in backend systems is that you can build a lot with the standard library alone.
|
||||
|
||||
Packages you will quickly rely on include:
|
||||
|
||||
- `fmt` for formatting and printing
|
||||
- `errors` for error handling helpers
|
||||
- `time` for deadlines, timers, and durations
|
||||
- `context` for cancellation and request scope
|
||||
- `net/http` for servers and clients
|
||||
- `encoding/json` for JSON encoding and decoding
|
||||
- `os` and `io` for file and stream operations
|
||||
- `sync` for mutexes and synchronization primitives
|
||||
|
||||
This matters because fewer external dependencies often means:
|
||||
|
||||
- easier upgrades
|
||||
- fewer version conflicts
|
||||
- less supply chain risk
|
||||
- more consistent team knowledge
|
||||
|
||||
## How Go Is Used in Real Systems
|
||||
|
||||
It helps to attach the language to actual engineering tasks rather than seeing it as abstract syntax.
|
||||
|
||||
### Backend API Service
|
||||
|
||||
A Go service might:
|
||||
|
||||
- listen for HTTP requests
|
||||
- parse JSON into structs
|
||||
- validate input
|
||||
- call a database or downstream service
|
||||
- return a JSON response
|
||||
|
||||
Go is strong here because:
|
||||
|
||||
- request handling maps naturally to goroutines
|
||||
- binaries are simple to deploy
|
||||
- startup is fast
|
||||
- memory and CPU use are usually predictable enough for service operation
|
||||
|
||||
### Distributed Systems Component
|
||||
|
||||
A scheduler, controller, queue worker, or service discovery agent often needs:
|
||||
|
||||
- concurrency
|
||||
- networking
|
||||
- serialization
|
||||
- low operational complexity
|
||||
- strong observability hooks
|
||||
|
||||
Go's standard library and runtime model fit that space very well.
|
||||
|
||||
### CLI and Platform Tooling
|
||||
|
||||
Internal developer tools are another strong Go use case. A single binary, typically statically linked when cgo is not involved, is easy to ship across machines and CI environments.
|
||||
|
||||
## Common Mistakes and Misconceptions
|
||||
|
||||
### Mistake: Treating Go Like Tiny Java or Tiny Python
|
||||
|
||||
Go is its own language with its own design center. If you constantly try to recreate class-heavy Java patterns or highly dynamic Python patterns, the code usually becomes awkward.
|
||||
|
||||
### Mistake: Ignoring the Toolchain
|
||||
|
||||
Go is not just syntax plus a compiler. The standard workflow is a major part of the language experience. Learn `go build`, `go test`, `go fmt`, and `go mod` early.
|
||||
|
||||
### Mistake: Overengineering Project Structure on Day One
|
||||
|
||||
Beginners sometimes create many directories and interfaces before the project has real complexity. Start with a small module and grow structure as the codebase proves it needs it.
|
||||
|
||||
### Mistake: Thinking "Compiled" Means "No Runtime"
|
||||
|
||||
Go produces native binaries, but those binaries include runtime support for garbage collection, scheduling, and other language features.
|
||||
|
||||
### Mistake: Treating Modules and Packages as the Same Thing
|
||||
|
||||
They are related but different.
|
||||
|
||||
- a module is a versioned collection of packages
|
||||
- a package is a unit of code organization and namespace
|
||||
|
||||
That distinction becomes important once projects grow.
|
||||
|
||||
## Practical Intuition to Carry Forward
|
||||
|
||||
At this stage, the most important thing is not memorizing every command. It is building a mental model:
|
||||
|
||||
- Go is optimized for readable, deployable, concurrent systems software.
|
||||
- The toolchain is part of the language culture.
|
||||
- Modules manage dependencies.
|
||||
- Packages organize code.
|
||||
- A Go program becomes a native process with runtime support linked in.
|
||||
|
||||
If you understand those ideas, the language details in the next file will make much more sense.
|
||||
|
||||
## Real-World Use Cases
|
||||
|
||||
- Building a JSON API server for a mobile app backend.
|
||||
- Writing a queue consumer that processes jobs concurrently.
|
||||
- Creating an internal deployment CLI distributed as one binary.
|
||||
- Implementing a control-plane component that watches cluster state and reconciles resources.
|
||||
|
||||
## Summary
|
||||
|
||||
Go exists to make production engineering simpler, especially for backend and infrastructure software. Its power is not just in syntax. It comes from the combination of a clear language, a fast toolchain, a strong standard library, native binaries, and a runtime designed for concurrency.
|
||||
|
||||
You should now be comfortable with the big picture:
|
||||
|
||||
- why Go exists
|
||||
- how source code becomes a running binary
|
||||
- how to install and verify the toolchain
|
||||
- how modules and packages fit together
|
||||
- how to create and run a first Go program
|
||||
|
||||
The next step is learning the language itself: values, types, control flow, data structures, functions, methods, structs, interfaces, and error handling.
|
||||
@@ -0,0 +1,717 @@
|
||||
# Go: Core Language Fundamentals
|
||||
|
||||
## Learning Objectives
|
||||
|
||||
- Understand Go's type system and value model.
|
||||
- Use variables, constants, control flow, and core data structures idiomatically.
|
||||
- Understand how slices, maps, strings, and pointers actually behave.
|
||||
- Write functions, methods, structs, and interfaces with practical clarity.
|
||||
- Handle errors the Go way instead of forcing exception-style thinking onto the language.
|
||||
- Build intuition for memory, receivers, and abstraction choices in real backend code.
|
||||
|
||||
## Start with the Right Mental Model
|
||||
|
||||
Go is a statically typed language centered on values. That sounds simple, but it shapes almost everything.
|
||||
|
||||
In practice, most Go code is about:
|
||||
|
||||
- creating values
|
||||
- transforming values
|
||||
- passing values between functions
|
||||
- attaching behavior to named types
|
||||
- returning explicit errors when something goes wrong
|
||||
|
||||
Go is not a class hierarchy language. It is not an exception-driven language. It is not a macro-heavy metaprogramming language. It is a language that tries to keep data flow and control flow visible.
|
||||
|
||||
That visibility is why production Go code can be easy to reason about even when the system itself is large.
|
||||
|
||||
## Variables, Constants, and Zero Values
|
||||
|
||||
### What They Are
|
||||
|
||||
Variables store values whose contents can change. Constants are fixed compile-time values.
|
||||
|
||||
```go
package main

import "fmt"

func main() {
	var retries int = 3
	timeoutSeconds := 10
	const serviceName = "billing-api"

	fmt.Println(retries, timeoutSeconds, serviceName)
}
```
|
||||
|
||||
### Why Go Has Multiple Declaration Styles
|
||||
|
||||
Go gives you a few common forms:
|
||||
|
||||
- `var name Type` when the type matters or you want the zero value
|
||||
- `var name Type = value` when you want explicitness
|
||||
- `name := value` inside functions when the type is obvious
|
||||
- `const` for fixed values known at compile time
|
||||
|
||||
Idiomatic Go uses short declaration heavily inside functions because it keeps code readable without losing type safety.
|
||||
|
||||
### Zero Values Matter More Than They Seem
|
||||
|
||||
Every variable has a useful default value.
|
||||
|
||||
- numbers become `0`
|
||||
- booleans become `false`
|
||||
- strings become `""`
|
||||
- pointers, slices, maps, interfaces, channels, and functions become `nil`
|
||||
|
||||
This matters because many Go types are designed so the zero value is already valid.
|
||||
|
||||
Examples:
|
||||
|
||||
- a `bytes.Buffer` can be used immediately
|
||||
- a `sync.Mutex` can be locked immediately
|
||||
- a `time.Time` has a well-defined zero state
|
||||
|
||||
That design lowers initialization friction and reduces a whole class of bugs.
|
||||
|
||||
### When to Use `const`
|
||||
|
||||
Use constants for values that are conceptually fixed:
|
||||
|
||||
- protocol names
|
||||
- status labels
|
||||
- numeric tuning knobs that should not change at runtime
|
||||
|
||||
Do not use `const` just because a variable happens not to change in one function. Prefer clarity over ceremony.
|
||||
|
||||
### `iota` and Enumerated Constants
|
||||
|
||||
Go does not have enums in the Java sense, but it uses typed constants effectively.
|
||||
|
||||
```go
type JobState int

const (
	JobPending JobState = iota
	JobRunning
	JobDone
	JobFailed
)
```
|
||||
|
||||
Why this exists:
|
||||
|
||||
- it gives you readable symbolic values
|
||||
- it keeps the type distinct from unrelated integers
|
||||
- it works well with switches, logging, and serialization helpers
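
One common way to get the logging benefit is to give the type a `String` method, which the `fmt` package picks up automatically. The sketch below repeats the `JobState` declaration so it is self-contained; the label strings are illustrative choices, not part of any standard.

```go
package main

import "fmt"

type JobState int

const (
	JobPending JobState = iota
	JobRunning
	JobDone
	JobFailed
)

// String makes JobState satisfy fmt.Stringer, so fmt prints a label
// instead of a bare integer.
func (s JobState) String() string {
	switch s {
	case JobPending:
		return "pending"
	case JobRunning:
		return "running"
	case JobDone:
		return "done"
	case JobFailed:
		return "failed"
	default:
		// Convert to int first so fmt does not call String again.
		return fmt.Sprintf("JobState(%d)", int(s))
	}
}

func main() {
	fmt.Println(JobRunning, JobState(42))
}
```

The `default` branch matters: nothing stops a caller from converting an arbitrary integer to `JobState`, so the method should still produce something readable.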
|
||||
|
||||
## Control Flow: Small Set, High Clarity
|
||||
|
||||
Go intentionally keeps control flow simple.
|
||||
|
||||
### `if`
|
||||
|
||||
```go
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
```
|
||||
|
||||
This pattern appears constantly in Go because errors are explicit return values.
|
||||
|
||||
You will also see scoped initialization inside `if`:
|
||||
|
||||
```go
if user, err := repo.Find(ctx, id); err != nil {
	return err
} else {
	fmt.Println(user.Name)
}
```
|
||||
|
||||
Use this sparingly. It is useful when the scope should remain local, but overused it can make code denser than necessary.
|
||||
|
||||
### `for`
|
||||
|
||||
Go has one looping keyword: `for`.
|
||||
|
||||
```go
for i := 0; i < 3; i++ {
	fmt.Println(i)
}

for condition {
	fmt.Println("acts like while")
	break
}

for {
	break
}
```
|
||||
|
||||
Why this is useful:
|
||||
|
||||
- fewer looping forms to remember
|
||||
- a smaller syntax surface to learn when reading code
|
||||
- the language avoids duplicated constructs with minor differences
|
||||
|
||||
### `range`
|
||||
|
||||
`range` iterates over slices, arrays, strings, maps, and channels.
|
||||
|
||||
```go
|
||||
nums := []int{10, 20, 30}
|
||||
for index, value := range nums {
|
||||
fmt.Println(index, value)
|
||||
}
|
||||
```
|
||||
|
||||
Important practical details:
|
||||
|
||||
- ranging over a map does not guarantee order
|
||||
- ranging over a string decodes UTF-8 and yields runes, each paired with its starting byte index, rather than raw bytes
|
||||
- ranging over a channel continues until the channel is closed
|
||||
|
||||
### `switch`
|
||||
|
||||
Go's `switch` is more flexible than many beginners expect.
|
||||
|
||||
```go
switch state {
case JobPending:
	fmt.Println("queued")
case JobRunning:
	fmt.Println("in progress")
default:
	fmt.Println("terminal state")
}
```
|
||||
|
||||
There is also expression-less `switch`, which is a readable alternative to long `if` ladders.
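
As a small illustration of the expression-less form, here is a sketch with a hypothetical `classify` function. The thresholds are arbitrary values chosen for the example.

```go
package main

import "fmt"

// classify buckets a latency measurement. With no expression after
// "switch", each case is a boolean condition and the first true one wins.
func classify(latencyMs int) string {
	switch {
	case latencyMs < 0:
		return "invalid"
	case latencyMs < 100:
		return "fast"
	case latencyMs < 1000:
		return "acceptable"
	default:
		return "slow"
	}
}

func main() {
	for _, ms := range []int{-1, 20, 450, 3000} {
		fmt.Println(ms, classify(ms))
	}
}
```

Cases are evaluated top to bottom and do not fall through, so the order of the conditions carries the logic.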
|
||||
|
||||
## Core Data Structures
|
||||
|
||||
### Arrays
|
||||
|
||||
Arrays have fixed length and are value types.
|
||||
|
||||
```go
|
||||
var ports [3]int
|
||||
ports[0] = 8080
|
||||
```
|
||||
|
||||
Arrays exist, but in day-to-day Go you use slices much more often.
|
||||
|
||||
Why arrays still matter:
|
||||
|
||||
- they are the underlying foundation for slices
|
||||
- fixed-size data can be useful for performance-sensitive code
|
||||
- array values emphasize that size is part of the type
|
||||
|
||||
### Slices
|
||||
|
||||
A slice is a small descriptor pointing at an underlying array.
|
||||
|
||||
```go
|
||||
ids := []int{101, 102, 103}
|
||||
ids = append(ids, 104)
|
||||
```
|
||||
|
||||
Internally, a slice conceptually contains:
|
||||
|
||||
- a pointer to backing storage
|
||||
- a length
|
||||
- a capacity
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Slice Header] --> B[Pointer]
|
||||
A --> C[Length]
|
||||
A --> D[Capacity]
|
||||
B --> E[Backing Array]
|
||||
```
|
||||
|
||||
### Why Slices Exist
|
||||
|
||||
They give you dynamic-seeming sequences without hiding memory behavior completely. They are a practical middle ground:
|
||||
|
||||
- easier to use than raw arrays
|
||||
- cheaper than many boxed collection abstractions
|
||||
- explicit enough that performance behavior is still understandable
|
||||
|
||||
### How `append` Works Internally
|
||||
|
||||
If capacity is available, `append` writes into the existing backing array. If capacity is exhausted, Go allocates a new array, copies the old contents, and returns a new slice header.
|
||||
|
||||
That means two things:
|
||||
|
||||
- appending may reallocate
|
||||
- slices sharing backing storage can affect each other unexpectedly
|
||||
|
||||
Example:
|
||||
|
||||
```go
|
||||
base := []int{1, 2, 3, 4}
|
||||
a := base[:2]
|
||||
b := base[:3]
|
||||
|
||||
a = append(a, 99)
|
||||
fmt.Println(base, a, b)
|
||||
```
|
||||
|
||||
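This surprises many beginners. Here `a` has length 2 but capacity 4, so `append` writes `99` into the shared backing array instead of allocating a new one. The change is visible through `base` and `b` as well, and the program prints `[1 2 99 4] [1 2 99] [1 2 99]`.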
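
One way to avoid the accidental overwrite is a full slice expression, `base[low:high:max]`, which limits the capacity of the new slice. The sketch below contrasts the two behaviors; `appendShared` and `appendIsolated` are hypothetical helper names, and both assume the input has at least three elements.

```go
package main

import "fmt"

// appendShared takes the first two elements of base and appends v.
// The prefix keeps base's full capacity, so append writes into base's array.
func appendShared(base []int, v int) []int {
	prefix := base[:2]
	return append(prefix, v)
}

// appendIsolated caps the prefix's capacity at its length with a full
// slice expression, which forces append to allocate a new backing array.
func appendIsolated(base []int, v int) []int {
	prefix := base[:2:2]
	return append(prefix, v)
}

func main() {
	shared := []int{1, 2, 3, 4}
	fmt.Println(appendShared(shared, 99), shared) // shared[2] is overwritten

	isolated := []int{1, 2, 3, 4}
	fmt.Println(appendIsolated(isolated, 99), isolated) // isolated is untouched
}
```

Copying into a fresh slice has the same effect when the capacity trick feels too subtle for the surrounding code.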
|
||||
|
||||
### Maps
|
||||
|
||||
Maps are Go's hash table type.
|
||||
|
||||
```go
|
||||
counts := map[string]int{
|
||||
"ok": 12,
|
||||
"error": 3,
|
||||
}
|
||||
|
||||
counts["retry"]++
|
||||
```
|
||||
|
||||
Why maps exist:
|
||||
|
||||
- fast key lookup
|
||||
- natural representation for counters, indexes, sets, and lookup tables
|
||||
|
||||
Important details:
|
||||
|
||||
- reading a missing key returns the value type's zero value
|
||||
- use the two-result form when absence matters
|
||||
- map iteration order is deliberately not stable
|
||||
|
||||
```go
|
||||
value, ok := counts["missing"]
|
||||
fmt.Println(value, ok)
|
||||
```
|
||||
|
||||
Maps are reference-like structures managed by the runtime. A nil map can be read from, but writing to it panics.
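
The details above can be sketched together. `countWords` and `sortedKeys` are hypothetical helpers written for this example; sorting the keys is one simple way to get deterministic output when iteration order matters.

```go
package main

import (
	"fmt"
	"sort"
)

// countWords builds a frequency table.
func countWords(words []string) map[string]int {
	counts := make(map[string]int) // make returns a non-nil, writable map
	for _, w := range words {
		counts[w]++ // a missing key reads as 0, so ++ works the first time
	}
	return counts
}

// sortedKeys returns the map's keys in ascending order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	var empty map[string]int            // nil map
	fmt.Println(empty["x"], len(empty)) // reading a nil map is safe
	// empty["x"] = 1 would panic: assignment to entry in nil map

	counts := countWords([]string{"ok", "error", "ok"})
	for _, k := range sortedKeys(counts) {
		fmt.Println(k, counts[k])
	}
}
```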
|
||||
|
||||
### Strings, Bytes, and Runes
|
||||
|
||||
Go strings are immutable sequences of bytes, usually holding UTF-8 encoded text.
|
||||
|
||||
This distinction matters:
|
||||
|
||||
- `byte` is an alias for `uint8`
|
||||
- `rune` is an alias for `int32` and represents a Unicode code point
|
||||
|
||||
```go
|
||||
message := "Go café"
|
||||
fmt.Println(len(message))
|
||||
|
||||
for _, r := range message {
|
||||
fmt.Printf("%c\n", r)
|
||||
}
|
||||
```
|
||||
|
||||
Why this matters in real systems:
|
||||
|
||||
- HTTP payloads and file formats are byte-oriented
|
||||
- user-visible text is Unicode-oriented
|
||||
- confusing the two leads to subtle bugs in validation, truncation, and indexing
|
||||
|
||||
If you need mutable byte data, use `[]byte`, not `string`.
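
To make the byte versus rune distinction concrete, here is a sketch using a string with one non-ASCII character. `truncateRunes` is a hypothetical helper that cuts on rune boundaries, which is the kind of truncation logic that goes wrong when written with byte indexes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount reports the number of Unicode code points in s.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

// truncateRunes keeps at most max code points of s.
// Ranging over a string yields the byte index where each rune starts,
// so slicing at that index never splits a multi-byte character.
func truncateRunes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == max {
			return s[:i]
		}
		count++
	}
	return s
}

func main() {
	word := "caf\u00e9" // "café": the last letter takes two bytes in UTF-8
	fmt.Println(len(word), runeCount(word))
	fmt.Println(truncateRunes(word, 3), truncateRunes(word, 4))
	fmt.Println([]byte(word)) // raw bytes, as they would travel over the network
}
```

Note that a code point is still not always a user-visible character (combining marks exist), so this sketch only addresses the byte versus rune layer.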
|
||||
|
||||
## Functions: Multiple Return Values and Defer
|
||||
|
||||
### Functions as the Unit of Composition
|
||||
|
||||
Go prefers straightforward function composition over deep inheritance trees.
|
||||
|
||||
```go
func parsePort(value string) (int, error) {
	port, err := strconv.Atoi(value)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", value, err)
	}

	return port, nil
}
```
|
||||
|
||||
### Why Multiple Return Values Exist
|
||||
|
||||
This is one of the most important Go design choices. Instead of exceptions for normal failure, functions can return both a result and an error.
|
||||
|
||||
Benefits:
|
||||
|
||||
- the failure path is visible in the function signature
|
||||
- callers see the error as an ordinary return value, so skipping the check is a visible choice (the compiler does not enforce it)
|
||||
- control flow stays explicit
|
||||
|
||||
In backend systems, this reduces the hidden control-flow jumps that exception-heavy code can create.
|
||||
|
||||
### `defer`
|
||||
|
||||
`defer` schedules a function call to run when the surrounding function returns.
|
||||
|
||||
```go
func readConfig(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	// read file
	return nil
}
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- resource cleanup should be hard to forget
|
||||
- cleanup code often belongs near acquisition code
|
||||
|
||||
Internally, deferred calls are recorded and executed in last-in, first-out order when the function exits.
|
||||
|
||||
Use `defer` for correctness first. In very hot paths, you may care about its cost, but in most business logic the clarity win is worth it.
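
The last-in, first-out ordering is easy to observe directly. This sketch uses a hypothetical `deferOrder` function that records the order in which the body and two deferred closures run.

```go
package main

import "fmt"

// deferOrder records execution order inside an inner function so the
// deferred calls have finished by the time the slice is returned.
func deferOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "first deferred") }()
		defer func() { order = append(order, "second deferred") }()
		order = append(order, "body")
	}()
	return order
}

func main() {
	for _, step := range deferOrder() {
		fmt.Println(step)
	}
}
```

The body runs first, then the most recently deferred call, then the earlier one. That ordering is what makes it safe to acquire resources in sequence and release them in reverse.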
|
||||
|
||||
## Pointers and Memory Intuition
|
||||
|
||||
### What a Pointer Is
|
||||
|
||||
A pointer stores the address of another value.
|
||||
|
||||
```go
|
||||
func increment(value *int) {
|
||||
*value++
|
||||
}
|
||||
```
|
||||
|
||||
Go pointers exist so you can:
|
||||
|
||||
- mutate shared state intentionally
|
||||
- avoid copying very large values when appropriate
|
||||
- define methods that update a receiver
|
||||
|
||||
### Why Go Pointers Feel Safer Than C Pointers
|
||||
|
||||
Go allows pointers, but it removes several sharp edges:
|
||||
|
||||
- no pointer arithmetic
|
||||
- garbage collection manages object lifetime
|
||||
- type safety is preserved
|
||||
|
||||
This is a very deliberate design choice. Go wants the practical utility of pointers without turning everyday backend code into manual memory management.
|
||||
|
||||
### Stack, Heap, and Escape Analysis
|
||||
|
||||
Do not think of Go as "everything is on the heap." The compiler decides where values live.
|
||||
|
||||
- if a value can stay local safely, it may live on the stack
|
||||
- if it must outlive the local frame or be referenced elsewhere, it may escape to the heap
|
||||
|
||||
This compiler decision process is called escape analysis.
|
||||
|
||||
Why it matters:
|
||||
|
||||
- heap allocation can increase GC pressure
|
||||
- unnecessary pointer-heavy designs can make code slower and harder to reason about
|
||||
|
||||
You usually do not hand-place objects yourself. Instead, you write clear code and learn enough about allocation behavior to avoid obviously wasteful patterns.
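
A rough sketch of the two cases follows. The `point` type and both functions are hypothetical; what the compiler actually decides depends on the Go version and on inlining, so treat the comments as typical behavior rather than a guarantee.

```go
package main

import "fmt"

type point struct{ x, y int }

// sum receives a copy; nothing here requires the value to outlive the call,
// so it can normally stay on the stack.
func sum(p point) int {
	return p.x + p.y
}

// newPoint returns the address of a local value, so the value must outlive
// this frame. The compiler typically reports it as escaping to the heap,
// unless inlining lets it prove otherwise at a particular call site.
func newPoint(x, y int) *point {
	p := point{x: x, y: y}
	return &p
}

func main() {
	fmt.Println(sum(point{1, 2}))
	fmt.Println(newPoint(3, 4).x)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which is a practical way to check an assumption instead of guessing. The exact wording of those messages varies between releases.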
|
||||
|
||||
## Structs and Methods
|
||||
|
||||
### Structs: Go's Primary Data Modeling Tool
|
||||
|
||||
Structs group related fields.
|
||||
|
||||
```go
|
||||
type User struct {
|
||||
ID int64
|
||||
Name string
|
||||
Email string
|
||||
}
|
||||
```
|
||||
|
||||
Why structs matter:
|
||||
|
||||
- they model domain entities cleanly
|
||||
- they work naturally with JSON, databases, and configuration
|
||||
- they keep data layout explicit
|
||||
|
||||
### Methods
|
||||
|
||||
Methods are functions attached to a type.
|
||||
|
||||
```go
type Counter struct {
	value int
}

func (c *Counter) Inc() {
	c.value++
}

func (c Counter) Value() int {
	return c.value
}
```
|
||||
|
||||
### Value Receiver or Pointer Receiver
|
||||
|
||||
Use a pointer receiver when:
|
||||
|
||||
- the method mutates the receiver
|
||||
- the struct is large enough that copying is undesirable
|
||||
- consistency across methods is clearer with pointers
|
||||
|
||||
Use a value receiver when:
|
||||
|
||||
- the value is small and conceptually immutable
|
||||
- copying is cheap and expected
|
||||
|
||||
In real projects, consistency matters. If a type usually uses pointer receivers, use them across the method set unless there is a strong reason not to.
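
The mutation rule is the one that bites in practice. In this sketch, `IncLost` is a deliberately wrong method added for illustration: it has a value receiver, so it increments a copy and the caller never sees the change.

```go
package main

import "fmt"

type Counter struct {
	value int
}

// Inc has a pointer receiver, so it updates the caller's Counter.
func (c *Counter) Inc() {
	c.value++
}

// IncLost has a value receiver: c is a copy, and the increment is
// discarded when the method returns. Shown only to illustrate the bug.
func (c Counter) IncLost() {
	c.value++
}

// Value only reads, so a value receiver would be fine on its own; in a real
// type that also has Inc, a pointer receiver would keep the set consistent.
func (c Counter) Value() int {
	return c.value
}

func main() {
	var c Counter
	c.Inc()     // Go takes &c automatically because c is addressable
	c.IncLost() // compiles, but has no visible effect
	fmt.Println(c.Value())
}
```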
|
||||
|
||||
### Composition Over Inheritance
|
||||
|
||||
Go does not have class inheritance. Instead, it leans on composition and embedding.
|
||||
|
||||
```go
|
||||
type Logger struct{}
|
||||
|
||||
func (Logger) Info(msg string) {
|
||||
fmt.Println("INFO:", msg)
|
||||
}
|
||||
|
||||
type Server struct {
|
||||
Logger
|
||||
addr string
|
||||
}
|
||||
```
|
||||
|
||||
Embedding can promote fields and methods, but use it to express a real structural relationship, not to imitate inheritance mechanically.
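
Here is a sketch of method promotion, using a variation of the `Logger` above that returns the formatted string instead of printing it (an assumption made so the result is easy to inspect).

```go
package main

import "fmt"

// Logger formats messages; this variant returns the string rather than printing.
type Logger struct{}

func (Logger) Info(msg string) string {
	return "INFO: " + msg
}

// Server embeds Logger, so Info is promoted and callable on Server directly.
type Server struct {
	Logger
	addr string
}

func main() {
	s := Server{addr: ":8080"}
	fmt.Println(s.Info("listening on " + s.addr)) // promoted method
	fmt.Println(s.Logger.Info("explicit access")) // the embedded field is still reachable by name
}
```

The receiver of a promoted method is still the embedded `Logger`, not the `Server`. There is no virtual dispatch back to the outer type, which is the main way embedding differs from inheritance.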
|
||||
|
||||
## Interfaces: Small, Behavioral, and Implicit
|
||||
|
||||
### What an Interface Is
|
||||
|
||||
An interface describes behavior, not concrete data layout.
|
||||
|
||||
```go
|
||||
type Store interface {
|
||||
Save(ctx context.Context, user User) error
|
||||
}
|
||||
```
|
||||
|
||||
Any type with a matching method set satisfies the interface automatically.
|
||||
|
||||
### Why Implicit Satisfaction Exists
|
||||
|
||||
Go avoids the ceremony of explicit `implements` declarations. The benefit is that interfaces are lightweight and decoupled from concrete types.
|
||||
|
||||
That means you can define interfaces where they are needed, usually at the consumer side.
|
||||
|
||||
Example:
|
||||
|
||||
```go
|
||||
type UserService struct {
|
||||
store Store
|
||||
}
|
||||
```
|
||||
|
||||
The service depends on behavior, not a particular database implementation.
|
||||
|
||||
### How Interfaces Work Internally
|
||||
|
||||
Conceptually, an interface value holds:
|
||||
|
||||
- the dynamic concrete type
|
||||
- the concrete value or pointer value
|
||||
|
||||
This is why the "nil interface" pitfall exists. An interface can contain a typed nil pointer and still be non-nil as an interface value.
|
||||
|
||||
That pitfall shows up in logging, error returns, and optional dependency wiring.
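
The pitfall is easiest to see with error returns. In this sketch, `ValidationError`, `validateBuggy`, and `validateFixed` are hypothetical names; the buggy version returns a typed nil pointer through the `error` interface.

```go
package main

import "fmt"

// ValidationError is a hypothetical error type with a pointer receiver.
type ValidationError struct {
	Field string
}

func (e *ValidationError) Error() string {
	return "invalid " + e.Field
}

// validateBuggy declares a typed pointer and returns it as an error.
// When ok is true the pointer is nil, but the returned interface still
// carries the type *ValidationError, so it compares as non-nil.
func validateBuggy(ok bool) error {
	var err *ValidationError
	if !ok {
		err = &ValidationError{Field: "name"}
	}
	return err
}

// validateFixed returns a literal nil on the success path.
func validateFixed(ok bool) error {
	if !ok {
		return &ValidationError{Field: "name"}
	}
	return nil
}

func main() {
	fmt.Println(validateBuggy(true) != nil) // true: a "successful" call looks like a failure
	fmt.Println(validateFixed(true) != nil) // false
}
```

The usual remedy is to keep the variable typed as `error` or to return `nil` explicitly, as `validateFixed` does.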
|
||||
|
||||
### Keep Interfaces Small
|
||||
|
||||
Good Go interfaces are often tiny.
|
||||
|
||||
```go
|
||||
type Clock interface {
|
||||
Now() time.Time
|
||||
}
|
||||
```
|
||||
|
||||
Why small interfaces are better:
|
||||
|
||||
- easier to implement
|
||||
- easier to test
|
||||
- less coupling
|
||||
- behavior stays focused
|
||||
|
||||
Large interfaces usually signal that an abstraction was designed too early or at the wrong level.
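
The `Clock` interface is a good example of why small interfaces are easy to test. The sketch below assumes two hypothetical implementations, `systemClock` and `fixedClock`, and a hypothetical `isExpired` function that depends only on the interface.

```go
package main

import (
	"fmt"
	"time"
)

type Clock interface {
	Now() time.Time
}

// systemClock is what production code would use.
type systemClock struct{}

func (systemClock) Now() time.Time { return time.Now() }

// fixedClock always reports the same instant, which makes tests deterministic.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

// isExpired depends on behavior only, not on a concrete clock.
func isExpired(c Clock, deadline time.Time) bool {
	return c.Now().After(deadline)
}

// checkWithFixedClock evaluates isExpired one minute before and one
// minute after an arbitrary deadline.
func checkWithFixedClock() (beforeExpired, afterExpired bool) {
	deadline := time.Date(2024, time.January, 1, 12, 0, 0, 0, time.UTC)
	before := fixedClock{t: deadline.Add(-time.Minute)}
	after := fixedClock{t: deadline.Add(time.Minute)}
	return isExpired(before, deadline), isExpired(after, deadline)
}

func main() {
	fmt.Println(checkWithFixedClock())
	fmt.Println(isExpired(systemClock{}, time.Now().Add(time.Hour)))
}
```

Neither implementation declares that it implements `Clock`; having a matching `Now` method is enough.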
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[HTTP Handler] --> B[UserService interface]
|
||||
B --> C[PostgresUserService]
|
||||
B --> D[InMemoryUserService]
|
||||
```
|
||||
|
||||
## Generics: Useful, But Not the Center of Go
|
||||
|
||||
Modern Go supports type parameters.
|
||||
|
||||
```go
func MapSlice[T any, U any](items []T, fn func(T) U) []U {
	result := make([]U, 0, len(items))
	for _, item := range items {
		result = append(result, fn(item))
	}
	return result
}
```
|
||||
|
||||
Why generics exist:
|
||||
|
||||
- avoid repetitive boilerplate for reusable containers and algorithms
|
||||
- preserve static type safety
|
||||
- reduce reliance on `interface{}` or reflection for generic utilities
|
||||
|
||||
When to use them:
|
||||
|
||||
- collections
|
||||
- reusable helpers where type-specific duplication is obvious
|
||||
- infrastructure libraries
|
||||
|
||||
When not to use them:
|
||||
|
||||
- when a plain interface or a concrete type is simpler
|
||||
- when the abstraction is more confusing than the duplication it removes
|
||||
|
||||
Go generics are powerful, but idiomatic Go still prefers the simplest tool that expresses the problem clearly.
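
For reference, here is how a helper like `MapSlice` is typically called. The definition is repeated so the sketch is self-contained; the inputs are arbitrary.

```go
package main

import "fmt"

// MapSlice applies fn to every element and returns the results in order.
func MapSlice[T any, U any](items []T, fn func(T) U) []U {
	result := make([]U, 0, len(items))
	for _, item := range items {
		result = append(result, fn(item))
	}
	return result
}

func main() {
	names := []string{"go", "gopher"}

	// T and U are inferred from the arguments: T is string, U is int.
	lengths := MapSlice(names, func(s string) int { return len(s) })
	fmt.Println(lengths)

	// The same function works for a different pair of types.
	labels := MapSlice(lengths, func(n int) string { return fmt.Sprintf("len=%d", n) })
	fmt.Println(labels)
}
```

Type arguments rarely need to be written out at the call site, which keeps generic helpers close to ordinary function calls in readability.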
|
||||
|
||||
## Error Handling Idioms
|
||||
|
||||
### Errors Are Values
|
||||
|
||||
This is a core Go idea. Failure is usually represented as an `error` return value.
|
||||
|
||||
```go
func saveUser(ctx context.Context, repo Store, user User) error {
	if err := repo.Save(ctx, user); err != nil {
		return fmt.Errorf("save user %d: %w", user.ID, err)
	}

	return nil
}
```
|
||||
|
||||
### Why Go Chooses Explicit Errors
|
||||
|
||||
In systems code, unexpected control flow is expensive to reason about. Explicit errors keep the failure path visible.
|
||||
|
||||
This helps with:
|
||||
|
||||
- tracing the source of failures
|
||||
- adding context at each layer
|
||||
- making retry and fallback decisions deliberately
|
||||
|
||||
### Wrapping Errors
|
||||
|
||||
Use `%w` with `fmt.Errorf` when you want to preserve the underlying cause.
|
||||
|
||||
```go
|
||||
return fmt.Errorf("read config: %w", err)
|
||||
```
|
||||
|
||||
Then callers can inspect it with `errors.Is` or `errors.As`.
|
||||
|
||||
### Sentinel Errors and Typed Errors
|
||||
|
||||
Sentinel errors are package-level variables for meaningful known cases.
|
||||
|
||||
```go
|
||||
var ErrNotFound = errors.New("not found")
|
||||
```
|
||||
|
||||
Typed errors are useful when the caller needs structured details.
|
||||
|
||||
Use them carefully. Too many special-case errors can make APIs harder to use.
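
The sketch below combines wrapping, a sentinel error, and a typed error. `QueryError`, `findUser`, `isNotFound`, and `queryOf` are hypothetical names introduced for the example; `errors.Is` and `errors.As` are the standard library functions mentioned above.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotFound = errors.New("not found")

// QueryError is a hypothetical typed error carrying structured detail.
type QueryError struct {
	Query string
	Err   error
}

func (e *QueryError) Error() string { return "query " + e.Query + ": " + e.Err.Error() }

// Unwrap lets errors.Is and errors.As look through QueryError.
func (e *QueryError) Unwrap() error { return e.Err }

// findUser simulates a lookup that always fails, adding context with %w.
func findUser(id int) error {
	qe := &QueryError{Query: "select user", Err: ErrNotFound}
	return fmt.Errorf("find user %d: %w", id, qe)
}

// isNotFound checks for the sentinel anywhere in the chain.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

// queryOf extracts the typed error's detail, if one is in the chain.
func queryOf(err error) (string, bool) {
	var qe *QueryError
	if errors.As(err, &qe) {
		return qe.Query, true
	}
	return "", false
}

func main() {
	err := findUser(7)
	fmt.Println(err)
	fmt.Println(isNotFound(err))
	fmt.Println(queryOf(err))
}
```

Comparing with `==` would fail here because the sentinel is two layers deep; `errors.Is` walks the chain instead.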
|
||||
|
||||
### When to Panic
|
||||
|
||||
`panic` is for truly unrecoverable programmer or runtime states, not normal business failures.
|
||||
|
||||
Good uses are rare:
|
||||
|
||||
- impossible invariants that indicate a bug
|
||||
- startup-time failures in very small programs where continuing makes no sense
|
||||
|
||||
Bad uses are common:
|
||||
|
||||
- validation failure from user input
|
||||
- database connection hiccups during a request
|
||||
- file-not-found in normal control flow
|
||||
|
||||
In production services, panicking on ordinary errors creates instability and poor operability.
|
||||
|
||||
## Real-World Usage Patterns
|
||||
|
||||
### Request and Response Structs
|
||||
|
||||
Structs commonly model HTTP payloads, database rows, queue messages, and config.
|
||||
|
||||
```go
type CreateOrderRequest struct {
	CustomerID  string `json:"customer_id"`
	AmountCents int64  `json:"amount_cents"`
}
```
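
A minimal sketch of decoding a payload into that struct with `encoding/json`. `decodeOrder` is a hypothetical helper, and the JSON literal is made-up sample data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type CreateOrderRequest struct {
	CustomerID  string `json:"customer_id"`
	AmountCents int64  `json:"amount_cents"`
}

// decodeOrder parses a JSON document into the request struct and
// wraps any decoding failure with context.
func decodeOrder(data []byte) (CreateOrderRequest, error) {
	var req CreateOrderRequest
	if err := json.Unmarshal(data, &req); err != nil {
		return CreateOrderRequest{}, fmt.Errorf("decode order: %w", err)
	}
	return req, nil
}

func main() {
	req, err := decodeOrder([]byte(`{"customer_id":"c-1","amount_cents":1250}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.CustomerID, req.AmountCents)

	_, err = decodeOrder([]byte(`{"amount_cents":"not a number"}`))
	fmt.Println(err != nil)
}
```

The struct tags map the snake_case JSON keys to exported Go fields. Fields missing from the payload are left at their zero values, so validation still has to happen after decoding.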
|
||||
|
||||
### Interfaces at Boundaries
|
||||
|
||||
A service often depends on an interface for storage or external calls, while the concrete implementation stays in another package.
|
||||
|
||||
### Explicit Error Returns for Every I/O Layer
|
||||
|
||||
Anything involving the network, disk, serialization, or databases should return precise errors with context. This is normal, not noisy. It is how production Go code stays debuggable.
|
||||
|
||||
## Common Mistakes and Misconceptions
|
||||
|
||||
### Mistake: Treating Slices Like Independent Dynamic Arrays
|
||||
|
||||
Slices can share backing storage. Appends and sub-slices can interact in surprising ways if you ignore capacity and aliasing.
|
||||
|
||||
### Mistake: Using Pointers Everywhere
|
||||
|
||||
Beginners sometimes assume pointer-heavy code is more advanced. Often it is just harder to read and increases allocation pressure. Use pointers for a reason, not by default.
|
||||
|
||||
### Mistake: Designing Huge Interfaces Up Front
|
||||
|
||||
Go interfaces work best when they are small and shaped by actual use. Large "service interfaces" often become rigid and awkward.
|
||||
|
||||
### Mistake: Ignoring Unicode Details
|
||||
|
||||
Indexing a string operates on bytes, not necessarily user-visible characters. This matters for APIs, validation, and text handling.
|
||||
|
||||
### Mistake: Using `panic` for Routine Error Handling
|
||||
|
||||
That is usually a sign you are importing habits from another language rather than using Go idioms.
|
||||
|
||||
### Mistake: Confusing Nil Values
|
||||
|
||||
Nil slices, nil maps, nil pointers, and nil interfaces do not all behave the same. Learn their semantics explicitly.
|
||||
|
||||
## Summary
|
||||
|
||||
Go's core language is intentionally compact, but it is not shallow. The important ideas are practical:
|
||||
|
||||
- values and types are central
|
||||
- slices, maps, and strings have concrete runtime behavior worth understanding
|
||||
- functions return errors explicitly
|
||||
- structs and methods model data and behavior
|
||||
- interfaces describe behavior and work best when small
|
||||
- pointers and memory behavior matter, but Go shields you from manual memory management
|
||||
|
||||
If you can read and write these fundamentals fluently, you are ready for the most distinctive part of Go: concurrency with goroutines, channels, synchronization primitives, and request-scoped cancellation.
|
||||
@@ -0,0 +1,465 @@
|
||||
# Go: Concurrency and Goroutines
|
||||
|
||||
## Learning Objectives
|
||||
|
||||
- Understand the difference between concurrency and parallelism.
|
||||
- Learn how goroutines work and why they are cheaper than OS threads.
|
||||
- Use channels, `select`, mutexes, and synchronization primitives appropriately.
|
||||
- Understand the `context` package as the control plane for cancellation and deadlines.
|
||||
- Build a light but correct mental model of Go's memory model and data races.
|
||||
- Recognize common production concurrency patterns and the bugs that come with them.
|
||||
|
||||
## Why Concurrency Matters in Go
|
||||
|
||||
Go became popular partly because it made concurrent programming feel accessible.
|
||||
|
||||
Backend and systems software naturally deals with many things at once:
|
||||
|
||||
- handling multiple HTTP requests
|
||||
- waiting on databases and other services
|
||||
- processing jobs from queues
|
||||
- streaming data through pipelines
|
||||
- watching timers, sockets, and shutdown signals
|
||||
|
||||
If you handle all of that in a single linear flow, the program spends a lot of time idle. Concurrency lets you structure work so independent tasks can make progress without blocking each other unnecessarily.
|
||||
|
||||
### Concurrency vs Parallelism
|
||||
|
||||
- concurrency is about structuring many tasks in progress
|
||||
- parallelism is about tasks literally running at the same time on multiple CPU cores
|
||||
|
||||
Go helps with both, but it starts with concurrency as a programming model.
|
||||
|
||||
## Goroutines: Lightweight Concurrent Execution
|
||||
|
||||
### What They Are
|
||||
|
||||
A goroutine is a function executing independently from other goroutines.
|
||||
|
||||
```go
go sendEmail(userID)
```
|
||||
|
||||
That single keyword starts a concurrent unit of execution.
|
||||
|
||||
### Why Goroutines Exist
|
||||
|
||||
OS threads are powerful, but they are relatively expensive to create and manage directly. Go wanted a lighter abstraction so programs could comfortably run thousands or even millions of concurrent tasks, as long as the workload and memory usage made that reasonable.
|
||||
|
||||
### How Goroutines Work Internally
|
||||
|
||||
Go uses an M:N scheduler. In simplified terms:
|
||||
|
||||
- many goroutines are multiplexed onto fewer OS threads
|
||||
- the runtime scheduler decides which goroutine runs where
|
||||
- when a goroutine blocks on a channel or in a system call, the scheduler runs other goroutines on the available threads to keep work moving
|
||||
|
||||
The common mental model is `G`, `M`, and `P`:
|
||||
|
||||
- `G` is a goroutine
|
||||
- `M` is an OS thread, called a machine in runtime terminology
|
||||
- `P` is a processor token that lets Go code execute and carries scheduler state
|
||||
|
||||
```mermaid
flowchart LR
    G1[Goroutine] --> P1[Processor P]
    G2[Goroutine] --> P1
    G3[Goroutine] --> P2[Processor P]
    P1 --> M1[OS Thread M]
    P2 --> M2[OS Thread M]
```
|
||||
|
||||
This model lets Go keep concurrency cheap while still using real CPU parallelism when available.
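To make the G/M/P picture a little more concrete, here is a minimal sketch that parks many goroutines and then asks the runtime how many `P`s and goroutines exist. The helper name `schedulerSnapshot` and the count of 1000 are illustrative assumptions; the printed numbers depend on the machine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// schedulerSnapshot reports the number of Ps (the GOMAXPROCS setting) and
// the number of goroutines that currently exist.
func schedulerSnapshot() (procs, goroutines int) {
	// GOMAXPROCS(0) only queries the value; it does not change it.
	return runtime.GOMAXPROCS(0), runtime.NumGoroutine()
}

func main() {
	var wg sync.WaitGroup
	release := make(chan struct{})

	// Start far more goroutines than there are Ps. They all park on the
	// channel receive, so they cost memory but no CPU while waiting.
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release
		}()
	}

	procs, goroutines := schedulerSnapshot()
	fmt.Printf("Ps: %d, goroutines: %d\n", procs, goroutines)

	close(release) // wake every parked goroutine
	wg.Wait()
}
```

The goroutine count will typically be far larger than the number of `P`s, which is the M:N idea in practice.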
|
||||
|
||||
### Why Goroutines Feel Cheap
|
||||
|
||||
They start with small stacks (a few kilobytes in current runtimes) that grow as needed. That is very different from traditional thread models where each thread may reserve a much larger stack up front.
|
||||
|
||||
Still, "cheap" does not mean "free."
|
||||
|
||||
Each goroutine has:
|
||||
|
||||
- scheduler overhead
|
||||
- stack memory
|
||||
- potential references keeping heap data alive
|
||||
|
||||
Launching goroutines without bounds in a busy server can still create memory pressure and operational problems.
|
||||
|
||||
## Waiting for Goroutines to Finish
|
||||
|
||||
When goroutines need coordination, a common tool is `sync.WaitGroup`.
|
||||
|
||||
```go
var wg sync.WaitGroup

for _, id := range ids {
	wg.Add(1) // register the task before starting its goroutine

	go func(userID int64) {
		defer wg.Done()
		processUser(userID)
	}(id)
}

wg.Wait() // blocks until every Done call has run
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- lets one part of the program wait for a known set of concurrent tasks
|
||||
- keeps coordination explicit without using channels for every case
|
||||
|
||||
In production code, `WaitGroup` is often simpler than a custom done channel when you only need task completion, not data transfer.
|
||||
|
||||
## Channels: Communication and Coordination
|
||||
|
||||
### What a Channel Is
|
||||
|
||||
A channel is a typed conduit used to send values between goroutines.
|
||||
|
||||
```go
jobs := make(chan int)
results := make(chan string)
```
|
||||
|
||||
### Why Channels Exist
|
||||
|
||||
Go popularized the proverb "Do not communicate by sharing memory; instead, share memory by communicating." The point is not that shared memory is forbidden. The point is that ownership transfer through communication is often easier to reason about than unrestricted shared mutation.
|
||||
|
||||
Channels are useful for:
|
||||
|
||||
- handing work to workers
|
||||
- propagating results
|
||||
- signaling completion
|
||||
- coordinating pipelines
|
||||
|
||||
### Unbuffered Channels
|
||||
|
||||
An unbuffered channel requires sender and receiver to synchronize.
|
||||
|
||||
```go
done := make(chan struct{})

go func() {
	fmt.Println("work complete")
	done <- struct{}{}
}()

<-done
```
|
||||
|
||||
Why this matters:
|
||||
|
||||
- send and receive form a handoff point
|
||||
- it is both data transfer and synchronization
|
||||
|
||||
### Buffered Channels
|
||||
|
||||
A buffered channel can hold a fixed number of values without an immediate receiver.
|
||||
|
||||
```go
queue := make(chan string, 100)
queue <- "task-1"
```
|
||||
|
||||
Why buffered channels exist:
|
||||
|
||||
- smooth over short bursts
|
||||
- decouple producer and consumer timing somewhat
|
||||
- model bounded queues naturally
|
||||
|
||||
Do not treat buffering as magic. A producer that outpaces its consumers will still fill the buffer and then block on send.
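One way to see the bound is to attempt a send that refuses to block. The following is a sketch only; `tryEnqueue` is a hypothetical helper, and it uses `select` with a `default` case so a full buffer is reported instead of stalling the caller.

```go
package main

import "fmt"

// tryEnqueue attempts a send without blocking. It reports false when the
// buffer is full instead of stalling the caller.
func tryEnqueue(queue chan<- string, task string) bool {
	select {
	case queue <- task:
		return true
	default:
		return false
	}
}

func main() {
	queue := make(chan string, 2)

	fmt.Println(tryEnqueue(queue, "task-1")) // true
	fmt.Println(tryEnqueue(queue, "task-2")) // true
	fmt.Println(tryEnqueue(queue, "task-3")) // false: the buffer is full
	fmt.Println(len(queue), cap(queue))      // 2 2
}
```

Whether to drop, retry, or block when the queue is full is a design decision; the sketch only makes the full condition visible.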
|
||||
|
||||
### Closing Channels
|
||||
|
||||
Closing a channel means no more values will be sent.
|
||||
|
||||
```go
close(queue)
```
|
||||
|
||||
Rules that matter:
|
||||
|
||||
- only close from the sending side when it owns completion
|
||||
- do not close a channel just because you are done receiving from it
|
||||
- sending on a closed channel panics, and so does closing a channel that is already closed
|
||||
|
||||
Receivers can use the two-result form:
|
||||
|
||||
```go
value, ok := <-queue
```
|
||||
|
||||
When `ok` is false, the channel is closed and drained.
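The closing rules can be sketched as a producer that owns the channel and closes it, plus a consumer that simply ranges until the channel is closed and empty. The names `produce` and `sum` are illustrative, not part of any library.

```go
package main

import "fmt"

// produce sends 1..n and then closes the channel, because the sender is the
// side that knows when production is complete.
func produce(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i
		}
	}()
	return out
}

// sum drains the channel; the range loop ends once the channel is closed
// and empty.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(produce(4))) // 10

	// Receiving from a closed, empty channel yields the zero value and false.
	v, ok := <-produce(0)
	fmt.Println(v, ok) // 0 false
}
```

Returning a receive-only channel (`<-chan int`) also makes it impossible for the consumer to close it by accident.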
|
||||
|
||||
### When Not to Use Channels
|
||||
|
||||
Channels are excellent, but not universal. If you just need to protect a shared map or counter, a mutex may be simpler. Overusing channels can make code look concurrent while actually becoming harder to understand.
|
||||
|
||||
## `select`: Wait on Multiple Communication Paths
|
||||
|
||||
`select` lets a goroutine wait on multiple channel operations.
|
||||
|
||||
```go
select {
case result := <-results:
	fmt.Println("got result", result)
case <-time.After(200 * time.Millisecond):
	fmt.Println("timed out")
}
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- real systems often wait on multiple events
|
||||
- timeouts and cancellation are first-class concerns
|
||||
- many concurrent flows need to react to whichever signal arrives first
|
||||
|
||||
### Real-World Use: Timeout and Cancellation
|
||||
|
||||
```go
select {
case msg := <-incoming:
	handle(msg)
case <-ctx.Done():
	return ctx.Err()
}
```
|
||||
|
||||
This is the backbone of responsive concurrent systems in Go: do work if possible, but remain interruptible.
|
||||
|
||||
## The `context` Package: Cancellation, Deadlines, and Scope
|
||||
|
||||
### What It Is
|
||||
|
||||
`context.Context` carries request-scoped cancellation, deadlines, and small pieces of request metadata across API boundaries.
|
||||
|
||||
```go
func FetchUser(ctx context.Context, id string) (User, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://example.internal/users/"+id, nil)
	if err != nil {
		return User{}, err
	}

	// The request carries ctx, so Do returns early if ctx is canceled
	// or its deadline passes.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return User{}, err
	}
	defer resp.Body.Close()

	// Decoding resp.Body into a User is omitted here.
	return User{}, nil
}
```
|
||||
|
||||
### Why It Exists
|
||||
|
||||
In distributed systems, work rarely lives in a single function. An HTTP request may trigger:
|
||||
|
||||
- JSON parsing
|
||||
- database queries
|
||||
- downstream HTTP calls
|
||||
- cache lookups
|
||||
- logging and tracing
|
||||
|
||||
If the client disconnects or a deadline expires, you want the whole chain to stop promptly. Context gives the program a standard way to express that control signal.
|
||||
|
||||
### How It Works Internally
|
||||
|
||||
Contexts form a tree.
|
||||
|
||||
- a parent context can be derived into child contexts
|
||||
- canceling the parent cancels all children
|
||||
- deadlines propagate downward
|
||||
|
||||
```mermaid
graph TD
    A[Background Context] --> B[HTTP Request Context]
    B --> C[DB Query Context]
    B --> D[Downstream API Context]
    B --> E[Worker Task Context]
```
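The tree behavior can be checked with a small sketch: derive a child with its own generous timeout, cancel the parent, and look at what the child reports. The function name `childErrAfterParentCancel` and the one-hour timeout are illustrative assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// childErrAfterParentCancel derives a child context with a long timeout,
// cancels the parent, and returns the error the child reports.
func childErrAfterParentCancel() error {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild() // always release the child's timer

	cancelParent()
	<-child.Done() // closed because the parent was canceled, not by the timeout
	return child.Err()
}

func main() {
	err := childErrAfterParentCancel()
	fmt.Println(errors.Is(err, context.Canceled)) // true
}
```

The child reports `context.Canceled` rather than `context.DeadlineExceeded`, because cancellation reached it from the parent long before its own deadline.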
|
||||
|
||||
### Rules for Using Context Correctly
|
||||
|
||||
- pass it as the first parameter by convention
|
||||
- do not store it inside structs for long-lived use
|
||||
- do not use it as a bag of optional business parameters
|
||||
- respect cancellation by checking `ctx.Done()` or using context-aware APIs
|
||||
|
||||
### Common Misuse
|
||||
|
||||
Putting every random value into context makes code opaque. Use context values only for request-scoped metadata that crosses process boundaries or middleware layers, such as trace IDs or auth claims when your framework expects it.
|
||||
|
||||
## Mutexes, RWMutexes, and Atomics
|
||||
|
||||
### Why These Exist Alongside Channels
|
||||
|
||||
The slogan "share memory by communicating" is helpful, but it is not a religion. Some problems are fundamentally shared-state problems.
|
||||
|
||||
Examples:
|
||||
|
||||
- protecting a cache map
|
||||
- incrementing metrics counters
|
||||
- updating a shared in-memory registry
|
||||
|
||||
For these, a mutex is often clearer than designing a special manager goroutine and channel protocol.
|
||||
|
||||
### `sync.Mutex`
|
||||
|
||||
```go
type Counter struct {
	mu    sync.Mutex
	value int64
}

func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value++
}
```
|
||||
|
||||
Why it works:
|
||||
|
||||
- only one goroutine can hold the lock at a time
|
||||
- the critical section becomes explicit
|
||||
|
||||
### `sync.RWMutex`
|
||||
|
||||
`sync.RWMutex` allows many concurrent readers or a single writer. It is useful when reads are much more frequent than writes, but do not assume it is always faster. Its benefits depend on workload and contention patterns.
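A minimal sketch of a read-mostly cache guarded by `sync.RWMutex` follows. The `Cache` type and its methods are illustrative assumptions, not an API from this guide.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a string map that is safe for concurrent use.
type Cache struct {
	mu    sync.RWMutex
	items map[string]string
}

func NewCache() *Cache {
	return &Cache{items: make(map[string]string)}
}

// Get takes the read lock, so many readers can proceed at the same time.
func (c *Cache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[key]
	return v, ok
}

// Set takes the write lock, which excludes readers and other writers.
func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = value
}

func main() {
	c := NewCache()
	c.Set("region", "eu-west")

	v, ok := c.Get("region")
	fmt.Println(v, ok) // eu-west true
}
```

If profiling shows that writes are frequent, swapping the field for a plain `sync.Mutex` is a small change and often just as fast.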
|
||||
|
||||
### `sync/atomic`
|
||||
|
||||
Atomic operations are useful for low-level counters, flags, and lock-free coordination where the semantics are simple and precise.
|
||||
|
||||
Use atomics carefully. They are powerful but easy to misuse if you do not understand memory ordering and invariants.
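For the simple counter case, a sketch with `atomic.Int64` (available since Go 1.19) might look like the following. `countConcurrently` is a hypothetical helper used only to show that no increments are lost.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently increments one shared counter from several goroutines
// and returns the final value.
func countConcurrently(workers, perWorker int) int64 {
	var hits atomic.Int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				hits.Add(1) // atomic read-modify-write, no mutex needed
			}
		}()
	}

	wg.Wait()
	return hits.Load()
}

func main() {
	fmt.Println(countConcurrently(8, 1000)) // 8000
}
```

This works because the invariant is a single number. Once two fields must change together, a mutex is usually the safer tool.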
|
||||
|
||||
## The Go Memory Model, Lightly Explained
|
||||
|
||||
The memory model answers a critical question: when one goroutine writes data, when is another goroutine guaranteed to see it?
|
||||
|
||||
If two goroutines touch the same variable without proper synchronization and at least one access is a write, you have a data race.
|
||||
|
||||
This is not a style issue. It is a correctness bug.
|
||||
|
||||
### Synchronization Creates Visibility Guarantees
|
||||
|
||||
Common happens-before edges include:
|
||||
|
||||
- sending on a channel before the corresponding receive completes
|
||||
- unlocking a mutex before a later lock on that mutex
|
||||
- closing a channel before receives observe closure
|
||||
- a `WaitGroup` `Done` call before the return of the `Wait` call it unblocks
|
||||
|
||||
If you rely on plain timing, such as "the other goroutine will probably run first," you do not have a guarantee.
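As a sketch of one such edge, the example below writes a value in one goroutine and publishes it by closing a channel. Because the close is ordered before the receive that observes it, the read is guaranteed to see the write. The `config` type and `loadThenPublish` are illustrative names.

```go
package main

import "fmt"

type config struct{ addr string }

// loadThenPublish writes cfg in a goroutine and reads it in the caller.
// The channel close orders the write before the read.
func loadThenPublish() string {
	var cfg *config
	ready := make(chan struct{})

	go func() {
		cfg = &config{addr: "localhost:8080"} // write
		close(ready)                          // publish
	}()

	<-ready         // wait for the publish
	return cfg.addr // read, ordered after the write
}

func main() {
	fmt.Println(loadThenPublish()) // localhost:8080
}
```

Replacing `<-ready` with a short sleep would leave the same code racy, even if it appeared to work on every run.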
|
||||
|
||||
### Why This Matters in Production
|
||||
|
||||
Data races can pass tests and still fail under load, on different CPUs, or only once every few days. That is why race bugs are among the most frustrating backend failures.
|
||||
|
||||
Use the race detector early:
|
||||
|
||||
```bash
go test -race ./...
```
|
||||
|
||||
## Concurrency Patterns You Will Actually Use
|
||||
|
||||
### Worker Pool
|
||||
|
||||
Useful when you have many jobs but want bounded concurrency.
|
||||
|
||||
```go
func worker(id int, jobs <-chan int, results chan<- int) {
	for job := range jobs {
		results <- job * 2
	}
}
```
|
||||
|
||||
```mermaid
flowchart LR
    A[Job Producer] --> B[Buffered Jobs Channel]
    B --> C[Worker 1]
    B --> D[Worker 2]
    B --> E[Worker 3]
    C --> F[Results Channel]
    D --> F
    E --> F
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- prevents unbounded goroutine creation
|
||||
- smooths throughput
|
||||
- matches CPU or downstream capacity constraints
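A worker function alone does not show who starts the workers or who closes the channels. One possible wiring is sketched below; `runPool` is a hypothetical helper, the worker drops the unused `id` parameter for brevity, and results are sorted only because completion order is not deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// worker doubles every job it receives until the jobs channel is closed.
func worker(jobs <-chan int, results chan<- int) {
	for job := range jobs {
		results <- job * 2
	}
}

// runPool processes inputs with a fixed number of workers.
func runPool(numWorkers int, inputs []int) []int {
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			worker(jobs, results)
		}()
	}

	// Producer: send every job, then close so the workers' range loops end.
	go func() {
		defer close(jobs)
		for _, in := range inputs {
			jobs <- in
		}
	}()

	// Closer: once every worker has returned, no more results can arrive.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []int
	for r := range results {
		out = append(out, r)
	}
	sort.Ints(out)
	return out
}

func main() {
	fmt.Println(runPool(3, []int{1, 2, 3, 4, 5})) // [2 4 6 8 10]
}
```

Each channel is closed by the side that knows no more values are coming: the producer closes `jobs`, and a dedicated goroutine closes `results` after all workers exit.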
|
||||
|
||||
### Fan-Out and Fan-In
|
||||
|
||||
This pattern sends work to multiple goroutines and merges results back together. It is common in API aggregation, search, and parallel I/O.
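A minimal sketch of the shape: each lookup runs in its own goroutine (fan-out) and sends into one shared channel that the caller drains (fan-in). The name `fanOut` and the string-returning lookups are assumptions standing in for real service calls.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fanOut runs every lookup concurrently and merges the answers.
func fanOut(lookups []func() string) []string {
	// Buffered to the number of senders, so no goroutine can block on send.
	merged := make(chan string, len(lookups))

	var wg sync.WaitGroup
	for _, lookup := range lookups {
		wg.Add(1)
		go func(fn func() string) {
			defer wg.Done()
			merged <- fn()
		}(lookup)
	}

	wg.Wait()
	close(merged) // all senders are done, so closing is safe

	var out []string
	for s := range merged {
		out = append(out, s)
	}
	sort.Strings(out) // arrival order is not deterministic
	return out
}

func main() {
	got := fanOut([]func() string{
		func() string { return "profile" },
		func() string { return "inventory" },
		func() string { return "pricing" },
	})
	fmt.Println(got) // [inventory pricing profile]
}
```

A production version would usually also pass a `context.Context` into each lookup so slow calls can be abandoned.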
|
||||
|
||||
### Pipelines
|
||||
|
||||
Each stage reads from an input channel, transforms data, and sends to an output channel. This is useful for streaming transformations, though you must design cancellation carefully or you can leak goroutines when downstream stops consuming.
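The leak risk is easiest to see with an unbounded source. In the sketch below every send also selects on `ctx.Done()`, so when the consumer stops early and cancels, both stages exit instead of blocking forever. The stage names `generate` and `square` and the helper `firstSquares` are illustrative.

```go
package main

import (
	"context"
	"fmt"
)

// generate emits 0, 1, 2, ... until ctx is canceled.
func generate(ctx context.Context) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// square reads from in, squares each value, and stops on cancellation.
func square(ctx context.Context, in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			select {
			case out <- v * v:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// firstSquares takes n results and then cancels, which lets both stages exit.
func firstSquares(n int) []int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	results := square(ctx, generate(ctx))
	var got []int
	for len(got) < n {
		got = append(got, <-results)
	}
	return got
}

func main() {
	fmt.Println(firstSquares(4)) // [0 1 4 9]
}
```

Without the `ctx.Done()` cases, `generate` would stay blocked on its send after the consumer walked away.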
|
||||
|
||||
### Bounded Semaphores with Channels
|
||||
|
||||
A buffered channel can act as a semaphore controlling how many operations run at once. This is handy for limiting downstream API calls or database work.
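As a sketch of the idea, the channel's capacity is the limit: a send acquires a slot and a receive releases it. `runLimited` is a hypothetical helper; the peak tracking exists only to make the bound observable and is not part of the pattern itself.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited runs every task but lets at most limit of them execute at once.
// It returns the highest concurrency it observed.
func runLimited(limit int, tasks []func()) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak atomic.Int64

	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks while limit tasks are in flight
		go func(run func()) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			n := running.Add(1)
			for { // record a new peak if this is the highest count so far
				p := peak.Load()
				if n <= p || peak.CompareAndSwap(p, n) {
					break
				}
			}
			run()
			running.Add(-1)
		}(task)
	}

	wg.Wait()
	return peak.Load()
}

func main() {
	tasks := make([]func(), 20)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(time.Millisecond) }
	}
	fmt.Println(runLimited(3, tasks) <= 3) // true
}
```

Acquiring before the `go` statement also bounds the number of goroutines, not just the number of running tasks.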
|
||||
|
||||
## Real-World Usage Patterns
|
||||
|
||||
### HTTP Request Fan-Out
|
||||
|
||||
An API gateway might receive one request, then concurrently ask a profile service, inventory service, and pricing service for data. Context cancellation ensures that if the client goes away or a deadline expires, those downstream calls stop too.
|
||||
|
||||
### Background Job Processing
|
||||
|
||||
A worker service reading from a queue often uses:
|
||||
|
||||
- one intake goroutine
|
||||
- a bounded worker pool
|
||||
- retry logic
|
||||
- context cancellation for shutdown
|
||||
- metrics on success, failure, and latency
|
||||
|
||||
### Streaming and Event Processing
|
||||
|
||||
Go is good at managing concurrent streams from sockets, brokers, or internal pipelines because goroutines map well to independent flows of work.
|
||||
|
||||
## Common Mistakes and Misconceptions
|
||||
|
||||
### Mistake: Spawning Unbounded Goroutines
|
||||
|
||||
If every request starts many goroutines without a limit, memory and scheduler pressure can explode under load.
|
||||
|
||||
### Mistake: Forgetting Cancellation
|
||||
|
||||
Goroutines that wait forever on channels, I/O, or timers become leaks. In servers, leaked goroutines are a real operational bug.
|
||||
|
||||
### Mistake: Closing Channels from the Wrong Side
|
||||
|
||||
Channel closure should usually be owned by the sender that knows when production is complete.
|
||||
|
||||
### Mistake: Using Channels for Everything
|
||||
|
||||
Sometimes a mutex is the simplest and most correct tool.
|
||||
|
||||
### Mistake: Assuming Concurrent Means Safe
|
||||
|
||||
Starting work in multiple goroutines does not automatically make the code synchronized. Shared state still needs a correctness story.
|
||||
|
||||
### Mistake: Ignoring the Race Detector
|
||||
|
||||
If you write concurrent Go and do not run `go test -race`, you are skipping one of the most useful safety tools in the ecosystem.
|
||||
|
||||
### Mistake: Misusing Context Values
|
||||
|
||||
Context is for cancellation, deadlines, and narrow request-scoped metadata. It is not general dependency injection.
|
||||
|
||||
## Summary
|
||||
|
||||
Go concurrency is powerful because it combines a simple source-level model with strong runtime support.
|
||||
|
||||
- goroutines make concurrent work cheap to express
|
||||
- channels coordinate ownership transfer and signaling
|
||||
- `select` handles multiple events, timeouts, and cancellation
|
||||
- mutexes and atomics remain essential for shared-state problems
|
||||
- `context` is the control plane for request-scoped work
|
||||
- the memory model and race detector protect correctness when multiple goroutines interact
|
||||
|
||||
The next step is learning how to organize real Go codebases: packages, modules, tests, benchmarks, and the toolchain that keeps production Go code clean and maintainable.
|
||||
@@ -0,0 +1,565 @@
|
||||
# Go: Packages, Testing, and Tools
|
||||
|
||||
## Learning Objectives
|
||||
|
||||
- Understand how Go packages and modules organize code and dependencies.
|
||||
- Learn the visibility rules that shape API design in Go.
|
||||
- Write unit tests, table-driven tests, handler tests, benchmarks, and fuzz tests.
|
||||
- Use the Go toolchain to format, vet, benchmark, and inspect code.
|
||||
- Understand common package layout patterns for growing services.
|
||||
- Recognize how tooling discipline keeps Go codebases maintainable in production.
|
||||
|
||||
## Packages and Modules: Two Different Layers
|
||||
|
||||
One of the easiest ways to get confused in Go is to blur packages and modules together. They are connected, but they are not the same thing.
|
||||
|
||||
### Package
|
||||
|
||||
A package is a unit of code organization and namespace.
|
||||
|
||||
Examples:
|
||||
|
||||
- `fmt`
|
||||
- `net/http`
|
||||
- `time`
|
||||
- your own package like `internal/store`
|
||||
|
||||
All non-test files in a directory must belong to the same package, and they compile together. Test files may additionally use an external `_test` package.
|
||||
|
||||
### Module
|
||||
|
||||
A module is the versioned dependency unit defined by `go.mod`.
|
||||
|
||||
One module can contain many packages.
|
||||
|
||||
This distinction matters in real projects because:
|
||||
|
||||
- packages organize design inside the codebase
|
||||
- modules organize dependency and version boundaries across codebases
|
||||
|
||||
## Why Go Organizes Code This Way
|
||||
|
||||
Go wants dependency structure to stay visible and simple.
|
||||
|
||||
Packages make it easy to answer questions like:
|
||||
|
||||
- what code belongs together
|
||||
- what API surface is exported
|
||||
- what dependencies are allowed here
|
||||
|
||||
Modules make it easy to answer questions like:
|
||||
|
||||
- what external libraries does this project depend on
|
||||
- what version of a dependency are we building against
|
||||
- how can another project import this code reproducibly
|
||||
|
||||
In large backend systems, those questions are not administrative details. They directly affect build speed, deploy safety, and team comprehension.
|
||||
|
||||
## Package Naming and Visibility
|
||||
|
||||
### Package Names
|
||||
|
||||
Go package names are usually short, lower-case, and simple.
|
||||
|
||||
Good examples:
|
||||
|
||||
- `store`
|
||||
- `auth`
|
||||
- `queue`
|
||||
- `config`
|
||||
|
||||
Less ideal examples:
|
||||
|
||||
- `storeutils`
|
||||
- `commonhelpers`
|
||||
- `myAmazingPackage`
|
||||
|
||||
Why the simplicity matters:
|
||||
|
||||
- package names appear at every call site
|
||||
- short names keep code readable
|
||||
- packages should represent clear concepts, not junk drawers
|
||||
|
||||
### Exported vs Unexported Identifiers
|
||||
|
||||
Go uses capitalization to control visibility.
|
||||
|
||||
- identifiers starting with an uppercase letter are exported
|
||||
- identifiers starting with a lowercase letter are package-private
|
||||
|
||||
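```go
package store

import "database/sql"

// User is exported: other packages can refer to store.User.
type User struct {
	ID   int64
	Name string
}

// repository is unexported: only code inside package store can name it.
type repository struct {
	db *sql.DB
}

// New is exported, so other packages can construct a repository
// without being able to name its type.
func New(db *sql.DB) *repository {
	return &repository{db: db}
}
```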
|
||||
|
||||
Why this rule exists:
|
||||
|
||||
- it keeps visibility obvious at the point of declaration
|
||||
- it avoids separate access modifiers like `public` and `private`
|
||||
- it encourages small package APIs
|
||||
|
||||
### Internal APIs with `internal/`
|
||||
|
||||
Go has a special `internal` directory rule. A package under an `internal/` directory can only be imported by code inside the directory tree rooted at the parent of that `internal/` directory.
|
||||
|
||||
This is useful when you want:
|
||||
|
||||
- separation between application-private code and reusable libraries
|
||||
- freedom to refactor internals without pretending they are public APIs
|
||||
|
||||
That makes `internal/` a strong default for service code.
|
||||
|
||||
## Common Package Layout Patterns
|
||||
|
||||
There is no single mandatory project structure in Go. That is intentional. The language tries to avoid framework-enforced layout.
|
||||
|
||||
A common service layout looks like this:
|
||||
|
||||
```mermaid
graph TD
    A[payments-service] --> B[go.mod]
    A --> C[cmd/api/main.go]
    A --> D[internal/httpapi]
    A --> E[internal/service]
    A --> F[internal/store]
    A --> G[internal/config]
    A --> H[internal/worker]
```
|
||||
|
||||
### `cmd/`
|
||||
|
||||
Holds entry points for executables.
|
||||
|
||||
If one repository produces multiple binaries, `cmd/` keeps them separate cleanly.
|
||||
|
||||
### `internal/`
|
||||
|
||||
Holds application-private packages.
|
||||
|
||||
For most services, this is where the real code lives.
|
||||
|
||||
### `pkg/`
|
||||
|
||||
Some repositories use `pkg/` for packages intended to be imported by other modules. This is a convention, not a language rule. Use it only if it creates actual clarity.
|
||||
|
||||
### Keep Layout Earned, Not Decorative
|
||||
|
||||
Beginners often copy large directory trees too early. If a project has only one executable and a few packages, keep it smaller until complexity genuinely arrives.
|
||||
|
||||
## Modules, Dependencies, and Reproducibility
|
||||
|
||||
### `go.mod`
|
||||
|
||||
`go.mod` declares the module path and dependency requirements.
|
||||
|
||||
```go
module example.com/payments

go 1.25.0

require (
	github.com/google/uuid v1.6.0
)
```
|
||||
|
||||
### `go.sum`
|
||||
|
||||
`go.sum` records checksums for module content so builds can verify dependency integrity.
|
||||
|
||||
Why this matters:
|
||||
|
||||
- catches unexpected module tampering
|
||||
- helps keep builds reproducible
|
||||
- lets the `go` command verify downloaded modules against known hashes
|
||||
|
||||
### How Version Resolution Works
|
||||
|
||||
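Go uses minimal version selection (MVS), an algorithm designed to be reproducible and understandable: for each module it picks the highest version that any `go.mod` in the build graph requires, never something newer. The details can get deep, but the practical takeaway is: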
|
||||
|
||||
- dependency versions are declared in `go.mod`
|
||||
- the toolchain resolves one version per module in the build graph
|
||||
- commands like `go mod tidy` keep the graph clean
|
||||
|
||||
Useful commands:
|
||||
|
||||
```bash
go get github.com/google/uuid@latest
go mod tidy
go list -m all
```
|
||||
|
||||
### Why Import Cycles Are Forbidden
|
||||
|
||||
Go does not allow cyclic package imports.
|
||||
|
||||
That may feel restrictive, but it enforces architectural clarity. Cycles usually signal packages that were split at the wrong boundary or abstractions that are too tangled.
|
||||
|
||||
In large systems, this restriction prevents dependency graphs from collapsing into spaghetti.
|
||||
|
||||
## Designing Package Boundaries Well
|
||||
|
||||
### Group by Responsibility, Not by Type Name Alone
|
||||
|
||||
Better:
|
||||
|
||||
- `auth`
|
||||
- `store`
|
||||
- `httpapi`
|
||||
- `billing`
|
||||
|
||||
Weaker:
|
||||
|
||||
- `models`
|
||||
- `utils`
|
||||
- `helpers`
|
||||
|
||||
Packages should usually represent behavior or domain areas, not just a pile of vaguely related structs.
|
||||
|
||||
### Define Interfaces Where They Are Consumed
|
||||
|
||||
Instead of putting every interface next to every implementation, define small interfaces at the consumer boundary when that improves decoupling and testing.
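A sketch of what this looks like in practice: the consumer declares a one-method interface that lists only what it needs, and any type with that method satisfies it. All names here (`UserFinder`, `Greeter`, `mapStore`) are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// UserFinder is declared by the consumer and lists only what it needs.
type UserFinder interface {
	FindName(id int64) (string, error)
}

// Greeter depends on the small interface, not on a concrete store.
type Greeter struct{ users UserFinder }

func (g Greeter) Greet(id int64) (string, error) {
	name, err := g.users.FindName(id)
	if err != nil {
		return "", fmt.Errorf("greet user %d: %w", id, err)
	}
	return "hello, " + name, nil
}

var errNotFound = errors.New("user not found")

// mapStore is a stand-in implementation; a test fake would look the same.
type mapStore map[int64]string

func (m mapStore) FindName(id int64) (string, error) {
	name, ok := m[id]
	if !ok {
		return "", errNotFound
	}
	return name, nil
}

func main() {
	g := Greeter{users: mapStore{1: "ada"}}

	msg, err := g.Greet(1)
	fmt.Println(msg, err) // hello, ada <nil>

	_, err = g.Greet(2)
	fmt.Println(errors.Is(err, errNotFound)) // true
}
```

Because the interface lives with `Greeter`, the storage package never has to import or even know about it.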
|
||||
|
||||
### Keep APIs Small
|
||||
|
||||
If another package imports yours, what do they truly need? In Go, smaller package surfaces are easier to maintain and easier to refactor safely.
|
||||
|
||||
## Testing in Go: A Built-In Culture
|
||||
|
||||
Testing is part of normal Go workflow, not an optional afterthought.
|
||||
|
||||
### Where Tests Live
|
||||
|
||||
Tests live in files ending with `_test.go`.
|
||||
|
||||
```text
store/
  store.go
  store_test.go
```
|
||||
|
||||
Functions named `TestXxx` that take a single `*testing.T` parameter are discovered by `go test`.
|
||||
|
||||
```go
func TestParsePort(t *testing.T) {
	port, err := parsePort("8080")
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}

	if port != 8080 {
		t.Fatalf("got %d want %d", port, 8080)
	}
}
```
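The tests in this file call a `parsePort` function that is never shown. One plausible implementation is sketched here so the examples have something concrete behind them; the range check and error wording are assumptions, not the guide's definition.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical implementation: it converts s to an integer
// and rejects values outside the valid TCP/UDP port range.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range 1-65535", n)
	}
	return n, nil
}

func main() {
	port, err := parsePort("8080")
	fmt.Println(port, err) // 8080 <nil>

	_, err = parsePort("abc")
	fmt.Println(err != nil) // true

	_, err = parsePort("70000")
	fmt.Println(err != nil) // true
}
```

In a real package this function would live in a non-test file such as `store.go`, next to the `_test.go` file that exercises it.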
|
||||
|
||||
### Why Go's Testing Model Is Simple
|
||||
|
||||
The standard library's `testing` package is deliberately small. That simplicity means:
|
||||
|
||||
- the barrier to writing tests is low
|
||||
- teams do not need a huge framework to get started
|
||||
- test execution integrates naturally with the language toolchain
|
||||
|
||||
## Table-Driven Tests
|
||||
|
||||
This is one of the most common idioms in Go.
|
||||
|
||||
```go
func TestParsePort(t *testing.T) {
	tests := []struct {
		name    string
		input   string
		want    int
		wantErr bool
	}{
		{name: "valid", input: "8080", want: 8080},
		{name: "invalid", input: "abc", wantErr: true},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := parsePort(tt.input)

			if tt.wantErr {
				if err == nil {
					t.Fatalf("expected error, got nil")
				}
				return
			}

			if err != nil {
				t.Fatalf("unexpected error: %v", err)
			}

			if got != tt.want {
				t.Fatalf("got %d want %d", got, tt.want)
			}
		})
	}
}
```
|
||||
|
||||
Why this pattern is so useful:
|
||||
|
||||
- test cases stay compact
|
||||
- edge cases become easy to enumerate
|
||||
- adding new scenarios becomes mechanical rather than repetitive
|
||||
|
||||
## Testing HTTP Handlers
|
||||
|
||||
For APIs, `net/http/httptest` is essential.
|
||||
|
||||
```go
func TestHealthHandler(t *testing.T) {
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	rec := httptest.NewRecorder()

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte(`{"status":"ok"}`))
	})

	handler.ServeHTTP(rec, req)

	if rec.Code != http.StatusOK {
		t.Fatalf("got %d want %d", rec.Code, http.StatusOK)
	}
}
```
|
||||
|
||||
Why this matters:
|
||||
|
||||
- lets you test handlers without a real network listener
|
||||
- keeps tests fast and deterministic
|
||||
- makes request/response behavior easy to inspect
|
||||
|
||||
## Benchmarks
|
||||
|
||||
Performance-sensitive code can be benchmarked with the same toolchain.
|
||||
|
||||
```go
func BenchmarkParsePort(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_, _ = parsePort("8080")
	}
}
```
|
||||
|
||||
Run it with:
|
||||
|
||||
```bash
go test -bench=. -benchmem ./...
```
|
||||
|
||||
### Why Benchmarks Matter
|
||||
|
||||
In backend systems, intuition about performance is often wrong. Benchmarks let you compare alternatives with data.
|
||||
|
||||
Be careful though:
|
||||
|
||||
- microbenchmarks can miss real I/O behavior
|
||||
- unrealistic inputs produce misleading results
|
||||
- one benchmark is not a system profile
|
||||
|
||||
### Allocation Awareness
|
||||
|
||||
`-benchmem` shows allocation counts and bytes per operation. This is particularly helpful in Go because excess allocation often increases garbage collection pressure.
|
||||
|
||||
## Fuzz Testing
|
||||
|
||||
Go 1.18 and later support fuzz testing through the standard toolchain. A fuzz target such as the one below runs with `go test -fuzz=FuzzParsePort`.
|
||||
|
||||
```go
func FuzzParsePort(f *testing.F) {
	f.Add("8080")
	f.Add("abc")

	f.Fuzz(func(t *testing.T, input string) {
		_, _ = parsePort(input)
	})
}
```
|
||||
|
||||
Why fuzzing is valuable:
|
||||
|
||||
- explores unexpected input combinations
|
||||
- finds parser and validation edge cases
|
||||
- is especially useful for network protocols, text parsing, and serialization code
|
||||
|
||||
## Tooling That Should Be in Your Daily Workflow
|
||||
|
||||
### `go fmt`
|
||||
|
||||
```bash
go fmt ./...
```
|
||||
|
||||
Standard formatting is one of Go's biggest team-level productivity wins.
|
||||
|
||||
### `go test`
|
||||
|
||||
```bash
go test ./...
```
|
||||
|
||||
Run it constantly. Fast feedback is part of idiomatic Go engineering.
|
||||
|
||||
### `go test -race`
|
||||
|
||||
```bash
go test -race ./...
```
|
||||
|
||||
This detects many data races in concurrent code. If you write goroutines and shared state, this is a critical tool.
|
||||
|
||||
### `go vet`
|
||||
|
||||
```bash
go vet ./...
```
|
||||
|
||||
`go vet` looks for suspicious constructs that compile but are likely wrong.
|
||||
|
||||
### `go test -cover`
|
||||
|
||||
```bash
go test -cover ./...
```
|
||||
|
||||
Coverage is useful as a signal, not a religion. High coverage does not guarantee meaningful tests, but very low coverage may show untested risk.
|
||||
|
||||
### Useful External Tools
|
||||
|
||||
- `staticcheck` for deeper linting and bug finding
|
||||
- `govulncheck` for known vulnerability detection in dependencies and reachable code
|
||||
- `goimports` for formatting plus import cleanup
|
||||
- `pprof` for CPU and memory profiling
|
||||
|
||||
Apart from `pprof`, which ships with the Go distribution as `go tool pprof`, these are installed separately, but they fit naturally into Go's tooling culture.
|
||||
|
||||
## Build and Release Workflows
|
||||
|
||||
Go makes builds operationally simple.
|
||||
|
||||
### Build a Binary
|
||||
|
||||
```bash
go build ./cmd/api
```
|
||||
|
||||
### Cross-Compile
|
||||
|
||||
```bash
GOOS=linux GOARCH=amd64 go build ./cmd/api
```
|
||||
|
||||
This is a practical reason Go is so common in infrastructure tools. Cross-platform binaries are relatively easy to produce.
|
||||
|
||||
### Why Single-Binary Delivery Matters
|
||||
|
||||
In deployment pipelines, simplicity is leverage.
|
||||
|
||||
- containers become smaller and simpler
|
||||
- startup is predictable
|
||||
- dependency packaging becomes easier
|
||||
- CI artifacts are easier to reason about
|
||||
|
||||
## Documentation as Part of the API Surface
|
||||
|
||||
Go places real value on package and symbol documentation.
|
||||
|
||||
Exported identifiers should usually have comments when the package is intended for reuse.
|
||||
|
||||
Why this matters:
|
||||
|
||||
- tooling can surface docs automatically
|
||||
- package APIs become easier to consume without reading implementation details
|
||||
- maintainers communicate intent, not just mechanics
|
||||
|
||||
## Real-World Usage Patterns
|
||||
|
||||
### Service Repository Layout
|
||||
|
||||
A production service often has:
|
||||
|
||||
- one or more `cmd/` entry points
|
||||
- internal packages for handlers, business logic, storage, and config
|
||||
- tests next to each package
|
||||
- benchmark and race-check jobs in CI
|
||||
|
||||
### Testing Strategy
|
||||
|
||||
Healthy Go services usually combine:
|
||||
|
||||
- unit tests for pure logic
|
||||
- handler tests with `httptest`
|
||||
- integration tests for DB and network boundaries
|
||||
- benchmarks for hot paths
|
||||
- race detection for concurrent subsystems

### Tooling in CI

A practical CI pipeline often runs:

- `go fmt` or formatting checks
- `go test ./...`
- `go test -race ./...`
- `go vet ./...`
- optional static analysis and vulnerability scanning

## Common Mistakes and Misconceptions

### Mistake: Creating Huge Project Layouts Too Early

Structure should reflect real complexity. Decorative folders make learning and maintenance harder.

### Mistake: Treating `pkg/` as Mandatory

It is only a convention. Many good Go services never use it.

### Mistake: Exporting Too Much

Large public package surfaces make refactoring harder and couple packages unnecessarily.

### Mistake: Ignoring Import Cycles Until Late

If your packages keep wanting to import each other, the boundaries are probably wrong.

### Mistake: Writing Only Happy-Path Tests

Most production failures happen in error paths, edge inputs, and timeout scenarios.

### Mistake: Optimizing from Guesswork Instead of Benchmarks

Measure before changing code for performance reasons.
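
One way to measure rather than guess is to benchmark two candidate implementations side by side. The usual route is a `BenchmarkXxx` function run with `go test -bench`; the sketch below calls `testing.Benchmark` directly only so it works as a standalone program. `joinConcat` and `joinBuilder` are hypothetical functions chosen for illustration, and the printed numbers will vary by machine.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinConcat builds the result with repeated string concatenation.
func joinConcat(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder builds the same result with strings.Builder.
func joinBuilder(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}

	// testing.Benchmark picks b.N automatically and reports time per operation.
	concat := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = joinConcat(parts)
		}
	})
	builder := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = joinBuilder(parts)
		}
	})

	fmt.Println("concat: ", concat)
	fmt.Println("builder:", builder)
}
```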

### Mistake: Treating Coverage as the Goal

Coverage can be useful, but well-chosen tests matter more than inflated percentages.

## Summary

Go packages and modules keep code organization and dependency management explicit. The testing package and standard toolchain make quality checks part of ordinary development rather than an extra framework burden.

The big practical lessons are:

- packages shape design boundaries
- modules shape dependency boundaries
- small APIs and clean imports keep code maintainable
- table-driven tests and `httptest` are core testing patterns
- benchmarks, race checks, and tooling provide objective feedback
- operational simplicity is one of Go's biggest strengths

The next step is to bring all of this together in production system design: HTTP servers, request lifecycles, context propagation, timeouts, graceful shutdown, observability, and distributed systems patterns in Go.
@@ -0,0 +1,470 @@
# Go: Real-World System Design in Go

## Learning Objectives

- Understand how Go is used to build production HTTP services and distributed systems.
- Learn the request lifecycle in `net/http` and how handlers interact with context.
- Design services with clear boundaries, sane package structure, and operational safety.
- Apply timeouts, cancellation, connection reuse, and graceful shutdown correctly.
- Recognize common backend patterns in Go for workers, queues, caches, and external service calls.
- Reason about real-world tradeoffs rather than just writing syntax-correct code.

## Why Go Works Well for Production Systems

By the time you reach system design, you should stop thinking of Go as just a language and start thinking of it as an operating model.

Go is attractive in production because it combines:

- native binaries that are easy to ship
- a concurrency model that fits network services well
- a standard library strong enough to build real servers
- explicit errors and visible control flow
- tooling that supports fast feedback and straightforward CI

That combination makes Go common in:

- REST and JSON APIs
- RPC services
- control-plane components
- stream and queue consumers
- gateways and reverse proxies
- schedulers and automation tooling

## The HTTP Request Lifecycle in Go

### What `net/http` Gives You

Go's standard library includes both an HTTP server and HTTP client. You do not need a framework to build a real API.

At a high level:

1. the server listens on a socket
2. connections are accepted
3. requests are parsed
4. a handler is invoked
5. the handler writes a response
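
The five steps above can be sketched with nothing but the standard library. To keep the program self-terminating, it listens on a random loopback port, sends itself one request, and shuts down; `hello` and `fetchOnce` are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// hello is the handler invoked in step 4; it writes the response in step 5.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "hello from %s", r.URL.Path)
}

// fetchOnce starts a server on a random local port, sends one request, and
// returns the status code and body.
func fetchOnce(path string) (int, string, error) {
	// Step 1: listen on a socket (port 0 asks the OS for any free port).
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(hello)}
	// Steps 2 and 3: Serve accepts connections and parses requests.
	go srv.Serve(ln)
	defer srv.Close()

	resp, err := http.Get("http://" + ln.Addr().String() + path)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	code, body, err := fetchOnce("/ping")
	fmt.Println(code, body, err) // prints "200 hello from /ping <nil>"
}
```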

Conceptually, the flow looks like this:

```mermaid
graph TD
    A[Client] --> B[Load Balancer]
    B --> C[Go HTTP Server]
    C --> D[Middleware]
    D --> E[Handler]
    E --> F[Service Layer]
    F --> G[Database]
    F --> H[Cache]
    F --> I[Downstream API]
```

### Why This Model Is Powerful

The standard library model is deliberately small:

- `http.Handler` is just an interface
- middleware is ordinary function composition
- routing can be simple or sophisticated depending on need

This keeps the underlying mechanics easy to understand. Even if you later use a router or framework, it usually plugs into the same `http.Handler` shape.
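
To make "middleware is ordinary function composition" concrete, here is a small sketch. The `Middleware` type, `chain`, and `tag` are hypothetical helpers written for this example; the only standard-library pieces are `http.Handler` and `http.HandlerFunc`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler and returns a new handler.
type Middleware func(http.Handler) http.Handler

// chain applies middlewares so that the first one listed runs outermost.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a middleware that appends name to a response header before
// calling the next handler, which makes the call order visible.
func tag(name string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Add("X-Trace", name)
			next.ServeHTTP(w, r)
		})
	}
}

// traceOrder sends one in-memory request and reports the order in which the
// middlewares ran.
func traceOrder() []string {
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "done")
	})
	h := chain(final, tag("logging"), tag("auth"))

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Header().Values("X-Trace")
}

func main() {
	fmt.Println(traceOrder()) // prints "[logging auth]"
}
```

Because every layer has the same `http.Handler` shape, adding or removing a concern is a one-line change where the chain is built.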

### Internal Behavior That Matters

You do not need to memorize the internals, but you should know the operational consequences:

- the server can handle many requests concurrently
- handler code must therefore be safe under concurrency
- request bodies and response writers are tied to request lifetime
- request contexts are canceled when the client's connection closes, when the request is canceled (HTTP/2), or when the handler returns

That last point is critical. Context is not decoration. It is how Go propagates lifecycle control through the call stack.
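
A minimal sketch of what respecting that cancellation looks like in code follows. In a handler the context would come from `r.Context()`; here a manually canceled context stands in for a client that disconnects. `slowLookup` and `lookupWithDisconnect` are hypothetical helpers, and the millisecond values are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowLookup stands in for a database or downstream call. It finishes after d
// unless ctx is canceled first.
func slowLookup(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// lookupWithDisconnect simulates a client that goes away after afterMs while
// the lookup would need workMs to finish.
func lookupWithDisconnect(workMs, afterMs int) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	timer := time.AfterFunc(time.Duration(afterMs)*time.Millisecond, cancel)
	defer timer.Stop()
	return slowLookup(ctx, time.Duration(workMs)*time.Millisecond)
}

func main() {
	err := lookupWithDisconnect(2000, 20)
	fmt.Println(errors.Is(err, context.Canceled)) // prints "true"
}
```

The lookup returns after roughly 20 ms instead of 2 s, which is the resource saving that context propagation buys.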

## Building an Idiomatic HTTP Service

Here is a small but production-minded shape for a service:

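```go
package main

import (
	"context"
	"encoding/json"
	"log/slog"
	"net/http"
)

// User, writeJSON, logging, and recover are referenced but never defined in
// this guide. The versions here are minimal assumed implementations.
type User struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

// UserStore is the storage boundary the handlers depend on.
type UserStore interface {
	Create(ctx context.Context, user User) error
}

type App struct {
	logger *slog.Logger
	store  UserStore
}

func NewApp(logger *slog.Logger, store UserStore) *App {
	return &App{logger: logger, store: store}
}

func (a *App) routes() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", a.handleHealth)
	mux.HandleFunc("/users", a.handleCreateUser)
	return a.logging(a.recover(mux))
}

func (a *App) handleHealth(w http.ResponseWriter, r *http.Request) {
	writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
}

func (a *App) handleCreateUser(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}

	var req struct {
		Name  string `json:"name"`
		Email string `json:"email"`
	}

	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid json", http.StatusBadRequest)
		return
	}

	user := User{Name: req.Name, Email: req.Email}

	// The request context travels down so the store can stop early on cancel.
	if err := a.store.Create(r.Context(), user); err != nil {
		a.logger.Error("create user", "err", err)
		http.Error(w, "internal server error", http.StatusInternalServerError)
		return
	}

	writeJSON(w, http.StatusCreated, user)
}

// writeJSON (assumed helper) encodes v as the JSON response body.
func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(v)
}

// logging (assumed middleware) records each request before handling it.
func (a *App) logging(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		a.logger.Info("request", "method", r.Method, "path", r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

// recover (assumed middleware) turns a handler panic into a 500 response.
func (a *App) recover(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				a.logger.Error("panic", "value", v)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}
```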

### What This Example Demonstrates

- constructor-based dependency injection
- interfaces at the boundary where behavior matters
- handlers that stay thin and pass context downward
- standard library routing and JSON handling
- explicit error handling rather than hidden control flow

This shape scales well. You can add middleware, tracing, validation, auth, metrics, and graceful shutdown without replacing the whole architecture.

## Context in Production Request Paths

### Why Context Is Central

When an HTTP request comes in, the context attached to it should usually flow through all downstream operations.

Example chain:

- HTTP handler receives request
- service layer validates business logic
- repository executes SQL query
- service makes a downstream HTTP call
- background operation respects cancellation if appropriate

If the client disconnects or a request-scoped deadline expires, that context cancellation should stop the rest of the work.

### Practical Rules

- pass `ctx` explicitly as the first parameter
- use `NewRequestWithContext` for outbound HTTP
- use database APIs that accept context
- never replace request context with `context.Background()` in the middle of request processing unless you are intentionally detaching work
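
As a sketch of the first two rules, the function below takes `ctx` as its first parameter and binds the outbound request to it with `http.NewRequestWithContext`. `fetch` and `demo` are hypothetical helpers; the local `httptest` server and the millisecond values exist only to make the behavior observable.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch performs a GET bound to ctx: if ctx is canceled or its deadline
// passes, the in-flight request is aborted.
func fetch(ctx context.Context, client *http.Client, url string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo calls a local test server that answers after delayMs, using a context
// that expires after budgetMs.
func demo(delayMs, budgetMs int) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(time.Duration(delayMs) * time.Millisecond):
			fmt.Fprint(w, "pong")
		case <-r.Context().Done():
			// The caller gave up; stop working on its behalf.
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(budgetMs)*time.Millisecond)
	defer cancel()
	return fetch(ctx, srv.Client(), srv.URL)
}

func main() {
	fmt.Println(demo(10, 1000)) // fast downstream: prints "pong <nil>"
	_, err := demo(1000, 50)    // slow downstream: the context deadline wins
	fmt.Println(err != nil)     // prints "true"
}
```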

### Timeouts and Deadlines

Timeouts are not just protection against slowness. They are protection against resource exhaustion.

Without timeouts:

- goroutines can pile up waiting on I/O
- file descriptors remain occupied
- request latency can become unbounded
- downstream incidents can cascade back into your service

Good Go services apply timeouts at multiple layers:

- incoming server read and header timeouts
- request-scoped deadlines via context
- outbound client timeouts
- database query timeouts
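
The first two layers in that list might be wired as in the sketch below. The `http.Server` fields are real; `newServer`, `withBudget`, and `hasDeadline` are hypothetical helpers, and every duration is an illustrative placeholder rather than a recommendation.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newServer sets connection-level timeouts on the incoming side.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 2 * time.Second,  // time allowed to read request headers
		ReadTimeout:       5 * time.Second,  // headers plus body
		WriteTimeout:      10 * time.Second, // time allowed to write the response
		IdleTimeout:       60 * time.Second, // keep-alive connections between requests
	}
}

// withBudget gives every request a deadline that downstream calls inherit
// through r.Context().
func withBudget(d time.Duration, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), d)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// hasDeadline reports whether a handler wrapped by withBudget sees a deadline.
func hasDeadline() bool {
	seen := false
	h := withBudget(time.Second, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, seen = r.Context().Deadline()
	}))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return seen
}

func main() {
	srv := newServer(":8080", withBudget(3*time.Second, http.NotFoundHandler()))
	fmt.Println(srv.ReadHeaderTimeout, hasDeadline()) // prints "2s true"
}
```

Outbound client and database timeouts then follow from passing that same request context into `NewRequestWithContext` and context-aware query methods.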

## Graceful Shutdown

### What It Means

Graceful shutdown means the process stops accepting new work, gives in-flight work a chance to finish within a bounded time, and then exits cleanly.

This matters for:

- rolling deployments
- autoscaling events
- node drains in orchestration platforms
- operator-triggered restarts

### Example

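```go
// This snippet continues the previous file. It additionally imports
// "os", "os/signal", "sync", "syscall", and "time".

// memoryStore is an assumed in-memory UserStore; newInMemoryStore is called
// below but is not defined anywhere else in this guide.
type memoryStore struct {
	mu    sync.Mutex
	users []User
}

func newInMemoryStore() *memoryStore { return &memoryStore{} }

func (s *memoryStore) Create(ctx context.Context, user User) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.users = append(s.users, user)
	return nil
}

func run() error {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	app := NewApp(logger, newInMemoryStore())

	srv := &http.Server{
		Addr:              ":8080",
		Handler:           app.routes(),
		ReadHeaderTimeout: 2 * time.Second,
	}

	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Serve in the background so this goroutine can wait for a signal.
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.ListenAndServe() }()

	select {
	case err := <-serveErr:
		return err // the listener failed before any shutdown was requested
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Shutdown makes ListenAndServe return http.ErrServerClosed immediately,
	// so run must block here until in-flight requests have drained.
	return srv.Shutdown(shutdownCtx)
}
```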

### Why It Matters Internally

If you just kill the process abruptly:

- in-flight requests get dropped
- partial writes may occur
- background jobs may stop mid-operation
- queue acknowledgments may be inconsistent

Graceful shutdown is an operational correctness feature, not just polish.

## Outbound Clients, Connection Pools, and Resource Reuse

### HTTP Clients

Creating a new `http.Client` with its own transport per request is usually a mistake. The transport owns the pool of idle connections, so reuse only happens when the same transport is shared across requests.

Why reuse matters:

- avoids needless TCP and TLS setup costs
- improves latency
- reduces load on downstream services

At the same time, the defaults are not enough for every production case: a zero-value `http.Client`, including `http.DefaultClient`, has no overall timeout. You usually want explicit timeouts and transport tuning.
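
A sketch of one shared, explicitly configured client follows. It clones `http.DefaultTransport` so proxy and dialer defaults are kept, then overrides a few fields. `newHTTPClient` and `Downstream` are hypothetical names, the URLs are placeholders, and the numbers are illustrative only.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newHTTPClient builds one client intended to be shared by the whole process.
func newHTTPClient() *http.Client {
	// Clone keeps the standard defaults (proxy, dialer, HTTP/2) intact.
	transport := http.DefaultTransport.(*http.Transport).Clone()
	transport.MaxIdleConnsPerHost = 10
	transport.IdleConnTimeout = 90 * time.Second
	transport.ResponseHeaderTimeout = 5 * time.Second

	return &http.Client{
		Transport: transport,
		Timeout:   10 * time.Second, // upper bound for the whole request, body included
	}
}

// Downstream is a hypothetical API wrapper that holds the shared client
// instead of creating a new one per call.
type Downstream struct {
	client  *http.Client
	baseURL string
}

func main() {
	shared := newHTTPClient()
	users := Downstream{client: shared, baseURL: "https://users.example.internal"}
	billing := Downstream{client: shared, baseURL: "https://billing.example.internal"}
	fmt.Println(users.client == billing.client, shared.Timeout) // prints "true 10s"
}
```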

### Database Pools

Packages like `database/sql` manage connection pools. Your job is to configure them sanely and use context-aware operations.

Important operational knobs include:

- max open connections
- max idle connections
- connection lifetime

These are part of system design, not just code details. Wrong pool settings can overload databases or starve your service.
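
In `database/sql` those three knobs map to `SetMaxOpenConns`, `SetMaxIdleConns`, and `SetConnMaxLifetime`. The sketch below sets them on a pool backed by a stub connector so it runs without a real database; `stubConnector`, `stubDriver`, `configurePool`, and `maxOpen` are hypothetical, and the values are illustrative.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"time"
)

// stubConnector is a placeholder so the example runs without a database.
// In a service you would call sql.Open with a registered driver instead.
type stubConnector struct{}

func (stubConnector) Connect(context.Context) (driver.Conn, error) {
	return nil, errors.New("stub: no database behind this example")
}

func (stubConnector) Driver() driver.Driver { return stubDriver{} }

type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("stub: no database behind this example")
}

// configurePool applies the three knobs. Size them against what the database
// can actually serve, not against these example numbers.
func configurePool(db *sql.DB) {
	db.SetMaxOpenConns(25)                  // hard cap on concurrent connections
	db.SetMaxIdleConns(10)                  // connections kept warm between bursts
	db.SetConnMaxLifetime(30 * time.Minute) // recycle connections periodically
}

// maxOpen returns the configured connection cap as reported by the pool.
func maxOpen() int {
	db := sql.OpenDB(stubConnector{}) // OpenDB does not connect until first use
	defer db.Close()
	configurePool(db)
	return db.Stats().MaxOpenConnections
}

func main() {
	fmt.Println(maxOpen()) // prints "25"
}
```

`db.Stats()` exposes the same pool as live counters, which is a convenient source for the saturation metrics discussed later.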

## Service Architecture Patterns in Go

### Composition Root in `main`

In Go, `main` often acts as the composition root where you:

- load config
- initialize logging
- create clients and stores
- wire dependencies together
- start the server or worker

This keeps wiring visible and avoids magic containers.

### Thin Handlers, Clear Services

A healthy pattern is:

- handlers translate transport concerns
- services handle business logic
- repositories or clients handle I/O boundaries

Do not turn this into rigid architecture theater. The point is clarity, not layers for their own sake.

### Interfaces at Edges

Use interfaces where they help isolate external systems or enable tests. Do not create an interface for every struct just because a pattern from another language told you to.

## Background Workers and Queues

Go is also strong for worker processes.

Common worker responsibilities:

- poll or receive jobs
- decode payloads
- apply business logic
- talk to storage or downstream services
- retry or dead-letter on failure

A production worker often combines:

- bounded concurrency
- context-driven shutdown
- idempotent processing
- metrics and tracing
- retry with backoff
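
The first two items, bounded concurrency and context-driven shutdown, can be sketched as a fixed-size worker pool. `runWorkers` and `processAll` are hypothetical names, and integer jobs on an in-memory channel stand in for messages from a real queue.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers starts n workers that read jobs until the channel is closed or
// ctx is canceled. It returns how many jobs were handled.
func runWorkers(ctx context.Context, n int, jobs <-chan int, handle func(int)) int64 {
	var processed int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // shutdown requested
				case job, ok := <-jobs:
					if !ok {
						return // no more work
					}
					handle(job)
					atomic.AddInt64(&processed, 1)
				}
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&processed)
}

// processAll feeds total jobs through n workers and returns the handled count.
func processAll(total, n int) int64 {
	jobs := make(chan int)
	go func() {
		defer close(jobs)
		for i := 0; i < total; i++ {
			jobs <- i
		}
	}()
	return runWorkers(context.Background(), n, jobs, func(int) {})
}

func main() {
	fmt.Println(processAll(100, 4)) // prints "100"
}
```

At most `n` jobs are in flight at any time, and canceling `ctx` lets every worker exit after its current job.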

### Why Idempotency Matters

Distributed systems retry. That means your worker or API should behave safely when the same logical operation arrives more than once.

Examples:

- charging an order only once
- ignoring duplicate event delivery with a deduplication key
- using upserts or unique constraints to protect state transitions
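
A deduplication key can be sketched with an in-memory map guarded by a mutex. `Ledger` is a hypothetical stand-in for a table with a unique constraint on the idempotency key; a real system would persist the key in the same transaction as the charge.

```go
package main

import (
	"fmt"
	"sync"
)

// Ledger records at most one charge per idempotency key.
type Ledger struct {
	mu      sync.Mutex
	charged map[string]int // idempotency key -> amount in cents
}

func NewLedger() *Ledger {
	return &Ledger{charged: make(map[string]int)}
}

// Charge applies the charge once per key. It reports whether this call
// performed the charge (true) or was a duplicate delivery (false).
func (l *Ledger) Charge(key string, cents int) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, seen := l.charged[key]; seen {
		return false
	}
	l.charged[key] = cents
	return true
}

// Total returns the sum of all applied charges.
func (l *Ledger) Total() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	sum := 0
	for _, c := range l.charged {
		sum += c
	}
	return sum
}

func main() {
	l := NewLedger()
	fmt.Println(l.Charge("order-42", 1999)) // prints "true"
	fmt.Println(l.Charge("order-42", 1999)) // retried delivery: prints "false"
	fmt.Println(l.Total())                  // prints "1999"
}
```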

## Resilience Patterns in Go Services

### Retries

Retries can improve reliability, but they are dangerous when used carelessly.

Use retries when:

- the error is transient
- the operation is safe to retry
- you apply limits and backoff

Do not blindly retry every failure. That can turn a partial outage into a full overload event.
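
Those three conditions translate into a small helper: retry only errors marked transient, cap the attempts, and double the wait each time. `retry`, `errTransient`, and `callsUntilSuccess` are hypothetical names, and jitter is left out to keep the sketch short, although production backoff usually adds it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errTransient marks failures worth retrying; anything else stops immediately.
var errTransient = errors.New("transient failure")

// retry calls op up to attempts times, doubling the wait after each transient
// failure. It stops early on success, on a non-transient error, or when ctx ends.
func retry(ctx context.Context, attempts int, base time.Duration, op func() error) error {
	var err error
	wait := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil || !errors.Is(err, errTransient) {
			return err
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(wait):
			wait *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}

// callsUntilSuccess runs retry against an operation that fails the given
// number of times and then succeeds. It returns the calls made and final error.
func callsUntilSuccess(failures, attempts int) (int, error) {
	calls := 0
	err := retry(context.Background(), attempts, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return errTransient
		}
		return nil
	})
	return calls, err
}

func main() {
	fmt.Println(callsUntilSuccess(2, 5)) // prints "3 <nil>"
	fmt.Println(callsUntilSuccess(9, 3)) // prints "3 transient failure"
}
```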

### Backpressure and Bounded Concurrency

Every service has finite CPU, memory, DB connections, and downstream quota. Good Go systems acknowledge this with:

- worker pool limits
- channel buffer sizing based on real capacity, not guesswork
- request timeouts
- queue sizing and shedding strategies
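
One simple shedding strategy is a semaphore built from a buffered channel: requests that cannot get a slot are rejected immediately instead of queueing without bound. `limit` and `shedWhenFull` are hypothetical helpers, and returning 503 is one possible policy among several.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limit allows at most n requests inside next at once and sheds the rest
// with 503.
func limit(n int, next http.Handler) http.Handler {
	slots := make(chan struct{}, n)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slots <- struct{}{}:
			defer func() { <-slots }()
			next.ServeHTTP(w, r)
		default:
			http.Error(w, "server busy", http.StatusServiceUnavailable)
		}
	})
}

// shedWhenFull holds one request inside a limit(1) handler and reports the
// status code seen by a second request that arrives meanwhile.
func shedWhenFull() int {
	entered := make(chan struct{})
	release := make(chan struct{})
	h := limit(1, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(entered)
		<-release
	}))

	done := make(chan struct{})
	go func() {
		defer close(done)
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	}()
	<-entered // the first request now occupies the only slot

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))

	close(release)
	<-done
	return rec.Code
}

func main() {
	fmt.Println(shedWhenFull()) // prints "503"
}
```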

### Caching

Caches reduce latency and downstream load, but they introduce staleness, invalidation complexity, and memory pressure.

In Go services, a cache may be:

- in-memory with mutex protection
- external like Redis
- layered with local plus remote caching

Choose based on consistency needs and failure modes, not just speed.
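
The in-memory, mutex-protected option can be sketched as a small TTL cache. `TTLCache` is hypothetical and deliberately incomplete: it has no size bound and no background eviction, both of which matter for the memory pressure mentioned above. The clock is injected so expiry can be shown without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value   string
	expires time.Time
}

// TTLCache is a minimal in-process cache guarded by a read-write mutex.
type TTLCache struct {
	mu    sync.RWMutex
	ttl   time.Duration
	now   func() time.Time // injectable clock
	items map[string]entry
}

func NewTTLCache(ttl time.Duration, now func() time.Time) *TTLCache {
	return &TTLCache{ttl: ttl, now: now, items: make(map[string]entry)}
}

// Set stores value under key with an expiry of now plus the TTL.
func (c *TTLCache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expires: c.now().Add(c.ttl)}
}

// Get returns the value and true only if the key exists and has not expired.
func (c *TTLCache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.items[key]
	if !ok || !c.now().Before(e.expires) {
		return "", false
	}
	return e.value, true
}

// demo stores a value, reads it, advances a fake clock past the TTL, and
// reads again. It returns whether each read was a hit.
func demo() (bool, bool) {
	clock := time.Unix(0, 0)
	c := NewTTLCache(time.Minute, func() time.Time { return clock })
	c.Set("user:1", "Priya")
	_, fresh := c.Get("user:1")
	clock = clock.Add(2 * time.Minute)
	_, stale := c.Get("user:1")
	return fresh, stale
}

func main() {
	fmt.Println(demo()) // prints "true false"
}
```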

## Observability: Systems Need to Explain Themselves

### Logging

Structured logs are easier to query and correlate than ad hoc strings. Go's `log/slog` is a good default in modern code.
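
As a brief sketch of why structured output is easier to query, the program below logs one record with `slog`'s JSON handler and then decodes it back into fields, roughly as a log pipeline would. `logLine` is a hypothetical helper, and the attribute names are arbitrary.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
	"os"
)

// logLine writes one structured record to an in-memory buffer and returns the
// decoded JSON fields.
func logLine() (map[string]any, error) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info("user created", "user_id", 42, "source", "api")

	var fields map[string]any
	err := json.Unmarshal(buf.Bytes(), &fields)
	return fields, err
}

func main() {
	// In a service the handler usually writes to stdout or stderr.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logger.Info("user created", "user_id", 42, "source", "api")

	fields, _ := logLine()
	fmt.Println(fields["msg"], fields["user_id"]) // prints "user created 42"
}
```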

### Metrics

Metrics help answer:

- how many requests or jobs are happening
- how often errors occur
- how long operations take
- whether queues, pools, or workers are saturating

### Tracing

Tracing becomes valuable once a request crosses multiple services. Go's context propagation model fits tracing naturally because trace metadata can move alongside request lifecycle.

### Profiling

When a service is slow or memory-hungry, use profiling rather than guesswork. Go's pprof ecosystem is one of the language's strongest practical advantages.

## A Realistic Service Architecture Example

```mermaid
graph TD
    A[Client] --> B[API Gateway]
    B --> C[Go API Service]
    C --> D[Auth Middleware]
    D --> E[Business Service]
    E --> F[(Postgres)]
    E --> G[(Redis Cache)]
    E --> H[Message Broker]
    H --> I[Go Worker Service]
    I --> F
    C --> J[Observability Stack]
    I --> J
```

Why Go fits this architecture well:

- API and worker components can share libraries and tooling
- binaries are easy to containerize
- concurrency model fits request handling and job processing
- context propagation helps with cancellation and tracing

## Real-World Usage Patterns

### JSON API Service

Go is widely used for services that accept JSON, validate input, call storage or other APIs, and return typed responses.

### Internal Platform Components

Controllers, schedulers, reconcilers, and long-running agents are natural Go workloads because they need networking, concurrency, and operational predictability.

### Data and Event Processing

Consumers and workers benefit from Go's lightweight concurrency and straightforward deployment model.

## Common Mistakes and Misconceptions

### Mistake: Starting with a Framework Instead of Understanding `net/http`

Frameworks can help, but you should first understand the handler model underneath them.

### Mistake: Ignoring Timeouts

Untimed network calls are operational liabilities.

### Mistake: Creating New Clients Per Request

That defeats connection reuse and often harms performance badly.

### Mistake: Letting Handlers Contain All Business Logic

This makes testing harder and transport concerns bleed into domain behavior.

### Mistake: Launching Background Goroutines Without Shutdown Strategy

Every long-lived goroutine in a service should have a lifecycle story.

### Mistake: Overabstracting Everything into Interfaces

Use interfaces deliberately at boundaries, not as decoration.

### Mistake: Forgetting That Handlers Run Concurrently

Shared state in a server must be synchronized properly.

## Summary

Production Go system design is about making the whole service lifecycle explicit:

- request entry through `net/http`
- context propagation through each downstream operation
- bounded concurrency and sensible resource reuse
- graceful shutdown during deploys and failures
- clear package and dependency boundaries
- observability and performance measurement built into the operational model

The most important mindset shift is this: idiomatic Go systems are not built around hidden magic. They are built around visible control flow, explicit dependencies, clear boundaries, and operationally honest concurrency.

That is exactly why Go remains such a strong language for backend and distributed systems engineering.
@@ -0,0 +1,608 @@
# File 1: Java Fundamentals

Java is often introduced as "write once, run anywhere," but that slogan only makes sense after you understand what Java is trying to optimize for: predictable behavior, strong tooling, portable runtime execution, and maintainability in medium-to-large systems. If you have seen Python or JavaScript first, Java can feel more explicit and more verbose. That is intentional. Java asks you to declare structure early so the compiler, IDE, runtime, and teammates can reason about your code reliably.

This file builds the base mental model you need before object-oriented design or concurrency starts to make sense. The goal is not just to memorize syntax. The goal is to understand what happens from source code to running program, why types matter, how control flow shapes logic, and how Java code is organized in real projects.

## Why Java Still Matters

Java has been around for decades, but it stays relevant because it solves real engineering problems well:

- teams need code that is easy to refactor safely
- backend systems need stable performance under load
- large organizations need strong tooling, dependency management, and observability support
- the JVM ecosystem provides mature libraries for networking, concurrency, persistence, messaging, and security

In production, Java commonly appears in:

- backend APIs handling payments, inventory, identity, notifications, and reporting
- internal enterprise systems with long maintenance lifetimes
- streaming and data-processing systems
- Android applications, historically; modern Android development now leans heavily on Kotlin
- high-throughput services built with frameworks such as Spring Boot, Micronaut, Quarkus, or Dropwizard

Java is not popular because it is the shortest language to write. It is popular because it creates a strong balance between developer productivity, runtime performance, and operational predictability.

## The Java Ecosystem: JDK, JRE, and JVM

One of the first sources of confusion for beginners is that people casually say "install Java" when they actually mean different pieces of the platform.

### Intuition

Think of Java like a small software factory:

- you write source code
- a compiler turns it into bytecode
- a runtime executes that bytecode
- development tools help you debug, package, test, and ship the application

The three terms below refer to different layers in that process.

### JDK

The JDK, or Java Development Kit, is what you install when you want to build Java programs. It includes:

- the compiler `javac`
- the Java launcher `java`
- debugging and inspection tools like `jdb`, `jstack`, `jmap`, `jcmd`
- the runtime needed to execute programs
- standard libraries used by your code

If you are writing, compiling, packaging, or debugging Java, the JDK is your full toolkit.

### JRE

The JRE, or Java Runtime Environment, historically referred to the runtime needed to execute already-compiled Java applications. In older explanations, the distinction was:

- JDK = build and run
- JRE = run only

In modern Java distributions, the JRE is less emphasized as a separate installable concept. Still, the term matters because many articles and interviews use it. Conceptually, it means the runtime layer rather than the full developer toolkit.

### JVM

The JVM, or Java Virtual Machine, is the engine that actually runs Java bytecode. It handles:

- loading classes into memory
- verifying bytecode safety
- interpreting or JIT-compiling code into machine instructions
- memory allocation and garbage collection
- thread scheduling integration with the operating system

The JVM is the main reason Java is portable. Your source code is compiled into bytecode, and that bytecode can run on any machine with a compatible JVM.

### Runtime Architecture Diagram

```mermaid
flowchart LR
    A[.java source files] --> B[javac compiler]
    B --> C[.class bytecode]
    C --> D[Class Loader]
    D --> E[JVM Runtime]
    E --> F[Interpreter]
    E --> G[JIT Compiler]
    G --> H[Native Machine Code]
    E --> I[Garbage Collector]
    E --> J[Heap and Thread Stacks]
```

### How It Works Internally

When you run a Java program, the JVM does not immediately convert every method into optimized native machine code. That would make startup too expensive. Instead, it usually starts by interpreting bytecode and watching which methods are "hot," meaning frequently executed. Hot code paths are then compiled by the Just-In-Time compiler into optimized native instructions.

That means Java has two important performance characteristics:

- startup can be slower than a small native binary because the runtime is initializing and warming up
- long-running applications can become very fast because the JVM gathers execution data and optimizes real usage patterns

This is why Java is a strong fit for backend services that run continuously for hours or days.

### Common Misconceptions

- "Java is purely interpreted." Not true. Java is compiled to bytecode, then interpreted and JIT-compiled at runtime.
- "Java is slow because it runs in a virtual machine." That is outdated thinking. For many long-running server workloads, modern JVM performance is excellent.
- "Installing Java means installing only the JVM." In practice, developers usually install a full JDK.

## A Minimal Java Program

```java
public class HelloApplication {
    public static void main(String[] args) {
        System.out.println("Hello, Java");
    }
}
```

### What Each Part Means

- `public class HelloApplication`: defines a class named `HelloApplication`
- `public static void main(String[] args)`: the standard entry point for a standalone Java application
- `String[] args`: command-line arguments passed to the program
- `System.out.println(...)`: writes text to standard output

### Why Java Starts Here

Java treats execution as behavior that belongs inside a type. That is why even a simple program lives inside a class. In more advanced Java, you will also see records, enums, interfaces, and frameworks that manage object creation for you, but this class-based entry point is still the core model.

### Production Relevance

Real services often start with more than one line in `main`, but the high-level idea is similar:

- bootstrap configuration
- create or start the application container
- connect logging and monitoring
- register shutdown hooks
- start serving traffic

Spring Boot, for example, still starts with a `main` method, even though most of the heavy lifting is hidden behind the framework.

## Variables and Data Types

Types are one of Java's biggest strengths. A type tells both the compiler and the reader what kind of data a variable can hold and what operations are valid on that data.

### Intuition

You can think of types as contracts for data. They prevent a large class of bugs early. If a method expects an `int`, you cannot silently pass a string. If a variable holds a `Customer`, the IDE can show what methods and fields are available.

In fast-growing codebases, this matters a lot. The compiler becomes a guardrail against accidental misuse.

### Primitive Types

Primitive types store simple values directly.

| Type | Typical Use | Example |
| --- | --- | --- |
| `byte` | raw binary data, very small integers | `byte flags = 1;` |
| `short` | niche memory-sensitive numeric storage | `short yearOffset = 12;` |
| `int` | default integer arithmetic | `int itemCount = 42;` |
| `long` | large counters, timestamps, IDs | `long epochMillis = System.currentTimeMillis();` |
| `float` | specialized numeric work | `float ratio = 0.25f;` |
| `double` | default floating-point math | `double price = 19.99;` |
| `char` | individual UTF-16 code unit | `char grade = 'A';` |
| `boolean` | true/false logic | `boolean active = true;` |

### Reference Types

Reference types store a reference to an object, not the object value inline. Examples include:

- `String`
- arrays like `int[]`
- user-defined classes like `Order`
- interfaces like `List<String>`
- wrapper types like `Integer`

```java
int quantity = 5;
String status = "SHIPPED";
Order order = new Order();
```

Here:

- `quantity` directly holds a primitive value
- `status` holds a reference to a `String` object
- `order` holds a reference to an `Order` object

### Stack and Heap Mental Model

```mermaid
flowchart TD
    A[Method invocation] --> B[Stack frame created]
    B --> C[Local primitives stored directly]
    B --> D[Local references stored]
    D --> E[Objects allocated on heap]
    E --> F[Shared until unreachable]
    F --> G[Garbage collector reclaims memory]
```

This diagram is simplified, but it is good enough for a beginner mental model:

- local method state lives in a stack frame
- objects usually live on the heap
- references point to heap objects
- when objects are no longer reachable, garbage collection can reclaim them

### Why This Matters in Practice

If you accidentally keep references to objects longer than needed, memory usage grows. That is how some memory leaks happen in Java: not by forgetting `free()` like in C, but by retaining references in caches, static fields, thread locals, listeners, or long-lived collections.

### `var` and Type Inference

Modern Java supports local variable type inference with `var`.

```java
var customerName = "Priya";
var orderCount = 12;
```

This does not make Java dynamically typed. The compiler still infers a concrete static type.

Use `var` when the type is obvious from the right-hand side. Avoid it when it hides important meaning.

Bad:

```java
var result = service.execute(config, payload, strategy);
```

Better:

```java
PaymentResponse result = service.execute(config, payload, strategy);
```

### Common Pitfalls

- confusing primitives with wrapper types like `int` versus `Integer`
- assuming `null` is valid for primitives; it is not
- using floating-point types for money; production systems usually prefer `BigDecimal`
- overusing `var` until the code becomes harder to read

## Operators and Expressions

Operators are how you transform and compare values. Beginners often see them as simple symbols, but production bugs frequently come from operator misuse, especially around comparison, short-circuiting, and numeric behavior.

### Arithmetic Operators

Java supports familiar arithmetic operations:

```java
int total = 10 + 5;
int remaining = 10 - 3;
int doubled = 4 * 2;
int quotient = 9 / 2;
int remainder = 9 % 2;
```

Notice that `9 / 2` with integers becomes `4`, not `4.5`. This matters in billing, pagination, and rate calculations.

### Comparison Operators

```java
int threshold = 100;
boolean high = threshold > 80;
boolean exact = threshold == 100;
boolean different = threshold != 50;
```

For primitives, `==` compares values. For objects, `==` compares whether two references point to the same object.

That distinction is critical.

```java
String a = new String("ok");
String b = new String("ok");

System.out.println(a == b); // false
System.out.println(a.equals(b)); // true
```

In domain code, use `equals()` for logical comparison unless you explicitly care about identity.
|
||||
|
||||
### Logical Operators
|
||||
|
||||
```java
|
||||
boolean hasToken = true;
|
||||
boolean notExpired = true;
|
||||
|
||||
boolean allowed = hasToken && notExpired;
|
||||
boolean fallback = hasToken || notExpired;
|
||||
boolean blocked = !hasToken;
|
||||
```
|
||||
|
||||
`&&` and `||` short-circuit. That means Java may skip evaluating the right side if the left side already determines the result.
|
||||
|
||||
This is useful and common:
|
||||
|
||||
```java
|
||||
if (user != null && user.isActive()) {
|
||||
// safe because the second check only runs when user is not null
|
||||
}
|
||||
```
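
To make the skipped evaluation visible, the sketch below counts how many operands actually run. The `check` helper and the `calls` counter exist only for this demonstration; the single `&` operator is shown as the non-short-circuit contrast.

```java
class ShortCircuitDemo {

    static int calls = 0;

    // Records that it was evaluated, then returns the given value.
    static boolean check(boolean value) {
        calls++;
        return value;
    }

    public static void main(String[] args) {
        calls = 0;
        boolean a = check(false) && check(true);     // right side is skipped
        System.out.println(a + ", calls=" + calls);  // false, calls=1

        calls = 0;
        boolean b = check(false) & check(true);      // both sides are evaluated
        System.out.println(b + ", calls=" + calls);  // false, calls=2
    }
}
```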
|
||||
|
||||
### Assignment and Increment Operators
|
||||
|
||||
```java
|
||||
int attempts = 0;
|
||||
attempts += 1;
|
||||
attempts++;
|
||||
```
|
||||
|
||||
Be careful with pre-increment and post-increment in larger expressions. They are legal, but often make code harder to reason about. Clear code is usually better than clever code.
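
A small sketch of the difference, using a hypothetical `postThenPre` helper that returns the intermediate values so they can be inspected:

```java
class IncrementDemo {

    // Returns {value seen by post-increment, value seen by pre-increment, final value}.
    static int[] postThenPre(int start) {
        int i = start;
        int post = i++;  // post receives the old value, then i is incremented
        int pre = ++i;   // i is incremented first, then pre receives the new value
        return new int[] {post, pre, i};
    }

    public static void main(String[] args) {
        int[] r = postThenPre(5);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 5 7 7
    }
}
```

Keeping each increment on its own line, as `attempts++;` does above, avoids having to think about this ordering at all.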
|
||||
|
||||
### Ternary Operator
|
||||
|
||||
```java
|
||||
String label = isAdmin ? "admin" : "user";
|
||||
```
|
||||
|
||||
This is useful for simple conditional value selection. If the expression becomes nested or long, switch to an `if` block for readability.
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- integer division truncates toward zero
|
||||
- `==` on objects compares identity, not logical equality
|
||||
- combining too many operators in one expression makes debugging harder
|
||||
- side effects inside expressions reduce clarity
|
||||
|
||||
## Control Flow: If, Loops, and Switch
|
||||
|
||||
Control flow determines how your program makes decisions and repeats work. In backend systems, this often appears in validation logic, retry loops, data transformation, request routing, and state handling.
|
||||
|
||||
## Conditional Logic with `if`
|
||||
|
||||
```java
|
||||
if (amount <= 0) {
|
||||
throw new IllegalArgumentException("Amount must be positive");
|
||||
} else if (amount > 10_000) {
|
||||
System.out.println("Manual review required");
|
||||
} else {
|
||||
System.out.println("Payment accepted");
|
||||
}
|
||||
```
|
||||
|
||||
### Why `if` Matters
|
||||
|
||||
In production code, many failures happen because conditions are incomplete, ordered incorrectly, or too hard to understand. Good conditionals are:
|
||||
|
||||
- explicit
|
||||
- easy for any reader to understand
|
||||
- narrow in purpose
|
||||
|
||||
For example, authentication code often checks conditions in a deliberate order:
|
||||
|
||||
1. is the token present?
|
||||
2. is the token well-formed?
|
||||
3. is it expired?
|
||||
4. does it map to a valid user?
|
||||
|
||||
That order affects both correctness and security.
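
The ordered checks above can be sketched as a chain of guard clauses. Everything here is an assumption made for illustration: the `"<userId>:<expiryEpochSeconds>"` token format, the `Result` enum, and the in-memory set of known users are not a real authentication scheme.

```java
import java.util.Set;

class TokenCheckDemo {

    enum Result { MISSING, MALFORMED, EXPIRED, UNKNOWN_USER, OK }

    // Hypothetical token format, for illustration only: "<userId>:<expiryEpochSeconds>".
    static Result validate(String token, long nowEpochSeconds, Set<String> knownUsers) {
        // 1. Is the token present?
        if (token == null || token.isBlank()) {
            return Result.MISSING;
        }
        // 2. Is it well-formed? Parsing happens only after the presence check.
        String[] parts = token.split(":");
        if (parts.length != 2) {
            return Result.MALFORMED;
        }
        long expiry;
        try {
            expiry = Long.parseLong(parts[1]);
        } catch (NumberFormatException e) {
            return Result.MALFORMED;
        }
        // 3. Is it expired?
        if (expiry <= nowEpochSeconds) {
            return Result.EXPIRED;
        }
        // 4. Does it map to a valid user? The lookup runs last, only for live tokens.
        if (!knownUsers.contains(parts[0])) {
            return Result.UNKNOWN_USER;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        System.out.println(validate("alice:300", 200, Set.of("alice"))); // OK
        System.out.println(validate("alice:100", 200, Set.of("alice"))); // EXPIRED
    }
}
```

Each early return narrows what the later checks need to consider, which is what makes the order matter.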
|
||||
|
||||
## Loops
|
||||
|
||||
### `for` loop
|
||||
|
||||
```java
|
||||
for (int index = 0; index < orders.size(); index++) {
|
||||
System.out.println(orders.get(index));
|
||||
}
|
||||
```
|
||||
|
||||
Useful when you need the index or precise control.
|
||||
|
||||
### Enhanced `for` loop
|
||||
|
||||
```java
|
||||
for (String email : emails) {
|
||||
System.out.println(email);
|
||||
}
|
||||
```
|
||||
|
||||
Useful when you only need each element.
|
||||
|
||||
### `while` loop
|
||||
|
||||
```java
|
||||
while (!queue.isEmpty()) {
|
||||
process(queue.poll());
|
||||
}
|
||||
```
|
||||
|
||||
Useful when the stopping condition is not naturally tied to an index.
|
||||
|
||||
### Real-World Use Cases
|
||||
|
||||
- polling a message queue until empty in a batch job
|
||||
- retrying an external API call with a maximum attempt count
|
||||
- iterating through rows returned from a database or file
|
||||
- walking through a list of events to build an aggregate state
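
As one example from the list above, a bounded retry loop might look like the following. This is a minimal sketch: the `retry` helper is hypothetical, it retries only on `RuntimeException`, and it omits the delay or backoff that a real client would normally add between attempts.

```java
import java.util.function.Supplier;

class RetryDemo {

    // Calls the action until it succeeds or maxAttempts is reached.
    // Rethrows the last failure if every attempt fails.
    static <T> T retry(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated flaky call: fails twice, then succeeds.
        String result = retry(3, () -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("temporarily unavailable");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

The explicit upper bound is what keeps this from turning into the infinite loop described in the pitfalls.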
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- off-by-one errors in indexed loops
|
||||
- modifying a collection incorrectly while iterating
|
||||
- infinite loops caused by state that never changes
|
||||
- putting expensive work inside nested loops without noticing the performance cost
|
||||
|
||||
If a loop processes millions of records, simple mistakes become production incidents.
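
The "modifying a collection while iterating" pitfall deserves a concrete shape. Calling `list.remove(...)` inside an enhanced `for` loop over an `ArrayList` typically fails with `ConcurrentModificationException`. The sketch below shows two standard alternatives; the method names and sample data are made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SafeRemovalDemo {

    // Removes blank entries while iterating, using the iterator's own remove().
    static void dropBlankWithIterator(List<String> emails) {
        Iterator<String> it = emails.iterator();
        while (it.hasNext()) {
            if (it.next().isBlank()) {
                it.remove();
            }
        }
    }

    // Same result with Collection.removeIf (Java 8+).
    static void dropBlankWithRemoveIf(List<String> emails) {
        emails.removeIf(String::isBlank);
    }

    public static void main(String[] args) {
        List<String> emails = new ArrayList<>(List.of("a@example.com", " ", "b@example.com"));
        dropBlankWithIterator(emails);
        System.out.println(emails); // [a@example.com, b@example.com]
    }
}
```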
|
||||
|
||||
## `switch`
|
||||
|
||||
Java's `switch` is useful when one variable determines multiple branches.
|
||||
|
||||
```java
|
||||
String region = "EU";
|
||||
|
||||
switch (region) {
|
||||
case "US":
|
||||
System.out.println("Use US tax rules");
|
||||
break;
|
||||
case "EU":
|
||||
System.out.println("Use EU VAT rules");
|
||||
break;
|
||||
default:
|
||||
System.out.println("Use global defaults");
|
||||
}
|
||||
```
|
||||
|
||||
Java 14 and later also support switch expressions, which are often cleaner because each case yields a value and the arrow form has no fall-through.
|
||||
|
||||
```java
|
||||
String action = switch (region) {
|
||||
case "US" -> "usd-pricing";
|
||||
case "EU" -> "eur-pricing";
|
||||
default -> "default-pricing";
|
||||
};
|
||||
```
|
||||
|
||||
### Why This Matters in Production
|
||||
|
||||
`switch` logic often appears in:
|
||||
|
||||
- request routing based on event type
|
||||
- status-to-action mapping
|
||||
- feature behavior by region or tenant type
|
||||
- serialization and parsing logic
|
||||
|
||||
The main design risk is letting a `switch` grow so large that it becomes a code smell. At some point, polymorphism or strategy objects become a better fit.
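
One common way to retire an oversized `switch` is a lookup table of small strategy objects. The sketch below is only one possible shape: the event type names and handler behavior are invented, and it assumes the event type is never `null` (immutable maps created with `Map.of` reject `null` keys in lookups).

```java
import java.util.Map;
import java.util.function.Function;

class EventRouterDemo {

    // Hypothetical event types and handlers, registered in one place.
    private static final Map<String, Function<String, String>> HANDLERS = Map.of(
            "order.created", payload -> "reserve stock for " + payload,
            "order.paid", payload -> "ship " + payload,
            "order.cancelled", payload -> "release stock for " + payload);

    // Looks up the handler instead of branching; unknown types use a default handler.
    static String route(String eventType, String payload) {
        return HANDLERS
                .getOrDefault(eventType, p -> "ignore " + p)
                .apply(payload);
    }

    public static void main(String[] args) {
        System.out.println(route("order.paid", "o-1"));   // ship o-1
        System.out.println(route("order.lost", "o-2"));   // ignore o-2
    }
}
```

Adding a new event type then means adding one map entry rather than editing a growing block of cases.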
|
||||
|
||||
## Methods and Basic Program Structure
|
||||
|
||||
Methods are where Java code becomes reusable, testable, and readable.
|
||||
|
||||
### Intuition
|
||||
|
||||
A method should represent one coherent action. When methods are too large, names stop helping and reasoning becomes difficult. Good Java code often feels readable because each method does one thing at the right level of abstraction.
|
||||
|
||||
### Example
|
||||
|
||||
```java
|
||||
public class InvoiceService {
|
||||
|
||||
public double calculateTotal(double subtotal, double taxRate) {
|
||||
validateSubtotal(subtotal);
|
||||
return subtotal + (subtotal * taxRate);
|
||||
}
|
||||
|
||||
private void validateSubtotal(double subtotal) {
|
||||
if (subtotal < 0) {
|
||||
throw new IllegalArgumentException("Subtotal cannot be negative");
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Why This Structure Is Better
|
||||
|
||||
Instead of mixing validation, math, logging, and persistence in one method, this design separates concerns. That gives you:
|
||||
|
||||
- clearer intent
|
||||
- easier unit testing
|
||||
- simpler debugging
|
||||
- less duplication when validation rules are reused
|
||||
|
||||
### Method Signature Elements
|
||||
|
||||
- access modifier: `public`, `private`, and so on
|
||||
- return type: `double`, `String`, custom types, or `void`
|
||||
- method name: should describe behavior
|
||||
- parameters: inputs required by the method
|
||||
- thrown exceptions: checked exceptions must be declared with a `throws` clause; declaring unchecked ones is optional
|
||||
|
||||
### Parameter Passing in Java
|
||||
|
||||
Java is always pass-by-value.
|
||||
|
||||
For primitives, the value itself is copied.
|
||||
|
||||
For objects, the reference is copied. That copied reference still points to the same underlying object.
|
||||
|
||||
```java
|
||||
class Counter {
|
||||
int value;
|
||||
}
|
||||
|
||||
void increment(Counter counter) {
|
||||
counter.value++;
|
||||
}
|
||||
```
|
||||
|
||||
If you call `increment(counter)`, the original object changes because both the caller and callee reference the same object. But if the method assigns `counter = new Counter();`, the caller does not start pointing to that new object.
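
Both behaviors can be observed side by side. The sketch restates `Counter` and `increment` as nested members so it compiles on its own; the `replace` method is added here purely to show the reassignment case.

```java
class PassByValueDemo {

    static class Counter {
        int value;
    }

    // Mutates the object that both caller and callee reference.
    static void increment(Counter counter) {
        counter.value++;
    }

    // Reassigns only the local copy of the reference; the caller is unaffected.
    static void replace(Counter counter) {
        counter = new Counter();
        counter.value = 100;
    }

    public static void main(String[] args) {
        Counter counter = new Counter();

        increment(counter);
        System.out.println(counter.value); // 1

        replace(counter);
        System.out.println(counter.value); // 1 (still the original object)
    }
}
```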
|
||||
|
||||
This is a very common interview topic and a very common source of beginner confusion.
|
||||
|
||||
### Common Method Design Mistakes
|
||||
|
||||
- methods that do too many things
|
||||
- boolean flags like `process(true, false, true)` that make call sites unreadable
|
||||
- hidden side effects such as mutating shared state unexpectedly
|
||||
- returning `null` casually when an empty collection, exception, or `Optional` would be clearer
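
For the last point, here is a hedged sketch of the two usual alternatives to returning `null`. The in-memory map stands in for a repository, and all names and data are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

class LookupDemo {

    // Hypothetical in-memory data standing in for a repository.
    private static final Map<String, List<String>> ORDERS_BY_USER =
            Map.of("alice", List.of("o-1", "o-2"));

    // "No orders" is an ordinary result, so return an empty list rather than null.
    static List<String> ordersFor(String userId) {
        return ORDERS_BY_USER.getOrDefault(userId, List.of());
    }

    // A single value that may be absent is modelled with Optional.
    static Optional<String> latestOrderFor(String userId) {
        List<String> orders = ordersFor(userId);
        return orders.isEmpty()
                ? Optional.empty()
                : Optional.of(orders.get(orders.size() - 1));
    }

    public static void main(String[] args) {
        System.out.println(ordersFor("bob"));                      // []
        System.out.println(latestOrderFor("alice").orElse("none")); // o-2
        System.out.println(latestOrderFor("bob").orElse("none"));   // none
    }
}
```

Callers can loop over the empty list or chain on the `Optional` without a null check at every call site.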
|
||||
|
||||
## Packages, Imports, and Source Organization
|
||||
|
||||
As Java codebases grow, organization matters as much as syntax.
|
||||
|
||||
### Packages
|
||||
|
||||
Packages group related classes.
|
||||
|
||||
```java
|
||||
package com.example.orders;
|
||||
```
|
||||
|
||||
In real systems, packages usually reflect bounded areas of responsibility, such as:
|
||||
|
||||
- `com.company.auth`
|
||||
- `com.company.billing`
|
||||
- `com.company.notifications`
|
||||
|
||||
### Imports
|
||||
|
||||
Imports let you refer to classes without writing their fully qualified names everywhere.
|
||||
|
||||
```java
|
||||
import java.time.Instant;
|
||||
import java.util.List;
|
||||
```
|
||||
|
||||
### Real-World Convention
|
||||
|
||||
In production services, code is usually organized around features or layers. A simple service might contain packages for:
|
||||
|
||||
- controllers or API endpoints
|
||||
- services or business logic
|
||||
- repositories or persistence
|
||||
- domain models
|
||||
- configuration
|
||||
|
||||
That structure becomes much easier to reason about once you already understand classes, methods, and object interactions.
|
||||
|
||||
## Build Tools and the Everyday Java Workflow
|
||||
|
||||
You can compile Java with `javac` directly, but professional projects almost always use a build tool.
|
||||
|
||||
### Maven and Gradle
|
||||
|
||||
These tools handle:
|
||||
|
||||
- dependency downloads
|
||||
- compilation
|
||||
- test execution
|
||||
- packaging into JARs
|
||||
- plugin-based workflows like code generation or static analysis
|
||||
|
||||
### Why This Matters
|
||||
|
||||
In a real backend service, you rarely work with one file in isolation. Your build tool defines how the application is assembled, which library versions are used, and what happens in CI before code is deployed.
|
||||
|
||||
If Java syntax is the grammar, Maven or Gradle is part of the operating system of day-to-day Java development.
|
||||
|
||||
## How Fundamentals Show Up in Production Systems
|
||||
|
||||
The concepts in this file are not classroom-only topics.
|
||||
|
||||
- Strong typing makes API contracts and refactors safer.
|
||||
- JVM portability allows the same service artifact to run in local development, test, and cloud environments.
|
||||
- Control flow drives validation, retries, workflows, and business rules.
|
||||
- Method design strongly affects maintainability and testability.
|
||||
- Correct type choice matters for money, timestamps, concurrency, and memory usage.
|
||||
|
||||
For example, an order-processing service may:
|
||||
|
||||
1. start inside a JVM launched by a container image
|
||||
2. receive a request into a controller method
|
||||
3. validate fields with `if` conditions
|
||||
4. loop through order items
|
||||
5. call helper methods to calculate totals and taxes
|
||||
6. store domain objects on the heap while processing the request
|
||||
|
||||
That is not advanced Java. That is Java fundamentals in production.
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
- Java's real strength is not brevity. It is explicit structure, strong tooling, and predictable runtime behavior.
|
||||
- The JDK is your development toolkit, the JVM is the runtime engine, and bytecode portability is central to how Java works.
|
||||
- Primitive types and reference types behave differently, and understanding heap versus stack is essential for reasoning about memory.
|
||||
- Operators and control flow look simple, but many real bugs come from incorrect comparisons, truncation, bad branching, or poorly designed loops.
|
||||
- Methods are the unit of reusable behavior, and clean method design has direct impact on readability, testing, and long-term maintenance.
|
||||
- These fundamentals appear everywhere in real backend systems, so learning them deeply pays off before moving on to OOP, collections, concurrency, and frameworks.
|
||||
@@ -0,0 +1,554 @@
|
||||
# File 2: Object-Oriented Programming in Java
|
||||
|
||||
Object-oriented programming is where Java starts to feel like Java. The syntax from the fundamentals file matters, but OOP is what gives Java its shape in real applications. Backend systems are rarely a loose collection of standalone functions. They are usually modeled as collaborating objects: controllers, services, repositories, domain entities, validators, clients, schedulers, and configuration components.
|
||||
|
||||
If you only memorize the four textbook pillars of OOP, you will miss the point. The real question is this: how do we organize code so behavior stays understandable as the system grows? This file answers that question from a Java engineer's perspective.
|
||||
|
||||
## Classes and Objects
|
||||
|
||||
### Intuition
|
||||
|
||||
A class is a blueprint for state and behavior. An object is a concrete instance created from that blueprint.
|
||||
|
||||
That definition is technically correct, but the useful mental model is stronger:
|
||||
|
||||
- a class defines a unit of responsibility
|
||||
- an object is one live participant carrying out that responsibility at runtime
|
||||
|
||||
For example, in an order-processing service:
|
||||
|
||||
- `Order` might represent business data
|
||||
- `OrderService` might represent business behavior
|
||||
- `EmailNotifier` might represent an integration with an external system
|
||||
|
||||
### Example
|
||||
|
||||
```java
|
||||
public class BankAccount {
|
||||
private final String accountNumber;
|
||||
private double balance;
|
||||
|
||||
public BankAccount(String accountNumber, double openingBalance) {
|
||||
this.accountNumber = accountNumber;
|
||||
this.balance = openingBalance;
|
||||
}
|
||||
|
||||
public void deposit(double amount) {
|
||||
if (amount <= 0) {
|
||||
throw new IllegalArgumentException("Amount must be positive");
|
||||
}
|
||||
balance += amount;
|
||||
}
|
||||
|
||||
public void withdraw(double amount) {
|
||||
if (amount > 0 && amount <= balance) {
|
||||
balance -= amount;
|
||||
}
|
||||
}
|
||||
|
||||
public double getBalance() {
|
||||
return balance;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Why This Matters
|
||||
|
||||
Notice that the account is not just a bag of fields. It contains rules. That is one of the key ideas in OOP: data and behavior should often live together so the invariants are easier to protect.
|
||||
|
||||
In real systems, that helps prevent invalid state changes from being scattered across the codebase.
|
||||
|
||||
## Encapsulation
|
||||
|
||||
Encapsulation means hiding internal state and exposing only the operations that make sense.
|
||||
|
||||
### Why It Exists
|
||||
|
||||
If every part of the system can freely mutate every field, the object stops being trustworthy. You can no longer tell where a bug came from because the object's state could have been changed anywhere.
|
||||
|
||||
Encapsulation creates a boundary:
|
||||
|
||||
- internal representation is private
|
||||
- changes happen through controlled methods
|
||||
- validation and invariants live close to the data they protect
|
||||
|
||||
### Example
|
||||
|
||||
Bad design:
|
||||
|
||||
```java
|
||||
public class UserProfile {
|
||||
public String email;
|
||||
public boolean verified;
|
||||
}
|
||||
```
|
||||
|
||||
Any caller can put this object into inconsistent states.
|
||||
|
||||
Better design:
|
||||
|
||||
```java
|
||||
public class UserProfile {
|
||||
private String email;
|
||||
private boolean verified;
|
||||
|
||||
public UserProfile(String email) {
|
||||
changeEmail(email);
|
||||
this.verified = false;
|
||||
}
|
||||
|
||||
public void changeEmail(String newEmail) {
|
||||
if (newEmail == null || newEmail.isBlank()) {
|
||||
throw new IllegalArgumentException("Email is required");
|
||||
}
|
||||
this.email = newEmail;
|
||||
this.verified = false;
|
||||
}
|
||||
|
||||
public void markVerified() {
|
||||
this.verified = true;
|
||||
}
|
||||
|
||||
public String getEmail() {
|
||||
return email;
|
||||
}
|
||||
|
||||
public boolean isVerified() {
|
||||
return verified;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Real-World Use Case
|
||||
|
||||
Encapsulation is critical in domains such as:
|
||||
|
||||
- payments, where balance or ledger state must obey strict rules
|
||||
- authentication, where token and session transitions must remain valid
|
||||
- inventory, where stock cannot drop below zero because random code mutated a field directly
|
||||
|
||||
### Common Misconception
|
||||
|
||||
Encapsulation is not the same as "always write getters and setters for every field." Blindly exposing setters often destroys the protection that encapsulation is supposed to provide. A class should expose behavior, not just field mutation.
|
||||
|
||||
## Inheritance
|
||||
|
||||
Inheritance lets one class reuse and extend behavior from another class.
|
||||
|
||||
```java
|
||||
public class Employee {
|
||||
protected final String name;
|
||||
|
||||
public Employee(String name) {
|
||||
this.name = name;
|
||||
}
|
||||
|
||||
public void work() {
|
||||
System.out.println(name + " is working");
|
||||
}
|
||||
}
|
||||
|
||||
public class Manager extends Employee {
|
||||
public Manager(String name) {
|
||||
super(name);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void work() {
|
||||
System.out.println(name + " is planning and reviewing work");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Intuition
|
||||
|
||||
Inheritance represents an "is-a" relationship.
|
||||
|
||||
- a `Manager` is an `Employee`
|
||||
- a `Circle` is a `Shape`
|
||||
|
||||
The subclass gets access to behavior from the parent and may override methods.
|
||||
|
||||
### How It Works Internally
|
||||
|
||||
When you call an overridden method on a parent reference, Java uses dynamic dispatch at runtime to choose the actual implementation based on the real object type, not just the variable type.
|
||||
|
||||
```java
|
||||
Employee employee = new Manager("Asha");
|
||||
employee.work();
|
||||
```
|
||||
|
||||
This calls `Manager.work()`, not `Employee.work()`.
|
||||
|
||||
### Where Inheritance Helps
|
||||
|
||||
Inheritance is useful when subclasses truly share a stable conceptual model and substantial reusable behavior. Typical examples:
|
||||
|
||||
- framework base classes
|
||||
- exception hierarchies
|
||||
- UI component hierarchies in some systems
|
||||
- domain modeling where a clear subtype relationship exists
|
||||
|
||||
### Why Engineers Are Careful with It
|
||||
|
||||
Inheritance creates tight coupling. Once a subclass depends on parent implementation details, changes become harder. Deep inheritance trees are often brittle and confusing.
|
||||
|
||||
That is why experienced Java engineers often prefer composition unless the inheritance relationship is very natural.
|
||||
|
||||
## Polymorphism
|
||||
|
||||
Polymorphism means code can work against a general type while runtime behavior changes based on the actual object implementation.
|
||||
|
||||
### Example
|
||||
|
||||
```java
|
||||
public interface PaymentProcessor {
|
||||
PaymentResult process(PaymentRequest request);
|
||||
}
|
||||
|
||||
public class CardPaymentProcessor implements PaymentProcessor {
|
||||
@Override
|
||||
public PaymentResult process(PaymentRequest request) {
|
||||
return new PaymentResult("card-authorized");
|
||||
}
|
||||
}
|
||||
|
||||
public class WalletPaymentProcessor implements PaymentProcessor {
|
||||
@Override
|
||||
public PaymentResult process(PaymentRequest request) {
|
||||
return new PaymentResult("wallet-authorized");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Now calling code can depend on `PaymentProcessor` without caring which concrete implementation is active.
|
||||
|
||||
```java
|
||||
public class CheckoutService {
|
||||
private final PaymentProcessor paymentProcessor;
|
||||
|
||||
public CheckoutService(PaymentProcessor paymentProcessor) {
|
||||
this.paymentProcessor = paymentProcessor;
|
||||
}
|
||||
|
||||
public PaymentResult checkout(PaymentRequest request) {
|
||||
return paymentProcessor.process(request);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Production Relevance
|
||||
|
||||
This pattern shows up everywhere:
|
||||
|
||||
- different payment providers behind one contract
|
||||
- multiple notification channels behind one interface
|
||||
- storage implementations for local disk, S3, or blob storage
|
||||
- authentication strategies for API key, OAuth, or internal service token
|
||||
|
||||
Polymorphism reduces branching and keeps behavior extensible.
|
||||
|
||||
## Abstraction
|
||||
|
||||
Abstraction means exposing the essential behavior while hiding unnecessary implementation detail.
|
||||
|
||||
### Intuition
|
||||
|
||||
When you drive a car, you use the steering wheel and pedals, not the combustion process details. In code, abstraction gives callers the same kind of high-level interface.
|
||||
|
||||
### Abstract Class Example
|
||||
|
||||
```java
|
||||
public abstract class ReportGenerator {
|
||||
|
||||
public final String generate() {
|
||||
fetchData();
|
||||
transformData();
|
||||
return renderOutput();
|
||||
}
|
||||
|
||||
protected abstract void fetchData();
|
||||
protected abstract void transformData();
|
||||
protected abstract String renderOutput();
|
||||
}
|
||||
```
|
||||
|
||||
This is useful when you want a common workflow with customizable steps.
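
To see the workflow run end to end, the sketch below restates `ReportGenerator` as a nested class so it compiles on its own and adds a hypothetical `SalesReport` subclass with hard-coded sample data.

```java
class TemplateMethodDemo {

    // Restated from the section above so the sketch is self-contained.
    abstract static class ReportGenerator {
        public final String generate() {
            fetchData();
            transformData();
            return renderOutput();
        }

        protected abstract void fetchData();
        protected abstract void transformData();
        protected abstract String renderOutput();
    }

    // Hypothetical subclass: it customises the steps, not the order they run in.
    static class SalesReport extends ReportGenerator {
        private int[] rawSales;
        private int total;

        @Override
        protected void fetchData() {
            rawSales = new int[] {120, 80, 50}; // sample data for illustration
        }

        @Override
        protected void transformData() {
            total = 0;
            for (int sale : rawSales) {
                total += sale;
            }
        }

        @Override
        protected String renderOutput() {
            return "Total sales: " + total;
        }
    }

    public static void main(String[] args) {
        System.out.println(new SalesReport().generate()); // Total sales: 250
    }
}
```

Because `generate()` is `final`, subclasses cannot reorder or skip steps, which is the point of the pattern.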
|
||||
|
||||
### Real-World Use Case
|
||||
|
||||
Frameworks use abstraction heavily. A framework might define the overall lifecycle and let your code implement only the project-specific pieces.
|
||||
|
||||
That idea appears in:
|
||||
|
||||
- servlet filters
|
||||
- batch job processing
|
||||
- event handlers
|
||||
- test frameworks
|
||||
- template methods inside enterprise applications
|
||||
|
||||
## Interfaces vs Abstract Classes
|
||||
|
||||
This is a classic Java question, but it is more valuable when framed as a design decision rather than a quiz answer.
|
||||
|
||||
### Interface
|
||||
|
||||
An interface defines a contract. It says what behavior a type must support.
|
||||
|
||||
```java
|
||||
public interface Notifier {
|
||||
void send(String destination, String message);
|
||||
}
|
||||
```
|
||||
|
||||
### Abstract Class
|
||||
|
||||
An abstract class is a partial implementation. It is useful when related types share both behavior and state.
|
||||
|
||||
```java
|
||||
public abstract class BaseNotifier {
|
||||
protected final AuditLogger auditLogger;
|
||||
|
||||
protected BaseNotifier(AuditLogger auditLogger) {
|
||||
this.auditLogger = auditLogger;
|
||||
}
|
||||
|
||||
protected void recordAudit(String destination) {
|
||||
auditLogger.record("sent to " + destination);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Practical Comparison
|
||||
|
||||
| Topic | Interface | Abstract Class |
| --- | --- | --- |
| Main purpose | Define capability or role | Share base behavior and state |
| Multiple inheritance | A class can implement many | A class can extend only one |
| Instance fields | No regular instance state | Yes |
| Constructor | No | Yes |
| Best use | Flexible contracts | Common base workflow or shared implementation |
|
||||
|
||||
### Design Intuition
|
||||
|
||||
Use an interface when you want callers to depend on behavior and stay decoupled from implementation details.
|
||||
|
||||
Use an abstract class when multiple related types genuinely share implementation and that shared base is stable enough to matter.
|
||||
|
||||
In many backend codebases, interfaces are common for boundaries and abstract classes are used more selectively.
|
||||
|
||||
## Composition vs Inheritance
|
||||
|
||||
This is one of the most important design instincts in Java.
|
||||
|
||||
### Composition
|
||||
|
||||
Composition means building objects by combining other objects.
|
||||
|
||||
```java
|
||||
public class OrderService {
|
||||
private final PaymentProcessor paymentProcessor;
|
||||
private final InventoryService inventoryService;
|
||||
private final NotificationService notificationService;
|
||||
|
||||
public OrderService(
|
||||
PaymentProcessor paymentProcessor,
|
||||
InventoryService inventoryService,
|
||||
NotificationService notificationService) {
|
||||
this.paymentProcessor = paymentProcessor;
|
||||
this.inventoryService = inventoryService;
|
||||
this.notificationService = notificationService;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The service is not inheriting behavior. It is coordinating collaborators.
|
||||
|
||||
### Why Composition Usually Wins
|
||||
|
||||
Composition is often more flexible because:
|
||||
|
||||
- dependencies are explicit
|
||||
- behavior can be swapped at runtime or in tests
|
||||
- classes stay focused on one role
|
||||
- you avoid fragile parent-child coupling
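
The "swapped at runtime or in tests" point can be illustrated with a deliberately simplified sketch. The types below use plain strings instead of the `PaymentRequest` and `PaymentResult` classes from earlier, and the lambda-based fakes are assumptions made to keep the example short.

```java
class CompositionDemo {

    // Simplified contract: takes an order id, returns a status string.
    interface PaymentProcessor {
        String process(String orderId);
    }

    static class CheckoutService {
        private final PaymentProcessor paymentProcessor;

        CheckoutService(PaymentProcessor paymentProcessor) {
            this.paymentProcessor = paymentProcessor;
        }

        String checkout(String orderId) {
            return paymentProcessor.process(orderId);
        }
    }

    public static void main(String[] args) {
        // "Production" wiring with one implementation.
        CheckoutService production = new CheckoutService(id -> "card-authorized:" + id);
        System.out.println(production.checkout("o-1")); // card-authorized:o-1

        // Test wiring with a fake; CheckoutService itself is unchanged.
        CheckoutService underTest = new CheckoutService(id -> "fake-ok");
        System.out.println(underTest.checkout("o-1"));  // fake-ok
    }
}
```

With inheritance, replacing the payment behavior would mean subclassing the service; with composition it is just a different constructor argument.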
|
||||
|
||||
### Relationship Diagram
|
||||
|
||||
```mermaid
|
||||
classDiagram
|
||||
class OrderService {
|
||||
-PaymentProcessor paymentProcessor
|
||||
-InventoryService inventoryService
|
||||
-NotificationService notificationService
|
||||
+placeOrder()
|
||||
}
|
||||
|
||||
class PaymentProcessor {
|
||||
<<interface>>
|
||||
+process(request)
|
||||
}
|
||||
|
||||
class CardPaymentProcessor {
|
||||
+process(request)
|
||||
}
|
||||
|
||||
class InventoryService {
|
||||
+reserve(items)
|
||||
}
|
||||
|
||||
class NotificationService {
|
||||
+sendConfirmation(order)
|
||||
}
|
||||
|
||||
OrderService --> PaymentProcessor : uses
|
||||
OrderService --> InventoryService : uses
|
||||
OrderService --> NotificationService : uses
|
||||
CardPaymentProcessor ..|> PaymentProcessor
|
||||
```
|
||||
|
||||
### Inheritance Diagram
|
||||
|
||||
```mermaid
|
||||
classDiagram
|
||||
class Employee {
|
||||
+work()
|
||||
}
|
||||
class Manager {
|
||||
+work()
|
||||
+approveBudget()
|
||||
}
|
||||
class Engineer {
|
||||
+work()
|
||||
+buildFeature()
|
||||
}
|
||||
|
||||
Employee <|-- Manager
|
||||
Employee <|-- Engineer
|
||||
```
|
||||
|
||||
### Rule of Thumb
|
||||
|
||||
If the relationship is naturally "has-a," use composition.
|
||||
|
||||
If the relationship is naturally "is-a" and the base behavior is stable and meaningful, inheritance may be fine.
|
||||
|
||||
### Common Pitfall
|
||||
|
||||
Do not use inheritance just to avoid writing a few lines twice. Duplication can often be refactored in other ways. Wrong inheritance is more expensive than small duplication.
|
||||
|
||||
## Object Construction and Lifecycle
|
||||
|
||||
In Java, objects are created with `new`, by factories, or by frameworks that manage dependency injection.
|
||||
|
||||
### Basic Construction
|
||||
|
||||
```java
|
||||
UserProfile profile = new UserProfile("user@example.com");
|
||||
```
|
||||
|
||||
### Why Construction Matters in Real Systems
|
||||
|
||||
As applications grow, object creation itself becomes design-relevant.
|
||||
|
||||
- should every caller build the object manually?
|
||||
- should construction enforce invariants?
|
||||
- should expensive resources be shared?
|
||||
- should a framework control creation for testability and lifecycle management?
|
||||
|
||||
This is why design patterns such as builder, factory, and dependency injection appear so often in Java engineering.
|
||||
|
||||
## `this`, `super`, and Method Overriding
|
||||
|
||||
### `this`
|
||||
|
||||
`this` refers to the current object.
|
||||
|
||||
```java
|
||||
public class Customer {
|
||||
private final String name;
|
||||
|
||||
public Customer(String name) {
|
||||
this.name = name;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### `super`
|
||||
|
||||
`super` refers to the parent class portion of the object.
|
||||
|
||||
```java
|
||||
public class PremiumCustomer extends Customer {
|
||||
public PremiumCustomer(String name) {
|
||||
super(name);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Overriding Rules Worth Remembering
|
||||
|
||||
- method name and parameters must match
|
||||
- return type must be the same type or, for reference types, a subtype (covariant return)
|
||||
- access level cannot be more restrictive
|
||||
- static methods are hidden, not overridden
|
||||
- constructors are not overridden
|
||||
|
||||
These rules matter because subtle mistakes can lead to code that compiles but does not behave the way you expected.
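
The static-method rule is the one most likely to surprise. In the sketch below, with class and method names invented for the example, the instance method is chosen by the runtime type while the static method is chosen by the declared type of the variable.

```java
class HidingDemo {

    static class Parent {
        static String kind() {
            return "parent-static";
        }

        String name() {
            return "parent";
        }
    }

    static class Child extends Parent {
        // Hides Parent.kind(); it does not override it.
        static String kind() {
            return "child-static";
        }

        @Override
        String name() {
            return "child";
        }
    }

    public static void main(String[] args) {
        Parent ref = new Child();
        System.out.println(ref.name()); // child         (dynamic dispatch on the runtime type)
        System.out.println(ref.kind()); // parent-static (resolved from the declared type)
    }
}
```

Calling a static method through an instance reference compiles, usually with a warning, but it is clearer to write `Parent.kind()` or `Child.kind()` so the target is explicit.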
|
||||
|
||||
## OOP Design Intuition
|
||||
|
||||
Strong OOP design is less about showing every language feature and more about assigning responsibility well.
|
||||
|
||||
### Good Design Questions
|
||||
|
||||
- which object should own this rule?
|
||||
- which class is holding too much knowledge?
|
||||
- is this behavior better represented as a method, a collaborator, or a separate strategy?
|
||||
- are we modeling domain concepts clearly, or just dumping data into DTO-like objects?
|
||||
|
||||
### Signs of Weak OOP Design
|
||||
|
||||
- giant service classes that do everything
|
||||
- data classes with no behavior and business logic scattered elsewhere
|
||||
- excessive inheritance used for convenience rather than meaning
|
||||
- public mutable fields everywhere
|
||||
- interfaces added for every class even when no abstraction need exists
|
||||
|
||||
### Production Perspective
|
||||
|
||||
In real backend systems, the best OOP often looks boring in a good way:
|
||||
|
||||
- classes have narrow responsibilities
|
||||
- collaborators are injected explicitly
|
||||
- domain rules live near the domain model
|
||||
- interfaces define important boundaries
|
||||
- composition is preferred over deep inheritance
|
||||
|
||||
That style scales better than clever class hierarchies.
|
||||
|
||||
## How OOP Shows Up in a Backend Request
|
||||
|
||||
When an HTTP request arrives in a Java service, several objects usually collaborate:
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[HTTP Request] --> B[Controller]
|
||||
B --> C[Service]
|
||||
C --> D[Repository]
|
||||
C --> E[PaymentProcessor]
|
||||
C --> F[NotificationService]
|
||||
D --> G[(Database)]
|
||||
E --> H[External Provider]
|
||||
```
|
||||
|
||||
Each object has a distinct role. That separation is one of the main benefits of object-oriented design in Java. The controller handles transport concerns, the service handles business workflow, the repository handles persistence, and integrations are wrapped behind focused components.
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
- A class should represent a coherent responsibility, and an object is a runtime participant carrying out that responsibility.
|
||||
- Encapsulation protects invariants and is more valuable than blindly generating getters and setters.
|
||||
- Inheritance enables reuse and polymorphism, but it also creates tight coupling and should be used carefully.
|
||||
- Interfaces are excellent for contracts and system boundaries, while abstract classes are better for shared base behavior and state.
|
||||
- Composition is usually more flexible and maintainable than inheritance in production Java systems.
|
||||
- Good OOP design is about clear responsibility and collaboration, not about maximizing the number of classes or language features used.
|
||||
@@ -0,0 +1,447 @@
|
||||
# File 3: Core Java APIs
|
||||
|
||||
The Java language by itself is only part of the story. What makes Java productive in real work is the standard library. Once you move beyond syntax, you spend large amounts of time with `String`, collections, generics, exceptions, file and stream APIs, time APIs, and utility classes.
|
||||
|
||||
This file focuses on the APIs that show up constantly in real applications. These topics are foundational because they influence how data moves through the system, how failures are handled, and how code remains readable and type-safe at scale.
|
||||
|
||||
## Strings and Immutability
|
||||
|
||||
### Intuition
|
||||
|
||||
A `String` is one of the most frequently used objects in Java. It represents text such as user names, JSON fields, URLs, tokens, file paths, SQL fragments, log messages, and HTTP headers.
|
||||
|
||||
Java makes `String` immutable, which means once a string object is created, its contents do not change.
|
||||
|
||||
### Why Immutability Matters
|
||||
|
||||
Immutability makes strings:
|
||||
|
||||
- safer to share across threads
|
||||
- easier to reason about
|
||||
- more predictable as map keys and cache keys
|
||||
- less error-prone in APIs that pass text through multiple layers
|
||||
|
||||
### Example
|
||||
|
||||
```java
|
||||
String original = "order";
|
||||
String upper = original.toUpperCase();
|
||||
|
||||
System.out.println(original); // order
|
||||
System.out.println(upper); // ORDER
|
||||
```
|
||||
|
||||
`original` is unchanged. `toUpperCase()` returns a new `String`.
|
||||
|
||||
### Internal Detail That Matters
|
||||
|
||||
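Java interns string literals and other compile-time constant strings, so identical literals refer to the same pooled object, while strings built at runtime (for example with `new String(...)` or from parsed input) are normally separate objects. That pooling is an optimization detail, not something business logic should rely on.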
|
||||
|
||||
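This is why using `==` for strings is dangerous: `==` compares object references, not contents, so it can appear to work for literals and then fail for values created at runtime. Use `equals()` to compare contents.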
|
||||
|
||||
```java
|
||||
String a = new String("paid");
|
||||
String b = new String("paid");
|
||||
|
||||
System.out.println(a == b); // false
|
||||
System.out.println(a.equals(b)); // true
|
||||
```
|
||||
|
||||
### Performance Concern
|
||||
|
||||
Repeated string concatenation inside loops can create many temporary objects, because each `+=` builds a new `String` and copies the accumulated contents.
|
||||
|
||||
```java
|
||||
String result = "";
|
||||
for (String item : items) {
|
||||
result += item;
|
||||
}
|
||||
```
|
||||
|
||||
Prefer `StringBuilder` for heavy concatenation.
|
||||
|
||||
```java
|
||||
StringBuilder builder = new StringBuilder();
|
||||
for (String item : items) {
|
||||
builder.append(item);
|
||||
}
|
||||
String result = builder.toString();
|
||||
```
|
||||
|
||||
### Real-World Use Cases
|
||||
|
||||
- building log lines or audit entries
|
||||
- parsing and validating incoming request fields
|
||||
- constructing file paths, cache keys, and search queries
|
||||
- formatting API responses or templated messages
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- comparing strings with `==`
|
||||
- forgetting that strings are immutable and assuming a method call mutates them
|
||||
- building large strings inefficiently in loops
|
||||
|
||||
## Collections Framework
|
||||
|
||||
Collections are how Java code represents groups of related elements. In real systems, almost every workflow involves a collection somewhere: request headers, order items, search results, cache entries, event batches, deduplicated IDs, grouped metrics, or lookup tables.
|
||||
|
||||
### Collections Structure Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Collection Framework] --> B[List]
|
||||
A --> C[Set]
|
||||
A --> D[Map]
|
||||
B --> E[ArrayList]
|
||||
B --> F[LinkedList]
|
||||
C --> G[HashSet]
|
||||
C --> H[TreeSet]
|
||||
D --> I[HashMap]
|
||||
D --> J[TreeMap]
|
||||
```
|
||||
|
||||
### `List`
|
||||
|
||||
A `List` is ordered and allows duplicates.
|
||||
|
||||
```java
|
||||
List<String> steps = new ArrayList<>();
|
||||
steps.add("validate");
|
||||
steps.add("charge");
|
||||
steps.add("notify");
|
||||
```
|
||||
|
||||
Use a list when order matters or when repeated values are acceptable.
|
||||
|
||||
#### `ArrayList`
|
||||
|
||||
This is the default `List` choice in most business applications. It provides fast indexed access and good general-purpose performance.
|
||||
|
||||
Typical use cases:
|
||||
|
||||
- ordered API results
|
||||
- batched work items
|
||||
- DTO lists in controller responses
|
||||
|
||||
#### `LinkedList`
|
||||
|
||||
`LinkedList` is used far less often in typical backend applications than beginners expect. It can be helpful in queue-like patterns, but `ArrayList` and dedicated queue implementations such as `ArrayDeque` are usually more practical.
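For queue-like work, a dedicated queue type is often the simpler choice. The sketch below uses `ArrayDeque` from `java.util` as a first-in, first-out queue. The class name `PendingJobs` and the job names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PendingJobs {
    // Drains the queue in first-in, first-out order and returns the processing order.
    static List<String> drain(Deque<String> queue) {
        List<String> processed = new ArrayList<>();
        String job;
        while ((job = queue.pollFirst()) != null) { // pollFirst returns null when the deque is empty
            processed.add(job);
        }
        return processed;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>();
        queue.addLast("resize-image");
        queue.addLast("send-email");
        System.out.println(drain(queue)); // [resize-image, send-email]
    }
}
```

Note that `ArrayDeque` does not accept `null` elements, which is what makes the `null` check on `pollFirst()` unambiguous.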
|
||||
|
||||
### `Set`
|
||||
|
||||
A `Set` stores unique elements.
|
||||
|
||||
```java
|
||||
Set<String> roles = new HashSet<>();
|
||||
roles.add("ADMIN");
|
||||
roles.add("USER");
|
||||
roles.add("ADMIN");
|
||||
```
|
||||
|
||||
The duplicate `ADMIN` is ignored.
|
||||
|
||||
Use a set when uniqueness matters.
|
||||
|
||||
Real-world examples:
|
||||
|
||||
- deduplicating email addresses
|
||||
- tracking processed event IDs for idempotency checks
|
||||
- storing permission names
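The idempotency case can be sketched with the boolean that `Set.add` returns. `EventDeduplicator` and `markProcessed` are hypothetical names, and this version is single-threaded only.

```java
import java.util.HashSet;
import java.util.Set;

class EventDeduplicator {
    private final Set<String> processedEventIds = new HashSet<>();

    // Returns true only the first time a given event ID is seen.
    boolean markProcessed(String eventId) {
        return processedEventIds.add(eventId); // add returns false if the element was already present
    }

    public static void main(String[] args) {
        EventDeduplicator dedup = new EventDeduplicator();
        System.out.println(dedup.markProcessed("evt-1")); // true
        System.out.println(dedup.markProcessed("evt-1")); // false
    }
}
```

If several threads share the deduplicator, a concurrent set such as the one returned by `ConcurrentHashMap.newKeySet()` would be a more appropriate backing structure.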
|
||||
|
||||
### `Map`
|
||||
|
||||
A `Map` stores key-value pairs.
|
||||
|
||||
```java
|
||||
Map<String, Integer> inventory = new HashMap<>();
|
||||
inventory.put("laptop", 15);
|
||||
inventory.put("mouse", 48);
|
||||
```
|
||||
|
||||
Maps are everywhere in backend systems.
|
||||
|
||||
Typical examples:
|
||||
|
||||
- ID to entity lookup
|
||||
- configuration name to value
|
||||
- region to tax rule
|
||||
- user ID to session metadata
|
||||
|
||||
### Choosing Common Implementations
|
||||
|
||||
| Type | Default Choice | Why |
|
||||
| --- | --- | --- |
|
||||
| `List` | `ArrayList` | Fast iteration and index-based reads |
|
||||
| `Set` | `HashSet` | Fast uniqueness checks |
|
||||
| `Map` | `HashMap` | Fast key-based lookup |
|
||||
|
||||
Choose tree-based variants like `TreeSet` or `TreeMap` when you explicitly need sorted ordering.
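As a small illustration of the ordering difference, the sketch below inserts the same keys into a `TreeMap` and a `LinkedHashMap` and lists the keys in iteration order. `MapOrdering` and `keysInOrder` are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrdering {
    // Inserts three keys in a fixed order and returns the map's iteration order.
    static List<String> keysInOrder(Map<String, Integer> map) {
        map.put("mouse", 48);
        map.put("laptop", 15);
        map.put("keyboard", 30);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // TreeMap iterates in sorted key order.
        System.out.println(keysInOrder(new TreeMap<>()));       // [keyboard, laptop, mouse]
        // LinkedHashMap iterates in insertion order.
        System.out.println(keysInOrder(new LinkedHashMap<>())); // [mouse, laptop, keyboard]
    }
}
```

A plain `HashMap` makes no ordering promise at all, so its output for the same calls should not be relied on.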
|
||||
|
||||
### Practical Usage Pattern
|
||||
|
||||
Suppose you receive a batch of orders and want to group them by customer.
|
||||
|
||||
```java
|
||||
Map<String, List<Order>> ordersByCustomer = new HashMap<>();
|
||||
|
||||
for (Order order : orders) {
|
||||
ordersByCustomer
|
||||
.computeIfAbsent(order.getCustomerId(), key -> new ArrayList<>())
|
||||
.add(order);
|
||||
}
|
||||
```
|
||||
|
||||
This is a very common pattern in reporting, aggregation, and batch-processing pipelines.
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- assuming `HashMap` preserves insertion order; it does not (use `LinkedHashMap` when you need that)
|
||||
- forgetting that mutable keys can break map behavior
|
||||
- choosing a list when you really need uniqueness or fast lookup
|
||||
- modifying a collection while iterating in unsupported ways, causing `ConcurrentModificationException`
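For the last pitfall, two supported ways to remove elements during traversal are `Collection.removeIf` and `Iterator.remove`. The sketch below shows both. The class and method names are illustrative, and `String::isBlank` assumes Java 11 or later.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {
    // Preferred: let the collection handle removal internally.
    static void dropBlank(List<String> items) {
        items.removeIf(String::isBlank);
    }

    // Equivalent with an explicit iterator; Iterator.remove is the supported way
    // to delete the element most recently returned by next().
    static void dropBlankWithIterator(List<String> items) {
        Iterator<String> iterator = items.iterator();
        while (iterator.hasNext()) {
            if (iterator.next().isBlank()) {
                iterator.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(List.of("a", "", "b", " "));
        dropBlank(items);
        System.out.println(items); // [a, b]
    }
}
```

Calling `items.remove(...)` directly inside a for-each loop over the same list is the pattern that typically triggers `ConcurrentModificationException`.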
|
||||
|
||||
## Generics
|
||||
|
||||
Generics allow Java to express type-safe reusable data structures and APIs.
|
||||
|
||||
### Intuition
|
||||
|
||||
Without generics, collections would hold plain `Object` references, forcing you to cast manually and discover type mistakes only at runtime. Generics move those checks to compile time.
|
||||
|
||||
```java
|
||||
List<String> names = new ArrayList<>();
|
||||
names.add("Anita");
|
||||
```
|
||||
|
||||
Now the compiler knows this list is supposed to contain strings.
|
||||
|
||||
### Why This Matters in Real Systems
|
||||
|
||||
Generics make API contracts clearer.
|
||||
|
||||
Examples:
|
||||
|
||||
- `List<Order>` means a list of orders, not arbitrary objects
|
||||
- `Map<String, UserSession>` means string keys mapped to session objects
|
||||
- `ResponseEntity<CustomerDto>` clearly communicates response payload type in a web application
|
||||
|
||||
### Generic Methods
|
||||
|
||||
```java
|
||||
public static <T> T first(List<T> items) {
|
||||
if (items.isEmpty()) {
|
||||
throw new IllegalArgumentException("List cannot be empty");
|
||||
}
|
||||
return items.get(0);
|
||||
}
|
||||
```
|
||||
|
||||
### Wildcards at a High Level
|
||||
|
||||
You will eventually see forms such as `List<? extends Number>` or `List<? super Integer>`. The full details take practice, but the intuition is:
|
||||
|
||||
- `extends` is for reading from a producer
|
||||
- `super` is for writing to a consumer
|
||||
|
||||
This is often summarized as PECS: producer extends, consumer super.
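A minimal sketch of PECS in one method signature: the source list produces values, so it uses `extends`, and the target list consumes them, so it uses `super`. `Pecs` and `copyAll` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class Pecs {
    // source is a producer of T (extends); target is a consumer of T (super).
    static <T> void copyAll(List<? extends T> source, List<? super T> target) {
        for (T item : source) {
            target.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        copyAll(ints, numbers); // a List<Integer> can feed a List<Number>
        System.out.println(numbers); // [1, 2, 3]
    }
}
```

With a plain `List<T>` for both parameters, the call above would not compile, because `List<Integer>` is not a `List<Number>`.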
|
||||
|
||||
### Internal Note
|
||||
|
||||
Java generics use type erasure, meaning generic type information is mostly removed at runtime. This is why you cannot do everything with generics that reified generic systems in some other languages allow.
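One way to see erasure is to compare the runtime classes of two lists with different type arguments. This is only a sketch, and `ErasureDemo` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Compares the runtime classes of two lists, ignoring their compile-time type arguments.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        // Both objects are plain java.util.ArrayList at runtime.
        System.out.println(sameRuntimeClass(names, counts)); // true
    }
}
```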
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- using raw types like `List` instead of `List<String>`
|
||||
- overcomplicating code with advanced wildcards when a simpler API would do
|
||||
- expecting generic type arguments to remain fully available at runtime
|
||||
|
||||
## Exception Handling
|
||||
|
||||
Failure handling is a core part of Java engineering. Good code does not just describe the happy path. It makes failure modes visible and deliberate.
|
||||
|
||||
### The Idea Behind Exceptions
|
||||
|
||||
An exception represents a failure or abnormal condition that interrupts normal execution.
|
||||
|
||||
Some failures are expected operationally, such as missing files or invalid user input. Others represent programming bugs, such as null dereferences or invalid assumptions.
|
||||
|
||||
### Checked vs Unchecked Exceptions
|
||||
|
||||
Java has two broad exception categories.
|
||||
|
||||
#### Checked exceptions
|
||||
|
||||
These are enforced by the compiler. You must catch them or declare them.
|
||||
|
||||
Examples include `IOException` and many other file and I/O related exceptions.
|
||||
|
||||
#### Unchecked exceptions
|
||||
|
||||
These extend `RuntimeException`. The compiler does not force handling. (Subclasses of `Error` are also unchecked, but application code rarely throws or catches them.)
|
||||
|
||||
Examples include:
|
||||
|
||||
- `NullPointerException`
|
||||
- `IllegalArgumentException`
|
||||
- `IllegalStateException`
|
||||
|
||||
### Why the Distinction Exists
|
||||
|
||||
Checked exceptions signal recoverable or expected external failure modes.
|
||||
|
||||
Unchecked exceptions often represent programming mistakes or invalid internal states.
|
||||
|
||||
In real engineering, teams vary on how much they like checked exceptions, but understanding the model is still important.
|
||||
|
||||
### Example
|
||||
|
||||
```java
|
||||
public String readFile(Path path) throws IOException {
|
||||
return Files.readString(path);
|
||||
}
|
||||
```
|
||||
|
||||
The caller must decide whether to handle `IOException` or propagate it.
|
||||
|
||||
### Practical Guidance
|
||||
|
||||
- throw exceptions that communicate intent clearly
|
||||
- do not catch exceptions just to ignore them
|
||||
- wrap low-level exceptions when exposing a cleaner domain-level abstraction
|
||||
- include useful context in logs and messages
|
||||
|
||||
### Real-World Pattern
|
||||
|
||||
Suppose a service calls a third-party payment gateway. The low-level HTTP client might throw transport exceptions, but your business layer may convert those into a domain-specific exception like `PaymentUnavailableException`.
|
||||
|
||||
That makes the rest of the system easier to understand.
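The wrapping pattern described above might look like the following sketch. `GatewayClient`, `PaymentService`, and the constructor shape of `PaymentUnavailableException` are assumptions made for illustration, and `IOException` stands in for whatever transport exception a real HTTP client throws.

```java
import java.io.IOException;

// Domain-level exception; keeps the low-level failure as its cause.
class PaymentUnavailableException extends RuntimeException {
    PaymentUnavailableException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical low-level client that can fail with a transport error.
interface GatewayClient {
    String charge(String orderId) throws IOException;
}

class PaymentService {
    private final GatewayClient client;

    PaymentService(GatewayClient client) {
        this.client = client;
    }

    String pay(String orderId) {
        try {
            return client.charge(orderId);
        } catch (IOException e) {
            // Translate the transport failure and preserve the original cause for debugging.
            throw new PaymentUnavailableException(
                    "Payment gateway unavailable for order " + orderId, e);
        }
    }
}
```

Callers of `pay` then deal with one domain exception, while the stack trace still leads back to the original `IOException` through `getCause()`.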
|
||||
|
||||
### `try`, `catch`, `finally`, and try-with-resources
|
||||
|
||||
```java
|
||||
try (BufferedReader reader = Files.newBufferedReader(path)) {
|
||||
return reader.readLine();
|
||||
} catch (IOException exception) {
|
||||
throw new IllegalStateException("Failed to read config", exception);
|
||||
}
|
||||
```
|
||||
|
||||
The try-with-resources form is important because it closes resources automatically.
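To make the closing behavior visible, the sketch below uses a hypothetical `TrackedResource` that records when it is opened and closed. Resources are closed in reverse declaration order, and they are closed before the `catch` block runs.

```java
import java.util.ArrayList;
import java.util.List;

class ResourceDemo {
    // Hypothetical resource that logs its lifecycle.
    static final class TrackedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        TrackedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource a = new TrackedResource("a", log);
             TrackedResource b = new TrackedResource("b", log)) {
            log.add("work");
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [open a, open b, work, close b, close a, caught]
    }
}
```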
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- catching `Exception` too broadly and hiding real issues
|
||||
- using exceptions for normal control flow
|
||||
- logging and rethrowing the same exception repeatedly, creating noisy duplicate logs
|
||||
- losing the original cause when wrapping exceptions badly
|
||||
|
||||
## Java I/O Basics
|
||||
|
||||
I/O means interacting with systems outside your program's in-memory state.
|
||||
|
||||
Examples include:
|
||||
|
||||
- reading files
|
||||
- writing logs
|
||||
- handling network sockets
|
||||
- streaming data to cloud storage
|
||||
- loading configuration from disk
|
||||
|
||||
### Intuition
|
||||
|
||||
I/O is slower and less predictable than pure in-memory operations because it depends on disks, networks, operating system buffers, remote services, and external state.
|
||||
|
||||
That is why I/O-heavy code needs stronger error handling and performance awareness.
|
||||
|
||||
### Core Modern APIs
|
||||
|
||||
The `java.nio.file` package is the common modern entry point for file work.
|
||||
|
||||
```java
|
||||
Path path = Path.of("config/app.properties");
|
||||
String content = Files.readString(path);
|
||||
Files.writeString(path, content + "\nmode=prod");
|
||||
```
|
||||
|
||||
### Streams of Data
|
||||
|
||||
Buffered APIs are often used for efficient reading and writing.
|
||||
|
||||
```java
|
||||
try (BufferedReader reader = Files.newBufferedReader(path)) {
|
||||
String line;
|
||||
while ((line = reader.readLine()) != null) {
|
||||
System.out.println(line);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Production Use Cases
|
||||
|
||||
- loading service configuration or templates at startup
|
||||
- ingesting CSV or log files in batch jobs
|
||||
- exporting reports to disk or object storage
|
||||
- reading secrets or certificates mounted into a container
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- reading huge files into memory when streaming would be safer
|
||||
- forgetting character encoding concerns, such as reading a file with a different charset than the one it was written in
|
||||
- failing to close resources properly
|
||||
- performing blocking I/O on critical request threads without understanding throughput impact
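The first three pitfalls can be addressed together. The sketch below streams a file line by line with `Files.lines`, names the charset explicitly, and closes the stream with try-with-resources. `LineCounter` and `countNonBlankLines` are illustrative names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class LineCounter {
    // Counts non-blank lines without loading the whole file into memory.
    static long countNonBlankLines(Path path) throws IOException {
        // Files.lines reads lazily; the stream must be closed to release the file handle.
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.isBlank()).count();
        }
    }
}
```

For a small configuration file, `Files.readString` is simpler. The streaming form matters when file size is unbounded, as in log or CSV ingestion.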
|
||||
|
||||
## Practical Usage Patterns That Show Up Everywhere
|
||||
|
||||
### Pattern 1: Deduplication
|
||||
|
||||
Use a `Set` when idempotency or uniqueness matters.
|
||||
|
||||
```java
|
||||
Set<String> processedEventIds = new HashSet<>();
|
||||
```
|
||||
|
||||
### Pattern 2: Lookup by Key
|
||||
|
||||
Use a `Map` for fast retrieval.
|
||||
|
||||
```java
|
||||
Map<Long, Customer> customersById = new HashMap<>();
|
||||
```
|
||||
|
||||
### Pattern 3: Ordered Results
|
||||
|
||||
Use a `List` when sequence matters.
|
||||
|
||||
```java
|
||||
List<Transaction> transactions = new ArrayList<>();
|
||||
```
|
||||
|
||||
### Pattern 4: Safe Text Handling
|
||||
|
||||
Use immutable strings for identifiers, payload fragments, and logs, but switch to `StringBuilder` for heavy concatenation.
|
||||
|
||||
### Pattern 5: Resource Safety
|
||||
|
||||
Use try-with-resources for anything that must be closed, such as readers, streams, sockets, and many database-facing abstractions.
|
||||
|
||||
## How These APIs Show Up in Production Systems
|
||||
|
||||
Consider a batch job that processes order exports:
|
||||
|
||||
1. it reads a file using Java I/O APIs
|
||||
2. it parses each line into objects and stores them in collections
|
||||
3. it uses maps for lookups and sets for deduplication
|
||||
4. it uses strings to validate and normalize text fields
|
||||
5. it throws or wraps exceptions when bad data or file issues occur
|
||||
6. it emits a summary report built safely and efficiently
|
||||
|
||||
This is ordinary Java engineering. It is not glamorous, but a large amount of production reliability depends on doing these basics well.
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
- `String` is immutable, which improves safety and predictability, but you should still be mindful of comparison and concatenation costs.
|
||||
- The collections framework gives you distinct tools for ordering, uniqueness, and lookup, and choosing the right one affects both clarity and performance.
|
||||
- Generics move many type errors to compile time and make library and application code much easier to reason about.
|
||||
- Exception handling is about making failure explicit and meaningful, not just preventing crashes.
|
||||
- Java I/O interacts with slow and failure-prone external systems, so resource handling and error handling matter a lot.
|
||||
- These APIs form the everyday vocabulary of production Java code, especially in backend services, batch jobs, and integration-heavy systems.
|
||||
@@ -0,0 +1,384 @@
|
||||
# File 4: Advanced Java Concepts
|
||||
|
||||
This file covers the topics that make Java powerful in long-running, high-throughput systems: concurrency, the memory model, garbage collection, streams, and performance thinking. These are the areas where beginner knowledge often stops and engineering maturity starts.
|
||||
|
||||
You do not need to become a JVM internals expert on day one, but you do need a solid mental model. Otherwise, concurrency bugs, latency spikes, and memory issues will feel mysterious. The goal here is to remove that mystery.
|
||||
|
||||
## Multithreading and Concurrency
|
||||
|
||||
### Intuition
|
||||
|
||||
A thread is an independent path of execution inside a process. Java supports multiple threads so a program can do more than one thing at the same time or keep making progress while other work waits on I/O.
|
||||
|
||||
In a backend service, threads may handle:
|
||||
|
||||
- incoming HTTP requests
|
||||
- scheduled jobs
|
||||
- asynchronous message processing
|
||||
- background cleanup work
|
||||
- database or network calls coordinated by worker pools
|
||||
|
||||
### Why Concurrency Exists
|
||||
|
||||
Without concurrency, a service that waits on slow operations would waste time doing nothing. With concurrency, other work can proceed while one task waits.
|
||||
|
||||
### Basic Example
|
||||
|
||||
```java
|
||||
Runnable task = () -> System.out.println("Processing in " + Thread.currentThread().getName());
|
||||
Thread worker = new Thread(task);
|
||||
worker.start();
|
||||
```
|
||||
|
||||
This creates and starts a new thread, but in production systems you usually prefer executors and thread pools over manually creating raw threads.
|
||||
|
||||
### Thread Pool Mental Model
|
||||
|
||||
Instead of constantly creating and destroying threads, a thread pool reuses a fixed or managed set of worker threads.
|
||||
|
||||
That reduces overhead and gives you control over concurrency level.
|
||||
|
||||
```java
|
||||
ExecutorService executor = Executors.newFixedThreadPool(4);
|
||||
executor.submit(() -> processOrder(orderId));
|
||||
```
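A slightly fuller sketch of the same idea collects results through `Future` and shuts the pool down afterwards. `OrderBatch` and this version of `processOrder` are hypothetical stand-ins for real order handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderBatch {
    // Hypothetical stand-in for real order processing.
    static String processOrder(int orderId) {
        return "processed-" + orderId;
    }

    static List<String> processAll(List<Integer> orderIds)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int id : orderIds) {
                futures.add(executor.submit(() -> processOrder(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task has finished
            }
            return results;
        } finally {
            executor.shutdown(); // lets the worker threads exit once queued tasks are done
        }
    }
}
```

Without the `shutdown()` call, the pool's non-daemon worker threads would keep the JVM alive after the work is finished.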
|
||||
|
||||
### Real-World Use Cases
|
||||
|
||||
- processing multiple independent tasks in parallel
|
||||
- handling high request volume in servers
|
||||
- offloading email or report generation from the request thread
|
||||
- consuming events from a queue with worker threads
|
||||
|
||||
### Common Pitfall
|
||||
|
||||
More threads do not automatically mean better performance. Too many threads can increase context switching, memory pressure, lock contention, and tail latency.
|
||||
|
||||
## Thread Lifecycle
|
||||
|
||||
```mermaid
|
||||
stateDiagram-v2
|
||||
[*] --> New
|
||||
New --> Runnable: start()
|
||||
Runnable --> Running: scheduled by CPU
|
||||
Running --> Blocked: waiting for lock or I/O
|
||||
Blocked --> Runnable: lock or I/O available
|
||||
Running --> Waiting: wait/sleep/join
|
||||
Waiting --> Runnable: notified or timeout
|
||||
Running --> Terminated: run completes
|
||||
```
|
||||
|
||||
### Why the Lifecycle Matters
|
||||
|
||||
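If you debug a production incident involving stuck requests or slow jobs, understanding whether threads are runnable, blocked, waiting, or deadlocked becomes very important. Tools like thread dumps only make sense if you understand these states. Keep in mind that the diagram is a conceptual model: the JVM's `Thread.State` values are `NEW`, `RUNNABLE`, `BLOCKED`, `WAITING`, `TIMED_WAITING`, and `TERMINATED`, with no separate running state, and a thread blocked on I/O typically appears as `RUNNABLE` in a thread dump.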
|
||||
|
||||
## Synchronization
|
||||
|
||||
Concurrency becomes difficult when multiple threads access shared mutable state.
|
||||
|
||||
### The Core Problem
|
||||
|
||||
Imagine two threads incrementing the same counter.
|
||||
|
||||
```java
|
||||
counter++;
|
||||
```
|
||||
|
||||
This looks atomic, but it is actually multiple steps:
|
||||
|
||||
1. read current value
|
||||
2. add one
|
||||
3. write new value back
|
||||
|
||||
If two threads interleave these steps, updates can be lost.
|
||||
|
||||
### `synchronized`
|
||||
|
||||
Java's `synchronized` keyword provides mutual exclusion and visibility guarantees.
|
||||
|
||||
```java
|
||||
public class Counter {
|
||||
private int value;
|
||||
|
||||
public synchronized void increment() {
|
||||
value++;
|
||||
}
|
||||
|
||||
public synchronized int getValue() {
|
||||
return value;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Only one thread can execute a synchronized method or block on the same monitor at a time.
|
||||
|
||||
### Why It Works
|
||||
|
||||
Synchronization does two important things:
|
||||
|
||||
- prevents unsafe concurrent access to the protected region
|
||||
- ensures memory visibility so threads see up-to-date values when entering and leaving synchronized regions
|
||||
|
||||
### Real-World Use Cases
|
||||
|
||||
- protecting shared in-memory caches or mutable counters
|
||||
- coordinating state transitions in schedulers and worker systems
|
||||
- ensuring thread-safe updates in domain objects used by multiple threads
|
||||
|
||||
### Common Pitfall
|
||||
|
||||
Synchronizing too broadly can kill throughput. Synchronizing too narrowly can fail to protect the actual shared invariant.
|
||||
|
||||
## Locks and Higher-Level Concurrency Utilities
|
||||
|
||||
Beyond `synchronized`, Java provides explicit lock types and concurrency utilities in `java.util.concurrent`.
|
||||
|
||||
### `ReentrantLock`
|
||||
|
||||
This can be useful when you need features beyond simple intrinsic locking, such as timed lock acquisition or finer control.
|
||||
|
||||
```java
|
||||
Lock lock = new ReentrantLock();
|
||||
lock.lock();
|
||||
try {
|
||||
updateSharedState();
|
||||
} finally {
|
||||
lock.unlock();
|
||||
}
|
||||
```
|
||||
|
||||
### Other Important Utilities
|
||||
|
||||
- `AtomicInteger`, `AtomicLong`: lock-free atomic updates for simple counters and state
|
||||
- `ConcurrentHashMap`: concurrent map implementation for many common shared lookup patterns
|
||||
- `CountDownLatch`: wait until several tasks complete
|
||||
- `Semaphore`: limit concurrent access to a resource
|
||||
- `CompletableFuture`: compose asynchronous operations more cleanly
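Two of these utilities can be combined in a short sketch: `AtomicInteger` for a shared counter and `CountDownLatch` to wait for all workers. `AtomicCounterDemo` and its parameters are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    static int countWithThreads(int threads, int incrementsPerThread)
            throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            for (int t = 0; t < threads; t++) {
                executor.execute(() -> {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        counter.incrementAndGet(); // atomic read-modify-write, no lost updates
                    }
                    done.countDown();
                });
            }
            done.await(); // wait until every worker has counted down
            return counter.get();
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling `countWithThreads(4, 1000)` returns 4000 on every run. The same loop over a plain `int` field incremented with `++` could return less, because concurrent increments can overwrite each other.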
|
||||
|
||||
### Production Relevance
|
||||
|
||||
These utilities are common in rate limiting, request fan-out, concurrent caches, asynchronous orchestration, and worker coordination.
|
||||
|
||||
## `volatile`
|
||||
|
||||
`volatile` is often misunderstood.
|
||||
|
||||
### What It Guarantees
|
||||
|
||||
A volatile field provides visibility. When one thread writes a new value, any thread that subsequently reads that field sees that write rather than a stale cached value.
|
||||
|
||||
```java
|
||||
private volatile boolean running = true;
|
||||
```
|
||||
|
||||
This is useful for simple state flags.
|
||||
|
||||
### What It Does Not Guarantee
|
||||
|
||||
`volatile` does not make compound operations atomic.
|
||||
|
||||
This is still unsafe:
|
||||
|
||||
```java
|
||||
volatile int count;
|
||||
count++;
|
||||
```
|
||||
|
||||
The increment is still a read-modify-write sequence.
|
||||
|
||||
### Good Use Case
|
||||
|
||||
Stopping a background loop cleanly:
|
||||
|
||||
```java
|
||||
while (running) {
|
||||
pollQueue();
|
||||
}
|
||||
```
|
||||
|
||||
### Bad Use Case
|
||||
|
||||
Using `volatile` as a replacement for proper locking around shared mutable objects.
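Putting the good use case into a complete class, a minimal sketch of a stoppable worker could look like this. `PollingWorker` is a hypothetical name, and `Thread.yield()` stands in for real polling work.

```java
class PollingWorker implements Runnable {
    // volatile makes the write in stop() visible to the thread running the loop.
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Thread.yield(); // placeholder for polling a queue
        }
    }

    // Called from another thread to request a clean exit.
    void stop() {
        running = false;
    }

    public static void main(String[] args) throws InterruptedException {
        PollingWorker worker = new PollingWorker();
        Thread thread = new Thread(worker);
        thread.start();
        Thread.sleep(50);
        worker.stop();
        thread.join(); // returns once the loop has observed running == false
    }
}
```

The flag works here because there is a single independent write and no compound update. Anything more involved than a flag usually calls for locks or atomics.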
|
||||
|
||||
## Java Memory Model
|
||||
|
||||
The Java Memory Model explains how threads interact through memory and what guarantees exist around reads, writes, ordering, and visibility.
|
||||
|
||||
### Intuition
|
||||
|
||||
In a multithreaded system, one thread can update a value, but another thread may not immediately observe that update unless the program uses the right synchronization rules.
|
||||
|
||||
This is because CPUs, caches, compilers, and runtimes all optimize memory access.
|
||||
|
||||
### Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Main Memory] --> B[Thread 1 Working Memory]
|
||||
A --> C[Thread 2 Working Memory]
|
||||
B --> D[Local reads and writes]
|
||||
C --> E[Local reads and writes]
|
||||
F[synchronized or volatile] --> A
|
||||
```
|
||||
|
||||
### Why Engineers Care
|
||||
|
||||
Without the memory model, some code would appear to "work on my machine" and fail under load or on different hardware.
|
||||
|
||||
The important practical lesson is not to memorize the whole specification. The important lesson is:
|
||||
|
||||
- do not share mutable state casually between threads
|
||||
- use proper synchronization tools when you do share it
|
||||
- visibility and atomicity are different concerns
|
||||
|
||||
## Garbage Collection
|
||||
|
||||
Java manages memory automatically through garbage collection, but automatic does not mean irrelevant.
|
||||
|
||||
### Intuition
|
||||
|
||||
Instead of manually freeing objects, Java tracks which objects are still reachable. Unreachable objects become candidates for collection.
|
||||
|
||||
### Simplified Heap View
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Application creates objects] --> B[Objects live on heap]
|
||||
B --> C[Reachable from roots]
|
||||
B --> D[Unreachable objects]
|
||||
D --> E[Garbage collector reclaims memory]
|
||||
```
|
||||
|
||||
### What Are Roots
|
||||
|
||||
Objects are considered reachable if they can be reached from GC roots such as:
|
||||
|
||||
- active thread stacks
|
||||
- static references
|
||||
- JNI references and runtime internals
|
||||
|
||||
### Why This Matters in Real Systems
|
||||
|
||||
GC behavior affects:
|
||||
|
||||
- latency spikes
|
||||
- memory footprint
|
||||
- throughput
|
||||
- container sizing decisions
|
||||
|
||||
### Common Memory Problems in Java
|
||||
|
||||
- retaining objects in caches without eviction
|
||||
- storing large collections in long-lived singletons
|
||||
- leaking listeners or thread-local data
|
||||
- creating too many short-lived temporary objects in hot code paths
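For the first item, one simple way to bound an in-memory cache is `LinkedHashMap` with `removeEldestEntry`. The sketch below is a single-threaded illustration under that assumption, and `BoundedCache` is a made-up name. Production code often uses a dedicated caching library instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: least recently used entry is the eldest
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so "b" becomes the least recently used entry
        cache.put("c", 3); // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

An unbounded `static Map` used as a cache keeps every entry reachable from a GC root, which is exactly the retention problem listed above.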
|
||||
|
||||
### Misconception
|
||||
|
||||
"Java cannot have memory leaks because it has garbage collection" is false. Java absolutely can have memory leaks if your code keeps references to objects that are no longer useful.
|
||||
|
||||
## Streams API and Functional Programming Style
|
||||
|
||||
Java's Streams API provides a declarative way to process collections and sequences of data.
|
||||
|
||||
### Intuition
|
||||
|
||||
Instead of describing every loop step manually, you describe a pipeline of transformations.
|
||||
|
||||
```java
|
||||
List<String> emails = users.stream()
|
||||
.filter(User::isActive)
|
||||
.map(User::getEmail)
|
||||
.sorted()
|
||||
.toList();
|
||||
```
|
||||
|
||||
### What Happens Conceptually
|
||||
|
||||
The stream pipeline is lazy until a terminal operation runs. Operations like `filter` and `map` describe work. A terminal operation such as `toList()`, `count()`, or `forEach()` triggers execution.
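Laziness can be observed by counting how often an intermediate step runs before and after the terminal operation. This is only a sketch: `LazyStreamDemo` is an illustrative name, and `peek` is used here purely for observation.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Returns [calls before terminal op, calls after terminal op, number of results].
    static List<Integer> callsBeforeAndAfter() {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> pipeline = Stream.of("a", "bb", "ccc")
                .peek(s -> calls.incrementAndGet()) // only describes work at this point
                .filter(s -> s.length() > 1);

        int before = calls.get(); // nothing has executed yet
        List<String> result = pipeline.collect(Collectors.toList()); // terminal operation
        int after = calls.get();
        return List.of(before, after, result.size());
    }

    public static void main(String[] args) {
        System.out.println(callsBeforeAndAfter()); // [0, 3, 2]
    }
}
```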
|
||||
|
||||
### Why This Is Useful
|
||||
|
||||
Streams can make transformation pipelines clearer when the logic is naturally data-oriented.
|
||||
|
||||
Real-world examples:
|
||||
|
||||
- filtering valid requests
|
||||
- mapping entities to DTOs
|
||||
- aggregating metrics
|
||||
- grouping records for reporting
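As one example of grouping for reporting, the sketch below totals order amounts per customer with `Collectors.groupingBy`. The `Order` record and its fields are hypothetical, and records assume Java 16 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class OrderReport {
    // Hypothetical order type used only for this example.
    record Order(String customerId, int amountCents) {}

    // Sums amounts per customer; TreeMap keeps the report sorted by customer ID.
    static Map<String, Integer> totalByCustomer(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::customerId,
                        TreeMap::new,
                        Collectors.summingInt(Order::amountCents)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("c2", 500),
                new Order("c1", 1200),
                new Order("c2", 250));
        System.out.println(totalByCustomer(orders)); // {c1=1200, c2=750}
    }
}
```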
|
||||
|
||||
### When Streams Help
|
||||
|
||||
- straightforward transformations and aggregations
|
||||
- collection processing where each step is conceptually distinct
|
||||
- code that benefits from a pipeline style
|
||||
|
||||
### When Plain Loops Are Better
|
||||
|
||||
- highly stateful logic
|
||||
- error handling that becomes awkward in a stream chain
|
||||
- performance-sensitive paths where allocation and readability must be controlled carefully
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- overusing streams for complex business workflows that become unreadable
|
||||
- assuming streams are always faster than loops
|
||||
- using shared mutable state inside stream operations
|
||||
- misunderstanding laziness and side effects
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
Performance in Java is not just about writing "fast code." It is about understanding tradeoffs across CPU, memory, I/O, latency, and concurrency.
|
||||
|
||||
### Key Areas to Watch
|
||||
|
||||
- object allocation rate
|
||||
- unnecessary boxing and unboxing
|
||||
- poor data structure choice
|
||||
- lock contention
|
||||
- blocking I/O on critical threads
|
||||
- large heap pressure and GC pauses
|
||||
|
||||
### Example: Data Structure Choice Matters
|
||||
|
||||
If you repeatedly test membership in a collection of one million IDs, a `HashSet` is usually a much better fit than a `List`: `HashSet.contains` is an expected constant-time hash lookup, while `List.contains` scans elements one by one.
|
||||
|
||||
### Example: Avoiding Work on Hot Paths
|
||||
|
||||
If a request path is called tens of thousands of times per second, avoid unnecessary logging, object churn, repeated parsing, or expensive string formatting.
|
||||
|
||||
### Production Perspective
|
||||
|
||||
Java performance work should be evidence-driven. Good engineers measure before changing code. They use:
|
||||
|
||||
- metrics and tracing
|
||||
- profiling tools
|
||||
- thread dumps
|
||||
- heap dumps
|
||||
- realistic load tests
|
||||
|
||||
Premature optimization is a problem, but ignoring obvious bottlenecks is also a problem. Performance work is about judgment, not superstition.
|
||||
|
||||
## How Advanced Java Shows Up in Real Systems
|
||||
|
||||
Consider a service that consumes messages from a queue:
|
||||
|
||||
1. a thread pool pulls messages concurrently
|
||||
2. shared state such as rate limits or caches needs safe access
|
||||
3. visibility and ordering matter between worker threads
|
||||
4. processed payloads create object allocations on the heap
|
||||
5. stream pipelines may transform batches for enrichment or filtering
|
||||
6. GC and lock contention affect latency under load
|
||||
|
||||
This is why advanced Java matters. These are not academic concerns. They directly influence system correctness and performance.
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
- Concurrency is about safely making progress with multiple threads, not just spawning more work in parallel.
|
||||
- `synchronized`, locks, atomics, and concurrent collections exist because shared mutable state is hard to manage correctly.
|
||||
- `volatile` provides visibility, not full atomicity.
|
||||
- The Java Memory Model explains why synchronization rules matter for correctness across threads.
|
||||
- Garbage collection removes manual memory management, but memory leaks and GC-related latency issues still exist.
|
||||
- Streams are powerful for data transformation pipelines, but they are not automatically clearer or faster in every situation.
|
||||
- Strong Java engineers combine runtime understanding with measurement rather than guessing about performance.
|
||||
@@ -0,0 +1,346 @@
|
||||
# File 5: Real-world Java Engineering
|
||||
|
||||
By the time you reach this stage, Java should no longer feel like a collection of disconnected language features. The real question becomes: how do these features come together in production systems?
|
||||
|
||||
This file focuses on practical Java engineering in backend services. It connects language fundamentals, OOP, APIs, concurrency, and system design intuition to the kinds of systems teams actually build and operate.
|
||||
|
||||
## Building Backend Services in Java
|
||||
|
||||
### Intuition
|
||||
|
||||
A backend service receives requests or events, executes business logic, talks to data stores or external systems, and returns results. Java is a strong fit for this because it combines:
|
||||
|
||||
- mature web frameworks
|
||||
- good concurrency support
|
||||
- a strong static type system that helps keep large codebases manageable
|
||||
- stable operational tooling on the JVM
|
||||
- rich ecosystems for persistence, messaging, validation, security, and observability
|
||||
|
||||
### Typical Request Flow
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Client] --> B[HTTP Controller]
|
||||
B --> C[Service Layer]
|
||||
C --> D[Repository]
|
||||
C --> E[External API Client]
|
||||
C --> F[Message Publisher]
|
||||
D --> G[(Database)]
|
||||
E --> H[Payment or Identity Provider]
|
||||
F --> I[(Message Broker)]
|
||||
```
|
||||
|
||||
### Why This Structure Is Common
|
||||
|
||||
- controllers handle transport details such as HTTP mapping and status codes
|
||||
- services coordinate business workflows
|
||||
- repositories isolate persistence concerns
|
||||
- external clients isolate integrations with other services
|
||||
- publishers decouple asynchronous work through queues or topics
|
||||
|
||||
This separation makes testing easier and changes safer.
|
||||
|
||||
## The Spring Ecosystem at a High Level
|
||||
|
||||
Spring is the dominant Java backend ecosystem in many organizations, so you need a high-level mental model even if you are not yet learning every annotation.
|
||||
|
||||
### Spring Framework
|
||||
|
||||
Spring Framework provides the core programming model:
|
||||
|
||||
- dependency injection
|
||||
- bean lifecycle management
|
||||
- AOP support
|
||||
- transaction support
|
||||
- web and data integration
|
||||
|
||||
### Spring Boot
|
||||
|
||||
Spring Boot is the opinionated layer that makes Spring applications faster to start and easier to run.
|
||||
|
||||
It gives you:
|
||||
|
||||
- auto-configuration
|
||||
- starter dependencies
|
||||
- embedded servers
|
||||
- conventions for configuration and packaging
|
||||
- production-oriented tooling such as Actuator
|
||||
|
||||
### Why Teams Use It
|
||||
|
||||
- less boilerplate for common backend setups
|
||||
- consistent structure across services
|
||||
- mature ecosystem support
|
||||
- easy packaging and deployment for APIs and internal tools
|
||||
|
||||
### What to Understand First
|
||||
|
||||
At a high level, Spring Boot helps wire together objects and infrastructure so your application code can focus on business behavior.
|
||||
|
||||
That means a controller might receive a request, call a service bean, which depends on a repository bean, which depends on database infrastructure configured automatically by the framework.
|
||||
|
||||
### Startup Mental Model
|
||||
|
||||
```mermaid
flowchart TD
    A[main method] --> B[SpringApplication.run]
    B --> C[Create ApplicationContext]
    C --> D[Component Scan]
    C --> E[Auto-Configuration]
    D --> F[Beans Created]
    E --> F
    F --> G[Embedded Server Starts]
    G --> H[Application Ready]
```
|
||||
|
||||
This matters because when something breaks in a real service, you need to understand whether it is your code, dependency injection wiring, configuration, or auto-configuration behavior.
|
||||
|
||||
## REST API Concepts in Java
|
||||
|
||||
Many Java services expose REST-style HTTP APIs.
|
||||
|
||||
### Core Concepts
|
||||
|
||||
- resources are represented over HTTP
|
||||
- endpoints typically map to nouns such as `/orders` or `/customers`
|
||||
- standard HTTP verbs map to common actions like create, read, update, and delete
|
||||
- status codes communicate outcome
|
||||
- request and response bodies are usually JSON
|
||||
|
||||
### Example Workflow
|
||||
|
||||
An order API might support:
|
||||
|
||||
- `POST /orders` to create an order
|
||||
- `GET /orders/{id}` to fetch details
|
||||
- `PATCH /orders/{id}` to update state
|
||||
|
||||
### What Java Brings Here
|
||||
|
||||
Java web frameworks help with:
|
||||
|
||||
- request routing
|
||||
- JSON serialization and deserialization
|
||||
- validation of incoming payloads
|
||||
- exception mapping into HTTP responses
|
||||
- security filters and authentication
|
||||
- observability and tracing hooks
|
||||
|
||||
### Production Considerations
|
||||
|
||||
Good API design is not only about returning JSON.
|
||||
|
||||
It also includes:
|
||||
|
||||
- idempotency where appropriate
|
||||
- versioning strategy
|
||||
- validation and error message clarity
|
||||
- timeout behavior for downstream calls
|
||||
- pagination for large results
|
||||
- authentication and authorization boundaries
|
||||
|
||||
### Common Pitfall
|
||||
|
||||
Beginners often put all business logic directly in controllers. That leads to code that is difficult to test and reuse. Controllers should stay thin. They translate transport concerns and delegate real work to the service layer.
|
||||
|
||||
## Design Patterns in Java
|
||||
|
||||
Design patterns are recurring solutions to common design problems. They are useful when they make code clearer, not when they are used as decoration.
|
||||
|
||||
### Singleton
|
||||
|
||||
The singleton pattern ensures that exactly one shared instance of a type exists within a given scope, such as an application or a container.
|
||||
|
||||
### Where It Appears in Practice
|
||||
|
||||
- application-wide configuration holders
|
||||
- stateless shared services managed by dependency injection containers
|
||||
- logging or registry-like components in some designs
|
||||
|
||||
### Caution
|
||||
|
||||
Hand-written singletons can create hidden global state, testability problems, and lifecycle issues. In modern Java backend systems, dependency injection frameworks usually manage shared instances for you in a safer way; Spring beans, for example, are singleton-scoped by default.
|
||||
|
||||
### Factory
|
||||
|
||||
A factory centralizes object creation.
|
||||
|
||||
```java
public interface ReportExporter {
    byte[] export(Report report);
}

public class ReportExporterFactory {
    public ReportExporter create(String format) {
        return switch (format) {
            case "csv" -> new CsvReportExporter();
            case "pdf" -> new PdfReportExporter();
            default -> throw new IllegalArgumentException("Unsupported format: " + format);
        };
    }
}
```
|
||||
|
||||
Factories are useful when construction rules vary or when the caller should not depend on concrete types.
|
||||
|
||||
### Strategy
|
||||
|
||||
Strategy encapsulates interchangeable algorithms behind a common interface.
|
||||
|
||||
```java
public interface PricingStrategy {
    Money calculate(Cart cart);
}
```
|
||||
|
||||
Useful in:
|
||||
|
||||
- discount logic by customer type
|
||||
- routing behavior by region
|
||||
- fraud checks by provider
|
||||
- retry policies by integration
|
||||
|
||||
### Builder
|
||||
|
||||
Builder helps construct complex objects with many optional fields.
|
||||
|
||||
This is common in:
|
||||
|
||||
- DTO creation
|
||||
- test data setup
|
||||
- configuration objects
|
||||
- HTTP client requests
|
||||
|
||||
### Observer and Event-Driven Patterns
|
||||
|
||||
Java systems often use event-driven design, either inside the application or across services.
|
||||
|
||||
Examples:
|
||||
|
||||
- publish an event when an order is placed
|
||||
- notify inventory and analytics consumers asynchronously
|
||||
- trigger email sending after the main transaction completes
|
||||
|
||||
This reduces tight coupling and supports scalable workflows.
|
||||
|
||||
## System Design Relevance
|
||||
|
||||
Java engineering becomes much more effective when you connect code-level decisions to system-level behavior.
|
||||
|
||||
### Example: Synchronous vs Asynchronous Work
|
||||
|
||||
Suppose an API receives an order request.
|
||||
|
||||
Should it do all work inline?
|
||||
|
||||
- charge payment
|
||||
- reserve inventory
|
||||
- send email
|
||||
- write analytics records
|
||||
|
||||
Probably not. Some work belongs in the synchronous request path. Some belongs in asynchronous messaging.
|
||||
|
||||
Java and the surrounding ecosystem make both styles possible.
|
||||
|
||||
### Example: Layered Architecture
|
||||
|
||||
Separating controllers, services, repositories, and integrations is not just style. It supports:
|
||||
|
||||
- easier testing
|
||||
- clearer ownership of business rules
|
||||
- safer refactoring
|
||||
- better operational debugging
|
||||
|
||||
### Example: Resilience
|
||||
|
||||
If your Java service depends on another service, you need to think about:
|
||||
|
||||
- timeouts
|
||||
- retries
|
||||
- circuit breaking
|
||||
- fallbacks
|
||||
- idempotency
|
||||
|
||||
These are system design concerns, but they show up in code through client configuration, exception handling, and workflow design.
|
||||
|
||||
## Production Best Practices
|
||||
|
||||
### Keep Business Logic Out of Framework Glue
|
||||
|
||||
Framework annotations and configurations are useful, but the business rules should still be understandable in plain Java classes. This makes testing and refactoring easier.
|
||||
|
||||
### Prefer Constructor Injection
|
||||
|
||||
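Constructor injection keeps dependencies explicit: every collaborator appears in the constructor signature, fields can be `final`, and the class can be instantiated in a plain unit test without a container.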
|
||||
|
||||
### Model Failure Clearly
|
||||
|
||||
Do not swallow exceptions. Decide what should be retried, what should fail fast, and what should surface to callers with meaningful error information.
|
||||
|
||||
### Use Validation at Boundaries
|
||||
|
||||
Validate incoming API payloads, message payloads, and configuration values early. Invalid inputs should not travel deep into the system.
|
||||
|
||||
### Be Intentional About Concurrency
|
||||
|
||||
Do not share mutable state casually. Understand thread pools, blocking calls, and contention points.
|
||||
|
||||
### Observe the System
|
||||
|
||||
Production systems need:
|
||||
|
||||
- structured logs
|
||||
- metrics
|
||||
- traces
|
||||
- health checks
|
||||
- dashboards and alerts
|
||||
|
||||
If a service fails at 3 AM, observability is what turns confusion into diagnosis.
|
||||
|
||||
### Design for Change
|
||||
|
||||
Requirements evolve. Code that is rigid or overcoupled becomes expensive quickly. Good Java systems isolate change behind interfaces, composition, configuration, and focused modules.
|
||||
|
||||
### Think About Data Boundaries
|
||||
|
||||
Be careful about leaking persistence models directly into API responses. Domain models, DTOs, and storage schemas often evolve at different speeds.
|
||||
|
||||
## How a Real Java Service Evolves
|
||||
|
||||
An actual production Java service often grows through stages:
|
||||
|
||||
1. start with a few endpoints and straightforward business logic
|
||||
2. add persistence, validation, and external integrations
|
||||
3. introduce asynchronous processing for slow side effects
|
||||
4. add caching, concurrency controls, and observability
|
||||
5. harden the service with better error handling, retries, metrics, and deployment discipline
|
||||
|
||||
At each stage, the Java concepts from the earlier files become more important, not less.
|
||||
|
||||
- OOP shapes the service boundaries
|
||||
- collections and exceptions shape data flow and failure handling
|
||||
- concurrency affects throughput and safety
|
||||
- framework knowledge affects startup, deployment, and maintainability
|
||||
|
||||
## A Senior Engineer's Mental Checklist
|
||||
|
||||
When reading or designing Java backend code, ask:
|
||||
|
||||
- where does the request enter the system?
|
||||
- where do business rules live?
|
||||
- what dependencies does this component have?
|
||||
- what happens when a downstream dependency is slow or unavailable?
|
||||
- what state is shared and how is it protected?
|
||||
- what gets logged, measured, and traced?
|
||||
- how easy is this code to test without the whole framework running?
|
||||
|
||||
These questions are often more valuable than memorizing another annotation.
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
- Real-world Java engineering is about combining language fundamentals with runtime, framework, and system design thinking.
|
||||
- Backend Java services usually separate transport, business logic, persistence, and external integrations into focused layers.
|
||||
- The Spring ecosystem matters because it is a common way Java teams wire objects, configure infrastructure, and ship services.
|
||||
- REST APIs in Java require careful thinking about validation, status codes, timeouts, idempotency, and service boundaries.
|
||||
- Design patterns such as factory, strategy, and builder are useful when they clarify construction and variation, not when they add ceremony.
|
||||
- Production quality comes from explicit dependencies, clear failure handling, observability, safe concurrency, and code organized for change.
|
||||
@@ -0,0 +1,670 @@
|
||||
# 01. JavaScript Fundamentals
|
||||
|
||||
This chapter builds the language model that everything else in the browser depends on. If you do not have a precise idea of how JavaScript creates execution contexts, resolves variables, binds `this`, and links objects through prototypes, then later topics such as the browser event loop, DOM events, React rendering, or async data fetching become a collection of trivia instead of a coherent system.
|
||||
|
||||
For interview preparation, this chapter matters because many "advanced" frontend questions are actually fundamentals wearing different clothes. A stale closure in React is still a closure problem. A mysterious `this` value in a class method is still a function invocation problem. A performance issue caused by accidental object shape churn still starts with understanding how objects behave.
|
||||
|
||||
This file is self-contained, but it connects directly to:
|
||||
|
||||
- [02-javascript-in-the-browser.md](./02-javascript-in-the-browser.md), which explains where the browser runtime begins
|
||||
- [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md), which builds on the call stack and async model introduced here
|
||||
- [05-real-world-architecture-patterns.md](./05-real-world-architecture-patterns.md), which shows how frameworks rely on these exact language behaviors
|
||||
|
||||
## JavaScript as a Runtime Story, Not Just a Syntax Story
|
||||
|
||||
Many beginners treat JavaScript as a bag of syntax rules: variables, loops, arrays, functions, promises. That is not how experienced engineers think about it.
|
||||
|
||||
The better model is this:
|
||||
|
||||
- JavaScript is a language with a specification called ECMAScript.
|
||||
- A JavaScript engine, such as V8 in Chrome, turns that specification into something executable.
|
||||
- A host environment, such as the browser, gives the language practical capabilities like the DOM, timers, networking, and storage.
|
||||
|
||||
That separation is important.
|
||||
|
||||
When someone asks, "How does JavaScript work?" they may mean one of three different layers:
|
||||
|
||||
- how the language resolves variables and executes functions
|
||||
- how the engine parses, optimizes, and runs code
|
||||
- how the browser schedules callbacks and coordinates with rendering
|
||||
|
||||
This chapter focuses on the first layer, with a few hints toward the other two.
|
||||
|
||||
## Execution Context: The Unit of Running Code
|
||||
|
||||
JavaScript does not execute raw text directly. It executes code through execution contexts.
|
||||
|
||||
An execution context is the runtime record that tells the engine what code is currently running and what data that code can access. You can think of it as a frame containing:
|
||||
|
||||
- the current scope and its variables
|
||||
- the outer lexical environment
|
||||
- the value of `this`
|
||||
- bookkeeping for how control returns when this work is done
|
||||
|
||||
### Why JavaScript Needs Execution Contexts
|
||||
|
||||
Imagine calling one function from another without any runtime frame. The engine would have no clean way to answer:
|
||||
|
||||
- which variables belong to which function call
|
||||
- where to resume after a nested call finishes
|
||||
- which `this` value applies to this call
|
||||
- how closures should remember outer variables
|
||||
|
||||
Execution contexts solve that. They let JavaScript behave as though each running function has its own private workspace, even when many calls happen one after another.
|
||||
|
||||
### Types of Execution Contexts
|
||||
|
||||
At a high level, you should know three:
|
||||
|
||||
1. Global execution context
|
||||
2. Function execution context
|
||||
3. Module execution context
|
||||
|
||||
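The global execution context is created when a script first starts running. In browsers, top-level `var` and function declarations in classic scripts become properties of the global object `window`, while top-level `let` and `const` do not. Modules behave differently: their top-level bindings are scoped to the module and are not added to `window`.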
|
||||
|
||||
Each function call creates a new function execution context. That means calling the same function three times does not reuse the exact same runtime frame; it creates three distinct calls with separate local state.
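
To make "separate local state per call" concrete, here is a minimal sketch. The function name `buildTrail` and its labels are hypothetical, chosen only for illustration. Each call creates its own `steps` array, so the three results never share state.

```js
// Hypothetical helper: every call runs in its own function execution context.
function buildTrail(label) {
  const steps = []; // local to this particular call
  steps.push(`${label}:start`);
  steps.push(`${label}:end`);
  return steps;
}

const trailA = buildTrail("a");
const trailB = buildTrail("b");
const trailC = buildTrail("c");

console.log(trailA);            // [ 'a:start', 'a:end' ]
console.log(trailA === trailB); // false: each call produced its own array
console.log(trailC[0]);         // c:start
```

If the calls reused one frame, `trailA` would also contain the entries pushed during the later calls.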
|
||||
|
||||
Modules also create their own top-level scope, which is one reason ES modules are safer and easier to reason about than older script tags that dumped everything into the global namespace.
|
||||
|
||||
### Creation Phase and Execution Phase
|
||||
|
||||
An interview-friendly but useful simplification is that execution contexts have two stages:
|
||||
|
||||
- creation phase
|
||||
- execution phase
|
||||
|
||||
During creation, the engine determines what bindings exist in that scope. During execution, statements actually run line by line.
|
||||
|
||||
This is the intuition behind hoisting. Hoisting is not "the code moves upward." It is better understood as: the engine knows about bindings before it executes the body line by line.
|
||||
|
||||
```mermaid
flowchart TD
    A[Source code loaded] --> B[Parse into syntax tree]
    B --> C[Create global execution context]
    C --> D[Register bindings and outer scope links]
    D --> E[Execute top-level statements]
    E --> F{Function called?}
    F -- Yes --> G[Create function execution context]
    G --> H[Bind parameters, local declarations, this]
    H --> I[Execute function body]
    I --> J[Pop context and return]
    J --> E
    F -- No --> K[Program waits for more work]
```
|
||||
|
||||
### Example
|
||||
|
||||
```js
const siteName = "LearnJS";

function greet(user) {
  const message = `Hello, ${user}`;
  return `${message} from ${siteName}`;
}

greet("Asha");
```
|
||||
|
||||
When the file starts:
|
||||
|
||||
- a global execution context is created
|
||||
- `siteName` and `greet` bindings are registered
|
||||
- top-level statements execute
|
||||
|
||||
When `greet("Asha")` runs:
|
||||
|
||||
- a new function execution context is created
|
||||
- `user` is bound to `"Asha"`
|
||||
- `message` is created inside that function scope
|
||||
- the function can still access `siteName` because of lexical scope
|
||||
|
||||
This one example already connects execution context, function calls, local variables, and closure-friendly scope lookup.
|
||||
|
||||
## The Call Stack: Where Synchronous JavaScript Lives
|
||||
|
||||
The call stack is the structure that keeps track of currently active execution contexts.
|
||||
|
||||
If execution context is the runtime frame for a single piece of work, the call stack is the ordered pile of those frames.
|
||||
|
||||
### Mental Model
|
||||
|
||||
Think of the call stack like a stack of trays in a cafeteria:
|
||||
|
||||
- when a function is called, a new tray is placed on top
|
||||
- the engine works on the tray at the top only
|
||||
- when that function returns, the tray is removed
|
||||
- control continues from the tray underneath
|
||||
|
||||
Because JavaScript on the main browser thread runs one call stack at a time, synchronous code is single-threaded from the perspective of your page code.
|
||||
|
||||
```mermaid
flowchart BT
    A["third()"] --> B["second()"]
    B --> C["first()"]
    C --> D[global]
```
|
||||
|
||||
### Example
|
||||
|
||||
```js
function third() {
  console.log("third");
}

function second() {
  third();
  console.log("second");
}

function first() {
  second();
  console.log("first");
}

first(); // logs "third", then "second", then "first"
```
|
||||
|
||||
What happens:
|
||||
|
||||
1. Global code runs.
|
||||
2. `first()` is called, so its execution context is pushed.
|
||||
3. `second()` is called, so its context is pushed.
|
||||
4. `third()` is called, so its context is pushed.
|
||||
5. `third()` finishes and is popped.
|
||||
6. `second()` continues and then is popped.
|
||||
7. `first()` continues and then is popped.
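
The push and pop sequence above can be made visible with a small sketch. The `tracked` wrapper and the `modelStack` array are hypothetical: they model the call stack in user code and are not the engine's real stack.

```js
const trace = [];      // human-readable log of push and pop events
const modelStack = []; // our own model of the stack, for illustration only

// Wraps a function so that entering and leaving it is recorded.
function tracked(name, body) {
  return function (...args) {
    modelStack.push(name);
    trace.push(`push ${name} -> [${modelStack.join(", ")}]`);
    try {
      return body(...args);
    } finally {
      modelStack.pop();
      trace.push(`pop ${name} -> [${modelStack.join(", ")}]`);
    }
  };
}

const third = tracked("third", () => "done");
const second = tracked("second", () => third());
const first = tracked("first", () => second());

first();
console.log(trace.join("\n"));
// push first -> [first]
// push second -> [first, second]
// push third -> [first, second, third]
// pop third -> [first, second]
// pop second -> [first]
// pop first -> []
```

The last frame pushed is always the first one popped, which is exactly the order described in steps 2 through 7.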
|
||||
|
||||
### Why This Matters in Browsers
|
||||
|
||||
If JavaScript is executing a long synchronous function on the main thread, the browser cannot use that same thread to run your next event handler or do its normal page work at the same moment. That is why a CPU-heavy loop can freeze the UI.
|
||||
|
||||
This also explains the phrase "JavaScript is single-threaded" in frontend interviews. The more precise statement is:
|
||||
|
||||
"Your page's main JavaScript execution on the main thread uses a single call stack. The browser can use other threads internally, but your code experiences a single-threaded execution model for synchronous main-thread work."
|
||||
|
||||
## Hoisting: Early Binding Knowledge, Not Literal Movement
|
||||
|
||||
Hoisting is one of the most misunderstood topics in JavaScript because it is often taught with cartoon explanations.
|
||||
|
||||
The engine does not physically move declarations to the top of your file. Instead, when a scope is created, the engine registers bindings before normal execution reaches the lines where those declarations appear.
|
||||
|
||||
### `var`, Function Declarations, and `let`/`const`
|
||||
|
||||
These behave differently because they were designed in different eras of the language.
|
||||
|
||||
#### `var`
|
||||
|
||||
`var` is function-scoped and is initialized to `undefined` during context creation.
|
||||
|
||||
```js
console.log(total); // undefined
var total = 3;
```
|
||||
|
||||
This works without throwing because the binding exists from the start of the scope, even though the assignment happens later.
|
||||
|
||||
#### Function declarations
|
||||
|
||||
Function declarations are hoisted with their function value.
|
||||
|
||||
```js
sayHi(); // logs "hi" even though the declaration appears below

function sayHi() {
  console.log("hi");
}
```
|
||||
|
||||
This is one reason classic JavaScript code often calls functions before they appear in the file.
|
||||
|
||||
#### `let` and `const`
|
||||
|
||||
`let` and `const` are also known to the engine during scope creation, but they are not initialized for use until execution reaches the declaration. The period before initialization is called the temporal dead zone, or TDZ.
|
||||
|
||||
```js
console.log(user); // ReferenceError
let user = "Mina";
```
|
||||
|
||||
### Why `let` and `const` Behave This Way
|
||||
|
||||
This design closes off an entire class of accidental bugs that `var` made easy:
|
||||
|
||||
- reading a variable before its real initialization
|
||||
- leaking values across blocks unintentionally
|
||||
- writing loops whose callbacks all share one mutable binding
|
||||
|
||||
The modern language moved toward safer defaults without breaking older JavaScript completely.
|
||||
|
||||
### A Better Interview Answer for Hoisting
|
||||
|
||||
Instead of saying, "JavaScript moves declarations to the top," say this:
|
||||
|
||||
"When a scope is created, JavaScript registers bindings before line-by-line execution starts. Different declaration forms are initialized differently, which is why `var`, function declarations, and `let`/`const` behave differently before their declaration lines."
|
||||
|
||||
That answer shows actual understanding instead of repetition.
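
The three behaviors can be observed side by side in one function. This is a minimal sketch; the names `hoistingDemo`, `legacyTotal`, `modernTotal`, and `declared` are hypothetical.

```js
function hoistingDemo() {
  const results = {};

  // `var` binding already exists and holds undefined.
  results.varBefore = legacyTotal;

  // Function declaration is already initialized with its function value.
  results.fnBefore = typeof declared;

  try {
    modernTotal; // `let` binding exists but is still in the temporal dead zone
  } catch (err) {
    results.letBefore = err.name;
  }

  var legacyTotal = 3;
  let modernTotal = 5;
  function declared() {}

  results.after = legacyTotal + modernTotal;
  return results;
}

console.log(hoistingDemo());
// { varBefore: undefined, fnBefore: 'function', letBefore: 'ReferenceError', after: 8 }
```

All three bindings are known when the function's scope is created. Only their initialization timing differs.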
|
||||
|
||||
## Scope: How JavaScript Resolves Names
|
||||
|
||||
Scope answers the question: when code uses a variable name, where does JavaScript look for it?
|
||||
|
||||
JavaScript uses lexical scope. That means name resolution depends on where code is written, not where it is called from.
|
||||
|
||||
### Lexical Scope
|
||||
|
||||
```js
const topic = "global";

function outer() {
  const topic = "outer";

  function inner() {
    console.log(topic);
  }

  return inner;
}

const fn = outer();
fn(); // outer
```
|
||||
|
||||
`inner` resolves `topic` from the environment in which `inner` was defined. It does not care that `fn()` is called later from global code.
|
||||
|
||||
### Scope Chain
|
||||
|
||||
If a variable is not found in the current scope, JavaScript walks outward through enclosing lexical environments.
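
As a rough illustration of that outward walk, the sketch below resolves three names at three different distances and then tries one name that no scope declares. All identifiers here are hypothetical.

```js
const level = "global";

function outerScope() {
  const outerOnly = "outer";

  function innerScope() {
    const innerOnly = "inner";
    // innerOnly: found in the current scope
    // outerOnly: found one step outward
    // level:     found two steps outward, at the top level
    return [innerOnly, outerOnly, level];
  }

  return innerScope();
}

function lookupMissing() {
  try {
    return notDeclaredAnywhere; // no scope in the chain has this binding
  } catch (err) {
    return err.name;
  }
}

console.log(outerScope());    // [ 'inner', 'outer', 'global' ]
console.log(lookupMissing()); // ReferenceError
```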
|
||||
|
||||
```mermaid
flowchart LR
    A[inner scope] --> B[outer function scope]
    B --> C[module or global scope]
    C --> D["unresolved => ReferenceError"]
```
|
||||
|
||||
### Block Scope vs Function Scope
|
||||
|
||||
`let` and `const` are block-scoped.
|
||||
|
||||
```js
if (true) {
  const status = "ready";
}

console.log(status); // ReferenceError
```
|
||||
|
||||
`var` is not block-scoped.
|
||||
|
||||
```js
if (true) {
  var legacy = "visible outside block";
}

console.log(legacy); // "visible outside block"
```
|
||||
|
||||
That difference is why `var` can create subtle bugs in loops and conditionals.
|
||||
|
||||
## Closures: Functions That Remember Their Surroundings
|
||||
|
||||
A closure is a function together with the lexical environment it can still access.
|
||||
|
||||
This is one of the most powerful features in JavaScript. It is also one of the most practical.
|
||||
|
||||
### Intuition
|
||||
|
||||
Imagine a function leaving the room but taking a backpack full of references to outer variables with it. That is not exactly how the engine stores it internally, but it is a useful mental model. The function can still use those outer bindings later.
|
||||
|
||||
```js
function createCounter() {
  let count = 0;

  return function increment() {
    count += 1;
    return count;
  };
}

const counter = createCounter();
console.log(counter()); // 1
console.log(counter()); // 2
```
|
||||
|
||||
The outer function has finished, but `count` is still reachable because the returned function closes over it.
|
||||
|
||||
### Why Closures Exist
|
||||
|
||||
Closures are not a quirky feature bolted on top of the language. They are the natural consequence of lexical scope plus first-class functions.
|
||||
|
||||
If functions can be passed around like values, and if they resolve names from where they were defined, then closures are unavoidable.
|
||||
|
||||
### Real-World Uses of Closures
|
||||
|
||||
- event handlers that need access to component state
|
||||
- factory functions that create specialized behavior
|
||||
- memoization utilities
|
||||
- module patterns that hide private data
|
||||
- React hooks and render callbacks
|
||||
- debouncing and throttling utilities
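
One of the uses listed above, memoization, fits in a few lines. This is a minimal sketch that assumes a single primitive argument. `memoize`, `square`, and `calls` are hypothetical names.

```js
function memoize(fn) {
  const cache = new Map(); // reachable only through the returned closure

  return function memoized(arg) {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
}

let calls = 0;
const square = memoize((n) => {
  calls += 1; // counts how often the real computation runs
  return n * n;
});

square(4);
square(4); // served from the cache
square(5);

console.log(calls); // 2
```

The `cache` map is private state. Nothing outside `memoized` can read or clear it, which is the same mechanism module patterns use to hide data.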
|
||||
|
||||
### The Famous Loop Example
|
||||
|
||||
```js
for (var i = 0; i < 3; i += 1) {
  setTimeout(() => console.log(i), 0);
}
```
|
||||
|
||||
This logs `3` three times.
|
||||
|
||||
Why:
|
||||
|
||||
- there is one shared function-scoped `i`
|
||||
- the callbacks close over that same binding
|
||||
- by the time they run, the loop has finished and `i` is `3`
|
||||
|
||||
Using `let` fixes this because each loop iteration gets a new block-scoped binding.
|
||||
|
||||
```js
for (let i = 0; i < 3; i += 1) {
  setTimeout(() => console.log(i), 0);
}
```
|
||||
|
||||
This logs `0`, `1`, and `2`.
|
||||
|
||||
### Closures and Memory
|
||||
|
||||
Closures are powerful, but they also mean data can stay alive longer than beginners expect. If a long-lived callback still references a large object, that object cannot be garbage collected for as long as the callback itself remains reachable.
|
||||
|
||||
In real apps, memory leaks often come from some combination of:
|
||||
|
||||
- long-lived event listeners
|
||||
- timers that are never cleaned up
|
||||
- caches that never expire
|
||||
- closures capturing more state than necessary
|
||||
|
||||
That is not a reason to avoid closures. It is a reason to understand lifecycle.
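
A common way to manage that lifecycle is to have every subscription hand back a cleanup function. The sketch below uses a hypothetical `createEmitter` helper rather than any real browser or library API. The listener closes over `largeReport`, so the emitter keeps that array alive until `unsubscribe` runs.

```js
// Hypothetical minimal emitter used only to illustrate listener lifecycle.
function createEmitter() {
  const listeners = new Set();

  return {
    subscribe(listener) {
      listeners.add(listener);
      // The returned function removes the listener and, with it,
      // the emitter's reference to everything the listener closes over.
      return () => listeners.delete(listener);
    },
    emit(value) {
      listeners.forEach((listener) => listener(value));
    },
    get size() {
      return listeners.size;
    }
  };
}

const emitter = createEmitter();
const largeReport = new Array(1000).fill("row");
const seen = [];

const unsubscribe = emitter.subscribe((value) => {
  seen.push(`${value}:${largeReport.length}`); // closure captures largeReport
});

emitter.emit("tick");
unsubscribe();
emitter.emit("tock"); // nobody is listening any more

console.log(seen);         // [ 'tick:1000' ]
console.log(emitter.size); // 0
```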
|
||||
|
||||
## `this`: Determined by Call Site, Except When It Is Not
|
||||
|
||||
Few JavaScript topics cause more confusion than `this`, largely because people first learn it as if it were a variable stored inside the function definition. That is not usually how it works.
|
||||
|
||||
For normal functions, `this` is determined by how the function is called.
|
||||
|
||||
### The Four Practical Binding Rules
|
||||
|
||||
In everyday JavaScript, think in this order:
|
||||
|
||||
1. Was the function called with `new`?
|
||||
2. Was it called with `call` or `apply`, or through a function created by `bind`?
|
||||
3. Was it called as a method through an object reference?
|
||||
4. Otherwise, default binding applies.
|
||||
|
||||
#### Default binding
|
||||
|
||||
```js
function show() {
  console.log(this);
}

show();
```
|
||||
|
||||
In strict mode, `this` is `undefined`. In non-strict ("sloppy") script code, it falls back to the global object, which is `window` in browsers. ES modules and class bodies are always strict, so modern code should assume strict mode semantics.
|
||||
|
||||
#### Implicit binding through an object
|
||||
|
||||
```js
const user = {
  name: "Ravi",
  greet() {
    console.log(this.name);
  }
};

user.greet(); // Ravi
```
|
||||
|
||||
Here `this` points to the object used at the call site.
|
||||
|
||||
#### Explicit binding
|
||||
|
||||
```js
function greet() {
  console.log(`Hello ${this.name}`);
}

greet.call({ name: "Lena" }); // Hello Lena
```
|
||||
|
||||
#### Constructor call with `new`
|
||||
|
||||
```js
function User(name) {
  this.name = name;
}

const person = new User("Karan");
```
|
||||
|
||||
With `new`, JavaScript creates a fresh object, links it to the constructor's `prototype`, binds `this` to that object inside the constructor, and returns it unless the constructor explicitly returns a different object (a returned primitive is ignored).
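
Those steps can be approximated in plain code. The helper `simulateNew` below is hypothetical and deliberately simplified: it ignores `new.target` and does not work with `class` constructors, which refuse to run without `new`.

```js
// Rough emulation of what `new Ctor(...args)` does for ordinary functions.
function simulateNew(Ctor, ...args) {
  const obj = Object.create(Ctor.prototype); // create the object and link its prototype
  const result = Ctor.apply(obj, args);      // run the constructor with `this` bound to it
  const returnedObject =
    (typeof result === "object" && result !== null) || typeof result === "function";
  return returnedObject ? result : obj;      // an explicit object return wins
}

function User(name) {
  this.name = name;
}
User.prototype.describe = function describe() {
  return `User: ${this.name}`;
};

function Overriding() {
  this.ignored = true;
  return { custom: true }; // explicit object return replaces the fresh object
}

const viaSketch = simulateNew(User, "Karan");
console.log(viaSketch.name);             // Karan
console.log(viaSketch instanceof User);  // true
console.log(viaSketch.describe());       // User: Karan
console.log(simulateNew(Overriding));    // { custom: true }
console.log(new Overriding());           // { custom: true }
```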
|
||||
|
||||
### Arrow Functions
|
||||
|
||||
Arrow functions do not get their own dynamic `this`. They capture `this` lexically from the surrounding scope.
|
||||
|
||||
```js
const team = {
  name: "Platform",
  members: ["A", "B"],
  print() {
    this.members.forEach((member) => {
      console.log(this.name, member);
    });
  }
};

team.print(); // logs "Platform A", then "Platform B"
```
|
||||
|
||||
The arrow function inside `forEach` uses the `this` from `print`, which is often exactly what you want.
|
||||
|
||||
### Why `this` Works This Way
|
||||
|
||||
JavaScript was designed to support several programming styles at once:
|
||||
|
||||
- simple function calls
|
||||
- object method calls
|
||||
- constructor-style object creation
|
||||
- callback-heavy event-driven code
|
||||
|
||||
The call-site-based `this` model gives flexibility, but the tradeoff is confusion. ES6 arrow functions were partly a repair mechanism for the most common pain point: nested callbacks losing the intended outer `this`.
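
The pain point is easiest to see by putting both callback styles next to each other. This sketch uses a hypothetical `Team` class. Because class bodies are always strict, the plain function callback receives `undefined` as `this`, while the arrow function keeps the method's `this`.

```js
class Team {
  constructor(name, members) {
    this.name = name;
    this.members = members;
  }

  labelsWithFunction() {
    return this.members.map(function (member) {
      // Plain function: `this` is set by how map calls it, which is undefined here.
      return this === undefined ? `? ${member}` : `${this.name} ${member}`;
    });
  }

  labelsWithArrow() {
    // Arrow function: `this` is taken from labelsWithArrow itself.
    return this.members.map((member) => `${this.name} ${member}`);
  }
}

const platform = new Team("Platform", ["A", "B"]);
console.log(platform.labelsWithFunction()); // [ '? A', '? B' ]
console.log(platform.labelsWithArrow());    // [ 'Platform A', 'Platform B' ]
```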
|
||||
|
||||
### Real-World Frontend Consequence
|
||||
|
||||
In browser apps, `this` bugs often show up when:
|
||||
|
||||
- passing class methods as callbacks without binding
|
||||
- using DOM event handlers and assuming `this` is some app object
|
||||
- mixing arrow functions and method syntax without understanding the difference
|
||||
|
||||
Frameworks reduced the visibility of many `this` issues, but they did not remove the underlying rules.
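
The first bug in that list, passing a class method as a callback, can be reproduced without any framework. The `Counter` class below is a hypothetical example. It shows the failure and two common fixes.

```js
class Counter {
  constructor() {
    this.count = 0;
  }

  increment() {
    this.count += 1;
    return this.count;
  }
}

const counter = new Counter();

// Passing the method by reference drops the object it was read from.
const detached = counter.increment;
let detachedError = null;
try {
  detached(); // `this` is undefined because class bodies are strict
} catch (err) {
  detachedError = err.name;
}

// Fix 1: bind the method to its instance.
const bound = counter.increment.bind(counter);
// Fix 2: wrap the call so it still happens through the object.
const wrapped = () => counter.increment();

bound();
wrapped();

console.log(detachedError); // TypeError
console.log(counter.count); // 2
```

Both fixes restore a call site where `this` is the instance, which is all the binding rules require.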
|
||||
|
||||
## Objects, Prototypes, and Inheritance
|
||||
|
||||
JavaScript is prototype-based. That sentence is often memorized and rarely understood.
|
||||
|
||||
### The Core Idea
|
||||
|
||||
Objects in JavaScript can delegate property lookup to another object, called their prototype.
|
||||
|
||||
If you try to access `obj.value` and `value` is not found directly on `obj`, the engine checks the prototype, then the prototype's prototype, and so on until it reaches `null`.
|
||||
|
||||
```mermaid
flowchart LR
    A[instance object] --> B[instance prototype]
    B --> C[parent prototype]
    C --> D[Object.prototype]
    D --> E[null]
```
|
||||
|
||||
### Property Lookup Example
|
||||
|
||||
```js
const animal = {
  eats: true,
  speak() {
    return "sound";
  }
};

const dog = Object.create(animal);
dog.barks = true;

console.log(dog.barks); // own property
console.log(dog.eats); // found on prototype
console.log(dog.speak()); // "sound", method found on prototype
```
|
||||
|
||||
### Why Prototypes Exist
|
||||
|
||||
They allow many objects to share behavior without copying methods onto every instance. That saves memory and provides a natural form of delegation.
|
||||
|
||||
If one shared method lives on the prototype, every instance can use it through lookup.
|
||||
|
||||
### Constructor Functions and Prototypes
|
||||
|
||||
Before `class` syntax, constructor functions were the common way to create related objects.
|
||||
|
||||
```js
function User(name) {
  this.name = name;
}

User.prototype.describe = function describe() {
  return `User: ${this.name}`;
};

const u1 = new User("Isha");
console.log(u1.describe()); // "User: Isha"
```
|
||||
|
||||
Every `User` instance can access `describe` via the prototype chain.
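
A small check makes the chain visible. The sketch below repeats the `User` constructor so it runs on its own, and adds a second instance (`u2`, a name introduced here only for illustration) to show that the method is shared rather than copied.

```js
function User(name) {
  this.name = name;
}

User.prototype.describe = function describe() {
  return `User: ${this.name}`;
};

const u1 = new User("Isha");
const u2 = new User("Ravi"); // hypothetical second instance

// The instance's prototype is the constructor's `prototype` object.
console.log(Object.getPrototypeOf(u1) === User.prototype); // true

// `name` lives on the instance; `describe` does not.
console.log(Object.prototype.hasOwnProperty.call(u1, "name")); // true
console.log(Object.prototype.hasOwnProperty.call(u1, "describe")); // false

// Both instances reach the same function object through lookup.
console.log(u1.describe === u2.describe); // true
```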
|
||||
|
||||
### `class` Syntax: Cleaner Surface, Same Underlying Model
|
||||
|
||||
```js
class AdminUser extends User {
  deletePost() {
    return "deleted";
  }
}
```
|
||||
|
||||
`class` makes the code look more classically object-oriented, but under the hood JavaScript is still using prototypes.
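
One way to convince yourself is to inspect a class with the same tools you would use on a constructor function. This is a minimal sketch; `User` is rewritten as a class here so the block is self-contained.

```js
class User {
  constructor(name) {
    this.name = name;
  }

  describe() {
    return `User: ${this.name}`;
  }
}

class AdminUser extends User {
  deletePost() {
    return "deleted";
  }
}

const admin = new AdminUser("Isha");

// A class is still a function, and its methods still live on `prototype`.
console.log(typeof User); // "function"
console.log(Object.prototype.hasOwnProperty.call(User.prototype, "describe")); // true

// `extends` links the two prototype objects together.
console.log(Object.getPrototypeOf(AdminUser.prototype) === User.prototype); // true
console.log(admin.describe()); // "User: Isha"

// One of the stricter rules: a class cannot be called without `new`.
let callError = null;
try {
  User("Isha");
} catch (err) {
  callError = err;
}
console.log(callError instanceof TypeError); // true
```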
|
||||
|
||||
This is important in interviews. A good answer is:
|
||||
|
||||
"JavaScript classes are primarily syntactic sugar over prototype-based inheritance, with some stricter rules and cleaner syntax."
|
||||
|
||||
### Prototype Delegation vs Classical Inheritance
|
||||
|
||||
In class-based languages, people often imagine copying a parent blueprint into a child blueprint. JavaScript is closer to delegation: if an object does not have a property, it asks another object up the chain.
|
||||
|
||||
That is a simpler and more accurate mental model.
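
The difference between copying and delegating shows up when the prototype changes after an object has been created. A short sketch, reusing the `animal` and `dog` names from the earlier example:

```js
const animal = {
  speak() {
    return "sound";
  }
};

const dog = Object.create(animal);
console.log(dog.speak()); // "sound"

// Nothing was copied into `dog`, so a later change on `animal` is visible.
animal.speak = function speak() {
  return "new sound";
};
console.log(dog.speak()); // "new sound"

// An own property shadows the prototype without modifying it.
dog.speak = function speak() {
  return "woof";
};
console.log(dog.speak()); // "woof"
console.log(animal.speak()); // "new sound"
```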
|
||||
|
||||
## Functions Are Objects Too
|
||||
|
||||
JavaScript functions are special objects.
|
||||
|
||||
That means functions can:
|
||||
|
||||
- be passed around as values
|
||||
- have properties
|
||||
- be stored in arrays or objects
|
||||
- act as constructors when used with `new`
|
||||
|
||||
This is why JavaScript can support such a strong functional style while still allowing object-oriented patterns. It is also why callbacks, middleware, hooks, and higher-order utilities are so common across web frameworks.
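
The four capabilities listed above can be shown in a few lines. All names here (`greet`, `applyAll`, `Point`) are made up for the illustration.

```js
function greet(name) {
  return `Hello, ${name}`;
}

// Functions can carry properties like any other object.
greet.description = "greets a user";

// Functions can be stored in arrays and passed as values.
const transforms = [greet, (name) => name.toUpperCase()];

function applyAll(fns, value) {
  return fns.map((fn) => fn(value));
}

console.log(applyAll(transforms, "Isha")); // [ 'Hello, Isha', 'ISHA' ]

// A normal function can act as a constructor when called with `new`.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

const p = new Point(1, 2);
console.log(p instanceof Point); // true
console.log(greet instanceof Object); // true
```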
|
||||
|
||||
## Event-Driven Thinking: The First Shift Toward Browser Programming
|
||||
|
||||
Even before discussing browsers in detail, it helps to understand the event-driven mindset.
|
||||
|
||||
Traditional step-by-step programs often look like this:
|
||||
|
||||
1. read input
|
||||
2. process it
|
||||
3. print output
|
||||
4. exit
|
||||
|
||||
Browser applications are different. They stay alive and wait for things to happen:
|
||||
|
||||
- a user clicks a button
|
||||
- a timer fires
|
||||
- a network response arrives
|
||||
- a Promise settles
|
||||
- the browser asks to repaint
|
||||
|
||||
That means much of frontend programming is about registering behavior now that will run later.
|
||||
|
||||
```mermaid
flowchart LR
    A[App starts] --> B[Register handlers]
    B --> C[Wait for events]
    C --> D[User click]
    C --> E[Timer completes]
    C --> F[Network response]
    D --> G[Run callback]
    E --> G
    F --> G
    G --> C
```
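
The "register now, run later" shape can be sketched without a page by using `EventTarget`, which is available as a global in browsers and in recent Node.js versions. The `refresh` event name and the `bus` object are assumptions made up for this example. Note that `dispatchEvent` runs listeners synchronously, whereas real user input is delivered by the browser on its own schedule.

```js
const bus = new EventTarget();
const received = [];

// Register behavior now...
bus.addEventListener("refresh", (event) => {
  received.push(event.type);
});

console.log(received.length); // 0, nothing has happened yet

// ...and it runs later, each time the event occurs.
bus.dispatchEvent(new Event("refresh"));
bus.dispatchEvent(new Event("refresh"));

console.log(received); // [ 'refresh', 'refresh' ]
```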
|
||||
|
||||
### Why This Changes How You Design Code
|
||||
|
||||
When code runs later instead of immediately, several language features become central:
|
||||
|
||||
- closures, because callbacks need access to earlier state
|
||||
- the call stack, because callbacks run only when the current stack is clear
|
||||
- `this`, because callback invocation style affects binding
|
||||
- immutability and state discipline, because time gaps make mutation harder to reason about
|
||||
|
||||
This is why frontend interviews repeatedly return to fundamentals. The browser multiplies the importance of small language details.
|
||||
|
||||
## Common Misconceptions That Hurt Engineers
|
||||
|
||||
### "Hoisting means code moves"
|
||||
|
||||
No. Bindings are registered before execution. That is more accurate and leads to better reasoning.
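
A sketch of what "registered before execution" means in practice. The function and variable names are hypothetical; the point is that all three bindings exist from the start of the scope, but they are initialized differently.

```js
function demo() {
  const results = [];

  // `var` bindings exist from the start and are initialized to undefined.
  results.push(hoistedVar);

  // Function declarations are fully initialized before any code runs.
  results.push(hoistedFn());

  // `let` bindings exist too, but reading them early throws
  // (the temporal dead zone).
  try {
    void laterLet;
  } catch (err) {
    results.push(err instanceof ReferenceError);
  }

  var hoistedVar = 1;
  let laterLet = 2;

  function hoistedFn() {
    return "ready";
  }

  return results;
}

console.log(demo()); // [ undefined, 'ready', true ]
```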
|
||||
|
||||
### "Closures copy values"
|
||||
|
||||
Not usually. Closures keep access to bindings in an outer lexical environment. That distinction matters when the closed-over value can still change.
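
A minimal sketch of that distinction, using a hypothetical `makeCounter` helper: the reader function is created before the value changes, yet it observes the change because it holds the binding, not a snapshot.

```js
function makeCounter() {
  let count = 0;
  return {
    increment() {
      count += 1;
    },
    read: () => count,
  };
}

const counter = makeCounter();
const readLater = counter.read; // captured while count is still 0

counter.increment();
counter.increment();

console.log(readLater()); // 2, not 0
```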
|
||||
|
||||
### "`this` points to the function's object"
|
||||
|
||||
Functions do not have a built-in permanent `this`. For normal functions, the call site determines it.
|
||||
|
||||
### "Classes replaced prototypes"
|
||||
|
||||
No. Classes mostly provide nicer syntax over the prototype system.
|
||||
|
||||
### "JavaScript is asynchronous"
|
||||
|
||||
JavaScript itself executes synchronous code on a call stack. Async behavior comes from the host environment and scheduling model, which the next chapters explain in detail.
|
||||
|
||||
## Interview-Ready Summary
|
||||
|
||||
If you need a compact but strong summary, keep these statements ready:
|
||||
|
||||
- JavaScript executes code through execution contexts, which store scope, `this`, and runtime state.
|
||||
- Synchronous JavaScript runs on a call stack, so long-running code blocks progress on the main thread.
|
||||
- Hoisting is best understood as early binding registration during scope creation, not literal line movement.
|
||||
- JavaScript uses lexical scope, so variable lookup depends on where code is written.
|
||||
- Closures allow functions to retain access to outer bindings and are fundamental to callbacks, modules, and UI frameworks.
|
||||
- For normal functions, `this` depends on call site; arrow functions capture `this` lexically.
|
||||
- JavaScript object inheritance works through prototype delegation, and `class` syntax is built on top of that model.
|
||||
|
||||
## What to Read Next
|
||||
|
||||
Continue with [02-javascript-in-the-browser.md](./02-javascript-in-the-browser.md). That chapter answers the next obvious question: if the language gives you functions, objects, and promises, where do timers, the DOM, `fetch`, and events actually come from inside a browser?
|
||||
@@ -0,0 +1,373 @@
|
||||
# 02. JavaScript in the Browser
|
||||
|
||||
The previous chapter explained JavaScript as a language. This chapter explains the environment that makes JavaScript useful on the web.
|
||||
|
||||
That distinction matters more than many candidates realize. The language gives you functions, objects, promises, modules, and classes. The browser gives you a page, a DOM, timers, rendering, networking, storage, security boundaries, and user events. Most frontend engineering work happens at the boundary between those two layers.
|
||||
|
||||
If Chapter 1 answered, "How does JavaScript think?" this chapter answers, "Where does browser JavaScript actually live, and who gives it its powers?"
|
||||
|
||||
This file connects naturally to:
|
||||
|
||||
- [01-javascript-fundamentals.md](./01-javascript-fundamentals.md), which explains the execution model the browser embeds
|
||||
- [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md), which dives into event loop and rendering details
|
||||
- [04-networking-storage-security.md](./04-networking-storage-security.md), which explains how browser-provided capabilities are constrained by security rules
|
||||
|
||||
## The Browser Is Not Just a Window for JavaScript
|
||||
|
||||
When people casually say, "JavaScript runs in the browser," they often picture a single machine that simply reads code and executes it. Real browsers are more like miniature operating systems dedicated to documents and apps.
|
||||
|
||||
They have to coordinate:
|
||||
|
||||
- HTML parsing
|
||||
- CSS parsing and style resolution
|
||||
- JavaScript execution
|
||||
- network requests
|
||||
- image and font decoding
|
||||
- input events
|
||||
- layout, painting, and compositing
|
||||
- sandboxing and security isolation
|
||||
- storage and caching
|
||||
|
||||
So a browser runtime is not "JavaScript plus some extra APIs." It is a complex host that embeds a JavaScript engine into a much larger document and rendering system.
|
||||
|
||||
## JavaScript Engine vs Browser Environment
|
||||
|
||||
This is the most important conceptual split in browser JavaScript.
|
||||
|
||||
### The JavaScript Engine
|
||||
|
||||
The engine is responsible for the language itself:
|
||||
|
||||
- parsing source code
|
||||
- producing internal representations such as syntax trees and bytecode
|
||||
- managing execution contexts and the call stack
|
||||
- allocating objects and garbage collecting memory
|
||||
- optimizing hot code paths
|
||||
|
||||
Examples:
|
||||
|
||||
- Chrome and Edge use V8
|
||||
- Firefox uses SpiderMonkey
|
||||
- Safari uses JavaScriptCore
|
||||
|
||||
If an interview asks about Chrome specifically, mentioning V8 is useful. If the discussion is broader, focus on the engine role rather than vendor trivia.
|
||||
|
||||
### The Browser Environment
|
||||
|
||||
The browser environment is responsible for everything that makes web applications interactive:
|
||||
|
||||
- the DOM API
|
||||
- timers like `setTimeout`
|
||||
- networking through `fetch`, `XMLHttpRequest`, WebSocket, and more
|
||||
- event handling for clicks, keyboard input, scrolling, and navigation
|
||||
- rendering pipeline integration
|
||||
- storage such as cookies, `localStorage`, `sessionStorage`, and IndexedDB
|
||||
|
||||
These APIs are not part of ECMAScript itself. They are host-provided capabilities.
|
||||
|
||||
### A Strong Interview Answer
|
||||
|
||||
If asked, "Is `fetch` part of JavaScript?" the best short answer is:
|
||||
|
||||
"No. `fetch` is not part of the ECMAScript language specification. It is provided by the host environment, such as a browser or modern Node.js runtime."
|
||||
|
||||
That answer shows you understand the layering.
|
||||
|
||||
## High-Level Browser Architecture
|
||||
|
||||
Real browsers are multi-process and heavily optimized, but for frontend reasoning a high-level model is enough.
|
||||
|
||||
```mermaid
flowchart LR
    A[HTML CSS JS assets] --> B[Browser process]
    B --> C[Renderer process for tab]
    C --> D[HTML parser]
    C --> E[Style engine]
    C --> F[JavaScript engine]
    C --> G[DOM and render data]
    C --> H[Event system]
    C --> I[Scheduler]
    B --> J[Network stack]
    B --> K[Storage and cache]
    G --> L[Layout paint composite]
    H --> F
    F --> G
    J --> C
    K --> C
```
|
||||
|
||||
This diagram hides many details, but it captures the engineering truth that matters most: JavaScript is only one subsystem in a browser page.
|
||||
|
||||
## Chrome, V8, and Blink at a Useful Level
|
||||
|
||||
Interview preparation often suffers from too much shallow browser jargon. You do not need vendor-specific internals for every interview, but you do need a credible mental model.
|
||||
|
||||
### V8
|
||||
|
||||
V8 is Chrome's JavaScript engine. At a high level it:
|
||||
|
||||
- parses JavaScript source
|
||||
- generates bytecode
|
||||
- runs code through an interpreter first
|
||||
- optimizes hot code paths with a compiler
|
||||
- performs garbage collection
|
||||
|
||||
The broad performance idea is simple: browsers do not want to spend too long optimizing cold code that runs once, but they also do not want hot UI logic to remain slow forever. So they often start quickly and optimize as usage patterns become clear.
|
||||
|
||||
### Blink
|
||||
|
||||
Blink is Chrome's rendering engine. At a useful high level it handles:
|
||||
|
||||
- parsing HTML into the DOM
|
||||
- parsing CSS into style structures
|
||||
- calculating layout
|
||||
- painting pixels
|
||||
- coordinating rendering updates with the rest of the browser
|
||||
|
||||
From a frontend engineer's point of view, a lot of performance work is really about respecting how the rendering engine wants to work. If you force layout repeatedly, mutate the DOM too often, or block the main thread, you are not just writing "slow JavaScript." You are fighting Blink's scheduling and rendering pipeline.
|
||||
|
||||
### WebKit Concepts at a High Level
|
||||
|
||||
Safari uses JavaScriptCore as its JavaScript engine and WebKit as its rendering engine. Even if you mostly target Chromium browsers, it is worth understanding that there is no single universal implementation. Standards define behavior, but engine teams make different tradeoffs in optimization, scheduling, and edge-case behavior.
|
||||
|
||||
That is one reason real-world engineering must care about standards and cross-browser testing rather than assuming that Chrome behavior alone defines the web.
|
||||
|
||||
## How JavaScript Gets Embedded Into a Page
|
||||
|
||||
The browser does not magically know when or how to execute your script. Script loading is part of document parsing and page scheduling.
|
||||
|
||||
### Classic Script Tags
|
||||
|
||||
```html
<script src="app.js"></script>
```
|
||||
|
||||
With a classic script tag in the document body or head, the browser typically:
|
||||
|
||||
1. parses HTML until it encounters the script
|
||||
2. pauses parsing
|
||||
3. fetches the script if needed
|
||||
4. executes the script
|
||||
5. resumes HTML parsing
|
||||
|
||||
That parser-blocking behavior is one reason careless script placement can slow page startup.
|
||||
|
||||
### `defer`
|
||||
|
||||
```html
<script defer src="app.js"></script>
```
|
||||
|
||||
With `defer`, the browser can continue parsing HTML while the script is fetched, and execution happens after the document has been parsed, before `DOMContentLoaded` fires.
|
||||
|
||||
`defer` is usually what you want for page-level scripts that depend on the DOM being present.
|
||||
|
||||
### `async`
|
||||
|
||||
```html
<script async src="analytics.js"></script>
```
|
||||
|
||||
With `async`, the browser fetches in parallel and executes as soon as the script is ready, independent of document parsing order. That makes it good for independent scripts, but dangerous for scripts that depend on other scripts or on predictable execution order.
|
||||
|
||||
### ES Modules
|
||||
|
||||
```html
<script type="module" src="main.js"></script>
```
|
||||
|
||||
Module scripts behave more like deferred scripts by default and add better scoping and import/export semantics. They avoid many historical problems of script globals colliding with one another.
|
||||
|
||||
### Script Loading Timeline
|
||||
|
||||
```mermaid
sequenceDiagram
    participant Parser as HTML Parser
    participant Network as Network
    participant Engine as JS Engine

    Parser->>Parser: Parse HTML
    Parser->>Network: Discover script URL
    alt classic script
        Parser->>Parser: Pause parsing
        Network-->>Parser: Script bytes ready
        Parser->>Engine: Execute now
        Engine-->>Parser: Done
        Parser->>Parser: Resume parsing
    else defer or module
        Parser->>Parser: Continue parsing
        Network-->>Parser: Script bytes ready
        Parser->>Engine: Execute after parse completes
    else async
        Parser->>Parser: Continue parsing until ready
        Network-->>Parser: Script bytes ready
        Parser->>Engine: Execute immediately when available
    end
```
|
||||
|
||||
### Why the Browser Works This Way
|
||||
|
||||
The browser wants to preserve correctness first and optimize second.
|
||||
|
||||
- Classic scripts were designed in an era when order and shared globals were the normal pattern.
|
||||
- `defer` lets the browser keep parsing without breaking predictable order.
|
||||
- `async` prioritizes early independent execution when order does not matter.
|
||||
- modules modernize the system with better dependency management and isolation.
|
||||
|
||||
These are tradeoffs between compatibility, performance, and predictability.
|
||||
|
||||
## Web APIs: The Browser's Power Layer
|
||||
|
||||
Once the script is running, it can call APIs that the engine alone does not provide.
|
||||
|
||||
### DOM APIs
|
||||
|
||||
These let JavaScript inspect and change the document:
|
||||
|
||||
- `document.querySelector`
|
||||
- `element.append`
|
||||
- `element.classList.add`
|
||||
- `addEventListener`
|
||||
|
||||
DOM APIs connect JavaScript to the visible page. Without them, JavaScript in the browser would still be a language, but it would not be able to build a UI.
|
||||
|
||||
### Timer APIs
|
||||
|
||||
`setTimeout` and `setInterval` are browser scheduling tools, not language keywords. They register work with the host environment, which later re-enters JavaScript when the scheduled time has elapsed and the event loop allows it.
|
||||
|
||||
Important nuance: `setTimeout(fn, 0)` does not mean "run immediately." It means "schedule this as a future task after the current synchronous work and any higher-priority scheduled work have had their turn."
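
A short sketch of that nuance. The `log` array is just a recording device for the example; the callback is registered with a delay of 0 but still cannot run until the current synchronous code has finished.

```js
const log = [];

log.push("before");
setTimeout(() => log.push("timeout callback"), 0);
log.push("after");

// Still inside the same synchronous run: the callback has not executed.
console.log(log); // [ 'before', 'after' ]

// A later task sees the callback's effect.
setTimeout(() => console.log(log.length), 20); // 3
```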
|
||||
|
||||
### Networking APIs
|
||||
|
||||
`fetch` lets JavaScript request resources without forcing full-page navigation.
|
||||
|
||||
This capability is what made rich single-page applications practical. Before asynchronous in-page data fetching, much of web interaction meant whole-page reloads.
|
||||
|
||||
### Observer and Browser Integration APIs
|
||||
|
||||
Modern browsers expose APIs that let code react efficiently to browser-managed events:
|
||||
|
||||
- `requestAnimationFrame`
|
||||
- `IntersectionObserver`
|
||||
- `ResizeObserver`
|
||||
- `MutationObserver`
|
||||
- `AbortController`
|
||||
|
||||
These exist because the browser knows things your code cannot infer efficiently on its own. Instead of polling, you often get better correctness and performance by letting the browser notify you.
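
Most of these APIs need a page to observe, but `AbortController` also exists as a global in recent Node.js versions, so it can illustrate the notification style on its own. In the sketch below the `notes` array is an assumption added for the example; in real code the same `signal` is typically passed to `fetch` or to `addEventListener` so the host can cancel work for you.

```js
const controller = new AbortController();
const { signal } = controller;
const notes = [];

// Ask to be told about cancellation instead of checking a flag in a loop.
signal.addEventListener("abort", () => {
  notes.push("cancelled");
});

console.log(signal.aborted); // false

controller.abort();

console.log(signal.aborted); // true
console.log(notes); // [ 'cancelled' ]
```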
|
||||
|
||||
## The Runtime Model at a High Level
|
||||
|
||||
The detailed event loop comes in Chapter 3, but you should already hold the high-level picture.
|
||||
|
||||
```mermaid
flowchart LR
    A[Your JavaScript] --> B[Call stack]
    B --> C[Browser Web APIs]
    C --> D[Task queue]
    C --> E[Microtask queue triggers]
    D --> B
    E --> B
    B --> F[DOM updates]
    F --> G[Render pipeline]
```
|
||||
|
||||
The important part is not memorizing arrows. The important part is understanding responsibility boundaries:
|
||||
|
||||
- the engine executes JavaScript on the call stack
|
||||
- the browser owns host features like timers, networking, and DOM event sources
|
||||
- the browser decides when queued callbacks are allowed back onto the stack
|
||||
- rendering is coordinated with this scheduling model rather than happening after every line of code
|
||||
|
||||
This is why frontend debugging often requires reasoning across layers, not just staring at JavaScript syntax.
|
||||
|
||||
## The Global Object in Browsers
|
||||
|
||||
In browser scripts, the global object is historically `window`. Many browser globals live there:
|
||||
|
||||
- `window.document`
|
||||
- `window.setTimeout`
|
||||
- `window.fetch`
|
||||
- `window.localStorage`
|
||||
|
||||
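In global script code, top-level `var` and function declarations create properties on `window`, which is one reason legacy browser JavaScript was prone to name collisions. Top-level `let` and `const` in classic scripts are still shared global bindings, but they do not become `window` properties.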
|
||||
|
||||
ES modules improve this story because top-level declarations in modules do not leak the same way into the global object.
|
||||
|
||||
## Browser JavaScript vs Node.js
|
||||
|
||||
Interviewers often use this comparison to test whether you understand what belongs to the language and what belongs to the host.
|
||||
|
||||
| Topic | Browser | Node.js |
| --- | --- | --- |
| Main purpose | UI, documents, user interaction | Servers, tooling, scripts |
| DOM access | Yes | No |
| File system access | No direct user-file access by default | Yes |
| Networking focus | Page requests, subresources, APIs, sockets | Servers, APIs, CLI tools, sockets |
| Global object | `window`, or `self` in workers (`globalThis` works in both) | `global` (also reachable as `globalThis`) |
| Event loop flavor | Oriented around page, rendering, input | Oriented around I/O and server tasks |
| Rendering pipeline | Central concern | Usually none |
|
||||
|
||||
### Same Language, Different Constraints
|
||||
|
||||
Both environments run JavaScript, but they optimize for different jobs.
|
||||
|
||||
The browser is security-sensitive and user-facing:
|
||||
|
||||
- pages from different origins must be isolated
|
||||
- direct disk access is restricted
|
||||
- rendering smoothness matters
|
||||
- input latency matters
|
||||
|
||||
Node.js is server- and tooling-oriented:
|
||||
|
||||
- file system access is normal
|
||||
- network server APIs are normal
|
||||
- there is no DOM
|
||||
- rendering is not part of the runtime model
|
||||
|
||||
So when a candidate says, "JavaScript can do X," a strong interviewer may immediately ask, "In which environment?"
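
One way to make the "which environment?" question concrete is to probe for host features at runtime. The `describeHost` helper below is a hypothetical sketch, not a recommended detection library; its result depends on where you run it, which is exactly the point.

```js
// Hypothetical helper: reports which host-provided features are present.
function describeHost() {
  return {
    hasDOM: typeof document !== "undefined",
    hasWindow: typeof window !== "undefined",
    hasNodeProcess:
      typeof process !== "undefined" &&
      Boolean(process.versions && process.versions.node),
    hasTimers: typeof setTimeout === "function",
    hasFetch: typeof fetch === "function",
  };
}

// Language-level built-ins are defined by ECMAScript, so they exist in every host.
const languageFeatures = [typeof Promise, typeof Map, typeof JSON];

console.log(describeHost());
console.log(languageFeatures); // [ 'function', 'function', 'object' ]
```

In a browser tab `hasDOM` and `hasWindow` are true; in Node.js they are false while `hasNodeProcess` is true. Timers appear in both, even though each host implements them itself.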
|
||||
|
||||
## Web Workers: A Useful Boundary Case
|
||||
|
||||
Browser JavaScript is often described as single-threaded, but that statement needs precision.
|
||||
|
||||
The main page's synchronous JavaScript execution happens on a single call stack on the main thread. But browsers also provide Web Workers, which let you run JavaScript in separate worker contexts without direct DOM access.
|
||||
|
||||
That tells you something fundamental about browser design:
|
||||
|
||||
- DOM and rendering stay centralized and carefully controlled
|
||||
- CPU-heavy work can be moved away from the main thread when needed
|
||||
- communication happens through message passing rather than direct shared access by default
|
||||
|
||||
That model protects rendering consistency and reduces a whole class of race conditions in UI code.
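
Worker messages are copied with the structured clone algorithm rather than shared. The global `structuredClone` function exposes the same algorithm in modern browsers and recent Node.js versions, so the sketch below uses it as a stand-in for what `postMessage` does to a payload. No actual worker is created here, and the object shapes are made up for the example.

```js
const original = { id: 1, tags: ["a", "b"], createdAt: new Date(0) };

// Roughly what the receiving side of a message would get: a deep copy.
const received = structuredClone(original);

received.tags.push("c");

console.log(original.tags.length); // 2, the sender's data is untouched
console.log(received.tags.length); // 3
console.log(received.createdAt instanceof Date); // true

// Functions cannot be cloned, so behavior cannot be sent, only data.
let cloneError = null;
try {
  structuredClone({ run() {} });
} catch (err) {
  cloneError = err;
}
console.log(cloneError !== null); // true
```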
|
||||
|
||||
## How Real Applications Use the Browser Runtime
|
||||
|
||||
A modern React or Vue app is still just browser JavaScript with a structured architecture on top.
|
||||
|
||||
When the app starts, it typically:
|
||||
|
||||
1. loads script bundles or modules
|
||||
2. creates application state
|
||||
3. attaches event listeners
|
||||
4. reads routing information from the URL
|
||||
5. fetches data from APIs
|
||||
6. updates the DOM, often through a framework abstraction
|
||||
7. continues reacting to user events and network responses
|
||||
|
||||
Nothing about a framework escapes the browser runtime. It only organizes it.
|
||||
|
||||
That is why framework expertise is much more durable when it is built on browser fundamentals. If you know where the event loop, DOM, rendering pipeline, and network layer sit, you can reason about almost any frontend stack.
|
||||
|
||||
## Interview-Ready Summary
|
||||
|
||||
- The JavaScript engine handles the language: parsing, execution, memory management, and optimization.
|
||||
- The browser environment provides host APIs such as the DOM, timers, networking, storage, and events.
|
||||
- `fetch`, `setTimeout`, and DOM methods are browser-provided APIs, not core ECMAScript language features.
|
||||
- Script loading strategy matters because classic scripts can block parsing, while `defer`, `async`, and modules make different performance and ordering tradeoffs.
|
||||
- In Chrome, V8 handles JavaScript execution while Blink handles document and rendering behavior at a high level.
|
||||
- Browser JavaScript and Node.js share the language but run inside different host environments with different capabilities and constraints.
|
||||
|
||||
## What to Read Next
|
||||
|
||||
Continue with [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md). That chapter explains how browser documents are represented, how callbacks are actually scheduled, and why DOM changes and rendering cost what they cost.
|
||||
@@ -0,0 +1,446 @@
|
||||
# 03. DOM, Event Loop, and Rendering
|
||||
|
||||
This chapter is where browser behavior becomes concrete. Up to now, the story has been split between the JavaScript language and the browser host environment. Here those pieces meet.
|
||||
|
||||
When a frontend engineer says, "The UI is janky," or "The callback order is weird," or "This DOM update is expensive," they are talking about the interaction between three systems:
|
||||
|
||||
- the DOM and style data structures
|
||||
- the event loop and scheduling model
|
||||
- the rendering pipeline that turns document state into pixels
|
||||
|
||||
This chapter is central for both interviews and production debugging because it explains why JavaScript timing and browser painting behave the way they do.
|
||||
|
||||
It connects directly to:
|
||||
|
||||
- [02-javascript-in-the-browser.md](./02-javascript-in-the-browser.md), which introduced the browser runtime and host APIs
|
||||
- [04-networking-storage-security.md](./04-networking-storage-security.md), where fetch and browser security policies rely on the same scheduling model
|
||||
- [05-real-world-architecture-patterns.md](./05-real-world-architecture-patterns.md), where frameworks exploit these mechanics for app updates and performance
|
||||
|
||||
## The DOM: JavaScript's In-Memory View of the Document
|
||||
|
||||
The DOM, or Document Object Model, is the browser's in-memory representation of the HTML document as a tree of nodes.
|
||||
|
||||
That sentence is correct but too shallow on its own. The deeper idea is that the browser needs a structured, mutable object graph that:
|
||||
|
||||
- preserves document hierarchy
|
||||
- can be queried and modified by code
|
||||
- can participate in style calculation and layout
|
||||
- can dispatch events along parent-child relationships
|
||||
|
||||
HTML text alone cannot do that. A tree of objects can.
|
||||
|
||||
### Example HTML
|
||||
|
||||
```html
<body>
  <main id="app">
    <h1>Dashboard</h1>
    <button>Refresh</button>
  </main>
</body>
```
|
||||
|
||||
### Simplified DOM Tree
|
||||
|
||||
```mermaid
graph TD
    A[document]
    A --> B[html]
    B --> C[body]
    C --> D[main#app]
    D --> E[h1]
    E --> F[text: Dashboard]
    D --> G[button]
    G --> H[text: Refresh]
```
|
||||
|
||||
### Why a Tree Structure Matters
|
||||
|
||||
The DOM is a tree because documents are nested. That gives the browser a natural way to answer questions like:
|
||||
|
||||
- which element contains which other element
|
||||
- what styles may inherit downwards
|
||||
- how events should travel during capture and bubble phases
|
||||
- what needs to be recomputed when a subtree changes
|
||||
|
||||
The tree shape is not arbitrary. It is the data model that makes layout, events, and selectors possible.
|
||||
|
||||
### DOM Nodes Are Live Objects
|
||||
|
||||
DOM nodes are not snapshots. They are live objects managed by the browser. When you do this:
|
||||
|
||||
```js
const button = document.querySelector("button");
button.textContent = "Loading...";
```
|
||||
|
||||
you are mutating browser-managed document state. That may later trigger style recalculation, layout changes, paint work, accessibility tree updates, and more.
|
||||
|
||||
This is why DOM mutations are more expensive than changing a local JavaScript variable.
|
||||
|
||||
## The CSSOM: Style Information as a Structured Graph
|
||||
|
||||
The browser does not only need document structure. It also needs style information in a machine-friendly form.
|
||||
|
||||
The CSSOM, or CSS Object Model, is the structured representation of CSS rules and resolved style relationships.
|
||||
|
||||
At a high level:
|
||||
|
||||
- HTML becomes the DOM
|
||||
- CSS becomes style rule structures, often referred to as the CSSOM in learning material
|
||||
- the browser combines relevant document and style information to determine how elements should look
|
||||
|
||||
### Why the Browser Needs Both DOM and Style Data
|
||||
|
||||
The browser cannot draw from HTML alone because HTML says what exists, not exactly how it should appear. CSS cannot render on its own either because selectors need actual elements to match against.
|
||||
|
||||
Rendering requires both structure and style.
|
||||
|
||||
```mermaid
flowchart LR
    A[HTML bytes] --> B[DOM tree]
    C[CSS bytes] --> D[Style rules and CSSOM]
    B --> E[Style calculation]
    D --> E
    E --> F[Render tree]
```
|
||||
|
||||
### The Render Tree
|
||||
|
||||
The render tree is a browser-internal structure derived from document structure and style information. It focuses on what actually needs to be rendered.
|
||||
|
||||
Important nuance:
|
||||
|
||||
- not every DOM node necessarily appears as a visible render object
|
||||
- hidden elements may not participate the same way
|
||||
- pseudo-elements and generated content complicate the picture
|
||||
|
||||
For interview purposes, it is enough to know that the render tree is closer to "what needs drawing" than the raw DOM is.
|
||||
|
||||
## Rendering Pipeline: From Document State to Pixels
|
||||
|
||||
The rendering pipeline is the sequence of work the browser performs to update what the user sees.
|
||||
|
||||
At a high level, a useful model is:
|
||||
|
||||
1. parse HTML into DOM
|
||||
2. parse CSS into style structures
|
||||
3. calculate styles
|
||||
4. compute layout
|
||||
5. paint pixels
|
||||
6. composite layers to the screen
|
||||
|
||||
```mermaid
flowchart LR
    A[DOM change or initial load] --> B[Style recalculation]
    B --> C[Layout]
    C --> D[Paint]
    D --> E[Composite]
    E --> F[Screen update]
```
|
||||
|
||||
### Style Recalculation
|
||||
|
||||
The browser determines which CSS rules apply to which elements and resolves computed styles.
|
||||
|
||||
This can become expensive if:
|
||||
|
||||
- the DOM is very large
|
||||
- selectors are complex
|
||||
- many elements are affected by a class or style change
|
||||
|
||||
### Layout
|
||||
|
||||
Layout determines geometry:
|
||||
|
||||
- element sizes
|
||||
- positions
|
||||
- line wrapping
|
||||
- overflow effects
|
||||
|
||||
This stage answers questions like, "Where is the button?" and "How wide is the card after styles and available space are applied?"
|
||||
|
||||
### Paint
|
||||
|
||||
Paint turns layout and visual styles into draw commands: backgrounds, text, shadows, borders, images, and so on.
|
||||
|
||||
### Composite
|
||||
|
||||
Compositing combines painted layers into the final frame sent to the display.
|
||||
|
||||
This is one reason some CSS changes are much cheaper than others. Changes that can be handled at the compositor level may avoid full layout and paint work.
|
||||
|
||||
## Reflow vs Repaint
|
||||
|
||||
These are common interview terms, though different browser documentation may use language like "layout" and "paint" more directly.
|
||||
|
||||
### Reflow
|
||||
|
||||
Reflow usually refers to layout recalculation when geometry may have changed.
|
||||
|
||||
Examples:
|
||||
|
||||
- changing width or height
|
||||
- adding or removing DOM nodes
|
||||
- changing text content in a way that affects size
|
||||
- reading and writing layout-sensitive properties repeatedly
|
||||
|
||||
Reflow is often expensive because geometry changes can ripple through other elements.
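
The last example in that list, interleaving layout reads and writes, is easier to see with a toy model. The code below does not touch a real DOM: `makeFakeElement`, `layoutDirty`, and `forcedLayouts` are all simulated stand-ins invented for this sketch, under the assumption that a style write marks layout as stale and a geometry read on stale layout forces a synchronous recalculation.

```js
// Simulated document state (an assumption for illustration, not a browser API).
let layoutDirty = false;
let forcedLayouts = 0;

function makeFakeElement(initialWidth) {
  let width = initialWidth;
  return {
    // Reading geometry while layout is stale forces a layout pass.
    get offsetWidth() {
      if (layoutDirty) {
        forcedLayouts += 1;
        layoutDirty = false;
      }
      return width;
    },
    // Writing a size invalidates layout for the whole simulated document.
    set width(px) {
      width = px;
      layoutDirty = true;
    },
  };
}

// Read, write, read, write: each read after a write forces layout.
function doubleInterleaved(elements) {
  for (const el of elements) {
    el.width = el.offsetWidth * 2;
  }
}

// All reads first, then all writes: no read ever sees stale layout.
function doubleBatched(elements) {
  const widths = elements.map((el) => el.offsetWidth);
  elements.forEach((el, i) => {
    el.width = widths[i] * 2;
  });
}

function countForcedLayouts(update) {
  layoutDirty = false;
  forcedLayouts = 0;
  update([makeFakeElement(100), makeFakeElement(200), makeFakeElement(300)]);
  return forcedLayouts;
}

const interleaved = countForcedLayouts(doubleInterleaved);
const batched = countForcedLayouts(doubleBatched);
console.log({ interleaved, batched }); // { interleaved: 2, batched: 0 }
```

In a real page the batched version still pays for one layout when the browser next renders, but it avoids the extra synchronous passes in the middle of the script.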
|
||||
|
||||
### Repaint
|
||||
|
||||
Repaint refers to visual updates where geometry stays the same but appearance changes.
|
||||
|
||||
Examples:
|
||||
|
||||
- changing background color
|
||||
- changing text color
|
||||
- changing visibility in some cases
|
||||
|
||||
Repaint is often cheaper than reflow, but still not free.
|
||||
|
||||
### Why This Matters in Practice
|
||||
|
||||
If you change a property that affects layout on a large page, you may trigger broad recomputation. If you can express the same visual effect using transform or opacity, the browser may keep the work closer to compositing, which is usually cheaper.
|
||||
|
||||
This is why frontend performance advice often sounds very specific. It is not superstition. It is rooted in the rendering pipeline's cost model.
|
||||
|
||||
## The Event Loop: How JavaScript Work Gets Scheduled
|
||||
|
||||
JavaScript on the main thread runs synchronously on a call stack, but browsers also need to handle timers, network responses, user input, and Promise callbacks. The event loop is the coordination mechanism.
|
||||
|
||||
### Core Mental Model
|
||||
|
||||
Think of the browser main thread as a single cashier handling one customer at a time.
|
||||
|
||||
- the call stack is the cashier's current customer
|
||||
- tasks are customers waiting in the main line
|
||||
- microtasks are high-priority notes that must be cleared before the cashier takes the next main-line customer
|
||||
- rendering happens at suitable opportunities between tasks, not in the middle of arbitrary JavaScript execution
|
||||
|
||||
### Simplified Event Loop Cycle
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Run one task] --> B[Execute synchronous JS on call stack]
|
||||
B --> C{Call stack empty?}
|
||||
C -- No --> B
|
||||
C -- Yes --> D[Drain all microtasks]
|
||||
D --> E[Browser may render]
|
||||
E --> F[Take next task]
|
||||
F --> A
|
||||
```
|
||||
|
||||
### Tasks vs Microtasks
|
||||
|
||||
Useful everyday examples:
|
||||
|
||||
| Source | Category |
|
||||
| --- | --- |
|
||||
| `setTimeout` | task |
|
||||
| DOM events like `click` | task |
|
||||
| message events | task |
|
||||
| `Promise.then` / `catch` / `finally` | microtask |
|
||||
| `queueMicrotask` | microtask |
|
||||
| `MutationObserver` callbacks | microtask |
|
||||
|
||||
Important rule: after a task finishes and the call stack becomes empty, the browser drains the microtask queue before moving on to the next task.
|
||||
|
||||
### Why Microtasks Exist
|
||||
|
||||
Microtasks are a way to schedule follow-up work that should happen soon after the current JavaScript completes, before the browser continues normal task progression.
|
||||
|
||||
This is very useful for:
|
||||
|
||||
- Promise chaining semantics
|
||||
- batching follow-up logic
|
||||
- preserving invariants after synchronous code completes
|
||||
|
||||
The tradeoff is that excessive microtasks can starve rendering and delay other tasks.
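
The starvation effect can be sketched in a few lines that run in a browser console or in Node.js. The helper name `floodMicrotasks` and the count of 1000 are illustrative assumptions. A 0 ms timer is scheduled first, yet it cannot fire until every chained microtask has run.

```js
// Hypothetical demo: a timer (task) waits while microtasks keep re-queuing themselves.
function floodMicrotasks(total) {
  return new Promise((resolve) => {
    let microtasksRun = 0;

    // Scheduled first, but it is a task, so it waits for the microtask queue to empty.
    setTimeout(() => resolve(microtasksRun), 0);

    function step() {
      microtasksRun += 1;
      if (microtasksRun < total) {
        queueMicrotask(step); // each microtask queues another one
      }
    }
    queueMicrotask(step);
  });
}

floodMicrotasks(1000).then((seenByTimer) => {
  // The timer observes that all 1000 microtasks already ran.
  console.log(`microtasks completed before the timer ran: ${seenByTimer}`);
});
```

With a small count this is harmless. With an unbounded chain, the timer, input handling, and rendering would all be held back for as long as the chain keeps growing.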
|
||||
|
||||
## Promises vs `setTimeout`: Why the Order Surprises People
|
||||
|
||||
Consider this code:
|
||||
|
||||
```js
|
||||
console.log("start");
|
||||
|
||||
setTimeout(() => {
|
||||
console.log("timeout");
|
||||
}, 0);
|
||||
|
||||
Promise.resolve().then(() => {
|
||||
console.log("promise");
|
||||
});
|
||||
|
||||
console.log("end");
|
||||
```
|
||||
|
||||
Output:
|
||||
|
||||
```text
|
||||
start
|
||||
end
|
||||
promise
|
||||
timeout
|
||||
```
|
||||
|
||||
### What Actually Happened
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant JS as Call Stack
|
||||
participant Micro as Microtask Queue
|
||||
participant Task as Task Queue
|
||||
participant Browser as Browser Loop
|
||||
|
||||
JS->>JS: log start
|
||||
JS->>Task: schedule setTimeout callback
|
||||
JS->>Micro: schedule Promise reaction
|
||||
JS->>JS: log end
|
||||
JS-->>Browser: stack empty
|
||||
Browser->>Micro: drain microtasks
|
||||
Micro->>JS: run promise callback
|
||||
Browser->>Task: take next task
|
||||
Task->>JS: run timeout callback
|
||||
```
|
||||
|
||||
The key is that `setTimeout(..., 0)` means "put this in the task queue after at least the timeout threshold," not "run this before microtasks."
|
||||
|
||||
### Common Interview Mistake
|
||||
|
||||
People say, "Promises are faster than timers." That is not the right explanation.
|
||||
|
||||
The real explanation is scheduling category:
|
||||
|
||||
- Promise reactions are microtasks
|
||||
- timer callbacks are tasks
|
||||
- microtasks run before the browser takes the next task
|
||||
|
||||
That is a much stronger answer.
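
A slightly larger trace makes the scheduling categories easier to see. This is a sketch with illustrative labels, not part of the original example. It records the order in which callbacks run when a microtask is queued from inside a timer callback.

```js
// Records the order in which callbacks run; the labels are illustrative.
function traceOrder() {
  return new Promise((resolve) => {
    const order = [];

    setTimeout(() => {
      order.push("timeout 1");
      // A microtask queued inside a task runs before the next task starts.
      Promise.resolve().then(() => order.push("promise inside timeout 1"));
    }, 0);

    setTimeout(() => {
      order.push("timeout 2");
      resolve(order);
    }, 0);

    Promise.resolve().then(() => order.push("promise"));
    queueMicrotask(() => order.push("queueMicrotask"));
    order.push("sync");
  });
}

traceOrder().then((order) => console.log(order.join(" -> ")));
// sync -> promise -> queueMicrotask -> timeout 1 -> promise inside timeout 1 -> timeout 2
```

The microtask created inside the first timer callback runs before the second timer callback, because the microtask queue is drained after every task, not only after the initial script.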
|
||||
|
||||
## Rendering Does Not Happen After Every Statement
|
||||
|
||||
Beginners often imagine that every DOM change instantly appears on the screen. That would be far too expensive.
|
||||
|
||||
Instead, browsers batch work. If your JavaScript makes several DOM mutations in one turn of the event loop, the browser usually waits for a rendering opportunity rather than repainting after every line.
|
||||
|
||||
This is one reason framework batching works well. It aligns with how browsers already prefer to operate.
|
||||
|
||||
## `requestAnimationFrame`: Coordinating With Rendering
|
||||
|
||||
`requestAnimationFrame` asks the browser to run a callback before the next repaint.
|
||||
|
||||
That makes it a better place than `setTimeout` for visual animation work because it aligns your code with the browser's frame schedule.
|
||||
|
||||
### Why It Exists
|
||||
|
||||
Animations are about frame production, not merely time delays. `setTimeout` can schedule callbacks, but it does not naturally align with paint timing. `requestAnimationFrame` does.
|
||||
|
||||
If the display refreshes at around 60 frames per second, the browser has a budget of only about 16.7 ms per frame (1000 ms / 60) for scripting, style, layout, paint, and compositing. If your main-thread work overruns that budget, the user sees dropped frames or stutter.
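
A minimal animation loop might look like the sketch below. The names `slideRight`, `positionAt`, and `frameBudgetMs` are assumptions made for illustration, and the `raf` parameter exists only so the loop can be exercised outside a browser. Position is derived from the timestamp the browser passes to the callback, so a dropped frame skips ahead rather than slowing the animation down.

```js
// Frame budget at a given refresh rate, in milliseconds (about 16.7 at 60 Hz).
const frameBudgetMs = (refreshHz) => 1000 / refreshHz;

// Progress depends on elapsed time, not on how many frames were produced.
function positionAt(elapsedMs, durationMs, distancePx) {
  const progress = Math.min(elapsedMs / durationMs, 1);
  return progress * distancePx;
}

// `element`, `durationMs`, and `distancePx` are hypothetical parameters.
// `raf` defaults to the browser's requestAnimationFrame.
function slideRight(element, durationMs, distancePx, raf = globalThis.requestAnimationFrame) {
  let startTime = null;

  function onFrame(timestamp) {
    if (startTime === null) startTime = timestamp;
    const elapsed = timestamp - startTime;
    element.style.transform = `translateX(${positionAt(elapsed, durationMs, distancePx)}px)`;
    if (elapsed < durationMs) raf(onFrame); // request the next frame
  }

  raf(onFrame);
}
```

In a page this could be called as `slideRight(document.querySelector(".card"), 300, 200)`, where the `.card` selector is a placeholder.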
|
||||
|
||||
## Event Propagation in the DOM
|
||||
|
||||
The DOM tree is also the path along which many events travel.
|
||||
|
||||
For a typical event, the browser can move through:
|
||||
|
||||
1. capture phase, from outer ancestors down toward the target
|
||||
2. target phase, at the element itself
|
||||
3. bubble phase, from the target back upward
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[window/document] --> B[body]
|
||||
B --> C[div container]
|
||||
C --> D[button target]
|
||||
D --> E[bubble back to container]
|
||||
E --> F[bubble back to body]
|
||||
```
|
||||
|
||||
This propagation model is one reason event delegation is so effective. Instead of attaching listeners to every item in a large list, you can often attach one listener to a parent and inspect the event target.
|
||||
|
||||
That reduces listener count and handles dynamically added children naturally.
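
Event delegation can be sketched as follows. The `data-id` attribute, the `#todo-list` selector, and the `onSelect` callback are assumptions chosen for illustration. One listener on the parent serves every current and future item.

```js
// Builds a click handler that resolves the clicked list item from event.target.
function createListClickHandler(onSelect) {
  return function handleClick(event) {
    // closest() walks up from the clicked node to the nearest matching ancestor.
    const item = event.target.closest("[data-id]");
    if (item) onSelect(item.dataset.id);
  };
}

// Browser-only wiring, guarded so the snippet also loads outside a browser.
if (typeof document !== "undefined") {
  const list = document.querySelector("#todo-list"); // hypothetical element
  if (list) {
    list.addEventListener(
      "click",
      createListClickHandler((id) => console.log("selected", id))
    );
  }
}
```

Using `closest` rather than checking `event.target` directly matters when items contain nested elements such as icons or spans, since the click target is often a descendant of the item.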
|
||||
|
||||
## Forced Synchronous Layout: A Hidden Performance Trap
|
||||
|
||||
One of the easiest ways to create jank is to mix DOM writes and layout reads in the wrong order.
|
||||
|
||||
Example pattern:
|
||||
|
||||
```js
|
||||
box.style.width = "200px";
|
||||
const height = box.offsetHeight;
|
||||
```
|
||||
|
||||
After the style write, the browser may need updated layout information before it can answer `offsetHeight`. So your read can force the browser to flush pending style and layout work early.
|
||||
|
||||
If this happens repeatedly inside loops, you get layout thrashing.
|
||||
|
||||
### Better Strategy
|
||||
|
||||
- batch reads together
|
||||
- batch writes together
|
||||
- use `requestAnimationFrame` for visual updates when appropriate
|
||||
- avoid unnecessary layout-sensitive property access
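
The first two points can be sketched as a read phase followed by a write phase. The function name `halveWidths` and the halving itself are arbitrary choices for illustration, and `boxes` is assumed to be an array of elements.

```js
// Read phase first, write phase second: the reads share one layout computation
// instead of forcing a fresh one after every write.
function halveWidths(boxes) {
  const widths = boxes.map((box) => box.offsetWidth); // reads only
  boxes.forEach((box, i) => {
    box.style.width = `${widths[i] / 2}px`; // writes only
  });
}
```

The thrashing version would read `box.offsetWidth` and write `box.style.width` inside the same loop iteration, so each read after the first could force the browser to recompute layout for the preceding write.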
|
||||
|
||||
## Rendering Optimization Concepts That Matter in Real Apps
|
||||
|
||||
### Keep DOM Size Reasonable
|
||||
|
||||
Larger DOM trees increase the cost of style recalculation, layout, and some forms of event handling. Infinite feeds, tables, and dashboards often need virtualization for this reason.
|
||||
|
||||
### Prefer Transform and Opacity for Animation
|
||||
|
||||
These often avoid layout changes and may stay closer to compositing work, which is usually more efficient.
|
||||
|
||||
### Avoid Long Main-Thread Tasks
|
||||
|
||||
If JavaScript monopolizes the main thread, input responsiveness and rendering suffer. Break up heavy work, defer non-urgent work, or move computation to workers when appropriate.
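
One way to break up heavy work is to process a slice, then schedule the rest as a new task. The sketch below assumes a hypothetical `processInChunks` helper, and the default `chunkSize` is a placeholder rather than a recommended value.

```js
// Processes `items` in slices and yields to the event loop between slices,
// so input handling and rendering can run in the gaps.
function processInChunks(items, handleItem, chunkSize = 500) {
  return new Promise((resolve) => {
    let index = 0;

    function runChunk() {
      const end = Math.min(index + chunkSize, items.length);
      for (; index < end; index += 1) {
        handleItem(items[index]);
      }
      if (index < items.length) {
        setTimeout(runChunk, 0); // a new task: the browser may render before it runs
      } else {
        resolve();
      }
    }

    runChunk();
  });
}
```

The continuation is scheduled with `setTimeout` rather than a Promise on purpose. A microtask would run before the browser gets a chance to render, which defeats the goal of yielding.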
|
||||
|
||||
### Use Passive Listeners When Appropriate
|
||||
|
||||
For scroll and touch-sensitive interactions, passive listeners tell the browser your handler will not call `preventDefault()`, which can improve scrolling performance by reducing uncertainty.
|
||||
|
||||
### Use Event Delegation for Dynamic Interfaces
|
||||
|
||||
This reduces listener churn and exploits the DOM's event propagation model efficiently.
|
||||
|
||||
### Let the Browser Help You
|
||||
|
||||
Use the right browser primitives instead of manual polling:
|
||||
|
||||
- `requestAnimationFrame` for animation
|
||||
- `IntersectionObserver` for visibility-based loading
|
||||
- `ResizeObserver` for size observation
|
||||
- `AbortController` for cancelable async work
|
||||
|
||||
These APIs exist because the browser can provide better scheduling and lower overhead than ad hoc userland loops.
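
As one example from this list, `AbortController` gives async work a standard cancellation signal. The `wait` helper below is a hypothetical illustration of the pattern, not a built-in API.

```js
// A cancelable delay built on AbortController.
function wait(ms, signal) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const timer = setTimeout(resolve, ms);
    signal.addEventListener(
      "abort",
      () => {
        clearTimeout(timer); // stop the pending work
        reject(new Error("aborted"));
      },
      { once: true }
    );
  });
}

const controller = new AbortController();
wait(10000, controller.signal).catch((error) => console.log(error.message)); // logs "aborted"
controller.abort(); // cancels the delay immediately
```

The same `signal` object can be passed to `fetch(url, { signal })`, which lets one controller cancel a request and any related timers together.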
|
||||
|
||||
## How Frameworks Relate to This Chapter
|
||||
|
||||
Frameworks like React, Vue, and Svelte do not replace the DOM, event loop, or rendering pipeline. They manage how your application decides to update them.
|
||||
|
||||
For example:
|
||||
|
||||
- framework state changes eventually become DOM updates or platform view updates
|
||||
- Promise scheduling still uses browser microtasks
|
||||
- network requests still return through browser-controlled async flow
|
||||
- layout costs still depend on real DOM and CSS behavior
|
||||
|
||||
This is why performance debugging often drops below the framework layer. A component abstraction may be elegant, but if it triggers repeated layout invalidations or long scripting tasks, the browser still pays the bill.
|
||||
|
||||
## Interview-Ready Summary
|
||||
|
||||
- The DOM is a live tree of browser-managed document nodes that JavaScript can query and mutate.
|
||||
- CSS is represented in structured form and combined with the DOM to compute renderable output.
|
||||
- The rendering pipeline usually involves style calculation, layout, paint, and compositing.
|
||||
- Reflow or layout work is expensive because geometry changes can affect other elements.
|
||||
- Repaint updates appearance without necessarily changing geometry, but it still costs work.
|
||||
- The event loop coordinates tasks, microtasks, and rendering opportunities on the main thread.
|
||||
- Promise callbacks run as microtasks, so they run before the next task such as a timer callback.
|
||||
- `requestAnimationFrame` aligns animation work with the browser's repaint schedule.
|
||||
|
||||
## What to Read Next
|
||||
|
||||
Continue with [04-networking-storage-security.md](./04-networking-storage-security.md). That chapter explains how the browser makes network requests, stores data, and enforces origin-based security boundaries around your app.
|
||||
@@ -0,0 +1,448 @@
|
||||
# 04. Networking, Storage, and Security
|
||||
|
||||
At this point in the handbook, you know how JavaScript runs and how browsers render. The next question is: how does a browser app talk to the outside world, keep data around, and stay safe while doing it?
|
||||
|
||||
This chapter covers that boundary.
|
||||
|
||||
Networking, storage, and security are deeply connected in browsers. A request is not just "send bytes to a server." It is constrained by origin policy, cookies, credential rules, caching, and user safety. Storage is not just "save some data." It is tied to persistence, quota, privacy, and attack surface. Authentication is not just a backend concern because the browser decides which credentials are attached automatically and which APIs are exposed to JavaScript.
|
||||
|
||||
This file connects directly to:
|
||||
|
||||
- [02-javascript-in-the-browser.md](./02-javascript-in-the-browser.md), which introduced browser Web APIs such as `fetch` and storage
|
||||
- [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md), because fetch completion and storage use still re-enter JavaScript through the same scheduling model
|
||||
- [05-real-world-architecture-patterns.md](./05-real-world-architecture-patterns.md), where these primitives become data-layer and deployment decisions
|
||||
|
||||
## HTTP and HTTPS From the Browser's Point of View
|
||||
|
||||
Backend engineers often describe HTTP in server-centric terms: routes, handlers, payloads, proxies. Frontend engineers need a browser-centric model too.
|
||||
|
||||
From a browser page's perspective, HTTP is how it gets almost everything:
|
||||
|
||||
- the initial HTML document
|
||||
- JavaScript bundles or modules
|
||||
- CSS files
|
||||
- images, fonts, and media
|
||||
- API data for in-page updates
|
||||
|
||||
So HTTP is not a side system. It is the bloodstream of the web page.
|
||||
|
||||
### What the Browser Actually Cares About
|
||||
|
||||
When the browser makes or receives an HTTP request, it is not only tracking the URL and body. It also cares about:
|
||||
|
||||
- origin and security policy
|
||||
- request method and headers
|
||||
- cacheability
|
||||
- cookies and credentials
|
||||
- redirect behavior
|
||||
- content type and decoding
|
||||
- connection reuse and protocol version
|
||||
- whether the response is allowed to reach page JavaScript
|
||||
|
||||
That last point matters. A server can return bytes, but the browser still decides whether page code is allowed to see them under the web security model.
|
||||
|
||||
### Why HTTPS Matters So Much in Browsers
|
||||
|
||||
HTTPS is HTTP over TLS. The practical browser reason is not just privacy in the abstract. It is trust and integrity.
|
||||
|
||||
Without HTTPS, an attacker on the network path could:
|
||||
|
||||
- read traffic
|
||||
- modify scripts in transit
|
||||
- inject hostile content
|
||||
- steal cookies without proper protection
|
||||
|
||||
If an attacker can alter your JavaScript bundle in transit, they effectively own your page. That is why secure transport is foundational to browser security rather than a nice extra.
|
||||
|
||||
Modern browser features also increasingly assume secure contexts. Many APIs either require or strongly prefer HTTPS because the browser only wants to expose powerful capabilities in a transport it can trust more.
|
||||
|
||||
## Request Types the Browser Makes
|
||||
|
||||
From the browser's perspective, not all requests are the same.
|
||||
|
||||
### Navigation Requests
|
||||
|
||||
These load a new document. Typing a URL in the address bar or clicking a normal link usually triggers navigation.
|
||||
|
||||
### Subresource Requests
|
||||
|
||||
These fetch supporting assets for a page:
|
||||
|
||||
- scripts
|
||||
- stylesheets
|
||||
- images
|
||||
- fonts
|
||||
|
||||
### Programmatic Requests
|
||||
|
||||
These are created by JavaScript using APIs such as `fetch` or `XMLHttpRequest`. They are central to single-page apps because they let the UI update without a full-page reload.
|
||||
|
||||
This distinction matters because browsers handle navigation, rendering, and security policies differently across these request categories.
|
||||
|
||||
## The Fetch Lifecycle
|
||||
|
||||
`fetch` looks simple in code, but a lot happens between `await fetch(...)` and your response handler.
|
||||
|
||||
```js
|
||||
const response = await fetch("/api/products", {
|
||||
method: "GET",
|
||||
credentials: "include"
|
||||
});
|
||||
|
||||
const data = await response.json();
|
||||
```
|
||||
|
||||
### High-Level Fetch Request Lifecycle
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[JavaScript calls fetch] --> B[Browser validates request options]
|
||||
B --> C[Check service worker and caches]
|
||||
C --> D[Apply origin and CORS rules]
|
||||
D --> E[Reuse or create network connection]
|
||||
E --> F[Send HTTP request]
|
||||
F --> G[Receive response headers and body]
|
||||
G --> H[Resolve fetch promise with Response object]
|
||||
H --> I[JavaScript consumes body stream or parsed data]
|
||||
```
|
||||
|
||||
### Step-by-Step Intuition
|
||||
|
||||
1. JavaScript creates a request description.
|
||||
2. The browser checks request mode, credentials settings, headers, and policy restrictions.
|
||||
3. A service worker may intercept the request if one is active.
|
||||
4. The browser may consult caches before going to the network.
|
||||
5. Security rules such as same-origin policy and CORS are applied.
|
||||
6. The browser uses an existing connection or establishes a new one.
|
||||
7. Response headers arrive, then body data streams in.
|
||||
8. The fetch promise resolves as soon as the response headers are available, which can be before the body has finished arriving.
|
||||
9. Later body-reading methods such as `response.json()` perform additional async work.
|
||||
|
||||
### Important Subtlety: `fetch` Resolves on HTTP Errors Too
|
||||
|
||||
`fetch` rejects on network-level failure, abort, or some policy failures. But a normal HTTP error such as `404` or `500` still produces a response object.
|
||||
|
||||
That is why this is common:
|
||||
|
||||
```js
|
||||
const response = await fetch("/api/user");
|
||||
|
||||
if (!response.ok) {
|
||||
throw new Error(`Request failed: ${response.status}`);
|
||||
}
|
||||
```
|
||||
|
||||
This behavior makes sense once you realize that HTTP error status is still a valid HTTP response, not a transport failure.
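
A small wrapper can make the two failure kinds explicit. The name `fetchJson` and the `fetchImpl` parameter are assumptions for this sketch. `fetchImpl` defaults to the global `fetch` and is a parameter only so the helper can be exercised with a stub.

```js
// Wraps fetch so HTTP error statuses become rejections, like network failures.
async function fetchJson(url, options = {}, fetchImpl = globalThis.fetch) {
  let response;
  try {
    response = await fetchImpl(url, options);
  } catch (cause) {
    // Network failure, abort, or a blocked request: no HTTP response exists.
    throw new Error(`Network error for ${url}`, { cause });
  }
  if (!response.ok) {
    // The server answered, but with a status outside 200-299.
    throw new Error(`HTTP ${response.status} for ${url}`);
  }
  return response.json();
}
```

Callers can then use a single `try`/`catch` around `await fetchJson("/api/user")` and still tell the cases apart by the error message or its `cause`.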
|
||||
|
||||
### Streams and Why the Browser Works This Way
|
||||
|
||||
Response bodies are often stream-oriented because browsers do not want to wait for the entire payload to be fully buffered before exposing data. Streaming improves memory efficiency and latency for large responses.
|
||||
|
||||
This becomes especially important for:
|
||||
|
||||
- large downloads
|
||||
- video and media delivery
|
||||
- streaming server responses
|
||||
- progressively rendered HTML or data pipelines
|
||||
|
||||
## Same-Origin Policy: The Browser's Core Isolation Rule
|
||||
|
||||
The same-origin policy is one of the web's foundational security rules.
|
||||
|
||||
At a useful level, an origin is a combination of:
|
||||
|
||||
- scheme, such as `http` or `https`
|
||||
- host, such as `app.example.com`
|
||||
- port, such as `443`
|
||||
|
||||
If two resources differ in one of those, they are different origins.
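
The comparison can be tried directly with the standard `URL` API. The helper name `isSameOrigin` is an assumption, and the URLs are placeholders.

```js
// Two URLs are same-origin when scheme, host, and port all match.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(isSameOrigin("https://app.example.com/a", "https://app.example.com:443/b")); // true: 443 is the default https port
console.log(isSameOrigin("https://app.example.com", "http://app.example.com"));          // false: scheme differs
console.log(isSameOrigin("https://app.example.com", "https://api.example.com"));         // false: host differs
console.log(isSameOrigin("https://app.example.com", "https://app.example.com:8443"));    // false: port differs
```

Note that the path plays no role: two pages on the same scheme, host, and port share an origin regardless of their URLs' paths.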
|
||||
|
||||
### Why This Policy Exists
|
||||
|
||||
Imagine you are logged into your bank in one browser tab and visit a malicious site in another. Without origin isolation, that malicious page could simply read sensitive data from the bank site using your browser's authenticated session.
|
||||
|
||||
The same-origin policy exists to prevent one origin's page JavaScript from freely reading data from another origin.
|
||||
|
||||
This is one of the most important "why it works this way" topics in browser engineering. The browser is not trying to make developers suffer. It is acting as a guard between mutually untrusting websites loaded by the same user.
|
||||
|
||||
## CORS: Controlled Relaxation of Same-Origin Restrictions
|
||||
|
||||
CORS, or Cross-Origin Resource Sharing, is a browser mechanism that lets servers explicitly say which cross-origin requests are allowed.
|
||||
|
||||
### Deep Intuition
|
||||
|
||||
Same-origin policy says: "pages cannot freely read responses from other origins."
|
||||
|
||||
CORS adds: "unless the target server explicitly opts in under defined rules."
|
||||
|
||||
### What CORS Is and Is Not
|
||||
|
||||
- CORS is enforced by browsers.
|
||||
- CORS is mainly about whether frontend JavaScript can access the response.
|
||||
- CORS does not stop server-to-server requests because there is no browser enforcing the rule there.
|
||||
- CORS is not an authentication mechanism.
|
||||
|
||||
### Simple Request vs Preflighted Request
|
||||
|
||||
Some cross-origin requests can be sent directly, and the browser checks whether the response includes the right CORS headers.
|
||||
|
||||
Other requests require a preflight: an `OPTIONS` request sent first to ask the server what is allowed.
|
||||
|
||||
Triggers for preflight commonly include:
|
||||
|
||||
- non-simple methods such as `PUT` or `DELETE`
|
||||
- certain custom headers
|
||||
- certain content types outside the simple set
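
These triggers can be approximated in code. The sketch below is a deliberate simplification: the real rules in the Fetch standard include further cases, such as limits on header values, and the function name `needsPreflight` is an assumption.

```js
// Simplified approximation of the "needs preflight?" decision.
const SIMPLE_METHODS = new Set(["GET", "HEAD", "POST"]);
const SIMPLE_HEADERS = new Set(["accept", "accept-language", "content-language", "content-type"]);
const SIMPLE_CONTENT_TYPES = new Set([
  "application/x-www-form-urlencoded",
  "multipart/form-data",
  "text/plain",
]);

function needsPreflight({ method = "GET", headers = {} }) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true; // e.g. PUT, DELETE

  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (!SIMPLE_HEADERS.has(lower)) return true; // custom header such as Authorization
    if (lower === "content-type") {
      const mimeType = String(value).split(";")[0].trim().toLowerCase();
      if (!SIMPLE_CONTENT_TYPES.has(mimeType)) return true; // e.g. application/json
    }
  }
  return false;
}

console.log(needsPreflight({ method: "GET" })); // false
console.log(needsPreflight({ method: "POST", headers: { "Content-Type": "application/json" } })); // true
```

This is why a JSON `POST` to another origin usually shows two entries in the network panel: the `OPTIONS` preflight and then the actual request.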
|
||||
|
||||
### CORS Flow
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant App as Browser App
|
||||
participant Browser as Browser Security Layer
|
||||
participant API as API Server
|
||||
|
||||
App->>Browser: fetch cross-origin resource
|
||||
alt preflight required
|
||||
Browser->>API: OPTIONS preflight
|
||||
API-->>Browser: Access-Control-* headers
|
||||
end
|
||||
Browser->>API: actual request
|
||||
API-->>Browser: response with CORS headers
|
||||
Browser-->>App: expose response only if policy allows
|
||||
```
|
||||
|
||||
### Credentials and CORS
|
||||
|
||||
Cross-origin requests become trickier when cookies or other credentials are involved.
|
||||
|
||||
If credentials are included:
|
||||
|
||||
- the browser requires tighter CORS rules
|
||||
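- the wildcard `Access-Control-Allow-Origin: *` is not accepted for credentialed requests; the server must name the specific origin and send `Access-Control-Allow-Credentials: true`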
|
||||
- server intent must be explicit
|
||||
|
||||
This is deliberate. If authenticated browser requests could be casually shared across origins, cross-site data leaks would be much easier.
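
On the server side, this usually means echoing back an origin from an allowlist instead of using a wildcard. The following is a framework-agnostic sketch. The function name `corsHeadersFor` and the `allowedOrigins` list are assumptions.

```js
// Builds response headers for a credentialed cross-origin request.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (!allowedOrigins.includes(requestOrigin)) {
    return {}; // no CORS headers: the browser will not expose the response
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin, // a specific origin, not "*"
    "Access-Control-Allow-Credentials": "true",
    "Vary": "Origin", // caches must not reuse this response for another origin
  };
}

console.log(corsHeadersFor("https://app.example.com", ["https://app.example.com"]));
```

On the client, the matching request would pass `credentials: "include"` to `fetch`, as in the earlier lifecycle example.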
|
||||
|
||||
## Cookies, `localStorage`, `sessionStorage`, and IndexedDB
|
||||
|
||||
Browser storage is not one thing. Different storage mechanisms exist because the web needs different tradeoffs.
|
||||
|
||||
### Storage Comparison
|
||||
|
||||
| Storage | Lifetime | Sent automatically with requests | Typical size | Best use |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| Cookies | configurable expiry or session | Yes, depending on scope and rules | small | session/auth and server-visible state |
|
||||
| `localStorage` | persists until cleared | No | small-ish | simple client-only persistence |
|
||||
| `sessionStorage` | per-tab session | No | small-ish | temporary tab-scoped data |
|
||||
| IndexedDB | persistent | No | much larger | structured app data and offline caching |
|
||||
|
||||
### Cookies
|
||||
|
||||
Cookies are small key-value pieces of data that the browser associates with requests to matching domains and paths.
|
||||
|
||||
Why cookies are special:
|
||||
|
||||
- the browser can attach them automatically to matching requests
|
||||
- servers can set them using `Set-Cookie`
|
||||
- security attributes such as `HttpOnly`, `Secure`, and `SameSite` affect how they behave
|
||||
|
||||
That automatic attachment makes cookies powerful for authentication, but also sensitive from a security standpoint.
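
The attributes are easiest to understand by looking at the header a server would send. The helper below is a sketch: the attribute names are standard, while the function name `serializeSessionCookie` and its defaults are assumptions.

```js
// Serializes a cookie for a Set-Cookie response header.
function serializeSessionCookie(name, value, { maxAgeSeconds, sameSite = "Lax" } = {}) {
  const parts = [
    `${name}=${encodeURIComponent(value)}`,
    "Path=/",
    "HttpOnly", // hidden from document.cookie, so page JavaScript cannot read it
    "Secure", // only sent over HTTPS
    `SameSite=${sameSite}`, // limits when the cookie is attached to cross-site requests
  ];
  if (maxAgeSeconds !== undefined) parts.push(`Max-Age=${maxAgeSeconds}`);
  return parts.join("; ");
}

console.log(serializeSessionCookie("session", "abc", { maxAgeSeconds: 3600 }));
// session=abc; Path=/; HttpOnly; Secure; SameSite=Lax; Max-Age=3600
```

Leaving out `Max-Age` (and `Expires`) produces a session cookie, which the browser discards when the browsing session ends.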
|
||||
|
||||
### `localStorage`
|
||||
|
||||
`localStorage` is a simple synchronous key-value API accessible from JavaScript.
|
||||
|
||||
Why it is popular:
|
||||
|
||||
- easy to use
|
||||
- persistent across page reloads
|
||||
- fine for small client-only preferences
|
||||
|
||||
Why it should not be overused:
|
||||
|
||||
- synchronous API can block if abused
|
||||
- string-only interface is primitive
|
||||
- available to JavaScript, so any successful XSS can read it
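
For the small-preferences use case, a thin JSON wrapper covers the string-only interface and the failure modes. This is a sketch under assumptions: the name `createPrefs` is hypothetical, and `storage` can be any object with `getItem` and `setItem`, defaulting to `localStorage` in a browser.

```js
// JSON wrapper around a Storage-like object.
function createPrefs(storage = globalThis.localStorage) {
  return {
    get(key, fallback) {
      try {
        const raw = storage.getItem(key);
        return raw === null ? fallback : JSON.parse(raw);
      } catch {
        return fallback; // corrupted JSON or storage unavailable
      }
    },
    set(key, value) {
      try {
        storage.setItem(key, JSON.stringify(value)); // values are stored as strings
        return true;
      } catch {
        return false; // e.g. quota exceeded or storage disabled
      }
    },
  };
}
```

In a page, `createPrefs().set("theme", "dark")` followed by `createPrefs().get("theme", "light")` would round-trip the value. Nothing stored this way should be treated as secret, for the XSS reason given above.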
|
||||
|
||||
### `sessionStorage`
|
||||
|
||||
`sessionStorage` is similar to `localStorage` but scoped to the lifetime of a tab or page session.
|
||||
|
||||
Useful for:
|
||||
|
||||
- temporary wizard progress
|
||||
- transient UI state that should not outlive the tab
|
||||
|
||||
### IndexedDB
|
||||
|
||||
IndexedDB is the browser's serious client-side database option.
|
||||
|
||||
It exists because richer apps need more than tiny string storage:
|
||||
|
||||
- offline data
|
||||
- larger structured records
|
||||
- indexes and queries
|
||||
- versioned schema upgrades
|
||||
|
||||
If `localStorage` is a sticky note, IndexedDB is a filing cabinet.
|
||||
|
||||
### Storage Landscape Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Browser App] --> B[Cookies]
|
||||
A --> C[localStorage]
|
||||
A --> D[sessionStorage]
|
||||
A --> E[IndexedDB]
|
||||
B --> F[Automatically attached to matching HTTP requests]
|
||||
C --> G[Client-side key value persistence]
|
||||
D --> H[Tab-scoped temporary persistence]
|
||||
E --> I[Structured offline-capable app data]
|
||||
```
|
||||
|
||||
## Authentication in Browser Apps
|
||||
|
||||
Frontend interview questions about auth are often really questions about browser behavior.
|
||||
|
||||
### Session Cookie Model
|
||||
|
||||
In a traditional server-backed web app:
|
||||
|
||||
1. user logs in
|
||||
2. server verifies credentials
|
||||
3. server sends a session cookie
|
||||
4. browser stores it
|
||||
5. browser automatically includes it on future matching requests
|
||||
6. server uses it to find the session state
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant User as User Browser
|
||||
participant App as Frontend App
|
||||
participant API as Server
|
||||
|
||||
App->>API: POST /login with credentials
|
||||
API-->>User: Set-Cookie session=abc
|
||||
User->>User: Browser stores cookie
|
||||
App->>API: GET /profile
|
||||
User->>API: Cookie: session=abc
|
||||
API-->>App: profile data
|
||||
```
|
||||
|
||||
Why this model is strong:
|
||||
|
||||
- browser handles cookie attachment automatically
|
||||
- server can invalidate session centrally
|
||||
- `HttpOnly` cookies are not readable by JavaScript, reducing exposure to token theft via XSS
|
||||
|
||||
### JWT Basics
|
||||
|
||||
JWTs, or JSON Web Tokens, are a token format often used in stateless auth systems. The token itself can carry claims and can be verified by the server.
|
||||
|
||||
Important nuance: JWT is a token format, not a storage strategy.
|
||||
|
||||
You still need to decide where to keep it:
|
||||
|
||||
- in memory
|
||||
- in a cookie
|
||||
- in storage accessible to JavaScript
|
||||
|
||||
That decision affects security properties more than the string format does.
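
It also helps to see that a JWT payload is only encoded, not encrypted. The sketch below decodes the payload segment without verifying the signature. The function name `decodeJwtPayload` is an assumption, and the code assumes ASCII-only claims for brevity.

```js
// Decodes the payload segment of a JWT WITHOUT verifying the signature.
// Anyone holding the token can do this, so claims must not contain secrets.
function decodeJwtPayload(token) {
  const [, payloadSegment] = token.split(".");
  // Convert base64url to standard base64 and restore padding.
  const base64 = payloadSegment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, "=");
  return JSON.parse(atob(padded));
}
```

Reading claims on the client is fine for display purposes, such as showing a username. Trust decisions still belong on the server, which verifies the signature before believing any claim.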
|
||||
|
||||
### Cookie vs JavaScript-Accessible Token Storage
|
||||
|
||||
If you store auth material in `localStorage`, JavaScript can read it, which means XSS can read it too.
|
||||
|
||||
If you store auth material in `HttpOnly` cookies, JavaScript cannot read it directly, which is often safer against token exfiltration. But cookies are automatically attached to requests, so CSRF protections become especially important.
|
||||
|
||||
This is why authentication discussions in browser apps are always tied to both XSS and CSRF.
|
||||
|
||||
## XSS: Cross-Site Scripting
|
||||
|
||||
XSS happens when untrusted content is executed as script in your page's origin.
|
||||
|
||||
### Why It Is So Dangerous
|
||||
|
||||
The browser treats injected script as if it were your app's own JavaScript running in your origin. That means it can:
|
||||
|
||||
- read page data
|
||||
- make authenticated requests
|
||||
- manipulate the UI
|
||||
- access storage available to JavaScript
|
||||
- potentially exfiltrate secrets
|
||||
|
||||
Once XSS lands, many other defenses become weaker because the attacker is now "inside" your app's trust boundary.
|
||||
|
||||
### Common Causes
|
||||
|
||||
- injecting unsanitized HTML
|
||||
- unsafe template interpolation
|
||||
- dangerous use of `innerHTML`
|
||||
- third-party script compromise
|
||||
|
||||
### Defenses
|
||||
|
||||
- escape or sanitize untrusted content
|
||||
- prefer safe DOM APIs and framework escaping defaults
|
||||
- limit inline script execution with CSP
|
||||
- reduce token exposure to JavaScript when possible
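
The first defense can be sketched as a small escaping function. The name `escapeHtml` is an assumption, and frameworks typically provide equivalent escaping by default.

```js
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Escapes text for insertion into HTML element content or quoted attribute values.
function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

In DOM code, assigning to `textContent` instead of `innerHTML` avoids HTML parsing entirely, which is usually the simpler option when no markup is needed.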
|
||||
|
||||
## CSRF: Cross-Site Request Forgery
|
||||
|
||||
CSRF exploits the fact that the browser may automatically attach cookies to requests.
|
||||
|
||||
An attacker cannot necessarily read your app's responses because of same-origin policy, but they may still be able to cause the browser to send authenticated requests if the app relies on automatically attached credentials.
|
||||
|
||||
### Why CSRF Exists
|
||||
|
||||
The browser is trying to be convenient. If the user is logged in, the browser sends matching cookies automatically. That convenience is exactly what attackers try to abuse.
|
||||
|
||||
### Defenses
|
||||
|
||||
- `SameSite` cookie settings
|
||||
- anti-CSRF tokens
|
||||
- checking the `Origin` or `Referer` request header where appropriate
|
||||
- using auth patterns that do not blindly trust automatic cross-site requests
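
The header check from this list can be sketched as a server-side guard. The function name `passesOriginCheck`, the `trustedOrigins` list, and the plain-object `headers` shape with lowercase keys are all assumptions.

```js
// Server-side sketch: reject state-changing requests whose Origin header is not trusted.
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

function passesOriginCheck(method, headers, trustedOrigins) {
  if (SAFE_METHODS.has(method.toUpperCase())) return true; // safe methods should not change state
  const origin = headers["origin"];
  if (!origin) return false; // conservative choice: refuse when the header is missing
  return trustedOrigins.includes(origin);
}

console.log(passesOriginCheck("POST", { origin: "https://evil.example" }, ["https://app.example.com"])); // false
```

A check like this is typically layered with `SameSite` cookies and anti-CSRF tokens rather than used alone, and it only helps if `GET` handlers really have no side effects.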
|
||||
|
||||
## CSP: Content Security Policy
|
||||
|
||||
CSP is a browser-enforced policy that lets a site restrict what content and scripts are allowed to execute.
|
||||
|
||||
Think of CSP as a damage-limiting policy layer. It does not replace safe coding, but it can greatly reduce what injected content is able to do.
|
||||
|
||||
Examples of what CSP can control:
|
||||
|
||||
- which script sources are allowed
|
||||
- whether inline scripts are allowed
|
||||
- which image, style, and connect destinations are allowed
|
||||
|
||||
This matters in real apps because modern frontends often load many third-party resources. CSP helps turn "trust everything the page references" into an explicit policy.
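
A policy is delivered as a single header value. The sketch below builds one from a directive map. The directive names are standard CSP directives, while the helper name `buildCsp` and the listed hosts are placeholders.

```js
// Builds a Content-Security-Policy header value from a directive map.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

const policy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://cdn.example.com"],
  "img-src": ["'self'", "data:"],
  "connect-src": ["'self'", "https://api.example.com"],
});

console.log(policy);
// default-src 'self'; script-src 'self' https://cdn.example.com; img-src 'self' data:; connect-src 'self' https://api.example.com
```

Because `script-src` here does not include `'unsafe-inline'`, inline `<script>` blocks and inline event handler attributes would be blocked under this policy.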
|
||||
|
||||
## Browser Security Is a System, Not a Single Feature
|
||||
|
||||
Strong browser security comes from layers working together:
|
||||
|
||||
- HTTPS protects transport integrity and confidentiality
|
||||
- same-origin policy isolates sites from one another
|
||||
- CORS selectively relaxes read restrictions under server control
|
||||
- cookie attributes constrain credential behavior
|
||||
- XSS defenses stop hostile code from entering your origin
|
||||
- CSRF defenses protect automatically attached credentials
|
||||
- CSP reduces the blast radius of injection mistakes
|
||||
|
||||
This layered thinking is what interviewers want when they ask security questions. They are usually testing whether you understand that the browser is an active security participant, not a passive request sender.
|
||||
|
||||
## Interview-Ready Summary
|
||||
|
||||
- The browser cares about origin, credentials, cache, and security policy in addition to raw HTTP semantics.
|
||||
- `fetch` triggers browser-managed request logic involving policy checks, caching, networking, and async response delivery.
|
||||
- Same-origin policy prevents one origin's page JavaScript from freely reading another origin's data.
|
||||
- CORS is a browser-enforced mechanism that allows controlled cross-origin access when the server opts in.
|
||||
- Cookies are automatically attached to matching requests; `localStorage`, `sessionStorage`, and IndexedDB are not.
|
||||
- `HttpOnly` cookies reduce JavaScript access to sensitive auth material, while JavaScript-readable storage is more exposed to XSS.
|
||||
- XSS, CSRF, and CSP are related browser security topics, not isolated trivia items.
|
||||
|
||||
## What to Read Next
|
||||
|
||||
Continue with [05-real-world-architecture-patterns.md](./05-real-world-architecture-patterns.md). That chapter turns all of these browser primitives into application design choices: SPA vs MPA, SSR vs hydration, state management, caching, and production frontend architecture.
|
||||
@@ -0,0 +1,396 @@
|
||||
# 05. Real-World Architecture Patterns
|
||||
|
||||
The earlier chapters explained the mechanics of JavaScript in browsers. This final chapter answers the engineering question: given those mechanics, how should real frontend systems be structured?
|
||||
|
||||
Production frontend architecture is mostly about managing tradeoffs:
|
||||
|
||||
- initial load speed vs runtime flexibility
|
||||
- server work vs client work
|
||||
- caching aggressiveness vs freshness
|
||||
- local simplicity vs shared state coordination
|
||||
- bundle size vs feature richness
|
||||
- team velocity vs long-term maintainability
|
||||
|
||||
Interviewers ask about these patterns because they reveal whether you can think past syntax and into system behavior.
|
||||
|
||||
This chapter connects back to the entire handbook:
|
||||
|
||||
- [01-javascript-fundamentals.md](./01-javascript-fundamentals.md), because closures, modules, and object models still shape large apps
|
||||
- [02-javascript-in-the-browser.md](./02-javascript-in-the-browser.md), because architecture sits on top of browser runtime constraints
|
||||
- [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md), because UI performance and scheduling determine what architectures feel fast
|
||||
- [04-networking-storage-security.md](./04-networking-storage-security.md), because data fetching, caching, and auth drive most application complexity
|
||||
|
||||
## SPA vs MPA
|
||||
|
||||
One of the first architecture splits in web applications is whether the app behaves primarily like an MPA, a multi-page application, or an SPA, a single-page application.
|
||||
|
||||
### MPA: Multi-Page Application
|
||||
|
||||
In an MPA, navigation usually loads a new document from the server. Each page request returns fresh HTML.
|
||||
|
||||
Strengths:
|
||||
|
||||
- simpler mental model for navigation
|
||||
- strong default SEO story
|
||||
- server remains the center of rendering and routing
|
||||
- less client-side JavaScript is often needed for basic experiences
|
||||
|
||||
Tradeoffs:
|
||||
|
||||
- full-page navigations can feel heavier
|
||||
- state continuity across pages is less automatic
|
||||
- rich app-like interactions may require more partial enhancement work
|
||||
|
||||
### SPA: Single-Page Application
|
||||
|
||||
In an SPA, the browser loads an application shell and subsequent navigation often updates view state without full document reloads.
|
||||
|
||||
Strengths:
|
||||
|
||||
- highly interactive UI flows
|
||||
- smoother in-app transitions
|
||||
- client-side state can persist across route changes
|
||||
- fits well with component-driven architectures such as React and Vue
|
||||
|
||||
Tradeoffs:
|
||||
|
||||
- larger JavaScript cost up front if not optimized
|
||||
- more complex client-side routing and state handling
|
||||
- SEO and initial load behavior require deliberate design
|
||||
- more architectural pressure on data fetching, hydration, caching, and error recovery
|
||||
|
||||
### High-Level Comparison
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[User action] --> B{App style}
|
||||
B -- MPA --> C[Browser navigates to new document]
|
||||
C --> D[Server returns HTML]
|
||||
D --> E[Browser parses and renders page]
|
||||
B -- SPA --> F[Client router updates route]
|
||||
F --> G[Fetch route data if needed]
|
||||
G --> H[Render updated view in existing page]
|
||||
```
|
||||
|
||||
### Why This Tradeoff Exists
|
||||
|
||||
SPAs move more application coordination into the browser. MPAs keep more of it on the server. Neither is universally better. The right choice depends on product behavior, team expertise, SEO needs, latency sensitivity, and operational constraints.
|
||||
|
||||
Many modern systems are hybrids rather than pure versions of either.
|
||||
|
||||
## CSR, SSR, and Hydration
|
||||
|
||||
These terms describe how UI markup gets produced and when a page becomes interactive.
|
||||
|
||||
### CSR: Client-Side Rendering
|
||||
|
||||
With CSR, the browser loads JavaScript that then builds or updates the UI primarily on the client.
|
||||
|
||||
Benefits:
|
||||
|
||||
- flexible rich interactivity
|
||||
- natural fit for component state and dynamic routing
|
||||
- server can act more like a data API layer
|
||||
|
||||
Costs:
|
||||
|
||||
- initial render may be delayed by JavaScript download, parse, and execution
|
||||
- poor bundle discipline can hurt startup badly
|
||||
- SEO and low-end devices may suffer if CSR is the only strategy
|
||||
|
||||
### SSR: Server-Side Rendering
|
||||
|
||||
With SSR, the server sends HTML for the requested view. The page can often show useful content earlier because the browser can paint that HTML without waiting for client rendering logic to download and run.
|
||||
|
||||
Benefits:
|
||||
|
||||
- faster meaningful content for many pages
|
||||
- stronger SEO by default
|
||||
- better performance for users on weak devices or slow networks
|
||||
|
||||
Costs:
|
||||
|
||||
- server rendering complexity
|
||||
- coordination between server-rendered markup and client behavior
|
||||
- more architectural attention needed around caching and personalization
|
||||
|
||||
### Hydration
|
||||
|
||||
Hydration is the process where client-side JavaScript attaches interactivity to HTML that already exists from server rendering.
|
||||
|
||||
This is a powerful compromise:
|
||||
|
||||
- server provides initial HTML quickly
|
||||
- client attaches event handlers and stateful behavior afterward
|
||||
|
||||
But hydration is not free. The browser still has to download, parse, and run the JavaScript, then connect it to the existing DOM.
|
||||
|
||||
### Render Strategy Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[User requests page] --> B{Strategy}
|
||||
B -- CSR --> C[Server sends minimal HTML and JS assets]
|
||||
C --> D[Browser downloads JS]
|
||||
D --> E[Client renders UI]
|
||||
B -- SSR --> F[Server renders HTML]
|
||||
F --> G[Browser displays content]
|
||||
G --> H[Client JS loads]
|
||||
H --> I[Hydration attaches interactivity]
|
||||
```
|
||||
|
||||
### Streaming and Progressive Delivery
|
||||
|
||||
Modern frameworks increasingly support streaming HTML or data so the browser can start showing useful content sooner instead of waiting for the entire page to be fully ready.
|
||||
|
||||
This follows the same browser principles from earlier chapters: progressive work generally feels faster than large all-at-once work.
|
||||
|
||||
## State Management Patterns
|
||||
|
||||
The phrase "state management" is often used too broadly. In production systems, the first question is not "Which library should we use?" It is "What kind of state is this?"
|
||||
|
||||
### Common State Categories
|
||||
|
||||
1. Local UI state
|
||||
2. Server state
|
||||
3. Shared client application state
|
||||
4. URL state
|
||||
5. Form state
|
||||
|
||||
### Local UI State
|
||||
|
||||
Examples:
|
||||
|
||||
- whether a modal is open
|
||||
- current tab selection
|
||||
- text inside an input
|
||||
|
||||
This state usually belongs close to the component or feature using it.
|
||||
|
||||
### Server State
|
||||
|
||||
Examples:
|
||||
|
||||
- product lists from an API
|
||||
- current user profile from the backend
|
||||
- notifications feed
|
||||
|
||||
This state is different because the server is the source of truth. It has freshness, invalidation, and cache concerns. Treating server state like plain local state often creates stale-data bugs and redundant refetching.
|
||||
|
||||
### Shared Client State
|
||||
|
||||
Examples:
|
||||
|
||||
- theme
|
||||
- feature flags already loaded into the client
|
||||
- cross-cutting workflow state
|
||||
|
||||
This state may justify a shared store, but only when many parts of the app truly need coordinated access.
|
||||
|
||||
### URL State
|
||||
|
||||
Examples:
|
||||
|
||||
- search query
|
||||
- pagination page
|
||||
- selected filters
|
||||
- current route segment
|
||||
|
||||
If state changes should be shareable, restorable, or navigable with back/forward history, the URL is often the right place for it.
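As a rough illustration of keeping this kind of state in the URL, the sketch below round-trips a small filter object through a query string using the standard `URLSearchParams` API. The `ListingState` type, the parameter names (`q`, `page`, `tag`), and the default values are assumptions made for the example, not part of any particular router.

```ts
// Hypothetical filter state for a listing page.
type ListingState = {
  query: string;
  page: number;
  tags: string[];
};

const defaults: ListingState = { query: "", page: 1, tags: [] };

// Serialize only non-default values so URLs stay short and stable.
function toSearch(state: ListingState): string {
  const params = new URLSearchParams();
  if (state.query !== defaults.query) params.set("q", state.query);
  if (state.page !== defaults.page) params.set("page", String(state.page));
  for (const tag of state.tags) params.append("tag", tag);
  return params.toString();
}

// Parse defensively: the URL is user-editable input, not trusted state.
function fromSearch(search: string): ListingState {
  const params = new URLSearchParams(search);
  const page = Number.parseInt(params.get("page") ?? "", 10);
  return {
    query: params.get("q") ?? defaults.query,
    page: Number.isInteger(page) && page > 0 ? page : defaults.page,
    tags: params.getAll("tag"),
  };
}
```

Because parsing falls back to defaults, a hand-edited or truncated URL still produces a valid state object instead of an error.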
|
||||
|
||||
### Why Misclassifying State Hurts Architecture
|
||||
|
||||
If every piece of state gets pushed into one global store, the app becomes harder to reason about. If important cross-cutting state is scattered locally, coordination becomes painful.
|
||||
|
||||
Good frontend architecture often begins with correct state classification.
|
||||
|
||||
## State Flow in a Real App
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[User interaction] --> B[UI component state]
|
||||
B --> C[Action or event]
|
||||
C --> D[Server state cache or shared store]
|
||||
D --> E[API client]
|
||||
E --> F[Backend]
|
||||
F --> D
|
||||
D --> G[Derived view model]
|
||||
G --> H[Rendered UI]
|
||||
```
|
||||
|
||||
This kind of layered flow is common in React and Vue apps even when the exact libraries differ.
|
||||
|
||||
## Caching Strategies
|
||||
|
||||
Caching in frontend systems is layered. Many teams think only about one layer and miss the rest.
|
||||
|
||||
### Browser HTTP Cache
|
||||
|
||||
The browser can cache network responses according to HTTP cache headers. This helps with:
|
||||
|
||||
- repeat visits
|
||||
- static assets
|
||||
- reducing unnecessary transfers
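One way to picture the header side of this is a small policy function that picks a `Cache-Control` value per resource. The directives used (`max-age`, `immutable`, `no-cache`) are standard HTTP, but the file-naming convention, the regex, and the specific durations below are illustrative assumptions rather than recommendations.

```ts
// Assumed convention: the build puts a content hash in static file names,
// e.g. "app.3f9a1c2b.js". The pattern is illustrative only.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|woff2|png|svg)$/;

function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // The file name changes whenever the content changes, so long caching is safe.
    return "public, max-age=31536000, immutable";
  }
  if (path === "/" || path.endsWith(".html")) {
    // Revalidate the document on each use so new deploys are picked up.
    return "no-cache";
  }
  // Conservative default for everything else (an arbitrary choice here).
  return "public, max-age=300";
}
```

The split reflects a common pattern: cache fingerprinted assets aggressively and keep the HTML entry point cheap to revalidate.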
|
||||
|
||||
### CDN Cache
|
||||
|
||||
Before requests even reach your origin, a CDN may serve cached assets or responses from edge locations. This is especially valuable for static JavaScript bundles, images, and sometimes cacheable HTML or API responses.
|
||||
|
||||
### In-Memory App Cache
|
||||
|
||||
Client apps often keep recent data in memory so route changes or repeated views do not trigger unnecessary requests immediately.
|
||||
|
||||
### Service Worker Cache
|
||||
|
||||
Service workers can intercept requests and implement custom caching logic, enabling offline support and app-shell strategies.
|
||||
|
||||
### Cache Layer Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[User navigates or requests data]
|
||||
A --> B{Browser cache hit?}
|
||||
B -- Yes --> C[Serve cached response]
|
||||
B -- No --> D{CDN cache hit?}
|
||||
D -- Yes --> E[Serve from edge cache]
|
||||
D -- No --> F[Origin server]
|
||||
F --> G[Response stored according to policy]
|
||||
G --> H[Client memory cache may retain parsed data]
|
||||
```
|
||||
|
||||
### Stale-While-Revalidate Thinking
|
||||
|
||||
A common production strategy is to show cached data quickly and refresh in the background. This gives users fast response while still converging toward fresh data.
|
||||
|
||||
The important architectural lesson is that freshness and latency are usually in tension. Good caching is about choosing where to sit on that tradeoff, not pretending there is no tradeoff.
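The decision at the heart of stale-while-revalidate can be sketched as a pure function over a cache entry's age. This is a minimal sketch, assuming an in-memory `Map` and a caller that triggers the background refetch itself; `CacheEntry`, `SwrPolicy`, and `readWithSwr` are hypothetical names, and real data-fetching libraries handle many more cases (deduplication, errors, focus refetching).

```ts
type CacheEntry<T> = { value: T; storedAt: number };

type SwrPolicy = {
  maxAgeMs: number;      // served as-is, no refresh needed
  staleWindowMs: number; // served immediately, refresh in the background
};

type ReadResult<T> =
  | { kind: "miss" }
  | { kind: "fresh"; value: T }
  | { kind: "stale"; value: T };

function readWithSwr<T>(
  cache: Map<string, CacheEntry<T>>,
  key: string,
  now: number,
  policy: SwrPolicy
): ReadResult<T> {
  const entry = cache.get(key);
  if (!entry) return { kind: "miss" };

  const age = now - entry.storedAt;
  if (age <= policy.maxAgeMs) return { kind: "fresh", value: entry.value };
  if (age <= policy.maxAgeMs + policy.staleWindowMs) {
    return { kind: "stale", value: entry.value };
  }

  // Too old to show at all: drop it and treat the read as a miss.
  cache.delete(key);
  return { kind: "miss" };
}
```

The two policy numbers are exactly the freshness-versus-latency dial: a longer stale window means faster screens and older data.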
|
||||
|
||||
## Lazy Loading and Code Splitting
|
||||
|
||||
Large frontend apps fail slowly before they fail visibly. Bundle size creeps up, startup cost increases, and low-end devices begin to struggle.
|
||||
|
||||
Code splitting is how you avoid shipping every feature to every user at the start of a session.
|
||||
|
||||
### Common Strategies
|
||||
|
||||
- route-based splitting
|
||||
- component-level lazy loading
|
||||
- loading heavy admin or analytics features only when needed
|
||||
- deferring non-critical third-party scripts
|
||||
|
||||
### Why It Works
|
||||
|
||||
Users rarely need the full app graph immediately. If the first screen needs only a subset of code, shipping less upfront lowers download, parse, and execution cost.
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- more network boundaries at runtime
|
||||
- loading states become part of UX design
|
||||
- too much fragmentation can create request overhead and complexity
|
||||
|
||||
Good architecture chooses meaningful split points rather than slicing code randomly.
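A split point usually ends up as a function that loads a chunk on demand. The sketch below, under the assumption that a bundler turns dynamic `import()` calls into separate chunks, wraps any loader so the chunk is requested at most once. `lazyOnce`, `loadReports`, and the module path in the comment are hypothetical.

```ts
// Wraps a loader so the underlying work runs at most once per success.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) {
      pending = load().catch((error) => {
        pending = undefined; // allow a retry after a failed load
        throw error;
      });
    }
    return pending;
  };
}

// In a bundled app the loader would typically be a dynamic import, e.g.
//   const loadAdmin = lazyOnce(() => import("./admin/AdminPanel"));
// "./admin/AdminPanel" is a made-up path. A stand-in keeps this sketch runnable:
let loadCount = 0;
const loadReports = lazyOnce(async () => {
  loadCount += 1;
  return { render: () => "reports view" };
});
```

Resetting `pending` on failure is a deliberate choice: a flaky network should not permanently break a route for the rest of the session.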
|
||||
|
||||
## A Practical Frontend Architecture for React- or Vue-Style Apps
|
||||
|
||||
Many real applications settle into a layered shape.
|
||||
|
||||
### Common Layers
|
||||
|
||||
1. App shell and bootstrapping
|
||||
2. Router
|
||||
3. Feature modules or route modules
|
||||
4. Shared design system and UI primitives
|
||||
5. API client and data-fetching layer
|
||||
6. Server-state cache and invalidation logic
|
||||
7. Local state and component logic
|
||||
8. Observability, error reporting, and performance instrumentation
|
||||
|
||||
### Example Architecture Diagram
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A[Browser loads app shell] --> B[Router]
|
||||
B --> C[Route module]
|
||||
C --> D[Feature components]
|
||||
D --> E[Design system primitives]
|
||||
C --> F[State and business logic]
|
||||
F --> G[API client]
|
||||
G --> H[Backend or BFF]
|
||||
F --> I[Server-state cache]
|
||||
I --> D
|
||||
C --> J[Analytics and error reporting]
|
||||
```
|
||||
|
||||
### Why This Structure Works
|
||||
|
||||
- the router controls high-level navigation concerns
|
||||
- feature modules keep domain logic close to the screens that need it
|
||||
- design-system components reduce duplicated UI implementation
|
||||
- API and cache layers centralize network policy and invalidation rules
|
||||
- observability is treated as infrastructure rather than an afterthought
|
||||
|
||||
This is the difference between a demo app and a production system. Production systems deliberately separate concerns that would otherwise tangle together under growth.
|
||||
|
||||
## BFFs, Edge Logic, and Backend Coordination
|
||||
|
||||
Many large frontend systems do not talk directly to many microservices. Instead, they talk to a BFF, a backend-for-frontend.
|
||||
|
||||
Why teams do this:
|
||||
|
||||
- reduce over-fetching and under-fetching
|
||||
- tailor APIs to UI needs
|
||||
- centralize auth and aggregation logic
|
||||
- shield the browser from backend complexity
|
||||
|
||||
This is especially relevant in system design interviews because frontend architecture and backend API design are tightly coupled.
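To make the aggregation idea concrete, here is a minimal sketch of what a BFF endpoint might do for one screen. Every type, field name, and function here is hypothetical; the upstream fetchers are passed in as parameters so the example does not assume any HTTP client or framework.

```ts
// Hypothetical upstream payloads, shaped by the services that own them.
type UserServiceUser = {
  id: string;
  first_name: string;
  last_name: string;
  internal_flags: string[];
};
type OrderServiceOrder = {
  order_id: string;
  user_id: string;
  total_cents: number;
  status: string;
};

// What the account screen actually needs.
type AccountView = {
  displayName: string;
  openOrderCount: number;
  lifetimeSpend: string;
};

// Pure shaping step: drops internal fields and precomputes display values.
function toAccountView(user: UserServiceUser, orders: OrderServiceOrder[]): AccountView {
  const totalCents = orders.reduce((sum, order) => sum + order.total_cents, 0);
  return {
    displayName: `${user.first_name} ${user.last_name}`,
    openOrderCount: orders.filter((order) => order.status === "open").length,
    lifetimeSpend: (totalCents / 100).toFixed(2),
  };
}

// The endpoint fans out in parallel and returns one UI-shaped response.
async function getAccountView(
  userId: string,
  fetchUser: (id: string) => Promise<UserServiceUser>,
  fetchOrders: (id: string) => Promise<OrderServiceOrder[]>
): Promise<AccountView> {
  const [user, orders] = await Promise.all([fetchUser(userId), fetchOrders(userId)]);
  return toAccountView(user, orders);
}
```

The browser makes one request and receives only the fields it renders, while details such as `internal_flags` never leave the backend.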
|
||||
|
||||
## Performance Is an Architectural Concern, Not a Last-Mile Concern
|
||||
|
||||
Teams often treat performance like a final optimization pass. That usually fails.
|
||||
|
||||
Performance is shaped by architecture from the start:
|
||||
|
||||
- choosing CSR-only vs SSR or hybrid rendering
|
||||
- deciding bundle boundaries
|
||||
- choosing cache strategies
|
||||
- classifying state correctly
|
||||
- deciding how much work runs on the main thread
|
||||
- avoiding unnecessary DOM churn
|
||||
|
||||
If you get those decisions wrong, no amount of local component optimization will fully rescue the app.
|
||||
|
||||
## What Strong Frontend Engineers Optimize For
|
||||
|
||||
In production, strong frontend architecture usually tries to improve all of these together:
|
||||
|
||||
- fast first meaningful render
|
||||
- quick route transitions
|
||||
- resilient data loading and retry behavior
|
||||
- understandable state ownership
|
||||
- security-safe auth handling
|
||||
- small enough bundles and incremental loading
|
||||
- observability for failures and regressions
|
||||
- a codebase structure teams can extend without chaos
|
||||
|
||||
That combination is what interviewers are really probing when they ask broad architecture questions.
|
||||
|
||||
## Interview-Ready Summary
|
||||
|
||||
- MPAs center navigation and rendering on the server; SPAs keep a longer-lived client runtime and often do in-place route transitions.
|
||||
- CSR, SSR, and hydration are different tradeoffs about where markup is produced and when interactivity arrives.
|
||||
- Not all state is the same; local UI state, server state, shared client state, and URL state should usually be managed differently.
|
||||
- Caching is layered across browser cache, CDN, app memory, and sometimes service workers.
|
||||
- Lazy loading and code splitting reduce startup cost by shipping less JavaScript up front.
|
||||
- Real frontend architecture usually includes an app shell, router, feature modules, a data layer, caches, shared UI primitives, and observability.
|
||||
- Performance and maintainability are architectural properties, not just component-level concerns.
|
||||
|
||||
## Closing Perspective
|
||||
|
||||
If you can explain the browser using the full path covered by this handbook, you are operating at a strong engineering level:
|
||||
|
||||
- JavaScript language fundamentals
|
||||
- browser runtime boundaries
|
||||
- DOM, rendering, and scheduling
|
||||
- networking, storage, and security
|
||||
- production application architecture
|
||||
|
||||
That path is what turns "I know JavaScript" into "I can reason about modern web systems from first principles."
|
||||
@@ -0,0 +1,937 @@
|
||||
# 06. TypeScript for JavaScript Developers
|
||||
|
||||
The earlier chapters focused on how JavaScript actually runs: execution contexts, the event loop, the DOM, networking, rendering, and frontend architecture. TypeScript changes a different part of the story.
|
||||
|
||||
It changes how engineers reason about JavaScript systems before those systems run.
|
||||
|
||||
That distinction matters. TypeScript is not a second runtime next to the browser or Node. It does not replace JavaScript semantics. It adds a compile-time layer that lets teams describe assumptions about data, APIs, UI states, and module boundaries in a form the compiler can check.
|
||||
|
||||
This chapter is about that reasoning layer.
|
||||
|
||||
It connects directly to:
|
||||
|
||||
- [03-dom-event-loop-rendering.md](./03-dom-event-loop-rendering.md), because the runtime is still ordinary JavaScript running on the same event loop
|
||||
- [04-networking-storage-security.md](./04-networking-storage-security.md), because external data enters your app without trusted runtime types
|
||||
- [05-real-world-architecture-patterns.md](./05-real-world-architecture-patterns.md), because TypeScript becomes most valuable when systems and teams grow
|
||||
|
||||
## The Core Mental Model
|
||||
|
||||
TypeScript is best understood as a compile-time type layer over JavaScript.
|
||||
|
||||
That sentence is more useful than saying "TypeScript is a typed language" because it preserves the real engineering model:
|
||||
|
||||
- you write code that looks like JavaScript plus type information
|
||||
- a type checker analyzes whether your assumptions are internally consistent
|
||||
- the output that runs is JavaScript
|
||||
- the browser and Node do not know or care that your source used TypeScript
|
||||
|
||||
In practical terms, TypeScript exists because large JavaScript systems accumulate hidden assumptions.
|
||||
|
||||
Examples:
|
||||
|
||||
- one module assumes `user.id` is always a string
|
||||
- another assumes an API response always has `items`
|
||||
- a React component assumes `onClose` exists when `isOpen` is `true`
|
||||
- a refactor renames a field in one feature but misses four other call sites
|
||||
|
||||
In plain JavaScript, many of these assumptions survive only in developer memory, comments, tests, or convention. That works for small code, but it degrades as codebases and teams scale.
|
||||
|
||||
TypeScript turns those assumptions into artifacts the compiler can inspect.
|
||||
|
||||
### Why This Helps
|
||||
|
||||
In a mature codebase, the biggest value of TypeScript is usually not "catching type mismatches" in the abstract. It is reducing ambiguity at module boundaries.
|
||||
|
||||
It makes questions explicit:
|
||||
|
||||
- What shape does this API response have?
|
||||
- Which states can this component actually be in?
|
||||
- Can this function return `null`?
|
||||
- Is this object safe to access yet?
|
||||
- If I rename a field, what else breaks?
|
||||
|
||||
That means TypeScript is less like an extra syntax feature and more like a continuously running reasoning system attached to the codebase.
|
||||
|
||||
### The Tradeoff
|
||||
|
||||
TypeScript solves ambiguity by asking you to model it.
|
||||
|
||||
That introduces real costs:
|
||||
|
||||
- more syntax at important boundaries
|
||||
- compiler friction during refactors
|
||||
- types that can become harder to read than the code they describe
|
||||
- a risk of false confidence if teams forget that runtime data is still untrusted
|
||||
|
||||
So the real trade is straightforward: less ambiguity in exchange for more upfront modeling.
|
||||
|
||||
## TypeScript Sits on Top of JavaScript, Not Beside It
|
||||
|
||||
The cleanest intuition is that JavaScript is still the language of execution and TypeScript is the language of compile-time explanation.
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[TypeScript Source] --> B[Type Checker]
|
||||
B -->|Errors and warnings| E[Developer]
|
||||
A --> C[Transpilation]
|
||||
C --> D[JavaScript Output]
|
||||
D --> F[Browser or Node Runtime]
|
||||
```
|
||||
|
||||
This layering explains several important truths at once:
|
||||
|
||||
- TypeScript can prevent many mistakes before runtime.
|
||||
- TypeScript cannot change how closures, promises, the DOM, or the event loop behave.
|
||||
- TypeScript cannot inspect real network payloads unless you write runtime validation logic.
|
||||
- If JavaScript can do something highly dynamic, TypeScript may only be able to model it approximately.
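The point about network payloads can be sketched with a hand-written type guard. This is a minimal sketch, assuming a hypothetical `ApiUser` shape; many teams use a schema validation library instead, but the principle is the same: the check has to exist in the emitted JavaScript.

```ts
type ApiUser = { id: string; email: string };

// A type guard: a runtime check whose result also informs the type checker.
function isApiUser(value: unknown): value is ApiUser {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["id"] === "string" && typeof record["email"] === "string";
}

function parseApiUser(json: string): ApiUser {
  // JSON.parse returns `any`; annotating as `unknown` forces a check before use.
  const parsed: unknown = JSON.parse(json);
  if (!isApiUser(parsed)) {
    throw new Error("Unexpected user payload shape");
  }
  return parsed; // narrowed to ApiUser by the guard
}
```

Without the guard, writing `JSON.parse(json) as ApiUser` would compile but verify nothing about the data that actually arrives.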
|
||||
|
||||
An analogy that often helps: JavaScript is the building that gets constructed. TypeScript is the structural review of the blueprint. A strong review catches bad assumptions early, but it disappears before the building is occupied.
|
||||
|
||||
## Structural Typing: Shape Over Identity
|
||||
|
||||
One of the most important TypeScript intuitions is that it is structurally typed.
|
||||
|
||||
That means compatibility is mostly based on shape rather than on explicit nominal identity.
|
||||
|
||||
If an object has the required properties with compatible types, it is usually acceptable, even if it did not come from a specific class or declaration.
|
||||
|
||||
```ts
|
||||
type UserSummary = {
|
||||
id: string;
|
||||
name: string;
|
||||
};
|
||||
|
||||
type AuditActor = {
|
||||
id: string;
|
||||
name: string;
|
||||
role: "admin" | "editor";
|
||||
};
|
||||
|
||||
const actor: AuditActor = {
|
||||
id: "u_42",
|
||||
name: "Riley",
|
||||
role: "admin"
|
||||
};
|
||||
|
||||
const summary: UserSummary = actor;
|
||||
```
|
||||
|
||||
That assignment works because `actor` has at least the shape required by `UserSummary`.
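One nuance worth seeing in code: the same extra property that is fine on an existing value is rejected when it appears in an object literal written directly at the use site, because TypeScript applies an excess property check to fresh literals. The `Summary` type and `showSummary` function below are hypothetical names for illustration.

```ts
type Summary = { id: string; label: string };

function showSummary(summary: Summary): string {
  return `${summary.id}: ${summary.label}`;
}

const detailed = { id: "u_42", label: "Riley", role: "admin" };

// OK: an existing value with extra properties still satisfies the shape.
showSummary(detailed);

// showSummary({ id: "u_42", label: "Riley", role: "admin" });
// Compile-time error: object literals passed directly are checked for
// excess properties, since an unknown key there is usually a typo.
```

Both calls would behave identically at runtime. The difference is purely a compile-time heuristic about likely mistakes.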
|
||||
|
||||
### Why TypeScript Works This Way
|
||||
|
||||
This is a natural fit for JavaScript.
|
||||
|
||||
JavaScript code routinely passes around:
|
||||
|
||||
- object literals
|
||||
- JSON payloads
|
||||
- callback objects
|
||||
- configuration objects
|
||||
- DOM-like event objects
|
||||
|
||||
JavaScript itself already has a "duck typing" culture: if the value has the properties a function needs, it often works.
|
||||
|
||||
Structural typing lets TypeScript model that style without forcing everything into rigid nominal hierarchies.
|
||||
|
||||
### What Problem It Solves
|
||||
|
||||
It makes interoperability practical.
|
||||
|
||||
You can type ordinary JavaScript-style objects without wrapping them in classes purely for the type system. That matters in frontend work, where many values are plain objects produced by APIs, state containers, React props, form libraries, and browser APIs.
|
||||
|
||||
### The Tradeoff
|
||||
|
||||
Structural typing is ergonomic, but it can also accept values that are only accidentally compatible.
|
||||
|
||||
If two objects happen to share the same shape, TypeScript may consider them compatible even when they represent different concepts in the business domain.
|
||||
|
||||
That is the price of a shape-based system:
|
||||
|
||||
- it works well with JavaScript objects
|
||||
- it is less strict about identity than nominal systems
|
||||
|
||||
This is one reason naming, module boundaries, and clear DTO definitions still matter even when the type checker is strong.
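When accidental compatibility is a real risk, one commonly used workaround is a "branded" type: an intersection with a marker property that exists only at the type level. This is a convention layered on top of structural typing, not a built-in nominal feature, and the names below (`UserId`, `OrderId`, `__brand`) are illustrative.

```ts
// The __brand property never exists at runtime; values are plain strings.
type UserId = string & { readonly __brand: "UserId" };
type OrderId = string & { readonly __brand: "OrderId" };

// Casting is confined to these constructors, ideally next to validation.
function asUserId(raw: string): UserId {
  return raw as UserId;
}
function asOrderId(raw: string): OrderId {
  return raw as OrderId;
}

function loadOrdersFor(userId: UserId): string {
  return `orders for ${userId}`;
}
function describeOrder(orderId: OrderId): string {
  return `order ${orderId}`;
}

const userId = asUserId("u_42");
const orderId = asOrderId("o_7");

loadOrdersFor(userId);
describeOrder(orderId);
// loadOrdersFor(orderId); // compile-time error: OrderId is not assignable to UserId
// loadOrdersFor("u_42");  // compile-time error: a plain string lacks the brand
```

The cost is some ceremony at the points where raw strings enter the system, so it tends to be reserved for identifiers that are easy to mix up.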
|
||||
|
||||
## Basic Types and Inference
|
||||
|
||||
At the lowest level, TypeScript starts with familiar JavaScript value categories:
|
||||
|
||||
- `number`
|
||||
- `string`
|
||||
- `boolean`
|
||||
- arrays such as `string[]`
|
||||
- object types such as `{ id: string; active: boolean }`
|
||||
|
||||
None of that is conceptually new to a JavaScript developer. What changes is that TypeScript treats those value categories as information it can track across a program.
|
||||
|
||||
```ts
|
||||
const retryCount = 3;
|
||||
const serviceName = "billing";
|
||||
const isStale = false;
|
||||
const tags = ["api", "cache", "ui"];
|
||||
|
||||
const user = {
|
||||
id: "u_1",
|
||||
email: "dev@example.com",
|
||||
active: true
|
||||
};
|
||||
```
|
||||
|
||||
TypeScript can infer:
|
||||
|
||||
- `retryCount` is a `number`
|
||||
- `serviceName` is a `string`
|
||||
- `isStale` is a `boolean`
|
||||
- `tags` is a `string[]`
|
||||
- `user` is an object with known fields
|
||||
|
||||
### Why Inference Exists
|
||||
|
||||
If every variable in a JavaScript codebase needed an explicit annotation, TypeScript would be unusably noisy.
|
||||
|
||||
Inference exists because much of the time the compiler already has enough local information to make a safe conclusion.
|
||||
|
||||
If you write `const retryCount = 3`, requiring `: number` adds little value. It repeats what the compiler can already see.
|
||||
|
||||
TypeScript is productive because it combines explicit modeling at important boundaries with inference inside ordinary implementation code.
|
||||
|
||||
### What Problem It Solves
|
||||
|
||||
It removes a large amount of annotation ceremony while still preserving useful type information for autocomplete, refactoring, and error checking.
|
||||
|
||||
This balance is one reason TypeScript works well in real JavaScript codebases. The compiler does not ask you to describe every local detail manually.
|
||||
|
||||
### Where Inference Has Edges
|
||||
|
||||
Inference is not magic. It is based on what the compiler can know from the code in front of it.
|
||||
|
||||
That means it is strongest when:
|
||||
|
||||
- values are initialized immediately
|
||||
- control flow is clear
|
||||
- function return paths are consistent
|
||||
- object shapes are locally visible
|
||||
|
||||
It is weaker at broad boundaries where intent matters more than local evidence, such as:
|
||||
|
||||
- exported functions
|
||||
- public library APIs
|
||||
- React props
|
||||
- backend DTOs
|
||||
- data returned from external systems
|
||||
|
||||
That leads to a practical rule used in many production codebases:
|
||||
|
||||
- rely on inference inside implementation details
|
||||
- annotate boundaries where other modules or teams depend on your intent
|
||||
|
||||
### Inference Can Intentionally Widen
|
||||
|
||||
TypeScript often widens types when it expects a value may change later.
|
||||
|
||||
```ts
|
||||
const requestConfig = {
|
||||
method: "GET"
|
||||
};
|
||||
```
|
||||
|
||||
Here `method` is inferred as `string` rather than the literal type `"GET"`, because object properties are mutable unless you signal otherwise.
|
||||
|
||||
That behavior can feel surprising at first, but it reflects a sensible tradeoff: the compiler avoids pretending a mutable value will stay fixed forever.
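A short sketch of how widening shows up in practice, and two ways to signal intent: a `const` assertion or an explicit annotation. `HttpMethod` and `send` are hypothetical names used only for this example.

```ts
type HttpMethod = "GET" | "POST";

function send(method: HttpMethod, url: string): string {
  return `${method} ${url}`;
}

const widened = { method: "GET" }; // method: string
widened.method = "PATCH";          // allowed, which is why the type is string

const literal = { method: "GET" } as const;                  // method: "GET", readonly
const annotated: { method: HttpMethod } = { method: "GET" }; // method: HttpMethod

// send(widened.method, "/users"); // compile-time error: string is not HttpMethod
send(literal.method, "/users");
send(annotated.method, "/users");
```

`as const` freezes the literal as written, while an annotation keeps the property mutable within the allowed set. Which one fits depends on whether the value is meant to change.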
|
||||
|
||||
## Interfaces vs Types
|
||||
|
||||
This topic is often presented as if one is modern and the other is obsolete. Real codebases are not that simple.
|
||||
|
||||
Both `interface` and `type` can describe object shapes.
|
||||
|
||||
```ts
|
||||
interface UserDTO {
|
||||
id: string;
|
||||
email: string;
|
||||
}
|
||||
|
||||
type AuthToken = {
|
||||
value: string;
|
||||
expiresAt: string;
|
||||
};
|
||||
```
|
||||
|
||||
The more useful question is not "Which keyword is better?" It is "What kind of type am I trying to model?"
|
||||
|
||||
### When Interfaces Fit Well
|
||||
|
||||
Interfaces are often a good fit for object contracts that may need to be extended or implemented across module boundaries.
|
||||
|
||||
Common examples:
|
||||
|
||||
- request or context objects
|
||||
- SDK contracts
|
||||
- class implementations
|
||||
- framework augmentation points
|
||||
|
||||
```ts
|
||||
interface RequestContext {
|
||||
requestId: string;
|
||||
userId?: string;
|
||||
}
|
||||
|
||||
interface AdminRequestContext extends RequestContext {
|
||||
scopes: string[];
|
||||
}
|
||||
```
|
||||
|
||||
Why they exist:
|
||||
|
||||
- they communicate "this is an object-shaped contract"
|
||||
- they support extension naturally
|
||||
- they can participate in declaration merging, which some frameworks use for augmentation
|
||||
|
||||
Tradeoff:
|
||||
|
||||
- that openness can be useful, but it can also make reasoning less local if a type is reopened elsewhere
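Declaration merging is easier to recognize once seen. In the sketch below, two `interface` declarations with the same name in the same scope combine into one type; `AppConfig` and its fields are hypothetical.

```ts
interface AppConfig {
  apiBaseUrl: string;
}

// A second declaration with the same name merges into the first.
interface AppConfig {
  featureFlags: Record<string, boolean>;
}

// The merged type requires both members.
const config: AppConfig = {
  apiBaseUrl: "https://api.example.com",
  featureFlags: { newCheckout: true },
};

// A type alias cannot be reopened this way. Declaring one twice,
//   type Settings = { a: string };
//   type Settings = { b: string };
// is reported as a duplicate identifier.
```

This is the mechanism behind the "reasoning less local" concern: the full shape of `AppConfig` depends on every declaration in scope, not just the one you are reading.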
|
||||
|
||||
### When Type Aliases Fit Well
|
||||
|
||||
Type aliases are more flexible because they can name far more than object shapes.
|
||||
|
||||
They are essential for:
|
||||
|
||||
- unions
|
||||
- intersections
|
||||
- tuples
|
||||
- primitive aliases
|
||||
- mapped and conditional types
|
||||
- function signatures
|
||||
|
||||
```ts
|
||||
type RequestStatus = "idle" | "loading" | "success" | "error";
|
||||
|
||||
type FetchState<T> =
|
||||
| { status: "idle" }
|
||||
| { status: "loading" }
|
||||
| { status: "success"; data: T }
|
||||
| { status: "error"; message: string };
|
||||
```
|
||||
|
||||
Why they exist:
|
||||
|
||||
- they let the type system model composition patterns that interfaces alone cannot express well
|
||||
|
||||
Tradeoff:
|
||||
|
||||
- because they are very flexible, teams can build deeply abstract type layers that become hard to read
|
||||
|
||||
### Practical Guidance
|
||||
|
||||
In many modern codebases, the pattern is simple:
|
||||
|
||||
- use `interface` when you want an extendable object contract
|
||||
- use `type` when you need flexibility, unions, intersections, or type-level composition
|
||||
|
||||
Some teams use mostly `type` for consistency. Others prefer `interface` for public object models. Either can work if the codebase stays consistent and avoids unnecessary cleverness.
|
||||
|
||||
## Unions and Narrowing: Modeling Real States
|
||||
|
||||
One of TypeScript's biggest advantages is its ability to model values that may legitimately be one of several shapes.
|
||||
|
||||
That is what union types are for.
|
||||
|
||||
In real applications, many bugs come from pretending a value is simpler than it is.
|
||||
|
||||
An API request is not just "data." It is usually one of several states:
|
||||
|
||||
- not started
|
||||
- loading
|
||||
- failed
|
||||
- succeeded
|
||||
|
||||
Plain JavaScript often represents this with loosely related flags:
|
||||
|
||||
```ts
|
||||
type LooseUserState = {
|
||||
isLoading: boolean;
|
||||
data?: UserDTO;
|
||||
error?: string;
|
||||
};
|
||||
```
|
||||
|
||||
The problem is that this allows impossible or contradictory states:
|
||||
|
||||
- `isLoading: true` and `data` present
|
||||
- `error` and `data` both present
|
||||
- neither `error` nor `data` after loading completes
|
||||
|
||||
TypeScript gives you a better model.
|
||||
|
||||
```ts
|
||||
type UserState =
|
||||
| { status: "idle" }
|
||||
| { status: "loading" }
|
||||
| { status: "error"; message: string }
|
||||
| { status: "success"; data: UserDTO };
|
||||
```
|
||||
|
||||
This is a more accurate description of the system. It says there are several legal states, and each state carries different data.
|
||||
|
||||
### Why This Matters
|
||||
|
||||
This is not just about type safety. It is about representing the state machine honestly.
|
||||
|
||||
The type system becomes a tool for making illegal states harder to express.
|
||||
|
||||
That is a major shift from typical JavaScript reasoning. Instead of writing code that defensively checks every combination after the fact, you design the state space up front.
|
||||
|
||||
### Control Flow Narrowing
|
||||
|
||||
Once you use unions, TypeScript can narrow a value as your code learns more about it.
|
||||
|
||||
```ts
|
||||
function renderUserPanel(state: UserState) {
|
||||
switch (state.status) {
|
||||
case "idle":
|
||||
return "No request yet";
|
||||
|
||||
case "loading":
|
||||
return "Loading...";
|
||||
|
||||
case "error":
|
||||
return `Error: ${state.message}`;
|
||||
|
||||
case "success":
|
||||
return `User: ${state.data.email}`;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Inside the `"error"` branch, TypeScript knows `state` has a `message` field. Inside the `"success"` branch, it knows `state` has `data`.
|
||||
|
||||
That is control flow narrowing: the type checker follows the same branching logic you use to reason about the program.
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
A[UserState] --> B{Check status}
|
||||
B -->|idle| C[State narrows to idle branch]
|
||||
B -->|loading| D[State narrows to loading branch]
|
||||
B -->|error| E[State narrows to error branch with message]
|
||||
B -->|success| F[State narrows to success branch with data]
|
||||
```
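A common companion to this kind of switch is an exhaustiveness check built on the `never` type. The sketch below redeclares a small state union under the hypothetical name `LoadState` so it stands alone; `assertNever` is a conventional helper name, not a built-in.

```ts
type LoadState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "success"; data: { email: string } };

// Accepts only `never`, so the call type-checks only when every
// variant has already been handled by an earlier case.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function describeState(state: LoadState): string {
  switch (state.status) {
    case "idle":
      return "No request yet";
    case "loading":
      return "Loading...";
    case "error":
      return `Error: ${state.message}`;
    case "success":
      return `User: ${state.data.email}`;
    default:
      // If a new variant is added to LoadState, this line stops compiling.
      return assertNever(state);
  }
}
```

The payoff comes during refactors: adding a fifth state turns every unhandled switch into a compile error instead of a silent fallthrough.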
|
||||
|
||||
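
The switch above can also be made exhaustive, so that adding a new state without handling it becomes a compile error. The following is a minimal sketch, assuming the `UserState` union shown earlier with `UserDTO` reduced to a single `email` field for brevity. The `assertNever` helper and the `describeState` name are hypothetical and do not come from the original text.

```ts
type UserDTO = { email: string }; // reduced for illustration

type UserState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "success"; data: UserDTO };

// Hypothetical helper: it accepts only `never`, so it type-checks only
// when every variant has already been handled by an earlier branch.
function assertNever(value: never): never {
  throw new Error(`Unhandled state: ${JSON.stringify(value)}`);
}

function describeState(state: UserState): string {
  switch (state.status) {
    case "idle":
      return "No request yet";
    case "loading":
      return "Loading...";
    case "error":
      return `Error: ${state.message}`;
    case "success":
      return `User: ${state.data.email}`;
    default:
      // If a new variant is added to UserState and not handled above,
      // `state` is no longer `never` here and this line stops compiling.
      return assertNever(state);
  }
}
```

The design choice is to let the compiler point at every switch that needs updating when the state machine grows, rather than relying on a runtime fallthrough.
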
TypeScript can narrow through several common patterns:

- equality checks such as `state.status === "success"`
- `typeof` checks for primitives
- `in` checks for property presence
- custom type guard functions

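
The four patterns listed above can be sketched side by side. All type and function names here (`Shape`, `formatId`, `summarize`, `isNonEmptyString`, `shout`) are hypothetical and chosen only for illustration.

```ts
type Circle = { kind: "circle"; radius: number };
type Square = { kind: "square"; side: number };
type Shape = Circle | Square;

// 1. Equality check on a discriminant field.
function area(shape: Shape): number {
  if (shape.kind === "circle") {
    return Math.PI * shape.radius * shape.radius; // shape is Circle here
  }
  return shape.side * shape.side; // shape is Square here
}

// 2. `typeof` check for primitives.
function formatId(id: string | number): string {
  if (typeof id === "number") {
    return id.toFixed(0); // id is number here
  }
  return id.trim(); // id is string here
}

// 3. `in` check for property presence.
type ApiError = { message: string };
type ApiSuccess = { data: string[] };

function summarize(result: ApiError | ApiSuccess): string {
  if ("message" in result) {
    return `failed: ${result.message}`; // result is ApiError here
  }
  return `ok: ${result.data.length} items`; // result is ApiSuccess here
}

// 4. Custom type guard: the return type `value is string` tells the
// compiler what a `true` result proves.
function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

function shout(input: unknown): string {
  return isNonEmptyString(input) ? input.toUpperCase() : "anonymous";
}
```
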
### A React Example

This pattern is especially powerful in React because component rendering is already branch-heavy.

```tsx
type UserPanelProps = {
  state: UserState;
};

function UserPanel({ state }: UserPanelProps) {
  if (state.status === "loading") {
    return <Spinner />;
  }

  if (state.status === "error") {
    return <ErrorBanner message={state.message} />;
  }

  if (state.status === "success") {
    return <ProfileCard user={state.data} />;
  }

  return <EmptyState />;
}
```

The component is easier to reason about because the prop type expresses the valid states directly.

### The Tradeoff

Union modeling is more explicit and sometimes more verbose than a loose collection of booleans and optional fields.

That is exactly why it works.

You pay with modeling effort in exchange for fewer ambiguous states and clearer rendering logic.

## Generics: Preserving Relationships Across Reuse

Generics are often introduced as "types with placeholders," which is technically true but not especially helpful.

The deeper intuition is this: a generic lets you describe a relationship between types without hardcoding the concrete type up front.

That matters whenever you are writing reusable logic that should preserve information about the caller's data.

### Why Generics Exist

Suppose you build a helper for paginated API responses.

```ts
type Page<T> = {
  items: T[];
  nextCursor?: string;
};
```

This says the shape of a page is stable, but the item type varies.

Now `Page<UserDTO>` and `Page<OrderDTO>` share the same envelope without losing the specific item type.

Without generics, you would either:

- duplicate the same structure for every payload type
- or fall back to overly broad types that lose useful information

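
The same idea extends from type aliases to functions. As a rough sketch, a hypothetical `mapPage` helper (not part of the original text) can transform the items of a page while keeping the envelope, and its signature spells out the relationship between input and output types.

```ts
type Page<T> = {
  items: T[];
  nextCursor?: string;
};

// Hypothetical helper: Page<T> in, Page<U> out. The envelope is preserved
// and only the item type changes.
function mapPage<T, U>(page: Page<T>, transform: (item: T) => U): Page<U> {
  return { ...page, items: page.items.map(transform) };
}

type UserDTO = { id: string; email: string };

const users: Page<UserDTO> = {
  items: [{ id: "u1", email: "ada@example.com" }],
  nextCursor: "abc",
};

// `emails` is inferred as Page<string>; no annotation is needed here.
const emails = mapPage(users, (user) => user.email);
```

The caller never writes `T` or `U` explicitly. The compiler infers both from the arguments, which is usually a sign that the generic is describing a real relationship rather than adding ceremony.
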
### Generic API Response Example

```ts
type ApiResponse<T> = {
  data: T;
  requestId: string;
};

async function getJson<T>(url: string): Promise<ApiResponse<T>> {
  const response = await fetch(url);

  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }

  // response.json() resolves to `any`, so the declared return type is
  // trusted here, not verified.
  return response.json();
}

type UserDTO = {
  id: string;
  email: string;
};

const result = await getJson<UserDTO>("/api/user");
result.data.email;
```

The function stays reusable, but the caller still gets precise information about the payload.

### A React Example

Reusable components also benefit from generics.

```tsx
type DataTableProps<T> = {
  rows: T[];
  getKey: (row: T) => string;
  renderRow: (row: T) => React.ReactNode;
};

function DataTable<T>({ rows, getKey, renderRow }: DataTableProps<T>) {
  return (
    <table>
      <tbody>
        {rows.map((row) => (
          <tr key={getKey(row)}>{renderRow(row)}</tr>
        ))}
      </tbody>
    </table>
  );
}
```

The component does not care whether `T` is a user, product, or audit entry. What it preserves is the relationship: every callback and row consistently refers to the same item type.

### The Tradeoff

Generics are powerful, but they are also one of the easiest places to make a codebase too abstract.

Good generic code preserves a clear relationship.

Bad generic code hides simple ideas behind layers of type parameters, conditional types, and inferred helpers until the type signatures are harder to understand than the feature itself.

There is another critical limit: generics do not validate runtime data.

`getJson<UserDTO>(...)` does not prove the server actually returned a `UserDTO`. It only tells the compiler how you intend to use the result.

That is why TypeScript helps most when paired with honest runtime boundaries.

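
One way to make such a boundary honest is to tie the type parameter to a runtime check. The sketch below is an assumption-laden illustration, not the guide's own API: `Guard`, `parseJsonAs`, and this version of `isUserDTO` are hypothetical names, and it works on a raw JSON string rather than a network response so that it stays self-contained.

```ts
type UserDTO = { id: string; email: string };

// A guard proves at runtime what a type parameter alone only claims.
type Guard<T> = (value: unknown) => value is T;

function isUserDTO(value: unknown): value is UserDTO {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate.id === "string" && typeof candidate.email === "string";
}

// Hypothetical helper: T is inferred from the guard, so a caller cannot
// name a type without also supplying the code that checks it.
function parseJsonAs<T>(raw: string, guard: Guard<T>): T {
  const value: unknown = JSON.parse(raw);
  if (!guard(value)) {
    throw new Error("Payload did not match the expected shape");
  }
  return value; // narrowed to T by the guard
}

const parsedUser = parseJsonAs('{"id":"u1","email":"ada@example.com"}', isUserDTO);
```

Compared with `getJson<UserDTO>(...)`, the type now follows from a check that actually ran, instead of from an annotation the caller chose.
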
## `any` vs `unknown`: The Boundary Between Trust and Proof

This is one of the most important practical distinctions in TypeScript.

### `any`

`any` means "stop checking here."

Once a value becomes `any`, the compiler largely gives up on that branch of reasoning.

```ts
let payload: any = JSON.parse(raw);
payload.user.name.toUpperCase();
payload.notARealMethod();
```

This compiles because `any` opts out of the type system.

Why it exists:

- migration from JavaScript needs escape hatches
- some third-party libraries are poorly typed
- some dynamic code is difficult to model precisely

Problem it solves:

- it reduces friction when strict modeling is temporarily impractical

Tradeoff:

- it can silently infect surrounding code and erase the safety benefits of TypeScript

`any` is not just unspecific. It is contagious.

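
A small sketch of that contagion, using made-up names (`rawBody`, `looseUser`, `looseAge`, `formatAge`): a single `any` at the parsing step flows into a function that declares a `number` parameter, and the compiler raises no objection.

```ts
const rawBody = '{"user": {"name": "Ada"}}';

const loosePayload: any = JSON.parse(rawBody);

// Everything derived from an `any` value is also `any`,
// even though nothing below is annotated.
const looseUser = loosePayload.user; // any
const looseAge = looseUser.age;      // any (undefined at runtime)

function formatAge(years: number): string {
  return years.toFixed(0);
}

// The call type-checks because `looseAge` is `any`,
// then fails at runtime because the value is undefined.
let outcome = "no error";
try {
  outcome = formatAge(looseAge);
} catch (error) {
  outcome = error instanceof TypeError ? "TypeError at runtime" : "other error";
}
```

The annotation on `formatAge` is correct, yet it offers no protection, because the unsafe value arrived through a channel the compiler had been told not to check.
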
### `unknown`

`unknown` means "a value exists here, but you are not allowed to assume what it is yet."

```ts
function parseJson(raw: string): unknown {
  return JSON.parse(raw);
}
```

Now the compiler forces you to prove something about the value before using it.

```ts
type UserDTO = {
  id: string;
  email: string;
};

function isUserDTO(value: unknown): value is UserDTO {
  if (typeof value !== "object" || value === null) {
    return false;
  }

  // Check the field types, not just their presence:
  // a payload such as { id: 123, email: null } must not pass.
  const candidate = value as Record<string, unknown>;
  return typeof candidate.id === "string" && typeof candidate.email === "string";
}

const payload = parseJson(raw);

if (!isUserDTO(payload)) {
  throw new Error("Invalid user payload");
}

payload.email.toLowerCase();
```

Why `unknown` exists:

- real systems receive untrusted values from APIs, local storage, URL params, user input, and browser APIs
- the compiler needs a way to represent "not yet proven"

Problem it solves:

- it keeps unsafe data quarantined until code narrows or validates it

Tradeoff:

- it requires more code at boundaries, because you must actually check the data before using it

That extra work is usually worth it. `unknown` turns a vague boundary into an explicit proof step.

## How TypeScript Becomes JavaScript

The compilation step matters because it explains both the power and the limits of TypeScript.

At a high level, the toolchain does two jobs:

1. type-check the source
2. emit JavaScript the runtime can execute

```mermaid
graph LR
    A[TypeScript Code] --> B[Parser and Type Checker]
    B --> C[Verified Program Model]
    C --> D[JavaScript Emit]
    D --> E[Bundler or Runtime Loader]
    E --> F[Browser or Node Execution]
```

### What Gets Removed

Most type-only constructs are erased during compilation.

Examples:

- type annotations
- interfaces
- type aliases
- generic parameters
- `as` assertions
- `satisfies` checks

```ts
type User = {
  id: string;
  email: string;
};

function getEmail(user: User): string {
  return user.email;
}
```

Becomes JavaScript conceptually like this:

```js
function getEmail(user) {
  return user.email;
}
```

The runtime sees only the JavaScript version.

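
Erasure is easy to observe directly. In the sketch below, `fromQuery`, `wrap`, and the other names are hypothetical. It shows that an `as` assertion and a generic parameter change what the compiler believes without changing the value that exists at runtime.

```ts
// The assertion changes the static type, not the value.
const fromQuery = "42" as unknown as number;

// The emitted JavaScript is conceptually just:
//   const fromQuery = "42";
const runtimeType = typeof fromQuery; // "string" at runtime

// Generic parameters are erased too, so `T` cannot be inspected
// or enforced once the code is running.
function wrap<T>(value: T): { value: T } {
  return { value };
}

const wrapped = wrap<number>(fromQuery);
const stillAString = typeof wrapped.value; // "string" at runtime
```

If a real conversion is needed, it has to be written as runtime code, for example `Number("42")`, because nothing in the type layer survives to do it.
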
### What Remains at Runtime

Ordinary JavaScript behavior remains:

- objects and arrays
- functions and closures
- promises and async behavior
- conditionals and loops
- imports and exports after transformation
- any explicit runtime validation code you wrote

If you want runtime guarantees, you must implement runtime logic for them.

That could be:

- manual validation
- schema validation libraries
- backend contract validation
- defensive parsing at boundaries

### The Most Important Consequence

TypeScript can prove that your program is internally consistent with its declared assumptions.

It cannot prove that the outside world is telling the truth.

That is why this compiles but can still fail at runtime:

```ts
type User = {
  id: string;
};

const user = JSON.parse('{"id": 123}') as User;
user.id.toUpperCase();
```

The assertion tells the compiler to trust you. The runtime still receives a number.

This boundary between compile-time belief and runtime fact is the single most important TypeScript habit to internalize.

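
For contrast, here is a minimal sketch of the same parse with the belief replaced by a check. It assumes the one-field `User` type from the example above; `assertIsUser` and `parseUser` are hypothetical helpers introduced only for illustration.

```ts
type User = {
  id: string;
};

// Assertion function: throws unless the value really has the declared shape.
function assertIsUser(value: unknown): asserts value is User {
  if (
    typeof value !== "object" ||
    value === null ||
    typeof (value as Record<string, unknown>).id !== "string"
  ) {
    throw new Error("Invalid User payload");
  }
}

function parseUser(raw: string): User {
  const value: unknown = JSON.parse(raw);
  assertIsUser(value); // after this line, `value` is narrowed to User
  return value;
}

// The malformed payload is now rejected at the boundary instead of
// failing later inside `toUpperCase()`.
let failure = "accepted";
try {
  parseUser('{"id": 123}');
} catch (error) {
  failure = error instanceof Error ? error.message : "unknown failure";
}
```

The failure moves from an arbitrary call site deep in the program to the single place where the data entered it, which is usually much easier to debug.
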
## Real-World Usage Patterns

TypeScript becomes most valuable when it describes contracts that multiple people or modules rely on.

### React Components

In React, props are component contracts.

TypeScript helps most when props express valid combinations rather than just field names.

```tsx
type ButtonProps =
  | { kind: "link"; href: string; onClick?: never }
  | { kind: "button"; onClick: () => void; href?: never };

function Button(props: ButtonProps) {
  if (props.kind === "link") {
    return <a href={props.href}>Open</a>;
  }

  return <button onClick={props.onClick}>Open</button>;
}
```

This solves a real JavaScript problem: invalid prop combinations often live as informal rules unless the type system encodes them.

Tradeoff:

- the prop types become more explicit and sometimes more verbose

Benefit:

- impossible combinations are rejected before runtime and before QA discovers them in a screen flow

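
To see which combinations the `ButtonProps` union accepts and rejects, here is a sketch that swaps the component for a plain function so it runs without React. `describeButton` is a hypothetical stand-in, and the `@ts-expect-error` comments mark calls the compiler is expected to refuse.

```ts
type ButtonProps =
  | { kind: "link"; href: string; onClick?: never }
  | { kind: "button"; onClick: () => void; href?: never };

// Non-JSX stand-in for the component.
function describeButton(props: ButtonProps): string {
  return props.kind === "link" ? `link to ${props.href}` : "button with handler";
}

// Valid combinations.
const asLink = describeButton({ kind: "link", href: "/docs" });
const asButton = describeButton({ kind: "button", onClick: () => {} });

// @ts-expect-error a link must not also carry an onClick handler
describeButton({ kind: "link", href: "/docs", onClick: () => {} });

// @ts-expect-error a button requires onClick
describeButton({ kind: "button" });
```

The `?: never` fields are what make the mixed case fail: without them, the excess-property check on a union would still allow an `onClick` on a link.
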
### API Request Typing

Most frontend bugs eventually touch the network boundary.

TypeScript helps by making request and response shapes visible throughout the app.

```ts
type CreateUserRequest = {
  email: string;
  displayName: string;
};

type CreateUserResponse = {
  id: string;
  email: string;
  displayName: string;
};

async function createUser(input: CreateUserRequest): Promise<CreateUserResponse> {
  const response = await fetch("/api/users", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(input)
  });

  if (!response.ok) {
    throw new Error("Create user failed");
  }

  return response.json();
}
```

This improves maintainability because:

- refactors surface all dependent call sites
- editors expose the contract immediately
- reviewers can reason about payload changes without mentally reconstructing object shapes

But the limitation remains the same: the server can still return malformed data unless the contract is validated somewhere.

### Backend DTOs

Backend DTOs, data transfer objects, are one of the cleanest places to use TypeScript because they represent intentional boundary contracts.

Examples:

- request bodies
- response payloads
- queue messages
- event payloads

DTOs are not just "objects with fields." They are agreement surfaces between systems.

TypeScript helps make those agreements visible and enforceable in code review and refactoring.

### Shared Frontend and Backend Types

Sharing types across frontend and backend can reduce drift, especially in monorepos or tightly coordinated teams.

```mermaid
graph LR
    A[Shared DTO Package] --> B[Frontend App]
    A --> C[Backend Service]
    B --> D[Typed API Calls]
    C --> E[Typed Handlers]
```

This works well when the shared package contains stable contracts such as:

- DTOs
- event payload schemas
- route parameter shapes
- enum-like literal unions

It works poorly when teams share internal implementation models that should stay private.

A good rule is to share contracts, not persistence details.

For example:

- good to share `UserProfileDTO`
- risky to share a database entity type with internal fields the client should not depend on

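
The split between a shared contract and a private entity might look like the sketch below. The field names on `UserEntity` and `UserProfileDTO` and the `toUserProfileDTO` mapper are assumptions made for illustration; the original text names only `UserProfileDTO`.

```ts
// Backend-only persistence model (hypothetical fields).
type UserEntity = {
  id: string;
  email: string;
  displayName: string;
  passwordHash: string;
  deletedAt: Date | null;
};

// Shared contract: only what clients are allowed to depend on.
type UserProfileDTO = {
  id: string;
  email: string;
  displayName: string;
};

// Explicit mapping copies the public fields one by one, so internal
// fields never cross the boundary.
function toUserProfileDTO(entity: UserEntity): UserProfileDTO {
  return {
    id: entity.id,
    email: entity.email,
    displayName: entity.displayName,
  };
}
```

The explicit copy matters. Because compatibility is structural, `return entity;` would also satisfy the `UserProfileDTO` return type, and `passwordHash` would still be present on the object that gets serialized.
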
The tradeoff is coupling.

Shared types reduce duplication, but they also tie teams and packages more closely together. A backend change can now ripple through frontend compilation immediately. Sometimes that is exactly the protection you want. Sometimes it slows independent deployment.

## How TypeScript Improves Maintainability at Scale

TypeScript matters more as the number of modules, contributors, and change paths increases.

Its strongest maintainability benefits are usually these:

- safer refactors, because changed contracts surface compile-time failures
- stronger onboarding, because types document expectations where code is used
- better editor tooling, because symbols, fields, and call signatures remain discoverable
- clearer state modeling, especially for async flows and UI state machines
- better boundary hygiene, because APIs, DTOs, and component props become explicit contracts

The key pattern is that TypeScript reduces the number of assumptions that exist only in people's heads.

That is why it feels disproportionately useful in medium and large systems compared with small scripts.

## Where TypeScript Fails or Adds Friction

TypeScript is useful, but it is not free and it is not complete.

### It Does Not Replace Runtime Validation

If data comes from outside your process, the type checker is not enough.

Network payloads, local storage content, query parameters, and user input are runtime facts, not compile-time facts.

### It Can Become Over-Abstract

Some codebases start using advanced mapped, conditional, and generic types for problems that would be clearer with straightforward duplication.

When types become harder to reason about than runtime code, the system is no longer helping.

### It Adds Build and Migration Cost

Strict typing improves consistency, but it also means refactors may require more coordinated edits. Migration from legacy JavaScript can be slow. Compiler errors may point to real design problems, but they still take time to resolve.

### Third-Party Typings Can Be Wrong

Your code can be "type-safe" relative to inaccurate library type definitions. That means the type system is only as trustworthy as the contracts feeding it.

### Structural Typing Can Be Too Permissive

Because compatibility is shape-based, the compiler may accept values that are technically compatible but conceptually different.

That is why TypeScript is best viewed as a strong reasoning aid, not a perfect semantic model of the business domain.

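
A common case of shape-based permissiveness is identifiers: if `UserId` and `OrderId` are both plain `string` aliases, either can be passed where the other is expected. One widely used workaround is a "branded" type. The sketch below is an illustration of that technique, not something the original text prescribes, and every name in it (`UserId`, `OrderId`, `toUserId`, `toOrderId`, `loadUser`) is hypothetical.

```ts
// The brand exists only in the type system; at runtime these are strings.
type UserId = string & { readonly __brand: "UserId" };
type OrderId = string & { readonly __brand: "OrderId" };

// Hypothetical constructors: the only places allowed to apply a brand.
function toUserId(value: string): UserId {
  return value as UserId;
}

function toOrderId(value: string): OrderId {
  return value as OrderId;
}

function loadUser(id: UserId): string {
  return `loading user ${id}`;
}

const userId = toUserId("u-1");
const orderId = toOrderId("o-9");

const okCall = loadUser(userId);

// @ts-expect-error an OrderId is not accepted where a UserId is required
loadUser(orderId);
```

The brand is still a compile-time belief: the constructors use `as`, so any real validation of the underlying string would have to be added inside them.
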
## Practical Heuristics for JavaScript Developers Adopting TypeScript

If you already think well in JavaScript, the goal is not to become obsessed with types. The goal is to use types where they improve reasoning.

Useful heuristics:

- annotate module boundaries more than local variables
- prefer unions over collections of loosely related booleans
- use generics to preserve relationships, not to impress the compiler
- prefer `unknown` at untrusted boundaries and narrow deliberately
- share DTOs and contracts, not internal implementation models
- stop adding type complexity once the type layer becomes harder to understand than the runtime behavior

These heuristics keep TypeScript aligned with its real purpose: making JavaScript systems easier to change safely.

## Final Mental Model

The most accurate way to think about TypeScript is not "I learned a new language."

It is:

- JavaScript remains the runtime language
- TypeScript adds a compile-time model of assumptions
- that model makes large systems safer to change and easier to reason about
- the model is powerful, but it does not eliminate runtime uncertainty

When used well, TypeScript does not make code correct by magic.

It makes intent visible.

That is the real win. In large JavaScript systems, visible intent is what turns flexible code into maintainable engineering.

Vendored
BIN
Binary file not shown.