# File 1: Foundations of C++
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- explain how C++ source code becomes a running executable
|
||||
- reason about basic types, object storage, and memory layout
|
||||
- distinguish stack allocation from heap allocation in practical terms
|
||||
- use pointers and references without treating them as magic syntax
|
||||
- debug common low-level failures with a structured mental model
|
||||
|
||||
This file is the foundation for the rest of the guide. If later topics like RAII, smart pointers, iterators, or multithreading feel abstract, come back here first. C++ becomes much easier once you can picture what the compiler produces and what memory actually looks like at runtime.
|
||||
|
||||
## Why C++ Exists
|
||||
|
||||
C++ sits in an unusual position among mainstream languages. It gives you high-level abstractions such as classes, templates, exceptions, and a rich standard library, but it still lets you work close to the machine.
|
||||
|
||||
That combination is why C++ shows up in places where both abstraction and control matter:
|
||||
|
||||
- game engines that need tight performance and custom memory behavior
|
||||
- trading systems that care about latency and predictable execution
|
||||
- databases, compilers, browsers, and storage engines that manipulate large amounts of structured data
|
||||
- embedded and systems code where resource use must be explicit
|
||||
|
||||
The core idea is not just “fast language.” Many languages are fast in some contexts. C++ is valuable because it lets you choose where to pay for abstraction and where to avoid it.
|
||||
|
||||
## The Compilation Model
|
||||
|
||||
### Intuition
|
||||
|
||||
In Python or JavaScript, you can often treat “running the code” as a direct action. In C++, there is a build pipeline between the source you write and the machine code the CPU executes. Understanding that pipeline helps explain many common C++ issues:
|
||||
|
||||
- why header files exist
|
||||
- why template code often lives in headers
|
||||
- why link errors happen even when code compiles
|
||||
- why build systems matter so much in large codebases
|
||||
|
||||
### The Big Picture
|
||||
|
||||
```mermaid
flowchart LR
    A[Source files .cpp] --> B[Preprocessor]
    H[Header files .h .hpp] --> B
    B --> C[Compiler]
    C --> D[Object files .o]
    D --> E[Linker]
    L[Libraries] --> E
    E --> F[Executable or shared library]
```
|
||||
|
||||
### Preprocessing
|
||||
|
||||
Before the compiler sees your program, the preprocessor handles directives such as `#include`, `#define`, and `#if`, including the `#ifndef`/`#define` pairs used as include guards.
|
||||
|
||||
What this means internally:
|
||||
|
||||
- `#include` is essentially textual inclusion
|
||||
- macros are expanded before real compilation begins
|
||||
- conditional compilation can remove or include chunks of code based on flags
|
||||
|
||||
That is why headers can feel deceptively simple. A header is not linked in as a separate unit. Its contents are copied into each translation unit that includes it.
|
||||
|
||||
Example:
|
||||
|
||||
```cpp
// math_utils.h
int add(int a, int b);

// main.cpp
#include "math_utils.h"
```
|
||||
|
||||
The compiler effectively sees the declaration from the header pasted into `main.cpp` before actual parsing.
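
Because inclusion is textual, the same header can end up pasted into one translation unit more than once. The sketch below simulates that in a single file by writing out the contents of a hypothetical `point.h` twice. The header name, the guard macro `POINT_H`, and the `manhattan` function are all assumptions made for illustration.

```cpp
// Simulates what the preprocessor produces when a guarded header is
// included twice. point.h and POINT_H are hypothetical names.

// --- first textual copy of point.h ---
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
int manhattan(Point p);  // declaration only
#endif

// --- second textual copy of point.h (skipped: POINT_H is already defined) ---
#ifndef POINT_H
#define POINT_H
struct Point {  // without the guard this would be a redefinition error
    int x;
    int y;
};
int manhattan(Point p);
#endif

// --- contents of a hypothetical point.cpp ---
int manhattan(Point p) {
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}
```

Removing the guard lines from both copies should make the compiler reject the second `struct Point`, which is the failure include guards are meant to prevent.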
|
||||
|
||||
### Compilation
|
||||
|
||||
The compiler parses the preprocessed source, checks types, builds intermediate representations, optimizes code, and emits object files.
|
||||
|
||||
A `.cpp` file plus all text included into it after preprocessing becomes a translation unit.
|
||||
|
||||
Practical consequence:
|
||||
|
||||
- syntax errors, type errors, and many template errors are compilation-time issues
|
||||
- each translation unit is compiled independently
|
||||
- the compiler only knows what declarations are visible in that translation unit
|
||||
|
||||
### Linking
|
||||
|
||||
The linker resolves symbol references across object files and libraries.
|
||||
|
||||
If you declare a function in a header but forget to provide the definition in a compiled source file, compilation may succeed while linking fails.
|
||||
|
||||
Example:
|
||||
|
||||
```cpp
// declared
int compute();

// used
int main() {
    return compute();
}
```
|
||||
|
||||
If no compiled object file contains a matching definition of `compute`, the linker reports an unresolved symbol.
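
As a minimal sketch of the difference, the block below keeps the declaration, a caller, and the definition in one file. The `run` helper and the return value `42` are assumptions added for illustration; in a real project the definition would normally live in a separate `.cpp` file.

```cpp
// Declaration: enough for the compiler to type-check calls to compute().
int compute();

// A caller that compiles with only the declaration in view.
int run() {
    return compute();
}

// Definition: what the linker needs. If this lived in another .cpp file
// that was never compiled or linked in, the build would fail at link time
// with an "undefined reference" or "unresolved external symbol" error.
int compute() {
    return 42;
}
```

Deleting the definition leaves `run` compilable on its own, which is why this kind of mistake only surfaces at the link step.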
|
||||
|
||||
### Practical Usage
|
||||
|
||||
This model matters constantly in real systems:
|
||||
|
||||
- large codebases use headers to expose interfaces and source files to hide implementation
|
||||
- build time can explode if headers pull in too much code
|
||||
- libraries are distributed as headers plus compiled binaries or as header-only template libraries
|
||||
- ABI and symbol compatibility matter when separate teams ship shared libraries
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- confusing compile errors with link errors
|
||||
- putting non-inline function definitions in headers and causing multiple definition errors
|
||||
- overusing macros when constants, `constexpr`, or templates would be safer
|
||||
- including large dependency trees in headers, which slows builds and increases coupling
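
To make the macro pitfall concrete, here is a small sketch comparing a function-like macro with a `constexpr` function. The names `SQUARE_MACRO` and `square` are hypothetical and exist only for this comparison.

```cpp
// A function-like macro is pure text substitution, so operator
// precedence can silently change the result.
#define SQUARE_MACRO(x) x * x

// A constexpr function is type-checked and evaluates its argument once.
constexpr int square(int x) {
    return x * x;
}

// SQUARE_MACRO(1 + 2) expands to 1 + 2 * 1 + 2, which is 5, not 9.
constexpr int macro_result = SQUARE_MACRO(1 + 2);
constexpr int function_result = square(1 + 2);

static_assert(macro_result == 5, "the macro expands textually");
static_assert(function_result == 9, "the function sees a single value, 3");
```

Wrapping the macro body in parentheses would fix this particular case, but the `constexpr` function also respects scopes and types, which a macro cannot.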
|
||||
|
||||
## Variables, Types, and Object Storage
|
||||
|
||||
### Intuition
|
||||
|
||||
A variable in C++ is not “just a name.” It is usually a named object with a type, storage duration, alignment requirements, and a region of memory associated with it.
|
||||
|
||||
The type system tells both the compiler and the reader what operations are legal and how many bytes an object likely occupies.
|
||||
|
||||
### What a Type Really Means
|
||||
|
||||
A C++ type typically determines:
|
||||
|
||||
- size, though this can vary by platform
|
||||
- alignment requirements
|
||||
- how the value is interpreted in memory
|
||||
- what operations are available
|
||||
- construction and destruction behavior for user-defined types
|
||||
|
||||
Consider:
|
||||
|
||||
```cpp
int count = 42;
double ratio = 0.5;
char flag = 'Y';
```
|
||||
|
||||
These values are all just bits in memory, but the type tells the compiler how to read and manipulate those bits.
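
Size and alignment can be inspected directly with `sizeof` and `alignof`. Because most sizes are implementation-defined, this sketch only asserts relationships the language guarantees; the `Sample` struct is a hypothetical type introduced to show padding.

```cpp
#include <cstddef>

// Only sizeof(char) has a fixed value; the rest varies by platform.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(int) >= sizeof(short), "int is at least as wide as short");
static_assert(alignof(double) <= sizeof(double), "size is a multiple of alignment");

// The compiler may insert padding after `flag` so that `ratio` is aligned.
struct Sample {
    char flag;
    double ratio;
};

constexpr std::size_t sample_size = sizeof(Sample);
constexpr std::size_t members_size = sizeof(char) + sizeof(double);
static_assert(sample_size >= members_size, "padding can only add bytes");
```

On many 64-bit desktop platforms `sizeof(Sample)` comes out larger than the sum of its members, but the exact number is up to the implementation.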
|
||||
|
||||
### Value vs Representation
|
||||
|
||||
One useful systems-level habit is to separate a value from its representation.
|
||||
|
||||
For example, an `int` stores a signed integer value, but underneath it is represented in binary with a platform-defined size, usually 32 bits on modern desktop/server platforms. A pointer stores an address value, but underneath it is also just bits.
|
||||
|
||||
This distinction matters when you debug memory corruption. The CPU does not know “this is a tree node” in some abstract sense. It only sees instructions and bytes. The meaning comes from your program's types and the compiler's generated code.
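
One way to see the split between value and representation is to copy an object's bytes into an integer of the same size. The helper `bits_of` below is a hypothetical name, and the sketch assumes a 32-bit IEEE 754 `float`, which the `static_assert` lines check at compile time.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");

// Copies the object representation of a float into an integer.
// std::memcpy is a portable way to do this without undefined behavior.
std::uint32_t bits_of(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

Under those assumptions, `bits_of(1.0f)` yields `0x3F800000`: the same 32 bits that would mean 1065353216 if read as an unsigned integer.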
|
||||
|
||||
### Storage Duration
|
||||
|
||||
Every object in C++ has a storage duration. At a practical level, that answers: when does this object come into existence, and when does its storage stop being valid?
|
||||
|
||||
The main categories are:
|
||||
|
||||
- automatic storage duration: usually local variables created when a scope is entered
|
||||
- static storage duration: global variables and `static` locals that live for the life of the program
|
||||
- dynamic storage duration: objects created explicitly on the heap, typically with `new` or via allocators
|
||||
|
||||
Later, RAII and smart pointers will build directly on this idea.
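
The three categories can be seen side by side in a short sketch. The names `calls_made`, `next_id`, and `make_boxed` are hypothetical, chosen only to label each kind of storage.

```cpp
#include <memory>

int calls_made = 0;  // static storage duration: lives for the whole program

int next_id() {
    static int last_id = 0;  // static storage duration: initialized once, keeps its value
    int step = 1;            // automatic storage duration: recreated on every call
    last_id += step;
    ++calls_made;
    return last_id;
}

std::unique_ptr<int> make_boxed(int value) {
    // dynamic storage duration: the int outlives this call and is released
    // when the returned unique_ptr is destroyed
    return std::unique_ptr<int>(new int(value));
}
```

Calling `next_id()` twice returns 1 and then 2, because `last_id` survives between calls while `step` does not.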
|
||||
|
||||
## Stack vs Heap
|
||||
|
||||
### Intuition
|
||||
|
||||
Beginners often memorize “stack is fast, heap is slow.” That is too shallow and often misleading.
|
||||
|
||||
The real difference is about lifetime management and allocation strategy.
|
||||
|
||||
- stack allocation is usually automatic and scoped
|
||||
- heap allocation is explicit or indirect and more flexible
|
||||
|
||||
### Mental Model
|
||||
|
||||
```mermaid
flowchart TB
    A[Program starts] --> B[Call main]
    B --> C[Create stack frame for main]
    C --> D[Call function]
    D --> E[Create another stack frame]
    E --> F[Return from function]
    F --> G[Frame removed automatically]
    C --> H[Heap objects may outlive function scope]
```
|
||||
|
||||
### Stack Allocation
|
||||
|
||||
Local variables inside a function usually live on the stack, though the exact implementation is up to the compiler and optimizer.
|
||||
|
||||
Example:
|
||||
|
||||
```cpp
void process() {
    int retries = 3;
    double threshold = 0.75;
}
```
|
||||
|
||||
Why it exists:
|
||||
|
||||
- function-local state is extremely common
|
||||
- scoped lifetimes are easy to manage automatically
|
||||
- creation and cleanup can often be handled without a general-purpose allocator
|
||||
|
||||
Internally, each function call usually gets a stack frame holding return information, saved registers, and local storage. When the function returns, that frame is popped.
|
||||
|
||||
Practical usage:
|
||||
|
||||
- temporary computation state
|
||||
- small fixed-size objects
|
||||
- ownership that should never outlive the current scope
|
||||
|
||||
Pitfalls:
|
||||
|
||||
- returning pointers or references to local variables
|
||||
- allocating very large arrays on the stack and causing stack overflow
|
||||
- assuming stack layout is fixed across compilers or optimization levels
|
||||
|
||||
### Heap Allocation
|
||||
|
||||
Heap allocation is used when an object's lifetime must outlive a scope, when size is only known at runtime, or when ownership must be transferred across components.
|
||||
|
||||
Example:
|
||||
|
||||
```cpp
int* value = new int(42);
delete value;
```
|
||||
|
||||
Internally, `new` usually asks an allocator for a chunk of dynamic memory, then constructs the object in that memory. `delete` destroys the object and releases the storage.
|
||||
|
||||
Practical usage:
|
||||
|
||||
- dynamic data structures such as graphs or trees
|
||||
- objects shared across subsystems
|
||||
- buffers sized from runtime input
|
||||
|
||||
Pitfalls:
|
||||
|
||||
- memory leaks from forgetting `delete`
|
||||
- double delete from freeing the same pointer twice
|
||||
- dangling pointers after deletion
|
||||
- heap fragmentation and allocator overhead in performance-sensitive systems
|
||||
|
||||
Important note: in modern C++, direct `new` and `delete` should be rare in application code. Prefer containers and smart pointers. You still need to understand heap behavior because the abstractions are built on top of it.
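
A minimal sketch of that preference, assuming a hypothetical `Node` type and two helper functions invented for illustration:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Node {
    int value;
};

// The unique_ptr owns the Node; no explicit delete is needed anywhere.
std::unique_ptr<Node> make_node(int value) {
    return std::unique_ptr<Node>(new Node{value});
}

// A buffer sized at runtime; the vector frees its heap storage itself.
std::vector<int> make_buffer(std::size_t count, int fill) {
    return std::vector<int>(count, fill);
}
```

Both functions still allocate on the heap. The difference is that cleanup is attached to the returned object, so a forgotten or duplicated `delete` is no longer possible at the call site. From C++14 onward, `std::make_unique<Node>()` is the more common way to create the owning pointer.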
|
||||
|
||||
## Pointers
|
||||
|
||||
### Intuition
|
||||
|
||||
A pointer is a value whose job is to hold the address of another object. That is all. It is powerful because it lets you refer to memory indirectly.
|
||||
|
||||
Pointers exist because systems software constantly needs indirect access:
|
||||
|
||||
- linked data structures
|
||||
- optional access to objects
|
||||
- efficient parameter passing without copying large objects
|
||||
- polymorphic behavior through base-class pointers
|
||||
- interaction with operating systems, hardware, and C APIs
|
||||
|
||||
### Basic Form
|
||||
|
||||
```cpp
int score = 99;
int* ptr = &score;
```
|
||||
|
||||
Here:
|
||||
|
||||
- `score` is an `int`
|
||||
- `&score` means “address of score”
|
||||
- `ptr` stores that address
|
||||
- `*ptr` means “the int stored at that address”
|
||||
|
||||
### Pointer Relationship Diagram
|
||||
|
||||
```mermaid
flowchart LR
    P[ptr] -->|stores address| S[score in memory]
    S --> V[99]
```
|
||||
|
||||
### How It Works Internally
|
||||
|
||||
On a 64-bit system, a pointer is commonly 8 bytes. The compiler tracks the pointed-to type because pointer arithmetic and dereferencing depend on that type.
|
||||
|
||||
For example, incrementing an `int*` advances by `sizeof(int)` bytes, not by 1 byte.
|
||||
|
||||
```cpp
int values[3] = {10, 20, 30};
int* p = values;
++p; // now points to values[1]
```
|
||||
|
||||
The compiler scales the increment according to the pointed-to type.
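
The scaling can be observed by measuring the distance between `p` and `p + 1` in bytes. The helpers `stride_in_bytes` and `sum` are hypothetical names used only in this sketch.

```cpp
#include <cstddef>

// Returns how many bytes separate p and p + 1 for an int pointer.
// p must point at an element of an array (or a single int object).
std::ptrdiff_t stride_in_bytes(const int* p) {
    const int* next = p + 1;  // one element further, not one byte further
    return reinterpret_cast<const char*>(next) - reinterpret_cast<const char*>(p);
}

// Walks a buffer with a moving pointer instead of an index.
int sum(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}
```

For an array `int values[3] = {10, 20, 30};`, `stride_in_bytes(values)` equals `sizeof(int)` and `sum(values, values + 3)` visits each element exactly once.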
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- traversal in low-level data structures
|
||||
- API boundaries that may accept nullable inputs
|
||||
- efficient manipulation of contiguous buffers
|
||||
- ownership and lifetime control in specialized libraries or allocators
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- dereferencing `nullptr`
|
||||
- dereferencing uninitialized pointers
|
||||
- using a pointer after the object it points to has been destroyed
|
||||
- confusing ownership with access: a pointer can point to something without owning it
|
||||
|
||||
That last point is critical. A raw pointer does not tell you who is responsible for deleting the object.
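
As a sketch of a pointer used purely for access, the hypothetical function below treats its parameter as an optional, non-owning view: it checks for `nullptr`, reads through the pointer, and never deletes it.

```cpp
// A raw pointer parameter used as an optional, non-owning view.
// The function reads through it but takes no responsibility for its lifetime.
int value_or(const int* maybe_value, int fallback) {
    if (maybe_value == nullptr) {
        return fallback;  // the caller passed "no object"
    }
    return *maybe_value;
}
```

The caller keeps ownership in both cases, for example `int x = 7; value_or(&x, 0);` or `value_or(nullptr, -1);`.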
|
||||
|
||||
## References
|
||||
|
||||
### Intuition
|
||||
|
||||
A reference is an alias to an existing object. It exists to make code safer and clearer than pointer-heavy interfaces when nullability and reseating are not needed.
|
||||
|
||||
Example:
|
||||
|
||||
```cpp
void increment(int& value) {
    ++value;
}
```
|
||||
|
||||
### Why References Exist
|
||||
|
||||
Without references, you would often pass pointers just to avoid copying objects. But pointers imply optionality and manual dereferencing.
|
||||
|
||||
References express a stronger contract:
|
||||
|
||||
- this function expects a valid object
|
||||
- there is no need for null checks as part of normal usage
|
||||
- the alias should behave like the original object
|
||||
|
||||
### Internal View
|
||||
|
||||
At the machine level, a reference is often implemented similarly to a pointer, but the language treats it differently.
|
||||
|
||||
Key properties:
|
||||
|
||||
- must be initialized when created
|
||||
- cannot be reseated to refer to another object
|
||||
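- cannot be null in a program with defined behavior; creating a "null reference" requires dereferencing a null pointer, which is already undefined behavior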
|
||||
- use normal object syntax instead of pointer syntax
|
||||
|
||||
```mermaid
flowchart LR
    R[ref] -->|alias of| X[x]
```
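
The "cannot be reseated" property is easiest to see next to the pointer equivalent. Both functions below are hypothetical demonstrations written for this comparison.

```cpp
// Assigning through a reference writes to the original object;
// it never makes the reference refer to something else.
int demo_alias() {
    int x = 1;
    int y = 2;
    int& ref = x;  // ref is now another name for x
    ref = y;       // copies y's value into x; ref still aliases x
    y = 99;        // does not affect x or ref
    return x;
}

// The same steps with a pointer, which can be reseated.
int demo_reseat() {
    int x = 1;
    int y = 2;
    int* ptr = &x;
    ptr = &y;      // ptr now points at y; x is untouched
    *ptr = 5;      // writes to y
    return x * 10 + y;
}
```

`demo_alias()` returns 2 because `ref = y` modified `x`, while `demo_reseat()` returns 15 because `x` stayed 1 and `y` became 5.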
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- passing large objects efficiently without copying
|
||||
- operator overloading and fluent APIs
|
||||
- returning aliases to subobjects when lifetime is guaranteed
|
||||
|
||||
### Pitfalls and Misconceptions
|
||||
|
||||
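- a reference is not an independent object that manages its target's lifetime; apart from the special case of a temporary bound directly to a `const` reference, it does not keep the referenced object alive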
|
||||
- returning a reference to a local variable is still invalid
|
||||
- “references are always safer than pointers” is too simplistic; pointers are the right tool when optionality, reseating, or explicit low-level behavior is required
|
||||
|
||||
## Const Correctness
|
||||
|
||||
### Intuition
|
||||
|
||||
`const` is one of the cheapest ways to make C++ code easier to reason about. It restricts mutation and therefore reduces the number of possible program states.
|
||||
|
||||
### Practical Examples
|
||||
|
||||
```cpp
#include <string>

void print(const std::string& name);

const int limit = 100;
```
|
||||
|
||||
Why it matters in real systems:
|
||||
|
||||
- APIs become clearer about who is allowed to modify data
|
||||
- the compiler can catch accidental writes
|
||||
- reviewers can reason more quickly about ownership and side effects
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- confusing `const int* p` with `int* const p`
|
||||
- using `const` inconsistently across interfaces
|
||||
- assuming `const` automatically implies thread safety or deep immutability
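
The first pitfall is easier to remember with both forms in one place. The function name `read_and_retarget` and its local variables are hypothetical. The commented-out lines are the ones a compiler would reject.

```cpp
int read_and_retarget() {
    int a = 1;
    int b = 2;

    const int* view = &a;   // pointer to const int: cannot write through it
    view = &b;              // but the pointer itself may be retargeted
    // *view = 10;          // error: the pointed-to int is read-only via view

    int* const fixed = &a;  // const pointer to int: cannot be retargeted
    *fixed = 10;            // but the int it points to may be modified
    // fixed = &b;          // error: fixed itself is const

    return *view + *fixed;  // b + a == 2 + 10
}
```

A common reading aid is to read the declaration right to left: `const int* view` is "pointer to int that is const", and `int* const fixed` is "const pointer to int".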
|
||||
|
||||
## Arrays, Decay, and Basic Memory Layout
|
||||
|
||||
### Intuition
|
||||
|
||||
C++ inherits much of C's memory model. Arrays are contiguous blocks of elements, which is why they are fast for indexed access and cache-friendly iteration.
|
||||
|
||||
```cpp
int values[4] = {1, 2, 3, 4};
```
|
||||
|
||||
The elements are stored adjacent in memory. That contiguity is why pointer arithmetic and array indexing are closely related.
|
||||
|
||||
### Under the Hood
|
||||
|
||||
`values[i]` is conceptually equivalent to `*(values + i)`.
|
||||
|
||||
This is powerful, but it is also why out-of-bounds access is dangerous. C++ does not automatically check bounds for raw arrays.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- numerical buffers
|
||||
- serialization code
|
||||
- high-performance loops
|
||||
- interop with C libraries
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- array-to-pointer decay in function parameters
|
||||
- buffer overflows
|
||||
- assuming stack arrays automatically know their size when passed to a function
|
||||
|
||||
In most application code, prefer `std::array` for fixed-size arrays and `std::vector` for dynamic arrays. You will still see raw arrays in systems code, embedded code, and performance-critical paths.
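
The decay pitfall and two ways around it can be sketched as follows. All three function names are hypothetical.

```cpp
#include <array>
#include <cstddef>

// The parameter is a plain pointer: the array's length is lost,
// so the caller has to pass it separately.
int sum_decayed(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Taking the array by reference keeps N as part of the type.
template <std::size_t N>
std::size_t length_of(const int (&)[N]) {
    return N;
}

// std::array carries its size with it and does not decay.
template <std::size_t N>
int sum_array(const std::array<int, N>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

With `int raw[4] = {1, 2, 3, 4};`, `length_of(raw)` recovers 4 at compile time, whereas `sum_decayed(raw, 4)` trusts whatever count the caller supplies.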
|
||||
|
||||
## A Debugging Mental Model
|
||||
|
||||
### Intuition
|
||||
|
||||
Low-level bugs in C++ often feel mysterious only when you lack a runtime model. Most of the time, they reduce to one of a few categories:
|
||||
|
||||
- invalid lifetime
|
||||
- invalid memory access
|
||||
- wrong ownership
|
||||
- incorrect assumptions about object state
|
||||
- data races in concurrent code
|
||||
|
||||
### A Useful Diagnostic Loop
|
||||
|
||||
When debugging a crash or corruption issue, ask these questions in order:
|
||||
|
||||
1. What object was accessed?
|
||||
2. Was it initialized?
|
||||
3. Is its lifetime still valid?
|
||||
4. Who owns it?
|
||||
5. Could memory nearby have been overwritten?
|
||||
6. Is the failure deterministic or timing-dependent?
|
||||
|
||||
That checklist is more valuable than memorizing debugger buttons.
|
||||
|
||||
### Common Failure Modes
|
||||
|
||||
#### Segmentation Faults
|
||||
|
||||
Usually caused by dereferencing an invalid address such as:
|
||||
|
||||
- `nullptr`
|
||||
- a dangling pointer
|
||||
- a wild pointer from uninitialized memory
|
||||
|
||||
#### Use-After-Free
|
||||
|
||||
You delete an object, but some pointer or reference still points to the old address. The address may still look valid for a while, which makes this class of bug subtle.
|
||||
|
||||
#### Stack Corruption
|
||||
|
||||
Often caused by out-of-bounds writes into local arrays or incorrect pointer arithmetic.
|
||||
|
||||
#### Memory Leaks
|
||||
|
||||
The program keeps allocating memory without freeing it. In long-running services, that becomes a production issue rather than just a test annoyance.
|
||||
|
||||
### Practical Tools
|
||||
|
||||
Real C++ debugging is easier when you use tooling, not just intuition:
|
||||
|
||||
- compiler warnings: start with strict warnings enabled
|
||||
- AddressSanitizer: catches use-after-free, buffer overflows, and more
|
||||
- UndefinedBehaviorSanitizer: catches many invalid language-level operations
|
||||
- Valgrind on supported platforms: useful for leaks and invalid accesses
|
||||
- debugger: inspect stack frames, variables, and memory addresses
|
||||
|
||||
Example build flags on Clang or GCC for local debugging:
|
||||
|
||||
```bash
-Wall -Wextra -Wpedantic -fsanitize=address,undefined -g
```
|
||||
|
||||
### Misconception to Avoid
|
||||
|
||||
“If it only crashes sometimes, the code is almost correct.”
|
||||
|
||||
In C++, nondeterministic behavior is often a sign of undefined behavior, not a minor bug. Once you have UB, the optimizer and runtime can produce very different outcomes from one build or machine to another.
|
||||
|
||||
## Foundation Patterns That Matter Later
|
||||
|
||||
Several later C++ ideas are really lifetime-management patterns built on the concepts above:
|
||||
|
||||
- constructors and destructors manage object setup and cleanup
|
||||
- RAII ties resource lifetime to scope lifetime
|
||||
- smart pointers model ownership on top of heap allocation
|
||||
- containers hide raw memory management while preserving performance properties
|
||||
- concurrency primitives rely on precise reasoning about storage and object lifetime
|
||||
|
||||
If you can already picture stack frames, heap allocation, pointer indirection, and the compile-link pipeline, you are ready for object-oriented and modern C++ design.
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain these clearly in an interview without hiding behind buzzwords:
|
||||
|
||||
- the difference between compilation and linking
|
||||
- why headers can increase build time and coupling
|
||||
- what stack and heap allocation really mean in terms of lifetime
|
||||
- the difference between a pointer and a reference
|
||||
- what causes dangling pointers and use-after-free bugs
|
||||
- why `const` improves API design and reasoning
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The next file builds on these memory and lifetime foundations to explain classes, constructors, destructors, inheritance, and polymorphism. The key shift is this: C++ object-oriented features are not separate from the memory model. They are layered on top of it.
|
||||
|
||||
# File 2: Core Object-Oriented C++
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- explain what a C++ class actually represents in memory and in code
|
||||
- reason about constructors, destructors, and object lifetime without hand-waving
|
||||
- use encapsulation and abstraction to protect invariants
|
||||
- distinguish inheritance from polymorphism and understand when each is appropriate
|
||||
- recognize common object-oriented mistakes that cause subtle bugs in production C++
|
||||
|
||||
This file builds directly on the foundations from File 1. C++ object-oriented features are not separate from the memory model. A class is still a concrete object layout plus functions that operate on it.
|
||||
|
||||
## Why Object-Oriented Features Exist in C++
|
||||
|
||||
### Intuition
|
||||
|
||||
As programs grow, raw functions and primitive types stop being enough. You need a way to keep data and the rules for using that data together.
|
||||
|
||||
That is the heart of classes in C++:
|
||||
|
||||
- package state with behavior
|
||||
- enforce invariants at the boundary
|
||||
- model domain concepts clearly
|
||||
- make ownership and lifetime more explicit
|
||||
|
||||
In real systems, object-oriented design is less about textbook hierarchy diagrams and more about making illegal states harder to represent.
|
||||
|
||||
## Classes and Objects
|
||||
|
||||
### Intuition
|
||||
|
||||
A class is a user-defined type. It describes:
|
||||
|
||||
- what data an object holds
|
||||
- what operations are allowed on that data
|
||||
- what rules govern creation, use, and destruction
|
||||
|
||||
An object is an instance of that type occupying real storage at runtime.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
class BankAccount {
public:
    explicit BankAccount(double starting_balance)
        : balance_(starting_balance) {}

    void deposit(double amount) {
        balance_ += amount;
    }

    bool withdraw(double amount) {
        if (amount > balance_) {
            return false;
        }
        balance_ -= amount;
        return true;
    }

    double balance() const {
        return balance_;
    }

private:
    double balance_;
};
```
|
||||
|
||||
### What Happens Internally
|
||||
|
||||
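At runtime, an object usually contains only its data members (plus hidden bookkeeping such as a vtable pointer when the class has virtual functions). Member functions are not copied into every object. They are compiled as ordinary functions that receive a hidden pointer to the object, which is available inside the function as `this`.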
|
||||
|
||||
Conceptually, this:
|
||||
|
||||
```cpp
account.deposit(50.0);
```
|
||||
|
||||
behaves like:
|
||||
|
||||
```cpp
deposit(&account, 50.0);
```
|
||||
|
||||
That is not exact source-level syntax, but it is the right mental model.
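
To make the mental model concrete, the sketch below writes the same operation twice: once as a member function and once as a free function taking an explicit object pointer. `Account`, `AccountData`, and `account_deposit` are hypothetical names, and real compilers are not required to generate code in exactly this shape.

```cpp
// Member-function version.
class Account {
public:
    void deposit(double amount) { balance_ += amount; }
    double balance() const { return balance_; }

private:
    double balance_ = 0.0;
};

// Roughly the same thing spelled out by hand: plain data plus a free
// function that takes the object's address as an explicit first parameter.
struct AccountData {
    double balance = 0.0;
};

void account_deposit(AccountData* self, double amount) {
    self->balance += amount;
}
```

Both types hold a single `double`, so their objects have the same size; the member functions of `Account` add nothing to each instance.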
|
||||
|
||||
### Object Layout Mental Model
|
||||
|
||||
```mermaid
flowchart LR
    A[BankAccount object] --> B[balance_ : double]
    C[Member functions] --> D[operate using this pointer]
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
Classes are useful when data has invariants:
|
||||
|
||||
- account balances should not go negative unless explicitly allowed
|
||||
- sockets should not be used after closure
|
||||
- file handles must be released exactly once
|
||||
- caches should hide eviction details behind a stable API
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- making everything public and losing invariant protection
|
||||
- creating “data bag” classes that do not meaningfully model behavior
|
||||
- assuming classes are automatically heap-allocated; in C++, class objects can live on the stack, in static storage, or on the heap
|
||||
|
||||
## Access Control, Encapsulation, and Abstraction
|
||||
|
||||
### Intuition
|
||||
|
||||
Encapsulation is about protecting internal state from invalid use. Abstraction is about exposing the right conceptual interface while hiding irrelevant details.
|
||||
|
||||
These are related but not identical.
|
||||
|
||||
- encapsulation protects data and invariants
|
||||
- abstraction reduces cognitive load for callers
|
||||
|
||||
### How It Works
|
||||
|
||||
Access specifiers such as `public`, `private`, and `protected` control what code may access certain members.
|
||||
|
||||
In the `BankAccount` example, `balance_` is private. That forces all mutations to go through functions that can enforce rules.
|
||||
|
||||
### Why This Matters in Real Systems
|
||||
|
||||
Without encapsulation, every caller can put an object into a bad state. In a large codebase, that turns local correctness into a global burden.
|
||||
|
||||
Good class design moves validation and lifecycle rules into one place so they are not reimplemented badly in ten different subsystems.
|
||||
|
||||
### Example: Protecting an Invariant
|
||||
|
||||
```cpp
#include <stdexcept>

class Percentage {
public:
    explicit Percentage(int value) {
        // Reject out-of-range input before the object becomes usable.
        if (value < 0 || value > 100) {
            throw std::out_of_range("percentage must be between 0 and 100");
        }
        value_ = value;
    }

    int value() const {
        return value_;
    }

private:
    int value_;
};
```
|
||||
|
||||
If `value_` were public, every call site would need to remember the rule. That does not scale.
|
||||
|
||||
### Common Misconception
|
||||
|
||||
“Encapsulation means getters and setters for everything.”
|
||||
|
||||
No. Blind getters and setters often expose implementation details without preserving invariants. The better question is: what operations make sense for this domain object?
|
||||
|
||||
## Constructors
|
||||
|
||||
### Intuition
|
||||
|
||||
Constructors exist because objects often need to establish a valid initial state before they can be used safely.
|
||||
|
||||
This is not cosmetic. In C++, an object can represent a real system resource or a nontrivial invariant. Construction is where you set that up.
|
||||
|
||||
### Types of Constructors
|
||||
|
||||
Common constructor categories include:
|
||||
|
||||
- default constructor: creates an object with no arguments
|
||||
- parameterized constructor: creates an object with explicit setup values
|
||||
- copy constructor: creates a new object from an existing object
|
||||
- move constructor: transfers resources from a temporary or expiring object
|
||||
|
||||
Copy and move are covered in depth in File 3. For now, focus on the fact that constructors are part of an object's lifecycle contract.
|
||||
|
||||
### Initialization Lists
|
||||
|
||||
Use member initializer lists when constructing members:
|
||||
|
||||
```cpp
#include <string>
#include <utility>

class User {
public:
    User(std::string name, int id)
        : name_(std::move(name)), id_(id) {}

private:
    std::string name_;
    int id_;
};
```
|
||||
|
||||
Why they exist:
|
||||
|
||||
- members are constructed before the constructor body runs
|
||||
- some types must be initialized rather than assigned later
|
||||
- initializer lists avoid unnecessary work
|
||||
|
||||
Internal detail:
|
||||
|
||||
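If you assign inside the constructor body, a class-type member such as `std::string` is first default-constructed and then assigned to, while a built-in member such as `int` is left uninitialized until the assignment runs. Initializer lists construct each member directly in its final state.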
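
Some members cannot be assigned at all, which makes the initializer list mandatory rather than just efficient. The `Session` class below is a hypothetical example with a `const` member and a reference member.

```cpp
#include <string>
#include <utility>

class Session {
public:
    // `id_` is const and `log_` is a reference: neither can be assigned
    // in the constructor body, so both must appear in the initializer list.
    Session(int id, std::string& log, std::string user)
        : id_(id), log_(log), user_(std::move(user)) {
        log_ += "open:" + user_ + ";";
    }

    int id() const { return id_; }

private:
    const int id_;
    std::string& log_;
    std::string user_;
};
```

The initializer list is written in the same order as the member declarations, which matches the order in which the members are actually initialized.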
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- initialize references and `const` members
|
||||
- pass dependencies explicitly
|
||||
- guarantee a valid object immediately after construction
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- doing too much work in constructors, especially work that can fail in complex ways
|
||||
- relying on member declaration order incorrectly; members are initialized in the order they are declared in the class, not the order written in the initializer list
|
||||
- forgetting `explicit` on single-argument constructors that should not allow implicit conversion
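
The effect of `explicit` can be checked at compile time. In this sketch, `Meters` and `LooseMeters` are hypothetical types that differ only in that keyword.

```cpp
#include <type_traits>

struct Meters {
    explicit Meters(double v) : value(v) {}
    double value;
};

struct LooseMeters {
    LooseMeters(double v) : value(v) {}  // allows implicit conversion from double
    double value;
};

double to_double(Meters m) { return m.value; }
double loose_to_double(LooseMeters m) { return m.value; }

// to_double(3.0) does not compile; the caller must write to_double(Meters{3.0}).
static_assert(!std::is_convertible<double, Meters>::value,
              "explicit blocks implicit conversion");
static_assert(std::is_convertible<double, LooseMeters>::value,
              "a non-explicit constructor allows it");
```

`loose_to_double(2.5)` compiles and silently builds a `LooseMeters`, which is convenient until a plain number is passed where a length was not intended.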
|
||||
|
||||
## Destructors
|
||||
|
||||
### Intuition
|
||||
|
||||
If constructors establish a valid object, destructors clean it up. They exist because C++ objects often manage resources beyond plain memory:
|
||||
|
||||
- file descriptors
|
||||
- mutexes
|
||||
- sockets
|
||||
- memory buffers
|
||||
- database handles
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

class FileLogger {
public:
    explicit FileLogger(const std::string& path) {
        file_ = std::fopen(path.c_str(), "a");
        if (!file_) {
            throw std::runtime_error("failed to open log file");
        }
    }

    ~FileLogger() {
        if (file_) {
            std::fclose(file_);
        }
    }

    // Copying would make two objects close the same FILE*, so forbid it.
    FileLogger(const FileLogger&) = delete;
    FileLogger& operator=(const FileLogger&) = delete;

private:
    std::FILE* file_ = nullptr;
};
```
|
||||
|
||||
### Object Lifecycle Diagram
|
||||
|
||||
```mermaid
flowchart LR
    A[Storage acquired] --> B[Constructor runs]
    B --> C[Object is usable]
    C --> D[Destructor runs]
    D --> E[Storage released]
```
|
||||
|
||||
### Internal View
|
||||
|
||||
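When an object with automatic storage duration goes out of scope, its destructor runs automatically. Local objects in the same scope are destroyed in reverse order of construction, and the same reverse order applies to the members of a class object.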
|
||||
|
||||
This reverse unwinding is critical. It is how C++ guarantees cleanup during normal scope exit and exception propagation.
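
The reverse order is easy to observe with a small tracing type. `Tracer` and `trace_scope` are hypothetical names introduced for this sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records construction and destruction events in a shared log.
class Tracer {
public:
    Tracer(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("ctor " + name_);
    }
    ~Tracer() { log_.push_back("dtor " + name_); }

private:
    std::vector<std::string>& log_;
    std::string name_;
};

std::vector<std::string> trace_scope() {
    std::vector<std::string> log;
    {
        Tracer first(log, "first");
        Tracer second(log, "second");
    }  // second is destroyed before first
    return log;
}
```

The returned log reads `ctor first`, `ctor second`, `dtor second`, `dtor first`: whatever was built last is torn down first, so later objects can safely depend on earlier ones.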
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- releasing OS resources
|
||||
- flushing buffered output
|
||||
- unlocking a mutex through a guard object
|
||||
- rolling back or committing scoped transactions
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- performing work in a destructor that can throw exceptions
|
||||
- forgetting that base and member destructors run automatically
|
||||
- assuming destruction order across unrelated objects is obvious
|
||||
|
||||
## RAII: Resource Acquisition Is Initialization
|
||||
|
||||
### Intuition
|
||||
|
||||
RAII is one of the most important ideas in C++. It ties resource lifetime to object lifetime.
|
||||
|
||||
The idea is simple:
|
||||
|
||||
- acquire the resource in the constructor
|
||||
- release it in the destructor
|
||||
- let scope determine cleanup
|
||||
|
||||
This is why modern C++ code can be both expressive and safe without a garbage collector.
|
||||
|
||||
### Why It Exists
|
||||
|
||||
Manual cleanup does not scale well in the presence of:
|
||||
|
||||
- early returns
|
||||
- exceptions
|
||||
- multiple code paths
|
||||
- partial initialization
|
||||
|
||||
RAII turns cleanup into a language-level guarantee rather than a discipline you hope every engineer remembers.
|
||||
|
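To make the list above concrete, here is a small sketch in which one guard object covers a normal return, an early return, and an exception. `HandleGuard`, `process`, and the `mode` parameter are hypothetical names invented for this illustration; the "resource" is just a counter of open handles.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a counter of currently open handles.
class HandleGuard {
public:
    explicit HandleGuard(int& open_count) : open_count_(open_count) {
        ++open_count_;  // acquire
    }
    ~HandleGuard() { --open_count_; }  // release, on every exit path

    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;

private:
    int& open_count_;
};

// mode 0: normal return, mode 1: early return, mode 2: exception.
int process(int mode, int& open_count) {
    HandleGuard guard(open_count);
    if (mode == 1) {
        return -1;
    }
    if (mode == 2) {
        throw std::runtime_error("failed midway");
    }
    return 0;
}

// Exercises all three paths and reports how many handles are still open.
int open_after_all_paths() {
    int open_count = 0;
    process(0, open_count);
    assert(open_count == 0);
    process(1, open_count);
    assert(open_count == 0);
    try {
        process(2, open_count);
    } catch (const std::runtime_error&) {
        // The guard already released the handle during stack unwinding.
    }
    return open_count;
}
```

No path in `process` mentions cleanup explicitly, which is the point: adding a fourth exit path later cannot introduce a leak.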
||||
### Example: Mutex Lock Guard
|
||||
|
||||
```cpp
|
||||
void update(std::mutex& mutex, int& value) {
|
||||
std::lock_guard<std::mutex> lock(mutex);
|
||||
++value;
|
||||
}
|
||||
```
|
||||
|
||||
The mutex is locked when `lock` is constructed and automatically unlocked when `lock` goes out of scope.
|
||||
|
||||
### Real-World Usage
|
||||
|
||||
- file wrappers
|
||||
- transaction guards
|
||||
- scoped timers
|
||||
- custom allocator guards
|
||||
- lock management
|
||||
|
||||
### Misconception to Avoid
|
||||
|
||||
“RAII is only about memory.”
|
||||
|
||||
No. RAII is about any resource that must be released reliably.
|
||||
|
||||
## Inheritance
|
||||
|
||||
### Intuition
|
||||
|
||||
Inheritance exists to model an “is-a” relationship when a derived type should be usable where a base type is expected.
|
||||
|
||||
Used well, inheritance enables substitution and shared interfaces. Used poorly, it creates fragile hierarchies and confusing coupling.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
class Shape {
|
||||
public:
|
||||
virtual ~Shape() = default;
|
||||
virtual double area() const = 0;
|
||||
};
|
||||
|
||||
class Rectangle : public Shape {
|
||||
public:
|
||||
Rectangle(double width, double height)
|
||||
: width_(width), height_(height) {}
|
||||
|
||||
double area() const override {
|
||||
return width_ * height_;
|
||||
}
|
||||
|
||||
private:
|
||||
double width_;
|
||||
double height_;
|
||||
};
|
||||
```
|
||||
|
||||
### Internal View
|
||||
|
||||
A derived object contains a base subobject plus its own members.
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Rectangle object] --> B[Shape base subobject]
|
||||
A --> C[width_]
|
||||
A --> D[height_]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- plugin interfaces
|
||||
- GUI widget hierarchies
|
||||
- polymorphic simulation entities
|
||||
- abstractions over hardware or platform-specific implementations
|
||||
|
||||
### When Not to Use It
|
||||
|
||||
If you only want code reuse, composition is often better. Inheritance should model substitutability, not just convenience.
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- deep hierarchies that are hard to reason about
|
||||
- using inheritance for implementation reuse where composition is cleaner
|
||||
- base classes that expose too many assumptions about derived classes
|
||||
- object slicing when derived objects are copied into base objects by value
|
||||
|
||||
## Polymorphism
|
||||
|
||||
### Intuition
|
||||
|
||||
Polymorphism means “same interface, different implementation.” In C++, there are two major forms:
|
||||
|
||||
- runtime polymorphism: usually through virtual functions and base-class references or pointers
|
||||
- compile-time polymorphism: usually through templates or function overloading
|
||||
|
||||
Both matter in interviews and production code, but they solve different problems.
|
||||
|
||||
## Runtime Polymorphism
|
||||
|
||||
### How It Works
|
||||
|
||||
With `virtual` functions, the call target is chosen at runtime based on the dynamic type of the object.
|
||||
|
||||
```cpp
|
||||
void print_area(const Shape& shape) {
|
||||
std::cout << shape.area() << '\n';
|
||||
}
|
||||
```
|
||||
|
||||
If `shape` refers to a `Rectangle`, `Rectangle::area()` runs.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
The C++ standard does not prescribe a mechanism, but nearly every implementation uses the same model:
|
||||
|
||||
- polymorphic objects contain a hidden pointer, often called a vptr
|
||||
- that pointer refers to a virtual function table, or vtable
|
||||
- virtual calls use the vtable to resolve the correct function at runtime
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[shape reference] --> B[Rectangle object]
|
||||
B --> C[vptr]
|
||||
C --> D[vtable]
|
||||
D --> E[Rectangle::area]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- runtime-selected backends
|
||||
- plugin systems
|
||||
- interface-driven architecture across modules
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- extra indirection
|
||||
- usually one pointer-sized overhead per polymorphic object
|
||||
- reduced inlining opportunities in some cases
|
||||
|
||||
These costs are often acceptable, but they are not free.
|
||||
|
||||
## Compile-Time Polymorphism
|
||||
|
||||
### Intuition
|
||||
|
||||
Sometimes you want generic behavior without runtime overhead. Templates enable this by generating type-specific code at compile time.
|
||||
|
||||
```cpp
|
||||
template <typename T>
|
||||
T max_value(T a, T b) {
|
||||
return a < b ? b : a;
|
||||
}
|
||||
```
|
||||
|
||||
### Why It Exists
|
||||
|
||||
The standard library relies heavily on compile-time polymorphism because it allows generic, highly optimizable code.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- STL algorithms and containers
|
||||
- numeric and serialization libraries
|
||||
- policy-based design
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- template errors can be verbose and hard to read
|
||||
- heavy template usage can increase compile times
|
||||
- overengineering generic code can make APIs harder to understand
|
||||
|
||||
## Object Slicing
|
||||
|
||||
### Intuition
|
||||
|
||||
Object slicing happens when a derived object is copied into a base object by value. The derived-specific part is discarded.
|
||||
|
||||
```cpp
|
||||
Rectangle rectangle(3.0, 4.0);
|
||||
Shape shape = rectangle; // invalid here because Shape is abstract, but slicing is the general idea
|
||||
```
|
||||
|
||||
In non-abstract hierarchies, this creates a new base object that no longer behaves like the original derived object.
|
||||
|
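Because `Shape` is abstract, the snippet above cannot compile. A sketch with a concrete base class shows the effect itself. `Animal`, `Dog`, and the helper functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// Pass by value: the argument is copied into a plain Animal, so the Dog
// part (including its override) is sliced away.
std::string by_value(Animal animal) { return animal.sound(); }

// Pass by reference: no copy is made, so the dynamic type is still Dog.
std::string by_reference(const Animal& animal) { return animal.sound(); }

void slicing_demo() {
    Dog dog;
    assert(by_value(dog) == "...");       // sliced copy behaves like Animal
    assert(by_reference(dog) == "woof");  // original object behaves like Dog
}
```

The same thing happens when a `Dog` is stored in a `std::vector<Animal>`: each element is a base-class copy, not the object that was pushed.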
||||
### Why It Matters
|
||||
|
||||
This bug appears when engineers store polymorphic objects by value instead of via pointers or references.
|
||||
|
||||
### Rule of Thumb
|
||||
|
||||
If you want polymorphism, use references or pointers to the base type, not base objects by value.
|
||||
|
||||
## Virtual Destructors
|
||||
|
||||
### Intuition
|
||||
|
||||
If a class is meant to be used polymorphically, it usually needs a virtual destructor.
|
||||
|
||||
Why:
|
||||
|
||||
- deleting a derived object through a base pointer must run the derived destructor first
|
||||
- otherwise the behavior is undefined; in practice the derived destructor is typically skipped, causing leaks or broken invariants
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
class Base {
|
||||
public:
|
||||
virtual ~Base() = default;
|
||||
};
|
||||
```
|
||||
|
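A short sketch of the well-defined case, assuming a hypothetical `Derived` type that sets a flag in its destructor so the cleanup is observable:

```cpp
#include <cassert>
#include <memory>

class Base {
public:
    virtual ~Base() = default;
};

// Hypothetical derived type: flips a flag when its destructor runs.
class Derived : public Base {
public:
    explicit Derived(bool& destroyed) : destroyed_(destroyed) {}
    ~Derived() override { destroyed_ = true; }

private:
    bool& destroyed_;
};

bool derived_cleanup_runs() {
    bool destroyed = false;
    {
        // The static type of the owner is Base, the dynamic type is Derived.
        std::unique_ptr<Base> owner = std::make_unique<Derived>(destroyed);
        assert(!destroyed);
    }  // delete through Base*: the virtual destructor dispatches to ~Derived
    return destroyed;
}
```

If `~Base` were not virtual, the same deletion through `Base*` would be undefined behavior, and the flag could not be relied on either way.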
||||
### Pitfall
|
||||
|
||||
Forgetting this is a classic interview question because it reflects whether you understand object destruction through base interfaces.
|
||||
|
||||
## Design Guidance for Real Systems
|
||||
|
||||
The most maintainable C++ systems usually follow these patterns:
|
||||
|
||||
- small classes with clear ownership boundaries
|
||||
- composition before inheritance
|
||||
- constructors that establish valid state immediately
|
||||
- destructors that make cleanup automatic and boring
|
||||
- polymorphism only where substitution is genuinely needed
|
||||
|
||||
Good C++ OOP is less about building clever hierarchies and more about making lifecycle and resource rules obvious.
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- what a class object contains at runtime
|
||||
- why initializer lists matter
|
||||
- what RAII solves that manual cleanup does not
|
||||
- the difference between inheritance and polymorphism
|
||||
- how virtual dispatch works conceptually
|
||||
- why polymorphic base classes usually need virtual destructors
|
||||
- what object slicing is and how to avoid it
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The next file shifts from basic object lifetime to modern ownership and resource management. That is where raw pointers, smart pointers, move semantics, and the Rule of 0 or 3 or 5 all fit together.
|
||||
@@ -0,0 +1,438 @@
|
||||
# File 3: Memory Management and Modern C++
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- describe ownership clearly instead of saying “the pointer points there” and stopping
|
||||
- choose between raw pointers, references, and smart pointers based on lifetime semantics
|
||||
- explain copy vs move semantics with both intuition and internal mechanics
|
||||
- apply the Rule of 0, Rule of 3, and Rule of 5 in real code
|
||||
- design resource-managing types that behave predictably under exceptions and refactoring
|
||||
|
||||
This file builds on File 1 and File 2. Once you understand lifetime, construction, and destruction, modern C++ memory management becomes a set of ownership patterns rather than a pile of features.
|
||||
|
||||
## Why Modern C++ Changed Memory Management Style
|
||||
|
||||
### Intuition
|
||||
|
||||
Older C++ code often used raw `new` and `delete` directly. That approach exposes too much manual lifetime bookkeeping to everyday code.
|
||||
|
||||
Modern C++ tries to encode ownership in types so the compiler and API design help enforce the intended lifetime model.
|
||||
|
||||
The goal is not to hide memory. It is to make ownership explicit and failure-resistant.
|
||||
|
||||
### Ownership Vocabulary
|
||||
|
||||
Before discussing smart pointers, use precise terms:
|
||||
|
||||
- owning handle: responsible for cleanup
|
||||
- non-owning handle: can access an object but does not control its lifetime
|
||||
- exclusive ownership: exactly one owner at a time
|
||||
- shared ownership: multiple owners coordinate lifetime
|
||||
- observing reference: can see an object if it still exists, but does not keep it alive
|
||||
|
||||
This vocabulary matters in interviews and code reviews because “it works” is not enough. Engineers need to know who frees the resource and when.
|
||||
|
||||
## Raw Pointers Revisited
|
||||
|
||||
### Intuition
|
||||
|
||||
A raw pointer is best treated as a non-owning access mechanism unless documentation says otherwise.
|
||||
|
||||
Why this shift matters:
|
||||
|
||||
- a raw pointer by itself does not communicate ownership clearly
|
||||
- codebases that treat raw pointers as owning create leaks and double frees
|
||||
- most modern APIs reserve raw pointers for nullable or borrowed access
|
||||
|
||||
### Good Modern Interpretation
|
||||
|
||||
Use raw pointers when you need one of these semantics:
|
||||
|
||||
- optional access to an object
|
||||
- traversal without ownership transfer
|
||||
- interoperability with C APIs or low-level subsystems
|
||||
- custom memory systems where ownership is expressed elsewhere
|
||||
|
||||
### Pitfall
|
||||
|
||||
The problem is not that raw pointers are inherently bad. The problem is that ownership encoded only in comments is fragile.
|
||||
|
||||
## `std::unique_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::unique_ptr` represents exclusive ownership. One object owns the resource, and when that owner dies, the resource is released.
|
||||
|
||||
This is the closest high-level replacement for raw owning pointers.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
auto socket = std::make_unique<Socket>(config);
|
||||
|
||||
if (!socket->connect()) {
|
||||
return false;
|
||||
}
|
||||
```
|
||||
|
||||
No manual `delete` is needed. Cleanup happens automatically.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
A `unique_ptr<T>` usually contains:
|
||||
|
||||
- a raw pointer to `T`
|
||||
- optionally a deleter object
|
||||
|
||||
It is move-only, not copyable. That restriction is the entire point. The type system prevents accidental duplicate ownership.
|
||||
|
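The move-only rule is easiest to see in a transfer. The sketch below assumes a hypothetical `Buffer` type and helper functions; only the `std::unique_ptr` operations themselves are standard.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical payload type for the illustration.
struct Buffer {
    explicit Buffer(int size) : size(size) {}
    int size;
};

// The type system, not a convention, forbids duplicate ownership.
static_assert(!std::is_copy_constructible_v<std::unique_ptr<Buffer>>,
              "unique_ptr must not be copyable");

// Factory: the caller becomes the owner of the returned object.
std::unique_ptr<Buffer> make_buffer(int size) {
    return std::make_unique<Buffer>(size);
}

// Sink: taking the pointer by value means this function takes ownership.
int consume(std::unique_ptr<Buffer> buffer) {
    assert(buffer != nullptr);
    return buffer->size;
}  // buffer is destroyed here, which frees the Buffer

bool transfer_demo() {
    auto owner = make_buffer(64);
    auto new_owner = std::move(owner);             // ownership moves
    const bool source_empty = (owner == nullptr);  // moved-from unique_ptr is null
    const int size = consume(std::move(new_owner));
    return source_empty && new_owner == nullptr && size == 64;
}
```

A signature that takes `std::unique_ptr<T>` by value documents the ownership transfer in the type, which is harder to misread than a comment on a raw pointer parameter.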
||||
### Why It Exists
|
||||
|
||||
Exclusive ownership is extremely common:
|
||||
|
||||
- a service owns a cache
|
||||
- a tree node owns its children
|
||||
- a parser owns a token buffer
|
||||
- a component owns a resource handle
|
||||
|
||||
`unique_ptr` makes that ownership explicit and exception-safe.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- return heap objects from factories
|
||||
- store polymorphic objects in containers
|
||||
- model tree-shaped ownership, where every child has exactly one owning parent (a DAG node with several parents does not fit exclusive ownership)
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- copying is not allowed, so design APIs around moving or referencing
|
||||
- do not wrap stack objects in `unique_ptr`
|
||||
- avoid calling `release()` unless you are deliberately transferring responsibility
|
||||
|
||||
## `std::shared_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::shared_ptr` represents shared ownership. The object stays alive until the last owning `shared_ptr` goes away.
|
||||
|
||||
It exists for cases where a single clear owner does not exist.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
auto session = std::make_shared<Session>(config);
|
||||
worker_pool.add(session);
|
||||
monitor.attach(session);
|
||||
```
|
||||
|
||||
Both the worker pool and monitor may extend the lifetime of the same session object.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
`shared_ptr` typically uses a control block containing:
|
||||
|
||||
- the reference count for strong owners
|
||||
- the reference count for weak observers
|
||||
- deleter and allocator information
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[shared_ptr A] --> C[Control block]
|
||||
B[shared_ptr B] --> C
|
||||
C --> D[Managed object]
|
||||
C --> E[strong count]
|
||||
C --> F[weak count]
|
||||
```
|
||||
|
||||
When the strong count reaches zero, the managed object is destroyed. The control block itself can remain until weak references are gone.
|
||||
|
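The counts in the control block can be watched through `use_count()` and `expired()`. This is a rough sketch with a hypothetical `Session` type; `use_count()` is meant for illustration and debugging rather than program logic.

```cpp
#include <cassert>
#include <memory>

// Hypothetical managed type.
struct Session {
    int id = 0;
};

struct CountReport {
    long one_owner;
    long two_owners;
    long back_to_one;
    bool expired_after_reset;
};

CountReport count_demo() {
    auto first = std::make_shared<Session>();
    std::weak_ptr<Session> observer = first;  // adds to the weak count only

    const long one_owner = first.use_count();
    long two_owners = 0;
    {
        auto second = first;  // copying a shared_ptr adds a strong owner
        two_owners = first.use_count();
    }  // second is destroyed: the strong count drops again
    const long back_to_one = first.use_count();

    first.reset();  // last strong owner gone: the Session is destroyed
    assert(observer.lock() == nullptr);
    return {one_owner, two_owners, back_to_one, observer.expired()};
}
```

The `weak_ptr` never changes the strong count, which is why the session is destroyed by `reset()` even though `observer` still exists.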
||||
### Practical Usage
|
||||
|
||||
- asynchronous workflows where several components may need to keep work alive
|
||||
- graph-like application objects when ownership is genuinely shared
|
||||
- callback systems where tasks may outlive the originating scope
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- more memory overhead than `unique_ptr`
|
||||
- reference counting operations add runtime cost
|
||||
- shared ownership can make program structure harder to reason about
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
- using `shared_ptr` by default instead of designing clear ownership
|
||||
- creating hidden lifetime extension that makes cleanup unpredictable
|
||||
- forming cycles that prevent destruction
|
||||
|
||||
## `std::weak_ptr`
|
||||
|
||||
### Intuition
|
||||
|
||||
`weak_ptr` exists because sometimes you need to observe a shared object without keeping it alive.
|
||||
|
||||
The classic use case is breaking reference cycles.
|
||||
|
||||
### Example of a Cycle Problem
|
||||
|
||||
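If a parent and child each hold a `shared_ptr` to the other, neither strong count can reach zero, so neither object is ever destroyed. The diagram shows the fix: the parent owns the child, and the child refers back through a `weak_ptr`.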
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
P[Parent shared_ptr] --> C[Child object]
|
||||
C --> W[weak_ptr back to parent]
|
||||
```
|
||||
|
||||
With `weak_ptr`, the child can refer back to the parent without extending the parent's lifetime.
|
||||
|
||||
### How It Works
|
||||
|
||||
`weak_ptr` points to the same control block as `shared_ptr`, but it does not contribute to the strong owner count.
|
||||
|
||||
To use the object safely, call `lock()` to obtain a temporary `shared_ptr` if the object still exists.
|
||||
|
||||
```cpp
|
||||
if (auto parent = weak_parent.lock()) {
|
||||
parent->notify();
|
||||
}
|
||||
```
|
||||
|
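Putting the pieces together, the sketch below builds the parent and child arrangement with a `weak_ptr` back-reference and checks that both objects are destroyed. `Parent`, `Child`, and the `alive` counter are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <memory>

struct Parent;

struct Child {
    explicit Child(int& alive) : alive_(alive) { ++alive_; }
    ~Child() { --alive_; }

    std::weak_ptr<Parent> parent;  // observes the parent, does not own it
    int& alive_;
};

struct Parent {
    explicit Parent(int& alive) : alive_(alive) { ++alive_; }
    ~Parent() { --alive_; }

    std::shared_ptr<Child> child;  // owns the child
    int& alive_;
};

// Returns how many objects are still alive after the owning scope ends.
int alive_after_scope() {
    int alive = 0;
    {
        auto parent = std::make_shared<Parent>(alive);
        parent->child = std::make_shared<Child>(alive);
        parent->child->parent = parent;  // back-reference without ownership

        // While the parent exists, lock() yields a usable shared_ptr.
        assert(parent->child->parent.lock() == parent);
        assert(alive == 2);
    }  // parent's strong count hits zero; the child follows
    return alive;
}
```

If `Child::parent` were a `shared_ptr<Parent>` instead, the two objects would keep each other alive and the function would return 2.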
||||
### Practical Usage
|
||||
|
||||
- observer patterns
|
||||
- caches of shared resources
|
||||
- parent back-references in trees or graphs
|
||||
- asynchronous callback registries
|
||||
|
||||
### Pitfall
|
||||
|
||||
Do not assume the object is still alive just because a `weak_ptr` exists. Always revalidate via `lock()`.
|
||||
|
||||
## Copy Semantics
|
||||
|
||||
### Intuition
|
||||
|
||||
Copying means making another object with the same logical value.
|
||||
|
||||
For simple types, this is straightforward. For resource-owning types, copying becomes a design decision:
|
||||
|
||||
- should both objects own independent resources?
|
||||
- should copying be forbidden?
|
||||
- should copying be expensive or cheap?
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
std::string a = "trade";
|
||||
std::string b = a; // copy
|
||||
```
|
||||
|
||||
Here, `b` becomes its own string object with its own storage.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
For resource-owning classes, a correct copy operation often requires a deep copy, not a copied raw pointer. If two objects copy the same owning raw pointer blindly, both will try to free the same resource.
|
||||
|
||||
That is why copy control exists at all.
|
||||
|
||||
## Move Semantics
|
||||
|
||||
### Intuition
|
||||
|
||||
Move semantics exist because copying expensive resources is often unnecessary. If an object is temporary or no longer needed, its resources can be transferred instead of duplicated.
|
||||
|
||||
This is one of the defining features of modern C++.
|
||||
|
||||
### Example
|
||||
|
||||
```cpp
|
||||
std::vector<int> build_values() {
|
||||
std::vector<int> values = {1, 2, 3, 4};
|
||||
return values;
|
||||
}
|
||||
```
|
||||
|
||||
In modern C++, returning `values` is efficient because the compiler can elide copies or move the vector's internal buffer.
|
||||
|
||||
### Transfer Mental Model
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[source object owns buffer] --> B[move operation]
|
||||
B --> C[destination now owns buffer]
|
||||
B --> D[source becomes valid but unspecified]
|
||||
```
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Moves typically transfer internal pointers, handles, or buffers from one object to another and leave the source object in a valid but unspecified state.
|
||||
|
||||
That phrase is important:
|
||||
|
||||
- valid means the source can still be destroyed safely
|
||||
- unspecified means you should not rely on its old value
|
||||
|
||||
### `std::move` Is a Cast, Not a Move by Itself
|
||||
|
||||
This is a common misconception.
|
||||
|
||||
`std::move(x)` does not move anything on its own. It casts `x` to an rvalue expression, signaling that moving is allowed if an appropriate move operation exists.
|
||||
|
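A small sketch makes the distinction visible. `MoveReport` and `move_demo` are hypothetical names; the behavior shown for `std::string` and `std::vector` is standard.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct MoveReport {
    bool unchanged_after_cast;
    bool const_source_was_copied;
    bool vector_received_value;
};

MoveReport move_demo() {
    const std::string expected = "a string long enough to live on the heap";
    std::string name = expected;

    // std::move only produces an rvalue reference; nothing is transferred.
    std::string&& ref = std::move(name);
    (void)ref;
    const bool unchanged_after_cast = (name == expected);

    // A const object cannot be moved from: overload resolution falls back
    // to the copy constructor, so the source keeps its value.
    const std::string fixed = expected;
    std::string copy = std::move(fixed);
    const bool const_source_was_copied = (fixed == expected) && (copy == expected);

    // The actual move happens here, inside push_back's rvalue overload.
    std::vector<std::string> names;
    names.push_back(std::move(name));
    assert(names.size() == 1);
    const bool vector_received_value = (names.front() == expected);
    // name is now valid but unspecified: assign to it before reading it again.

    return {unchanged_after_cast, const_source_was_copied, vector_received_value};
}
```

The second case is a common source of silent copies: marking a local `const` and then calling `std::move` on it compiles cleanly but never moves.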
||||
### Practical Usage
|
||||
|
||||
- returning large objects from functions
|
||||
- transferring ownership into containers or asynchronous tasks
|
||||
- avoiding unnecessary deep copies in performance-sensitive code
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- using moved-from objects as though they still contain the old value
|
||||
- writing move operations that forget to preserve class invariants
|
||||
- overusing `std::move` on values where copy elision or normal forwarding would be better
|
||||
|
||||
## Rule of 3, Rule of 5, and Rule of 0
|
||||
|
||||
### Rule of 3
|
||||
|
||||
If your class manually defines any of these, it probably needs all three:
|
||||
|
||||
- destructor
|
||||
- copy constructor
|
||||
- copy assignment operator
|
||||
|
||||
Why:
|
||||
|
||||
If your class manages a resource manually, the defaults may perform shallow copies that break ownership.
|
||||
|
||||
### Rule of 5
|
||||
|
||||
In modern C++, move constructor and move assignment operator join the list.
|
||||
|
||||
- destructor
|
||||
- copy constructor
|
||||
- copy assignment
|
||||
- move constructor
|
||||
- move assignment
|
||||
|
||||
If you manage resources manually, you likely need to think about all five.
|
||||
|
||||
### Rule of 0
|
||||
|
||||
The best modern outcome is often the Rule of 0: do not manually write special member functions at all. Instead, compose your class from well-behaved members such as `std::string`, `std::vector`, `std::unique_ptr`, and other RAII types.
|
||||
|
||||
That lets the compiler-generated defaults behave correctly.
|
||||
|
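To compare the rules side by side, here is a sketch of the same small buffer written twice. `ManualBuffer` and `SimpleBuffer` are hypothetical classes invented for this comparison, and the copy assignment shown is one reasonable implementation among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Rule of 5: this class owns a raw array, so it defines every special member.
class ManualBuffer {
public:
    explicit ManualBuffer(std::size_t size)
        : size_(size), data_(new int[size]()) {}

    ~ManualBuffer() { delete[] data_; }

    ManualBuffer(const ManualBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);  // deep copy
    }

    ManualBuffer& operator=(const ManualBuffer& other) {
        ManualBuffer copy(other);      // may throw; *this is untouched so far
        std::swap(size_, copy.size_);
        std::swap(data_, copy.data_);  // copy's destructor frees the old array
        return *this;
    }

    ManualBuffer(ManualBuffer&& other) noexcept
        : size_(std::exchange(other.size_, 0)),
          data_(std::exchange(other.data_, nullptr)) {}

    ManualBuffer& operator=(ManualBuffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            size_ = std::exchange(other.size_, 0);
            data_ = std::exchange(other.data_, nullptr);
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    int& at(std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

private:
    std::size_t size_;
    int* data_;
};

// Rule of 0: std::vector already owns the storage, so the compiler-generated
// copy, move, and destructor are correct without any extra code.
class SimpleBuffer {
public:
    explicit SimpleBuffer(std::size_t size) : data_(size) {}
    std::size_t size() const { return data_.size(); }
    int& at(std::size_t index) { return data_.at(index); }

private:
    std::vector<int> data_;
};
```

Both classes behave the same from the outside, but every line of ownership logic in `ManualBuffer` is a place where a bug could hide, while `SimpleBuffer` has none to maintain.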
||||
### Practical Guidance
|
||||
|
||||
- prefer Rule of 0 when possible
|
||||
- use Rule of 5 only when building true resource-managing types
|
||||
- if you write one special member function, stop and consider the others
|
||||
|
||||
## Resource Management Patterns
|
||||
|
||||
## Prefer RAII Wrappers Over Manual Cleanup
|
||||
|
||||
Wrap raw resources in types that own cleanup.
|
||||
|
||||
Examples:
|
||||
|
||||
- file descriptor wrapper
|
||||
- socket wrapper
|
||||
- scoped timer
|
||||
- custom allocator arena handle
|
||||
|
||||
## Prefer Containers Over Raw Dynamic Arrays
|
||||
|
||||
Instead of:
|
||||
|
||||
```cpp
|
||||
int* data = new int[count];
|
||||
```
|
||||
|
||||
prefer:
|
||||
|
||||
```cpp
|
||||
std::vector<int> data(count);
|
||||
```
|
||||
|
||||
Why:
|
||||
|
||||
- size information stays with the data structure
|
||||
- cleanup becomes automatic
|
||||
- resizing and range-aware APIs become available
|
||||
|
||||
## Use Views for Non-Owning Access
|
||||
|
||||
Modern C++ increasingly uses non-owning views such as `std::string_view` and `std::span` to express borrowed access without copying.
|
||||
|
||||
These are powerful, but they require lifetime discipline. A view is only valid while the underlying data is alive.
|
||||
|
||||
### Example Pitfall
|
||||
|
||||
Returning `std::string_view` to a temporary `std::string` creates a dangling view.
|
||||
|
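The sketch below contrasts a safe borrow with the dangling pattern, which is left commented out on purpose. The function names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Safe borrow: the view refers to storage owned by the caller, and the
// result is only valid while that storage is alive.
std::string_view first_word(std::string_view text) {
    return text.substr(0, text.find(' '));
}

// Dangling pattern (do not do this): the local std::string is destroyed
// when the function returns, so the view would refer to dead storage.
//
// std::string_view broken_label() {
//     std::string local = "temporary text";
//     return local;
// }

// When the data is created inside the function, return an owning string.
std::string make_greeting(std::string_view name) {
    return "hello " + std::string(name);
}

void view_demo() {
    const std::string sentence = "hello world";
    assert(first_word(sentence) == "hello");  // sentence outlives the view
}
```

A reasonable habit is to accept `std::string_view` as a parameter freely but to treat it as a return type or a data member only when the owner's lifetime is obvious.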
||||
## Exception Safety and Ownership
|
||||
|
||||
### Intuition
|
||||
|
||||
Memory management decisions matter most when control flow becomes non-linear. Exceptions, early returns, and partial initialization are exactly where manual cleanup breaks down.
|
||||
|
||||
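RAII and smart pointers give you a solid foundation for exception safety by making cleanup automatic during stack unwinding. On their own they prevent leaks; stronger guarantees still depend on how each operation is written.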
|
||||
|
||||
### Practical Levels of Safety
|
||||
|
||||
Common exception-safety language:
|
||||
|
||||
- basic guarantee: no leaks, object remains valid
|
||||
- strong guarantee: operation either succeeds fully or has no observable effect
|
||||
- no-throw guarantee: operation cannot throw
|
||||
|
||||
You do not need to recite these mechanically, but you should understand how ownership design influences them.
|
||||
|
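One common way to reach the strong guarantee is to do the risky work on a copy and commit with an operation that cannot throw. The sketch below assumes a made-up validation rule (negative items are rejected) purely to have something that can fail; `append_all_or_nothing` and `try_append` are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Strong guarantee: either every item is appended or target is unchanged.
void append_all_or_nothing(std::vector<int>& target,
                           const std::vector<int>& items) {
    std::vector<int> staged = target;  // work on a copy
    for (int item : items) {
        if (item < 0) {
            throw std::invalid_argument("negative item");
        }
        staged.push_back(item);
    }
    assert(staged.size() == target.size() + items.size());
    target.swap(staged);  // commit step: swapping vectors does not throw
}

// Convenience wrapper that reports failure instead of propagating it.
bool try_append(std::vector<int>& target, const std::vector<int>& items) {
    try {
        append_all_or_nothing(target, items);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

The copy costs time and memory, so this is a tradeoff rather than a default: appending directly to `target` would be cheaper but would only offer the basic guarantee.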
||||
## Common Modern C++ Pitfalls
|
||||
|
||||
- using `shared_ptr` to avoid thinking about ownership
|
||||
- mixing owning raw pointers with smart pointers ambiguously
|
||||
- forming `shared_ptr` cycles
|
||||
- assuming moved-from objects retain useful values
|
||||
- exposing raw references or pointers to internal data whose lifetime is not guaranteed
|
||||
- returning views to destroyed storage
|
||||
|
||||
## Real-World Design Examples
|
||||
|
||||
### Tree Ownership
|
||||
|
||||
Use `unique_ptr` for the children a node owns, and a non-owning raw pointer or reference for the link back to the parent. The parent link is only for traversal; it must never be used to delete the parent.
|
||||
|
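A minimal sketch of that layout, assuming a hypothetical `Node` type with an integer payload:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    explicit Node(int v) : value(v) {}

    // The parent owns the new child; the returned pointer is only borrowed.
    Node* add_child(int v) {
        children.push_back(std::make_unique<Node>(v));
        Node* child = children.back().get();
        child->parent = this;
        return child;
    }

    int value;
    Node* parent = nullptr;                       // non-owning back-reference
    std::vector<std::unique_ptr<Node>> children;  // owning
};

// Walks the non-owning parent links up to the root.
int depth(const Node* node) {
    assert(node != nullptr);
    int levels = 0;
    while (node->parent != nullptr) {
        node = node->parent;
        ++levels;
    }
    return levels;
}
```

Destroying the root releases the whole tree, and because each child lives in its own heap allocation, the borrowed `Node*` values stay valid even when a `children` vector grows. Moving a `Node` object itself would leave its children's `parent` pointers stale, so a design like this usually keeps nodes in place.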
||||
### Shared Async Work
|
||||
|
||||
Use `shared_ptr` when multiple asynchronous callbacks must keep an object alive until all work is finished.
|
||||
|
||||
### C API Wrapping
|
||||
|
||||
Use a custom RAII wrapper or `unique_ptr` with a custom deleter for resources acquired through legacy APIs.
|
||||
|
||||
```cpp
|
||||
using FilePtr = std::unique_ptr<std::FILE, decltype(&std::fclose)>;
|
||||
|
||||
FilePtr open_file(const char* path) {
|
||||
return FilePtr(std::fopen(path, "r"), &std::fclose);
|
||||
}
|
||||
```
|
||||
|
||||
## Interview Checkpoints
|
||||
|
||||
You should be able to explain:
|
||||
|
||||
- why raw pointers are weak ownership signals
|
||||
- when `unique_ptr` is preferable to `shared_ptr`
|
||||
- how `shared_ptr` uses a control block
|
||||
- what `weak_ptr` solves
|
||||
- the difference between copy and move semantics
|
||||
- why `std::move` does not itself move anything
|
||||
- when the Rule of 0 beats the Rule of 5
|
||||
|
||||
## What Comes Next
|
||||
|
||||
The next file focuses on the standard library, especially the containers and algorithms that most production C++ code uses every day. Many STL design choices make much more sense once you understand ownership, moves, and lifetime.
|
||||
@@ -0,0 +1,399 @@
|
||||
# File 4: STL Deep Dive
|
||||
|
||||
## Learning Goals
|
||||
|
||||
By the end of this file, you should be able to:
|
||||
|
||||
- choose standard containers based on access patterns, not habit
|
||||
- explain how core STL containers work internally
|
||||
- understand iterator categories and invalidation rules well enough to avoid subtle bugs
|
||||
- use algorithms library functions as first-class tools rather than optional extras
|
||||
- discuss STL complexity tradeoffs in interviews and system design conversations
|
||||
|
||||
This file assumes you already understand object lifetime, move semantics, and ownership. The STL is not “just a library.” It is a design philosophy built around generic programming, iterator-based abstraction, and predictable complexity.
|
||||
|
||||
## What the STL Is Trying to Solve
|
||||
|
||||
### Intuition
|
||||
|
||||
Most programs need the same families of operations:
|
||||
|
||||
- store collections of data
|
||||
- traverse them efficiently
|
||||
- search, sort, transform, filter, and aggregate
|
||||
|
||||
The STL gives reusable building blocks for those tasks while preserving performance transparency.
|
||||
|
||||
Its core ideas are:
|
||||
|
||||
- containers own and organize data
|
||||
- iterators provide a common traversal interface
|
||||
- algorithms operate over iterator ranges instead of hardcoding container types
|
||||
|
||||
That separation is one of the most important patterns in C++.
|
||||
|
||||
## `std::vector`
|
||||
|
||||
### Intuition
|
||||
|
||||
`std::vector` is the default dynamic array in C++. It stores elements contiguously and grows as needed.
|
||||
|
||||
If you do not have a strong reason to pick something else, `vector` is often the correct first choice.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
A vector typically stores:
|
||||
|
||||
- a pointer to a contiguous heap buffer
|
||||
- its current size
|
||||
- its current capacity
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[vector object] --> B[data pointer]
|
||||
A --> C[size]
|
||||
A --> D[capacity]
|
||||
B --> E[element 0]
|
||||
B --> F[element 1]
|
||||
B --> G[element 2]
|
||||
```
|
||||
|
||||
When capacity is exceeded, vector allocates a larger buffer, moves or copies elements into it, then frees the old buffer.
|
||||
|
||||
### Why It Exists
|
||||
|
||||
Contiguous storage gives major benefits:
|
||||
|
||||
- O(1) random access
|
||||
- strong cache locality
|
||||
- easy interop with C APIs and low-level buffers
|
||||
- efficient iteration and algorithm use
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- numeric data
|
||||
- event buffers
|
||||
- parsed records
|
||||
- task queues with append-heavy patterns
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- reallocation invalidates pointers, references, and iterators to elements
|
||||
- frequent small growth can cause repeated reallocations if capacity is not reserved
|
||||
- insertion in the middle is expensive because elements after the insertion point must shift
|
||||
|
||||
### Real Advice
|
||||
|
||||
If you know approximate size up front, call `reserve()`. That is one of the highest-value micro-optimizations in ordinary C++ code.
|
||||
|
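The effect of `reserve()` can be measured by counting capacity changes. This is a rough sketch with a hypothetical helper; the exact number of reallocations without `reserve()` depends on the standard library's growth factor, so only the reserved case has a fixed answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often capacity changes while appending n elements.
// Every capacity change corresponds to a reallocation of the buffer.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> values;
    if (reserve_first) {
        values.reserve(n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = values.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
        if (values.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = values.capacity();
        }
    }
    assert(values.size() == n);
    return reallocations;
}
```

With `reserve_first` set, the loop never reallocates, which also means pointers and iterators into the vector stay valid for the whole loop.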
||||
## `std::deque`
|
||||
|
||||
### Intuition
|
||||
|
||||
`deque` is a double-ended queue optimized for efficient insertion and removal at both ends while still supporting indexed access.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Unlike vector, deque is not typically one contiguous buffer. It is often implemented as a segmented structure: a map of fixed-size blocks.
|
||||
|
||||
This avoids whole-buffer reallocation for growth at the front or back.
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- queue-like workloads needing both front and back operations
|
||||
- sliding window logic
|
||||
- schedulers and work-stealing structures in some implementations
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- weaker cache locality than vector
|
||||
- assumptions about contiguity are invalid
|
||||
- invalidation rules differ from vector: inserting at either end invalidates all iterators but leaves references to existing elements valid
|
||||
|
||||
## `std::list`
|
||||
|
||||
### Intuition
|
||||
|
||||
`list` is a doubly linked list. It exists because some workloads benefit from stable iterators and cheap insertion or removal at known positions.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
Each node usually stores:
|
||||
|
||||
- the element value
|
||||
- pointer to previous node
|
||||
- pointer to next node
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[Node] --> B[prev]
|
||||
A --> C[value]
|
||||
A --> D[next]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
In practice, far fewer workloads need `list` than many engineers assume. It can be useful when:
|
||||
|
||||
- you already hold iterators to splice locations
|
||||
- stable node addresses matter
|
||||
- frequent insertion and erasure in the middle dominate performance and traversal locality matters less
|
||||
|
||||
### Common Misconception
|
||||
|
||||
“List is better for lots of inserts and deletes.”
|
||||
|
||||
Only sometimes. Pointer chasing hurts cache locality badly. In many real workloads, vector still wins despite O(n) insertion because contiguous memory is so CPU-friendly.
|
||||
|
||||
## Associative Containers: `map` and `set`
|
||||
|
||||
### Intuition
|
||||
|
||||
Ordered associative containers maintain elements in sorted order and support logarithmic lookup, insertion, and removal.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
`std::map` and `std::set` are typically implemented as balanced binary search trees, commonly red-black trees.
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
A[8] --> B[4]
|
||||
A --> C[12]
|
||||
B --> D[2]
|
||||
B --> E[6]
|
||||
C --> F[10]
|
||||
C --> G[14]
|
||||
```
|
||||
|
||||
Why this matters:
|
||||
|
||||
- elements are kept ordered
|
||||
- lookup is O(log n)
|
||||
- iterating produces sorted order
|
||||
- node-based storage means references and iterators are often more stable than in vector
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- ordered dictionaries
|
||||
- interval or range logic using `lower_bound` and `upper_bound`
|
||||
- workloads where sorted traversal is part of the contract
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- higher per-element overhead than vector-based approaches
|
||||
- poorer cache locality because nodes are separately allocated
|
||||
- using `map` by default when ordered traversal is not needed
|
||||
|
||||
## Hash-Based Containers: `unordered_map` and `unordered_set`
|
||||
|
||||
### Intuition
|
||||
|
||||
Hash-based containers optimize for average-case constant-time lookup rather than ordering.
|
||||
|
||||
### Internal Mechanics
|
||||
|
||||
An `unordered_map` typically uses:
|
||||
|
||||
- a bucket array
|
||||
- a hash function to choose a bucket
|
||||
- collision handling, often with chains or equivalent node structures
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
A[key hash] --> B[bucket array]
|
||||
B --> C[bucket 0]
|
||||
B --> D[bucket 1]
|
||||
B --> E[bucket 2]
|
||||
D --> F[node -> node]
|
||||
```
|
||||
|
||||
### Practical Usage
|
||||
|
||||
- caches
|
||||
- symbol tables
|
||||
- frequency counting
|
||||
- routing tables or registries when order does not matter
|
||||
|
||||
### Tradeoffs
|
||||
|
||||
- average O(1) lookup, but worst-case O(n)
|
||||
- memory overhead from buckets and nodes
|
||||
- iteration order is not stable or meaningful
|
||||
|
||||
### Pitfalls
|
||||
|
||||
- bad custom hash functions hurt performance
|
||||
- rehashing invalidates iterators, although references and pointers to elements remain valid
|
||||
- using `unordered_map` when deterministic iteration order is important
|
||||
|
||||
## Iterators
|
||||
|
||||
### Intuition
|
||||
|
||||
Iterators generalize traversal so algorithms can work across many containers.
|
||||
|
||||
Instead of writing one sorting routine for vectors and another for arrays, algorithms operate on iterator ranges.
|
||||
|
||||
### Categories Matter

Different iterators support different capabilities:

- input iterator: read sequentially in a single pass
- forward iterator: one-way multi-pass traversal
- bidirectional iterator: move both forward and backward
- random-access iterator: jump in constant time

This is why `std::sort` works with `vector` iterators but not `list` iterators: it requires random access, and `std::list` only provides bidirectional iterators. To sort a list, use the member function `list::sort`.

### Mental Model

Think of an iterator as a generalized cursor with container-specific guarantees.

### Practical Usage

- generic algorithms over different container types
- decoupling traversal from storage details
- writing reusable library code

### Pitfalls

- invalidating iterators after insertions or erasures
- dereferencing `end()`
- assuming all iterators support the same operations

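
To make the category discussion concrete, the following sketch shows one function template that accepts any input iterator range, next to two sorting helpers that differ because of iterator category. All three function names are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Hypothetical generic helper: needs only input-iterator operations
// (compare, increment, dereference), so it works with vector, list, and arrays.
template <typename InputIt>
int sum_range(InputIt first, InputIt last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// std::sort requires random-access iterators, which vector provides.
inline void sort_vector(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}

// std::list offers only bidirectional iterators, so it has its own member sort.
inline void sort_list(std::list<int>& l) {
    l.sort();
}
```

Passing `l.begin(), l.end()` to `std::sort` would fail to compile, which is the category system catching the mismatch before the program runs.
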
## Iterator Invalidation

### Intuition

This is one of the most frequent sources of real-world STL bugs. The container changes, but code keeps using old iterators, references, or pointers.

### Practical Rules of Thumb

- `vector` reallocation invalidates all iterators, pointers, and references to elements
- `list` insertions preserve iterators and references to existing nodes; erasing a node invalidates only the iterators and references to that node
- unordered containers invalidate iterators when a rehash occurs

Do not rely on vague memory here. For critical code, check the container's exact guarantees.

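
Two common ways of staying on the safe side of these rules can be sketched as follows. The helper names are hypothetical, and both functions show one possible approach rather than the only correct one.

```cpp
#include <cstddef>
#include <vector>

// Erasing while iterating: erase() returns the next valid iterator,
// so the loop never touches an invalidated one.
inline void erase_negatives(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end(); ) {
        if (*it < 0) {
            it = v.erase(it);   // the old `it` is invalid; continue from the returned one
        } else {
            ++it;
        }
    }
}

// Appending while iterating: push_back may reallocate, so iterate by index
// instead of holding an iterator or reference across the insertion.
inline void duplicate_all(std::vector<int>& v) {
    const std::size_t original_size = v.size();
    for (std::size_t i = 0; i < original_size; ++i) {
        const int value = v[i];   // copy out before the vector can move its storage
        v.push_back(value);
    }
}
```

The erase loop is written out only to show the iterator handling. For bulk removal from a `vector`, the erase-remove idiom does less element shifting.
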
## Algorithms Library

### Intuition

The algorithms library exists so you can express intent at a higher level than manual loops while still staying efficient.

Common examples include:

- `std::sort`
- `std::find_if`
- `std::transform`
- `std::accumulate`
- `std::lower_bound`
- `std::remove_if`

### Why It Matters

Algorithms make code:

- more declarative
- easier to review
- easier for the compiler to optimize consistently
- less error-prone than handwritten index manipulation

### Example

```cpp
// Requires <algorithm> and <vector>.
std::vector<int> values = {5, 1, 4, 2, 3};
std::sort(values.begin(), values.end());  // ascending: {1, 2, 3, 4, 5}
```

You do not need to reimplement quicksort or mergesort in production code unless the problem specifically requires it.

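
A few of the other algorithms listed above can be sketched in the same spirit. The helper names below are assumptions made for illustration; the standard calls they wrap are `std::transform`, `std::accumulate` (from `<numeric>`), `std::find_if`, and `std::lower_bound`.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: squares each value, then sums the squares.
inline int sum_of_squares(const std::vector<int>& values) {
    std::vector<int> squares(values.size());
    std::transform(values.begin(), values.end(), squares.begin(),
                   [](int v) { return v * v; });
    return std::accumulate(squares.begin(), squares.end(), 0);
}

// Hypothetical helper: first value greater than `limit`, or -1 if there is none.
inline int first_above(const std::vector<int>& values, int limit) {
    auto it = std::find_if(values.begin(), values.end(),
                           [limit](int v) { return v > limit; });
    return it != values.end() ? *it : -1;
}

// Hypothetical helper: binary search; the input must already be sorted.
inline bool contains_sorted(const std::vector<int>& sorted, int target) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), target);
    return it != sorted.end() && *it == target;
}
```
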
### The Erase-Remove Idiom

This is a classic STL pattern:

```cpp
// Removes every even element from `values`.
values.erase(
    std::remove_if(values.begin(), values.end(), [](int v) { return v % 2 == 0; }),
    values.end());
```

Why it exists:

- `remove_if` shifts the kept elements to the front, preserving their relative order; it cannot change the container's size
- it returns the new logical end; elements past that point are left in a valid but unspecified state
- `erase` actually shrinks the container

Understanding this pattern signals real STL fluency.

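
Split into its two steps, the idiom can be sketched like this. The function name `remove_evens` is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: removes even values in place using erase-remove.
inline void remove_evens(std::vector<int>& values) {
    // Step 1: compact the kept elements toward the front.
    auto new_end = std::remove_if(values.begin(), values.end(),
                                  [](int v) { return v % 2 == 0; });
    // Step 2: [new_end, end()) holds leftovers; erase shrinks the vector.
    values.erase(new_end, values.end());
}
```

Since C++20, `std::erase_if(values, predicate)` performs both steps in one call for the standard containers.
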
## Complexity Cheat Sheet

### Sequence Containers

| Container | Random Access | Push Back | Push Front | Insert Middle | Iterator Stability |
| --- | --- | --- | --- | --- | --- |
| `vector` | O(1) | amortized O(1) | O(n) | O(n) | weak under reallocation |
| `deque` | O(1) | O(1) | O(1) | O(n) | moderate, container-specific |
| `list` | O(n) | O(1) | O(1) | O(1) with iterator | strong for other nodes |

### Associative Containers

| Container | Lookup | Insert | Order | Typical Internal Structure |
| --- | --- | --- | --- | --- |
| `map` | O(log n) | O(log n) | sorted | balanced tree |
| `set` | O(log n) | O(log n) | sorted | balanced tree |
| `unordered_map` | average O(1) | average O(1) | none | hash table |
| `unordered_set` | average O(1) | average O(1) | none | hash table |

### Why Interviews Ask About This

Interviewers are usually not checking if you memorized tables. They want to know whether you can choose the right structure for a workload.

Examples:

- frequent append plus indexed reads: likely `vector`
- ordered lookup with range queries: likely `map`
- key lookup without ordering: likely `unordered_map`
- middle splicing with stable iterators: maybe `list`, but verify locality costs first

## Container Selection in Real Systems

### Prefer `vector` More Often Than You Think

Because of contiguity, vector is often the fastest general-purpose container even when its theoretical complexity looks worse than a node-based alternative.

### Reach for Ordered Containers When Order Matters as Part of the Contract

If you need sorted traversal, nearest-key queries, or stable ordering semantics, `map` earns its cost.

### Use Hash Containers When Key Lookup Dominates and Order Does Not Matter

This is common in compilers, interpreters, caches, and service registries.

### Avoid Cargo-Culting `list`

Many engineers learn linked lists academically and then overestimate their usefulness in high-performance software.

## Common STL Pitfalls

- forgetting iterator invalidation rules
- using `operator[]` on `map` or `unordered_map` when accidental insertion is undesirable
- choosing containers by asymptotic complexity alone and ignoring memory locality
- copying large containers accidentally when references or moves were intended
- assuming all algorithms work with all iterator categories

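
The `operator[]` pitfall is easy to demonstrate. In the sketch below, the helper name `lookup_or` is an assumption; it shows one way to read from a map without modifying it.

```cpp
#include <map>
#include <string>

// Hypothetical read-only lookup. find() never inserts, whereas operator[]
// would create a value-initialized entry for a missing key (and does not
// even compile on a const map).
inline int lookup_or(const std::map<std::string, int>& m,
                     const std::string& key, int fallback) {
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}
```

Taking the map by `const&` also covers the accidental-copy pitfall for this call: a by-value parameter would copy every node.
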
## Interview Checkpoints

You should be able to explain:

- why `vector` is often the default container
- how vector reallocation works and why `reserve()` helps
- the internal difference between `map` and `unordered_map`
- what iterator categories mean in practice
- why `std::sort` requires random-access iterators
- how the erase-remove idiom works
- why cache locality can beat seemingly better asymptotic complexity

## What Comes Next

The final file moves from language and library mechanics into systems-level C++: threads, locks, atomics, performance work, and the patterns that show up in production engines, compilers, and low-latency systems.

@@ -0,0 +1,436 @@
# File 5: Advanced and Real-World Systems

## Learning Goals

By the end of this file, you should be able to:

- explain the basics of C++ concurrency without treating it as a bag of library calls
- reason about mutexes, atomics, and condition variables in terms of correctness and performance
- identify practical optimization levers beyond “use a faster algorithm”
- describe where C++ fits in real systems and why teams still choose it
- connect language features and library choices to larger architectural patterns

This final file builds on everything before it. Concurrency depends on lifetime correctness. Performance depends on data layout and container choice. Systems design depends on clear ownership and predictable cleanup.

## Why C++ Is Still Used for Real Systems

### Intuition

C++ remains relevant because many systems need a rare combination:

- low-level control over memory and layout
- high performance with minimal runtime overhead
- strong abstraction tools for large codebases
- portability across platforms and hardware

If your system is sensitive to latency, memory footprint, or hardware interaction, C++ is still one of the strongest options.

### Common Domains

- game engines
- trading systems
- browser engines
- compilers and developer tools
- databases and storage engines
- robotics and embedded platforms
- audio, graphics, and simulation systems

The rest of this file focuses on the patterns those systems rely on.

## Threads and Concurrency Basics

### Intuition

A thread is an independent path of execution within a process. Concurrency exists because real systems often need to overlap work:

- serving multiple requests
- handling I/O while computing
- parallelizing CPU-heavy workloads
- keeping user interfaces responsive

### Basic Thread Model

```mermaid
flowchart TB
    A[Process] --> B[Thread 1]
    A --> C[Thread 2]
    A --> D[Thread 3]
    B --> E[Shared heap]
    C --> E
    D --> E
    B --> F[Own call stack]
    C --> G[Own call stack]
    D --> H[Own call stack]
```

Threads in the same process usually share heap memory but have separate stacks. That makes communication possible, but it also creates the risk of races.

### Example

```cpp
#include <iostream>
#include <thread>

void worker(int id) {
    // Output from the two threads may interleave; the order is not deterministic.
    std::cout << "worker " << id << " running\n";
}

int main() {
    std::thread t1(worker, 1);  // each thread starts running as soon as it is constructed
    std::thread t2(worker, 2);
    t1.join();                  // wait for both before main returns
    t2.join();
}
```

### Practical Usage

- worker pools in backend services
- background asset loading in game engines
- compiler pipelines that parallelize parsing or optimization passes
- real-time analytics pipelines

### Common Pitfalls

- forgetting to `join()` or `detach()` a thread; destroying a joinable `std::thread` calls `std::terminate`
- accessing shared state without synchronization
- spawning too many threads instead of using task pools

## Data Races and Memory Visibility

### Intuition

A data race happens when multiple threads access the same memory concurrently, at least one access is a write, and there is no proper synchronization.

In C++, data races are not just “sometimes wrong.” They are undefined behavior.

### Why This Matters

Without synchronization, the compiler and CPU are free to reorder operations in ways that break naive assumptions about “obvious” execution order.

Concurrency bugs often come from incorrect mental models, not missing syntax.

### Practical Rule

If shared mutable state exists, you usually need one of:

- a mutex
- an atomic type
- message passing that avoids shared mutation

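
The third option, avoiding shared mutation altogether, can be sketched without any lock. In the example below each worker writes only to its own slot, and `join()` makes those writes visible to the caller. The function name `parallel_sum` and the chunking scheme are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using `num_threads` workers.
// No two threads ever write to the same memory location.
inline long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, t, chunk] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            long long local = 0;                    // accumulate in a local variable
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            partial[t] = local;                     // single write to this thread's own slot
        });
    }
    for (auto& w : workers) w.join();               // completion of each thread synchronizes with join()
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Reading `data` from several threads at once is fine because nobody writes to it while the workers run.
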
## Mutexes and Locking

### Intuition

A mutex protects a critical section so only one thread at a time can access a shared resource.

### Example

```cpp
#include <mutex>

class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }

    int value() const {
        // Reads need the lock too; an unsynchronized read would race with increment().
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so const member functions can lock it
    int value_ = 0;
};
```

### Internal View

The exact implementation depends on the platform, but a mutex generally coordinates access through OS or low-level runtime primitives that block or spin until ownership can be acquired safely.

### Practical Usage

- protecting queues, maps, and caches
- guarding shared configuration or metrics
- making compound state transitions atomic at the application level

### `lock_guard` vs `unique_lock`

`std::lock_guard` is minimal and scope-bound.

`std::unique_lock` is more flexible and useful when you need:

- deferred locking
- manual unlock before scope end
- compatibility with condition variables

### Pitfalls

- holding locks for too long
- calling external or user-defined code while holding a lock
- locking multiple mutexes in inconsistent order and causing deadlocks

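
For the lock-ordering pitfall, one option in C++17 and later is `std::scoped_lock`, which acquires several mutexes using a deadlock-avoidance algorithm. The `Account` type and `transfer` function below are hypothetical and serve only as a sketch.

```cpp
#include <mutex>

// Hypothetical account type used only for this illustration.
struct Account {
    std::mutex mutex;
    int balance = 0;
};

// transfer(a, b) and transfer(b, a) may run concurrently without deadlock,
// because scoped_lock does not depend on the order in which mutexes are named.
inline bool transfer(Account& from, Account& to, int amount) {
    if (&from == &to) return false;                // locking the same mutex twice is undefined behavior
    std::scoped_lock lock(from.mutex, to.mutex);   // locks both, releases both at scope exit
    if (from.balance < amount) return false;
    from.balance -= amount;
    to.balance += amount;
    return true;
}
```

Before C++17, the same effect is available through `std::lock` combined with `std::lock_guard` and `std::adopt_lock`.
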
## Condition Variables

### Intuition

A condition variable lets one thread wait until a condition becomes true while releasing the mutex during the wait.

This avoids wasteful busy-waiting.

### Producer-Consumer Model

```mermaid
flowchart LR
    P[Producer thread] --> Q[Shared queue]
    Q --> C[Consumer thread]
    M[Mutex] --> Q
    CV[Condition variable] --> C
```

### Example

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>

std::mutex mutex;
std::condition_variable cv;
std::queue<int> queue;
bool done = false;  // shutdown flag; a shutdown path (not shown) would set it under `mutex`

void producer() {
    {
        std::lock_guard<std::mutex> lock(mutex);
        queue.push(42);
    }
    cv.notify_one();  // notify after releasing the lock
}

void consumer() {
    std::unique_lock<std::mutex> lock(mutex);
    // wait() releases the mutex while blocked and reacquires it before returning.
    cv.wait(lock, [] { return !queue.empty() || done; });

    if (!queue.empty()) {
        int value = queue.front();
        queue.pop();
        std::cout << value << '\n';
    }
}
```

### Why the Predicate Matters

Condition variables can wake spuriously. Always wait with a predicate or recheck the condition in a loop.

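
A fuller sketch of the same pattern, including the shutdown path, might look like the following. The `Channel` struct and the three function names are assumptions made for this example; the essential points are that `done` is set while holding the mutex and that the consumer drains the queue before honoring it.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical channel bundling the queue with its synchronization.
struct Channel {
    std::mutex mutex;
    std::condition_variable cv;
    std::queue<int> items;
    bool done = false;
};

inline void produce(Channel& ch, int count) {
    for (int i = 0; i < count; ++i) {
        {
            std::lock_guard<std::mutex> lock(ch.mutex);
            ch.items.push(i);
        }
        ch.cv.notify_one();
    }
    {
        std::lock_guard<std::mutex> lock(ch.mutex);
        ch.done = true;              // set under the mutex so the waiter cannot miss it
    }
    ch.cv.notify_all();
}

inline std::vector<int> consume_all(Channel& ch) {
    std::vector<int> received;
    std::unique_lock<std::mutex> lock(ch.mutex);
    for (;;) {
        // The predicate is rechecked after every wakeup, spurious or not.
        ch.cv.wait(lock, [&ch] { return !ch.items.empty() || ch.done; });
        while (!ch.items.empty()) {
            received.push_back(ch.items.front());
            ch.items.pop();
        }
        if (ch.done) return received;   // queue is drained and no more items will arrive
    }
}

// Hypothetical driver: one producer thread, consuming on the calling thread.
inline std::vector<int> run_pipeline(int count) {
    Channel ch;
    std::thread producer(produce, std::ref(ch), count);
    std::vector<int> received = consume_all(ch);
    producer.join();
    return received;
}
```
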
## Atomics

### Intuition

Atomics provide operations on shared values that can be performed safely without a mutex for certain patterns.

They are powerful, but they are not a general replacement for locks.

### Example

```cpp
std::atomic<int> requests{0};  // requires <atomic>; brace-init also compiles before C++17
requests.fetch_add(1, std::memory_order_relaxed);  // relaxed ordering suffices for a pure counter
```

### Practical Usage

- counters and statistics
- lock-free flags
- reference counts and state transitions
- specialized low-latency data structures

### Common Misconception

“Atomics are always faster than mutexes.”

Not necessarily. They can reduce blocking, but they can also introduce complexity, cache contention, and hard-to-debug ordering issues.

### Rule of Thumb

Use mutexes for protecting complex invariants. Use atomics for simple shared state or carefully designed low-level structures.

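
The counter use case can be sketched end to end as follows. The function name `count_requests` is hypothetical. Relaxed ordering is used on the assumption that only the final total matters; `join()` supplies the synchronization needed to read it afterwards.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: several threads increment one shared counter.
inline int count_requests(int num_threads, int per_thread) {
    std::atomic<int> requests{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&requests, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write: no increments are lost.
                requests.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return requests.load();
}
```

With a plain `int` in place of the atomic, the same loop would be a data race and the total could come out short.
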
## Concurrency Patterns That Actually Matter

### Producer-Consumer Queues

Classic in logging pipelines, background job systems, and network servers. One set of threads produces work; another consumes it.

Questions to think about:

- bounded or unbounded queue?
- backpressure behavior?
- shutdown semantics?

### Thread Pools

Instead of spawning threads per task, a fixed set of worker threads pulls tasks from a queue.

Why it exists:

- thread creation is not free
- unbounded thread growth harms latency and memory use
- controlled scheduling improves predictability

### Read-Mostly Data

Some systems have frequent reads and rare writes. In those cases, techniques such as reader-writer locks, versioned snapshots, or immutable data replacement can outperform coarse locking.

### Message Passing

Sometimes the best way to avoid synchronization bugs is to avoid shared mutable state. Passing messages between components can simplify reasoning, especially in actor-like or staged architectures.

## Performance Optimization in C++

### Intuition

C++ gives you performance opportunities, but it also gives you enough rope to optimize the wrong thing. The right approach is disciplined measurement.

### Step 1: Measure Before Changing Code

Use profilers, tracing, and benchmarks. Do not trust intuition alone.

Real performance work usually asks:

- is the problem CPU, memory, I/O, or lock contention?
- is the bottleneck algorithmic or microarchitectural?
- is latency or throughput the primary goal?

### Step 2: Care About Data Layout

Cache behavior often dominates performance.

Contiguous memory and compact structures usually outperform pointer-heavy designs because CPUs like predictable access patterns.

This is why:

- `vector` often beats `list`
- structure layout matters in hot paths
- unnecessary indirection hurts

### Step 3: Reduce Unnecessary Allocation

Heap allocation can be expensive because it involves allocator overhead, synchronization in some allocators, and worse locality.

Practical techniques:

- reserve container capacity
- reuse buffers
- use arenas or pools when appropriate
- prefer stack or embedded storage for small fixed-size data when it simplifies lifetime

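
The first two techniques can be sketched in a few lines. Both helper names are hypothetical. Note that `clear()` keeping the existing capacity is what mainstream implementations do in practice, which is why the buffer-reuse version typically allocates only a handful of times.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a vector of known size with a single allocation.
inline std::vector<int> make_sequence(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);                        // one allocation instead of repeated growth
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i));
    }
    return out;
}

// Hypothetical helper: reuses one string buffer across loop iterations
// instead of constructing a fresh temporary string each time.
inline std::size_t total_length(const std::vector<std::string>& names) {
    std::string buffer;
    std::size_t total = 0;
    for (const auto& name : names) {
        buffer.clear();                    // size becomes 0; capacity is typically retained
        buffer += "user:";
        buffer += name;
        total += buffer.size();
    }
    return total;
}
```

Whether either change matters for a given program is a measurement question, as Step 1 says.
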
### Step 4: Choose the Right Granularity

Overly fine-grained locking and overly fine-grained tasks can both destroy performance. Coordination cost can outweigh useful work.

### Step 5: Respect the Compiler, but Verify

Compilers can inline, vectorize, reorder, and eliminate copies aggressively, but only when code structure allows it. Write clear code first, then inspect profiles and the generated code if performance truly matters.

## Common Performance Pitfalls

- optimizing before measuring
- using node-heavy containers in hot loops without considering locality
- creating excessive temporary allocations
- copying large objects accidentally instead of moving or borrowing them
- adding threads to a workload that is actually memory-bound or lock-bound
- false sharing, where independent per-thread counters sit on the same cache line and interfere with each other

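
One common mitigation for false sharing is to align each per-thread counter to its own cache line. The sketch below assumes 64-byte cache lines, which is typical on x86-64 but not universal; the names `kCacheLine`, `PaddedCounter`, and `count_in_parallel` are hypothetical. It relies on C++17 so that `std::vector` can allocate the over-aligned type correctly.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// Each counter occupies its own cache line, so threads updating neighbouring
// counters do not keep invalidating each other's lines.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

inline long count_in_parallel(std::size_t num_threads, long per_thread) {
    std::vector<PaddedCounter> counters(num_threads);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counters, t, per_thread] {
            for (long i = 0; i < per_thread; ++i) {
                counters[t].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();

    long total = 0;
    for (const auto& c : counters) total += c.value.load();
    return total;
}
```

The padded and unpadded versions compute the same result; the difference shows up only in timing, so it has to be confirmed with a benchmark on the target hardware. Where the standard library provides it, `std::hardware_destructive_interference_size` from `<new>` can replace the hard-coded 64.
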
## C++ in Real Systems

### Game Engines

Why C++ fits:

- control over memory layout and custom allocators
- tight frame budgets
- performance-sensitive math, rendering, and asset systems
- need for portable native code across platforms

Common themes:

- entity-component systems
- data-oriented design
- custom resource streaming

### Trading Systems

Why C++ fits:

- low latency matters more than developer convenience in hot paths
- careful control over allocations and CPU behavior
- direct integration with network stacks and specialized hardware

Common themes:

- lock minimization
- cache-aware data structures
- careful measurement of tail latency

### Compilers and Developer Tools

Why C++ fits:

- large in-memory graph and tree structures
- need for performance across parsing, semantic analysis, and optimization
- portable command-line tooling

Common themes:

- arenas and bump allocators
- ownership-aware AST design
- string interning and symbol tables

## Design Patterns in C++

### RAII

In C++, RAII is more than a pattern. It is one of the language's core architectural strengths.

### Strategy

Useful when behavior varies but the call site should stay stable. This may be implemented with virtual interfaces, templates, or function objects depending on runtime vs compile-time needs.

### Factory

Useful when object creation logic is complex or ownership should be centralized. Modern C++ factories often return `unique_ptr` to make ownership explicit.

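
Strategy and Factory often appear together, as in the following sketch. The `Compressor` interface, its two implementations, and `make_compressor` are hypothetical names invented for this example; returning `nullptr` for an unknown kind is one possible error policy among several.

```cpp
#include <memory>
#include <string>

// Hypothetical strategy interface: call sites depend only on this.
class Compressor {
public:
    virtual ~Compressor() = default;   // virtual destructor so deletion through the base is safe
    virtual std::string name() const = 0;
};

class FastCompressor : public Compressor {
public:
    std::string name() const override { return "fast"; }
};

class SmallCompressor : public Compressor {
public:
    std::string name() const override { return "small"; }
};

// Factory: centralizes construction and hands exclusive ownership to the caller.
// Returns nullptr for an unknown kind; a real API might throw instead.
inline std::unique_ptr<Compressor> make_compressor(const std::string& kind) {
    if (kind == "fast") return std::make_unique<FastCompressor>();
    if (kind == "small") return std::make_unique<SmallCompressor>();
    return nullptr;
}
```

Because the return type is `unique_ptr`, the caller cannot forget to free the object, and the signature itself documents who owns it.
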
### Observer

Useful for event systems, but dangerous if lifetime is not carefully managed. Weak references, scoped subscriptions, or explicit unregistering are essential.

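
The weak-reference option can be sketched as follows. `EventSource` and its member names are hypothetical, and the sketch is single-threaded. The subject stores `std::weak_ptr`, so a listener that has been destroyed is skipped and pruned rather than called through a dangling pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical event source that holds its listeners weakly.
class EventSource {
public:
    using Listener = std::function<void(int)>;

    void subscribe(const std::shared_ptr<Listener>& listener) {
        listeners_.push_back(listener);   // converted to weak_ptr: does not extend lifetime
    }

    void emit(int value) {
        // Drop listeners whose owners have gone away.
        listeners_.erase(
            std::remove_if(listeners_.begin(), listeners_.end(),
                           [](const std::weak_ptr<Listener>& w) { return w.expired(); }),
            listeners_.end());
        // Iterate over a copy so a callback may subscribe without invalidating the loop.
        const auto snapshot = listeners_;
        for (const auto& weak : snapshot) {
            if (auto listener = weak.lock()) {
                (*listener)(value);
            }
        }
    }

    std::size_t listener_count() const { return listeners_.size(); }

private:
    std::vector<std::weak_ptr<Listener>> listeners_;
};
```

The subscriber keeps the `shared_ptr`; letting it go out of scope is the unsubscribe operation.
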
### Pimpl

The pointer-to-implementation pattern hides private representation details behind an owning pointer in the public class. It reduces rebuild cost and improves ABI stability, though it adds indirection.

### Composition Over Inheritance

This is especially valuable in C++ because inheritance carries object model and lifetime implications. Composition often produces flatter, easier-to-reason-about systems.

## Practical Systems Mindset

Strong C++ engineering usually comes from asking these questions repeatedly:

1. Who owns this object?
2. How long must it live?
3. What are the synchronization rules?
4. What is the dominant access pattern?
5. Where is the actual bottleneck?
6. Can the type system express the intended contract more clearly?

These questions connect language mechanics to system design.

## Interview Checkpoints

You should be able to explain:

- what a data race is and why it is undefined behavior
- when to use mutexes vs atomics
- why condition variables require predicate-based waiting
- how thread pools differ from thread-per-task designs
- why memory locality affects real performance
- where C++ still provides strong advantages in production systems
- which design patterns map naturally to C++ and why

## Final Takeaway

C++ rewards engineers who reason from first principles: memory layout, lifetime, ownership, data access patterns, and concurrency semantics. That is why it remains a serious language for systems work and interviews alike. Once you stop treating it as a bag of syntax and start treating it as a model of how software inhabits hardware, the language becomes much more coherent.