TarunElango/Computer-Fundamentals

tarun-elango be31df2d44 more text

2026-04-26 14:09:04 -04:00

File 1: Foundations of C++

Learning Goals

By the end of this file, you should be able to:

explain how C++ source code becomes a running executable
reason about basic types, object storage, and memory layout
distinguish stack allocation from heap allocation in practical terms
use pointers and references without treating them as magic syntax
debug common low-level failures with a structured mental model

This file is the foundation for the rest of the guide. If later topics like RAII, smart pointers, iterators, or multithreading feel abstract, come back here first. C++ becomes much easier once you can picture what the compiler produces and what memory actually looks like at runtime.

Why C++ Exists

C++ sits in an unusual position among mainstream languages. It gives you high-level abstractions such as classes, templates, exceptions, and a rich standard library, but it still lets you work close to the machine.

That combination is why C++ shows up in places where both abstraction and control matter:

game engines that need tight performance and custom memory behavior
trading systems that care about latency and predictable execution
databases, compilers, browsers, and storage engines that manipulate large amounts of structured data
embedded and systems code where resource use must be explicit

The core idea is not just “fast language.” Many languages are fast in some contexts. C++ is valuable because it lets you choose where to pay for abstraction and where to avoid it.

The Compilation Model

Intuition

In Python or JavaScript, you can often treat “running the code” as a direct action. In C++, there is a build pipeline between the source you write and the machine code the CPU executes. Understanding that pipeline helps explain many common C++ issues:

why header files exist
why template code often lives in headers
why link errors happen even when code compiles
why build systems matter so much in large codebases

The Big Picture

flowchart LR
    A[Source files .cpp] --> B[Preprocessor]
    H[Header files .h .hpp] --> B
    B --> C[Compiler]
    C --> D[Object files .o]
    D --> E[Linker]
    L[Libraries] --> E
    E --> F[Executable or shared library]

Preprocessing

Before the compiler sees your program, the preprocessor handles directives such as #include, #define, #if, and include guards.

What this means internally:

#include is essentially textual inclusion
macros are expanded before real compilation begins
conditional compilation can remove or include chunks of code based on flags

That is why headers can feel deceptively simple. A header is not linked in as a separate unit. Its contents are copied into each translation unit that includes it.

Example:

// math_utils.h
int add(int a, int b);

// main.cpp
#include "math_utils.h"

The compiler effectively sees the declaration from the header pasted into main.cpp before actual parsing.
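The directives described above can be seen at work in a single file. The following is a minimal sketch, not part of the original guide: `ENABLE_TRACING`, `MAX_RETRIES`, `build_mode`, and `retry_budget` are hypothetical names chosen for illustration.

```cpp
#include <string>

// Hypothetical feature flag for illustration. A real build would usually set
// it on the compiler command line (for example -DENABLE_TRACING=1).
#ifndef ENABLE_TRACING
#define ENABLE_TRACING 0
#endif

// Object-like macro: every later use of MAX_RETRIES is replaced with the
// token 3 before the compiler proper parses the code.
#define MAX_RETRIES 3

std::string build_mode() {
#if ENABLE_TRACING
    return "tracing";   // kept only when the flag is non-zero
#else
    return "plain";     // kept only when the flag is zero
#endif
}

int retry_budget() {
    return MAX_RETRIES * 2;  // after preprocessing: return 3 * 2;
}
```

On GCC and Clang, the `-E` flag stops after preprocessing and prints the expanded translation unit, which is a quick way to check what the compiler actually receives.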

Compilation

The compiler parses the preprocessed source, checks types, builds intermediate representations, optimizes code, and emits object files.

A .cpp file plus all text included into it after preprocessing becomes a translation unit.

Practical consequence:

syntax errors, type errors, and many template errors are compilation-time issues
each translation unit is compiled independently
the compiler only knows what declarations are visible in that translation unit

Linking

The linker resolves symbol references across object files and libraries.

If you declare a function in a header but forget to provide the definition in a compiled source file, compilation may succeed while linking fails.

Example:

// declared
int compute();

// used
int main() {
    return compute();
}

If no compiled object file contains a matching definition of compute, the linker reports an unresolved symbol.
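For contrast, here is a hedged single-file sketch of the version that links. The helper `run` and the file name `compute.cpp` mentioned in the comment are assumptions for illustration; in a real project the definition would usually sit in its own source file.

```cpp
// Declaration: promises that a function with this signature exists somewhere.
int compute();

// Use: this compiles with only the declaration in scope.
int run() {
    return compute() + 1;
}

// Definition: in a real project this would typically live in a separate
// source file (say compute.cpp, a hypothetical name) whose object file is
// handed to the linker. Without it, run() still compiles but the final
// link step fails with an unresolved symbol.
int compute() {
    return 41;
}
```

The point of the sketch is the split in responsibilities: the compiler only needs the declaration at the call site, while the linker needs exactly one matching definition across all object files.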

Practical Usage

This model matters constantly in real systems:

large codebases use headers to expose interfaces and source files to hide implementation
build time can explode if headers pull in too much code
libraries are distributed as headers plus compiled binaries or as header-only template libraries
ABI and symbol compatibility matter when separate teams ship shared libraries

Common Pitfalls

confusing compile errors with link errors
putting non-inline function definitions in headers and causing multiple definition errors
overusing macros when constants, constexpr, or templates would be safer
including large dependency trees in headers, which slows builds and increases coupling
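The macro pitfall in the list above is easiest to see with a small comparison. This is an illustrative sketch; `SQUARE_MACRO`, `square`, and `kMaxConnections` are hypothetical names.

```cpp
// Function-like macro: pure text substitution with no knowledge of types
// or operator precedence.
#define SQUARE_MACRO(x) x * x

// constexpr function: type-checked and scoped, and its argument is
// evaluated exactly once.
constexpr int square(int x) {
    return x * x;
}

// Named constant with a type and a scope, instead of #define.
constexpr int kMaxConnections = 64;

int macro_result() {
    return SQUARE_MACRO(1 + 2);  // expands to 1 + 2 * 1 + 2, which is 5
}

int function_result() {
    return square(1 + 2);        // argument is evaluated to 3 first, giving 9
}
```

Wrapping macro parameters in parentheses fixes this particular case, but the constexpr version avoids the whole class of problem.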

Variables, Types, and Object Storage

Intuition

A variable in C++ is not “just a name.” It is usually a named object with a type, storage duration, alignment requirements, and a region of memory associated with it.

The type system tells both the compiler and the reader what operations are legal and how many bytes an object likely occupies.

What a Type Really Means

A C++ type typically determines:

size, though this can vary by platform
alignment requirements
how the value is interpreted in memory
what operations are available
construction and destruction behavior for user-defined types

Consider:

int count = 42;
double ratio = 0.5;
char flag = 'Y';

These values are all just bits in memory, but the type tells the compiler how to read and manipulate those bits.
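Size and alignment can be queried directly with `sizeof` and `alignof`. The sketch below uses a hypothetical `Sample` struct and avoids hard-coding any size other than `sizeof(char)`, since the rest is implementation-defined.

```cpp
#include <cstddef>

// Hypothetical struct for illustration.
struct Sample {
    char flag;   // exactly 1 byte
    int count;   // commonly 4 bytes with 4-byte alignment, but not guaranteed
};

// sizeof(char) is 1 by definition; every other size here depends on the platform.
static_assert(sizeof(char) == 1, "char is always one byte");

// Sum of the member sizes, ignoring any padding the compiler inserts.
constexpr std::size_t members_total() { return sizeof(char) + sizeof(int); }

// Actual size of the struct, including padding.
constexpr std::size_t sample_size() { return sizeof(Sample); }

// Alignment requirement of the struct as a whole.
constexpr std::size_t sample_alignment() { return alignof(Sample); }
```

On many platforms `sample_size()` is larger than `members_total()` because the compiler pads after `flag` so that `count` is properly aligned.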

Value vs Representation

One useful systems-level habit is to separate a value from its representation.

For example, an int stores a signed integer value, but underneath it is a fixed number of bits whose width is implementation-defined, usually 32 on modern desktop and server platforms. A pointer stores an address value, but underneath it is also just bits.

This distinction matters when you debug memory corruption. The CPU does not know “this is a tree node” in some abstract sense. It only sees instructions and bytes. The meaning comes from your program's types and the compiler's generated code.
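One way to look at a representation directly is to copy an object's bytes into an integer of the same size. This sketch assumes a 32-bit IEEE 754 `float`, which the `static_assert` lines check; `bits_of` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float uses the IEEE 754 binary32 format and is 4 bytes wide.
static_assert(std::numeric_limits<float>::is_iec559, "expects IEEE 754 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Copies the object representation of a float into an integer of the same
// size. std::memcpy is a well-defined way to do this; dereferencing a
// reinterpret_cast'ed pointer is not.
std::uint32_t bits_of(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under those assumptions, `0.0f` and `-0.0f` compare equal as values yet have different bit patterns, which is exactly the value versus representation gap described above.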

Storage Duration

Every object in C++ has a storage duration. At a practical level, that answers: when does this object come into existence, and when does its storage stop being valid?

The main categories are:

automatic storage duration: usually local variables created when a scope is entered
static storage duration: global variables and static locals that live for the life of the program
dynamic storage duration: objects created explicitly on the heap, typically with new or via allocators

Later, RAII and smart pointers will build directly on this idea.
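The three categories can be placed side by side in a short sketch. The function names are hypothetical, and `std::make_unique` assumes C++14 or later.

```cpp
#include <memory>

// Static storage duration: one counter object for the whole program run.
int next_id() {
    static int counter = 0;   // initialized once, survives between calls
    return ++counter;
}

// Automatic storage duration: a fresh object on every call.
int fresh_local() {
    int local = 0;            // created on entry, gone on return
    return ++local;
}

// Dynamic storage duration: the int lives until its owner releases it,
// which here happens when the returned unique_ptr is destroyed.
std::unique_ptr<int> make_value(int v) {
    return std::make_unique<int>(v);
}
```

Calling `next_id()` repeatedly yields increasing values, while `fresh_local()` returns 1 every time because its variable does not persist.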

Stack vs Heap

Intuition

Beginners often memorize “stack is fast, heap is slow.” That is too shallow and often misleading.

The real difference is about lifetime management and allocation strategy.

stack allocation is usually automatic and scoped
heap allocation is explicit or indirect and more flexible

Mental Model

flowchart TB
    A[Program starts] --> B[Call main]
    B --> C[Create stack frame for main]
    C --> D[Call function]
    D --> E[Create another stack frame]
    E --> F[Return from function]
    F --> G[Frame removed automatically]
    C --> H[Heap objects may outlive function scope]

Stack Allocation

Local variables inside a function usually live on the stack, though the exact implementation is up to the compiler and optimizer.

Example:

void process() {
    int retries = 3;
    double threshold = 0.75;
}

Why it exists:

function-local state is extremely common
scoped lifetimes are easy to manage automatically
creation and cleanup can often be handled without a general-purpose allocator

Internally, each function call usually gets a stack frame holding return information, saved registers, and local storage. When the function returns, that frame is popped.

Practical usage:

temporary computation state
small fixed-size objects
ownership that should never outlive the current scope

Pitfalls:

returning pointers or references to local variables
allocating very large arrays on the stack and causing stack overflow
assuming stack layout is fixed across compilers or optimization levels
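The first pitfall in that list is worth seeing next to its fix. In this sketch the broken version is kept as a comment so the example stays well-defined; all function names are hypothetical.

```cpp
#include <string>

// BROKEN (left as a comment on purpose): the returned pointer would refer
// to storage that is released when the function returns.
//
//   const int* broken_limit() {
//       int limit = 100;
//       return &limit;   // dangling as soon as the caller receives it
//   }

// Safe alternative: return by value. The caller gets its own object.
int limit_by_value() {
    int limit = 100;
    return limit;
}

// Returning a larger object by value is also fine; compilers typically
// move it or elide the copy entirely.
std::string make_label(int id) {
    std::string label = "item-" + std::to_string(id);
    return label;
}
```

Most compilers warn about the commented-out pattern when warnings are enabled, which is one more reason to build with strict warning flags.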

Heap Allocation

Heap allocation is used when an object's lifetime must outlive a scope, when size is only known at runtime, or when ownership must be transferred across components.

Example:

int* value = new int(42);
delete value;

Internally, new usually asks an allocator for a chunk of dynamic memory, then constructs the object in that memory. delete destroys the object and releases the storage.

Practical usage:

dynamic data structures such as graphs or trees
objects shared across subsystems
buffers sized from runtime input

Pitfalls:

memory leaks from forgetting delete
double delete from freeing the same pointer twice
dangling pointers after deletion
heap fragmentation and allocator overhead in performance-sensitive systems

Important note: in modern C++, direct new and delete should be rare in application code. Prefer containers and smart pointers. You still need to understand heap behavior because the abstractions are built on top of it.
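As a rough illustration of that advice, the sketch below builds a small heap-allocated list and a runtime-sized buffer without writing `new` or `delete`. `Node`, `make_pair_list`, and `make_buffer` are hypothetical names, and `std::make_unique` assumes C++14 or later.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical list node for illustration.
struct Node {
    int value;
    std::unique_ptr<Node> next;   // owning link: destroying a node destroys its tail
};

// Builds a two-node list on the heap.
std::unique_ptr<Node> make_pair_list(int first, int second) {
    auto head = std::make_unique<Node>();
    head->value = first;
    head->next = std::make_unique<Node>();
    head->next->value = second;
    return head;   // ownership moves to the caller
}

// Buffer whose size is only known at runtime; the vector releases its
// storage automatically when it goes out of scope.
std::vector<int> make_buffer(std::size_t n) {
    return std::vector<int>(n, 0);
}
```

The heap is still involved in both functions. What changes is that the release of each allocation is tied to an owner's destructor instead of a hand-written `delete`.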

Pointers

Intuition

A pointer is a value whose job is to hold the address of another object. That is all. It is powerful because it lets you refer to memory indirectly.

Pointers exist because systems software constantly needs indirect access:

linked data structures
optional access to objects
efficient parameter passing without copying large objects
polymorphic behavior through base-class pointers
interaction with operating systems, hardware, and C APIs

Basic Form

int score = 99;
int* ptr = &score;

Here:

score is an int
&score means “address of score”
ptr stores that address
*ptr means “the int stored at that address”

Pointer Relationship Diagram

flowchart LR
    P[ptr] -->|stores address| S[score in memory]
    S --> V[99]

How It Works Internally

On a 64-bit system, a pointer is commonly 8 bytes. The compiler tracks the pointed-to type because pointer arithmetic and dereferencing depend on that type.

For example, incrementing an int* advances by sizeof(int) bytes, not by 1 byte.

int values[3] = {10, 20, 30};
int* p = values;
++p; // now points to values[1]

The compiler scales the increment according to the pointed-to type.
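The scaling can be measured by viewing the two addresses as `char*` and subtracting them. This is a minimal sketch with hypothetical function names.

```cpp
#include <cstddef>

// Returns how many bytes separate p from p + 1 for an int pointer.
std::size_t int_step_in_bytes() {
    int values[3] = {10, 20, 30};
    int* p = values;
    int* q = p + 1;   // one element further, not one byte further
    // Viewing both addresses as char* measures the distance in bytes.
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p));
}

// Walks the array with a pointer and returns the second element.
int second_element() {
    int values[3] = {10, 20, 30};
    int* p = values;
    ++p;              // now points to values[1]
    return *p;
}
```

The byte distance equals `sizeof(int)` on any platform, whatever that size happens to be.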

Practical Usage

traversal in low-level data structures
API boundaries that may accept nullable inputs
efficient manipulation of contiguous buffers
ownership and lifetime control in specialized libraries or allocators

Common Pitfalls

dereferencing nullptr
dereferencing uninitialized pointers
using a pointer after the object it points to has been destroyed
confusing ownership with access: a pointer can point to something without owning it

That last point is critical. A raw pointer does not tell you who is responsible for deleting the object.
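The split between access and ownership can be sketched as follows. The names and the default of 250 are made up for illustration, and `std::make_unique` assumes C++14 or later.

```cpp
#include <memory>

// Non-owning, nullable parameter: the function may read through the pointer
// but never deletes it. nullptr means "no override supplied".
int effective_timeout(const int* override_ms) {
    constexpr int kDefaultMs = 250;   // hypothetical default for this sketch
    return override_ms != nullptr ? *override_ms : kDefaultMs;
}

// The unique_ptr is the owner; get() hands out a non-owning view.
int timeout_from_owner() {
    std::unique_ptr<int> owned = std::make_unique<int>(900);
    const int* view = owned.get();     // access without ownership
    return effective_timeout(view);    // owned is still alive during the call
}
```

A common convention, though not one the language enforces, is that a raw pointer in an interface means "borrowed, possibly null" and ownership is expressed with a smart pointer type.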

References

Intuition

A reference is an alias to an existing object. It exists to make code safer and clearer than pointer-heavy interfaces when nullability and reseating are not needed.

Example:

void increment(int& value) {
    ++value;
}

Why References Exist

Without references, you would often pass pointers just to avoid copying objects. But pointers imply optionality and manual dereferencing.

References express a stronger contract:

this function expects a valid object
there is no need for null checks as part of normal usage
the alias should behave like the original object

Internal View

At the machine level, a reference is often implemented similarly to a pointer, but the language treats it differently.

Key properties:

must be initialized when created
cannot be reseated to refer to another object
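cannot be null in a program with well-defined behavior; producing a "null reference" requires undefined behavior such as dereferencing a null pointer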
use normal object syntax instead of pointer syntax

flowchart LR
    R[ref] -->|alias of| X[x]
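The "cannot be reseated" property is easy to misread, because assignment through a reference looks like rebinding. A small sketch, with hypothetical function names:

```cpp
// Takes its argument by reference, so the caller's object is modified.
void increment(int& value) {
    ++value;
}

// Incrementing through the alias and through the original name both
// change the same object.
int increment_twice(int start) {
    int x = start;
    int& ref = x;
    increment(ref);
    increment(x);
    return x;
}

// Assigning to a reference writes to the original object; it does not
// make the reference refer to something else.
bool assignment_does_not_reseat() {
    int x = 1;
    int y = 50;
    int& ref = x;   // ref is bound to x for its whole lifetime
    ref = y;        // copies 50 into x
    y = 99;         // changing y afterwards does not affect ref
    return &ref == &x && ref == 50;
}
```

Taking the address of `ref` yields the address of `x`, which is the practical meaning of "alias".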

Practical Usage

passing large objects efficiently without copying
operator overloading and fluent APIs
returning aliases to subobjects when lifetime is guaranteed

Pitfalls and Misconceptions

a reference is not an independent object, and it does not manage or extend the lifetime of the object it refers to
returning a reference to a local variable is still invalid
“references are always safer than pointers” is too simplistic; pointers are the right tool when optionality, reseating, or explicit low-level behavior is required

Const Correctness

Intuition

const is one of the cheapest ways to make C++ code easier to reason about. It restricts mutation and therefore reduces the number of possible program states.

Practical Examples

void print(const std::string& name);

const int limit = 100;

Why it matters in real systems:

APIs become clearer about who is allowed to modify data
the compiler can catch accidental writes
reviewers can reason more quickly about ownership and side effects

Common Pitfalls

confusing const int* p with int* const p
using const inconsistently across interfaces
assuming const automatically implies thread safety or deep immutability
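The first pitfall comes down to which side of the `*` the `const` sits on. A short sketch with hypothetical function names; the lines that would not compile are left as comments.

```cpp
#include <cstddef>
#include <string>

// Read-only access to a caller-owned string, with no copy.
std::size_t name_length(const std::string& name) {
    return name.size();
}

int const_pointer_forms() {
    int a = 1;
    int b = 2;

    const int* p = &a;   // pointer to const int: cannot write *p, can repoint p
    p = &b;              // OK
    // *p = 5;           // error: the pointee is read-only through p

    int* const q = &a;   // const pointer to int: can write *q, cannot repoint q
    *q = 10;             // OK, a becomes 10
    // q = &b;           // error: q itself is const

    return *p + *q;      // 2 + 10
}
```

Reading the declaration from right to left helps: `int* const q` is "q is a const pointer to int", while `const int* p` is "p is a pointer to int that is const".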

Arrays, Decay, and Basic Memory Layout

Intuition

C++ inherits much of C's memory model. Arrays are contiguous blocks of elements, which is why they are fast for indexed access and cache-friendly iteration.

int values[4] = {1, 2, 3, 4};

The elements are stored adjacent in memory. That contiguity is why pointer arithmetic and array indexing are closely related.

Under the Hood

values[i] is conceptually equivalent to *(values + i).

This is powerful, but it is also why out-of-bounds access is dangerous. C++ does not automatically check bounds for raw arrays.
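The equivalence, and one bounds-checked alternative, can be sketched as follows. The function names are hypothetical; `std::array::at` is the standard checked accessor and throws `std::out_of_range` on a bad index.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Checks that subscripting and pointer arithmetic read the same elements.
bool indexing_matches_pointer_arithmetic() {
    int values[4] = {1, 2, 3, 4};
    for (std::size_t i = 0; i < 4; ++i) {
        if (values[i] != *(values + i)) {
            return false;
        }
    }
    return true;
}

// std::array::at performs the bounds check that raw indexing skips.
bool at_rejects_out_of_range() {
    std::array<int, 4> values = {1, 2, 3, 4};
    try {
        (void)values.at(4);   // one past the last valid index
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The checked accessor costs a comparison per access, which is why raw indexing remains common in hot loops where the bounds are established some other way.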

Practical Usage

numerical buffers
serialization code
high-performance loops
interop with C libraries

Pitfalls

array-to-pointer decay in function parameters
buffer overflows
assuming stack arrays automatically know their size when passed to a function

In most application code, prefer std::array for fixed-size arrays and std::vector for dynamic arrays. You will still see raw arrays in systems code, embedded code, and performance-critical paths.
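To make the decay pitfall concrete, the sketch below shows three ways to pass a fixed-size sequence and what each one knows about the length. All function names are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Pointer parameter: an array argument decays to a pointer to its first
// element, so the length has to be passed separately.
int sum_raw(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Reference-to-array parameter: N is deduced from the argument, so the
// size survives the call.
template <std::size_t N>
std::size_t element_count(const int (&)[N]) {
    return N;
}

// std::array carries its size as part of its type.
template <std::size_t N>
int sum_array(const std::array<int, N>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

With `sum_raw`, nothing stops a caller from passing a count larger than the real array, which is how many buffer overflows start.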

A Debugging Mental Model

Intuition

Low-level bugs in C++ often feel mysterious only when you lack a runtime model. Most of the time, they reduce to one of a few categories:

invalid lifetime
invalid memory access
wrong ownership
incorrect assumptions about object state
data races in concurrent code

A Useful Diagnostic Loop

When debugging a crash or corruption issue, ask these questions in order:

What object was accessed?
Was it initialized?
Is its lifetime still valid?
Who owns it?
Could memory nearby have been overwritten?
Is the failure deterministic or timing-dependent?

That checklist is more valuable than memorizing debugger buttons.

Common Failure Modes

Segmentation Faults

Usually caused by dereferencing an invalid address such as:

nullptr
a dangling pointer
a wild pointer from uninitialized memory

Use-After-Free

You delete an object, but some pointer or reference still points to the old address. The address may still look valid for a while, which makes this class of bug subtle.

Stack Corruption

Often caused by out-of-bounds writes into local arrays or incorrect pointer arithmetic.

Memory Leaks

The program keeps allocating memory without freeing it. In long-running services, that becomes a production issue rather than just a test annoyance.

Practical Tools

Real C++ debugging is easier when you use tooling, not just intuition:

compiler warnings: start with strict warnings enabled
AddressSanitizer: catches use-after-free, buffer overflows, and more
UndefinedBehaviorSanitizer: catches many invalid language-level operations
Valgrind on supported platforms: useful for leaks and invalid accesses
debugger: inspect stack frames, variables, and memory addresses

Example build flags on Clang or GCC for local debugging:

-Wall -Wextra -Wpedantic -fsanitize=address,undefined -g

Misconception to Avoid

“If it only crashes sometimes, the code is almost correct.”

In C++, nondeterministic behavior is often a sign of undefined behavior (UB), not a minor bug. Once a program has UB, the optimizer and runtime can produce very different outcomes from one build or machine to another.

Foundation Patterns That Matter Later

Several later C++ ideas are really lifetime-management patterns built on the concepts above:

constructors and destructors manage object setup and cleanup
RAII ties resource lifetime to scope lifetime
smart pointers model ownership on top of heap allocation
containers hide raw memory management while preserving performance properties
concurrency primitives rely on precise reasoning about storage and object lifetime
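As a small preview of how scope drives cleanup, the sketch below records when a guard object is constructed and destroyed. `ScopeLogger` and `run_scopes` are hypothetical names invented for this illustration, not a standard facility.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical guard type: logs on construction and on destruction.
class ScopeLogger {
public:
    ScopeLogger(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("acquire " + name_);
    }

    ~ScopeLogger() {
        log_.push_back("release " + name_);
    }

    // Copying a guard would make the cleanup run twice, so forbid it.
    ScopeLogger(const ScopeLogger&) = delete;
    ScopeLogger& operator=(const ScopeLogger&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

std::vector<std::string> run_scopes() {
    std::vector<std::string> log;
    {
        ScopeLogger outer(log, "outer");
        {
            ScopeLogger inner(log, "inner");
        }   // inner's destructor runs here
        log.push_back("between");
    }       // outer's destructor runs here
    return log;
}
```

The resulting log reads: acquire outer, acquire inner, release inner, between, release outer. Destructors run in reverse order of construction at the closing brace of each scope, with no explicit cleanup call.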

If you can already picture stack frames, heap allocation, pointer indirection, and the compile-link pipeline, you are ready for object-oriented and modern C++ design.

Interview Checkpoints

You should be able to explain these clearly in an interview without hiding behind buzzwords:

the difference between compilation and linking
why headers can increase build time and coupling
what stack and heap allocation really mean in terms of lifetime
the difference between a pointer and a reference
what causes dangling pointers and use-after-free bugs
why const improves API design and reasoning

What Comes Next

The next file builds on these memory and lifetime foundations to explain classes, constructors, destructors, inheritance, and polymorphism. The key shift is this: C++ object-oriented features are not separate from the memory model. They are layered on top of it.
