From 3325c57a504a2875fb7ebc5cbb46d5b927a84154 Mon Sep 17 00:00:00 2001 From: tarun-elango Date: Sat, 25 Apr 2026 12:49:47 -0400 Subject: [PATCH] first commit --- .DS_Store | Bin 0 -> 6148 bytes compilersInterpreters.md | 1019 ++++++++++++++++++++++++++++++++++ networking.md | 829 ++++++++++++++++++++++++++++ os/concurrency.md | 1125 ++++++++++++++++++++++++++++++++++++++ os/memoryManagement.md | 921 +++++++++++++++++++++++++++++++ os/processmanagement.md | 1021 ++++++++++++++++++++++++++++++++++ os/storage.md | 2 + os/systemOperations.md | 1108 +++++++++++++++++++++++++++++++++++++ 8 files changed, 6025 insertions(+) create mode 100644 .DS_Store create mode 100644 compilersInterpreters.md create mode 100644 networking.md create mode 100644 os/concurrency.md create mode 100644 os/memoryManagement.md create mode 100644 os/processmanagement.md create mode 100644 os/storage.md create mode 100644 os/systemOperations.md diff --git a/.DS_Store b/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..b654f622cf1a298ccf3953f78f730a4064bf16b3 GIT binary patch literal 6148 zcmeHKO>fgc5S>j^>Zq!e14xy)tki3O1VJOj#R}!nLoXPC5TFp(RnLqxPk$}(zHE$AFo6`gHle*1;l0;625F5Yd*iR*Ief5F#ynrCIz?|&Dqt@iev zonR-}4L(MnbsbfcYF3UW#e36xt#uwR<4OD`OY3p>?hCD|Nv6`m60#&k%7?d^N_0Ka zGnJH<8@Y*~6LiMi`}6sO!D0VdPd*+#UG(I9@MLhp~uD{MqvCQU}fNj75JkH+yF0jb*lgX literal 0 HcmV?d00001 diff --git a/compilersInterpreters.md b/compilersInterpreters.md new file mode 100644 index 0000000..a10dd0d --- /dev/null +++ b/compilersInterpreters.md @@ -0,0 +1,1019 @@ +# Compilers and Interpreters: Interview Guide for Software Engineers + +This guide is aimed at interview preparation rather than language theory for its own sake. The goal is to help you build a mental model that is strong enough to answer both conceptual questions and practical engineering questions such as: + +- Why does Java start slower but often run faster than Python? 
+- What exactly happens between writing `main.cpp` and running a binary? +- Why do some languages need a runtime even after they are compiled? +- What does a parser or an AST actually do in a real compiler? +- Why do link errors happen even when compilation succeeds? + +The most important meta-point is this: modern language implementations are usually hybrids. In interviews, avoid talking as if every language is either "compiled" or "interpreted" in a pure sense. C++, Java, Python, JavaScript, and Go all sit at different points on a spectrum. + +## 1. What Compilers and Interpreters Are + +At a high level, a programming language implementation must turn human-readable source code into behavior that a machine can execute. + +### Compiler + +A compiler translates source code into another form before the program runs. That target form might be: + +- Native machine code for a specific CPU and operating system +- Assembly that is later assembled into machine code +- Bytecode for a virtual machine +- An intermediate representation consumed by later stages + +The key idea is that significant translation work happens ahead of execution. + +Examples: + +- C and C++ are usually compiled ahead of time into native binaries. +- Go is compiled ahead of time into native machine code. +- Java source is compiled by `javac` into JVM bytecode. + +### Interpreter + +An interpreter executes a program more directly, usually by reading a representation of the program and performing the required actions step by step. Historically that meant walking source code or a parse tree directly. In modern systems, it often means executing bytecode inside a virtual machine. + +Examples: + +- CPython compiles Python source to bytecode, then interprets that bytecode in a virtual machine. +- JavaScript engines often start by interpreting bytecode before JIT-compiling hot code paths. 
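The CPython example above can be observed from inside Python itself. The following is a minimal sketch, assuming a CPython interpreter; the `add` function and the `<example>` filename are illustrative names, not part of the guide.

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# compile() runs CPython's front end and returns a code object holding bytecode.
module_code = compile(source, "<example>", "exec")

namespace = {}
exec(module_code, namespace)  # the VM executes the module-level bytecode
add = namespace["add"]

# The function's own bytecode is stored on its code object as raw bytes.
raw_bytecode = add.__code__.co_code
print(type(raw_bytecode).__name__, len(raw_bytecode) > 0)

# dis prints a readable listing; the exact opcodes differ between CPython versions.
dis.dis(add)

result = add(2, 3)
print(result)
```

The listing printed by `dis.dis` is what the interpreter loop actually steps through, which is why "Python is interpreted" is only part of the story.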
+ +### The important nuance + +People often say: + +- "C++ is compiled" +- "Python is interpreted" +- "Java uses JIT" + +These are directionally correct, but incomplete. + +More precise versions are: + +- C++ is usually compiled ahead of time to native code and then linked into an executable. +- CPython compiles source to bytecode, then interprets bytecode in a VM. +- Java compiles source to bytecode, then the JVM interprets and JIT-compiles hot paths to native code. + +That precision is often what separates a shallow interview answer from a strong one. + +## 2. Compiler vs Interpreter vs JIT + +### Ahead-of-time compilation + +Ahead-of-time, or AOT, compilation means translating source code before the program starts running. + +Strengths: + +- Low runtime translation overhead +- Strong optimization opportunities before deployment +- Produces native binaries that the CPU can execute directly + +Tradeoffs: + +- Output is often platform-specific +- Less runtime information is available for optimization +- Build steps can be more complex + +### Interpretation + +Interpretation means execution happens through a program that understands another program. + +Strengths: + +- Portability when the interpreter exists on many platforms +- Faster edit-run feedback in many workflows +- Easier dynamic behavior in many implementations + +Tradeoffs: + +- Higher execution overhead +- Fewer opportunities for low-level optimization in the simple case +- Performance depends heavily on interpreter design + +### JIT, or just-in-time compilation + +JIT compilation happens during execution. The runtime observes the program while it runs, identifies hot code paths, and compiles those parts into machine code on the fly. 
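The "observe, then compile the hot part" idea can be imitated with a toy. This is only an analogy under stated assumptions: `TieredExpr` and `HOT_THRESHOLD` are hypothetical names, and the "compiled" tier here is a cached Python code object rather than native machine code, so it illustrates the tiering decision and not a real JIT.

```python
HOT_THRESHOLD = 3  # hypothetical cutoff for treating an expression as hot


class TieredExpr:
    """Evaluate an expression slowly until it is hot, then reuse compiled code."""

    def __init__(self, expr_source):
        self.expr_source = expr_source
        self.calls = 0
        self.compiled = None  # filled in once the expression becomes hot

    def run(self, env):
        self.calls += 1  # lightweight profiling: count executions
        if self.compiled is None and self.calls >= HOT_THRESHOLD:
            # "Tier up": pay the compilation cost once and cache the result.
            self.compiled = compile(self.expr_source, "<hot>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {}, env)  # fast path: no re-parsing
        return eval(self.expr_source, {}, env)  # slow path: parses every call


expr = TieredExpr("a * b + 1")
results = [expr.run({"a": 2, "b": 3}) for _ in range(5)]
print(results, expr.compiled is not None)
```

A real JIT makes the same kind of decision with far more information, such as observed types and branch behavior, and emits machine code instead of a cached code object.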
+ +Strengths: + +- Can use runtime information such as actual types, branch behavior, and call frequency +- Often gets better peak performance than a pure interpreter +- Can optimize only the parts of the program that matter in practice + +Tradeoffs: + +- Startup cost and warm-up time +- More complex runtime system +- Optimized code may need to be discarded if assumptions become invalid, which is called deoptimization + +### Comparison table + +| Aspect | Compiler (AOT) | Interpreter | JIT | +| --- | --- | --- | --- | +| When translation happens | Before execution | During execution | During execution | +| Typical target | Native code or bytecode | Direct execution of source, AST, or bytecode | Native code generated from bytecode or IR | +| Startup time | Often good after build step | Often good for simple scripts | Usually slower due to warm-up | +| Peak performance | Usually strong | Usually weaker | Often very strong | +| Portability | Often lower for native binaries | Often high | High if VM exists | +| Runtime complexity | Lower | Moderate | High | + +```mermaid +flowchart LR + A[Source Code] --> B[AOT Compiler] + B --> C[Native Binary] + C --> D[CPU Executes] + + E[Source Code] --> F[Interpreter or VM] + F --> G[Operations Executed Step by Step] + + H[Source Code] --> I[Bytecode Compiler] + I --> J[VM] + J --> K[Profiler Identifies Hot Paths] + K --> L[JIT Compiler] + L --> M[Native Machine Code] + M --> N[CPU Executes Hot Code] +``` + +## 3. High-Level Language to Machine Code Flow + +For many languages, the implementation pipeline looks like this: + +1. Read source text. +2. Break the text into tokens. +3. Parse tokens into a structural representation such as an AST. +4. Run semantic checks such as name resolution and type checking. +5. Lower the program into an intermediate representation. +6. Optimize the intermediate representation. +7. Generate machine code or bytecode. +8. Link with libraries and runtime support. +9. Load the result into memory. +10. 
Execute with help from the runtime, operating system, and hardware. + +This is the high-level picture interviewers usually want you to hold in your head. + +```mermaid +flowchart LR + A[Source Code] --> B[Lexical Analysis] + B --> C[Parsing] + C --> D[AST] + D --> E[Semantic Analysis] + E --> F[Intermediate Representation] + F --> G[Optimization] + G --> H[Code Generation] + H --> I[Object Code or Bytecode] + I --> J[Linking and Loading] + J --> K[Runtime Execution] +``` + +### Front end vs back end + +Compilers are often divided into two broad parts: + +- Front end: language-specific work such as lexing, parsing, semantic analysis, and type checking +- Back end: target-specific work such as optimization, register allocation, instruction selection, and code generation + +The intermediate representation between them is what makes retargeting possible. A single front end can feed multiple back ends, and a single back end can sometimes support multiple languages. + +## 4. Lexical Analysis, or Tokenization + +Lexical analysis turns a stream of characters into a stream of tokens. + +Tokens are language-level units such as: + +- Keywords: `if`, `while`, `return` +- Identifiers: variable names and function names +- Literals: `42`, `3.14`, `"hello"` +- Operators: `+`, `-`, `==`, `&&` +- Delimiters: `(`, `)`, `{`, `}`, `,`, `;` + +Given this code: + +```txt +total = price * 2 + tax +``` + +A lexer may produce something like: + +- `IDENT(total)` +- `ASSIGN` +- `IDENT(price)` +- `STAR` +- `INT(2)` +- `PLUS` +- `IDENT(tax)` + +### Why tokenization matters + +It simplifies later stages. The parser does not want to reason about raw characters. It wants structured units with categories and values. 
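The token list above can be reproduced with a small regex-driven lexer. This is a minimal sketch: the token names mirror the example, while `TOKEN_SPEC`, `MASTER`, and `tokenize` are names chosen here for illustration, and only a handful of operators are supported.

```python
import re

# Order matters: regex alternation takes the first alternative that matches,
# so the two-character `==` is listed before the one-character `=`.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("STAR", r"\*"),
    ("PLUS", r"\+"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a string of characters into a list of token labels."""
    tokens = []
    for match in MASTER.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not itself a token
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {value!r} at offset {match.start()}")
        # Only identifiers and literals carry a value; operators are just a category.
        tokens.append(f"{kind}({value})" if kind in ("IDENT", "INT") else kind)
    return tokens


tokens = tokenize("total = price * 2 + tax")
print(tokens)
```

The parser can now work with categories such as `IDENT` and `PLUS` instead of raw characters.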
+ +### What lexers usually handle + +- Whitespace and comments +- Keyword recognition +- Numeric and string literal formats +- Error reporting for invalid characters or malformed literals +- Longest-match behavior for operators such as `=` versus `==` + +### Implementation intuition + +Lexers are often built using regular expressions or finite automata. You do not usually need to derive automata in interviews unless the role is compiler-heavy, but you should know the relationship: + +- Regular languages are enough for token patterns. +- Context-free grammars are needed for nested syntax structures. + +## 5. Parsing and Syntax Trees + +Parsing takes the token stream and checks whether it matches the grammar of the language. + +If lexing answers, "What are the words?", parsing answers, "What is the sentence structure?" + +### Parse tree vs AST + +A parse tree preserves the grammar structure in a very literal way. + +An abstract syntax tree, or AST, keeps the meaningful program structure while dropping unnecessary grammar detail. + +For example, the expression: + +```txt +a + b * c +``` + +should parse so that multiplication binds tighter than addition. + +An AST might look like this: + +```mermaid +graph TD + Add[+] + Add --> A[a] + Add --> Mul[*] + Mul --> B[b] + Mul --> C[c] +``` + +That tree encodes precedence correctly. If the parser built `(a + b) * c`, the AST would have a different shape and the meaning would change. + +### Common parsing approaches + +- Recursive descent: simple and common for hand-written parsers +- LL parsers: top-down parsing families +- LR parsers: bottom-up parsing families, common in parser generators +- Pratt parsers: elegant for expression parsing with precedence and associativity + +In interviews, you usually do not need to compare LR item sets or parsing tables unless that is the focus. It is more valuable to explain what parsing accomplishes and why operator precedence and associativity matter. + +## 6. 
Semantic Analysis + +After parsing, a program may be syntactically valid but still semantically invalid. + +Examples: + +- Using a variable before it is declared +- Calling a function with the wrong number of arguments +- Adding a string to an integer in a statically typed language that disallows it +- Returning a value from a function declared as `void` +- Referencing a private symbol where it is not visible + +Semantic analysis is where the compiler checks meaning. + +### Typical semantic checks + +- Name resolution: what declaration does this identifier refer to? +- Scope checking: is the name visible here? +- Type checking: are operations applied to compatible types? +- Control-flow checks: does every path return a value when required? +- Definite assignment checks: was a variable initialized before use? +- Access control checks: public, private, protected, package visibility, and similar rules + +### Static vs dynamic semantic checks + +Some languages shift more checks to runtime. + +- In Java or Go, many errors are caught before execution. +- In Python or JavaScript, some checks happen only when that code path actually runs. + +This is why syntactically valid Python code can import successfully but still fail later when a particular function is executed. + +## 7. Symbol Tables + +A symbol table is the compiler's record of names and what they mean. + +It typically stores entries for things like: + +- Variables +- Functions and methods +- Classes, structs, interfaces, and types +- Constants and enums +- Modules and packages +- Labels and sometimes temporary compiler-generated symbols + +For each symbol, the compiler may track: + +- Name +- Kind, such as variable or function +- Type information +- Scope level +- Storage location or stack offset +- Linkage and visibility +- Mutability or const-ness + +### Why symbol tables matter + +They are used across multiple stages: + +- Semantic analysis uses them for name lookup and type checking. 
+- Optimization may use them for aliasing and data-flow facts. +- Code generation uses them to map names to memory locations or registers. +- Linkers use symbol information to resolve references across object files. + +### Scope example + +```txt +int x = 1; +{ + string x = "inner"; + print(x); +} +print(x); +``` + +The inner `x` shadows the outer `x`. A symbol table usually models this by pushing a new scope when entering a block and popping it when leaving. + +## 8. Static vs Dynamic Typing + +Typing is about when type constraints are checked and how much type information is known before execution. + +### Static typing + +In statically typed languages, many type errors are caught before the program runs. + +Examples: + +- Java +- Go +- C++ + +Benefits: + +- Earlier error detection +- Better compiler reasoning for optimization +- Stronger tooling and refactoring support + +Tradeoffs: + +- More upfront constraints +- Some patterns may require more annotations or more complex type systems + +### Dynamic typing + +In dynamically typed languages, values carry type information at runtime and many checks happen during execution. + +Examples: + +- Python +- JavaScript + +Benefits: + +- Flexible programming style +- Faster prototyping in many cases +- Easier expression of some dynamic patterns + +Tradeoffs: + +- More runtime checks +- Some errors appear later +- Implementations often need more runtime machinery + +### Important nuance for interviews + +Static versus dynamic typing is not the same thing as compiled versus interpreted. + +- Go is statically typed and compiled. +- Java is statically typed and compiled to bytecode, then JIT-executed. +- Python is dynamically typed but still goes through compilation to bytecode. +- TypeScript is statically typed at development time, but JavaScript at runtime remains dynamically typed. + +## 9. 
Intermediate Representation, or IR + +An intermediate representation is a program form used between the source language and the target machine code. + +Why compilers use IR: + +- It decouples language-specific parsing from machine-specific code generation. +- It gives the optimizer a cleaner structure to work on. +- It allows multiple source languages or multiple target architectures to share infrastructure. + +### Common IR forms + +- ASTs +- Three-address code +- Control-flow graphs +- Static single assignment, or SSA, form +- Bytecode for a VM + +### Simple example + +Source code: + +```txt +x = a * 2 + b +``` + +A simple three-address IR could look like: + +```txt +t1 = a * 2 +t2 = t1 + b +x = t2 +``` + +That form is easier for later optimization passes than the original source text. + +### SSA intuition + +In SSA form, each variable is assigned only once. That makes data-flow analysis cleaner. + +Instead of repeatedly updating `x`, the compiler may create versions like `x1`, `x2`, and `x3`. SSA is heavily used in modern optimizing compilers such as LLVM-based systems and JIT compilers. + +## 10. Optimization Basics + +Optimization is about improving some objective while preserving program meaning. + +That objective might be: + +- Faster execution +- Smaller binary size +- Lower memory use +- Lower power consumption +- Better startup latency + +There is no universal "best" optimization. Interviewers often want to hear that optimization is a tradeoff, not magic. 
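Before looking at individual passes, it helps to see how the three-address form from the IR section could be produced mechanically. The sketch below borrows Python's own `ast` module as a stand-in front end; `lower_assignment` and the `t1`, `t2` temporary naming are assumptions made for illustration, and only simple binary arithmetic is handled.

```python
import ast
import itertools

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def lower_assignment(source):
    """Lower one `name = expression` statement to three-address code."""
    assign = ast.parse(source).body[0]
    temp_ids = itertools.count(1)
    lines = []

    def lower(node):
        # Leaves are used directly; each binary operation gets a fresh temporary.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp):
            left = lower(node.left)
            right = lower(node.right)
            temp = f"t{next(temp_ids)}"
            lines.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    result = lower(assign.value)
    lines.append(f"{assign.targets[0].id} = {result}")
    return lines


ir = lower_assignment("x = a * 2 + b")
print("\n".join(ir))
```

Because every temporary is assigned exactly once, output like this is already close to the SSA style that optimizers prefer to work on.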
+ +### Common optimization passes + +- Constant folding: compute constant expressions at compile time +- Constant propagation: carry known constant values forward +- Dead code elimination: remove code that cannot affect results +- Common subexpression elimination: avoid recomputing the same expression +- Inlining: replace a call with the function body when profitable +- Loop-invariant code motion: move repeated work out of loops +- Strength reduction: replace expensive operations with cheaper ones when possible +- Escape analysis: determine whether an object can stay on the stack instead of the heap +- Devirtualization: replace indirect dispatch with direct calls when the target is known + +### Why optimizers need care + +They must preserve observable behavior. That gets tricky because "observable behavior" includes more than just printed output. It can include: + +- Memory ordering +- Exception behavior +- Volatile reads and writes +- Undefined behavior rules in languages like C and C++ +- Reflection and dynamic code loading in runtime-heavy languages + +### Interview insight + +If asked why a compiler cannot always optimize more aggressively, a strong answer mentions aliasing, unknown side effects, dynamic dispatch, separate compilation boundaries, and the need to preserve semantics. + +## 11. Code Generation + +Code generation turns IR into target code such as machine instructions or bytecode. + +Important subproblems include: + +- Instruction selection: which machine instructions implement the operation? +- Register allocation: which values stay in CPU registers versus spilling to memory? +- Stack frame layout: where do locals, saved registers, and return addresses live? +- Calling conventions: how are arguments passed and return values received? +- Emitting metadata: debug info, relocation entries, unwind info, and symbol information + +### Example intuition + +For `return a + b;`, a backend may: + +1. Load `a` and `b` from argument locations or registers. +2. 
Emit an add instruction. +3. Place the result in the return register dictated by the calling convention. +4. Emit function epilogue code and `ret`. + +The exact instructions differ by architecture and ABI. + +## 12. Linking and Loading + +Compilation alone usually does not produce a complete runnable program. + +The compiler often produces object files. Those object files contain: + +- Machine code for the compiled translation unit +- Unresolved references to symbols defined elsewhere +- Symbol tables +- Relocation information +- Debug metadata + +### Linking + +The linker combines object files and libraries into a final executable or shared library. + +It performs tasks such as: + +- Resolving symbol references +- Laying out code and data sections +- Applying relocations +- Merging in startup code and runtime support + +Typical interview examples: + +- Compilation succeeds, but linking fails with an undefined reference +- Two object files define the same global symbol and the linker reports a duplicate symbol error + +### Loading + +The loader, usually with help from the operating system, maps the executable into memory, sets up the process address space, loads shared libraries, resolves dynamic symbols, and transfers control to the entry point. + +On real systems, startup also includes runtime initialization such as: + +- Setting up the heap +- Initializing thread-local storage +- Running static initializers +- Preparing language runtime state + +```mermaid +flowchart LR + A[Object Files] --> B[Linker] + C[Libraries] --> B + B --> D[Executable or Shared Library] + D --> E[OS Loader] + E --> F[Process Memory Image] + F --> G[Program Entry Point] +``` + +## 13. Static vs Dynamic Linking + +### Static linking + +With static linking, library code is copied into the final binary at link time. 
+ +Benefits: + +- Easier deployment because dependencies are bundled +- Often fewer runtime dependency issues +- Common in Go builds and some systems programming deployments + +Tradeoffs: + +- Larger binaries +- Library bug fixes usually require rebuilding and redeploying the binary +- Multiple processes may each carry their own copy of the code + +### Dynamic linking + +With dynamic linking, the program depends on shared libraries that are loaded at runtime. + +Benefits: + +- Smaller binaries +- Shared code can be updated independently +- Common system libraries can be shared across many processes + +Tradeoffs: + +- Dependency versioning issues +- Runtime failures if the expected shared library is missing or incompatible +- More complicated deployment and startup behavior + +### Interview framing + +If asked when you would prefer static or dynamic linking, the right answer is contextual: + +- Prefer static linking when deployment simplicity matters and binary size is acceptable. +- Prefer dynamic linking when shared updates, ecosystem integration, or reduced duplication matter more. + +## 14. Bytecode and Virtual Machines + +Bytecode is a lower-level representation than source code, but usually more portable than machine code. A virtual machine executes bytecode rather than directly running source text. + +### Why bytecode exists + +- Portability across platforms +- Faster startup than parsing source every time +- Easier verification and tooling than raw machine code +- A stable target for multiple language implementations or runtime optimizers + +### Stack VM vs register VM + +Many bytecode VMs are stack-based. + +Example model: + +```txt +LOAD a +LOAD b +ADD +STORE x +``` + +The VM implicitly uses the operand stack. + +Other VMs use register-like instructions, which can reduce dispatch overhead but may be more complex to generate. 
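The stack-based model above fits in a few lines. This is a minimal sketch that supports only the three opcodes from the example; the `run` function and the dictionary used as variable storage are illustrative choices, not how any particular VM is built.

```python
def run(program, variables):
    """Execute a list of textual stack-VM instructions against a variable store."""
    stack = []
    for instruction in program:
        opcode, *operands = instruction.split()
        if opcode == "LOAD":
            stack.append(variables[operands[0]])  # push a variable's value
        elif opcode == "ADD":
            right = stack.pop()  # operands come from the stack, not the instruction
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "STORE":
            variables[operands[0]] = stack.pop()  # pop the result into a variable
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return variables


env = run(["LOAD a", "LOAD b", "ADD", "STORE x"], {"a": 2, "b": 3})
print(env["x"])
```

Note that `ADD` names no operands at all. A register-style instruction would instead spell out its sources and destination, trading larger instructions for fewer of them.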
+ +### Real-world examples + +- Java: source compiles to JVM bytecode in `.class` files, executed by the JVM +- Python: CPython compiles source to bytecode in `.pyc` caches and executes it in the CPython VM +- JavaScript: engines such as V8 commonly lower code to internal bytecode before optimization tiers run + +### Verification and safety + +VM-based systems often perform bytecode verification or structural checks before execution. This is part of why Java could promise "write once, run anywhere" more plausibly than shipping raw native binaries. + +## 15. Runtime Systems + +The runtime system is everything the program needs while executing beyond just the generated instructions. + +Depending on the language, the runtime may handle: + +- Memory allocation +- Garbage collection +- Exception handling and stack unwinding +- Thread scheduling +- Reflection and metadata lookup +- Dynamic dispatch support +- Security checks or sandboxing +- Foreign function interfaces +- Class loading or module loading + +### Key insight + +A compiled language can still need a significant runtime. + +For example: + +- Go compiles to native code but relies on a runtime for goroutine scheduling and garbage collection. +- Java compiles to bytecode but relies heavily on the JVM runtime. +- C++ compiles to native code, but still depends on runtime support for exceptions, RTTI, startup code, and parts of the standard library. + +## 16. Garbage Collection Basics + +Garbage collection, or GC, automatically reclaims memory that is no longer reachable by the program. + +### Core idea + +The runtime starts from roots such as: + +- Stack references +- CPU registers +- Global variables +- Static fields + +It then discovers reachable objects. Objects that are no longer reachable can be reclaimed. + +### Common GC strategies + +#### Mark-and-sweep + +1. Mark reachable objects. +2. Sweep through memory and reclaim unmarked objects. + +Simple mental model, but can introduce pauses. 
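The two phases can be sketched over a toy heap. This is an illustrative model only: `Obj`, `mark`, `sweep`, and `collect` are hypothetical names, the "heap" is a plain Python list, and a real collector works on raw memory rather than Python objects.

```python
class Obj:
    """A toy heap object that can reference other heap objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark everything reachable from the roots."""
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj.marked:
            continue  # already visited, which also keeps cycles from looping forever
        obj.marked = True
        worklist.extend(obj.refs)


def sweep(heap):
    """Keep marked objects, drop the rest, and reset marks for the next cycle."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)  # a -> b is reachable from the root
c.refs.append(d)  # c and d reference each other but nothing reaches them
d.refs.append(c)

heap = collect([a, b, c, d], roots=[a])
survivor_names = [obj.name for obj in heap]
print(survivor_names)
```

Because the collector starts from the roots, the unreachable `c`/`d` cycle is reclaimed even though each object is still referenced by the other. The pause comes from doing the whole traversal while the program waits.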
+ +#### Copying collection + +Move live objects from one space to another and reclaim the old space wholesale. + +This can make allocation fast and reduce fragmentation. + +#### Generational GC + +Based on the observation that most objects die young. + +Young objects are collected frequently. Older surviving objects are promoted and collected less often. + +This is common in production runtimes such as the JVM. + +#### Reference counting + +Track how many references point to each object. + +When the count drops to zero, reclaim the object immediately. + +Benefit: + +- Predictable reclamation in many cases + +Weakness: + +- Cycles require extra handling + +CPython is the classic interview example here: reference counting plus a cyclic garbage collector. + +### Important tradeoffs + +- Throughput versus pause time +- Memory overhead versus collection frequency +- Simplicity versus sophistication +- Predictability versus peak performance + +### Practical examples + +- Java: tracing GC, often generational and highly optimized +- Go: concurrent GC designed to limit pause times +- CPython: reference counting with cycle detection +- C++: typically manual memory management or smart pointers rather than mandatory GC + +## 17. Language-Specific Mental Models + +## C++ + +Typical flow: + +1. Preprocessing expands includes and macros. +2. The compiler parses and type-checks each translation unit. +3. The backend generates machine code into object files. +4. The linker resolves symbols and produces the final executable. + +What to emphasize in interviews: + +- Native ahead-of-time compilation +- Strong optimization opportunities +- Separate compilation and linking are central +- ABI, templates, inline functions, and ODR issues can matter +- Memory management is largely explicit, though abstractions such as RAII and smart pointers help + +Good interview phrasing: + +"C++ is not just compile and run. 
It has a distinct preprocessing, compilation, assembly, and linking pipeline, and many real build issues happen at link time rather than compile time." + +## Java + +Typical flow: + +1. `javac` compiles source to JVM bytecode. +2. The JVM loads classes and verifies bytecode. +3. Code may initially run in an interpreter or baseline tier. +4. Hot methods are JIT-compiled to native code. +5. The runtime manages memory, GC, class loading, and synchronization. + +What to emphasize: + +- Bytecode portability +- Heavyweight runtime with class loading and GC +- Tiered execution with JIT for hot code +- Startup versus warm performance tradeoff + +## Python + +Typical flow in CPython: + +1. Python source is parsed into an AST. +2. The AST is compiled into Python bytecode. +3. The CPython VM executes the bytecode. +4. Memory management uses reference counting plus cycle detection. + +What to emphasize: + +- Python is not "just interpreted" in the simplistic sense +- CPython does have a compilation stage, but not usually to native code +- Runtime type checks and object model overhead are significant +- Different implementations exist, such as PyPy with a JIT + +## JavaScript + +Typical flow in engines like V8: + +1. Source is parsed. +2. The engine produces AST and internal bytecode. +3. Bytecode runs in an interpreter or baseline execution tier. +4. The engine profiles runtime behavior. +5. Hot code is optimized by JIT tiers. +6. If assumptions break, optimized code can be deoptimized. + +What to emphasize: + +- Highly dynamic language semantics make optimization hard +- JITs rely on speculative assumptions about object shapes and types +- Warm-up and deoptimization are important real-world performance concepts + +## Go + +Typical flow: + +1. `go build` compiles source ahead of time into native machine code. +2. The linker produces a native executable, often with static linking in common deployments. +3. 
The Go runtime provides GC, goroutine scheduling, stack growth, and other services. + +What to emphasize: + +- Native compilation with a substantial runtime +- Simple deployment model compared with many dynamic ecosystems +- Concurrency is language-level, but implemented with runtime support +- `go run` still compiles first, then runs the resulting binary + +## 18. One Command, Five Different Execution Stories + +This is a very interview-friendly way to compare languages. + +### `g++ main.cpp && ./a.out` + +- Preprocess source +- Compile to object code +- Link libraries +- Produce native executable +- OS loader maps it into memory and starts execution + +### `java Main` + +- The source was already compiled to bytecode +- JVM loads classes +- Bytecode is verified +- Methods run and hot methods may be JIT-compiled +- GC and runtime services stay active throughout execution + +### `python app.py` + +- Source is parsed +- Bytecode is produced or loaded from cache when valid +- CPython VM executes bytecode +- Runtime type checks and object dispatch happen as execution proceeds + +### `node app.js` + +- V8 parses JavaScript +- Internal bytecode is generated +- Code starts running in execution tiers +- Hot paths may be JIT-optimized +- Runtime assumptions may trigger deoptimization if they become invalid + +### `go run main.go` + +- Go compiles source to a temporary native binary +- The binary is linked +- The binary is executed +- The Go runtime manages scheduling, GC, and other services at runtime + +## 19. Common Interview Questions and Strong Answer Shapes + +### 1. What is the difference between a compiler and an interpreter? + +Strong answer shape: + +"A compiler performs substantial translation before execution, often to native code or bytecode. An interpreter executes a representation of the program directly, often statement by statement or bytecode instruction by bytecode instruction. 
In practice many systems combine both approaches, for example CPython compiles to bytecode and Java uses both bytecode interpretation and JIT compilation." + +### 2. What is JIT and why is it useful? + +Strong answer shape: + +"JIT compilation moves some compilation work to runtime so the system can optimize hot code paths using real execution data such as observed types and branch behavior. That improves peak performance, but it adds warm-up cost and runtime complexity." + +### 3. What is an AST and why do compilers use it? + +Strong answer shape: + +"An AST captures the structural meaning of a program without unnecessary grammar detail. It is easier than raw tokens for semantic analysis, type checking, optimization, and code generation." + +### 4. What does semantic analysis do? + +Strong answer shape: + +"It checks meaning after parsing: name resolution, scope rules, type rules, control-flow constraints, and related consistency checks. A program can be syntactically valid but semantically invalid." + +### 5. Why do we need an intermediate representation? + +Strong answer shape: + +"IR decouples the source language from the target architecture and gives optimizers a uniform representation to transform. It is one of the main reasons modern compilers are modular and retargetable." + +### 6. What is the difference between linking and loading? + +Strong answer shape: + +"Linking resolves symbols and combines object files and libraries into a final binary or shared library. Loading is the operating system and runtime step that maps the program into memory, loads shared libraries, sets up the process, and transfers control to the entry point." + +### 7. Why can a compiled language still need a runtime? + +Strong answer shape: + +"Because generated code still relies on services such as memory management, exception handling, thread scheduling, metadata lookup, or dynamic dispatch. Go and Java are clear examples." + +### 8. Why is Python generally slower than C++? 
+ +Strong answer shape: + +"Python usually executes through a VM with dynamic typing and higher object model overhead, while C++ is compiled ahead of time to optimized native code with far fewer runtime checks. That difference shows up in dispatch cost, memory layout, and optimization opportunities." + +### 9. What is the difference between static and dynamic linking? + +Strong answer shape: + +"Static linking copies library code into the final binary at build time, while dynamic linking resolves shared libraries at runtime. Static linking simplifies deployment, while dynamic linking reduces binary size and allows shared library updates." + +### 10. What happens when the compiler says nothing but the linker fails? + +Strong answer shape: + +"Compilation checked each translation unit in isolation, but the final symbol references could not be resolved across units or libraries. That usually indicates an undefined symbol, duplicate definition, or library ordering or ABI issue." + +## 20. Practical Scenarios Interviewers Like + +### Scenario 1: The program builds, but you get an undefined reference error + +What it tests: + +- Understanding of separate compilation +- Symbol resolution +- Link-time failures versus compile-time failures + +Good explanation: + +"The source compiled fine because each file was valid on its own, but at link time the toolchain could not find a definition for a referenced symbol. I would check whether the function is actually defined, whether the declaration matches the definition, whether the right object files or libraries were linked, and whether name mangling or ABI mismatch is involved." + +### Scenario 2: Java is slow on first requests and faster later + +What it tests: + +- JIT warm-up +- Tiered compilation +- Runtime profiling + +Good explanation: + +"That often reflects JIT behavior. The JVM initially runs code in lower tiers and profiles execution. 
Frequently executed methods are then optimized into machine code, so steady-state throughput improves after warm-up." + +### Scenario 3: Python memory usage keeps growing + +What it tests: + +- GC basics +- Reachability versus leaks +- Runtime behavior + +Good explanation: + +"In a GC or reference-counted language, memory growth does not necessarily mean the collector is broken. It may mean objects remain reachable through caches, global references, cycles, or long-lived containers. I would investigate object lifetimes and retained references rather than assuming raw malloc-style leaks." + +### Scenario 4: JavaScript gets slower after a code change that adds many object shapes + +What it tests: + +- JIT assumptions +- Speculation and deoptimization +- Dynamic language optimization limits + +Good explanation: + +"Many JavaScript engines optimize based on stable object shapes and observed type patterns. If the code change makes shapes more polymorphic, the optimizer may lose its assumptions, trigger deoptimizations, and reduce inline cache effectiveness." + +### Scenario 5: A Go service has noticeable GC pauses + +What it tests: + +- Runtime systems +- Allocation behavior +- GC tradeoffs + +Good explanation: + +"I would look at allocation rate, object lifetimes, and memory pressure. Even with a good concurrent GC, excessive short-lived allocation or large heaps can raise collector work and pause costs." + +## 21. How to Talk About This Topic in Interviews + +When you answer, try to move through three levels: + +1. State the concept precisely. +2. Explain where it fits in the end-to-end execution pipeline. +3. Ground it in one real language implementation. + +For example: + +"An AST is the structural representation the parser produces after tokenization. The compiler then uses it for semantic analysis and later lowering into IR. In CPython, the source is parsed to an AST before compilation to bytecode." 
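The CPython claim in that example answer can be checked with the standard library. The following is a minimal sketch, assuming a made-up one-line program and a hypothetical helper named `run_pipeline`; it only illustrates the source to AST to bytecode to execution order and is not a description of CPython internals.

```python
import ast
import dis


def run_pipeline(source, variables):
    """Parse, compile, and execute `source`, returning each intermediate result.

    `run_pipeline` is an illustrative helper, not part of any library.
    """
    # Stage 1: the parser turns source text into an AST.
    tree = ast.parse(source, filename="<example>", mode="exec")

    # Stage 2: the compiler lowers the AST into a code object holding bytecode.
    code = compile(tree, filename="<example>", mode="exec")

    # Stage 3: the interpreter loop executes that bytecode.
    namespace = dict(variables)
    exec(code, namespace)
    return tree, code, namespace


if __name__ == "__main__":
    tree, code, namespace = run_pipeline("total = price * 2 + 1", {"price": 20})
    print(ast.dump(tree.body[0]))  # the assignment node and its nested expression
    dis.dis(code)                  # the bytecode instructions (output varies by Python version)
    print(namespace["total"])
```

Being able to point at these three stages is what grounds the example answer above in a real implementation.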
+ +That style of answer shows both theoretical knowledge and practical intuition. + +## 22. Quick Summary to Remember + +- A compiler translates code ahead of execution, but the output may be native code or bytecode. +- An interpreter executes a program representation directly, often through a VM. +- JIT compilation performs runtime compilation of hot code paths. +- Lexing turns characters into tokens. +- Parsing turns tokens into structured syntax such as an AST. +- Semantic analysis checks meaning, not just syntax. +- Symbol tables track what names refer to. +- IR gives compilers a better form for optimization and code generation. +- Optimization preserves semantics while improving speed, size, memory use, or startup. +- Code generation maps IR onto instructions, registers, calling conventions, and stack layout. +- Linking resolves symbols and produces final binaries. +- Loading maps the program into memory and starts execution. +- Runtime systems provide services such as GC, class loading, scheduling, and exception handling. +- Bytecode and VMs trade some raw speed for portability and runtime flexibility. +- The most accurate interview answers are language-specific and pipeline-aware. + +## 23. Final Mental Model + +If you remember only one sentence, remember this: + +Source code becomes executable behavior through a pipeline of analysis, transformation, packaging, loading, and runtime support, and different languages move work between those stages in different ways. + +That single idea ties together compilers, interpreters, JITs, bytecode, VMs, linking, loading, runtimes, and garbage collection. diff --git a/networking.md b/networking.md new file mode 100644 index 0000000..3410395 --- /dev/null +++ b/networking.md @@ -0,0 +1,829 @@ +# Networking Interview Guide For Software Engineers + +This guide is meant for interview preparation, not just memorization. 
The goal is to help you explain how traffic moves through systems, why protocols were designed the way they were, and what tradeoffs show up in production. + +## How To Study Networking For Interviews + +Treat networking as three linked layers of understanding: + +1. Protocol model: what each layer is responsible for. +2. Request lifecycle: what happens from client to server and back. +3. Production systems: what breaks, what gets cached, what gets retried, and what gets secured. + +If you can answer all three for a topic, you usually understand it well enough for interviews. + +## Big Picture: What Happens When You Open A URL? + +When a browser requests `https://api.example.com/users`, a lot happens under the hood: + +1. The browser parses the URL and extracts the scheme, hostname, path, and port. +2. DNS resolves `api.example.com` to an IP address. +3. The client decides where to send the packet, usually to a default gateway if the server is outside the local subnet. +4. A TCP connection is established to the server, often on port `443`. +5. A TLS handshake happens to create an encrypted session. +6. The browser sends an HTTP request over that secure connection. +7. Load balancers, proxies, firewalls, and application servers may inspect or forward the request. +8. The server sends back an HTTP response. +9. The browser may cache content, reuse the connection, or make additional requests. + +```mermaid +flowchart LR + A[Browser or Client] --> B[DNS Resolver] + B --> C[IP Address Returned] + C --> D[TCP Handshake] + D --> E[TLS Handshake] + E --> F[Load Balancer or Reverse Proxy] + F --> G[Application Service] + G --> H[Database or Cache] + H --> G + G --> F + F --> E + E --> A +``` + +This single flow touches most of the topics interviewers care about. + +## OSI 7-Layer Model + +The OSI model is a conceptual framework. Real networks do not literally say "now we are in layer 5," but the model is useful because it separates responsibilities. 
+ +| Layer | Name | What It Does | Real Examples | +| --- | --- | --- | --- | +| 7 | Application | User-facing network services | HTTP, DNS, SMTP | +| 6 | Presentation | Data representation, encoding, encryption | TLS, JSON, JPEG | +| 5 | Session | Session establishment and management | RPC sessions, NetBIOS | +| 4 | Transport | End-to-end delivery between processes | TCP, UDP | +| 3 | Network | Routing between networks | IP, ICMP | +| 2 | Data Link | Delivery within the local network | Ethernet, MAC, ARP | +| 1 | Physical | Actual electrical, radio, or optical transmission | Cables, Wi-Fi radio | + +### How To Explain Each Layer In Practice + +#### Layer 7: Application + +This is where application protocols live. If someone says "HTTP request," they are talking about an application-layer protocol. DNS also lives here because it provides a service used by applications. + +Interview angle: if an API returns `404`, `500`, or malformed JSON, that is usually an application-layer issue, not a routing issue. + +#### Layer 6: Presentation + +This layer is about representation. Encryption, compression, and serialization belong here conceptually. In the real world, TLS is often described here because it transforms readable data into encrypted bytes. + +Interview angle: when asked why HTTPS exists, explain that application data is wrapped in TLS before it is sent on the transport connection. + +#### Layer 5: Session + +The session layer is less visible in day-to-day backend work, but the idea still matters. It manages ongoing communication state: establishing sessions, keeping them alive, resuming them, and closing them. + +In modern systems, session responsibilities are often spread across libraries and protocols rather than called out separately. + +#### Layer 4: Transport + +This is where TCP and UDP live. The transport layer gives process-to-process communication, meaning it is not just machine A to machine B, but application process on port X to process on port Y. 
+ +Interview angle: if packets arrive but the application still times out, transport details like retransmissions, congestion, socket backlog, or connection exhaustion might matter. + +#### Layer 3: Network + +This layer is responsible for logical addressing and routing. IP addresses live here. Routers examine layer 3 information to move packets across networks. + +Interview angle: if two services are in different networks or subnets, layer 3 determines how traffic gets routed between them. + +#### Layer 2: Data Link + +This is communication on the local link. Devices here care about MAC addresses. Switches mainly operate here. ARP maps IP addresses to MAC addresses on IPv4 local networks. + +Interview angle: when traffic stays inside a LAN, layer 2 behavior matters. When traffic leaves the LAN, a router is usually involved. + +#### Layer 1: Physical + +The physical layer is the raw transmission medium. It carries bits as electrical signals, light, or radio waves. + +Interview angle: higher-level developers do not work here much, but it explains why distance, cable quality, radio interference, and bandwidth constraints are real. + +### Encapsulation + +As data moves down the stack, each layer adds its own header. + +```mermaid +flowchart TD + A[Application Data] --> B[Transport Segment: TCP or UDP Header + Data] + B --> C[Network Packet: IP Header + Segment] + C --> D[Data Link Frame: MAC Header + Packet] + D --> E[Bits On Wire or Air] +``` + +Interview answer worth memorizing: each lower layer treats the entire payload from the layer above as data. + +## TCP/IP Model + +The TCP/IP model is more practical and maps better to the Internet. 
+ +| TCP/IP Layer | Rough OSI Mapping | Purpose | +| --- | --- | --- | +| Application | OSI 5, 6, 7 | User-visible protocols and data formats | +| Transport | OSI 4 | End-to-end process communication | +| Internet | OSI 3 | Routing with IP | +| Link or Network Access | OSI 1, 2 | Local network delivery | + +```mermaid +flowchart LR + A[OSI 7 Application] --> T1[TCP/IP Application] + B[OSI 6 Presentation] --> T1 + C[OSI 5 Session] --> T1 + D[OSI 4 Transport] --> T2[TCP/IP Transport] + E[OSI 3 Network] --> T3[TCP/IP Internet] + F[OSI 2 Data Link] --> T4[TCP/IP Link] + G[OSI 1 Physical] --> T4 +``` + +### Why Interviewers Ask About Both Models + +They want to know whether you can: + +1. Use OSI for reasoning and categorization. +2. Use TCP/IP for real Internet systems. +3. Explain where a specific technology belongs. + +If asked which model is more "real," say TCP/IP. If asked which model is more useful for teaching, say OSI is often better. + +## TCP Vs UDP + +This is one of the most common interview topics because it is about tradeoffs. + +| Property | TCP | UDP | +| --- | --- | --- | +| Connection model | Connection-oriented | Connectionless | +| Reliability | Reliable delivery with retransmission | No built-in reliability | +| Ordering | Preserves order | No guaranteed order | +| Flow control | Yes | No | +| Congestion control | Yes | No built-in congestion control | +| Overhead | Higher | Lower | +| Typical use cases | Web, APIs, databases, SSH | DNS, streaming, gaming, VoIP | + +### TCP + +TCP provides a reliable byte stream. That means the application reads a stream of bytes, not individual packets. TCP handles: + +1. Three-way handshake for connection setup. +2. Sequence numbers for ordering. +3. Acknowledgments for successful receipt. +4. Retransmission of lost segments. +5. Flow control so the receiver is not overwhelmed. +6. Congestion control so the network is not flooded. 
+ +Three-way handshake: + +```mermaid +sequenceDiagram + participant C as Client + participant S as Server + C->>S: SYN + S->>C: SYN-ACK + C->>S: ACK +``` + +Key point: TCP is reliable, but not magical. It can still time out, stall, or back off under congestion. + +### UDP + +UDP is a lightweight datagram protocol. It sends discrete messages with minimal built-in guarantees. That makes it faster in some cases, but the application must tolerate loss, duplication, or reordering. + +Real-world point: many modern protocols use UDP because they want low latency and implement reliability selectively at the application layer. QUIC is the best modern example. + +### Interview-Ready Tradeoff + +If asked "Why not always use UDP because it is faster?" the right answer is that many applications need reliable ordered delivery, and reimplementing all of TCP behavior in every application would be complex and error-prone. + +If asked "Why not always use TCP?" the answer is that strict ordering and retransmission can increase latency, which is a bad fit for real-time media or interactive gaming. + +## Ports, Sockets, And Connections + +### Ports + +A port identifies a process or service on a host. The IP address identifies the machine, and the port identifies which application on that machine should receive the traffic. + +Common well-known ports: + +| Port | Protocol / Service | +| --- | --- | +| 20/21 | FTP | +| 22 | SSH | +| 25 | SMTP | +| 53 | DNS | +| 80 | HTTP | +| 123 | NTP | +| 143 | IMAP | +| 443 | HTTPS | +| 3306 | MySQL | +| 5432 | PostgreSQL | + +### Sockets + +A socket is an endpoint for communication. In practice, people often use the term to mean the OS abstraction used by programs to send and receive network data. + +For a TCP connection, it is often identified by a 4-tuple: + +1. Source IP +2. Source port +3. Destination IP +4. Destination port + +That is how a server can support many simultaneous client connections on the same listening port. 
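The 4-tuple idea can be observed on a single machine. The sketch below is an illustration under stated assumptions: it uses the loopback address, lets the OS pick a free port, and the function name `demo_four_tuples` is made up for this example. Every accepted connection shares the same destination port, yet the tuples differ because each client gets its own source port.

```python
import socket


def demo_four_tuples(num_clients=2):
    """Accept several loopback clients on one listening port.

    Returns the listening port and one (src_ip, src_port, dst_ip, dst_port)
    tuple per accepted connection, seen from the server side.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(num_clients)
    server_addr = listener.getsockname()

    clients, accepted, tuples = [], [], []
    try:
        for _ in range(num_clients):
            # Each client connects to the same server IP and port.
            clients.append(socket.create_connection(server_addr))
            # accept() returns a new socket dedicated to this one connection.
            conn, peer = listener.accept()
            accepted.append(conn)
            local = conn.getsockname()
            tuples.append((peer[0], peer[1], local[0], local[1]))
    finally:
        for sock in clients + accepted + [listener]:
            sock.close()
    return server_addr[1], tuples


if __name__ == "__main__":
    port, tuples = demo_four_tuples(3)
    print("listening port:", port)
    for four_tuple in tuples:
        print(four_tuple)
```

The listening socket itself never carries application data here; it only hands out per-connection sockets, which is why one port can serve many clients.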
+ +### Connections + +TCP has connections. UDP does not require a connected session in the same sense, even though programming APIs sometimes let you call `connect()` on a UDP socket for convenience. + +Interview angle: if a server is "listening on port 443," that means it has a listening socket waiting for new connections. After accept, each client gets a separate established connection. + +## IP Addressing + +IP addresses identify network interfaces in an IP network. They are logical addresses, not hardware addresses. + +### IPv4 + +IPv4 uses 32-bit addresses, usually written as four decimal octets, such as `192.168.1.10`. + +Important private IPv4 ranges: + +| Range | Purpose | +| --- | --- | +| 10.0.0.0/8 | Private network | +| 172.16.0.0/12 | Private network | +| 192.168.0.0/16 | Private network | +| 127.0.0.0/8 | Loopback | + +Why private IPs matter: devices with private IPs are not directly routable on the public Internet. They usually reach the Internet through NAT. + +### IPv6 + +IPv6 uses 128-bit addresses, written in hexadecimal, such as `2001:0db8:85a3:0000:0000:8a2e:0370:7334`. + +Benefits of IPv6: + +1. Vast address space. +2. Reduced need for NAT. +3. Cleaner support for auto-configuration and modern routing. +4. Better long-term scalability. + +Compressed IPv6 notation removes leading zeros and can shorten one run of zeros with `::`. + +Example: + +`2001:0db8:0000:0000:0000:ff00:0042:8329` + +can become: + +`2001:db8::ff00:42:8329` + +### Subnetting Basics + +Subnetting divides a network into smaller networks. CIDR notation tells you how many bits belong to the network prefix. + +Examples: + +1. `/24` means 24 bits are network bits, leaving 8 bits for hosts. +2. `192.168.1.0/24` covers `192.168.1.0` through `192.168.1.255`. +3. A `/24` usually gives 256 total addresses, though usable host counts depend on context. 
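These CIDR examples can be verified with Python's standard `ipaddress` module. This is a small sketch rather than a subnetting tool; the helper name `describe` and the second test address are chosen only for illustration.

```python
import ipaddress


def describe(cidr):
    """Summarize a network given in CIDR notation."""
    net = ipaddress.ip_network(cidr)
    return {
        "prefix_bits": net.prefixlen,
        "host_bits": net.max_prefixlen - net.prefixlen,
        "total_addresses": net.num_addresses,
        "first": str(net[0]),
        "last": str(net[-1]),
        # hosts() follows the traditional convention of excluding the
        # network and broadcast addresses for prefixes shorter than /31.
        "usable_hosts": sum(1 for _ in net.hosts()),
    }


if __name__ == "__main__":
    net = ipaddress.ip_network("192.168.1.0/24")
    print(describe("192.168.1.0/24"))

    # Membership test: is an address inside the subnet?
    print(ipaddress.ip_address("192.168.1.10") in net)
    print(ipaddress.ip_address("192.168.2.10") in net)

    # Splitting a /24 into two /25 networks.
    print([str(s) for s in net.subnets(prefixlen_diff=1)])
```

The gap between 256 total addresses and 254 usable hosts is one concrete case of "usable host counts depend on context": cloud providers, point-to-point links, and `/31` or `/32` routes follow different conventions.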
+ +Quick mental model: + +| CIDR | Total Addresses | Common Use | +| --- | --- | --- | +| /24 | 256 | Small subnet | +| /25 | 128 | Split a /24 into two | +| /26 | 64 | Smaller service segment | +| /32 | 1 | Single host route | + +Real-world interview point: subnetting is about grouping addresses for routing and isolation, not just address counting. + +### Default Gateway + +If a host wants to reach an IP outside its local subnet, it sends traffic to the default gateway, which is usually a router. + +This is a common practical question: "How does a machine know whether to send directly or to the router?" It compares the destination IP against its subnet mask. + +## DNS + +DNS maps human-friendly names to machine-usable records. + +Common record types: + +| Record | Purpose | +| --- | --- | +| A | Hostname to IPv4 | +| AAAA | Hostname to IPv6 | +| CNAME | Alias to another name | +| MX | Mail server | +| TXT | Arbitrary text, verification, SPF, DKIM | +| NS | Authoritative name server | + +### DNS Resolution Flow + +```mermaid +sequenceDiagram + participant U as User Device + participant R as Recursive Resolver + participant Root as Root DNS + participant TLD as TLD Server + participant Auth as Authoritative DNS + U->>R: Query api.example.com + R->>Root: Where is .com? + Root->>R: Ask .com TLD + R->>TLD: Where is example.com? + TLD->>R: Ask authoritative server + R->>Auth: Where is api.example.com? + Auth->>R: A or AAAA record + R->>U: Cached answer returned +``` + +### Important Interview Points About DNS + +1. DNS is hierarchical and distributed. +2. Resolvers cache answers for performance. +3. DNS is not just one server; it is a chain of delegated authority. +4. DNS can fail due to bad records, TTL delays, resolver issues, or propagation lag. + +### TTL + +TTL, or time to live, tells resolvers how long they may cache a record. A low TTL makes changes propagate faster but increases DNS lookup load. A high TTL reduces lookup traffic but slows rollout changes. 
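The TTL tradeoff is easier to reason about with a toy cache. The sketch below is not how any real resolver is implemented; `TtlCache`, its methods, and the addresses (taken from the documentation-only range `203.0.113.0/24`) are assumptions made for illustration. The clock is injectable so expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Toy name-to-address cache that honors a per-record TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry timestamp)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller must look the name up again.
            del self._entries[name]
            return None
        return address


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example is deterministic
    cache = TtlCache(clock=lambda: now[0])
    cache.put("api.example.com", "203.0.113.10", ttl_seconds=300)

    now[0] = 299.0
    print(cache.get("api.example.com"))  # still served from cache

    now[0] = 300.0
    print(cache.get("api.example.com"))  # expired, a fresh lookup is needed
```

Until the TTL runs out, the cache keeps answering with the old address even if the authoritative record has already changed, which is the window during which users can still reach a retired server.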
+ +### Practical Scenario + +If a service moves to a new IP and some users still hit the old server, DNS caching is one of the first suspects. + +## HTTP Vs HTTPS + +HTTP is an application protocol for request-response communication. HTTPS is HTTP carried over TLS, which provides encryption, integrity, and server authentication. + +| Feature | HTTP | HTTPS | +| --- | --- | --- | +| Encryption | No | Yes, via TLS | +| Integrity | Weak, data can be modified in transit | Protected by TLS | +| Authentication | No built-in server identity | Certificates authenticate server identity | +| Default Port | 80 | 443 | + +### Why HTTPS Matters + +Without HTTPS, anyone on the path can potentially read or modify traffic. That includes credentials, tokens, cookies, and application data. + +With HTTPS: + +1. The client verifies the server certificate. +2. A secure session key is negotiated. +3. Data is encrypted in transit. +4. Tampering becomes detectable. + +### Important Clarification + +HTTPS does not hide everything. Observers may still see: + +1. Destination IP address. +2. The fact that a TLS connection exists. +3. Timing and traffic volume. +4. In some cases, hostname metadata depending on protocol features and environment. + +### HTTP Methods + +Common methods and their intent: + +| Method | Typical Meaning | +| --- | --- | +| GET | Read resource | +| POST | Create or trigger processing | +| PUT | Replace resource | +| PATCH | Partially update resource | +| DELETE | Remove resource | + +### Status Codes Worth Knowing + +| Range | Meaning | +| --- | --- | +| 1xx | Informational | +| 2xx | Success | +| 3xx | Redirection | +| 4xx | Client error | +| 5xx | Server error | + +Commonly discussed codes: `200`, `201`, `204`, `301`, `302`, `304`, `400`, `401`, `403`, `404`, `409`, `429`, `500`, `502`, `503`, `504`. + +### Keep-Alive And Connection Reuse + +Opening a new TCP and TLS session for every request is expensive. Modern clients reuse connections when possible. 
This reduces handshake overhead and latency. + +### HTTP/1.1, HTTP/2, And HTTP/3 + +You do not always need deep protocol internals for interviews, but these are worth knowing: + +1. HTTP/1.1 commonly uses persistent connections but has more request multiplexing limitations. +2. HTTP/2 multiplexes streams over one TCP connection, reducing some head-of-line issues at the HTTP layer. +3. HTTP/3 runs over QUIC on UDP, improving connection setup and transport behavior. + +## SSL/TLS Basics + +Strictly speaking, SSL is old and obsolete; modern systems use TLS. In interviews, people still say "SSL certificate" informally, but the actual protocol is TLS. + +### What TLS Provides + +1. Confidentiality: traffic is encrypted. +2. Integrity: tampering is detectable. +3. Authentication: the client can verify the server's identity. + +### Simplified TLS Handshake + +```mermaid +sequenceDiagram + participant C as Client + participant S as Server + C->>S: ClientHello + S->>C: ServerHello + Certificate + C->>S: Key Exchange Material + Note over C,S: Shared session keys derived + C->>S: Encrypted HTTP request + S->>C: Encrypted HTTP response +``` + +### Certificates + +A certificate binds a public key to a domain identity. Certificate authorities, or CAs, sign certificates so clients can trust them. + +If a certificate is expired, signed by an untrusted CA, or does not match the hostname, clients will reject or warn about the connection. + +### Interview Pitfall + +TLS secures data in transit, not data at rest. Encrypting database traffic with TLS does not automatically encrypt the database files on disk. + +## REST APIs + +REST is an architectural style, not a wire protocol. It usually rides on HTTP, but the important idea is resource-oriented design and stateless client-server interaction. + +### Core REST Ideas + +1. Resources are identified by URLs. +2. Standard HTTP methods express intent. +3. 
Requests should be stateless, meaning the server should not rely on hidden per-client session state for each call. +4. Representations like JSON are sent over HTTP. +5. Caching can improve scalability. + +### Good REST Thinking + +Prefer resource-based endpoints like: + +`GET /users/123` + +instead of action-heavy endpoints like: + +`POST /getUserById` + +That said, real APIs often mix purity with practicality. + +### Idempotency + +This topic comes up often. + +An operation is idempotent if doing it multiple times has the same effect as doing it once. + +1. `GET` should be idempotent. +2. `PUT` is usually designed to be idempotent. +3. `DELETE` is usually considered idempotent if repeated deletes do not further change state. +4. `POST` is not usually idempotent. + +Why it matters: retries are common in distributed systems. Idempotent operations are safer to retry. + +### Authentication And Authorization In APIs + +REST itself does not define auth. Real systems commonly use: + +1. Bearer tokens +2. API keys +3. Session cookies +4. OAuth flows + +Know the difference: + +1. Authentication answers who you are. +2. Authorization answers what you are allowed to do. + +## Load Balancers, Proxies, And Gateways + +These terms are related but not interchangeable. + +### Load Balancer + +A load balancer distributes traffic across multiple backend servers. + +Goals: + +1. Higher availability. +2. Better scalability. +3. Health-based routing. +4. Sometimes TLS termination. + +Two common types: + +| Type | Operates Mainly At | Example Decisions | +| --- | --- | --- | +| Layer 4 LB | Transport | Route by IP and port | +| Layer 7 LB | Application | Route by host, path, headers | + +Practical example: a layer 7 load balancer can send `/images/*` to one service and `/api/*` to another. + +### Forward Proxy + +A forward proxy sits between client and Internet. Clients intentionally send requests to it. + +Uses: + +1. Outbound filtering. +2. Privacy or identity masking. +3. 
Shared caching. +4. Corporate network control. + +### Reverse Proxy + +A reverse proxy sits in front of servers. Clients think they are talking to the server directly. + +Uses: + +1. TLS termination. +2. Load balancing. +3. Caching. +4. Compression. +5. Authentication integration. +6. Hiding internal services. + +Nginx and Envoy are common examples. + +### Gateway + +A gateway is a boundary device or service that connects different networks or different protocol domains. + +Two meanings show up often in interviews: + +1. Network gateway: the next hop that lets a host reach destinations outside its local subnet. This is what "default gateway" usually means. +2. Application gateway: a service that sits at the edge of an application boundary and manages incoming traffic, translation, policy, or routing. + +The common idea is controlled passage between one domain and another. + +### API Gateway + +An API gateway is a specialized reverse proxy for APIs. It can do routing, auth, rate limiting, request transformation, observability, and aggregation. + +Interview distinction: not every reverse proxy is an API gateway, but most API gateways behave like advanced reverse proxies. + +```mermaid +flowchart LR + A[Clients] --> B[Load Balancer] + B --> C[Reverse Proxy or API Gateway] + C --> D[Service A] + C --> E[Service B] + C --> F[Service C] +``` + +## Firewalls And Security Basics + +A firewall filters traffic according to rules. + +### Types Of Firewalls + +1. Packet-filtering or stateless firewalls inspect packets individually. +2. Stateful firewalls track connection state and make smarter decisions. +3. Web application firewalls, or WAFs, inspect HTTP traffic for application-layer attacks. + +### Common Security Concepts + +1. Least privilege: allow only what is needed. +2. Network segmentation: isolate systems by trust boundary. +3. Ingress rules: who can reach a service. +4. Egress rules: where a service can send traffic. +5. 
Rate limiting: reduce abuse and protect capacity. +6. DDoS mitigation: absorb or filter massive traffic floods. + +### Examples Of Real Risks + +1. Exposing a database port to the public Internet. +2. Allowing unrestricted outbound traffic from sensitive workloads. +3. Trusting internal traffic too much. +4. Forgetting to rotate certificates or credentials. + +## CDN, Caching, And Latency + +### CDN + +A content delivery network caches content at edge locations closer to users. + +Benefits: + +1. Lower latency. +2. Reduced origin load. +3. Better global performance. +4. Improved resilience for static content. + +Typical fit: images, videos, JS bundles, CSS, downloads, and cacheable API responses. + +### Caching + +Caching means storing a response closer to where it will be reused. + +Common cache locations: + +1. Browser cache. +2. CDN edge cache. +3. Reverse proxy cache. +4. Application cache like Redis. +5. Database page cache. + +### Cache Concepts Worth Knowing + +1. Cache hit: request served from cache. +2. Cache miss: request must go to origin. +3. TTL: how long data stays valid. +4. Eviction: removal policy such as LRU. +5. Staleness: cached data may be out of date. +6. Invalidation: explicitly removing outdated cache entries. + +### HTTP Caching Headers + +Useful headers to know: + +1. `Cache-Control` +2. `ETag` +3. `Last-Modified` +4. `Expires` + +Interview point: cache invalidation is hard because you are trading correctness, freshness, and performance. + +### Latency Breakdown + +Latency is not one thing. It is the sum of multiple delays: + +1. DNS lookup time. +2. TCP handshake time. +3. TLS handshake time. +4. Network propagation time. +5. Queuing delay in devices. +6. Server processing time. +7. Response transfer time. 
+ +```mermaid +flowchart LR + A[User Click] --> B[DNS] + B --> C[TCP] + C --> D[TLS] + D --> E[Request Transit] + E --> F[Server Processing] + F --> G[Response Transit] + G --> H[Rendering or Client Work] +``` + +### Throughput Vs Latency + +Throughput is how much data or how many requests you can process over time. Latency is how long a single request takes. You can improve one without improving the other. + +Example: batching work may improve throughput but worsen latency for individual requests. + +## Practical Real-World Scenarios + +### Scenario 1: A Website Feels Slow For Users Far From The Server + +Possible causes: + +1. High round-trip time because of geography. +2. Too many sequential requests. +3. No CDN for static assets. +4. Repeated TCP and TLS handshakes due to poor connection reuse. +5. Cache misses at multiple layers. + +Good interview answer: discuss CDN, caching, compression, connection reuse, reducing chattiness, and serving content from closer regions. + +### Scenario 2: Service Works By IP But Not By Domain Name + +Likely DNS issue. + +Check: + +1. DNS record correctness. +2. TTL and propagation delay. +3. Resolver cache. +4. Split-horizon DNS or internal DNS differences. +5. Wrong CNAME chain or stale record. + +### Scenario 3: HTTPS Certificate Errors After Deployment + +Possible causes: + +1. Expired certificate. +2. Hostname mismatch. +3. Missing intermediate certificate. +4. Clock skew on client or server. +5. TLS terminated at the wrong layer with the wrong cert. + +### Scenario 4: Requests Time Out Intermittently + +Possible causes: + +1. Packet loss and TCP retransmissions. +2. Load balancer health check problems. +3. Connection pool exhaustion. +4. Slow downstream dependencies. +5. Firewall or security group rules dropping traffic. +6. DNS flapping between healthy and unhealthy endpoints. + +### Scenario 5: API Returns `502 Bad Gateway` + +This usually means a gateway or proxy could not get a valid response from the upstream server. 
+ +Good debugging path: + +1. Check reverse proxy or load balancer logs. +2. Check upstream health. +3. Check timeouts and connection resets. +4. Check TLS settings between proxy and upstream. + +## Common Interview Questions And Strong Answer Directions + +### What Is The Difference Between OSI And TCP/IP? + +Strong answer: OSI is a conceptual 7-layer model used for understanding network responsibilities. TCP/IP is the practical model used by the Internet. OSI is better for teaching, TCP/IP is closer to real implementation. + +### Explain TCP Vs UDP + +Strong answer: TCP is connection-oriented and provides reliable ordered delivery with retransmissions and congestion control. UDP is connectionless with lower overhead and no built-in reliability, which makes it useful for latency-sensitive workloads. + +### Why Is HTTPS More Secure Than HTTP? + +Strong answer: HTTPS is HTTP over TLS. TLS encrypts traffic, verifies server identity through certificates, and protects integrity so attackers cannot silently tamper with data in transit. + +### What Happens When You Enter A URL In The Browser? + +Strong answer should mention DNS lookup, routing, TCP handshake, TLS handshake for HTTPS, HTTP request/response, load balancers or proxies, server processing, and caching. + +### What Is DNS And Why Is It Needed? + +Strong answer: DNS is a distributed hierarchical naming system that maps domain names to resource records like IP addresses. It lets humans use names while systems use addresses. + +### What Is A Socket? + +Strong answer: a socket is an OS-level communication endpoint. For TCP connections, each connection can be identified by source IP, source port, destination IP, and destination port. + +### What Is Subnetting? + +Strong answer: subnetting divides an address space into smaller networks using a prefix length. It helps with routing, isolation, and efficient address management. + +### What Is The Difference Between A Load Balancer And A Reverse Proxy? 
+ +Strong answer: a load balancer's primary job is distributing traffic across backends. A reverse proxy sits in front of servers and can also do caching, TLS termination, routing, or auth. In practice, one component can do both. + +### Why Use A CDN? + +Strong answer: CDNs reduce latency by serving cacheable content from locations closer to users, lower origin load, and improve scalability for global traffic. + +### What Causes High Latency? + +Strong answer: latency can come from DNS, handshakes, network distance, queuing, packet loss, retransmissions, server processing, and payload size. The right answer is usually a breakdown, not a single cause. + +## High-Value Interview Nuances + +These points often separate a solid answer from a shallow one: + +1. TCP is a byte stream, not a message protocol. +2. HTTPS secures transport in transit, not application correctness. +3. DNS is cached heavily, so configuration changes are not always immediate. +4. Load balancers can work at different layers and may also terminate TLS. +5. A `401` and a `403` are not the same: unauthenticated versus authenticated but forbidden. +6. `502`, `503`, and `504` point to different upstream failure modes. +7. NAT helps conserve IPv4 addresses but changes how addressing works between internal and external networks. +8. Reliability, ordering, and latency are often tradeoffs, not free features. + +## Quick Revision Sheet + +If you need a short final-pass checklist before an interview, make sure you can explain: + +1. OSI layers and what problems each layer solves. +2. TCP/IP model and how it maps to OSI. +3. TCP handshake, reliability, and congestion basics. +4. UDP tradeoffs and good use cases. +5. DNS resolution and common records. +6. IPv4, IPv6, CIDR, and subnet basics. +7. Difference between HTTP and HTTPS. +8. TLS purpose, certificates, and handshake basics. +9. REST principles and idempotency. +10. Ports, sockets, and connection reuse. +11. 
Load balancer, forward proxy, reverse proxy, and gateway differences. +12. Firewall basics, segmentation, and least privilege. +13. CDN and caching fundamentals. +14. How to reason about latency end to end. + +## Final Interview Advice + +For networking interviews, the strongest answers are usually not the most academic ones. They are the answers that connect protocol theory to production behavior. + +A good pattern is: + +1. Define the concept clearly. +2. Explain why it exists. +3. Describe the tradeoff. +4. Give a real example of where it matters. + +If you can do that consistently, you will sound like an engineer who has worked with real systems rather than someone reciting flashcards. diff --git a/os/concurrency.md b/os/concurrency.md new file mode 100644 index 0000000..27c478a --- /dev/null +++ b/os/concurrency.md @@ -0,0 +1,1125 @@ +# Concurrency and Synchronization for Interviews + +This guide is meant for software engineers who already write production code and want a deeper, interview-ready mental model of concurrency. The goal is not just to memorize definitions, but to understand what problems concurrency introduces, what tools operating systems and runtimes provide, and how these ideas show up in backend systems and multithreaded applications. + +--- + +## 1. Why Concurrency Matters + +Modern software rarely runs as a single straight-line sequence of instructions. Web servers handle many requests at once, databases coordinate many clients, background workers process jobs in parallel, and operating systems multiplex CPU time across many threads and processes. + +Concurrency exists because real systems need at least one of these: + +- Responsiveness: one task should not block all others. +- Throughput: multiple units of work should make progress over the same period. +- Resource utilization: idle CPU time or I/O wait should be reduced. +- Structure: independent activities are easier to model as independent execution flows. + +The cost is coordination. 
As soon as two execution flows can interact with shared state, correctness becomes harder. + +--- + +## 2. Concurrency and Parallelism + +### Concurrency + +Concurrency means multiple tasks are in progress during the same overall time interval. They may not literally run at the same instant. A single CPU core can still execute concurrent work by interleaving threads. + +Think of concurrency as dealing with many things at once. + +Examples: + +- A web server handling many client connections with an event loop. +- A program with a UI thread and a background worker thread. +- An OS scheduling several runnable threads on one core. + +### Parallelism + +Parallelism means multiple tasks are executing at the same instant, usually on different CPU cores or different hardware units. + +Think of parallelism as doing many things at once. + +Examples: + +- A matrix multiplication split across 8 CPU cores. +- A thread pool running several CPU-heavy jobs at the same time. +- GPU kernels operating on many data elements simultaneously. + +### Concurrency vs Parallelism + +They are related but not identical. + +- Concurrency is about composition and coordination. +- Parallelism is about simultaneous execution. +- A program can be concurrent without being parallel. +- A parallel program is usually concurrent, because simultaneous work still needs coordination. + +```mermaid +flowchart TD + A[Program has multiple tasks] --> B{One core or many cores?} + B -->|One core| C[Concurrent via interleaving] + B -->|Many cores| D[Potentially parallel] + C --> E[Tasks overlap in time but not at the same instant] + D --> F[Tasks can run at the same instant] + E --> G[Still needs synchronization if state is shared] + F --> G +``` + +### Interview framing + +If asked for the difference, a strong answer is: + +> Concurrency is a design property where multiple tasks make progress during the same time window. Parallelism is an execution property where multiple tasks run simultaneously. 
Concurrency is mainly about coordinating independent work; parallelism is mainly about speeding work up with more hardware. + +--- + +## 3. Synchronization Basics + +Synchronization is the coordination of concurrent execution so that shared data stays correct and required ordering relationships are preserved. + +When engineers say synchronization, they usually mean one or more of these: + +- Mutual exclusion: only one thread enters a critical region at a time. +- Ordering: thread B should not proceed until thread A completes some step. +- Visibility: changes made by one thread should become visible to others at the correct time. +- Coordination: threads should wait, signal, or hand off work safely. + +### The three main correctness concerns + +#### 1. Atomicity + +An operation is atomic if it happens as one indivisible unit from the perspective of other threads. + +`count++` is usually not atomic. It is often: + +1. Read `count` +2. Add 1 +3. Write result back + +If two threads do that concurrently, one update can be lost. + +#### 2. Visibility + +Even if one thread writes a new value, another thread may not see it immediately because of CPU caches, compiler reordering, or runtime memory models. + +#### 3. Ordering + +Operations may be observed in a different order than they appear in code unless synchronization constructs create ordering guarantees. + +This is why synchronization is not only about locking. It is also about memory semantics. + +--- + +## 4. Race Conditions + +A race condition occurs when the correctness of a program depends on the relative timing or interleaving of concurrent operations. + +Not every race is bad, but most interview discussions mean a harmful data race or logic race. + +### Simple example + +Two threads increment the same shared counter initialized to 0. + +Without synchronization: + +1. Thread A reads 0 +2. Thread B reads 0 +3. Thread A writes 1 +4. Thread B writes 1 + +Final value becomes 1 instead of 2. 
+ +```mermaid +sequenceDiagram + participant A as Thread A + participant M as Shared Counter + participant B as Thread B + + A->>M: read 0 + B->>M: read 0 + A->>M: write 1 + B->>M: write 1 +``` + +### Data race vs race condition + +- A data race usually means two threads access the same memory location concurrently, at least one is a write, and there is no proper synchronization. +- A race condition is broader. The bug comes from timing dependence, even if access is not literally the same variable. + +Example of a logic race: + +- Thread A checks that a file exists. +- Thread B deletes it. +- Thread A then tries to open it and fails. + +### How to talk about race conditions in interviews + +Do not stop at "two threads update the same variable." Mention the root issue: + +- non-atomic read-modify-write, +- missing ordering guarantees, +- unsafely shared mutable state, +- check-then-act or read-then-use races, +- stale reads due to visibility issues. + +--- + +## 5. The Critical Section Problem + +A critical section is a part of code that accesses shared data or shared resources and therefore must not be executed by more than one thread at the same time when doing so would break correctness. + +The critical section problem asks: how do we design a protocol so multiple threads can safely share resources? + +### Classic requirements + +An ideal solution satisfies: + +1. Mutual exclusion + At most one thread is in the critical section at a time. +2. Progress + If no thread is in the critical section, the choice of who enters next should not be postponed forever. +3. Bounded waiting + A thread waiting to enter should not be postponed indefinitely. + +### Example + +Updating a shared bank account balance, modifying a shared queue, or writing to a shared log buffer are all critical section scenarios. + +The main interview point: once you identify shared mutable state, ask whether access must be serialized. + +--- + +## 6. 
Mutual Exclusion + +Mutual exclusion means ensuring that only one thread or process at a time can execute a critical section. + +This is the most basic synchronization guarantee. + +Common ways to provide mutual exclusion: + +- Locks +- Mutexes +- Binary semaphores +- Monitors +- Atomic instructions used to build higher-level locks + +Mutual exclusion solves some problems, but not all. If threads also need to wait for particular conditions, you need coordination mechanisms such as semaphores or condition variables. + +--- + +## 7. Locks and Synchronization Mechanisms + +Synchronization mechanisms are tools that control access to shared state or coordinate thread behavior. + +### Major categories + +- Blocking locks: thread sleeps if it cannot proceed. +- Spinning locks: thread keeps retrying until it can proceed. +- Signaling primitives: thread waits for an event or condition. +- Atomic primitives: hardware-supported indivisible operations. + +### Design tradeoff + +Every synchronization tool trades off some combination of: + +- Simplicity +- Latency +- CPU efficiency +- Fairness +- Scalability under contention +- Ease of reasoning + +In interviews, strong answers compare mechanisms instead of treating them as equivalent. + +--- + +## 8. Mutexes + +A mutex is a mutual exclusion lock. Only one thread can hold it at a time. + +Typical behavior: + +1. Thread calls `lock()` +2. If mutex is free, thread acquires it and enters critical section +3. If mutex is already held, thread blocks or waits +4. Thread eventually calls `unlock()` + +### Why mutexes are useful + +- Simple mental model +- Good for protecting shared data structures +- Usually implemented efficiently by the OS and runtime + +### Example + +```text +lock(mutex) +balance = balance - amount +unlock(mutex) +``` + +### Important interview details + +- A mutex is ownership-based: the thread that acquires it is usually the one expected to release it. +- Holding a mutex too long hurts throughput. 
+- Calling blocking I/O while holding a mutex can create latency spikes and deadlock risk. +- Fine-grained locking improves concurrency but increases complexity. +- Coarse-grained locking is simpler but reduces parallelism. + +### Common backend example + +An in-memory cache may protect its hash map with a mutex so concurrent requests do not corrupt internal buckets or linked structures. + +--- + +## 9. Semaphores + +A semaphore is a synchronization primitive represented by a counter plus wait/signal operations. + +Typical operations: + +- `wait()` or `P()`: decrement the counter if possible; otherwise block until it becomes possible. +- `signal()` or `V()`: increment the counter and potentially wake a waiting thread. + +Semaphores are more general than mutexes. + +### Intuition + +- If the count represents available resources, `wait()` consumes one resource. +- `signal()` releases one resource. + +### Example use cases + +- Limit a connection pool to at most 10 concurrent threads. +- Coordinate producer-consumer queues. +- Gate access to a fixed number of identical resources. + +### Interview caution + +Semaphores are powerful but easy to misuse because they do not encode ownership like mutexes do. A thread can signal a semaphore even if it was not the thread that waited on it. + +--- + +## 10. Binary Semaphore vs Counting Semaphore + +### Binary semaphore + +A binary semaphore takes only the values 0 or 1. + +It can be used similarly to a lock: + +- 1 means available +- 0 means unavailable + +But conceptually it is still a semaphore, not a mutex. + +Differences from a mutex: + +- Mutexes generally have ownership semantics. +- Binary semaphores are more about signaling and availability than ownership. + +### Counting semaphore + +A counting semaphore can take values greater than 1. + +It represents multiple identical resources. 
+ +Examples: + +- 20 database connections available +- 8 worker slots available +- 100 requests allowed in a bounded in-flight queue + +### Interview summary + +- Use a mutex when exactly one thread should enter a critical section. +- Use a counting semaphore when you want to control access to a pool of N identical resources. +- Use a binary semaphore when you want simple availability or signaling semantics, not necessarily lock ownership. + +--- + +## 11. Spinlocks + +A spinlock is a lock where a thread repeatedly checks until the lock becomes available instead of sleeping. + +### When spinlocks make sense + +- Lock hold time is expected to be extremely short. +- Sleeping and waking a thread would cost more than spinning. +- Code is running in low-level systems contexts such as kernels. +- The waiting thread cannot safely sleep. + +### When spinlocks are a bad idea + +- The critical section may take noticeable time. +- The holder may be preempted. +- CPU cycles are scarce. +- Contention is high. + +### Tradeoff + +Spinlocks reduce context-switch overhead but waste CPU while waiting. + +### Interview phrasing + +> A spinlock is good when wait time is shorter than sleep/wake overhead. It is bad under long hold times or high contention because it burns CPU doing no useful work. + +--- + +## 12. Monitors + +A monitor is a higher-level synchronization construct that combines: + +- shared data, +- procedures that operate on that data, +- an implicit lock, +- and often condition variables for waiting and signaling. + +Only one thread executes inside the monitor at a time. + +### Why monitors matter + +Monitors package data and synchronization together, which makes reasoning easier than manually scattering locks around code. + +Languages and runtimes often offer monitor-like behavior through: + +- synchronized methods or blocks, +- object monitors, +- condition variables associated with locks. + +### Interview intuition + +A monitor is not just a lock. 
It is a structured model where the shared state and the rules for accessing it live together. + +--- + +## 13. Condition Variables + +A condition variable allows threads to sleep until some condition becomes true. + +This is useful when mutual exclusion alone is not enough. + +Example: + +- A consumer should wait until the queue is non-empty. +- A producer should wait until the buffer is non-full. + +### Key pattern + +Condition variables are used with a lock. + +Typical pattern: + +1. Acquire lock +2. While condition is false, wait on condition variable +3. Waiting atomically releases the lock and sleeps +4. When awakened, re-acquire lock and re-check the condition +5. Proceed when the condition is truly satisfied + +### Why use `while`, not `if` + +Because: + +- spurious wakeups can happen, +- another thread may consume the condition before this thread resumes, +- the condition must be re-verified under the lock. + +### Example pseudocode + +```text +lock(mutex) +while queue.is_empty(): + wait(not_empty, mutex) +item = queue.pop() +unlock(mutex) +``` + +### Interview point + +Condition variables are for waiting on state changes, not for protecting data by themselves. The mutex protects the state; the condition variable coordinates waiting. + +--- + +## 14. Producer-Consumer Problem + +This is a classic synchronization problem. + +- Producers generate data and place it into a buffer. +- Consumers remove data from the buffer and process it. + +### Core challenge + +- Producers must not write into a full buffer. +- Consumers must not read from an empty buffer. +- Shared buffer operations must be synchronized. + +### Common solution + +Use: + +- a mutex to protect buffer access, +- a `not_empty` condition, +- a `not_full` condition. 
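As a runnable counterpart, here is a minimal Python sketch of that solution using `threading.Condition`. The class name `BoundedBuffer` and the helper `run_demo` are illustrative assumptions, not part of any standard API.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Bounded FIFO protected by one mutex and two condition variables."""

    def __init__(self, capacity: int) -> None:
        self._items = deque()
        self._capacity = capacity
        self._mutex = threading.Lock()
        # Both conditions share the same mutex, as in the description above.
        self._not_empty = threading.Condition(self._mutex)
        self._not_full = threading.Condition(self._mutex)

    def put(self, item) -> None:
        with self._mutex:
            while len(self._items) >= self._capacity:  # re-check after every wakeup
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._mutex:
            while not self._items:  # `while`, not `if`, to survive spurious wakeups
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item


def run_demo(n_items: int = 100) -> list:
    """One producer and one consumer moving n_items through a small buffer."""
    buf = BoundedBuffer(capacity=4)
    received = []

    def producer() -> None:
        for i in range(n_items):
            buf.put(i)

    def consumer() -> None:
        for _ in range(n_items):
            received.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

With a single producer and a single consumer, items come out in the order they went in; with several of each, only per-producer order is preserved.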
+ +```mermaid +flowchart LR + P[Producer] -->|enqueue item| B[Bounded Buffer] + B -->|dequeue item| C[Consumer] + P --> N1[Wait if buffer full] + C --> N2[Wait if buffer empty] +``` + +### Pseudocode + +```text +produce(item): + lock(mutex) + while buffer.is_full(): + wait(not_full, mutex) + buffer.push(item) + signal(not_empty) + unlock(mutex) + +consume(): + lock(mutex) + while buffer.is_empty(): + wait(not_empty, mutex) + item = buffer.pop() + signal(not_full) + unlock(mutex) + return item +``` + +### Real-world backend analogy + +- Producers: API servers enqueue tasks. +- Buffer: message queue or in-memory work queue. +- Consumers: worker threads or worker processes. + +Interviewers like this problem because it tests whether you understand both mutual exclusion and condition synchronization. + +--- + +## 15. Readers-Writers Problem + +This problem models a shared resource such as a database record, cache entry, or in-memory configuration. + +- Many readers can safely access the resource at the same time. +- Writers require exclusive access. + +### Goal + +Maximize concurrency for readers without allowing reads to overlap with writes. + +### Variants + +- Reader-preference: readers proceed freely, but writers may starve. +- Writer-preference: writers are favored, but readers may wait longer. +- Fair solution: tries to avoid starvation for either side. + +### Real-world example + +Suppose many threads read configuration data from a shared object, but only occasional admin updates modify it. + +- A plain mutex works but serializes all readers. +- A read-write lock allows concurrent readers and exclusive writers. + +### Interview lesson + +This problem is about balancing throughput and fairness. + +If you mention read-write locks, also mention their downside: + +- More overhead than a normal mutex +- Possible writer starvation depending on policy +- Sometimes not worth it if writes are frequent or critical sections are tiny + +--- + +## 16. 
Dining Philosophers Problem + +This classic problem shows how individually sensible resource acquisition can produce system-wide deadlock. + +Setup: + +- 5 philosophers sit around a table. +- A fork lies between each pair. +- Each philosopher needs both adjacent forks to eat. + +### The problem + +If every philosopher picks up the left fork first, they may all hold one fork and wait forever for the other. + +### Why interviewers ask it + +It tests whether you can reason about: + +- circular wait, +- deadlock, +- resource ordering, +- concurrency design, not just syntax. + +```mermaid +flowchart LR + P1[Philosopher 1] --> F1[Fork 1] + P2[Philosopher 2] --> F2[Fork 2] + P3[Philosopher 3] --> F3[Fork 3] + P4[Philosopher 4] --> F4[Fork 4] + P5[Philosopher 5] --> F5[Fork 5] + F1 --> P2 + F2 --> P3 + F3 --> P4 + F4 --> P5 + F5 --> P1 +``` + +### Common solutions + +- Impose a global ordering on forks and always pick lower-numbered first. +- Allow at most 4 philosophers to try eating at once. +- Use a waiter/arbitrator to control access. +- Make one philosopher pick up forks in opposite order. + +### Deeper takeaway + +The important concept is not philosophers. It is resource acquisition ordering. + +--- + +## 17. Deadlocks + +A deadlock occurs when a set of threads or processes are permanently blocked because each is waiting for a resource held by another. + +### Example + +- Thread A holds lock 1 and waits for lock 2. +- Thread B holds lock 2 and waits for lock 1. + +Neither can proceed. + +### Deadlock in real systems + +- Two services each hold a distributed lock and wait on the other. +- One thread holds a cache lock and calls into code that needs a DB lock, while another thread holds the DB lock and needs the cache lock. +- A transaction waits for rows locked by another transaction, while that transaction waits back on rows from the first. 
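One way to rule out the lock 1 / lock 2 cycle described above is to acquire locks in a single global order. A minimal Python sketch, assuming a hypothetical `Account` class whose unique `account_id` serves as the ordering key:

```python
import threading


class Account:
    """Hypothetical account with its own lock; account_id must be unique."""

    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> None:
    # Always lock the lower account_id first, regardless of transfer direction.
    # Assumes src and dst are different accounts (Lock is not reentrant).
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_opposing_transfers(rounds: int = 1000) -> tuple:
    """Two threads transfer in opposite directions; ordering prevents circular wait."""
    a, b = Account(1, 1000), Account(2, 1000)

    def a_to_b() -> None:
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a() -> None:
        for _ in range(rounds):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

Without the `sorted` step, one thread would lock `a` then `b` while the other locks `b` then `a`, which is exactly the circular wait in the example.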
+ +```mermaid +flowchart LR + T1[Thread A] -->|waiting for| L2[Lock 2] + L1[Lock 1] -->|held by| T1 + T2[Thread B] -->|waiting for| L1 + L2 -->|held by| T2 +``` + +--- + +## 18. Necessary Conditions for Deadlock: Coffman Conditions + +Four conditions must hold simultaneously for deadlock to be possible. + +1. Mutual exclusion + At least one resource must be non-shareable. +2. Hold and wait + A thread holds at least one resource while waiting for others. +3. No preemption + Resources cannot be forcibly taken away; they must be released voluntarily. +4. Circular wait + There exists a circular chain of threads, each waiting for a resource held by the next. + +### Interview insight + +To prevent deadlock, you only need to break one of the Coffman conditions. This is the foundation of many prevention strategies. + +--- + +## 19. Deadlock Prevention, Avoidance, Detection, and Recovery + +These are four different strategies. Interviewers often expect you to distinguish them clearly. + +### Prevention + +Design the system so deadlock cannot occur. + +Ways to do this: + +- Remove hold-and-wait: acquire everything at once. +- Remove circular wait: enforce global lock ordering. +- Remove no-preemption: allow rollback or forced release in some systems. +- Reduce mutual exclusion when possible: use immutable data or lock-free structures. + +Tradeoff: + +- Often conservative +- Can reduce utilization or concurrency + +### Avoidance + +Make allocation decisions dynamically so the system never enters an unsafe state. + +The classic example is Banker's Algorithm. + +Key idea: + +- Not every currently safe-looking allocation is future-safe. +- The system checks whether granting a request could leave it in a state where some completion order still exists. + +Tradeoff: + +- Requires knowledge of future maximum demands +- Rare in general-purpose production software +- Important mostly as an interview and OS theory concept + +### Detection + +Allow deadlocks to happen, then detect them. 
+ +Examples: + +- DBMS lock manager detects wait-for cycles. +- OS or runtime analyzes resource graphs. + +Tradeoff: + +- Useful when deadlocks are rare +- Requires monitoring and detection overhead + +### Recovery + +Once deadlock is detected, recover by: + +- killing a process or transaction, +- rolling back work, +- preempting resources if possible, +- restarting part of the system. + +### Practical interview answer + +> In application code, deadlock prevention through lock ordering is the most common strategy. Databases often use detection and recovery because transactions can be rolled back. + +--- + +## 20. Starvation vs Deadlock vs Livelock + +These three are often confused. + +### Deadlock + +Threads are blocked forever waiting on each other. + +- No progress is possible for the involved threads. +- Classic cause: circular waiting on resources. + +### Starvation + +A thread waits indefinitely because others keep getting access first. + +- System as a whole may still make progress. +- The unlucky thread does not. + +Example: + +- A low-priority thread never gets CPU time. +- Writers never acquire a read-write lock because readers keep arriving. + +### Livelock + +Threads are not blocked. They keep changing state in response to each other, but no useful work completes. + +Example: + +- Two threads repeatedly back off and retry at the same time forever. + +### Quick comparison + +- Deadlock: stuck and waiting. +- Starvation: one participant keeps losing. +- Livelock: everyone is active but ineffective. + +### Interview one-liner + +> Deadlock means nobody can move, starvation means someone never gets a turn, and livelock means everyone keeps moving but nobody makes progress. + +--- + +## 21. Thread Safety + +Thread safety means code behaves correctly when accessed by multiple threads concurrently. + +### A thread-safe component typically guarantees one of these + +- Internal synchronization protects shared mutable state. +- It is immutable after construction. 
+- It uses thread-local state. +- It relies only on atomic operations for correctness. +- It requires external synchronization and documents that requirement. + +### Common misconceptions + +- Thread-safe does not mean fast. +- Thread-safe does not mean deadlock-free. +- Thread-safe does not mean lock-free. +- "Works most of the time" is not thread-safe. + +### Interview framing + +When asked how to make something thread-safe, answer in layers: + +1. Identify shared mutable state. +2. Define invariants that must always hold. +3. Choose synchronization or immutability strategy. +4. Minimize lock scope and avoid calling unknown code while locked. +5. Consider contention, fairness, and visibility. + +--- + +## 22. Atomic Operations + +Atomic operations are indivisible operations supported by hardware and exposed through runtimes or language libraries. + +Common atomic operations: + +- atomic load +- atomic store +- compare-and-swap (CAS) +- fetch-and-add +- exchange + +### Compare-and-swap intuition + +CAS means: + +1. Check whether memory still contains the expected value. +2. If yes, replace it with a new value atomically. +3. If no, fail and let caller retry. + +This is foundational for lock-free algorithms. + +### Why atomics matter + +- Efficient counters +- Lock-free queues and stacks +- Building higher-level locks +- Memory ordering and visibility guarantees + +### Important caveat + +Atomic does not mean the entire high-level operation is correct. + +Example: + +- Multiple atomic variable updates together are not automatically atomic as a group. + +### Memory ordering note + +In deeper interviews, mention that atomics can have different memory orders. Some guarantee only atomicity; others also impose stronger ordering and visibility semantics. + +If the interviewer goes deeper, terms like acquire, release, and sequential consistency are relevant. + +--- + +## 23. How These Concepts Appear in Backend Systems + +### 1. 
Request handling in web servers + +A server may use: + +- one thread per request, +- a thread pool, +- or an event loop with background workers. + +Concurrency issues show up in: + +- shared caches, +- request counters, +- connection pools, +- session state, +- rate limiters. + +### 2. Connection pools + +Only a fixed number of DB connections exist. + +- This is naturally modeled with a counting semaphore. +- Threads wait when the pool is exhausted. + +### 3. In-memory caches + +Concurrent reads and writes can corrupt internal state unless synchronized. + +Possible designs: + +- global mutex, +- sharded locks by key range, +- read-write lock, +- immutable snapshots with atomic pointer swaps. + +### 4. Job queues and workers + +Producer-consumer is everywhere: + +- API tier produces jobs, +- workers consume them, +- bounded queues prevent overload, +- condition variables or blocking queues coordinate work. + +### 5. Logging and metrics + +Naive shared counters and buffers often create contention. + +Production systems may use: + +- atomic counters, +- per-thread buffers, +- batch flushing, +- lock-free ring buffers. + +### 6. Distributed systems note + +The same ideas reappear at larger scale: + +- distributed locks, +- transactional deadlocks, +- leader election races, +- message ordering, +- idempotency instead of shared-memory locking. + +Concurrency bugs do not disappear in distributed systems. They become harder because the scheduler is now the network. + +--- + +## 24. Practical Patterns and Tradeoffs + +### Prefer less shared mutable state + +The easiest race condition to fix is the one you never create. + +Useful patterns: + +- immutable objects, +- message passing, +- partitioning state by key, +- ownership transfer, +- copy-on-write, +- thread-local storage. + +### Minimize lock duration + +Do not hold locks while doing: + +- network I/O, +- disk I/O, +- long CPU computation, +- callbacks into unknown code. 
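A common way to keep lock duration short is to swap or copy shared state while locked and do the slow work afterwards. A hedged sketch in Python; `MetricsBuffer` and the `sink` callback are hypothetical names standing in for any buffer that is flushed to slow I/O:

```python
import threading


class MetricsBuffer:
    """Collects events under a lock but flushes them without holding it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = []

    def record(self, event) -> None:
        with self._lock:
            self._pending.append(event)

    def flush(self, sink) -> int:
        # Critical section is only a pointer swap, so it stays very short.
        with self._lock:
            batch, self._pending = self._pending, []
        # Slow work (network, disk, unknown callbacks) runs with the lock released,
        # so record() is never blocked behind I/O and sink may safely call record().
        for event in batch:
            sink(event)
        return len(batch)
```

Because `sink` runs outside the lock, a callback that re-enters `record()` does not deadlock; the re-entered events simply land in the next batch.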
+ +### Maintain lock ordering + +If code may acquire multiple locks, impose a consistent global order. + +This is one of the most practical deadlock prevention techniques. + +### Beware of hidden lock boundaries + +Deadlocks often involve: + +- library calls, +- logging inside locked code, +- callback re-entry, +- nested object methods each taking locks. + +### Measure contention + +A correct locking strategy can still fail performance goals if contention is high. + +Symptoms: + +- high CPU with low throughput, +- many blocked threads, +- tail latency spikes, +- reduced scaling as cores increase. + +--- + +## 25. Common Interview Questions and How to Think About Them + +### "What is the difference between concurrency and parallelism?" + +Answer with both concept and example. + +Good answer: + +> Concurrency is about multiple tasks making progress during the same time window, often by interleaving. Parallelism is about tasks literally running at the same instant on different cores. A single-core event loop is concurrent but not parallel. + +### "What is a race condition?" + +Say more than "two threads access the same variable." + +Good answer: + +> A race condition happens when correctness depends on timing or interleaving between concurrent operations. A common case is unsynchronized shared mutable state, where non-atomic read-modify-write leads to lost updates. + +### "What is the critical section problem?" + +Good answer: + +> It is the problem of designing a protocol so threads sharing data can enter critical sections safely while satisfying mutual exclusion, progress, and bounded waiting. + +### "Mutex vs semaphore?" + +Good answer: + +> A mutex provides mutual exclusion and usually has ownership semantics. A semaphore is a counter-based signaling primitive. Binary semaphores can resemble locks, but counting semaphores are better for controlling access to N identical resources. + +### "When would you use a spinlock?" 
+ +Good answer: + +> Only when the expected wait is extremely short and the cost of sleeping is higher than busy-waiting, such as in low-level systems code. Under long waits or high contention, spinlocks waste CPU. + +### "How do deadlocks happen and how do you prevent them?" + +Good answer: + +> Deadlocks require mutual exclusion, hold-and-wait, no preemption, and circular wait. In application code, the most practical prevention strategy is global lock ordering and minimizing nested lock acquisition. + +### "What is starvation? What is livelock?" + +Good answer: + +> Starvation means a thread never gets the resource or CPU time it needs even though others continue to make progress. Livelock means threads keep reacting to each other and changing state, but no useful progress is made. + +### "How would you make a class thread-safe?" + +Good answer structure: + +1. Identify shared mutable state. +2. Define invariants. +3. Choose a strategy: immutability, mutex, RW lock, atomics, or confinement. +4. Keep lock scope minimal. +5. Validate performance and deadlock risk. + +--- + +## 26. Practical Interview Scenarios + +### Scenario 1: Shared counter for analytics + +Question: + +"Many threads increment a global request counter. What can go wrong and how would you fix it?" + +What interviewer wants: + +- identify lost updates, +- explain why `count++` is not atomic, +- propose mutex or atomic increment, +- discuss contention if traffic is high. + +Strong extension: + +> For very high update rates, I might use per-thread counters and periodically aggregate them, because a single atomic counter can become a contention hotspot. + +### Scenario 2: Thread-safe LRU cache + +Question: + +"How would you make an LRU cache thread-safe?" + +Strong approach: + +- The map and linked list must remain consistent together. +- A coarse mutex is the simplest correct approach. +- If reads dominate and performance matters, consider sharding or redesign. 
+- Beware of callbacks, eviction hooks, or loaders that run while holding the lock. + +### Scenario 3: Database connection pool + +Question: + +"Which primitive fits best?" + +Strong answer: + +> A counting semaphore fits naturally because it tracks how many connections are available. The actual pool data structure still needs synchronization, but the semaphore models resource capacity cleanly. + +### Scenario 4: Avoiding deadlock with two resources + +Question: + +"Two threads sometimes need both lock A and lock B. How do you avoid deadlock?" + +Strong answer: + +> Enforce lock ordering. Always acquire A before B everywhere in the codebase. If order cannot be guaranteed, use try-lock with backoff or redesign to reduce nested locking. + +### Scenario 5: Producer-consumer queue overload + +Question: + +"Workers are slower than producers. What should happen?" + +Strong answer: + +> Use a bounded queue so memory usage cannot grow without limit. Producers should block, shed load, or apply backpressure once the queue is full. + +That answer shows system thinking, not just API knowledge. + +--- + +## 27. A Good Mental Model for Solving Concurrency Questions + +When you see a concurrency problem, reason in this order: + +1. What state is shared? +2. What invariants must remain true? +3. Who can read or write the state concurrently? +4. Is the problem about exclusion, ordering, visibility, or all three? +5. Which primitive best matches the need? +6. Could deadlock, starvation, or contention appear? +7. Can the design reduce sharing instead of synchronizing more? + +This is often what separates a strong interview answer from a memorized one. + +--- + +## 28. High-Value Takeaways to Remember + +- Concurrency is about coordinating multiple in-progress tasks. +- Parallelism is about simultaneous execution. +- Race conditions happen when correctness depends on timing. +- Critical sections protect shared mutable state. +- Mutual exclusion solves only one part of synchronization. 
+- Mutexes protect exclusive access. +- Semaphores coordinate access to counted resources or events. +- Condition variables let threads wait for state changes. +- Spinlocks trade CPU for low waiting latency. +- Monitors package state and synchronization together. +- Deadlock requires all four Coffman conditions. +- Starvation, deadlock, and livelock are different failure modes. +- Thread safety is about correctness under concurrent access, not just locking. +- Atomic operations are building blocks, not magic. +- In real systems, the best design often reduces shared state instead of adding more locks. + +--- + +## 29. Final Interview Advice + +For OS and systems interviews, aim to answer at three levels: + +1. Definition + Show you know the term precisely. +2. Mechanism + Explain how it works and what guarantees it provides. +3. Tradeoff and application + Show when you would use it in real systems and what can still go wrong. + +That combination makes your answers sound like engineering judgment rather than memorization. + +If you can explain: + +- why `count++` is unsafe, +- why `while` is used around condition waits, +- why lock ordering prevents deadlocks, +- when semaphores fit better than mutexes, +- and how these ideas appear in caches, queues, pools, and services, + +you are already operating at a strong interview level. diff --git a/os/memoryManagement.md b/os/memoryManagement.md new file mode 100644 index 0000000..3e1bb0d --- /dev/null +++ b/os/memoryManagement.md @@ -0,0 +1,921 @@ +# Memory Management for Software Engineering Interviews + +Memory management is one of the most important operating-system topics for interviews because it sits at the boundary between hardware reality, kernel policy, language runtime behavior, and application performance. If you build backend systems, work with C++ or Java, debug production latency, or reason about scale, you are already dealing with memory-management tradeoffs even if the kernel hides most of the mechanics. 
+ +This guide aims to give you an interview-ready mental model, not just a glossary. The central question is simple: + +> How does the operating system make memory appear large, fast, isolated, and safe even though physical RAM is limited, shared, and much slower than the CPU? + +## 1. Why Memory Management Exists + +An operating system cannot let every process read and write raw physical memory arbitrarily. If it did: + +- Any process could corrupt another process. +- The kernel would have no isolation boundary. +- Programs would need to know where they are loaded in RAM. +- Memory would be difficult to share safely. +- Fragmentation and relocation would become unmanageable. + +Memory management exists to solve a few core problems at once: + +- Isolation: each process should feel like it owns memory. +- Protection: invalid or unauthorized accesses should be blocked. +- Efficiency: RAM should be used well, not wasted. +- Abstraction: programs should use addresses without caring where data physically lives. +- Performance: recently used translations and data should be fast to access. +- Flexibility: the OS should be able to load, move, share, swap, and evict memory as needed. + +The big idea is that processes mostly work with logical or virtual addresses, while the operating system and hardware cooperate to map those to physical memory. + +## 2. How Memory Works in an Operating System + +At a high level, a running process sees a virtual address space. The CPU issues a memory reference like "load from address X". That address is usually not a raw DRAM location. Instead, hardware called the Memory Management Unit (MMU) translates it into a physical address. + +The actual flow usually looks like this: + +1. A process executes an instruction that references a virtual address. +2. The CPU checks the TLB, which is a small cache of recent address translations. +3. If the translation is in the TLB, the CPU quickly gets the physical frame. +4. 
If not, hardware or the kernel walks the page tables to find the mapping. +5. If the page is present in RAM and permissions allow access, the read or write proceeds. +6. If the page is not present, a page fault occurs and the kernel decides how to handle it. + +```mermaid +flowchart TD + A[Instruction references virtual address] --> B{TLB hit?} + B -->|Yes| C[Get physical frame quickly] + B -->|No| D[Walk page tables] + D --> E{Valid present mapping?} + E -->|Yes| F[Fill TLB and continue] + E -->|No| G[Page fault trap to kernel] + G --> H{Can kernel resolve it?} + H -->|Yes| I[Load or map page and resume] + H -->|No| J[Send error like SIGSEGV or kill process] + C --> K[Access cache or DRAM] + F --> K + I --> K +``` + +This explains a lot of interview topics at once: + +- Virtual memory gives each process its own address space. +- Paging breaks memory into fixed-size units. +- Page tables store the mapping. +- The TLB makes translation fast. +- Page faults handle missing pages. +- Swapping and demand paging allow memory to exceed RAM. + +## 3. Logical Address vs Physical Address + +This distinction is foundational. + +### Logical address + +A logical address is the address generated by the CPU from the program's point of view. In modern systems, the term virtual address is usually used in practice, and in interview conversation logical and virtual are often treated as effectively the same thing. + +Examples: + +- A pointer in C++ points to a virtual address in the process address space. +- A Java object reference is resolved by the JVM within the process's memory model, but the underlying memory still ultimately lives in virtual memory managed by the OS. + +### Physical address + +A physical address is the real location in RAM that the memory controller uses. + +### Important nuance + +Historically, some textbooks distinguish logical from virtual more carefully, especially in segmented systems. 
For most modern interview contexts, the useful distinction is: + +- Program-visible address: logical or virtual +- Hardware RAM location: physical + +### Why the distinction matters + +- Protection is enforced on virtual-to-physical translation. +- Different processes can use the same virtual address values without conflict. +- The OS can relocate or swap memory without changing application code. + +Example: + +- Process A may read from virtual address `0x7fff0000`. +- Process B may also read from virtual address `0x7fff0000`. +- Those can map to completely different physical frames. + +That is why virtual addresses are per-process, while physical addresses are system-wide. + +## 4. Address Space + +An address space is the range of memory addresses a process can use. More precisely, it is the abstraction of memory visible to that process. + +Each process typically gets its own virtual address space containing regions such as: + +- Text or code segment +- Read-only data +- Global and static data +- Heap +- Memory-mapped files +- Shared libraries +- Stack + +Typical process layout looks like this: + +```mermaid +flowchart TB + K[High virtual addresses] + S[Stack grows downward] + M[Memory-mapped region and shared libraries] + H[Heap grows upward] + D[Data and BSS] + T[Code or text] + Z[Low virtual addresses] + + K --> S --> M --> H --> D --> T --> Z +``` + +### Interview-level understanding + +- The address space is virtual, not raw RAM. +- The heap and stack are just regions inside that space. +- Separate processes have separate address spaces. +- Threads in the same process share the address space but usually have separate stacks. + +### 32-bit vs 64-bit intuition + +- A 32-bit address space is much smaller and historically made memory pressure and layout constraints more visible. +- A 64-bit address space is so large that modern systems can use sparse mappings comfortably, which makes techniques like memory-mapped files and guard pages easier to support. 
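
To make the 32-bit versus 64-bit intuition concrete, here is a small arithmetic sketch. The helper names are hypothetical, and the 48-bit case is an added assumption reflecting x86-64 configured with four-level paging; it is not a statement about every 64-bit machine.

```python
# Illustrative arithmetic only: how much address *namespace* N bits provide.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses expressible with `bits` address bits."""
    if bits < 0:
        raise ValueError("bits must be non-negative")
    return 1 << bits


def human(n: int) -> str:
    """Format a power-of-two byte count using binary units."""
    i = 0
    while n >= 1024 and i < len(UNITS) - 1:
        n //= 1024
        i += 1
    return f"{n} {UNITS[i]}"


for bits in (32, 48, 64):  # 48 is an assumed x86-64 four-level paging width
    print(bits, "bits ->", human(address_space_bytes(bits)))
# 32 bits -> 4 GiB
# 48 bits -> 256 TiB
# 64 bits -> 16 EiB
```

These numbers describe how many addresses can be named, not how much RAM is installed.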
+ +Large virtual address spaces do not mean the machine has that much RAM. They just give the OS a large namespace to manage. + +## 5. Memory Allocation Basics + +Memory allocation means deciding how memory is assigned to processes, threads, objects, buffers, or pages. + +There are several layers of allocation: + +- The kernel allocates physical page frames. +- The kernel maps virtual pages into a process address space. +- User-space allocators such as `malloc`, `new`, `jemalloc`, or `tcmalloc` manage heap memory inside the process. +- Language runtimes like the JVM allocate objects within managed heap regions. + +### Common allocation categories + +#### Static allocation + +Memory decided before execution, such as global variables or static storage. + +#### Stack allocation + +Memory associated with function calls and local variables with automatic lifetime. + +#### Heap allocation + +Memory requested dynamically at runtime, often with manual or runtime-managed lifetime. + +### What `malloc` or `new` really does + +Interviewers often ask this because it reveals whether you understand the layers. + +At a simplified level: + +1. Your program asks the allocator for some bytes. +2. The allocator tries to satisfy it from existing heap arenas or free lists. +3. If it needs more memory, it may ask the kernel for additional pages using mechanisms like `brk` or `mmap`. +4. The kernel updates page tables so those virtual pages belong to the process. +5. Actual physical pages may still be assigned lazily on first touch, depending on the OS. + +So `malloc(1024)` usually does not mean "immediately reserve exactly 1024 physical bytes in RAM". It means "make this memory available in the process's virtual address space and allocator bookkeeping". + +## 6. Contiguous vs Non-Contiguous Memory Allocation + +This topic is really about how memory is laid out physically or logically for a process. 
+ +### Contiguous allocation + +In contiguous allocation, a process or region is placed in one continuous block of physical memory. + +Advantages: + +- Simple bookkeeping +- Simple address computation +- Historically easy to implement + +Disadvantages: + +- Hard to fit variable-sized processes efficiently +- External fragmentation becomes a serious problem +- Growing processes is awkward +- Compaction may be needed + +Older memory-management designs used fixed or variable partitions in physical memory, but these approaches did not scale well. + +### Non-contiguous allocation + +In non-contiguous allocation, a process can occupy multiple separated physical locations. + +Examples: + +- Paging: memory split into fixed-size pages and frames +- Segmentation: memory split into logical variable-sized segments +- Combined designs: segmented paging or paged virtual memory + +Advantages: + +- Better flexibility +- Better RAM utilization +- Easier growth of address spaces +- Simplifies sharing and protection at smaller granularity + +Disadvantages: + +- More translation overhead +- More metadata such as page tables +- More complex hardware and kernel logic + +Modern general-purpose operating systems rely heavily on non-contiguous allocation, especially paging. + +## 7. Fragmentation: Internal vs External + +Fragmentation means memory is being wasted, but the reason for the waste differs. + +### Internal fragmentation + +Internal fragmentation happens when allocated memory is larger than what the program actually needs, so wasted space exists inside the allocated unit. + +Example: + +- If page size is 4 KiB and a process needs 6 KiB, it will use 2 pages, or 8 KiB total. +- About 2 KiB is unused inside the allocated pages. + +This is internal fragmentation because the wasted space is inside the allocated blocks. 
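
The page arithmetic in the example above can be checked with a short sketch. The function name `page_usage` is hypothetical, and the model is deliberately simplified: it rounds one request up to whole pages and ignores allocator metadata.

```python
KIB = 1024


def page_usage(request_bytes: int, page_size: int = 4 * KIB) -> tuple:
    """Return (pages, allocated_bytes, wasted_bytes) for a single request."""
    if request_bytes < 0 or page_size <= 0:
        raise ValueError("request must be >= 0 and page size must be > 0")
    pages = -(-request_bytes // page_size)  # ceiling division
    allocated = pages * page_size
    # The wasted bytes sit inside the last allocated page: internal fragmentation.
    return pages, allocated, allocated - request_bytes


pages, allocated, wasted = page_usage(6 * KIB)
print(pages, allocated // KIB, wasted // KIB)  # 2 8 2
```

With a single request the waste is always less than one page, which is why internal fragmentation from paging is bounded per allocation rather than growing with the total size.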
+ +### External fragmentation + +External fragmentation happens when enough total free memory exists, but it is split into small scattered holes, so a large contiguous request cannot be satisfied. + +Example: + +- Free blocks of 10 MB, 5 MB, and 20 MB exist. +- A process requests a contiguous 30 MB block. +- Total free memory is 35 MB, but there is no single 30 MB region. + +This is external fragmentation because the waste exists between allocated regions. + +### What causes each one + +- Fixed-size allocation units, like pages, tend to create internal fragmentation. +- Variable-sized contiguous allocation tends to create external fragmentation. + +### Interview framing + +If asked which fragmentation paging solves, the strong answer is: + +> Paging largely eliminates external fragmentation in physical allocation because pages can be placed anywhere, but it still suffers from internal fragmentation at page granularity. + +## 8. Virtual Memory + +Virtual memory is the abstraction that gives each process a large, private, contiguous-looking address space, regardless of how memory is physically arranged. + +The key word is illusion. The OS does not promise that every virtual page is backed by RAM right now. It promises that accesses will either work through translation or be handled through faults, allocation, or process termination. + +### What virtual memory provides + +- Isolation between processes +- Protection via access permissions +- Sparse address spaces +- The ability to use more virtual memory than physical RAM +- Efficient sharing of libraries and file mappings +- Simplified programming model + +### Why virtual memory is needed + +Without virtual memory: + +- Programs would need physical addresses or explicit relocation logic. +- Different processes could not reuse the same convenient address ranges. +- Swapping and demand paging would be much harder. +- Isolation would be weak and unsafe. 
+- Shared libraries and memory-mapped files would be more complicated. + +### The most important interview insight + +Virtual memory is not just about pretending disk is extra RAM. That is too shallow. + +It is mainly about: + +- address translation, +- protection, +- isolation, +- flexible placement, +- and loading data only when needed. + +Using disk as a backing store is one consequence, not the whole story. + +## 9. Paging + +Paging is the dominant memory-management technique in modern operating systems. + +The idea is simple: + +- Divide virtual memory into fixed-size pages. +- Divide physical memory into fixed-size frames of the same size. +- Map each virtual page to some physical frame. + +If page size is 4 KiB, a virtual address is split into: + +- Virtual page number +- Offset within the page + +The offset stays the same during translation. Only the page number changes. + +Example: + +- Virtual address = page 42, offset 100 +- Page table says page 42 is in frame 900 +- Physical address = frame 900, offset 100 + +This is why paging avoids needing contiguous physical memory. + +### Advantages of paging + +- Eliminates most external fragmentation +- Supports virtual memory naturally +- Makes sharing and protection easy at page granularity +- Allows demand paging and swapping + +### Costs of paging + +- Page-table memory overhead +- Internal fragmentation inside the last page +- Translation overhead without a TLB +- Page faults can be very expensive + +## 10. Page Tables + +A page table is the data structure that maps virtual pages to physical frames. + +Each entry usually stores more than just a frame number. Typical metadata includes: + +- Present or valid bit +- Read or write permissions +- User or kernel accessibility +- Dirty bit, meaning page has been modified +- Accessed or referenced bit +- Execute-disable bit on supported hardware + +### Why page tables matter + +They are where isolation and protection become concrete. 
If the mapping is missing or permissions do not allow access, the CPU traps into the kernel. + +### Why page tables can be large + +Suppose a process has a large virtual address space and small page size. A flat page table would need an entry for a huge number of possible pages, even if the process only uses a small subset. + +That is why real systems use hierarchical or multi-level page tables. + +## 11. Multi-Level Paging + +Multi-level paging is an optimization for page-table storage. + +Instead of one giant page table, the address is broken into multiple index levels. Lower-level tables are allocated only for the parts of the address space actually in use. + +```mermaid +flowchart TD + A[Virtual address] --> B[Level 1 index] + A --> C[Level 2 index] + A --> D[Level 3 index] + A --> E[Page offset] + B --> F[Top-level page table] + F --> G[Next-level table] + G --> H[Leaf page table entry] + H --> I[Physical frame] + E --> J[Physical address uses same offset] + I --> J +``` + +### Why it helps + +- Sparse address spaces do not require allocating a full flat page table. +- Memory overhead becomes proportional to the used regions of the address space. + +### Tradeoff + +Walking multiple levels takes more memory accesses on a TLB miss. That is one reason the TLB is so important. + +### Real-world example + +Modern 64-bit systems such as x86-64 often use four or five levels of paging for large address spaces. + +You do not usually need to memorize exact bit splits unless the interviewer is going deep into architecture. What matters is understanding why multi-level paging exists. + +## 12. Translation Lookaside Buffer (TLB) + +The TLB is a small, very fast cache inside the CPU that stores recent virtual-to-physical translations. + +Without a TLB, every memory access could require extra page-table lookups, which would be far too slow. + +### Why the TLB matters so much + +Every instruction fetch, stack access, heap access, and data read depends on address translation. 
If translation were always a full page-table walk, memory access would be dramatically slower. + +### TLB hit vs miss + +- TLB hit: translation found quickly, access continues. +- TLB miss: hardware or software must walk page tables and possibly populate the TLB. + +### Practical implications + +- Good locality improves TLB effectiveness. +- Large page sizes or huge pages can reduce TLB pressure because one entry covers more memory. +- Context switches can reduce TLB usefulness unless the CPU supports address-space tagging such as ASIDs or PCIDs. + +### Backend-system angle + +Databases, caches, in-memory analytics engines, and JVM heaps can all suffer when working sets exceed TLB coverage. This is one reason huge pages sometimes help performance-sensitive systems. + +## 13. Page Faults + +A page fault occurs when a process accesses a virtual page whose translation cannot be completed normally. + +That does not automatically mean a bug. Some page faults are expected and legitimate. + +### Common reasons for a page fault + +- The page has not been loaded yet and must be brought into memory. +- The page exists but is currently swapped out. +- The page is marked copy-on-write and needs a private copy on write. +- The access violates protection, such as writing to a read-only page. +- The address is invalid and not mapped at all. + +```mermaid +sequenceDiagram + participant P as Process + participant CPU as CPU or MMU + participant K as Kernel + participant D as Disk or backing store + + P->>CPU: access virtual page + CPU->>K: page fault trap + K->>K: inspect page-table entry and permissions + alt page can be resolved + K->>D: read page if needed + D-->>K: page data + K->>K: update page table and TLB state + K-->>P: resume instruction + else invalid or forbidden access + K-->>P: send fault signal or terminate + end +``` + +### Major vs minor page fault + +Interviewers sometimes like this distinction. 
+ +- Minor page fault: page can be satisfied without disk I/O, for example a copy-on-write mapping or a page already in memory but not mapped into this process yet. +- Major page fault: servicing the fault requires disk I/O, which is much slower. + +### Important nuance + +A segmentation fault in Linux is often the user-visible result of an invalid or protection-violating page fault. So page fault is the low-level event; `SIGSEGV` is often the process-level consequence. + +## 14. Demand Paging + +Demand paging means pages are loaded into memory only when they are actually accessed. + +This is one of the biggest reasons virtual memory is efficient. Instead of loading an entire executable or heap eagerly, the OS can load pages lazily. + +### Benefits + +- Faster program startup +- Lower RAM usage +- Only touched pages consume physical memory +- Large sparse data structures become feasible + +### Costs + +- First access latency due to page faults +- Too much lazy loading under pressure can cause many faults + +### Real-world examples + +- Executable code pages are often loaded on first use. +- `mmap` of a large file typically does not read the whole file immediately. +- After `fork`, Linux often uses copy-on-write so parent and child share pages until one writes. + +Demand paging is a great interview bridge topic because it connects virtual memory, page tables, page faults, and performance. + +## 15. Thrashing + +Thrashing happens when the system spends too much time paging pages in and out and too little time doing useful work. + +This usually occurs when the active working sets of processes do not fit in available RAM. 
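
A toy simulation can make the working-set effect visible. This sketch assumes LRU replacement, which is a simplification chosen for illustration; real kernels use approximations rather than exact LRU. The function name `count_faults` and the reference string are hypothetical.

```python
from collections import OrderedDict


def count_faults(references, frames: int) -> int:
    """Count page faults for a reference string under exact LRU replacement."""
    if frames < 1:
        raise ValueError("need at least one frame")
    resident = OrderedDict()  # page -> True, ordered from least to most recent
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


# A loop that touches 5 pages in order, repeated 100 times (500 accesses).
refs = [page for _ in range(100) for page in range(5)]
print(count_faults(refs, frames=5))  # 5   (only the cold misses)
print(count_faults(refs, frames=4))  # 500 (every access faults)
```

Dropping from five frames to four turns a workload with five faults into one where every access faults, because the page needed next is always the one that was just evicted. That cliff, rather than a gradual slowdown, is the signature of thrashing.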
+ +### Symptoms + +- Very high page fault rate +- Heavy disk I/O or swap activity +- CPU utilization may drop because tasks keep waiting on memory +- Throughput collapses +- Tail latency becomes terrible + +### Why it happens + +If a process keeps needing pages that were just evicted, the system enters a destructive loop: + +- page needed, +- page fault, +- load from disk, +- evict another needed page, +- repeat. + +### Mitigations + +- Add more RAM +- Reduce multiprogramming level +- Tune memory limits and eviction behavior +- Use better locality-friendly algorithms +- Reduce heap size or working set size +- Avoid overcommitting memory aggressively + +### Practical production example + +A Java service in a container with tight memory limits may begin swapping or faulting heavily under burst traffic. Even if CPU looks available, the service becomes slow because it is memory-bound rather than compute-bound. + +## 16. Segmentation + +Segmentation divides memory into logical variable-sized regions called segments, such as code, data, stack, or heap. + +Instead of address = page number + offset, the idea is more like: + +- segment number +- offset within the segment + +Each segment has a base and limit. + +### Why segmentation is attractive conceptually + +- It matches program structure well. +- Different segments can have different permissions. +- Sharing logical regions can be natural. + +### Main problem + +Because segments are variable-sized, segmentation suffers from external fragmentation. + +### Modern relevance + +Pure segmentation is not the main model in modern general-purpose systems. Modern systems are dominated by paging, though some architectures preserve limited segmentation concepts for special purposes. + +Still, segmentation remains important in interviews because it teaches the difference between logical program regions and fixed-size paging units. + +## 17. Paging vs Segmentation + +This comparison comes up often. 
+ +| Aspect | Paging | Segmentation | +| --- | --- | --- | +| Unit size | Fixed-size pages | Variable-size segments | +| View of memory | Physical-management oriented | Logical-program-structure oriented | +| Fragmentation | Internal fragmentation | External fragmentation | +| Allocation flexibility | High | Lower under pressure | +| Protection granularity | Page-based | Segment-based | +| Modern OS usage | Very common | Limited or combined | + +### Strong interview explanation + +Paging is better for efficient physical memory management because fixed-size frames are easy to allocate. Segmentation is better for expressing logical program structure, but variable-sized segments fragment memory. That is why modern systems mostly use paging, sometimes with segmentation concepts layered on top or retained for limited architectural roles. + +## 18. Swapping + +Swapping means moving memory contents between RAM and disk to free physical memory. + +Historically, systems sometimes swapped entire processes. Modern systems usually work at page granularity, not by moving whole processes out all at once. + +### Why swapping exists + +- RAM is finite. +- Some pages are cold and can be moved out temporarily. +- This allows the system to keep more virtual memory in use than physical RAM alone would permit. + +### Why swapping is dangerous for performance + +Disk, even SSD, is far slower than RAM. If hot pages are swapped out and quickly needed again, latency explodes. + +### Linux perspective + +- Linux can swap anonymous pages under pressure. +- The kernel also uses the page cache heavily for file-backed data. +- In containerized systems, excessive swapping often causes severe performance issues, and some deployments disable swap to avoid unpredictable latency. + +Swapping is sometimes useful as a safety buffer, but if a latency-sensitive service is actively depending on swap, it is usually already in trouble. + +## 19. 
Stack vs Heap + +This is a classic interview topic because it connects language semantics to OS memory layout. + +### Stack + +The stack is typically: + +- Per thread +- Automatically managed by function call discipline +- Used for call frames, return addresses, parameters, and many local variables +- Very fast to allocate and free because it usually just moves the stack pointer + +Common properties: + +- Lifetime is usually lexical or call-scoped. +- Size is limited. +- Deep recursion can cause stack overflow. + +### Heap + +The heap is typically: + +- Shared by threads in the same process +- Used for dynamically allocated objects +- Flexible in lifetime and size relative to the stack +- Managed by allocators or garbage collectors + +Common properties: + +- Allocation and freeing are more expensive than simple stack-pointer movement. +- Fragmentation can occur. +- Bugs such as leaks, double free, or use-after-free often involve heap memory. + +### Language examples + +#### C++ + +- Local automatic variable usually lives on the stack. +- `new` typically allocates on the heap. +- RAII helps tie resource lifetime to scope. + +#### Java + +- Each thread has a stack for method frames. +- Most objects live on the heap managed by the JVM. +- Some values may be optimized away or scalar-replaced by the JIT, so the old rule "objects are always on the heap" is directionally right for interviews but not perfectly literal. + +### Strong interview summary + +Stack allocation is fast and structured but limited and scope-bound. Heap allocation is flexible and long-lived but more expensive to manage and more prone to fragmentation and lifetime bugs. + +## 20. Memory Leaks and Garbage Collection Basics + +Memory leaks are not just a C or C++ problem. They also happen in managed runtimes, just in a different form. 
+ +### Memory leak in manual-memory systems + +In C or C++, a memory leak usually means allocated memory is no longer needed but can no longer be freed because the program lost track of it. + +Examples: + +- `malloc` without `free` +- `new` without `delete` +- Overwriting the only pointer to an allocated object + +### Memory leak in garbage-collected systems + +In Java, Go, or other GC languages, a leak usually means memory is still reachable, so the garbage collector cannot reclaim it, even though the application no longer logically needs it. + +Examples: + +- Static caches that grow forever +- Listeners never deregistered +- `ThreadLocal` values retained too long +- Maps holding references to expired sessions + +GC prevents many manual deallocation bugs, but it does not prevent retaining useless objects. + +### Garbage collection basics + +Most modern garbage collectors are tracing collectors. They start from GC roots, such as stacks, registers, and global references, then mark reachable objects. + +Common ideas you should know: + +- Mark-sweep: mark reachable objects, reclaim the rest. +- Mark-compact: reclaim and then compact live objects to reduce fragmentation. +- Copying collection: copy live objects into a new region, usually efficient for young generations. +- Generational GC: exploit the fact that most objects die young, so collect young space frequently and old space less often. + +### Why GC exists + +- Reduces manual memory-management bugs +- Improves safety and developer productivity +- Makes high-level languages practical at scale + +### Why GC is not free + +- Extra CPU overhead +- Pause times or concurrent collection complexity +- Write barriers and runtime bookkeeping +- Potential memory overhead from fragmentation, reserve spaces, or collection strategy + +### C++ angle + +C++ usually relies on deterministic destruction rather than GC. 
Strong interview topics include: + +- RAII +- `unique_ptr` +- `shared_ptr` +- reference cycles with `shared_ptr` +- custom allocators and arena allocation + +## 21. Real-World Examples from Linux, Java, C++, and Modern Backend Systems + +### Linux + +#### Copy-on-write after `fork` + +When a process forks, Linux does not eagerly copy every page. Parent and child initially share pages as read-only. If one writes, that page faults and the kernel creates a private copy. + +This is a classic example of demand paging, page faults, and efficient memory sharing working together. + +#### `mmap` + +Linux can map files directly into a process address space. Reads and writes can then operate through memory access rather than explicit `read` and `write` calls. + +This is important for: + +- databases, +- analytics engines, +- file-backed caches, +- zero-copy-style optimizations. + +#### Page cache + +Linux uses RAM aggressively as a page cache for file data. This is why "free memory" is not the right metric by itself. Used memory may still be reclaimable cache. + +### Java + +Java memory interview discussion often includes: + +- Heap for objects +- Per-thread stacks +- Metaspace for class metadata +- GC generations +- Stop-the-world pauses vs concurrent collectors +- Off-heap memory via direct buffers or native libraries + +Important practical point: + +A Java service can fail from memory pressure even if heap graphs look reasonable because total memory also includes thread stacks, direct buffers, mapped files, metaspace, and native allocations. + +### C++ + +C++ brings memory ownership and lifetime to the front. + +Important practical topics: + +- stack vs heap allocation, +- manual memory management, +- smart pointers, +- object lifetime, +- fragmentation under general-purpose allocators, +- use-after-free, +- double free, +- arena allocation for predictable performance. 
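
Arena allocation, the last item above, can be sketched in a few lines. This is a toy model in Python over a `bytearray`, not how a production C++ arena is written; the class name `Arena` and its methods are hypothetical. The point is only the shape of the technique: allocation is a pointer bump, and everything is released at once.

```python
class Arena:
    """Toy bump allocator: hands out slices of one preallocated buffer."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self.used = 0  # offset of the first free byte

    def alloc(self, size: int, align: int = 8) -> memoryview:
        """Reserve `size` bytes; `align` is assumed to be a power of two."""
        start = (self.used + align - 1) & ~(align - 1)  # round up to alignment
        end = start + size
        if end > len(self._buf):
            raise MemoryError("arena exhausted")
        self.used = end
        return memoryview(self._buf)[start:end]

    def reset(self) -> None:
        """Release every allocation at once; no per-object free exists."""
        self.used = 0


arena = Arena(64)
a = arena.alloc(10)   # occupies bytes 0..9
b = arena.alloc(4)    # aligned up to offset 16, occupies bytes 16..19
print(arena.used)     # 20
arena.reset()
print(arena.used)     # 0
```

Because there is no per-object free, the arena cannot fragment internally the way a general-purpose heap can, at the cost of holding all memory until the reset.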
+ +Many low-latency systems use custom allocators or memory pools to reduce allocator overhead and fragmentation. + +### Modern backend systems + +#### Containers and cgroups + +A process may have plenty of virtual address space but still be killed because the container memory limit is reached. From an interview point of view, that shows the difference between address-space size, RSS, heap size, and actual allowed physical usage. + +#### Databases and caches + +Databases often care about page size, cache locality, huge pages, and NUMA effects because translation and memory locality directly affect throughput. + +#### Managed services + +High object churn in a Java service can increase GC frequency. The issue is not just "not enough memory" but often allocation rate, object lifetime distribution, and heap tuning. + +#### Native services + +A C++ service can show stable CPU but rising latency because of allocator contention, fragmentation, or page faults under memory pressure. + +## 22. Common Interview Questions and How to Think About Them + +### Why is virtual memory needed? + +Strong answer: + +Virtual memory provides isolation, protection, flexible placement, sparse address spaces, efficient sharing, and the ability to load or back memory lazily. It is not only about using disk as extra memory. + +### What is the difference between a page fault and a segmentation fault? + +Strong answer: + +A page fault is the low-level event when translation cannot proceed normally. It may be valid and recoverable, like demand paging. A segmentation fault is usually the operating system signal sent to the process when the fault is invalid or violates permissions. + +### Why are page tables multi-level? + +Strong answer: + +A flat page table for a large sparse address space would waste too much memory. Multi-level paging allocates lower-level tables only where needed. + +### What problem does the TLB solve? 
+ +Strong answer: + +It caches recent address translations so each memory access does not require a costly full page-table walk. + +### What is the difference between internal and external fragmentation? + +Strong answer: + +Internal fragmentation is wasted space inside allocated units, like partially used pages. External fragmentation is wasted space between allocated regions, where enough total memory exists but not as one contiguous block. + +### Why does paging reduce external fragmentation? + +Strong answer: + +Because physical frames are fixed-size and pages can be placed anywhere, memory does not need one large contiguous block per process. + +### What happens when you call `malloc`? + +Strong answer: + +Usually the allocator serves the request from an internal free list or arena. If necessary, it requests more pages from the kernel. Physical memory may still be assigned lazily on first access. + +### Why can Java still have memory leaks? + +Strong answer: + +Because GC only frees unreachable objects. If the program keeps references to objects it no longer logically needs, those objects remain reachable and consume memory. + +### What is thrashing? + +Strong answer: + +Thrashing occurs when the system spends most of its time servicing page faults and swapping pages instead of doing useful work, usually because working sets exceed available RAM. + +## 23. Practical Scenarios Interviewers Like + +### Scenario 1: Service latency spikes under load even though CPU is not maxed + +Possible memory-related explanations: + +- major page faults, +- swapping, +- allocator contention, +- GC pauses, +- poor locality causing cache and TLB misses. + +### Scenario 2: Container is OOM-killed even though Java heap was below `-Xmx` + +Reasoning: + +- total process memory includes more than Java heap, +- thread stacks, direct buffers, metaspace, native libraries, and page cache can all matter, +- cgroup limit is the real boundary. 
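The reasoning in Scenario 2 can be made concrete with some quick arithmetic. The sketch below uses entirely hypothetical numbers (the component sizes, the `-Xmx` value, and the container limit are assumptions chosen for illustration, not measurements from any real service) to show how a heap that stays under its limit can still push the whole process past the cgroup boundary.

```python
# Hypothetical memory footprint of a Java service, in MiB.
# Every number here is an illustrative assumption, not a measurement.
components_mib = {
    "java_heap": 3072,        # currently used heap
    "metaspace": 256,         # class metadata
    "thread_stacks": 400,     # e.g. 400 threads at roughly 1 MiB each
    "direct_buffers": 512,    # off-heap NIO buffers
    "native_and_jit": 300,    # native libraries, JIT code cache, allocator overhead
}

heap_limit_mib = 3584        # assumed -Xmx3584m
container_limit_mib = 4096   # assumed cgroup memory limit


def total_footprint_mib(components):
    """Sum every memory component the cgroup would charge to the process."""
    return sum(components.values())


total = total_footprint_mib(components_mib)
heap_ok = components_mib["java_heap"] <= heap_limit_mib
over_limit = total > container_limit_mib

print(f"heap within -Xmx: {heap_ok}")
print(f"total footprint: {total} MiB, container limit: {container_limit_mib} MiB")
print(f"at risk of OOM kill: {over_limit}")
```

With these assumed values the heap graph looks healthy while the total exceeds the container limit, which is the situation the scenario describes.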
+ +### Scenario 3: C++ process memory grows forever + +Possibilities: + +- actual leak, +- retained caches, +- allocator arenas not returned to OS, +- fragmentation, +- memory-mapped growth. + +### Scenario 4: `fork` is surprisingly cheap on Linux + +Reasoning: + +- because of copy-on-write, pages are not copied immediately, +- the kernel mainly duplicates metadata and page-table structures, +- actual copying happens only on write. + +### Scenario 5: Large memory-mapped file is opened instantly + +Reasoning: + +- `mmap` mainly creates virtual mappings, +- actual file pages are brought in lazily by page faults on access. + +## 24. What to Say in an Interview When You Want to Sound Strong + +If you need a compact but impressive explanation, this is a solid framing: + +> Modern memory management is built around virtual memory. Each process gets its own virtual address space, and the MMU translates virtual addresses to physical frames using page tables. Paging allows non-contiguous physical allocation, which improves flexibility and largely removes external fragmentation. The TLB makes translation fast, while page faults let the OS load pages lazily through demand paging. Multi-level page tables keep metadata manageable for sparse address spaces. In practice, performance issues often come from page faults, poor locality, TLB pressure, swapping, fragmentation, or runtime-level behaviors like GC and allocator overhead. + +That answer ties together theory, hardware, OS behavior, and real production effects. + +## 25. Final Mental Model + +If you remember only one model, remember this: + +- Programs operate in virtual address spaces. +- The OS and MMU map virtual pages to physical frames. +- Page tables store mappings and permissions. +- The TLB caches those mappings for speed. +- Missing mappings trigger page faults. +- Demand paging and swapping let the system use RAM lazily and extend apparent memory capacity. 
+- Paging trades external fragmentation for manageable internal fragmentation and metadata overhead. +- Real systems succeed or fail based on locality, working set size, and lifetime management. + +Once this clicks, many interview topics stop feeling like disconnected definitions and start feeling like one coherent system. diff --git a/os/processmanagement.md b/os/processmanagement.md new file mode 100644 index 0000000..67992e2 --- /dev/null +++ b/os/processmanagement.md @@ -0,0 +1,1021 @@ +# Process Management for Software Engineering Interviews + +Process management is the part of an operating system responsible for creating, scheduling, coordinating, and cleaning up running programs. In interview terms, it sits at the intersection of operating system theory and real production behavior: latency, throughput, fairness, isolation, resource sharing, and failure handling all depend on it. + +If you already build backend systems, the practical framing is this: + +- A process gives you isolation and a private virtual address space. +- A thread gives you a unit of execution inside a process. +- The scheduler decides who runs next. +- Context switches are the price the system pays to move the CPU from one runnable task to another. +- IPC exists because isolated execution units still need to cooperate. + +This guide covers the theory, the Linux view, and the interview-level reasoning you should be able to explain clearly. + +## 1. Processes and Threads + +### What is a process? + +A process is a running instance of a program. It is more than just code on disk. Once started, the operating system gives it: + +- Its own virtual address space +- A process identifier (PID) +- Open file descriptors +- Security credentials and environment variables +- Accounting information such as CPU time and memory usage +- One or more threads of execution + +You can think of a process as a resource container plus execution context. 
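Several of the attributes listed above can be inspected from inside a running program. The following is a minimal sketch using Python's standard `os` module; the helper name `describe_current_process` is made up for this example, and the thread count covers only threads started through Python's `threading` module, not every kernel-level thread.

```python
import os
import threading


def describe_current_process():
    """Collect a few of the per-process attributes the OS tracks for us."""
    cpu = os.times()  # accounting information: user and system CPU time
    return {
        "pid": os.getpid(),
        "parent_pid": os.getppid(),
        "working_directory": os.getcwd(),
        "environment_variable_count": len(os.environ),
        "user_cpu_seconds": cpu.user,
        "system_cpu_seconds": cpu.system,
        # Only threads known to the threading module are counted here.
        "python_thread_count": threading.active_count(),
    }


if __name__ == "__main__":
    for name, value in describe_current_process().items():
        print(f"{name}: {value}")
```

Running it twice in a row typically shows a different PID each time, which is a small reminder that a process is a running instance rather than the program file itself.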
+ +Typical process resources include: + +- Code segment +- Heap +- Global data +- Open files and sockets +- Signal handlers +- Page tables and memory mappings + +### What is a thread? + +A thread is the smallest schedulable unit of execution inside a process. Multiple threads in the same process share most process resources, but each thread still has its own: + +- Program counter +- Register set +- Stack +- Thread-local storage + +This is why threads are lighter than processes. Creating a new thread usually costs less than creating a new process, and switching between threads in the same process is usually cheaper than switching between unrelated processes. + +### Shared and private state inside a process + +```mermaid +flowchart TB + P["Process"] + C["Shared: code / data / heap"] + F["Shared: open files / sockets"] + T1["Thread 1\nprivate stack\nprivate registers\nprivate PC"] + T2["Thread 2\nprivate stack\nprivate registers\nprivate PC"] + T3["Thread 3\nprivate stack\nprivate registers\nprivate PC"] + + P --> C + P --> F + P --> T1 + P --> T2 + P --> T3 +``` + +Interview point: when two threads race on the same variable, that happens because they share the same address space. Two separate processes do not directly race on ordinary variables unless they use shared memory. + +## 2. Process vs Thread + +This comparison is foundational. Interviewers ask it directly because it reveals whether you understand isolation, scheduling, and communication costs. 
+ +| Aspect | Process | Thread | +| --- | --- | --- | +| Address space | Separate | Shared within the same process | +| Isolation | Stronger | Weaker | +| Failure impact | Crash is usually isolated to that process | A bad thread can crash the whole process | +| Creation cost | Higher | Lower | +| Context switch cost | Usually higher | Usually lower | +| Communication | IPC needed | Simple shared-memory access | +| Resource ownership | Own files, memory mappings, credentials | Uses process-owned resources | +| Security boundary | Commonly yes | Usually no | + +### Practical interpretation + +- Use processes when isolation matters more than sharing. +- Use threads when low-latency cooperation matters and shared memory is useful. +- Modern systems often mix both. For example, a service may run multiple worker processes, and each worker may use several threads. + +### Real-world examples + +- PostgreSQL traditionally uses a process-per-connection model for strong isolation. +- MySQL commonly uses threads to handle many connections efficiently. +- Nginx uses a small number of worker processes, each running an event loop. +- The JVM runs as a process but internally uses many threads for application code, GC, JIT, and runtime services. + +## 3. Process Lifecycle and Process States + +Operating systems represent a process using metadata such as a Process Control Block (PCB). The PCB stores what the OS needs to manage and later resume the process. 
+ +Typical PCB content includes: + +- PID and parent PID +- Current state +- CPU register snapshot +- Scheduling information such as priority +- Open file table references +- Memory management information +- Accounting and signal information + +### Core process states + +The exact names vary across systems, but the standard model is: + +- New: process is being created +- Ready: process is prepared to run but waiting for CPU time +- Running: process currently has the CPU +- Waiting or Blocked: process is waiting for I/O, a lock, a signal, or another event +- Terminated: process has finished execution + +Some systems also expose suspended states when a process is swapped out or explicitly paused. + +### State transitions + +```mermaid +stateDiagram-v2 + [*] --> New + New --> Ready: admitted + Ready --> Running: scheduler dispatch + Running --> Ready: preempted / time slice ends + Running --> Waiting: I/O wait / sleep / lock wait + Waiting --> Ready: event completes + Running --> Terminated: exit / kill + Terminated --> [*] +``` + +### How to explain the lifecycle in an interview + +When a process is created, it starts in a creation phase, then enters the ready queue. The scheduler picks it to run. If it blocks on I/O, it moves to waiting. Once the I/O completes, it becomes ready again. Eventually it exits and becomes terminated. The scheduler and the kernel move processes among these states. + +## 4. Process States and Context Switching + +### What is a context switch? + +A context switch happens when the CPU stops executing one task and starts executing another. The operating system saves the execution context of the outgoing task and restores the saved context of the incoming task. + +That saved context usually includes: + +- Program counter +- Stack pointer +- General-purpose registers +- CPU flags +- Scheduling metadata +- Sometimes memory-management state such as page-table-related data + +### When do context switches happen? 
+ +Common triggers are: + +- Timer interrupt fires and the running task is preempted +- Process blocks on I/O +- Process voluntarily yields +- Higher-priority task becomes runnable +- Kernel wakes a sleeping task + +### Context switch flow + +```mermaid +sequenceDiagram + participant CPU + participant Scheduler + participant A as Running Task A + participant B as Next Task B + + CPU->>Scheduler: timer interrupt or blocking event + Scheduler->>A: save registers, PC, stack pointer + Scheduler->>Scheduler: choose next runnable task + Scheduler->>B: restore saved context + B->>CPU: resume execution +``` + +### Mode switch vs context switch + +These are related but not identical. + +- A mode switch means the CPU moves between user mode and kernel mode. +- A context switch means the CPU changes which task is running. + +For example, a system call may enter kernel mode and return to the same process without any context switch. Interviews often check whether you can separate these two ideas. + +## 5. Context Switch Overhead and Performance Impact + +Context switching is necessary, but it is not free. + +### Why it costs time + +The kernel must: + +- Save and restore CPU state +- Update scheduling structures +- Potentially switch address spaces +- Disturb CPU cache locality +- Potentially disturb TLB state + +The direct overhead may be small, but the indirect overhead can be significant because the new task may need to warm caches again. That is why frequent switching can reduce throughput. + +### Process switch vs thread switch + +Not all switches cost the same. + +- Switching between threads in the same process often avoids some address-space work. +- Switching between unrelated processes usually has more memory-management overhead. +- User-space thread runtimes can switch very quickly between user threads, but if the underlying kernel thread blocks, the runtime can still stall. 
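On Unix-like systems the kernel keeps per-process counters for context switches, which gives a rough way to observe the triggers described above. This sketch assumes a Unix platform where Python's `resource` module is available; the helper names are invented for illustration, and the exact counts vary by machine and load, so no expected output is shown.

```python
import resource
import time


def context_switch_counts():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure_sleep_loop(iterations=50):
    """Block repeatedly and report how the counters moved."""
    before = context_switch_counts()
    for _ in range(iterations):
        time.sleep(0.001)  # blocking gives up the CPU voluntarily
    after = context_switch_counts()
    return before, after


if __name__ == "__main__":
    before, after = measure_sleep_loop()
    print("voluntary switches added:", after[0] - before[0])
    print("involuntary switches added:", after[1] - before[1])
```

Voluntary switches correspond to the task blocking or yielding; involuntary ones correspond to preemption, for example when a time slice ends.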
+ +### Interview framing + +If an interviewer asks why too-small time slices are bad, the answer is: they improve responsiveness up to a point, but after that the CPU spends too much time switching instead of doing useful work. + +## 6. CPU Scheduling + +Scheduling decides which ready task runs next. This is one of the most important process-management topics because it directly affects latency, throughput, fairness, and resource utilization. + +### Goals of CPU scheduling + +Schedulers try to balance several goals that often conflict: + +- High CPU utilization +- High throughput +- Low waiting time +- Low turnaround time +- Low response time +- Fairness +- Predictability + +Definitions worth memorizing: + +- Turnaround time: total time from submission to completion +- Waiting time: time spent waiting in the ready queue +- Response time: time until the task first gets CPU service + +### Ready queue mental model + +The scheduler chooses from runnable tasks in the ready queue. When a task blocks, it leaves the ready queue. When I/O completes, it re-enters. + +## 7. Scheduling Algorithms + +You should know how each algorithm works, where it performs well, and what tradeoffs it makes. + +### First-Come, First-Served (FCFS) + +FCFS runs the task that arrived earliest. + +How it works: + +- Non-preemptive +- Tasks run in arrival order +- Once a task gets CPU, it keeps it until it blocks or finishes + +Strengths: + +- Very simple +- Low scheduling overhead +- Easy to reason about + +Weaknesses: + +- Poor response time for short interactive tasks +- Convoy effect: a long CPU-bound job can force many short jobs to wait behind it + +Interview note: FCFS is fair in arrival order, but not fair in terms of responsiveness. + +### Shortest Job First (SJF) + +SJF picks the job with the smallest CPU burst. 
+ +How it works: + +- Classic SJF is non-preemptive +- If exact burst lengths were known, it minimizes average waiting time + +Strengths: + +- Excellent theoretical average waiting time + +Weaknesses: + +- Real systems rarely know the future burst length exactly +- Long jobs can starve if short jobs keep arriving + +### Shortest Remaining Time First (SRTF) + +SRTF is the preemptive version of SJF. + +How it works: + +- If a newly arrived task has a shorter remaining burst than the currently running one, the scheduler preempts + +Strengths: + +- Better response for short jobs than non-preemptive SJF + +Weaknesses: + +- More context-switch overhead +- Still depends on burst estimation + +### Round Robin (RR) + +Round Robin gives each runnable task a time slice, often called a quantum. + +How it works: + +- Preemptive +- Each task gets CPU for at most one quantum +- If it does not finish, it goes to the back of the ready queue + +Strengths: + +- Good response time for interactive systems +- Prevents one task from monopolizing the CPU + +Weaknesses: + +- Too small a quantum increases context-switch overhead +- Too large a quantum makes it behave more like FCFS + +Interview note: the key tuning parameter is the time quantum. That parameter determines the tradeoff between responsiveness and overhead. + +### Priority Scheduling + +Priority scheduling picks the highest-priority runnable task. 
+ +How it works: + +- Can be preemptive or non-preemptive +- Priorities may be static or dynamic + +Strengths: + +- Lets critical or latency-sensitive work run sooner +- Useful in systems with service classes or real-time priorities + +Weaknesses: + +- Starvation risk for low-priority tasks + +Common fix: + +- Aging gradually increases the priority of waiting tasks so they eventually run + +### Scheduling algorithm comparison + +| Algorithm | Preemptive | Main strength | Main weakness | Good fit | +| --- | --- | --- | --- | --- | +| FCFS | No | Simplicity | Convoy effect | Very simple batch workloads | +| SJF | No | Great average waiting time in theory | Needs burst prediction | Controlled batch-style systems | +| SRTF | Yes | Excellent for short jobs | Higher overhead, starvation risk | Short-job-heavy workloads | +| Round Robin | Yes | Good responsiveness | Quantum tuning matters | Time-sharing and interactive systems | +| Priority | Either | Favors important tasks | Starvation risk | Systems with service differentiation | + +### What modern Linux does + +Linux does not use plain FCFS or Round Robin for normal tasks. For years the default was the Completely Fair Scheduler (CFS), which approximates fairness by tracking virtual runtime: the task that has received the least fair share of CPU tends to run next. Since kernel 6.6 the default is EEVDF, which keeps the virtual-runtime fairness idea and adds a per-task virtual deadline to decide who runs next. + +High-level intuition: + +- CPU-hungry tasks accumulate runtime quickly +- Tasks that sleep often, such as interactive or I/O-heavy ones, do not accumulate runtime while sleeping +- When they wake up, they often get CPU relatively quickly + +This is one reason interactive systems feel responsive even under load. + +## 8. Preemptive vs Non-Preemptive Scheduling + +### Non-preemptive scheduling + +Once a task gets the CPU, it keeps it until it finishes, blocks, or voluntarily yields.
+ +Pros: + +- Simpler implementation +- Lower context-switch overhead + +Cons: + +- Poor responsiveness +- A long-running job can delay everyone else + +### Preemptive scheduling + +The OS can interrupt a running task and give the CPU to another runnable task. + +Pros: + +- Better responsiveness +- Better support for fairness and latency-sensitive work + +Cons: + +- More scheduler complexity +- More context-switch overhead +- More concurrency hazards inside kernels and runtimes + +### Interview summary + +Preemption improves responsiveness and fairness, especially in multi-user and interactive systems. Non-preemptive scheduling is simpler and sometimes easier to reason about, but it performs poorly when short tasks sit behind long ones. + +## 9. CPU-Bound vs I/O-Bound Processes + +This distinction explains a lot about scheduler behavior. + +### CPU-bound process + +A CPU-bound process spends most of its time doing computation. It has long CPU bursts and relatively little waiting for I/O. + +Examples: + +- Compression +- Video encoding +- Large numerical workloads +- Data transformation jobs + +### I/O-bound process + +An I/O-bound process spends much of its time waiting on disk, network, or other external events. It has short CPU bursts and frequent waits. + +Examples: + +- Web servers waiting on sockets +- Database clients waiting for query results +- Log processors waiting on disk or network streams + +### Why the distinction matters + +- CPU-bound tasks benefit from throughput-oriented scheduling and cache locality. +- I/O-bound tasks benefit from quick wakeup and good response time. +- A good general-purpose scheduler tries not to let CPU-bound work starve interactive or I/O-heavy tasks. + +### Backend-system intuition + +Many backend services are mostly I/O-bound at the request level. They parse a request, hit storage or another service, wait, and resume. That is why event-driven systems and efficient wakeup behavior matter so much in real server software. 
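To see why a long CPU-bound job hurts short tasks under some policies, it helps to compute waiting times by hand. The sketch below is a toy simulation under simplifying assumptions: all jobs arrive at time 0, burst lengths are known in advance, and context-switch overhead is ignored. The function names and the burst values (one long job of 24 units, two short jobs of 3 units) are chosen for illustration only.

```python
from collections import deque


def fcfs_waiting_times(bursts):
    """FCFS: each job waits for every job that arrived before it."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits


def sjf_waiting_times(bursts):
    """Non-preemptive SJF: shortest burst runs first; results keep input order."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    waits, clock = [0] * len(bursts), 0
    for i in order:
        waits[i] = clock
        clock += bursts[i]
    return waits


def round_robin_waiting_times(bursts, quantum):
    """Round Robin: waiting time is completion time minus the job's own burst."""
    remaining = list(bursts)
    queue = deque(range(len(bursts)))
    waits, clock = [0] * len(bursts), 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)  # unfinished job goes to the back of the queue
        else:
            waits[i] = clock - bursts[i]
    return waits


def average(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    bursts = [24, 3, 3]  # one long CPU-bound job followed by two short ones
    print("FCFS:", fcfs_waiting_times(bursts), average(fcfs_waiting_times(bursts)))
    print("SJF: ", sjf_waiting_times(bursts), average(sjf_waiting_times(bursts)))
    rr = round_robin_waiting_times(bursts, quantum=4)
    print("RR:  ", rr, average(rr))
```

With these inputs FCFS makes the two short jobs wait 24 and 27 units behind the long one (the convoy effect), SJF brings the average wait down to 3, and Round Robin with a quantum of 4 lands in between while bounding how long any job waits for its first turn.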
+ +## 10. Inter-Process Communication (IPC) + +Processes are isolated by design, so the OS provides explicit communication mechanisms. + +IPC is used for: + +- Data exchange +- Coordination +- Event notification +- Work distribution +- Accessing services across process boundaries + +### Two broad IPC models + +- Shared memory: processes map a common memory region and communicate by reading and writing the same bytes +- Message passing: the OS or runtime moves discrete messages between processes + +### IPC decision view + +```mermaid +flowchart TD + A["Need communication between execution units"] --> B{"Same address space?"} + B -->|Yes| C["Threads: shared memory by default\nneed synchronization"] + B -->|No| D{"Same machine?"} + D -->|Yes| E["Pipes / FIFOs / shared memory / message queues / Unix sockets"] + D -->|No| F["Network sockets / RPC / messaging systems"] +``` + +## 11. Shared Memory vs Message Passing + +### Shared memory + +With shared memory, two or more processes map the same physical memory pages into their virtual address spaces. + +Strengths: + +- Very fast for large data exchange +- Avoids repeated kernel copying after setup + +Weaknesses: + +- Harder to program correctly +- Requires synchronization to avoid races and corruption +- Debugging becomes more difficult + +Common use cases: + +- High-performance analytics pipelines +- Multimedia systems +- Shared in-memory caches on the same host + +### Message passing + +With message passing, processes send discrete messages through kernel-managed mechanisms or runtime-managed queues. 
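As a small illustration of data moving between two isolated processes through a kernel-managed channel, the sketch below has a parent send text to a child over a pipe and read the reply back. It is an assumption-laden example: the child is simply another Python interpreter started with `subprocess`, the helper name `ask_child` is invented here, and a pipe is strictly a byte stream rather than a framed message queue, so real protocols add their own message boundaries.

```python
import subprocess
import sys

# Program run by the child: read everything from stdin, reply in upper case.
CHILD_PROGRAM = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def ask_child(message):
    """Send a message to a child process over a pipe and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=message,          # written to the child's stdin pipe
        capture_output=True,    # child's stdout/stderr come back through pipes
        text=True,
        check=True,             # raise if the child exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    print(ask_child("hello from parent"))
```

Neither process can touch the other's memory; the only shared thing is the channel, which is what makes ownership and failure boundaries easier to reason about.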
+ +Strengths: + +- Cleaner isolation +- Easier reasoning about ownership +- Usually simpler failure boundaries + +Weaknesses: + +- More copying and syscall overhead in many cases +- Message size and serialization can matter + +Common use cases: + +- Microservices +- Worker queues +- Actor-style systems +- Parent-child control channels + +### Comparison + +| Topic | Shared Memory | Message Passing | +| --- | --- | --- | +| Performance | Often faster for large local data | Often simpler but can add copy/serialization overhead | +| Synchronization | Required explicitly | Often built into the communication model | +| Isolation | Weaker | Stronger | +| Complexity | Higher | Lower to moderate | +| Typical scope | Same machine | Same machine or across network | + +Interview framing: shared memory is usually about performance; message passing is usually about simplicity, isolation, and explicit communication. + +## 12. Pipes, Named Pipes, Sockets, and Message Queues + +### Pipes + +A pipe is a unidirectional byte stream, commonly used between related processes such as parent and child processes. + +Key properties: + +- Kernel-managed buffer +- Often used with `fork()` +- Traditional Unix shell pipelines use anonymous pipes + +Example: + +- `ps aux | grep python` connects the output of one process to the input of another through a pipe + +### Named Pipes (FIFOs) + +A named pipe is like a pipe with a filesystem name. + +Key properties: + +- Unrelated processes can open it by name +- Still typically local to one machine +- Useful for simple producer-consumer communication + +### Sockets + +Sockets are a general communication endpoint. 
+ +Types you should know: + +- Unix domain sockets: efficient IPC on the same machine +- TCP sockets: reliable communication across a network +- UDP sockets: connectionless communication with lower overhead and weaker delivery guarantees + +Why sockets matter in interviews: + +- They connect OS process management to real backend systems +- Most network services ultimately communicate through sockets + +### Message Queues + +Message queues store discrete messages, often with ordering and notification semantics. + +Key properties: + +- Decouple sender and receiver +- Can support asynchronous communication +- OS-level queues exist, and distributed systems also use message brokers such as Kafka or RabbitMQ at a higher layer + +Interview note: OS message queues and distributed message brokers are conceptually related but not the same thing. + +## 13. Signals and Semaphores + +These are often mentioned together, but they solve different problems. + +### Signals + +A signal is an asynchronous notification sent to a process or thread. + +Common Linux signals: + +- `SIGTERM`: polite request to terminate +- `SIGKILL`: immediate kill, cannot be caught or ignored +- `SIGINT`: interrupt from terminal, commonly Ctrl+C +- `SIGCHLD`: child process changed state + +Important properties: + +- Signals are not a good mechanism for transferring large data +- Signal handlers run asynchronously, so only async-signal-safe operations are safe there +- They are often used for control, shutdown, reload, or notification + +Real-world example: + +- Nginx can receive signals to reload configuration or stop workers gracefully + +### Semaphores + +A semaphore is a synchronization primitive used to control access to shared resources. 
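A minimal sketch of that idea, using threads within one process rather than separate processes: a semaphore initialized to a small count caps how many workers can be inside a section at once. The function name `run_limited_workers` and the sleep that stands in for real work are assumptions made for this example.

```python
import threading
import time


def run_limited_workers(worker_count, max_concurrent):
    """Start worker_count threads but let at most max_concurrent run the guarded section."""
    slots = threading.Semaphore(max_concurrent)
    state_lock = threading.Lock()
    stats = {"active": 0, "peak": 0}

    def worker():
        with slots:  # blocks while max_concurrent workers already hold a slot
            with state_lock:
                stats["active"] += 1
                stats["peak"] = max(stats["peak"], stats["active"])
            time.sleep(0.01)  # stand-in for using the limited resource
            with state_lock:
                stats["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stats["peak"]


if __name__ == "__main__":
    print("peak concurrent workers:", run_limited_workers(6, 2))
```

The observed peak never exceeds the semaphore's initial value, no matter how many threads are started.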
+ +Two types: + +- Binary semaphore: value is effectively 0 or 1, similar in spirit to a lock +- Counting semaphore: value can be greater than 1, useful when a finite number of identical resources exist + +Use cases: + +- Limit concurrency to N workers +- Coordinate producer-consumer pipelines +- Protect access to shared data structures + +Important clarification: + +- A semaphore is mainly about synchronization and coordination +- A signal is mainly about asynchronous notification + +## 14. Parent and Child Processes + +Processes often form hierarchies. + +### Parent-child relationship + +When one process creates another, the creator is the parent and the new process is the child. + +In Unix-like systems, a child inherits many attributes from the parent, such as: + +- Environment +- Open file descriptors +- Current working directory +- Credentials and limits + +### Linux model: `fork()` and `exec()` + +The classic Unix pattern is: + +1. Parent calls `fork()` +2. Kernel creates a child process +3. Child often calls `exec()` to replace its memory image with a new program +4. Parent may call `wait()` or `waitpid()` to collect the child's exit status + +This separation is a major OS design idea: + +- `fork()` duplicates the current process state +- `exec()` replaces that state with a new program image + +### Copy-on-write optimization + +Modern Unix systems do not eagerly copy every memory page during `fork()`. They use copy-on-write. + +That means: + +- Parent and child initially share the same physical pages as read-only +- Only when one side writes to a page does the kernel create a private copy + +This makes `fork()` practical even for large processes. + +### Parent-child lifecycle diagram + +```mermaid +flowchart LR + P["Parent process"] --> F["fork()"] + F --> C["Child process"] + C --> E["exec() optional\nreplace program image"] + C --> X["exit(status)"] + P --> W["wait()/waitpid()"] + X --> W +``` + +## 15. 
Process Creation and Termination + +### Process creation + +When a process is created, the OS typically: + +- Allocates a PCB or equivalent task structure +- Assigns a PID +- Sets up memory mappings or address-space references +- Initializes registers and stack for the first instruction +- Places the new task in the ready queue + +### Process termination + +A process may terminate because: + +- It returns from `main` +- It calls `exit()` +- It receives a fatal signal +- The OS or an administrator kills it +- It crashes due to an exception + +Termination involves: + +- Releasing memory and kernel resources +- Closing or decrementing references to open resources +- Recording the exit status for the parent +- Notifying the parent if needed + +## 16. Zombie and Orphan Processes + +These are classic interview questions. + +### Zombie process + +A zombie is a process that has finished execution, but whose parent has not yet collected its exit status with `wait()` or `waitpid()`. + +Important detail: + +- The zombie is not really running anymore +- Most resources are already released +- A small process-table entry remains so the parent can read the exit status + +Why zombies matter: + +- If a parent never reaps children, zombie entries accumulate and consume process-table slots + +### Orphan process + +An orphan is a child whose parent exits before the child does. + +What happens next: + +- The orphan is adopted by a system reaper process, historically `init`, and on many Linux systems effectively managed under `systemd` +- The new parent eventually reaps it when it exits + +### Interview distinction + +- Zombie: child is dead, parent is still alive, exit status not yet collected +- Orphan: child is alive, parent is dead + +This distinction is asked constantly, so answer it precisely. + +## 17. Multithreading Basics + +Multithreading means using multiple threads of execution within a process. + +### Why use multithreading? 
+ +- Improve throughput on multi-core CPUs +- Overlap waiting with useful work +- Keep applications responsive +- Separate responsibilities such as request handling, background work, and monitoring + +### Benefits + +- Lower creation and communication cost than processes +- Shared memory makes cooperation fast +- Fits server workloads with many concurrent activities + +### Risks + +- Race conditions +- Deadlocks +- False sharing and cache contention +- Harder debugging and reproducibility + +### Backend examples + +- A Java web server may use a thread pool to process requests +- A database engine may use dedicated background threads for flushing, compaction, or replication +- A runtime may use one thread for networking and others for CPU-heavy work + +## 18. User Threads vs Kernel Threads + +This topic is important because it connects thread abstraction to actual scheduling. + +### Kernel threads + +Kernel threads are visible to the operating system scheduler. + +Properties: + +- The kernel can schedule them directly on CPUs +- If one blocks in the kernel, other kernel threads of the process can still run +- They usually have higher creation and switch overhead than pure user threads + +### User threads + +User threads are managed by a user-space runtime or library. 
+ +Properties: + +- Very fast to create and switch in many designs +- Scheduler logic can be customized in user space +- A blocking system call can stall progress if the model maps many user threads onto one kernel thread + +### Common mapping models + +- 1:1: each user thread maps to one kernel thread +- N:1: many user threads map to one kernel thread +- M:N: many user threads multiplex over several kernel threads + +### Tradeoffs + +| Model | Strength | Weakness | +| --- | --- | --- | +| Kernel threads | True parallelism and better blocking behavior | More kernel overhead | +| User threads | Fast user-space scheduling | Blocking and multicore limitations in simple models | + +### Real-world view + +- Most mainstream runtimes on Linux today rely heavily on kernel threads +- Some runtimes add lightweight user-space scheduling on top, such as goroutines in Go +- Go still uses kernel threads underneath, but the runtime multiplexes many goroutines onto them + +## 19. Linux and Modern Backend Systems + +### Linux process model + +Linux internally represents processes and threads using closely related task structures. From the kernel's point of view, threads are largely tasks that share selected resources such as memory mappings and file tables. + +That is why low-level Linux APIs such as `clone()` are central to thread creation. Different sharing flags determine what is shared. + +### Common production patterns + +#### Pre-fork servers + +Some servers create a pool of worker processes up front. + +Why this is useful: + +- Fault isolation between workers +- Predictable memory layout +- Simple concurrency model + +Examples: + +- Older Apache models +- Gunicorn worker processes + +#### Thread pools + +Many application servers maintain a fixed or elastic pool of threads. 
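A bounded pool can be sketched in a few lines with Python's `concurrent.futures`. The handler below is hypothetical: `handle_request` and `serve` are names invented for this example, and the short sleep stands in for waiting on a database or downstream service.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def handle_request(request_id):
    """Pretend to serve one request; the sleep stands in for I/O wait."""
    time.sleep(0.01)
    return f"response-{request_id}"


def serve(request_ids, pool_size=4):
    """Process requests on a fixed-size pool instead of one new thread per request."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    print(serve(range(8)))
```

Because the pool size is fixed, extra requests queue up rather than spawning more threads, which is where the bounded concurrency comes from.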
+ +Why this is useful: + +- Avoids thread creation cost per request +- Limits concurrency to something the system can handle +- Provides backpressure when the pool is saturated + +Examples: + +- Java servlet containers +- C++ RPC servers + +#### Event-driven systems + +Some high-concurrency systems avoid one-thread-per-request and instead use event loops. + +Why this is useful: + +- Handles many I/O-bound connections efficiently +- Reduces context-switch and stack overhead + +Examples: + +- Nginx +- Node.js plus libuv +- Redis single-threaded command execution with I/O multiplexing + +### Practical interview insight + +When choosing between processes, threads, and event loops, the answer is usually not theoretical purity. It is about workload shape: + +- Need strong isolation: lean toward processes +- Need easy shared state and moderate concurrency: lean toward threads +- Need huge numbers of mostly idle connections: lean toward event-driven models + +## 20. Common Interview Questions and Practical Scenarios + +### 1. What is the difference between a process and a thread? + +Strong answer: + +A process is an isolated resource container with its own virtual address space. A thread is a schedulable execution path inside a process. Threads share process memory and resources, which makes them cheaper to create and communicate through, but also makes them less isolated. + +### 2. Why is context switching expensive? + +Strong answer: + +Because the system must save and restore execution state, run scheduling logic, and often lose cache and TLB locality. The indirect cache effects are often more expensive than the raw register save and restore. + +### 3. Why can Round Robin improve responsiveness? + +Strong answer: + +Because every runnable task gets CPU time within a bounded interval, rather than waiting for a long task to finish. That makes interactive workloads feel responsive, assuming the quantum is chosen well. + +### 4. What is the convoy effect? 
+ +Strong answer: + +In FCFS, a long CPU-bound task at the front of the queue can force many short or I/O-heavy tasks to wait behind it, reducing system responsiveness and utilization. + +### 5. What is the difference between a zombie and an orphan? + +Strong answer: + +A zombie has already exited but still has an unreaped process-table entry. An orphan is still running but has lost its parent; it gets adopted by the system reaper. + +### 6. When would you use processes instead of threads? + +Strong answer: + +When fault isolation, security boundaries, or independent resource limits matter more than cheap shared-memory communication. Multi-tenant services, worker isolation, and plugin sandboxes are common examples. + +### 7. When would you use shared memory instead of message passing? + +Strong answer: + +When processes on the same machine need very high-throughput, low-copy data exchange and you can handle the synchronization complexity. Otherwise, message passing is often simpler and safer. + +### 8. Why are I/O-bound tasks often favored in practice? + +Strong answer: + +Because short bursts and quick wakeups improve latency for interactive users and servers. A scheduler that only optimizes raw throughput can make the system feel slow even if utilization looks good. + +### 9. What happens during `fork()` and `exec()`? + +Strong answer: + +`fork()` creates a child based on the parent, usually using copy-on-write for memory efficiency. `exec()` then replaces the current process image with a new program while typically keeping the same PID. + +### 10. Why do backend systems often use thread pools? + +Strong answer: + +They amortize thread creation cost, bound concurrency, and provide a place to enforce resource control and backpressure. + +## 21. 
Practical Scenarios Interviewers Like + +### Scenario: API server under high latency to downstream services + +What matters: + +- Requests become I/O-bound +- One-thread-per-request can work, but large concurrency may increase memory and scheduling overhead +- Event-driven or async designs may scale better for waiting-heavy workloads + +Good interview discussion: + +Talk about how waiting dominates CPU bursts, why schedulers can keep the CPU busy with other work, and why thread pools or async I/O can reduce overhead. + +### Scenario: CPU-heavy image processing service + +What matters: + +- Work is CPU-bound +- Number of active workers should roughly track available cores +- Excessive threading may hurt due to context switching and cache contention + +Good interview discussion: + +Explain that for CPU-bound workloads, more concurrency than available cores often hurts throughput. Bounded worker pools and process isolation may both be sensible, depending on memory and fault-tolerance needs. + +### Scenario: Parent spawns workers but never waits for them + +What matters: + +- Exited workers become zombies +- Process-table entries accumulate +- Fix is to reap children, often with `waitpid()` or a `SIGCHLD` handling strategy + +### Scenario: Two services on the same machine need low-latency communication + +What matters: + +- Unix domain sockets are often a strong default +- Shared memory may be faster for large payloads but requires careful synchronization +- Pipes are simple but less flexible for general bidirectional service communication + +## 22. Common Mistakes in Interviews + +- Saying a process is just a program on disk. A process is a running program plus OS-managed execution state and resources. +- Saying threads are independent like processes. They are not; they share address space and many resources. +- Confusing mode switches with context switches. +- Forgetting that SJF is mainly theoretical unless you can estimate burst lengths. 
+- Forgetting starvation as a tradeoff in SJF and priority scheduling. +- Saying zombies are running processes. They are not running; they are already dead. +- Treating semaphores and signals as interchangeable. They solve different problems. + +## 23. How to Build Strong Interview Answers + +When answering process-management questions, aim for this structure: + +1. Define the concept precisely +2. Explain the tradeoff +3. Give a systems example +4. Mention one practical failure mode or performance implication + +For example, for threads vs processes: + +- Definition: processes isolate memory, threads share it +- Tradeoff: processes give isolation, threads give cheaper communication +- Example: PostgreSQL uses processes; many app servers use thread pools +- Failure mode: memory corruption in one thread can crash the whole process + +That answer shape is usually stronger than a short textbook definition. + +## 24. Final Mental Model + +If you remember only one picture, remember this: + +- Processes are about isolation and resource ownership +- Threads are about execution inside that resource container +- Scheduling is about deciding who gets CPU time +- Context switching is the cost of moving among runnable tasks +- IPC is how isolated tasks cooperate +- Real systems choose among processes, threads, and event loops based on workload shape, not ideology + +That mental model is enough to connect interview theory to actual Linux behavior and backend system design. 
diff --git a/os/storage.md b/os/storage.md new file mode 100644 index 0000000..d1bb235 --- /dev/null +++ b/os/storage.md @@ -0,0 +1,2 @@ +File Systems +Disk Scheduling diff --git a/os/systemOperations.md b/os/systemOperations.md new file mode 100644 index 0000000..5a5589f --- /dev/null +++ b/os/systemOperations.md @@ -0,0 +1,1108 @@ +# System Operations and OS Internals for Interviews + +This guide is written for software engineers who already build and debug real systems but want a stronger operating-systems mental model for interviews. The focus is not only on definitions, but on what actually happens when software crosses the boundary into the operating system, how hardware and the kernel cooperate, and how these ideas show up in Linux backend systems. + +--- + +## 1. Why This Topic Matters + +Most application code runs in a protected, abstracted environment. You write to a socket, read a file, allocate memory, create a thread, or wait on a timer, and it feels like a normal function call. Underneath that API, the operating system is enforcing protection, multiplexing hardware, handling interrupts, programming devices, managing memory, and deciding which thread gets CPU time. + +Interviewers ask these topics because they reveal whether you understand: + +- where the application boundary ends and the OS boundary begins, +- why some operations are cheap and others are expensive, +- how blocking, I/O, and scheduling interact, +- how Linux servers actually spend their time, +- and how the kernel preserves isolation and security. + +If you understand the flow from user code to hardware and back, a lot of unrelated-looking interview questions become much easier. + +--- + +## 2. Big Picture: What the OS Actually Does + +An operating system is the privileged software layer that sits between applications and hardware. It provides a controlled way to use CPUs, memory, storage, devices, and networking. 
+ +At a high level, the OS is responsible for: + +- Process and thread management +- Memory management +- I/O and device management +- File systems +- Scheduling +- Protection and isolation +- Interrupt handling +- Resource accounting and policy decisions + +An application generally cannot touch hardware directly. Instead, it asks the OS to perform privileged work on its behalf. + +--- + +## 3. User Mode vs Kernel Mode + +One of the most important OS concepts is that the CPU runs code in different privilege levels. + +### User Mode + +Most application code runs in user mode. + +In user mode: + +- Code cannot execute privileged instructions. +- Code cannot directly access arbitrary physical memory. +- Code cannot directly reprogram devices or interrupt tables. +- Code must request OS services through controlled entry points. + +This protects the system from buggy or malicious applications. If any process could directly write page tables, reconfigure the disk controller, or disable interrupts, the entire machine would be unstable and insecure. + +### Kernel Mode + +The kernel runs in a more privileged CPU mode. + +In kernel mode: + +- The kernel can execute privileged instructions. +- The kernel can manage page tables and MMU state. +- The kernel can program devices and install interrupt handlers. +- The kernel can inspect and manipulate process state. + +Kernel mode is powerful, but dangerous. A kernel bug is much more serious than a user-space bug because it can crash the system or violate isolation. + +### Why the Separation Exists + +The OS relies on hardware support to enforce this boundary. The CPU, MMU, and page tables together make sure a user process cannot simply decide to access kernel memory or execute privileged instructions. + +This boundary is the foundation of protection. 
+ +```mermaid +flowchart TD + A[User Process in User Mode] -->|system call or fault| B[Controlled CPU transition] + B --> C[Kernel Mode] + C --> D[Kernel validates request] + D --> E[Kernel performs privileged work] + E --> F[Return to user mode] + F --> A + A -. cannot directly .-> G[Device registers] + A -. cannot directly .-> H[Page tables] + A -. cannot directly .-> I[Interrupt controller] +``` + +### Interview framing + +A strong concise answer is: + +> User mode is the restricted execution mode for applications. Kernel mode is the privileged mode where the OS can manage hardware and system-wide resources. The boundary exists so the machine can enforce isolation, safety, and access control. + +--- + +## 4. Privileged Instructions + +Privileged instructions are CPU instructions that can only be executed in kernel mode or another sufficiently privileged mode. + +Examples include instructions that: + +- modify page tables or MMU configuration, +- disable or enable interrupts, +- access device control registers, +- install interrupt descriptor tables, +- switch certain processor control registers, +- halt or reboot the machine. + +If user code tries to execute one of these instructions, the CPU raises an exception rather than allowing it. + +### Why this matters + +Without privileged instructions, any user process could: + +- bypass memory isolation, +- intercept device traffic, +- block interrupts and freeze progress, +- or read or modify another process's memory. + +So the hardware does not merely rely on the kernel being polite. It enforces privilege checks. + +--- + +## 5. Protection Context and Security Boundaries + +When interviewers ask about protection, they are usually probing whether you understand what exactly is being isolated and how. + +### Main protection boundaries + +#### 1. User space vs kernel space + +This is the main privilege boundary. User code cannot directly perform privileged operations; it must go through the kernel. + +#### 2. 
Process vs process + +Each process typically has its own virtual address space. Process A cannot directly read or write process B's memory unless the OS explicitly allows sharing. + +#### 3. File and device permissions + +The kernel enforces ownership, permissions, capabilities, ACLs, and namespace boundaries. + +#### 4. Execution identity + +Every request arrives with a protection context such as: + +- user ID and group IDs, +- capabilities, +- current namespace and cgroup context, +- open file descriptors, +- current memory map, +- current working directory and root context. + +The kernel uses this context when deciding whether an operation is allowed. + +### Example + +Suppose a backend service calls `open("/etc/shadow", O_RDONLY)`. + +The kernel does not ask whether the function call exists. It asks whether the current process identity and security context are allowed to perform that operation on that inode. The check is enforced by the kernel, not by the application. + +### The role of the MMU + +Memory protection is heavily supported by hardware: + +- Each process gets virtual memory mappings. +- Page tables mark pages as readable, writable, executable, user-accessible, or kernel-only. +- The MMU translates virtual addresses to physical addresses and enforces access rules. + +So process isolation is not just a software convention. It is a hardware-backed protection boundary. + +--- + +## 6. System Calls + +System calls are the controlled interface through which user-space programs request kernel services. + +Typical examples: + +- `read`, `write`, `open`, `close` +- `fork`, `execve`, `wait` +- `mmap`, `brk` +- `socket`, `bind`, `listen`, `accept`, `connect` +- `epoll_wait` +- `ioctl` + +### System call vs normal function call + +A normal function call stays within the process and the same privilege level. 
+ +A system call crosses into the kernel and usually involves: + +- a privilege transition, +- register convention for syscall number and arguments, +- CPU state save/restore, +- kernel validation and dispatch, +- possible blocking or scheduling, +- and a return path back to user mode. + +This is why system calls are much more expensive than pure user-space function calls. + +### Why libc wrappers exist + +In Linux, user programs often call libc functions such as `read()` or `open()`. Those are wrappers. At some point the wrapper issues the actual syscall instruction and enters the kernel. + +Historically, x86 Linux used `int 0x80`. Modern x86-64 Linux typically uses `syscall`, which is faster and designed for this purpose. + +--- + +## 7. What Happens When a Program Requests OS Services + +This is one of the most important end-to-end interview flows to understand. + +Suppose a program calls `read(fd, buf, 4096)`. + +### Step-by-step view + +1. User code prepares arguments. + The file descriptor, buffer pointer, and length are placed in registers or the stack according to the calling convention and syscall ABI. + +2. A syscall instruction is executed. + The CPU performs a controlled transition from user mode to kernel mode. + +3. CPU switches to kernel execution context. + The CPU saves enough state to resume later, loads the kernel entry path, and begins running kernel code. + +4. Kernel identifies the syscall. + A syscall number selects the correct kernel handler from the syscall table. + +5. Kernel validates the request. + It checks that the file descriptor is valid, the user buffer is accessible, permissions are valid, and the arguments are well-formed. + +6. Kernel performs the operation. + It may satisfy the read from a page cache, a socket buffer, or may need to ask a device driver and possibly block the process until data is available. + +7. Kernel prepares the return value. + The result or error code is placed in a register. + +8. CPU returns to user mode. 
+ User code resumes after the syscall instruction. + +9. libc wrapper may translate kernel error return to `errno`. + +### Important interview point + +The application does not jump into arbitrary kernel code. The transition happens only through hardware-controlled entry paths using designated instructions and entry tables. + +```mermaid +sequenceDiagram + participant U as User Code + participant L as libc Wrapper + participant C as CPU + participant K as Kernel + participant D as Driver or Device + + U->>L: call read(fd, buf, n) + L->>C: execute syscall instruction + C->>K: switch to kernel mode and enter syscall handler + K->>K: validate fd, buffer, permissions + alt data already available + K->>K: copy data to user buffer + else need device or network progress + K->>D: request I/O or wait for completion + D-->>K: completion event or data ready + K->>K: copy result and set return value + end + K-->>C: return-from-syscall + C-->>L: resume user mode + L-->>U: bytes read or -1 with errno +``` + +--- + +## 8. System Call Flow: User Space to Kernel Space + +It helps to remember system call flow in three layers. + +### Layer 1: API layer + +User code calls a familiar interface like `open`, `send`, or `fork`. + +### Layer 2: ABI and CPU transition + +Arguments are placed where the kernel expects them. A special instruction triggers the transition. + +### Layer 3: Kernel service path + +The kernel dispatches to the correct subsystem: + +- VFS for files, +- scheduler for process and thread changes, +- network stack for sockets, +- memory manager for `mmap`, +- block layer for storage I/O, +- device drivers for hardware-specific work. + +### Important kernel checks + +The kernel generally must: + +- check the process identity and permissions, +- copy or validate user pointers, +- enforce resource limits, +- preserve isolation, +- possibly sleep the thread if the operation cannot complete immediately. 
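The validation step in the checks above can be observed from user space. The following is a minimal sketch, assuming a Linux or other POSIX system; `try_read` is a hypothetical helper introduced only for illustration. It issues `read()` on three descriptors and records whether the kernel served the request or rejected it before doing any I/O.

```python
import errno
import os


def try_read(fd: int, nbytes: int = 16):
    """Attempt a read() system call and report how the kernel responded."""
    try:
        return ("ok", os.read(fd, nbytes))
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. "EBADF".
        return ("error", errno.errorcode.get(exc.errno, str(exc.errno)))


# A pipe gives us two valid descriptors without touching the filesystem.
read_end, write_end = os.pipe()
os.write(write_end, b"hello")

valid_result = try_read(read_end)      # valid descriptor, open for reading
wrong_direction = try_read(write_end)  # descriptor exists but is write-only

os.close(read_end)
os.close(write_end)
closed_result = try_read(read_end)     # descriptor no longer refers to anything

print(valid_result, wrong_direction, closed_result)
```

On POSIX systems the two rejected calls are expected to fail with `EBADF`. The point is that the kernel, not the application or libc, decides whether the descriptor is acceptable for the requested operation.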
+ +### Why copying matters + +Kernel code cannot blindly trust a user pointer. That pointer belongs to user space. The kernel has to validate access and usually copy data using controlled helper routines. Otherwise, a process could trick the kernel into reading or writing invalid memory. + +--- + +## 9. Interrupts + +An interrupt is a signal that causes the CPU to stop its current flow of execution and run a handler for an event. + +Interrupts are a core reason the OS can respond to external events without constantly busy-waiting. + +### What interrupts are for + +Common reasons for interrupts: + +- a network card received a packet, +- a disk completed an I/O request, +- a timer fired, +- a keyboard event occurred, +- an inter-processor signal was sent, +- or software intentionally triggered a protected control transfer. + +### Key idea + +Interrupts let hardware and low-level software notify the CPU that attention is needed. + +--- + +## 10. Hardware Interrupts vs Software Interrupts + +Interview discussions often mix these terms loosely, so it helps to be precise. + +### Hardware Interrupts + +These originate from hardware devices or controllers. + +Examples: + +- NIC signals packet arrival +- disk controller signals I/O completion +- timer chip signals time slice expiration + +Properties: + +- generally asynchronous relative to the currently running instruction stream, +- arrive from outside the current program, +- handled by kernel interrupt handlers. + +### Software Interrupts + +This term is used in two related ways. + +#### Historical meaning + +An instruction such as `int` on x86 deliberately causes a controlled transfer to a privileged handler. + +#### Broader interview meaning + +People sometimes use it loosely to refer to synchronous control transfers caused by software, including system calls, traps, and exceptions. 
+ +### Safer wording in interviews + +It is often better to say: + +- hardware interrupts are asynchronous events from devices, +- traps and exceptions are synchronous events caused by the current instruction stream, +- and system calls are controlled synchronous entries into the kernel. + +That phrasing is more precise and avoids architecture-specific confusion. + +--- + +## 11. Traps and Exceptions + +Traps and exceptions are synchronous events related to the current instruction being executed. + +### Exception + +An exception occurs when the CPU detects a condition while executing an instruction. + +Examples: + +- divide by zero, +- invalid opcode, +- page fault, +- general protection fault. + +### Trap + +In interview usage, a trap is often described as a deliberate, synchronous transfer to the kernel, such as a debugger breakpoint or a syscall-style software-triggered entry. + +### Useful refinement + +In lower-level architecture discussions, exceptions are often subdivided into: + +- faults: potentially restartable events, such as page faults, +- traps: reported after the instruction, often used for breakpoints or intentional transitions, +- aborts: serious failures that are not meaningfully restartable. + +You do not always need that level of detail, but it helps if the interviewer is very systems-oriented. + +### Example: page fault + +A page fault is not inherently a crash. + +When a process accesses a virtual page that is not currently mapped in RAM but is valid, the CPU raises a page fault exception, the kernel loads or maps the page, updates page tables, and then resumes the instruction. + +If the access is invalid, the kernel may send a signal such as `SIGSEGV` to the process. + +This is a good example of how an exception can be part of normal control flow. + +--- + +## 12. Interrupt Handling Flow + +You should understand the general shape, even if you do not memorize architecture-specific registers. + +### Typical flow + +1. 
An interrupt or exception occurs. +2. CPU saves enough current execution state. +3. CPU switches to a privileged handler path. +4. Kernel identifies the interrupt or exception vector. +5. A low-level handler runs. +6. The handler may acknowledge the device, record state, and schedule deferred work. +7. If necessary, the scheduler may run another thread before returning. +8. Eventually execution returns to some user or kernel context. + +### Why deferred work exists + +Interrupt handlers usually need to be fast. They often do the minimum urgent work and defer heavier processing to a later stage such as a softirq, tasklet, workqueue, kernel thread, or bottom-half style mechanism. + +That keeps interrupt latency low. + +```mermaid +flowchart TD + A[Device event or CPU exception] --> B[CPU saves current state] + B --> C[CPU enters privileged handler] + C --> D[Kernel identifies vector] + D --> E[Top-half or immediate handler] + E --> F[Acknowledge source and capture minimal state] + F --> G{More work needed?} + G -->|Yes| H[Schedule deferred processing] + G -->|No| I[Prepare return] + H --> I + I --> J{Need reschedule?} + J -->|Yes| K[Scheduler picks next runnable task] + J -->|No| L[Return to interrupted context] + K --> L +``` + +### Real Linux example + +For network receive: + +- NIC raises an interrupt, +- kernel handler acknowledges it, +- packet processing may be deferred using NAPI-style polling, +- packet eventually reaches the socket receive queue, +- a blocked process may be woken up. + +This is much more realistic than imagining the application directly talks to the NIC. + +--- + +## 13. I/O Management + +I/O is where the OS earns its keep. CPUs are fast, but devices are comparatively slow and unpredictable. The OS exists partly to hide those differences while keeping the system efficient. 
+ +### What the kernel does for I/O + +The kernel provides: + +- abstract interfaces such as files and sockets, +- buffering and caching, +- scheduling and queuing, +- synchronization and wake-up mechanisms, +- driver interaction, +- permission checks, +- and completion notification. + +### Main I/O path idea + +An application usually works with abstractions like: + +- file descriptor, +- pathname, +- socket, +- pipe, +- terminal, +- block device. + +The kernel translates those abstractions into device-specific work. + +--- + +## 14. Blocking vs Non-Blocking I/O + +These terms describe what the calling thread experiences. + +### Blocking I/O + +In blocking I/O, the call does not return until it can make meaningful progress or complete. + +Examples: + +- `read()` on a socket with no available data blocks until data arrives, +- `accept()` blocks until a connection is ready, +- `waitpid()` blocks until child state changes. + +When a thread blocks, the scheduler usually marks it non-runnable and runs something else. + +### Non-Blocking I/O + +In non-blocking I/O, the call returns immediately if it cannot proceed right now. + +For example, `read()` on a non-blocking socket may return `-1` with `EAGAIN` or `EWOULDBLOCK`. + +The application then decides whether to: + +- retry later, +- use `select`, `poll`, `epoll`, or `kqueue`, +- hand the work to an event loop, +- or queue it in some application scheduler. + +### Real backend example + +A high-concurrency web server usually cannot afford one OS thread per slow client connection. Instead, it uses non-blocking sockets plus a readiness notification API such as `epoll`. + +That lets one thread manage many connections efficiently. + +--- + +## 15. Synchronous vs Asynchronous I/O + +These terms are related to completion semantics, not just whether the thread blocks. + +### Synchronous I/O + +In synchronous I/O, the operation is conceptually tied to the calling thread. 
Completion is generally observed by waiting in that call path. + +Typical examples: + +- blocking `read()` and `write()`, +- `fsync()`, +- many simple file operations. + +### Asynchronous I/O + +In asynchronous I/O, the request is submitted and completion is delivered later through a separate notification path. + +Examples: + +- signal-based AIO, +- completion queues, +- `io_uring` completion entries, +- overlapped I/O on some platforms. + +### Important distinction + +Blocking vs non-blocking asks: does the thread wait right now? + +Synchronous vs asynchronous asks: how is completion reported and who owns the completion path? + +These are different axes. + +### Common interview trap + +People often say non-blocking I/O is the same as asynchronous I/O. It is not. + +You can have: + +- non-blocking synchronous-style APIs where you keep retrying or wait for readiness, +- asynchronous APIs that still require careful completion handling, +- and blocking APIs that are entirely synchronous. + +```mermaid +flowchart TD + A[I/O request] --> B{Does caller wait now?} + B -->|Yes| C[Blocking] + B -->|No| D[Non-blocking] + A --> E{How is completion observed?} + E -->|Same call path| F[Synchronous] + E -->|Later notification or CQ| G[Asynchronous] +``` + +--- + +## 16. Buffered vs Unbuffered I/O + +These terms ask whether data passes through kernel or library-managed buffers. + +### Buffered I/O + +Buffered I/O uses intermediate storage to smooth differences in producer and consumer speed. + +Examples: + +- stdio buffering in user space, +- kernel page cache for files, +- socket receive and send buffers, +- disk write buffering. + +Benefits: + +- fewer device accesses, +- better batching, +- better throughput, +- smoother interaction with slower devices. + +Costs: + +- extra copies, +- more memory usage, +- less immediate visibility of writes unless explicitly flushed. 
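The "less immediate visibility" cost can be seen with user-space buffering alone. This is a small sketch, assuming a buffer size of 8192 bytes and a temporary file chosen only for illustration. It writes fewer bytes than the buffer holds and compares the file size the kernel reports before and after an explicit flush.

```python
import os
import tempfile

# Create an empty scratch file; the path is arbitrary and removed at the end.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "wb", buffering=8192) as f:
    f.write(b"x" * 100)                        # small write stays in the user-space buffer
    size_before_flush = os.path.getsize(path)  # kernel has not seen the data yet
    f.flush()                                  # hands the buffered bytes to the kernel via write()
    size_after_flush = os.path.getsize(path)
    os.fsync(f.fileno())                       # asks the kernel to push cached data to storage

os.remove(path)
print(size_before_flush, size_after_flush)
```

Note the two distinct layers: `flush()` moves data from the library buffer into the kernel, while `fsync()` asks the kernel to move it from the page cache to the device. Durability requires both.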
+ +### Unbuffered or direct-style I/O + +This usually means minimizing intermediate buffering, often for control or performance reasons. + +In Linux, direct I/O with flags like `O_DIRECT` aims to bypass the page cache for some workloads. It does not mean literally zero buffering everywhere, but it avoids the usual file cache path. + +### Interview angle + +If asked why databases sometimes use direct I/O, a good answer is: + +> Databases often want explicit control over caching and flushing. Using the kernel page cache on top of the database's own cache can create double buffering and reduce predictability. + +--- + +## 17. Polling vs Interrupt-Driven I/O + +These are two ways of discovering whether a device or resource needs attention. + +### Polling + +With polling, software repeatedly checks device or resource state. + +Advantages: + +- simple control flow, +- can be efficient at very high event rates, +- avoids interrupt overhead in some cases. + +Costs: + +- wastes CPU if nothing is happening, +- may add latency depending on poll frequency. + +### Interrupt-driven I/O + +With interrupt-driven I/O, the device notifies the CPU when it needs attention. + +Advantages: + +- avoids constant busy checking, +- good for sporadic events, +- allows the CPU to do other work. + +Costs: + +- interrupt handling overhead, +- can become expensive under extremely high rates. + +### Real Linux nuance + +Modern networking often blends both. A NIC may raise an interrupt to indicate work, and then the kernel may switch into a polling mode such as NAPI to drain many packets efficiently. + +That hybrid approach reduces interrupt storms under load. + +--- + +## 18. Device Drivers + +A device driver is the kernel component that knows how to operate a particular hardware device or family of devices. + +Applications do not usually talk to hardware registers directly. They interact with kernel abstractions, and the driver handles the device-specific details. 
+ +### What drivers do + +- initialize devices, +- configure DMA, +- submit commands, +- handle interrupts, +- expose interfaces to other kernel subsystems, +- and report errors or state. + +### Examples + +- NVMe driver for SSDs +- network driver for a NIC +- USB controller driver +- GPU driver + +### Why drivers belong in the kernel path + +Drivers often need privileged access to: + +- device MMIO regions, +- interrupt registration, +- DMA mappings, +- power management hooks, +- and kernel memory. + +That is why driver bugs can be serious. + +--- + +## 19. DMA Basics + +DMA stands for Direct Memory Access. + +Without DMA, the CPU would need to move every byte between a device and memory itself. That would be inefficient. + +With DMA: + +- the kernel and driver program the device, +- the device transfers data directly to or from main memory, +- the CPU is interrupted or otherwise notified on completion. + +### Why DMA matters + +DMA reduces CPU overhead and increases throughput, especially for networking and storage. + +### Real example: NIC receive path + +1. Driver sets up receive buffers in RAM. +2. NIC DMA engine writes packet data into those buffers. +3. NIC signals completion. +4. Kernel processes the packet and eventually wakes a waiting socket reader. + +### Important nuance + +DMA is called direct, but it still requires OS and IOMMU coordination. The device does not get unrestricted access to all memory. Modern systems use mapping and protection mechanisms so the device can access only approved memory ranges. + +--- + +## 20. Boot Process Overview + +The boot process is the sequence that turns a powered-off machine into a running OS with user processes. + +At a high level: + +1. Firmware starts after power-on. +2. Firmware initializes enough hardware to load boot code. +3. A bootloader loads the kernel. +4. The kernel initializes core subsystems. +5. The kernel starts the first user-space process. +6. 
That process starts services and the rest of the system. + +This is worth knowing because it connects hardware, firmware, kernel, and user space into one story. + +--- + +## 21. BIOS vs UEFI + +These are firmware environments that start before the OS. + +### BIOS + +BIOS is the older traditional firmware model. + +Characteristics: + +- older boot mechanism, +- limited early environment, +- legacy partitioning and boot conventions, +- common in older systems. + +### UEFI + +UEFI is the newer firmware standard. + +Characteristics: + +- richer pre-boot environment, +- support for EFI system partitions, +- boot entries managed in firmware, +- better support for modern disks and boot flows, +- support for Secure Boot. + +### Practical interview answer + +BIOS and UEFI both initialize the system and hand off to boot code, but UEFI is the modern, more flexible firmware architecture and is what you see on most current machines. + +--- + +## 22. Bootloader + +The bootloader is the program that loads the OS kernel into memory and transfers control to it. + +Examples in Linux environments: + +- GRUB +- systemd-boot +- U-Boot in embedded systems + +### What the bootloader typically does + +- locates the kernel image, +- loads the kernel into memory, +- often loads an initramfs or initrd, +- passes boot parameters, +- and transfers control to the kernel entry point. + +### Why initramfs matters + +The initial RAM filesystem contains early user-space tools and drivers needed before the real root filesystem is mounted. + +That is useful when the real root depends on drivers, RAID, LVM, encryption, or network setup. + +--- + +## 23. Kernel Initialization + +Once the bootloader hands control to the kernel, the kernel starts bringing up the system. 
+ +### Major initialization tasks + +- set up CPU mode and early memory structures, +- initialize page tables and memory management, +- establish interrupt and exception handling, +- initialize scheduler structures, +- initialize timers, +- discover hardware and initialize drivers, +- mount or prepare the root filesystem, +- create the first kernel and user-space execution contexts. + +### Key mental model + +During kernel initialization, the machine moves from a barely initialized hardware environment to a full operating-system environment with memory management, interrupt handling, device access, and process support. + +--- + +## 24. Init and systemd Basics + +After the kernel is ready to start user space, it launches the first user-space process. + +On Linux, that process is traditionally called `init`, and on most modern distributions it is `systemd` as PID 1. + +### Why PID 1 matters + +PID 1 is special because it: + +- becomes the ancestor of many processes, +- starts system services, +- manages service dependencies, +- reaps orphaned zombie processes, +- and helps define system startup state. + +### What systemd adds + +`systemd` is more than an init replacement. It provides: + +- service management, +- dependency ordering, +- logging integration, +- socket activation, +- timer units, +- cgroup-based supervision. + +### Interview note + +You do not need to love `systemd`, but you should understand that after kernel initialization, user-space service orchestration begins with PID 1. + +--- + +## 25. How Linux Boots: Power-On to Running Processes + +This is the most useful Linux boot narrative to remember. 
+ +```mermaid +flowchart TD + A[Power on] --> B[Firmware runs POST and early hardware init] + B --> C[BIOS or UEFI selects boot target] + C --> D[Bootloader loads kernel and initramfs] + D --> E[Kernel decompresses and enters start_kernel] + E --> F[Kernel initializes memory, scheduler, interrupts, drivers] + F --> G[Kernel mounts initramfs and finds real root filesystem] + G --> H[Kernel starts PID 1] + H --> I[systemd or init starts services and targets] + I --> J[Login shell, sshd, daemons, containers, apps] +``` + +### Narrative version + +1. Power-on starts firmware. +2. Firmware performs POST and basic hardware initialization. +3. Firmware selects a boot target and runs the bootloader. +4. The bootloader loads the Linux kernel and often an initramfs. +5. The kernel initializes core subsystems. +6. The kernel sets up enough drivers and storage support to reach the root filesystem. +7. The kernel launches PID 1. +8. PID 1 starts the rest of user space. +9. Services such as networking, logging, SSH, container runtimes, and application daemons come up. + +That is the end-to-end answer most interviewers want. + +--- + +## 26. Real-World Linux and Backend Examples + +The theory becomes much easier if you connect it to software you already know. + +### Example 1: A web server reading from a socket + +1. Client sends a packet. +2. NIC receives it and DMA-writes packet data into memory. +3. NIC raises an interrupt. +4. Kernel networking stack processes the packet. +5. Socket receive queue becomes readable. +6. If a thread is blocked in `epoll_wait`, the kernel wakes it. +7. The server calls `read` or `recv`. +8. Data is copied or mapped into user-visible buffers. + +This ties together DMA, interrupts, drivers, kernel queues, readiness notification, and system calls. + +### Example 2: Reading a file + +If file data is already in the page cache, a `read()` may complete without touching disk hardware at all. + +If not: + +1. Kernel resolves the file and inode. +2. 
VFS and filesystem code determine the needed block. +3. Block layer submits storage I/O. +4. Driver and device cooperate to fetch the data. +5. Completion wakes the blocked thread. +6. Data is copied back to user space. + +This is why page cache behavior matters so much for performance. + +### Example 3: Non-blocking event loop on Linux + +A server sets sockets to non-blocking mode and registers them with `epoll`. + +Instead of blocking on each `read`, it blocks in one place, `epoll_wait`, until one or more sockets become ready. That is how a single thread can manage many mostly-idle connections. + +### Example 4: `sendfile` and fewer copies + +Linux can sometimes move file data to a socket more efficiently using `sendfile`, reducing user-space copying and context transitions. This is a good example of why understanding the kernel path helps explain performance features. + +--- + +## 27. Common Interview Questions and How to Think About Them + +### What is a system call? + +Best answer: + +> A system call is the controlled interface through which user-space code requests privileged services from the kernel, such as file I/O, process creation, memory mapping, or networking. + +### What happens during a system call? + +Mention: + +- arguments prepared in user space, +- special CPU instruction, +- switch to kernel mode, +- kernel dispatch and validation, +- possible blocking or device interaction, +- return value back to user space. + +### What is the difference between user mode and kernel mode? + +Mention: + +- privilege level, +- ability to execute privileged instructions, +- direct access to hardware and kernel memory, +- isolation and safety. + +### Are interrupts and system calls the same thing? + +Best answer: + +> No. Hardware interrupts are typically asynchronous events from devices. System calls are controlled synchronous entries into the kernel initiated by the running program. Both cause privileged control transfers, but they originate differently. 
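To make the synchronous, program-initiated nature of a system call concrete, here is a minimal sketch using Python's `os` module, whose `pipe`, `write`, `read`, and `close` functions are thin wrappers around the corresponding syscalls. The helper names `syscall_round_trip` and `read_closed_fd` are made up for illustration. The second helper shows kernel-side validation: the kernel rejects a stale file descriptor and the error comes back to user space as an `errno` value.

```python
import errno
import os


def syscall_round_trip(payload: bytes) -> bytes:
    """Write bytes into a pipe and read them back; each step enters the kernel."""
    read_fd, write_fd = os.pipe()            # pipe(2)
    try:
        os.write(write_fd, payload)          # write(2): small payload fits the pipe buffer
        return os.read(read_fd, len(payload))  # read(2): data is already queued
    finally:
        os.close(read_fd)                    # close(2)
        os.close(write_fd)


def read_closed_fd() -> int:
    """Return the errno produced when the kernel rejects a closed descriptor."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)
    try:
        os.read(read_fd, 1)                  # kernel validates the fd and refuses
    except OSError as exc:
        return exc.errno
    return 0


if __name__ == "__main__":
    print(syscall_round_trip(b"ping"))
    print(errno.errorcode[read_closed_fd()])
```

Note that the second helper assumes a single-threaded program: in a multi-threaded process another thread could reuse the descriptor number between the `close` and the `read`.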
+ +### What is the difference between a trap, an exception, and an interrupt? + +Good interview answer: + +> Interrupts are typically asynchronous external events. Exceptions are synchronous events caused by the current instruction, such as divide-by-zero or page faults. Traps are a synchronous control-transfer category often used for deliberate software-triggered entries such as debugging breakpoints or syscall-style entry points. + +### Blocking vs non-blocking I/O? + +Good answer: + +> Blocking and non-blocking describe whether the calling thread waits immediately. In blocking I/O the call may sleep until progress is possible. In non-blocking I/O the call returns immediately if it cannot proceed. + +### Synchronous vs asynchronous I/O? + +Good answer: + +> Synchronous and asynchronous describe how completion is observed. In synchronous I/O completion is tied to the calling path. In asynchronous I/O the request is submitted now and completion is delivered later via a separate notification mechanism. + +### What is DMA and why is it useful? + +Good answer: + +> DMA lets devices transfer data directly to or from RAM without forcing the CPU to copy every byte itself. That reduces CPU overhead and improves throughput for storage and networking. + +### How does Linux boot? + +Mention: + +- firmware, +- bootloader, +- kernel image and initramfs, +- kernel initialization, +- PID 1, +- service startup. + +--- + +## 28. Practical Scenarios Interviewers Like + +### Scenario 1: Why is a service thread blocked? + +Possible explanations: + +- waiting in a blocking syscall such as `read`, `accept`, `futex`, or `epoll_wait`, +- blocked on disk I/O, +- sleeping on a lock or condition variable, +- waiting for network data, +- or descheduled because it is not runnable. + +Good follow-up thinking: + +- Is it CPU-bound or I/O-bound? +- Is it blocked in user space or kernel space? +- Is the problem contention, latency, or starvation? 
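One practical way to start answering those follow-up questions on Linux is to look at the scheduler state the kernel reports in `/proc/<pid>/stat` (or `/proc/<pid>/task/<tid>/stat` for a single thread). The sketch below parses the command name and state letter from such a line. The function names and the sample line are illustrative assumptions, and the state table only covers the most common letters.

```python
# Common state letters reported in /proc/<pid>/stat (not exhaustive).
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep (waiting for an event)",
    "D": "uninterruptible sleep (usually I/O)",
    "Z": "zombie",
    "T": "stopped",
}


def parse_stat_line(line: str):
    """Return (comm, state) from a /proc/<pid>/stat style line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so we split around the LAST ")".
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return comm, state


def describe_state(line: str) -> str:
    comm, state = parse_stat_line(line)
    return f"{comm}: {STATE_NAMES.get(state, 'other state ' + state)}"


# Hypothetical sample line, truncated to the first few fields.
SAMPLE_STAT = "4242 (api worker) D 1 4242 4242 0 -1"

if __name__ == "__main__":
    print(describe_state(SAMPLE_STAT))
```

A thread in state `D` points toward the storage or driver path, while `S` usually means it is sleeping in a syscall such as `epoll_wait` or `futex`, and `R` suggests the problem is CPU time or scheduling rather than blocking.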
+ +### Scenario 2: Why does one slow disk hurt request latency? + +Because synchronous blocking I/O can put threads to sleep while the storage path completes. If the application architecture has too little concurrency or poor queueing, tail latency grows quickly. + +### Scenario 3: Why do event loops scale better than thread-per-connection for many idle sockets? + +Because most connections are idle most of the time. Non-blocking sockets plus readiness notification let one thread wait efficiently for many connections instead of dedicating a blocked thread to each one. + +### Scenario 4: Why does a page fault not always mean a crash? + +Because many page faults are recoverable and part of normal virtual-memory behavior, such as demand paging or lazy allocation. + +### Scenario 5: Why are syscalls more expensive than normal function calls? + +Because they cross the protection boundary, switch privilege levels, involve kernel dispatch and validation, and may trigger scheduler interaction or device work. + +--- + +## 29. Common Mistakes to Avoid in Interviews + +- Saying non-blocking I/O and asynchronous I/O are the same thing. +- Saying a page fault always means segmentation fault. +- Saying user space directly talks to hardware in normal application code. +- Ignoring protection checks during syscall flow. +- Forgetting that the kernel may block the thread and schedule something else. +- Treating all interrupts, traps, and exceptions as identical. +- Describing BIOS, bootloader, kernel, and init as one undifferentiated startup blob. + +--- + +## 30. A Compact Mental Model to Remember + +If you need one interview-ready model, remember this: + +1. Applications run in user mode with restricted privileges. +2. They ask the kernel for services through system calls. +3. The CPU and hardware enforce the user-kernel protection boundary. +4. Devices communicate readiness and completion through interrupts and DMA-assisted data movement. +5. 
The kernel manages scheduling, memory, device drivers, and protection. +6. Linux boot moves from firmware to bootloader to kernel to PID 1 to the rest of user space. + +If you can explain those six points cleanly with one or two real Linux examples, you are already at a strong interview level. + +--- + +## 31. Quick Revision Checklist + +Before an interview, make sure you can explain each of these without hand-waving: + +- Why user mode and kernel mode exist +- What a privileged instruction is +- What happens during a system call +- The difference between hardware interrupts and synchronous exceptions +- The basic interrupt-handling path +- Blocking vs non-blocking I/O +- Synchronous vs asynchronous I/O +- Buffered vs direct-style I/O +- What drivers do +- What DMA is for +- Polling vs interrupt-driven I/O +- BIOS vs UEFI +- What a bootloader does +- What the kernel initializes before user space starts +- Why PID 1 matters + +If you can connect each one to a Linux server example, your understanding is in good shape.
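As one such example for the "blocking vs non-blocking I/O" item, the sketch below uses a local socket pair in place of a real client connection. It first shows a non-blocking `recv` returning immediately with `BlockingIOError` when no data is queued, then waits for readiness in one place with Python's `selectors` module (which uses `epoll` on Linux when available). The function name `nonblocking_demo` is an illustrative assumption, not part of any server framework.

```python
import selectors
import socket


def nonblocking_demo():
    """Return (would_block, data) for a non-blocking read on a socket pair."""
    sender, receiver = socket.socketpair()
    selector = selectors.DefaultSelector()
    try:
        receiver.setblocking(False)

        # Nothing has been sent yet, so a non-blocking recv cannot proceed
        # and returns control immediately instead of sleeping.
        try:
            receiver.recv(16)
            would_block = False
        except BlockingIOError:
            would_block = True

        # Register interest in readability, then block in ONE place
        # until the kernel reports the socket as ready.
        selector.register(receiver, selectors.EVENT_READ)
        sender.sendall(b"hello")
        events = selector.select(timeout=1.0)

        data = receiver.recv(16) if events else b""
        return would_block, data
    finally:
        selector.close()
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(nonblocking_demo())
```

A real event loop would register many sockets with the same selector and loop over the returned events, which is the single-thread, many-idle-connections pattern described earlier in this guide.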