# Memory Management for Software Engineering Interviews
|
||||
|
||||
Memory management is one of the most important operating-system topics for interviews because it sits at the boundary between hardware reality, kernel policy, language runtime behavior, and application performance. If you build backend systems, work with C++ or Java, debug production latency, or reason about scale, you are already dealing with memory-management tradeoffs even if the kernel hides most of the mechanics.
|
||||
|
||||
This guide aims to give you an interview-ready mental model, not just a glossary. The central question is simple:
|
||||
|
||||
> How does the operating system make memory appear large, fast, isolated, and safe even though physical RAM is limited, shared, and much slower than the CPU?
|
||||
|
||||
## 1. Why Memory Management Exists
|
||||
|
||||
An operating system cannot let every process read and write raw physical memory arbitrarily. If it did:
|
||||
|
||||
- Any process could corrupt another process.
|
||||
- The kernel would have no isolation boundary.
|
||||
- Programs would need to know where they are loaded in RAM.
|
||||
- Memory would be difficult to share safely.
|
||||
- Fragmentation and relocation would become unmanageable.
|
||||
|
||||
Memory management exists to solve a few core problems at once:
|
||||
|
||||
- Isolation: each process should feel like it owns memory.
|
||||
- Protection: invalid or unauthorized accesses should be blocked.
|
||||
- Efficiency: RAM should be used well, not wasted.
|
||||
- Abstraction: programs should use addresses without caring where data physically lives.
|
||||
- Performance: recently used translations and data should be fast to access.
|
||||
- Flexibility: the OS should be able to load, move, share, swap, and evict memory as needed.
|
||||
|
||||
The big idea is that processes mostly work with logical or virtual addresses, while the operating system and hardware cooperate to map those to physical memory.
|
||||
|
||||
## 2. How Memory Works in an Operating System
|
||||
|
||||
At a high level, a running process sees a virtual address space. The CPU issues a memory reference like "load from address X". That address is usually not a raw DRAM location. Instead, hardware called the Memory Management Unit (MMU) translates it into a physical address.
|
||||
|
||||
The actual flow usually looks like this:
|
||||
|
||||
1. A process executes an instruction that references a virtual address.
|
||||
2. The CPU checks the TLB, which is a small cache of recent address translations.
|
||||
3. If the translation is in the TLB, the CPU quickly gets the physical frame.
|
||||
4. If not, hardware or the kernel walks the page tables to find the mapping.
|
||||
5. If the page is present in RAM and permissions allow access, the read or write proceeds.
|
||||
6. If the page is not present, a page fault occurs and the kernel decides how to handle it.
|
||||
|
||||
```mermaid
flowchart TD
    A[Instruction references virtual address] --> B{TLB hit?}
    B -->|Yes| C[Get physical frame quickly]
    B -->|No| D[Walk page tables]
    D --> E{Valid present mapping?}
    E -->|Yes| F[Fill TLB and continue]
    E -->|No| G[Page fault trap to kernel]
    G --> H{Can kernel resolve it?}
    H -->|Yes| I[Load or map page and resume]
    H -->|No| J[Send error like SIGSEGV or kill process]
    C --> K[Access cache or DRAM]
    F --> K
    I --> K
```
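
The translation flow above can be sketched as a toy model. This is only an illustration under simplifying assumptions: a flat dictionary stands in for the page tables, the TLB has unlimited capacity, and the names `translate` and `PageFault` are hypothetical rather than part of any real kernel or hardware interface.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class PageFault(Exception):
    """Raised when no present mapping exists for a virtual page."""


def translate(vaddr, tlb, page_table):
    """Return (physical address, "tlb hit" or "tlb miss") for a virtual address."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                        # steps 2-3: translation cached in the TLB
        return tlb[vpn] * PAGE_SIZE + offset, "tlb hit"
    frame = page_table.get(vpn)           # step 4: walk the (toy) page table
    if frame is None:                     # step 6: page not present
        raise PageFault(f"no mapping for virtual page {vpn}")
    tlb[vpn] = frame                      # fill the TLB for later accesses
    return frame * PAGE_SIZE + offset, "tlb miss"


page_table = {0: 7, 1: 3}   # virtual page -> physical frame (toy values)
tlb = {}

first = translate(4096 + 20, tlb, page_table)    # misses the TLB, then fills it
second = translate(4096 + 24, tlb, page_table)   # same page, now a TLB hit
print(first, second)

try:
    translate(5 * PAGE_SIZE, tlb, page_table)    # unmapped page
except PageFault as exc:
    print("page fault:", exc)
```

In a real system the fault path hands control to the kernel, which may load the page and retry the instruction instead of failing.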
|
||||
|
||||
This explains a lot of interview topics at once:
|
||||
|
||||
- Virtual memory gives each process its own address space.
|
||||
- Paging breaks memory into fixed-size units.
|
||||
- Page tables store the mapping.
|
||||
- The TLB makes translation fast.
|
||||
- Page faults handle missing pages.
|
||||
- Swapping and demand paging allow the total virtual memory in use to exceed physical RAM.
|
||||
|
||||
## 3. Logical Address vs Physical Address
|
||||
|
||||
This distinction is foundational.
|
||||
|
||||
### Logical address
|
||||
|
||||
A logical address is the address generated by the CPU from the program's point of view. In modern systems, the term virtual address is usually used in practice, and in interview conversation logical and virtual are often treated as effectively the same thing.
|
||||
|
||||
Examples:
|
||||
|
||||
- A pointer in C++ points to a virtual address in the process address space.
|
||||
- A Java object reference is resolved by the JVM within the process's memory model, but the underlying memory still ultimately lives in virtual memory managed by the OS.
|
||||
|
||||
### Physical address
|
||||
|
||||
A physical address is the real location in RAM that the memory controller uses.
|
||||
|
||||
### Important nuance
|
||||
|
||||
Historically, some textbooks distinguish logical from virtual more carefully, especially in segmented systems. For most modern interview contexts, the useful distinction is:
|
||||
|
||||
- Program-visible address: logical or virtual
|
||||
- Hardware RAM location: physical
|
||||
|
||||
### Why the distinction matters
|
||||
|
||||
- Protection is enforced during virtual-to-physical translation.
|
||||
- Different processes can use the same virtual address values without conflict.
|
||||
- The OS can relocate or swap memory without changing application code.
|
||||
|
||||
Example:
|
||||
|
||||
- Process A may read from virtual address `0x7fff0000`.
|
||||
- Process B may also read from virtual address `0x7fff0000`.
|
||||
- Those can map to completely different physical frames.
|
||||
|
||||
That is why virtual addresses are per-process, while physical addresses are system-wide.
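
A small sketch of that example, assuming 4 KiB pages and two made-up per-process page tables. The frame numbers are hypothetical and chosen only to show that the same virtual address can resolve to different physical addresses.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

VADDR = 0x7fff0000
vpn = VADDR // PAGE_SIZE

# Hypothetical per-process page tables: virtual page number -> physical frame.
page_table_a = {vpn: 0x1a2b}
page_table_b = {vpn: 0x0c4d}


def physical(page_table, vaddr):
    """Translate a virtual address using one process's page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


phys_a = physical(page_table_a, VADDR)
phys_b = physical(page_table_b, VADDR)
print(hex(phys_a), hex(phys_b))  # same virtual address, different physical addresses
```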
|
||||
|
||||
## 4. Address Space
|
||||
|
||||
An address space is the range of memory addresses a process can use. More precisely, it is the abstraction of memory visible to that process.
|
||||
|
||||
Each process typically gets its own virtual address space containing regions such as:
|
||||
|
||||
- Text or code segment
|
||||
- Read-only data
|
||||
- Global and static data
|
||||
- Heap
|
||||
- Memory-mapped files
|
||||
- Shared libraries
|
||||
- Stack
|
||||
|
||||
Typical process layout looks like this:
|
||||
|
||||
```mermaid
flowchart TB
    K[High virtual addresses]
    S[Stack grows downward]
    M[Memory-mapped region and shared libraries]
    H[Heap grows upward]
    D[Data and BSS]
    T[Code or text]
    Z[Low virtual addresses]

    K --> S --> M --> H --> D --> T --> Z
```
|
||||
|
||||
### Interview-level understanding
|
||||
|
||||
- The address space is virtual, not raw RAM.
|
||||
- The heap and stack are just regions inside that space.
|
||||
- Separate processes have separate address spaces.
|
||||
- Threads in the same process share the address space but usually have separate stacks.
|
||||
|
||||
### 32-bit vs 64-bit intuition
|
||||
|
||||
- A 32-bit address space is much smaller and historically made memory pressure and layout constraints more visible.
|
||||
- A 64-bit address space is so large that modern systems can use sparse mappings comfortably, which makes techniques like memory-mapped files and guard pages easier to support.
|
||||
|
||||
Large virtual address spaces do not mean the machine has that much RAM. They just give the OS a large namespace to manage.
|
||||
|
||||
## 5. Memory Allocation Basics
|
||||
|
||||
Memory allocation means deciding how memory is assigned to processes, threads, objects, buffers, or pages.
|
||||
|
||||
There are several layers of allocation:
|
||||
|
||||
- The kernel allocates physical page frames.
|
||||
- The kernel maps virtual pages into a process address space.
|
||||
- User-space allocators such as `malloc`, `new`, `jemalloc`, or `tcmalloc` manage heap memory inside the process.
|
||||
- Language runtimes like the JVM allocate objects within managed heap regions.
|
||||
|
||||
### Common allocation categories
|
||||
|
||||
#### Static allocation
|
||||
|
||||
Memory decided before execution, such as global variables or static storage.
|
||||
|
||||
#### Stack allocation
|
||||
|
||||
Memory associated with function calls and local variables with automatic lifetime.
|
||||
|
||||
#### Heap allocation
|
||||
|
||||
Memory requested dynamically at runtime, often with manual or runtime-managed lifetime.
|
||||
|
||||
### What `malloc` or `new` really does
|
||||
|
||||
Interviewers often ask this because it reveals whether you understand the layers.
|
||||
|
||||
At a simplified level:
|
||||
|
||||
1. Your program asks the allocator for some bytes.
|
||||
2. The allocator tries to satisfy it from existing heap arenas or free lists.
|
||||
3. If it needs more memory, it may ask the kernel for additional pages using mechanisms like `brk` or `mmap`.
|
||||
4. The kernel updates page tables so those virtual pages belong to the process.
|
||||
5. Actual physical pages may still be assigned lazily on first touch, depending on the OS.
|
||||
|
||||
So `malloc(1024)` usually does not mean "immediately reserve exactly 1024 physical bytes in RAM". It means "make this memory available in the process's virtual address space and allocator bookkeeping".
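
One way to see the gap between reserving address space and touching memory is an anonymous mapping. The sketch below uses Python's `mmap` module as a stand-in for what an allocator might request from the kernel. Whether physical frames are really assigned lazily depends on the OS and its configuration, so treat this as an illustration rather than a measurement.

```python
import mmap

SIZE = 64 * 1024 * 1024          # reserve 64 MiB of virtual address space
region = mmap.mmap(-1, SIZE)     # anonymous mapping, not backed by a file

# Touch one byte in each of the first four pages only.
for page in range(4):
    region[page * mmap.PAGESIZE] = 0xAB

touched = region[0]              # value written above
untouched = region[SIZE - 1]     # never written, reads back as zero
region.close()
print(touched, untouched)
```

On systems that assign frames on first touch, only the pages actually accessed here need physical memory, even though the mapping spans 64 MiB.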
|
||||
|
||||
## 6. Contiguous vs Non-Contiguous Memory Allocation
|
||||
|
||||
This topic is really about how memory is laid out physically or logically for a process.
|
||||
|
||||
### Contiguous allocation
|
||||
|
||||
In contiguous allocation, a process or region is placed in one continuous block of physical memory.
|
||||
|
||||
Advantages:
|
||||
|
||||
- Simple bookkeeping
|
||||
- Simple address computation
|
||||
- Historically easy to implement
|
||||
|
||||
Disadvantages:
|
||||
|
||||
- Hard to fit variable-sized processes efficiently
|
||||
- External fragmentation becomes a serious problem
|
||||
- Growing processes is awkward
|
||||
- Compaction may be needed
|
||||
|
||||
Older memory-management designs used fixed or variable partitions in physical memory, but these approaches did not scale well.
|
||||
|
||||
### Non-contiguous allocation
|
||||
|
||||
In non-contiguous allocation, a process can occupy multiple separated physical locations.
|
||||
|
||||
Examples:
|
||||
|
||||
- Paging: memory split into fixed-size pages and frames
|
||||
- Segmentation: memory split into logical variable-sized segments
|
||||
- Combined designs: segmented paging or paged virtual memory
|
||||
|
||||
Advantages:
|
||||
|
||||
- Better flexibility
|
||||
- Better RAM utilization
|
||||
- Easier growth of address spaces
|
||||
- Simplifies sharing and protection at smaller granularity
|
||||
|
||||
Disadvantages:
|
||||
|
||||
- More translation overhead
|
||||
- More metadata such as page tables
|
||||
- More complex hardware and kernel logic
|
||||
|
||||
Modern general-purpose operating systems rely heavily on non-contiguous allocation, especially paging.
|
||||
|
||||
## 7. Fragmentation: Internal vs External
|
||||
|
||||
Fragmentation means memory is being wasted, but the reason for the waste differs.
|
||||
|
||||
### Internal fragmentation
|
||||
|
||||
Internal fragmentation happens when allocated memory is larger than what the program actually needs, so wasted space exists inside the allocated unit.
|
||||
|
||||
Example:
|
||||
|
||||
- If page size is 4 KiB and a process needs 6 KiB, it will use 2 pages, or 8 KiB total.
|
||||
- Exactly 2 KiB is unused inside the allocated pages.
|
||||
|
||||
This is internal fragmentation because the wasted space is inside the allocated blocks.
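
The arithmetic in that example can be written out as a short sketch. The helper name `internal_fragmentation` is hypothetical.

```python
import math

PAGE_KIB = 4  # page size from the example above


def internal_fragmentation(request_kib, page_kib=PAGE_KIB):
    """Return (pages used, KiB allocated, KiB wasted) for a request."""
    pages = math.ceil(request_kib / page_kib)
    allocated = pages * page_kib
    return pages, allocated, allocated - request_kib


result = internal_fragmentation(6)
print(result)  # pages used, KiB allocated, KiB wasted
```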
|
||||
|
||||
### External fragmentation
|
||||
|
||||
External fragmentation happens when enough total free memory exists, but it is split into small scattered holes, so a large contiguous request cannot be satisfied.
|
||||
|
||||
Example:
|
||||
|
||||
- Free blocks of 10 MB, 5 MB, and 20 MB exist.
|
||||
- A process requests a contiguous 30 MB block.
|
||||
- Total free memory is 35 MB, but there is no single 30 MB region.
|
||||
|
||||
This is external fragmentation because the waste exists between allocated regions.
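
The same example as a minimal sketch: the request fails because no single hole is large enough, even though the total free memory would cover it.

```python
free_blocks_mb = [10, 5, 20]   # scattered free holes from the example above
request_mb = 30                # must be satisfied by one contiguous block

total_free = sum(free_blocks_mb)
largest_hole = max(free_blocks_mb)
fits_contiguously = largest_hole >= request_mb

print(total_free, largest_hole, fits_contiguously)
```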
|
||||
|
||||
### What causes each one
|
||||
|
||||
- Fixed-size allocation units, like pages, tend to create internal fragmentation.
|
||||
- Variable-sized contiguous allocation tends to create external fragmentation.
|
||||
|
||||
### Interview framing
|
||||
|
||||
If asked which fragmentation paging solves, the strong answer is:
|
||||
|
||||
> Paging largely eliminates external fragmentation in physical allocation because pages can be placed anywhere, but it still suffers from internal fragmentation at page granularity.
|
||||
|
||||
## 8. Virtual Memory
|
||||
|
||||
Virtual memory is the abstraction that gives each process a large, private, contiguous-looking address space, regardless of how memory is physically arranged.
|
||||
|
||||
The key idea is that this is an illusion. The OS does not promise that every virtual page is backed by RAM right now. It promises that accesses will either work through translation or be handled through faults, allocation, or process termination.
|
||||
|
||||
### What virtual memory provides
|
||||
|
||||
- Isolation between processes
|
||||
- Protection via access permissions
|
||||
- Sparse address spaces
|
||||
- The ability to use more virtual memory than physical RAM
|
||||
- Efficient sharing of libraries and file mappings
|
||||
- Simplified programming model
|
||||
|
||||
### Why virtual memory is needed
|
||||
|
||||
Without virtual memory:
|
||||
|
||||
- Programs would need physical addresses or explicit relocation logic.
|
||||
- Different processes could not reuse the same convenient address ranges.
|
||||
- Swapping and demand paging would be much harder.
|
||||
- Isolation would be weak and unsafe.
|
||||
- Shared libraries and memory-mapped files would be more complicated.
|
||||
|
||||
### The most important interview insight
|
||||
|
||||
Virtual memory is not just about pretending disk is extra RAM. That is too shallow.
|
||||
|
||||
It is mainly about:
|
||||
|
||||
- address translation,
|
||||
- protection,
|
||||
- isolation,
|
||||
- flexible placement,
|
||||
- and loading data only when needed.
|
||||
|
||||
Using disk as a backing store is one consequence, not the whole story.
|
||||
|
||||
## 9. Paging
|
||||
|
||||
Paging is the dominant memory-management technique in modern operating systems.
|
||||
|
||||
The idea is simple:
|
||||
|
||||
- Divide virtual memory into fixed-size pages.
|
||||
- Divide physical memory into fixed-size frames of the same size.
|
||||
- Map each virtual page to some physical frame.
|
||||
|
||||
If page size is 4 KiB, a virtual address is split into:
|
||||
|
||||
- Virtual page number
|
||||
- Offset within the page
|
||||
|
||||
The offset stays the same during translation. Only the page number is replaced, by the number of the frame it maps to.
|
||||
|
||||
Example:
|
||||
|
||||
- Virtual address = page 42, offset 100
|
||||
- Page table says page 42 is in frame 900
|
||||
- Physical address = frame 900, offset 100
|
||||
|
||||
This is why paging avoids needing contiguous physical memory.
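
The example above, worked through with 4 KiB pages. The one-entry `page_table` dictionary is a toy stand-in for a real page table.

```python
PAGE_SIZE = 4096  # 4 KiB pages, as in the example above

page_table = {42: 900}               # virtual page 42 -> physical frame 900

vaddr = 42 * PAGE_SIZE + 100         # page 42, offset 100
vpn, offset = divmod(vaddr, PAGE_SIZE)

frame = page_table[vpn]
paddr = frame * PAGE_SIZE + offset   # frame 900, same offset

print(vpn, offset, frame, paddr)
```

With a power-of-two page size, the `divmod` is just a split of the address bits: the low 12 bits are the offset and the remaining high bits are the page number.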
|
||||
|
||||
### Advantages of paging
|
||||
|
||||
- Eliminates most external fragmentation
|
||||
- Supports virtual memory naturally
|
||||
- Makes sharing and protection easy at page granularity
|
||||
- Allows demand paging and swapping
|
||||
|
||||
### Costs of paging
|
||||
|
||||
- Page-table memory overhead
|
||||
- Internal fragmentation inside the last page
|
||||
- Translation overhead on every access that misses the TLB
|
||||
- Page faults can be very expensive
|
||||
|
||||
## 10. Page Tables
|
||||
|
||||
A page table is the data structure that maps virtual pages to physical frames.
|
||||
|
||||
Each entry usually stores more than just a frame number. Typical metadata includes:
|
||||
|
||||
- Present or valid bit
|
||||
- Read or write permissions
|
||||
- User or kernel accessibility
|
||||
- Dirty bit, meaning page has been modified
|
||||
- Accessed or referenced bit
|
||||
- Execute-disable bit on supported hardware
|
||||
|
||||
### Why page tables matter
|
||||
|
||||
They are where isolation and protection become concrete. If the mapping is missing or permissions do not allow access, the CPU traps into the kernel.
|
||||
|
||||
### Why page tables can be large
|
||||
|
||||
Suppose a process has a large virtual address space and small page size. A flat page table would need an entry for a huge number of possible pages, even if the process only uses a small subset.
|
||||
|
||||
That is why real systems use hierarchical or multi-level page tables.
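
A back-of-the-envelope calculation shows the scale of the problem. The numbers below are assumptions chosen for illustration: a 48-bit virtual address space, 4 KiB pages, and 8-byte page-table entries.

```python
VA_BITS = 48        # assumed usable virtual address bits
PAGE_SIZE = 4096    # assumed 4 KiB pages
PTE_BYTES = 8       # assumed size of one page-table entry

entries = 2**VA_BITS // PAGE_SIZE    # one entry per possible virtual page
table_bytes = entries * PTE_BYTES
table_gib = table_bytes // 2**30

print(entries, table_gib)  # number of entries, flat table size in GiB
```

Under these assumptions a flat table would need 512 GiB per process, which is clearly impractical when most of the address space is unused.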
|
||||
|
||||
## 11. Multi-Level Paging
|
||||
|
||||
Multi-level paging is an optimization for page-table storage.
|
||||
|
||||
Instead of one giant page table, the address is broken into multiple index levels. Lower-level tables are allocated only for the parts of the address space actually in use.
|
||||
|
||||
```mermaid
flowchart TD
    A[Virtual address] --> B[Level 1 index]
    A --> C[Level 2 index]
    A --> D[Level 3 index]
    A --> E[Page offset]
    B --> F[Top-level page table]
    F --> G[Next-level table]
    C --> G
    G --> H[Leaf page table entry]
    D --> H
    H --> I[Physical frame]
    E --> J[Physical address uses same offset]
    I --> J
```
|
||||
|
||||
### Why it helps
|
||||
|
||||
- Sparse address spaces do not require allocating a full flat page table.
|
||||
- Memory overhead becomes proportional to the used regions of the address space.
|
||||
|
||||
### Tradeoff
|
||||
|
||||
Walking multiple levels takes more memory accesses on a TLB miss. That is one reason the TLB is so important.
|
||||
|
||||
### Real-world example
|
||||
|
||||
Modern 64-bit systems such as x86-64 often use four or five levels of paging for large address spaces.
|
||||
|
||||
You do not usually need to memorize exact bit splits unless the interviewer is going deep into architecture. What matters is understanding why multi-level paging exists.
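
For intuition only, here is a sketch of the split used by four-level paging on x86-64 with 4 KiB pages: four 9-bit table indices followed by a 12-bit page offset. The function name and the sample address are hypothetical.

```python
def split_virtual_address(vaddr):
    """Split a 48-bit virtual address into four 9-bit indices and a 12-bit offset."""
    offset = vaddr & 0xFFF                       # low 12 bits: offset within the page
    indices = [(vaddr >> shift) & 0x1FF          # 9 bits per level, top level first
               for shift in (39, 30, 21, 12)]
    return indices, offset


sample = 0x00007FFF12345678
indices, offset = split_virtual_address(sample)
print(indices, hex(offset))

# Reassembling the pieces gives back the original address.
rebuilt = offset
for index, shift in zip(indices, (39, 30, 21, 12)):
    rebuilt |= index << shift
```

Each index selects one entry in the table at that level, so a TLB miss can cost up to four dependent memory reads before the data access itself.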
|
||||
|
||||
## 12. Translation Lookaside Buffer (TLB)
|
||||
|
||||
The TLB is a small, very fast cache inside the CPU that stores recent virtual-to-physical translations.
|
||||
|
||||
Without a TLB, every memory access could require extra page-table lookups, which would be far too slow.
|
||||
|
||||
### Why the TLB matters so much
|
||||
|
||||
Every instruction fetch, stack access, heap access, and data read depends on address translation. If translation were always a full page-table walk, memory access would be dramatically slower.
|
||||
|
||||
### TLB hit vs miss
|
||||
|
||||
- TLB hit: translation found quickly, access continues.
|
||||
- TLB miss: hardware or software must walk page tables and possibly populate the TLB.
|
||||
|
||||
### Practical implications
|
||||
|
||||
- Good locality improves TLB effectiveness.
|
||||
- Large page sizes or huge pages can reduce TLB pressure because one entry covers more memory.
|
||||
- Context switches can reduce TLB usefulness unless the CPU supports address-space tagging such as ASIDs or PCIDs.
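
A rough way to reason about huge pages is "TLB reach": the number of entries times the page size. The entry count below is a made-up figure for illustration. Real CPUs have several TLB levels and often different capacities per page size.

```python
TLB_ENTRIES = 1536            # hypothetical entry count, not a specific CPU

KIB = 1024
MIB = 1024 * KIB

reach_4k = TLB_ENTRIES * 4 * KIB    # reach with 4 KiB pages
reach_2m = TLB_ENTRIES * 2 * MIB    # reach with 2 MiB huge pages

print(reach_4k // MIB, "MiB vs", reach_2m // MIB, "MiB")
```

Under this assumption the same number of entries covers 512 times more memory with 2 MiB pages, which is why a large heap can see fewer TLB misses.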
|
||||
|
||||
### Backend-system angle
|
||||
|
||||
Databases, caches, in-memory analytics engines, and JVM heaps can all suffer when working sets exceed TLB coverage. This is one reason huge pages sometimes help performance-sensitive systems.
|
||||
|
||||
## 13. Page Faults
|
||||
|
||||
A page fault occurs when a process accesses a virtual page whose translation cannot be completed normally.
|
||||
|
||||
That does not automatically mean a bug. Some page faults are expected and legitimate.
|
||||
|
||||
### Common reasons for a page fault
|
||||
|
||||
- The page has not been loaded yet and must be brought into memory.
|
||||
- The page exists but is currently swapped out.
|
||||
- The page is marked copy-on-write and needs a private copy on write.
|
||||
- The access violates protection, such as writing to a read-only page.
|
||||
- The address is invalid and not mapped at all.
|
||||
|
||||
```mermaid
sequenceDiagram
    participant P as Process
    participant CPU as CPU or MMU
    participant K as Kernel
    participant D as Disk or backing store

    P->>CPU: access virtual page
    CPU->>K: page fault trap
    K->>K: inspect page-table entry and permissions
    alt page can be resolved
        K->>D: read page if needed
        D-->>K: page data
        K->>K: update page table and TLB state
        K-->>P: resume instruction
    else invalid or forbidden access
        K-->>P: send fault signal or terminate
    end
```
|
||||
|
||||
### Major vs minor page fault
|
||||
|
||||
Interviewers sometimes like this distinction.
|
||||
|
||||
- Minor page fault: page can be satisfied without disk I/O, for example a copy-on-write mapping or a page already in memory but not mapped into this process yet.
|
||||
- Major page fault: servicing the fault requires disk I/O, which is much slower.
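
On Unix-like systems a process can read its own fault counters. The sketch below uses Python's `resource.getrusage`, which reports minor faults as `ru_minflt` and major faults as `ru_majflt`. The exact counts vary by OS, kernel settings, and features such as transparent huge pages, so no specific numbers are claimed here.

```python
import mmap
import resource

before = resource.getrusage(resource.RUSAGE_SELF)

# Touch 256 fresh anonymous pages; first touches are typically minor faults.
region = mmap.mmap(-1, 256 * mmap.PAGESIZE)
for page in range(256):
    region[page * mmap.PAGESIZE] = 1

after = resource.getrusage(resource.RUSAGE_SELF)
region.close()

minor = after.ru_minflt - before.ru_minflt   # faults served without disk I/O
major = after.ru_majflt - before.ru_majflt   # faults that required disk I/O
print("minor:", minor, "major:", major)
```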
|
||||
|
||||
### Important nuance
|
||||
|
||||
A segmentation fault in Linux is often the user-visible result of an invalid or protection-violating page fault. So page fault is the low-level event; `SIGSEGV` is often the process-level consequence.
|
||||
|
||||
## 14. Demand Paging
|
||||
|
||||
Demand paging means pages are loaded into memory only when they are actually accessed.
|
||||
|
||||
This is one of the biggest reasons virtual memory is efficient. Instead of loading an entire executable or heap eagerly, the OS can load pages lazily.
|
||||
|
||||
### Benefits
|
||||
|
||||
- Faster program startup
|
||||
- Lower RAM usage
|
||||
- Only touched pages consume physical memory
|
||||
- Large sparse data structures become feasible
|
||||
|
||||
### Costs
|
||||
|
||||
- First access latency due to page faults
|
||||
- Under memory pressure, lazily loaded pages may be evicted and faulted back in repeatedly
|
||||
|
||||
### Real-world examples
|
||||
|
||||
- Executable code pages are often loaded on first use.
|
||||
- `mmap` of a large file typically does not read the whole file immediately.
|
||||
- After `fork`, Linux often uses copy-on-write so parent and child share pages until one writes.
|
||||
|
||||
Demand paging is a great interview bridge topic because it connects virtual memory, page tables, page faults, and performance.
|
||||
|
||||
## 15. Thrashing
|
||||
|
||||
Thrashing happens when the system spends too much time paging pages in and out and too little time doing useful work.
|
||||
|
||||
This usually occurs when the active working sets of processes do not fit in available RAM.
|
||||
|
||||
### Symptoms
|
||||
|
||||
- Very high page fault rate
|
||||
- Heavy disk I/O or swap activity
|
||||
- CPU utilization may drop because tasks keep waiting on memory
|
||||
- Throughput collapses
|
||||
- Tail latency becomes terrible
|
||||
|
||||
### Why it happens
|
||||
|
||||
If a process keeps needing pages that were just evicted, the system enters a destructive loop:
|
||||
|
||||
- page needed,
|
||||
- page fault,
|
||||
- load from disk,
|
||||
- evict another needed page,
|
||||
- repeat.
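
The loop above can be reproduced with a toy simulation. It assumes LRU replacement and a process that cycles over four pages. The function name `count_faults` is hypothetical, and real kernels use approximations of LRU rather than this exact policy.

```python
from collections import OrderedDict


def count_faults(accesses, frames):
    """Count page faults for an access sequence under LRU with a fixed frame count."""
    resident = OrderedDict()              # page -> True, ordered oldest to newest use
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)    # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)   # evict the least recently used page
            resident[page] = True
    return faults


accesses = [0, 1, 2, 3] * 25              # 100 accesses cycling over 4 pages

fits = count_faults(accesses, frames=4)       # working set fits in memory
thrashes = count_faults(accesses, frames=3)   # one frame short of the working set
print(fits, thrashes)
```

With four frames only the first touch of each page faults. With three frames every single access faults, because the page about to be needed is always the one that was just evicted.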
|
||||
|
||||
### Mitigations
|
||||
|
||||
- Add more RAM
|
||||
- Reduce multiprogramming level
|
||||
- Tune memory limits and eviction behavior
|
||||
- Use better locality-friendly algorithms
|
||||
- Reduce heap size or working set size
|
||||
- Avoid overcommitting memory aggressively
|
||||
|
||||
### Practical production example
|
||||
|
||||
A Java service in a container with tight memory limits may begin swapping or faulting heavily under burst traffic. Even if CPU looks available, the service becomes slow because it is memory-bound rather than compute-bound.
|
||||
|
||||
## 16. Segmentation
|
||||
|
||||
Segmentation divides memory into logical variable-sized regions called segments, such as code, data, stack, or heap.
|
||||
|
||||
Instead of address = page number + offset, the idea is more like:
|
||||
|
||||
- segment number
|
||||
- offset within the segment
|
||||
|
||||
Each segment has a base and limit.
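
A minimal sketch of base-and-limit translation, using a made-up segment table. The segment numbers, bases, and limits are hypothetical values.

```python
class SegmentationFault(Exception):
    """Raised when an offset falls outside its segment's limit."""


# Hypothetical segment table: segment number -> (base, limit in bytes).
SEGMENTS = {
    0: (0x1000, 0x400),   # e.g. code
    1: (0x8000, 0x200),   # e.g. stack
}


def translate(segment, offset):
    """Return the physical address for (segment, offset), checking the limit."""
    base, limit = SEGMENTS[segment]
    if offset >= limit:
        raise SegmentationFault(f"offset {offset:#x} exceeds limit {limit:#x}")
    return base + offset


ok = translate(0, 0x10)
print(hex(ok))

try:
    translate(1, 0x200)       # first byte past the end of segment 1
except SegmentationFault as exc:
    print("fault:", exc)
```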
|
||||
|
||||
### Why segmentation is attractive conceptually
|
||||
|
||||
- It matches program structure well.
|
||||
- Different segments can have different permissions.
|
||||
- Sharing logical regions can be natural.
|
||||
|
||||
### Main problem
|
||||
|
||||
Because segments are variable-sized, segmentation suffers from external fragmentation.
|
||||
|
||||
### Modern relevance
|
||||
|
||||
Pure segmentation is not the main model in modern general-purpose systems. Modern systems are dominated by paging, though some architectures preserve limited segmentation concepts for special purposes.
|
||||
|
||||
Still, segmentation remains important in interviews because it teaches the difference between logical program regions and fixed-size paging units.
|
||||
|
||||
## 17. Paging vs Segmentation
|
||||
|
||||
This comparison comes up often.
|
||||
|
||||
| Aspect | Paging | Segmentation |
| --- | --- | --- |
| Unit size | Fixed-size pages | Variable-size segments |
| View of memory | Physical-management oriented | Logical-program-structure oriented |
| Fragmentation | Internal fragmentation | External fragmentation |
| Allocation flexibility | High | Lower under pressure |
| Protection granularity | Page-based | Segment-based |
| Modern OS usage | Very common | Limited or combined |
|
||||
|
||||
### Strong interview explanation
|
||||
|
||||
Paging is better for efficient physical memory management because fixed-size frames are easy to allocate. Segmentation is better for expressing logical program structure, but variable-sized segments fragment memory. That is why modern systems mostly use paging, sometimes with segmentation concepts layered on top or retained for limited architectural roles.
|
||||
|
||||
## 18. Swapping
|
||||
|
||||
Swapping means moving memory contents between RAM and disk to free physical memory.
|
||||
|
||||
Historically, systems sometimes swapped entire processes. Modern systems usually work at page granularity, not by moving whole processes out all at once.
|
||||
|
||||
### Why swapping exists
|
||||
|
||||
- RAM is finite.
|
||||
- Some pages are cold and can be moved out temporarily.
|
||||
- This allows the system to keep more virtual memory in use than physical RAM alone would permit.
|
||||
|
||||
### Why swapping is dangerous for performance
|
||||
|
||||
Disk, even SSD, is far slower than RAM. If hot pages are swapped out and quickly needed again, latency explodes.
|
||||
|
||||
### Linux perspective
|
||||
|
||||
- Linux can swap anonymous pages under pressure.
|
||||
- The kernel also uses the page cache heavily for file-backed data.
|
||||
- In containerized systems, excessive swapping often causes severe performance issues, and some deployments disable swap to avoid unpredictable latency.
|
||||
|
||||
Swapping is sometimes useful as a safety buffer, but if a latency-sensitive service is actively depending on swap, it is usually already in trouble.
|
||||
|
||||
## 19. Stack vs Heap
|
||||
|
||||
This is a classic interview topic because it connects language semantics to OS memory layout.
|
||||
|
||||
### Stack
|
||||
|
||||
The stack is typically:
|
||||
|
||||
- Per thread
|
||||
- Automatically managed by function call discipline
|
||||
- Used for call frames, return addresses, parameters, and many local variables
|
||||
- Very fast to allocate and free because it usually just moves the stack pointer
|
||||
|
||||
Common properties:
|
||||
|
||||
- Lifetime is usually lexical or call-scoped.
|
||||
- Size is limited.
|
||||
- Deep recursion can cause stack overflow.
|
||||
|
||||
### Heap
|
||||
|
||||
The heap is typically:
|
||||
|
||||
- Shared by threads in the same process
|
||||
- Used for dynamically allocated objects
|
||||
- Flexible in lifetime and size relative to the stack
|
||||
- Managed by allocators or garbage collectors
|
||||
|
||||
Common properties:
|
||||
|
||||
- Allocation and freeing are more expensive than simple stack-pointer movement.
|
||||
- Fragmentation can occur.
|
||||
- Bugs such as leaks, double free, or use-after-free often involve heap memory.
|
||||
|
||||
### Language examples
|
||||
|
||||
#### C++
|
||||
|
||||
- Local automatic variable usually lives on the stack.
|
||||
- `new` typically allocates on the heap.
|
||||
- RAII helps tie resource lifetime to scope.
|
||||
|
||||
#### Java
|
||||
|
||||
- Each thread has a stack for method frames.
|
||||
- Most objects live on the heap managed by the JVM.
|
||||
- Some values may be optimized away or scalar-replaced by the JIT, so the old rule "objects are always on the heap" is directionally right for interviews but not perfectly literal.
|
||||
|
||||
### Strong interview summary
|
||||
|
||||
Stack allocation is fast and structured but limited and scope-bound. Heap allocation is flexible and long-lived but more expensive to manage and more prone to fragmentation and lifetime bugs.
|
||||
|
||||
## 20. Memory Leaks and Garbage Collection Basics
|
||||
|
||||
Memory leaks are not just a C or C++ problem. They also happen in managed runtimes, just in a different form.
|
||||
|
||||
### Memory leak in manual-memory systems
|
||||
|
||||
In C or C++, a memory leak usually means allocated memory is no longer needed but can no longer be freed because the program lost track of it.
|
||||
|
||||
Examples:
|
||||
|
||||
- `malloc` without `free`
|
||||
- `new` without `delete`
|
||||
- Overwriting the only pointer to an allocated object
|
||||
|
||||
### Memory leak in garbage-collected systems
|
||||
|
||||
In Java, Go, or other GC languages, a leak usually means memory is still reachable, so the garbage collector cannot reclaim it, even though the application no longer logically needs it.
|
||||
|
||||
Examples:
|
||||
|
||||
- Static caches that grow forever
|
||||
- Listeners never deregistered
|
||||
- `ThreadLocal` values retained too long
|
||||
- Maps holding references to expired sessions
|
||||
|
||||
GC prevents many manual deallocation bugs, but it does not prevent retaining useless objects.
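
The same pattern can be shown in any garbage-collected runtime. The sketch below uses Python for illustration, with a hypothetical `Session` class and `session_cache`. A weak reference is used only as a probe to observe whether the object is still alive, and the immediate reclamation shown relies on CPython's reference counting.

```python
import gc
import weakref


class Session:
    def __init__(self, user):
        self.user = user


session_cache = {}   # long-lived map that is never cleaned up


def handle_request(user):
    """Create a session, cache it, and return a weak probe to it."""
    session = Session(user)
    session_cache[user] = session
    return weakref.ref(session)


probe = handle_request("alice")
gc.collect()
leaked = probe() is not None      # still reachable through the cache

session_cache.pop("alice")        # drop the last strong reference
gc.collect()
reclaimed = probe() is None       # now the collector can reclaim it

print(leaked, reclaimed)
```

The collector did nothing wrong in the first half: the session was reachable, so it had to be kept. The fix is in application logic, such as eviction policies or explicit removal.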
|
||||
|
||||
### Garbage collection basics
|
||||
|
||||
Most modern garbage collectors are tracing collectors. They start from GC roots, such as stacks, registers, and global references, then mark reachable objects.
|
||||
|
||||
Common ideas you should know:
|
||||
|
||||
- Mark-sweep: mark reachable objects, reclaim the rest.
|
||||
- Mark-compact: mark reachable objects, then move the live objects together so free space becomes contiguous and fragmentation is reduced.
|
||||
- Copying collection: copy live objects into a new region, usually efficient for young generations.
|
||||
- Generational GC: exploit the fact that most objects die young, so collect young space frequently and old space less often.
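
To make the tracing idea concrete, here is a toy mark-sweep pass over a tiny object graph. The heap is modeled as a dictionary from object name to the names it references. This is a teaching sketch, not how any particular runtime lays out its heap.

```python
def mark_sweep(heap, roots):
    """Return (live, garbage) sets for a heap given as {object: [referenced objects]}."""
    marked = set()
    stack = list(roots)
    while stack:                       # mark phase: trace everything reachable from roots
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap[obj])
    garbage = set(heap) - marked       # sweep phase: everything unmarked is reclaimable
    return marked, garbage


heap = {
    "a": ["b"],
    "b": ["c"],
    "c": [],
    "d": ["e"],    # d and e reference each other but nothing reaches them
    "e": ["d"],
}

live, garbage = mark_sweep(heap, roots=["a"])
print(sorted(live), sorted(garbage))
```

Note that the `d` and `e` cycle is reclaimed because reachability, not reference counts, decides what survives.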
|
||||
|
||||
### Why GC exists
|
||||
|
||||
- Reduces manual memory-management bugs
|
||||
- Improves safety and developer productivity
|
||||
- Makes high-level languages practical at scale
|
||||
|
||||
### Why GC is not free
|
||||
|
||||
- Extra CPU overhead
|
||||
- Pause times or concurrent collection complexity
|
||||
- Write barriers and runtime bookkeeping
|
||||
- Potential memory overhead from fragmentation, reserve spaces, or collection strategy
|
||||
|
||||
### C++ angle
|
||||
|
||||
C++ usually relies on deterministic destruction rather than GC. Strong interview topics include:
|
||||
|
||||
- RAII
|
||||
- `unique_ptr`
|
||||
- `shared_ptr`
|
||||
- reference cycles with `shared_ptr`
|
||||
- custom allocators and arena allocation
|
||||
|
||||
## 21. Real-World Examples from Linux, Java, C++, and Modern Backend Systems
|
||||
|
||||
### Linux
|
||||
|
||||
#### Copy-on-write after `fork`
|
||||
|
||||
When a process forks, Linux does not eagerly copy every page. Parent and child initially share pages as read-only. If one writes, that page faults and the kernel creates a private copy.
|
||||
|
||||
This is a classic example of demand paging, page faults, and efficient memory sharing working together.
|
||||
|
||||
#### `mmap`
|
||||
|
||||
Linux can map files directly into a process address space. Reads and writes can then operate through memory access rather than explicit `read` and `write` calls.
|
||||
|
||||
This is important for:
|
||||
|
||||
- databases,
|
||||
- analytics engines,
|
||||
- file-backed caches,
|
||||
- zero-copy-style optimizations.
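As a rough illustration of file mapping, the sketch below reads a whole file through `mmap` instead of `read`. The helper name `read_via_mmap` is hypothetical, error handling is reduced to returning an empty string, and it assumes a POSIX system.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <string>

// Maps `path` read-only and returns its contents as a string.
// Returns an empty string on error or for an empty file.
inline std::string read_via_mmap(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return {};
    struct stat st {};
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return {};
    }
    std::size_t len = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (addr == MAP_FAILED) return {};
    // Touching the mapped bytes is what pulls file pages into memory.
    std::string out(static_cast<const char*>(addr), len);
    munmap(addr, len);
    return out;
}
```

Copying into a `std::string` defeats the zero-copy benefit; it is done here only to keep the example self-contained. Real users of `mmap` typically operate on the mapped region in place.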
|
||||
|
||||
#### Page cache
|
||||
|
||||
Linux uses RAM aggressively as a page cache for file data. This is why "free memory" is not the right metric by itself. Used memory may still be reclaimable cache.
|
||||
|
||||
### Java
|
||||
|
||||
Java memory interview discussion often includes:
|
||||
|
||||
- Heap for objects
|
||||
- Per-thread stacks
|
||||
- Metaspace for class metadata
|
||||
- GC generations
|
||||
- Stop-the-world pauses vs concurrent collectors
|
||||
- Off-heap memory via direct buffers or native libraries
|
||||
|
||||
Important practical point:
|
||||
|
||||
A Java service can fail from memory pressure even if heap graphs look reasonable because total memory also includes thread stacks, direct buffers, mapped files, metaspace, and native allocations.
|
||||
|
||||
### C++
|
||||
|
||||
C++ brings memory ownership and lifetime to the front.
|
||||
|
||||
Important practical topics:
|
||||
|
||||
- stack vs heap allocation,
|
||||
- manual memory management,
|
||||
- smart pointers,
|
||||
- object lifetime,
|
||||
- fragmentation under general-purpose allocators,
|
||||
- use-after-free,
|
||||
- double free,
|
||||
- arena allocation for predictable performance.
|
||||
|
||||
Many low-latency systems use custom allocators or memory pools to reduce allocator overhead and fragmentation.
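A bump allocator is about the simplest form of arena allocation. The class below is a minimal sketch under stated assumptions: single-threaded use, a fixed capacity, power-of-two alignments, and no destructor calls for objects placed in the arena. The name `Arena` and its interface are illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Minimal bump ("arena") allocator: allocating is a pointer increment,
// and everything is released at once with reset().
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity) {}

    // `align` must be a power of two. Returns nullptr when the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.get());
        std::uintptr_t current = base + offset_;
        std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        std::uintptr_t aligned = (current + mask) & ~mask;
        std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        offset_ = start + size;
        return buffer_.get() + start;
    }

    void reset() { offset_ = 0; }  // frees every allocation at once
    std::size_t used() const { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

The design trades flexibility for predictability: there is no per-object free, so there is no free-list search and no fragmentation inside the arena, but memory is only reclaimed when the whole arena is reset.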
|
||||
|
||||
### Modern backend systems
|
||||
|
||||
#### Containers and cgroups
|
||||
|
||||
A process may have plenty of virtual address space but still be killed because the container memory limit is reached. From an interview point of view, that shows the difference between address-space size, RSS, heap size, and actual allowed physical usage.
|
||||
|
||||
#### Databases and caches
|
||||
|
||||
Databases often care about page size, cache locality, huge pages, and NUMA effects because translation and memory locality directly affect throughput.
|
||||
|
||||
#### Managed services
|
||||
|
||||
High object churn in a Java service can increase GC frequency. The issue is not just "not enough memory" but often allocation rate, object lifetime distribution, and heap tuning.
|
||||
|
||||
#### Native services
|
||||
|
||||
A C++ service can show stable CPU but rising latency because of allocator contention, fragmentation, or page faults under memory pressure.
|
||||
|
||||
## 22. Common Interview Questions and How to Think About Them
|
||||
|
||||
### Why is virtual memory needed?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Virtual memory provides isolation, protection, flexible placement, sparse address spaces, efficient sharing, and the ability to load or back memory lazily. It is not only about using disk as extra memory.
|
||||
|
||||
### What is the difference between a page fault and a segmentation fault?
|
||||
|
||||
Strong answer:
|
||||
|
||||
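A page fault is the hardware exception raised when an access cannot be completed through the current page-table entry, for example because the page is not present or the access violates its permissions. Many page faults are valid and recoverable, as in demand paging or copy-on-write. A segmentation fault is the signal (`SIGSEGV` on Unix-like systems) that the operating system delivers to the process when the faulting access is invalid or violates permissions and the kernel cannot resolve it.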
|
||||
|
||||
### Why are page tables multi-level?
|
||||
|
||||
Strong answer:
|
||||
|
||||
A flat page table for a large sparse address space would waste too much memory. Multi-level paging allocates lower-level tables only where needed.
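A back-of-the-envelope calculation makes the waste concrete. The parameters below are assumptions chosen to resemble x86-64 (48-bit virtual addresses, 4 KiB pages, 8-byte entries, 512 entries per table), and the multi-level estimate assumes one contiguous, table-aligned mapping of at least one page.

```cpp
#include <cstdint>

// Assumed parameters, not taken from the text above.
constexpr std::uint64_t kVirtualBits = 48;
constexpr std::uint64_t kPageBits = 12;         // 4 KiB pages
constexpr std::uint64_t kEntryBytes = 8;
constexpr std::uint64_t kEntriesPerTable = 512;
constexpr std::uint64_t kTableBytes = kEntriesPerTable * kEntryBytes;

// A flat table needs one entry per virtual page, mapped or not.
constexpr std::uint64_t flat_table_bytes() {
    return (std::uint64_t{1} << (kVirtualBits - kPageBits)) * kEntryBytes;
}

constexpr std::uint64_t ceil_div(std::uint64_t a, std::uint64_t b) {
    return (a + b - 1) / b;
}

// Bytes of page-table memory for a 4-level table that maps one contiguous,
// table-aligned run of `mapped_pages` pages (mapped_pages >= 1).
constexpr std::uint64_t four_level_bytes(std::uint64_t mapped_pages) {
    std::uint64_t tables = 0;
    std::uint64_t entries = mapped_pages;
    for (int level = 0; level < 4; ++level) {
        entries = ceil_div(entries, kEntriesPerTable);  // tables at this level
        tables += entries;
    }
    return tables * kTableBytes;
}
```

Under these assumptions `flat_table_bytes()` is 2^39 bytes (512 GiB) per process, while `four_level_bytes(256)`, which maps 1 MiB, needs four tables, or 16 KiB.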
|
||||
|
||||
### What problem does the TLB solve?
|
||||
|
||||
Strong answer:
|
||||
|
||||
It caches recent address translations so each memory access does not require a costly full page-table walk.
|
||||
|
||||
### What is the difference between internal and external fragmentation?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Internal fragmentation is wasted space inside allocated units, like partially used pages. External fragmentation is wasted space between allocated regions, where enough total memory exists but not as one contiguous block.
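Both kinds of waste can be put into numbers. The sketch below assumes a 4 KiB page size and models free memory as a plain list of hole sizes; the helper names are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

constexpr std::size_t kPageSize = 4096;  // assumed page size in bytes

// Internal fragmentation: bytes lost when a request is rounded up to pages.
constexpr std::size_t internal_waste(std::size_t bytes) {
    std::size_t pages = (bytes + kPageSize - 1) / kPageSize;
    return pages * kPageSize - bytes;
}

// External fragmentation: true when the free holes add up to enough memory
// for a contiguous request, but no single hole is large enough.
inline bool blocked_by_external_fragmentation(
        const std::vector<std::size_t>& free_blocks, std::size_t request) {
    if (free_blocks.empty()) return false;
    std::size_t total = std::accumulate(free_blocks.begin(), free_blocks.end(),
                                        std::size_t{0});
    std::size_t largest =
        *std::max_element(free_blocks.begin(), free_blocks.end());
    return total >= request && largest < request;
}
```

For example, a 10,000-byte request occupies three pages and wastes 2,288 bytes internally. With holes of 3,000, 2,000, and 2,500 bytes, a contiguous 6,000-byte request fails even though 7,500 bytes are free.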
|
||||
|
||||
### Why does paging reduce external fragmentation?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Because physical frames are fixed-size and any page can be placed in any free frame, a process does not need one large contiguous block of physical memory.
|
||||
|
||||
### What happens when you call `malloc`?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Usually the allocator serves the request from an internal free list or arena. If necessary, it requests more pages from the kernel. Physical memory may still be assigned lazily on first access.
|
||||
|
||||
### Why can Java still have memory leaks?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Because GC only frees unreachable objects. If the program keeps references to objects it no longer logically needs, those objects remain reachable and consume memory.
|
||||
|
||||
### What is thrashing?
|
||||
|
||||
Strong answer:
|
||||
|
||||
Thrashing occurs when the system spends most of its time servicing page faults and swapping pages instead of doing useful work, usually because working sets exceed available RAM.
|
||||
|
||||
## 23. Practical Scenarios Interviewers Like
|
||||
|
||||
### Scenario 1: Service latency spikes under load even though CPU is not maxed
|
||||
|
||||
Possible memory-related explanations:
|
||||
|
||||
- major page faults,
|
||||
- swapping,
|
||||
- allocator contention,
|
||||
- GC pauses,
|
||||
- poor locality causing cache and TLB misses.
|
||||
|
||||
### Scenario 2: Container is OOM-killed even though Java heap was below `-Xmx`
|
||||
|
||||
Reasoning:
|
||||
|
||||
- total process memory includes more than Java heap,
|
||||
- thread stacks, direct buffers, metaspace, native libraries, and page cache charged to the cgroup can all matter,
|
||||
- cgroup limit is the real boundary.
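The reasoning above is simple arithmetic, sketched here with hypothetical numbers. The struct fields and values are assumptions for illustration, and counting full thread stacks gives an upper bound, since stack memory is typically committed lazily.

```cpp
#include <cstdint>

// Rough per-process memory budget for a JVM, in MiB. All fields are
// illustrative estimates, not values reported by any tool.
struct JvmFootprint {
    std::uint64_t heap_mib;
    std::uint64_t metaspace_mib;
    std::uint64_t direct_buffers_mib;
    std::uint64_t other_native_mib;
    std::uint64_t threads;
    std::uint64_t stack_mib_per_thread;
};

constexpr std::uint64_t total_mib(const JvmFootprint& f) {
    return f.heap_mib + f.metaspace_mib + f.direct_buffers_mib +
           f.other_native_mib + f.threads * f.stack_mib_per_thread;
}

constexpr bool exceeds_limit(const JvmFootprint& f, std::uint64_t limit_mib) {
    return total_mib(f) > limit_mib;
}
```

With a 1,536 MiB heap, 128 MiB of metaspace, 256 MiB of direct buffers, 100 MiB of other native memory, and 200 threads at 1 MiB each, the estimate is 2,220 MiB. That exceeds a 2,048 MiB container limit even though the heap alone is well below it.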
|
||||
|
||||
### Scenario 3: C++ process memory grows forever
|
||||
|
||||
Possibilities:
|
||||
|
||||
- actual leak,
|
||||
- retained caches,
|
||||
- allocator arenas not returned to the OS,
|
||||
- fragmentation,
|
||||
- memory-mapped growth.
|
||||
|
||||
### Scenario 4: `fork` is surprisingly cheap on Linux
|
||||
|
||||
Reasoning:
|
||||
|
||||
- because of copy-on-write, pages are not copied immediately,
|
||||
- the kernel mainly duplicates metadata and page-table structures,
|
||||
- actual copying happens only on write.
|
||||
|
||||
### Scenario 5: Large memory-mapped file is opened instantly
|
||||
|
||||
Reasoning:
|
||||
|
||||
- `mmap` mainly creates virtual mappings,
|
||||
- actual file pages are brought in lazily by page faults on access.
|
||||
|
||||
## 24. What to Say in an Interview When You Want to Sound Strong
|
||||
|
||||
If you need a compact but impressive explanation, this is a solid framing:
|
||||
|
||||
> Modern memory management is built around virtual memory. Each process gets its own virtual address space, and the MMU translates virtual addresses to physical frames using page tables. Paging allows non-contiguous physical allocation, which improves flexibility and largely removes external fragmentation. The TLB makes translation fast, while page faults let the OS load pages lazily through demand paging. Multi-level page tables keep metadata manageable for sparse address spaces. In practice, performance issues often come from page faults, poor locality, TLB pressure, swapping, fragmentation, or runtime-level behaviors like GC and allocator overhead.
|
||||
|
||||
That answer ties together theory, hardware, OS behavior, and real production effects.
|
||||
|
||||
## 25. Final Mental Model
|
||||
|
||||
If you remember only one model, remember this:
|
||||
|
||||
- Programs operate in virtual address spaces.
|
||||
- The OS and MMU map virtual pages to physical frames.
|
||||
- Page tables store mappings and permissions.
|
||||
- The TLB caches those mappings for speed.
|
||||
- Missing mappings trigger page faults.
|
||||
- Demand paging and swapping let the system use RAM lazily and extend apparent memory capacity.
|
||||
- Paging trades external fragmentation for manageable internal fragmentation and metadata overhead.
|
||||
- Real systems succeed or fail based on locality, working set size, and lifetime management.
|
||||
|
||||
Once this clicks, many interview topics stop feeling like disconnected definitions and start feeling like one coherent system.
|
||||
@@ -0,0 +1,2 @@
|
||||
File Systems
|
||||
Disk Scheduling
|
||||