# Computer Networking Guide

## How To Use This Guide

Computer networking becomes much easier when you stop treating it as a list of protocol names and start treating it as a system that answers four recurring questions:

1. Who am I trying to talk to?
2. How do I find that machine or service?
3. How does my data actually travel there and back?
4. How do we keep the communication reliable, fast, and secure?

This guide is written for a beginner-to-intermediate reader. The goal is not to memorize acronyms. The goal is to build mental models that let you reason about what happens when you open a website, call an API, stream a video, or troubleshoot a connection issue.

Throughout the guide, keep one idea in mind: networking is about moving data between processes running on different machines under real constraints such as latency, packet loss, congestion, and security risk.

---

## 1. Networking Fundamentals

### What A Network Is

A network is a group of devices that can exchange data over some communication medium. Those devices might be laptops on home Wi-Fi, servers inside a data center, phones using cellular service, or routers connecting entire regions of the Internet.

The communication medium can be physical, such as copper cable or fiber, or wireless, such as Wi-Fi, Bluetooth, or cellular radio. Regardless of the medium, the core job is the same: break information into a transferable form, send it across a path, and reconstruct it correctly at the destination.

At a high level, a network exists so that one machine does not need to physically share memory or storage with another machine to communicate. That design makes modern computing possible. Browsers can talk to web servers, payment terminals can talk to banks, streaming apps can talk to CDN edge servers, and microservices can talk to databases or other services.

### Why Networking Exists

Networking solves several important problems:

- Resource sharing: one printer, storage server, database, or API can be used by many clients.
- Communication at distance: machines can coordinate across a room, a building, a country, or the world.
- Scalability: instead of one giant machine doing everything, systems can be split into many cooperating services.
- Reliability: data can be replicated and services can fail over across multiple hosts or regions.
- Specialization: some machines can be optimized for storage, some for compute, some for caching, and some for routing traffic.

Without networks, modern web applications, cloud computing, video calls, multiplayer games, and distributed systems would not exist in recognizable form.

### Types Of Networks

Networks are often categorized by geographic scope or by the role they play.

| Type | Full Name | Typical Scope | Common Examples |
| --- | --- | --- | --- |
| PAN | Personal Area Network | Around one person | Bluetooth earbuds, smartwatch syncing with a phone |
| LAN | Local Area Network | Home, office, school, data center rack | Home Wi-Fi, office Ethernet |
| MAN | Metropolitan Area Network | Campus or city scale | University network across buildings |
| WAN | Wide Area Network | Regional or global | ISP backbone, cloud provider private backbone |
| Internet | Network of networks | Global | Public Internet connecting ISPs and organizations |

```mermaid
flowchart LR
    A[Phone] -->|PAN| B[Watch]
    A -->|LAN via Wi-Fi| C[Home Router]
    C -->|WAN via ISP| D[Internet]
    D --> E[Cloud Region]
    D --> F[Video Streaming Service]
    D --> G[Messaging Service]
```

This diagram shows an important truth: small local networks do not exist in isolation. They are usually attached to larger networks, and the Internet is the system that connects those networks together.

### Core Terms You Should Know Early

| Term | What It Means | Why It Matters |
| --- | --- | --- |
| Node | Any device participating in a network | Hosts, routers, switches, and printers are all nodes |
| Link | A communication path between two nodes | Could be Ethernet, Wi-Fi, fiber, or cellular |
| Packet | A unit of transmitted data | Networks move data in chunks rather than as one giant blob |
| Frame | Layer 2 unit of local delivery | Used on a local link such as Ethernet or Wi-Fi |
| Segment | Layer 4 unit in TCP | Helps transport protocols organize delivery |
| Bandwidth | Maximum carrying capacity of a link | Often described in Mbps or Gbps |
| Throughput | Actual achieved data rate | Usually lower than theoretical bandwidth |
| Latency | Time taken for data to travel | Critical for responsiveness |
| Jitter | Variation in delay over time | Important for voice and real-time media |
| Packet loss | Data dropped before delivery | Degrades quality and triggers retransmissions |

### Mental Model: Roads, Not Pipes

A common beginner mistake is imagining a network as one dedicated pipe between two applications. Real networks behave more like a road system.

- Your data is chopped into packets.
- Different packets may be queued, delayed, or even dropped.
- Intermediate devices make forwarding decisions hop by hop.
- Reliability is often created by software and protocols, not by the physical medium alone.

That is why the same website can feel fast one moment and slow the next even though your laptop and the website did not change. The path, congestion, routing, and server load all matter.

### Performance Concepts: Bandwidth Vs Latency

Bandwidth and latency are often confused, but they answer different questions:

- Bandwidth asks, "How much data can I move per second?"
- Latency asks, "How long does it take one piece of data to get there?"

Real-world example:

- Downloading a large game update mostly cares about bandwidth.
- A video call or multiplayer game cares heavily about latency and jitter.
- An API request that returns a tiny JSON payload may transfer almost no data, but still feels slow if round-trip latency is high.

### Real-World Scenario: Opening A Website

When you type a URL into a browser, you are already using multiple networking concepts:

1. Your machine joins a local network through Wi-Fi or Ethernet.
2. It learns configuration like its IP address and default gateway.
3. It resolves the domain name into an IP address using DNS.
4. It establishes a transport connection, often TCP or QUIC.
5. It negotiates encryption for HTTPS.
6. It exchanges HTTP messages with one or more servers.

By the end of this guide, each of those steps should feel concrete rather than mysterious.

### Quick Check

- Why does a network need both a communication medium and rules for using it?
- Why can a connection be slow even when bandwidth is high?
- What is the difference between a local network and the Internet?

---

## 2. Thinking In Layers

### Why Layering Exists

Networking would be unmanageable if every application had to worry about voltages on wires, local delivery on Wi-Fi, routing across the Internet, retransmission of lost data, encryption, and message formatting all at once. Layering solves this complexity problem by dividing responsibilities.

Each layer provides a service to the layer above it and relies on the layer below it. That gives several benefits:

- Separation of concerns: routing logic does not need to know how JSON is formatted.
- Interoperability: vendors can build compatible devices by following the same standards.
- Replaceability: Wi-Fi can replace Ethernet locally without changing how HTTP works.
- Debuggability: you can ask whether a failure is at the application, transport, routing, or local-link level.

### Encapsulation And Decapsulation

As data moves down the stack, each layer adds its own control information, usually in the form of headers.
At the receiver, the process is reversed.

```mermaid
flowchart LR
    A[Application Data<br/>HTTP request] --> B[TCP Segment<br/>Adds source and destination ports]
    B --> C[IP Packet<br/>Adds source and destination IP addresses]
    C --> D[Ethernet or Wi-Fi Frame<br/>Adds source and destination MAC addresses]
    D --> E[Bits On Wire Or Air]
```

This is called encapsulation. The reverse process, where the receiver removes headers layer by layer, is decapsulation.

### The OSI Model

The OSI model is a conceptual teaching model with seven layers. Real systems do not literally stop and say, "Now we are in layer 5," but the model is still extremely useful because it helps you categorize responsibilities.

```mermaid
flowchart TB
    L7[Layer 7 Application<br/>HTTP, DNS, SMTP]
    L6[Layer 6 Presentation<br/>TLS, compression, serialization]
    L5[Layer 5 Session<br/>Session setup, reuse, teardown]
    L4[Layer 4 Transport<br/>TCP, UDP]
    L3[Layer 3 Network<br/>IP, routing]
    L2[Layer 2 Data Link<br/>Ethernet, Wi-Fi, MAC]
    L1[Layer 1 Physical<br/>Signals, cables, radio]
    L7 --> L6 --> L5 --> L4 --> L3 --> L2 --> L1
```

| Layer | Name | Main Responsibility | Common Examples |
| --- | --- | --- | --- |
| 7 | Application | User-facing network services | HTTP, DNS, SMTP |
| 6 | Presentation | Representation of data | TLS, compression, UTF-8, JPEG |
| 5 | Session | Manage conversations between endpoints | Session resumption, RPC session handling |
| 4 | Transport | End-to-end delivery between processes | TCP, UDP |
| 3 | Network | Routing between networks | IP, ICMP |
| 2 | Data Link | Delivery on the local link | Ethernet, Wi-Fi, ARP |
| 1 | Physical | Transmission of raw bits | Copper, fiber, radio |

### Layer 7: Application

The application layer is where protocols directly used by software live. When a browser speaks HTTP, when a mail server speaks SMTP, or when a resolver sends a DNS query, that is application-layer behavior.

Why it exists:

- Applications need a shared language for requests, responses, and data semantics.
- A web browser and a web server need to agree on methods, headers, status codes, and message formats.

How it works internally:

- The application builds a message in a protocol-specific format.
- That message is handed to lower layers for transport.
- The receiver parses the protocol, validates the message, and performs application logic.

Real-world usage:

- Browsing a website uses HTTP or HTTPS.
- Looking up a domain name uses DNS.
- Sending mail uses SMTP.

### Layer 6: Presentation

The presentation layer is about how data is represented so both sides interpret it correctly. It covers concerns like encoding, encryption, and compression.

Why it exists:

- Raw bytes are useless unless both sides agree on meaning.
- Data may need to be compressed for efficiency or encrypted for confidentiality.

How it works internally:

- Data structures are serialized into bytes.
- Text is encoded using standards such as UTF-8.
- Encryption transforms readable plaintext into ciphertext.

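As a rough illustration of the first two steps, the sketch below serializes a Python dictionary to JSON, encodes the text as UTF-8, and compresses the bytes, then reverses each step on the receiving side. The function names `encode_for_wire` and `decode_from_wire` are hypothetical, the sample record is illustrative, and `zlib` stands in for whatever compression a real protocol would negotiate. Encryption is deliberately left out.

```python
import json
import zlib


def encode_for_wire(obj):
    """Turn an in-memory object into bytes that could be sent over a network."""
    text = json.dumps(obj, ensure_ascii=False)  # serialization: object -> text
    raw = text.encode("utf-8")                  # encoding: text -> bytes
    return zlib.compress(raw)                   # compression: fewer bytes to send


def decode_from_wire(payload):
    """Reverse each step in the opposite order on the receiving side."""
    raw = zlib.decompress(payload)
    text = raw.decode("utf-8")
    return json.loads(text)


# Illustrative data only; the non-ASCII character shows why the encoding matters.
message = {"id": 42, "name": "Asha", "city": "Zürich"}
wire_bytes = encode_for_wire(message)
round_trip = decode_from_wire(wire_bytes)

print(type(wire_bytes).__name__)
print(round_trip == message)
```

Both sides have to apply the same steps in mirrored order. If the receiver assumed a different text encoding or skipped decompression, the bytes would arrive intact but mean nothing.
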
Real-world usage:

- TLS encrypts web traffic.
- JSON turns application objects into text.
- Image formats such as JPEG define how bytes should be interpreted as an image.

### Layer 5: Session

The session layer manages the idea of an ongoing conversation between endpoints. In modern systems, its responsibilities are often folded into libraries or application protocols rather than exposed as a separate visible layer.

Why it exists:

- Some interactions are not one-off messages. They involve setup, maintenance, reuse, timeout, and teardown.

How it works internally:

- A protocol or library may create identifiers, maintain state, and resume conversations.
- Session information may be cached or reestablished after interruption.

Real-world usage:

- TLS session resumption reduces handshake cost.
- Database drivers often manage long-lived sessions or pooled connections.
- A web application may maintain a user session using cookies or tokens.

### Layer 4: Transport

The transport layer provides end-to-end communication between processes, not just between machines. That is why ports matter at this layer.

Why it exists:

- Applications need process-to-process communication.
- Many applications need reliability, ordering, flow control, or low-latency message delivery.

How it works internally:

- TCP offers a reliable byte stream using sequence numbers, acknowledgments, retransmissions, and flow control.
- UDP offers lightweight message delivery with much less built-in control.

Real-world usage:

- HTTPS typically uses TCP or QUIC.
- DNS often uses UDP for speed, with TCP used in some cases.
- Voice or gaming traffic often prefers lower-latency transport behavior.

### Layer 3: Network

The network layer is responsible for logical addressing and routing across multiple networks. IP lives here.

Why it exists:

- Local delivery alone is not enough. Traffic must cross routers and travel between networks.

How it works internally:

- Devices use IP addresses to identify source and destination interfaces.
- Routers examine the destination IP address and decide the next hop using routing tables.

Real-world usage:

- Your home router forwards packets to your ISP.
- Cloud routers move traffic between subnets and Internet gateways.

### Layer 2: Data Link

The data link layer is responsible for delivery on a single local link. Ethernet and Wi-Fi are common examples.

Why it exists:

- Even before traffic can be routed across the world, it must move correctly within the local network segment.

How it works internally:

- Frames use source and destination MAC addresses.
- Switches learn which MAC addresses are reachable on which ports.
- Protocols such as ARP help map IP addresses to MAC addresses on a local network.

Real-world usage:

- A laptop sends a frame to the MAC address of its default gateway.
- A switch forwards that frame to the correct port inside a LAN.

### Layer 1: Physical

The physical layer is the actual transmission of bits through a medium.

Why it exists:

- All higher-layer abstractions ultimately need electrical, optical, or radio signaling to carry information.

How it works internally:

- Hardware converts bits into signals.
- The receiver reconstructs those bits from the signal.
- Speed, distance, interference, and signal quality affect reliability.

Real-world usage:

- Fiber links carry high-speed long-distance traffic.
- Wi-Fi uses radio waves and is sensitive to interference and distance.

### The TCP/IP Model

The TCP/IP model is the model used more directly by the Internet. It is less granular than OSI but closer to how real systems are usually discussed.

| TCP/IP Layer | Rough OSI Mapping | Purpose |
| --- | --- | --- |
| Application | OSI 5, 6, 7 | Protocols used by software |
| Transport | OSI 4 | End-to-end process communication |
| Internet | OSI 3 | IP addressing and routing |
| Link | OSI 1, 2 | Local delivery over a physical medium |

```mermaid
flowchart LR
    A1[OSI Application, Presentation, Session] --> B1[TCP-IP Application]
    A2[OSI Transport] --> B2[TCP-IP Transport]
    A3[OSI Network] --> B3[TCP-IP Internet]
    A4[OSI Data Link and Physical] --> B4[TCP-IP Link]
```

### OSI Vs TCP/IP In Practice

The OSI model is best used for reasoning and teaching. The TCP/IP model is best used for describing what happens on the real Internet.

If someone says, "TLS is between HTTP and TCP," they are thinking in a practical TCP/IP sense. If someone says, "TLS is often placed at the presentation layer," they are thinking in OSI teaching terms. Both views are useful if you understand the purpose of the model.

### Quick Check

- Why is layering easier to manage than a single giant protocol?
- If an HTTPS request fails because of a bad certificate, which OSI layers are most relevant?
- When a router forwards a packet, which layers is it primarily using?

---

## 3. Addressing, Naming, And Local Discovery

### Why Addressing Matters

Data cannot be delivered unless the network can answer three separate identity questions:

- Which machine or interface should receive this traffic?
- Which process on that machine should receive it?
- On the local link, which hardware interface should I send the frame to next?

That is why networking uses multiple kinds of identifiers rather than one universal address.

### MAC Addresses, IP Addresses, And Ports

| Identifier | Scope | Purpose | Example |
| --- | --- | --- | --- |
| MAC address | Local link | Identifies the next local network interface | `00:1A:2B:3C:4D:5E` |
| IP address | Across networks | Identifies a logical interface for routing | `192.168.1.10` or `2001:db8::10` |
| Port | Inside a host | Identifies the destination process or service | `443` for HTTPS |

Mental model:

- IP is like the destination street address.
- Port is like the apartment or office number.
- MAC is like the label used by the local delivery truck for the next nearby handoff.

This analogy is not perfect, but it helps explain why all three are needed.

### IPv4 Addressing

IPv4 uses 32-bit addresses, usually written as four decimal numbers separated by dots, such as `192.168.1.10`.

Why IPv4 exists:

- Early Internet systems needed a simple logical addressing scheme for routing across many networks.

How it works internally:

- The address is a 32-bit binary value.
- Some bits identify the network portion.
- The remaining bits identify the host within that network.
- Routers use the network portion to make forwarding decisions.

Important IPv4 concepts:

- Public IP addresses are globally routable on the Internet.
- Private IP addresses are used inside local networks and are not routed directly on the public Internet.
- Loopback addresses such as `127.0.0.1` refer to the same machine.

Common private IPv4 ranges:

| Range | Typical Use |
| --- | --- |
| `10.0.0.0/8` | Large internal networks |
| `172.16.0.0/12` | Medium private networks |
| `192.168.0.0/16` | Home and small office networks |

### Subnetting Basics

Subnetting is the practice of dividing an IP address space into smaller logical networks.

Why it exists:

- It keeps broadcast domains smaller.
- It organizes networks by department, environment, or geography.
- It allows routing policies and security boundaries between subnets.

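To make the idea of a 32-bit address with a network portion and a host portion more concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper names `describe` and `is_on_same_network` are made up for illustration, and the sample addresses are ones this guide uses elsewhere.

```python
import ipaddress


def describe(cidr):
    """Summarize an interface address written as 'address/prefix-length'."""
    iface = ipaddress.ip_interface(cidr)
    return {
        "as_integer": int(iface.ip),               # the same address as one 32-bit number
        "as_bits": format(int(iface.ip), "032b"),  # and as 32 binary digits
        "network": str(iface.network),             # host bits zeroed out
        "is_private": iface.ip.is_private,
    }


def is_on_same_network(cidr, other_ip):
    """True if other_ip falls inside the network that cidr belongs to."""
    return ipaddress.ip_address(other_ip) in ipaddress.ip_interface(cidr).network


info = describe("192.168.1.34/24")
print(info["as_integer"], info["network"], info["is_private"])

# A neighbor on the same LAN versus a destination that must go through a router.
print(is_on_same_network("192.168.1.34/24", "192.168.1.77"))
print(is_on_same_network("192.168.1.34/24", "8.8.8.8"))
```

A host makes essentially this membership check before sending a packet: a destination inside its own network can be reached directly on the local link, and anything else is handed to a router.
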
How it works internally:

- CIDR notation like `/24` means the first 24 bits are the network part.
- The remaining bits are available for hosts.

Examples:

| Subnet | Meaning | Usable Host Count In Traditional IPv4 Terms |
| --- | --- | --- |
| `192.168.1.0/24` | 24 network bits, 8 host bits | 254 |
| `192.168.1.0/25` | Split a `/24` into two smaller networks | 126 each |
| `10.0.0.0/8` | Very large private network | Over 16 million |

Quick mental model:

- Larger prefix, such as `/28`, means a smaller subnet.
- Smaller prefix, such as `/16`, means a larger subnet.

Practical example:

- `192.168.1.34/24` belongs to the network `192.168.1.0/24`.
- A host in that subnet can talk directly to another `192.168.1.x` host on the same LAN.
- To reach `8.8.8.8`, it sends traffic to its default gateway because that destination is outside the local subnet.

### IPv6 Addressing

IPv6 uses 128-bit addresses, written in hexadecimal groups such as `2001:db8:85a3::8a2e:370:7334`.

Why IPv6 exists:

- IPv4 address space is limited.
- The modern Internet has far more devices than early designers expected.
- IPv6 also improves aspects of address assignment and protocol design.

How it works internally:

- IPv6 addresses are much larger, making exhaustion far less of a concern.
- Neighbor Discovery replaces ARP.
- IPv6 removes broadcast and relies more on multicast.

Real-world usage:

- Mobile carriers often use IPv6 heavily.
- Modern cloud and consumer systems increasingly support dual stack, meaning both IPv4 and IPv6.

Important note: IPv6 does not automatically make a network faster or more secure. It primarily solves addressing scale and modernizes parts of network behavior.

### ARP And Neighbor Discovery

Routers use IP addresses for inter-network forwarding, but a host on a LAN still needs to know which local MAC address should receive the next frame.

On IPv4, ARP solves this problem:

1. A host knows the destination is local, or knows it needs the MAC address of the default gateway.
2. It broadcasts an ARP request asking, "Who has this IP address?"
3. The matching device replies with its MAC address.
4. The sender stores the mapping in an ARP cache.

On IPv6, Neighbor Discovery plays a similar role using a different mechanism.

### DHCP

DHCP stands for Dynamic Host Configuration Protocol.

Why it exists:

- Manually configuring every host with an IP address, subnet mask, gateway, and DNS servers does not scale.

How it works internally:

- A client that joins a network asks for configuration.
- A DHCP server offers a lease.
- The client requests the offered lease.
- The server acknowledges it.

This is often remembered as DORA: Discover, Offer, Request, Acknowledge.

```mermaid
sequenceDiagram
    participant C as Client
    participant D as DHCP Server
    C->>D: Discover
    D->>C: Offer
    C->>D: Request
    D->>C: ACK
```

Real-world usage:

- Your home router often acts as a DHCP server for laptops, phones, and TVs.
- Enterprise networks may use centralized DHCP servers with reservations and policy control.

### DNS

DNS stands for Domain Name System. It translates human-friendly names such as `example.com` into machine-usable data such as IP addresses.

Why it exists:

- Humans remember names more easily than numeric addresses.
- Services may move between IP addresses without changing the public name.
- DNS records can also describe mail servers, verification data, aliases, and service discovery information.

How it works internally:

1. An application asks the operating system to resolve a name.
2. The OS may check local caches or hosts file entries first.
3. If the answer is not cached locally, a recursive resolver is queried.
4. The recursive resolver may contact root name servers, then top-level domain servers, then the domain's authoritative name servers.
5. The answer is returned and cached according to its TTL, which means Time To Live.

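The caching behavior in steps 2, 3, and 5 can be sketched as a small in-memory cache that forgets answers once their TTL has passed. This is a simplified model, not how any particular resolver is implemented. The names `TtlCache`, `resolve`, and `fake_upstream` are hypothetical, the upstream lookup is a stand-in for a recursive resolver, and `192.0.2.10` is a documentation-only address.

```python
import time


class TtlCache:
    """Remembers answers until their time-to-live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the demo can control time
        self._entries = {}       # name -> (address, expires_at)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: force a fresh lookup
            return None
        return address


def resolve(name, cache, upstream_lookup):
    """Answer from the cache when possible, otherwise ask upstream and cache the result."""
    cached = cache.get(name)
    if cached is not None:
        return cached, "cache"
    address, ttl_seconds = upstream_lookup(name)
    cache.put(name, address, ttl_seconds)
    return address, "upstream"


# Demo with a fake clock and a fake upstream resolver (assumptions for illustration).
fake_now = [0.0]
upstream_calls = []


def fake_upstream(name):
    upstream_calls.append(name)
    return "192.0.2.10", 30   # address and a 30-second TTL


cache = TtlCache(clock=lambda: fake_now[0])
first = resolve("api.example.com", cache, fake_upstream)
second = resolve("api.example.com", cache, fake_upstream)
fake_now[0] = 31.0            # pretend 31 seconds have passed
third = resolve("api.example.com", cache, fake_upstream)

print(first, second, third, len(upstream_calls))
```

The second lookup never reaches the upstream resolver, which is the point of caching. It is also why a changed record can keep resolving to the old address until the cached entry expires.
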
Important DNS roles:

| Component | Role |
| --- | --- |
| Stub resolver | Small client-side resolver in the OS or application |
| Recursive resolver | Does the lookup work on behalf of the client |
| Root server | Directs queries toward top-level domain servers |
| TLD server | Handles domains like `.com`, `.org`, `.net` |
| Authoritative server | Stores the actual records for a domain |

Common record types:

| Record | Purpose |
| --- | --- |
| `A` | Maps a name to an IPv4 address |
| `AAAA` | Maps a name to an IPv6 address |
| `CNAME` | Alias from one name to another |
| `MX` | Mail server for a domain |
| `TXT` | Arbitrary text, often for verification or policy |

```mermaid
sequenceDiagram
    participant U as User Browser
    participant R as Recursive Resolver
    participant Root as Root DNS
    participant TLD as .com TLD
    participant Auth as Authoritative DNS
    U->>R: Resolve api.example.com
    R->>Root: Where is .com?
    Root-->>R: Ask the .com TLD
    R->>TLD: Where is example.com?
    TLD-->>R: Ask authoritative DNS
    R->>Auth: What is api.example.com?
    Auth-->>R: A or AAAA record
    R-->>U: Final answer plus TTL
```

Real-world usage:

- A CDN may return different IP addresses based on region.
- A failover system may update DNS records when traffic needs to move.
- Short TTL values make changes propagate faster but increase lookup load.

### Practical Example: Resolving An API Hostname

Suppose a mobile app needs `api.shop.example`.

1. The app asks the OS resolver for the hostname.
2. The OS checks local cache.
3. If not cached, it asks a recursive resolver.
4. The recursive resolver finds the authoritative answer.
5. The app receives an IP address and can now start transport-level communication.

If DNS is broken, everything above it may appear broken even though the server is healthy. That is why name resolution is one of the first things to check when troubleshooting connectivity.

### Quick Check

- Why do we need both IP addresses and domain names?
- What problem does DHCP solve on a home or office network?
- Why can a DNS change take time to appear everywhere?

---

## 4. How Traffic Moves: Switching, Routing, And NAT

### Switching

A switch mainly operates at Layer 2. Its job is to move frames within a local network.

Why switching exists:

- Devices on the same LAN need efficient local delivery without sending every frame to every port forever.

How it works internally:

- The switch learns MAC addresses by observing the source MAC address of incoming frames.
- It stores a MAC address table mapping MAC addresses to switch ports.
- When a frame arrives, the switch checks the destination MAC address.
- If the destination is known, it forwards the frame only to the correct port.
- If unknown, it floods the frame so the destination can respond and be learned.

Real-world usage:

- An office switch connects desktops, printers, access points, and routers.
- A virtual switch inside a hypervisor performs a similar role for virtual machines.

### Routing

A router mainly operates at Layer 3. Its job is to move packets between networks.

Why routing exists:

- Local switching is not enough when the destination is on another subnet, another site, or somewhere across the Internet.

How it works internally:

- A router reads the destination IP address.
- It checks its routing table for the best match, usually using longest-prefix matching.
- It selects the next hop.
- It encapsulates the packet for the outgoing link and forwards it.

Real-world usage:

- Your home router sends non-local traffic to your ISP.
- Cloud virtual routers connect private subnets to Internet gateways, NAT gateways, or VPN links.

### Network Topology: LAN To WAN

Topology describes how devices and networks are arranged and connected.

```mermaid
flowchart LR
    subgraph HomeLAN[Home LAN]
        L1[Laptop]
        P1[Phone]
        AP[Wi-Fi AP or Switch]
        R1[Home Router]
        L1 --> AP
        P1 --> AP
        AP --> R1
    end
    R1 --> ISP[ISP WAN]
    ISP --> Internet[Internet]
    subgraph CloudLAN[Cloud Or Data Center LAN]
        ER[Edge Router]
        LB[Load Balancer]
        APP[Application Server]
        DB[Database]
        ER --> LB --> APP --> DB
    end
    Internet --> ER
```

This diagram shows why local behavior and Internet behavior are different. Your laptop uses one set of local-link rules inside the home LAN, but routing takes over once traffic leaves the LAN.

### Packet Flow Across Networks

Let us follow one packet from a laptop to a remote web server.

1. The browser creates application data.
2. The transport layer wraps it in TCP or QUIC-related transport data.
3. The IP layer adds source and destination IP addresses.
4. The host checks whether the destination is local.
5. If not local, it sends the frame to the MAC address of the default gateway.
6. The switch forwards the frame locally.
7. The router removes the old Layer 2 frame, checks the IP destination, chooses the next hop, and builds a new Layer 2 frame for the outgoing link.
8. This process repeats at each router until the packet reaches the destination network.

```mermaid
flowchart LR
    A[Laptop] --> B[Switch or Wi-Fi AP]
    B --> C[Home Router]
    C --> D[ISP Router]
    D --> E[Internet Routers]
    E --> F[Destination Edge Router]
    F --> G[Server]
```

Important mental model:

- The Layer 3 packet usually keeps the same source and destination IP addresses across the path.
- The Layer 2 frame changes at every hop because each local link is different.

### Default Gateway

The default gateway is the router a host sends traffic to when the destination is outside its local subnet.

Why it exists:

- A host does not keep a full map of the Internet.
- It only needs to know what is local and where to send everything else.

How it works internally:

- The host compares the destination IP against its own subnet.
- If the destination is not local, it forwards the packet to the gateway.

This is one of the most important troubleshooting concepts in networking.

### NAT

NAT stands for Network Address Translation.

Why it exists:

- Many private devices share a smaller number of public IPv4 addresses.
- Internal addressing can be hidden from the public Internet.

How it works internally:

- When a private host sends traffic outward, the NAT device rewrites the source IP address, often also rewriting the source port.
- It stores a mapping in a translation table.
- When return traffic arrives, the device uses that table to map the packet back to the original internal host.

The most common form in home networks is PAT, Port Address Translation, where many private clients share one public IP by using different port mappings.

Real-world usage:

- Home routers perform NAT for phones, laptops, and TVs.
- Cloud environments may use managed NAT gateways for private instances that still need outbound Internet access.

Tradeoffs:

- NAT conserves IPv4 addresses.
- NAT complicates direct inbound connections and peer-to-peer communication.
- NAT is not the same thing as a firewall, even though the two are often combined in one device.

### Switching Vs Routing

This distinction matters constantly:

- Switching is about local delivery inside a network segment.
- Routing is about moving traffic between different networks.

If two hosts are in the same subnet, switching dominates. If they are in different subnets, routing is required.

### Quick Check

- Why does a switch care about MAC addresses while a router cares about IP addresses?
- What does the default gateway do?
- Why is NAT common in IPv4 home networks?

---

## 5. Transport Layer: TCP And UDP

### Why The Transport Layer Matters

The network layer can get packets toward a destination machine, but applications still need a process-to-process communication model. They also need different delivery properties depending on the use case.

Some applications need strong reliability and ordering. Others care more about low latency than perfect delivery. The transport layer exists to offer these tradeoffs.

### TCP

TCP stands for Transmission Control Protocol.

Why it exists:

- Many applications need reliable, ordered delivery.
- Developers should not have to manually rebuild lost-packet handling for every web app, database driver, or SSH client.

How it works internally:

- TCP is connection-oriented, meaning endpoints establish state before exchanging normal application data.
- Data is tracked using sequence numbers.
- The receiver acknowledges what it has received.
- Lost data is retransmitted.
- Flow control prevents a fast sender from overwhelming a slow receiver.
- Congestion control tries to avoid flooding the network when loss or delay suggests congestion.

TCP provides a byte stream, not message boundaries. If an application sends two writes, the receiver may read them as one combined read or many smaller reads. That detail matters when writing network software.

#### TCP Three-Way Handshake

```mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: SYN
    S->>C: SYN-ACK
    C->>S: ACK
```

What this handshake accomplishes:

- Both sides agree to start communication.
- Initial sequence numbers are established.
- The server knows the client can receive responses.

#### TCP Reliability Features

- Sequence numbers keep track of byte order.
- Acknowledgments tell the sender what arrived.
- Retransmission recovers from packet loss.
- Sliding windows limit how much unacknowledged data can be in flight.
- Congestion control adapts sending behavior when the network appears overloaded.

Real-world usage:

- Web browsing over HTTP/1.1 and HTTP/2 commonly uses TCP.
- Databases often use TCP because query results must arrive correctly and in order.
- SSH uses TCP because interactive remote login cannot tolerate corrupted or missing command bytes.

### UDP

UDP stands for User Datagram Protocol.

Why it exists:

- Some applications want minimal transport overhead.
- Some applications prefer timeliness over waiting for retransmissions.

How it works internally:

- UDP sends independent datagrams.
- There is no built-in connection handshake like TCP.
- There is no built-in guarantee of delivery, ordering, or retransmission.

That does not mean UDP is "bad" or "unreliable by mistake." It means the application chooses how much reliability to add, if any.

Real-world usage:

- DNS queries often use UDP because they are small and latency-sensitive.
- Voice and video calls may use UDP-based transports because old data is often less useful than new data.
- Many online games use UDP for fast state updates.
- QUIC uses UDP as a foundation but adds reliability, security, and multiplexing in user space.

### TCP Vs UDP

| Property | TCP | UDP |
| --- | --- | --- |
| Setup | Connection-oriented | Connectionless |
| Reliability | Built in | Not built in |
| Ordering | Preserved | Not guaranteed |
| Flow control | Yes | No built-in mechanism |
| Congestion control | Yes | No built-in mechanism |
| Overhead | Higher | Lower |
| Typical uses | Web, APIs, databases, SSH | DNS, gaming, voice, custom transports |

### Client-Server Communication

At the transport level, the server typically listens on a well-known port and the client uses an ephemeral source port.

```mermaid
sequenceDiagram
    participant Client
    participant Server
    Client->>Server: Connect to 443 from ephemeral port
    Server-->>Client: Accept connection
    Client->>Server: Send request bytes
    Server-->>Client: Send response bytes
    Client->>Server: Close or keep alive
```

The combination of source IP, source port, destination IP, and destination port identifies a connection uniquely enough for the OS to track many simultaneous connections.

### Practical Examples

Browsing a website:

- Strong reliability matters.
- Missing bytes would corrupt HTML, CSS, JavaScript, or JSON.
- TCP or QUIC is a good fit.


Streaming video:

- Modern streaming often uses HTTP-based chunk delivery over TCP or QUIC.
- Real-time calling often uses UDP-oriented approaches because waiting too long for old audio packets makes the experience worse.

Sending a message:

- Chat apps may use HTTPS for normal API operations and WebSockets or long-lived connections for real-time updates.
- Reliable delivery and ordered events often matter.

### Quick Check

- Why is TCP described as a byte stream rather than a message protocol?
- Why might a voice call prefer lower latency over perfect retransmission?
- Why can a transport protocol choice affect user experience even when the application is the same?

---

## 6. Application Protocols: HTTP And HTTPS

### What HTTP Is

HTTP stands for Hypertext Transfer Protocol. It is the application-layer protocol that browsers, APIs, and many services use to exchange requests and responses.

Why it exists:

- Clients need a standard way to ask for resources or trigger actions.
- Servers need a standard way to describe success, failure, metadata, and content.

How it works internally:

- A client sends a request containing a method, path, headers, and optionally a body.
- The server processes the request and returns a response with a status code, headers, and optionally a body.

Simple example:

```text
GET /users/42 HTTP/1.1
Host: api.example.com
Authorization: Bearer
Accept: application/json
```

Possible response:

```text
HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: no-store

{"id":42,"name":"Asha"}
```

### Why HTTP Works Well For The Web

HTTP is successful because it is simple, extensible, and text-friendly at the semantic level.

- Methods communicate intent: `GET`, `POST`, `PUT`, `DELETE`.
- Status codes communicate outcomes: `200`, `404`, `500`.
- Headers carry metadata such as content type, authentication, caching rules, and cookies.
- Proxies, caches, and load balancers can understand and work with the protocol.
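To make the request format in this section concrete, here is a minimal Python sketch that serializes and parses a body-less HTTP/1.1 request. The names `HttpRequest`, `build_request`, and `parse_request` are hypothetical helpers invented for illustration. Real clients and servers handle many cases this sketch ignores, such as chunked bodies, repeated headers, and malformed input.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HttpRequest:
    """Hypothetical container for the parts of a request described above."""
    method: str
    path: str
    version: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""


def build_request(method: str, path: str, host: str,
                  extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize a body-less HTTP/1.1 request to bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines end with CRLF; an empty line separates headers from the body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_request(raw: bytes) -> HttpRequest:
    """Split raw bytes into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return HttpRequest(method, path, version, headers, body)


raw = build_request("GET", "/users/42", "api.example.com",
                    {"Accept": "application/json"})
request = parse_request(raw)
print(request.method, request.path, request.headers["host"])
```

The sketch is only meant to show why intermediaries can work with HTTP: the method, path, and headers sit in predictable positions, so a proxy or load balancer can read them without understanding the body.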

### HTTP Is Stateless

HTTP treats each request as independent unless some extra mechanism carries state across requests.

Why this matters:

- Statelessness improves scalability because servers do not have to remember everything by default.
- But applications often still need user identity and continuity.

How real systems solve this:

- Cookies can store a session identifier.
- Tokens can be sent in headers such as `Authorization`.
- Session data may be stored in Redis, a database, or encoded in signed tokens.

### HTTP Versions

#### HTTP/1.1

- Widely used for many years.
- Supports persistent connections.
- Often sends one in-flight request per connection in common usage patterns.

#### HTTP/2

- Allows multiplexing many streams over one TCP connection.
- Reduces overhead from opening many separate connections.
- Uses binary framing internally.

#### HTTP/3

- Runs over QUIC, which itself uses UDP.
- Improves behavior under packet loss and connection migration scenarios.
- Designed to reduce some latency costs and transport limitations of older web stacks.

### What HTTPS Adds

HTTPS is HTTP running over TLS.

Why it exists:

- Plain HTTP exposes requests and responses to eavesdropping or tampering.
- Users need confidentiality, integrity, and server identity verification.

How it works internally at a high level:

1. The client starts a secure session setup.
2. The server presents a certificate proving ownership or control of the domain identity.
3. The client verifies the certificate chain.
4. Both sides derive shared encryption keys.
5. HTTP messages are then exchanged inside the encrypted channel.

Real-world result:

- Passwords, session tokens, and API data are protected in transit.
- Users can verify that they reached the intended site rather than an impostor.
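The HTTPS steps above map fairly directly onto Python's standard `ssl` module. The following is a rough sketch, not a production client: `make_client_context` and `fetch_tls_details` are hypothetical helper names, and the choice of TLS 1.2 as a minimum version is an assumption made for illustration.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a client-side TLS context with certificate checks enabled."""
    # create_default_context() loads the system trust store and, for client
    # use, turns on certificate verification and hostname checking.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def fetch_tls_details(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, complete the TLS handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp_sock:
        # server_hostname is used for SNI and for hostname verification.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return {
                "protocol": tls_sock.version(),
                "cipher": tls_sock.cipher()[0],
                "subject": tls_sock.getpeercert().get("subject"),
            }


# Building the context needs no network access, so it is safe to inspect here.
context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `fetch_tls_details("www.example.com")` on a machine with Internet access would perform the TCP connection and the handshake. If the certificate chain or hostname does not verify, `wrap_socket` raises an `ssl.SSLError` subclass instead of returning, which is the "identity verification" step failing closed.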

### HTTP Request Lifecycle

```mermaid
sequenceDiagram
    participant B as Browser
    participant DNS as DNS Resolver
    participant Edge as CDN or Edge Proxy
    participant LB as Load Balancer
    participant App as Application Server
    participant DB as Database
    B->>DNS: Resolve www.example.com
    DNS-->>B: IP address
    B->>Edge: TCP or QUIC plus TLS setup
    B->>Edge: HTTP request
    Edge->>LB: Forward request
    LB->>App: Select healthy backend
    App->>DB: Query or write data
    DB-->>App: Result
    App-->>LB: HTTP response
    LB-->>Edge: Forward response
    Edge-->>B: Encrypted response
```

### Browsing A Website In Practice

When you open `https://shop.example/products`:

1. DNS resolves the hostname.
2. The browser establishes a secure transport session.
3. The browser sends an HTTP request.
4. An edge or CDN may serve cached content or forward the request inward.
5. A load balancer sends the request to a healthy app server.
6. The app server may query databases, caches, or other services.
7. The response is sent back to the browser.
8. The browser renders HTML, CSS, JavaScript, images, and possibly makes additional requests.

### API Request Flow

Suppose a frontend calls `GET /api/orders/123`.

- DNS resolves the API hostname.
- HTTPS protects the connection.
- The request includes an auth token.
- The API gateway or load balancer routes it to a backend service.
- The backend fetches order data from storage.
- The service returns JSON.
- The client renders the result or shows an error based on the status code.

### Streaming Video In Practice

Many beginners assume streaming video is one continuous custom media stream from server to user. In many modern systems, especially on the web, that is not how large-scale video delivery works.

Instead:

- Video is split into segments.
- Those segments are cached on CDNs.
- The player requests chunks over HTTP or HTTPS.
- The player can change quality dynamically based on bandwidth and latency.

This is called adaptive bitrate streaming.
It is a great example of application design working with network reality instead of pretending the network is perfect.

### Quick Check

- Why is HTTP called stateless?
- What problem does HTTPS solve beyond plain connectivity?
- Why can an HTTP request succeed even when the application later returns a `500`?

---

## 7. Infrastructure That Makes Networks Practical

### Load Balancing

Load balancing distributes incoming traffic across multiple backends.

Why it exists:

- One server may not be enough for performance or availability.
- Traffic may need to be spread across instances, zones, or regions.

How it works internally:

- A load balancer receives client traffic.
- It chooses a healthy backend using a policy such as round robin, least connections, weighted routing, or hashing.
- It may terminate TLS, inspect HTTP headers, or simply forward based on IP and port.

Types:

- Layer 4 load balancer: routes based on transport information such as IP and port.
- Layer 7 load balancer: understands application information such as URL paths, hostnames, or headers.

Real-world usage:

- Sending `/images` traffic to a static asset service.
- Sending API traffic to application servers.
- Draining traffic away from unhealthy instances using health checks.

### CDN

CDN stands for Content Delivery Network.

Why it exists:

- Users are geographically distributed.
- Serving all content from one origin server creates latency and bottlenecks.

How it works internally:

- Copies of content are cached at edge locations closer to users.
- DNS or routing logic directs clients to a nearby edge.
- If content is cached, the edge serves it directly.
- If not cached, the edge fetches it from the origin and may cache it for later requests.

Real-world usage:

- Images, JavaScript bundles, and video segments are common CDN content.
- CDNs may also provide TLS termination, bot filtering, DDoS mitigation, and edge compute features.
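The cache-hit and cache-miss behavior described for a CDN edge can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: `EdgeCache`, `origin`, and the fixed 60-second TTL are hypothetical, and a real CDN would also honor `Cache-Control` headers, evict entries under memory pressure, and revalidate stale content.

```python
import time
from typing import Callable, Dict, List, Tuple


class EdgeCache:
    """Toy CDN edge: serve cached bodies until their TTL expires."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the example can control time
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, path: str) -> Tuple[bytes, str]:
        """Return (body, "HIT") from cache or (body, "MISS") from the origin."""
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"
        # Not cached or expired: go to the origin and remember the result.
        body = self._fetch(path)
        self._entries[path] = (now + self._ttl, body)
        return body, "MISS"


origin_calls: List[str] = []


def origin(path: str) -> bytes:
    """Stand-in for the origin server; records every request it receives."""
    origin_calls.append(path)
    return f"content of {path}".encode()


fake_now = [0.0]  # mutable cell used as a controllable clock
edge = EdgeCache(origin, ttl_seconds=60, clock=lambda: fake_now[0])

first = edge.get("/images/logo.png")   # nothing cached yet
second = edge.get("/images/logo.png")  # served from the edge
fake_now[0] = 61.0                     # move past the TTL
third = edge.get("/images/logo.png")   # expired, so fetched again
print(first[1], second[1], third[1], len(origin_calls))
```

The point of the sketch is the origin call count: three client requests reach the edge, but only two reach the origin. At real scale that ratio is what reduces latency and origin load.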

### Firewalls And Security Basics

A firewall controls which traffic is allowed or denied.

Why it exists:

- Not every reachable service should be publicly accessible.
- Networks need policy enforcement, segmentation, and attack reduction.

How it works internally:

- Stateless filtering checks packets against rules such as source, destination, port, and protocol.
- Stateful firewalls track connection state and allow return traffic for established flows.
- Application-aware firewalls or WAFs inspect higher-level protocols like HTTP.

Important mental model:

- A firewall controls access.
- TLS encrypts traffic.
- NAT rewrites addresses.

These are different jobs, even though appliances may perform all three.

### Reverse Proxies And Gateways

A reverse proxy receives requests on behalf of one or more backend services.

Why it exists:

- It centralizes TLS termination, routing, authentication, rate limiting, and header normalization.
- It hides internal service layout from public clients.

How it works internally:

- Clients connect to the proxy.
- The proxy decides which backend should handle the request.
- The proxy forwards the request and relays the response.

Real-world usage:

- Nginx, Envoy, HAProxy, cloud API gateways, and managed ingress controllers.

### Modern Web Path

```mermaid
flowchart LR
    U[User] --> DNS[DNS Resolver]
    DNS --> EDGE[CDN or WAF]
    EDGE --> LB[Load Balancer]
    LB --> RP[Reverse Proxy or Gateway]
    RP --> APP[Application Service]
    APP --> CACHE[Cache]
    APP --> DB[Database]
```

This path is common enough that it is worth memorizing. Not every system has every component, but most production web systems use some version of this flow.

### Security Basics Worth Knowing Early

- Use HTTPS so data in transit is encrypted.
- Expose only the ports and services that should be reachable.
- Segment networks so internal systems are not all flat and mutually reachable.
- Use least privilege for firewall rules and access control.
- Monitor logs, connection patterns, and unusual traffic spikes.

### Quick Check

- Why are load balancers useful even when one server seems powerful enough?
- Why is a CDN not the same thing as a load balancer?
- Why is NAT not a replacement for security policy?

---

## 8. End-To-End Request Lifecycle

### What Happens When You Visit A Secure Website

Let us walk through `https://www.example.com/products/42` from beginning to end.

1. The browser parses the URL and sees the scheme is HTTPS, the host is `www.example.com`, and the path is `/products/42`.
2. The browser checks caches, then asks the operating system to resolve the hostname.
3. DNS returns an IP address, possibly for a CDN or edge proxy rather than the origin server.
4. The host decides whether the destination is local. It is not, so the packet will be sent to the default gateway.
5. The host uses ARP or Neighbor Discovery to learn the gateway's link-layer (MAC) address if needed.
6. The browser opens a transport connection, often TCP for HTTP/1.1 or HTTP/2, or QUIC for HTTP/3.
7. TLS negotiates encryption and verifies server identity.
8. The browser sends an HTTP request.
9. Edge infrastructure may cache, filter, or forward the request.
10. A load balancer chooses a backend.
11. The application may call other internal services, caches, or databases.
12. The response is generated and returned.
13. The browser processes headers, caches content if allowed, parses the body, and renders the page.
14. The page may trigger more requests for CSS, JavaScript, images, fonts, and API calls.

The key lesson is that one visible user action often becomes many network operations.

### Packet Flow Mental Model

When a packet crosses the Internet, each router does not understand your full web request. Routers mostly care about the network-layer destination and how to move the packet one hop closer.
The application meaning is mostly invisible to them unless a device is operating at a higher layer such as a reverse proxy, WAF, or load balancer.

That distinction helps explain why:

- A route can be correct even though the application returns `404`.
- DNS can work even while the HTTPS handshake fails.
- TCP can connect even while the application is unhealthy.

### Sending A Chat Message

Consider sending a message in a modern chat application.

1. The app may already maintain a long-lived secure connection.
2. The message is serialized into a protocol format.
3. It is sent over a transport connection.
4. A gateway authenticates the client.
5. The message service persists the event.
6. The event is pushed to recipients over their own active connections.

This example shows that networking is not only about one request and one response. Real applications often keep connections open and exchange many messages over time.

### Streaming Video

Consider streaming a video on a phone.

1. The app fetches a manifest describing available video qualities.
2. The player measures network conditions.
3. It requests video segments from nearby CDN edges.
4. If the network worsens, the player requests a lower bitrate version.
5. If the network improves, it upgrades quality.

This is a practical demonstration of adapting application behavior to bandwidth, latency, and packet loss.

### API Request Flow Between Services

Inside a data center or cloud VPC, service-to-service networking uses the same fundamentals on a different scale.

1. Service A resolves the name of Service B.
2. A connection is established.
3. A request is sent with auth and tracing metadata.
4. Service B processes it and may call other services.
5. Responses return through the same or related path.

Observability tools such as logs, metrics, traces, and packet captures help teams reason about these internal network flows.

### Quick Check

- Why can a single page load create many separate requests?
- Why might the first request to a site be slower than later requests?
- Which parts of the flow are about naming, which are about transport, and which are about application behavior?

---

## 9. Common Networking Tools

### Ping

`ping` tests reachability and measures round-trip time, usually using ICMP Echo Request and Echo Reply.

Why it is useful:

- It quickly tells you whether a host is reachable at all.
- It gives a rough latency measurement.

What it does not guarantee:

- A successful ping does not prove that HTTP, TLS, or an application is working.
- Some hosts or firewalls intentionally block ICMP, so a failed ping does not always mean the host is down.

Example:

```bash
ping example.com
```

### Traceroute

`traceroute` shows the path packets take hop by hop by manipulating TTL values and observing where packets expire.

Why it is useful:

- It helps identify where delay or loss appears along a path.
- It shows whether traffic reaches the expected network region.

How it works internally:

- Packets are sent with a small TTL.
- Each router decreases TTL by 1.
- When TTL reaches 0, the router drops the packet and often returns an ICMP Time Exceeded message.
- By increasing TTL step by step, the tool discovers successive hops.

Example:

```bash
traceroute example.com
```

### Netstat, Ss, And Lsof

These tools inspect sockets and connection state.

Why they are useful:

- They show which ports are listening.
- They show established connections and sometimes their states.
- They help answer questions like, "Is my server actually listening on port 8080?" or "Which process owns this port?"

Examples:

```bash
netstat -an
```

```bash
lsof -i :443
```

On many Linux systems, `ss` is the newer replacement for some `netstat` use cases.

### Dig And Nslookup

DNS problems are common enough that name-resolution tools are essential.

Why they are useful:

- They let you inspect DNS answers directly.
- They help you distinguish a DNS problem from a transport or application problem.

Examples:

```bash
dig example.com
```

```bash
nslookup example.com
```

### Curl

`curl` is invaluable for testing HTTP and HTTPS behavior directly.

Why it is useful:

- It shows headers, status codes, redirects, and response bodies.
- It lets you test APIs without a browser.

Example:

```bash
curl -I https://example.com
```

### Simple Troubleshooting Order

When a networked system seems broken, work from the bottom upward or from the cheapest checks outward:

1. Is the local machine connected at all?
2. Does it have an IP address and default gateway?
3. Does DNS resolution work?
4. Can you reach the remote host at the transport level?
5. Does TLS succeed?
6. Does the application return the expected result?

This layered troubleshooting approach prevents wasted time.

### Quick Check

- Why is `ping` a useful but incomplete test?
- What kind of problem is `dig` best at isolating?
- Why might `netstat` or `lsof` matter when a local server seems unreachable?

---

## 10. Common Interview Questions And Practical Scenarios

### Common Interview Questions

- What is the difference between a switch and a router?
- Why do we need DNS if we already have IP addresses?
- What is the difference between TCP and UDP, and when would you choose each?
- What happens when you type a URL into a browser?
- What problem does HTTPS solve that HTTP does not?
- What does NAT do, and why is it common with IPv4?
- What is a subnet, and why do networks use them?
- What is the role of a load balancer in a production system?
- Why is a CDN useful for globally distributed users?
- What is the difference between bandwidth and latency?

### Scenario 1: Website Is Down For Users In One Region

Things to think about:

- Is DNS returning different answers by geography?
- Is one CDN edge or regional load balancer unhealthy?
- Is there a routing issue between an ISP and the provider edge?
- Is the application healthy but unreachable from that region?

### Scenario 2: API Is Slow But CPU Usage Is Low

Things to think about:

- Is latency high between services?
- Is DNS slow or frequently uncached?
- Are TCP connections constantly being reestablished instead of reused?
- Is the database on another subnet or region adding round-trip delay?

### Scenario 3: Users Can Ping A Host But Cannot Load The Site

Things to think about:

- The host may be reachable while the web server process is down.
- A firewall may allow ICMP but block port `80` or `443`.
- TLS may be failing because of certificate or protocol mismatch.
- The application may be returning errors after transport succeeds.

### Scenario 4: A Service Works Internally But Not From The Internet

Things to think about:

- Is the service listening on the expected port?
- Is the firewall or security group allowing inbound traffic?
- Is NAT or port forwarding configured correctly?
- Is the service bound only to `127.0.0.1` instead of a public interface?

---

## 11. Final Mental Models To Keep

### Mental Model 1: Networks Move Packets, Applications Create Meaning

Routers and switches mostly move data. Applications decide what the data means.

### Mental Model 2: Layers Hide Complexity, They Do Not Eliminate It

A browser developer usually does not need to think about fiber optics, but those lower layers still matter when something breaks.

### Mental Model 3: Naming, Addressing, Routing, Transport, And Application Logic Are Different Problems

DNS resolves a name to an address. IP routes packets toward the destination network. Transport connects processes. HTTP defines application behavior. TLS protects the exchange.

If you can separate those concerns in your head, networking becomes much easier to reason about.

### Mental Model 4: Real Systems Optimize For Tradeoffs

There is no universal best protocol or architecture.

- TCP trades overhead for reliability.
- UDP trades guarantees for speed and flexibility.
- CDNs trade storage and complexity for lower latency.
- Load balancers trade simplicity for scale and resilience.
- Firewalls trade openness for control and safety.

### If You Remember Only A Few Things

1. A network is a system for moving data between devices and processes.
2. Layering exists to manage complexity.
3. DNS translates names to addresses.
4. IP routes between networks.
5. TCP and UDP offer different transport tradeoffs.
6. HTTP defines web requests and responses.
7. HTTPS adds encryption and identity verification through TLS.
8. Routing, NAT, load balancing, and CDNs are what make the Internet usable at scale.

Once these ideas are solid, advanced topics such as BGP, VPNs, service meshes, WebSockets, QUIC internals, and zero-trust networking become much easier to learn.