tarun-elango db37d59a6d electronics

Co-authored-by: Copilot <copilot@github.com>

2026-04-29 21:35:30 -04:00

Communication Protocols

This handbook is a practical reference for computer engineering students and working engineers who need more than textbook summaries of serial buses and interface standards. The goal is to build protocol intuition that holds up in real systems: boards that boot only when a cable is disconnected, sensors that vanish when bus speed increases, industrial links that fail only in the factory, automotive nodes that go bus-off, and USB devices that enumerate on one laptop but not another.

Communication protocols sit at the exact boundary where software assumptions meet electrical reality. Datasheets often present them as clean blocks and timing diagrams. Real products involve tolerances, routing, grounding, transceivers, firmware state machines, operating system drivers, cable quality, EMC, startup behavior, and failure recovery.

The material here is intentionally practical. It explains concepts from first principles, then connects them to board design, firmware, debugging, measurement, production tradeoffs, and design-review level decision making.

How to Use This Handbook

Read it in order the first time. Return to the protocol-specific sections when designing or debugging.

If you are new to embedded interfaces, start with the foundations and protocol-selection sections.
If you are already writing firmware, pay extra attention to framing, timing, buffering, termination, pull-ups, and recovery procedures.
If you are building products, focus on noise, cable effects, transceivers, isolation, protection, startup behavior, and field debugging.
If you are preparing for interviews or design reviews, use the quick reference, tradeoff sections, and final interview-level review.

Quick Reference

Interface	What it really is	Signaling style	Typical topology	Typical speed range	Where it shines	Most common engineering mistake
UART	Asynchronous serial framing handled by a UART peripheral	Single-ended logic-level signals	Point-to-point	`9.6 kbps` to a few `Mbps`	Debug consoles, modules, bootloaders	Treating logic-level UART as electrically interchangeable with RS-232 or RS-485
RS-232	Electrical standard often carrying UART-style data	Single-ended positive and negative voltages	Point-to-point cable	Commonly up to `115.2 kbps`	Legacy instruments, console ports, industrial equipment	Connecting TTL UART directly to RS-232 pins
RS-485	Differential electrical standard often carrying UART-style data	Differential pair	Multi-drop bus, often half-duplex	`100 kbps` over long runs to `10 Mbps` over short runs	Industrial networks, drives, meters, building control	Missing termination, missing biasing, or using star wiring
SPI	Synchronous shift-register style bus	Single-ended push-pull	One controller, one or more peripherals	`1 MHz` to tens of `MHz` or more	Flash, ADCs, displays, fast board-level peripherals	Wrong clock mode, bad chip-select timing, or too much bus loading
I2C	Shared open-drain addressed bus	Single-ended with pull-up resistors	Multi-drop board-level bus	`100 kHz`, `400 kHz`, `1 MHz`, up to `3.4 MHz`	Sensors, EEPROMs, PMICs, RTCs	Wrong pull-ups, address confusion, or too much capacitance
CAN	Multi-master differential message bus with arbitration and error confinement	Differential pair	Bus	Typically `125 kbps` to `1 Mbps` for Classical CAN	Automotive, robotics, industrial machinery	Wrong termination or bit timing, long stubs, ignoring error states
USB	Host-driven protocol family with discovery, descriptors, and transfer types	Differential pair with strict physical rules	Tiered star through hubs	`1.5 Mbps`, `12 Mbps`, `480 Mbps` for the common basics	PC peripherals, power plus data, firmware update, field service	Treating it like a simple serial cable rather than a full protocol stack

Five questions solve most interface-selection problems:

Is the link staying on one PCB, crossing a connector, or running through a cable in the field?
Is it point-to-point or shared among many devices?
Do you need deterministic arbitration, addressing, or hot-plug behavior?
How hostile is the electrical environment in terms of noise, ground shift, ESD, and cable length?
What matters more for this interface: simplicity, speed, robustness, interoperability, or cost?

1. Foundations: What a Communication Protocol Actually Is

1.1 Protocol, bus, interface, and physical standard are not the same thing

Engineers often use these words loosely, but the differences matter.

A protocol defines rules for communication: timing, framing, addressing, arbitration, error detection, and transaction behavior.
A physical layer defines voltages, currents, line drivers, connectors, and cable behavior.
A peripheral block is the hardware inside a chip that implements some of those rules.
A software stack configures the peripheral, handles interrupts or DMA, and interprets received data.

This is why framing a choice as "UART versus RS-232" is slightly wrong: the two sit at different layers. UART is usually the framing engine inside the chip. RS-232 is an electrical standard for sending serial data over a cable. A microcontroller can generate UART frames, then an RS-232 transceiver converts logic-level voltages into RS-232 levels.

The same pattern appears elsewhere:

UART plus RS-485 transceiver is common in industrial systems.
USB UART bridges expose a UART device through a USB connection to a PC.
CAN controllers and CAN transceivers are separate parts in many designs.

Understanding which layer is responsible for which behavior prevents a lot of bad debugging.

1.2 The major dimensions that separate protocols

When comparing interfaces, think in engineering dimensions rather than brand names.

Dimension	Why it matters	Typical choices
Clocking	Determines how timing is recovered	Asynchronous, shared clock, encoded timing
Electrical signaling	Determines noise tolerance and distance	Single-ended, open-drain, differential
Topology	Determines how many devices can talk	Point-to-point, shared bus, multi-drop, host tree
Duplex	Determines traffic direction limits	Simplex, half-duplex, full-duplex
Ownership	Determines who initiates traffic	Single controller, multi-master, host-device
Error handling	Determines recovery quality	None, parity, CRC, ACK/NACK, retransmission
Flow control	Determines how overruns are avoided	Fixed rate, handshaking, buffering, credits
Ecosystem	Determines software and interoperability cost	Bare-metal, Linux, industrial standard, PC class

1.3 Why digital communication fails in analog ways

A protocol diagram usually shows ideal 0 and 1 levels with perfect edges. Real links behave according to analog physics.

The real system sees:

trace resistance and inductance
cable capacitance
reflections from impedance mismatch
common-mode noise
ground offset between devices
finite edge rates
receiver thresholds and hysteresis
clock tolerance and jitter
ESD, EFT, and surge events

This is why an interface can be logically correct and still fail in hardware. A UART with the right baud rate can still fail because of missing ground reference. An I2C bus with correct firmware can still fail because pull-up resistors are too weak. A CAN network with correct identifiers can still fail because of long stubs and missing termination.

Professional interface work means treating every protocol as both a logic problem and an electrical system.

1.4 A practical layered mental model

Use this model when debugging any interface.

flowchart LR
	APP[Application or firmware] --> CTRL[Controller peripheral<br/>UART SPI I2C CAN USB]
	CTRL --> LINK[Framing timing arbitration<br/>and buffering]
	LINK --> PHY[Electrical signaling and transceiver]
	PHY --> MEDIUM[Trace connector cable<br/>and environment]
	MEDIUM --> PEER[Other device]

If communication fails, isolate which layer is broken:

Application layer: wrong command, wrong packet format, wrong state machine.
Controller layer: wrong configuration, interrupts, DMA, or buffer handling.
Link layer: wrong framing, addressing, arbitration, CRC, or timing.
Physical layer: wrong voltages, bad routing, noise, termination, or cabling.

1.5 Core signaling patterns you should recognize immediately

Asynchronous serial

The transmitter and receiver do not share a clock line. Instead, they agree on a nominal bit rate. A start bit gives the receiver a reference point, then the receiver samples bits at predicted times. UART is the main example.

Strength: minimal wires.

Risk: clock mismatch, framing errors, and timing drift that accumulates across each frame.

Synchronous serial

The clock is provided explicitly. Data changes and is sampled relative to that clock. SPI and I2C are examples, though I2C has its own open-drain behavior and arbitration rules.

Strength: easier timing recovery.

Risk: clock mode misunderstandings, clock stretching behavior, and board-level signal integrity issues.

Open-drain or open-collector shared lines

Devices actively pull the line low but do not actively drive it high. A resistor pulls it up when nobody is pulling low. I2C uses this because it allows safe sharing and arbitration.

Strength: multiple devices can share a line without direct high-versus-low driver fights.

Risk: edges rise slowly, resistor sizing matters, and bus capacitance limits speed.

Differential signaling

The receiver cares about the voltage difference between two wires, not their absolute voltage to ground. CAN and RS-485 use this. USB also uses differential pairs.

Strength: better noise rejection and better behavior over cables.

Risk: routing, termination, common-mode limits, and transceiver details still matter.

2. How to Choose the Right Interface

Protocol selection is an engineering tradeoff, not a trivia question. Start from the system constraints.

2.1 First-principles decision rules

If devices are on the same PCB and you need cheap addressed sharing at moderate speed, I2C is often the first candidate.
If devices are on the same PCB and you need simple high speed with low software overhead, SPI is often the first candidate.
If you need a debug port, modem link, or simple point-to-point module connection, UART is usually the simplest candidate.
If you must leave the PCB and run through a long cable in a noisy environment, differential standards such as RS-485 or CAN deserve strong preference.
If the other endpoint is a PC or phone and interoperability matters, USB often dominates despite the added complexity.

2.2 Decision flow

flowchart TD
	START[Need a wired digital interface] --> OFFBOARD{Leaves the PCB or crosses a long cable?}
	OFFBOARD -- No --> PCBUS{Talking to board-local peripherals?}
	PCBUS -- Yes --> SHARE{Need many addressed devices on two wires?}
	SHARE -- Yes --> I2C[I2C]
	SHARE -- No --> SPEED{Need higher speed or simple streaming?}
	SPEED -- Yes --> SPI[SPI]
	SPEED -- No --> UARTSEL[UART]
	OFFBOARD -- Yes --> HOST{Must interoperate with a PC host?}
	HOST -- Yes --> USB[USB]
	HOST -- No --> MULTI{Need a robust multi-node field bus?}
	MULTI -- Yes --> ARB{Need built-in arbitration and fault handling?}
	ARB -- Yes --> CAN[CAN]
	ARB -- No --> RS485[RS-485]
	MULTI -- No --> LEGACY{Legacy equipment or instrument port?}
	LEGACY -- Yes --> RS232[RS-232]
	LEGACY -- No --> SIMPLE[UART with suitable transceiver]

2.3 Real production heuristics

Keep SPI and I2C mostly on-board. They are usually poor choices for long, noisy cables.
Use UART when simplicity matters more than bus sharing or guaranteed robustness.
Use RS-485 when you want rugged serial over distance and you can manage bus ownership yourself.
Use CAN when you need a multi-node network that keeps working under contention and detects many classes of fault automatically.
Use USB when you need standardized host interoperability, drivers, or power-plus-data behavior.

3. UART

3.1 What UART is and why it exists

UART stands for Universal Asynchronous Receiver/Transmitter. It is one of the simplest and most common serial interfaces because it requires very few wires:

TX
RX
GND

Optional lines such as RTS and CTS add flow control.

UART exists because many systems need simple byte-oriented communication without dedicating a clock line. It is common for:

debug consoles
bootloader interfaces
GPS, cellular, and Bluetooth modules
industrial devices that pair a microcontroller UART with a line transceiver
low-cost links between processors

3.2 How asynchronous serial works from first principles

If two devices do not share a clock, they still need some way to agree where each bit begins. UART solves this by using:

an idle line state, usually high
a start bit, which drives the line low
a fixed bit time based on the agreed baud rate
one or more stop bits, which return the line high

The receiver continuously watches for the falling edge of the start bit. Once it sees that edge, it starts a timer and samples the incoming line near the center of each expected bit period. Many UART receivers oversample internally, often by 8x or 16x, to improve timing accuracy.

The important intuition is this: UART is not recovering the transmitter clock continuously. It is making a short prediction about where the next few bit centers will be. If the clocks differ too much, the receiver's sample point drifts and eventually lands too close to the bit boundary.

flowchart LR
	TXREG[TX buffer] --> SHIFT[UART shift register]
	SHIFT --> TXLINE[TX line idle high]
	TXLINE --> RXLINE[RX input line]
	RXLINE --> SAMPLE[Oversampling and bit-center timing]
	SAMPLE --> RXREG[RX buffer and interrupt or DMA]

3.3 Frame format

A common UART frame is written as 8N1:

8 data bits
N means no parity
1 stop bit

Other combinations exist, such as 7E1 or 8E2.

Frame structure:

idle state high
start bit low
data bits, usually LSB first
optional parity bit
stop bit or stop bits high

Parity detects any odd number of flipped bits in a frame, including every single-bit error, but it misses any even number of flips. That makes it weak compared with higher-level checksums or CRCs.
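
The frame structure above can be sketched in a few lines of C. This is a minimal illustration of an 8E1 frame (8 data bits, even parity, 1 stop bit), not a driver; the function names even_parity_bit and uart_frame_8e1 are hypothetical and do not come from any vendor library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Even parity bit: chosen so data bits plus parity hold an even number of 1s. */
uint8_t even_parity_bit(uint8_t data)
{
    uint8_t parity = 0;
    for (int i = 0; i < 8; i++) {
        parity ^= (uint8_t)((data >> i) & 1u);
    }
    return parity;
}

/*
 * Write the line levels for one 8E1 frame into out[] (1 = high, 0 = low).
 * Returns the number of bit times used: start + 8 data + parity + stop = 11.
 */
size_t uart_frame_8e1(uint8_t data, uint8_t out[], size_t out_len)
{
    assert(out_len >= 11);                        /* room for one full frame */
    size_t n = 0;
    out[n++] = 0;                                 /* start bit pulls the idle-high line low */
    for (int i = 0; i < 8; i++) {
        out[n++] = (uint8_t)((data >> i) & 1u);   /* data bits, least significant first */
    }
    out[n++] = even_parity_bit(data);             /* parity bit */
    out[n++] = 1;                                 /* stop bit returns the line high */
    return n;
}
```

Flipping two bits, for example turning 0x41 into 0x42, leaves the parity bit unchanged, so the receiver accepts the corrupted byte.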

3.4 Baud rate and clock error budget

Baud rate is the number of signaling symbols per second. In normal UART practice, one symbol corresponds to one bit, so baud rate and bit rate are often equal.

The receiver tolerates some mismatch between its clock and the transmitter clock, but not unlimited mismatch. Exact tolerance depends on implementation, oversampling, number of data bits, and where the sample point is placed. In practice, designers should be conservative:

use crystal or accurate oscillator sources when baud rate is high
verify clock accuracy across temperature and supply range
be extra careful with auto-generated baud divisors that create fractional error

A link that works at room temperature on the bench can fail in the field if oscillator error grows.
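
To make the divisor warning concrete, the sketch below computes the divisor and the resulting baud error for a UART that divides its input clock by 16 times an integer. That structure is an assumption: many MCUs add fractional dividers or 8x oversampling, so uart_divisor and uart_baud_error_percent are illustrative names, not a vendor API.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest integer divisor for a UART that oversamples by 16 (assumed structure). */
uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
    assert(baud > 0);
    return (clk_hz + 8u * baud) / (16u * baud);   /* round to nearest */
}

/* Signed error of the achieved baud rate versus the requested one, in percent. */
double uart_baud_error_percent(uint32_t clk_hz, uint32_t baud)
{
    uint32_t div = uart_divisor(clk_hz, baud);
    assert(div > 0);                              /* clock too slow for this baud rate */
    double actual = (double)clk_hz / (16.0 * (double)div);
    return 100.0 * (actual - (double)baud) / (double)baud;
}
```

With a 16 MHz clock, 115200 baud lands on a divisor of 9 and an error of about -3.5 percent, while 8 MHz at 9600 baud stays within about 0.2 percent. The last sample of an 8N1 frame falls roughly 9.5 bit times after the start edge, so even an idealized receiver runs out of margin near 5 percent total mismatch. The first case therefore leaves little room for the peer's own clock error.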

3.5 Flow control and buffering

UART has no built-in mechanism to prevent overruns. If the receiver cannot empty its buffer before more bytes arrive, data is lost.

Common strategies:

polling for slow links
interrupt-driven receive for moderate traffic
DMA for higher throughput or lower CPU overhead
hardware flow control using RTS and CTS
software flow control using XON and XOFF

Hardware flow control is generally more reliable than software flow control when binary data may include arbitrary byte values.

3.6 Real-world use cases

Boot logs from microcontrollers, Linux SBCs, and network equipment.
Serial links to GNSS receivers, cellular modems, barcode scanners, and industrial modules.
Service ports on production hardware.
Factory programming and provisioning.
Transport beneath RS-232 and RS-485 transceivers.

3.7 Practical design rules

Always share ground between logic-level UART devices unless isolation or differential transceivers are used.
Confirm voltage levels. 5 V TTL UART and 3.3 V UART are not always directly compatible.
Keep traces and cables short unless you add a line standard or transceiver intended for the environment.
Add ESD protection and series resistors when signals leave the board.
Decide whether boot-time chatter on the UART will affect connected equipment.

3.8 Common mistakes engineers make

Confusing UART with RS-232 and destroying a microcontroller pin by applying RS-232 voltages directly.
Forgetting the common ground path.
Swapping TX and RX and losing time because both devices appear healthy.
Matching nominal baud rate but mismatching parity, stop bits, or bit order.
Printing too much debug text in interrupt context and creating timing failures elsewhere.
Assuming the line is valid immediately at power-up when the peer device is still booting.

3.9 Debugging UART methodically

Verify wiring: TX to RX, RX to TX, and shared ground.
Measure idle state. Logic-level UART normally idles high.
Confirm voltage compatibility and logic thresholds.
Check baud, parity, stop bits, and flow control on both sides.
Use a logic analyzer or oscilloscope to measure actual bit period.
If data looks almost right but has random corrupt bytes, suspect baud error, oscillator drift, or noise.
If receive overruns occur, inspect interrupt latency, DMA setup, and buffer sizing.

Useful host-side commands:

stty -F /dev/ttyUSB0 115200 raw -echo
python -m serial.tools.miniterm /dev/ttyUSB0 115200

3.10 Software plus hardware example

Typical microcontroller setup:

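uart_init(UART1, 115200);                        /* 115200 baud */
uart_set_format(UART1, 8, UART_PARITY_NONE, 1);  /* 8N1: 8 data bits, no parity, 1 stop bit */
uart_enable_rx_interrupt(UART1);                 /* receive bytes in the ISR, not by polling */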

Typical embedded receive strategy:

UART ISR copies bytes into a ring buffer.
A task or main loop parses complete lines or packets.
Timeouts detect partial frames.

This split matters because parsing inside the interrupt often works at first, then collapses under real traffic.
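
A minimal sketch of the ring buffer behind that split is shown below, assuming a single producer (the UART ISR) and a single consumer (the main loop) on a single-core MCU. The type and function names are hypothetical. A real port may also need memory barriers or a critical section, depending on the core and compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 64u                      /* must be a power of two */

typedef struct {
    volatile uint32_t head;              /* advanced only by the ISR (producer) */
    volatile uint32_t tail;              /* advanced only by the main loop (consumer) */
    uint8_t data[RB_SIZE];
} ring_buffer_t;

/* Call from the UART receive ISR. Returns false if the buffer is full (byte dropped). */
bool rb_push(ring_buffer_t *rb, uint8_t byte)
{
    if (rb->head - rb->tail == RB_SIZE) {
        return false;                    /* overrun: record it, never block in an ISR */
    }
    rb->data[rb->head & (RB_SIZE - 1u)] = byte;
    rb->head = rb->head + 1u;
    return true;
}

/* Call from the main loop or a task. Returns false if nothing is waiting. */
bool rb_pop(ring_buffer_t *rb, uint8_t *byte)
{
    assert(byte != NULL);
    if (rb->head == rb->tail) {
        return false;
    }
    *byte = rb->data[rb->tail & (RB_SIZE - 1u)];
    rb->tail = rb->tail + 1u;
    return true;
}
```

The ISR does nothing except call rb_push, and the parser drains the buffer with rb_pop at its own pace. Counting rejected pushes gives a cheap overrun statistic for field debugging.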

3.11 Interview-level understanding

Strong answers about UART usually mention:

asynchronous timing based on a start bit
why idle is high on standard UART logic
clock mismatch and sampling drift
the role of parity and why it is weak
the difference between logic-level UART and line standards such as RS-232 or RS-485

4. RS-232

4.1 What RS-232 is and what problem it solved

RS-232 is a legacy but still relevant serial communication standard designed for point-to-point communication between equipment such as terminals, computers, modems, and instruments.

Its importance today is not that it is modern. Its importance is that many industrial, lab, telecom, and infrastructure devices still expose RS-232 ports because it is simple, well-understood, and supported by decades of tooling.

4.2 Why RS-232 is not just "UART with a connector"

RS-232 usually carries asynchronous serial data, but electrically it is very different from logic-level UART.

Key differences:

RS-232 uses positive and negative voltages relative to ground.
The logic sense is inverted compared with common TTL UART implementations.
It is intended for cable connections between external devices.

That means a microcontroller UART pin cannot normally connect directly to an RS-232 cable. A transceiver such as a MAX232-class device is used to translate levels.

4.3 Signaling intuition

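RS-232 defines two line states: "mark" (logic 1, the idle state) is a negative voltage and "space" (logic 0) is a positive voltage. Driver output levels vary by equipment, and a receiver treats anything between roughly -3 V and +3 V as undefined, but the important engineering idea is simple: it is not a 0 V to 3.3 V logic interface.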
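
The inversion is easy to see in code. The sketch below classifies a receiver-side voltage using the nominal +/-3 V thresholds of the standard. Real transceivers often accept a wider range of inputs, so treat it as an idealized model with hypothetical names.

```c
#include <assert.h>

typedef enum {
    RS232_UNDEFINED = -1,
    RS232_SPACE = 0,      /* logic 0 */
    RS232_MARK = 1        /* logic 1, also the idle state */
} rs232_state_t;

/* Classify a receiver-side line voltage against the nominal +/-3 V thresholds. */
rs232_state_t rs232_classify(double volts)
{
    if (volts <= -3.0) return RS232_MARK;    /* negative voltage = mark */
    if (volts >= 3.0)  return RS232_SPACE;   /* positive voltage = space */
    return RS232_UNDEFINED;                  /* transition region */
}

/* Why a 3.3 V logic pin cannot stand in for an RS-232 driver. */
void rs232_logic_level_example(void)
{
    assert(rs232_classify(3.3) == RS232_SPACE);      /* idle-high logic reads as space, not mark */
    assert(rs232_classify(0.0) == RS232_UNDEFINED);  /* logic low sits in the undefined band */
}
```

Under this model a logic-level UART never produces a valid mark, which is the electrical reason a level-translating transceiver is required.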

Why this helped historically:

larger voltage swing improved noise margin over cables
a standardized external interface made equipment interoperable
control signals such as RTS, CTS, DTR, and DSR supported modem-era workflows

4.4 DTE, DCE, and null modem confusion

One of the classic RS-232 mistakes is misunderstanding whether a device behaves as DTE or DCE.

DTE roughly means terminal or computer side.
DCE roughly means modem or communications equipment side.

If both ends are of the same type, a null-modem style crossover may be required. This is why two working devices can still fail to communicate despite correct baud settings.

flowchart LR
	DTE[DTE such as PC or controller] <-->|TXD RXD RTS CTS GND| DCE[DCE such as modem or instrument]

4.5 Where RS-232 still appears in real engineering

CNC machines and industrial controllers
laboratory instruments
network equipment console ports
telecom infrastructure
legacy building and access-control systems

4.6 Common mistakes engineers make

Directly connecting microcontroller UART pins to RS-232 lines.
Ignoring handshake lines when the device requires them.
Using the wrong cable type or wrong pinout.
Assuming a USB serial adapter gives RS-232 levels when it may only give TTL UART.
Forgetting that old equipment may expect lower baud rates or unusual frame settings.

4.7 Debugging RS-232

Confirm whether the device speaks RS-232 voltage levels or TTL UART.
Check connector pinout and whether crossover is required.
Verify frame format and flow control requirements.
Use a USB RS-232 adapter, not just a USB TTL serial adapter, when appropriate.
If hardware flow control is expected, confirm RTS and CTS behavior on the scope.

4.8 Design guidance

Use RS-232 mainly when interoperability with existing equipment matters.
Do not choose it for new multi-drop networks.
Protect off-board connectors against ESD.
If ground differences or harsh industrial environments are present, consider isolation.

5. RS-485

5.1 What RS-485 is and where it fits

RS-485 is an electrical standard for differential serial communication. It is widely used for long cables, noisy environments, and multi-drop networks. Unlike RS-232, it is designed to be robust in industrial wiring scenarios.

A common pattern is:

microcontroller UART generates bytes
RS-485 transceiver converts logic signals to differential bus levels
higher-level protocol such as Modbus RTU defines addressing and message structure

So RS-485 is usually not the whole protocol. It is the physical layer beneath a serial protocol.

5.2 Why differential signaling helps

RS-485 uses a differential pair, usually named A and B. The receiver looks at the voltage difference between the two wires rather than the absolute voltage to ground.

This improves robustness because noise coupled equally onto both wires tends to cancel out at the receiver.

It also supports longer cable runs and multi-node buses better than simple single-ended UART wiring.

5.3 Half-duplex and bus ownership

Many RS-485 networks are half-duplex on a two-wire bus. That means all nodes share the same pair and only one should actively transmit at a time.

This makes bus ownership important. If two devices drive simultaneously, frames collide and data becomes garbage.

Many transceivers expose:

DE driver enable
RE receiver enable, often active-low

Firmware often controls DE around each UART transmission. The timing must be correct: enable before transmit starts, hold until the last stop bit clears the shift register, then release the bus.

5.4 Termination, biasing, and topology

These three topics separate robust RS-485 networks from unreliable ones.

Termination

Termination resistors, often 120 Ohm, are placed at the two physical ends of the main bus to reduce reflections.

Biasing

Because the bus may otherwise float when nobody is driving it, fail-safe bias resistors create a known idle state.

Topology

RS-485 wants a bus, not a star. Long stubs create reflections and distort edges.

flowchart LR
	T1[120 Ohm termination] --- N1[Node 1]
	N1 --- N2[Node 2]
	N2 --- N3[Node 3]
	N3 --- T2[120 Ohm termination]

5.5 Practical design rules

Put termination only at the two physical ends of the bus, not at every node.
Keep stubs short.
Use twisted pair cabling.
Consider shield and isolation based on the grounding environment.
Verify the transceiver common-mode range for the installation.
Add transient protection for field wiring.

5.6 Where RS-485 appears in production

Modbus RTU networks
motor drives and inverters
energy meters
building automation
long-distance sensor and controller networks
industrial HMIs and PLC ecosystems

5.7 Common mistakes engineers make

Wiring RS-485 in a star.
Omitting biasing and then chasing random framing errors.
Leaving termination off or placing it everywhere.
Forgetting to manage DE timing in firmware.
Assuming differential signaling removes all grounding concerns.
Running too fast for the cable length and topology.

5.8 Debugging RS-485

Confirm the bus is actually wired as a line, not a hub-and-spoke network.
Verify A and B polarity against the specific vendor naming, because naming conventions can be confusing across datasheets.
Check for termination at both ends only.
Check that idle bias exists when no node is transmitting.
Scope DE, UART TX, and bus output together to verify enable timing.
If errors increase with cable length or node count, lower bitrate and inspect topology.

5.9 Software plus hardware example

Typical transmit sequence on a half-duplex node:

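gpio_set(RS485_DE, 1);               /* enable the driver before the first start bit */
uart_write(UART2, frame, len);       /* queue the frame for transmission */
uart_wait_for_tx_complete(UART2);    /* wait until the last stop bit has left the shift register */
gpio_set(RS485_DE, 0);               /* release the bus so other nodes can drive it */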

The critical detail is uart_wait_for_tx_complete. Waiting only until the transmit FIFO is empty is not enough on many MCUs. The last bits may still be leaving the shift register.
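
When checking DE timing on a scope, it helps to know how long the frame should occupy the wire. The helper below is a rough sketch with a hypothetical name; it assumes back-to-back characters with no inter-byte gaps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time on the wire for `bytes` characters, in microseconds, rounded up.
 * bits_per_char counts start + data + parity + stop bits (10 for 8N1).
 */
uint32_t uart_wire_time_us(uint32_t baud, uint32_t bytes, uint32_t bits_per_char)
{
    assert(baud > 0);
    uint64_t bits = (uint64_t)bytes * bits_per_char;
    return (uint32_t)((bits * 1000000u + baud - 1u) / baud);
}
```

An 8-byte 8N1 frame at 9600 baud needs about 8334 microseconds. A DE pulse that is noticeably shorter than that means the driver is being released before the last character has fully left the shift register.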

5.10 Interview-level understanding

Strong answers mention:

differential signaling for long noisy links
multi-drop capability
termination and biasing
half-duplex bus ownership
the fact that RS-485 is an electrical layer, not usually a complete application protocol

6. SPI

6.1 What SPI is and why engineers like it

SPI is a synchronous serial bus typically used for fast communication between a controller and peripherals on the same board.

Common signals:

SCLK clock from controller
MOSI controller-out, peripheral-in
MISO peripheral-out, controller-in
CS or SS chip select

Engineers like SPI because it is simple, fast, and easy to implement in hardware. There is little protocol overhead, and full-duplex transfer is built into the signaling model.

6.2 How SPI works from first principles

SPI behaves like two shift registers connected together. On each clock cycle, the controller shifts one bit out and samples one bit in: one clock edge launches the data and the opposite edge samples it. The peripheral does the same.

This means every SPI transaction is inherently full-duplex, even when your application thinks of it as a write followed by a read.

flowchart LR
	CTX[Controller TX shift register] -- MOSI --> PRX[Peripheral RX shift register]
	PTX[Peripheral TX shift register] -- MISO --> CRX[Controller RX shift register]
	CLK[SCLK from controller] --> CTX
	CLK --> PTX
	CS[Chip select low] --> PRX
	CS --> PTX
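
The two-shift-register picture can be modeled directly. This is a software sketch of one 8-clock, MSB-first transfer, with a hypothetical function name; it ignores clock mode and CS handling.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-clock SPI transfer, MSB first, modeled as two linked shift registers. */
void spi_exchange_byte(uint8_t *controller_reg, uint8_t *peripheral_reg)
{
    assert(controller_reg && peripheral_reg);
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((*controller_reg >> 7) & 1u);  /* controller drives its MSB */
        uint8_t miso = (uint8_t)((*peripheral_reg >> 7) & 1u);  /* peripheral drives its MSB */
        *controller_reg = (uint8_t)((*controller_reg << 1) | miso);  /* shift in MISO */
        *peripheral_reg = (uint8_t)((*peripheral_reg << 1) | mosi);  /* shift in MOSI */
    }
}
```

After eight clocks the two registers have swapped contents. A "read" is just a transfer where the controller ignores what it sent, and a "write" is one where it ignores what it received.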

6.3 Why clock mode matters

SPI has no single universal standard for which clock edge is used to sample or change data. Instead, devices define timing in terms of CPOL and CPHA.

CPOL chooses idle clock polarity.
CPHA chooses which clock edge is used for sampling relative to data transitions.

If these are wrong, communication may look almost correct, which makes debugging deceptive. You might see the right number of clocks and still read nonsense.
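
Datasheets and logic-analyzer decoders often refer to modes 0 through 3 instead of naming CPOL and CPHA. The sketch below uses the common numbering mode = (CPOL << 1) | CPHA. A few vendors define the bits differently, so confirm against the timing diagram; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool cpol;   /* false: clock idles low, true: clock idles high */
    bool cpha;   /* false: sample on leading edge, true: sample on trailing edge */
} spi_mode_t;

/* Common numbering: mode = (CPOL << 1) | CPHA. */
spi_mode_t spi_mode_from_number(unsigned mode)
{
    assert(mode <= 3u);
    spi_mode_t m = { (mode & 2u) != 0u, (mode & 1u) != 0u };
    return m;
}

/* True if data is sampled on the rising SCLK edge, false if on the falling edge. */
bool spi_samples_on_rising_edge(spi_mode_t m)
{
    /* The leading edge is rising when the clock idles low, so the sampling
       edge is rising exactly when CPOL and CPHA are equal (modes 0 and 3). */
    return m.cpol == m.cpha;
}
```

Modes 0 and 3 both sample on the rising edge and differ only in idle level. This is why a wrong mode can appear to work until the first bit after CS goes low is examined.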

6.4 Transaction anatomy

Typical transaction:

controller asserts CS
controller clocks out command or address bytes
peripheral interprets command
data is shifted in and out during clock pulses
controller deasserts CS

Many peripherals treat CS edges as the only transaction boundaries. If firmware leaves CS asserted too long or toggles it between bytes unexpectedly, the device state machine may desynchronize.

6.5 Why SPI is fast but not very standardized

SPI is attractive because it has low overhead and does not need pull-ups or addressing rules. The tradeoff is that many details are device-specific:

command set
register address width
dummy cycles
burst behavior
maximum clock rate
CS setup and hold timing
which edge is valid

This is why integrating a new SPI peripheral often means reading timing diagrams very carefully.

6.6 Typical use cases

NOR and NAND flash
high-speed ADCs and DACs
displays and touch controllers
IMUs and radio transceivers
FPGA configuration or control interfaces

6.7 Common mistakes engineers make

Using the wrong CPOL and CPHA mode.
Ignoring CS timing requirements.
Sharing MISO among devices that are not truly tri-stated when deselected.
Running the clock too fast for long traces or bad layout.
Forgetting that many devices need dummy bytes before meaningful readback.
Assuming SPI has addressing like I2C.

6.8 Board-level design considerations

Keep SPI mostly on-board and physically compact.
Treat higher-speed SPI lines as real signal-integrity problems, not just logic nets.
Match voltage domains or use proper level shifting.
Consider series resistors on fast edges to reduce ringing.
Verify the peripheral's output drive strength and controller input timing.

6.9 Debugging SPI

Confirm the selected peripheral sees the intended CS waveform.
Verify clock polarity, phase, and bit order.
Measure whether data is stable around the configured sampling edge.
Reduce the SPI clock to see whether the problem is timing or protocol.
Check whether the peripheral requires a wake-up, reset, or initial dummy transaction.
Decode captures with a logic analyzer, but only after verifying the decoder is using the right SPI mode.

Useful Linux tooling:

spidev_test -D /dev/spidev0.0 -s 1000000

6.10 Software plus hardware example

Typical register read pattern:

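gpio_set(CS_FLASH, 0);             /* assert CS (active low) to open the transaction */
spi_transfer(cmd, NULL, 1);        /* send 1 command byte, discard what comes back */
spi_transfer(addr, NULL, 3);       /* send 3 address bytes */
spi_transfer(NULL, data, len);     /* clock len more bytes and keep the received data */
gpio_set(CS_FLASH, 1);             /* deassert CS to close the transaction */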

The important idea is that the entire command, address, and readback often form one continuous transaction under a single CS assertion.

6.11 Interview-level understanding

Strong SPI answers mention:

synchronous clocked shifting
full-duplex behavior
CS framing
CPOL and CPHA
why SPI is fast and simple but poor for long off-board cables

7. I2C

7.1 What I2C is and why it is so common

I2C is a two-wire serial bus designed for communication between chips on the same board. It is common because it supports multiple devices with only two shared signals:

SCL clock
SDA data

Modern terminology often uses controller and target. Many older documents still use master and slave.

7.2 Why open-drain with pull-ups is the key idea

I2C lines are usually open-drain. Devices can pull the line low, but they do not actively drive it high. External pull-up resistors return the line to logic high when nobody is pulling low.

This choice solves a hard problem elegantly: multiple devices can share the same wires without directly shorting a driven high against a driven low.

That is what makes the bus safe for:

acknowledgments from targets
arbitration between multiple controllers
clock stretching by slower devices

The price is slower rising edges, because the line rises through the resistor and total bus capacitance.

flowchart LR
	VDD[VDD] --> RSDA[Pull-up]
	VDD --> RSCL[Pull-up]
	RSDA --> SDA((SDA))
	RSCL --> SCL((SCL))
	CTRL[Controller] --- SDA
	CTRL --- SCL
	T1[Target 1] --- SDA
	T1 --- SCL
	T2[Target 2] --- SDA
	T2 --- SCL

7.3 Start, stop, ACK, NACK, and repeated start

An I2C transaction is defined by line-state transitions, not just bytes.

START: SDA falls while SCL is high
STOP: SDA rises while SCL is high
each byte is followed by an ACK or NACK bit

The controller sends an address and a read or write direction bit. The target acknowledges if it recognizes the address and is ready.

A very common pattern is register read with repeated start:

sequenceDiagram
	participant C as Controller
	participant T as Target
	C->>T: START + address write
	T-->>C: ACK
	C->>T: register address
	T-->>C: ACK
	C->>T: REPEATED START + address read
	T-->>C: ACK
	T-->>C: data bytes
	C->>T: ACK for more or NACK for last
	C->>T: STOP

Repeated start matters because many targets interpret STOP as the end of the transaction and reset their internal state machine.

7.4 Addressing and the 7-bit versus 8-bit trap

One of the most common I2C mistakes is address confusion.

Many datasheets quote an address in a way that includes the direction bit or present write and read values separately. Firmware APIs usually expect the 7-bit address only.

If a target appears not to respond, always check whether the software API expects:

raw 7-bit address
shifted address format
explicit read or write bit handled separately
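
The conversion between the two conventions is a single shift. A minimal sketch, with helper names chosen here for illustration:

```python
def to_wire_bytes(addr7):
    """Return the (write, read) address bytes that appear on the wire."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("not a 7-bit address")
    return (addr7 << 1) | 0, (addr7 << 1) | 1


def to_7bit(addr8):
    """Strip the direction bit from an 8-bit datasheet address."""
    if not 0 <= addr8 <= 0xFF:
        raise ValueError("not an 8-bit value")
    return addr8 >> 1


write_byte, read_byte = to_wire_bytes(0x48)
print(hex(write_byte), hex(read_byte))  # 0x90 0x91
print(hex(to_7bit(0x90)))               # 0x48
```

If a datasheet says 0x90 and 0x91 while i2cdetect shows 0x48, both are describing the same device.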

7.5 Clock stretching and arbitration

Because the bus is open-drain, a device can hold SCL low. This is called clock stretching. It allows a slower target to delay the controller.

In a multi-controller system, arbitration works because a device that tries to release a line high but observes it low knows another device is dominating the bus. This is conceptually similar to CAN arbitration, though I2C is much less robust as a field bus.

Real-world caution: not every MCU controller handles clock stretching cleanly at all speeds, and some software stacks make assumptions that break when targets stretch aggressively.

7.6 Pull-up sizing and bus capacitance

Pull-up resistors define the rise time. If they are too weak (resistance too high):

rising edges are slow
noise margin shrinks
faster bus modes may fail to meet their rise-time limits

If they are too strong (resistance too low):

devices sink more current when pulling low
some parts may violate low-level current limits

This is a classic engineering tradeoff. The right value depends on supply voltage, target current capability, and total bus capacitance from traces, connectors, cables, and devices.
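
The tradeoff can be bounded numerically. The sketch below assumes the commonly quoted I2C figures of 0.4 V maximum low level at 3 mA sink current and a 300 ns maximum rise time for Fast-mode, with rise time measured from 30% to 70% of VDD. Treat those defaults as assumptions and check them against the datasheets of the actual parts.

```python
import math


def pullup_limits(vdd, bus_capacitance_f, rise_time_max_s,
                  vol_max=0.4, iol_a=0.003):
    """Return (r_min, r_max) in ohms for an I2C pull-up resistor."""
    # Strongest allowed pull-up: the sinking device must still reach vol_max.
    r_min = (vdd - vol_max) / iol_a
    # Weakest allowed pull-up: the RC rise from 30% to 70% of VDD
    # takes R * C * ln(0.7 / 0.3) and must meet the rise-time limit.
    r_max = rise_time_max_s / (math.log(0.7 / 0.3) * bus_capacitance_f)
    return r_min, r_max


r_min, r_max = pullup_limits(vdd=3.3, bus_capacitance_f=100e-12,
                             rise_time_max_s=300e-9)
print(f"pull-up between {r_min:.0f} ohm and {r_max:.0f} ohm")
```

If `r_min` ever exceeds `r_max`, no resistor value satisfies both limits and the bus needs less capacitance, a lower speed, or a buffer.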

7.7 Where I2C fits well

temperature, pressure, and environmental sensors
EEPROMs and configuration memories
RTCs
PMICs and battery chargers
board-management controllers

7.8 Where I2C fits poorly

long off-board cables
electrically noisy environments
high-throughput data streaming
systems where several identical devices have the same unchangeable address

7.9 Common mistakes engineers make

Forgetting pull-up resistors or assuming internal pull-ups are enough.
Using resistor values that are too weak for bus capacitance.
Mixing up 7-bit and 8-bit addresses.
Ignoring address conflicts among identical devices.
Assuming every target supports the same maximum clock rate.
Failing to recover a bus after a brownout leaves SDA stuck low.

7.10 Debugging I2C

Measure idle SCL and SDA; both should normally be high.
If either line is stuck low, identify which device is holding it.
Confirm pull-up resistor values and supply voltage.
Use i2cdetect or logic analyzer traces to confirm address activity.
If communication fails only at higher speeds, inspect rise time and capacitance.
If a target is wedged, try bus recovery by toggling SCL several times, then issuing STOP.

Useful Linux tooling:

i2cdetect -y 1
i2cget -y 1 0x48 0x00

7.11 Bus recovery in practice

If the controller resets or browns out in the middle of a transfer, a target can be left partway through a byte, holding SDA low while it waits for clocks that never arrive. A common recovery approach is:

configure SCL as GPIO output temporarily
toggle it up to nine times
check whether SDA releases
generate a clean STOP
reinitialize the I2C controller

This is the kind of detail that separates demo code from production firmware.
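
The control flow of that procedure can be sketched against a simulated target. `StuckTarget` is a hypothetical stand-in for real hardware, and the GPIO actions appear only as comments because they depend on the MCU.

```python
class StuckTarget:
    """Hypothetical target that holds SDA low until it has seen enough clocks."""

    def __init__(self, clocks_needed):
        self.clocks_needed = clocks_needed

    def clock_pulse(self):
        if self.clocks_needed > 0:
            self.clocks_needed -= 1

    def sda_released(self):
        return self.clocks_needed == 0


def recover_bus(target, max_pulses=9):
    """Pulse SCL until SDA releases; return the pulse count, or None on failure."""
    for pulses in range(max_pulses + 1):
        if target.sda_released():
            # On hardware: drive SDA low, raise SCL, then raise SDA to form
            # a STOP, and reinitialize the I2C peripheral afterwards.
            return pulses
        if pulses < max_pulses:
            target.clock_pulse()  # On hardware: SCL low, delay, SCL high, delay.
    return None


print(recover_bus(StuckTarget(3)))   # 3
print(recover_bus(StuckTarget(10)))  # None
```

Returning `None` instead of looping forever matters: if nine clocks do not free SDA, the fault is probably electrical or needs a power cycle of the target, and firmware should log it rather than hang.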

7.12 Interview-level understanding

Strong answers mention:

open-drain signaling with pull-ups
addressing and ACK/NACK
start and stop conditions
why I2C is good for board-level low-to-moderate speed devices
why capacitance and pull-up sizing matter

8. CAN

8.1 What CAN is and why it became dominant in vehicles and machines

CAN stands for Controller Area Network. It is a message-oriented multi-master differential bus designed for robust communication among many nodes in noisy environments.

CAN became successful because it solves several hard problems well at the same time:

many nodes share one bus
arbitration happens without corrupting the winning frame
errors are detected aggressively
faulty nodes can remove themselves from the bus through fault confinement

That combination is why CAN remains important in automotive, robotics, heavy equipment, and industrial control.

8.2 Why CAN arbitration works

CAN uses dominant and recessive bits. A dominant bit overwrites a recessive bit on the bus.

All nodes monitor the bus while transmitting. If a node sends recessive but reads dominant, it knows another node has higher priority and it stops transmitting.

The identifier therefore acts as both a message label and an arbitration priority. Identifiers are sent most significant bit first and dominant is logical 0, so the lowest numerical identifier wins: at the first bit where two identifiers differ, the lower one is driving dominant.

This is non-destructive arbitration. The winning message continues without being corrupted by the losing node.

flowchart TD
	START[Two nodes begin transmitting] --> MON[Each node drives bits and monitors the bus]
	MON --> CHECK{Sent recessive but read dominant?}
	CHECK -- Yes --> LOSE[Node loses arbitration and waits]
	CHECK -- No --> KEEP[Node continues transmitting]
	KEEP --> WIN[Lowest identifier wins without frame corruption]
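
The same process can be simulated bit by bit. This is a sketch that models dominant as 0 and recessive as 1 and assumes every node starts transmitting at the same instant with a distinct 11-bit identifier; the function name is made up for illustration.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins bitwise arbitration.

    The bus carries the AND of all bits driven in a given bit time,
    so any dominant 0 overrides recessive 1s.
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("need at least one identifier")
    if any(not 0 <= ident < (1 << width) for ident in contenders):
        raise ValueError("identifier does not fit in the given width")
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = min(sent.values())          # any dominant 0 wins the bit
        # Nodes that sent recessive but read dominant drop out.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    (winner,) = contenders
    return winner


print(hex(arbitrate([0x123, 0x120, 0x321])))  # 0x120
```

The winner never sees a bit it did not send, which is why its frame continues undamaged.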

8.3 CAN frame and error philosophy

You do not need every bit field memorized to think clearly about CAN, but you should understand the philosophy.

CAN includes:

identifier field
control bits
data payload
CRC
acknowledgment
end-of-frame structure

The protocol also uses bit stuffing and multiple forms of error checking so that nodes can detect corrupted traffic reliably.
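
Bit stuffing is easy to state and easy to get subtly wrong. The sketch below applies only the stuffing rule to an arbitrary bit list: after five consecutive identical bits, the transmitter inserts one bit of the opposite value. It does not model which frame fields are stuffed, and the function name is hypothetical.

```python
def stuff_bits(bits):
    """Insert a complementary bit after every run of five identical bits."""
    out = []
    run_bit, run_len = None, 0
    for bit in bits:
        out.append(bit)
        if bit == run_bit:
            run_len += 1
        else:
            run_bit, run_len = bit, 1
        if run_len == 5:
            stuffed = 1 - bit
            out.append(stuffed)
            # The stuff bit itself starts a new run.
            run_bit, run_len = stuffed, 1
    return out


print(stuff_bits([0, 0, 0, 0, 0, 0]))  # [0, 0, 0, 0, 0, 1, 0]
```

Stuffing guarantees regular edges for resynchronization, and a receiver that sees six identical bits in a stuffed region knows something is wrong.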

Important intuition: CAN is not just about moving bytes. It is about keeping the whole network synchronized and fault-aware.

8.4 Fault confinement and bus-off

Each CAN controller maintains a transmit error counter and a receive error counter. As these counters rise, a node moves from error-active to error-passive and, if transmit errors keep accumulating, to bus-off, where it stops driving the bus.

This is a major practical advantage. A broken node does not necessarily destroy the whole network forever. The system can identify that something is wrong and isolate the offender.

In real products, bus-off handling must be part of the firmware design. Decide whether the node should:

automatically attempt recovery
log the fault and wait for supervision
enter a safe state
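
The state a node is in follows from its two error counters. The thresholds below are the ones usually quoted for classical CAN fault confinement (error-passive above 127, bus-off when the transmit counter exceeds 255); treat them as assumptions and confirm them against the controller's reference manual.

```python
def fault_state(tec, rec):
    """Map transmit (tec) and receive (rec) error counters to a state."""
    if tec > 255:
        return "bus-off"
    if tec > 127 or rec > 127:
        return "error-passive"
    return "error-active"


print(fault_state(0, 0))     # error-active
print(fault_state(130, 0))   # error-passive
print(fault_state(256, 0))   # bus-off
```

Logging the counters periodically, not just the final bus-off event, shows whether a node is degrading slowly or failing suddenly.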

8.5 Physical design matters: termination and topology

CAN uses a differential bus and expects termination at both physical ends, typically 120 Ohm at each end to match the cable's characteristic impedance.

The main line should be a bus with short stubs. As with RS-485, star wiring usually causes trouble unless it is very carefully engineered and run at a low bit rate.

At higher speeds, stub length, connector quality, and transceiver selection matter a lot.

8.6 Classical CAN versus CAN FD

For interview and production awareness, know the difference:

Classical CAN carries up to 8 data bytes per frame.
CAN FD carries up to 64 data bytes per frame and can switch to a faster bit rate during the data phase.

The physical network and node compatibility must be considered carefully when mixing classical and FD-capable devices.

8.7 Typical use cases

engine, braking, body, and chassis networks in vehicles
battery management systems
industrial machines and mobile robots
heavy equipment and off-road vehicles
distributed control among multiple embedded nodes

8.8 Common mistakes engineers make

Forgetting termination or placing it at the wrong points.
Using long stubs.
Configuring the wrong nominal bitrate or sample point.
Confusing application-level message meaning with identifier priority.
Ignoring bus-off recovery strategy.
Assuming CAN is a general high-throughput bulk-data pipe.
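
Bitrate and sample point both fall out of the time-quantum arithmetic. A minimal sketch, assuming the segment values are given in actual time quanta (many controllers store each value minus one in their registers); the function and parameter names are illustrative.

```python
def can_bit_timing(clock_hz, prescaler, tseg1, tseg2):
    """Return (bitrate_hz, sample_point) for one CAN bit-timing setup.

    tseg1 covers the propagation and phase-1 segments, tseg2 is phase 2;
    the synchronization segment is always one time quantum.
    """
    quanta_per_bit = 1 + tseg1 + tseg2
    bitrate = clock_hz / (prescaler * quanta_per_bit)
    sample_point = (1 + tseg1) / quanta_per_bit
    return bitrate, sample_point


bitrate, sample_point = can_bit_timing(clock_hz=80_000_000, prescaler=10,
                                       tseg1=13, tseg2=2)
print(bitrate, sample_point)  # 500000.0 0.875
```

Two nodes can agree on 500 kbit/s and still fail intermittently if their sample points differ widely, so compare both numbers across every node on the bus.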

8.9 Debugging CAN

Confirm bit timing configuration and transceiver supply.
Check for exactly two terminations on the bus.
Measure the differential waveform and the recessive common-mode level.
Inspect controller error counters and bus-off state.
Use a CAN analyzer to confirm identifiers, ACK behavior, and error frames.
If frames transmit but are never acknowledged, suspect missing peer, wrong bitrate, or physical layer failure.
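
A quick way to check the termination count is to measure resistance between CAN_H and CAN_L with the bus unpowered and compare it with the expected parallel value. The sketch assumes 120 Ohm terminators and ignores transceiver input resistance.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


print(round(parallel(120, 120)))       # 60: two terminators, as intended
print(round(parallel(120)))            # 120: one terminator missing
print(round(parallel(120, 120, 120)))  # 40: an extra terminator fitted
```

A reading near 60 Ohm is the healthy case; readings near 120 Ohm or 40 Ohm point at a missing or extra terminator.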

Useful Linux tooling with SocketCAN:

ip link set can0 up type can bitrate 500000
candump can0
cansend can0 123#11223344

8.10 Software plus hardware example

Production firmware often uses acceptance filters to reduce CPU load. Instead of waking the application for every frame, the controller can admit only identifiers the node cares about.

That matters because CAN networks can be busy, and wasteful interrupt handling creates timing problems elsewhere.
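
Most acceptance filters reduce to a mask-and-compare. The sketch below shows that comparison in isolation; real controllers differ in how many filter banks they offer and how the registers are laid out, and the identifier values here are made up.

```python
def accepts(frame_id, filter_id, mask):
    """True when frame_id matches filter_id on every bit set in mask."""
    return (frame_id & mask) == (filter_id & mask)


# Accept the 16 identifiers 0x120..0x12F: compare only the upper 7 of 11 bits.
FILTER_ID, MASK = 0x120, 0x7F0
print(accepts(0x123, FILTER_ID, MASK))  # True
print(accepts(0x130, FILTER_ID, MASK))  # False
```

Grouping related messages into identifier ranges that share upper bits makes hardware filtering much more effective.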

8.11 Interview-level understanding

Strong answers mention:

differential multi-master bus
dominant and recessive arbitration
message identifiers acting as priority
CRC and fault confinement
why CAN is robust for noisy distributed systems

9. USB Basics

9.1 Why USB feels different from the other interfaces

USB is often taught badly because it is introduced as "a serial bus like the others." That is misleading.

USB is a host-driven protocol family with:

strict physical requirements
device discovery and enumeration
descriptors
standardized device classes
defined transfer types
power behavior

Compared with UART, SPI, or I2C, USB is much more like a complete ecosystem than a simple wire protocol.

9.2 The core architecture

USB basics start with one architectural rule: normal USB communication is host-centered.

The host initiates communication.
Devices respond.
Hubs expand connectivity.
Endpoints are the actual data sources and sinks inside a device.

This is why two plain USB devices do not simply talk to each other by plugging them together.

Also important: USB-C is a connector standard. It is not itself the USB protocol. A USB-C connector may carry USB 2.0, USB 3.x, power negotiation, alternate modes, or some subset depending on the design.

9.3 Enumeration step by step

Enumeration is the process by which the host discovers what a connected device is and how to talk to it.

sequenceDiagram
	participant H as Host
	participant D as Device
	H->>D: Detect attach and reset bus
	H->>D: Read initial device descriptor bytes
	H->>D: Assign device address
	H->>D: Read full descriptors
	H->>D: Select configuration
	D-->>H: Endpoints become active

If enumeration fails, the issue may be physical, electrical, descriptor-related, timing-related, or power-related.
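
The first descriptor the host reads is the 18-byte standard device descriptor. A sketch of decoding one, using the standard field names; the example bytes, including the vendor and product IDs, are made up for illustration.

```python
import struct
from collections import namedtuple

DeviceDescriptor = namedtuple(
    "DeviceDescriptor",
    "bLength bDescriptorType bcdUSB bDeviceClass bDeviceSubClass "
    "bDeviceProtocol bMaxPacketSize0 idVendor idProduct bcdDevice "
    "iManufacturer iProduct iSerialNumber bNumConfigurations",
)


def parse_device_descriptor(raw):
    """Decode the 18-byte standard device descriptor (little-endian fields)."""
    if len(raw) != 18:
        raise ValueError("device descriptor must be 18 bytes")
    desc = DeviceDescriptor(*struct.unpack("<BBHBBBBHHHBBBB", raw))
    if desc.bLength != 18 or desc.bDescriptorType != 0x01:
        raise ValueError("not a device descriptor")
    return desc


# Made-up example: USB 2.0, 64-byte endpoint 0, VID 0x1234, PID 0x5678.
raw = bytes([18, 1, 0x00, 0x02, 0, 0, 0, 64,
             0x34, 0x12, 0x78, 0x56, 0x00, 0x01, 1, 2, 3, 1])
desc = parse_device_descriptor(raw)
print(hex(desc.bcdUSB), hex(desc.idVendor), hex(desc.idProduct))
```

A wrong length, type, or endpoint 0 packet size in this structure is enough to stop enumeration before any class driver gets involved.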

9.4 Endpoints and transfer types

USB devices expose endpoints, which are logical data channels.

The main transfer types are:

Control: used for configuration and standard requests.
Bulk: reliable large data transfer, common for storage and bridges.
Interrupt: small low-latency transfers, common for HID-like behavior.
Isochronous: time-sensitive streaming with reserved bandwidth and bounded latency, but no retransmission of corrupted data.

These are not just software categories. They shape how the host scheduler treats traffic.
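
The transfer type and direction of each endpoint are encoded in two bytes of its endpoint descriptor. A minimal decoding sketch; the function name is illustrative and the two example byte pairs are made up.

```python
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")


def decode_endpoint(b_endpoint_address, bm_attributes):
    """Return (number, direction, transfer_type) for an endpoint descriptor."""
    number = b_endpoint_address & 0x0F                      # bits 3..0
    direction = "IN" if b_endpoint_address & 0x80 else "OUT"  # bit 7
    transfer_type = TRANSFER_TYPES[bm_attributes & 0x03]    # bits 1..0
    return number, direction, transfer_type


print(decode_endpoint(0x81, 0x02))  # (1, 'IN', 'bulk')
print(decode_endpoint(0x02, 0x03))  # (2, 'OUT', 'interrupt')
```

Direction is always named from the host's point of view: IN means device to host.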

9.5 Why USB is powerful but harder than UART

USB provides:

hot plug behavior
standard host support
bus power at useful levels
standard classes such as HID, CDC ACM, and mass storage

But the price is complexity:

descriptors must be correct
signal integrity matters
host expectations matter
timing during enumeration matters
OS drivers and class behavior matter

This is why many embedded products use a USB-to-UART bridge instead of implementing native USB when all they really need is a simple serial console.

9.6 Common production use cases

device firmware update
virtual COM port using CDC ACM
HID control surfaces and keyboards
USB flash drives and mass storage devices
test and service interfaces for embedded products

9.7 Common mistakes engineers make

Thinking USB is peer-to-peer by default.
Ignoring differential pair routing, impedance, and ESD protection.
Underestimating descriptor and enumeration complexity.
Assuming power from the port is automatically available at the desired level.
Confusing USB protocol generation with connector type.
Forgetting that cable quality can matter a lot in marginal designs.

9.8 Debugging USB basics

Confirm the device powers correctly at attach.
Check D+ and D- routing, connector wiring, and ESD components.
Inspect enumeration with lsusb, OS logs, or a protocol analyzer.
If one host works and another does not, compare power, hubs, cable quality, and host stack behavior.
Validate descriptors carefully.
For USB 2.0 device designs, verify the 1.5 kOhm attach pull-up (on D+ for full-speed, on D- for low-speed) and bus reset handling.

Useful host-side commands:

lsusb
dmesg | grep -i usb

9.9 Software plus hardware viewpoint

For an embedded USB device, success requires both layers to be right:

hardware: connector, ESD, routing, pull-up behavior, power tree
firmware: descriptors, endpoint configuration, class behavior, control request handling

A logic analyzer alone is rarely enough for USB debugging. Often you need:

host logs
USB protocol captures
descriptor inspection
oscilloscope checks on power and reset behavior

9.10 Interview-level understanding

Strong answers mention:

host-device architecture
enumeration and descriptors
endpoints and transfer types
the difference between USB protocol and connector form factor
why USB is powerful but significantly more complex than UART, SPI, or I2C

10. Tradeoffs and Real Engineering Decisions

10.1 Board-local versus cable-level interfaces

This is one of the most important practical distinctions.

Board-local favorites:

SPI for speed and simplicity with selected peripherals
I2C for shared low-to-moderate-speed management devices
UART for debug or simple modules

Cable-level favorites:

RS-485 for rugged serial links in industrial environments
CAN for multi-node robust control networks
USB for host interoperability and standardized peripherals
RS-232 for legacy interoperability

When engineers force a board-local bus into a cable environment, many "mysterious" bugs are really selection mistakes rather than implementation mistakes.

10.2 Example decisions

Scenario	Best first candidate	Why
Microcontroller to on-board IMU and temperature sensor	I2C	Two wires, addressed devices, moderate speed
Microcontroller to external high-speed ADC	SPI	Deterministic timing and higher throughput
Boot console for Linux SBC	UART	Minimal software stack and easy field access
Multi-drop energy meters over `100 m` cable	RS-485	Differential long-distance field wiring
Distributed vehicle nodes sharing status and commands	CAN	Arbitration plus error handling
Product needs to appear as a serial device to a laptop	USB CDC ACM or USB UART bridge	Standard host interoperability
Legacy lab instrument with DB9 port	RS-232	Native compatibility

10.3 A few concrete tradeoff examples

Example 1: I2C versus SPI for sensors

Choose I2C when pin count and bus sharing matter more than peak throughput.

Choose SPI when:

sample rate is high
latency must be predictable
bus capacitance or address conflicts make I2C awkward
device timing is sensitive

Example 2: RS-485 versus CAN for distributed control

Choose RS-485 when:

the application protocol is simple and under your control
one node can manage bus access or request-response timing cleanly
cost and simplicity matter more than built-in arbitration

Choose CAN when:

several nodes may need to talk asynchronously
you want robust error handling and network fault behavior
message priority and bounded arbitration matter

Example 3: Native USB versus USB UART bridge

Choose native USB when:

the product must look like a standard USB device class
bandwidth or class behavior matters
host interoperability is a product requirement

Choose a USB UART bridge when:

you only need a console or simple command channel
engineering time and firmware complexity must stay low
native USB adds more risk than value

11. Common Failure Patterns and How to Avoid Them

Protocol	Common failure pattern	Root cause	Prevention
UART	Random bad bytes	baud mismatch, noise, clock drift	accurate clocks, shorter runs, proper grounding
RS-232	No communication at all	wrong cable or no transceiver	verify levels, pinout, DTE versus DCE
RS-485	Works on bench, fails in field	bad topology or missing bias/termination	bus layout, correct resistors, protected transceivers
SPI	Readback is shifted or nonsense	wrong mode or `CS` timing	confirm `CPOL`, `CPHA`, and transaction framing
I2C	Device disappears or bus locks	pull-up issues, capacitance, stuck line	size pull-ups correctly, implement recovery
CAN	Bus-off or no ACK	wrong bitrate, bad termination, transceiver fault	validate timing and termination, inspect error counters
USB	Enumerates unreliably	power or descriptor or routing issue	validate descriptors, routing, ESD, and attach behavior

11.1 Production-oriented design habits

Document the exact voltage domain of every interface.
Document which layer is implemented where: controller, transceiver, protocol stack, application framing.
Add test points or accessible connectors for critical buses.
Design in failure recovery early, not after field failures appear.
Use protocol analyzers, not only firmware logs.
Validate the interface across temperature, cable length, and worst-case supply conditions.

12. Troubleshooting Playbook

12.1 The right order of attack

When a communication link fails, engineers often jump into software first because logs are easy to inspect. That is often the wrong order.

Start with the physical facts:

Are the right wires connected?
Are voltage levels compatible?
Is the line or bus in the expected idle state?
Are timing and framing configured correctly?
Is the peer device actually powered, booted, and ready?
Is higher-level software interpreting the traffic correctly?

12.2 General debugging flow

flowchart TD
	START[Communication failure observed] --> WIRING{Wiring and power correct?}
	WIRING -- No --> FIXPHY[Fix wiring power grounding or transceiver]
	WIRING -- Yes --> LEVELS{Idle levels and electrical behavior correct?}
	LEVELS -- No --> FIXELEC[Fix pull-ups termination biasing voltage or layout]
	LEVELS -- Yes --> CONFIG{Protocol configuration correct?}
	CONFIG -- No --> FIXCFG[Fix bitrate mode address parity or descriptors]
	CONFIG -- Yes --> TRAFFIC{Expected traffic visible on analyzer?}
	TRAFFIC -- No --> STATE[Check reset sequencing readiness and bus ownership]
	TRAFFIC -- Yes --> APP[Inspect higher-level packet format state machine and recovery]

12.3 Minimum useful toolset

Digital multimeter for supply, continuity, and idle-level checks.
Oscilloscope for edge quality, timing, voltage, and analog behavior.
Logic analyzer for UART, SPI, and I2C decoding.
USB analyzer or host logs for USB.
CAN interface or analyzer for CAN networks.
Known-good adapter cables and loopback fixtures for serial links.

12.4 A disciplined measurement mindset

Measure before changing too many variables.

Good questions during debugging:

What is the expected idle state of the interface?
Who is allowed to drive the line or bus at this moment?
Where does the transaction boundary occur?
Which event tells the receiver when to sample?
What happens when the peer resets halfway through a transaction?

These questions sound basic, but they are exactly what prevents wasted days.

13. Interview and Design-Review Level Questions

13.1 Questions you should be able to answer clearly

Why is UART called asynchronous, and how does the receiver know where to sample?
Why can multiple I2C devices safely share the same two wires?
What is the practical difference between UART, RS-232, and RS-485?
Why does CAN arbitration not corrupt the winning message?
Why is SPI fast but poor for long noisy cable links?
Why can USB not be treated like a simple byte stream between two equal peers?
Why do pull-up resistors matter so much on I2C?
Why do termination resistors matter on CAN and RS-485?

13.2 What strong answers usually include

first-principles explanation of signaling behavior
awareness of physical layer limits, not just register settings
ability to connect protocol choice to use case and topology
awareness of debugging tools and failure modes
distinction between logical protocol and electrical standard

14. Final Design Rules to Keep

Do not choose protocols by habit. Choose them by topology, environment, speed, and interoperability needs.
Keep board-level buses on the board unless you have a very good reason not to.
Treat off-board interfaces as EMC, ESD, grounding, and protection problems from day one.
Separate controller logic from transceiver logic clearly in both schematic and firmware design.
Design for visibility: test points, analyzable signals, and logging of recovery events.
Expect field failures to come from edge cases: startup order, marginal rise time, cable routing, ground offset, and overloaded software paths.
If an interface works only on the bench, assume the design is unfinished.

Communication protocols are not just a chapter in digital design. They are a recurring systems problem across firmware, hardware, test, manufacturing, and field service. The engineers who become reliable at them are the ones who learn to move smoothly between theory, waveforms, datasheets, firmware state machines, and real installation constraints.
