# Batteries and Charging Systems

This handbook is a practical reference for computer engineering students and working engineers who need more than battery buzzwords, nominal voltages, and one-line charger descriptions. The goal is to build battery intuition that holds up in real products: laptops that stop at 80% by design, phones that age early because they sit hot at full charge, wireless sensors that miss their runtime target by months, drones that brown out during current spikes, and packs that look healthy by open-circuit voltage but collapse the moment a real load is applied.

Battery systems sit at the boundary between chemistry, power electronics, thermal behavior, mechanical packaging, safety engineering, and firmware policy. If any one of those layers is misunderstood, the symptom usually appears somewhere else first: random resets, slow charging, inaccurate state-of-charge, unexpected shutdown at 25%, swollen cells, hot connectors, or field returns labeled only as "won't power on."

The material here is intentionally practical. It starts from first principles, then connects them to charger behavior, protection design, BMS architecture, current measurement, runtime estimation, debugging, and production tradeoffs.

## How to Use This Handbook

Read it in order the first time. Return to specific sections when you are designing or debugging.

- If you are new to battery-powered systems, start with the first-principles and Li-ion basics sections.
- If you are designing hardware, spend extra time on charging safety, voltage protection, and current draw.
- If you are debugging products, go straight to the BMS, troubleshooting, and runtime mismatch sections.
- If you are preparing for design reviews or interviews, use the tradeoff and interview-level sections near the end.

## Quick Reference

| Topic | Practical meaning | Why engineers care |
| --- | --- | --- |
| Cell | One electrochemical unit | Cell limits are the real limits; pack numbers can hide weak cells |
| Pack | One or more cells plus wiring, protection, and usually sensing | The pack behavior depends on interconnects, balancing, and control |
| Capacity (`Ah`, `mAh`) | Charge that can be delivered over a defined discharge condition | Useful, but incomplete without voltage, load, and temperature |
| Energy (`Wh`) | Capacity times voltage | Usually the best first estimate for runtime |
| `C`-rate | Current normalized to cell capacity | Connects cell size to stress, sag, heating, and charge time |
| SOC | State of charge | "How full is the battery now?" |
| SOH | State of health | "How much has the battery aged?" |
| CC/CV | Constant-current then constant-voltage charging | Standard Li-ion charging method |
| BMS | Battery management system | Protects, measures, estimates, balances, and communicates |
| UV/OV protection | Undervoltage and overvoltage protection | Prevents deep discharge damage and overcharge damage |

Five rules prevent many battery mistakes:

1. Treat a battery as a dynamic source, not an ideal voltage rail.
2. Use watt-hours for runtime reasoning whenever voltage conversion is involved.
3. Peak current determines whether the product survives; average current determines how long it runs.
4. Li-ion charging is controlled and conditional; it is not "just apply voltage."
5. The exact cell datasheet and charger/BMS datasheets overrule generic rules of thumb.

---

## 1. What a Battery System Really Is

### 1.1 A battery is a chemical energy system with an electrical interface

At the deepest level, a battery stores free energy in chemical form. It only becomes useful to electronics because the cell is built so that:

- electrons cannot easily move internally from one electrode to the other
- ions can move internally through the electrolyte and separator
- electrons can move externally through a load when a circuit is completed

That separation is the entire trick. If electrons could move freely inside the cell, the energy would be released internally as heat rather than through your system.

In a rechargeable battery, the chemistry is designed so the main reactions are substantially reversible inside a safe operating window. Charging pushes the chemistry backward; discharging lets it move forward.

### 1.2 Why batteries feel harder than bench supplies

A bench supply is usually designed to look like a stable voltage source over a broad operating range. A battery is not. A battery changes with:

- state of charge
- temperature
- age
- recent charge and discharge history
- instantaneous load current

That means the battery you designed around on the bench is not the same battery your product sees six months later in winter after repeated high-current pulses.

The simplest useful battery model is:

- open-circuit voltage `Voc`
- series internal resistance `Rint`

Under load:

- `Vload = Voc - I x Rint`
- internal heating is approximately `Ploss = I^2 x Rint`

This is simple, but it explains many real failures:

- a battery that reads "good" with no load but collapses under load
- a pack that works when warm but not when cold
- sudden shutdown during radio transmit or motor stall
- poor late-discharge performance as battery voltage falls and current rises

```mermaid
flowchart LR
Voc[Chemical potential / open-circuit voltage] --> Rint[Internal resistance]
Rint --> Load[System load]
Load --> Return[Return path]
Return --> Voc
Rint -. causes .-> Drop[Load sag]
Rint -. causes .-> Heat[Cell heating]
```

The model above is not the whole chemistry, but it is enough to explain most first-order electrical behavior.
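
As a rough illustration of the two-element model, the sketch below evaluates `Vload` and `Ploss` for one load current at two internal resistance values. The voltages, resistances, and the `CUTOFF_V` brownout threshold are assumed numbers chosen for illustration, not values from any datasheet.

```python
def loaded_voltage(voc_v: float, current_a: float, rint_ohm: float) -> float:
    """Terminal voltage under load: Vload = Voc - I x Rint."""
    return voc_v - current_a * rint_ohm


def internal_heat_w(current_a: float, rint_ohm: float) -> float:
    """Approximate internal heating: Ploss = I^2 x Rint."""
    return current_a ** 2 * rint_ohm


# Illustrative numbers only (assumptions, not datasheet values).
VOC_V = 3.8      # open-circuit voltage at some mid state of charge
LOAD_A = 2.0     # burst load current
CUTOFF_V = 3.3   # hypothetical system brownout threshold

for label, rint_ohm in (("fresh, warm cell", 0.05), ("aged, cold cell", 0.30)):
    v = loaded_voltage(VOC_V, LOAD_A, rint_ohm)
    p = internal_heat_w(LOAD_A, rint_ohm)
    status = "ok" if v >= CUTOFF_V else "below cutoff"
    print(f"{label}: Vload={v:.2f} V, heat={p:.2f} W, {status}")
```

With these assumed numbers, the same 2 A burst leaves margin on the low-resistance cell and pulls the high-resistance cell below the cutoff, even though both start from the same open-circuit voltage.
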

### 1.3 The charger, battery, protection, regulators, and firmware are one system

Battery design fails when teams treat the charger, cell, BMS, regulators, and firmware as separate boxes.

They are one control system with energy flowing through it.

```mermaid
flowchart LR
Source[USB / adapter / dock / external supply] --> Charger[Charger or power-path IC]
Charger --> Cell[Li-ion cell or pack]
Cell --> Protect[Protection FETs / BMS]
Protect --> Rails[DC/DC converters and LDOs]
Rails --> Loads[CPU, radio, display, SSD, sensors, motors]
Loads --> Sense[Voltage, current, and temperature telemetry]
Sense --> Firmware[Fuel gauge, EC, MCU, OS policy]
Firmware --> Charger
Firmware --> Loads
```

Examples of cross-layer effects:

- Firmware enables a high-power radio burst. The battery current spikes. The weakest cell droops. The BMS trips undervoltage. The bug looks like a software reset.
- The charger is sized correctly, but the system load remains active during charging. Charge current never tapers low enough, so the charger appears not to terminate.
- The hardware is safe, but the host estimates runtime from voltage alone. Users see 30% battery, then a sudden shutdown.

Professional rule: always ask where energy flows, where heat is generated, who enforces limits, and who reports them.

---

## 2. Li-ion Basics

### 2.1 What "Li-ion" actually means

"Lithium-ion" is not one single chemistry. It is a family of rechargeable chemistries in which lithium ions move between host materials during charge and discharge.

The common structure is:

- anode, often graphite in many consumer cells
- cathode, often a lithium metal oxide or phosphate material
- separator, which physically prevents electrode contact
- electrolyte, which allows ion transport
- current collectors, tabs, and packaging

During discharge in a typical graphite-based Li-ion cell:

1. lithium ions move from the anode toward the cathode through the electrolyte
2. electrons flow through the external circuit and power the load
3. chemical energy becomes electrical energy plus heat

During charging, the process is reversed.

The key intuition is that the cell is not a bucket of electrons. It is a controlled chemical machine whose electrical behavior depends on how quickly ions and electrons can move without damaging the materials.

### 2.2 Why Li-ion became dominant

Li-ion dominates modern portable and many transportation systems because it offers a strong combination of:

- high energy density
- reasonable cycle life
- high voltage per cell
- low self-discharge relative to many older rechargeable chemistries
- practical manufacturability across many form factors

That does not mean it is easy to use. Li-ion is powerful because it stores a lot of energy in a small volume. That same fact is why safety margins, charging discipline, and thermal control matter so much.

### 2.3 Cell voltage, nominal voltage, and why none of them are the whole story

Engineers casually say things like "a Li-ion cell is 3.7 V." That is only a shorthand.

In reality:

- open-circuit voltage changes with state of charge
- loaded voltage also includes `I x R` sag
- the exact full-charge voltage depends on chemistry and cell design
- the recommended discharge cutoff depends on chemistry, load, and aging goals

Typical numbers you will often see:

- many cobalt-based consumer cells: nominal around `3.6 V` to `3.7 V`, full charge around `4.2 V`
- some high-voltage variants: up to about `4.35 V` full charge
- LiFePO4: nominal around `3.2 V`, full charge around `3.6 V` to `3.65 V`

Do not design from memory when the exact cell part number is known. Use the actual datasheet.

### 2.4 Capacity, energy, and why `mAh` is not enough

Capacity in `Ah` tells you how much charge the cell can deliver under a specified condition. Energy in `Wh` tells you how much work the battery can do.

The relationship is:

- `Energy_Wh ~= Capacity_Ah x Nominal_Voltage`

Example:

- `3.0 Ah` cell at `3.7 V` nominal is about `11.1 Wh`

Why `mAh` alone is misleading:

- a `3000 mAh` single-cell pack and a `3000 mAh` three-cell series pack do not store the same energy
- if your system uses a boost converter or buck converter, the battery current is not the same as the load current
- low-temperature and aged cells can deliver much less usable capacity than the label suggests

Strong engineering habit: compare batteries in `Wh`, then apply real derating.
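
The watt-hour reasoning above can be sketched in a few lines. This is a first-order estimate under stated assumptions: a fixed converter efficiency, a constant load, and a single `derating` factor standing in for temperature, aging, and cutoff losses. The function names and the example numbers (a 2.5 W load, 90% efficiency, 0.8 derating) are hypothetical.

```python
def energy_wh(capacity_ah: float, nominal_v: float) -> float:
    """Energy_Wh ~= Capacity_Ah x Nominal_Voltage."""
    return capacity_ah * nominal_v


def battery_current_a(load_power_w: float, battery_v: float,
                      converter_efficiency: float) -> float:
    """Battery-side current needed to feed a load through a DC/DC converter."""
    return load_power_w / (battery_v * converter_efficiency)


def runtime_h(pack_energy_wh: float, load_power_w: float,
              converter_efficiency: float, derating: float) -> float:
    """First-order runtime: usable energy delivered to the load / load power."""
    usable_wh = pack_energy_wh * derating
    return usable_wh * converter_efficiency / load_power_w


pack_wh = energy_wh(3.0, 3.7)                      # the 3.0 Ah, 3.7 V example
i_batt = battery_current_a(2.5, 3.7, 0.9)          # 5 V x 0.5 A load via boost
hours = runtime_h(pack_wh, 2.5, 0.9, derating=0.8)
print(f"{pack_wh:.1f} Wh, battery current {i_batt:.2f} A, about {hours:.1f} h")
```

Note that the battery-side current (about 0.75 A here) is higher than the 0.5 A load current because the converter steps the voltage up, which is why `mAh` arithmetic on the load current alone overestimates runtime.
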

### 2.5 `C`-rate: the bridge between cell size and stress

`C`-rate normalizes current to capacity.

- `1C` means a current equal to the cell capacity in ampere-hours
- for a `3 Ah` cell, `1C = 3 A`
- `0.5C = 1.5 A`
- `2C = 6 A`

Why `C`-rate matters:

- high discharge `C` increases voltage sag and heating
- high charge `C` reduces charge time but raises thermal and aging stress
- the same current is mild for a large cell and severe for a small cell

Power-tool cells, phone cells, EV cells, and small wearable cells may all be Li-ion, but their acceptable `C`-rates can differ dramatically.
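
A minimal sketch of the `C`-rate conversion, showing why the same absolute current means very different stress for different cell sizes. The `0.3 Ah` wearable cell is an assumed example size.

```python
def c_rate_to_current_a(c_rate: float, capacity_ah: float) -> float:
    """Current in amps for a given C-rate and cell capacity."""
    return c_rate * capacity_ah


def current_to_c_rate(current_a: float, capacity_ah: float) -> float:
    """C-rate that a given current represents for a given cell capacity."""
    return current_a / capacity_ah


# The 3 Ah examples from the list above.
print(c_rate_to_current_a(0.5, 3.0))   # 0.5C on a 3 Ah cell
print(c_rate_to_current_a(2.0, 3.0))   # 2C on a 3 Ah cell

# The same 3 A load on two different cells (0.3 Ah is an assumed wearable cell).
for capacity_ah in (3.0, 0.3):
    print(f"3 A on {capacity_ah} Ah cell = {current_to_c_rate(3.0, capacity_ah):.0f}C")
```
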

### 2.6 Internal resistance is where many "mystery" failures start

Internal resistance is not just a datasheet number. It is a major design variable.

When current rises:

- terminal voltage drops by approximately `I x Rint`
- internal heat rises approximately as `I^2 x Rint`

As a cell ages or gets cold, internal resistance usually rises. That means the exact same workload causes more droop and more heat.

This is why old batteries often show this pattern:

- they appear to charge normally
- open-circuit voltage looks acceptable
- runtime is poor under real workloads
- shutdown happens at a higher indicated percentage than expected

The battery is not simply "smaller." It is also weaker as a power source.

### 2.7 State of charge is not linearly encoded in voltage

Many engineers initially assume battery voltage maps cleanly to remaining capacity. That is only partly true.

Why the mapping is hard:

- the voltage curve is chemistry-dependent
- the curve can be flat over a wide SOC region
- load current changes the observed terminal voltage
- temperature shifts the curve
- recent charge/discharge history creates hysteresis and relaxation effects

Open-circuit voltage after rest can be useful. Loaded voltage during a burst is much less reliable as a direct SOC estimate.

This is why serious systems use fuel-gauge models, coulomb counting, or both rather than a simple lookup table from instantaneous voltage.

### 2.8 Series and parallel pack behavior

Series cells increase voltage. Parallel cells increase capacity and current capability.

- `Ns` in series: voltage adds, ampere-hour capacity does not
- `Np` in parallel: ampere-hour capacity adds, voltage does not

Example:

- `2s1p` of `3.7 V`, `3 Ah` cells is roughly `7.4 V`, `3 Ah`
- `1s2p` is roughly `3.7 V`, `6 Ah`
- both are roughly `22.2 Wh` before derating

Important practical differences:

- series packs need per-cell monitoring and usually balancing
- parallel groups need well-matched cells and careful interconnect design
- a pack can have a healthy total voltage while one individual series cell is already unsafe
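
The `Ns`/`Np` arithmetic in this section can be written as a small helper. This is a sketch that assumes identical cells and ignores interconnect losses and derating; `PackSpec` and `pack_from_cells` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackSpec:
    nominal_v: float
    capacity_ah: float

    @property
    def energy_wh(self) -> float:
        return self.nominal_v * self.capacity_ah


def pack_from_cells(ns: int, np_: int, cell_v: float, cell_ah: float) -> PackSpec:
    """Series count multiplies voltage; parallel count multiplies capacity."""
    return PackSpec(nominal_v=ns * cell_v, capacity_ah=np_ * cell_ah)


for name, ns, np_ in (("2s1p", 2, 1), ("1s2p", 1, 2)):
    pack = pack_from_cells(ns, np_, cell_v=3.7, cell_ah=3.0)
    print(f"{name}: {pack.nominal_v:.1f} V, {pack.capacity_ah:.1f} Ah, "
          f"{pack.energy_wh:.1f} Wh")
```

Both configurations report the same energy, which is the point: `Ah` differs by a factor of two while `Wh` does not.
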

### 2.9 Chemistry choice is a system tradeoff

| Chemistry family | Typical strength | Typical weakness | Common use case |
| --- | --- | --- | --- |
| NMC / NCA style cells | High energy density | More demanding thermal and safety management | Laptops, EVs, power-dense portable systems |
| LiFePO4 | Better thermal stability, long life, flatter discharge curve | Lower energy density, lower cell voltage | Industrial systems, energy storage, some vehicles |
| High-power cylindrical cells | Strong pulse current capability | Often less total energy for the same volume | Tools, drones, robotics |

There is no universally best chemistry. The right question is: what failure is most expensive in this product?

- If volume matters most, energy density dominates.
- If cycle life and thermal stability matter most, LFP becomes attractive.
- If pulse power matters most, internal resistance and high-rate behavior dominate.

### 2.10 Temperature and aging are not side topics

Battery behavior is strongly temperature-dependent.

Low temperature usually causes:

- lower usable capacity
- higher internal resistance
- worse pulse performance
- reduced safe charge acceptance

High temperature usually causes:

- faster aging
- more side reactions
- more gas generation and swelling risk
- faster loss of cycle life and calendar life

Two aging modes matter in practice:

- calendar aging: time, especially at high temperature and high SOC
- cycle aging: repeated charge/discharge, especially with high depth of discharge and high current stress

That is why products often intentionally limit top-of-charge or charge rate. The goal is not merely safety; it is lifespan.

### 2.11 Common Li-ion mistakes

- Treating voltage as a direct linear fuel gauge.
- Using `mAh` instead of `Wh` for runtime tradeoffs.
- Assuming all Li-ion cells are `4.2 V` full-charge cells.
- Ignoring internal resistance and only looking at nominal capacity.
- Designing around room-temperature behavior only.
- Believing an old battery failure is always a pure capacity loss rather than a power-delivery problem.

---

## 3. Charging Li-ion Safely

### 3.1 Charging is controlled reversal of chemistry

Charging is not simply "pushing current into a battery." A charger must move the chemistry back toward the charged state while staying inside voltage, current, and temperature limits.

If charging is too aggressive, the cell may:

- plate lithium instead of intercalating it correctly
- overheat
- generate gas
- age rapidly
- become unsafe

This is why Li-ion charging is governed by algorithm, measurement, and protection.

### 3.2 The standard Li-ion charging flow: CC/CV

Most Li-ion charging uses constant current followed by constant voltage.

```mermaid
flowchart TD
A[Power source present] --> B{Cell voltage and temperature valid?}
B -->|No| X[Do not charge / fault handling]
B -->|Yes| C{Cell deeply discharged but recoverable?}
C -->|Yes| D[Precharge at low current]
C -->|No| E[Constant-current phase]
D --> E
E --> F{Cell reaches charge voltage limit?}
F -->|No| E
F -->|Yes| G[Constant-voltage phase]
G --> H{Charge current falls below taper threshold?}
H -->|No| G
H -->|Yes| I[Terminate charge]
I --> J[Wait for recharge threshold]
```

Step by step:

1. Qualification: the charger verifies source presence, battery presence, acceptable battery voltage, and acceptable temperature.
2. Precharge: if the cell is deeply discharged but still considered recoverable, the charger applies a small current.
3. Constant current: the charger drives the programmed charge current while cell voltage rises.
4. Constant voltage: once the cell reaches the charge voltage limit, the charger holds voltage constant and current naturally tapers down.
5. Termination: when current falls below a chosen threshold, charging stops.
6. Recharge policy: if the cell later falls below a recharge threshold, charging may resume.
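
The flow above can be sketched as a small phase-transition function called once per control tick. This is an illustrative sketch, not a charger implementation: every threshold below is a hypothetical value for a single `4.2 V` cell, and a real design would take them from the cell and charger datasheets and add safety timers, an unrecoverable-cell check, and independent protection.

```python
from enum import Enum, auto


class Phase(Enum):
    FAULT = auto()
    PRECHARGE = auto()
    CONSTANT_CURRENT = auto()
    CONSTANT_VOLTAGE = auto()
    DONE = auto()


# Hypothetical thresholds (assumptions for illustration only).
V_PRECHARGE = 3.0       # below this, use low precharge current
V_CHARGE_LIMIT = 4.2    # CC -> CV transition
V_RECHARGE = 4.05       # resume charging if a finished cell drops below this
I_TAPER_A = 0.15        # CV termination current
T_MIN_C, T_MAX_C = 0.0, 45.0


def next_phase(phase: Phase, cell_v: float, charge_a: float, temp_c: float) -> Phase:
    """Return the next charging phase from the current phase and measurements."""
    # Qualification: temperature window and a margin above the voltage limit.
    if not (T_MIN_C <= temp_c <= T_MAX_C) or cell_v > V_CHARGE_LIMIT + 0.05:
        return Phase.FAULT
    if phase is Phase.DONE:
        # Recharge policy: stay terminated until the cell relaxes far enough.
        return Phase.CONSTANT_CURRENT if cell_v < V_RECHARGE else Phase.DONE
    if cell_v < V_PRECHARGE:
        return Phase.PRECHARGE
    if phase is Phase.CONSTANT_VOLTAGE:
        # Termination: stop once the taper current is small enough.
        return Phase.DONE if charge_a <= I_TAPER_A else Phase.CONSTANT_VOLTAGE
    # From FAULT, PRECHARGE, or CONSTANT_CURRENT.
    if cell_v >= V_CHARGE_LIMIT:
        return Phase.CONSTANT_VOLTAGE
    return Phase.CONSTANT_CURRENT
```

Keeping the transition logic as a pure function of phase and measurements makes it easy to replay logged voltage, current, and temperature traces through it when debugging.
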

### 3.3 Why the phases exist

Precharge exists because deeply depleted cells are more fragile. A lower current lets the cell recover more gently and helps determine whether it is behaving normally.

Constant current exists because early in charge the cell can accept current efficiently while staying below the voltage limit.

Constant voltage exists because once the cell reaches its upper voltage limit, pushing the same current would exceed the safe cell voltage. The charger must then hold voltage and allow current to decay.

Termination exists because Li-ion is not normally float-charged the way some older chemistries can be. Remaining indefinitely at the top voltage with continuous trickle is bad for the cell and can be unsafe.

### 3.4 Why Li-ion is not trickle-charged like older chemistries

For many Li-ion cells, indefinite trickle charge is not appropriate. Holding the cell at its upper limit for long periods increases aging stress, and uncontrolled top-off behavior can create safety problems.

Many systems do one of the following instead:

- terminate charge and only restart after a defined voltage drop
- deliberately stop below 100% for life extension
- use user-selectable modes such as 80% or 90% max charge

This is common in laptops, fleet devices, and long-life embedded products.

### 3.5 Source limitations matter as much as cell limitations

A charger does not operate in isolation. It is constrained by the input source.

Real charging sources include:

- USB ports with negotiated current or power limits
- wall adapters with droop and thermal behavior
- docking connectors with contact resistance
- vehicle power with noise and transients
- solar inputs with varying available power

A product can have a perfectly valid charger IC and still charge poorly because:

- the source current limit is too low
- cable resistance causes input droop
- the source negotiates less power than expected
- the charger thermally throttles
- system load consumes most of the incoming power

Production scenario: a device connected to a weak adapter may appear to "charge slowly" when the real issue is that the system load plus charge current exceeds input capability. The charger then falls back to input current limiting.
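
The production scenario can be checked with a simple power budget. The sketch below assumes a power-path style architecture in which the system load is served from the input first and the battery receives whatever is left, with a single assumed charger efficiency. The function names and all numbers are hypothetical.

```python
def available_charge_power_w(input_limit_w: float, system_load_w: float,
                             charger_efficiency: float) -> float:
    """Power left for the battery after the system load is served from the input."""
    headroom_w = max(0.0, input_limit_w - system_load_w)
    return headroom_w * charger_efficiency


def charge_current_a(input_limit_w: float, system_load_w: float,
                     battery_v: float, programmed_a: float,
                     charger_efficiency: float = 0.9) -> float:
    """Actual charge current: the programmed value, capped by the input budget."""
    budget_a = available_charge_power_w(
        input_limit_w, system_load_w, charger_efficiency) / battery_v
    return min(programmed_a, budget_a)


# Same device (1.5 W system load, 1.0 A programmed charge) on two sources.
for source_name, input_w in (("5 V / 0.5 A port", 2.5), ("15 W adapter", 15.0)):
    i_chg = charge_current_a(input_w, 1.5, battery_v=3.8, programmed_a=1.0)
    print(f"{source_name}: charge current about {i_chg:.2f} A")
```

With these assumed numbers the weak source delivers roughly a quarter of the programmed charge current, so the device "charges slowly" even though the charger IC is working as designed.
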

### 3.6 Safety is layered, not singular

Safe charging usually relies on multiple layers:

- correct cell selection and cell datasheet limits
- charger IC voltage and current regulation
- temperature monitoring, often with NTC thermistors
- independent battery protector or BMS
- fuse, thermal fuse, CID, PTC, or pack-level protective hardware depending on product class
- firmware timeouts, telemetry checks, and event logging
- mechanical design that manages heat and damage containment

If your design depends on one comparator or one line of firmware to prevent unsafe charging, the design is weak.

### 3.7 Temperature-aware charging

Temperature is a first-class charging variable, not a nice-to-have sensor.

Why:

- cold charging can cause lithium plating because the anode cannot accept ions fast enough
- hot charging accelerates side reactions and aging
- a pack may be safe to discharge at a temperature where it is not safe to charge

Many products implement temperature-based charge derating or JEITA-style windows:

- charge normally in the ideal range
- reduce current or voltage in warm or cool ranges
- block charging in extreme cold or heat

Implementation detail: place the temperature sensor where it represents the cell, not the coolest PCB corner.
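
A windowed derating policy of this kind can be sketched as a lookup from cell temperature to allowed charge current and voltage. The temperature boundaries, the 50% current reduction, and the 0.1 V voltage reduction below are placeholder assumptions for illustration; real windows come from the cell datasheet and whichever standard the product follows.

```python
from typing import NamedTuple


class ChargeLimits(NamedTuple):
    current_a: float   # 0.0 means charging is blocked
    voltage_v: float


def charge_limits(temp_c: float, nominal_a: float, nominal_v: float) -> ChargeLimits:
    """Map cell temperature to charge limits using assumed window boundaries."""
    if temp_c < 0.0 or temp_c >= 60.0:
        return ChargeLimits(0.0, 0.0)                      # extreme: block charging
    if temp_c < 10.0:
        return ChargeLimits(nominal_a * 0.5, nominal_v)    # cool: reduce current
    if temp_c < 45.0:
        return ChargeLimits(nominal_a, nominal_v)          # ideal range
    return ChargeLimits(nominal_a * 0.5, nominal_v - 0.1)  # warm: reduce both


for temp_c in (-5.0, 5.0, 25.0, 50.0, 65.0):
    limits = charge_limits(temp_c, nominal_a=1.0, nominal_v=4.2)
    print(f"{temp_c:5.1f} C -> {limits.current_a:.2f} A, {limits.voltage_v:.2f} V")
```
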

### 3.8 Charger implementation details engineers often miss

- Charge termination current must be chosen relative to cell size and system load.
- A charger connected directly to a system-plus-battery node can misread taper current if the system is still drawing power.
- Sense resistor accuracy, PCB resistance, and connector drop influence real current regulation.
- Charger thermal pad layout and copper area strongly affect thermal throttling.
- Input and battery decoupling placement changes stability and transient behavior.

Common professional pattern: use a power-path charger when the product must run from the adapter while charging the battery accurately.

### 3.9 Charging failure cases and what they usually mean

| Symptom | Common real cause |
| --- | --- |
| Charges very slowly | Input current limit, cable drop, thermal throttling, or large concurrent system load |
| Reaches voltage but never completes | Termination threshold too low, system load masking taper current, or gauge misreporting |
| Refuses to charge when cold | Correct safety behavior or overly conservative temperature sensing |
| Gets hot near full charge | CV phase is long, cell is aged, thermal path is poor, or charge current is too high |
| Charges, then quickly drops after unplug | Gauge error, aged cell with high resistance, or strong relaxation effect |

### 3.10 Charging debugging workflow

1. Verify exact cell and charger datasheet limits.
2. Measure input voltage at the charger pins, not only at the adapter.
3. Log battery voltage, charge current, temperature, and charger status registers over time.
4. Separate system load from battery charge current if the architecture allows it.
5. Check whether the charger is in input current limit, thermal regulation, CC mode, or CV mode.
6. Compare measured taper current and termination threshold.
7. If cold or hot behavior looks strange, verify the NTC network and its placement.

### 3.11 Common charging mistakes

- Charging a Li-ion cell directly from a fixed voltage source without proper charge control.
- Ignoring cell temperature during charge.
- Assuming one universal full-charge voltage fits all Li-ion cells.
- Expecting safe trickle charging.
- Setting charge current based only on desired charge time rather than source, thermal, and cell limits.
- Forgetting that concurrent system load can hide true battery current.

---

## 4. Battery Management Systems

### 4.1 What a BMS is and what it is not

In casual conversation, people call many things a BMS.

In practice, these are different layers:

- charger: controls how energy enters the battery
- protector: disconnects the battery on unsafe conditions
- fuel gauge: estimates SOC, SOH, time-to-empty, and related metrics
- BMS: a broader system that measures, protects, estimates, balances, logs, and often communicates

In a simple `1s` consumer product, the "BMS" may be just a charger plus a small protection IC plus a gauge. In a larger multi-cell pack, the BMS is a real subsystem with cell monitors, current sensing, balancing, firmware, and communication to the host.

### 4.2 Why BMS exists

A battery pack is not safe or useful enough if it only stores energy. A serious product also needs to know:

- are any cells overvoltage or undervoltage?
- is current too high?
- is temperature acceptable?
- are cells drifting apart?
- how full is the pack really?
- how much has it aged?
- should charging or discharging be blocked right now?

That is the BMS job.

### 4.3 Typical BMS architecture

```mermaid
flowchart LR
Cells[Series cells or parallel groups] --> Taps[Cell tap sense lines]
Taps --> AFE[Battery monitor AFE]
Shunt[Current shunt] --> AFE
NTC[NTC thermistors] --> AFE
AFE --> MCU[Pack MCU or gauge controller]
MCU --> Balance[Balancing circuits]
MCU --> FETs[Charge and discharge FETs / contactors]
MCU --> Comms[SMBus / I2C / CAN / UART]
Comms --> Host[Host MCU / EC / OS]
Host --> Policy[Charge limits, power modes, logging]
Policy --> MCU
```

Core BMS building blocks:

- cell voltage measurement
- current measurement, usually via shunt
- temperature measurement
- protection logic
- charge/discharge switches
- balancing circuitry for series packs
- estimation algorithms for SOC and SOH
- nonvolatile fault and usage logging in many systems
- communications interface to the host

### 4.4 Protection versus management versus estimation

A strong engineer keeps these functions separate in their head.

Protection answers: must I stop now to avoid unsafe operation?

Management answers: what is the permitted operating range right now, and how should the system behave?

Estimation answers: what do I believe the battery state is, given measurement noise, model uncertainty, and history?

Mixing them conceptually causes bad designs. For example, an SOC estimate should not be trusted as a safety mechanism. Safety cutoffs should be enforced from measured voltage, current, and temperature limits, with independent logic where required.

### 4.5 Fuel gauging: why it is harder than it looks

Fuel gauging tries to answer user-facing questions like:

- how much charge remains?
- how much runtime remains at the current load?
- how healthy is the pack relative to new?

Common techniques:

- voltage-based estimation: simple but weak under load and temperature variation
- coulomb counting: integrates current over time, good for tracking but drifts without calibration
- model-based estimation: combines current, voltage, temperature, and battery models for better accuracy

Why coulomb counting alone is not enough:

- offset errors accumulate
- true usable capacity changes with age
- unknown initial SOC causes error

Why voltage alone is not enough:

- voltage under load includes sag
- some chemistries have flat OCV curves over wide SOC ranges

Strong systems combine methods and periodically realign estimates.
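
One way to picture "combine methods and periodically realign" is a coulomb counter that is corrected from a rested open-circuit voltage reading. The sketch below is a deliberately simple illustration: the `OCV_TABLE` points are made up, the class and function names are hypothetical, and production gauges use temperature- and age-aware models rather than a single four-point table.

```python
# Made-up (voltage, SOC) points, sorted by voltage. Not from any datasheet.
OCV_TABLE = [(3.0, 0.0), (3.6, 0.2), (3.8, 0.6), (4.2, 1.0)]


def soc_from_ocv(ocv_v: float, table=OCV_TABLE) -> float:
    """Linearly interpolate SOC from a rested open-circuit voltage."""
    if ocv_v <= table[0][0]:
        return table[0][1]
    if ocv_v >= table[-1][0]:
        return table[-1][1]
    for (v0, s0), (v1, s1) in zip(table, table[1:]):
        if v0 <= ocv_v <= v1:
            return s0 + (s1 - s0) * (ocv_v - v0) / (v1 - v0)
    raise ValueError("table must be sorted by voltage")


class CoulombCounter:
    def __init__(self, capacity_ah: float, initial_soc: float) -> None:
        self.capacity_ah = capacity_ah
        self.soc = initial_soc  # 0.0 .. 1.0

    def update(self, current_a: float, dt_s: float) -> None:
        """Integrate current over dt_s seconds; positive current means charging."""
        self.soc += (current_a * dt_s / 3600.0) / self.capacity_ah
        self.soc = min(1.0, max(0.0, self.soc))

    def realign(self, rested_ocv_v: float, weight: float = 1.0) -> None:
        """Pull the estimate toward the OCV-derived SOC after the cell has rested."""
        self.soc += weight * (soc_from_ocv(rested_ocv_v) - self.soc)


gauge = CoulombCounter(capacity_ah=3.0, initial_soc=0.5)
gauge.update(current_a=-1.5, dt_s=1800)   # 30 minutes of 1.5 A discharge
gauge.realign(rested_ocv_v=3.7)           # correction after a rest period
print(f"SOC estimate: {gauge.soc:.2f}")
```

The `weight` parameter is one possible design choice: a value below 1.0 blends the two estimates instead of overwriting the counter, which matters on chemistries where the OCV curve is flat and the voltage-derived SOC is itself uncertain.
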

### 4.6 Cell balancing

Balancing matters mainly for series-connected cells.

Why balancing is needed:

- cells are never perfectly identical
- capacity, leakage, and internal resistance differ
- small differences accumulate over many cycles

Without balancing, the weakest cell reaches full or empty first. That cell then limits the usable pack capacity and can hit unsafe limits while the pack-level voltage still looks acceptable.

Two broad balancing approaches:

- passive balancing: bleed excess charge from higher cells through resistors
- active balancing: move energy between cells using more complex circuitry

Passive balancing is common because it is simpler and cheaper. Active balancing is used when efficiency, pack size, or imbalance severity justify the complexity.

Implementation detail: balance current is usually small compared with load current, so balancing cannot fix badly mismatched cells quickly. It is a trimming tool, not a miracle cure.
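
A back-of-the-envelope estimate shows why passive balancing is a trimming tool. The sketch assumes a fixed bleed resistor across the cell, a roughly constant cell voltage while bleeding, and an optional duty cycle; the 82 ohm resistor and the 5% imbalance on a 3 Ah cell are assumed example values.

```python
def bleed_current_a(cell_v: float, bleed_resistor_ohm: float, duty: float = 1.0) -> float:
    """Average passive-balancing current through the bleed resistor."""
    return cell_v / bleed_resistor_ohm * duty


def balance_time_h(imbalance_ah: float, cell_v: float,
                   bleed_resistor_ohm: float, duty: float = 1.0) -> float:
    """Hours needed to bleed off a given charge imbalance."""
    return imbalance_ah / bleed_current_a(cell_v, bleed_resistor_ohm, duty)


def bleed_power_w(cell_v: float, bleed_resistor_ohm: float) -> float:
    """Heat dissipated in the bleed resistor while it is switched on."""
    return cell_v ** 2 / bleed_resistor_ohm


imbalance_ah = 0.05 * 3.0   # 5% of a 3 Ah cell
print(f"bleed current: {bleed_current_a(4.1, 82.0) * 1000:.0f} mA")
print(f"time to balance: {balance_time_h(imbalance_ah, 4.1, 82.0):.1f} h")
print(f"resistor heat: {bleed_power_w(4.1, 82.0):.2f} W")
```

With these assumptions, a 50 mA bleed needs about three hours to remove a 5% imbalance, while the pack itself may be charged or discharged at amps. Raising the bleed current shortens the time but turns directly into resistor heat inside the pack.
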

### 4.7 Charge and discharge FET control

Many packs use back-to-back MOSFETs so the BMS can disconnect current flow safely in both directions. This is common because a single MOSFET's body diode can otherwise allow unwanted current in one direction.

The BMS may independently control:

- charge FET
- discharge FET
- precharge path in larger systems
- contactors in high-energy systems

This allows fault-specific behavior, such as blocking charge on overvoltage but still allowing limited discharge to bring the pack back into a safe range.
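
Fault-specific behavior can be expressed as a mapping from measured fault flags to independent FET commands. The sketch below is a simplified illustration with hypothetical names; it treats overcurrent and overtemperature as one `critical_fault` flag and leaves out precharge paths, contactors, debounce, and recovery hysteresis.

```python
from typing import NamedTuple


class FetState(NamedTuple):
    charge_on: bool
    discharge_on: bool


def fet_commands(cell_overvoltage: bool, cell_undervoltage: bool,
                 critical_fault: bool) -> FetState:
    """Map fault flags (from measurements, not SOC estimates) to FET commands."""
    if critical_fault:
        # For example overcurrent or overtemperature: open both paths.
        return FetState(charge_on=False, discharge_on=False)
    return FetState(
        charge_on=not cell_overvoltage,       # block charge, still allow discharge
        discharge_on=not cell_undervoltage,   # block discharge, still allow charge
    )


print(fet_commands(cell_overvoltage=True, cell_undervoltage=False, critical_fault=False))
print(fet_commands(cell_overvoltage=False, cell_undervoltage=True, critical_fault=False))
```
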

### 4.8 Hardware and firmware interaction

The BMS is one of the clearest places where hardware and software meet.

Examples:

- An embedded controller limits CPU turbo mode when the battery is cold or weak.
- A laptop OS uses BMS-reported cycle count and health to choose a charge ceiling.
- A drone flight controller aborts takeoff if pack voltage sag under test pulse is too large.
- A server backup unit logs cell imbalance trend data to schedule maintenance before failure.

Professional implementation practice:

- log fault reasons, not just generic shutdowns
- timestamp significant charge, discharge, and thermal events
- report raw measurements alongside filtered estimates when possible
- keep safety decisions robust even if host communication is unavailable

### 4.9 Common BMS mistakes

- Calling a basic protector a complete BMS.
- Monitoring only pack voltage and not individual cell voltages in series packs.
- Assuming balancing can compensate for badly mismatched or damaged cells.
- Using an SOC estimate as a protection threshold.
- Placing current sensing so some charge or discharge paths bypass the shunt.
- Forgetting that connector resistance and sense-line routing can corrupt measurements.

### 4.10 BMS debugging workflow

1. Read per-cell voltage, pack current, and temperature simultaneously.
2. Check fault flags and the conditions that set them.
3. Compare measured shunt voltage to expected current and calibration settings.
4. Verify whether charge/discharge FET gates are being commanded correctly.
5. Look for one weak cell or one bad sense wire before assuming the whole pack is damaged.
6. Confirm that the host and BMS agree on pack state and permitted actions.
7. If SOC looks wrong, separate raw measurement problems from estimation-model problems.

---

## 5. Voltage Protection

### 5.1 Why voltage protection matters so much in Li-ion systems

Li-ion cells operate safely only within a relatively narrow voltage window. Leaving that window has consequences beyond "reduced performance."

Overvoltage can cause:

- lithium plating and other irreversible side reactions
- cathode and electrolyte stress
- gas generation
- accelerated aging or safety risk

Undervoltage and deep discharge can cause:

- copper dissolution and internal damage in severe cases
- loss of usable capacity
- inability to recharge safely in some cases
- system instability and repeated brownout behavior

Voltage protection is therefore not just about preserving runtime. It is about preserving cell integrity and safety.
|
|
|
|
### 5.2 There are multiple voltage layers in a real product
|
|
|
|
Engineers often say "the cutoff voltage" as if there were only one. In practice there may be several thresholds:
|
|
|
|
- charger voltage limit
|
|
- cell-level overvoltage detection
|
|
- cell-level undervoltage detection
|
|
- BMS recovery thresholds with hysteresis
|
|
- regulator undervoltage lockout
|
|
- MCU brownout threshold
|
|
- software low-battery warning threshold
|
|
- shipping-storage threshold
|
|
|
|
These thresholds serve different purposes and should not be accidentally collapsed into one number.
|
|
|
|
### 5.3 Cell-level versus pack-level protection
|
|
|
|
Pack voltage alone is not sufficient for series packs.
|
|
|
|
Example:
|
|
|
|
- a `4s` pack may measure an acceptable total pack voltage
|
|
- but one cell may be at dangerous undervoltage while others remain high
|
|
|
|
That is why real multi-cell packs measure each cell or each parallel group individually.
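
A minimal sketch of how a healthy-looking pack voltage can hide one weak cell. The cell voltages and both limits below are illustrative assumptions, not values from any datasheet.

```python
# Illustrative 4s pack: three healthy cells and one weak cell (assumed values).
cell_voltages = [3.95, 3.95, 3.95, 2.60]

CELL_UV_LIMIT = 2.8        # assumed per-cell undervoltage limit, volts
PACK_UV_LIMIT = 4 * 3.0    # assumed naive pack-level limit, volts

pack_voltage = sum(cell_voltages)
pack_check_ok = pack_voltage >= PACK_UV_LIMIT

# Indices of cells that violate the per-cell limit.
weak_cells = [i for i, v in enumerate(cell_voltages) if v < CELL_UV_LIMIT]

print(f"pack = {pack_voltage:.2f} V, pack-level check passes: {pack_check_ok}")
print(f"cells below per-cell limit: {weak_cells}")
```

With these numbers the pack-level check passes while the last cell is already below its own limit, which is the failure a pack-only monitor cannot see.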
|
|
|
|
Single-cell systems still need discipline. The product may appear stable until a burst current event causes sag below the regulator UVLO or MCU brownout threshold.
|
|
|
|
### 5.4 Hysteresis and debounce are essential
|
|
|
|
If a protection threshold trips exactly when the measured voltage crosses the line and recovers immediately when it rises a few millivolts, the system may chatter on and off.
|
|
|
|
Why chatter happens:
|
|
|
|
- load current causes sag
|
|
- protection turns the load off
|
|
- voltage rebounds
|
|
- system turns on again
|
|
- load returns and causes sag again
|
|
|
|
Good protection design uses:
|
|
|
|
- threshold hysteresis
|
|
- time debounce or blanking
|
|
- staged responses where appropriate
|
|
|
|
Example:
|
|
|
|
- software low-battery warning at one threshold
|
|
- graceful power reduction at a lower threshold
|
|
- hard protection cutoff lower still
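
The combination of hysteresis and debounce can be sketched as a small state machine. This is an illustrative model only: the class name, the `3.0 V` trip and `3.3 V` recovery thresholds, and the three-sample debounce are assumptions chosen for the example.

```python
class UndervoltageDetector:
    """Undervoltage flag with threshold hysteresis and sample-count debounce."""

    def __init__(self, trip_v, recover_v, debounce_samples):
        if recover_v <= trip_v:
            raise ValueError("recovery threshold must sit above the trip threshold")
        self.trip_v = trip_v
        self.recover_v = recover_v
        self.debounce_samples = debounce_samples
        self.tripped = False
        self._count = 0  # consecutive samples that argue for a state change

    def update(self, voltage):
        # Which condition matters depends on the current state (hysteresis).
        if not self.tripped:
            wants_change = voltage < self.trip_v
        else:
            wants_change = voltage > self.recover_v
        # Only a run of consecutive agreeing samples changes state (debounce).
        self._count = self._count + 1 if wants_change else 0
        if self._count >= self.debounce_samples:
            self.tripped = not self.tripped
            self._count = 0
        return self.tripped


det = UndervoltageDetector(trip_v=3.0, recover_v=3.3, debounce_samples=3)
samples = [3.10, 2.95, 3.05, 2.95, 2.90, 2.85, 3.10, 3.20, 3.35, 3.40, 3.45]
states = [det.update(v) for v in samples]
print(states)
```

The single dip to `2.95 V` is ignored, three consecutive low samples trip the flag, and the rebound to `3.10 V` and `3.20 V` does not clear it because recovery requires a sustained reading above the higher threshold.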
|
|
|
|
### 5.5 Protection hardware patterns
|
|
|
|
Common hardware mechanisms include:
|
|
|
|
- dedicated protection ICs
|
|
- BMS-controlled MOSFET disconnects
|
|
- regulator UVLO and OVLO
|
|
- input TVS devices for external transients
|
|
- reverse polarity protection or ideal-diode controllers
|
|
- fuses for severe fault containment
|
|
|
|
Back-to-back FETs are common in pack protection because they can block current flow in both directions when off.
|
|
|
|
### 5.6 Voltage protection decision logic in practice
|
|
|
|
```mermaid
flowchart TD
    A[Unexpected shutdown or charge refusal] --> B{Read cell voltages under actual load}
    B -->|One cell low| C[Cell imbalance, weak cell, or bad sense line]
    B -->|All cells low| D[Pack depleted, cold, or load too heavy]
    B -->|Cells normal| E[Check drop across FETs, shunt, connector, and converters]
    C --> F[Review balancing history, IR, and cell health]
    D --> G[Review cutoff settings, source of current spikes, and temperature]
    E --> H[Check system UVLO, brownout, cable drop, and regulator behavior]
```
|
|
|
|
### 5.7 System-level voltage protection is not the same as battery protection
|
|
|
|
Even if the battery is protected, the system may still behave badly unless its own rails are designed sensibly.
|
|
|
|
Examples:
|
|
|
|
- A `3.3 V` rail converter may fall out of regulation before the battery reaches pack-protection cutoff.
|
|
- An MCU brownout threshold may be too close to the converter dropout region, causing corrupted flash writes or repeated boot loops.
|
|
- A storage device may need earlier warning and graceful shutdown than the hard battery cutoff allows.
|
|
|
|
Professional rule: align battery protection, power-stage limits, and firmware behavior so they fail gracefully in the right order.
|
|
|
|
### 5.8 Design example: layered thresholds for a `1s` handheld device
|
|
|
|
Possible threshold strategy:
|
|
|
|
- battery low warning to the UI: around the knee where useful runtime is becoming limited
|
|
- firmware disables nonessential features below that point
|
|
- regulator UVLO ensures clean rail behavior
|
|
- MCU brownout reset protects logic integrity
|
|
- pack protector disconnects only below the deeper cell safety threshold
|
|
|
|
This layering is better than using the pack protector as the first time the system notices low battery.
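
One way to keep the layers from collapsing into each other is to check their order and spacing in a script or unit test. The sketch below is hypothetical: every voltage is an assumed, battery-referred value for illustration (the MCU brownout is expressed as the battery voltage at which the rail would reach it), and `MIN_GAP_V` is an assumed allowance for measurement error.

```python
# Hypothetical battery-referred thresholds for a 1s device, highest first (volts).
thresholds = [
    ("UI low-battery warning", 3.50),
    ("firmware disables nonessential features", 3.40),
    ("regulator UVLO", 3.10),
    ("MCU brownout (battery-referred)", 3.00),
    ("pack protector cutoff", 2.80),
]
MIN_GAP_V = 0.05  # assumed minimum spacing between adjacent layers


def check_layering(thresholds, min_gap_v):
    """Return a list of adjacent layer pairs that are out of order or too close."""
    problems = []
    for (upper_name, upper_v), (lower_name, lower_v) in zip(thresholds, thresholds[1:]):
        if upper_v - lower_v < min_gap_v:
            problems.append((upper_name, lower_name))
    return problems


problems = check_layering(thresholds, MIN_GAP_V)
print("layering OK" if not problems else f"layering problems: {problems}")
```

The real values must come from the cell datasheet, the regulator and MCU datasheets, and measured sag; the point of the check is only that the order is deliberate.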
|
|
|
|
### 5.9 Common voltage-protection mistakes
|
|
|
|
- Using one pack-level threshold for a series pack without cell-level visibility.
|
|
- Setting thresholds with no margin for measurement error and divider tolerance.
|
|
- Ignoring load-induced sag when choosing undervoltage behavior.
|
|
- Allowing software writes or filesystem activity too close to hard cutoff.
|
|
- Forgetting hysteresis and causing repeated on-off oscillation.
|
|
- Measuring voltage at the wrong point and missing drop across connectors or FETs.
|
|
|
|
### 5.10 Voltage-protection debugging workflow
|
|
|
|
1. Capture cell and pack voltage at the moment of fault, ideally under real load.
|
|
2. Measure both before and after protection FETs or connector interfaces.
|
|
3. Compare thresholds in hardware, firmware, and the gauge configuration.
|
|
4. Check whether temperature and aging changed the droop behavior.
|
|
5. If the failure is intermittent, log the weakest-cell voltage rather than only pack voltage.
|
|
|
|
---
|
|
|
|
## 6. Current Draw
|
|
|
|
### 6.1 Current draw is the dynamic signature of the product
|
|
|
|
Average current matters, but current profile matters more than many engineers expect.
|
|
|
|
Real products do not draw one steady current. They move between states:
|
|
|
|
- off or shipping mode
|
|
- deep sleep
|
|
- idle
|
|
- active compute
|
|
- radio transmit
|
|
- display peak brightness
|
|
- actuator or motor startup
|
|
- storage write or CPU boost
|
|
|
|
Each state has different electrical consequences.
|
|
|
|
- average current affects runtime
|
|
- peak current affects voltage sag and protection behavior
|
|
- RMS current affects heating in resistive paths
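
The three quantities can differ widely for the same product. The sketch below uses a hypothetical two-state load (a `2 A` burst for 1% of the time, `10 mA` otherwise) and an assumed `0.1` ohm path resistance to show the gap.

```python
import math

# Hypothetical two-state load profile: (current in amps, fraction of time).
profile = [(2.0, 0.01), (0.010, 0.99)]
PATH_RESISTANCE_OHM = 0.1  # assumed total series resistance

i_avg = sum(i * frac for i, frac in profile)                 # drives runtime
i_peak = max(i for i, _ in profile)                          # drives sag
i_rms = math.sqrt(sum(i**2 * frac for i, frac in profile))   # drives heating

heat_from_rms_w = i_rms**2 * PATH_RESISTANCE_OHM
heat_from_avg_w = i_avg**2 * PATH_RESISTANCE_OHM  # wrong method, shown for contrast

print(f"average = {i_avg * 1000:.1f} mA, RMS = {i_rms * 1000:.1f} mA, peak = {i_peak:.1f} A")
print(f"resistive heating: {heat_from_rms_w * 1000:.2f} mW (RMS) "
      f"vs {heat_from_avg_w * 1000:.3f} mW (average, underestimates)")
```

For this profile, computing heating from the average current underestimates it by more than an order of magnitude.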
|
|
|
|
### 6.2 Current is what turns battery weakness into visible failure
|
|
|
|
High current causes:
|
|
|
|
- larger `I x R` droop in the cell and interconnects
|
|
- larger `I^2 x R` heating
|
|
- higher stress on FETs, shunts, connectors, and traces
|
|
- more converter stress and possible current-limit entry
|
|
|
|
This is why a battery system may work fine for light workloads and fail only during:
|
|
|
|
- boot bursts
|
|
- radio transmit bursts
|
|
- motor stall
|
|
- camera flash or display spikes
|
|
- HDD spin-up or storage write peaks in legacy systems
|
|
|
|
### 6.3 Battery current is not always the same as load current
|
|
|
|
This is one of the most common conceptual mistakes.
|
|
|
|
If power conversion exists, the battery current depends on power, efficiency, and battery voltage.
|
|
|
|
Approximate relationship:
|
|
|
|
- `Ibat ~= Pload / (eta x Vbat)`
|
|
|
|
Implications:
|
|
|
|
- as battery voltage falls, battery current rises for the same output power
|
|
- a boost converter can make battery current much higher than the output current
|
|
- late in discharge, the same load can become much harder on the battery
|
|
|
|
Example:
|
|
|
|
- load needs `5 W`
|
|
- converter efficiency is `90%`
|
|
- battery is at `3.7 V`: `Ibat ~= 5 / (0.9 x 3.7) ~= 1.5 A`
|
|
- battery later falls to `3.2 V`: `Ibat ~= 5 / (0.9 x 3.2) ~= 1.74 A`
|
|
|
|
The product has not changed, but the battery stress has increased.
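
The same relationship as a small function, evaluated across a discharge. This is a rough sketch: it assumes constant converter efficiency, which real converters do not have, and the `4.2 V` and `3.0 V` points are added for illustration.

```python
def battery_current(p_load_w, efficiency, v_bat):
    """Approximate battery current for a converter-fed load: Ibat ~= Pload / (eta x Vbat)."""
    return p_load_w / (efficiency * v_bat)


# Same 5 W load and 90% efficiency as the example above.
for v_bat in (4.2, 3.7, 3.2, 3.0):
    i_bat = battery_current(5.0, 0.90, v_bat)
    print(f"Vbat = {v_bat:.1f} V -> Ibat ~= {i_bat:.2f} A")
```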
|
|
|
|
### 6.4 Inrush, startup, and pulse loads
|
|
|
|
Many failures come from brief events rather than steady-state current.
|
|
|
|
Examples:
|
|
|
|
- input capacitors charging at plug-in
|
|
- CPU and DRAM ramping during boot
|
|
- radio power amplifier bursts
|
|
- motors pulling stall current at startup
|
|
- LED flash or backlight transitions
|
|
|
|
These events may be invisible on a slow meter.
|
|
|
|
Professional rule: if the failure is sudden, measure the waveform, not just the average.
|
|
|
|
### 6.5 How to measure current correctly
|
|
|
|
Tools and when to use them:
|
|
|
|
- DMM in series: good for slow average current, poor for fast bursts and can add burden voltage
|
|
- shunt plus oscilloscope: good for transients and pulse current waveforms
|
|
- current probe: useful for nonintrusive waveform capture if bandwidth and accuracy fit
|
|
- coulomb counter or fuel gauge: good for integrated charge over time
|
|
- dedicated power profiler: excellent for low-power embedded mode analysis
|
|
|
|
Implementation details that matter:
|
|
|
|
- place the shunt so all relevant current flows through it
|
|
- use Kelvin sensing for low-value shunts
|
|
- account for shunt self-heating and tolerance
|
|
- remember that a meter can change the circuit by adding series resistance
|
|
|
|
### 6.6 Software and current draw are tightly coupled
|
|
|
|
Current draw is one of the clearest hardware-software boundary problems.
|
|
|
|
Firmware choices that strongly affect battery current:
|
|
|
|
- sleep depth and wake interval
|
|
- clock frequency and DVFS policy
|
|
- peripheral gating
|
|
- radio retry behavior and network quality
|
|
- sensor duty cycle
|
|
- background tasks that prevent deep sleep
|
|
- logging verbosity and storage access pattern
|
|
|
|
This is why power optimization often fails when teams look only at hardware or only at software.
|
|
|
|
### 6.7 Sizing traces, connectors, and protection for real current
|
|
|
|
Do not size current-path components from average current alone.
|
|
|
|
Check at least:
|
|
|
|
- continuous current
|
|
- peak current
|
|
- expected fault current
|
|
- connector contact resistance and heating
|
|
- fuse characteristic and trip behavior
|
|
- MOSFET safe operating area where relevant
|
|
|
|
A connector that is fine electrically on paper can still run hot because tens of milliohms matter at high current.
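
A quick estimate of why contact resistance matters. The `20` milliohm contact resistance and the three current levels are assumed values for illustration only.

```python
def contact_loss(current_a, resistance_ohm):
    """Return (voltage drop in volts, heat in watts) for a resistive contact."""
    drop_v = current_a * resistance_ohm
    heat_w = current_a**2 * resistance_ohm
    return drop_v, heat_w


CONTACT_RESISTANCE_OHM = 0.020  # assumed contact resistance

for current_a in (1.0, 5.0, 10.0):
    drop_v, heat_w = contact_loss(current_a, CONTACT_RESISTANCE_OHM)
    print(f"{current_a:4.1f} A -> drop {drop_v * 1000:5.0f} mV, heat {heat_w:.2f} W")
```

Heat grows with the square of current, so a contact that dissipates a negligible amount at `1 A` dissipates a hundred times more at `10 A` inside a small, poorly cooled volume.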
|
|
|
|
### 6.8 Common current-draw mistakes
|
|
|
|
- Measuring only average current and missing spikes.
|
|
- Confusing regulator output current with battery current.
|
|
- Ignoring quiescent current in always-on rails.
|
|
- Forgetting that cold batteries sag more under the same pulse load.
|
|
- Using a DMM and assuming the reading represents worst-case behavior.
|
|
- Not correlating current spikes with software state transitions.
|
|
|
|
### 6.9 Current-draw debugging workflow
|
|
|
|
1. Define operating states clearly: sleep, idle, active, burst, fault.
|
|
2. Measure average current in each state.
|
|
3. Capture transient current during state transitions and peak events.
|
|
4. Correlate waveforms with firmware logs, GPIO markers, or trace output.
|
|
5. Compute voltage drop across the battery, shunt, connectors, and regulators during peaks.
|
|
6. If current is unexpectedly high, disable subsystems one at a time.
|
|
7. If runtime is unexpectedly short, separate "too much average current" from "too much pulse current causing early cutoff."
|
|
|
|
---
|
|
|
|
## 7. Runtime Estimation
|
|
|
|
### 7.1 Runtime starts with energy, not hope
|
|
|
|
The core runtime equation is simple:
|
|
|
|
- `Runtime_hours ~= Usable_Battery_Energy_Wh / Average_System_Power_W`
|
|
|
|
Everything hard about runtime comes from defining "usable" and "average" correctly.
|
|
|
|
Usable energy is not the label on the cell. It is the part you can really extract under your voltage limits, temperature, aging, and load profile.
|
|
|
|
Average system power is not one current number unless the product truly has one state.
|
|
|
|
### 7.2 Why watt-hours are usually the right starting point
|
|
|
|
If your system uses regulators, multiple rails, or changing battery voltage, `Wh` is usually the cleanest way to think.
|
|
|
|
Example:
|
|
|
|
- battery rated `11.1 Wh`
|
|
- usable fraction after cutoff, temperature, and aging margin: `0.85`
|
|
- overall conversion efficiency from battery to loads: `0.9`
|
|
- average system power: `0.35 W`
|
|
|
|
Estimated runtime:
|
|
|
|
- `Runtime ~= 11.1 x 0.85 x 0.9 / 0.35 ~= 24.3 hours`
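
The same estimate as a reusable function. The function and parameter names are illustrative; the inputs are the values from the example above.

```python
def runtime_hours(rated_wh, usable_fraction, efficiency, avg_load_w):
    """Runtime estimate: usable energy delivered to the loads divided by average load power."""
    usable_wh = rated_wh * usable_fraction * efficiency
    return usable_wh / avg_load_w


estimate_h = runtime_hours(rated_wh=11.1, usable_fraction=0.85,
                           efficiency=0.90, avg_load_w=0.35)
print(f"estimated runtime ~= {estimate_h:.1f} hours")
```

Keeping the derating factors as explicit arguments makes it easy to rerun the estimate for cold or aged conditions by changing `usable_fraction` alone.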
|
|
|
|
### 7.3 Step-by-step runtime estimation workflow
|
|
|
|
1. Convert battery specification to energy in `Wh`.
|
|
2. Determine how much of that energy is actually usable for the product.
|
|
3. Break the product into operating states.
|
|
4. Estimate or measure time spent in each state.
|
|
5. Estimate or measure power in each state from the battery side.
|
|
6. Compute weighted average power.
|
|
7. Add margin for temperature, aging, manufacturing spread, and user behavior.
|
|
8. Validate with bench tests and field telemetry.
|
|
|
|
### 7.4 How to estimate average power from multiple states
|
|
|
|
For a product with several modes:
|
|
|
|
- `Pavg = sum(State_Power x Duty_Fraction)`
|
|
|
|
Example:
|
|
|
|
- sleep: `5 mW` for `90%` of the time
|
|
- sensing and compute: `200 mW` for `9%`
|
|
- radio transmit: `1.2 W` for `1%`
|
|
|
|
Then:
|
|
|
|
- `Pavg = 0.005 x 0.90 + 0.2 x 0.09 + 1.2 x 0.01`
|
|
- `Pavg = 0.0045 + 0.018 + 0.012 = 0.0345 W`
|
|
|
|
This kind of duty-cycle model is often much more accurate than using a single "active current" number.
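
A small sketch of the duty-cycle model using the three states from the example. The dictionary layout and function name are assumptions for illustration; the check that duty fractions sum to one catches a common spreadsheet mistake.

```python
# State name -> (power in watts, fraction of time), taken from the example above.
states = {
    "sleep": (0.005, 0.90),
    "sensing and compute": (0.200, 0.09),
    "radio transmit": (1.2, 0.01),
}


def average_power_w(states):
    """Weighted average power; refuses profiles whose duty fractions do not sum to 1."""
    total_duty = sum(duty for _, duty in states.values())
    if abs(total_duty - 1.0) > 1e-9:
        raise ValueError(f"duty fractions sum to {total_duty}, expected 1.0")
    return sum(power * duty for power, duty in states.values())


p_avg = average_power_w(states)
print(f"Pavg = {p_avg:.4f} W")
for name, (power, duty) in states.items():
    print(f"  {name}: {100 * power * duty / p_avg:.0f}% of average power")
```

Printing each state's share shows that the radio, active only 1% of the time, still accounts for roughly a third of the average power in this profile.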
|
|
|
|
### 7.5 Why estimates fail in real life
|
|
|
|
Runtime estimates are commonly too optimistic because they ignore one or more of the following:
|
|
|
|
- conversion losses
|
|
- battery cutoff before full labeled capacity is used
|
|
- cold-temperature loss of usable capacity
|
|
- aged-cell loss of capacity and increased resistance
|
|
- pulse-load sag causing earlier shutdown than energy math predicts
|
|
- background current in sleep or standby
|
|
- self-discharge and shelf time
|
|
- user behavior that differs from lab assumptions
|
|
|
|
### 7.6 Late-discharge behavior is often the hidden runtime killer
|
|
|
|
A product may appear to have enough energy in theory but still stop early because the battery cannot maintain voltage under the required current near the end of discharge.
|
|
|
|
This is especially common when:
|
|
|
|
- the converter needs a minimum input voltage
|
|
- the load has sharp pulses
|
|
- the battery is cold or aged
|
|
- protection thresholds are conservative
|
|
|
|
This is why runtime estimation must include both energy capacity and power-delivery capability.
|
|
|
|
### 7.7 Example: embedded sensor node
|
|
|
|
Assume:
|
|
|
|
- `1s` Li-ion pack: `3000 mAh`, `3.7 V` nominal, about `11.1 Wh`
|
|
- usable fraction after margin: `0.8`
|
|
- converter efficiency: `0.92`
|
|
|
|
System states:
|
|
|
|
- deep sleep at `0.8 mW` for `95%`
|
|
- sensing and local processing at `120 mW` for `4.7%`
|
|
- LTE burst at `2.5 W` for `0.3%`
|
|
|
|
Average power:
|
|
|
|
- `Pavg = 0.0008 x 0.95 + 0.12 x 0.047 + 2.5 x 0.003`
|
|
- `Pavg = 0.00076 + 0.00564 + 0.0075`
|
|
- `Pavg ~= 0.0139 W`
|
|
|
|
Runtime estimate:
|
|
|
|
- usable energy to loads `~= 11.1 x 0.8 x 0.92 ~= 8.17 Wh`
|
|
- runtime `~= 8.17 / 0.0139 ~= 588 hours`, about `24.5 days`
|
|
|
|
But that estimate still needs pulse validation. The LTE burst may force a larger battery than average power alone suggests.
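
A first-pass pulse check can be sketched with a simple model: open-circuit voltage behind a series resistance, feeding a constant-power converter. Everything below except the `2.5 W` burst and `0.92` efficiency is an assumption for illustration: the late-discharge open-circuit voltage, both resistance values, and the converter minimum input voltage. A real check needs measured waveforms.

```python
import math


def loaded_voltage(v_oc, r_series_ohm, p_battery_w):
    """Terminal voltage of an OCV-plus-series-resistance source under a constant-power load.

    Solves P = (v_oc - I * R) * I for the smaller current root.
    Returns None if the source cannot deliver the requested power at all.
    """
    disc = v_oc**2 - 4.0 * r_series_ohm * p_battery_w
    if disc < 0:
        return None
    current_a = (v_oc - math.sqrt(disc)) / (2.0 * r_series_ohm)
    return v_oc - current_a * r_series_ohm


P_BURST_BATTERY_W = 2.5 / 0.92   # LTE burst referred to the battery side
V_OC_LATE = 3.4                  # assumed open-circuit voltage late in discharge
CONVERTER_MIN_INPUT_V = 3.0      # assumed converter minimum input voltage

# Assumed total series resistance (cell plus path) for two conditions, in ohms.
cases = {"fresh, room temperature": 0.15, "aged and cold": 0.50}

results = {name: loaded_voltage(V_OC_LATE, r, P_BURST_BATTERY_W) for name, r in cases.items()}
for name, v in results.items():
    verdict = "OK" if v is not None and v >= CONVERTER_MIN_INPUT_V else "below converter minimum"
    print(f"{name}: loaded voltage ~= {v:.2f} V ({verdict})")
```

With these assumed numbers the fresh case stays above the converter minimum while the high-resistance case drops below it, even though the energy budget is identical in both.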
|
|
|
|
### 7.8 Runtime estimation for peak-power systems
|
|
|
|
For drones, tools, robotics, and high-performance laptops, average power is not enough. You must also check:
|
|
|
|
- peak power and current
|
|
- transient droop
|
|
- connector and bus losses
|
|
- thermal rise during sustained high load
|
|
- cell voltage spread in series packs
|
|
|
|
In these systems, the battery may have enough total energy but still be incapable of supporting the demanded power safely or consistently.
|
|
|
|
### 7.9 Runtime validation flow
|
|
|
|
```mermaid
flowchart LR
    Workload[Workload states and duty cycle] --> AvgPower[Average battery-side power]
    Battery[Rated battery Wh] --> Usable[Usable fraction after cutoff, temp, and aging]
    AvgPower --> Estimate[Runtime estimate]
    Usable --> Estimate
    Estimate --> Bench[Bench validation with real waveforms]
    Bench --> Field[Field telemetry and refinement]
```
|
|
|
|
### 7.10 Common runtime-estimation mistakes
|
|
|
|
- Using only `mAh` and ignoring voltage and conversion efficiency.
|
|
- Using nominal battery capacity with no derating.
|
|
- Measuring load current on one output rail and assuming it equals battery current.
|
|
- Ignoring pulse-load induced early cutoff.
|
|
- Forgetting sleep current from always-on support circuitry.
|
|
- Designing to typical behavior rather than minimum guaranteed field behavior.
|
|
|
|
### 7.11 Debugging when measured runtime is worse than predicted
|
|
|
|
1. Verify actual battery energy and age, not label only.
|
|
2. Measure battery-side current and voltage over time.
|
|
3. Check converter efficiency at the real operating points.
|
|
4. Compare actual duty cycle to the assumed workload model.
|
|
5. Look for hidden background loads and retry loops.
|
|
6. Check whether undervoltage cutoff or brownout occurs before the energy budget says it should.
|
|
7. Repeat the test at cold and hot conditions if the product will ship into them.
|
|
|
|
---
|
|
|
|
## 8. Design Tradeoffs and Production Scenarios
|
|
|
|
### 8.1 Single-cell consumer device
|
|
|
|
Typical characteristics:
|
|
|
|
- `1s` pouch cell
|
|
- USB-powered charging
|
|
- charger IC with power-path feature
|
|
- fuel gauge integrated or standalone
|
|
- system prioritizes thin form factor and user experience
|
|
|
|
Common tradeoffs:
|
|
|
|
- faster charging versus thermal comfort and long-term aging
|
|
- full 100% charge versus 80% or 90% longevity mode
|
|
- low-cost simple gauge versus accurate model-based gauge
|
|
|
|
Production lessons:
|
|
|
|
- cable quality and adapter quality affect customer experience directly
|
|
- charge termination behavior must be tested with the screen, radio, and CPU active
|
|
- swelling risk is influenced by thermal design, not only cell quality
|
|
|
|
### 8.2 Industrial handheld or tool pack
|
|
|
|
Typical characteristics:
|
|
|
|
- multi-cell series pack
|
|
- strong pulse currents
|
|
- rugged connectors
|
|
- pack-level protection and balancing
|
|
|
|
Common tradeoffs:
|
|
|
|
- energy density versus power delivery
|
|
- passive versus active balancing
|
|
- removable pack convenience versus contact resistance and abuse risk
|
|
|
|
Production lessons:
|
|
|
|
- weak spot welds, poor connector contacts, and damaged sense wires cause field failures that look like cell failures
|
|
- current path resistance matters almost as much as the cell datasheet in high-power systems
|
|
|
|
### 8.3 Remote IoT or low-maintenance sensor product
|
|
|
|
Typical characteristics:
|
|
|
|
- long sleep intervals
|
|
- occasional high-power radio events
|
|
- strong dependence on firmware behavior
|
|
- sometimes solar or energy-harvesting input
|
|
|
|
Common tradeoffs:
|
|
|
|
- recharge convenience versus battery lifetime
|
|
- bigger cell versus more aggressive power optimization
|
|
- rechargeable Li-ion versus primary chemistry, depending on the service model
|
|
|
|
Production lessons:
|
|
|
|
- standby current mistakes dominate lifetime
|
|
- field temperature profile often matters more than room-temperature capacity
|
|
- firmware retries during poor connectivity can destroy the intended energy budget
|
|
|
|
### 8.4 Server backup, UPS, or large pack system
|
|
|
|
Typical characteristics:
|
|
|
|
- more cells, more monitoring, and stronger safety requirements
|
|
- stronger need for SOH tracking and fault logging
|
|
- maintenance and serviceability matter
|
|
|
|
Common tradeoffs:
|
|
|
|
- higher measurement accuracy versus BOM cost
|
|
- redundancy and fail-safe behavior versus simplicity
|
|
- active balancing and richer telemetry versus power overhead
|
|
|
|
Production lessons:
|
|
|
|
- service diagnostics and fault history are as important as the first-release electrical design
|
|
- pack replacement policy should be based on health and internal resistance trend, not only age
|
|
|
|
### 8.5 Design-review checklist
|
|
|
|
- Is the exact cell chemistry and voltage window documented?
|
|
- Are charge limits derived from the cell datasheet and validated thermally?
|
|
- Does the architecture separate system load from charge termination when needed?
|
|
- Are cell, pack, and system voltage thresholds intentionally layered?
|
|
- Is the worst-case current profile measured, not just estimated?
|
|
- Does the runtime model use battery-side power and realistic derating?
|
|
- Can the product log why it stopped charging or shut down?
|
|
- Are connector, fuse, shunt, and FET losses accounted for at peak current?
|
|
- For series packs, are per-cell measurements trustworthy and balanced?
|
|
|
|
---
|
|
|
|
## 9. Troubleshooting Playbook
|
|
|
|
### 9.1 Symptom-to-cause map
|
|
|
|
| Symptom | High-probability causes | First checks |
| --- | --- | --- |
| Sudden shutdown during transmit or motor start | Battery sag, weak cell, high path resistance, brownout threshold too high | Capture battery voltage and current during the event |
| Device says 30% then powers off | Poor SOC model, aged cell with high resistance, early UV cutoff | Compare OCV, loaded voltage, and gauge estimate |
| Charges only from some adapters | Source negotiation, cable drop, adapter current limit | Measure charger input voltage at the board |
| Battery gets warm near full charge | CV phase heat, aged cell, poor thermal path | Log charge current and temperature during CV |
| Runtime much lower in cold weather | Higher internal resistance, lower usable capacity, blocked charging recovery | Repeat tests at temperature |
| One series cell always low | Cell mismatch, imbalance, bad weld, bad tap sense | Check per-cell trend and balancing behavior |
| Product boot-loops on low battery | UVLO/BOR threshold interaction, high startup current | Scope rails through boot sequence |
|
|
|
|
### 9.2 Practical debugging sequence
|
|
|
|
1. Start with a reproducible operating state, not a vague user report.
|
|
2. Measure battery voltage, battery current, and temperature at the same time.
|
|
3. Identify whether the failure is energy-limited, power-limited, thermal-limited, or algorithm-limited.
|
|
4. Check the exact point where the system stopped: charger, BMS, regulator, or firmware policy.
|
|
5. For series packs, look at every cell before trusting pack voltage.
|
|
6. For current-related issues, capture waveforms; for estimation issues, log time-series data.
|
|
7. Compare new battery behavior to aged battery behavior if the field problem appears over time.
|
|
|
|
### 9.3 What to log in production firmware
|
|
|
|
If the product has a microcontroller or host processor, log these when possible:
|
|
|
|
- pack voltage and per-cell voltage if available
|
|
- pack current
|
|
- battery temperature
|
|
- charge state and charger fault reason
|
|
- BMS fault flags
|
|
- shutdown reason and brownout reason
|
|
- SOC estimate and raw voltage estimate when relevant
|
|
- cycle count and health estimate
|
|
|
|
This is one of the cheapest ways to make future debugging faster.
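
As a host-side sketch of what such a record might contain, the structure below mirrors the list above. The class name, field names, units, and example values are all hypothetical; production firmware would more likely use a packed binary struct in C, but the set of fields is the point.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class BatteryLogRecord:
    """One battery telemetry snapshot (hypothetical schema for illustration)."""
    timestamp_s: float
    pack_mv: int
    pack_ma: int                      # negative while discharging (assumed convention)
    temp_c: float
    charge_state: str
    cell_mv: List[int] = field(default_factory=list)
    charger_fault: Optional[str] = None
    bms_fault_flags: int = 0
    shutdown_reason: Optional[str] = None
    soc_pct: Optional[float] = None
    cycle_count: Optional[int] = None
    health_pct: Optional[float] = None

    def to_json(self) -> str:
        # Compact separators keep each log line small.
        return json.dumps(asdict(self), separators=(",", ":"))


record = BatteryLogRecord(
    timestamp_s=1234.5, pack_mv=13150, pack_ma=-2400, temp_c=3.5,
    charge_state="discharging", cell_mv=[3400, 3390, 3380, 2980],
    bms_fault_flags=0x02, shutdown_reason="cell_undervoltage",
    soc_pct=27.0, cycle_count=412, health_pct=81.0,
)
print(record.to_json())
```

A record like this, written at shutdown, is enough to distinguish a weak-cell cutoff at an indicated 27% from a genuinely empty pack.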
|
|
|
|
---
|
|
|
|
## 10. Interview-Level Understanding
|
|
|
|
### 10.1 Questions strong engineers should answer clearly
|
|
|
|
Why is Li-ion charged with CC/CV?
|
|
|
|
Because early in charge the cell can safely accept a controlled current while voltage rises. Once the cell reaches its voltage limit, continuing constant current would exceed a safe terminal voltage, so the charger must hold voltage and let current taper naturally.
|
|
|
|
Why is `mAh` not enough to compare batteries?
|
|
|
|
Because energy depends on both capacity and voltage, and many systems use regulators that make battery current differ from load current. `Wh` is the better first comparison metric.
|
|
|
|
Why can a battery look fine by voltage and still fail in use?
|
|
|
|
Because open-circuit voltage does not reveal internal resistance and dynamic power-delivery ability. Under load, `I x R` sag may push the system below its usable voltage threshold.
|
|
|
|
Why does a product sometimes die at 20% to 30% indicated battery?
|
|
|
|
Because SOC estimation may be wrong, the battery may be aged or cold, or pulse-load sag may hit cutoff well before the remaining energy can be extracted smoothly.
|
|
|
|
Why does a series pack need per-cell monitoring?
|
|
|
|
Because pack voltage can hide imbalance. One weak cell can hit an unsafe limit while the total pack voltage still looks acceptable.
|
|
|
|
Why is charging below freezing dangerous for many Li-ion cells?
|
|
|
|
Because ion intercalation into the anode becomes sluggish, increasing the chance of lithium plating rather than normal storage. That damages the cell and can create safety risk.
|
|
|
|
What is the practical difference between protection and fuel gauging?
|
|
|
|
Protection decides whether operation must stop to avoid unsafe conditions. Fuel gauging estimates battery state for control and user information. Gauging can be wrong and the system must still remain safe.
|
|
|
|
### 10.2 Design questions that reveal real understanding
|
|
|
|
- Where is the first threshold that tells software to reduce load, and where is the final hard cutoff?
|
|
- What is the worst pulse current, and what does it do to the weakest battery at end of life and low temperature?
|
|
- How does the charger behave when the system is active during charging?
|
|
- What happens if one cell-voltage sense wire in a multi-cell pack opens or drifts?
|
|
- Which part of the design owns the truth for shutdown cause: charger, BMS, regulator, or firmware log?
|
|
|
|
If a team cannot answer those questions, the battery design is probably not production-ready.
|
|
|
|
---
|
|
|
|
## 11. Final Engineering Principles
|
|
|
|
Battery engineering is rarely about memorizing one perfect voltage or one perfect formula. It is about respecting limits, measuring the right quantities, and understanding that chemical storage, power conversion, protection, and firmware policy all interact.
|
|
|
|
The best practical habits are simple:
|
|
|
|
- design from exact datasheets, not generic memory
|
|
- think in both energy and peak power
|
|
- measure current waveforms, not only averages
|
|
- treat temperature and aging as normal operating conditions
|
|
- layer protection and graceful degradation intentionally
|
|
- log enough data that future failures are diagnosable
|
|
|
|
If you carry those habits into design reviews, lab work, and field debugging, battery systems stop feeling mysterious and start becoming understandable engineering systems.
|