Product Engineering · By Bhimsen G.V. · 18 May 2026 · 12 min read

How Embedded Engineers Build an EV Battery Management System — Cell Monitoring, SoC ML, Thermal AI, CAN Bus

A complete technical guide for embedded engineers building a Battery Management System from scratch — covering all six core BMS functions with real code, hardware choices, and build insights from Knowx Innovations client and student projects.

The electric vehicle market in India is growing faster than the supply of engineers who can build the intelligent systems that power it. At the centre of every EV is a Battery Management System — the embedded intelligence that keeps a lithium-ion pack safe, efficient, and long-lasting.

This is not a beginner tutorial. It is a complete engineering guide — the kind we use at Knowx Innovations when we build BMS systems for clients and when we train ECE and EEE students to build their own. Whether you are a final year student choosing your project, a working engineer moving into the EV domain, or a startup founder building an EV product, this guide gives you the technical foundation you need.

💬
Why this article exists: We consistently see ECE and EEE students — even those with strong embedded foundations — who have never been shown how BMS systems are actually engineered. Universities teach battery theory in physics labs. Industry needs engineers who can build a working BMS on STM32, add an ML model for SoC prediction, implement CAN bus communication, and deploy it in a real vehicle. That gap is what this guide closes.

What a BMS Does — and Why Embedded Engineers Build It

A Battery Management System is the embedded controller responsible for every intelligent function of a battery pack. Its job is to keep the pack safe, maximise energy delivery, and extend battery life — simultaneously, in real time, across every operating condition the vehicle encounters.

Definition — Battery Management System

A Battery Management System (BMS) is an embedded electronic system that monitors and controls a rechargeable battery pack. It measures cell voltages, temperatures, and currents; estimates State of Charge and State of Health; performs cell balancing; enforces protection limits; and communicates with the vehicle's power electronics via protocols like CAN bus. Modern AI-enhanced BMS systems add machine learning for predictive SoC accuracy, degradation forecasting, and thermal anomaly detection.

The reason embedded engineers — not just electrical engineers — build BMS systems is that the intelligence layer is entirely software-defined. The hardware measures and actuates. The firmware makes decisions. And in a modern AI-enhanced BMS, machine learning models run on the microcontroller, making predictions that fixed-rule firmware could never achieve.

BMS Architecture — 5 Layers Every Engineer Must Understand

BMS System Architecture — Knowx Reference Design
🔋 Layer 1: Cell Sensing Hardware
Individual cell voltage, temperature, and current measurement
Components: BQ76920, INA219, NTC thermistors

⚙️ Layer 2: Protection & Control
Overvoltage, undervoltage, overcurrent, and temperature protection
Components: STM32 / ESP32, MOSFETs, relay

🧠 Layer 3: AI / ML Intelligence
SoC prediction, degradation forecasting, thermal anomaly detection
Components: TFLite Micro, LSTM, Kalman filter

🔗 Layer 4: Communication
CAN bus to vehicle ECU, UART for diagnostics, IoT for remote monitoring
Components: MCP2515 CAN, MQTT, BLE

📊 Layer 5: Data & Dashboard
Cloud logging, real-time monitoring dashboard, alert management
Components: Firebase, Grafana, AWS IoT

Hardware Choice — STM32 for Production, ESP32 for Learning

Production BMS
STM32 — Industry Standard
  • ARM Cortex-M0 to M7 cores; check automotive (AEC-Q100) qualification per part number
  • Hardware CAN peripheral (bxCAN or FDCAN) built in on most variants
  • Real-time clock and watchdog timers
  • STM32Cube.AI for TFLite model deployment
  • ISO 26262 functional safety support
  • Used by Bosch, Continental, and Tier-1 suppliers
  • Cost: Rs.300–2,500 depending on variant
Learning & Prototype BMS
ESP32 — Student Accessible
  • Dual-core 240MHz — enough for BMS + ML
  • Built-in Wi-Fi and BLE for IoT connectivity
  • MicroPython or C/C++ — flexible development
  • CAN via the built-in TWAI (CAN 2.0) controller plus an external transceiver, or via an MCP2515 SPI module
  • TFLite Micro runs comfortably
  • Large community, easy debugging
  • Cost: Rs.400–800
💡
Our recommendation at Knowx: Build your first BMS on ESP32 — learn the architecture, get comfortable with cell monitoring ICs, SoC algorithms, and IoT connectivity. Once you understand the system deeply, port the critical firmware to STM32 for production or client projects. The ESP32 skills transfer directly. The STM32 adds automotive-grade reliability on top.

Module 1 — Cell Voltage Monitoring

Cell Voltage Monitoring with BQ76920
Measuring individual cell voltages with hardware protection IC

The BQ76920 is Texas Instruments' battery monitor (analog front-end) IC. It measures 3 to 5 series cells, provides cell-balancing switches that your microcontroller controls over I2C, and implements overvoltage and undervoltage protection in hardware, which is a critical safety feature.

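A typical NMC or NCA Li-ion cell operates safely between roughly 2.5V (minimum) and 4.2V (maximum); other chemistries such as LFP use different limits, so take the exact values from the cell datasheet. The BMS must monitor every cell individually, because a single cell pushed outside these limits can degrade rapidly and, in the worst case, go into thermal runaway. Monitoring the pack voltage alone is not sufficient.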

Python · ESP32 cell_monitor.py
# Cell voltage reading via I2C (BQ76920)
import machine, time

# I2C setup on ESP32
i2c = machine.I2C(0,
    scl=machine.Pin(22),
    sda=machine.Pin(21),
    freq=100000
)

BQ76920_ADDR = 0x08      # Default I2C address (some variants differ or require CRC)
CELL1_HI     = 0x0C      # Register: cell 1 voltage high byte
NOMINAL_GAIN = 0.000382  # Volts per LSB, nominal; the IC stores a trimmed gain and offset

def trigger_protection():
    # Placeholder: open the charge/discharge FETs or contactor here
    print("Protection triggered")

def read_cell_voltage(cell_num):
    # Each cell uses 2 registers (high byte first) holding a 14-bit ADC result
    reg = CELL1_HI + (cell_num - 1) * 2
    data = i2c.readfrom_mem(BQ76920_ADDR, reg, 2)
    raw = ((data[0] & 0x3F) << 8) | data[1]  # Keep only the 14 valid bits
    voltage = raw * NOMINAL_GAIN             # Convert to volts
    return round(voltage, 3)

def check_cell_safety(voltage, cell_id):
    if voltage > 4.20:
        print(f"ALERT: Cell {cell_id} overvoltage: {voltage}V")
        trigger_protection()
    elif voltage < 2.50:
        print(f"ALERT: Cell {cell_id} undervoltage: {voltage}V")
        trigger_protection()

# Monitor all 5 cells every 500ms
while True:
    for i in range(1, 6):
        v = read_cell_voltage(i)
        check_cell_safety(v, i)
        print(f"Cell {i}: {v}V")
    time.sleep_ms(500)
🔧
Mentor tip: Always implement cell monitoring with hardware protection as the first layer, and never rely on software alone for safety-critical limits. The BQ76920's hardware protection runs independently of your firmware: short-circuit detection trips within hundreds of microseconds, while the overvoltage and undervoltage comparators use configurable delays on the order of seconds, so set those delays deliberately. Software handles intelligent decisions. Hardware handles emergency protection.
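The 0.382 mV per LSB factor in the listing is a nominal value. As a hedged sketch, assuming the register layout described in the TI datasheet (a 5-bit trimmed gain code split across the ADCGAIN1 and ADCGAIN2 registers and added to a 365 µV/LSB base, plus a signed offset in millivolts from ADCOFFSET), the conversion could be refined as follows. The function names are illustrative, and the bit positions are an assumption to verify against the datasheet for your exact part.

```python
def decode_gain_uv(adcgain1, adcgain2):
    """Return the ADC gain in microvolts per LSB from the two gain bytes.

    Assumed layout: ADCGAIN1 bits 3:2 carry gain-code bits 4:3,
    ADCGAIN2 bits 7:5 carry gain-code bits 2:0.
    """
    code = ((adcgain1 & 0x0C) << 1) | ((adcgain2 & 0xE0) >> 5)
    return 365 + code


def decode_offset_mv(adcoffset):
    """Interpret the offset register byte as a signed 8-bit value in mV."""
    return adcoffset - 256 if adcoffset > 127 else adcoffset


def cell_voltage(raw_adc, gain_uv, offset_mv):
    """Convert a raw cell reading to volts: V = gain * raw + offset."""
    raw = raw_adc & 0x3FFF  # 14-bit result
    return raw * gain_uv / 1e6 + offset_mv / 1e3
```

On hardware, the three register bytes would be read once at start-up with `i2c.readfrom_mem` and the decoded gain and offset reused for every cell reading.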

Module 2 — State of Charge Prediction with ML

SoC Prediction using LSTM — TFLite on ESP32/STM32
Why Coulomb counting fails and how ML fixes it

Traditional SoC estimation uses Coulomb counting: integrating current over time to track charge in and out of the pack. The problem is that measurement error is integrated along with the signal. After around 100 charge cycles without recalibration (for example against open-circuit voltage after a rest), even a 1% current measurement error can grow into a 10–15% SoC error. Your dashboard shows 40% charge remaining when the battery actually has 25%.
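To make the drift concrete, here is a minimal sketch of Coulomb counting with a deliberately biased current sensor. The 50 Ah capacity, 10 A load, and 1% gain error are illustrative assumptions, not figures from a real pack.

```python
def coulomb_count(soc_start, currents_a, dt_s, capacity_ah):
    """Integrate current to track SoC (0.0 to 1.0). Positive current = discharge."""
    soc = soc_start
    for current in currents_a:
        soc -= current * dt_s / (capacity_ah * 3600.0)
        soc = min(1.0, max(0.0, soc))  # Clamp to the physical range
    return soc


CAPACITY_AH = 50.0                       # Assumed pack capacity
true_current = [10.0] * 3600             # One hour at 10 A, sampled at 1 Hz
measured_current = [i * 1.01 for i in true_current]  # Sensor reads 1% high

true_soc = coulomb_count(1.0, true_current, 1.0, CAPACITY_AH)
estimated_soc = coulomb_count(1.0, measured_current, 1.0, CAPACITY_AH)
drift = true_soc - estimated_soc

print(f"True SoC: {true_soc:.3f}  Estimated: {estimated_soc:.3f}  Drift: {drift:.4f}")
```

One hour of discharge produces only a small gap, but nothing in the algorithm ever removes it. A sensor offset is worse than a gain error in this respect, because it has the same sign during charge and discharge and therefore never cancels over a full cycle.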

An LSTM neural network trained on voltage, current, temperature, and cycle history estimates SoC with significantly better accuracy — typically under 3% error across the battery's lifetime. TFLite Micro runs this model directly on ESP32 or STM32 without any cloud dependency.

Python · Training train_soc_lstm.py
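import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import Sequential

# Features: voltage, current, temperature, cycle_count
# Target: SoC (0.0 to 1.0)

SEQ_LEN  = 20   # Look back 20 time steps
N_FEAT   = 4    # voltage, current, temp, cycle

# Placeholder data so the script runs end to end. Replace with windowed
# sequences from a real dataset: X has shape (samples, SEQ_LEN, N_FEAT).
X_train = np.random.rand(800, SEQ_LEN, N_FEAT).astype(np.float32)
y_train = np.random.rand(800, 1).astype(np.float32)
X_val   = np.random.rand(200, SEQ_LEN, N_FEAT).astype(np.float32)
y_val   = np.random.rand(200, 1).astype(np.float32)

model = Sequential([
    LSTM(64, input_shape=(SEQ_LEN, N_FEAT), return_sequences=True),
    Dropout(0.2),
    LSTM(32),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')  # SoC 0-1
])

model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(X_train, y_train, epochs=50, batch_size=32,
         validation_data=(X_val, y_val))

# Convert to TFLite for deployment on STM32/ESP32.
# Optimize.DEFAULT on its own applies dynamic-range quantisation;
# full INT8 additionally needs converter.representative_dataset.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('soc_model.tflite', 'wb') as f:
    f.write(tflite_model)
print(f"Model size: {len(tflite_model)/1024:.1f} KB")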
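The training script expects inputs shaped (samples, 20, 4), which means the logged time series has to be cut into overlapping windows first. A minimal sketch of that step in plain Python, assuming each logged row is `[voltage, current, temperature, cycle_count]` and the label is the SoC at the last step of the window (the function name and labelling choice are illustrative):

```python
def make_windows(samples, soc, seq_len=20):
    """Cut a time series into overlapping windows for sequence models.

    samples: list of feature rows, one per time step
    soc:     list of SoC labels aligned with samples
    Returns (windows, targets); each target is the SoC at the
    final time step of its window.
    """
    windows, targets = [], []
    for end in range(seq_len, len(samples) + 1):
        windows.append(samples[end - seq_len:end])
        targets.append(soc[end - 1])
    return windows, targets


# Tiny synthetic log: 25 time steps of a slow discharge
log = [[4.0 - 0.01 * t, 2.0, 30.0, 12] for t in range(25)]
labels = [1.0 - 0.02 * t for t in range(25)]

X, y = make_windows(log, labels)
print(len(X), len(X[0]), len(X[0][0]))
```

Passing the result through `np.asarray(X, dtype=np.float32)` would give the array shape the Keras model consumes.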
🔧
Mentor tip: The NASA Battery Dataset and CALCE Battery Data are excellent free datasets for training SoC models. Use INT8 quantisation when converting to TFLite: compared with float32 weights it reduces model size by roughly 4x and typically runs 2–3x faster on microcontrollers with minimal accuracy loss. Full INT8 conversion needs a representative dataset in addition to the default optimisation flag. A well-quantised SoC model typically fits in under 50KB, comfortably within STM32 flash.
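A quick way to sanity-check the size claim is to count the weights of the architecture used in the training script. The sketch below uses the standard parameter formulas for LSTM and dense layers; it estimates weight storage only and ignores file-format overhead and quantisation metadata, so treat the result as a lower bound rather than the final file size.

```python
def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (units + input_dim) + units)


def dense_params(units, input_dim):
    """One weight per input per unit, plus one bias per unit."""
    return units * input_dim + units


total = (
    lstm_params(64, 4)      # LSTM(64) on 4 input features
    + lstm_params(32, 64)   # LSTM(32) on the 64-wide sequence output
    + dense_params(16, 32)  # Dense(16)
    + dense_params(1, 16)   # Dense(1) output
)

int8_kb = total / 1024         # 1 byte per weight
float32_kb = total * 4 / 1024  # 4 bytes per weight

print(f"{total} parameters: ~{int8_kb:.1f} KB as INT8, ~{float32_kb:.1f} KB as float32")
```

The float32 figure is four times the INT8 figure by construction, which is where the roughly 4x size reduction comes from.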

Module 3 — Thermal Management with AI

Thermal Anomaly Detection — Beyond Simple Thresholds
Detecting dangerous thermal runaway precursors before they become critical

Li-ion thermal runaway is the most dangerous failure mode in EV batteries: once initiated, it can cause fire within seconds. A traditional BMS cuts power when temperature exceeds a threshold. An AI-enhanced thermal management system looks for the patterns that tend to precede thermal runaway, which can give a warning well before a fixed threshold would trip, in some cases hours earlier.

The key signals are not just absolute temperature but rate of temperature rise, differential temperature between cells, and correlation between temperature and current draw. An anomaly detection model trained on these multi-variable patterns catches dangerous conditions that simple threshold monitoring misses entirely.

Python · ESP32 thermal_monitor.py
import machine, time, math

# NTC thermistor reading using the Beta (simplified Steinhart-Hart) equation
def read_temperature_celsius(adc_pin):
    adc = machine.ADC(machine.Pin(adc_pin))
    adc.atten(machine.ADC.ATTN_11DB)
    raw = adc.read()

    # Voltage divider: 10kΩ pull-up to the supply, NTC thermistor to ground.
    # Assumes ADC full scale equals the divider supply (approximate on ESP32).
    R_REF  = 10000   # Pull-up resistor (ohms)
    V_MAX  = 4095    # 12-bit ADC max
    raw    = min(max(raw, 1), V_MAX - 1)   # Guard against divide-by-zero
    R_NTC  = R_REF * raw / (V_MAX - raw)

    # Beta-equation constants for a 10k NTC, B=3950
    B      = 3950
    T_NOM  = 298.15   # 25°C in Kelvin
    R_NOM  = 10000    # Resistance at 25°C

    temp_k = 1 / (1/T_NOM + math.log(R_NTC/R_NOM)/B)
    return round(temp_k - 273.15, 1)

# Thermal runaway precursor detection
SAMPLE_PERIOD_S = 1   # 1 sample/sec
WINDOW          = 10  # Samples used for the rate estimate
temp_history = []

def detect_thermal_anomaly(current_temp):
    temp_history.append(current_temp)
    if len(temp_history) > WINDOW:
        temp_history.pop(0)  # Keep only the latest window
    if len(temp_history) < WINDOW:
        return False

    # Rate of rise across the window, converted to °C per minute
    elapsed_s = (WINDOW - 1) * SAMPLE_PERIOD_S
    rise_rate = (temp_history[-1] - temp_history[0]) / elapsed_s * 60

    # >2°C/min rise rate = early thermal warning
    if rise_rate > 2.0:
        print(f"WARNING: Rapid temp rise {rise_rate:.2f}°C/min")
        return True
    return False
🔧
Mentor tip: Place at least 3 NTC thermistors per battery module — one at each end and one in the centre. Temperature differential across a module is often more informative than absolute temperature. A 5°C differential between cells in the same module during discharge is a stronger anomaly signal than a pack temperature of 38°C.
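The two signals discussed in this module, rate of rise and cell-to-cell differential, can be combined into a simple rule-based check before any trained model is involved. This is a hedged sketch in plain Python: the 5°C and 2°C/min limits are taken from the text above, while the function names and the dictionary layout are illustrative assumptions.

```python
def max_cell_differential(temps_c):
    """Spread between the hottest and coolest thermistor in one module."""
    return max(temps_c) - min(temps_c)


def rise_rate_c_per_min(history_c, sample_period_s):
    """Average rate of rise across the history window, in °C per minute."""
    if len(history_c) < 2:
        return 0.0
    elapsed_min = (len(history_c) - 1) * sample_period_s / 60.0
    return (history_c[-1] - history_c[0]) / elapsed_min


def thermal_flags(temps_c, history_c, sample_period_s,
                  diff_limit_c=5.0, rise_limit_c_per_min=2.0):
    """Return which early-warning conditions are currently true."""
    return {
        "differential": max_cell_differential(temps_c) >= diff_limit_c,
        "rapid_rise": rise_rate_c_per_min(history_c, sample_period_s) > rise_limit_c_per_min,
    }


# Three thermistors on one module, and 10 one-second samples from the hottest
module_temps = [33.0, 38.5, 33.2]
hot_history = [30.0 + 0.05 * i for i in range(10)]

print(thermal_flags(module_temps, hot_history, sample_period_s=1))
```

A learned anomaly detector would take the same quantities as input features; the fixed limits here simply make the behaviour easy to verify on the bench.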

Module 4 — Cell Balancing Algorithms

Passive vs Active Cell Balancing — Implementation
Equalising cell voltages to maximise usable pack capacity

Without cell balancing, the weakest cell in a series pack determines the entire pack's usable capacity. If Cell 3 reaches 4.2V (full) while Cells 1, 2, 4, and 5 are at 4.1V, charging must stop — even though 4 of the 5 cells still have capacity remaining. Cell balancing equalises voltages so the full pack capacity is available.

Passive vs Active Cell Balancing
Aspect | Passive Balancing | Active Balancing
Mechanism | Dissipates excess charge as heat via resistor | Transfers charge from high to low cells
Efficiency | Low: energy wasted as heat | High: energy redistributed
Complexity | Simple: BQ76920 handles in hardware | Complex: requires DC-DC converter
Best For | Student builds, low-cost BMS | Production EV BMS
Heat Generated | Significant: needs thermal management | Minimal
🔧
Mentor tip: For student builds, implement passive balancing first — the BQ76920 handles this in hardware with a simple register write. Understand it completely before attempting active balancing. Active balancing is more efficient but requires significantly more hardware and firmware complexity. In our training program at Knowx, we build passive balancing in Week 6 and active balancing concepts in Week 10.
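The decision of which cells to bleed is firmware logic, even when the switching itself is done by the monitor IC. A minimal sketch of that selection step, assuming a 10 mV balancing threshold and assuming that adjacent cells should not be bled at the same time (a common restriction on monitor ICs that should be confirmed in the datasheet). The function name and threshold are illustrative.

```python
def select_cells_to_bleed(voltages, threshold_v=0.010):
    """Pick cell indices to discharge through their balancing resistors.

    A cell qualifies when it sits more than threshold_v above the lowest
    cell. Highest cells are served first, and a cell is skipped if a
    neighbour has already been selected.
    """
    v_min = min(voltages)
    candidates = [i for i, v in enumerate(voltages) if v - v_min > threshold_v]

    selected = []
    for i in sorted(candidates, key=lambda idx: voltages[idx], reverse=True):
        if all(abs(i - j) > 1 for j in selected):
            selected.append(i)
    return sorted(selected)


# The example from the text: Cell 3 full, the others at 4.1V
print(select_cells_to_bleed([4.10, 4.10, 4.20, 4.10, 4.10]))

# A more scattered pack: neighbours of a selected cell wait for the next pass
print(select_cells_to_bleed([4.15, 4.18, 4.10, 4.16, 4.17]))
```

The returned indices are zero-based. On hardware they would be translated into the bit mask written to the balancing register, and the selection would be re-evaluated periodically as voltages converge.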

Module 5 — Battery Degradation Prediction

Remaining Useful Life Prediction with ML
Forecasting battery end-of-life from charge-discharge patterns

Battery degradation is gradual and invisible until it is sudden. A cell that was at 95% capacity last month may be at 78% today — and the traditional BMS has no way to predict when it will drop below the 80% threshold that defines end-of-useful-life for an EV application.

A degradation prediction model trained on charge-discharge cycle data — capacity fade curves, internal resistance growth, voltage curve shape changes — can forecast remaining useful life months in advance. This is the difference between a fleet operator who replaces batteries reactively (after failure) and one who replaces them proactively (before failure causes downtime).

Key features for degradation prediction: capacity fade rate (Ah delivered vs rated), internal resistance growth (measured from voltage recovery after load), Coulombic efficiency (charge in vs charge out per cycle), and dV/dQ curve shape (characteristic changes as degradation progresses).
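The features listed above are straightforward to compute from per-cycle logs. The sketch below shows three of them plus a deliberately simple straight-line extrapolation to the 80% end-of-life threshold. The linear fit is a stand-in for a trained degradation model, not a substitute for one, since real capacity fade is usually nonlinear; all function names are illustrative.

```python
def state_of_health(delivered_ah, rated_ah):
    """Capacity fade feature: delivered capacity as a fraction of rated."""
    return delivered_ah / rated_ah


def coulombic_efficiency(discharge_ah, charge_ah):
    """Charge out divided by charge in for one cycle."""
    return discharge_ah / charge_ah


def internal_resistance_ohm(v_rest, v_load, load_current_a):
    """Estimate resistance from the voltage recovery after a load is removed."""
    return (v_rest - v_load) / load_current_a


def cycles_until_eol(soh_history, eol_soh=0.80):
    """Least-squares line through SoH per cycle, extrapolated to eol_soh.

    Returns the estimated number of cycles remaining after the last
    logged cycle, or None if there is no downward trend to extrapolate.
    """
    n = len(soh_history)
    if n < 2:
        return None
    x_mean = (n - 1) / 2.0
    y_mean = sum(soh_history) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(soh_history))
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = y_mean - slope * x_mean
    eol_cycle = (eol_soh - intercept) / slope
    return max(0.0, eol_cycle - (n - 1))


history = [1.00, 0.99, 0.98, 0.97]  # Synthetic SoH log, one value per cycle
print(cycles_until_eol(history))
```

Logging these values from the first cycle onward gives a model something to train on later, even if the first deployed version only uses the linear estimate.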

🔧
Mentor tip: Start collecting degradation data from Day 1 of your BMS deployment — cycle number, capacity delivered, internal resistance. Even without a trained model, having this historical data is valuable. The NASA PCoE Battery Dataset has thousands of charge cycles with ground truth end-of-life labels — use it to train and validate your degradation model before deploying on real hardware.

Module 6 — CAN Bus Integration

CAN Bus — Communicating with the Vehicle ECU
The industry-standard protocol for automotive embedded systems

In a real EV, the BMS does not operate in isolation. It communicates with the motor controller, charger, dashboard, and vehicle ECU via CAN bus — the automotive industry standard for reliable, noise-resistant embedded communication. The BMS broadcasts cell voltages, SoC, temperature, and fault status. It receives charging limits and current setpoints.

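For ESP32-based builds, the MCP2515 CAN controller with SPI interface is the most accessible option, particularly under MicroPython; the ESP32 also includes a built-in TWAI (CAN 2.0) controller that needs only an external transceiver. For STM32, most variants have a hardware CAN peripheral (bxCAN, or FDCAN on newer families), so no external controller IC is needed, only a transceiver.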

C · STM32 bms_can.c
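// STM32 CAN: transmit BMS status frame
// CAN ID 0x100 = BMS Status (custom DBC definition)
#include "main.h"  // CubeMX-generated header; pulls in the HAL for your STM32 family

extern CAN_HandleTypeDef hcan1;  // Initialised elsewhere and started with HAL_CAN_Start()

static CAN_TxHeaderTypeDef TxHeader;
static uint8_t  TxData[8];
static uint32_t TxMailbox;

HAL_StatusTypeDef BMS_CAN_SendStatus(
    uint16_t soc_pct,      // SoC in 0.1% units (e.g. 756 = 75.6%)
    int16_t  current_mA,   // Pack current in mA (signed); int16 limits this to ±32.767 A
    int16_t  temp_degC,    // Max cell temp in 0.1°C units (signed, so sub-zero works)
    uint8_t  fault_flags   // Bit flags: bit0=OV, bit1=UV, bit2=OT
) {
    TxHeader.StdId = 0x100;
    TxHeader.DLC   = 8;
    TxHeader.IDE   = CAN_ID_STD;
    TxHeader.RTR   = CAN_RTR_DATA;
    TxHeader.TransmitGlobalTime = DISABLE;

    // Pack data into 8 bytes (little-endian)
    TxData[0] = soc_pct & 0xFF;
    TxData[1] = (soc_pct >> 8) & 0xFF;
    TxData[2] = (uint16_t)current_mA & 0xFF;
    TxData[3] = ((uint16_t)current_mA >> 8) & 0xFF;
    TxData[4] = (uint16_t)temp_degC & 0xFF;
    TxData[5] = ((uint16_t)temp_degC >> 8) & 0xFF;
    TxData[6] = fault_flags;
    TxData[7] = 0x00;  // Reserved

    // Do not queue a frame if all transmit mailboxes are busy
    if (HAL_CAN_GetTxMailboxesFreeLevel(&hcan1) == 0) {
        return HAL_BUSY;
    }
    return HAL_CAN_AddTxMessage(&hcan1,
        &TxHeader, TxData, &TxMailbox);
}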
🔧
Mentor tip: Define your CAN message IDs using a DBC (Database CAN) file from the start — even for a student project. DBC files are the industry standard for documenting CAN signals. Tools like CANdb++ and Vector CANalyzer can parse your DBC file and decode messages in real time during debugging. This skill is directly transferable to any automotive embedded job.
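When debugging on a PC, it helps to decode captured frames with the same layout the firmware uses. The sketch below mirrors the 8-byte BMS status frame from the C listing using Python's `struct` module. It assumes the temperature field is interpreted as a signed 16-bit value, and the function and key names are illustrative; in a real project the DBC file would be the single source of truth for this layout.

```python
import struct

# Little-endian: uint16 SoC, int16 current, int16 temperature, uint8 flags, 1 pad byte
BMS_STATUS_FORMAT = "<HhhBx"


def pack_bms_status(soc_tenths_pct, current_ma, temp_tenths_c, fault_flags):
    """Build the 8-byte payload the way the firmware packs it."""
    return struct.pack(BMS_STATUS_FORMAT, soc_tenths_pct, current_ma,
                       temp_tenths_c, fault_flags)


def decode_bms_status(payload):
    """Turn an 8-byte payload back into engineering units."""
    soc, current, temp, flags = struct.unpack(BMS_STATUS_FORMAT, payload)
    return {
        "soc_pct": soc / 10,
        "current_a": current / 1000,
        "max_temp_c": temp / 10,
        "overvoltage": bool(flags & 0x01),
        "undervoltage": bool(flags & 0x02),
        "overtemp": bool(flags & 0x04),
    }


frame = pack_bms_status(756, -12500, 385, 0b101)
print(frame.hex())
print(decode_bms_status(frame))
```

Round-tripping a known frame like this is a cheap check that the firmware byte order and the decoder agree before a bus analyser is involved.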

Real Build Stories — Clients and Students at Knowx

🚗
Client EV Project — BMS for an EV Startup: An EV startup approached Knowx with a problem — they had a battery pack but no intelligent management system. Their early prototype used a basic protection board with no SoC estimation, no balancing, and no communication with their motor controller. Range prediction was guesswork. We built a complete BMS on STM32 — cell monitoring with BQ76920, LSTM-based SoC prediction running as a TFLite model, passive cell balancing, NTC thermal monitoring with anomaly detection, and CAN bus communication to their motor controller. The result was measurable: SoC estimation accuracy improved to within 3%, range prediction became reliable enough for their product demo, and they had the data story they needed for their next investor conversation.
🎓
Student Build — EEE Final Year Student at Knowx: One of our EEE students came in with a solid understanding of power electronics but had never written a line of Python or worked with ML. By Week 8 of the program he had a working 3-cell Li-ion BMS on ESP32 — cell monitoring, SoC estimation using a trained LSTM model, passive balancing, and a live Grafana dashboard showing all parameters in real time. He used this as his final year project. At his placement interview at an EV component company in Bangalore, he was the only candidate who could demo a working BMS on actual hardware. He was offered the role before the interview formally ended. His interviewer later told him it was the first time in two years of campus interviews that a student had brought a working prototype.

Frequently Asked Questions

What does a Battery Management System do?
A BMS monitors and controls a lithium-ion battery pack to ensure safe operation, maximise energy delivery, and extend battery life. Core functions: cell voltage monitoring, State of Charge estimation, temperature management, cell balancing, fault detection, and CAN bus communication with the vehicle. Modern AI-enhanced BMS systems add ML for predictive SoC accuracy, degradation forecasting, and thermal anomaly detection, turning the BMS from a reactive protection system into a predictive intelligence layer.

Which microcontroller should I use for a BMS?
For production: STM32, an ARM Cortex-M series with hardware CAN bus, automotive-grade reliability, and STM32Cube.AI for TFLite deployment. Used by Bosch, Continental, and every major Tier-1 automotive supplier. For learning and prototyping: ESP32, which is affordable, accessible, has Wi-Fi built in, and handles all core BMS functions well enough for educational builds and MVPs. Our recommendation at Knowx: start with ESP32, then port to STM32 for production work.

How is machine learning used in a BMS?
ML is used in BMS for three main applications: SoC estimation using LSTM networks, which improves accuracy over Coulomb counting to under 3% error; degradation prediction, which forecasts remaining useful life from cycle patterns; and thermal anomaly detection, which identifies dangerous temperature behaviour before it becomes a safety risk. TFLite enables these models to run directly on STM32 or ESP32 without cloud dependency, which is critical for automotive applications where connectivity cannot be assumed.

What is cell balancing, and which type should I use?
Cell balancing equalises voltage across individual cells in a series battery pack. Without balancing, the weakest cell limits the entire pack's usable capacity. Passive balancing dissipates excess energy as heat via resistors; it is simple and handled in hardware by ICs like BQ76920. Active balancing redistributes energy between cells using DC-DC converters, which is more efficient but more complex. For student projects, start with passive balancing. For production EV BMS, active balancing is preferred for efficiency.

Is a BMS a good final year project?
Yes. A BMS project is one of the strongest final year projects for ECE and EEE students. It combines hardware design, embedded firmware, AI/ML, and automotive communication protocols, covering more engineering ground than most other project topics. At Knowx Innovations in Bangalore, students build working BMS prototypes as part of the Embedded AI & IoT Product Engineer program, a 12-week hands-on program available online and offline with weekday and weekend batches. Internship certificate included for pursuing students.

Build a Working BMS — With Real Hardware and Expert Mentors
12 Weeks. STM32 + ESP32. Real EV Projects.

Knowx Innovations is a product development company in Bangalore building embedded AI systems for EV clients. Our training division gives ECE, EEE and CSE students hands-on experience building BMS systems and other real-world embedded AI products — with mentors who have built these systems commercially. Online and offline batches. Weekday and weekend schedules. University-compliant internship certificate included.

BMS on STM32 + ESP32 · Online & Offline · Weekday & Weekend · Internship Certificate Included · Rs.20,000 + GST · EMI Available