Edge AI on Microcontrollers: TinyML in 2026
What works, what is still painful, and how to decide whether your IoT product should run a model on the device or in the cloud.
A few years ago, “AI on a microcontroller” meant a 20-line keyword spotter and a great deal of hand-waving. In 2026 it is a serious option for a meaningful slice of IoT problems — but the failure modes are still subtle, and the right call is more often “no” than the marketing material suggests.
When edge AI is the right call
Run the model on the device when:
- Latency must be measured in milliseconds, not seconds. A fall detector cannot wait for a round-trip to the cloud.
- Connectivity is unreliable or absent. A vibration sensor on a remote pump cannot ship 100 Hz raw data over LoRa.
- Privacy or regulation requires that data never leave the device. Healthcare wearables, in-cabin audio, certain industrial settings.
- Cloud cost would dominate the unit economics. Streaming raw sensor data from a million devices into a cloud inference service adds up faster than running the model on the device ever would.
If none of these apply, run the model in the cloud. Edge constraints slow development down; the cloud gives you cheaper iteration on faster hardware.
What you can realistically run on a microcontroller
Approximate ranges for an ESP32-S3 or an STM32H7-class chip running a quantized model (a minimal runtime sketch follows the list):
- Wake-word and keyword spotting: routine. Sub-100 KB models, sub-100 ms latency.
- Activity recognition from accelerometer data: routine.
- Anomaly detection on time-series sensor data with a small autoencoder or one-class SVM-equivalent: yes.
- Person detection from very low-resolution images (96×96 grayscale): yes, with care.
- General object detection at usable resolution: not on a stock MCU. Move to an MCU with a neural accelerator (an NXP i.MX RT crossover MCU, or an STM32N6 with its Neural-ART NPU) or a Linux-class SBC.
- LLM inference: no. Even a small quantized 1B-parameter model exceeds typical MCU memory by orders of magnitude: roughly 1 GB of weights at int8 (500 MB at 4-bit) against a few hundred KB of SRAM, or a few MB of external PSRAM at best.
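For a sense of scale, here is roughly what invoking one of these quantized models looks like with the tflite-micro runtime, a common choice on these chips. This is a minimal sketch only: the model symbol g_model_data, the arena size, and the op list are placeholders for your own, and API details shift between tflite-micro versions.

```cpp
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Your quantized flatbuffer model, compiled into flash (name is a placeholder).
extern const unsigned char g_model_data[];

// Static arena: all tensor memory comes from here, sized by measurement.
constexpr int kArenaSize = 40 * 1024;
static uint8_t tensor_arena[kArenaSize];

int run_once(const int8_t* features, int n_features) {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  // Register only the ops your model uses; the full op set wastes flash.
  static tflite::MicroMutableOpResolver<3> resolver;
  resolver.AddFullyConnected();
  resolver.AddRelu();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(
      model, resolver, tensor_arena, kArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  // Copy quantized input features into the model's input tensor.
  TfLiteTensor* input = interpreter.input(0);
  for (int i = 0; i < n_features; ++i) input->data.int8[i] = features[i];

  if (interpreter.Invoke() != kTfLiteOk) return -1;
  return interpreter.output(0)->data.int8[0];  // quantized class score
}
```

The static arena is the detail to notice: all tensor memory is budgeted up front, which is what makes the memory questions below answerable at all.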
The model is the easy part
The hardest engineering on a TinyML project is rarely model accuracy. It is:
- Sensor pipeline reliability. The model’s accuracy on your validation set means nothing if the production sensor signal looks different from what you trained on. Calibration drift, sample-rate jitter, and environmental coupling all show up.
- Quantization stability. Float-trained models that quantize cleanly to int8 are not the default. Expect to need quantization-aware training, and a calibration dataset that genuinely represents production conditions.
- Memory budget. Inference uses the activation buffers you account for, plus the working memory you forgot. Profile both, on the actual chip, with the actual model (a profiling sketch follows this list).
- Battery accounting. A model that takes 80 ms per inference at 100 mA is fine when it runs once a second. Run it ten times a second and the chip is awake 80% of the time, and battery life quietly drops by roughly an order of magnitude (the arithmetic is worked after this list).
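On the memory point: tflite-micro can report actual arena usage after allocation, so you size the arena from measurement rather than guesswork. This continues the skeleton above; arena_used_bytes() is the accessor in recent tflite-micro versions.

```cpp
#include <cstdio>

// After AllocateTensors() succeeds, ask the interpreter how much of the
// arena it actually used, instead of trusting an offline estimate.
void report_arena_usage(tflite::MicroInterpreter& interpreter) {
  size_t used = interpreter.arena_used_bytes();
  printf("arena: %u of %u bytes used\n", (unsigned)used, (unsigned)kArenaSize);
  // Leave headroom beyond this number: stack depth during Invoke(),
  // sensor DMA buffers, and any RTOS heap all share the same SRAM.
}
```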
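And the battery arithmetic is worth writing down explicitly. The numbers below are assumptions for illustration, not measurements from any particular board:

```cpp
// Back-of-envelope battery math with assumed numbers: 100 mA active,
// 1 mA sleep, 80 ms per inference, 1000 mAh cell.
constexpr double kActiveMilliamps  = 100.0;
constexpr double kSleepMilliamps   = 1.0;
constexpr double kInferenceSeconds = 0.080;

constexpr double average_milliamps(double inferences_per_second) {
  double duty = inferences_per_second * kInferenceSeconds;  // awake fraction
  return duty * kActiveMilliamps + (1.0 - duty) * kSleepMilliamps;
}

// average_milliamps(1.0)  ~  8.9 mA -> roughly 4.7 days on 1000 mAh
// average_milliamps(10.0) ~ 80.2 mA -> roughly 12 hours: 9x worse, same model
```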
What good edge-AI deployment looks like
The systems that work in the field share a few patterns.
- The model is a service, not a snowflake. It has a versioned binary, a known interface, and a process for shipping updates over OTA (one possible blob layout is sketched after this list). New models do not require new firmware releases.
- Inference output is logged: a small ring buffer of recent predictions and the input features that produced them (also sketched below). When something looks weird in the field, you have evidence to debug with.
- Drift detection runs alongside the model. A simple statistical check that the input distribution today resembles the training distribution (a minimal check is sketched below). When it does not, the system flags it before the model silently degrades.
- There is a fallback. If the model crashes or returns nonsense, the device defaults to a sensible deterministic behavior. The model is an enhancement, not a single point of failure.
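A minimal sketch of the versioned-binary idea, assuming a custom blob format; the field names and magic value are invented for illustration:

```cpp
#include <cstdint>

// One way to make the model "a service": ship it as a standalone versioned
// blob with its own header, validated before the interpreter ever sees it.
struct ModelBlobHeader {
  uint32_t magic;           // e.g. 0x4D4F444C ("MODL"): reject random flash
  uint16_t schema_version;  // input/output contract the firmware expects
  uint16_t model_version;   // increments with every OTA model push
  uint32_t payload_bytes;   // flatbuffer length
  uint32_t payload_crc32;   // integrity check on the payload
};

// Firmware accepts any model whose schema it understands, so a new model
// is a small OTA payload, not a full firmware release.
bool model_blob_ok(const ModelBlobHeader& h, uint32_t computed_crc) {
  return h.magic == 0x4D4F444CU
      && h.schema_version == 1  // the contract this firmware implements
      && h.payload_crc32 == computed_crc;
}
```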
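The prediction log can be as simple as a fixed array in RAM. Sizes and field names below are illustrative:

```cpp
#include <cstdint>
#include <cstring>

constexpr int kLogDepth    = 32;  // decisions kept
constexpr int kNumFeatures = 16;  // model input width

struct PredictionRecord {
  uint32_t timestamp_ms;
  int8_t   features[kNumFeatures];  // quantized model input
  int8_t   score;                   // model output
};

static PredictionRecord g_log[kLogDepth];
static uint32_t g_log_head = 0;

// Overwrite the oldest entry; cheap enough to call on every inference.
void log_prediction(uint32_t now_ms, const int8_t* features, int8_t score) {
  PredictionRecord& rec = g_log[g_log_head % kLogDepth];
  rec.timestamp_ms = now_ms;
  memcpy(rec.features, features, sizeof(rec.features));
  rec.score = score;
  ++g_log_head;
}
// On a support request or periodic uplink, dump g_log: you get the last
// kLogDepth decisions and their inputs, not just "it misbehaved".
```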
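And a drift check does not need to be statistically sophisticated to be useful. A sketch, with the baseline statistics and threshold as placeholder assumptions:

```cpp
#include <cmath>
#include <cstdint>

// Crude drift check: compare the running mean of one input feature against
// the mean/stddev recorded from the training set.
struct DriftMonitor {
  double train_mean = 0.12;  // baked in at build time (placeholder value)
  double train_std  = 0.03;  // placeholder value
  double run_sum    = 0.0;
  uint32_t n        = 0;

  void observe(double feature_mean) { run_sum += feature_mean; ++n; }

  // Call periodically (e.g. daily). Returns true if the input distribution
  // has shifted more than z_max standard deviations from training.
  bool drifted(double z_max = 3.0) {
    if (n == 0) return false;
    double z = std::fabs(run_sum / n - train_mean) / train_std;
    run_sum = 0.0; n = 0;  // reset the window
    return z > z_max;
  }
};
// When drifted() fires, raise a telemetry flag; do not keep silently
// trusting the model on data it was never trained on.
```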
The boring path that wins
For a first edge-AI product:
1. Define the metric (accuracy, latency, false-positive cost) before picking a model architecture.
2. Collect a real dataset on the actual sensor in the actual environment. Months, not days.
3. Pick the smallest model that hits the metric. A logistic regression on hand-engineered features beats a neural network you cannot debug (a sketch follows this list).
4. Ship it with full telemetry, drift monitoring, and OTA. Improve it post-launch with real data.
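To make step 3 concrete, here is the kind of model that often wins: a handful of hand-engineered features and a logistic regression. The feature set and weights are illustrative placeholders, not a recommendation for any specific sensor:

```cpp
#include <cmath>
#include <cstddef>

// Three readable features from a raw sensor window.
struct Features { float rms; float zero_cross_rate; float peak; };

Features extract(const float* window, size_t n) {
  float sum_sq = 0, peak = 0;
  size_t crossings = 0;
  for (size_t i = 0; i < n; ++i) {
    sum_sq += window[i] * window[i];
    if (std::fabs(window[i]) > peak) peak = std::fabs(window[i]);
    if (i > 0 && (window[i] > 0) != (window[i - 1] > 0)) ++crossings;
  }
  return { std::sqrt(sum_sq / n), (float)crossings / n, peak };
}

// P(event) = sigmoid(w . x + b): three weights you can read, plot, and debug.
float predict(const Features& f) {
  const float w[3] = {2.1f, -0.7f, 1.3f};  // fit offline; illustrative values
  const float b = -1.5f;
  float z = w[0] * f.rms + w[1] * f.zero_cross_rate + w[2] * f.peak + b;
  return 1.0f / (1.0f + std::exp(-z));
}
```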
Skipping step 2 is the most common reason edge-AI projects ship and then quietly get turned off.
If you have a TinyML project that has been “almost ready” for six months, we have probably seen the failure mode.
Keep reading
- ESP32 vs STM32: When to Pick Each for Your IoT Product (Embedded). A side-by-side look at when ESP32 wins, when STM32 wins, and the small set of cases where neither is the right answer.
- Designing OTA Firmware Updates That Don't Brick Devices (Embedded). The patterns we use to ship firmware over the air to devices in the field: A/B partitions, rollback, signed images, staged rollouts, and the failure modes that bite if you skip them.
- Industrial IoT: Predictive Maintenance with Vibration Sensors (Industrial). How to design a predictive maintenance program that actually catches failures before they happen: sensors, edge processing, baselines, and the operational practices that make it stick.