The True Cost of an IoT Cloud Backend
Where the cloud bill actually goes on an IoT product, and the levers that reduce it 10x without touching feature scope.
The cloud bill for an IoT product surprises almost every team — usually badly, around month four of operation. The pattern is consistent: the proof of concept ran for cents per device per month, the production system runs for dollars, and no one is sure where the money went. Here is where it tends to go and how to keep it down.
The line items that dominate
In a typical AWS, Azure, or GCP bill for an IoT product, the costs concentrate in:
- Message broker / IoT Hub ingestion, billed per message, per connection time, or per capacity tier depending on the provider. At fleet scale, this is often the largest single line item.
- Storage — both the hot time-series store and the longer-term archive. Without retention policies, this grows linearly forever.
- Egress — surprisingly large if dashboards pull raw data, if you stream telemetry to third parties, or if a careless query exports a year of data to a notebook.
- Compute for stream processing, dashboarding APIs, and any inference. Usually third or fourth on the list.
- Logs and metrics for the system itself. Underestimated. Easy to multiply your bill by leaving verbose logging on.
Surprises happen because each individual line is small per device but multiplies. A million devices times a tenth of a cent per device per day is about $365,000 a year.
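The multiplication is worth doing explicitly before launch. A minimal sketch of the arithmetic, where the function name and parameters are illustrative and not taken from any billing API:

```python
def annual_fleet_cost(devices: int, cost_per_device_per_day: float) -> float:
    """Annual cost in dollars for a fleet, given a per-device daily cost."""
    return devices * cost_per_device_per_day * 365


# A tenth of a cent per device per day, across a million devices.
print(f"${annual_fleet_cost(1_000_000, 0.001):,.0f} per year")  # prints $365,000 per year
```

Running the same function over each line item (ingestion, storage, egress, logs) shows which one dominates as the fleet grows.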
The levers that actually reduce cost
In rough order of leverage:
1. Reduce the message rate at the source.
A device sending one message per minute sends sixty times as many messages as one sending per hour, and per-message ingestion cost scales with it, all else equal. Most IoT products send more frequently than they need to. Push aggregation to the device (averages, summaries, only-on-change deltas) instead of streaming raw readings.
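As a rough illustration of device-side aggregation, the sketch below collapses a window of raw samples into one summary and suppresses values that have not moved meaningfully since the last report. The names `Summary`, `summarize`, and `ChangeFilter`, and the 0.5 deadband, are assumptions made for this example; real firmware would more likely be written in C, but the logic carries over.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Summary:
    count: int
    minimum: float
    maximum: float
    average: float


def summarize(readings: List[float]) -> Summary:
    """Collapse a window of raw readings into one summary message."""
    return Summary(len(readings), min(readings), max(readings), mean(readings))


class ChangeFilter:
    """Report a value only when it moves more than `deadband` from the last report."""

    def __init__(self, deadband: float) -> None:
        self.deadband = deadband
        self.last_sent: Optional[float] = None

    def should_send(self, value: float) -> bool:
        if self.last_sent is None or abs(value - self.last_sent) > self.deadband:
            self.last_sent = value
            return True
        return False


window = [21.0, 21.1, 21.0, 21.2, 21.1, 21.0]  # six raw samples
print(summarize(window))  # one message instead of six

change_filter = ChangeFilter(deadband=0.5)
samples = [21.0, 21.1, 21.4, 21.8, 21.9, 23.0]
sent = [v for v in samples if change_filter.should_send(v)]
print(sent)  # [21.0, 21.8, 23.0]
```

Which of the two fits depends on the signal: summaries suit slowly drifting measurements, while only-on-change reporting suits values that sit still most of the time.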
2. Compress at rest.
Time-series databases can often compress historical data 10-30x, but where compression is opt-in, most teams forget to enable it. Set retention and compression policies on day one.
3. Tier storage.
Hot store for recent, fast-access data. Cold archive (S3 Glacier or equivalent) for historical raw data. Because most queries only hit the last week, moving older data to the archive can cut storage cost by an order of magnitude.
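To see how much tiering matters for a given fleet, a back-of-the-envelope estimate is usually enough. In the sketch below the two prices and the data volumes are placeholders chosen for illustration, not quotes from any provider, and retrieval fees for the cold tier are deliberately left out.

```python
def monthly_storage_cost(total_gb: float, hot_gb: float,
                         hot_price: float, cold_price: float) -> float:
    """Monthly cost when at most `hot_gb` stays hot and the rest goes to cold archive."""
    hot = min(total_gb, hot_gb)
    cold = total_gb - hot
    return hot * hot_price + cold * cold_price


# Placeholder prices in $/GB-month; substitute your provider's actual rates.
HOT_PRICE = 0.10
COLD_PRICE = 0.004

total_gb = 10_000   # all telemetry stored so far
recent_gb = 200     # roughly the last week of data

all_hot = monthly_storage_cost(total_gb, total_gb, HOT_PRICE, COLD_PRICE)
tiered = monthly_storage_cost(total_gb, recent_gb, HOT_PRICE, COLD_PRICE)
print(f"all hot: ${all_hot:,.2f}/month, tiered: ${tiered:,.2f}/month")
```

If historical queries are frequent, add the archive's retrieval and scan charges to the tiered figure before comparing.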
4. Cache aggressively.
A dashboard query that scans a year of data on every refresh is paying CPU and IOPS for the same answer thousands of times. Continuous aggregates and CDN-cached API responses pay for themselves within a month.
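The idea behind a continuous aggregate can be sketched with nothing more than the standard-library `sqlite3` module. This is a stand-in for illustration only: databases such as TimescaleDB provide continuous aggregates as a built-in feature, and the table and function names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id TEXT, ts INTEGER, value REAL)")
conn.execute(
    "CREATE TABLE hourly (device_id TEXT, hour INTEGER, n INTEGER, total REAL, "
    "PRIMARY KEY (device_id, hour))"
)


def ingest(device_id: str, ts: int, value: float) -> None:
    """Store the raw reading and fold it into the hourly rollup."""
    hour = ts - ts % 3600  # start of the hour, in unix seconds
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (device_id, ts, value))
    cur = conn.execute(
        "UPDATE hourly SET n = n + 1, total = total + ? "
        "WHERE device_id = ? AND hour = ?",
        (value, device_id, hour),
    )
    if cur.rowcount == 0:  # first reading in this hour
        conn.execute("INSERT INTO hourly VALUES (?, ?, 1, ?)", (device_id, hour, value))


def hourly_averages(device_id: str, since: int):
    """Dashboard query: reads one row per hour, never the raw readings."""
    return conn.execute(
        "SELECT hour, total / n FROM hourly "
        "WHERE device_id = ? AND hour >= ? ORDER BY hour",
        (device_id, since),
    ).fetchall()


for ts, value in [(0, 10.0), (60, 20.0), (3600, 30.0)]:
    ingest("dev-1", ts, value)
print(hourly_averages("dev-1", since=0))  # [(0, 15.0), (3600, 30.0)]
```

The dashboard's cost now scales with the number of hours displayed rather than the number of raw readings behind them.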
5. Use spot / preemptible instances for batch.
Anything that does not need real-time response (batch reports, historical analysis, training jobs) can run on spot capacity, often at around 70% lower cost, provided the job tolerates being interrupted and restarted.
6. Choose ingestion endpoints carefully.
AWS IoT Core and Azure IoT Hub have different pricing models, and Google Cloud IoT Core was retired in 2023, so on GCP the choice is between a partner offering and a self-hosted broker. For high-frequency, low-payload telemetry, a self-hosted MQTT broker (EMQX, HiveMQ, Mosquitto) on a single VM can be 10x cheaper than managed offerings, at the cost of operational responsibility.
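A fair comparison has to price in the operational work, not just the VM. The sketch below is one way to frame it; every number in it (per-message price, VM cost, ops hours, hourly rate) is a placeholder assumption, and the model ignores connection-time charges and free tiers that real price sheets include.

```python
def managed_monthly_cost(devices: int, messages_per_device_per_day: int,
                         price_per_million_messages: float) -> float:
    """Per-message billing only, over a 30-day month."""
    messages = devices * messages_per_device_per_day * 30
    return messages / 1_000_000 * price_per_million_messages


def self_hosted_monthly_cost(vm_cost: float, ops_hours: float,
                             hourly_rate: float) -> float:
    """VM cost plus the engineering time spent keeping the broker healthy."""
    return vm_cost + ops_hours * hourly_rate


# Placeholder inputs: 50,000 devices sending one message per minute.
managed = managed_monthly_cost(50_000, 1440, price_per_million_messages=1.00)
self_hosted = self_hosted_monthly_cost(vm_cost=150.0, ops_hours=4, hourly_rate=100.0)
print(f"managed: ${managed:,.0f}/month, self-hosted: ${self_hosted:,.0f}/month")
```

With low message rates the managed figure shrinks quickly while the ops hours do not, which is why the answer flips for small or quiet fleets.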
7. Profile dashboards.
The dashboard query that sums millions of rows on every render is the silent killer. Add hard time bounds. Pre-compute. Cache.
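A minimal sketch of the last two habits, hard time bounds and caching, is shown below. The seven-day limit, the 60-second TTL, and the names `TTLCache`, `bounded_range`, and `expensive_query` are assumptions for illustration; a production system would more likely use a shared cache such as Redis or a CDN than an in-process dictionary.

```python
import time
from typing import Callable, Dict, Hashable, Tuple

MAX_RANGE_SECONDS = 7 * 24 * 3600  # hard bound: at most 7 days per dashboard query


class TTLCache:
    """In-process cache that recomputes a value only after `ttl_seconds` have passed."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[Hashable, Tuple[float, object]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], object]) -> object:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the expensive query
        value = compute()
        self._store[key] = (now, value)
        return value


def bounded_range(start: int, end: int) -> Tuple[int, int]:
    """Clamp a requested time range to the hard limit, keeping the most recent part."""
    if end < start:
        raise ValueError("end must not be before start")
    return max(start, end - MAX_RANGE_SECONDS), end


calls = {"n": 0}


def expensive_query(start: int, end: int) -> int:
    calls["n"] += 1
    return end - start  # placeholder for a real database scan


cache = TTLCache(ttl_seconds=60)
rng = bounded_range(0, 365 * 24 * 3600)  # a "whole year" request gets clamped
for _ in range(1000):  # a thousand dashboard refreshes
    cache.get_or_compute(rng, lambda: expensive_query(*rng))
print(calls["n"])  # 1
```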
8. Be ruthless about retention.
Most IoT telemetry has no business value after 90 days. Yet the default in most setups is “keep forever.” A 90-day retention policy with a quarterly export of aggregated data covers almost every analytics need.
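The policy amounts to a split at a cutoff date: raw rows newer than the cutoff stay, older rows are reduced to aggregates for export and then dropped. A small sketch of that split follows; the `apply_retention` function, the daily-average aggregate, and the sample data are illustrative assumptions, and in practice the database's own retention policy would do the deleting.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean
from typing import Dict, List, Tuple

Reading = Tuple[datetime, float]  # (timestamp, value)


def apply_retention(
    readings: List[Reading], now: datetime, keep_days: int = 90
) -> Tuple[List[Reading], Dict[str, float]]:
    """Split readings into raw rows to keep and daily averages for expired rows."""
    cutoff = now - timedelta(days=keep_days)
    kept: List[Reading] = []
    expired_by_day: Dict[str, List[float]] = defaultdict(list)
    for ts, value in readings:
        if ts >= cutoff:
            kept.append((ts, value))
        else:
            expired_by_day[ts.date().isoformat()].append(value)
    daily_avg = {day: mean(vals) for day, vals in expired_by_day.items()}
    return kept, daily_avg


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
readings = [
    (now - timedelta(days=120, hours=1), 10.0),  # expired
    (now - timedelta(days=120, hours=2), 20.0),  # expired, same calendar day
    (now - timedelta(days=5), 30.0),             # within the 90-day window
]
kept, daily_avg = apply_retention(readings, now)
print(len(kept), daily_avg)
```

Only `daily_avg` goes into the quarterly export; the expired raw rows are not carried forward.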
What it looks like at scale
Approximate cost ranges for a steady-state production fleet, after these levers are applied:
- Low-frequency telemetry (minutes between messages, low payload): $0.10 - $0.50 per device per year.
- High-frequency telemetry (seconds between messages): $1 - $5 per device per year.
- Continuous streaming (audio, video, raw sensor): $5 - $50 per device per year.
These are honest production numbers, not marketing minimums. A team running far above these ranges almost certainly has costs it can tune down.
The architectural decisions that matter
Some choices made in the first month of a project lock in cost behavior for years.
- Whether your message format is binary or JSON. JSON is convenient but typically 3-5x larger than a compact binary encoding. At scale, this multiplies the storage and egress lines.
- Whether ingestion fans out to multiple consumers. A message that is duplicated to a stream processor, a logging sink, and a backup is billed multiple times in many architectures.
- Whether your dashboards query the hot store directly or go through a cache. Direct queries scale linearly with users. Cached queries do not.
- Whether your aggregations live in the database or in application code. In-database aggregations are 10-100x more efficient than pulling raw rows out and aggregating in code.
Reversing any of these later can take a quarter of engineering time. Choosing them well at the start costs an hour.
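The message-format difference is easy to measure on your own payloads. The sketch below encodes one hypothetical reading as JSON and as a fixed 12-byte binary layout using the standard-library `struct` module. The field names and the layout are assumptions for illustration, and a hand-packed struct is the compact extreme: Protobuf or MessagePack carry some framing overhead and usually land between the two.

```python
import json
import struct

reading = {
    "device_id": 1042,
    "ts": 1717200000,        # unix seconds
    "temperature_c": 21.37,
    "humidity_pct": 48.2,
}

as_json = json.dumps(reading).encode("utf-8")

# Hypothetical fixed layout, little-endian, no padding:
# uint32 device id, uint32 timestamp, two int16 values scaled by 100 -> 12 bytes.
LAYOUT = "<IIhh"
as_binary = struct.pack(
    LAYOUT,
    reading["device_id"],
    reading["ts"],
    round(reading["temperature_c"] * 100),
    round(reading["humidity_pct"] * 100),
)

print(len(as_json), "bytes as JSON,", len(as_binary), "bytes as binary")

# The receiver reverses the scaling to recover the original values.
device_id, ts, temp_x100, hum_x100 = struct.unpack(LAYOUT, as_binary)
print(device_id, ts, temp_x100 / 100, hum_x100 / 100)
```

The trade-off is that a fixed layout needs explicit versioning when fields change, which is one reason schema-based formats like Protobuf are a common middle ground.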
What we typically do at the start
For new IoT cloud architectures, our default starting points:
- Compact binary message format (Protobuf or MessagePack) for telemetry.
- TimescaleDB or InfluxDB with compression enabled and retention policies set.
- Cold archive as Parquet files in S3, accessed through a cheap query layer for historical analysis.
- Aggressive continuous aggregates for dashboards.
- Self-hosted broker if scale justifies it; managed broker if it does not.
If your IoT cloud bill is climbing faster than your fleet, we have audited and reduced these bills more than once.