Time-Series Data Architectures for IoT Telemetry
The patterns we use to ingest, store, and query high-volume IoT telemetry — and the failure modes that bite when you try to use a generic database for it.
A thousand devices reporting once per minute is 1.4 million records per day. That fits in any database. Ten thousand devices reporting once per second is almost a billion records per day. That does not. The architecture you pick determines how much grief you absorb between those two numbers.
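That arithmetic is worth making explicit. A back-of-envelope sketch in Python (the 40-byte row size is an assumption for illustration, not a measurement):

```python
# Back-of-envelope ingest math for the larger fleet above.
devices = 10_000
interval_s = 1                       # one reading per second
rows_per_day = devices * 86_400 // interval_s
bytes_per_row = 40                   # assumed: timestamp + device id + a few floats

print(f"{rows_per_day:,} rows/day")                               # 864,000,000
print(f"~{rows_per_day * bytes_per_row / 1e9:.0f} GB/day raw")    # ~35 GB/day before compression
```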
What makes time-series data different
Three characteristics:
- Append-only writes dominate. You almost never update an old reading. You almost never delete one until it ages out.
- Recent data is hot, old data is cold. Most queries hit the last hour, day, or week. Queries against last year’s data exist but are rare.
- Aggregations matter more than raw rows. “Average per minute over the last week” is the question users ask, not “give me row 8,234,891.”
These three together change the optimal storage layout. Row-store databases tuned for transactional workloads waste space and CPU on this pattern.
The decision tree
For most IoT telemetry workloads, the choice comes down to three options.
- TimescaleDB if your team already knows PostgreSQL. Hypertables, continuous aggregates, and compression turn Postgres into a competent time-series database. You inherit the entire Postgres ecosystem — JSON support, joins to relational tables, materialized views, your existing operational expertise.
- InfluxDB if pure time-series is the workload and you want a database designed for it from the ground up. Sharper write performance, native downsampling, simpler operational model. Less flexibility for relational queries.
- Cloud-managed offerings (AWS Timestream, Azure Data Explorer, Google BigQuery for time-series) when team capacity is the constraint. You pay more per byte stored, you do not manage the database, and you trade vendor flexibility for operational simplicity.
There are valid reasons to go further afield — ClickHouse, QuestDB, VictoriaMetrics — but the three above cover 80% of IoT projects without surprises.
The architecture we keep ending up with
A pattern that has survived several scale-ups:
Devices → MQTT broker → Stream processor → Hot store
                               ↓
                Cold archive (S3 / GCS, Parquet)
                               ↓
            Query layer (Presto, Athena, or DuckDB)
The layers do specific jobs:
- MQTT broker (Mosquitto, EMQX, or managed AWS IoT Core) accepts the firehose, handles backpressure, and decouples ingestion from storage.
- Stream processor (Kafka Streams, Flink, or a simpler Lambda/Cloud Run consumer) does the lightweight transforms — unit conversion, calibration, basic anomaly flagging — before persistence. This is the layer where you enrich raw readings with device metadata.
- Hot store holds the recent N days at high resolution for live dashboards and alerting. This is your TimescaleDB or InfluxDB instance, sized for the working set.
- Cold archive holds raw historical data in object storage as Parquet or compressed columnar files. Cheap to retain, easy to query for ad-hoc analysis.
This separation matters. Stuffing everything into the hot store works until the storage bill exceeds the budget; pretending the cold archive is fast enough for dashboards works until the support team starts complaining.
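To make the stream-processor layer concrete, here is a minimal Python sketch of an MQTT consumer that applies the transforms described above and batches rows into the hot store. It assumes paho-mqtt and psycopg2; the topic layout, payload fields, metadata lookup, and `readings` schema are illustrative, not prescribed:

```python
import json
import paho.mqtt.client as mqtt
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("dbname=telemetry")          # hot store (TimescaleDB)
DEVICE_META = {"dev-001": {"site": "plant-a", "cal_offset": -0.3}}  # stand-in for a real metadata store
buffer = []

def flush():
    # Batched insert: one round trip per 500 readings, not one per reading.
    with conn.cursor() as cur:
        execute_values(
            cur,
            "INSERT INTO readings (time, device_id, site, temp_c) VALUES %s",
            buffer,
            template="(to_timestamp(%s), %s, %s, %s)",
        )
    conn.commit()
    buffer.clear()

def on_connect(client, userdata, flags, rc):
    client.subscribe("telemetry/+/readings")

def on_message(client, userdata, msg):
    r = json.loads(msg.payload)
    meta = DEVICE_META.get(r["device_id"], {})
    # Lightweight transforms before persistence: unit conversion + calibration.
    temp_c = (r["temp_f"] - 32) * 5 / 9 + meta.get("cal_offset", 0.0)
    buffer.append((r["ts"], r["device_id"], meta.get("site"), temp_c))
    if len(buffer) >= 500:
        flush()

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.local")                       # your Mosquitto / EMQX endpoint
client.loop_forever()
```

The batching here is deliberate; it is the same mitigation for index thrash that comes up in the failure modes below.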
Compression is the cheat code
Both TimescaleDB and InfluxDB support compression of older data, typically a 10x to 30x reduction with little query overhead for IoT-style readings. Most teams forget to enable it until the storage bill arrives. The cost of doing it right is a configuration line; the cost of doing it wrong is storage that grows 10x to 30x larger than it needs to be.
Set compression policies on day one:
- Recent N days uncompressed, fully indexed, fast for dashboards.
- Older than N days compressed, still queryable, much cheaper.
- Older than M months archived to object storage, queryable through a separate path.
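As a concrete illustration of the TimescaleDB variant, a minimal sketch, assuming a hypertable named `readings` and using 7 days and 12 months as the N and M placeholders:

```python
import psycopg2

conn = psycopg2.connect("dbname=telemetry")
with conn.cursor() as cur:
    # Compress chunks older than 7 days, segmented by device for good ratios.
    cur.execute("""
        ALTER TABLE readings SET (
            timescaledb.compress,
            timescaledb.compress_segmentby = 'device_id'
        )
    """)
    cur.execute("SELECT add_compression_policy('readings', INTERVAL '7 days')")
    # Drop raw chunks past 12 months; by then the archive job should have
    # copied them to object storage.
    cur.execute("SELECT add_retention_policy('readings', INTERVAL '12 months')")
conn.commit()
```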
Continuous aggregates beat dashboard queries
A dashboard that asks “average temperature per hour over the last 30 days” against raw 1 Hz data scans about 2.6 million rows per device on every page load. Pre-compute it.
TimescaleDB calls this a continuous aggregate; InfluxDB handles it with a downsampling task. Both build a separate, pre-aggregated table that updates as new data arrives. Dashboards query the aggregate, not the raw data, and load instantly.
The cost is some duplicated storage and a slightly more complex schema. The benefit is dashboards that respond in milliseconds rather than seconds.
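A minimal sketch of the TimescaleDB side, reusing the hypothetical `readings` table from earlier; the view name and refresh windows are placeholders:

```python
import psycopg2

conn = psycopg2.connect("dbname=telemetry")
conn.autocommit = True  # creating a continuous aggregate cannot run inside a transaction
with conn.cursor() as cur:
    # Pre-aggregate to hourly buckets; this is what the dashboard will query.
    cur.execute("""
        CREATE MATERIALIZED VIEW readings_hourly
        WITH (timescaledb.continuous) AS
        SELECT time_bucket('1 hour', time) AS bucket,
               device_id,
               avg(temp_c) AS avg_temp_c
        FROM readings
        GROUP BY bucket, device_id
    """)
    # Refresh the trailing 3 days of buckets every hour as new data lands.
    cur.execute("""
        SELECT add_continuous_aggregate_policy('readings_hourly',
            start_offset      => INTERVAL '3 days',
            end_offset        => INTERVAL '1 hour',
            schedule_interval => INTERVAL '1 hour')
    """)
```

The dashboard then selects from `readings_hourly` instead of the raw table.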
What kills these systems in production
Three failure modes we have seen more than once:
- Hot writes thrashing index updates. Every insert touches several indexes. At high enough rates, you spend more CPU on index maintenance than on writes. Mitigation: bulk insert, drop unused indexes, partition aggressively.
- Cardinality explosion. Tagging readings with high-cardinality fields (user IDs, session IDs) bloats the index and kills query performance. Keep tags low-cardinality; put high-cardinality values in fields, not tags.
- Dashboards that scan the world. A dashboard query without a time bound or with a stale time bound will, eventually, read all of history. Add hard limits at the query layer.
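For that last failure mode, a sketch of what a hard limit at the query layer can look like; the window size and function name are our own assumptions, not a library API:

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(days=31)  # hard ceiling per query; tune per product

def clamp_time_range(start, end):
    """Refuse to build a query that would scan all of history."""
    now = datetime.now(timezone.utc)
    end = min(end or now, now)
    if start is None or end - start > MAX_WINDOW:
        # Clamp rather than error, so a stale dashboard degrades gracefully
        # instead of paging the on-call.
        start = end - MAX_WINDOW
    return start, end
```

Every dashboard query passes through a guard like this before it reaches the database.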
What we hand over
Operational time-series systems get documented as runbooks: how to add a new metric, how to change a retention policy, how to spot a hot-shard before it pages, what every alert means. Without that, the database is a black box your team cannot maintain.
If your fleet is approaching the point where the database is the bottleneck, we have re-architected this layer more than once.