Cloud · 6 min read

Time-Series Data Architectures for IoT Telemetry

The patterns we use to ingest, store, and query high-volume IoT telemetry — and the failure modes that bite when you try to use a generic database for it.

#Time-series #TimescaleDB #InfluxDB #Architecture #Cloud

A thousand devices reporting once per minute is 1.4 million records per day. That fits in any database. Ten thousand devices reporting once per second is almost a billion records per day. That does not. The architecture you pick determines how much grief you absorb between those two numbers.
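The arithmetic behind those two numbers is worth keeping handy. A throwaway helper (plain Python, no dependencies):

```python
def records_per_day(devices: int, interval_s: float) -> int:
    """Daily record count for a fleet reporting at a fixed interval."""
    return int(devices * 86_400 / interval_s)

print(records_per_day(1_000, 60))   # 1,440,000 -- fits in any database
print(records_per_day(10_000, 1))   # 864,000,000 -- does not
```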

What makes time-series data different

Three characteristics:

  1. Append-only writes dominate. You almost never update an old reading. You almost never delete one until it ages out.
  2. Recent data is hot, old data is cold. Most queries hit the last hour, day, or week. Queries against last year’s data exist but are rare.
  3. Aggregations matter more than raw rows. “Average per minute over the last week” is the question users ask, not “give me row 8,234,891.”

These three together change the optimal storage layout. Row-store databases tuned for transactional workloads waste space and CPU on this pattern.

The decision tree

For most IoT telemetry workloads, the choice comes down to three options.

  • TimescaleDB if your team already knows PostgreSQL. Hypertables, continuous aggregates, and compression turn Postgres into a competent time-series database. You inherit the entire Postgres ecosystem — JSON support, joins to relational tables, materialized views, your existing operational expertise.
  • InfluxDB if pure time-series is the workload and you want a database designed for it from the ground up. Sharper write performance, native downsampling, simpler operational model. Less flexibility for relational queries.
  • Cloud-managed offerings (AWS Timestream, Azure Data Explorer, Google BigQuery for time-series) when team capacity is the constraint. You pay more per byte stored, you do not manage the database, and you trade vendor flexibility for operational simplicity.

There are valid reasons to go further afield — ClickHouse, QuestDB, VictoriaMetrics — but the three above cover 80% of IoT projects without surprises.

The architecture we keep ending up with

A pattern that has survived several scale-ups:

Devices  →  MQTT broker  →  Stream processor  →  Hot store
                                   │
                                   ▼
                            Cold archive (S3 / GCS, Parquet)
                                   │
                                   ▼
                            Query layer (Presto, Athena, or DuckDB)

The layers do specific jobs:

  • MQTT broker (Mosquitto, EMQX, or managed AWS IoT Core) accepts the firehose, handles backpressure, and decouples ingestion from storage.
  • Stream processor (Kafka Streams, Flink, or a simpler Lambda/Cloud Run consumer) does the lightweight transforms — unit conversion, calibration, basic anomaly flagging — before persistence. This is the layer where you enrich raw readings with device metadata.
  • Hot store holds the recent N days at high resolution for live dashboards and alerting. This is your TimescaleDB or InfluxDB instance, sized for the working set.
  • Cold archive holds raw historical data in object storage as Parquet or compressed columnar files. Cheap to retain, easy to query for ad-hoc analysis.
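The stream-processor stage can be sketched as a plain function, independent of whether it runs in Flink, Kafka Streams, or a Lambda. Everything here is illustrative, not a real API: the device-metadata registry, the field names, and the Fahrenheit payload are assumptions.

```python
from dataclasses import dataclass

# Hypothetical device-metadata registry; in practice this comes from your
# device-management store. Names and fields are illustrative.
DEVICE_META = {
    "dev-042": {"site": "plant-a", "calibration_offset": -0.3},
}

@dataclass
class Reading:
    device_id: str
    temp_f: float   # raw payload reports Fahrenheit
    ts: int         # epoch seconds

def transform(r: Reading) -> dict:
    """Unit conversion + calibration + enrichment + basic anomaly
    flagging, applied before the reading hits the hot store."""
    meta = DEVICE_META.get(r.device_id, {})
    temp_c = (r.temp_f - 32) * 5 / 9 + meta.get("calibration_offset", 0.0)
    return {
        "device_id": r.device_id,
        "site": meta.get("site", "unknown"),
        "temp_c": round(temp_c, 2),
        "ts": r.ts,
        "anomaly": not (-40.0 <= temp_c <= 85.0),  # crude range check
    }
```

Keeping the transform a pure function of (reading, metadata) makes it trivial to unit-test and to move between runtimes as the pipeline evolves.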

This separation matters. Stuffing everything into the hot store works until the storage bill exceeds the budget; pretending the cold archive is fast enough for dashboards works until the support team starts complaining.

Compression is the cheat code

Both TimescaleDB and InfluxDB support compression of older data, typically a 10x to 30x reduction on IoT readings with little query overhead. Most teams forget to enable it until the first alarming storage bill arrives. Doing it right costs a few configuration lines; doing it late costs months of storage at full price.

Set compression policies on day one:

  • Recent N days uncompressed, fully indexed, fast for dashboards.
  • Older than N days compressed, still queryable, much cheaper.
  • Older than M months archived to object storage, queryable through a separate path.
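The three tiers reduce to a single rule mapping a reading's age to a storage tier. A minimal sketch, with N = 7 days and M = 6 months picked purely for illustration (in TimescaleDB the compressed tier would be driven by a compression policy on the hypertable rather than application code):

```python
from datetime import timedelta

# Illustrative thresholds -- the "N days" and "M months" from the policy above.
UNCOMPRESSED_DAYS = 7
ARCHIVE_MONTHS = 6

def storage_tier(age: timedelta) -> str:
    """Which tier a reading of a given age lands in."""
    if age <= timedelta(days=UNCOMPRESSED_DAYS):
        return "hot-uncompressed"   # fully indexed, fast for dashboards
    if age <= timedelta(days=ARCHIVE_MONTHS * 30):
        return "hot-compressed"     # still queryable, much cheaper
    return "cold-archive"           # object storage, separate query path
```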

Continuous aggregates beat dashboard queries

A dashboard that asks “average temperature per hour over the last 30 days” against raw 1-Hz data scans about 2.6 million rows per device on every page load. Pre-compute it.

TimescaleDB calls this a continuous aggregate. InfluxDB implements it as a downsampling task. Both maintain a separate, pre-aggregated table that updates as new data arrives. Dashboards query the aggregate, not the raw data, and load instantly.

The cost is some duplicated storage and a slightly more complex schema. The benefit is dashboards that respond in milliseconds rather than seconds.
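The reduction itself is nothing exotic. This sketch shows the batch version of what a continuous aggregate maintains incrementally: bucketing raw (timestamp, value) pairs into per-hour means.

```python
from collections import defaultdict

def hourly_averages(readings):
    """Downsample (epoch_seconds, value) pairs into per-hour means --
    the same reduction a continuous aggregate keeps up to date."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[ts // 3600].append(value)
    return {hour * 3600: sum(vals) / len(vals) for hour, vals in buckets.items()}
```

The point of the database feature is that this runs incrementally as data arrives, instead of re-scanning 30 days of raw rows on every dashboard load.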

What kills these systems in production

Three failure modes we have seen more than once:

  • Hot writes thrashing index updates. Every insert touches several indexes. At high enough rates, you spend more CPU on index maintenance than on writes. Mitigation: bulk insert, drop unused indexes, partition aggressively.
  • Cardinality explosion. Tagging readings with high-cardinality fields (user IDs, session IDs) bloats the index and kills query performance. Keep tags low-cardinality; put high-cardinality values in fields, not tags.
  • Dashboards that scan the world. A dashboard query without a time bound or with a stale time bound will, eventually, read all of history. Add hard limits at the query layer.
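The third failure mode has a mechanical fix: clamp every query's time range at the query layer before it reaches the database. A minimal sketch, with the 90-day cap chosen arbitrarily for illustration:

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(days=90)  # hard limit; tune to your hot store's retention

def clamp_time_range(start, end):
    """Clamp a dashboard query's [start, end) window so a missing or
    stale time bound can never scan all of history."""
    if end is None:
        end = datetime.now(timezone.utc)
    if start is None or end - start > MAX_WINDOW:
        start = end - MAX_WINDOW
    return start, end
```

Whether you clamp silently or reject with an error is a product decision; either beats letting an unbounded scan reach the database.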

What we hand over

Operational time-series systems get documented as runbooks: how to add a new metric, how to change a retention policy, how to spot a hot-shard before it pages, what every alert means. Without that, the database is a black box your team cannot maintain.

If your fleet is approaching the point where the database is the bottleneck, we have re-architected this layer more than once.

By Diglogic Engineering · March 18, 2026


Ready to ship

Let's get started.

Tell us about the problem. We come back within one business day with a clear path, a timeline you can plan around, and a fixed-scope first milestone.