AWS IoT Core Architecture Patterns: Fleet Provisioning, Rules, Shadows
How to architect on AWS IoT Core in 2026 — provisioning, rules engine, device shadow, and the patterns that age well at 10k, 100k, and 1M devices.
AWS IoT Core has accumulated enough features over the years that “what to use” is no longer obvious. Here are the architectural patterns that age well — through 10k devices, through 100k, through migration to v2 of the product, and through the inevitable team turnover.
The four building blocks worth knowing well
Most successful AWS IoT architectures use four primitives:
- Device gateway — the MQTT broker, plus mTLS authentication
- Fleet provisioning — how devices first introduce themselves to the platform
- Rules engine — routes incoming messages to other AWS services
- Device shadow — last-known-state and desired-state document per device
Around these four are application-specific layers (DynamoDB for device metadata, Timestream or TimescaleDB for telemetry, EventBridge for event fan-out). Get the four right and the application layer is straightforward; get them wrong and every downstream choice fights you.
Pattern 1 — Just-in-time provisioning (JITP)
The cleanest approach for products that ship with a CA-signed device certificate baked in at manufacturing.
- Manufacturing burns a device certificate signed by your CA
- AWS IoT Core registers the CA as a trusted issuer
- On first connection, AWS auto-provisions the device, attaches default policies, and creates a Thing
- A Lambda hook can run additional logic (mapping to customer account, enabling specific features)
When to use: product line ships with a known CA. Volume justifies running a CA (which is real engineering — keys, lifecycle, audits).
When not: small fleets where managing a CA is overkill. Use fleet provisioning by claim instead: devices come with a shared bootstrap certificate, then get their per-device certificate on first contact.
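The claim flow supports a pre-provisioning hook: a Lambda that AWS IoT invokes before applying the provisioning template, which can reject devices it does not recognise. A minimal sketch, assuming a `SerialNumber` template parameter and an in-code allow-list standing in for a real manufacturing database:

```python
# Hypothetical allow-list; in practice this would be a lookup against
# serial numbers registered at manufacturing time (e.g. in DynamoDB).
KNOWN_SERIAL_PREFIXES = ("SN-A", "SN-B")

def pre_provisioning_hook(event, context):
    """Pre-provisioning hook for fleet provisioning by claim.

    AWS IoT invokes this Lambda before applying the provisioning
    template; returning allowProvisioning=False rejects the device.
    """
    serial = event.get("parameters", {}).get("SerialNumber", "")
    if not serial.startswith(KNOWN_SERIAL_PREFIXES):
        return {"allowProvisioning": False}
    return {
        "allowProvisioning": True,
        # Values returned here become provisioning template parameters.
        "parameters": {"SerialNumber": serial},
    }
```

The same hook is a natural place for the per-customer mapping mentioned under JITP.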
For broader provisioning patterns see our IoT provisioning post.
Pattern 2 — Rules engine for fan-out, not for business logic
The rules engine is great at “if this MQTT topic, send to that AWS service.” It is not where complex business logic should live.
A defensible pattern:
- Rule 1: route every telemetry message to Kinesis for downstream processing
- Rule 2: route alerts (high-priority topic) to SNS for paging
- Rule 3: route lifecycle events to EventBridge for fan-out to multiple consumers
Then your business logic — enrichment, alerting decisions, ERP integration — runs in Lambda or container services consuming from those streams. The rules engine stays thin and reviewable.
The trap: building a 50-line SQL filter inside a rule. When the logic outgrows two filter conditions, move it to a real service.
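To make "thin and reviewable" concrete, here is a sketch of the three rules as IoT SQL statements, plus a simple guardrail that flags rules whose WHERE clause has grown past two conditions. Topic names are illustrative assumptions, not prescriptions:

```python
# Thin rules: topic-based routing only. Each maps to one rule action
# (Kinesis, SNS, EventBridge); topic structure is an assumption.
RULES = {
    # Rule 1: all telemetry to Kinesis for downstream processing.
    "telemetry_to_kinesis": "SELECT * FROM 'dt/+/telemetry'",
    # Rule 2: high-priority alerts to SNS for paging.
    "alerts_to_sns": "SELECT * FROM 'alerts/+/critical'",
    # Rule 3: lifecycle events to EventBridge fan-out.
    "lifecycle_to_eventbridge": "SELECT * FROM '$aws/events/presence/#'",
}

def rule_is_thin(sql: str, max_conditions: int = 2) -> bool:
    """Review guardrail: True while the WHERE clause has at most
    max_conditions filter terms; False once business logic creeps in."""
    _, _, where = sql.partition(" WHERE ")
    if not where:
        return True
    connectors = where.upper().count(" AND ") + where.upper().count(" OR ")
    return connectors < max_conditions
```

A check like this can run in CI against the Terraform-managed rule definitions, keeping the "two filter conditions" rule of thumb enforced rather than aspirational.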
Pattern 3 — Device shadow as the configuration channel
Device shadows are underused: most teams dismiss them as a toy feature, then re-implement the same state-sync pattern badly on top of raw MQTT.
Use cases that work:
- Configuration push: the system of record (an ERP holding contract terms, for example) writes to the shadow's desired state. The device reconciles its config against the shadow on connect. Last-write-wins with explicit version numbering.
- Last-known-state for dashboards: dashboards read the reported-state, not the live MQTT stream. Shadows are designed for this; live MQTT subscriptions are not.
- Sparse intermittent connectivity: the device goes offline, the shadow accumulates desired-state changes, the device catches up on next connect.
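The configuration-push reconciliation above can be sketched device-side as a pure function over the shadow document returned on `$aws/things/<name>/shadow/get/accepted`. The `state.desired` and top-level `version` fields follow the documented shadow format; the shallow-merge policy is one reasonable choice, not the only one:

```python
def reconcile_config(shadow_doc: dict, local_config: dict,
                     local_version: int):
    """Reconcile local device config against a shadow document
    fetched on connect.

    Last-write-wins: the shadow's monotonically increasing version
    decides; anything not newer than what we already applied is
    ignored, which makes re-delivery of the same document harmless.
    """
    version = shadow_doc.get("version", 0)
    if version <= local_version:
        return local_config, local_version  # nothing newer to apply
    desired = shadow_doc.get("state", {}).get("desired", {})
    merged = {**local_config, **desired}  # desired keys win
    return merged, version
```

After applying the merged config, the device would publish it back as reported state so dashboards and the reconciliation loop converge.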
What does not belong in shadow:
- High-frequency telemetry (each shadow update is billed; do not shadow your sensor stream)
- Large blobs (8 KB document size limit; OTA goes via S3 presigned URLs, not shadow)
- Sensitive secrets (shadows are encrypted in transit and at rest, but visible to anyone with shadow read permission; treat them as semi-public)
Pattern 4 — Greengrass for edge orchestration
If devices have meaningful local compute (Linux SBC, industrial gateway), AWS IoT Greengrass provides the edge runtime that mirrors cloud-native patterns at the edge.
Pick Greengrass when:
- Multiple devices live behind a single gateway and need local communication
- ML inference happens at the edge (Greengrass + SageMaker Edge)
- Connectivity is intermittent and you need local processing during outages
- Component-based deployment matches your team’s ops style
Skip Greengrass when:
- Devices are tiny (microcontrollers — too small for the runtime)
- Connectivity is reliable and edge logic is light
- The team is uncomfortable operating a Linux runtime in field conditions
Pattern 5 — Cost-aware architecture
AWS IoT pricing has bite. Three knobs that move the bill significantly:
- Connection-time billing. Devices are billed per minute of connection; a device that disconnects and reconnects repeatedly for the same payload pays for the connection clock each time.
- Message size and frequency. Messages are metered in 5 KB increments, so a 6 KB message bills as two; design payload sizes accordingly.
- Rules-engine action billing. Each rule action (republish, write to DynamoDB, invoke Lambda) is billed separately. Coalesce where possible.
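The 5 KB metering and its cost impact reduce to back-of-envelope arithmetic. The per-million-message rate below is an illustrative first-tier figure, not a quote; check current AWS IoT pricing for your region, and remember connection minutes, rule actions, and shadow operations bill on top:

```python
import math

# Illustrative first-tier rate (assumption); verify against current pricing.
PRICE_PER_MILLION_MESSAGES = 1.00     # USD
METERING_INCREMENT_BYTES = 5 * 1024   # messages metered in 5 KB increments

def billed_message_units(payload_bytes: int) -> int:
    """A 6 KB payload meters as two messages; 5 KB or less as one."""
    return max(1, math.ceil(payload_bytes / METERING_INCREMENT_BYTES))

def monthly_messaging_cost(devices: int, msgs_per_minute: float,
                           payload_bytes: int) -> float:
    """Messaging line item only, assuming a 30-day month."""
    units = billed_message_units(payload_bytes)
    messages = devices * msgs_per_minute * 60 * 24 * 30 * units
    return messages / 1_000_000 * PRICE_PER_MILLION_MESSAGES
```

At 100k devices sending one sub-5 KB message per minute this comes out around $4.3k/month for messaging alone, which is how the $5k–$15k range below arises once the other meters are added.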
For a 100k-device fleet sending one message per minute, expect IoT Core costs alone in the $5k–$15k/month range depending on configuration. Total cloud bill (storage, compute, egress) typically 2–3x that. See our IoT cloud cost post for the full breakdown.
What we typically build
For a new AWS IoT product:
- Provisioning: Fleet provisioning by claim for first product, JITP once volume justifies the CA work
- Auth: mutual TLS with X.509, per-device certificates rotated annually
- Rules engine: thin — one rule per major message class, fanning out to Kinesis / EventBridge / SNS
- Shadow: used for configuration push and last-known-state; not for telemetry
- Telemetry sink: Kinesis → Timestream (live) + S3 (archive in Parquet via Firehose)
- Application: Lambda or Fargate consuming from Kinesis, writing to DynamoDB and external systems via the integration patterns from our IoT-to-ERP post
- Observability: CloudWatch + a third-party APM (Datadog or New Relic) for cross-service tracing
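As a concrete example of the application layer, a minimal Lambda consumer for the Kinesis telemetry stream might look like the sketch below. The payload shape and the absence of batching, error handling, and the DynamoDB/Timestream writes are simplifications:

```python
import base64
import json

def telemetry_handler(event, context):
    """Lambda consumer for the Kinesis telemetry stream.

    Kinesis delivers records base64-encoded; decode and parse each
    one. A real handler would write readings to Timestream/DynamoDB
    instead of returning them.
    """
    readings = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        readings.append(json.loads(payload))
    return readings
```

Keeping enrichment and alerting decisions here, rather than in rule SQL, is what lets the rules engine stay thin.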
This stack scales smoothly from 1k to 1M devices. The migration points (move beyond Lambda, switch from Timestream to TimescaleDB on Aurora, etc.) are well-understood.
What we hand over
For an AWS IoT engagement we ship:
- Architecture document with the four primitives configured
- Terraform (or CDK) for the entire IoT setup, version-controlled
- Per-environment provisioning automation
- A cost projection at 1k, 10k, and 100k devices with line-item breakdown
- A migration playbook for known-future changes (custom CA, multi-region, federation)
If you are weighing AWS IoT Core for a new product or scaling an existing one, we have shipped this stack across many engagements.
Keep reading
- Azure IoT Hub vs IoT Central: When to Pick Each. A practical comparison of Azure IoT Hub and Azure IoT Central in 2026 — when the managed Central experience wins, when raw Hub is the right call.
- IoT Edge Gateway Patterns: Architecture, Local Processing & Sync. Architectural patterns for IoT edge gateways in 2026 — local processing, store-and-forward, edge AI, and the operational realities of running compute at the edge.
- IoT Platform Migration: Strategies for Switching Cloud Providers. How to migrate IoT workloads between AWS IoT, Azure IoT, and self-hosted platforms — the patterns that minimise risk and the gotchas that bite mid-migration.