IoT Platform Migration: Strategies for Switching Cloud Providers
How to migrate IoT workloads between AWS IoT, Azure IoT, and self-hosted platforms — the patterns that minimise risk and the gotchas that bite mid-migration.
IoT platform migrations have a reputation for being ugly. Most are. The ones that aren’t share a small set of practices: a layer of indirection between devices and the platform, a phased rollout that doesn’t depend on a single big-bang cutover, and honest scope discipline about what migrates and what doesn’t.
Here is the framework we use on real migrations.
When migration is actually justified
Three reasons survive close inspection:
1. Cost trajectory is unsustainable. The fleet has grown to a size where the managed-cloud bill exceeds what self-hosted or a different provider would cost, by a margin large enough to fund the migration itself plus the ongoing operational cost of the new platform.
2. Customer or compliance requirement. A new contract with a customer who specifies the cloud, or a regulatory shift (data residency, sovereignty) that the current provider can’t accommodate.
3. Strategic rebuild. The platform is being rewritten anyway — for capabilities the current provider can’t support — and the migration is bundled with the rewrite.
Migrations not justified by one of these tend to under-deliver. “We’re tired of AWS” is not a reason that survives the budget meeting.
The architectural prerequisite
A migration is meaningfully easier if the architecture has these four boundaries in place before you start. If it doesn't, building them is the first phase.
- Device identity decoupled from the cloud. Devices use mutual TLS with X.509 certs from a CA you control, not from a cloud-service-provided CA. This is the single biggest predictor of migration ease.
- Protocol-portable message format. Protobuf or Avro with a schema registry. JSON works too if you control the structure. Cloud-specific encodings (AWS IoT's binary device shadow format, for instance) are migration friction.
- Application logic outside the cloud's rules engine. Routing decisions in your code (Lambda, container services, stream processors), not in cloud-specific rules-engine SQL. Rules engines are seductive and migration-hostile.
- Single observability platform. If you observe the IoT platform through Datadog, Grafana, or similar — a third party that runs across clouds — migration loses one dimension of pain. Per-cloud monitoring tools mean you also migrate observability.
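As a sketch of what a protocol-portable message format can look like, here is a minimal JSON envelope with an explicit schema version. The field names (`schema`, `device_id`, `metrics`) are illustrative, not a standard — the point is that any MQTT broker can carry this payload unchanged:

```python
import json

SCHEMA_VERSION = 2  # bump when the payload structure changes

def encode_telemetry(device_id: str, metrics: dict) -> bytes:
    """Wrap metrics in a self-describing envelope; any broker can carry it."""
    envelope = {
        "schema": SCHEMA_VERSION,
        "device_id": device_id,
        "metrics": metrics,
    }
    return json.dumps(envelope, separators=(",", ":")).encode("utf-8")

def decode_telemetry(payload: bytes) -> dict:
    """Reject unknown schema versions instead of guessing at the structure."""
    envelope = json.loads(payload)
    if envelope.get("schema") != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version: {envelope.get('schema')}")
    return envelope
```

A schema registry with Protobuf or Avro gives you the same property with stronger guarantees; the JSON version above is simply the lowest-friction starting point.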
For deeper coverage of these boundaries see our multi-cloud post.
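One way to keep routing decisions out of a cloud rules engine is a plain dispatch table in your own stream processor. The topic patterns and destination names below are hypothetical placeholders for your own sinks:

```python
# Destination names are placeholders for your own sinks
# (warehouse loader, alerting pipeline, state store, ...).
ROUTES = [
    ("devices/+/alerts",    ["alerting", "warehouse"]),
    ("devices/+/telemetry", ["warehouse"]),
    ("devices/+/shadow",    ["state-store"]),
]

def route(topic: str) -> list[str]:
    """Return destinations for a topic. '+' matches exactly one MQTT level."""
    segments = topic.split("/")
    for pattern, destinations in ROUTES:
        parts = pattern.split("/")
        if len(parts) == len(segments) and all(
            p == "+" or p == s for p, s in zip(parts, segments)
        ):
            return destinations
    return []  # unmatched topics are dropped (or sent to a dead-letter queue)
```

Because this table lives in your code, moving providers means re-pointing the input stream — not translating rules-engine SQL.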
Migration patterns
Pattern A — DNS / endpoint switch
The simplest pattern when the device fleet supports remote configuration.
- Stand up the new platform alongside the old
- Replicate device registry from old to new
- Push a config update to a small cohort of devices changing their MQTT endpoint to the new platform
- Watch behaviour for a defined window
- Expand cohorts until 100% migrated
- Decommission the old platform
When this works: devices accept remote config changes, both platforms can coexist, and the broker endpoint is a string the device reads from config rather than a hardcoded constant.
Risks: devices that don’t accept the config update become stranded on the old platform. Plan for a long tail.
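Cohort expansion works best when membership is a stable hash of the device ID rather than a random draw, so a device that migrated at 10% stays migrated at 50%. A minimal sketch (function names are illustrative):

```python
import hashlib

def in_cohort(device_id: str, rollout_percent: int) -> bool:
    """Stable bucket in [0, 100): a device in the 10% cohort is
    guaranteed to remain in every larger cohort."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent

def endpoint_for(device_id: str, rollout_percent: int,
                 old: str, new: str) -> str:
    """The MQTT endpoint the next config push should carry for this device."""
    return new if in_cohort(device_id, rollout_percent) else old
```

Expanding a cohort is then a one-line change to `rollout_percent`, and rolling back never strands a device on an endpoint it was never assigned.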
Pattern B — Dual-publish
Devices publish to both platforms simultaneously during the migration window. Consumers gradually shift from old to new.
- Devices receive a firmware update that publishes to both platforms
- Cloud-side consumers (analytics, ERP integration) initially read from the old; gradually shift to read from the new
- Once all consumers are on the new platform and traffic is verified, devices stop publishing to the old
When this works: firmware can be updated, devices have spare bandwidth, the cost of dual-publishing is acceptable for the migration window.
Risks: dual-publishing roughly doubles uplink traffic for the migration window. Devices on tight cellular plans may not tolerate it.
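The device-side logic for dual-publish can be sketched as a thin fan-out wrapper. The `publish_old` / `publish_new` callables stand in for bound methods of two MQTT clients (hypothetical wiring); the essential property is that a failure on one path is logged but never blocks the other:

```python
import logging

logger = logging.getLogger("dual-publish")

class DualPublisher:
    """Fan a publish out to both platforms during the migration window."""

    def __init__(self, publish_old, publish_new):
        # Each target is any callable taking (topic, payload).
        self._targets = [("old", publish_old), ("new", publish_new)]

    def publish(self, topic: str, payload: bytes) -> dict:
        """Attempt both platforms; return per-platform success flags."""
        results = {}
        for name, fn in self._targets:
            try:
                fn(topic, payload)
                results[name] = True
            except Exception:
                logger.exception("publish to %s platform failed", name)
                results[name] = False
        return results
```

The per-platform result dict also gives you a cheap migration metric: the ratio of new-platform successes to old is exactly the parity signal the verification window needs.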
Pattern C — Bridge migration
A bridge service replays messages from the old platform to the new during a transition window.
- Devices keep publishing to the old platform unchanged
- A bridge service subscribes to the old platform and republishes to the new
- Cloud-side consumers move from old to new at their own pace
- Once all consumers are on the new platform, devices migrate (via Pattern A or B)
When this works: firmware updates are slow or risky and you want to migrate cloud-side consumers first.
Risks: the bridge becomes a critical path with a single point of failure. Operational maturity required.
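The core of a bridge service is small: subscribe to the old platform, rewrite any cloud-specific topic prefix, republish to the new. A sketch of that core as pure functions — in practice the handler returned here would be driven by an MQTT client's message callback, and the `$aws/things/` prefix mapping is illustrative:

```python
def remap_topic(topic: str, prefix_map: dict[str, str]) -> str:
    """Rewrite a cloud-specific topic prefix; pass through unchanged otherwise."""
    for old_prefix, new_prefix in prefix_map.items():
        if topic.startswith(old_prefix):
            return new_prefix + topic[len(old_prefix):]
    return topic

def make_bridge(republish, prefix_map: dict[str, str]):
    """Build an on-message handler: subscribe it to the old platform and it
    forwards every message to the new one via `republish(topic, payload)`."""
    def on_message(topic: str, payload: bytes) -> None:
        republish(remap_topic(topic, prefix_map), payload)
    return on_message
```

Everything else about a production bridge — reconnect handling, backpressure, metrics on forwarded counts — is what makes it a critical path; the forwarding logic itself stays this simple.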
Pattern D — Strangler-fig (the safest, slowest)
A combination of B and C. New devices ship straight to the new platform. Existing devices migrate via dual-publish. Cloud-side consumers migrate via the bridge. Eventually only a long tail of unmigrated devices remains on the old platform; those are decommissioned as part of natural lifecycle replacement.
When this works: you have time. This is the lowest-risk migration but takes 12–18 months to complete.
Risks: management overhead of running two platforms in parallel for a year-plus. Make sure the operational cost is budgeted.
The non-architectural work
The architecture matters; the operations matter more.
1. Data migration. Historical data lives in databases the old platform feeds. Migrate the data, not just the platform — or accept that historical queries will need to span two stores during a transition window. Plan the cutover.
2. Identity migration. If devices are using cloud-CA-issued certificates, you need to issue new certs from a CA that works on the new platform. This is a fleet-wide credential rotation, not a trivial config change.
3. Integration re-points. ERP, CRM, BI integrations need updating. The webhook URL changes, the streaming consumer changes, the data warehouse loader changes. Inventory every integration before starting; cut them over deliberately, not opportunistically.
4. Operational training. The team has to operate the new platform competently from day one of any production traffic. Run game days; document runbooks; rotate the on-call team through the new platform before you need them to actually be on call.
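For the credential rotation in item 2, a common device-side pattern is to provision both cert pairs and try the new one first. A sketch with injected connect callables (stand-ins for mTLS connect attempts with each cert pair — the wiring is hypothetical):

```python
def connect_with_fallback(connect_new, connect_old) -> str:
    """Attempt the new-CA credential first; fall back to the old one.

    `connect_new` / `connect_old` are callables that raise on failure.
    Returns which credential worked, so the fleet can report rotation
    progress as telemetry.
    """
    try:
        connect_new()
        return "new"
    except Exception:
        pass
    connect_old()  # let this one raise: no credential works at all
    return "old"
```

Reporting the return value back as telemetry turns rotation progress into a dashboard number instead of a guess, and devices stuck on "old" are your stranded-tail list.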
What we typically do
For a customer migrating from AWS IoT Core to a self-hosted EMQX platform (a representative recent engagement):
- Phase 1 (4 weeks): stand up EMQX on AKS/EKS in parallel; replicate device registry; build the bridge service for dual-publish during transition.
- Phase 2 (4 weeks): firmware update for 10% of fleet to dual-publish. Verify behaviour. Migrate one cloud-side consumer (typically the analytics warehouse loader) to read from EMQX.
- Phase 3 (8 weeks): expand fleet cohort to 100%. Migrate remaining cloud-side consumers. Validate parity for a full month.
- Phase 4 (2 weeks): firmware update to disable dual-publish. Decommission AWS IoT Core. Final cost reconciliation.
Total: ~4 months for a 100k-device fleet, with most of the schedule spent on validation, not change.
What we hand over
For a migration engagement we ship:
- Migration plan with phase gates and rollback criteria
- Bridge service (if applicable) with operational runbook
- Cohort definitions and expansion schedule
- Identity / credential rotation plan
- Data-migration runbook
- Integration inventory with cutover order
- Cost reconciliation comparing pre-migration and post-migration steady state
If you are weighing a migration — or in the middle of one that is going slowly — we have shipped this work before.
Keep reading

- AWS IoT Core Architecture Patterns: Fleet Provisioning, Rules, Shadows. How to architect on AWS IoT Core in 2026 — provisioning, rules engine, device shadow, and the patterns that age well at 10k, 100k, and 1M devices.
- Azure IoT Hub vs IoT Central: When to Pick Each. A practical comparison of Azure IoT Hub and Azure IoT Central in 2026 — when the managed Central experience wins, when raw Hub is the right call.
- IoT Edge Gateway Patterns: Architecture, Local Processing & Sync. Architectural patterns for IoT edge gateways in 2026 — local processing, store-and-forward, edge AI, and the operational realities of running compute at the edge.