Two Ways to Update Embedded Devices Over-The-Air
Whether the goal is to install the latest security patch, delight customers with new features, or fix bugs, every company must be able to deploy over-the-air (OTA) software updates to its fleet of devices. There are fundamentally two methods for updating a connected device remotely, be it a gateway, a powerful edge computing device, or a smaller IoT device.
The first method is called a system update. In a system update, the whole disk partition on the device is replaced. Even if the change is limited to a single file, the update includes all software and firmware on the device.
The second method is called an application update. In an application update, only certain parts of the software on the disk are updated. If there is only a minor change in a file, the update includes only the new version of that file.
Which update method to choose depends on your needs. Below is a brief explanation of the two methods and their pros and cons, to help you decide which one is best for your specific use case.
Method 1: System updates
A system update either overwrites an existing running partition (if there is only one partition on the device), or writes to a second passive partition (if there are two partitions on the device).
Traditionally, embedded devices have used only the system update approach, and it remains the predominant approach among users of the Yocto Project, one of the world’s most popular projects for creating embedded Linux distributions.
For devices equipped with two partitions, system updates can provide an atomic and recoverable update procedure. The update will be written to the passive partition. Once the update has been successfully transferred to the device, the device will reboot into the passive partition which thereby becomes the new active partition. If the update causes any failures, the device can easily roll back to the original active partition to preserve uptime.
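The dual-partition flow described above can be sketched as a toy model. This is a minimal illustration, not how any particular updater or bootloader implements it: the `Device` class, the slot names `A` and `B`, and the `health_check` callback are all hypothetical, and the "reboot" is just a variable assignment.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Toy model of a device with two system partitions, 'A' and 'B'."""
    slots: dict = field(default_factory=lambda: {"A": "v1.0", "B": None})
    active: str = "A"

    @property
    def passive(self) -> str:
        return "B" if self.active == "A" else "A"

    def install(self, image: str, health_check) -> bool:
        """Write image to the passive slot, boot it, and roll back on failure."""
        previous = self.active
        target = self.passive
        self.slots[target] = image   # the running system is left untouched
        self.active = target         # simulated reboot into the new slot
        if health_check(image):
            return True              # commit: the new slot stays active
        self.active = previous       # rollback: boot the old slot again
        return False


device = Device()
device.install("v1.1", health_check=lambda image: True)
device.install("v1.2-broken", health_check=lambda image: "broken" not in image)
print(device.active, device.slots[device.active])  # prints: B v1.1
```

The second install fails its health check, so the device ends up back on the slot holding the last known-good image. On real hardware the switch and the rollback are typically coordinated with the bootloader, which this sketch leaves out.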
For users who seek a “brick-free” and redundant update process, a system update is the most robust and safest update method.
The main drawback of system updates is the additional hardware cost: two partitions require double the storage capacity. Another drawback is that, unless the update uses some form of delta-based mechanism, even the smallest change can require hundreds of megabytes of binary transfer, because the whole partition is updated. This can be both time-consuming and costly if it takes place over cellular networks. Both drawbacks bear on how often updates can be deployed: because of the traffic each update generates, this method is less suitable for continuous or very frequent updates.
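To give a feel for why a delta-based mechanism reduces transfer size, here is a minimal sketch of a block-level comparison between two partition images. It is a simplified stand-in for real delta algorithms, not the method used by any specific tool; the 4096-byte `BLOCK_SIZE` and the function names are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every fixed-size block of the image."""
    return [
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks in `new` that differ from `old` or are new."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or digest != old_hashes[i]
    ]


old_image = bytes(1024 * 1024)          # 1 MiB of zeros as a stand-in image
new_image = bytearray(old_image)
new_image[5000] = 1                     # change a single byte
delta = changed_blocks(old_image, bytes(new_image))
print(delta, len(delta) * BLOCK_SIZE)   # prints: [1] 4096
```

In this toy case a one-byte change means sending one 4 KiB block instead of the full 1 MiB image. The sketch only detects in-place changes and appended blocks; content that shifts position, or an image that shrinks, would need a more capable diff.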
If cost is essential, “brick-free” is a must, and downtime is not critical, one option is to use a small recovery partition instead of a full second partition. The recovery partition contains just enough logic to report update failures to the management server and receive a new update. If a system update goes wrong, the device boots into recovery and stays there until a new update is ready to be installed on the system partition. This approach is used by Apple iOS.
Pros:
- Robust and atomic
- “Brick-free” (can fall back by rebooting into the previously working partition)
Cons:
- Requires large data transfers even for small changes (unless delta updates are supported)
- Installing an update can take a long time, even 30-60 minutes, for large system updates
- Potentially longer downtime during installation, in particular if a recovery partition is used (in that case the device is down while the update is being installed)
- Extra hardware costs due to the need for double storage (two partitions), or at least an additional recovery partition
Best suited when:
- Device is on a low-cost or free network such as Wi-Fi
- The direct or indirect costs of device downtime are high
- Updates are needed less frequently
- Device is industrial grade
Example use cases:
- Recurring security updates
- Improved support for peripherals, such as a new driver for a Wi-Fi module
Method 2: Application updates
Application updates cover any kind of update that is not a system update, typically software running higher up in the stack, such as the UI application or the cloud connectivity client. These updates range from simple file overwrites to package updates, container updates, and, in rare cases, virtual machine (VM) updates.
The preferred type of application update usually follows naturally from the device architecture, the CI/CD pipeline, and the toolchain.
Devices with powerful CPUs can leverage virtualization to run containers. Containers are fast, easy to manage thanks to their atomic nature, and provide a useful layer of isolation. Microsoft Azure IoT Edge has opted for this approach, making it easy for its ecosystem to both develop and consume different services (as containers).
An alternative to container updates is package updates. Today Debian is one of the most popular operating systems for connected devices. For users who don’t want the extra overhead and complexity associated with containers, updating a Debian-based system is often as simple as installing a new version of a software package, or simply copying over new versions of specific files.
Application updates are much smaller than system updates. Consequently, application updates favor use cases with frequent updates and use cases where network traffic costs money.
The main drawback of application updates is the risk of bricking a device. Even the smallest configuration change can have devastating consequences. If an application update goes wrong, it might make the device totally unusable. Even container-based devices risk bricking if the underlying container system has a hiccup.
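For file-level updates, one common way to reduce the risk described above is to never overwrite a file in place: write the new version to a temporary file in the same directory, flush it to disk, and then rename it over the old one. The sketch below illustrates that pattern with Python's standard library; the `atomic_write` helper and the `app.conf` file name are hypothetical, and this is not presented as what any particular update client does.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to stable storage first
        os.replace(tmp_path, path)   # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # leave no half-written file behind
        raise


with tempfile.TemporaryDirectory() as workdir:
    config = os.path.join(workdir, "app.conf")   # hypothetical config file
    atomic_write(config, b"log_level=info\n")
    atomic_write(config, b"log_level=debug\n")
    with open(config, "rb") as handle:
        print(handle.read())
```

If the process is interrupted before the rename, the original file is still intact. This protects a single file only; it does not make a multi-file update atomic, and it does not guard against a new version that is complete but simply wrong. Note also that `mkstemp` creates the file with restrictive permissions, so a real updater would need to restore the intended mode and ownership.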
Pros:
- Allows for fast, frequent, and targeted updates
- Great flexibility in update type (file, package, container)
- Much less data transfer per update than a system update
Cons:
- Often leads to “software sprawl”, that is, inconsistencies in the versions and dependencies of applications installed across the device fleet over time
- Can brick the device, e.g. if there is a power loss during the update process
Best suited when:
- Device is part of a subscription plan where new features are expected
- Device downtime and potential bricking are accepted risks
- Device is on a cellular or other paid network
- Device is consumer grade
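The “software sprawl” drawback listed above can be made measurable by counting how many distinct combinations of application versions exist in a fleet. The following is a small sketch under assumed data: the `fleet` inventory, the device names, and the package names are invented for illustration, and a real inventory would come from a device management server.

```python
from collections import Counter


def version_drift(fleet: dict[str, dict[str, str]]) -> Counter:
    """Count how many devices share each exact combination of package versions."""
    return Counter(
        tuple(sorted(packages.items())) for packages in fleet.values()
    )


# Hypothetical inventory: device name -> installed application versions.
fleet = {
    "device-1": {"ui": "2.1", "cloud-client": "1.4"},
    "device-2": {"ui": "2.1", "cloud-client": "1.4"},
    "device-3": {"ui": "2.0", "cloud-client": "1.4"},
}
drift = version_drift(fleet)
print(len(drift))  # prints: 2 (distinct software configurations in the fleet)
```

With system updates every device on a given release has the same image, so this count stays at one per release. With application updates it tends to grow over time unless deployments are tracked carefully.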
A few words on Mender
Mender supports both system updates and application updates. Depending on how the cost of downtime weighs against the value of time to market, we believe many vendors should opt for both methods when choosing how to update their devices.
To shorten time to market, vendors can prepare for launch by developing and shipping devices before every feature is ready. Once the devices are connected, application updates allow for frequent, smaller changes.
Once the product reaches maturity, and avoiding bricked devices and staying secure outweigh shipping new features, system updates take center stage.
With this approach vendors get the best of both worlds: fast and frequent updates when new features are most important, and infrequent updates when security and stability matter the most.
This combination covers the full range of OTA software update needs. More frequent feature updates and bug fixes can take advantage of targeted application-level deployments, whereas less frequent security patches and OS-level deployments (usually every 2-3 months) can take advantage of system updates.