Mender Blog

Requirements of a successful OTA update mechanism

Written by Farshad Tavakoli | Jul 23, 2020 4:00:00 AM

With the proliferation of embedded devices in the Internet of Things (IoT) across a variety of industry verticals, a host of assets are scattered globally across terrain, geography, and customer locations. Support for remote software updates is essential for effective operations. Industry verticals such as transportation, energy, consumer, smart building and manufacturing rely heavily on a dependable IoT ecosystem for the best possible outcomes. All software used in this ecosystem, from low-level firmware to high-level application software, will need updating throughout the product lifetime.

The requirements for updating software on connected devices differ from those of many other parts of the IoT ecosystem, such as cloud infrastructure, because connected devices must be assumed to be unreliable and non-redundant. Poor connectivity, loss of battery power, physical damage, and end-user behaviour can all interfere with an update.

Firstly, an OTA update mechanism needs the ability to manage an entire fleet of connected devices. Many homegrown OTA solutions are restricted to updating devices one by one, which quickly becomes unmanageable and error-prone as the fleet grows. More manual tasks mean a higher probability of human error, which introduces security and operational risks given the sensitive nature of the update process. Even when automated, homegrown solutions typically target only the specific product line they were implemented for. Once the next generation of hardware, software and products is developed, "retro-fitting" the existing homegrown OTA updater is a challenge and sometimes not feasible, so yet another homegrown OTA system needs to be developed and maintained.

Secondly, an OTA solution must be robust, and robustness has several elements. The worst-case scenario is an unwanted interruption during an update, for example a loss of power or connectivity, that leaves the device unusable. Another example is a software update that is only partially installed on a device in a remote location. Either case can leave the device in a "bricked" state, with no remediation except to replace it with a new device. Given these potential consequences, the resilience and reliability of the update process must be a central concern: the process needs to be robust and atomic, and it must support rollbacks.
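One common way to meet the atomicity and rollback requirement is a dual-slot (A/B) layout, where the new image is written to an inactive slot and the device only switches over once the write has completed. The following is a minimal sketch of that idea as a simulation, not a description of any particular product's implementation. The `Device` class, the `apply_update` function, and the `write_ok` and `health_check` parameters are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device with two firmware slots (A/B layout)."""
    slots: dict = field(default_factory=lambda: {"A": "v1", "B": None})
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"


def apply_update(device, image, write_ok=True, health_check=lambda d: True):
    """Install `image` on the inactive slot and switch only on success.

    `write_ok` simulates whether the download/write finished (assumption);
    `health_check` simulates a post-reboot self-test (assumption).
    """
    previous = device.active
    target = device.inactive()

    # Step 1: write to the inactive slot. The running system is untouched,
    # so an interruption here leaves the device bootable on the old version.
    if not write_ok:
        device.slots[target] = None  # discard the partial image
        return "aborted"
    device.slots[target] = image

    # Step 2: switch the boot target in one step (the commit point).
    device.active = target

    # Step 3: if the new version fails its self-test, roll back.
    if not health_check(device):
        device.active = previous
        return "rolled_back"
    return "committed"
```

For example, `apply_update(Device(), "v2", health_check=lambda d: False)` returns `"rolled_back"` and leaves slot A active, which is the behaviour that prevents a failed update from bricking the device.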

Thirdly, the security and integrity of the software update itself need to be taken seriously. The growing number of compromised connected devices contributed directly to the 91% growth in DDoS attacks in 2017, which has been attributed to poor IoT security. The trend is profoundly disturbing. A key security feature is code signing (cryptographic validation) of an update, so that a device can detect and reject an update that has been tampered with, for example by a man-in-the-middle attacker, and the integrity of sensitive components is preserved. An obvious but surprisingly often neglected requirement is that all communication must be encrypted.
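The validation step can be sketched as follows: the device recomputes a tag over the received update and refuses to install it if the tag does not match. Note the substitution made here: real code signing normally uses a public-key signature scheme, so that devices hold only a verification key. To stay within the Python standard library, this sketch uses a keyed HMAC-SHA256 as a stand-in. The function names `sign_artifact` and `verify_artifact` are hypothetical.

```python
import hashlib
import hmac


def sign_artifact(payload: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the update payload.

    Stand-in for a real signature: production code signing would
    typically use an asymmetric (public/private key) scheme instead.
    """
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_artifact(payload: bytes, signature: str, key: bytes) -> bool:
    """Return True only if the payload matches the tag it shipped with."""
    expected = sign_artifact(payload, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

A device would call `verify_artifact` before installing anything; a payload modified in transit produces a different tag and is rejected.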

Finally, the OTA mechanism needs easily accessible APIs to hook into existing continuous development, build and integration systems. The development team is likely to push back if the update mechanism dictates which tools they must use, requires them to adopt a specific embedded OS or language, or prescribes how updates are packaged.

Check out this comprehensive OTA requirements checklist.

Get a deep dive with this whitepaper.