Imagine a swarm of IoT devices. They have weak CPUs, little local persistent storage, and not much RAM, and with their weak network connections some of them are not online at all times. At the same time they provide a powerful platform. You may argue that the power comes from their number. I believe that it originates from the fact that they are connected. Assuming, for the sake of argument, that this is true, the communication channel is a cornerstone of any IoT-based solution.
The present blog post describes one way of unleashing the power of IoT with a possible WebSockets-based communication layer.
A Smart Lovely Home example
Imagine you are the proud owner of the Smart Lovely Home company. Your customers are real estate enterprises who sell houses. Each house has the same setup topology: a set of sensors in every power outlet, light bulb, kitchen oven, heating system, and solar power plant on the roof, all connected only to a single local gateway. Customers (the real estate companies) expect to have a web application showing an interactive map of the world with every house sold reporting its status. The key requirement is the alerting and maintenance subsystem that turns the marks on the map green or red, with a detailed description of a possible problem and a suggested solution.
Smart Lovely Home advertises itself as a "turbo secure, lightning fast, mega lightweight, top configurable, and most adaptable" solution, and your main sales leverage is security. With that come the final constraints:
- there are no open listening and bound ports on the gateways.
- there are no accepted incoming connections to the gateways.
- the gateways' outgoing packet filter allows only TCP connections to port 443.
- you encrypt all communication.
To top it off, as a proud and dynamic CEO/CTO/CFO you want it all. You want your product to become a standard for all the households everywhere, you plan to connect the soon-to-be Mars colony, and then expand further.
Extend and expand
So you have most of it in place. Your Lovely Smart System handles the alerts and Precious Pretty UI shows a nice map. Everyone is happy. That was a nice blog post, we could have finished here. The reason we have not is: there is always someone with a new requirement. Same thing happens every time; here comes a customer who needs to upgrade the software on the gateways, and then trigger a firmware upgrade on the sensors. You have nothing, and your best choice is to buy Mender Enterprise.
As I mentioned above, you are a proud CEO. Really proud. When the next edition of "Pride and Prejudice" comes out you are going to be on the cover. That means you are going to build an Over-the-Air software update system from scratch. It is going to be secure, standard, portable, scalable, extensible, API-driven, and user-friendly: you are going to reimplement Mender. You may also spin it off as a new product one day (your pride leaves a bit of room for the envy you feel seeing how Mender is doing).
Talking to your development team, you find out that you took some shortcuts. You pass the alerting and monitoring data to your backend with periodic pushes, without any well-defined structure or validation. The gateway just collects some files from the sensors using a Perl script, and there is no data channel from your backend to push data to the gateways. They tell you: sure, we can do it; 5 months of work, and we are in business. You think: I need to get a life, start a grocery shop around the corner, make some ecological smoothies, be happy.
Let's assume that you had taken a different path, before all that happened.
The live channel and the client
Travelling back in time to the beginning of the Lovely Smart System, let's assume we have a portable Nice Smart Client running on the gateways instead of the script. We also turn the HTTP poll-based model around and maintain one WebSocket connection from each gateway to our backend. How can this work?
The following picture shows one possible architecture. Please note that in the following we do not mention authentication at all, something that you would most certainly need. The design discussed here leaves plenty of room to add it.
As you can see, there is a persistent TCP connection between the Nice Smart Client and the Lovely Smart System backend. There is also a similar channel between the Precious Pretty UI and the backend. Please also note a key design feature here: the UI and the client do not pass data directly; in between there is an extra component, a messaging system. Having something like NATS or RabbitMQ handle the messages gives you better flexibility and more management possibilities.
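To make the role of the messaging system a bit more tangible, here is a minimal in-process sketch of the publish/subscribe idea. It is only a stand-in for a real system such as NATS or RabbitMQ and does not use their APIs; the Broker type and the subject name "device.gateway-42" are made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Broker is a toy in-process publish/subscribe hub keyed by subject.
// It only stands in for a real messaging system; it is not a NATS or
// RabbitMQ client.
type Broker struct {
	mu   sync.Mutex
	subs map[string][]chan []byte
}

// NewBroker returns an empty broker.
func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]chan []byte)}
}

// Subscribe returns a buffered channel that receives every message
// published on subject after this call.
func (b *Broker) Subscribe(subject string) <-chan []byte {
	ch := make(chan []byte, 16)
	b.mu.Lock()
	b.subs[subject] = append(b.subs[subject], ch)
	b.mu.Unlock()
	return ch
}

// Publish delivers msg to all current subscribers of subject and reports
// how many received it. A subscriber whose buffer is full is skipped
// instead of blocking the publisher.
func (b *Broker) Publish(subject string, msg []byte) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	delivered := 0
	for _, ch := range b.subs[subject] {
		select {
		case ch <- msg:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	broker := NewBroker()

	// The backend handler that holds the gateway's WebSocket subscribes
	// to a subject named after the device (hypothetical naming scheme).
	toGateway := broker.Subscribe("device.gateway-42")

	// The handler serving the UI publishes without knowing which backend
	// instance holds the gateway connection.
	n := broker.Publish("device.gateway-42", []byte("start-update"))
	fmt.Println("delivered to", n, "subscriber(s):", string(<-toGateway))
}
```

The point of the indirection is that the UI-facing and device-facing parts of the backend only need to agree on subject names, not on where the other side is running.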
Let's see what we can get out of the above picture:
- The Nice Smart Client calls the backend over HTTP and upgrades the connection to WebSocket.
- If the gateway loses network access, the client keeps trying to reconnect indefinitely.
- The backend stores the data and serves the UI.
- The client gets the data for the upgrades, maybe links for the file system images, new configuration for the sensors, or new packages to install.
- The client performs the upgrades, perhaps calling some non-portable gateway-specific executables, or calling customer-supplied scripts.
Perhaps this time we are done? Wait, see another customer approaching you saying: "I need to execute shell commands on my gateways via interactive remote terminals, like ssh, you know". The grocery shop is so nice (you think). Do we need to rewind again to the beginning of Lovely Smart System? No. Not much at least.
The protocol
You probably wondered what exactly goes over the WebSocket in the above figure. Taking a look at the current state of requirements we can clearly see that there are four flavours of data we are dealing with:
- Alerting messages
- Status messages
- Command execution data
  - user input
  - command output
- Software update messages
How can we (not only the royal we, but also the backend, the UI, and the client) distinguish between the software update data and the output of a man ls command coming back from the gateway over the WebSocket? What if we want to add new features (remember the Mars colony plans)? What if suddenly we release a brand-new client with really cool new things to send?
One way to approach the above problem of message types is to design a protocol that handles arbitrary data, and to use a framework to marshal, encapsulate, and pass it over the WebSocket.
An example implementation in Go may look like this (see here for the full version):
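// ProtoHdr describes what a message is and which session it belongs to.
type ProtoHdr struct {
    // Proto is the protocol type, e.g. monitoring.
    Proto ProtoType `msgpack:"proto"`
    // MsgType is the message type within the protocol, e.g. alert.
    MsgType string `msgpack:"typ,omitempty"`
    // SessionID ties the message to a session.
    SessionID string `msgpack:"sid,omitempty"`
    // Properties carries arbitrary metadata.
    Properties map[string]interface{} `msgpack:"props,omitempty"`
}

// ProtoMsg is the envelope sent over the WebSocket: a header plus an opaque body.
type ProtoMsg struct {
    Header ProtoHdr `msgpack:"hdr"`
    Body   []byte   `msgpack:"body,omitempty"`
}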
In the above, each protocol message carries a body and a header. The header contains the protocol type (e.g. monitoring), the message type (e.g. alert), and additional fields to handle sessions and arbitrary metadata.
The above gives the ability to pass different "application" data over the same WebSocket connection in an organized, portable, and extensible manner.
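As a rough sketch of how a receiver could tell the flavours apart, the snippet below routes already decoded messages by the protocol type in the header. The definition of ProtoType as a small integer, the three protocol constants, and the Router type are assumptions made for this illustration; the msgpack tags and the Properties field are left out because nothing is serialized here.

```go
package main

import (
	"errors"
	"fmt"
)

// ProtoType is assumed to be a small integer; the constants below are
// hypothetical values chosen for this sketch.
type ProtoType uint16

const (
	ProtoMonitoring ProtoType = iota + 1
	ProtoShell
	ProtoUpdate
)

// ProtoHdr and ProtoMsg mirror the structures above, minus the tags.
type ProtoHdr struct {
	Proto     ProtoType
	MsgType   string
	SessionID string
}

type ProtoMsg struct {
	Header ProtoHdr
	Body   []byte
}

// ErrUnknownProto is returned for protocol types nobody registered.
var ErrUnknownProto = errors.New("unknown protocol type")

// Handler processes one message of a given protocol.
type Handler func(msg ProtoMsg) error

// Router maps protocol types to handlers.
type Router map[ProtoType]Handler

// Dispatch hands msg to the handler registered for its protocol type.
func (r Router) Dispatch(msg ProtoMsg) error {
	h, ok := r[msg.Header.Proto]
	if !ok {
		return ErrUnknownProto
	}
	return h(msg)
}

func main() {
	router := Router{
		ProtoMonitoring: func(m ProtoMsg) error {
			fmt.Printf("monitoring/%s: %s\n", m.Header.MsgType, m.Body)
			return nil
		},
		ProtoShell: func(m ProtoMsg) error {
			fmt.Printf("shell session %s: %q\n", m.Header.SessionID, m.Body)
			return nil
		},
	}

	msgs := []ProtoMsg{
		{Header: ProtoHdr{Proto: ProtoMonitoring, MsgType: "alert"}, Body: []byte("oven overheating")},
		{Header: ProtoHdr{Proto: ProtoShell, MsgType: "output", SessionID: "s1"}, Body: []byte("LS(1)")},
		// No handler is registered for updates, so this one is dropped.
		{Header: ProtoHdr{Proto: ProtoUpdate, MsgType: "new-image"}},
	}
	for _, m := range msgs {
		if err := router.Dispatch(m); err != nil {
			fmt.Println("dropped message:", err)
		}
	}
}
```

Supporting a new kind of data then comes down to picking a new protocol type and registering one more handler; older receivers that do not know the type can drop the message without breaking.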
You most probably noticed the msgpack references. This is the last remaining part: we need to serialize the above structure somehow to be able to decode it independently of the platform. The client, the UI, and the backend need a universal way to deserialize the ProtoMsg structures. To this end you can use something like MessagePack.
The following picture shows how we encapsulate the data in this case.
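To show what such an encapsulated message might look like on the wire, here is a small hand-written encoder that covers only the few MessagePack formats this example needs (fixmap, fixstr, uint 16, and bin 8). It is an illustration of the byte layout, not a replacement for a MessagePack library, and a real library may pick different (for example more compact) integer encodings. The field names come from the struct tags above; the protocol value 1 and the "alert" and "hot" strings are made up.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// appendMapHeader writes a fixmap header (maps with at most 15 entries).
func appendMapHeader(b []byte, n int) []byte {
	if n < 0 || n > 15 {
		panic("sketch supports only fixmap (up to 15 entries)")
	}
	return append(b, 0x80|byte(n))
}

// appendStr writes a fixstr (strings shorter than 32 bytes).
func appendStr(b []byte, s string) []byte {
	if len(s) > 31 {
		panic("sketch supports only fixstr (length below 32)")
	}
	b = append(b, 0xa0|byte(len(s)))
	return append(b, s...)
}

// appendUint16 writes the uint 16 format: 0xcd plus a big-endian value.
func appendUint16(b []byte, v uint16) []byte {
	return append(b, 0xcd, byte(v>>8), byte(v))
}

// appendBin writes the bin 8 format: 0xc4, one length byte, then the data.
func appendBin(b []byte, data []byte) []byte {
	if len(data) > 255 {
		panic("sketch supports only bin 8 (up to 255 bytes)")
	}
	b = append(b, 0xc4, byte(len(data)))
	return append(b, data...)
}

// encodeMsg lays out {"hdr": {"proto": proto, "typ": typ}, "body": body}.
// The optional "sid" and "props" header fields are left out, as they would
// be with omitempty when unset.
func encodeMsg(proto uint16, typ string, body []byte) []byte {
	var b []byte
	b = appendMapHeader(b, 2) // outer map: hdr, body
	b = appendStr(b, "hdr")
	b = appendMapHeader(b, 2) // header map: proto, typ
	b = appendStr(b, "proto")
	b = appendUint16(b, proto)
	b = appendStr(b, "typ")
	b = appendStr(b, typ)
	b = appendStr(b, "body")
	b = appendBin(b, body)
	return b
}

func main() {
	body := []byte("hot")
	msg := encodeMsg(1, "alert", body)
	fmt.Println(hex.EncodeToString(msg))
	fmt.Printf("%d bytes on the wire for a %d-byte body\n", len(msg), len(body))
}
```

The body stays an opaque byte string, so each protocol is free to put whatever it needs inside it, while every party can still read the header.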
The scale at large
You have almost forgotten about setting up the grocery shop, when one of your site reliability engineers approaches you and, with a typical smile on his face, says: "we can't handle that many TCP connections". "OK," you think, "I am looking for a carrot supplier the moment that guy leaves." He is persistent, though.
Load balance TCP connections
No matter what your backend runs on, ultimately some host has to handle every TCP SYN packet. Once a TCP connection is established, it is quite challenging to migrate it to another machine, so the limit on the number of TCP connections one host can handle is a very real problem.
Luckily, it has been solved, although there is no universal solution that works everywhere out of the box. You can address the issue with HAProxy. Keep in mind, though, that scaling the HAProxy instances becomes another issue.
Countless devices
Having been with the Lovely Smart System this far, you do not want to wait for another customer visit. You enter the Sprint Planning session and say: "We are connecting 10 million devices on Thursday." Assuming we have something like a scalable HAProxy setup to load balance the persistent connections, there is no real argument that could block the scale-up. Now think back to the first Lovely Smart System version, before our time travel above. We still need to send status and alerting messages, handle command execution, and ask for software updates. Since the devices do not know when a new update is ready, they have to ask periodically. Millions of independent devices over which you have no direct control can produce a number of API calls to the backend that would probably make you look like Dr. Emmett L. Brown in 1955 when he heard that the time machine needs 1.21 gigawatts to run. With a persistent channel, the backend can instead notify a device when there is something for it to do.
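To get a feeling for the numbers, here is a back-of-the-envelope calculation of the average request rate that periodic polling would generate. The 10 million devices come from the text above; the polling intervals are assumptions picked only for illustration, and the calculation ignores bursts, retries, and any other API traffic.

```go
package main

import (
	"fmt"
	"time"
)

// pollRate returns the average number of requests per second produced by
// the given number of devices when each of them polls once per interval.
func pollRate(devices int, interval time.Duration) float64 {
	return float64(devices) / interval.Seconds()
}

func main() {
	const devices = 10_000_000

	// Hypothetical polling intervals.
	intervals := []time.Duration{time.Minute, 30 * time.Minute, 24 * time.Hour}
	for _, interval := range intervals {
		fmt.Printf("poll every %v: about %.0f requests/s on average\n",
			interval, pollRate(devices, interval))
	}
}
```

Longer intervals reduce the load, but they also increase the time a device needs to notice that an update is waiting, which is exactly the trade-off a live channel avoids.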
Run anywhere
Now you think even further ahead. There are new frontiers to break, new exotic platforms to support. Let's take a look at the Lovely Smart System. The Precious Pretty UI is web based, and we implemented the Nice Smart Client in a very portable language; no real problems there. Furthermore, you had thought about a uniform DBus API to interact with the client. In this way any sensor can pass data to the backend without modifying the client code.
The introduction of the WebSocket channel together with the MsgProto protocol and MsgPack not only makes the communication platform independent, but also allows you to use different protocols in place of HTTP and WebSockets, as we briefly discuss below.
The Lovely Smart System takeaways
What we described under the Lovely Smart System nickname is a real-life design of an IoT WebSockets-based setup.
It lets you easily pass any data to and from the devices, while maintaining portability and giving you the ability to extend it as needed. Furthermore, it comes with a live, persistent TCP connection to each device, which opens new frontiers and new possibilities at a relatively low price.
The key ingredients that allow you to scale, extend, and make it platform-independent are:
- a general purpose protocol
- message serialization framework
- messaging system in the backend
- persistent connections to the devices
The price to pay
Needless to say there is always a price to be paid for anything and everything. In the case we are discussing, we can list only a few "drawbacks":
- the above mentioned persistent connection count limit problem
- the keep-alive ping-pong periodical messages
- WebSockets implementation on both ends
- HTTP overhead
Besides the obvious necessity to have a WebSocket implementation available, we already mentioned the connection count issue, which is obviously a price to pay for the WebSockets based live channel to the devices.
HTTP in general, and WebSocket in particular, add a certain overhead to the messages. They are still application-level protocols, encapsulated in TCP and IP. If that overhead becomes an issue, you can still turn to other protocols that may be better suited to your particular use case.
Other IoT protocols
There are at least two other protocols that come to mind for implementing our solution: MQTT and CoAP. These two protocols come into the picture mainly for small and constrained devices. The advantages of MQTT are its low network bandwidth usage and its publish-subscribe model. It is a perfect choice for many IoT applications; however, for the case discussed above it makes little difference, since the WebSocket-based approach can achieve similar results.
CoAP, being an HTTP analogue for small devices, is also a valid option, but in a similar fashion it changes little for the above arguments. You have to take into account most of the above no matter what you use as the underlying layers. The choice probably depends on the particular devices in question and the additional constraints.
Links of glory
There was a time when people gained incredible insight into the basic laws of nature via so-called gauge theories. With the birth of supercomputing, numerical analysis reached its peak and continued to serve as an equivalent of physical experiments. However, we needed new methods, suitable for digital machines to handle. One of them was lattice models, where we analyse physical phenomena on a discrete network of nodes and links. However successful, it turned out to be extremely challenging to bring gauge symmetry to the lattice. It was only when people realized that the gauge transformation must be associated with the links between the nodes that this fundamental mechanism could be used in the calculations. It may be just a fun fact, but it carries a deeper meaning: it uncovers the role a connection can play.
One of the best Star Trek villains, the Borg, were powerful because they were connected to the collective. It is implied, if sometimes forgotten, that IoT devices are by definition connected, and at the core it is all about communication.
You can unleash the true power and glory of your IoT fleet with a proper, well-thought-out communication channel, for which a WebSocket is a valid and valuable alternative.