"What's so special about Industrial IoT as we've been collecting data from control systems and remotely monitoring process for many years?" That is a question I get asked a lot by seasoned SCADA professionals. While there is an element of truth to it, there are many reasons why that kind of assertion is misleading. One reason that I believe really sets the two apart fundamentally is the introduction of the publish/subscribe architecture for industrial data acquisition with IIoT systems.
Here's why. SCADA systems traditionally employ a polling mechanism to acquire process data, whereby the SCADA communication engine has to actively poll all the sensors and devices representing the current process state, one after the other, to get new data. In most cases this is done using point-to-point protocols. No doubt about it, such an architecture is prone to communication failures due to application blocking, cannot easily scale because of bandwidth limitations, and is so tightly coupled that participants have to know a great deal about each other's implementation.
On the other hand, IIoT technologies use data subscription instead of polling. There is no communication engine that goes around interrogating devices and sensors for new data even when the monitored signal has not changed. Instead, the devices themselves package the data into topics and publish it to a centralised broker whenever there is a need to communicate. All the services and other devices interested in that data simply subscribe to the same topics on the broker, and new data is pushed to them as it becomes available. Think about it: this architecture fundamentally changes how systems acquire data from control systems or remote installations. Bandwidth usage is significantly reduced because the continuous collection of redundant information is eliminated. Further, the resulting system is loosely coupled, as the participants do not need to know about each other's implementation details. This makes it possible for systems in the IT domain, such as analytics software, to subscribe directly to data from the shop floor. And because the internet is used as the back-haul network, a common baseline is created that opens the system up to many new possibilities. As if that's not enough, such an architecture can easily scale to millions of devices.
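To make the contrast with polling concrete, here is a minimal sketch of the pattern in plain Python. It is not an MQTT client: ToyBroker, ReportByExceptionSensor and the topic name are hypothetical, and the "broker" is just an in-memory dictionary with exact-match topics. The point is only that the publisher sends on change and never needs to know who is listening.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Handler = Callable[[str, str], None]


class ToyBroker:
    """In-memory stand-in for a broker (exact-match topics, no wildcards)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        # Push the payload to every subscriber of this topic.
        # The publisher never learns who (or how many) they are.
        for handler in self._subscribers.get(topic, []):
            handler(topic, payload)


class ReportByExceptionSensor:
    """Hypothetical sensor that publishes only when its value changes."""

    def __init__(self, broker: ToyBroker, topic: str) -> None:
        self._broker = broker
        self._topic = topic
        self._last: Optional[str] = None

    def sample(self, value: str) -> bool:
        if value == self._last:
            return False  # nothing new, nothing sent
        self._last = value
        self._broker.publish(self._topic, value)
        return True


broker = ToyBroker()
historian: List[str] = []
analytics: List[str] = []
topic = "plant/line_a/pump_1/state"  # illustrative topic name

# Two independent consumers subscribe to the same topic.
broker.subscribe(topic, lambda t, p: historian.append(p))
broker.subscribe(topic, lambda t, p: analytics.append(p))

sensor = ReportByExceptionSensor(broker, topic)
for reading in ["ON", "ON", "ON", "OFF", "OFF", "ON"]:
    sensor.sample(reading)

print(historian)  # six samples taken, only the three changes were published
```

Adding a third consumer would mean one more subscribe call and no change at all to the sensor, which is the loose coupling described above.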
MQTT + Sparkplug B
At the forefront of the pub/sub innovation is Message Queuing Telemetry Transport (MQTT), a simple, lightweight and open transport protocol that enables devices at the edge of the network to publish information to a broker. The broker can then forward the messages over TCP/IP to any client that has subscribed to that particular type of information. As it turns out, much of the implementation information was left out of the original MQTT specification in order to "provide maximum flexibility across any sector that might choose to use MQTT". Yet for MQTT to be interoperable within the IIoT sector, some implementation information must be defined. This is what Sparkplug tries to achieve in order to optimise MQTT for SCADA/IIoT. Below are three implementation details that Sparkplug seeks to address, all the while remaining true to the goal of keeping message sizes to a minimum.
MQTT Topic Namespace
In a typical IIoT solution there is usually a hierarchy of device communication: there could be a number of devices/sensors behind an Edge of Network (EoN) gateway, and each gateway could be located in a distinct part of a plant. If a device publishes to a topic, the receiver needs to be able to identify the position of the sender in the hierarchy and to know where the message is coming from in the physical space of the plant. Simply stated, the Topic Namespace needs to be understood by every client participating in the data exchange.
All MQTT clients using the Sparkplug specification will use the following structure:
namespace/group_id/message_type/edge_node_id/[device_id]
Here the namespace element is the root element. The group_id element is a logical grouping of MQTT Edge of Network nodes, e.g. one set of MQTT devices can be grouped together as Assembly Line A while another set is grouped as Assembly Line B. The message_type element specifies the type of message being sent, e.g. whether it is a Birth Certificate message (more on that later), a device data message, a state message and so on. The edge_node_id element uniquely identifies the MQTT Edge of Network node, which could be a gateway, or a sensor if it is capable of transmitting MQTT messages on its own. Optionally, if there are legacy sensors or devices behind an MQTT gateway (EoN node), they are identified using the device_id element.
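As a rough illustration of that structure, the sketch below builds and parses topic strings with the five elements listed above. The helper names (SparkplugTopic, build_topic, parse_topic) are my own and not part of any library, and the group, node and device IDs are made up. The values spBv1.0 and DDATA are the namespace and device data message type as I understand them from the Sparkplug B specification; check them against the spec before relying on them.

```python
from typing import NamedTuple, Optional


class SparkplugTopic(NamedTuple):
    namespace: str
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str] = None  # only present for devices behind a gateway


def build_topic(topic: SparkplugTopic) -> str:
    """Join the elements into namespace/group_id/message_type/edge_node_id/[device_id]."""
    parts = [topic.namespace, topic.group_id, topic.message_type, topic.edge_node_id]
    if topic.device_id is not None:
        parts.append(topic.device_id)
    return "/".join(parts)


def parse_topic(raw: str) -> SparkplugTopic:
    """Split a topic string back into its elements (4 or 5 levels expected)."""
    parts = raw.split("/")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 topic levels, got {len(parts)}: {raw!r}")
    return SparkplugTopic(*parts)


# A legacy sensor behind a gateway on Assembly Line A (IDs are illustrative).
device_topic = build_topic(
    SparkplugTopic("spBv1.0", "AssemblyLineA", "DDATA", "Gateway1", "Sensor7")
)
print(device_topic)

# A receiver can recover where the message came from just by parsing the topic.
origin = parse_topic(device_topic)
print(origin.group_id, origin.edge_node_id, origin.device_id)
```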
MQTT State Management
Here's the thing. The polling mechanism employed by traditional SCADA systems is there not by chance but by design. Because the connection state of the network is extremely important in industrial systems, there has to be continuous polling to make sure that all devices are awake and can provide data when requested. Hence, treating MQTT as a simple stateless pub/sub system would be a mistake. Here's an example to make the point: suppose a digital sensor is monitoring some ON/OFF state and publishing it to an MQTT topic. The sensor would obviously only publish a message when the state changes from ON to OFF or vice versa, not when there is no change. Yet it would be very important for the receiver to know whether the publisher is silent because the ON/OFF state hasn't changed or because it is now offline. In that regard, the Sparkplug specification defines the use of MQTT's "Last Will and Testament" feature, which enables the broker to inform interested clients when a publisher goes offline (Death Certificate), together with a Birth Certificate message that announces when its connection is restored.
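The ON/OFF example can be sketched from the receiver's side. The class below is a hypothetical, heavily simplified tracker: it only looks at three Sparkplug message types (NBIRTH, NDEATH, NDATA) and a single value per node, and it assumes the messages have already been received and decoded. It is meant to show the reasoning, not how a real Sparkplug host application is written.

```python
from typing import Dict, Optional


class NodeStateTracker:
    """Remembers, per edge node, whether it is online and its last value."""

    def __init__(self) -> None:
        self._online: Dict[str, bool] = {}
        self._last_value: Dict[str, str] = {}

    def on_message(self, node_id: str, message_type: str,
                   value: Optional[str] = None) -> None:
        if message_type == "NBIRTH":
            # Birth Certificate: the node has (re)connected.
            self._online[node_id] = True
        elif message_type == "NDEATH":
            # Death Certificate: delivered by the broker on the node's behalf.
            self._online[node_id] = False
        elif message_type == "NDATA" and value is not None:
            self._last_value[node_id] = value

    def describe(self, node_id: str) -> str:
        if node_id not in self._online:
            return "unknown node"
        value = self._last_value.get(node_id, "no data yet")
        if self._online[node_id]:
            return f"online, current value: {value}"
        return f"offline, last known value: {value}"


tracker = NodeStateTracker()
tracker.on_message("Gateway1", "NBIRTH")
tracker.on_message("Gateway1", "NDATA", "ON")

# Silence after this point means "still ON", because no death was received.
quiet_but_alive = tracker.describe("Gateway1")

# The node drops off the network and the broker delivers its will.
tracker.on_message("Gateway1", "NDEATH")
quiet_and_dead = tracker.describe("Gateway1")

print(quiet_but_alive)
print(quiet_and_dead)
```

Without the birth and death messages, both situations would look identical to the receiver: no new data on the topic.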
MQTT Payload
As I mentioned earlier, the original MQTT specification does not provide implementation details on a number of features, including the format of the data that is sent as a payload. Meanwhile, if we are to create rich and interoperable IIoT solutions using MQTT, there has to be a payload definition. Sparkplug's answer to that is its latest payload definition, version B, otherwise known as Sparkplug B. It uses Google Protocol Buffers as the encoding technology and it supports complex data types, datasets, historical data, file data, and richer metrics with metadata and metric aliases.
A simple Sparkplug B payload with values would be represented in JSON as follows:
{
  "timestamp": 1486144502122,
  "metrics": [{
    "name": "Batch Status",
    "alias": 1,
    "timestamp": 1479123452194,
    "dataType": "String",
    "value": "Complete"
  }],
  "seq": 2
}
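For readers who prefer code, the same structure can be modelled with a couple of Python dataclasses. This is only a sketch of the JSON representation shown above: the class names Metric and Payload are my own, and a real Sparkplug B client would encode the payload with Protocol Buffers rather than JSON.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Metric:
    name: str
    alias: int
    timestamp: int  # milliseconds since the Unix epoch
    data_type: str
    value: Any

    def to_dict(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "alias": self.alias,
            "timestamp": self.timestamp,
            "dataType": self.data_type,
            "value": self.value,
        }


@dataclass
class Payload:
    timestamp: int
    seq: int
    metrics: List[Metric] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "timestamp": self.timestamp,
            "metrics": [m.to_dict() for m in self.metrics],
            "seq": self.seq,
        }


payload = Payload(
    timestamp=1486144502122,
    seq=2,
    metrics=[Metric("Batch Status", 1, 1479123452194, "String", "Complete")],
)

# JSON is used here purely for readability.
as_json = json.dumps(payload.to_dict(), indent=2)
print(as_json)
```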
In conclusion, this article is far from exhaustive on the topic of MQTT and Sparkplug. It serves as a brief introduction for those who are not familiar with the technologies outlined above. Detailed information can be found in the original specification documents.