Choosing and standardizing a data format is central to communication between IT and OT systems in a Unified Namespace (UNS). The data format determines how information is structured in MQTT messages and transmitted through the UNS. Different formats meet different requirements in the manufacturing industry, so selecting one means weighing the advantages and disadvantages of the available options. This article gives an overview of the most common data formats and offers recommendations for the UNS.
What are data formats?
Data formats define the structure and organization of information that is exchanged between systems. They determine how data is represented, stored and transmitted in a standardized form. In practice, a data format determines how content (such as sensor data, machine statuses or production parameters) is packaged in messages so that senders and recipients interpret it in the same way. Well-known text formats such as JSON and XML, and binary formats such as Protobuf, play a central role in the manufacturing industry because each meets specific requirements in terms of efficiency, readability, compatibility and processing speed.
Data formats in the context of the Unified Namespace (UNS)
The choice of the right data format plays a decisive role for efficiency and interoperability in the Unified Namespace (UNS). This section provides an overview of the most common formats and gives practical recommendations for their use in the UNS context. The aim is to take equal account of the requirements of IT and OT systems and use cases and to ensure seamless communication between the two worlds.
1. XML (eXtensible Markup Language)
Advantages: XML offers a structured representation with tags that is particularly suitable for complex and hierarchical data. Today, XML is often found in legacy systems and OT environments where strict data validation is required. Due to its widespread use, XML is indispensable in certain industrial contexts.
Disadvantages: The verbose syntax of XML increases message size and therefore bandwidth consumption. Processing XML is more resource-intensive than processing JSON or Protobuf, which can lead to longer processing times.
Recommendation: XML is particularly suitable for environments in which it is already established as the standard data format or where complex data structuring is required.
Example:
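As a minimal sketch, assuming a hypothetical temperature reading (the element names `reading`, `value` and `timestamp`, the attribute `sensorId` and all values are illustrative and not part of any standard), an XML payload could be built and parsed with Python's standard library like this:

```python
import xml.etree.ElementTree as ET


def build_reading_xml(sensor_id: str, value: float, unit: str, timestamp: str) -> str:
    """Serialize one sensor reading as an XML string (hypothetical schema)."""
    root = ET.Element("reading", attrib={"sensorId": sensor_id})
    value_el = ET.SubElement(root, "value", attrib={"unit": unit})
    value_el.text = str(value)
    ET.SubElement(root, "timestamp").text = timestamp
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


def parse_reading_xml(payload: str) -> dict:
    """Parse the XML string produced by build_reading_xml back into a dict."""
    root = ET.fromstring(payload)
    value_el = root.find("value")
    return {
        "sensor_id": root.get("sensorId"),
        "value": float(value_el.text),
        "unit": value_el.get("unit"),
        "timestamp": root.findtext("timestamp"),
    }


xml_payload = build_reading_xml("temp-01", 72.5, "degC", "2024-01-01T12:00:00Z")
print(xml_payload)
print(len(xml_payload.encode("utf-8")), "bytes")
```

The opening and closing tags around every value are what make XML messages comparatively large, and they are also what makes the hierarchy explicit.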
2. JSON (JavaScript Object Notation)
Advantages: JSON offers a clear, human-readable structure based on key-value pairs, which makes it almost universally applicable. Broad support across programming languages, OT and IT systems and IoT platforms usually keeps the integration effort low. JSON is suitable for both simple and complex data representations. Further information can be found on the JSON website.
Disadvantages: The text-based representation of JSON often leads to higher memory requirements for large amounts of data. In addition, JSON has no native binary type, so binary data has to be encoded as text (for example with Base64), which can be problematic in some industrial applications.
Recommendation: JSON is ideal for applications that require human readability, platform compatibility and flexibility. As such, this format is particularly useful when exchanging data between a variety of systems and applications.
Example:
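As a minimal sketch, assuming the same kind of hypothetical temperature reading (the field names `sensorId`, `value`, `unit`, `timestamp` and `rawWaveform` are illustrative, not a prescribed UNS schema), a JSON payload could be produced and read back as follows. The binary field is Base64-encoded because JSON cannot carry raw bytes.

```python
import base64
import json

# Hypothetical raw bytes from a device; JSON cannot carry them directly.
raw_waveform = bytes([0x01, 0x02, 0xFF])

reading = {
    "sensorId": "temp-01",
    "value": 72.5,
    "unit": "degC",
    "timestamp": "2024-01-01T12:00:00Z",
    # Binary data has to be wrapped in a text encoding such as Base64.
    "rawWaveform": base64.b64encode(raw_waveform).decode("ascii"),
}

# Compact separators avoid the extra whitespace json.dumps adds by default.
json_payload = json.dumps(reading, separators=(",", ":"))

# The receiver decodes the text and unwraps the Base64 field again.
decoded = json.loads(json_payload)
restored_waveform = base64.b64decode(decoded["rawWaveform"])

print(json_payload)
print(len(json_payload.encode("utf-8")), "bytes")
```

The payload stays readable in any MQTT client or log file, at the cost of repeating every key name in every message.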
3. Protocol Buffers (Protobuf)
Advantages: Protobuf uses a binary format that makes serialization and deserialization extremely efficient. This efficiency leads to better performance in terms of transmission and processing times, especially in resource-intensive applications or in scenarios with limited bandwidth. Protobuf is very compact and reduces memory requirements. Further information can be found on the Protobuf website.
Disadvantages: Since Protobuf is a binary format, it lacks human readability. This makes troubleshooting and data checking more difficult. Protobuf is also less common in web applications, which can make integration into systems and applications more complicated.
Recommendation: Protobuf is ideal for scenarios where bandwidth efficiency and performance are critical. It is particularly useful in resource-constrained environments.
Example:
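In practice, Protobuf messages are defined in a `.proto` schema and handled through classes generated by the `protoc` compiler. The sketch below does not use that tooling. It hand-encodes the Protobuf wire format for a hypothetical message (assumed schema: `string sensor_id = 1; double value = 2; string unit = 3;`) purely to illustrate why the result is compact and why it is not human-readable. All function names are illustrative.

```python
import struct


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, length, UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def encode_double_field(field_number: int, value: float) -> bytes:
    """Wire type 1 (64-bit): tag, then 8 bytes little-endian IEEE 754."""
    return encode_varint((field_number << 3) | 1) + struct.pack("<d", value)


def encode_reading(sensor_id: str, value: float, unit: str) -> bytes:
    """Encode the hypothetical Reading message field by field."""
    return (
        encode_string_field(1, sensor_id)
        + encode_double_field(2, value)
        + encode_string_field(3, unit)
    )


protobuf_payload = encode_reading("temp-01", 72.5, "degC")
print(protobuf_payload.hex())
print(len(protobuf_payload), "bytes")
```

Field names never appear in the payload, only field numbers, which is where most of the size saving over JSON and XML comes from. It also means the bytes cannot be interpreted without the schema.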
XML vs. JSON vs. Protobuf
Format | Advantages | Disadvantages | Recommendation |
XML | Structured representation for complex, hierarchical data; frequently used in legacy systems and OT environments; suited to strict data validation | Extensive syntax leads to larger messages; resource-intensive processing; longer processing times | Ideal for existing standards or complex data structures in legacy systems and OT environments. |
JSON | Clear, human-readable structure with key-value pairs; universal applicability and broad support; suitable for simple and complex data | Higher memory requirements for large amounts of data; limited support for binary data | Perfect for platform compatibility, flexibility and human readability, especially when exchanging data between many systems. |
Protobuf | Efficient serialization/deserialization through binary format; high performance during transmission and processing; compact format with low memory requirements | Lack of human readability makes troubleshooting difficult; less common in web applications, which can complicate integration | Excellent for resource-intensive scenarios or environments with limited bandwidth and high efficiency requirements. |
JSON vs. binary buffer formats for real-time OT communication
Buffer formats, meaning binary payloads with a fixed byte layout, have several advantages over JSON in real-time OT systems:
- Efficiency: Buffer formats represent data more compactly than JSON because values are stored in binary rather than as text. This compactness reduces the amount of data and improves transmission and processing times.
- Deterministic processing: The fixed record sizes in buffer formats ensure consistent and predictable data processing.
- Resource conservation: Buffer formats require fewer CPU and memory resources than JSON. These properties make them particularly suitable for resource-limited OT systems.
- Interoperability with hardware: Buffer formats integrate directly into low-level hardware protocols. This considerably simplifies data transmission between sensors, controllers and actuators.
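The size and determinism points above can be sketched with Python's `struct` module. The record layout used here (a 16-bit sensor ID, a 32-bit float value and a 32-bit Unix timestamp, little-endian) is a hypothetical assumption for illustration, not a layout defined by any particular device or protocol.

```python
import json
import struct

# Hypothetical fixed layout: uint16 sensor id, float32 value, uint32 Unix time.
# "<" selects little-endian byte order without padding, so every record has
# the same size.
RECORD = struct.Struct("<HfI")


def pack_reading(sensor_id: int, value: float, timestamp: int) -> bytes:
    """Pack one reading into a fixed-size binary record."""
    return RECORD.pack(sensor_id, value, timestamp)


def unpack_reading(buffer: bytes) -> tuple:
    """Unpack a record into (sensor_id, value, timestamp)."""
    return RECORD.unpack(buffer)


binary_payload = pack_reading(1, 72.5, 1704110400)

# The same reading as compact JSON, for a size comparison.
json_payload = json.dumps(
    {"sensorId": 1, "value": 72.5, "timestamp": 1704110400},
    separators=(",", ":"),
).encode("utf-8")

print("binary:", len(binary_payload), "bytes")
print("json:  ", len(json_payload), "bytes")
print(unpack_reading(binary_payload))
```

Because every record has exactly `RECORD.size` bytes, a receiver can read it without scanning for delimiters. The trade-off is that both sides must agree on the layout in advance, and a 32-bit float only approximates most decimal values.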
Conclusion: JSON as the most common data format in the Unified Namespace (UNS)
Despite the specific advantages of XML and Protobuf, JSON remains the preferred format for the majority of UNS applications due to its readability and broad support. The reasons for this are:
- JSON is easy to understand and enables rapid integration at the interface between OT and IT.
- The human-readable structure makes it easier for specialists in production and manufacturing environments to handle data.
- Its broad compatibility with almost all IoT platforms, including Azure IoT and AWS IoT, makes JSON particularly versatile.