Data models define how data is presented and received by applications and systems. Standardizing these models in the Industrial Unified Namespace (UNS) is crucial to the long-term scalability of the architecture. In practice, data models are defined either by developing proprietary models or by adopting recognized industry standards (e.g. OPC UA Companion Specifications). As the application of these standards is still in its infancy, many companies rely on their own data models. Although defining your own models may seem complex at first, a step-by-step approach and a focus on specific use cases make the task much easier. In this article, you will find a step-by-step guide to data modeling in the Unified Namespace (UNS).
Why is data modeling essential in the Unified Namespace (UNS)?
Before turning to the step-by-step guide, it is worth spelling out why data modeling is central to the UNS. The following four points illustrate this.
1. Interoperability and integration
Modern manufacturing environments consist of a large number of systems and devices from different manufacturers. Data modeling provides a common language and structure for data representation. This enables:
- Seamless integration of different systems.
- Compatibility across the entire ecosystem.
- Reduction of data silos and communication barriers.
2. Consistency and quality of the data
Standardized data models ensure uniformity and accuracy in all processes. The advantages are:
- Improved data quality through defined semantics and data formats.
- Reduced errors and more reliable decision-making.
- Clear data relationships for smooth collaboration between OT and IT systems.
3. Scalability and flexibility
Manufacturing systems are constantly evolving. Data modeling creates a scalable basis that:
- Eases the integration of new technologies.
- Offers flexibility for changing requirements.
- Supports innovation and growth in the long term.
4. Effective data analysis and insights
Data modeling enables the aggregation and analysis of large amounts of data. This promotes:
- Use of AI and machine learning to optimize processes.
- Identification of patterns and anomalies.
- Data-driven decisions that increase efficiency and productivity.
Step 1: Prioritize use cases
Identify and prioritize relevant use cases for your Industrial Unified Namespace (UNS) based on a cost-benefit analysis. Take into account the specific requirements of all relevant stakeholders. These include:
- OT experts (e.g. control technicians): Are familiar with relevant factory systems (e.g. machines or central PLCs) and know how to access the data.
- IT experts (e.g. IT manager for factory or corporate IT): Understand how to access relevant corporate or cloud systems (e.g. Tableau or ERP) and how the data must be made available for these target systems.
At the beginning, focus on use cases that are easy to implement and offer a high benefit. Examples of this are monitoring or basic data analyses, such as the calculation of KPIs (e.g. OEE). Such simple use cases provide a good basis for creating a data model and at the same time ensure that the data model meets the practical requirements.
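To make the KPI example concrete, here is a minimal sketch of the commonly used OEE calculation (availability × performance × quality). The function name, the parameter names and the sample shift figures are illustrative assumptions; they also hint at which variables (run time, part counts) the data model will need to carry.

```python
def calculate_oee(planned_time_s: float, run_time_s: float,
                  ideal_cycle_time_s: float, total_parts: int,
                  good_parts: int) -> float:
    """Return OEE as a fraction between 0 and 1."""
    if planned_time_s <= 0 or run_time_s <= 0 or total_parts <= 0:
        return 0.0  # nothing planned or nothing produced
    availability = run_time_s / planned_time_s
    performance = (ideal_cycle_time_s * total_parts) / run_time_s
    quality = good_parts / total_parts
    return availability * performance * quality


# Hypothetical shift: 8 h planned, 7 h running, 30 s ideal cycle time.
oee = calculate_oee(planned_time_s=28_800, run_time_s=25_200,
                    ideal_cycle_time_s=30, total_parts=800, good_parts=760)
print(f"OEE: {oee:.1%}")  # OEE: 79.2%
```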
Step 2: Determine relevant IT/OT systems
Identify the relevant data source and target systems for the prioritized use case and clarify the following points:
- What data is necessary to achieve your business goals? For example, for the OEE analysis: How many parts were produced per machine?
- What data is available on the source system and how can it be retrieved (e.g. machine outputs via xml files or OPC UA)?
- How are the target systems accessed and how is the data received? For example, is it provided via a REST API or via database queries?
- Is there a special data format or data type that your target system expects (e.g. JSON or buffer)?
- How often does the data need to be updated and what triggers the update (e.g. event-based or cyclical to transfer data every second)?
Focus initially on the requirements of your prioritized use case. Over time, you can adapt and expand the data model.
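One way to keep the answers to these questions in a reviewable form is to record them per data point. The following sketch is only an illustration: the class, its field names and the sample values are assumptions and not part of any UNS product or standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UpdateTrigger(Enum):
    EVENT_BASED = "event-based"
    CYCLICAL = "cyclical"


@dataclass(frozen=True)
class IntegrationRequirement:
    """Answers to the Step 2 questions for one data point (hypothetical schema)."""
    data_point: str          # which data is needed for the business goal
    source_system: str       # where the data originates
    source_protocol: str     # how it is retrieved, e.g. "OPC UA" or "XML file"
    target_system: str       # which system consumes the data
    target_interface: str    # e.g. "REST API" or "database query"
    target_format: str       # e.g. "JSON"
    trigger: UpdateTrigger
    interval_s: Optional[float] = None  # only required for cyclical updates

    def __post_init__(self) -> None:
        if self.trigger is UpdateTrigger.CYCLICAL and not self.interval_s:
            raise ValueError("cyclical updates need a positive interval")


# Hypothetical entry for the OEE use case.
parts_counter = IntegrationRequirement(
    data_point="parts produced per machine",
    source_system="CNC machine",
    source_protocol="OPC UA",
    target_system="Tableau",
    target_interface="REST API",
    target_format="JSON",
    trigger=UpdateTrigger.CYCLICAL,
    interval_s=1.0,
)
```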
Step 3: Identify relevant data categories
Determine the relevant data categories that are important for your prioritized use case (e.g. KPIs or machine data). Machine data can generally be divided into three basic classifications, each of which covers specific application scenarios:
- Telemetry data: This category covers the transmission of data from the shop floor to higher levels (e.g. cloud analysis). It includes raw measurements (e.g. real-time temperature, high-frequency vibration measurements, regular pressure updates) as well as important metadata such as the timestamp of the measurement. This category is particularly important for use cases such as monitoring and data analysis.
- Command and control data: This category of data has traditionally been limited to the asset and production level, primarily due to time sensitivity. However, with advances in hybrid edge-to-cloud architecture, control actions are increasingly being automated from higher levels. The messages typically contain product order details, recipe and process parameters. This category of data is critical for more complex use cases, such as automated decision cycles (closed-loop).
- Management data: Bidirectional management data plays an important role in comprehensive machine management. It typically contains dynamic and static asset metadata, such as the machine ID, machine manufacturer, machine configurations or information on the last maintenance.
Each of these categories has different requirements for data communication and processing (e.g. criticality, frequencies and target systems). It may therefore be advisable to model them in separate data models. This deliberate separation helps to address the requirements for critical and non-critical communication and forms the basis for an effective role system (e.g. differentiated access rights depending on the criticality of the data). In the following, focus on the data category relevant to the prioritized use case.
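The separation into categories and the role system built on it could be sketched as follows. The three categories come from the list above; the role names and the permission table are purely hypothetical and would have to be replaced by your own access concept.

```python
from enum import Enum


class DataCategory(Enum):
    TELEMETRY = "telemetry"
    COMMAND = "command"
    MANAGEMENT = "management"


# Hypothetical role table: which role may publish which data category.
PUBLISH_RIGHTS = {
    "machine_gateway": {DataCategory.TELEMETRY, DataCategory.MANAGEMENT},
    "process_controller": {DataCategory.COMMAND},
    "analytics_dashboard": set(),  # read-only consumer
}


def may_publish(role: str, category: DataCategory) -> bool:
    """Return True if the role is allowed to publish data of this category."""
    return category in PUBLISH_RIGHTS.get(role, set())


print(may_publish("machine_gateway", DataCategory.TELEMETRY))   # True
print(may_publish("analytics_dashboard", DataCategory.COMMAND))  # False
```

Keeping critical command data in its own model makes a rule like this easy to express, because the category alone decides who may write.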
Step 4: Define data model in the Unified Namespace (UNS)
Define the data model with a unique and meaningful name and description. Determine the data format, the structure and the necessary use case parameters. This includes, among other things:
- Variable name: Define the name of the variable. Ensure consistency in variable names in all your data models. Example: A “timestamp” variable should always be referred to as such. If you have difficulties defining the names, refer to generally recognized industry standards such as ISA-95 or OPC UA Companion Specifications.
- Description: Add additional context information to make the data easier to interpret. Example: “Value='0': Manual; Value='1': Idle; Value='2': Automatic” for the variable “CncOperationMode”.
- Unit of the variable: Make sure to keep the units consistent across all data models (e.g. temperature always in °C).
- Default values: Define default values (e.g. “0” for “off” as the default value for the machine status). Default values are particularly important in the following cases:
  - Initialization: The system must start with predefined states.
  - Fallback: Backup values are used if certain data is missing or not available.
- Data type: Make sure that data types are consistent across all data models (e.g. “String” for the machine ID).
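The parameters listed above can be captured in a small, machine-readable definition. The following is a minimal sketch, not a prescribed format: the class names, the model “MachineStatus” and its variables are assumptions chosen for a simple monitoring use case.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class VariableDefinition:
    """One variable of a data model, with the parameters listed above."""
    name: str
    description: str
    data_type: type
    unit: Optional[str] = None  # None for unitless variables
    default: Any = None         # None means "no default defined"


@dataclass(frozen=True)
class DataModel:
    name: str
    description: str
    variables: tuple

    def defaults(self) -> dict:
        """Return initialization/fallback values for variables that define one."""
        return {v.name: v.default for v in self.variables if v.default is not None}


# Hypothetical model for a machine monitoring use case.
machine_status_model = DataModel(
    name="MachineStatus",
    description="Basic status and temperature of a single machine.",
    variables=(
        VariableDefinition("timestamp", "Time of measurement (ISO 8601, UTC).", str),
        VariableDefinition("machineId", "Unique ID of the machine.", str),
        VariableDefinition("machineStatus", "0: off; 1: on.", int, default=0),
        VariableDefinition("temperature", "Spindle temperature.", float, unit="°C"),
    ),
)

print(machine_status_model.defaults())  # {'machineStatus': 0}
```

Defining variables once and reusing the same definitions across models is one way to enforce the naming, unit and type consistency described above.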
Step 5: Extensibility and iterative adaptation of the data model
Data modeling in the Unified Namespace (UNS) is not a one-off process. Rather, data models must be adapted to new requirements over time. It is therefore important that your data model is designed to be expandable and flexible right from the start. An iterative adaptation process, in which feedback is regularly obtained from users and stakeholders, ensures that the data model always meets current requirements. Versioning allows you to track changes to the data model and ensure that previous versions are still supported.
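A simple way to reason about supporting previous versions is a compatibility check between the model version a consumer was built for and the version of an incoming payload. The sketch below assumes a semantic-versioning convention (major.minor.patch, where minor releases only add optional variables); that convention is an assumption, not a UNS requirement.

```python
def parse_version(version: str) -> tuple:
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(consumer_version: str, payload_version: str) -> bool:
    """A consumer can read payloads with the same major version
    and an equal or newer minor version than it was built for."""
    consumer = parse_version(consumer_version)
    payload = parse_version(payload_version)
    return consumer[0] == payload[0] and payload[1] >= consumer[1]


print(is_compatible("1.2.0", "1.3.1"))  # True: only additive changes
print(is_compatible("1.2.0", "2.0.0"))  # False: breaking change
```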
Step 6: Publication in the Unified Namespace (UNS)
Define where the data should be published within the MQTT topic hierarchy. Follow the best practices and guidelines for the design of the topic hierarchy (for details, see “MQTT Topic Namespace: Best Practices & Step by Step Guide”).
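As an illustration, a topic can be assembled from the levels of your hierarchy while rejecting characters that MQTT reserves (“/” as level separator, “+” and “#” as wildcards). The level names below follow an ISA-95-style hierarchy (enterprise, site, area, line, machine) with the data category as the last level; this layout and all names are assumptions for the sketch.

```python
def build_topic(*levels: str) -> str:
    """Join hierarchy levels into an MQTT topic, rejecting reserved characters."""
    for level in levels:
        if not level or any(ch in level for ch in "/+#"):
            raise ValueError(f"invalid topic level: {level!r}")
    return "/".join(levels)


# Hypothetical hierarchy: enterprise/site/area/line/machine/data category
topic = build_topic("acme", "hamburg", "machining", "line1", "cnc07", "telemetry")
print(topic)  # acme/hamburg/machining/line1/cnc07/telemetry
```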
Step 7: Validation of the data model
Validation of the data model is important to ensure that it meets stakeholder requirements. Work closely with the partners involved and test the use case to check that the data model is implemented correctly. Integrate the target system into the Unified Namespace (UNS) and ensure that the data model accurately reflects all specific requirements of the target systems. Once the data model has been successfully validated, you can proceed with the implementation in production.
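Part of this validation can be automated by checking sample payloads against the data model before the target system consumes them. The following sketch uses only the Python standard library; the schema, variable names and sample payloads are hypothetical.

```python
import json

# Hypothetical schema: variable name -> accepted Python type(s) after JSON decoding.
SCHEMA = {
    "timestamp": str,
    "machineId": str,
    "machineStatus": int,
    "temperature": (int, float),  # JSON numbers without a fraction decode to int
}


def validate_payload(raw: str, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    errors = []
    for name, expected in schema.items():
        if name not in payload:
            errors.append(f"missing variable: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
    for name in sorted(payload.keys() - schema.keys()):
        errors.append(f"unexpected variable: {name}")
    return errors


good = ('{"timestamp": "2024-05-01T12:00:00Z", "machineId": "cnc07", '
        '"machineStatus": 1, "temperature": 41.5}')
bad = '{"timestamp": "2024-05-01T12:00:00Z", "machineStatus": "on", "speed": 3}'
print(validate_payload(good, SCHEMA))  # []
print(validate_payload(bad, SCHEMA))
```

For the second payload the function reports the missing variables, the wrong type of “machineStatus” and the unexpected variable “speed”, which gives stakeholders a concrete list to discuss during validation.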
Step 8: Data governance
Establish clear guidelines for the creation, adaptation and deletion of data models and define the corresponding roles and responsibilities. This is the only way to guarantee the quality and consistency of the data in the Industrial Unified Namespace (UNS).
Conclusion
A clearly structured and standardized data model is crucial for the successful implementation and scaling of an Industrial Unified Namespace (UNS). By focusing on specific use cases, step-by-step development and iterative adaptation, even complex requirements can be successfully implemented. The use of recognized industry standards or the creation of your own models forms the basis for a sustainable and flexible architecture.