SEMAPTIC - A Map for Semantic Interoperability

This version:
Version 1.0 - October 2024
Author
Carlos Pereira - INESC TEC
Contributors
José Villar - INESC TEC
License
License here

Abstract

Lorem ipsum dolor sit amet consectetur adipisicing elit. Sunt impedit, placeat quaerat hic in corporis dignissimos ad iste accusantium laudantium ipsam modi temporibus illum non animi, possimus vitae mollitia, dolorem omnis! Eius veritatis ex numquam, voluptatibus repellendus consequatur aspernatur labore maxime, nemo, libero possimus asperiores. Doloremque, nobis veritatis dolorum ullam obcaecati consectetur libero magnam similique excepturi voluptate illo unde non esse eveniet at, blanditiis accusamus. Sed, dolore vero provident perferendis nesciunt ab odit consectetur atque dolorum mollitia saepe neque harum assumenda! Perferendis neque, atque sunt alias quibusdam autem, obcaecati nam, necessitatibus repellat quas molestias est accusamus voluptas nesciunt dolores facilis iusto nulla. Cum excepturi tempore temporibus nisi ducimus alias quae! Asperiores blanditiis veritatis dignissimos totam suscipit, nemo dicta labore ipsam!

Motivation

Why complicate things if it's not needed?

Interoperability

In the context of informatics, “interoperability” refers to the ability of different systems to work together, exchanging and using information. There are several types of interoperability, including:


  • Organizational interoperability: this involves alignment between different organizations, processes, policies, and procedures to support seamless collaboration and information sharing. It may require addressing cultural, legal, or organizational barriers.
  • Technical interoperability: this involves ensuring that different systems or components can communicate with each other and exchange data. It includes compatibility with data formats, protocols, and communication interfaces.
  • Functional interoperability: this involves ensuring that different systems or components can perform required functions or tasks effectively when integrated. It includes compatibility with APIs, interfaces, and service-oriented architectures.
  • Platform interoperability: this refers to the ability of software applications or platforms to integrate with each other, allowing users to access and use functionality across different systems.
  • Temporal interoperability: this ensures that data exchanged between systems remains accurate and relevant over time. It involves managing data currency, consistency, and validity across different systems.
  • Syntactic interoperability: this focuses on ensuring that data exchanged between systems follows a common syntax or structure. It involves adherence to agreed-upon data formats, such as XML, JSON, or CSV.
  • Semantic interoperability: this refers to the ability of systems to understand and interpret the meaning of exchanged data accurately. It involves using common data models, ontologies, vocabularies, and standards to ensure shared understanding.

For the purpose of this document we will focus on semantic interoperability.

Semantic Interoperability

The concept of semantic interoperability has been developed over time by many researchers, organizations, and communities worldwide. The introduction of the Semantic Web concept by Tim Berners-Lee in 1999 contributed significantly to its development and application, and various conceptual and representation models have since been developed in different countries and sectors to address its principles and needs. It continues to evolve as technology advances and the need for more efficient and meaningful data exchange grows.

Semantic interoperability builds upon syntactic interoperability: if systems cannot parse the structure of the data, they cannot hope to grasp its meaning. Once the format is understood, shared meaning is established by adding data about the data (metadata) that links each data element to a controlled, shared vocabulary. The main focus of semantic interoperability is therefore the meaning of the data being exchanged. When two systems exchange semantic data, they also share an understanding of that data, so the consumer can interpret it accurately and use it correctly.

Who needs Semantic Interoperability?

Every system that exchanges data with another system, in any area, can benefit from semantic interoperability. Beyond any specific domain, semantic interoperability offers several advantages:

  • It enables machine-computable logic, inferencing, knowledge discovery, and data federation between information systems.
  • It improves data quality and consistency, reducing errors and misinterpretations.
  • It increases efficiency and productivity by reducing the manual effort required to translate data between systems.
  • It opens doors for new applications and services that rely on seamless data exchange with well-defined semantics.

Disadvantages of Semantic Interoperability

If semantic interoperability is so powerful, why is it not used everywhere and by everyone? It has several drawbacks that limit its adoption:

  • Semantic interoperability requires a shared vocabulary and ontology, which can be complex to develop and implement: it demands a common understanding of the data, its context, and its relationships, and the work must be done by trained personnel.
  • The cost of implementing semantic interoperability is not negligible. Its benefits often accrue in the long run through improved collaboration and efficiency. Short-term benefits might be less apparent, making it a less attractive proposition for some companies.
  • Especially in critical sectors like healthcare or energy, a mistake causing one system to misread information from another could have severe consequences. This risk makes the transition to semantic interoperability slow.
  • Many existing systems were designed before the concept of semantic interoperability was introduced. Retrofitting these systems can be challenging and expensive.
  • While there are standards for semantic interoperability, they may not cover all use cases or may be interpreted differently by different systems.

SEMAPTIC - A new approach for Semantic Interoperability

Within the H2020 ENPOWER Project, one of the goals was to establish data models that would guarantee semantic interoperability for data exchanged between services. Extending existing ontologies or creating new ones helps ensure an interoperability chain. This approach guarantees that data produced by various services can be mapped to suitable classes and properties, facilitating its transformation into semantic data. Semantic data unlocks its full potential when serialized using formats like RDF/XML, Turtle, JSON-LD, or N-Triples based on well-defined ontologies.
A crucial question for this project is: who ultimately benefits from the data transformation? Will a service in the project leverage reasoners to extract deeper meaning from the data, or will the services simply use the transformed data to fulfill their designated functionalities?
A major barrier to efficient data transformation is the complexity of data serialization. Producers need expertise in ontologies to serialize data effectively, while consumers require de-serialization for service integration. This significantly complicates the process.
Therefore, recognizing that services wouldn't utilize reasoners or semantic tools on the data, we developed a novel approach that maintains semantic interoperability without data transformation. This approach, called SEMAPTIC, relies on the construction of an ontological semantic map attached to the data. We explain in the following use cases how SEMAPTIC offers an alternative to "traditional" semantic interoperability, with benefits such as reduced processing burden and simpler implementation.

Semantic Interoperability Use Case - Traditional approach

To maintain simplicity in our use case, let's examine a scenario where one data producer and one data consumer are exchanging data.

Traditional Semantic Interoperability

The following steps must be performed by the data producer:

  • Produce the data.
  • Choose an ontological serialization format (RDF/XML, Turtle, JSON-LD, N-Triples, ...).
  • Serialize the data.
  • Send the serialized data or make it accessible for retrieval.

The following steps must be performed by the data consumer:

  • Get the serialized data.
  • If required, employ semantic reasoners to perform reasoning tasks for data inference and consistency verification.
  • Deserialize the data to make it usable.

In a nutshell:

  • The Data Producer and the Data Consumer need to agree on the serialization format.
  • It is the responsibility of the Data Producer to find or develop ontologies that model all of the parameters of the produced data.
  • The data serialization process is not straightforward and requires dedicated algorithms for conversion. Be aware that serialized data can be considerably larger than the original data, with potential increases of three times or more.
  • The Data Consumer receives the serialized data.
  • The Data Consumer may use reasoners or run smart queries over the data.
  • The Data Consumer deserializes the data and uses it.

Conclusion:

  • If reasoning capabilities are not required for the data, then the serialization and deserialization processes add an unnecessary layer of complexity to the overall process.
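
To give a feel for the size overhead mentioned above, the following minimal Python sketch compares one reading as compact JSON with a hand-written N-Triples rendering of the same reading. The subject IRI, the `observedAt` property, and the choice of datatypes are hypothetical and serve only this illustration; a real producer would rely on an RDF library and its own ontology choices, and the resulting ratio will vary with the modelling.

```python
import json

# Hypothetical IRIs, used only for this illustration.
EX = "http://example.org/pv/"
ENCOM = "http://semanticweb.inesctec.pt/ontologies/encom/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

reading = {"timestamp": "2024-04-25T15:00:00Z", "value": 0.12, "unit": "kWh"}


def to_ntriples(record, subject):
    """Hand-rolled N-Triples for one reading (illustrative modelling only)."""
    s = f"<{subject}>"
    lines = [
        f"{s} <{RDF_TYPE}> <{ENCOM}Energy> .",
        f'{s} <{ENCOM}hasValue> "{record["value"]}"^^<{XSD}decimal> .',
        f'{s} <{ENCOM}hasUnit> "{record["unit"]}" .',
        f'{s} <{EX}observedAt> "{record["timestamp"]}"^^<{XSD}dateTime> .',
    ]
    return "\n".join(lines) + "\n"


plain = json.dumps(reading, separators=(",", ":"))
triples = to_ntriples(reading, EX + "reading/1")

# Compare the character counts of the two representations.
ratio = len(triples) / len(plain)
print(f"JSON: {len(plain)} chars, N-Triples: {len(triples)} chars, ratio: {ratio:.1f}x")
```

The overhead comes mostly from repeating full IRIs in every triple; prefix-based formats such as Turtle reduce it but do not remove the serialization step itself.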

Semantic Interoperability Use Case - SEMAPTIC approach

Lorem ipsum dolor sit amet consectetur, adipisicing elit. Exercitationem consectetur cupiditate autem rem vero corrupti quisquam earum. Dignissimos, assumenda odio quos soluta explicabo consequatur fuga reiciendis. Temporibus voluptatibus laborum dolore, quas fuga, tempore quasi at iure architecto placeat facere animi est asperiores nihil facilis porro labore veritatis sed repellat. Pariatur incidunt nam deleniti officia enim officiis, tempore expedita nihil dicta, quasi voluptatum corrupti, non rem similique dolore earum sit cumque est voluptate natus eos! Inventore corporis mollitia, quod id neque ducimus incidunt ea amet magni maiores rem enim vero recusandae magnam omnis harum saepe dignissimos quia in tenetur aliquid provident!

Traditional Semantic Interoperability

SEMAPTIC - Ontological Semantic Map

Lorem ipsum dolor sit amet, consectetur adipisicing elit. Officiis vitae, blanditiis enim quos animi excepturi, aspernatur quasi a sint labore, earum repellendus et voluptate nisi ad nulla magnam dolore doloribus velit debitis voluptates? Minus voluptate dolor facilis laudantium fugit corrupti, qui amet, officia, et repudiandae necessitatibus? Perspiciatis enim quis commodi at pariatur ipsam deleniti impedit! Suscipit laudantium assumenda quisquam voluptatibus eligendi consectetur iure corporis exercitationem! Dolores obcaecati consequatur quibusdam, quo, labore exercitationem nisi voluptatum dolorum, molestiae qui architecto iste consectetur quidem sunt recusandae error. Voluptatibus amet numquam eaque! Commodi, praesentium molestias saepe consequatur error libero earum rerum odio, dolore ab voluptatem repellendus cum architecto. Iure tenetur aut maxime vel ipsam, eos minus itaque commodi at dolores, veniam deserunt nobis in.

Basic Structure

The basic structure of the ontological semantic map has three main blocks:

                  
      "@semantics": {
         "@prefixes": {       
         },       
         "@context": {       
         },       
         "@annotations": {       
         }
      } 
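
As a minimal sketch of how a consumer might separate the plain data from the map, the Python function below looks for the "@semantics" key and checks that the three blocks are present. The function name `split_payload` is hypothetical, and treating all three blocks as mandatory is an assumption made for this example, not a rule stated by the specification.

```python
REQUIRED_BLOCKS = ("@prefixes", "@context", "@annotations")


def split_payload(payload):
    """Separate the producer data from the "@semantics" map."""
    semantics = payload.get("@semantics")
    if not isinstance(semantics, dict):
        raise ValueError('payload has no "@semantics" object')
    # Assumption: all three blocks must be present, even if empty.
    missing = [block for block in REQUIRED_BLOCKS if block not in semantics]
    if missing:
        raise ValueError(f"missing blocks: {missing}")
    data = {key: value for key, value in payload.items() if key != "@semantics"}
    return data, semantics


payload = {
    "id": "abc123",
    "@semantics": {"@prefixes": {}, "@context": {}, "@annotations": {}},
}
data, semantics = split_payload(payload)
print(data)               # the producer data, untouched
print(sorted(semantics))  # the names of the three blocks
```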
                

Ontology References: the “@prefixes” block

The @prefixes block serves as a dictionary, defining key-value pairs where the "key" is a prefix, and the "value" is the corresponding ontology IRI (identifier). This block reveals the ontologies used in the ontological semantic map. Data producers, as the owners of their data, are responsible for selecting appropriate ontologies that effectively model and provide meaning to all their data parameters.


Example:

      "@prefixes": {
         "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
         "encom": "http://semanticweb.inesctec.pt/ontologies/encom/"
      }
                  

Parameter Mapping: the “@context” block

The @context block builds upon the chosen ontologies to define the specific meaning and context for all data parameters the producer wants to communicate. By leveraging the classes, properties, and individuals available within the selected ontologies, the producer can accurately model their data.


Example:
                    
      "@context": {
         "id": {
            "rdf:type": "encom:PVPanel",
            "encom:hasIdentifier": "$id"
          }
      } 
                  

Annotation Section: the “@annotations” block

The @annotations block allows producers to provide additional metadata about the data being described. This metadata can leverage classes from the Dublin Core ontology and RDF Schema to provide details about the service itself or the data it produces.


Example:
                    
      "@annotations": {
        "dcterms:creator": "Ricardo Silva",
        "rdfs:comment": "Sizing service from INESC TEC"
      } 
                  

Special characters

There are three reserved characters in this specification:


  1. @ (at sign):
    1. Used before the keywords semantics, prefixes, context, and annotations, to indicate the main blocks of the semantic map.
    2. Used before a parameter key as a pointer to it. See Example 4.1.

  2. ? (question mark):
    1. Used as a prefix for auxiliary variables, which are required by the specification in some cases. See Example 3.

  3. $ (dollar sign):
    1. Used before a parameter key to point to its value. See Example 1.
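
The three reserved characters can be told apart with a few lines of code. The Python sketch below is only one possible reading of the rules above; the function name `classify` and the labels it returns are hypothetical.

```python
KEYWORDS = {"@semantics", "@prefixes", "@context", "@annotations"}
MARKERS = {"@": "pointer", "?": "auxiliary", "$": "value"}


def classify(token):
    """Return a (kind, name) pair for a string found in the semantic map."""
    if token in KEYWORDS:
        return ("keyword", token[1:])
    if len(token) > 1 and token[0] in MARKERS:
        return (MARKERS[token[0]], token[1:])
    # Anything else is treated as an ontology term such as "encom:Energy".
    return ("term", token)


for token in ("@context", "@unit", "?beg", "$id", "encom:Energy"):
    print(token, "->", classify(token))
```

Checking the keywords first keeps "@context" from being mistaken for a pointer to a parameter named "context".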


Example

The following JSON structure will be used to construct a complete semantic specification covering all possible cases. It represents an array of energy values produced by a solar panel.

                
      {
         "id": "abc123",
         "data": [
            {
               "timestamp": "2024-04-25T15:00:00Z",
               "value": 0.12,
               "unit": "kWh"
            },
            {
               "timestamp": "2024-04-25T15:15:00Z",
               "value": 0.17,
               "unit": "kWh"
            }
         ]
      } 
              


To build the ontological semantic specification map we need to perform two initial steps:

  • Extract all key parameters from our JSON structure.
  • Choose the ontologies that best model the data.

  • From the JSON structure we can easily see that the parameters are:
    • "id"
    • "data"
    • "timestamp"
    • "value"
    • "unit"

  • Chosen ontologies:
    • “rdf”: “http://www.w3.org/1999/02/22-rdf-syntax-ns#”
    • RDF (Resource Description Framework) is a standard model for data interchange on the web. It structures data as triples (subject, predicate, object) to represent information about resources in graph form.

    • “time”: “http://www.w3.org/2006/time#”
    • “Time” is an ontology of temporal concepts for describing the temporal properties of resources. It provides a vocabulary for expressing facts about relations among instants and intervals, durations, and date-time information.

    • “encom”: “http://semanticweb.inesctec.pt/ontologies/encom/”
    • “Encom” is an ontology designed to represent and understand the complex relationships within energy communities. It provides a structured approach to model various entities such as DSOs, aggregators, members, energy sources, energy storage, consumption patterns, etc. This ontology serves as a foundation for data integration, knowledge discovery, and decision-making in the realm of energy communities. It is a key tool for promoting sustainable and efficient energy practices.
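
The first step above, extracting the key parameters, can be sketched in Python as a recursive walk over the parsed JSON. The helper name `collect_keys` is hypothetical; the sample data is the structure introduced at the start of this example.

```python
def collect_keys(node, found=None):
    """Walk a parsed JSON value and return its object keys in first-seen order."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key not in found:
                found.append(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


sample = {
    "id": "abc123",
    "data": [
        {"timestamp": "2024-04-25T15:00:00Z", "value": 0.12, "unit": "kWh"},
        {"timestamp": "2024-04-25T15:15:00Z", "value": 0.17, "unit": "kWh"},
    ],
}
print(collect_keys(sample))  # ['id', 'data', 'timestamp', 'value', 'unit']
```

Each key is reported once, however many records the array holds, which matches the idea that the map describes parameters rather than individual values.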


“@prefixes” block example

After choosing the ontologies, the "@prefixes" block is already complete, since it only lists the prefixes of the ontologies used in the specification.


      "@prefixes": {
         "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
         "time": "http://www.w3.org/2006/time#",
         "encom": "http://semanticweb.inesctec.pt/ontologies/encom/"
      } 
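
A consumer can use the "@prefixes" dictionary to turn a prefixed term into a full IRI by simple concatenation, which is why each IRI should end with its separator ("#" or "/"). The Python sketch below assumes this convention; the helper name `expand` is hypothetical.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "time": "http://www.w3.org/2006/time#",
    "encom": "http://semanticweb.inesctec.pt/ontologies/encom/",
}


def expand(term, prefixes):
    """Expand a prefixed term such as "rdf:type" into a full IRI."""
    prefix, separator, local_name = term.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"no declared prefix for {term!r}")
    # strip() guards against stray whitespace around a declared IRI.
    return prefixes[prefix].strip() + local_name


print(expand("rdf:type", PREFIXES))
print(expand("encom:Timeseries", PREFIXES))
```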
              

“@context” block example

The "@context" block is used to give meaning to all the parameters. Whether individually or together (when that makes sense), the parameters must be correctly modeled.

Example 1

The value of the "id" parameter, in our example, represents the identifier of a PV panel.


     {
        "id": "abc123"
     }
                

We need to find in some ontology:

  • A “class” that gives the key “id” its type.
  • A “property” that carries the value of “id”.

The character “$” is reserved. When used before the parameter key, it points to the value of that key. In this example, “$id” has value “abc123”. Putting all this together, a possible semantic specification for "id":"abc123" is:

                  
     "id": {  
        "rdf:type": "encom:Identificator",
        "encom:hasIdentificator": "$id" 
     } 
                

Example 2

The "data" parameter, in our example, is the key that identifies an array.

                  
     "data": [
        {
           "timestamp": "2024-04-25T15:00:00Z",
           "value": 0.12,
           "unit": "kWh"
        },
        …
     ]  
                

To model this parameter properly we must look inside the array for a pattern. We can see that it is a time series, so we must find an appropriate ontological class.

                  
     "data": {
        "rdf:type": "encom:Timeseries"
     } 
                

Example 3

A “timestamp” can mean many things and must be modeled carefully. For example, a timestamp can represent an instant, a creation time, the beginning of an interval, the end of an interval, etc.

                  
     {
        "timestamp": "2024-04-25T15:00:00Z"   
     } 
                

In this example, "timestamp" represents the beginning of an interval. However, the value of the "timestamp", i.e. "$timestamp", has a specific format that needs to be modeled. To achieve this, we introduce an auxiliary variable that can hold additional properties.
The character "?" identifies these auxiliary variables. For instance, "?beg" serves as an auxiliary variable. This variable is specifically created to accommodate the "time:inXSDDateTimeStamp" property, which defines the data type for "$timestamp".
It is important to note that if an auxiliary variable is created, then it must also be modeled.


     "timestamp": {
        "rdf:type": "time:Interval",
        "time:hasBeginning": "?beg",
        "?beg": {
           "time:inXSDDateTimeStamp": "$timestamp"
        }
     }
                

Example 4

Since the value and the unit parameters are related, they need to be modeled carefully.

                  
     { 
        "value": 0.12,
        "unit": "kWh"
     } 
                

The semantic specification of this example depends on the use case. There are at least two viable alternatives:

Example 4.1

In some use cases the unit may change within the time series. For example, we may have some values with unit “kWh” and others with unit “Wh”. In this case we model the two parameters separately and link them using the special character “@”:

                  
    "value": {
       "rdf:type": "encom:Energy",
       "encom:hasValue": "$value",
       "encom:hasUnit": "@unit"
    },
    "unit": {
       "encom:hasValue": "$unit"
    } 
                  

Example 4.2

When the unit is fixed, we can link the two related parameters directly, avoiding the need to model the “unit” parameter:

                  
    "value": {
       "rdf:type": "encom:Energy",
       "encom:hasValue": "$value",
       "encom:hasUnit": "encom:kWh"
    } 
                  

In this example, since we know the exact unit, we use an "individual" value from the ontology to model the "kWh" unit, thus avoiding processing the value of “unit”, i.e. “$unit”.
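
One possible consumer-side reading of Examples 4.1 and 4.2 is sketched below in Python. It assumes that "$name" is replaced by the value of that parameter in the current record, that "@name" is replaced by the resolved entry of the parameter it points to, and that anything else is an ontology term kept as-is. The function name `resolve` is hypothetical, and the sketch does not guard against circular pointers.

```python
def resolve(key, context, record):
    """Resolve one "@context" entry against one data record."""
    resolved = {}
    for prop, target in context[key].items():
        if isinstance(target, str) and target.startswith("$"):
            # "$name": take the value of that parameter from the record.
            resolved[prop] = record[target[1:]]
        elif isinstance(target, str) and target.startswith("@"):
            # "@name": follow the pointer to another parameter's entry.
            resolved[prop] = resolve(target[1:], context, record)
        else:
            # Ontology term (class, property value, or individual).
            resolved[prop] = target
    return resolved


record = {"timestamp": "2024-04-25T15:00:00Z", "value": 0.12, "unit": "kWh"}

# Example 4.1: the unit is read from the data through the "@unit" pointer.
variable_unit = {
    "value": {
        "rdf:type": "encom:Energy",
        "encom:hasValue": "$value",
        "encom:hasUnit": "@unit",
    },
    "unit": {"encom:hasValue": "$unit"},
}

# Example 4.2: the unit is fixed to an individual from the ontology.
fixed_unit = {
    "value": {
        "rdf:type": "encom:Energy",
        "encom:hasValue": "$value",
        "encom:hasUnit": "encom:kWh",
    }
}

print(resolve("value", variable_unit, record))
print(resolve("value", fixed_unit, record))
```

With the fixed unit, the "unit" field of the record is never read, which is the saving described above.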



“@annotations” block example

The "@annotations" block facilitates the inclusion of non-schematic metadata through free-text annotations. These annotations may reference terms from ontologies like Dublin Core (https://www.dublincore.org/specifications/dublin-core/dcmi-terms/) and RDFS (https://www.w3.org/TR/rdf12-schema/).

                  
    "@annotations": {
      "dcterms:creator": "Ricardo Silva",
      "rdfs:comment": "Sizing service from INESC TEC"
    } 
                

Data + Specification map

The ontological specification map is then attached to the data and transmitted to the data consumer.

                
      {
         "id": "abc123",
         "data": [
            {
               "timestamp": "2024-04-25T15:00:00Z",
               "value": 0.12,
               "unit": "kWh"
            },
            {
               "timestamp": "2024-04-25T15:15:00Z",
               "value": 0.17,
               "unit": "kWh"
            }
         ],
         "@semantics": {
            "@prefixes": {
               "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
               "time": "http://www.w3.org/2006/time#",
               "encom": "http://semanticweb.inesctec.pt/ontologies/encom/"
            },
            "@context": {
               "id": {
                  "rdf:type": "encom:Identificator",
                  "encom:hasIdentificator": "$id"
               },
               "data": {
                  "rdf:type": "encom:Timeseries"
               },
               "timestamp": {
                  "rdf:type": "time:Interval",
                  "time:hasBeginning": "?beg",
                  "?beg": {
                     "time:inXSDDateTimeStamp": "$timestamp"
                  }
               },
               "value": {
                  "rdf:type": "encom:Energy",
                  "encom:hasValue": "$value",
                  "encom:hasUnit": "encom:kWh"
               }
            },
            "@annotations": {
               "dcterms:creator": "Ricardo Silva",
               "rdfs:comment": "Sizing service from INESC TEC"
            }
         }
      }
                

Summary

  • There is no data transformation.
  • The ontological semantic map (SEMAPTIC) is produced only once.
  • The ontological semantic map is the same whether the time series has 1 day or 5 years of data.
  • Since there is no serialization involved, the consumer won't need an additional service to deserialize the data.
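
The points above can be illustrated with a short Python sketch that attaches the same map to a one-day and a five-year series. The readings are synthetic placeholders generated at 15-minute intervals, the map is abridged to a single "@context" entry, and the helper names `synthetic_readings` and `attach` are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

# The map is written once by the producer (abridged for brevity).
SEMANTIC_MAP = {
    "@prefixes": {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "encom": "http://semanticweb.inesctec.pt/ontologies/encom/",
    },
    "@context": {
        "value": {
            "rdf:type": "encom:Energy",
            "encom:hasValue": "$value",
            "encom:hasUnit": "encom:kWh",
        }
    },
    "@annotations": {},
}


def synthetic_readings(count):
    """Generate placeholder 15-minute readings (the values are synthetic)."""
    start = datetime(2024, 4, 25, 15, 0, tzinfo=timezone.utc)
    return [
        {
            "timestamp": (start + timedelta(minutes=15 * i)).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": 0.12,
            "unit": "kWh",
        }
        for i in range(count)
    ]


def attach(data, semantic_map):
    """Return the data unchanged, with the map added under "@semantics"."""
    return {**data, "@semantics": semantic_map}


one_day = attach({"id": "abc123", "data": synthetic_readings(96)}, SEMANTIC_MAP)
five_years = attach({"id": "abc123", "data": synthetic_readings(96 * 365 * 5)}, SEMANTIC_MAP)

map_size = len(json.dumps(SEMANTIC_MAP))
for label, payload in (("1 day", one_day), ("5 years", five_years)):
    total = len(json.dumps(payload))
    print(f"{label}: payload {total} chars, of which map {map_size} chars")
```

The data array grows with the number of readings while the map stays the same, so its relative cost shrinks as the series gets longer.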