Chapter 3
Measuring Interoperability

When you take apart a ship plank by plank and assemble it again, is it still the same ship?

― Theseus’s paradox.

In order to maximize potential reuse, an open data publisher wants its data source to be as interoperable – on all levels – as possible with other datasets world-wide. Currently, there is no way to identify relevant datasets to be interoperable with, and there is no way to measure the interoperability itself. In this chapter we discuss the possibility of comparing identifiers used within various datasets as a way to measure semantic interoperability. We introduce three metrics to express the interoperability between two datasets: the identifier interoperability, the relevance, and the number of conflicts. The metrics are calculated from a list of statements which indicate, for each pair of identifiers in the system, whether they identify the same concept or not. The effort to collect these statements is high, even though the approach not only identifies relevant datasets, but also provides machine-readable feedback to the data maintainer. We will therefore also look at qualitative methods to study the interoperability within a large organization.

When the interoperability between two or more datasets is raised, the cost of adoption lowers: a user agent that can process one dataset will be able to ask questions over multiple datasets without big investments. If only we could become more interoperable with all other datasets online, our data would be picked up by all existing user agents. Of course, this is not a one-way effort: other datasets also need to become more interoperable with ours, and with all other datasets on the Web. It becomes a quadratically complex problem, where each dataset needs to adopt ideas from other datasets, managed by different organizations with different ideas. In this chapter, we will look at ways to measure the interoperability of datasets, which is in the same way a quadratic problem, as each dataset needs to be compared with all other datasets.

How can we measure the impact of a certain technology on interoperability? Our first effort, in 2014, was to compare the identifiers of available open datasets across different cities [1].

Comparing identifiers

A simplistic approach would be to classify the relations between the identifiers (ids) of two datasets into four categories. One dataset can contain the same identifier as the other dataset. When this identifier identifies the same real-world thing in both datasets, we call this a correct id match. When it identifies a different real-world thing, we call this a false id match or an id conflict. In the same way, we have a correct different id and a false different id.

Table 1: A first dataset about the city of Ghent

id   long   lat     type_sanit   fee    id_ghent
1    3.73   51.06   new_urinal   free   PS_151

In Table 1, we give an example of an open dataset of public toilets in the city of Ghent. In Table 2, a similar dataset about the city of Antwerp is given. Every element that is not a literal value, such as “3.73”, is an identifier. Identifiers may be elements of the data model, as well as real-world objects described within the dataset. In these two tables, we can label some identifiers as conflicting, other elements as correct id matches, and others as correct different ids or false different ids. However, the labeling can happen differently depending on the domain expert. Especially when “loose semantics” are used, it becomes difficult to label an identifier as identifying “the same as” another identifier.

Table 2: A second dataset, now about the city of Antwerp

id   long   lat     type     fee    description
1    4.41   51.23   urinal   none   Hessenhuis

An initial metric

In that research paper [1], for the sake of simplicity, we assumed that domain experts would be able to tell whether two identifiers are “the same as” or “not the same as” each other. Given a list of statements classifying these identifiers into the four categories, we would be able to derive a metric for the interoperability of these two datasets. The first metric we introduced was the id ratio.

id\% = \frac{\#(\text{correct id matches})}{\#(\text{real-world concept matches})}

The identifier ratio id%

We apply this formula on top of our two example datasets in Table 1 and Table 2. The correct identifier matches are: “id”, “long”, “lat”, and “fee” (4). The only conflict is the identifier “1”, which identifies a different toilet in each dataset (1). Of course, this depends on our view of the world and our definitions of the things within these datasets. The real-world concept matches are: “id”, “long”, “lat”, “type/type_sanit”, “fee”, and “urinal/new_urinal” (6). This initial metric would thus score these two datasets as 66% interoperable, with 1 conflict.
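
As a minimal sketch of this calculation (the category labels and the list of statements below are hypothetical and merely restate the Table 1 and Table 2 example), the id% can be derived from manually labelled statements:

```python
# Labels for the four categories of identifier relations (hypothetical names).
CORRECT_ID_MATCH = "correct id match"          # same identifier, same real-world thing
ID_CONFLICT = "id conflict"                    # same identifier, different real-world thing
CORRECT_DIFFERENT_ID = "correct different id"  # different identifiers, different things
FALSE_DIFFERENT_ID = "false different id"      # different identifiers, same thing

# Statements for the Table 1 / Table 2 example: (identifier in Ghent, identifier in Antwerp, label).
statements = [
    ("id", "id", CORRECT_ID_MATCH),
    ("long", "long", CORRECT_ID_MATCH),
    ("lat", "lat", CORRECT_ID_MATCH),
    ("fee", "fee", CORRECT_ID_MATCH),
    ("type_sanit", "type", FALSE_DIFFERENT_ID),
    ("new_urinal", "urinal", FALSE_DIFFERENT_ID),
    ("1", "1", ID_CONFLICT),
]

def id_ratio(statements):
    """id% = #(correct id matches) / #(real-world concept matches)."""
    correct_matches = sum(1 for _, _, label in statements if label == CORRECT_ID_MATCH)
    # A real-world concept match is any pair that identifies the same thing,
    # whether or not the identifiers themselves are equal.
    concept_matches = sum(1 for _, _, label in statements
                          if label in (CORRECT_ID_MATCH, FALSE_DIFFERENT_ID))
    return correct_matches / concept_matches

conflicts = sum(1 for _, _, label in statements if label == ID_CONFLICT)
print(f"id% = {id_ratio(statements):.1%}, conflicts = {conflicts}")  # id% = 66.7%, conflicts = 1
```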

In a simple experiment, we asked programmers to give a score for the interoperability between 5 different datasets and a reference dataset. The outcome revealed clear design issues with this metric: when the number of real-world matches is low, the score is quickly skewed, as the number of samples to be tested is low. A full report on this experiment, together with the research data, can be found in the GitHub repository pietercolpaert/iiop-demo.

Identifier interoperability, relevance, and number of conflicts

There are two problems with the id%. First, the id% may be calculated for two datasets which are not at all relevant to compare. This can lead to an inaccurately high or low interoperability score, as the number of real-world concept matches is low. Second, when different datasets are brought together, some identifiers are used more than others. An identifier which is used only once has the same weight as an identifier that is used in almost all facts.

Instead of calculating an identifier ratio, we introduce a relevance (ρ) metric. A higher score on this metric means that two datasets are relevant matches to be merged. We can now introduce two types of relevance: the relevance of these two datasets as is (ρ_identifiers), and the relevance of these two datasets when all identifiers would be interoperable (ρ_real-world). We define ρ as the number of occurrences of matching identifiers when both datasets are merged and expressed in triples. The ρ_identifiers in our example becomes 8, as 2 times 4 statements can be extracted from the merged table. The ρ_real-world becomes 12, as the number of occurrences that would remain when the dataset would be 100% interoperable is 2 times 6. We then define the Identifier Interoperability (iiop) as the ratio between these two relevance numbers. For this dataset, the iiop becomes 66%: coincidentally the same as the id%, although with more rows in the dataset it would quickly be different. When repeating the same experiment as with the id%, we now see a credible interoperability metric when comparing the ρ_real-world, the number of conflicts, and the iiop.
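
A minimal sketch of how ρ_identifiers, ρ_real-world, the number of conflicts, and the iiop could be derived from labelled statements; the occurrence counts below are hypothetical and simply restate the merged Table 1 / Table 2 example:

```python
# Each statement: (identifier in Ghent, identifier in Antwerp,
#                  same real-world concept?, identifiers equal?, occurrences in the merged triples)
statements = [
    ("id",         "id",     True,  True,  2),
    ("long",       "long",   True,  True,  2),
    ("lat",        "lat",    True,  True,  2),
    ("fee",        "fee",    True,  True,  2),
    ("type_sanit", "type",   True,  False, 2),
    ("new_urinal", "urinal", True,  False, 2),
    ("1",          "1",      False, True,  2),  # an id conflict
]

# Relevance of the datasets as is: occurrences of identifiers that already match.
rho_identifiers = sum(n for _, _, same, equal, n in statements if same and equal)
# Relevance if all identifiers were interoperable: occurrences of all real-world matches.
rho_real_world = sum(n for _, _, same, _, n in statements if same)
# Conflicts: equal identifiers that denote different real-world things.
conflicts = sum(1 for _, _, same, equal, _ in statements if equal and not same)
# Identifier interoperability is the ratio of the two relevance numbers.
iiop = rho_identifiers / rho_real_world

print(f"rho_identifiers = {rho_identifiers}, rho_real-world = {rho_real_world}")
print(f"iiop = {iiop:.1%}, conflicts = {conflicts}")
# rho_identifiers = 8, rho_real-world = 12, iiop = 66.7%, conflicts = 1
```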

While these three metrics may sound straightforward, gathering the statements appears to be a tedious task. What is the threshold to decide whether two identifiers are the same as or not the same as each other [2]? A philosophical discussion arises – cf. Theseus’s paradox – whether something identified by one party can truly be the same as something identified by someone else. Researchers should be cautious with these statements, as a “same as” statement is prone to interpretation.

The role of Linked Open Data

Conflicts are the easiest to resolve. Instead of using local identifiers when publishing the data, all identifiers can be preceded by a base uri. E.g., instead of “1”, an identifier can become http://{myhost}/{resources}/1. This simple trick reduces the number of conflicts to zero. In order to have persistent identifiers (persistent identifiers are identifiers that stand the test of time: cool uris don’t change), organizations introduce a uri strategy. This strategy contains the rules needed to build new identifiers for the datasets maintained within the organization.
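
A minimal sketch of this trick; the base uri below is a hypothetical example of what a uri strategy could prescribe:

```python
# Hypothetical base uri, as prescribed by an organization's uri strategy.
BASE_URI = "http://example.org/resources/"

def to_global_id(local_id: str) -> str:
    """Turn a dataset-local identifier such as "1" into a globally unique http uri."""
    return BASE_URI + local_id

print(to_global_id("1"))  # http://example.org/resources/1
```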

Another problem that arose was that a third party cannot be entirely sure what was intended with a certain identifier. Linked Data documents these terms by providing a uniform interface for resolving the documentation of these identifiers: the http protocol. This way, a third party can be certain of the meaning, and when the data is not linked yet, it can link the data itself more easily by comparing the definitions. The more relevant extra data is provided with such a definition, the easier linking datasets should become.
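
As an illustration, assuming the identifier dereferences to an rdf document (the dcterms term used here is only an example), a third party could look up the documentation of a term as follows:

```python
# Requires the rdflib library. We assume the uri dereferences to an rdf representation.
from rdflib import Graph, URIRef
from rdflib.namespace import RDFS

term = URIRef("http://purl.org/dc/terms/title")

g = Graph()
g.parse(str(term))  # performs an http request and parses the returned rdf

for label in g.objects(term, RDFS.label):
    print("label:", label)
for comment in g.objects(term, RDFS.comment):
    print("definition:", comment)
```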

Linked Data helps to solve semantic interoperability, for which the identifier interoperability is just an indicator. By using http uris, it becomes possible to avoid conflicts, at least when an authority does not create one identifier with multiple conflicting definitions, and when the definition, resolvable through the uri, is clear enough for third parties not to misinterpret it. The effort it would take for data publishers to document their datasets better and to provide “same as” statements with other data sources is similar to the effort of providing uris within the data and linking them with existing datasets from the start. However, the theoretical framework we built provides an extra motivation for Linked Data.

Studying interoperability indirectly

It is understandable that policy makers today invest time and money in raising interoperability, rather than in measuring its current state. When no globally unique identifiers are used, however, can we still work on semantic interoperability? Even more generally, is interoperability a problem data publishers are worried about at all? Or is their foremost concern to comply with regulations to publish data as is? In this section, we measure interoperability indirectly, by trying to find qualitative answers to these questions: studying the adoption by third parties, studying the technology of published datasets, and finally studying the organizations themselves.

Reuse happens when the benefits outweigh the cost of adoption: when more third-party adoption can be seen while the benefits did not change, we can conclude that the cost of adoption dropped and thus that the datasets became more interoperable. We can perform interviews with market players and ask them how easy it is to adopt certain datasets today. This is an interesting post-assessment. In a study we executed within the domain of multimodal route planners [3], only a limited number of market players could be identified that reused the datasets. Each time, reusers would replicate the entire dataset locally. We concluded that today only companies with large resources can afford to reuse data. This is again evidence that the cost of adoption, and thus the interoperability problems, need to be lowered.

When studying data policies today, we can observe a certain maturity on the five layers of data source interoperability. For example, when a well-known open license is used, we can assume the legal interoperability is higher than with other datasets. When the data is publicly shared on the Web and accessible through a url, we can say the technical interoperability is high, and when documents and server functionality are documented through hypermedia controls, we can assume the querying interoperability is high. At some level, in order to raise interoperability, we have to make a decision for a certain standard or technology together. From an academic perspective, we can only verify that such a decision does not hinder other interoperability layers. When studying the interoperability, we could then give a higher score to technologies that are already well adopted. Today, these technologies would be http as a uniform interface, rdf as a framework to raise the semantic interoperability, and the popular Creative Commons licenses for the legal aspect. With the upcoming http/2 standard, all Web communication will be secure by default (https); in the rest of this book, we will use http as an acronym for the Web’s protocol, yet we will assume the transport layer is secure. In the next chapter, we will study organizations based on their reasoning to set up a certain kind of service today. Epistemologically, new insights can arise with the data owner on how to publish data in a more interoperable way.

Qualitatively measuring and raising interoperability on the 5 layers

Aspects such as the organizational, cultural, or political environment may be enablers for higher interoperability on several levels, but should as such not be taken into account when studying data source interoperability itself. In European frameworks such as isa² and eif, political interoperability is added as well. When qualitatively studying dataset interoperability, we can discuss each of the five layers separately. In this section we give an overview of all layers and how they can be used in interviews. Depending on the quality that we want to reach in our information system, we may be more strict on each of the levels, and thus for every project a different kind of scoring card can be created. This is in line with other interoperability studies [4], which overall agree that interoperability is notoriously difficult to measure, and thus qualitatively set the expectations for every project.

Legal interoperability

Political, organizational, or cultural aspects may influence legal interoperability. When studied on a global scale, there is a political willingness to reach cooperation on ipr frameworks. The better the overarching ipr framework, such as a consensus on international copyright law, the better the legal interoperability that is achieved. Today, different governments still argue they need to build their own licenses for open data publishing, which, however, lowers the legal interoperability again. When there is a reason to lower the interoperability, the interviewer should ask whether and why, in the opinion of the interviewee, these reasons outweigh the disadvantages of lower interoperability.

When only considering open datasets, measuring the legal interoperability boils down to making sure that a machine can understand what the reuse rights are. The first and foremost measurement we can do is testing whether we can find an open license attached to the data source. The Creative Commons open licenses, for example, each have a uri, which dereferences to an rdf serialization that provides a machine-readable description of the license.
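
A minimal sketch of such a test, assuming the dataset is described in dcat and carries a dcterms:license statement (the dataset description below is a hypothetical example):

```python
# Requires the rdflib library. The dcat description below is a hypothetical example.
from rdflib import Graph, URIRef
from rdflib.namespace import DCTERMS

description = """
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/dataset/toilets> a dcat:Dataset ;
    dcterms:license <http://creativecommons.org/licenses/by/4.0/> .
"""

g = Graph()
g.parse(data=description, format="turtle")

license_uri = g.value(URIRef("http://example.org/dataset/toilets"), DCTERMS.license)
if license_uri is not None:
    print("machine-readable license found:", license_uri)
else:
    print("no machine-readable license attached")
```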

Technical interoperability

Next, we can check how the data is made available. When the dataset can be reached through a protocol everyone speaks, the technical interoperability is 100%. Today, we can safely assume everyone knows the basics of the http protocol. When a dataset has a direct url over which it can be downloaded, we can say it is technically interoperable. The technical interoperability can rise even further when we also refer to related pages in the response headers or body of the response, yet we study these aspects under the querying interoperability layer. An interesting test for publishing data in a technically interoperable way is that, given a url, one should be able to download the entire dataset.
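
A minimal sketch of this test; the url is a hypothetical placeholder:

```python
# Requires the requests library. The url is a hypothetical placeholder.
import requests

DATASET_URL = "https://example.org/data/toilets.csv"

response = requests.get(DATASET_URL, timeout=30)
if response.ok:
    print(f"downloaded {len(response.content)} bytes "
          f"({response.headers.get('Content-Type', 'unknown type')})")
else:
    print("dataset not reachable over http:", response.status_code)
```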

Syntactic interoperability

The syntactic interoperability of two datasets is maximized when both datasets use the same serialization. In the case of Open Data, the syntactic interoperability is not worth measuring as such: given a certain library in a certain programming language, every syntax can just be read into memory without additional costs (e.g., the difference in cost between parsing xml and json will be marginal). Quantitatively studying the syntactic interoperability thus involves studying whether the actual data is machine readable at all, in accordance with the third star of Tim Berners-Lee.
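
As a small illustration of why the parsing cost is marginal, the same hypothetical record can be read from json and from csv into identical in-memory structures:

```python
# The same hypothetical record, serialized as json and as csv.
import csv, io, json

as_json = '[{"id": "1", "long": "3.73", "lat": "51.06", "fee": "free"}]'
as_csv = "id,long,lat,fee\n1,3.73,51.06,free\n"

from_json = json.loads(as_json)
from_csv = list(csv.DictReader(io.StringIO(as_csv)))

print(from_json == from_csv)  # True: same content, only the serialization differs
```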

However, much again depends on the intent. Suppose it is the core task of a service to build spreadsheets with statistics from various sources, and, because these statistics are summaries used to, for instance, answer parliamentary questions, these spreadsheets in a closed format need to be opened up. Will it be an added value if another government service now also provides a machine-readable version of these datasets? Researchers should be careful in interpreting what the outcome of blindly measuring interoperability means for the internal government processes. Quick wins are not always the best solution: raising syntactic interoperability – and other kinds of interoperability as well – means changing internal processes and software. A holistic view of the processes is needed. Interviews with data managers may thus be more constructive than merely measuring the syntactic interoperability.

Some syntaxes allow semantic mark-up, while other specifications and standards for specific user agents do not allow a standard way of embedding triples. Still, in this case, the identifiers for documents and real-world objects should remain technically interoperable. We could therefore measure how ready a certain government is for content negotiation and whether there is a strategy for maintaining identifiers in the long run.
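
A minimal sketch of such a content negotiation check; the uri is a hypothetical placeholder and only two representations are probed:

```python
# Requires the requests library. The uri is a hypothetical placeholder.
import requests

URI = "https://example.org/id/toilet/1"

for accept in ("text/html", "text/turtle"):
    response = requests.get(URI, headers={"Accept": accept}, timeout=30)
    served = response.headers.get("Content-Type", "unknown")
    print(f"asked for {accept} -> got {served} (status {response.status_code})")
```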

Semantic interoperability

An indicator for semantic interoperability can be created by comparing identifiers, as shown in the first section. However, we can also qualitatively assess how well semantic interoperability issues are dealt with on a less fundamental level. With a closed world assumption, we can create a contract between all parties that defines which set of terms is going to be used. We can see evidence of this in specific domains where standards are omnipresent: within the transport domain, for example, datex2 and gtfs are good examples. However, from the moment this data needs to be combined within an open world – answering questions over the borders of datasets – these standards fail at making the connection.

Essentially, researching the semantic interoperability boils down to studying how well identifiers are regulated within an information system. Again, a qualitative approach can be taken as well. A perfect world does not exist, but we can interview governmental bodies to find out why they made certain decisions towards a solution, and look at whether semantic interoperability was a problem they tried to tackle somehow.

Without rdf there is no automated way to find semantically interoperable properties or entities. Depending on the qualitative study, and on whether semantic interoperability is identified as an issue, we can decide to introduce this technology to the organization, if it is not yet known. Even when a full rdf solution is in place, the semantic interoperability can still vary across datasets.
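
As an illustration of what rdf makes possible, shared properties and explicit owl:sameAs links between two datasets (both graphs below are hypothetical miniature examples) can be found automatically:

```python
# Requires the rdflib library. Both graphs are hypothetical miniature datasets.
from rdflib import Graph
from rdflib.namespace import OWL

ghent = Graph().parse(data="""
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
<http://stad.gent/id/toilet/PS_151> geo:long "3.73" ; geo:lat "51.06" .
""", format="turtle")

antwerp = Graph().parse(data="""
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<http://antwerpen.be/id/toilet/hessenhuis> geo:long "4.41" ; geo:lat "51.23" ;
    owl:sameAs <http://example.org/poi/hessenhuis> .
""", format="turtle")

# Properties that are already semantically interoperable: identical uris in both graphs.
shared_properties = set(ghent.predicates()) & set(antwerp.predicates())
print("shared properties:", shared_properties)

# Entities that were explicitly linked by the data maintainer.
print("explicit sameAs links:", list(antwerp.subject_objects(OWL.sameAs)))
```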

Querying interoperability

The ldf axis – illustrated again in figure 2 – was used throughout to discuss the queryability of interfaces. When only a query interface is offered, we would – as a quick win – also indicate the possibility of offering data dumps. Hypermedia apis – what this axis advocates for – are not yet part of off-the-shelf products, which makes it particularly difficult for organizations to explore other options.

[Figure 2: the Linked Data Fragments axis, with a data dump on one end (high client cost, high availability, high bandwidth) and a query interface on the other (high server cost, low availability, low bandwidth)]
The Linked Data Fragments (ldf) idea plots the expressiveness of a server interface on an axis. On the far left, an organization can decide to provide no expressiveness at all, by publishing one data dump. On the far right, the server may decide to answer any question for all user agents.

The best example of full querying interoperability would be that, merely by publishing the data using the http protocol, the data becomes automatically discovered and used by existing user agents.
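
A minimal sketch of such a discovery probe, assuming a hypothetical dataset uri and looking only at http Link headers as one possible kind of hypermedia control:

```python
# Requires the requests library. The uri is a hypothetical placeholder.
import requests

DATASET_URI = "https://example.org/data/toilets"

response = requests.get(DATASET_URI, timeout=30)
if response.links:
    for rel, link in response.links.items():
        print(f"hypermedia control found: rel={rel} -> {link.get('url')}")
else:
    print("no Link headers found; discovery relies on out-of-band documentation")
```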

Conclusion

Raising interoperability entails making it easier for all user agents on the Web to discover, access and understand your data. We explored different ways to measure interoperability between two datasets. For Open Data however, we would then need to generalize – or scale up – these approaches to interoperability between a dataset and “all other possible open datasets”.

A first way – not advised due to the investments needed – is to compare identifiers between two datasets or systems. When the same identifiers are used across the two datasets, we can assume a high interoperability. However, whether identifiers actually identify the same thing is prone to interpretation. We see this exercise mainly as proof that Linked Data is the right way forward: by using http uris, we use the uniform interface of the Web to document identifiers. Furthermore, using Web addresses instead of local identifiers makes sure another Web framework – rdf – can be used to describe the relation between different real-world objects. Through comparing identifiers, we showed the importance of rdf for raising the semantic interoperability. Only time will tell whether using Web addresses for identifiers – and consequently rdf – will become the norm for all aspects. Well-established standards will then have to evolve towards rdf as well. Also within the legal, technical, syntactic, and querying interoperability, rdf may play its role. Without rdf, we would have to rely on a different mechanism to retrieve machine-readable license information, or on a specification that reintroduces syntax rules for hypermedia.

A second way to measure interoperability between datasets is to study the effects. The fact that Open Data by definition should allow data to be redistributed and mashed up with other datasets means it becomes hard to automatically count each access to a data fact coming from your original dataset. A successful open data policy can thus be measured by the number of parties that declare that they reuse the dataset. Interviewing the parties that voluntarily declared this fact may result in interesting insights on how to raise the interoperability. However, this is a post-assessment, only possible when an Open Data policy (or a data sharing policy) is already in place.

A third way is to interview data owners within an organization on their own vision on Open Data. During the interviews, questions can be asked on why certain decisions have been taken, each time categorizing the answer under an interoperability layer. This is an epistemological approach, in which data owners gain new insights while explaining their vision within the context of the 5 interoperability layers. Depending on the interviewed organizations, different technologies can be assumed to be accepted. While some organizations will find it evident to use http as a uniform interface, others may still send data that should be public to all stakeholders using a fax machine. For fewer cases – discussed in the next chapter – rdf was evident. Therefore, we need a good mix between desk research on the quality of the data and interviews with data maintainers in order to create a good overview of the interoperability.

In the next chapter, we introduce the context of the organizations we worked with, and we choose a qualitative approach to studying their interoperability.

References

[1]
Colpaert, P., Van Compernolle, M., De Vocht, L., Dimou, A., Vander Sande, M., Verborgh, R., Mechant, P., Mannens, E. (2014, October). Quantifying the interoperability of open government datasets. Computer (pp. 50–56).
[2]
Halpin, H., Herman, I., Hayes, P. J. (2010, April). When owl:sameAs isn’t the Same: An Analysis of Identity Links on the Semantic Web. In proceedings of the World Wide Web conference, Linked Data on the Web (LDOW) workshop.
[3]
Walravens, N., Van Compernolle, M., Colpaert, P., Mechant, P., Ballon, P., Mannens, E. (2016). ‘Open Government Data’-based Business Models: a market consultation on the relationship with government in the case of mobility and route-planning applications. In proceedings of the 13th International Joint Conference on e-Business and Telecommunications (pp. 64–71).
[4]
Ford, T. C., Colombi, J. M., Graham, S. R., Jacques, D. R. (2007). A Survey on Interoperability Measurement. In proceedings of the 12th ICCRTS.