The SPRINT project will identify open performance and scalability issues in state-of-the-art solutions for semantic data representation. SPRINT will advance beyond the state of the art by providing new solutions, at different Technology Readiness Levels (TRL), to solve these issues.
COLLABORATIVE LIGHTWEIGHT ONTOLOGY ENGINEERING
SPRINT will start from one of the latest collaborative lightweight ontology engineering tools, namely Ontoology, which is starting to be used for the management of large corpora of reference ontologies. The project will adapt the tool to further automate the generation, extension and revision of the ontologies to be developed in the context of the Interoperability Framework (IF). It will also extend the tool to support the definition and revision of ontology-based annotations and mappings, which have not yet been dealt with appropriately.
Furthermore, the project will ensure that ontology development does not start from scratch, given the many existing non-ontological resources (especially XML documents and XSD schemas) that can bootstrap and inform it. Several frameworks have been proposed to transform XML documents into OWL ontologies, with three main objectives: generation, enrichment and population of an OWL ontology. Ontology enrichment from an XML document adds new constructors (classes, object attributes, data types, etc.) to the schema of an existing ontology. Ontology population uses XML data to add new individuals to the ontology, or new attributes to individuals it already contains. Ontology generation and enrichment can start either from XML instances or from validation schemas (DTD or XSD files), which gives two transformation approaches: the instance-based approach (from XML to ontology) and the validation-based approach (from DTD or XSD to ontology). Validation-based approaches typically provide richer results and cover a larger set of constructors.
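As an illustration of the validation-based approach, the following minimal sketch derives OWL classes and properties from an XSD schema and emits them as Turtle. It is not one of the frameworks referred to above, and it does not reflect the technique SPRINT will select. The schema fragment, the `xsd_to_turtle` function and the `http://example.org/onto#` namespace are assumptions made for this example. The mapping rules (a named complex type becomes a class, an element becomes an object or datatype property) are a common simplification rather than a complete transformation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment, invented for illustration only.
XSD_SAMPLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Journey">
    <xs:sequence>
      <xs:element name="departureTime" type="xs:dateTime"/>
      <xs:element name="origin" type="Station"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Station">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def xsd_to_turtle(xsd_text, base="http://example.org/onto#"):
    """Map named complex types to OWL classes and their elements to properties."""
    root = ET.fromstring(xsd_text)
    complex_types = root.findall(f"{XS}complexType")
    class_names = {ct.get("name") for ct in complex_types}
    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "",
    ]
    for ct in complex_types:
        cls = ct.get("name")
        lines.append(f":{cls} a owl:Class .")
        for el in ct.iter(f"{XS}element"):
            prop, typ = el.get("name"), el.get("type", "xs:string")
            if typ in class_names:
                # The element refers to another complex type: object property.
                lines.append(
                    f":{prop} a owl:ObjectProperty ; "
                    f"rdfs:domain :{cls} ; rdfs:range :{typ} ."
                )
            else:
                # The element has a built-in simple type: datatype property.
                local = typ.split(":")[-1]
                lines.append(
                    f":{prop} a owl:DatatypeProperty ; "
                    f"rdfs:domain :{cls} ; rdfs:range xsd:{local} ."
                )
    return "\n".join(lines)


print(xsd_to_turtle(XSD_SAMPLE))
```

The sketch ignores attributes, cardinalities, type extension and elements that share a name across types. A real converter would have to handle all of these before its output could be matched and merged with a reference ontology.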
XSD schemas are already part of the information lifecycle within the ST4RT project, so the SPRINT project will seek to automate the ontology creation and enrichment process. In particular, the project will select – and possibly improve on – one of the aforementioned techniques to convert XSD schemas into ontologies, and it will develop mechanisms to then match and merge the resulting ontologies with the Shift2Rail reference ontology.
SEMANTIC AUTOMATION FOR SERVICE INTEGRATION
Four different approaches to semantic interoperability can be distinguished according to two fundamental dimensions: the data schema mapping (any-to-any vs any-to-one) and the integration logic (centralised vs decentralised). The any-to-one centralised approach appears to be powerful and well suited to managing complex and dynamic environments where a common shared reference ontology is available. The ST4RT project is working to release a demonstrator tool that provides ontology-based transformations between the different standards adopted by heterogeneous legacy systems. The any-to-one centralised approach adopted by ST4RT consists of semantically annotating standards in order to create mappings from their data models to a global reference ontology. The resulting mappings are the basis for the semantic transformations, through which data expressed in a standard can be converted into its ontological version. Whenever two systems that adopt two different standards, A and B, need to exchange information, the semantic transformation takes place in two steps. First, a message originally expressed according to standard A is “lifted” to its ontological version by means of the mapping between standard A and the reference ontology. Then the ontological version is “lowered” to a message expressed according to standard B, by means of the mapping between standard B and the reference ontology, and can be consumed by the target system.
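The lifting and lowering steps can be sketched with plain dictionaries. This is a toy model under stated assumptions, not the ST4RT implementation: the two mappings, the field names and the `ref:` property names are all hypothetical. The reference ontology is reduced to a flat set of property names.

```python
# Hypothetical mappings: field name in each standard -> reference-ontology property.
MAPPING_A = {
    "dep_stn": "ref:origin",
    "arr_stn": "ref:destination",
    "dep_time": "ref:departureTime",
}
MAPPING_B = {
    "From": "ref:origin",
    "To": "ref:destination",
    "Departure": "ref:departureTime",
}


def lift(message, mapping):
    """Re-express a standard-specific message using reference-ontology properties."""
    return {mapping[field]: value for field, value in message.items() if field in mapping}


def lower(ontological, mapping):
    """Re-express an ontological message using the fields of a target standard."""
    inverse = {prop: field for field, prop in mapping.items()}
    return {inverse[prop]: value for prop, value in ontological.items() if prop in inverse}


def convert(message, source_mapping, target_mapping):
    """Any-to-one conversion: lift to the reference ontology, then lower."""
    return lower(lift(message, source_mapping), target_mapping)


msg_a = {
    "dep_stn": "Milano Centrale",
    "arr_stn": "Roma Termini",
    "dep_time": "2019-05-01T08:00:00",
}
msg_b = convert(msg_a, MAPPING_A, MAPPING_B)
print(msg_b)
```

With this design, supporting N standards requires N mappings to the reference ontology. Direct any-to-any conversion would require up to N(N-1) directed converters.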
The goal of the SPRINT project is to allow for a higher degree of automation in the creation of the semantic transformations. Currently, the semantic annotation of data models requires skilled software designers to spend an excessive amount of time on error-prone, tedious manual tasks, due to the immaturity of semantic interoperability tools and a conversion process that is not yet optimised. SPRINT will investigate new techniques and mechanisms to increase the automation and the validation of the semantic annotations, lowering the entry barrier for software and business designers.
ENHANCED INTEROPERABILITY FRAMEWORK ARCHITECTURE
One way to look at the Shift2Rail IF is as a collection of services (for managing assets such as ontologies, for providing conversion between standards, for allowing a transparent invocation of MSP services, etc.) that can be offered in a decentralised way to clients distributed across a wide region (Europe and beyond). As such, the Shift2Rail IF shares many similarities with cloud-based applications, even though it is not necessarily a single application itself. Therefore, architectures and patterns for cloud-based applications are natural candidates to serve as the basis for a reference architecture for the Shift2Rail IF that addresses issues such as the performance and scalability of computations.
One of the most interesting paradigms to address scalability and performance issues is that of microservices, where small and independent artefacts (services) collaborate with one another.
Containers are one candidate technology for the foundation of the Shift2Rail IF architecture. For example, containers can be used to realise microservices, and they allow system administrators to dynamically resize the resources allocated to applications. One of their main benefits is that they make it easy to package and distribute applications together with their environments. As a consequence, they can be used not only to dynamically manage the resources needed to run the Shift2Rail IF services, but also to easily deploy pieces of software that are part of the IF ecosystem, such as converters that realise mappings between different standards.
The project will study and compare different architectural alternatives for the Shift2Rail IF, including microservices-based ones, and it will identify the one that best suits the needs of the IF users and stakeholders. In particular, it will explore the possibility of using container technology to facilitate the distribution of components that realise the IF functions, such as conversion services between data representations and “brokering” services that ease and make transparent the interaction with MSPs and, in general, with providers of transport-related services.
DISTRIBUTED QUERY PROCESSING AND DATA INTEGRATION
Several tools and libraries exist for RDF generation and Linked Data publishing and for supporting distributed SPARQL query processing on top of them.
SPRINT will advance the state of the art by providing efficient tools that enable seamless access to a combination of Linked Data-enabled sites, SPARQL endpoints on top of native RDF stores, and virtual SPARQL endpoints on top of mapping-enabled data sources (CSVs, relational databases, REST APIs) that are available as IF assets. To test and improve performance and scalability, SPRINT will implement efficient operators and query plans that take into account the underlying characteristics of these data sources. Such improvements will enable integration across different data sources by means of reconciliation services that facilitate the generation of canonical URIs.
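To make the notion of a query operator concrete, the sketch below joins the partial results returned by two sources on their shared variables. It is a plain in-memory hash join, used as a stand-in: the source does not say which operators SPRINT will implement. The sample bindings and the `ex:` identifiers are invented. The sketch assumes that every row in a result list binds the same set of variables.

```python
from collections import defaultdict


def hash_join(left, right):
    """Join two lists of solution mappings (dicts: variable -> value)."""
    if not left or not right:
        return []
    # Variables bound on both sides form the join key.
    shared = sorted(set(left[0]) & set(right[0]))
    # Build the hash index on the smaller input, probe with the larger one.
    build, probe = (left, right) if len(left) <= len(right) else (right, left)
    index = defaultdict(list)
    for row in build:
        index[tuple(row[v] for v in shared)].append(row)
    results = []
    for row in probe:
        for match in index.get(tuple(row[v] for v in shared), []):
            results.append({**match, **row})
    return results


# Hypothetical partial results, e.g. one from a native RDF store and one
# from a virtual SPARQL endpoint on top of a CSV file.
stations = [
    {"station": "ex:MI", "name": "Milano Centrale"},
    {"station": "ex:RM", "name": "Roma Termini"},
]
departures = [
    {"station": "ex:MI", "time": "08:00"},
    {"station": "ex:MI", "time": "09:00"},
    {"station": "ex:TO", "time": "08:30"},
]

for row in hash_join(stations, departures):
    print(row)
```

Choosing the build side by input size is one simple example of a plan decision that depends on the characteristics of the data sources. The join also only works if both sources use the same identifier for a station, which is where canonical URIs come in.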
Concerning data integration, SPRINT will also analyse the state-of-the-art techniques mentioned in the “collaborative lightweight ontology engineering” section to transform and integrate traditional non-ontological data (e.g., XML documents) into ontological resources.
Moreover, SPRINT will investigate alternative techniques to perform the conversion process among different data formats proposed by the ST4RT project in order to improve performance and scalability.
NATIONAL ACCESS POINT FOR MULTIMODAL TRANSPORTATION
Only a few EU countries have already defined a plan for setting up a National Access Point (NAP) for Multimodal Transportation compliant with Commission Delegated Regulation (EU) 2017/1926 supplementing Directive 2010/40/EU of the European Parliament and of the Council with regard to the provision of EU-wide multimodal travel information services.
The SPRINT project will investigate the possibility of using the functionalities provided by the IF Semantic Assets Manager to set up a NAP for Multimodal Transportation compliant with Commission Delegated Regulation (EU) 2017/1926. Indeed, the provided functionalities support: (i) the storage of data sets; (ii) the definition of national metadata profiles (e.g., DCAT-AP) to be used to create metadata descriptions of data sets; (iii) the definition of a lifecycle for each data set; (iv) the definition of APIs and discovery services to facilitate access to the data sets; (v) the discovery of web services supporting the conversion/translation between standards. Moreover, SPRINT will evaluate the possibility of integrating an existing NAP into the IF proof-of-concept developed in the project. In this case, the NAP would act as an additional IF dataset repository.
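Functionalities (ii) and (iii) can be illustrated with a small sketch that pairs a DCAT-style metadata description with a lifecycle state. This is not the API of the IF Semantic Assets Manager. The function names, the lifecycle states and their transitions, and the example URL are assumptions made for this example. Only the `dcat:` and `dct:` terms come from the DCAT and Dublin Core vocabularies.

```python
import json

# Hypothetical lifecycle: the states and allowed transitions are assumptions.
ALLOWED_TRANSITIONS = {
    "draft": {"published"},
    "published": {"deprecated"},
    "deprecated": set(),
}


def make_dataset_record(title, description, access_url):
    """Create a data set entry: JSON-LD metadata plus a lifecycle state."""
    return {
        "state": "draft",
        "metadata": {
            "@context": {
                "dcat": "http://www.w3.org/ns/dcat#",
                "dct": "http://purl.org/dc/terms/",
            },
            "@type": "dcat:Dataset",
            "dct:title": title,
            "dct:description": description,
            "dcat:distribution": [
                {
                    "@type": "dcat:Distribution",
                    "dcat:accessURL": {"@id": access_url},
                }
            ],
        },
    }


def advance(record, new_state):
    """Move a data set to a new lifecycle state if the transition is allowed."""
    if new_state not in ALLOWED_TRANSITIONS[record["state"]]:
        raise ValueError(f"cannot go from {record['state']} to {new_state}")
    record["state"] = new_state
    return record


record = make_dataset_record(
    "Regional rail timetable",
    "Planned departures and arrivals for regional rail services.",
    "http://example.org/datasets/timetable.xml",  # placeholder URL
)
advance(record, "published")
print(json.dumps(record, indent=2))
```

A national metadata profile would typically constrain such a description further, for instance by making additional properties mandatory.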