Reasoning Web 2017 Papers


In the previous article, we discussed Reasoning Web 2016. In this issue, we describe the lecture notes from the 13th Reasoning Web Summer School, RW 2017, held in London, UK in July 2017.

In 2017, the theme of the School was “Semantic Interoperability on the Web”, which encompassed topics such as data integration, open data management, reasoning on linked data, mapping databases and ontologies, query answering on ontologies, hybrid reasoning with rules and ontologies, and ontology-based dynamic systems. The lectures focus on these topics but also cover basic reasoning techniques used in answer set programming and ontologies.

Details are given below.

Data Integration for Open Data on the Web

In this lecture we will introduce the challenges of integrating openly available Web data and discuss how to solve them. Firstly, while we will address this topic from the viewpoint of Semantic Web research, not all data is readily available as RDF or Linked Data, so we will give an introduction to the different data formats prevalent on the Web, namely, standard formats for publishing and exchanging tabular, tree-shaped, and graph data. Secondly, not all Open Data is really completely open, so we will discuss issues around licences and terms of usage associated with Open Data, as well as documentation of data provenance. Thirdly, we will discuss (meta-)data quality issues associated with Open Data on the Web and how Semantic Web techniques and vocabularies can be used to describe and remedy them. Fourthly, we will address issues of searchability and integration of Open Data and discuss to what extent semantic search can help to overcome them. We close by briefly summarizing further issues not covered explicitly herein, such as multi-linguality, temporal aspects (archiving, evolution, temporal querying), as well as how and whether OWL and RDFS reasoning on top of integrated open data could help.

Ontological Query Answering over Semantic Data

Modern information retrieval systems advance user experience on the basis of concept-based rather than keyword-based query answering

Ontology Querying: Datalog Strikes Back

In this tutorial we address the problem of ontology querying, that is, the problem of answering queries against a theory constituted by facts (the data) and inference rules (the ontology). A varied landscape of ontology languages exists in the scientific literature, with several degrees of complexity of query processing. We argue that Datalog$^{\pm}$, a family of languages derived from Datalog, is a powerful tool for ontology querying. To illustrate the impact of this comeback of Datalog, we present the basic paradigms behind the main Datalog$^{\pm}$ languages as well as some recent extensions. We also present efficient query processing techniques for some cases.

Integrating Relational Databases with the Semantic Web: A Reflection

From the beginning it was understood that the success of the Semantic Web hinges on integrating the vast amount of data stored in Relational Databases. This manuscript reflects on the last 10 years of our research on integrating Relational Databases with the Semantic Web. Since 2007, our research has led us to answer the following question: How and to what extent can Relational Databases be integrated with the Semantic Web? The answer comes in two parts. We start by presenting how to get from Relational Databases to the Semantic Web via mappings, such as the W3C Direct Mapping and R2RML standards. Subsequently, we present how the Semantic Web can access Relational Databases. We conclude with how Relational Databases and Semantic Web technologies are being used in practice for data integration and discuss open challenges.
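To make the idea of a relational-to-RDF mapping concrete, the following is a minimal sketch loosely modeled on the row-to-triples idea behind the W3C Direct Mapping. It is a simplification, not the standard itself: the base IRI, the `Person` table, and the function name `row_to_triples` are hypothetical, literals are kept as plain strings without datatypes, and foreign keys are ignored.

```python
from urllib.parse import quote

BASE = "http://example.org/db/"  # hypothetical base IRI, chosen for illustration
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(table, primary_key, row):
    """Map one relational row (a dict) to a list of (s, p, o) triples."""
    # The subject IRI is built from the table name and the primary key value.
    subject = f"{BASE}{quote(table)}/{quote(primary_key)}={quote(str(row[primary_key]))}"
    # Each row becomes an instance of a class named after its table.
    triples = [(subject, RDF_TYPE, f"{BASE}{quote(table)}")]
    for column, value in row.items():
        if value is None:
            continue  # a NULL value produces no triple
        predicate = f"{BASE}{quote(table)}#{quote(column)}"
        triples.append((subject, predicate, str(value)))
    return triples


# Hypothetical table content: one row with a NULL column.
people = [{"id": 7, "name": "Ada", "city": None}]
triples = [t for row in people for t in row_to_triples("Person", "id", row)]
for t in triples:
    print(t)
```

A customizable mapping language such as R2RML would let the user choose the IRIs and vocabulary instead of deriving them mechanically from the schema as this sketch does.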

Datalog Revisited for Reasoning in Linked Data

Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to make the Semantic Web a reality. In this survey, we describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists of equipping RDF triplestores with Datalog inference rules. This rule language makes it possible to capture in a uniform manner OWL constraints that are useful in practice, such as property transitivity or symmetry, but also domain-specific rules with practical relevance for users in many domains of interest. The expressivity and the genericity of this framework are illustrated for modeling Linked Data applications and for developing inference algorithms. In particular, we show how it makes it possible to model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. We also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, whilst effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
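As an illustration of what equipping a triplestore with Datalog rules can mean, here is a minimal sketch of naive forward chaining over a set of triples for the two rule schemas named in the abstract, property transitivity and symmetry. The function name `saturate` and the example properties `partOf` and `sameAs` are assumptions made for this example; the survey's own algorithms are not reproduced here.

```python
def saturate(triples, transitive=(), symmetric=()):
    """Apply transitivity and symmetry rules until no new triple is derived."""
    facts = set(triples)
    while True:
        derived = set()
        for (s, p, o) in facts:
            if p in symmetric:
                # Rule: (s, p, o) -> (o, p, s)
                derived.add((o, p, s))
            if p in transitive:
                # Rule: (s, p, o), (o, p, o2) -> (s, p, o2)
                for (s2, p2, o2) in facts:
                    if p2 == p and s2 == o:
                        derived.add((s, p, o2))
        if derived <= facts:
            return facts  # fixpoint reached
        facts |= derived


# Hypothetical data for illustration.
data = {
    ("a", "partOf", "b"),
    ("b", "partOf", "c"),
    ("x", "sameAs", "y"),
}
closure = saturate(data, transitive={"partOf"}, symmetric={"sameAs"})
for fact in sorted(closure):
    print(fact)
```

The nested loop is quadratic per iteration and is only meant to show the fixpoint semantics; practical engines use indexing and semi-naive evaluation to avoid re-deriving known facts.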

A Tutorial on Hybrid Answer Set Solving with clingo

Answer Set Programming (ASP) has become an established paradigm for Knowledge Representation and Reasoning, in particular when it comes to solving knowledge-intensive combinatorial (optimization) problems. ASP’s unique pairing of a simple yet rich modeling language with highly performant solving technology has led to an increasing interest in ASP in academia as well as industry. To further boost this development and make ASP fit for real-world applications, it is indispensable to equip it with means for easy integration into software environments and for adding complementary forms of reasoning. In this tutorial, we describe how both issues are addressed in the ASP system clingo. First, we outline features of clingo’s application programming interface (API) that are essential for multi-shot ASP solving, a technique for dealing with continuously changing logic programs. This is illustrated by realizing two exemplary reasoning modes, namely branch-and-bound-based optimization and incremental ASP solving. We then switch to the design of the API for integrating complementary forms of reasoning and detail this in an extensive case study dealing with the integration of difference constraints. We show how the syntax of these constraints is added to the modeling language and seamlessly merged into the grounding process. We then develop in detail a corresponding theory propagator for difference constraints and present how it is integrated into clingo’s solving process.
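For readers unfamiliar with difference constraints, the sketch below shows the textbook consistency check for a set of constraints of the form x - y <= k, using Bellman-Ford style relaxation to detect a negative cycle. It does not use the clingo API and is not the propagator developed in the tutorial; the function name and the example variables are hypothetical, and a theory propagator would presumably perform a check of this kind incrementally during search.

```python
def solve_difference_constraints(constraints):
    """Solve constraints given as (x, y, k) triples meaning x - y <= k.

    Returns a satisfying assignment as a dict, or None if the
    constraints are inconsistent (the constraint graph has a negative cycle).
    """
    variables = {v for x, y, _ in constraints for v in (x, y)}
    # Starting every variable at 0 corresponds to a virtual source node
    # with a zero-weight edge to each variable.
    value = {v: 0 for v in variables}
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, k in constraints:
            # Relax the edge y -> x with weight k.
            if value[y] + k < value[x]:
                value[x] = value[y] + k
                changed = True
        if not changed:
            return value  # all constraints hold
    return None  # still changing after enough passes: negative cycle


# a - b <= 3 and b - a <= -1 (that is, a is at least 1 larger than b).
consistent = [("a", "b", 3), ("b", "a", -1)]
solution = solve_difference_constraints(consistent)
print(solution)

# a - b <= -1 and b - a <= -1 cannot both hold.
inconsistent = [("a", "b", -1), ("b", "a", -1)]
no_solution = solve_difference_constraints(inconsistent)
print(no_solution)
```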

Answer Set Programming with External Source Access

Access to external information is an important need for Answer Set Programming (ASP), which is a booming declarative problem solving approach these days. External access includes not only data in different formats but, more generally, also the results of computations, possibly in a two-way information exchange. Providing such access is a major challenge, in particular if it should be supported at a generic level, both regarding the semantics and efficient computation. In this article, we consider problem solving with ASP under external information access using the dlvhex system. The latter facilitates this access through special external atoms, which are two-way API-style interfaces between the rules of the program and an external source. The dlvhex system has a flexible plugin architecture that allows one to use multiple predefined and user-defined external atoms, which can be implemented, e.g., in Python or C++. We consider how to solve problems using the ASP paradigm, and specifically discuss how to use external atoms in this context, illustrated by examples. As a showcase, we demonstrate the development of a HEX program for a concrete real-world problem using Semantic Web technologies, and discuss specifics of the implementation process.

Uncertainty Reasoning for the Semantic Web

The Semantic Web has attracted much attention, both from academia and industry. An important role in research towards the Semantic Web is played by formalisms and technologies for handling uncertainty and/or vagueness. In this paper, I first provide some motivating examples for handling uncertainty and/or vagueness in the Semantic Web. I then give an overview of some of my own formalisms for handling uncertainty and/or vagueness in the Semantic Web.

OBDA for Log Extraction in Process Mining

Process mining is an emerging area that synergistically combines model-based and data-oriented analysis techniques to obtain useful insights on how business processes are executed within an organization. Through process mining, decision makers can discover process models from data, compare expected and actual behaviors, and enrich models with key information about their actual execution. To be applicable, process mining techniques require the input data to be explicitly structured in the form of an event log, which lists when and by whom different case objects (i.e., process instances) have been subject to the execution of tasks. Unfortunately, in many real-world set-ups, such event logs are not explicitly given, but are instead implicitly represented in legacy information systems. To apply process mining in this widespread setting, there is a pressing need for techniques able to support various process stakeholders in data preparation and log extraction from legacy information systems. The purpose of this paper is to single out this challenging, open issue, and didactically introduce how techniques from intelligent data management, and in particular ontology-based data access, provide a viable solution with a solid theoretical basis.
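The following sketch illustrates what log extraction amounts to in the simplest case: turning rows of a legacy table, where events are only implicit in timestamp columns, into an explicit event log of (case, activity, timestamp, resource) entries. It is plain Python rather than an ontology-based data access system; the `orders` table, its column names, and the `EVENT_MAPPINGS` dictionary are all hypothetical and stand in for the declarative mappings an OBDA approach would use.

```python
# Hypothetical legacy table: each row implicitly records up to two events.
orders = [
    {"order_id": "o1", "created_at": "2017-07-01T09:00", "created_by": "ann",
     "shipped_at": "2017-07-03T15:30", "shipped_by": "bob"},
    {"order_id": "o2", "created_at": "2017-07-02T10:00", "created_by": "ann",
     "shipped_at": None, "shipped_by": None},
]

# Hypothetical mapping: activity name -> (timestamp column, resource column).
EVENT_MAPPINGS = {
    "create order": ("created_at", "created_by"),
    "ship order": ("shipped_at", "shipped_by"),
}


def extract_event_log(rows, case_column, mappings):
    """Build an explicit event log from rows with implicit events."""
    events = []
    for row in rows:
        for activity, (ts_col, res_col) in mappings.items():
            if row[ts_col] is None:
                continue  # this activity has not happened for this case
            events.append({
                "case": row[case_column],
                "activity": activity,
                "timestamp": row[ts_col],
                "resource": row[res_col],
            })
    # ISO 8601 strings in the same format sort chronologically.
    return sorted(events, key=lambda e: (e["case"], e["timestamp"]))


log = extract_event_log(orders, "order_id", EVENT_MAPPINGS)
for event in log:
    print(event)
```

Keeping the mapping separate from the extraction code mirrors, in a very reduced form, the separation between data sources and the conceptual view of events that the abstract attributes to ontology-based data access.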

In the next article, we will discuss Reasoning Web 2018.