Reasoning Web 2021 Papers


In the previous article, we discussed Reasoning Web 2020. In this issue, we describe the 17th Reasoning Web Summer School, held in Leuven, Belgium, in September 2021.

Specifically, I will discuss the foundations of querying graph-structured data; reasoning with ontology languages based on description logics and with non-monotonic rule languages; combining symbolic reasoning and deep learning; machine learning over the Semantic Web and knowledge graphs; Building Information Modeling (BIM); Geospatial Linked Open Data; ontology evaluation techniques; planning agents; cloud-based electronic health record (EHR) systems; COVID pandemic management; belief revision and its application to description logics and ontology repair; Temporal Equilibrium Logic (TEL) and its application to Answer Set Programming (ASP); an introduction and review of the Shapes Constraint Language (SHACL), the W3C recommendation language for validating RDF data; and score-based explanations for query answers in databases and for the outcomes of classification models in machine learning.

Details are given below.

We survey some foundational results on querying graph-structured data. We focus on general-purpose navigational query languages, such as regular path queries and their extensions with conjunctions, inverses, and path comparisons. We study complexity, expressive power, and static analysis. The course material should be useful to anyone with an interest in query languages for graph-structured data, and more broadly in foundational aspects of database theory.
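As a small, concrete illustration of a regular path query (my own toy example, not taken from the course material), the following sketch assumes the Python rdflib library and expresses the query as a SPARQL property path over a tiny graph; `ex:knows+` asks for all pairs of nodes connected by one or more knows-edges.

```python
# A minimal sketch (assuming the rdflib library) of a regular path query,
# expressed as a SPARQL property path over a tiny example graph.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, EX.knows, EX.bob))
g.add((EX.bob, EX.knows, EX.carol))
g.add((EX.carol, EX.knows, EX.dave))

# "ex:knows+" is a regular path query: pairs connected by one or more knows-edges.
query = """
PREFIX ex: <http://example.org/>
SELECT ?x ?y WHERE { ?x ex:knows+ ?y }
"""
for x, y in g.query(query):
    print(x, "reaches", y)
```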

Ontology languages, based on Description Logics, and non-monotonic rule languages are two major formalisms for representing expressive knowledge and reasoning with it, which build on fundamentally different ideas and formal underpinnings. Within the Semantic Web initiative, driven by the World Wide Web Consortium, standardized languages for these formalisms have been developed that allow their usage in knowledge-intensive applications integrating increasing amounts of data on the Web. Often, such applications require the advantages of both formalisms, but due to their inherent differences, the integration is a challenging task. In this course, we review the two formalisms and their characteristics and show different ways of achieving their integration. We also discuss an available tool based on one such integration with favorable properties, such as polynomial data complexity for query answering when standard inference is polynomial in the used ontology language.
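To make the contrast concrete, here is a minimal sketch of a non-monotonic rule with default negation, written for the clingo ASP system (an assumption of this example, not the tool discussed in the course): birds fly unless they are known to be abnormal, a conclusion that classical DL axioms alone cannot retract.

```python
# A minimal sketch (assuming the clingo Python API) of non-monotonic reasoning
# with default negation: birds fly unless they are known to be abnormal.
import clingo

program = """
bird(tweety). bird(tux).
abnormal(tux).                          % tux is a penguin-like exception
flies(X) :- bird(X), not abnormal(X).   % default rule with negation as failure
"""

ctl = clingo.Control()
ctl.add("base", [], program)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda m: print("Answer set:", m))
```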

Symbolic reasoning and deep learning are two fundamentally different approaches to building AI systems, with complementary strengths and weaknesses. Despite their clear differences, however, the line between these two approaches is increasingly blurry. For instance, the neural language models which are popular in Natural Language Processing are increasingly playing the role of knowledge bases, while neural network learning strategies are being used to learn symbolic knowledge, and to develop strategies for reasoning more flexibly with such knowledge. This blurring of the boundary between symbolic and neural methods offers significant opportunities for developing systems that can combine the flexibility and inductive capabilities of neural networks with the transparency and systematic reasoning abilities of symbolic frameworks. At the same time, there are still many open questions around how such a combination can best be achieved. This paper presents an overview of recent work on the relationship between symbolic knowledge and neural representations, with a focus on the use of neural networks, and vector representations more generally, for encoding knowledge.
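The point about neural language models acting as knowledge bases can be illustrated with a short, hedged sketch: assuming the Hugging Face transformers library and the bert-base-uncased model (neither of which is prescribed by the paper), we probe a masked language model for a factual completion.

```python
# A minimal sketch (assuming the Hugging Face transformers library and an
# internet connection to download "bert-base-uncased") of probing a neural
# language model as if it were a knowledge base.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```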

The Semantic Web (SW) is characterized by the availability of a vast amount of semantically annotated data collections. Annotations are provided by exploiting ontologies acting as shared vocabularies. Additionally, ontologies are endowed with deductive reasoning capabilities, which allow making explicit the knowledge that is only implicitly formalized. Over the years a large number of data collections have been developed and interconnected, as testified by the Linked Open Data Cloud. Currently, seminal examples are represented by the numerous Knowledge Graphs (KGs) that have been built, either as enterprise KGs or as open KGs that are freely available. All of them are characterized by very large data volumes, but also by incompleteness and noise. These characteristics have made the exploitation of deductive reasoning services less feasible from a practical viewpoint, opening up alternative solutions, grounded in Machine Learning (ML), for mining knowledge from the vast amount of information available. Indeed, ML methods have been exploited in the SW for solving several problems such as link and type prediction, ontology enrichment and completion (both at the terminological and the assertional level), and concept learning. While symbol-based solutions were initially the main target, numeric-based approaches have recently been receiving major attention because of the need to scale to very large data volumes. Nevertheless, data collections in the SW have peculiarities that can hardly be found in other fields, so the application of ML methods for solving the targeted problems is not straightforward. This paper extends [20] by surveying the most representative symbol-based and numeric-based solutions and related problems, with a special focus on the main issues that need to be considered and solved when ML methods are adopted in the SW field, as well as by analyzing the main peculiarities and drawbacks of each solution.
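As a toy illustration of the numeric-based approaches mentioned above, the following sketch scores candidate links TransE-style with untrained random embeddings (numpy assumed); it only shows the mechanics, since real systems learn the embeddings from the KG.

```python
# A toy illustration (using numpy) of embedding-based link prediction:
# TransE-style scoring, where a triple (h, r, t) is plausible when the head
# embedding plus the relation embedding is close to the tail embedding.
import numpy as np

rng = np.random.default_rng(0)
entities = {e: rng.normal(size=8) for e in ["Berlin", "Germany", "Paris", "France"]}
relations = {"capitalOf": rng.normal(size=8)}

def score(h, r, t):
    # Lower distance = more plausible triple, so we negate it.
    return -np.linalg.norm(entities[h] + relations[r] - entities[t])

# Rank candidate tails for the query (Berlin, capitalOf, ?).
candidates = sorted(entities, key=lambda t: -score("Berlin", "capitalOf", t))
print(candidates)  # with trained embeddings, "Germany" would rank first
```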

Although the value of extending Building Information Modeling (BIM) implementation into the operations and maintenance phase lies precisely in reducing the operations and maintenance costs associated with inadequate interoperability, the facilities management information flow is still neither automated nor seamless. Facility managers do not normally use BIM model data, since they claim that BIM models either do not include their information requirements or contain a huge amount of superfluous data, which makes the data exchange process tedious and overwhelming. Construction Operations Building information exchange (COBie) was developed to improve the facility data handover and to support facilities management systems. However, existing COBie add-in applications have inherent limitations in generating all of the data that facilities management requires, particularly spare, resource and job data sheets, for which manual data entry is still needed. Through a series of interviews with industry practitioners, this paper analyses current data exchange practices and proposes a conceptual interoperability framework for seamless data exchange between BIM models and facilities management systems. The proposed database information system automatically generates a rich COBie spreadsheet by linking BIM data models, via the Industry Foundation Classes (IFC) model, to facilities management information provided by various sources. The proposed framework supplements the existing body of knowledge in the facilities management domain by providing a system that facilitates seamless data transfer between BIM and facilities management systems. Facilities management organisations and owners can use this approach to decrease the redundant activity of manual data entry and focus their efforts on productive maintenance activities.
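A hedged sketch of the kind of BIM-to-COBie extraction involved: assuming the ifcopenshell library and a hypothetical building.ifc file, it pulls basic space attributes from the IFC model into a CSV. This is only a starting point for a COBie-style handover sheet, not the framework proposed in the paper.

```python
# A hedged sketch (assuming the ifcopenshell library and a hypothetical
# "building.ifc" file) of pulling basic asset data out of a BIM model via IFC,
# as a starting point for a COBie-style handover spreadsheet.
import csv
import ifcopenshell

model = ifcopenshell.open("building.ifc")  # hypothetical path
with open("cobie_spaces.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["GlobalId", "Name", "LongName"])
    for space in model.by_type("IfcSpace"):
        # COBie's Space sheet draws on these IFC attributes, among others.
        writer.writerow([space.GlobalId, space.Name, space.LongName])
```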

With an increase in Geospatial Linked Open Data being adopted and published over the web, there is a need to develop intuitive interfaces and systems for seamless and efficient exploratory analysis of such rich, heterogeneous, multi-modal datasets. This work is geared towards improving the exploration process of Earth Observation (EO) Linked Data by developing a natural language interface to facilitate querying. Questions asked over Earth Observation Linked Data have an inherent spatio-temporal dimension and can be represented using GeoSPARQL. This paper seeks to study and analyze the use of RNN-based neural machine translation with attention for transforming natural language questions into GeoSPARQL queries. Specifically, it aims to assess the feasibility of a neural approach for identifying and mapping spatial predicates in natural language to GeoSPARQL's topology vocabulary extensions, including the Egenhofer and RCC8 relations. The queries can then be executed over a triple store to yield answers to the natural language questions. A dataset consisting of mappings from natural language questions to GeoSPARQL queries over the Corine Land Cover (CLC) Linked Data has been created to train and validate the deep neural network. From our experiments, it is evident that neural machine translation with attention is a promising approach for the task of translating spatial predicates in natural language questions to GeoSPARQL queries.
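For intuition, here is a hedged sketch of what one training pair might look like (the CLC class and area URIs are hypothetical, not taken from the paper's dataset); geo:sfWithin is one of GeoSPARQL's simple-feature topological relations, alongside the Egenhofer and RCC8 vocabularies mentioned above.

```python
# A hedged sketch of one (question, GeoSPARQL) training pair of the kind
# described in the paper. The clc: URIs below are hypothetical placeholders.
example = {
    "question": "Which water bodies are within the area of Munich?",
    "geosparql": """
        PREFIX geo: <http://www.opengis.net/ont/geosparql#>
        PREFIX clc: <http://example.org/corine/>   # hypothetical namespace
        SELECT ?water WHERE {
            ?water a clc:WaterBody ;
                   geo:hasGeometry ?g1 .
            clc:Munich geo:hasGeometry ?g2 .
            ?g1 geo:sfWithin ?g2 .
        }
    """,
}

# A seq2seq model with attention would be trained on many such pairs; here we
# only tokenize one pair to show the source and target vocabularies.
src_vocab = sorted(set(example["question"].lower().split()))
tgt_vocab = sorted(set(example["geosparql"].split()))
print(len(src_vocab), "source tokens;", len(tgt_vocab), "target tokens")
```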

Ontologies have become widely used for knowledge representation and are considered a foundation of the Semantic Web. However, with their widespread usage, the question of how to evaluate them has become even more pressing. This paper addresses the issue of finding an efficient ontology evaluation method by presenting the existing ontology evaluation techniques and discussing their advantages and drawbacks. The presented ontology evaluation techniques can be grouped into four categories: gold standard-based, corpus-based, task-based and criteria-based approaches.
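As a toy example of the gold standard-based category, the following sketch compares the concept names of a learned ontology against a hand-built gold standard using precision and recall (real evaluations also consider relations and axioms; the concept names are invented for illustration).

```python
# A toy illustration of gold standard-based ontology evaluation: compare the
# concepts of a learned ontology against a hand-built gold standard.
gold = {"Person", "Student", "Professor", "Course", "University"}
learned = {"Person", "Student", "Lecturer", "Course", "Building"}

true_positives = gold & learned
precision = len(true_positives) / len(learned)
recall = len(true_positives) / len(gold)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```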

Recent work has formalized the explanation process in the context of automated planning as one of model reconciliation, i.e. a process by which the planning agent can bring the explainee's (possibly faulty) model of a planning problem closer to its understanding of the ground truth until both agree that its plan is the best possible. The content of explanations can thus range over misunderstandings about the agent's beliefs (state), desires (goals) and capabilities (action model). Though the existing literature has considered different kinds of these model differences to be equivalent, the literature on explanations in the social sciences suggests that explanations with similar logical properties may often be perceived differently by humans. In this brief report, we explore to what extent humans attribute importance to different kinds of model differences that have traditionally been considered equivalent in the model reconciliation setting. Our results suggest that people prefer explanations that relate to the effects of actions.

Cloud-based electronic health record (EHR) systems provide important security controls by encrypting patient data. However, these records cannot be queried without decrypting the entire record, which places a heavy burden on network bandwidth and client-side computation. As the volume of cloud-based EHRs reaches Big Data levels, it is essential to search over these encrypted patient records without decrypting them, so that medical caregivers can efficiently access the EHRs. This is especially critical if the caregivers have access to only certain sections of the patient EHR and should not decrypt the whole record. In this paper, we present a novel approach that facilitates searchable encryption of large EHR systems using Attribute-Based Encryption (ABE) and multi-keyword search techniques. Our framework outsources key search features to the cloud side. This way, our system can perform keyword searches on encrypted data with significantly reduced costs in network bandwidth and client-side computation.
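The searchable-index idea can be sketched very simply. Note this is a deliberate simplification using keyed hashes, not the paper's ABE-based construction: the server stores only trapdoors of keywords, so it can match a search token against an encrypted record's index without ever decrypting the record.

```python
# A simplified sketch of searchable encryption: the server stores keyed hashes
# (trapdoors) of keywords, so it can answer keyword searches over an encrypted
# record's index without seeing plaintext. ABE of the record body is omitted.
import hmac
import hashlib

SEARCH_KEY = b"shared-secret-for-authorised-caregivers"  # illustrative only

def trapdoor(keyword: str) -> str:
    return hmac.new(SEARCH_KEY, keyword.lower().encode(), hashlib.sha256).hexdigest()

# Client side: index an EHR section's keywords before upload.
record_index = {trapdoor(k) for k in ["diabetes", "insulin", "allergy"]}

# Server side: match an incoming search token without decrypting anything.
query_token = trapdoor("insulin")
print("match" if query_token in record_index else "no match")
```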

COVID pandemic management via contact tracing and vaccine distribution has resulted in a large volume and high velocity of health-related data being collected and exchanged among various healthcare providers, regulatory and government agencies, and people. This unprecedented sharing of sensitive health-related Big Data has raised technical challenges of ensuring robust data exchange while adhering to security and privacy regulations. We have developed a semantically rich and trusted Compliance Enforcement Framework for sharing high-velocity health datasets. This framework, built using Semantic Web technologies, defines a Trust Score for each participant in the data exchange process and includes ontologies combined with policy reasoners that ensure data access complies with health regulations, like the Health Insurance Portability and Accountability Act (HIPAA). We have validated our framework by applying it to the Centers for Disease Control and Prevention (CDC) Contact Tracing use case, exchanging over 1 million synthetic contact tracing records. This paper presents our framework in detail, along with the validation results against contact tracing data exchange. This framework can be used by all entities who need to exchange high-velocity, sensitive data while ensuring real-time compliance with data regulations.
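A minimal, hypothetical sketch of the kind of policy check such a framework performs (the roles, data category and threshold below are my own illustrations; the actual framework encodes policies in ontologies and evaluates them with a policy reasoner):

```python
# A hypothetical sketch of a compliance check: access is granted only if the
# requester's role is permitted for the data category and their trust score
# clears a threshold. Real policies would live in ontologies, not a dict.
POLICY = {
    "contact_tracing_record": {
        "roles": {"public_health_agency", "physician"},
        "min_trust": 0.7,
    },
}

def is_compliant(requester_role: str, trust_score: float, data_category: str) -> bool:
    rule = POLICY.get(data_category)
    return (rule is not None
            and requester_role in rule["roles"]
            and trust_score >= rule["min_trust"])

print(is_compliant("physician", 0.85, "contact_tracing_record"))   # True
print(is_compliant("researcher", 0.95, "contact_tracing_record"))  # False
```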

Belief Revision deals with accommodating new information in a knowledge base, where potential inconsistencies may arise. Several solutions have been proposed, making it an active field of research since the 1980s. Theoretical results were established in classical propositional logic and became the standard in the area, with rationality postulates, mathematical constructions and representation theorems. More recently, results have been adapted to different knowledge representation formalisms such as Horn Logic and Description Logics. In this tutorial, I will introduce Belief Revision, starting from the seminal AGM paradigm and giving an overview of the area over the last 35 years. In the second part, I will focus on applying Belief Revision to Description Logics in general and on the relation between revision and ontology repair.
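As a quick reminder of the AGM building blocks such a tutorial starts from, here are the Levi identity and a few of the standard revision postulates (standard notation from the belief revision literature, not specific to this tutorial):

```latex
% Levi identity: revision by $\varphi$ defined from contraction ($\div$) and
% expansion ($+$), plus a selection of the AGM revision postulates.
\[
  K \ast \varphi \;=\; (K \div \neg\varphi) + \varphi
\]
\begin{align*}
  &\text{(Success)}   && \varphi \in K \ast \varphi \\
  &\text{(Inclusion)} && K \ast \varphi \subseteq K + \varphi \\
  &\text{(Vacuity)}   && \text{if } \neg\varphi \notin K \text{, then } K + \varphi \subseteq K \ast \varphi
\end{align*}
```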

This document contains some lecture notes for a seminar on Temporal Equilibrium Logic (TEL) and its application to Answer Set Programming (ASP) inside the 17th Reasoning Web Summer School (RW 2021). TEL is a temporal extension of ASP that introduces temporal modal operators as those from Linear-Time Temporal Logic. We present the basic definitions and intuitions for Equilibrium Logic and then extend these notions to the temporal case. We also introduce several examples using the temporal ASP tool telingo.
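For intuition, here is a formula of the kind TEL handles, using the LTL-style operators the notes introduce ("always" and "next"); the action domain in the example is mine, not taken from the lecture notes.

```latex
% "Always, if the gun is loaded and shot, then at the next state it is not loaded."
\[
  \Box \bigl( \mathit{loaded} \wedge \mathit{shoot} \;\rightarrow\; \bigcirc \neg \mathit{loaded} \bigr)
\]
```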

We present an introduction and a review of the Shapes Constraint Language (SHACL), the W3C recommendation language for validating RDF data. A SHACL document describes a set of constraints on RDF nodes, and a graph is valid with respect to the document if its nodes satisfy these constraints. We revisit the basic concepts of the language, its constructs and components and their interaction. We review the different formal frameworks used to study this language and the different semantics proposed. We examine a number of related problems, from containment and satisfiability to the interaction of SHACL with inference rules, and exhibit how different modellings of the language are useful for different problems. We also cover practical aspects of SHACL, discussing its implementations and state of adoption, to present a holistic review useful to practitioners and theoreticians alike.
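A minimal validation example, assuming the rdflib and pySHACL Python libraries (the shape and data below are my own toy example): the shape requires every ex:Person to have exactly one ex:name, which the second individual violates.

```python
# A minimal sketch (assuming rdflib and pySHACL) of SHACL validation:
# a shape requires every ex:Person to have exactly one ex:name.
from rdflib import Graph
from pyshacl import validate

data = Graph().parse(data="""
@prefix ex: <http://example.org/> .
ex:alice a ex:Person ; ex:name "Alice" .
ex:bob   a ex:Person .
""", format="turtle")

shapes = Graph().parse(data="""
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [ sh:path ex:name ; sh:minCount 1 ; sh:maxCount 1 ] .
""", format="turtle")

conforms, _, report_text = validate(data, shacl_graph=shapes)
print(conforms)      # False: ex:bob violates the minCount constraint
print(report_text)
```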

In this course, we describe some recent approaches to score-based explanations for query answers in databases and for outcomes from classification models in machine learning. The focus is on work done by the author and collaborators. Special emphasis is placed on declarative approaches, based on answer-set programming, to the use of counterfactual reasoning for score specification and computation. Several examples with the DLV ASP system are shown to illustrate the flexibility of these methods.
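A toy sketch of the counterfactual intuition behind such scores (in plain Python rather than the ASP/DLV encodings used in the course, and with a hypothetical loan rule): a binary feature counts as a counterfactual cause if flipping it alone changes the prediction, in which case its simplified responsibility score is 1.

```python
# A toy sketch of counterfactual, score-based explanation: a binary feature is
# a counterfactual cause if flipping it alone changes the classifier's output.
def classifier(x):
    # Hypothetical loan rule: approve only if employed and no past default.
    return int(x["employed"] and not x["past_default"])

instance = {"employed": 1, "past_default": 1, "owns_home": 0}
original = classifier(instance)

for feature in instance:
    flipped = dict(instance, **{feature: 1 - instance[feature]})
    if classifier(flipped) != original:
        print(f"{feature}: counterfactual cause (responsibility 1.0)")
    else:
        print(f"{feature}: not a counterfactual cause by a single flip")
```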
