An overview of RDF stores and SPARQL


In the previous article, we discussed the data model and specific data structures of RDF data that realize LOD. In this article, we will discuss the RDF store, which is a database for handling RDF data, and SPARQL, which is a query system for extracting data from the RDF store.

An RDF store is also known as a triplestore. It is a type of graph database, one of the non-relational (NoSQL) databases introduced in the overview of DB. Graph databases also include systems that are composed of “nodes”, “edges”, and “properties” without using RDF, such as Neo4j and Datomic, each of which has its own query language (Cypher and a Datalog-based query language, respectively). An RDF store is a database that supports SPARQL, a query language that conforms to the RDF data structure.

RDF stores are offered by established database vendors such as Oracle and IBM, Jena (with its query module ARQ) is provided as open source by the Apache project, and Blazegraph and AllegroGraph are provided by individual vendors. As for cloud services, AWS Neptune also supports SPARQL.

SPARQL is a query language for RDF stores, equivalent to SQL in RDBMS.

PREFIX abc: <http://mynamespace.com/exampleOntologie#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital.
  ?y abc:countryname ?country.
  ?x abc:isCapitalOf ?y.
  ?y abc:isInContinent abc:africa.
}

The first part, PREFIX, defines the namespace. All RDF data has a namespace, and to avoid the inefficiency of writing it out in full every time, an abbreviation is defined first (abc: in the above example) and then used in the rest of the query.
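What the abbreviation stands for can be illustrated with a small Python sketch. This is not how a SPARQL engine is implemented; the `expand` helper is a hypothetical name introduced here only to show that a prefixed name is shorthand for a full IRI.

```python
# Prefix table mirroring the PREFIX line of the sample query.
PREFIXES = {"abc": "http://mynamespace.com/exampleOntologie#"}


def expand(prefixed_name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'abc:cityname' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {prefixed_name!r}")
    # The full IRI is simply the namespace followed by the local name.
    return prefixes[prefix] + local


print(expand("abc:cityname"))
# prints http://mynamespace.com/exampleOntologie#cityname
```

Without the PREFIX declaration, every term in the query would have to be written in the long form, enclosed in angle brackets.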

The second part, SELECT, is called the query form; besides “SELECT”, there are “CONSTRUCT”, “ASK”, and “DESCRIBE”. “SELECT” is the most commonly used form, and it returns the values bound to the variables listed after SELECT (names starting with “?”) for every match of the query pattern described below. If “*” is used after “SELECT”, the results for all variables will be returned.

“CONSTRUCT” is a query that returns triple data instead of just variable values, “ASK” returns a boolean value (true/false) that indicates whether the query pattern matches, and “DESCRIBE” returns an RDF graph describing the resources found.

The third and subsequent parts describe the pattern to be matched: they start with “WHERE”, and the triple patterns to match are written inside {}. A wide variety of constructs can be used here, such as logical OR and AND, string operations, and filters, which give SPARQL its flexibility.
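To make the matching of triple patterns concrete, here is a toy sketch in Python. It is a naive matcher over an in-memory set, not how a real RDF store evaluates queries. The sample triples, the ex: names, and the functions `match`, `select`, `ask`, and `construct` are all made up for illustration, and DESCRIBE is left out.

```python
# Toy data: the abc: properties follow the sample query, the triples are made up.
TRIPLES = {
    ("ex:c1", "abc:cityname", "Nairobi"),
    ("ex:k", "abc:countryname", "Kenya"),
    ("ex:c1", "abc:isCapitalOf", "ex:k"),
    ("ex:k", "abc:isInContinent", "abc:africa"),
    ("ex:c2", "abc:cityname", "Paris"),
    ("ex:f", "abc:countryname", "France"),
    ("ex:c2", "abc:isCapitalOf", "ex:f"),
    ("ex:f", "abc:isInContinent", "abc:europe"),
}

# The four triple patterns from the WHERE clause of the sample query.
PATTERNS = [
    ("?x", "abc:cityname", "?capital"),
    ("?y", "abc:countryname", "?country"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:isInContinent", "abc:africa"),
]


def is_var(term):
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            yield from match(rest, triples, candidate)


def select(variables, patterns, triples):
    """SELECT: project the requested variables from each solution."""
    return sorted(tuple(b[v] for v in variables) for b in match(patterns, triples))


def ask(patterns, triples):
    """ASK: True if at least one solution exists."""
    return next(match(patterns, triples), None) is not None


def construct(template, patterns, triples):
    """CONSTRUCT: instantiate a triple template once per solution."""
    return {tuple(b.get(t, t) for t in template) for b in match(patterns, triples)}


print(select(["?capital", "?country"], PATTERNS, TRIPLES))
print(ask(PATTERNS, TRIPLES))
print(construct(("?y", "ex:hasCapital", "?x"), PATTERNS, TRIPLES))
```

With this toy data, only Nairobi and Kenya satisfy all four patterns, because the last pattern restricts ?y to countries in abc:africa. The three functions differ only in what they do with the same set of solutions, which is the difference between the query forms.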

The best reference book on SPARQL is “Learning SPARQL, Second Edition”.

In the following, I will describe how to set up and use an actual RDF store, using Blazegraph as an example. Blazegraph is an RDF store that runs on Java; it is used as the data endpoint for Wikidata and is also said to be the basis for AWS Neptune.

To use it, first set up a Java environment (Java 9 or later is recommended), then go to the Blazegraph web page and download the jar file (blazegraph.jar) from the download page.

After storing the downloaded jar file in any folder, run a terminal in that folder and type the following command. If the following message is displayed, the startup is complete.

>java -server -Xmx4g -jar blazegraph.jar
...
serviceURL: http://127.0.0.1:9999
Welcome to Blazegraph(tm) by SYSTAP.
Go to http://localhost:9999/blazegraph/ to get started.

Next, launch your browser and access “http://localhost:9999/blazegraph/”, the address given in the startup message, and the Blazegraph screen will be displayed.

The next step is to store the sample data (taken from the official website) in the database. Open the “UPDATE” tab in Blazegraph, specify the data to be stored, and click “UPDATE”.
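Data can also be loaded without the browser by sending a SPARQL Update request over HTTP. The following is a minimal sketch using only the Python standard library. The endpoint URL assumes Blazegraph's default “kb” namespace on the local server, the inserted triples are made-up sample data (not the official sample), and `build_update_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Assumption: default "kb" namespace of a locally running Blazegraph.
ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

# Made-up sample triples using the abc: vocabulary of the earlier query.
INSERT_DATA = """
PREFIX abc: <http://mynamespace.com/exampleOntologie#>
INSERT DATA {
  abc:nairobi abc:cityname "Nairobi" ;
              abc:isCapitalOf abc:kenya .
  abc:kenya   abc:countryname "Kenya" ;
              abc:isInContinent abc:africa .
}
"""


def build_update_request(update, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL 1.1 Protocol update request."""
    # The protocol expects the update text in a form field named "update".
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_update_request(INSERT_DATA)
    print(request.get_method(), request.full_url)
    # Sending needs a running server; uncomment to execute the update.
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```

Building the request separately from sending it makes it possible to inspect what would go over the wire before touching the database.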

Finally, open the “QUERY” tab, write a SPARQL query, and run it. If the search is successful, the results will be displayed on the screen.
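The same query can be run from a program through the SPARQL protocol. The sketch below uses only the Python standard library and asks for results in the standard SPARQL JSON results format. The endpoint URL again assumes Blazegraph's default “kb” namespace, and the function names are hypothetical helpers, not part of any Blazegraph client library.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default "kb" namespace of a locally running Blazegraph.
ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

QUERY = """
PREFIX abc: <http://mynamespace.com/exampleOntologie#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital.
  ?y abc:countryname ?country.
  ?x abc:isCapitalOf ?y.
  ?y abc:isInContinent abc:africa.
}
"""


def build_query_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL 1.1 Protocol query request."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Accept": "application/sparql-results+json"},
        method="POST",
    )


def rows(results):
    """Flatten a SPARQL JSON results document into plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_select(query, endpoint=ENDPOINT):
    """Send the query and return one dict per solution (needs a running server)."""
    with urllib.request.urlopen(build_query_request(query, endpoint)) as response:
        return rows(json.load(response))


if __name__ == "__main__":
    print(build_query_request(QUERY).full_url)
    # Uncomment once Blazegraph is running and data has been loaded.
    # for row in run_select(QUERY):
    #     print(row["capital"], row["country"])
```

Each solution comes back as a mapping from variable name to a typed value, so `rows` keeps only the "value" field to obtain something close to what the QUERY tab shows as a table.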

The integration of this RDF store with Clojure will be described separately.

