Data Stream System Architecture Overview

A system that processes data streams has to meet a variety of requirements. The characteristics required of such systems are listed below.

  1. Dealing with huge data
  2. Dealing with streamed data
  3. Linking heterogeneous data sets
  4. Dealing with incomplete data
  5. Dealing with noisy data
  6. Providing highly responsive (fast) answers
  7. Providing access to fine-grained information
  8. Integrating complex domain models

There are two major existing systems that may be able to address these issues, as shown below.

  • Data Stream Management System (DSMS)
  • Complex Event Processing (CEP) system

A DSMS converts incoming data into a form that a query can consume, by compressing it or splitting it into windows, and processes it with a query that is always running. For example, smoke and temperature sensors installed in many areas can be used to raise an alert when a fire occurs (for example, when smoke is detected and the temperature is above 50°). In such a system, a window covering a certain time period is defined over the smoke and temperature sensor readings, and the system raises an alert when the temperature reaches a certain value within that window. (See the link for details)
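The windowed, always-running query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular DSMS is implemented: the names `Reading` and `SlidingWindowQuery`, the 60-second window, and the reading of "50°" as a plain numeric threshold are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One combined sensor reading (hypothetical record layout)."""
    timestamp: float    # seconds since the stream started
    temperature: float  # same unit as the threshold; the unit is an assumption
    smoke: bool         # True if the smoke sensor fired


class SlidingWindowQuery:
    """Continuous query evaluated over a time-based sliding window."""

    def __init__(self, window_seconds: float, temp_threshold: float) -> None:
        self.window_seconds = window_seconds
        self.temp_threshold = temp_threshold
        self.window = deque()  # readings currently inside the window

    def push(self, reading: Reading) -> bool:
        """Add a reading, evict expired ones, and re-evaluate the query.

        Returns True when the window contains both smoke and a temperature
        above the threshold, i.e. when an alert should be raised.
        """
        self.window.append(reading)
        cutoff = reading.timestamp - self.window_seconds
        while self.window and self.window[0].timestamp < cutoff:
            self.window.popleft()
        hot = any(r.temperature > self.temp_threshold for r in self.window)
        smoky = any(r.smoke for r in self.window)
        return hot and smoky


query = SlidingWindowQuery(window_seconds=60, temp_threshold=50)
alerts = [
    query.push(Reading(0, 30.0, False)),    # nothing unusual
    query.push(Reading(10, 55.0, False)),   # hot, but no smoke yet
    query.push(Reading(20, 40.0, True)),    # smoke while the hot reading is still in the window
    query.push(Reading(100, 30.0, False)),  # older readings have expired
]
```

The query object never terminates on its own. It is re-evaluated on every arriving reading, which is the sense in which the query is "always running", while the window bounds how much of the unbounded stream has to be kept in memory.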

CEP is event processing that combines data from multiple sources to infer events and patterns that suggest more complex situations. In the example above, the system would generate a fire alert event in an area if it receives smoke and high temperature events within a minute. (See the link for more details)
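As a rough sketch of this kind of pattern matching, the Python below derives a fire alert event when a smoke event and a high-temperature event arrive for the same area within one minute. The event kinds (`"smoke"`, `"high_temperature"`, `"fire_alert"`) and the class names are assumptions made for illustration. Real CEP engines express such patterns in their own rule or query languages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    kind: str         # "smoke", "high_temperature", or derived "fire_alert"
    area: str
    timestamp: float  # seconds


class FireAlertPattern:
    """Infers a complex event from two simpler events in the same area."""

    SOURCES = ("smoke", "high_temperature")

    def __init__(self, within_seconds: float = 60.0) -> None:
        self.within_seconds = within_seconds
        self.last_seen = {}  # (area, kind) -> timestamp of the latest event

    def on_event(self, event: Event) -> Optional[Event]:
        """Record the event and return a derived fire alert if the pattern matches."""
        if event.kind not in self.SOURCES:
            return None
        self.last_seen[(event.area, event.kind)] = event.timestamp
        other = "high_temperature" if event.kind == "smoke" else "smoke"
        other_ts = self.last_seen.get((event.area, other))
        if other_ts is not None and abs(event.timestamp - other_ts) <= self.within_seconds:
            return Event("fire_alert", event.area, event.timestamp)
        return None


pattern = FireAlertPattern(within_seconds=60)
results = [
    pattern.on_event(Event("smoke", "A", 0)),
    pattern.on_event(Event("high_temperature", "B", 10)),  # different area
    pattern.on_event(Event("high_temperature", "A", 30)),  # completes the pattern in area A
    pattern.on_event(Event("smoke", "B", 200)),            # too late for area B
]
```

The difference from the windowed query is one of emphasis: the output here is itself an event (`fire_alert`) that can be fed into further patterns, rather than the result of a query over raw readings.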

DSMS and CEP systems can meet the following requirements from the list above: "1. Dealing with huge data", "2. Dealing with streamed data", "5. Dealing with noisy data", "6. Providing highly responsive (fast) answers", and "7. Providing access to fine-grained information". They cannot, however, meet "3. Linking heterogeneous data sets", "4. Dealing with incomplete data", or "8. Integrating complex domain models".

To address these remaining requirements (3, 4, and 8), an Ontology-Based Data Access (OBDA) system can rewrite queries using knowledge data called an ontology. However, since it is a static data access system, it cannot support "2. Dealing with streamed data", "5. Dealing with noisy data", or "6. Providing highly responsive (fast) answers".
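The idea of rewriting a query with an ontology can be illustrated with a deliberately small Python sketch. Everything in it is hypothetical: the concept names, the `SUBCLASS_OF` table standing in for an ontology, and the in-memory `RECORDS` standing in for heterogeneous data sources. Actual ontology-based data access systems typically rewrite queries into queries over the underlying databases instead of filtering Python lists.

```python
from typing import List

# Toy ontology: each concept maps to its direct super-concept (assumed names).
SUBCLASS_OF = {
    "SmokeSensor": "FireSensor",
    "TemperatureSensor": "FireSensor",
    "FireSensor": "Sensor",
    "HumiditySensor": "Sensor",
}

# Records from different sources, each labelled with its own vocabulary.
RECORDS = [
    {"id": "s1", "type": "SmokeSensor"},
    {"id": "t1", "type": "TemperatureSensor"},
    {"id": "h1", "type": "HumiditySensor"},
]


def rewrite(concept: str) -> List[str]:
    """Expand a queried concept into itself plus all concepts it subsumes."""
    result = [concept]
    frontier = [concept]
    while frontier:
        parent = frontier.pop()
        for child, sup in SUBCLASS_OF.items():
            if sup == parent and child not in result:
                result.append(child)
                frontier.append(child)
    return sorted(result)


def answer(concept: str) -> List[str]:
    """Answer a query for a concept by evaluating its rewritten form."""
    wanted = set(rewrite(concept))
    return [r["id"] for r in RECORDS if r["type"] in wanted]


fire_sensor_ids = answer("FireSensor")
all_sensor_ids = answer("Sensor")
```

No record is labelled `FireSensor` directly, yet the query for it still returns the smoke and temperature sensors, because the ontology supplies the knowledge that links the differently labelled data sets. The data is queried once, as a fixed set, which is the static behavior that the text identifies as the limitation.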

In order to solve these problems, a technology called Stream Reasoning was proposed as a system that combines the features of ontology-based data access systems and DSMS/CEP. This technology is designed to flexibly combine various types of stream data to support queries in various contexts.

We will discuss the details of these technologies in the next article.
