DX (Digital Transformation) Technologies

This section discusses information on digital transformation (DX) in the areas shown below.

The details of each are described below.

DX (Digital Transformation) Technologies

DX (Digital Transformation) has been attracting a great deal of attention in recent years. It is said that more than 80% of the information in a company is unstructured or non-electronic information that is difficult to handle by computer. However, digitizing such unstructured or non-electronic information is very difficult and requires a strategic approach.

The target information of DX includes text information processed by natural language processing technology, image information processed by deep learning, audio information processed by probabilistic generative models such as hidden Markov models (HMMs), information generated by IoT devices such as sensors, and knowledge information processed by Semantic Web and ontology technology.

Each of these kinds of processing is described below. They involve a wide range of know-how, and the digitized information (features, etc.) is interrelated and generates various kinds of value when combined around knowledge information as the core.

One approach to tackling DX is to take the following steps:

  1. Clarify the objectives (analyze the issues): Analyze the current business flow and its issues. In many cases, the issues cannot be clearly identified at the initial stage. In such cases, form a hypothesis using existing analysis methods (KPI, OKR, etc.), select the target business, and visualize the business flow along the three axes of actors (people involved), objects (documents and other information), and systems.
  2. Organize/analyze the target information: For the area selected in step 1, list the data to be collected/analyzed, its volume, its location, and its attributes (what type of information it is, whether it is already digitized, etc.). For information that needs to be digitized, research and decide on the means of digitization. At this point, consider the granularity of the information in light of the problem to be solved (e.g., is the unit a whole document, a paragraph, or a single sentence?), and check whether the information necessary to solve the problem is actually available, given the purpose defined in step 1.
  3. Primary verification: For information-related problems, the first task is to find the information. Therefore, using an open-source search engine (such as Fess or Elasticsearch, described later), build a search system over the information listed in step 2 and verify whether the information required for the purpose of step 1 can be obtained through basic search (a minimal sketch of such a verification follows this list). This verification is evaluated against target values (input-output pairs) assumed in advance; by analyzing how much of the information can be obtained and what is missing when the expected answer is not returned, the necessary data (step 2) and the purpose/target values (step 1) are revised.
  4. Secondary verification: Information that could not be obtained in step 3 cannot be obtained from the raw search material alone; to generate it, select appropriate machine learning/artificial intelligence techniques to process/generate the information. Here, the target values quantify the issues identified in step 3 (information that cannot be obtained from plain search material, etc.), and the results are compared with the objectives of step 1 to estimate the final cost-benefit ratio.
  5. Construction of the production system: After confirming in step 4 that clear benefits can be obtained, build the production system (taking into account robustness, scale, maintainability, etc.).
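
As a concrete image of the primary verification in step 3, the following is a minimal sketch that indexes a few listed documents into Elasticsearch and checks whether assumed input queries return the expected documents. The index name, sample documents, and expected answers are hypothetical, and the snippet assumes a locally running Elasticsearch instance and the official Python client (elasticsearch-py 8.x).

```python
# Minimal sketch of the "primary verification" in step 3:
# index the listed documents and check whether assumed queries
# (inputs) return the expected documents (outputs).
# Assumes Elasticsearch runs locally and the 8.x Python client is installed.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical documents listed in step 2
documents = [
    {"id": "doc-001", "title": "Purchase approval flow",
     "body": "Purchase requests above 100,000 yen require department manager approval."},
    {"id": "doc-002", "title": "Travel expense rules",
     "body": "Travel expenses must be submitted within one month of the trip."},
]

for doc in documents:
    es.index(index="dx-primary-verification", id=doc["id"], document=doc)
es.indices.refresh(index="dx-primary-verification")

# Hypothetical input-output pairs assumed in advance (step 3 target values)
test_cases = [
    {"query": "who approves purchase requests", "expected_id": "doc-001"},
    {"query": "deadline for travel expenses", "expected_id": "doc-002"},
]

hits_ok = 0
for case in test_cases:
    res = es.search(
        index="dx-primary-verification",
        query={"multi_match": {"query": case["query"], "fields": ["title", "body"]}},
    )
    top = res["hits"]["hits"][0]["_id"] if res["hits"]["hits"] else None
    if top == case["expected_id"]:
        hits_ok += 1

print(f"{hits_ok}/{len(test_cases)} assumed questions answered by plain search")
```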

The reference information necessary for the above steps is shown below.

Problem setting and quantification

In order to perform machine learning, it is necessary to determine the nature of the issue and quantify it as a target value using various problem-solving frameworks such as PDCA, KPI, and KGI. In addition, when the problem has not yet been clarified, it is necessary to form hypotheses using deduction, induction, projection, analogy, abduction, and other non-deductive methods, to verify the hypotheses formed without falling into confirmation bias, and to devise ways to quantify them using methods such as Fermi estimation.
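
As a hedged illustration of quantifying a target value, the following is a small Fermi-style estimate of how many documents a digitization effort would need to cover; every number is a hypothetical assumption, and only the order of magnitude matters.

```python
# Fermi-style estimate of the volume of documents to digitize.
# Every number here is a hypothetical assumption; the point is
# to obtain an order-of-magnitude target value, not a precise one.
employees = 1_000                  # assumed company size
docs_per_employee_per_week = 5     # assumed documents each person produces
working_weeks_per_year = 48
retention_years = 5                # assumed period worth digitizing

total_docs = (employees * docs_per_employee_per_week
              * working_weeks_per_year * retention_years)
print(f"Rough target: {total_docs:,} documents (~10^{len(str(total_docs)) - 1})")
```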

In the following pages of this blog, we describe these problem-solving methods, thinking methods, and experimental designs in detail.

Specific Applications of Artificial Intelligence Technology to DX

Artificial intelligence technology refers to technology that allows computers and robots to perform intelligent tasks previously performed by humans by imitating human intelligence and thought processes. Artificial intelligence technology includes various technologies such as machine learning, deep learning, natural language processing, and image recognition.

Artificial intelligence technology has developed rapidly in recent years and is used in a variety of fields. Its applications are wide-ranging and include self-driving cars, the medical field, finance, marketing, and more. In the following pages of this blog, examples of the application of artificial intelligence technologies to DX are listed.

Applying the ICT Framework

Web Technology

Web technology is a platform for technologies such as machine learning, artificial intelligence, and digital transformation.

In the following pages of this blog, you will find an overview of web technologies (Internet technologies, the HTTP protocol, web servers, web browsers, and web applications), implementation techniques (JavaScript, React, Clojure, Python, etc.), specific applications (MAMP, MediaWiki and other CMS (Content Management Systems), WordPress, Fess as a search platform, Elasticsearch, etc.), and various applications presented at conferences and on the web.

DB Technology

As Wikipedia explains, "a database is a collection of information organized for easy retrieval and storage, usually realized by a computer." The data structures handled by a program, and the data itself, can be manipulated with far fewer man-hours than with proprietary implementations, making databases the most important technology in modern information systems that handle huge amounts of data [source].

The advantages of using a database are that you can simply use a general-purpose data structure rather than implementing your own in a program, and that you can rely on mechanisms that ensure data consistency (data backup, etc.), which I will discuss later.
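
As a minimal illustration of the first advantage (using a general-purpose data structure instead of implementing your own), the following sketch stores and retrieves records with Python's built-in sqlite3 module; the table name, columns, and contents are hypothetical.

```python
# Minimal sketch: storing and querying records in a general-purpose
# database (SQLite, bundled with Python) instead of hand-rolling a
# file format and search routine. Table and columns are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")   # in-memory DB for the example
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents (title, body) VALUES (?, ?)",
    [("expense rules", "Submit receipts within one month."),
     ("approval flow", "Purchases above 100,000 yen need approval.")],
)
conn.commit()

# Retrieval is a one-line query rather than custom parsing code.
for row in conn.execute("SELECT id, title FROM documents WHERE body LIKE ?", ("%approval%",)):
    print(row)
conn.close()
```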

In the following pages of this blog, we discuss various technologies related to this database.

Search Technology

Information is the basis of computer technology. Simply collecting information is meaningless; in order to carry out creative activities based on the collected information, it is necessary to go through a cycle of "collecting," "searching," "finding," "looking," and "noticing." For each of these, there are corresponding technologies and ideas. In this article, I will discuss search technology.
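
As a toy illustration of the "searching" step, the following sketch builds a small inverted index in pure Python and looks up which documents contain every term of a query; the sample texts are hypothetical.

```python
# Toy inverted index: the core data structure behind search engines.
# Maps each term to the set of document ids that contain it.
from collections import defaultdict

docs = {
    1: "machine learning for anomaly detection",
    2: "deep learning for image recognition",
    3: "anomaly detection in sensor data",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query: str) -> set[int]:
    """Return ids of documents containing every query term (AND search)."""
    terms = query.lower().split()
    result = index[terms[0]].copy() if terms else set()
    for term in terms[1:]:
        result &= index[term]
    return result

print(search("anomaly detection"))   # {1, 3}
```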

In the following pages of this blog, various technologies related to search technology are discussed.

Chatbots and Question & Answer Technology

Chatbot technology can be used as a general-purpose user interface in a variety of business domains, and because of the diversity of business opportunities it offers, it is an area that many companies are now entering.

The question-and-answer technology that forms the basis of chatbots is more than just user interface technology. It is the culmination of advanced technologies that combine artificial intelligence technologies such as natural language processing and inference, and machine learning technologies such as deep learning, reinforcement learning, and online learning.
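
As a minimal sketch of the retrieval side of question answering, the following matches a user question against a small FAQ using TF-IDF and cosine similarity with scikit-learn; the FAQ entries are hypothetical, and real systems add the natural language processing, inference, and learning layers mentioned above.

```python
# Minimal retrieval-based question answering: return the FAQ answer
# whose question is most similar to the user input (TF-IDF + cosine).
# The FAQ content is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are the support hours?", "Support is available on weekdays from 9:00 to 17:00."),
]
questions = [q for q, _ in faq]

vectorizer = TfidfVectorizer()
question_vectors = vectorizer.fit_transform(questions)

def answer(user_question: str) -> str:
    vec = vectorizer.transform([user_question])
    scores = cosine_similarity(vec, question_vectors)[0]
    return faq[scores.argmax()][1]

print(answer("I forgot my password"))
```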

In the following pages of this blog, we discuss a variety of topics related to chatbots and question-and-answer technology, from their origins, to business aspects, to a technical overview including the latest approaches, to concrete, ready-to-use implementations.

User Interface and Data Visualization

Using a computer to process data is equivalent to creating value by visualizing the structure within the data. In addition, data itself can be interpreted in multiple ways from multiple perspectives, and in order to visualize them, we need a well-designed user interface.
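
As a small sketch of visualizing structure in data and of viewing the same data from more than one perspective, the following plots hypothetical two-dimensional data both as a raw scatter and colored by a simple grouping rule; the data and the grouping threshold are purely illustrative.

```python
# Minimal sketch: the same data, visualized from two perspectives.
# Left: raw scatter. Right: the same points colored by a grouping rule.
# The data are synthetic and purely illustrative.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1, 100)])
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.scatter(x, y, s=10)
ax1.set_title("raw data")

labels = (x > 2).astype(int)      # one possible interpretation: split at x = 2
ax2.scatter(x, y, c=labels, s=10)
ax2.set_title("grouped view")
plt.tight_layout()
plt.show()
```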

In the following pages of this blog, I will discuss various examples of this user interface, mainly focusing on papers presented at conferences such as ISWC.

Workflow & Services

This section summarizes information on service platforms, workflow analysis, and their application to real business, as published at ISWC and elsewhere.

In the following pages of this blog, we discuss service platforms using the Semantic Web for business domains such as healthcare, law, manufacturing, and science.

Stream Data Technology

This world is full of dynamic data, not just static data. For example, huge amounts of dynamic data are generated in factories, plants, transportation, the economy, social networks, and so on. In factories and plants, a typical sensor on an oil production platform makes 10,000 observations per minute, peaking at 100,000 observations per minute. In mobile data, mobile users in Milan make 20,000 calls/SMS/data connections per minute, reaching 80,000 connections per minute at peak times. In social networks, Facebook, for example, observed 3 million likes per minute as of May 2013.

Use cases involving such data include questions like "What is the expected time of failure, given that the turbine bearing has started to vibrate in the last 10 minutes?", "Is there public transportation where people are gathering?", or "Who is discussing the top ten topics?". These are just a few of the many fine-grained issues that arise, and solutions to them are needed.
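
As a minimal sketch of this kind of continuous query (e.g., "has vibration risen over the last 10 minutes?"), the following maintains a sliding window over a simulated sensor stream and raises a flag when the windowed average exceeds a threshold; the readings, window size, and threshold are hypothetical.

```python
# Minimal sliding-window query over a stream of sensor readings:
# flag when the average vibration over the last N readings exceeds
# a threshold. Readings and threshold are hypothetical.
from collections import deque

WINDOW = 10        # e.g. last 10 samples ("last 10 minutes" at 1 sample/min)
THRESHOLD = 0.8

window = deque(maxlen=WINDOW)

def on_reading(value: float) -> None:
    window.append(value)
    if len(window) == WINDOW and sum(window) / WINDOW > THRESHOLD:
        print(f"ALERT: windowed mean vibration {sum(window)/WINDOW:.2f} exceeds {THRESHOLD}")

# Simulated stream: normal vibration, then a rising trend.
stream = [0.2, 0.3, 0.25, 0.3, 0.2, 0.4, 0.5, 0.7, 0.9, 1.0, 1.1, 1.2, 1.3, 1.2, 1.4]
for v in stream:
    on_reading(v)
```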

In the following pages of this blog, we discuss real-time distributed processing frameworks for handling such stream data, machine learning processing of time series data, and application examples such as smart cities and Industry 4.0 that utilize these frameworks.

Digitization (electronic conversion) of unstructured information

Natural Language Processing Technology

Language is a tool used for communication between people. It is easy for humans to learn to speak, requiring no special talent or long, steady training. However, it is almost impossible for anything other than humans to master language, which makes language a very mysterious thing.

Natural language processing is the study of using computers to handle such language. Initially, natural language processing was realized by writing a series of rules stating "this is what language is like." However, language is extremely diverse, constantly changing, and can be interpreted differently by different people and in different contexts. It is not practical to write all of this down as rules, and since the late 1990s statistical inference based on data, i.e., actual natural language data, has been the mainstream, replacing rule-based natural language processing. Statistical natural language processing is, to put it crudely, solving problems based on a model of how words are actually used.
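
To make "a model of how words are actually used" concrete, the following toy sketch builds word co-occurrence vectors from a few sentences and measures word similarity by cosine; the corpus is hypothetical and far too small for real use.

```python
# Toy distributional model: words that appear in similar contexts get
# similar co-occurrence vectors, so their cosine similarity is high.
# The corpus is hypothetical and far too small for real use.
import numpy as np

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the food",
    "the cat ate the fish",
]

vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
cooc = np.zeros((len(vocab), len(vocab)))

for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j, c in enumerate(words):
            if i != j:
                cooc[idx[w], idx[c]] += 1

def similarity(a: str, b: str) -> float:
    va, vb = cooc[idx[a]], cooc[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(similarity("cat", "dog"), similarity("cat", "food"))
```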

In the following pages of this blog, we first discuss (1) what natural language is, (2) natural language from the viewpoints of philosophy, linguistics, and mathematics, (3) natural language processing technology in general, and (4) word similarity, which is particularly important among these. We then describe (5) various tools for handling natural language on computers and (6) their concrete programming and implementation, providing information that can be used for real-world tasks.

Image Processing Technology

With the development of modern Internet technology and smartphones, the web is filled with a vast number of images. One of the technologies that can create new value from this vast number of images is computer-based image recognition. Image recognition requires not only knowledge of image-specific constraints, pattern recognition, and machine learning, but also expert knowledge of the target application. In addition, due to the recent artificial intelligence boom triggered by the success of deep learning, a huge number of research papers on image recognition have been published, and it has become difficult to follow them all. Because the content of image recognition is so vast, it is difficult to survey the entire field and acquire knowledge without clear guidelines.

In the following pages of this blog, we discuss the theories and algorithms of image information processing, as well as specific applications of deep learning using Python/Keras and approaches using sparse models and probabilistic generative models.
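
As one concrete example of the Python/Keras approach mentioned above, the following is a minimal convolutional network for MNIST digit classification; the architecture and hyperparameters are illustrative, not tuned.

```python
# Minimal CNN image classifier in Keras (MNIST digits).
# Architecture and hyperparameters are illustrative, not tuned.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., np.newaxis].astype("float32") / 255.0
x_test = x_test[..., np.newaxis].astype("float32") / 255.0

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=128, validation_split=0.1)
print(model.evaluate(x_test, y_test, verbose=0))
```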

Speech Recognition

One area where machine learning technology can be applied is signal processing. This mainly concerns one-dimensional data that changes along the time axis, such as various kinds of sensor data and speech signals. Various machine learning techniques, including deep learning, are used for speech signal recognition.

In the following pages of this blog, we discuss speech recognition technology, starting from the mechanisms of natural language and speech, A/D conversion, and the Fourier transform, through to applications such as speaker adaptation, speaker recognition, and noise-robust speech recognition using methods such as dynamic programming (DP), hidden Markov models (HMMs), and deep learning.
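
As a small illustration of the signal-processing front end (from A/D-converted samples to frequency features), the following computes per-frame Fourier transform magnitudes of a synthetic signal with NumPy; the signal, frame size, and hop size are hypothetical.

```python
# Toy front end of speech recognition: frame a (synthetic) waveform
# and compute per-frame FFT magnitudes, i.e. a crude spectrogram.
# Signal, frame size, and hop size are hypothetical.
import numpy as np

sr = 16000                                   # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / sr)
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

frame, hop = 400, 160                        # 25 ms frames, 10 ms hop at 16 kHz
window = np.hanning(frame)

frames = [
    signal[start:start + frame] * window
    for start in range(0, len(signal) - frame + 1, hop)
]
spectrogram = np.abs(np.fft.rfft(frames, axis=1))   # shape: (num_frames, frame//2 + 1)

print(spectrogram.shape)
peak_bin = spectrogram.mean(axis=0).argmax()
print("dominant frequency ~", peak_bin * sr / frame, "Hz")
```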

Geospatial Information Processing

The term "geospatial information" refers to information about location or information that is linked to location. For example, it is said that 80% of the information handled by government and used for LOD is linked to some kind of location information, and in the extreme case, if the location where information arises is recorded together with that information, all information can be called geospatial information.

By handling information in connection with location, it is possible to grasp the distribution of information simply by plotting location information on a map. If you have data on roads and destinations linked to latitude and longitude, you can guide a person with a GPS device to a desired location, or track his or her movements. By following the trajectory of how a person moves, it is possible to provide location-based services with information on past, present, and future events.

By making good use of these features of location information, it will be possible to make new scientific discoveries, develop services in business, and solve various social problems.
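
As a small example of computing with location-linked data, the following calculates the great-circle (haversine) distance between two points given as latitude/longitude; the coordinates are approximate and purely illustrative (roughly Tokyo Station and Osaka Station).

```python
# Haversine great-circle distance between two latitude/longitude points.
# Coordinates are approximate and purely illustrative.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Distance in kilometers on a spherical Earth (radius ~6371 km)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Roughly Tokyo Station to Osaka Station
print(haversine_km(35.681, 139.767, 34.702, 135.495))   # ~400 km
```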

In the following pages of this blog, we will discuss how to use QGIS, a geospatial information platform, and how to combine it with R and various machine learning tools, as well as with Bayesian models.

Sensor Data and IoT

The use of sensor information is a central element of IoT technology. There are various types of sensor data, but here we focus on one-dimensional, time-varying information.

There are two types of IoT approach: one sets up individual sensors for a specific measurement target and analyzes the characteristics of that target in detail; the other, as described in "Application of Sparse Models to Anomaly Detection," sets up multiple sensors for multiple targets and selects specific data from the obtained data to make decisions such as anomaly detection for a particular target.

In the following pages of this blog, we discuss a wide range of topics, including various IoT standards (WoT, etc.), statistical processing of time series data, probabilistic generative models such as hidden Markov models, sensor placement optimization by submodular optimization, control of hardware such as BLE devices, and smart cities.

Anomaly Detection and Change Detection

In any business setting, it is extremely important to be able to detect changes or signs of anomalies. For example, by detecting changes in sales, we can quickly take the next step, or by detecting signs of abnormalities in a chemical plant in operation, we can prevent serious accidents from occurring. This will be very meaningful when considering digital transformation and artificial intelligence tasks.

In addition to rule-based approaches, it is now possible to build practical anomaly and change detection systems using statistical machine learning techniques. This is a general-purpose approach that uses the probability distribution p(x) of the possible values of an observation x to describe the conditions for anomalies and changes as mathematical expressions.

In the following pages of this blog, I describe various approaches to anomaly and change detection, starting from Hotelling's T2 method and including Bayesian methods, neighborhood-based methods, and others.
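
As a minimal sketch of the p(x)-based approach, the following implements one-dimensional Hotelling's T2 anomaly detection: fit a Gaussian to assumed-normal data, score new points by their squared Mahalanobis distance, and flag scores above a chi-square threshold; the data and false-alarm rate are hypothetical.

```python
# Minimal Hotelling's T2 anomaly detection in one dimension:
# fit a Gaussian to (assumed normal) training data, score new points
# by squared Mahalanobis distance, and flag scores above a chi-square
# threshold. Data and false-alarm rate are hypothetical.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
train = rng.normal(loc=10.0, scale=2.0, size=1000)   # normal operating data
mu, var = train.mean(), train.var()

threshold = chi2.ppf(0.99, df=1)     # allow roughly 1% false alarms

def anomaly_score(x: float) -> float:
    return (x - mu) ** 2 / var

for x in [10.5, 14.0, 25.0]:
    score = anomaly_score(x)
    print(x, round(score, 2), "anomaly" if score > threshold else "normal")
```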

Linking digitized data with knowledge information

Semantic Web Technology

Semantic Web technology is "a project to improve the convenience of the World Wide Web by developing standards and tools that make it possible to handle the meaning of Web pages," and it will evolve Web technology from the current WWW, a "web of documents," to a "web of data."

The data handled there is not the Data of the DIKW (Data, Information, Knowledge, Wisdom) pyramid, but Information and Knowledge, expressed in ontologies, RDF, and other frameworks for representing knowledge, and used in various DX and AI tasks.
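
As a small, hedged example of "web of data"-style representation, the following builds a few RDF triples with the rdflib Python library and serializes them as Turtle; the namespace and resources are hypothetical.

```python
# Minimal RDF example with rdflib: describe a resource with a few
# triples and serialize as Turtle. Namespace and resources are hypothetical.
from rdflib import Graph, Literal, Namespace, RDF, URIRef
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)
g.bind("foaf", FOAF)

alice = URIRef("http://example.org/alice")
g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice")))
g.add((alice, EX.worksOn, EX.DXProject))

print(g.serialize(format="turtle"))
```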

In the following pages of this blog, I discuss this Semantic Web technology, ontology technology, and conference papers such as those from ISWC (International Semantic Web Conference), the world's leading conference on Semantic Web technology.

Knowledge Data and its Utilization

The question of how to handle knowledge as information is a central issue in artificial intelligence technology, and has been examined in various ways since the invention of the computer.

Knowledge in this context is expressed in natural language, which is then converted into information that computers can handle and compute with. How do we represent knowledge? How do we handle the represented knowledge? And how do we extract knowledge from various data and information? These are the issues discussed in the following pages of this blog.

In the following pages of this blog, we discuss the handling of such knowledge information, including the definition of knowledge, approaches using Semantic Web technologies and ontologies, predicate logic based on mathematical logic, logic programming using Prolog, and answer set programming as an application of these approaches.
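
The logic-programming approaches above (Prolog, answer set programming) use their own languages, so as a rough stand-in in the Python register used for examples here, the following sketches naive forward chaining over facts and an if-then rule; the facts and the rule are hypothetical.

```python
# Naive forward chaining: repeatedly apply an if-then rule to a set of
# facts until nothing new can be derived. This is only a toy Python
# stand-in for what Prolog or answer set programming provide natively.
facts = {("parent", "tom", "bob"), ("parent", "bob", "ann")}

def grandparent_rule(facts):
    """If parent(X, Y) and parent(Y, Z) then grandparent(X, Z)."""
    derived = set()
    for (p1, x, y1) in facts:
        for (p2, y2, z) in facts:
            if p1 == p2 == "parent" and y1 == y2:
                derived.add(("grandparent", x, z))
    return derived

changed = True
while changed:
    new = grandparent_rule(facts) - facts
    changed = bool(new)
    facts |= new

print(sorted(f for f in facts if f[0] == "grandparent"))
# [('grandparent', 'tom', 'ann')]
```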

Ontology Technology

The term ontology has long been used in a branch of philosophy; according to Wikipedia, "it is not concerned with the individual nature of various things (beings), but with the meaning and fundamental rules of being that bring beings into existence, and is considered to be metaphysics, or a branch of it, alongside epistemology."

Metaphysics deals with abstract concepts of things, and ontology in philosophy deals with the abstract concepts and laws behind things.

On the other hand, according to Wikipedia, an ontology in information engineering is "a formal representation of knowledge as a set of concepts within a domain and the relations among those concepts, used to reason about the entities (realities) in the domain and to describe the domain." It also states that an ontology is "a formal and explicit specification of a shared conceptualization" and provides a vocabulary (the types, properties, and relations of objects and concepts) that is used to model a domain.

In the following pages of this blog, we will discuss the use of this ontology from the perspective of information engineering.
