Introduction to the steps and technologies required for digital transformation

DX (digital transformation) technology is key to innovating business processes and building competitive advantage through digital technology. The following topics are discussed in detail here. Click an item in the table of contents to jump to the corresponding summary.

DX Technologies

DX (digital transformation) has attracted considerable attention in recent years. It is said that more than 80% of the information in a company is unstructured or non-electronic, and therefore difficult to handle electronically. Digitizing such information is very difficult and requires a strategic approach.

The information targeted by DX includes text processed by natural language processing, images processed by deep learning, audio processed by probabilistic generative models such as HMMs, data generated by IoT devices such as sensors, and knowledge information processed by Semantic Web and ontology technologies.

These processes are described below. Each involves a wide range of know-how, and the digitized information (features, etc.) is interrelated: it generates value when combined, with knowledge information at the core.

One approach to DX is to take the following steps:

1. Clarify the objectives (analyze the issues): Analyze the current business flow and its issues. In many cases the issues cannot be clearly identified at the initial stage. If so, form a hypothesis using existing analysis methods (KPI, OKR, etc.), select the target business, and visualize the business flow along three axes: actors (people), objects (documents and other information), and systems.
2. Organize and analyze the target information: List the data to be collected and analyzed for the area selected in step 1, including its volume, location, and attributes (type of information, whether it is digitized, etc.). For data that needs to be digitized, research and decide on the means of digitization. Also consider the granularity of the information in light of the problem to be solved (e.g., a whole document, a paragraph, or a single sentence), and check whether the information needed to achieve the objective of step 1 is available.
3. Primary verification: In information-based problems, the first task is to find the information. Therefore, build a search system over the information listed in step 2 using an open-source search engine (such as Fess or Elasticsearch, described later), and verify whether the information required for the objective of step 1 can be obtained through basic search. Evaluate this against target values (input-output pairs) assumed in advance. Analyze how much of the information can be obtained and, where an answer cannot be obtained, what is missing; then revise the required data (step 2) and the objectives and target values (step 1).
4. Secondary verification: Information that could not be obtained in step 3 cannot be obtained from the raw search material alone. To generate it, select appropriate machine learning or artificial intelligence techniques to process or generate the information. The target values here are quantified versions of the issues found in step 3 (information that cannot be obtained from plain search alone, etc.), and the results are evaluated against them. Finally, estimate the cost-benefit ratio by comparing the results with the objectives of step 1.
5. Construction of the production system: After confirming in step 4 that clear benefits can be obtained, build the production system, taking robustness, scale, maintainability, and so on into account.
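
The primary verification in step 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the documents, the expected query-to-document pairs, and the `search` and `evaluate` functions are all hypothetical, and a simple substring match stands in for a real search engine such as Fess or Elasticsearch.

```python
# Hypothetical documents standing in for the material listed in step 2.
DOCUMENTS = {
    "doc1": "Invoice approval requires a signature from the department manager.",
    "doc2": "Travel expenses are reimbursed within thirty days of submission.",
    "doc3": "New employees receive a laptop on their first day.",
}

# Target values assumed in advance: query -> id of the document that should be found.
EXPECTED = {
    "invoice approval": "doc1",
    "expenses reimbursed": "doc2",
    "parental leave": "doc4",  # deliberately absent from DOCUMENTS
}


def search(query, documents):
    """Return ids of documents containing every query term (case-insensitive)."""
    terms = query.lower().split()
    return [doc_id for doc_id, text in documents.items()
            if all(term in text.lower() for term in terms)]


def evaluate(expected, documents):
    """Split the expected pairs into queries plain search answers and those it misses."""
    found, missing = [], []
    for query, doc_id in expected.items():
        (found if doc_id in search(query, documents) else missing).append(query)
    return found, missing


found, missing = evaluate(EXPECTED, DOCUMENTS)
hit_rate = len(found) / len(EXPECTED)
print(f"hit rate: {hit_rate:.2f}, unanswered: {missing}")
```

The unanswered queries are the candidates for revising the data in step 2 or for the secondary verification in step 4.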

The reference information necessary for the above steps is shown below.

Problem setting and quantification

Problem Solving Methods and Thinking and Design of Experiments. In machine learning, it is important to quantify goals using frameworks such as PDCA and KPI. If the problem is unclear, hypotheses are formulated using deduction and abduction methods, verified while avoiding confirmation bias, and quantified using Fermi estimation. This section describes problem-solving methods, thinking methods, and experimental design.
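
As a small illustration of quantifying an unclear problem with a Fermi estimate, the sketch below estimates the yearly time a company spends searching for documents. Every figure is a made-up assumption chosen only to show the arithmetic, not a measured value.

```python
# All inputs are hypothetical assumptions for illustration.
employees = 200            # people who search for documents
searches_per_day = 5       # searches per person per working day
minutes_per_search = 4     # average time lost per search
working_days = 240         # working days per year

# Multiply the factors and convert minutes to hours.
hours_per_year = employees * searches_per_day * minutes_per_search * working_days / 60
print(f"Estimated search time: {hours_per_year:.0f} hours per year")
```

An order-of-magnitude figure like this can then serve as a baseline KPI when estimating the benefit of a search system.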

Specific Applications of Artificial Intelligence Technology for DX Applications

Artificial intelligence technology as a DX case study. Artificial intelligence technology is a technology that allows computers and robots to perform intelligent tasks by mimicking human intelligence and thinking, and includes machine learning, deep learning, natural language processing, and image recognition. It has developed rapidly in recent years and is used in various fields such as self-driving cars, medicine, finance, and marketing. This section introduces examples of application of artificial intelligence technology to DX (Digital Transformation).

Applying the ICT Framework

Web Technology

Web Technologies. Web technologies are the platform that supports machine learning, AI, and DX. This section covers Internet technologies, HTTP, web servers, browsers, programming with JavaScript and React, MAMP and CMSs (MediaWiki, WordPress), and the implementation and application of search platforms (Fess, Elasticsearch).

DB Technology

Database Technology. A database is a system for organizing information and facilitating retrieval and storage, often using a database management system. This enables data manipulation with fewer man-hours than proprietary implementations, and is essential in modern information systems that handle vast amounts of data. Advantages include the ability to utilize a general-purpose data structure, data uniformity, and backup functions. This section introduces various database-related technologies.
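
As a minimal sketch of how a database management system reduces implementation effort, the example below uses Python's built-in `sqlite3` module with an in-memory database. The `documents` table and its columns are hypothetical and exist only for this illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, digitized INTEGER)"
)
conn.executemany(
    "INSERT INTO documents (title, digitized) VALUES (?, ?)",
    [("Contract A", 1), ("Handwritten memo", 0), ("Manual B", 1)],
)
conn.commit()

# Retrieval is a declarative query rather than hand-written search code.
rows = conn.execute(
    "SELECT title FROM documents WHERE digitized = 0 ORDER BY title"
).fetchall()
pending = [title for (title,) in rows]
conn.close()

print(pending)
```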

Search Technology

Search technology. Simply collecting information is meaningless; a cycle of “collecting,” “searching,” “finding,” “viewing,” and “noticing” is necessary for creative activities. This section explains the “searching” part of that cycle, that is, search technology.
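
One common building block of search engines is the inverted index, which maps each word to the documents that contain it. The sketch below is a simplified illustration, not how Fess or Elasticsearch are implemented; the function names and sample documents are hypothetical, and tokenization is reduced to whitespace splitting.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup(index, query):
    """Return ids of documents that contain all query words (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "sensor data from the factory",
    2: "factory safety manual",
    3: "sensor calibration manual",
}
index = build_index(docs)
print(lookup(index, "factory manual"))
```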

Chatbots and Question & Answer Technology

Chatbots and Question-and-Answer Technology. Chatbot technology is used as a general-purpose user interface in various business fields, and many companies are entering this area. Built on question-and-answer technology, it is not just a user interface technology; it also draws on advanced artificial intelligence and machine learning technologies such as natural language processing, inference, deep learning, reinforcement learning, and online learning. At present, however, many chatbots do not make full use of these technologies and remain simple and rule-based. This section discusses a variety of topics regarding chatbots and question-and-answer technology, from their origins, business aspects, and technical overviews including the latest approaches, to specific implementations that can be used in practice.
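
To make the notion of a simple rule-based chatbot concrete, here is a minimal sketch. The rules, the canned answers, and the `reply` function are hypothetical; a practical system would add the natural language processing and learning components mentioned above.

```python
import re

# Each rule pairs a pattern with a canned answer; the first match wins.
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.I), "Hello. How can I help you?"),
    (re.compile(r"\bopening hours\b", re.I),
     "We are open from 9:00 to 17:00 on weekdays."),
    (re.compile(r"\breset .*password\b", re.I),
     "Use the 'Forgot password' link on the login page."),
]
FALLBACK = "Sorry, I did not understand. Could you rephrase?"


def reply(message):
    """Return the answer of the first matching rule, or a fallback."""
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    return FALLBACK


print(reply("How do I reset my password?"))
```

The fallback branch shows the main limitation of this style: any input outside the hand-written patterns goes unanswered.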

User Interface and Data Visualization

User Interface and Data Visualization Technology. Data processing is an act of creating value by visualizing structures, allowing multiple perspectives and interpretations. Visualizing it requires an ingenious user interface. Here, we present examples of diverse UIs based on papers from conferences such as ISWC.

Workflow & Services

Workflow & Service Technology. A compilation of papers on service platforms, workflow analysis, and real-world business applications presented at various conferences, describing Semantic Web-based service platforms in business domains such as healthcare, law, manufacturing, and science.

Stream Data Technology

Machine learning and system architecture for data streams (time series data). Modern society is full of dynamic data, with huge amounts of data being generated in factories, plants, transportation, economics, social networks, and other areas. For example, factory sensors, mobile data, and social networks make tens of thousands of observations per minute, requiring real-time data analysis in a variety of use cases. Specifically, solutions are needed for concrete problems such as predicting turbine failures, locating public transportation, and tracking people’s discussions. This section describes real-time distributed processing frameworks for handling such stream data, machine learning for time-series data, and examples of smart city and Industry 4.0 applications that make use of them.
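
A basic idea behind stream processing is to update a result incrementally as each observation arrives, rather than storing the full history. The sketch below computes a sliding-window average this way; the function name and window size are assumptions for illustration and do not correspond to any particular distributed framework.

```python
from collections import deque


def moving_average(stream, window):
    """Yield the mean of the most recent `window` values for each new value."""
    buf = deque(maxlen=window)
    total = 0.0
    for value in stream:
        if len(buf) == window:
            total -= buf[0]   # the oldest value is about to be dropped
        buf.append(value)
        total += value
        yield total / len(buf)


readings = [1, 2, 3, 4, 5]   # stands in for an unbounded sensor stream
averages = list(moving_average(readings, window=3))
print(averages)
```

Because the function is a generator, it works equally well on an endless iterator of sensor readings.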

Digitization (data conversion) of unstructured information

Natural Language Processing Technology

Natural Language Processing Technology. Language is a means of communication between humans and is something that people naturally acquire, but it is very difficult for computers to handle. Natural language processing is a field of research in which computers are used to handle that language. In the early days, natural language processing was based on rules, but since the late 1990s, statistical methods using actual language data have become the mainstream. Here, we review the philosophical, linguistic, and mathematical aspects of natural language, describe natural language processing techniques in general, language similarities, and tools and programming implementations for use on computers, and describe how to apply them to actual tasks.
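
As one simple example of the statistical approach and of measuring language similarity, the sketch below represents each text as a bag of words and compares two texts by cosine similarity. Whitespace tokenization and the function names are simplifying assumptions; real pipelines typically add morphological analysis, stop-word removal, and weighting such as TF-IDF.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lowercased, whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sim = cosine_similarity(bag_of_words("the cat sat"), bag_of_words("the cat ran"))
print(f"{sim:.3f}")
```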

Image Processing Technology

Image Information Processing Technology. With the development of modern Internet technology and smartphones, a vast number of images exist on the Web, and image recognition technology is needed to create new value from them. This technology requires expertise in image-specific constraints, pattern recognition, machine learning, and further applications, and covers a wide range of areas. In addition, the success of deep learning has led to a rapid increase in research on image recognition, making it difficult to grasp the whole picture. Here, we describe the theory and algorithms of image information processing techniques, the practice of deep learning using Python/Keras, and approaches using sparse and stochastic generative models.

Speech Recognition

Speech Recognition Technology. Machine learning techniques play an important role in signal processing, especially when applied to sensor data and speech signals, which are one-dimensional data that vary over time. Various machine learning techniques, including deep learning, are used for speech signal recognition. In this section, we discuss the mechanisms of natural language and speech, and speech recognition techniques based on A/D conversion, the Fourier transform, dynamic programming (DP), hidden Markov models (HMMs), and deep learning, together with their application to speaker adaptation, speaker recognition, and noise-robust speech recognition.
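
Dynamic programming is classically applied to speech in the form of DP matching (dynamic time warping), which aligns two sequences spoken at different speeds. The sketch below is a minimal version operating on one-dimensional values; the function name and the absolute-difference local cost are assumptions, and real systems compare feature vectors such as spectral features.

```python
def dtw_distance(a, b):
    """Minimum cumulative alignment cost between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# The second sequence is the first one with a lengthened middle segment.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```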

Geospatial Information Processing

Geospatial information processing technology. Geospatial information refers to location information or information linked to location, and it is estimated that 80% of the information handled by government is related to location information. By utilizing location information, it is possible to plot information on a map to determine its distribution, guide people to their destinations using GPS data, and track their trajectories. This information can also be used to provide services based on past, present, and future events. Utilization of geospatial information will advance scientific discovery, business development, and the solution of social problems. Specific uses of this information are described in combination with QGIS, R, machine learning tools, and Bayesian models.
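
As a small example of working with location information, the sketch below computes the great-circle distance between two latitude/longitude points with the haversine formula and picks the nearest named point. The Earth is approximated as a sphere with a mean radius of 6371 km, and the place names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(origin, points):
    """Return the name of the point closest to origin (lat, lon)."""
    return min(points, key=lambda name: haversine_km(*origin, *points[name]))


places = {"station": (0.0, 1.0), "depot": (0.0, 3.0)}  # hypothetical coordinates
print(nearest((0.0, 0.0), places))
```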

Sensor Data and IoT

Sensor Data & IoT Technology. The use of sensor information is central to IoT technology and targets time-varying, one-dimensional information. IoT approaches include placing sensors on specific objects for detailed analysis and using multiple sensors for anomaly detection. The areas discussed here include IoT standards (e.g., WoT), statistical processing of time series data, hidden Markov models, sensor placement optimization through submodular optimization, hardware control such as BLE, and smart cities.
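
Sensor placement can be framed as maximizing a coverage function, which is submodular, and a greedy algorithm is a standard approximate solver for that setting. The sketch below is illustrative only: the candidate sites, the areas each site would cover, and the function name are hypothetical.

```python
def greedy_placement(coverage, k):
    """Pick up to k sites, each time adding the one that covers the most new areas."""
    chosen, covered = [], set()
    for _ in range(k):
        candidates = [site for site in coverage if site not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda site: len(coverage[site] - covered))
        if not coverage[best] - covered:
            break  # no remaining site adds any coverage
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


# Hypothetical candidate sites and the areas a sensor there would observe.
sites = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
chosen, covered = greedy_placement(sites, k=2)
print(chosen, sorted(covered))
```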

Anomaly detection and change detection

Anomaly detection and change detection techniques. Anomaly detection based on machine learning is a technique for detecting observations that deviate from normal conditions, while change detection is a technique for detecting changes in conditions. They are used to detect anomalous behavior such as manufacturing line failures, network attacks, and fraudulent financial transactions. Techniques for anomaly and change detection include Hotelling’s T² method, Bayesian methods, neighborhood methods, mixture distribution models, support vector machines, Gaussian process regression, and sparse structure learning, and these approaches are described.
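
For a single variable, Hotelling's T² method reduces to scoring each observation by its squared deviation from the mean of normal data, scaled by the variance. The sketch below assumes roughly Gaussian normal data and uses a threshold of 6.63, the approximate 99% quantile of the chi-squared distribution with one degree of freedom; the sample readings and function names are hypothetical.

```python
import statistics

THRESHOLD = 6.63  # approx. 99% quantile of chi-squared with 1 degree of freedom


def fit(normal_data):
    """Estimate mean and (population) variance from data assumed to be normal."""
    mu = statistics.fmean(normal_data)
    var = statistics.pvariance(normal_data, mu)
    return mu, var


def anomaly_score(x, mu, var):
    """Squared deviation from the mean in units of variance."""
    return (x - mu) ** 2 / var


# Hypothetical readings recorded under normal operation.
normal = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
mu, var = fit(normal)
for x in (10.1, 11.0):
    score = anomaly_score(x, mu, var)
    print(x, round(score, 2), "anomaly" if score > THRESHOLD else "normal")
```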

Linking digitized data with knowledge information

Semantic Web Technology

Semantic Web Technology. Semantic Web technology is a project to evolve the WWW from a “web of documents” to a “web of data” by developing standards and tools for handling the meaning of web pages, and to utilize that data for DX and AI tasks. In this section, we discuss Semantic Web technologies, ontology technologies, and papers presented at the ISWC (International Semantic Web Conference).

Knowledge Data and its Utilization

Knowledge Information Processing Technology. How to handle knowledge as information is a central issue in artificial intelligence technology, and various methods have been examined since the invention of the computer. This section discusses how to convert knowledge from natural language into computer-processable information, and describes the definition of knowledge, Semantic Web technologies and ontologies, predicate logic based on mathematical logic, logic programming using Prolog, and answer set programming and its applications.
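
To give a feel for rule-based inference over facts, the sketch below applies one rule, which in Prolog would read `grandparent(X, Z) :- parent(X, Y), parent(Y, Z).`, to a small set of facts. It is a hand-written illustration in Python rather than a general logic engine, and the facts and function name are hypothetical.

```python
def infer_grandparents(facts):
    """Derive grandparent facts by joining parent facts on the shared middle person."""
    parents = {(x, y) for (pred, x, y) in facts if pred == "parent"}
    return {("grandparent", x, z)
            for (x, y1) in parents
            for (y2, z) in parents
            if y1 == y2}


facts = {
    ("parent", "alice", "bob"),
    ("parent", "bob", "carol"),
}
derived = infer_grandparents(facts)
print(derived)
```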

Ontology Technology

Ontology Technology. An ontology is a formal model for systematizing and structuring data in information science, representing concepts, things, attributes, and relationships in a shareable format. This improves data consistency and interoperability in information retrieval, database design, knowledge management, natural language processing, artificial intelligence, and the Semantic Web. In addition, domain-specific ontologies, which are specific to a particular field, are also used for information sharing and integration. In this section, we discuss the use of ontologies from an information engineering perspective.
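
The idea of representing concepts and their relationships in a form a program can reason over can be sketched with a tiny class hierarchy. The classes, instances, and function names below are hypothetical, and the sketch assumes each class has at most one parent and that the hierarchy contains no cycles; ontology languages such as OWL are far more expressive.

```python
# Hypothetical subclass relations: child class -> parent class.
SUBCLASS_OF = {"Laptop": "Computer", "Computer": "Device", "Sensor": "Device"}
# Hypothetical instances: individual -> its most specific class.
INSTANCE_OF = {"asset-001": "Laptop", "asset-002": "Sensor"}


def ancestors(cls):
    """Return all superclasses of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(instance, cls):
    """True if the instance belongs to cls directly or through a superclass."""
    direct = INSTANCE_OF.get(instance)
    return direct is not None and (direct == cls or cls in ancestors(direct))


print(is_a("asset-001", "Device"))
```

Sharing such a hierarchy between systems is what allows a query for "Device" to return laptops and sensors without either system knowing about the other's categories.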