Epistemic uncertainty and AI complementation

Epistemic uncertainty

Epistemic uncertainty refers to uncertainty arising from a lack or incompleteness of knowledge or information. It is caused by an inadequate understanding of an event or system, and it can be reduced by acquiring more information or deepening existing knowledge.

Epistemic uncertainty is particularly pronounced in the following situations.

1. Insufficient data: epistemic uncertainty arises when there are too few observations, or too few experiments have been carried out, to predict outcomes reliably.

2. Model incompleteness: if a simulation or predictive model is incomplete, its predictions are uncertain, because the model does not fully capture all elements of the real world.

3. Assumption-based errors: where a model or theory is built on certain assumptions, uncertainty about predictions and inferences arises if those assumptions are incorrect.

4. Knowledge limitations: where knowledge of an issue is limited, the large number of unknowns makes the issue harder to understand.

Characteristics of epistemic uncertainty include the following

  • Reduced by updating knowledge: epistemic uncertainty can be reduced by gathering information and gaining a deeper understanding. Obtaining new data, or re-evaluating it against more sophisticated theories, reduces the uncertainty in predictions and inferences.
  • Changed by external factors: the uncertainty changes over time through information provided by others, technological advances, further experimentation, and so on.

The following approaches can be used to manage epistemic uncertainty.

1. Bayesian inference: a method of updating prior probability distributions into posterior predictions as knowledge advances. Taking new data into account reduces the uncertainty.

2. Data-driven approaches: epistemic uncertainty can be reduced by collecting more data and adjusting models accordingly. In machine learning, for example, more data improves prediction accuracy.

3. Probabilistic programming: using probabilistic models to represent uncertainty quantitatively and to improve knowledge through learning.

4. Simulation and Monte Carlo methods: where model predictions are uncertain, repeated simulations are used to assess the uncertainty and search for optimal results.

Epistemic uncertainty is often contrasted with risk (uncertainty about the outcome of a prediction). Risk is usually assessed within a range of probabilistic predictability, whereas epistemic uncertainty arises from limited knowledge and can be resolved by increasing knowledge, making the two fundamentally different.

Epistemic uncertainty affects the risk of decision-making. When information is incomplete, decision-makers have to make risky choices, but if epistemic uncertainty is reduced, decisions can be made with more certainty and the accuracy of forecasts and risk assessments can be improved.

For example, in climate change projections, epistemic uncertainty exists for outcomes based on different forecast models and scenarios. Scientists try to reduce this uncertainty by collecting data, but the climate system itself is so complex that models cannot fully capture all factors. Therefore, epistemic uncertainty remains.

How to use AI to address epistemic uncertainty

Possible approaches to addressing epistemic uncertainty with AI include the following.

1. Updating knowledge through Bayesian inference: Bayesian inference, as described in ‘Overview and various implementations of Bayesian inference’, is a very effective method for dealing with epistemic uncertainty. It combines prior knowledge (the prior probability) with newly obtained data (the likelihood) to compute posterior probabilities. Through this process, an AI can update its own knowledge based on the information obtained, improving the accuracy of its predictions and inferences.

  • Prior probability: how likely each hypothesis is, based on prior information or beliefs.
  • Likelihood: the probability of observing the new data under each hypothesis.
  • Posterior probability: combines the prior probability and the likelihood to derive the most plausible hypothesis given the observed data.

This allows AI to reduce uncertainty and make more reliable predictions even from limited information.
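As a minimal illustration of this update, the following sketch uses a conjugate Beta-Bernoulli model (all numbers are invented for the example). The posterior both shifts toward the data and narrows, which is exactly the sense in which epistemic uncertainty shrinks as information accumulates:

```python
# Hypothetical example: estimating a coin's probability of heads.
# Prior: Beta(2, 2), a weak belief that the coin is roughly fair.
alpha, beta = 2.0, 2.0

# Newly observed data: 8 heads and 2 tails in 10 flips.
heads, tails = 8, 2

# Conjugate Bayesian update: the posterior is again a Beta distribution.
alpha_post = alpha + heads
beta_post = beta + tails

# Posterior mean moves toward the observed frequency (0.8).
posterior_mean = alpha_post / (alpha_post + beta_post)  # = 10/14 ≈ 0.714

# Posterior variance shrinks as data accumulate: less epistemic uncertainty.
total = alpha_post + beta_post
posterior_var = (alpha_post * beta_post) / (total**2 * (total + 1))

print(posterior_mean, posterior_var)
```

The same pattern (prior plus likelihood gives a tighter posterior) underlies the normal-distribution example in the implementation section below.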

2. Probabilistic programming and uncertainty modelling: probabilistic programming, described in ‘Probabilistic Programming with Clojure’, is a technique for modelling uncertainty using probability theory. The problem the AI has to solve is expressed probabilistically, and probabilistic reasoning is performed over unknown parameters and variables. This allows the AI to evaluate candidate hypotheses and search for the most likely solution even when information is incomplete.

Examples include the use of probabilistic programming languages (such as Pyro and Stan) to model epistemic uncertainty and update hypotheses based on observed data.
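The core computation these languages automate (conditioning a prior on observed data) can be sketched by hand with a simple grid approximation; the prior, noise level, and data below are purely illustrative:

```python
import numpy as np
from scipy.stats import norm

# Candidate values for an unknown mean parameter mu.
grid = np.linspace(-5, 5, 2001)

# Prior belief about mu: broad, centred at 0.
prior = norm.pdf(grid, loc=0.0, scale=2.0)

# Observations, assumed to have known noise standard deviation 1.
data = np.array([1.2, 0.8, 1.5])

# Likelihood of the data at every grid point (product over observations).
likelihood = np.prod(norm.pdf(data[:, None], loc=grid, scale=1.0), axis=0)

# Bayes' rule on the grid: posterior ∝ prior × likelihood, then normalise.
unnorm = prior * likelihood
posterior = unnorm / unnorm.sum()

post_mean = (grid * posterior).sum()
print(post_mean)  # close to the analytic value 3.5 / 3.25 ≈ 1.077
```

A PPL such as Pyro or Stan performs this same conditioning automatically, for models far too large for an explicit grid.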

3. Reinforcement learning and exploration: Reinforcement Learning (RL), discussed in ‘Why do we need reinforcement learning? Application Examples, Technical Challenges and Solution Approaches’, is an effective approach for dealing with epistemic uncertainty, in which agents learn by interacting with their environment. The agent learns the best behaviour for an unknown environment, gathering information while balancing exploration and exploitation. Initially the agent lacks knowledge about the environment, but it gradually learns the optimal policy through trial and error.

  • Exploration: trying out unknown behaviours in order to obtain new information.
  • Exploitation: choosing the best course of action based on the information already obtained.

AI reduces uncertainty by gaining new knowledge while exploring the unknown.
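The exploration/exploitation balance above can be sketched with an epsilon-greedy agent on a toy multi-armed bandit; the payoff probabilities are invented for the example:

```python
import random

random.seed(0)
true_means = [0.2, 0.5, 0.8]   # true payoff rates, unknown to the agent
counts = [0, 0, 0]             # pulls per arm
estimates = [0.0, 0.0, 0.0]    # the agent's current value estimates
epsilon = 0.1                  # probability of exploring

for step in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(3)              # explore: try any arm
    else:
        arm = estimates.index(max(estimates))  # exploit: best arm so far
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    # Incremental mean update of the chosen arm's estimate.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(estimates)  # estimates converge toward the true payoff rates
```

Early on, the estimates are poor (high epistemic uncertainty); the occasional exploration steps supply the information that lets the agent correct them.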

4. Ensemble learning: ensemble learning, described in ‘Overview of ensemble learning and examples of algorithms and implementations’, is a method for improving prediction accuracy by combining multiple models, and it is also an effective approach to reducing epistemic uncertainty. Because the individual models are trained under different perspectives and assumptions, an ensemble makes their predictions complementary and improves the accuracy of the overall prediction.

  • Bagging: training multiple models on bootstrap resamples of the dataset and averaging their predictions.
  • Boosting: training multiple weak learners in sequence, with each model focusing on the data the previous model predicted incorrectly.

This complements the epistemic uncertainty of each model and enables more accurate predictions.
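A minimal sketch of the bagging idea, using means of bootstrap resamples as stand-in "models" (the data are synthetic): averaging many models trained on resampled data gives an estimate with less variance than any single one.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=3.0, scale=2.0, size=200)  # synthetic training data

n_models = 50
predictions = []
for _ in range(n_models):
    # Each "model" is fit on a bootstrap resample of the data;
    # here the model is simply the sample mean.
    sample = rng.choice(data, size=len(data), replace=True)
    predictions.append(sample.mean())

single_model = predictions[0]
ensemble = float(np.mean(predictions))

# The ensemble average hugs the full-data estimate more tightly
# than a typical individual model does.
print(abs(single_model - data.mean()), abs(ensemble - data.mean()))
```

Real bagging uses decision trees or other learners in place of the sample mean, but the variance-reduction mechanism is the same.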

5. Transfer learning and pre-trained models: transfer learning, described in ‘Overview of transfer learning and examples of algorithms and implementations’, is a method of applying existing knowledge or pre-trained models to a new task. The technique is particularly useful when only limited data is available or when adapting to a new task: AI can reduce epistemic uncertainty and improve performance on new tasks by transferring knowledge learned on other problems.

  • Pre-training: training models on large datasets in advance and applying that knowledge to new tasks.
  • Fine-tuning: adjusting a pre-trained model to suit a specific task.

This enables high performance on new tasks while reducing epistemic uncertainty.

6. Expert systems and knowledge bases: expert systems, described in ‘Rule-based and knowledge-based and expert systems and relational data’, are AI systems that incorporate expert knowledge in a specific field, and they offer another approach to reducing epistemic uncertainty. They provide reasoning grounded in expert knowledge to support decision-making in uncertain situations. A knowledge-based reasoning engine uses the available knowledge to draw the best possible conclusions even when information is lacking.
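A knowledge-based reasoning engine of this kind can be sketched as a minimal forward-chaining rule system; the rules and facts below are invented purely for illustration:

```python
# Each rule: (set of required facts, conclusion to add).
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "short_of_breath"}, "see_doctor"),
]

def infer(initial_facts):
    """Apply the rules repeatedly until no new conclusions are derived."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer({"fever", "cough", "short_of_breath"}))
```

Chaining lets the engine reach conclusions ("see_doctor") that no single rule states directly, using only the knowledge encoded in the rule base.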

7. Uncertainty quantification and risk assessment: AI can reduce epistemic uncertainty by quantifying the level of confidence in its predictions, as described for example in ‘Overview of causal inference using Meta-Learners, algorithms and implementation examples’. By setting confidence intervals on forecast results and assessing risk, it is possible to indicate which predictions can be trusted, quantifying the uncertainty so that it can be used in decision-making.
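One simple and widely used way to attach a confidence interval to an estimate is the bootstrap percentile method, sketched here on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic quantity of interest, e.g. a model's residual errors.
errors = rng.normal(loc=0.0, scale=1.5, size=100)

# Resample the data many times and recompute the statistic each time.
boot_means = np.array([
    rng.choice(errors, size=len(errors), replace=True).mean()
    for _ in range(2000)
])

# 95% percentile interval: a quantified statement of confidence.
low, high = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {errors.mean():.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

A narrow interval signals that the estimate can be trusted; a wide one flags remaining epistemic uncertainty that more data could reduce.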

Epistemic uncertainty can be managed effectively with AI techniques, in particular Bayesian inference, probabilistic programming and reinforcement learning, which update knowledge and reduce uncertainty even from limited information. In combination, these techniques allow AI to make more accurate predictions and decisions, helping to reduce epistemic uncertainty.

Implementation examples

The following Bayesian inference and Monte Carlo approaches are presented as examples of implementations for managing epistemic uncertainty. These methods can be used to reduce epistemic uncertainty and improve the accuracy of forecasts.

1. Example implementation using Bayesian inference: Bayesian inference updates the posterior distribution by combining the prior distribution with new observed data (the likelihood). This makes it possible to see how predictions improve as knowledge accumulates.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Initial prior distribution over the unknown mean
mu_prior = 0  # prior mean
sigma_prior = 1  # prior standard deviation
prior = norm(mu_prior, sigma_prior)

# Observed data (likelihood)
observed_data = [2.3, 2.7, 3.1, 2.8, 2.6]
n = len(observed_data)
mu_likelihood = np.mean(observed_data)  # sample mean of the observations
sigma_likelihood = np.std(observed_data)  # sample std, treated as the known noise level

# Conjugate update for a normal mean with known noise:
# precisions (inverse variances) add, and the posterior mean is a
# precision-weighted average of the prior mean and the sample mean
posterior_precision = 1 / sigma_prior**2 + n / sigma_likelihood**2
posterior_mu = (mu_prior / sigma_prior**2 + n * mu_likelihood / sigma_likelihood**2) / posterior_precision
posterior_sigma = np.sqrt(1 / posterior_precision)

# Plot prior, observed data, and posterior distributions
x = np.linspace(-5, 5, 1000)
prior_pdf = prior.pdf(x)
posterior_pdf = norm(posterior_mu, posterior_sigma).pdf(x)

plt.plot(x, prior_pdf, label="Prior", linestyle="--")
plt.plot(x, posterior_pdf, label="Posterior")
plt.title("Bayesian Update of Distribution")
plt.xlabel("Parameter Value")
plt.ylabel("Probability Density")
plt.legend()
plt.show()

print(f"Prior Mean: {mu_prior}, Prior Sigma: {sigma_prior}")
print(f"Posterior Mean: {posterior_mu}, Posterior Sigma: {posterior_sigma}")

This code uses Bayesian inference to compute the prior distribution and the posterior distribution based on the observed data, and compares them. It simulates how updating the posterior distribution reduces epistemic uncertainty.

2. Monte Carlo simulation: a method that uses Monte Carlo sampling to assess epistemic uncertainty as model uncertainty. Random samples are generated and simulated to quantify the uncertainty in the predictions.

import numpy as np
import matplotlib.pyplot as plt

# Model parameter settings
true_parameter = 2.5  # true parameter
sigma = 0.5  # standard deviation (model uncertainty)

# Monte Carlo simulation
num_simulations = 1000
simulated_parameters = np.random.normal(true_parameter, sigma, num_simulations)

# Plot a histogram of the simulation results
plt.hist(simulated_parameters, bins=30, density=True, alpha=0.6, color='g')

# Plot the true parameter
plt.axvline(true_parameter, color='r', linestyle='dashed', linewidth=2)

plt.title("Monte Carlo Simulation of Parameter")
plt.xlabel("Parameter Value")
plt.ylabel("Density")
plt.show()

# Mean and standard deviation obtained from the simulation
simulated_mean = np.mean(simulated_parameters)
simulated_std = np.std(simulated_parameters)

print(f"Simulated Mean: {simulated_mean}")
print(f"Simulated Standard Deviation: {simulated_std}")

The code uses Monte Carlo sampling to simulate parameter uncertainty and plots its distribution. The simulation makes the uncertainty easier to grasp intuitively.

Conclusion

  • Bayesian inference is a powerful tool for reducing epistemic uncertainty, allowing posterior distributions to be updated by adding data.
  • Monte Carlo methods are used to simulate uncertainty in a model or system and to assess its distribution.

Application examples

Specific applications of AI technologies to reduce epistemic uncertainty include the following.

1. Climate change prediction

  • Problem: Climate change models are complex and involve many variables. This leads to a high level of epistemic uncertainty in climate predictions, and the accuracy of prediction models is highly dependent on the assumptions and initial conditions incorporated into the model, making it important to reduce this uncertainty.
  • Solution: use Bayesian inference to update the posterior distribution for climate models. Use observational data (e.g. temperature, precipitation, CO2 concentrations) to reduce model uncertainty and produce more accurate predictions. Each time new climate data is available, models are recalibrated to reduce epistemic uncertainties and more accurately predict future climate scenarios.
  • Specific example: the Intergovernmental Panel on Climate Change (IPCC) climate models use Bayesian inference to evaluate multiple models and scenarios and quantify the uncertainty in the projections through posterior distributions. This increases the level of confidence regarding forecast outcomes and enables policymakers to make risk-based decisions.

2. Obstacle recognition in self-driving vehicles

  • Problem: Self-driving vehicles need to be aware of their surroundings in real-time, but the data from cameras and sensors contains noise and errors, resulting in epistemic uncertainty in obstacle recognition.
  • Solution: Use Monte Carlo methods to simulate the uncertainty in the recognition results from sensors. Automated vehicles simulate multiple scenarios when predicting the position and movement of obstacles and make the most appropriate decisions. Furthermore, reinforcement learning is used to improve recognition accuracy and simultaneously assess risk to manage epistemic uncertainty.
  • Specific example: Waymo (Google’s self-driving division) combines data from multiple sensors to recognise obstacles. Using Monte Carlo methods, sensor errors are taken into account and a probabilistic assessment of the obstacle’s position is made, which allows the vehicle to be more accurately aware of its surroundings and to drive safely.

3. Medical diagnosis support systems

  • Problem: Diagnosis in the medical field is based on patient symptoms and test results, but epistemic uncertainty arises, especially when symptoms are not clear or data are incomplete.
  • Solution: use Bayesian inference to build probabilistic models of medical diagnosis. For example, the probability of a disease is estimated based on symptoms and test results, and updated with each new test result. The predictive models are evaluated to help quantify the uncertainty in the data and suggest the most likely diagnosis.
  • Specific example: IBM Watson Health has developed a system that uses vast amounts of medical data to estimate diseases based on patients’ symptoms. It uses Bayesian inference to assess the epistemic uncertainty about the patient’s condition, suggests the most likely disease, and incorporates new test results to improve confidence in the diagnosis.

4. Quality control in manufacturing

  • Problem: In manufacturing, various processes are controlled to ensure product quality, but epistemic uncertainty arises due to variation and noise in the manufacturing process. This uncertainty affects the prediction of whether a product will meet specifications.
  • Solution: A Monte Carlo method is used to simulate the variability in the manufacturing process, to assess the uncertainty regarding the quality of the product, and to randomly vary several manufacturing parameters to probabilistically assess the final quality of the product. In this way, the quality control system finds the optimum manufacturing parameters to reduce epistemic uncertainty and makes predictions to guarantee quality.
  • Specific example: Bosch uses Monte Carlo methods in quality control on the production line to assess the final quality of the product, taking into account uncertainties in raw materials and manufacturing conditions. This yields optimal conditions that increase the probability of the product meeting specifications and improve manufacturing efficiency.

5. Risk assessment in the financial sector

  • Problem: In financial markets, forecasts for asset prices and market movements are subject to epistemic uncertainty, with market volatility and unknown risk factors potentially affecting the outcome.
  • Solution: risk assessment based on historical market data using Bayesian inference. Define a prior distribution of risk for the investment strategy and update the posterior distribution whenever market movements are predicted. In addition, scenario analysis is used to forecast risk for different market movements and optimise risk management strategies.
  • Specific example: Goldman Sachs uses Bayesian inference in risk management. It uses scenario analysis to make forecasts for market fluctuations based on historical financial data and to assess their uncertainty in order to adjust its investment strategy.

Reference books

Reference books on epistemic uncertainty and AI technologies are listed below.

1. ‘Bayesian Reasoning and Machine Learning’
– Author: David Barber
– Abstract: A comprehensive textbook on Bayesian reasoning and machine learning. It is useful for understanding the use of Bayesian reasoning in managing epistemic uncertainty and provides a foundation for tackling complex problems with a probabilistic approach.
– Related areas: bayesian inference, machine learning, probability theory.

2. ‘Probabilistic Graphical Models’
– Author(s): Daphne Koller, Nir Friedman
– Abstract: A comprehensive text on probabilistic graphical models. It teaches how to model complex dependencies in dealing with epistemic uncertainty. In particular, techniques for handling uncertainty using Bayesian networks are detailed.
– Related areas: probability theory, Bayesian networks, machine learning.

3. ‘Monte Carlo Methods in Financial Engineering’
– Author: Paul Glasserman
– Abstract: Describes the application of Monte Carlo methods in financial engineering. It is rich in content related to uncertainty management in risk assessment and scenario analysis, and touches on simulation techniques to reduce epistemic uncertainty in financial markets.
– Relevant areas: Monte Carlo methods, financial engineering, risk management.

4. ‘The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation’
– Author: Christian P. Robert
– Abstract: The book focuses on the theory of Bayesian inference and its implementation methods. It details how to utilise Bayesian approaches to quantify epistemic uncertainty.
– Related areas: Bayesian inference, decision theory, statistics.

5. ‘Machine Learning: A Probabilistic Perspective’
– Author: Kevin P. Murphy
– Abstract: A comprehensive textbook on probabilistic approaches to machine learning. It provides an overview of the importance of the probabilistic perspective in dealing with epistemic uncertainty and building predictive models.
– Related areas: machine learning, Bayesian inference, probability theory

6. ‘Uncertainty: The Life and Science of Werner Heisenberg’
– Author: David C. Cassidy
– Abstract: This biography on Heisenberg’s uncertainty principle delves into the concept of uncertainty in science and philosophy. It is helpful for understanding the scientific and philosophical background of epistemic uncertainty.
– Related areas: uncertainty, physics, philosophy of science.

7. ‘An Introduction to Computational Learning Theory’
– Author(s): Michael J. Kearns, Umesh Vazirani
– Abstract: This book is designed to provide an understanding of the fundamentals of computational theory and machine learning. It provides an introduction to the theory of learning algorithms and how they deal with epistemic uncertainty.
– Related areas: theory of computation, machine learning, algorithms.

8. ‘Decision Theory: Principles and Approaches’
– Author(s): Giovanni Parmigiani, Lurdes Inoue
– Abstract: A comprehensive guide to decision theory. It introduces a decision-making approach to managing epistemic uncertainty. Focuses on how to take a probabilistic perspective on decision-making.
– Related areas: decision theory, probability theory.

9. ‘Introduction to Stochastic Processes with R’
– Author: Robert P. Dobrow
– Abstract: An introduction to stochastic processes. It provides an understanding of epistemic uncertainty through probability theory and simulation, and how to apply it to real-world problems.
– Related areas: stochastic processes, simulation, probability theory
