Aleatory uncertainty and AI-based solutions


Aleatory Uncertainty

Aleatory uncertainty refers mainly to uncertainty caused by natural phenomena and stochastic fluctuations. It is inherently random and cannot be controlled, and it is usually expressed with probabilistic models. Typical examples are weather conditions and the roll of a die.

Aleatory (also called aleatoric) uncertainty is described below.

1. Characteristics

Based on randomness: aleatoric uncertainty cannot be completely eliminated by repeated observations and experiments, as fluctuations occur naturally.
Modellable by probability distributions: this uncertainty can be quantified using probability distributions (e.g. normal, Poisson) and statistical methods.
Objective nature: aleatoric uncertainty is objective and independent of the observer’s knowledge and information content.

2. Examples

Weather: the possibility of sudden rain or wind speed fluctuations even under constant conditions.
Dice or coin tossing: each side is equally likely to come up, but the outcome of any single throw is random.
Earthquake occurrence: the long-run frequency of earthquakes can be described by a probability distribution, but the exact timing of any individual earthquake is unknown.

3. Difference from epistemic uncertainty: aleatoric uncertainty arises from random variability, whereas epistemic uncertainty stems from a lack of knowledge or an incomplete model. Epistemic uncertainty can be reduced by obtaining more information and data, whereas aleatoric uncertainty cannot be reduced by collecting more data.
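The distinction can be illustrated with a small simulation. The sketch below is only an illustration under simple assumptions (a fair six-sided die, NumPy's default random generator, and a hypothetical helper named summarize): as the number of rolls grows, the uncertainty about the mean (the epistemic part) shrinks, while the spread of individual outcomes (the aleatoric part) stays at about 1.71.

```python
import numpy as np

rng = np.random.default_rng(0)

def summarize(n_rolls):
    """Roll a fair six-sided die n_rolls times and summarize the sample."""
    rolls = rng.integers(1, 7, size=n_rolls)   # integer outcomes 1..6
    spread = rolls.std(ddof=1)                 # variability of single outcomes (aleatoric)
    std_error = spread / np.sqrt(n_rolls)      # uncertainty about the mean (epistemic)
    return rolls.mean(), spread, std_error

results = {n: summarize(n) for n in (100, 10_000, 1_000_000)}
for n, (mean, spread, std_error) in results.items():
    print(f"n={n:>9,}  mean={mean:.3f}  spread={spread:.3f}  std_error={std_error:.4f}")
```

More data makes the estimate of the mean more precise, but it does not make the next roll any more predictable.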

4. Use in engineering and risk analysis: aleatoric uncertainty is an important concept in reliability engineering and risk analysis. For example, probabilistic behaviour is modelled with simulation and Monte Carlo methods, and the anticipated variability is then absorbed by establishing a safety margin.
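As a rough sketch of this idea, the following example estimates a failure probability by Monte Carlo sampling. All numbers are hypothetical: a component whose strength and load are both assumed to be normally distributed (in kN). Because both are normal, a closed-form value is available and is computed as a sanity check.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n_samples = 200_000

# Hypothetical component: strength and load both fluctuate randomly (kN).
strength = rng.normal(50.0, 4.0, n_samples)
load = rng.normal(35.0, 5.0, n_samples)

# The safety margin is the gap between strength and load; failure means margin < 0.
margin = strength - load
p_failure_mc = float(np.mean(margin < 0.0))

# Closed-form check: the margin is normal with mean 15 and variance 4^2 + 5^2.
z = (50.0 - 35.0) / math.sqrt(4.0**2 + 5.0**2)
p_failure_exact = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

print(f"Monte Carlo estimate: {p_failure_mc:.4f}")
print(f"Closed-form value:    {p_failure_exact:.4f}")
```

In realistic problems the distributions are rarely this convenient, which is where the sampling approach becomes useful: the same code works for any distributions that can be sampled.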

5. How to cope: aleatory uncertainty cannot be completely eliminated and is therefore managed in the following ways:

Establishing risk tolerances: systems are designed so that risk stays within an acceptable range despite the uncertainty.
Scenario analysis: decisions are made taking into account various probabilistic scenarios.
Probabilistic approaches: design and assessment are carried out with the uncertainty explicitly taken into account.

Mathematical models for aleatory uncertainty and prediction in AI

Mathematical models for dealing with aleatoric uncertainty and AI-based forecasting methods are described below.

1. Mathematical models of aleatoric uncertainty: aleatoric uncertainty is expressed using probability distributions and statistical models. The following are general mathematical approaches.

(1) Probability distribution models: the variability of natural phenomena is modelled by a probability distribution. Common choices include the normal distribution, applied to continuous data that is summarised by a mean and a variance; the Poisson distribution, applied to counts of events that occur randomly within a fixed time interval (e.g. the number of earthquakes); and the beta and gamma distributions, which suit bounded or skewed positive data. For details, see ‘Various probability distributions used in stochastic generative models’.
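As a minimal sketch, the distributions named above can be sampled with NumPy and compared against their theoretical means. The parameter values are arbitrary and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Draw samples from each distribution (illustrative parameter values).
samples = {
    "normal": rng.normal(loc=10.0, scale=2.0, size=n),   # continuous, symmetric
    "poisson": rng.poisson(lam=3.0, size=n),             # event counts per interval
    "beta": rng.beta(a=2.0, b=5.0, size=n),              # values bounded in [0, 1]
    "gamma": rng.gamma(shape=2.0, scale=1.5, size=n),    # positive, right-skewed
}

# Theoretical means: mu, lam, a / (a + b), shape * scale.
theory = {"normal": 10.0, "poisson": 3.0, "beta": 2.0 / 7.0, "gamma": 3.0}

for name, values in samples.items():
    print(f"{name:8s} sample mean={values.mean():.3f}  theoretical mean={theory[name]:.3f}")
```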

(2) Monte Carlo simulation: simulating the effects of uncertainty using random sampling, e.g. forecasting wind power generation taking into account variations in weather data. For more information, see also ‘Overview and implementation of Markov chain Monte Carlo methods’.
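The wind power example can be sketched as follows. Everything specific here is an assumption made for illustration: wind speeds are drawn from a Weibull distribution (shape 2, scale 8 m/s), and turbine_power_kw is a hypothetical, simplified power curve with made-up cut-in, rated and cut-out speeds.

```python
import numpy as np

rng = np.random.default_rng(3)

def turbine_power_kw(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_kw=2000.0):
    """Simplified, hypothetical power curve: cubic ramp between cut-in and rated speed."""
    v = np.asarray(v, dtype=float)
    ramp = rated_kw * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    power = np.where((v >= cut_in) & (v < rated_speed), ramp, 0.0)
    power = np.where((v >= rated_speed) & (v <= cut_out), rated_kw, power)
    return power

# Random wind speeds (m/s): standard Weibull samples scaled by 8.
wind_speed = 8.0 * rng.weibull(2.0, size=100_000)

# Push every sampled wind speed through the power curve.
power = turbine_power_kw(wind_speed)
mean_power = power.mean()
p5, p95 = np.percentile(power, [5, 95])

print(f"mean output: {mean_power:.0f} kW, 5%-95% range: {p5:.0f} to {p95:.0f} kW")
```

The output is a distribution of generated power, so questions such as "how often does output fall below a given level" can be answered directly from the samples.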

(3) Stochastic differential equations (SDEs): differential equations that include a random term, e.g. the geometric Brownian motion that underlies the Black-Scholes model of stock price fluctuations. See also ‘Financial Engineering, Black-Scholes Models and Artificial Intelligence Technology’ for more information.
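A minimal sketch of simulating such an equation is shown below, using the Euler-Maruyama scheme for geometric Brownian motion, dS = mu S dt + sigma S dW. The function name simulate_gbm and all parameter values (start price, drift, volatility, 252 steps per year) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_gbm(s0, mu, sigma, horizon, n_steps, n_paths):
    """Euler-Maruyama simulation of dS = mu*S*dt + sigma*S*dW."""
    dt = horizon / n_steps
    paths = np.empty((n_paths, n_steps + 1))
    paths[:, 0] = s0
    for k in range(n_steps):
        # Brownian increments have mean 0 and standard deviation sqrt(dt).
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        paths[:, k + 1] = paths[:, k] + mu * paths[:, k] * dt + sigma * paths[:, k] * dw
    return paths

paths = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, n_steps=252, n_paths=20_000)
terminal = paths[:, -1]

print(f"mean terminal price: {terminal.mean():.2f}")
print(f"5%-95% range: {np.percentile(terminal, 5):.2f} to {np.percentile(terminal, 95):.2f}")
```

The mean of the terminal prices should be close to s0 * exp(mu * horizon), while the spread across paths reflects the random term.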

(4) Bayesian statistics: updating probability distributions based on observed data, e.g. Bayesian updating of earthquake rates. See also ‘Overview and various implementations of Bayesian estimation’ for more information.
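Bayesian updating of an event rate can be sketched with the conjugate Gamma-Poisson model. The prior parameters and the yearly counts below are hypothetical values chosen for illustration, not real earthquake data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Prior belief about the yearly event rate: Gamma(shape, rate).
prior_shape, prior_rate = 2.0, 1.0

# Hypothetical number of events observed in each of 8 years.
yearly_counts = np.array([3, 1, 4, 2, 0, 5, 2, 3])

# Conjugate update: add the total count to the shape and the number of years to the rate.
post_shape = prior_shape + yearly_counts.sum()
post_rate = prior_rate + len(yearly_counts)
post_mean = post_shape / post_rate

# Approximate 95% credible interval from posterior samples (NumPy uses scale = 1 / rate).
posterior_samples = rng.gamma(shape=post_shape, scale=1.0 / post_rate, size=100_000)
lower, upper = np.percentile(posterior_samples, [2.5, 97.5])

print(f"posterior mean rate: {post_mean:.3f} events/year")
print(f"95% credible interval: {lower:.3f} to {upper:.3f}")
```

The credible interval narrows as more years are observed, but the year-to-year scatter of the counts themselves remains, since it is governed by the Poisson distribution.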

2. Aleatory uncertainty forecasting methods using AI: AI models are well suited to forecasting from large amounts of data while taking uncertainty into account. The main methods are listed below.

(1) Deep learning: forecasting time-series data (e.g. weather variability) using recurrent neural networks (RNNs) and LSTMs. Generative models such as GANs can also be used to generate new data with randomness in order to assess the range of uncertainty, for example by simulating the probability distribution of dice rolls. For more information, see also ‘About deep learning’.

(2) Gaussian Process Regression (GPR): smooth prediction of continuous data with uncertainty, providing predictions and confidence intervals as output. Examples include modelling uncertainty in wind speed forecasts. See also ‘GPy – A Python-based framework for Gaussian processes’ for more information.

(3) Ensemble learning: combining multiple models to improve the reliability of forecasts, including random forests and gradient boosting. For more information, see ‘Ensemble Learning: Overview, Algorithms and Examples of Implementations’.
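One way an ensemble can express the noise in the data, sketched here as an assumption rather than as the method of the referenced article, is to fit gradient boosting models with a quantile loss. The example uses scikit-learn's GradientBoostingRegressor on synthetic data (sine wave plus noise) and builds a 90% prediction band from the 5% and 95% quantile models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)

# Synthetic data: smooth signal plus random noise (illustrative only).
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, size=500)

# One boosted ensemble per quantile; alpha is the target quantile.
models = {
    q: GradientBoostingRegressor(
        loss="quantile", alpha=q, n_estimators=200, max_depth=3, random_state=0
    ).fit(X, y)
    for q in (0.05, 0.5, 0.95)
}

X_new = np.linspace(0, 10, 50).reshape(-1, 1)
lower = models[0.05].predict(X_new)
median = models[0.5].predict(X_new)
upper = models[0.95].predict(X_new)

# Fraction of training points inside the 5%-95% band (nominally about 90%).
coverage = np.mean((y >= models[0.05].predict(X)) & (y <= models[0.95].predict(X)))
print(f"training coverage of the 90% band: {coverage:.2f}")
```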

(4) Bayesian deep learning: introduces probabilistic inference into deep learning models, e.g. by adding uncertainty (predictive distribution) to the output. For more information, see also ‘Overview of Bayesian deep learning, application examples and implementation examples’.
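A common way to read such a predictive distribution is to split its variance into an aleatoric and an epistemic part. The sketch below does not train a network; the arrays pred_means and pred_vars are simulated stand-ins for the outputs of repeated stochastic forward passes (for example MC dropout or an ensemble), and all values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
n_passes, n_points = 50, 5

# Stand-in for network outputs: each pass predicts a mean and a noise variance per point.
pred_means = 2.0 + rng.normal(0.0, 0.3, size=(n_passes, n_points))
pred_vars = np.full((n_passes, n_points), 0.25)

# Aleatoric part: the average predicted noise variance.
aleatoric = pred_vars.mean(axis=0)
# Epistemic part: how much the predicted means disagree across passes.
epistemic = pred_means.var(axis=0)
# Total predictive variance is the sum of the two.
total = aleatoric + epistemic

for i in range(n_points):
    print(f"point {i}: aleatoric={aleatoric[i]:.3f}  epistemic={epistemic[i]:.3f}  total={total[i]:.3f}")
```

With more training data the disagreement between passes would typically shrink, whereas the predicted noise variance would not.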

(5) Reinforcement Learning: Learning to make optimal decisions while taking into account aleatory uncertainty. For more information, see ‘Why is reinforcement learning necessary? Application examples, technical challenges and solution approaches’.

The following points should be considered when dealing with aleatoric uncertainty:

Data quality: high quality and sufficient data are required to adequately model aleatory uncertainty.
Computational cost: Monte Carlo methods and Bayesian learning can be computationally expensive.
Interpretation of results: the probability distributions provided by AI models can be over-confident, so they should not be trusted blindly and should be used in conjunction with expert judgement.

Implementation example

The following scenarios are described as examples of implementations that take into account aleatoric uncertainty.

Prediction of wind speeds: predict future values of wind speeds, including randomness, using historical weather data.
Method used: Gaussian Process Regression (GPR) is used to visualise predictions and confidence intervals.

Example implementation in Python: Gaussian Process Regression.

Required libraries

pip install numpy scikit-learn matplotlib

Implementation code

import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# 1. data generation 
# Simulate historical wind speed data (e.g. sine wave + noise)
np.random.seed(42)
X = np.linspace(0, 10, 20).reshape(-1, 1)  # Historical observation time
y = np.sin(X).ravel() + np.random.normal(0, 0.2, X.shape[0])  # Wind speed data (m/s)

# New time points (times to be predicted)
X_pred = np.linspace(0, 10, 100).reshape(-1, 1)

# 2. definition of Gaussian process regression model 
# kernel = constant kernel × RBF kernel
kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10, alpha=0.1)

# 3. model learning
gp.fit(X, y)

# 4. predictions and calculation of confidence intervals
y_pred, sigma = gp.predict(X_pred, return_std=True)

# 5. plotting the results
plt.figure(figsize=(10, 6))
plt.plot(X, y, 'r.', markersize=10, label="observed data")
plt.plot(X_pred, y_pred, 'b-', label="predicted value")
plt.fill_between(
    X_pred.ravel(),
    y_pred - 1.96 * sigma,
    y_pred + 1.96 * sigma,
    alpha=0.2,
    color="blue",
    label="95% Confidence interval",
)
plt.title("Wind speed prediction by Gaussian process regression.")
plt.xlabel("Time (s)")
plt.ylabel("Wind speed data (m/s)")
plt.legend(loc="upper left")
plt.show()

Key implementation points

Input data:
- X is the observation time.
- y is the observed wind speed (including noise).
Kernel selection for the Gaussian process:
- RBF kernel: models smooth changes in wind speed.
- Constant kernel: scales the overall amplitude of the kernel.
Confidence intervals:
- 95% confidence intervals are drawn to reflect the uncertainty in the forecast.

Running results

Red dots: actual observed data.
Blue line: predicted wind speed.
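Blue shading: 95% confidence interval of the forecast. With a fixed alpha, this band describes the uncertainty about the underlying wind speed curve; the observation noise (the aleatoric part) enters the fit through alpha but is not added to the band.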
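In scikit-learn, the standard deviation returned by predict with a fixed alpha does not include the observation noise. One possible way to make the noise explicit, shown here as a sketch and not as part of the implementation above, is to add a WhiteKernel so that the noise variance is learned from the data and included in the predictive standard deviation. The data settings (40 points, noise standard deviation 0.2) are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C, WhiteKernel

rng = np.random.default_rng(42)
X = np.linspace(0, 10, 40).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, X.shape[0])  # true noise variance = 0.04

# Signal kernel plus a white-noise term whose level is fitted to the data.
kernel = C(1.0, (1e-3, 1e3)) * RBF(1.0, (1e-2, 1e2)) + WhiteKernel(
    noise_level=0.1, noise_level_bounds=(1e-5, 1e1)
)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5, random_state=0)
gp.fit(X, y)

# The fitted kernel is a sum; its second term (k2) is the WhiteKernel.
learned_noise_var = gp.kernel_.k2.noise_level

X_pred = np.linspace(0, 10, 100).reshape(-1, 1)
_, total_std = gp.predict(X_pred, return_std=True)

# Predictive variance = uncertainty about the curve + learned noise variance.
curve_var = np.clip(total_std**2 - learned_noise_var, 0.0, None)

print(f"learned noise variance: {learned_noise_var:.3f} (data generated with 0.04)")
print(f"mean curve variance:    {curve_var.mean():.4f}")
```

The learned noise variance is the part that would remain even with many more observations, while the curve variance would keep shrinking.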

Proposed extensions

Real-time data input: real-time collection of measured data to update the model.
Combined Monte Carlo simulation: random sampling of forecast results to further quantify uncertainty.
Comparison with other AI models: compare results with LSTMs and RNNs to assess the trade-off between accuracy and computational cost.
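The Monte Carlo extension can be sketched with the sample_y method of GaussianProcessRegressor, which draws whole trajectories from the fitted model. The data and kernel repeat the settings of the implementation above; the threshold value and the question asked of the samples are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Same synthetic data and kernel as in the implementation example.
np.random.seed(42)
X = np.linspace(0, 10, 20).reshape(-1, 1)
y = np.sin(X).ravel() + np.random.normal(0, 0.2, X.shape[0])
X_pred = np.linspace(0, 10, 100).reshape(-1, 1)

kernel = C(1.0, (1e-3, 1e3)) * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
gp = GaussianProcessRegressor(
    kernel=kernel, n_restarts_optimizer=10, alpha=0.1, random_state=0
).fit(X, y)

# Draw 500 possible wind speed curves; result has shape (n_points, n_samples).
samples = gp.sample_y(X_pred, n_samples=500, random_state=0)

# Example question: how often does the peak wind speed exceed a threshold?
threshold = 0.5  # hypothetical threshold (m/s)
peak = samples.max(axis=0)
p_exceed = np.mean(peak > threshold)

print(f"estimated probability that the peak exceeds {threshold} m/s: {p_exceed:.2f}")
```

Sampling whole curves makes it possible to answer questions about events over a time window, which pointwise confidence intervals alone cannot answer.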

Application examples

Aleatoric uncertainty is taken into account in areas such as the following.

1. Weather forecasting

Background: weather forecasting needs to take into account random variations in the atmosphere (aleatoric uncertainty), which have a significant influence on forecasts of wind speed, precipitation and temperature.
Applicable models: Gaussian process regression, Monte Carlo methods.
Examples: forecasting, for a wind farm, the probability that wind speeds will stay within a certain range; assessing the risk of flooding due to heavy rainfall using a probability distribution.
Result: predictions with confidence intervals allow appropriate preventive measures (e.g. issuing flood warnings, adjusting energy supply plans).

2. Earthquake risk assessment

Background: aleatoric uncertainty is involved in the timing and magnitude of earthquake events. Quantifying this is important for disaster prevention and urban design.
Applicable model: Poisson distribution + recurrent neural network (RNN).
Examples: modelling of earthquake frequency and magnitude as time series data. Risk assessment in the design of seismic standards for buildings.
Results: building design criteria and emergency response plans can be improved based on the data.
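As a small worked example of the Poisson part of such a model, the probability of at least one event within a time horizon is 1 - exp(-rate * years). The rate of 0.02 events per year and the 30-year horizon below are hypothetical values, not seismological data.

```python
import math

def prob_at_least_one(rate_per_year, years):
    """Probability of one or more events in the horizon under a Poisson model."""
    return 1.0 - math.exp(-rate_per_year * years)

def prob_exactly(k, rate_per_year, years):
    """Poisson probability of exactly k events in the horizon."""
    expected = rate_per_year * years
    return math.exp(-expected) * expected**k / math.factorial(k)

# Hypothetical rate: one event every 50 years on average, 30-year building lifetime.
p_30 = prob_at_least_one(0.02, 30)
print(f"P(at least one event in 30 years) = {p_30:.3f}")
print(f"P(exactly one event in 30 years)  = {prob_exactly(1, 0.02, 30):.3f}")
```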

3. Medical sector: prediction of patients’ medical conditions

Background: individual patient constitution and random factors influence disease progression and treatment efficacy.
Applicable models: Bayesian deep learning, ensemble learning.
Examples: predicting fluctuations in patients’ blood glucose and blood pressure levels, and predicting the risk of side-effects in drug treatment.
Results: supports personalised (precision) medicine and helps make treatment plans more effective.

4. Energy supply and demand forecasting

Background: Energy consumption is affected by weather conditions and people’s behaviour patterns, and therefore has a high degree of uncertainty.
Applicable models: LSTM, ensemble models.
Examples: forecasting fluctuations in summer electricity demand, adjusting supply plans and forecasting the amount of electricity generated from renewable energy sources (e.g. solar, wind).
Result: over- and under-supply is prevented and energy efficiency is improved.

5. Quality control in manufacturing

Background: In production lines, subtle differences in material properties and processing conditions affect product quality.
Applicable model: Gaussian process + Monte Carlo method.
Examples: prediction of defect rates in semiconductor manufacturing processes and evaluation of variations in the strength of automotive components using probability distributions.
Results: improved product reliability and reduced costs.

6. Forecasting price fluctuations in financial markets

Background: fluctuations in stock prices and exchange rates contain many random elements.
Applicable models: stochastic differential equations (SDE) + generative models (GAN).
Examples: predicting the range of stock price fluctuations, enhancing risk management and optimising hedging strategies in options trading.
Results: reduced investment risk and more efficient portfolio management.

7. Simulation of natural disasters

Background: The frequency and extent of impact of natural disasters such as typhoons and floods are subject to uncertainty.
Applicable models: Monte Carlo simulation + deep learning.
Examples: typhoon path prediction and impact assessment, visualisation of flood risks using geographic information systems (GIS).
Result: disaster management planning and risk reduction.

8. Logistics and supply chain management

Background: demand forecasts and transport routes involve many variables.
Applicable models: reinforcement learning + Bayesian statistics.
Examples: forecasting rapid changes in demand, optimising inventories and selecting the shortest possible delivery routes taking into account uncertainties.
Result: improved delivery efficiency and reduced costs.

Forecasting with aleatory uncertainty can be applied in many areas, enabling randomness to be quantified and supporting reliable decision-making.

Reference books

Reference books for dealing with aleatory uncertainty in mathematical models and AI are described below.

1. Fundamentals and mathematical models of aleatory uncertainty
Title of book: ‘Introduction to Uncertainty Quantification’
Author: T.J. Sullivan
Publisher: Springer
Abstract: Classification of uncertainty (aleatoric and epistemic). Modelling using Bayesian estimation, Monte Carlo methods and stochastic processes. Meteorology and engineering are discussed as examples of applications.

Title of book: ‘Uncertainty Quantification: Theory, Implementation, and Applications’
Author: Ralph C. Smith
Publisher: SIAM
Abstract: This book is dedicated to uncertainty quantification. Describes a wide range of mathematical methods, including finite element methods and stochastic modelling.

2. Approaches from a probability theory and statistics perspective
Title of book: ‘Probability and Statistics for Engineers and Scientists’
Author: Sheldon M. Ross
Publisher: Pearson
Abstract: Fundamentals of probability and statistics in engineering. Learn how to model uncertainty and select probability distributions.

Title of book: ‘Stochastic Differential Equations: An Introduction with Applications’
Author: Bernt Øksendal
Publisher: Springer
Abstract: An introduction to stochastic differential equations (SDEs). Describes methods for modelling aleatory uncertainty dynamically.

3. Reference books for the use of AI and machine learning
Title of book: ‘Gaussian Processes for Machine Learning’
Author(s): Carl Edward Rasmussen and Christopher K. I. Williams
Publisher: MIT Press
Abstract: A classic book dedicated to Gaussian Process Regression (GPR). It is fundamental for quantifying uncertainty and dealing with confidence intervals. A free PDF version is available.

Title of book: ‘Bayesian Reasoning and Machine Learning’
Author: David Barber
Publisher: Cambridge University Press
Abstract: Brings together Bayesian statistics and machine learning. Covers methods for building predictive models that account for uncertainty.

4. Applications and practical methods
Title of book: ‘Applied Predictive Modeling’
Author(s): Max Kuhn and Kjell Johnson
Publisher: Springer
Abstract: Builds predictive models using real data sets. Focuses on uncertainty visualisation and model evaluation.

Title of book: ‘Time Series Forecasting using Deep Learning: Combining PyTorch, RNN, TCN, and Deep Neural Network Models to Provide Production-Ready Prediction Solutions’

5. Uncertainty handling in specific fields
Title of book: ‘Risk Analysis in Engineering and Economics’
Author: Bilal M. Ayyub
Publisher: CRC Press
Abstract: Risk Analysis and Uncertainty Assessment in Engineering and Economics. How to apply aleatory uncertainty to real-world problems.

Title of book: ‘Uncertainty in Weather and Climate Prediction’

Online resources
1. Gaussian Processes for Machine Learning (official website)
– Free access to the contents of the above book.

2. DeepAI Tutorials
– Provides online tutorials related to uncertainty and machine learning.