Linear and nonlinear models and their use

Linear Models and Nonlinear Models

Linear models and nonlinear models are distinguished based on the difference in the relationship between input and output.

A linear model has a simple structure in which the relationship between input and output can be expressed as a straight line or a flat plane. Concretely, the model is a weighted sum of the inputs plus a constant, using first-degree terms only, as shown below:

y = a₁x₁ + a₂x₂ + ... + aₙxₙ + b

Thanks to this structure, linear models are relatively easy to compute, process quickly, and offer straightforward interpretability.

Typical examples of linear models include:

  • Linear Regression
  • Logistic Regression
  • Support Vector Machine (SVM) with a linear kernel

These models have the advantage of being less prone to overfitting. However, they have limitations in expressing complex relationships or nonlinear data structures.
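As a minimal sketch of this structure, the weights a₁…aₙ and intercept b of a linear model can be estimated by ordinary least squares. The example below uses made-up data and NumPy; the coefficient values (2, 3, 5) are assumptions chosen for illustration:

```python
import numpy as np

# Hypothetical noise-free data: y = 2*x1 + 3*x2 + 5
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 5.0

# Append a column of ones so the intercept b is learned as an extra weight
A = np.hstack([X, np.ones((len(X), 1))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

print(coeffs)  # recovers approximately [2.0, 3.0, 5.0]
```

Because the data are exactly linear, least squares recovers the generating weights; with real, noisy data the estimates would only approximate them.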

Example of a Linear Model: House Price Prediction

Price = 500 × Floor Size + 1000 × Distance to Station + Constant

In this example, the effect of each input simply accumulates through addition, reflecting a straightforward linear structure.
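The pricing formula can be written directly as a function. The constant term below is a placeholder value, since the example leaves it unspecified:

```python
def predicted_price(floor_size, distance_to_station, constant=200):
    # Coefficients taken from the example; `constant` is a placeholder value
    return 500 * floor_size + 1000 * distance_to_station + constant

# Each term contributes additively and independently of the others
print(predicted_price(70, 2))  # 500*70 + 1000*2 + 200 = 37200
```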


On the other hand, nonlinear models express the relationship between input and output through curves, exponential functions, periodic functions, or other complex patterns, such as:

y = sin(x)
y = x²
y = e^x

These models allow for more flexible representation of complex real-world data.

Typical examples of nonlinear models include:

  • Neural Networks
  • Decision Trees
  • Support Vector Machine (SVM) with nonlinear kernels (e.g., RBF kernel)

However, nonlinear models tend to involve more complex calculations, higher computational costs, and greater difficulty in parameter tuning. Improper handling may lead to overfitting.

Example of a Nonlinear Model: Population Growth

Population = Initial Value × e^(Growth Rate × Time)
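The growth formula can be evaluated directly; the initial value and growth rate below are illustrative assumptions:

```python
import math

def population(initial, growth_rate, t):
    """Exponential growth: P(t) = P0 * e^(growth_rate * t)."""
    return initial * math.exp(growth_rate * t)

# With growth_rate = ln(2), the population doubles each time unit
p0 = 1000
print(population(p0, math.log(2), 1))  # ~2000
print(population(p0, math.log(2), 3))  # ~8000
```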

Additionally, nonlinear models can describe outputs that change in curved or periodic ways in response to inputs, such as:

y = sin(x)

This flexibility enables nonlinear models to capture complex and dynamic patterns present in real-world data.

Examples of Linear and Nonlinear Models by Field

The distinction between linear and nonlinear models manifests differently across specific fields such as machine learning, statistics, and physics. Below is a detailed explanation of the differences and practical applications of linear and nonlinear models in each domain.

Machine Learning

Representative linear models in machine learning include linear regression, logistic regression, and support vector machines (SVM) with a linear kernel. These models are effective when the relationship between input and output can be expressed as a simple straight line or flat plane.

Linear models determine the output from a weighted sum of the explanatory variables (features), making them simple and highly interpretable.

In contrast, nonlinear models such as neural networks, decision trees, and SVM with nonlinear kernels (especially RBF kernels) can capture more complex and nonlinear data structures. As a result, they are applied to advanced pattern recognition tasks such as image recognition and natural language processing.

Nonlinear models can learn complex patterns and nonlinear relationships, often achieving higher prediction accuracy. However, they also tend to function as “black boxes,” making interpretation more difficult.

Statistics

In statistics, typical examples of linear models include simple regression analysis, multiple regression analysis, and analysis of variance (ANOVA). These are analytical methods based on the assumption that the relationship between explanatory variables and the target variable is linear, and they are widely used for hypothesis testing and causal inference.

The advantage of linear models in statistics is that hypothesis testing and parameter estimation are straightforward, and the theoretical foundations are well established.

On the other hand, nonlinear models in statistics include generalized additive models (GAM), polynomial regression, and nonlinear least squares methods, which enable flexible modeling of complex, curved relationships and structures.

Nonlinear statistical models provide better flexibility in fitting real-world data, but they also pose challenges in terms of model selection and parameter estimation.

Physics

Typical examples of linear models in physics include Hooke’s Law (the relationship between force and displacement in springs: F = kx) and Ohm’s Law (the relationship between voltage and current: V = IR). These laws describe linear relationships that hold true under limited, specific conditions.
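Both laws are one-line computations; the numeric values below are arbitrary examples:

```python
def spring_force(k, x):
    # Hooke's law: F = kx (valid for small displacements)
    return k * x

def voltage(current, resistance):
    # Ohm's law: V = IR
    return current * resistance

print(spring_force(50.0, 0.1))  # 5.0 (N)
print(voltage(2.0, 10.0))       # 20.0 (V)
```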

Linear models in physics are widely used to describe fundamental laws of force and motion (such as in Newtonian mechanics and electrical circuits), where linear approximations are valid for small displacements and low-speed scenarios.

In contrast, real-world physical phenomena often exhibit significant nonlinear characteristics. For example, time dilation in relativity, chaotic systems, nonlinear oscillations, and turbulence are behaviors that can only be described using nonlinear models. Such complex dynamics cannot be captured by simple linear approximations.

Typical nonlinear models in physics include:

  • Relativistic corrections under strong gravitational fields
  • Nonlinear optics
  • Chaos theory
  • Turbulence modeling

These phenomena are highly complex, making simple analytical solutions impractical. In many cases, they require advanced numerical simulations for proper analysis and prediction.

Choosing Between Linear and Nonlinear Models

There are clear differences in the characteristics and practical usage of linear and nonlinear models, as outlined below.

Computation and Interpretability

Linear models have a simple structure, making them relatively easy to compute and highly stable from a theoretical perspective. As a result, the meaning of the results and causal relationships can be intuitively understood, which is why linear models are widely used in academic research and practical applications.

In contrast, nonlinear models have complex structures, making their calculations more difficult. Theoretical analysis and interpretation of the results are not straightforward, requiring advanced expertise and numerical analysis techniques.

Differences in Expressiveness

Linear models capture the relationship between input and output in a straight-line fashion, which limits the range of relationships the model can express.

On the other hand, nonlinear models offer greater flexibility. They can represent complex real-world phenomena, including curves, exponential relationships, and periodic patterns. As a result, nonlinear models enable more accurate predictions and the ability to analyze complex data structures.

Practical Workflow

In many practical scenarios, it is common to first test a simple linear model. This approach helps to easily capture the basic trends and structure of the data while avoiding unnecessary complexity in model building.

If it is determined that the data structure cannot be adequately expressed by a simple linear model, a nonlinear model is introduced as needed. This stepwise approach helps balance interpretability and prediction accuracy while optimizing the modeling process.

Choosing Between Linear and Nonlinear Models – Simple Example

Below is a concrete example illustrating the practical choice between linear and nonlinear models, using the case of marketing campaigns and sales prediction.

Step 1: Testing with a Simple Linear Model

Objective:
Predict the relationship between advertising expenses and sales.

Example Data:

Month   Advertising Cost (10,000 yen)   Sales (10,000 yen)
Jan     10                              150
Feb     20                              300
Mar     30                              450
Apr     40                              550
May     50                              600

Building the Linear Model

Assuming a proportional relationship where sales increase linearly with advertising expenses:

Sales = a × Advertising Cost + b

Step 2: Identifying the Limitations of the Linear Model

As more data is collected, the following trend becomes evident:

Advertising Cost (10,000 yen)   Sales (10,000 yen)
60                              620
70                              630
80                              635

This new data cannot be explained by the initially assumed linear model.

Problem: Saturation Effect

  • At first, advertising is effective.
  • As investment increases, the marginal effect diminishes.
  • A linear model falsely predicts unlimited growth in sales, which does not reflect reality.
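This breakdown can be reproduced numerically: fitting the straight line to the first five months of data from the tables above and extrapolating to the new spend levels yields predictions far above what was observed. A sketch assuming NumPy:

```python
import numpy as np

# Fit Sales = a * Cost + b on the initial five months (figures from the table)
cost = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
sales = np.array([150.0, 300.0, 450.0, 550.0, 600.0])
A = np.vstack([cost, np.ones_like(cost)]).T
(a, b), *_ = np.linalg.lstsq(A, sales, rcond=None)

# Extrapolate to the new, higher-spend observations
new_cost = np.array([60.0, 70.0, 80.0])
observed = np.array([620.0, 630.0, 635.0])
predicted = a * new_cost + b
print(predicted)  # roughly [755, 870, 985]: far above what was actually observed
```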

Step 3: Transition to a Nonlinear Model

To capture the saturation effect, we introduce a nonlinear saturating model, in which sales approach a ceiling exponentially:

Sales = Maximum Sales × (1 - e^(-k × Advertising Cost))

About the Exponential Function

  • If the exponent has a positive coefficient (growth model), the output increases exponentially over time or with increased input.
  • If the exponent has a negative coefficient (decay model), the exponential term shrinks rapidly toward zero, so the output rises quickly at first and then levels off at its maximum.
  • This allows diminishing returns on advertising investment to be modeled realistically.
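A quick evaluation of the saturating formula illustrates the diminishing returns; the maximum-sales and k values below are illustrative assumptions, not fitted parameters:

```python
import math

def predicted_sales(cost, max_sales=650.0, k=0.05):
    # Saturating model: Sales = Max * (1 - e^(-k * Cost))
    # max_sales and k are illustrative values, not fitted to the data
    return max_sales * (1.0 - math.exp(-k * cost))

# The marginal gain per extra unit of spend shrinks as cost grows
for c in (10, 30, 60, 100):
    print(c, round(predicted_sales(c), 1))
```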

Features of Nonlinear Models

A major advantage of nonlinear models is their ability to accurately reproduce realistic saturation effects, such as those seen in advertising campaigns. For instance:

  • Initially, increasing advertising significantly boosts sales.
  • Beyond a certain investment level, the effect gradually weakens.
  • Eventually, additional investment yields little to no sales growth.

By using a nonlinear model, it becomes possible to:

  • Quantitatively identify the point where the advertising effect saturates.
  • Estimate the optimal advertising budget to maximize returns while avoiding unnecessary expenses.

Choosing Between Linear and Nonlinear Models – Explainability Perspective

Another practical example of choosing between linear and nonlinear models is based on the perspective of explainability.

For instance, consider sales modeled from advertising expenses and the number of SNS posts with the following simple linear model:

y = 2 × Advertising Cost + 3 × Number of SNS Posts + 5

In this model, the “weights” (coefficients) directly indicate the influence of each feature, making it possible to interpret exactly how much the output will change when a particular variable changes by one unit.
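This unit-change interpretation can be checked directly: raising one input by a single unit changes the prediction by exactly that input's coefficient, regardless of the other inputs.

```python
def predicted_sales(ad_cost, sns_posts):
    # y = 2 * Advertising Cost + 3 * Number of SNS Posts + 5 (from the example)
    return 2 * ad_cost + 3 * sns_posts + 5

base = predicted_sales(10, 4)
print(predicted_sales(11, 4) - base)  # 2: the advertising-cost coefficient
print(predicted_sales(10, 5) - base)  # 3: the SNS-posts coefficient
```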

In contrast, suppose we use a more complex model, such as a neural network, designed with the following inputs and output:

Inputs:

  • Advertising cost
  • Number of SNS mentions
  • Seasonal factors
  • Competitor activity

Output:

  • Sales prediction

While such a model can achieve higher prediction accuracy, the relationship between inputs and outputs becomes a “black box,” making it difficult to explain precisely how changes in specific inputs affect the predicted sales. As a result, decision-makers may find it challenging to apply the model effectively.

Improving Explainability in Nonlinear Models

To address this challenge, it is common to apply explainability techniques such as SHAP, LIME, or Feature Importance Visualization after analyzing the data with a nonlinear model.

  • SHAP and LIME quantify and visualize how much each feature contributes to individual predictions.
  • These tools provide an “add-on” layer of explainability to complex nonlinear models, making them behave more like interpretable linear models.

In other words, these explainability techniques help bridge the gap by bringing some of the interpretability benefits of linear models to nonlinear models.
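As a lightweight, self-contained stand-in for such tools, permutation importance measures how much prediction error grows when one feature column is shuffled. The data and the least-squares "model" below are toy assumptions standing in for any trained predictor:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends strongly on x1 and only weakly on x2
X = rng.normal(size=(200, 2))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1]

# "Model": a least-squares fit, standing in for any trained predictor
w, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda data: data @ w

def permutation_importance(X, y, predict, col, rng):
    """Increase in mean squared error when one feature column is shuffled."""
    base_err = np.mean((predict(X) - y) ** 2)
    Xp = X.copy()
    Xp[:, col] = rng.permutation(Xp[:, col])
    return np.mean((predict(Xp) - y) ** 2) - base_err

imps = [permutation_importance(X, y, predict, c, rng) for c in (0, 1)]
print(imps)  # importance of x1 dwarfs that of x2
```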


Practical Workflow Example

  • First, apply a simple linear model to gain a quick understanding of the results and easily share insights within the team.
  • Next, introduce a more complex nonlinear model to capture intricate patterns and improve prediction accuracy.
  • Finally, use tools like SHAP or LIME to supplement the nonlinear model with interpretability, allowing for clearer decision-making.

Thus, the best approach is to adaptively use both the simplicity and explainability of linear models and the expressive power of nonlinear models, depending on the situation.

Choosing Between Linear and Nonlinear Models – Pushing Linear Models with High-Dimensional Transformation

Linear models can be made substantially more expressive through a process called high-dimensional feature transformation.

Typical Approaches

Polynomial Feature Transformation

Consider a simple linear model:

y = a × x + b

By transforming the features as follows:

x → [x, x², x³, ...]

Even a linear model can then express curves and complex decision boundaries, capturing nonlinear patterns within the framework of linear regression.
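A sketch of this idea, assuming NumPy: expanding x into [x, x²] lets an ordinary least-squares fit recover a parabola exactly, even though the model remains linear in its parameters.

```python
import numpy as np

# Quadratic data that a plain straight line cannot fit
x = np.linspace(-3, 3, 50)
y = x ** 2

# Expand x -> [x, x^2] and fit a *linear* model in the expanded feature space
A = np.vstack([x, x ** 2, np.ones_like(x)]).T
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)  # approximately [0, 1, 0]: the curve is linear in the new features
```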

Kernel Methods (Kernel Transformation in SVM)

Kernel methods map the input space into a higher-dimensional space through a nonlinear transformation. In this transformed space, linear separation becomes possible.

For example:

  • RBF (Radial Basis Function) Kernel
  • Polynomial Kernel

With these transformations, even if the original input is simple, the model can learn highly complex patterns internally while still relying on a linear framework in the transformed space.

Sparse Linear Models

Sparse linear models, such as Lasso Regression, introduce a large number of features and select only the relevant ones.

This approach allows for:

  • Maintaining interpretability
  • Handling large-scale, high-dimensional data
  • Achieving a balance between model simplicity and complex expression
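At the core of Lasso's feature selection is the soft-thresholding operator, the proximal step that drives small coefficients exactly to zero; a minimal sketch:

```python
import numpy as np

def soft_threshold(w, lam):
    """Soft-thresholding: shrink weights by lam, clipping small ones to zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

# Small weights are driven exactly to zero; large ones are shrunk toward zero
w = np.array([3.0, -0.2, 0.05, -1.5])
print(soft_threshold(w, 0.5))  # [2.5, 0.0, 0.0, -1.0]
```

This exact-zeroing behavior is what makes Lasso select features rather than merely shrink them.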

Practical Considerations

Theoretically, with appropriate feature transformation, linear models can handle almost any level of complexity, making it possible to solve a wide range of problems using linear models alone.

However, challenges arise:

  • The computational cost increases rapidly with high-dimensional transformations.
  • The risk of overfitting becomes significant.
  • Poorly designed transformations can lead to reduced model accuracy.

Therefore, rather than excessively complicating a linear model, it is often more efficient to use inherently flexible nonlinear models, such as neural networks, especially for tasks that require capturing highly complex patterns.

Recommended Books

Fundamental Books on Linear and Nonlinear Models (English)

  1. An Introduction to Statistical Learning (ISL)
    By Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
    → Covers linear regression, logistic regression, classification, nonlinear models, tree-based methods, and neural networks.
    Practical introduction to statistical learning, combining theory and applications.

  2. The Elements of Statistical Learning (ESL)
    By Trevor Hastie, Robert Tibshirani, Jerome Friedman
    → More theoretical and comprehensive. Covers linear models, nonlinear models, boosting, SVMs, and neural networks.

  3. Pattern Recognition and Machine Learning
    By Christopher M. Bishop
    → Focuses on the mathematical foundations of both linear and nonlinear models, including neural networks, decision trees, and probabilistic models.

Books on Model Interpretability, SHAP, LIME, and XAI (English)

  1. Interpretable Machine Learning
    By Christoph Molnar
    → Covers SHAP, LIME, Partial Dependence Plots (PDP), and various explainable AI techniques.

  2. Interpretable Machine Learning with Python
    By Serg Masís
    → Practical examples of SHAP, LIME, and other interpretability tools implemented in Python.

  3. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning
    Edited by Wojciech Samek, Grégoire Montavon, Andrea Vedaldi, Lars Kai Hansen, Klaus-Robert Müller
    → Covers explainability and visualization techniques for deep learning models, including CNNs and RNNs.

Feature Engineering & Practical Machine Learning (English)

  1. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists
    By Alice Zheng, Amanda Casari
    → Practical techniques for designing features, including preprocessing for text, categorical, and time-series data.

  2. Python Machine Learning
    By Sebastian Raschka, Vahid Mirjalili
    → Comprehensive guide to machine learning with scikit-learn and PyTorch, including both linear and nonlinear models and interpretability techniques.

  3. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
    By Aurélien Géron
    → Practical, hands-on approach covering linear models, nonlinear models, neural networks, and model interpretability.

