How to deal with singularities in Newton’s method
Newton’s method, also described in ‘Overview of Newton’s method, algorithm and implementation’, is a powerful method for finding solutions to non-linear equations, but problems can arise at singularities, i.e. points where the Jacobian matrix is singular or nearly singular. There are several ways to deal with singularities, and the appropriate one should be chosen according to the type of problem and the characteristics of the solution.
The following problems are encountered at singularities in Newton’s method.
- Singular Jacobian matrix: if the Jacobian matrix \( J(x) \) is singular (determinant 0), the solution of the linear system \( J(x) \Delta x = -F(x) \) required by Newton’s method cannot be uniquely determined.
- Nearly singular Jacobian matrix: as a singularity is approached, the condition number of the matrix grows, leading to instability and loss of accuracy in the numerical computation.
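The near-singular case can be detected cheaply before attempting a Newton step. A minimal sketch, assuming NumPy and a made-up 2×2 Jacobian evaluated near a singular point:

```python
import numpy as np

# Hypothetical 2x2 Jacobian evaluated near a singular point
# (determinant is only 1e-6).
J = np.array([[1.0, 2.0],
              [2.0, 4.000001]])

# A large condition number signals that Newton updates at this point
# will be numerically unreliable.
cond = np.linalg.cond(J)
print(f"condition number: {cond:.3e}")
```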
Methods for dealing with singularities include the following.
1. Introduce trust-region methods: extending Newton’s method to a trust-region method, as described in ‘Overview, Algorithms and Examples of Implementations of Trust-Region Methods’, alleviates the problems at and near singularities.
– Summary: the size of each update is limited to the trust region, preventing unstable updates in the vicinity of singularities.
– Features: the problem is addressed by controlling the step rather than by directly removing the singularity of the Jacobian matrix.
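As a minimal sketch of the trust-region idea (not a full trust-region algorithm, which would also compare predicted and actual reduction and adapt the radius), the Newton step can simply be clipped to a maximum length:

```python
import numpy as np

def trust_region_newton_step(J, F, radius):
    """Compute a Newton step and clip it to the trust radius.

    Simplified sketch of the trust-region idea: a real trust-region
    method would also adapt the radius from the ratio of predicted to
    actual reduction, which is omitted here.
    """
    step = np.linalg.solve(J, -F)
    norm = np.linalg.norm(step)
    if norm > radius:
        step = step * (radius / norm)  # scale the step back into the region
    return step

# Example: a step that would overshoot is limited to length 0.5.
J = np.array([[0.1, 0.0], [0.0, 1.0]])
F = np.array([1.0, 0.2])
print(trust_region_newton_step(J, F, radius=0.5))
```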
2. Damped Newton method: to suppress instability near singularities, the update formula is modified by adding a damping term, as also described in ‘On the rescaling of the Newton method’.
– Update formula:
\[
(J(x) + \lambda I) \Delta x = -F(x)
\]
where \(\lambda > 0\) is a regularisation parameter and \(I\) is the identity matrix.
– Effect: stabilises the solution of the linear system involving the Jacobian matrix in the vicinity of singular points.
3. Use the pseudo-inverse: when the Jacobian matrix is singular or nearly singular, the update is computed with the pseudo-inverse.
– Summary: the Moore-Penrose pseudo-inverse \( J(x)^+ \) is used.
– Update formula:
\[
\Delta x = -J(x)^+ F(x)
\]
– Applicable scenarios: valid even when the matrix is rank-deficient.
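Assuming NumPy, the pseudo-inverse update can be written directly with np.linalg.pinv; the rank-1 Jacobian below is a made-up example for which an ordinary solve would raise an error:

```python
import numpy as np

# A singular (rank-1) Jacobian: np.linalg.solve would fail here.
J = np.array([[1.0, 2.0],
              [2.0, 4.0]])
F = np.array([1.0, 2.0])

# The Moore-Penrose pseudo-inverse gives the minimum-norm
# least-squares step even though J is not invertible.
delta_x = -np.linalg.pinv(J) @ F
print(delta_x)
```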
4. Switch to residual-minimisation methods: if Newton’s method is unstable, apply a method based on non-linear least squares (e.g. the Gauss-Newton method or the LM method).
– Levenberg-Marquardt method (LM method): interpolates between the Gauss-Newton method and the steepest-descent method, and remains stable even when the Jacobian matrix is singular or has a large condition number.
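A minimal sketch of the LM idea, with a damping parameter that is reduced after successful steps and increased after failed ones; the test problem (solving \( x^2 = 2 \) as a least-squares problem) is an assumed example:

```python
import numpy as np

def levenberg_marquardt(residual, jac, x, lam=1e-3, max_iter=50, tol=1e-8):
    """Minimal LM sketch: damped Gauss-Newton with adaptive lambda.

    residual and jac are assumed to return the residual vector r(x) and
    its Jacobian; this is an illustrative sketch, not a robust solver.
    """
    cost = 0.5 * np.sum(residual(x) ** 2)
    for _ in range(max_iter):
        r, J = residual(x), jac(x)
        g = J.T @ r
        step = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -g)
        new_cost = 0.5 * np.sum(residual(x + step) ** 2)
        if new_cost < cost:            # successful step: accept, relax damping
            x, cost, lam = x + step, new_cost, lam * 0.5
        else:                          # failed step: increase damping, retry
            lam *= 10.0
        if np.linalg.norm(step) < tol:
            break
    return x

# Solve x**2 = 2 in least-squares form (assumed toy problem).
res = lambda x: np.array([x[0] ** 2 - 2.0])
jac = lambda x: np.array([[2.0 * x[0]]])
sol = levenberg_marquardt(res, jac, np.array([1.0]))
print(sol)
```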
5. Choice and improvement of initial values: choose appropriate initial values, or improve the initialisation, so that singular neighbourhoods are avoided.
– Methods: obtain a coarse solution with another method (e.g. the steepest-descent method) and use it as the initial value for Newton’s method, or select initial values that avoid regions where singularities are expected.
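A crude warm start of this kind can be sketched as a few steepest-descent steps on \( g(x) = \tfrac{1}{2} F(x)^2 \), here applied to the scalar example \( F(x) = x^3 - x - 2 \) starting near the point \( x \approx 0.577 \) where \( J(x) = 0 \); the step size and iteration count are illustrative choices:

```python
import numpy as np

def coarse_start(func, jac, x, lr=0.02, n_steps=200):
    """Crude steepest-descent warm start on g(x) = 0.5 * F(x)**2.

    Illustrative sketch: a few cheap gradient steps move the iterate
    away from regions where J(x) is near zero before Newton refines it.
    """
    for _ in range(n_steps):
        x = x - lr * func(x) * jac(x)   # gradient of 0.5*F(x)**2 is F*J
    return x

func = lambda x: x**3 - x - 2.0
jac = lambda x: 3.0 * x**2 - 1.0

# J(x) vanishes at x = 0.577...; start just above it and warm-start.
x0 = coarse_start(func, jac, 0.6)
print(x0)
```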
6. Normalisation of the problem: if the variables are poorly scaled in the vicinity of a singularity, rescale and normalise the problem.
– Method: scale the Jacobian matrix or \( F(x) \).
7. Improvement of the Jacobian matrix: to construct a numerically stable Jacobian matrix, the following techniques are applied:
– Approximation of the Jacobian matrix by finite differences.
– Stabilisation of the matrix by Singular Value Decomposition (SVD), as described in ‘Overview of Singular Value Decomposition (SVD) and examples of algorithms and implementations’.
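A sketch of the SVD-based stabilisation: singular values below a threshold are discarded, which amounts to a truncated pseudo-inverse. The rank-deficient matrix below is a made-up example:

```python
import numpy as np

def svd_solve(J, F, rcond=1e-10):
    """Solve J @ dx = -F via SVD, discarding tiny singular values.

    Truncating singular values below rcond * s_max stabilises the step
    when J is singular or nearly singular (a sketch, not a full solver).
    """
    U, s, Vt = np.linalg.svd(J)
    # Invert only the singular values above the threshold; zero the rest.
    s_inv = np.divide(1.0, s, out=np.zeros_like(s), where=s > rcond * s[0])
    return Vt.T @ (s_inv * (U.T @ -F))

# Rank-deficient Jacobian: a plain np.linalg.solve would raise an error.
J = np.array([[1.0, 2.0],
              [2.0, 4.0]])
F = np.array([1.0, 2.0])
print(svd_solve(J, F))
```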
8. Combination with other methods:
– Secant method: avoids the full computation of the Jacobian matrix and uses an approximation built from previous iterates.
– Quasi-Newton methods: use algorithms such as BFGS, discussed in ‘On the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method’, or the Davidon-Fletcher-Powell (DFP) method, described in ‘Overview of the Davidon-Fletcher-Powell method, its algorithms and examples of implementations’.
Implementation example
The following example shows a Python implementation of the damped Newton method as a way of dealing with singularities in Newton’s method. It solves the simple non-linear equation \( F(x) = x^3 - x - 2 = 0 \).
Example implementation: damped Newton method
import numpy as np

def damped_newton_method(func, jacobian, x0, max_iter=100, tol=1e-6, lambda_factor=1e-3):
    """
    Solve a non-linear equation by the damped Newton method.

    Args:
        func (callable): non-linear function F(x)
        jacobian (callable): Jacobian matrix J(x)
        x0 (float): initial value
        max_iter (int): maximum number of iterations
        tol (float): convergence tolerance
        lambda_factor (float): damping coefficient λ

    Returns:
        float: approximate solution
    """
    x = x0
    for i in range(max_iter):
        F = func(x)
        J = jacobian(x)
        # Matrix modified by the damping term
        J_damped = J + lambda_factor * np.eye(1)
        try:
            # Solve the modified linear system
            delta_x = np.linalg.solve(J_damped, -F)
        except np.linalg.LinAlgError:
            print("Singularity reached. The Jacobian matrix is unsolvable.")
            break
        # Update the solution
        x += delta_x[0]
        # Convergence check
        if np.abs(delta_x[0]) < tol:
            print(f"Convergence. answer x = {x:.6f} (iteration: {i+1})")
            return x
    print("No convergence.")
    return x

# Non-linear function and its Jacobian matrix (reduced to the scalar case)
def func(x):
    return np.array([x**3 - x - 2])

def jacobian(x):
    return np.array([[3*x**2 - 1]])

# Initial value
x0 = 1.5

# Run the damped Newton method
solution = damped_newton_method(func, jacobian, x0)
print("approximate answer:", solution)
Code description
- Non-linear function and Jacobian matrix
- Function \( F(x) = x^3 - x - 2 \)
- Jacobian matrix (the derivative, in one dimension) \( J(x) = 3x^2 - 1 \)
- Introduction of the damping term
- A damping term \( \lambda I \) is added to the Jacobian matrix to avoid singularity.
- The modified matrix is \( J(x) + \lambda I \).
- Updating the solution
- The update is computed with the modified matrix \( J(x) + \lambda I \).
- Dealing with singularities
- If the matrix is still singular, np.linalg.solve raises np.linalg.LinAlgError.
- The error is caught and the iteration is aborted.
- Convergence check
- The iteration terminates when the update \( \Delta x \) falls below the tolerance.
Result of execution: given the initial value \( x_0 = 1.5 \), the following output is obtained.
Convergence. answer x = 1.521380 (iteration: 5)
approximate answer: 1.521380139035056
Proposed extensions
- Apply the method to higher dimensions and to systems of non-linear equations.
- Dynamically adjust the damping factor \( \lambda \) to further improve efficiency (as in the Levenberg-Marquardt method).
With this implementation as a reference, numerical stability in the vicinity of singularities can be ensured.
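As a sketch of these extensions, the scalar implementation above can be generalised to an n-dimensional damped Newton method with a simple adaptive \( \lambda \); the 2-variable test system is an assumed example:

```python
import numpy as np

def damped_newton_nd(func, jacobian, x0, lam=1e-3, max_iter=100, tol=1e-8):
    """n-dimensional damped Newton with a simple adaptive lambda.

    Sketch only: lambda is halved after a step that reduces the
    residual norm and increased tenfold otherwise.
    """
    x = np.asarray(x0, dtype=float)
    res = np.linalg.norm(func(x))
    for _ in range(max_iter):
        F, J = func(x), jacobian(x)
        step = np.linalg.solve(J + lam * np.eye(len(x)), -F)
        new_res = np.linalg.norm(func(x + step))
        if new_res < res:              # accept the step, relax damping
            x, res, lam = x + step, new_res, max(lam * 0.5, 1e-12)
        else:                          # reject the step, increase damping
            lam *= 10.0
        if res < tol:
            break
    return x

# Assumed test system: x^2 + y^2 = 4, x*y = 1.
func = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
jacobian = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
sol = damped_newton_nd(func, jacobian, [2.0, 0.5])
print(sol)
```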
Application examples
The following examples explore how Newton’s method can be applied stably to problems in which singularities or ill-conditioning arise.
1. Inverse kinematics of a robot arm
Problem overview: when solving inverse kinematics problems for robot arms, where joint angles are calculated to achieve a given end-effector position or posture, the Jacobian matrix can become singular during the calculation. This occurs in particular configurations of the arm (e.g. when the links are aligned in a straight line).
Application: the damped Newton method can find a stable solution even when the Jacobian matrix is singular or ill-conditioned, allowing smooth updates of the joint angles that avoid singularities and eliminate numerical instability.
Implementation flow
1. Define the target and current positions.
2. Calculate the Jacobian matrix for each joint of the robot arm.
3. Add a damping term to ensure numerical stability near singularities.
4. Iterate, updating the joint angles until convergence.
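The flow above can be sketched for a planar 2-link arm with the damped least-squares update; the link lengths, damping factor and iteration count are illustrative assumptions:

```python
import numpy as np

def two_link_ik(target, theta, l1=1.0, l2=1.0, lam=0.1, n_iter=200):
    """Damped least-squares IK for a planar 2-link arm (illustrative).

    The damping factor lam stabilises the update when the arm is
    stretched out and the Jacobian becomes singular.
    """
    for _ in range(n_iter):
        t1, t2 = theta
        # Forward kinematics of the end effector.
        pos = np.array([l1*np.cos(t1) + l2*np.cos(t1 + t2),
                        l1*np.sin(t1) + l2*np.sin(t1 + t2)])
        err = target - pos
        # Jacobian of the end-effector position w.r.t. the joint angles.
        J = np.array([[-l1*np.sin(t1) - l2*np.sin(t1 + t2), -l2*np.sin(t1 + t2)],
                      [ l1*np.cos(t1) + l2*np.cos(t1 + t2),  l2*np.cos(t1 + t2)]])
        # Damped update: (J^T J + lam^2 I) dtheta = J^T err
        dtheta = np.linalg.solve(J.T @ J + lam**2 * np.eye(2), J.T @ err)
        theta = theta + dtheta
    return theta

# Reachable target; start from a near-singular, almost straight arm.
theta = two_link_ik(np.array([1.2, 0.8]), np.array([0.01, 0.01]))
print(theta)
```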
2. Non-linear regression analysis
Problem overview: in non-linear regression, when parameters are optimised to fit the data, the gradient of the objective function can be very small or the Jacobian matrix can be singular. This is particularly likely for complex non-linear models and over-fitted data.
Application: the damped Newton method finds the optimal solution stably even when the Jacobian matrix is singular or nearly so. For example, it is useful when estimating the parameters \( \beta_0 \) and \( \beta_1 \) of the non-linear model \( y = \beta_0 \cdot x^{\beta_1} \).
Implementation flow
1. Initialise the parameters \( \beta_0 \) and \( \beta_1 \) to be optimised.
2. Calculate the residual \( F(\beta) = y_{\text{data}} - f(x, \beta) \).
3. Calculate the Jacobian matrix and add a damping term.
4. Update the parameters with the update formula and check convergence.
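The flow above can be sketched as a damped Gauss-Newton fit of the model \( y = \beta_0 \cdot x^{\beta_1} \) to synthetic, noise-free data; the data, the initial guess chosen close to the true parameters, and the fixed damping factor are illustrative assumptions:

```python
import numpy as np

# Synthetic, noise-free data from an assumed model with b0=2, b1=1.5.
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_data = 2.0 * x_data ** 1.5

def residual(beta):
    # F(beta) = y_data - f(x, beta)
    return y_data - beta[0] * x_data ** beta[1]

def jacobian(beta):
    # Partial derivatives of the residual w.r.t. b0 and b1.
    return np.array([-x_data ** beta[1],
                     -beta[0] * x_data ** beta[1] * np.log(x_data)]).T

beta = np.array([1.9, 1.45])   # initial guess near the true values
lam = 1e-3                     # fixed damping factor (illustrative)
for _ in range(100):
    r, J = residual(beta), jacobian(beta)
    # Damped Gauss-Newton step.
    step = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
    beta = beta + step
print(beta)
```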
3. Optimal design of structures
Problem overview: in structural design, e.g. finding the optimum structure of a building or bridge, there are many design variables (member lengths, cross-sectional areas, choice of materials) and the objective function (e.g. the energy or stress to be minimised) is often non-linear. In such problems, singularities may occur in the design space, leading to numerical instability.
Application: the damped Newton method achieves stable optimisation while avoiding singularities when the design variables are updated, preventing numerical instability in the design space and speeding up convergence towards the optimal solution.
Implementation flow
1. Initialise the design variables of the structure (member lengths and cross-sectional areas).
2. Compute the objective function (energy, stress, etc.).
3. Calculate the Jacobian matrix and add a damping term.
4. Update the design variables in the optimisation algorithm.
4. Physics simulation (fluid dynamics)
Problem overview: in fluid-dynamics simulations, when the motion of an object or the behaviour of a fluid is solved numerically, the Jacobian matrix can be singular or nearly singular, especially near boundary layers or contact points, e.g. when the fluid slides over the surface of an object.
Application: the damped Newton method yields numerically stable solutions and an accurate simulation of the fluid motion; a damping term is added to avoid singularities of the Jacobian matrix when solving the hydrodynamic equations.
Implementation flow
1. Initialise the velocity and pressure fields of the fluid.
2. Define the non-linear governing equations (e.g. the Navier-Stokes equations).
3. Calculate the Jacobian matrix and add a damping term.
4. Update the fluid state with an iterative method.
5. Optimisation of financial models
Problem overview: in financial portfolio optimisation, non-linear optimisation is performed to balance risk against return. In this problem, the Jacobian matrix can be singular or ill-conditioned because of the non-linear nature of asset returns and risks.
Application: the damped Newton method solves the optimisation problem with stable updates, even in the presence of non-linear constraints on portfolio returns and risks.
Implementation flow
1. Set the initial allocation of the portfolio.
2. Define a non-linear model of return and risk.
3. Calculate the Jacobian matrix and add a damping term.
4. Optimise the portfolio allocation with the update formula.
Reference books
The following reference books address how to deal with singularities in Newton’s method.
1. ‘Numerical Optimization’
– Author(s): Jorge Nocedal, Stephen J. Wright
– Summary: this classic book on numerical optimisation describes in detail how Newton’s method should be handled for optimisation problems involving singularities, including damped and modified Newton methods that improve the conditioning of the matrices and the convergence of the solution.
2. ‘Optimization by Vector Space Methods’
– Author(s): David G. Luenberger
– Summary: introduces the basic theory of optimisation in vector spaces and discusses convergence problems and singularities in Newton’s method, with a focus on obtaining convergence when the matrix is singular.
3. ‘Numerical Methods for Unconstrained Optimization and Nonlinear Equations’
– Author(s): J. E. Dennis, Robert B. Schnabel
– Summary: details the theory of solving non-linear equations with Newton’s method and explains the modifications needed near singularities, including algorithmic options for improving stability with singular matrices.
4. ‘Practical Optimization: Algorithms and Engineering Applications’
– Author(s): Andreas Antoniou, Wu-Sheng Lu
– Summary: describes how to solve non-linear problems with Newton’s method in practical applications and how to deal appropriately with singularities, covering damping techniques and modified Newton methods.
5. ‘Introduction to Numerical Optimisation’
– Author(s): Andrzej Ruszczynski
– Summary: covers the basics of numerical optimisation, with chapters on using matrix approximations and damping to avoid singularities and on evaluating their effects.
6. ‘Convex Optimization’
– Author(s): Stephen Boyd, Lieven Vandenberghe
– Summary: although dedicated to convex optimisation, the book treats Newton’s method and its variants and explains the theory and practical fixes for cases where singularities occur; singularity avoidance in convex problems is also discussed.
7. ‘A First Course in Optimisation’
– Author(s): S. S. Rao
– Summary: a basic introduction to optimisation, explaining Newton’s method and related algorithms, in particular how to modify singular matrices (e.g. with the pseudo-inverse).