Riemannian optimisation algorithms and implementation examples

Riemannian optimisation

Riemannian optimisation is an approach in which standard optimisation methods are carried out on Riemannian manifolds.

A manifold here is a mathematical tool for representing a space that is locally simple but globally complex. For example, a circle looks like a straight line locally but is closed as a whole, and a sphere looks like a plane locally but is a closed surface without edges: in both cases the local structure is simple while the global structure is more complex.

A Riemannian manifold is a manifold with a smooth geometric structure in which an inner product is defined on the tangent space at each point, making it possible to define measures such as distances and angles.
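For example, the unit sphere in \( \mathbb{R}^n \) is a Riemannian manifold when each tangent space inherits the Euclidean inner product:

\[
S^{n-1} = \{x \in \mathbb{R}^n \mid \|x\|_2 = 1\}, \quad T_xS^{n-1} = \{v \in \mathbb{R}^n \mid x^\top v = 0\}, \quad g_x(u, v) = u^\top v
\]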

Applying Riemannian manifolds to the approach described in ‘What is information geometry?’, which treats probability distributions and statistical models as geometric spaces and introduces geometric structures (distance, curvature, connection, etc.) into them, makes it possible to solve problems with non-linear constraints effectively. The method is used in numerical computing, machine learning, computer vision, signal processing and other fields.

The characteristics of Riemannian optimisation include the following.

1. Optimisation on manifolds: when solving problems with non-linear constraints, the search must remain within the space defined by the manifold. In Riemannian optimisation, algorithms are designed on Riemannian manifolds instead of Euclidean spaces to solve such problems.

2. Scope of application: characteristic application areas include:
– Eigenvalue problems
– Rank constraints on matrices
– Low-dimensional embeddings (e.g. PCA and LDA extensions)
– Robust machine learning models
– Distributed clustering and graph analysis

3. Geometric perspective: standard optimisation methods such as gradient descent and Newton’s method need to be adapted to the geometric structure of the manifold. In particular, the gradient is defined as a tangent vector on the manifold.

Riemannian optimisation is performed using the following basic concepts.

1. Riemannian manifold: a tangent space with an inner product is defined at each point. This inner product can be used to measure distances and angles.

2. Riemannian gradient: an adaptation of the ordinary gradient to manifolds, which allows optimisation in directions that satisfy the constraints.

3. Projection: the gradient computed in Euclidean space is projected onto the tangent space of the manifold, so that the search for solutions stays on the manifold (see the sketch after this list).

4. Reparameterisation: to operate on Riemannian manifolds, it is sometimes necessary to reparameterise the variables.
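As a minimal sketch of concepts 2 to 4 above, the following Python snippet implements the tangent-space projection and a retraction for the unit sphere; the function names are illustrative, and the normalisation retraction is one standard choice among several.

import numpy as np

def project_to_tangent(x, v):
    # Remove the component of v along x: the result lies in the
    # tangent space of the unit sphere at x.
    return v - (x @ v) * x

def retract(x, v):
    # Map a tangent vector v at x back onto the unit sphere by
    # normalising x + v (a standard retraction for the sphere).
    y = x + v
    return y / np.linalg.norm(y)

# The Riemannian gradient is then the projected Euclidean gradient:
# grad f(x) = project_to_tangent(x, euclidean_gradient_of_f(x))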

Specific algorithms include the following.

Riemannian gradient descent: an algorithm for optimising a function defined on a Riemannian manifold, obtained by adapting gradient descent in Euclidean space to the geometric structure of the manifold. Its steps are listed below, followed by a short sketch.

  1. Set an initial point on the manifold.
  2. Compute the gradient and project it onto the tangent space.
  3. Compute the next point according to the update rule (using a retraction).
  4. Repeat until the convergence conditions are met.
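A minimal, self-contained sketch of these four steps in Python, for the objective \( f(x) = -x^\top A x \) on the unit sphere (so the minimiser is a leading eigenvector of a symmetric matrix A); the fixed step size and tolerance are illustrative assumptions:

import numpy as np

def riemannian_gradient_descent(A, alpha=0.1, tol=1e-8, max_iter=1000):
    n = A.shape[0]
    x = np.random.randn(n)
    x /= np.linalg.norm(x)                   # 1. initial point on the sphere
    for _ in range(max_iter):
        egrad = -2.0 * A @ x                 # Euclidean gradient of f(x) = -x^T A x
        rgrad = egrad - (x @ egrad) * x      # 2. project onto the tangent space
        y = x - alpha * rgrad                # 3. step in the tangent direction...
        x_new = y / np.linalg.norm(y)        #    ...then retract to the sphere
        if np.linalg.norm(x_new - x) < tol:  # 4. convergence check
            return x_new
        x = x_new
    return x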

Riemannian Newton method: an algorithm that achieves faster convergence by using the Hessian matrix (see also ‘On Hesse matrices and regularity’) on Riemannian manifolds.

  1. Compute the gradient and the Hessian matrix
  2. Solve Newton’s equation
  3. Retraction (map the update back onto the manifold)
  4. Check convergence
  5. Iterate

Applications include.

  • Low-rank matrix completion: recommendation systems such as Netflix’s need to infer a complete matrix from partially observed data. This is where low-rank constraints arise and Riemannian optimisation can help.
  • Machine learning: used especially in deep learning models with normalisation conditions and in tensor decomposition models.
  • Signal and image processing: applied to alignment and filtering problems.
Mathematical model

The mathematical model of Riemannian optimisation is described below.

1. Definition of a Riemannian manifold: Riemannian optimisation is a method for solving optimisation problems defined on a Riemannian manifold \( M \).

A smooth manifold \( M \) is a space with a geometric structure that locally resembles a Euclidean space.
A tangent vector space \( T_xM \) is defined at each point \( x \in M \).
On a Riemannian manifold, an inner product \( g_x(u, v) \) is defined at each point, so that angles and distances between tangent vectors \( u, v \in T_xM \) can be measured.

2. Formalisation of the optimisation problem: the basic Riemannian optimisation problem takes the following form:

\[
\min_{x \in M} f(x)
\]

– \( f: M \to \mathbb{R} \) is a scalar-valued function (objective function) on a Riemannian manifold.
– The constraint \( x \in M \) restricts the search space to \( M \).

In ordinary optimisation, the point \( x \) is searched for in the Euclidean space \( \mathbb{R}^n \), whereas in Riemannian optimisation \( x \) is restricted to the manifold \( M \).

3. The Riemannian gradient and the Hessian matrix:

Riemannian gradient: in Riemannian optimisation, a gradient defined on the Riemannian manifold is used instead of the Euclidean gradient \( \nabla f \). The Riemannian gradient \( \text{grad } f(x) \) is the Euclidean gradient projected onto the tangent space \( T_xM \); the projection depends on the structure of the Riemannian manifold and is usually computed as follows.

\[
\text{grad } f(x) = P_x(\nabla f(x))
\]
where \( P_x \) is the projection operator onto the tangent space.
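For instance, on the unit sphere this projection has the simple closed form

\[
P_x(v) = v - (x^\top v)\, x
\]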

Riemannian Hessian: the Riemannian version of the Hessian matrix is used in the Riemannian Newton method. It describes the rate of change of the gradient vector within the tangent space.
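As a concrete instance, for the unit sphere embedded in \( \mathbb{R}^n \) the Riemannian Hessian acting on a tangent vector \( v \) can be written in terms of the Euclidean gradient and Hessian (a standard formula, added here for illustration):

\[
\text{Hess } f(x)[v] = P_x\left(\nabla^2 f(x)\, v\right) - \left(x^\top \nabla f(x)\right) v
\]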

4. Algorithms

Riemannian gradient descent: an adaptation of the usual gradient descent method to manifolds. The iteration proceeds as follows.

1. Set the initial point \( x_0 \in M \).
2. Calculate the gradient \( \text{grad } f(x_k) \).
3. Compute the next point based on the direction in the tangent space:
\[
x_{k+1} = R_{x_k}(-\alpha_k \text{grad } f(x_k))
\]
where \( R_{x_k} \) is the retraction back onto the manifold and \( \alpha_k \) is the step size.

Retraction: the operation of mapping a point from the tangent space back onto the manifold in order to update points on the manifold.
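On the unit sphere, for example, a standard retraction is simple normalisation:

\[
R_x(v) = \frac{x + v}{\|x + v\|_2}
\]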

5. Applications and concrete examples of mathematical modelling

Example 1: Eigenvalue problem: the maximum eigenvalue problem can be modelled on a Riemannian manifold in the following form:
\[
\max_{\mathbf{x} \in M} \mathbf{x}^\top A \mathbf{x}, \quad \text{where } M = \{\mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x}\|_2 = 1\}
\]

In this case, the Riemannian manifold \( M \) is the unit sphere.
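For this objective with a symmetric matrix \( A \), the Euclidean gradient is \( \nabla f(x) = 2Ax \), so applying the projection formula above gives the Riemannian gradient

\[
\text{grad } f(x) = P_x(2Ax) = 2\left(Ax - (x^\top A x)\, x\right),
\]

whose stationary points satisfy \( Ax = (x^\top A x)\, x \), i.e. they are exactly the eigenvectors of \( A \).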

Example 2: Low-rank matrix completion: the problem of completing a rank-constrained matrix from observed data \( A \) is defined as follows:
\[
\min_{X \in M} \|P_\Omega(X - A)\|_F^2, \quad \text{where } M = \{X \in \mathbb{R}^{m \times n} \mid \text{rank}(X) = r\}
\]

Here \( A \) is the matrix of observed data, \( \Omega \) is the set of observed entries, \( P_\Omega \) is the projection that keeps only the entries in \( \Omega \), and \( \| \cdot \|_F \) is the Frobenius norm.
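The following self-contained Python sketch attacks this problem with a projected-gradient (singular value projection) scheme rather than a full Riemannian method: each Euclidean gradient step is followed by a truncated SVD that maps the iterate back to the rank-\( r \) set, playing the role of a retraction. Function names, step size and iteration count are illustrative assumptions.

import numpy as np

def low_rank_completion(A_obs, mask, r, lr=1.0, iters=500):
    # Approximately solve min_{rank(X) = r} ||P_Omega(X - A_obs)||_F^2,
    # where mask is a boolean array that is True on the observed set Omega.
    X = np.zeros_like(A_obs)
    for _ in range(iters):
        G = mask * (X - A_obs)           # gradient direction of the data-fit term
        Y = X - lr * G                   # Euclidean gradient step
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U[:, :r] * s[:r]) @ Vt[:r]  # truncate to rank r (acts as the retraction)
    return X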

Implementation example

The following example shows an implementation of Riemannian optimisation in Python using the pymanopt library, a widely used tool that makes Riemannian optimisation easy to handle. The code below assumes the pymanopt 2.x API.

Example: solving an eigenvalue problem

Consider the following problem.

\[
\max_{x \in M} x^\top A x, \quad \text{where } M = \{x \in \mathbb{R}^n \mid \|x\|_2 = 1\}
\]

Here, the goal is to find the largest eigenvalue of the matrix A and the corresponding eigenvector.

Code Example:

Library installation: pymanopt can be installed with the following command.

pip install pymanopt

Code:

import numpy as np
import pymanopt
from pymanopt.manifolds import Sphere
from pymanopt.optimizers import SteepestDescent

# Definition of matrix A (generate a random symmetric matrix)
np.random.seed(0)
n = 5  # dimension of the matrix
A = np.random.randn(n, n)
A = (A + A.T) / 2  # convert to a symmetric matrix

# Definition of the manifold (unit sphere)
manifold = Sphere(n)

# Objective function definition (maximisation, so the sign is reversed)
@pymanopt.function.numpy(manifold)
def cost(x):
    return -x @ A @ x

# Euclidean gradient of the objective (the numpy backend does not autodiff)
@pymanopt.function.numpy(manifold)
def euclidean_gradient(x):
    return -2 * A @ x

# Definition of the optimisation problem
problem = pymanopt.Problem(manifold, cost, euclidean_gradient=euclidean_gradient)

# Optimiser selection (steepest descent method)
optimizer = SteepestDescent()

# Perform the optimisation
result = optimizer.run(problem)
x_opt = result.point

# Display the results
print("Largest eigenvalue:", x_opt @ A @ x_opt)
print("Corresponding eigenvector:", x_opt)

Interpretation of execution results

  1. Maximum eigenvalue: the code above computes the maximum eigenvalue of the matrix A.
  2. Corresponding eigenvector: the eigenvector corresponding to the maximum eigenvalue is obtained as the optimisation result.

Key points of the code

  1. Setting up the Riemannian manifold: pymanopt.manifolds.Sphere represents the unit sphere (the set of unit-norm vectors in Euclidean space).
  2. Objective function: the eigenvalue problem seeks to maximise \( x^\top A x \), but the objective is negated because pymanopt solves minimisation problems.
  3. Optimiser: SteepestDescent was used here, but other optimisers such as TrustRegions are also available.
Reference books

Reference books on Riemannian optimisation and Riemannian geometry are listed below.

1. Fundamentals of Riemannian geometry

Riemannian Geometry
Author: Manfredo P. do Carmo
Description: a classic introduction to Riemannian geometry, well suited to students with a background in differential geometry.

Introduction to Riemannian Manifolds
Author: John M. Lee
Description: covers fundamental Riemannian geometry topics with a focus on intuition and mathematical rigour.

2. Theory and applications of Riemannian optimisation

Optimization Algorithms on Matrix Manifolds
Authors: P.-A. Absil, R. Mahony, R. Sepulchre
Description: a comprehensive treatment of the basic theory and algorithms of Riemannian optimisation, specialising in optimisation problems on matrix manifolds, with a wealth of applications.

Riemannian Optimization and Its Applications
Author: Hiroyuki Sato
Description: explains the theory of Riemannian optimisation, practical computational algorithms and application areas (e.g. machine learning, tensor decomposition).

3. Applications and numerical computation

Manifold Learning Theory and Applications
Authors: Yunqian Ma, Yun Fu
Description: explains how to apply Riemannian geometry to machine learning and data analysis.

Matrix Computations
Authors: Gene H. Golub, Charles F. Van Loan
Description: covers the numerical linear algebra required in Riemannian optimisation (matrix factorisation, eigenvalue computation, etc.).

4. Applications in machine learning and deep learning

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Authors: Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković
Description: a survey that unifies deep learning architectures through geometry and symmetry, including learning on graphs and manifolds.

Information Geometry and Its Applications
Author: Shun-ichi Amari
Description: explains the connection between information geometry and optimisation, including applications combining Riemannian manifolds and information theory.

5. Online resources

Paper: Optimization Technique on Riemannian Manifolds
Description: free lecture notes, with clear explanations from the basics to applications.

Python implementation guide
Description: the official documentation of the `pymanopt` library contains implementation examples and algorithms for Riemannian optimisation.
