Overview of non-negative matrix factorisation (NMF) and examples of algorithms and implementations.

Machine Learning Artificial Intelligence Digital Transformation Natural Language Processing Deep Learning Information Geometric Approach to Data Mathematics Navigation of this blog

Overview of non-negative matrix factorisation (NMF)

Non-negative matrix factorisation (Non-negative Matrix Factorization, NMF) is a method that decomposes a given non-negative matrix into a product of two non-negative matrices. Specifically, the non-negative matrix ɛ(Vɛ) of a given \(m \times n\) is decomposed as follows.

\[ V \approx WH \]

Where,
– \(W\) is the non-negative matrix (basis or feature matrix) of \(m \times r\), where \(r\) denotes the number of dimensions.
– \(H\) is the non-negative matrix (coefficient matrix or weight matrix) of \(r \times n\), where \(r\) denotes the number of dimensions.

The aim of NMF is to find a \(W\) and \(H\) that approximates the original matrix \(V\) as accurately as possible, and this approximation allows the original data to be represented in terms of potential features and patterns, which can be applied to tasks such as dimensionality reduction and feature extraction.

NMF has the following main characteristics and applications.

1. interpretability: the basis matrix\(W\) and coefficient matrix\(H\) obtained by NMF can interpret the latent structure and features of the data. This facilitates data analysis and understanding.

2. dimensionality reduction: the original data can be converted to a lower dimensional representation. This enables compression and visualisation of high-dimensional data sets.

3. feature extraction: NMF can help extract meaningful features from data. This is particularly useful in the field of signal processing, such as image and audio.

4. topic modelling: used to extract topics from corpora such as textual data. In topic modelling, it is assumed that a document is represented by a combination of several topics.

Algorithms associated with non-negative matrix factorisation (NMF).

Several algorithms have been proposed to perform non-negative matrix factorisation (NMF). Typical algorithms include.

1. multiplicative update algorithms: proposed by Lee and Seung, which can be used to solve the NMF optimisation problem. The algorithm ensures that the elements of the basis matrix (W) and the coefficient matrix (H) are non-negative and maintains non-negativity in the update step. Typical methods include the Lee and Seung Algorithm (2001) and its derivative ALS-NMF (Alternating Least Squares NMF).

2. Kullback-Leibler Divergence Minimisation: This approach formulates the NMF optimisation problem in a way that minimises the Kullback-Leibler divergence described in “KL divergence constraint” and uses optimisation methods such as gradient descent to find a solution. The approach is to find the solution using the gradient descent method or other optimisation methods.

3.Projected Gradient Algorithms: formulate the NMF optimisation problem as a gradient descent method with non-negativity constraints, which ensures that the basis matrix (W) and coefficient matrix (H) are non-negative.

4.Alternating Nonnegative Least Squares: this approach solves the NMF optimisation problem using a non-negative least squares method. Alternating minimisation methods are used to alternately update the basis matrix (W) and the coefficient matrix (H).

Non-negative matrix factorisation (NMF) application examples

The following are examples of NMF applications.

1. topic modelling: the NMF is used to extract topics from textual data or document sets. It is assumed that a document is represented by a combination of several topics, where the basis matrix (W) represents the topics and the coefficient matrix (H) the combination of topics for each document. The method has been applied to tasks such as document clustering, information retrieval and recommendation systems.

2. image processing: NMF is used for feature extraction and image classification of image data. For example, in tasks such as face recognition and object detection, image data is decomposed into a basis matrix (W) and a coefficient matrix (H), where the basis matrix represents the basic features of the image and the coefficient matrix represents the combination of features in each image.

3. speech processing: NMF is used in feature extraction and speech classification of speech signals. It decomposes the speech data into a basis matrix (W) and a coefficient matrix (H), where the basis matrix represents the basic features of the speech and the coefficient matrix represents the combination of each speech feature.

4. data analysis: NMFs are used for data analysis in areas such as biology and chemistry. For example, it is applied to clustering of gene expression data, functional analysis and analysis of chemical spectral data.

5. recommendation systems: in recommendation systems, NMF decomposes matrices representing user preferences and item characteristics, with the basis matrix representing user preferences and the coefficient matrix representing item characteristics. This makes it possible to recommend items suitable for individual users.

Examples of non-negative matrix factorisation (NMF) implementations

The following is an example of implementing non-negative value matrix factorisation (NMF) using the Python scikit-learn library.

from sklearn.decomposition import NMF
import numpy as np

# Create a non-negative value matrix of the sample.
V = np.array([[1, 0, 2],
              [0, 3, 1],
              [2, 1, 0],
              [1, 2, 1]])

# Instantiate NMF and perform decomposition
nmf = NMF(n_components=2, init='random', random_state=0)
W = nmf.fit_transform(V)
H = nmf.components_

# The decomposed matrix is stored
print("basis matrix W:")
print(W)
print("coefficient matrix H:")
print(H)

The code uses the NMF class from the scikit-learn library to decompose a given non-negative matrix V. The n_components parameter specifies the number of dimensions of the basis matrix W and the coefficient matrix H. The init parameter specifies the initialisation method, with random The random_state parameter specifies the seed of the random number generator to ensure reproducibility. In addition, the decomposed basis matrix W and coefficient matrix H are output. Using these matrices, the original matrix V is approximated.

Non-negative matrix factorisation (NMF) challenges and remedies.

The following section describes the general challenges of NMF and the measures taken to address them.

1. uniqueness of solutions:

Challenge: NMF solutions are not unique. This means that different initialisations and parameter settings will yield different solutions.
Solution: it is common to try several initialisations and select the most stable solution. Regularisation methods and algorithms may also be improved to increase the stability of the solution.

2. dimension selection:

Challenge: The choice of the number of dimensions of the NMF (number of columns in the basis matrix (W) or coefficient matrix (H)) must be appropriate to the problem. An inappropriate number of dimensions may reduce the quality of the solution.
Solution: It is recommended to select the appropriate number of dimensions using methods such as cross-validation or information criterion. Another method is to search for the optimum number of dimensions by conducting the analysis with different numbers of dimensions.

3. scalability:

Challenge: computational costs and memory usage may increase when applying NMF to large datasets.
Solution: Approximate methods such as randomised NMF or streaming NMF may be used to cope with large data sets. Parallelisation methods such as distributed NMF and parallel NMF can also be useful.

4. sensitivity to noise and outliers:

Challenge: NMFs tend to be sensitive to noise and outliers. This reduces the quality of the solution.
Solution: it is important to reduce the influence of noise and outliers by using data pre-processing and denoising techniques. The development of robust NMF algorithms and the use of regularisation methods can also be useful.

Reference Information and Reference Books

For more information on optimization in machine learning, see also “Optimization for the First Time Reading Notes” “Sequential Optimization for Machine Learning” “Statistical Learning Theory” “Stochastic Optimization” etc.

Reference books include Optimization for Machine Learning

“Machine Learning, Optimization, and Data Science“

“Linear Algebra and Optimization for Machine Learning: A Textbook“