Chinese Restaurant Process Overview, Algorithm and Implementation Example

Machine Learning Artificial Intelligence Digital Transformation Probabilistic Generative Models Machine Learning with Bayesian Inference Small Data Nonparametric Bayesian and Gaussian Processes python Economy and Business Physics & Mathematics Navigation of this blog

Overview of Chinese Restaurant Process

The Chinese Restaurant Process (CRP) is a stochastic model used to intuitively explain the Dirichlet Process (DP) described in “Overview of the Dirichlet Process (DP), its algorithms, and examples of implementations“. It is especially frequently used for clustering problems.

Rules of CRP: The following is a parable about a Chinese restaurant.

There is a Chinese restaurant with an infinite number of tables.
The first customer sits at one table at random.
The second and subsequent customers follow these rules
- The probability of sitting at a table where someone is already sitting is proportional to the number of people sitting at that table.
- The probability of sitting at a new table is proportional to some constant \(\alpha\).

These can be expressed in the formula as follows.

Probability of sitting at existing table \(k\):\[P(probability of sitting at existing table k)=\frac{n_k}{n+\alpha}\]\(n_k\):table, k:number of people sitting at k, n:number of all customers who have visited the restaurant so far, \(\alpha\):parameter controlling probability of sitting at new table Parameters to control the probability of sitting at a new table
Probability of sitting at a new table: \[P(probability of sitting at a new table)=\frac{\alpha}{n+\alpha}\].

Characteristics of CRP

Rich-get-richer effect: The more people at a table, the higher the probability that a new customer will sit at the table, which tends to create large clusters.
Infinite dimensionality: Since there are an infinite number of tables, there is always the possibility of adding new tables (clusters).
Relationship to Dirichlet process: CRP is a concrete representation of distributional sampling by a Dirichlet process. The CRP is used to describe the behavior of clustering in a Dirichlet process\(DP(\alpha,H\)).

Applications include the following.

Infinite mixture models: data clustering when the number of clusters is not a priori determined
Natural language processing: topic models (e.g. HDP-LDA)
Recommendation systems: dynamic assignment of user preferences to an infinite number of categories
Genetic analysis: modeling to discover unknown genotypes

Implementation Example

An example of a simple implementation of the Chinese restaurant process (CRP) using Python is described.

import random
import matplotlib.pyplot as plt

# Parameter α (controls the probability of sitting at a new table)
alpha = 1.0
n_customers = 100  # Number of guests

# List to store the number of people in each table
tables = []

# Simulation of CRP
for i in range(n_customers):
    if random.random() < alpha / (len(tables) + alpha):
        # Sitting at a new table
        tables.append(1)
    else:
        # Sit at an existing table (proportional to probability)
        table_probabilities = [count / (len(tables) + alpha) for count in tables]
        chosen_table = random.choices(range(len(tables)), weights=table_probabilities)[0]
        tables[chosen_table] += 1

# Visualize results
plt.bar(range(1, len(tables) + 1), tables)
plt.xlabel('Table (Cluster) Number')
plt.ylabel('Number of Customers')
plt.title('Chinese Restaurant Process Simulation')
plt.show()

print(f"Total tables (clusters) formed: {len(tables)}")

Code Description

Parameter settings
- alpha: Controls the probability of sitting at a new table. The higher the value, the more likely a new table will be created.
- n_customers: Number of customers (= number of data points).
CRP rules
- Probability of sitting at a new table: \(\frac{\alpha}{n+\alpha}\)
- Probability of sitting at an existing table\(k\):\(\frac{n_k}{n+\alpha}\)
Visualization: Visualize how many tables (clusters) were created in the end and check them in the bar chart.

Image of the result of the run

If \(\alpha\) is small → large tables dominate (small number of clusters)
\When the number of clusters is large → New tables tend to increase and the number of clusters also increases

Application Examples

Specific applications of the Chinese Restaurant Process (CRP) will be described.

1. clustering in a mixture model

Examples: text and image classification

Objective: To dynamically estimate clusters from data without a priori determination of the number of clusters.

Applications:

- Text classification: Discover unknown categories from text (e.g., automatic classification of news articles)
- Image classification: Grouping from unlabeled data based on image features

Effectiveness: Enables clustering based on data, so adding new categories (tables) is flexible

2. topic model (HDP-LDA)

Example: Natural Language Processing (NLP)

Objective: To automatically estimate the number of topics in a document without pre-setting the number of topics.

Application:

- Topic extraction of news articles
- Topic estimation of social media posts
- Topic analysis of word-of-mouth reviews

Effectiveness: Conventional LDA (Latent Dirichlet Allocation Method) requires manual setting of the number of topics (k), but with CRP, topics can be increased or decreased dynamically

3. genetic analysis

Example: Bioinformatics

Objective: To discover unknown gene groups by clustering gene expression data

Application:

- Discovery of unknown genotypes
- Extraction of gene patterns for each disease
  Grouping of DNA sequences

Effectiveness: New clusters (tables) are added each time a new gene pattern is discovered, allowing flexible models to be built

4. recommendation system

Example: E-commerce and video streaming services

Objective: To dynamically generate categories (tables) according to user preferences and find new preference groups.

Applications:

- Movie recommendation: Increase clusters (genres and themes) each time a new movie is introduced.
- Recommendation for e-commerce sites: Detect new purchasing trends based on user behavior

Effectiveness: No fixed number of clusters, so flexible response to new groups even if user preferences change

5. anomaly detection

Example: Cyber security

Objective: To dynamically identify new clusters (tables) of anomalous behaviors that do not belong to conventional patterns.

Applications:

- Network anomaly detection
- Detection of unauthorized access

Effectiveness: When an unknown attack method occurs, it is immediately added as a new cluster, enabling real-time detection

reference book

I will discuss reference books on the Chinese restaurant process (CRP) and related Bayesian nonparametric models.

1. Bayesian Nonparametric Statistics

2. Pattern Recognition and Machine Learning – Christopher M. Bishop

Summary: Covers the basics of probabilistic models and clustering, including CRPs and Dirichlet processes. You can also learn about graphical models and variational inference.
Recommendation: Ideal for those who want to understand the relevance of machine learning.

3. Machine Learning: A Probabilistic Perspective – Kevin P. Murphy

Summary: Explains Bayesian nonparametric methods including CRPs, Dirichlet processes, and Hierarchical Dirichlet Processes (HDP) described in “Hierarchical Dirichlet Process (HDP) Overview, Algorithm and Implementation Examples” with plenty of examples.
Recommendation: For those who want to apply the methods to practical work, as there are explanations with implementations. 4.

4. Fundamentals of Nonparametric Bayesian Inference

5. Chinese Restaurant Process, Stick Breaking Process, and Dirichlet Process Mixture Model