Overview of online learning and various algorithms, application examples and specific implementations


Online Learning

Online learning is a method in which a model is updated sequentially as data arrives. Unlike batch learning, the usual setting in machine learning where a model is trained on the whole dataset at once, online learning updates the model each time new data arrives.

Online learning has the following characteristics:

Real-time: Since data arrives sequentially, the model is updated in real-time. This allows for rapid response to new information and changes.
Open world setting: In online learning, models need to adapt to unknown data. Therefore, the method must be able to adapt and continue learning as new data and classes emerge.
Memory efficiency: Online learning processes data sequentially, eliminating the need to hold the entire data set in memory. This makes it possible to process large data sets efficiently.
Model updating: In online learning, the model is updated each time new data arrives. Model updating can be done in different ways, such as fine-tuning parameters or updating weights.

Online learning is used in a variety of application areas, for example, analyzing clickstream data, personalizing online advertisements, real-time anomaly detection, and learning on mobile devices.

Online learning plays an important role in environments that change in real time and with large data sets, but it requires caution because the performance of the model may vary depending on the order and timing of data arrival.

Algorithms used for online learning

In online learning, data arrives sequentially, so the model must be updated as new data is accepted. The following is a list of algorithms commonly used in online learning.

Stochastic Gradient Descent (SGD): SGD, described in “Overview of Stochastic Gradient Descent (SGD), its algorithms and examples of implementation”, is an optimization algorithm that computes the gradient for one data point at a time and updates the parameters after each one. The basic steps of SGD are as follows:
1. Set the initial parameters of the model.
2. Randomly shuffle the training data (when a fixed training set is available; in a streaming setting, samples are processed in arrival order).
3. Perform the following steps for each training sample:
  - Make a prediction with the model using the features of the sample.
  - Calculate the error between the prediction and the correct value.
  - Use the error to update the parameters of the model. Specifically, move the parameters a small amount in the direction opposite to the gradient; the size of the step is controlled by a hyperparameter called the learning rate.
4. Repeat step 3 until all training samples have been processed. In this way, the parameters of the model are updated sequentially.
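The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a linear model with squared error; the function name `sgd_linear_regression`, the synthetic data, and the hyperparameter values are assumptions made for the example, not part of the original description.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, epochs=50, seed=0):
    """Fit y ~ X @ w + b with plain SGD on squared error (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])               # step 1: initial parameters
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(len(X))    # step 2: shuffle the training data
        for i in order:                    # step 3: one sample at a time
            pred = X[i] @ w + b            # prediction
            err = pred - y[i]              # error against the correct value
            # Gradient of 0.5 * err**2 is err * x for w and err for b;
            # move against the gradient, scaled by the learning rate.
            w -= lr * err * X[i]
            b -= lr * err
    return w, b

# Synthetic, noise-free data generated from known parameters (assumed values)
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0]) + 0.5

w, b = sgd_linear_regression(X, y)
print(w, b)
```

In a true streaming setting the inner loop would simply consume samples in arrival order, without the shuffle or the outer epoch loop.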
Adaptive Learning Rate Methods: In online learning, the distribution and characteristics of the data can change. Adaptive learning rate methods automatically adjust the learning rate so that the model responds better to such changes. Some typical adaptive learning rate methods are described below.
- AdaGrad (Adaptive Gradient Algorithm): AdaGrad adjusts the learning rate for each parameter. It accumulates the sum of squared past gradients for each parameter and divides the learning rate by the square root of that sum. As a result, the effective learning rate decreases as the number of updates increases, and parameters with sparse gradients keep a comparatively large learning rate.
- RMSprop (Root Mean Square Propagation): RMSprop mitigates AdaGrad’s shortcoming of rapid learning rate decay. Where AdaGrad accumulates the full sum of squared gradients, RMSprop keeps an exponential moving average of the squared gradients, so old gradients are gradually forgotten. This slows the decay of the learning rate and makes convergence easier.
- Adam (Adaptive Moment Estimation): Adam keeps exponential moving averages of the first moment (mean) and second moment (uncentered variance) of the gradient and uses them to adapt the learning rate. Both averages are bias-corrected to compensate for their initialization at zero, which tends to make convergence faster and more stable. Adam is widely used, especially in the field of deep learning.
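The three update rules can be compared side by side. The sketch below writes each as a single-step function and runs them on a small, badly conditioned quadratic; the function names, the test problem, and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def adagrad_step(w, g, state, lr=0.5, eps=1e-8):
    # Accumulate the full sum of squared gradients per parameter
    state["G"] = state.get("G", np.zeros_like(w)) + g ** 2
    return w - lr * g / (np.sqrt(state["G"]) + eps)

def rmsprop_step(w, g, state, lr=0.05, rho=0.9, eps=1e-8):
    # Exponential moving average of squared gradients instead of a sum
    state["v"] = rho * state.get("v", np.zeros_like(w)) + (1 - rho) * g ** 2
    return w - lr * g / (np.sqrt(state["v"]) + eps)

def adam_step(w, g, state, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    t = state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", np.zeros_like(w)) + (1 - beta1) * g
    state["v"] = beta2 * state.get("v", np.zeros_like(w)) + (1 - beta2) * g ** 2
    m_hat = state["m"] / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** t)   # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy objective (assumed): 0.5 * sum(scale * (w - target)**2)
scale = np.array([10.0, 0.1])
target = np.array([1.0, -2.0])

def loss(w):
    return 0.5 * np.sum(scale * (w - target) ** 2)

def grad(w):
    return scale * (w - target)

def run(step_fn, steps=500):
    w, state = np.zeros(2), {}
    for _ in range(steps):
        w = step_fn(w, grad(w), state)
    return w

initial_loss = loss(np.zeros(2))
final_loss = {
    "adagrad": loss(run(adagrad_step)),
    "rmsprop": loss(run(rmsprop_step)),
    "adam": loss(run(adam_step)),
}
print(initial_loss, final_loss)
```

Because each method divides the gradient by a per-parameter scale, the two coordinates make progress at similar speeds even though their curvatures differ by a factor of 100.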
Passive Aggressive Algorithm: The passive-aggressive algorithm is a supervised learning algorithm used for classification and regression tasks, in which the model is updated sequentially in the order the training data arrives. As the name suggests, it has both a “passive” and an “aggressive” behavior, and it chooses between them for each sample depending on how well the current model handles that sample. The specific steps of the algorithm are as follows:
1. Set the initial parameters of the model.
2. Process the training data one sample at a time.
3. Perform the following steps for each sample:
  - Use the sample to make a prediction with the model.
  - Calculate the error between the prediction and the correct value.
  - Use the error to decide how to update the parameters of the model.
    - Passive response: If the error is within the tolerance, leave the parameters unchanged.
    - Aggressive response: If the error exceeds the tolerance, update the parameters by the smallest amount needed to bring the error for that sample back within the tolerance.
4. Repeat step 3 until all samples have been processed.

This technique updates the model to minimize misclassification and is especially useful when the order of the data is critical.
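The passive and aggressive responses can be made concrete with a short sketch. It assumes binary labels in {-1, +1}, the hinge loss as the error measure, and the PA-I variant in which the step size is capped by an aggressiveness parameter `C`; the function name `pa_update` and the synthetic data are hypothetical.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One passive-aggressive (PA-I) step for a label y in {-1, +1}."""
    margin = y * (w @ x)
    loss = max(0.0, 1.0 - margin)      # hinge loss with margin 1
    if loss == 0.0:
        return w                       # passive: margin satisfied, no update
    tau = min(C, loss / (x @ x))       # aggressive: step size, capped by C
    return w + tau * y * x

# Synthetic, linearly separable data with a margin (assumed for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
score = X @ np.array([1.0, -1.0])
keep = np.abs(score) > 0.3
X, y = X[keep], np.sign(score[keep])

w = np.zeros(2)
for _ in range(5):                     # a few passes over the data
    for x_i, y_i in zip(X, y):
        w = pa_update(w, x_i, y_i)

accuracy = np.mean(np.sign(X @ w) == y)
print(accuracy)
```

When the cap `C` is not active, the step size `loss / (x @ x)` is exactly the amount that makes the hinge loss on the current sample zero after the update. scikit-learn provides this family of models as PassiveAggressiveClassifier and PassiveAggressiveRegressor, both of which support partial_fit.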

Batch Update: Batch update (often called mini-batch learning) is a technique that breaks the data into small batches and updates the model once per batch. The general procedure is as follows:
1. Split the training data into small batches. The batch size is selected based on processing power, memory constraints, or specific algorithm requirements.
2. Perform the following steps for each batch:
  - Use the data in the batch to make predictions with the model.
  - Calculate the error between the predictions and the correct values.
  - Use the error to update the parameters of the model. Generally, the parameters are optimized using gradient descent or one of its variants.
3. Repeat step 2 until all batches have been processed. Since the parameters are updated for each batch, the model is trained sequentially.
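The batch procedure can be sketched as follows. This is a minimal example, assuming a linear model with squared error and a gradient averaged over each batch; the function name `minibatch_gd`, the data, and the hyperparameters are assumptions for illustration.

```python
import numpy as np

def minibatch_gd(X, y, batch_size=16, lr=0.1, epochs=100):
    """Mini-batch gradient descent for a linear model (illustrative sketch)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for start in range(0, len(X), batch_size):   # step 1: split into batches
            Xb = X[start:start + batch_size]
            yb = y[start:start + batch_size]
            err = Xb @ w - yb                        # step 2: predictions and errors
            grad = Xb.T @ err / len(Xb)              # gradient averaged over the batch
            w -= lr * grad                           # one update per batch
    return w

# Synthetic, noise-free data generated from known weights (assumed values)
rng = np.random.default_rng(0)
X = rng.normal(size=(128, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = minibatch_gd(X, y)
print(w)
```

Averaging the gradient over a batch reduces the noise of each update compared with single-sample SGD, at the cost of fewer updates per pass over the data.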
Incremental Precision: Incremental Precision is a type of online learning in classification that updates the model as data arrives. This method is particularly suited for training in non-stationary environments, where models are adaptively updated as new data arrives, using a measure of incremental precision that represents the performance of the model as it is updated with new data.
Mirror Proxy Algorithm: Mirror proxies are a method for online learning in constrained optimization problems such as convex optimization problems. The method approximates the problem using a convex function, called a proxy function, that reflects the constraints and properties of the original optimization problem, and iteratively updates it. The basic mirror proxy algorithm procedure is as follows.
1. Define the objective function to be optimized.
2. Select a proxy function. Since the proxy function reflects the constraints and properties of the original optimization problem, it must be chosen appropriately for the problem.
3. Set the initial solution.
4. Perform the following steps in an iterative manner.
  - Update the solution to minimize the sum of the objective function and the proxy function.
  - Since the solution is constrained by the proxy function, adjust the solution to satisfy the appropriate constraints.
5. Repeat step 4 until convergence criteria (e.g., change in solution or decrease in objective function) are met.
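The term “mirror proxy” is not a standard name, and the steps above resemble the mirror descent family of methods (of which Mirror Prox is one variant). As an assumption about what is meant, the sketch below shows a closely related and well-known technique, online mirror descent on the probability simplex with negative entropy as the regularizing function, which is also known as the exponentiated gradient or Hedge update. It is not presented as the exact algorithm described above; the function name and the simulated losses are hypothetical.

```python
import numpy as np

def exponentiated_gradient(loss_vectors, eta=0.5):
    """Online mirror descent on the simplex with a negative-entropy regularizer."""
    n = loss_vectors.shape[1]
    p = np.full(n, 1.0 / n)            # initial solution: uniform weights
    total_loss = 0.0
    for g in loss_vectors:             # g is the gradient of the linear loss p @ g
        total_loss += p @ g            # loss suffered this round
        p = p * np.exp(-eta * g)       # mirror step: multiplicative update
        p = p / p.sum()                # renormalize to satisfy the simplex constraint
    return p, total_loss

# Simulated losses for three options; option 1 has the lowest mean loss (assumed)
rng = np.random.default_rng(0)
mean_loss = np.array([0.5, 0.2, 0.6])
losses = (rng.random(size=(500, 3)) < mean_loss).astype(float)

p, total_loss = exponentiated_gradient(losses)
best_fixed = losses.sum(axis=0).min()
print(p, total_loss, best_fixed)
```

The constraint (weights are non-negative and sum to one) is handled by the choice of regularizer: the multiplicative update keeps every weight positive, and the normalization is the only projection needed.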

These algorithms are methods for dealing with data ordering and arrival timing in online learning, and it is important to select the best algorithm for the specific task and data characteristics.

Libraries and platforms used for online learning

Various libraries and platforms are available for online learning. Some of the most representative ones are described below.

TensorFlow: TensorFlow is an open source deep learning framework developed by Google that supports online learning. By using features such as TensorFlow’s Estimator API and TFX (TensorFlow Extended), an online learning pipeline can be built.
PyTorch: PyTorch is an open source deep learning framework developed by Facebook (now Meta) that is also used for online learning. Extension packages such as PyTorch Lightning and TorchDrift can be used to implement online learning methods and processes.
scikit-learn: scikit-learn is a machine learning library available in Python that supports online learning. Estimators that provide the partial_fit method can learn from data sequentially.
Vowpal Wabbit: Vowpal Wabbit is an open source machine learning library, originally developed at Yahoo! Research and now maintained by Microsoft Research, that is designed around online learning. It is typically used for large data sets or when fast learning is needed.
Amazon SageMaker: Amazon SageMaker is a managed machine learning platform from Amazon Web Services (AWS) that also supports online learning. With SageMaker, you can create a scalable online learning environment for model training and deployment.

Case Studies in the Application of Online Learning

Online learning is a particularly useful technique in situations where data arrives sequentially, and applications include the following:

Online advertising: In online advertising, ad personalization and targeting are performed in real time based on user profiles and behavioral data. Online learning enables optimization of ad display in response to user feedback and behavioral changes.
Recommendation system: An online recommendation system makes it possible to make personalized recommendations based on user preferences and behavioral data. Online learning enables real-time updating of recommendation models based on user feedback and new data.
News Feeds: Social media and news platforms may apply online learning to provide users with individually relevant news and content. These can be used to adjust news priorities and relevance in real time based on user feedback and behavioral data.
Network Security: In network security, online learning may be used for anomaly detection and attack detection. These can be used to detect potential threats in real time based on network traffic patterns and behavior.
Financial Transaction Prediction: In stock and currency markets, online learning can be used to predict trades and prices. Predictive models are updated sequentially based on real-time market data and traders’ trading patterns.

The following describes their concrete implementation in Python.

Example implementation in Python using online learning in online advertising

The following is an example of a Python implementation using an online learning algorithm to evaluate the effectiveness of online advertising. This example describes the use of a neural network called a multilayer perceptron (MLP).

First, import the necessary libraries.

import numpy as np
from sklearn.neural_network import MLPClassifier

Next, the dataset is prepared. Here, we assume a dataset with ad features (input) and clicks (output).

# Creating a dummy data set
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]) # Feature value of ads
y = np.array([0, 0, 0, 1]) # With or without clicks

Create an instance of MLP and conduct online learning.

# MLP Instance Creation
mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000)

# online learning
mlp.partial_fit(X, y, classes=np.unique(y))

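In online learning, data is passed to the model a little at a time rather than all at once. The partial_fit method performs one update pass over whatever data it is given, so it can be called repeatedly as new portions of the dataset arrive. Here it is called once on the whole toy dataset for brevity.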

Once training is complete, new ad features can be given to predict clicks.

# New advertising feature quantities
new_X = np.array([[0, 0], [1, 1]])

# Predicting clicks
predictions = mlp.predict(new_X)

The above is an example of a Python implementation using online learning in online advertising.
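To show what passing data “a little at a time” looks like in practice, the sketch below calls partial_fit repeatedly on small batches. The `next_batch` function is a hypothetical stand-in for a stream of ad impressions, and the click rule, batch size, and learning rate are assumptions made for the example.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def next_batch(n=32):
    """Hypothetical stand-in for a stream of ad impressions (two features each)."""
    X = rng.uniform(0, 1, size=(n, 2))
    y = (X.sum(axis=1) > 1.0).astype(int)   # assumed click rule for the toy data
    return X, y

mlp = MLPClassifier(hidden_layer_sizes=(10,), learning_rate_init=0.01, random_state=0)
classes = np.array([0, 1])   # all labels must be declared on the first partial_fit call

for _ in range(300):
    X_batch, y_batch = next_batch()
    mlp.partial_fit(X_batch, y_batch, classes=classes)   # one update per arriving batch

# Evaluate on fresh data drawn from the same stream
X_test, y_test = next_batch(500)
accuracy = mlp.score(X_test, y_test)
print(accuracy)
```

Passing `classes` explicitly matters because a single small batch may not contain every label.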

Example implementation in Python with online learning in a recommendation system

An example Python implementation for using online learning in a recommendation system is shown below. This example takes the collaborative filtering setting, recommending items based on user rating data, and approximates it with a linear regressor trained online.

First, import the necessary libraries.

import numpy as np
from sklearn.linear_model import SGDRegressor

Next, we prepare the user’s rating dataset. Here, we assume a dataset with user IDs, item IDs, and rating values.

# Creating a dummy data set
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2]]) # User ID and Item ID
y = np.array([5, 3, 4, 2]) # evaluation value

Online learning with SGDRegressor.

# Instantiation of SGDRegressor
sgd = SGDRegressor()

# online learning
sgd.partial_fit(X, y)

In online learning, data is passed to the model a little at a time rather than all at once; the partial_fit method allows training with a portion of the data set. Once training is complete, ratings can be predicted for new user-item pairs, and the items with the highest predicted ratings can be recommended.

# New user-item pairs (user ID, item ID)
new_X = np.array([[1, 3], [2, 3]])

# Predicting Recommendations
predictions = sgd.predict(new_X)

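The above is an example of a Python implementation using online learning in a recommendation system. Note that feeding raw user and item IDs to a linear model treats them as numeric quantities, which is a strong simplification; in practice, IDs are usually one-hot encoded or embedded. Depending on the actual problem, it may be necessary to prepare a dataset, adjust the parameters of SGDRegressor, or select and implement an appropriate online learning algorithm when using methods other than collaborative filtering.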
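For comparison, a collaborative filtering model that learns a latent vector per user and per item can also be trained online. The sketch below is a minimal matrix factorization updated by SGD one rating at a time; the class `OnlineMF`, its method names, and its hyperparameters are hypothetical, and the ratings reuse the toy values from the example above with zero-based IDs.

```python
import numpy as np

class OnlineMF:
    """Matrix factorization updated one rating at a time (illustrative sketch)."""

    def __init__(self, n_users, n_items, k=2, lr=0.05, reg=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
        self.Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
        self.lr, self.reg = lr, reg

    def predict(self, u, i):
        return self.P[u] @ self.Q[i]

    def partial_fit(self, u, i, r):
        err = r - self.predict(u, i)
        pu = self.P[u].copy()   # keep the old user vector for the item update
        self.P[u] += self.lr * (err * self.Q[i] - self.reg * pu)
        self.Q[i] += self.lr * (err * pu - self.reg * self.Q[i])

# (user, item, rating) triples with zero-based IDs
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 1, 2.0)]

model = OnlineMF(n_users=2, n_items=2)
for _ in range(2000):               # replay the small stream many times
    for u, i, r in ratings:
        model.partial_fit(u, i, r)

max_err = max(abs(model.predict(u, i) - r) for u, i, r in ratings)
print(max_err)
```

Only the factors of the user and item involved in a rating are touched by each update, which is what makes this form convenient when ratings arrive as a stream.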

Example implementation in Python with online learning in news feeds

An example Python implementation for using online learning in a news feed is shown below. In this example, a ranking model is used to properly rank news articles based on user feedback.

First, import the necessary libraries.

import numpy as np
from sklearn.linear_model import SGDClassifier

Next, we prepare a user feedback dataset. Here we assume a dataset with user IDs, news article IDs, and feedback (clicks or non-clicks).

# Creating a dummy data set
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2]]) # User ID and News Article ID
y = np.array([1, 0, 0, 1]) # Feedback (click: 1, non-click: 0)

Online learning with SGDClassifier.

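# Instantiation of SGDClassifier (logistic regression loss)
# Note: the loss is named 'log_loss' in scikit-learn 1.1 and later; older releases used 'log'
sgd = SGDClassifier(loss='log_loss')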

# online learning
sgd.partial_fit(X, y, classes=np.unique(y))

In online learning, data is passed to the model a little at a time rather than all at once; the partial_fit method allows training with a portion of the dataset. Once training is complete, news articles can be scored for new user-article pairs.

# New user-article pairs (user ID, news article ID)
new_X = np.array([[1, 3], [2, 3]])

# Ranking Predictions
probabilities = sgd.predict_proba(new_X)

The predict_proba method returns the estimated probability of each class for every row. The probability of the click class (column 1) can be used as a score to rank the news articles. The above is an example of a Python implementation using online learning with news feeds. Depending on the actual problem, it may be necessary to prepare a dataset, adjust the parameters of SGDClassifier, etc. In addition, if a method other than the ranking model is used, it is necessary to select and implement an appropriate online learning algorithm.

Example implementation in Python with online learning in network security

An example Python implementation of the use of online learning in network security is shown below. This example uses a logistic regression model that uses online learning to classify network traffic.

First, import the necessary libraries.

import numpy as np
from sklearn.linear_model import SGDClassifier

Next, the traffic dataset is prepared. Here, we assume a dataset with network traffic features (inputs) and labels (normal or attack).

# Creating a dummy data set
X = np.array([[0.2, 0.5, 0.1], [0.3, 0.4, 0.2], [0.1, 0.7, 0.3], [0.4, 0.3, 0.5]]) # Traffic Features
y = np.array([0, 0, 1, 1]) # Label (normal: 0, attack: 1)

Online learning with SGDClassifier.

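# Instantiation of SGDClassifier (logistic regression loss)
# Note: the loss is named 'log_loss' in scikit-learn 1.1 and later; older releases used 'log'
sgd = SGDClassifier(loss='log_loss')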

# online learning
sgd.partial_fit(X, y, classes=np.unique(y))

In online learning, data is passed around a bit at a time rather than using the entire data set at once; the partial_fit method can be used to train with a portion of the data set. Once training is complete, new traffic data can be classified.

# New traffic features
new_X = np.array([[0.3, 0.2, 0.1], [0.1, 0.3, 0.5]])

# Classification Prediction
predictions = sgd.predict(new_X)

The above is an example of a Python implementation using online learning in network security. Depending on the actual problem, preparation of the data set and adjustment of the SGDClassifier parameters may be necessary. Also, when using methods other than the logistic regression model, it is necessary to select and implement an appropriate online learning algorithm.

Example implementation in Python using online learning in financial transaction forecasting

An example Python implementation of the use of online learning in financial trade forecasting is shown below. This example uses a logistic regression model trained online to predict whether the next price move is up or down from historical trading data.

First, import the necessary libraries.

import numpy as np
from sklearn.linear_model import SGDClassifier

Next, a transaction data set is prepared. Here, we assume a dataset with historical transaction data features (input) and labels for ups and downs (up: 1, down: 0).

# Creating a dummy data set
X = np.array([[0.2, 0.5, 0.1], [0.3, 0.4, 0.2], [0.1, 0.7, 0.3], [0.4, 0.3, 0.5]]) # Transaction feature value
y = np.array([1, 1, 0, 0]) # Label (up: 1, down: 0)

Online learning with SGDClassifier.

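# Instantiation of SGDClassifier (logistic regression loss)
# Note: the loss is named 'log_loss' in scikit-learn 1.1 and later; older releases used 'log'
sgd = SGDClassifier(loss='log_loss')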

# online learning
sgd.partial_fit(X, y, classes=np.unique(y))

In online learning, data is passed to the model a little at a time rather than all at once; the partial_fit method allows for learning with a portion of the data set. Once the learning is complete, features of new transactions can be used to forecast whether the price will rise or fall.

# Features of new transactions
new_X = np.array([[0.3, 0.2, 0.1], [0.1, 0.3, 0.5]])

# Prediction of rise and fall
predictions = sgd.predict(new_X)

The above is an example of a Python implementation using online learning in financial transaction forecasting. Depending on the actual problem, preparation of the data set and adjustment of the SGDClassifier parameters may be necessary. Also, when using methods other than the logistic regression model, it is necessary to select and implement an appropriate online learning algorithm.

Reference Information and Reference Books

For more information on online learning, see “About Online Learning and Online Prediction.”

Core Texts

A Modern Introduction to Online Learning – Francesco Orabona, 2019
A comprehensive introduction to online convex optimization, regret minimization, and bandit problems. It is freely available on arXiv and is one of the most cited modern references.
Prediction, Learning, and Games – Nicolo Cesa-Bianchi & Gábor Lugosi, 2006
A foundational text that systematically develops the theory of online learning, regret analysis, and connections to game theory. This is considered a “classic” in the field.

Broader Machine Learning Books (with Online Learning Sections)

Understanding Machine Learning: From Theory to Algorithms – Shai Shalev-Shwartz & Shai Ben-David, 2014
Covers a broad spectrum of ML algorithms, including sections on online learning, convexity, and stochastic gradient descent. Good balance of theory and accessibility. (Free PDF from authors)
Online Learning and Online Convex Optimization – Shai Shalev-Shwartz, 2011 (Foundations and Trends in ML monograph)
Shorter but very influential survey, excellent for focused study of OCO and regret bounds.

Applied / Complementary Perspectives

The Master Algorithm – Pedro Domingos, 2015
A popular science book, not technical, that frames machine learning as five major “tribes” (symbolists, connectionists, evolutionaries, Bayesians, and analogizers). Online learning is not itself one of the tribes, but the book is useful for conceptual grounding.
Bandit Algorithms – Tor Lattimore & Csaba Szepesvári, 2020
Focused specifically on multi-armed bandits and contextual bandits, which are core problems within online learning. Strong mathematical treatment.

For a reference book, see “Online Machine Learning.”

“Machine Learning Techniques for Online Social Networks”

“Practical Machine Learning for Streaming Data with Python: Design, Develop, and Validate Online Learning Models”

“Machine Learning for Streaming Data with Python: Rapidly build practical online machine learning solutions using River and other top key frameworks”

Deux Ex Machina

Specialized in AI system design and decision-making architecture.
Focused on externalizing decision logic using Ontology, DSL, and Behavior Trees, and building multi-agent systems.