Time series analysis using Prophet

Prophet Overview

Prophet is a time-series forecasting tool developed by Facebook that forecasts future values of a time series while taking into account the effects of trends, periodicity, and holidays.

Prophet is an open-source tool available for Python and R. It provides a simple API for forecasting with an additive model, in which nonlinear trend, seasonal, and holiday components are estimated and summed. It also calculates uncertainty intervals for the forecast results. As a result, Prophet is used in a variety of fields, including business, finance, meteorology, and medicine.

The main algorithms used in Prophet are described in detail below.

  • The trend component is modeled as a piecewise linear function of time (or, optionally, a saturating logistic growth curve), with changepoints at which the growth rate is allowed to shift.
  • The seasonal component is estimated with a Fourier series, which represents a periodic pattern as a sum of sine and cosine terms.
  • The holiday component is estimated from a list of holiday dates. Holidays shift the series on specific days much as seasonality does, but they do not always fall on a regular cycle, so modeling them separately allows for more accurate forecasts.

Prophet combines these components in an additive model: the trend estimate, the Fourier-series estimate of seasonality, and the holiday model are summed to produce the forecast. It works best on time series with clear seasonal patterns and enough history to estimate them.
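Written as a formula, the decomposition takes the following form, where g(t) is the trend, s(t) the seasonality, h(t) the holiday effect, and ε_t the error term. The second line is the standard Fourier-series form of a seasonal component; the symbols P (period), N (order), and the coefficients a_n, b_n are notation introduced here for illustration.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t

s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
```

For a yearly component on daily data, for example, P = 365.25, and a larger N lets the seasonal curve change more quickly at the cost of a higher risk of overfitting.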

The following are examples of Prophet applications.

  • Business: Prophet is used to forecast sales and demand. For example, in the retail industry, Prophet can be used to optimize inventory control and production planning by forecasting sales of specific products and stores.
  • Finance: Prophet is also used to forecast stock prices, exchange rates, etc. The uncertainty intervals it calculates can also be used for risk management.
  • Meteorology: Prophet is also used in weather forecasting, where taking into account the trends and periodicity of weather phenomena improves the forecasts.
  • Medical: Prophet is also used to forecast medical data. It can be used, for example, to predict the length of a patient’s hospital stay or the cost of medical care, allowing healthcare organizations to optimize their resource allocation and budgeting.
Algorithms used in Prophet

The following is a description of the main algorithms used in Prophet.

  • Trend Model: Prophet models the trend with a piecewise linear function by default, or with a logistic growth curve when the series saturates at a capacity. Changepoints let the growth rate shift, so the model can follow structural changes in the trend.
  • Seasonality Model: Prophet uses Fourier series to model seasonal patterns such as weekly and yearly cycles, as well as user-defined ones such as monthly cycles. This allows periodic fluctuations to be captured.
  • Holiday Effects: Prophet has a mechanism for incorporating special events and holidays into the model. This allows data variations during holidays to be captured.
  • Noise Model: Prophet includes an error term for noise and random variation that the other components do not explain. This is reflected in the uncertainty of the forecasts.
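To make the seasonality model more concrete, the following is a minimal sketch of how Fourier terms for a given period can be built with NumPy. The function name fourier_features and its parameters are hypothetical and are not part of Prophet's public API; the sketch only illustrates the idea of representing a periodic pattern with sine and cosine columns.

```python
import numpy as np


def fourier_features(t_days, period, order):
    """Return an (n, 2 * order) matrix of sine/cosine terms for one period."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


# Two weeks of daily time stamps, weekly period, 3 harmonics -> 6 feature columns
X = fourier_features(np.arange(14), period=7, order=3)
print(X.shape)
```

A seasonal component is then a weighted sum of these columns, with the weights estimated from the data. Rows that are exactly one period apart (here, day 0 and day 7) have the same feature values, which is what makes the fitted pattern repeat.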

Prophet’s advantages include its user-friendly interface and high flexibility. Users can build models and make forecasts based on customized trend changes, seasonality patterns, and holiday effects.

About the analysis procedure in Prophet

The general procedure for analyzing time series data using Prophet is as follows. Prophet is used here as a tool for modeling and forecasting trends, seasonality, and holiday effects.

Data Preparation:

  • Load the time series data you wish to analyze from a CSV file or other source. The data must have at least two columns, “ds” (date) and “y” (value).
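A minimal sketch of this preparation step, assuming a hypothetical CSV whose columns are named date and sales (an in-memory string stands in for the file so the example is self-contained; in practice, pass a file path to pd.read_csv):

```python
import io

import pandas as pd

# Hypothetical CSV content; the column names "date" and "sales" are assumptions
raw_csv = io.StringIO(
    "date,sales\n"
    "2023-01-01,120\n"
    "2023-01-02,135\n"
    "2023-01-03,128\n"
)

# Parse the date column as datetimes while reading
df = pd.read_csv(raw_csv, parse_dates=["date"])

# Rename to the column names Prophet expects and keep only those two columns
dataframe = df.rename(columns={"date": "ds", "sales": "y"})[["ds", "y"]]
print(dataframe.dtypes)
```

The resulting dataframe has a datetime "ds" column and a numeric "y" column, which is the shape passed to the fit method later in the procedure.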

Importing libraries:

  • Import the prophet package.
from prophet import Prophet

Model instantiation:

  • Create an instance of the Prophet model.
model = Prophet()

Customized seasonality (optional):

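  • The add_seasonality method can be used to customize the seasonality of the model. Prophet fits weekly and yearly seasonality automatically when the data supports them; defining a component under the same name, as below, overrides the built-in one with the period and Fourier order given.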
model.add_seasonality(name='weekly', period=7, fourier_order=3) 
model.add_seasonality(name='yearly', period=365.25, fourier_order=10)

Additional holiday effects (optional):

  • The add_country_holidays method can be used to incorporate the impact of a country's public holidays into the model.
model.add_country_holidays(country_name='US')
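
Special events that are not public holidays can be supplied as a table of dates. The following is a minimal sketch, assuming two hypothetical promotion dates; the column names holiday, ds, lower_window, and upper_window are the ones Prophet expects for its holidays argument, while the event name and dates are invented for illustration.

```python
import pandas as pd

# Hypothetical promotion dates; replace with events relevant to your data
promotions = pd.DataFrame({
    "holiday": "promotion",
    "ds": pd.to_datetime(["2023-03-15", "2023-11-24"]),
    "lower_window": 0,  # the effect starts on the event day
    "upper_window": 1,  # and extends to one day after it
})
print(promotions)

# With Prophet installed, the table is passed when the model is created:
# from prophet import Prophet
# model = Prophet(holidays=promotions)
```

Future occurrences of the event need to be listed in the table as well if their effect should appear in the forecast.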

Data Fitting and Prediction:

  • Fit data to the model using the fit method to learn trends and seasonality.
model.fit(dataframe)
  • Use the make_future_dataframe method to generate data for future dates.
future = model.make_future_dataframe(periods=365) # Prediction up to 365 days in advance
  • Use the predict method to perform predictions.
forecast = model.predict(future)

Visualization of results:

  • Prophet also supports visualization of results. The forecast, and the trend and seasonality components, can be plotted as follows:
fig = model.plot(forecast)
fig_components = model.plot_components(forecast)

The above is a general procedure for analyzing time series data using Prophet.
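
The steps above can be put together in one short script. The following is a sketch under stated assumptions: the synthetic series (linear trend plus a weekly cycle plus noise) and both helper function names are invented for illustration, and the Prophet import is placed inside the function only so that the data helper can be run without Prophet installed.

```python
import numpy as np
import pandas as pd


def make_synthetic_daily_series(n_days=730, seed=0):
    """Build a toy daily series: linear trend + weekly cycle + noise."""
    rng = np.random.default_rng(seed)
    ds = pd.date_range("2021-01-01", periods=n_days, freq="D")
    t = np.arange(n_days)
    weekly = 2.0 * np.sin(2.0 * np.pi * t / 7.0)
    y = 10.0 + 0.05 * t + weekly + rng.normal(0.0, 0.5, n_days)
    return pd.DataFrame({"ds": ds, "y": y})


def fit_and_forecast(dataframe, periods=30):
    """Fit Prophet and return forecasts for the history plus `periods` days."""
    from prophet import Prophet  # requires the prophet package

    model = Prophet(interval_width=0.8)  # 80% uncertainty interval
    model.fit(dataframe)
    future = model.make_future_dataframe(periods=periods)
    return model.predict(future)


# Usage (requires the prophet package):
# forecast = fit_and_forecast(make_synthetic_daily_series(), periods=30)
# print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```

In the forecast table, yhat is the point forecast and yhat_lower and yhat_upper bound the uncertainty interval.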

On the evaluation of multiple time series data

Next, we discuss evaluation when there are multiple time series. Before analyzing them, it is important to visualize the data and confirm its basic properties. This section describes how to read several CSV files, each containing a different time series, and display them together on a scatter plot for comparison.

First, install the necessary libraries. The following code example uses pandas and matplotlib.

pip install pandas matplotlib

The following example reads multiple CSV files and displays them on a scatter plot:

import glob
import os

import matplotlib.pyplot as plt
import pandas as pd

# Specify the directory containing the CSV files to be read
csv_directory = 'path_to_directory_containing_csv_files'

# Get the paths of all CSV files in the directory (sorted for a stable legend order)
csv_files = sorted(glob.glob(os.path.join(csv_directory, '*.csv')))

# Read each CSV file and add it to the scatter plot
for csv_file in csv_files:
    # Parse the timestamp column as dates so the x-axis is ordered in time
    df = pd.read_csv(csv_file, parse_dates=['timestamp_column'])

    # Get the name of the time series from the CSV file name (works on any OS)
    series_name = os.path.splitext(os.path.basename(csv_file))[0]

    # Plot the time series as a scatter plot
    plt.scatter(df['timestamp_column'], df['value_column'], label=series_name)

# Graph Settings
plt.xlabel('Timestamp')
plt.ylabel('Value')
plt.legend()
plt.title('Scatter Plot of Time Series Data')
plt.xticks(rotation=45)
plt.tight_layout()

# Display Graphs
plt.show()

This code reads all CSV files in the specified directory and plots the time-series data in each file as a scatter plot (replace timestamp_column and value_column with the actual column names in your CSV files).

When the plots show that the individual time series share the same trend, the Prophet analysis can be applied as is. If they have different trends, either cluster the time series and run a Prophet analysis on each cluster, or use a Bayesian time-series model as described in “Overview of Bayesian Structured Time Series Models and Examples of Application and Implementation” or “Overview of State Space Models and Examples of Time Series Data Analysis Using R and Python”.
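
As a rough first pass at the clustering idea, series can be grouped by the direction of their trend. The sketch below is a simple heuristic rather than a full clustering method: it labels each series by its Pearson correlation with the time index. The function names, the 0.3 threshold, and the toy series are all assumptions made for illustration.

```python
import numpy as np


def trend_correlation(values):
    """Pearson correlation between a series and its time index (0.0 if constant)."""
    y = np.asarray(values, dtype=float)
    if y.std() == 0.0:
        return 0.0
    t = np.arange(len(y), dtype=float)
    return float(np.corrcoef(t, y)[0, 1])


def group_by_trend(series_by_name, threshold=0.3):
    """Label each series 'up', 'down', or 'flat' by its correlation with time."""
    groups = {"up": [], "down": [], "flat": []}
    for name, values in series_by_name.items():
        r = trend_correlation(values)
        if r > threshold:
            groups["up"].append(name)
        elif r < -threshold:
            groups["down"].append(name)
        else:
            groups["flat"].append(name)
    return groups


# Hypothetical toy series standing in for the value column of each CSV file
series_by_name = {
    "store_a": [1, 2, 3, 4, 5],
    "store_b": [2, 4, 5, 8, 9],
    "store_c": [9, 7, 6, 3, 1],
    "store_d": [5, 5, 5, 5, 5],
}
groups = group_by_trend(series_by_name)
print(groups)
```

Each resulting group can then be analyzed with its own Prophet model. For series whose shapes differ in more than direction, a distance-based clustering method would be a more appropriate choice.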

Reference Information and Reference Books

For more details on time series data analysis, please also refer to “Time Series Data Analysis”.

Reference books:

  • “Practical Time-Series Analysis: Master Time Series Data Processing, Visualization, and Modeling using Python”
  • “Time Series Analysis Methods and Applications for Flight Data”
  • “Time series data analysis for stock indices using data mining technique with R”
  • “Time Series Data Analysis Using EViews”
  • “Practical Time Series Analysis: Prediction with Statistics and Machine Learning”
