Using statsmodels in Python for ARIMA modeling, moving averages, and exponential smoothing
Before using statsmodels for ARIMA modeling, moving averages, and exponential smoothing, you need to set up a Python environment and install the required libraries.
First, make sure that a recent version of Python is installed. You can download it from the official Python website (https://www.python.org/downloads/).
Next, install the required libraries with the following commands:
pip install numpy
pip install pandas
pip install matplotlib
pip install statsmodels
After installation, you can start using statsmodels for ARIMA modeling, moving averages, and exponential smoothing.
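As a quick check that the four packages above are visible to your interpreter, the following sketch reports the installed version of each one. It relies only on the standard library module importlib.metadata (Python 3.8 or later); the helper name installed_versions is hypothetical and chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# The four packages installed with pip above.
REQUIRED = ("numpy", "pandas", "matplotlib", "statsmodels")


def installed_versions(packages=REQUIRED):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be installed again with the corresponding pip command.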
To demonstrate ARIMA modeling, moving averages, and exponential smoothing, we will use an air passengers dataset loaded from a CSV file at a URL. This dataset contains monthly airline passenger counts.
The following is the Python code to implement the complete example:
python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.statespace.sarimax import SARIMAX
# Download and load the dataset
data_url = "https://raw.githubusercontent.com/ritvikmath/Time-Series-Analysis/master/air_passengers.csv"
df = pd.read_csv(data_url)
# Parse the Month column as dates and use it as the index
df['Month'] = pd.to_datetime(df['Month'])
df.set_index('Month', inplace=True)
# Plot the raw time series
plt.figure(figsize=(10, 6))
plt.plot(df)
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time')
plt.show()
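# Fit an ARIMA(2, 1, 2) model: 2 autoregressive lags, first-order differencing, 2 moving-average terms
model_arima = ARIMA(df, order=(2, 1, 2))
results_arima = model_arima.fit()
# Predict monthly values from January 1960 through January 1970
predictions_arima = results_arima.predict(start='1960-01-01', end='1970-01-01')
# Plot the actual series against the ARIMA predictions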
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(predictions_arima, label='ARIMA')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - ARIMA')
plt.legend()
plt.show()
# Smooth the series with a 12-month rolling (moving) average
rolling_mean = df.rolling(window=12).mean()
# Plot the actual series against the rolling mean
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(rolling_mean, label='Rolling Mean')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - Rolling Mean')
plt.legend()
plt.show()
# Fit a Holt-Winters exponential smoothing model with additive trend and additive 12-month seasonality
model_exponential_smoothing = ExponentialSmoothing(df, trend='add', seasonal='add', seasonal_periods=12)
results_exponential_smoothing = model_exponential_smoothing.fit()
# Predict monthly values from January 1960 through January 1970
predictions_exponential_smoothing = results_exponential_smoothing.predict(start='1960-01-01', end='1970-01-01')
# Plot the actual series against the exponential smoothing predictions
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(predictions_exponential_smoothing, label='Exponential Smoothing')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - Exponential Smoothing')
plt.legend()
plt.show()
In the above code, we first downloaded and loaded the air passengers dataset from the URL. Then, we parsed the date column and set it as the index, so that the data is treated as a time series.
Next, we plotted the time series.
Then, we fitted an ARIMA model to the time series and used it to make predictions. We also smoothed the series with a 12-month moving average, and fitted an exponential smoothing model that we likewise used to make predictions. Note that the moving average only smooths the observed data; it does not produce forecasts.
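The middle number in order=(2, 1, 2) is the differencing order d. With d=1, the ARIMA model describes the month-to-month changes of the series rather than the raw values. The sketch below shows what first-order differencing does and how it can be reversed, using numpy on a few made-up numbers (they are not taken from the dataset). statsmodels performs this step internally, so this is only an illustration of the idea.

```python
import numpy as np

# Made-up values for illustration only (not taken from the dataset).
series = np.array([112.0, 118.0, 132.0, 129.0, 121.0])

# d=1: replace each value with its change from the previous value.
# The changes here are 6, 14, -3, -8.
diffed = np.diff(series, n=1)

# Differencing is reversible: a cumulative sum of the changes, added to the
# first original value, recovers the original series.
restored = np.concatenate(([series[0]], series[0] + np.cumsum(diffed)))

print(diffed)
print(restored)
```

The differenced series has one value fewer than the original, because the first observation has no predecessor to subtract.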
Finally, we plotted the ARIMA prediction results, moving average results, and exponential smoothing results separately.
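The arithmetic behind the two smoothing methods is simple enough to write out by hand. The following is a minimal sketch in plain Python, under two assumptions: the function names are hypothetical, and the exponential smoothing shown is the simple form with a single smoothing factor alpha, not the Holt-Winters model with trend and seasonality that the example above fits with statsmodels. The input values are made up.

```python
def rolling_mean(values, window):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def simple_exp_smoothing(values, alpha):
    """s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


data = [10.0, 20.0, 30.0, 40.0]  # made-up values
print(rolling_mean(data, window=2))            # [None, 15.0, 25.0, 35.0]
print(simple_exp_smoothing(data, alpha=0.5))   # [10.0, 15.0, 22.5, 31.25]
```

The leading None values mirror the missing values that a 12-month rolling mean produces for the first 11 months, which is why the rolling mean curve starts later than the actual series in the plot.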
Please note that the above code loads the dataset directly from the URL into memory. To use a locally stored dataset instead, first comment out the following lines of code:
python
# data_url = "https://raw.githubusercontent.com/ritvikmath/Time-Series-Analysis/master/air_passengers.csv"
# df = pd.read_csv(data_url)
Then load the data from a local file instead, replacing the path with the location of your dataset:
python
data_path = "path/to/your/local/dataset.csv"
df = pd.read_csv(data_path)
Make sure that the local dataset has the same format as the sample dataset: a date column and a column of observed values.
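One way to catch format problems early is to validate the file as it is loaded. The following is a sketch under stated assumptions: the helper name load_series is hypothetical, the date column is assumed to be called Month as in the example above, the value column name Passengers is a guess, and the sample rows are made up so that the snippet runs without a file on disk.

```python
import io

import pandas as pd


def load_series(source, date_col="Month"):
    """Read a CSV with one date column and one value column.

    `source` may be a local path, a URL, or a file-like object.
    Returns a DataFrame indexed by the parsed dates.
    """
    df = pd.read_csv(source)
    if date_col not in df.columns:
        raise ValueError(
            f"expected a date column named {date_col!r}, got {list(df.columns)}"
        )
    df[date_col] = pd.to_datetime(df[date_col])
    df = df.set_index(date_col).sort_index()
    if df.shape[1] != 1:
        raise ValueError(
            f"expected exactly one value column, got {list(df.columns)}"
        )
    return df


# Made-up sample standing in for a local file.
sample = io.StringIO("Month,Passengers\n2000-01,100\n2000-02,110\n2000-03,105\n")
df = load_series(sample)
print(df)
```

With a real file, the call would be load_series("path/to/your/local/dataset.csv"), and the rest of the example can use the returned DataFrame unchanged.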