Using Statsmodels in Python for ARIMA Modeling, Moving Averages, Exponential Smoothing, and More

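Before using Statsmodels for ARIMA modeling, moving averages, and exponential smoothing, you need to set up a Python environment and install the required libraries. First, make sure a recent version of Python is installed; it can be downloaded from the official Python website (https://www.python.org/downloads/). Next, install the required libraries with the following commands:

```
pip install numpy
pip install pandas
pip install matplotlib
pip install statsmodels
```

Once installation finishes, you can start using Statsmodels for ARIMA modeling, moving averages, and exponential smoothing.

To demonstrate these methods, we will use the Air Passengers dataset, which the code loads from a URL. It contains monthly airline passenger counts. The complete example is shown below:

```python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Download and load the dataset
data_url = "https://raw.githubusercontent.com/ritvikmath/Time-Series-Analysis/master/air_passengers.csv"
df = pd.read_csv(data_url)

# Parse the date column, use it as the index, and declare a monthly frequency
# (assumes every observation is dated at the start of its month)
df['Month'] = pd.to_datetime(df['Month'])
df = df.set_index('Month').asfreq('MS')

# Visualize the dataset
plt.figure(figsize=(10, 6))
plt.plot(df)
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time')
plt.show()

# Fit an ARIMA(2, 1, 2) model to the series
model_arima = ARIMA(df, order=(2, 1, 2))
results_arima = model_arima.fit()

# Predict from 1960 onward, extending past the end of the data
predictions_arima = results_arima.predict(start='1960-01-01', end='1970-01-01')

# Plot the ARIMA predictions against the actual data
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(predictions_arima, label='ARIMA')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - ARIMA')
plt.legend()
plt.show()

# Smooth the series with a 12-month rolling mean
# (the first 11 values are NaN because the window is not yet full)
rolling_mean = df.rolling(window=12).mean()

# Plot the rolling mean against the actual data
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(rolling_mean, label='Rolling Mean')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - Rolling Mean')
plt.legend()
plt.show()

# Fit an additive Holt-Winters exponential smoothing model (12-month season)
model_exponential_smoothing = ExponentialSmoothing(
    df, trend='add', seasonal='add', seasonal_periods=12
)
results_exponential_smoothing = model_exponential_smoothing.fit()

# Predict over the same date range as the ARIMA model
predictions_exponential_smoothing = results_exponential_smoothing.predict(
    start='1960-01-01', end='1970-01-01'
)

# Plot the exponential smoothing predictions against the actual data
plt.figure(figsize=(10, 6))
plt.plot(df, label='Actual')
plt.plot(predictions_exponential_smoothing, label='Exponential Smoothing')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Number of Air Passengers over Time - Exponential Smoothing')
plt.legend()
plt.show()
```

In the code above, we first download the Air Passengers dataset from a URL and load it. We then parse the date column and set it as the index so that the data forms a time series, and plot the series. Next, we fit an ARIMA model and use it to predict values from January 1960 through January 1970. We also smooth the series with a 12-month rolling mean, which only smooths the observed data and produces no forecast, and we fit an exponential smoothing model that predicts over the same date range. The ARIMA predictions, the rolling mean, and the exponential smoothing predictions are each plotted against the actual data in a separate figure.

Note that the code above loads the dataset directly from the URL into memory. To use a locally stored dataset instead, comment out the following lines:

```python
# data_url = "https://raw.githubusercontent.com/ritvikmath/Time-Series-Analysis/master/air_passengers.csv"
# df = pd.read_csv(data_url)
```

Then add the following lines in their place, removing the leading `#` and changing the path to point to your local dataset:

```python
# data_path = "path/to/your/local/dataset.csv"
# df = pd.read_csv(data_path)
```

Make sure the dataset has the same format as the sample dataset, that is, one date column and one column of observed values.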
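The format requirement for a local dataset can be checked in code before any model is fitted. The following is a minimal sketch rather than part of the original example: `load_series` is a hypothetical helper, and the default column names `Month` and `Passengers` are assumptions about how your file is laid out, so pass your own names if they differ.

```python
import io

import pandas as pd


def load_series(source, date_col="Month", value_col="Passengers"):
    """Load a two-column CSV into a date-indexed numeric Series.

    `source` may be a file path or any file-like object accepted by
    pd.read_csv. The default column names are assumptions.
    """
    df = pd.read_csv(source)
    missing = [c for c in (date_col, value_col) if c not in df.columns]
    if missing:
        raise ValueError(f"missing column(s): {missing}")
    # Both conversions raise if a value cannot be parsed.
    df[date_col] = pd.to_datetime(df[date_col])
    df[value_col] = pd.to_numeric(df[value_col])
    # Index by date and make sure the observations are in time order.
    return df.set_index(date_col)[value_col].sort_index()


# A tiny in-memory CSV stands in for a local file (made-up values).
sample = io.StringIO("Month,Passengers\n2000-01,10\n2000-02,12\n2000-03,11\n")
series = load_series(sample)
print(series)
```

Failing early with a clear error tends to be easier to debug than a model that fails later on a misnamed or non-numeric column.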
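To make the two smoothing ideas concrete, here is a dependency-free sketch of what a trailing moving average and simple exponential smoothing compute. It is illustrative only: the function names, the sample numbers, and the smoothing factor `alpha=0.5` are choices made for this example. Simple exponential smoothing is the basic level-only recursion, not the additive trend-and-seasonal model that `ExponentialSmoothing(..., trend='add', seasonal='add')` fits.

```python
def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # the window is not yet full
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def simple_exp_smoothing(values, alpha):
    """Level-only recursion: s[0] = x[0], s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # initialise the level with the first value
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


data = [10, 12, 14, 16, 18]  # made-up numbers
ma = moving_average(data, window=3)
ses = simple_exp_smoothing(data, alpha=0.5)
print(ma)   # [None, None, 12.0, 14.0, 16.0]
print(ses)  # [10.0, 11.0, 12.5, 14.25, 16.125]
```

With pandas, `rolling(window=3).mean()` on the same values gives the same averages, with NaN where this sketch uses None. A smaller `alpha` weights past values more heavily and therefore smooths more aggressively.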