Using statsmodels in Python to calculate the central tendency and statistical dispersion of data
Environment setup and preparation:
1. Install Python: go to the official website, https://www.python.org/downloads/, then download and install the Python version appropriate for your operating system.
2. Install the statsmodels library: open a command line or terminal window and run the following command. scikit-learn is included because the sample code loads the iris dataset from it:
pip install statsmodels scikit-learn
Required libraries:
- NumPy: used for numerical calculations and array operations.
- pandas: used for data processing and analysis.
- Matplotlib: used for data visualization.
- statsmodels: used for statistical analysis and modeling.
- scikit-learn: provides the iris dataset loader used in the sample code.
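Before moving on, it can help to confirm which of these libraries are actually installed. The following is a minimal sketch that uses only the standard library; the helper name installed_versions and the PACKAGES list are illustrative choices, not part of any of the libraries above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI. scikit-learn is listed because
# the sample code imports its iris loader.
PACKAGES = ["numpy", "pandas", "matplotlib", "statsmodels", "scikit-learn"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report one line per package.
for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Any package reported as "not installed" can be added with pip install followed by the package name.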
Dataset:
We will use the iris dataset. It is not bundled with statsmodels; the sample code loads it through scikit-learn's load_iris function. The dataset describes the sepal and petal sizes of three iris species (Setosa, Versicolor, and Virginica).
Sample data:
The iris dataset contains 150 samples, each with 4 feature columns (sepal length, sepal width, petal length, and petal width, all in centimeters) and 1 target column (iris species).
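To check this description against the data, the dataset can be loaded and inspected directly. This is a small sketch, assuming scikit-learn is installed; the species column name is an illustrative choice and is not part of the dataset itself.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Four numeric feature columns, named by scikit-learn.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Map the integer class codes (0, 1, 2) to the species names.
df["species"] = pd.Categorical.from_codes(iris.target, iris.target_names)

print(df.shape)                      # rows, columns (4 features + species)
print(df.head())                     # first few samples
print(df["species"].value_counts())  # number of samples per species
```

The shape should show 150 rows, and the counts should show the samples split evenly across the three species.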
The complete sample code is as follows:
python
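import pandas as pd
import statsmodels.api as sm  # imported for later modeling; not called below
from sklearn.datasets import load_iris

# Load the iris dataset into a DataFrame with one column per feature
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)

# Calculate the central tendency of each column
mean = df.mean()
median = df.median()
mode = df.mode().iloc[0]  # first row holds the smallest mode of each column

# Calculate the statistical dispersion of each column
std = df.std()  # sample standard deviation (ddof=1 by default)
var = df.var()  # sample variance (ddof=1 by default)
range_val = df.max() - df.min()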
# Print the results
print("Central tendency:")
print("Mean:")
print(mean)
print("\nMedian:")
print(median)
print("\nMode:")
print(mode)
print("\nStatistical dispersion:")
print("Standard deviation:")
print(std)
print("\nVariance:")
print(var)
print("\nRange:")
print(range_val)
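This code loads the iris dataset, calculates the central tendency (mean, median, mode) and statistical dispersion (standard deviation, variance, range) of each feature column, and prints the results. Note that the calculations themselves are performed with pandas DataFrame methods; statsmodels is imported but not called. The standard deviation and variance are the sample versions, because pandas divides by n - 1 by default.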
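For readers who want statsmodels itself to produce some of these numbers, one option is the DescrStatsW class in statsmodels.stats.weightstats, which exposes the mean, variance, and standard deviation. The sketch below is an illustration rather than the method used in the sample code, and it runs on a small made-up series so the values can be checked by hand. DescrStatsW has no mode or range, so pandas is still used for the median, mode, and range.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.weightstats import DescrStatsW

# Small made-up sample, used only for illustration.
values = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# ddof=1 divides by n - 1 (sample estimators), matching pandas' default.
sample = DescrStatsW(values.to_numpy(), ddof=1)
# ddof=0 divides by n (population estimators).
population = DescrStatsW(values.to_numpy(), ddof=0)

print("mean:", sample.mean)
print("sample variance:", sample.var)
print("sample std:", sample.std)
print("population variance:", population.var)
print("population std:", population.std)

# The sample estimators agree with pandas' Series.var() and Series.std().
print("matches pandas:", np.isclose(sample.var, values.var()),
      np.isclose(sample.std, values.std()))

# Statistics that DescrStatsW does not provide, taken from pandas.
print("median:", values.median())
print("mode:", values.mode().iloc[0])
print("range:", values.max() - values.min())
```

The choice of ddof matters mostly for small samples: with 8 values, the sample variance (sum of squared deviations divided by 7) is noticeably larger than the population variance (divided by 8).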