Using statsmodels in Python for multivariate data analysis: principal component analysis and factor analysis
Preparation:
1. Install the statsmodels library: enter 'pip install statsmodels' on the command line.
2. Prepare a multidimensional dataset: We use the Iris dataset as the example for principal component analysis and factor analysis. The Iris dataset is commonly used for classification and clustering problems. It consists of 150 samples, each with 4 features (sepal length, sepal width, petal length, petal width), and each sample belongs to one of three iris species (Setosa, Versicolor, Virginica). You can download the dataset from the following link: https://archive.ics.uci.edu/ml/datasets/iris .
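The raw data file distributed by the UCI repository has no header row: each line holds four comma-separated measurements followed by the species label. The following is a minimal sketch of loading data in that layout with pandas. The column names are an assumption chosen to match the code in this article, and an in-memory sample of three rows stands in for the real file path.

```python
import io
import pandas as pd

# Column names are an assumption chosen to match the code in this article;
# the raw UCI file itself carries no header row.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# Three rows in the layout of the UCI file, used here instead of a real path.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# header=None tells pandas that the first line is data, not column names.
data = pd.read_csv(sample, header=None, names=COLUMNS)

# Keep only the four numeric feature columns for the analysis.
features = data[COLUMNS[:4]]
print(features.shape)
```

To use a downloaded file, pass its path to 'pd.read_csv()' in place of the in-memory sample.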
Code example:
import pandas as pd
from statsmodels.multivariate.pca import PCA
from statsmodels.multivariate.factor import Factor
# Load the dataset
data = pd.read_csv('iris.csv')
# Data preprocessing
# Select the four numeric features to analyze
X = data[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
# Standardize each feature (no constant column is added: a constant has zero variance and cannot be standardized)
X_scaled = (X - X.mean()) / X.std()
# Principal component analysis (computed when the object is created; there is no fit() step)
pca = PCA(X_scaled)
# Output principal component analysis results
print(pca.eigenvals)  # eigenvalue of each component
print(pca.loadings)   # loadings of each feature on each component
print(pca.rsquare)    # cumulative R-squared by number of components
# Factor analysis with two factors
fa = Factor(X_scaled, n_factor=2)
fa_result = fa.fit()
# Output factor analysis results
print(fa_result.summary())
In the example code above, we first load the Iris dataset with 'pd.read_csv()' and select the four feature columns. We then standardize the features by subtracting each column's mean and dividing by its standard deviation. No constant column is added: 'sm.add_constant()' is intended for regression design matrices, and a constant column has zero variance, so standardizing it would produce NaN values. Next, we create a principal component analysis object with 'PCA()'. statsmodels computes the decomposition when the object is constructed, so there is no separate 'fit()' call and no 'summary()' method; the results are read from attributes such as 'eigenvals', 'loadings', and 'rsquare'. Finally, we create a factor analysis object with 'Factor()', requesting two factors through the 'n_factor' argument, and call 'fit()' to obtain a results object whose 'summary()' we print.
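Be sure to replace 'iris.csv' in the code with the path to the Iris dataset file you downloaded. The code expects the file to have a header row with the column names 'sepal_length', 'sepal_width', 'petal_length', and 'petal_width'.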
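To illustrate what principal component analysis computes on standardized features, here is a small NumPy-only sketch. It is not the statsmodels implementation: it standardizes a feature matrix, takes the eigendecomposition of the resulting correlation matrix, and projects the data onto the first two components. The data is synthetic (a random stand-in for a 4-feature matrix, not the Iris measurements), and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature matrix (100 samples, 4 features); not the Iris data.
base = rng.normal(size=(100, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=100),
    base[:, 1],
    base[:, 1] + 0.1 * rng.normal(size=100),
])

# Standardize each column to zero mean and unit sample variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix of the features.
R = (Z.T @ Z) / (Z.shape[0] - 1)

# eigh returns eigenvalues in ascending order, so reorder to largest first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of total variance carried by each component.
explained_ratio = eigvals / eigvals.sum()

# Scores: the standardized data projected onto the first two components.
scores = Z @ eigvecs[:, :2]
print(explained_ratio)
```

Because the features are standardized, the eigenvalues sum to the number of features, and the sample variance of each score column equals the corresponding eigenvalue.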