Practical Application of Scikit-learn Linear Regression in Python

Preparation and environment setup:

1. Install Python: download the version for your operating system from the official Python website (https://www.python.org/downloads/) and install it.
2. Install scikit-learn: open a command prompt and run:

    pip install -U scikit-learn

3. Install the other required libraries: this exercise also uses numpy, pandas, and matplotlib. Install them with:

    pip install numpy pandas matplotlib

Required libraries:

1. numpy: numerical computation and array operations.
2. pandas: data preprocessing and analysis.
3. matplotlib: data visualization.
4. scikit-learn: building and training machine learning models.

Dataset introduction: This exercise uses the Boston Housing dataset, a classic dataset for regression problems. It contains 506 samples, each with 13 features such as the per-capita crime rate and the average number of rooms per dwelling. The target variable is the median value of owner-occupied homes in the area, in thousands of US dollars.

Dataset download: scikit-learn versions up to 1.1 bundle the dataset and load it with load_boston(), so no separate download is needed. load_boston() was deprecated in version 1.0 and removed in version 1.2; with newer versions the raw data can be read from the StatLib archive at http://lib.stat.cmu.edu/datasets/boston.
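Before moving on, it can help to confirm that the setup steps above worked. The following is a minimal sketch, not part of the original exercise: the helper name installed_versions is hypothetical, and it only assumes that each library exposes a __version__ attribute, which numpy, pandas, matplotlib, and scikit-learn (imported as sklearn) all do.

```python
import importlib


def installed_versions(packages=("numpy", "pandas", "matplotlib", "sklearn")):
    """Return a dict mapping each import name to its version string.

    A value of None means the package could not be imported.
    (Hypothetical helper, written for this illustration only.)
    """
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" if a module has no __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

The scikit-learn version reported here matters for this exercise, because load_boston() is only available in versions below 1.2.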
Sample data and code: The following is a complete sample script for linear regression with scikit-learn:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error

    # Load the Boston Housing dataset.
    # load_boston() was removed in scikit-learn 1.2, so the raw file is read
    # from the StatLib archive instead (requires network access). Each record
    # spans two physical lines in that file, hence the slicing below.
    data_url = "http://lib.stat.cmu.edu/datasets/boston"
    raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
    data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
    target = raw_df.values[1::2, 2]
    feature_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
                     'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
    df = pd.DataFrame(data, columns=feature_names)
    df['PRICE'] = target

    # Extract the features and the target variable
    X = df.drop('PRICE', axis=1).values
    y = df['PRICE'].values

    # Split into training and test sets (80% / 20%)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Build the linear regression model
    model = LinearRegression()

    # Train the model
    model.fit(X_train, y_train)

    # Predict on the test set
    y_pred = model.predict(X_test)

    # Evaluate
    mse = mean_squared_error(y_test, y_pred)
    print('Mean Squared Error:', mse)

    # Visualize the results; the dashed line marks perfect predictions
    plt.scatter(y_test, y_pred)
    plt.plot([y.min(), y.max()], [y.min(), y.max()], '--', color='red', linewidth=2)
    plt.xlabel('True Price')
    plt.ylabel('Predicted Price')
    plt.title('Boston Housing Dataset - Linear Regression')
    plt.show()

Running this script trains the model, prints the mean squared error on the test set, and shows a scatter plot of predicted against true prices. Points close to the dashed red diagonal are accurate predictions.

Summary: This exercise showed how to use scikit-learn for linear regression, using the Boston Housing dataset to train a model, make predictions, and evaluate them. The example covers the basic workflow for building a machine learning model with scikit-learn, together with the supporting libraries for data processing and visualization.
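To make the evaluation step more concrete, the sketch below computes the mean squared error by hand, along with two related measures the script does not print: the root mean squared error (in the same units as the price) and the coefficient of determination R^2. This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical and the price values are made up.

```python
import math


def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared residuals (true value minus prediction)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot would be zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up prices in thousands of dollars, for illustration only.
y_true = [24.0, 21.6, 34.7, 33.4]
y_pred = [25.0, 20.6, 33.7, 35.4]

mse = mean_squared_error_manual(y_true, y_pred)
rmse = math.sqrt(mse)  # same units as the target variable
r2 = r_squared_manual(y_true, y_pred)
print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R^2={r2:.3f}")
```

On real data the same R^2 value is available from scikit-learn through sklearn.metrics.r2_score(y_test, y_pred) or model.score(X_test, y_test).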