Logistic Regression in Python with scikit-learn

Preparation work:

1. Install Python: download and install the latest version of Python from the official website (https://www.python.org/downloads/).
2. Install scikit-learn: open a command-line window and run `pip install scikit-learn`.

Dependent libraries:

- pandas: data processing and analysis. Install with `pip install pandas`.
- NumPy: numerical calculations and array operations. Install with `pip install numpy`.
- Matplotlib: visualization. Install with `pip install matplotlib`.
- Seaborn: a data visualization library built on Matplotlib. Install with `pip install seaborn`.

Dataset introduction:

This hands-on exercise uses the Titanic passenger dataset, which contains passenger features (such as age, sex, and ticket class) and a survival label. The dataset consists of two files, the training set (train.csv) and the test set (test.csv), which can be downloaded from the Kaggle website (https://www.kaggle.com/c/titanic/data).

Sample data:

The first rows of the training set look like this:

       PassengerId  Survived  Pclass  ...     Fare  Cabin  Embarked
    0            1         0       3  ...   7.2500    NaN         S
    1            2         1       1  ...  71.2833    C85         C
    2            3         1       3  ...   7.9250    NaN         S
    3            4         1       1  ...  53.1000   C123         S
    4            5         0       3  ...   8.0500    NaN         S

The complete sample code is as follows:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Read the training set
train_data = pd.read_csv('train.csv')

# Data preprocessing: keep the label and four feature columns,
# drop rows with missing values, and encode Sex as 0/1
train_data = train_data[['Survived', 'Pclass', 'Sex', 'Age', 'Fare']]
train_data = train_data.dropna()
train_data['Sex'] = train_data['Sex'].map({'female': 0, 'male': 1})

# Separate features and labels
X = train_data[['Pclass', 'Sex', 'Age', 'Fare']]
y = train_data['Survived']

# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the logistic regression model
model = LogisticRegression()

# Train the model
model.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = model.predict(X_test)

# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

Code Description:

1. Import the required libraries and classes.
2. Use pandas to read the CSV file of the training set, then preprocess the data: select the required columns and delete rows that contain null values. At the same time, convert the values of the `Sex` column to a numerical representation (female = 0, male = 1).
3. Separate the features (`X`) from the labels (`y`).
4. Use `train_test_split` to divide the data into a training set and a test set. Here the test set is a held-out 20% of train.csv, not the separate test.csv file.
5. Create a `LogisticRegression` object, that is, a logistic regression model.
6. Train the model on the training set.
7. Use the trained model to predict labels for the test set.
8. Use `accuracy_score` to calculate the prediction accuracy.
9. Print the accuracy.

Summary:

In this hands-on exercise, the logistic regression model from the scikit-learn library was used to predict survival for Titanic passengers, and the prediction accuracy was calculated. The example shows how to use scikit-learn to build a model and make predictions for a machine learning task, and how to carry out basic data preprocessing and feature selection along the way.
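As a supplement to steps 7 and 8 of the code description, the following pure-Python sketch shows, conceptually, what a fitted binary logistic regression model does when it predicts a label, and what an accuracy score measures. It is an illustration only, not scikit-learn's implementation. The helper names (`sigmoid`, `predict_label`, `accuracy`), the weights, the bias, and the four rows are all made up for this example; they are not coefficients learned from the Titanic data.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 if the modeled probability exceeds the threshold, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(score) > threshold else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Hypothetical coefficients for [Pclass, Sex, Age, Fare]; NOT fitted values.
weights = [-1.0, -2.5, -0.03, 0.002]
bias = 4.0

# Made-up rows in the same column order, with made-up true labels.
rows = [
    [3, 1, 20.0, 8.0],
    [1, 0, 40.0, 70.0],
    [3, 0, 25.0, 8.0],
    [2, 1, 30.0, 13.0],
]
y_true = [0, 1, 1, 1]

y_pred = [predict_label(row, weights, bias) for row in rows]
print("Predictions:", y_pred)
print("Accuracy:", accuracy(y_true, y_pred))
```

In the real workflow, `model.fit` learns the weights and bias from `X_train` and `y_train`, so none of this needs to be written by hand. The sketch is only meant to show that the accuracy is the share of test rows whose predicted label equals the true label.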