Boosting Ensemble Learning Practice in Python with scikit-learn
Preparation work:
1. Install Python: First, make sure Python is installed; Python 3 is recommended.
2. Install scikit-learn: scikit-learn is a machine learning library for building, training, and evaluating a wide range of machine learning models. You can install it with the pip command:
pip install scikit-learn
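To confirm the installation, you can print the installed version (a minimal sanity check; the exact version shown depends on your environment):
python
import sklearn
# Print the installed scikit-learn version to confirm the installation succeeded
print(sklearn.__version__)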
3. Dataset introduction: We use the classic Iris dataset as the sample data. It contains 150 samples, each with 4 features and 1 target variable (one of 3 classes), making it a standard classification problem.
Dataset download link: https://archive.ics.uci.edu/ml/datasets/iris
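Note that scikit-learn also bundles a copy of the Iris dataset, so a manual download is not required. As a quick sketch, you can load it and inspect its shape:
python
from sklearn.datasets import load_iris

# Load the bundled Iris dataset and inspect its dimensions
data = load_iris()
print(data.data.shape)    # (150, 4): 150 samples, 4 features
print(data.target.shape)  # (150,): one class label per sample
print(data.target_names)  # ['setosa' 'versicolor' 'virginica']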
Required libraries:
python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
The complete sample code is as follows:
python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
data = load_iris()
X = data.data    # feature matrix, shape (150, 4)
y = data.target  # class labels (3 classes)
# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build a depth-1 decision tree (a "stump") as the weak base classifier
base_classifier = DecisionTreeClassifier(max_depth=1)
# Build the AdaBoost classifier with 50 weak learners
# (the base classifier is passed positionally; in scikit-learn >= 1.2 the keyword is `estimator`,
# in earlier versions `base_estimator`)
adaboost = AdaBoostClassifier(base_classifier, n_estimators=50, learning_rate=0.1)
# Train the model
adaboost.fit(X_train, y_train)
# Predict on the test set
y_pred = adaboost.predict(X_test)
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Running the code above builds an ensemble learning model based on the AdaBoost algorithm, with a decision tree as the base classifier. The dataset is first split into a training set and a test set, then an AdaBoost classifier is constructed and trained, and finally the model's accuracy is evaluated on the test set.
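To see what boosting adds, one option (not part of the original example; the exact numbers will vary with the split and scikit-learn version) is to compare the lone decision stump against the boosted ensemble on the same data:
python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A single depth-1 tree (the weak learner) on its own
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)
print("Stump accuracy:   ", accuracy_score(y_test, stump.predict(X_test)))

# The same weak learner boosted with AdaBoost
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=50, learning_rate=0.1).fit(X_train, y_train)
print("AdaBoost accuracy:", accuracy_score(y_test, boosted.predict(X_test)))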
Summary:
Boosting is a common ensemble learning method that combines multiple weak classifiers into a single strong classifier. scikit-learn provides solid support for boosting, including the AdaBoost and Gradient Boosting algorithms; XGBoost is a separate library that offers a scikit-learn-compatible interface. In this example, we used the AdaBoostClassifier class from scikit-learn to construct an AdaBoost classifier with a decision tree as the base classifier. Preparing the data, building the model, and training and evaluating it are the basic steps for implementing boosting ensemble learning with scikit-learn.
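As a point of comparison (not covered in the walkthrough above), scikit-learn's GradientBoostingClassifier can be swapped in with essentially the same workflow; the parameter values below are illustrative choices, not tuned settings:
python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Gradient Boosting fits shallow trees sequentially, each one correcting
# the errors of the ensemble built so far
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=1)
gb.fit(X_train, y_train)
print("Gradient Boosting accuracy:", accuracy_score(y_test, gb.predict(X_test)))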