Using NumPy in Python for array operations and functions: sorting, deduplication, sum, mean, variance, standard deviation, maximum, and minimum
1. Environment setup and library dependencies
Before using NumPy, install the library with pip:
pip install numpy
Once the installation finishes, import NumPy with the following statement:
import numpy as np
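A quick way to confirm that the installation worked is to print the installed version and build a small array. This is only a minimal smoke test; the variable name `a` is chosen for illustration.

```python
import numpy as np

# Print the installed NumPy version to confirm the import works.
print(np.__version__)

# Build a small array as a smoke test.
a = np.arange(5)
print(a)  # [0 1 2 3 4]
```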
2. Dataset and sample data preparation
To demonstrate NumPy, we use the classic Iris dataset. It contains 150 samples, each with 4 features: sepal length, sepal width, petal length, and petal width. The samples fall into three classes: Iris setosa, Iris versicolor, and Iris virginica.
The dataset is available from the UCI Machine Learning Repository. Download the file iris.data, which the sample code reads, from the following page:
Dataset download link: https://archive.ics.uci.edu/ml/datasets/Iris
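To get a feel for the file layout before downloading it, the sketch below parses a few rows written in the same comma-separated format (four numeric columns followed by the class label). The rows are embedded in a string so the example runs without the file, and the names `rows`, `features`, and `labels` are illustrative assumptions rather than part of the original example.

```python
import io

import numpy as np

# A few rows in the iris.data layout: four numeric features, then the class label.
# They stand in for the downloaded file so this sketch runs on its own.
rows = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# Read only the four numeric feature columns as floats.
features = np.genfromtxt(io.StringIO(rows), delimiter=",", usecols=(0, 1, 2, 3))

# Read the label column separately as strings.
labels = np.genfromtxt(io.StringIO(rows), delimiter=",", usecols=4, dtype=str)

print(features.shape)  # (3, 4)
print(labels)
```

Reading the numeric columns and the label column separately avoids converting strings to floats afterwards; with the real file, the `io.StringIO(rows)` argument would be replaced by the path to iris.data.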
3. Sample code implementation
The following complete example demonstrates how to use NumPy for array operations and statistical calculations:
import numpy as np
# Read the dataset; every field is loaded as a string because the last column holds the class name
data = np.genfromtxt('iris.data', delimiter=',', dtype=str)
# Select the first column (sepal length) and convert it to float
sepal_length = data[:, 0].astype(float)
# Sort in ascending order
sorted_sepal_length = np.sort(sepal_length)
# Deduplicate (np.unique also returns the values sorted)
unique_sepal_length = np.unique(sepal_length)
# Sum
sum_sepal_length = np.sum(sepal_length)
# Mean
mean_sepal_length = np.mean(sepal_length)
# Variance
var_sepal_length = np.var(sepal_length)
# Standard deviation
std_sepal_length = np.std(sepal_length)
# Maximum value
max_sepal_length = np.max(sepal_length)
# Minimum value
min_sepal_length = np.min(sepal_length)
print("Sorted sepal length:", sorted_sepal_length)
print("Sepal length after deduplication:", unique_sepal_length)
print("Sum of sepal length:", sum_sepal_length)
print("Mean sepal length:", mean_sepal_length)
print("Variance of sepal length:", var_sepal_length)
print("Standard deviation of sepal length:", std_sepal_length)
print("Maximum sepal length:", max_sepal_length)
print("Minimum sepal length:", min_sepal_length)
The code above reads the iris dataset and selects the first column (sepal length). It then uses NumPy functions to sort and deduplicate the values, computes their sum, mean, variance, standard deviation, maximum, and minimum, and prints each result.
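The same functions are easier to check by hand on a tiny array. The sketch below uses a made-up array `values` (not taken from the dataset) and also shows that `np.var` and `np.std` compute population statistics by default (`ddof=0`); passing `ddof=1` gives the sample variance instead.

```python
import numpy as np

# Small hand-checkable array (illustrative values, not from the iris dataset).
values = np.array([3.0, 1.0, 2.0, 3.0, 1.0])

print(np.sort(values))         # [1. 1. 2. 3. 3.]
print(np.unique(values))       # [1. 2. 3.]
print(np.sum(values))          # 10.0
print(np.mean(values))         # 2.0

# Squared deviations from the mean are 1, 1, 0, 1, 1, which sum to 4.
population_var = np.var(values)      # 4 / 5 = 0.8 (default ddof=0)
sample_var = np.var(values, ddof=1)  # 4 / 4 = 1.0
print(population_var, sample_var)

print(np.std(values))          # square root of the population variance
print(np.max(values), np.min(values))  # 3.0 1.0
```

Which variant is appropriate depends on whether the data is treated as the whole population or as a sample drawn from it.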
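When running the code, make sure the iris.data file is in the same directory as the script and that you run the script from that directory, because the relative path 'iris.data' is resolved against the current working directory.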
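One way to avoid depending on the current working directory is to build the path from the script's own location. The sketch below is one possible approach, not part of the original example; the helper names `dataset_path` and `load_iris` are hypothetical.

```python
from pathlib import Path

import numpy as np


def dataset_path(script_file, filename="iris.data"):
    """Return the path of `filename` in the same directory as `script_file`."""
    return Path(script_file).resolve().parent / filename


def load_iris(script_file):
    """Load iris.data from the script's directory (hypothetical helper)."""
    path = dataset_path(script_file)
    if not path.exists():
        # Fail early with a clear message instead of a generic read error.
        raise FileNotFoundError(f"Expected the dataset at {path}")
    return np.genfromtxt(path, delimiter=",", dtype=str)
```

Inside a script, this would be called as `data = load_iris(__file__)`, so the file is found regardless of the directory the script is launched from.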