Implementing word vector representations in Python with spaCy

Environment setup:

1. Install Python: make sure Python is installed and its environment variables are configured. The latest version can be downloaded from the official Python website (https://www.python.org/).
2. Install spaCy: run the following commands in a terminal to install spaCy and its small English model:

```
pip install -U spacy
python -m spacy download en_core_web_sm
```

Dependencies:

- spaCy: used for natural language processing tasks such as word vector representation, tokenization, and named-entity recognition.
- en_core_web_sm: the small English pipeline, which provides pretrained components such as the part-of-speech tagger, dependency parser, and named-entity recognizer. It does not ship a static word vector table; the medium and large packages (en_core_web_md, en_core_web_lg) do.

Dataset description:

This example does not need a separate dataset. It relies only on the pretrained en_core_web_sm pipeline installed above, which is documented at:

- https://spacy.io/models/en#en_core_web_sm (hosted on the official spaCy website)

Sample data:

Here is a simple example sentence:

```python
text = "Apple is looking at buying U.K. startup for $1 billion"
```

The complete source code for the example is as follows:

```python
import spacy

# Load the pretrained English model
nlp = spacy.load("en_core_web_sm")

# Sample data
text = "Apple is looking at buying U.K. startup for $1 billion"

# Process the sample data into a Doc object
doc = nlp(text)

# Print the vector representation of each token
for token in doc:
    print(token.text, token.vector)
```

Running the code prints each token followed by its vector, in the following form (the exact values depend on the model and its version):

```
Apple [-0.38137 0.040599 0.074482 -0.57776 0.48341 -0.26461 -0.59236 -0.066073
 -0.010891 -0.094383 -0.069539 0.25162 0.73855 0.13912 0.42043 -0.54902
 -0.56434 0.21232 -0.68141 0.96178 -0.89065 0.732008 -0.23573 -0.93936
 0.050298 -0.02594 -0.020934 0.15011 -1.0197 1.2163 0.099368 -0.64603
 -0.062606 0.26472 0.11114 0.093098 -0.40547 0.3571 0.1434 -0.085883
 0.1536 0.48428 -0.52039 0.13887 -0.31745 -0.33929 -0.61664 0.27368
 0.20432 -0.4416 -0.040999 -0.027347 0.28996 -0.18815 -0.096113 0.83248
 0.54914 -0.1704 -0.27037 -0.17224 0.019674 0.789 -0.2154 0.16053
 -0.091515 -0.039549 0.22087 0.13049 0.10876 0.37265 0.43034 -0.13423
 0.23155 0.21511 0.043362 -0.22175 -0.19713 -0.74563 0.20429 0.025532
 0.078199 -0.075202 -0.82278 -0.23915 -0.15724 -0.49282 0.1163 -0.093531
 -0.029744 -0.20149 0.42157 0.17209 -0.0064405 0.067794 0.064107 -0.27358
 0.24679 0.37695 ]
is [ 5.7045e-01 7.3320e-02 -5.2481e-02 -1.7201e-01 4.0776e-01 -1.7918e-01
 -4.2566e-01 1.8259e-01 2.7822e-02 -2.7971e+00 3.4463e-02 6.6417e-01 ...
billion [-0.01302 0.81879 0.056471 -0.15816 0.72257 0.16448 -0.008338 0.17831
 -0.32181 -0.18973 -0.28154 0.51231 0.22606 -0.77945 -0.071036 0.60708
 0.6656 -0.31254 -0.23348 0.89832 -0.47187 -0.04356 0.21662 0.1938
 -0.062572 0.19025 -0.075951 -0.17935 -0.034189 1.2301 -0.95679 0.23063
 -0.001247 -0.18192 0.051463 0.19421 0.32688 0.5293 0.62802 -0.53711
 0.90128 0.060637 -0.56284 -0.14142 0.52605 0.51524 -0.012239 0.59797
 0.38654 0.093457 -0.63734 0.27735 0.25286 -0.45435 0.1695 -0.043602
 -0.43149 0.35487 0.25306 -0.41865 -0.33503 0.18154 -0.036745 -0.2862
 0.11214 -0.72551 -0.13709 -0.23442 0.040682 -0.43673 0.622 -0.18359
 0.51559 0.056149 -0.23091 1.5352 0.012769 -0.025158 0.025052 0.22902
 -0.12672 0.056884 0.14448 0.29162 0.30039 0.45316 0.027012 0.03051
 -0.28247 0.60458 -0.67007 -0.68479 ]
```

The code loads the English model with spaCy and passes the sample text to `nlp(text)`, which returns a Doc object. It then iterates over the tokens in `doc`, reads `token.vector` for each one, and prints the token text alongside its vector. Because en_core_web_sm has no static vector table, `token.vector` falls back to the pipeline's context-sensitive tensor for that token; to get pretrained word vectors, load en_core_web_md or en_core_web_lg instead.
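Printing every vector in full is hard to read, so a compact summary can be more useful when checking what a model returns. The following is a minimal sketch, not part of the original example: the helper names `cosine_similarity`, `summarize_tokens`, and `demo` are hypothetical, and only `spacy.load`, `token.text`, `token.vector`, and `token.has_vector` are spaCy's own. It reports each token's vector length and whether the model holds a static vector for it, and it compares two vectors with a plain cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros.
    """
    a = [float(x) for x in a]
    b = [float(x) for x in b]
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def summarize_tokens(doc):
    """Return (text, vector length, has_vector) for each token in a spaCy Doc."""
    return [(token.text, len(token.vector), token.has_vector) for token in doc]


def demo(model_name="en_core_web_sm"):
    """Hypothetical driver: requires spaCy and the named model to be installed."""
    import spacy  # imported here so the helpers above work without spaCy

    nlp = spacy.load(model_name)
    doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

    # One line per token instead of the full vector dump.
    for text, size, has_vector in summarize_tokens(doc):
        print(f"{text:<10} dims={size:<4} static_vector={has_vector}")

    # Compare the first and last tokens ("Apple" and "billion").
    print(cosine_similarity(doc[0].vector, doc[-1].vector))
```

Calling `demo()` runs the summary with the small model; calling `demo("en_core_web_md")` does the same with the medium model, assuming it has been downloaded. With the small model, `static_vector` is expected to be False for every token, since that package carries no static vector table.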