In natural language processing (NLP), there are diverse ways to represent words, such as one-hot encoding, bag of words, TF-IDF, and distributed word representations. In one-hot encoding, a bit vector whose length is the size of the vocabulary is created, where only the bit associated with the word is on (i.e., 1) while all other bits are off (i.e., 0). Here is a toy example: suppose there is a 5-dimensional feature vector representing a vocabulary of five words: [king, queen, man, woman, power]. In this case, 'king' is encoded as \( [1,0,0,0,0] \), 'queen' is encoded as \( [0,1,0,0,0] \), and so on. By the nature of this representation, all words in the vocabulary are equally distant from one another.

In distributed word vectors, on the other hand, a real-valued vector whose length is the number of chosen common properties of words is created; each word is then represented as a linear combination of those properties. Continuing the toy example, given a 3-dimensional feature vector with [man, woman, power] as the common properties, the words 'king', 'queen', 'man', and 'woman' could be encoded as \( [0.98, 0.1, 0.8] \), \( [0, 0.99, 0.85] \), \( [0.9, 0, 0.5] \), and \( [0, 0.97, 0.5] \), respectively. In this case, if you subtract the vector for 'man' from the vector for 'king' and add the vector for 'woman', you get a vector close to the vector for 'queen'.

i. (4 points) Sentiment analysis is the use of NLP to identify the emotions associated with a text. Sentiment analysis is commonly used to analyze product reviews and social media, among other things. Knowledge-based approaches to sentiment analysis use words such as good, bad, happy, and sad to classify a text as positive or negative. Which word representation, one-hot encoding or distributed word vectors, would be more useful for a knowledge-based approach? Briefly justify your answer.

Distributed word vectors. In a distributed representation, words with similar meanings have similar vectors, so words outside the fixed knowledge base (e.g., 'great', 'terrible') can be matched to known sentiment words such as 'good' and 'bad'. In one-hot encoding, all words are equally distant, so no such similarity can be measured.
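
To make the toy example concrete, here is a minimal sketch of both representations and the 'king' \(-\) 'man' \(+\) 'woman' \(\approx\) 'queen' analogy, using the illustrative property values given above. The cosine-similarity helper is an assumption added for the comparison; it is not part of the original example.

```python
import numpy as np

# Toy vocabulary from the example above.
vocab = ["king", "queen", "man", "woman", "power"]

# One-hot encoding: a bit vector the size of the vocabulary,
# with only the word's own bit set, so all words are equidistant.
one_hot = {word: np.eye(len(vocab))[i] for i, word in enumerate(vocab)}

# Distributed vectors over the chosen properties [man, woman, power],
# using the illustrative values from the text.
dist = {
    "king":  np.array([0.98, 0.10, 0.80]),
    "queen": np.array([0.00, 0.99, 0.85]),
    "man":   np.array([0.90, 0.00, 0.50]),
    "woman": np.array([0.00, 0.97, 0.50]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# king - man + woman lands close to queen in the distributed space.
analogy = dist["king"] - dist["man"] + dist["woman"]
print(cosine(analogy, dist["queen"]))  # ~0.996, very close to 'queen'

# The same arithmetic in one-hot space gives zero similarity,
# because distinct one-hot vectors are mutually orthogonal.
oh = one_hot["king"] - one_hot["man"] + one_hot["woman"]
print(cosine(oh, one_hot["queen"]))    # 0.0
```

The contrast in the two printed similarities is exactly the point of the exercise: distributed vectors capture semantic relatedness that one-hot vectors cannot.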