Question: Tokenizing Text

You will process text saved to the text_data variable. The variable is of type String. This data is known as a (very small) dataset, sometimes referred to as a corpus (plural: corpora). The word comes from the Latin corpus, meaning "body" (the same root as "corpse"), so corpora refers to a body of texts (or a collection of texts).

text_data = """ Here's to the crazy ones, the misfits, the rebels, the troublemakers, the round pegs in the square holes. The ones who see things that they can change the world, are the ones who do. The quote baove is by Steve Jobs. Mr. Jobs also said: I choose a lazy person to do a hard job. Because a lazy person will find ar """

Break the above text into **paragraph tokens** (a list of paragraphs). How many paragraphs do you have?

[ ]:
# YOUR CODE IN THIS CELL
#raise NotImplementedError() # Remove this after you have started implementing your code below
number_of_paragraphs = 0
print(number_of_paragraphs)

Break the above text into sentence tokens. How many sentences do you have?

[ ]:
# YOUR CODE IN THIS CELL
raise NotImplementedError() # Remove this after you have started implementing your code below
number_of_sentences = 0

Break the above text into word tokens. How many words do you have?

[ ]:
# YOUR CODE IN THIS CELL
raise NotImplementedError() # Remove this after you have started implementing your code below
number_of_words = 0
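One possible way to approach the three cells, sketched with only Python's standard `re` module. The helper names (`paragraph_tokens`, `sentence_tokens`, `word_tokens`) and the `sample` string are hypothetical and not part of the assignment. The sketch assumes that paragraphs are separated by blank lines, that a sentence ends with `.`, `!` or `?` followed by whitespace, and that a word is a run of letters or digits that may contain internal apostrophes. The original paragraph breaks of text_data are not visible in the question as posted, so the counts for the real text have to be computed on the actual variable.

```python
import re


def paragraph_tokens(text):
    """Split on blank lines (assumed paragraph separator); drop empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def sentence_tokens(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    flat = " ".join(text.split())  # collapse newlines and repeated spaces
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def word_tokens(text):
    """Runs of word characters, keeping internal apostrophes (e.g. "Isn't")."""
    return re.findall(r"\w+(?:'\w+)*", text)


# Hypothetical stand-in for text_data, used only to illustrate the helpers.
sample = """First paragraph. It has two sentences.

Second paragraph here! Isn't it short?
"""

number_of_paragraphs = len(paragraph_tokens(sample))
number_of_sentences = len(sentence_tokens(sample))
number_of_words = len(word_tokens(sample))
print(number_of_paragraphs, number_of_sentences, number_of_words)
```

To use it on the assignment, pass text_data instead of `sample`. Note that the naive sentence rule also splits after abbreviations, so "Mr. Jobs" would be cut after "Mr."; a trained tokenizer such as NLTK's `sent_tokenize` typically handles such cases better, and NLTK's `word_tokenize` counts punctuation marks as separate tokens, so the word count depends on which definition of "word" the course expects.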