(Solved): Hints - Preprocess the contents of the file before extracting n-grams: - Convert everything to lowe ...
Hints

Preprocess the contents of the file before extracting n-grams:
- Convert everything to lowercase.
- Remove all punctuation characters (import the string module and use the variable string.punctuation; recall the palindrome example from class).
- n-grams can cross lines in the input file, so your program should remove line breaks and treat the text as one long line of words.

To get the num most frequent n-grams, you will need to sort the information in the dictionary. Python dictionaries cannot be sorted in place: they have no sort method, and although they preserve insertion order in Python 3.7 and later, that is not sorted order. The easiest way to do this is to convert the dictionary into a list of (frequency, n-gram) tuples and then sort that list.
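The hints above can be sketched roughly as follows. This is a minimal illustration, not the assignment's reference solution: the function names preprocess, count_ngrams, and most_frequent are hypothetical, the text is passed in as a string rather than read from a file (the file name is not given here), and ties between equally frequent n-grams are simply broken by the n-gram text, which the assignment may or may not want.

```python
import string


def preprocess(text):
    """Return the words of text, lowercased and with punctuation removed."""
    text = text.lower()
    # Delete every character that appears in string.punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # split() with no argument splits on any whitespace, including
    # line breaks, so the text is treated as one long line of words.
    return text.split()


def count_ngrams(words, n):
    """Return a dictionary mapping each n-gram (as a string) to its frequency."""
    counts = {}
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def most_frequent(counts, num):
    """Return the num most frequent n-grams as (frequency, n-gram) tuples."""
    # Convert the dictionary into a list of (frequency, n-gram) tuples.
    pairs = [(freq, gram) for gram, freq in counts.items()]
    # Tuples compare element by element, so this sorts by frequency first.
    # Assumption: ties fall back to the n-gram text (reverse alphabetical).
    pairs.sort(reverse=True)
    return pairs[:num]


sample = "The cat sat.\nThe cat ran, the cat sat!"
words = preprocess(sample)
print(most_frequent(count_ngrams(words, 2), 2))
# prints [(3, 'the cat'), (2, 'cat sat')]
```

Putting the frequency first in each tuple is what lets a plain sort order the list by count. Note that the bigram "sat the" in the sample spans the line break, which is the cross-line behavior the hints ask for.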