In services.py
Write a function, process_text, that takes a string and:
- Replaces all punctuation with spaces
- Converts the string to lowercase
- Splits the string into words
- Returns a dictionary with word frequencies
Example:
>>> process_text("Hello, world! Hello again.")
{'hello': 2, 'world': 1, 'again': 1}
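One possible sketch of process_text, following the four steps listed above. It assumes "punctuation" means the ASCII characters in string.punctuation; under that assumption an apostrophe also becomes a space, so "don't" would be counted as "don" and "t".

```python
import string


def process_text(text):
    """Return a dictionary mapping each word in text to its frequency."""
    # Assumption: "punctuation" means the ASCII set in string.punctuation.
    # Build a table that maps every punctuation character to a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    # Replace punctuation, lowercase, then split on runs of whitespace.
    words = text.translate(table).lower().split()
    histogram = {}
    for word in words:
        histogram[word] = histogram.get(word, 0) + 1
    return histogram
```

collections.Counter would do the counting in one call; the plain dictionary is used here only to keep each step visible.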
In services.py
Write a function, total_words, that takes a histogram
(dictionary of words and their frequencies) and returns the total number of words,
that is, the sum of all frequencies:
Example:
>>> total_words({'hello': 2, 'world': 1, 'again': 1})
4
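A minimal sketch, assuming the histogram values are non-negative integers: the total is the sum of the frequencies.

```python
def total_words(histogram):
    """Return the total number of words counted in the histogram."""
    # Each value is how many times that word occurred, so summing
    # the values gives the overall word count.
    return sum(histogram.values())
```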
In services.py
Write a function, total_unique_words, that takes a histogram
(dictionary of words and their frequencies) and returns the total number of unique words:
Example:
>>> total_unique_words({'hello': 2, 'world': 1, 'again': 1})
3
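A minimal sketch: each unique word appears once as a key, so the number of keys is the answer.

```python
def total_unique_words(histogram):
    """Return the number of distinct words in the histogram."""
    # len() of a dictionary is its number of keys.
    return len(histogram)
```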
In services.py
Write a function, get_most_common_words, that takes a histogram
(dictionary of words and their frequencies) and returns a list of
(count, word) tuples, sorted from the most frequent to the least frequent.
In the example, words with equal counts appear in descending order by word.
Example:
>>> get_most_common_words({'again': 1, 'hello': 2, 'world': 1})
[(2, 'hello'), (1, 'world'), (1, 'again')]
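One way to get the ordering shown in the example is to sort the (count, word) tuples in descending order. This is a sketch, not the only valid approach; the tie order (descending by word) is inferred from the example output rather than stated in the task.

```python
def get_most_common_words(histogram):
    """Return (count, word) tuples sorted from most to least frequent."""
    # Put the count first so tuples compare by count, then by word.
    pairs = [(count, word) for word, count in histogram.items()]
    # reverse=True gives highest counts first; ties fall back to the
    # word, also in descending order.
    return sorted(pairs, reverse=True)
```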
In main.py
Write a function print_report(text, n=10) that prints a report with the following information:
- The top n most common words with their counts
- The total number of words
- The number of unique words
Example:
>>> text = "The house was not empty. The house had life in it! She walked in silence. The silence was deep and old."
>>> print_report(text, n=5)
Most common words:
the: 3
was: 2
silence: 2
in: 2
house: 2
Total words: 21
Unique words: 15
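A possible sketch of print_report. The four helpers are repeated inline in compact form so the block runs on its own; in the project they would presumably be imported from services.py. format_report is a hypothetical helper, not part of the task, added so that the report text can be checked without capturing printed output.

```python
import string


# Compact stand-ins for the services.py functions (assumed implementations).
def process_text(text):
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    histogram = {}
    for word in text.translate(table).lower().split():
        histogram[word] = histogram.get(word, 0) + 1
    return histogram


def total_words(histogram):
    return sum(histogram.values())


def total_unique_words(histogram):
    return len(histogram)


def get_most_common_words(histogram):
    return sorted(((count, word) for word, count in histogram.items()), reverse=True)


def format_report(text, n=10):
    """Hypothetical helper: build the report as a single string."""
    histogram = process_text(text)
    lines = ["Most common words:"]
    # Keep only the first n (count, word) pairs.
    for count, word in get_most_common_words(histogram)[:n]:
        lines.append(f"{word}: {count}")
    lines.append(f"Total words: {total_words(histogram)}")
    lines.append(f"Unique words: {total_unique_words(histogram)}")
    return "\n".join(lines)


def print_report(text, n=10):
    """Print the top n words, the total word count and the unique word count."""
    print(format_report(text, n))
```

Words with equal counts are listed in the order that get_most_common_words returns them, so a different tie-breaking rule there would change the order of the report lines.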