This exercise is available at https://study.find-santa.eu/exercises/py/exercise_4/.

4

Max. 9 points

Download hw_4.tar.gz and extract it. This archive contains a grader which works for all current versions of Python 3 and expects the solution files to be placed in the same directory. It has to be executed from this directory via

python hw4_grader.py

Add your solutions in the directory contained in the archive. Right after the shebang, each of your files must contain your name using the following template

"""
.. moduleauthor:: Your Name <your.name@example.com>
"""

We did not discuss every detail required to solve the following tasks. Use your favorite search engine and some common sense to solve the tasks.

This homework is to be prepared in teams of two students. Ask the lecturer to announce the teams so that you know who you will be working with.

  1. Telephone numbers with letters (1 point)
    In some countries it is a common practice to encode selected telephone numbers as text because they are easier to memorize that way. The letters of the text signal which button to press on a telephone keypad. See the image below as a reference:
    Source: Marnanel (via Wikimedia Commons)
    Implement a function as_numeric(text) that returns a string in which every letter of the input text is replaced by the corresponding digit. Using the function in a Python 3 shell should look like this:
    >>> as_numeric('0800 reimann')
    '0800 7346266'
    
    Hint: using a Python dictionary to store the translation table facilitates this task.
    Name the program file: telephone_numbers.py
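The dictionary hint can be illustrated with a minimal sketch. It assumes the standard letter layout of a telephone keypad (2 = abc ... 9 = wxyz), treats upper-case letters like lower-case ones, and passes every other character through unchanged; the names KEYPAD and LETTER_TO_DIGIT are made up for this illustration and are not required by the grader.

```python
# Assumed keypad layout (standard telephone keypad lettering).
KEYPAD = {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz',
}

# Invert the layout so that every letter maps to its digit.
LETTER_TO_DIGIT = {letter: digit
                   for digit, letters in KEYPAD.items()
                   for letter in letters}


def as_numeric(text):
    """Return text with every letter replaced by its keypad digit."""
    # Characters without a dictionary entry (digits, spaces, ...) are kept.
    return ''.join(LETTER_TO_DIGIT.get(char, char) for char in text.lower())
```

Building the per-letter dictionary once keeps each lookup a single dict.get call, and the second argument of get serves as the fallback for characters that need no translation.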
  2. Working on existing programs (2 points)
    Your lecturer just started riding the fake news wave. In order to illustrate how much fake news is out there, he wrote a little fake news generator. However, the generator is far from perfect.
    In order for the generator to work you need to install the faker and wikipedia packages.
    The little script deliberately makes use of a selection of libraries to illustrate the power of Python. At the same time, the example illustrates that you do not have to understand every line of a script in order to improve it. Start by playing around with the fake news generator from the command line: python3 fake_news_generator.py -h
    Your lecturer needs your help to create messages that are more credible. At the moment, the messages are given credibility by adding a "source". Let's assume female sources are more credible. Find the line that adds the name of the source and adjust it to only use female names (check the documentation of the faker module if needed).
    Furthermore, the module can use an article from Wikipedia as the source for the list of words that make up the fake news. It also contains a function that removes non-word characters from word lists, but this function is currently not applied. Make sure it is applied, but only to the Wikipedia articles, not to the carefully handpicked tweets which serve as default inputs.
    Name the program file: fake_news_generator.py
  3. Basic statistics (2 points)
    Write a few functions that compute basic statistics from financial data stored in CSV files. The input files must have column headers in their first row. Because you have to be able to deal with large amounts of data, there is no guarantee that all of the data fits into your computer's memory. To help you with this situation, you can use the provided generator function items(.), which yields one row of the data after the other when iterated over. The rows are yielded as dictionaries that use the entries of the first row as keys. The provided count(.) function gives an idea of how to use items(.). Doing this correctly for calc_median(.) is a bonus challenge. If you do not manage to implement it under the memory constraints, just implement it ignoring them.
    You need to create a series of functions that compute the required values:
    calc_max(.)
    calc_min(.)
    calc_mean(.)
    calc_stddev(.)
    calc_sum(.)
    calc_variance(.)
    calc_median(.)
    In order for the grader to work, install the numpy package via pip install --user numpy. Name the program file: statistics.py
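The memory constraint can be met for mean, variance and standard deviation by updating running values row by row instead of collecting the data in a list. The following is only a sketch under assumptions: rows(.) is a hypothetical stand-in for the provided items(.) (whose real signature is not shown here), the column name "close" is made up, and the variance is the population variance, which corresponds to numpy's default (ddof=0). Whether the grader expects exactly that is an assumption.

```python
import csv
import io
import math


def rows(csv_file):
    """Hypothetical stand-in for the provided items(.): yield rows as dicts."""
    yield from csv.DictReader(csv_file)


def running_stats(records, column):
    """Return (count, mean, population variance) of a column in one pass."""
    count, mean, m2 = 0, 0.0, 0.0
    for record in records:
        value = float(record[column])
        count += 1
        delta = value - mean
        mean += delta / count
        # Welford's update: multiply by the distance to the updated mean.
        m2 += delta * (value - mean)
    if count == 0:
        raise ValueError('no data rows')
    return count, mean, m2 / count


# Tiny in-memory example instead of a real CSV file.
data = io.StringIO('date,close\n2020-01-01,2\n2020-01-02,4\n2020-01-03,6\n')
count, mean, variance = running_stats(rows(data), 'close')
stddev = math.sqrt(variance)
```

Only three numbers are kept in memory regardless of the file size. Minimum, maximum and sum can be tracked in the same kind of loop; the median does not fit this pattern, which is why it is the bonus challenge.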
  4. Counting unique words in a file (4 points)
    In a prior 'Information Science' course at University of Graz, one of the tasks was to count how many times each word in an article occurs. To make it easier to check whether someone performed such a task correctly, write a Python program that does the work for you. You don't have to write the entire program from scratch. Instead, use the provided count_unique.py file and implement count_unique(words).
    Also implement count_unique_sorted(words) that returns a list of named tuples. The first field of each named tuple must be named 'word' and the second 'count'. The list has to contain the tuples in the same order as the words occur in the input file.
    Name the program file: unique_words.py
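Named tuples and order-preserving counting can be sketched as follows. This is an illustration under assumptions rather than the expected solution: words is taken to be an iterable of strings, count_unique is assumed to return a dictionary, "same order as the words occur" is read as the order of first occurrence, and the type name WordCount is made up.

```python
from collections import namedtuple

# Field names follow the task text; the type name is an assumption.
WordCount = namedtuple('WordCount', ['word', 'count'])


def count_unique(words):
    """Return a dictionary mapping each word to its number of occurrences."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_unique_sorted(words):
    """Return WordCount tuples ordered by the first occurrence of each word."""
    # Dictionaries keep insertion order (guaranteed since Python 3.7).
    return [WordCount(word, count)
            for word, count in count_unique(words).items()]
```

The fields of each result can then be read by name, for example entry.word and entry.count, which is easier to follow than indexing with entry[0] and entry[1].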

All resulting files must be placed in a single directory. The name of the directory must be 4_firstname_lastname (in case of team homeworks, add each member's first and last names). Make sure to also include the grader. Compress the directory to either 4_firstname_lastname.tar.gz or 4_firstname_lastname.tar.bz2 before sending it to assignments@senarclens.eu.