Functions

Now that we’ve learnt about the basics of Python syntax, as well as how to use modules, it’s time to think about starting to modularize our own code!

A function is a way of modularizing code: given a set of inputs (or none), the same sequence of commands is executed each time the function is called.

Similarity to Mathematical Functions

Functions in Python are related to the idea of mathematical functions, e.g. f(x).

Example mathematical function:
f(x) = x + 2

f(5) = ?
Answer:
If f(x) = x + 2

Then f(5) = 5 + 2

Therefore f(5) = 7

Calling functions

We’ve already been using, or calling, functions that were defined by others since we started this workshop. The first function we called was the print function.

All functions are called in the same way:

    <FUNCTION NAME>( VALUE FOR ARGUMENT1, VALUE FOR ARGUMENT2, ...)

where ARGUMENT1, ARGUMENT2 refers to possible input arguments to the function.

In words, functions are called using the name of the function, then open parentheses, then the argument list (comma separated), and then close parentheses. The number of arguments that must be passed into the function depends on how the function was defined.
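As a small illustration of the calling pattern, the sketch below calls three built-in functions that take different numbers of arguments. The variable names are chosen only for this example.

```python
# Calling built-in functions with different numbers of arguments
length = len("hello")        # one argument
largest = max(3, 7, 5)       # several comma-separated arguments
rounded = round(3.14159, 2)  # two arguments: the value and the number of digits

print(length, largest, rounded)
```

In each case the pattern is the same: the function name, an opening parenthesis, the comma-separated arguments, and a closing parenthesis.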

Defining functions

To define a function in Python, we use the def statement:

def <FUNCTION NAME>( ARGUMENT1, ARGUMENT2, ...):

This is very similar to calling a function, except that we start the line with def and end it with a colon (:).

The number of arguments is up to us and is dependent on what inputs the function needs.

The function body follows and is indented relative to the def ... line.

If we want to return values from our function, we use the return statement:

def <FUNCTION NAME>(ARGUMENT1, ARGUMENT2,...):
    # Function body
    :
    :
    return value1, value2

# Code no longer in function definition

The function body ends when the indentation level returns to that before the def statement.
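To make the template concrete, here is a minimal sketch of a function that returns two values, followed by code at the original indentation level. The name min_and_max is made up for this illustration.

```python
def min_and_max(values):
    # Function body: indented relative to the def line
    smallest = min(values)
    largest = max(values)
    # Two comma-separated values are returned together as a tuple
    return smallest, largest

# This line is back at the original indentation, so it is outside the function
low, high = min_and_max([4, 1, 9])
print(low, high)
```

The two returned values can be unpacked into separate variables, as done here with low and high.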

So to use our maths function example of f(x) = x + 2 we could write this as a python function like this:

def f(x):
    answer = x + 2 
    return answer

or more compact:

def f(x):
    return x + 2

A function must be defined before it can be called. Every built-in function and standard library module function we’ve been using is defined somewhere.
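The sketch below shows what can happen when a function is called before its def statement has been executed. The function name double is hypothetical, and the try/except is only there so that the script keeps running after the error.

```python
# Calling a function before its def statement has run raises a NameError
try:
    result = double(4)
except NameError as error:
    result = None
    message = str(error)
    print("Could not call the function yet:", message)

def double(x):
    return 2 * x

# Now that the definition has been executed, the call works
result = double(4)
print(result)
```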

Exercise : Our first function definition

To get used to defining functions, let's start by defining a trivial function that replaces functionality that we already know.

In a new script file (exercise_function.py)

define a function called “add_numbers”
- that takes two inputs,
- and returns their sum
Print to the console the result of calling the function on 40 and 2.

Functions answer

This is a short example to familiarize ourselves with function definition:

# Our first function definition!

def add_numbers(a,b):
    return a+b

print(add_numbers(40,2))

Produces

42

Exercise : Upgrading our numerical analysis script

We have already written code that could be modularized in previous exercises. Let's upgrade the code from the exercise on “reading a data file” to a function.

In a new script file (exercise_modularization.py)

Copy and paste the code from the “reading a data file” exercise in the previous section
Modularize that code by
- Creating a function definition called “analyze_file” that
  - Takes a file path as an input
  - Opens the file and reads in the data
  - Returns the statistics
- Then call this function with the data file name as an input, and print the result to the terminal
- Verify that the result is the same as the original script

Modularization answer

There are multiple ways that we could return the statistics; below is an example where the statistics are returned as a dictionary.

They could also have been returned as a tuple, in which case we need to take note of which statistic is returned at which index!

# Load data from a text (csv) file, and calculate some stats
from math import sqrt


def analyze_file(filename):
    with open(filename) as fid:
        # Skip the first line (ie read it but don't keep the data)
        fid.readline()
        # Now read the data
        time = []
        sig = []
        for line in fid:
            t,s = line.split(',')
            time.append(int(t))
            sig.append(int(s))

    # Now calculate the stats
    sum_s = sum(sig)
    mean_s = sum_s/len(sig)

    # Sample standard deviation = sqrt ( (1/ (N-1)) * sum (x_i - <x>)^2 )
    devsq = []
    for s in sig:
        devsq.append( (s - mean_s)**2 )
    stddev = sqrt(sum(devsq)/(len(sig)-1))

    return {
        "N"      : len(sig),
        "Sum"    : sum_s,
        "Mean"   : mean_s,
        "StdDev" : stddev,
    }

# OPTIONALLY use the current script's location as the data folder, if that's where the data is
import os
ROOT = os.path.realpath(os.path.dirname(__file__))

result = analyze_file( os.path.join(ROOT,"data_exercise_reading.csv"))

print(result)

This generates the output

{'Mean': 99.576, 'StdDev': 10.048517818821722, 'Sum': 199152, 'N': 2000}
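For comparison, the tuple-returning alternative mentioned above might look like the following sketch. To keep it self-contained it works on a list of values rather than a file, and the function name analyze_values is an assumption made for this illustration.

```python
from math import sqrt

def analyze_values(sig):
    # Hypothetical variant of the statistics step that works on a list
    # and returns a tuple instead of a dictionary
    n = len(sig)
    sum_s = sum(sig)
    mean_s = sum_s / n
    devsq = [(s - mean_s) ** 2 for s in sig]
    stddev = sqrt(sum(devsq) / (n - 1))
    return n, sum_s, mean_s, stddev

stats = analyze_values([2, 4, 4, 6])

# With a tuple we have to remember that the mean is at index 2
print(stats[2])

# Unpacking into named variables avoids having to remember the indices
n, total, mean, stddev = stats
print(n, total, mean, stddev)
```

A dictionary makes each statistic self-describing, while a tuple is more compact but relies on the caller knowing the order.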

Why would we do this?

The answer is that now we can call the same functionality on any file, or more importantly, on many files.

For example, we could have 1000 files that all contain such data. The benefits of having a single function that is called on each one, instead of a script with a hard-coded input file name, are:

We don’t need to copy the script file 1000 times and change the input file name in each…
If we want to add an additional statistic… we don’t need to then update 1000 script files! We only update the one function! The same goes for if we find a bug in the code.
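A minimal sketch of this idea follows. It uses a simplified, hypothetical stand-in for analyze_file (called mean_of_file, returning only the mean) and first writes three small example files to a temporary folder so that it can run on its own. The file names and the header line are assumptions.

```python
import os
import tempfile

def mean_of_file(filename):
    # Hypothetical, simplified stand-in for analyze_file: returns only the mean
    with open(filename) as fid:
        fid.readline()  # skip the header line
        sig = [int(line.split(",")[1]) for line in fid]
    return sum(sig) / len(sig)

# Create a few small example files so that the sketch is self-contained
folder = tempfile.mkdtemp()
filenames = []
for i in range(3):
    path = os.path.join(folder, "data_%d.csv" % i)
    with open(path, "w") as fid:
        fid.write("time,signal\n")
        fid.write("0,%d\n1,%d\n" % (10 * i, 10 * i + 2))
    filenames.append(path)

# One function, called once per file
results = {}
for path in filenames:
    results[os.path.basename(path)] = mean_of_file(path)

print(results)
```

Changing how the statistic is calculated now means editing one function, however many files are processed.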

Positional and keyword arguments

“Traditionally” function arguments are positional, meaning that the value passed at the first position in a function call is assigned to the first variable in the function definition, the second to the second, and so on.

As well as these positional arguments, Python functions often accept keyword arguments. Keyword arguments are provided as <KEYWORD>=<VALUE> pairs.

In the most common case, keyword arguments are optional because they are given default values when the function is defined, while arguments without default values must always be supplied. (Python also lets you pass any argument by its name, whether or not it has a default.)
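As a rough sketch of how this looks in a definition, the hypothetical function greet below has one argument without a default value and one with a default value.

```python
def greet(name, greeting="Hello"):
    # name has no default, so a value must always be supplied
    # greeting has a default, so it may be omitted
    return greeting + ", " + name + "!"

print(greet("Ada"))                      # uses the default greeting
print(greet("Ada", greeting="Welcome"))  # overrides the default by keyword
print(greet(name="Ada"))                 # name can also be passed by keyword
```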

For example even the print function takes several keyword arguments.

If you try the following in a script

print("a", end="\n\n\n\n")
print("b")

You should see an output like

a



b

because the print function ended the first line with four newline characters instead of the default of one: the first ends the line containing a, and the other three produce blank lines.

Similarly

print("a", end="--NEXT--")
print("b")

would output

a--NEXT--b
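The print function accepts further keyword arguments besides end. The sketch below combines sep (the text placed between the values) and file (where the output is written), using an in-memory text stream so that the result can be inspected. The variable names are chosen only for this example.

```python
import io

# An in-memory text stream standing in for a file
buffer = io.StringIO()

# sep is placed between the values, end is written after the last value,
# and file selects where the output goes
print("a", "b", "c", sep="-", end="!\n", file=buffer)

captured = buffer.getvalue()
print(repr(captured))
```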

Exercise : Adding a keyword argument

In a new script file (exercise_keyword.py)

Copy and paste the function definition for “add_numbers”
Rename the function to “add_numbers2”
- Add a keyword argument in the function definition
  - called absolute
  - that defaults to False
- Add a couple of lines in the code before the sum that
  - if absolute is True, converts all inputs to their absolute values
Print to the console the result of calling the function on 40 and 2.
Print to the console the result of calling the function on 2 and -2
Print to the console the result of calling the function on 2 and -2 if you also pass in the keyword argument absolute set to True

Keyword answer

# Updated add function to allow for absolute value addition

def add_numbers2(a,b, absolute=False):
    if absolute:
        return abs(a) + abs(b)
    else:
        return a+b

print(add_numbers2(40,2))
print(add_numbers2(2,-2))
print(add_numbers2(2,-2, absolute=True))

produces

42
0
4

Documenting code

Another feature we need to start adding as our scripts grow is documentation.

All good source code contains comments that explain to other programmers, or remind the author, what the code is doing.

Comments, written using the hash symbol (also known as the pound symbol in the USA), typically appear every few lines in well-written code.

Pulling a random section of code from a standard python module:

# From : pathlib.py, line 1000
.
.
.

    def absolute(self):
        """Return an absolute version of this path.  This function works
        even if the path doesn't point to anything.

        No normalization is done, i.e. all '.' and '..' will be kept along.
        Use resolve() to get the canonical path to a file.
        """
        # XXX untested yet!
        if self._closed:
            self._raise_closed()
        if self.is_absolute():
            return self
        # FIXME this must defer to the specific flavour (and, under Windows,
        # use nt._getfullpathname())
        obj = self._from_parts([os.getcwd()] + self._parts, init=False)
        obj._init(template=self)
        return obj

    def resolve(self):
        """
        Make the path absolute, resolving all symlinks on the way and also
        normalizing it (for example turning slashes into backslashes under
        Windows).
        """
        if self._closed:

.
.
.

Ignoring the majority of what’s actually written, this section illustrates a few things regarding commenting:

Comments don’t have to appear every line, just every now-and-again to help people reading the code
Comments are also used to keep track of when things need attention, e.g. the use of FIXME above

Special multi-line strings (using three single or double quote symbols, """ ... """) are placed as the first statement in a function body to document the function. These are called docstrings, and the Python help system displays them when you call e.g. help(<FUNCTION NAME>).
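As a brief sketch, here is the earlier f(x) example with a docstring added. The docstring text is an assumption for illustration. Python stores it on the function object as the __doc__ attribute, which is what the help system displays.

```python
def f(x):
    """Return x plus two."""
    return x + 2

# The docstring is stored on the function object itself
print(f.__doc__)

# In an interactive session, help(f) would show the signature
# together with this docstring
```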

Exercise : Adding documentation

Copy and paste your previous exercise (add_numbers2) into a new script file (exercise_docstring.py) and add a docstring that explains what the function does and how to use it.

Then run python -m pydoc exercise_docstring in the terminal, from your exercise folder.

Docstring answer

In this case the docstring is enough to tell us what’s going on - we don’t need to comment every line!

# Adding a docstring

def add_numbers2(a,b, absolute=False):
    """
    A modified addition function, that optionally performs absolute value addition
    """
    if absolute:
        return abs(a) + abs(b)
    else:
        return a+b

Running python -m pydoc on the answer script (saved here as answer_docstring.py) from the terminal produces:

Help on module answer_docstring:

NAME
    answer_docstring - # Adding a docstring

FUNCTIONS
    add_numbers2(a, b, absolute=False)
        A modified addition function, that optionally performs absolute value addition

FILE
    .../answer_docstring.py

Final Exercise

Note

Don’t forget to use the good coding suggestions, including frequent commenting / documentation of your code in this exercise.

In light of revelations regarding government agency snooping, you have decided to encrypt your personal communications!

Being a budding Pythonista, and having heard of the Caesar cipher, you will now write a script (exercise_encryption.py) that

Part 1

Has an encryption function which takes
- text (string) and an offset (integer) as inputs
- converts the text to cipher-text using the given offset to generate the cipher
- outputs the cipher-text
Test your function in the if __name__ == "__main__": section of your code (check back to the module section for what this means!)
- Using any string you like, and offset 0 - you should get back the original text
- The same text with offset 1, e.g. "the cat" should become "uifadbu"
Add a decryption function that
- takes text and an offset as an input
- Generates a cipher with the given offset
- Decrypts the input text using the cipher
- Returns the decrypted text
- HINT: think about how the encryption & decryption functions work (i.e. how similar they are…) - maybe you can avoid creating a whole new decryption function ?!?

Note:

With offset 1, the cipher should become: bcde...xyz a, i.e. include space as a character, in contrast to the example on the wiki page.

I.e. the plain-text to cipher translation table for offset 1 should be:

PLAIN  : "abcd...xyz " 
CIPHER : "bcde...yz a"

Ask a demonstrator if you are unclear about this.
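One possible way to build such a translation table is to rotate the alphabet using string slicing, as in the sketch below. This is a hint rather than a full solution, and it assumes the alphabet consists of the lowercase letters followed by a space.

```python
import string

# Alphabet assumed for this exercise: lowercase letters followed by a space
alphabet = string.ascii_lowercase + " "

offset = 1
# Rotate the alphabet: everything from position `offset` onwards,
# followed by the characters that were skipped at the start
cipher = alphabet[offset:] + alphabet[:offset]

print("PLAIN  :", repr(alphabet))
print("CIPHER :", repr(cipher))
```

Each character of the plain alphabet lines up with its replacement at the same index in the cipher string.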

Part 2

Add a file-reading and writing function that
- Takes a filename and offset as inputs
- Opens and reads text in the specified file
- Calls your encryption function on the text, using the offset provided as an input
- Writes the output to a new file named <INPUT NAME>-encrypted.<INPUT EXTENSION>, i.e. with the same base name and extension as the input file
- Returns the encrypted file name
Download the plain text file from here: data_exercise_encryption.txt
Call your script with this file as an input to the previously defined encryption function (and non-zero offset!), and verify that you get a new encrypted file that contains encrypted text.
Add a file-reading function that
- Takes a filename and offset as inputs
- Opens and reads text in the specified file
- Calls your decryption function on the text, using the offset provided as an input
- Returns the resulting decrypted text
Print the return value of this function to the terminal (passing the previously encrypted file)
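For the output file name described in this part, one option is os.path.splitext from the standard library, which separates a file name from its extension. The helper name encrypted_name below is hypothetical, and this sketch covers only the naming step, not the reading or writing.

```python
import os

def encrypted_name(filename):
    # Hypothetical helper: split e.g. "notes.txt" into ("notes", ".txt")
    base, extension = os.path.splitext(filename)
    # Insert the suffix between the base name and the extension
    return base + "-encrypted" + extension

print(encrypted_name("data_exercise_encryption.txt"))
```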

Part 3: Bonus section

Now that you can encrypt text, and decrypt text that was encrypted with a known offset cipher, let’s test our script using an encrypted file where we don’t know the offset.

To do this we’re going to use a “brute-force” algorithm:

Download the encrypted text from here: data_exercise_encryption_secret.txt
Write a loop over all possible offsets
- For each offset, decrypt the text and print the final message and offset to the terminal
Skim through the decrypted texts; hopefully one and only one of the decrypted texts should make sense!
