Now that we’ve learnt about the basics of Python syntax, as well as how to use modules, it’s time to think about starting to modularize our own code!
A function is a way of modularizing code: given a set of inputs (or none), the same set of commands is executed each time the function is called.
Similarity to Mathematical Functions
Functions are related to the idea of mathematical functions, e.g. f(x).
Example mathematical function:
f(x) = x + 2
f(5) = ?
Answer:
If f(x) = x + 2
Then f(5) = 5 + 2
Therefore f(5) = 7
We’ve already been using, or calling, functions that were defined by others since we started this workshop.
The first function we called was the print function.
All functions are called in the same way:
<FUNCTION NAME>( VALUE FOR ARGUMENT1, VALUE FOR ARGUMENT2, ...)
where ARGUMENT1, ARGUMENT2 refer to possible input arguments to the function.
In words, functions are called using the name of the function, then open parentheses, then the argument list (comma separated), and then close parentheses. The number of arguments that must be passed into the function depends on how the function was defined.
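As a small illustration of this calling pattern, the built-in functions below (len, max and print) are each called with a different number of arguments. The values passed in are arbitrary examples chosen for this sketch.

```python
# Calling built-in functions: name, open parenthesis,
# comma-separated arguments, close parenthesis
length = len("hello")      # one argument
largest = max(3, 7, 5)     # three arguments
print()                    # no arguments: prints an empty line
print(length, largest)     # two arguments
```

How many arguments each call accepts was decided by whoever defined the function, not by the caller.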
To define a function in Python, we use the def statement:
def <FUNCTION NAME>( ARGUMENT1, ARGUMENT2, ...):
This is very similar to calling a function, except that we start the line with def and end it with a colon, :.
The number of arguments is up to us and depends on what inputs the function needs.
The function body follows and is indented relative to the def ... line.
If we want to return values from our function, we use the return statement:
def <FUNCTION NAME>(ARGUMENT1, ARGUMENT2,...):
    # Function body
    :
    :
    return value1, value2
# Code no longer in function definition
The function body ends when the indentation level returns to that before the def statement.
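A minimal sketch of returning more than one value, following the return value1, value2 pattern. The function name min_and_max and its input list are made up for illustration. Python packs the returned values into a tuple, which the caller can keep whole or unpack.

```python
def min_and_max(values):
    # Return the smallest and largest items of a non-empty list
    smallest = min(values)
    largest = max(values)
    return smallest, largest

# Code no longer in function definition
result = min_and_max([4, 1, 9])     # a single tuple holding both values
low, high = min_and_max([4, 1, 9])  # or unpack into two variables
print(result, low, high)
```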
So to use our maths function example of f(x) = x + 2, we could write this as a Python function like this:
def f(x):
    answer = x + 2
    return answer
or, more compactly:
def f(x):
    return x + 2
A function must be defined before it can be called. Every built-in function and standard library module function we’ve been using is defined somewhere.
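The ordering rule can be seen in a short sketch: calling f before its def statement has run raises a NameError, while the same call succeeds afterwards. The try/except block and the error_seen variable are only there so that this illustration runs to completion.

```python
error_seen = False

# Calling a function before its def statement has run raises a NameError
try:
    print(f(5))
except NameError as err:
    error_seen = True
    print("Not defined yet:", err)

def f(x):
    return x + 2

# After the definition has been executed, the call works
print(f(5))
```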
To get used to defining functions, let’s start by defining a trivial function that replaces functionality that we already know.
In a new script file (exercise_function.py)
This is a short example to familiarize ourselves with function definition;
# Our first function definition!
def add_numbers(a, b):
    return a + b
print(add_numbers(40, 2))
Produces
42
We have already written code that could be modularized in previous exercises. Let’s upgrade the code from the exercise on “reading a data file” to a function.
In a new script file (exercise_modularization.py)
There are multiple ways that we could return the statistics; below is an example where the statistics are returned as a dictionary.
They could also have been returned as a tuple, in which case we need to take note of which statistic is returned at which index!
# Load data from a text (csv) file, and calculate some stats
from math import sqrt
def analyze_file(filename):
    with open(filename) as fid:
        # Skip the first line (i.e. read it but don't keep the data)
        fid.readline()
        # Now read the data
        time = []
        sig = []
        for line in fid:
            t, s = line.split(',')
            time.append(int(t))
            sig.append(int(s))
    # Now calculate the stats
    sum_s = sum(sig)
    mean_s = sum_s / len(sig)
    # Sample standard deviation = sqrt( (1/(N-1)) * sum( (x_i - <x>)^2 ) )
    devsq = []
    for s in sig:
        devsq.append((s - mean_s)**2)
    stddev = sqrt(sum(devsq) / (len(sig) - 1))
    return {
        "N": len(sig),
        "Sum": sum_s,
        "Mean": mean_s,
        "StdDev": stddev,
    }
# OPTIONALLY use the current script's location as the data folder, if that's where the data is
import os
ROOT = os.path.realpath(os.path.dirname(__file__))
result = analyze_file(os.path.join(ROOT, "data_exercise_reading.csv"))
print(result)
This generates the output
{'Mean': 99.576, 'StdDev': 10.048517818821722, 'Sum': 199152, 'N': 2000}
Why would we do this?
The answer is that now we can call the same functionality on any file, or more importantly, on many files.
For example, we could have 1000 files that all contain such data; a single function that is called on each one is easier to reuse and maintain than a script with a hard-coded input file name.
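As a sketch of calling one function on several files, the example below writes three small CSV files into a temporary folder and then processes each of them in a loop. The file names and contents are made up for illustration, and mean_of_file is a simplified, hypothetical stand-in for analyze_file.

```python
import os
import tempfile

def mean_of_file(filename):
    # Simplified stand-in for analyze_file: mean of the second column
    sig = []
    with open(filename) as fid:
        fid.readline()  # skip the header line
        for line in fid:
            t, s = line.split(',')
            sig.append(int(s))
    return sum(sig) / len(sig)

# Create three small example files in a temporary folder (made-up data)
folder = tempfile.mkdtemp()
filenames = []
for i in range(3):
    name = os.path.join(folder, "data_%d.csv" % i)
    with open(name, "w") as fid:
        fid.write("time,signal\n")
        for t in range(4):
            fid.write("%d,%d\n" % (t, t + i))
    filenames.append(name)

# The same function is called on every file;
# no file name is hard-coded inside the function
means = []
for name in filenames:
    means.append(mean_of_file(name))
print(means)
```

Adding a fourth or a thousandth file only changes the list of file names, not the function.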
“Traditionally”, function arguments are positional, meaning that the value passed into a function call at the first position is assigned to the first variable in the function definition, the second to the second, and so on.
As well as these positional arguments, Python functions often accept keyword arguments. Keyword arguments are provided as <KEYWORD>=<VALUE> pairs.
Keyword arguments are typically optional, as they are given default values when the function is defined, while the positional arguments we have used so far are required and do not have default values.
For example, even the print function takes several keyword arguments.
If you try the following in a script
print("a", end="\n\n\n\n")
print("b")
You should see an output like
a



b
because the print function used four newline characters instead of the default of one newline character.
Similarly
print("a", end="--NEXT--")
print("b")
would output
a--NEXT--b
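The same mechanism is available in functions we define ourselves. In the sketch below, the function describe and its parameter names are hypothetical: name is a required positional argument, while greeting and punctuation are keyword arguments with default values.

```python
def describe(name, greeting="Hello", punctuation="!"):
    # name is positional and required;
    # greeting and punctuation fall back to their defaults if omitted
    return greeting + ", " + name + punctuation

default_text = describe("Ada")                          # defaults used
goodbye_text = describe("Ada", greeting="Goodbye")      # one keyword overridden
swapped_text = describe("Ada", punctuation="?", greeting="Hi")  # keyword order is free

print(default_text)
print(goodbye_text)
print(swapped_text)
```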
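In a new script file (exercise_keyword.py), create an updated version of the add function that accepts a keyword argument absolute which, if absolute is True, converts all inputs to their absolute values before adding them. Test the function with and without absolute set to True.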
# Updated add function to allow for absolute value addition
def add_numbers2(a, b, absolute=False):
    if absolute:
        return abs(a) + abs(b)
    else:
        return a + b
print(add_numbers2(40, 2))
print(add_numbers2(-2, 2))
print(add_numbers2(-2, 2, absolute=True))
produces
42
0
4
Another feature we need to start adding as our scripts grow is documentation.
Comments using the hash symbol (aka the pound symbol if you come from the USA) typically appear every few lines in well-written code.
Pulling a random section of code from a standard Python module:
# From : pathlib.py, line 1000
.
.
.
    def absolute(self):
        """Return an absolute version of this path. This function works
        even if the path doesn't point to anything.
        No normalization is done, i.e. all '.' and '..' will be kept along.
        Use resolve() to get the canonical path to a file.
        """
        # XXX untested yet!
        if self._closed:
            self._raise_closed()
        if self.is_absolute():
            return self
        # FIXME this must defer to the specific flavour (and, under Windows,
        # use nt._getfullpathname())
        obj = self._from_parts([os.getcwd()] + self._parts, init=False)
        obj._init(template=self)
        return obj
    def resolve(self):
        """
        Make the path absolute, resolving all symlinks on the way and also
        normalizing it (for example turning slashes into backslashes under
        Windows).
        """
        if self._closed:
.
.
.
Ignoring the majority of what’s actually written, this section illustrates a few things regarding commenting:
Hash comments annotate the code itself and can flag outstanding work, such as the FIXME above.
Special comments in the form of multi-line strings (using three single/double quote symbols, """ ... """) are used immediately after function definitions to document functions. These are called docstrings, and the Python help system reads them when you call e.g. help(<FUNCTION NAME>).
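A minimal sketch of how a docstring is picked up by the help system. The function double is a made-up example; the docstring is stored on the function as its __doc__ attribute, and help() displays it along with the function signature.

```python
def double(x):
    """Return twice the value of x."""
    return 2 * x

# The docstring is stored on the function object itself
print(double.__doc__)

# help() shows the signature followed by the docstring
help(double)
```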
Copy and paste your previous exercise (add_numbers2) into a new script file (exercise_docstring.py) and add a docstring that explains what the function does and how to use it.
Then run python -m pydoc exercise_docstring in the terminal, from your exercise folder.
In this case the docstring is enough to tell us what’s going on - we don’t need to comment every line!
# Adding a docstring
def add_numbers2(a, b, absolute=False):
    """
    A modified addition function, that optionally performs absolute value addition
    """
    if absolute:
        return abs(a) + abs(b)
    else:
        return a + b
Running python -m pydoc on this script (saved here as answer_docstring.py) from the terminal produces:
Help on module answer_docstring:
NAME
    answer_docstring - # Adding a docstring
FUNCTIONS
    add_numbers2(a, b, absolute=False)
        A modified addition function, that optionally performs absolute value addition
FILE
    .../answer_docstring.py
Note
Don’t forget to use the good coding suggestions, including frequent commenting / documentation of your code in this exercise.
In light of revelations regarding government agency snooping, you have decided to encrypt your personal communications!
Being a budding Pythonista, and having heard of the Caesar cipher, you will now write a script (exercise_encryption.py) that
Part 1
if __name__ == ... section of your code (check back to the module section for what this means!)
the cat should become uifadbu
Note:
With offset 1, the cipher should become: bcde...xyz a, i.e. include space as a character, in contrast to the example on the wiki page.
That is, the plain-text to cipher translation table for offset 1 should be:
PLAIN : "abcd...xyz "
CIPHER : "bcde...yz a"
Ask a demonstrator if you are unclear about this.
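One possible way to build such a translation table is sketched below. It assumes the alphabet is the 26 lowercase letters followed by a space; the names PLAIN, make_cipher and encrypt are hypothetical, and any character outside that alphabet would raise an error in this simplified version.

```python
# Assumed alphabet: lowercase letters followed by a space
PLAIN = "abcdefghijklmnopqrstuvwxyz "

def make_cipher(offset):
    """Return the alphabet rotated to the left by offset characters."""
    offset = offset % len(PLAIN)
    return PLAIN[offset:] + PLAIN[:offset]

def encrypt(text, offset):
    """Replace each character by the one at the same position in the cipher alphabet."""
    cipher = make_cipher(offset)
    result = ""
    for char in text:
        result += cipher[PLAIN.index(char)]
    return result

print(make_cipher(1))
print(encrypt("the cat", 1))
```

With offset 1 the space character maps to a, which is why the cat becomes uifadbu.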
Part 2
<INPUT NAME>-encrypted.<INPUT EXTENSION>
Part 3: Bonus section
Now that you can encrypt text, and decrypt text that was encrypted with a known offset cipher, let’s test our script using an encrypted file where we don’t know the offset.
To do this we’re going to use a “brute-force” algorithm,