The following concepts are needed less frequently than those in the first section of this workshop. However, they are nonetheless useful in certain scenarios, and are included here for those of you who might find them helpful.
You should already be familiar with how to use a module, and all of the modules that you will need for this workshop are included in the WinPython install. However, for completeness we will briefly mention the simple install process for new modules, using pip.
pip comes as standard with Python as of version 3.4, and allows you to search the online Python Package Index (PyPI) as well as install packages from PyPI.
From the command line, we can run
pip search memory
pip install memory_profiler
pip uninstall memory_profiler
(Note that PyPI has since disabled the server-side index that pip search relies on; use the website https://pypi.org to search for packages instead.)
NOTE (Windows): If the pip executable isn't recognized by the terminal, the above commands can be replaced by e.g.
python -m pip install memory_profiler
Note, however, that unless the module relies only on Python, the install process may run into dependency problems (e.g. needing a C++ development environment to be set up in order to compile included C++ code).
sys.argv
Python can read inputs passed to it on the command line by using the sys module's argv attribute (module variable):
import sys
print(sys.argv)
If we place those statements in a file (called, e.g., "test_inputs.py"), and run the file with Python as usual:
python test_inputs.py
the output would be the list ["test_inputs.py"]; i.e. the first element of argv (which is a list) is always the name of the script.
Subsequent command-line arguments (separated by spaces) will appear as additional elements in the list. For example, if we call the script with
python test_inputs.py Hello 2 you
the output (contents of sys.argv) would be
["test_inputs.py", "Hello", "2", "you"].
This highlights that all command-line inputs are interpreted as strings; if you wish to pass in a number, you would need to convert the string to a number using either float(STRING) or int(STRING), where STRING could be "2", "3.14", or sys.argv[2] if the 2nd command-line argument (after the script name) were a number.
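For example, a minimal sketch of a script (hypothetically saved as "sum_inputs.py") that adds two numbers passed on the command line:
import sys

# Convert the two command-line arguments from strings to numbers;
# e.g. run as: python sum_inputs.py 3.14 2
a = float(sys.argv[1])
b = float(sys.argv[2])
print("The sum is:", a + b)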
More advanced command-line interfaces can be created by using a module such as argparse, which allows for named and positional arguments, type conversion, and automatically generated help text (e.g. -h for help).
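As a brief sketch (the argument names here are invented for illustration, not taken from the workshop), an argparse version of such a number-adding script might look like:
import argparse

# Build a parser with two positional arguments, automatically
# converted to floats and documented in the generated help text
parser = argparse.ArgumentParser(description="Add two numbers.")
parser.add_argument("a", type=float, help="first number")
parser.add_argument("b", type=float, help="second number")
args = parser.parse_args()
print("The sum is:", args.a + args.b)
Running the script with -h then prints the automatically generated help text.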
input
As well as generating terminal output with the print function, we are also able to read input from the terminal with the input function; for example
some_stuff = input("Please answer yes or no: ")
would cause the script (when run) to pause, display the prompt text (i.e. the first argument to the input function), and then wait for the user to enter text using the keyboard. Text entry finishes when the enter key is pressed.
More advanced interactive terminal interfaces are also possible with Python. The cmd module, for example, allows the creation of interactive command sessions, where the developer maps functions they have written to "interactive commands" which the user can then call interactively (including features such as interactive help, tab-completion of allowed commands, and custom prompts).
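As a minimal sketch (the command names here are invented for illustration), a cmd-based session might look like:
import cmd

class GreeterShell(cmd.Cmd):
    intro = "Type help or ? to list commands."
    prompt = "(greeter) "

    def do_greet(self, arg):
        """greet NAME -- print a greeting (this docstring becomes the interactive help)."""
        print("Hello,", arg or "stranger")

    def do_quit(self, arg):
        """quit -- exit the session."""
        return True  # returning True ends the command loop

GreeterShell().cmdloop()
Here do_greet and do_quit become the interactive commands greet and quit, with their docstrings available via the built-in help command.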
Decorators
A decorator is a way to "dynamically alter the functionality of a function". On a practical level, it amounts to passing a function into another function, which returns a "decorated" version of the original function.
Why would we want to do such a thing?
Consider the following simple example; we have a number of functions that generate return values:
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    return a / b
Now we want a clean and simple way of turning all of these functions into more "verbose" versions, which also print details to the terminal. For example, when we call add(10, 2) we want the terminal to read:
Adding 10 and 2 results in 12
Without knowing about decorators, we might think we need to re-write all of our functions. But, as our modification is relatively straightforward to turn into an algorithm, we can instead create a decorator function:
def verbose(func):
    def wrapped_function(a, b):
        # Access the special __name__ attribute to get a function's name
        name = func.__name__
        result = func(a, b)
        print("%sing %0.2f and %0.2f results in %0.2f" % (name, a, b, result))
        return result
    return wrapped_function
which we then use to decorate our previous function definitions:
@verbose
def add(a, b):
    return a + b

@verbose
def subtract(a, b):
    return a - b

@verbose
def multiply(a, b):
    return a * b

@verbose
def divide(a, b):
    return a / b
We can then call each of the decorated functions with 10 and 2 as inputs; for example:
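add(10, 2)
subtract(10, 2)
multiply(10, 2)
divide(10, 2)
which produces the terminal output: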
adding 10.00 and 2.00 results in 12.00
subtracting 10.00 and 2.00 results in 8.00
multiplying 10.00 and 2.00 results in 20.00
divideing 10.00 and 2.00 results in 5.00
(Note the "divideing" spelling: the wrapper naively appends "ing" to each function's __name__.)
Note that the decorator syntax, where we use the @ symbol followed by the decorator function name, and then define the function to be decorated, is the same as
add = verbose(add)
i.e. we are calling the decorator function (here called verbose) on the input function (add), which returns a wrapped version of add, and we assign it back into add.
The reason that the decorator syntax (@) was created is that the core Python developers decided that having to first define a function and then apply the decorating syntax would be confusing, especially when dealing with large function definitions or class methods (yes, decorators can also be applied to class methods!).
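As an aside (this goes slightly beyond the workshop material): the wrapper above replaces the original function's metadata, so the decorated add reports its __name__ as "wrapped_function". The standard library's functools.wraps decorator fixes this; a minimal sketch:
import functools

def verbose(func):
    @functools.wraps(func)  # copy func's __name__, docstring, etc. onto the wrapper
    def wrapped_function(a, b):
        result = func(a, b)
        print("%sing %0.2f and %0.2f results in %0.2f" % (func.__name__, a, b, result))
        return result
    return wrapped_function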
Lambdas
A quick but useful additional Python construct is the lambda. Lambdas are one-line function definitions, which can be extremely useful for removing the need to create a full function definition for an almost trivial function.
Let’s look at an example of when this might be useful.
There is an in-built function called filter that is used to filter sequences such as lists; the help doc for filter is:
filter(function or None, iterable) --> filter object
Return an iterator yielding those items of iterable for which function(item)
is true. If function is None, return the items that are true.
Given a list of files, we might want to use this function to select only files ending in a specific extension. However, to do so we need to pass in a function that takes a filename as an input and returns True or False, indicating whether the filename ends in the desired extension. We could now go ahead and define such a function, but it really is a trivial function, and a more efficient approach is to use a lambda:
# e.g. file list
file_list = ["file1.txt", "file2.py", "file3.tif", "file4.txt"]
text_files = filter(lambda f: f.endswith("txt"), file_list)
# filter returns a lazy iterator; wrap it in list() to see the contents
print(list(text_files))  # ['file1.txt', 'file4.txt']
Another example of a function that requires another function as an input is non-standard sorting using sorted, in which case we can pass in a custom "key" function that is used to extract a sort "key" from each item:
print(sorted(["10 kg", "99 kg", "100 kg"]))
print(sorted(["10 kg", "99 kg", "100 kg"], key=lambda k: float(k.split()[0])))
outputs
['10 kg', '100 kg', '99 kg']
['10 kg', '99 kg', '100 kg']
Generators
A generator is something that is iterable, and yet the values that are iterated over do not all exist in memory at the same time.
In fact, those of you who attended the Introductory course will have seen a generator without knowing it, when we covered reading data from a file, and you have just made use of one above!
There, the last method shown for iterating over a file was (roughly)
for line in open("filename.txt"):
    print(line)
However, what we didn't mention was that this approach provides sequential access to the file being read, meaning that only one line of the file is loaded into memory per iteration.
This is in contrast to
lines = open("filename.txt").readlines()
for line in lines:
    print(line)
where we read all of the file into memory using readlines. Under the hood, a file object behaves like a generator, meaning that for each iteration in for line in <fileobject>, the file object yields the next line of the file.
Similarly, the last comprehension-like expression we encountered, i.e. the generator expression (which looks like a tuple comprehension), yields the value of each list item squared, without loading all of the new values into memory.
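For reference, a generator expression looks like a list comprehension but with parentheses instead of square brackets:
squares = (x * x for x in range(5))  # nothing is computed yet
print(next(squares))  # 0 -- values are produced one at a time, on demand
print(list(squares))  # [1, 4, 9, 16] -- the remaining values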
Any function where we iterate over something can be converted to a generator; instead of building up and returning a list/tuple etc. in the function, we use the yield statement instead of return.
For example, consider a simple squared function that takes a list and generates a new list of squared values:
def squared(list_in):
    list_out = []
    for val in list_in:
        list_out.append(val * val)
    return list_out
To convert this into a generator function we might write
def squared(list_in):
    for val in list_in:
        yield val * val
Running the first version through a memory profiler (here the %memit magic provided by the memory_profiler module in IPython),
In [1]: %memit l1 = squared(range(10000000))
peak memory: 391.05 MiB, increment: 367.12 MiB
we see that we create almost 400 MB of data in memory by running the traditional function (as we've created 10,000,000 numbers, and each is quite a bit larger than the 8 bytes needed for the value alone, since Python objects are more than just data!).
By using the generator version of the function instead, we get
In [2]: %memit l2 = squared(range(10000000))
peak memory: 23.80 MiB, increment: 0.10 MiB
i.e. very little additional memory is used!
Can we still use the resulting generator as we would a list? The answer is: most of the time… by which I mean that many functions where we don't need the list to be all in memory will work. E.g.
print(sum(l2))
will output 333333283333335000000, but only once, as generators can be iterated over once and only once. If we want to call another function on the generator, we would need to recreate it (i.e. set l2 = squared(range(10000000)) again).
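A small sketch of this one-shot behaviour, using the generator version of squared from above:
gen = squared([1, 2, 3])
print(sum(gen))  # 14
print(sum(gen))  # 0 -- the generator is already exhausted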
The bottom line is that generators are useful for specific memory-critical tasks, such as working with data that is too large to fit in memory.
Exercise
Create a script file ("exercise_generators.py") to evaluate the sum of the squares of the first 1,000,000,000 integers (1 billion, or 1e9). As this is a large number, you would at best barely be able to store all of these values in memory at the same time (if each integer were represented using 64 bits / 8 bytes, this would require at least 8 GB of RAM, which is roughly the total amount of RAM a current standard desktop PC has!).
NOTE: This will take a couple of minutes to execute!
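If you get stuck (do try it yourself first!), one possible solution sketch uses a generator expression so that the squared values never all exist in memory at once:
# exercise_generators.py
# Sum the squares of the first 1e9 integers without building a list
print(sum(n * n for n in range(1000000000)))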
While you're waiting for this to finish running, it's worth mentioning that the equivalent functionality in C++ is about 20 times faster; but getting C++ to do this is significantly more difficult (even for such a relatively simple task), as it involves using the non-standard __int128 compiler extension, which requires creating a custom output stream operator (to print the result to the screen)!
Nonetheless, for problems involving very simple operations repeated many many times, lower level languages such as C++ may be better suited.
Luckily for us, we don’t need to choose! We can have our cake and eat it…
Low-level language purists argue that Python is extremely slow compared with e.g. C, C++, Fortran, or to a lesser extent, Java.
While it is true that Python is slow compared with these languages (typically on the order of 10-40 times slower), pure Python isn’t designed to compete with these languages for speed.
Instead Python prioritizes readability, code design, and simplicity.
That said, there is a large community of Python developers who devote their time to optimizing Python in many ways. These include:
ctypes: a C "foreign function library" built into Python, often used for wrapping C libraries in pure Python
swig and Boost.Python (and, for Python 2, scipy.weave): tools for wrapping C/C++ code for use from Python
Cython: a Python-like language that is compiled to C/C++, which can also be used to interface with C++
Before you get carried away with these hybrid approaches and/or start to think you might need to write your whole project using a lower-level language, consider the small snippet below that demonstrates some simple arithmetic operations in a for-loop:
import math

def simple(amp, N):
    out = []
    for i in range(N):
        val1 = amp * math.sin(i)
        val2 = (abs(val1) - 1)**2
        out.append(val2 * val1)
    return out

simple(0.1, 10000)
I.e. here we perform some relatively basic mathematical operations (sin, abs, multiplication, exponentiation, etc.) inside a for loop, iterated 10,000 times. For loops are one of Python's common speed bottlenecks compared with low-level languages, so this can be considered a relatively representative snippet.
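For reference, a vectorized NumPy sketch of the same computation (an illustration of what the "Numpy (vectorized)" entry in the benchmark below refers to, not necessarily the exact benchmark code):
import numpy as np

def simple_numpy(amp, N):
    # Whole-array ("vectorized") operations: the loop happens in C, not Python
    i = np.arange(N)
    val1 = amp * np.sin(i)
    val2 = (np.abs(val1) - 1)**2
    return val2 * val1

simple_numpy(0.1, 10000)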
Benchmark results
Running this code using the NumPy module, we get the following baseline:
Numpy: 0.000417 seconds / "loop iteration" (uses matrix operations)
Execution times below are shown relative to this NumPy version (small is good!):
Cython versions
For loop : 10.09
List comprehension : 15.05
With Basic Cython type declarations : 5.13
+ return type declarations : 5.07
+ using cmath abs & sin : 1.27
+ removed bounds checking : 1.26
Normal Python
Numpy (vectorized) : 0.97
Basic for loop : 13.46
List comprehension : 17.23
Basic C++
Growing vector : 1.44
Pre-assigned vector : 1.36
C++ with O2 (optimization) flag
Growing vector : 0.97
Pre-assigned vector : 0.92
C version with O2 : 0.89
So the NumPy version (and the best Cython versions!) are faster than the basic pure C++ version! This is possible because NumPy is not a pure-Python library; all of the basic operations are implemented in C, and wrapped in Python to provide a Python object interface to a load of really fast numerical algorithms! As this still incurs some overhead, applying optimization flags to the C++ and C versions means those versions are still the fastest.
If someone else (like the NumPy development team!) hasn't already written the speed-critical functions you need in C/C++ for you, there is a gentle way of optimizing your Python code: using Cython to perform incremental optimization. We mentioned Cython above as a way of interfacing with C++. Cython basically converts pure Python into C/C++ and automatically creates a Python-compatible module from the output. This means that pure Python is valid Cython code. In fact, just running Cython on pure Python already gives a small increase in speed, often of about 10-20%. By adding additional Cython-specific information, such as type declarations (and removing calls to Python functions), Cython is able to further optimize your Python code for you. For example, the optimized Cython version (which runs faster than the unoptimized pure C++ and almost as fast as the highly optimized NumPy code) is almost identical to the Python version, featuring just a few Cython-specific decorators:
cimport cython
from libc.math cimport sin, fabs

@cython.locals(amp=cython.double, N=cython.int, out=list,
               val1=cython.double, val2=cython.double)
@cython.returns(list)
def cython_version(amp, N):
    out = []
    for i in range(N):
        val1 = amp * sin(i)
        val2 = (fabs(val1) - 1)**2
        out.append(val2 * val1)
    return out
That's it!
The function definition itself is almost identical (except that we switched to the C math library versions of the sine and absolute-value functions, sin and fabs), and all we needed were a few additional lines:
cimport cython
from libc.math cimport sin, fabs
to make the cython module's decorators available and to import the C library maths functions (so that Cython doesn't have to switch back to the Python library to call the Python versions of these functions!), and
@cython.locals(amp=cython.double, N=cython.int, out=list,
               val1=cython.double, val2=cython.double)
@cython.returns(list)
i.e. two special cython module decorators, locals and returns, that are used to specify the data types of the local variables and the return value so that Cython can convert the Python code to C code (Cython knows how to deal with for loops and Python list conversions). Not much to add for a more-than-10-fold speed increase! And yet the function body is still valid Python code. This highlights how we can incrementally add Cython annotations to Python code to make it run faster; so if you have a nicely written Python module with a speed bottleneck that has to be addressed, there is no need to replace the whole thing with C code - you can simply add a few Cython-specific lines and run the code through Cython to produce an optimized, compiled version of the same code!
Cython also contains a special cimport numpy directive to allow operations involving NumPy arrays to be optimized using Cython.
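As a final sketch (the function here is illustrative, not taken from the workshop), a Cython function operating on a typed NumPy array might look like:
cimport numpy as np
import numpy as np

def sum_squares(np.ndarray[np.float64_t, ndim=1] arr):
    # Typed buffer access lets Cython generate fast C-level indexing
    cdef double total = 0.0
    cdef Py_ssize_t i
    for i in range(arr.shape[0]):
        total += arr[i] * arr[i]
    return total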