In addition to the basic data types covered in the last section, Python has several built-in “containers”.
These come in two main types: sequences (lists and tuples) and dictionaries.
A list, as the name implies, is a sequence of items. For example, the
numbers 1 to 10 could be arranged in a list: 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.
In Python, lists are created using square brackets, with items separated by commas, e.g.:
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
List elements can be anything: numbers, strings, or even other lists (which creates “nested lists”).
For now though, let’s get to grips with lists by considering flat lists of simple data.
Lists can be manipulated using member-functions:
insert (anywhere in the list)
append (always at the end)
extend to add multiple items at the end
pop (a specific index - without any number the last element will be removed)
remove (a specific value)
sort (sorts the list in place)
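The member functions listed above can be tried out on a small list. The following is only a minimal sketch; the variable names and values are made up for illustration.

```python
numbers = [3, 1, 2]

numbers.append(4)         # add a single item at the end -> [3, 1, 2, 4]
numbers.insert(0, 10)     # add an item at index 0 -> [10, 3, 1, 2, 4]
numbers.extend([5, 6])    # add several items at the end -> [10, 3, 1, 2, 4, 5, 6]

last = numbers.pop()      # no index given: removes and returns the last item (6)
first = numbers.pop(0)    # removes and returns the item at index 0 (10)
numbers.remove(3)         # removes the first item equal to 3 -> [1, 2, 4, 5]

numbers.sort(reverse=True)  # sorts in place, largest first -> [5, 4, 2, 1]
print(numbers)
```

Note that pop takes an index while remove takes a value, which is an easy pair to mix up.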
Lists can also be combined using the plus operator, +, e.g.
[1,2,3] + [4,5]
results in the list [1,2,3,4,5].
The length of a list can be determined using the built-in function len.
Note that len is not a member function of a list, i.e. you cannot call AListVariable.len().
Instead, use len(AListVariable) to return the length of a list.
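As a small illustration of combining lists and measuring their length, here is a sketch with arbitrary variable names (a, b and combined are not from the text):

```python
a = [1, 2, 3]
b = [4, 5]

combined = a + b       # + builds a new list; a and b are left unchanged
print(combined)        # [1, 2, 3, 4, 5]
print(len(combined))   # 5
```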
Anything that can be “iterated” over, or cycled through, can be converted into a list
using the list function.
Python also contains a built-in function called range, which is a convenient
way of generating a range of numbers (as an iterable that can be converted
into a list). For example,
list(range(10))
is the same as
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
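As a quick illustration of both list and range, the sketch below converts a string and a range into lists. The variable names are arbitrary.

```python
letters = list("abc")         # a string can be iterated over one character at a time
print(letters)                # ['a', 'b', 'c']

first_five = list(range(5))   # range(5) yields 0, 1, 2, 3, 4 (the stop value is excluded)
print(first_five)             # [0, 1, 2, 3, 4]
```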
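Write a script to perform the following operations (this is just to test them all out!):
1. Create a variable that contains a list that holds the numbers 10 to 100 (i.e. 10, 11, 12, ..., 99)
2. Add the value 100 to the end of the list
3. Remove the 20th element of the list
4. Remove the value 55
5. Add the list [5,6,7] to the end of the list (hint: use extend with [5,6,7]!)
6. Print the length of the list
When you’re given a step-by-step set of tasks in Python, it often helps to copy and paste those steps into your script file, turn them into comments, and then work through the tasks.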
All of the steps that you need are covered in the text above this exercise, except for
creating the range that starts at 10. To do that, pass additional inputs to the range function, which can be called in two ways:
range(stop) -> range object
range(start, stop[, step]) -> range object
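To see what the extra inputs do, here is a short sketch using small numbers chosen purely for illustration:

```python
print(list(range(3)))          # stop only: [0, 1, 2]
print(list(range(2, 6)))       # start and stop: [2, 3, 4, 5]
print(list(range(0, 10, 3)))   # start, stop and step: [0, 3, 6, 9]
```

In every case the stop value itself is not included in the result.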
Check the Python website for the full set of list member functions.
As suggested, adding the steps as comments:
# 1. Create a variable that contains a list that holds the numbers 10 to 100 (i.e. 10, 11, 12, ..., 99)
l = list(range(10, 100))
# 2. Add the value 100 to the end of the list
l.append(100)
# 3. Remove the 20th element of the list (index 19!)
l.pop(19)
# 4. Remove the value 55
l.remove(55)
# 5. Add the list `[5,6,7]` to the end of the list (use `extend`!)
l.extend([5, 6, 7])
# 6. Print the length of the list.
print(len(l))
You should get a length of 92.
A tuple is similar to a list in that it is a sequence of items.
The main difference between a list and a tuple is that a tuple is immutable, which means that elements cannot be changed, added, or removed once it has been created (i.e. it is “read-only”!).
This can make tuples slightly more efficient than lists, but also less versatile.
As a general rule of thumb, when in doubt, use a list.
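The read-only behaviour of tuples can be seen in a short sketch. Tuples are written with round brackets; the variable names here are made up for illustration.

```python
point = (1, 2, 3)       # a tuple is created with round brackets
print(point[0])         # indexing works just like a list -> 1

try:
    point[0] = 10       # attempting to change an element...
except TypeError as err:
    print("Cannot modify a tuple:", err)   # ...raises a TypeError

as_list = list(point)   # convert to a list if changes are needed
as_list[0] = 10
print(as_list)          # [10, 2, 3]
```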
Lists are great when we have a sequence of data, but the caveat is that in order to
get a specific element we need to keep track of its index or repeatedly call
the member-function index.
This can get tricky if the list grows and shrinks.
For example, if we want to keep track of the colours (and names) of fruit, our first attempt might be to keep two lists - one for the names, and one for the colours:
names = [ "banana", "orange", "strawberry"]
colours = [ "yellow", "orange", "red"]
Then, to find the colour of a strawberry, we first have to find the index of strawberry, and then use that to find the colour:
ind = names.index("strawberry")
print("The colour is : " + colours[ind])
Not only is this clumsy, it’s also likely to lead to issues, as the lists might accidentally become unsynchronized (meaning that the order of the names and the order of the colours no longer match).
Attempt number two: as briefly mentioned, lists can hold pretty much anything, including other lists, so we could use a list of lists, e.g.
fruit_colours = [ ["banana", "yellow"], ["orange", "orange"], ["strawberry", "red"]]
Now the name will always be paired with the right colour!
The downside to this approach is that if we want to find out what colour a fruit is,
we need to do a non-trivial search operation (as we’re trying to find lists in lists!)
and can no longer use the index function that we used before.
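To give a feel for the search that the list-of-lists approach needs, here is one possible sketch. It uses a for loop and an if statement, which may not have been introduced yet, so treat it as an illustration rather than something you are expected to write at this point.

```python
fruit_colours = [["banana", "yellow"], ["orange", "orange"], ["strawberry", "red"]]

colour = None
for name, fruit_colour in fruit_colours:   # each inner list unpacks into two names
    if name == "strawberry":
        colour = fruit_colour
        break                              # stop at the first match

print("The colour is : " + colour)
```

Every lookup has to walk through the list like this, which is the clumsiness that the text is pointing at.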
The bottom line is that we could hack together a solution using lists, but luckily for these kinds of situations, Python offers us a much better solution in the form of a dictionary.
A dictionary is created using curly-bracket notation { }, with the items provided as a comma-separated list of key:value pairs. Continuing our fruit names and colours example:
fruit_colours = { "banana" : "yellow", "orange" : "orange", "strawberry" : "red"}
Now we can access values using keys:
print(fruit_colours["banana"])
will print yellow to the terminal.
Note about keys
Here we’ve used strings as both the keys and values, but you can also use numbers as keys and/or values.
In fact pretty much anything can be a value (much as with lists), and anything that is hashable can be a key. Hashable roughly translates as non-changing: a number or string is hashable, as is a tuple whose elements are all hashable (tuples are immutable, as explained above). Lists and dictionaries are NOT hashable as they can change.
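The difference between hashable and non-hashable keys can be checked directly. The dictionary name grid and its contents below are invented for this sketch.

```python
grid = {}
grid[(0, 1)] = "tuple key"    # a tuple of numbers is hashable, so this works
grid[42] = "number key"
grid["name"] = "string key"

try:
    grid[[0, 1]] = "list key"   # a list is not hashable...
except TypeError as err:
    print("Not allowed:", err)  # ...so Python raises a TypeError

print(len(grid))   # 3
```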
Dictionaries can be created from a sequence where each item in the sequence
has two elements, by using the dict function, e.g.
fruit_colours = dict( [ ["banana", "yellow"], ["orange", "orange"], ["strawberry", "red"]] )
fruit_colours = dict( ( ("banana", "yellow"), ("orange", "orange"), ("strawberry", "red")) )
Both produce the same dictionary as in the previous section.
Note that while for this example this way of creating a dictionary might seem superfluous, there are scenarios where it is very useful. For example, if we have a function that generates a list of 2-element lists, but we want the result as a dictionary, we can simply convert the list to a dictionary as per above, without having to write a new function!
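A sketch of that scenario might look like the following. The function load_fruit_pairs is hypothetical and simply stands in for any function that returns a list of 2-element lists.

```python
def load_fruit_pairs():
    """Hypothetical helper: stands in for any function returning 2-element lists."""
    return [["banana", "yellow"], ["orange", "orange"], ["strawberry", "red"]]

# Convert the result to a dictionary without touching the function itself
fruit_colours = dict(load_fruit_pairs())
print(fruit_colours["strawberry"])   # red
```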
Once a dictionary has been created, it can be grown or shrunk slightly differently to lists:
d[KEY] = VALUE - creates a new key-value pair, or updates one if it already exists
d.pop(KEY, DEFAULT) - removes KEY and returns its value, or DEFAULT if KEY doesn’t exist in the dictionary.
As you might have spotted, you can’t assign multiple values with the same key - instead if you write
d[ALREADY_EXISTING_KEY] = NEW_VALUE
the old value that ALREADY_EXISTING_KEY pointed to will be overwritten.
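Growing, overwriting and shrinking a dictionary can be sketched as follows. The fruit entries are illustrative values only.

```python
fruit_colours = {"banana": "yellow", "orange": "orange"}

fruit_colours["strawberry"] = "red"    # new key: adds a key-value pair
fruit_colours["banana"] = "green"      # existing key: overwrites the old value

removed = fruit_colours.pop("orange", None)   # key exists: returns its value, "orange"
missing = fruit_colours.pop("kiwi", None)     # key absent: returns the default, None

print(fruit_colours)   # {'banana': 'green', 'strawberry': 'red'}
```

Passing a default to pop avoids the KeyError that would otherwise be raised for a missing key.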
Write a script to count the occurrences of the words sword, wench, love, and fool
in Shakespeare’s Romeo & Juliet.
Use the following initial two lines (feel free to copy and paste!),
which will pull the text from an online source and assign it to
a variable called text:
import urllib.request
text = urllib.request.urlopen("").read().decode('utf8')
The longer approach here would be to perform the word counting
and assign the counts into variables, and then enter those
variables into a dict.
However, when creating a dict (using the curly bracket notation as
mentioned), you are free to have the value be an expression that
results in a value!
This means you can condense the whole exercise (after the copy and paste) into two lines: one where you create the dictionary and the other where you print its contents!
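To illustrate using expressions as dictionary values without needing the online text, here is a sketch that works on a short made-up string (the variable sample is an assumption for this example, not part of the exercise):

```python
sample = "love is love, and a fool in love is still a fool"

# Each value is an expression that is evaluated when the dictionary is built
counts = {
    "love": sample.count("love"),
    "fool": sample.count("fool"),
}
print(counts)   # {'love': 3, 'fool': 2}
```

Bear in mind that the string member function count matches substrings, so it would also count "love" inside a longer word such as "lovely".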
The first few lines should be copied and pasted from the question text; the rest could be achieved fairly succinctly using
import urllib.request
#text = urllib.request.urlretrieve("")
text = urllib.request.urlopen("").read().decode('utf8')
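counts = {
    'sword': text.count('sword'),
    'wench': text.count('wench'),
    'love': text.count('love'),
    'fool': text.count('fool'),
}
print(counts)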
which should produce the following output when run
{'sword': 13, 'wench': 5, 'love': 171, 'fool': 11}