Data Structures
Data Structures#
So far we have dealt with single data items. But what if we have a large number of data items? Storing each item in a separate variable is inefficient and time-consuming. This is when we use data structures.
Programming concept
Data structures are structures in memory that store and organise data. Essentially, a data structure is a collection of data items.
Having multiple data items in a collection makes it easier to perform operations that need to be applied to each item in the collection. Data structures are also very useful when reading data from files. This section introduces different data structures commonly used in Python.
Tuples#
Tuples are one of the built-in data structures in Python. Tuples can contain data items of different data types, but as good practice we normally create tuples whose items share the same data type. Tuples are created as a comma-separated list of data items enclosed in parentheses. Let us create our first tuple:
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)
The code above creates a variable t which references a tuple in memory that is 10 items long. Assigning data items to a tuple is also known as tuple packing.
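As a small sketch of tuple packing, the example below shows that the parentheses are optional and that a tuple may mix data types. The names record, single and not_a_tuple, and the values stored in them, are illustrative and do not appear elsewhere in this section.

```python
# Packing: the commas create the tuple; the parentheses are optional
record = "Alexia", 21, 1.68   # mixed data types (hypothetical values)
print(type(record))           # <class 'tuple'>
print(len(record))            # 3

# A tuple with a single item needs a trailing comma
single = (70,)
not_a_tuple = (70)            # without the comma this is just the integer 70
print(type(single))           # <class 'tuple'>
print(type(not_a_tuple))      # <class 'int'>
```

The trailing comma matters later on, when a single item is concatenated to an existing tuple.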
Tuple index positions and slicing#
Each data item has a specified position known as its index position. Index positions in Python start from 0. Fig. 6 below shows an example of the sequence of data items in tuple t together with their index positions.
A tuple is a Sequence type in Python, and therefore it can be sliced using the item access operator []. Slicing means extracting items from a sequence. Assuming seq is a sequence of data items, to extract one item, specify the index position of that item in [] as \(seq[index position]\), as in the example below. A negative index position counts from the end of the sequence, so -1 refers to the last item.
#Extract item at index position 2
print("Item at t[2] is", t[2])
#Extract item at index position -2
print("Item at t[-2] is", t[-2])
Item at t[2] is 110
Item at t[-2] is 55
To extract more than one item from a sequence, you can use either:
\(seq[start:end]\) to extract items from a sequence from a start position up to an end position (excluding it).
\(seq[start:end:step]\) to extract every stepth item from the sequence starting from the start position to the end position (excluding it).
Below are two examples, A and B, that show these two ways of slicing a tuple.
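As a minimal sketch of both slice forms, the code below applies them to the tuple t defined earlier. The variable names middle, every_second and reversed_t are illustrative, and the negative step in the last example is an extra case not covered in the text above.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

# seq[start:end]: items from index position 1 up to, but excluding, 4
middle = t[1:4]
print(middle)        # (92, 110, 66)

# seq[start:end:step]: every 2nd item from index position 1 up to, but excluding, 8
every_second = t[1:8:2]
print(every_second)  # (92, 66, 45, 57)

# A negative step walks backwards; omitting start and end covers the whole tuple
reversed_t = t[::-1]
print(reversed_t)    # (62, 55, 57, 40, 45, 75, 66, 110, 92, 55)
```

Each slice returns a new tuple; t itself is left unchanged.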
Below are code examples of other operations you can do with tuples:
#Extract items from the beginning of the tuple to index position 2 (excluded).
print("t[:2] is", t[:2])
#Extract items from index position 5 to the end of the tuple.
print("t[5:] is", t[5:])
#print all items in a tuple
print("\nPrint all items in tuple t", t)
# Tuple concatenation: add tuples together
t2 = t + (70,) #you have to specify the one item as a tuple as well
print("\nConcatenate t to (70,) results in", t2)
# Tuple replication: replicate contents of a tuple by a specified number of times
t3 = t * 2
print("\nReplicating tuple t by 2:", t3)
# Tuple unpacking - place data items of tuple in separate variables
print("\nTuple unpacking:")
t4 = ("Alexia", "MCB", "1B")
(name, subject, year) = t4
print("Name is", name)
print("Subject is", subject)
print("Year is", year)
t[:2] is (55, 92)
t[5:] is (45, 40, 57, 55, 62)
Print all items in tuple t (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)
Concatenate t to (70,) results in (55, 92, 110, 66, 75, 45, 40, 57, 55, 62, 70)
Replicating tuple t by 2: (55, 92, 110, 66, 75, 45, 40, 57, 55, 62, 55, 92, 110, 66, 75, 45, 40, 57, 55, 62)
Tuple unpacking:
Name is Alexia
Subject is MCB
Year is 1B
Tuples are immutable#
Tuples are immutable, meaning that they cannot be changed once they have been created. Below is an example showing what happens if we try to change the contents of a tuple.
# change the data item at index position 2 of tuple t
t[2] = 80
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [4], line 2
1 # change the data item at index position 2 of tuple t
----> 2 t[2] = 80
TypeError: 'tuple' object does not support item assignment
As you can see, Python raises a TypeError stating explicitly that the 'tuple' object does not support item assignment. Since they cannot be modified, tuples are useful when you have a fixed sequence of data items that you know are not going to change, e.g., days of the week or months of the year. Another benefit of tuples is that they can be faster than lists; however, this depends on the size of your data and the difference is normally not significant.
If you want to be able to modify the sequence of data items, use a list instead. You can convert a tuple to a list with the list() function, e.g., l = list(t) will create a variable l of type list that contains the same data items as t.
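The conversion can be sketched as follows. The built-in tuple() function, used here to convert back, is not mentioned in the text above, and t_new is an illustrative name.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

l = list(t)       # new list with the same data items as t
l[2] = 80         # allowed: lists support item assignment
print(l)          # [55, 92, 80, 66, 75, 45, 40, 57, 55, 62]
print(t)          # (55, 92, 110, 66, 75, 45, 40, 57, 55, 62) - t is unchanged

t_new = tuple(l)  # convert back to a tuple if the result should be fixed again
print(t_new)      # (55, 92, 80, 66, 75, 45, 40, 57, 55, 62)
```

Converting creates a separate object, so changes made to l never affect the original tuple t.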
Membership operators#
Membership operators are used to test for membership in Sequence types. For a tuple t, the in membership operator returns True if at least one data item in t is equal to x, otherwise it returns False. The not in non-membership operator does the opposite: it returns True if no data item in t is equal to x, otherwise it returns False. The code below shows an example of using membership operators with tuples.
#Check if number 2 is one of the items of tuple t
2 in t #returns False
#Check if number 55 is one of the items of tuple t
55 in t #returns True
#Check that 2 is not one of the items of tuple t
2 not in t #returns True
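Membership tests are often used as the condition of an if statement. The sketch below assumes a hypothetical tuple days and variable day, neither of which appears elsewhere in this section.

```python
days = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")  # hypothetical tuple
day = "Sat"

if day in days:
    print(day, "is a valid day")
else:
    print(day, "is not a valid day")

# Membership compares whole data items, so a partial match is not enough
print("Sa" in days)       # False
print("Sa" not in days)   # True
```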
Other useful functions#
Below is a list of other useful functions that can be used with tuples. Note that these functions do not make any changes to the tuples; rather, they perform a calculation or operation on the data items of the tuple and return the result.
Function | Description | Example |
---|---|---|
len(t) | returns the number of data items in a tuple | len(t) |
min(t) | returns the smallest data item in t | min(t) |
max(t) | returns the largest data item in t | max(t) |
t.index(x) | returns the index position of the leftmost occurrence of data item x in t | t.index(55) |
t.count(x) | returns the number of times x occurs in t | t.count(55) |
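The sketch below applies these operations to tuple t. It assumes that the table's descriptions correspond to the built-in functions len(), min() and max() and to the tuple methods index() and count(), which match those descriptions.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

print(len(t))        # 10  - number of data items
print(min(t))        # 40  - smallest data item
print(max(t))        # 110 - largest data item
print(t.index(55))   # 0   - leftmost index position of 55
print(t.count(55))   # 2   - 55 occurs twice

# None of the calls above modified the tuple
print(t)             # (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)
```

Note that t.index(x) raises a ValueError when x is not in the tuple, so it can be worth checking membership with in first.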
Lists#
Lists hold a one-dimensional sequence of data items. They are similar to tuples but differ from them in two main ways:
how they are declared
they are mutable objects
Lists are declared as a comma-separated list of data items enclosed in square brackets, rather than parentheses. Let us create a list l with the same data items as tuple t above:
l = [55, 92, 110, 66, 75, 45, 40, 57, 55, 62]
Lists are also a Sequence type in Python and therefore support slicing, membership operators, concatenation, replication and the other utility functions we mentioned in the Tuples section.
(Exploring lists)
Level:
Explore lists by trying the code we used in the Tuples section as follows:
Create a list using the code above.
Explore list index positions by adapting this code.
Explore list slicing by adapting the code in Fig. 7 and this code.
Test for membership of data items in lists using the membership operators as in this example.
Try other useful functions on lists as specified in Table 5.
Lists are mutable#
Unlike tuples, lists are mutable, which means that we can modify them. Example:
#print contents of l and get id of l
print(l)
print("l is referencing object", id(l), "in memory")
# change the data item at index position 2 of list l
l[2] = 80
print("\nAfter l[2] = 80, l is:", l)
print("l is referencing object", id(l), "in memory")
[55, 92, 110, 66, 75, 45, 40, 57, 55, 62]
l is referencing object 140504391354560 in memory
After l[2] = 80, l is: [55, 92, 80, 66, 75, 45, 40, 57, 55, 62]
l is referencing object 140504391354560 in memory
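When we tried to apply a similar operation to a tuple, Python raised an exception, because tuples are immutable. Since lists are mutable, the code above successfully changes the data item at index position 2 of l. The value of id(l) is the same before and after the assignment, which shows that the existing list object was modified in place rather than replaced by a new one.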
List utility methods#
In addition to the sequence methods in Table 5, we can use other methods with lists.
This is mainly because lists can be modified.
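Function | Description | Example |
---|---|---|
l.append(x) | appends data item x to the end of list l | l.append(70) |
l.insert(i, x) | inserts data item x at index position i of list l | l.insert(1, 99) |
l.remove(x) | removes the leftmost occurrence of the data item that is equal to x | l.remove(55) |
l.reverse() | reverses the data items in l in place | l.reverse() |
l.sort() | sorts the data items of list l in place | l.sort() |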
|
The |
|
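The sketch below chains several list methods on l and prints the list after each step. It assumes that the table's descriptions correspond to the list methods append(), insert(), remove(), reverse() and sort(); the values 70 and 99 are illustrative.

```python
l = [55, 92, 110, 66, 75, 45, 40, 57, 55, 62]

l.append(70)       # add 70 at the end
print(l)           # [55, 92, 110, 66, 75, 45, 40, 57, 55, 62, 70]

l.insert(1, 99)    # insert 99 at index position 1
print(l)           # [55, 99, 92, 110, 66, 75, 45, 40, 57, 55, 62, 70]

l.remove(55)       # remove the leftmost 55 only
print(l)           # [99, 92, 110, 66, 75, 45, 40, 57, 55, 62, 70]

l.reverse()        # reverse the items in place
print(l)           # [70, 62, 55, 57, 40, 45, 75, 66, 110, 92, 99]

l.sort()           # sort the items in place, in ascending order
print(l)           # [40, 45, 55, 57, 62, 66, 70, 75, 92, 99, 110]
```

These methods modify l directly and return None, so writing l = l.sort() would replace the list with None.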
Methods vs Functions
We have already seen how to call functions. Functions are called only by their names, e.g., print(). Methods are similar to functions but they are defined inside classes or objects, and so they depend on them. We will not be going into the details of object-oriented programming in this course, but we have already encountered some methods of tuples and lists, e.g., l.append(x). Here l is an object reference of the list class and append() is a method inside the list class. The dot . after l is used to access the method associated with the list object. In this example, a pop-up menu listing all the methods associated with list will be shown in PyCharm when you type the dot after l.
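The difference in call syntax can be sketched as follows. The built-in dir() function is used here only as a rough stand-in for the editor's pop-up menu, and public_methods is an illustrative name.

```python
l = [55, 92, 110, 66, 75, 45, 40, 57, 55, 62]

# Function: called by its name, with the list passed as an argument
print(len(l))        # 10

# Method: called on the object through the dot
print(l.count(55))   # 2

# Names of the public methods of the list class (those without a leading underscore)
public_methods = [name for name in dir(list) if not name.startswith("_")]
print(public_methods)
```

The printed names include append, insert, remove, reverse and sort, alongside the index and count methods that lists share with tuples.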
Level:
Try the examples in Table 6. Check the output of list l after running every example.