Data Structures#

So far we have dealt with single data items. But what if we have a large number of data items? Storing each item in a separate variable is inefficient and time-consuming. This is when we use data structures.

Data Structures

Programming concept

Data structures are structures in memory that store and organise data. Essentially, a data structure is a collection of data items.

Having multiple data items in a collection makes it easier to perform operations that need to be applied to each item in the collection. Data structures are also very useful when reading data from files. This section introduces different data structures commonly used in Python.

Tuples#

Tuples are one of the built-in data structures in Python. Tuples can contain data items of different data types, but as good practice we normally create tuples whose items all have the same data type. A tuple is created as a comma-separated list of data items enclosed in parentheses. Let us create our first tuple:

t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

The code above creates a variable t which references a tuple in memory that is 10 items long. Assigning data items to a tuple is also known as tuple packing.

Tuple index positions and slicing#

Each data item has a specific position in the tuple, known as its index position. Index positions in Python start from 0. Fig. 6 below shows the sequence of data items in tuple t together with their index positions.

_images/tuple.png

Fig. 6 Representation of tuple t in memory and the index positions of its items.#

A tuple is a Sequence type in Python and can therefore be sliced using the item access operator []. Slicing means extracting items from a sequence. If seq is a sequence, you extract a single item by specifying the index position of that item in [], as \(seq[index position]\). A negative index position counts from the end of the sequence, so -1 refers to the last item and -2 to the one before it, as in the example below.

#Extract item at index position 2
print("Item at t[2] is", t[2])


#Extract item at index position -2
print("Item at t[-2] is", t[-2])
Item at t[2] is 110
Item at t[-2] is 55

To extract more than one item from a sequence, use either:

  • \(seq[start:end]\) to extract items from a sequence from a start position up to an end position (excluding it).

  • \(seq[start:end:step]\) to extract every stepth item from the sequence starting from the start position to the end position (excluding it).

Below are two examples, A and B, that show these two ways of slicing a tuple.

_images/tuple-slicing.png

Fig. 7 Tuple slicing.#
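As a text version of the two slicing forms, here is a minimal sketch using tuple t from above. The start, end and step values, and the variable names part and every_second, are chosen only for illustration and are not necessarily those shown in Fig. 7.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

# seq[start:end]: items from index position 2 up to, but excluding, 5
part = t[2:5]
print("t[2:5] is", part)            # (110, 66, 75)

# seq[start:end:step]: every 2nd item from index position 1 up to, but excluding, 8
every_second = t[1:8:2]
print("t[1:8:2] is", every_second)  # (92, 66, 45, 57)
```

In both cases the result is a new tuple; t itself is left unchanged.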

Below are code examples of other operations you can do with tuples:

#Extract items from the beginning of the tuple to index position 2 (excluded).
print("t[:2] is", t[:2])

#Extract items from index position 5 to the end of the tuple.
print("t[5:] is", t[5:])

#print all items in a tuple
print("\nPrint all items in tuple t", t)

# Tuple concatenation: add tuples together
t2 = t + (70,)   #you have to specify the one item as a tuple as well
print("\nConcatenate t to (70,) results in", t2) 

# Tuple replication:  replicate contents of a tuple by a specified number of times
t3 = t * 2
print("\nReplicating tuple t by 2:", t3)

# Tuple unpacking - place data items of tuple in separate variables
print("\nTuple unpacking:")
t4 = ("Alexia", "MCB", "1B")
(name, subject, year) = t4
print("Name is", name)
print("Subject is", subject)
print("Year is", year)
t[:2] is (55, 92)
t[5:] is (45, 40, 57, 55, 62)

Print all items in tuple t (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

Concatenate t to (70,) results in (55, 92, 110, 66, 75, 45, 40, 57, 55, 62, 70)

Replicating tuple t by 2: (55, 92, 110, 66, 75, 45, 40, 57, 55, 62, 55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

Tuple unpacking:
Name is Alexia
Subject is MCB
Year is 1B

Tuples are immutable#

As mentioned previously, tuples are immutable, meaning that they cannot be changed. Below is an example showing what will happen if we try to change the contents of a tuple.

# change the data item at index position 2 of tuple t
t[2] = 80
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [4], line 2
      1 # change the data item at index position 2 of tuple t
----> 2 t[2] = 80

TypeError: 'tuple' object does not support item assignment

As you can see, Python throws an error stating explicitly that the 'tuple' object does not support item assignment. Since they cannot be modified, tuples are useful when you have a fixed sequence of data items that you know are not going to change, e.g., days of the week or months of the year. Tuples can also be slightly faster than lists. However, this depends on the size of your data and the difference is normally not significant.

If you want to be able to modify the sequence of data items, use a list instead. You can convert a tuple to a list with the list() function, e.g., l = list(t) creates a variable l of type list that contains the same data items as t.
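A short sketch of this conversion, assuming tuple t from above. The name t_new is only illustrative, and converting back with tuple() is shown as one possible way to obtain an updated tuple, not as a required step.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

# Copy the data items of t into a new list, which can be modified
l = list(t)
l[2] = 80

# Optionally build a new tuple from the modified list
t_new = tuple(l)

print("t is still", t)    # the original tuple is unchanged
print("l is", l)
print("t_new is", t_new)
```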

Membership operators#

Membership operators are used to test for membership in Sequence types. For a tuple t, the in membership operator (x in t) returns True if at least one data item in t is equal to x, otherwise it returns False. The not in non-membership operator does the opposite: x not in t returns True if no data item in t is equal to x, otherwise it returns False. The code below shows an example of using membership operators with tuples.

#Check if number 2 is one of the items of tuple t
2 in t      #returns False

#Check if number 55 is one of the items of tuple t
55 in t     #returns True

#Check that 2 is not one of the items of tuple t
2 not in t  #returns True
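Membership operators work with any data type, not only numbers. The sketch below uses a hypothetical tuple of working days (the names days, is_working_day and is_day_off are made up for this example) and stores each result in a variable so that it can be printed or reused.

```python
# Hypothetical tuple of working days, used only for illustration
days = ("Mon", "Tue", "Wed", "Thu", "Fri")

is_working_day = "Wed" in days      # True: "Wed" is one of the items
is_day_off = "Sun" not in days      # True: no item is equal to "Sun"

print("Is Wed a working day?", is_working_day)
print("Is Sun a day off?", is_day_off)
```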

Other useful functions#

Below is a list of other useful functions that can be used with tuples. Note that these functions do not make any changes to the tuple; they perform a calculation or operation on the data items of the tuple and return the result.

Table 5 Useful functions for operations with tuples#

| Function | Description | Example |
|---|---|---|
| len(t) | returns the number of data items in tuple t. | len(t) returns 10 |
| min(t) | returns the smallest data item in t. | min(t) returns 40 |
| max(t) | returns the largest data item in t. | max(t) returns 110 |
| t.index(x) | returns the index position of the leftmost occurrence of data item x in tuple t, or throws an error if x is not present. | t.index(55) returns 0 |
| t.count(x) | returns the number of times x is present in t. | t.count(55) returns 2 |
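The functions in Table 5 can be tried on tuple t as in the sketch below. Because t.index(x) throws an error (a ValueError) when x is not present, the last part checks for membership first; this check, and the variable names x and position, are just one possible approach and assume you are familiar with if statements.

```python
t = (55, 92, 110, 66, 75, 45, 40, 57, 55, 62)

print("Number of items:", len(t))            # 10
print("Smallest item:", min(t))              # 40
print("Largest item:", max(t))               # 110
print("Leftmost position of 55:", t.index(55))  # 0
print("55 appears", t.count(55), "times")    # 2

# Avoid the error from t.index(x) by testing for membership first
x = 2
if x in t:
    position = t.index(x)
else:
    position = None   # x is not one of the items of t
print("Index position of", x, "is", position)
```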

Lists#

Lists hold a one-dimensional sequence of data items. They are similar to tuples, but differ from them in two main ways:

  1. how they are declared

  2. they are mutable objects

Lists are declared as a comma-separated list of data items enclosed in square brackets, rather than parentheses. Let us create a list l with the same data items as tuple t above:

l = [55, 92, 110, 66, 75, 45, 40, 57, 55, 62]

Lists are also a Sequence type in Python and therefore support slicing, membership operators, concatenation, replication and the other utility functions we mentioned in the Tuples section.

Exercise 3 (Exploring lists)

Level:

Explore lists by trying the code we used in the Tuples section as follows:

  1. Create a list using the code above.

  2. Explore list index positions by adapting this code.

  3. Explore list slicing by adapting the code in Fig. 7 and this code.

  4. Test for membership of data items in lists using the membership operators as in this example.

  5. Try other useful functions on lists as specified in Table 5.

Lists are mutable#

Unlike tuples, lists are mutable, which means that we can modify them. For example:

#print contents of l and get id of l
print(l)
print("l is referencing object", id(l), "in memory")

# change the data item at index position 2 of list l
l[2] = 80
print("\nAfter l[2] = 80, l is:", l)
print("l is referencing object", id(l), "in memory")
[55, 92, 110, 66, 75, 45, 40, 57, 55, 62]
l is referencing object 140504391354560 in memory

After l[2] = 80, l is: [55, 92, 80, 66, 75, 45, 40, 57, 55, 62]
l is referencing object 140504391354560 in memory

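When we tried to apply a similar operation to a tuple, Python threw an exception, because tuples are immutable. Since lists are mutable, the code above successfully changes the data item at index position 2 of l. The id of l is the same before and after the change, which shows that the existing list object was modified rather than a new one being created.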

List utility methods#

In addition to the sequence methods in Table 5, we can use other methods with lists. This is mainly because lists can be modified.

Table 6 List utility methods#

| Method or statement | Description | Example |
|---|---|---|
| l.append(x) | appends data item x at the end of list l. | l.append(100) |
| l.insert(i, x) | inserts data item x in list l at index position i. | l.insert(1, 44) |
| l.remove(x) | removes the leftmost occurrence of the data item that is equal to x, or throws an error if x is not present in list l. | l.remove(55) |
| l.reverse() | reverses the order of the data items in l. | l.reverse() |
| l.sort() | sorts the data items of list l in ascending order. | l.sort() |
| del l[start:end] | the del statement removes the data items from start position to end position (excluded) in list l. | del l[7:9] |
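The sketch below applies the operations from Table 6 one after the other. To leave the examples on list l for you to try yourself, it uses a hypothetical list called marks; the name and its values are made up for this illustration.

```python
# Hypothetical list of marks, used only for illustration
marks = [70, 45, 90, 45, 60]

marks.append(100)      # add 100 at the end
print(marks)           # [70, 45, 90, 45, 60, 100]

marks.insert(1, 44)    # insert 44 at index position 1
print(marks)           # [70, 44, 45, 90, 45, 60, 100]

marks.remove(45)       # remove the leftmost 45 only
print(marks)           # [70, 44, 90, 45, 60, 100]

marks.reverse()        # reverse the order of the items
print(marks)           # [100, 60, 45, 90, 44, 70]

marks.sort()           # sort in ascending order
print(marks)           # [44, 45, 60, 70, 90, 100]

del marks[1:3]         # remove items at index positions 1 and 2
print(marks)           # [44, 70, 90, 100]
```

Each of these operations modifies marks itself instead of returning a new list, which is why the list is printed again after every step.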

Methods vs Functions

We have already seen how to call functions. Functions are called by their name alone, e.g., print(). Methods are similar to functions, but they are defined inside classes and are called on an object of that class, so they depend on it. We will not be going into the details of object-oriented programming in this course, but we have already encountered some methods of tuples and lists, e.g., l.append(x). Here l is a reference to an object of the list class and append() is a method inside the list class. The dot . after l is used to access the methods associated with the list object. When you type the dot after l in PyCharm, a pop-up menu is shown with the list of all methods associated with list.
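The difference in how functions and methods are called can be seen in this small sketch. The list contents and the variable name n are chosen only for illustration.

```python
l = [55, 92, 110, 66]

# Function: called by its name, with the list passed in as an argument
n = len(l)
print("len(l) returns", n)      # 4

# Method: called on the list object itself, using the dot
l.append(75)
print("After l.append(75):", l)  # [55, 92, 110, 66, 75]
```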

Exercise 4

Level:

Try the examples in Table 6. Check the output of list l after running every example.