Chapter 5. Lists, tuples, and sets
In this chapter, I discuss the two major Python sequence types: lists and tuples. At first, lists may remind you of arrays in many other languages, but don't be fooled: lists are a good deal more flexible and powerful than plain arrays.
Tuples are like lists that can't be modified; you can think of them as a restricted type of list or as a basic record type. I discuss the need for such a restricted data type later in the chapter. This chapter also discusses a newer Python collection type: sets. Sets are useful when an object's membership in the collection, as opposed to its position, is important.
Most of the chapter is devoted to lists, because if you understand lists, you pretty much understand tuples. The last part of the chapter discusses the differences between lists and tuples in both functional and design terms.
5.1. Lists are like arrays
A list in Python is much the same thing as an array in Java or C or any other language; it's an ordered collection of objects. You create a list by enclosing a comma-separated list of elements in square brackets, like so:
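A minimal sketch of list creation (the variable name and values are illustrative):

```python
# Square brackets around comma-separated elements create a list.
x = [1, 2, 3]
```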
Note that you don't have to worry about declaring the list or fixing its size ahead of time. This line creates the list as well as assigns it, and a list automatically grows or shrinks as needed.
ARRAYS IN PYTHON
A typed array module available in Python provides arrays based on C data types. Information on its use can be found in the Python Library Reference. I suggest that you look into it only if you really need the performance improvement. If a situation calls for numerical computations, you should consider using NumPy, mentioned in chapter 4 and available at www.scipy.org.
Unlike lists in many other languages, Python lists can contain different types of elements; a list element can be any Python object. Here's a list that contains a variety of elements:
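One possible example, with elements chosen here purely for illustration:

```python
# An integer, a string, and another list can all live in the same list.
x = [2, "two", [1, 2, 3]]
```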
Probably the most basic built-in list function is the len function, which returns the number of elements in a list:
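A short sketch using an illustrative list with one nested list:

```python
x = [2, "two", [1, 2, 3]]

# The nested list counts as a single element of x.
number_of_elements = len(x)   # 3
```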
Note that the len function doesn't count the items in the inner, nested list.
QUICK CHECK: LEN()
What would len() return for each of the following: [0]; []; [[1, 3, [4, 5], 6], 7]?
5.2. List indices
Understanding how list indices work will make Python much more useful to you. Please read the whole section!
Elements can be extracted from a Python list by using a notation like C's array indexing. Like C and many other languages, Python starts counting from 0; asking for element 0 returns the first element of the list, asking for element 1 returns the second element, and so forth. Here are a few examples:
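A minimal sketch, assuming a four-element list named x:

```python
x = ["first", "second", "third", "fourth"]

first_item = x[0]   # "first"
third_item = x[2]   # "third"
```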
But Python indexing is more flexible than C indexing. If indices are negative numbers, they indicate positions counting from the end of the list, with -1 being the last position in the list, -2 being the second-to-last position, and so forth. Continuing with the same list x, you can do the following:
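For instance (the list is redefined here so that the snippet stands alone):

```python
x = ["first", "second", "third", "fourth"]

last_item = x[-1]             # "fourth"
second_to_last_item = x[-2]   # "third"
```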
For operations involving a single list index, it's generally satisfactory to think of the index as pointing at a particular element in the list. For more advanced operations, it's more correct to think of list indices as indicating positions between elements. In the list ["first", "second", "third", "fourth"], you can think of the indices as pointing like this:
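The original diagram isn't reproduced here; the comment below is a rough reconstruction of the idea, drawn as an assumption about what such a picture would show:

```python
x = ["first", "second", "third", "fourth"]

# Positions sit between elements:
#
#   x = [   "first",   "second",   "third",   "fourth"   ]
#         0          1           2          3              (positive indices)
#        -4         -3          -2         -1              (negative indices)

# A single index selects the element that starts at that position.
element_at_1 = x[1]         # "second"
element_at_minus_1 = x[-1]  # "fourth"
```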
This is irrelevant when you're extracting a single element, but Python can extract or assign to an entire sublist at once, an operation known as slicing. Instead of entering list[index] to extract the item just after index, enter list[index1:index2] to extract all items including index1 and up to (but not including) index2 into a new list. Here are some examples:
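A few illustrative slices of a four-element list:

```python
x = ["first", "second", "third", "fourth"]

middle = x[1:-1]           # ["second", "third"]
first_three = x[0:3]       # ["first", "second", "third"]
only_third = x[-2:-1]      # ["third"]
```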
It may seem reasonable that if the second index indicates a position in the list before the first index, this code would return the elements between those indices in reverse order, but this isn't what happens. Instead, this code returns an empty list:
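A small sketch of that case, with an illustrative list:

```python
x = ["first", "second", "third", "fourth"]

# The slice starts at the last position and stops at position 2,
# which comes earlier, so nothing is selected.
backwards = x[-1:2]   # []
```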
When slicing a list, it's also possible to leave out index1 or index2. Leaving out index1 means "Go from the beginning of the list," and leaving out index2 means "Go to the end of the list":
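For example (list contents are illustrative):

```python
x = ["first", "second", "third", "fourth"]

from_start = x[:3]   # ["first", "second", "third"]
to_end = x[2:]       # ["third", "fourth"]
```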
Omitting both indices makes a new list that goes from the beginning to the end of the original list (that is, copies the list). This technique is useful when you want to make a copy that you can modify without affecting the original list:
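A minimal sketch of copying with a full slice and then changing only the copy:

```python
x = ["first", "second", "third", "fourth"]

y = x[:]          # a new list with the same elements
y[0] = "1 st"     # modifies y only

# x is still ["first", "second", "third", "fourth"]
# y is now   ["1 st", "second", "third", "fourth"]
```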
TRY THIS: LIST SLICES AND INDEXES
Using what you know about the len() function and list slices, how would you combine the two to get the second half of a list when you don't know what size it is? Experiment in the Python shell to confirm that your solution works.
5.3. Modifying lists
You can use list index notation to modify a list as well as to extract an element from it. Put the index on the left side of the assignment operator:
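For instance, with an illustrative list of numbers:

```python
x = [1, 2, 3, 4]

x[1] = "two"   # replace the second element
# x is now [1, "two", 3, 4]
```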
Slice notation can be used here too. Saying something like lista[index1:index2] = listb causes all elements of lista between index1 and index2 to be replaced by the elements in listb. listb can have more or fewer elements than are removed from lista, in which case the length of lista is altered. You can use slice assignment to do several things, as shown here:
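A sketch of three common uses of slice assignment (values chosen for illustration):

```python
x = [1, 2, 3, 4]

x[len(x):] = [5, 6, 7]   # append a list to the end
# x is now [1, 2, 3, 4, 5, 6, 7]

x[:0] = [-1, 0]          # insert a list at the front
# x is now [-1, 0, 1, 2, 3, 4, 5, 6, 7]

x[1:-1] = []             # remove everything except the first and last items
# x is now [-1, 7]
```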
Appending a single element to a list is such a common operation that there's a special append method for it:
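A minimal example:

```python
x = [1, 2, 3]

x.append("four")
# x is now [1, 2, 3, "four"]
```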
One problem can occur if you try to append one list to another. The list gets appended as a single element of the main list:
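To illustrate with two hypothetical lists x and y:

```python
x = [1, 2, 3, 4]
y = [5, 6, 7]

x.append(y)
# x is now [1, 2, 3, 4, [5, 6, 7]]: y became one nested element
```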
The extend method is like the append method except that it allows you to add one list to another:
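Using the same illustrative lists, extend adds the elements individually:

```python
x = [1, 2, 3, 4]
y = [5, 6, 7]

x.extend(y)
# x is now [1, 2, 3, 4, 5, 6, 7]
```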
There's also a special insert method to insert new list elements between two existing elements or at the front of the list. insert is used as a method of lists and takes two additional arguments. The first additional argument is the index position in the list where the new element should be inserted, and the second is the new element itself:
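A short sketch with illustrative values:

```python
x = [1, 2, 3]

x.insert(2, "hello")   # insert before the element at index 2
# x is now [1, 2, "hello", 3]

x.insert(0, "start")   # insert at the front
# x is now ["start", 1, 2, "hello", 3]
```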
insert understands list indices as discussed in section 5.2, but for most uses, it's easiest to think of list.insert(n, elem) as meaning insert elem just before the nth element of list. insert is just a convenience method. Anything that can be done with insert can also be done with slice assignment. That is, list.insert(n, elem) is the same thing as list[n:n] = [elem] when n is nonnegative. Using insert makes for somewhat more readable code, and insert even handles negative indices:
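The sketch below shows a negative index and, for comparison, the slice-assignment equivalent for a nonnegative index (both lists are illustrative):

```python
x = [1, 2, 3]
x.insert(-1, "hello")   # insert just before the last element
# x is now [1, 2, "hello", 3]

y = [1, 2, 3]
y[1:1] = ["new"]        # same effect as y.insert(1, "new")
# y is now [1, "new", 2, 3]
```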
The del statement is the preferred method of deleting list items or slices. It doesn't do anything that can't be done with slice assignment, but it's usually easier to remember and easier to read:
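For example, deleting one element and then a slice from an illustrative list:

```python
x = ["a", 2, "c", 7, 9, 11]

del x[1]    # remove the element at index 1
# x is now ["a", "c", 7, 9, 11]

del x[:2]   # remove the first two elements
# x is now [7, 9, 11]
```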
In general, del list[n] does the same thing as list[n:n+1] = [], whereas del list[m:n] does the same thing as list[m:n] = [].
The remove method isn't the converse of insert. Whereas insert inserts an element at a specified location, remove looks for the first instance of a given value in a list and removes that value from the list:
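A minimal sketch, using a list that contains the value 3 twice:

```python
x = [1, 2, 3, 4, 3, 5]

x.remove(3)   # removes only the first 3
# x is now [1, 2, 4, 3, 5]

x.remove(3)   # removes the remaining 3
# x is now [1, 2, 4, 5]
```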
If remove can't find anything to remove, it raises an error. You can catch this error by using the exception-handling abilities of Python, or you can avoid the problem by using in to check for the presence of something in a list before attempting to remove it.
The reverse method is a more specialized list modification method. It efficiently reverses a list in place:
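For instance:

```python
x = [1, 3, 5, 6, 7]

x.reverse()   # reverses x itself; no new list is created
# x is now [7, 6, 5, 3, 1]
```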
TRY THIS: MODIFYING LISTS
Suppose that you have a list 10 items long. How might you move the last three items from the end of the list to the beginning, keeping them in the same order?
5.4. Sorting lists
Lists can be sorted by using the built-in Python sort method:
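A minimal example with illustrative numbers:

```python
x = [3, 8, 4, 0, 2, 1]

x.sort()
# x is now [0, 1, 2, 3, 4, 8]
```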
This method does an in-place sort; that is, it changes the list being sorted. To sort a list without changing the original list, you have two options. You can use the sorted() built-in function, discussed in section 5.4.2, or you can make a copy of the list and sort the copy:
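A sketch of the copy-then-sort option:

```python
x = [2, 4, 1, 3]

y = x[:]    # copy x with a full slice
y.sort()    # sort only the copy

# y is [1, 2, 3, 4]; x is still [2, 4, 1, 3]
```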
Sorting works with strings, too:
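For example, with an illustrative list of capitalized words:

```python
x = ["Life", "Is", "Enchanting"]

x.sort()
# x is now ["Enchanting", "Is", "Life"]
```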
The sort method can sort just about anything because Python can compare just about anything. But there's one caveat in sorting: The default key method used by sort requires all items in the list to be of comparable types. That means that using the sort method on a list containing both numbers and strings raises an exception:
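The sketch below triggers that exception and captures its message; the try/except wrapper is added here only so the snippet runs to completion:

```python
x = [1, 2, "hello", 3]

try:
    x.sort()
except TypeError as error:
    # Python 3 can't order a str against an int.
    message = str(error)
```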
Conversely, you can sort a list of lists:
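An illustrative list of two-element sublists:

```python
x = [[3, 5], [2, 9], [2, 3], [4, 1], [3, 2]]

x.sort()
# x is now [[2, 3], [2, 9], [3, 2], [3, 5], [4, 1]]
```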
According to the built-in Python rules for comparing complex objects, the sublists are sorted first by ascending first element and then by ascending second element.
sort is even more flexible; it has an optional reverse parameter that causes the sort to be in reverse order when reverse=True, and it's possible to use your own key function to determine how elements of a list are sorted.
5.4.1. Custom sorting
To use custom sorting, you need to be able to define functions, something I haven't talked about yet. In this section, I also discuss the fact that len(string) returns the number of characters in a string. String operations are discussed more fully in chapter 6.
By default, sort uses built-in Python comparison functions to determine ordering, which is satisfactory for most purposes. At times, though, you want to sort a list in a way that doesn't correspond to this default ordering. Suppose that you want to sort a list of words by the number of characters in each word, as opposed to the lexicographic sort that Python normally carries out.
To do this, write a function that returns the value, or key, that you want to sort on, and use it with the sort method. That function in the context of sort is a function that takes one argument and returns the key or value that the sort function is to use.
For number-of-characters ordering, a suitable key function could be
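One way to write it (the function and parameter names are illustrative choices):

```python
def compare_num_of_chars(string1):
    # The sort key for a word is simply its length.
    return len(string1)
```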
This key function is trivial. It passes the length of each string back to the sort method, rather than the strings themselves.
After you define the key function, using it is a matter of passing it to the sort method by using the key keyword. Because functions are Python objects, they can be passed around like any other Python objects. Here's a small program that illustrates the difference between a default sort and your custom sort:
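A sketch of such a program; the word list and names are assumptions made for the example:

```python
def compare_num_of_chars(string1):
    return len(string1)

word_list = ["Python", "is", "better", "than", "C"]

word_list.sort()                          # default ordering
default_order = word_list[:]
# ["C", "Python", "better", "is", "than"]

word_list.sort(key=compare_num_of_chars)  # order by length
length_order = word_list[:]
# ["C", "is", "than", "Python", "better"]
```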
The first list is in lexicographical order (with uppercase coming before lowercase), and the second list is ordered by ascending number of characters.
Custom sorting is very useful, but if performance is critical, it may be slower than the default. Usually, this effect is minimal, but if the key function is particularly complex, the effect may be more than desired, especially for sorts involving hundreds of thousands or millions of elements.
One particular place to avoid custom sorts is where you want to sort a list in descending, rather than ascending, order. In this case, use the sort method's reverse parameter set to True. If for some reason you don't want to do that, it's still better to sort the list normally and then use the reverse method to invert the order of the resulting list. These two operations together (the standard sort and the reverse) will still be much faster than a custom sort.
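Both approaches can be sketched as follows, with illustrative values:

```python
x = [3, 8, 4, 0, 2, 1]
x.sort(reverse=True)   # descending order in one step
# x is now [8, 4, 3, 2, 1, 0]

y = [3, 8, 4, 0, 2, 1]
y.sort()               # ascending sort...
y.reverse()            # ...then invert the result
# y is now [8, 4, 3, 2, 1, 0]
```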
5.4.2. The sorted() function
Lists have a built-in method to sort themselves, but other iterables in Python, such as the keys of a dictionary, don't have a sort method. Python also has the built-in function sorted(), which returns a sorted list from any iterable. sorted() uses the same key and reverse parameters as the sort method:
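For example, sorting an illustrative tuple, which has no sort method of its own:

```python
x = (4, 3, 1, 2)

y = sorted(x)                 # [1, 2, 3, 4]
z = sorted(x, reverse=True)   # [4, 3, 2, 1]

# x itself is unchanged: (4, 3, 1, 2)
```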
TRY THIS: SORTING LISTS
Suppose that you have a list in which each element is in turn a list: [[1, 2, 3], [2, 1, 3], [4, 0, 1]]. If you wanted to sort this list by the second element in each list so that the result would be [[4, 0, 1], [2, 1, 3], [1, 2, 3]], what function would you write to pass as the key value to the sort() method?
5.5. Other common list operations
Several other list methods are frequently useful, but they don't fall into any specific category.
5.5.1. List membership with the in operator
It's easy to test whether a value is in a list by using the in operator, which returns a Boolean value. You can also use the converse, the not in operator:
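A few illustrative membership tests:

```python
found = 3 in [1, 3, 4, 5]                   # True
missing = 3 not in [1, 3, 4, 5]             # False
in_words = 3 in ["one", "two", "three"]     # False
```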
5.5.2. List concatenation with the + operator
To create a list by concatenating two existing lists, use the + (list concatenation) operator, which leaves the argument lists unchanged:
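For instance:

```python
x = [1, 2, 3]
y = [4, 5]

z = x + y
# z is [1, 2, 3, 4, 5]; x and y are unchanged
```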
5.5.3. List initialization with the * operator
Use the * operator to produce a list of a given size, which is initialized to a given value. This operation is a common one for working with large lists whose size is known ahead of time. Although you can use append to add elements and automatically expand the list as needed, you obtain greater efficiency by using * to correctly size the list at the start of the program. A list that doesn't change in size doesn't incur any memory reallocation overhead:
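A minimal sketch that presizes a list of four placeholders:

```python
z = [None] * 4
# z is [None, None, None, None]
```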
When used with lists in this manner, * (which in this context is called the list multiplication operator) replicates the given list the indicated number of times and joins all the copies to form a new list. This is the standard Python method for defining a list of a given size ahead of time. A list containing a single instance of None is commonly used in list multiplication, but the list can be anything:
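For example, replicating an illustrative two-element list:

```python
z = [3, 1] * 2
# z is [3, 1, 3, 1]
```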
5.5.4. List minimum or maximum with min and max
You can use min and max to find the smallest and largest elements in a list. You'll probably use min and max mostly with numerical lists, but you can use them with lists containing any type of element. Trying to find the maximum or minimum object in a set of objects of different types causes an error if comparing those types doesn't make sense:
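A sketch showing one working call and one failing call; the try/except wrapper is added here only so the snippet runs to completion:

```python
smallest = min([3, 7, 0, -2, 11])   # -2

try:
    max([4, "Hello", [1, 2]])
except TypeError as error:
    # An int, a str, and a list can't be ordered against one another.
    message = str(error)
```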
5.5.5. List search with index
If you want to find where in a list a value can be found (rather than wanting to know only whether the value is in the list), use the index method. This method searches through a list looking for a list element equivalent to a given value and returns the position of that list element:
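For instance, with an illustrative mixed list:

```python
x = [1, 3, "five", 7, -2]

position = x.index(7)   # 3
```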
Attempting to find the position of an element that doesn't exist in the list raises an error, as shown here. This error can be handled in the same manner as the analogous error that can occur with the remove method (that is, by testing the list with in before using index).
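A sketch of the failing lookup; the exception handling and the fallback value None are choices made for this example:

```python
x = [1, 3, "five", 7, -2]

try:
    position = x.index(5)   # 5 is not in x
except ValueError:
    position = None         # illustrative fallback
```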
5.5.6. List matches with count
count also searches through a list, looking for a given value, but it returns the number of times that the value is found in the list rather than positional information:
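For example:

```python
x = [1, 2, 2, 3, 5, 2, 5]

twos = x.count(2)    # 3
fives = x.count(5)   # 2
fours = x.count(4)   # 0: no error for a missing value
```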
5.5.7. Summary of list operations
You can see that lists are very powerful data structures, with possibilities that go far beyond those of plain old arrays. List operations are so important in Python programming that it's worth laying them out for easy reference, as shown in table 5.1.
Table 5.1. List operations
| List operation | Explanation | Example |
| --- | --- | --- |
| [] | Creates an empty list | x = [] |
| len | Returns the length of a list | len(x) |
| append | Adds a single element to the end of a list | x.append('y') |
| extend | Adds another list to the end of the list | x.extend(['a', 'b']) |
| insert | Inserts a new element at a given position in the list | x.insert(0, 'y') |
| del | Removes a list element or slice | del(x[0]) |
| remove | Searches for and removes a given value from a list | x.remove('y') |
| reverse | Reverses a list in place | x.reverse() |
| sort | Sorts a list in place | x.sort() |
| + | Adds two lists together | x1 + x2 |
| * | Replicates a list | x = ['y'] * 3 |
| min | Returns the smallest element in a list | min(x) |
| max | Returns the largest element in a list | max(x) |
| index | Returns the position of a value in a list | x.index('y') |
| count | Counts the number of times a value occurs in a list | x.count('y') |
| sum | Sums the items (if they can be summed) | sum(x) |
| in | Returns whether an item is in a list | 'y' in x |
Being familiar with these list operations will make your life as a Python coder much easier.
QUICK CHECK: LIST OPERATIONS
What would be the result of len([[1,2]] * 3)?
What are two differences between using the in operator and a list's index() method?
Which of the following will raise an exception?: min(["a", "b", "c"]); max([1, 2, "three"]); [1, 2, 3].count("one")
TRY THIS: LIST OPERATIONS
If you have a list x, write the code to safely remove an item if, and only if, that value is in the list.
Modify that code to remove the element only if the item occurs in the list more than once.
5.6. Nested lists and deep copies
This section covers another advanced topic that you may want to skip if you're just learning the language.
Lists can be nested. One application of nesting is to represent two-dimensional matrices. The members of these matrices can be referred to by using two-dimensional indices. Indices for these matrices work as follows:
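A minimal sketch with an illustrative 3-by-3 matrix named m:

```python
m = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]

first_row = m[0]     # [0, 1, 2]
item = m[0][1]       # 1: row 0, column 1
last_row = m[2]      # [20, 21, 22]
corner = m[2][2]     # 22: row 2, column 2
```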
This mechanism scales to higher dimensions in the manner you'd expect.
Most of the time, this is all you need to concern yourself with. But you may run into an issue with nested lists; specifically the way that variables refer to objects and how some objects (such as lists) can be modified (are mutable). An example is the best way to illustrate:
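A sketch of such a setup; the names nested and original follow the surrounding text, and the values are illustrative:

```python
nested = [0]
original = [nested, 1]

# original is [[0], 1], and its first item is the very same
# list object that the variable nested refers to.
```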
Figure 5.1 shows what this example looks like.
Figure 5.1. A list with its first item referring to a nested list
Now the value in the nested list can be changed by using either the nested or the original variables:
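For instance (the setup is repeated so that the snippet stands alone):

```python
nested = [0]
original = [nested, 1]

nested[0] = "zero"      # change through the nested variable
after_first_change = original[0][0]   # "zero"

original[0][0] = 0      # change through the original variable
after_second_change = nested[0]       # 0
```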
But if nested is set to another list, the connection between them is broken:
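A sketch of the rebinding, again with illustrative values:

```python
nested = [0]
original = [nested, 1]

nested = [2]   # nested now refers to a brand-new list

# original still holds the old list: [[0], 1]
```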
Figure 5.2 illustrates this condition.
Figure 5.2. The first item of the original list is still a nested list, but the nested variable refers to a different list.
You've seen that you can obtain a copy of a list by taking a full slice (that is, x[:]). You can also obtain a copy of a list by using the + or * operator (for example, x + [] or x * 1). These techniques are slightly less efficient than the slice method. All three create what is called a shallow copy of the list, which is probably what you want most of the time. But if your list has other lists nested in it, you may want to make a deep copy. You can do this with the deepcopy function of the copy module:
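A minimal sketch contrasting the two kinds of copy (variable names are illustrative):

```python
import copy

original = [[0], 1]

shallow = original[:]             # new outer list, same nested list
deep = copy.deepcopy(original)    # new outer list and new nested list
```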
See figure 5.3 for an illustration.
Figure 5.3. A shallow copy doesn't copy nested lists.
The lists pointed at by the original or shallow variables are connected. Changing the value in the nested list through either one of them affects the other:
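For instance (setup repeated so that the snippet stands alone):

```python
original = [[0], 1]
shallow = original[:]

shallow[1] = 2            # top-level change: affects shallow only
# original is still [[0], 1]

shallow[0][0] = "zero"    # nested change: visible through both
# original is now [["zero"], 1]
```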
The deep copy is independent of the original, and no change to it has any effect on the original list:
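A short sketch with illustrative values:

```python
import copy

original = [[0], 1]
deep = copy.deepcopy(original)

deep[0][0] = 5
# deep is now [[5], 1]; original is still [[0], 1]
```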
This behavior is the same for any other nested objects in a list that are modifiable (such as dictionaries).
Now that you've seen what lists can do, it's time to look at tuples.
TRY THIS: LIST COPIES
Suppose that you have the following list: x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] What code could you use to get a copy y of that list in which you could change the elements without the side effect of changing the contents of x?
5.7. Tuples
Tuples are data structures that are very similar to lists, but they can't be modified; they can only be created. Tuples are so much like lists that you may wonder why Python bothers to include them. The reason is that tuples have important roles that can't be efficiently filled by lists, such as keys for dictionaries.
5.7.1. Tuple basics
Creating a tuple is similar to creating a list: assign a sequence of values to a variable. A list is a sequence that's enclosed by [ and ]; a tuple is a sequence that's enclosed by ( and ):
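For example, with illustrative values:

```python
x = ("a", "b", "c")
```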
This line creates a three-element tuple.
After a tuple is created, using it is so much like using a list that it's easy to forget that tuples and lists are different data types:
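A few familiar operations applied to an illustrative tuple:

```python
x = ("a", "b", "c")

third = x[2]        # "c"
tail = x[1:]        # ("b", "c")
size = len(x)       # 3
largest = max(x)    # "c"
smallest = min(x)   # "a"
has_five = 5 in x   # False
```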
The main difference between tuples and lists is that tuples are immutable. An attempt to modify a tuple results in a confusing error message, which is Python's way of saying that it doesn't know how to set an item in a tuple:
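The sketch below provokes the error and captures its message; the try/except wrapper is added here only so the snippet runs to completion:

```python
x = ("a", "b", "c")

try:
    x[2] = "d"
except TypeError as error:
    # 'tuple' object does not support item assignment
    message = str(error)
```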
You can create tuples from existing ones by using the + and * operators:
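For instance:

```python
x = ("a", "b", "c")

doubled = x + x     # ("a", "b", "c", "a", "b", "c")
repeated = 2 * x    # ("a", "b", "c", "a", "b", "c")
```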
A copy of a tuple can be made in any of the same ways as for lists:
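The same three idioms shown earlier for lists, applied to an illustrative tuple:

```python
x = ("a", "b", "c")

by_slice = x[:]
by_multiplication = x * 1
by_addition = x + ()

# Each result equals ("a", "b", "c").
```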
If you didn't read section 5.6, you can skip the rest of this paragraph. Tuples themselves can't be modified. But if they contain any mutable objects (for example, lists or dictionaries), these objects may be changed if they're still assigned to their own variables. Tuples that contain mutable objects aren't allowed as keys for dictionaries.
5.7.2. One-element tuples need a comma
A small syntactical point is associated with using tuples. Because the square brackets used to enclose a list aren't used elsewhere in Python, it's clear that [] means an empty list and that [1] means a list with one element. The same thing isn't true of the parentheses used to enclose tuples. Parentheses can also be used to group items in expressions to force a certain evaluation order. If you say (x + y) in a Python program, do you mean that x and y should be added and then put into a one-element tuple, or do you mean that the parentheses should be used to force x and y to be added before any expressions to either side come into play?
This situation is a problem only for tuples with one element, because tuples with more than one element always include commas to separate the elements, and the commas tell Python that the parentheses indicate a tuple, not a grouping. In the case of one-element tuples, Python requires that the element in the tuple be followed by a comma, to disambiguate the situation. In the case of zero-element (empty) tuples, there's no problem. An empty set of parentheses must be a tuple because it's meaningless otherwise:
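A minimal sketch of the three cases, with illustrative values for x and y:

```python
x = 3
y = 4

grouped = (x + y)    # 7: parentheses only group the expression
single = (x + y,)    # (7,): the trailing comma makes a one-element tuple
empty = ()           # an empty tuple
```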
5.7.3. Packing and unpacking tuples
As a convenience, Python permits tuples to appear on the left side of an assignment operator, in which case variables in the tuple receive the corresponding values from the tuple on the right side of the assignment operator. Here's a simple example:
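One such example, with variable names chosen for illustration:

```python
(one, two, three, four) = (1, 2, 3, 4)

# one is 1, two is 2, three is 3, four is 4
```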
This example can be written even more simply, because Python recognizes tuples in an assignment context even without the enclosing parentheses. The values on the right side are packed into a tuple and then unpacked into the variables on the left side:
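The same illustrative assignment without parentheses:

```python
one, two, three, four = 1, 2, 3, 4
```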
One line of code has replaced the following four lines of code:
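Written out one assignment at a time, using the same illustrative names:

```python
one = 1
two = 2
three = 3
four = 4
```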
This technique is a convenient way to swap values between variables. Instead of saying
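(a sketch with two illustrative variables and a temporary)

```python
var1 = "a"
var2 = "b"

temp = var1
var1 = var2
var2 = temp
```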
simply say
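(the same illustrative swap, done with tuple packing and unpacking)

```python
var1 = "a"
var2 = "b"

var1, var2 = var2, var1
```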
To make things even more convenient, Python 3 has an extended unpacking feature, allowing an element marked with * to absorb any number of elements not matching the other elements. Again, some examples make this feature clearer:
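A sketch of four unpackings of the same illustrative tuple, each recorded as a tuple of the resulting values:

```python
x = (1, 2, 3, 4)

a, b, *c = x
star_last = (a, b, c)          # (1, 2, [3, 4])

a, *b, c = x
star_middle = (a, b, c)        # (1, [2, 3], 4)

*a, b, c = x
star_first = (a, b, c)         # ([1, 2], 3, 4)

a, b, c, d, *e = x
no_surplus = (a, b, c, d, e)   # (1, 2, 3, 4, [])
```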
Note that the starred element receives all the surplus items as a list and that if there are no surplus elements, the starred element receives an empty list.
Packing and unpacking can also be performed by using list delimiters:
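A few illustrative combinations of list and tuple delimiters:

```python
[a, b] = [1, 2]
[c, d] = 3, 4
[e, f] = (5, 6)
(g, h) = 7, 8
i, j = [9, 10]
```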
5.7.4. Converting between lists and tuples
Tuples can be easily converted to lists with the list function, which takes any sequence as an argument and produces a new list with the same elements as the original sequence. Similarly, lists can be converted to tuples with the tuple function, which does the same thing but produces a new tuple instead of a new list:
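For example:

```python
as_list = list((1, 2, 3, 4))     # [1, 2, 3, 4]
as_tuple = tuple([1, 2, 3, 4])   # (1, 2, 3, 4)
```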
As an interesting side note, list is a convenient way to break a string into characters:
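A one-line illustration with an arbitrary string:

```python
characters = list("Hello")
# ["H", "e", "l", "l", "o"]
```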
This technique works because list (and tuple) apply to any Python sequence, and a string is just a sequence of characters. (Strings are discussed fully in chapter 6.)
QUICK CHECK: TUPLES
Explain why the following operations aren't legal for the tuple x = (1, 2, 3, 4):
If you had a tuple x = (3, 1, 4, 2), how might you end up with x sorted?
5.8. Sets
A set in Python is an unordered collection of objects used when membership and uniqueness in the set are the main things you need to know about that object. Like dictionary keys (discussed in chapter 7), the items in a set must be immutable and hashable. This means that ints, floats, strings, and tuples can be members of a set, but lists, dictionaries, and sets themselves can't.
5.8.1. Set operations
In addition to the operations that apply to collections in general, such as in, len, and iteration in for loops, sets have several set-specific operations:
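The sketch below walks through the most common ones; the variable names and values are illustrative:

```python
x = set([1, 2, 3, 1, 3, 5])   # duplicates are dropped: {1, 2, 3, 5}

x.add(6)       # x is now {1, 2, 3, 5, 6}
x.remove(5)    # x is now {1, 2, 3, 6}

has_one = 1 in x    # True
has_four = 4 in x   # False

y = set([1, 7, 8, 9])

union = x | y                  # {1, 2, 3, 6, 7, 8, 9}
intersection = x & y           # {1}
symmetric_difference = x ^ y   # {2, 3, 6, 7, 8, 9}
```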
You can create a set by using set on a sequence, such as a list. When a sequence is made into a set, duplicates are removed. After creating a set by using the set function, you can use add and remove to change the elements in the set. The in keyword is used to check for membership of an object in a set. You can also use | to get the union, or combination, of two sets, & to get their intersection, and ^ to find their symmetric difference (that is, elements that are in one set or the other but not both).
These examples aren't a complete listing of set operations but are enough to give you a good idea of how sets work. For more information, refer to the official Python documentation.
5.8.2. Frozensets
Because sets aren't immutable and hashable, they can't belong to other sets. To remedy that situation, Python has another set type, frozenset, which is just like a set but can't be changed after creation. Because frozensets are immutable and hashable, they can be members of other sets:
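A minimal sketch; the try/except wrapper is added here only so the snippet runs to completion:

```python
x = set([1, 2, 3, 1, 3, 5])
z = frozenset(x)        # an immutable snapshot of x

try:
    z.add(6)            # frozensets have no add method
except AttributeError:
    add_failed = True

x.add(z)                # a frozenset can be a member of a set
# x now has five members: 1, 2, 3, 5, and frozenset({1, 2, 3, 5})
```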
QUICK CHECK: SETS
If you were to construct a set from the following list, how many elements would the set have?: [1, 2, 5, 1, 0, 2, 3, 1, 1, (1, 2, 3)]
LAB 5: EXAMINING A LIST
In this lab, the task is to read a set of temperature data (the monthly high temperatures at Heathrow Airport for 1948 through 2016) from a file and then find some basic information: the highest and lowest temperatures, the mean (average) temperature, and the median temperature (the temperature in the middle if all the temperatures are sorted).
The temperature data is in the file lab_05.txt in the source code directory for this chapter. Because I haven't yet discussed reading files, here's the code to read the file into a list:
You should find the highest and lowest temperature, the average, and the median. You'll probably want to use the min(), max(), sum(), len(), and sort() functions/methods.
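The original listing isn't reproduced here. The following is a sketch under the assumption that the file holds one numeric temperature per line; the function name read_temperatures is hypothetical:

```python
def read_temperatures(path):
    """Return the values in the file at path as a list of floats.

    Assumes one number per line; blank lines are skipped.
    """
    temperatures = []
    with open(path) as infile:
        for row in infile:
            row = row.strip()
            if row:
                temperatures.append(float(row))
    return temperatures

# Usage (requires the lab file to be present in the working directory):
# temperatures = read_temperatures("lab_05.txt")
```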
BONUS
Determine how many unique temperatures are in the list.
Summary
Lists and tuples are structures that embody the idea of a sequence of elements, as are strings.
Lists are like arrays in other languages, but with automatic resizing, slice notation, and many convenience functions.
Tuples are like lists but can't be modified, so they use less memory and can be dictionary keys (see chapter 7).
Sets are iterable collections, but they're unordered and can't have duplicate elements.