Tuples vs Lists vs Sets in Python
In Python, there are four built-in data types that we can use to store collections of data. Each has different qualities and characteristics: List (list), Tuple (tuple), Set (set), and Dictionary (dict).
In this article, we are going to dig into the rabbit holes of Lists, Tuples, and Sets in Python. We will go through their differences and when to use each of these data types.
As Dictionary associates keys with their respective values, which is a very different use case compared to List, Tuple, and Set (which simply contain values), it won't be part of this discussion.
For the sake of simplicity, I will mention Set and Dictionary together wherever lookup behavior is concerned, as both are built on a Hash Table (or Hash Map).
Why do we care?
For the most part, these data types can be used interchangeably within an application without much trouble.
Yet, imagine if we were given a task to check if a needle exists in a sizable haystack. What would be the most efficient way in terms of speed and memory to do so?
Should the haystack be a List? What about a Tuple? Or why not always use a Set (or a Dictionary)? What are the caveats that we should look out for?
Let's dig in!
Differences between List, Tuple, and Set
Duplicates
If I were to explain this, List and Tuple are like siblings in Python. On the other hand, Set (or Dictionary) is like a cousin to both of them.
Unlike a List or Tuple, a Set cannot contain duplicates. In other words, the elements in a Set are unique.
set_example = {1, 1, 2, 3, 3, 3}
# {1, 2, 3}
fruit_set = {'🍎', '🍎', '🍊', '🍌', '🍌', '🍌'}
# {'🍎', '🍊', '🍌'}
With this knowledge in mind, we now know that Set can also be used to remove duplicates from a list!
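As a small sketch of that idea (the variable names here are made up for illustration): converting a list to a set drops the duplicates, but the original order is not guaranteed to survive the round trip. If order matters, dict.fromkeys is a common alternative on Python 3.7+.

```python
numbers = [3, 1, 3, 2, 1, 3]

# Round-tripping through a set removes duplicates, but the resulting
# order is an implementation detail, so sort if you need a stable result.
unique_sorted = sorted(set(numbers))
print(unique_sorted)  # [1, 2, 3]

# dict.fromkeys keeps the first occurrence of each item in its original
# position (dicts preserve insertion order on Python 3.7+).
unique_in_order = list(dict.fromkeys(numbers))
print(unique_in_order)  # [3, 1, 2]
```

Both approaches require the items to be hashable, so they will not work on a list of lists.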
Sorting Order
You might have heard the statement "Set and Dictionary are not ordered in Python." Well, that is only half the truth today, depending on which version of Python you are using.
Before Python 3.6, Dictionaries and Sets did not keep their insertion order. Here's an example if you try it out in Python 3.5:
# Example in Python 3.5
>>> fruit_size = {}
>>> fruit_size['🍎'] = 12
>>> fruit_size['🍊'] = 16
>>> fruit_size['🍌'] = 20
>>> fruit_size
{'🍎': 12, '🍌': 20, '🍊': 16}
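Today, that statement is out of date by a couple of years, at least for Dictionaries. Starting from Python 3.7, a Dictionary is officially guaranteed to preserve insertion order (CPython 3.6 already behaved this way as an implementation detail). A Set, however, is still unordered: its iteration order is not guaranteed and should not be relied upon.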
Anyway, in case you wondered, Lists and Tuples are ordered sequences of objects.
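Here is a minimal sketch of what insertion order looks like for a dictionary, assuming Python 3.7 or newer. The fruit names and sizes are illustrative only. Sets make no ordering guarantee, so the example deliberately does not print one.

```python
fruit_size = {}
fruit_size["cherry"] = 12
fruit_size["apple"] = 16
fruit_size["banana"] = 20

# On Python 3.7+, iteration follows insertion order, not alphabetical order.
print(list(fruit_size))  # ['cherry', 'apple', 'banana']

# Lists and tuples keep their order as well, and can be indexed by position.
sizes = tuple(fruit_size.values())
print(sizes[0])  # 12
```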
Mutability
When you describe an object as mutable, it's simply a fancy way of saying the object's internal state can be changed.
The key difference here is that Tuple is immutable (not changeable), whereas List and Set are mutable.
Although Sets are mutable, we cannot access or change any element of a Set via indexing or slicing. Hence, we can only add elements to a Set or remove them from it, not modify an element in place.
Do note that the update method of a Set simply adds multiple elements at once.
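The differences can be sketched in a few lines. The collection contents are arbitrary examples. Besides add and update, sets also provide remove and discard for deleting elements.

```python
fruit_list = ["apple", "banana"]
fruit_list[0] = "cherry"        # lists allow in-place item assignment
fruit_list.append("mango")

fruit_tuple = ("apple", "banana")
try:
    fruit_tuple[0] = "cherry"   # tuples do not
except TypeError as error:
    tuple_error = str(error)

fruit_set = {"apple", "banana"}
fruit_set.add("cherry")                 # add one element
fruit_set.update(["mango", "apple"])    # add several; the duplicate is ignored
fruit_set.discard("banana")             # remove an element if it is present

print(fruit_list)         # ['cherry', 'banana', 'mango']
print(tuple_error)        # 'tuple' object does not support item assignment
print(sorted(fruit_set))  # ['apple', 'cherry', 'mango']
```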
Indexing
Both Tuple and List support indexing and slicing, while Set does not.
fruit_list = ['🍎', '🍊', '🍌']
fruit_list[1]
# '🍊'
animal_tuple = ('🐶', '🐱', '🐮')
animal_tuple[2]
# '🐮'
vehicle_set = {'🚗', '🚕', '🚙'}
vehicle_set[0]
# TypeError: 'set' object is not subscriptable
When to use List vs. Tuple?
As we mentioned earlier, Tuples are immutable, whereas Lists are mutable. By the same token, Tuples are fixed size in nature, whereas Lists are dynamic.
a_tuple = tuple(range(1000))
a_list = list(range(1000))
a_tuple.__sizeof__()  # 8024 bytes
a_list.__sizeof__()  # 9088 bytes (exact figures vary by Python version and platform)
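If you want to reproduce the comparison yourself, here is a rough sketch using sys.getsizeof. It assumes CPython. The byte counts differ between versions and platforms, so none are shown as expected output. The point is only that a list is at least as large as the equivalent tuple, and that it may reserve spare capacity when it grows.

```python
import sys

a_tuple = tuple(range(1000))
a_list = list(range(1000))

tuple_size = sys.getsizeof(a_tuple)
list_size = sys.getsizeof(a_list)

# Appending may trigger a resize; CPython over-allocates so that
# repeated appends stay cheap on average.
a_list.append(1000)
grown_size = sys.getsizeof(a_list)

print(tuple_size, list_size, grown_size)
```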
Use List
- When you need to mutate your collection.
- When you need to remove or add new items to your collection of items.
Use Tuple
- If your data should not, or does not need to, be changed.
- Tuples are often a little faster than Lists, mostly at creation time. We should use a Tuple instead of a List if we are defining a constant set of values and all we are ever going to do with it is iterate through it.
- If we need a sequence of elements to be used as a dictionary key, we can use a Tuple, provided every element in it is itself hashable. As Lists are mutable (unhashable type), they can never be used as dictionary keys.
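To illustrate the last point, here is a small sketch with a hypothetical lookup table keyed by coordinate pairs. The city data is made up for the example.

```python
# Hypothetical lookup table keyed by (latitude, longitude) tuples.
city_by_coordinates = {
    (35.68, 139.69): "Tokyo",
    (48.86, 2.35): "Paris",
}
print(city_by_coordinates[(48.86, 2.35)])  # Paris

# A list is unhashable, so using one as a key raises TypeError.
try:
    city_by_coordinates[[51.51, -0.13]] = "London"
except TypeError as error:
    list_key_error = str(error)

print(list_key_error)
```

The exact wording of the error message depends on the Python version, but it always reports that the list is unhashable.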
When to use Set vs. List/Tuple?
As Set uses a Hash Table as its underlying data structure, Set is blazingly fast when it comes to checking if an element is inside it (e.g. x in a_set).
The idea behind it is that looking up an item in a hash table is, on average, an O(1) (constant time) operation, whereas a List or Tuple has to be scanned element by element, which is O(n).
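A rough way to see the difference on your own machine is to time both lookups with timeit. This is only a sketch: the haystack size and repetition count are arbitrary choices, and the measured times will vary from machine to machine, so no figures are shown.

```python
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)
needle = -1  # not present, which is the worst case for the list scan

# The list is scanned element by element (O(n)); the set hashes the
# needle and checks only the matching slot (O(1) on average).
list_seconds = timeit.timeit(lambda: needle in haystack_list, number=100)
set_seconds = timeit.timeit(lambda: needle in haystack_set, number=100)

print(f"list: {list_seconds:.6f}s  set: {set_seconds:.6f}s")
```

On a typical run the set lookup should come out orders of magnitude faster, and the gap widens as the haystack grows.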
"So, should I always use Set or Dictionary?"
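Essentially, if you do not need to store duplicates, preserve order, or index by position, and your elements are hashable, Set is going to be better than List for membership checks. The trade-off is that a hash table uses more memory than a List or Tuple of the same length.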
Summary
If you're a numbers geek like me, check out this speed comparison between Tuple, List, and Set when iterating or checking if an object is present in a collection.
What are the main takeaways?
- If you need to store duplicates, go for List or Tuple.
- For List vs. Tuple, consider mutability. If you need immutability, go for Tuple.
- If you do not need to store duplicates, go for Set or Dictionary. Hash maps are significantly faster when it comes to determining if an object is present in the collection (e.g. x in set_or_dict).