Implementation details
Many programmers easily confuse Python's list type with the concept of linked lists which are found often in standard libraries of other languages, such as C, C++, or Java. In fact, CPython lists are not lists at all. In CPython, lists are implemented as variable length arrays. This should be also true for other implementations, such as Jython and IronPython, although such implementation details are often not documented in these projects. The reasons for such confusion is clear. This datatype is named list and also has an interface that could be expected from any linked list implementation.
Why it is important and what does it mean? Lists are one of the most popular data structures, and the way in which they are used greatly affects every application's performance. CPython is the most popular and used implementation, so knowing its internal implementation details is crucial.
Lists in Python are contiguous arrays of references to other objects. The pointer to this array and the length is stored in the list's head structure. This means that every time an item is added or removed, the array of references needs to be resized (reallocated). Fortunately, in Python, these arrays are created with exponential over allocation, so not every operation requires an actual resize of the underlying array. This is how the amortized cost of appending and popping elements can be low in terms of complexity. Unfortunately, some other operations that are considered cheap in ordinary linked lists have relatively high computational complexity in Python:
- Inserting an item at an arbitrary place using the list.insert method has complexity O(n)
- Deleting an item using list.delete or using the del operator har has complexity O(n)
At least retrieving or setting an element using an index is an operation where cost is independent of the list's size, and the complexity of these operations is always O(1).
Let's define n as the length of a list. Here is a full table of average time complexities for most of the list operations:
For situations where a real linked list or doubly linked list is required, Python provides a deque type in the collections built-in module. This is a data structure that allows us to append and pop elements at each side with O(1) complexity. This is a generalization of stacks and queues, and should work fine anywhere where a doubly linked list is required.