Python lists

Python lists are not really lists based on computer science’s definition of the word.  Classically trained programmers who are new to Python may be confused why a python list’s append method is so much more efficient than its insert method.

The classical list (not the python list) – what computer scientists call a linked list – is implemented as a series of nodes, each node keeping a reference to the next node.  We can imagine such a linked list in Python like this:-

class Node:
    def __init__(self, value, next=None):
        self.value = value = next

# Usage
>>>    L = Node("a", Node("b", Node("c", Node("d"))))

Computer scientists call this a “singly linked list”, as opposed to a “double linked list”.  In a “double linked list”, each node will also keep a reference to the previous node so it is “bi-directional” whereas our singly linked list example here only points to the next node and does not “remember” the previous node.

But Python’s list type is implemented in a different way.  Instead of several separate nodes referencing each other, a Python list is a single contiguous slab of memory.  Computer scientists call this an “array”.

Understanding this fundamentals  reveal our implementation (and performance) differences.

1.  Iterating over Python List and Linked List

When iterating over the contents of a list, both are equally efficient.  But there’s some (resource) overhead in the linked list.

2.  Accessing an element in a Python List  vs an element in a Linked List

When directly accessing an element in a given index, our Python list (an “array”) is a lot more efficient because the position of the element can be calculated and the right memory location accessed directly (since it is in a contiguous slab of memory)!  To access an element in the linked list, we will need to traverse the list from the beginning (much like traversing a DOM tree in HTML).

3.  Inserting vs Appending into a Python List compared to a Linked List

The biggest puzzle, as mentioned initially, is the difference between insert and append.  insert in a linked list is very cheap – no matter how many nodes we have in our linked list, insertion takes roughly the same amount of time.  This is precisely because our linked list’s nodes are at different memory location.

On the other hand, the advantage we have gained from using Python’s list being an array that occupies a contiguous slab of memory is now lost if we attempt insertion because this requires that we move all elements that are on the right of the insertion point, possibly even moving all the elements to a larger array (a completely new memory slab).  This also explains why append is efficient for a Python list since append means inserting at the end of the memory slab where there are no elements on its right.