Python lists

Python lists are not really lists based on computer science’s definition of the word.  Classically trained programmers who are new to Python may be confused why a python list’s `append` method is so much more efficient than its `insert` method.

The classical list (not the python list) – what computer scientists call a linked list – is implemented as a series of nodes, each node keeping a reference to the next node.  We can imagine such a linked list in Python like this:-

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Usage
>>>    L = Node("a", Node("b", Node("c", Node("d"))))
>>>    L.next.next.value
'c'

Computer scientists call this a “singly linked list”, as opposed to a “double linked list”.  In a “double linked list”, each node will also keep a reference to the previous node so it is “bi-directional” whereas our singly linked list example here only points to the next node and does not “remember” the previous node.

But Python’s list type is implemented in a different way.  Instead of several separate nodes referencing each other, a Python list is a single contiguous slab of memory.  Computer scientists call this an “array”.

Understanding this fundamentals  reveal our implementation (and performance) differences.

1.  Iterating over Python List and Linked List

When iterating over the contents of a list, both are equally efficient.  But there’s some (resource) overhead in the linked list.

2.  Accessing an element in a Python List  vs an element in a Linked List

When directly accessing an element in a given index, our Python list (an “array”) is a lot more efficient because the position of the element can be calculated and the right memory location accessed directly (since it is in a contiguous slab of memory)!  To access an element in the linked list, we will need to traverse the list from the beginning (much like traversing a DOM tree in HTML).

3.  Inserting vs Appending into a Python List compared to a Linked List

The biggest puzzle, as mentioned initially, is the difference between `insert` and `append`.  `insert` in a linked list is very cheap – no matter how many nodes we have in our linked list, insertion takes roughly the same amount of time.  This is precisely because our linked list’s nodes are at different memory location.

On the other hand, the advantage we have gained from using Python’s list being an array that occupies a contiguous slab of memory is now lost if we attempt insertion because this requires that we move all elements that are on the right of the insertion point, possibly even moving all the elements to a larger array (a completely new memory slab).  This also explains why `append` is efficient for a Python list since `append` means inserting at the end of the memory slab where there are no elements on its right.

  • ntt

    So the issue with this design is that you’ll incur a large cost, when you run out of that block of memory, and have to move to the next one. So there’s jitter in your append performance, something that hard real-time apps don’t like.

    Also, there’s a memory usage inefficiency. The chunks take up much more memory than you actually need. This might be fine on servers with 16GB RAM, but om embedded devices, that’s a bad habit..

    • http://www.calvinx.com Calvin Cheng

      Yes.

      Understanding application of linked list versus arrays implies understanding the context (*hardware*) you are working on. Reference – http://highscalability.com/blog/2013/5/22/strategy-stop-using-linked-lists.html

      Quote from article:-
      Here are Aeter’s reasons to be anti-linked-list:
      * They reduce the benefit of out-of-order execution.
      * They throw off hardware prefetching.
      * They reduce DRAM and TLB locality.
      * They cannot leverage SIMD.
      * They are harder to send to GPUs.

      • ntt

        Yup. The points are valid, but modern compliers also make it a point to allocate memory for arbitrary malloc calls in contiguous spaces. IMO it’s better to let a compiler handle this rather than your language implementation, more scalable that way..