Python lists

Python lists are not really lists in the computer-science sense of the word.  Classically trained programmers who are new to Python may be puzzled that a Python list’s `append` method is so much more efficient than its `insert` method.

The classical list (not the python list) – what computer scientists call a linked list – is implemented as a series of nodes, each node keeping a reference to the next node.  We can imagine such a linked list in Python like this:-

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Usage
>>> L = Node("a", Node("b", Node("c", Node("d"))))
>>> L.next.next.value
'c'

Computer scientists call this a “singly linked list”, as opposed to a “doubly linked list”.  In a “doubly linked list”, each node also keeps a reference to the previous node, so it is “bi-directional”, whereas our singly linked list example here only points to the next node and does not “remember” the previous node.
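For comparison, here is a minimal sketch of what a doubly linked variant could look like.  The names `DNode` and `build_doubly_linked` are made up for this illustration and are not part of Python or of the example above:-

```python
class DNode:
    """Node of a doubly linked list: it knows both of its neighbours."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def build_doubly_linked(values):
    """Build a doubly linked list from an iterable; return (head, tail)."""
    head = tail = None
    for value in values:
        node = DNode(value)
        if tail is None:
            head = node          # first node becomes the head
        else:
            tail.next = node     # forward link
            node.prev = tail     # backward link
        tail = node
    return head, tail


head, tail = build_doubly_linked("abcd")
print(tail.prev.value)        # walk backwards from the tail: prints c
print(head.next.prev.value)   # forwards, then backwards again: prints a
```

The extra `prev` reference is what allows walking the list in both directions, at the cost of one more reference per node.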

But Python’s list type is implemented in a different way.  Instead of several separate nodes referencing each other, a Python list is a single contiguous slab of memory holding references to its elements.  Computer scientists call this an “array”.

Understanding these fundamentals reveals the implementation (and performance) differences.

1.  Iterating over Python List and Linked List

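When iterating over the contents of a list, both are equally efficient: each visits every element exactly once.  But there’s some (resource) overhead in the linked list, since every node is a separate object that also has to store a reference to the next node.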
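A small sketch of iterating over both structures, assuming the `Node` class shown earlier (repeated here so the snippet runs on its own).  The helper name `iter_nodes` is invented for this example:-

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def iter_nodes(head):
    """Yield each value by following .next until we run out of nodes."""
    node = head
    while node is not None:
        yield node.value
        node = node.next


linked = Node("a", Node("b", Node("c", Node("d"))))
array_backed = ["a", "b", "c", "d"]

# Both walks visit every element exactly once.
print(list(iter_nodes(linked)))   # ['a', 'b', 'c', 'd']
print(list(array_backed))         # ['a', 'b', 'c', 'd']
```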

2.  Accessing an element in a Python List  vs an element in a Linked List

When directly accessing an element at a given index, our Python list (an “array”) is a lot more efficient because the position of the element can be calculated and the right memory location accessed directly (since it is in a contiguous slab of memory)!  To access an element in the linked list, we need to traverse the list from the beginning, following one reference at a time, so the cost grows with the index (much like traversing a DOM tree in HTML).
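To make the difference concrete, here is a sketch of index access on the linked list, again assuming the `Node` class from earlier.  `node_at` is a hypothetical helper written for this post and only handles non-negative indexes:-

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def node_at(head, index):
    """Return the node at `index` by walking from the head (index >= 0)."""
    if index < 0:
        raise IndexError("negative indexes are not supported in this sketch")
    node = head
    for _ in range(index):       # one hop per position: cost grows with index
        if node is None:
            break
        node = node.next
    if node is None:
        raise IndexError("linked list index out of range")
    return node


linked = Node("a", Node("b", Node("c", Node("d"))))
array_backed = ["a", "b", "c", "d"]

print(node_at(linked, 2).value)   # two hops from the head: prints c
print(array_backed[2])            # direct lookup by position: prints c
```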

3.  Inserting vs Appending into a Python List compared to a Linked List

The biggest puzzle, as mentioned initially, is the difference between `insert` and `append`.  `insert` in a linked list is very cheap: once we are holding the node at the insertion point, no matter how many nodes we have in our linked list, insertion takes roughly the same amount of time.  This is precisely because our linked list’s nodes live at different memory locations, so only a couple of references need to change.

On the other hand, the advantage we gained from Python’s list being an array that occupies a contiguous slab of memory is lost when we attempt an insertion, because this requires moving all the elements to the right of the insertion point, possibly even moving all the elements to a larger array (a completely new memory slab).  This also explains why `append` is efficient for a Python list: `append` means inserting at the end of the memory slab, where there are no elements on its right.
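The following sketch puts the two side by side.  `insert_after` is a made-up helper for the `Node` class from earlier, and the timing at the end is only a rough, machine-dependent illustration (the list size and repeat count are arbitrary choices):-

```python
import timeit


class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def insert_after(node, value):
    """Splice a new node in right after `node`; no other node is touched."""
    node.next = Node(value, node.next)
    return node.next


head = Node("a", Node("c"))
insert_after(head, "b")            # a -> b -> c

items = ["a", "c"]
items.insert(1, "b")               # "c" has to shift one slot to the right
items.append("d")                  # nothing to shift
print(items)                       # ['a', 'b', 'c', 'd']

# Rough timing: inserting at the front shifts every element, appending does not.
big = list(range(100000))
front = timeit.timeit(lambda: big.insert(0, None), number=1000)
back = timeit.timeit(lambda: big.append(None), number=1000)
print("insert(0, x): %.4fs   append(x): %.4fs" % (front, back))
```

On a list of this size, the front inserts should come out noticeably slower than the appends, though the exact numbers will vary from machine to machine.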

python threading bug: ‘_DummyThread’ object has no attribute ‘_Thread__block’

This bug, filed at http://bugs.python.org/issue14308, is caused by a bad interaction between the dummy thread objects that the threading API creates when we call threading.currentThread() on a foreign thread, and the _after_fork function, which is called to clean up resources after `os.fork()`.

Stephen White also provided a code snippet that demonstrates this problem:-


import os
import thread
import threading
import time

def t():
    threading.currentThread() # Populate threading._active with a DummyThread
    time.sleep(3)

thread.start_new_thread(t, ())

time.sleep(1)

pid = os.fork()
if pid == 0:
    os._exit(0)  # child process: exit immediately

os.waitpid(pid, 0)  # parent process: wait for the child

Running this script under Python 2 will give you the “no attribute ‘_Thread__block’” error, as explained.  For detailed explanations and a monkey-patch solution that does not modify the python source code, this is a good resource - http://stackoverflow.com/questions/13193278/understand-python-threading-bug

It so happens that django-debug-toolbar’s middleware causes exactly this problem.  And it’s extremely annoying to have my django dev server printing out ‘_DummyThread’ object has no attribute ‘_Thread__block’ in my terminal stdout repeatedly whenever my DebugToolbarMiddleware is enabled.

MIDDLEWARE_CLASSES += (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
)

So here’s my pull request to resolve this issue on django-debug-toolbar - https://github.com/django-debug-toolbar/django-debug-toolbar/pull/333.  I have also taken the liberty to “upgrade” the original use of the thread module to the threading module in this pull request.  The thread module will no longer be available under that name in Python 3 (it is renamed to _thread) but the threading module will, so in my opinion, it’s better to simply use the threading module!
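To see why the choice of module matters, here is a small sketch, written for Python 3 (where the low-level module is called `_thread`), that reports what kind of object `threading.current_thread()` hands back inside a worker.  The helper names are invented for this illustration and this is not the code from the pull request:-

```python
import _thread
import threading


def thread_type_seen_by(start):
    """Run a worker via `start` and report the class name of the object
    that threading.current_thread() returns inside that worker."""
    seen = {}
    done = threading.Event()

    def worker():
        seen["name"] = type(threading.current_thread()).__name__
        done.set()

    start(worker)
    done.wait(timeout=5)
    return seen.get("name")


def start_low_level(func):
    # Thread created behind the threading module's back (a "foreign" thread).
    _thread.start_new_thread(func, ())


def start_with_threading(func):
    # Thread created and tracked by the threading module itself.
    threading.Thread(target=func).start()


print(thread_type_seen_by(start_low_level))        # _DummyThread
print(thread_type_seen_by(start_with_threading))   # Thread
```

A thread started through `threading.Thread` is a regular `Thread` object, so no dummy thread object gets created for it.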

Further criticisms and suggestions for improvement are welcome.

IPython on steroids with qt console, via Macports and in Virtualenv

Getting IPython installed in a virtualenv is straightforward.

The gotcha for new pythonistas not familiar with IPython is that it requires GNU readline to display colors, provide autocomplete functionality and a host of other usability features.  Unfortunately, on Mac OSX, getting readline via `pip install readline` will not quite work.  This is due to a Mac OSX-specific PYTHONPATH problem.  With `pip install readline`, the module will never be imported, because readline.so goes into site-packages, which ends up behind the system libedit-based one located in lib-dynload (yes, the OSX Python path order is very odd).

To solve this problem, we will, instead, run this after we have run `pip install readline`:-

easy_install -a readline

The `-a` option means `--always-copy`, which copies all needed packages into our installation directory.  Executing this command while in our virtualenv ensures that the readline package gets placed correctly in our virtualenv directory.  Checking:-

(python-for-scientists)Calvins-MacBook-Pro.local ttys009 Sat Nov 03 12:36:36 |~/work/python-for-scientists|
calvin$ ls -la ~/.virtualenvs/python-for-scientists/lib/python2.7/site-packages/
total 16
drwxr-xr-x 14 calvin staff 476 Nov 3 12:27 .
drwxr-xr-x 51 calvin staff 1734 Nov 2 15:21 ..
drwxr-xr-x 19 calvin staff 646 Nov 3 12:27 IPython
drwxr-xr-x 10 calvin staff 340 Nov 2 15:03 distribute-0.6.28-py2.7.egg
-rw-r--r-- 1 calvin staff 285 Nov 2 15:36 easy-install.pth
drwxr-xr-x 10 calvin staff 340 Nov 3 12:27 ipython-0.13.1-py2.7.egg-info
drwxr-xr-x 38 calvin staff 1292 Nov 2 15:22 numpy
drwxr-xr-x 7 calvin staff 238 Nov 2 15:22 numpy-1.6.2-py2.7.egg-info
drwxr-xr-x 4 calvin staff 136 Nov 2 15:03 pip-1.2.1-py2.7.egg
drwxr-xr-x 3 calvin staff 102 Nov 2 15:03 readline
drwxr-xr-x 6 calvin staff 204 Nov 2 15:36 readline-6.2.4.1-py2.7-macosx-10.7-x86_64.egg
drwxr-xr-x 38 calvin staff 1292 Nov 2 16:00 scipy
drwxr-xr-x 7 calvin staff 238 Nov 2 16:00 scipy-0.11.0-py2.7.egg-info
-rw-r--r-- 1 calvin staff 30 Nov 2 15:03 setuptools.pth

Now, in our virtual env, our IPython shell should work perfectly, with all the features afforded by the GNU readline library.
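If in doubt about which readline actually got picked up, a quick check from inside the virtualenv’s Python can help.  This is only a sketch: `describe_readline` is a made-up helper, and it relies on the documented hint that libedit-backed builds mention “libedit” in the module’s docstring:-

```python
def describe_readline():
    """Report where the readline module came from and whether it looks
    like the libedit emulation rather than GNU readline."""
    try:
        import readline
    except ImportError:
        # No readline module at all on this interpreter.
        return {"available": False, "location": None, "libedit": None}
    doc = readline.__doc__ or ""
    return {
        "available": True,
        # Modules compiled into the interpreter have no __file__.
        "location": getattr(readline, "__file__", "(built in)"),
        "libedit": "libedit" in doc,
    }


print(describe_readline())
```

If `location` points into the virtualenv’s site-packages and `libedit` is False, the GNU readline egg is the one being imported.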

Moving on, we now want the power of IPython in a Qt4-powered GUI console.  This will prove to be a little tricky on Mac OSX with Macports.

1.  Qt4 on Mac OSX

The first thing we need to do is to install the qt4-mac package.  Straightforward via MacPorts:-

calvin$ sudo port -v install qt4-mac

2. Python bindings to Qt4

The problem begins when we try to install a python binding library for Qt4.  We have two possible options from PyPI: PyQt and PySide.

Attempting `pip install PyQt` in your virtual env fails outright, because the PyQt package on PyPI does not include a setup.py.

Going the PySide route requires us to have cmake, which can easily be solved by installing cmake system-wide with `sudo port install cmake`.  However, that still fails to help us get `pip install pyside` working right.  We will encounter an error like this:-

error: Failed to locate the Python library /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.so.1
 Complete output from command /Users/calvin/.virtualenvs/python-for-scientists/bin/python -c "import setuptools;__file__='/Users/calvin/.virtualenvs/python-for-scientists/build/pyside/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/kl/_52jng9s6sl2knv_0jds9w140000gn/T/pip-7zqGiu-record/install-record.txt --single-version-externally-managed --install-headers /Users/calvin/.virtualenvs/python-for-scientists/bin/../include/site/python2.7:
 Removing /Users/calvin/.virtualenvs/python-for-scientists/build/pyside/PySide

To solve this problem, we will instead rely on the Macports version of PyQt.

calvin$ port search pyqt4
py-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py24-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py25-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py26-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py27-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py31-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

py32-pyqt4 @4.9.4 (python, devel)
 PyQt4 is a set of Python bindings for the Qt4 toolkit

3. System-wide IPython qtconsole

So,

calvin$ sudo port -v install py27-pyqt4

---> Computing dependencies for py27-pyqt4.
---> Fetching archive for py27-pyqt4
---> py27-pyqt4-4.9.4_0.darwin_11.x86_64.tbz2 doesn't seem to exist in /opt/local/var/macports/incoming/verified
---> Attempting to fetch py27-pyqt4-4.9.4_0.darwin_11.x86_64.tbz2 from http://packages.macports.org/py27-pyqt4
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
---> Attempting to fetch py27-pyqt4-4.9.4_0.darwin_11.x86_64.tbz2 from http://mse.uk.packages.macports.org/sites/packages.macports.org/py27-pyqt4
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
---> Attempting to fetch py27-pyqt4-4.9.4_0.darwin_11.x86_64.tbz2 from http://lil.fr.packages.macports.org/py27-pyqt4
 % Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
---> Fetching distfiles for py27-pyqt4
---> Verifying checksum(s) for py27-pyqt4
---> Checksumming PyQt-mac-gpl-4.9.4.tar.gz
---> Extracting py27-pyqt4
---> Extracting PyQt-mac-gpl-4.9.4.tar.gz
---> Applying patches to py27-pyqt4
---> Applying patch-configure.py
patching file configure.py
---> Applying patch-fix-qt_apps_dir.diff
patching file examples/demos/qtdemo/menumanager.py
patching file examples/designer/plugins/plugins.py
---> Configuring py27-pyqt4
Determining the layout of your Qt installation...

...

Qt v4.8.3 free edition is being used.
Qt is built as a framework.
SIP 4.13.3 is being used.
The Qt header files are in /opt/local/include.
The shared Qt frameworks are in /opt/local/Library/Frameworks.
The Qt binaries are in /opt/local/bin.
The Qt mkspecs directory is in /opt/local/share/qt4.
These PyQt modules will be built: QtCore, QtGui, QtHelp, QtMultimedia,
QtNetwork, QtDeclarative, QtOpenGL, QtScript, QtScriptTools, QtSql, QtSvg,
QtTest, QtWebKit, QtXml, QtXmlPatterns, QtDesigner.
The PyQt Python package will be installed in
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages.
PyQt is being built with generated docstrings.
PyQt is being built with 'protected' redefined as 'public'.
The Designer plugin will be installed in /opt/local/share/qt4/plugins/designer.
The PyQt .sip files will be installed in /opt/local/share/py27-sip/PyQt4.
pyuic4, pyrcc4 and pylupdate4 will be installed in
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin.
Generating the C++ source for the QtCore module...

...

x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Loader/qobjectcreator.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/__init__.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/compiler.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/indenter.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/misc.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/proxy_metaclass.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/qobjectcreator.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4/uic/Compiler/qtproxies.py
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/pylupdate4
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/pyrcc4
x ./opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/pyuic4
x ./opt/local/bin/pylupdate4-2.7
x ./opt/local/bin/pyrcc4-2.7
x ./opt/local/bin/pyuic4-2.7
---> Cleaning py27-pyqt4
---> Removing work directory for py27-pyqt4
---> Updating database of binaries: 100.0%
---> Scanning binaries for linking errors: 100.0%
---> No broken files found.

With this done, and with a system-wide zmq/py27-zmq installed (via `sudo port -v install zmq py27-zmq` command), our system-wide ipython will work beautifully with `ipython qtconsole` outside a python virtual env.

4. IPython qtconsole in virtualenv

However, if we attempt to do the same in an isolated python virtual env, we will encounter this error triggered when qtconsoleapp.py in our virtualenv iPython gets called:-


  File "/Users/calvin/.virtualenvs/python-for-scientists/lib/python2.7/site-packages/IPython/frontend/qt/console/qtconsoleapp.py", line 56, in <module>
    from IPython.external.qt import QtCore, QtGui
  File "/Users/calvin/.virtualenvs/python-for-scientists/lib/python2.7/site-packages/IPython/external/qt.py", line 43, in <module>
    raise ImportError('Cannot import PySide >= 1.0.3 or PyQt4 >= 4.7')
ImportError: Cannot import PySide >= 1.0.3 or PyQt4 >= 4.7

which of course, is to be expected.
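A quick way to confirm that neither binding is visible from inside the virtualenv is to ask Python’s import machinery directly.  This is a sketch for Python 3 using the standard library’s `importlib.util.find_spec`; the helper name `first_importable` is invented here and this is not how IPython itself performs the check:-

```python
import importlib.util


def first_importable(candidates):
    """Return the first top-level module name that can be found on the
    current sys.path, or None if none of them can."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prints None in an environment where neither binding is installed.
print(first_importable(["PySide", "PyQt4"]))
```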

So, we aren’t satisfied yet because we cannot run our ipython qtconsole within an isolated python virtual env.  Our isolated virtualenv is not able to locate the system-wide py27-pyqt4 that we installed above, and we cannot depend on the `pip install pyqt` or `pip install pyside` commands inside the virtual env either.  We will need to resort to a little bit of bash scripting trickery, i.e. symlinking the installed system-wide libraries (PyQt4 and its dependency SIP) into our virtualenv location.

And here’s our little bash script to achieve this:-

Once we run `./symlink_pyqt4_and_sip.sh` while in our virtual env, PyQt4 and SIP will now be available to us!
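As a rough Python 3 sketch of the same symlinking idea (this is not the bash script referred to above; the function name `symlink_into` and its arguments are made up for illustration, and the demonstration uses throwaway directories instead of the real MacPorts and virtualenv paths):-

```python
import os
import tempfile


def symlink_into(src_site_packages, dest_site_packages, names=("PyQt4", "sip.so")):
    """Symlink each name from the system-wide site-packages into the
    virtualenv's site-packages; return the list of links created."""
    created = []
    for name in names:
        source = os.path.join(src_site_packages, name)
        target = os.path.join(dest_site_packages, name)
        if not os.path.exists(source):
            continue                  # nothing to link to
        if os.path.lexists(target):
            continue                  # never clobber an existing entry
        os.symlink(source, target)
        created.append(target)
    return created


# Demonstration with temporary stand-ins for the two site-packages dirs.
with tempfile.TemporaryDirectory() as system_dir, \
        tempfile.TemporaryDirectory() as venv_dir:
    os.mkdir(os.path.join(system_dir, "PyQt4"))
    open(os.path.join(system_dir, "sip.so"), "w").close()

    created = symlink_into(system_dir, venv_dir)
    links_ok = all(os.path.islink(path) for path in created)
    print(sorted(os.path.basename(path) for path in created))  # ['PyQt4', 'sip.so']
```

For the setup described in this post, the source would be the MacPorts site-packages directory under /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7 and the destination would be $VIRTUAL_ENV/lib/python2.7/site-packages.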

5. Finishing up: ipython qtconsole in virtualenv!

Quick check:-


calvin$ ls -la $VIRTUAL_ENV/lib/python2.7/site-packages
total 32
drwxr-xr-x 12 calvin staff 408 Nov 3 13:59 .
drwxr-xr-x 51 calvin staff 1734 Oct 31 14:07 ..
drwxr-xr-x 19 calvin staff 646 Oct 31 10:54 IPython
lrwxr-xr-x 1 calvin staff 93 Nov 3 13:59 PyQt4 -> /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyQt4
drwxr-xr-x 10 calvin staff 340 Oct 31 10:13 distribute-0.6.28-py2.7.egg
-rw-r--r-- 1 calvin staff 285 Oct 31 10:15 easy-install.pth
drwxr-xr-x 10 calvin staff 340 Oct 31 10:15 ipython-0.13.1-py2.7.egg-info
drwxr-xr-x 4 calvin staff 136 Oct 31 10:13 pip-1.2.1-py2.7.egg
drwxr-xr-x 3 calvin staff 102 Oct 31 10:13 readline
drwxr-xr-x 6 calvin staff 204 Oct 31 10:15 readline-6.2.4.1-py2.7-macosx-10.7-x86_64.egg
-rw-r--r-- 1 calvin staff 30 Oct 31 10:13 setuptools.pth
lrwxr-xr-x 1 calvin staff 94 Nov 3 13:59 sip.so -> /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sip.so

Yup.  As you can see, PyQt4 and sip.so are correctly symlinked from our virtualenv directory to the system-wide one.

Recall that the IPython qt console requires a couple more dependencies, but now we can get them via pip (and not macports) with `pip install pygments pyzmq`.  (Note that the zmq library itself should already be installed system-wide, but the python bindings pyzmq, named differently from macports’ py27-zmq, need to be installed within our virtualenv.)

Once done, running `ipython qtconsole` will work beautifully, as expected.