Cython: Some Top Tips
Author: James

This week I've been using Cython to build “native” Python extensions. For the uninitiated, Cython is the secret love-child programming language of C and Python. A common misconception is that Cython is “an easy way for Python developers to write fast code using C”. In reality, using Cython well requires familiarity with both Python and C, and it makes use of concepts from both languages. I'd therefore highly recommend reading up on C a little before you start working on Cython code.

Over the last few days I've run into some interesting problems and solved a few of them. I'm hoping that this blog post will provide much-needed Google results for those who don't want to waste hours on these issues like I did.

Using Cython modules from Python

Cython compiles into a binary library that can be loaded natively with an import statement. However, getting it compiled is the tricky bit.

When you're doing quick and dirty dev work and re-running your code every few minutes to see if it works, I'd recommend the pyximport library that comes with Cython. It makes importing Cython modules really convenient by wrapping the build process and making the import statement look for and build .pyx files. All you need to do to get it working is run:

import pyximport; pyximport.install()

Then you can literally just import your library. If your Cython file is called test.pyx, you can just do:

import test

and off you go.

If, like me, you're a big fan of Jupyter notebooks and of using importlib's reload to bring in new versions of models you're developing, Cython and pyximport offer a hack that supports this. Pass reload_support=True to the install function call to enable it:

import pyximport; pyximport.install(reload_support=True)

I found this to be very hacky: reloading often failed unless it was preceded by another import statement. Something like this usually works:

from importlib import reload
import test
reload(test)
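The import-then-reload order can be tried without Cython at all. The sketch below is only a stand-in: it uses a plain Python module with the made-up name reload_demo, written to a temporary directory, so it illustrates the order of the calls rather than pyximport's own behaviour with compiled modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc files so a quick rewrite of the source is never hidden
# behind a stale bytecode cache.
sys.dont_write_bytecode = True

# Hypothetical stand-in module; with pyximport this would be a .pyx file.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "reload_demo.py"
module_path.write_text("def answer():\n    return 1\n")
sys.path.insert(0, str(workdir))

import reload_demo  # first import loads version 1

first = reload_demo.answer()

# Simulate editing the source between two notebook cells.
module_path.write_text("def answer():\n    return 2\n")
importlib.invalidate_caches()

import reload_demo  # harmless re-import, mirrors the pattern above

importlib.reload(reload_demo)  # re-executes the module from the new source
second = reload_demo.answer()

print(first, second)
```

In a notebook, the final import and reload lines would typically sit in their own cell that you re-run after each edit.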

Optimising and Understanding Cython Code

Remember that Cython code is first “re-written” or “transpiled” to C code, which is then compiled to machine code by your system's C compiler. Well written C is still one of the fastest languages you can write an application in (but it is also complex, and it is easy to cause a crash). Since Python is an interpreted language that runs inside a virtual machine, each operation, such as adding together two numbers, actually translates to several C expressions.
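To get a rough feel for that overhead, the standard library's dis module can list the bytecode instructions behind a one-line Python function. This is only an illustration: it shows interpreter bytecode rather than C, the function name add is made up for the example, and the exact opcode names differ between Python versions. In CPython, each instruction is in turn carried out by C code inside the interpreter.

```python
import dis


def add(a, b):
    # A single Python-level addition.
    return a + b


# Collect the names of the bytecode instructions the interpreter
# executes for this one-line function.
opnames = [instr.opname for instr in dis.get_instructions(add)]

for name in opnames:
    print(name)
```

Even this tiny function needs several instructions to load the operands, perform the addition and return the result, and each of those does type checks and object handling in C that a typed C addition would skip.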

Well written Cython code can be compiled down to a small number of instructions, but badly optimised Cython will just result in lines and lines of C code. In these cases, the benefit you get from having written the module in Cython is likely to be negligible compared with standard interpreted Python code.

Cython comes with a handy tool which generates an HTML report showing how well optimised your code is. You can run it on your code by doing:

cython -a test.pyx

You should now have a test.c file and a test.html file in the directory. If you open the HTML file in a browser, you'll see your Cython code with yellow highlights. It's pretty simple: the brighter or more intense the yellow colouring, the more likely it is that your code is interacting with normal Python objects rather than pure C ones, and so the more likely it is that you can optimise that code and speed things up*.

*Of course, this isn't always the case. Sometimes you will want to interact with the Python world, for example in code that passes the output of a highly optimised C function back to the Python interpreter so that it can be used by normal Python code.

If you're trying to squeeze loads of performance out of Cython, you should aim for a point where all your variables have a C type (by using cdef to declare them before you use them) and where you apply only C operations and functions wherever possible.

For example, the code:

i = 0
while i < 99:
    i += 1

will result in