---
title: 'Cython: Some Top Tips'
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=191
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";N;s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";N;}'
categories:
- Uncategorized
---
This week I’ve been using [Cython][1] to build “native” Python extensions. For the uninitiated, Cython is the secret love-child programming language of C and Python. A common misconception is that Cython is “an easy way for Python developers to write fast code using C”. In reality, using Cython well requires familiarity with both Python and C, because it draws on concepts from both languages. I’d therefore highly recommend reading up on C a little before you start working on Cython code.
Over the last few days I’ve run into some interesting problems and solved a few of them. I’m hoping that this blog post will provide much-needed Google results for those who don’t want to waste hours on these issues like I did.
## Using Cython modules from Python
Cython compiles into a binary library that can be loaded natively with an import statement. However, getting it compiled is the tricky bit.
When you’re doing quick and dirty dev work and re-running your code every few minutes to see if it works, I’d recommend the _**pyximport**_ library that comes with Cython. This module makes importing Cython libraries really convenient by wrapping the build process and making the import statement look for and build .pyx files. All you need to do to get it working is run:
<pre lang="python">import pyximport; pyximport.install()</pre>
Then you can literally just import your library. If your Cython file is called test.pyx, you can just do:
<pre lang="python">import test</pre>
and off you go.
If, like me, you’re a big fan of Jupyter notebooks and of using importlib’s reload to bring in new versions of models you’re developing, Cython and pyximport offer a hack that supports this. When you call pyximport’s install function, pass reload_support=True to enable it.
<pre lang="python">import pyximport; pyximport.install(reload_support=True)</pre>
I found this to be very hacky, and reloading often failed with this method unless preceded by another import statement. Something like this usually works:
<pre lang="python">from importlib import reload
import test
reload(test)
</pre>
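The import-then-reload pattern above can be tried out without a compiler. The following is only a minimal sketch of what reload does, using a plain .py file as a stand-in for a .pyx file; the module name `reload_demo_mod` and the function `demo_reload` are made up for illustration, and pyximport is not involved at all.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_reload():
    """Import a throwaway module, edit it on disk, reload it, return both values."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # avoid a stale .pyc masking the edit
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in module; a real project would have test.pyx here.
        source = Path(tmp) / "reload_demo_mod.py"
        source.write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("reload_demo_mod")
            before = module.VALUE
            # Simulate editing the source between notebook cells.
            source.write_text("VALUE = 200\n")
            module = importlib.reload(module)
            after = module.VALUE
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("reload_demo_mod", None)
            sys.dont_write_bytecode = old_flag
    return before, after


print(demo_reload())
```

The point of the sketch is that reload re-executes the module in place, so names you already hold a reference to pick up the new definitions. With pyximport the same call also has to trigger a rebuild of the extension, which is presumably where the flakiness comes from.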
## Optimising and Understanding Cython Code
Remember that Cython code is first “re-written” or “transpiled” to C code, which is then compiled to machine code by your system’s C compiler. Well-written C is still one of the fastest languages you can write an application in (but it is also complex, and it is easy to cause a crash with it). Since Python is an interpreted language that runs inside a virtual machine, each operation – such as adding together two numbers – actually translates to several C expressions.
Well-written Cython code can be compiled down to a small number of instructions, but badly optimised Cython will just result in lines and lines of C code. In these cases, the benefit you get from having written the module in Cython is likely to be negligible over standard interpreted Python code.
Cython comes with a handy tool which generates an HTML report showing how well optimised your code is. You can run it on your code by doing
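To get a feel for the interpreter overhead described above, you can look at the bytecode CPython produces for a trivial loop using the standard library’s dis module. This is only a rough illustration: bytecode is not the C code that Cython generates, and the exact instruction names vary between Python versions, but each instruction is dispatched and executed by C code inside the interpreter. The function names `count_to_99` and `bytecode_ops` are made up for this sketch.

```python
import dis


def count_to_99():
    """A trivial loop that only touches a single integer."""
    i = 0
    while i < 99:
        i += 1
    return i


def bytecode_ops(func):
    """Return the names of the bytecode instructions CPython compiled func into."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


ops = bytecode_ops(count_to_99)
print(len(ops), "bytecode instructions:")
for name in ops:
    print("  ", name)
```

Even this three-line loop turns into a list of loads, stores, comparisons and jumps, each working on full Python objects. That is the overhead that C-typed variables in Cython let you avoid.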
<pre lang="bash">cython -a test.pyx</pre>
What you should now have is a test.c file and a test.html file in the directory. If you open the HTML file in the browser you’ll see your Cython code with yellow highlights. It’s pretty simple: the brighter/more intense the yellow colouring, the more likely it is that your code is interacting with normal Python objects rather than pure C ones, and ergo the more likely it is that you can optimise that code and speed things up*.
*Of course this isn’t always the case. In some cases you will want to interact with the Python world, for example in code that passes the output from a highly optimised C function back to the Python interpreter so that it can be used by normal Python code.
If you’re trying to squeeze loads of performance out of Cython, what you should be aiming for is a point where all your variables have a C type (by using **cdef** to declare them before you use them) and where you apply only C operations and functions wherever possible.
For example the code:
<pre>i = 0
while i &lt; 99:
    i += 1
</pre>
will result in
[1]: http://cython.org/