Tidy up some draft pages

James Ravenscroft 2020-12-29 10:00:38 +00:00
parent eb77d54fec
commit 08536d1ee3
17 changed files with 45 additions and 239 deletions

@ -26,21 +26,18 @@ name = "Home"
url = "/"
weight = 1
[[menu.main]]
name = "All posts"
url = "/posts"
weight = 2
[[menu.main]]
name = "Tags"
url = "/tags"
weight = 4
weight = 2
[[menu.main]]
name = "My Home Page"
name = "About Me"
url = "https://jamesravey.me"
weight = 3
[[params.social]]
name = "Twitter"
icon = "twitter"
@ -58,4 +55,18 @@ url = "/index.xml"
[taxonomies]
tag = "tags"
tag = "tags"
category = "categories"
[related]
includeNewer = false
threshold = 80
toLower = false
[[related.indices]]
name = "keywords"
weight = 100
[[related.indices]]
name = "date"
weight = 10

@ -6,10 +6,7 @@ date: 2015-06-28T10:36:28+00:00
url: /2015/06/28/bedford-place-vintage-festival/
categories:
- Lindyhop
tags:
- lindyhop
- shimsham
format: video
---

@ -5,11 +5,10 @@ type: post
date: 2015-08-30T16:52:59+00:00
url: /2015/08/30/cusp-challenge-week-2015/
categories:
- Lindyhop
- PhD
tags:
- cdt
- cusp
- phd
- warwick
---

@ -5,14 +5,9 @@ type: post
date: 2015-10-22T18:10:57+00:00
url: /2015/10/22/a-week-in-austin-tx-watson-labs/
categories:
- Uncategorized
- Work
tags:
- alchemy
- austin
- labs
- questions
- rank
- retrieve
- taxonomy
- watson

@ -8,15 +8,8 @@ categories:
- Work
tags:
- automation
- home
- iot
- jasper
- pi
- raspberry
- speech
- speech-to-text
- stt
- text
- raspberry-pi
- watson
---

@ -7,7 +7,7 @@ url: /2016/11/12/the-builder-the-salesman-and-the-property-tycoon/
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";s:69:"https://cdn-images-1.medium.com/fit/c/200/200/0*naYvMn9xdbL5qlkJ.jpeg";s:10:"author_url";s:30:"https://medium.com/@jamesravey";s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";s:12:"45839adb0b2d";s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";s:92:"https://medium.com/@jamesravey/the-builder-the-salesman-and-the-property-tycoon-45839adb0b2d";}'
categories:
- Uncategorized
- Work
tags:
- buzzwords
- funny

@ -8,13 +8,12 @@ featured_image: /wp-content/uploads/2016/11/IMG_20161127_130808-e1480252170130-5
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";s:69:"https://cdn-images-1.medium.com/fit/c/200/200/0*naYvMn9xdbL5qlkJ.jpeg";s:10:"author_url";s:30:"https://medium.com/@jamesravey";s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";s:12:"3a1b15a3f469";s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";s:124:"https://medium.com/@jamesravey/we-need-to-talk-about-push-notifications-and-why-i-stopped-wearing-my-smartwatch-3a1b15a3f469";}'
categories:
- Uncategorized
- Work
tags:
- multi-tasking
- notifications
- phd
- planning
- work
---
I own a Pebble Steel which I got for Christmas a couple of years ago. I’ve been very happy with it so far. I can control my music player from my wrist, get notifications and a summary of my calendar. Recently, however, I’ve stopped wearing it. The reason is that constant streams of notifications stress me out and interrupt my workflow; not wearing it makes me feel calmer and more in control, and allows me to be more productive.

@ -1,68 +0,0 @@
---
title: 'Cython: Some Top Tips'
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=191
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";N;s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";N;}'
categories:
- Uncategorized
---
This week I’ve been using [Cython][1] to build “native” Python extensions. For the uninitiated, Cython is the secret love-child programming language of C and Python. A common misconception is that Cython is “an easy way for Python developers to write fast code using C”. In reality, using Cython requires familiarity with both Python and C, and it makes use of concepts from both languages. I’d therefore highly recommend reading up on C a little bit before you start working on Cython code.
During the last few days I’ve been running into some interesting problems and solving a few of them. I’m hoping that this blog post will provide much-needed Google results for those who don’t want to waste hours on these issues like I did.
## Using Cython modules from Python
Cython compiles into a binary library that can be loaded natively with an import statement. However, getting it compiled is the tricky bit.
When you’re doing quick and dirty dev work and re-running your code every few minutes to see if it works, I’d recommend making use of the _**pyximport**_ library that comes with Cython. This module makes importing Cython libraries really convenient by wrapping the build process and making the import statement look for and build .pyx files. All you need to do to get it working is run:
<pre lang="python">import pyximport; pyximport.install()</pre>
Then you can literally just import your library. If your Cython file is called test.pyx, you can just do:
<pre lang="python">import test</pre>
and off you go.
If, like me, you’re a big fan of Jupyter notebooks and of using importlib’s reload to bring in new versions of models you’re developing, Cython and pyximport offer a hack that supports this. When you import pyximport, add reload_support=True to the install function call to enable this.
<pre lang="python">import pyximport; pyximport.install(reload_support=True)</pre>
I found this to be very hacky, and reloading often failed with this method unless preceded by another import statement. Something like this usually works:
<pre lang="python">from importlib import reload
import test
reload(test)
</pre>
## Optimising and Understanding Cython Code
Remember that Cython code is first “re-written” or “transpiled” to C code and then compiled to machine-readable binary by your system’s C compiler. Well-written C is still one of the fastest languages you can write an application in (but it is also complex and easy to crash). Since Python is an interpreted language that runs inside a virtual machine, each operation – such as adding together two numbers – actually translates to several C expressions.
Well-written Cython code can be compiled down to a small number of instructions, but badly optimised Cython will just result in lines and lines of C code. In these cases, the benefit you’re going to get from having written the module in Cython is likely to be negligible over standard interpreted Python code.
Cython comes with a handy tool which generates an HTML report showing how well optimised your code is. You can run it on your code by doing
<pre lang="bash">cython -a test.pyx</pre>
What you should now have is a test.c file and a test.html file in the directory. If you open the HTML file in the browser you’ll see your Cython code and yellow highlights. It’s pretty simple: the brighter/more intense the yellow colouring, the more likely it is that your code is interacting with normal Python objects rather than pure C ones and ergo the more likely it is that you can optimise that code and speed things up*.
*Of course this isn’t always the case. In some cases you will want to be interacting with the Python world, such as in code that passes the output from a highly optimised C function back into the world of the Python interpreter so that it can be used by normal Python code.
If you’re trying to squeeze loads of performance out of Cython, what you should be aiming for is to get to a point where all your variables have a C type (by using **cdef** to declare them before you use them) and where you apply only C operations and functions wherever possible.
For example the code:
<pre>i = 0
while i &lt; 99:
    i += 1
</pre>
will result in
[1]: http://cython.org/

@ -8,7 +8,10 @@ url: /?p=212
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";N;s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";N;}'
categories:
- Uncategorized
- SpaCy
tags:
- nlp
- python
---
Recently I have been working on a project that involves trawling full text newspaper articles from the JISC UK Web Domain Dataset – covering all websites with a .uk domain suffix from 1996 to 2013. As you can imagine, this task is pretty gargantuan and the archives themselves are over 27 Terabytes in size (that’s enough space to store 5000 high definition movies).

@ -7,7 +7,7 @@ url: /2018/01/27/how-i-became-a-gopher/
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";s:69:"https://cdn-images-1.medium.com/fit/c/200/200/0*naYvMn9xdbL5qlkJ.jpeg";s:10:"author_url";s:30:"https://medium.com/@jamesravey";s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";s:12:"452cd617afb4";s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:6:"public";s:3:"url";s:95:"https://medium.com/@jamesravey/how-i-became-a-gopher-and-learned-myself-an-angular-452cd617afb4";}'
categories:
- Uncategorized
- Work
tags:
- chatbots
- filament

@ -1,13 +0,0 @@
---
title: Upgrading from legacy ui-router
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=231
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";N;s:2:"id";N;s:21:"follower_notification";N;s:7:"license";N;s:14:"publication_id";N;s:6:"status";N;s:3:"url";N;}'
categories:
- Uncategorized
---

@ -1,16 +0,0 @@
---
title: HarriGT and news coverage of science
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=255
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";s:2:"no";s:2:"id";N;s:21:"follower_notification";s:3:"yes";s:7:"license";s:19:"all-rights-reserved";s:14:"publication_id";s:2:"-1";s:6:"status";s:4:"none";s:3:"url";N;}'
categories:
- Uncategorized
---
A major theme of my PhD is around how scientific work is portrayed in the media. News articles that report on scientific papers serve a number of purposes for the research community. Firstly, they broadcast academic work to a much wider audience.
A scientific paper’s purpose is to be read and understood by scientists, engineers and other specialists who are interested in reproducing, rebutting or building on top of the work (or heck, maybe they’re just curious and have a spare half hour). News articles are supposed to inform and entertain (a cynic might place the latter before the former) the general public with regard to current affairs. This difference in purpose and target audience can lead to news articles and scientific papers that refer to the same study but use very different vocabularies and writing styles.

@ -1,14 +0,0 @@
---
title: What next for AI in the UK?
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=287
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";N;s:2:"id";N;s:21:"follower_notification";N;s:7:"license";N;s:14:"publication_id";N;s:6:"status";N;s:3:"url";N;}'
categories:
- Uncategorized
---
In light of the ever-evolving AI landscape globally and within the UK, last year the House of Lords formed a Select Committee to assess the UK’s ability to support Artificial Intelligence in the near and medium term. The committee spent the best part of a year collecting evidence and speaking to experts in the field, and some colleagues and I were lucky enough to have some evidence accepted and taken into consideration.

File diff suppressed because one or more lines are too long

@ -1,29 +0,0 @@
---
title: Do more than kick the tires of your NLP model
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=498
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";N;s:2:"id";N;s:21:"follower_notification";N;s:7:"license";N;s:14:"publication_id";N;s:6:"status";N;s:3:"url";N;}'
categories:
- Uncategorized
---
### _We’ve known for a while that ‘accuracy’ doesn’t tell you much about your machine learning models but now we have a better alternative!_
“So how accurate is it?” – a phrase that many data scientists like myself fear and dread being asked by business stakeholders. It’s not that I fear I’ve done a bad job but that evaluation of model performance is complex and multi-faceted and that summarising it with a single number usually doesn’t do it justice. Accuracy can also be a communications hurdle – it is not an intuitive concept and it can lead to friction and misunderstanding if you’re not ‘in’ with the AI crowd. 50% model accuracy across a model that has 1500 possible answers could be considered pretty good. 80% accuracy in a task setting where data is split 80:10 across two classes is meaningless (that means that randomly guessing is more effective than the model).
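To make the class-imbalance point concrete, here is a minimal sketch in plain Python. It assumes the 80:10 split is read literally as 80 examples of one class and 10 of the other, and it takes "guessing" to mean always predicting the most common class. The function name and the labels are made up for illustration.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical data set: 80 examples of one class and 10 of the other.
labels = ["common"] * 80 + ["rare"] * 10

baseline = majority_baseline_accuracy(labels)
model_accuracy = 0.80  # the accuracy figure quoted in the paragraph above

print(f"majority baseline: {baseline:.3f}")
print(f"model accuracy:    {model_accuracy:.3f}")
print("baseline beats model:", baseline > model_accuracy)
```

Under these assumptions a classifier that has learned nothing still scores higher than the 80% model, which is why an accuracy figure only means something next to a baseline.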
I’ve written before about [how we can use finer-grained metrics like Recall, Precision and F1-score to evaluate machine learning models][1]. However, many of us in the AI/NLP community still feel that these metrics are too simplistic and do not adequately describe the characteristics of trained ML models. Unfortunately, we didn’t have many other options for evaluating model performance… until now that is…
## Checklist – When machine learning met test automation
At the Annual Meeting of the Association for Computational Linguistics 2020 – a very popular academic conference on NLP – [Ribeiro et al. presented a new method for evaluating NLP models][2], inspired by principles and techniques that software quality assurance (QA) specialists have been using for years.
The idea is that we should design and implement test cases for NLP models that reflect the tasks that the model will be required to perform “in the wild”. Like software QA, these test cases should include tricky edge cases that may trip the model up in order to understand the practical limitations of the model.
For example, we might train a named entity recognition model that
[1]: https://brainsteam.co.uk/2016/03/29/cognitive-quality-assurance-an-introduction/
[2]: https://www.aclweb.org/anthology/2020.acl-main.442.pdf

@ -1,19 +0,0 @@
---
title: Easy MLFlow Server Hosting with Docker-Compose
author: James
type: post
date: -001-11-30T00:00:00+00:00
draft: true
url: /?p=532
medium_post:
- 'O:11:"Medium_Post":11:{s:16:"author_image_url";N;s:10:"author_url";N;s:11:"byline_name";N;s:12:"byline_email";N;s:10:"cross_link";N;s:2:"id";N;s:21:"follower_notification";N;s:7:"license";N;s:14:"publication_id";N;s:6:"status";N;s:3:"url";N;}'
categories:
- Uncategorized
---
At Filament we’re really big fans of MLFlow for managing our ML model lifecycle from experiment to deployment. I won’t go into the [many advantages][1] of using this software since [many others][2] have done a good job of this before me.
If you’re bought in
[1]: https://towardsdatascience.com/tracking-ml-experiments-using-mlflow-7910197091bb
[2]: https://towardsdatascience.com/5-tips-for-mlflow-experiment-tracking-c70ae117b03f

@ -0,0 +1,16 @@
---
title: Serving NLP Models with MLFlow
author: James
type: post
date: 2020-12-29T09:50:28+00:00
url: /2020/12/29/serving-nlp-models-with-mlflow/
categories:
- Lindyhop
tags:
- machine-learning
- python
- ai
- devops
- mlops
---