My Work

On this page you will find summaries of projects that I've worked on, including both software and scientific research.

Software

Turbopilot

GitHub repository

A weekend experiment where I attempted to use GGML quantized tensors to run a state-of-the-art code completion model on commodity hardware, including laptops, desktops, ARM machines like MacBooks and even Raspberry Pis. As the GGML library matures, I'm adding support for things like Nvidia GPUs too.

Partridge

Website | GitHub repository

A scientific paper indexing system that uses machine learning to enrich papers in order to make them easier to search and filter. Originally written in Python 2 with XML-RPC worker processes and recently updated to use Python 3 and dramatiq for concurrency.

Sapienta

Website | Live Instance | GitHub repository

An NLP pipeline for processing and enriching scientific papers with sentence-level information about their core scientific concepts (CoreSCs). This is a Python 3 implementation of Prof Maria Liakata's 2010 paper. We provide a free web service for low-volume requests and a simple-to-use Docker configuration for those who want to run the software over a larger number of papers.

CDCRTool

GitHub repository

A tool for annotating co-references of entities that occur in linked newspaper article/scientific paper pairings. The code has some 'sharp' edges, but this was my first venture into 'full stack' development, using ReactJS on the frontend and Flask on the backend. The repository also contains the final corpus, which we made available as part of our EACL 2021 publication.

TimeTrack

GitHub repository

A small command-line tool I wrote for monitoring the time I spend on projects. It has API integration with the popular SaaS timesheet tool Harvest.

Academic Publications

Below are links to my various publishing profiles in case you prefer to follow me on an external site/silo:

2022

  • Maufe, M., Ravenscroft, J., Procter, R., & Liakata, M. (2022, December). A Pipeline for Generating, Annotating and Employing Synthetic Data for Real World Question Answering. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 80-97).

2021

2019

2018

2017

2016

2013