Merge branch 'main' of ssh://git.jamesravey.me:222/ravenscroftj/brainsteam.co.uk

This commit is contained in:
James Ravenscroft 2023-04-16 18:13:00 +01:00
commit d440d0357f
12 changed files with 758 additions and 0 deletions

View File

@@ -0,0 +1,73 @@
---
date: '2023-03-21T06:25:47'
hypothesis-meta:
created: '2023-03-21T06:25:47.417575+00:00'
document:
title:
- 'GPT-4 and professional benchmarks: the wrong answer to the wrong question'
flagged: false
group: __world__
hidden: false
id: N6BVsMexEe2Z4X92AfjYDg
links:
html: https://hypothes.is/a/N6BVsMexEe2Z4X92AfjYDg
incontext: https://hyp.is/N6BVsMexEe2Z4X92AfjYDg/aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
json: https://hypothes.is/api/annotations/N6BVsMexEe2Z4X92AfjYDg
permissions:
admin:
- acct:ravenscroftj@hypothes.is
delete:
- acct:ravenscroftj@hypothes.is
read:
- group:__world__
update:
- acct:ravenscroftj@hypothes.is
tags:
- llm
- openai
- gpt
- ModelEvaluation
target:
- selector:
- endContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[4]/span[2]
endOffset: 300
startContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[4]/span[1]
startOffset: 0
type: RangeSelector
- end: 5998
start: 5517
type: TextPositionSelector
- exact: "To benchmark GPT-4\u2019s coding ability, OpenAI evaluated it on problems\
\ from Codeforces, a website that hosts coding competitions. Surprisingly,\
\ Horace He pointed out that GPT-4 solved 10/10 pre-2021 problems and 0/10\
\ recent problems in the easy category. The training data cutoff for GPT-4\
\ is September 2021. This strongly suggests that the model is able to memorize\
\ solutions from its training set \u2014 or at least partly memorize them,\
\ enough that it can fill in what it can\u2019t recall."
prefix: 'm 1: training data contamination'
suffix: As further evidence for this hyp
type: TextQuoteSelector
source: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
text: OpenAI was only able to pass questions available before september 2021 and
failed to answer new questions - strongly suggesting that it has simply memorised
the answers as part of its training
updated: '2023-03-21T06:26:57.441600+00:00'
uri: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
user: acct:ravenscroftj@hypothes.is
user_info:
display_name: James Ravenscroft
in-reply-to: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
tags:
- llm
- openai
- gpt
- ModelEvaluation
- hypothesis
type: annotation
url: /annotations/2023/03/21/1679379947
---
<blockquote>To benchmark GPT-4's coding ability, OpenAI evaluated it on problems from Codeforces, a website that hosts coding competitions. Surprisingly, Horace He pointed out that GPT-4 solved 10/10 pre-2021 problems and 0/10 recent problems in the easy category. The training data cutoff for GPT-4 is September 2021. This strongly suggests that the model is able to memorize solutions from its training set — or at least partly memorize them, enough that it can fill in what it can't recall.</blockquote>GPT-4 was only able to solve problems published before September 2021 and failed on newer problems, strongly suggesting that it has simply memorised the answers as part of its training.
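The check described above is easy to reproduce for any benchmark where problems carry publication dates. A rough sketch with made-up results (the dates and outcomes below are purely illustrative):

```python
from datetime import date

# Hypothetical benchmark results: (problem publication date, did the model solve it?)
results = [
    (date(2020, 5, 1), True), (date(2020, 11, 3), True),
    (date(2022, 2, 14), False), (date(2022, 9, 30), False),
]

CUTOFF = date(2021, 9, 1)  # GPT-4's reported training data cutoff

def solve_rate(items):
    return sum(solved for _, solved in items) / len(items) if items else float("nan")

before = [r for r in results if r[0] < CUTOFF]
after = [r for r in results if r[0] >= CUTOFF]

print(f"pre-cutoff solve rate:  {solve_rate(before):.0%}")
print(f"post-cutoff solve rate: {solve_rate(after):.0%}")
# A large gap between the two rates is consistent with training data contamination.
```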

View File

@@ -0,0 +1,68 @@
---
date: '2023-03-21T06:27:59'
hypothesis-meta:
created: '2023-03-21T06:27:59.825632+00:00'
document:
title:
- 'GPT-4 and professional benchmarks: the wrong answer to the wrong question'
flagged: false
group: __world__
hidden: false
id: hoqyasexEe2ZnQ_nOVgRxA
links:
html: https://hypothes.is/a/hoqyasexEe2ZnQ_nOVgRxA
incontext: https://hyp.is/hoqyasexEe2ZnQ_nOVgRxA/aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
json: https://hypothes.is/api/annotations/hoqyasexEe2ZnQ_nOVgRxA
permissions:
admin:
- acct:ravenscroftj@hypothes.is
delete:
- acct:ravenscroftj@hypothes.is
read:
- group:__world__
update:
- acct:ravenscroftj@hypothes.is
tags:
- openai
- gpt
- ModelEvaluation
target:
- selector:
- endContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[6]/span[2]
endOffset: 42
startContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[6]/span[1]
startOffset: 0
type: RangeSelector
- end: 6591
start: 6238
type: TextPositionSelector
- exact: 'In fact, we can definitively show that it has memorized problems in
its training set: when prompted with the title of a Codeforces problem, GPT-4
includes a link to the exact contest where the problem appears (and the round
number is almost correct: it is off by one). Note that GPT-4 cannot access
the Internet, so memorization is the only explanation.'
prefix: the problems after September 12.
suffix: GPT-4 memorizes Codeforces probl
type: TextQuoteSelector
source: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
text: GPT4 knows the link to the coding exams that it was evaluated against but
doesn't have "internet access" so it appears to have memorised this as well
updated: '2023-03-21T06:27:59.825632+00:00'
uri: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
user: acct:ravenscroftj@hypothes.is
user_info:
display_name: James Ravenscroft
in-reply-to: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
tags:
- openai
- gpt
- ModelEvaluation
- hypothesis
type: annotation
url: /annotations/2023/03/21/1679380079
---
<blockquote>In fact, we can definitively show that it has memorized problems in its training set: when prompted with the title of a Codeforces problem, GPT-4 includes a link to the exact contest where the problem appears (and the round number is almost correct: it is off by one). Note that GPT-4 cannot access the Internet, so memorization is the only explanation.</blockquote>GPT-4 knows the link to the coding contests it was evaluated against, but it doesn't have internet access, so it appears to have memorised this as well.

View File

@@ -0,0 +1,68 @@
---
date: '2023-03-21T06:29:09'
hypothesis-meta:
created: '2023-03-21T06:29:09.945605+00:00'
document:
title:
- 'GPT-4 and professional benchmarks: the wrong answer to the wrong question'
flagged: false
group: __world__
hidden: false
id: sFZzLMexEe2M2r_i759OiA
links:
html: https://hypothes.is/a/sFZzLMexEe2M2r_i759OiA
incontext: https://hyp.is/sFZzLMexEe2M2r_i759OiA/aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
json: https://hypothes.is/api/annotations/sFZzLMexEe2M2r_i759OiA
permissions:
admin:
- acct:ravenscroftj@hypothes.is
delete:
- acct:ravenscroftj@hypothes.is
read:
- group:__world__
update:
- acct:ravenscroftj@hypothes.is
tags:
- openai
- gpt
- ModelEvaluation
target:
- selector:
- endContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[8]/span[2]
endOffset: 199
startContainer: /div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/article[1]/div[4]/div[1]/div[1]/p[8]/span[1]
startOffset: 0
type: RangeSelector
- end: 7439
start: 7071
type: TextPositionSelector
- exact: "Still, we can look for telltale signs. Another symptom of memorization\
\ is that GPT is highly sensitive to the phrasing of the question. Melanie\
\ Mitchell gives an example of an MBA test question where changing some details\
\ in a way that wouldn\u2019t fool a person is enough to fool ChatGPT (running\
\ GPT-3.5). A more elaborate experiment along these lines would be valuable."
prefix: ' how performance varies by date.'
suffix: "Because of OpenAI\u2019s lack of tran"
type: TextQuoteSelector
source: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
text: OpenAI has memorised MBA tests- when these are rephrased or certain details
are changed, the system fails to answer
updated: '2023-03-21T06:29:09.945605+00:00'
uri: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
user: acct:ravenscroftj@hypothes.is
user_info:
display_name: James Ravenscroft
in-reply-to: https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
tags:
- openai
- gpt
- ModelEvaluation
- hypothesis
type: annotation
url: /annotations/2023/03/21/1679380149
---
<blockquote>Still, we can look for telltale signs. Another symptom of memorization is that GPT is highly sensitive to the phrasing of the question. Melanie Mitchell gives an example of an MBA test question where changing some details in a way that wouldn't fool a person is enough to fool ChatGPT (running GPT-3.5). A more elaborate experiment along these lines would be valuable.</blockquote>ChatGPT appears to have memorised MBA test questions - when these are rephrased or certain details are changed, the model fails to answer them.

View File

@@ -0,0 +1,66 @@
---
date: '2023-03-21T19:59:04'
hypothesis-meta:
created: '2023-03-21T19:59:04.177001+00:00'
document:
title:
- 2303.09752.pdf
flagged: false
group: __world__
hidden: false
id: 1MB9BMgiEe27GS99BvTIlA
links:
html: https://hypothes.is/a/1MB9BMgiEe27GS99BvTIlA
incontext: https://hyp.is/1MB9BMgiEe27GS99BvTIlA/arxiv.org/pdf/2303.09752.pdf
json: https://hypothes.is/api/annotations/1MB9BMgiEe27GS99BvTIlA
permissions:
admin:
- acct:ravenscroftj@hypothes.is
delete:
- acct:ravenscroftj@hypothes.is
read:
- group:__world__
update:
- acct:ravenscroftj@hypothes.is
tags:
- llm
- attention
- long-documents
target:
- selector:
- end: 1989
start: 1515
type: TextPositionSelector
- exact: "Over the past few years, many \u201Cefficient Trans-former\u201D approaches\
\ have been proposed that re-duce the cost of the attention mechanism over\
\ longinputs (Child et al., 2019; Ainslie et al., 2020; Belt-agy et al., 2020;\
\ Zaheer et al., 2020; Wang et al.,2020; Tay et al., 2021; Guo et al., 2022).\
\ However,especially for larger models, the feedforward andprojection layers\
\ actually make up the majority ofthe computational burden and can render\
\ process-ing long inputs intractable"
prefix: ' be applied to each input token.'
suffix: ".\u2217Author contributions are outli"
type: TextQuoteSelector
source: https://arxiv.org/pdf/2303.09752.pdf
text: Recent improvements in transformers for long documents have focused on efficiencies
in the attention mechanism but the feed-forward and projection layers are still
expensive for long docs
updated: '2023-03-21T19:59:04.177001+00:00'
uri: https://arxiv.org/pdf/2303.09752.pdf
user: acct:ravenscroftj@hypothes.is
user_info:
display_name: James Ravenscroft
in-reply-to: https://arxiv.org/pdf/2303.09752.pdf
tags:
- llm
- attention
- long-documents
- hypothesis
type: annotation
url: /annotations/2023/03/21/1679428744
---
<blockquote>Over the past few years, many “efficient Transformer” approaches have been proposed that reduce the cost of the attention mechanism over long inputs (Child et al., 2019; Ainslie et al., 2020; Beltagy et al., 2020; Zaheer et al., 2020; Wang et al., 2020; Tay et al., 2021; Guo et al., 2022). However, especially for larger models, the feedforward and projection layers actually make up the majority of the computational burden and can render processing long inputs intractable</blockquote>Recent improvements in transformers for long documents have focused on efficiencies in the attention mechanism, but the feed-forward and projection layers are still expensive for long docs.
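A back-of-envelope calculation makes the quoted point concrete. The sketch below assumes a standard dense transformer layer (model dimension d, feed-forward dimension 4d) and compares full attention with a hypothetical "efficient" local-window attention; the widths and lengths are illustrative, not taken from the paper:

```python
# Rough multiply-accumulate counts per token for one transformer layer.
def per_token_macs(d, n, w=None):
    attend_to = n if w is None else min(w, n)  # how many positions each token attends to
    projections = 4 * d * d                    # Q, K, V and output projections
    attn_scores = 2 * attend_to * d            # score computation + weighted value sum
    ffn = 2 * d * (4 * d)                      # two dense feed-forward matmuls
    return projections, attn_scores, ffn

d, n = 1024, 16384  # illustrative model width and sequence length
for label, w in [("full attention", None), ("local window w=128", 128)]:
    proj, scores, ffn = per_token_macs(d, n, w)
    total = proj + scores + ffn
    print(f"{label:18s}: FFN + projections = {(proj + ffn) / total:.0%} of layer compute")
```

Once the attention scores are made cheap, the feed-forward and projection matmuls are what remains, which is the point the quote is making.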

View File

@@ -0,0 +1,54 @@
---
date: '2023-03-21T19:59:42'
hypothesis-meta:
created: '2023-03-21T19:59:42.317507+00:00'
document:
title:
- 2303.09752.pdf
flagged: false
group: __world__
hidden: false
id: 63md-sgiEe2GA2OJo26mSA
links:
html: https://hypothes.is/a/63md-sgiEe2GA2OJo26mSA
incontext: https://hyp.is/63md-sgiEe2GA2OJo26mSA/arxiv.org/pdf/2303.09752.pdf
json: https://hypothes.is/api/annotations/63md-sgiEe2GA2OJo26mSA
permissions:
admin:
- acct:ravenscroftj@hypothes.is
delete:
- acct:ravenscroftj@hypothes.is
read:
- group:__world__
update:
- acct:ravenscroftj@hypothes.is
tags:
- llm
target:
- selector:
- end: 2402
start: 2357
type: TextPositionSelector
- exact: This paper presents COLT5 (ConditionalLongT5)
prefix: s are processed by aheavier MLP.
suffix: ', a new family of models that, b'
type: TextQuoteSelector
source: https://arxiv.org/pdf/2303.09752.pdf
text: CoLT5 stands for Conditional LongT5
updated: '2023-03-21T19:59:42.317507+00:00'
uri: https://arxiv.org/pdf/2303.09752.pdf
user: acct:ravenscroftj@hypothes.is
user_info:
display_name: James Ravenscroft
in-reply-to: https://arxiv.org/pdf/2303.09752.pdf
tags:
- llm
- hypothesis
type: annotation
url: /annotations/2023/03/21/1679428782
---
<blockquote>This paper presents COLT5 (Conditional LongT5)</blockquote>CoLT5 stands for Conditional LongT5

View File

@@ -0,0 +1,19 @@
---
date: '2023-04-07T11:14:41.131905'
mp-syndicate-to:
- https://brid.gy/publish/mastodon
photo:
- /media/2023/04/07/1680866081_0.jpg
tags:
- personal
type: note
url: /notes/2023/04/07/1680866081
---
<img src="/media/2023/04/07/1680866081_0.jpg" class="u-photo" />
Happy freaking Easter James - from Mother Nature
<a href="https://brid.gy/publish/mastodon"></a>

View File

@@ -0,0 +1,58 @@
---
date: '2023-03-13T20:08:35.475110'
mp-syndicate-to:
- https://brid.gy/publish/mastodon
tags:
- ai
- nlp
- humour
title: Deep Thought, Hitchhiker's Guide, LLMs and Raspberry Pis
description: Musings on parallels between AI fiction and AI fact
type: post
url: /posts/2023/03/13/deepthought-hitchhiker-s-guide-llms-and-raspberry-pis1678738115
---
Today I read via [Simon Willison's blog](https://simonwillison.net/2023/Mar/13/alpaca/) that [someone has managed to get LLaMA running on a Raspberry Pi]. That's pretty incredible progress and it made me think of this excerpt from [Hitchhiker's Guide to the Galaxy](https://bookwyrm.social/book/181728/s/hitchhikers-guide-to-the-galaxy-trilogy-collection-5-books-set-by-douglas-adams):
>"O Deep Thought computer," he said, "the task we have designed you to perform is this. We want you to tell us...." he paused, "The Answer."
>
>"The Answer?" said Deep Thought. "The Answer to what?"
>
>"Life!" urged Fook.
>
>"The Universe!" said Lunkwill.
>
>"Everything!" they said in chorus.
>
>Deep Thought paused for a moment's reflection.
>
>"Tricky," he said finally.
>
>"But can you do it?"
>
>Again, a significant pause.
>
>"Yes," said Deep Thought, "I can do it."
>
>"There is an answer?" said Fook with breathless excitement.
>
>"Yes," said Deep Thought. "Life, the Universe, and Everything. There is an answer. But, I'll have to think about it."
>
>...
>
>Fook glanced impatiently at his watch.
>
>“How long?” he said.
>
>“Seven and a half million years,” said Deep Thought.
>
>Lunkwill and Fook blinked at each other.
>
>“Seven and a half million years...!” they cried in chorus.
>
>“Yes,” declaimed Deep Thought, “I said I'd have to think about it, didn't I?”

Maybe Deep Thought was actually just an LLM running on a Raspberry Pi and that's why it took so long to generate the ultimate answer!
<a href="https://brid.gy/publish/mastodon"></a>

Binary file not shown.


View File

@@ -0,0 +1,45 @@
---
title: "Weeknote 11 2023"
date: 2023-03-20T19:53:00Z
description: in which I ate too much, entered gremlin mode and upgraded mkdocs-material
url: /2023/3/20/week-11
type: post
mp-syndicate-to:
- https://brid.gy/publish/mastodon
- https://brid.gy/publish/twitter
resources:
- name: feature
src: images/officelights.jpg
tags:
- personal
---
This week (or last week)'s weeknote is a touch late since I was travelling over the weekend. On Sunday it was Mother's Day in the UK so we visited my mum up in the Midlands and then Mrs R's mum down here in Hampshire, having a sit-down meal with both. It was a bit like [the bit in the Vicar of Dibley where she accidentally signs herself up for multiple Christmas dinners on the same day](https://www.youtube.com/watch?v=2aq3DNSF-jc).
---
On Tuesday we had a problem with the lighting in our office AND the water main near our office complex burst, which meant we were sat in the office like gremlins in the dark with no toilet facilities. I decided to work from home for the rest of the week, for reasons that were not unrelated.
{{<figure src="images/officelights.jpg" caption="The office in this photo by <a href='https://unsplash.com/@sunday_digital?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText'>Nastuh Abootalebi</a> has lights unlike our office which does not.">}}
- Now that I've got into the swing of using [foam](https://foambubble.github.io/) and [mkdocs](https://www.mkdocs.org/) to publish [my digital garden](https://notes.jamesravey.me/), I finally took the plunge and signed up as a [mkdocs-material insider](https://squidfunk.github.io/mkdocs-material/). I'm now sponsoring the good work of [squidfunk](https://fosstodon.org/@squidfunk) and also benefitting from some of the quality of life features that the insiders build of his theme provides including navigation breadcrumbs.
- I've been looking for a new printer for a while since my 10-year-old HP printer/scanner finally packed in on me just when I needed to print some important documents. I've heard horror stories about pretty much all inkjet printers, and the word on the street seemed to be: buy a laser printer if you can afford it and you don't print very frequently, as the toner cartridges last forever and don't clog up the way inkjet cartridges do. I was undecided about which printer to get until I read [this review](https://www.theverge.com/23642073/best-printer-2023-brother-laser-wi-fi-its-fine) which absolutely nails it. It arrived today, I set it up and it prints stuff so - yey I guess!
## Next Week
- (This week really - week 12) - I am in London towards the end of the week for a colleague's leaving get together and to hopefully hang out and get some face time with another colleague who is usually based up in Edinburgh.
- Trying out some new physical-journal-and-markdown hybrid note-taking methodologies.
- Hoping to have a quiet weekend at home and get some housework and gardening done.
## Interesting Links
- https://climatejets.org/ - really smart site that provides some insight into who is burning the most fuel flying around flippantly.
- [this guy](https://twitter.com/miolini/status/1634982361757790209) got LLaMA (a recent large language model) running on a Raspberry Pi. A couple of days later someone also got it running on a Pixel 5. Miniaturisation of this tech will help with its democratisation (which dilutes the power of the corporates who are pushing it so hard right now) and reduce the environmental impact of running it.
- [OpenAI Is Now Everything It Promised Not to Be: Corporate, Closed-Source, and For-Profit](https://www.vice.com/en/article/5d3naz/openai-is-now-everything-it-promised-not-to-be-corporate-closed-source-and-for-profit)

Binary file not shown.


View File

@@ -0,0 +1,122 @@
---
title: "NLP is more than just LLMs"
date: 2023-03-25T14:13:14Z
description: Opportunities for early NLP professionals and small companies in the post ChatGPT era
url: /2023/3/25/nlp-is-more-than-just-llms
type: post
mp-syndicate-to:
- https://brid.gy/publish/mastodon
- https://brid.gy/publish/twitter
resources:
- name: feature
src: images/language.jpg
tags:
- nlp
- llms
- ai
---
{{<figure src="images/language.jpg" caption="Typesetting blocks form alphabet spaghetti a bit like a language model might spit out. Photo by <a href='https://unsplash.com/@raphaelphotoch?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText'>Raphael Schaller</a> on <a href='https://unsplash.com/photos/GkinCd2enIY?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText'>Unsplash</a>">}}
There is sooo much hype around LLMs at the moment. As an NLP practitioner of 10 years (I built Partridge [^Partridge] in 2013), I find it exhausting and quite annoying, and amongst the junior ranks there's a lot of despondency and dejection and a feeling of "what's the point? ~~Closed~~OpenAI have solved NLP".
Well, I'm here to tell you that NLP is more than just LLMs and that there are plenty of opportunities to get into the field. What's more, there are plenty of interesting, ethical use cases that can benefit society. In this post I will describe a number of opportunities for research and development in NLP that are unrelated or tangential to training bigger and bigger transformer-based [^vaswaniAttentionAllYou] LLMs.
This post is based on a comment I made on a reddit thread [^aromatic_eye_6268ShouldSpecializeNLP2023] covering "should I study NLP?"
## Combatting Hallucination
If you take the hype at face value, you could be forgiven for believing that NLP is pretty much a solved problem. However, that simply isn't the case. LLMs hallucinate (make stuff up) and whilst there is a marked improvement in hallucinations between versions of GPT, hallucination is a problem with transformer-based LLMs in general as the technical co-founder of OpenAI, Ilya Sutskever admits [^smithGPT4CreatorIlya]. Instead of relying on pure LLMs, there are lots of opportunities for building NLP pipelines that can reliably retrieve answers from specific documents via semantic search [^SemanticSearchFAISS]. This sort of approach allows the end user to make their own mind up about the trustworthiness of the source rather than relying on the LLM itself which might be right or might spit out alphabet soup. This week OpenAI announced a plugin interface for ChatGPT that, in theory, facilitates a hybrid LLM and retrieval approach through their system. However, it seems like GPT can still hallucinate incorrect answers even when the correct one is in the retrieved response [^SometimesItHallucinates]. There's definitely some room for improvement here!
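To make the retrieval idea concrete, here's a minimal sketch of semantic search with sentence-transformers. The document set and query are made up, the model name is just one commonly used public checkpoint, and in practice you would index the embeddings with something like FAISS rather than comparing them directly:

```python
from sentence_transformers import SentenceTransformer, util

# Made-up knowledge base; in a real system these would be your own documents.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm UK time.",
    "Enterprise customers get a dedicated account manager.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # small public embedding model
doc_embeddings = model.encode(docs, convert_to_tensor=True)

query = "How long do I have to return an item?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank documents by cosine similarity and surface the best match with its score,
# so the end user can judge the source rather than trusting generated text.
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best = int(scores.argmax())
print(f"Top match (score {scores[best]:.2f}): {docs[best]}")
```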
As use of LLMs becomes more widespread and people ask them questions and use them to write blog posts, we're going to start seeing more hallucinations presented as facts online. What's more, we're already seeing LLMs citing misinformation generated by other LLMs [^vincentGoogleMicrosoftChatbots2023] to their users.
## Bot Detection
There are certainly opportunities in bot vs human detection. Solutions like GPTZero [^GPTZero] and GLTR[^GLTRGlitterV0] rely on the statistical likelihood that a model would use a given sequence of words based on historical output (for example if the words "bananas in pajamas" never appear in known GPT output but they appear in the input document, the probability that it was written by a human is increased). Approaches like DetectGPT [^mitchellDetectGPTZeroShotMachineGenerated2023] use a model to perturb (subtly change) the output and compare the probabilities of the strings being generated to see if the original "sticks out" as being unusual and thus more human-like. ***edit: I was also contacted by Tracey Deacker - a computer science student in Reykjavik, who recommended CrossPlag[^CrossPlag] - another such detection tool.***
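For a feel of the statistical signal these detectors lean on, here's a rough sketch that scores text by how predictable it is under a small reference language model. GPT-2 is just an openly available stand-in here, not what any of the commercial detectors actually use, and a single score like this is nowhere near a reliable classifier:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def mean_token_logprob(text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # The LM loss is the mean negative log-likelihood per token; negate it back.
    return -out.loss.item()

for text in [
    "Bananas in pajamas are coming down the stairs.",
    "The quick brown fox jumps over the lazy dog.",
]:
    print(f"{mean_token_logprob(text):7.2f}  {text}")
# Text whose tokens are unusually predictable is a (weak) hint of machine generation.
```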
It seems like bot detection and evading detection are likely to be a new arms race: as new detection methods emerge, people will build more and more complex methods for evading detection or rely on adversarial training approaches to train existing models to evade new detection approaches automatically.
## Fact Checking and Veracity
Regardless of who wrote the content, bots or humans, fact-checking remains a key topic for NLP, again something that generative LLMs are not really set up to do. Fact checking is a relatively mature area of NLP with challenges and workshops like FEVER [^thorneFEVERLargescaleDataset2018]. However, it remains a tricky area which may require models to make multiple logical "hops" to arrive at a conclusion.
When direct evidence of something is not available, rumour verification is another tool in the NLP arsenal that may help us to derive the trustworthiness of a source. It works by identifying support or denial from parties who may be involved in a particular rumour (for example, Donald Trump tweets that he's going to be arrested and some AI-generated photos of his arrest appear online, posted by unknown actors, but we can determine that this is unlikely to be true because social media accounts at trustworthy newspapers tweet that Trump created a false expectation of arrest). Kochkina et al. currently hold the state of the art on the RumourEval dataset [^kochkinaTuringSemEval2017Task2017].
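A very stripped-down version of the verification step might look like the sketch below: treat a retrieved passage as the premise and the claim as the hypothesis, and let a natural language inference model decide whether the evidence supports it. The checkpoint is a public NLI model chosen purely for illustration, and real FEVER-style systems also need evidence retrieval and multi-hop reasoning on top of this:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "facebook/bart-large-mnli"  # public NLI checkpoint, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

evidence = "The FEVER shared task released over 185,000 claims verified against Wikipedia."
claim = "FEVER is a large-scale dataset for fact extraction and verification."

# NLI convention: premise (evidence) first, hypothesis (claim) second.
inputs = tokenizer(evidence, claim, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]

for label_id, label in model.config.id2label.items():
    print(f"{label:13s} {probs[label_id]:.2f}")
```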
## Temporal Reasoning
Things change over time. The answer to "who is the UK Prime Minister" today is different to this time last year. GPT-3.5 got around this by often prefixing information with big disclaimers about being trained in 2021 before telling you that the UK Prime Minister is Boris Johnson, having no idea who Rishi Sunak is. Early Bing/Sydney (which we now know was GPT-4 [^ConfirmedNewBing]) simply tried to convince you that it was actually 2022, not 2023, and that you must be wrong: "You have been a bad user. I have been a good Bing" [^vynckMicrosoftAIChatbot2023].
Again, this is something that a pure transformer-based LLM sucks at and around which there are many opportunities. Recent work in this area includes modelling moments of change in people's mood based on social media posts [^tsakalidisIdentifyingMomentsChange2022], and earlier work has looked at things like how topics of discussion in scientific research change over time [^prabhakaranPredictingRiseFall2016].
## Specialised Models and Low Compute Modelling
LLMs are huge, power-hungry language generalists but often get outperformed by smaller specialised models at specific tasks [^schickExploitingClozeQuestionsFewShot2021] [^schickTrueFewShotLearning2021] [^gaoMakingPretrainedLanguage2021]. Furthermore, recent developments have shown that we can get pretty good performance out of LLMs by shrinking them so that they run on laptops, Raspberry Pis and even mobile phones [^LargeLanguageModels]. It also looks like it's possible to get ChatGPT-like performance from relatively small LLMs with the right datasets: Databricks yesterday announced their Dolly model, which was trained on a single machine in under an hour [^HelloDollyDemocratizing2023].
There is plenty more work to be done in continuing to shrink models so that they can be used on-site, on mobile or in embedded use cases in order to support use cases where flexibility and trustworthiness are key. Many of my customers would be very unlikely to let me send their data to OpenAI to be processed and potentially learned from in a way that would benefit their competitors or that could accidentally leak confidential information and cause GDPR headaches.
Self-hosted models are also a known quantity but the big organisations that can afford to train and host these gigantic LLMs stand to make a lot of money off people just using their APIs as black boxes. Building small, specialised models that can run on cheap commodity hardware will allow small companies to benefit from NLP without relying on OpenAI's generosity. It might make sense for small companies to start building with a hosted LLM but when you get serious, you need to own your model [^HelloDollyDemocratizing2023].
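To put some code behind the "small, specialised models" point, here is a deliberately boring baseline: a tiny task-specific classifier that trains in seconds on a laptop. The task and data are made up for illustration; the point is that for narrow problems like routing support tickets you don't need a frontier LLM at all:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data (hypothetical) for a two-way ticket router.
texts = [
    "please reset my password", "I cannot log in to my account",
    "what is your pricing for teams?", "do you offer an annual discount?",
]
labels = ["support", "support", "sales", "sales"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["my login doesn't work", "how much does the pro plan cost?"]))
# Cheap, auditable, and the data never leaves your own hardware.
```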
## Trust and Reproducibility
Explainability and trustworthiness of models are now a crucial part of the machine learning landscape. It is often very important to understand why an algorithm made a particular decision in order to eliminate latent biases and discrimination and to ensure that the reasoning behind a decision is sound in general. There are plenty of opportunities to improve the current state-of-the-art in this space by training models that can explain their rationale as part of their decision process [^chanUNIREXUnifiedLearning2022] and by developing benchmarks and tests that can draw out problematic biases [^ribeiroAccuracyBehavioralTesting2020] [^morrisTextAttackFrameworkAdversarial2020].
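In the spirit of CheckList-style behavioural testing, a minimal invariance test can be written in a few lines: perturb the input in a way that shouldn't change the label and count how often the prediction flips. The `predict_sentiment` function below is a hypothetical stand-in for whatever model you are auditing:

```python
import random

def predict_sentiment(text):
    # Hypothetical model under test; swap in your real classifier here.
    return "positive" if "great" in text.lower() else "negative"

NAMES = ["Alice", "Mohammed", "Priya", "Wei"]

def name_invariance_failure_rate(template, n_trials=20):
    # The prediction should not depend on which name appears in the sentence.
    baseline = predict_sentiment(template.format(name="Alex"))
    flips = sum(
        predict_sentiment(template.format(name=random.choice(NAMES))) != baseline
        for _ in range(n_trials)
    )
    return flips / n_trials

print("failure rate:", name_invariance_failure_rate("{name} said the film was great."))
```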
The big players have started to signal their intent not to make their models and datasets open any more [^vincentOpenAICofounderCompany2023] [^snyderAILeaderSays2023]. By hiding this detail, they are effectively withdrawing from the scientific community and we can no longer meaningfully reproduce their findings or trust their results. For example, there are some pretty feasible hypotheses about how GPT-4 may have previously been exposed to and overfit on the bar exam papers that it supposedly aced [^narayananGPT4ProfessionalBenchmarks2023]. Without access to the model's dataset or weights, nobody can check this.
In fact, we've got something of a reproducibility crisis when it comes to AI in general [^knightSloppyUseMachine]. There are lots of opportunities for budding practitioners to enter the arena and tidy up processes and tools and reproduce results.
## Conclusion
In conclusion, while the world's gone mad with GPT fever, it's important to remember that there are still a huge number of opportunities within the NLP space for small research groups and businesses.
I sort of see ChatGPT a bit like how many software engineers see MongoDB: a prototyping tool you might use at a hackathon to get a proof-of-concept working but which you subsequently revisit and replace with a more appropriate, tailored tool.
So for early-career researchers and engineers considering NLP: it's definitely worth learning about LLMs and considering their strengths and weaknesses, but also remember that, regardless of what the Silicon Valley giants would have you believe, NLP is more than just LLMs.
## Other Resources for AI Beyond LLMs
Here are some more resources on NLP and ML work that is going on outside of the current LLM bubble, from others in the NLP space:
https://twitter.com/andriy_mulyar/status/1636139257805828096 - a thread where some nlp experts weigh in on unsolved problems
https://twitter.com/vboykis/status/1635987389381222406 - a recent chat between AI and ML practitioners on stuff they are working on outside of LLMs
https://link.medium.com/6Bz5jc2hsyb - a blog post from an NLP professor about finding problems to work on outside of the LLM bubble.
[^Partridge]: Partridge - a web based tool used for scientific paper retrieval and filtering that makes use of Machine Learning techniques. https://beta.papro.org.uk
[^aromatic_eye_6268ShouldSpecializeNLP2023]: Aromatic_Eye_6268. (2023, March 25). Should I specialize in NLP considering the advent of Large Language Models? [Reddit Post]. R/LanguageTechnology. www.reddit.com/r/LanguageTechnology/comments/121gv4c/should_i_specialize_in_nlp_considering_the_advent/
[^smithGPT4CreatorIlya]: Smith, C. S. (n.d.). GPT-4 Creator Ilya Sutskever on AI Hallucinations and AI Democracy. Forbes. Retrieved 25 March 2023, from https://www.forbes.com/sites/craigsmith/2023/03/15/gpt-4-creator-ilya-sutskever-on-ai-hallucinations-and-ai-democracy/
[^SemanticSearchFAISS]: Semantic search with FAISS - Hugging Face Course. (n.d.). Retrieved 25 March 2023, from https://huggingface.co/course/chapter5/6
[^vaswaniAttentionAllYou]: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (n.d.). Attention is All you Need. 11.
[^SometimesItHallucinates]: Sometimes it hallucinates despite fetching accurate data! · Issue #2 · simonw/datasette-chatgpt-plugin. (n.d.). GitHub. Retrieved 25 March 2023, from https://github.com/simonw/datasette-chatgpt-plugin/issues/2
[^vincentGoogleMicrosoftChatbots2023]: Vincent, J. (2023, March 22). Google and Microsoft's chatbots are already citing one another in a misinformation shitshow. The Verge. https://www.theverge.com/2023/3/22/23651564/google-microsoft-bard-bing-chatbots-misinformation
[^GPTZero]: GPTZero. (n.d.). Retrieved 25 March 2023, from https://gptzero.me/
[^GLTRGlitterV0]: GLTR (glitter) v0.5. (n.d.). Retrieved 25 March 2023, from http://gltr.io/dist/index.html
[^mitchellDetectGPTZeroShotMachineGenerated2023]: Mitchell, E., Lee, Y., Khazatsky, A., Manning, C. D., & Finn, C. (2023). DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature (arXiv:2301.11305). arXiv. http://arxiv.org/abs/2301.11305
[^ConfirmedNewBing]: Confirmed: The new Bing runs on OpenAI's GPT-4 | Bing Search Blog. (n.d.). Retrieved 25 March 2023, from https://blogs.bing.com/search/march_2023/Confirmed-the-new-Bing-runs-on-OpenAI%E2%80%99s-GPT-4
[^vynckMicrosoftAIChatbot2023]: Vynck, G. D., Lerman, R., & Tiku, N. (2023, February 17). Microsoft's AI chatbot is going off the rails. Washington Post. https://www.washingtonpost.com/technology/2023/02/16/microsoft-bing-ai-chatbot-sydney/
[^tsakalidisIdentifyingMomentsChange2022]: Tsakalidis, A., Nanni, F., Hills, A., Chim, J., Song, J., & Liakata, M. (2022). Identifying Moments of Change from Longitudinal User Text. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 4647–4660. https://doi.org/10.18653/v1/2022.acl-long.318
[^prabhakaranPredictingRiseFall2016]: Prabhakaran, V., Hamilton, W. L., McFarland, D., & Jurafsky, D. (2016). Predicting the Rise and Fall of Scientific Topics from Trends in their Rhetorical Framing. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1170–1180. https://doi.org/10.18653/v1/P16-1111
[^thorneFEVERLargescaleDataset2018]: Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). FEVER: A Large-scale Dataset for Fact Extraction and VERification. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 809–819. https://doi.org/10.18653/v1/N18-1074
[^kochkinaTuringSemEval2017Task2017]: Kochkina, E., Liakata, M., & Augenstein, I. (2017). Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM (arXiv:1704.07221). arXiv. http://arxiv.org/abs/1704.07221
[^schickExploitingClozeQuestionsFewShot2021]: Schick, T., & Schütze, H. (2021). Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 255–269. https://www.aclweb.org/anthology/2021.eacl-main.20
[^schickTrueFewShotLearning2021]: Schick, T., & Schütze, H. (2021). True Few-Shot Learning with Prompts—A Real-World Perspective. ArXiv:2111.13440 [Cs]. http://arxiv.org/abs/2111.13440
[^gaoMakingPretrainedLanguage2021]: Gao, T., Fisch, A., & Chen, D. (2021). Making Pre-trained Language Models Better Few-shot Learners. ArXiv:2012.15723 [Cs]. http://arxiv.org/abs/2012.15723
[^LargeLanguageModels]: Large language models are having their Stable Diffusion moment. (n.d.). Retrieved 25 March 2023, from https://simonwillison.net/2023/Mar/11/llama/
[^HelloDollyDemocratizing2023]: Hello Dolly: Democratizing the magic of ChatGPT with open models. (2023, March 24). Databricks. https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
[^vincentOpenAICofounderCompany2023]: Vincent, J. (2023, March 15). OpenAI co-founder on company's past approach to openly sharing research: “We were wrong”. The Verge. https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview
[^snyderAILeaderSays2023]: Snyder, A. (2023, March 2). AI leader says field's new territory is promising but risky. Axios. https://www.axios.com/2023/03/02/demis-hassabis-deepmind-ai-new-territory
[^narayananGPT4ProfessionalBenchmarks2023]: Narayanan, A., & Kapoor, S. (2023, March 20). GPT-4 and professional benchmarks: The wrong answer to the wrong question [Substack newsletter]. AI Snake Oil. https://aisnakeoil.substack.com/p/gpt-4-and-professional-benchmarks
[^chanUNIREXUnifiedLearning2022]: Chan, A., Sanjabi, M., Mathias, L., Tan, L., Nie, S., Peng, X., Ren, X., & Firooz, H. (2022). UNIREX: A Unified Learning Framework for Language Model Rationale Extraction. https://doi.org/10.18653/v1/2022.bigscience-1.5
[^ribeiroAccuracyBehavioralTesting2020]: Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond Accuracy: Behavioral Testing of NLP Models with CheckList. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4902–4912. https://doi.org/10.18653/v1/2020.acl-main.442
[^morrisTextAttackFrameworkAdversarial2020]: Morris, J. X., Lifland, E., Yoo, J. Y., Grigsby, J., Jin, D., & Qi, Y. (2020). TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP (arXiv:2005.05909). arXiv. https://doi.org/10.48550/arXiv.2005.05909
[^knightSloppyUseMachine]: Knight, W. (n.d.). Sloppy Use of Machine Learning Is Causing a Reproducibility Crisis in Science. Wired. Retrieved 25 March 2023, from https://www.wired.com/story/machine-learning-reproducibility-crisis/
[^CrossPlag]: AI Content Detector - Crossplag - https://crossplag.com/ai-content-detector/

View File

@@ -2186,6 +2186,25 @@
"content": null, "content": null,
"published": "2022-05-05T16:24:01+00:00" "published": "2022-05-05T16:24:01+00:00"
} }
},
{
"id": 1662026,
"source": "https:\/\/jamesg.coffee\/2023\/liked-brainsteamcouk2022130debugging-bridgy-for-my-blog",
"target": "https:\/\/brainsteam.co.uk\/2022\/1\/30\/debugging-bridgy-for-my-blog\/",
"activity": {
"type": "like"
},
"verified_date": "2023-04-14T15:51:52.180446",
"data": {
"author": {
"type": "card",
"name": "James' Coffee Blog",
"photo": "https:\/\/webmention.io\/avatar\/jamesg.coffee\/44a4b81d3ad1303a2acd54e82a33c5333b80a611e540f4971bcb5fd93096c352.jpg",
"url": "https:\/\/jamesg.coffee\/profile\/capjamesg"
},
"content": null,
"published": "2023-04-14T15:51:46+00:00"
}
}
],
"\/notes\/2022\/02\/04\/1643990322\/": [
@@ -14609,6 +14628,31 @@
"content": null, "content": null,
"published": null "published": null
} }
},
{
"source": "https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/",
"verified": true,
"verified_date": "2023-03-12T19:41:21+00:00",
"id": 1639493,
"private": false,
"data": {
"author": {
"name": "James Ravenscroft",
"url": "https:\/\/brainsteam.co.uk",
"photo": null
},
"url": "https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/",
"name": "Weeknote 2023 Week 10",
"content": "<ul><li>This week I was back at work after both myself and Mrs R were off poorly <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/\">most of last week<\/a>. It\u2019s actually been pretty hard going and every night after work we have been coming home, collapsing on the sofa and sleeping.<\/li>\n<li>Unfortunately we failed miserably at seeing Jon Richardson as we were just feeling too sick.<\/li>\n<li>At work I was running a tech due dilligence project with an edtech company which was really interesting and fun. Sometimes it is nice to be reminded about how other small tech companies operate and that the trials and tribulations that my co-founders and I face are often shared by others.<\/li>\n<li>I received a lovely surprised in the mail: a mug with the abstract from my PhD thesis that my supervisor Amanda sent off for <a href=\"https:\/\/brainsteam.co.uk\/notes\/2023\/02\/13\/1676295052\/\">when I got the news that the final version of my work had been accepted<\/a>.<\/li>\n<\/ul><img src=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/images\/thesis_mug_small.jpg\" alt=\"A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda\" \/><p>A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda<\/p>\n <ul><li>This week I\u2019ve been taking a break from non-fiction and reading <a href=\"https:\/\/bookwyrm.social\/book\/219642\/s\/valors-choice\">Valor\u2019s Choice by Tanya Huff<\/a>, a veritable sci-fi cheese-fest that I added to my to read list after chatting with some fellow bookworms on Mastodon.<\/li>\n<li>We had snow this week across most of the UK but where I live in the south we mainly just got rained on a lot.<\/li>\n<\/ul><h2>Interesting Links<\/h2>\n<ul><li>I was reminded of this masterclass in game theory <a href=\"https:\/\/ncase.me\/trust\/\">The Evolution of Trust<\/a> which talks about what happens when strangers choose to cooperate or compete for resources (as per the prisoner\u2019s dilema). This demo is super interesting.<\/li>\n<li>Yesterday <a href=\"https:\/\/simonwillison.net\/2023\/Mar\/11\/llama\/\">Simon Willison wrote about how he managed to get the new Llama language model running on a macbook pro<\/a> and how exciting that is. As Simon says \u201cIt\u2019s easy to fall into a cynical trap of thinking there\u2019s nothing good here at all, and everything generative AI is either actively harmful or a waste of time\u201d - I\u2019m actually quite close to this train of though which you might find surprising for an AI\/ML specialist. 
However, given that generative models are here to stay (nobody\u2019s putting this genie back in the bottle), it is great that we are starting to see commoditization and democratization of them rather than letting a small oligarchy of huge SVL companies gatekeeping the technology.<\/li>\n<\/ul><h2>Blog Posts From Me<\/h2>\n<ul><li><a href=\"https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/\">Haunted by my headphones: a modern ghost story<\/a> a (tongue in cheek) spooky encounter with some old headphones.<\/li>\n<\/ul><h2>Next Week<\/h2>\n<ul><li>I\u2019m giving a talk about my PhD research to some computer science PhD students at the University of Manchester this week.<\/li>\n<li>Now that I\u2019m feeling better I want to get back out in the garden - I\u2019ve got some <a href=\"https:\/\/www.gardensillustrated.com\/plants\/chitting-potatoes-how-to\/\">chitted<\/a> potatoes that are ready to be planted in their grow bags and a whole bunch of general maintainence chores to do.<\/li>\n<li><a href=\"https:\/\/tv.apple.com\/gb\/show\/ted-lasso\/umc.cmc.vtoh0mn0xn7t3c643xqonfzy\">Ted Lasso<\/a> Season 3 starts - we\u2019ve been rewatching series 1 and 2 in preparation. We\u2019re also planning on catching up on <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/\">You<\/a> as the second part of the new season just dropped.<\/li>\n<li>Now that I\u2019m feeling a bit better I\u2019m hoping to spend some time on my hobby projects after work again. I\u2019ve got one new website idea that I\u2019m particularly excited about sharing when the concept is a little more proven\u2026<\/li>\n<\/ul><a href=\"https:\/\/brid.gy\/publish\/mastodon\"><\/a>\n \n <a href=\"https:\/\/brid.gy\/publish\/twitter\"><\/a>",
"published": "2023-03-11T17:08:55+00:00",
"published_ts": 1678554535
},
"activity": {
"type": "link",
"sentence": "James Ravenscroft posted 'This week I was back at work after both myself and Mrs R were off poorly most of...' linking to https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/",
"sentence_html": "<a href=\"https:\/\/brainsteam.co.uk\">James Ravenscroft<\/a> posted 'This week I was back at work after both myself and Mrs R were off poorly most of...' linking to <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/\">https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/<\/a>"
},
"target": "https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/"
}
],
"\/notes\/2023\/03\/06\/1678136032\/": [
@@ -14714,6 +14758,147 @@
"sentence_html": "<a href=\"https:\/\/mastodon.longlandclan.id.au\/@stuartl\">Stuart Longland (VK4MSL)<\/a> commented '@jamesravey If it happened to me, likely I'd come across them long after the lit...' on a post <a href=\"https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/\">https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/<\/a>" "sentence_html": "<a href=\"https:\/\/mastodon.longlandclan.id.au\/@stuartl\">Stuart Longland (VK4MSL)<\/a> commented '@jamesravey If it happened to me, likely I'd come across them long after the lit...' on a post <a href=\"https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/\">https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/<\/a>"
}, },
"target": "https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/" "target": "https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/"
},
{
"id": 1639491,
"source": "https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/",
"target": "https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/",
"activity": {
"type": "mention"
},
"verified_date": "2023-03-12T19:40:49.484839",
"data": {
"author": {
"type": "card",
"name": "James Ravenscroft",
"photo": "",
"url": "https:\/\/brainsteam.co.uk"
},
"content": "<ul><li>This week I was back at work after both myself and Mrs R were off poorly <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/\">most of last week<\/a>. It\u2019s actually been pretty hard going and every night after work we have been coming home, collapsing on the sofa and sleeping.<\/li>\n<li>Unfortunately we failed miserably at seeing Jon Richardson as we were just feeling too sick.<\/li>\n<li>At work I was running a tech due dilligence project with an edtech company which was really interesting and fun. Sometimes it is nice to be reminded about how other small tech companies operate and that the trials and tribulations that my co-founders and I face are often shared by others.<\/li>\n<li>I received a lovely surprised in the mail: a mug with the abstract from my PhD thesis that my supervisor Amanda sent off for <a href=\"https:\/\/brainsteam.co.uk\/notes\/2023\/02\/13\/1676295052\/\">when I got the news that the final version of my work had been accepted<\/a>.<\/li>\n<\/ul><img src=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/images\/thesis_mug_small.jpg\" alt=\"A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda\" \/><p>A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda<\/p>\n <ul><li>This week I\u2019ve been taking a break from non-fiction and reading <a href=\"https:\/\/bookwyrm.social\/book\/219642\/s\/valors-choice\">Valor\u2019s Choice by Tanya Huff<\/a>, a veritable sci-fi cheese-fest that I added to my to read list after chatting with some fellow bookworms on Mastodon.<\/li>\n<li>We had snow this week across most of the UK but where I live in the south we mainly just got rained on a lot.<\/li>\n<\/ul><h2>Interesting Links<\/h2>\n<ul><li>I was reminded of this masterclass in game theory <a href=\"https:\/\/ncase.me\/trust\/\">The Evolution of Trust<\/a> which talks about what happens when strangers choose to cooperate or compete for resources (as per the prisoner\u2019s dilema). This demo is super interesting.<\/li>\n<li>Yesterday <a href=\"https:\/\/simonwillison.net\/2023\/Mar\/11\/llama\/\">Simon Willison wrote about how he managed to get the new Llama language model running on a macbook pro<\/a> and how exciting that is. As Simon says \u201cIt\u2019s easy to fall into a cynical trap of thinking there\u2019s nothing good here at all, and everything generative AI is either actively harmful or a waste of time\u201d - I\u2019m actually quite close to this train of though which you might find surprising for an AI\/ML specialist. 
However, given that generative models are here to stay (nobody\u2019s putting this genie back in the bottle), it is great that we are starting to see commoditization and democratization of them rather than letting a small oligarchy of huge SVL companies gatekeeping the technology.<\/li>\n<\/ul><h2>Blog Posts From Me<\/h2>\n<ul><li><a href=\"https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/\">Haunted by my headphones: a modern ghost story<\/a> a (tongue in cheek) spooky encounter with some old headphones.<\/li>\n<\/ul><h2>Next Week<\/h2>\n<ul><li>I\u2019m giving a talk about my PhD research to some computer science PhD students at the University of Manchester this week.<\/li>\n<li>Now that I\u2019m feeling better I want to get back out in the garden - I\u2019ve got some <a href=\"https:\/\/www.gardensillustrated.com\/plants\/chitting-potatoes-how-to\/\">chitted<\/a> potatoes that are ready to be planted in their grow bags and a whole bunch of general maintainence chores to do.<\/li>\n<li><a href=\"https:\/\/tv.apple.com\/gb\/show\/ted-lasso\/umc.cmc.vtoh0mn0xn7t3c643xqonfzy\">Ted Lasso<\/a> Season 3 starts - we\u2019ve been rewatching series 1 and 2 in preparation. We\u2019re also planning on catching up on <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/\">You<\/a> as the second part of the new season just dropped.<\/li>\n<li>Now that I\u2019m feeling a bit better I\u2019m hoping to spend some time on my hobby projects after work again. I\u2019ve got one new website idea that I\u2019m particularly excited about sharing when the concept is a little more proven\u2026<\/li>\n<\/ul><a href=\"https:\/\/brid.gy\/publish\/mastodon\"><\/a>\n \n <a href=\"https:\/\/brid.gy\/publish\/twitter\"><\/a>",
"published": "2023-03-11T17:08:55"
}
}
],
"\/notes\/2023\/02\/13\/1676295052\/": [
{
"id": 1639494,
"source": "https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/",
"target": "https:\/\/brainsteam.co.uk\/notes\/2023\/02\/13\/1676295052\/",
"activity": {
"type": "mention"
},
"verified_date": "2023-03-12T19:41:16.726934",
"data": {
"author": {
"type": "card",
"name": "James Ravenscroft",
"photo": "",
"url": "https:\/\/brainsteam.co.uk"
},
"content": "<ul><li>This week I was back at work after both myself and Mrs R were off poorly <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/04\/week-9\/\">most of last week<\/a>. It\u2019s actually been pretty hard going and every night after work we have been coming home, collapsing on the sofa and sleeping.<\/li>\n<li>Unfortunately we failed miserably at seeing Jon Richardson as we were just feeling too sick.<\/li>\n<li>At work I was running a tech due dilligence project with an edtech company which was really interesting and fun. Sometimes it is nice to be reminded about how other small tech companies operate and that the trials and tribulations that my co-founders and I face are often shared by others.<\/li>\n<li>I received a lovely surprised in the mail: a mug with the abstract from my PhD thesis that my supervisor Amanda sent off for <a href=\"https:\/\/brainsteam.co.uk\/notes\/2023\/02\/13\/1676295052\/\">when I got the news that the final version of my work had been accepted<\/a>.<\/li>\n<\/ul><img src=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/images\/thesis_mug_small.jpg\" alt=\"A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda\" \/><p>A mug with my PhD thesis abstract printed on it courtesy of my supervisor Amanda<\/p>\n <ul><li>This week I\u2019ve been taking a break from non-fiction and reading <a href=\"https:\/\/bookwyrm.social\/book\/219642\/s\/valors-choice\">Valor\u2019s Choice by Tanya Huff<\/a>, a veritable sci-fi cheese-fest that I added to my to read list after chatting with some fellow bookworms on Mastodon.<\/li>\n<li>We had snow this week across most of the UK but where I live in the south we mainly just got rained on a lot.<\/li>\n<\/ul><h2>Interesting Links<\/h2>\n<ul><li>I was reminded of this masterclass in game theory <a href=\"https:\/\/ncase.me\/trust\/\">The Evolution of Trust<\/a> which talks about what happens when strangers choose to cooperate or compete for resources (as per the prisoner\u2019s dilema). This demo is super interesting.<\/li>\n<li>Yesterday <a href=\"https:\/\/simonwillison.net\/2023\/Mar\/11\/llama\/\">Simon Willison wrote about how he managed to get the new Llama language model running on a macbook pro<\/a> and how exciting that is. As Simon says \u201cIt\u2019s easy to fall into a cynical trap of thinking there\u2019s nothing good here at all, and everything generative AI is either actively harmful or a waste of time\u201d - I\u2019m actually quite close to this train of though which you might find surprising for an AI\/ML specialist. 
However, given that generative models are here to stay (nobody\u2019s putting this genie back in the bottle), it is great that we are starting to see commoditization and democratization of them rather than letting a small oligarchy of huge SVL companies gatekeeping the technology.<\/li>\n<\/ul><h2>Blog Posts From Me<\/h2>\n<ul><li><a href=\"https:\/\/brainsteam.co.uk\/2023\/3\/11\/haunted-by-my-headphones\/\">Haunted by my headphones: a modern ghost story<\/a> a (tongue in cheek) spooky encounter with some old headphones.<\/li>\n<\/ul><h2>Next Week<\/h2>\n<ul><li>I\u2019m giving a talk about my PhD research to some computer science PhD students at the University of Manchester this week.<\/li>\n<li>Now that I\u2019m feeling better I want to get back out in the garden - I\u2019ve got some <a href=\"https:\/\/www.gardensillustrated.com\/plants\/chitting-potatoes-how-to\/\">chitted<\/a> potatoes that are ready to be planted in their grow bags and a whole bunch of general maintainence chores to do.<\/li>\n<li><a href=\"https:\/\/tv.apple.com\/gb\/show\/ted-lasso\/umc.cmc.vtoh0mn0xn7t3c643xqonfzy\">Ted Lasso<\/a> Season 3 starts - we\u2019ve been rewatching series 1 and 2 in preparation. We\u2019re also planning on catching up on <a href=\"https:\/\/brainsteam.co.uk\/2023\/03\/12\/week-10\/\">You<\/a> as the second part of the new season just dropped.<\/li>\n<li>Now that I\u2019m feeling a bit better I\u2019m hoping to spend some time on my hobby projects after work again. I\u2019ve got one new website idea that I\u2019m particularly excited about sharing when the concept is a little more proven\u2026<\/li>\n<\/ul><a href=\"https:\/\/brid.gy\/publish\/mastodon\"><\/a>\n \n <a href=\"https:\/\/brid.gy\/publish\/twitter\"><\/a>",
"published": "2023-03-11T17:08:55"
}
}
],
"\/2018\/04\/05\/phd-mini-sabbaticals\/": [
{
"id": 1640450,
"source": "https:\/\/minimus.su\/?p=77534",
"target": "https:\/\/brainsteam.co.uk\/2018\/04\/05\/phd-mini-sabbaticals\/",
"activity": {
"type": "mention"
},
"verified_date": "2023-03-14T05:39:17.611621",
"data": {
"author": {
"type": "card",
"name": "",
"photo": "",
"url": ""
},
"content": null,
"published": null
}
}
],
"\/posts\/2023\/03\/13\/deepthought-hitchhiker-s-guide-llms-and-raspberry-pis1678738115\/": [
{
"id": 1640946,
"source": "https:\/\/brid.gy\/like\/mastodon\/@jamesravey@fosstodon.org\/110017834880580225\/109253482134385876",
"target": "https:\/\/brainsteam.co.uk\/posts\/2023\/03\/13\/deepthought-hitchhiker-s-guide-llms-and-raspberry-pis1678738115\/",
"activity": {
"type": "like"
},
"verified_date": "2023-03-14T18:41:46.363996",
"data": {
"author": {
"type": "card",
"name": "James Sutton",
"photo": "https:\/\/webmention.io\/avatar\/cdn.fosstodon.org\/e965cbd56b7dc097db583014c68c7fff3e61fba6f82dedca575bb7bb1d24463f.jpg",
"url": "https:\/\/mastodon.social\/@jpwsutton"
},
"content": null,
"published": null
}
}
],
"\/2023\/3\/25\/nlp-is-more-than-just-llms\/": [
{
"id": 1649248,
"source": "https:\/\/brid.gy\/repost\/mastodon\/@jamesravey@fosstodon.org\/110084990672585042\/109360287301702419",
"target": "https:\/\/brainsteam.co.uk\/2023\/3\/25\/nlp-is-more-than-just-llms\/",
"activity": {
"type": "repost"
},
"verified_date": "2023-03-25T17:02:57.716454",
"data": {
"author": {
"type": "card",
"name": "Brahn",
"photo": "https:\/\/webmention.io\/avatar\/cdn.fosstodon.org\/0f148ff62cf0cf4bdcd1691f3ddd73662c5871d70a12ce396fcdbee6b41246d9.png",
"url": "https:\/\/hachyderm.io\/@Brahn"
},
"content": null,
"published": null
}
},
{
"id": 1649375,
"source": "https:\/\/brid.gy\/like\/mastodon\/@jamesravey@fosstodon.org\/110084990672585042\/109305440155192695",
"target": "https:\/\/brainsteam.co.uk\/2023\/3\/25\/nlp-is-more-than-just-llms\/",
"activity": {
"type": "like"
},
"verified_date": "2023-03-25T21:39:45.032457",
"data": {
"author": {
"type": "card",
"name": "Bill Ricker",
"photo": "https:\/\/webmention.io\/avatar\/cdn.fosstodon.org\/6e2f07d86abb24242d2f999657de78f87ca91dfd424a2ac821e462a65307992a.png",
"url": "https:\/\/fosstodon.org\/@BRicker"
},
"content": null,
"published": null
}
},
{
"id": 1649384,
"source": "https:\/\/brid.gy\/repost\/mastodon\/@jamesravey@fosstodon.org\/110084990672585042\/109332168565600309",
"target": "https:\/\/brainsteam.co.uk\/2023\/3\/25\/nlp-is-more-than-just-llms\/",
"activity": {
"type": "repost"
},
"verified_date": "2023-03-25T22:10:09.972565",
"data": {
"author": {
"type": "card",
"name": "Mars Ikeda",
"photo": "https:\/\/webmention.io\/avatar\/cdn.fosstodon.org\/49767b1d2495f7e2ab74b267cb74c2e640d675db0be4b160e6eaf8b5bad977b5.jpg",
"url": "https:\/\/data-folks.masto.host\/@Mars_Ikeda"
},
"content": null,
"published": null
}
}
]
}