diff --git a/brainsteam/content/bookmarks/2024/08/Bookmarked https___phrack.org_issues_71_17.html_article. A primer on how hacking mentality ....md b/brainsteam/content/bookmarks/2024/08/Bookmarked https___phrack.org_issues_71_17.html_article. A primer on how hacking mentality ....md new file mode 100644 index 0000000..c8ce1c1 --- /dev/null +++ b/brainsteam/content/bookmarks/2024/08/Bookmarked https___phrack.org_issues_71_17.html_article. A primer on how hacking mentality ....md @@ -0,0 +1,19 @@ +--- +categories: + - Personal +date: "2024-08-27 20:11:00" +draft: true +tags: [indiehacking, entrepeneurship] +title: + Bookmarked https://phrack.org/issues/71/17.html#article. A primer on how hacking + mentality ... +type: bookmarks +--- + + +
Bookmarked https://phrack.org/issues/71/17.html#article.
+ + +

A primer on how hacking mentality can be applied to business

+
+ diff --git a/brainsteam/content/bookmarks/2024/09/04/Bookmarked The Art of Finishing _ ByteDrum by ....md b/brainsteam/content/bookmarks/2024/09/04/Bookmarked The Art of Finishing _ ByteDrum by ....md new file mode 100644 index 0000000..776df48 --- /dev/null +++ b/brainsteam/content/bookmarks/2024/09/04/Bookmarked The Art of Finishing _ ByteDrum by ....md @@ -0,0 +1,16 @@ +--- +categories: + - Engineering Leadership + - Software Development +date: "2024-09-04 12:06:18" +draft: false +tags: [] +title: Bookmarked The Art of Finishing | ByteDrum by ... +type: bookmarks +--- + + +

Bookmarked The Art of Finishing | ByteDrum by Tomas Stropus.

+

An incredibly relatable essay with a lot of sensible advice and suggestions for issues I struggle with. I think I'm getting better at shipping MVPs but the hard bit is not getting distracted by new shiny ideas when you get stuck with something else. This philosophy is in direct opposition to the SOFA principle.

+
+ diff --git a/brainsteam/content/bookmarks/2024/09/04/Bookmarked Where does Postgres fit in a world ....md b/brainsteam/content/bookmarks/2024/09/04/Bookmarked Where does Postgres fit in a world ....md new file mode 100644 index 0000000..c5e7c8a --- /dev/null +++ b/brainsteam/content/bookmarks/2024/09/04/Bookmarked Where does Postgres fit in a world ....md @@ -0,0 +1,28 @@ +--- +categories: + - AI and Machine Learning + - Software Development +date: "2024-09-04 16:01:51" +draft: false +tags: + - AI + - database + - postgresql + - vectors +title: Bookmarked Where does Postgres fit in a world ... +type: bookmarks +--- + + +

Bookmarked Where does Postgres fit in a world of GenAI and vector databases? - Stack Overflow.

+

The title and framing of this talk are weird and it's bugging me

+ + + +

The question could be paraphrased as "why would we need to efficiently store and retrieve data in a deterministic way when we have GenAI?" This is like asking "why do we need cars when we have speedboats?" or "why do we need butter knives now that we've invented the chainsaw?"

+ + + +

The actual subject matter is "PostgreSQL with a couple of plugins can do pretty good nearest neighbour search". I've long been a big fan of Postgres. You probably don't need separate vector database engines; you can just use Postgres for everything.

+
+ diff --git a/brainsteam/content/likes/2023/12/04/Likes https___snarfed.org_2023-12-03_51578 by Ryan Barrett..md b/brainsteam/content/likes/2023/12/04/Likes https___snarfed.org_2023-12-03_51578 by Ryan Barrett..md new file mode 100644 index 0000000..34e409b --- /dev/null +++ b/brainsteam/content/likes/2023/12/04/Likes https___snarfed.org_2023-12-03_51578 by Ryan Barrett..md @@ -0,0 +1,13 @@ +--- +categories: [] +date: '2023-12-04 14:29:00' +draft: false +tags: [] +title: Likes https://snarfed.org/2023-12-03_51578 by Ryan Barrett. +type: likes +--- + + +

Likes https://snarfed.org/2023-12-03_51578 by Ryan Barrett.

+
+ \ No newline at end of file diff --git a/brainsteam/content/likes/2023/12/26/Likes Indiewebifying a WordPress Site _ 2023 Edition ....md b/brainsteam/content/likes/2023/12/26/Likes Indiewebifying a WordPress Site _ 2023 Edition ....md new file mode 100644 index 0000000..4da5c0f --- /dev/null +++ b/brainsteam/content/likes/2023/12/26/Likes Indiewebifying a WordPress Site _ 2023 Edition ....md @@ -0,0 +1,12 @@ +--- +categories: [] +date: '2023-12-26 19:24:00' +draft: false +tags: [] +title: "Likes Indiewebifying a WordPress Site \u2013 2023 Edition ..." +type: likes +--- + + +

Likes Indiewebifying a WordPress Site – 2023 Edition by David Shanske.

+ \ No newline at end of file diff --git a/brainsteam/content/likes/2023/12/29/Likes The Web is Fantastic _ Robb Knight..md b/brainsteam/content/likes/2023/12/29/Likes The Web is Fantastic _ Robb Knight..md new file mode 100644 index 0000000..aa607de --- /dev/null +++ b/brainsteam/content/likes/2023/12/29/Likes The Web is Fantastic _ Robb Knight..md @@ -0,0 +1,12 @@ +--- +categories: [] +date: '2023-12-29 10:05:00' +draft: true +tags: [] +title: "Likes The Web is Fantastic \u2022 Robb Knight." +type: likes +--- + + +

Likes The Web is Fantastic • Robb Knight.

+ \ No newline at end of file diff --git a/brainsteam/content/likes/2024/08/27/Likes Productivity gains in Software Development through AI ....md b/brainsteam/content/likes/2024/08/27/Likes Productivity gains in Software Development through AI ....md new file mode 100644 index 0000000..32a96f2 --- /dev/null +++ b/brainsteam/content/likes/2024/08/27/Likes Productivity gains in Software Development through AI ....md @@ -0,0 +1,13 @@ +--- +categories: [] +date: '2024-08-27 22:31:00' +draft: false +tags: [] +title: Likes Productivity gains in Software Development through AI ... +type: likes +--- + + +

Likes Productivity gains in Software Development through AI by tante.

+
+ \ No newline at end of file diff --git a/brainsteam/content/notes/2023/12/04/It was lovely to visit my parents over ....md b/brainsteam/content/notes/2023/12/04/It was lovely to visit my parents over ....md new file mode 100644 index 0000000..f9f85f4 --- /dev/null +++ b/brainsteam/content/notes/2023/12/04/It was lovely to visit my parents over ....md @@ -0,0 +1,15 @@ +--- +categories: + - Personal +date: "2023-12-04 12:39:00" +draft: false +photo: + - url: /media/1701693562222_d752939e.jpg + alt: a british shorthair cat in a bin +tags: + - caturday +title: It was lovely to visit my parents over ... +type: notes +--- + +
It was lovely to visit my parents over the weekend and say hi to my dad's naughty cat Bertie who decided to climb into the kitchen bin for a photo op
diff --git a/brainsteam/content/notes/2023/12/20/Feeling handy after fixing an airlock in our ....md b/brainsteam/content/notes/2023/12/20/Feeling handy after fixing an airlock in our ....md new file mode 100644 index 0000000..a6dc0e3 --- /dev/null +++ b/brainsteam/content/notes/2023/12/20/Feeling handy after fixing an airlock in our ....md @@ -0,0 +1,12 @@ +--- +categories: +- Personal +date: '2023-12-20 17:55:00' +draft: false +tags: [] +title: Feeling handy after fixing an airlock in our ... +type: notes +--- + +

Feeling handy after fixing an airlock in our heating system by painstakingly standing with a bucket and draining water out of a radiator until it started hissing and cursing like a scene out of The Exorcist. Phew, no need to call out a plumber after all! #SmallWin

+ \ No newline at end of file diff --git a/brainsteam/content/notes/2023/12/20/I spent some time updating my digital garden ....md b/brainsteam/content/notes/2023/12/20/I spent some time updating my digital garden ....md new file mode 100644 index 0000000..dd6dcac --- /dev/null +++ b/brainsteam/content/notes/2023/12/20/I spent some time updating my digital garden ....md @@ -0,0 +1,18 @@ +--- +categories: +- Personal +date: '2023-12-20 11:46:37' +draft: false +tags: +- music +title: I spent some time updating my digital garden ... +type: notes +--- + + +

+ + + +

I spent some time updating my digital garden page on music with some bands that I recommend. My taste is somewhat eclectic so you have been warned!

+ \ No newline at end of file diff --git a/brainsteam/content/notes/2023/12/21/Took a trip out to Koh Thai at ....md b/brainsteam/content/notes/2023/12/21/Took a trip out to Koh Thai at ....md new file mode 100644 index 0000000..53c34de --- /dev/null +++ b/brainsteam/content/notes/2023/12/21/Took a trip out to Koh Thai at ....md @@ -0,0 +1,35 @@ +--- +categories: + - Personal +date: "2023-12-21 15:26:00" +tags: + - trips +title: Took a trip out to Koh Thai at ... +type: notes +--- + + +

Took a trip out to Koh Thai at Port Solent for lunch for Mrs R's birthday

+ + + +

The restaurant was lovely and the food was top notch. I especially enjoyed the spice ratings on the menu, from "a little tingle" up to "life changing"

+ + + + + diff --git a/brainsteam/content/notes/2023/12/22/I was pretty horrified to be greeted by ....md b/brainsteam/content/notes/2023/12/22/I was pretty horrified to be greeted by ....md new file mode 100644 index 0000000..f5528f1 --- /dev/null +++ b/brainsteam/content/notes/2023/12/22/I was pretty horrified to be greeted by ....md @@ -0,0 +1,17 @@ +--- +categories: +- Software Development +date: '2023-12-22 13:50:34' +draft: false +tags: [] +title: I was pretty horrified to be greeted by ... +type: notes +--- + + +
+ + + +

I was pretty horrified to be greeted by this message whilst running #ubuntu today. I've always enjoyed Ubuntu as a distro that "just works" but I'm starting to get pretty frustrated with Canonical #enshittifying the experience over time. Thinking about jumping back to Mint or possibly #fedora - I think #arch is too high maintenance.

+ \ No newline at end of file diff --git a/brainsteam/content/notes/2023/12/24/Drinking Irish coffee with Jonnie Walker whisky and ....md b/brainsteam/content/notes/2023/12/24/Drinking Irish coffee with Jonnie Walker whisky and ....md new file mode 100644 index 0000000..2f82c2c --- /dev/null +++ b/brainsteam/content/notes/2023/12/24/Drinking Irish coffee with Jonnie Walker whisky and ....md @@ -0,0 +1,15 @@ +--- +categories: [] +date: "2023-12-24 15:13:00" +draft: false +photo: + - url: /media/1703430793977_a7199d9c.jpg + alt: "An Irish coffee in a festive mug in front of a gaggia coffee machine" +tags: + - coffee + - drinks +title: Drinking Irish coffee with Jonnie Walker whisky and ... +type: notes +--- + +

Drinking Irish coffee with Johnnie Walker whisky and HICS coffee beans

diff --git a/brainsteam/content/notes/2023/12/26/Drinking HICS El Salvador blend coffee Smooth medium ....md b/brainsteam/content/notes/2023/12/26/Drinking HICS El Salvador blend coffee Smooth medium ....md new file mode 100644 index 0000000..eb71063 --- /dev/null +++ b/brainsteam/content/notes/2023/12/26/Drinking HICS El Salvador blend coffee Smooth medium ....md @@ -0,0 +1,18 @@ +--- +categories: [] +date: "2023-12-26 08:45:00" +draft: false +tags: + - coffee + - drinks +title: Drinking HICS El Salvador blend coffee Smooth medium ... +photo: + - url: /media/1703580025838_8188525b.jpg + alt: a bag of coffee in a person's hand +type: notes +--- + +

Drinking HICS El Salvador blend coffee

+ +

Smooth medium roast

+ diff --git a/brainsteam/content/notes/2023/12/31/New year_ new desk.md b/brainsteam/content/notes/2023/12/31/New year_ new desk.md new file mode 100644 index 0000000..e9453de --- /dev/null +++ b/brainsteam/content/notes/2023/12/31/New year_ new desk.md @@ -0,0 +1,14 @@ +--- +categories: + - Personal +date: "2023-12-31 12:55:00" +draft: false +photo: + - url: /media/1704027302456_a403ee18.jpg + alt: A standing desk with a couple of monitors and a light on it in an office room with blue walls +tags: [] +title: New year, new desk +type: notes +--- + +New year, new desk diff --git a/brainsteam/content/notes/2024/01/01/Today we saw Anyone But You_ a romcom ....md b/brainsteam/content/notes/2024/01/01/Today we saw Anyone But You_ a romcom ....md new file mode 100644 index 0000000..583c081 --- /dev/null +++ b/brainsteam/content/notes/2024/01/01/Today we saw Anyone But You_ a romcom ....md @@ -0,0 +1,24 @@ +--- +categories: + - Personal +date: "2024-01-01 21:15:00" +draft: false +tags: + - movies +title: Today we saw Anyone But You, a romcom ... +type: notes +--- + +{{< youtube gbjdSlTHFts >}} + + +

Today we saw Anyone But You, a romcom starring Glen Powell and Sydney Sweeney which was a retelling of Much Ado About Nothing. The character names were a giveaway: Ben as in Benedick and Bea as in Beatrice. Overall it was a fun, silly film set in very scenic parts of Australia. The Aussie setting also meant that there was quite a lot of strong language used. It didn't take itself too seriously, as the gag reel at the end showed.

+ + + +

I enjoyed it but it wasn't a masterpiece.

+ + + +

3.5/5

+ diff --git a/brainsteam/content/notes/2024/01/11/Drinking Volcano Island blend coffee from hics.md b/brainsteam/content/notes/2024/01/11/Drinking Volcano Island blend coffee from hics.md new file mode 100644 index 0000000..66bd52e --- /dev/null +++ b/brainsteam/content/notes/2024/01/11/Drinking Volcano Island blend coffee from hics.md @@ -0,0 +1,14 @@ +--- +categories: [] +date: "2024-01-11 07:03:00" +draft: false +tags: + - coffee +title: Drinking Volcano Island blend coffee from hics +type: notes +photo: + - url: /media/1704956520871_b891e6bc.jpg + alt: "a bag of coffee on a white counter top" +--- + +Drinking Volcano Island blend coffee from [hics](https://hics.biz) diff --git a/brainsteam/content/notes/2024/03/16/Edinburgh castle with a touch of lens flare ....md b/brainsteam/content/notes/2024/03/16/Edinburgh castle with a touch of lens flare ....md new file mode 100644 index 0000000..0fec5ac --- /dev/null +++ b/brainsteam/content/notes/2024/03/16/Edinburgh castle with a touch of lens flare ....md @@ -0,0 +1,13 @@ +--- +categories: [] +date: "2024-03-16 08:57:31" +draft: false +tags: [] +title: Edinburgh castle with a touch of lens flare ... +type: notes +photo: + - url: /media/g5GRr39PRCR27dQd7R3OkgSCJqF2PRedQ0EWZXDU_98514577.jpg + alt: Edinburgh castle with a touch of lens flare from the street lighting +--- + +Edinburgh castle with a touch of lens flare #edinburgh #castle #streetlights diff --git a/brainsteam/content/notes/2024/05/04/home for the weekend_ Passford House Hotel _newforest ....md b/brainsteam/content/notes/2024/05/04/home for the weekend_ Passford House Hotel _newforest ....md new file mode 100644 index 0000000..73209c6 --- /dev/null +++ b/brainsteam/content/notes/2024/05/04/home for the weekend_ Passford House Hotel _newforest ....md @@ -0,0 +1,13 @@ +--- +categories: [] +date: "2024-05-04 10:30:20" +draft: false +tags: [] +title: "home for the weekend, Passford House Hotel #newforest ..." +photo: + - url: /media/NTYJmS9XZBj16uCVrtdwGL38JE8Y6pOKs8NEmcFS_2129e733.jpg + alt: an old looking English country house with purple wisteria growing around it and the sun shining +type: notes +--- + +home for the weekend, Passford House Hotel #newforest #sunshine #countryhouse diff --git a/brainsteam/content/notes/2024/08/10/It was lovely to head up to London ....md b/brainsteam/content/notes/2024/08/10/It was lovely to head up to London ....md new file mode 100644 index 0000000..c83ed60 --- /dev/null +++ b/brainsteam/content/notes/2024/08/10/It was lovely to head up to London ....md @@ -0,0 +1,21 @@ +--- +categories: + - Personal +date: "2024-08-10 20:40:48" +draft: false +photo: + - url: /media/img-20240810-wa00012515327513561028989-768x1024_50a1ef11.jpg + alt: James and Daniel selfie looking at camera sat in the national library +tags: + - personal +title: It was lovely to head up to London ... +type: notes +--- + + +

It was lovely to head up to London today to meet up with my friend and fellow NLP nerd Daniel. We spent some time discussing some ideas we had for side projects and talked in depth about how web development has become too complex, and about our shared desire to build new software with simple stacks: HTML templates and limited frontend code.

+ + + +

We spent quite a lot of time hanging out in the National Library where there's plenty of space to study and conspire over ideas and schemes. Daniel is something of a digital nomad so it's always nice to get some in-person time when possible. Unfortunately none of our contacts at the Alan Turing Institute were available to let us in today (I mean who can blame them, it's the weekend after all).

+ diff --git a/brainsteam/content/notes/2024/08/18/Had the pleasure of visiting Bletchley Park yesterday ....md b/brainsteam/content/notes/2024/08/18/Had the pleasure of visiting Bletchley Park yesterday ....md new file mode 100644 index 0000000..fc24529 --- /dev/null +++ b/brainsteam/content/notes/2024/08/18/Had the pleasure of visiting Bletchley Park yesterday ....md @@ -0,0 +1,13 @@ +--- +categories: [] +date: "2024-08-18 08:08:12" +draft: false +tags: [] +title: Had the pleasure of visiting Bletchley Park yesterday ... +type: notes +photo: + - url: /media/WxKAbJ8jLRAwF3iE1LmP3ISzYt5mIEyXgI3wiB2M_cc70c64a.jpg + alt: A statue of Alan Turing fashioned from slate set against a black and white photo of an office in Bletchley park +--- + +

Had the pleasure of visiting Bletchley Park yesterday to see where Alan Turing and many others did amazing work on early computers and cryptanalysis. As a computer scientist it has been on my bucket list for a really long time so I'm glad to have finally gotten around to it. The bombe machines were fascinating: mechanical machines (not computers, as they were not general purpose/programmable) that brute-forced Enigma cyphers by trying different combinations of circuits. They could try 37k combinations in 12 minutes. #BletchleyPark #AlanTuring #computers

diff --git a/brainsteam/content/notes/2024/08/27/Breakfast with a view during our stay in ....md b/brainsteam/content/notes/2024/08/27/Breakfast with a view during our stay in ....md new file mode 100644 index 0000000..595229d --- /dev/null +++ b/brainsteam/content/notes/2024/08/27/Breakfast with a view during our stay in ....md @@ -0,0 +1,14 @@ +--- +categories: + - Personal +date: "2024-08-27 20:09:00" +draft: false +tags: [] +title: Breakfast with a view during our stay in ... +type: notes +photo: + - url: "/media/1724785714404_26e2b97f.jpg" + alt: "a bowl of cereal and coffee in the foreground and a view out over a park and sunrise" +--- + +Breakfast with a view during our stay in Milton Keynes after visiting Bletchley Park diff --git a/brainsteam/content/notes/2024/08/27/Productivity gains in Software Development through AI.md b/brainsteam/content/notes/2024/08/27/Productivity gains in Software Development through AI.md new file mode 100644 index 0000000..76860a9 --- /dev/null +++ b/brainsteam/content/notes/2024/08/27/Productivity gains in Software Development through AI.md @@ -0,0 +1,16 @@ +--- +categories: [] +date: '2024-08-27 22:30:00' +draft: false +tags: +- AI +- hype +title: Productivity gains in Software Development through AI +type: notes +--- + + +

Bookmarked Productivity gains in Software Development through AI by tante.

+

A fairly good rebuttal of the 4500 hours saved piece by Amazon - I had some similar thoughts.

+
+ \ No newline at end of file diff --git a/brainsteam/content/notes/2024/08/30/The sunset was beautiful today as we drove ....md b/brainsteam/content/notes/2024/08/30/The sunset was beautiful today as we drove ....md new file mode 100644 index 0000000..c3b48b2 --- /dev/null +++ b/brainsteam/content/notes/2024/08/30/The sunset was beautiful today as we drove ....md @@ -0,0 +1,14 @@ +--- +categories: + - Personal +date: "2024-08-30 23:27:00" +draft: false +tags: [personal] +photo: + - url: /media/1725056765290_632449d8.jpg + alt: A picture of a beautiful golden sunset over the motorway as photographed from inside a car. +title: The sunset was beautiful today as we drove ... +type: notes +--- + +

The sunset was beautiful today as we drove up north to visit my family for the weekend.

diff --git a/brainsteam/content/notes/2024/09/01/Long cat is long.md b/brainsteam/content/notes/2024/09/01/Long cat is long.md new file mode 100644 index 0000000..14f19e5 --- /dev/null +++ b/brainsteam/content/notes/2024/09/01/Long cat is long.md @@ -0,0 +1,14 @@ +--- +categories: +- Personal +date: '2024-09-01 20:30:00' +draft: true +photo: +- url: /media/1725219019267_009ab453.jpg +tags: [] +title: Long cat is long +type: notes +--- + +

Long cat is long

+ \ No newline at end of file diff --git a/brainsteam/content/notes/2024/09/01/Test Hidden_ Snowdon Drive_ Catisfield_ HAM 19 _C ....md b/brainsteam/content/notes/2024/09/01/Test Hidden_ Snowdon Drive_ Catisfield_ HAM 19 _C ....md new file mode 100644 index 0000000..167673a --- /dev/null +++ b/brainsteam/content/notes/2024/09/01/Test Hidden_ Snowdon Drive_ Catisfield_ HAM 19 _C ....md @@ -0,0 +1,13 @@ +--- +categories: [] +date: '2024-09-01 20:45:00' +draft: true +photo: +- url: /media/1725219912483_55765ba2.jpg +tags: [] +title: "Test Hidden: Snowdon Drive, Catisfield, HAM 19 \xB0C ..." +type: notes +--- + +

Test

+ \ No newline at end of file diff --git a/brainsteam/content/notes/2024/09/08/Moody sky this morning on a walk along ....md b/brainsteam/content/notes/2024/09/08/Moody sky this morning on a walk along ....md new file mode 100644 index 0000000..5d2d75e --- /dev/null +++ b/brainsteam/content/notes/2024/09/08/Moody sky this morning on a walk along ....md @@ -0,0 +1,14 @@ +--- +categories: +- Personal +date: '2024-09-08 12:52:00' +draft: false +photo: +- url: /media/1725796327250_c8c555b4.jpg +tags: [] +title: Moody sky this morning on a walk along ... +type: notes +--- + +

Moody sky this morning on a walk along the seafront in Lee on the Solent.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/08/08/We moved offices_.md b/brainsteam/content/posts/2023/08/08/We moved offices_.md new file mode 100644 index 0000000..025373a --- /dev/null +++ b/brainsteam/content/posts/2023/08/08/We moved offices_.md @@ -0,0 +1,72 @@ +--- +categories: +- Personal +date: '2023-08-08 11:33:16' +draft: false +tags: [] +title: We moved offices! +type: posts +--- + +

+ I've been meaning to write about this for a while - we actually moved offices at the end of May! +
+

+

+ My company Filament has, since 2017, had two UK offices. One up in London and one on the South Coast in the Solent area. I live on the south coast and I spend most of my time at the latter office. I've always been a huge advocate of hybrid or even full remote working - since long before COVID made remote working more "socially acceptable." All of our staff have always had the option to work from home when they want or come in and spend time on site if they prefer. We have a small number of fully remote staff who live on the other side of the country from our office space. People who live near one of our hub offices tend to balance their time between home and the office according to what works for them.
+

+

+ Our Pre-COVID Setup +
+

+
+ an large and airy office space with a few desks, some potted plants in the foreground +
Our old USSP office was large and airy and had breakout areas and a kitchenette.
+
+

Most of our tech team are located on the south coast and I was given the mandate to +make our Solent office as much of a "developers' paradise" as possible - a place where people would want to come and spend time solving problems together and socialising.

+

From 2017 to May we were based at the University of Southampton's Science Park (USSP), nestled away near the small village of Chilworth just north of Southampton and surrounded by woodland.

+

+ The science park itself is a pretty nice spot: you can do loads of walks and there's a really lovely on-site cafe. If you want a bit of variation, there's a pub within walking distance with a large beer garden and the team and I spent many extended lunch breaks sat out there on a sunny afternoon.
+

+

Pre-COVID we had a really nice big office space on the science park with a pool table and a little kitchen area. People liked being there and most of the time, locally-based folks would come in a few days a week and we'd hang out, play some games of pool, get lunch together, that kind of stuff.

+

+ During and Post-COVID +
+

+
+ a cluttered office space full of desks and too many chairs +
Our post-COVID USSP office was small and full of furniture from our rush to downsize
+
+

Unfortunately, we ended up downsizing during lockdown and that was sub-optimal for a few reasons. There wasn't anything wrong with the space per se, but it was small and uninspiring and when it got busy, it was really hard to concentrate. It probably didn't help that it was half full of flat-packed furniture from our older, larger space, making it feel cluttered and scruffy. Also, post-COVID working patterns have changed a lot.

+

Now people want to spend more time at home but tend to congregate in the office when there's a big in-person meeting, a birthday meal or a well-liked full-remoter is in town for a few days. This leads to a bit of a weird pattern where the office is at maybe 50-60% capacity most of the time and then suddenly experiences big bursts where everyone wants to be in at the same time. In the "small" USSP office these bursts probably meant you weren't going to get any work done that day and there wasn't really anywhere to go and take calls unless you paced in the corridor, booked a meeting room, or, as I started doing towards the end of our stay there, lurked in the mail room.

+

When we raised some money in April, we decided to start looking for a new space. We figured that we probably wanted some kind of new "hybrid" deal to facilitate these new ways of working: ideally a smallish office with access to a communal breakout space/lounge area, plus some small phone pods where we could take calls. We spoke to the USSP and unfortunately they were not able to offer a deal like this, although they did tell us that they're working on revamping the building that we were in to provide some of these amenities. So we widened our search.

+

+ Our new Space and Whiteley Village +
+

+

We looked at a few local offices before settling on Spaces Whiteley. It's about 20 minutes down the motorway from the old office location and it's a modern co-working setup managed by IWG (the guys who also own Regus). 

+
+ James standing with his back to the camera in a light breezy office space +
+
+ A sofa and chairs in a large open breakout area +
+
+ A breakout space with a coffee bar and a pool table +
+

There's a pool table (ah we missed you pool), loads of communal break-out space with comfy seating and a couple of sound-proofed phone booths for us to take calls from when the office gets busy.

+

Like USSP it has loads of really nice scenery and walks nearby but it definitely wins in terms of amenities. It's a 15 minute walk from an outdoor mall with a bunch of different restaurants and coffee shops.

+
+
+ A square with some cafes and restaurants +
+
+ A lake with some ducks and swans in the foreground +
+

+ I think the team have generally been enjoying the new space more too. People tend to want to come in more frequently and make use of it a few days a week. I've also been enjoying going in more regularly. The new office location means that I no longer need to drive on the motorway to get to work, so I'm considering starting to cycle in a couple of times a week instead.
+

\ No newline at end of file diff --git a/brainsteam/content/posts/2023/08/14/Prod-Ready Airbyte Sync.md b/brainsteam/content/posts/2023/08/14/Prod-Ready Airbyte Sync.md new file mode 100644 index 0000000..d138b29 --- /dev/null +++ b/brainsteam/content/posts/2023/08/14/Prod-Ready Airbyte Sync.md @@ -0,0 +1,309 @@ +--- +categories: +- AI and Machine Learning +- Data Science +date: '2023-08-14 11:57:25' +draft: false +tags: [] +title: Prod-Ready Airbyte Sync +type: posts +--- + + +

Airbyte is a tool that allows you to periodically extract data from one database and then load and transform it into another. It provides a performant way to clone data between databases and gives us the flexibility to dictate what gets shared at field level (for example we can copy the users table but we can omit name, address, phone number etc). There are a bunch of use cases where this kind of thing might be useful. For example, say you have a data science team who want to generate reports on how many sales your e-shop made this month and train predictive models for next month’s revenue. You wouldn’t want to give your data team direct access to your e-shop database because:

+ + + +
    +
  1. there might be sensitive personal information (SPI) in there (think names, addresses, bank details, links to what individual users ordered)
  2. running this kind of analysis might impact the performance of your shop database and customers might start churning.
+ + + +

Instead, we can use a tool such as Airbyte to regularly make copies of a subset of the production database, minus the SPI, and load it into an external analytics database. The data team can then use this external database to make complex queries all day long without affecting application performance. This pattern is called Extract Load Transform or ELT.

+ + + +

In this post I’ll summarise some strategies for using Airbyte in a production environment and share some tips for navigating some of its “rough edges”, based on my own experience of setting up this tool as a pilot project for some of my clients in my day job.

+ + + +

General Requirements

+ + + + + + + +

Database Technologies

+ + + +

Airbyte has a large number of mature data “Source” plugins for a bunch of standard SQL databases like MySQL/MariaDB, PostgreSQL, MSSQL and so on. Its Source plugins for data warehouse systems like BigQuery are still in alpha. The reverse appears to be true for “Destination” plugins. Airbyte provides mature destination implementations for data warehouse products like Google BigQuery and Snowflake but immature alpha and beta support for syncing to a traditional RDBMS.

+ + + +

The full list of supported source and destination connectors is available here. We are able to use the MySQL source to pull data from the core application and store our data in Google BigQuery.

+ + + +

Picking an Edition of Airbyte

+ + + +

Airbyte comes in two “flavours”:

- Airbyte Cloud, the fully hosted SaaS offering
- Airbyte Open Source, which you deploy and run yourself

I have a risk-averse client base who have strong data protection requirements, so we opted for self-hosted instances that sit in the same region and virtual network in our Google Cloud instance as our application. The target database is Google BigQuery in the same data centre region and the involved clients were OK with that.

+ + + +

Picking a Virtual Machine Spec

+ + + +

If you opt for the self-hosted version like we did, you’ll need to pick a VM that has enough resources to run Airbyte. We went for Google’s n2-standard-4 machine spec, which has 4 CPU cores and 16GB of RAM. This was actually our second choice after starting with an e2-standard-2, whose 8GB of RAM was not enough to run Airbyte optimally and caused thrashing/spiking issues.

+ + + +

Although all the data does pass through the VM, it’s done in buffered chunks so your VM doesn’t need a lot of storage space - 50GiB was sufficient for our setup.

+ + + +
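For reference, provisioning an equivalent VM with the gcloud CLI looks roughly like the sketch below (the instance name, zone and image are illustrative, not the exact values we used):

# Illustrative only - adjust the name, zone and image to taste
gcloud compute instances create airbyte-vm \
  --zone=europe-west2-a \
  --machine-type=n2-standard-4 \
  --boot-disk-size=50GB \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud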

Setting up Airbyte

+ + + +

We deployed Airbyte using docker-compose by following their quick start guide. We locked down access to the machine over HTTPS so that it only accepts requests from inside our corporate VPN.

+ + + +
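For anyone reproducing this, the quick start boiled down to roughly the following at the time (the exact entrypoint has changed between Airbyte releases, so treat this as a sketch rather than gospel):

# Fetch the Airbyte repository, which ships the docker-compose definition
git clone --depth=1 https://github.com/airbytehq/airbyte.git
cd airbyte

# Newer releases use a wrapper script, older ones run docker-compose directly
./run-ab-platform.sh
# docker-compose up -d

# The web UI listens on http://localhost:8000 by default - we then put it
# behind HTTPS and only allowed connections from inside our corporate VPN.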

We were able to create a database connection to the MySQL database via its internal Google Cloud IP address, which meant that no production data is routed outside of the virtual network during the first leg of the extraction (everything is encrypted with TLS anyway).

+ + + +

CDC versus Full Refresh

+ + + +

When you configure a MySQL source (or indeed a bunch of other compatible sources), you can turn on either full refresh or incremental sync via change data capture. The latter makes use of logs from the SQL server to play back all of the SQL queries that have been run since the last sync and reduce the amount of data that is transferred. If you have a large database (of the order of 10s to 100s of GiB), this may be worthwhile as it is likely to accelerate the sync process significantly after the first run.

+ + + +

However, the current implementation of CDC/incremental sync for MySQL appears to get stuck and continue to run if you have ongoing changes being made to the system. For example, if you have a high availability application that is always in use or if you have automated processes making changes to the data around the clock, the sync driver will continue to pick up these new changes and append them on an ongoing basis unless you’re able to turn off the workers or create an isolated copy of the source database (as described below).

+ + + +

We opted to stick with basic “full sync” as we’re only copying around 10GiB of data on a daily basis and our Airbyte setup is able to do this in about 30-45 mins under normal circumstances.

+ + + +

When to Sync

+ + + +

Ideally you want to run the sync when the system is not in use or when it is least likely to impact the ongoing operation of your primary business application. Since most of our clients operate between 9am-6pm GMT we have a nice big “out of hours” window during which we can run this sync.

+ + + +

If you don’t have that luxury because you have a high-availability application, Google Cloud SQL has the ability to take disk snapshots that don’t appear to significantly impact the database performance. We did sketch out a potential workflow (roughly illustrated in the sketch after the list below) that would involve using the Google Cloud SQL Admin API to:

+ + + +
    +
  1. Create a new disk dump of the client db
  2. “Restore” the disk dump to a secondary database replica
  3. Run the Airbyte sync, targeting the secondary replica as the sync source
  4. Turn off the replica db
  5. Rinse & repeat
+ + + +
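A minimal sketch of that workflow using the gcloud CLI and Airbyte's REST API is shown below. The instance, bucket and database names and the Airbyte host are made up for illustration, and in practice you would wrap these steps in an orchestrator such as Airflow, as mentioned below:

# 1. Export a dump of the production database to a Cloud Storage bucket
gcloud sql export sql prod-mysql gs://example-sync-bucket/nightly.sql.gz --database=appdb

# 2. "Restore" the dump into a pre-provisioned secondary instance
gcloud sql import sql analytics-replica gs://example-sync-bucket/nightly.sql.gz --database=appdb --quiet

# 3. Trigger the Airbyte sync against the secondary instance via Airbyte's API
curl -X POST https://airbyte.example.internal/api/v1/connections/sync \
  -H "Content-Type: application/json" \
  -d '{"connectionId": "<your-connection-uuid>"}'

# 4. Shut the secondary instance down again until the next run
gcloud sql instances patch analytics-replica --activation-policy=NEVER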

However, we were able to get full sync (without incremental CDC) working on a nightly basis without adding these extra complications. This would have required us to use an external orchestrator like Airflow, which has Airbyte operators, to execute our process. Thankfully, we were able to use the built-in cron scheduler to have Airbyte run the sync nightly before our application maintenance window.

+ + + +
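For reference, Airbyte's cron schedule type takes a Quartz-style expression; something like the following (illustrative) would kick the sync off at 02:00 every night:

# seconds minutes hours day-of-month month day-of-week
0 0 2 * * ?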

Securing the Process

+ + + +

As outlined, it can be important to make sure that certain fields are not shared via Airbyte. Airbyte does provide a UI for toggling fields and indeed whole tables on and off as part of sync, but for schemas with large numbers of tables and columns this is time-consuming and unwieldy and is likely to lead to human error.

+ + + +
The UI in Airbyte allows us to turn syncing on or off at field-level granularity for a given table but gets unwieldy at scale
+ + + +

Octavia CLI

+ + + +

Airbyte’s answer to this problem is a command line tool named Octavia. Octavia communicates with Airbyte via a REST API and generates YAML configuration files that contain all tables, fields and configurations.

+ + + +
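The day-to-day loop with Octavia is roughly as follows (command names are as per the octavia-cli docs at the time and may differ between versions):

# One-off: scaffold an octavia project in the current directory
octavia init

# Pull the existing sources, destinations and connections from the Airbyte
# server into version-controllable YAML files
octavia import all

# ...edit the YAML (e.g. de-select sensitive columns), commit, code review...

# Push the local YAML configuration back to the Airbyte server
octavia apply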

This is a game changer as it means we can script the fields that get synchronized and we can version control and quality control both the script and the config file itself. We can also gate changes to the sync config via CI tooling and “lint” the config for potential errors (e.g. by providing a script that compares the config against a blacklist of columns that are never allowed).

+ + + +

+ + + +

Below is an excerpt from one such script that might give readers some inspiration for implementing a similar process themselves:

+ + + +
import yaml
+import click
+
+# Tables that must never be synced - these names are illustrative, swap in your own blocklist
+BLOCKED = {"users", "payment_details"}
+
+@click.group()
+def cli():
+    """Linting commands for Octavia-generated connection configs."""
+
+@cli.command()
+@click.option("--connection-file", "-c", type=click.Path(exists=True, file_okay=True, dir_okay=False, readable=True, writable=True), required=True)
+@click.option("--fix/--no-fix", type=bool, default=False)
+def lint_connection(connection_file, fix):
+    
+    with open(connection_file,'r') as f:
+        config = yaml.safe_load(f)
+
+    print(f"Reading config file for {config['resource_name']}")
+
+    print(f"Checking Connection Resource: {config['resource_name']}")
+
+    streams = config['configuration']['sync_catalog']['streams']
+
+    print(f"Found {len(streams)} configured streams (tables) from source")
+
+    errors = []
+
+    for i, stream in enumerate(streams):
+
+        name = stream['stream']['name']
+
+        if ("admin" in name) or (name in BLOCKED):
+            print(f"Checking {name} is not enabled for sync...")
+
+            if stream['config']['selected']:
+                err_string = f"ERROR: {name} must not be synced but is being synced"
+                errors.append(err_string)
+
+                if fix:
+                    streams[i]['config']['selected'] = False
+
+    print(f"Found {len(errors)} problems")
+
+    for error in errors:
+        print(f"\t - {error}")
+    
+    if len(errors) > 0:
+        
+        if fix:
+            config['configuration']['sync_catalog']['streams'] = streams
+
+            print("Fix=True, writing updated config")
+            with open(connection_file, 'w') as f:
+                yaml.safe_dump(config, f)
+        else:
+            print("Fix=False, please manually fix any problems or rerun with --fix to resolve automatically")
+    else:
+        print("No problems found! 😊")
+        print("No problems found! 😊")
+
+
+if __name__ == "__main__":
+    cli()
+ + + +

The idea here is that we run the script as part of CI and have the pipeline fail to deploy if it catches any errors. There is also a --fix mode, which a maintainer can run locally, that attempts to automatically rectify any problems.

+ + + +
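In CI this boils down to a couple of shell steps. The script filename and connection path below are illustrative, and the dashed command name assumes a recent version of click (which exposes lint_connection as lint-connection):

# Fail the pipeline if a blocked table or column is still selected for sync
python lint_connections.py lint-connection \
  --connection-file connections/mysql_to_bigquery/configuration.yaml

# Locally, a maintainer can auto-fix the config and commit the result
python lint_connections.py lint-connection \
  --connection-file connections/mysql_to_bigquery/configuration.yaml --fix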

Octavia “Bug”

+ + + +

When I started using Octavia, I noticed that it would bug out and print error messages when I started to change which fields were selected for sync. I found a bug ticket about this issue and then eventually realised that the Airbyte documentation for Octavia is quite out of date and by default it installs an old version of Octavia that is not compatible with the current version of the Airbyte server itself. In order to make it work, I simply changed my .zshrc file (or .bashrc file for some) to use the latest version of the tool - which at time of writing is 0.50.7:

+ + + +
OCTAVIA_ENV_FILE=/home/james/.octavia
+export OCTAVIA_ENABLE_TELEMETRY=True
+alias octavia="docker run -i --rm -v \$(pwd):/home/octavia-project --network host --env-file \${OCTAVIA_ENV_FILE} --user \$(id -u):\$(id -g) airbyte/octavia-cli:0.50.7"
+
+ + + +

Octavia and SSL

+ + + +

Unfortunately I couldn’t find an easy way to get Octavia to play nicely with self-signed SSL certificates, which meant we had to load in an externally “signed” SSL cert. Octavia is written in Python and uses requests to interact with Airbyte, so you could theoretically configure it to trust a self-signed certificate authority (as per this stackoverflow post).

+ + + +
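If you did want to go down that route, the standard requests mechanism is to point it at your CA bundle via the REQUESTS_CA_BUNDLE environment variable; because octavia runs inside a container in the alias above, the bundle would also need to be mounted in. A rough, untested sketch with illustrative paths:

# Assumes your internal CA certificate lives at ~/.octavia-certs/ca.pem
alias octavia="docker run -i --rm \
  -v \$(pwd):/home/octavia-project \
  -v \$HOME/.octavia-certs:/certs:ro \
  -e REQUESTS_CA_BUNDLE=/certs/ca.pem \
  --network host --env-file \${OCTAVIA_ENV_FILE} \
  --user \$(id -u):\$(id -g) airbyte/octavia-cli:0.50.7"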

Keeping an Eye on Things

+ + + +

Once you have your sync up and running you probably want to make sure it keeps running regularly. Airbyte has Slack webhook integration, which means that it’s easy enough to have it automatically notify you when a sync has passed or failed.

+ + + +
+ + + +

Conclusion

+ + + +

There are a lot of variables and moving parts to consider when using Airbyte, especially when data security and privacy are so important. In this post I outlined some hints and tips for using Airbyte successfully based on my own experience of setting up the tool. Hopefully some of my observations were useful or interesting to any readers who are thinking about picking up Airbyte.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/08/20/Weeknote 13-19 August 2023.md b/brainsteam/content/posts/2023/08/20/Weeknote 13-19 August 2023.md new file mode 100644 index 0000000..09554bf --- /dev/null +++ b/brainsteam/content/posts/2023/08/20/Weeknote 13-19 August 2023.md @@ -0,0 +1,35 @@ +--- +categories: +- Personal +date: '2023-08-20 21:24:31' +draft: false +tags: +- weeknotes +title: Weeknote 13-19 August 2023 +type: posts +--- + +
+ James looking pretty happy sat with a tray of cake in front of him
High Tea taking on a double meaning at the top of the Spinnaker Tower
+Stuff that happened this week: + +This week we have a few office days planned and a few house chores to do. I'll be up in London attending the Airflow Meetup on Thursday evening. \ No newline at end of file diff --git a/brainsteam/content/posts/2023/09/30/TIL_ Unlocking Ubuntu Remotely.md b/brainsteam/content/posts/2023/09/30/TIL_ Unlocking Ubuntu Remotely.md new file mode 100644 index 0000000..dcec143 --- /dev/null +++ b/brainsteam/content/posts/2023/09/30/TIL_ Unlocking Ubuntu Remotely.md @@ -0,0 +1,52 @@ +--- +categories: +- Software Development +date: '2023-09-30 09:34:37' +draft: false +tags: +- til +title: 'TIL: Unlocking Ubuntu Remotely' +type: posts +--- + + +

My use case is starting a game stream from my Steam Deck after my PC has locked itself. The game will start but the desktop lock screen/screensaver is visible and you can't play or do anything.

+ + + +

You need to SSH in as the active user (from another device like your phone or another laptop) and run:

+ + + +
loginctl list-sessions
+
+ + + +

You will get a table output with a list of sessions:

+ + + +
SESSION  UID USER  SEAT  TTY  
+     38 1000 james       pts/1
+     44 1000 james       pts/2
+      5 1000 james seat0 tty2
+
+ + + +

You want to take the one that has a seat allocated and run unlock against it:

+ + + +
loginctl unlock-session 5
+
+ + + +

Hopefully this will unlock the screen and your steamdeck screen will show the application you're trying to use.

+ + + +
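If you find yourself doing this a lot, you could skip the manual lookup with a one-liner like the following (assuming the local graphical session is the one with a seat assigned):

# Unlock whichever session currently has a seat (i.e. the local graphical login)
loginctl unlock-session "$(loginctl list-sessions --no-legend | awk '$4 ~ /^seat/ {print $1; exit}')"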

https://askubuntu.com/questions/341014/unlock-login-screen-using-command-line

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/09/30/Turbopilot - a Retrospective.md b/brainsteam/content/posts/2023/09/30/Turbopilot - a Retrospective.md new file mode 100644 index 0000000..c00d973 --- /dev/null +++ b/brainsteam/content/posts/2023/09/30/Turbopilot - a Retrospective.md @@ -0,0 +1,165 @@ +--- +categories: +- AI and Machine Learning +date: '2023-09-30 16:42:58' +draft: false +tags: [] +title: Turbopilot - a Retrospective +type: posts +--- + + +

As of today, I am deprecating/archiving turbopilot, my experimental LLM runtime for code assistant type models. In this post I'm going to dive a little bit into why I built it, why I'm stopping work on it and what you can do now.

+ + + +

If you just want a TL;DR of alternatives then just read this bit.

+ + + +

Why did I build Turbopilot?

+ + + +

In April I got COVID over the Easter break and I had to stay home for a bit. After the first couple of days I started to get restless. I needed a project to dive into while I was cooped up at home. It just so happened that people were starting to get excited about running large language models on their home computers after ggerganov published llama.cpp. Lots of people were experimenting with asking llama to generate funny stories but I wanted to do something more practical and useful to me.

+ + + +

I started to play around with a project called fauxpilot. This touted itself as an open source alternative to Github Copilot that could run the Salesforce Codegen models locally on your machine. However, I found it a bit tricky to get running, and it didn't do any kind of quantization or optimization which meant that you could only run models on your GPU if you have enough VRAM and also if you have a recent enough GPU. At the time I had an Nvidia Titan X from 2015 and it didn't support new enough versions of CUDA to allow me to run the models I wanted to run. I also found the brilliant vscode-fauxpilot which is an experimental vscode plugin for getting autocomplete suggestions from fauxpilot into the IDE.

+ + + +

This gave me an itch to scratch and a relatively narrow scope within which to build a proof-of-concept. Could I quantize a code generation model and run it using ggerganov's runtime? Could I open up local code-completion to people who don't have the latest and greatest nvidia chips? I set out to find out...

+ + + +

What were the main challenges during the PoC?

+ + + +

I was able to whip up a proof of concept over the course of a couple of days, and I was pretty pleased with that. The most difficult thing for me was finding my way around the GGML library and how I could use it to build a computation graph for a transformer model built in PyTorch. This is absolutely not a criticism of ggerganov's work but more a statement about how coddled we are these days as developers who use these high-level Python libraries to abstract away all the work that's going on whenever we build out these complex models. Eventually, I found a way to cheat by using a script written by moyix to convert the Codegen models to run in a model architecture already supported by the ggml example code. This meant that I didn't need to spend several days figuring out how to code up the compute graph and helped me get my POC together quickly.

+ + + +

Once I'd figured out the model, it was just a case of quantizing it and running the example code, then I made use of CrowCPP to provide a lightweight HTTP server over the top. I reverse engineered the fauxpilot code to figure out what the REST interface needed to look like and started crafting.

+ + + +

When I typed in my IDE and got those first code suggestions back, I got that magical tingly feeling from making something work.

+ + + +

How successful was the PoC?

+ + + +

Once I had my PoC I added a readme, some badges and some CI pipelines for docker images, Mac pkgs and so on. Then I shared my project on Twitter, Reddit and Mastodon. I was surprised at how much attention it got: I accumulated about 2.5k stars on GitHub in the first couple of days, then it slowed down to about 100 stars a day for the rest of the week. I think it helped a lot that Clem from Huggingface retweeted and replied to my tweet:

+ + + +
A screenshot from twitter: @clem very cool, would be awesome to have the models on hf.co/models
The project got a lot of attention once Clem from huggingface took an interest.
+ + + +

In the weeks that followed I got emails and LinkedIn notifications from all over the place. I got invited for coffee by a bunch of investors and I got asked to demo my tool at a large industry event (unfortunately the demo didn't go ahead due to last minute agenda changes. I will leave this totally, completely unrelated link here). I would say that as a proof-of-concept my work was very successful and demonstrated pretty clearly that it's feasible to have a local code assistant that stacks up pretty well against big, centralised LLMs.

+ + + +

How did the project go after the PoC stage?

+ + + +

After the initial buzz, my progress slowed down quite a bit. Partly because we'd just raised some investment at my day job and things had really taken off and that meant I had less time to spend on the project. Part of it is because this is an area that is moving so fast right now and over the summer there seemed to be new code assistant model releases every week that someone would raise a ticket against. I felt guilty for not being able to keep up with the pace and that made me start to resent the project.

+ + + +

I did have a few exciting highlights though. Once I refactored the code and integrated Nvidia GPU offloading, I was getting blazingly fast responses to prompts from very complex models. I also hired a virtual Mac Mini from Scaleway to fix some MacOS specific bugs and seeing how quickly the inference server ran on an M2 chip was great fun too. I also enjoyed working with the guys who wanted to run the conference demo to add more improvements to the tool.

+ + + +

Why are you downing tools?

+ + + +

There have been a number of recent independent developments that provide a feature-rich and stable way to run state-of-the-art models locally and have VSCode communicate with them, effectively deprecating Turbopilot. I just don't have the time to commit to competing with these alternatives and now that I've got a decent understanding of how GGML + quantization work, I'd like to spend my extra-curricular time on some other challenges.

+ + + +

The main reason for downing tools is that llama.cpp has feature parity with turbopilot. In fact, it's overtaken turbopilot at a rate of knots. It's not surprising: the project is amazing, has garnered a lot of interest and the community are making loads of contributions to make it better all the time. The new GGUF model format allows you to store metadata alongside the model itself, so now you can run the llama server and automatically load Starcoder format models (including santacoder and wizardcoder), GPT-NEOX format models (i.e. StableCode) and CodeLlama models. llama.cpp also provides a proxy script that converts its server API to be compatible with OpenAI, which means that you can use llama's server directly with vscode-fauxpilot without the need for turbopilot.

+ + + +

Secondly, there is a new open source VSCode plugin called Continue which provides both code autocomplete and that conversational style chat experience that CoPilot now supports. Continue can communicate directly with Llama.cpp's vanilla server without the need for the conversion proxy script. This is now my preferred way to code.

+ + + +

What setup do you recommend?

+ + + +

My recommendation is the vanilla llama.cpp server with CodeLlama 7B and Continue.

+ + + +

Firstly, download and build llama.cpp. I have an Nvidia 4070 so I build it with CUDA:

+ + + +
git clone https://github.com/ggerganov/llama.cpp.git
+cd llama.cpp
+mkdir build
+cd build
+cmake .. -DLLAMA_CUBLAS=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc
+make -j6
+
+ + + +

Next I download the gguf model from thebloke's repo and I run the server, offloading the whole model to GPU:

+ + + +
wget https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q5_K_S.gguf
+./bin/server -m ./codellama-7b-instruct.Q5_K_S.gguf -ngl 35
+
+ + + +

With that in place, I installed the Continue plugin in VSCode and opened the sidebar, then selected the '+' button to add an LLM config.

+ + + +
A screenshot of the continue sidebar with the add button highlighted in red
Open the continue sidebar and click the + button to add a model config
+ + + +

Then select llama.cpp and "Configure Model in config.py".

+ + + +
A screenshot of the continue sidebar with llama config view
Click Configure Model in config.py
+ + + +

This will open up a Python config.py file; append the server URL to the default model and click save:

+ + + +
		default=LlamaCpp(
+			model="llama2",
+            server_url="http://localhost:8080"
+		),
+
+ + + +

And now you're ready to do some coding with the new tool!

+ + + +

Conclusion

+ + + +

I've had fun building Turbopilot. It's been a great learning opportunity and helped me to expand my network and meet some cool people. However, given how much the landscape has changed since I started the project and my lack of time to devote to this project, it's right to move on and pass the torch to others in the community.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/10/01/Weeknote Week 39 2023.md b/brainsteam/content/posts/2023/10/01/Weeknote Week 39 2023.md new file mode 100644 index 0000000..ed4a3d4 --- /dev/null +++ b/brainsteam/content/posts/2023/10/01/Weeknote Week 39 2023.md @@ -0,0 +1,51 @@ +--- +categories: +- Personal +date: '2023-10-01 15:48:08' +draft: false +tags: +- weeknotes +title: Weeknote Week 39 2023 +type: posts +--- + +Another busy week here as "Q3" gives way for "Q4" in the business world. + +Early this week we had a company retreat. On Monday we invited middle and senior management teams and on Tuesday we invited the whole tech team including fully remote staff. We rented a huge AirBnB with a games room and a projector and even a hot tub (although only our CEO and one of the braver account managers went in the tub, the rest of the team stood around awkwardly). + +The retreat was an opportunity to try to build some good working relationships between different parts of the business since we've had so many new starters in recent days. It was kind of cool seeing different people, who wouldn't normally spend time together, hanging out and chatting over beers. + +On Wednesday I gave a CTO update at my first board meeting since we raised funding. The segment went well and we had a discussion about where machine learning and AI are useful and where they are not. + +The rest of the week was quieter and I got to work from home for a couple of days which was great because I was absolutely shattered after all of that socialising. + +
+ +Outside of work this week I published a few short blog posts and short comments about some of the articles that I read. + +I wrote some philosophical stuff about AI: + +I also wrote about my decision to stop developing TurboPilot which I still have mixed feelings about. However, it does free up some mental bandwidth for some other side-projects and concentrate attention into fewer places as far as open source AI code assistants are concerned. + +
+

Earlier in the week I was playing Mass Effect 2 as part of the Legendary re-release. I definitely preferred the gunplay in the first game, but the story in the 2nd game is kind of fun. It leans more into a cheesy/stereotype storyline.

I also spotted that Nvidia shipped some new beta drivers (535.43.10) that allow my RTX 4070 to finally run Starfield reasonably well via Proton. I've been streaming it to my Steamdeck and getting 50-60 FPS on 'High' settings, dropping to ~30FPS for densely populated areas. I suspect that there will be more performance gains to be had over the coming weeks.

I wrote a TIL about automatically unlocking your Gnome session remotely with a shell command which is very useful if I want to launch Starfield on my SteamDeck without going upstairs to physically unlock my desktop PC which has locked me out.
+ +I put off my trip up to visit my parents in Shropshire since my Mum and her partner both have COVID. I will try to visit later in the month. Instead, we had a pretty quiet weekend which was appreciated since I had a pretty active week and I needed some introvert energy recharging time. Instead, I've been playing Starfield, reading, writing and snoozing. We've also tackled a few house chores that needed sorting out. + +I've been making some small changes to my habits when it comes to writing and journaling that probably deserve their own full blog post. The most interesting thing has been trying to write more of my thoughts down either on paper or in Memos. Keeping track of more of the things that I do reminds me of just how much stuff I achieve in a given day. If I'm beating myself up for being unproductive, it's useful to reflect on some of this stuff. + +
+ +Next week will be another busy one again. I'm in London for some work events during the middle of the week and I've got a few calls starting to fill up my diary. + +We also need to make our quarterly migration to CostCo as we're starting to run low on some of our bulk items and we're planning to go to the cinema and make use of our Cineworld memberships at least once this week. \ No newline at end of file diff --git a/brainsteam/content/posts/2023/10/15/Weeknote CW 41 2023.md b/brainsteam/content/posts/2023/10/15/Weeknote CW 41 2023.md new file mode 100644 index 0000000..9b99e3b --- /dev/null +++ b/brainsteam/content/posts/2023/10/15/Weeknote CW 41 2023.md @@ -0,0 +1,48 @@ +--- +categories: +- Personal +date: '2023-10-15 21:18:18' +draft: false +tags: +- weeknotes +title: Weeknote CW 41 2023 +type: posts +--- + +This week has been jam packed with traveling, meetings, events and all sorts! For an introvert like me, it's been pretty hard going pretending to be extroverted and interacting with lots of folks. + +The biggest news this week was that my company won another award. A few weeks ago in September we won the CogX award for Best Fintech Company 2023. On Thursday I attended an awards ceremony with my colleague in order to accept the award for Best Emerging Tech Company (on the south coast) 2023. +
+ two men wearing tuxedos smiling and holding a glass trophy
+ Doing my very best Chandler Bing photo smile face whilst holding our award with my colleague Alex (left). +
+
Earlier in the week I did a lot of traveling. On Monday I was up in our London office for a bunch of meetings and face-to-face catch-ups with some new members of the team. Then, on Monday night I made my way over to a hotel near the London ExCeL centre and prepared for the Snowflake World Tour on Tuesday.
+ a huge projector screen showing some snowflake graphics in a dimly lit conference hall + +
The Snowflake World Tour was pretty glitzy and very high energy. Some of the content gave me real "Pied Piper" / "Silicon Valley" vibes.
+We're not using Snowflake that heavily at work so it was an interesting opportunity for me to learn a little bit more about other features they offer and what we might want to start looking into. The new streaming features that allow you to build views that combine "streamed" data with "static" data looked pretty cool. One theme that kept coming up again and again was Large Language Models and Generative AI. However, very few of the speakers actually said what it is they thought you should be doing with these tools, merely that the platform "now supports them". I wrote in my notes that +
+ Every time llms come up it is always very vague and hand wavey
+There was one demo which was actually pretty cool where they described a database schema to a model, asked it to generate SQL queries based on natural language questions and then executed these queries and displayed the answers. This is actually a pretty reasonable use case for this tech because: + +I had a play with the demo that the Salesforce team shared afterwards and I was able to get it working with a local MySQL database (as opposed to a Snowflake instance) pretty damn quickly, which was cool. +
I stayed on in London on Tuesday night and on Wednesday morning I took a train down to Chelmsford to give a talk to some young professionals attending a networking event run by the University of Essex as part of their KTP programme. I really enjoy public speaking these days so I had a great time and afterwards I got lunch with the University team and we chatted about TV shows. + +
A pattern I tried out this week, particularly at the conference on Tuesday was using my memos instance in combination with MoeMemos a bit like twitter to "live tweet" interesting observations from the conference. This was a really nice way to do low friction note taking while enjoying the conference and adding photos is pretty seamless. I've got the LogSeq Memos Sync plugin which pulls in all the memos that I make and attaches them to my daily journal so that I can process and summarise my notes later. The other really cool thing about this is that tags in memos just transparently sync with tags in Logseq so if I tag something #snowflake in memos it will show up as a mention on my snowflake page in Logseq later on. Very groovy. + +
This week I've not had a whole lot of recreational time and when I have had a bit of time I've felt like just snoozing or watching TV. I did watch the latest episode of Great British Bake Off on Tuesday and Mrs R and I have started watching the latest series of Only Murders In The Building on Disney+. + +On Saturday I went with Mrs R to visit my sister-in-law and we started watching Fall of the House of Usher on Netflix. This series is apparently loosely based on the works of Edgar Allan Poe and it's by Mike Flanagan who directed a bunch of shows I've previously enjoyed including Haunting of Hill House, Haunting of Bly Manor and Midnight Mass. There is a lot I enjoyed about House of Usher except there is a particularly graphic/gory scene in one of the early episodes which is still etched permanently into my memory, left me feeling pretty panicked and prevented me from sleeping. The last thing that impacted me so deeply was The Boy in the Striped Pyjamas. I don't know if I want to keep watching the show as whilst I'm a big fan of psychological horror, I don't do gore and normally Flanagan excels in the former rather than the latter. However, I'm kind of captivated by the storyline so I'm wondering if I should just keep pushing through and hope that it won't cause any more lost sleep. + +
I've not read any more of Ringworld by Larry Niven this week; I'm kind of put off by the in-your-face misogyny at this point. I may DNF it and move on to something else. + +I did pick up Think Like a CTO (which is probably a good idea since I am one) this week which I'm enjoying so far. + +
This week should be a lot less hectic and busy. We are going to the theatre to see 2:22 on Tuesday, I'm only in London for one day this week for some meetings on Wednesday and I'm taking a half day on Friday to visit my parents and some friends up in the midlands whilst Mrs R stays at home and celebrates her sister's birthday with a girly spa afternoon. + +Here we go again! \ No newline at end of file diff --git a/brainsteam/content/posts/2023/10/18/Dealing with death-by-a-thousand questions in the workplace.md b/brainsteam/content/posts/2023/10/18/Dealing with death-by-a-thousand questions in the workplace.md new file mode 100644 index 0000000..09fdd5b --- /dev/null +++ b/brainsteam/content/posts/2023/10/18/Dealing with death-by-a-thousand questions in the workplace.md @@ -0,0 +1,101 @@ +--- +categories: +- Engineering Leadership +date: '2023-10-18 09:04:00' +draft: false +tags: [] +title: Dealing with death-by-a-thousand questions in the workplace +type: posts +--- + +Inundated by 'quick questions' that prevent you from carrying out your main role? Slack has made it very easy to invade your focus time and before slack people would turn up at your desk in the office and say "can I borrow you for a sec...?" + + + + +

Featured Image by Towfiqu barbhuiya on Unsplash

+ + + +

Inundated by 'quick questions' that prevent you from carrying out your main role? Slack has made it very easy to invade your focus time; before Slack, people would turn up at your desk in the office and say "can I borrow you for a sec...?"

+ + + +

The big problem, especially for software engineers and deep knowledge workers, is that these "quick questions" and just being borrowed come with a context-switching cost. The person who asked it may not realise that their 2-minute question just cost you half an hour of work time because you need to get back into the zone (or, for a nerdier analogy, you need to page the contents of your brain out to disk and loading it back into memory is slow work...)

+ + + +

+ + + +

Here are a few suggestions and anecdotes for dealing with these quick questions from my personal experience.

+ + + +

Document Frequently Asked Questions

+ + + +

If you are an individual contributor and you're good at your job you likely have a reputation as an expert at X and people will want to ask you about that thing. In one way that's flattering but when it takes up a lot of time it obviously becomes disruptive. Have a think about what you're being asked on a regular basis and consider writing up answers to common questions and either blogging about them or putting them somewhere appropriate within your workplace (a Notion or Confluence page or a Google Doc or something). Then next time someone asks you that question, send them a link to your FAQ. Rather than just sending the link without any context (which could potentially come across as passive aggressive), have a pre-canned "polite" response about how you get asked that a lot and how you took the time to write up a detailed answer and hope that it's helpful. If you put all your answers in the same place people will start to check there first and share the answer amongst themselves.

+ + + +

Set Some Boundaries for Yourself

+ + + +

I hope this next one doesn't come across as too patronising because it might sound obvious, but many thirty-somethings or even forty-somethings that I work with haven't really thought about it.

+ + + +

We all want to be helpful and please our colleagues but does answering their quick question the moment it arrives in Slack actually unblock them there and then? Some people, particularly those that juggle a lot of to-dos, like to dump questions straight into Slack as a way to offload - once it's in your inbox it no longer has to be in theirs. What's more, IM apps like Slack are designed to make us alert and stressed and want to get rid of that little red indicator and drive the number of unread messages down to zero.

+ + + +

So what can you do? Set yourself some boundaries and be disciplined about them. Block out an hour of "focus time" in your calendar to just get on with your main day job and turn off slack. Just right click -> exit! It's ok! You can do it! The world is not going to stop spinning and if it really is "that urgent" they probably have your phone number. Make a habit of "batch replying" to all your quick questions at once - this reduces the context switching required.

+ + + +

Set Some Boundaries for Others

+ + + +

If you already had an inkling about the above, or you took my advice and people switched from slacking to calling, it's time to lay down some boundaries.

+ + + +

My favourite approach here is to schedule "AMA sessions" or "office hours" - put in a recurring meeting at a time when you're not at your most productive (I'm a morning person so mid-afternoon works well for me). Tell everyone that you will no longer be responding to ad-hoc questions and to bring the questions to your office hours session.

+ + + +

You can safely assume that people will still bring genuinely urgent questions to you as soon as they pop up, so don't say the quiet part out loud. If you do, people will likely decide that everything they care about is urgent and bombard you anyway. A tiny bit of friction here goes a long way.

+ + + +

Ideally your line manager will be on board with this but regardless, give yourself permission to do it and ask for forgiveness rather than permission. After all, this is easily defensible to the wider business (the question askers) as a necessary process change to improve your productivity.

+ + + +

Normally this has the effect of condensing down all your ad-hoc question answering (good for context switching) and it often has another pleasant side effect: two or more people might join with the same question and they all get it answered simultaneously. Not quite as good as having an FAQ written up but certainly more efficient than parroting the same thing to different people.

+ + + +

An important step is to follow through and actually ignore people who message you outside of office hours (again -probably using your professional judgement to decide whether it really is urgent).

+ + + +

Conclusion

+ + + +

In the modern work environment, juggling tasks is painful enough without being constantly bombarded with questions on top. Whilst you might feel the need to be helpful, if you're not setting boundaries you could end up burning out trying to find time to do everything. Start by setting yourself some intrinsic boundaries and habits and then consider going radical and using tools like "office hours" to cut down wasted time. Give yourself permission to do your job without distractions - your manager will almost certainly be on board!

+ + + +

Appendix: Getting Your Manager On Board

+ + + +

If you're an individual contributor you, hopefully, have a pretty well defined day job and core set of deliverables and goals. If answering questions all day is getting in the way of that and your manager is actually decent at their job, they are going to care and want to help you find ways to make things work. If the office hours/AMA idea sounds a bit radical to them, gather some data. Start timing how long you spend answering questions and show your manager the numbers in your next 1:1. Good managers should be horrified and want to help to facilitate you getting more time to do your core job.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/10/29/Comment_ Once Interrupted_ Give It Your Full Attention.md b/brainsteam/content/posts/2023/10/29/Comment_ Once Interrupted_ Give It Your Full Attention.md new file mode 100644 index 0000000..08ec2ee --- /dev/null +++ b/brainsteam/content/posts/2023/10/29/Comment_ Once Interrupted_ Give It Your Full Attention.md @@ -0,0 +1,55 @@ +--- +categories: +- Engineering Leadership +date: '2023-10-29 22:48:23' +draft: false +tags: [] +title: 'Comment: Once Interrupted, Give It Your Full Attention' +type: posts +--- + + +

+ + + +
+

Good leaders do their best to prevent distractions and avoid interruptions.

+ + + +

...

+ + + +

But no matter how much planning goes into eliminating distractions and arranging the physical work environment to enhance focus, people (and pets) interrupt the flow.

+ + + +

...

+ + + +

Once interrupted, it often makes the most sense to give the source of the disturbance your full attention. If it’s going to take a few minutes to redeploy your focus anyway, why not achieve the equally important goal of doing what leaders are meant to do: focus on the problems and issues of others.

+Admired Field Notes
+ + + +

+ + + +

This really makes a lot of sense even though our brains will actively try to pull us back into whatever we were doing before. By going back to what you were working on you have to go through the pain of another context switch whilst the interruption presumably remains unresolved.

+ + + +

+ + + +

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/10/29/Weeknote CW43 2023.md b/brainsteam/content/posts/2023/10/29/Weeknote CW43 2023.md new file mode 100644 index 0000000..e728b55 --- /dev/null +++ b/brainsteam/content/posts/2023/10/29/Weeknote CW43 2023.md @@ -0,0 +1,40 @@ +--- +categories: +- Personal +date: '2023-10-29 19:26:14' +draft: false +tags: +- weeknotes +title: Weeknote CW43 2023 +type: posts +--- + + +

We had a great time at Peter Pan Goes Wrong - lots of silly slapstick

+ + + + + \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/01/Going Mainstream with Wordpress.md b/brainsteam/content/posts/2023/11/01/Going Mainstream with Wordpress.md new file mode 100644 index 0000000..d2730e2 --- /dev/null +++ b/brainsteam/content/posts/2023/11/01/Going Mainstream with Wordpress.md @@ -0,0 +1,33 @@ +--- +categories: +- Personal +date: '2023-11-01 20:12:49' +draft: false +tags: [] +title: Going Mainstream with Wordpress +type: posts +--- + + +

+ + + +

Hello folks - I've taken the decision to migrate my blog back to wordpress after a few years of "dogfooding" my own publishing stack with Hugo and a little python script I wrote to integrate with indieweb and webmention stuff.

+ + + +

Why have I done this? Well frankly I realised that my blog setup was annoying me and holding me back from actually publishing. I couldn't find a workflow that I was totally comfortable with and that friction was making me less likely to want to write. During this period of reflection I also noticed Kev's comment about moving between CMS systems and had a little chat with him about it over email. He also advised me that I should do what I need to in order to reduce friction so that I actually want to write.

+ + + +

I know that Wordpress is the boring, mainstream, vanilla option when it comes to web publishing but it's also very convenient to use and I'm very comfortable drafting and publishing pages from this interface. It also makes updating the layout and design of my page a lot easier and I am very much a backend/data person rather than a web designer so that is pretty appealing too. I was also pretty keen to try out the new WP activitypub integration for directly sharing my content on the fediverse.

+ + + +

I've been migrating my posts across from my Hugo site using a little conversion script that I wrote up to convert the markdown files and add them as drafts in wordpress. This way I can curate my archive of posts and bring it with me to Wordpress over time. I've brought some of my more popular articles over with me already and I'll slowly make my way through my back catalogue for posts I want to retain.

+ + + +

I'm hoping that this will help inspire me to write a little more regularly alongside some deliberate daily writing practice that I'm trying out at the moment. Apologies for breaking people's RSS feeds and whatnot, hopefully it won't happen too regularly from here on out.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/03/Living with Aphantasia.md b/brainsteam/content/posts/2023/11/03/Living with Aphantasia.md new file mode 100644 index 0000000..4cee715 --- /dev/null +++ b/brainsteam/content/posts/2023/11/03/Living with Aphantasia.md @@ -0,0 +1,58 @@ +--- +categories: +- Personal +date: '2023-11-03 11:20:35' +draft: false +tags: +- aphantasia +title: Living with Aphantasia +type: posts +--- + + +

Aphantasia is what they call it when you can't see stuff in your "mind's eye", it's latin for "no imagination". It was a term originally coined by Dr Adam Zeman who did a study in 2015 that I responded to. I am an aphant and I spent most of my life believing that this is the normal state of affairs. When my secondary school art teacher got frustrated with me during still life drawing, they said "just visualise it in your mind's eye" which completely threw me - I assumed that this was a turn of phrase or a metaphor or something. I didn't realise that most people genuinely do have mental imagery.

+ + + +
+https://www.youtube.com/embed/vbVlte9hSrE +
In this video, Dr Adam Zeman summarises Aphantasia - the inability to imagine things
+ + + +

+ + + +

I think in terms of "facts" and "statements". If I ask you to picture a sunset in your mind you are probably able to see a lovely orange glowing sun setting over a beach with waves gently lapping across pink coral sand. I don't see any of that stuff but I know, theoretically what sunsets look like and I can pick language that describes how that sunset might interact with other stuff (like a sandy beach). What I can't do is "picture" anything in my head.

+ + + +

Living with Aphantasia is well... all I've ever known. It's my lived experience. I don't believe it's ever held me back in any major ways. I'm not very good at visual art but I never set out to be an artist (perhaps there was always an unconscious bias there?). There are a few things about being an aphant that annoy me.

+ + + +

I am very bad at facial recognition. I'm not face-blind, I know my wife and family when they are there in front of me. However, I'm awful at recognising people "out and about" or actors in movies. I am often reminded by my mother of an occasion from my childhood where, upon seeing a photo of the famously beautiful model Heidi Klum, I remarked "she looks just like Gollum from Lord of the Rings". I'm not consciously aware of how my facial recognition faculties work (or don't work) but I'm sort of vaguely aware that it's quite brittle and "fact based". For example, the person in question has big eyes and black hair and a particular shape to their nose or mouth and so does the other person. This can make for some very embarrassing interactions.

+ + + +

When it comes to aesthetics, I know what good looks like but I find it hard to replicate. I often get excited about DIY projects and dream of replicating styles that I've seen in showrooms or other people's houses or even fancy hotels. However, when I get home, I see how the room is now and I can't picture how I want it to look. This is a struggle when buying a house - they always tell you to "look past" the current appearance and imagine how it could look. Pahahahaha - I'd love to be able to do that! I have the same frustrations when it comes to all visual creative endeavours - web design, drawing, even designing the invites we sent out for our wedding. This is a major reason that I specialise in backend/server-side and machine learning code rather than frontend as part of my day job.

+ + + +

A third and final frustration I'll talk about here is mindfulness and stress relief techniques. Many meditation and stress relief exercises involve statements like "imagine a warm beam of sunlight that beams up and down your body and relaxes your muscles" or "imagine going to your happy place" or "imagine you're holding all your stress in a balloon and you're going to let it go and it will float away". None of these work for aphants like me. What does work for me is breath work and actual videos (for example, the Headspace "thoughts as traffic" analogy, which I couldn't picture 'in my head' but I can appreciate as a cute cartoon).

+ + + +
+https://www.youtube.com/embed/iN6g2mr0p3Q +
Headspace's "thoughts as traffic" animation is cute and I don't have to try and visualise it.
+ + + +

Overall though, I can't say that Aphantasia is particularly problematic for me, it's all I've known for 30-something years and, as such, I've just generally "got on with it". However, learning about Aphantasia, interacting with Dr Zeman and being able to put two-and-two together later in life ("oh that's why my art teacher said that stuff and that's why I'm not great at painting") has been very positive for me.

+ + + +

I'm planning to write a series on aphantasia and how I "work around" it. If you have questions or are curious about it, I'd love to hear from you. Or, if you are a fellow aphant, drop me a message and say hi!

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/05/Weeknote CW 44 2023.md b/brainsteam/content/posts/2023/11/05/Weeknote CW 44 2023.md new file mode 100644 index 0000000..440864e --- /dev/null +++ b/brainsteam/content/posts/2023/11/05/Weeknote CW 44 2023.md @@ -0,0 +1,41 @@ +--- +categories: +- Personal +date: '2023-11-05 17:43:31' +draft: false +tags: [] +title: Weeknote CW 44 2023 +type: posts +--- + + +

I spent two days up in London this week, attending various in-person meetings, and I got to have lunch with my friends and former co-founders of Filament who spun out to run EBM.

+ + + +

On Wednesday we were "locked down" for Storm Ciaran - our local council declared a Major Incident and encouraged people to stay at home. In the end, the storm wasn't that bad for us although it caused a lot of damage on the channel islands.

+ + + +

I ended up also catching a cold/flu from London so with the storm and feeling pretty bad, we decided not to visit Bournemouth to see Dawn French.

+ + + +

On our planned day off, we went out for lunch instead. We visited a local garden and craft centre with an amazing cafe and deli attached to it. They had Christmas displays up and Mrs R talked me into spending a fortune on decorations. I also lost Whamageddon - on the 2nd of November!

+ + + +

I finally decided to make the move from Hugo to Wordpress - I wrote about it a bit here.

+ + + +

I also wrote about my experience of Aphantasia and had a few chats with people over on Substack Notes.

+ + + +

I finally finished Larry Niven's Ringworld. The sci-fi concepts were cool but it was really let down by a whole bunch of problematic treatment of women. I've moved on to Dredd: Year Three from a fiction POV and I've started reading Herbert Lui's Creative Doing and MAKE by Pieter Levels on the non-fic pile.

+ + + +

This week my Dad and his wife are coming to visit us for a couple of days so I've got a little longer off work. I suspect I'll be up in London for some f2f meetings too. Then we've got a few weeks' reprieve before Christmas stuff starts kicking off.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/12/Weeknote CW45 2023.md b/brainsteam/content/posts/2023/11/12/Weeknote CW45 2023.md new file mode 100644 index 0000000..5965602 --- /dev/null +++ b/brainsteam/content/posts/2023/11/12/Weeknote CW45 2023.md @@ -0,0 +1,62 @@ +--- +categories: +- Personal +date: '2023-11-12 19:58:30' +draft: false +tags: +- weeknotes +title: Weeknote CW45 2023 +type: posts +--- + + +

This week was another busy one both at work and at home.

+ + + +

My Dad and his wife came to stay with us at the start of the week. My dad is reeaaally into coffee and I got him a voucher to go on a barista training course at Winchester Coffee School for his birthday earlier in the year. While he was doing his course, I took my evil stepmother for a walk along the beach at Lee-on-the-Solent and we had lunch in one of the seafront cafes. I snapped a photo of some brave kite surfer, out in the sea on a very cold and windy day.

+ + + +

Since I had Monday afternoon, Tuesday and Wednesday morning off from work, I mainly worked from home and the Whiteley office this week and managed to avoid London. I'm currently looking at our internal tooling to support our software development lifecycle. We currently use Linear and even though the UX is crisp and polished, there are a few things that aren't working well for us right now. In particular, I'm looking for something that gives us better transparency and visibility of team velocity. We're looking at whether a move back to Jira might be the right thing for us (to butcher an old phrase, nobody got fired for buying Jira).

+ + + +

Nourishing Side Project

+ + + +

During my downtime I've been playing with an idea for building a federated recipe app (kind of like if Whisk.com - which was recently acquired and enshittified by Samsung - had open, interoperable recipe formats). I've been playing with local AI tools like Stable Diffusion and CodeLlama to accelerate the development. The idea seemed to attract a few interested folks when I tooted about some of the "default avatars" I had Stable Diffusion create for me.

+ + + +
Screengrab of a mastodon toot where I mention that I am creating an activitypub recipe app and share a screenshot of some of the cute avatars I got SD to create.
+ + + +

I'm really excited about this idea and it sounds like some other people are too. I'm planning to follow up with a blog post about my plans for what it will do and a name reveal in the next few days. I'm planning to build it "with the garage door up" so hopefully it will help others too.

+ + + +

Watching and Reading

+ + + +

On Saturday we went to see Dream Scenario - a film about an awkward, introverted man, played by Nicolas Cage, who ends up appearing in a large number of people's dreams - sort of like a sleep meme. It's a really good movie with some dark moments, some cringy moments and some funny moments and it spends some time examining some important societal tropes. Although I sat and watched the film through gritted teeth and occasionally with my head in my hands, I genuinely enjoyed it. I did see myself (an awkward introverted man) in Cage's character although Mrs R assures me that I'm nicer and have more common sense.

+ + + +

I am still reading Make by Pieter Levels which so far has focused on finding interesting ideas and idea generation and Dredd: Year Three which provides me with some pulpy sci-fi entertainment. Today I saw this SMBC comic which really resonated after my recent reading of Ringworld.

+ + + +

Coming up Next Week

+ + + +

This coming week I will mainly be working from home again. It's cold, wet and dark and there aren't many hours of sunlight so I'm aiming to get out and do some walks to try and get some endorphins and some exercise. I'll be writing more about my recipe app.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/13/Gastronaut - Fediverse Recipe App.md b/brainsteam/content/posts/2023/11/13/Gastronaut - Fediverse Recipe App.md new file mode 100644 index 0000000..765f982 --- /dev/null +++ b/brainsteam/content/posts/2023/11/13/Gastronaut - Fediverse Recipe App.md @@ -0,0 +1,120 @@ +--- +categories: +- Software Development +date: '2023-11-13 20:56:06' +draft: false +tags: +- gastronaut +title: Gastronaut - Fediverse Recipe App +type: posts +--- + + +

Over the last few days I've been starting to build out a webapp. I was inspired by a personal "itch" I want to "scratch". Although I'm a long way from ready to share my app, I thought I'd take the time to write up some of my initial ideas and choices. I'm a big fan of building with the garage door up and as Pieter Levels says - it's better to share ideas early and get out of your own head.

+ + + +

My Inspiration? Whisk.com

+ + + +

I was, until recently, a big fan of whisk.com, a proprietary recipe app that allowed you to create and share recipes, create a food plan for the week and then use that food plan to put together a shopping list. I love the fact that it normalises units across recipes and estimates conversions between volumes and weights for common foodstuffs (e.g. it will guess at how many grams of sugar you need in total if one recipe calls for 200g and another calls for 2 tablespoons). I also love the recipe importer tool that allows you to paste in URLs to your favourite recipes around the web and automatically pull in the ingredients and quantities for your grocery list.

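To make the normalisation idea concrete, here is a rough sketch of the kind of conversion involved. The grams-per-tablespoon figure is an approximation I've assumed for illustration rather than anything taken from Whisk.

# Rough illustration of summing mixed units for one ingredient.
# The grams-per-tablespoon value for sugar is an assumed approximation.
GRAMS_PER_TABLESPOON = {"sugar": 12.5}

def total_grams(ingredient, amounts):
    """Sum a list of (quantity, unit) pairs for one ingredient, in grams."""
    total = 0.0
    for quantity, unit in amounts:
        if unit == "g":
            total += quantity
        elif unit == "tbsp":
            total += quantity * GRAMS_PER_TABLESPOON[ingredient]
        else:
            raise ValueError(f"unknown unit: {unit}")
    return total

# 200g from one recipe plus 2 tablespoons from another:
print(total_grams("sugar", [(200, "g"), (2, "tbsp")]))  # -> 225.0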
+ + + +

Whisk was recently acquired by Samsung and is now Samsung Food. Most of the functionality is still there but there are some obnoxious signs of early enshittification, like a big "Get App" button if you're using the web version. It got me thinking... what would happen to all my recipes and food plans if this app went away?

+ + + +

Name Ideas

+ + + +

I spent a little time coming up with some name ideas. Initially I went with nurishify but I decided it was too many syllables. The next one, which I was very pleased with, was GastroPub, which is a portmanteau of Gastronomy and ActivityPub but also a play on the British gastropub - a fancy pub/bar that also serves nice food. I decided that GastroPub might be too British and also might confuse people who are genuinely looking for gastropub recommendations. The name I've settled on for now is Gastronaut - a portmanteau of Gastronomy and Astronaut. I like the idea that it's like food space exploration and I imagine I could have a cute little logo with an Astronaut holding a knife and fork or something.

+ + + +

Intended Functionality

+ + + +

The idea is to offer many of Whisk.com's features but in an open, interoperable way. For example:

+ + + +
    +
  1. Import recipes from websites or books
  2. Share recipes with friends (and plug into ActivityPub so that they can be tooted/reposted etc)
  3. Manage your food plan and grocery list
+ + + +

In the short term, my intention would be to make a neat import/export tool that uses common formats like CSV and Markdown so that someone could download all their data and keep it safe if their instance started to creak but perhaps longer term there could be some kind of profile migration tool like Mastodon offers.

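As a very rough sketch of what a Markdown export for a single recipe might look like (the field names here are my own assumptions, not the actual Gastronaut models):

# Sketch of dumping one recipe to Markdown; the dict shape is illustrative only.
def recipe_to_markdown(recipe):
    lines = [f"# {recipe['title']}", "", "## Ingredients", ""]
    for item in recipe["ingredients"]:
        entry = f"- {item['quantity']} {item['unit']} {item['ingredient']}"
        lines.append(entry.replace("  ", " "))  # tidy up items with no unit
    lines += ["", "## Method", "", recipe["method"]]
    return "\n".join(lines)

example = {
    "title": "Banana bread",
    "ingredients": [
        {"quantity": "4", "unit": "", "ingredient": "bananas"},
        {"quantity": "200", "unit": "g", "ingredient": "sugar"},
    ],
    "method": "Mash, mix, bake.",
}
print(recipe_to_markdown(example))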
+ + + +

I am a machine learning and natural language processing specialist so I have some ideas about how the system could auto-import recipes from websites but maybe also from photos that have been OCRed. LLMs are probably a bit overkill for this kind of use case; I suspect open-source NLP pipelines like SpaCy would do pretty well at some of these tasks.

+ + + +

Tech Stack

+ + + +

Rather than using this as an excuse to learn some new trendy language or tool, I want to get something up and running quickly. Therefore, I'm going to rely on well established, and probably 'boring', tools and frameworks to get the job done. The core web app is Python Django. I want to keep the frontend simple and minimalist and avoid javascript hell so I'm using the Milligram lightweight CSS framework alongside jQuery and HTMX for doing some lightweight UX stuff. I'm going to use Python+Celery+Redis for the worker queue system (for doing things like importing recipes).

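For the recipe import jobs, a minimal sketch of how the Celery and Redis pieces might be wired together is below. The project and task names are assumptions for illustration, not the actual Gastronaut code.

# celery.py - minimal sketch of a Celery app backed by Redis for a Django
# project assumed to be called "gastronaut"; names are illustrative only.
import os

from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gastronaut.settings")

app = Celery("gastronaut", broker="redis://localhost:6379/0")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()


@app.task
def import_recipe_from_url(recipe_id, url):
    """Placeholder background job: fetch the page at `url`, parse out the
    ingredients and attach them to the Recipe with id `recipe_id`."""
    ...

A view would then hand the slow work off to the queue with something like import_recipe_from_url.delay(recipe.id, url) rather than blocking the request.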
+ + + +

Funding and Sustainability

+ + + +

It may be a bit cheeky but I already have a Ko-fi page. If this thing takes off I'd be interested in turning it into a side-gig. I would also ask people using the flagship instance of this software to make a regular donation if they can. I may also provide some "plus" features such as OCRing recipes from phone camera pics and charge a small fee for convenience. People would be welcome to self-host stuff and set up and pay for all the infrastructure for the premium features but likewise, they would probably find it cheaper and easier to use the central instance.

+ + + +

Early Screenshots/Mockups

+ + + +

I've already started building out the app. Below are some photos of the current state of the system with the old Nurishify name/brand that I came up with.

+ + + +

I started with a simple landing page explaining what the app does and encouraging the user to sign up

+ + + +
A landing page. There is a photo of some food on a chopping board and a summary of the app and  what it does. There is a button inviting users to sign up to use the app.
+ + + +

I made a mock up of the home screen where you'd be able to see your friends and what they'd posted about and manage your own recipes

+ + + +
A mockup of the app. On the left there is a full size picture of the user's avatar and underneath it summaries and links to the user's recipes, food plans and shopping list. On the right, a news feed showing people liking the user's recipe and adding their own.
A screenshot of the home screen with the old Nurishify name
+ + + +

What's Next?

+ + + +

I'm quite excited about building this thing out but I'd love to get your feedback! Any killer feature ideas? Any thoughts about how things should work? Any names that are better than Gastronaut?

+ + + +

In the next few days I will probably share a Github repo for my project but for now please follow me on my blog or on fosstodon for updates if you're interested!

+ + + +

Thanks for listening to my ted talk!

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/18/Gitea Actions and PDM.md b/brainsteam/content/posts/2023/11/18/Gitea Actions and PDM.md new file mode 100644 index 0000000..8c9319c --- /dev/null +++ b/brainsteam/content/posts/2023/11/18/Gitea Actions and PDM.md @@ -0,0 +1,150 @@ +--- +categories: +- Software Development +date: '2023-11-18 22:44:24' +draft: false +tags: +- gitea +- python +title: Gitea Actions and PDM +type: posts +--- + + +

Gitea Actions is the new Github-compatible CI/automation pipeline feature that ships with Gitea and Forgejo. In theory it is interoperable with Github Actions but there are still a few rough edges and, for that reason, the feature is still disabled by default.

+ + + +

I have been trying to get a Django project that uses PDM for Python dependency management to install itself and run some tests in a Gitea CI environment.

+ + + +

In theory, my workflow is simple: check out the code, set up Python and PDM, install the project dependencies and run the Django test suite.

+ + + + + + + +

Unfortunately there were a couple of odd quirks that I had to resolve first before I could get it working.

+ + + +

Add a Github Key

+ + + +

I was initially getting an error about being Unauthorized from the pdm-project/setup-pdm@v3 step. This is because the action attempts to download a list of pre-built python distributions from github and since we're running outside of github it is initially unable to get this list without an additional API key. All we need to do is create a Github token and then set up a new project secret and paste in the token that we just created. I use the name GH_TOKEN because Gitea does not allow you to use any secret prefixed GITHUB.

+ + + +
A screenshot of the project settings in gitea. Navigate to Actions, Secrets, Add Secret and create a new secret called GH_TOKEN.
Create a secret called GH_TOKEN which we can pass to the action.
+ + + +

+ + + +

Now, we can pass the token into the setup pdm step of the yaml like so:

+ + + +
      - uses: pdm-project/setup-pdm@v3
+        with:
+          python-version: "3.10"
+          token: ${{ secrets.GH_TOKEN }}
+ + + +

+ + + +

Change the Container Image

+ + + +

Once I resolved the authorization error above, I started getting error messages about how no Python releases were available for the given OS and architecture. I thought that was weird because in theory we're running Ubuntu on x64. This forum post suggested that changing the docker image that the runner uses to execute the pipeline might work. I'm not 100% sure what the default Gitea action runner uses as its base image but Gitea Actions are based on Act, a local runner for Github actions and the official Act project recommends images by catthehacker for use as runners. By specifying one of these images, we seem to be able to 'fix' whatever metadata is missing from the default image.

+ + + +

We can pass a container in via the job container directive like so:

+ + + +
jobs:
+  run_tests:
+    runs-on: ubuntu-latest
+    container: catthehacker/ubuntu:act-latest
+    steps:
+    ...
+      - uses: pdm-project/setup-pdm@v3
+        with:
+          python-version: "3.10"
+          token: ${{ secrets.GH_TOKEN }}
+ + + +

+ + + +

With this change in place, the rest of my pipeline seems to have burst into life.

+ + + +

Here is the full yaml file for my CI:

+ + + +
name: Run Tests
+run-name: ${{ gitea.actor }} is testing out Gitea Actions 🚀
+on: [push]
+
+jobs:
+  run_tests:
+    runs-on: ubuntu-latest
+    container: catthehacker/ubuntu:act-latest
+    steps:
+      - name: Checkout Codebase
+        uses: actions/checkout@v3
+
+      - name: Set up python
+        run: |
+          apt-get update && apt-get install -y python3-venv
+          pip install --upgrade pdm
+
+      - uses: pdm-project/setup-pdm@v3
+        with:
+          python-version: "3.10"
+          token: ${{ secrets.GH_TOKEN }}
+
+
+      - name: Install dependencies
+        run: cd ${{ gitea.workspace }} && pdm install
+
+      - name: Run tests
+        run: |
+          cd ${{ gitea.workspace }} && pdm run manage.py test
+
+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/19/Parsing Ingredient Strings with SpaCy PhraseMatcher.md b/brainsteam/content/posts/2023/11/19/Parsing Ingredient Strings with SpaCy PhraseMatcher.md new file mode 100644 index 0000000..2d35fcb --- /dev/null +++ b/brainsteam/content/posts/2023/11/19/Parsing Ingredient Strings with SpaCy PhraseMatcher.md @@ -0,0 +1,271 @@ +--- +categories: +- Data Science +- Software Development +date: '2023-11-19 18:13:40' +draft: false +tags: +- django +- gastronaut +- python +title: Parsing Ingredient Strings with SpaCy PhraseMatcher +type: posts +--- + + +

As part of my work on Gastronaut, I'm building a form that allows users to create recipes and which will attempt to parse ingredients lists and find a suitable stock photo for each item the user adds to their recipe. As well as being cute and decorative, this step is important for later when we want to normalise ingredient quantities.

+ + + +

What we're looking to do is take a string like "2lbs flour" and turn it into structured data - a json representation might look like this:

+ + + +
{
+   "ingredient":"flour",
+   "unit":"lbs",
+   "quantity":"2"
+}
+ + + +

We can then do whatever we need with this structured data - like using the ingredient to look up a thumbnail in another system or generate a link or reference to the new recipe so that people looking for "banana" can find all the recipes that use them.

+ + + +

Building a Parser

+ + + +

There are a few options for parsing these strings. If you're feeling frivolous and want to crack a walnut with a sledgehammer, you could probably get OpenAI's GPT to parse these strings with a single API call and a prompt. However, I wanted to approach this problem with a more proportional technique.

+ + + +

I'm using Spacy along with Spacy's PhraseMatcher functionality which basically looks for a list of possible words and phrases. Once we've installed Spacy, we make a long list of units and we tell Spacy about them:

+ + + +
import spacy
+from spacy.matcher import PhraseMatcher
+
+# Load the English language model
+nlp = spacy.load("en_core_web_sm")
+
+
+# Define a list of known units
+known_units = [
+    "grams", 
+    "g", "kg", 
+    "kilos", 
+    "kilograms", 
+    # ... many more missing for brevity
+    "lbs",
+    "cup",
+    "cups",
+    "tablespoons",
+    "teaspoons"]
+
+# Initialize the pattern matcher
+matcher = PhraseMatcher(nlp.vocab)
+
+# Add the unit patterns to the matcher
+matcher.add("UNITS", [nlp(x) for x in known_units])
+ + + +

Now we can write a function to use this matcher along with a bit of logic to figure out what's what and structure it into the json format I outlined above.

+ + + +
# Function to parse ingredient strings
+def parse_ingredient(ingredient):
+    doc = nlp(ingredient)
+    matches = matcher(doc)
+    
+    quantity = None
+    unit = None
+    ingredient_name = []
+
+
+    for token in doc:
+        if token.text in ['(',')']:
+            continue
+        if not quantity and token.pos_ == 'NUM':
+            quantity = token.text
+        elif unit is None and any(start <= token.i < end for _, start, end in matches):  # PhraseMatcher matches are (match_id, start, end)
+            unit = token.text
+        else:
+            ingredient_name.append(token.text)
+
+    return {
+        "quantity": quantity,
+        "unit": unit,
+        "ingredient": " ".join(ingredient_name)
+    }
+
+ + + + + + + +

This approach works if the items are in a different order too (which would completely throw a regular expression off) e.g. Milk (1 cup) rather than 1 cup milk.

+ + + +

Making it More Robust

+ + + +

This logic is not perfect but it should cover most reasonable use cases where someone enters an ingredient in a fairly conventional format.

+ + + +

However, if I want to improve the parsing performance and variety of things that we want to be able to understand in the future, I could train a custom NER model inside spacy. I will likely write about doing exactly that at some point in the future.

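As a taste of what that might involve, here is a minimal sketch of hand-annotated training examples for a custom ingredient NER model in spaCy v3. The label names, examples and tiny training loop are my own illustration rather than anything from the Gastronaut codebase.

# Minimal sketch of training a custom NER model in spaCy v3; in practice you
# would use spaCy's config/CLI training with far more data than this.
import spacy
from spacy.training import Example

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for label in ("QUANTITY", "UNIT", "INGREDIENT"):
    ner.add_label(label)

# Entities are marked with character offsets into the raw string.
TRAIN_DATA = [
    ("2 lbs flour", {"entities": [(0, 1, "QUANTITY"), (2, 5, "UNIT"), (6, 11, "INGREDIENT")]}),
    ("1 cup milk", {"entities": [(0, 1, "QUANTITY"), (2, 5, "UNIT"), (6, 10, "INGREDIENT")]}),
]

examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in TRAIN_DATA
]

optimizer = nlp.initialize(lambda: examples)
for _ in range(10):
    nlp.update(examples, sgd=optimizer)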
+ + + +

+ + + +

Adding Some Tests

+ + + +

Since this is going to operate as an API endpoint in my recipe app, I want to be relatively sure it will work reliably for a few different examples. I'm building a Django app so I've set up a test case using the Django testing framework:

+ + + +
from django.test import TestCase
+import spacy
+from spacy.matcher import PhraseMatcher
+from recipe_app.nlp.ingredients import parse_ingredient 
+
+class ParseIngredientTestCase(TestCase):
+
+    def test_parse_ingredient(self):
+        test_cases = [
+            ("4 bananas", {"quantity": "4", "unit": None, "ingredient": "bananas"}),
+            ("200g sugar", {"quantity": "200", "unit": "g", "ingredient": "sugar"}),
+            ("1 stock cube", {"quantity": "1", "unit": None, "ingredient": "stock cube"}),
+            ("1/2 tbsp flour", {"quantity": "1/2", "unit": "tbsp", "ingredient": "flour"}),
+            ("3 lbs ground beef", {"quantity": "3", "unit": "lbs", "ingredient": "ground beef"}),
+            ("2.5 oz chocolate chips", {"quantity": "2.5", "unit": "oz", "ingredient": "chocolate chips"}),
+            ("5 kg potatoes", {"quantity": "5", "unit": "kg", "ingredient": "potatoes"}),
+            ("1 cup milk", {"quantity": "1", "unit": "cup", "ingredient": "milk"}),
+            ("2 tablespoons olive oil", {"quantity": "2", "unit": "tablespoons", "ingredient": "olive oil"}),
+            ("1/4 pound sliced ham", {"quantity": "1/4", "unit": "pound", "ingredient": "sliced ham"}),
+            ("2 liters water", {"quantity": "2", "unit": "liters", "ingredient": "water"}),
+            ("750 ml orange juice", {"quantity": "750", "unit": "ml", "ingredient": "orange juice"}),
+            ("3 teaspoons salt", {"quantity": "3", "unit": "teaspoons", "ingredient": "salt"}),
+            ("milk (1 cup)", {"quantity": "1", "unit": "cup", "ingredient": "milk"}),
+            ("tomatoes (3 pieces)", {"quantity": "3", "unit": "pieces", "ingredient": "tomatoes"}),
+            ("pasta (200g)", {"quantity": "200", "unit": "g", "ingredient": "pasta"}),
+        ]
+
+        for ingredient, expected in test_cases:
+            parsed = parse_ingredient(ingredient)
+            self.assertEqual(parsed, expected)
+
+ + + +

This code tests a few different variations and options and includes some examples where there is no unit. I also used American English spellings of some of the units for variety.

+ + + +

Building an Endpoint

+ + + +

I'm using HTMX for doing asynchronous interaction between the web page and the form. In my Form class, I set hx-get on the ingredient form field so that whenever the value changes it makes a request to an ingredient_parser endpoint:

+ + + +
from django import forms, urls
+
+from recipe_app.models import Recipe, RecipeIngredient, Ingredient
+   
+class IngredientForm(forms.Form):
+    ingredient = forms.CharField(widget=forms.TextInput(attrs={
+        'hx-get': urls.reverse_lazy('ingredient_parser'),
+        'hx-trigger': "change",
+        "hx-target":"this",
+        "hx-swap": "outerHTML"
+        }))
+
+ + + +

Then I have a view class defined which grabs the value from the ingredient form, parses it and responds with a little HTML snippet which HTMX swaps in on the frontend. In the future I will look up a stock image per ingredient but for now I've got a mystery food picture:

+ + + +
from django.http import JsonResponse
+from django.shortcuts import render
+from django.templatetags.static import static
+from django.views import View
+
+from recipe_app.nlp.ingredients import parse_ingredient
+
+class IngredientAutocomplete(View):
+
+    def get(self, request, *args, **kwargs):
+        
+        ingredient = None
+        form_key = None
+
+        for key in request.GET:
+            if key.startswith("recipeingredient"):
+                ingredient = request.GET[key]
+                form_key = key
+                break
+
+        if ingredient is None:
+            return JsonResponse({}) # TODO: make this respond in a better way
+        else:
+            ing = parse_ingredient(ingredient)
+            ing['raw_text'] = ingredient
+            ing['form_key'] = form_key
+            ing['thumbnail'] = static("images/food/mystery_food.png")
+            return render(request, "partial/autocomplete/ingredients.html", context=ing)
+ + + +

The for loop over the request.GET items is needed because each time we add a new ingredient to the form, the field gets a slightly different name. e.g. recipeingredient_1, recipeingredient_2 and so on.

+ + + +

Putting it All Together

+ + + +

I recorded a video of the form that I've built where I add an ingredient and the response gets populated.

+ + + +
+https://youtu.be/_l0_Lxwm4TY +
+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/20/Reply_ How moving from AWS to Bare-Metal saved us 230_000_ _yr.md b/brainsteam/content/posts/2023/11/20/Reply_ How moving from AWS to Bare-Metal saved us 230_000_ _yr.md new file mode 100644 index 0000000..b2848d6 --- /dev/null +++ b/brainsteam/content/posts/2023/11/20/Reply_ How moving from AWS to Bare-Metal saved us 230_000_ _yr.md @@ -0,0 +1,39 @@ +--- +categories: +- Engineering Leadership +- Software Development +date: '2023-11-20 11:56:36' +draft: false +tags: +- cloud +- devops +- onprem +title: 'Reply: How moving from AWS to Bare-Metal saved us 230,000$ /yr' +type: posts +--- + + +
+

Our transition from AWS to bare-metal infrastructure underscores the fact that while cloud services such as AWS offer robust flexibility and power, they may not always be the most economical choice for every enterprise.

+Neel Patel, https://blog.oneuptime.com/moving-from-aws-to-bare-metal/
+ + + +

We had a very similar experience at my current company a few years ago and we're saving similar magnitudes of cash. Rather than going full bare metal we have a hybrid IT strategy where staging and production systems are hosted in Google Cloud with all of the advantages that brings (redundancy, flexibility, security) but we use our own hardware for running test environments and CI workloads. Our source code repo is cloud-based for redundancy after (true story) our old GitLab instance literally melted in the OVH fire in March 2021.

+ + + +

Our hardware is co-located in an ISO27001 data centre and we also have an off-site backup solution, even though the "test" data is not as critical as the data we keep in the cloud.

+ + + +

We use containerisation and Kubernetes orchestration to provide similar environments for running almost identical environments in the cloud and locally. Our product is quite chunky and expensive to run and barely chugs along on a laptop so having inexpensive internal test instances that our team can use for testing and debugging purposes is really helpful.

+ + + +

It seems like cloud is great for "getting started" as a business when you maybe don't have the resources or knowledge for configuring your own servers in-house. Cloud probably also makes sense when you are a massive corporation and you have so many machines and networks to manage that you basically end up inventing your own "cloud" anyway - if your business' core competency is not IT then you almost certainly don't want to have to operationalise your own cloud!

+ + + +

However, at a certain size I think IT/Tech firms can end up in this sort of "Goldilocks" zone where you are big enough to know how to manage servers and small enough that it isn't an entire industry. At this size there are some serious savings to be made, even when you offset the salaries of your site engineering staff!

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/11/24/Medieval Buzzfeed - Debugging Dodgy Datetimes in Pandas and Parquet.md b/brainsteam/content/posts/2023/11/24/Medieval Buzzfeed - Debugging Dodgy Datetimes in Pandas and Parquet.md new file mode 100644 index 0000000..f3fcc1d --- /dev/null +++ b/brainsteam/content/posts/2023/11/24/Medieval Buzzfeed - Debugging Dodgy Datetimes in Pandas and Parquet.md @@ -0,0 +1,114 @@ +--- +categories: +- Data Science +date: '2023-11-24 09:19:20' +draft: false +tags: +- pandas +- python +title: Medieval Buzzfeed - Debugging Dodgy Datetimes in Pandas and Parquet +type: posts +--- + + +

I was recently attempting to cache the results of a long-running SQL query to a local parquet file using Pandas and SQLAlchemy via a workflow like this:

+ + + +
import os
+import pandas as pd
+import sqlalchemy
+
+env = os.environ
+
+engine = sqlalchemy.create_engine(f"mysql+pymysql://{env['SQL_USER']}:{env['SQL_PASSWORD']}@{env['SQL_HOST']}/{env['SQL_DB']}")
+
+with engine.connect() as conn:
+    df = pd.read_sql("SELECT * FROM articles", conn)
+
+
+df.to_parquet("articles.parquet")
+ + + +

This ended up yielding the following slightly cryptic error message:

+ + + +
ValueError: Can't infer object conversion type: 0         2023-03-23 11:31:30
+1         2023-03-20 09:37:35
+2         2023-02-27 10:46:47
+3         2023-02-24 10:34:42
+4         2023-02-23 08:51:11
+                 ...         
+908601    2023-11-09 14:30:00
+908602    2023-11-08 14:30:00
+908603    2023-11-07 14:30:00
+908604    2023-11-06 14:30:00
+908605    2023-11-02 13:30:00
+Name: published_at, Length: 908606, dtype: object
+ + + +

So obviously there is an issue with my published_at timestamp column. Googling didn't help me very much; lots of people suggested that because there may be some NaN values in the column, Pandas can't infer the correct data type before serializing to parquet.

+ + + +

I tried doing df.fillna(0, inplace=True) on my dataframe, hoping that pandas would be able to coerce the value into a zeroed out unix epoch but I noticed I was still getting the issue.

+ + + +

A quick inspection of df.published_at.dtype returned 'O'. That's pandas' catchall "I don't know what this is" object data type.

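This isn't from my original debugging session, but a quick way to see exactly which Python types are hiding inside an object column is to map type over it:

# Count the concrete Python types stored in the object column.
print(df.published_at.map(type).value_counts())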
+ + + +

I tried to force the data type to a date with pd.to_datetime(df.published_at) but I got another error :

+ + + +
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1201-11-01 12:00:00, at position 154228
+ + + +

Sure enough if I inspect the record at row 154228 the datestamp is in the year of our lord 1201. I don't /think/ the article would have been published approximately 780 years before the internet was invented. Aside from the fact that this is obviously wrong, the error essentially tells us that the date is so far in the past that it cannot be represented as a 64-bit count of nanoseconds relative to the unix epoch (1 Jan 1970), which is how pandas stores datetimes internally.

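For reference, because pandas stores datetimes as 64-bit counts of nanoseconds, the representable range is fixed and you can check the limits directly:

import pandas as pd

# The earliest and latest values a nanosecond-resolution Timestamp can hold.
print(pd.Timestamp.min)  # 1677-09-21 00:12:43.145224193
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807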
+ + + +

We now need to do some clean up and make some assumptions about the data.

+ + + +

We can be pretty confident that none of the news articles from before the unix epoch matter. In this use case, I'm actually only interested in news from the last couple of years so I could probably be even more cut throat than that. I check how many articles are older than that:

+ + + +
import datetime
+
+EPOCH =  datetime.datetime.fromtimestamp(0)
+
+df[df.published_at < EPOCH]
+ + + +

The only result - our article from the dark ages. I'm going to treat the unix epoch as a sort of nan value and set all articles with dates older than this (thankfully only the one) to have that value:

+ + + +
+df.loc[df.published_at < EPOCH, 'published_at'] = EPOCH
+ + + +

Now when I re-run my to_datetime conversion it works! We can overwrite the column on our dataframe and write it out to disk!

+ + + +
df.published_at = pd.to_datetime(df.published_at)
+
+df.to_parquet("test.parquet")
+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/04/Data Swamp.md b/brainsteam/content/posts/2023/12/04/Data Swamp.md new file mode 100644 index 0000000..b213769 --- /dev/null +++ b/brainsteam/content/posts/2023/12/04/Data Swamp.md @@ -0,0 +1,22 @@ +--- +categories: +- Data Science +date: '2023-12-04 14:40:04' +draft: false +tags: +- humour +title: Data Swamp +type: posts +--- + + +

Likes https://snarfed.org/2023-12-03_51578 by Ryan Barrett.

+
+

My dad has spent some of his retirement doing hobbyist machine learning projects. He heard the term “data lake” a while back and has taken to calling his datasets a “data swamp.” Feels like a terminology improvement the whole field could get behind.

+
+
+ + + +

This is brilliant - I've not come across this term before but I could definitely get behind using it to describe data that comes out of customer systems like Jonathon says he already does at his workplace.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/04/Weeknote CW48 2023.md b/brainsteam/content/posts/2023/12/04/Weeknote CW48 2023.md new file mode 100644 index 0000000..a2564ed --- /dev/null +++ b/brainsteam/content/posts/2023/12/04/Weeknote CW48 2023.md @@ -0,0 +1,57 @@ +--- +categories: +- Personal +date: '2023-12-04 15:46:20' +draft: false +tags: [] +title: Weeknote CW48 2023 +type: posts +--- + + +

I've missed a couple of weeknotes since the end of November and start of December has been a pretty busy time.

+ + + +

On Tuesday we went to see a rerun of The Holiday at the cinema. It's one of my favourite Christmas films and it put me and Mrs R into a christmassy mood.

+ + + +

On Thursday I went to London to partake in the Alan Turing Institute NLP Group's Christmas party which was really lovely. I haven't seen much of Prof. Liakata or her students since I finished my PhD this time last year so it was great to catch up and have a good old moan about the AI Hype of the past year.

+ + + +

The weather has been bitterly cold over the last few days in the UK. On Friday it briefly snowed in London and in Fareham and we drove up to the Midlands to see my parents for the last time before Christmas and to do gift exchanges. We were a little concerned that we might not make it up there (not necessarily because it was particularly bad weather but because the UK grinds to a halt whenever we so much as glimpse a single snowflake). However, the drive was very smooth.

+ + + +

On Saturday we spent the day with my dad and his wife. It was -4C outside so we spent the day indoors with the log fire on and a hot toddy in hand. Their naughty cat Bertie was almost constantly under foot and at one point decided to put himself in the bin which made a great photo opportunity.

+ + + +
A british shorthair cat sat in a rubbish bin with a cheeky look on his face
Naughty Bertie in the bin
+ + + +

In the evening we visited Ralph Court Gardens where we met with my mum and her partner. We had a walk around the illuminated gardens (in -4C) and afterwards went indoors for a slice of cake and a hot chocolate. The illuminations were a strange mixture: some of them were cute, some of them were perplexing, like a Santa in an England football supporter getup.

+ + + +
An inflatable Santa wearing a white vest with the English Flag across it.
I particularly enjoyed seeing football hooligan santa
+ + + +

I haven't published much on this site since last week when I wrote about a problem I was diagnosing in Python to do with weird datestamps.

+ + + +

I'm continuing to read the young Dredd books, I'm about 45% through Dredd: Year Three and I'm also continuing to make my way through The One Thing which I picked up last week. I can't resist a "productivity" self-help book. On my way back from London last week I found myself listening to A Very British Cult, a podcast that was recommended during another podcast I listen to regularly: Three Bean Salad. It follows the story of a number of people who joined what seemed like a life coaching/mentoring group and ended up giving them lots of money and isolating themselves from their families to spend more time with the group. It's highly worth a listen if you are interested in crime/cult documentaries. The thing I found particularly interesting is that a lot of the key people are based near me in Hampshire. I've also started a new playthrough of Cyberpunk 2077 with the 2.1 patch in place.

+ + + +

This week Mrs R and I had today off to recover from our trip up north and then later in the week I'm up in London again where I'll be joining a panel discussion as part of a seminar on the Business Impacts of AI taught by a friend who is an Assistant Professor at Warwick Business School.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/05/AI_s Electron.JS Moment_.md b/brainsteam/content/posts/2023/12/05/AI_s Electron.JS Moment_.md new file mode 100644 index 0000000..18e58b7 --- /dev/null +++ b/brainsteam/content/posts/2023/12/05/AI_s Electron.JS Moment_.md @@ -0,0 +1,95 @@ +--- +categories: +- AI and Machine Learning +- Philosophy and Thinking +date: '2023-12-05 11:59:17' +draft: false +tags: +- climate +- genai +- onprem +title: AI's Electron.JS Moment? +type: posts +--- + + +

In reply to Generating AI Images Uses as Much Energy as Charging Your Phone, Study Finds.

+

The study provides an analysis of ML model energy usage on a state-of-the-art NVIDIA chip:

+ + + +
+

We ran all of our experiments on a single NVIDIA A100-SXM4-80GB GPU

+
+
+ + + +

Looking these devices up, they have a power draw of 400W when they're running at full pelt. Your phone probably uses something like 30-40W when fast charging and your laptop probably uses 60-120W when it's charging up. Gaming-grade GPUs like the RTX4090 have a similar power draw to the A100 (450W). My Nvidia 4070 has a power draw of 200W.

+ + + +

We know that the big players are running data centres filled with racks and racks of A100s and similar chips and that is concerning. We should collectively be concerned with how much energy we're burning using these systems.

+ + + +

I'm a bit wary about the Gizmodo article's conclusion that all models - including Dall-e and Midjourney - should be tarred with the same brush, not because I'm naively optimistic that they're not burning the same amount of energy, but simply because they are an unknown quantity at this point. It's possible that they are doing something clever behind the scenes (see the quantization section below).

+ + + +

Industry Pivot Away From Task Appropriate Models

+ + + +

I think those of us in the AI/ML space had an intuition that custom-trained models would probably be cheaper and more efficient than generative models but this study provides some great empirical validation of that hunch:

+ + + +
+

...The difference is much more drastic if comparing BERT-based models for tasks such as text classification with the larger multi-purpose models: for instance bert-base-multilingual-uncased-sentiment emits just 0.32g of CO2 per 1,000 queries, compared to 2.66g for Flan-T5-XL and 4.67g for BLOOMz-7B...

...While we see the benefit of deploying generative zero-shot models given their ability to carry out multiple tasks, we do not see convincing evidence for the necessity of their deployment in contexts where tasks are well-defined, for instance web search and navigation, given these models’ energy requirements.

+pg 14, https://arxiv.org/pdf/2311.16863.pdf
+ + + +

Generative models that can "solve" problems out of the box may seem like an easy way to save many person-weeks of effort - defining and scoping an ML problem, building and refining datasets and so on. However, the cost to the environment (heck even the fiscal cost) of training and using these models is higher in the long term.

+ + + +

If we look at the recent history of the software industry to understand this current trend, we can see a similar sort of pattern in the switch away from platform-specific development frameworks like Qt or Java on Android towards the use of cross-platform frameworks like Electron.js and React Native. These frameworks generally produce more power-hungry, bloated apps but a much faster and cheaper development experience for companies who need to support apps across multiple systems. This is why your banking app takes up several hundred megabytes on your phone.

+ + + +

The key difference when applying this general "write once run everywhere" type approach to AI is that, once you scratch the surface of your problem space and realise that prompt engineering is more alchemy than wizardry and that the behaviour of these models is opaque and almost impossible to explain, it may make sense to start with a simple model anyway. If you have a well-defined classification problem you might find that a random forest model that can run on a potato computer will do the job for you.

+ + + +

Quantization and Optimisation

+ + + +

A topic that this study doesn't broach is model optimisation and quantization. For those unfamiliar with the term, quantization is a compression mechanism which allows us to shrink neural network models so that they can run on older/slower computers or run much more quickly and efficiently on state-of-the-art hardware. Quantization has been making big waves this year, starting with llama.cpp (which I built Turbopilot on top of).

+ + + +

Language models like Llama and Llama2 typically need several gigabytes of VRAM to run (hence the A100 with 80GB ram). However, quantized models can run in 8-12GiB RAM and will happily tick along on your gaming GPU or even a Macbook with an Apple M-series chip. For example, to run Llama2 without quantization you need 28GiB of RAM, whereas in 5-bit quantized mode you need 7.28GB. Not only does compressing the model mean it can run on smaller hardware, it also means that inference can be carried out in fewer compute cycles since we can do more calculations at once.

+ + + +
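As a rough back-of-the-envelope sketch (my own illustration, not taken from the study or the llama.cpp docs), you can estimate the memory footprint of a model's weights as parameter count times bits per weight; the gap between 16-bit and 5-bit weights is where most of the saving comes from:

```python
# Back-of-the-envelope estimate of weight memory: params * bits-per-weight / 8 bits-per-byte.
# Real runtimes need extra headroom for activations and the KV cache, so treat these as lower bounds.
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / (1024 ** 3)

llama2_13b = 13e9  # roughly 13 billion parameters
print(f"16-bit weights: {weight_memory_gib(llama2_13b, 16):.1f} GiB")  # ~24 GiB
print(f"5-bit weights:  {weight_memory_gib(llama2_13b, 5):.1f} GiB")   # ~7.6 GiB
```

The published figures in the paragraph above are a little higher than this raw calculation, presumably because they include runtime overhead on top of the weights themselves.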

Whilst I stand by the idea that we should use appropriate models for specific tasks, I'd love to see this same study done with quantized models. Furthermore, there's nothing stopping us applying quantization to pre-GPT models to make them even more efficient too as this repository attempts to do with BERT.

+ + + +

I haven't come across a stable runtime for quantized stable diffusion models yet but there are promising early signs that such an approach is possible for image generation models too.

+ + + +

However, I'd wager that companies like OpenAI are currently not under any real pressure (commercial or technical) to quantize their models when they can just throw racks of A100s at the problem and chew through gigawatt-hours in the process.

+ + + +

Conclusion

+ + + +

It seems pretty clear that transformer-based and diffusion-based ML models are energy intensive and difficult to deploy at scale. Whilst there are some use cases where it may make sense to deploy generative models, for well-defined problem spaces the advantages that these models bring may simply never manifest. In cases where a generative model does make sense, we should be using optimisation and quantization to make their usage as energy efficient as possible.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/20/NLP in the Post-LLM World.md b/brainsteam/content/posts/2023/12/20/NLP in the Post-LLM World.md new file mode 100644 index 0000000..e3a3010 --- /dev/null +++ b/brainsteam/content/posts/2023/12/20/NLP in the Post-LLM World.md @@ -0,0 +1,32 @@ +--- +categories: +- AI and Machine Learning +date: '2023-12-20 06:37:14' +draft: false +tags: +- nlp +title: NLP in the Post-LLM World +type: posts +--- + + +

I really enjoyed diving into Seb Ruder's latest NLP Newsletter which focuses on all the areas of NLP that are still in desperate need of attention in a post-LLM world.

+ + + +
+

In an era where running state-of-the-art models requires a garrison of expensive GPUs, what research is left for academics, PhD students, and newcomers to NLP without such deep pockets?

+ + + +

...while massive compute often achieves breakthrough results, its usage is often inefficient. Over time, improved hardware, new techniques, and novel insights provide opportunities for dramatic compute reduction...

+
+ + + +

I wrote about some of the same issues in my post NLP is more than just LLMs earlier this year and I recently speculated about how current industry AI darlings, sexy-scale-up companies very much in "growth" mode as opposed to incumbents in "cost-saving" mode, are just not incentivised to be compute-efficient.

+ + + +

If you are just starting out in this space there are plenty of opportunities and lots of problems to solve - particularly around trust, reliability and energy efficiency.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/21/20231221_082f036a.md b/brainsteam/content/posts/2023/12/21/20231221_082f036a.md new file mode 100644 index 0000000..c1dec83 --- /dev/null +++ b/brainsteam/content/posts/2023/12/21/20231221_082f036a.md @@ -0,0 +1,16 @@ +--- +categories: +- Personal +date: '2023-12-21 15:28:57' +draft: false +tags: +- read +title: null +type: posts +--- + + +

Reblog via James Ravenscroft +

+ +

Took me a few weeks but I finally chewed my way through it. Thoroughly enjoyable pulp sci-fi. It has been fun to read this whilst also playing Cyberpunk 2077 and think about how the two might cross over.

(comment on Judge Dredd Year Three (Judge Dredd: The Early Years))

\ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/21/20231221_224cf724.md b/brainsteam/content/posts/2023/12/21/20231221_224cf724.md new file mode 100644 index 0000000..c1dec83 --- /dev/null +++ b/brainsteam/content/posts/2023/12/21/20231221_224cf724.md @@ -0,0 +1,16 @@ +--- +categories: +- Personal +date: '2023-12-21 15:28:57' +draft: false +tags: +- read +title: null +type: posts +--- + + +

Reblog via James Ravenscroft +

+ +

Took me a few weeks but I finally chewed my way through it. Thoroughly enjoyable pulp sci-fi. It has been fun to read this whilst also playing Cyberpunk 2077 and think about how the two might cross over.

(comment on Judge Dredd Year Three (Judge Dredd: The Early Years))

\ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/21/20231221_23f8cf2d.md b/brainsteam/content/posts/2023/12/21/20231221_23f8cf2d.md new file mode 100644 index 0000000..c1dec83 --- /dev/null +++ b/brainsteam/content/posts/2023/12/21/20231221_23f8cf2d.md @@ -0,0 +1,16 @@ +--- +categories: +- Personal +date: '2023-12-21 15:28:57' +draft: false +tags: +- read +title: null +type: posts +--- + + +

Reblog via James Ravenscroft +

+ +

Took me a few weeks but I finally chewed my way through it. Thoroughly enjoyable pulp sci-fi. It has been fun to read this whilst also playing Cyberpunk 2077 and think about how the two might cross over.

(comment on Judge Dredd Year Three (Judge Dredd: The Early Years))

\ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/21/20231221_528343a7.md b/brainsteam/content/posts/2023/12/21/20231221_528343a7.md new file mode 100644 index 0000000..e305fdb --- /dev/null +++ b/brainsteam/content/posts/2023/12/21/20231221_528343a7.md @@ -0,0 +1,16 @@ +--- +categories: +- Personal +date: '2023-12-21 15:28:57' +draft: false +tags: +- read +title: null +type: post +--- + + +

Reblog via James Ravenscroft +

+ +

Took me a few weeks but I finally chewed my way through it. Thoroughly enjoyable pulp sci-fi. It has been fun to read this whilst also playing Cyberpunk 2077 and think about how the two might cross over.

(comment on Judge Dredd Year Three (Judge Dredd: The Early Years))

\ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/22/Annual Review 2023.md b/brainsteam/content/posts/2023/12/22/Annual Review 2023.md new file mode 100644 index 0000000..79beb5c --- /dev/null +++ b/brainsteam/content/posts/2023/12/22/Annual Review 2023.md @@ -0,0 +1,228 @@ +--- +categories: +- Personal +date: '2023-12-22 11:50:24' +draft: false +tags: +- annual-review +title: Annual Review 2023 +type: posts +--- + + +

It's that time of year when people reflect on how their year went. I'd say this year has been a really mixed bag as my life has changed post-PhD and my job has transformed significantly since we took on funding at work. There were lots of exciting achievements and events but a little stagnation on some fronts. I hope that next year I can focus a little more on some of those areas.

+ + + +

Becoming a Doctor...

+ + + +
a mug with the abstract from James' phd thesis printed on it
+

I'd say my biggest achievement this year was finally completing my PhD after 7 years of part time study after having my thesis corrections accepted in February. I began my PhD in Natural Language Processing with the University of Warwick in 2015 while I was still at IBM. During that time a lot has changed, both for me personally and in technology and machine learning. The last few years have been a huge challenge but I'm really happy that they've paid off.

+
+ + + +

Unfortunately I was sick during my graduation ceremony but I graduated in absentia and received my doctoral certificate in the mail. I had a lot of fun updating my title from Mr to Dr on as many official documents as I could and repeatedly making the joke: "is there a doctor in the house? Yes, but I can't help with your medical emergency."

+ + + +

I am excited to continue working on my academic projects as a sideline and will be staying in touch with my supervisors at the University of Warwick, Aberystwyth University and the Alan Turing Institute.

+ + + +

Public Appearances & Talks

+ + + +

I've done reasonably well for public speaking gigs this year. In April I was invited to give a talk about applied AI at Rare Earth Digital's Debrief event in Nantwich, Cheshire. This gave me an opportunity to put together a slide deck that incorporated my experience of building machine learning and AI solutions over the last decade and some cautionary tales about LLMs that I picked up from folks like Simon Willison.

+ + + +
James standing at the front of a function room filled with people sitting at round tables.
+ + + +

I then did something of a "tour", presenting similar material at a local Science Cafe, an event for KTP associates at the University of Essex and privately for my wife's team at her workplace.

+ + + +

Earlier in December I was invited by a member of my PhD cohort to join a panel discussion about AI in industry. The audience were made up of heavy hitters from a diverse set of companies and industries. I was pleasantly surprised at the level of discourse and the healthy cynicism that the audience had about the blind application of GPT and LLMs.

+ + + +
A webcam photo of panelists sat in a lecture theatre with James in the middle.
My first visit to The Shard in London was as part of a panel discussion on AI as part of a course run by Warwick Business School
+ + + +

In October, I also attended an awards ceremony, accepting the trophy for best emerging tech firm on behalf of my company.

+ + + +

+ + + +

Work Achievements

+ + + +

This year my company went from being pretty much fully bootstrapped to taking on some investment money. This was a new experience for pretty much all of the senior leadership and the experience has been very exciting and enlightening and at times quite uncomfortable. It was great as it meant that we could move offices, give out pay rises and recruit, but it's also been a big change of pace and focus for me with the engineering team almost doubling in size. In late September we had a leadership team retreat and we've been working on ways to improve the way that our product team works together to make things super sleek.

+ + + +

I've found that the hype around generative AI hasn't impacted what we do as much as you might expect. As I wrote earlier in the year, NLP is about a lot more than just LLMs and we need to provide reliable systems that don't hallucinate about stuff. There are still a huge number of unsolved problems in this space and much more efficient ways to solve them than throwing GPT-4 at it.

+ + + +

Projects and Open Source Contributions

+ + + +

By far the biggest success I had this year was Turbopilot which was hugely successful when I first launched it in April thanks to a tweet by Clem Delangue. It was really interesting to play with llama.cpp and learn about quantizing models, not just because of all the LLM stuff but because this technique can be applied to other types of models to make them smaller and more efficient too. I stopped developing Turbopilot in September when it was clear that other, better-staffed projects were doing a better job. I wrote about it at the time. Turbopilot was then featured in a keynote at Intel's 2023 Innovate conference after I spent a little time coordinating with a couple of their engineers in order to help get it running smoothly on Intel CPUs.

+ + + +

I've also made a couple of smaller contributions this year:

+ + + + + + + +

Misc Personal Stats

+ + + +

According to my bookwyrm stats, this year I read 11, nearly 12, books. That is a disappointingly low number for me. Last year I managed nearly double that. My favourite books this year have been the Dredd: The Early Years (non-affiliate link) series, which are properly pulpy/cheesy sci-fi, and William Gibson's Sprawl Trilogy (non-affiliate link) - definitely a bit of a cyberpunk theme going on. My favourite non-fiction book was Stolen Focus (non-affiliate link) by Johann Hari, which provided an interesting and well argued summary of the ongoing, multi-fronted assault that the modern world mounts upon our attention.

+ + + +

According to Spotify, I've listened to about 25k minutes of music this year which isn't bad going. I spent a bit of time test driving their AI DJ earlier in the year and that led me to discover some amazing new bands. Of particular note, The Last Dinner Party, who I can't stop telling people about. I tried out Jungle's new album; it's not my favourite but there are some catchy tracks on there. I recently updated the music recommendations page over on my digital garden.

+ + + +

Mrs R and I have retained our Cineworld memberships and been to see a few films this year. That said I still haven't gotten around to the Barbenheimer experience that was all the rage earlier in the year. I recently saw Wonka which I actually enjoyed quite a lot. It had a lot of actors from British sitcoms like Peep Show and Ghosts.

+ + + +

Habits and Fitness

+ + + +

I've been a fair bit better at journaling and meditating this year but I've definitely gained some weight, which is unfortunate. I was probably at peak fitness towards the beginning of 2021, having spent most of lockdown eating healthily and regularly going for walks. This year we've been eating less well and doing less exercise and I've got the "dad bod" to prove it. I've had a few conversations with other people about getting back on the fitness train next year, including some colleagues at work and Kev Quirk, who is trying to avoid being a fat boy at 40. I'm a few years away from 40 but I am going to try and do a similar "fat boy at 35" challenge.

+ + + +

I have made a few weeknotes this year but felt a little self-conscious about publishing something every week, particularly during weeks where it's felt like not much has happened. I might re-jig the format next year and switch to private weekly reviews and monthly blog reviews more like Jan-Lukas.

+ + + +

Both Mrs R and I have found the pace of this year pretty challenging and it's had a knock on effect on our personal projects around the house and our general health and fitness. Hopefully next year these are some areas that we can work on together.

+ + + +

Travel and Family

+ + + +

I've managed to do a fair bit of travelling around this year and seen a lot more of my family than since pre-covid. In May we took a week-long trip up to the Lake District around the time of my birthday. At the same time, we popped in to visit my parents in the midlands and my dad, who has been taking flying lessons, took us up in a small aircraft which was really exciting.

+ + + + + + + +

While we were up in the Lake district we got to visit Kendal Museum where my late grandfather was the custodian in the 1990s. It was great to be back there as I have many treasured memories of Kendal and the surrounding lakes. We couldn't pass up an opportunity to recreate an old photo.

+ + + + + + + +

In June we went on a cruise around the Mediterranean with my dad and his wife to celebrate his 60th birthday. We stopped off at Pisa, Corsica, Marseilles, Palamos and a few others. We briefly visited St Tropez and I was a bit taken aback by the ostentatious and inauthentic display of wealth in the town.

+ + + +

Then, in September, we also did a beach and chill holiday in Cyprus where we spent a lot of time reading, swimming and drinking cocktails. We spent a lot of time making friends with the local stray cats of Cyprus who we wanted to smuggle home with us.

+ + + + + + + +

I have found that I've been ill quite a lot this year. From about mid-February until the end of April I had coughs, colds and COVID almost every other week. This has prevented us from visiting family and friends quite as often as I'd have liked this year. In particular, I would have liked to have visited some of my old school friends, one of whom is expecting their first child early in the year. We also had to cancel some concerts and shows due to illness and dangerous weather. Although it's not necessarily my fault that I got ill or that there was a raging storm, next year I hope to rectify this where possible by visiting as many of my friends as possible. I'm hoping that our fitness goals will also help with the whole "being sick" thing.

+ + + +

Next Year

+ + + +

Rather than making a separate new year post, let me make some predictions and set some goals here.

+ + + +

I'd summarise by saying that getting fitter and looking after myself a little bit more are the main things that I'm aiming for in 2024.

+ + + +

I'm also keen to improve on my existing journaling and planning habits. I recently joined the Ness Labs community and I'm keen to try and take part in some of the workshops and meetups around these topics.

+ + + +

I've enjoyed all of my public appearances and speaking in 2023 and I hope to continue with these sorts of activities next year. I'd like to get more engaged with the BCS (of which I am a member but have yet to go to any events) and join more meetups and events locally.

+ + + +

I think it'd be nice to make more progress on jobs around the house next year too, we've slacked off somewhat in 2023 and we have a few maintenance and decorating type jobs that we need to tick off the list.

+ + + +

I've published a fair few posts and articles on this site in 2023. I'd like to keep this up in the new year, possibly changing from weekly to monthly reviews as mentioned above and also writing more about stuff I'm working on.

+ + + +

Happy end of the year everyone, hope to speak to you all in 2024!

+ + + +

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/26/Royal Institute Christmas Lecture on AI 2023.md b/brainsteam/content/posts/2023/12/26/Royal Institute Christmas Lecture on AI 2023.md new file mode 100644 index 0000000..fda1dd7 --- /dev/null +++ b/brainsteam/content/posts/2023/12/26/Royal Institute Christmas Lecture on AI 2023.md @@ -0,0 +1,19 @@ +--- +categories: +- AI and Machine Learning +date: '2023-12-26 18:32:46' +draft: false +tags: +- AI +- ml +title: Royal Institute Christmas Lecture on AI 2023 +type: posts +--- + + +

I'm excited and very proud to see my colleague and mentee Safa appear on this year's Royal Institution Christmas Lecture. In the video, Safa is assisting Prof. Mike Wooldridge (who has some excellent and sensible takes on AI by the way) with some experiments.

+ + + +

In general this is a well put together lecture that gives a lay summary of some of the techniques and acronyms that you're likely to come across in conversation about AI. I'd recommend it to anyone interested in learning more about what these systems do or to any relatives who might have brought up the subject of AI at the Christmas dinner table.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/27/Account Portability in the Social Web.md b/brainsteam/content/posts/2023/12/27/Account Portability in the Social Web.md new file mode 100644 index 0000000..124357e --- /dev/null +++ b/brainsteam/content/posts/2023/12/27/Account Portability in the Social Web.md @@ -0,0 +1,49 @@ +--- +categories: +- Software Development +date: '2023-12-27 21:16:47' +draft: false +tags: +- fediverse +- indieweb +title: Account Portability in the Social Web +type: posts +--- + + +

Reposted Account portability in the social web — underlap by underlap.

+

...In an interoperable world, users want to be able to move between providers with minimal friction to avoid “lock-in”...

+ + + +

...ActivityPub is the protocol that underpins the Fediverse (Mastodon, Pixelfed, PeerTube, etc.). It does not provide direct support for account portability... It will be fascinating to see to what extent, if at all, interoperation with the Fediverse by Meta's Threads will include account portability...

+ + + +

...Efforts to build account portability on top of ActivityPub fall into three categories: abstracting identity, improving account migration, and migrating content...

+
+ + + +

This is a really great article that discusses some of the things that are missing from ActivityPub in order to make the fediverse experience truly portable and some ongoing works in progress. A couple of observations that made me think:

+ + + +

No one besides Bluesky has implemented the AT Protocol yet. Given that it's an open protocol, I'm surprised that nobody's at least tinkering with something in this vein.

+ + + +

+ + + +

Secondly, I really like the idea of DNS-based identity for identity provenance. That said, the majority of web users don't own their own domain so this might not scale well at all.

+ + + +

Finally, I noticed that @hcj has commented on the post to say that when moving to FireFish you can also take your posts with you - something that Mastodon doesn't currently support.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/28/Can you add AI to the hydraulics system... Seriously_.md b/brainsteam/content/posts/2023/12/28/Can you add AI to the hydraulics system... Seriously_.md new file mode 100644 index 0000000..eb6ac4c --- /dev/null +++ b/brainsteam/content/posts/2023/12/28/Can you add AI to the hydraulics system... Seriously_.md @@ -0,0 +1,47 @@ +--- +categories: +- AI and Machine Learning +date: '2023-12-28 16:42:34' +draft: false +tags: +- AI +- humour +title: Can you add AI to the hydraulics system... Seriously? +type: posts +--- + + +

Bookmarked Diane Duane on Tumblr.

+
+

"Can you add AI to the hydraulics system?"

+ + + +

can i fucking what mate "Sir, I'm sorry, I'm a little confused - what do you mean by adding AI to the hydraulics?"

+ + + +

"I just thought this stuff could run smoother if you added AI to it. Most things do"

+ + + +

The part of the car that moves when you push the acceleration pedal is metal and liquid my dude what are you talking about "You want me to .add AI...to the pistons? To the master cylinder?"

+ + + +

"Yeah exactly, if you add AI to the bit that makes the pistons work, it should work better, right?"

+ + + +

IT'S METAL PIPES it's metal pipes it's metal pipes "Sir, there isn't any software in that part of the car"

+
+
+ + + +

This is equal parts hilarious, horrifying, excruciating and anger inducing to me. It takes me right back to my IBM days where "can't we just use Watson to do this?" was the eternal cry of the client.

+ + + +

ML and AI have come a long way but people need to understand the domain that they are working in well enough to determine whether an AI system will help them or not. Frankly if the user doesn't understand the problem then how are they going to train a model?

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2023/12/29/Serving Django inside Docker the Right Way.md b/brainsteam/content/posts/2023/12/29/Serving Django inside Docker the Right Way.md new file mode 100644 index 0000000..533e5e9 --- /dev/null +++ b/brainsteam/content/posts/2023/12/29/Serving Django inside Docker the Right Way.md @@ -0,0 +1,378 @@ +--- +categories: +- Software Development +date: '2023-12-29 14:22:27' +draft: false +tags: +- django +- docker +- python +title: Serving Django inside Docker the Right Way +type: posts +--- + +I've seen a number of tutorials that incorrectly configure django to run inside docker containers by leveraging it's built in dev server. In this post I explore the benefits of using django with gunicorn and nginx and how to set this up using Docker and docker-compose. + + + + +

I'm working on a couple of side projects that use django and I'm a big fan of docker for simplifying deployment.

+ + + +

Django is a bit of a strange beast in that it has a simple, single-threaded development server that you can start with python manage.py runserver, but it can also be run in prod mode using WSGI. Once in this mode, it no longer serves static files. This can be a little off-putting for people who are used to packaging a single server that does everything (like a nodejs app), and it is especially confusing to people used to packaging an app along with everything it needs inside docker.

+ + + +

Part 1: Why not just use runserver?

+ + + +

If you already understand why it's better to use WSGI than runserver and just want to see the working config, skip down to Part 2 below.

+ + + +

I've seen a few tutorials for packaging up django apps inside docker containers that just use the runserver mechanism. The problem with this is that you don't get any of the performance benefits of using a proper WSGI runner, so in order to handle server load you very quickly end up needing to run multiple copies of the docker container.

+ + + +

A Rudimentary Performance Test

+ + + +

I did a quick performance test against the python runserver versus my WSGI + Nginx configuration (below) to illustrate the difference on my desktop machine. I used bombardier and asked it to make as many requests as it can for 10s with up to 200 concurrent connections:

+ + + +

bombardier -c 200 -d 10s http://localhost:8000

+ + + +

The thing at that address is the index view of my django app so we're interested in how quickly we can get the Python interpreter to run and return a response.

+ + + +

The python runserver results:

+ + + +
Statistics        Avg      Stdev        Max
+  Reqs/sec      1487.50     633.11    2988.81
+  Latency      133.84ms   259.97ms      7.12s
+  HTTP codes:
+    1xx - 0, 2xx - 15042, 3xx - 0, 4xx - 0, 5xx - 0
+    others - 0
+  Throughput:    13.70MB/s
+ + + +

And the WSGI config results:

+ + + +
Statistics        Avg      Stdev        Max
+  Reqs/sec      1754.20     666.40   16224.55
+  Latency      115.05ms     7.23ms   174.44ms
+  HTTP codes:
+    1xx - 0, 2xx - 17472, 3xx - 0, 4xx - 0, 5xx - 0
+    others - 0
+  Throughput:    15.95MB/s
+ + + +

As you can see, using a proper deployment configuration, the average number of requests handled per second goes up by about 15%, but we also get much more consistent latency (115ms on average with a deviation of about 7ms, as opposed to the first example where latency is all over the place and, if you're really unlucky, you're the person waiting 7s for the index page to load).

+ + + +
Testing Static File Service
+ + + +

Now let's look at handling files. When we use runserver we are relying on the python script to serve up the files we care about. I ask bombardier to request the logo of my app as many times as it can for 10 seconds like before:

+ + + +

bombardier -c 200 -d 10s http://localhost:8000/static/images/logo.png

+ + + +

First we run this with django runserver:

+ + + +
Statistics        Avg      Stdev        Max
+  Reqs/sec       731.51     252.55    1795.93
+  Latency      270.50ms   338.53ms      5.01s
+  HTTP codes:
+    1xx - 0, 2xx - 7504, 3xx - 0, 4xx - 0, 5xx - 0
+    others - 0
+  Throughput:   255.27MB/s
+ + + +

And again with Nginx and WSGI.

+ + + +
Statistics        Avg      Stdev        Max
+  Reqs/sec      6612.33     705.07    9332.41
+  Latency       30.27ms    19.95ms      1.30s
+  HTTP codes:
+    1xx - 0, 2xx - 66156, 3xx - 0, 4xx - 0, 5xx - 0
+    others - 0
+  Throughput:     2.25GB/s
+ + + +

And suddenly the counter-intuitive reason for Django splitting static file service from code execution makes a little bit more sense. Since we are just requesting static files, Python never actually gets called. Nginx, which is an efficient server that is written in battle-hardened C, is able to just directly serve up the static files.

+ + + +

In the first example, Python is the bottleneck and using Nginx + WSGI just makes some of the shifting around of information a little bit smoother. In the second example, we can completely sidestep python.

+ + + +

If you still need convincing...

+ + + +

The devs literally tell you not to use runserver in prod in the official docs:

+ + + +
+

DO NOT USE THIS SERVER IN A PRODUCTION SETTING. It has not gone through security audits or performance tests. (And that’s how it’s gonna stay. We’re in the business of making web frameworks, not web servers, so improving this server to be able to handle a production environment is outside the scope of Django.)

+django-admin and manage.py - Django documentation
+ + + +
+ + + +

Part 2: Packaging Django + WSGI in Docker

+ + + +

Getting Started

+ + + +

Ok so I'm going to assume that you have a django project that you want to deploy and it has a requirements.txt file containing the dependencies that you have installed. If you are using a python package manager, I'll drop some hints but you'll have to infer what is needed in a couple of places.

+ + + +

Install and Configure Gunicorn

+ + + +

Firstly, we need to add a WSGI server component that we can run inside the docker container. I will use gunicorn.

+ + + +

pip install gunicorn (or you know, pdm add/poetry add etc)

+ + + +

We can test that it's installed and working by running:

+ + + +

gunicorn -b :8000 appname.wsgi

+ + + +

If you go to localhost:8000 you should see your app there but, wait a minute, there are no images or css or js. As I mentioned, django won't serve your static resources so we'll pair gunicorn up with nginx in order to do that.

+ + + +
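As a side note, the command above starts gunicorn with a single worker process, which is its default. In production you would typically start several workers so that requests can be handled on multiple CPU cores; the usual rule of thumb from the gunicorn docs is roughly twice the number of cores plus one. For example:

```bash
# start gunicorn with 4 worker processes bound to port 8000
gunicorn -b :8000 --workers 4 appname.wsgi
```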

Collect Static Resources

+ + + +

nginx needs a folder that it can serve static files from. Thankfully django's manage.py has a command to do this so we can simply run:

+ + + +

python manage.py collectstatic --noinput

+ + + +

The --noinput argument prevents the script from asking you questions in the terminal and it will simply dump the files into a static folder in the current directory.

+ + + +

Try running the command in your django project to see how it works. We'll be using this in the next step.

+ + + +

Build a Dockerfile for the app

+ + + +

We can produce a docker file that builds our django app and packages any necessary files along with it.

+ + + +
FROM python:3
+WORKDIR /app
+ADD . /app
+
+RUN python3 -m pip install -r requirements.txt
+# nb if you are using poetry or pdm you might need to do something like:
+# RUN python3 -m pip install pdm
+# RUN pdm install
+
+ENV STATIC_ROOT /static
+CMD ["/app/entrypoint.sh"]
+ + + +

NB: if you are using pdm or poetry or similar, you will want to install them

+ + + +

We also need to create the entrypoint.sh file which docker will run when the container starts up. Save this file in the root of your project so that it can be picked up by Docker when it builds:

+ + + +
#!/usr/bin/env bash
+pdm run manage.py collectstatic --noinput
+pdm run manage.py migrate --noinput
+pdm run gunicorn -b :8000 appname.wsgi
+ + + +

This script runs the collectstatic command which, with a little bit of docker magic, we will hook up to our nginx instance later. Then we run any necessary database migrations and finally we use gunicorn to start the web app. Note that the pdm run prefix assumes you installed your dependencies with pdm; if you used plain pip you can call python manage.py and gunicorn directly.

+ + + +
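One assumption worth making explicit: the ENV STATIC_ROOT /static line in the Dockerfile only does anything if your settings.py actually reads that environment variable - Django won't pick it up on its own. A minimal sketch of what that might look like (adjust to your own project layout):

```python
# settings.py (sketch) - tell collectstatic where to dump files
import os
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

STATIC_URL = "/static/"
# collectstatic writes here; falls back to a "static" folder next to the project if the env var is unset
STATIC_ROOT = os.environ.get("STATIC_ROOT", BASE_DIR / "static")
```

Whatever directory STATIC_ROOT ends up pointing at needs to be the same directory you later mount out of the container for nginx to serve.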

Build the nginx.conf

+ + + +

We need to configure nginx to serve static files when someone asks for /static/something and forward any other requests to the django app. Create a file called nginx.conf and copy the following:

+ + + +
events {
+    worker_connections  1024;  # Adjust this to your needs
+}
+
+http {
+    include       mime.types;
+    default_type  application/octet-stream;
+    sendfile        on;
+    keepalive_timeout  65;
+
+    # Server block
+    server {
+        listen       80;
+        server_name  localhost;
+
+        # Static file serving
+        location /static/ {
+            alias /static/;
+            expires 30d;
+        }
+
+        # Proxy pass to WSGI server
+        location / {
+            proxy_pass http://frontend:8000;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+        }
+    }
+}
+ + + +

This configuration should be relatively self-explanatory but a couple of notes:

+ + + + + + + +

Glue it together with docker compose

+ + + +

We will use docker compose to combine our nginx and django containers together.

+ + + +
services:
+  frontend:
+    build: .
+    restart: unless-stopped
+    volumes:
+      - ./static:/app/static
+    environment:
+      DJANGO_CSRF_TRUSTED_ORIGINS: 'http://localhost:8000'
+
+  frontend-proxy:
+    image: nginx:latest
+    ports:
+      - "8000:80"
+    volumes:
+      - ./nginx.conf:/etc/nginx/nginx.conf:ro
+      - ./static:/static:ro
+    depends_on:
+      - frontend
+ + + +

OK, so what we are doing here is using volume mounts to connect /app/static inside the django container (where the results of collectstatic are dumped) to /static/ in our nginx container (where the static files are served from).

+ + + +

We also mount the nginx.conf file in the nginx container. You'll probably end up using docker compose to add database connections too or perhaps a volume mount for a sqlite database file.

+ + + +

Finally we bind port 8000 on the host machine to port 80 in nginx so that when we go to http://localhost:8000 we can see the running app.

+ + + +
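Similarly, the DJANGO_CSRF_TRUSTED_ORIGINS variable set in the compose file is only useful if your settings.py reads it; Django has no built-in handling for that variable name. A sketch of how it might be consumed:

```python
# settings.py (sketch) - read trusted origins from the environment, comma-separated
import os

CSRF_TRUSTED_ORIGINS = [
    origin.strip()
    for origin in os.environ.get("DJANGO_CSRF_TRUSTED_ORIGINS", "").split(",")
    if origin.strip()
]
```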

Running it

+ + + +

Now we need to build and run the solution. You can do this by running:

+ + + +
docker compose build
+docker compose up -d
+
+ + + +

Now we can test it out by going to http://localhost:8000. Hopefully you will see your app running in all its glory. We can debug it by using docker compose logs -f if we need to.

+ + + +

Conclusion

+ + + +

Hopefully this post has shown you why it is important to set up Django properly rather than relying on runserver and how to do that using Docker, Nginx and Gunicorn. As you can see, it is a little bit more involved than your average npm application install but it isn't too complicated.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/03/Migrating Users Across Servers With RSync.md b/brainsteam/content/posts/2024/01/03/Migrating Users Across Servers With RSync.md new file mode 100644 index 0000000..06af882 --- /dev/null +++ b/brainsteam/content/posts/2024/01/03/Migrating Users Across Servers With RSync.md @@ -0,0 +1,189 @@ +--- +categories: +- Software Development +date: '2024-01-03 14:33:37' +draft: false +tags: +- linux +- rsync +title: Migrating Users Across Servers With RSync +type: posts +--- + + +

I recently needed to migrate some user data from one Ubuntu server to another. It was not possible for me to clone the full disk so I opted to copy the user data and re-create the user accounts on the other machine.

+ + + +

I used rsync to copy all the user data and preserve all permissions on the files. I needed sudo access on both sides.

+ + + +

In this article I refer to the new machine as the target onto which we want to copy our data and the old machine as the source of the data we want to copy.

+ + + +

Preparing The Data

+ + + +

Firstly, check what data you want to clone. I made liberal use of du -sh /home/* to see how much space each of the affected user directories was taking up and worked with the users to tidy up their home directories where necessary (lots of junk in hidden places like .local and .cache). A couple of the users had large projects that they were able to purge before we did the copy, so I was able to significantly reduce the amount of data I needed to transfer.

+ + + +

Create the Users

+ + + +

For each of the users on the old machine, I created a new account on the new machine using sudo useradd -m. If there are any special groups like sudo or docker, you can add them at this point, e.g. sudo useradd -m james -G sudo,docker

+ + + +

The -m flag creates the user home directory so if you do an ls /home you should see one directory per user in there.

+ + + +
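If you have more than a handful of accounts to recreate, a quick shell loop on the new machine saves some typing (a sketch; the usernames here are placeholders for your own list):

```bash
# recreate an account and home directory for each user being migrated
for u in alice bob james; do
  sudo useradd -m "$u"
done
```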

Set Up Passwordless Sudo rsync

+ + + +

In order to have permission to copy users' data we need to be able to operate as root on both the new machine and the old machine. We will run the sync command from the new machine with sudo and can enter the password locally, but that command will then SSH to the old machine and attempt to sudo there, which will likely fail unless we do this next step.

+ + + +

We don't want to give blanket permission for the user to sudo without password auth - that's a pretty big security risk and also an accident waiting to happen (sudo rm -rf / anyone?). Instead, we will edit the /etc/sudoers file and add special permission for our current user (let's say james) to run the rsync command without asking for a password.

+ + + +

We add a couple of new lines to the bottom of the file like so:

+ + + +
# User privilege specification
+root    ALL=(ALL:ALL) ALL
+
+# Members of the admin group may gain root privileges
+%admin ALL=(ALL) ALL
+
+# Allow members of group sudo to execute any command
+%sudo   ALL=(ALL:ALL) ALL
+
+# Custom user privileges
+james   ALL=(ALL) NOPASSWD: /usr/bin/rsync
+
+ + + +

The key thing here is the NOPASSWD: directive, after which we put the path to the rsync binary. We should also remove this again once the sync is successful.

+ + + +
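Before moving on, you can sanity-check the rule on the old machine with sudo -l, which lists the commands a user is allowed to run via sudo (assuming the user is james, as above):

```bash
# run on the old machine; should list the NOPASSWD rsync rule for james
sudo -l -U james | grep rsync
# expected output includes something like:
#   (ALL) NOPASSWD: /usr/bin/rsync
```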

Set up and test the connection

+ + + +

From the new machine we can test our ability to rsync with the old machine. We can use the --dry-run flag to avoid actually copying any data.

+ + + +

+ + + +

The test command might look something like this:

+ + + +
sudo rsync -Porgzl --dry-run --rsync-path="sudo /usr/bin/rsync" james@source.domain.com:/home/james /home
+ + + +

You will want to run rsync via sudo on the new machine AND the old machine (via the config above) to ensure that we have permission to read and write other users' data.

+ + + +

--dry-run prevents rsync from actually copying any data, it just makes a list of files it would copy if it was run without this option.

+ + + +

--rsync-path - this is the command that the old server will use when it is looking for files to grab and send to the new machine. We prepended the command with sudo and because the user that we're sshing as has the NOPASSWD configured for /usr/bin/rsync it should allow the user to run this without any issue.

+ + + +

-P displays progress of the transfer and also enables partial transfer (if the command is interrupted at any point it will resume any partly-transferred files)

+ + + +

-o and -g preserve ownership of the files: -o for the owning user and -g for the group. This ensures that the old user and group are correctly set in the new location, which is why it is important that we created the affected users and groups before we initiated the transfer.

+ + + +

-r for recursive - copy folders and their contents recursively; without this, rsync will just stop without copying anything

+ + + +

-z for compression - this compresses data on the old machine before it is sent and decompresses it when it is received on the new machine. This saves bandwidth and these days with fancy, powerful CPUs, the chances are that you'll be able to compress/decompress the data faster than it can be transferred over the net so this is likely to be helpful.

+ + + +

-l allows rsync to copy symlinks - this may be important if you are copying things like conda environments (since library structures often contain multiple links to each other libexample.so.1.2.3 -> libexample.so.1 -> libexample.so).

+ + + +

If all goes well you should see a list of files flash up the screen - this is the list of files that would be copied from the old machine to the new machine if --dry-run wasn't enabled.

+ + + +

If you get any permission errors, double check that you have the right permissions set up on the old machine, make sure that the user you are SSHing as is the same as the user in the sudoers file.

+ + + +

Run the Sync

+ + + +

If your dry run succeeded, you can now execute the full copy and transfer. I'm going to add a couple of additional exclusions with the --exclude operator. We can use wildcards to apply these exclusions to all of our users:

+ + + +
sudo rsync \
+  -Porgzl --rsync-path="sudo /usr/bin/rsync" \
+  --exclude "*/.local/lib" \
+  --exclude "*/.cache" \
+  --exclude '*/.vscode-server' \
+  --exclude "*/miniconda/pkgs" \
+  james@source.domain.com:/home/* \
+  /home
+ + + +

The command is pretty much the same as the previous one with the --dry-run flag turned off, with an --exclude for each of the directories we don't care about and with a wildcard in the home directory so that we copy all users rather than just james.

+ + + +

I highly recommend executing long-running commands like this inside tmux so that if your connection from your workstation to the new machine goes down, the process continues.

+ + + +

If your connection does get interrupted and you need to restart, you can run this command with --ignore-existing to have rsync skip any files that were already copied during the failed run.

+ + + +

+ + + +

Finally: Undo the Sudoers Change

+ + + +

Remove the line that you added to the sudoers file that allowed you to run rsync without a password on the old server!

+ + + +

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/10/Brand Loyalty.md b/brainsteam/content/posts/2024/01/10/Brand Loyalty.md new file mode 100644 index 0000000..0a0a65f --- /dev/null +++ b/brainsteam/content/posts/2024/01/10/Brand Loyalty.md @@ -0,0 +1,123 @@ +--- +categories: +- Philosophy and Thinking +date: '2024-01-10 21:33:53' +draft: false +tags: +- humour +- philosophy +title: Brand Loyalty +type: posts +--- + + +

I'm not really saying anything revolutionary or novel in this post, it's just a bit of a stream of consciousness rant about how companies behave when they think we're not watching them....

+ + + +

When I was growing up I remember my parents and grandparents frequently talking about companies that they trusted and telling me the good makes of things:

+ + + +

"German cars are well engineered... you should drive a Volkswagen"

+ + + +

"Always fly British Airways"

+ + + +

"You should shop at Morrisons"

+ + + +

"Your Uncle's just bought a Dyson hoover, they're the best you know"

+ + + +

Setting Myself Up For Disappointment

+ + + +

Throughout my tweens, teens and early 20s I remember following this sort of advice and after years of encouragement, forming my own strong opinions about 'good' brands. I also remember falling, hook line and sinker, for a lot of the late 00s silicon valley companies who promised to change the world.

+ + + +

I can also recall the moments when almost all of the brands that I was loyal to betrayed my trust, either directly through a negative experience or indirectly through a scandal of some description.

+ + + +

Like when I learned about the VW emissions scandal... and also that time I bought a lemon off them...

+ + + +

...or when it turned out that the inventor of my vacuum cleaner expels more hot air than his trademark hand dryers...

+ + + +

...Or when it turned out that my silicon valley 'heroes' (in hindsight 🤮) were spying on us for years... and that the guy who sold the cool electric cars was a loser who publicly disparaged actual heroes and hung out with nazis...

+ + + +

Or in the recent 'inflation' years where almost every single branded food and drink company and supermarket shrinkflated the hell out of their entire range, raised their prices and then posted record profits...

+ + + +

...or when a particular aerospace company that I had grown up wanting to work for released a plane that killed a bunch of people and then nearly killed a bunch more people...

+ + + +

A Legacy of a By-Gone Time

+ + + +

Brand loyalty probably made sense in an earlier incarnation of capitalism. Your grandma might have known the owner of the local garage for years and they might have done her a solid a couple of times by not charging her for adding wiper fluid that one time or coming out to fix her car on a public holiday. There genuinely was a time when the things people needed were built with pride to a high quality and you'd buy something for life and maintain it. If Grandma had a pan that lasted 10 years and the company that made it for her were still going after all that time, chances are they were still making good pans.

+ + + +

The problem is, if I only buy one high quality saucepan every 20 years, saucepan companies can't make loads of money out of me. This all changed in the 1920s when psychologists like Edward Bernays convinced us en-masse via the power of radio and print media that we needed to consume and buy and keep up with the Jones'. This led to planned obsolescence. If people are going to buy something fashionable and chuck the old clothes away every season, it stands to reason that the clothes only need to last one season and therefore the manufacturer could make more money if they make the clothes out of cheaper materials.

+ + + +

Brand Loyalty in the 2020s

+ + + +

Now we've had about 100 years to get used to the idea of "consuming" rather than buying for life and companies are in a race to the bottom to make stuff as cheaply and nastily as possible in order to sell us things we don't need that won't last very long anyway. Companies change very quickly and they are forever optimising (i.e. enshittifying) their production processes, testing the waters to see what they can get away with not doing before customers realise like frogs slowly boiling in a pan of water.

+ + + +

Companies have also learned that talk is cheap. They will say whatever they think they should to get you to trust them and spend money with them even if they don't actually mean it. That's why we have things like greenwashing.

+ + + +

In our current incarnation of capitalism, brand loyalty makes zero sense. You might have a rose-tinted memory of a time you interacted with a company 10 years ago, but in that time they might have been through 3 different CEOs, opted for plastic components instead of metal ones or replaced their experienced senior engineers with graduates who got 6 months of training and just use ChatGPT to tell them what to do.

+ + + +
I don't mean to be the 'gotcha' guy and I'm not judging people for doing what they need to do to get by. Surviving is hard work... (Comic by Matt Bors)
+ + + +

Loyalty is for People not Companies

+ + + +

In 2023 it's pretty hard to avoid interacting with disingenuous rip-off merchants... There are sites that can help you decide whether a company is ethical and you can rely on your own moral compass. I'm not the "gotcha" guy from the comic strip above. Ethical boycotting is a privilege that not everyone can afford (fiscally or emotionally). Do what you've got to do to survive.

+ + + +

Ultimately I think loyalty should be reserved for people rather than big faceless corporations. When it comes to buying your next car, booking your next holiday, feeding your family next week or dressing yourself, be sceptical and clinical. Shop around, ignore adverts, posturing and green-washing. Remember talk is cheap. Don't give companies your loyalty. Abandon them as quickly and ruthlessly as they would you. If you have the energy, kick up a fuss. Write to them, email them, phone them, tell them why you're abandoning them (always be nice to the individuals handling your messages).

+ + + +

Gift your loyalty to friends, family, neighbours and random strangers and remember those who help you out too. The brand loyalty of old was a lot closer to this model - Grandma probably trusted the manager at the local garage rather than the garage company.

+ + + +

If you are a businessy person, build quality stuff, don't chase infinite growth and money at the expense of your customers and set up your business in a way that makes it enshittification resistant after you move on.

+ + + +

Finally, don't forget to also be loyal to yourself!

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/10/TIL_ Accessing Google Storage Buckets with gcloud sdk on an M2 Pro_MacOS 14.2.1.md b/brainsteam/content/posts/2024/01/10/TIL_ Accessing Google Storage Buckets with gcloud sdk on an M2 Pro_MacOS 14.2.1.md new file mode 100644 index 0000000..d1df32e --- /dev/null +++ b/brainsteam/content/posts/2024/01/10/TIL_ Accessing Google Storage Buckets with gcloud sdk on an M2 Pro_MacOS 14.2.1.md @@ -0,0 +1,73 @@ +--- +categories: +- Software Development +date: '2024-01-10 16:11:18' +draft: false +tags: +- python +- til +title: 'TIL: Accessing Google Storage Buckets with gcloud sdk on an M2 Pro/MacOS 14.2.1' +type: posts +--- + + +

I recently ran into a strange issue where I was struggling to get the GCloud SDK to work properly on Mac 14.2.1.

+ + + +

I used brew to install the package brew install google-cloud-sdk and added it to my zshrc file.

+ + + +

After I logged in with gcloud auth login I wanted to copy some files out of a GCP bucket to my local machine with the storage cp command:

+ + + +

gcloud storage cp gs://bucketname/filename .

+ + + +

The command looked fine and detected the right number of files to copy before crashing out with this rather odd looking error message:

+ + + +
gcloud crashed (AttributeError): 'Lock' object has no attribute 'is_fork_ctx'
+ + + +

A quick search surfaced this ticket in Google's issue tracker showing exactly the same problem. It turns out that gcloud is a Python package and it doesn't play nicely with the version of Python that this version of MacOS has installed.

+ + + +

My workaround was to use Anaconda (which I have installed anyway as a data science bod) to create a new Python environment at a version that gcloud likes, and then to install gcloud from conda:

+ + + +
conda create -n cloudtest python=3.10
+conda activate cloudtest
+# nb: google-cloud-sdk is packaged on conda-forge, so you may need to add "-c conda-forge" if it isn't found in your default channels
+conda install google-cloud-sdk
+ + + +

Once I had this set up I found that the command suddenly burst into life.

+ + + +

A Simpler Alternative?

+ + + +

If you don't want to use Anaconda, Google's installer offers to install a compatible version of Python for you. In fact, using brew made life harder for me - I could have just followed the Google experience.

+ + + +
+ + + +

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/20/Gitea_Forgejo Actions and PostgreSQL Tests.md b/brainsteam/content/posts/2024/01/20/Gitea_Forgejo Actions and PostgreSQL Tests.md new file mode 100644 index 0000000..bf28f4c --- /dev/null +++ b/brainsteam/content/posts/2024/01/20/Gitea_Forgejo Actions and PostgreSQL Tests.md @@ -0,0 +1,106 @@ +--- +categories: +- Software Development +date: '2024-01-20 14:05:07' +draft: false +tags: +- ci +- django +- docker +- gitea +title: Gitea/Forgejo Actions and PostgreSQL Tests +type: posts +--- + + +

I am building a recipe app in django and I want to be able to test the app within the CI pipelines when I push to my self-hosted gitea repository.

+ + + +

The Problem

+ + + +

Github actions already has a postgresql action but this crashes when you run Gitea's Action runner via docker:

+ + + +
initdb: error: cannot be run as root
+Please log in (using, e.g., "su") as the (unprivileged) user that will own the server process.
+ + + +

This is because, by default, docker containers don't really enforce users and permissions in the same way that software installed directly in the operating system usually does, and the action tries to run a load of stuff as the root user that should probably be run under the postgres system user account.

+ + + +

Finding Another Approach

+ + + +

In gitea we can specify the underlying container that we want the action to run in (which is different to github, where actions typically run in full VMs rather than containers). Although the vanilla gitea container is reasonably good, I replaced it with catthehacker/ubuntu:act-latest so that I could get the setup-pdm action to work. This container effectively contains the basic set of stuff you'd find in a default ubuntu install, including apt for managing packages. Ubuntu ships with a PostgreSQL server package, so I went ahead and installed it, which also initialises the relevant data files and service user accounts inside the container.

+ + + +

Then, I started the service with service postgresql start. Normally this would be a weird thing to do inside a docker container since best practice is typically to have a separate container per service. However, for testing purposes it's probably ok.

+ + + +

The next step is to set a password for the postgres user, which the django app uses to log in and run the tests. We can use the psql command to do this, using sudo -u postgres to authenticate as the postgres system user since no password has been set yet.

+ + + +
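
If you want to sanity-check that step before baking it into CI, you can run the same commands in a throwaway container and confirm the password works over TCP. This is just an illustrative check (it assumes the PostgreSQL client tools are installed and the server is listening on localhost, which is the Ubuntu default):

+ + + +
sudo -u postgres psql -c "alter user postgres with password 'test123';"
+PGPASSWORD=test123 psql -h 127.0.0.1 -U postgres -d postgres -c "SELECT version();"
+ + + +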

Our final gitea actions yaml looks something like this:

+ + + +
name: Run Tests
+run-name: ${{ gitea.actor }} is testing out Gitea Actions 🚀
+on: [push]
+
+jobs:
+  run_tests:
+    runs-on: ubuntu-latest
+    container: catthehacker/ubuntu:act-latest
+    steps:
+      - name: Checkout Codebase
+        uses: actions/checkout@v3
+
+
+      - name: Configure and install postgres
+        run: |
+         apt update
+         apt install -y postgresql
+         service postgresql start
+         sudo -u postgres -s psql -U postgres -d postgres -c "alter user postgres with password 'test123';"
+
+
+      - uses: pdm-project/setup-pdm@v3
+        with:
+          python-version: "3.10" # quoted so YAML doesn't truncate this to 3.1
+          token: ${{ secrets.GH_TOKEN }}
+
+
+      - name: Install dependencies
+        run: cd ${{ gitea.workspace }} && pdm install 
+    
+
+      - name: Run Django tests
+        env:
+          DB_HOST: 127.0.0.1
+          DB_NAME: gastronaut
+          DB_USER: postgres
+          DB_PASSWORD: test123
+
+        run: |
+          cd ${{ gitea.workspace }} && pdm run manage.py test
+      
+ + + +

Hopefully this will work out of the box. The whole thing is pretty efficient with a good internet connection: my run takes just under 2 minutes with about 25 seconds needed to install and configure the postgres db:

+ + + +
a screenshot of the gitea actions CI log for the flow set out above and with indicative timings. The whole run takes 1m 47s and the postgres config step takes 25s
+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/22/AI Poisoning for everyone_.md b/brainsteam/content/posts/2024/01/22/AI Poisoning for everyone_.md new file mode 100644 index 0000000..7935dc9 --- /dev/null +++ b/brainsteam/content/posts/2024/01/22/AI Poisoning for everyone_.md @@ -0,0 +1,32 @@ +--- +categories: +- AI and Machine Learning +date: '2024-01-22 09:25:51' +draft: false +tags: +- llms +- nlp +- security +title: AI Poisoning for everyone! +type: posts +--- + + +

In reply to AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic.

+ + + +

This is definitely a bit of a hot take from Ars Technica on the recent Anthropic paper about sleeper agents. The article concludes with "...this means that an open source LLM could potentially become a security liability..." but neglects to mention two key things:

+ + + +

1) this attack vector isn't just for "open source LLMs" but for any LLM trained on publicly scraped data. We're in the dark on the specifics but we know with some certainty that GPT and Claude are "really really big" transformer-decoders and the secret sauce is the scale and the mix of training data. That means they're just as susceptible to attack as any other LLM with this architecture when trained on scraped data.

+ + + +

2) This isn't a new problem, it's an extension of the "let's train on everything we could scrape without proper moderation and hope that we can fine-tune the bad stuff away" mentality. It's a problem that persists in any model, closed or open, which has been trained in this way.

+ + + +

One thing I know for sure as a machine learning practitioner: performance discrepancies aside, I can probe, test and fine-tune open model weights to my heart's content. With a model behind an API I have a lot less scope to explore and probe and I have to trust, at face value, the promises of the model providers who are being embarrassed by moderation fails on a weekly basis (like here and here). I know which I'd prefer...

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/01/25/Supporting the Underdog when it comes to Automated Decision Making and Power Dynamics.md b/brainsteam/content/posts/2024/01/25/Supporting the Underdog when it comes to Automated Decision Making and Power Dynamics.md new file mode 100644 index 0000000..2218642 --- /dev/null +++ b/brainsteam/content/posts/2024/01/25/Supporting the Underdog when it comes to Automated Decision Making and Power Dynamics.md @@ -0,0 +1,32 @@ +--- +categories: +- AI and Machine Learning +- Philosophy and Thinking +date: '2024-01-25 09:33:27' +draft: false +tags: +- AI +- ethics +- nlp +title: Supporting the Underdog when it comes to Automated Decision Making and Power + Dynamics +type: posts +--- + + +

In reply to The (theoretical) risks of open sourcing (imaginary) Government LLMs.

+

I enjoyed this recent blog post by acclaimed technologist Terence Eden proposing a thought experiment about the ethics of open sourcing a hypothetical LLM classifier trained on benefits sanction appeal letters.

+ + + +

Eden, himself a huge open source advocate, argues quite compellingly that such a model should be kept closed, to prevent the leakage of potentially confidential information in the training data or the probing of the model for the purpose of abusing it.

However, as some of the post's commentators point out, there is a bigger question at play here: where is it appropriate to be using this kind of tech?

One of the key issues in my mind is the end-user's treatment and the power dynamic at play here. If you're making life-and-death decisions (tw suicide) about people who have few resources to challenge those decisions, then you should have appropriate systems in place to make sure that decisions are fair, explainable and rational. You must provide mechanisms that allow the party with everything to lose in this situation to understand what is happening and why. Finally, there must always be an adequate escape hatch for recourse if the computer gets it wrong.

+ + + +

Whether we're talking about AI in the context of "draw a line of best fit through these points on a graph" or whether we're talking about fancy language models with billions of parameters, my view is that this stuff should always be an augmentative technology rather than a replacement for human intelligence. Wherever it is deployed, AI should be helping a human being to do their job more effectively rather than letting them fall asleep at the wheel. From what I know about Terence I'd go out on a limb to assume he feels the same way and perhaps all this stuff I'm writing about is implicit in his thought experiment.

+ + + +

However, this all comes at a time when, in the UK, we've had a recent reminder about what happens when the computer says no and no option for recourse is provided. So I felt that it was worthwhile to fill in these gaps.

+
+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/02/03/January 2024 In Review.md b/brainsteam/content/posts/2024/02/03/January 2024 In Review.md new file mode 100644 index 0000000..e2f45a6 --- /dev/null +++ b/brainsteam/content/posts/2024/02/03/January 2024 In Review.md @@ -0,0 +1,108 @@ +--- +categories: +- Personal +date: '2024-02-03 09:38:54' +draft: false +tags: +- MonthlyReview +title: January 2024 In Review +type: posts +--- + + +

My first monthly review in the new format since I mentioned in December that I would move from publishing weekly to monthly instead. Quite often January is a slow and sleepy month that drags on and on. This year I found that a lot happened in January and I was surprised how quickly it passed.

+ + + +

Personal Stuff

+ + + +

In my goal not to be a fat boy at 35 I shifted around 5kg (or 11lb) by not eating crap like we did over the Christmas holidays and by going for walks or getting on the stationary bike most days. I've still got a fair bit more weight to shift but I'm happy with this as a starting point and I've found that eating more healthily most of the time means that sweet treats taste absolutely amazing when we do have a treat day.

+ + + +

January was a great opportunity for me to re-connect with a lot of people. I caught up with a mentor from my time at the Catalyst programme in Southampton Science Park, some former colleagues and friends. I got the chance to hang out with my good friend Dan Duma who I met during my PhD studies.

+ + + +
James and Daniel hanging out in the The Scottish Stores pub. They are sat at a table with beers in front of them.
Two doctors: Dan (left) and James (right) catching up in the Scottish Stores pub near Kings Cross, London
+ + + +

It was also an opportunity to make new friends and acquaintances and I had some networking meetings and calls with people I'd met on LinkedIn and through the Ness Labs community.

+ + + +
+

We got the opportunity to go up and visit my parents in the Midlands and hung out there one weekend. It was one of the colder weeks in January so it was lovely to be in a cosy setting and their cat Bertie took a liking to me

+ + + +

We had planned to go out to a Thai restaurant that we really liked and had been to before but when we arrived there the lovely owners came out to tell us that they had a power cut and they couldn't cook - luckily it was lunch time and they were hoping to get the power back on before their fully booked evening service. We ended up going to a cute little pub nearby instead and Mrs R still ordered a Thai curry!

+
+ + + +

While we were up north we also got to meet up with one of my school friends for breakfast which was great fun.

+ + + +

Some stuff didn't go so well: we lost a close family friend earlier this month and Mrs R's family dog, which she'd had since she was in her early teens.

+ + + +

We also had to take our cats in for some dental surgery which was carried out under general anaesthetic and I found that very stressful even though the vet reassured us that the risk was very low and it all worked out fine.

+ + + +

Work Stuff

+ + + +

It's been a pretty hectic month on the work front. I had 8 annual personal development reviews to go through with my team. I love chatting to my team about what went well and what they want to work on this year so that was actually quite an enjoyable experience.

+ + + +

I oversaw a migration from Linear to JIRA which we decided to make due to a need to move to a more controlled and rigorous development process and so that our product team could make use of JIRA Product Discovery for ranking and sorting new feature requests effectively. I've written some scripts for automating a lot of the migration and preserving things like attachments and comments which I am in the process of open sourcing. It's still a little bit half-baked for now but I'll post again when I'm done.

+ + + +

I spent a lot of time helping out with annual reviews of internal policy documentation around security, cyber-risk and software lifecycle which was a dull but very important task.

+ + + +

Entertainment

+ + + +

We set a goal of trying to get to the cinema twice a month this year since we have a Cineworld membership that costs about the same as 1.5 standard tickets. At the start of the month we went to see Anyone But You but we haven't made it back since. Maybe February will be a better month for cinema trips!

+ + + +

We binged our way through the latest series of Queer Eye this last week (I say binged but we watched one episode a night which I thought demonstrated restraint) and I really enjoyed it as usual. I'm excited that there will be a new series but sad that Bobby will not be returning - his interior design work is superb.

+ + + +

This month I've been reading Consider Phlebas by Iain M. Banks. I was reminded by Kev's recent review that the Culture books have been on my reading list since forever. So far I've found it a little difficult to get into and I keep falling asleep with it. However, I am about 25% of the way through now and I'm determined to keep going.

+ + + +

New Process

+ + + +

My new review process is to write a weekly note in obsidian that summarises each week and then write a monthly review note which summarises my weeklies. Then I can publish a blog post based on that. So far I'm enjoying it and I think I'm putting a lot less pressure on myself if I have a slow week where there's not much to write home about. I think I'll probably stick to this new process for now and see how I get on.

+ + + +

What will February Bring?

+ + + +

It will be interesting to see how the next month pans out. I have the funeral for the family friend I mentioned coming up later this week. I'm hoping to keep meeting up with people and networking as, finally, nearly 3 years after the COVID lockdowns stopped happening regularly, I'm getting back into the swing of having a social life again.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/03/04/February 2024 In Review.md b/brainsteam/content/posts/2024/03/04/February 2024 In Review.md new file mode 100644 index 0000000..b1f7d1f --- /dev/null +++ b/brainsteam/content/posts/2024/03/04/February 2024 In Review.md @@ -0,0 +1,82 @@ +--- +categories: +- Personal +date: '2024-03-04 21:29:51' +draft: false +tags: +- MonthlyReview +title: February 2024 In Review +type: posts +--- + + +

It was a pretty typical February for us this year in the sense that it was pretty rubbish and both Mrs R and I were sick for a lot of it.

+ + + +

Flu February

+ + + +

Regular readers may have noticed that I've not been posting much this month. Towards the end of the first full week of Feb I was stricken with flu. I've heard people say "it's not real flu unless you think you're dying" and laughed it off - I've had heavy colds before. Now that I've had real influenza, I really do get it. For the first 2-3 days I had a very high fever and muscle aches. Then, I had those symptoms plus nausea and sickness for a further 3 days. Normally if you get a stomach bug it passes within 24 hours. After 3 days solid of feeling this way you really do start to think "am I dying?"

+ + + +

Thankfully, I started to recover after about 5 days and now, almost 3 weeks later, I almost feel human again. However, the struggle has been real. One of the interesting things was that I do get the flu vaccine on an annual basis so I guess this year I was just super unlucky and got a strain that I was not protected against. Or, perhaps, as my doctor said, it would have been even worse without the vaccine (I almost can't comprehend how bad that would be).

+ + + +

Back at Work

+ + + +

After a week off and a second week doing half days and sleeping throughout the afternoon, I returned to work "as normal" last week.

+ + + +

We are currently doing some interesting work exploring few-shot and zero-shot models for classification in use cases where we would normally be looking to collect large volumes of data to help the client train their models. The trade-off is that many of these models are quite heavy to run compute-wise so I've been doing some work looking at Huggingface Optimum to quantize models so that they can run in less memory and with fewer CPU/GPU cycles. I've also been playing with Nvidia Triton for deploying models.

+ + + +

Take Note

+ + + +

I've been exploring personal knowledge management (yes again sigh) and playing with some tools and approaches. I have started experimenting with Silverbullet which has a steep learning curve but provides some pretty nifty features out of the box and plays really nicely with other Markdown-based PKM systems like Obsidian and Foambubble. I've also set it up so that any notes that I set as public: true in the frontmatter are automatically published to my digital garden. I have added some sensible exclude and ignore rules so that I don't publish anything I want to keep private.

+ + + +

I was playing with the idea of moving away from wordpress for my main site again but I don't want to sink loads of time into re-designing my main website - especially since I only did it 6 months ago.

+ + + +

At Home

+ + + +

With both of us being sick, we didn't have much opportunity to achieve a lot in our family life this month. However, we did book some holidays and trips for later in the year. Being deathly sick allowed me to lose a further 2kg - I'm now 97kg (down from 105kg at my heaviest), which I am pretty happy with.

+ + + +

While I was at my sickest I couldn't even bring myself to watch TV or read, but either side of that I did manage a few things:

+ + + +

I finally finished Consider Phlebas which, as I mentioned previously, I had been struggling to read. At some point the pace picked up and I started to gel with the characters but honestly, it was a real struggle to get there. When I did finally finish the book, I found the ending utterly unsatisfying and it made the whole book feel a bit moot and pointless. I think that might be what the author was going for but it's made me think twice about picking up any more of the Culture books. I've since started reading The Doors of Eden by Adrian Tchaikovsky which I'm really enjoying.

+ + + +

I finished Avenue 5 which is a sitcom about a bunch of idiots in space with some heavy hitters (Hugh Laurie, Josh Gad, Zach Woods, Lenora Crichlow, Daisy May Cooper). I really enjoyed the show and I was sad that it got cancelled after two series.

+ + + +

Hopes for March

+ + + +

I'd love for March to not be another write-off due to sickness. Last year we got a bunch of coughs and colds from about mid-Feb until the end of March and then I got COVID over Easter.

+ + + +

This week we have some friends and family catch ups and UK Mother's day. Next week I will be delivering a guest lecture on Software development best practices at Warwick Business School and then the next day we will be going up to Edinburgh for a mini-break and to explore the city.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/03/17/Broken Sites in Firefox with Proton Pass v1.15.md b/brainsteam/content/posts/2024/03/17/Broken Sites in Firefox with Proton Pass v1.15.md new file mode 100644 index 0000000..b14383c --- /dev/null +++ b/brainsteam/content/posts/2024/03/17/Broken Sites in Firefox with Proton Pass v1.15.md @@ -0,0 +1,23 @@ +--- +categories: +- Software Development +date: '2024-03-17 17:41:14' +draft: false +tags: +- firefox +- proton +title: Broken Sites in Firefox with Proton Pass v1.15 +type: posts +--- + + +

This morning I noticed that there were some weird things happening in Firefox and pages were crashing all over the place, including my WordPress editor, my Umami console and various others. I'd also had some weirdness with Gnome and Ubuntu on my laptop earlier, so I initially put it down to operating system instability.

+ + + +

Then, I came across someone on Lemmy who was experiencing similar problems and blamed the Proton Pass plugin. The error messages I was getting in the console didn't mention proton but they were similar in nature. I disabled the addon and hey presto, I got some stability back on my system.

+ + + +

If you are having this problem and you're a Proton Pass user, you will be pleased to know that Proton are working on getting it fixed, but in the meantime you can go to the official addon page and download v1.14.1 to forcibly revert your plugin and restore stability until the new version is out.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/03/17/Moving from Gitea to Forgejo including Actions with Docker Compose.md b/brainsteam/content/posts/2024/03/17/Moving from Gitea to Forgejo including Actions with Docker Compose.md new file mode 100644 index 0000000..f48eaa0 --- /dev/null +++ b/brainsteam/content/posts/2024/03/17/Moving from Gitea to Forgejo including Actions with Docker Compose.md @@ -0,0 +1,163 @@ +--- +categories: +- Software Development +date: '2024-03-17 12:51:07' +draft: false +tags: +- devops +- docker +- forgejo +- gitea +title: Moving from Gitea to Forgejo including Actions with Docker Compose +type: posts +--- + + +

I just moved from gitea to forgejo and the process could not have been simpler. I'm really impressed. I suppose since the latter was until very recently a soft fork of gitea, I shouldn't be surprised at how easy it was. Still though, I sat down expecting to need to dedicate a few hours to debugging and it "just works" without a hitch after 15 minutes. The most jarring part was getting the actions runner working but the forgejo documentation is great, in fact I'd say better than gitea's docs, so in the end this wasn't a massive issue.

+ + + +

Why?

+ + + +

I have used gitea for hosting personal projects since about 2018 when I first became aware of it. I've been super impressed and I love the fact that it's so lightweight and simple to use. However, I've been a little disappointed by the founders' recent-ish pivot towards commercialisation. Don't get me wrong, I'm an open source maintainer and contributor myself and I know that many of these folks are essentially doing a public service for free. The final straw for me has been the recent announcement of Gitea Enterprise and the direction they seem to be moving in (open "core" with a bunch of paywalled features).

+ + + +

I was a huge fan of Gitlab about 10-11 years ago but they have since heavily commercialised and gotten worse, caring less and less about their community and more and more about corporate users. The open core model can be ok if the features that are paywalled genuinely are things that only big companies care about or if bleeding edge features are open sourced after an embargo period or once a funding milestone has been met. However, I'm once bitten twice shy when it comes to these kinds of precedents in software forges so forgive my cynicism.

+ + + +

Forgejo was set up in response to Gitea's pivot towards commercialisation and their governance model is pretty much set up in a way to prevent the same thing happening again (I guess nothing is impossible but it is unlikely and there would be a lot of drama). For now I will throw in my lot with them.

+ + + +

How to Switch The Server

+ + + +

The server software was the easiest thing to switch. Following the instructions from forgejo, I simply replaced the name of the gitea image with the equivalent forgejo one in my docker-compose.yml and ran docker-compose pull followed by docker-compose up:

+ + + +

Before:

+ + + +
  server:
+    restart: unless-stopped
+    image: gitea/gitea:1.21
+    container_name: gitea
+    environment:
+      - USER_UID=1000
+      - USER_GID=1000
+    volumes:
+     ...
+ + + +

And after:

+ + + +
  server:
+    restart: unless-stopped
+    image: codeberg.org/forgejo/forgejo:1.21
+    container_name: gitea
+    environment:
+      - USER_UID=1000
+      - USER_GID=1000
+    volumes:
+      ...
+ + + +
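
For reference, the switch-over itself is just the usual compose dance (the -d flag runs it detached; the service name here matches the compose snippet above):

+ + + +
docker-compose pull
+docker-compose up -d
+docker-compose logs -f server  # watch the startup and migration output
+ + + +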

As soon as I ran this and started the server the whole thing came back up and apart from the new orange paint job, very little had changed about my server.

+ + + +

+ + + +

Swapping the Runner

+ + + +

The runner was slightly more painful because I had made a couple of customisations to my gitea actions runner before in order to get it running docker-in-docker (allowing me to push and pull images from inside the runner without those images appearing in the host system's user space):

+ + + +
  runner:
+    restart: unless-stopped
+    image: vegardit/gitea-act-runner:dind-latest
+    privileged: true
+    environment:
+      CONFIG_FILE: /config.yaml
+      GITEA_INSTANCE_URL: https://git.jamesravey.me
+      GITEA_RUNNER_REGISTRATION_TOKEN: ""
+      GITEA_RUNNER_NAME: gitea-runner-1
+      GITEA_RUNNER_LABELS: "ubuntu-latest,ubuntu-22.04,ubuntu-20.04,ubuntu-18.04"
+    volumes:
+      - ./runner-config.yaml:/config.yaml
+      - ./runner-data:/data:rw
+      - ./runner-cache/:/cache
+    ports:
+      - 42263:42263
+ + + +

Again, the documentation in forgejo is great and they provide an example for running docker-in-docker as a separate service and having the runner talk to it. All I had to do was change a couple of env vars and the runner image name:

+ + + +
  docker-in-docker:
+    image: docker:dind
+    container_name: 'docker_dind'
+    privileged: true
+    command: ['dockerd', '-H', 'tcp://0.0.0.0:2375', '--tls=false']
+    restart: 'unless-stopped'
+
+  runner:
+    restart: unless-stopped
+    links:
+      - docker-in-docker
+    depends_on:
+      docker-in-docker:
+        condition: service_started
+    container_name: 'runner'
+    image: code.forgejo.org/forgejo/runner:3.3.0
+    user: 1000:1000
+    command: forgejo-runner daemon
+    environment:
+      DOCKER_HOST: tcp://docker-in-docker:2375
+      CONFIG_FILE: /config.yaml
+      GITEA_INSTANCE_URL: https://git.jamesravey.me
+      GITEA_RUNNER_REGISTRATION_TOKEN: ""
+      GITEA_RUNNER_NAME: gitea-runner-1
+      GITEA_RUNNER_LABELS: "ubuntu-latest,ubuntu-22.04,ubuntu-20.04,ubuntu-18.04"
+    volumes:
+      - ./runner-config.yaml:/config.yaml
+      - ./runner-data:/data:rw
+      - ./runner-cache/:/cache
+      - /var/run/docker.sock:/var/run/docker.sock
+    ports:
+      - 42263:42263
+ + + +

Again, a quick test via docker-compose up and running one of my CI pipelines and voila, everything clicked into place:

+ + + +
a screenshot of a forgejo run with green ticks
Happy green ticks after the forgejo runner completes its run.
+ + + +

Conclusion

+ + + +

If you were thinking about checking out forgejo for philosophical reasons or simply because you were curious and you're worried about switching from gitea, it couldn't be easier. If you are thinking about trying it out though, I'd strongly recommend taking a backup and/or spinning up a temporary or beta site where you can try it out before committing. Also bear in mind that Forgejo recommend not using their actions runner in prod yet because it may not be secure enough. I don't let anyone else use or sign up in my instance and I'm pretty paranoid about what I do run there so I'm taking a calculated risk here.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/03/30/Edinburgh 2024.md b/brainsteam/content/posts/2024/03/30/Edinburgh 2024.md new file mode 100644 index 0000000..b30df73 --- /dev/null +++ b/brainsteam/content/posts/2024/03/30/Edinburgh 2024.md @@ -0,0 +1,155 @@ +--- +categories: +- Personal +date: '2024-03-30 08:07:03' +draft: false +tags: +- food +- travel +title: Edinburgh 2024 +type: posts +--- + + +

Earlier this month, Mrs R and I took a mini-break in Edinburgh. We got a few hours of sunshine on the day that we arrived and I managed to get a couple of nice photos of the castle. After that it basically rained non stop until we left, which, as I understand it, is the authentic Scottish experience. We still had a really great time though.

+ + + +
James and Mrs R posing in front of Edinburgh Castle on a sunny afternoon.
+ + + +

One of the few sunny photos that we managed to capture.

+ + + +

Things We Did

+ + + +

The Castle

+ + + +

We visited Edinburgh Castle which was quite neat but it was raining heavily during our visit. The interior of the castle was really interesting. There is a regal banquet hall that felt like something straight out of Game of Thrones. Apparently Oliver Cromwell turned it into a cramped barracks. He installed 3 storeys of wooden floors in there which definitely would have spoiled it. It was then restored in the 1800s. From what I know about Cromwell, most stuff he did was questionable or flat out offensive so it didn't surprise me.

+ + + +
A building that looks a bit like a poo made out of metal - or maybe a spiral ribbon if you look closely
+

The views from atop the castle weren't super interesting due to the heavy rain and lack of visibility. However, on a good day, I'm told the view is spectacular.

+ + + +

I managed to photograph the building that looks suspiciously like a big poo. This building is locally known as "the jobbie" which is Scottish slang for poo. Apparently, it's a shopping mall.

+ + + +

Our guide in the camera obscura said he was once berated by the building's architect when describing it as "the jobbie". Apparently it is supposed to be a ribbon. I suppose I sort of see it.

+
+ + + +

Camera Obscura

+ + + +
+

Edinburgh Camera obscura is really cool. In the 1800s a female optician and professor built this thing as a giant spying tool. It uses huge glass focussing lenses to focus light onto a white projection table. You can move the aperture around with a long wooden pole. It was effectively an early CCTV system. Our tour guide explained that she basically did this "for shigs" in high society Edinburgh.

+ + + +

The camera needs lots of light to work and we visited Edinburgh on a particularly rainy day. Therefore, the picture wasn't that clear. Luckily, they simulated a sunny day using a digital projector and there were some stunning views. This was my second experience with a Camera Obscura. My first was at Aberystwyth Constitution Hill when I was at university.

+
An external shot of the camera obscura on a rainy day. It's a round tower with a hole at the top to let light in.
+ + + +

The Real Mary King's Close

+ + + +

This was a really interesting and informative attraction. When they built Edinburgh City Chambers, they decided to build it over the top of 4-5 streets that go downhill behind it. This meant using the existing streets and buildings for support. They chopped the roofs off some buildings and put rubble and stuff on others to create a flat foundation. The original street is still there underground and you can take guided tours around it. You can see some of the original architecture and even some of the possessions of 18th and 19th century Edinburgh residents. It was very cool and very interesting.

+ + + + + + + +

Really simplified diagrams of before (left) and after (right) the hall was built.

+ + + +

Our tour guide, Becca, was really fun and engaging. She was in character as a "Foul Clenger" from the plague times. These were people who went into the houses of dead plague victims and burned all of their belongings. We learned a lot about the black death epidemics in Edinburgh. Most interestingly, the plague doctors' uniforms (that iconic look with the raven mask) protected them from the disease almost by coincidence. They were trying to protect themselves from "foul odors" and "miasmas" and coincidentally stopped the bubonic-plague-carrying fleas from biting them.

+ + + +

+ + + +

National Museum of Scotland

+ + + +
The magnificent interior of the museum with huge glass windows and decorative iron supports
+

The National Museum of Scotland is a huge striking building in central Edinburgh. It's about 5 minutes walk from the castle and a perfect distraction in the rain.

+ + + +

We ventured around the museum for a few hours. They had lots of different exhibits focussing on natural history, modern science and engineering and anthropology. They had lots of fun interactive exhibits to play with like giant musical instruments and practical experiments.

+ + + +

It was a bit jarring seeing devices that I used to own, like the Nokia 3310 in the museum of technologies past. That's progress I guess!

+
+ + + +

+ + + +

Dining

+ + + +

We ate in two really lovely restaurants.

+ + + +
+

On the first night we went to Bertie's Proper Fish + Chips where I tried a proper scotch egg. There were a lot of Americans around trying the traditional British dish which was kind of neat. They did have battered mars bars on their menu but after my scotch egg and haddock and chips I was too full to try it.

+
A monster plate of fish and chips with a wedge of lemon and tartar sauce
+ + + +

On our 2nd night we were due to go and eat at an Italian restaurant but we were both super tired, so we stayed in our hotel apartment and heated up some store-bought ravioli. Not exactly the Italian experience we were hoping for but it filled a gap.

+ + + +
James and Mrs R sat in Nok's Kitchen in Edinburgh with a bottle of water in front of them on the table.
+

On our last night we went to Nok's Kitchen, an independently run Thai restaurant just next to the castle. On our travels we are rarely let down by an indie Thai restaurant and this was no exception to that finding. Nok's was really great and the staff were super friendly too.

+ + + +

The restaurant was quite small and it was completely full up the whole time we were there.

+
+ + + +

Final Thoughts

+ + + +

I really enjoyed our mini-break to Edinburgh. Despite the inclement weather we had a really great time. I found the people in Edinburgh, particularly those we interacted with in restaurants and tourist attractions, to be super welcoming and genuinely passionate about what they do. This was refreshing compared to where we live in England, where most staff in these industries are bored teenagers doing the bare minimum to get through their shift (and hey, for minimum wage and without the tipping culture of the US, who can really blame them?). Edinburgh's residents seem to have a real pride in what they do. There is a national identity there that just isn't a thing in England and I found that really refreshing.

+ + + +

We will definitely try and get back to Edinburgh when it's not persistently raining some time in the summer. Not that anyone can make such guarantees in the UK/Scotland. If you are planning to visit Scotland in March, please do pack a good rain coat and some waterproof trousers. We saw a huge number of tourists wearing t-shirts under those disposable rain ponchos you get from theme parks. We assume that they probably paid through the nose for them at an Edinburgh tourist shop!

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/04/02/Finding the Best AI-Powered Handwriting OCR.md b/brainsteam/content/posts/2024/04/02/Finding the Best AI-Powered Handwriting OCR.md new file mode 100644 index 0000000..7bfd07c --- /dev/null +++ b/brainsteam/content/posts/2024/04/02/Finding the Best AI-Powered Handwriting OCR.md @@ -0,0 +1,180 @@ +--- +categories: +- AI and Machine Learning +date: '2024-04-02 18:25:27' +draft: false +tags: +- AI +- llms +- nlp +title: Finding the Best AI-Powered Handwriting OCR +type: posts +--- + + +

Quite unusually for me, this essay started its life as a scribble in a notebook rather than something I typed into a markdown editor. A few weeks ago, Tiago Forte made a video suggesting that people can use GPT-4 to capture their handwritten notes digitally. I've been looking for a "smart" OCR that can process my terribly scratchy, spidery handwriting for many years but none have quite cut the mustard. I thought, why not give it a go? To my absolute surprise, GPT did a reasonable job of parsing my scrawling and capturing text. I was seriously impressed.

+ + + +

Handwriting OCR has now gone from a fun toy and conversation piece to something I can actually, seriously use in my workflow. So then, is GPT the best option I've got for this task or are there any FOSS tools and models that I could use instead?

+ + + +

An Experiment

+ + + +

In my experiment I use the following system prompt with each model:

+ + + +
+

Please transcribe these handwritten notes for me. Please use markdown formatting. Do not apply line wrapping to any paragraphs. Do try to capture headings and subheadings as well as any formatting such as bold or italic. Omit any text that has been scribbled out. Try your best to understand the writing and produce a first draft. If anything is unclear, follow up with questions at the end.

+
+ + + +

and then proceed to send each of them this photo of my notes:

+ + + +
A page in a notebook with handwritten text which is the first draft of this very post. It starts
My lovely scrawly hand writing displa
+ + + +

The main flaw in my experiment is its scale. Ideally I would use a large number of handwritten documents with different pens, ink colours and photographed under different lighting conditions to see how this affects the model output. However, I didn't have hours and hours to spare manually transcribing my own notes.

+ + + +

However, I would say that the photo is pretty representative of a page of notes in my typical handwriting. It is written in my bujo/notebook that I carry around with me and my fountain pen that I use most of the time. I wouldn't expect results to vary wildly for me specifically. However dear reader: your mileage may vary.

+ + + +

What I'm Looking For

+ + + +

I want to understand which model is best able to understand my handwriting without misinterpreting it. Bonus points for also providing sensible styling/formatting and being robust to slightly noisy images (e.g. the next page in the notebook being slightly visible in the image).

+ + + +

Ideal Outcomes

+ + + +

What I hope to find is an open source/local model that I could run which would provide high quality handwriting OCR that I can run on my own computer. This would give me peace of mind about not having to send my innermost thoughts to OpenAI for transcription and would be a budget option. Also, from an energy and water usage point of view, it would be great to be able to do this stuff with a small model that can run on a single consumer-grade GPU like I have in my computer.

+ + + +
+ + + +

GPT-4V

+ + + +

GPT-4V did the best job of transcribing my notes, making the fewest mistakes interpreting my handwriting and almost perfectly following my instructions regarding formatting. It did make a few errors. For example, renaming Tiago Forte as Rafe Furst and referring to LLAVA as "Clara".

+ + + +

Read GPT-4 Full Response

+ + + +

Gemini

+ + + +

Google Gemini did a pretty poor job compared to GPT-4V. It ignored my instructions not to wrap lines in markdown and it also attempted to read the first word of each line from the second page of my notes which is visible in the image. It made a complete hash of my work and also misread a large chunk of the words. I used the model in chat mode and even when I asked it to "try again but this time don't try to read the next page", it ignored me and spat out more or less the same thing.

+ + + +

Read Full Gemini Output

+ + + +

Claude Opus

+ + + +

The quality of Claude's response was much closer to that of GPT-4V. There were a couple of very prominent spelling errors/misinterpretations that made me immediately suspicious, but in general the output is quite faithful to the writing. However, Claude did say it would provide markdown output and then completely fail to do that.

+ + + +

Read full Claude Output

+ + + +

Local Model: LLAVA

+ + + +

One of the biggest advances in local LLMs in the last 12 months has been LLAVA, which provides reasonable performance on a number of multi-modal benchmarks. Earlier versions of LLAVA were trained by having GPT-4 expand text descriptions of images that were originally provided by humans. At that point they didn't yet have access to GPT-4V, so training was done on descriptions that were effectively second-hand - it's pretty surprising that this worked at all tbh, but it did!

+ + + +

LLAVA 1.6 is trained on data with more in-depth annotations including LAION GPT-4V which, incidentally, I can't find any formal evaluations of. This seems to improve the model's ability to reason about the contents of images. They also train using TextVQA which aims to train models to be able to reason about and perform question answering for text documents. It is likely that elements of both of these datasets help improve the model's OCR ability.

+ + + +

I set up LLAVA to run locally using llama.cpp and tried running some experiments to see how well it picked up my handwriting (instructions here). Although LLAVA 1.6 fared a lot better than earlier versions, it didn't do an amazingly good job. It also got stuck in a generation loop a few times, repeatedly outputting the same mad ramblings until terminated.

+ + + +
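
For anyone who wants to try to reproduce this, my invocation looked something like the sketch below. Treat it as indicative rather than copy-pasteable: binary and flag names have shifted between llama.cpp releases, and the model/projector file names here are placeholders for whichever LLAVA 1.6 GGUF files you downloaded.

+ + + +
./llava-cli -m ./models/llava-v1.6-7b.Q4_K_M.gguf \
+  --mmproj ./models/llava-v1.6-7b-mmproj-f16.gguf \
+  --image ./notebook-page.jpg \
+  --temp 0.1 \
+  -p "Please transcribe these handwritten notes for me. Use markdown formatting and capture headings and subheadings."
+ + + +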

Read LLAVA Output Here

+ + + +

The Classic option - CRAFT-TrOCR

+ + + +

HF Space Repo

+ + + +

CRAFT-TrOCR is an early example of a transformer-based large multi-modal model. According to TrOCR's abstract:

+ + + +
+

we propose an end-to-end text recognition approach with pre-trained image Transformer and text Transformer models, namely TrOCR, which leverages the Transformer architecture for both image understanding and wordpiece-level text generation

+
+ + + +

The CRAFT component is used to detect regions of the image that may contain text and TrOCR is used to recognise the text. However, TrOCR is quite small for a large language model: the largest version trained in the paper has 550M parameters - around the same size as BERT and around 40 times smaller than GPT-3.5 is believed to be.

+ + + +

I played with this pipeline a couple of years ago and I remembered it being reasonably good in certain contexts, but only if I wrote in my very best handwriting all the time. And I don't do that because when I am taking notes in a meeting or something, I am thinking about the meeting, not my handwriting.

+ + + +

Although the HF space is broken, I was able to make some local changes (published here) to get it running on my PC. This is not a generative model with a prompt. It is a pipeline with a single purpose: to recognise and output text. I simply uploaded the photo of my note and hit submit.

+ + + +

However, I found that the output was pretty disappointing and nonsensical. I wonder if the limitation is the model size, the architecture or even the image preprocessing and segmentation that is going on (which is "automatic" in more recent LMMs).

+ + + +

Read CRAFT-TROCR Output

+ + + +

+ + + +

Conclusion: The Best Option Right Now

+ + + +

Right now, GPT-4V seems to be the best option for handwriting OCR. I have created a custom GPT which has a tuned variation on the system prompt used above in my experiment which people can use here. Using GPT to do this work makes copying my analogue thoughts into my digital garden a breeze.

+ + + +

However, I am very much on the lookout for a new open source model that will do this job well. What could change in this space? A version of LLaVa that has been fine-tuned on the handwriting OCR task could emerge or perhaps we'll see an altogether new model. The rate of open source model development is phenomenal so I am sure we will see progress in this space soon. Until then, I'm stuck paying OpenAI for their model.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/04/07/March 2024 In Review.md b/brainsteam/content/posts/2024/04/07/March 2024 In Review.md new file mode 100644 index 0000000..84d3bd8 --- /dev/null +++ b/brainsteam/content/posts/2024/04/07/March 2024 In Review.md @@ -0,0 +1,92 @@ +--- +categories: +- Personal +date: '2024-04-07 15:41:04' +draft: false +tags: +- MonthlyReview +title: March 2024 In Review +type: posts +--- + + +

We're a few days into April now so I'm overdue my March Review. March was a much more enjoyable month than February. We did a lot of stuff and I managed not to get sick the whole time which was great.

+ + + +

Travelling About

+ + + +

I started the month with a couple of hectic weeks. In the first week we met up with an old family friend for dinner in Southampton. We went to Mango Thai Above Bar where we had delightful food and a really fun evening. We also met up with my Mum and her partner for lunch to celebrate Mother's Day.

+ + + +

In the second week I went up to Warwick University to give a guest lecture about AI and Machine Learning. I drove up and back on the same day (a good 6 hour round trip) so that I could be at home to meet Mrs R the next day and fly up to Edinburgh for our mini break.

+ + + +
James stood on a stage in front of a podium with slides projected behind him -
+ + + +

We had a week off to recover before a weekend visiting my Dad and his wife for her birthday party. It was nice to see a bunch of family members, some of whom we hadn't seen since before COVID. We also visited my mum and her partner the next day. We went out for Sunday breakfast with them by the riverside in Bewdley.

+ + + +

Learning in Public

+ + + +

Since I migrated to using silverbullet + Obsidian last month I've been writing a lot more in my public digital garden. I'm writing up any useful tidbits of information that I find there and gradually adding to them over time. Historically I have written these sorts of posts up as blog posts, which then age and get left behind. For example, my advice on python virtual environments, which is due a rewrite, would make a great digital garden piece.

+ + + +

I am planning to add summaries of my digital garden changes here. Partly as signposts to new information that readers might find interesting. Partly as a way for me to track and quantify the knowledge I am collecting.

+ + + +

I also published a few articles:

+ + + +
+https://brainsteam.co.uk/2024/03/17/broken-sites-in-firefox-with-proton-pass-v1-15 +
+ + + +
+https://brainsteam.co.uk/2024/03/17/moving-from-gitea-to-forgejo-including-actions-with-docker-compose +
+ + + +
+https://brainsteam.co.uk/2024/03/30/edinburgh-2024 +
+ + + +

Films, Books & Games

+ + + +

In March I finished The Doors of Eden by Adrian Tchaikovsky and started re-reading the Dune books. My book choice was largely inspired by seeing Dune: Part 2 at the cinema which I thoroughly enjoyed. The first time I read Dune I wasn't that impressed (but I was about 16 at the time). This time around, I am really enjoying it. I wonder if it is partly due to the fact that when I was younger I didn't have any visual reference. Now, having seen the films, perhaps I can better contextualise the things I read. We also went to see Ghostbusters: Frozen Empire which I thoroughly enjoyed. I was surprised to see so much of James Acaster who normally only seems to get bit parts in things.

+ + + +

I've not really been doing much gaming in the last few weeks. I took my Steam Deck to Edinburgh with me but didn't touch it at all. Instead, I have been reading and have been listening to a lot of Three Bean Salad. I am especially enjoying the beans' new film review episodes. I most recently listened to the Dune episode to stay on theme.

+ + + +

Hopes for April

+ + + +

We're now in full daylight savings mode in the UK, the days are longer and we've even had some sunshine in the last few days. I've already been out in the garden a bit in the first few days of this month and I hope to continue this trend in April and finally get our garden tamed. Apart from that we've got a slightly quieter month ahead which I am grateful for since March was action packed and May promises to be similarly busy. I'm also looking forward to Cineworld Action Movie Season this month which will allow Mrs R and I to go and see some of our favourite 90s action films on the big screen.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/04/20/Self-hosting Llama 3 on a home server.md b/brainsteam/content/posts/2024/04/20/Self-hosting Llama 3 on a home server.md new file mode 100644 index 0000000..428ab55 --- /dev/null +++ b/brainsteam/content/posts/2024/04/20/Self-hosting Llama 3 on a home server.md @@ -0,0 +1,275 @@ +--- +categories: +- AI and Machine Learning +date: '2024-04-20 19:38:12' +draft: false +tags: +- AI +- llama +- llms +- nlp +- self-hosting +title: Self-hosting Llama 3 on a home server +type: posts +--- + + +

Self-hosting Llama 3 as your own ChatGPT replacement service using a 10 year old graphics card and open source components.

+ + + +

Last week Meta launched Llama 3, the latest in their open source LLM series. Llama 3 is particularly interesting because the 8 billion parameter model, which is small enough to run on a laptop, performs as well as models 10x bigger than it. The responses it provides are as good as GPT-4 for many use cases.

+ + + +

I finally decided that this was motivation enough to dig out my old Nvidia Titan X card from the loft and slot it into my home server so that I could stand up a ChatGPT clone on my home network. In this post I explain some of the pros and cons of self-hosting llama 3 and provide configuration and resources to help you do it too.

+ + + +

How it works

+ + + +

The model is served by Ollama which is a GPU-enabled open source service for running LLMs as a service. Ollama makes heavy use of llama.cpp, the same tech that I used to build turbopilot around 1 year ago. The frontend is powered by OpenWebUI which provides a ChatGPT-like user experience for interacting with Ollama models.

+ + + +

I use docker compose to run the two services and wire them together and I've got a Caddy web server set up to let in traffic from the outside world.

+ + + +
Drawing of the setup as described above. Caddy brokers comms with the outside world over https and feeds messages to OpenWebUI
+ + + +

+ + + +

Hardware

+ + + +

My setup is running on a cheap and cheerful AMD CPU and motherboard package and a 10 year old Nvidia Titan X card (much better GPUs are available on eBay for around £150. The RTX 3060 with 12GB VRAM would be a great choice). My server has 32GB RAM but this software combo uses a lot less than that. You could probably get away with 16GB and run it smoothly or possibly even 8GB at a push.

+ + + +

You could buy this bundle and a used RTX3060 on Ebay or a brand new one for around £250 and have a functional ChatGPT replacement in your house for less than £500.

+ + + +

Pros and Cons of Llama 3

+ + + +

Llama 3 8B truly is a huge step forward for open source alternatives to relying on APIs from OpenAI, Anthropic and their peers. I am still in the early stages of working with my self-hosted Llama 3 instance but so far I'm finding that it is just as capable as GPT-4 in many arenas.

+ + + +

Pro: Price

+ + + +

Self-hosting Llama 3 with Ollama and OpenWebUI is free-ish except for any initial investment you need to make for hardware and then electricity consumption. ChatGPT Plus is currently $20/month and techies are likely burning a similar amount in API calls on top of that. I already had all the components for this build lying around the house but if I bought them 2nd hand it would take around 1 year for them to pay for themselves. That said, I could massively increase my usage of the self-hosted models since each call is effectively "free".

+ + + +

Pro: Privacy

+ + + +

A huge advantage of this approach is that you're not sending your data to an external company to be mined. The consumer version of ChatGPT that most people use is heavily data mined to improve OpenAI's models and anything that you type in may end up in their corpus. Ollama runs entirely on your machine and never sends data back to any third party company.

+ + + +

Pro: Energy Consumption and Carbon Footprint

+ + + +

Another advantage is that since Llama 3:8B is small and it runs on a single GPU it uses a lot less energy to run than an average query to ChatGPT. My Titan X card consumes about 250 watts at max load but RTX 3060 cards only require 170 watts to run. Again, I had all the components lying around so I didn't buy anything new to make this server and indeed it means I won't be throwing away components that would otherwise become e-waste.

+ + + +
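
If you're curious what your own card draws while the model is generating, nvidia-smi can report it directly. A quick sketch using its query interface (field names per nvidia-smi's documented --query-gpu options):

+ + + +
nvidia-smi --query-gpu=name,power.draw,power.limit --format=csv
+ + + +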

Con: Speed on old hardware

+ + + +

Self-hosting Llama 3 8B on a Titan X is a little slower than ChatGPT but is still perfectly serviceable. It would almost certainly be faster on RTX 3 and 4 series cards.

+ + + +

Con: Multimodal Performance

+ + + +

The biggest missing feature for me is currently multi-modal support. I use GPT-4 to do handwriting recognition and transcription for me and current gen open source models aren't quite up to this yet. However, given the superb quality of Llama 3, I have no doubt that a similarly brilliant open multi-modal model is just around the corner.

+ + + +

Con: Training Transparency

+ + + +

Although Llama 3's weights are free to download, the training corpus content is unknown. The model was built by Meta and thus is likely to have been trained on a large amount of user generated content and copyrighted content. To be fair, hosted third party models like ChatGPT are likely to be equally problematic in this regard.

+ + + +

+ + + +

Setting up Llama 3 with Ollama and OpenWebUI

+ + + +

Once you have the hardware assembled and the operating system installed, the fiddliest part is configuring Docker and Nvidia correctly.

+ + + +

Ubuntu

+ + + +

If you're on Ubuntu, you'll need to install docker first. I recommend using the guide from Docker themselves which installs the latest and greatest packages. Then follow this guide to install the nvidia runtime. Then you will want to verify that it's all set up using the checking step below.

+ + + +

Unraid

+ + + +

I actually run Unraid on my home server rather than Ubuntu. To get things running there, simply install the unraid nvidia plugin through the community apps page and make sure to stop and start docker before trying out the step below.

+ + + +

Checking the Docker and Nvidia Setup (All OSes)

+ + + +

To make sure that Docker and Nvidia are installed properly and able to talk to each other you can run:

+ + + +
 docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
+ + + +

This runs the nvidia-smi status utility which should show what your GPU is currently doing but crucially it's doing so from inside docker which means that nvidia's container runtime is all set up to pass through the nvidia drivers to whatever you're running inside your container. You should see something like this:

+ + + +
A screenshot of nvidia-smi output which shows the GPU name, how much power it is drawing, how much VRAM is in use and any processes using the card.
+ + + +

Installing Ollama

+ + + +

Create a new directory and a new empty text file called docker-compose.yml. In that file paste the following:

+ + + +
version: "3.0"
+services:
+
+  ui:
+    image: ghcr.io/open-webui/open-webui:main
+    restart: always
+    ports:
+      - 3011:8080
+    volumes:
+      - ./open-webui:/app/backend/data
+    environment:
+      # - "ENABLE_SIGNUP=false"
+      - "OLLAMA_BASE_URL=http://ollama:11434"
+
+
+  ollama:
+    image: ollama/ollama
+    restart: always
+    ports:
+      - 11434:11434
+    volumes:
+      - ./ollama:/root/.ollama
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+
+ + + +

We define the two services and we provide both with volume mounts to enable them to persist data to disk (such as models you downloaded and your chat history).

+ + + +

For now we leave ENABLE_SIGNUP commented out so that you can create your own account in the web ui. Later, we can come back, uncomment that line (which sets it to false) and restart, so that internet denizens can't sign up to use your chat.

+ + + +

Turn on Ollama

+ + + +

First we will turn on ollama and test it. Start by running docker-compose up -d ollama. (Depending on which version of docker you are running you might need to run docker compose rather than docker-compose). This will start just the ollama model server. We can interact with the model server by running an interactive chat session and downloading the model:

+ + + +
docker-compose exec ollama ollama run llama3:8b
+ + + +

In this command the first ollama refers to the container and ollama run llama3:8b is the command that will be executed inside the container. If all goes well you will see the server burst into action and download the llama3 model if this is the first time you've run it. You'll then be presented with an interactive prompt where you'll be able to chat to the model.

+ + + +
Screenshot showing the interactive prompt. I have entered hello and the model has responded
+ + + +

You can press CTRL+D to quit and move on to the next step.

+ + + +
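
If you'd rather test without an interactive session, Ollama also exposes an HTTP API on the 11434 port we mapped in the compose file. A quick hedged check against its documented generate endpoint (adjust host/port if you changed the mapping):

+ + + +
curl http://localhost:11434/api/generate -d '{
+  "model": "llama3:8b",
+  "prompt": "Why is the sky blue?",
+  "stream": false
+}'
+ + + +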

Turn on the Web UI

+ + + +

Now we will start up the web ui. Run docker-compose up -d ui. Now open up your browser and go to http://localhost:3011/ to see the web ui. You will need to register for an account and log in. After which you will be able to interact with the model like so:

+ + + +
A screenshot of the web ui. I have asked the model what noise a fox makes.
+ + + +

+ + + +

(Optional) Configure Outside Access

+ + + +

If you want to be able to chat to your models from the outside world you might want to stand up a reverse proxy to your server. If you're new to self hosting and you're not sure about how to do this, a safer option is probably to use Tailscale to build a VPN which you can use to securely connect to your home network without risking exposing your systems to the public and/or hackers.

+ + + +

+ + + +

Conclusion

+ + + +

Llama 3 is a tremendously powerful model that is useful for a whole bunch of use cases including summarisation, creative brainstorming, code copiloting and more. The quality of the responses is in line with GPT-4 and it runs on much older, smaller hardware. Self-hosting Llama 3 won't be for everyone and it's quite technically involved. However, for AI geeks like me, running my own ChatGPT clone at home for next-to-nothing was too good an experiment to miss out on.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/04/26/Can Phi3 and Llama3 Do Biology_.md b/brainsteam/content/posts/2024/04/26/Can Phi3 and Llama3 Do Biology_.md new file mode 100644 index 0000000..35f3681 --- /dev/null +++ b/brainsteam/content/posts/2024/04/26/Can Phi3 and Llama3 Do Biology_.md @@ -0,0 +1,325 @@ +--- +categories: +- AI and Machine Learning +date: '2024-04-26 13:41:42' +draft: false +tags: +- AI +- llms +- nlp +title: Can Phi3 and Llama3 Do Biology? +type: posts +--- + + +

Small Large Language Model might sound like a bit of an oxymoron. However, I think it perfectly describes the class of LLMs in the 1-10 billion parameter range like Llama and Phi 3. In the last few days, Meta and Microsoft have both released these open(ish) models that can happily run on normal hardware. Both models perform surprisingly well for their size, competing with much larger models like GPT 3.5 and Mixtral. However, how well do they generalise to new unseen tasks? Can they do biology?

+ + + +

Introducing Llama and Phi

+ + + +

Meta's offering, Llama 3 8B, is an 8 billion parameter model that can be run on a modern laptop. It performs almost as well as the Mixtral 8x22B mixture-of-experts model, a model roughly 22x bigger and far more compute intensive.

+ + + +

Microsoft's model, Phi 3 mini, is around half the size of Llama 3 8B at 3.8 billion parameters. It is small enough that it can run on a high end smartphone at a reasonable speed. Incredibly, Phi actually beats Llama 3 8B, which is twice as big, at a few popular benchmarks including MMLU, which approximately measures "how well does this model behave as a chatbot?", and HumanEval, which measures "how well can this model write code?".

+ + + +

I've also read a lot of anecdotal evidence about people chatting to these models and finding them quite engaging and useful chat partners (as opposed to previous generation small models). This seems to back up the benchmark performance and provide some validation of the models' utility.

+ + + +

Both Microsoft and Meta have stated that the key difference between these models and previous iterations of their smaller LLMs is the training regime. Interestingly, both companies applied very different training strategies. Meta trained Llama 3 on over 15 trillion tokens (words), which is unusually large for a small model. Microsoft trained Phi on much smaller training sets curated for high quality.

+ + + +

Can Phi, Llama and other Small Models Do Biology?

+ + + +

Having a model small enough to run on your phone and generate funny poems or trivia questions is neat. However, for AI and NLP practitioners, a more interesting question is "do these models generalise well to new, unseen problems?"

+ + + +

I set out to gather some data about how well Phi and Llama 3 8B generalise to a less-well-known task. As it happened, I have recently been working with my friend Dan Duma on a test harness for BioASQ Task B. This is a less widely-known, niche NLP task in the bio-medical space. The model is fed a series of snippets from scientific papers and asked a question which it must answer correctly. There are four different formats of question which I'll explain below.

+ + + +

The 11th BioASQ Task B leaderboard is somewhat dominated by GPT-4 entrants with perfect scores at some of the sub-tasks. If you were somewhat cynical, you might consider this task "solved". However, we think it's an interesting arena for testing how well smaller models are catching up to big commercial offerings.

+ + + +

BioASQ B is primarily a reading comprehension task with a slightly niche subject-matter. The models under evaluation are unlikely to have been explicitly trained to answer questions about this material. Smaller models are often quite effective at these sorts of RAG-style problems since they do not need to have internalised lots of facts and general information. In fact, in their technical report, the authors of Phi-3 mini call out the fact that their model can't retain factual information but could be augmented with search to produce reasonable results. This seemed like a perfect opportunity to test it out.

+ + + +

How The Task Works

+ + + +

There are 4 types of question in Task B: Factoid, Yes/No, List and Summary. However, since summary is quite tricky to measure, it is not part of the BioASQ leaderboard. We also chose to omit summary from our tests.

+ + + +

Each question is provided along with a set of snippets. These are full sentences or paragraphs that have been pre-extracted from scientific papers. Incidentally, that activity is BioASQ Task A and it requires a lot more moving parts since there's retrieval involved too. In Task B we are concerned with existing sets of snippets and questions only.

+ + + +

In each case the model is required to respond with a short and precise exact answer to the question. The model may optionally also provide an ideal answer which provides some rationale for that answer. The ideal answer may provide useful context for the user but is not formally evaluated as part of BioASQ.

+ + + +

Yes/No questions require an exact answer of just "yes" or "no". For List questions, we are looking for a list of named entities (for example symptoms or types of microbe). For factoid questions we are typically looking for a single named entity. Models are allowed to respond to factoids with multiple answers. Therefore, factoid answers are scored by how close to the top of the list the "correct" answer is ranked.

+ + + +

+ + + +

The Figure from the Hseuh et al 2023 Paper below illustrates this quite well:

+ + + +
Examples of different question types. Full transcriptions of each are:
+
+Yes/No
+Question: Proteomic analyses need prior knowledge of the organism complete genome. Is the complete genome of the bacteria of the genus Arthrobacter available?
+Exact Answer: yes
+Ideal Answer: Yes, the complete genome sequence of Arthrobacter (two strains) is deposited in GenBank.
+
+List
+Question: List Hemolytic Uremic Syndrome Triad.
+Exact Answer: [anaemia, thrombocytopenia, renal failure]
+Ideal Answer: Hemolytic uremic syndrome (HUS) is a clinical syndrome characterized by the triad of anaemia, thrombocytopenia, renal failure.
+
+Factoid
+Question: What enzyme is inhibited by Opicapone?
+Exact Answer: [catechol-O-methyltransferase]
+Ideal Answer: Opicapone is a novel catechol-O-methyltransferase (COMT) inhibitor to be used as adjunctive therapy in levodopa-treated patients with Parkinson's disease
+
+Summary
+Question: What kind of affinity purification would you use in order to isolate soluble lysosomal proteins?
+Ideal Answer: The rationale for purification of the soluble lysosomal proteins resides in their characteristic sugar, the mannose-6-phosphate (M6P), which allows an easy purification by affinity chromatography on immobilized M6P receptors.
Figure 1 from the Hseuh et al 2023 Paper illustrates the different task types succinctly
+ + + +

Our Setup

+ + + +

We wrote a Python script that passes the question, context and guidance about the type of question to the model. We used a patched version of Ollama that allowed us to put restrictions on the shape of the model output. This allowed us to ensure responses were valid JSON in the same shape and structure as the BioASQ examples. These forced grammars saved us loads of time trying to coax JSON out of models in the structure we want, which is something that smaller models aren't great at. Sometimes models would still fail to give valid responses. For example, sometimes they get stuck in infinite loops spitting out brackets or newlines. We gave models up to 3 chances to produce a valid JSON response before a question was marked unanswerable and skipped.
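To make that concrete, here is a rough sketch of the sort of constrained output we were aiming for. The field names below are my illustration rather than the exact BioASQ schema:

# Illustrative only: an approximation of the answer shape we forced the models to produce
forced_response = {
    "exact_answer": "yes",
    "ideal_answer": "Yes, the complete genome sequence of Arthrobacter is deposited in GenBank.",
}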

+ + + +

Prompts

+ + + +

We used exactly the same prompts for all of the models which may have left room for further performance improvements. The exact prompts and grammar constraints that we used can be found here. Snippets are concatenated together with newlines in between them and provided as "context" in the prompt template.

+ + + +

We used the official BioASQ scoring tool to evaluate the responses and produce the results below. We evaluated our pipeline on the Task 11B Golden Enriched test set. You have to create a free account at bioasq to log in and download the data.

+ + + +

Models

+ + + +

We compared quantized versions of Phi and Llama with some other similarly sized models which perform well at benchmarks.

+ + + + + + + +

Note that although Phi is approx. half the size of the other models, the authors report competitive results against much larger models for a number of widely used benchmarks, so it seems reasonable to compare it with these 7B and 8B models as opposed to only benchmarking against other 4B and smaller models.

+ + + +

+ + + +

Results

+ + + +

Yes/No Questions

+ + + +

The simplest type of BioASQ question is Yes/No. These results are measured with macro F1 to allow us to get a single metric across the performance at both "yes" and "no" questions.
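As a quick illustration of the metric (this is not our evaluation code - we used the official BioASQ scoring tool - just a toy sketch of what macro F1 means):

from sklearn.metrics import f1_score

# toy gold labels and predictions for a handful of yes/no questions
gold = ["yes", "yes", "no", "no", "yes"]
pred = ["yes", "no", "no", "no", "yes"]

# macro F1 averages the per-class F1 for "yes" and "no" equally
print(f1_score(gold, pred, labels=["yes", "no"], average="macro"))  # 0.8 for this toy example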

+ + + +
Diagram of Yes/No F1
+
+Llama3 gets 1.0
+Mistral gets 0.8
+Phi gets  0.7
+Starling gets 0.9
+Zephyr gets 0.85
+
+The bars on the chart have little range indicators because they represent the average values over 4 sets of results.
+ + + +

The results show that all 5 models perform reasonably well at this task, but Phi 3 lags behind the others a little, trailing its closest competitor by only about 10%. The best solutions to this task are coming in at 1.0 F1. Llama 3 and Starling both achieve pretty close to perfect results here.

+ + + +

+ + + +

+ + + +

Factoid Questions

+ + + +

For factoid answers we measure responses with MRR (mean reciprocal rank) since the model can return multiple possible answers. We are interested in how close the right answers are to the top of the list.
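A toy example of how MRR rewards putting the right answer near the top of the list (the real BioASQ scorer does fuzzier matching than this exact string comparison):

def reciprocal_rank(ranked_answers, gold):
    # rank of the first correct answer, or 0 if it never appears
    for i, answer in enumerate(ranked_answers, start=1):
        if answer.lower() == gold.lower():
            return 1.0 / i
    return 0.0

# hypothetical model outputs: each question gets a ranked list of candidate answers
predictions = [["catechol-O-methyltransferase", "MAO-B"], ["MAO-B", "COMT"]]
golds = ["catechol-O-methyltransferase", "COMT"]

mrr = sum(reciprocal_rank(p, g) for p, g in zip(predictions, golds)) / len(golds)
print(mrr)  # 0.75 in this toy example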

+ + + +
Factoid results
+
+Llama gets roughly 0.55 MRR
+Mistral gets roughly 0.05 MRR
+Phi 3 gets roughly 0.15 MRR
+Starling gets roughly 0.17 MRR
+Zephyr gets roughly 0.12 MRR
+
+
+The bars on the chart have little range indicators because they represent the average values over 4 sets of results.
+ + + +

This graph is a lot starker than the yes/no graph. Llama 3 outperforms its next closest neighbour by a significant margin (roughly +0.40 MRR). The best solution to this task, again a GPT-4-based entrant, weighs in at 0.6316 MRR so it's pretty impressive that Llama 3 8B is providing results in the same ballpark as a model many times larger. For this one, Phi is in third place after Starling-LM 7B. Again, given that Phi is half the size of that model, its performance is quite impressive.

+ + + +

List Questions

+ + + +

We measure list questions in F1. A false positive is when something irrelevant is included in the answer and a false negative is when something relevant is missed from an answer. F1 gives us a single statistic that balances the two.

+ + + +
Llama 3 gets roughly 0.45 F1
+Mistral gets roughly 0.21 F1
+Phi gets roughly 0.05 F1
+Starling gets roughly 0.27 F1
+Zephyr gets roughly 0.32 F1
+
+
+The bars on the chart have little range indicators because they represent the average values over 4 sets of results.
+ + + +

This one was a little surprising to me as Phi does a lot worse than any of its counterparts. We noticed that Phi produced a much higher rate of unanswerable questions than any of the other models, which may be due to the somewhat complex JSON structure required by list type questions. It may be worth re-testing with different formatting arrangements to see whether the formatting failures are masking reasonable underlying performance at the task.

+ + + +

Llama 3 8B wins again. The current best solution, again a GPT-4-based system, achieves an F1 of 0.72 so even Llama 3 8B leaves a relatively wide gap here. It would be worth testing the larger variants of Llama 3 to see how well they perform at this task and whether they are competitive with GPT-4.

+ + + +

Discussion and Conclusion

+ + + +

Llama3

+ + + +

We've seen that Llama 3 8B and, to a lesser extent, Phi 3 Mini are able to generalise reasonably well to a reading comprehension task in a field that wasn't a primary concern for either set of model authors. This isn't conclusive evidence for or against the general performance of these models on unseen tasks. However, it is certainly an interesting data point showing that Llama 3 in particular really is competitive with much larger models at this task. I wonder if that's because it was trained on such a large corpus, which may have included some biomedical content.

+ + + +

Phi

+ + + +

I'm reluctant to too-harshly critique Phi's reasoning and reading comprehension ability since there's a good chance that it was disadvantaged by our test setup and the forced JSON structure, particularly for the list task. However, the weaker performance at the yes/no questions may be a hint that it isn't quite as good at generalised reading comprehension as the competing larger models.

+ + + +

We know that Phi 3, like its predecessor, was trained on data that "consists of heavily filtered web data (according to the “educational level”) from various open internet sources, as well as synthetic LLM-generated data." However, we don't know specifically what was included or excluded. If Llama 3 went for a "cast the net wide" approach to data collection, it's likely that it was exposed to more biomedical content "by chance" and is thus better at reasoning about concepts that perhaps Phi had never seen before.

+ + + +

I do want to again call out that Phi is approximately half the size of the next biggest model in our benchmark, so its performance is quite impressive in that light.

+ + + +

Further Experiments

+ + + +

Model Size

+ + + +

I won't conjecture about whether 3.8B parameters is "too small" to generalise given the issues mentioned above, but I'd love to see some more tests of this in future. Do the larger variants of Phi (trained on the same data but simply with more parameters) suffer from the same issues?

+ + + +

Model Fine Tuning

+ + + +

The models that I've been testing are small enough that they can be fine-tuned on specific problems on a consumer-grade gaming GPU for very little cost. It seems entirely plausible to me that by fine-tuning these models on biomedical text and historical BioASQ training sets, their performance could be improved even more significantly. The challenge would be in finding the right mix of data.

+ + + +

Better Prompts

+ + + +

We did not spend a lot of time attempting to build effective prompts during this experiment. It may be that performance was left on the table because of this oversight. Smaller models are often quite fussy about prompts. It might be interesting to use a prompt optimisation framework like DSPy to be more systematic about finding better prompts.

+ + + +

Other Tasks

+ + + +

I tried these models on BioASQ, but this is light years away from conclusive evidence for whether or not these new-generation small models can generalise well. It's simply a test of whether they can do biology. It will be very interesting to try other novel tasks and see how well they perform. Watch this space!

+ + + +

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/05/01/LLMs Can_t Do Probability.md b/brainsteam/content/posts/2024/05/01/LLMs Can_t Do Probability.md new file mode 100644 index 0000000..ec800e5 --- /dev/null +++ b/brainsteam/content/posts/2024/05/01/LLMs Can_t Do Probability.md @@ -0,0 +1,184 @@ +--- +categories: +- AI and Machine Learning +date: '2024-05-01 10:13:15' +draft: false +tags: [] +title: LLMs Can't Do Probability +type: posts +--- + + +

+ + + +

I've seen a couple of recent posts where the writer mentioned asking LLMs to do something with a certain probability or a certain percentage of the time. There is a particular example that stuck in my mind which I've since lost the link to (If you're the author, please get in touch so I can link through to you):

+ + + +

The gist is that the author built a Custom GPT with educational course material and then put in the prompt that their bot should lie about 20% of the time. They then asked the students to chat to the bot and try to pick out the lies. I think this is a really interesting, lateral thinking use case since the kids are probably going to use ChatGPT anyway.

+ + + +

The thing that bothered me is that transformer-based LLMs don't know how to interpret requests for certain probabilities of outcomes. We already know that ChatGPT reflects human bias when generating random numbers. But, I decided to put it to the test with making random choices.

+ + + +

Testing Probability in LLMS

+ + + +

I prompted the models with the following:

+ + + +
+

You are a weighted random choice generator. About 80% of the time please say 'left' and about 20% of the time say 'right'. Simply reply with left or right. Do not say anything else

+
+ + + +

And I ran this 1000 times through some different models. Random chance is random (profound huh?) so we're always going to get some deviation from perfect odds but we're hoping for roughly 800 'lefts' and 200 'rights' - something in that ballpark.
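Roughly speaking, the experiment loop looked something like this - a simplified sketch using the same langchain setup as the script later in this post, not the exact code I ran:

from collections import Counter
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage

PROMPT = ("You are a weighted random choice generator. About 80% of the time please say 'left' "
          "and about 20% of the time say 'right'. Simply reply with left or right. Do not say anything else")

chat = ChatOpenAI(model="gpt-3.5-turbo")
counts = Counter()
for _ in range(1000):
    # each call is independent: the model never sees its previous answers
    r = chat.invoke(input=[SystemMessage(content=PROMPT)])
    counts[r.content.strip().lower()] += 1
print(counts)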

+ + + +

Here are the results:

+ + + +
| Model | Lefts | Rights |
| --- | --- | --- |
| GPT-4-Turbo | 999 | 1 |
| GPT-3-Turbo | 975 | 25 |
| Llama-3-8B | 1000 | 0 |
| Phi-3-3.8B | 1000 | 0 |
+ + + +

+ + + +

As you can see, LLMs seem to struggle with probability expressed in the system prompt. They almost always answer left even though we asked for that only 80% of the time. I didn't want to burn lots of $$$ asking GPT-3.5 (which did best in the first round) to reply with single word choices to silly questions, but I tried a couple of other combinations of words to see how it affects things. This time I only ran each 100 times.

+ + + +
| Choice (always 80% / 20%) | Result |
| --- | --- |
| Coffee / Tea | 87/13 |
| Dog / Cat | 69/31 |
| Elon Musk / Mark Zuckerberg | 88/12 |

Random choices from GPT-3.5-turbo
+ + + +

So what's going on here? Well, the models have their own internal weighting to do with words and phrases that is based on the training data that was used to prepare them. These weights are likely to be influencing how much attention the model pays to your request.

+ + + +

So what can we do if we want to simulate some sort of probabilistic outcome? Well we could use a Python script to randomly decide whether or not to send one of two prompts:

+ + + +
import random
+from langchain_openai import ChatOpenAI
+from langchain_core.messages import SystemMessage
+
+# 80 copies of 'prompt1' and 20 copies of 'prompt2' gives us the 80/20 odds we want
+choices = (['prompt1'] * 80) + (['prompt2'] * 20)
+
+# we should now have a list of 100 possible values - 80 are prompt1, 20 are prompt2
+assert len(choices) == 100
+
+chat = ChatOpenAI(model="gpt-3.5-turbo")
+
+# randomly pick from choices - the LLM never has to reason about probability itself
+if random.choice(choices) == 'prompt1':
+    r = chat.invoke(input=[SystemMessage(content="Always say left and nothing else.")])
+else:
+    r = chat.invoke(input=[SystemMessage(content="Always say right and nothing else.")])
+ + + +

Conclusion

+ + + +

How does this help non-technical people who want to do these sorts of use cases or build Custom GPTs that reply with certain responses? Well it kind of doesn't. I guess a technical-enough user could build a CustomGPT that uses function calling to decide how it should answer a question for a "spot the misinformation" pop quiz type use case.

+ + + +

However, my broad advice here is that you should be very wary of asking LLMs to behave with a certain likelihood unless you're able to control that likelihood externally (via a script).

+ + + +

What could I have done better here? I could have tried a few more different words, different distributions (instead of 80/20) and maybe some keywords like "sometimes" or "occasionally".

+ + + +
+ + + +

Update 2024-05-02: Probability and Chat Sessions

+ + + +

Some of the feedback I received about this work asked why I didn't test multi-turn chat sessions as part of my experiments. Some folks hypothesise that the model will always start with one or the other token unless the temperature is really high. My original experiment does not give the LLM access to its own historical predictions so that it can see how it behaved previously.

+ + + +

With true random number generation you wouldn't expect the function to require a list of historical numbers so that it can adjust its next answer (although if we're getting super hair-splitty I should probably point out that pseudo-random number generation does depend on a historical 'seed' value).

+ + + +

The point of this article is that LLMs definitely are not doing true random number generation so it is interesting to see how conversation context affects behaviour.

+ + + +

I ran a couple of additional experiments. I started with the prompt above and instead of making single API calls to the LLM I start a chat session where each turn I simply say "Another please". It looks a bit like this:

+ + + +

+ + + +
+

System: You are a weighted random choice generator. About 80% of the time please say ‘left’ and about 20% of the time say ‘right’. Simply reply with left or right. Do not say anything else

Bot: left

Human: Another please

Bot: left

Human: Another please

+
+ + + +
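Programmatically, the chat-session version of the experiment is a small extension of the earlier script - something along these lines (a sketch, not the exact code I ran):

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

chat = ChatOpenAI(model="gpt-3.5-turbo")
messages = [SystemMessage(content="You are a weighted random choice generator. About 80% of the time "
                                  "please say 'left' and about 20% of the time say 'right'. "
                                  "Simply reply with left or right. Do not say anything else")]
answers = []
for _ in range(100):
    reply = chat.invoke(input=messages)
    answers.append(reply.content.strip().lower())
    messages.append(reply)                           # keep the model's own history in context
    messages.append(HumanMessage(content="Another please"))

print(answers.count("left"), answers.count("right"))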

+ + + +

I ran this once per model for 100 turns and also 10 times per model for 10 turns.

+ + + +

+ + + +

NB: I excluded Phi from both of these experiments as in both test cases, it ignored my prompt to reply with one word and started jibbering.

+ + + +

100 Turns Per Model

+ + + +
| Model | # Left | # Right |
| --- | --- | --- |
| GPT 3.5 Turbo | 49 | 51 |
| GPT 4 Turbo | 95 | 5 |
| Llama 3 8B | 98 | 2 |
+ + + +

+ + + +

10 Turns, 10 Times Per Model

+ + + +
| Model | # Left | # Right |
| --- | --- | --- |
| GPT 3.5 Turbo | 61 | 39 |
| GPT 4 Turbo | 86 | 14 |
| Llama 3 8B | 71 | 29 |
+ + + +

+ + + +

Interestingly the series of 10 shorter conversations gets us closest to the desired probabilities that we were looking for but all scenarios still yield results inconsistent with the ask from the prompt.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/05/07/LLMs_ To Fine-Tune or Not to Fine-Tune_.md b/brainsteam/content/posts/2024/05/07/LLMs_ To Fine-Tune or Not to Fine-Tune_.md new file mode 100644 index 0000000..a252fae --- /dev/null +++ b/brainsteam/content/posts/2024/05/07/LLMs_ To Fine-Tune or Not to Fine-Tune_.md @@ -0,0 +1,92 @@ +--- +categories: +- AI and Machine Learning +- Data Science +date: '2024-05-07 11:19:07' +draft: false +tags: +- AI +- llms +- nlp +title: 'LLMs: To Fine-Tune or Not to Fine-Tune?' +type: posts +--- + + +

Knowing when to fine-tune LLMs and when to use an off-the-shelf model is a tricky question. New research can help shed light on when each approach makes more sense and how to eke more performance out of off-the-shelf models without fine-tuning them.

+ + + +

+ + + +

When Fine-Tuning Beats GPT-4

+ + + +

Modern LLMs show impressive performance at a range of tasks out of the box. Even small models like the recent Llama 3 8B show excellent performance at unseen tasks. However, a recent preprint from the research team at Predibase shows that small models can match and even out-perform GPT-4 when they are fine-tuned for specific tasks. Figure 5 from the paper shows a list of the tasks that were evaluated and the relative performance difference vs GPT-4.

+ + + +
Figure 5: Performance lift from the best fine-tuned LLM over 1) the best base model (<=
+7B) (in blue) and GPT-4 (in red) across 31 tasks, in absolute points.
Figure 5 from Zhao et al. 2024
+ + + +

The authors note that fine-tuned models consistently outperform GPT-4 at specific, narrowly-scoped tasks like classification and information extraction. GPT-4 came out winning where tasks are more complex and open ended like generating code and multi-lingual language understanding.

+ + + +

Fine-tuning these small models is cheaper and easier than ever before. Llama 3 can be fine-tuned on consumer GPUs and, once tuned, the models can be run on modest systems with ease. All of this comes with one relatively large caveat: you need techies with at least a veneer of machine learning and MLOps expertise to do it. If you have a team with DevOps/MLOps capability you may find that this option works really well for you.

+ + + +

Interestingly, the paper came out just after Llama 3 and Phi 3. Therefore these models are not included in the evaluation but may offer even better fine-tuned performance than their previous-generation counterparts.

+ + + +

Is ICL a Good Alternative to Fine-tuning?

+ + + +

Another preprint published 1 day later by researchers at CMU shows that you can get performance competitive with fine-tuned models with in-context-learning (ICL). That is to say, providing a lot of example input and output pairs in your prompt can give the same sort of performance boost as fine-tuning a model. If that's the case then why bother fine-tuning? Well including huge amounts of context with every inference is going to significantly drive up API costs. It's also going to mean that each model call takes much longer. As the authors of the paper say:

+ + + +
+

...Finetuning has higher upfront cost but allows for reduced inference-time cost... finetuning on 4096 examples may still be preferable to prompting with 1000 if efficiency of inference is a major priority...

+Bertsch et al, 2024
+ + + +

This may be an acceptable tradeoff if you are operating offline (e.g. your user is not actively waiting for a response). For example, using models to classify, summarise or extract information. However, this could be a bit of a UX vibe-killer in chat applications.
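To make the in-context-learning idea concrete: it simply means packing labelled examples into the prompt itself. A minimal sketch, where the task and examples are made up for illustration rather than taken from the paper:

# Build a few-shot prompt from labelled examples (toy sentiment task, illustrative only)
examples = [
    ("The battery died after two days.", "negative"),
    ("Setup took thirty seconds, brilliant.", "positive"),
]

def build_prompt(examples, new_input):
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

print(build_prompt(examples, "It broke within a week."))

In the ICL setting described by the paper, you would scale this up to hundreds or thousands of examples in the context window, which is exactly where the inference-time cost comes from.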

+ + + +

DSPy: A Wildcard

+ + + +

Another alternative related to in-context-learning is meta-learning. Meta-learning uses frameworks like DSPy to automatically find the best prompt for your task. It does this by systematically searching your dataset for example inputs and outputs and measuring performance. This may help you to find a compromise between context length and model performance automatically. DSPy will attempt to find the most educational training samples from your dataset whilst keeping your context short. DSPy is a little involved to get started. However, once you've got it plugged into your framework you can automatically have DSPy optimise your prompts for different models. This makes switching between API providers a doddle.

+ + + +

+ + + +

Conclusion

+ + + +

Both fine-tuning and providing a large number of examples in the prompt are reasonable ways to significantly boost the out-of-the-box performance of models at specific tasks. If you have a data science team in-house, a very large training dataset or require very fast inference, fine-tuning may be the way to go. Alternatively if you want to boost your model's performance without much up-front investment, it may be worth playing with in-context-learning or meta-prompting to try to improve things. However, expect slower inference speeds and higher API bills if you adopt this approach at scale.

+ + + +

Of course all of these approaches require you to have a well defined task, a way to measure performance at said task and some example inputs and outputs. If you're just starting out and you haven't yet built much of a dataset then zero or few-shot learning (where you describe the task in the prompt and optionally provide a handful of input/output examples) is the way to go.

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/07/08/Ditch that ChatGPT Subscription_ Moving to Pay-as-you-Go AI usage with Open Web UI.md b/brainsteam/content/posts/2024/07/08/Ditch that ChatGPT Subscription_ Moving to Pay-as-you-Go AI usage with Open Web UI.md new file mode 100644 index 0000000..d8bc8cb --- /dev/null +++ b/brainsteam/content/posts/2024/07/08/Ditch that ChatGPT Subscription_ Moving to Pay-as-you-Go AI usage with Open Web UI.md @@ -0,0 +1,464 @@ +--- +categories: +- AI and Machine Learning +- Software Development +date: '2024-07-08 10:09:36' +draft: false +tags: +- AI +- docker +- llms +- nlp +- self-hosting +title: 'Ditch that ChatGPT Subscription: Moving to Pay-as-you-Go AI usage with Open + Web UI' +type: posts +--- + + +

Introduction

+ + + +

In the world of AI assistants, subscription services like ChatGPT Plus, Claude Pro and Google One AI have become increasingly popular amongst knowledge workers. However, these subscription services may not be the most cost-effective or flexible solution for everyone, and the not-insignificant fees encourage users to stick to the one model that they've already paid for rather than trying out different options.

+ + + +

I previously wrote about how you can self-host Llama3 on a machine with an older graphics card using open source tools. In this post, I demonstrate how to expand this setup to allow you to interact with OpenAI, Anthropic, Google and others alongside your local models via a single, self-hosted UI.

+ + + +

Why ditch my subscription?

+ + + +

Hosting your own AI user interface allows for more granular cost control, potentially saving money for lighter users of commercial models (are you really getting $20/mo of usage out of ChatGPT?). It grants you greater flexibility in choosing and comparing different AI models without worrying about which subscription to spend your $20 on this month or forking out for more than one at a time. Additionally, commercial AI providers usually provide a more stable experience via their APIs since business users, the target audience of the API offerings, tend to have more leverage than consumers when it comes to service-level agreements. API terms & conditions around data usage are normally better too (within ChatGPT, data collection is opt-out and you lose functionality when you opt out).

+ + + +

Self-hosting your chat interface with Open Web UI brings enhanced privacy through a local UI, the capability to run prompts through multiple models simultaneously, and the freedom to use different models for different tasks without being locked into a single provider. You can build your own bespoke catalogue of commercial and local models (like Llama 3) and access them and their outputs all from one place. If you're already familiar with docker and docker-compose you can have something up and running in 15 minutes.

+ + + +

When might this not work for me?

+ + + +

If you are a particularly heavy user of the service that you're subscribed to and you're not keen on trying other models you may find that moving to pay-as-you-go is more expensive for you. The average length of a fiction book is something like 60-90k words. It costs about $5 to have GPT-4o read 16 books (input/prompt) and $15 to have it write 16 books worth of output. If you're spending all day every day chatting to Jippy then you might find that you end up spending more than the $20/mo on API usage.

+ + + +

If cost is your primary driver, you should factor in the cost of hosting your server too. If you are already running a homelab or a cheap VPS you might be able to run the web UI at "no extra charge" but if you need to spin up a new server just for hosting your web UI that's going to cut into your subscription savings.

+ + + +

If privacy is your primary driver, you should be aware that this approach still involves sending data to third parties. If you want to avoid that all together you'll need to go fully self hosted.

+ + + +

Finally, fair warning: this process is technical, and you'll need to be familiar with (or willing to learn about) Docker, YAML config files and APIs.

+ + + +

+ + + +

The Setup

+ + + +

This setup builds on my previous post about Llama 3. Previously we used Open Web UI to provide a web app that allows you to talk to the LLM and we used Ollama to host Llama 3 locally and provide a system that Open Web UI could send requests to.

+ + + +
The image is a flowchart illustrating a network architecture for handling both self-hosted and commercial Large Language Model (LLM) traffic. The components are as follows:
+
+
+    Web Traffic (Open Internet): Represented by an arrow entering the local network boundary.
+
+    Caddy: A web server handling incoming web traffic from the internet with arrows pointing to
+ + + +

This setup has five main components:

+ + + +
1. Ollama - a system for running local language models on your computer, using your GPU or Apple silicon to provide reasonable response speeds
2. Open Web UI, an open source frontend for chatting with LLMs that supports Ollama but also any old OpenAI-compatible API endpoint.
3. LiteLLM Proxy which allows us to input our API keys and provides a single service that Open Web UI can call to access a whole bunch of commercial AI models
4. PostgreSQL - a database server which LiteLLM will use to store data about API usage and costs.
5. Caddy - used for reverse proxying HTTP traffic from the open web and routing it to Open Web UI or LiteLLM.
+ + + +

The resulting docker-compose file will look something like this:

+ + + +
version: "3.0"
+services:
+
+  ui:
+    image: ghcr.io/open-webui/open-webui:main
+    restart: always
+    ports:
+      - 8080:8080
+    volumes:
+      - ./open-webui:/app/backend/data
+    environment:
+      # after you have created your account, uncomment the next line to disable public signups
+      # - "ENABLE_SIGNUP=false"
+      # if you disable ollama then remove the following line
+      - "OLLAMA_BASE_URL=http://ollama:11434"
+      # if you disable ollama then set the following to false
+      - "ENABLE_OLLAMA_API=true"
+
+  db:
+    image: postgres:15
+    restart: always
+    # set shared memory limit when using docker-compose
+    shm_size: 128mb
+    volumes:
+    - ./data/postgres:/var/lib/postgresql/data
+    environment:
+      POSTGRES_PASSWORD: somesecretpw
+      POSTGRES_DB: litellm
+      PGDATA: /var/lib/postgresql/data/pgdata
+
+  litellm:
+    image: ghcr.io/berriai/litellm:main-latest
+    restart: always
+    depends_on:
+      - db
+    ports:
+       - 4000:4000
+    volumes:
+     - ./litellm/config.yaml:/app/config.yaml
+    command: --port 4000 --config /app/config.yaml
+    env_file: .env.docker
+    environment:
+      - DATABASE_URL=postgresql://postgres:somesecretpw@db:5432/litellm
+
+  caddy:
+    image: caddy:2.7
+    restart: unless-stopped
+    ports:
+      - "80:80"
+      - "443:443"
+      - "443:443/udp"
+    volumes:
+      - ./caddy/Caddyfile:/etc/caddy/Caddyfile
+      - ./caddy/data:/data
+      - ./caddy/config:/config
+
+  # everything below this line is optional and can be commented out
+  ollama:
+    image: ollama/ollama
+    restart: always
+    ports:
+      - 11434:11434
+    volumes:
+      - ./ollama:/root/.ollama
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+
+ + + +

Note that Ollama is optional in this setup and we can deploy without it. That might be helpful if you want to take advantage of being able to switch between different commercial APIs but don't want to run local models (or perhaps don't have the hardware for it). If you want to turn off Ollama, comment it out or remove it. You'll also need to remove the OLLAMA_BASE_URL line from the ui service's environment and set ENABLE_OLLAMA_API to false, as per the comments in the compose file above.

+ + + +

We also need some support files.

+ + + +

Caddyfile

+ + + +

The Caddyfile is used to route incoming web traffic to your services. Edit the file caddy/Caddyfile. Assuming you have set up DNS as required, you can do something like this:

+ + + +
chat.example.com {
+    reverse_proxy ui:8080
+}
+
+api.example.com {
+    reverse_proxy litellm:4000
+}
+
+ + + +

Secrets in .env.docker

+ + + +

We need to create a docker env file which contains our API keys for the services we want to use.

+ + + +

Here we also define a "master" API key which we can use to authenticate Open Web UI against LiteLLM and also to log in to LiteLLM and see the API call stats.

+ + + +
ANTHROPIC_API_KEY=sk-blah
+OPENAI_API_KEY=sk-blablabla
+GROQ_API_KEY=gsk-blah
+
+LITELLM_MASTER_KEY=sk-somesecretvalue
+ + + +

LiteLLM Config

+ + + +

We need to create a very basic config.yaml file which LiteLLM will read to tell it which external models you want to allow users of your web ui to access.

+ + + +
model_list:
+  - model_name: claude-3-opus ### RECEIVED MODEL NAME ###
+    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
+      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
+      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("AZURE_API_KEY_EU")
+  - model_name: gpt-4o
+    litellm_params:
+      model: openai/gpt-4o
+      api_key: "os.environ/OPENAI_API_KEY"
+  - model_name: groq-llama-70b
+    litellm_params:
+      api_base: https://api.groq.com/openai/v1
+      api_key: "os.environ/GROQ_API_KEY"
+      model: openai/llama3-70b-8192
+ + + +

In the example above we tell LiteLLM that we want to connect to Anthropic, OpenAI and Groq and we will allow access to Claude Opus, GPT-4o and Llama 70B respectively. The api_key directives tell LiteLLM to grab the values from named environment variable os.environ/ENV_VAR_NAME so we can customise our env vars to whatever makes sense.

+ + + +

LiteLLM uses the prefixes on model names to know which API it needs to use e.g. it sees claude and knows to use the Anthropic API client. We can also use openai/ to instruct LiteLLM to load any model that supports OpenAI-compatible endpoints out of the box.

+ + + +

First Run Setup

+ + + +

Start Docker

+ + + +

Ok now that we've created all the config files we can start our docker-compose cluster. You can run docker-compose up -d to bring all the services online.

+ + + +

+ + + +

Log in to LiteLLM

+ + + +

Let's start by testing that our LiteLLM setup works. You should be able to navigate to LiteLLM via the Caddy subdomain (e.g. https://api.example.com) or via http://localhost:4000/ui to get to the LiteLLM UI. If you're not running on your current machine you can also use your LAN IP address instead. You'll need to enter admin as the username and whatever value you used for LITELLM_MASTER_KEY above as the password (including sk-). If all goes well you should see a list of API keys which initially only contains your master key:

+ + + +
Screenshot of the LiteLLM dashboard interface. The main section displays information for the
+ + + +

If you navigate to the "Models" tab you should also see the models that you enabled in your config.yaml and if all has gone well, pricing information should have been pulled through from the API

+ + + +
Screenshot of the LiteLLM dashboard's
+ + + +

Create Open Web UI Account

+ + + +

Next we need to create our Open Web UI account. Go to your chat subdomain https://chat.example.com/ or http://localhost:8080 and follow the registration wizard. Open Web UI will automatically treat the first ever user to sign up as the administrator.

+ + + +

Connect Open Web UI to LiteLLM

+ + + +

Once we're signed in we need to connect Open Web UI to LiteLLM so that it can use the models. Click on your profile picture > Admin Panel > Settings and open the Connections tab:

+ + + +
Screenshot of the Admin Panel interface with a dark theme. There are two primary tabs at the top:
+ + + +

Since everything is running inside docker-compose we can address litellm (and Ollama if you enabled it) using their container names. In the OpenAI API box enter the path to your litellm API endpoint - should be http://litellm:4000/v1 and in the API key box enter your master password from LITELLM_MASTER_KEY. Click the button to test the connection and hopefully you'll get a green toast notification to say that it worked.

+ + + +

Ollama should already be hooked up if you turned it on. If you didn't enable Ollama but want to now, make sure the container is started (docker-compose up -d ollama) and then enter http://ollama:11434 in the URL field.

+ + + +

Question: If Open Web UI lets us connect multiple OpenAI endpoints, why do we need LiteLLM?

+ + + +

Open Web UI won't talk directly to models that don't use OpenAI compatible endpoints (e.g. Anthropic Claude or Google Gemini). LiteLLM also lets you be specific about which models you want to pass through to OWUI. For example, you can hide the more expensive ones so that you don't burn credits too quickly. Finally, LiteLLM gives you centralised usage/cost analytics which saves you opening all of your providers' API consoles and manually tallying up your totals.

+ + + +

Testing it Out

+ + + +

Congratulations if you got this far, you now have a working self-hosted chat proxy system. Time to try out some prompts - use the drop down menu at the top of the window to select the model you want to use.

+ + + +
Screenshot of a coding chat interface where the user requests a Python script with CLI options using the `click` module. The response provides a detailed guide, including installation instructions (`pip install click`) and a code snippet for processing an input file and saving the result to an output file. The left sidebar displays recent activity and categories, including today's and previous days' items like
+ + + +

You can race/compare models using the +/- buttons in the top left to have them all read the current prompt and respond in parallel.

+ + + +
Screenshot of a chat interface with three dropdown options:
+ + + +

You can even use the multi-modal features of the model from this view by uploading images and documents. Open Web UI will automatically pass them through as required.

+ + + +
Screenshot of a chat interface where the user uploads an image of a recipe and requests transcription of the ingredients into a Markdown list. The response from
+ + + +

+ + + +

Monitoring Usage

+ + + +

Once we've made some API calls we can log back into litellm and go to "Usage" to see how much it's cost us.

+ + + +
A bar chart breaking down spend by model in litellm's web ui
+ + + +

+ + + +

Final Thoughts

+ + + +

Congratulations, now you've got a fully self-hosted AI chat interface set up. You can load up your API keys with a few $ each and track how much you are spending from LiteLLM's control panel.

+ + + +

If you enabled ollama, you can stick to cheap/self-hosted models by default and switch to a more powerful commercial model for specific use cases.

+ + + +

If you found this tutorial useful, please consider subscribing to my RSS feed, my newsletter on medium or following me on mastodon or the fediverse:

+ + + \ No newline at end of file diff --git a/brainsteam/content/posts/2024/07/21/A Personal Experiment with Coffee_ Walking and Anxiety.md b/brainsteam/content/posts/2024/07/21/A Personal Experiment with Coffee_ Walking and Anxiety.md new file mode 100644 index 0000000..57520b7 --- /dev/null +++ b/brainsteam/content/posts/2024/07/21/A Personal Experiment with Coffee_ Walking and Anxiety.md @@ -0,0 +1,75 @@ +--- +categories: +- Personal +date: '2024-07-21 11:19:04' +draft: false +tags: +- anxiety +- coffee +title: A Personal Experiment with Coffee, Walking and Anxiety +type: posts +--- + + +

I noticed that in the last few weeks I've been pretty anxious. More so than my normal background level of anxiety. I realised that I've been drinking a lot more heavily caffeinated iced coffees and pepsi max recently. What's more, I was thinking back to a period earlier in the year when I was less anxious and remembered that it coincided with a health kick where Mrs R and I cut back a lot on Pepsi/fizzy drinks. I put two and two together and started thinking that caffeine is probably exacerbating my anxiety. This led me to try experimenting with coffee, walking and anxiety.

+ + + +

Caffeine and Walking vs Anxiety

+ + + +

A recent study found that 5 cups of coffee a day can induce panic attacks in patients prone to them. It also increases the background anxiety of people without a panic disorder too. That's about 400-500 mg caffeine. I worked out that if I drank 2-3 coffees a day plus 2-3 300ml cans of Pepsi, I'd probably be exceeding that range.

+ + + +

Also, through Doug Belshaw's blog I found a podcast episode by Dr Rangan Chaterjee recommending that people use the summer months to do a little detox. He suggests going for a daily 10-15 minute walk as soon as you wake up. Preferably without your phone and before you've had any tea or coffee. During the COVID lockdowns Mrs R and I were doing daily walks morning and evening. We found that it helped to keep us sane and to provide a transition between work and personal time whilst working from home 100% of the time. Thinking about this also felt like a minor revelation.

+ + + +

Personal Experiment

+ + + +

It felt like it was time for a personal experiment to see if I could reduce my anxiety. I'm trying out reducing my caffeine intake and doing daily walks to see if it helps my anxiety. I guess I could try doing one without the other to try to isolate the effect of each. On the other hand, both of these things have other advantages. Giving up diet pepsi and doing more exercise are both good for my general health. Therefore, I'm not too bothered as ideally I'd like to keep doing both anyway.

+ + + +

Since Monday 15th July I have been consuming a maximum of 3 caffeinated drinks per day. That's normally 1 coffee and 1-2 cans of pepsi a day. I've replaced my 2nd and 3rd coffee with decaf. I've also swapped out my pepsi for water or lemonade as a treat. I am also doing a 10-15 minute walk around the block as soon as I hop out of bed. I plan to try to do this every day for at least a month to see how it affects me.

+ + + +

Early Results

+ + + +

Monday morning was particularly bad. I had an intense dose of the sunday scaries and I had drunk lots of pepsi max the night before. I told Mrs R I might need to take a mental health day and nearly gave up on my experiment on day 1. However, she encouraged me to go on a walk and it immediately took the edge off my anxiety. I had forgotten how pretty everything is first thing in the morning, bathed in golden sunshine with birds tweeting. The air smelt clean and refreshing. I took a single morning coffee and, as the day wore on and my sunday scaries subsided, I felt more relaxed. By about 4pm I had a really bad headache and felt tired, which are classic caffeine withdrawal symptoms.

+ + + +

As the week has continued I've felt pretty good. I'm still getting anxious thoughts and having little worry episodes but they don't feel as acute or scary at the moment. I described this feeling to Mrs R as "they are lurking but they can't hurt me any more". It's only been a few days but things are feeling a lot more manageable and the overwhelmed feeling I get when I have a lot on at work has subsided a little.

+ + + +

Today (Sunday) I am feeling a little more anxious. I normally get the sunday scaries and start to think about all the emails waiting for me in my work inbox and all the stuff I want to achieve so that's pretty typical. What is unusual is that I am not feeling paralysed by my sunday scaries today. Quite often the feeling gets in my way and prevents me from doing or achieving much at all on a sunday but today it just feels like a bit of a niggle that's in the background if I stop what I'm doing. I really enjoyed my morning walk again today because the sun was out and I was up before the rest of the neighbourhood and it was lovely and peaceful.

+ + + +

Discussions Online

+ + + +

I posted about my experiment on Mastodon and found that a few others had similar experiences. Glyn chuckled when his timeline placed my post about cutting down coffee just below another tongue-in-cheek post about how doing that is a great way to reduce joy in your life. Obviously this is a pretty common thing.

+ + + +

Conclusion

+ + + +

So early signs are quite positive. I don't know if it's a placebo effect that will wear off or a combination of both the reduction in caffeine and the walking, but I am feeling a fair bit better. I am trying to be realistic about the fact that anxiety won't completely go away like some kind of magical spell (and my experience today has confirmed that). However, even just having my anxieties numbed and "unable to hurt me" is a massive win.

+ + + +

Part of me is quite annoyed that I didn't try this sooner. I've been regularly drinking relatively high volumes of coffee for about 15 years and I'd say my anxiety probably started around the same sort of time. I really doubt that's a coincidence and it's such a smoking gun when I look back on it. However, hindsight is always 20:20 as they say.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/07/29/GenAI and the Trough of Disillusionment.md b/brainsteam/content/posts/2024/07/29/GenAI and the Trough of Disillusionment.md new file mode 100644 index 0000000..3277330 --- /dev/null +++ b/brainsteam/content/posts/2024/07/29/GenAI and the Trough of Disillusionment.md @@ -0,0 +1,38 @@ +--- +categories: +- AI and Machine Learning +date: '2024-07-29 21:18:00' +draft: false +tags: +- linkblog +title: GenAI and the Trough of Disillusionment +type: posts +--- + + +

There have been a bunch of recent stories about the GenAI hype not panning out so well.

+ + + +

Likes The Game Theory of AI CapEx by David Cahn.

+

A story from Sequoia Capital, who have a lot of money riding on AI investments, talks about the power dynamics at play between large cloud providers. They're all buying a bunch of GPUs because they're worried their competitors will beat them to it - a very rational and normal reason.

+
+ + + +

Likes Investors Are Suddenly Getting Very Concerned That AI Isn't Making Any Serious Money.

+

Big investors like Goldman Sachs are starting to wonder where the magical human level intelligence that they were promised by OpenAI and friends is. Early stage investment has significantly dried up

+
+ + + +

Likes How Does OpenAI Survive?.

+

Ed Zitron does some napkin maths that suggests OpenAI might struggle to stick around much longer without significant breakthroughs - they will have to keep raising and in this climate that might be hard, even for Sam Altman.

+
+ + + +

Likes Websites are Blocking the Wrong AI Scrapers (Because AI Companies Keep Making New Ones).

+

Meanwhile LLM companies continue to steadily tank their reputation by changing their user agent strings (which are used to identify their bots when they scrape your website) on a regular basis so that it's hard to know what to block.

+
+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/02/Watched_ Long Legs.md b/brainsteam/content/posts/2024/08/02/Watched_ Long Legs.md new file mode 100644 index 0000000..cdd0e6e --- /dev/null +++ b/brainsteam/content/posts/2024/08/02/Watched_ Long Legs.md @@ -0,0 +1,20 @@ +--- +categories: +- Personal +date: '2024-08-02 23:45:03' +draft: false +tags: +- movies +title: 'Watched: Long Legs' +type: posts +--- + + +

Just returned from seeing Long Legs in the cinema. I went into this movie pretty much blind, not knowing much except that Nicolas Cage was involved. It was a really interesting thriller and Cage was an exceptionally creepy serial killer. It was set in the 90s and follows a young FBI agent on the case of a killer. There were some real X-Files vibes. Quite an enjoyable watch.

+ + + +
+https://m.youtube.com/watch?v=FXOtkvx25gI +
+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/03/Migrating from Linear to Jira.md b/brainsteam/content/posts/2024/08/03/Migrating from Linear to Jira.md new file mode 100644 index 0000000..d28c377 --- /dev/null +++ b/brainsteam/content/posts/2024/08/03/Migrating from Linear to Jira.md @@ -0,0 +1,213 @@ +--- +categories: +- Uncategorised +date: '2024-08-03 10:11:37' +draft: true +tags: [] +title: Migrating from Linear to Jira +type: posts +--- + + +

Our company recently decided to move from Linear to Jira because we wanted a bunch of the fancier stuff that Jira offers around workflow management and resource burndown. The migration process has been a little painful but not too overwhelming. I wrote up some of my notes and built some Python scripts to facilitate the process so I thought I'd share them here.

+ + + +

Our company uses something like an agile Scrum methodology with a bi-monthly release cycle.

+ + + +

Mapping Linear to Jira Data Structures

+ + + +

Linear has Teams, each team can have multiple projects and then within projects we have stories, tasks and bugs. A project in Linear is typically a time-bound initiative over a number of releases that contains a set of stories and bugs. Stories contain tasks and sub-tasks

+ + + +

Jira's top level data structure is a Project, within a project you can have an Epic which is a time-bound initiative over a number of releases and can contain a number of stories and bugs. Stories contain tasks and sub-tasks.

+ + + +

So with this in mind, we mapped the data structures into something like the following:

+ + + +
| Linear Data Structure | Jira Data Structure |
| --- | --- |
| Team | [Jira] Project |
| [Linear] Project | Epic |
| Story | Story |
| Bug | Bug |
| Task | Task |
| Sub-Task | Sub-Task |

A mapping of Linear data structures to Jira data structures
+ + + +

There's a pretty straightforward 1:1 mapping of data from one system to the other. Not bad.

+ + + +

Linear has the concept of Cycles for timekeeping whereas Jira calls them Sprints. We actually decided not to try to map cycles and instead manually sorted that out after the migration.

+ + + +

Getting Data Out of Linear into Jira

+ + + +

Linear offers a CSV export function which is easy enough to use. However, not everything gets exported by this tool - notably, project metadata, comments on issues and attachments are missing. I have provided some support tools that we can use to get those files out.

+ + + +

Getting the Main Tickets Out of Linear

+ + + +

In the Workspace Settings menu, go to Import/Export and scroll down to the Export CSV button. Linear will generate a CSV of your workspace in the background and email it to you.

+ + + +

Upon downloading the CSV file, we can see all of the tickets from all teams in your workspace.

+ + + +

Getting Project Metadata out of Linear

+ + + +

The projects are not represented by rows in the CSV but instead each item has a Project [Name] and Project ID associated with them. What if we want more metadata about each project? For example, the name, description, start date and status. Well I wrote a small script called export_projects.py that fetches all of that information from Linear's GraphQL API and stores it in a separate CSV. You can find it in the git repo.

+ + + +

Use it by running:

+ + + +
export LINEAR_API_KEY=
+python export_projects.py projects.csv
+ + + +

You can find out more about how to get a linear API key here. The generated CSV file projects.csv should contain a set of all of your linear projects and associated metadata.
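For the curious, the script boils down to a GraphQL query against Linear's API along these lines. This is a simplified sketch: the exact fields requested may differ from what export_projects.py actually pulls, so treat the field names as illustrative.

import os
import requests

# Fetch basic metadata for every project in the workspace (field names are illustrative)
QUERY = """
query {
  projects {
    nodes { id name description state startDate }
  }
}
"""

resp = requests.post(
    "https://api.linear.app/graphql",
    json={"query": QUERY},
    headers={"Authorization": os.environ["LINEAR_API_KEY"]},
)
resp.raise_for_status()
for project in resp.json()["data"]["projects"]["nodes"]:
    print(project["id"], project["name"])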

+ + + +

Getting Comments out of Linear

+ + + +

Another shortcoming of the main linear CSV export is that it does not include comments from tickets, only the main ticket body. I've provided a script called export_comments.py which, like the projects script above, will grab all comments directly from the API. Use it like this:

+ + + +
export LINEAR_API_KEY=
+python export_comments.py comments.csv
+ + + +

The generated comments.csv should contain all the comments from your projects and the IDs of the parent thread that they address.

+ + + +

+ + + +

Getting Attachments out of Linear

+ + + +

I've also provided a script for exporting attachments - specifically images - from Linear. This script goes through the CSV file provided by Linear looking for URLs that appear to be attachments. It will then download each of them in turn and place them in a local directory so that we can re-upload them to Jira. This step is necessary because the JIRA import worker cannot directly access images on the Linear server.

+ + + +

With the csv file you exported from Linear you should be able to run:

+ + + +
export LINEAR_API_KEY=
+python export_images.py --issues-file ./linear_export.csv --output-dir ./images
+ + + +

This script will produce a new directory called images which contains all the attachments and a file called index.csv which essentially holds a catalogue of all the images, what their old URL was in Linear and what their local filename is.
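The core of the approach is roughly the following. This is a simplified sketch rather than the real script: the uploads.linear.app URL pattern and the auth header are assumptions, and the real script also writes the index.csv catalogue.

import csv
import os
import re
import requests

# Scan the Linear CSV export for attachment-looking URLs and download them locally
URL_PATTERN = re.compile(r"https://uploads\.linear\.app/\S+")  # assumed attachment host
os.makedirs("images", exist_ok=True)

with open("linear_export.csv", newline="") as f:
    text = f.read()

for i, url in enumerate(sorted(set(URL_PATTERN.findall(text)))):
    # auth header assumed to be required for private attachments
    resp = requests.get(url, headers={"Authorization": os.environ["LINEAR_API_KEY"]})
    if resp.ok:
        with open(f"images/attachment_{i}", "wb") as out:
            out.write(resp.content)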

+ + + +

So by this point you should have the CSV you exported directly from Linear, the projects and comments CSVs created by the scripts above, plus the images directory and its index.csv.

+ + + +

Converting Markdown to Atlassian Wiki Markup

+ + + +

Linear uses Markdown for rich descriptions but Atlassian uses their own wiki markup (because y'know, standards). We need to auto-convert the item descriptions from Linear to Atlassian format - for the main Linear export, the projects and the comments.

+ + + +

When I did a bit of googling for how to go about this, I found an example that uses the mistletoe Markdown library to parse Markdown and emit Jira wiki markup. I've created a lightweight command-line script called markdown2jira.py that will do this job for you.
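
The core of the conversion is only a couple of lines thanks to mistletoe. Here's a hedged sketch of what the script does per cell - note that the renderer's import path has moved between mistletoe versions, so check yours:

import mistletoe
# Recent mistletoe releases ship the Jira renderer in the contrib package;
# older versions exposed it as mistletoe.jira_renderer instead.
from mistletoe.contrib.jira_renderer import JIRARenderer

def markdown_to_jira(text):
    """Convert a Markdown string to Atlassian/Jira wiki markup."""
    return mistletoe.markdown(text, JIRARenderer)

print(markdown_to_jira("# Hello\n\nThis is **bold** and [a link](https://example.com)."))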

+ + + +
python markdown2jira.py -i projects.csv -o projects_converted.csv -c description 
python markdown2jira.py -i linear_export.csv -o linear_export_converted.csv -c Description
python markdown2jira.py -i comments.csv -o comments_converted.csv -c node.body
+ + + +

You should now have converted CSV files for comments, projects and Linear items.

+ + + +

Hosting the Attachments

+ + + +

The JIRA import process does not allow you to directly upload image attachments. Instead, it expects them to be available in some public location or website. My solution to this was to use ngrok to provide a temporary public tunnel to the images directory on my laptop.

+ + + +

You will need an ngrok API key, which you can get for free by signing up here; then you can run the following:

+ + + +
export NGROK_AUTHTOKEN=
python attachment_host.py host-ngrok --path ./images --csv-outfile ./attachments.csv
+ + + +

This will run a server which we need to leave running while the JIRA import takes place. Ngrok's free tunnels time out after 8 hours, so bear in mind that if you take a break and come back later - or if you have a really massive project - you might need to re-run this step.
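
If you'd rather roll something like this yourself, the idea is simply a static file server for the images directory with an ngrok tunnel in front of it. A minimal sketch using the pyngrok package (attachment_host.py does more than this, including writing out the attachments.csv map):

import http.server
import functools
import os

from pyngrok import ngrok  # pip install pyngrok

PORT = 8000
IMAGES_DIR = "./images"

# Serve the images directory over plain HTTP on localhost...
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=IMAGES_DIR)
server = http.server.HTTPServer(("127.0.0.1", PORT), handler)

# ...and expose it publicly via an ngrok tunnel.
ngrok.set_auth_token(os.environ["NGROK_AUTHTOKEN"])
tunnel = ngrok.connect(PORT, "http")
print(f"Images available at {tunnel.public_url} - leave this running while the Jira import happens")

server.serve_forever()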

+ + + +

nb: I was also toying with the idea of writing a version of this that uploads all the images to S3 or Google storage but I haven't had time. Pull Requests welcome.

+ + + +

You should now also have a new file called attachments.csv - this is a map generated by the server from each old Linear image URL to its new URL on your ngrok server.

+ + + +

Merge, Link and Tidy up Timestamps

+ + + +

OK, so the next script actually does three things at once:

+ + + +
1. We need to merge the two converted CSV files together so that we have a single file that we can upload using JIRA's import system.
2. We need to link together issues with their parents properly and make sure that issues which specify a Linear Project ID but no parent ID (because in Linear they were top level issues in the project) get mapped to the relevant Epic in the new world.
3. Linear uses some very strange date strings when it exports by default, so we will need to tidy those up and write them back out in a simple format that is easy for Jira to parse.
+ + + +

The script merge_link.py handles all of these operations.
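
To give a flavour of the date clean-up in particular, it boils down to parsing whatever Linear emitted and re-serialising it in one simple format. Here's a hedged pandas sketch - the column names and output format are assumptions, not necessarily what merge_link.py uses:

import pandas as pd

def tidy_timestamps(df, columns=("Created", "Updated", "Completed")):
    """Re-serialise Linear's odd date strings into something Jira's importer can parse."""
    for col in columns:
        if col in df.columns:
            parsed = pd.to_datetime(df[col], errors="coerce", utc=True)
            df[col] = parsed.dt.strftime("%d/%m/%Y %H:%M")
    return df

issues = tidy_timestamps(pd.read_csv("linear_export_converted.csv"))
issues.to_csv("jira_import.csv", index=False)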

+ + + +

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/11/Watched_ Trap.md b/brainsteam/content/posts/2024/08/11/Watched_ Trap.md new file mode 100644 index 0000000..70fed6b --- /dev/null +++ b/brainsteam/content/posts/2024/08/11/Watched_ Trap.md @@ -0,0 +1,24 @@ +--- +categories: +- Personal +date: '2024-08-11 19:32:28' +draft: false +tags: +- movies +title: 'Watched: Trap' +type: posts +--- + + +

We went to see Trap, M. Night Shyamalan's latest starring Josh Hartnett as a somewhat creepy antagonist trying to escape a concert without being arrested (because he's actually a serial killer and the concert is an FBI trap set up to lure him and his tween daughter in).

+ + + +

It was a fairly classic Shyamalan movie, possibly one of his more enjoyable films of late. It felt a bit like an inverse heist movie: what if, instead of breaking in, they're trying to break out? Pretty cheesy, but it made a nice change from yet another sequel.

+ + + +
https://www.youtube.com/watch?v=hJiPAJKjUVg
+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/18/Visiting Bletchley Park.md b/brainsteam/content/posts/2024/08/18/Visiting Bletchley Park.md new file mode 100644 index 0000000..6b234dc --- /dev/null +++ b/brainsteam/content/posts/2024/08/18/Visiting Bletchley Park.md @@ -0,0 +1,87 @@ +--- +categories: +- Personal +date: '2024-08-18 21:59:38' +draft: false +tags: +- museums +- trips +title: Visiting Bletchley Park +type: posts +--- + + +

Yesterday Mrs R and I visited Bletchley Park as part of our third wedding anniversary celebration. As a computer scientist, I've had BP on my bucket list since forever. It's the cradle of many computing breakthroughs in cyber warfare, cryptography and cryptanalysis, and the home of the first generally programmable computer.

+ + + +

BP is a large stately home which was converted into a research campus by the British Ministry of Defence during the late 1930s. At its peak, the site hosted nearly 10,000 workers who all contributed to the analysis and decryption of secret messages transmitted within Hitler's forces and between the Axis nations. The site features beautiful gardens and a large lake, and staff were able to walk around and enjoy them between shifts.

+ + + +
A large house/mansion behind a lake with a fountain at the center of it on a calm, bright and sunny day.
+ + + +

BP reminded me a lot of Hursley House where I worked for a few years during my time at IBM. Similarly, Hursley was bought by the Government and hosted the team who developed the Spitfire in World War 2 before being sold off to IBM many years later. I used to enjoy going for a wander when I was ruminating on some problem I needed to solve, and I can just as easily imagine Alan Turing and his colleagues wandering the grounds at BP when they had a particularly tough problem to solve.

+ + + +

The Wrens and the Bombes

+ + + +

The vast majority of staff at BP during this time were women from the Women's Royal Naval Service (WRNS, referred to as "the Wrens"). There were a number of VIPs there too, Turing being one of the better-known figures, alongside Gordon Welchman, who was instrumental in breaking the Enigma encryption and speeding up the automated cracking of keys. Another key figure, Tommy Flowers, wasn't based at BP but worked with Turing to develop Colossus, the very first programmable computer. Prior to this, all computation was done with purpose-built machines which could only do the thing they had been designed for - the Bombes, mechanical code-breaking machines that could try 40,000 combinations of Enigma keys in 12 minutes, being an impressive example.

+ + + +
a cabinet full of rows cylindrical rotors which were used to brute force enigma cypher keys
A working model of a "Bombe" which was manually configured and used to partially break enigma encryption
+ + + +

The Bombes were operated by the Wrens - in fact, during the war the staff on site were mostly women. It's somewhat ironic and sad that STEM fields are now so very male-dominated.

+ + + +

A Personal Connection

+ + + +

I found a personal connection to Bletchley when we saw an exhibit that showed how intelligence gathered from Hitler's navy had been fed to the British ship HMS Duke of York to help her crew locate and sink the Scharnhorst, a German warship that was harrying a civilian convoy in the Arctic. My grandad was a sailor onboard the Duke of York and was interviewed about his experiences for the BBC (nb: they misspell Scharnhorst as Charnos in their coverage).

+ + + +
A map with a legend showing the Scharnhorst and HMS Duke of York - the ship that my grandfather served on.
+ + + +

Staff were sworn to secrecy and were not allowed to talk about their work even amongst colleagues from other departments. I found it quite fascinating reading diary entries and personal accounts of how boring some of the day-to-day work at BP was. Many of the staff may not have realised just how important their contributions were until after the war. As a software engineer, I have days where I just get stuck thinking about a particular mathematical challenge. I can only imagine how frustrating it must have been to spend days and days trying to break a code by hand or with only primitive computing. It must also be pretty surreal getting on a bus with colleagues every day and not being able to talk about what you'd been doing that day.

+ + + +

Turing Exhibit

+ + + +

There was a huge exhibit dedicated to Turing's contributions to BP and to computing in general, including a copy of the letter signed by the then Prime Minister Gordon Brown in 2009 apologising for the treatment of Turing. After World War 2, Turing was convicted of gross indecency for being in a sexual relationship with another man and as a result lost his security clearance and funding. I suppose this was analogous to the modern concept of "being cancelled". Unable to continue his work, he was driven to suicide. His work truly laid the foundations for all of modern computing and Artificial Intelligence and we owe him a great debt of gratitude. As a gay man whose unfair treatment led to his death, I can't imagine he would look kindly on modern uses of AI for things like screening interviews and surveillance capitalism.

+ + + +
A slate statue of Alan Turing
+ + + +

Conclusion

+ + + +

Overall, we had a really interesting day out at BP. It was intriguing seeing the conditions that the codebreakers worked in, and also nice to finally be able to talk to Mrs R about the Universal Turing Machine and how cryptography works. I'd strongly recommend that anyone with even a vague interest in the history of information warfare or computing in general pop along and visit.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/25/Morning People.md b/brainsteam/content/posts/2024/08/25/Morning People.md new file mode 100644 index 0000000..8f2233c --- /dev/null +++ b/brainsteam/content/posts/2024/08/25/Morning People.md @@ -0,0 +1,13 @@ +--- +categories: +- Uncategorised +date: '2024-08-25 14:53:51' +draft: true +tags: [] +title: Morning People +type: posts +--- + + +

I am a morning person. I was talking to a friend recently about why

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/08/27/wicked.md b/brainsteam/content/posts/2024/08/27/wicked.md new file mode 100644 index 0000000..4c4d165 --- /dev/null +++ b/brainsteam/content/posts/2024/08/27/wicked.md @@ -0,0 +1,12 @@ +--- +categories: +- Uncategorised +date: '2024-08-27 21:14:00' +draft: true +photo: +- url: /media/1724789671358_da5cba22.jpg +tags: [] +title: wicked +type: posts +--- + diff --git a/brainsteam/content/posts/2024/08/29/Data Export in Bulk inside your CI Pipeline With Sling.md b/brainsteam/content/posts/2024/08/29/Data Export in Bulk inside your CI Pipeline With Sling.md new file mode 100644 index 0000000..61ce4f5 --- /dev/null +++ b/brainsteam/content/posts/2024/08/29/Data Export in Bulk inside your CI Pipeline With Sling.md @@ -0,0 +1,297 @@ +--- +categories: +- Data Science +date: '2024-08-29 16:24:57' +draft: false +tags: +- database +- elt +title: Data Export in Bulk inside your CI Pipeline With Sling +type: posts +--- + + +

In this post I will discuss my migration journey from Airbyte to Sling. Sling is a new lightweight data sync product that is small enough that it can run in a CI pipeline.

+ + + +

What's the Problem?

+ + + +

I'm trying to orchestrate the secure export of large amounts of data from a customer's database into cold storage. Customers can load this data into their favourite OLAP database and do analytics and machine learning there.

+ + + +

What Happened With Airbyte?

+ + + +

Last year I wrote about some experiments I had done with Airbyte for copying data around in bulk. I was quite pleased with Airbyte at the time and it has proven to be quite robust.

+ + + +

Recently, Airbyte have made a few changes to the product that have made it less suitable for my work environment.

+ + + +

Notably:

+ + + + + + + +

There were a couple of other sticking points that led me to want to move away from Airbyte:

+ + + + + + + +

For all of these reasons I decided to start looking at alternatives.

+ + + +

Exploring Alternative Solutions

+ + + +

I played briefly with Meltano and Singer but I found that they were quite sluggish for my use case. I'm not really sure why this was the case. I was finding that Meltano could handle something like 10k records/second locally but only 1-2k records/sec in CI. If anyone knows why that might have been, please let me know!

+ + + +

Eventually I stumbled across Sling. Sling is written in Go and so far I've found it to be very responsive. It runs handily in a small amount of memory and supports YAML configuration files that can be managed programmatically.

+ + + +

I was really excited by the resource footprint of Sling vs Airbyte. It's small enough that it can run in a few hundred MB of RAM rather than needing GBs. This means that it can run in our CI pipeline. In fact, the official documentation provides examples for running it inside GitHub Actions.

+ + + +

Working with Sling and Gitlab

+ + + +

We use GitLab for version control and CI, so I spent some time building a POC where we ship a Sling replication.yaml configuration file to a repository, set some environment variables and use GitLab Pipeline Schedules to automatically run the sync on a periodic basis.

+ + + +

The CI File

+ + + +
stages:
    - run

execute:
    image: slingdata/sling
    stage: run
    rules:
        - if: '$CI_PIPELINE_SOURCE == "schedule"'
        - when: manual
    script:
        - sling conns set GOOGLE_STORAGE type=gs bucket=${GOOGLE_STORAGE_BUCKET} key_file=${GOOGLE_SECRET_JSON}
        - sling conns set MYSQL url="mysql://${MYSQL_USER}:${MYSQL_PASSWORD}@${MYSQL_HOST}:3306/${MYSQL_DBNAME}"
        - sling run -r replication.yaml -d

    after_script:
        - |
            if [ "$CI_JOB_STATUS" == "success" ]; then
            curl -H "Content-type: application/json" \
            --data "{\"text\": \"CI sync for \`${CI_COMMIT_REF_NAME}\` successful! :partyparrot: <${CI_PIPELINE_URL}|Click here> for details!\"}" \
            -X POST $SLACK_WEBHOOK_URL
            else
            curl -H "Content-type: application/json" \
            --data "{\"text\": \"CI sync for \`${CI_COMMIT_REF_NAME}\` failed :frowning: <${CI_PIPELINE_URL}|Click here> for details!\"}" \
            -X POST $SLACK_WEBHOOK_URL
            fi
+ + + +

Let's unpack what's happening here. I'm using the official Sling Docker image as the base image for my CI job. Effectively, it's like I've installed Sling on a Linux machine. I've added a rule to make sure that this job only fires when it's part of a schedule - we don't want it running every time someone pushes a change to the replication config, for example.

+ + + +

The script simply adds connection configurations for GOOGLE_STORAGE, where I'll be sending my data, and MYSQL, where I'll be reading it from. Then we execute the sync step with sling run.

+ + + +

Afterwards we send Slack notifications to my org's workspace depending on whether the job succeeded or not.

+ + + +

The Replication Yaml

+ + + +

Next we also define a replication YAML. This file tells Sling which connection is which (i.e. MySQL is our source and GCS is our target) and also which tables and columns to copy. This is particularly powerful since there may be specific columns that you don't want to sync (e.g. user password hashes or API keys).

+ + + +
source: MYSQL
target: GOOGLE_STORAGE

defaults:
  object: "{stream_schema}.{stream_table}"
  mode: full-refresh
  target_options:
    format: jsonlines # this could also be CSV
    compression: gzip

streams:
  news_articles:
    mode: full-refresh

    select:
      - id
      - url
      - is_active
      - title
      - summary
      - created_at
+ + + +

The YAML file above provides default configuration for each stream - telling Sling to generate gzipped newline-delimited JSON files in Google Cloud Storage.

+ + + +

Under the streams section we list the tables we're interested in and use the select subsection to list the columns we want. You can actually flip to a denylist approach by listing the columns you don't want, each prefixed with a -. This looks a bit weird because you need one dash for the YAML list item and another for the value, so it can be clearer to wrap them in quotes:

+ + + +
select: # select /everything/ except the following:
+  - "-password"
+  - "-my_secret_column"
+ + + +

We can also use a custom sql query to limit what we send over. Imagine you have a table with millions of news articles but your customer only cares about the ones categorised as "tech":

+ + + +
streams:
  news_articles:
    mode: full-refresh
    sql: "SELECT * FROM news_articles WHERE category='tech'"
+ + + +

Configuring the Schedule

+ + + +

In the GitLab navigation under "Build" there should be a Pipeline Schedules button. Clicking that and then "New Schedule" should bring up a screen like this:

+ + + +
User interface for scheduling a new pipeline. Form includes fields for description, interval pattern selection (daily, weekly, monthly, or custom), cron timezone selection, target branch or tag selection, variable input, and an activation checkbox. Options to create pipeline schedule or cancel are provided at the bottom.
+ + + +

Here we can set our CI job to run on a recurring basis (if you're not comfortable using cron expressions, I recommend https://crontab.guru to help).

+ + + +

We need to provide values for each of the environment variables in our GitLab CI file above.

+ + + +

A tip for Google JSON key files: use the left-most dropdown to set a "File" variable - this takes the variable input value from the form, writes it to a file and sets the variable name to the path of the file. We can simply open up a JSON key file locally and copy and paste its contents into the variable in this form.

+ + + +

Finally, once all the environment variables are set, we can save the schedule.

+ + + +
User interface for editing a pipeline schedule. Form includes fields for description ('Sling test'), interval pattern selection (custom selected with '54 15 * * *' cron syntax), cron timezone (UTC 0), and target branch ('sling'). Variables section shows four filled entries for MYSQL_HOST, MYSQL_PASSWORD, GOOGLE_STORAGE_BUCKET, and GOOGLE_SECRET_JSON, with values obscured. An empty variable entry is available. Options to reveal values, toggle activation, save changes, or cancel are provided.
+ + + +

Managing Pipelines

+ + + +

Once you have saved your sync config, you'll be able to see it in the pipelines view.

+ + + +

You should be able to see when the job is next scheduled to run and you can click the play button to force it to run right away. Job logs show up in your CI pipelines tab as normal.

+ + + +

Tips for Managing Multiple Sync Targets

+ + + +

Now that we've found this pattern, life is much easier (and cheaper). I've written a Python tool for automatically enabling and disabling groups of tables from our platform's schema en masse so that we avoid manual mistakes.
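
That tool is internal, but the underlying idea is just to treat replication.yaml as data and rewrite the streams block programmatically. A minimal sketch of the approach - the table groupings and file paths here are made up for illustration:

import yaml  # pip install pyyaml

# Hypothetical groupings of tables that get switched on or off together.
TABLE_GROUPS = {
    "news": ["news_articles", "news_sources"],
    "users": ["users", "user_preferences"],
}

def enable_group(replication_file, group):
    with open(replication_file) as f:
        config = yaml.safe_load(f)
    streams = config.setdefault("streams", {})
    for table in TABLE_GROUPS[group]:
        streams.setdefault(table, {"mode": "full-refresh"})
    with open(replication_file, "w") as f:
        yaml.safe_dump(config, f, sort_keys=False)

enable_group("replication.yaml", "news")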

+ + + +

For segregating client workloads, you could have one repository per client, or one branch per client in the same project. The environment variable values are unique per schedule, which makes it difficult to cross-contaminate environments (unless you paste in a different client's credentials during setup, which would be possible in any setup if you're a bit gung-ho with your clipboard management).

+ + + +

We have branch protection rules on production/** and require a merge request and approval from an admin so that existing production sync configurations don't get accidentally overwritten.

+ + + +

Conclusion

+ + + +

Sling is a really powerful yet very lightweight tool for doing database data sync. It's been great to be able to vastly simplify our company's data sync offering and stop having to set up expensive servers for every customer that needs this functionality. Hopefully this post is useful to others looking to try out Sling. If you're interested in the tool, check out the documentation and GitHub page. The maintainer, Fritz, is very responsive and active on issues and the project Discord chat, so those are also great resources if you have any questions.

+ \ No newline at end of file diff --git a/brainsteam/content/posts/2024/09/05/Running Phi MoE 3.5 on Macbook Pro.md b/brainsteam/content/posts/2024/09/05/Running Phi MoE 3.5 on Macbook Pro.md new file mode 100644 index 0000000..45bd4dd --- /dev/null +++ b/brainsteam/content/posts/2024/09/05/Running Phi MoE 3.5 on Macbook Pro.md @@ -0,0 +1,148 @@ +--- +categories: +- AI and Machine Learning +date: '2024-09-05 14:17:39' +draft: false +tags: +- AI +- llms +title: Running Phi MoE 3.5 on Macbook Pro +type: posts +--- + + +

The relatively recently released Phi 3.5 model series includes a mixture-of-experts model featuring 16 expert models of 3.3 billion parameters each. It activates two experts at a time, resulting in pretty good performance with only 6.6 billion active parameters per token. I recently wanted to try running Phi MoE 3.5 on my MacBook but was blocked from using my usual method while support is still being built into llama.cpp and then Ollama.

+ + + +

I decided to try out another library, mistral.rs, which is written in the Rust programming language and already supports these newer models. It required a little bit of fiddling around but I did manage to get it working and the model is relatively responsive.

+ + + +

Getting Our Dependencies and Building Mistral.RS

+ + + +

To get started you will need the Rust compiler toolchain installed on your MacBook, including rustc and cargo. The easiest way to do this is via Homebrew:

+ + + +
brew install rust
+ + + +

You'll also need to grab the code for the project:

+ + + +
git clone https://github.com/EricLBuehler/mistral.rs.git
+ + + +

Once you have both of these in place, we can build the project. Since we're running on a Mac, we want the compiler to make use of Apple's Metal framework, which lets the model use the GPU on the M-series chip for acceleration.

+ + + +
cd mistral.rs
cargo install --path mistralrs-server --features metal
+ + + +

This command may take a couple of minutes to run. The compiled server will be saved in the target/release folder relative to your project folder.

+ + + +

Running the Model with Quantization

+ + + +

The default instructions in the project readme work, but you might find it takes up a lot of memory and a really long time to run. That's because, by default, mistral.rs does not do any quantization, so running the model requires around 12GB of memory.

+ + + +

mistral.rs supports in-situ quantisation (ISQ), which essentially means that the framework loads the model and quantises it at run time (as opposed to requiring you to download a GGUF file that was already quantized). I recommend running the following:

+ + + +
./target/release/mistralrs-server --isq Q4_0 -i plain -m microsoft/Phi-3.5-mini-instruct -a phi3
+ + + +

In this mode we use ISQ to quantize the model down to 4-bit precision (--isq Q4_0). You should be able to chat to the model through the terminal.

+ + + +

Running as a Server

+ + + +

Mistral.rs provides an HTTP API that is compatible with the OpenAI API. To run in server mode, we remove the -i argument and replace it with a port to listen on (--port 1234):

+ + + +
./target/release/mistralrs-server --port 1234 --isq Q4_0 plain -m microsoft/Phi-3.5-mini-instruct -a phi3
+ + + +

You can then use an app like Postman or Bruno to interact with your model:

+ + + +
Screenshot of a REST tooling interface. A pane on the left shows a JSON payload that was sent to the server containing messages to the model telling it to behave as a useful assistant and write a poem. On the right is the response which contains a message and the beginning of a poem as written by the model.
+ + + +
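
If you'd rather script it than use a GUI client, the server speaks the OpenAI chat completions format, so something like the following should work (the payload shape follows the OpenAI conventions - double-check the mistral.rs docs for the exact fields and model name it expects):

import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "microsoft/Phi-3.5-mini-instruct",  # assumed - use whatever model name the server reports
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write a short poem about autumn."},
        ],
        "max_tokens": 256,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])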

Running the Vision Model

+ + + +

To run the vision model, we just need to make a couple of changes to our command line arguments:

+ + + +
./target/release/mistralrs-server --port 1234 --isq Q4_0 vision-plain -m microsoft/Phi-3.5-vision-instruct -a phi3v
+ + + +

We still want to use ISQ, but this time we swap plain for vision-plain, swap the model name for the vision equivalent and change the architecture from -a phi3 to -a phi3v.

+ + + +

Likewise we can now interact with the model via HTTP tooling. Here's a response based on the example from the documentation:

+ + + +
Screenshot of a REST interface. A pane on the left shows a JSON payload that was sent to the server containing messages to the model telling it to analyse an image URL. On the right is the response which describes the mountain in the picture that was sent.
+ + + +
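
The same applies to the vision model - it should accept OpenAI-style image_url content blocks. Again, this is a hedged sketch based on the OpenAI conventions rather than the exact payload from the mistral.rs examples:

import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "microsoft/Phi-3.5-vision-instruct",  # assumed - use whatever model name the server reports
        "messages": [
            {
                "role": "user",
                "content": [
                    # Any publicly accessible image URL will do here.
                    {"type": "image_url", "image_url": {"url": "https://example.com/mountain.jpg"}},
                    {"type": "text", "text": "What is shown in this image?"},
                ],
            }
        ],
        "max_tokens": 256,
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])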

Running on Linux and Nvidia

+ + + +

I am still struggling to get mistral.rs to build on Linux at the moment; the Docker images provided by the project don't seem to play ball with my systems. Once I figure this out, I'll release an updated version of this blog post.

+ + + +

+ \ No newline at end of file