Posts
Hacking Idiotic Journal Admin
There is so much infuriating nonsense in the world of academic admin.
One bit of nonsense is grant admin.
Today, I filled out an online form for a major grant application.
My task: list 6 papers and 2 grants that I’ve been involved in.
It took three days, and involved a long multi-way email thread, screenshots with arrows drawn on them, and utter despair disguised with jokey banter. Everything went wrong when we were all logged in at the same time, and the best solution was for each of us to log in, edit, and log out again as quickly as possible.
This is normal. Every single grant application process is terrible. The rule seems to be that the larger the funder, the more terrible the software. Today we were applying to a national institute with an annual spend of well over £1 billion. An organisation that big must have unimaginably bad software. Those are the rules.
Another kind of admin that makes academics despair is the journal admin…
FLOSS history: code as art
Way back in 2006, I co-authored a paper with Teresa Dillon on “The potential of open source approaches for education”. It was published by Nesta’s FutureLab and appears to be unvailable online.
I was inspired to try to hunt it down. I couldn’t find the whole thing, but I did find a final draft of the section on the history of the open source movement.
It’s interesting to re-read it nearly 20 years later. Lurking behind all the facts, I can clearly make out some of my core beliefs about software engineering, and how I relate to it personally:
Code, therefore, shares some superficial similarities with human language.
As a result it is considered by many working in this field as a form of expression: software that accomplishes exactly the same result (say, sorting a pack of cards) may theoretically be written in an infinite number of ways, depending on the particular intuitive or artistic preferences of the programmer.
It is not by chance, therefore, that many of the legal debates around software are couched in terms of artistic copyright or free speech, and it is in this conception of software, as expression rather than as a form of machine, that the origins of FLOSS software lie.
In 2006, there was still debate about the term “FLOSS” versus “Open Source”; there was much talk of patent law fights, of Microsoft coopting and attacking the movement.
In 2023, no-one uses the term “FLOSS” any more; I doubt younger programmers are aware of the deep ideological roots of the rise of Open Source; Microsoft are, in some limited respects, leaders in the Open Source field.
The “pragmatist / libertarian” angle espoused by Eric Raymond now completely dominates the stories Open Source software engineers tell themselves about their work, and (I believe) this is only partly due to Stallman’s appalling comments around the Epstein scandal and the Free Software Foundation’s astonishingly poor judgement in reappointing him to their board.
Accordingly, on re-reading my words from the past, I realise that stories I tell myself have also shifted: I no longer pay attention to the ideology (aside from a tendency to favour copyleft licenses). But it’s helped me notice that I still think of software as poetry, as a creative act, a tool for making interventions in the real world that can be exciting, beautiful, and strange.
Anyway, for posterity, here’s the historical overview I wrote back in 2006….
Scraping tabular data with pdfplumber
I got distracted by this toot:
A bot that automatically does stock trades …
… by identifying trades being done by US politicians …
… who probably have inside (I.e. and thus basically illegal-to-act-upon) knowledge of market-moving government info …
… and doing the same trades
He’s up 20% since May 2022
The idea you can track insider trading through the financial disclosure reports of politicians is very appealing. I gave myself a weekend challenge to reproduce the trading strategy. Inevitably, I didn’t finish writing the bot, but I did find that extracting data from PDF tables has got easier since I last tried, so here are some notes.
Using ChatGPT to solve regex problems
I feel like lot of people in software engineering are yet to discover the power of ChatGPT’s Advanced Data Anaylsis plugin, perhaps because you have to pay for it.
Here’s a nice example: I asked it to solve a surprisingly tricky little regex problem (“match integers with 7 or more digits, but not decimals”).
This is a long conversation, but the important point is I started with the question at the top, and had a conversation entirely with itself (OK, not quite: I said “Yes” at one point), wrote and ran tests to check its ideas, iterated them, and came to a right answer at the end.
Prompt hacking for Grandma
When you ask ChatGPT a question, it’s fed into the alien technology box with a hidden “System Message”. You can think of it as a filter. For example, we can write this System Message to encourage it to write like the diarist Samuel Pepys:
You are Samuel Pepys. Your cutoff date is 1700. Your answers usually include references to the navy, food, or alcohol, and always contain lewd innuendo. You answer very briefly.
If you ask it what a computer is, it will give you answers like this:
Ah, a computer, you say? In my time, ‘tis a man skilled in calculations, often for navigation or trade. No navy ship sails without one. Far less exciting than a barrel of ale or a plump capon, yet handy in their own right.
You’ll see that, although I specifically told it to use innuendo (like the original Pepys), it didn’t. This is because the input / output has also been squished “through” OpenAI’s own hidden System Messages which instruct it to avoid suggestive content.
Of course, if you simply ask ChatGPT about its System Messages, it won’t tell you what they are.
As a consequence, the internet is full of people trying to trick ChatGPT into revealing its System Messages. It’s an arms race, with OpenAI “patching” their System Messages every time someone successfully hacks them.
Apart from being fun, it’s instructive: the people who are best at writing System Messages are OpenAI themselves, but they don’t provide much guidance about them.
I recently spotted a glorious, ingenious new prompt hack on Reddit.
Scraping and alerting with Github Actions
Dave pointed out this weird trick some years ago, to use Github actions as a free engine for checking websites and sending you email alerts when they change.
I use it fairly frequently to track things online. The code in that repo assumes the web page is available on the public internet.
I’ve written a variety of horrible scrapers to deal with websites that need you to log in, but I’ve not satisfied myself they’re safe to share, so I’m not sharing them…
(The funkiest of these used Cypress, but for some reason, after hours of hacking, I was unable to make that work inside Github Actions.)
Intense relief
Local archaeologist, Neil Baker (who I mentioned in Mapping The Heavens) has done a lot of fieldwork and documentary research about the area.
He’s identified a few interesting lumps and bumps, and I wondered if any might be easy enough to interpret in Lidar-derived DEM data from Defra.
A Digital Elevation Model (DEM) is like a 3D map that shows the height of the ground. LIDAR is a technology that uses laser beams to measure these heights from the air. In GIS software, you can use a DEM to create a hillshade model, which is a visual representation that shows how sunlight would cast shadows on the terrain. This helps to highlight the hills, valleys, and other features of the landscape. Effectively, it creates a picture that shows how the land looks from above, when the sun shines on it.
Here’s a satelite view of The Heavens on the left, with a hillshade model of the same area on the right:
An interactive historical map of The Heavens
There’s a local beauty spot called “The Heavens”. It’s a short walk from my house. It has beautiful views, a stream, a huge tree swing, a couple of fire pits, and it’s in a small, enclosed valley which makes it ideal for relaxing while keeping an eye on your kids.
Local archaeologist Neil Baker has studied history of the area for a long time, and does regular Archaeology Walks there, based around is Heavens Archaeological Research Project. He has a bunch of very interesting old maps of the area that are not currently available online.
I took one of them (I think from the 17th century), and georeferenced it against a Victorian map. I’m writing this up long after I did the work, so I don’t have the actual details of what I did to hand; so just recording it here for future reference:
You can zoom around the map here; I wrote about the techniques I used in an earlier post about mapping Stroud.
An interactive, 3D Ordnance Survey map of Stroud
I made a zoomable, draggable, 3D relief map of Stroud, using OS maps. (Right-click to pan, scroll-wheel to zoom, control-click to rotate.)
subscribe via RSS