The obligatory “what’s the point of a personal blog these days?” post

Yeah. That post. I started this blog way back when it didn’t seem that weird of a thing to do. Nowadays it seems personal sites are used as CVs, and I’m not really into that. (1) Nobody is going to hire me for my pottery skills or the small contributions I’ve made to Stack Overflow… Read more »

How to load the ORCID data dump into mongo, without dying of old age

Since writing this post, I’ve worked out a better way of doing things – use Python to read directly from the tar file and inject the results into Mongo! I used it with the 2017 ORCID data dump and it’s like lightning compared to the method described below. Basically, download the script and run:

Read more »

Did I mention I’m a potter?

I’ve not posted about much on here other than programming, but well, that’s a bit dull after a while. What’s really interesting, for me at least, is POTTERY. Yeah, I said it. Pottery. I recently got back into throwing pots after a 20-year hiatus. I rekindled my love of clay so much I ended up… Read more »

Getting started with the public ORCID API using swagger – quickstart guide

ORCID recently implemented a swagger definition file for its v2.0 API, which means it’s now even easier to access the public ORCID API from your website. Just use swagger.js. It’s Super. And Easy. Let’s give it a go. First, clone swagger onto your machine. Either use the git desktop client, click the button on the repository or fetch it… Read more »

Differences between ORCID and DataCite Metadata

(Written by Martin Fenner and cross-posted from the DataCite blog – I’m one of the co-authors of the report.) One of the first tasks for DataCite in the European Commission-funded THOR project that started in June was to contribute to a comparison of the ORCID and DataCite metadata standards. Together with ORCID, CERN, the British Library… Read more »

C# FluentValidation – why we’re using it

A bit of background: I’ve been working in the C# world for a few months now. While the code is very similar to Java, the culture around open source could not be more different. Where open source is business as usual in the Java & JavaScript worlds, it’s very much exceptional circumstances only in the .Net one…. Read more »

ORCiD Java Client now supports schema version 1.2!

UPDATE: This is a really dead post :D.  ORCID is now on V3.0 of the API and the library has been deprecated. Thanks to the hard work of Stephan Windmüller (@Stovocor) the ORCiD client library now supports version 1.2 of the ORCiD schema.  He’s also updated the companion ORCiD profile updater web app to use the… Read more »

Delphi isn’t quite dead yet.

Back when I was a lowly junior programmer, life was great. We had a humongous 64k of memory to play with, two whole (user defined!) colours and an 80-character-wide screen. We managed gigantic pension schemes with millions of members using the equivalent of a Commodore 64. Recursive functions meant stack overflow. Not one of these… Read more »

Goodbye to the British Library, hello corporate life.

I’ve moved on.  I had a great couple of years working at the library and met a ton of really enthusiastic folk.  The ODIN project came to an end and there was little left for me to do, so I’ve found myself a new workplace more local to home. I’m a delivery engineer, apparently.  I’ve been… Read more »

Making acceptance testing easy, useful and fun with BDD – enter cucumber

User stories, requirements analysis and all that Jazz. I’ve been mulling over my approach to gathering requirements recently and it’s become clear that although I’m Doing It Right a lot of the time, I’m also Doing It Wrong. Ron Jeffries wrote about the Three Cs 13 years ago. He did it in the context of Extreme… Read more »

  • Programming

    Controlling the cache headers for a RESTlet directory

    My previous post described how to serve webjars with RESTlet.  This post will describe how to add caching so that users don’t swamp your servers with requests. Put simply, you put a filter in the chain before the directory and modify the HTTP headers of successful requests.  Like so:


  • Programming

    Using Webjars without Servlet 3 on Google App Engine (GAE)

    I recently stumbled upon what is just about the best thing since I first discovered the wonders of Maven – the ability to add JavaScript dependencies in your pom.xml using webjars. Like this:

    Maven will manage their sub-dependencies and your web pages can access the scripts directly from the jars. Lovely. Sign me up! Except… Read more »

  • Programming

    Battle of the tokenizers – delimited text parser performance

    An interesting question about StringTokenizer popped up on Stack Overflow the other day. It was essentially about how to optimise reading delimited data, in this case lines of integers separated by spaces. It demonstrated three things: (1) don’t fixate on micro-optimisations when you probably have big bottlenecks elsewhere; (2) String.split() is really slow; (3) the difference is… Read more »
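As a rough illustration of what such a comparison involves, here is a minimal sketch of three ways to read a line of space-separated integers in plain Java. The class and method names (`DelimitedInts`, `withSplit`, `withTokenizer`, `withIndexOf`) are made up for this example and are not taken from the original question or post. It only shows that the three approaches produce the same result and makes no timing claims.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
class DelimitedInts {

    // Approach 1: String.split(), which returns an array of substrings.
    // Consecutive spaces produce empty strings, so those are skipped.
    static List<Integer> withSplit(String line) {
        List<Integer> out = new ArrayList<>();
        for (String part : line.split(" ")) {
            if (!part.isEmpty()) {
                out.add(Integer.parseInt(part));
            }
        }
        return out;
    }

    // Approach 2: StringTokenizer, which walks the string token by token
    // and treats runs of delimiters as a single separator.
    static List<Integer> withTokenizer(String line) {
        List<Integer> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, " ");
        while (st.hasMoreTokens()) {
            out.add(Integer.parseInt(st.nextToken()));
        }
        return out;
    }

    // Approach 3: a hand-rolled indexOf() loop over the delimiter.
    static List<Integer> withIndexOf(String line) {
        List<Integer> out = new ArrayList<>();
        int start = 0;
        while (start < line.length()) {
            int end = line.indexOf(' ', start);
            if (end < 0) {
                end = line.length();
            }
            if (end > start) {
                out.add(Integer.parseInt(line.substring(start, end)));
            }
            start = end + 1;
        }
        return out;
    }
}
```

Calling each method on the same input, for example `DelimitedInts.withTokenizer("12 7 -3  40")`, should give the same list; any real benchmark would need to time them over a large file rather than a single line.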

  • Programming

    Formatting file sizes – making bytes human readable in GWT

    It’s one of those simple UI things that make a real difference. You have to be pretty special to be able to instantly convert 45742364 bytes into 43.6MB in your head. Enter some great code snippets I’ve discovered on Stack Overflow that do this. This four line wonder is the top answer to that question (from Mr… Read more »
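For a feel of the conversion itself, here is a minimal sketch in plain Java. It is not the Stack Overflow snippet mentioned above; the class name `FileSizes`, the method `humanReadable` and the choice of 1024-based units are assumptions made for this example. It deliberately avoids `String.format` and rounds to one decimal place by hand.

```java
// Hypothetical helper class, for illustration only.
class FileSizes {

    // Assumption: binary (1024-based) steps labelled with the familiar short unit names.
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};

    static String humanReadable(long bytes) {
        // Small values are shown as a plain byte count.
        if (bytes < 1024) {
            return bytes + " B";
        }
        double value = bytes;
        int unit = 0;
        // Divide down until the value fits the current unit.
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Round to one decimal place without relying on String.format.
        long tenths = Math.round(value * 10);
        return (tenths / 10) + "." + (tenths % 10) + " " + UNITS[unit];
    }
}
```

With this sketch, `FileSizes.humanReadable(45742364L)` gives `43.6 MB`, matching the figure in the text.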

  • Programming

    Deserialising JSON or XML to a Map using Java

    Well, here’s a thing.  Imagine you have some XML or JSON that looks like a map, only you don’t know the names or number of the properties in advance.  For example:

    Or some JSON that looks like this:

    How can you do it? Using Jackson @JsonAnyGetter and @JsonAnySetter. All you need is the XML root… Read more »

  • Ramblings

    What I do all day – the digital electoral register

    This one is for the non-programmers out there. I’m writing a program that takes electoral registers from around the country and sticks them in the same place. The reason for this is that it’s a Good Thing To Do. Hopefully I’ve not lost you yet. Local authorities are obliged by law to send their unabridged registers… Read more »

  • Programming

    How not to parse CSV using Java

    I’ve only been on Stack Overflow for a short while and already feel like I’m drowning under the sheer quantity of people asking how to parse CSV. Most of them start in one of a few ways. So, for the record, here’s how you don’t do it. Don’t use regular expressions. That’s right. You don’t use… Read more »
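To make the problem concrete, here is a small sketch showing how a naive comma split miscounts fields as soon as a quoted value contains a comma, next to a minimal quote-aware splitter. The class `NaiveCsv` and both method names are invented for this example. The quote-aware version is only an assumption about the smallest thing that handles quoted commas and doubled quotes on a single line; it does not handle embedded newlines and is not a full CSV parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class NaiveCsv {

    // The naive approach: split on every comma, ignoring quoting entirely.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    // A minimal quote-aware splitter for a single line.
    // Handles commas inside quoted fields and "" as an escaped quote.
    static List<String> quoteAwareSplit(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

For the line `"Smith, John",42,London`, `naiveSplit` returns four pieces while `quoteAwareSplit` returns the three intended fields.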

  • Programming, Ramblings

    Why I hate spreadsheets (part one of many)

    CSV should be defined somewhere, right? You should be able to tell if CSV is well formed? WRONG. This is going to be the first post of many on CSV, the devil’s own file format. Here’s the closest you can get to a specification of the CSV format, the RFC for the CSV MIME type…. Read more »