Jul 25 2014

ETD2014 slides

I’m having a great time attending the ETD2014 conference. There’s been lots of lively discussion around ORCiD and DOIs and it’s been fantastic to gather wider perspectives. It’s also been great to get some coding in, adapting the import tool to work with the Leicester institutional repository.

For those that are interested in the ORCiD integration I was discussing earlier, the live application can be found at http://ethos-orcid.appspot.com, the code is on GitHub at https://github.com/TomDemeranville/ and I’ve popped the slides on figshare (http://dx.doi.org/10.6084/m9.figshare.1117858).


Jun 16 2014

ELAG2014 – Slides

I recently presented at ELAG2014 about ORCiD integration.  I’ve embedded my slides below.  They might not make quite as much sense without context – they’re mainly pictures with single word topic headings, but they contain links to the source code etc.

I had a great time and met a lot of interesting folks whilst at ELAG.  I’m now in contact with two UK universities looking at ORCiD integration so I think it’s been useful.

I’ll add some commentary and a summary of the talk later this week.

Rather appropriately this is up on figshare – an open data repository that lets you upload anything then assigns it a DOI. This one has http://dx.doi.org/10.6084/m9.figshare.1057954



May 6 2014

A different view of the British Library – photos

Once you get inside it, the British Library is a beautiful building.  I’ve taken to photographing it and its contents during my lunch break.  Here they are, click on them for the bigger versions.

Check out my flickr stream for more.

Mar 6 2014

Generating POJOs from XML schemas using JAXB XJC

A little bit of history

XML processing in Java has come a long way in the last ten years. Back in the old days mapping XML to Java was a bit of a nightmare: deserialising usually meant pulling the DOM apart bit by bit to get at the interesting parts. Serialisation was worse – the Java DOM API is truly horrific. Helper libraries existed (JDOM, XMLBeans and so on), but while they made things somewhat easier they never made it easy. Add to that the problems introduced by having various versions of Xerces, various implementations of the org.w3c.dom API, and versions of the DOM API itself (urgh – DOM Level 2 anyone?) and it added up to development hell.

Introducing XJC

Nowadays there’s a plethora of libraries and methods to choose from, and some of them are baked right into the core Java class libraries. JAXB is one of them. JAXB comes with a handy little command line tool called xjc, which takes a schema and spits out an annotated class hierarchy. The class hierarchy behaves in a sane manner (unlike the weird stuff XMLBeans used to make) and can be used for marshalling and unmarshalling between XML and POJOs.

Generating POJOs

Generating Java classes from an XML schema is easy.  It goes a little something like this:

bash> xjc my-schema.xsd

Yup, it’s usually that simple.  To generate it in the package you want, you can do

bash> xjc -p my.pkg my-schema.xsd

Place the generated classes somewhere in your project and you’re ready to go.

Using JAXB

In the simplest cases you can do something as easy as:
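Here is a minimal sketch of that round trip. The `Greeting` class is a hypothetical, hand-written stand-in for a class that xjc would generate, and the XML is made up for illustration. It uses the `javax.xml.bind.JAXB` convenience class, which ships with Java 6 to 8; on later JDKs JAXB has to be added as a dependency.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.bind.JAXB;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbSketch {

    /** Hypothetical stand-in for a class that xjc would generate from a schema. */
    @XmlRootElement(name = "greeting")
    public static class Greeting {
        private String message;

        public String getMessage() {
            return message;
        }

        public void setMessage(String message) {
            this.message = message;
        }
    }

    /** Unmarshal: XML text in, populated POJO out. */
    static Greeting fromXml(String xml) {
        // Wrap the text in a Reader: the String overload of JAXB.unmarshal
        // treats its argument as a URI rather than as XML content.
        return JAXB.unmarshal(new StringReader(xml), Greeting.class);
    }

    /** Marshal: POJO in, XML text out. */
    static String toXml(Greeting greeting) {
        StringWriter out = new StringWriter();
        JAXB.marshal(greeting, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Greeting greeting = fromXml("<greeting><message>howdy</message></greeting>");
        System.out.println(greeting.getMessage());

        greeting.setMessage("goodbye");
        System.out.println(toXml(greeting));
    }
}
```

For anything beyond quick cases like this, creating a `JAXBContext` once and reusing it is the more usual route.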

While there’s loads more to JAXB than this example shows, this will certainly get you started.  I wrote the ORCiD Java Library using xjc and JAXB, so if you’re interested, check it out.

Feb 11 2014

The age of Geocities – Bubba says HOWDY!!!

There’s a fantastic project out there that’s taking screenshots of random Geocities pages as they would have appeared when they were live.

It’s strangely compelling viewing.

bubba says howdy!!!

Sites like these showcase an important aspect of our cultural heritage. Back when the internet was called the “information superhighway” and people were still talking about the “digital frontier”, Geocities was where you could stake your claim.

I did it myself once.  I set up my own little homestead that hosted a Java Applet I’d written – a Java version of the classic game Elite.  I couldn’t get the flight engine quite right, but you could trade from Lave to Diso, view things on the radar, travel between systems and view all the original craft in their rotating vector glory.  The site is sadly lost but the memories remain.

Grab yourself a bit of nostalgic indulgence here: http://oneterabyteofkilobyteage.tumblr.com/

More background on the project can be found here: http://rhizome.org/editorial/2014/feb/10/authenticity-access-digital-preservation-geocities/




Feb 10 2014

ORCiD tools – who’s claiming what?

As part of my work with data-centres and ORCiD I’ve put together a tool that lets you see where works claimed within ORCiD have been published.   Start typing a publisher into the search box and it’ll look up the DOI prefix (or other identifier prefix) for that publisher from a list nearly 4000 long.  Current highlights include the American Helicopter Society with 9 ORCiDs.

See it in action at http://ethos-orcid.appspot.com/search
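To give a feel for the lookup, here is a rough sketch of matching typed text against publisher names and pulling the prefix off a DOI. It is not the code behind the tool: the class and method names are made up, and the “Example Press” entry is invented sample data. The figshare prefix is taken from the DOIs quoted on this blog.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

public class PublisherPrefixLookup {

    // Publisher name (lower-cased) -> identifier prefix, kept sorted by name
    // so that "starts with" searches become a simple range query.
    private final TreeMap<String, String> prefixesByPublisher = new TreeMap<String, String>();

    public void add(String publisher, String prefix) {
        prefixesByPublisher.put(publisher.toLowerCase(Locale.ROOT), prefix);
    }

    /** Prefixes of every publisher whose name starts with the typed text. */
    public List<String> suggest(String typed) {
        String from = typed.toLowerCase(Locale.ROOT);
        String to = from + Character.MAX_VALUE; // upper bound for names starting with 'from'
        return new ArrayList<String>(
                prefixesByPublisher.subMap(from, true, to, false).values());
    }

    /** The part of a DOI before the first slash, e.g. the registrant prefix. */
    public static String prefixOf(String doi) {
        int slash = doi.indexOf('/');
        return slash < 0 ? doi : doi.substring(0, slash);
    }

    public static void main(String[] args) {
        PublisherPrefixLookup lookup = new PublisherPrefixLookup();
        lookup.add("figshare", "10.6084");
        lookup.add("Example Press", "10.5555"); // invented sample entry

        System.out.println(lookup.suggest("fig"));
        System.out.println(prefixOf("10.6084/m9.figshare.1117858"));
    }
}
```

Claimed works whose identifiers share a prefix can then be grouped under the matching publisher.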


One of the things that’s repeatedly come up when talking to data-centres about author metadata is that while it’s easy to push data in, it’s pretty hard to get it back out.  Recent changes in the ORCiD API have made this easier, hence this tool.

Many datasets have scant author metadata, with little more than an institution name attributed to them. Data-centres can now pull the claims information out of ORCiD and use it to selectively enrich their own metadata, completing a “virtuous circle” of integration.

Source code?

This code is available as part of the orcid-update-java application, which uses the orcid-java-client library.

Feb 7 2014

I didn’t go to university to get myself a job

Chris Bourg has written a great piece about the insidiousness of neo-liberalism and education-as-an-investment over at her blog. Check it out here: The Neoliberal Library: Resistance is not futile

I am one of those hopeless idealists who still believes that education is – or should be – a social and public good rather than a private one, and that the goal of higher education should be to promote a healthy democracy and an informed citizenry. And I believe libraries play a critical role in contributing to that public good of an informed citizenry.

In the neoliberal university, students are individual customers, looking to acquire marketable skills. Universities (and teachers and libraries) are evaluated on clearly defined outcomes, and on how efficiently they achieve those outcomes.  Sound familiar?

I’ve managed to make this blog of mine really dull tech stuff and zero politics for a while now, probably out of a desire to keep myself sane.  That said, the almost inevitable (and widely ignored in the press) move in the UK towards a for-profit education system should strike fear into the hearts of anyone who stops to think about it.

There are two sides to this, education-as-an-investment and the for-profit education system, and they go hand in hand. Ever since the introduction of tuition fees in the UK, the ideology of “investing in your education” has gained a lot of traction here. The next stop will almost certainly be the ramping up of the for-profit private education market, starting with the lifting of tuition fee caps and ending with a two-tier education system that pumps out workers and perpetuates inherited privilege.

Chris also talks a lot of sense about the ridiculous focus on the personal within politics, the focus on individualism at the expense of the wider movement. Check out her blog, it’s a refreshing blast and a welcome change from the celebrity twitterati politics of ME that seems to pass for political discourse nowadays. Sure, it’s important to understand that other people have different experiences than you. Essential in fact. But just understanding gets us nowhere and changes nothing. It’s Acting, Doing. That’s what we need.

For more info on these topics, and to actually help do something, take a look at campaign group Public University https://twitter.com/public_uni and the UCU http://www.ucu.org.uk/index.cfm

Jan 21 2014

ORCID Open Source Java Client – I made this!

Update: ORCiD client now available as a Maven dependency!

I’ve just open sourced a Java application I’ve been working on at the British Library. It’s a RESTlet server and jQuery/Bootstrap client that enables people to claim a work from a remote service, log into ORCID using OAuth and add the work to their profile.

It was built to work with a British Library service called Ethos (http://ethos.bl.uk), but is easily customizable for use with other metadata providers: integrators simply implement an interface to fetch metadata from their own backend and update the configuration to use it.

You can find the ORCiD client library source (with examples) here: https://github.com/TomDemeranville/orcid-java-client

You can find the application source code here: https://github.com/TomDemeranville/orcid-update-java

You can see it in action here: http://ethos-orcid.appspot.com/

Jan 18 2014

Controlling the cache headers for a RESTlet directory

My previous post described how to serve webjars with RESTlet.  This post will describe how to add caching so that users don’t swamp your servers with requests.

Put simply, you put a filter in the chain before the directory and modify the HTTP headers of successful requests.  Like so:
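A minimal sketch of such a filter, assuming the RESTlet 2.x API. The class name, the helper method and the choice of a `public` plus `max-age` policy are illustrative assumptions rather than the exact code used in the application.

```java
import org.restlet.Context;
import org.restlet.Request;
import org.restlet.Response;
import org.restlet.Restlet;
import org.restlet.data.CacheDirective;
import org.restlet.resource.Directory;
import org.restlet.routing.Filter;

/** Adds cache headers to successful responses from whatever sits next in the chain. */
public class CacheHeaderFilter extends Filter {

    private final int maxAgeSeconds;

    public CacheHeaderFilter(Context context, Restlet next, int maxAgeSeconds) {
        super(context, next);
        this.maxAgeSeconds = maxAgeSeconds;
    }

    @Override
    protected void afterHandle(Request request, Response response) {
        // Only mark successful responses as cacheable; errors are left alone.
        if (response.getStatus().isSuccess()) {
            response.getCacheDirectives().add(CacheDirective.publicInfo());
            response.getCacheDirectives().add(CacheDirective.maxAge(maxAgeSeconds));
        }
    }

    /** Wraps a directory so every request to it passes through the filter first. */
    public static Restlet cached(Context context, String rootUri, int maxAgeSeconds) {
        Directory directory = new Directory(context, rootUri);
        return new CacheHeaderFilter(context, directory, maxAgeSeconds);
    }
}
```

The Restlet returned by `cached(...)` is then attached to the router in the place where the bare directory would otherwise go.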


Jan 16 2014

Using Webjars without Servlet 3 on Google App Engine (GAE)

I recently stumbled upon what is just about the best thing since I first discovered the wonders of Maven – the ability to add JavaScript dependencies in your pom.xml using webjars. Like this:
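A sketch of the sort of entries involved. Webjars are published under the `org.webjars` group; the two libraries and the version numbers here are only examples, so swap in whatever your pages need.

```xml
<dependencies>
  <!-- Illustrative choices: pick the libraries and versions you actually use -->
  <dependency>
    <groupId>org.webjars</groupId>
    <artifactId>jquery</artifactId>
    <version>1.11.1</version>
  </dependency>
  <dependency>
    <groupId>org.webjars</groupId>
    <artifactId>bootstrap</artifactId>
    <version>3.1.1</version>
  </dependency>
</dependencies>
```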

Maven will manage their sub-dependencies and your web pages can access the scripts direct from the jars. Lovely. Sign me up! Except wait, what? Webjars requires Servlet 3 and GAE only supports 2.5. Boo. Thanks Google.

Google have been unable to deal with this at all. There’s a fairly funny unsolved support request about it from way back in 2010.

Frustrated about this, and really surprised there was no existing workaround anywhere, I braved about eleventy million not particularly helpful bits of restlet documentation and finally came up with a workaround.  Yay!

You’ll need RESTlet for this. I expect you could use RESTlet just for this job alongside the rest of your <insert framework here> app, but as I’m using RESTlet anyway it works a treat.

This took me ages because the RESTlet website has been utterly broken for months and RESTlet behaves really rather oddly sometimes.  It’s impossible to get hold of the application within the router and add the CLAP protocol (getApplication() is returning null) and if you attempt to add the correct client connector when building the application itself it somehow forgets by the time you get to the router.  I’ve still no idea why.

What you can do is either (a) create a new component with a new context and add the CLAP protocol to it, or (b) modify your init-params in the web.xml to contain CLAP, like so:
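A sketch of option (b), assuming the standard RESTlet servlet adapter. The servlet name and the application class are placeholders; the part that matters is the `org.restlet.clients` parameter, which lists the client connectors made available to the application.

```xml
<servlet>
  <servlet-name>RestletServlet</servlet-name>
  <servlet-class>org.restlet.ext.servlet.ServerServlet</servlet-class>
  <init-param>
    <param-name>org.restlet.application</param-name>
    <!-- Placeholder: your own Application subclass -->
    <param-value>com.example.MyApplication</param-value>
  </init-param>
  <init-param>
    <!-- Space-separated client protocols; CLAP serves resources from the classpath -->
    <param-name>org.restlet.clients</param-name>
    <param-value>HTTP HTTPS CLAP</param-value>
  </init-param>
</servlet>
```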

The “add a new component” workaround will log “SEVERE don’t do this” style warnings when you start up, so it’s best to do it in the web.xml if possible.