Getting started with the public ORCID API using swagger – quickstart guide

ORCID recently implemented a swagger definition file for its v2.0 API, which means it’s now even easier to access the public ORCID API from your website.  Just use swagger.js.  It’s Super.  And Easy.

Let’s give it a go.

First, clone swagger-js onto your machine.  Either use the Git desktop client, click the clone button on the repository page, or fetch it from the command line if you’re on Linux or OS X:

Next, create a simple webpage called orcid.html in the swagger-js directory.  This is just so we can play around; if you move to production you’ll want to organise your code differently.  Something like this will work fine:
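A minimal sketch of the request such a page makes, written with the browser's built-in fetch rather than swagger.js, so it does not show the swagger.js calls themselves.  The base URL https://pub.orcid.org/v2.0, the function names and the `section` parameter are assumptions made for illustration; check them against the ORCID documentation before relying on them.

```javascript
// Assumed base URL for the v2.0 public API (not taken from the post).
const ORCID_PUBLIC_API = "https://pub.orcid.org/v2.0";

// Build the URL and headers for reading one section of a public record.
// `section` is a hypothetical parameter, e.g. "record" or "works".
function buildOrcidRequest(orcid, section = "record") {
  // ORCID iDs are four groups of four characters; the last may be "X".
  if (!/^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$/.test(orcid)) {
    throw new Error("Not a well-formed ORCID iD: " + orcid);
  }
  return {
    url: `${ORCID_PUBLIC_API}/${orcid}/${section}`,
    headers: { Accept: "application/json" },
  };
}

// Fetch the JSON and write it into a page element (browser only).
async function showRecord(orcid, element) {
  const { url, headers } = buildOrcidRequest(orcid);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error("ORCID API returned " + response.status);
  }
  element.textContent = JSON.stringify(await response.json(), null, 2);
}
```

In a page you might call `showRecord("0000-0002-1825-0097", document.getElementById("output"))`, where the element id is whatever your HTML uses.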

Load the webpage in your browser, then pat yourself on the back.  You’ve just used the ORCID API and written the JSON response to the web page!

[Image: orcid]

That’s not very user friendly though, so let’s not stop there.  Let’s use the data for something useful and make it fancy.  This time we’re going to extract a list of titles for all of the works in the ORCID record.  Create another .html file and paste this into it:
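As a hedged sketch of the title extraction step, independent of swagger.js: it assumes the works response has the v2.0 shape in which each entry of `group` holds a `work-summary` array and each summary has `title.title.value`.  Those field names and the function name `extractTitles` are assumptions; verify them against a real response.

```javascript
// Pull one title per group of works out of an ORCID works response.
// Field names ("group", "work-summary", "title") are assumed, not confirmed.
function extractTitles(worksResponse) {
  const groups = (worksResponse && worksResponse.group) || [];
  return groups
    // Each group bundles versions of the same work; take the first summary.
    .map((group) => (group["work-summary"] || [])[0])
    .filter(Boolean)
    .map((summary) => summary.title?.title?.value)
    // Drop works that have no usable title.
    .filter(Boolean);
}

// Render the titles as a simple HTML list (browser only).
function renderTitles(titles, listElement) {
  for (const title of titles) {
    const item = document.createElement("li");
    item.textContent = title;
    listElement.appendChild(item);
  }
}
```

Using `textContent` rather than building HTML strings keeps any odd characters in a title from being interpreted as markup.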

Fantastic.  It should look something like this:

[Image: works]

That’s fine, as far as it goes, but we’re not using the real power of identifiers here.  Let’s put a bit more effort in and create some links.  The code below does two things: first it restructures the JSON into something more useful, then it checks to see if we’ve got some kind of link we can apply (from a DOI, Handle or URI).  I’ve only pasted in the main function for brevity.
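A rough sketch of those two steps, under stated assumptions: the field names (`external-ids`, `external-id-type`, `external-id-value`) follow the v2.0 JSON as I understand it, the resolver URLs are the usual doi.org and hdl.handle.net ones, and `simplifyWorks` and `linkFor` are hypothetical names rather than the functions from the original page.

```javascript
// Turn an identifier into a clickable link, keyed by identifier type.
const RESOLVERS = {
  doi: (value) => "https://doi.org/" + value,
  handle: (value) => "https://hdl.handle.net/" + value,
  uri: (value) => value, // already a link
};

// Return the first link we can build, preferring DOI, then Handle, then URI.
function linkFor(externalIds) {
  for (const type of ["doi", "handle", "uri"]) {
    const match = externalIds.find(
      (id) => (id["external-id-type"] || "").toLowerCase() === type
    );
    if (match) return RESOLVERS[type](match["external-id-value"]);
  }
  return null; // nothing linkable
}

// Step 1: flatten the nested response into { title, link } objects.
// Step 2: attach a link where an identifier allows it.
function simplifyWorks(worksResponse) {
  return ((worksResponse && worksResponse.group) || []).map((group) => {
    const summary = (group["work-summary"] || [])[0] || {};
    const ids = summary["external-ids"]?.["external-id"] || [];
    return {
      title: summary.title?.title?.value || "(untitled)",
      link: linkFor(ids),
    };
  });
}
```

Works without a DOI, Handle or URI come back with `link: null`, so the page can fall back to showing a plain title.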

Which gives you some lovely extra info:

[Image: works2]

Hopefully that’s got you up and running with a small insight into what is possible.  Next time I’ll run through using the member API (including updating records) in the same way.

Differences between ORCID and DataCite Metadata

(written by Martin Fenner and cross posted from the DataCite blog – I’m one of the co-authors of the report)

One of the first tasks for DataCite in the European Commission-funded THOR project that started in June was to contribute to a comparison of the ORCID and DataCite metadata standards. Together with ORCID, CERN, the British Library and Dryad we looked at how contributors, organizations and artefacts – and the relations between them – are described in the respective metadata schemata, and how they are implemented in two example data repositories, Archaeology Data Service and Dryad Digital Repository. The focus of our work was on identifying major gaps. Our report was finished and made publicly available via http://doi.org/10.5281/zenodo.30799 last week. The key findings are summarized below:

  • Common Approach to Personal Names
  • Standardized Contributor Roles
  • Standardized Relation Types
  • Metadata for Organisations
  • Persistent Identifiers for Projects
  • Harmonization of ORCID and DataCite Metadata

Common Approach to Personal Names

While a single input field for contributor names is common, separate fields for given and family names are required for proper formatting of citations. As long as citations to scholarly content rely on properly formatted text rather than persistent identifiers, services holding bibliographic information have to support these separate fields. Further work is needed to help with the transition to separate input fields for given and family names, and to handle contributors that are organizations or groups of people.
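To illustrate why the separate fields matter, here is a small sketch that formats a name in a common "Family, G." citation style. The function name and the object shape `{ given, family }` are assumptions for illustration, not part of either metadata schema.

```javascript
// Format a contributor as "Family, G. H." from separate name fields.
// With a single free-text name field this is not reliably possible,
// because we cannot tell which part is the family name.
function formatCitationName({ given = "", family }) {
  const initials = given
    .split(/[\s-]+/) // treat spaces and hyphens as part separators
    .filter(Boolean)
    .map((part) => part[0].toUpperCase() + ".")
    .join(" ");
  // Organizations or mononyms may have no given name at all.
  return initials ? `${family}, ${initials}` : family;
}
```

For example, `formatCitationName({ given: "Martin", family: "Fenner" })` gives "Fenner, M.", while a contributor with only a `family` value (such as an organization name) is returned unchanged.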

Standardized Contributor Roles

The currently existing vocabularies for contributor type (DataCite) and contributor role (ORCID) provide a high-level description, but fall short when trying to describe the author/creator contribution in more detail. Project CRediT is a multi-stakeholder initiative that has developed a common vocabulary with 14 different contributor roles, and this vocabulary can be used to provide this detail, e.g. who provided resources such as reagents or samples, who did the statistical analysis, or who contributed to the methodology of a study.

CRediT is complementary to existing contributor role vocabularies such as those by ORCID and DataCite. For contributor roles it is particularly important that the same vocabulary is used across stakeholders, so that the roles described in the data center can be forwarded first to DataCite, then to ORCID, and then also to other places such as institutional repositories.

Standardized Relation Types

Capturing relations between scholarly works such as datasets in a standardized way is important, as these relations are used for citations and thus the basis for many indicators of scholarly impact. Currently used vocabularies for relation types between scholarly works, e.g. by CrossRef and DataCite, only partly overlap. In addition we see differences in community practices, e.g. some scholars but not others reserve the term citation for links between two scholarly articles. The term data citation is sometimes used for all links from scholarly works to datasets, but other times reserved for formal citations appearing in reference lists.

Metadata for Organisations

Both ORCID and DataCite not only provide persistent identifiers for people and data, but they also collect metadata around these persistent identifiers, in particular links to other identifiers. The use of persistent identifiers for organisations lags behind the use of persistent identifiers for research outputs and people, and more work is needed.

Persistent Identifiers for Projects

Research projects are collaborative activities among contributors that may change over time. Projects have a start and end date and are often funded by a grant. The existing persistent identifier (PID) infrastructure does support artefacts, contributors and organisations, but there is no first-class PID support for projects. This creates a major gap that becomes obvious when we try to describe the relationships between funders, contributors and research outputs.

Both the ORCID and DataCite metadata support funding information, but only as direct links to contributors or research outputs, respectively. This not only makes it difficult to exchange funding information between DataCite and ORCID, but also fails to adequately model the sometimes complex relationships, e.g. when multiple funders and grants were involved in supporting a research output. We therefore not only need persistent identifiers for projects, but also infrastructure for collecting and aggregating links to contributors and artefacts.

Harmonization of ORCID and DataCite Metadata

We identified significant differences between the ORCID and DataCite metadata schema, and these differences hinder the flow of information between the two services. Several different approaches to overcome these differences are conceivable:

  1. only use a common subset, relying on linked persistent identifiers to get the full metadata
  2. harmonize the ORCID and DataCite metadata schemata
  3. use common API exchange formats for metadata

The first approach is the linked open data approach, and was designed specifically for scenarios like this. One limitation is that it requires persistent identifiers for all relevant attributes (e.g. for every creator/contributor in the DataCite metadata). One major objective for THOR is therefore to increase the use of persistent identifiers, both by THOR partners, and by the community at large.

A common metadata schema between ORCID and DataCite is neither feasible nor necessarily needed. In addition, we have to also consider interoperability with other metadata standards (e.g. CASRAI, OpenAIRE, COAR), and with other artefacts, such as those having CrossRef DOIs. What is more realistic is harmonization across a limited set of essential metadata.

The third approach to improve interoperability uses a common API format that includes all the metadata that need to be exchanged, but doesn’t require the metadata schema itself to change. This approach was taken by DataCite and CrossRef a few years ago to provide metadata for DOIs in a consistent way despite significant differences in the CrossRef and DataCite metadata schema. Using HTTP content negotiation, metadata are provided in a variety of formats.
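As a sketch of what content negotiation looks like from the client side: the same DOI URL is requested each time, and only the Accept header changes. The media types below are the ones commonly documented for DOI content negotiation, but treat the list, the `FORMATS` keys and the function name as assumptions to check against the current DataCite and CrossRef documentation.

```javascript
// Media types assumed to be supported by DOI content negotiation.
const FORMATS = {
  csl: "application/vnd.citationstyles.csl+json",
  bibtex: "application/x-bibtex",
  datacite: "application/vnd.datacite.datacite+xml",
};

// Build a request for a DOI's metadata in the chosen format.
function buildDoiMetadataRequest(doi, format = "csl") {
  const mediaType = FORMATS[format];
  if (!mediaType) {
    throw new Error("Unknown format: " + format);
  }
  return {
    url: "https://doi.org/" + encodeURI(doi),
    headers: { Accept: mediaType },
  };
}

// Fetch the metadata as text (works in browsers and recent Node versions).
async function fetchDoiMetadata(doi, format = "csl") {
  const { url, headers } = buildDoiMetadataRequest(doi, format);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error("DOI resolver returned " + response.status);
  }
  return response.text();
}
```

Calling `fetchDoiMetadata("10.5281/zenodo.30799", "bibtex")` would ask for the report's metadata as BibTeX, without either registration agency having to change its own schema.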

ETD2014 slides

I’m having a great time attending the ETD2014 conference.  There’s been lots of lively discussion around ORCiD and DOIs and it’s been fantastic to gather wider perspectives.  It’s also been great to get some coding in, adapting the import tool to work with the Leicester institutional repository.

For those who are interested in the ORCiD integration I was discussing earlier, the live application can be found at http://ethos-orcid.appspot.com and the code at https://github.com/TomDemeranville/, and I’ve popped the slides on figshare (http://dx.doi.org/10.6084/m9.figshare.1117858).

 

ELAG2014 – Slides

I recently presented at ELAG2014 about ORCiD integration.  I’ve embedded my slides below.  They might not make quite as much sense without context – they’re mainly pictures with single word topic headings, but they contain links to the source code etc.

I had a great time and met a lot of interesting folks whilst at ELAG.  I’m now in contact with two UK universities looking at ORCiD integration so I think it’s been useful.

I’ll add some commentary and a summary of the talk later this week.

Rather appropriately this is up on figshare – an open data repository that lets you upload anything then assigns it a DOI. This one has the DOI http://dx.doi.org/10.6084/m9.figshare.1057954

 

 

ORCiD tools – who’s claiming what?

As part of my work with data-centres and ORCiD I’ve put together a tool that lets you see where works claimed within ORCiD have been published.   Start typing a publisher into the search box and it’ll look up the DOI prefix (or other identifier prefix) for that publisher from a list nearly 4000 long.  Current highlights include the American Helicopter Society with 9 ORCiDs.
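The lookup idea can be sketched roughly as follows. This is not the tool's actual code (which is a Java application); the two-entry list, the object shape and the function names are hypothetical, with the prefixes taken from the figshare and Zenodo DOIs quoted elsewhere on this blog.

```javascript
// A tiny stand-in for the real list of nearly 4000 publishers.
const PUBLISHER_PREFIXES = [
  { name: "figshare", prefix: "10.6084" },
  { name: "Zenodo", prefix: "10.5281" },
];

// Suggest publishers whose name starts with what the user has typed.
function suggestPublishers(typed, publishers = PUBLISHER_PREFIXES) {
  const needle = typed.trim().toLowerCase();
  if (!needle) return [];
  return publishers.filter((p) => p.name.toLowerCase().startsWith(needle));
}

// Extract the registrant prefix (the part before the first slash) from a DOI.
function doiPrefix(doi) {
  const match = /^(10\.\d{4,9})\//.exec(doi);
  return match ? match[1] : null;
}

// Keep only the DOIs that belong to the given prefix.
function worksForPrefix(dois, prefix) {
  return dois.filter((doi) => doiPrefix(doi) === prefix);
}
```

Typing "fig" would suggest figshare, and its prefix 10.6084 can then be used to find the matching works among claimed DOIs.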

See it in action at http://ethos-orcid.appspot.com/search

Why?

One of the things that’s repeatedly come up when talking to data-centres about author metadata is that while it’s easy to push data in, it’s pretty hard to get it back out.  Recent changes in the ORCiD API have made this easier, hence this tool.

Many datasets have scant author metadata, with little more than an institution name attributed to them.  Data-centres can now pull the claims information out of ORCiD and use it to selectively enrich their own metadata, completing a “virtuous circle” of integration.

Source code?

This code is available as part of the orcid-update-java application, which uses the orcid-java-client library.