How to load the ORCID data dump into mongo, without dying of old age
Since writing this post, I’ve worked out a better way of doing things: use Python to read directly from the tar file and inject the results into Mongo! I used it with the 2017 ORCID data dump and it’s like lightning compared to the method described below. Basically, download the script and run:
python3 importer.py --file filename.tar.gz --collection target-collection-name
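For anyone curious what "read directly from the tar file" looks like, here is a minimal sketch of the general idea. It is not the actual importer.py, which isn't reproduced here. It assumes the archive holds one JSON file per record and that you hand it a pymongo collection; the function names, the batch size, and the database and collection names in the usage comment are all made up for illustration.

```python
import json
import tarfile


def iter_json_records(tar_path):
    """Yield one parsed document per .json member, without unpacking to disk."""
    # "r:*" lets tarfile work out the compression (gzip, bz2 or none) itself.
    with tarfile.open(tar_path, mode="r:*") as archive:
        for member in archive:
            # Assumption: records are regular files ending in .json; skip the rest.
            if not member.isfile() or not member.name.endswith(".json"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield json.load(handle)


def load_into_collection(tar_path, collection, batch_size=1000):
    """Insert records in batches and return how many documents were sent."""
    batch = []
    total = 0
    for record in iter_json_records(tar_path):
        batch.append(record)
        if len(batch) >= batch_size:
            # One round trip per batch instead of one per document.
            collection.insert_many(batch, ordered=False)
            total += len(batch)
            batch = []
    if batch:
        collection.insert_many(batch, ordered=False)
        total += len(batch)
    return total


# Hypothetical usage, assuming pymongo is installed and Mongo runs locally:
#   from pymongo import MongoClient
#   target = MongoClient()["orcid"]["target-collection-name"]
#   print(load_into_collection("filename.tar.gz", target))
```

The speed-up in a sketch like this comes from two things: nothing is extracted to disk first, and documents go to Mongo in batches rather than one insert at a time.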