What are we DOIng about the missing links? Connecting taxonomic names to the linked network of scholarly research
1. Nicole Kearney
Biodiversity Heritage Library Australia
@nicolekearney
6. “Of all the Mammalia yet known
it seems the most extraordinary…
…it naturally excites the idea
of some deceptive preparation
by artificial means.”
The Naturalists Miscellany (Shaw 1799)
…perfect resemblance of the
beak of a duck engrafted onto
the head of a quadruped…
Platypus anatinus Shaw 1799
8. If it’s not online,
no one will know it exists…
If it’s not linked,
no one will know it exists.
But, there’s now so much online…
(almost 55 million pages on BHL alone)
9. Page R (2016) Towards a biodiversity knowledge graph.
Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
Biodiversity knowledge graph
Rod Page
Professor of Taxonomy
University of Glasgow
The links that
SHOULD
be present
between the
core biodiversity
information
online
10. Biodiversity knowledge graph
Page R (2016) Towards a biodiversity knowledge graph.
Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
→ linked network of
biodiversity knowledge
11. Biodiversity knowledge graph
Page R (2016) Towards a biodiversity knowledge graph.
Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
?
12. Biodiversity knowledge graph
Page R (2016) Towards a biodiversity knowledge graph.
Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
?
The literature is the
foundation upon which
our understanding of
biodiversity is based.
40. museumsvictoria.com.au
doi
doi
A way to access full text
A remarkable new pygmy seahorse
(Syngnathidae: Hippocampus) from
southeastern Australia, with a redescription
of H. bargibanti Whitley from New Caledonia
DOI
1997
(registered in 2017)
41. https://doi.org/10.24199/j.mmv.2017.76.01
Prefix: 10.24199 (24199 = Museums Victoria’s unique code)
Suffix: j.mmv.2017.76.01 (journal.MemoirsMuseumVictoria.year.volume.#)
DOI registration agency: Crossref (scholarly & professional research content)
43. museumsvictoria.com.au
doi
doi
A way to access full text
A review of the tuskfishes, genus Choerodon
(Labridae, Perciformes), with descriptions of
three new species
Landing page (must be publicly accessible):
a full bibliographic citation
the DOI (displayed as a URL)
a way to access the content (access controlled by the publisher)
#OpenAccess
46. museumsvictoria.com.au
doi
doi
A way to access full text
A remarkable new pygmy seahorse
(Syngnathidae: Hippocampus) from
southeastern Australia, with a redescription
of H. bargibanti Whitley from New Caledonia
53. You must have the rights for the content
you register…
54. What if there are no rights?
Can anyone register a DOI for
an out-of-copyright article?
…and thereby make the copy
on their website the definitive
version of that article?
…the version everyone must cite?
55. There are “no technical or policy barriers to registering
out-of-copyright content with Crossref…
Susan Collins, Crossref Outreach Manager, pers. comm. 22 July 2017
What if there are no rights?
If content is in the public domain then there's nothing
stopping a member registering that content…
58. Crossref ERROR message
ISSN "00804703" has already been assigned
Papers and Proceedings of the Royal Society of Tasmania
is owned by publisher: 5962
5962 = Biodiversity Heritage Library
Good morning. My name is Nicole Kearney and I manage the Australian component of the Biodiversity Heritage Library – BHL Australia.
In this symposium, we're hearing about all the tremendous work undertaken by the BHL community around the world, who have together contributed over 54 million pages of biodiversity heritage literature, all of which is openly accessible online. And this incredible resource contains information on millions and millions of species, including this one…
This is the first description of the Platypus, published in 1799. This is this animal’s international debut in the literature.
https://www.biodiversitylibrary.org/page/40321089#page/236/mode/2up
Shaw gave this animal the name Platypus anatinus…
…a name that has now changed 9 times. But while the taxonomy of this animal is certainly fascinating…
so too is the contextual information and the history surrounding these descriptions and revisions. Shaw introduced this animal to the world: Of all the Mammalia yet known it seems the most extraordinary……at first view, it naturally excites the idea of some deceptive preparation by artificial means.
We’ve put this publication online for all to read….
https://www.biodiversitylibrary.org/page/40321089#page/236/mode/2up
But putting something online isn’t enough anymore. We used to think that if something wasn’t online, no one would know it even existed. But there’s now so much online. Now, if something is not linked to something else – if it’s not part of a network of information, it may as well not exist.
This is a diagram that was produced by Rod Page, who is a professor of taxonomy at the University of Glasgow. Rod calls this the "biodiversity knowledge graph" and it represents the links that SHOULD be present between "core" biodiversity information online.
Now there is an enormous amount of work going on across the globe linking these datasets and the global biodiversity community is now very focused on creating a linked network of biodiversity knowledge. Last month’s GBIC conference in Copenhagen was focussed on exactly this.
But we’re still failing to link taxon names to the literature, or at least to link them in a way that is useful to our users. And this is the most important link in the whole biodiversity knowledge graph, because the literature is the foundation upon which our understanding of biodiversity is based.
Now BHL does a wonderful job of linking out. The optical character recognition and taxonomic name recognition allow users to link directly from mentions of taxon names in the literature to the taxon name records on the Encyclopedia of Life website.
https://www.biodiversitylibrary.org/page/40321089#page/236/mode/2up
What I’m talking about is the missing links FROM our taxonomic databases to the literature.
Can you get back to the platypus description from our taxonomic databases?
If I were interested in finding information about the platypus I would start by searching for it on the Atlas of Living Australia website - Australia’s aggregator of biodiversity knowledge.
I would immediately learn that the species was described by someone called Shaw in 1799.
If I clicked on the names tab…
https://bie.ala.org.au/species/urn:lsid:biodiversity.org.au:afd.taxon:ac61fd14-4950-4566-b384-304bd99ca75f
And then on this scientific name that appears as a link…
https://bie.ala.org.au/species/urn:lsid:biodiversity.org.au:afd.taxon:ac61fd14-4950-4566-b384-304bd99ca75f#names
I would be taken to this website: the Australian Faunal Directory.
Now here’s where I might get excited, because that’s a citation for Shaw’s original description – and it’s a link.
But if I click on that link
I get taken to this page, which has the linked citation again, but if I click on any of these links,
I get taken back to this page, which represents a dead end in my search for information.
I just want to get to here.
https://www.biodiversitylibrary.org/page/40321089#page/236/mode/2up
The species profile for the platypus on GBIF starts with a tantalising citation of that first description. It looks like a link, but it’s not a link – nothing happens when you click on it.
The Catalogue of Life has these tiny book symbols next to the taxon names. If you hover over them, they say there are three literature references for the platypus,
But none of them are Shaw’s and none of them are links.
On the Encyclopedia of Life website, there’s no mention of Shaw here, but if you click on the names tab
http://eol.org/pages/323858/overview
There he is. Now that’s not a link, but there’s a literature tab
http://eol.org/pages/323858/overview
which gives me a lovely long list of references, none of which are links, and if I scroll down
http://eol.org/pages/323858/literature
…to the s’s, there’s no Shaw.
http://eol.org/pages/323858/literature
However, the BHL is the literature provider for the Encyclopedia of Life. And in the BHL tab, I find 4 pages of results – 4 pages of links to mentions of Ornithorhynchus anatinus on BHL. These are links directly from the taxonomic database to the literature online.
http://eol.org/pages/323858/literature/bhl
But if you’re looking for Shaw in this list, you’re not going to find his name. The bibliographic data presented here is at the volume level, not the article level, so there are no author names or article titles provided. These lists are a fascinating treasure trove of information, but they’re a bit like a lucky dip. I love clicking through all these links to see where they may take me (and you can easily spend an afternoon doing this), but this isn’t very helpful if you’re looking for a specific article title or a specific author.
http://eol.org/pages/323858/literature/bhl
So, despite trying all those taxonomic aggregators, I still haven’t been able to link through to this, the first description. And it was only EOL that had any links at all.
https://www.biodiversitylibrary.org/page/40321089#page/236/mode/2up
I’m going to jump forward in time now and talk about how linking works for modern literature. This is another first description – it’s a new species of pygmy seahorse – which had its international debut in the literature just this month. When this modern publication first appeared online it was accompanied by a DOI.
Which is in the form of a URL – it’s a link. A DOI is like an electronic fingerprint – this publication received its DOI at birth and it will carry this DOI throughout its lifetime wherever it travels online.
In 2016, 90% of newly published articles in the sciences had DOIs, and that percentage continues to grow.
DOIs are unique – each publication should only ever have one
Permanent - they shouldn’t ever change
And persistent – the links to DOIs should never break
So anyone else citing this article can link to it using the DOI and the link will always resolve here… https://twitter.com/ZooKeys_Journal/status/1025038828380348421
And if you scroll down to the reference list of this paper…
…you will find a lot more DOIs - links that will take you directly to the articles cited in this paper – such as this one
This is a paper published in the Memoirs of Museum Victoria in 1997. Now it’s a bit unusual for articles this old to have DOIs. I can tell you that it did not receive one at birth, nor did it receive one when it was first available online. I know this because I registered this DOI, and I did so only last year.
https://doi.org/10.24199/j.mmv.1997.56.10
I registered Museums Victoria with Crossref, the DOI registration agency for scholarly and professional research content. They allocated us our own unique publisher code, which, combined with Crossref’s unique agency code, became the fixed prefix for our DOIs.
And this then becomes the unique persistent link to our articles.
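The prefix and suffix structure described above can be sketched in a few lines of Python. This is only an illustration: the names `DoiParts`, `split_doi` and `doi_url` are made up for this example and are not part of any Crossref tooling, and the check is deliberately loose rather than a full DOI validator. The example DOI is the 1997 Memoirs of Museum Victoria article.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    prefix: str      # fixed for the publisher, e.g. "10.24199"
    registrant: str  # the publisher code inside the prefix, e.g. "24199"
    suffix: str      # chosen by the publisher, e.g. "j.mmv.1997.56.10"


def split_doi(doi: str) -> DoiParts:
    """Split a DOI (bare, or written as a doi.org URL) into prefix and suffix."""
    doi = doi.strip()
    # Drop a leading resolver URL or "doi:" label if one is present.
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return DoiParts(prefix, prefix[len("10."):], suffix)


def doi_url(doi: str) -> str:
    """Display a DOI as a clickable URL."""
    parts = split_doi(doi)
    return f"https://doi.org/{parts.prefix}/{parts.suffix}"


parts = split_doi("https://doi.org/10.24199/j.mmv.1997.56.10")
```

Here `parts.prefix` is the fixed part shared by every Museums Victoria DOI, while `parts.suffix` follows the journal.year.volume.number pattern chosen for the Memoirs.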
As members of Crossref, we now had to adhere to the membership obligations.
Each article we published had to have its own landing page, and that landing page had to contain 3 key pieces of information: the DOI (displayed as a URL), a full bibliographic citation and a way to access the full text. It’s important to note here that the landing page must be publicly accessible, but access to the content itself is controlled by the publisher, which means that commercial publishers can put a paywall between their publicly accessible landing page and the content.
All of Museums Victoria’s articles are open access so clicking on this link takes you directly to the content.
This is the first article we published in 2017 and our first to receive a DOI
It is also a requirement of our membership that we include the DOIs (where they exist) for every article we cite in our journal articles. And that these DOIs are clickable links. This means our readers can link directly from our articles to other articles cited here that they may be interested in.
And now that our articles had DOIs, all articles that cite our papers, like the 2018 seahorse paper, HAVE to include our DOIs in their reference lists.
https://doi.org/10.24199/j.mmv.1997.56.10
Now I went all the way back to 1906, the very first volume of the Memoirs of Museum Victoria, and I registered DOIs for every single article we’d ever published.
However, our back issues don’t just exist on our own website. Here is that same 1906 article on the Biodiversity Heritage Library. But even though there are multiple copies of this article online, there should be only one DOI. And as that DOI is now a critical part of that article’s metadata, it should be included in every mention of that article, as a live link. The DOIs for all of the Museums Victoria articles have now been added to BHL, so if you click on this link, you’ll be taken to the definitive version of that article – the one on Museums Victoria’s website.
https://www.biodiversitylibrary.org/part/50681#/summary
This is the version everyone else must cite when they reference this article – using this DOI.
I also added the DOI to the PDFs of these articles to ensure that this critical piece of data would travel with this content wherever it went so that if someone downloaded or printed off the PDF, they would always have the DOI. It took me a while. There were 867 articles.
But it was totally worth it because the articles in our back issues have significant scientific value, equal to that of our current literature. The Memoirs of Museum Victoria is primarily a taxonomic journal and is therefore filled with scientific descriptions of species. The classification of living things depends (perhaps more than any other area of science) upon historic literature.
And now all of this historic literature is part of the great linked network of scholarly research. And Museums Victoria can now track citations, collect altmetrics and ensure that the links to our content will never break.
Now I’m going to take you back to the Crossref Member Obligations. Right down there at number 10 it states that “You must have the necessary rights for the content you register.” Now that set me thinking…
Most of the content that I had been assigning DOIs to was out of copyright and in the public domain. And so is most of the content that I deal with in my role with the Biodiversity Heritage Library. So what happens when there are no rights? Can anyone register a DOI for an out-of-copyright article? And thereby make the copy on their website the definitive version of that article – the version everyone else must cite?
I had a lot of trouble finding answers to these questions. There is no information on either the DOI foundation website or the Crossref website about assigning DOIs to out-of-copyright material. And so I wrote to Crossref. They got back to me with the following information…
The University of Tasmania host the online versions of the Papers and Proceedings of the Royal Society of Tasmania and Juliet had just started working with the Royal Society to register DOIs for their articles.
She had just tried to register the journal’s title with Crossref when she received an error message.
The message said that the ISSN had already been assigned. The Papers and Proceedings of the Royal Society of Tasmania is owned by publisher: 10.5962. Juliet was contacting me because publisher 10.5962 is the Biodiversity Heritage Library.
BHL had indeed assigned DOIs to a small number of articles from the Papers and Proceedings of the Royal Society of Tasmania, including this one published in 1865 and, in doing so, had registered the title of the journal, making it impossible for anyone else to register it.
And that makes this, the version on BHL, the definitive version that everyone must cite.
Now BHL were completely happy to transfer “ownership” of both the journal and the articles to the University of Tasmania. Crossref has an established procedure for this.
And now that the University of Tasmania have control of the metadata, they can change the URL attached to the existing DOIs and redirect them to their website. So while the DOI will not change, the URL it points to will, and this version will become the definitive version of this article.
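The separation between a DOI and the URL it points to can be illustrated with a toy in-memory registry. This is a simplified sketch and not how Crossref or the DOI resolver is actually implemented; the class `ToyDoiRegistry`, the example DOI and both host URLs are hypothetical.

```python
class ToyDoiRegistry:
    """A toy stand-in for a DOI resolver: maps each DOI to one target URL."""

    def __init__(self) -> None:
        self._targets = {}

    def register(self, doi: str, url: str) -> None:
        # A DOI can only be registered once; later changes go through update_url.
        if doi in self._targets:
            raise ValueError(f"{doi} is already registered")
        self._targets[doi] = url

    def update_url(self, doi: str, new_url: str) -> None:
        # Whoever controls the metadata can repoint the DOI; the DOI itself is unchanged.
        if doi not in self._targets:
            raise KeyError(doi)
        self._targets[doi] = new_url

    def resolve(self, doi: str) -> str:
        return self._targets[doi]


registry = ToyDoiRegistry()
doi = "10.5962/example.0001"  # hypothetical DOI, for illustration only
registry.register(doi, "https://first-host.example/article/1")

# The link that people put in their reference lists never changes...
citation_link = f"https://doi.org/{doi}"

# ...even after control of the metadata is transferred and the target is redirected.
registry.update_url(doi, "https://second-host.example/article/1")
```

After the update, `registry.resolve(doi)` returns the second host's URL, while `citation_link` is exactly what it was before. That indirection is what lets the definitive version move without breaking any existing citation.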
So if there are any publishers in the room who haven’t assigned DOIs to your legacy content yet, you might want to think about the fact that someone else could get there first.
Now, you might think I’m being alarmist, but I have been starting to see some very worrying trends regarding DOIs and historic literature.
This is the first description of my very favourite Australian animal – it’s a Striped Possum. It doesn’t just occur in Australia; it also lives in Indonesia and New Guinea.
The article was published in 1858, and includes descriptions of species sent to the British Museum from the Aru Islands (in Indonesia) by Alfred Russel Wallace. It is freely available online, as it should be, here on the Biodiversity Heritage Library website. The DOI is included here because it is part of the article’s bibliographic data, and it links to the definitive version of this article online, which isn’t this version.
It’s this one – on the Wiley Online Library website. This is the publicly accessible landing page of the definitive version of this article, but when you click on this link to continue reading the full article, you are taken to…
this page. If you want to keep reading, you need to pay: US$6 to rent it for 48 hours; US$38 to keep a copy permanently.
Now I can just cope with the first description of this beautiful species being behind a paywall, because there are ways of getting around paywalls
And I’m talking about legal ways of getting around them. This is Unpaywall, which is a free plugin you can download that searches for openly-accessible legal versions elsewhere when you come across content that is behind a paywall.
However, initiatives such as Unpaywall rely on DOIs – for their algorithms to be able to locate open access versions elsewhere, those open access versions need to have the DOI attached.
This means that the Biodiversity Heritage Library has a critical role to play in making their freely available versions discoverable, particularly when the DOI’d versions are behind a paywall.
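Why the DOI has to travel with the open copy can be shown with a small lookup. This is not Unpaywall's actual algorithm, just a sketch of DOI-based matching under simple assumptions; the records, host names and the DOI `10.1234/example.1858` are all invented for the example.

```python
from typing import List, Optional

# Hypothetical records for three online copies of the same article.
copies = [
    {"host": "publisher.example", "doi": "10.1234/example.1858", "open_access": False},
    {"host": "library.example", "doi": "10.1234/example.1858", "open_access": True},
    {"host": "archive.example", "doi": None, "open_access": True},  # no DOI attached
]


def normalise(doi: Optional[str]) -> Optional[str]:
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' label."""
    if doi is None:
        return None
    doi = doi.strip().lower()
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(lead):
            doi = doi[len(lead):]
            break
    return doi


def find_open_copies(doi: str, records: List[dict]) -> List[str]:
    """Return the hosts of open access copies whose metadata carries this DOI."""
    wanted = normalise(doi)
    return [
        record["host"]
        for record in records
        if record["open_access"] and normalise(record["doi"]) == wanted
    ]


open_hosts = find_open_copies("https://doi.org/10.1234/EXAMPLE.1858", copies)
```

Only the library copy is found. The archive copy is just as open, but without the DOI in its metadata a DOI-based search has no way to connect it to the paywalled version.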
There is no doubt that bringing all this legacy material into the DOI system makes it discoverable, linkable and citable. The historic literature must be part of the great linked network of scholarly research if we want to fix those missing links from our taxon names.
But if we are going to retrospectively assign DOIs to legacy content, there need to be some established guidelines and procedures to follow, and perhaps also some funding allocated to this critical work. And we really need the DOI foundation and the DOI registration agency to recognise the importance of the legacy literature. But what I’d really like to see – my little dream – is that all definitive versions of out-of-copyright content are #OpenAccess by default.
We’d like to thank the Biodiversity Heritage Library for making all those millions of pages of historic content openly accessible to everyone, Museums Victoria for hosting the BHL project in Australia and the Atlas of Living Australia for their ongoing funding and support.