The Biodiversity Heritage Library (BHL) is the world's largest online repository of biodiversity literature with over 54 million pages from 229,000 volumes contributed by 385 institutions. This document discusses BHL's efforts to digitize, transcribe, and link biodiversity literature from Australian institutions. It provides examples of how field diaries, images, taxonomic names, and other documents are being transcribed, tagged, and linked to databases to make the information fully searchable and part of a growing "biodiversity knowledge graph." BHL Australia has expanded from one contributor in 2011 to 21 contributors in 2018 and aims to continue increasing access to biodiversity literature through open sharing and assignment of permanent digital identifiers.
4. BHL Australia 2018
21 contributors (& counting)
• Australian Museum
• Queensland Museum
• Geoscience Australia
• Royal Society of Queensland
• Linnean Society of NSW
• Australian Institute of Marine Science
• Field Naturalists Club of Victoria
• Royal Botanic Gardens VIC
• Great Barrier Reef Marine Park Authority
• Queen Victoria Museum & Art Gallery
• Tasmanian Field Naturalists Club
• Museums Victoria
• Western Australian Museum
• Western Australian Herbarium
• South Australian Museum
• Royal Society of Victoria
• Australian Garden History Society
• Royal Society of Western Australia
• Museum & Art Gallery of the Northern Territory
• Northern Territory Field Naturalists’ Club
• National Herbarium of NSW
6. Museums Victoria
The largest museum organisation in the southern hemisphere
Melbourne Museum • Immigration Museum • Scienceworks • Royal Exhibition Building
22. Field diaries are full of data
Grampians National Park: 1931, 2012, 2014
Images: Heath Warwick & Nicole Kearney / Museum Victoria
Historic observations
• past species’ abundance and distribution
• future biological surveys
• threatened species management
23. OCR from a page of Graham Brown’s diary
l>^v-^wAl^ livU*^/) Curiae
'^tila'* -u^vttcvi Lsefei cit^:<
Lv. 1^ Ol^Vm?iJcw , L>w i^-
^Otv^ dS^^iL* ll^^Uk^
M/tTM^li?'^
tvc4fi>r '^^-^ G^WtY^^
uve^v. llCCUvlr]^vvl^
'^L^>u^ l^t^
You can’t search handwriting…
26. Data – science
5 Graham Brown field diaries: 5611 animal sightings
Date        Species                Location
09/09/1947  Red Wattle bird        Colac, near lake, in flowering gums
13/09/1947  Crested Grebes         Colac East, end of Church St, mouth of the creek
13/09/1947  Little Pied Cormorant  Colac, perched on the wreck
13/09/1947  Mountain Duck          Colac East, end of Church St, mouth of the creek
13/09/1947  Musk Duck              Colac, on the lake
13/09/1947  Silver Gull            Colac, over the lake, opposite Queen's Avenue
27. Data – history
• 547 mentions of people & organisations
• Historical descriptions of places & events
• Personal anecdotes: the life of a 1940s country doctor
30. Along with the transcriptions Searchable
biodiversitylibrary.org
31. Linking items & records
Field diary
Author
Photos
Locations
Taxonomy
Collection events
Chroicocephalus novaehollandiae
Linked
32. Biodiversity knowledge graph
Rod Page, Professor of Taxonomy, University of Glasgow
The links that SHOULD be present between biodiversity information online
Page R (2016) Towards a biodiversity knowledge graph. Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
34. Biodiversity knowledge graph
The biodiversity literature is the foundation upon which our understanding of biodiversity is based.
Page R (2016) Towards a biodiversity knowledge graph. Research Ideas and Outcomes 2: e8767. doi: 10.3897/rio.2.e8767
35. If it’s not online, no one will know it exists…
If it’s not linked, no one will know it exists.
But, there’s now so much online…
64. Tagged = discoverable (citizen scientists)
Tags: Common name • Scientific name • Location
Machine-readable tags
• Original scientific name: taxonomy:binomial=
• Accepted scientific name: taxonomy:binomial=
• Common name: taxonomy:common=
• Artist name: artist:name=
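The machine-readable tags on this slide follow a namespace:predicate=value pattern. As a rough illustration only, the sketch below splits such a tag into its three parts. The function name, the `MachineTag` type and the example values are hypothetical and are not taken from BHL's own tooling; the species names are borrowed from elsewhere in this talk.

```python
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str   # e.g. "taxonomy"
    predicate: str   # e.g. "binomial"
    value: str       # e.g. "Chroicocephalus novaehollandiae"


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split 'namespace:predicate=value'; return None for plain (non-machine) tags."""
    head, eq, value = tag.partition("=")
    if not eq:
        return None
    namespace, colon, predicate = head.partition(":")
    if not colon or not namespace or not predicate:
        return None
    # Values are often wrapped in double quotes when they contain spaces.
    return MachineTag(namespace.strip().lower(),
                      predicate.strip().lower(),
                      value.strip().strip('"'))


# Illustrative tags (assumed examples, not copied from a real tagged image)
tags = [
    'taxonomy:binomial="Chroicocephalus novaehollandiae"',
    'taxonomy:common="Silver Gull"',
    "Silver Gull",  # an ordinary tag, not machine-readable
]
parsed = [parse_machine_tag(t) for t in tags]
```

Ordinary tags such as a bare common name come back as `None`, which is one way to keep human-readable and machine-readable tags apart.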
78. DOI = Digital Object Identifier
Digital Identifier of an Object
unique • permanent • persistent
90% of newly-published articles in the sciences have DOIs (Gorraiz et al. 2016)
90. Dank u wel (Thank you very much)
Nicole Kearney
Manager BHL Australia
nkearney@museum.vic.gov.au
@nicolekearney
@bhl_au
Editor's Notes
My name is Nicole Kearney and I manage the Australian component of the Biodiversity Heritage Library.
As you will know, it is the world’s largest online repository for biodiversity heritage and archival materials.
There are now 385 libraries contributing to the Biodiversity Heritage Library worldwide and I look after the 21 of them that are in Australia. So this represents BHL Australia in 2018…
I coordinate the project in Australia, where we now have 21 libraries contributing their biodiversity heritage and making it openly available online.
But in 2011, when we joined BHL, we were just a single organisation, Museums Victoria, which is where I work and where the project is based. Some of you know Ely Wallis, who started the BHL project in Australia.
Museums Victoria is the largest museum organisation in the southern hemisphere, with a museum covering natural science, history and Indigenous cultures, an immigration museum and a science museum. We also look after the world-heritage-listed Royal Exhibition Building, which is our largest collection object. And we also have an IMAX theatre, a planetarium, and a
…library. Our library has a wonderful collection of rare books – our first director was often criticised for how much of the museum’s money he spent having books shipped out to Australia in the early 1800s. We also have an extensive collection of journals. However, like many museum libraries, our library is not open to the public.
It looks like this.
So it’s wonderful that we are involved in the Biodiversity Heritage Library and that we’ve been able to make these books and journals openly accessible for everyone.
BHL Australia has been nationally funded by the Atlas of Living Australia since the beginning.
The Atlas is our national aggregator of information about the species of Australia, similar to the Encyclopedia of Life. As BHL is the literature provider for ALA, it’s important that we upload as much Australian content as possible.
And so we have been working closely with Australia’s museums, herbaria, royal societies, field naturalist clubs and government agencies so that the biodiversity heritage literature of all these organisations can be freely available online. And it’s my role to look after the digitisation for all of these organisations.
So often my desk looks like this.
And I also receive a lot of material electronically.
We have now digitised well over 1300 rare books and historic journals. This represents over 250,000 pages that used to be locked up in our library archives.
A large proportion of what we contribute to BHL is journals. Almost all of our Australian contributors publish an in-house journal, and we are making sure that our electronic holdings of these journals are complete and gap-free.
We also try to contribute to subject areas that are lacking on BHL. Australia has always been closely involved in Antarctic exploration and discovery, so this is an area that we have tried to expand on BHL.
And like most contributors, we look for things that only we have. Field diaries are particularly special because there is only ever one physical copy. We’ve been working to digitise the field diaries in our collection and to make them accessible online.
We started with diaries that were of particular significance to our current scientists. This is one of a series of diaries that were produced in an area of great conservation interest, one that we’re still actively surveying today. These are bird observations in the Grampians National Park from 1931. Here are our scientists surveying the same area in 2012 and again in 2014.
But while our curators would now be able to read the diaries online, their contents were still unsearchable. This is the OCR output for this handwritten page. In order to unlock their contents and extract the historic data, we needed to transcribe them.
So, in 2014, we started a transcription program.
We were particularly interested in extracting the historic observation data – date, location, scientific name and common name – as well as mentions of people and organisations.
Graham Brown’s five field diaries yielded 5611 animal sightings, complete with date and location data.
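To give a feel for what the extracted observation data can look like once it is structured, here is a minimal sketch. It assumes the transcribed sightings are available as tab-separated text with the Date, Species and Location columns shown on the slide, and that dates are written day/month/year. The `Sighting` class and `parse_sightings` function are hypothetical names for illustration, not part of our collection database.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Sighting:
    observed_on: date
    common_name: str
    location: str


def parse_sightings(text: str) -> list:
    """Read tab-separated rows with Date, Species and Location columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        Sighting(
            # Assumption: day/month/year, as in 13/09/1947
            observed_on=datetime.strptime(row["Date"].strip(), "%d/%m/%Y").date(),
            common_name=row["Species"].strip(),
            location=row["Location"].strip(),
        )
        for row in reader
    ]


# Three rows taken from the slide's sample of Graham Brown's diaries
SAMPLE = "\n".join([
    "Date\tSpecies\tLocation",
    "09/09/1947\tRed Wattle bird\tColac, near lake, in flowering gums",
    "13/09/1947\tMusk Duck\tColac, on the lake",
    "13/09/1947\tSilver Gull\tColac, over the lake, opposite Queen's Avenue",
])

sightings = parse_sightings(SAMPLE)
# Once dates are real date objects, the data can be filtered or sorted
on_13th = [s.common_name for s in sightings if s.observed_on == date(1947, 9, 13)]
```

A scientific-name field could be added in the same way; it is left out here because the slide's sample rows only show common names.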
This was just within our internal collection database, but it represents what we should be doing with everything we put online.
This diagram was published in 2016 in a paper by Rod Page. Rod Page calls this the "biodiversity knowledge graph" and it represents the links that SHOULD be present between "core" biodiversity information online.
This is the bit that we’re most interested in, and it’s the most important, because…
I’m going to demonstrate what we’ve been trying to do using this very cute little animal – the Leadbeater’s Possum, which is the animal emblem for my home state of Victoria (and is also, sadly, highly endangered).
It was first described in 1867 by Museums Victoria’s first director, Frederick McCoy (the same director who spent all our money buying so many books).
The description included this beautiful illustration.
Frederick McCoy later produced a publication, A Prodromus of the fauna of Victoria, which included a reworked version of that illustration.
We have in our museum collection the type specimen from which the description was made,
as well as the original versions of those illustrations,
and some preliminary sketches,
and his hand-written notes about the species.
We’ve linked all of these items in our collection database so that when you look at each record on our online collections website, you can see the related items.
But we’ve also included links to all of the publications on BHL – to the original description, the original illustration and the plate in the Prodromus.
As you will know, BHL also creates links between the literature on its website and taxonomic databases, such as the Encyclopedia of Life.
Taxonomic names are recognized in the OCR on BHL and are shown here.
If those names match a scientific name in EOL, users are able to link directly to the species record in EOL.
And if you click on the literature tab within EOL
You will get a list of all of the mentions of those species in BHL (where they’ve been recognized in the OCR).
Now unfortunately this doesn’t yet work for our field diaries, because
OCR can’t read handwriting.
So no taxonomic names are recognized and no links to EOL are created.
Information and pictures of all species known to science gathered together and made available to everyone – anywhere.
But our bread and butter is journals. Every one of our Australian contributors publishes an in-house journal, and over the past year I have been particularly focussed on ensuring that our electronic holdings of these journals are complete and gap-free.
Many of these organisations have long publishing histories with journal runs of well over 100 years. This first volume of the Australian Museum Memoir, for example, was published in 1851.
And most of our contributing organisations are still actively publishing scholarly content today.
In 2016, I started a project to increase the amount of in-copyright Australian content on BHL. 18 of our contributing libraries have now signed agreements allowing us to upload their in-copyright content, some allowing us to upload articles published as recently as the current year.
But as I started working with these recently published articles, I realised that there was a key piece of bibliographic information that we had to include in their metadata - the DOI.
DOI stands for digital object identifier. A DOI is like an electronic fingerprint, which takes the form of a unique alphanumeric string that is permanently assigned to an item and provides a persistent link to its location online.
DOIs have been almost universally adopted by the scientific publishing community, and by 2016, 90% of newly-published science articles had DOIs.
I registered Museums Victoria with Crossref, the DOI registration agency for scholarly and professional research content. This involved paying an annual membership fee and a fee per DOI we registered.
Crossref allocated us our own unique publisher code, which combined with Crossref’s unique agency code, became the fixed prefix for our DOIs.
We then developed a system for the suffix, which would be unique for each article: j for journal; mmv for Memoirs of Museum Victoria, followed by the year, the volume and a consecutive number for each article in each volume.
And this then becomes the unique persistent link to our articles.
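As a rough sketch of how a suffix like this can be put together and turned into a link, the example below assembles a DOI from a prefix and the parts just described. Several details are assumptions made for illustration: the dot separators, the two-digit article number, the sample year and volume, and the prefix `10.5555`, which is a placeholder and not Museums Victoria's real prefix. The function names are hypothetical.

```python
from urllib.parse import quote

# Placeholder prefix for illustration only (not Museums Victoria's real prefix)
PLACEHOLDER_PREFIX = "10.5555"


def build_suffix(year: int, volume: int, article_number: int) -> str:
    """j = journal, mmv = Memoirs of Museum Victoria, then year, volume, article.

    Assumption: parts are joined with dots and the article number is
    zero-padded to two digits; the talk does not spell out the exact format.
    """
    return f"j.mmv.{year}.{volume}.{article_number:02d}"


def build_doi(prefix: str, suffix: str) -> str:
    """A DOI is the registrant's prefix and the item's suffix joined by a slash."""
    return f"{prefix}/{suffix}"


def doi_url(doi: str) -> str:
    """Resolvable link form of a DOI, via the doi.org resolver."""
    return "https://doi.org/" + quote(doi, safe="/")


# Illustrative values: first article of an assumed volume 1 from 1906
suffix = build_suffix(1906, 1, 1)
doi = build_doi(PLACEHOLDER_PREFIX, suffix)
link = doi_url(doi)
```

Keeping the suffix rule in one small function means every article, old or new, gets an identifier built the same way.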
We were now officially part of The Great Linked Network of Scholarly Research. We would now be able to track citations, collect altmetrics and ensure that the links to our content would never break.
From now on, every Memoirs of Museum Victoria article will receive its DOI – its unique digital fingerprint – at birth, and this fingerprint can be used to identify and cite the article throughout its lifespan, no matter where it travels.
So getting back to my project, finally. I did go all the way back to 1906 and I registered a DOI for all 867 articles we’d ever published. And I managed to do this work before anyone else got there first… I also added the DOI to the electronic versions of the articles to ensure that this critical piece of data would travel with this content wherever it went throughout its lifespan.
And I’ve added these DOIs to the versions on the Biodiversity Heritage Library so that they link back to the definitive versions on our website.
And we jumped in to participate in #ColourOurCollections.