OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories

Petr Knoth, Valerii Budko, Viktoriia Pavlenko, Matteo Cancellieri
CORE, Knowledge Media Institute, The Open University
Open Repositories 2022, Denver, USA
Acknowledgements: Herbert van de Sompel, Martin Klein

Outline
1. Need for globally unique Persistent Identifiers (PIDs) identifying scholarly entities.
2. The problem: Why are DOIs not sufficient as PIDs for repository-based research
outputs?
3. OAIs as distributed and free to mint PIDs for repositories
4. Introduce the new OAI resolver
5. The benefits of linking DOIs with OAIs

Scholarly PIDs
• Strong push for Persistent Identifiers (PIDs) to identify
different entities in the scholarly knowledge graph
• Examples:
• DOIs to identify VoRs
• ORCID to identify authors
• Ringgold IDs to identify organisations
• …

Drivers for PID adoption in repositories
• Open Access Policies: Plan S, REF2021 OA Policy, UKRI OA Policy …
• Policies typically ask for: “all repository outputs to have PIDs”
• Deposits of papers into repositories (multiple authors can / should deposit)
• Repositories often mint DOIs in response to this requirement => duplicates
• Analytics, bibliometrics, scholarly knowledge graph

Important background about DOIs (1/2): Resolvable to its location
“A DOI name is permanently assigned to an object to provide a resolvable persistent
network link to current information about that object, including where the object, or
information about it, can be found on the Internet.” [DOI Handbook]
The current practice in scholarly communication is for the DOI to mainly resolve to the
VoR on the publisher’s site.

Important background about DOIs (2/2): Uniqueness
“Uniqueness (specification by a DOI name of one and only one referent) is enforced by
the DOI system. It is desirable that two DOI names should not be assigned to the same
thing.” [DOI Handbook]

Illustrative scenarios
Multiple authors at multiple institutions deposit more than one AAM in more than one
repository => more than one PID (different locations)
[Figure: Preprint; Repo2 (AAM); Repo1 (AAM); Repo3 (AAM); Journal (VoR)]

Illustrative scenarios: same DOI everywhere
[Figure: Preprint; Repo2 (AAM); Repo1 (AAM); Repo3 (AAM); Journal (VoR). PID labels shown: DOI 1, DOI 1, DOI 1, DOI 1]

Illustrative scenarios: different or missing PIDs
[Figure: Preprint; Repo2 (AAM); Repo1 (AAM); Repo3 (AAM); Journal (VoR). PID labels shown: DOI 1, DOI 2, No PID, DOI 1]

Illustrative scenarios: resource not published yet => DOI not assigned yet
[Figure: Preprint; Repo2 (AAM); Repo1 (AAM); Repo3 (AAM). PID labels shown: DOI?, DOI?, DOI?]

The problem: use of DOIs in the repositories’ context
• DOIs:
1. Typically do not resolve to the output in the repository: Instead, resolve to the
VoR on the publisher’s website.
2. While it is a good practice to link the repository output to its VoR via DOI,
DOIs do not identify repository outputs.
3. A significant proportion of repository stored outputs do not have a DOI (and
that’s OK).
4. The centralised system for registering DOIs is not suited to the needs of the
distributed repository network. It is not clear when to assign a new DOI at the
repository level, because different versions of the same output exist.
5. The role of repositories is degraded to driving traffic to publisher websites, as
resolving DOIs takes a user outside of the repository network.

The solution
1. One DOI to identify a canonical version of a research work (assigned by the
publisher).
2. Another PID to identify the object deposited in the repository, unique for every
deposited version.
3. An explicit link between these two.
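
The three points above can be sketched as a small data model. This is only an illustration, not part of the talk: the class names, the `link` method and the identifiers used with it are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DepositedVersion:
    """One object deposited in one repository, identified by its own OAI."""
    oai: str       # PID of the repository copy
    version: str   # e.g. "preprint" or "AAM"


@dataclass
class ResearchWork:
    """A research work with an optional publisher DOI and linked deposits."""
    doi: Optional[str] = None  # None while the work is not published yet
    deposits: List[DepositedVersion] = field(default_factory=list)

    def link(self, oai: str, version: str) -> None:
        """Record an explicit link between this work and a deposited version."""
        if any(d.oai == oai for d in self.deposits):
            raise ValueError(f"{oai} is already linked to this work")
        self.deposits.append(DepositedVersion(oai, version))


# Hypothetical identifiers, for illustration only.
work = ResearchWork(doi="10.1234/example")
work.link("oai:repo1.example.org:101", "AAM")
work.link("oai:repo2.example.org:77", "AAM")
```

Each deposit keeps its own PID, so the same work can be held in several repositories without minting a second DOI, and a work with no DOI yet can still be linked later by setting `doi`.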

OAI identifier: The missing PID for repository outputs actually already
exists … 
• Created by the Open Archives Initiative,
used as part of OAI-PMH
• Already universally adopted across
repositories
• Decentralised
• Persistent: in repositories declaring
“persistent” support for deleted records
(deletedRecord) in OAI-PMH
• Free to mint

Adoption of OAIs
OAI identifiers need more community
recognition and awareness!

CORE
• Aware of deposits from
across the global repositories
network and their OAIs
• Ideally positioned to deliver
an OAI resolver service
• Over 200 million metadata records
• Nearly 25 million free to read full texts
• Over 12,000 data providers in 147 countries

Example OAI identifiers
• Example OAIs:
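
For orientation, the OAI identifier format has three colon-separated parts: the `oai` scheme, a namespace identifier (normally the repository's domain name) and a local identifier. The sketch below splits an identifier into those parts and builds a resolver link. The identifiers and the resolver base URL are hypothetical placeholders, not the examples from the slide and not the address of the CORE resolver.

```python
from typing import NamedTuple


class OAIIdentifier(NamedTuple):
    namespace: str   # normally the repository's domain name
    local_id: str    # identifier of the record within that repository


def parse_oai(identifier: str) -> OAIIdentifier:
    """Split 'oai:<namespace>:<local-id>' into its parts."""
    scheme, _, rest = identifier.partition(":")
    namespace, _, local_id = rest.partition(":")
    if scheme != "oai" or not namespace or not local_id:
        raise ValueError(f"not an OAI identifier: {identifier!r}")
    return OAIIdentifier(namespace, local_id)


def resolver_url(identifier: str, base: str = "https://resolver.example.org/") -> str:
    """Build a link to a resolver; the default base URL is a placeholder."""
    parse_oai(identifier)  # reject malformed identifiers early
    return base + identifier


parsed = parse_oai("oai:repo.example.org:12345")
url = resolver_url("oai:repo.example.org:12345")
```

Only the first two colons act as separators, so a local identifier that itself contains colons is kept whole.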

Activating OAI resolver mapping for your repository in CORE

Recommendations for repositories
1. Use OAI identifiers as PIDs for your repository.
2. Beware of the risks of potentially registering duplicate DOIs. Instead, link to DOIs.
3. Register for the CORE Repository Dashboard (free) and make sure you are well
harvested.
4. If you wish OAIs to resolve to your repository, register your mapping in the CORE
Repository Dashboard.

Contributions
1. OAI identifiers are cost-free decentralised persistent identifiers for repositories.
2. OAI identifiers deserve more community recognition.
3. Delivered an OAI resolver as an important piece of open infrastructure for OAIs.
