OAI identifiers provide a decentralized and free solution for assigning persistent identifiers (PIDs) to research outputs stored in repositories. While DOIs identify canonical versions, OAIs uniquely identify each deposited version in a repository. The CORE repository has over 200 million metadata records and 12,000 data providers, positioning it well to provide an OAI resolver service. Repositories are recommended to use OAIs as PIDs, link to rather than duplicate DOIs, and register with the CORE resolver to activate OAI resolution for deposited outputs. OAIs deserve more recognition as cost-free PIDs supporting the distributed global repository network.
OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
1. OAI Identifiers: Decentralised PIDs for Research Outputs in Repositories
Petr Knoth, Valerii Budko, Viktoriia Pavlenko, Matteo Cancellieri
CORE, Knowledge Media Institute, The Open University
Open Repositories 2022, Denver, USA
Acknowledgements: Herbert van de Sompel, Martin Klein
2. Outline
1. Need for globally unique Persistent Identifiers (PIDs) identifying scholarly entities.
2. The problem: Why are DOIs not sufficient as PIDs for repository-based research outputs?
3. OAIs as distributed and free-to-mint PIDs for repositories
4. Introduce the new OAI resolver
5. The benefits of linking DOIs with OAIs
3. Scholarly PIDs
• Strong push for Persistent Identifiers (PIDs) to identify different entities in the scholarly knowledge graph
• Examples:
• DOIs to identify VoRs
• ORCID to identify authors
• Ringgold IDs to identify organisations
• …
4. Drivers for PIDs adoption in repositories
• Open Access Policies: Plan S, REF2021 OA Policy, UKRI OA Policy …
• Policies typically ask for: “all repository outputs to have PIDs”
• Deposits of papers into repositories (multiple authors can / should deposit)
• Repositories often mint DOIs in response to this requirement => duplicates
• Analytics, bibliometrics, scholarly knowledge graph
5. Important background about DOIs (1/2): Resolvable to its location
“A DOI name is permanently assigned to an object to provide a resolvable persistent network link to current information about that object, including where the object, or information about it, can be found on the Internet.” [DOI Handbook]
The current practice in scholarly communication is for the DOI to mainly resolve to the
VoR on the publisher’s site.
6. Important background about DOIs (2/2): Uniqueness
“Uniqueness (specification by a DOI name of one and only one referent) is enforced by
the DOI system. It is desirable that two DOI names should not be assigned to the same
thing.” [DOI Handbook]
7. Illustrative scenarios
Multi-author, multi-institution output: more than one AAM in more than one repository => more than one PID (each location is different)
[Diagram: Preprint; AAMs in Repo1, Repo2 and Repo3; VoR in the Journal]
8. Illustrative scenarios: same DOI everywhere
[Diagram: Preprint; AAMs in Repo1, Repo2 and Repo3; VoR in the Journal; the same label "DOI 1" is attached to multiple versions]
10. Illustrative scenarios: resource not published yet => DOI not assigned yet
[Diagram: Preprint; AAMs in Repo1, Repo2 and Repo3; no VoR yet; the label "DOI?" appears three times]
11. The problem: use of DOIs in the repositories’ context
• DOIs:
1. Typically do not resolve to the output in the repository; instead, they resolve to the VoR on the publisher’s website.
2. While it is good practice to link the repository output to its VoR via a DOI, DOIs do not identify repository outputs.
3. A significant proportion of repository-stored outputs do not have a DOI (and that’s OK).
4. The centralised system for registering DOIs is not suited to the needs of the distributed repositories’ network. It is not clear when to assign a new DOI at the repository level, given the different versions of the same output.
5. The role of repositories is degraded to driving traffic to publisher websites, as resolving DOIs takes a user outside of the repositories’ network.
12. The solution
1. One DOI to identify a canonical version of a research work (assigned by the publisher).
2. Another PID to identify the object deposited in the repository, unique for every deposited version.
3. An explicit link between these two.
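The three points above can be pictured as a small record structure. This is only a minimal sketch: the `DepositedVersion` class, its field names, the `versions_of` helper and all identifier values are hypothetical placeholders, not something defined in the slides or by CORE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DepositedVersion:
    """One deposited object in one repository (hypothetical record shape)."""
    oai: str                   # PID of this deposit, unique per deposited version
    version: str               # e.g. "preprint" or "AAM"
    doi: Optional[str] = None  # explicit link to the canonical version, if any


# Placeholder values for illustration only.
deposits = [
    DepositedVersion("oai:repo1.example.org:101", "AAM", doi="10.1234/example"),
    DepositedVersion("oai:repo2.example.org:202", "AAM", doi="10.1234/example"),
    DepositedVersion("oai:repo3.example.org:303", "preprint"),  # no DOI yet
]


def versions_of(doi: str, records: List[DepositedVersion]) -> List[str]:
    """Return the OAIs of every deposit explicitly linked to the given DOI."""
    return [r.oai for r in records if r.doi == doi]
```

In this sketch the DOI is shared by every copy of the same work, while each deposit keeps its own OAI, so a deposit without a DOI is still identified.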
13. OAI identifier: The missing PID for repository outputs actually already exists …
• Created by the Open Archives Initiative, used as part of OAI-PMH
• Already universally adopted across repositories
• Decentralised
• Persistent: in repositories declaring persistent support for deletedRecords in OAI-PMH
• Free to mint
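As a rough illustration of two of the properties listed above, the sketch below splits an identifier of the form `oai:<namespace>:<local-id>` into its parts and checks whether an OAI-PMH `Identify` response declares `deletedRecord` as `persistent`. The function names, the sample identifier and the sample XML are assumptions made up for this example; only the identifier layout and the `deletedRecord` element come from the OAI specifications.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

OAI_PMH_NS = "{http://www.openarchives.org/OAI/2.0/}"


class OAIIdentifier(NamedTuple):
    namespace: str  # repository part, typically a domain name
    local_id: str   # identifier of the record within that repository


def parse_oai(identifier: str) -> OAIIdentifier:
    """Split 'oai:<namespace>:<local-id>' into namespace and local id."""
    parts = identifier.split(":", 2)  # the local id may itself contain ':'
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an OAI identifier: {identifier!r}")
    return OAIIdentifier(parts[1], parts[2])


def declares_persistent_deletes(identify_xml: str) -> bool:
    """True if the Identify response states <deletedRecord>persistent</deletedRecord>."""
    root = ET.fromstring(identify_xml)
    node = root.find(f"{OAI_PMH_NS}Identify/{OAI_PMH_NS}deletedRecord")
    return node is not None and (node.text or "").strip() == "persistent"


# Made-up, trimmed Identify response for illustration only.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <deletedRecord>persistent</deletedRecord>
  </Identify>
</OAI-PMH>"""
```

A real check would fetch the `Identify` response from the repository's OAI-PMH endpoint; the sample string stands in for that here.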
15. CORE
• Aware of deposits from across the global repositories network and their OAIs
• Ideally positioned to deliver an OAI resolver service
• Over 200 million metadata records
• Nearly 25 million free to read full texts
• Over 12,000 data providers in 147 countries
21. Recommendations for repositories
1. Use OAI identifiers as PIDs for your repository.
2. Beware of the risks of potentially registering duplicate DOIs. Instead, link to DOIs.
3. Register for the CORE Repository Dashboard (free) and make sure you are well harvested.
4. If you wish OAIs to resolve to your repository, register your mapping in the CORE Repository Dashboard.
22. Contributions
1. OAI identifiers are cost-free decentralised persistent identifiers for repositories.
2. OAI identifiers deserve more community recognition.
3. Delivered an OAI resolver as an important piece of open infrastructure for OAIs.
Persistent identifiers are a critical component of, and a precondition for, an open research scholarly graph.
If you are bored by my previous talks then don’t be and give me one more chance as I believe that this talk is rather significant …
[Introduce myself briefly]
I am from CORE at the OU
[Acknowledgements]
We then offer a solution that builds on the existing repositories infrastructure created by the Open Archives Initiative. We present OAI Identifiers as viable PIDs for repositories that, as opposed to DOIs, can be 1) minted in a distributed fashion and cost-free, and 2) resolved directly to the repository rather than to the publisher. We argue that this is the right approach and that it has the potential to increase the importance of repositories in the process of disseminating knowledge. We then present the first global OAI Resolver, built on top of the CORE research outputs aggregation system.
Those who create these policies often have very little technical understanding.
Although it is permissible for a DOI to resolve to one or multiple objects, the current practice in scholarly communication is for the DOI to mainly resolve to the VoR on the publisher’s site, but not to other versions of the object that can exist anywhere within the repositories network.
Policy makers are unaware of the existence of OAI identifiers.
The DOI resolver has been a particularly successful tool in driving up the adoption of DOIs.
CORE can resolve any OAI identifier to a metadata page of the record in CORE, but routing to the repository page requires slightly more information. This is because the mapping between the OAI prefix of a repository and the URL currently used for the repository's metadata record display page (splash page) is not consistent across repository systems.
We have solved this problem as follows. CORE allows any repository data provider to register for a CORE Repository Dashboard account. Within the CORE Repository Dashboard, repository managers can decide if they want their OAIs to be resolved to the repository and if so, they define a mapping between the repository prefix and its URL for the display pages.
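The routing described above can be sketched as a prefix lookup with a fallback. This is an assumption-laden illustration, not CORE's implementation: the `resolve` function, the mapping table, the URL templates and the fallback address are all hypothetical.

```python
from typing import Dict

# Hypothetical mappings a repository manager might register:
# OAI prefix -> URL template for the repository's record display page.
REGISTERED_MAPPINGS: Dict[str, str] = {
    "oai:repo1.example.org": "https://repo1.example.org/id/eprint/{local_id}",
}

# Hypothetical fallback: the aggregator's own metadata page for the record.
FALLBACK_TEMPLATE = "https://aggregator.example.org/oai/{identifier}"


def resolve(identifier: str, mappings: Dict[str, str] = REGISTERED_MAPPINGS) -> str:
    """Return the repository page if a mapping is registered, else the fallback page."""
    parts = identifier.split(":", 2)  # the local id may itself contain ':'
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an OAI identifier: {identifier!r}")
    prefix = f"oai:{parts[1]}"
    template = mappings.get(prefix)
    if template is None:
        # No mapping registered: stay on the aggregator's metadata page.
        return FALLBACK_TEMPLATE.format(identifier=identifier)
    return template.format(local_id=parts[2])
```

Keeping the mapping per prefix reflects the point made in the notes: the identifier alone does not say how a given repository system builds its display-page URLs, so that knowledge has to be supplied by the repository. A production resolver would also need to URL-encode the local identifier, which this sketch omits.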