Ontology Development Kit: Bio-Ontologies 2019
1. MANAGING ONTOLOGIES WITH THE ONTOLOGY DEVELOPMENT KIT (ODK)
CHRIS MUNGALL
CJMUNGALL@LBL.GOV
LAWRENCE BERKELEY NATIONAL LABORATORY
BIO-ONTOLOGIES COSI, ISMB/ECCB 2019
@OBOFoundry
@chrismungall
2. CONTINUED GROWTH OF ONTOLOGY PROJECTS
• OBO Ontologies registered:
• 2015: 104 ontologies
• 2019: 215 ontologies
• Observation
• Ontology development in 2019 == Bioinformatics/Perl coding circa 1999
• State of development of many ontologies:
• Lack of modularity or reuse
• Lack of testing frameworks or continuous integration
• Little or no leveraging of reasoning
• Version control practice poor or non-existent
• Ontologies frequently contain errors:
• Semantic; e.g. duplicate definitions
• Structural
• Syntactic
• Lexical
[Figure: number of ontologies in BioPortal. Source: Amina Annane and Clement Jonquet]
3. [Figure: timeline of software engineering practices, 1970s to 2010s]
• 1965-85 “Software Crisis”
• 1972 Source Code Control System (SCCS)
• 1974 Liskov “strong typing”
• 1975 Modula
• 1986 CVS
• 1987 Pattern Languages
• 1996 eXtreme Programming (XP), Test Driven Development (TDD)
• 1999 SourceForge
• 2000 SVN
• 2001 CruiseControl
• 2005 Git
• 2008 GitHub
• 2011 Travis
4. ONTOLOGIES AND OBO: OPEN BIO-ONTOLOGIES
• 1950s-60s Semantic Networks
• 1960s-1980s Knowledge Representation
• 1980s-1990s Description Logics, BioCyc, Medical Knowledge Bases,
DAML+OIL
• 1998 Gene Ontology Created
• 2002 Open Bio-Ontologies (OBO) formed, OBO principles
• 2003 OBO Format
• 2003 Rector paper on normalization and modularity
• 2003 OWL – Web Ontology Language
• 2004 Relation Ontology
• 2007 OBO PURLs (Permanent URLs)
• 2011-Present OBO Operations Volunteers http://obofoundry.org
OBO principles:
• Open (CC BY or CC0)
• Standard syntax and semantics (OWL)
• Standard PURLs for classes
• Versioning
• Well-defined scope
• Classes should be defined
• Use of standard relations (RO/BFO)
• Documentation
• Documented Plurality of Users
• Commitment To Collaboration
• Locus of Authority
• Naming Conventions
• Maintenance
5. ODK: ONTOLOGY DEVELOPMENT KIT
[Diagram: the ODK container and its components; diagram labels: kernel, container]
• ROBOT: ontology operations (command line)
• Make: workflows; chains together operations
• odk.py: seed an ontology project; create a GitHub repository with workflows in place
• dosdp-tools: build ontologies rapidly from Design Pattern templates
• Reasoners: includes Elk, HermiT, Konclude
• fastobo: validation of OBO format files (Rust)
• Complements ODEs (Protégé)
6. ODK GETTING STARTED: SEEDING AN ONTOLOGY PROJECT
• Organizing a project in GitHub – not trivial
• ODK provides a seed utility to start a new
project
• Sets you up with a GitHub structure
• Can be seeded from a YAML project
specification OR command line options
[Diagram: project.yaml → seed → generated repository layout:]
- README.md
- CONTRIBUTING.md
- LICENSE.md
- Changes.md
- .travis.yml
- .github
- ISSUE_TEMPLATE
- new_term.md
- obsolete_request.md
- src
- ontology
- myont-edit.owl
- Makefile
- myont-idspaces.owl
- template
- mytemplate1.yaml
- myont.obo
- myont.owl
[Diagram label: the generated files come from hand-authored Jinja2 templates]
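A seeded repository can be sanity-checked against the layout listed on this slide. The following is a minimal sketch and not part of ODK: the function name `missing_seed_files` and the chosen subset of paths are assumptions made for illustration. Only the file names come from the listing, and their placement under `src/ontology` is inferred from it and may differ between ODK versions.

```python
from pathlib import Path

# Subset of the files named in the seed listing. The nesting under
# src/ontology is inferred from the slide, not taken from ODK itself.
EXPECTED = [
    "README.md",
    "CONTRIBUTING.md",
    ".travis.yml",
    "src/ontology/Makefile",
    "src/ontology/{ont}-edit.owl",
]


def missing_seed_files(repo_root, ont_id):
    """Return the expected relative paths that are absent from repo_root."""
    root = Path(repo_root)
    wanted = [p.format(ont=ont_id) for p in EXPECTED]
    return [p for p in wanted if not (root / p).is_file()]
```

Calling `missing_seed_files(".", "myont")` from the repository root returns an empty list when every path in the (hypothetical) checklist is present.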
7. ONTOLOGY WORKFLOWS: MAKE AND ROBOT
• ROBOT: ROBOT is an OBO Tool
• Commands can be chained together
• ODK will seed your repo with a
Makefile based workflow
• Workflow: edit → release
• ROBOT commands:
  • annotate: add metadata assertions onto ontology
  • reason: use reasoner to detect incoherency and assert inferred links
  • diff: compare two ontologies
  • template: generate portions of ontology from templates and tabular data
  • report: complete QA/QC report
  • extract: extract submodules for imports
  • convert: convert between OWL syntaxes, OBO format, OBO-JSON
  • …more
http://robot.obolibrary.org/
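Chaining means that one `robot` invocation can run several commands in sequence, each operating on the ontology produced by the previous one. The sketch below only builds such a command line and does not run it. The function name, the file names, and the ontology IRI are hypothetical; the options used (`--input`, `--reasoner`, `--ontology-iri`, `--output`) are assumed to behave as described in the ROBOT documentation.

```python
import shlex


def robot_chain(edit_file, release_file, ontology_iri, reasoner="ELK"):
    """Build the argv for a chained ROBOT call: reason, then annotate.

    The ontology flows from `reason` into `annotate`, so only the first
    command takes --input and only the last takes --output.
    """
    return [
        "robot",
        "reason", "--input", edit_file, "--reasoner", reasoner,
        "annotate", "--ontology-iri", ontology_iri, "--output", release_file,
    ]


argv = robot_chain(
    "myont-edit.owl",
    "myont.owl",
    "http://purl.obolibrary.org/obo/myont.owl",  # hypothetical IRI
)
command_line = shlex.join(argv)  # a string suitable for a Makefile recipe
```

In an ODK-style repository a line like `command_line` would normally live in a Makefile target rather than in Python; the list form is shown here because it can be passed to `subprocess.run` without shell quoting issues.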
8. MODULAR ONTOLOGY DEVELOPMENT: EXTRACT
• Don’t develop monoliths!
• Reuse existing ontologies
• OBO was constructed in part to facilitate
ontology reuse
• OWLAPI provides algorithms for extracting
‘modules’ (SLME)
• Also: MIREOT
• ROBOT provides an easy wrapper for these
[Diagram: terms used in myont.owl are extracted from chebi.owl into chebi_import.owl, which myont.owl then brings in via owl:import]
robot extract --input chebi.owl --term-file myont.terms --output chebi_import.owl
https://douroucouli.wordpress.com/2019/06/29/ontotip-learn-the-rector-normalization-technique/
Rector (2003). Modularisation of Domain Ontologies Implemented in Description Logics and related formalisms including OWL. Proceedings of the 2nd International Conference on Knowledge Capture.
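The term file handed to `robot extract` is a plain list of the external terms the ontology refers to. One way to produce it, sketched here under assumptions, is to scan the edit file for identifiers with the upstream prefix. The function name `terms_for_import` is hypothetical, the regular expression is a simplification that will not cover every serialization, and the one-CURIE-per-line format is assumed from the ROBOT documentation.

```python
import re


def terms_for_import(text, prefix):
    """Collect distinct IDs such as CHEBI:15377 or CHEBI_15377 from text.

    Returns sorted CURIEs, one per intended line of the term file.
    """
    pattern = re.compile(r"\b" + re.escape(prefix) + r"[:_](\d+)\b")
    local_ids = {match.group(1) for match in pattern.finditer(text)}
    return [f"{prefix}:{local_id}" for local_id in sorted(local_ids)]


# Hypothetical fragment of an edit file mentioning two ChEBI classes.
sample = "SubClassOf(obo:CHEBI_15377 owl:Thing) CHEBI:16236 CHEBI:15377"
term_lines = terms_for_import(sample, "CHEBI")
```

Writing `"\n".join(term_lines)` to `myont.terms` would give a file of the kind used in the extract command on this slide.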
9. REASONING
• Why use reasoning?
• Semantic Validation of ontology
• e.g. disjoints, domain/range
• unintended equivalence
• Automatic classification, modular
development
• ROBOT provides a simple wrapper around standard OWL reasoners
• ODK Docker container includes reasoners that are awkward to install (e.g. Konclude)
https://douroucouli.wordpress.com/2018/08/03/debugging-ontologies-using-owl-reasoning-part-1-basics-and-disjoint-classes-axioms/
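To make the disjointness check concrete, here is a toy illustration. It is not an OWL reasoner and does not reflect how Elk, HermiT, or Konclude work: it handles only named subclass links and pairwise disjointness, and the function and class names are hypothetical. A class is flagged when its ancestors (including itself) contain both members of a disjoint pair, which is the kind of incoherency a reasoner reports as an unsatisfiable class.

```python
def unsatisfiable(subclass_of, disjoint_pairs):
    """Return classes that inherit from both sides of a disjointness axiom.

    subclass_of: dict mapping a class to the set of its direct superclasses.
    disjoint_pairs: iterable of (class_a, class_b) declared disjoint.
    """
    def ancestors(cls):
        # Reflexive, transitive closure over the subclass links.
        seen, stack = {cls}, [cls]
        while stack:
            for parent in subclass_of.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    bad = set()
    for cls in subclass_of:
        closure = ancestors(cls)
        if any(a in closure and b in closure for a, b in disjoint_pairs):
            bad.add(cls)
    return bad


# Hypothetical hierarchy: C is asserted under both A and B, which are disjoint.
hierarchy = {"C": {"A", "B"}, "D": {"A"}, "A": set(), "B": set()}
incoherent = unsatisfiable(hierarchy, [("A", "B")])
```

Here only `C` is flagged; `D` sits under `A` alone and stays coherent.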
10. OBO CONVENTIONS: ROBOT REPORT
• Reason command provides semantic validation
• ROBOT report validates against a checklist of criteria
• Implemented via SPARQL (and soon ShEx)
• Ensure classes have labels and textual definitions (cardinality 1)
• No two classes should share the same text definition
• Labels and exact synonyms should not clash
• …many more
• Many criteria are ‘OBO-esque’
• Can be configured
• Criteria can potentially be expanded
Level  Rule Name                     Subject                             Property                  Value
ERROR  duplicate_definition          head-mantle fusion [CEPH:0000129]   definition [IAO:0000115]  .
ERROR  duplicate_definition          tentacle thickness [CEPH:0000261]   definition [IAO:0000115]  .
ERROR  missing_label                 anatomical entity [UBERON:0001062]  label [rdfs:label]
ERROR  missing_ontology_description  ceph.owl                            dc11:description
ERROR  duplicate_label               leucophore [CEPH:0000284]           label [rdfs:label]        leucophore
ERROR  duplicate_label               leucophore [CEPH:0001077]           label [rdfs:label]        leucophore
ERROR  missing_ontology_license      ceph.owl                            dc:license
ERROR  missing_ontology_title        ceph.owl                            dc11:title
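The logic behind three of the rules in the table can be sketched in a few lines. This is an illustration only: ROBOT implements its checks as SPARQL queries over the ontology, whereas the sketch below works on a plain dictionary. The function name `check_classes` and the definition strings are hypothetical; the class IDs and the `leucophore` label are taken from the table.

```python
from collections import defaultdict


def check_classes(classes):
    """Return sorted (level, rule, class_id) rows for three report rules.

    classes: dict mapping class ID to {"label": ..., "definition": ...};
    a missing or empty value counts as absent.
    """
    rows = []
    by_label, by_definition = defaultdict(list), defaultdict(list)
    for class_id, annotations in classes.items():
        label = annotations.get("label")
        definition = annotations.get("definition")
        if label:
            by_label[label].append(class_id)
        else:
            rows.append(("ERROR", "missing_label", class_id))
        if definition:
            by_definition[definition].append(class_id)
    # Any label or definition shared by more than one class is a clash.
    for rule, index in (("duplicate_label", by_label),
                        ("duplicate_definition", by_definition)):
        for class_ids in index.values():
            if len(class_ids) > 1:
                rows.extend(("ERROR", rule, class_id) for class_id in class_ids)
    return sorted(rows)


violations = check_classes({
    "CEPH:0000284": {"label": "leucophore", "definition": "placeholder text A"},
    "CEPH:0001077": {"label": "leucophore", "definition": "placeholder text B"},
    "UBERON:0001062": {"label": None, "definition": "placeholder text C"},
})
```

With this input, `violations` holds two `duplicate_label` rows for the CEPH classes and one `missing_label` row for `UBERON:0001062`, mirroring the corresponding rows of the table.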
11. TEMPLATE-DRIVEN ONTOLOGY DEVELOPMENT
• Dead Simple OWL Design Patterns
(DOSDPs)
• ROBOT Templates
• Allow encoding of common ontology
patterns in structured form
• Ontology Documentation and
Validation
• Generation (“Compilation”) of
ontology portions from TSVs
[Diagram: TSV/Excel + Design Pattern Template → dosdp-tools → OWL]
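The idea of “compiling” ontology content from a table can be illustrated with a small sketch. It does not use the DOSDP or ROBOT template syntax and produces plain records rather than OWL: the pattern is a pair of Python format strings, and the function name `expand_pattern`, the column name `part`, and the definition wording are all assumptions for illustration. The `defined_class` column and the example term (`tentacle thickness`, CEPH:0000261) follow names that appear in this deck and in common dosdp-tools usage.

```python
import csv
import io


def expand_pattern(pattern, tsv_text):
    """Fill a label/definition pattern once per row of a TSV table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        records.append({
            "id": row["defined_class"],
            "label": pattern["name"].format(**row),
            "definition": pattern["def"].format(**row),
        })
    return records


# Hypothetical pattern: a thickness quality of some anatomical part.
thickness_pattern = {
    "name": "{part} thickness",
    "def": "The thickness of a {part}.",
}
table = "defined_class\tpart\nCEPH:0000261\ttentacle\n"
generated = expand_pattern(thickness_pattern, table)
```

Because every row goes through the same pattern, labels and definitions stay uniform, and the pattern itself doubles as documentation of the modelling decision.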
15. ONTOBOT
• Agent (bot) for operating on ontologies
• Will make Pull Requests on your ontologies
• E.g.
• On change of upstream ontology (e.g. chebi)
• Rebuild imports (robot extract)
• Create a semantic diff
• Make a PR
• Relies on ODK-compliant GitHub structure
• Currently only running for GO
• Future plans:
• Command OntoBot via GitHub tickets
16. ODK AND OBO
• ODK is in principle generic
• But we encourage OBO conventions and principles by default
• Identifiers
• Versioning
• License
• Definitions
• …
• Historically OBO principles have been “thrown over the wall”; we haven’t done a good job of helping people implement them
• OntoTips: http://bit.ly/ontotips
• Definitions: https://douroucouli.wordpress.com/2019/07/08/ontotip-write-simple-concise-clear-operational-textual-definitions/
17. SUMMARY/CONCLUSIONS
• ODK is recommended for
• Starting a new ontology project
• Retrofitting into an existing project
• Frees ontology developer from multiple technical
tasks
• Makes it easier to follow OBO principles
• Flexible and we are open to modifications
• Many benefits
• Conventions come with benefits
• No need to roll your own
https://github.com/INCATools/ontology-development-kit
https://hub.docker.com/r/obolibrary/odkfull/
Coming Soon
• Protégé support
• OntoBot improvements
18. ACKNOWLEDGMENTS
Developers and Contributors
• Nico Matentzoglu
• David Osumi-Sutherland
• Eric Douglass
• Seth Carbon
• Jim Balhoff
• Bjoern Peters
• Matt Horridge
• Rebecca Jackson (Tauber)
• James Overton
Testers
• Simon Jupp
• Erik Segerdell
• Sebastian Koehler
• James Seager
• Leigh Carmody
• Sofia Robb
• Chris Grove
• Raymond Lee
• Alliance of Genome Resource curators
• Citlalli Mejía Almonte
NIH U01HG009453 INCA
NIH R24HG010032 OBO
Doubling of ontologies in OBO
Growing crisis of hard to maintain ontologies
Annane, A.: Enhancing Ontology Matching with Background Knowledge Resources - Application to the Biomedical Domain, (2018).
Ontology Development should be like software engineering!
https://en.wikiversity.org/wiki/Software_testing/History_of_testing
Take home: OBO provided principles but not a lot of help in how to implement them
Bring ancillary tooling support to ontology development. This doesn’t replace Protégé, which is like an IDE.