Building a
Biomedical
Knowledge Garden
Benjamin Good
Su Laboratory, Group Meeting
Dec. 2, 2016
Unstructured data
PubMed
Clinical Trials
Etc.
NLP tools
SemRep
DeepDive
Implicitome
etc.
Knowledge Graph
SemmedDB
Literome
etc.
Applications
Semantic MEDLINE
BioGraph
etc.
Microtasks
Mark2Cure
AMT
Structured data
Gene Ontology etc.
http://tinyurl.com/jbmn8mz
The Knowledge Garden Idea.
Circa Jan. 2015.
The devil is in the details…
Reality November 2016
Knowledge Graph
SemmedDB
Application
knowledge.bio
Microtasks
Mark2Cure
AMT
knowledge.bio
Explore all biomedical knowledge as a graph with edges
connected back to supporting references
v2.5 demo
knowledge.bio – Data challenges
• V1 – V2.5
• All content from SemmedDB or Implicitome
• custom schema to support these.
• V3 key requirement:
• allow import of content from many other sources:
Gene Ontology, DeepDive output, user-generated…
This part is important…
Not nailing it down makes everything else harder
Knowledge Garden
content managed as:
csv files
json documents
mysql databases
PostgreSQL databases
neo4j databases
None of which had any
coherent plan or
structure
Requirements for a knowledge graph
• Syntax:
• How to refer to nodes and edges
• identifiers
• schema (structure of graph)
• Semantics:
• What things mean
• How you decide on the ‘?’:
• node1 ‘?’ node2
• are they the same (to you?)
• if not, what is the edge? Mind the Gap…
(one node in the “Amino Acid” namespace,
the other in the “Biologically Active Substance” namespace)
Options at kb3 scale
(millions of concepts and relations)
• The Unified Medical Language System (UMLS)
• The Semantic Web
• Wikidata ?
The UMLS
(CUIs, Atoms, Types)
CUI: C0026106
Atoms (each equivalent to the CUI):
HP:0001256 (Mild mental retardation, Mild and nonprogressive mental retardation)
SNOMEDCT_US:86765009 (Moron (mental age 8-12 years))
MEDCIN:35101 (Mild intellectual disabilities)
OMIM:MTHU035844 (Intellectual disability, mild)
https://uts.nlm.nih.gov
C0233630
SNOMEDCT_US:32386009
Logical Thinking
Mental or Behavioral Dysfunction
Disease or Syndrome
isa
isa
Types
Behavior
Activity
affects
isa
Event
isa
isa
affects ?
Types organized into a
“Semantic Network”
~ 133 types, 54 predicates
13 high level ‘groups’
CUI
The UMLS in 2016
• 3,200,922 CUIs
• 211 source vocabularies (e.g. MeSH, SNOMED, RxNorm, etc.)
• 12,287,973 total terms (“ATOMS”)
• Every edge in the system is a manual product of NLM
• every Atom->CUI
• every CUI->Type
• every Type->Type
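The three layers above (Atom->CUI, CUI->Type, Type->Type) can be sketched as a small data model. This is only an illustration, not how the UMLS is distributed or stored: the class names `Atom` and `Concept` and the helper `cuis_for_source_id` are hypothetical, and the semantic type attached to C0026106 is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Atom:
    """One term contributed by one source vocabulary (hypothetical model)."""
    source: str   # source vocabulary abbreviation, e.g. "HP"
    code: str     # identifier inside that vocabulary
    label: str    # the term string itself


@dataclass
class Concept:
    """A CUI together with its atoms and semantic types (hypothetical model)."""
    cui: str
    atoms: List[Atom] = field(default_factory=list)
    semantic_types: List[str] = field(default_factory=list)


def cuis_for_source_id(concepts, source, code):
    """Return the CUIs that have an atom with the given source identifier."""
    return [
        c.cui
        for c in concepts
        if any(a.source == source and a.code == code for a in c.atoms)
    ]


# Atoms taken from the C0026106 slide; the semantic type is assumed.
mild_id = Concept(
    cui="C0026106",
    atoms=[
        Atom("HP", "0001256", "Mild mental retardation"),
        Atom("SNOMEDCT_US", "86765009", "Moron (mental age 8-12 years)"),
        Atom("MEDCIN", "35101", "Mild intellectual disabilities"),
        Atom("OMIM", "MTHU035844", "Intellectual disability, mild"),
    ],
    semantic_types=["Mental or Behavioral Dysfunction"],  # assumption
)

print(cuis_for_source_id([mild_id], "OMIM", "MTHU035844"))
```

The point of the sketch is that every lookup from a source identifier to a CUI goes through an atom, which is why the curated Atom->CUI edges matter so much.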
The Semantic Web
• Concepts uniquely identified by
resolvable URIs
• Meaning (e.g. equivalency)
encoded in OWL axioms
• Concepts and mappings
created and maintained by
anyone who can host them
• No other structure
• No governance
UMLS versus Semantic Web
• UMLS
• PROs: covers large portion of biomedical concept space, manually curated,
we are already using it by default, the semantic types are handy
• CONs: does not exist on the semantic web - no stable URI to associate with a
CUI, license is obscure and apparently limiting, weak representation of
molecular biology domain, no control over its extension (e.g. no Human
Disease Ontology)
• Semantic Web
• PROs: universal, open, infrastructure is the Web itself
• CONs: need for organization, curation, mapping
Not thrilled with my options
https://commons.wikimedia.org/wiki/File:A_frustrated_and_depressed_man_holds_his_head_in_his_hand.jpg
Meanwhile...
• Genes and proteins for human, mouse, rat, yeast,
macaque, and 120+ microbes
• Gene Ontology terms
• Human Disease Ontology terms
• 120,000+ chemicals
• Cancer genome variants
• Other people adding and using
data!!!
Maybe ?
Wikidata
(QIDs, ids, Types)
QID: Q183560
external ids:
HP:0001256 (Mild mental retardation, Mild and nonprogressive mental retardation)
SNOMEDCT_US:86765009 (Moron (mental age 8-12 years))
MEDCIN:35101 (Mild intellectual disabilities)
OMIM:MTHU035844 (Intellectual disability, mild)
https://www.wikidata.org/wiki/Q412194
Q412194
PubChem: 2477
buspirone
Specific Developmental Disorder
developmental disorder of mental health
subclass of
subclass of
treated by
Poly-Ontology
Drug
QID
Chemical
isa
mental disorder
disorder
subclass of
subclass of
(DO)
ids
ACTIVE! Knowledge Flow for Wikidata
Unstructured data
The Internet
NLP tools
StrepHit
Knowledge Graph Applications
Wikipedia
Wikigenomes
Wikidata.org
Microtasks
Wikidata game
MixnMatch
Structured data
Gene Ontology etc.
Wikidata is a Functioning and Flourishing
Knowledge Garden
Wikidata
• ~27,000,000 concepts identified by Qids like ‘Q183560’
• ~1350 source vocabularies (e.g. MeSH, RxNorm, IMDb, etc.)
• (Based on properties tagged with type ‘ExternalId’)
• ? total terms integrated = labels + aliases (a lot)
• Mappings to Qids product of the unwashed masses
• Constantly updated
What concept scheme do we use ?
• Wikidata
• PROs: universal, open, infrastructure,
active community, largely curated content
• CONs: limited biomedical content so far
?
Challenge: Relevant Scientific Applications
NLP tools
SemRep
Literome
Implicitome
PubTator
DeepDive
Snorkel
ContentMine
TEES
….
Knowledge Graph
Applications
Wikigenomes
HetioNet
Knowledge.Bio
…
Structured data
Gene Expression etc,
…
A. Advancing science is
the goal and this is
how we can help
B. We need experts to
help refine and build
the knowledge graph
and apps are the bait
On the plane
Oct. 11, 2016…
“Screw it, let’s go all in”
I got really excited..
https://www.flickr.com/photos/alexnormand/5992512756
https://www.flickr.com/photos/k6lcs/15374887957
knowledge.bio 3.0
• All nodes to be concepts from Wikidata
• All predicates to be properties from Wikidata
• All edges to be linked to references that could be ‘stated in’ Wikidata
• Edges (‘claims’) can come from any source
• Now
• We have one consistent format for data import
• We have a consistent pattern for gathering more data about a concept
• We have access to 27 million concepts and growing (and we can add more)
• We have the beginnings of a new tool for expert-sourcing curation of Wikidata content
• Our code is getting simpler and cleaner
KB3.0 – next step seeding content
• You are now basically up to date…
• Rest of talk is about mapping content from SemmedDB to the new
structure
• 3.0 release will allow users to add new nodes and edges
• If you want data in there:
1. map it to Wikidata items and properties
2. make a tab-delimited file (Qid Pid Qid referenceUrl sentence)
3. load it (or ask me to)
• Users needed!
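A minimal sketch of step 2, assuming the five columns are plain tab-separated text with no header row (the slide does not say whether a header is expected). The function names `format_claim` and `write_claims` are hypothetical, and the example row is a placeholder: the Pid, the reference URL and the sentence are illustrative values, not data from SemmedDB or kb3.0.

```python
import re

QID = re.compile(r"Q[1-9][0-9]*")   # Wikidata item id, e.g. Q412194
PID = re.compile(r"P[1-9][0-9]*")   # Wikidata property id


def format_claim(subject_qid, property_pid, object_qid, reference_url, sentence):
    """Return one tab-delimited line: Qid Pid Qid referenceUrl sentence."""
    if not QID.fullmatch(subject_qid) or not QID.fullmatch(object_qid):
        raise ValueError("subject and object must be Wikidata Qids")
    if not PID.fullmatch(property_pid):
        raise ValueError("predicate must be a Wikidata Pid")
    fields = [subject_qid, property_pid, object_qid, reference_url, sentence]
    # A tab or line break inside a field would corrupt the column layout.
    if any("\t" in f or "\n" in f or "\r" in f for f in fields):
        raise ValueError("fields must not contain tabs or line breaks")
    return "\t".join(fields)


def write_claims(path, claims):
    """Write an iterable of 5-tuples to a tab-delimited file, one claim per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for claim in claims:
            handle.write(format_claim(*claim) + "\n")


# Placeholder row: ids and text are for illustration only.
example = (
    "Q183560",
    "P2176",
    "Q412194",
    "https://example.org/reference/1",
    "Placeholder supporting sentence.",
)
print(format_claim(*example))
```

Validating the identifiers before writing catches the most likely mistake, which is leaving an unmapped CUI or label in a column that should hold a Qid or Pid.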
How many concepts in the UMLS are now
items in Wikidata?
[Venn diagram: Wikidata, 27,000,000; UMLS, 3,000,000; overlap, ?]
Direct identifier mapping (15 shared ontologies)
CUI -> Qid
UMLS_vocab | Concepts | Wikidata_property | Prop id | Usage
NCBI | 1014837 | NCBI Taxonomy ID | P685 | 379589
MSH | 359116 | MeSH ID | P486 | 5979
ICD10PCS | 178278 | ICD-10-PCS | P1690 | 5
NCI | 119620 | NCI Thesaurus ID | P1748 | 5562
ICD10CM | 98899 | ICD-10 | P494 | 8826
OMIM | 86181 | OMIM ID | P492 | 5835
FMA | 82042 | Foundational Model of Anatomy ID | P1402 | 3378
GO | 60412 | Gene Ontology ID | P686 | 43693
MDR | 51961 | Medical Dictionary for Regulatory Activities ID | P3201 | 1
HGNC | 39261 | HGNC gene symbol | P353 | 63691
HGNC | Sometimes... | HGNC-ID | P354 | 39758
NDFRT | 38206 | NDF-RT ID | P2115 | 1509
ICD9CM | 20993 | ICD-9-CM | P1692 | 88
ICD10 | 11552 | ICD-10 | P494 | 8826
RXNORM | 205998 | RxNorm CUI | P3345 | 5671
C0001629 (Adrenal Medulla)
-> local MySQL query -> FMA: 15633
-> build SPARQL -> ?qid wdt:P1402 "15633"
-> query.wikidata.org -> Q934888
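The "build SPARQL" step can be sketched as below. It only constructs the query text and a request URL for the public endpoint at query.wikidata.org; it does not send anything. The function names `build_lookup_query` and `build_request_url` are hypothetical and are not taken from the kb3.0 code.

```python
import re
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
PID = re.compile(r"P[1-9][0-9]*")


def build_lookup_query(prop_id, external_id):
    """Build a SPARQL query that finds items whose property equals external_id."""
    if not PID.fullmatch(prop_id):
        raise ValueError("prop_id must look like P1402")
    # Escape backslashes and double quotes so the id stays one string literal.
    escaped = external_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?qid WHERE {{ ?qid wdt:{prop_id} "{escaped}" . }}'


def build_request_url(query):
    """Return a GET URL for the Wikidata query service, asking for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# The Adrenal Medulla example from the slide: FMA 15633 via property P1402.
query = build_lookup_query("P1402", "15633")
print(query)
print(build_request_url(query))
```

Looping this over every (property, identifier) pair in the table gives the identifier-based CUI-to-Qid mapping; any CUI that comes back with more than one Qid needs a separate decision.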
Strict identifier mapping
CUI -> Qid
UMLS_vocab | Concepts | Wikidata_property | Prop id | Usage
NCBI | 1014837 | NCBI Taxonomy ID | P685 | 379589
MSH | 359116 | MeSH ID | P486 | 5979
ICD10PCS | 178278 | ICD-10-PCS | P1690 | 5
NCI | 119620 | NCI Thesaurus ID | P1748 | 5562
ICD10CM | 98899 | ICD-10 | P494 | 8826
OMIM | 86181 | OMIM ID | P492 | 5835
FMA | 82042 | Foundational Model of Anatomy ID | P1402 | 3378
GO | 60412 | Gene Ontology ID | P686 | 43693
MDR | 51961 | Medical Dictionary for Regulatory Activities ID | P3201 | 1
HGNC | 39261 | HGNC gene symbol | P353 | 63691
HGNC | Sometimes... | HGNC-ID | P354 | 39758
NDFRT | 38206 | NDF-RT ID | P2115 | 1509
ICD9CM | 20993 | ICD-9-CM | P1692 | 88
ICD10 | 11552 | ICD-10 | P494 | 8826->8292
RXNORM | 205998 | RxNorm CUI | P3345 | 0->5671
-> Thanks to Sebastian’s recent work..
How many concepts in the UMLS are now
items in Wikidata? (according to identifiers)
463,059
[Venn diagram: Wikidata, 27,000,000; UMLS, 3,000,000; overlap, 463,059 (15%)]
Wikidata items by UMLS source id
Coverage of shared identifiers by item
[Bar chart: UMLS CUIs versus Wikidata items; axis cut off, NCBI Taxonomy has > 1 million]
Good targets for Wikidata bots
463,059 mapped concepts, by semantic group
[Bar chart, log scale from 1 to 1,000,000: N 1 to 1. Labeled: NCBI Taxons, Gene Ontology, Genes, Diseases, Drugs]
Where are the Gaps?
[Bar chart, axis from 0 to 800,000: N no Map]
600,000 missing drugs
550,000 missing disorders
Where are(n’t) the Gaps?
[Bar chart, axis from 0 to 0.6: percent_mapped]
Label matching…
Adding label matching actually doesn’t help
that much…
• Checked only 460,080 (including all 288,552 from SemmedDB)
• 21% (96,843) had an identifier match
• 6.9% (31,645) had a match on the UMLS Preferred Label
• 3.1% (14,319) matched one of the UMLS synonyms
• Removing anything that matched more than 1 Wikidata item we get
129,726 concepts.
• Limiting to concepts used in SemmedDB we get 113,623
• (43% coverage with most matches coming from identifiers)
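The matching order described above (identifier first, then preferred label, then synonyms, then discarding anything that hits more than one Wikidata item) can be sketched as follows. This is an illustration under assumptions, not the code used for the talk: the function names, the dictionary layout and the toy data are hypothetical, and case-insensitive exact matching is an assumed choice that the slides do not specify.

```python
from collections import defaultdict


def build_label_index(wikidata_items):
    """Map each case-folded label or alias to the set of Qids that carry it."""
    index = defaultdict(set)
    for qid, names in wikidata_items.items():
        for name in names:
            index[name.casefold()].add(qid)
    return index


def map_concept(concept, id_index, label_index):
    """Try identifier, preferred label, then synonyms; reject ambiguous hits."""
    candidates = set()
    for ext_id in concept["ids"]:
        candidates |= id_index.get(ext_id, set())
    method = "identifier"
    if not candidates:
        candidates = set(label_index.get(concept["preferred"].casefold(), set()))
        method = "preferred_label"
    if not candidates:
        for synonym in concept["synonyms"]:
            candidates |= label_index.get(synonym.casefold(), set())
        method = "synonym"
    if len(candidates) == 1:
        return next(iter(candidates)), method
    return None, ("ambiguous" if candidates else "unmapped")


# Toy data: only the first entry reuses values shown in the slides.
id_index = {("P1402", "15633"): {"Q934888"}}
label_index = build_label_index({
    "Q934888": ["adrenal medulla"],
    "Q1": ["toy label"],
    "Q2": ["toy label"],
})
adrenal = {"ids": [("P1402", "15633")], "preferred": "Adrenal Medulla", "synonyms": []}
clash = {"ids": [], "preferred": "Toy Label", "synonyms": []}

print(map_concept(adrenal, id_index, label_index))
print(map_concept(clash, id_index, label_index))
```

Stopping at the first tier that yields any candidate keeps identifier matches from being diluted by label collisions, which fits the observation that most of the coverage comes from identifiers.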
SemmedDB as Wikidata, version 1
• 15,957,582 predications with 13 relation types
• All concepts are Wikidata items
• All relation types are Wikidata properties
• (Data available at http://tinyurl.com/cui2qid-1 )
• Will be accessible in kb3.0 next week or the following
Next steps / project opportunities
• More Wikidata bots!
• Establish a more consistent typing strategy in Wikidata (e.g. make
each item an instance of some semantic group)
• Finish the mapping of the UMLS predicates to Wikidata Properties
• Add missing properties (e.g. ‘Activates’, ‘Inhibits’)
• Use the existing ‘subproperty of’ property to build a property ontology inside Wikidata
• Populate kb3.0 with knowledge pertinent to your disease area
• Extend the user interface
• Use the underlying neo4j database to extend HetioNet and related (or
add HetioNet to it)
Pick an edge or node and create or improve it
Thanks!
• Richard Bruskiewich! and Star Informatics team for persevering…
(v1,v2.1...5, v3.0)
• Gene Wiki team! Especially bot developers: Sebastian B, Andra W,
Tim P., Greg S. who planted the seeds that are making this possible.
• Su laboratory!
• I hope you can find something useful here and help grow the garden…
• Especially you HetNetters!
https://www.flickr.com/photos/alexnormand/5992512756
Building a Biomedical Knowledge Garden

More Related Content

What's hot

Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMichel Dumontier
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesMichel Dumontier
 
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience, BGI Hong Kong
 
Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Michel Dumontier
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...Syed Ahmad Chan Bukhari, PhD
 
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceScott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceGigaScience, BGI Hong Kong
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...GigaScience, BGI Hong Kong
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...Syed Ahmad Chan Bukhari, PhD
 
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET
 
GlyGen Warren Workshop in Boston
GlyGen Warren Workshop in BostonGlyGen Warren Workshop in Boston
GlyGen Warren Workshop in BostonGlyGen
 
GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience, BGI Hong Kong
 
2015 balti-and-bioinformatics
2015 balti-and-bioinformatics2015 balti-and-bioinformatics
2015 balti-and-bioinformaticsc.titus.brown
 

What's hot (20)

Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
 
The expansive reach of ChemSpider as a resource for the chemistry community
The expansive reach of ChemSpider as a resource for the chemistry communityThe expansive reach of ChemSpider as a resource for the chemistry community
The expansive reach of ChemSpider as a resource for the chemistry community
 
Phylogenetics: Making publication-quality tree figures
Phylogenetics: Making publication-quality tree figuresPhylogenetics: Making publication-quality tree figures
Phylogenetics: Making publication-quality tree figures
 
Canadian health census to lod
Canadian health census to lodCanadian health census to lod
Canadian health census to lod
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description Guidelines
 
BioNLPSADI
BioNLPSADIBioNLPSADI
BioNLPSADI
 
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDB
 
2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
 
Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
Better Data for a Better World
Better Data for a Better WorldBetter Data for a Better World
Better Data for a Better World
 
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceScott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
 
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
 
GlyGen Warren Workshop in Boston
GlyGen Warren Workshop in BostonGlyGen Warren Workshop in Boston
GlyGen Warren Workshop in Boston
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
Lrg and mane 16 oct 2018
Lrg and mane   16 oct 2018Lrg and mane   16 oct 2018
Lrg and mane 16 oct 2018
 
GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.
 
2015 balti-and-bioinformatics
2015 balti-and-bioinformatics2015 balti-and-bioinformatics
2015 balti-and-bioinformatics
 

Viewers also liked

Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Benjamin Good
 
Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giantsBenjamin Good
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of FoodBenjamin Good
 
Dagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkDagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkCominvent AS
 
Welcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLCWelcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLCAlex Faynin
 
Oslo Solr MeetUp March 2012 - Solr4 alpha
Oslo Solr MeetUp March 2012 - Solr4 alphaOslo Solr MeetUp March 2012 - Solr4 alpha
Oslo Solr MeetUp March 2012 - Solr4 alphaCominvent AS
 
The National Society For The Protection Of Hmmm
The National Society For The Protection Of HmmmThe National Society For The Protection Of Hmmm
The National Society For The Protection Of Hmmmguest0233e9d0
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Benjamin Good
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingBenjamin Good
 
Open source breakfast norge findwise
Open source breakfast norge findwiseOpen source breakfast norge findwise
Open source breakfast norge findwiseCominvent AS
 
Fedora Iptables
Fedora IptablesFedora Iptables
Fedora Iptableszubin71
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first weekBenjamin Good
 
EISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogueEISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogueeishimachinery
 
Eishi Company Profile 修改好的
Eishi Company Profile 修改好的Eishi Company Profile 修改好的
Eishi Company Profile 修改好的eishimachinery
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyCominvent AS
 
Buyer Remorse
Buyer RemorseBuyer Remorse
Buyer Remorsesmfox
 

Viewers also liked (20)

Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2
 
Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giants
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of Food
 
Dagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkDagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søk
 
2016 mem good
2016 mem good2016 mem good
2016 mem good
 
Welcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLCWelcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLC
 
Oslo Solr MeetUp March 2012 - Solr4 alpha
Oslo Solr MeetUp March 2012 - Solr4 alphaOslo Solr MeetUp March 2012 - Solr4 alpha
Oslo Solr MeetUp March 2012 - Solr4 alpha
 
The National Society For The Protection Of Hmmm
The National Society For The Protection Of HmmmThe National Society For The Protection Of Hmmm
The National Society For The Protection Of Hmmm
 
Gene wiki jamboree
Gene wiki jamboreeGene wiki jamboree
Gene wiki jamboree
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
 
Open source breakfast norge findwise
Open source breakfast norge findwiseOpen source breakfast norge findwise
Open source breakfast norge findwise
 
Fedora Iptables
Fedora IptablesFedora Iptables
Fedora Iptables
 
IMSafer Angel Round
IMSafer Angel RoundIMSafer Angel Round
IMSafer Angel Round
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
 
EISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogueEISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogue
 
(Bio)Hackathons
(Bio)Hackathons(Bio)Hackathons
(Bio)Hackathons
 
Eishi Company Profile 修改好的
Eishi Company Profile 修改好的Eishi Company Profile 修改好的
Eishi Company Profile 修改好的
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
 
Buyer Remorse
Buyer RemorseBuyer Remorse
Buyer Remorse
 

Similar to Building a Biomedical Knowledge Garden

Databases and Ontologies: Where do we go from here?
Databases and Ontologies:  Where do we go from here?Databases and Ontologies:  Where do we go from here?
Databases and Ontologies: Where do we go from here?Maryann Martone
 
Enhancing the Quality of ImmPort Data
Enhancing the Quality of ImmPort DataEnhancing the Quality of ImmPort Data
Enhancing the Quality of ImmPort DataBarry Smith
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationBenjamin Good
 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astrowebuploader
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemMaryann Martone
 
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017Deborah McGuinness
 
The pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleThe pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleEnis Afgan
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...Maryann Martone
 
User Interests Identification From Twitter using Hierarchical Knowledge Base
User Interests Identification From Twitter using Hierarchical Knowledge BaseUser Interests Identification From Twitter using Hierarchical Knowledge Base
User Interests Identification From Twitter using Hierarchical Knowledge BasePavan Kapanipathi
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Keiichiro Ono
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Sanjay Padhi, Ph.D
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Amit Sheth
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EITESANGO
 

Similar to Building a Biomedical Knowledge Garden (20)

Databases and Ontologies: Where do we go from here?
Databases and Ontologies:  Where do we go from here?Databases and Ontologies:  Where do we go from here?
Databases and Ontologies: Where do we go from here?
 
Enhancing the Quality of ImmPort Data
Enhancing the Quality of ImmPort DataEnhancing the Quality of ImmPort Data
Enhancing the Quality of ImmPort Data
 
Neuroscience as networked science
Neuroscience as networked scienceNeuroscience as networked science
Neuroscience as networked science
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocuration
 
wolstencroft-ogf20-astro
wolstencroft-ogf20-astrowolstencroft-ogf20-astro
wolstencroft-ogf20-astro
 
Data-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystemData-knowledge transition zones within the biomedical research ecosystem
Data-knowledge transition zones within the biomedical research ecosystem
 
MIRIAM Resources
MIRIAM ResourcesMIRIAM Resources
MIRIAM Resources
 
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017
 
Data Management
Data ManagementData Management
Data Management
 
The pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an exampleThe pulse of cloud computing with bioinformatics as an example
The pulse of cloud computing with bioinformatics as an example
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...
 
User Interests Identification From Twitter using Hierarchical Knowledge Base
User Interests Identification From Twitter using Hierarchical Knowledge BaseUser Interests Identification From Twitter using Hierarchical Knowledge Base
User Interests Identification From Twitter using Hierarchical Knowledge Base
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
 
eScience Resources for the Chemistry Community from the Royal Society of Chem...
eScience Resources for the Chemistry Community from the Royal Society of Chem...eScience Resources for the Chemistry Community from the Royal Society of Chem...
eScience Resources for the Chemistry Community from the Royal Society of Chem...
 
Clinical Anatomy 9566
Clinical Anatomy 9566Clinical Anatomy 9566
Clinical Anatomy 9566
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
 
Why should Journals ask fo RRIDs?
Why should Journals ask fo RRIDs?Why should Journals ask fo RRIDs?
Why should Journals ask fo RRIDs?
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017
 

Building a Biomedical Knowledge Garden

  • 1. Building a Biomedical Knowledge Garden Benjamin Good Su Laboratory, Group Meeting Dec. 2, 2016
  • 2. Unstructured data PubMed Clinical Trials Etc. NLP tools SemRep DeepDive Implicitome etc. Knowledge Graph SemmedDB Literome etc. Applications Semantic MEDLINE BioGraph etc. Microtasks Mark2Cure AMT Structured data Gene Ontology etc. http://tinyurl.com/jbmn8mz The Knowledge Garden Idea. Circa Jan. 2015.
  • 3. The devil is in the details… Unstructured data PubMed Clinical Trials Etc. NLP tools SemRep DeepDive Implicitome etc. Knowledge Graph SemmedDB Literome etc. Application Semantic MEDLINE BioGraph etc. Microtasks Mark2Cure AMT Structured data Gene Ontology etc.
  • 4. Reality November 2016 Knowledge Graph SemmedDB Application knowledge.bio Microtasks Mark2Cure AMT
  • 5. knowledge.bio Explore all biomedical knowledge as a graph with edges connected back to supporting references v2.5 demo
  • 6. knowledge.bio – Data challenges • V1 – V2.5 • All content from SemmedDB or Implicitome • Custom schema to support these • V3 key requirement: allow import of content from many other sources (Gene Ontology, DeepDive output, user-generated…)
  • 7. This part is important… Not nailing it down makes everything else harder. Knowledge Garden content managed as: CSV files, JSON documents, MySQL databases, Postgres databases, neo4j databases. None of which had any coherent plan or structure
  • 8. Requirements for a knowledge graph • Syntax: • How to refer to nodes and edges • identifiers • schema (structure of graph) • Semantics: • What things mean • How you decide on the ‘?’: • node1 ‘?’ node2 • are they the same (to you?) • if not, what is the edge? Mind the Gap… (one node in “Amino Acid” namespace, other in “Biologically Active Substance” namespace)
  • 9. Options at kb3 scale (millions of concepts and relations) • The Unified Medical Language System (UMLS) • The Semantic Web • Wikidata ?
  • 10. The UMLS (CUIs, Atoms, Types). CUI C0026106 is equivalent to the Atoms HP:0001256 (Mild mental retardation; Mild and nonprogressive mental retardation), SNOMEDCT_US:86765009 (Moron (mental age 8-12 years)), MEDCIN:35101 (Mild intellectual disabilities), OMIM:MTHU035844 (Intellectual disability, mild). https://uts.nlm.nih.gov CUI C0233630: SNOMEDCT_US:32386009 (Logical Thinking). Types shown in the diagram: Mental or Behavioral Dysfunction, Disease or Syndrome, Behavior, Activity, Event, linked by ‘isa’ and ‘affects’ edges; ‘affects’ between the two CUIs is marked ‘?’. Types organized into a “Semantic Network”: ~133 types, 54 predicates, 13 high-level ‘groups’
  • 11. The UMLS in 2016 • 3,200,922 CUIs • 211 source vocabularies (e.g. MeSH, SNOMED, RxNorm, etc.) • 12,287,973 total terms (“Atoms”) • Every edge in the system is a manual product of NLM • every Atom->CUI • every CUI->Type • every Type->Type
  • 12. The Semantic Web • Concepts uniquely identified by resolvable URIs • Meaning (e.g. equivalency) encoded in OWL axioms • Concepts and mappings created and maintained by anyone who can host them • No other structure • No governance
  • 13. UMLS versus Semantic Web • UMLS • PROs: covers large portion of biomedical concept space, manually curated, we are already using it by default, the semantic types are handy • CONs: does not exist on the semantic web - no stable URI to associate with a CUI, license is obscure and apparently limiting, weak representation of molecular biology domain, no control over its extension (e.g. no Human Disease Ontology) • Semantic Web • PROs: universal, open, infrastructure is the Web itself • CONs: need for organization, curation, mapping
  • 14. Not thrilled with my options https://commons.wikimedia.org/wiki/File:A_frustrated_and_depressed_man_holds_his_head_in_his_hand.jpg
  • 15. Meanwhile... • Genes and proteins for human, mouse, rat, yeast, macaque, and 120+ microbes • Gene Ontology terms • Human Disease Ontology terms • 120,000+ chemicals • Cancer genome variants • Other people adding and using data!!!
  • 17. Wikidata (QIDs, ids, Types). QID Q183560 has external ids HP:0001256 (Mild mental retardation; Mild and nonprogressive mental retardation), SNOMEDCT_US:86765009 (Moron (mental age 8-12 years)), MEDCIN:35101 (Mild intellectual disabilities), OMIM:MTHU035844 (Intellectual disability, mild). QID Q412194 (https://www.wikidata.org/wiki/Q412194): PubChem: 2477, buspirone. Edge shown in the diagram: ‘treated by’. Types (“Poly-Ontology”): Specific Developmental Disorder, developmental disorder of mental health, mental disorder, disorder (DO), linked by ‘subclass of’ edges; Drug and Chemical, linked by ‘isa’
  • 18. ACTIVE! Knowledge Flow for Wikidata Unstructured data The Internet NLP tools StrepHit Knowledge Graph Applications Wikipedia Wikigenomes Wikidata.org Microtasks Wikidata game MixnMatch Structured data Gene Ontology etc.
  • 19. Wikidata is a Functioning and Flourishing Knowledge Garden
  • 20. Wikidata • ~27,000,000 concepts identified by Qids like ‘Q183560’ • ~1350 source vocabularies (e.g. MeSH, RxNorm, IMDb, etc.) • (Based on properties tagged with type ‘ExternalId’) • ? total terms integrated = labels + aliases (a lot) • Mappings to Qids are the product of the unwashed masses • Constantly updated
  • 21. What concept scheme do we use ? • Wikidata • PROs: universal, open, infrastructure, active community, largely curated content • CONs: limited biomedical content so far ?
  • 22. Challenge: Relevant Scientific Applications NLP tools SemRep Literome Implicitome PubTator DeepDive Snorkel ContentMine TEES …. Knowledge Graph Applications Wikigenomes HetioNet Knowledge.Bio … Structured data Gene Expression etc, … A. Advancing science is the goal and this is how we can help B. We need experts to help refine and build the knowledge graph and apps are the bait
  • 23. On the plane Oct. 11, 2016… “Screw it, let's go all in” I got really excited.. https://www.flickr.com/photos/alexnormand/5992512756 https://www.flickr.com/photos/k6lcs/15374887957
  • 24. knowledge.bio 3.0 • All nodes to be concepts from Wikidata • All predicates to be properties from Wikidata • All edges to be linked to references that could be ‘stated in’ Wikidata • Edges (‘claims’) can come from any source • Now • We have one consistent format for data import • We have a consistent pattern for gathering more data about a concept • We have access to 27 million concepts and growing (and we can add more) • We have the beginnings of a new tool for expert-sourcing curation of Wikidata content • Our code is getting simpler and cleaner
  • 25. KB3.0 – next step seeding content • You are now basically up to date… • Rest of talk is about mapping content from SemmedDB to the new structure • 3.0 release will allow users to add new nodes and edges • If you want data in there: 1. map it to Wikidata items and properties 2. make a tab-delimited file (Qid Pid Qid referenceUrl sentence) 3. load it (or ask me to) • Users needed!
  • 26. How many concepts in the UMLS are now items in Wikidata? ? (Wikidata: 27,000,000; UMLS: 3,000,000)
  • 28. Direct identifier mapping (15 shared ontologies): CUI to Qid
      UMLS_vocab | Concepts | Wikidata_property | Prop id | Usage
      NCBI | 1014837 | NCBI Taxonomy ID | P685 | 379589
      MSH | 359116 | MeSH ID | P486 | 5979
      ICD10PCS | 178278 | ICD-10-PCS | P1690 | 5
      NCI | 119620 | NCI Thesaurus ID | P1748 | 5562
      ICD10CM | 98899 | ICD-10 | P494 | 8826
      OMIM | 86181 | OMIM ID | P492 | 5835
      FMA | 82042 | Foundational Model of Anatomy ID | P1402 | 3378
      GO | 60412 | Gene Ontology ID | P686 | 43693
      MDR | 51961 | Medical Dictionary for Regulatory Activities ID | P3201 | 1
      HGNC | 39261 | HGNC gene symbol | P353 | 63691
      HGNC | Sometimes... | HGNC-ID | P354 | 39758
      NDFRT | 38206 | NDF-RT ID | P2115 | 1509
      ICD9CM | 20993 | ICD-9-CM | P1692 | 88
      ICD10 | 11552 | ICD-10 | P494 | 8826
      RXNORM | 205998 | RxNorm CUI | P3345 | 5671
    Example: C0001629 (Adrenal Medulla) -> [local MySQL query] -> FMA: 15633 -> [build SPARQL] -> ?qid wdt:P1402 “15633” -> [query.wikidata.org] -> Q934888
  • 29. Strict identifier mapping: CUI to Qid
      UMLS_vocab | Concepts | Wikidata_property | Prop id | Usage
      NCBI | 1014837 | NCBI Taxonomy ID | P685 | 379589
      MSH | 359116 | MeSH ID | P486 | 5979
      ICD10PCS | 178278 | ICD-10-PCS | P1690 | 5
      NCI | 119620 | NCI Thesaurus ID | P1748 | 5562
      ICD10CM | 98899 | ICD-10 | P494 | 8826
      OMIM | 86181 | OMIM ID | P492 | 5835
      FMA | 82042 | Foundational Model of Anatomy ID | P1402 | 3378
      GO | 60412 | Gene Ontology ID | P686 | 43693
      MDR | 51961 | Medical Dictionary for Regulatory Activities ID | P3201 | 1
      HGNC | 39261 | HGNC gene symbol | P353 | 63691
      HGNC | Sometimes... | HGNC-ID | P354 | 39758
      NDFRT | 38206 | NDF-RT ID | P2115 | 1509
      ICD9CM | 20993 | ICD-9-CM | P1692 | 88
      ICD10 | 11552 | ICD-10 | P494 | 8826->8292
      RXNORM | 205998 | RxNorm CUI | P3345 | 0->5671
    -> Thanks to Sebastian’s recent work..
  • 30. How many concepts in the UMLS are now items in Wikidata? (according to identifiers) 463,059, i.e. 15% of the 3,000,000 UMLS concepts (Wikidata: 27,000,000 items)
  • 32. Coverage of shared identifiers by item (cut off, NCBI taxonomy has > 1 million) [Chart comparing UMLS CUIs with Wikidata items per shared identifier] Good targets for Wikidata bots
  • 33. 463,059 mapped concepts, by semantic group [Bar chart, log-scale axis from 1 to 1,000,000: N 1 to 1; labeled groups: NCBI Taxons, Gene Ontology, Genes, Diseases, Drugs]
  • 34. Where are the Gaps? [Bar chart, axis from 0 to 800,000: N no Map] 600,000 missing drugs, 550,000 missing disorders
  • 35. Where are(n’t) the Gaps? [Bar chart, axis from 0 to 0.6: percent_mapped]
  • 37. Adding label matching actually doesn’t help that much… • Checked only 460,080 (including all 288,552 from SemmedDB) • 21% (96,843) had an identifier match • 6.9% (31,645) had a match on the UMLS Preferred Label • 3.1% (14,319) matched one of the UMLS synonyms • Removing anything that matched more than 1 Wikidata item, we get 129,726 concepts. • Limiting to concepts used in SemmedDB, we get 113,623 • (43% coverage with most matches coming from identifiers)
  • 38. SemmedDB as Wikidata, version 1 • 15,957,582 predications with 13 relation types • All Concepts Wikidata items • All relation types Wikidata properties • (Data available at http://tinyurl.com/cui2qid-1 ) • Will be accessible in kb3.0 next week or the following
  • 39. Next steps / project opportunities • More Wikidata bots! • Establish a more consistent typing strategy in Wikidata (e.g. make each item an instance of some semantic group) • Finish the mapping of the UMLS predicates to Wikidata Properties • Add missing properties (e.g. ‘Activates’, ‘Inhibits’) • Use existing subproperty prop. to build a prop. ontology inside Wikidata • Populate kb3.0 with knowledge pertinent to your disease area • Extend the user interface • Use the underlying neo4j database to extend HetioNet and related (or add HetioNet to it).
  • 40. Pick an edge or node and create or improve it Unstructured data PubMed Clinical Trials Etc. NLP tools SemRep DeepDive Implicitome etc. Knowledge Graph SemmedDB Literome etc. Applications Semantic MEDLINE BioGraph etc. Microtasks Mark2Cure AMT Structured data Gene Ontology etc.
  • 41. Thanks! • Richard Bruskiewich! and Star Informatics team for persevering… (v1,v2.1...5, v3.0) • Gene Wiki team! Especially bot developers: Sebastian B, Andra W, Tim P., Greg S. who planted the seeds that are making this possible. • Su laboratory! • I hope you can find something useful here and help grow the garden… • Especially you HetNetters! https://www.flickr.com/photos/alexnormand/5992512756
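
Slide 25 asks contributors to supply edges as a tab-delimited file with the columns (Qid Pid Qid referenceUrl sentence). As a minimal sketch of what checking such a file could look like, assuming one edge per line and no header row (the slides do not say either way), the Python below parses rows into a small record type and rejects malformed ones. The names `Claim` and `parse_claims` are hypothetical and are not part of knowledge.bio.

```python
import csv
import io
import re
from dataclasses import dataclass

# Wikidata item ids look like Q183560; property ids look like P1402.
QID_RE = re.compile(r"^Q[1-9]\d*$")
PID_RE = re.compile(r"^P[1-9]\d*$")


@dataclass(frozen=True)
class Claim:
    """One edge (hypothetical record type): subject, property, object, provenance."""
    subject_qid: str
    property_pid: str
    object_qid: str
    reference_url: str
    sentence: str


def parse_claims(tsv_text):
    """Parse rows of (Qid, Pid, Qid, referenceUrl, sentence); raise on malformed rows."""
    claims = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 5:
            raise ValueError(f"line {line_no}: expected 5 columns, got {len(row)}")
        subj, pid, obj, url, sentence = (cell.strip() for cell in row)
        if not (QID_RE.match(subj) and QID_RE.match(obj)):
            raise ValueError(f"line {line_no}: subject and object must be Qids")
        if not PID_RE.match(pid):
            raise ValueError(f"line {line_no}: predicate must be a Pid")
        claims.append(Claim(subj, pid, obj, url, sentence))
    return claims


# Format-only example row: the two Qids appear on slide 17, while the property id,
# URL and sentence are placeholders invented for this sketch.
example = "Q183560\tP2176\tQ412194\thttps://example.org/reference\tplaceholder sentence\n"
for claim in parse_claims(example):
    print(claim.subject_qid, claim.property_pid, claim.object_qid)
```

Validating the id shapes up front keeps obviously unmapped rows (for example, raw CUIs) out of the import; it does not check that the ids actually exist in Wikidata.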

Editor's Notes

  1. Amino Acid, Peptide, or Protein; Biologically Active Substance. Is there one node for the gene and one for the protein? Are orthologs different nodes? What about sequence variants?
  2. Note we could be checking the references to increase precision and provenance…
  3. Note we could be checking the references to increase precision and provenance…
  4. And that is largely the important take home message. Identifier mapping is hard, generally boring work that should never be repeated! Doing it in the context of Wikidata means that it can be done once and for all – and you can even describe how it was accomplished in the references and qualifiers!!! We should do this!
  5. Noting that I was hammering the query service around 10 times/second for around 24 hours and it never complained or slowed down.
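
The identifier lookup on slide 28 (FMA: 15633 matched through `?qid wdt:P1402 “15633”` on query.wikidata.org) can be sketched as a small query builder. This is an illustration under assumptions, not the code used for the talk: the function names are hypothetical, and the sketch only constructs the query text and request URL for the public endpoint without sending anything.

```python
import re
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint (the query.wikidata.org service named on slide 28).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_lookup_query(property_pid, external_id):
    """Return a SPARQL query for items whose `property_pid` value equals `external_id`."""
    if not re.fullmatch(r"P[1-9]\d*", property_pid):
        raise ValueError(f"not a Wikidata property id: {property_pid!r}")
    # Escape backslashes and double quotes so the id is a valid SPARQL string literal.
    literal = external_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?qid WHERE {{ ?qid wdt:{property_pid} "{literal}" . }}'


def build_request_url(property_pid, external_id):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": build_lookup_query(property_pid, external_id), "format": "json"}
    return f"{WDQS_ENDPOINT}?{urlencode(params)}"


# The slide's example: FMA id 15633 looked up through property P1402.
print(build_lookup_query("P1402", "15633"))
print(build_request_url("P1402", "15633"))
```

According to the slide, this lookup resolved to Q934888 for Adrenal Medulla. Anyone running lookups in bulk, as described in the note above, may want to add their own throttling and a descriptive User-Agent header.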