The Bio2RDF project aims to transform silos of bioinformatics data into a distributed platform for biological knowledge discovery. Initial work focused on building a public database of open-linked data with web-resolvable identifiers that provides information about named entities. This involved a syntactic normalization to convert open data represented in a variety of formats (flatfile, tab, xml, web services) to RDF-based linked data with normalized names (HTTP URIs) and basic typing from source databases. Bio2RDF entities also make reference to other open linked data networks (e.g. dbPedia) thus facilitating traversal across information spaces. However, a significant problem arises when attempting to undertake more sophisticated knowledge discovery approaches such as question answering or symbolic data mining. This is because knowledge is represented in a fundamentally different manner, requiring one to know the underlying data model and reconcile the artefactual differences when they arise. In this talk, we describe our data integration strategy that makes use of both syntactic and semantic normalization to consistently marshal knowledge to a common data model while leveraging explicit logic-based mappings with community ontologies to further enhance the biological knowledgescope.
Bio2RDF and Beyond!
1. Bio2RDF and Beyond! Large Scale, Distributed Biological Knowledge Discovery 1 EBI : 14-01-10 Michel Dumontier, Ph.D. Associate Professor of Bioinformatics Carleton University Department of Biology School of Computer Science Institute of Biochemistry Ottawa Institute of Systems Biology Ottawa-Carleton Institute of Biomedical Engineering
5. We need to expose the deep web. Surface web: 167 terabytes. Deep web: 91,000 terabytes (545-to-one).
6. Data silos – not made for sharing
7. How do we integrate these resources?
8. We want to simultaneously query the 1000+ biological databases
9. The Semantic Web is a web of knowledge. It is about standards for publishing, sharing and querying knowledge drawn from diverse sources. It enables the answering of sophisticated questions.
11. Life Science Data Contributors: HCLS (LODD), Neurocommons, Bio2RDF
12. Resource Description Framework (RDF). Allows one to talk about anything. A Uniform Resource Identifier (URI) can be used as an entity name. Bio2RDF specifies the naming convention: http://bio2rdf.org/uniprot:P05067 is a name for Amyloid precursor protein; http://bio2rdf.org/omim:104300 is a name for Alzheimer disease.
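The naming convention on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Bio2RDF's own code: the helper names `to_uri` and `to_curie` are hypothetical, and the only pattern assumed is the one visible in the two example URIs (base address followed by `namespace:identifier`).

```python
# Base address taken from the two example URIs on the slide.
BIO2RDF_BASE = "http://bio2rdf.org/"


def to_uri(name: str) -> str:
    """Expand a 'namespace:identifier' name into a Bio2RDF-style URI.

    Hypothetical helper: it only reproduces the pattern shown on the slide.
    """
    namespace, sep, identifier = name.partition(":")
    if not sep or not namespace or not identifier:
        raise ValueError(f"expected 'namespace:identifier', got {name!r}")
    return f"{BIO2RDF_BASE}{namespace}:{identifier}"


def to_curie(uri: str) -> str:
    """Recover the short 'namespace:identifier' form from a Bio2RDF-style URI."""
    if not uri.startswith(BIO2RDF_BASE):
        raise ValueError(f"not a Bio2RDF-style URI: {uri!r}")
    return uri[len(BIO2RDF_BASE):]


if __name__ == "__main__":
    print(to_uri("uniprot:P05067"))
    print(to_curie("http://bio2rdf.org/omim:104300"))
```

Because the short name and the URI are interchangeable under this pattern, any dataset that uses the same `namespace:identifier` pair ends up naming the same entity.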
15. Object: resource or literal. uniprot:P05067 is a uniprot:Protein
16. Multi-Source Data Integration depends on consistent naming. [Diagram: UniProt (uniprot:P05067 is a uniprot:Protein; has name) + Gene Ontology (uniprot:P05067 located in go:Membrane) + iRefIndex (uniprot:P05067 interacts with uniprot:P05067) = unified view]
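A minimal sketch of why consistent naming matters for integration, assuming each source can be reduced to a set of (subject, predicate, object) statements. The three tiny datasets below are illustrative stand-ins built from the names on the slide, not real extracts from UniProt, Gene Ontology, or iRefIndex, and `unified_view` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative statements only; each set stands in for one source database.
uniprot = {("uniprot:P05067", "is a", "uniprot:Protein")}
gene_ontology = {("uniprot:P05067", "located in", "go:Membrane")}
irefindex = {("uniprot:P05067", "interacts with", "uniprot:P05067")}


def unified_view(*sources):
    """Group statements from all sources by subject name.

    Statements merge without any mapping step only because every source
    uses the same name for the same entity.
    """
    view = defaultdict(set)
    for source in sources:
        for subject, predicate, obj in source:
            view[subject].add((predicate, obj))
    return view


view = unified_view(uniprot, gene_ontology, irefindex)

if __name__ == "__main__":
    for predicate, obj in sorted(view["uniprot:P05067"]):
        print("uniprot:P05067", predicate, obj)
```

If one source had used a different name for the same protein, its statements would land under a separate key and the unified view would silently split.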
17. Building statements creates knowledge. [Diagram: uniprot:P05067 (label: Amyloid precursor protein; is a Protein) is involved in omim:104300 (label: Alzheimer Disease; is a Disease)]
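The statements in this diagram can be written down as triples and queried by pattern. The sketch below is a plain-Python illustration under that assumption; `match`, `label_of`, and `proteins_involved_in_diseases` are hypothetical helpers, and a real deployment would more likely use an RDF store with SPARQL.

```python
# The five statements from the slide, as (subject, predicate, object) triples.
triples = [
    ("uniprot:P05067", "label", "Amyloid precursor protein"),
    ("uniprot:P05067", "is a", "Protein"),
    ("uniprot:P05067", "is involved in", "omim:104300"),
    ("omim:104300", "label", "Alzheimer Disease"),
    ("omim:104300", "is a", "Disease"),
]


def match(data, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in data
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def label_of(data, entity):
    """Return the first label of an entity, or the entity name if unlabeled."""
    labels = match(data, s=entity, p="label")
    return labels[0][2] if labels else entity


def proteins_involved_in_diseases(data):
    """Answer 'which proteins are involved in which diseases?' by label."""
    results = []
    for subject, _, obj in match(data, p="is involved in"):
        is_protein = match(data, s=subject, p="is a", o="Protein")
        is_disease = match(data, s=obj, p="is a", o="Disease")
        if is_protein and is_disease:
            results.append((label_of(data, subject), label_of(data, obj)))
    return results


if __name__ == "__main__":
    print(proteins_involved_in_diseases(triples))
```

No single statement answers the question; the answer comes from combining the typing, labeling, and "is involved in" statements.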
19. Bio2RDF is a framework to create and provision linked data networks. Francois Belleau, Laval University; Marc-Alexandre Nolin, Laval University; Peter Ansell, Queensland University of Technology; Michel Dumontier, Carleton University
23. Bioinformatics Discovery Registry. SharedName initiative to provide stable URI patterns for data records. We added the relationship between entities and records. Directory Service: ~1700 datasets & dozens of resolvers. Discovery Service: Registry links entities to data records, their formats (RDF/XML, HTML, etc.) and provider (Bio2RDF, UniProt). Redirection Service: Automatic redirection to data provider document.
24. Something you can look up or search for, with rich descriptions
36. Reasoning and Inference through Semantics. [Diagram: a knowledge base combining a fact (uniprot:P05067 is a uniprot:Protein) with an ontology (uniprot:Protein is a chebi:Polyatomic Entity)]
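The kind of inference this slide points at can be illustrated with a toy subclass closure. This sketch assumes the diagram means that a stated fact (P05067 is a Protein) plus an ontology axiom (Protein is a kind of Polyatomic Entity) yields a new statement; that reading is an assumption. The code is not an OWL reasoner: it only propagates types up a subclass hierarchy, and the function names are hypothetical.

```python
# Assumed reading of the slide: one asserted fact and one ontology axiom.
facts = {("uniprot:P05067", "uniprot:Protein")}  # (instance, class)
ontology = {("uniprot:Protein", "chebi:Polyatomic Entity")}  # (subclass, superclass)


def superclasses(cls, axioms):
    """Return every direct or indirect superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in axioms:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def infer_types(asserted, axioms):
    """Return asserted types plus those implied by the subclass axioms."""
    inferred = set(asserted)
    for instance, cls in asserted:
        for sup in superclasses(cls, axioms):
            inferred.add((instance, sup))
    return inferred


if __name__ == "__main__":
    for instance, cls in sorted(infer_types(facts, ontology)):
        print(instance, "is a", cls)
```

The inferred statement was never entered by anyone; it follows from the fact and the axiom together, which is what lets a query for polyatomic entities return the protein.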
37. The Web Ontology Language (OWL) has explicit semantics. It can therefore be used to capture knowledge in a machine-understandable way.
42. The Holy Grail: Align the promoters of all serine/threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels. Retrieve and align 2000 nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% similar in the active site to kinases known to be involved in cell-cycle regulation in any other species.
43. Semantic Automated Discovery and Integration, http://sadiframework.org. Mark Wilkinson, UBC; Michel Dumontier, Carleton University; Christopher Baker, UNB
44. SADI – described oriented service matching based on registered predicates
60. We’re interested in Personalized Medicine: the ability to offer the right drug to the right patient for the right disease at the right time with the right dosage. Genetic and metabolic data will allow drugs to be tailored to patient subgroups.
61. PharmGKB is an emerging resource for pharmacogenomics. + Role of genes, gene variants, drugs + pharmacokinetics + pharmacodynamics + clinical outcomes + Links to publications - Natural language descriptions - Variant details in publications
62. Pharmacogenomics of Depression. The knowledge base contains statements from 11/40 relevant publications involving 45 genes / gene variants, 57 drugs annotated with 19 classes of antidepressants, 45 drug treatments, 47 drug-gene interactions, 29 clinical outcomes, 10 drug-induced side-effects, and 8 gene-disease interactions.
63. Querying the PDKB (Protégé 4, FaCT++, DL Query Tab). Nortriptyline-induced side effects for ABCB1 gene variants: ‘side effect’ that ‘is realized by’ some (‘drug treatment’ that ‘involves’ some ‘nortriptyline’ and ‘involves’ some (‘variant of’ some ‘ABCB1’)). Result: postural hypotension is a side effect of nortriptyline treatment of depression for individuals presenting the 3435C>T genotype.
Editor's Notes
Can’t answer questions that require background knowledge
But don’t have the flexibility to ask sophisticated questions
Can’t answer questions that require background knowledge
Research – that’s what brought you here. Skills – marketable in whatever you choose to do thereafter. Knowledgeable – where the field has been and where it is going. Improve oral and written scientific communication skills. Research – tell people what you’ve been doing. Track progress – develop a sense of progress.