Best practices for generating linked data
Tutorial @ ICBO 2013
Tutorial Roadmap
Bio2RDF Best Practices
1. Assign URIs for all things
2. Assign labels and identifiers
3. Declare and assign types
4. Provide dataset provenance
1. Assign URIs for all things
● The base Bio2RDF URI pattern:
http://bio2rdf.org/namespace:identifier
● Data provider record identifiers are
maintained from source
● Linked Data = no blank nodes!
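A minimal sketch of the base URI pattern, written in Python for illustration only. The function name `bio2rdf_uri` is hypothetical and is not part of the Bio2RDF PHP API.

```python
BASE = "http://bio2rdf.org/"


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Build a URI of the form http://bio2rdf.org/namespace:identifier.

    The identifier is passed through unchanged, so the data provider's
    record identifier is preserved exactly as it appears in the source.
    """
    if not namespace or not identifier:
        raise ValueError("both namespace and identifier are required")
    return f"{BASE}{namespace}:{identifier}"


print(bio2rdf_uri("drugbank", "DB00650"))
# prints http://bio2rdf.org/drugbank:DB00650
```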
1. Assign URIs for all things
● Data provider record identifiers are maintained from source
○ e.g. DrugBank’s resource IRI for Leucovorin:
http://bio2rdf.org/drugbank:DB00650
1. Assign URIs for all things
● Vocabulary namespaces are used for dataset-specific types and predicates
http://bio2rdf.org/drugbank_vocabulary:Drug
● Resource namespaces are used to assign an identifier when one isn't provided by the source
○ unique identifier built from a UUID, hash, counter, concatenated strings, etc.
http://bio2rdf.org/drugbank_resource:DB00440_DB00650
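The resource-namespace idea can be sketched as follows, in Python for illustration. Both function names are hypothetical. The underscore separator follows the example URI above, and the choice of MD5 for the hash variant is an assumption, not something the slides specify.

```python
import hashlib

BASE = "http://bio2rdf.org/"


def resource_uri(namespace: str, *parts: str) -> str:
    """Mint a URI in the <namespace>_resource namespace by
    concatenating existing identifiers with underscores."""
    if not parts:
        raise ValueError("at least one identifier part is required")
    return f"{BASE}{namespace}_resource:{'_'.join(parts)}"


def hashed_resource_uri(namespace: str, text: str) -> str:
    """Alternative: derive a stable local identifier from a hash of
    the record content (MD5 chosen here only for illustration)."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return f"{BASE}{namespace}_resource:{digest}"


print(resource_uri("drugbank", "DB00440", "DB00650"))
# prints http://bio2rdf.org/drugbank_resource:DB00440_DB00650
```

Concatenation keeps the URI readable and makes it reproducible across runs. A hash is useful when the natural key is long or contains characters that are awkward in a URI.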
1. Assign URIs for all things
● All valid namespaces are listed in the
Bio2RDF Life Sciences Registry
○ ensures that URIs are consistent across all Bio2RDF
datasets
○ registry is publicly available at http://tinyurl.com/dataregistry
2. Assign labels and identifiers
● Use rdfs:label to assign a language-tagged label for all resources
○ can be a source-provided title, a script-generated phrase, or a phrase provided in a third-party dataset
○ Pattern: rdfs:label "label [ns:id]"@lang
● Use Dublin Core predicates for source-provided labels and identifiers
○ Pattern: dc:title "label"@lang (assign a language tag only when one is provided)
○ Pattern: dc:identifier "ns:id"^^xsd:string
2. Assign labels and identifiers
● Use Bio2RDF predicates to assign Bio2RDF
namespace and Bio2RDF identifiers:
○ Pattern: bio2rdf_vocabulary:namespace "ns"^^xsd:string
○ Pattern: bio2rdf_vocabulary:identifier "id"^^xsd:string
2. Assign labels and identifiers
Example: DrugBank entry for Nitrazepam
drugbank:DB0159
    rdfs:label "Nitrazepam [drugbank:DB0159]"@en ;
    dc:title "Nitrazepam"@en ;
    dc:identifier "drugbank:DB0159"^^xsd:string ;
    bio2rdf_vocabulary:namespace "drugbank"^^xsd:string ;
    bio2rdf_vocabulary:identifier "DB0159"^^xsd:string .
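The label and identifier patterns can be generated mechanically. The Python sketch below is illustrative: the function `describe` is hypothetical, and the output assumes the prefixes `rdfs`, `dc`, `xsd`, `drugbank` and `bio2rdf_vocabulary` are declared elsewhere in the Turtle document.

```python
def _escape(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def describe(ns: str, local_id: str, title: str, lang: str = "en") -> str:
    """Return the label/identifier statements for one resource as Turtle."""
    qname = f"{ns}:{local_id}"
    safe = _escape(title)
    lines = [
        qname,
        f'    rdfs:label "{safe} [{qname}]"@{lang} ;',
        f'    dc:title "{safe}"@{lang} ;',
        f'    dc:identifier "{qname}"^^xsd:string ;',
        f'    bio2rdf_vocabulary:namespace "{ns}"^^xsd:string ;',
        f'    bio2rdf_vocabulary:identifier "{local_id}"^^xsd:string .',
    ]
    return "\n".join(lines)


print(describe("drugbank", "DB0159", "Nitrazepam"))
```

Called with the values from the Nitrazepam example, this prints the same five statements shown on the slide.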
3. Declare and assign types
● All resources should be typed as being
resources of the dataset
○ Pattern: rdf:type namespace_vocabulary:Resource
● Instances of a dataset vocabulary type should also be typed as owl:NamedIndividual
○ Pattern: rdf:type namespace_vocabulary:Type
○ Pattern: rdf:type owl:NamedIndividual
● Classes should be typed as owl:Class
○ Pattern: rdf:type owl:Class
○ If the superclass has been described using the namespace_vocabulary pattern, then link the class using rdfs:subClassOf
3. Declare and assign types
● Object properties and datatype properties
should also be typed
○ Pattern: rdf:type owl:ObjectProperty
○ Pattern: rdf:type owl:DatatypeProperty
● Examples:
drugbank:DB0159
    rdf:type drugbank_vocabulary:Resource ;
    rdf:type owl:Class ;
    rdfs:subClassOf drugbank_vocabulary:Drug .
drugbank_vocabulary:ddi-interactor-in
    rdf:type owl:ObjectProperty .
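As a rough sketch, the typing rules can be written as one function that returns (subject, predicate, object) tuples. This is illustrative Python: the function name `type_triples` and its `kind` parameter are hypothetical.

```python
def type_triples(qname, ns, kind, vocab_type=None):
    """Return the type statements for a resource.

    kind is "individual" or "class"; vocab_type is the local name of a
    type in the dataset vocabulary (for example "Drug"), if there is one.
    """
    # Every resource is typed as a resource of its dataset.
    triples = [(qname, "rdf:type", f"{ns}_vocabulary:Resource")]
    if kind == "individual":
        if vocab_type:
            triples.append((qname, "rdf:type", f"{ns}_vocabulary:{vocab_type}"))
        triples.append((qname, "rdf:type", "owl:NamedIndividual"))
    elif kind == "class":
        triples.append((qname, "rdf:type", "owl:Class"))
        if vocab_type:
            # Link to a superclass described in the dataset vocabulary.
            triples.append((qname, "rdfs:subClassOf", f"{ns}_vocabulary:{vocab_type}"))
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    return triples


for triple in type_triples("drugbank:DB0159", "drugbank", "class", "Drug"):
    print(" ".join(triple), ".")
```

With the arguments shown, the three statements match the drugbank:DB0159 example on the slide.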
4. Provide dataset provenance
[Diagram: data item → void:inDataset → Bio2RDF dataset → prov:wasDerivedFrom → Source dataset]
● Features
○ Entity-dataset link
○ Creator
○ Publisher
○ Date created
○ License & rights
○ Source
○ Availability: SPARQL endpoint, data dump
● Vocabularies
○ VoID
○ Dublin Core
○ W3C Provenance
○ Bio2RDF vocabulary
4. Provide dataset provenance
● Link every resource to the versioned/dated Bio2RDF dataset in which it is described
○ Pattern: void:inDataset <http://bio2rdf.org/dataset:namespace-dd-mm-yyyy.rdf>
○ Example:
drugbank:DB0159 void:inDataset <http://bio2rdf.org/dataset:drugbank-03-07-2013> .
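A small Python sketch of how the dated dataset URI might be built. The function name `dataset_uri` is hypothetical. The output follows the form of the example above, which has no `.rdf` suffix, whereas the stated pattern includes one; the slides do not say which form is canonical.

```python
from datetime import date


def dataset_uri(namespace: str, generated: date) -> str:
    """Build the dataset URI using a dd-mm-yyyy date stamp."""
    stamp = generated.strftime("%d-%m-%Y")
    return f"http://bio2rdf.org/dataset:{namespace}-{stamp}"


def in_dataset_statement(qname: str, namespace: str, generated: date) -> str:
    """Return a Turtle statement linking a resource to its dataset."""
    return f"{qname} void:inDataset <{dataset_uri(namespace, generated)}> ."


print(in_dataset_statement("drugbank:DB0159", "drugbank", date(2013, 7, 3)))
# prints drugbank:DB0159 void:inDataset <http://bio2rdf.org/dataset:drugbank-03-07-2013> .
```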
A crash course in PHP
PHP: Hypertext Preprocessor
● A general-purpose open source scripting language
○ homepage: http://php.net
● PHP scripts can be executed from the
command line or embedded in HTML
documents
● Syntactically similar to C/C++/Java but it is
not strongly typed
A hello world PHP script
● All PHP scripts are surrounded by the <?php
and ?> tags
Declaring and instantiating classes
Using the Bio2RDF PHP API to create an
RDFizer
● Basic structure of a Bio2RDFizer script:
○ Initialize script parameters - input file(s), default
dataset namespace, etc.
○ Define a Run() function that handles downloading
and iterating over input files, as well as function calls
to parse and convert input data to RDF
○ Define function(s) to convert input data to RDF using
Bio2RDF API helper functions
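The three-part structure described above (initialize parameters, run over the input, convert records) can be sketched as follows. This is an illustrative Python analogue, not the Bio2RDF PHP API: the class `ExampleRdfizer`, its methods, and the tab-delimited input format are all assumptions. Downloading is left out, so the input is any iterable of text lines.

```python
import io

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class ExampleRdfizer:
    def __init__(self, namespace, source):
        # Script parameters: default dataset namespace and input lines.
        self.namespace = namespace
        self.source = source

    def run(self):
        """Iterate over the input and collect the converted triples."""
        triples = []
        for line in self.source:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment/header lines
            triples.extend(self.convert(line))
        return triples

    def convert(self, line):
        """Convert one 'identifier<TAB>title' record to N-Triples."""
        identifier, title = line.split("\t", 1)
        subject = f"http://bio2rdf.org/{self.namespace}:{identifier}"
        label = f"{title} [{self.namespace}:{identifier}]"
        label = label.replace("\\", "\\\\").replace('"', '\\"')
        return [f'<{subject}> <{RDFS_LABEL}> "{label}"@en .']


data = io.StringIO("# id\ttitle\nDB00650\tLeucovorin\n")
for triple in ExampleRdfizer("drugbank", data).run():
    print(triple)
```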
Using the Bio2RDF PHP API to create an
RDFizer
● Bio2RDF PHP API defines helper functions
that implement Bio2RDF best practices:
○ getNamespace()
○ getVoc()
○ getRes()
○ triplify($subject, $predicate, $object) //object is an rdf resource
○ triplifyString($subject, $predicate, "string")// object is a literal
○ describeIndividual($uri, $label, $type, $title, $description, $language)
○ describeClass( ... )
○ describeProperty ( ... )
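To make the distinction between `triplify` and `triplifyString` concrete, here is a rough Python analogue that emits N-Triples. It is a sketch under assumptions: the real helpers are PHP methods whose exact behavior is not shown in the slides, and the names and parameters below (`triplify_string`, `lang`) are hypothetical.

```python
def triplify(subject: str, predicate: str, obj: str) -> str:
    """Emit a triple whose object is an RDF resource (an IRI)."""
    return f"<{subject}> <{predicate}> <{obj}> .\n"


def triplify_string(subject: str, predicate: str, literal: str, lang=None) -> str:
    """Emit a triple whose object is a literal, escaped for N-Triples."""
    escaped = (
        literal.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    suffix = f"@{lang}" if lang else ""
    return f'<{subject}> <{predicate}> "{escaped}"{suffix} .\n'


drug = "http://bio2rdf.org/drugbank:DB00650"
print(triplify(drug,
               "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
               "http://bio2rdf.org/drugbank_vocabulary:Drug"), end="")
print(triplify_string(drug,
                      "http://www.w3.org/2000/01/rdf-schema#label",
                      "Leucovorin [drugbank:DB00650]", lang="en"), end="")
```

Keeping resource objects and literal objects in separate helpers means the caller never has to decide how to quote or escape a value.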
Example: The Comparative Toxicogenomics Database
CTD Bio2RDFizer script is available on GitHub
Using and contributing to the
Bio2RDF project on GitHub
1. Fork the bio2rdf-scripts and php-lib repositories on GitHub
https://help.github.com/articles/fork-a-repo
2. Write some code!
3. Commit code to your fork
4. Make a pull request to the bio2rdf-scripts repo

Best practices for generating Bio2RDF linked data

  • 1. Best practices for generating linked data Tutorial @ ICBO 2013
  • 3. Bio2RDF Best Practices 1. Assign a URI for all things 2. Assign labels and identifiers 3. Declare and assign types 4. Provide dataset provenance
  • 4. 1. Assign URIs for all things ● The base Bio2RDF URI pattern: http://bio2rdf.org/namespace:identifier ● Data provider record identifiers are maintained from source ● Linked Data = no blank nodes!
  • 5. 1. Assign URIs for all things ● Data provider records are maintained from source ○ e.g. DrugBank’s resource IRI for Leucovorin http://bio2rdf.org/drugbank:DB00650
  • 6. 1. Assign URIs for all things ● Vocabulary namespaces are used for dataset specific types and predicates http://bio2rdf.org/drugbank_vocabulary:Drug ● Resource namespaces are used to assign an identifier when one isn't a provided by the source - unique identifier with UUID, hash, counter, concatenated strings, etc http://bio2rdf.org/drugbank_resource:DB00440_DB00650
  • 7. 1. Assign URIs for all things ● All valid namespaces are listed in the Bio2RDF Life Sciences Registry ○ ensures that URIs are consistent across all Bio2RDF datasets ○ registry is publicly available at http://tinyurl.com/dataregistry
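The three URI patterns from slides 4 to 7 can be sketched as string templates. This is a minimal Python illustration, not part of the Bio2RDF API: the function names `record_uri`, `vocabulary_uri`, and `resource_uri` are hypothetical, and joining parts with an underscore is only one of the strategies the slides mention (UUID, hash, counter, concatenated strings).

```python
BIO2RDF_BASE = "http://bio2rdf.org/"


def record_uri(namespace: str, identifier: str) -> str:
    """URI for a record whose identifier comes from the data provider."""
    return f"{BIO2RDF_BASE}{namespace}:{identifier}"


def vocabulary_uri(namespace: str, term: str) -> str:
    """URI for a dataset-specific type or predicate."""
    return f"{BIO2RDF_BASE}{namespace}_vocabulary:{term}"


def resource_uri(namespace: str, *parts: str) -> str:
    """URI for a resource the source does not identify.

    Assumption: the identifier is built by concatenating parts with '_'.
    """
    return f"{BIO2RDF_BASE}{namespace}_resource:{'_'.join(parts)}"


print(record_uri("drugbank", "DB00650"))
print(vocabulary_uri("drugbank", "Drug"))
print(resource_uri("drugbank", "DB00440", "DB00650"))
```

The three calls reproduce the example URIs shown on the slides. Namespace validity is not checked here; in practice the namespace would be looked up in the registry first.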
  • 8. 2. Assign labels and identifiers ● Use rdfs:label to assign a language-tagged label to every resource ○ can be a source-provided title, a script-generated phrase, or a phrase provided in a third-party dataset ○ Pattern: rdfs:label "label [ns:id]"@lang ● Use Dublin Core predicates for source-provided labels and identifiers ○ Pattern: dc:title "label"@lang (assign a language tag only when one is provided) ○ Pattern: dc:identifier "ns:id"^^xsd:string
  • 9. 2. Assign labels and identifiers ● Use Bio2RDF predicates to assign Bio2RDF namespace and Bio2RDF identifiers: ○ Pattern: bio2rdf_vocabulary:namespace "ns"^^xsd:string ○ Pattern: bio2rdf_vocabulary:identifier "id"^^xsd:string
  • 10. 2. Assign labels and identifiers Example: DrugBank entry for Nitrazepam: drugbank:DB0159 rdfs:label "Nitrazepam [drugbank:DB0159]"@en ; dc:title "Nitrazepam"@en ; dc:identifier "drugbank:DB0159"^^xsd:string ; bio2rdf_vocabulary:namespace "drugbank"^^xsd:string ; bio2rdf_vocabulary:identifier "DB0159"^^xsd:string .
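The label and identifier patterns from slides 8 to 10 can be generated mechanically from a namespace, an identifier, and a title. The sketch below emits prefixed Turtle statements as plain strings; `label_triples` is a hypothetical helper name, and it assumes the prefixes (`rdfs:`, `dc:`, `xsd:`, `bio2rdf_vocabulary:`) are declared elsewhere in the output file.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def label_triples(ns: str, ident: str, title: str, lang: str = "en") -> list[str]:
    """Return the five label/identifier statements for one record."""
    subject = f"{ns}:{ident}"
    safe_title = escape_literal(title)
    return [
        f'{subject} rdfs:label "{safe_title} [{ns}:{ident}]"@{lang} .',
        f'{subject} dc:title "{safe_title}"@{lang} .',
        f'{subject} dc:identifier "{ns}:{ident}"^^xsd:string .',
        f'{subject} bio2rdf_vocabulary:namespace "{ns}"^^xsd:string .',
        f'{subject} bio2rdf_vocabulary:identifier "{ident}"^^xsd:string .',
    ]


# Identifier taken verbatim from the slide example.
for statement in label_triples("drugbank", "DB0159", "Nitrazepam"):
    print(statement)
```

Each statement is written in full rather than with the `;` shorthand used on the slide, which keeps the output valid when lines are filtered or reordered.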
  • 11. 3. Declare and assign types ● All resources should be typed as being resources of the dataset ○ Pattern: rdf:type namespace_vocabulary:Resource ● Instances of a dataset vocabulary type should also be typed as owl:NamedIndividual ○ Pattern: rdf:type namespace_vocabulary:Type ○ Pattern: rdf:type owl:NamedIndividual ● Classes should be typed as owl:Class ○ Pattern: rdf:type owl:Class ○ If the superclass has been described using the namespace_vocabulary pattern, then link the class to it using rdfs:subClassOf
  • 12. 3. Declare and assign types ● Object properties and datatype properties should also be typed ○ Pattern: rdf:type owl:ObjectProperty ○ Pattern: rdf:type owl:DatatypeProperty ● Examples: drugbank:DB0159 rdf:type drugbank_vocabulary:Resource ; rdf:type owl:Class ; rdfs:subClassOf drugbank_vocabulary:Drug . drugbank_vocabulary:ddi-interactor-in rdf:type owl:ObjectProperty .
  • 13. 4. Provide dataset provenance [Diagram: a data item is linked to its Bio2RDF dataset by void:inDataset; the Bio2RDF dataset is linked to the source dataset by prov:wasDerivedFrom] ● Features: entity-dataset link, creator, publisher, date created, license and rights, source, availability (SPARQL endpoint, data dump) ● Vocabularies: VoID, Dublin Core, W3C Provenance, Bio2RDF vocabulary
  • 14. 4. Provide dataset provenance ● Link every resource to the versioned/dated Bio2RDF dataset in which it is described ○ Pattern: void:inDataset <http://bio2rdf.org/dataset:namespace-dd-mm-yyyy.rdf> ○ Example: drugbank:DB0159 void:inDataset <http://bio2rdf.org/dataset:drugbank-03-07-2013> .
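The dated dataset URI from slide 14 can be derived from the namespace and the conversion date. This sketch assumes the dd-mm-yyyy order stated in the pattern; `dataset_uri` and `in_dataset_triple` are hypothetical names. The slide's pattern ends in `.rdf` while its example does not, so the suffix is left as an optional parameter rather than guessed.

```python
from datetime import date


def dataset_uri(namespace: str, created: date, suffix: str = "") -> str:
    """Versioned dataset URI in the form namespace-dd-mm-yyyy."""
    stamp = created.strftime("%d-%m-%Y")
    return f"http://bio2rdf.org/dataset:{namespace}-{stamp}{suffix}"


def in_dataset_triple(namespace: str, ident: str, created: date) -> str:
    """Link one resource to the dataset version that describes it."""
    return f"{namespace}:{ident} void:inDataset <{dataset_uri(namespace, created)}> ."


print(in_dataset_triple("drugbank", "DB0159", date(2013, 7, 3)))
```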
  • 15. A crash course in PHP
  • 16. PHP: Hypertext Preprocessor ● A general-purpose open source scripting language ○ homepage: http://php.net ● PHP scripts can be executed from the command line or embedded in HTML documents ● Syntactically similar to C/C++/Java, but it is not strongly typed
  • 17. A hello world PHP script ● All PHP scripts are surrounded by the <?php and ?> tags
  • 19. Using the Bio2RDF PHP API to create an RDFizer ● Basic structure of a Bio2RDFizer script: ○ Initialize script parameters - input file(s), default dataset namespace, etc. ○ Define a Run() function that handles downloading and iterating over input files, as well as function calls to parse and convert input data to RDF ○ Define function(s) to convert input data to RDF using Bio2RDF API helper functions
  • 20. Using the Bio2RDF PHP API to create an RDFizer ● Bio2RDF PHP API defines helper functions that implement Bio2RDF best practices: ○ getNamespace() ○ getVoc() ○ getRes() ○ triplify($subject, $predicate, $object) // object is an RDF resource ○ triplifyString($subject, $predicate, "string") // object is a literal ○ describeIndividual($uri, $label, $type, $title, $description, $language) ○ describeClass( ... ) ○ describeProperty( ... )
  • 21. Example: The Comparative Toxicogenomics Database CTD Bio2RDFizer script is available on GitHub
  • 22. Using and contributing to the Bio2RDF project on GitHub
  • 23. Using and contributing to the Bio2RDF project on GitHub 1. Fork the bio2rdf-scripts and php-lib repositories on GitHub https://help.github.com/articles/fork-a-repo 2. Write some code! 3. Commit code to your fork 4. Make a pull request to the bio2rdf-scripts repo