Detection of Contextual Identity Links
in a Knowledge Base
Joe Raad, Nathalie Pernelle, Fatiha Saïs
firstname.lastname@lri.fr
LRI, Paris-Sud University
Orsay, France
15-Dec-17 Detecting Contextual Identity Links 2 / 33
Identity in the Semantic Web
WHY ?
[Figure: Dataset 1 describes "Harry Potter and the Chamber of Secrets" (language: English, pages: 300, author: J.K. Rowling); Dataset 2 describes "Harry Potter et la Chambre des Secrets" (language: French, pages: 350, author: J.K. Rowling). The two resources are linked as identical in the Linked Open Data.]
• Integrate Information from Different Sources
• Enrich “Dataset 2”
Identity in the Semantic Web
HOW ?
[Figure: the two books of Dataset 1 and Dataset 2 are linked with owl:sameAs. Each resource then carries both sets of values (language: English and French, pages: 300 and 350).]
• Unwanted Inferences
• Possible Inconsistencies
≈ 558 million owl:sameAs statements (LOD stat 2015)
Identity in the Semantic Web
SOLUTION ?
[Figure: the same two books of Dataset 1 and Dataset 2, now linked by a contextual identity link labelled "same art work" instead of owl:sameAs.]
Contextual Identity
Contextual Identity
[Figure: two lemonades, lem1 and lem2, compared in four contexts]
• Context (Price): different lemonades
• Context (Calories): different lemonades
• Context (pH): identical lemonades
• Context (Taste): identical lemonades
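The lemonade example can be read as comparing two descriptions on a chosen subset of properties. The sketch below is only an illustration of that reading: the property values are invented so that lem1 and lem2 agree on pH and taste and differ on price and calories, and `identical_in_context` is a hypothetical helper name.

```python
# Hypothetical descriptions: the values are made up for illustration only.
lem1 = {"price": 1.20, "calories": 110, "pH": 2.6, "taste": "sour"}
lem2 = {"price": 1.50, "calories": 95, "pH": 2.6, "taste": "sour"}


def identical_in_context(a, b, context):
    """Return True if a and b agree on every property kept by the context."""
    return all(a.get(p) == b.get(p) for p in context)


for context in (["price"], ["calories"], ["pH"], ["taste"]):
    print(context, identical_in_context(lem1, lem2, context))
```

A context here is just a list of property names; the global contexts defined in the talk are richer, since they also involve classes and domain/range axioms.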
Contextual Identity – State of the Art
1. skos:exactMatch (Miles et al., 2009): indicates a high degree of confidence that the concepts can be used interchangeably across a wide range of applications
• Undefined contexts in which this identity holds
• Can only be used to link SKOS concepts
2. The Similarity Ontology (Halpin et al., 2010): presents a hierarchy of 13 predicates (8 new). Each predicate is characterized by its reflexivity, transitivity, and symmetry
• Undefined contexts in which this identity holds
• Difficult to use due to its subjectivity
Contextual Identity – State of the Art
3. Domain Specific Identity Links:
volume(lem1, a1) ∧ volume(lem2, a1) → same_lemonade(lem1, lem2)
• Requires Expert’s Intervention
4. Indiscernibility Relation (Beek et al., 2016): defines identity relations in a context represented by a set of properties. The contexts are organized in a lattice
• A context is a set of properties that does not consider the organization of the classes in the RDF dataset
• Identity is locally defined (it does not propagate in the RDF graph through the object properties)
Objectives
• Introduce a new Contextual Identity Link
• Represent a context using the ontology vocabulary
• Propose an approach capable of detecting all the
contexts in which two ontology instances are identical
• Benefit from the experts’ knowledge (if available)
Outline
1. Introduction
2. Contextual Identity
3. Detection of Contextual Identity
4. Experiments
Contextual Identity
In which context is “drug1” considered identical to “drug2”?
Contextual Identity
1) In a context where we discard the properties “name” and “hasValue”
2) In a context where we don’t consider the property “name” and the Weight of the Lactose (hasWeight)
Contextual Identity
What is a (Global) Context ?
Given an ontology O = ( C, DP, OP, A ) with
• C = set of classes
• DP = set of owl:DatatypeProperty
• OP = set of owl:ObjectProperty
• A = set of Axioms (e.g. domains and ranges, subsumption)
A Global Context is a sub-ontology GCu = ( Cu , DPu , OPu , Au ) with
• Cu ⊆ DepC ⊆ C
• DPu ⊆ DP
• OPu ⊆ OP
• Au = domain and range axioms more specific than those described in A
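The definition of a global context can be sketched as a small data structure with a check of the three inclusion constraints. This is a minimal sketch under stated assumptions: the class names `Ontology` and `GlobalContext`, the function `is_valid_context`, and the extra class "Manufacturer" are hypothetical; the check on Au (axioms more specific than those in A) is not implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    classes: frozenset               # C
    datatype_properties: frozenset   # DP
    object_properties: frozenset     # OP


@dataclass
class GlobalContext:
    classes: frozenset               # Cu
    datatype_properties: frozenset   # DPu
    object_properties: frozenset     # OPu
    # property -> (domain classes, range classes); stands in for Au
    axioms: dict = field(default_factory=dict)


def is_valid_context(gc, onto, dep_c):
    """Check Cu ⊆ DepC ⊆ C, DPu ⊆ DP and OPu ⊆ OP (Au is not checked)."""
    return (
        gc.classes <= dep_c <= onto.classes
        and gc.datatype_properties <= onto.datatype_properties
        and gc.object_properties <= onto.object_properties
    )


onto = Ontology(
    classes=frozenset({"Drug", "Paracetamol", "Lactose", "Weight", "Manufacturer"}),
    datatype_properties=frozenset({"name", "hasValue"}),
    object_properties=frozenset({"isComposedOf", "hasWeight"}),
)
dep_c = frozenset({"Drug", "Paracetamol", "Lactose", "Weight"})
gc = GlobalContext(
    classes=frozenset({"Drug", "Lactose"}),
    datatype_properties=frozenset(),
    object_properties=frozenset({"isComposedOf"}),
    axioms={"isComposedOf": ({"Drug"}, {"Lactose"})},
)
print(is_valid_context(gc, onto, dep_c))
```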
Contextual Identity
Example of a Global Context
Contextual Identity
Order Relation between Global Contexts
GCu = ( Cu , DPu , OPu , Au ) and GCv = ( Cv , DPv , OPv , Av )
GCu ≤ GCv if :
• Cv ⊆ Cu
• DPv ⊆ DPu
• OPv ⊆ OPu
• ∀ op ∈ OPv : domainv(op) ⊑ domainu(op) and rangev(op) ⊑ rangeu(op)
• ∀ dp ∈ DPv : domainv(dp) ⊑ domainu(dp) and rangev(dp) = rangeu(dp)
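The five conditions of the order relation can be checked mechanically. The sketch below makes several assumptions that are not in the slides: a context is a plain dictionary, a domain or range is a set of classes read as a union, subsumption is looked up in a hypothetical `parents` map (class to direct superclasses), and the two sample contexts are illustrative.

```python
def subsumed(sub, sup, parents):
    """True if class `sub` equals `sup` or is a transitive subclass of it."""
    seen, stack = set(), [sub]
    while stack:
        c = stack.pop()
        if c == sup:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return False


def all_subsumed(subs, sups, parents):
    """Every class of `subs` is subsumed by some class of `sups`."""
    return all(any(subsumed(c, d, parents) for d in sups) for c in subs)


def more_specific(gcu, gcv, parents):
    """GCu ≤ GCv, following the five conditions listed on the slide."""
    if not (gcv["C"] <= gcu["C"] and gcv["DP"] <= gcu["DP"] and gcv["OP"] <= gcu["OP"]):
        return False
    for op in gcv["OP"]:
        if not all_subsumed(gcv["domain"][op], gcu["domain"][op], parents):
            return False
        if not all_subsumed(gcv["range"][op], gcu["range"][op], parents):
            return False
    for dp in gcv["DP"]:
        if not all_subsumed(gcv["domain"][dp], gcu["domain"][dp], parents):
            return False
        if gcv["range"][dp] != gcu["range"][dp]:
            return False
    return True


gc1 = {"C": {"Drug", "Lactose", "Paracetamol"}, "DP": set(), "OP": {"isComposedOf"},
       "domain": {"isComposedOf": {"Drug"}},
       "range": {"isComposedOf": {"Lactose", "Paracetamol"}}}
gc2 = {"C": {"Drug", "Lactose"}, "DP": set(), "OP": {"isComposedOf"},
       "domain": {"isComposedOf": {"Drug"}},
       "range": {"isComposedOf": {"Lactose"}}}

print(more_specific(gc1, gc2, parents={}))  # True
print(more_specific(gc2, gc1, parents={}))  # False
```

With these sample contexts, gc1 keeps everything gc2 keeps plus Paracetamol, so gc1 is the more specific of the two.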
Contextual Identity
Order Relation between Global Contexts
[Figure: two example global contexts related by ≤ (more specific than)]
Contextual Identity
Under which conditions are two individuals contextually identical ?
Short Answer
If their contextual descriptions are isomorphic up to a renaming of the instance URIs
What is an instance’s contextual description ?
Contextual Identity
Contextual Instance Description According to a Global Context
Gdrug1 of drug1 in GC1
Contextual Identity
Contextual Instance Description According to a Global Context
Gdrug1 of drug1 in GC2
Identity in a Global Context
Gdrug1 of drug1 in GC1
Gdrug2 of drug2 in GC1
Isomorphic up to a renaming of the instance URIs, hence identiConTo<GC1> (drug1, drug2)
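One way to test "isomorphic up to a renaming of the instance URIs" is to compute a canonical form of each contextual description in which URIs are replaced by their class and their own description. The sketch below is not the authors' implementation: it assumes tree-shaped, acyclic descriptions, the data values (names, weights) are invented, and `description` and the dictionary layout of a context are hypothetical.

```python
def description(node, triples, types, context):
    """Canonical form of the contextual description of `node`.

    Instance URIs are dropped and replaced by their class and their own
    description, so equal results mean the two descriptions match up to
    a renaming of the URIs (tree-shaped data assumed).
    """
    items = []
    for prop, value in triples.get(node, ()):
        if prop in context["DP"]:
            items.append((prop, "literal", value))
        elif (prop in context["OP"]
              and types[node] in context["domain"][prop]
              and types.get(value) in context["range"][prop]):
            items.append((prop, types[value],
                          description(value, triples, types, context)))
    return tuple(sorted(items, key=repr))


triples = {
    "drug1": [("name", "Drug A"), ("isComposedOf", "para1"), ("isComposedOf", "lac1")],
    "para1": [("hasWeight", "w1")], "lac1": [("hasWeight", "w2")],
    "w1": [("hasValue", 500)], "w2": [("hasValue", 50)],
    "drug2": [("name", "Drug B"), ("isComposedOf", "para2"), ("isComposedOf", "lac2")],
    "para2": [("hasWeight", "w3")], "lac2": [("hasWeight", "w4")],
    "w3": [("hasValue", 500)], "w4": [("hasValue", 80)],
}
types = {"drug1": "Drug", "drug2": "Drug", "para1": "Paracetamol",
         "para2": "Paracetamol", "lac1": "Lactose", "lac2": "Lactose",
         "w1": "Weight", "w2": "Weight", "w3": "Weight", "w4": "Weight"}

# Context 1: "name" and "hasValue" are discarded.
gc1 = {"DP": set(), "OP": {"isComposedOf", "hasWeight"},
       "domain": {"isComposedOf": {"Drug"}, "hasWeight": {"Paracetamol", "Lactose"}},
       "range": {"isComposedOf": {"Paracetamol", "Lactose"}, "hasWeight": {"Weight"}}}
# Context 2: "name" and the weight of the Lactose are not considered.
gc2 = {"DP": {"hasValue"}, "OP": {"isComposedOf", "hasWeight"},
       "domain": {"isComposedOf": {"Drug"}, "hasWeight": {"Paracetamol"}},
       "range": {"isComposedOf": {"Paracetamol", "Lactose"}, "hasWeight": {"Weight"}}}

for gc in (gc1, gc2):
    print(description("drug1", triples, types, gc) == description("drug2", triples, types, gc))
```

Sorting by `repr` is only a device to get a deterministic order over items of mixed types; a general graph isomorphism test would be needed if descriptions could contain cycles or shared nodes.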
Identity in a Global Context
Global Contexts and Identity Relations are represented in Named Graphs
<#GC1> <#moreSpecificThan> <#GC2> .
<#GC1> {
    <#isComposedOf> rdfs:domain <#Drug> .
    <#isComposedOf> rdfs:range <#Lactose> .
    <#isComposedOf> rdfs:range <#Paracetamol> .
    …
    <#drug1> <#identiConTo> <#drug2> .
}
<#GC2> {
    <#isComposedOf> rdfs:domain <#Drug> .
    <#isComposedOf> rdfs:range <#Lactose> .
}
• <#identiConTo> is only specified for the most specific global context(s)
• More general contexts can be inferred using the order relations between
global contexts
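The inference of links in more general contexts amounts to following the moreSpecificThan relation upwards. The sketch below assumes in-memory dictionaries rather than an RDF store; `more_general_contexts` and `inferred_links` are hypothetical helper names, and the data mirrors only the triples visible in the named-graph example.

```python
# moreSpecificThan edges and the links stated in the most specific context.
more_specific_than = {"GC1": {"GC2"}}
stated_links = {"GC1": {("drug1", "drug2")}}


def more_general_contexts(gc, order):
    """All contexts reachable from `gc` through moreSpecificThan."""
    found, stack = set(), [gc]
    while stack:
        for parent in order.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inferred_links(stated, order):
    """Copy each stated identiConTo pair to every more general context."""
    links = {gc: set(pairs) for gc, pairs in stated.items()}
    for gc, pairs in stated.items():
        for general in more_general_contexts(gc, order):
            links.setdefault(general, set()).update(pairs)
    return links


print(inferred_links(stated_links, more_specific_than))
```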
Detection of Contextual Identity
How can we automatically detect and add these contextual identity links in a knowledge base ?
DECIDE: DEtection of Contextual IDEntity
Inputs: Knowledge Base, Target Class, Unwanted Properties, Necessary Properties, Co-occurring Properties
• Unwanted Properties: p has unstructured values (free text) or insignificant variations; up = (ci , p, *)
• Necessary Properties: p should exist in every context in order to consider it as relevant; np = (ci , p, *)
• Co-occurring Properties: p1 does not have a meaning without p2 (e.g. measure and unit); cp = { (ci , p1, *), (ci , p2, *) }
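The three kinds of expert constraints can be read as a filter on the (class, property) pairs a candidate context keeps. The slides do not say how DECIDE applies them internally, so the sketch below is only one possible reading; `satisfies_constraints` and the property "hasUnit" are hypothetical.

```python
def satisfies_constraints(kept, unwanted, necessary, co_occurring):
    """`kept` is the set of (class, property) pairs retained by a context."""
    if kept & unwanted:              # no unwanted property may be kept
        return False
    if not necessary <= kept:        # every necessary property must be kept
        return False
    # co-occurring properties are kept together or dropped together
    return all(group <= kept or not (group & kept) for group in co_occurring)


unwanted = {("Drug", "name")}
necessary = {("Drug", "isComposedOf")}
co_occurring = [{("Weight", "hasValue"), ("Weight", "hasUnit")}]

candidate = {("Drug", "isComposedOf"), ("Weight", "hasValue"), ("Weight", "hasUnit")}
print(satisfies_constraints(candidate, unwanted, necessary, co_occurring))
```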
Detection of Contextual Identity
Output of DECIDE: for each pair of individuals (i1, i2) of the target class, the set of the most specific global contexts in which (i1, i2) are identical
• Step 1: Choosing the Level of Abstraction by selecting the set of Classes
DepC = { Drug, Paracetamol, Lactose, Weight }
• Step 2: For each pair of the target class, construct the identity graph(s)
• Depth First Construction
• Several Identity Graphs in case of multi valued properties
(different pair mappings)
• Each node contains two local contexts
• Step 3: For each constructed identity graph, generate the most specific GC
1) In a context where we discard the properties “name” and “hasValue”
2) In a context where we don’t consider the property “name” and the Weight of the Lactose
Experiments
Tested DECIDE on 2 scientific datasets
(Ibanescu et al., 2016)
Experiments
                                       CellExtraDry   Carredas
# Instances (type:Mixture)             210            619
# Possible Pairs                       21 945         191 271
# Dependent Classes (Total Classes)    191 (208)      488 (555)
# Graph Nodes per pair                 11             7
# Identity Links                       33 092         239 410
# Identity Links per pair              1.41           1.25
# Different Global Contexts            28             233
Execution Time (approx. minutes)       2              26
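The "# Possible Pairs" row follows from the number of instances: n instances give n(n-1)/2 unordered pairs. A short check, using only the instance counts from the table:

```python
from math import comb

instances = {"CellExtraDry": 210, "Carredas": 619}

for name, n in instances.items():
    # Number of unordered pairs of distinct instances: n * (n - 1) / 2
    print(name, comb(n, 2))
# CellExtraDry 21945
# Carredas 191271
```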
Experiments
Use of Contextual Identity Links for Prediction
We want to detect, for each context GCi, the measures mi where
identiConTo<GCi>(i1, i2) ∧ observes(i1, m1) → observes(i2, m2)
with m1 ≃ m2
identiConTo<GCi>(i1, i2) → same(mi)
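A rule of this form can be scored by counting how often two contextually identical instances have close values for a measure. The slides do not define support, error rate, or the ≃ test, so everything in the sketch below is an assumption: support is the number of linked pairs where both instances have the measure, error rate is the share of those pairs whose values differ by more than a relative tolerance, and the instance names and values are invented.

```python
def evaluate_rule(pairs, measures, tolerance=0.05):
    """Return (support, error_rate) for one context and one measure.

    pairs: (i1, i2) tuples linked by identiConTo in the context.
    measures: instance -> observed value of the measure.
    """
    comparable = [(a, b) for a, b in pairs if a in measures and b in measures]
    if not comparable:
        return 0, None
    errors = sum(
        1 for a, b in comparable
        if abs(measures[a] - measures[b])
        > tolerance * max(abs(measures[a]), abs(measures[b]))
    )
    return len(comparable), errors / len(comparable)


pairs = [("mix1", "mix2"), ("mix1", "mix3"), ("mix2", "mix4")]
measures = {"mix1": 10.0, "mix2": 10.2, "mix3": 12.0}
print(evaluate_rule(pairs, measures))  # (2, 0.5)
```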
Experiments
Use of Contextual Identity Links for Prediction
≈ 3700 rules were generated
Domain experts have evaluated the plausibility of the 20 best rules (in terms of error rate and support combined)
Scale: Strongly Disagree | Disagree | Not sure | Agree | Strongly Agree
plausibility
1 6 4 90
The error rate of a rule decreases by 22% in CellExtraDry and by 33.5% in Carredas when a global context is replaced by a more specific global context
Conclusion
• The use of genuine identity links is rarely required in scientific datasets
• Asking domain experts to specify the contexts in which two objects are considered identical is not intuitive → asking for constraints is easier
• Proposition of a new contextual identity link (identiConTo)
- Transitive, Symmetric, Reflexive
- Based on the notion of global contexts
• Proposition of an algorithm for detecting the most specific global context(s) in which a pair of instances of a target class are identical (DECIDE)
• Contextual Identity Links can be used for prediction tasks

Detection of Contextual Identity Links in a Knowledge Base

  • 1. Detection of Contextual Identity Links in a Knowledge Base Joe Raad, Nathalie Pernelle, Fatiha Saïs firstname.lastname@lri.fr LRI, Paris-Sud University Orsay, France
  • 2. 15-Dec-17 Detecting Contextual Identity Links 2 / 33 Identity in the Semantic Web WHY ? Harry Potter and the Chamber of Secrets Harry Potter et la Chambre des Secrets 300 English Dataset 1 Dataset 2 J.K. Rowling author pages 350 pages French J.K. Rowling authoridentical  Integrate Information from Different Sources  Enrich “Dataset 2” Linked Open Data
  • 3. 15-Dec-17 Detecting Contextual Identity Links 3 / 33 Identity in the Semantic Web 300 English Dataset 1 Dataset 2 J.K. Rowling author pages 350 pages French J.K. Rowling authorowl:sameAs Linked Open Data English Harry Potter et la Chambre des Secrets 300 pages 350 Harry Potter and the Chamber of Secrets pages French  Unwanted Inferences  Possible Inconsistencies HOW ? ≈ 558 million owl:sameAs statements (LOD stat 2015)
  • 4. 15-Dec-17 Detecting Contextual Identity Links 4 / 33 Identity in the Semantic Web SOLUTION ? Harry Potter and the Chamber of Secrets Harry Potter et la Chambre des Secrets 300 English Dataset 1 Dataset 2 J.K. Rowling author pages 350 pages French J.K. Rowling author same art work Contextual Identity Linked Open Data
  • 5. 15-Dec-17 Detecting Contextual Identity Links 5 / 33 Contextual Identity Context (Price) Different Lemonades Context (Calories) Different Lemonades Context (pH) Identical Lemonades Context (Taste) Identical Lemonades lem1 lem2
  • 6. 15-Dec-17 Detecting Contextual Identity Links 6 / 33 Contextual Identity – State of the Art 1. skos:exactMatch : indicates a high degree of confidence that the concepts can be used interchangeably across a wide range of applications • Undefined contexts in which this identity holds (Miles et al., 2009) • Can only be used to link skos concepts 2. The Similarity Ontology : presents a hierarchy of 13 predicates (8 new) Each predicate is characterized by the reflexivity, transitivity, and symmetric properties • Undefined contexts in which this identity holds (Halpin et al. , 2010) • Difficult to use due to its subjectivity
  • 7. 15-Dec-17 Detecting Contextual Identity Links 7 / 33 Contextual Identity – State of the Art 3. Domain Specific Identity Links: volume(lem1, a1) Ʌ volume(lem2, a1) same_lemonade(lem1, lem2) • Requires Expert’s Intervention 4. Indiscernibility Relation: defines identity relations in a context represented by a set of properties The contexts are hierarchized in a lattice • A context is a set of properties that does not consider the classes’ organization in the RDF dataset (Beek et al. , 2016) • Identity is locally defined (does not propagates in the RDF graph using the object properties)
  • 8. 15-Dec-17 Detecting Contextual Identity Links 8 / 33 Objectives • Introduce a new Contextual Identity Link • Represent a context using the ontology vocabulary • Propose an approach capable of detecting all the contexts in which two ontology instances are identical • Benefit from the experts’ knowledge (if available)
  • 9. 15-Dec-17 Detecting Contextual Identity Links 9 / 33 Outline 1. Introduction 2. Contextual Identity 3. Detection of Contextual Identity 4. Experiments
  • 10. 15-Dec-17 Detecting Contextual Identity Links 10 / 33 Contextual Identity In which Context “drug1” is considered as identical to “drug2”?
  • 11. 15-Dec-17 Detecting Contextual Identity Links 11 / 33 Contextual Identity 1) In a context where we discard the property “name” and “hasValue”
  • 12. 15-Dec-17 Detecting Contextual Identity Links 12 / 33 Contextual Identity hasWeight 2) In a context where we don’t consider the property “name” and the Weight of the Lactose
  • 13. 15-Dec-17 Detecting Contextual Identity Links 13 / 33 Contextual Identity Given an ontology O = ( C, DP, OP, A ) with • C = set of classes • DP = set of owl:DataTypeProperty • OP = set of owl:ObjectProperty • A = set of Axioms (e.g. domains and ranges, subsumption) A Global Context is a sub ontology GCu = ( Cu , DPu , OPu , Au ) with • Cu ⊆ DepC ⊆ C • DPu ⊆ DP • Opu ⊆ OP • Au = domain and range axioms more specific than those described in A What is a (Global) Context ?
  • 14. Contextual Identity Example of a Global Context
  • 15. Contextual Identity Order Relation between Global Contexts GCu = (Cu, DPu, OPu, Au) and GCv = (Cv, DPv, OPv, Av) GCu ≤ GCv if: • Cv ⊆ Cu • DPv ⊆ DPu • OPv ⊆ OPu • ∀ op ∈ OPv : domainv(op) ⊑ domainu(op) and rangev(op) ⊑ rangeu(op) • ∀ dp ∈ DPv : domainv(dp) ⊑ domainu(dp) and rangev(dp) = rangeu(dp)
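The five conditions of the order relation can be sketched as a small check. This is only an illustration under stated assumptions: the `GlobalContext` class, the helper names, and the `parents` map (class to direct superclasses) are hypothetical, and domains and ranges are modelled as sets of classes read as a union, so that "⊑" holds when every class on the left is subsumed by some class on the right. The class sets of the two example contexts are likewise assumed; only their `isComposedOf` axioms follow the named-graph example used in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalContext:
    # Hypothetical container for GC = (C, DP, OP, A); A is split into
    # per-property domain and range maps (property -> set of classes).
    classes: set
    data_props: set
    object_props: set
    domain: dict = field(default_factory=dict)
    range: dict = field(default_factory=dict)


def is_subclass(c, d, parents):
    """True if class c equals d or is a (transitive) subclass of d."""
    if c == d:
        return True
    return any(is_subclass(p, d, parents) for p in parents.get(c, ()))


def subsumed(cs, ds, parents):
    """Assumed union reading: each class in cs is subsumed by some class in ds."""
    return all(any(is_subclass(c, d, parents) for d in ds) for c in cs)


def more_specific_than(u, v, parents):
    """Check GCu <= GCv using the five conditions listed on the slide."""
    if not (v.classes <= u.classes
            and v.data_props <= u.data_props
            and v.object_props <= u.object_props):
        return False
    for op in v.object_props:
        if not (subsumed(v.domain[op], u.domain[op], parents)
                and subsumed(v.range[op], u.range[op], parents)):
            return False
    for dp in v.data_props:
        if not (subsumed(v.domain[dp], u.domain[dp], parents)
                and v.range[dp] == u.range[dp]):
            return False
    return True


# Illustrative contexts: GC1 covers Lactose and Paracetamol, GC2 only Lactose.
gc1 = GlobalContext({"Drug", "Lactose", "Paracetamol"}, set(), {"isComposedOf"},
                    domain={"isComposedOf": {"Drug"}},
                    range={"isComposedOf": {"Lactose", "Paracetamol"}})
gc2 = GlobalContext({"Drug", "Lactose"}, set(), {"isComposedOf"},
                    domain={"isComposedOf": {"Drug"}},
                    range={"isComposedOf": {"Lactose"}})
parents = {}  # no subclass axioms are needed for this small example

result_12 = more_specific_than(gc1, gc2, parents)  # GC1 <= GC2 ?
result_21 = more_specific_than(gc2, gc1, parents)  # GC2 <= GC1 ?
print(result_12, result_21)
```

With these inputs, GC1 comes out as more specific than GC2 and not the other way round, because GC1 keeps every class, property, and axiom of GC2 and adds Paracetamol.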
  • 16. Contextual Identity Order Relation between Global Contexts ≤ (more specific than)
  • 17. Contextual Identity Under which conditions are two individuals contextually identical? Short answer: if their contextual descriptions are isomorphic up to a renaming of the instances' URIs. What is an instance's contextual description?
  • 18. Contextual Identity Contextual Instance Description According to a Global Context Gdrug1 of drug1 in GC1
  • 19. Contextual Identity Contextual Instance Description According to a Global Context Gdrug1 of drug1 in GC2
  • 20. Identity in a Global Context Gdrug1 of drug1 in GC1 and Gdrug2 of drug2 in GC1 are isomorphic up to a renaming of the instance URIs, hence identiConTo<GC1>(drug1, drug2)
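A minimal sketch of the "isomorphic up to a renaming of the instance URIs" test is given below. It is not the authors' procedure. It assumes that each contextual description has already been restricted to the global context and is tree-shaped (no cycles), so that a canonical form which drops instance URIs and keeps classes, properties, and literal values is enough for the comparison. The function names and all data values are hypothetical.

```python
def canonical(node, graph, types):
    """Canonical form of the description rooted at node, ignoring instance URIs.

    graph: instance URI -> list of (property, object) pairs
    types: instance URI -> class name
    Assumes the description is a tree (no cycles).
    """
    children = []
    for prop, obj in graph.get(node, []):
        if isinstance(obj, str) and obj in types:
            # Object is an instance: recurse and forget its URI.
            children.append((prop, canonical(obj, graph, types)))
        else:
            # Object is a literal value: keep it as is.
            children.append((prop, ("literal", obj)))
    # Sort by repr so the order of triples does not matter.
    return (types[node], tuple(sorted(children, key=repr)))


def identical_in_context(i1, i2, graph, types):
    """True if the two contextual descriptions have the same canonical form."""
    return canonical(i1, graph, types) == canonical(i2, graph, types)


# Hypothetical contextual descriptions of three drugs in one global context.
types = {"drug1": "Drug", "lac1": "Lactose", "w1": "Weight",
         "drug2": "Drug", "lac2": "Lactose", "w2": "Weight",
         "drug3": "Drug", "lac3": "Lactose", "w3": "Weight"}
graph = {
    "drug1": [("isComposedOf", "lac1")],
    "lac1": [("hasWeight", "w1")],
    "w1": [("hasValue", 200), ("hasUnit", "mg")],
    "drug2": [("isComposedOf", "lac2")],
    "lac2": [("hasWeight", "w2")],
    "w2": [("hasUnit", "mg"), ("hasValue", 200)],
    "drug3": [("isComposedOf", "lac3")],
    "lac3": [("hasWeight", "w3")],
    "w3": [("hasValue", 250), ("hasUnit", "mg")],
}

same_12 = identical_in_context("drug1", "drug2", graph, types)
same_13 = identical_in_context("drug1", "drug3", graph, types)
print(same_12, same_13)
```

Descriptions containing cycles or shared nodes would need a genuine graph-isomorphism test; the canonical-form shortcut is chosen here only to keep the example short.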
  • 21. Identity in a Global Context Global Contexts and Identity Relations are represented in Named Graphs <#GC1> <#moreSpecificThan> <#GC2> . <#GC1> { <#isComposedOf> rdfs:domain <#Drug>. <#isComposedOf> rdfs:range <#Lactose>. <#isComposedOf> rdfs:range <#Paracetamol>. … <#drug1> <#identiConTo> <#drug2> } <#GC2> { <#isComposedOf> rdfs:domain <#Drug>. <#isComposedOf> rdfs:range <#Lactose>. } • <#identiConTo> is only specified for the most specific global context(s) • Identity in more general contexts can be inferred using the order relation between global contexts
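The inference mentioned in the last bullet can be sketched as an upward traversal of the order relation: a pair asserted as identical in a most specific context is also identical in every context that is more general. The function name and the two dictionaries below are hypothetical stand-ins for the named graphs and the `<#moreSpecificThan>` triples.

```python
def contexts_where_identical(pair, asserted, more_specific_than):
    """Return every context in which the pair is identical.

    asserted: context -> set of pairs (frozensets) stated with identiConTo
    more_specific_than: context -> set of directly more general contexts
    """
    result = set()
    # Start from the contexts where the link is explicitly stated.
    stack = [ctx for ctx, pairs in asserted.items() if pair in pairs]
    while stack:
        ctx = stack.pop()
        if ctx not in result:
            result.add(ctx)
            # Follow the order relation towards more general contexts.
            stack.extend(more_specific_than.get(ctx, ()))
    return result


pair = frozenset({"drug1", "drug2"})
asserted = {"GC1": {pair}}            # identiConTo stated only in GC1
more_specific_than = {"GC1": {"GC2"}}  # <#GC1> <#moreSpecificThan> <#GC2>

inferred = contexts_where_identical(pair, asserted, more_specific_than)
print(sorted(inferred))
```

Storing the link only at the most specific contexts keeps the named graphs small, at the cost of this traversal whenever identity in a more general context is queried.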
  • 22. Outline 1. Introduction 2. Contextual Identity 3. Detection of Contextual Identity 4. Experiments
  • 23. Detection of Contextual Identity How can we automatically detect and add these contextual identity links in a knowledge base? DECIDE (DEtection of Contextual IDEntity) Inputs: Knowledge Base, Target Class, and expert constraints: • Unwanted Properties: p has unstructured values (free text) or insignificant variations; up = (ci, p, *) • Necessary Properties: p must be present in a context for that context to be considered relevant; np = (ci, p, *) • Co-occurring Properties: p1 has no meaning without p2 (e.g. measure and unit); cp = { (ci, p1, *), (ci, p2, *) }
  • 24. Detection of Contextual Identity How can we automatically detect and add these contextual identity links in a knowledge base? DECIDE (DEtection of Contextual IDEntity) Inputs: Knowledge Base, Target Class, Unwanted Properties, Necessary Properties, Co-occurring Properties Output: for each pair of individuals (i1, i2) of the target class, the set of the most specific global contexts in which (i1, i2) are identical
  • 25. Detection of Contextual Identity DECIDE DEtection of Contextual IDEntity • Step 1: Choosing the Level of Abstraction by selecting the set of Classes DepC = { Drug, Paracetamol, Lactose, Weight }
  • 26. Detection of Contextual Identity DECIDE DEtection of Contextual IDEntity • Step 2: For each pair of instances of the target class, construct the identity graph(s) • Depth-first construction • Several identity graphs in the case of multi-valued properties (different pair mappings) • Each node contains two local contexts
  • 27. Detection of Contextual Identity DECIDE DEtection of Contextual IDEntity • Step 3: For each constructed identity graph, generate the most specific GC 1) In a context where we discard the property “name” and “hasValue” 2) In a context where we don’t consider the property “name” and the Weight of the Lactose
  • 28. Outline 1. Introduction 2. Contextual Identity 3. Detection of Contextual Identity 4. Experiments
  • 29. Experiments Tested DECIDE on 2 scientific datasets (Ibanescu et al., 2016)
  • 30. Experiments (values given as CellExtraDry / Carredas) • # Instances (type:Mixture): 210 / 619 • # Possible pairs: 21 945 / 191 271 • # Dependent classes (total classes): 191 (208) / 488 (555) • # Graph nodes per pair: 11 / 7 • # Identity links: 33 092 / 239 410 • # Identity links per pair: 1.41 / 1.25 • # Different global contexts: 28 / 233 • Execution time (approx. minutes): 2 / 26
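The "possible pairs" figures follow from the instance counts: n instances of the target class give n(n − 1)/2 unordered pairs. A short check, using only the instance counts reported on the slide:

```python
from math import comb

# Number of type:Mixture instances reported for each dataset.
instances = {"CellExtraDry": 210, "Carredas": 619}

# Unordered pairs of distinct instances: n choose 2 = n * (n - 1) / 2.
pairs = {name: comb(n, 2) for name, n in instances.items()}
print(pairs)  # {'CellExtraDry': 21945, 'Carredas': 191271}
```

This also shows why execution time grows quickly with dataset size: the number of pairs to compare is quadratic in the number of instances.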
  • 31. Experiments Use of Contextual Identity Links for Prediction We want to detect, for each context GCi, the measures mi where identiConTo<GCi>(i1, i2) ∧ observes(i1, m1) → observes(i2, m2) with m1 ≃ m2; rule form: identiConTo<GCi>(i1, i2) → same(mi)
  • 32. Experiments Use of Contextual Identity Links for Prediction • ≈ 3700 rules were generated • Domain experts evaluated the plausibility of the 20 best rules (in terms of error rate and support combined) • Plausibility (scale: Strongly Disagree, Disagree, Not sure, Agree, Strongly Agree): 1 6 4 90 • The error rate of a rule decreases by 22% in CellExtraDry and by 33.5% in Carredas when a global context is replaced by a more specific global context
  • 33. Conclusion • The use of genuine identity links is rarely required in scientific datasets • Asking domain experts to specify the contexts in which two objects are considered identical is not intuitive → asking for constraints is easier • Proposal of a new contextual identity link (identiConTo) - Transitive, symmetric, reflexive - Based on the notion of global contexts • Proposal of an algorithm (DECIDE) for detecting the most specific global context(s) in which two instances of a target class are identical • Contextual identity links can be used for prediction tasks