making sense of text and data
Atanas Kiryakov
Webinar, July 2020
Reasoning with Big Knowledge Graphs:
Choices, Pitfalls and Proven Recipes
Who are we?
o Leader
ü Semantic technology vendor established year 2000
ü Part of Sirma Group: 400 persons, listed at Sofia Stock Exchange
o Profitable and growing
ü Global: 80% of revenue from London and New York
ü Clients: S&P, BBC, FT, Top-5 US Bank, UK Parliament, Fujitsu, …
ü Verticals: Financial services, Health care and Life sciences, Publishing, Manufacturing
o Innovator
ü Attracted over $15M in innovation funding
ü Member of W3C, EDMC, ODI, STI and LDBC, developing next gen. standards
…, the market leaders in this space
continue to be Neo4J and Ontotext
(GraphDB), which are graph and RDF
database providers respectively.
These are the longest established
vendors in this space (both founded
in 2000) so they have a longevity and
experience that other suppliers
cannot yet match.
Bloor Research
Graph Database Market Update 2020
Ontotext GraphDB™ - the Flagship Product
Ontotext Portfolio
Presentation Outline
o Reasoning Introduction: Benefits and Pitfalls
o Reasoning Use Cases and Demos
o RDFS and OWL 2 Profiles
o Reasoning Implementation Choices
o Reasoning With GraphDB
Knowledge Graphs = Rich Data in Context
KGs put data in context via
linking and semantic metadata
We help enterprises get profound insights
via interlinking, analyzing and exploring:
o diverse databases
o text documents and other content
o proprietary & global data
What is a Knowledge Graph?
o The KG represents a collection
of interlinked descriptions
of concepts and entities
ü Concepts describe each other
ü Connections provide context
ü Context helps comprehension!
o A KG can be used as:
ü Database: can be queried
ü Graph: can be analyzed as network
ü Knowledge base: new facts can be inferred
Read more: https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/
What is Semantics?
o Formal semantics allows new valid
facts to be inferred
ü Both data and schema can be interpreted
ü Semantic schema = ontology
ü Languages: RDF Schema (RDFS), OWL
o Only the relevant semantics is
formalized in the schema
ü The meaning of relativeOf is not fully described by
defining it as owl:SymmetricProperty
ü The best model is the simplest one that can do the
work. But not simpler!
[Figure: example graph with the instances myData:Maria and myData:Ivan, the classes ptop:Agent, ptop:Person and ptop:Woman, and the properties ptop:childOf, ptop:parentOf and relativeOf; edges are labeled rdf:type, rdfs:range, rdfs:subPropertyOf, owl:inverseOf and owl:SymmetricProperty, and some of them are marked as inferred]
Reasoning Benefits
o Schema alignment and easy querying in diverse datasets
ü Across sources similar relationships can be modeled in a different way - one can use parentOf, another
childOf and a third one just the more general relativeOf
ü The database will return Ivan as a result of the query (Maria relativeOf ?x) when the fact derived from
the source and asserted is (Ivan childOf Maria)
o Getting deeper and more complete results
ü Finding patterns and inferring new relationships
ü Instant discovery of hidden relationships scattered across multiple sources
o Consistency checking and quality validation
ü RDF Shapes ensure graph consistency and quality
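The schema-alignment benefit above can be sketched in a few lines of Python. This is only an illustration, not GraphDB's engine: the schema (childOf as the inverse of parentOf, both sub-properties of a symmetric relativeOf) is assumed from the slide's example, and the ptop: prefix on relativeOf is a guess.

```python
# Hypothetical mini-schema, assumed from the Maria/Ivan example.
INVERSE = {("ptop:childOf", "ptop:parentOf")}
SUB_PROPERTY = {
    ("ptop:childOf", "ptop:relativeOf"),
    ("ptop:parentOf", "ptop:relativeOf"),
}
SYMMETRIC = {"ptop:relativeOf"}


def materialize(explicit):
    """Apply the three rule kinds until no new triple appears (naive fixpoint)."""
    closure = set(explicit)
    while True:
        new = set()
        for s, p, o in closure:
            for a, b in INVERSE:
                if p == a:
                    new.add((o, b, s))
                if p == b:
                    new.add((o, a, s))
            for sub, sup in SUB_PROPERTY:
                if p == sub:
                    new.add((s, sup, o))
            if p in SYMMETRIC:
                new.add((o, p, s))
        if new <= closure:
            return closure
        closure |= new


def query(closure, subject, predicate):
    """Answer the pattern (subject predicate ?x) by plain lookup."""
    return sorted(o for (s, p, o) in closure if s == subject and p == predicate)


# Only (Ivan childOf Maria) is asserted by the source.
closure = materialize({("myData:Ivan", "ptop:childOf", "myData:Maria")})
print(query(closure, "myData:Maria", "ptop:relativeOf"))
```

The query (Maria relativeOf ?x) finds Ivan even though no source ever stated a relativeOf fact.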
The Pitfalls of Reasoning
o Over-engineered ontologies
ü Too expressive ontology language
ü Results of inference hard to understand and verify
ü Performance penalties far greater than the benefits
o Inappropriate reasoning support
ü Inference implementations that work well with taxonomies and conceptual models of a few
thousand concepts, but cannot cope with KGs of millions of entities
o Inappropriate data layer architecture
ü One such example is reasoning with virtual KG, which is often infeasible
Search in British Museum’s Collection
o Artefacts are described via the granular ontology CIDOC CRM
o Searching in such collection requires Fundamental Relations
ü Aggregation of large number of paths through CRM data into a smaller number of searchable relations
o E.g.: FR "Thing from Place"
British Museum’s Collection: Volumetrics
o Museum objects: 2,051,797
ü Thesaurus entries: 415,509
o Explicit statements: 195,208,156
o Total statements: 916,735,486
ü Expansion ratio is 4.7x, i.e., for each explicit statement, 3.7 more are inferred
ü Nodes (unique URLs and literals): 53,803,189
o Loading time (including materialization):
ü 22.2h on RAM drive
ü 32.9h on non-SSD hard drives
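As a quick check of the expansion ratio quoted above, the arithmetic can be reproduced from the two statement counts on the slide. The variable names are only illustrative.

```python
explicit = 195_208_156   # explicit statements (from the slide)
total = 916_735_486      # total statements after materialization (from the slide)

inferred = total - explicit
expansion_ratio = total / explicit
inferred_per_explicit = inferred / explicit

print(f"inferred statements: {inferred:,}")
print(f"expansion ratio: {expansion_ratio:.1f}x")
print(f"inferred per explicit statement: {inferred_per_explicit:.1f}")
```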
GraphDB Benchmarking
o LDBC: TPC-like benchmarks for graph databases
o Members include: Ontotext, OpenLink, neo4j, CWI, UPM, ORACLE,
IBM, *Sparsity
o LDBC Semantic Publishing Benchmark
ü Based on BBC’s Dynamic Semantic Publishing editorial workflow
ü Updates, adding new content metadata or updating the reference knowledge (e.g., new people)
ü Aggregation queries retrieve content according to various criteria (e.g., to generate a topic web page)
ü The only benchmark that involves reasoning and updates
LDBC SPB Results of GraphDB
Clients (reading / writing)   Reads/s    Writes/s
 0 / 1                         0.0000     11.4067
 0 / 2                         0.0000     14.3033
 0 / 4                         0.0000     14.6700
 0 / 8                         0.0000     15.1067
 1 / 0                        17.8258      0.0000
 4 / 0                        43.0833      0.0000
 8 / 0                        70.3767      0.0000
16 / 0                        83.2633      0.0000
 8 / 2                        52.5667      9.2867
 8 / 4                        54.0233      9.6167
 8 / 8                        54.9067      9.5733
10 / 2                        59.9467      8.5333
10 / 4                        62.2867      8.4767
10 / 8                        61.7167      8.6067
16 / 2                        68.8100      5.0600
16 / 4                        70.3900      5.1067
16 / 8                        70.2300      4.9967
16 / 16                       70.9467      5.0567
o CPU: 1 x E5-1650
o RAM: 20G heap
o Dataset: LDBC SPB 256
o DB: GraphDB SE 8.0, RDF statements: 254,948,985 (explicit), 480,405,141 (total),
OWL-Horst-optimized rule set
o Creative works: 8,821,535
FactForge: Data Integration
o DBpedia (the English version) 496M
o GeoNames (all geographic features on Earth) 150M
o owl:sameAs links between DBpedia and Geonames 471K
o GLEI (global company register data) 3M
o Panama Papers DB (#LinkedLeaks) 20M
o Other datasets and ontologies: WordNet, WorldFacts, FIBO
o News metadata (2000 articles/day enriched by NOW) 1 023M
o Total size (2.2B explicit + 328M inferred statements) 2 522M
FIBO: Financial Industry Business Ontology
o Developed by EDMC, https://spec.edmcouncil.org/fibo/
o We loaded FIBO Foundations and BE
ü About 35 RDF files all together (old version)
o Reasoning profile: OWL 2 RL
o Loading takes 2-3 sec.
o Number of explicit statements: 5 696
o Number of total statements, including inferred: 15 713
ü About 10k statements materialized
FIBO-PROTON Mapping
o PROTON is an upper-level ontology
ü 500 classes, 200 properties; developed by Ontotext since 2004
ü used in semantic annotation and LOD integration services, e.g., FactForge
ü mapped to DBpedia, Freebase, GeoNames
o A very basic mapping for public companies and a few related
properties was loaded in 4 hours in FactForge:
fb:business.issuer rdfs:subClassOf pext:PublicCompany.
pext:PublicCompany rdfs:subClassOf fibo-be-corp-corp:PubliclyHeldCompany.
ptop:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization.
dbp-prop:industry rdfs:subPropertyOf pext:industryOf.
pext:industryOf rdfs:subPropertyOf fibo-fnd-rel-rel:isClassifiedBy.
dbp-ont:subsidiary rdfs:subPropertyOf ptop:controls.
ptop:controls rdfs:subPropertyOf fibo-fnd-rel-rel:controls.
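To see what these seven mapping statements buy, the sketch below follows the rdfs:subClassOf and rdfs:subPropertyOf chains upward. It is a simplified illustration: each term is assumed to have a single parent (true for this mapping, not in general) and the hierarchy is assumed to be acyclic.

```python
# The mapping statements from the slide, as child -> parent dictionaries.
SUBCLASS_OF = {
    "fb:business.issuer": "pext:PublicCompany",
    "pext:PublicCompany": "fibo-be-corp-corp:PubliclyHeldCompany",
    "ptop:Organization": "fibo-fnd-org-fm:FormalOrganization",
}
SUBPROPERTY_OF = {
    "dbp-prop:industry": "pext:industryOf",
    "pext:industryOf": "fibo-fnd-rel-rel:isClassifiedBy",
    "dbp-ont:subsidiary": "ptop:controls",
    "ptop:controls": "fibo-fnd-rel-rel:controls",
}


def ancestors(term, edges):
    """Follow child -> parent links until the top of the chain is reached."""
    chain = []
    while term in edges:
        term = edges[term]
        chain.append(term)
    return chain


# Every Freebase business issuer is inferred to be a FIBO publicly held company.
print(ancestors("fb:business.issuer", SUBCLASS_OF))
# Every DBpedia subsidiary link is inferred to be a FIBO controls link.
print(ancestors("dbp-ont:subsidiary", SUBPROPERTY_OF))
```

With these chains materialized, a query written purely in FIBO terms also returns data that was published with the Freebase or DBpedia vocabulary.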
Rule-Based Reasoning
o Description Logic (DL) doesn’t scale
ü Satisfiability checking is not tractable
ü Complexity grows exponentially with size
o Rule-based inference engine
ü R-Entailment rules, PROLOG-style, as defined in [1]
o Sound and complete in PSPACE
ü Under some constraints: do not introduce
blank nodes, bound size of the rule bodies,
ground RDF graph, [1]
[1] Herman J. ter Horst. Combining RDF and Part of OWL with Rules: Semantics, Decidability, Complexity.
International Semantic Web Conference, 2005.
More at: http://graphdb.ontotext.com/documentation/standard/reasoning.html
[Figure: languages placed by complexity, with a DL branch and a Rules/LP branch: RDFS, OWL Lite, OWL DL, OWL Full, OWL Horst, OWL 2 RL, OWL 2 QL, Datalog, SWRL; the expressivity supported by GraphDB is marked]
Forward-Chaining and Materialization
o All possible inferences are made upon update and are stored
ü The inferred statements are stored and indexed along the explicit ones
ü Inferences that are no longer supported upon delete are retracted
o Forward-chaining works, subject to conscious modeling
ü The overheads of the materialization approach are bearable
ü Say, 2x index size and 2x slower loading and updates
ü Marginal (if any) slowdown of queries
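A minimal sketch of materialization on update, assuming a single rule (type propagation along rdfs:subClassOf) and a hypothetical class hierarchy Woman, Person, Agent. The class and method names are made up for illustration and do not reflect GraphDB's API.

```python
class MaterializingStore:
    """Toy store that infers at insert time and answers queries by lookup."""

    def __init__(self, subclass_of):
        self.subclass_of = subclass_of  # hypothetical schema: class -> direct superclasses
        self.explicit = set()
        self.inferred = set()

    def insert(self, triple):
        """Store the statement and eagerly add everything that follows from it."""
        self.explicit.add(triple)
        frontier = [triple]
        while frontier:
            s, p, o = frontier.pop()
            if p != "rdf:type":
                continue
            for parent in self.subclass_of.get(o, ()):
                derived = (s, "rdf:type", parent)
                if derived not in self.explicit and derived not in self.inferred:
                    self.inferred.add(derived)
                    frontier.append(derived)

    def instances_of(self, cls):
        """No reasoning at query time: explicit and inferred are scanned alike."""
        return sorted(
            s for (s, p, o) in self.explicit | self.inferred
            if p == "rdf:type" and o == cls
        )


store = MaterializingStore({"ptop:Woman": ["ptop:Person"], "ptop:Person": ["ptop:Agent"]})
store.insert(("myData:Maria", "rdf:type", "ptop:Woman"))
print(store.instances_of("ptop:Agent"))
print(len(store.explicit), len(store.inferred))
```

The cost shows up as extra stored statements (two inferred for one explicit here), while the query itself stays a simple index lookup.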
Query-time Reasoning and Backward-Chaining
o Perform reasoning at query time
ü No overhead upon data loading and updates
ü Two basic approaches: Backward-chaining and Query rewriting
o Backward-chaining slows down query evaluation dramatically
ü As in PROLOG unification, the engine “dives” recursively in order to exhaust all alternative
ways to find bindings for each separate triple pattern in the query
ü There is no way to guess before the actual evaluation the cardinality of the results for each
triple pattern
ü This makes query plan optimization impossible and ruins query performance
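For contrast, here is a rough sketch of backward-chaining for a single goal of the form (s rdf:type C), assuming the same hypothetical Woman, Person, Agent hierarchy and only the subclass rule. Nothing is stored in advance; the proof search runs inside the query.

```python
# Hypothetical schema: class -> direct superclasses.
SUBCLASS_OF = {"ptop:Woman": ["ptop:Person"], "ptop:Person": ["ptop:Agent"]}
EXPLICIT = {("myData:Maria", "rdf:type", "ptop:Woman")}


def has_type(subject, cls, seen=None):
    """Try to prove (subject rdf:type cls) by recursing into sub-goals."""
    seen = set() if seen is None else seen
    if cls in seen:          # guard against cyclic hierarchies
        return False
    seen.add(cls)
    if (subject, "rdf:type", cls) in EXPLICIT:
        return True
    # Sub-goal: the subject is an instance of some direct subclass of cls.
    return any(
        has_type(subject, sub, seen)
        for sub, supers in SUBCLASS_OF.items()
        if cls in supers
    )


print(has_type("myData:Maria", "ptop:Agent"))
print(has_type("myData:Ivan", "ptop:Agent"))
```

Each triple pattern in a query triggers such a recursive search, and its result size is unknown until the search finishes, which is what makes planning hard.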
Query Rewriting
o Each pattern in the query is rewritten as disjunction of several
alternatives, based on reasoning on the schema/ontology/TBox
<?a rdf:type ptop:Person> query pattern will be expanded to something like
<?a rdf:type ptop:Person> OR
(<?p rdfs:range ptop:Person> AND <?b ?p ?a>) OR
(<?a rdf:type ?c> AND <?c rdfs:subClassOf ptop:Person >) …
o Executing tens of combinations of variants is slow
ü Imagine a query with two patterns: the first one expands into 5 variants and the second into 6
variants. The engine will have to evaluate 30 alternative combinations
ü Think of implementing the semantics of owl:sameAs via query rewriting
o Query rewriting also delivers incomplete results
ü Recursion is not possible with SPARQL query rewriting
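The combinatorics of rewriting can be made concrete with a small sketch. The TBox below is hypothetical (ptop:Man and the choice of properties with range ptop:Person are assumptions), and the rewriting resolves the schema ahead of time instead of querying it, which is a simplification of the expansion shown on the slide.

```python
from itertools import product

# Hypothetical TBox: two direct subclasses of ptop:Person and
# two properties whose rdfs:range is ptop:Person.
SUBCLASSES = {"ptop:Person": ["ptop:Woman", "ptop:Man"]}
RANGE_PROPS = {"ptop:Person": ["ptop:childOf", "ptop:parentOf"]}


def expand_type_pattern(var, cls):
    """Rewrite (var rdf:type cls) into a list of alternative patterns."""
    variants = [f"{var} rdf:type {cls}"]
    variants += [f"{var} rdf:type {sub}" for sub in SUBCLASSES.get(cls, [])]
    variants += [f"?b {prop} {var}" for prop in RANGE_PROPS.get(cls, [])]
    return variants


first = expand_type_pattern("?a", "ptop:Person")
# Placeholder: a second pattern that happens to expand into 6 variants.
second = [f"<variant {i} of the second pattern>" for i in range(1, 7)]

combinations = list(product(first, second))
print(len(first), len(second), len(combinations))
```

Note that the expansion only looks one level down the class hierarchy; covering deeper hierarchies requires recursion, which is the source of the incompleteness mentioned above.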
Presentation Outline
o Reasoning Introduction: Benefits and Pitfalls
o Reasoning Use Cases and Demos
o RDFS and OWL 2 Profiles
o Reasoning Implementation Choices
o GraphDB
o Reasoning with GraphDB
o Reasoning Optimizations in GraphDB
GraphDB Essentials
o Scalable RDF / SPARQL engine
ü W3C standards support
ü NEW: RDF* support, property annotations
o Platform independent (100% Java)
o Open source API
ü Main contributor to the RDF4J project
o Reasoning and consistency checking
ü UNIQUE! Efficient reasoning support for big data
sets across the full lifecycle of the data: load, query, updates
Architecture
GraphDB Workbench
User friendly interface for database
administration
GraphDB Engine
REST API for database access
Plugin / Connectors
GraphDB Workbench
o SPARQL editor & autocomplete
o Schema visualization
o Graph exploration
o Database monitoring and administration
Visual Graph
Features Free Standard Enterprise
RDF 1.1 support
SPARQL 1.1 support
RDFS, OWL2 RL and QL reasoning
Efficient query execution
Workbench interface
Community support
Unlimited number of CPU cores
Commercial support
Connectors for Elasticsearch & SOLR
High-availability cluster
Managed service
GraphDB Enterprise: Resilience & Availability
Reasoning in GraphDB
o Fast forward-chaining materialization
ü Allows for efficient query evaluation on big datasets
o Incremental for both inserts and deletes
ü Inferred closure is updated transparently upon commit of transaction
o Sample rules:
ENTAILMENT
p <rdf:type> <owl:FunctionalProperty>
x p y
x p z
-------------------------------
y <owl:sameAs> z
CONSISTENCY
x owl:sameAs y
x owl:differentFrom y
------------------------
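The two sample rules can be mimicked in plain Python as follows. This is a sketch under assumptions: the ex: names are hypothetical data, trivial x owl:sameAs x conclusions are skipped, and the consistency check only tests the pair in the direction it was asserted.

```python
def functional_property_rule(triples):
    """Entailment: two values of a functional property for one subject are sameAs."""
    functional = {
        s for (s, p, o) in triples
        if p == "rdf:type" and o == "owl:FunctionalProperty"
    }
    values = {}
    for s, p, o in triples:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    return {
        (y, "owl:sameAs", z)
        for group in values.values()
        for y in group
        for z in group
        if y != z
    }


def consistency_violations(triples):
    """Consistency: no pair may be both owl:sameAs and owl:differentFrom."""
    same = {(s, o) for (s, p, o) in triples if p == "owl:sameAs"}
    return sorted(
        (s, o) for (s, p, o) in triples
        if p == "owl:differentFrom" and (s, o) in same
    )


data = {
    ("ex:hasMother", "rdf:type", "owl:FunctionalProperty"),
    ("ex:Ivan", "ex:hasMother", "ex:Maria"),
    ("ex:Ivan", "ex:hasMother", "ex:Mary"),
    ("ex:Maria", "owl:differentFrom", "ex:Mary"),
}
inferred = functional_property_rule(data)
print(sorted(inferred))
print(consistency_violations(data | inferred))
```

The entailment rule adds statements, while the consistency rule adds nothing and only reports the offending pair.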
OWL 2 Reasoning
o Built-in rule-sets for: RDFS, OWL-Horst, OWL2-RL, OWL2-QL
o Custom rule-sets easily defined
ü Ruleset optimizer/profiler
o Configurations with multiple rule-sets
ü E.g. one with consistency checking to be used for internal data and another one
with “open-world” semantics for LOD and other external datasets
o NEW: Proof plug-in provides inference explanation
Predefined Rule-Sets
Ruleset      Description
Empty        No reasoning
rdfs         Standard RDFS: subClassOf, subPropertyOf, domain and range of properties
rdfs-plus    RDFS plus symmetric, transitive and inverse properties
owl-horst    (pD*) sameAs, equivalentClass, equivalentProperty, SymmetricProperty,
             TransitiveProperty, inverseOf, FunctionalProperty, InverseFunctionalProperty.
             Partial support for: intersectionOf, someValuesFrom, hasValue, allValuesFrom
owl-max      See the spec http://graphdb.ontotext.com/documentation/standard/reasoning.html
owl-rl       (DL-LiteR) AsymmetricProperty, IrreflexiveProperty, propertyChainAxiom,
             AllDisjointProperties, hasKey, unionOf, complementOf, oneOf, differentFrom,
             AllDisjointClasses and all the property cardinality primitives. Adds more complete
             support for intersectionOf, someValuesFrom, hasValue, allValuesFrom
owl-ql       Partial compliance. See the spec https://www.w3.org/TR/owl2-profiles
Optimized Rule-Sets
o These versions exclude some RDFS reasoning rules, which are not useful
for most of the applications, but add substantial reasoning overheads
o “Optimized” ruleset versions suppress this rule
Id: rdf1_rdfs4a_4b
x a y
-------------------------------
x <rdf:type> <rdfs:Resource>
a <rdf:type> <rdfs:Resource>
y <rdf:type> <rdfs:Resource>
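To get a feel for the overhead of this rule, the sketch below applies it to a fixpoint over four sample statements (the owl:sameAs example used later in the deck). It is a simplification that treats every term alike and ignores the distinction between URIs and literals.

```python
def resource_typing(triples):
    """Apply 'x a y => x, a, y are rdfs:Resource' until nothing new is derived."""
    closure = set(triples)
    while True:
        new = {
            (term, "rdf:type", "rdfs:Resource")
            for triple in closure
            for term in triple
        } - closure
        if not new:
            return closure - set(triples)
        closure |= new


explicit = {
    ("dbpedia:Sofia", "owl:sameAs", "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "opencyc-en:Bulgaria"),
}
inferred = resource_typing(explicit)
print(len(explicit), len(inferred))
```

One statement is added per distinct term, plus two for rdf:type and rdfs:Resource themselves, none of which is likely to be queried in practice.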
Efficient Retraction of Inferred Facts
o Materialization causes trouble upon delete
ü It is not trivial to figure out which inferred statements are no longer supported
o Deletion without recomputing the inference closure is needed
ü Without it forward-chaining is not feasible for dynamic environments
o GraphDB retracts statements via a unique algorithm
ü Forward-chaining to find potentially affected inferences
ü Backward-chaining to test which inferences are still supported
ü No truth maintenance information overheads
ü Fast – the same order of magnitude as materialization upon insert
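The two-phase idea (forward-chain to find what might be affected, then test backward what is still supported) can be illustrated with a toy version. This is not GraphDB's actual algorithm: it assumes a single rule (type propagation along an acyclic, hypothetical subclass hierarchy), and all function names are made up.

```python
# Hypothetical schema: class -> direct superclasses.
SUBCLASS_OF = {"ptop:Woman": ["ptop:Person"], "ptop:Person": ["ptop:Agent"]}


def superclasses(cls):
    """All direct and indirect superclasses of cls."""
    found, stack = set(), [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def forward(triple):
    """Phase 1: every statement that could have been inferred from this one."""
    s, p, o = triple
    if p != "rdf:type":
        return set()
    return {(s, "rdf:type", sup) for sup in superclasses(o)}


def provable(triple, explicit):
    """Phase 2: is the statement asserted, or derivable from an asserted one?"""
    s, p, o = triple
    if triple in explicit:
        return True
    return p == "rdf:type" and any(
        s2 == s and p2 == "rdf:type" and o in superclasses(o2)
        for (s2, p2, o2) in explicit
    )


def retract(explicit, inferred, triple):
    """Delete one explicit statement without recomputing the whole closure."""
    explicit = explicit - {triple}
    affected = forward(triple) | {triple}
    supported = {t for t in affected if provable(t, explicit)}
    inferred = (inferred - affected) | (supported - explicit)
    return explicit, inferred


maria_woman = ("myData:Maria", "rdf:type", "ptop:Woman")
maria_person = ("myData:Maria", "rdf:type", "ptop:Person")
maria_agent = ("myData:Maria", "rdf:type", "ptop:Agent")

explicit, inferred = {maria_woman, maria_person}, {maria_agent}
explicit, inferred = retract(explicit, inferred, maria_woman)
print(sorted(explicit), sorted(inferred))
```

After deleting (Maria rdf:type Woman), the Agent statement survives because it is still supported by the explicit Person statement; only the statements reachable from the deleted one were re-examined.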
The Honey of owl:sameAs Equivalence
o owl:sameAs links the datasets in the Linked Open Data cloud
o owl:sameAs declares that two different URIs denote one and the same object
ü Aligns different identifiers of the same real-world entity used in different data sources
o For example, let’s say that we have three different URIs for Bulgaria and two for
Sofia (its capital)
dbpedia:Sofia owl:sameAs geonames:727011
geonames:727011 geo-ont:parentFeature geonames:732800
dbpedia:Bulgaria owl:sameAs geonames:732800
dbpedia:Bulgaria owl:sameAs opencyc-en:Bulgaria
The Sting of owl:sameAs Equivalence
o According to the standard semantics of owl:sameAs
ü It is a transitive and symmetric relationship
ü Statements, asserted using one of the equivalent URIs, should be inferred to appear with all equivalent
URIs placed in the same position
ü Thus the 4 statements in the example lead to 10 inferred statements:
geonames:727011 owl:sameAs dbpedia:Sofia
geonames:732800 owl:sameAs dbpedia:Bulgaria
geonames:732800 owl:sameAs opencyc-en:Bulgaria
opencyc-en:Bulgaria owl:sameAs dbpedia:Bulgaria
opencyc-en:Bulgaria owl:sameAs geonames:732800
dbpedia:Sofia geo-ont:parentFeature geonames:732800
dbpedia:Sofia geo-ont:parentFeature opencyc-en:Bulgaria
dbpedia:Sofia geo-ont:parentFeature dbpedia:Bulgaria
geonames:727011 geo-ont:parentFeature opencyc-en:Bulgaria
geonames:727011 geo-ont:parentFeature dbpedia:Bulgaria
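The count of 10 can be reproduced with a short sketch that builds the groups of equivalent URIs and replicates every statement across them. As on the slide, trivial x owl:sameAs x statements are not counted; the function name is illustrative only.

```python
def same_as_expansion(explicit):
    """Return the statements inferred under the standard owl:sameAs semantics."""
    groups = {}  # term -> set of all terms equivalent to it
    for s, p, o in explicit:
        if p == "owl:sameAs":
            merged = groups.get(s, {s}) | groups.get(o, {o})
            for term in merged:
                groups[term] = merged

    def equivalents(term):
        return groups.get(term, {term})

    closure = set()
    # Replicate each ordinary statement over all equivalent subjects, predicates, objects.
    for s, p, o in explicit:
        if p != "owl:sameAs":
            closure |= {
                (s2, p2, o2)
                for s2 in equivalents(s)
                for p2 in equivalents(p)
                for o2 in equivalents(o)
            }
    # Symmetry and transitivity: every ordered pair inside a group.
    for members in groups.values():
        closure |= {(a, "owl:sameAs", b) for a in members for b in members if a != b}
    return closure - set(explicit)


explicit = {
    ("dbpedia:Sofia", "owl:sameAs", "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "opencyc-en:Bulgaria"),
}
inferred = same_as_expansion(explicit)
print(len(inferred))
for triple in sorted(inferred):
    print(*triple)
```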
The Honey and the Sting of owl:sameAs
[Figure: two groups of sameAs-equivalent URIs, {geonames:727011, dbpedia:Sofia} and {geonames:732800, dbpedia:Bulgaria, opencyc-en:Bulgaria}, connected by geo-ont:parentFeature; node labels E11, E12, E21, E22, E23]
owl:sameAs Optimization
o GraphDB features an optimization of owl:sameAs
ü It can use a single master-node in its indices to represent a class of sameAs-equivalent URIs
o Avoids inflating the indices with multiple equivalent statements
ü Imagine a statement that has 5 sameAs-equivalents of its subject, 2 of its predicate and 3 of its object.
Such statement would have 30 replicas in the indices after forward-chaining if such an optimization is
not used
o Helps present compact query results
ü The owl:sameAs equivalence can result in multiplication of the bindings of the variables in the process
of query evaluation with both forward- and backward-chaining. This leads to expansion of the result-
set with rows that differ only by referring to different URIs, which are sameAs-equivalent
ü Optionally, query results can be expanded, as if there is no optimization
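A rough sketch of the master-node idea, assuming the master is simply the alphabetically smallest URI of each group (a hypothetical choice; the slide does not say how GraphDB picks it). Statements are indexed once, under their master nodes, instead of once per combination of equivalent URIs.

```python
from math import prod


def build_masters(triples):
    """Map every URI that takes part in owl:sameAs to the master of its group."""
    master = {}
    for s, p, o in triples:
        if p != "owl:sameAs":
            continue
        current = {master.get(s, s), master.get(o, o)}
        members = {t for t, m in master.items() if m in current} | {s, o}
        representative = min(members)  # hypothetical rule for picking the master
        for term in members:
            master[term] = representative
    return master


def canonical_index(triples):
    """Keep one copy of each ordinary statement, rewritten to master nodes."""
    master = build_masters(triples)
    return {
        (master.get(s, s), master.get(p, p), master.get(o, o))
        for (s, p, o) in triples
        if p != "owl:sameAs"
    }


explicit = [
    ("dbpedia:Sofia", "owl:sameAs", "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "geonames:732800"),
    ("dbpedia:Bulgaria", "owl:sameAs", "opencyc-en:Bulgaria"),
]
index = canonical_index(explicit)
print(index)

# Without the optimization: 5 subject, 2 predicate and 3 object equivalents.
replicas = prod([5, 2, 3])
print(replicas)
```

The single indexed parentFeature statement stands in for the six replicas that full expansion would store for this example.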
Questions?
Experience the technology with our demonstrators
FactForge: Knowledge graph of linked open data and news
about People and Organizations http://factforge.net
RANK: News popularity ranking for companies http://rank.ontotext.com
NOW: Semantic News Portal http://now.ontotext.com
#43

More Related Content

What's hot

Data Modeling & Metadata Management
Data Modeling & Metadata ManagementData Modeling & Metadata Management
Data Modeling & Metadata ManagementDATAVERSITY
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4jjexp
 
Data Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceData Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceDATAVERSITY
 
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASA
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASADeveloping a Knowledge Graph of your Competency, Skills, and Knowledge at NASA
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASANeo4j
 
Workshop Tel Aviv - Graph Data Science
Workshop Tel Aviv - Graph Data ScienceWorkshop Tel Aviv - Graph Data Science
Workshop Tel Aviv - Graph Data ScienceNeo4j
 
A Universe of Knowledge Graphs
A Universe of Knowledge GraphsA Universe of Knowledge Graphs
A Universe of Knowledge GraphsNeo4j
 
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...Denodo
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringHadi Fadlallah
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectOntotext
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Julien Le Dem
 
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyFrom Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyCambridge Semantics
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Kira
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph IntroductionSören Auer
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
How to Design Retail Recommendation Engines with Neo4j
How to Design Retail Recommendation Engines with Neo4jHow to Design Retail Recommendation Engines with Neo4j
How to Design Retail Recommendation Engines with Neo4jNeo4j
 

What's hot (20)

Data Modeling & Metadata Management
Data Modeling & Metadata ManagementData Modeling & Metadata Management
Data Modeling & Metadata Management
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
 
Data Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-ServiceData Catalogues - Architecting for Collaboration & Self-Service
Data Catalogues - Architecting for Collaboration & Self-Service
 
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASA
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASADeveloping a Knowledge Graph of your Competency, Skills, and Knowledge at NASA
Developing a Knowledge Graph of your Competency, Skills, and Knowledge at NASA
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Workshop Tel Aviv - Graph Data Science
Workshop Tel Aviv - Graph Data ScienceWorkshop Tel Aviv - Graph Data Science
Workshop Tel Aviv - Graph Data Science
 
A Universe of Knowledge Graphs
A Universe of Knowledge GraphsA Universe of Knowledge Graphs
A Universe of Knowledge Graphs
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...
Data Catalog in Denodo Platform 7.0: Creating a Data Marketplace with Data Vi...
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your ProjectChoosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020
 
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital StrategyFrom Data Lakes to the Data Fabric: Our Vision for Digital Strategy
From Data Lakes to the Data Fabric: Our Vision for Digital Strategy
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Graph databases
Graph databasesGraph databases
Graph databases
 
How to Design Retail Recommendation Engines with Neo4j
How to Design Retail Recommendation Engines with Neo4jHow to Design Retail Recommendation Engines with Neo4j
How to Design Retail Recommendation Engines with Neo4j
 
BIG DATA and USE CASES
BIG DATA and USE CASESBIG DATA and USE CASES
BIG DATA and USE CASES
 

Similar to Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes

EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - FactforgeEuropean Data Forum
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
The Nature of Information
The Nature of InformationThe Nature of Information
The Nature of InformationAdrian Paschke
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-tonvitucci
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudMOVING Project
 
The Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal RegulationsThe Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal Regulationstbruce
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)Vladimir Alexiev, PhD, PMP
 
Querying the Web of Data
Querying the Web of DataQuerying the Web of Data
Querying the Web of DataRinke Hoekstra
 
Release webinar: Sansa and Ontario
Release webinar: Sansa and OntarioRelease webinar: Sansa and Ontario
Release webinar: Sansa and OntarioBigData_Europe
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Ontotext
 
Semantic Web from the 2013 Perspective
Semantic Web from the 2013 PerspectiveSemantic Web from the 2013 Perspective
Semantic Web from the 2013 PerspectiveAdrian Paschke
 
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsEKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsFrancesco Osborne
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked DataJee-Hyub Kim
 
Approximation and Self-Organisation on the Web of Data
Approximation and Self-Organisation on the Web of DataApproximation and Self-Organisation on the Web of Data
Approximation and Self-Organisation on the Web of DataKathrin Dentler
 

Similar to Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes (20)

EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
The Nature of Information
The Nature of InformationThe Nature of Information
The Nature of Information
 
Linked Open Data and Ontotext Projects
Linked Open Data and Ontotext ProjectsLinked Open Data and Ontotext Projects
Linked Open Data and Ontotext Projects
 
Fact forge20 edf
Fact forge20 edfFact forge20 edf
Fact forge20 edf
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-to
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
 
The Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal RegulationsThe Semantic Web meets the Code of Federal Regulations
The Semantic Web meets the Code of Federal Regulations
 
20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture
 
Ontology development
Ontology developmentOntology development
Ontology development
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)
 
Querying the Web of Data
Querying the Web of DataQuerying the Web of Data
Querying the Web of Data
 
Release webinar: Sansa and Ontario
Release webinar: Sansa and OntarioRelease webinar: Sansa and Ontario
Release webinar: Sansa and Ontario
 
Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020Property graph vs. RDF Triplestore comparison in 2020
Property graph vs. RDF Triplestore comparison in 2020
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 
Semantic Web from the 2013 Perspective
Semantic Web from the 2013 PerspectiveSemantic Web from the 2013 Perspective
Semantic Web from the 2013 Perspective
 
Open Access Repository Junction
Open Access Repository JunctionOpen Access Repository Junction
Open Access Repository Junction
 
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic PublicationsEKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
EKAW 2016 - TechMiner: Extracting Technologies from Academic Publications
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked Data
 
Approximation and Self-Organisation on the Web of Data
Approximation and Self-Organisation on the Web of DataApproximation and Self-Organisation on the Web of Data
Approximation and Self-Organisation on the Web of Data
 

Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes

  • 1. making sense of text and data Atanas Kiryakov Webinar, July 2020 Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes
  • 2. Who are we? o Leader ü Semantic technology vendor established year 2000 ü Part of Sirma Group: 400 persons, listed at Sofia Stock Exchange o Profitable and growing ü Global: 80% of revenue from London and New York ü Clients: S&P, BBC, FT, Top-5 US Bank, UK Parliament, Fujitsu, … ü Verticals: Financial services, Health care and Life sciences, Publishing, Manufacturing o Innovator ü Attracted over $15M in innovation funding ü Member of W3C, EDMC, ODI, STI and LDBC, developing next gen. standards
  • 3. …, the market leaders in this space continue to be Neo4J and Ontotext (GraphDB), which are graph and RDF database providers respectively. These are the longest established vendors in this space (both founded in 2000) so they have a longevity and experience that other suppliers cannot yet match. Bloor Research Graph Database Market Update 2020 Ontotext GraphDB™ - the Flagship Product
  • 5. Presentation Outline o Reasoning Introduction: Benefits and Pitfalls o Reasoning Use Cases and Demos o RDFS and OWL 2 Profiles o Reasoning Implementation Choices o Reasoning With GraphDB
  • 6. Knowledge Graphs = Rich Data in Context KGs put data in context via linking and semantic metadata We help enterprises get profound insights via interlinking, analyzing and exploring: o diverse databases o text documents and other content o proprietary & global data
  • 7. What is a Knowledge Graph? o The KG represents a collection of interlinked descriptions of concepts and entities ü Concepts describe each other ü Connections provide context ü Context helps comprehension! o A KG can be used as: ü Database: can be queried ü Graph: can be analyzed as network ü Knowledge base: new facts can be inferred Read more: https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/
  • 8. What is Semantics? o Formal semantics allows new valid facts to be inferred ü Both data and schema can be interpreted ü Semantic schema = ontology ü Languages: RDF Schema (RDFS), OWL o Only the relevant semantics is formalized in the schema ü The meaning of relativeOf is not fully described by defining it as owl:SymmetricProperty ü The best model is the simplest one that can do the work. But not simpler!
    [Figure: example graph linking myData:Maria and myData:Ivan; labels include ptop:Agent, ptop:Person, ptop:Woman, ptop:childOf, ptop:parentOf, owl:relativeOf, rdf:type, rdfs:range, rdfs:subPropertyOf, owl:inverseOf and owl:SymmetricProperty; some edges are marked as inferred]
  • 9. Reasoning Benefits o Schema alignment and easy querying in diverse datasets ü Across sources, similar relationships can be modeled differently: one source can use parentOf, another childOf and a third just the more general relativeOf ü The database will return Ivan as a result of the query (Maria relativeOf ?x) when the fact derived from the source and asserted is (Ivan childOf Maria) o Getting deeper and more complete results ü Finding patterns and inferring new relationships ü Instant discovery of hidden relationships scattered across multiple sources o Consistency checking and quality validation ü RDF Shapes ensure graph consistency and quality
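The relativeOf example can be sketched in a few lines of Python. This is only an illustration, not how GraphDB implements it. The schema is an assumption read off the slide 8 diagram: childOf and parentOf are inverses, both are sub-properties of relativeOf, and relativeOf is symmetric. The function names are hypothetical.

```python
# Assumed schema (not spelled out on the slide): childOf and parentOf are
# inverses, both are sub-properties of relativeOf, and relativeOf is symmetric.
INVERSE = {("childOf", "parentOf")}
SUB_PROPERTY = {("childOf", "relativeOf"), ("parentOf", "relativeOf")}
SYMMETRIC = {"relativeOf"}


def infer(facts):
    """Apply the three rule kinds until no new triple appears."""
    closure = set(facts)
    while True:
        new = set()
        for s, p, o in closure:
            for a, b in INVERSE:
                if p == a:
                    new.add((o, b, s))
                if p == b:
                    new.add((o, a, s))
            for sub, sup in SUB_PROPERTY:
                if p == sub:
                    new.add((s, sup, o))
            if p in SYMMETRIC:
                new.add((o, p, s))
        if new <= closure:
            return closure
        closure |= new


def objects(triples, subject, predicate):
    """Answer the pattern (subject predicate ?x)."""
    return {o for s, p, o in triples if s == subject and p == predicate}


asserted = {("Ivan", "childOf", "Maria")}
closure = infer(asserted)
print(objects(closure, "Maria", "relativeOf"))  # {'Ivan'}
```

Without inference the same pattern finds nothing, since only (Ivan childOf Maria) was asserted.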
  • 10. The Pitfalls of Reasoning o Over-engineered ontologies ü Too expressive ontology language ü Results of inference hard to understand and verify ü Performance penalties far greater than the benefits o Inappropriate reasoning support ü Inference implementations that work well with taxonomies and conceptual models of a few thousand concepts, but cannot cope with KGs of millions of entities o Inappropriate data layer architecture ü One such example is reasoning with a virtual KG, which is often infeasible
  • 12. Search in British Museum’s Collection o Artefacts are described via the granular ontology CIDOC CRM o Searching in such collection requires Fundamental Relations ü Aggregation of large number of paths through CRM data into a smaller number of searchable relations o E.g.: FR "Thing from Place"
  • 13. British Museum’s Collection: Volumetrics o Museum objects: 2,051,797 ü Thesaurus entries: 415,509 o Explicit statements: 195,208,156 o Total statements: 916,735,486 ü Expansion ratio is 4.7x, i.e., for each statement, 3.7 more are inferred ü Nodes (unique URLs and literals): 53,803,189 o Loading time (including materialization): ü 22.2h on RAM drive ü 32.9h on non-SSD hard drives
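The expansion ratio quoted above can be checked directly from the two statement counts on the slide. A minimal sketch; the variable names are illustrative only.

```python
# Statement counts as given on the slide.
explicit_statements = 195_208_156
total_statements = 916_735_486

# Total statements per explicit statement.
expansion_ratio = total_statements / explicit_statements
# Inferred statements per explicit statement.
inferred_per_explicit = (total_statements - explicit_statements) / explicit_statements

print(round(expansion_ratio, 1))        # 4.7
print(round(inferred_per_explicit, 1))  # 3.7
```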
  • 14. GraphDB Benchmarking o LDBC: TPC-like benchmarks for graph databases o Members include: Ontotext, OpenLink, neo4j, CWI, UPM, ORACLE, IBM, *Sparsity o LDBC Semantic Publishing Benchmark ü Based on BBC’s Dynamic Semantic Publishing editorial workflow ü Updates, adding new content metadata or updating the reference knowledge (e.g., new people) ü Aggregation queries retrieve content according to various criteria (e.g., to generate a topic web page) ü The only benchmark that involves reasoning and updates
  • 15. LDBC SPB Results of GraphDB
      Clients (reading / writing)    Reads/s    Writes/s
      0 / 1                           0.0000     11.4067
      0 / 2                           0.0000     14.3033
      0 / 4                           0.0000     14.6700
      0 / 8                           0.0000     15.1067
      1 / 0                          17.8258      0.0000
      4 / 0                          43.0833      0.0000
      8 / 0                          70.3767      0.0000
      16 / 0                         83.2633      0.0000
      8 / 2                          52.5667      9.2867
      8 / 4                          54.0233      9.6167
      8 / 8                          54.9067      9.5733
      10 / 2                         59.9467      8.5333
      10 / 4                         62.2867      8.4767
      10 / 8                         61.7167      8.6067
      16 / 2                         68.8100      5.0600
      16 / 4                         70.3900      5.1067
      16 / 8                         70.2300      4.9967
      16 / 16                        70.9467      5.0567
    o CPU: 1 x E5-1650 o RAM: 20G heap o Dataset: LDBC SPB 256 o DB: GraphDB SE 8.0, OWL-Horst-optimized rule set o RDF statements: 254,948,985 (explicit), 480,405,141 (total) o Creative works: 8,821,535
  • 16. FactForge: Data Integration o DBpedia (the English version): 496M o GeoNames (all geographic features on Earth): 150M o owl:sameAs links between DBpedia and GeoNames: 471K o GLEI (global company register data): 3M o Panama Papers DB (#LinkedLeaks): 20M o Other datasets and ontologies: WordNet, WorldFacts, FIBO o News metadata (2000 articles/day enriched by NOW): 1 023M o Total size (2.2B explicit + 328M inferred statements): 2 522M
  • 17. FIBO: Financial Industry Business Ontology o Developed by EDMC, https://spec.edmcouncil.org/fibo/ o We loaded FIBO Foundations and BE ü About 35 RDF files altogether (old version) o Reasoning profile: OWL 2 RL o Loading takes 2-3 sec. o Number of explicit statements: 5 696 o Number of total statements, including inferred: 15 713 ü About 10k statements materialized
  • 18. FIBO-PROTON Mapping o PROTON is an upper-level ontology ü 500 classes, 200 properties; developed by Ontotext since 2004 ü Used in semantic annotation and LOD integration services, e.g., FactForge ü Mapped to DBpedia, Freebase, GeoNames o A very basic mapping for public companies and a few related properties was loaded in 4 hours in FactForge:
      fb:business.issuer rdfs:subClassOf pext:PublicCompany.
      pext:PublicCompany rdfs:subClassOf fibo-be-corp-corp:PubliclyHeldCompany.
      ptop:Organization rdfs:subClassOf fibo-fnd-org-fm:FormalOrganization.
      dbp-prop:industry rdfs:subPropertyOf pext:industryOf.
      pext:industryOf rdfs:subPropertyOf fibo-fnd-rel-rel:isClassifiedBy.
      dbp-ont:subsidiary rdfs:subPropertyOf ptop:controls.
      ptop:controls rdfs:subPropertyOf fibo-fnd-rel-rel:controls.
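To see what such a mapping buys, the sketch below applies the seven mapping statements to two data triples. The mapping pairs are copied from the slide; the ex: resources and the function names are made up for illustration, and the code only mimics RDFS subClassOf/subPropertyOf inference rather than reproducing GraphDB's engine.

```python
# Mapping statements from the slide, as (sub, super) pairs.
SUB_CLASS = {
    ("fb:business.issuer", "pext:PublicCompany"),
    ("pext:PublicCompany", "fibo-be-corp-corp:PubliclyHeldCompany"),
    ("ptop:Organization", "fibo-fnd-org-fm:FormalOrganization"),
}
SUB_PROPERTY = {
    ("dbp-prop:industry", "pext:industryOf"),
    ("pext:industryOf", "fibo-fnd-rel-rel:isClassifiedBy"),
    ("dbp-ont:subsidiary", "ptop:controls"),
    ("ptop:controls", "fibo-fnd-rel-rel:controls"),
}


def supers(term, pairs):
    """All direct and indirect super-terms of term."""
    found, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        for sub, sup in pairs:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def materialize(data):
    """Add the types and properties implied by the mapping."""
    out = set(data)
    for s, p, o in data:
        if p == "rdf:type":
            out |= {(s, p, cls) for cls in supers(o, SUB_CLASS)}
        else:
            out |= {(s, prop, o) for prop in supers(p, SUB_PROPERTY)}
    return out


# Hypothetical instance data, not taken from FactForge.
data = {
    ("ex:AcmeCorp", "rdf:type", "fb:business.issuer"),
    ("ex:AcmeCorp", "dbp-ont:subsidiary", "ex:AcmeLabs"),
}
for triple in sorted(materialize(data) - data):
    print(triple)
```

A query phrased purely in FIBO terms would then also match data that was originally described with Freebase or DBpedia vocabulary.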
  • 20. Rule-Based Reasoning o Description Logic (DL) doesn’t scale ü Satisfiability checking is not tractable ü Complexity grows exponentially with size o Rule-based inference engine ü R-Entailment rules, PROLOG-style, as defined in [1] o Sound and complete in PSPACE ü Under some constraints: do not introduce blank nodes, bound size of the rule bodies, ground RDF graph [1]
    More at: http://graphdb.ontotext.com/documentation/standard/reasoning.html
    [Figure: complexity vs. expressivity chart of DL-based and rule/LP-based languages (OWL Full, OWL DL, OWL Lite, RDFS, SWRL, Datalog, OWL 2 QL, OWL 2 RL, OWL Horst), with the expressivity supported by GraphDB marked]
    [1] Herman J. ter Horst. Combining RDF and Part of OWL with Rules: Semantics, Decidability, Complexity. International Semantic Web Conference, 2005.
  • 22. Forward-Chaining and Materialization o All possible inferences are made upon update and are stored ü The inferred statements are stored and indexed alongside the explicit ones ü Inferences that are no longer supported upon delete are retracted o Forward-chaining works, subject to conscious modeling ü The overheads of the materialization approach are bearable ü Say, 2x index size and 2x slower loading and updates ü Marginal (if any) slowdown of queries
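A toy store makes the "infer on update, then just look up at query time" idea concrete. This is a minimal sketch under stated assumptions: only two RDFS-style rules are implemented (subClassOf transitivity and type propagation), the class name MaterializedStore is hypothetical, and the Woman/Person/Agent hierarchy is assumed from the slide 8 diagram. It says nothing about GraphDB internals.

```python
def consequences(t, known):
    """One-step consequences of triple t joined with any known triple."""
    s, p, o = t
    out = set()
    for s2, p2, o2 in known:
        if p == "rdfs:subClassOf" and p2 == "rdfs:subClassOf":
            if o == s2:
                out.add((s, p, o2))   # t sits below the known edge
            if o2 == s:
                out.add((s2, p, o))   # t sits above the known edge
        elif p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
            out.add((s, p, o2))       # new instance, known class edge
        elif p == "rdfs:subClassOf" and p2 == "rdf:type" and o2 == s:
            out.add((s2, p2, o))      # new class edge, known instance
    return out


class MaterializedStore:
    """Keeps inferred triples next to explicit ones; queries do no reasoning."""

    def __init__(self):
        self.explicit = set()
        self.inferred = set()

    def insert(self, triple):
        self.explicit.add(triple)
        self.inferred.discard(triple)
        delta = {triple}
        while delta:  # only chase consequences of what is new
            known = self.explicit | self.inferred
            derived = set()
            for t in delta:
                derived |= consequences(t, known)
            delta = derived - known
            self.inferred |= delta

    def match(self, s=None, p=None, o=None):
        return {t for t in self.explicit | self.inferred
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])}


store = MaterializedStore()
store.insert(("ptop:Woman", "rdfs:subClassOf", "ptop:Person"))
store.insert(("ptop:Person", "rdfs:subClassOf", "ptop:Agent"))
store.insert(("myData:Maria", "rdf:type", "ptop:Woman"))
print(store.match(p="rdf:type", o="ptop:Agent"))
print(len(store.explicit), len(store.inferred))  # 3 3
```

Here three explicit statements yield three inferred ones, so the stored index doubles, in line with the rough 2x overhead mentioned on the slide.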
  • 23. Query-time Reasoning and Backward-Chaining o Perform reasoning at query time ü No overhead upon data loading and updates ü Two basic approaches: backward-chaining and query rewriting o Backward-chaining slows down query evaluation dramatically ü As in PROLOG unification, the engine “dives” recursively to exhaust all alternative ways to find bindings for each separate triple pattern in the query ü The cardinality of the results for each triple pattern cannot be estimated before the actual evaluation ü This makes query plan optimization impossible and ruins query performance
  • 24. Query Rewriting o Each pattern in the query is rewritten as a disjunction of several alternatives, based on reasoning on the schema/ontology/TBox. E.g., the query pattern <?a rdf:type ptop:Person> will be expanded to something like <?a rdf:type ptop:Person> OR (<?p rdfs:range ptop:Person> AND <?b ?p ?a>) OR (<?a rdf:type ?c> AND <?c rdfs:subClassOf ptop:Person>) … o Executing tens of combinations of variants is slow ü Imagine a query with two patterns: the first one expands into 5 variants and the second into 6 variants. The engine will have to evaluate 30 alternative combinations ü Think of implementing the semantics of owl:sameAs via query rewriting o Query rewriting also delivers incomplete results ü Recursion is not possible with SPARQL query rewriting
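The 5 x 6 = 30 figure is simply the size of the cross product of the per-pattern alternatives. A small sketch, with placeholder variant labels standing in for real rewritten triple patterns and a hypothetical function name:

```python
from itertools import product

# Placeholder labels; a real rewriter would emit triple patterns instead.
first_pattern = [f"P1 variant {i}" for i in range(1, 6)]   # 5 alternatives
second_pattern = [f"P2 variant {i}" for i in range(1, 7)]  # 6 alternatives


def rewritten_queries(*pattern_variants):
    """Every way to pick one variant per pattern."""
    return list(product(*pattern_variants))


combos = rewritten_queries(first_pattern, second_pattern)
print(len(combos))  # 30
```

Each additional pattern multiplies the count again, which is why queries with many patterns become expensive under rewriting.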
  • 25. Presentation Outline o Reasoning Introduction: Benefits and Pitfalls o Reasoning Use Cases and Demos o RDFS and OWL 2 Profiles o Reasoning Implementation Choices o GraphDB o Reasoning with GraphDB o Reasoning Optimizations in GraphDB
  • 26. GraphDB Essentials o Scalable RDF / SPARQL engine ü W3C standards support ü NEW: RDF* support, property annotations o Platform independent (100% Java) o Open source API ü Main contributor to the RDF4J project o Reasoning and consistency checking ü UNIQUE! Efficient reasoning support for big data sets across the full lifecycle of the data: load, query, updates
  • 27. Architecture GraphDB Workbench User friendly interface for database administration GraphDB Engine REST API for database access Plugin / Connectors
  • 28. GraphDB Workbench o SPARQL editor & autocomplete o Schema visualization o Graph exploration o Database monitoring and administration
  • 30. Features Free Standard Enterprise RDF 1.1 support SPARQL 1.1 support RDFS, OWL2 RL and QL reasoning Efficient query execution Workbench interface Community support Unlimited number of CPU cores Commercial support Connectors for Elasticsearch & SOLR High-availability cluster Managed service GraphDB Enterprise: Resilience & Availability
  • 32. Reasoning in GraphDB o Fast forward-chaining materialization ü Allows for efficient query evaluation on big datasets o Incremental for both inserts and deletes ü Inferred closure is updated transparently upon commit of transaction o Sample rules:
      ENTAILMENT
        p <rdf:type> <owl:FunctionalProperty>
        x p y
        x p z
        ------------------------
        y <owl:sameAs> z
      CONSISTENCY
        x owl:sameAs y
        x owl:differentFrom y
        -------------------------------
        (empty conclusion: a match is reported as an inconsistency)
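The two sample rules can be mimicked in plain Python to show the difference between an entailment rule (adds statements) and a consistency rule (flags a violation). This is a rough sketch, not GraphDB's rule language: the ex: names are invented, the function names are hypothetical, and the check z != y is added here only to skip trivial y sameAs y conclusions.

```python
SAME_AS = "owl:sameAs"
DIFFERENT = "owl:differentFrom"


def functional_property_rule(triples):
    """p is functional, x p y, x p z  =>  y sameAs z."""
    functional = {s for s, p, o in triples
                  if p == "rdf:type" and o == "owl:FunctionalProperty"}
    inferred = set()
    for x, p, y in triples:
        if p in functional:
            for x2, p2, z in triples:
                if x2 == x and p2 == p and z != y:
                    inferred.add((y, SAME_AS, z))
    return inferred


def consistency_violations(triples):
    """Pairs that are stated to be both the same and different."""
    return {(x, y) for x, p, y in triples
            if p == SAME_AS and (x, DIFFERENT, y) in triples}


# Hypothetical data: one functional property with two values for one subject.
triples = {
    ("ex:hasBirthMother", "rdf:type", "owl:FunctionalProperty"),
    ("ex:Ivan", "ex:hasBirthMother", "ex:Maria"),
    ("ex:Ivan", "ex:hasBirthMother", "ex:MariaPetrova"),
}
triples |= functional_property_rule(triples)
print(consistency_violations(triples))  # set()

# Asserting that the two values differ now contradicts the inferred sameAs.
triples.add(("ex:Maria", DIFFERENT, "ex:MariaPetrova"))
print(consistency_violations(triples))
```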
  • 33. OWL 2 Reasoning o Built-in rule-sets for: RDFS, OWL-Horst, OWL2-RL, OWL2-QL o Custom rule-sets easily defined ü Ruleset optimizer/profiler o Configurations with multiple rule-sets ü E.g. one with consistency checking to be used for internal data and another one with “open-world” semantics for LOD and other external datasets o NEW: Proof plug-in provides inference explanation
  • 34. Predefined Rule-Sets
      Ruleset | Description
      Empty | No reasoning
      rdfs | Standard RDFS: subClassOf, subPropertyOf, domain and range of properties
      rdfs-plus | RDFS plus symmetric, transitive and inverse properties
      owl-horst (pD*) | sameAs, equivalentClass, equivalentProperty, SymmetricProperty, TransitiveProperty, inverseOf, FunctionalProperty, InverseFunctionalProperty. Partial support for: intersectionOf, someValuesFrom, hasValue, allValuesFrom
      owl-max | See the spec http://graphdb.ontotext.com/documentation/standard/reasoning.html
      owl-rl (DL-LiteR) | AsymmetricProperty, IrreflexiveProperty, propertyChainAxiom, AllDisjointProperties, hasKey, unionOf, complementOf, oneOf, differentFrom, AllDisjointClasses and all the property cardinality primitives. Adds more complete support for intersectionOf, someValuesFrom, hasValue, allValuesFrom
      owl-ql | Partial compliance. See the spec https://www.w3.org/TR/owl2-profiles
  • 35. Optimized Rule-Sets o These versions exclude some RDFS reasoning rules, which are not useful for most applications but add substantial reasoning overheads o “Optimized” ruleset versions suppress this rule:
      Id: rdf1_rdfs4a_4b
        x a y
        -------------------------------
        x <rdf:type> <rdfs:Resource>
        a <rdf:type> <rdfs:Resource>
        y <rdf:type> <rdfs:Resource>
  • 36. Presentation Outline
    o Reasoning Introduction: Benefits and Pitfalls
    o Reasoning Use Cases and Demos
    o RDFS and OWL 2 Profiles
    o Reasoning Implementation Choices
    o GraphDB
    o Reasoning with GraphDB
    o Reasoning Optimizations in GraphDB
  • 37. Efficient Retraction of Inferred Facts
    o Materialization complicates deletes
      ü It is not trivial to figure out which inferred statements are no longer supported
    o Deletion without recomputing the full inference closure is needed
      ü Without it, forward-chaining is not feasible for dynamic environments
    o GraphDB retracts statements via a unique algorithm
      ü Forward-chaining to find potentially affected inferences
      ü Backward-chaining to test which inferences are still supported
      ü No overhead for storing truth-maintenance information
      ü Fast – the same order of magnitude as materialization upon insert
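The two phases listed above can be sketched in Python for one simple rule. This is a simplified illustration and not GraphDB's actual algorithm: it assumes a single transitive predicate (statements are `(subject, object)` pairs), and all function and class names are hypothetical.

```python
def closure_of(explicit):
    """Materialize transitivity: (x, y), (y, z) => (x, z)."""
    closure = set(explicit)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def affected(deleted, closure):
    """Forward-chain from the deleted statement to collect every statement
    that has at least one derivation using it."""
    suspects, frontier = {deleted}, [deleted]
    while frontier:
        a, b = frontier.pop()
        conclusions = {(a, d) for c, d in closure if c == b}
        conclusions |= {(c, b) for c, d in closure if d == a}
        for stmt in conclusions - suspects:
            suspects.add(stmt)
            frontier.append(stmt)
    return suspects


def provable(goal, explicit):
    """Backward-chain: is there a path of explicit statements for the goal?"""
    start, target = goal
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        for x, y in explicit:
            if x == node and y not in seen:
                if y == target:
                    return True
                seen.add(y)
                stack.append(y)
    return False


def delete(stmt, explicit, closure):
    """Retract stmt without recomputing the whole closure."""
    remaining = explicit - {stmt}
    suspects = affected(stmt, closure)
    unsupported = {s for s in suspects if not provable(s, remaining)}
    return remaining, closure - unsupported


# Hypothetical subclass hierarchy.
explicit = {("Cat", "Feline"), ("Feline", "Mammal"),
            ("Cat", "Mammal"), ("Mammal", "Animal")}
closure = closure_of(explicit)
remaining, new_closure = delete(("Feline", "Mammal"), explicit, closure)
```

Only the suspect statements are re-tested, and no per-statement justification records are kept, which mirrors the "no truth maintenance" point on the slide. In this example `("Cat", "Animal")` survives the delete because it is still supported through `("Cat", "Mammal")`.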
  • 38. The Honey of owl:sameAs Equivalence
    o owl:sameAs links the datasets in the Linked Open Data cloud
    o owl:sameAs declares that two different URIs denote one and the same object
      ü Aligns different identifiers of the same real-world entity used in different data sources
    o For example, let’s say that we have three different URIs for Bulgaria and two for Sofia (its capital):
      dbpedia:Sofia owl:sameAs geonames:727011
      geonames:727011 geo-ont:parentFeature geonames:732800
      dbpedia:Bulgaria owl:sameAs geonames:732800
      dbpedia:Bulgaria owl:sameAs opencyc-en:Bulgaria
  • 39. The Sting of owl:sameAs Equivalence
    o According to the standard semantics of owl:sameAs
      ü It is a transitive and symmetric relationship
      ü Statements asserted using one of the equivalent URIs should be inferred to appear with all equivalent URIs placed in the same position
      ü Thus the 4 statements in the example lead to 10 inferred statements:
        geonames:727011 owl:sameAs dbpedia:Sofia
        geonames:732800 owl:sameAs dbpedia:Bulgaria
        geonames:732800 owl:sameAs opencyc-en:Bulgaria
        opencyc-en:Bulgaria owl:sameAs dbpedia:Bulgaria
        opencyc-en:Bulgaria owl:sameAs geonames:732800
        dbpedia:Sofia geo-ont:parentFeature geonames:732800
        dbpedia:Sofia geo-ont:parentFeature opencyc-en:Bulgaria
        dbpedia:Sofia geo-ont:parentFeature dbpedia:Bulgaria
        geonames:727011 geo-ont:parentFeature opencyc-en:Bulgaria
        geonames:727011 geo-ont:parentFeature dbpedia:Bulgaria
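The count of 10 inferred statements can be checked with a short Python sketch. It assumes triples are plain tuples and, like the slide's list, it does not count reflexive statements such as `dbpedia:Sofia owl:sameAs dbpedia:Sofia`.

```python
from itertools import product

SAME_AS = "owl:sameAs"

explicit = {
    ("dbpedia:Sofia", SAME_AS, "geonames:727011"),
    ("geonames:727011", "geo-ont:parentFeature", "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "geonames:732800"),
    ("dbpedia:Bulgaria", SAME_AS, "opencyc-en:Bulgaria"),
}


def equivalence_classes(triples):
    """Map every URI to the set of URIs that are sameAs-equivalent to it."""
    classes = {}
    for s, p, o in triples:
        if p == SAME_AS:
            merged = classes.get(s, {s}) | classes.get(o, {o})
            for uri in merged:
                classes[uri] = merged
    return classes


def expand(triples):
    """Closure under symmetric, transitive sameAs plus URI substitution."""
    classes = equivalence_classes(triples)
    closure = set()
    # sameAs holds between every pair of distinct members of a class.
    for members in classes.values():
        closure |= {(a, SAME_AS, b) for a in members for b in members if a != b}
    # Every other statement is replicated for all equivalent URIs.
    for s, p, o in triples:
        if p != SAME_AS:
            closure |= set(product(classes.get(s, {s}),
                                   classes.get(p, {p}),
                                   classes.get(o, {o})))
    return closure


inferred = expand(explicit) - explicit
print(len(inferred))  # 10
```

The ten statements split evenly: five extra owl:sameAs statements (one for the Sofia pair, four for the three Bulgaria URIs) and five extra geo-ont:parentFeature statements (2 x 3 combinations minus the one asserted).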
  • 40-41. The Honey and the Sting of owl:sameAs
    [Figure, shown on two consecutive slides: graph with nodes E11, E12, E21, E22, E23 for the URIs geonames:727011, dbpedia:Sofia, geonames:732800, dbpedia:Bulgaria and opencyc-en:Bulgaria, connected by geo-ont:parentFeature]
  • 42. owl:sameAs Optimization
    o GraphDB features an optimization of owl:sameAs
      ü It can use a single master-node in its indices to represent a class of sameAs-equivalent URIs
    o Avoids inflating the indices with multiple equivalent statements
      ü Imagine a statement that has 5 sameAs-equivalents of its subject, 2 of its predicate and 3 of its object. Such a statement would have 5 × 2 × 3 = 30 replicas in the indices after forward-chaining if this optimization were not used
    o Helps present compact query results
      ü The owl:sameAs equivalence can multiply the variable bindings during query evaluation with both forward- and backward-chaining. This expands the result set with rows that differ only by referring to different sameAs-equivalent URIs
      ü Optionally, query results can be expanded, as if there were no optimization
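The master-node idea can be sketched in Python with a union-find structure. This is an illustration under assumptions, not GraphDB's index design: the `SameAsIndex` class, its methods and the `s1`, `p1`, `o1` names are all hypothetical.

```python
from itertools import product


class SameAsIndex:
    """Stores each statement once, keyed by one master URI per sameAs class."""

    def __init__(self):
        self.parent = {}         # union-find forest over URIs
        self.statements = set()  # statements over master URIs only

    def find(self, uri):
        """Return the master URI of the equivalence class of uri."""
        self.parent.setdefault(uri, uri)
        while self.parent[uri] != uri:
            self.parent[uri] = self.parent[self.parent[uri]]  # path halving
            uri = self.parent[uri]
        return uri

    def same_as(self, a, b):
        """Merge two classes and re-point stored statements to the master."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a
            self.statements = {tuple(self.find(t) for t in stmt)
                               for stmt in self.statements}

    def add(self, s, p, o):
        self.statements.add((self.find(s), self.find(p), self.find(o)))

    def members(self, uri):
        root = self.find(uri)
        return {u for u in list(self.parent) if self.find(u) == root}

    def expanded(self):
        """Optional expansion, as if the optimization were not used."""
        return {t for s, p, o in self.statements
                for t in product(self.members(s), self.members(p), self.members(o))}


index = SameAsIndex()
for i in range(2, 6):
    index.same_as("s1", f"s{i}")   # 5 equivalent subjects
index.same_as("p1", "p2")          # 2 equivalent predicates
for i in range(2, 4):
    index.same_as("o1", f"o{i}")   # 3 equivalent objects
index.add("s3", "p2", "o1")

print(len(index.statements))  # 1
print(len(index.expanded()))  # 30
```

The index holds one statement, while the optional expansion yields the 30 replicas from the slide's example. Keeping the expansion as a separate step corresponds to the option of returning either compact or expanded query results.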
  • 43. Questions? Experience the technology with our demonstrators
    o FactForge: Knowledge graph of linked open data and news about People and Organizations http://factforge.net
    o RANK: News popularity ranking for companies http://rank.ontotext.com
    o NOW: Semantic News Portal http://now.ontotext.com