From metadata to Knowledge Graph
Miel Vander Sande - MMC Seminar 2023
Who is meemoo?
Drivers for a new metadata roadmap
Knowledge Graph-based infrastructure
Modelling a heterogeneous archive
Lessons learned & way forward
At meemoo we’re here for the archive. We help cultural, media and government organisations with advice and practical support, and want to make archival materials accessible and usable.
Service provision
Digitisation, digital archiving and management of archival materials
Make content accessible and usable
Actively gather and share expertise on digital archive operations
Advise on digital heritage processes
Collaboration
These figures are from 31 December 2022.
172 content partners in culture, media and government
Content partners in different sectors
performing arts: 50
museums: 45
archives: 24
heritage societies: 19
government institutions: 12
regional broadcasters: 10
heritage libraries: 7
national broadcasters: 3
sector institutes: 2
Exploration via interactive platforms
hetarchief.be
The Archive for Education
Enabling partners
Partners disseminate using our APIs
In figures
nearly 170,000 user accounts at The Archive for Education at end of 2021-2022 academic year
> 540,000 audiovisual carriers transferred to our archive system
> 6 million objects in our archive system
All these figures except for education are from 31 December 2022.
Metadata is key in all processes
Diagnostics & operations: for finding out what went wrong or where things are at
Preservation & digitization: such as digital format deprecation and AV carrier characteristics for inventory
Search & exploration: by platform users, but also internal
Drivers for a new metadata roadmap
MAM-centered infrastructure
[Diagram: the Media Asset Management System at the centre. Labels: content partners (CRS, DAMS, …) with metadata & media; CRM, internal tools and other data sources; data models A, B, C, D and E-Z; OAI-PMH, REST API, GraphQL and search; applications of content partners and internal tools; platforms The Archive for Education, hetarchief.be, News of the great war, Catalogus Pro and Art in Flanders.]
Metadata integration was implicit
Implicit, but demanding role as metadata integrator
It works, but…
Our metadata practice had become outdated and was reaching its limits
too many domains with specific needs
a one-size-fits-all data model cannot deal with the data heterogeneity
The metadata (model) was underspecified
no clear definitions, labels or documentation of concepts and properties
the lack of a shared terminology leads to miscommunication
And we still have plans
Adding new analog carriers or new media (e.g. 3D objects, glass plates)
Catch-up process with (new) content partners and with AI / machine learning
Speech-to-text, face recognition, and named-entity recognition
Connecting to external sources (e.g. Wikidata), or standardized vocabularies, controlled lists, thesauri, or taxonomies (e.g. GTAA, VIAF)
Provide extra useful services on and with metadata (e.g. IIIF, ...)
Roadmap: Five ambitious horizons
1. Measuring and validating the quality of metadata (2020-...: Framework for data quality assessment)
2. Thorough revision of the metadata model (2021: Data models)
3. Tackling data integration with suitable fundamental infrastructure (2022 - 2023: Knowledge Graph)
4. Creating new ways for inflow and outflow (2023 - …: Access and use Knowledge Graph)
5. Active collaboration with and about metadata (2023 - ...: Linked Data: external sources and partners)
Knowledge Graph-based infrastructure
[Diagram, built up over three slides. Sources: content partners, Media Asset Management System, CRM, internal tools and other data sources, supplying media and metadata. Middle layer: integration and metadata management, in the final slide labelled Knowledge Graph, offering universal, application-independent access to (meta)data. Interfaces: OAI-PMH, REST API, GraphQL, ... and IIIF 3.0. Interaction (platforms): applications of content partners, internal tools and meemoo’s interactive platforms.]
Knowledge Graph?
Archive metadata are represented and queried as nodes connected by edges
Intuitive navigation
Supports discovery by exploration
Flexible data structure & schema, but data semantics, schema and constraints are still essential!
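The idea of metadata as nodes connected by edges can be sketched in a few lines of Python. This is only an illustration: the node names are taken from the example on this slide, while the edge labels (`publishedBy`, `suitableFor`, `sameAs`) and the `wikidata:VRT` identifier are hypothetical and not part of meemoo's model.

```python
from collections import defaultdict

# Hypothetical edges between the example nodes named on the slide.
# The edge labels are illustrative assumptions, not meemoo's data model.
EDGES = [
    ("Newsitem 25/05", "publishedBy", "VRT"),
    ("Newsitem 25/05", "suitableFor", "2nd grade English"),
    ("VRT", "sameAs", "wikidata:VRT"),
]


def build_index(edges):
    """Index outgoing edges per node so that navigation is a dict lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in edges:
        index[subject].append((predicate, obj))
    return index


def explore(index, start, depth=2):
    """Return every node reachable from `start` within `depth` hops."""
    seen = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for _predicate, target in index.get(node, []):
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return seen - {start}


index = build_index(EDGES)
reachable = explore(index, "Newsitem 25/05")
```

Exploration here is just following edges outward from a starting node. Nothing in the structure says which edge labels are allowed or what they mean, which is why semantics, schema and constraints remain necessary.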
[Figure: example graph with nodes VRT, Newsitem 25/05, 2nd grade English and Wikidata]
Make metadata accessible
just the browser: everything we know about ...
[Diagram, repeated over three slides. Sources: content partners, Media Asset Management System, CRM, internal tools and other data sources, supplying media and metadata. Middle layer: Knowledge Graph for integration and metadata management, with universal, application-independent access to (meta)data. Interfaces: OAI-PMH, REST API, GraphQL, ... and IIIF 3.0. Interaction (platforms): applications of content partners, internal tools and meemoo’s interactive platforms. Components are annotated as general purpose, single purpose or multi purpose.]
User needs: presentation, focused, simple, no surprises
Data needs: flexibility, semantics, context, relationships, expressive data models and querying
Application needs: performance (caching), developer-friendly, interoperable
Modelling a heterogeneous archive
Metadata modelling methodology
1. Knowledge capture
Business working groups → relevant business questions
Existing models, (meta)data, documentation and (functional) analyses
External standards (DC, EBU Core, PREMIS, CIDOC)
Other pain points, wishes & use cases
2. Knowledge implementation
Thematic working group (per domain)
Diagram
Formalise & document
Proof-of-concept
Knowledge inventory
Specifications
Open problems
3. Model evaluation
External working groups (partners)
Test business questions & intake procedures
Vocabulary & schema and/or thesauri & lists of terms
Network of domain models and thesauri
Thesaurus: Flemish school system
https://w3id.org/onderwijs-vlaanderen/id/structuur/
Data model: Archive objects
https://developer.meemoo.be/docs/metadata/knowledge-graph/0.0.1/object/en/
Basic object structure (PREMIS OWL)
metadata about the content, e.g. a film
metadata about the reproduction, e.g. archive master
metadata about the carrier or physical (art)work, e.g. the nitrate film
technical metadata, e.g. the .mov
Example of object: “Titanic”
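The four layers above can be sketched as plain Python data classes. This is a minimal illustration, not meemoo's actual model: `IntellectualEntity`, `Representation` and `File` borrow the names of PREMIS concepts, `Carrier` is a hypothetical class name for the physical layer, and the values for "Titanic" (including the file name `titanic.mov`) are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class File:
    """Technical metadata about a digital file, e.g. the .mov."""
    filename: str


@dataclass
class Representation:
    """Metadata about a reproduction, e.g. the archive master."""
    label: str
    files: List[File] = field(default_factory=list)


@dataclass
class Carrier:
    """Metadata about the carrier or physical (art)work (hypothetical class name)."""
    label: str


@dataclass
class IntellectualEntity:
    """Metadata about the content itself, e.g. a film."""
    title: str
    representations: List[Representation] = field(default_factory=list)
    carriers: List[Carrier] = field(default_factory=list)


def describe(entity: IntellectualEntity) -> List[str]:
    """List every layer of an object, indented by level."""
    lines = [f"content: {entity.title}"]
    for representation in entity.representations:
        lines.append(f"  reproduction: {representation.label}")
        for file in representation.files:
            lines.append(f"    file: {file.filename}")
    for carrier in entity.carriers:
        lines.append(f"  carrier: {carrier.label}")
    return lines


# Illustrative values only; the slide does not list the actual metadata.
titanic = IntellectualEntity(
    title="Titanic",
    representations=[Representation("archive master", [File("titanic.mov")])],
    carriers=[Carrier("nitrate film")],
)
layers = describe(titanic)
```

Keeping the layers separate means a description of the film is stated once, however many reproductions, files or carriers exist for it.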
Takeaways & way forward
Composing a good data toolchain
Public procurement procedure to purchase a graph storage and RDF mapping solution: TriplyDB
Adopting & contributing to open source tooling
Workflow and ETL automation with Prefect
GraphQL over SPARQL framework GRASP
SKOS editing tool manager, possibly Atramhasis
many smaller tools and libraries
Custom tooling: shacl2md to generate datamodel documentation
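To give a feel for what "GraphQL over SPARQL" means, the sketch below turns a flat field selection into a SPARQL query string. It is not how GRASP works internally: the function `selection_to_sparql` and the field-to-predicate mapping are hypothetical, and only the Dublin Core terms namespace is a real vocabulary.

```python
from typing import Dict, List

# Hypothetical mapping from API field names to RDF predicates.
# The dcterms vocabulary is real; the choice of fields is an assumption.
FIELD_TO_PREDICATE: Dict[str, str] = {
    "title": "dcterms:title",
    "created": "dcterms:created",
    "creator": "dcterms:creator",
}

PREFIXES = "PREFIX dcterms: <http://purl.org/dc/terms/>"


def selection_to_sparql(fields: List[str], limit: int = 10) -> str:
    """Translate a flat field selection into a SPARQL SELECT query."""
    unknown = [name for name in fields if name not in FIELD_TO_PREDICATE]
    if unknown:
        raise KeyError(f"no predicate mapping for: {unknown}")
    variables = " ".join(f"?{name}" for name in fields)
    # OPTIONAL keeps objects in the result even when a field is missing.
    patterns = "\n".join(
        f"  OPTIONAL {{ ?object {FIELD_TO_PREDICATE[name]} ?{name} . }}"
        for name in fields
    )
    return (
        f"{PREFIXES}\n"
        f"SELECT ?object {variables} WHERE {{\n"
        f"  ?object a ?type .\n"
        f"{patterns}\n"
        f"}} LIMIT {limit}"
    )


query = selection_to_sparql(["title", "created"])
```

The application developer only sees field names such as `title`, while the graph keeps its full vocabulary, which matches the split between application needs and data needs described earlier.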
Invest in good data management
It takes time, effort, budget and know-how to do this right
Data modelling is a lost art, but it is still essential
Figuring out the shape and meaning of things pays off
Try new data technologies: metadata is a graph
Right base for data integration, unknown use cases and sparse data
The RDF ecosystem gives a head start, also in AV archiving
PREMIS, ODRL, EBUCore, SKOS, … are powerful, especially combined
Current state and the way forward
What have we done?
Inventory of existing data and knowledge domains
Creating and formalizing new data models
Setting up an RDF Knowledge Graph platform
What are we working on?
Mapping between the archive data and the data models
Implementing ETLs to generate RDF data
Developing a GraphQL framework and building APIs
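The mapping and ETL steps listed above can be illustrated with a small sketch that turns one source record into N-Triples lines. Everything specific here is an assumption made for the example: the base IRI `https://example.org/object/`, the field mapping and the sample record are hypothetical, and a real pipeline would more likely use a dedicated RDF mapping tool.

```python
from typing import Dict, Iterator

# Hypothetical base IRI and field-to-predicate mapping, for illustration only.
BASE = "https://example.org/object/"
MAPPING: Dict[str, str] = {
    "title": "http://purl.org/dc/terms/title",
    "description": "http://purl.org/dc/terms/description",
}


def escape_literal(value: str) -> str:
    """Escape the characters that must be escaped in N-Triples string literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record: Dict[str, str]) -> Iterator[str]:
    """Yield one N-Triples line per mapped, non-empty field of a record."""
    subject = f"<{BASE}{record['id']}>"
    for field_name, predicate in MAPPING.items():
        value = record.get(field_name)
        if value:  # sparse data: missing fields simply produce no triple
            yield f'{subject} <{predicate}> "{escape_literal(value)}" .'


# Invented sample record; only the title is filled in.
record = {"id": "abc123", "title": 'The "Titanic" film'}
triples = list(record_to_ntriples(record))
```

Because absent fields produce no triple at all, sparse records need no placeholder values, which is one reason a graph suits heterogeneous archive data.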
From metadata to Knowledge Graph
Miel Vander Sande - @Miel_vds

What To Do For World Nature Conservation Day by Slidesgo.pptx
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 

20230525_mmc_seminar.pdf

  • 1. From metadata to Knowledge Graph Miel Vander Sande - MMC Seminar 2023
  • 2. From metadata to Knowledge Graph: Who is meemoo? / Drivers for a new metadata roadmap / Knowledge Graph-based infrastructure / Modelling a heterogeneous archive / Lessons learned & way forward
  • 3. At meemoo we’re here for the archive. We help cultural, media and government organisations with advice and practical support, and want to make archival materials accessible and usable.
  • 4. Service provision: digitisation, digital archiving and management of archival materials; make content accessible and usable; actively gather and share expertise on digital archive operations; advise on digital heritage processes.
  • 5. Collaboration: 172 content partners in culture, media and government (counts shown on the slide figure: 28, 34, 25, 20, 55, 10). These figures are from 31 December 2022.
  • 6. Content partners in different sectors: performing arts (50), museums (45), archives (24), heritage societies (19), regional broadcasters (10), government institutions (12), heritage libraries (7), national broadcasters (3), sector institutes (2). These figures are from 31 December 2022.
  • 10. In figures: nearly 170,000 user accounts at The Archive for Education at end of 2021-2022 academic year; > 540,000 audiovisual carriers transferred to our archive system; > 6 million objects in our archive system. All these figures except for education are from 31 December 2022.
  • 11. Metadata is key in all processes. Diagnostics & operations: for finding out what went wrong or where things are. Preservation & digitization: such as digital format deprecation and AV carrier characteristics for inventory. Search & exploration: by platform users, but also internal.
  • 12. From metadata to Knowledge Graph: Who is meemoo? / Drivers for a new metadata roadmap / Knowledge Graph-based infrastructure / Modelling a heterogeneous archive / Takeaways & way forward
  • 13. MAM-centered infrastructure. Diagram labels: content partners (CRS, DAMS, …) supplying metadata & media; Media Asset Management System; CRM; internal tools; other data sources; data models A, B, C, D and E-Z; platforms (The Archive for Education, hetarchief.be, News of the Great War, Catalogus Pro, Art in Flanders); applications of content partners; internal tools; OAI-PMH, REST API, GraphQL, search. Metadata integration was implicit: an implicit, but demanding role as metadata integrator.
  • 14. It works, but… Our metadata practice had become outdated and was reaching its limits: too many domains with specific needs; a one-size-fits-all data model cannot deal with the data heterogeneity. The metadata (model) was underspecified: no clear definitions, labels or documentation of concepts and properties; the lack of a shared terminology leads to miscommunication.
  • 15. And we still have plans: adding new analog carriers or new media (e.g., 3D objects, glass plates); catch-up process with (new) content partners and with AI / machine learning (speech-to-text, face recognition, and named-entity recognition); connecting to external sources (e.g. Wikidata), or standardized vocabularies, controlled lists, thesauri, or taxonomies (e.g. GTAA, VIAF); providing extra useful services on and with metadata (e.g. IIIF, ...).
  • 16. Roadmap: Five ambitious horizons. Horizons: (1) measuring and validating the quality of metadata; (2) thorough revision of the metadata model; (3) tackling data integration with suitable fundamental infrastructure; (4) creating new ways for inflow and outflow; (5) active collaboration with and about metadata. Timeline: 2020-...: framework for data quality assessment; 2021: data models; 2022 - 2023: Knowledge Graph; 2023 - …: access and use Knowledge Graph; 2023 - ...: Linked Data: external sources and partners.
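The slides do not describe how metadata quality is measured. As a minimal sketch of one common metric, assuming flat records and a hypothetical list of required fields, per-field completeness could be computed like this:

```python
def completeness(records, required_fields):
    """Return, per required field, the fraction of records with a non-empty value.

    `records` is a list of dicts; both the record layout and the choice of
    required fields are illustrative assumptions, not meemoo's framework.
    """
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in required_fields}
    return {
        name: sum(1 for r in records if r.get(name) not in (None, "")) / total
        for name in required_fields
    }


# Hypothetical sample records with gaps.
SAMPLE = [
    {"title": "a", "date": "2022"},
    {"title": "b", "date": ""},
    {"title": "", "date": None},
    {"title": "d"},
]
scores = completeness(SAMPLE, ["title", "date"])
```

A score per field makes it easy to see which parts of a heterogeneous archive are sparsely described before any model revision starts.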
  • 17. From metadata to Knowledge Graph: Who is meemoo? / Drivers for a new metadata roadmap / Knowledge Graph-based infrastructure / Modelling a heterogeneous archive / Takeaways & way forward
  • 18. Architecture diagram, three layers. Sources: content partners, Media Asset Management System, CRM, internal tools, other data sources (each supplying metadata). Integration: OAI-PMH, REST API, GraphQL. Interaction (platforms): applications of content partners, internal tools, meemoo’s interactive platforms.
  • 19. Architecture diagram, four layers. Sources: content partners, Media Asset Management System, CRM, internal tools, other data sources (supplying media and metadata). Metadata management: universal, application-independent access to (meta)data. Integration: OAI-PMH, REST API, GraphQL, IIIF 3.0, ... Interaction (platforms): applications of content partners, internal tools, meemoo’s interactive platforms.
  • 20. Architecture diagram, four layers. Sources: content partners, Media Asset Management System, CRM, internal tools, other data sources (supplying media and metadata). Metadata management: Knowledge Graph, with universal, application-independent access to (meta)data. Integration: OAI-PMH, REST API, GraphQL, IIIF 3.0, ... Interaction (platforms): applications of content partners, internal tools, meemoo’s interactive platforms.
  • 21. Knowledge Graph? Archive metadata are represented and queried as nodes connected by edges. Intuitive navigation. Supports discovery by exploration. Flexible data structure & schema, but data semantics, schema and constraints are still essential! Example nodes in the figure: VRT, Newsitem, 25/05, 2nd grade, English, wikidata.
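To make "nodes connected by edges" concrete, here is a minimal sketch that stores metadata as subject-predicate-object triples and explores the graph hop by hop. The identifiers and predicates are hypothetical, loosely echoing the labels in the slide figure; they are not meemoo's actual data.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object).
TRIPLES = [
    ("item:news-1", "type", "Newsitem"),
    ("item:news-1", "publisher", "org:VRT"),
    ("item:news-1", "language", "lang:English"),
    ("item:news-1", "sameAs", "wikidata:Q-example"),
    ("org:VRT", "type", "Organisation"),
]


def build_index(triples):
    """Index outgoing edges per node: node -> list of (predicate, object)."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def neighbours(index, node, depth=1):
    """Return every node reachable from `node` in at most `depth` hops."""
    seen = {node}
    frontier = [node]
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            for _, obj in index.get(current, []):
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    seen.discard(node)
    return seen


INDEX = build_index(TRIPLES)
one_hop = neighbours(INDEX, "item:news-1")
two_hops = neighbours(INDEX, "item:news-1", depth=2)
```

Widening `depth` is the programmatic counterpart of discovery by exploration: each hop follows edges to whatever is connected, without a fixed table layout.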
  • 22. Make metadata accessible. Just the browser: everything we know about ...
  • 23. Same architecture diagram as slide 20 (sources, metadata management with the Knowledge Graph, integration, interaction), annotated with "General purpose" and "Single purpose".
  • 24. Same architecture diagram as slide 20 (sources, metadata management with the Knowledge Graph, integration, interaction), annotated with "General purpose", "Single purpose" and "Multi purpose".
  • 25. Same architecture diagram as slide 20 (sources, metadata management with the Knowledge Graph, integration, interaction), annotated with three kinds of needs. User needs: presentation, focused, simple, no surprises. Data needs: flexibility, semantics, context, relationships, expressive data models and querying. Application needs: performance (caching), developer-friendly, interoperable.
  • 26. From metadata to Knowledge Graph: Who is meemoo? / Drivers for a new metadata roadmap / Knowledge Graph-based infrastructure / Modelling a heterogeneous archive / Takeaways & way forward
  • 27. Metadata modelling methodology. 1. Knowledge capture: business working groups → relevant business questions; existing models, (meta)data, documentation and (functional) analyses; external standards (DC, EBU Core, PREMIS, CIDOC); other pain points, wishes & use cases. 2. Knowledge implementation: thematic working group (per domain); diagram; formalise & document; proof-of-concept. Knowledge inventory; specifications; open problems. 3. Model evaluation: external working groups (partners); test business questions & intake procedures; vocabulary & schema and/or thesauri & lists of terms.
  • 28. Network of domain models and thesauri
  • 31. Basic object structure (PREMIS OWL): metadata about the content (e.g. a film); metadata about the reproduction (e.g. the archive master); metadata about the carrier or physical (art)work (e.g. the nitrate film); technical metadata (e.g. the .mov).
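The four levels on this slide can be sketched as nested records. This is only an illustration: the class names borrow loosely from PREMIS terminology, while the fields and the way a carrier is attached to a reproduction are assumptions, not the actual meemoo model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    """Technical metadata, e.g. the .mov file."""
    filename: str
    media_format: str


@dataclass
class Carrier:
    """The carrier or physical (art)work, e.g. the nitrate film."""
    description: str


@dataclass
class Representation:
    """A reproduction of the content, e.g. the archive master."""
    label: str
    files: List[File] = field(default_factory=list)
    carrier: Optional[Carrier] = None  # assumed link; not stated on the slide


@dataclass
class IntellectualEntity:
    """The content itself, e.g. a film."""
    title: str
    representations: List[Representation] = field(default_factory=list)

    def all_files(self) -> List[File]:
        """Collect the files of every representation of this entity."""
        return [f for rep in self.representations for f in rep.files]


# Hypothetical example values.
film = IntellectualEntity(
    title="Example film",
    representations=[
        Representation(
            label="archive master",
            files=[File("master.mov", "video/quicktime")],
            carrier=Carrier("nitrate film"),
        )
    ],
)
```

Keeping the levels separate means descriptive metadata is stated once on the content, while each reproduction and file carries only its own technical details.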
  • 32. Example of object: “Titanic”
  • 33. From metadata to Knowledge Graph: Who is meemoo? / Drivers for a new metadata roadmap / Knowledge Graph-based infrastructure / Modelling a heterogeneous archive / Takeaways & way forward
  • 34. Composing a good data toolchain. Public procurement procedure to purchase a graph storage and RDF mapping solution: TriplyDB. Adopting & contributing to open source tooling: workflow and ETL automation with Prefect; GraphQL over SPARQL framework GRASP; SKOS editing tool manager, possibly Atramhasis; many smaller tools and libraries. Custom tooling: shacl2md to generate data model documentation.
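The slides do not show how GraphQL is served over SPARQL. As a rough illustration of the general idea only (not GRASP's actual approach or API), a flat field selection can be translated into a SPARQL SELECT query. The function name, class IRI and field mapping below are hypothetical.

```python
def select_query(rdf_class, fields, limit=10):
    """Build a SPARQL SELECT for one class and a flat set of fields.

    `fields` maps a result variable name to a predicate IRI. Each field is
    wrapped in OPTIONAL so resources with sparse metadata are still returned.
    """
    variables = " ".join(f"?{name}" for name in fields)
    patterns = "\n".join(
        f"  OPTIONAL {{ ?s <{iri}> ?{name} . }}" for name, iri in fields.items()
    )
    return (
        f"SELECT ?s {variables}\n"
        f"WHERE {{\n"
        f"  ?s a <{rdf_class}> .\n"
        f"{patterns}\n"
        f"}}\n"
        f"LIMIT {limit}"
    )


# Hypothetical class IRI; the predicate is the Dublin Core Terms title.
query = select_query(
    "https://example.org/Film",
    {"title": "http://purl.org/dc/terms/title"},
    limit=5,
)
```

The resulting string could be sent to any SPARQL endpoint; a real framework would also need to handle nested selections, arguments and paging.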
  • 35. Invest in good data management: it takes time, effort, budget and know-how to do this right. Data modelling is a lost art, but it is still essential: figuring out the shape and meaning of things pays off. Try new data technologies: metadata is a graph, the right base for data integration, unknown use cases and sparse data. The RDF ecosystem gives a head start, also in AV archiving: PREMIS, ODRL, EBUCore, SKOS, … are powerful, especially combined.
  • 36. Current state and the way forward. What have we done? Inventory of existing data and knowledge domains; creating and formalizing new data models; setting up an RDF Knowledge Graph platform. What are we working on? Mapping between the archive data and the data models; implementing ETLs to generate RDF data; developing a GraphQL framework and building APIs.
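A mapping-plus-ETL step of the kind listed here can be sketched as a function that turns one flat source record into N-Triples lines. The base IRI, the field-to-predicate mapping and the record layout are assumptions for illustration; the slides do not show the actual mappings, and the record identifier is assumed to be safe to embed in an IRI.

```python
# Hypothetical mapping from source field names to predicate IRIs.
MAPPING = {
    "title": "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
}
BASE = "https://example.org/id/"  # hypothetical base IRI


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record, id_field="id", mapping=MAPPING, base=BASE):
    """Map one flat record to a list of N-Triples lines with plain literals."""
    subject = f"<{base}{record[id_field]}>"
    lines = []
    for key, predicate in mapping.items():
        value = record.get(key)
        if value is None or value == "":
            continue  # sparse data: emit no triple for a missing field
        lines.append(f'{subject} <{predicate}> "{escape_literal(str(value))}" .')
    return lines


triples = record_to_ntriples({"id": "42", "title": 'A "quoted" title', "creator": ""})
```

Because missing fields simply produce no triple, sparse source records need no placeholder values, which is one reason a graph suits heterogeneous archive data.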
  • 37. From metadata to Knowledge Graph Miel Vander Sande - @Miel_vds