2. Objectives
By the end of this session, you’ll
• Have an understanding of the basic principles and
terminology of:
– Semantic Web
– Linked data
– Library data in semantic web space
• BIBFRAME
• RDA
• FRBR
robin fay @georgiawebgurl 2013
3. The web as we know it (and
think of it) links together
documents
(html, pdf, dynamic
documents created from
databases, etc.)
4. In brief:
Types of metadata:
Descriptive
Structural
Administrative
• Many forms of metadata include elements of each of these;
however, which elements appear depends upon the schema.
• A schema is a set of rules covering the elements and
requirements for coding. Examples of common schemas in the
library world include Dublin Core, TEI, EAD, and others. Examples
of schemas in the semantic web include Dublin Core, FOAF
(Friend of a Friend), and many others.
5. •Much of library metadata is highly structured and created by trained
professionals. In the library world, MARC has been a long-term
standard. While it can be rigid, its structured nature can make it
easier to crosswalk and harvest into other databases.
•SEO (Search Engine Optimization) is a common term in the web
world; SEO practitioners assign descriptive and administrative (usually
copyright) metadata to websites, and their goal is generally a higher
ranking in search results. Given that search engine algorithms change
regularly, SEO is a highly dynamic field, which can lead to inconsistencies
in metadata application, making it harder for databases and search
engines to harvest.
•In a nutshell, most library metadata has rules and standards;
metadata in the web world is often (but not always) more flexible.
The Semantic Web will need to manage (and make sense of!) all of
these types of metadata.
6. •At its core, the semantic web comprises:
o a set of design principles,
o collaborative working groups,
o and a variety of enabling technologies.
•Some elements of the semantic web are expressed as
prospective future possibilities that are yet to be
implemented or realized
AND
•Other elements of the semantic web are expressed in
formal specifications -- (Wikipedia, 2009)
Robin Fay, robinfay.net 2009/10
7. •Semantic web and web metadata frequently come from outside
of the library community, working in parallel with it or, sometimes,
at odds with it.
•Metadata in libraries takes a wide variety of forms; one of the
most common metadata schemas is MARC.
•MARC is formatted using ISBD punctuation; the content of
what goes into a record is controlled by our cataloging rules
(such as RDA). RDA can be applied using different metadata
schemas – although for now, many libraries are still in a
MARC based world.
9. •RDF = Resource Description
Framework
•RDFS = Resource Description
Framework Schema
•OWL = Web Ontology
Language
•URI = Uniform Resource
Identifier - think unique
identifier, such as a URL
Many terms associated with the Semantic Web are used in,
or based upon, the information architecture, database,
information science, and library science fields:
controlled vocabularies, structural elements, etc.
10. RDF = Resource Description Framework
• is a general-purpose language for representing information in the
Web (a metadata data model)
• is a W3C specification
• is a conceptual description
• is based upon making statements about web resources (triples)
• Often serialized as XML (RDF/XML), though RDF itself is a data model, not XML
• We can express RDA in RDF
• Think sentence structure:
• subject - predicate (verb) - object
• My dog eats dogfood.
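The sentence structure above can be sketched in a few lines of Python. This is only an illustration of the idea, not an RDF library: the `Triple` type, the `http://example.org/` URIs, and the dog's name are all invented for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate (verb), object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs; example.org is reserved for illustrations.
EX = "http://example.org/"

# "My dog eats dogfood." as a single statement.
statement = Triple(EX + "my-dog", EX + "eats", EX + "dogfood")

# A set of statements forms a small graph. The second statement is made up.
graph = {
    statement,
    Triple(EX + "my-dog", EX + "name", "Rex"),
}

for subject, predicate, obj in sorted(graph):
    print(subject, predicate, obj)
```

Using identifiers rather than plain words for the subject and predicate is what lets someone else's statements about the same dog connect to ours.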
11. So, we have the framework, but how do we apply it?
RDFS = Resource Description Framework Schema
o A schema is:
an outline: a schematic or preliminary plan
a structure described in a formal language supported by
the database management system; in a relational database
(such as MySQL), the schema defines the tables, the fields in
each table, and the relationships between fields and tables
a description of the structure and rules a document must
satisfy for an XML document type
http://tinyurl.com/yj442vr (define: schema -- google)
Dublin Core is a schema
13. OWL = Web Ontology
Language
•designed for defining and linking ontologies, which
are (roughly) classification systems
•Attempts to define objects and their
relationships
•Different “flavors”
•“interpreted as a set of "individuals"
and a set of "property assertions"
which relate these individuals to
each other” (wikipedia 2009)
•Not a requirement
•Sounds familiar to catalogers, right?
15. Because this is data driven, we can query it using SPARQL, a standard query language.
We’ll talk more about SPARQL later…
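As a rough preview of what a query does, here is a plain-Python sketch of matching a pattern against triples. It is not SPARQL and not a real triple store: the `match` helper and the `book:`, `dc:`, and `person:` shorthand values are assumptions made for the example.

```python
# Toy triple store: (subject, predicate, object) tuples with made-up identifiers.
graph = [
    ("book:dracula", "dc:creator", "person:stoker"),
    ("book:dracula", "dc:title", "Dracula"),
    ("book:lair", "dc:creator", "person:stoker"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts like a query variable."""
    return [
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Roughly the idea behind: SELECT ?book WHERE { ?book dc:creator person:stoker }
books = [s for s, _, _ in match(graph, p="dc:creator", o="person:stoker")]
print(books)
```

The point is that the question is asked of the statements themselves, not of free text.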
16. •Linked data is: “about using the Web to
connect related data that wasn't
previously linked, or using the Web to
lower the barriers to linking data.”
•Think: related records, series records, authority
files
•Libraries already link data.
•Projects such as the NYT Linked Open
Data project and the Virtual International Authority
File (VIAF) project are sources of controlled
vocabularies.
•Verified digital identity accounts
such as OpenID and ClaimID can help
differentiate names
17. • What are linked data and open data?
o Linked data is about reusing data
o We already do some linked data in our library
catalogs and even in our daily lives
o The link in a bibliographic record (like an authority
record link) is linking data behavior
o A link that we share with our friends on Facebook is
linked data (of sorts)
• Linked data is a link to a record/data/content
that can then be utilized in some way
• Open data is data that is available to be used in
some way with no barriers to access (licensing,
etc.)
18. Basic principles of linked data
It keeps us from having to re-enter or copy information
– Making our data:
• reusable
• easy to correct (correct one record instead of multiples)
• efficient
• and potentially useful to others
It can build relationships in different ways, allowing us to create
temporary collections (a user could organize their search results in a
way that makes sense to them) or more permanent ones (collocating ALL
works by a particular author more easily; pulling together photographs
more easily)
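The "correct one record instead of multiples" idea can be sketched as follows. This is a simplified illustration, not how any particular ILS stores data: the `auth:1` identifier, the dictionary layout, and the `display` helper are assumptions made for the example.

```python
# Authority file: one record per name, keyed by a made-up identifier.
authorities = {"auth:1": {"name": "Stoker, Bram"}}

# Bibliographic records link to the authority record instead of copying the name.
bibs = [
    {"title": "Dracula", "creator": "auth:1"},
    {"title": "The Lair of the White Worm", "creator": "auth:1"},
]


def display(bib):
    """Resolve the link at display time, so the name is never re-entered."""
    return f'{bib["title"]} / {authorities[bib["creator"]]["name"]}'


# Correct the heading once; every linked record picks up the change.
authorities["auth:1"]["name"] = "Stoker, Bram, 1847-1912"

for bib in bibs:
    print(display(bib))
```

Both records show the corrected heading even though only the authority record was edited.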
19. • Advantages (reusable data, potential to provide
and build relationships, discoverability)
• How library data fits into linked data
o FRBR (a bibliographic FRAMEWORK which is more
semantic by nature); RDA (metadata rules which
are not tied to an encoding format such as
MARC but can work with web standards
like XML and RDF); IRs, and CMS like Drupal which have
semantic web capabilities
• RDA expressed as RDFa
21. Tim Berners-Lee’s Four Rules
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those
names
3. When someone looks up a URI, provide useful
information, using the standards (RDF, SPARQL)
4. Include links to other URIs, so they can discover
more things
URI = Uniform Resource Identifier
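The four rules can be acted out with a small in-memory stand-in for the web. This is a sketch under stated assumptions: the `WEB` dictionary, the `look_up` and `discover` helpers, and the `http://example.org/` URIs are hypothetical, and nothing is actually fetched over the network.

```python
# Stand-in for the web: each HTTP URI (rules 1 and 2) names one thing.
# All URIs are hypothetical and nothing is fetched over the network.
WEB = {
    "http://example.org/person/stoker": {
        "label": "Stoker, Bram",
        "links": ["http://example.org/work/dracula"],
    },
    "http://example.org/work/dracula": {
        "label": "Dracula",
        "links": ["http://example.org/person/stoker"],
    },
}


def look_up(uri):
    """Rule 3: looking up a URI returns useful information about the thing."""
    return WEB[uri]


def discover(start):
    """Rule 4: follow links to other URIs to discover more things."""
    seen, todo = [], [start]
    while todo:
        uri = todo.pop()
        if uri not in seen:
            seen.append(uri)
            todo.extend(look_up(uri)["links"])
    return seen


found = discover("http://example.org/person/stoker")
print([look_up(uri)["label"] for uri in found])
```

Starting from the person, the links lead to the work, which is the "discover more things" part of rule 4.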
22. What can linked data do for libraries?
• URIs create methods for classifying that can be used
(linked to!) by others
• Library of Congress has released LCSH as linked data, and
OCLC has a modified version of LCSH called FAST as
linked data
• Linked Data is flexible enough to express
entity-relationship models such as FRBR/FRAD
• Different databases (ILS, ERMS, IRs, local databases, etc.)
can share data, which potentially means more consistent
data, collocation across resources, and users who can
easily find resources regardless of source
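Collocation across databases can be sketched like this. The two record lists, the subject URI, and the thesis title are invented for the example; real systems would exchange this data in their own formats.

```python
from collections import defaultdict

# Records from two hypothetical systems, both pointing at the same subject URI.
SUBJECT = "http://example.org/subject/vampires"  # made-up URI
ils_records = [{"title": "Dracula", "subject": SUBJECT, "source": "ILS"}]
ir_records = [{"title": "Vampire folklore thesis", "subject": SUBJECT, "source": "IR"}]

# Collocate by shared URI rather than by matching text strings.
by_subject = defaultdict(list)
for record in ils_records + ir_records:
    by_subject[record["subject"]].append((record["source"], record["title"]))

print(by_subject[SUBJECT])
```

Because both systems use the same URI, the user sees one grouping regardless of which database holds the resource.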
23. Our data in a semantic viewpoint
SOURCE: Getting triples from records: the role of ISBD
http://www.slideshare.net/scottishlibraries/isbd-record2triples
24. Our data in a semantic view
SOURCE: Getting triples from records: the role of ISBD
http://www.slideshare.net/scottishlibraries/isbd-record2triples
[Figure: a “Bib” record shown as triples. Labels: record id as subject; field role and relationship; can map to a record such as VIAF]
26. How cataloging is changing: A changing library and WEB landscape
• Automation and new technologies
• The web has changed
• Large scale bibliographic databases
• Cooperative cataloging
• Administrative desire to decrease costs
• Greater variety of media in library collections
(electronic!)
• User expectations and needs
• FRBR is our data model – semantic web friendly!
27. • FRBR will give us a way to group things in different
ways building relationships between data – by WEMI
(Work, Expression, Manifestation, Item)
• WEMI is a hierarchy from the abstract to the actual thing
owned by a library (the, well… item!)
• Work and Expression can be somewhat conceptual
with lots of discussion going on; however, you can
loosely think of Work as a concept or idea which is
Expressed (think the act of creation; performance)
onto/into a physical format (can be digital) aka a
Manifestation, of which the library has a copy (Item).
28. Entity-Relationship Model
(new way of storing & organizing data)
• Database design model
• Entity - a thing with an identity
– Entities have attributes (characteristics)
• Relationships
– Between different entities at different levels
• Provides for organization of records in database
– “clustering”
• Conceptual model of abstract concepts
30. “User Tasks”
How do catalog users
• Find
• Identify
• Select
• Obtain
… the resources they want?
31. Group 1 Entities (WEMI Hierarchy)
Work
A distinct intellectual or artistic creation
Expression
Intellectual or artistic realization of a work
Manifestation
Physical embodiment of an expression of a work
Item
Single exemplar of a manifestation
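One way to picture the WEMI hierarchy is as nested records. This is a loose sketch, not the FRBR model itself: the single attribute on each class (`title`, `language`, `carrier`, `barcode`) and the sample holdings are simplifications invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Single exemplar of a manifestation (the copy a library owns)."""
    barcode: str


@dataclass
class Manifestation:
    """Physical (or digital) embodiment of an expression."""
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """Intellectual or artistic realization of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """Distinct intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Walk down the hierarchy and count the copies held."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Made-up holdings for illustration only.
dracula = Work("Dracula", [
    Expression("English", [
        Manifestation("print volume", [Item("copy-1"), Item("copy-2")]),
        Manifestation("ebook", [Item("license-1")]),
    ]),
])
print(count_items(dracula))
```

Grouping by Work is what lets a catalog show every format of "Dracula" together.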
35. RDA Controlled Vocabularies
Closed
content type
media type
carrier type
mode of issuance
... and more.
Open
frequency
type of recording
language of expression
form of musical notation
relationship designators
(app. I-K)
... and more.
36. RDA, FRBR, and MARC
RDA provides our metadata rules for describing our content
FRBR is our semantic web friendly data model
Currently we use MARC to format our data but we
need something better
Linked data can be the mechanism – but what
about the actual records?
37. RDA, FRBR, and MARC
• Bibliographic records are structured in MARC (an
encoding format rather than a programming language). MARC (MAchine-Readable
Cataloging) and AACR2 have been working together a long
time, which means that compromises and workarounds
have sometimes been made. This will be true for RDA,
too.
• MARC is a mixture of controlled access points (series,
name authority, and subject headings) plus free text (e.g.,
contents notes). This provides flexibility and structure,
but more free text = less precision in searching =
more work for systems to return relevant results
39. RDA, FRBR, and MARC
• MARC existed before AACR2. MARC was developed in
the 1960s, before most of today's digital technology existed;
the web as we know it, ebooks, and Google did not
exist.
• Most current catalog systems use MARC, but there
are other metadata schemas and encoding
formats.
• Although many systems have not fully utilized all of
the fields and functionalities of MARC, it is reaching
the end of its lifespan.
• Next generation (nextgen) systems cannot
develop as MARC-only systems; we need more.
40. RDA, FRBR, and MARC
• Our future systems will probably not use MARC, but some
kind of semantic web friendly schema.
• The Library of Congress has started a project
called the Bibliographic Framework Transition Initiative (BIBFRAME).
• Why?
• We need something that is more flexible, is not flat in file
structure, and works with a semantic framework.
• We need something that works better with different
metadata schemas.
• This new framework will provide us with enormous
functionality in our catalogs and allow us to fully use RDA.
It will allow us to move forward into the semantic web
world.
41. Linking data in catalogs
• We have some relationships within our library catalog via the
bibliographic data: bib-holding-item (a way to keep all of the
parts of a particular thing together)
• Bib to authority, series, and subject headings (a bib record having
linking field(s) to other record(s))
• Authority records: records that are not visible to the public, but
provide the linking points to our bib records and guide the user
through variations of the name or title, etc.
42. Resources
LODLAM: http://lodlam.net/
LODLAM CHALLENGE: http://summit2013.lodlam.net/
LODLAM Zotero Group (Webliography of good stuff): https://www.zotero.org/groups/lod-lam
GLAMLOD: https://groups.google.com/group/glamlod
LC Bibliographic Framework Transition Initiative: http://www.loc.gov/marc/transition/
LITA - library linked data interest group: http://connect.ala.org/node/142470
Use Case Tool: http://obd.jisc.ac.uk/navigate
Getting triples from records: the role of ISBD http://www.slideshare.net/scottishlibraries/isbd-record2triples
FRBR Display Tool: http://www.loc.gov/marc/marc-functional-analysis/tool.html
Understanding FRBR: http://www.loc.gov/cds/downloads/FRBR.PDF
More materials at http://www.delicious.com/georgiawebgurl/metadata_presentation_como
Making the Digital Connection: Linked Data and Libraries
URIs are kind of like a hook – they allow us to connect things together.
Elaine Svenonius (?) posited that “navigate” (finding works related to a given work by generalization, association, and aggregation…) should be added, but it is not official.
Find: resources corresponding to user’s search criteria
Identify: confirm resource described corresponds to that sought, or distinguish between more than one resource with similar characteristics
Select: resource appropriate to user’s needs
Obtain: to acquire or access resource (RDA chap. 4 (7 pp.) on acquisition and access, includes URL)
This is one of RDA’s characteristics that mean it can work better in a linked data environment; many of these vocabularies are registered on the web in the online registry of both RDA element sets and values (vocabularies), at http://metadataregistry.org/
[Show exercises, explain, solicit questions, etc.]
What a FRBRized catalog should give us is better searching tools that enable us to see editions more easily and to see related titles in different media (e.g., easier to find the work “Dracula” regardless of its physical format, its manifestation). Since FRBR is a semantic web friendly data model, it will also enable us to have better, more robust, more semantic web like search tools (like our catalogs). FRBR also influenced RDA and FRSAD (Functional Requirements for Subject Authority Data).