Society for Scholarly Publishing Conference 2012 talk on "Making Semantics Work". Bernadette Hyland describes what publishers need to pay attention to with respect to data reuse and sharing. She covers goals, approaches and platforms for publishing data internally and externally as Linked Data, for more efficient and effective integration, reuse and distribution.
1. Making Semantics Work
The Role of Linked Data in
Scholarly Publishing
1 June 2012
Arlington VA USA
Brief by Bernadette Hyland,
co-chair, W3C Government Linked Data Working Group
CEO, 3 Round Stones, Inc
Email: bhyland@3roundstones.com
Twitter: @BernHyland
This presentation: http://slideshare.net/3roundstones
Wednesday, May 30, 12 1
2. What is the semantic web?
from the W3C web site
The Semantic Web is a web of data.
The Semantic Web is about two things.
It is about common formats for
integration and combination
of data drawn from diverse
sources… It is also about
language for recording
how data relates to
real world objects.
Slide credit: Scott Brinker @chiefmartec
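A minimal sketch of the "integration and combination of data drawn from diverse sources" idea, assuming each statement is stored as a plain (subject, predicate, object) tuple. The example.org URIs, the sample values and the helper names `merge` and `describe` are hypothetical illustrations, not part of the talk; the Dublin Core and FOAF property URIs are real vocabulary terms.

```python
# Each statement is a (subject, predicate, object) triple; URIs are plain strings.
# The example.org URIs and the sample values are hypothetical.
ARTICLE = "http://example.org/article/123"
AUTHOR = "http://example.org/person/jdoe"

# Data held by a publisher.
publisher_data = {
    (ARTICLE, "http://purl.org/dc/terms/title", "A Study of Widgets"),
    (ARTICLE, "http://purl.org/dc/terms/creator", AUTHOR),
}

# Data held by a separate repository that happens to use the same URIs.
repository_data = {
    (ARTICLE, "http://purl.org/dc/terms/references", "http://example.org/dataset/42"),
    (AUTHOR, "http://xmlns.com/foaf/0.1/name", "Jane Doe"),
}


def merge(*graphs):
    """Combine graphs by set union; statements about a shared URI line up."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return {predicate: [objects]} for every statement about one subject."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


combined = merge(publisher_data, repository_data)
article = describe(combined, ARTICLE)
for predicate in sorted(article):
    print(predicate, article[predicate])
```

Because both sources name the article with the same URI, no mapping step is needed before the two sets are combined; that is the design point the sketch is meant to show.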
3. Content Data is King
Slide credit: Scott Brinker @chiefmartec
5. Linked data is about data
that is reusable
A simple yet
revolutionary change in
perspective.
6. We’re living in
a golden
age ...
Photo credit: http://www.flickr.com/photos/sjungling/5974860/
7. “Knowledge is of two kinds.
We know a subject ourselves, or we know
where we can find information upon it.”
by Samuel Johnson
18th Century British author, linguist & lexicographer
8. from:
to: Linked Enterprise Data
12. Scholars' pain point ...
#1 - Data access and reuse
Large amounts of diverse data produced by complex
experiments, simulations & observations
• The growth rate of PubMed alone is one paper per
minute
• Hard to validate, reproduce & leverage scientific data
• Not easily accessible nor interlinked
(Exception is ‘omics’ research, deposit of sequences
required for publication)
13. Publishers looking to ...
1. Lower costs of combining data silos
2. Control data quality & protect data/brand standards
3. Produce high quality data for external consumption
4. Leverage structured data increasingly available via the Web
5. Distribute & promote content (SEO++)
6. Increase paid subscriptions
7. Provide new data initiatives, i.e., a “kitchen” for mashups
14. Business decisions are yours...
“Marketing”
Determining how
much data to
share…
…or not to share.
“Legal”
Slide courtesy of Scott Brinker @chiefmartec
15. Some data may be
better harnessed as
an incentive for other
business goals
• For internal use
• For external use by
  • new & existing authors
  • new editors
  • new subscribers
  • new partners
17. Why Linked Data matters ...
• It scales ... to Web-scale
• Does not require a super model
• Based on International Data Exchange
Standards (RDF, SPARQL)
• Lingua franca for data exchange
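To illustrate why no "super model" is needed, here is a minimal sketch of querying triples by pattern, loosely in the spirit of a single SPARQL triple pattern. It is not SPARQL itself: the `match` helper, the example.org URIs and the sample data are hypothetical, and `None` is used as the wildcard by assumption.

```python
# Dublin Core Terms namespace (a real vocabulary); the data below is made up.
DCT = "http://purl.org/dc/terms/"
PERSON_A = "http://example.org/person/a"

triples = [
    ("http://example.org/article/1", DCT + "title", "Linked Data in Practice"),
    ("http://example.org/article/1", DCT + "creator", PERSON_A),
    ("http://example.org/article/2", DCT + "creator", PERSON_A),
]


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Which resources list PERSON_A as creator?"
by_author = [s for s, _, _ in match(triples, p=DCT + "creator", o=PERSON_A)]
print(by_author)
```

The same three-part pattern works for any vocabulary, so new kinds of data can be added without first redesigning a shared schema.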
21. • Linked Data is
about publishing
and consuming
data using
international data
standards
• Based on a 20-year-
old idea
• A system of linked
information systems
22. Data landscape
[Diagram labels: Semantic Technologies, Semantic Web, Linked Open Data, Linked Enterprise Data, RDBMS, CRM, BI]
27. Linked Data
Management
platform
28. Content Management System vs. Linked Data Management System
[Diagram labels: data, text, unstructured, structured]
29. • Callimachus is a framework
for data-driven applications
based on Linked Data
principles
• Callimachus allows Web
developers to easily create
data driven applications for
the Web
• Available as Open Source
(FLOSS) & as a commercially
supported version
31. Publishing Linked Data
will require continual
nurturing but the
rewards are worth it
32. Goal:
Achieve balance between
openness vs. protection,
distributed vs. controlled,
standardized vs. loosely-
coupled data relationships.
Slide credit: Scott Brinker @chiefmartec
33. Recommendations
• Seek balance for sharing and reuse
• Data is king
• Publish in reusable format (RDF family of standards)
• Use OPEN rather than proprietary data formats
• Define a URI Policy and Strategy, document it and
ensure editors & authors use it
• Best practices and vocabularies exist -- don’t reinvent
the wheel
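The URI policy and reusable-format recommendations above can be sketched together. This is a minimal illustration under stated assumptions: the base URI `http://data.example.org`, the `<base>/<kind>/<slug>` pattern and the helper names are hypothetical, not a policy proposed in the talk. The output line follows the N-Triples syntax from the RDF family of standards, with only the most common literal escapes handled.

```python
import re

BASE = "http://data.example.org"  # hypothetical base URI for the publisher


def slugify(label):
    """Lowercase, turn runs of non-alphanumerics into '-', trim stray dashes."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def mint_uri(kind, label):
    """Apply one assumed policy: <BASE>/<kind>/<slug>."""
    slug = slugify(label)
    if not slug:
        raise ValueError("label yields an empty slug")
    return f"{BASE}/{kind}/{slug}"


def nt_literal(text):
    """Quote a plain string literal, escaping backslash, quote, CR and LF."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriple(subject_uri, predicate_uri, literal):
    """Serialize one statement with a literal object as an N-Triples line."""
    return f"<{subject_uri}> <{predicate_uri}> {nt_literal(literal)} ."


uri = mint_uri("article", "Making Semantics Work!")
line = to_ntriple(uri, "http://purl.org/dc/terms/title", 'Making "Semantics" Work')
print(line)
```

Keeping URI minting in one documented function is one way to make sure editors and authors produce identifiers consistently; a real policy would also need to cover persistence and collisions between labels.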