The document discusses year 2 deliverables for work packages 9 and 10 of the LOD2 project. It summarizes reports on improvements made to the Publicdata.eu portal including upgrades to CKAN and new features. Next steps include further technical enhancements to Publicdata.eu and engaging communities of data publishers and users. Deliverables from the Serbian CKAN team established their data portal and infrastructure. The Polish Ministry of Economy requirements analysis identified needs for publishing their data as linked open data.
Linked Data in Production: Moving Beyond Ontologies
LOD2 review meeting
1. Creating Knowledge out of Interlinked Data
WP9: Use Case 3: LOD2 for Citizens
Luxembourg, Sep 14, 2012
Irina.Bolychevsky@okfn.org
Andreea.bonea@okfn.org
Krzysztof.wecel@i2g.pl
LOD2 Presentation . 02.09.2010 . Page http://lod2
2. Creating Knowledge out of Interlinked Data
Agenda
Year 2 Deliverables (OKFN)
D 9.1.1. Report on first release of the Publicdata.eu website
Improvements to Publicdata.eu during the past year
D 9.3.1. Presentation on publishing Linked Data
D 9.3.2. Guide to publishing Linked Data
D 9.4 Report on publication of eGovernment Linked Open Data
Addressing Y1 Review Comments
Next steps
D 9.2.1. Further technical improvements to Publicdata.eu (personalization features)
Community engagement with Publicdata.eu
Year 2 Deliverables (Serbian CKAN, Instytut Informatyki Gospodarczej)
D 9.5.1. Establishment of the Serbian CKAN
D 9.6. Requirements and Resources used by the Polish Ministry of Economy
Next steps
D 9.7.1. Adaptation of the LOD2 stack for Polish Ministry of Economy
3. Creating Knowledge out of Interlinked Data
WP9 Objectives
“The purpose of this PublicData.eu use case is to increase public
access to high-value, machine-readable datasets generated by the
European, national as well as regional governments and public
administrations.”
4. Creating Knowledge out of Interlinked Data
Year 2 Deliverables (OKFN)
5. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D 9.1.1. First release of Publicdata.eu
Submitted a thorough report summarizing our work on Publicdata.eu, covering its existing features,
previous launches and plans for future improvements: http://svn.aksw.org/lod2/D9.1.1/
6. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
Publicdata.eu Overview
PublicData.eu is a pan-European data catalogue and federation mechanism, developed by
OKF as part of WP9. Based on the CKAN open-source data portal software, the site is a use
case for the citizen, aiming to make data as accessible and re-usable as possible. It is a
read-only aggregation of both official and community data portals across the EU.
7. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
Key Stats
Publicdata.eu provides robust search, filtering and previewing tools
It currently houses 17027 data sets, harvested from 18 data catalogues, and it
provides the option to browse data sets by top level categories
8. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
Technical improvements to Publicdata.eu during the past year
In March 2012 we upgraded PublicData.eu to CKAN version 1.6, adding the data preview
functionality (powered by Recline), improvements to search, interface improvements to
dataset pages, newly added resource (file) pages and group pages.
We also re-ran all the harvesters to have the most up-to-date set of datasets. Some
catalogues have been migrated to groups on thedatahub.org and therefore can't currently
be harvested without also including non-EU datasets. In the future we may resolve this by
extending the harvester to allow us to specify which groups or tags should be harvested.
This would allow us to import relevant datasets from thedatahub.org without importing
non-EU datasets.
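The group and tag restriction described above could look roughly like the sketch below. It is not the actual harvester code: the function name `select_for_harvest` and the sample records are hypothetical, and it assumes each dataset record carries `groups` and `tags` lists of dictionaries with a `name` key.

```python
def select_for_harvest(datasets, groups=(), tags=()):
    """Keep datasets that belong to a wanted group or carry a wanted tag."""
    wanted_groups = set(groups)
    wanted_tags = set(tags)
    selected = []
    for ds in datasets:
        # Assumed record layout: lists of {"name": ...} dictionaries.
        ds_groups = {g["name"] for g in ds.get("groups", [])}
        ds_tags = {t["name"] for t in ds.get("tags", [])}
        if ds_groups & wanted_groups or ds_tags & wanted_tags:
            selected.append(ds)
    return selected


# Hypothetical sample records, not real thedatahub.org data.
sample = [
    {"name": "uk-gdp", "groups": [{"name": "country-uk"}], "tags": [{"name": "economy"}]},
    {"name": "us-census", "groups": [{"name": "country-us"}], "tags": []},
    {"name": "eu-budget", "groups": [], "tags": [{"name": "eu"}]},
]
picked = select_for_harvest(sample, groups=["country-uk"], tags=["eu"])
print([ds["name"] for ds in picked])  # ['uk-gdp', 'eu-budget']
```

Filtering on the harvesting side keeps the source catalogue untouched, so the same remote site can feed several portals with different selections.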
Various countries have recently launched new CKAN instances, which we plan to include
in publicdata.eu for the intermediate launch (August 2013).
9. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
Addressing Year 1 Review meeting comments
Amount of RDF data in publicdata.eu
● We added the Serbian CKAN to Publicdata.eu, bringing in its RDF data
● Since Publicdata.eu is currently a read-only portal, we have focused on encouraging
the source catalogs to increase their RDF data
● Our deliverables 9.3.1 and 9.3.2 facilitate this (presentation and guide on publishing
linked open data)
● We worked with consortium partners to produce more RDF data for the eGovernment
report (covered in the next slides)
● In future launches we will allow users to add data, meaning converted datasets can be
made accessible through publicdata.eu
● Additionally, we plan to improve our harvesting so that we can harvest groups, increasing
the range of datasets that can be added to publicdata.eu (including groups on
thedatahub.org)
10. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D 9.3.1. Presentation on publishing Linked Open Data
Overview
In Feb 2012, OKFN put together a best-practices presentation on publishing,
linking and using Open Data. The presentation is easily accessible to non-technical
readers and details the economic, transparency, policy and efficiency benefits for
governments of publishing open data. Other aspects such as licensing, registering and
getting the data online are also covered.
It is intended as a detailed resource that anybody can use when referring to Linked
Open Data.
Current state
The presentation can be found here:
[http://svn.aksw.org/lod2/D9.3/D9.3.1/D9.3.1-presentation.pdf]
11. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D 9.3.2. Guide to publishing Linked Open Data
Overview
The Guide [http://svn.aksw.org/lod2/D9.3/D9.3.2/] is written in clear, non-technical
language and introduces the reader to the concepts, rationale and tools of Linked Open
Data, as well as providing a high-level overview of the publishing process. It has a
particular focus on public-sector data, and aims to arm decision-makers with an
understanding of Linked Data and the steps necessary to start publishing it.
Expected Impact
The guide will be published on the OKFN website [http://lod2.okfn.org/] where it is hoped
that it will become a standard reference document, helping organizations that need to
make decisions about whether and how to publish Linked Open Data.
12. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D 9.4. Report on publication of eGovernment Linked Open Data
Overview
The report [http://svn.aksw.org/lod2/D9.4/] summarizes our assessment of the
current state of linked data publishing by European governments and organizations.
Additionally, it highlights some of the work that LOD2 partners have been doing to
publish more linked data (Publink initiative, Guides and Documentation).
The report details some of the benefits of publishing linked open data as well as the
current technical and legal barriers preventing the publishing of more linked data and
our proposed approach to increasing the amount of high quality linked data
published during the next phase of the LOD2 project.
The report contains two Appendices:
Appendix A - a collection of 9 use cases, showcasing the benefits of LOD
Appendix B - presenting the LODStats system developed for high-performance
statistical analysis
13. Creating Knowledge out of Interlinked Data
Appendix A – Open Data releases
Dataset | Project URL | Triples
WHO’s Global Health Observatory | http://gho.aksw.org/ | 273k
European Digital Agenda Scoreboard | http://data.lod2.eu/scoreboard/ | 127k
National Accounts Linked Data for the UK and Serbia | http://ukstatistics.lod2.eu/ | 645k (UK)
 | http://rs.ckan.net/dataset/rzs-national-accounts | 10 million (Serbia)
World Bank Data as Linked Data | http://csarven.ca/statistical-linked-dataspaces | 165 million
German Labour Law & Courts Thesauri | http://vocabulary.wolterskluwer.de/ | 150k
German Federal Ministry of Finance | http://data.lod2.eu/gfmf/ | 2 million
UK public data sets | http://thedatahub.org/en/dataset/uk-gdp-since-1948 |
 | http://thedatahub.org/en/dataset/epims-lod2 |
 | http://thedatahub.org/en/dataset/uk-criminal-justice | 12 million (total)
LinkedGeoData | http://linkedgeodata.org | 20 billion
Wiktionary | http://wiktionary.dbpedia.org | 100 million
Czech tender data | http://ld.opendata.cz:8900/sparql | 1071859
14. Creating Knowledge out of Interlinked Data
Next Steps (OKFN)
15. Creating Knowledge out of Interlinked Data
Next Steps
D 9.2.1. Further technical improvements to Publicdata.eu
Improvements scheduled for the Dec 2012 release (Further personalization features)
Dataset ratings
Allow users to add/revise their own data sets
User tools to enable mash-ups and visualization of data
App marketplace for users to upload their own visualizations, stories and apps
Allow user commenting on datasets
Activity streams and follow support (i.e. allowing users to subscribe to activity updates)
Social / sharing buttons
16. Creating Knowledge out of Interlinked Data
Next Steps
D 9.2.1. Further technical improvements to Publicdata.eu
Improvements scheduled for the Aug 2013 interim release
CKAN core technology improvements (Harvesting)
Optimize & automate the harvesting process
Add further harvesters (to increase number of data and coverage)
Ability to only harvest changed data
Ability to harvest part of a site (e.g. a particular group vs whole catalog)
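One way to picture the "only harvest changed data" item is a timestamp comparison against the previous run. The sketch below is illustrative only: `changed_since` is a hypothetical helper, the sample records are invented, and it assumes each remote dataset record exposes an ISO 8601 `metadata_modified` timestamp.

```python
from datetime import datetime


def changed_since(datasets, last_harvest):
    """Return datasets whose metadata changed after the previous harvest run."""
    changed = []
    for ds in datasets:
        # Assumed field: ISO 8601 timestamp of the last metadata change.
        modified = datetime.fromisoformat(ds["metadata_modified"])
        if modified > last_harvest:
            changed.append(ds)
    return changed


# Hypothetical remote records and harvest date, for illustration only.
remote = [
    {"name": "a", "metadata_modified": "2012-03-01T10:00:00"},
    {"name": "b", "metadata_modified": "2012-08-20T09:30:00"},
]
last_run = datetime(2012, 6, 1)
names = [ds["name"] for ds in changed_since(remote, last_run)]
print(names)  # ['b']
```

Only dataset "b" would be fetched again, which is where the saving comes from when most of a large catalogue is unchanged between runs.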
Additional features
Adding more advanced multilingual capabilities to the portal to support its Europe-wide
coverage
Add upgraded triple store and SPARQL endpoint
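For readers unfamiliar with SPARQL endpoints, the sketch below shows how a client could address one over HTTP. The SPARQL protocol passes the query text in a `query` parameter; the endpoint URL used here is a placeholder, not the planned Publicdata.eu endpoint, and `sparql_get_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query):
    """Build a GET request URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Counts all triples in the store; a common first sanity check.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

# Placeholder endpoint, assumed for illustration only.
url = sparql_get_url("http://example.org/sparql", COUNT_TRIPLES)
print(url)
```

The resulting URL can be fetched with any HTTP client; the response format depends on the endpoint and on the `Accept` header sent.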
17. Creating Knowledge out of Interlinked Data
Next Steps
Community Engagement for Publicdata.eu
We can divide our community building and engagement strategy around PublicData.eu into
two main clusters: supply & demand
Objectives on the supply side:
Engage more with data publishers by building
(a) a stronger community of official representatives and data catalogue maintainers
around PublicData.eu and
(b) consensus around key legal and technical standards (e.g. making metadata
explicitly open, enabling data catalog interoperability)
Establish datacatalogs.org as the de facto place to go to find out about data catalogs
around the world - and encourage data catalog maintainers and other official contact
points to maintain up to date information about national, regional and local catalogs, and
lists of catalogs
18. Creating Knowledge out of Interlinked Data
Next Steps
Community Engagement for Publicdata.eu
Objectives on the demand side:
To build a stronger and better connected community of open data re-users from across
EU27 around PublicData.eu
Continue to identify and pursue opportunities to engage with the Linked Data community
and to use the LOD2 Stack to publish Linked Data derived from PublicData.eu.
We are hoping to achieve this by performing the following activities:
-Organize 2-4 OKFN Labs sprints per year on a variety of different topics and disseminate
results via press releases, media contacts and partners
-Promote PublicData.eu at events, workshops and hackdays across EU27
-Dissemination via blogs, guest posts and articles on third party sites, and press releases
19. Creating Knowledge out of Interlinked Data
Year 2 Deliverables (Serbian CKAN, Instytut Informatyki Gospodarczej)
20. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D 9.5.1 Establishing Serbian CKAN - Infrastructure for Public Sector Information
[Diagram: Serbian CKAN infrastructure. Recoverable labels: SORS; Server 1, Server 2, Server 3; Online; LOD2 dissemination DB; Code lists; XSLT; LOD2 RDF; CKAN publishing at http://rs.ckan.net (Serbian CKAN); import; search; Publicdata.eu; http://elpo.stat.gov.rs/lod2/RS-DATA; http://elpo.stat.gov.rs/lod2/RS-DIC]
21. Creating Knowledge out of Interlinked Data
● National accounts
● Prices
● Usage of ICT
● Science, Technology and Innovations
22. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D9.6 Requirements and resources used by the Polish Ministry of Economy
Goal
Identify the requirements of the Polish Ministry of Economy for publication of its data
Analyze changes of data over time, temporal and topical scope
Prepare for adoption of the LOD2 Stack for publication of the Ministry's data
Status
Delivered on time for M20
Work continues on Task 9.7
23. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
http://data.gov.pl – Current State
24. Creating Knowledge out of Interlinked Data
Need for data
25. Creating Knowledge out of Interlinked Data
Requirements – Querying
26. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D9.6 Key Points
INSIGOS
● Internet System for Business Information
● Access to statistical data concerning economy and foreign trade
– POLGOS - presentation of comparative data concerning the Polish economy
– HZ - information about Polish foreign trade
– ENERGY - mission: energy security
Challenges
- Multidimensional database (data warehouse)
- Possible linking to source for drilling down
- Not up to date; probably needs a supplementing process
- ENERGY: many poorly formatted Excel files
27. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D9.6 Key Points
CEIDG
– Central Register and Information on Economic Activity
• access to data concerning natural persons’ businesses
• references to other registries
• ca. 2.9 million records
Challenges
- data is not clean
- available via API
- dynamic data set: ~1000/1000 applications for de-/registration daily
- snapshots – evolution phase of LOD2 Lifecycle
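Because the register changes daily, snapshots taken at different times can be compared to see what was registered, de-registered or modified in between. The sketch below is only an illustration of that idea: `diff_snapshots` is a hypothetical helper, the record ids and fields are invented, and it assumes each snapshot is a mapping from a stable record id to the record's content.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by record id."""
    prev_ids, curr_ids = set(previous), set(current)
    return {
        # Present only in the newer snapshot.
        "registered": sorted(curr_ids - prev_ids),
        # Present only in the older snapshot.
        "deregistered": sorted(prev_ids - curr_ids),
        # Present in both, but with different content.
        "changed": sorted(i for i in prev_ids & curr_ids if previous[i] != current[i]),
    }


# Invented records, not actual CEIDG data.
monday = {"001": {"status": "active"}, "002": {"status": "active"}}
tuesday = {"002": {"status": "suspended"}, "003": {"status": "active"}}
delta = diff_snapshots(monday, tuesday)
print(delta)
# {'registered': ['003'], 'deregistered': ['001'], 'changed': ['002']}
```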
Public procurement data
- Published in XML, volume of data in 2011 alone: 828MB
28. Creating Knowledge out of Interlinked Data
Year 2 Deliverables
D9.6 Requirements - Summary
• No sophisticated tools used at MoE
• Groups of requirements
» Data Acquisition – 7 requirements
» Data Processing/Transformation – 2 requirements
» Publication – 3 requirements
» Data Analysis – 2 requirements
• Alignment with LOD2 Life Cycle
» all 8 phases seem to be important but
• Alignment with LOD2 Stack
» crucial components identified
» D2R/Triplify, Virtuoso, CKAN, PoolParty, Ontowiki, Silk, Sigma
29. Creating Knowledge out of Interlinked Data
Next Steps (Instytut Informatyki Gospodarczej)
30. Creating Knowledge out of Interlinked Data
Next Steps
Task 9.7 Adoption of the LOD2 Stack for Polish economy data (I2G)
Goal
● adaptation of the LOD2 Stack to the requirements of Polish Ministry of Economy
● identification of crucial components and how to configure and link them
Status
● first deliverable D9.7.1 scheduled for M30
● identification of existing functionalities in the working infrastructure
● first vocabularies linked using Silk Workbench
Next steps
● finishing and cleaning vocabulary
● design of data model using SDMX vocabulary
● filling in the model
● Establishing the Polish CKAN
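To give a feel for what "filling in the model" might produce, the sketch below renders one statistical observation as Turtle. It assumes the data model would follow the RDF Data Cube vocabulary together with the SDMX-derived dimension and measure properties, which the slides do not confirm; the `ex:` namespace, the identifiers and the value are placeholders, and `observation_turtle` is a hypothetical helper.

```python
PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/moe/> .
"""


def observation_turtle(obs_id, dataset_id, area_id, year, value):
    """Render a single qb:Observation with area, period and value."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:{dataset_id} ;\n"
        f"    sdmx-dimension:refArea ex:{area_id} ;\n"
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;\n'
        f'    sdmx-measure:obsValue "{value}"^^xsd:decimal .\n'
    )


# Placeholder identifiers and value, not Ministry data.
doc = PREFIXES + "\n" + observation_turtle("obs1", "foreign-trade", "PL", 2011, "123.4")
print(doc)
```

A real model would also declare the data structure definition and code lists; those are left out here to keep the example short.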
31. Creating Knowledge out of Interlinked Data
Thank you for your
attention!
Editor's Notes
Hello everyone, my name is …. and I am the new PM overseeing some sections of WP9 from OKFN's side. This presentation covers the work spearheaded by OKFN, as well as the Serbian CKAN (9.5) and the Requirements and Resources for the Polish Ministry of Economy (9.6). The slides were drafted to highlight some of the work done for WP9 during the course of this year. Slide 2 lists the deliverables that are part of this WP.
Section 1 rehearses the arguments for Open Data (how governments are moving towards making their data freely available), whereas Section 2 provides a full non-technical explanation of Linked Data (concepts such as the 5 stars of LOD are presented). Section 3 covers the LOD life cycle (explaining high-level concepts such as RDF, schemas and triple stores). Section 4 describes a step-by-step way to publish LOD and describes the tools in the LOD2 Stack. Section 5 presents some of the LOD2 case studies. The best known are the EC Financial Transparency System (all grants from the EC since 2007), the Global Health Observatory data set (statistics for monitoring public health), the Digital Agenda Scoreboard (showing the progress of countries in relation to the DAE), and the Legal Thesauri by Wolters Kluwer (a commercial publisher of legal information).
Several public authorities (for example the UK Government White Paper and EC Commissioner Neelie Kroes) acknowledge the benefits of LOD (organizations meet their transparency requirements, and more meaning is given to data sets by placing them in context with other data sets). In this respect the EC also funded the LATC project, which converted approximately 20 data sets over the past years.