This series of presentations was given at the EarthCube Data Facilities End-User Workshop held January 15-17, 2014 in Washington, DC. This workshop provided a forum to discuss the unique requirements and challenges associated with developing the communication, collaboration, interoperability, and governance structures that will be required to build EarthCube in conjunction with existing and emerging NSF/GEO facilities.
This panel presented several exemplars of global data sharing, featuring:
Lindsay Powers (CoopEUS)
Tim Ahern (GEO/GEOSS)
Bernard Minster (World Data System)
Beth Plale (Research Data Alliance)
3. COOPEUS Goals
COOPEUS aims to catalyze international collaboration
between environmental research infrastructures by:
• Improving data accessibility through harmonization of
data policies
• Promoting interoperability of research infrastructures
by coordinating data and metadata
formats, measurement standards and accessibility
• Improving data and information quality by defining
basic requirements for QA/QC
5. COOPEUS accomplishments to date
• Strong commitment to open data sharing across institutions and countries
• Established COOPEUS data sharing principles.
• Gap analysis across RIs identified commonalities and differences in data policies,
metadata formats, accessibility, data standards, and archiving.
• High level of data and metadata standardization and accessibility among COOPEUS
RIs
• Primary weaknesses in standardized data portals/access points
• Minimizing barriers to data sharing by identifying and removing obsolete policies.
• Identifying and discussing ethical, cultural and institutional requirements for data
sharing.
• Establishing a Roadmap and long-term plan for data interoperability
• Engaging and building the community
• Workshops on Data Policy Harmonization, Persistent Identifiers, Carbon
Observation data usage and more to come
6. Current activities within COOPEUS?
Use case development
• Develop relationships to explore interdisciplinary
questions
• Identify strengths and weaknesses in cross-disciplinary
data use
Work Package topical workshops
Develop Roadmap for Interoperability across RIs
• Provide guidance to RIs in policies and practices
• Foster community building to encourage implementation
of COOPEUS recommendations across environmental RIs
7. COOPEUS and beyond….
What does the community need?
How do we improve the culture of open data?
How do we foster additional and stronger relationships?
• Create opportunities for broader participation
• Make room for institutional flexibility
How do we address challenges on:
• Maintaining data visibility
• Preserving data and accessibility in the long term
• Maintaining trust and integrity
• Maintaining context and provenance
• Protecting privacy rights
The strength of COOPEUS is in building and fostering
communities concurrent with a framework around data
interoperability.
11. Rationale
• Some 30% of the world’s economy is tied to the environment
• Systematic understanding of the Earth system is fundamental for
well-informed and economically-efficient decision making
• Sustained Earth observations are critical in understanding the Earth
• Need for systems interoperability and open data access
A global approach to Earth observation is required!
13. GEO - The Group on Earth Observations
Created in 2005 to develop a coordinated and sustained
Global Earth Observation System of Systems (GEOSS)
to enhance decision making in nine Societal Benefit Areas.
GEO today:
• 90 Members
• 67 Participating
Organizations
14. GEO Vision
To realize a future wherein decisions and
actions, for the benefit of humankind, are
informed by coordinated, comprehensive and
sustained Earth observations and information.
15. GEO Objectives
• Improve and coordinate
observation systems
• Advance broad open
data policies/practices
• Foster increased use of
EO data and information
• Build capacity
16. GEOSS
• Global Earth Observation System of Systems
• Coordinated, comprehensive and sustained system
• A global distributed system, which includes:
– Satellite observation systems,
– Global in situ networks and systems, and
– Local and regional in situ networks.
– A discovery and access system for data and information
17. GEOSS Objectives
• Facilitate exchange of data and information
• Improve decision-makers’ abilities to address
pressing policy issues
• Enable solutions for the benefit
of society
• Deliver the advantages of EO
to both data & information
providers and consumers
world wide
18. Targeted Issues
• Uncertainty over continuity of observations
• Large spatial and temporal gaps in specific data sets
• Limited access to data and associated benefits in
developing world
• Inadequate data integration and interoperability
• Lack of relevant processing systems to transform data
into useful information
• Inadequate user involvement
• Eroding or little technical infrastructure in many parts of
the world
20. GEOSS: for scientists
• GEO is a framework to promote international cooperation.
• Earth observing systems of the future: built by scientists, informed by GEO.
• Bringing together data architecture experts, scientists, users, and capacity-building specialists.
• Visibility as data/networks/systems contributed to GEOSS.
• Potential support for research leading to GEOSS implementation.
27. Supporting Data Access and Use
GEOSS Common Infrastructure (GCI):
• GEOSS Portal
• Discovery and Access Broker
• Resource Registration
Earth observations data, information and services
28. GCI Capabilities
GCI is a specialized system supporting
Discovery, Evaluation and Access of Multidisciplinary
Earth observations
• Search
– Searches on content, location, and time in EO datasets
– Support Semantic Discovery across disciplinary vocabularies
– Paging and ranking of matching results
• Evaluation
– Preview
– Harmonization of metadata
• Access & Use
– Download distributed data
– Basic transformations to utilize accessed data
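The search capabilities listed above (matching on content, location, and time, then ranking and paging the results) can be illustrated with a small in-memory sketch. This is not the GCI or DAB interface: the `Dataset` record, the `search` function, the scoring rule, and the sample catalogue are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    """Hypothetical metadata record; not the GCI schema."""
    title: str
    keywords: set
    bbox: tuple   # (west, south, east, north) in degrees
    start: date
    end: date


def overlaps_bbox(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, terms, bbox, start, end, page=0, page_size=10):
    """Filter by content, location and time, rank by matching terms, return one page."""
    wanted = {t.lower() for t in terms}
    hits = []
    for ds in catalog:
        # Score = number of query terms found among the dataset keywords.
        score = len(wanted & {k.lower() for k in ds.keywords})
        in_time = ds.start <= end and start <= ds.end
        if score > 0 and in_time and overlaps_bbox(ds.bbox, bbox):
            hits.append((score, ds))
    # Highest score first; ties broken by title for a stable order.
    hits.sort(key=lambda h: (-h[0], h[1].title))
    first = page * page_size
    return [ds for _, ds in hits[first:first + page_size]]


# Made-up sample catalogue for illustration only.
catalog = [
    Dataset("Sea surface temperature", {"ocean", "temperature"},
            (-180, -90, 180, 90), date(2000, 1, 1), date(2013, 12, 31)),
    Dataset("European soil moisture", {"soil", "moisture"},
            (-10, 35, 30, 70), date(2010, 1, 1), date(2012, 12, 31)),
    Dataset("North Atlantic buoy temperature", {"ocean", "temperature", "buoy"},
            (-80, 0, 0, 65), date(2005, 1, 1), date(2009, 12, 31)),
]

results = search(catalog, ["ocean", "buoy"], (-60, 10, -20, 50),
                 date(2008, 1, 1), date(2008, 12, 31))
titles = [ds.title for ds in results]
```

A real broker would query distributed catalogues rather than a local list, and semantic discovery across vocabularies would need more than exact keyword matching; the sketch only shows how the three search facets and the paging step fit together.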
29. GCI Architecture
[Architecture diagram; labels recovered from the slide]
• Discovery & Access: GEOSS User, GEO Web Portal
• GEOSS Common Infrastructure: Discovery and Access Broker (DAB), Clearinghouse (CH), Component & Service Registry (CSR), Semantic Component*, Service Monitoring
• GEOSS Registries: Standards & Interoperability, User Requirements, Best Practices Wiki
• Integrated/Federated EO Discovery/Access Systems, e.g. GENESI, CWIC, FedEO* (real-time search)
• Resource Providers and GEOSS Resources (register, harvest, data):
– Services and SW Applications, e.g. Software, Data Access, Processing, Community Portals, Documents
– EO data Catalogues & Repositories, e.g. IDN, INSPIRE, geo.data.gov
* Prototype capabilities
31. GEO Looking forward
• Facilitating international collaboration based on
international, national and local programs
• Encourage international sharing of data and
information for science, government, NGOs and
industry
36. ICSU, WDS and CODATA
“ICSU’s long-term vision is of a
world where excellence in science is
effectively translated into policy
making and socio-economic
development. In such a
world, universal and equitable
access to scientific data and
information is a reality and all
countries have the scientific capacity
to use these and to contribute to
generating the new knowledge that
is necessary to establish their own
development pathways in a
sustainable manner.”
37. Foundation
ICSU 29th General Assembly in Maputo (2008)
decided:
• To confirm that ICSU will continue to assert
a strategic leadership role in relation to
scientific data and information;
• To establish a new ICSU-World Data System
as an Interdisciplinary Body
to replace the World Data
Centres and FAGS
39. WDS Scientific Committee
2012-2015
• Bernard Minster (Chair, USA)
• Michael Diepenbroek (Germany)
• Kim Finney (Australia)
• Françoise Genova (France)
• Wim Hugo (South Africa)
• Jane Hunter (Australia)
• Vasily Kopylov (Russian Fed.)
• Guoqing Li (China)
• Ruth Neilan (USA)
• Lesley Rickards (UK)
• Ryosuke Shibasaki (Japan)
• Ariel Troisi (Argentina)
Ex-Officio
• Howard Moore (ICSU)
• Yasuhiro Murayama (NICT)
40. WDS implementation
1. Constitution
2. Data policy
3. Certification criteria and Membership
Applications
4. International Programme Office
5. Working Groups
6. Strategic Plan
41. WDS - a "system of data systems"
• ...of data archive centres, data analysis centres,
data producers, data developers, data observing
systems and networks, virtual observatories,
etc., both regional (including national) and global
• Tough concept to address until WDS is fully
developed...
ICSTI Workshop, Paris 2012
www.icsu-wds.org
42. One node? … Or many?
[IGS organization chart; labels recovered from the slide]
IGS Associate Members
Governing Board: Oversight
External Interfaces: IAG/GGOS, IERS, BIPM, ICSU/WDS, UNOOSA/ICG
Committees of the GB: Executive Committee, Strategic Planning Committee, Elections Committee, Infrastructure Committee
Product Coordinators: Analysis Coordinator, Reference Frame, Clock Products
Central Bureau: Executive Management, Network Coordination, Information Portal
Support Organizations: IGS Institute, UNAVCO
Pilot Projects and Working Groups: Antenna WG, Bias & Calibration WG, Clock Product WG, Data Centers WG, GNSS WG, Ionosphere WG, LEO WG, Real-time WG, Reference Frame WG, Troposphere WG, Tide Gauge PP
Analysis Centers: Global Network ACs, Global Network AACs, Regional Network AACs, Other AACs (Ionosphere, Real-time)
Data Centers: Global Data Centers, Regional Data Centers, Operational Data Centers, Project Data Centers
IGS Tracking Stations: Reference Frame Stations, Multi GNSS Stations, Real-time Stations, Application Stations (e.g. Tide Gauge, Timing)
Abbreviations:
International Association for Geodesy/Global Geodetic Observing System (IAG/GGOS)
International Earth Rotation and Reference Frame Service (IERS)
Bureau International des Poids et Mesures (BIPM)
International Council for Science/World Data System (ICSU/WDS)
United Nations Office for Outer Space Affairs/International Committee on GNSS (UNOOSA/ICG)
Analysis Center (AC)
Associate Analysis Center (AAC)
43. WDS implementation
Membership types
• Regular: Data curation and data analysis services
(individual data centres, data services)
• Network: Networks of regular members, umbrella
organizations (IODE, IVOA…)
• Partner: Do not deal directly with data collection,
curation, and distribution, but contribute support
to WDS
• Associate: Organizations interested in the WDS endeavour
46. WDS implementation
Strategic Targets
Make trusted data services an integral part of international
collaborative scientific research
• Involve WDS Members more closely in international collaborative
scientific research
• Promote the use of best practices in international collaborative
research programmes
Nurture active disciplinary and multidisciplinary scientific data
services communities
• Support existing communities whose practices serve their members
well
• Support nascent communities by helping them to identify their
needs and to organize their activities
• Provide mechanisms that facilitate cross-disciplinary interactions
and activities
• Contribute towards scientific development by improving the
analytical environment
47. WDS implementation
Strategic Targets (ctd.)
Improve the funding environment
• Promote international, national and disciplinary policies that
lead to sustainable long-term funding
• Engage and work with research funders to increase
resources for data services
Improve the trust in and quality of open scientific data services
• Actively promote policies of full and open access to data at
national and international venues
• Foster interoperable practices to facilitate data sharing
• Facilitate access to, use, and reuse of datasets, in particular
for multidisciplinary research
Position WDS as the premium global multidisciplinary network
for quality assessed data
48. Structure and Architecture
[Diagram; labels recovered from the slide]
• Other Networks and Systems: GEOSS, GMES, WMO-IS, IOC, etc.
• Metadata & Data Services: web portals, catalogue
• Visualization & Analysis: computer systems, virtual labs, GIS systems
• Publishers: commercial, open access, cross-referencing
• Data Archiving & Publication Facilities: Certified repositories
• Data collection & Processing Facilities: QA/QC, data products, data rescue
• Libraries: DOI registry, interdisciplinary catalogues
• Education & Outreach
• Research Facilities: satellites, vessels, observatories, alert systems, etc.
49. Next Steps
[Same diagram as slide 48, Structure and Architecture]
50. WDS Working Groups
• Knowledge Network and Open Metadata Catalogue
Discovering and accessing WDS members’ and
networks’ data and services (metadata and
“enriched” additional information)
• Data Publication
Promote and establish data publication concept
among data centres, include science publishers and
bibliometric service holders, and as part of scholarly
publishing. Follow-up of CODATA Data Citation WG
and other initiatives.
66. SciDataCon 2014
Data Integration for Global Sustainability
2–5 November 2014, New Delhi, India
2nd ICSU-WDS Conference &
24th CODATA International Conference
eResearch Australasia 2013
73. The Information Age – extraordinary potential for
driving Science and bettering Society
More
Efficient
Physical
Infrastructure
Contribution to a
safer and more
secure world
Transformative
strategies for
disease
treatment and
well-being
Better goods and services
More Research Insights
74. Key Driver 1: Data Sharing Accelerating
Discovery and Innovation
75. Data Sharing is a
Global Issue
Science, Humanities, Arts
Communities
Libraries, Archives, Repositories, Museums
Cyberinfrastructure
professionals, data analysts, data
center staff, …
Data
Scientists
76. Key Driver 2: Community effort accelerating
impact
“Just do it” -- Focused efforts help
communities drive tangible progress
Creation / adoption of data
sharing policies have
accelerated research
innovation
Development of public access shared
data collection enabling new results
for Alzheimer’s
Practitioners work together on
interoperability efforts across Earth and
environmental science allowing self-governed and directed groups to
emerge around common issues.
Now 25 years old, the Internet Engineering Task
Force’s mission “to make the Internet work
better” has resulted in key specifications of
Internet common community standards that
support innovation
MPI Forum photo by Erez Heba,
PDB molecule of the month at
http://www.rcsb.org/pdb/home/home.do
77. The Research Data Alliance (RDA)
Global community-driven
organization launched in
March 2013 to accelerate
data-driven innovation
RDA focus is on building the social,
organizational and technical
infrastructure to
reduce barriers to data sharing and
exchange
accelerate the development of
coordinated global data infrastructure
Plenary 2, Fall 2013
National Academy of Sciences, DC
78. RDA Vision and Mission
Vision: Researchers and innovators openly share
data across technologies, disciplines, and countries
to address the grand challenges of society.
Mission:
RDA builds the social
and technical bridges
that enable data sharing.
79. The RDA Community today:
Over 1000 members from 55 countries
[Pie chart; labels recovered from the slide] Asia 3%, Africa 2%, Asia-Pacific 4%, South America 1%
80. Goal of RDA Infrastructure: Support Data Sharing and
Interoperability Across Cultures, Scales, Technologies
Common metadata types
for data Interoperability
Persistent identifiers
Harmonized standards
Digital object identifiers
Data access and
preservation policy and
practice
Tools for data
discoverability, …
Harmonized standards
Policy and
Practice
81. CREATE ADOPT USE
RDA Members come together as
Working Groups – 12-18 month efforts to build, adopt, and use specific
pieces of infrastructure
Interest Groups – longer-lived discussion forums that spawn Working
Groups as specific pieces of needed infrastructure are identified.
Working Group efforts focus on the development and use of data
sharing infrastructure
Code, policy, infrastructure, standards, or best practices that are
adopted and used by communities to enable data sharing
“Harvestable” efforts for which 12-18 months of work can eliminate a
roadblock
Efforts that have substantive applicability to groups within the data
community, but may not apply to everyone
Efforts for which working scientists and researchers can start today
82. RDA Plenaries: Venue for community building and
WG / IG progress
RDA Plenary 1 / Launch
• March 2013 in Gothenburg, Sweden
• 240 participants
• 3 WG, 9 IG
RDA Plenary 2
• September 2013 in Washington, DC
• 380 participants
• 6 WG, 17 IG, 5 BOF
• Data Citation Summit co-located in RDA “neutral space”
• First Organizational Assembly meet-up
83. RDA Plenaries Emerging as a Data
Community “Town Square”
Emerging Plenary Format:
All-hands sessions: Place for community
networking and exchange of information
(funding agencies, data organizations, key
stakeholders)
Working sessions: Face-to-face
opportunities for global Interest Groups,
Working Groups, and BOFs to meet and
advance their agendas
Neutral meeting place: Place for multiple
groups to meet and form a common agenda
and action plan (e.g. Plenary 2 Data
Citation Harmonization Summit)
84. Coming in 2014
RDA Plenary 3
• March 26-28, 2014 in Dublin, Ireland
• Hosted by Australia and Ireland
• Theme: “The Data Sharing community - Playing Your Part”
RDA Plenary 4
• September 2014 in The Netherlands
• Being planned now …
85. Community-Driven RDA Groups by Focus
Domain Science - focused
Toxicogenomics Interoperability IG
Structural Biology IG
Biodiversity Data Integration IG
Agricultural Data Interoperability IG
Digital History and Ethnography IG
Defining Urban Data Exchange for Science IG
Marine Data Harmonization IG
Materials Data Management IG
Reference and Sharing focused
Data Stewardship focused
Data Citation IG
Data Categories and Codes WG
Legal Interoperability IG
Community Needs focused
Community Capability Model IG
Engagement IG
Clouds in Developing Countries IG
Preservation e-infrastructure
Long-tail of Research Data IG
Research Data Provenance IG
Certification of Digital Repositories IG
Publishing Data IG
Global Registry of Trusted Data Repositories and Services IG
Base Infrastructure - focused
Metadata IG
Data Foundations and Terminology WG
Big Data Analytics IG
Metadata Standards WG
Data Brokering IG
Practical Policy WG
PID Information Types WG
Data Type Registries WG
Domain Repositories IG
86. RDA Community-Driven Groups
Working Groups:
• Data Type Registries
• Persistent Identifier Types
• Data Foundations and Terminology
• Metadata Standards
• Practical Policy
• Data Categories and Codes
• WG Case statements being prepared: Citing Dynamic Data, Publishing Data Workflows, Publishing Data Services, Data Bibliometrics, Cost Recovery Models for Repositories, Data Descriptions Registry Interoperability, DSA-WDS Partnership Working Group on Certification
Birds-of-a-Feather (met at Plenary 2):
• Linked Data
• Chemical Safety Data
• Education and Skills Development in Data Intensive Science
• Libraries and Research Data
• Cloud Computing and Data Analysis Training for the Developing World
Interest Groups:
• Agricultural Data Interoperability
• Certification of Trusted Repositories (joint with ICSU-WDS)
• Data Citation
• Metadata
• Marine Data Harmonization
• Community Capability Model
• Engagement
• Preservation e-Infrastructure
• Legal Interoperability (joint with CODATA)
• Defining Urban Data Exchange for Science
• Structural Biology
• Big Data Analytics
• Data Brokering
• Publishing Data (joint with WDS)
• Toxicogenomics Interoperability
• Research Data Provenance
• Materials Data Management
• Global Registry of Trusted Data Repositories and Services
• Digital Practices in History and Ethnography
• Biodiversity Data Integration
• Long tail of Research Data
• Development of cloud computing capacity and education in developing world
• Service Management IG (pending)
• Domain Repositories Interest Group (pending)
• Federated Identity Management (pending)
• Persistent Identifier Interest Group – PID-IG (pending)
Legend: Blue = new between Plenary 1 and Plenary 2; Green = new since Plenary 2
87. RDA Organizational Structure
• RDA Council: Responsible for overarching mission, vision, impact of RDA
• RDA Membership
• Secretary-General and Secretariat: Responsible for administration and operations
• Technical Advisory Board: Responsible for Technical roadmap and interactions
• Organizational Advisory Board and Organizational Assembly: Responsible for organizational and strategic advice
• Working Groups: Responsible for impactful, outcome-oriented efforts
• Interest Groups: Responsible for defining and refining common issues
• RDA Colloquium (Research Funders): Operational and community sponsorship
88. RDA Organizational Partners
Member Applicants
• Institute for Quantitative Social Science at Harvard
• Barcelona Supercomputing Center
• Intersect Australia Limited
• European Data Infrastructure (EUDAT)
• Microsoft
• International Association of STM Publishers
• Oracle
• New Zealand eScience Infrastructure
• STFC - Science & Technology Facilities Council
• Washington University Libraries
• Corporation for National Research Initiatives (CNRI)
• Purdue University Libraries
• Terrestrial Ecosystems Research Network
• Research Data Canada
• University of Michigan Libraries
• eResearch Services and Scholarly Application
Development Division of Information Services
Interested Affiliates
• American University Library
• Committee on Data for Science and Technology
(CODATA)
Other interested Organizations
• Connecting Research and Researchers (ORCID)
• Australian Antarctic Data Centre
• DataCite
• Australian National Data Service
• International Oceanographic Data and Information
Exchange (IODE)
• CERN
• CJSD Consulting
• Columbia University Libraries/Information Services
• CSC - IT Center for Science Ltd.
• Digital Curation Centre
• IBM
• Scholarly Publishing and Academic Resources Coalition
(SPARC)
• World Data System (WDS)
Strengthening the cooperation between the US and the EU in the field of environmental research infrastructures
• Most of you are building tools and infrastructure for data
• COOPEUS is building an international and interdisciplinary community-based framework for data interoperability and access
• Diversity of RI experience fostering mentoring and sharing internationally
• Building a framework for cross-disciplinary engagement through data interoperability and interworkability, through use cases
• Engaging broad community beyond immediate COOPEUS partners
How:
• Bringing RIs together toward common goals
• Standardizing data policies, protocols, metadata, access systems
• Understanding where differences exist and why
• Thinking about the future of data requirements and accessibility
• Creating a culture of open data
82% of COOPEUS RIs provide metadata and metadata catalogues for archived datasets, and nearly all of these are available electronically. There is uniformity in data formats (ASCII and/or XML).
We are sometimes asked what the difference is between our GCI and Google. Today, even within the Earth observation community, you and I go to Google and can often easily find what we want. To define our role clearly, we need to recognize the differences ourselves. (Read slides.) In the end, while Google's services are for the general public and cover all content on the internet, GCI provides specialized services for discovering and accessing Earth observation data and information.
GEOSS has had about 11 million accesses (data granules) largely coming from 20-30 international repositories