DuraCloud Integration with SDSC Presentation Slides
1. SDSC cloud storage
integration with DuraCloud
Michele Kimpton, CEO DuraSpace
Matthew Kullberg, SDSC Technical Projects Mgr
Carissa Smith, Partner Specialist DuraSpace
2. Using the Webinar
Platform
• Two-way audio is muted
for all participants
• We’ll utilize the Chat
Window for the Q&A portion
or you may use it if you are
having technical difficulties
• You may type your question
here & hit ‘enter’
3. A not-for-profit serving academia
We are committed to providing open
source technologies and services that
promote durable, persistent access to
the scholarly record.
4. What is DuraCloud?
Archiving and preservation services in the cloud
Ability to choose one or multiple cloud storage providers
Amazon, SDSC, Rackspace
5. Key services
• Digital archiving
• Video streaming
• Online sharing
• File health checking
• File synchronization
• Image serving
6. Why we are excited about SDSC
partnership
• First academic public cloud provider
• Understands our community
• Level of transparency
• Cost competitiveness
• Real partnership
8. Institutions currently archiving in two
data centers using DuraCloud
• Rice
• MIT
• ICPSR
• NC State Archives and Libraries
• University of Tennessee Knoxville
10. 11/8/2012
Matthew Kullberg
Technical Project and Services Manager
San Diego Supercomputer Center
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
11. A Paradigm Shift for Long-Term Storage: Access,
Sharing and Collaboration
SDSC Cloud
•Launched September 2011
•Largest, highest-performance
known academic cloud at launch
•5.5 Petabytes initial capacity
•8 GB/sec throughput
•Capacity and performance scale
linearly to hundreds of petabytes
•Open source platform based on
NASA and Rackspace software
•http://cloud.sdsc.edu
12. Why OpenStack Swift Cloud Software?
13. SDSC Cloud Storage design using OpenStack
Authentication nodes provide user access
Proxy nodes handle data read/write requests
Storage nodes write and verify 2 copies of all
data uploaded to the cloud
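The write-and-verify step on this slide can be illustrated with a minimal sketch. It is not SDSC's or OpenStack Swift's implementation: the function names, the use of local directories as stand-ins for storage nodes, and the choice of SHA-256 as the checksum are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def write_and_verify(name: str, data: bytes, nodes: List[Path]) -> str:
    """Write one copy of data to every node, then read each copy back and check it.

    Hypothetical helper: each Path is a directory standing in for a storage node.
    Raises IOError if any stored copy does not match the uploaded bytes.
    """
    expected = sha256_hex(data)
    for node in nodes:
        copy_path = node / name
        copy_path.write_bytes(data)
        # Read the copy back from disk and compare it with the uploaded bytes.
        if sha256_hex(copy_path.read_bytes()) != expected:
            raise IOError(f"copy on {node} failed verification")
    return expected


if __name__ == "__main__":
    # Two temporary directories play the role of two storage nodes.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        digest = write_and_verify("report.pdf", b"example bytes", [Path(a), Path(b)])
        print("two verified copies, checksum", digest)
```

Recording the checksum at upload time is what would later allow a stored copy to be rechecked without access to the original upload.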
14. Where is the SDSC Cloud?
• The San Diego Supercomputer Center is an Organized Research
Unit of the University of California, San Diego
• SDSC Cloud infrastructure is housed within the SDSC Datacenter
• Multiple 10Gb network connections including Internet2 and ESnet
• Operations staff on site 24/7
• Datacenter is secured with biometric access and 60 closed circuit cameras
• Central UPS provides 3.5 Megawatts of battery backup power to the SDSC
Cloud infrastructure and Datacenter
• All hardware resides on seismic isolation platforms from ISOBase to
protect equipment during earthquakes
15. What does my data do while at SDSC?
• All Objects (files) are continually checked for errors to ensure
two identical copies exist at all times within SDSC’s Cloud
• If an error is detected, the “bad” file is quarantined and a new duplicate
copy is made from the “good” copy.
• On average, with enterprise hard drives, data corruption occurs once every 10^15
bits written, or once every 1,000,000,000,000,000 bits written (known as UBE,
the Unrecoverable Bit Error rate).
• Data remains private at all times, unless shared by the user
• SDSC will never mine stored data for any research, commercial, or other
purpose.
• All SDSC ITS Staff undergo background, World Check, Consumer, and Live
Scan verifications, and have Level-C clearance.
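The error-rate figure on this slide can be turned into a rough expectation. The sketch below assumes the rate means one unrecoverable error per 10^15 bits written on average, uses decimal units (1 PB = 10^15 bytes), and applies the rate to the 5.5 PB initial capacity quoted earlier in the deck. The function name is hypothetical.

```python
BITS_PER_ERROR = 10**15  # one unrecoverable bit error per 10^15 bits written (figure from the slide)


def expected_bit_errors(bytes_written: int) -> float:
    """Average number of unrecoverable bit errors expected after writing bytes_written bytes."""
    return bytes_written * 8 / BITS_PER_ERROR


if __name__ == "__main__":
    # 5.5 PB in decimal bytes (assumption: 1 PB = 10^15 bytes).
    initial_capacity = 5_500_000_000_000_000
    print(expected_bit_errors(initial_capacity))  # 44.0
```

Under these assumptions, filling the initial capacity once would be expected to hit a few dozen bit errors on average, which is the motivation for continually checking each object against a second copy.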
16. Why add SDSC Cloud to your data profile?
• Increased redundancy
• SDSC and DuraCloud understand the needs of
academic institutions
• Low cost, High Availability and High Durability
• Easily move or mirror data with DuraCloud
• Recognized by NSF and other academic funding
agencies
• Help support academic computing and research
17. Our Academic Storage Mission:
Data Curation, Archiving, Preservation
• Data Curation: Managing data to ensure they are fit for
contemporary use and available for discovery and
reuse.
• Archiving: Ensuring that data are properly selected,
appraised, stored, and made accessible. The logical and
physical integrity of the data, including security and
authenticity, is maintained.
• Preservation: Ensuring that items or collections remain
accessible and viable in subsequent technology
environments.
18. Applications of SDSC Cloud
19. What’s on the horizon?
• SDSC OpenStack Compute Environment in development
• Allows SDSC to act as primary storage option with DuraCloud
• SDSC North Cloud now in testing phase
• Northern Datacenter is in Oakland, CA at the University of California Office
of the President (UCOP) Datacenter
• Potential for SDSC North Cloud to provide additional geographically
dispersed storage locations through DuraCloud
• SSD Drives currently being installed in SDSC Cloud nodes
• Moves database activities to low latency and high I/O Solid State Drives
• Provides faster authentication
• Improves file and container access speeds
• Faster uploading
20. Hope to see you in the cloud soon!
If you’d like to know more about the SDSC Cloud
infrastructure please email me at:
mck@sdsc.edu
To sign up with DuraCloud please visit:
www.DuraCloud.org
Matthew C. Kullberg
Technical Project and Services Manager
San Diego Supercomputer Center
22. To find out more
• www.duracloud.org
• wiki.duraspace/display/duracloud
• www.youtube.com/user/duracloudvideos
• Email us at mkimpton@duraspace.org, or
csmith@duraspace.org
Thank Educause for inviting DuraSpace to talk about one of our new cloud-based solutions, DuraCloud.
Our mission is to provide leadership and support in the development of open source technologies and services that promote durable, persistent access to the scholarly record. Define the scholarly record: we believe it is the summation of all output from the academic community, including ETDs, journal articles, books and, more recently, video, audio and data sets. Our goal is to drive the technologies we support forward so that they can manage and persist a variety of content, and to ensure the content can move easily across institutions, technology platforms, and time. A further goal is to help our community advance their preservation strategies through technology, services and collaborations. We have over 1500 institutions worldwide, primarily academic institutions, using either the DSpace or Fedora repository to manage and persist
The goal when building the DuraCloud platform and service was to mitigate the risks of trusting a single cloud provider, yet take advantage of the benefits of being able to scale quickly, pay only for what you need, and have additional services and applications specific to academia available in the platform, without requiring technical skills to use and administer the service.
We envision that, through our partnerships with Internet2, InCommon and SDSC, we can build a cloud network run completely by academia, where DuraCloud software provides a layer of services to institutions for managing, preserving and accessing content, with academic cloud storage, compute and bandwidth as the underlying infrastructure. Our partnerships with SDSC and Internet2 are the first step in this direction.