Universities and HPC Cloud: Taming the Growing Data Explosion
1. Technical Computing /
High Performance Computing
University Perspective
Chris Maher, IBM Vice President HPC Development
maherc@us.ibm.com
2. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
3. The world is getting smarter – more instrumented,
interconnected, intelligent
Smarter traffic systems, intelligent oil field technologies, smarter food systems, smarter healthcare, smarter energy grids, smarter retail, smarter water management, smarter supply chains, smarter countries, smarter weather, smarter regions, smarter cities
...and this is driving a new economic climate.
4. Technical computing is being applied to a broader set of industries
enabling more areas for collaborative work at universities
[Chart: HPC problem domains addressed (low to high) against a timeline from the 1990s to the 2010s]
HPC 1.0 – “Physics Driven”: research and engineering/simulation deployments; supercomputers for science, research and government; large industrial-sector applications requiring significant investment and skill
HPC 1.5 – “Mainstream” HPC: broader adoption as entry costs fall; digital media, financial services, life sciences, electronic design automation, automotive, aerospace, petroleum engineering
HPC 2.0 – “Data Driven”: analysis, big data and cloud; “applied” technical computing with broad adoption across a variety of industries as technology becomes affordable and pervasive; usage driven by modeling, simulation and predictive analysis workloads; delivered via clusters, grids and cloud
At the high end, supercomputers for science, research and government continue toward exascale – the next grand challenge
5. Examples of growth for Technical Computing
• Computational analysis
• Upstream/downstream processing
• Next-generation genomics
• Satellite ground stations
• Video capture and surveillance
• 3-D computer modeling
• Social media analysis
• Data mining/unstructured
information analysis (Watson-like)
• Financial “tick” data analysis
• Large-scale real-time CRM
6. Industry Trends
The Data Deluge
• Big data and big data management are consuming researchers’ time today
• Very large projects have data measured in the 100s of petabytes
Expanding the role of HPC and HPC Cloud on Campus
• Myriad of campus needs for both high throughput computing and high
performance (capability) computing using a shared environment
• Best practices show cost reduction with central condominium facility
where researchers can contribute their grant money and which serves the
larger university community
• HPC makes a university more competitive for grants
Exascale computing will be a reality in 2018/9
• Petascale has been delivered (2008)
• Large scale is being tackled now
• In 2018, will large university installations have a multi petaflop computer?
What will house it?
What will be the power requirements?
The Power Usage Effectiveness (PUE) of your datacenter is as
important as the “green solution” you put in it.
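PUE is defined as total facility power divided by IT equipment power, so a lower value means less overhead spent on cooling and power distribution. A minimal sketch, with hypothetical load figures:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power.
    1.0 is the theoretical ideal; real datacenters run above that because
    cooling, UPS losses and power distribution all draw facility power."""
    return total_facility_kw / it_equipment_kw

# Hypothetical example: a 1,500 kW facility draw powering 1,000 kW of IT load.
print(pue(1500, 1000))  # 1.5
```

At a PUE of 1.5, every watt of compute costs an extra half watt of overhead, which is why the datacenter itself matters as much as the hardware inside it.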
7. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
8. What we are seeing as trends at the University Level
• HPC is growing at a robust CAGR (6.9% according to Tabor)
• HPC is required for a research university to attract faculty.
• VP of Research titles are changing to VP of Research and Economic Development,
acknowledging that joint ventures with companies are a must for universities
• Greater partnerships with new industries
• Power, cooling and space are making universities think about central vs.
decentralized computing (total cost of ownership)
• Next Generation Sequencing and in silico biology, high energy physics, search and
surveillance, nanotechnology, and analytics are key workload areas
• Use of accelerators (for example, NVIDIA GPUs)
• HPC in the CLOUD becoming more relevant
9. Sample Workload/ISVs
• In silico Biology– Amber, NAMD, BLAST, FAST/A, HMMer, NGS.
• Computational Chemistry– Gaussian, Jaguar, VASP, MOE, Open Eye,
Accelrys Material Studio.
• Matlab– used in most medical school settings
• Statistics– IBM SPSS or SAS
• High Energy Physics– workload from the CERN LHC– Monte Carlo techniques
• Quantum Physics– Quantum Chromodynamics (QCD)
• Analytics– COGNOS, Big Insights, InfoSphere Streams (large data being
generated by the Square Kilometer Array), CERN, and Smarter Planet
initiatives.
10. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
11. All these and more are contributing to the
Growing Data Explosion
[Chart: data volumes from kilobytes to petabytes, 1980–2010]
• Half a zettabyte of annual IP traffic by 2013 (a trillion gigabytes; 1 followed by 21 zeroes)
• MRIs will generate a petabyte of data in 2010
• Text messages generate 400 TB of data per day (US)
• “IDC’s annual Digital Universe… indicates that over the next 10 years, data generation is expected to increase a staggering 44x” *
* The Ripple Effect of Seagate’s 3TB External HDD, July 06, 2010 – IDC Link
12. Data Centric Thinking
Today’s compute-focused model:
• Data lives on disk and tape
• Move data to the CPU as needed
• Deep storage hierarchy
Future data-focused model:
• Data becomes the center of attention
• We are never certain exactly where it is – although we can ask
• Abstraction allows for specialization
• Abstraction allows for storage evolution
14. A look at Next Generation Sequencing
• Growth is 10x YTY
15. Managing the data explosion from NGS
Sequencers can generate 2 TB+ of final data per sequencer per week. Processing the
data is compute intensive, and storage runs to petabytes per medium-sized institution.
For example, BGI in China currently has 10 PB.
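The figures above make the storage planning problem easy to quantify. A back-of-envelope sketch, using the 2 TB/week/sequencer rate from the slide and a hypothetical fleet size:

```python
def annual_output_tb(sequencers: int, tb_per_week: float = 2.0,
                     weeks: int = 52) -> float:
    """Final (post-processing) data produced by a sequencer fleet in one year,
    at the ~2 TB/week/sequencer rate cited above."""
    return sequencers * tb_per_week * weeks

# Hypothetical mid-sized core facility running 10 sequencers:
print(annual_output_tb(10))  # 1040.0 TB, i.e. roughly a petabyte per year
```

A single year of output from a modest fleet already reaches petabyte scale, consistent with the multi-petabyte holdings of large centers like BGI.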
16. Average Storage Cost Trends
[Chart: projected storage prices in $/GB on a log scale from $50.00 down to $0.01, 2003–2011; series: Industry Disk, HC Disk, LC Disk, Average Tape]
Source: Disk – Industry Analysts; Tape – IBM
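The price gap between media is what drives the tiering strategies on the next slide. A quick comparison sketch; the per-GB price points below are illustrative assumptions, not values read off the chart:

```python
def media_cost(capacity_tb: float, dollars_per_gb: float) -> float:
    """Raw media cost for a given capacity (using 1 TB = 1000 GB)."""
    return capacity_tb * 1000 * dollars_per_gb

# Hypothetical circa-2011 price points (assumed for illustration only):
disk_per_gb, tape_per_gb = 0.10, 0.02
pb = 1000  # 1 PB expressed in TB

print(round(media_cost(pb, disk_per_gb)))  # dollars per PB on disk
print(round(media_cost(pb, tape_per_gb)))  # dollars per PB on tape
```

Even with made-up numbers, a several-fold per-GB difference translates into tens of thousands of dollars per petabyte, which is why tape remains attractive as the bulk tier.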
17. Use of Tape Technology
• Virtual tape + deduplication: a growing technology for secondary data
– Key value – time to restore
– Use compute to reduce hardware costs
– Add HA clustering and remote site replication
• Tape used as the “Store” in large HPC configurations
– Files required for job staged from tape to disk ‘cache’ by a data mover (HPSS)
– Results written to disk, then destaged back to Tape
• Hybrid disk and tape use for archive applications – large capacity, long term retention
– Metadata on Disk, Content on Tape
– Lowest cost storage
– Lowest power consumption
– Most space efficient
– Long life media
• Specialty Niche – removable media interchange
Any statements or images regarding IBM's future direction and intent are subject to change or withdrawal without
notice, and represent goals and objectives only.
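The stage/destage flow described above can be sketched as a toy data mover. This is not the HPSS API – just a minimal model of tape as the “store” with a disk cache in front of it:

```python
# Toy model of the stage/destage flow above (illustrative; not HPSS itself).
class TieredStore:
    def __init__(self):
        self.tape = {}  # long-term "store" tier
        self.disk = {}  # fast cache in front of tape

    def stage(self, name: str) -> bytes:
        """Copy a file from tape to the disk cache before a job reads it."""
        if name not in self.disk:
            self.disk[name] = self.tape[name]
        return self.disk[name]

    def write_result(self, name: str, data: bytes) -> None:
        """Jobs write results to disk first..."""
        self.disk[name] = data

    def destage(self, name: str) -> None:
        """...then results are migrated back to tape and evicted from disk."""
        self.tape[name] = self.disk.pop(name)

store = TieredStore()
store.tape["input.dat"] = b"raw"
store.stage("input.dat")            # job input pulled from tape to cache
store.write_result("out.dat", b"result")
store.destage("out.dat")            # result migrated back to tape
print(sorted(store.tape))  # ['input.dat', 'out.dat']
```

The point of the pattern is that jobs only ever touch the disk tier, while the cheap, power-efficient tape tier holds everything long term.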
18. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
20. Why are Universities exploring Clouds?
• Cost Efficiency
– Consolidation and sharing of infrastructure
– Leverage resource pooling for centralized policy administration
• System/Configuration Management Policies
• Energy-related Policies
• Security-related Policies
• User-related Policies
• Flexibility
– End-user self-service cloud portal enablement
– Exploit advanced automation to free technical resources for higher value work
– Enhanced access to specialized resources (e.g. GPUs)
– Dynamic on demand provisioning and scaling
21. IBM’s new HPC Cloud addresses the specific intersection of
high performance computing and cloud computing
General Purpose Computing vs. High Performance Computing:
• Cloud computing management: CloudBurst (ISDM, TPM, TSAM) vs. Intelligent Cluster (HPC Management Suite)
• Provisioning: virtual machine provisioning vs. bare metal & VM provisioning
• Stand-alone computing systems: System x, BladeCenter, System p, System z vs. iDataPlex, BlueGene, System p 775
• Storage: SAN, NAS vs. GPFS, SONAS
• Network: 1 Gigabit Ethernet vs. InfiniBand, 10–40 GbE
22. IBM’s HPC Cloud is being deployed
at clients such as the phase 2 pilot at NTU
Environment characteristics:
• Full and direct access to system resources (bare metal pooling)
• Efficient virtualization, where applicable (KVM and VMware pooling)
• Diverse technologies – Windows & Linux, diverse cluster managers
Needs include:
• Batch job scheduling – several unique schedulers and runtime libraries
• Parallel application development and debugging, scaling and tuning
• Parallel data access
• Low latency, high bandwidth interconnects
23. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
24. New Era of Technical Computing Systems
Hardware + Software + Services = Systems and Solutions
Hardware:
• Purpose-built, optimized offerings for supercomputing – iDataPlex, DCS3700 Storage, TS3500 Tape Library
• Full array of standard hardware offerings for technical computing – Intel-based IBM blade servers, IBM rack servers, x3850 X5 SMPs, integrated networking solutions, storage products (DCS3700)
+ Software:
• Parallel file systems • Parallel application development tools
• Resource management • Systems management
+ Services (with IBM Research innovation):
• HPC Cloud Quick Start Implementation Services
• Technical Computing Services Offering Portfolio: a full range of customizable services to help clients design, develop, integrate, optimize, validate and deploy comprehensive solutions to address their technical computing challenges
= Systems & Solutions:
• IBM Intelligent Cluster solutions: integrated, optimized with servers, storage and switches
• HPC cloud offerings from IBM: IBM HPC Management Suite for Cloud; IBM Engineering Solutions for Cloud – HPC cloud offerings optimized for electronics, automotive & aerospace clients
• ISV solutions: partnering with leading ISVs to maximize the value of our joint solutions
25. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems
• University Resources and University Examples
• Putting it all together with Datatrend Technologies
26. University Relations (UR) and STG University Alliances
• IBM University Relations: resources for educators, researchers, staff and students
–https://www.ibm.com/developerworks/university/
• IBM Systems and Technology Group University Alliances
–Responsible for guiding STG research and collaboration with universities
–Enables new opportunities for deploying IBM systems and solutions at universities
–RTP Center for Advanced Studies headed by Dr. Andrew Rindos,
rindos@us.ibm.com
27. University Relations Teaming Examples
Proposed Collaboration w/ Imperial College London: Digital City Lab (DCL)
• IBM, Imperial College, government & industry partners to invest ~$81M in the Digital City Research project to develop & implement the next-generation infrastructure, systems & services to modernize cities (i.e. make cities smarter)
• Goals include connecting citizens to real-time intelligence, bringing value through smart decision making, and generating commercial, creative and social opportunities to enhance quality of life
• In addition, catalyse the next generation of digital services in healthcare, energy, transportation and creative industries
SUR Project: Smarter Infrastructure Lab for Smarter Cities
• MOU signed creating the SI Lab collaboration, taking a system-of-systems view of a university managed like a smart city using sensors, data, and analytics
• Goals include development of fixed & mobile infrastructure analytics technologies & solutions for a smarter city (e.g. smart water, waste, buildings, energy, transportation, healthcare, environment, etc.); also provides a showcase for client visits & demonstrations of IBM Smarter Cities technologies
• Future proposal to have the lab become part of the larger Pennsylvania Smarter Infrastructure Incubator Initiative
IBM & Swansea University (Wales, UK): Partner for Economic Dev’t
• The vision for the collaboration is economic development & job creation: build state-of-the-art HPC capability across the universities in Wales to provide enabling technology that delivers research innovation, high-level skills development and transformational ICT for economic benefit
• The Wales infrastructure is linked to the larger UKQCD consortium (particle physicists and computing scientists from 19 UK universities) that shares computing resources
• Seeded with a SUR award, which drove revenue of $2.4M in 2010
SUR Project: Smarter City Solutions for China
• Tongji University signed a Smarter City Initiative collaboration agreement aimed at building and providing integrated IBM Smarter City solutions for China
• The goal of the collaboration is to overcome the current silo decision making by different government ministries and to provide a city mayor and other decision makers an integrated Smarter City framework, solution package, and a real-life city model
• Tongji will partner with IBM on Smart City projects based on its urban planning work in several cities (Shanghai Pudong, Hangzhou & Yiwu)
28. University of Victoria
Upgrading old hardware while significantly boosting performance and research capabilities
Industries: Higher Education
URL: http://www.uvic.ca/
The need:
• Requirement to replace the original circa-1999 UNIX machines
• The Principal Investigator’s key requirement was research collaboration
• The Physics Department’s main requirement was FLOPS per dollar; performance was key
Solution:
• A research capability computing facility of 380 iDataPlex nodes (2x Intel X5650s, 1:1 InfiniBand)
• A performance/capacity cluster of iDataPlex nodes (2x Intel X5650s, 2x 1 Gigabit Ethernet)
• High-performance focus on benchmark results (disk I/O and jitter performance)
The benefits:
• Research time cut by 50%
• Power and cooling were 40% less while gaining 30% throughput benefits
29. St. Jude Children’s Research Hospital
Simplifies storage management to meet researchers’ needs
Business challenge:
St. Jude Children’s Research Hospital, based in Memphis, TN, is a leading pediatric treatment and research facility focused on children’s catastrophic diseases. The mission of St. Jude Children’s Research Hospital is to advance cures, and means of prevention, for pediatric catastrophic diseases through research and treatment. Their current NAS solution was not scalable enough to meet researchers’ needs, and tiering of data was becoming an arduous process.
Solution:
St. Jude viewed IBM as a thought leader in storage virtualization. IBM SONAS was deployed to provide a single, scalable namespace for all researchers. IBM Tivoli Storage Management and Hierarchical Storage Management automated tiering and backup of all data, allowing IT to focus on the needs of research. St. Jude was able to simplify its storage management while gaining the ability to meet researchers’ needs.
Solution components:
• IBM SONAS
• Tivoli TSM & HSM
• IBM ProtecTIER
• DS5000
• 3 years hardware & software maintenance
• IBM Global Technology Services
Benefits:
• A single, scalable namespace for all users that can be enhanced and upgraded with no downtime
• Avoided the expense, time and risk of manually moving data to improve reliability and access to the information
• Able to adjust to dynamic business requirements, reduce maintenance, lower integration costs, and seamlessly bridge to new technologies
30. East Carolina University
Advancing life sciences research with an IBM Intelligent Cluster solution based on IBM BladeCenter technologies
“There are some analyses that make use of all 96 cores… Previously, a task of this magnitude might have taken a full day of computation to complete. With the IBM Intelligent Cluster, it takes just minutes.”
—Professor Jason Bond, East Carolina University
The need:
Without a dedicated supercomputer capable of running massively parallel computational tasks, the Biology department at ECU could not run models as quickly as it needed. Researchers were frustrated by slow performance, and scientists were forced to spend time resolving IT issues.
The solution:
ECU selected an IBM® Intelligent Cluster™ solution based on IBM BladeCenter® servers powered by Intel® Xeon® 5650 processors, working with Datatrend Technologies Inc. to deploy it. The solution was delivered as a preintegrated, pretested platform for high-performance computing, and includes remote management from Gridcore.
Solution components:
• IBM® Intelligent Cluster™
• IBM BladeCenter® HS22
The benefit:
• ECU can now run up to ten typical computational tasks in parallel
• Using all 100 Intel processor cores, models that might previously have taken a day are completed in a matter of minutes
• Efficient, easy-to-scale solution opens up new research possibilities for the future
31. Agenda
• Industry Expansion of HPC and Technical Computing
• Universities and HPC
• Taming the Growing Data Explosion
• Why HPC Cloud
• Technical Computing Systems and University Resources
• University Examples
• Putting it all together with Datatrend Technologies
32. Putting it All Together… Literally, with
Datatrend Technologies
Doug Beary, Technical Account Executive
Datatrend Technologies
919-961-4777, doug.beary@datatrend.com
33. High Performance Computing Platforms
Datatrend Technologies can help put it All Together –
Providing a Solution
• HPC Clusters
– Compute, Interconnect & Storage
• Workload Fit
– Distributed Memory (MPI)
• Scale Out: iDataplex, Blades, Rack
– Shared Memory (SMP)
• Large Scale SMP: ScaleMP, NumaScale
– Hybrid Systems
• Management System
– xCAT, MOAB, Gompute
34. HPC Clusters
Platform optimization:
• Optimize processor selection – performance/$, performance/W
• Optimize form factor
• Optimize delivery & installation
Top components:
• Fastest CPUs
• Flexible interconnect choices – fabric, card, switch, cabling
• Unmatched storage to meet any capacity, any performance
Typical 84-node cluster: 100 to 1000 boxes
Datatrend solution: one item
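The two processor-selection metrics above pull in different directions, so selection is a ranking exercise. A minimal sketch with entirely made-up CPU figures (the names, prices and wattages are hypothetical, not real product data):

```python
# Hypothetical sketch of the processor-selection metrics above: rank CPU
# options by performance per dollar and per watt (all numbers invented).
cpus = [
    {"name": "A", "gflops": 500, "price": 1000, "watts": 95},
    {"name": "B", "gflops": 800, "price": 2000, "watts": 130},
    {"name": "C", "gflops": 650, "price": 1200, "watts": 115},
]

def perf_per_dollar(c: dict) -> float:
    return c["gflops"] / c["price"]

def perf_per_watt(c: dict) -> float:
    return c["gflops"] / c["watts"]

best_value = max(cpus, key=perf_per_dollar)       # cheapest FLOPS
best_efficiency = max(cpus, key=perf_per_watt)    # coolest FLOPS
print(best_value["name"], best_efficiency["name"])  # C B
```

Note that the winners differ: the budget pick and the power/cooling pick need not be the same part, which is exactly why both metrics appear in the optimization list.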
35. Workload Fit
• Distributed Memory
– Most Common Cluster
– Under Desk to PetaFlops
– 100s to 100,000+ Cores
– Many OS Images
• Shared Memory
– Growing Demand
– Dozens to 1000’s of Cores
– 64+TB Memory
– One OS
• Hybrid
– Do Both on One platform!!
36. Hyper-Scale Cluster
Up to: 126 nodes, 1512 cores, 23.6 TB
• Simple scaling: 126 nodes in 2 racks; 9 full blade chassis
• Bandwidth: 64% bi-sectional bandwidth; largest non-blocking island: 14 nodes
• Low latency: max. 200 ns
Distributed Memory, Shared Memory or BOTH!!
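The 64% bi-sectional bandwidth figure is consistent with an oversubscribed tree in which each non-blocking island shares fewer uplinks than it has nodes. A hypothetical reconstruction (the 9-uplinks-per-island assumption is mine, not stated in the slide):

```python
def bisection_fraction(uplinks_per_island: int, nodes_per_island: int) -> float:
    """Fraction of full bisection bandwidth in an oversubscribed tree:
    each island of nodes funnels its cross-island traffic through a
    smaller number of uplinks toward the core."""
    return uplinks_per_island / nodes_per_island

# Hypothetical reconstruction of the figures above: 9 chassis of 14 nodes
# (126 nodes total), each chassis with ~9 uplinks toward the core.
print(round(bisection_fraction(9, 14), 2))  # 0.64
```

Under this assumption the numbers line up neatly: 9 uplinks shared by 14 nodes gives 9/14 ≈ 64% of full bisection bandwidth, with each 14-node island fully non-blocking internally.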