ACCELERATE RESEARCH
NVIDIA TESLA
Lift the Barriers of HPC
   Faster / More Research: Faster, More Discovery, Higher Accuracy
   Maximum Performance: More Performance per dollar
   Greater Budget & Power Efficiencies: More Performance per watt
GPU Impact to Computational Research
   More Research
      88 ns/day, 6x Faster
      JAC simulation time, 23,558 Atoms DHFR
      Axel Kohlmeyer: Temple University
   Maximum Performance
      318% Higher Performance, 54% Added Cost
      AMBER 11
      CPU: Dual socket Intel Xeon X5670, 2.93 GHz (12 cores)
   Efficient Power
      2.5x Flops / Watt
      Tianhe-1A: CPU + GPU; Jaguar: CPU only
      Tianhe-1A: #2 Top500; Jaguar: #3 Top500
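
The performance-per-dollar claim can be checked with a little arithmetic. The sketch below uses only the figures quoted on this slide. It assumes that "318% higher performance" means 4.18x the CPU-only performance and that "54% added cost" means 1.54x the CPU-only system cost; the slide does not spell out either baseline, so treat the result as an estimate.

```python
# Figures quoted on the slide.
perf_gain_pct = 318.0    # "318% Higher Performance"
added_cost_pct = 54.0    # "54% Added Cost"
gpu_ns_per_day = 88.0    # "88ns/day"
speedup = 6.0            # "6x Faster"

# Assumption: "318% higher" means 1 + 3.18 = 4.18x the CPU-only performance,
# and "54% added cost" means 1 + 0.54 = 1.54x the CPU-only cost.
relative_perf = 1.0 + perf_gain_pct / 100.0
relative_cost = 1.0 + added_cost_pct / 100.0
perf_per_dollar = relative_perf / relative_cost

# Implied CPU-only throughput, and the number of days a CPU-only system
# would need to simulate what the GPU system simulates in one day.
cpu_ns_per_day = gpu_ns_per_day / speedup
cpu_days_for_one_gpu_day = gpu_ns_per_day / cpu_ns_per_day

print(f"performance per dollar: {perf_per_dollar:.2f}x the CPU-only system")
print(f"implied CPU-only rate: {cpu_ns_per_day:.1f} ns/day")
print(f"CPU days to match one GPU day: {cpu_days_for_one_gpu_day:.0f}")
```

Under these assumptions the GPU system delivers roughly 2.7x the performance per dollar, and the implied CPU-only rate is consistent with the "6x faster" figure.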
GPU Computing by Numbers
                        2008        2012
   Universities           60         583
   CUDA Downloads       150K        1.5M
   Academic Papers     4,000      22,500
   Supercomputers          1          52
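
For a rough sense of scale, the growth factors behind these numbers can be computed directly. This is a minimal sketch that uses only the 2008 and 2012 values from the slide; the dictionary layout and names are illustrative.

```python
# (2008, 2012) pairs as shown on the slide.
metrics = {
    "Universities": (60, 583),
    "CUDA Downloads": (150_000, 1_500_000),
    "Academic Papers": (4_000, 22_500),
    "Supercomputers": (1, 52),
}

# Growth factor = 2012 value divided by 2008 value.
growth = {name: later / earlier for name, (earlier, later) in metrics.items()}

for name, factor in growth.items():
    print(f"{name}: {factor:.1f}x")
```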
UCLA
Department of Physics and Astronomy
Challenge
   Accelerate Plasma Research with innovative Particle-in-Cell (PIC) Simulations
   Overcome space and power constraints in data centers
   Integrate into shared computing strategy across institutes and centers at UCLA

Solution
    GPU cluster
       96 server nodes
       288 NVIDIA Tesla GPUs
    Upgraded GPUs to NVIDIA Tesla M2090s (from M2070)
Impact
   Upgrades resulted in 20% higher performance with same power cost
   GPUs extended to new groups within department for greatly accelerated modeling
   Solves faster performance requirements within limited space and power constraints
   #235 on prestigious Top500 list with only 6 Racks
Add GPUs: Accelerate Science Applications

       CPU                 GPU
207 GPU-Accelerated Applications
              www.nvidia.com/appscatalog
3 Ways to Accelerate Applications
   Libraries: “Drop-in” Acceleration
      THRUST, BLAS, LAPACK, FFT, NPP, Sparse, Imaging, RNG
   OpenACC Directives: Easily Accelerate Applications
      PGI Accelerator, CAPS HMPP, CRAY
   Programming Languages: Maximum Flexibility
      C, C++, Fortran, OpenCL, DirectCompute, Java, Python
GPU-Accelerated MATLAB Results
   10x speedup in data clustering via K-means clustering algorithm
   14x speedup in template matching routine (part of cancer cell image analysis)
   3x speedup in estimating 7.6 million contract prices using Black-Scholes model
   17x speedup in simulating the movement of 3072 celestial objects
   4x speedup in adaptive filtering routine (part of acoustic tracking algorithm)
   4x speedup in wave equation solving (part of seismic data processing algorithm)
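
To illustrate why a workload such as the Black-Scholes pricing above suits a GPU, here is a minimal CPU-only sketch of the per-contract calculation. It is not the MATLAB code behind the slide. It assumes European call options, and the contract values are hypothetical. The point is that each price depends only on its own inputs, so millions of contracts can be priced independently and in parallel.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, years):
    """Price of a European call option under the Black-Scholes model."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Hypothetical contracts: (spot, strike, rate, volatility, years to expiry).
contracts = [
    (100.0, 100.0, 0.05, 0.20, 1.0),
    (100.0, 110.0, 0.05, 0.20, 1.0),
    (100.0, 90.0, 0.05, 0.20, 1.0),
]

# Each price is computed from its own contract only, with no shared state.
prices = [black_scholes_call(*contract) for contract in contracts]

for contract, price in zip(contracts, prices):
    print(f"strike {contract[1]:.0f}: call price {price:.4f}")
```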
AMBER 12 - Extreme Performance with K20
   [Bar chart: DHFR JAC 23K Atoms (NVE), Nanoseconds / Day]
      1 Node, CPU only (blue): 12.47
      1 Node, CPU + 2x K20 (green): 95.59
   Running AMBER 12 GPU Support Revision 12.1, SPFP with CUDA 4.2.9, ECC Off
   The blue node contains 2x Intel E5-2687W CPUs (8 Cores per CPU)
   Each green node contains 2x Intel E5-2687W CPUs (8 Cores per CPU) plus 2x NVIDIA K20 GPU
   Gain > 7.5X throughput/performance by adding just 2 K20 GPUs when compared to dual CPU performance
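
The "> 7.5X" figure follows from the two bar values on the chart. The sketch below recomputes it and, as an illustrative extra, converts both rates into the time needed for a 1 microsecond (1,000 ns) simulation; that target length is an assumption chosen for the example, not something stated on the slide.

```python
# Bar values read from the chart, in nanoseconds of simulation per day.
cpu_only_ns_per_day = 12.47
cpu_plus_gpu_ns_per_day = 95.59

speedup = cpu_plus_gpu_ns_per_day / cpu_only_ns_per_day

# Hypothetical target: a 1 microsecond trajectory (1,000 ns).
target_ns = 1000.0
days_cpu_only = target_ns / cpu_only_ns_per_day
days_cpu_plus_gpu = target_ns / cpu_plus_gpu_ns_per_day

print(f"speedup: {speedup:.2f}x")
print(f"CPU only: {days_cpu_only:.1f} days for {target_ns:.0f} ns")
print(f"CPU + 2x K20: {days_cpu_plus_gpu:.1f} days for {target_ns:.0f} ns")
```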
NAMD 2.9
   Outstanding Strong Scaling with Multi-STMV
   [Line chart: 100 STMV on Hundreds of Nodes; x-axis: # of Nodes (32, 64, 128, 256, 512, 640, 768); y-axis: Nanoseconds / Day (0 to 1.2); series: Fermi XK6, CPU XK6]
      Speedup labels on the chart, left to right: 3.8x, 3.6x, 2.9x, 2.7x
   Running NAMD version 2.9
   Each blue XE6 CPU node contains 1x AMD 1600 Opteron (16 Cores per CPU).
   Each green XK6 CPU+GPU node contains 1x AMD 1600 Opteron (16 Cores per CPU) and an additional 1x NVIDIA X2090 GPU.
   Test case: Concatenation of 100 Satellite Tobacco Mosaic Virus
   Accelerate your science by 2.7-3.8x when compared to CPU-based supercomputers
Try NVIDIA GPUs
   Available Applications: Applications Catalog
      www.nvidia.com/appscatalog
   Quick Application Acceleration: OpenACC Directives
      www.nvidia.com/gpudirectives
   Easy & Free GPU Test Drive: GPU Test Drive Cluster
      www.nvidia.com/gputestdrive
THANK YOU

GPU Computing In Higher Education And Research

Editor's Notes

  1. Welcome. Today I am excited to show you how NVIDIA Tesla GPU solutions are having a profound impact on science by breaking new barriers in computing performance. Researchers all over the world have embraced computing as the third pillar of science. Now, with Tesla GPU Computing, explosive performance gains are allowing academic researchers to discover new theories, build more robust models, and publish more papers. I will share highlights of academic institutions and researchers who are achieving their goals of faster, better science while staying within academic budget constraints.
  2. With the growing need to use computing to reach new frontiers in science and research, we quickly identified the barriers that stand in the way. First of all, we need to enable researchers and scientists to make faster and more frequent discoveries with higher accuracy. We also need to do that with maximum performance per dollar, because we all have budgets. And we need to do it in the most efficient manner, whether that is efficiency of power or efficiency of space.
  3. It’s exciting to show that GPU computing can address all of the most important barriers and deliver game-changing capability in computational research. For example, AMBER, a very popular computational chemistry application, lets researchers see 6x more simulation data per day, achieving 88 nanoseconds in a day, which would take roughly a week to simulate on CPUs alone. Now let’s see how much that actually costs. By adding just over 50% to the cost of a system, you get a performance gain of over 300%. And finally, GPUs are very power efficient. The #2 and #3 most powerful supercomputers in the world are a great example. China’s Tianhe-1A, in the #2 spot, is 2.5x more power efficient than Oak Ridge’s Jaguar, a CPU-only system.
  4. We have certainly reached the inflection point of broad adoption of GPU computing. Over 580 universities are teaching GPU computing as part of their regular curriculum. In fact, this year the Chinese Ministry of Education will be requiring 200 of its higher education institutions to make NVIDIA’s CUDA parallel programming part of the curriculum. More and more government funding is being awarded to GPU projects by the NIH, NSF, or DOE. This includes not only large projects, like Oak Ridge’s Titan, which incorporates some 18 thousand GPUs, but also university infrastructure grants and department or research grants to develop GPU computing applications, which are being awarded regularly.
  5. UCLA was faced with many of the challenges or barriers of HPC. They needed to accelerate a new, innovative plasma simulation, and they also needed to overcome space and power constraints. Their solution was a cluster with 96 nodes and 288 NVIDIA Tesla GPUs. The impact was considerable. The GPUs delivered 20% higher performance at the same power cost. Additionally, GPU use extended to new groups within departments for more accelerated modeling. So here they were able to offer faster results and more performance while fitting within the budget they had for both space and power.
  6. NVIDIA’s GPU-accelerated application footprint is growing exponentially year over year. Computational scientists and developers have realized that the future is in parallel computing. Native GPU acceleration has now made its way into the most widely used and most frequently cited scientific applications. This breadth of applications enables each school’s and department’s domain scientists, specifically those who aren’t programmers, to reap the benefits of GPU acceleration.
  7. Equally important to ready-made applications for domain scientists, we have been developing easier and easier approaches for building your own GPU applications. The fastest and easiest approach is our “drop-in” libraries. Many scientific applications make wide use of standard template or math libraries. NVIDIA makes the most commonly used ones freely available, such as Thrust, a templated library, and math libraries for BLAS, FFT, and sparse matrices. Another extremely non-invasive way to accelerate an application is to apply OpenACC directives to your existing code. It takes only a few lines of code to get a 2-10x speedup in a matter of hours or days. Finally, if you are a developer and need the maximum amount of performance, we support you in your native programming language.
  8. Engineers and scientists worldwide rely on MATLAB to accelerate the pace of discovery, innovation, and development in disciplines such as automotive, aerospace, electronics, financial services, biotech, and many other industries. Engineers and scientists are successfully employing GPU technology to accelerate their discipline-specific calculations. With minimal effort and without extensive knowledge of GPUs, you can now use the power of GPUs with MATLAB.
  9. (Previous script from AMBER 11 benchmarks; slide shows K20 results.) I briefly spoke about AMBER’s price performance in our opening. Now that you have seen how easy it is for researchers and scientists to benefit from GPU computing with ready-to-go applications or easy-to-implement developer approaches such as directives, we should revisit price performance. Here again, on a single node, adding 2 GPUs increases the node cost by roughly 50%, yet we get much more than a 50% performance improvement. In fact, with this application we achieve greater than 300% higher performance, making GPUs a clear winning investment. Additional information on K20 slide: 1 CPU node (dual CPUs) = 12.47 ns/day; 1 CPU + GPU node (dual CPUs and GPUs) = 95.59 ns/day.
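     The price-performance argument in these notes can be checked with a little arithmetic. The following is a minimal sketch in Python, using the two K20 ns/day figures quoted above and assuming, as the script does, that adding GPUs raises node cost by about 50%. The function name `price_performance` is hypothetical and chosen only for illustration.

```python
# ns/day figures quoted in the K20 slide notes (one node each).
cpu_ns_per_day = 12.47   # dual CPUs only
gpu_ns_per_day = 95.59   # dual CPUs plus GPUs

# Assumption taken from the script: GPUs add roughly 50% to node cost.
relative_cost = 1.5


def price_performance(baseline, accelerated, cost_ratio):
    """Return (speedup, performance-per-dollar gain) relative to the baseline."""
    speedup = accelerated / baseline
    return speedup, speedup / cost_ratio


speedup, perf_per_dollar = price_performance(
    cpu_ns_per_day, gpu_ns_per_day, relative_cost
)
print(f"speedup: {speedup:.2f}x")
print(f"performance per dollar: {perf_per_dollar:.2f}x")
```

     Dividing the speedup by the relative cost gives the performance-per-dollar gain, which is the quantity the slide is arguing about. Any value above 1.0 means the GPU node delivers more simulation per dollar than the CPU-only node.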
  10. NAMD, another extremely popular molecular dynamics package, shows up to a 2.7x speedup with GPUs. We benchmarked it with the typical STMV benchmark, which is 1 million atoms, so this is a very large system. But these are the system sizes and simulation times researchers need to make breakthroughs in science.

                          32        64          128       256       512       640       768
      s/step GPU XK6      1.2414    0.660887    0.342743  0.199465  0.10837   0.089752  0.0774948
      s/step CPU XK6      4.62633   2.36707     1.19722   0.609124  0.314745  0.255016  0.209511
      ns/day Fermi XK6    0.069599  0.13073339  0.252084  0.433159  0.797269  0.962655  1.114913517
      ns/day CPU XK6      0.018676  0.03650082  0.072167  0.141843  0.274508  0.338802  0.412388848
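      The NAMD benchmark figures in these notes can be cross-checked with a short calculation. The sketch below, in Python, derives the GPU-over-CPU speedup at each scale point from the seconds-per-step values and converts seconds per step to ns/day. Two things are assumptions rather than statements from the notes: the scale points (32 to 768) are not labeled in the source, and the 1 fs timestep is inferred because it reproduces the quoted ns/day values.

```python
# Scale points as listed in the notes (unlabeled there; possibly node counts).
scale_points = [32, 64, 128, 256, 512, 640, 768]

# Seconds per simulation step, copied from the notes.
gpu_s_per_step = [1.2414, 0.660887, 0.342743, 0.199465, 0.10837, 0.089752, 0.0774948]
cpu_s_per_step = [4.62633, 2.36707, 1.19722, 0.609124, 0.314745, 0.255016, 0.209511]

SECONDS_PER_DAY = 86_400
TIMESTEP_NS = 1e-6  # assumed 1 fs timestep, inferred from the quoted ns/day figures


def ns_per_day(seconds_per_step):
    """Convert wall-clock seconds per step into simulated nanoseconds per day."""
    return SECONDS_PER_DAY / seconds_per_step * TIMESTEP_NS


# Speedup of the GPU run over the CPU run at each scale point.
speedups = [cpu / gpu for cpu, gpu in zip(cpu_s_per_step, gpu_s_per_step)]

for n, s, g in zip(scale_points, speedups, gpu_s_per_step):
    print(f"{n:>4}: speedup {s:.2f}x, GPU {ns_per_day(g):.4f} ns/day")
```

      Under these assumptions the speedup at the largest scale point comes out close to the 2.7x quoted in the notes, and the ratio is larger at the smaller scale points.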
  11. Today, more than ever, it’s easy for researchers, scientists, and academic institutions to benefit from GPU computing. We have ready-to-go GPU-accelerated applications (see the Applications Catalog). We are continuously investing in creating the easiest approaches to quickly accelerating your own applications, with OpenACC directives being our latest development. And finally, the GPU Test Drive cluster is the ideal way to test how a particular application accelerates with GPUs. The GPU Test Drive cluster is also pre-configured for easy purchase and installation.
  12. Thank you for following along. I hope we have shown you that GPU computing is making extraordinary contributions to science and research. Now is the time to reach your next scientific computing achievements by investing in NVIDIA Tesla GPUs, which have worldwide adoption and world-class developer support.