HETEROGENEOUS SYSTEMS ARCHITECTURE:
THE NEXT AREA OF COMPUTING INNOVATION
CASE STUDY: THE HOLODECK
Dr. Lisa Su
Senior Vice President and GM, Global Business Units, AMD

ISSCC Conference
February 18, 2013
CHALLENGES TO MOORE’S LAW SCALING

[Charts: "Area Scaling by Technology Generation" (normalized area, 0.0 to 1.0) and "Cost Per Transistor Scaling" (normalized cost/transistor, 0.0 to 1.0), each plotted for 45nm, 40nm, 32nm, 28nm, 20nm, and 20 FinFET]

Lithography challenges begin severely limiting area scaling at the 20nm node
    – Fewer 1X metals due to cost
    – Less aggressive feature scaling due to lithography challenges

Compounded by rapidly increasing lithography costs
    – The 28nm to 20nm transition is an inflection point with dual exposure
    – No cost/transistor crossover for the first time at the 28nm to 20nm transition


2 | ISSCC Keynote | February 18th, 2013
A PARADIGM SHIFT…

[Figure: "Microprocessor Advancement" on the CPU side (Single-Core Era, Multi-Core Era, Heterogeneous Systems Era) set against "GPU Advancement" (graphics driver-based programs, OpenCL/DX driver-based programs, high-level programmable). Axis labels: Programmability, Throughput Performance. Other labels: Homogeneous Computing, Heterogeneous Computing, Accelerator.]

HETEROGENEOUS SYSTEMS ARCHITECTURE MEMORY MODEL
[Figure: memory model, from 32 bit (yesterday) to 64 bit (today)]

ARCHITECTURES – A HISTORICAL PERSPECTIVE

[Timeline from the Legacy Processing Era to the Surround Computing Era, with axis marks 1981, 1990s, 2000s, 2010s. Stages in order: Single Core CPUs; Traditionally Optimized Platforms; Multi-Core CPUs/GPUs; APUs and legacy SOC; Heterogeneous Architectures.]

CHANGING THE THINKING, CHANGING THE GAME

HSA is designed to make the GPU hardware
directly accessible to the software, using the
high-level languages programmers already use on
the CPU
• C, C++, Java, Python…even JavaScript, HTML5
• ISA agnostic – e.g., x86, 64-bit ARM, Radeon, Mali

GPU becomes a peer processor to the CPU in
terms of system integration
• Full programming language features
• Shared virtual memory: pointer is a pointer
• Coherency
• Context switching


  HSA Foundation – an
  industry-wide initiative
BENEFITS OF HETEROGENEOUS SYSTEM ARCHITECTURE




EFFECTIVE COMPUTE OFFLOAD

[Figure: APU-accelerated software applications mapped onto the HSA Accelerated Processing Unit, split into data parallel workloads and serial and task parallel workloads]

Made easy by HSA
Unleash the best compute elements depending on task


BRINGING IT ALL TOGETHER
[Charts: "MOTION DSP 720P". Power (0 W to 35 W), broken down into CPU Cores, NB+GPU, and DRAM, for CPU versus CPU+GPU. Performance (0 fps to 25 fps) for CPU versus CPU+GPU.]

Synergistic use of GPU compute + shared memory = lower power and higher performance
>4.0X Better Energy Efficiency [1]

[1] AMD internal testing: AMD E2-3200 APU (2 cores @ 2400 MHz, GPU: 2 CU @ 444 MHz),
Windows 7 OS, MotionDSP vReveal Applications 720P MP4 input
(http://www.vreveal.com/stabilization)


TODAY’S DISCUSSION: FROM SURROUND COMPUTING TO
ENABLING THE HOLODECK

1. A fully featured Holodeck is
   still many years away

2. Today our discussion will:
   • Establish a Holodeck framework
   • Identify Holodeck enabling technologies
   • Discuss how Heterogeneous Systems Architecture (HSA) accelerates these technologies
   • Undertake an HSA deep dive on one of these enabling technologies
   • Look at how new dedicated processors will enable Holodeck functionality


WHAT IS A HOLODECK?




THE HOLODECK FRAMEWORK:
AN EVOLUTION OF SURROUND COMPUTING

• Natural User Interfaces
• Context Computing
• 360 Degree Virtual Environments




HOLODECK ENABLING TECHNOLOGIES:
PROFOUND IMPLICATIONS FOR COMPUTER ARCHITECTURE

Computational Photography
• Delivering seamless and immersive video environments

Directional Audio
• Using audio to enhance immersion and realism of our environments

Natural User Interfaces
• Enabling realistic, natural human communication

Context Computing
• Delivering an intuitive understanding of the user’s needs in real time

Augmented Reality
• Bringing it all together – combining the real and the virtual

COMPUTATIONAL PHOTOGRAPHY
360 DEGREE VISUAL ENVIRONMENTS, PHOTOSTITCHING, PERIPHERAL VISION AND HSA

• Mapping real life scenes through finite images
    – Photo stitching of tiled environments and perceptual correction
    – Detect interest points & match features
    – Projecting geometry with point features using algorithms like RANSAC
• Image processing to account for curved screen surfaces
• Modulate brightness to account for peripheral vision

HSA presents a unified view of the
system with shared memory, so CPU and
GPU acceleration can be applied across the entire process
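
The feature-matching step above can be illustrated with a minimal RANSAC sketch. It assumes the simplest possible motion model, a pure 2D translation between two overlapping tiles, where a real stitcher would typically fit a homography. The function name, threshold, and synthetic matches are hypothetical and only serve the illustration.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=200, seed=0):
    """Estimate a 2D translation (dx, dy) from noisy point matches.

    matches: list of ((x1, y1), (x2, y2)) feature correspondences.
    Returns the refined (dx, dy) and the inlier matches that support it.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation needs only one correspondence as its minimal sample.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count the matches that agree with this hypothesis.
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= threshold
            and abs(b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set (for a translation, the mean offset).
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers

# Synthetic data: 40 matches shifted by (100, -5) plus three bad matches.
good = [((x, y), (x + 100, y - 5)) for x in range(0, 50, 5) for y in range(0, 20, 5)]
bad = [((3, 4), (400, 90)), ((10, 2), (-50, 7)), ((8, 8), (8, 300))]
shift, inliers = ransac_translation(good + bad)
```

Each hypothesis is scored independently of the others, which is one reason this kind of search is a candidate for data-parallel acceleration.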



DIRECTIONAL AUDIO

• Couples computationally demanding 3D audio and spatialization effects with
  "always on" background processing like Voice Activity Detection (VAD)
    – Voice activity detection is best implemented with special audio
      processors and acceleration techniques
    – Spatialization effects such as “Convolution Reverb” are best
      done with GPU acceleration

HSA enables seamless
integration of CPU and GPU
acceleration with other
independent accelerators
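
As a rough illustration of why convolution reverb suits data-parallel hardware, the sketch below convolves a dry signal with a room impulse response in direct form. The sample values and the impulse response are hypothetical. Production implementations usually use FFT-based convolution instead.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: every input sample adds a scaled copy
    of the impulse response to the output."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

dry = [1.0, 0.0, 0.0, 0.5]   # hypothetical input samples
room = [1.0, 0.0, 0.25]      # hypothetical response: direct sound plus one echo
wet = convolve(dry, room)    # [1.0, 0.0, 0.25, 0.5, 0.0, 0.125]
```

Each output sample is a sum of products that does not depend on any other output sample, so the outputs can be computed in parallel.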


NATURAL USER INTERFACES

• Speech Recognition:
    – Background processing: echo cancellation & noise suppression
    – Audio feature extraction
    – Voice pattern recognition through Markov model or similar algorithm
• Gesture Recognition:
    – Frame preprocessing & filtering
    – Optical flow or object tracking
    – Sophisticated computer vision algorithms to delineate the hand or
      body parts from the background

NUI algorithms all benefit from
CPU/GPU and audio processors to
efficiently perform these functions at
the lowest power

CONTEXT COMPUTING
BIOMETRICS EXAMPLE

   • Facial Recognition:
         • Face detection (is there a face) –
           GPU acceleration
         • Face identification (pattern
           matching through algorithms like
           Haar face detection) – CPU and
           GPU acceleration
         • Validation through blink detection
           (make sure it is a real face) –
           GPU acceleration



   HSA enables mix and match of the best
   acceleration for each phase of the
   process




AUGMENTED REALITY

 • Image Registration:
       • Relies on robust and fast feature
         detection – benefits from
         CPU/GPU acceleration
  • Object Tracking:
       • Relies on “optical flow” algorithm
         – benefits from CPU/GPU
         acceleration
  • Image Composition:
       • Once information exists from the
         above, becomes a classic
         graphics rendering use case


   The building blocks of HSA enable the
   augmented reality world.


THE WAY FORWARD

• Many technologies required to enable our vision
    – Heterogeneous engines that accelerate key client and server workloads
    – Datacenters optimized for latency, scalability, and efficiency
    – Processors optimized for new and emerging workloads
    – Active research into new algorithms




ENABLING TECHNOLOGY DEEP DIVE:
ACCELERATING NATURAL USER INTERFACES (HAAR FACE DETECTION)
WITH HETEROGENEOUS SYSTEMS ARCHITECTURE

LOOKING FOR FACES IN ALL THE RIGHT PLACES




 Quick HD Calculations
 Search square = 21 x 21
 Pixels = 1920 x 1080 = 2,073,600
 Search squares = 1900 x 1060 = ~2 Million
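
The count above follows from sliding a 21 x 21 window one pixel at a time across the frame. A small sketch of that arithmetic (the function name is hypothetical):

```python
def search_positions(width, height, window=21):
    """Number of distinct top-left positions for a window x window square."""
    return (width - window + 1) * (height - window + 1)

pixels = 1920 * 1080                       # 2,073,600
squares = search_positions(1920, 1080)     # 1900 * 1060 = 2,014,000
```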




LOOKING FOR DIFFERENT SIZE FACES
BY SCALING THE VIDEO FRAME




   More HD Calculations
   70% scaling in H and V
   Total Pixels = 4.07 Million
   Search squares = 3.8 Million
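
These totals can be approximated by summing over an image pyramid in which each level is 70% of the previous one in both dimensions. The sketch below assumes integer truncation at each level and stops once a 21 x 21 window no longer fits. The slide does not state either detail, so both are assumptions.

```python
def pyramid_totals(width, height, window=21):
    """Sum pixels and search squares over a pyramid scaled by 70% per level."""
    total_pixels = total_squares = 0
    while width >= window and height >= window:
        total_pixels += width * height
        total_squares += (width - window + 1) * (height - window + 1)
        # Scale both dimensions by 7/10 with integer arithmetic (assumed rounding).
        width, height = width * 7 // 10, height * 7 // 10
    return total_pixels, total_squares

total_pixels, total_squares = pyramid_totals(1920, 1080)
```

Under these assumptions the totals come out at roughly 4.06 million pixels and 3.87 million search squares, in line with the figures on the slide.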




HAAR CASCADE STAGES



[Flowchart: Stage N evaluates features k, l, m. Decision "Face still possible?": Yes leads to Stage N+1 (features p, r, q); No leads to REJECT FRAME.]
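
A minimal sketch of the stage-by-stage early out, assuming each stage sums its feature responses and compares the sum with a threshold. The toy features and thresholds below are hypothetical stand-ins for trained Haar features.

```python
def run_cascade(window, stages):
    """Return (is_face, stages_evaluated) for one search window.

    stages: list of (features, threshold) pairs. Each feature is a
    function window -> number. A stage passes when the summed feature
    responses reach its threshold.
    """
    for depth, (features, threshold) in enumerate(stages, start=1):
        score = sum(feature(window) for feature in features)
        if score < threshold:
            return False, depth        # early out: a face is no longer possible
    return True, len(stages)

# Toy two-stage cascade over a "window" of three pixel values.
stages = [
    ([lambda w: sum(w)], 10),                      # stage 1: enough brightness
    ([lambda w: max(w), lambda w: -min(w)], 3),    # stage 2: enough contrast
]
dark = run_cascade([1, 1, 1], stages)      # rejected at stage 1
flat = run_cascade([5, 5, 5], stages)      # rejected at stage 2
textured = run_cascade([2, 9, 4], stages)  # passes both stages
```

The number of stages evaluated differs from window to window, which is the source of the load imbalance discussed later in the deck.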




22 CASCADE STAGES, EARLY OUT BETWEEN EACH



[Flow: STAGE 1, STAGE 2, … STAGE 21, STAGE 22, then FACE CONFIRMED. Each stage can exit early to NO FACE.]

Final HD Calculations
  Search squares = 3.8 million
  Average features per square = 124
  Calculations per feature = 100
  Calculations per frame = 47 GCalcs

Calculation Rate
  30 frames/sec = 1.4 TCalcs/second
  60 frames/sec = 2.8 TCalcs/second

…and this only gets front-facing faces
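
The per-frame and per-second figures follow directly from multiplying the quantities listed on the slide. A short sketch of that arithmetic (variable names are illustrative):

```python
search_squares = 3_800_000
features_per_square = 124
calcs_per_feature = 100

calcs_per_frame = search_squares * features_per_square * calcs_per_feature

def calcs_per_second(frames_per_second):
    """Calculation rate needed to sustain the given frame rate."""
    return calcs_per_frame * frames_per_second

gcalcs_per_frame = calcs_per_frame / 1e9        # about 47 GCalcs
tcalcs_at_30 = calcs_per_second(30) / 1e12      # about 1.4 TCalcs/second
tcalcs_at_60 = calcs_per_second(60) / 1e12      # about 2.8 TCalcs/second
```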




CASCADE DEPTH ANALYSIS
[Chart: cascade depth reached (axis 0 to 25), with legend bins 0-5, 5-10, 10-15, 15-20, and 20-25]




UNBALANCING DUE TO EXITS IN EARLIER CASCADE STAGES




[Figure: work items marked Live or Dead]

• When running on the GPU, we run each search rectangle on a separate
  work item
• Early out algorithms, like Haar, exhibit divergence between work items
    – Some work items exit early
    – Their neighbors continue
    – SIMD packing suffers as a result
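
The cost of this divergence can be estimated with a simple model. The sketch assumes 64 work items per wavefront, equal cost per cascade stage, and a wavefront that keeps issuing until its deepest work item exits. All three are simplifying assumptions for illustration, not measurements from the talk.

```python
def simd_utilization(exit_depths, wavefront_size=64):
    """Fraction of SIMD lane-steps that do useful work.

    exit_depths[i] is the number of cascade stages work item i runs
    before it exits. Lanes that exit early sit idle ("dead") until the
    deepest lane in their wavefront finishes.
    """
    useful = issued = 0
    for start in range(0, len(exit_depths), wavefront_size):
        lanes = exit_depths[start:start + wavefront_size]
        useful += sum(lanes)
        issued += max(lanes) * wavefront_size
    return useful / issued

uniform = simd_utilization([22] * 64)           # every lane runs all 22 stages
one_face = simd_utilization([1] * 63 + [22])    # one live lane, 63 dead early
```

With one deep work item among 63 early exits, only 85 of the 1408 issued lane-steps are useful, about 6%.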

PROCESSING TIME/STAGE
[Chart: processing time in ms (0 to 100) per cascade stage (1 through 8, then 9-22) for GPU and CPU. A10-4600M (6 CU @ 497 MHz, 4 cores @ 2700 MHz).]



    AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G,
    6 compute units, 685MHz; 4GB RAM; Windows 7 (64-bit); OpenCL™ 1.1 (873.1)



PERFORMANCE CPU-VS-GPU
[Chart: images/sec (0 to 12) versus number of cascade stages run on the GPU (0 through 8, and 22). Legend: CPU, HSA, GPU. AMD A10-4600M APU (6 CU @ 497 MHz, 4 cores @ 2700 MHz).]



    AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G,
    6 compute units, 685MHz; 4GB RAM; Windows 7 (64-bit); OpenCL™ 1.1 (873.1)



HAAR SOLUTION
RUN DIFFERENT CASCADES ON GPU AND CPU


By seamlessly sharing data between CPU and GPU,
HSA allows the right processor to handle its appropriate
workload

+2.5x INCREASED PERFORMANCE
-2.5x DECREASED ENERGY PER FRAME
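
The split can be sketched as a two-phase pipeline: the early stages run over every search window, and only the survivors continue through the remaining stages. The code below is plain Python that simulates the hand-off. The function names, the `passes` predicate, and the toy stages are hypothetical, and no actual GPU dispatch is shown.

```python
def detect_hybrid(windows, stages, gpu_stages, passes):
    """Run the first `gpu_stages` stages on all windows, the rest on survivors.

    passes(window, stage) -> bool decides whether a window clears a stage.
    """
    # Phase 1 (the GPU's share in this scheme): wide, regular work over all windows.
    early = stages[:gpu_stages]
    survivors = [w for w in windows if all(passes(w, s) for s in early)]
    # Phase 2 (the CPU's share): few, irregular survivors through the deep stages.
    late = stages[gpu_stages:]
    faces = [w for w in survivors if all(passes(w, s) for s in late)]
    return survivors, faces

# Toy data: windows are scores, each stage is a rising threshold.
windows = list(range(10))
stages = [2, 4, 6, 8]
survivors, faces = detect_hybrid(windows, stages, gpu_stages=2,
                                 passes=lambda w, s: w >= s)
```

With shared memory, the survivor list produced by the first phase could be consumed by the second phase without a copy, which is the benefit the slide attributes to HSA.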




APPLICATION ACCELERATION USING HSA




  Acceleration vs. CPU:
    Gesture recognition    12x
    Photo indexing         10x
    Voice recognition      10x
    Visual search           9x
    Audio search            5x
    Stereo vision           4x
    Video stabilization     4x
    Face detect             2x

    AMD estimates. Source: AMD whitepaper, "Accelerating Consumer/Prosumer Multimedia with HSA", June 2012



HSA EVOLUTION

Llano: Physical Integration
  • Integrate CPU & GPU in silicon
  • Unified Memory Controller
  • Common Manufacturing Technology

Trinity: Optimized Platforms
  • GPU Compute C++ support
  • User mode scheduling
  • Bi-Directional Power Mgmt between CPU and GPU

Kaveri: Architectural Integration
  • Unified Address Space for CPU and GPU
  • GPU uses pageable system memory via CPU pointers
  • Fully coherent memory between CPU & GPU

Next Gen: System Integration
  • GPU compute context switch
  • GPU graphics pre-emption
  • Quality of Service




HSA PROGRAMMABILITY ADVANTAGE

[Diagram labels: HSA Foundation; Unified Programming Models (C, C++, Java …; OpenCL, C++ AMP, Java8 …; DX11, OpenGL …); Domain-Specific Ext / APIs; HSA Intermediate Language (HSAIL); Compute Acceleration; Graphics Acceleration]




          • Works with today’s programming models and languages

          • Architected to enable CPU-like programmability

          • Promotes development and adoption of extended standards
             • Write Once Run Anywhere – with Performance


CONCLUSION

• The age of traditional computing is dead.
• A paradigm shift in processing has brought about the Heterogeneous
  Systems Era
• HSA will enable us to dramatically scale processing power while
  increasing power efficiency
• The Holodeck is still years away, but HSA and dedicated hardware
  blocks will accelerate and enable technologies as they emerge




ACKNOWLEDGEMENTS

• Bill Herz
• Phil Rogers
• Marty Johnson
• Chris Hook
• Sumant Subramanian




THANK YOU
DISCLAIMER
 The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and
 typographical errors.

 The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to
 product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences
 between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or
 otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to
 time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

 AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO
 RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

 AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN
 NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES
 ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGES.

 ATTRIBUTION
 © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, and combinations thereof
 are trademarks of Advanced Micro Devices, Inc. Other names and logos are used for informational purposes only and may
 be trademarks of their respective owners.




38 | ISSCC Keynote | February 18th, 2013

More Related Content

What's hot

Moving to PCI Express based SSD with NVM Express
Moving to PCI Express based SSD with NVM ExpressMoving to PCI Express based SSD with NVM Express
Moving to PCI Express based SSD with NVM ExpressOdinot Stanislas
 
AMD processors
AMD processorsAMD processors
AMD processorssanthu652
 
Heterogeneous computing
Heterogeneous computingHeterogeneous computing
Heterogeneous computingRashid Ansari
 
Fundamentals of Servers, server storage and server security.
Fundamentals of Servers, server storage and server security.Fundamentals of Servers, server storage and server security.
Fundamentals of Servers, server storage and server security.Aakash Panchal
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD
 
NAS - Network Attached Storage
NAS - Network Attached StorageNAS - Network Attached Storage
NAS - Network Attached StorageShashank Bhatnagar
 
ISSCC 2018: "Zeppelin": an SoC for Multi-chip Architectures
ISSCC 2018: "Zeppelin": an SoC for Multi-chip ArchitecturesISSCC 2018: "Zeppelin": an SoC for Multi-chip Architectures
ISSCC 2018: "Zeppelin": an SoC for Multi-chip ArchitecturesAMD
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)Fatima Qayyum
 
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28AMD
 
Develop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM BoardsDevelop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM BoardsNational Cheng Kung University
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
Presentation on - Processors
Presentation on - Processors Presentation on - Processors
Presentation on - Processors The Avi Sharma
 
Parallel computing
Parallel computingParallel computing
Parallel computingVinay Gupta
 
4. motherboard
4.   motherboard4.   motherboard
4. motherboardjazz_306
 
Computer architecture memory system
Computer architecture memory systemComputer architecture memory system
Computer architecture memory systemMazin Alwaaly
 

What's hot (20)

Moving to PCI Express based SSD with NVM Express
Moving to PCI Express based SSD with NVM ExpressMoving to PCI Express based SSD with NVM Express
Moving to PCI Express based SSD with NVM Express
 
AMD processors
AMD processorsAMD processors
AMD processors
 
Heterogeneous computing
Heterogeneous computingHeterogeneous computing
Heterogeneous computing
 
Storage basics
Storage basicsStorage basics
Storage basics
 
Fundamentals of Servers, server storage and server security.
Fundamentals of Servers, server storage and server security.Fundamentals of Servers, server storage and server security.
Fundamentals of Servers, server storage and server security.
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor Architecture
 
NAS - Network Attached Storage
NAS - Network Attached StorageNAS - Network Attached Storage
NAS - Network Attached Storage
 
ISSCC 2018: "Zeppelin": an SoC for Multi-chip Architectures
ISSCC 2018: "Zeppelin": an SoC for Multi-chip ArchitecturesISSCC 2018: "Zeppelin": an SoC for Multi-chip Architectures
ISSCC 2018: "Zeppelin": an SoC for Multi-chip Architectures
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)
 
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
AMD and the new “Zen” High Performance x86 Core at Hot Chips 28
 
RISC-V Introduction
RISC-V IntroductionRISC-V Introduction
RISC-V Introduction
 
ARM Architecture
ARM ArchitectureARM Architecture
ARM Architecture
 
Develop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM BoardsDevelop Your Own Operating Systems using Cheap ARM Boards
Develop Your Own Operating Systems using Cheap ARM Boards
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
Presentation on - Processors
Presentation on - Processors Presentation on - Processors
Presentation on - Processors
 
Parallel computing
Parallel computingParallel computing
Parallel computing
 
4. motherboard
4.   motherboard4.   motherboard
4. motherboard
 
Gpu
GpuGpu
Gpu
 
SSD PPT BY SAURABH
SSD PPT BY SAURABHSSD PPT BY SAURABH
SSD PPT BY SAURABH
 
Computer architecture memory system
Computer architecture memory systemComputer architecture memory system
Computer architecture memory system
 

Viewers also liked

NUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphereNUMA Performance Considerations in VMware vSphere
NUMA Performance Considerations in VMware vSphereAMD
 
Open compute technology
Open compute technologyOpen compute technology
Open compute technologyAMD
 
AMD - Why, What and How
AMD - Why, What and HowAMD - Why, What and How
AMD - Why, What and HowMike Wilcox
 
AMD Radeon Technology Group Summit
AMD Radeon Technology Group SummitAMD Radeon Technology Group Summit
AMD Radeon Technology Group SummitLow Hong Chuan
 
AMD 2014 Performance Mobile APUs
AMD 2014 Performance Mobile APUsAMD 2014 Performance Mobile APUs
AMD 2014 Performance Mobile APUsAMD
 
2014 AMD Low-Power Mobile APUs
2014 AMD Low-Power Mobile APUs2014 AMD Low-Power Mobile APUs
2014 AMD Low-Power Mobile APUsAMD
 
AMD 2014 Mobility APU Lineup Announcement
AMD 2014 Mobility APU Lineup AnnouncementAMD 2014 Mobility APU Lineup Announcement
AMD 2014 Mobility APU Lineup AnnouncementAMD
 
Wps104 direct x 12 a new meaning for efficiency and performance (presented ...
Wps104   direct x 12 a new meaning for efficiency and performance (presented ...Wps104   direct x 12 a new meaning for efficiency and performance (presented ...
Wps104 direct x 12 a new meaning for efficiency and performance (presented ...Jose Fajardo
 
Apu14 beijing final for show english press
Apu14 beijing final for show english pressApu14 beijing final for show english press
Apu14 beijing final for show english pressLow Hong Chuan
 
Progress Toward Topical Therapy of AMD
Progress Toward Topical Therapy of AMDProgress Toward Topical Therapy of AMD
Progress Toward Topical Therapy of AMDRick Trevino
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 
Radeon Software Crimson ReLive
Radeon Software Crimson ReLive Radeon Software Crimson ReLive
Radeon Software Crimson ReLive Low Hong Chuan
 
AMD Hot Chips Bulldozer & Bobcat Presentation
AMD Hot Chips Bulldozer & Bobcat PresentationAMD Hot Chips Bulldozer & Bobcat Presentation
AMD Hot Chips Bulldozer & Bobcat PresentationAMD
 
Age Related Macular Degeneration
Age Related Macular DegenerationAge Related Macular Degeneration
Age Related Macular DegenerationJody Abrams
 
Whats New in AMD - 2015
Whats New in AMD - 2015Whats New in AMD - 2015
Whats New in AMD - 2015Rick Trevino
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
Amd Ryzen December 2016 Update
Amd Ryzen December 2016 Update Amd Ryzen December 2016 Update
Amd Ryzen December 2016 Update Low Hong Chuan
 
AMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick BergmanAMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick BergmanAMD
 
AMD CFO Commentary slides 14Q4
AMD CFO Commentary slides 14Q4AMD CFO Commentary slides 14Q4
AMD CFO Commentary slides 14Q4Low Hong Chuan
 

HSA Powers the Holodeck: Heterogeneous Computing Enables Immersive Virtual Environments

  • 1. HETEROGENEOUS SYSTEMS ARCHITECTURE: THE NEXT AREA OF COMPUTING INNOVATION
       CASE STUDY: THE HOLODECK
       Dr. Lisa Su, Senior Vice President and GM, Global Business Units, AMD
       ISSCC Conference, February 18, 2013
  • 2. CHALLENGES TO MOORE’S LAW SCALING
       [Charts: "Area Scaling by Technology Generation" (normalized area) and "Cost Per Transistor Scaling" (normalized cost/transistor), each plotted for 45nm, 40nm, 32nm, 28nm, 20nm, and 20 FinFET]
       Lithography challenges begin severely limiting area scaling at 20nm node
         – Fewer 1X metals due to cost
         – Less aggressive feature scaling due to lithography challenges
       Compounded by rapidly increasing lithography costs
         – 28nm to 20nm transition is inflection point with dual exposure
         – No cost/transistor crossover for first time at 28nm to 20nm transition
       2 | ISSCC Keynote | February 18th, 2013
  • 3. A PARADIGM SHIFT…
       [Diagram: Microprocessor Advancement (CPU) through the Single-Core Era, Multi-Core Era, and Heterogeneous Systems Era, plotted against GPU Advancement from graphics driver-based programs, to OpenCL/DX driver-based programs, to high-level programmable. Axes: Programmability and Throughput Performance. Region labels: Homogeneous Computing, Heterogeneous Computing, Accelerator]
  • 4. HETEROGENEOUS SYSTEMS ARCHITECTURE MEMORY MODEL
       Yesterday: from 32 bit
       Today: to 64 bit
  • 5. ARCHITECTURES – A HISTORICAL PERSPECTIVE
       [Timeline: 1981, 1990s, 2000s, 2010s, running from the Legacy Processing Era to the Surround Computing Era. Labels: Single Core CPUs; Traditionally Optimized Platforms; Multi-Core CPUs/GPUs; APUs and legacy SOC; Heterogeneous Architectures]
  • 6. CHANGING THE THINKING, CHANGING THE GAME
       HSA is designed to make the GPU hardware directly accessible to software, using the high-level languages programmers already use on the CPU
         – C, C++, Java, Python… even JavaScript, HTML5
         – ISA agnostic – e.g., x86, 64-bit ARM, Radeon, Mali
       GPU becomes a peer processor to the CPU in terms of system integration
         – Full programming language features
         – Shared virtual memory: pointer is a pointer
         – Coherency
         – Context switching
       HSA Foundation – an industry-wide initiative
  • 7. BENEFITS OF HETEROGENEOUS SYSTEM ARCHITECTURE
  • 8. EFFECTIVE COMPUTE OFFLOAD
       [Diagram: HSA Accelerated Software Applications running on an APU (Accelerated Processing Unit), with Serial and Task Parallel Workloads and Data Parallel Workloads each mapped to a compute element]
       Made easy by HSA: unleash the best compute elements depending on task
  • 9. BRINGING IT ALL TOGETHER
       [Charts: MOTION DSP 720P. "Power" (0 W to 35 W, stacked as CPU Cores, NB+GPU, and DRAM) and "Performance" (0 fps to 25 fps), each comparing CPU against CPU+GPU]
       Synergistic use of GPU compute + shared memory = lower power and higher performance
       >4.0X better energy efficiency¹
       ¹ AMD internal testing: AMD E2-3200 APU (2 cores @ 2400 MHz, GPU: 2 CU @ 444 MHz), Windows 7 OS, MotionDSP vReveal Applications, 720P MP4 input (http://www.vreveal.com/stabilization)
  • 10. TODAY’S DISCUSSION: FROM SURROUND COMPUTING TO ENABLING THE HOLODECK
       1. A fully featured Holodeck is still many years away
       2. Today our discussion will:
         – Establish a Holodeck framework
         – Identify Holodeck enabling technologies
         – Discuss how Heterogeneous Systems Architecture (HSA) accelerates these technologies
         – Undertake an HSA deep dive on one of these enabling technologies
         – Look at how new dedicated processors will enable Holodeck functionality
  • 11. WHAT IS A HOLODECK?
  • 12. THE HOLODECK FRAMEWORK: AN EVOLUTION OF SURROUND COMPUTING
         – Natural User Interfaces
         – Context Computing
         – 360 Degree Virtual Environments
  • 13. HOLODECK ENABLING TECHNOLOGIES: PROFOUND IMPLICATIONS FOR COMPUTER ARCHITECTURE
       Computational Photography
         – Delivering seamless and immersive video environments
       Directional Audio
         – Using audio to enhance immersion and realism of our environments
       Natural User Interfaces
         – Enabling realistic, natural human communication
       Context Computing
         – Delivering an intuitive understanding of the user’s needs in real time
       Augmented Reality
         – Bringing it all together – combining the real and the virtual
  • 14. COMPUTATIONAL PHOTOGRAPHY: 360 DEGREE VISUAL ENVIRONMENTS, PHOTOSTITCHING, PERIPHERAL VISION AND HSA
         – Mapping real life scenes through finite images
         – Photo stitching of tiled environments and perceptual correction
         – Detect interest points & match features
         – Projecting geometry with point features using algorithms like RANSAC
         – Image processing to account for curved screen surfaces
         – Modulate brightness to account for peripheral vision
       HSA presents a unified view of the system with shared memory, so CPU and GPU acceleration can be used throughout the entire process
  • 15. DIRECTIONAL AUDIO
         – Couples computationally demanding 3D audio and spatialization effects with "always on" background processing like Voice Activity Detection (VAD)
         – Voice activity detection is best implemented with special audio processors and acceleration techniques
         – Spatialization effects such as “Convolution Reverb” are best done with GPU acceleration
       HSA enables seamless integration of CPU and GPU acceleration with other independent accelerators
  • 16. NATURAL USER INTERFACES
       Speech Recognition:
         – Background processing – echo cancellation & noise suppression
         – Audio feature extraction
         – Voice pattern recognition through Markov model or similar algorithm
       Gesture Recognition:
         – Frame preprocessing & filtering
         – Optical flow or object tracking
         – Sophisticated computer vision algorithms to delineate the hand or body parts from the background
       NUI algorithms all benefit from CPU/GPU and audio processors to efficiently perform these functions at the lowest power
  • 17. CONTEXT COMPUTING: BIOMETRICS EXAMPLE
       Facial Recognition:
         – Face detection (is there a face) – GPU acceleration
         – Face identification (pattern matching through algorithms like Haar face detection) – CPU and GPU acceleration
         – Validation through blink detection (make sure it is a real face) – GPU acceleration
       HSA enables mix and match of the best acceleration for each phase of the process
  • 18. AUGMENTED REALITY
       Image Registration:
         – Relies on robust and fast feature detection – benefits from CPU/GPU acceleration
       Object Tracking:
         – Relies on “optical flow” algorithm – benefits from CPU/GPU acceleration
       Image Composition:
         – Once information exists from the above, becomes a classic graphics rendering use case
       The building blocks of HSA enable the augmented reality world.
  • 19. THE WAY FORWARD
      Many technologies required to enable our vision
      – Heterogeneous engines that accelerate key client and server workloads
      – Datacenters optimized for latency, scalability, and efficiency
      – Processors optimized for new and emerging workloads
      – Active research into new algorithms
  • 20. ENABLING TECHNOLOGY DEEP DIVE: ACCELERATING NATURAL USER INTERFACES (HAAR FACE DETECTION) WITH HETEROGENEOUS SYSTEMS ARCHITECTURE
  • 21. LOOKING FOR FACES IN ALL THE RIGHT PLACES
  • 22. LOOKING FOR FACES IN ALL THE RIGHT PLACES
      Quick HD Calculations
      – Search square = 21 x 21
      – Pixels = 1920 x 1080 = 2,073,600
      – Search squares = 1900 x 1060 = ~2 Million
  • 23. LOOKING FOR DIFFERENT SIZE FACES BY SCALING THE VIDEO FRAME
  • 24. LOOKING FOR DIFFERENT SIZE FACES BY SCALING THE VIDEO FRAME
      More HD Calculations
      – 70% scaling in H and V
      – Total Pixels = 4.07 Million
      – Search squares = 3.8 Million
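The totals on slides 22 and 24 can be approximately reproduced with a short sketch. It assumes that the frame is repeatedly scaled to 70% in each dimension, that sizes are rounded down, and that scaling stops once the frame is smaller than the 21 x 21 search square. The slides do not state the rounding or the number of levels, so the results only land near the quoted figures.

```python
def pyramid_totals(width=1920, height=1080, window=21, scale_pct=70):
    """Count pixels and search-square positions over a downscaled image pyramid."""
    total_pixels = 0
    total_windows = 0
    levels = []
    w, h = width, height
    while w >= window and h >= window:
        # A 21-pixel window fits at (w - 21 + 1) positions per axis,
        # which is the 1900 x 1060 quoted for the full HD frame.
        windows = (w - window + 1) * (h - window + 1)
        levels.append((w, h, windows))
        total_pixels += w * h
        total_windows += windows
        # Integer arithmetic avoids floating-point rounding in the 70% step.
        w = w * scale_pct // 100
        h = h * scale_pct // 100
    return total_pixels, total_windows, levels

pixels, windows, levels = pyramid_totals()
print(f"levels: {len(levels)}")
print(f"total pixels: {pixels / 1e6:.2f} million")
print(f"search squares: {windows / 1e6:.2f} million")
```

Under these assumptions the totals come out within about one or two percent of the slide's 4.07 million pixels and 3.8 million search squares; the small gap is consistent with a different rounding rule or level count in the original calculation.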
  • 25. HAAR CASCADE STAGES
      [Flowchart: Stage N evaluates Feature k, Feature l and Feature m, then asks "Face still possible?". Yes: continue to Stage N+1, which evaluates Feature p, Feature q and Feature r. No: REJECT FRAME.]
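The stage-by-stage flow on this slide can be sketched as a loop with an early exit. This is a minimal illustration, not the code used in the talk: the stage representation (a list of feature functions plus a pass threshold) and the toy features are assumptions made for the example.

```python
def run_cascade(window, stages):
    """Return (is_face, stages_evaluated) for one search window.

    stages: list of (features, threshold). Each feature is a callable that
    maps the window to a score; a stage passes when the summed score
    reaches its threshold.
    """
    for depth, (features, threshold) in enumerate(stages, start=1):
        score = sum(feature(window) for feature in features)
        if score < threshold:
            return False, depth  # early out: no later stage runs
    return True, len(stages)

# Toy features (hypothetical): a window is just a list of pixel values.
def mean(px):
    return sum(px) / len(px)

def spread(px):
    return max(px) - min(px)

stages = [([mean], 50), ([spread], 30), ([mean, spread], 120)]

dark = run_cascade([10, 12, 11, 9], stages)      # fails stage 1
flat = run_cascade([80, 80, 80, 80], stages)     # fails stage 2
textured = run_cascade([90, 40, 120, 70], stages)  # passes all stages
print(dark, flat, textured)
```

The second return value is the cascade depth reached by a window, which is the quantity the later slides analyze.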
  • 26. 22 CASCADE STAGES, EARLY OUT BETWEEN EACH
      [Diagram: STAGE 1, STAGE 2, …, STAGE 21, STAGE 22, then FACE CONFIRMED; each stage can exit early to NO FACE.]
      Final HD Calculations
      – Search squares = 3.8 million
      – Average features per square = 124
      – Calculations per feature = 100
      – Calculations per frame = 47 GCalcs
      Calculation Rate
      – 30 frames/sec = 1.4 TCalcs/second
      – 60 frames/sec = 2.8 TCalcs/second
      …and this only gets front-facing faces
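The figures on this slide follow from multiplying the three per-frame quantities and then the frame rate. The short check below uses only the numbers quoted on the slide; the variable names are illustrative.

```python
search_squares = 3.8e6        # from the scaled-frame calculation
features_per_square = 124     # average, as quoted on the slide
calcs_per_feature = 100

calcs_per_frame = search_squares * features_per_square * calcs_per_feature
print(f"per frame: {calcs_per_frame / 1e9:.0f} GCalcs")

for fps in (30, 60):
    rate = calcs_per_frame * fps
    print(f"{fps} frames/sec: {rate / 1e12:.1f} TCalcs/second")
```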
  • 27. CASCADE DEPTH ANALYSIS
      [Chart: cascade depth, axis from 0 to 25, with legend bins 0-5, 5-10, 10-15, 15-20 and 20-25.]
  • 28. UNBALANCING DUE TO EXITS IN EARLIER CASCADE STAGES
      [Diagram legend: Live, Dead]
      – When running on the GPU, we run each search rectangle on a separate work item
      – Early-out algorithms, like Haar, exhibit divergence between work items
          – Some work items exit early
          – Their neighbors continue
          – SIMD packing suffers as a result
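The packing loss described on this slide can be estimated with a small model. It assumes work items execute in lockstep groups that keep running until their deepest member finishes; the group width of 64 lanes and the example exit depths are assumptions for illustration, not measurements from the talk.

```python
def simd_utilization(exit_depths, lanes=64):
    """Fraction of issued lane-steps that do useful work.

    exit_depths: cascade stage at which each work item stops.
    Each group of `lanes` work items runs for as many stages as its
    deepest member, so lanes that exited early sit idle ("dead").
    """
    useful = 0
    issued = 0
    for start in range(0, len(exit_depths), lanes):
        group = exit_depths[start:start + lanes]
        useful += sum(group)
        issued += max(group) * lanes
    return useful / issued

uniform = simd_utilization([1] * 64)           # every lane exits together
divergent = simd_utilization([1] * 63 + [22])  # one lane runs all 22 stages
print(f"uniform: {uniform:.2f}, divergent: {divergent:.3f}")
```

In this model a single deep work item keeps the other 63 lanes idle for 21 stages, which is the kind of unbalancing the slide attributes to early exits.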
  • 29. PROCESSING TIME/STAGE: A10-4600M (6CU@497MHz, 4 cores@2700MHz)
      [Chart: GPU and CPU processing time per cascade stage. Y axis: Time (ms), 0 to 100. X axis: Cascade Stage, 1 through 8, then 9-22.]
      AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G, 6 compute units, 685MHz; 4GB RAM; Windows 7 (64-bit); OpenCL™ 1.1 (873.1)
  • 30. PERFORMANCE CPU-VS-GPU: AMD A10-4600M APU (6CU@497MHz, 4 cores@2700MHz)
      [Chart: throughput for CPU, HSA and GPU. Y axis: Images/Sec, 0 to 12. X axis: Number of Cascade Stages on GPU, 0 through 8, then 22.]
      AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G, 6 compute units, 685MHz; 4GB RAM; Windows 7 (64-bit); OpenCL™ 1.1 (873.1)
  • 31. HAAR SOLUTION: RUN DIFFERENT CASCADES ON GPU AND CPU
      By seamlessly sharing data between CPU and GPU, HSA allows the right processor to handle its appropriate workload
      – +2.5x INCREASED PERFORMANCE
      – -2.5x DECREASED ENERGY PER FRAME
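The split described on this slide can be sketched as a two-phase loop: the first few stages run over every window in bulk, and only the survivors are finished one at a time. This is a plain-Python stand-in under stated assumptions: both phases run on the CPU here, and the function name, the `stage_passes` callback and the toy data are hypothetical.

```python
def hybrid_detect(windows, stages, gpu_stages, stage_passes):
    """Run the first `gpu_stages` stages in bulk, then finish the survivors.

    Returns (faces, handed_over), where handed_over is the number of
    windows still alive when the work moves to the second phase.
    """
    survivors = list(windows)
    # Bulk phase (stands in for the GPU): all live windows run the same
    # stage together, then rejected windows are compacted away.
    for stage in stages[:gpu_stages]:
        survivors = [w for w in survivors if stage_passes(stage, w)]
    # Serial phase (stands in for the CPU): each survivor runs the
    # remaining stages and stops at its first failure.
    faces = [w for w in survivors
             if all(stage_passes(s, w) for s in stages[gpu_stages:])]
    return faces, len(survivors)

# Toy model (assumed): a window is an integer "quality" and stage k
# passes any window with quality >= k.
stages = list(range(1, 23))   # 22 stages
windows = list(range(30))
faces, handed_over = hybrid_detect(windows, stages, 3, lambda s, w: w >= s)
print(len(faces), handed_over)
```

The detected set does not depend on where the split is placed; only the amount of work each phase performs changes, which is what the number of stages assigned to the GPU controls.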
  • 32. APPLICATION ACCELERATION USING HSA
      Acceleration vs. CPU (AMD estimates)
      – Gesture recognition: 12x
      – Photo indexing: 10x
      – Voice recognition: 10x
      – Visual search: 9x
      – Audio search: 5x
      – Stereo vision: 4x
      – Video stabilization: 4x
      – Face detect: 2x
      Source: AMD Whitepaper, Accelerating Consumer/Prosumer Multimedia with HSA, June 2012
  • 33. HSA EVOLUTION
      – Llano, Physical Integration: Integrate CPU & GPU in silicon; Unified Memory Controller; Common Manufacturing Technology
      – Trinity, Optimized Platforms: GPU Compute C++ support; User mode scheduling; Bi-Directional Power Mgmt between CPU and GPU
      – Kaveri, Architectural Integration: Unified Address Space for CPU and GPU; GPU uses pageable system memory via CPU pointers; Fully coherent memory between CPU & GPU
      – Next Gen, System Integration: GPU compute context switch; GPU graphics pre-emption; Quality of Service
  • 34. HSA PROGRAMMABILITY ADVANTAGE
      [Diagram: Unified Programming Models. Domain-Specific Ext / APIs layer: HSA (C, C++, Java …); OpenCL, C++ AMP, Java8 …; DX11, OpenGL …. Foundation layer: HSA Intermediate Language (HSAIL). Base: Compute Acceleration, Graphics Acceleration.]
      – Works with today’s programming models and languages
      – Architected to enable CPU-like programmability
      – Promotes development and adoption of extended standards
      – Write Once Run Anywhere – with Performance
  • 35. CONCLUSION
      – The age of traditional computing is dead.
      – A paradigm shift in processing has brought about the Heterogeneous Systems Era
      – HSA will enable us to dramatically scale processing power while increasing power efficiency
      – The Holodeck is still years away, but HSA and dedicated hardware blocks will accelerate and enable technologies as they emerge
  • 36. ACKNOWLEDGEMENTS
      – Bill Herz
      – Phil Rogers
      – Marty Johnson
      – Chris Hook
      – Sumant Subramanian
  • 38. DISCLAIMER The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. ATTRIBUTION © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names and logos are used for informational purposes only and may be trademarks of their respective owners. 38 | ISSCC Keynote | February 18th, 2013