SlideShare a Scribd company logo
1 of 51
Download to read offline
H2O4GPU and GoAI: harnessing the power of GPUs.
Mateusz	Dymczyk	
Senior	Software	Engineer	
H2O.ai	
@mdymczyk
Agenda
• About me
• About H2O.ai
• A bit of history: H2O-3
• Moving forward: feature engineering & Driverless AI
• The need for GPUs
• GPU overview
• Machine Learning + GPUs = why? how?
• About GoAI
• About H2O4GPU
• Q&A
About me
• M.Sc. in Computer Science @ AGH UST in Poland
• Ph.D. dropout (machine learning)
• Previously NLP/ML @ Fujitsu Laboratories, Kanagawa
• Currently Lead/Senior Machine Learning Engineer @
H2O.ai (remotely from Tokyo)
• Conference speaker (Strata Beijing/NY/Singapore,
Hadoop World Tokyo etc.)
About H2O.ai
FOUNDED 2012, SERIES C IN NOV, 2017
PRODUCTS • DRIVERLESS AI – AUTOMATED MACHINE LEARNING
• H2O OPEN SOURCE MACHINE LEARNING
• SPARKLING WATER
• H2O4GPU OS ML GPU LIBRARY
MISSION DEMOCRATIZE AI
TEAM • ~100 EMPLOYEES
• SEVERAL KAGGLE GRANDMASTERS
• DISTRIBUTED SYSTEMS ENGINEERS DOING MACHINE LEARNING
• WORLD-CLASS VISUALIZATION DESIGNERS
OFFICES MOUNTAIN VIEW, LONDON, PRAGUE
Community Adoption
*	DATA	FROM	GOOGLE	ANALYTICS	EMBEDDED	IN	THE	END	USER	PRODUCT
Select Customers
Financial InsuranceMarketing TelecomHealthcareRetail
“Overall customer satisfaction is very high.” - Gartner
Advisory &
Accounting
A bit of history: H2O-3
H2O-3 Overview
• Distributed implementations of cutting edge ML algorithms.
• Core algorithms written in high performance Java.
• APIs available in R, Python, Scala, REST/JSON.
• Interactive Web GUI called H2O Flow.
• Easily deploy models to production with H2O Steam.
H2O-3 Distributed Computing
• Multi-node cluster with shared memory model.
• All computations in memory.
• Each node sees only some rows of the data.
• No limit on cluster size.
• Distributed data frames (collection of vectors).
• Columns are distributed (across nodes) arrays.
• Works just like R’s data.frame or Python Pandas DataFrame
H2O Frame
H2O Cluster
H2O-3 Algorithms
Supervised Learning
• Generalized Linear Models: Binomial,
Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Statistical
Analysis
Ensembles
• Distributed Random Forest:
Classification or regression models
• Gradient Boosting Machine: Produces
an ensemble of decision trees with
increasing refined approximations
Deep Neural
Networks
• Deep learning: Create multi-layer feed
forward neural networks starting with an
input layer followed by multiple layers of
nonlinear transformations
Unsupervised Learning
• K-means: Partitions observations into k
clusters/groups of the same spatial size.
Automatically detect optimal k
Clustering
Dimensionality
Reduction
• Principal Component Analysis: Linearly transforms
correlated variables to independent components
• Generalized Low Rank Models: extend the idea of
PCA to handle arbitrary data consisting of numerical,
Boolean, categorical, and missing data
Anomaly
Detection
• Autoencoders: Find outliers using a
nonlinear dimensionality reduction using
deep learning
DriverlessAI & Feature Engineering
The Need for Automation
“The United States alone faces a shortage of 140,000 to
190,000 people with analytical expertise and 1.5 million
managers and analysts”
–McKinsey Prediction for 2018
Recipe for Success
Auto Feature Generation

Kaggle Grand Master Out of the Box • Automatic Text Handling
• Frequency Encoding
• Cross Validation Target
Encoding
• Truncated SVD
• Clustering and more
Feature Transformations
Generated Features
Original Features
Recipe for Success
Recipe for Success
Driverless AI
AI to do AI
3 Pillars
Speed Accuracy Interpretability
The need for GPUs
Moore’s Law
1980 1990 2000 2010 2020
102
103
104
105
106
107
40	Years	of	Microprocessor	Trend	Data
Original	data	up	to	the	year	2010	collected	and	plotted	by	M.	Horowitz,	F.	Labonte,	O.	Shacham,	
K.	Olukotun,	L.	Hammond,	and	C.	Batten	New	plot	and	data	collected	for	2010-2015	by	K.	Rupp
Single-threaded	perf
1.5X	per	year
1.1X	per	year
Transistors

(thousands)
GPU
1980 1990 2000 2010 2020
GPU-Computing	perf	
1.5X	per	year
1000X
by	2025
Original	data	up	to	the	year	2010	collected	and	plotted	by	M.	Horowitz,	F.	Labonte,	O.	Shacham,	
K.	Olukotun,	L.	Hammond,	and	C.	Batten	New	plot	and	data	collected	for	2010-2015	by	K.	Rupp
102
103
104
105
106
107
Single-threaded	perf
1.5X	per	year
1.1X	per	year
APPLICATIONS
SYSTEMS
ALGORITHMS
CUDA
ARCHITECTURE
GoAI
GPU Shortcomings
GPU
Global Memory
Thread
Local
Thread
Local
Thread
Local
Shared
Thread
Local
Thread
Local
Thread
Local
Shared
Thread
Local
Thread
Local
Thread
Local
Shared
Thread
Local
Thread
Local
Thread
Local
Shared
CPU
Host
Memory
C
PU
copies
data
from
host
to
G
PU
m
em
ory
via
PC
I-E
CPU launches kernels
SLOW!!!
GPU Open Analytics Initiative (GOAI)
github.com/gpuopenanalytics
GPU Data Frame (GDF)
Ingest/

Parse
Exploratory
Analysis
Feature
Engineering
ML/DL
Algorithms
Grid Search
Scoring
Model

Export
GOAI Data Flow
GPU Overview
GPU architecture
Low latency vs High throughput
GPU
• Optimized for data-parallel,
throughput computation
• Architecture tolerant of
memory latency
• More transistors dedicated to
computation
CPU
• Optimized for low-latency
access to cached data sets
• Control logic for out-of-order
and speculative execution
GPU Enhanced Applications
Application Code
GPU
Use GPU to
Parallelize
Compute-Intensive
Functions CPU
Rest of Sequential
CPU Code
Machine Learning on GPU
Machine Learning and GPUs
2
4 A
3
5
m ⇥ k
2
4 B
3
5
k ⇥ n
=
2
4 C
3
5
m ⇥ n
Matrix Multiplication
2
6
6
6
6
6
4
a1,1 a1,2 a1,3 . . . a1,k
a2,1 a2,2 a2,3 . . . a2,k
a3,1 a3,2 a3,3 . . . a3,k
...
...
...
...
...
am,1 am,2 am,3 . . . am,k
3
7
7
7
7
7
5
A
2
6
6
6
6
6
4
b1,1 b1,2 b1,3 . . . b1,n
b2,1 b2,2 b2,3 . . . b2,n
b3,1 b3,2 b3,3 . . . b3,n
...
...
...
...
...
bk,1 bk,2 bk,3 . . . bk,n
3
7
7
7
7
7
5
B
=
2
6
6
6
6
6
4
c1,1 c1,2 c1,3 . . . c1,n
c2,1 c2,2 c2,3 . . . c2,n
c3,1 c3,2 c3,3 . . . c3,n
...
...
...
...
...
cm,1 cm,2 cm,3 . . . cm,n
3
7
7
7
7
7
5
C
Matrix Multiplication
2
6
6
6
6
6
4
a1,1 a1,2 a1,3 . . . a1,k
a2,1 a2,2 a2,3 . . . a2,k
a3,1 a3,2 a3,3 . . . a3,k
...
...
...
...
...
am,1 am,2 am,3 . . . am,k
3
7
7
7
7
7
5
A
2
6
6
6
6
6
4
b1,1 b1,2 b1,3 . . . b1,n
b2,1 b2,2 b2,3 . . . b2,n
b3,1 b3,2 b3,3 . . . b3,n
...
...
...
...
...
bk,1 bk,2 bk,3 . . . bk,n
3
7
7
7
7
7
5
B
=
2
6
6
6
6
6
4
c1,1 c1,2 c1,3 . . . c1,n
c2,1 c2,2 c2,3 . . . c2,n
c3,1 c3,2 c3,3 . . . c3,n
...
...
...
...
...
cm,1 cm,2 cm,3 . . . cm,n
3
7
7
7
7
7
5
C
Matrix Multiplication
2
6
6
6
6
6
4
a1,1 a1,2 a1,3 . . . a1,k
a2,1 a2,2 a2,3 . . . a2,k
a3,1 a3,2 a3,3 . . . a3,k
...
...
...
...
...
am,1 am,2 am,3 . . . am,k
3
7
7
7
7
7
5
A
2
6
6
6
6
6
4
b1,1 b1,2 b1,3 . . . b1,n
b2,1 b2,2 b2,3 . . . b2,n
b3,1 b3,2 b3,3 . . . b3,n
...
...
...
...
...
bk,1 bk,2 bk,3 . . . bk,n
3
7
7
7
7
7
5
B
=
2
6
6
6
6
6
4
c1,1 c1,2 c1,3 . . . c1,n
c2,1 c2,2 c2,3 . . . c2,n
c3,1 c3,2 c3,3 . . . c3,n
...
...
...
...
...
cm,1 cm,2 cm,3 . . . cm,n
3
7
7
7
7
7
5
C
Matrix Multiplication
2
6
6
6
6
6
4
a1,1 a1,2 a1,3 . . . a1,k
a2,1 a2,2 a2,3 . . . a2,k
a3,1 a3,2 a3,3 . . . a3,k
...
...
...
...
...
am,1 am,2 am,3 . . . am,k
3
7
7
7
7
7
5
A
2
6
6
6
6
6
4
b1,1 b1,2 b1,3 . . . b1,n
b2,1 b2,2 b2,3 . . . b2,n
b3,1 b3,2 b3,3 . . . b3,n
...
...
...
...
...
bk,1 bk,2 bk,3 . . . bk,n
3
7
7
7
7
7
5
B
=
2
6
6
6
6
6
4
c1,1 c1,2 c1,3 . . . c1,n
c2,1 c2,2 c2,3 . . . c2,n
c3,1 c3,2 c3,3 . . . c3,n
...
...
...
...
...
cm,1 cm,2 cm,3 . . . cm,n
3
7
7
7
7
7
5
C
Matrix Multiplication
2
6
6
6
6
6
4
a1,1 a1,2 a1,3 . . . a1,k
a2,1 a2,2 a2,3 . . . a2,k
a3,1 a3,2 a3,3 . . . a3,k
...
...
...
...
...
am,1 am,2 am,3 . . . am,k
3
7
7
7
7
7
5
A
2
6
6
6
6
6
4
b1,1 b1,2 b1,3 . . . b1,n
b2,1 b2,2 b2,3 . . . b2,n
b3,1 b3,2 b3,3 . . . b3,n
...
...
...
...
...
bk,1 bk,2 bk,3 . . . bk,n
3
7
7
7
7
7
5
B
=
2
6
6
6
6
6
4
c1,1 c1,2 c1,3 . . . c1,n
c2,1 c2,2 c2,3 . . . c2,n
c3,1 c3,2 c3,3 . . . c3,n
...
...
...
...
...
cm,1 cm,2 cm,3 . . . cm,n
3
7
7
7
7
7
5
C
C[0,0]
C[0,1]
C[n,m]
Matrix Operations in ML
Matrix	
Multiplication!
All	black	lines	are	
matrix	multiplications!
H2O4GPU
Practical Machine Learning
Machine	
Learning
H2O4GPU
• Open-Source: https://github.com/h2oai/h2o4gpu
• Collection of important ML algorithms ported to the GPU (with CPU fallback option):
• Gradient Boosted Machines
• GLM
• Truncated SVD
• PCA
• KMeans
• (soon) Field Aware Factorization Machines
• Performance optimized, multi-GPU support (certain algorithms)
• Used within our own Driverless AI Product to boost performance 30X
• Scikit-Learn compatible Python API (and now R API)
H2O4GPU Algorithms
10X
XGBoost
5X
GLM
40X
K-means
5X
SVD
Gradient Boosting Machines
• Based upon XGBoost
• Raw floating point data -> Binned into Quantiles
• Quantiles are stored as compressed instead of floats
• Compressed Quantiles are efficiently transferred to GPU
• Sparsity is handled directly with highly GPU efficiency
• Multi-GPU by sharding rows using NVIDIA NCCL AllReduce
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the power of GPUs.
KMeans
• Significantly faster than Scikit-learn implementation (up to 50x)
• Significantly faster than other GPU implementations (5x-10x)
• Supports kmeans|| initialization
• Supports multiple GPUs by sharding the dataset
• Supports batching data if exceeds GPU memory
12 with kmeans||
Truncated SVD & PCA
• Matrix decomposition
• Popular for text processing
and dimensionality reduction
• GPU optimizes linear algebra
operations
Truncated SVD & PCA
• The intrinsic dimensionality of certain datasets is much lower than the
original (e.g. here 4096 vs. actual ~200)
• PCA can reduce the dimensionality and preserve most of the explained
variance at the same time
• Better input for further modeling - takes less time
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the power of GPUs.
Field Aware Factorization Machines
* under development
• Click Through Rate (CTR):
• One of the most important tasks in computational advertising
• Percentage of users, who actually click on ads
• Until recently solved with logistic regression - bad at finding feature conjunctions
(learns the effect of all variables or features individually)
Clicked Publisher	(P) Advertiser	(A) Gender	(G)
Yes ESPN Nike Male
No NBC Adidas Male
Field Aware Factorization Machines
* under development
• Separates the data into fields (Publisher, Advertiser, Gender) and features (EPSN, NBC,
Adidas, Nike, Male, Female)
• Uses a latent space for each pair to generate the model
• Used to win the first prize of three CTR competitions hosted by Criteo, Avazu, Outbrain,
and also the third prize of RecSys Challenge 2015.
Demo
More info
• Documentation: http://docs.h2o.ai
• Online Training: http://learn.h2o.ai
• Tutorials: https://github.com/h2oai/h2o-tutorials
• Slidedecks: https://github.com/h2oai/h2o-meetups
• Video Presentations: https://www.youtube.com/user/0xdata
• Events & Meetups: http://h2o.ai/events
• Code: http://github.com/h2oai/
• Questions:
• https://stackoverflow.com/questions/tagged/h2o4gpu
• https://gitter.im/h2oai/{h2o-3,h2o4gpu}
Thank you!
@mdymczyk
mateusz@h2o.ai
Q&A

More Related Content

What's hot

Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumbergerinside-BigData.com
 
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)Alexander Ulanov
 
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersHybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersKoichi Shirahata
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdwKohei KaiGai
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
 
Alembic: Distilling C++ into high-performance Grappa
Alembic: Distilling C++ into high-performance GrappaAlembic: Distilling C++ into high-performance Grappa
Alembic: Distilling C++ into high-performance GrappaBrandon G Holt
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...Preferred Networks
 
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakLeveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakDatabricks
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySparkSpark Summit
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAprithan
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」Preferred Networks
 
GoodFit: Multi-Resource Packing of Tasks with Dependencies
GoodFit: Multi-Resource Packing of Tasks with DependenciesGoodFit: Multi-Resource Packing of Tasks with Dependencies
GoodFit: Multi-Resource Packing of Tasks with DependenciesDataWorks Summit/Hadoop Summit
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersJen Aman
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)PhtRaveller
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Intel® Software
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1Kenta Oono
 
Mining Top-k Closed Sequential Patterns in Sequential Databases
Mining Top-k Closed Sequential Patterns in Sequential Databases Mining Top-k Closed Sequential Patterns in Sequential Databases
Mining Top-k Closed Sequential Patterns in Sequential Databases IOSR Journals
 
Web Traffic Time Series Forecasting
Web Traffic  Time Series ForecastingWeb Traffic  Time Series Forecasting
Web Traffic Time Series ForecastingBillTubbs
 
強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷Eiji Sekiya
 

What's hot (20)

Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
 
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
 
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersHybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
 
Alembic: Distilling C++ into high-performance Grappa
Alembic: Distilling C++ into high-performance GrappaAlembic: Distilling C++ into high-performance Grappa
Alembic: Distilling C++ into high-performance Grappa
 
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
PFN Summer Internship 2019 / Kenshin Abe: Extension of Chainer-Chemistry for ...
 
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd MostakLeveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
Leveraging GPU-Accelerated Analytics on top of Apache Spark with Todd Mostak
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySpark
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」
 
GoodFit: Multi-Resource Packing of Tasks with Dependencies
GoodFit: Multi-Resource Packing of Tasks with DependenciesGoodFit: Multi-Resource Packing of Tasks with Dependencies
GoodFit: Multi-Resource Packing of Tasks with Dependencies
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity Clusters
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
 
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
 
London hug
London hugLondon hug
London hug
 
Mining Top-k Closed Sequential Patterns in Sequential Databases
Mining Top-k Closed Sequential Patterns in Sequential Databases Mining Top-k Closed Sequential Patterns in Sequential Databases
Mining Top-k Closed Sequential Patterns in Sequential Databases
 
Web Traffic Time Series Forecasting
Web Traffic  Time Series ForecastingWeb Traffic  Time Series Forecasting
Web Traffic Time Series Forecasting
 
強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷
 

Similar to [db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the power of GPUs.

Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningSri Ambati
 
Moving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsMoving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsHPCC Systems
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model CompressionApache MXNet
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetupGanesan Narayanasamy
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentationtestSri1
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...Edge AI and Vision Alliance
 
(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the CloudAmazon Web Services
 
Machine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMachine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMartin Zapletal
 
S12075-GPU-Accelerated-Video-Encoding.pdf
S12075-GPU-Accelerated-Video-Encoding.pdfS12075-GPU-Accelerated-Video-Encoding.pdf
S12075-GPU-Accelerated-Video-Encoding.pdfgopikahari7
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxssuser413a98
 
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...Kuldeep Jiwani
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsDatabricks
 
Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Florian Lautenschlager
 
MAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsxMAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsxBharathiLakshmiAAssi
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesJonathan Katz
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-isctembreternitz
 

Similar to [db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the power of GPUs. (20)

Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine Learning
 
Moving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsMoving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC Systems
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model Compression
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentation
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
 
GPU Algorithms and trends 2018
GPU Algorithms and trends 2018GPU Algorithms and trends 2018
GPU Algorithms and trends 2018
 
(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud
 
Machine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMachine learning at Scale with Apache Spark
Machine learning at Scale with Apache Spark
 
S12075-GPU-Accelerated-Video-Encoding.pdf
S12075-GPU-Accelerated-Video-Encoding.pdfS12075-GPU-Accelerated-Video-Encoding.pdf
S12075-GPU-Accelerated-Video-Encoding.pdf
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...
OracleCode 2017: Performance Diagnostic Techniques for Big Data Solutions Usi...
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
 
Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017Chronix Poster for the Poster Session FAST 2017
Chronix Poster for the Poster Session FAST 2017
 
MAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsxMAtrix Multiplication Parallel.ppsx
MAtrix Multiplication Parallel.ppsx
 
matrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsxmatrixmultiplicationparallel.ppsx
matrixmultiplicationparallel.ppsx
 
Operating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with KubernetesOperating PostgreSQL at Scale with Kubernetes
Operating PostgreSQL at Scale with Kubernetes
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscte
 

More from Insight Technology, Inc.

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Insight Technology, Inc.
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明するInsight Technology, Inc.
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーンInsight Technology, Inc.
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとInsight Technology, Inc.
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームInsight Technology, Inc.
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門Insight Technology, Inc.
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー Insight Technology, Inc.
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?Insight Technology, Inc.
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Insight Technology, Inc.
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?Insight Technology, Inc.
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...Insight Technology, Inc.
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 Insight Technology, Inc.
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Insight Technology, Inc.
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]Insight Technology, Inc.
 

More from Insight Technology, Inc. (20)

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
Docker and the Oracle Database
Docker and the Oracle DatabaseDocker and the Oracle Database
Docker and the Oracle Database
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごと
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォーム
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門
 
Lunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL ServicesLunch & Learn, AWS NoSQL Services
Lunch & Learn, AWS NoSQL Services
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
 

Recently uploaded

9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?SANGHEE SHIN
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceMartin Humpolec
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 

Recently uploaded (20)

9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?Do we need a new standard for visualizing the invisible?
Do we need a new standard for visualizing the invisible?
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 

[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the power of GPUs.

  • 1. H2O4GPU and GoAI: harnessing the power of GPUs. Mateusz Dymczyk Senior Software Engineer H2O.ai @mdymczyk
  • 2. Agenda • About me • About H2O.ai • A bit of history: H2O-3 • Moving forward: feature engineering & Driverless AI • The need for GPUs • GPU overview • Machine Learning + GPUs = why? how? • About GoAI • About H2O4GPU • Q&A
  • 3. About me • M.Sc. in Computer Science @ AGH UST in Poland • Ph.D. dropout (machine learning) • Previously NLP/ML @ Fujitsu Laboratories, Kanagawa • Currently Lead/Senior Machine Learning Engineer @ H2O.ai (remotely from Tokyo) • Conference speaker (Strata Beijing/NY/Singapore, Hadoop World Tokyo etc.)
  • 4. About H2O.ai FOUNDED 2012, SERIES C IN NOV, 2017 PRODUCTS • DRIVERLESS AI – AUTOMATED MACHINE LEARNING • H2O OPEN SOURCE MACHINE LEARNING • SPARKLING WATER • H2O4GPU OS ML GPU LIBRARY MISSION DEMOCRATIZE AI TEAM • ~100 EMPLOYEES • SEVERAL KAGGLE GRANDMASTERS • DISTRIBUTED SYSTEMS ENGINEERS DOING MACHINE LEARNING • WORLD-CLASS VISUALIZATION DESIGNERS OFFICES MOUNTAIN VIEW, LONDON, PRAGUE
  • 6. Select Customers Financial InsuranceMarketing TelecomHealthcareRetail “Overall customer satisfaction is very high.” - Gartner Advisory & Accounting
  • 7. A bit of history: H2O-3
  • 8. H2O-3 Overview • Distributed implementations of cutting edge ML algorithms. • Core algorithms written in high performance Java. • APIs available in R, Python, Scala, REST/JSON. • Interactive Web GUI called H2O Flow. • Easily deploy models to production with H2O Steam.
  • 9. H2O-3 Distributed Computing • Multi-node cluster with shared memory model. • All computations in memory. • Each node sees only some rows of the data. • No limit on cluster size. • Distributed data frames (collection of vectors). • Columns are distributed (across nodes) arrays. • Works just like R’s data.frame or Python Pandas DataFrame H2O Frame H2O Cluster
  • 10. H2O-3 Algorithms Supervised Learning • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie • Naïve Bayes Statistical Analysis Ensembles • Distributed Random Forest: Classification or regression models • Gradient Boosting Machine: Produces an ensemble of decision trees with increasing refined approximations Deep Neural Networks • Deep learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations Unsupervised Learning • K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detect optimal k Clustering Dimensionality Reduction • Principal Component Analysis: Linearly transforms correlated variables to independent components • Generalized Low Rank Models: extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data Anomaly Detection • Autoencoders: Find outliers using a nonlinear dimensionality reduction using deep learning
  • 11. DriverlessAI & Feature Engineering
  • 12. The Need for Automation “The United States alone faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts” –McKinsey Prediction for 2018
  • 13. Recipe for Success Auto Feature Generation
 Kaggle Grand Master Out of the Box • Automatic Text Handling • Frequency Encoding • Cross Validation Target Encoding • Truncated SVD • Clustering and more Feature Transformations Generated Features Original Features
  • 16. 3 Pillars Speed Accuracy Interpretability
  • 17. The need for GPUs
  • 18. Moore’s Law 1980 1990 2000 2010 2020 102 103 104 105 106 107 40 Years of Microprocessor Trend Data Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for 2010-2015 by K. Rupp Single-threaded perf 1.5X per year 1.1X per year Transistors
 (thousands)
  • 19. GPU 1980 1990 2000 2010 2020 GPU-Computing perf 1.5X per year 1000X by 2025 Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten New plot and data collected for 2010-2015 by K. Rupp 102 103 104 105 106 107 Single-threaded perf 1.5X per year 1.1X per year APPLICATIONS SYSTEMS ALGORITHMS CUDA ARCHITECTURE
  • 20. GoAI
  • 22. GPU Open Analytics Initiative (GOAI) github.com/gpuopenanalytics GPU Data Frame (GDF) Ingest/
 Parse Exploratory Analysis Feature Engineering ML/DL Algorithms Grid Search Scoring Model
 Export
  • 25. GPU architecture Low latency vs High throughput GPU • Optimized for data-parallel, throughput computation • Architecture tolerant of memory latency • More transistors dedicated to computation CPU • Optimized for low-latency access to cached data sets • Control logic for out-of-order and speculative execution
  • 26. GPU Enhanced Applications Application Code GPU Use GPU to Parallelize Compute-Intensive Functions CPU Rest of Sequential CPU Code
  • 28. Machine Learning and GPUs 2 4 A 3 5 m ⇥ k 2 4 B 3 5 k ⇥ n = 2 4 C 3 5 m ⇥ n
  • 29. Matrix Multiplication 2 6 6 6 6 6 4 a1,1 a1,2 a1,3 . . . a1,k a2,1 a2,2 a2,3 . . . a2,k a3,1 a3,2 a3,3 . . . a3,k ... ... ... ... ... am,1 am,2 am,3 . . . am,k 3 7 7 7 7 7 5 A 2 6 6 6 6 6 4 b1,1 b1,2 b1,3 . . . b1,n b2,1 b2,2 b2,3 . . . b2,n b3,1 b3,2 b3,3 . . . b3,n ... ... ... ... ... bk,1 bk,2 bk,3 . . . bk,n 3 7 7 7 7 7 5 B = 2 6 6 6 6 6 4 c1,1 c1,2 c1,3 . . . c1,n c2,1 c2,2 c2,3 . . . c2,n c3,1 c3,2 c3,3 . . . c3,n ... ... ... ... ... cm,1 cm,2 cm,3 . . . cm,n 3 7 7 7 7 7 5 C
  • 30. Matrix Multiplication 2 6 6 6 6 6 4 a1,1 a1,2 a1,3 . . . a1,k a2,1 a2,2 a2,3 . . . a2,k a3,1 a3,2 a3,3 . . . a3,k ... ... ... ... ... am,1 am,2 am,3 . . . am,k 3 7 7 7 7 7 5 A 2 6 6 6 6 6 4 b1,1 b1,2 b1,3 . . . b1,n b2,1 b2,2 b2,3 . . . b2,n b3,1 b3,2 b3,3 . . . b3,n ... ... ... ... ... bk,1 bk,2 bk,3 . . . bk,n 3 7 7 7 7 7 5 B = 2 6 6 6 6 6 4 c1,1 c1,2 c1,3 . . . c1,n c2,1 c2,2 c2,3 . . . c2,n c3,1 c3,2 c3,3 . . . c3,n ... ... ... ... ... cm,1 cm,2 cm,3 . . . cm,n 3 7 7 7 7 7 5 C
  • 31. Matrix Multiplication 2 6 6 6 6 6 4 a1,1 a1,2 a1,3 . . . a1,k a2,1 a2,2 a2,3 . . . a2,k a3,1 a3,2 a3,3 . . . a3,k ... ... ... ... ... am,1 am,2 am,3 . . . am,k 3 7 7 7 7 7 5 A 2 6 6 6 6 6 4 b1,1 b1,2 b1,3 . . . b1,n b2,1 b2,2 b2,3 . . . b2,n b3,1 b3,2 b3,3 . . . b3,n ... ... ... ... ... bk,1 bk,2 bk,3 . . . bk,n 3 7 7 7 7 7 5 B = 2 6 6 6 6 6 4 c1,1 c1,2 c1,3 . . . c1,n c2,1 c2,2 c2,3 . . . c2,n c3,1 c3,2 c3,3 . . . c3,n ... ... ... ... ... cm,1 cm,2 cm,3 . . . cm,n 3 7 7 7 7 7 5 C
  • 32. Matrix Multiplication 2 6 6 6 6 6 4 a1,1 a1,2 a1,3 . . . a1,k a2,1 a2,2 a2,3 . . . a2,k a3,1 a3,2 a3,3 . . . a3,k ... ... ... ... ... am,1 am,2 am,3 . . . am,k 3 7 7 7 7 7 5 A 2 6 6 6 6 6 4 b1,1 b1,2 b1,3 . . . b1,n b2,1 b2,2 b2,3 . . . b2,n b3,1 b3,2 b3,3 . . . b3,n ... ... ... ... ... bk,1 bk,2 bk,3 . . . bk,n 3 7 7 7 7 7 5 B = 2 6 6 6 6 6 4 c1,1 c1,2 c1,3 . . . c1,n c2,1 c2,2 c2,3 . . . c2,n c3,1 c3,2 c3,3 . . . c3,n ... ... ... ... ... cm,1 cm,2 cm,3 . . . cm,n 3 7 7 7 7 7 5 C
  • 33. Matrix Multiplication 2 6 6 6 6 6 4 a1,1 a1,2 a1,3 . . . a1,k a2,1 a2,2 a2,3 . . . a2,k a3,1 a3,2 a3,3 . . . a3,k ... ... ... ... ... am,1 am,2 am,3 . . . am,k 3 7 7 7 7 7 5 A 2 6 6 6 6 6 4 b1,1 b1,2 b1,3 . . . b1,n b2,1 b2,2 b2,3 . . . b2,n b3,1 b3,2 b3,3 . . . b3,n ... ... ... ... ... bk,1 bk,2 bk,3 . . . bk,n 3 7 7 7 7 7 5 B = 2 6 6 6 6 6 4 c1,1 c1,2 c1,3 . . . c1,n c2,1 c2,2 c2,3 . . . c2,n c3,1 c3,2 c3,3 . . . c3,n ... ... ... ... ... cm,1 cm,2 cm,3 . . . cm,n 3 7 7 7 7 7 5 C C[0,0] C[0,1] C[n,m]
  • 34. Matrix Operations in ML Matrix Multiplication! All black lines are matrix multiplications!
  • 37. H2O4GPU • Open-Source: https://github.com/h2oai/h2o4gpu • Collection of important ML algorithms ported to the GPU (with CPU fallback option): • Gradient Boosted Machines • GLM • Truncated SVD • PCA • KMeans • (soon) Field Aware Factorization Machines • Performance optimized, multi-GPU support (certain algorithms) • Used within our own Driverless AI Product to boost performance 30X • Scikit-Learn compatible Python API (and now R API)
  • 39. Gradient Boosting Machines • Based upon XGBoost • Raw floating point data -> Binned into Quantiles • Quantiles are stored as compressed instead of floats • Compressed Quantiles are efficiently transferred to GPU • Sparsity is handled directly with highly GPU efficiency • Multi-GPU by sharding rows using NVIDIA NCCL AllReduce
  • 41. KMeans • Significantly faster than Scikit-learn implementation (up to 50x) • Significantly faster than other GPU implementations (5x-10x) • Supports kmeans|| initialization • Supports multiple GPUs by sharding the dataset • Supports batching data if exceeds GPU memory
  • 43. Truncated SVD & PCA • Matrix decomposition • Popular for text processing and dimensionality reduction • GPU optimizes linear algebra operations
  • 44. Truncated SVD & PCA • The intrinsic dimensionality of certain datasets is much lower than the original (e.g. here 4096 vs. actual ~200) • PCA can reduce the dimensionality and preserve most of the explained variance at the same time • Better input for further modeling - takes less time
  • 46. Field Aware Factorization Machines * under development • Click Through Rate (CTR): • One of the most important tasks in computational advertising • Percentage of users, who actually click on ads • Until recently solved with logistic regression - bad at finding feature conjunctions (learns the effect of all variables or features individually) Clicked Publisher (P) Advertiser (A) Gender (G) Yes ESPN Nike Male No NBC Adidas Male
  • 47. Field Aware Factorization Machines * under development • Separates the data into fields (Publisher, Advertiser, Gender) and features (EPSN, NBC, Adidas, Nike, Male, Female) • Uses a latent space for each pair to generate the model • Used to win the first prize of three CTR competitions hosted by Criteo, Avazu, Outbrain, and also the third prize of RecSys Challenge 2015.
  • 48. Demo
  • 49. More info • Documentation: http://docs.h2o.ai • Online Training: http://learn.h2o.ai • Tutorials: https://github.com/h2oai/h2o-tutorials • Slidedecks: https://github.com/h2oai/h2o-meetups • Video Presentations: https://www.youtube.com/user/0xdata • Events & Meetups: http://h2o.ai/events • Code: http://github.com/h2oai/ • Questions: • https://stackoverflow.com/questions/tagged/h2o4gpu • https://gitter.im/h2oai/{h2o-3,h2o4gpu}
  • 51. Q&A