Scientific
Computing on
JRuby
github.com/prasunanand
Objective
● A scientific library is memory intensive, and speed counts. How can JRuby
be used effectively to create a great tool/gem?
● A General Purpose GPU library for Ruby that can be used by industry in
production and academia for research.
● Ruby Science Foundation
● SciRuby has been trying to push Ruby for scientific computing.
● Popular Rubygems:
1. NMatrix
2. Daru
3. Mixed_models
NMatrix
NMatrix is SciRuby’s numerical matrix core, implementing dense matrices as
well as two types of sparse (linked-list-based and Yale/CSR).
It currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several
of its linear algebra operations.
Daru
Mixed_models
Nyaplot
Why nya?
Contributors wanted
● IRC #sciruby
● Slack-channel #sciruby
● Google-group #sciruby
JRuby is known for performance: it is 10 times faster than CRuby.
With Truffle, it is around 40 times faster than CRuby.
Say hello
NMatrix for JRuby
● Not a unified interface for SciRuby gems: MDArray.
● MDArray is a great gem for Linear Algebra.
● However, every gem that used NMatrix as dependency needed to be
reimplemented with MDArray.
● Hence, putting in effort for optimization.
NMatrix for JRuby
● Parallelism => no Global Interpreter Lock, unlike MRI
● Easy deployment (Warbler gem)
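
The parallelism point can be illustrated with plain Ruby threads. The sketch below is an assumption about how one might split elementwise work, not code from NMatrix-JRuby; `parallel_map` is a hypothetical helper. On JRuby the threads can run on separate cores, while on MRI the same code runs correctly but the interpreter lock serialises the Ruby-level work.

```ruby
# Hypothetical helper: apply a block to every element of a flat Array,
# processing each chunk in its own thread. Order of results is preserved.
def parallel_map(storage, threads: 4, &block)
  return [] if storage.empty?

  chunk_size = (storage.size / threads.to_f).ceil
  workers = storage.each_slice(chunk_size).map do |chunk|
    Thread.new { chunk.map { |x| block.call(x) } }
  end
  # Thread#value joins the thread and returns the chunk's result.
  workers.flat_map(&:value)
end

doubled = parallel_map((1..10).to_a, threads: 3) { |x| x * 2 }
```

Each thread only reads its own chunk and builds its own result Array, so no locking is needed in this sketch.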
How NMatrix works
● N-Dimensional
● 2-Dimensional NMatrix
N-dimensional NMatrix
N-dimensional matrices are stored as a one-dimensional Array.
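
As a minimal sketch of that storage scheme, assuming row-major order, an N-dimensional coordinate can be mapped to a position in the flat Array as follows. `flat_index` is a hypothetical helper for illustration and is not part of the NMatrix API.

```ruby
# Hypothetical helper: row-major offset of an N-dimensional coordinate
# inside a one-dimensional Array.
def flat_index(shape, coords)
  raise ArgumentError, "rank mismatch" unless shape.size == coords.size

  coords.each_with_index.reduce(0) do |offset, (c, axis)|
    unless (0...shape[axis]).cover?(c)
      raise IndexError, "index #{c} out of bounds for axis #{axis}"
    end

    # Horner-style accumulation: shift by this axis length, then add the index.
    offset * shape[axis] + c
  end
end

shape   = [2, 3, 4]                     # a 2 x 3 x 4 matrix
storage = (0...shape.reduce(:*)).to_a   # 24 elements in one flat Array
value   = storage[flat_index(shape, [1, 2, 3])]
```

Here the coordinate [1, 2, 3] maps to offset 1*12 + 2*4 + 3 = 23, the last element of the storage.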
Elementwise Operation
● Iterate through the elements
● Access the array; do the operation, return it
● [:add, :subtract, :sin, :gamma]
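
The steps above can be sketched in plain Ruby over flat storage. The helper names `elementwise_unary` and `elementwise_binary` are assumptions made for this illustration; the real NMatrix-JRuby code hands the array to Java instead of iterating in Ruby.

```ruby
# Hypothetical helpers, not the NMatrix API: iterate through the elements,
# apply the operation, and return a new flat Array.
def elementwise_unary(storage)
  storage.map { |x| yield x }
end

def elementwise_binary(left, right)
  raise ArgumentError, "shape mismatch" unless left.size == right.size

  left.each_with_index.map { |x, i| yield x, right[i] }
end

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.5, 0.5, 0.5]

sum    = elementwise_binary(a, b) { |x, y| x + y }   # :add
diff   = elementwise_binary(a, b) { |x, y| x - y }   # :subtract
sines  = elementwise_unary(a) { |x| Math.sin(x) }    # :sin
gammas = elementwise_unary(a) { |x| Math.gamma(x) }  # :gamma
```

Because the shape is irrelevant to an elementwise operation, the same loop serves matrices of any dimension.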
Determinants and Factorization
● Two dimensional matrix operations
● In NMatrix-MRI, BLAS-III and LAPACK routines are implemented using
their respective libraries
● NMatrix-JRuby depends on Java functions.
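
For readers unfamiliar with what such a routine computes, here is a small pure-Ruby sketch of a determinant via Gaussian elimination with partial pivoting. It is an illustration only, under the assumption of a dense square matrix held as nested Arrays. It is not the code used by NMatrix-MRI (which calls LAPACK) or by NMatrix-JRuby (which calls Java functions), and `determinant` is a hypothetical name.

```ruby
# Hypothetical helper: determinant of a square matrix (Array of row Arrays)
# using Gaussian elimination with partial pivoting.
def determinant(matrix)
  n = matrix.size
  raise ArgumentError, "matrix must be square" unless matrix.all? { |row| row.size == n }

  a = matrix.map { |row| row.map(&:to_f) } # work on a Float copy
  det = 1.0

  n.times do |k|
    # Pick the row with the largest absolute value in column k.
    pivot = (k...n).max_by { |i| a[i][k].abs }
    return 0.0 if a[pivot][k].zero?

    if pivot != k
      a[k], a[pivot] = a[pivot], a[k]
      det = -det # a row swap flips the sign
    end

    det *= a[k][k]
    ((k + 1)...n).each do |i|
      factor = a[i][k] / a[k][k]
      (k...n).each { |j| a[i][j] -= factor * a[k][j] }
    end
  end

  det
end

d2 = determinant([[4, 3], [6, 3]])
d3 = determinant([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
```

The triple loop is the O(n^3) work that optimized BLAS and LAPACK libraries perform in native code.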
Mixed models
● After NMatrix for doubles was ready, I tested it with mixed_models.
Challenges
● Autoboxing and multiple data types
● Minimise copying of data
● Handling large arrays
Autoboxing
● :float64 => double only
● Strict dtypes => creating data type in Java: not guessing
● Errors => that can’t be reproduced :P
[0.11, 0.05, 0.34, 0.14] + [0.21, 0.05, 0.14, 0.14] = [0, 0, 0, 0]
([0.11, 0.05, 0.34, 0.14] + 5) + ([0.21, 0.05, 0.14, 0.14] + 5) - 10 =
[0.32, 0.1, 0.48, 0.28]
Minimise copying of data
● Make sure you make copies of data
Handling large arrays
● Array Size
● Accessing elements
● Chaining to java method
● Speed and Memory Required
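Ruby Code
require 'benchmark'

# Copy b into c element by element. b and c are the 15_000 x 15_000
# matrices; the slide does not show how they are created for this loop.
index = 0
puts Benchmark.measure {
  (0...15_000).each do |i|
    (0...15_000).each do |j|
      c[i][j] = b[i][j]
      index += 1
    end
  end
}
# 67.790000 0.070000 67.860000 ( 65.126546)
# RAM consumed => 5.4GB

# Fill a primitive Java double[][] from Ruby (JRuby only).
require 'java'
b = Java::double[15_000, 15_000].new
c = Java::double[15_000, 15_000].new
index = 0
puts Benchmark.measure {
  (0...15_000).each do |i|
    (0...15_000).each do |j|
      b[i][j] = index
      index += 1
    end
  end
}
# 43.260000 3.250000 46.510000 ( 39.606356)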
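Java Code
public class MatrixGenerator {
  // Assumed declarations: the slide does not show where row, col, b and c
  // are defined.
  static int row = 15000, col = 15000;
  static double[][] b = new double[row][col];
  static double[][] c = new double[row][col];

  // Copy b into c one element at a time.
  public static void test2() {
    for (int index = 0, i = 0; i < row; i++) {
      for (int j = 0; j < col; j++) {
        c[i][j] = b[i][j];
        index++;
      }
    }
  }
}
# Called from JRuby:
puts Benchmark.measure { MatrixGenerator.test2 }
# 0.034000 0.001000 00.034000 ( 00.03300)
# RAM consumed => 300MB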
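public class MatrixGenerator {
  // Allocate the matrices in Java and fill b with a running index.
  public static void test1() {
    int row = 15000, col = 15000; // assumed: not declared on the slide
    double[][] b = new double[row][col];
    double[][] c = new double[row][col];
    for (int index = 0, i = 0; i < row; i++) {
      for (int j = 0; j < col; j++) {
        b[i][j] = index;
        index++;
      }
    }
  }
}
# Called from JRuby:
puts Benchmark.measure { MatrixGenerator.test1 }
# 0.032000 0.001000 00.032000 ( 00.03100)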
Results
Improves:
● Speed by 1000 times
● Memory use by 10 times
Benchmarking NMatrix functionalities
System Specifications
● CPU: AMD FX8350 Octa-core 4.2GHz
● RAM: 16GB
Addition
Subtraction
Gamma
Matrix Multiplication
Determinant
Factorization
Benchmark conclusion
● NMatrix-JRuby is much faster for N-dimensional matrices in
elementwise operations.
● NMatrix-MRI is faster for 2-dimensional matrices in matrix
multiplication, determinant calculation and factorization.
Improvements
● Make NMatrix-JRuby faster than NMatrix-MRI using BLAS level-3 and
LAPACK routines.
● How?
● Why not JBlas?
Future Work
● Add support for complex dtype.
● Convert NMatrix-JRuby Enumerators to Java code.
● Add sparse support.
Am I done?
Nope!
Enter GPU
A General-Purpose GPU library
● Combine the beauty of Ruby with transparent GPU processing
● This will work both on client computers and on servers that make use of
Tesla GPUs and Intel Xeon Phi solutions.
● Developer activity and support for the current projects is mixed at best,
and they are tough to use because they involve writing kernels and require
a lot of effort on buffer/RAM optimisation.
ArrayFire-rb
● Wraps ArrayFire library
Using ArrayFire
MRI
● C extension
● Architecture is inspired by NMatrix and NArray
● The C++ function is placed in a namespace (e.g., namespace af { }) or is
declared static if possible. The C function receives the prefix af_, e.g.,
af_multiply() (this function also happens to be static).
● C macros are capitalized and generally have the prefix AF_, as with
AF_DTYPE().
● C functions (and macros, for consistency) are placed within extern "C" { }
blocks to turn off C++ mangling.
JRuby
● The approach is the same as for NMatrix-JRuby.
● Java Native Interface (JNI)
● Work on ArrayFire-Java
Benchmarking ArrayFire
System Specification
CPU: AMD FX Octacore 4.2GHz
RAM: 16GB
GPU: Nvidia GTX 750Ti
GPU RAM: 4GB GDDR5
Matrix Addition
Matrix Multiplication
Matrix Determinant
Factorization
Transparency
● Integrate with NArray
● Integrate with NMatrix
● Integrate with Rails
Applications
● Endless possibilities ;)
● Bioinformatics
● Integrate TensorFlow
● Image Processing
● Computational Fluid Dynamics
Conclusion
Useful Links
● https://github.com/sciruby/nmatrix
● https://github.com/arrayfire/arrayfire-rb
● https://github.com/prasunanand/arrayfire-rb/tree/temp
Acknowledgements
1. Pjotr Prins
2. Charles Nutter
3. John Woods
4. Alexej Gossmann
5. Sameer Deshmukh
6. Pradeep Garigipati
Thank You
Github: prasunanand
Twitter: @prasun_anand
Blog: prasunanand.com
