enter.the.matrix
core.matrix
Array programming
as a language extension
for Clojure
(with a Numerical computing focus)
Plug-in paradigms
Paradigm                  Exemplar language   Clojure implementation
Functional programming    Haskell             clojure.core
Meta-programming          Lisp
Logic programming         Prolog              core.logic
Process algebras / CSP    Go                  core.async
Array programming         APL                 core.matrix
APL
Venerable history
• Notation invented in 1957 by Ken Iverson
• Implemented at IBM around 1960-64

Has its own keyboard

Interesting perspective on code readability

life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
Modern array programming
Standalone environment for statistical programming / graphics
Python library for array programming
A new language (2012) based on array programming principles
... and many others
Why Clojure for array programming?
1. Data Science
2. Platform
3. Philosophy
Elements of core.matrix
Abstraction: N-dimensional arrays – what and why?
API: What can you do with arrays?
Implementation: How is everything implemented?
Abstraction

or: “What is the matrix?”
Design wisdom
abstraction

"It is better to have 100 functions
operate on one data structure than 10
functions on 10 data structures."
—Alan Perlis
What is an array?
Dimensions   Example                               Terminology
1            [0 1 2]                               Vector
2            [[0 1 2] [3 4 5] [6 7 8]]             Matrix
3            [figure: stack of 3 x 3 matrices]     3D Array (3rd order Tensor)
...          ...                                   ...
N            ...                                   ND Array
Multi-dimensional array properties
Dimensions (ordered and indexed)

                 Dimension 1
                 0   1   2
Dimension 0  0   0   1   2
             1   3   4   5
             2   6   7   8

Dimension sizes together define the shape of the array (e.g. 3 x 3)
Each of the array elements is a regular value
Arrays = data about relationships
               Set Y
               :R  :S  :T  :U
Set X    :A     0   1   2   3
         :B     4   5   6   7
         :C     8   9  10  11

Each element is a fact about a relationship between a value in Set X and a value in Set Y

(foo :A :T) => 2

ND array lookup is analogous to arity-N functions!
Why arrays instead of functions?
[[0 1 2]
 [3 4 5]
 [6 7 8]]

vs.

(fn [i j]
  (+ j (* 3 i)))

1. Precomputed values with O(1) access
2. Efficient computation with optimised bulk operations
3. Data driven representation
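A minimal plain-Clojure sketch of the first and third points, without core.matrix: the function from the slide is evaluated once per index pair and the results are stored in a nested vector. The names `f` and `table` are illustrative only. Note that lookup in a Clojure persistent vector is near-constant-time rather than strictly O(1).

```clojure
;; The index function from the slide: element (i, j) of a 3 x 3 array.
(defn f [i j]
  (+ j (* 3 i)))

;; Precompute every value once into a nested vector.
(def table
  (mapv (fn [i]
          (mapv (fn [j] (f i j)) (range 3)))
        (range 3)))

;; Lookup is now plain indexed access; the data can also be printed,
;; stored or transformed, which a function cannot.
(get-in table [1 2])   ; same value as (f 1 2)
```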
Expressivity
Java

for (int i=0; i<n; i++) {
  for (int j=0; j<m; j++) {
    for (int k=0; k<p; k++) {
      result[i][j][k] = a[i][j][k] + b[i][j][k];
    }
  }
}

(mapv
  (fn [a b]
    (mapv
      (fn [a b]
        (mapv + a b))
      a b))
  a b)

(+ a b)

+ core.matrix
Principle of array programming:
generalise operations on regular (scalar) values
to multi-dimensional data

(+ 1 2) => 3
(+

) => 2
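The principle can be sketched in plain Clojure, assuming arrays are nested vectors of identical shape. `elementwise` is a hypothetical helper written for this illustration; it is not how core.matrix implements its operators, and it does no shape checking.

```clojure
;; Apply a scalar function f to two arguments that are either both
;; scalars or both nested vectors of the same shape.
(defn elementwise [f a b]
  (if (vector? a)
    (mapv (fn [x y] (elementwise f x y)) a b)  ; recurse one level down
    (f a b)))                                  ; scalars: plain call

(elementwise + 1 2)
(elementwise + [[0 1] [2 3]] [[10 10] [10 10]])
```

The same function body handles scalars, vectors, matrices and deeper nesting, which is the generalisation the slide describes.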
API
Equivalence to Clojure vectors
[0 1 2]  ↔  [figure: 1D array holding 0 1 2]

[[0 1 2]
 [3 4 5]
 [6 7 8]]  ↔  [figure: 3 x 3 array holding 0 to 8]

Nested Clojure vectors of regular shape are arrays!
Array creation
;; Build an array from a sequence
(array (range 5))
=> [0 1 2 3 4]

;; ... or from nested arrays/sequences
(array
  (for [i (range 3)]
    (for [j (range 3)]
      (str i j))))
=> [["00" "01" "02"]
    ["10" "11" "12"]
    ["20" "21" "22"]]
Shape
;; Shape of a 3 x 2 matrix
(shape [[1 2]
        [3 4]
        [5 6]])
=> [3 2]

;; Regular values have no shape
(shape 10.0)
=> nil
Dimensionality
;; Dimensionality = number of dimensions
;;                = length of shape vector
;;                = nesting level
(dimensionality [[1 2]
                 [3 4]
                 [5 6]])
=> 2

(dimensionality [1 2 3 4 5])
=> 1

;; Regular values have zero dimensionality
(dimensionality "Foo")
=> 0
Scalars vs. arrays
(array? [[1 2] [3 4]])
=> true
(array? 12.3)
=> false
(scalar? [1 2 3])
=> false
(scalar? "foo")
=> true
Everything is either an array or a scalar
A scalar works like a 0-dimensional array
Indexed element access
                 Dimension 1
                 0   1   2
Dimension 0  0   0   1   2
             1   3   4   5
             2   6   7   8

(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])

(mget M 1 2)
=> 5
Slicing access
                 Dimension 1
                 0   1   2
Dimension 0  0   0   1   2
             1   3   4   5
             2   6   7   8

(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])

(slice M 1)
=> [3 4 5]

A slice of an array is itself an array!
Arrays as a composition of slices
(def M [[0 1 2]
        [3 4 5]
        [6 7 8]])

(slices M)
=> ([0 1 2] [3 4 5] [6 7 8])

(apply + (slices M))
=> [9 12 15]
Operators
(use 'clojure.core.matrix.operators)

(+ [1 2 3] [4 5 6])
=> [5 7 9]
(* [1 2 3] [0 2 -1])
=> [0 4 -3]

(- [1 2] [3 4 5 6])
=> RuntimeException Incompatible shapes
(/ [1 2 3] 10.0)
=> [0.1 0.2 0.3]
Broadcasting scalars

(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   1) = ?

“Broadcasting” the scalar 1 to the shape of the matrix:

(+ [[0 1 2]     [[1 1 1]
    [3 4 5]      [1 1 1]
    [6 7 8]]     [1 1 1]])

= [[1 2 3]
   [4 5 6]
   [7 8 9]]
Broadcasting arrays

(+ [[0 1 2]
    [3 4 5]
    [6 7 8]]
   [2 1 0]) = ?

“Broadcasting” the vector [2 1 0] to the shape of the matrix:

(+ [[0 1 2]     [[2 1 0]
    [3 4 5]      [2 1 0]
    [6 7 8]]     [2 1 0]])

= [[2 2 2]
   [5 5 5]
   [8 8 8]]
Functional operations on sequences
map
(map inc [1 2 3 4])
=> (2 3 4 5)

reduce
(reduce * [1 2 3 4])
=> 24

seq
(seq [1 2 3 4])
=> (1 2 3 4)
Functional operations on arrays
map ↔ emap “element map”
(emap inc [[1 2]
           [3 4]])
=> [[2 3]
    [4 5]]

reduce ↔ ereduce “element reduce”
(ereduce * [[1 2]
            [3 4]])
=> 24

seq ↔ eseq “element seq”
(eseq [[1 2]
       [3 4]])
=> (1 2 3 4)
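To make the correspondence concrete, here is a plain-Clojure sketch of the three element operations for nested vectors. The starred names are hypothetical and chosen to avoid any suggestion that this is the core.matrix implementation; it only mirrors the behaviour shown in the examples above.

```clojure
;; All elements of a nested vector, in row-major order.
(defn eseq* [a]
  (if (vector? a)
    (mapcat eseq* a)
    [a]))

;; Apply f to every element, keeping the nested shape.
(defn emap* [f a]
  (if (vector? a)
    (mapv (fn [x] (emap* f x)) a)
    (f a)))

;; Reduce over all elements, ignoring the nesting.
(defn ereduce* [f a]
  (reduce f (eseq* a)))

(emap* inc [[1 2] [3 4]])
(ereduce* * [[1 2] [3 4]])
(eseq* [[1 2] [3 4]])
```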
Specialised matrix constructors
(zero-matrix 4 3)
[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]

(identity-matrix 4)
[[1 0 0 0]
 [0 1 0 0]
 [0 0 1 0]
 [0 0 0 1]]

(permutation-matrix [3 1 0 2])
[figure: 4 x 4 matrix of 0s with a single 1 in each row and column]
Array transformations

(transpose [figure: an array holding 0 to 5]) => [figure: the transposed array]

Transpose reverses the order of all dimensions and indexes
Matrix multiplication

(mmul [[9 2 7] [6 4 8]]
[[2 8] [3 4] [5 9]])
=> [[59 143] [64 136]]
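For readers who want to check the arithmetic, a plain-Clojure sketch of the same product for nested vectors follows. `mat-mul` is a hypothetical helper for illustration, not the core.matrix `mmul`, and it assumes the inner dimensions already match.

```clojure
;; Each result element is the dot product of a row of a and a column of b.
(defn mat-mul [a b]
  (let [cols (apply mapv vector b)]          ; columns of b (its transpose)
    (mapv (fn [row]
            (mapv (fn [col] (reduce + (map * row col)))
                  cols))
          a)))

(mat-mul [[9 2 7] [6 4 8]]
         [[2 8] [3 4] [5 9]])
;; => [[59 143] [64 136]]
```

For example, the top-left element is 9*2 + 2*3 + 7*5 = 59.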
Geometry
(def π 3.141592653589793)
(def τ (* 2.0 π))

(defn rot [turns]
  (let [a (* τ turns)]
    [[   (cos a)  (sin a)]
     [(- (sin a)) (cos a)]]))

(mmul (rot 1/8) [3 4])
=> [4.9497 0.7071]

NB: See Tau Manifesto (http://tauday.com/) regarding the use of Tau (τ)

45° = 1/8 turn
Demo
Mutability?
Mutability – the tradeoffs
Pros
✓ Faster
✓ Reduces GC pressure
✓ Standard in many existing matrix libraries

Cons
✘ Mutability is evil
✘ Harder to maintain / debug
✘ Hard to write concurrent code
✘ Not idiomatic in Clojure
✘ Not supported by all core.matrix implementations
✘ “Place Oriented Programming”

Avoid mutability. But it’s an option if you really need it.
Mutability – performance benefit
Time for addition of vectors* (ns)

Immutable add    120
Mutable add!      28

4x performance benefit

* Length 10 double vectors, using :vectorz implementation
Mutability – syntax
(add [1 2] 1)
=> [2 3]

(add! [1 2] 1)
=> RuntimeException ...... not mutable!

;; coerce to a mutable format
(def a (mutable [1 2]))
=> #<Vector2 [1.0,2.0]>

(add! a 1)
=> #<Vector2 [2.0,3.0]>

A core.matrix function name ending with “!” performs mutation (usually on the first argument only)
Implementation
Many Matrix libraries…

MTJ

UJMP
javax.vecmath

ojAlgo
Lots of trade-offs
Native Libraries  vs.  Pure JVM
Mutability  vs.  Immutability
Specialized elements (e.g. doubles)  vs.  Generalised elements (Object, Complex)
Multi-dimensional  vs.  2D matrices only
Memory efficiency  vs.  Runtime efficiency
Concrete types  vs.  Abstraction (interfaces / wrappers)
Specified storage format  vs.  Multiple / arbitrary storage formats
License A  vs.  License B
Lightweight (zero-copy) views  vs.  Heavyweight copying / cloning
What’s the best data structure?
Length 50 “range” vector: 0 1 2 3 .. 49

1. Clojure Vector
[0 1 2 …. 49]

2. Java double[] array
new double[] {0, 1, 2, …. 49};

3. Custom deftype
(deftype RangeVector
  [^long start
   ^long end])

4. Native vector format
(org.jblas.DoubleMatrix. params)
There is no spoon
Secret weapon time!
Clojure Protocols
clojure.core.matrix.protocols

(defprotocol PSummable
"Protocol to support the summing of all elements in
an array. The array must hold numeric values only,
or an exception will be thrown."
(element-sum [m]))

1. Abstract Interface
2. Open Extension
3. Fast dispatch
Protocols are fast and open
Function call costs (ns)

Call type                  Cost (ns)   Open extension
Static / inlined code         1.2      ✘
Primitive function call       1.9      ✘
Boxed function call           7.9      ✘
Protocol call                13.8      ✓
Multimethod*                 89        ✓

* Using class of first argument as dispatch function
Typical core.matrix call path
User code:
(esum [1 2 3 4])

core.matrix API (matrix.clj):
(defn esum
  "Calculates the sum of all the elements in a
   numerical array."
  [m]
  (mp/element-sum m))

Impl. code:
(extend-protocol mp/PSummable
  SomeImplementationClass
  (element-sum [a]
    ………))
Most protocols are optional
PImplementation
PDimensionInfo
PIndexedAccess
PIndexedSetting
PMatrixEquality
PSummable
PRowOperations
PVectorCross
PCoercion
PTranspose
PVectorDistance
PMatrixMultiply
PAddProductMutable
PReshaping
PMathsFunctionsMutable
PMatrixRank
PArrayMetrics
PAddProduct
PVectorOps
PMatrixScaling
PMatrixOps
PMatrixPredicates
PSparseArray
…..

MANDATORY
• Required for a working core.matrix implementation

OPTIONAL
• Everything in the API will work without these
• core.matrix provides a “default implementation”
• Implement for improved performance
Default implementations
Protocol name - from namespace clojure.core.matrix.protocols
clojure.core.matrix.impl.default

(extend-protocol mp/PSummable
  ;; Implementation for any Number
  Number
    (element-sum [a] a)

  ;; Implementation for an arbitrary Object (assumed to be an array)
  Object
    (element-sum [a]
      (mp/element-reduce a +)))
Extending a protocol

(extend-protocol mp/PSummable
  ;; Class to implement protocol for, in this case a Java array: double[]
  (Class/forName "[D")
    (element-sum [m]
      ;; Add type hint to avoid reflection
      (let [^doubles m m]
        ;; Optimised code to add up all the elements of a double[] array
        (areduce m i res 0.0 (+ res (aget m i))))))
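The open-extension idea can be tried without core.matrix on the classpath. The sketch below defines a local protocol, `Summable`, as a hypothetical stand-in for `mp/PSummable`, gives it general implementations, and then adds a specialised `double[]` implementation in a separate form, as a third-party library could.

```clojure
;; Local stand-in for mp/PSummable so the sketch is self-contained.
(defprotocol Summable
  (element-sum [m]))

;; General implementations.
(extend-protocol Summable
  Number
  (element-sum [n] n)                         ; a number is its own sum
  clojure.lang.IPersistentVector
  (element-sum [v]                            ; nested vectors: sum of sums
    (reduce + (map element-sum v))))

;; Specialised implementation for double[], added later without
;; modifying the protocol or the code above.
(extend-protocol Summable
  (Class/forName "[D")
  (element-sum [m]
    (let [^doubles m m]                       ; type hint avoids reflection
      (areduce m i res 0.0 (+ res (aget m i))))))

(element-sum [[1 2] [3 4]])
(element-sum (double-array [1 2 3]))
```

Callers use `element-sum` the same way in both cases; dispatch on the class of the first argument picks the implementation.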
Speedup vs. default implementation
Timing for element sum of length 100 double array (ns)

(esum v) "Default"         3690
(reduce + v)               2859
(esum v) "Specialised"      201

15-20x benefit
Internal Implementations
Implementation        Key Features

:persistent-vector    • Support for Clojure vectors
                      • Immutable
                      • Not so fast, but great for quick testing

:double-array         • Treats Java double[] objects as 1D arrays
                      • Mutable – useful for accumulating results etc.

:sequence             • Treats Clojure sequences as arrays
                      • Mostly useful for interop / data loading

:ndarray              • Google Summer of Code project by Dmitry Groshev
:ndarray-double       • Pure Clojure
:ndarray-long         • N-Dimensional arrays similar to NumPy
.....                 • Support arbitrary dimensions and data types

:scalar-wrapper       • Internal wrapper formats
:slice-wrapper        • Used to provide efficient default implementations for
:nd-wrapper             various protocols
NDArray
(deftype NDArrayDouble
  [^doubles data
   ^int     ndims
   ^ints    shape
   ^ints    strides
   ^int     offset])

[figure: data (Java array) with cells holding 0 to 5 and unused cells marked ?, annotated with offset, strides[0] and strides[1]]
ndims = 2
shape = [2 3]
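How offset and strides locate an element can be sketched in a few lines of plain Clojure. `flat-index` is a hypothetical helper and the sample data and stride values below are assumptions for illustration; they are not taken from the NDArray source.

```clojure
;; Position in the flat data array of the element at `indexes`:
;; offset + sum over dimensions of (index * stride).
(defn flat-index [offset strides indexes]
  (reduce + offset (map * strides indexes)))

;; A 2 x 3 row-major array stored from position 0: strides are [3 1].
(def data (double-array [0 1 2 3 4 5]))

(aget data (flat-index 0 [3 1] [1 2]))   ; row 1, column 2

;; Swapping the strides gives a transposed 3 x 2 view of the same
;; data without copying anything.
(aget data (flat-index 0 [1 3] [2 1]))   ; same element via the view
```

This is why views such as slices and transposes can be zero-copy: only the small shape, strides and offset fields change while the data array is shared.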
External Implementations
Implementation          Key Features

vectorz-clj             • Pure JVM (wraps Java library Vectorz)
                        • Very fast, especially for vectors and small-medium matrices
                        • Most mature core.matrix implementation at present

Clatrix                 • Uses native BLAS libraries by wrapping the jblas library
                        • Very fast, especially for large 2D matrices
                        • Used by Incanter

parallel-colt-matrix    • Wraps Parallel Colt library from Java
                        • Support for multithreaded matrix computations

arrayspace              • Experimental
                        • Ideas around distributed matrix computation
                        • Builds on ideas from Blaze, Chapel, ZPL

image-matrix            • Treats a Java BufferedImage as a core.matrix array
                        • Because you can?
Switching implementations
(array (range 5))
=> [0 1 2 3 4]
;; switch implementations
(set-current-implementation :vectorz)

;; create array with current implementation
(array (range 5))
=> #<Vector [0.0,1.0,2.0,3.0,4.0]>
;; explicit implementation usage
(array :persistent-vector (range 5))
=> [0 1 2 3 4]
Mixing implementations
(def A (array :persistent-vector (range 5)))
=> [0 1 2 3 4]
(def B (array :vectorz (range 5)))
=> #<Vector [0.0,1.0,2.0,3.0,4.0]>
(* A B)
=> [0.0 1.0 4.0 9.0 16.0]
(* B A)
=> #<Vector [0.0,1.0,4.0,9.0,16.0]>
core.matrix implementations can be mixed
(but: behaviour depends on the first argument)
Future roadmap
• Version 1.0 release
• Data types: Complex numbers
• Expression compilation
• Domain specific extensions, e.g.:
  - symbolic computation (expresso)
  - stats
  - geometry
  - linear algebra
• Incanter integration
END
Incanter Integration
• A great environment for statistical computing, data science and visualisation in Clojure
• Uses the Clatrix matrix library – great performance
• Work in progress to support core.matrix fully for Incanter 2.0
Benchmarks: Clojure vs. Python
Domain specific extensions
Extension library     Focus
core.matrix.stats     Statistical functions
core.matrix.geom      2D and 3D Geometry
expresso              Manipulation of array expressions
Broadcasting Rules
1. Designed for elementwise operations - other uses must be explicit
2. Extends shape vector by adding new leading dimensions
   • original shape [4 5]
   • can broadcast to any shape [x y ... z 4 5]
   • scalars can broadcast to any shape
3. Fills the new array space by duplication of the original array over the new dimensions
4. Smart implementations can avoid making full copies by structural sharing or clever indexing tricks
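Rules 2 to 4 can be sketched for nested vectors in plain Clojure. Both `nested-shape` and `broadcast-to` are hypothetical helpers written for this illustration and are not the core.matrix implementation.

```clojure
;; Shape of a regular nested vector; a scalar has the empty shape [].
(defn nested-shape [a]
  (if (vector? a)
    (into [(count a)] (nested-shape (first a)))
    []))

(defn broadcast-to
  "Broadcasts array or scalar a to target-shape by adding leading
   dimensions and repeating a along each of them."
  [a target-shape]
  (let [s     (nested-shape a)
        extra (- (count target-shape) (count s))]
    ;; Rule 2: the original shape must match the trailing dimensions.
    (when (or (neg? extra)
              (not= s (vec (drop extra target-shape))))
      (throw (ex-info "Incompatible shapes" {:from s :to target-shape})))
    ;; Rule 3: wrap the innermost new dimension first. Each level holds
    ;; n references to the same value, so nothing is copied (rule 4).
    (reduce (fn [acc n] (vec (repeat n acc)))
            a
            (reverse (take extra target-shape)))))

(broadcast-to 1 [3 3])
(broadcast-to [2 1 0] [3 3])
```

The two calls reproduce the intermediate arrays from the scalar and vector broadcasting examples earlier in the deck.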
More Related Content

Viewers also liked

Images and Vision in Python
Images and Vision in PythonImages and Vision in Python
Images and Vision in Python
streety
 
PCAP Graphs for Cybersecurity and System Tuning
PCAP Graphs for Cybersecurity and System TuningPCAP Graphs for Cybersecurity and System Tuning
PCAP Graphs for Cybersecurity and System Tuning
Dr. Mirko Kämpf
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Ryan Bosshart
 
Two dimensional array
Two dimensional arrayTwo dimensional array
Two dimensional array
Rajendran
 
파이썬 Numpy 선형대수 이해하기
파이썬 Numpy 선형대수 이해하기파이썬 Numpy 선형대수 이해하기
파이썬 Numpy 선형대수 이해하기
Yong Joon Moon
 
Application of Matrices
Application of MatricesApplication of Matrices
Application of Matrices
Mohammed Limdiwala
 
Applications of matrices in real life
Applications of matrices in real lifeApplications of matrices in real life
Applications of matrices in real lifeSuhaibFaiz
 
Matrix Representation Of Graph
Matrix Representation Of GraphMatrix Representation Of Graph
Matrix Representation Of GraphAbhishek Pachisia
 
TensorFlow 深度學習快速上手班--自然語言處理應用
TensorFlow 深度學習快速上手班--自然語言處理應用TensorFlow 深度學習快速上手班--自然語言處理應用
TensorFlow 深度學習快速上手班--自然語言處理應用
Mark Chang
 
Applications of Matrices
Applications of MatricesApplications of Matrices
Applications of Matricessanthosh kumar
 
Presentation on application of matrix
Presentation on application of matrixPresentation on application of matrix
Presentation on application of matrix
Prerana Bhattarai
 
TensorFlow 深度學習快速上手班--機器學習
TensorFlow 深度學習快速上手班--機器學習TensorFlow 深度學習快速上手班--機器學習
TensorFlow 深度學習快速上手班--機器學習
Mark Chang
 
The Future of Quantified Self in Healthcare
The Future of Quantified Self in HealthcareThe Future of Quantified Self in Healthcare
The Future of Quantified Self in Healthcare
Quantified Self Dublin
 
2 d geometric transformations
2 d geometric transformations2 d geometric transformations
2 d geometric transformationsMohd Arif
 
computer graphics
computer graphicscomputer graphics
computer graphics
ashpri156
 
Space matrix
Space matrixSpace matrix
Space matrix
Jijin Thomas
 
Getting started with image processing using Matlab
Getting started with image processing using MatlabGetting started with image processing using Matlab
Getting started with image processing using Matlab
Pantech ProLabs India Pvt Ltd
 
Introduction to Computer graphics
Introduction to Computer graphics Introduction to Computer graphics
Introduction to Computer graphics PrathimaBaliga
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
台灣資料科學年會
 

Viewers also liked (20)

Images and Vision in Python
Images and Vision in PythonImages and Vision in Python
Images and Vision in Python
 
PCAP Graphs for Cybersecurity and System Tuning
PCAP Graphs for Cybersecurity and System TuningPCAP Graphs for Cybersecurity and System Tuning
PCAP Graphs for Cybersecurity and System Tuning
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLib
 
Two dimensional array
Two dimensional arrayTwo dimensional array
Two dimensional array
 
파이썬 Numpy 선형대수 이해하기
파이썬 Numpy 선형대수 이해하기파이썬 Numpy 선형대수 이해하기
파이썬 Numpy 선형대수 이해하기
 
Application of Matrices
Application of MatricesApplication of Matrices
Application of Matrices
 
Applications of matrices in real life
Applications of matrices in real lifeApplications of matrices in real life
Applications of matrices in real life
 
Matrix Representation Of Graph
Matrix Representation Of GraphMatrix Representation Of Graph
Matrix Representation Of Graph
 
TensorFlow 深度學習快速上手班--自然語言處理應用
TensorFlow 深度學習快速上手班--自然語言處理應用TensorFlow 深度學習快速上手班--自然語言處理應用
TensorFlow 深度學習快速上手班--自然語言處理應用
 
Applications of Matrices
Applications of MatricesApplications of Matrices
Applications of Matrices
 
Application of matrices in real life
Application of matrices in real lifeApplication of matrices in real life
Application of matrices in real life
 
Presentation on application of matrix
Presentation on application of matrixPresentation on application of matrix
Presentation on application of matrix
 
TensorFlow 深度學習快速上手班--機器學習
TensorFlow 深度學習快速上手班--機器學習TensorFlow 深度學習快速上手班--機器學習
TensorFlow 深度學習快速上手班--機器學習
 
The Future of Quantified Self in Healthcare
The Future of Quantified Self in HealthcareThe Future of Quantified Self in Healthcare
The Future of Quantified Self in Healthcare
 
2 d geometric transformations
2 d geometric transformations2 d geometric transformations
2 d geometric transformations
 
computer graphics
computer graphicscomputer graphics
computer graphics
 
Space matrix
Space matrixSpace matrix
Space matrix
 
Getting started with image processing using Matlab
Getting started with image processing using MatlabGetting started with image processing using Matlab
Getting started with image processing using Matlab
 
Introduction to Computer graphics
Introduction to Computer graphics Introduction to Computer graphics
Introduction to Computer graphics
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
 

Similar to Enter The Matrix

Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with Clojure
John Stevenson
 
Pune Clojure Course Outline
Pune Clojure Course OutlinePune Clojure Course Outline
Pune Clojure Course Outline
Baishampayan Ghose
 
Clojure Intro
Clojure IntroClojure Intro
Clojure Intro
thnetos
 
Tutorialmatlab kurniawan.s
Tutorialmatlab kurniawan.sTutorialmatlab kurniawan.s
Tutorialmatlab kurniawan.s
Kurniawan susanto
 
MATLAB Programming
MATLAB Programming MATLAB Programming
MATLAB Programming
محمدعبد الحى
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In RubyRoss Lawley
 
Clojure made-simple - John Stevenson
Clojure made-simple - John StevensonClojure made-simple - John Stevenson
Clojure made-simple - John Stevenson
JAX London
 
R for Pirates. ESCCONF October 27, 2011
R for Pirates. ESCCONF October 27, 2011R for Pirates. ESCCONF October 27, 2011
R for Pirates. ESCCONF October 27, 2011
Mandi Walls
 
Matlab-1.pptx
Matlab-1.pptxMatlab-1.pptx
Matlab-1.pptx
aboma2hawi
 
Mat lab workshop
Mat lab workshopMat lab workshop
Mat lab workshop
Vinay Kumar
 
Getting started cpp full
Getting started cpp   fullGetting started cpp   full
Getting started cpp full
Võ Hòa
 
Arrays
ArraysArrays
[1062BPY12001] Data analysis with R / week 2
[1062BPY12001] Data analysis with R / week 2[1062BPY12001] Data analysis with R / week 2
[1062BPY12001] Data analysis with R / week 2
Kevin Chun-Hsien Hsu
 
INTRODUCTION TO MATLAB session with notes
  INTRODUCTION TO MATLAB   session with  notes  INTRODUCTION TO MATLAB   session with  notes
INTRODUCTION TO MATLAB session with notes
Infinity Tech Solutions
 
An overview of Python 2.7
An overview of Python 2.7An overview of Python 2.7
An overview of Python 2.7
decoupled
 
A tour of Python
A tour of PythonA tour of Python
A tour of Python
Aleksandar Veselinovic
 

Similar to Enter The Matrix (20)

Getting started with Clojure
Getting started with ClojureGetting started with Clojure
Getting started with Clojure
 
Pune Clojure Course Outline
Pune Clojure Course OutlinePune Clojure Course Outline
Pune Clojure Course Outline
 
Clojure Intro
Clojure IntroClojure Intro
Clojure Intro
 
Tutorialmatlab kurniawan.s
Tutorialmatlab kurniawan.sTutorialmatlab kurniawan.s
Tutorialmatlab kurniawan.s
 
Tutorial matlab
Tutorial matlabTutorial matlab
Tutorial matlab
 
MATLAB Programming
MATLAB Programming MATLAB Programming
MATLAB Programming
 
Clojure intro
Clojure introClojure intro
Clojure intro
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In Ruby
 
Clojure made-simple - John Stevenson
Clojure made-simple - John StevensonClojure made-simple - John Stevenson
Clojure made-simple - John Stevenson
 
R for Pirates. ESCCONF October 27, 2011
R for Pirates. ESCCONF October 27, 2011R for Pirates. ESCCONF October 27, 2011
R for Pirates. ESCCONF October 27, 2011
 
Matlab-1.pptx
Matlab-1.pptxMatlab-1.pptx
Matlab-1.pptx
 
Mat lab workshop
Mat lab workshopMat lab workshop
Mat lab workshop
 
Getting started cpp full
Getting started cpp   fullGetting started cpp   full
Getting started cpp full
 
Arrays
ArraysArrays
Arrays
 
Matlab lec1
Matlab lec1Matlab lec1
Matlab lec1
 
Plc (1)
Plc (1)Plc (1)
Plc (1)
 
[1062BPY12001] Data analysis with R / week 2
[1062BPY12001] Data analysis with R / week 2[1062BPY12001] Data analysis with R / week 2
[1062BPY12001] Data analysis with R / week 2
 
INTRODUCTION TO MATLAB session with notes
  INTRODUCTION TO MATLAB   session with  notes  INTRODUCTION TO MATLAB   session with  notes
INTRODUCTION TO MATLAB session with notes
 
An overview of Python 2.7
An overview of Python 2.7An overview of Python 2.7
An overview of Python 2.7
 
A tour of Python
A tour of PythonA tour of Python
A tour of Python
 

Recently uploaded

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 

Recently uploaded (20)

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes

Enter The Matrix

  • 2. core.matrix Array programming as a language extension for Clojure (with a Numerical computing focus)
  • 3. Plug-in paradigms Paradigm / Exemplar language / Clojure implementation: Functional programming / Haskell / clojure.core; Meta-programming / Lisp; Logic programming / Prolog / core.logic; Process algebras / CSP / Go / core.async; Array programming / APL / core.matrix
  • 4. APL Venerable history • Notation invented in 1957 by Ken Iverson • Implemented at IBM around 1960-64 Has its own keyboard Interesting perspective on code readability life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
  • 5. Modern array programming Standalone environment for statistical programming / graphics Python library for array programming A new language (2012) based on array programming principles .... and many others
  • 6. Why Clojure for array programming? 1. Data Science 2. Platform 3. Philosophy
  • 7. Elements of core.matrix Abstraction N-dimensional arrays – what and why? API What can you do with arrays? Implementation How is everything implemented?
  • 8. Abstraction or: “What is the matrix?”
  • 9. Design wisdom abstraction "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." —Alan Perlis
  • 10. What is an array? Dimensions / Terminology: 1 = Vector; 2 = Matrix; 3 = 3D Array (3rd order Tensor); ...; N = ND Array [Figure: example grid for each dimensionality]
  • 11. Multi-dimensional array properties Dimensions (ordered and indexed): Dimension 0, Dimension 1. Dimension sizes together define the shape of the array (e.g. 3 x 3). Each of the array elements is a regular value. [Figure: 3 x 3 grid holding the values 0 to 8]
  • 12. Arrays = data about relationships Set X (rows): :A :B :C. Set Y (columns): :R :S :T :U. Row :A = 0 1 2 3; row :B = 4 5 6 7; row :C = 8 9 10 11. Each element is a fact about a relationship between a value in Set X and a value in Set Y (foo :A :T) => 2 ND array lookup is analogous to arity-N functions!
  • 13. Why arrays instead of functions? [[0 1 2] [3 4 5] [6 7 8]] vs. (fn [i j] (+ j (* 3 i))) 1. Precomputed values with O(1) access 2. Efficient computation with optimised bulk operations 3. Data driven representation
  • 14. Expressivity Java: for (int i=0; i<n; i++) { for (int j=0; j<m; j++) { for (int k=0; k<p; k++) { result[i][j][k] = a[i][j][k] + b[i][j][k]; } } } Clojure: (mapv (fn [a b] (mapv (fn [a b] (mapv + a b)) a b)) a b) Clojure + core.matrix: (+ a b)
  • 15. Principle of array programming: generalise operations on regular (scalar) values to multi-dimensional data (+ 1 2) => 3 (+ ) => 2
  • 16. API
  • 17. Equivalence to Clojure vectors [0 1 2] ↔ 1D array with elements 0 1 2; [[0 1 2] [3 4 5] [6 7 8]] ↔ 3 x 3 array with elements 0 to 8. Nested Clojure vectors of regular shape are arrays!
  • 18. Array creation ;; Build an array from a sequence (array (range 5)) => [0 1 2 3 4] ;; ... or from nested arrays/sequences (array (for [i (range 3)] (for [j (range 3)] (str i j)))) => [["00" "01" "02"] ["10" "11" "12"] ["20" "21" "22"]]
  • 19. Shape ;; Shape of a 3 x 2 matrix (shape [[1 2] [3 4] [5 6]]) => [3 2] ;; Regular values have no shape (shape 10.0) => nil
  • 20. Dimensionality ;; Dimensionality = number of dimensions = length of shape vector = nesting level (dimensionality [[1 2] [3 4] [5 6]]) => 2 (dimensionality [1 2 3 4 5]) => 1 ;; Regular values have zero dimensionality (dimensionality "Foo") => 0
  • 21. Scalars vs. arrays (array? [[1 2] [3 4]]) => true (array? 12.3) => false (scalar? [1 2 3]) => false (scalar? "foo") => true Everything is either an array or a scalar. A scalar works like a 0-dimensional array.
  • 22. Indexed element access (def M [[0 1 2] [3 4 5] [6 7 8]]) (mget M 1 2) => 5 [Figure: 3 x 3 grid, rows indexed by Dimension 0 and columns by Dimension 1]
  • 23. Slicing access (def M [[0 1 2] [3 4 5] [6 7 8]]) (slice M 1) => [3 4 5] A slice of an array is itself an array! [Figure: 3 x 3 grid, rows indexed by Dimension 0 and columns by Dimension 1]
  • 24. Arrays as a composition of slices (def M [[0 1 2] [3 4 5] [6 7 8]]) (slices M) => ([0 1 2] [3 4 5] [6 7 8]) (apply + (slices M)) => [9 12 15] [Figure: 3 x 3 grid split into its three row slices]
  • 25. Operators (use 'clojure.core.matrix.operators) (+ [1 2 3] [4 5 6]) => [5 7 9] (* [1 2 3] [0 2 -1]) => [0 4 -3] (- [1 2] [3 4 5 6]) => RuntimeException Incompatible shapes (/ [1 2 3] 10.0) => [0.1 0.2 0.3]
  • 26. Broadcasting scalars (+ [[0 1 2] [3 4 5] [6 7 8]] 1) = ? “Broadcasting”: (+ [[0 1 2] [3 4 5] [6 7 8]] [[1 1 1] [1 1 1] [1 1 1]]) = [[1 2 3] [4 5 6] [7 8 9]]
  • 27. Broadcasting arrays (+ [[0 1 2] [3 4 5] [6 7 8]] [2 1 0]) = ? “Broadcasting”: (+ [[0 1 2] [3 4 5] [6 7 8]] [[2 1 0] [2 1 0] [2 1 0]]) = [[2 2 2] [5 5 5] [8 8 8]]
  • 28. Functional operations on sequences map: (map inc [1 2 3 4]) => (2 3 4 5) reduce: (reduce * [1 2 3 4]) => 24 seq: (seq [1 2 3 4]) => (1 2 3 4)
  • 29. Functional operations on arrays map ↔ emap (“element map”): (emap inc [[1 2] [3 4]]) => [[2 3] [4 5]] reduce ↔ ereduce (“element reduce”): (ereduce * [[1 2] [3 4]]) => 24 seq ↔ eseq (“element seq”): (eseq [[1 2] [3 4]]) => (1 2 3 4)
  • 30. Specialised matrix constructors (zero-matrix 4 3) (identity-matrix 4) (permutation-matrix [3 1 0 2]) [Figure: the resulting 4 x 3 zero matrix, 4 x 4 identity matrix and 4 x 4 permutation matrix]
  • 32. Matrix multiplication (mmul [[9 2 7] [6 4 8]] [[2 8] [3 4] [5 9]]) => [[59 143] [64 136]]
  • 33. Geometry (def π 3.141592653589793) (def τ (* 2.0 π)) (defn rot [turns] (let [a (* τ turns)] [[ (cos a) (sin a)] [(-(sin a)) (cos a)]])) (mmul (rot 1/8) [3 4]) => [4.9497 0.7071] 45° = 1/8 turn. NB: See Tau Manifesto (http://tauday.com/) regarding the use of Tau (τ)
  • 34. Demo
  • 36. Mutability – the tradeoffs Pros: ✓ Faster ✓ Reduces GC pressure ✓ Standard in many existing matrix libraries Cons: ✘ Mutability is evil ✘ Harder to maintain / debug ✘ Hard to write concurrent code ✘ Not idiomatic in Clojure ✘ Not supported by all core.matrix implementations ✘ “Place Oriented Programming” Avoid mutability. But it’s an option if you really need it.
  • 37. Mutability – performance benefit Time for addition of vectors* (ns): Immutable add: 120; Mutable add!: 28 (4x performance benefit). * Length 10 double vectors, using :vectorz implementation
  • 38. Mutability – syntax (add [1 2] 1) => [2 3] (add! [1 2] 1) => RuntimeException ...... not mutable! (def a (mutable [1 2])) => #<Vector2 [1.0,2.0]> ;; coerce to a mutable format (add! a 1) => #<Vector2 [2.0,3.0]> A core.matrix function name ending with “!” performs mutation (usually on the first argument only)
  • 41.
  • 42. Lots of trade-offs Native Libraries vs. Pure JVM Mutability vs. Immutability Specialized elements (e.g. doubles) vs. Generalised elements (Object, Complex) Multi-dimensional vs. 2D matrices only Memory efficiency vs. Runtime efficiency Concrete types vs. Abstraction (interfaces / wrappers) Specified storage format vs. Multiple / arbitrary storage formats License A vs. License B Lightweight (zero-copy) views vs. Heavyweight copying / cloning
  • 43. What’s the best data structure? Length 50 “range” vector: 0 1 2 3 .. 49 1. Clojure Vector: [0 1 2 …. 49] 2. Java double[] array: new double[] {0, 1, 2, …. 49}; 3. Custom deftype: (deftype RangeVector [^long start ^long end]) 4. Native vector format: (org.jblas.DoubleMatrix. params)
  • 44. There is no spoon
  • 46. Clojure Protocols clojure.core.matrix.protocols (defprotocol PSummable "Protocol to support the summing of all elements in an array. The array must hold numeric values only, or an exception will be thrown." (element-sum [m])) 1. Abstract Interface 2. Open Extension 3. Fast dispatch
  • 47. Protocols are fast and open Function call costs (ns) / Open extension: Static / inlined code: 1.2, ✘; Primitive function call: 1.9, ✘; Boxed function call: 7.9, ✘; Protocol call: 13.8, ✓; Multimethod*: 89, ✓. * Using class of first argument as dispatch function
  • 48. Typical core.matrix call path User Code: (esum [1 2 3 4]) → core.matrix API (matrix.clj): (defn esum "Calculates the sum of all the elements in a numerical array." [m] (mp/element-sum m)) → Impl. code: (extend-protocol mp/PSummable SomeImplementationClass (element-sum [a] ………))
  • 49. Most protocols are optional PImplementation PDimensionInfo PIndexedAccess PIndexedSetting PMatrixEquality PSummable PRowOperations PVectorCross PCoercion PTranspose PVectorDistance PMatrixMultiply PAddProductMutable PReshaping PMathsFunctionsMutable PMatrixRank PArrayMetrics PAddProduct PVectorOps PMatrixScaling PMatrixOps PMatrixPredicates PSparseArray ….. MANDATORY: • Required for a working core.matrix implementation OPTIONAL: • Everything in the API will work without these • core.matrix provides a “default implementation” • Implement for improved performance
  • 50. Default implementations (namespace clojure.core.matrix.impl.default) (extend-protocol mp/PSummable Number (element-sum [a] a) Object (element-sum [a] (mp/element-reduce a +))) Annotations: mp/PSummable is the protocol name, from namespace clojure.core.matrix.protocols; the Number case is the implementation for any Number; the Object case is the implementation for an arbitrary Object (assumed to be an array)
  • 51. Extending a protocol (extend-protocol mp/PSummable (Class/forName "[D") (element-sum [m] (let [^doubles m m] (areduce m i res 0.0 (+ res (aget m i)))))) Annotations: (Class/forName "[D") is the class to implement the protocol for, in this case a Java array: double[]; the ^doubles type hint is added to avoid reflection; the areduce form is optimised code to add up all the elements of a double[] array
  • 52. Speedup vs. default implementation Timing for element sum of length 100 double array (ns): (esum v) "Default": 3690; (reduce + v): 2859; (esum v) "Specialised": 201 (15-20x benefit)
  • 53. Internal Implementations Implementation / Key Features: :persistent-vector: • Support for Clojure vectors • Immutable • Not so fast, but great for quick testing; :double-array: • Treats Java double[] objects as 1D arrays • Mutable – useful for accumulating results etc.; :sequence: • Treats Clojure sequences as arrays • Mostly useful for interop / data loading; :ndarray, :ndarray-double, :ndarray-long, .....: • Google Summer of Code project by Dmitry Groshev • Pure Clojure • N-Dimensional arrays similar to NumPy • Support arbitrary dimensions and data types; :scalar-wrapper, :slice-wrapper, :nd-wrapper: • Internal wrapper formats • Used to provide efficient default implementations for various protocols
  • 55. External Implementations Implementation / Key Features: vectorz-clj: • Pure JVM (wraps Java library Vectorz) • Very fast, especially for vectors and small-medium matrices • Most mature core.matrix implementation at present; Clatrix: • Uses native BLAS libraries by wrapping the jblas library • Very fast, especially for large 2D matrices • Used by Incanter; parallel-colt-matrix: • Wraps Parallel Colt library from Java • Support for multithreaded matrix computations; arrayspace: • Experimental • Ideas around distributed matrix computation • Builds on ideas from Blaze, Chapel, ZPL; image-matrix: • Treats a Java BufferedImage as a core.matrix array • Because you can?
  • 56. Switching implementations (array (range 5)) => [0 1 2 3 4] ;; switch implementations (set-current-implementation :vectorz) ;; create array with current implementation (array (range 5)) => #<Vector [0.0,1.0,2.0,3.0,4.0]> ;; explicit implementation usage (array :persistent-vector (range 5)) => [0 1 2 3 4]
  • 57. Mixing implementations (def A (array :persistent-vector (range 5))) => [0 1 2 3 4] (def B (array :vectorz (range 5))) => #<Vector [0.0,1.0,2.0,3.0,4.0]> (* A B) => [0.0 1.0 4.0 9.0 16.0] (* B A) => #<Vector [0.0,1.0,4.0,9.0,16.0]> core.matrix implementations can be mixed (but: behaviour depends on the first argument)
  • 58. Future roadmap • Version 1.0 release • Data types: Complex numbers • Expression compilation • Domain specific extensions, e.g.: symbolic computation (expresso), stats, geometry, linear algebra • Incanter integration
  • 59. END
  • 60. Incanter Integration • A great environment for statistical computing, data science and visualisation in Clojure • Uses the Clatrix matrix library – great performance • Work in progress to support core.matrix fully for Incanter 2.0
  • 62. Domain specific extensions Extension library / Focus: core.matrix.stats: Statistical functions; core.matrix.geom: 2D and 3D Geometry; expresso: Manipulation of array expressions
  • 63. Broadcasting Rules 1. Designed for elementwise operations - other uses must be explicit 2. Extends shape vector by adding new leading dimensions • original shape [4 5] • can broadcast to any shape [x y ... z 4 5] • scalars can broadcast to any shape 3. Fills the new array space by duplication of the original array over the new dimensions 4. Smart implementations can avoid making full copies by structural sharing or clever indexing tricks
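
The broadcasting rules on slide 63 can be sketched in plain Clojure. This is a minimal illustration that uses only clojure.core and nested vectors; it is not how core.matrix implements broadcasting. The helper names shape-of, broadcastable? and broadcast-to are hypothetical, and scalars are given an empty shape vector here purely for convenience.

```clojure
;; Hypothetical helpers for illustration only (not part of core.matrix).

(defn shape-of
  "Shape of a regular nested vector. Assumption: scalars get the empty shape []."
  [a]
  (if (vector? a)
    (into [(count a)] (shape-of (first a)))
    []))

(defn broadcastable?
  "Rule 2: shape `from` can broadcast to `to` only by adding leading
  dimensions, i.e. `from` must be a trailing suffix of `to`."
  [from to]
  (and (<= (count from) (count to))
       (= (vec from) (vec (take-last (count from) to)))))

(defn broadcast-to
  "Rule 3: fill the new leading dimensions by duplicating the original array.
  This naive version makes full copies of the structure (see rule 4)."
  [a target]
  (let [from (shape-of a)]
    (when-not (broadcastable? from target)
      (throw (ex-info "Incompatible shapes" {:from from :to target})))
    ;; Wrap from the innermost new dimension outwards.
    (reduce (fn [acc n] (vec (repeat n acc)))
            a
            (reverse (drop-last (count from) target)))))

(def M [[0 1 2] [3 4 5] [6 7 8]])

;; Elementwise addition after explicit broadcasting, as on slide 27.
(def sum-with-row
  (mapv (partial mapv +) M (broadcast-to [2 1 0] (shape-of M))))
;; sum-with-row => [[2 2 2] [5 5 5] [8 8 8]]
```

Because Clojure vectors are immutable, (repeat n acc) already reuses the same inner vector n times, which is one simple form of the structural sharing mentioned in rule 4.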

Editor's Notes

  1. Today I’m going to be talking about core.matrix, and it’s quite appropriate that I’m talking about it here today at the ClojureConj, because this project actually came about as a direct result of conversations I had with many people at last year’s Conj. The focus of those discussions was very much about how we could make numerical computing better in Clojure. And the solution I’ve been working on over the past year, along with a number of collaborators, is core.matrix, which offers array programming as a language extension to Clojure.
  2. When I say language extension, it is of course in the sense that Clojure seems to have this ability to absorb new paradigms just by plugging in new libraries. Clojure already stole many good pure functional programming techniques from languages like Haskell. And of course we have the macro meta-programming capabilities from Lisp. More recently we’ve got core.logic bringing in logic programming, inspired by Prolog and miniKanren. And core.async bringing in Communicating Sequential Processes, with some syntax similar to Go. And core.matrix is designed very much in the same way, to provide array programming capabilities. And if we want to trace the roots of array programming, we can go all the way back to this language called APL.
  3. About the same age as Lisp? First specified in 1958. Love the fact that it has its own keyboard, with all these symbols inspired by mathematical notation. And you get some crazy code. Might seem like a bit of a dinosaur now.
  4. Array programming has had quite a renaissance in recent years. This is because of the increasing importance of data science and numerical computing in many fields. So we’ve seen languages like R that provide an environment for statistical computing. Highlight value of paradigm: clearly a demand for these kinds of numerical computing capabilities.
  5. Why bring array programming to Clojure? 1. Data science focus: lots of interest in doing data crunching work in Clojure. 2. Provides a powerful platform: why should you have to introduce a whole new stack to get access to the array programming paradigm? You shouldn’t have to give up the advantages of a good general purpose language to do data science. Clojure is already a great platform to build on: the JVM platform has lots of advantages. 3. Clojure is compelling for many philosophical reasons: concurrency, immutability, state, a focus on data. Array programming seems to be a good fit for this philosophy.
  6. So today I’m going to talk about core.matrix through three different lenses. First I want to talk about the abstraction: what are these arrays? Then I’m going to talk about the core.matrix API. Then the implementation: how does this all work, and some of the engineering choices we’ve made.
  7. Start off with one of my favourite quotes, because it contains a pretty important insight. “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” There is of course one error here….. (click) We should of course be talking about an abstraction here, not a concrete data structure. A great example of this is the sequence abstraction in Clojure: there are literally hundreds of functions that operate on Clojure sequences. Because so many functions produce and consume sequences, it gives you many different ways to compose them together. And it’s more than just the clojure.core API: other code can build on the same abstraction, which means that the composability extends to any code you write that uses the same abstraction. It makes entire libraries composable. In some ways I think the key to building systems using simple, composable components is about having shared abstractions. We’ve taken this principle very much to heart in core.matrix. Our abstraction of course is the array, more specifically the multi-dimensional array. And the rest of core.matrix is really all about giving you a powerful set of composable operations you can do with arrays.
  8. Overloaded terminology! Vector = 1D array (maths / array programming sense), but also a Clojure vector. Matrix: conventionally used to indicate a 2-dimensional numerical array. Array: in the sense of the N-dimensional array, but also the specific concrete example of a Java array. Dimensions: also overloaded! Here used in the sense of the number of dimensions in an array, but it’s also used to refer to the number of dimensions in a vector space, e.g. 3-dimensional Euclidean space. If we’re lucky it should be clear from the context what we’re talking about.
  9. Give you an idea about how general array programming can be: an array is a way of representing a function using data. Instead of computing a value for each combination of inputs, we’re typically pre-computing all such values.
  10. Give you an idea about how general array programming can be: an array is a way of representing a function using data. Instead of computing a value for each combination of inputs, we’re typically pre-computing all such values.
  11. Example of adding two 3D arrays. In Java it’s just a big nested loop… In Clojure you can do it with nested maps, which is a bit more of a functional style, but you’ve still got this three-level nesting. With core.matrix it’s really simple: we just generalise + to arbitrary multi-dimensional arrays and it all just works. Does conciseness matter? Well, if you’re writing a lot of code manipulating arrays it’s going to save you quite a bit of time, but more importantly it makes it much easier to avoid errors. It is very easy to get off-by-one errors in this kind of code. core.matrix gives you a nice DSL that does all the index juggling for you. It also helps you to be mentally much closer to the problem that you are modelling. You ideally want an API that reflects the way that you think about the problem you are solving.
  12. So let’s talk about the core.matrix API. This isn’t going to be an exhaustive tour, but I’m going to highlight a few of the key features to give you a taste of what is possible.
  13. One of the important API design objectives was to exploit the “natural equivalence of arrays to nested Clojure vectors”. A 1D array is a Clojure vector, a 2D array is like a vector of vectors. Most things in the core.matrix API work with nested Clojure vectors. This is nice: it gives a natural syntax, and it is great for dynamic, exploratory work at the REPL.
  14. The most fundamental attribute of an array is probably the shape
  15. The most fundamental attribute of an array is probably the shape
  16. Arrays are compositions of arrays! This is one of the best signs that you have a good abstraction: if the abstraction can be recursively defined as a composition of the same abstraction.
  17. So of course we have quite a few different functions that let you work with slices of arrays. Most useful is probably the slices function, which cuts an array into a sequence of its slices. It is pretty common to want to do this: imagine if each slice is a row in your data set.
  18. We define array versions of the common mathematical operators. These use the same names as clojure.core. You have to use the clojure.core.matrix.operators namespace if you want to use these names instead of the standard clojure.core operators.
  19. Question: what should happen if we add a scalar number to an array? We have a feature called broadcasting, which allows a lower dimensional array to be treated as a higher dimensional array.
  20. The idea of broadcasting also generalises to arrays! Here the semantics are the same: we just duplicate the smaller array to fill out the shape of the larger array.
  21. So let’s talk about some higher order functions. Two of my favourite Clojure functions, map and reduce, are extremely useful higher order functions.
  22. So one of the interesting observations about array programming is that you can also see it as a generalisation of sequences in multiple dimensions, so it probably isn’t too surprising that many of the sequence functions in Clojure actually have a nice array programming equivalent. emap is the equivalent of map: it maps a function over all elements of an array. The key difference is that it preserves the structure of the array, so here we’re mapping over a 2x2 matrix, and therefore we get a 2x2 result. ereduce is the equivalent of reduce over all elements. eseq is a handy bridge between core.matrix arrays and regular Clojure sequences: it just returns all the elements of an array in order. Note the row-major ordering of eseq and ereduce.
  23. Basically mutability is horrible. You should be avoiding it as much as you can. But it turns out that it is needed in some cases: performance matters for numerical work. Mutability is OK for library implementers, e.g. accumulation of a result in a temporary array. Once a value is constructed, it shouldn’t be mutated any more.
  24. Usually a 4x performance benefit isn’t a big deal, unless it happens to be your bottleneck. There are cases where it might be important: e.g. if you are crunching through a lot of data and need to add to some sort of accumulator…
  25. Mutability is OK for library implementers, e.g. accumulation of a result in a temporary array. Once a value is constructed, it shouldn’t be mutated any more.
  26. Clearly this is insane – why so many matrix libraries?
  27. This explains the problem. But doesn’t really help us….
  28. The point is – there isn’t ever going to be a perfect right answer when choosing a concrete data type to implement an abstraction. There are always going to be inherent advantages of different approaches
  29. Luckily we have a secret weapon, and I think this is actually what really distinguishes core.matrix from all other array programming systems
  30. Of course the secret weapon is Clojure protocols. Here’s an example: the PSummable protocol is a very simple protocol that allows you to compute the sum of all values in an array. Three things are important to know about protocols. First is that they define an abstract interface, which is exactly what we need to define operations that work on our array abstraction. Secondly they feature open extension, which means that we can solve the expression problem and use protocols with arbitrary types. Importantly, this includes types that weren’t written with the protocol in mind, e.g. arbitrary Java classes. Third feature is really fast dispatch, which is important if we want core.matrix to be useful in high performance situations.
  31. Protocols are really the “sweet spot” of being both fast and open. We benchmarked a pretty wide variety of different function calls.
  32. It’s easy to make a working core.matrix implementation! It’s more work if you want to make it perform across the whole API. But that’s OK because it can be done incrementally. So hopefully this provides a smooth development path for core.matrix implementations to integrate.
  33. The secret is having default implementations for all protocols, which get used if you haven’t extended the protocol for your particular type. Note that the default implementation delegates to another protocol call. This is generally the case: ultimately all these protocol calls have to be implemented in terms of the lower-level mandatory protocols if we want them to work on any array.
  34. Value of a specialised implementation
  35. Makes some operations very efficient. For example, if you want to transpose an NDArray, you just need to reverse the shape and reverse the strides.
  36. vectorz-clj: probably the best choice if you want general purpose double numerics. clatrix: probably the best choice if you want linear algebra with big matrices.
  37. Not only can you switch implementation: you can also mix them! This is actually quite a unique capability. How do we do this? We provide generic coercion functionality, so implementations typically use this to coerce the second argument to the type of the first.
  38. So we have some rules for broadcasting. Note that it only really makes sense for elementwise operations. You can broadcast arrays explicitly if you want to, but it only happens automatically for elementwise operations at present. You can only add leading dimensions.
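
Notes 30 to 33 describe a protocol with a default implementation that specialised types can override. A minimal sketch of that pattern follows, using only clojure.core. The protocol name PSummableSketch is hypothetical, and the Object case here recurses over nested collections as a simplifying assumption, instead of delegating to mp/element-reduce as the default on slide 50 does.

```clojure
;; Hypothetical stand-in for mp/PSummable; illustration only.
(defprotocol PSummableSketch
  (element-sum [m] "Sum of all elements in a numerical array."))

;; Default implementations: they work for any Number and for any
;; collection of summable things, but are not especially fast.
(extend-protocol PSummableSketch
  Number
  (element-sum [a] a)
  Object
  (element-sum [a] (reduce + (map element-sum a))))

;; Specialised implementation for double[], following slide 51.
;; The class expression must come first in its own extend-protocol form.
(extend-protocol PSummableSketch
  (Class/forName "[D")
  (element-sum [m]
    (let [^doubles m m] ; type hint avoids reflection on aget
      (areduce m i res 0.0 (+ res (aget m i))))))

(element-sum [[1 2] [3 4]])          ;; => 10  (default Object path)
(element-sum (double-array [1 2 3])) ;; => 6.0 (specialised double[] path)
```

Callers only ever see element-sum; which implementation runs is decided by the type of the argument, which is what lets a new array type start with the defaults and add faster versions incrementally.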