SlideShare a Scribd company logo
1 of 18
Download to read offline
Parallel Collections
         with Scala

Jul 6' 2012 > Vikas Hazrati > vikas@knoldus.com > @vhazrati
Motivation

Multiple-cores



Popular Parallel Programming remains a formidable challenge.




                                 Implicit Parallelism
scala> val list = (1 to 10000).toList




                    scala> list.map(_ + 42)

                    scala> list.par.map(_ + 42)
scala> List(1,2,3,4,5)
res0: List[Int] = List(1, 2, 3, 4, 5)

scala> res0.map(println(_))
1
2
3
4
5
res1: List[Unit] = List((), (), (), (), ())

scala> res0.par.map(println(_))
3
1
4
2
5
res2: scala.collection.parallel.immutable.ParSeq[Unit] = ParVector((),
(), (), (), ())
ParArray

ParVector

mutable.ParHashMap

mutable.ParHashSet

immutable.ParHashMap

immutable.ParHashSet

ParRange

ParTrieMap (collection.concurrent.TrieMaps are
new in 2.10)
Caution: Performance benefits visible only around several
             Thousand elements in the collection




             Machine Architecture
Depends on
                 JVM vendor and version
                     Per element workload
                      Specific collection – ParArray, ParTrieMap
                Specific operation – transformer(filter), accessor (foreach)

             Memory Management
map, fold and filter
scala> val parArray = (1 to 1000000).toArray.par
           scala> parArray.fold(0)(_+_)
           res3: Int = 1784293664


 scala> val narArray = (1 to 1000000).toArray

           scala> narArray.fold(0)(_+_)
                                                   I did not notice
           res5: Int = 1784293664
                                                   Difference on my
                                                   laptop
           scala> parArray.fold(0)(_+_)
           res6: Int = 1784293664
creating a parallel collection

    import scala.collection.parallel.immutable.ParVector
                                                             With a new
    val pv = new ParVector[Int]




     val pv = Vector(1,2,3,4,5,6,7,8,9).par

                                                    Taking a sequential collection
                                                    And converting it

Parallel collections can be converted back to sequential collections with seq
Collections are inherently sequential

They are converted to || by copying elements into similar parallel collection



An example is List– it’s converted into a standard immutable parallel
sequence, which is a ParVector.
                                                                        Overhead!

          Array, Vector, HashMap do not have this overhead
how does it work?

     Map reduce ?

 by recursively “splitting” a given collection, applying an operation on each partition
of the collection in parallel, and re-“combining” all of the results that were completed
in parallel.




         Side effecting operations                    Non Associative operations
scala> var sum =0      side effecting operation
sum: Int = 0

scala> val list = (1 to 1000).toList.par

scala> list.foreach(sum += _); sum
res7: Int = 452474

scala> var sum =0
sum: Int = 0

scala> list.foreach(sum += _); sum
res8: Int = 497761

scala> var sum =0
sum: Int = 0

scala> list.foreach(sum += _); sum
res9: Int = 422508
non-associative operations

The order in which function is applied to the elements of the collection can
be arbitrary
                scala> val list = (1 to 1000).toList.par

                scala> list.reduce(_-_)

                res01: Int = -228888

                scala> list.reduce(_-_)

                res02: Int = -61000

                scala> list.reduce(_-_)

                res03: Int = -331818
associate but non-commutative

scala> val strings = List("abc","def","ghi","jk","lmnop","qrs","tuv","wx","yz").par
strings: scala.collection.parallel.immutable.ParSeq[java.lang.String] =
 ParVector(abc, def, ghi, jk, lmnop, qrs, tuv, wx, yz)



scala> val alphabet = strings.reduce(_++_)
alphabet: java.lang.String = abcdefghijklmnopqrstuvwxyz
out of order?

Operations may be out of order

BUT

Recombination of results would be in order

                                             C
        collection                                   A
    A        B       C                                       B

                                                 A       B       C
performance




In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where
 the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node;
instead, its position in the tree defines the key with which it is associated.
conversions


                                                        List is converted to
                                                        vector




Converting parallel to sequential takes constant time
architecture



      splitters                    combiners


Split the collection into    Is a Builder.
Non-trivial partitions so    Combines split lists together.
That they can be accessed
in sequence
brickbats

Absence of configuration



                           Not all algorithms are parallel friendly


  unproven

       Now, if you want your code to not care whether it receives a
       parallel or sequential collection, you should prefix it with
       Gen: GenTraversable, GenIterable, GenSeq, etc.
       These can be either parallel or sequential.

More Related Content

What's hot

Functional programming with_scala
Functional programming with_scalaFunctional programming with_scala
Functional programming with_scalaRaymond Tay
 
Data Structures In Scala
Data Structures In ScalaData Structures In Scala
Data Structures In ScalaKnoldus Inc.
 
Purely Functional Data Structures in Scala
Purely Functional Data Structures in ScalaPurely Functional Data Structures in Scala
Purely Functional Data Structures in ScalaVladimir Kostyukov
 
Getting Started With Scala
Getting Started With ScalaGetting Started With Scala
Getting Started With ScalaMeetu Maltiar
 
Why Haskell Matters
Why Haskell MattersWhy Haskell Matters
Why Haskell Mattersromanandreg
 
Functional Patterns for the non-mathematician
Functional Patterns for the non-mathematicianFunctional Patterns for the non-mathematician
Functional Patterns for the non-mathematicianBrian Lonsdorf
 
Lambda Expressions in Java 8
Lambda Expressions in Java 8Lambda Expressions in Java 8
Lambda Expressions in Java 8bryanbibat
 
Scala case of case classes
Scala   case of case classesScala   case of case classes
Scala case of case classesVulcanMinds
 
R Reference Card for Data Mining
R Reference Card for Data MiningR Reference Card for Data Mining
R Reference Card for Data MiningYanchang Zhao
 
R short-refcard
R short-refcardR short-refcard
R short-refcardconline
 
学生向けScalaハンズオンテキスト part2
学生向けScalaハンズオンテキスト part2学生向けScalaハンズオンテキスト part2
学生向けScalaハンズオンテキスト part2Opt Technologies
 

What's hot (19)

Scala Introduction
Scala IntroductionScala Introduction
Scala Introduction
 
Functional programming with_scala
Functional programming with_scalaFunctional programming with_scala
Functional programming with_scala
 
Data Structures In Scala
Data Structures In ScalaData Structures In Scala
Data Structures In Scala
 
Purely Functional Data Structures in Scala
Purely Functional Data Structures in ScalaPurely Functional Data Structures in Scala
Purely Functional Data Structures in Scala
 
Getting Started With Scala
Getting Started With ScalaGetting Started With Scala
Getting Started With Scala
 
Scala collections
Scala collectionsScala collections
Scala collections
 
Data transformation-cheatsheet
Data transformation-cheatsheetData transformation-cheatsheet
Data transformation-cheatsheet
 
Why Haskell Matters
Why Haskell MattersWhy Haskell Matters
Why Haskell Matters
 
Tuples All the Way Down
Tuples All the Way DownTuples All the Way Down
Tuples All the Way Down
 
Map, Reduce and Filter in Swift
Map, Reduce and Filter in SwiftMap, Reduce and Filter in Swift
Map, Reduce and Filter in Swift
 
Functional Patterns for the non-mathematician
Functional Patterns for the non-mathematicianFunctional Patterns for the non-mathematician
Functional Patterns for the non-mathematician
 
Python programming : List and tuples
Python programming : List and tuplesPython programming : List and tuples
Python programming : List and tuples
 
16 containers
16   containers16   containers
16 containers
 
Lambda Expressions in Java 8
Lambda Expressions in Java 8Lambda Expressions in Java 8
Lambda Expressions in Java 8
 
Scala case of case classes
Scala   case of case classesScala   case of case classes
Scala case of case classes
 
Collection and framework
Collection and frameworkCollection and framework
Collection and framework
 
R Reference Card for Data Mining
R Reference Card for Data MiningR Reference Card for Data Mining
R Reference Card for Data Mining
 
R short-refcard
R short-refcardR short-refcard
R short-refcard
 
学生向けScalaハンズオンテキスト part2
学生向けScalaハンズオンテキスト part2学生向けScalaハンズオンテキスト part2
学生向けScalaハンズオンテキスト part2
 

Similar to Scala Parallel Collections for High Performance

Scalable Applications with Scala
Scalable Applications with ScalaScalable Applications with Scala
Scalable Applications with ScalaNimrod Argov
 
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...Collection Framework in Java | Generics | Input-Output in Java | Serializatio...
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...Sagar Verma
 
Collection advance
Collection advanceCollection advance
Collection advanceAakash Jain
 
Functional programming with Scala
Functional programming with ScalaFunctional programming with Scala
Functional programming with ScalaNeelkanth Sachdeva
 
Functional Programming With Scala
Functional Programming With ScalaFunctional Programming With Scala
Functional Programming With ScalaKnoldus Inc.
 
Collections Java e Google Collections
Collections Java e Google CollectionsCollections Java e Google Collections
Collections Java e Google CollectionsAndré Faria Gomes
 
The Scala Programming Language
The Scala Programming LanguageThe Scala Programming Language
The Scala Programming Languageleague
 
Introduction to parallel and distributed computation with spark
Introduction to parallel and distributed computation with sparkIntroduction to parallel and distributed computation with spark
Introduction to parallel and distributed computation with sparkAngelo Leto
 
Scala training workshop 02
Scala training workshop 02Scala training workshop 02
Scala training workshop 02Nguyen Tuan
 
Taxonomy of Scala
Taxonomy of ScalaTaxonomy of Scala
Taxonomy of Scalashinolajla
 
Scala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoScala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoSagie Davidovich
 
Underscore.js
Underscore.jsUnderscore.js
Underscore.jstimourian
 
Scala collections api expressivity and brevity upgrade from java
Scala collections api  expressivity and brevity upgrade from javaScala collections api  expressivity and brevity upgrade from java
Scala collections api expressivity and brevity upgrade from javaIndicThreads
 
(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?Tomasz Wrobel
 

Similar to Scala Parallel Collections for High Performance (20)

Scala Collections
Scala CollectionsScala Collections
Scala Collections
 
Scalable Applications with Scala
Scalable Applications with ScalaScalable Applications with Scala
Scalable Applications with Scala
 
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...Collection Framework in Java | Generics | Input-Output in Java | Serializatio...
Collection Framework in Java | Generics | Input-Output in Java | Serializatio...
 
Collection advance
Collection advanceCollection advance
Collection advance
 
Functional programming with Scala
Functional programming with ScalaFunctional programming with Scala
Functional programming with Scala
 
Functional Programming With Scala
Functional Programming With ScalaFunctional Programming With Scala
Functional Programming With Scala
 
Collections Java e Google Collections
Collections Java e Google CollectionsCollections Java e Google Collections
Collections Java e Google Collections
 
The Scala Programming Language
The Scala Programming LanguageThe Scala Programming Language
The Scala Programming Language
 
Introduction to parallel and distributed computation with spark
Introduction to parallel and distributed computation with sparkIntroduction to parallel and distributed computation with spark
Introduction to parallel and distributed computation with spark
 
Scala training workshop 02
Scala training workshop 02Scala training workshop 02
Scala training workshop 02
 
Taxonomy of Scala
Taxonomy of ScalaTaxonomy of Scala
Taxonomy of Scala
 
Spark workshop
Spark workshopSpark workshop
Spark workshop
 
Meet scala
Meet scalaMeet scala
Meet scala
 
Scala collections wizardry - Scalapeño
Scala collections wizardry - ScalapeñoScala collections wizardry - Scalapeño
Scala collections wizardry - Scalapeño
 
Underscore.js
Underscore.jsUnderscore.js
Underscore.js
 
Java Collections
Java  Collections Java  Collections
Java Collections
 
Scala collections api expressivity and brevity upgrade from java
Scala collections api  expressivity and brevity upgrade from javaScala collections api  expressivity and brevity upgrade from java
Scala collections api expressivity and brevity upgrade from java
 
(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?(How) can we benefit from adopting scala?
(How) can we benefit from adopting scala?
 
Scala basic
Scala basicScala basic
Scala basic
 
Ppt chapter09
Ppt chapter09Ppt chapter09
Ppt chapter09
 

More from Knoldus Inc.

Robusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxRobusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxKnoldus Inc.
 
Optimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxOptimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxKnoldus Inc.
 
Azure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxAzure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxKnoldus Inc.
 
CQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxCQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxKnoldus Inc.
 
ETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationKnoldus Inc.
 
Scripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationScripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationKnoldus Inc.
 
Getting started with dotnet core Web APIs
Getting started with dotnet core Web APIsGetting started with dotnet core Web APIs
Getting started with dotnet core Web APIsKnoldus Inc.
 
Introduction To Rust part II Presentation
Introduction To Rust part II PresentationIntroduction To Rust part II Presentation
Introduction To Rust part II PresentationKnoldus Inc.
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Configuring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAConfiguring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAKnoldus Inc.
 
Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Knoldus Inc.
 
Azure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxAzure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxKnoldus Inc.
 
The Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinThe Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinKnoldus Inc.
 
Data Engineering with Databricks Presentation
Data Engineering with Databricks PresentationData Engineering with Databricks Presentation
Data Engineering with Databricks PresentationKnoldus Inc.
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Knoldus Inc.
 
NoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxNoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxKnoldus Inc.
 
Mastering Distributed Performance Testing
Mastering Distributed Performance TestingMastering Distributed Performance Testing
Mastering Distributed Performance TestingKnoldus Inc.
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxKnoldus Inc.
 
Introduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationIntroduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationKnoldus Inc.
 
CQRS with dot net services presentation.
CQRS with dot net services presentation.CQRS with dot net services presentation.
CQRS with dot net services presentation.Knoldus Inc.
 

More from Knoldus Inc. (20)

Robusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptxRobusta -Tool Presentation (DevOps).pptx
Robusta -Tool Presentation (DevOps).pptx
 
Optimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptxOptimizing Kubernetes using GOLDILOCKS.pptx
Optimizing Kubernetes using GOLDILOCKS.pptx
 
Azure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptxAzure Function App Exception Handling.pptx
Azure Function App Exception Handling.pptx
 
CQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptxCQRS Design Pattern Presentation (Java).pptx
CQRS Design Pattern Presentation (Java).pptx
 
ETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake PresentationETL Observability: Azure to Snowflake Presentation
ETL Observability: Azure to Snowflake Presentation
 
Scripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics PresentationScripting with K6 - Beyond the Basics Presentation
Scripting with K6 - Beyond the Basics Presentation
 
Getting started with dotnet core Web APIs
Getting started with dotnet core Web APIsGetting started with dotnet core Web APIs
Getting started with dotnet core Web APIs
 
Introduction To Rust part II Presentation
Introduction To Rust part II PresentationIntroduction To Rust part II Presentation
Introduction To Rust part II Presentation
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Configuring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRAConfiguring Workflows & Validators in JIRA
Configuring Workflows & Validators in JIRA
 
Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)Advanced Python (with dependency injection and hydra configuration packages)
Advanced Python (with dependency injection and hydra configuration packages)
 
Azure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptxAzure Databricks (For Data Analytics).pptx
Azure Databricks (For Data Analytics).pptx
 
The Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and KotlinThe Power of Dependency Injection with Dagger 2 and Kotlin
The Power of Dependency Injection with Dagger 2 and Kotlin
 
Data Engineering with Databricks Presentation
Data Engineering with Databricks PresentationData Engineering with Databricks Presentation
Data Engineering with Databricks Presentation
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)
 
NoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptxNoOps - (Automate Ops) Presentation.pptx
NoOps - (Automate Ops) Presentation.pptx
 
Mastering Distributed Performance Testing
Mastering Distributed Performance TestingMastering Distributed Performance Testing
Mastering Distributed Performance Testing
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptx
 
Introduction to Ansible Tower Presentation
Introduction to Ansible Tower PresentationIntroduction to Ansible Tower Presentation
Introduction to Ansible Tower Presentation
 
CQRS with dot net services presentation.
CQRS with dot net services presentation.CQRS with dot net services presentation.
CQRS with dot net services presentation.
 

Recently uploaded

Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 

Recently uploaded (20)

Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 

Scala Parallel Collections for High Performance

  • 1. Parallel Collections with Scala Jul 6' 2012 > Vikas Hazrati > vikas@knoldus.com > @vhazrati
  • 2. Motivation Multiple-cores Popular Parallel Programming remains a formidable challenge. Implicit Parallelism
  • 3. scala> val list = (1 to 10000).toList scala> list.map(_ + 42) scala> list.par.map(_ + 42)
  • 4. scala> List(1,2,3,4,5) res0: List[Int] = List(1, 2, 3, 4, 5) scala> res0.map(println(_)) 1 2 3 4 5 res1: List[Unit] = List((), (), (), (), ()) scala> res0.par.map(println(_)) 3 1 4 2 5 res2: scala.collection.parallel.immutable.ParSeq[Unit] = ParVector((), (), (), (), ())
  • 6. Caution: Performance benefits visible only around several Thousand elements in the collection Machine Architecture Depends on JVM vendor and version Per element workload Specific collection – ParArray, ParTrieMap Specific operation – transformer(filter), accessor (foreach) Memory Management
  • 7. map, fold and filter scala> val parArray = (1 to 1000000).toArray.par scala> parArray.fold(0)(_+_) res3: Int = 1784293664 scala> val narArray = (1 to 1000000).toArray scala> narArray.fold(0)(_+_) I did not notice res5: Int = 1784293664 Difference on my laptop scala> parArray.fold(0)(_+_) res6: Int = 1784293664
  • 8. creating a parallel collection import scala.collection.parallel.immutable.ParVector With a new val pv = new ParVector[Int] val pv = Vector(1,2,3,4,5,6,7,8,9).par Taking a sequential collection And converting it Parallel collections can be converted back to sequential collections with seq
  • 9. Collections are inherently sequential They are converted to || by copying elements into similar parallel collection An example is List– it’s converted into a standard immutable parallel sequence, which is a ParVector. Overhead! Array, Vector, HashMap do not have this overhead
  • 10. how does it work? Map reduce ? by recursively “splitting” a given collection, applying an operation on each partition of the collection in parallel, and re-“combining” all of the results that were completed in parallel. Side effecting operations Non Associative operations
  • 11. scala> var sum =0 side effecting operation sum: Int = 0 scala> val list = (1 to 1000).toList.par scala> list.foreach(sum += _); sum res7: Int = 452474 scala> var sum =0 sum: Int = 0 scala> list.foreach(sum += _); sum res8: Int = 497761 scala> var sum =0 sum: Int = 0 scala> list.foreach(sum += _); sum res9: Int = 422508
  • 12. non-associative operations The order in which function is applied to the elements of the collection can be arbitrary scala> val list = (1 to 1000).toList.par scala> list.reduce(_-_) res01: Int = -228888 scala> list.reduce(_-_) res02: Int = -61000 scala> list.reduce(_-_) res03: Int = -331818
  • 13. associate but non-commutative scala> val strings = List("abc","def","ghi","jk","lmnop","qrs","tuv","wx","yz").par strings: scala.collection.parallel.immutable.ParSeq[java.lang.String] = ParVector(abc, def, ghi, jk, lmnop, qrs, tuv, wx, yz) scala> val alphabet = strings.reduce(_++_) alphabet: java.lang.String = abcdefghijklmnopqrstuvwxyz
  • 14. out of order? Operations may be out of order BUT Recombination of results would be in order C collection A A B C B A B C
  • 15. performance In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated.
  • 16. conversions List is converted to vector Converting parallel to sequential takes constant time
  • 17. architecture splitters combiners Split the collection into Is a Builder. Non-trivial partitions so Combines split lists together. That they can be accessed in sequence
  • 18. brickbats Absence of configuration Not all algorithms are parallel friendly unproven Now, if you want your code to not care whether it receives a parallel or sequential collection, you should prefix it with Gen: GenTraversable, GenIterable, GenSeq, etc. These can be either parallel or sequential.