IEEE BigData 2019, December 4-12
[KW '02] K. Wang, L. Tang, J. Han, and J. Liu, "Top down FP-growth for association rule mining," in Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD '02).
[F.Z. '16] F. Zhang, P. Di, H. Zhou, X. Liao, and J. Xue, "RegTT: Accelerating tree traversals on GPUs by exploiting regularities," in ICPP 2016.
[M.G. '13] M. Goldfarb, Y. Jo, and M. Kulkarni, "General transformations for GPU execution of tree traversals," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC '13).
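[Figure: array-based tree layout, indexed from 0. Each entry holds an item and the tuple (parent item, index of parent node, support). The layout is annotated "coalesced access".]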
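The layout named on this slide can be sketched as a structure of arrays. This is a minimal illustration, not the authors' implementation: the class name `FlatTree`, the field names, the `-1` root marker, and the sample values are all assumptions. Storing each field in its own contiguous array is one common way to let neighbouring GPU threads read neighbouring addresses, which is what coalesced access requires.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlatTree:
    """Tree stored as parallel arrays; entry i describes node i."""
    item: List[int]          # item held by node i
    parent_item: List[int]   # item held by the parent of node i (-1 for a root)
    parent_index: List[int]  # array index of the parent of node i (-1 for a root)
    support: List[int]       # support count of node i


def path_to_root(tree: FlatTree, node: int) -> List[int]:
    """Follow parent indices from `node` up to the root, collecting items."""
    path = []
    while node != -1:
        path.append(tree.item[node])
        node = tree.parent_index[node]
    return path


# Hypothetical three-node chain: 0 -> 1 -> 2 (sample data, not from the slides).
tree = FlatTree(
    item=[0, 1, 2],
    parent_item=[-1, 0, 1],
    parent_index=[-1, 0, 1],
    support=[4, 3, 2],
)
path = path_to_root(tree, 2)
```

Because parents are referenced by index rather than by pointer, the whole tree can be copied to device memory as a few flat buffers.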
[Figure: two charts, each with panels (a) and (b); annotated value: 53x]
[XH '10] X. Huang, C. I. Rodrigues, S. Jones, I. Buck, and W. Hwu, "XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines," 2010 10th IEEE International Conference on Computer and Information Technology.
[MS '12] M. Steinberger, M. Kenzel, B. Kainz, and D. Schmalstieg, "ScatterAlloc: Massively parallel dynamic memory allocation for the GPU," 2012 Innovative Parallel Computing (InPar).
[Figure: header tables across mining iterations]
Mining Iteration 0, Mining Iteration 1, Mining Iteration 2: each has an input table set and an output table set.
Input table set: Header table 0, Header table 1, ..., Header table k
Each header table: Info of item 0, Info of item 1, ..., Info of item k-1
Info of an item: node, support, etc.
Header table XY: the header table of pattern XY
Thread blocks (out of order): Header table 1k, Header table 2k, ..., Header table (k-1)k, Header table 13, Header table 59
Calculating the write offsets
Remap: 2 0 1
Sizes before remap: Size 0, Size 1, Size 2
Sizes after remap: Size 1, Size 2, Size 0
Exclusive prefix-sum (write offsets): 0, Size 1, Size 1+2
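The offset calculation on this slide can be sketched as follows. The function name `write_offsets` and the sample sizes are assumptions, as is the reading of the remap row: here `remap[i]` is taken to be the output slot assigned to table `i`, which is consistent with sizes 0, 1, 2 being reordered to 1, 2, 0.

```python
from itertools import accumulate


def write_offsets(sizes, remap):
    """Return the write offset of each table.

    sizes[i] is the size of table i; remap[i] is the output slot of table i
    (an assumed interpretation of the slide's "Remap" row).
    """
    # Place each table's size into its output slot.
    ordered = [0] * len(sizes)
    for table, slot in enumerate(remap):
        ordered[slot] = sizes[table]
    # Exclusive prefix sum: slot s starts where slots 0..s-1 end.
    slot_offsets = [0] + list(accumulate(ordered))[:-1]
    # Map the slot offsets back to the original table numbering.
    return [slot_offsets[slot] for slot in remap]


sizes = [5, 3, 4]   # hypothetical Size 0, Size 1, Size 2
remap = [2, 0, 1]   # remap row from the slide
offsets = write_offsets(sizes, remap)
```

With these sample sizes the slots hold 3, 4, 5, so the slot offsets are 0, 3, 7, matching the slide's pattern of 0, Size 1, Size 1+2.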
Using the write offsets
Remap: 2 0 1
Write offsets: 0, Size 1, Size 1+2
Tables: three tables, one per write offset
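Once every table has a write offset, the tables can be packed into one contiguous buffer without any further coordination. The sketch below uses hypothetical table contents and a hypothetical helper name `pack_tables`; it runs sequentially, whereas on a GPU each table would presumably be written by its own thread block. The key property is that the target ranges are disjoint, so the order in which tables are written does not matter.

```python
def pack_tables(tables, offsets):
    """Copy each table into a shared buffer starting at its write offset."""
    total = sum(len(table) for table in tables)
    buffer = [None] * total
    for table, start in zip(tables, offsets):
        # Ranges are disjoint, so these writes could happen in any order.
        buffer[start:start + len(table)] = table
    return buffer


# Hypothetical tables with sizes 2, 1, 3 and remap 2 0 1:
# table 1 goes first, then table 2, then table 0.
tables = [["a0", "a1"], ["b0"], ["c0", "c1", "c2"]]
offsets = [4, 0, 1]
packed = pack_tables(tables, offsets)
```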
[Figure: five items (I) at Idx:0 through Idx:4, shown with the values 0 2 3 1 4, and Thread 0, Thread 1, Thread 2, Thread 3. Thread block size: 4.]
[CB '05] C. Borgelt, "An implementation of the FP-growth algorithm," OSDM '05 (workshop).
[FW '14] F. Wang and B. Yuan, "Parallel frequent pattern mining without candidate generation on GPUs," 2014 IEEE ICDMW.
[HJ '17] H. Jiang and H. Meng, "A parallel FP-growth algorithm based on GPU," 2017 IEEE ICEBE.
[WF '09] W. Fang, M. Lu, X. Xiao, B. He, and Q. Luo, "Frequent itemset mining on graphics processors," DaMoN '09.
[Chon '18] K.-W. Chon, S.-H. Hwang, and M.-S. Kim, "GMiner: A fast GPU-based frequent itemset mining method for large-scale data," Information Sciences, vol. 439-440, pp. 19-38, 2018.
Not open source, and the normalized results are very poor.
Dataset   #items   #trans   Size    Threshold (%)  vs. CPU FP-growth  vs. best GPU Apriori
chess     75       3196     335KB   35~60          1.2x~0.7x          1.8x~3.3x
retail    16470    88163    4MB     0.07~0.1       2x~1.8x            9.8x~8.6x
accident  468      340184   34MB    20~40          8x~6x              16x~42x
kosarak   41270    990002   30MB    0.3~0.6        6x~7x              12x~40x
Webdoc    5267656  1692082  1.48GB  20~25          12x~7x             12x~86x

Slide annotation: fewer patterns.
Performance criterion: execution time.
Operations that can be processed offline are excluded.
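Generated frequent patterns: 04, 14, 24, 34, 4
[Figure: tree with nodes labeled item:support: 0:5, 2:2, 3:2, 4:2, 1:3, 2:1, 3:1, 4:1, 4:2]
Header table of pattern 24
Length of the index array: 2 (depends on the hash function)
Index array: Idx:0, Idx:1 (the position is decided by the hash value)
Table values as shown: 0 1 / 3 1 / 1 1, with rows labeled # node and # support
Other values shown: 0 3 1 2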
Assume the support threshold is 3
A new frequent pattern 024:3 will be generated
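The way 024:3 arises can be checked with a small sketch. The tree shape below is an assumption: the slide text only preserves the node labels, and the parent links were chosen so that the labels are consistent with the generated patterns 04, 14, 24, 34, and 4 at threshold 3. The function name `conditional_header` is also hypothetical, and a plain dictionary stands in for the hash-indexed header table described on the slide.

```python
def conditional_header(items, parents, supports, pattern):
    """Sum supports of the items that sit above `pattern` in the tree.

    `pattern` lists items from the top of the tree to the bottom, e.g. (2, 4).
    """
    *upper, last = pattern
    header = {}
    for node, item in enumerate(items):
        if item != last:
            continue
        pending = list(upper)      # pattern items still to be met on the way up
        cur = parents[node]
        while cur != -1:
            if pending:
                if items[cur] == pending[-1]:
                    pending.pop()
            else:
                # Every pattern item was found below; count this ancestor.
                header[items[cur]] = header.get(items[cur], 0) + supports[node]
            cur = parents[cur]
    return header


# Assumed tree: 0:5 -> 2:2 -> 3:2 -> 4:2, 0:5 -> 1:3 -> 2:1 -> 3:1 -> 4:1, 1:3 -> 4:2
items = [0, 2, 3, 4, 1, 2, 3, 4, 4]
parents = [-1, 0, 1, 2, 0, 4, 5, 6, 4]
supports = [5, 2, 2, 2, 3, 1, 1, 1, 2]

header = conditional_header(items, parents, supports, (2, 4))
threshold = 3
frequent = [f"{item}24:{count}"
            for item, count in sorted(header.items()) if count >= threshold]
```

Under this assumed tree, item 0 accumulates support 2 + 1 = 3 above pattern 24 and passes the threshold, while item 1 accumulates only 1 and is dropped.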

Fast Frequent Pattern Mining without Candidate Generations on GPU by Low Latency Memory Allocation (IEEE BigData 2019)
