1. Dynamic Batch Parallel Algorithms for Updating PageRank
(Poster abstract for IPDPS 2022 PhD Forum)
Subhajit Sahu†, Kishore Kothapalli† and Dip Sankar Banerjee‡
†International Institute of Information Technology Hyderabad, India.
‡Indian Institute of Technology Jodhpur, India.
subhajit.sahu@research., kkishore@iiit.ac.in, dipsankarb@iitj.ac.in
May 4, 2022
We present two new parallel algorithms for recomputing the PageRank values of only the vertices affected by the
insertion or deletion of a batch of edges in a dynamic graph. One algorithm, named DYNAMICLEVELWISEPR, computes
updated ranks of vertices in topological order of the affected strongly connected components (SCCs). PageRank
computation is performed on each affected level of SCCs in sequential order, from the topmost unprocessed level, until
convergence. This avoids unnecessary recomputation in SCCs that depend on the ranks of vertices in other SCCs which
have not yet converged. The other algorithm, DYNAMICMONOLITHICPR, computes updated ranks of all affected vertices
in one go, but groups the affected vertices by SCC and partitions them by in-degree to obtain better work balance on
the GPU. Both algorithms accept the previous and current snapshots of a graph as input, along with the previous ranks
of the vertices. From each changed SCC, a DFS is performed to obtain the list of affected SCCs. We group vertices by
SCC to ensure good memory locality. On the GPU, each affected SCC is processed, after partitioning, with a
thread-per-vertex and a block-per-vertex CUDA kernel. However, to reduce the number of kernel calls, we combine small
affected SCCs together until they satisfy a minimum work requirement of 10M vertices. Computation is performed on the
CSR representation of the graph.
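The levelwise scheme above can be illustrated with a small sketch. This is not the authors' CUDA implementation; it is a minimal sequential Python model, assuming the affected SCCs have already been grouped into topological levels (all names and the dict-based graph representation are illustrative).

```python
# Hedged sketch of levelwise PageRank updating: each topological level of
# affected SCCs is iterated to convergence before the next level starts,
# so a level only ever reads ranks that have already converged.
def pagerank_levelwise(out_edges, ranks, levels, damping=0.85, tol=1e-10):
    """Update `ranks` (dict: vertex -> rank) in place, level by level.

    out_edges: dict mapping each vertex to its list of out-neighbours
               (dead ends are assumed to carry a self-loop, as in the text).
    levels:    affected vertices grouped into lists, in topological order
               of their SCCs (an assumed precomputed input).
    """
    n = len(ranks)
    # Build reverse adjacency once, so each vertex pulls from in-neighbours.
    in_edges = {v: [] for v in out_edges}
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].append(u)
    for level in levels:          # topmost unprocessed level first
        while True:               # iterate this level only, until convergence
            delta = 0.0
            for v in level:
                new = (1 - damping) / n + damping * sum(
                    ranks[u] / len(out_edges[u]) for u in in_edges[v])
                delta = max(delta, abs(new - ranks[v]))
                ranks[v] = new
            if delta < tol:
                break
    return ranks
```

On a 3-cycle, for example, every vertex converges to a rank of 1/3; in a real update only the affected levels would be passed in, with unaffected vertices keeping their previous ranks.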
We conduct experimental studies of our algorithms on a set of 11 real-world graphs, with self-loops added to dead
ends in all the graphs. Their order |V | varies from 75k to 41M vertices, and their size |E| from 524k to 1.1B
edges. We experiment with batch sizes of 500 to 10000 edges. Each batch is randomly generated with an equal mix of
insertions and deletions, such that edges connecting vertices with high out-degrees have a greater chance of
selection; this mimics the behaviour of real-world dynamic graphs. A fair comparison is ensured except in cases
beyond our control. The measured time in all cases is the rank computation time.
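The batch-generation scheme described above can be sketched as follows. This is an illustrative reconstruction, not the authors' harness: the function name and the degree-plus-one weighting are assumptions, chosen only to realize "equal mix of insertions and deletions, biased toward high out-degree endpoints".

```python
# Hedged sketch of random batch generation: half insertions, half deletions,
# with insertion endpoints sampled proportionally to out-degree.
import random

def random_batch(out_edges, batch_size, rng):
    """Return (insertions, deletions), each batch_size // 2 edges long."""
    vertices = list(out_edges)
    # +1 keeps zero-out-degree vertices selectable while still favouring hubs.
    weights = [len(out_edges[v]) + 1 for v in vertices]
    existing = [(u, v) for u, nbrs in out_edges.items() for v in nbrs]
    insertions, deletions = [], []
    while len(insertions) < batch_size // 2:
        u, v = rng.choices(vertices, weights=weights, k=2)
        if v not in out_edges[u] and (u, v) not in insertions:
            insertions.append((u, v))   # new edge between (likely) hubs
    while len(deletions) < batch_size // 2:
        e = rng.choice(existing)
        if e not in deletions:
            deletions.append(e)         # drop a distinct existing edge
    return insertions, deletions
```

For a sparse graph this rejection sampling terminates quickly; a dense graph would need a different insertion strategy.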
Our results on an Intel Xeon Silver 4116 CPU and an NVIDIA Tesla V100 PCIe 16GB GPU indicate that
DYNAMICMONOLITHICPR and DYNAMICLEVELWISEPR outperform static STIC-D PageRank by 6.1× and 8.6× on the CPU,
and naive dynamic nvGraph PageRank by 9.8× and 9.3× on the GPU, respectively. In addition, we observe a mean
speedup of 4.2× and 5.8× on the CPU over a pure CPU implementation of HyPR, and a mean speedup of 1.9× and
1.8× on the GPU over a pure GPU implementation of HyPR, respectively. We also compare the performance of the
algorithms in batched mode to cumulative single-edge updates. A batch update of 5000 edges offers a speedup of
4066× and 2998× for algorithms DYNAMICMONOLITHICPR and DYNAMICLEVELWISEPR respectively on the CPU,
and a speedup of 1712× and 2324× respectively on the GPU.
We therefore conclude that DYNAMICLEVELWISEPR is a suitable approach for CPUs. On a GPU, however, smaller
levels/components could be combined and processed together to improve GPU utilization, as DYNAMICMONOLITHICPR
does.