1. Dynamic Batch Parallel Algorithms for Updating PageRank
(Poster abstract for IPDPS 2022 PhD Forum)
Subhajit Sahu†, Kishore Kothapalli† and Dip Sankar Banerjee‡
†International Institute of Information Technology Hyderabad, India.
‡Indian Institute of Technology Jodhpur, India.
subhajit.sahu@research., kkishore@iiit.ac.in, dipsankarb@iitj.ac.in
May 4, 2022
We present two new parallel algorithms for recomputing the PageRank values of only the vertices affected by the
insertion or deletion of a batch of edges in a dynamic graph. One algorithm, named DYNAMICLEVELWISEPR, computes
updated ranks of vertices in topological order of the affected strongly connected components (SCCs). PageRank
computation is performed on each affected level of SCCs in sequential order, from the topmost unprocessed level, until
convergence. This avoids unnecessary recomputation in SCCs that depend on the ranks of vertices in other SCCs which
have not yet converged. The other algorithm, DYNAMICMONOLITHICPR, computes updated ranks of all affected vertices
in one go, but groups the affected vertices by SCC and partitions them by in-degree to obtain better work balance on
the GPU. Both algorithms accept the previous and current snapshots of a graph as input, along with the previous ranks
of the vertices. From each changed SCC, a DFS is performed to obtain the list of affected SCCs. We group vertices by
SCC to ensure good memory locality. On the GPU, each affected SCC is processed, after partitioning, with a
thread-per-vertex and a block-per-vertex CUDA kernel. However, to reduce the number of kernel calls, we combine small
affected SCCs together until they satisfy a minimum work requirement of 10M vertices. Computation is performed on the
CSR representation of the graph.
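The levelwise scheme above can be illustrated with a small sketch. This is not the authors' CUDA implementation; it is a minimal sequential Python model, assuming the affected SCCs have already been grouped into topological levels (all names and the dict-based graph representation are illustrative).

```python
# Hedged sketch of levelwise PageRank updating: each topological level of
# affected SCCs is iterated to convergence before the next level starts,
# so a level only ever reads ranks that have already converged.
def pagerank_levelwise(out_edges, ranks, levels, damping=0.85, tol=1e-10):
    """Update `ranks` (dict: vertex -> rank) in place, level by level.

    out_edges: dict mapping each vertex to its list of out-neighbours
               (dead ends are assumed to carry a self-loop, as in the text).
    levels:    affected vertices grouped into lists, in topological order
               of their SCCs (an assumed precomputed input).
    """
    n = len(ranks)
    # Build reverse adjacency once, so each vertex pulls from in-neighbours.
    in_edges = {v: [] for v in out_edges}
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].append(u)
    for level in levels:          # topmost unprocessed level first
        while True:               # iterate this level only, until convergence
            delta = 0.0
            for v in level:
                new = (1 - damping) / n + damping * sum(
                    ranks[u] / len(out_edges[u]) for u in in_edges[v])
                delta = max(delta, abs(new - ranks[v]))
                ranks[v] = new
            if delta < tol:
                break
    return ranks
```

On a 3-cycle, for example, every vertex converges to a rank of 1/3; in a real update only the affected levels would be passed in, with unaffected vertices keeping their previous ranks.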
We conduct experimental studies of our algorithms on a set of 11 real-world graphs, with self-loops added to dead
ends in all the graphs. Their order |V | varies from 75k to 41M vertices, and their size |E| from 524k to 1.1B
edges. We experiment with batch sizes of 500 to 10000 edges. Each batch is randomly generated with an equal mix of
insertions and deletions, such that edges connecting vertices with high out-degrees have a greater chance of
selection; this mimics the behaviour of real-world dynamic graphs. A fair comparison is ensured except in cases
beyond our control. The measured time in all cases is the rank computation time.
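The batch-generation scheme described above can be sketched as follows. This is an illustrative reconstruction, not the authors' harness: the function name and the degree-plus-one weighting are assumptions, chosen only to realize "equal mix of insertions and deletions, biased toward high out-degree endpoints".

```python
# Hedged sketch of random batch generation: half insertions, half deletions,
# with insertion endpoints sampled proportionally to out-degree.
import random

def random_batch(out_edges, batch_size, rng):
    """Return (insertions, deletions), each batch_size // 2 edges long."""
    vertices = list(out_edges)
    # +1 keeps zero-out-degree vertices selectable while still favouring hubs.
    weights = [len(out_edges[v]) + 1 for v in vertices]
    existing = [(u, v) for u, nbrs in out_edges.items() for v in nbrs]
    insertions, deletions = [], []
    while len(insertions) < batch_size // 2:
        u, v = rng.choices(vertices, weights=weights, k=2)
        if v not in out_edges[u] and (u, v) not in insertions:
            insertions.append((u, v))   # new edge between (likely) hubs
    while len(deletions) < batch_size // 2:
        e = rng.choice(existing)
        if e not in deletions:
            deletions.append(e)         # drop a distinct existing edge
    return insertions, deletions
```

For a sparse graph this rejection sampling terminates quickly; a dense graph would need a different insertion strategy.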
Our results on an Intel Xeon Silver 4116 CPU and an NVIDIA Tesla V100 PCIe 16GB GPU indicate that
DYNAMICMONOLITHICPR and DYNAMICLEVELWISEPR outperform static STIC-D PageRank by 6.1× and 8.6× on the CPU,
and naive dynamic nvGraph PageRank by 9.8× and 9.3× on the GPU, respectively. In addition, we observe a mean
speedup of 4.2× and 5.8× on the CPU over a pure CPU implementation of HyPR, and a mean speedup of 1.9× and
1.8× on the GPU over a pure GPU implementation of HyPR, respectively. We also compare the performance of the
algorithms in batched mode to cumulative single-edge updates. A batch update of 5000 edges offers a speedup of
4066× and 2998× for algorithms DYNAMICMONOLITHICPR and DYNAMICLEVELWISEPR respectively on the CPU,
and a speedup of 1712× and 2324× respectively on the GPU.
We therefore conclude that DYNAMICLEVELWISEPR is a suitable approach for CPUs. On a GPU, however, smaller
levels/components could be combined and processed together to improve GPU utilization, as DYNAMICMONOLITHICPR
does.