This is my report on adjusting PageRank parameters and comparing results (version 1). It was written while doing research work under Prof. Dip Banerjee and Prof. Kishore Kothapalli.
Adjusting PageRank parameters and comparing results: REPORT
1. Adjusting PageRank parameters and comparing results
Unaltered web graphs are reducible, and thus the rate of convergence of the power-iteration method is the rate at which αᵏ → 0, where α is the damping factor and k is the iteration count. An estimate of the number of iterations needed to converge to a tolerance τ is log_α τ. For τ = 10⁻⁶ and α = 0.85, it can take roughly 85 iterations to converge. For α = 0.95 and α = 0.75, with the same tolerance τ = 10⁻⁶, it takes roughly 269 and 48 iterations respectively. For τ = 10⁻⁹ and τ = 10⁻³, with the same damping factor α = 0.85, it takes roughly 128 and 43 iterations respectively. Thus, adjusting the damping factor or the tolerance parameter of the PageRank algorithm can have a significant effect on the convergence rate, both in terms of time and iterations. However, especially with the damping factor α, adjusting the parameter value is a delicate balancing act. For smaller values of α, convergence is fast, but the ranks reflect the true link structure of the graph less faithfully, and slightly different values of α can produce very different rank vectors. Moreover, as α → 1, convergence slows down drastically, and sensitivity issues begin to surface [langville04].
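To make this estimate concrete, a short C++ sketch (illustrative only, not part of the experiment code; the helper name estimatedIterations is hypothetical) that computes log_α τ for the parameter combinations above might look as follows.

#include <cmath>
#include <cstdio>

// Estimated iterations k for the power method to reach tolerance tau,
// assuming the error shrinks by a factor of alpha per iteration (alpha^k = tau).
double estimatedIterations(double alpha, double tau) {
  return std::log(tau) / std::log(alpha);
}

int main() {
  std::printf("alpha=0.85, tau=1e-6: %.0f iterations\n", estimatedIterations(0.85, 1e-6));  // roughly 85
  std::printf("alpha=0.95, tau=1e-6: %.0f iterations\n", estimatedIterations(0.95, 1e-6));  // roughly 269
  std::printf("alpha=0.75, tau=1e-6: %.0f iterations\n", estimatedIterations(0.75, 1e-6));  // roughly 48
  std::printf("alpha=0.85, tau=1e-9: %.0f iterations\n", estimatedIterations(0.85, 1e-9));  // roughly 128
  std::printf("alpha=0.85, tau=1e-3: %.0f iterations\n", estimatedIterations(0.85, 1e-3));  // roughly 43
  return 0;
}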
For the first experiment, the damping factor α (which is usually 0.85) is varied from 0.50 to 1.00 in steps of 0.05, in order to compare the performance variation with each damping factor. The calculated error is the L1 norm with respect to default PageRank (α = 0.85). The PageRank algorithm used here is the standard power-iteration (pull) based PageRank. The rank of a vertex in an iteration is calculated as c₀ + αΣrₙ/dₙ, where c₀ is the common teleport contribution, α is the damping factor, rₙ is the previous rank of a vertex with an edge into the current vertex, and dₙ is the out-degree of that in-neighbour. The common teleport contribution c₀, calculated as (1-α)/N + αΣrₙ/N (where N is the total number of vertices in the graph, and the second sum runs over dangling vertices only), includes the contribution due to a teleport from any vertex in the graph because of the damping factor, (1-α)/N, and the contribution due to teleports from dangling vertices (vertices with no outgoing edges), αΣrₙ/N. This is because a random surfer jumps to a random page upon visiting a page with no links, in order to avoid the rank-sink effect.
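A minimal sketch of one such pull-based iteration is shown below. It assumes the adjacency is stored as plain in-edge lists rather than CSR, and the function and variable names (pageRankStep, xs, vdeg) are hypothetical, not the experiment's actual code.

#include <vector>
using std::vector;

// One power-iteration (pull) step of PageRank, as described above.
// xs[v] lists the vertices with an edge into v, vdeg[u] is the out-degree of u,
// and r holds the previous ranks; the new ranks are returned.
vector<double> pageRankStep(const vector<vector<int>>& xs, const vector<int>& vdeg,
                            const vector<double>& r, double alpha) {
  int N = xs.size();
  // Common teleport contribution c0 = (1-alpha)/N + alpha*(sum of ranks of dangling vertices)/N.
  double danglingSum = 0;
  for (int u=0; u<N; u++)
    if (vdeg[u] == 0) danglingSum += r[u];
  double c0 = (1-alpha)/N + alpha*danglingSum/N;
  // New rank of each vertex v: c0 + alpha * sum of r[u]/vdeg[u] over in-neighbours u.
  vector<double> a(N);
  for (int v=0; v<N; v++) {
    double sum = 0;
    for (int u : xs[v]) sum += r[u] / vdeg[u];
    a[v] = c0 + alpha*sum;
  }
  return a;
}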
All seventeen graphs used in this experiment are stored in the MatrixMarket (.mtx) file
format, and obtained from the SuiteSparse Matrix Collection. These include: web-Stanford,
web-BerkStan, web-Google, web-NotreDame, soc-Slashdot0811, soc-Slashdot0902,
soc-Epinions1, coAuthorsDBLP, coAuthorsCiteseer, soc-LiveJournal1, coPapersCiteseer,
coPapersDBLP, indochina-2004, italy_osm, great-britain_osm, germany_osm, asia_osm.
The experiment is implemented in C++, and compiled using GCC 9 with optimization level 3
(-O3). The system used is a Dell PowerEdge R740 Rack server with two Intel Xeon Silver
4116 CPUs @ 2.10GHz, 128GB DIMM DDR4 Synchronous Registered (Buffered) 2666 MHz
(8x16GB) DRAM, and running CentOS Linux release 7.9.2009 (Core). The number of iterations taken by each test case is measured, with a maximum of 500 iterations allowed. Statistics of each test case are printed to standard output (stdout) and redirected to a log file, which is then processed with a script to generate a CSV file, with each row representing the details of a single test case. This CSV file is imported into Google Sheets, and the necessary tables are set up with the help of the FILTER function to create the charts.
When comparing the relative performance of different approaches with multiple test graphs,
there are two ways to obtain an average comparison: relative-average, and average-relative.
A relative-average comparison first finds the relative performance (ratio) of each approach with
respect to a baseline approach (one of them), and then averages them. Consider, for
example, three approaches a, b, and c, with 3 test runs for each of the three approaches,
labeled a1, a2, a3, b1, b2, b3, c1, c2, c3. The relative performance of each approach with
respect to c would be a1/c1, b1/c1, c1/c1, a2/c2, b2/c2, and so on. The relative-average
comparison is now the average of these ratios, i.e., (a1/c1+a2/c2+a3/c3)/3 for a,
(b1/c1+b2/c2+b3/c3)/3 for b, and 1 for c. In contrast, an average-relative comparison first finds the average time/iterations taken by each approach, and then finds the relative performance with respect to a baseline approach. Again, considering three approaches with 3 test runs as
above, the average values of each approach would be (a1+a2+a3)/3 for a, (b1+b2+b3)/3 for b,
and (c1+c2+c3)/3 for c. The average-relative comparison of each approach with respect to c
would then be (a1+a2+a3)/(c1+c2+c3) for a, (b1+b2+b3)/(c1+c2+c3) for b, and 1 for c.
Semantically, a relative-average comparison gives equal importance to the relative
performance of each test run (graph), while an average-relative comparison gives equal
importance to magnitude (time/iterations) of all test runs (or simply, it gives higher
importance to test runs with larger graphs). For these experiments, both comparisons are
made, but only one of them is presented here if they are quite similar.
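The two averaging schemes can be sketched as follows, where a and c hold the per-graph time (or iteration) measurements of an approach and of the baseline respectively. The helper names are hypothetical; the actual experiments perform this step in Google Sheets.

#include <vector>
using std::vector;

// Relative-average: average the per-graph ratios a[i]/c[i], giving each graph equal weight.
double relativeAverage(const vector<double>& a, const vector<double>& c) {
  double s = 0;
  for (size_t i=0; i<a.size(); i++) s += a[i]/c[i];
  return s / a.size();
}

// Average-relative: take the ratio of the sums (equivalently, of the averages),
// which gives larger graphs a proportionally larger weight.
double averageRelative(const vector<double>& a, const vector<double>& c) {
  double sa = 0, sc = 0;
  for (size_t i=0; i<a.size(); i++) { sa += a[i]; sc += c[i]; }
  return sa / sc;
}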
Figure 1: Average iterations for PageRank computation with the damping factor α adjusted from 0.50 to 1.00 in steps of 0.05. The charts for relative-average and average-relative iterations (with respect to damping factor α = 0.85) follow the same curve, with only slightly differing values (the relative-average and average-relative iterations are quite similar).
Results (figure 1) indicate that increasing the damping factor α beyond 0.85 significantly
increases convergence time, and lowering it below 0.85 decreases convergence time. On
average, using a damping factor α = 0.95 increases both convergence time and iterations by
192%, and using a damping factor α = 0.75 decreases both by 41% (compared to damping
factor α = 0.85). Note that a higher damping factor implies that a random surfer follows links
with higher probability (and jumps to a random page with lower probability).
Observing that adjusting the damping factor has a significant effect, another experiment was
performed. The idea behind this experiment was to adjust the damping factor α in steps,
to see if it might help reduce PageRank computation time. The PageRank computation first
starts with a small α, changes it when ranks have converged, until the final desired value of
α. For example, the computation starts initially with α = 0.5, lets ranks converge quickly, and
then switches to α = 0.85 and continues PageRank computation until it converges. This
single-step change is attempted with the initial (fast converge) damping factor α from 0.1 to
0.84. Similar to this, two-step, three-step, and four-step changes are also attempted. With
a two-step approach, a midpoint between the damping_start value and 0.85 is selected as
well for the second set of iterations. Similarly, three-step and four-step approaches use two
and three midpoints respectively.
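A sketch of this step-wise adjustment is given below. The pageRank routine is only assumed (declaration, not a definition), and the function names and the even spacing of the intermediate damping factors are illustrative rather than the experiment's actual code.

#include <vector>
using std::vector;

// Assumed routine (illustrative signature): runs PageRank with damping factor
// alpha until the ranks converge to tolerance tau, starting from the given
// initial ranks, and returns the number of iterations taken.
int pageRank(vector<double>& ranks, double alpha, double tau);

// Step-wise damping adjustment: a k-step approach runs k+1 phases, starting at
// dampingStart, passing through k-1 intermediate values, and ending at the final
// damping factor 0.85, reusing the converged ranks between phases.
int pageRankStepwise(vector<double>& ranks, double dampingStart, int steps, double tau) {
  if (steps == 0) return pageRank(ranks, 0.85, tau);  // 0-step: fixed damping factor
  int total = 0;
  for (int i=0; i<=steps; i++) {
    double alpha = dampingStart + (0.85 - dampingStart) * i / steps;
    total += pageRank(ranks, alpha, tau);
  }
  return total;  // total iterations across all phases
}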
A small sample graph is used in this experiment, which is stored in the MatrixMarket (.mtx)
file format. The experiment is implemented in Node.js, and executed on a personal laptop.
Only the iteration count of each test case is measured. A tolerance of τ = 10⁻⁵ is used for all test cases. Statistics of each test case are printed to standard output (stdout), and redirected
to a log file, which is then processed with a script to generate a CSV file, with each row
representing the details of a single test case. This CSV file is imported into Google Sheets,
and necessary tables are set up with the help of the FILTER function to create the charts.
Figure 2: Iterations required for PageRank computation, when damping factor α is adjusted in 1-4
steps, starting with damping_start. 0-step is the fixed damping factor PageRank, with α = 0.85.
From the results (figure 2), it is clear that modifying the damping factor α in steps is not a
good idea. The standard fixed damping factor PageRank, with α = 0.85, converges in 35
iterations. Using a single step approach increases the number of iterations required, which
further increases as the initial damping factor damping_start is increased. Switching to a
multi-step approach also increases the number of iterations needed for convergence. A
possible explanation for this effect is that the ranks for different values of the damping factor
α are significantly different, and switching to a different damping factor α after each step
mostly leads to recomputation.
Similar to the damping factor α, adjusting the value of the tolerance τ can have a significant effect as well. Apart from the value of the tolerance τ, it is observed that different implementations make use of different error functions for the convergence check. Although the L1 norm is commonly used for the convergence check, it appears nvGraph uses the L2 norm instead [nvgraph]. Another person on Stack Overflow seems to suggest the use of a per-vertex tolerance comparison, which is essentially the L∞ norm. The L1 norm ||E||₁ between two (rank) vectors r and s is calculated as ||E||₁ = Σ|rₙ - sₙ|, i.e., the sum of absolute errors. The L2 norm ||E||₂ is calculated as ||E||₂ = √(Σ|rₙ - sₙ|²), i.e., the square root of the sum of squared errors (the Euclidean distance between the two vectors). The L∞ norm ||E||∞ is calculated as ||E||∞ = max(|rₙ - sₙ|), i.e., the maximum of the absolute errors.
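In code, the three error measures might be computed as in the sketch below (hypothetical helper names; it assumes both rank vectors have the same length).

#include <cmath>
#include <algorithm>
#include <vector>
using std::vector;

// L1 norm: sum of absolute differences between the rank vectors r and s.
double errorL1(const vector<double>& r, const vector<double>& s) {
  double e = 0;
  for (size_t i=0; i<r.size(); i++) e += std::fabs(r[i] - s[i]);
  return e;
}

// L2 norm: square root of the sum of squared differences (Euclidean distance).
double errorL2(const vector<double>& r, const vector<double>& s) {
  double e = 0;
  for (size_t i=0; i<r.size(); i++) e += (r[i] - s[i]) * (r[i] - s[i]);
  return std::sqrt(e);
}

// L-infinity norm: maximum absolute difference (per-vertex tolerance comparison).
double errorLinf(const vector<double>& r, const vector<double>& s) {
  double e = 0;
  for (size_t i=0; i<r.size(); i++) e = std::max(e, std::fabs(r[i] - s[i]));
  return e;
}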
This experiment compares the performance of PageRank computation with the L1, L2, and L∞ norms as the convergence check, for various tolerance τ values ranging from 10⁻⁰ down to 10⁻¹⁰ (10⁻⁰, 5×10⁻¹, 10⁻¹, 5×10⁻², …). The input graphs, the system used, and the rest of the experimental process are similar to those of the first experiment.
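The convergence loop common to these runs can be sketched as follows. This is a hypothetical outline only: step() performs one PageRank iteration (as in the earlier sketch) and error() is one of the norms defined above.

#include <functional>
#include <vector>
using std::vector;

// Iterate until the chosen error norm between successive rank vectors drops
// below tau, or the cap of 500 iterations is reached.
int pageRankLoop(vector<double>& r,
                 const std::function<vector<double>(const vector<double>&)>& step,
                 const std::function<double(const vector<double>&, const vector<double>&)>& error,
                 double tau, int maxIterations = 500) {
  int i = 0;
  while (i < maxIterations) {
    vector<double> a = step(r);   // one power-iteration (pull) step
    double e = error(a, r);       // L1, L2, or L-infinity error between iterations
    r = a; i++;
    if (e < tau) break;
  }
  return i;  // hitting 500 signals a possible sensitivity issue with the chosen norm/tolerance
}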
Tolerance    L1 norm    L2 norm    L∞ norm
1.00E-05     49         65         27
5.00E-06     53         65         31
1.00E-06     63         500        41
5.00E-07     67         500        45
1.00E-07     77         500        55
5.00E-08     84         500        59
1.00E-08     500        500        70
5.00E-09     500        500        73
1.00E-09     500        500        500
5.00E-10     500        500        500
1.00E-10     500        500        500
Table 1: Iterations taken for PageRank computation of the web-Stanford graph, with L1, L2, and L∞ norms used as the convergence check. At tolerance τ = 10⁻⁶, the L2 norm suffers from sensitivity issues, followed by the L1 and L∞ norms at 10⁻⁸ and 10⁻⁹ respectively. Only relevant tolerances are shown here.
Figure 3: Iterations taken for PageRank computation of the asia_osm graph, with L1, L2, and L∞ norms used as the convergence check. Until tolerance τ = 10⁻⁷, the L∞ norm converges in just one iteration.
Figure 4: Average iterations taken for PageRank computation with L1, L2, and L∞ norms as the convergence check, and tolerance τ adjusted from 10⁻⁰ to 10⁻¹⁰ (10⁻⁰, 5×10⁻¹, 10⁻¹, 5×10⁻², …). The L∞ norm convergence check seems to be the fastest, followed by the L1 norm (on average).
Figure 5: Average-relative iterations taken for PageRank computation with L1, L2, and L∞ norms as the convergence check, and tolerance τ adjusted from 10⁻⁰ to 10⁻¹⁰. The L∞ norm convergence check seems to be the fastest; however, it is difficult to tell whether the L1 or the L2 norm comes in second place (on average).
Figure 6: Relative-average iterations taken for PageRank computation with L1, L2, and L∞ norms as the convergence check, and tolerance τ adjusted from 10⁻⁰ to 10⁻¹⁰. The L∞ norm convergence check seems to be the fastest, followed by the L2 norm (on average).
For various graphs, it is observed that PageRank computation with the L1, L2, or L∞ norm as the convergence check suffers from sensitivity issues beyond certain (smaller) tolerance τ values. As the tolerance τ is decreased from 10⁻⁰ to 10⁻¹⁰, the L2 norm is usually (except on road networks) the first to suffer from this issue, followed by the L1 norm (or L2), and eventually the L∞ norm (if ever). This sensitivity issue was recognized by the fact that a given approach abruptly takes 500 iterations (the maximum allowed) at the next lower tolerance τ value. This is shown in table 1.
It is also observed that PageRank computation with the L∞ norm as the convergence check completes in just one iteration (even for tolerance τ ≥ 10⁻⁶) on large graphs (road networks). This is because the error is calculated as ||E||∞ = max(|rₙ - sₙ|), and depending upon the order (number of vertices) N of the graph, 1/N can be less than the tolerance τ required to converge. For instance, the road networks used here have millions to tens of millions of vertices, so the ranks themselves, and hence the per-vertex rank changes, are only on the order of 1/N ≈ 10⁻⁷, which already falls below such tolerances after the first iteration.
Based on average-relative comparison, the relative iterations between PageRank
computation with L1, L2, and L∞ norm as convergence check is 4.73 : 4.08 : 1.00. Hence L2
norm is on average 16% faster than L1 norm, and L∞ norm is 308% faster (~4x) than L2
norm. The variation of average-relative iterations for various tolerance τ values is shown in
figure 5. A similar effect is also seen in figure 4, where average iterations for various
tolerance τ values is shown. On the other hand, based on relative-average comparison, the
relative iterations between PageRank computation with L1, L2, and L∞ norm as
convergence check is 10.42 : 6.18 : 1. Hence, L2 norm is on average 69% faster than L1
norm, and L∞ norm is 518% faster (~6x) than L2 norm. The variation of relative-average
iterations for various tolerance τ values is shown in figure 6. This shows that while L1 norm
is on average slower than L2 norm, the difference between the two diminishes for large
graphs (average-relative comparison gives higher importance to results from larger graphs,
unlike relative-average). It should also be noted that the L2 norm is not always faster than the L1 norm (usually at smaller tolerance τ values), as can be seen in table 1.
Parameter values can have a significant effect on performance, as seen in these
experiments. Different convergence-check functions converge at different rates, and which of them converges faster depends upon the tolerance τ value. The iteration count needs to be checked in order to ensure that no approach is suffering from sensitivity issues, or is converging in a single iteration. Finally, the relative performance comparison method
affects which results get more importance, and which do not, in the final average. Taking
note of each of these points, when comparing iterative algorithms, will thus ensure that the
performance results are accurate and useful.
Table 2: List of parameter adjustment strategies, and links to source code.
Damping factor: adjust, dynamic-adjust
Tolerance: L1 norm, L2 norm, L∞ norm
1. Comparing the effect of using different values of damping factor, with PageRank (pull, CSR).
2. Experimenting PageRank improvement by adjusting damping factor (α) between iterations.
3. Comparing the effect of using different functions for convergence check, with PageRank (...).
4. Comparing the effect of using different values of tolerance, with PageRank (pull, CSR).