Random-Walk Computation
Theory and Applications of Computing Random-Walk-Based Quantities such as SimRank, PageRank, Personalized PageRank (PPR), and Effective Resistance (ER)
- Overview
- SimRank Computation
- PPR survey
- FORA
- EdgePush
- [WWY24]
- RBS
- [WWWY24]
- SetPush
- TopPPR
- AGP
- BiSPER
-
Random-walk probabilities on graphs serve as a cornerstone in network analysis, graph data mining, and graph machine learning. Their further applications span diverse domains, including information retrieval, recommendation systems, chemistry, biology, and neuroscience.
This project focuses on efficient computation and approximation of random-walk probabilities, a subject of both theoretical and practical significance. We mainly concentrate on optimizing algorithms to compute widely-used random-walk-based metrics, such as SimRank, PageRank, Personalized PageRank (PPR), and Effective Resistance (ER).
Our work improves the computational complexity of computing these quantities, and we also conduct empirical evaluations to demonstrate the efficiency and effectiveness of our approaches in real-world applications, including local graph clustering and graph neural networks.
graph TD classDef main fill:#FFEBCC,stroke:#FF9900,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef secondary fill:#C6F6C6,stroke:#009900,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef work fill:#C6E5FF,stroke:#007ACC,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef link fill:#FDE2E2,stroke:#E63946,stroke-width:2px,color:#007bFF,font-weight:bold,rx:15px,ry:15px,stroke-dasharray:4,4,text-decoration:underline,font-style:italic; A[Our Work on Random-Walk Computation]:::main --> B[for SimRank]:::secondary A --> C[for PageRank and PPR]:::secondary A --> D[for General Proximity]:::secondary A --> E[for ER]:::secondary B --> B1[ProbeSim]:::work B1 --> B2[PRSim]:::work B2 --> B3[ExactSim]:::work B --> B4[SimTab]:::work C --> C1[...... see below]:::link click C1 href "#PPR" D --> D1[AGP]:::work E --> E1[Single-Pair ER]:::secondary E1 --> E2[BiSPER]:::work
graph TD classDef secondary fill:#C6F6C6,stroke:#009900,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef work fill:#C6E5FF,stroke:#007ACC,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef highlight fill:#66B3FF,stroke:#007ACC,stroke-width:2px,color:#333,rx:5px,ry:5px; classDef link fill:#FDE2E2,stroke:#E63946,stroke-width:2px,color:#9B2226,font-weight:bold,rx:5px,ry:5px; A[Our Work on PageRank and PPR Computation]:::link A --> A1[PPR Survey]:::highlight A --> A2[Single-Source PPR]:::secondary A2 --> A21[FORA]:::work A21 --> A211[SpeedPPR]:::work A2 --> A22[EdgePush]:::work A2 --> A23["[WWY24]"]:::work A --> A3[Single-Target PPR]:::secondary A3 --> A31[RBS]:::work A3 --> A32["[WWWY24]"]:::work A --> A4[Single-Node PageRank]:::secondary A4 --> A41[SetPush]:::work A4 --> A42["[WWWY24]"]:::work A --> A5[TopPPR]:::work
-
ProbeSim: Scalable Single-Source and Top-\(k\) SimRank Computations on Dynamic Graphs [VLDB 2017]
-
ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic GraphsInternational Conference on Very Large Data Bases (VLDB), 2017
-
Citation
@article{liu2017probesim, author = {Yu Liu and Bolong Zheng and Xiaodong He and Zhewei Wei and Xiaokui Xiao and Kai Zheng and Jiaheng Lu}, title = {ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs}, journal = {Proceedings of the VLDB Endowment}, volume = {11}, number = {1}, pages = {14--26}, year = {2017}, url = {http://www.vldb.org/pvldb/vol11/p14-liu.pdf}, doi = {10.14778/3151113.3151115} }
PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs [SIGMOD 2019]
-
PRSim: Sublinear Time SimRank Computation on Large Power-Law GraphsACM Conference on Management of Data (SIGMOD), 2019
-
Citation
@inproceedings{wei2019prsim, author = {Zhewei Wei and Xiaodong He and Xiaokui Xiao and Sibo Wang and Yu Liu and Xiaoyong Du and Ji-Rong Wen}, title = {PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs}, booktitle = {Proceedings of the 2019 International Conference on Management of Data}, pages = {1042--1059}, year = {2019}, url = {https://doi.org/10.1145/3299869.3319873}, doi = {10.1145/3299869.3319873} }
SimTab: Accuracy-Guaranteed SimRank Queries Through Tighter Confidence Bounds and Multi-Armed Bandits [VLDB 2020]
-
SimTab: Accuracy-Guaranteed SimRank Queries Through Tighter Confidence Bounds and Multi-Armed BanditsInternational Conference on Very Large Data Bases (VLDB), 2020
-
Citation
@article{liu2020simtab, author = {Yu Liu and Lei Zou and Qian Ge and Zhewei Wei}, title = {SimTab: Accuracy-Guaranteed SimRank Queries through Tighter Confidence Bounds and Multi-Armed Bandits}, journal = {Proceedings of the VLDB Endowment}, volume = {13}, number = {11}, pages = {2202--2214}, year = {2020}, url = {http://www.vldb.org/pvldb/vol13/p2202-liu.pdf}, doi = {10.14778/3407790.3407819} }
Exact Single-Source SimRank Computation on Large Graphs [SIGMOD 2020], ExactSim: Benchmarking Single-Source SimRank Algorithms with High-Precision Ground Truths [The VLDB Journal 2021]
We propose ExactSim, the first algorithm that enables probabilistic exact single-source SimRank queries on large graphs. ExactSim can provide the ground truth with a precision up to \(7\) decimal places for single-source SimRank queries on large graphs within a reasonable query time. With the ground truths computed by ExactSim, we conduct the first experimental study of the accuracy/cost trade-offs of existing approximate SimRank algorithms on large graphs.
-
Exact Single-Source SimRank Computation on Large GraphsACM Conference on Management of Data (SIGMOD), 2020
-
-
ExactSim: Benchmarking Single-Source SimRank Algorithms with High-Precision Ground TruthsThe VLDB Journal, 2021[Code]
-
Citations
@inproceedings{wang2020exact, author = {Hanzhi Wang and Zhewei Wei and Ye Yuan and Xiaoyong Du and Ji-Rong Wen}, title = {Exact Single-Source SimRank Computation on Large Graphs}, booktitle = {Proceedings of the 2020 International Conference on Management of Data}, pages = {653--663}, year = {2020}, url = {https://doi.org/10.1145/3318464.3389781}, doi = {10.1145/3318464.3389781} } @article{wang2021exactsim, author = {Hanzhi Wang and Zhewei Wei and Yu Liu and Ye Yuan and Xiaoyong Du and Ji-Rong Wen}, title = {ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths}, journal = {The VLDB Journal}, volume = {30}, number = {6}, pages = {989--1015}, year = {2021}, url = {https://doi.org/10.1007/s00778-021-00672-7}, doi = {10.1007/S00778-021-00672-7} }
-
Efficient Algorithms for Personalized PageRank Computation: A Survey [TKDE 2024]
This survey paper provides a systematic and comprehensive review of several basic techniques and recent algorithms for PPR computation from an algorithmic perspective.
-
Efficient Algorithms for Personalized PageRank Computation: A SurveyIEEE Transactions on Knowledge and Data Engineering (TKDE), 2024[arXiv]
-
Citation
@article{yang2024efficient, author = {Mingji Yang and Hanzhi Wang and Zhewei Wei and Sibo Wang and Ji-Rong Wen}, title = {Efficient Algorithms for Personalized PageRank Computation: A Survey}, journal = {IEEE Transactions on Knowledge and Data Engineering}, volume = {36}, number = {9}, pages = {4582--4602}, year = {2024}, doi = {10.1109/TKDE.2024.3376000} }
-
FORA: Simple and Effective Approximate Single-Source Personalized PageRank [KDD 2017], Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries [TODS 2019]
-
FORA: Simple and Effective Approximate Single-Source Personalized PageRankACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017
-
-
Efficient Algorithms for Approximate Single-Source Personalized PageRank QueriesACM Transactions on Database Systems (TODS), 2019
-
Citations
@inproceedings{wang2017fora, author = {Sibo Wang and Renchi Yang and Xiaokui Xiao and Zhewei Wei and Yin Yang}, title = {FORA: Simple and Effective Approximate Single-Source Personalized PageRank}, booktitle = {Proceedings of the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining}, pages = {505--514}, year = {2017}, url = {https://doi.org/10.1145/3097983.3098072}, doi = {10.1145/3097983.3098072} } @article{wang2019efficient, author = {Sibo Wang and Renchi Yang and Runhui Wang and Xiaokui Xiao and Zhewei Wei and Wenqing Lin and Yin Yang and Nan Tang}, title = {Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries}, journal = {ACM Transactions on Database Systems}, volume = {44}, number = {4}, pages = {18:1--18:37}, year = {2019}, url = {https://doi.org/10.1145/3360902}, doi = {10.1145/3360902} }
-
Edge-based Local Push for Personalized PageRank [VLDB 2022]
The state-of-the-art algorithm, LocalPush, for Personalized PageRank computation can be rather inefficient on weighted graphs. In this paper, we propose an Edge-based Push Method (EdgePush), which decomposes the push operation of LocalPush into separate edge-based push operations and achieves superior query efficiency over LocalPush on weighted graphs.
-
Edge-based Local Push for Personalized PageRankInternational Conference on Very Large Data Bases (VLDB), 2022
-
Citation
@article{wang2022edgebased, author = {Hanzhi Wang and Zhewei Wei and Junhao Gan and Ye Yuan and Xiaoyong Du and Ji-Rong Wen}, title = {Edge-based Local Push for Personalized PageRank}, journal = {Proceedings of the VLDB Endowment}, volume = {15}, number = {7}, pages = {1376--1389}, year = {2022}, url = {https://www.vldb.org/pvldb/vol15/p1376-wang.pdf}, doi = {10.14778/3523210.3523216} }
-
Approximating Single-Source Personalized PageRank with Absolute Error Guarantees [ICDT 2024]
This paper proposes new algorithms to improve the upper bounds for approximating Single-Source Personalized PageRank with (degree-normalized) absolute error guarantees on various graphs.
-
**Approximating Single-Source Personalized PageRank with Absolute Error GuaranteesInternational Conference on Database Theory (ICDT), 2024[arXiv]
-
Citation
@inproceedings{wei2024approximating, author = {Zhewei Wei and Ji-Rong Wen and Mingji Yang}, title = {Approximating Single-Source Personalized PageRank with Absolute Error Guarantees}, booktitle = {Proceedings of the 27th International Conference on Database Theory}, volume = {290}, pages = {9:1--9:19}, year = {2024}, url = {https://doi.org/10.4230/LIPIcs.ICDT.2024.9}, doi = {10.4230/LIPICS.ICDT.2024.9} }
-
Personalized PageRank to a Target Node, Revisited [KDD 2020]
We propose Randomized Backward Search (RBS), a novel algorithm that answers approximate single-target personalized PageRank queries (a.k.a. PageRank contribution queries) with nearly optimal time complexity.
-
Personalized PageRank to a Target Node, RevisitedACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
-
Citation
@inproceedings{wang2020personalized, author = {Hanzhi Wang and Zhewei Wei and Junhao Gan and Sibo Wang and Zengfeng Huang}, title = {Personalized PageRank to a Target Node, Revisited}, booktitle = {Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining}, pages = {657--667}, year = {2020}, url = {https://doi.org/10.1145/3394486.3403108}, doi = {10.1145/3394486.3403108} }
-
Revisiting Local Computation of PageRank: Simple and Optimal [STOC 2024)]
We use simple techniques and analyses to give matching upper and lower bounds for estimating PageRank contributions and single-node PageRank. Our results for the upper bounds are derived by revisiting the known algorithms of ApproxContributions (a.k.a. Backward Push) and BiPPR.
-
**Revisiting Local Computation of PageRank: Simple and OptimalAnnual ACM Symposium on Theory of Computing (STOC), 2024
-
Citation
@inproceedings{wang2024revisiting, author = {Hanzhi Wang and Zhewei Wei and Ji-Rong Wen and Mingji Yang}, title = {Revisiting Local Computation of PageRank: Simple and Optimal}, booktitle = {Proceedings of the 56th Annual ACM Symposium on Theory of Computing}, pages = {911--922}, year = {2024}, url = {https://doi.org/10.1145/3618260.3649661}, doi = {10.1145/3618260.3649661} }
-
Estimating Single-Node PageRank in \(\tilde{O}\left(\min\big\{d_t,\sqrt{m}\big\}\right)\) Time [VLDB 2023]
For the problem of estimating a single node’s PageRank score in an undirected graph, we tighten the upper bound for query time complexity from \(O\left((nd_t)^{1/2}\right)\) to \(\textcolor{blue}{O\left(\min\left(d_t, m^{1/2}\right)\right)}\) (omitting logarithmic factors). Here \(n\) and \(m\) denote the number of nodes and edges in the given graph, respectively. \(d_t\) denotes the degree of the given target node \(t\).
-
Estimating Single-Node PageRank in \(\tilde{O}\left(\min\{d_t, \sqrt{m}\}\right)\)TimeInternational Conference on Very Large Data Bases (VLDB), 2023
-
Citation
@article{wang2023estimating, author = {Hanzhi Wang and Zhewei Wei}, title = {Estimating Single-Node PageRank in $\tilde{O}\left(\min\big\{d_t,\sqrt{m}\big\}\right)$ Time}, journal = {Proceedings of the VLDB Endowment}, volume = {16}, number = {11}, pages = {2949--2961}, year = {2023}, url = {https://dl.acm.org/doi/10.14778/3611479.3611500}, doi = {10.14778/3611479.3611500} }
-
TopPPR: Top-\(k\) Personalized PageRank Queries with Precision Guarantees on Large Graphs [SIGMOD 2018]
-
TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large GraphsACM Conference on Management of Data (SIGMOD), 2018
-
Citation
@inproceedings{wei2018topppr, author = {Zhewei Wei and Xiaodong He and Xiaokui Xiao and Sibo Wang and Shuo Shang and Ji-Rong Wen}, title = {TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs}, booktitle = {Proceedings of 2018 ACM Conference on Management of Data}, pages = {441--456}, year = {2018}, url = {https://doi.org/10.1145/3183713.3196920}, doi = {10.1145/3183713.3196920} }
-
Approximate Graph Propagation [KDD 2021]
We propose Approximate Graph Propagation (AGP), a unified randomized algorithm that computes various proximity queries and GNN feature propagation with almost optimal time complexity.
-
Approximate Graph PropagationACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021
-
Citation
@inproceedings{wang2021approximate, author = {Hanzhi Wang and Mingguo He and Zhewei Wei and Sibo Wang and Ye Yuan and Xiaoyong Du and Ji-Rong Wen}, title = {Approximate Graph Propagation}, booktitle = {Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining}, pages = {1686--1696}, year = {2021}, url = {https://doi.org/10.1145/3447548.3467243}, doi = {10.1145/3447548.3467243} }
-
Mixing Time Matters: Accelerating Effective Resistance Estimation via Bidirectional Method [KDD 2025]