
Approximating Single-Source Personalized PageRank with Absolute Error Guarantees (2401.01019v1)

Published 2 Jan 2024 in cs.DS

Abstract: Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes $s$ and $t$ on a graph $G=(V,E)$, the PPR value $\pi(s,t)$ is defined as the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$, where the walk terminates with probability $\alpha$ at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node $s$ to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates $\hat{\pi}(s,t)$ satisfy $\max_{t\in V}\big|\hat{\pi}(s,t)-\pi(s,t)\big|\le\varepsilon$ for a given error bound $\varepsilon$. We propose an algorithm that achieves this with high probability, with an expected running time of
- $\widetilde{O}\big(\sqrt{m}/\varepsilon\big)$ for directed graphs, where $m=|E|$;
- $\widetilde{O}\big(\sqrt{d_{\mathrm{max}}}/\varepsilon\big)$ for undirected graphs, where $d_{\mathrm{max}}$ is the maximum node degree in the graph;
- $\widetilde{O}\left(n^{\gamma-1/2}/\varepsilon\right)$ for power-law graphs, where $n=|V|$ and $\gamma\in\left(\frac{1}{2},1\right)$ is the extent of the power law.

These sublinear bounds improve upon existing results. We also study the case when degree-normalized absolute error guarantees are desired, requiring $\max_{t\in V}\big|\hat{\pi}(s,t)/d(t)-\pi(s,t)/d(t)\big|\le\varepsilon_d$ for a given error bound $\varepsilon_d$, where the graph is undirected and $d(t)$ is the degree of node $t$. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$. This improves over the previously known $O(1/\varepsilon_d)$ complexity.
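To make the definition of $\pi(s,t)$ and the absolute error criterion concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: it computes near-exact single-source PPR by truncated power iteration, estimates the same values by plainly simulating $\alpha$-discounted random walks, and reports $\max_{t\in V}|\hat{\pi}(s,t)-\pi(s,t)|$. The function names, the toy graph, and the assumption that every node has at least one out-neighbor are illustrative choices, not taken from the source.

```python
import random


def ppr_power_iteration(adj, s, alpha, iters=200):
    """Near-exact single-source PPR on an adjacency-list graph.

    Assumes (for this sketch) that every node has at least one out-neighbor.
    """
    n = len(adj)
    alive = [0.0] * n      # probability that the walk is at u and still running
    alive[s] = 1.0
    pi = [0.0] * n
    for _ in range(iters):
        nxt = [0.0] * n
        for u, mass in enumerate(alive):
            if mass == 0.0:
                continue
            pi[u] += alpha * mass                      # walk terminates at u
            share = (1.0 - alpha) * mass / len(adj[u])
            for v in adj[u]:                           # otherwise it moves on
                nxt[v] += share
        alive = nxt
    return pi


def ppr_monte_carlo(adj, s, alpha, num_walks, rng):
    """Estimate PPR as the fraction of simulated walks that end at each node."""
    counts = [0] * len(adj)
    for _ in range(num_walks):
        u = s
        while rng.random() >= alpha:   # continue with probability 1 - alpha
            u = rng.choice(adj[u])     # step to a uniformly random out-neighbor
        counts[u] += 1
    return [c / num_walks for c in counts]


def max_abs_error(estimate, exact):
    """Absolute error as in the abstract: max over all target nodes t."""
    return max(abs(a - b) for a, b in zip(estimate, exact))


# Hypothetical 4-node directed graph: adj[u] lists the out-neighbors of u.
adj = [[1, 2], [2], [0, 3], [0]]
exact = ppr_power_iteration(adj, s=0, alpha=0.2)
approx = ppr_monte_carlo(adj, s=0, alpha=0.2, num_walks=20000, rng=random.Random(0))
print("max absolute error:", max_abs_error(approx, exact))
```

Plain Monte Carlo serves here only as a simple baseline for the error criterion. By a standard concentration argument, its number of walks grows roughly as $1/\varepsilon^2$ (up to logarithmic factors), whereas the abstract's bounds scale as $1/\varepsilon$.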
