
Fairness Rising from the Ranks: HITS and PageRank on Homophilic Networks (2402.13787v3)

Published 21 Feb 2024 in cs.SI and cs.IR

Abstract: In this paper, we investigate the conditions under which link analysis algorithms prevent minority groups from reaching high ranking slots. We find that the most common link-based algorithms using centrality metrics, such as PageRank and HITS, can reproduce and even amplify bias against minority groups in networks. Yet, their behavior differs: on the one hand, we empirically show that PageRank mirrors the degree distribution for most of the ranking positions and it can equalize representation of minorities among the top ranked nodes; on the other hand, we find that HITS amplifies pre-existing bias in homophilic networks through a novel theoretical analysis, supported by empirical results. We find the root cause of bias amplification in HITS to be the level of homophily present in the network, modeled through an evolving network model with two communities. We illustrate our theoretical analysis on both synthetic and real datasets and we present directions for future work.

Authors (3)
  1. Ana-Andreea Stoica (9 papers)
  2. Nelly Litvak (37 papers)
  3. Augustin Chaintreau (11 papers)
Citations (1)

Summary

  • The paper shows empirically that PageRank can equalize minority representation among the top-ranked nodes, while at most other ranking positions it simply mirrors the degree distribution.
  • The paper finds that HITS amplifies inherent biases by reinforcing densely interconnected communities and limiting minority visibility.
  • The paper explores randomization and subspace HITS methods as promising strategies to mitigate bias and enhance algorithmic fairness.

Exploring the Impact of Link Analysis Algorithms on Minority Representation in Homophilic Networks

Understanding the Dynamics of PageRank and HITS

Recent work explores how link analysis ranking algorithms, notably PageRank and Hyperlink-Induced Topic Search (HITS), impact minority representation in social and information networks. These algorithms are fundamental in determining the visibility and ranking of nodes in a network, directly affecting which voices are heard and which remain obscure.

Dissecting the Bias Amplification

Both PageRank and HITS derive their scores purely from link structure, but they treat minority groups differently. PageRank, a global centrality measure, can equalize minority representation among the top-ranked nodes. This effect does not extend down the ranking: for most positions, PageRank mirrors the degree distribution and therefore reproduces whatever bias is already present in node connectivity.
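To make the PageRank discussion concrete, here is a minimal sketch of PageRank by power iteration, together with a helper that measures the minority share among the top k nodes. The function names, the damping value of 0.85, and the uniform handling of dangling nodes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank; adj[i, j] = 1 means an edge i -> j."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-normalise the adjacency matrix. Nodes without out-links
    # (dangling nodes) jump to a uniformly random node (an assumption).
    trans = np.where(
        out_deg[:, None] > 0,
        adj / np.maximum(out_deg, 1.0)[:, None],
        1.0 / n,
    )
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (trans.T @ rank) + (1.0 - damping) / n
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank

def minority_share_top_k(scores, is_minority, k):
    """Fraction of minority nodes among the k highest-scoring nodes."""
    top = np.argsort(-scores, kind="stable")[:k]
    return is_minority[top].mean()
```

Comparing `minority_share_top_k(pagerank(adj), is_minority, k)` with the same quantity computed from in-degrees, for several values of k, is one simple way to check where a ranking departs from the degree baseline.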

Conversely, HITS, which distinguishes between 'hubs' and 'authorities,' tends to amplify existing biases, especially in homophilic networks. The bifurcation of roles in HITS creates a feedback loop that disproportionately benefits nodes within densely interconnected communities, thus reinforcing majority dominance and minority invisibility.
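The feedback loop can be illustrated with a small sketch. The code below runs plain HITS by power iteration on a toy graph made of a large and a small directed clique joined by a single edge in each direction. The toy graph and all names are hypothetical and much simpler than the networks studied in the paper; the point is only to compare the minority's share of authority score with its share of in-degree.

```python
import numpy as np

def hits(adj, tol=1e-12, max_iter=10_000):
    """Plain HITS by power iteration; adj[i, j] = 1 means an edge i -> j."""
    n = adj.shape[0]
    hub = np.full(n, 1.0 / n)
    auth = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_auth = adj.T @ hub            # authorities are pointed to by good hubs
        new_auth = new_auth / new_auth.sum()
        new_hub = adj @ new_auth          # hubs point to good authorities
        new_hub = new_hub / new_hub.sum()
        converged = np.abs(new_auth - auth).sum() < tol
        hub, auth = new_hub, new_auth
        if converged:
            break
    return hub, auth

def two_cliques(n_major=6, n_minor=3):
    """Hypothetical toy graph: two directed cliques joined by one edge each way."""
    n = n_major + n_minor
    adj = np.zeros((n, n))
    adj[:n_major, :n_major] = 1.0
    adj[n_major:, n_major:] = 1.0
    np.fill_diagonal(adj, 0.0)
    adj[0, n_major] = adj[n_major, 0] = 1.0
    is_minority = np.arange(n) >= n_major
    return adj, is_minority

adj, is_minority = two_cliques()
_, auth = hits(adj)
in_deg = adj.sum(axis=0)
auth_share = auth[is_minority].sum()
degree_share = in_deg[is_minority].sum() / in_deg.sum()
print(f"minority share of in-degree:      {degree_share:.3f}")
print(f"minority share of HITS authority: {auth_share:.3f}")
```

In this toy setting the authority scores concentrate on the larger, denser clique, so the minority's share of authority falls below its share of in-degree. This is consistent with the amplification described above, though it is not a substitute for the paper's analysis.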

The paper introduces a rigorous analytical framework that isolates the conditions under which these divergent outcomes manifest. Through both theoretical analysis and empirical tests on synthetic and real-world datasets, it confirms that HITS's bias amplification is particularly pronounced in networks exhibiting strong homophily.
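As a rough illustration of what a homophilic evolving network with two communities can look like, the sketch below grows a directed graph in the style of biased preferential attachment. Each new node proposes a link to an existing node chosen in proportion to in-degree, always accepts a same-group target, and accepts a cross-group target only with probability `cross_accept`. This is a simplified stand-in under stated assumptions and may differ from the exact model analysed in the paper; the function name and parameters are hypothetical.

```python
import numpy as np

def homophilic_growth(n, minority_frac=0.2, cross_accept=0.2, seed=0):
    """Grow a directed graph with one out-link per new node.

    Lower cross_accept means stronger homophily. Assumed model,
    not necessarily the one used in the paper.
    """
    if not 0.0 < cross_accept <= 1.0:
        raise ValueError("cross_accept must be in (0, 1]")
    rng = np.random.default_rng(seed)
    is_minority = rng.random(n) < minority_frac
    adj = np.zeros((n, n))
    adj[0, 1] = adj[1, 0] = 1.0   # seed pair so the first choices are well defined
    for new in range(2, n):
        # Preferential attachment on in-degree, with +1 smoothing.
        weight = adj[:new, :new].sum(axis=0) + 1.0
        prob = weight / weight.sum()
        while True:
            target = rng.choice(new, p=prob)
            same_group = is_minority[target] == is_minority[new]
            if same_group or rng.random() < cross_accept:
                adj[new, target] = 1.0
                break
    return adj, is_minority
```

Sweeping `cross_accept` from 1.0 (no homophily) towards 0 while holding `minority_frac` fixed gives a family of synthetic graphs on which ranking algorithms can be compared.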

Introducing Randomization and Subspace HITS

One direction explored for mitigating bias is to introduce randomization into HITS. Randomized HITS, which incorporates random restarts similar to those in PageRank, shows promise in matching, and in some cases improving upon, the fairness of the degree-ranking baseline. This suggests that randomness acts as a buffer against the self-reinforcing mechanism that amplifies bias.
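A minimal sketch of the random-restart idea follows. It alternates a forward step along out-links and a backward step along in-links, and mixes in a uniform restart at each step. The restart probability of 0.15, the renormalisation that absorbs mass lost at nodes without links, and the function name are assumptions made for illustration; the paper's exact variant may differ.

```python
import numpy as np

def randomized_hits(adj, restart=0.15, tol=1e-12, max_iter=10_000):
    """HITS with uniform random restarts; adj[i, j] = 1 means an edge i -> j."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    in_deg = adj.sum(axis=0, keepdims=True)
    fwd = adj / np.maximum(out_deg, 1.0)   # rows sum to 1 where out-degree > 0
    bwd = adj / np.maximum(in_deg, 1.0)    # columns sum to 1 where in-degree > 0
    hub = np.full(n, 1.0 / n)
    auth = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Hub -> authority: follow a random out-link, or restart uniformly.
        new_auth = restart / n + (1.0 - restart) * (fwd.T @ hub)
        new_auth = new_auth / new_auth.sum()
        # Authority -> hub: follow a random in-link backwards, or restart.
        new_hub = restart / n + (1.0 - restart) * (bwd @ new_auth)
        new_hub = new_hub / new_hub.sum()
        converged = np.abs(new_auth - auth).sum() < tol
        hub, auth = new_hub, new_auth
        if converged:
            break
    return hub, auth
```

Because every node receives at least `restart / n` of the score at each step, no community can be driven to a near-zero share, which is one intuition for why restarts dampen the concentration seen in plain HITS.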

The paper also investigates Subspace HITS, a variant that computes node authority scores from multiple eigenvectors rather than only the principal one. Although this offers a way to potentially improve fairness, the results show that there is no obvious 'optimal' number of dimensions: fairness outcomes vary across datasets depending on how many eigenvectors are used.
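One common formulation of this idea scores node j as a weighted sum of its squared coordinates in the top k eigenvectors of the authority matrix (the adjacency matrix transposed times itself). The sketch below follows that formulation; weighting each eigenvector by its eigenvalue is an assumption, as are the function and parameter names.

```python
import numpy as np

def subspace_hits_authority(adj, k, weight=lambda lam: lam):
    """Authority scores from the top-k eigenvectors of adj.T @ adj.

    score_j = sum_i weight(lambda_i) * (x_i[j]) ** 2 over the k largest
    eigenvalues lambda_i with unit eigenvectors x_i. With k = 1 the ranking
    matches plain HITS (scores are squared authority entries).
    """
    gram = adj.T @ adj                          # symmetric, positive semidefinite
    eigvals, eigvecs = np.linalg.eigh(gram)     # eigenvalues in ascending order
    eigvals = np.clip(eigvals, 0.0, None)       # remove tiny negative round-off
    top = np.argsort(eigvals)[::-1][:k]
    scores = (eigvecs[:, top] ** 2) @ weight(eigvals[top])
    return scores / scores.sum()
```

Running this for a range of `k` values and tracking the minority share among top-ranked nodes is a direct way to see the dataset-dependent sensitivity to the number of eigenvectors.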

Forward Path

This research clarifies how the structural properties of networks and the design of ranking algorithms jointly shape the visibility of minority groups. It points to substantial future work on tuning algorithmic interventions that promote fairness. A particularly intriguing avenue is the systematic exploration of how network evolution, in response to altered ranking strategies, influences long-term fairness and equity of representation.

Concluding Remarks

The findings describe a nuanced landscape in which algorithm design, network structure, and societal dynamics interact. The work challenges future research to further unravel the relationship between algorithmic fairness and network topology, with the aim of fostering inclusivity in digital spaces.
