
Beyond Data Points: Regionalizing Crowdsourced Latency Measurements (2405.11138v4)

Published 18 May 2024 in cs.NI and cs.CY

Abstract: Despite significant investments in access network infrastructure, universal access to high-quality Internet connectivity remains a challenge. Policymakers often rely on large-scale, crowdsourced measurement datasets to assess the distribution of access network performance across geographic areas. These decisions typically rest on the assumption that Internet performance is uniformly distributed within predefined social boundaries. However, this assumption may not be valid for two reasons: crowdsourced measurements often exhibit non-uniform sampling densities within geographic areas; and predefined social boundaries may not align with the actual boundaries of Internet infrastructure. In this paper, we present a spatial analysis of crowdsourced datasets for constructing stable boundaries for sampling Internet performance. We hypothesize that greater stability in sampling boundaries will better reflect the true nature of Internet performance disparities than the misleading patterns observed as a result of data sampling variations. We apply and evaluate a series of statistical techniques to: aggregate Internet performance over geographic regions; overlay interpolated maps with various sampling unit choices; and spatially cluster boundary units to identify contiguous areas with similar performance characteristics. We assess the effectiveness of the techniques we apply by comparing the similarity of the resulting boundaries for monthly samples drawn from the dataset. Our evaluation shows that the combination of techniques we apply achieves higher similarity than directly calculating central measures of network metrics over census tracts or neighborhood boundaries. These findings underscore the important role of spatial modeling in accurately assessing and optimizing the distribution of Internet performance, to inform policy, network operations, and long-term planning decisions.
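
To make the baseline named in the abstract concrete (a central measure of a network metric computed directly over predefined units, one month at a time), the following is a minimal sketch and not the paper's implementation. The tuple schema `(month, unit_id, latency_ms)`, the unit names, and the latency values are all hypothetical, and the month-to-month shift is only an illustrative stand-in for the similarity comparison the abstract describes.

```python
from collections import defaultdict
from statistics import median


def median_latency_by_unit(samples):
    """Group latency samples by (month, unit) and take the median of each group.

    `samples` is an iterable of (month, unit_id, latency_ms) tuples;
    this schema is an assumption made for illustration.
    """
    grouped = defaultdict(list)
    for month, unit_id, latency_ms in samples:
        grouped[(month, unit_id)].append(latency_ms)
    return {key: median(values) for key, values in grouped.items()}


def month_to_month_shift(medians, unit_id, month_a, month_b):
    """Absolute change in one unit's median latency between two months."""
    return abs(medians[(month_a, unit_id)] - medians[(month_b, unit_id)])


# Hypothetical data: tract_A has few samples in February, one of them an outlier.
samples = [
    ("2023-01", "tract_A", 20.0), ("2023-01", "tract_A", 22.0), ("2023-01", "tract_A", 24.0),
    ("2023-02", "tract_A", 21.0), ("2023-02", "tract_A", 95.0),
    ("2023-01", "tract_B", 40.0), ("2023-02", "tract_B", 42.0),
]

medians = median_latency_by_unit(samples)
shift_a = month_to_month_shift(medians, "tract_A", "2023-01", "2023-02")
shift_b = month_to_month_shift(medians, "tract_B", "2023-01", "2023-02")
```

In this toy data the sparsely sampled unit shows a much larger month-to-month change than the other, which is the kind of sampling-driven instability the abstract argues that interpolation and spatial clustering are meant to reduce.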

