
Data-Driven Socio-Economic Deprivation Prediction via Dimensionality Reduction: The Power of Diffusion Maps (2312.09830v2)

Published 15 Dec 2023 in cs.LG

Abstract: This research proposes a model that uses census data to predict the location of the most deprived areas in a city. Census data is very high-dimensional and needs to be simplified. We use the diffusion map algorithm to reduce its dimensionality and find patterns. Features are defined by the eigenvectors of the Laplacian matrix that defines the diffusion map. The eigenvectors corresponding to the smallest eigenvalues indicate specific characteristics of the population. Previous work has found qualitatively that the second most important dimension for describing the census data in Bristol, UK is linked to deprivation. In this research, we analyse how well this dimension serves as a model for predicting deprivation by comparing it with the recognised measures. The Pearson correlation coefficient was found to be greater than 0.7. To test the accuracy of the model, we extract the areas in Bristol that fall within the top 10 per cent of deprived areas in the UK. There are 52 such areas, and the model correctly identifies 38 of them. The model fails to predict the remaining 14 deprived areas because of the influence of IMD domain scores that do not correlate with the model and of the Eigenvector 2 entries of non-deprived Output Areas. The model demonstrates strong performance in predicting future deprivation in the project areas, which is expected to greatly assist government resource allocation and funding. The code is available at: https://github.com/junegoo94/diffusion_maps
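The pipeline described in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation (their code is in the repository linked above). The Gaussian kernel, the median bandwidth, the symmetric normalised Laplacian, the synthetic "deprivation" score, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np


def laplacian_eigenvectors(X, n_vectors=3):
    """Eigenpairs of a normalised graph Laplacian with the smallest eigenvalues.

    Assumed construction: Gaussian kernel on standardised features, with the
    median squared distance as bandwidth.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Pairwise squared Euclidean distances between areas.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / np.median(sq))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalised Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vals[:n_vectors], vecs[:, :n_vectors]


def top_decile_hits(score, reference):
    """Count areas in the top 10% of `reference` that are also in the top 10% of `score`."""
    k = len(reference) // 10
    top_ref = set(np.argsort(reference)[-k:])
    top_score = set(np.argsort(score)[-k:])
    return len(top_ref & top_score), k


# Synthetic stand-in for census data: one latent factor drives 20 variables.
rng = np.random.default_rng(0)
n_areas, n_features = 300, 20
latent = rng.normal(size=n_areas)  # hypothetical "deprivation" score
loadings = rng.normal(size=n_features)
X = latent[:, None] * loadings[None, :] + 0.5 * rng.normal(size=(n_areas, n_features))

vals, vecs = laplacian_eigenvectors(X)
eig2 = vecs[:, 1]  # eigenvector of the second-smallest eigenvalue

# Pearson correlation between the second eigenvector and the reference score.
r = np.corrcoef(eig2, latent)[0, 1]

# Eigenvector signs are arbitrary, so orient the score to agree with the reference.
hits, k = top_decile_hits(np.sign(r) * eig2, latent)

print(f"Pearson r = {r:.3f}")
print(f"top-decile areas recovered: {hits} of {k}")
```

The last two lines mirror the two evaluations in the abstract: a correlation against a recognised measure, and a count of how many of the most deprived areas the eigenvector ranks in its own top decile (38 of 52 in the paper's Bristol test).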

