Deep Learning for Economists (2407.15339v3)

Published 22 Jul 2024 in econ.GN, cs.CL, cs.CV, and q-fin.EC

Abstract: Deep learning provides powerful methods to impute structured information from large-scale, unstructured text and image datasets. For example, economists might wish to detect the presence of economic activity in satellite images, or to measure the topics or entities mentioned in social media, the congressional record, or firm filings. This review introduces deep neural networks, covering methods such as classifiers, regression models, generative AI, and embedding models. Applications include classification, document digitization, record linkage, and methods for data exploration in massive scale text and image corpora. When suitable methods are used, deep learning models can be cheap to tune and can scale affordably to problems involving millions or billions of data points. The review is accompanied by a companion website, EconDL, with user-friendly demo notebooks, software resources, and a knowledge base that provides technical details and additional applications.


Summary

  • The paper shows how deep learning techniques, including CNNs and transformers, can transform economic analysis by extracting structured information from vast unstructured text and image datasets.
  • It details methodologies for tasks such as text and image classification, token classification, and record linkage that improve the accuracy of measurement in economic research.
  • The study emphasizes practical benefits such as scalability, efficiency, and cost-effectiveness, and points to future work on model interpretability and bias reduction.

Deep Learning for Economists: Insights from Melissa Dell's Review

Melissa Dell's paper, "Deep Learning for Economists," reviews how deep learning (DL) methods can be applied to economic research. By bridging the gap between advances in DL and their practical use in economics, the paper offers a roadmap for using neural networks to process and analyze large-scale unstructured datasets, including text and images.

Overview of Deep Learning Architectures

The core of deep learning lies in neural networks that learn representations of data at multiple levels of abstraction. The review introduces foundational architectures such as convolutional neural networks (CNNs) for image data and transformer models for text data. The latter, particularly variants such as BERT and GPT, have revolutionized NLP by producing contextualized representations through self-attention.
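
To make self-attention concrete, below is a minimal NumPy sketch of single-head scaled dot-product attention. It is illustrative only: the matrices and dimensions are made up, and real transformer layers add multiple attention heads, per-layer learned projections, residual connections, and layer normalization.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (tokens, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # each token mixes all tokens' values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                         # 5 tokens, 16-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (5, 16): one contextualized vector per token
```

Because every token attends to every other token, each output vector depends on the full sequence context, which is what gives BERT- and GPT-style models their contextualized representations.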

Applications of Deep Learning in Economics

Dell categorizes the primary applications of deep learning in economics into several domains:

  1. Classification Tasks:
    • Text Classification: Classifying economic content within large corpora, such as identifying news articles on specific economic policies. The review compares performance across different DL models and generative AI tools like GPT-3.5 and GPT-4, finding that while fine-tuned models generally outperform generative AI, the latter can still be highly effective for more straightforward tasks (a minimal fine-tuning sketch follows this list).
    • Image Classification: Detecting economic activities in satellite images or identifying objects in scanned historical documents.
  2. Token Classification:
    • Named Entity Recognition (NER) for identifying economic entities like firms, individuals, or locations within textual datasets.
  3. Entity Disambiguation and Coreference Resolution:
    • Linking mentions of economic entities in unstructured texts to external knowledge bases like Wikipedia, and resolving coreference to aggregate data about the same entity across documents.
  4. Record Linkage:
    • Applying DL to link records across datasets, enhancing accuracy over traditional string matching methods. This can be especially valuable in contexts involving historical economic data or cross-national data integration.
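
As a concrete illustration of the fine-tuned classification workflow, the sketch below fine-tunes a small pretrained encoder on a toy labeled corpus with the Hugging Face Transformers library. The checkpoint name, label scheme, and two-document dataset are placeholders rather than the paper's own setup; the review's companion EconDL site provides full demo notebooks.

```python
# Minimal text-classification fine-tuning sketch (assumptions: a DistilBERT
# checkpoint and a toy two-document dataset; real applications need far more
# labeled data, a validation split, and tuned hyperparameters).
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy corpus: label 1 = article discusses monetary policy, label 0 = it does not.
data = Dataset.from_dict({
    "text": ["The central bank raised interest rates today.",
             "The local team won the championship game."],
    "label": [1, 0],
})

checkpoint = "distilbert-base-uncased"          # any encoder checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

data = data.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
args = TrainingArguments(output_dir="policy-clf-demo", num_train_epochs=3,
                         per_device_train_batch_size=8, logging_steps=1)

Trainer(model=model, args=args, train_dataset=data).train()
```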

Dell emphasizes the importance of contrastive learning in improving the isotropy of embedding spaces, significantly enhancing the performance of embedding models in tasks like record linkage and NER.
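
A minimal sketch of how contrastively trained embeddings support record linkage is shown below. The checkpoint name and firm names are illustrative placeholders, not taken from the paper; Dell and coauthors distribute purpose-built linkage models and tooling (e.g., the LinkTransformer package), but the basic recipe is the same: embed both sets of records and match on cosine similarity.

```python
# Sketch of embedding-based record linkage with a contrastively trained
# bi-encoder (checkpoint and records are illustrative placeholders).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # generic contrastive sentence encoder

firms_a = ["Standard Oil Co. of New Jersey", "Gen. Electric Company"]
firms_b = ["Standard Oil Company (New Jersey)", "General Electric Co.", "Ford Motor Co."]

emb_a = model.encode(firms_a, normalize_embeddings=True)   # unit-norm vectors
emb_b = model.encode(firms_b, normalize_embeddings=True)

sim = emb_a @ emb_b.T                       # cosine similarity matrix (2 x 3)
for i, name in enumerate(firms_a):
    j = int(sim[i].argmax())                # best candidate match for record i
    print(f"{name} -> {firms_b[j]} (cosine = {sim[i, j]:.2f})")
```

In practice one would fine-tune the encoder contrastively on matched and unmatched pairs from the target domain, which is where most of the gains over traditional string matching come from.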

Practical and Theoretical Implications

The practical implications of adopting deep learning methods in economics are profound:

  • Scalability and Efficiency: DL models, especially when tuned for a specific economic task, can process massive datasets at low marginal cost. For instance, embedding models support exact vector similarity search over very large corpora on consumer-grade GPUs (a minimal search sketch follows this list).
  • Enhanced Analytical Capabilities: By automating the extraction of structured data from unstructured sources, economists can test theories with granular data that were previously infeasible to analyze.
  • Cost-Effectiveness: Compared to traditional methods, DL approaches can significantly reduce the cost of data processing and analysis, particularly when models are fine-tuned for specific economic tasks.
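
The sketch below shows exact nearest-neighbour search over dense embeddings with the FAISS library. The corpus is random data standing in for real document embeddings; the same index API is available on GPUs via the faiss-gpu build.

```python
# Exact (brute-force) inner-product search over dense embeddings with FAISS.
# The embeddings are random placeholders standing in for real document vectors.
import numpy as np
import faiss

d = 384                                        # embedding dimension (model-dependent)
corpus = np.random.rand(100_000, d).astype("float32")
faiss.normalize_L2(corpus)                     # unit norm, so inner product equals cosine

index = faiss.IndexFlatIP(d)                   # exact inner-product index
index.add(corpus)

queries = np.random.rand(5, d).astype("float32")
faiss.normalize_L2(queries)
scores, ids = index.search(queries, 10)        # top-10 nearest documents per query
print(ids.shape)                               # (5, 10)
```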

Future Developments

Looking ahead, several areas in AI present opportunities for further advancements in economic research:

  • Enhancing Model Interpretability: Developing methods to better understand and explain the decisions made by DL models will be crucial for their broader adoption in economics.
  • Bias and Uncertainty Quantification: Integrating formal methods for assessing model bias and uncertainty will improve the reliability of DL applications in sensitive economic analyses.
  • Cross-Disciplinary Techniques: Combining insights from machine learning, econometrics, and domain-specific knowledge will foster more robust and innovative applications of DL in economics.

Conclusion

Dell's review underscores the transformative potential of deep learning for economic research. By providing tools to efficiently process and analyze large-scale, unstructured data, DL can significantly enhance the richness and precision of economic analyses. The review, along with its accompanying resources, serves as a valuable reference for economists seeking to incorporate deep learning into their research. The future of economic research, enriched by these capabilities, promises more nuanced insights and a more comprehensive understanding of economic phenomena.

