Deep Learning for Economists (2407.15339v3)
Abstract: Deep learning provides powerful methods to impute structured information from large-scale, unstructured text and image datasets. For example, economists might wish to detect the presence of economic activity in satellite images, or to measure the topics or entities mentioned in social media, the congressional record, or firm filings. This review introduces deep neural networks, covering methods such as classifiers, regression models, generative AI, and embedding models. Applications include classification, document digitization, record linkage, and methods for data exploration in massive-scale text and image corpora. When suitable methods are used, deep learning models can be cheap to tune and can scale affordably to problems involving millions or billions of data points. The review is accompanied by a companion website, EconDL, with user-friendly demo notebooks, software resources, and a knowledge base that provides technical details and additional applications.
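To give a concrete sense of the embedding-based workflows the review discusses, the sketch below shows record linkage with a pretrained text embedding model: noisy firm names are encoded as dense vectors and matched by cosine similarity. This is an illustrative example, not code from the paper or its companion notebooks; it assumes the open-source sentence-transformers library and the all-MiniLM-L6-v2 checkpoint, and the firm names are invented.

```python
# Minimal sketch of embedding-based record linkage (illustrative only).
# Assumes: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Load a small, general-purpose sentence embedding model.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Two small sets of firm names to link across noisy records (made-up examples).
queries = ["Standard Oil Co. of New Jersey", "Gen. Electric Company"]
candidates = ["Standard Oil (New Jersey)", "General Electric Co.", "U.S. Steel Corp."]

# Encode both sets into dense vectors and compute pairwise cosine similarities.
q_emb = model.encode(queries, convert_to_tensor=True)
c_emb = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(q_emb, c_emb)

# For each query, report the best-matching candidate and its similarity score.
for i, q in enumerate(queries):
    j = scores[i].argmax().item()
    print(f"{q} -> {candidates[j]} (cosine similarity {scores[i][j]:.2f})")
```

In practice, a researcher would fine-tune such a model on labeled match pairs and use an approximate nearest-neighbor index rather than brute-force comparison when the candidate set is large.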