Graph-Convolutional Autoencoder Ensembles for the Humanities, Illustrated with a Study of the American Slave Trade

Published 1 Jan 2024 in cs.LG and cs.CL | (2401.00824v1)

Abstract: We introduce a graph-aware autoencoder ensemble framework, with associated formalisms and tooling, designed to facilitate deep learning for scholarship in the humanities. By composing sub-architectures to produce a model isomorphic to a humanistic domain, we maintain interpretability while providing function signatures for each sub-architectural choice, allowing traditional and computational researchers to collaborate without disrupting established practices. We illustrate a practical application of our approach to a historical study of the American post-Atlantic slave trade, and make several specific technical contributions: a novel hybrid graph-convolutional autoencoder mechanism, batching policies for common graph topologies, and masking techniques for particular use cases. The framework's effectiveness in broadening participation across diverse domains is demonstrated by a growing suite of two dozen studies, comprising both collaborations with humanists and established tasks from the machine learning literature, spanning a variety of fields and data modalities. We compare the performance of several architectural choices and conclude with an ambitious list of imminent next steps for this research.
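
As a rough illustration of two ingredients the abstract names (graph convolution and masking), the following toy sketch averages each node's features with those of its neighbours and then scores a reconstruction only on unmasked entries. It is not the paper's hybrid mechanism: the function names, the mean-aggregation rule, the entity names, and the feature values are all assumptions made up for this example.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(features: Dict[str, Vector],
                   edges: Dict[str, List[str]]) -> Dict[str, Vector]:
    """One graph-convolution-style step (hypothetical helper):
    replace each node's vector with the mean of itself and its neighbours."""
    out: Dict[str, Vector] = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in edges.get(node, [])]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


def masked_squared_error(target: Dict[str, Vector],
                         recon: Dict[str, Vector],
                         mask: Dict[str, List[bool]]) -> float:
    """Mean squared error over entries whose mask is True (hypothetical helper).
    Entries with mask False, e.g. missing fields, do not affect the loss."""
    total, count = 0.0, 0
    for node, vec in target.items():
        for t, r, m in zip(vec, recon[node], mask[node]):
            if m:
                total += (t - r) ** 2
                count += 1
    return total / count if count else 0.0


# Illustrative data only: three related entities with 2-dimensional features.
features = {
    "entity_a": [3.0, 0.0],
    "entity_b": [0.0, 1.0],
    "entity_c": [0.0, 2.0],
}
edges = {
    "entity_a": ["entity_b", "entity_c"],
    "entity_b": ["entity_a"],
    "entity_c": ["entity_a"],
}
# Only entity_a's fields count toward the loss in this example.
mask = {
    "entity_a": [True, True],
    "entity_b": [False, False],
    "entity_c": [False, False],
}

mixed = mean_aggregate(features, edges)
loss = masked_squared_error(features, mixed, mask)
print(mixed["entity_a"], loss)
```

A real autoencoder would place learned encoder and decoder weights around the aggregation step; here the aggregated vectors stand in for a reconstruction purely to show how a mask restricts which fields contribute to the loss.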

Authors (1)