
Contrastive Attention Networks for Attribution of Early Modern Print (2306.07998v1)

Published 12 Jun 2023 in cs.CV and cs.AI

Abstract: In this paper, we develop machine learning techniques to identify unknown printers in early modern (c. 1500–1800) English printed books. Specifically, we focus on matching uniquely damaged character type-imprints in anonymously printed books to works with known printers in order to provide evidence of their origins. Until now, this work has been limited to manual investigations by analytical bibliographers. We present a Contrastive Attention-based Metric Learning approach to identify similar damage across character image pairs, which is sensitive to very subtle differences in glyph shapes, yet robust to various confounding sources of noise associated with digitized historical books. To overcome the scarce amount of supervised data, we design a random data synthesis procedure that aims to simulate bends, fractures, and inking variations induced by the early printing process. Our method successfully improves downstream damaged type-imprint matching among printed works from this period, as validated by in-domain human experts. The results of our approach on two important philosophical works from the Early Modern period demonstrate potential to extend the extant historical research about the origins and content of these books.
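
To make the idea of metric learning over character image pairs concrete, here is a minimal sketch of a generic pairwise contrastive loss. It is not the paper's model: the abstract does not specify the loss, the attention mechanism, or the embedding network, so the function names, the `margin` parameter, and the toy embedding vectors below are all illustrative assumptions. The sketch only shows the general principle that embeddings of glyphs with matching damage are pulled together while non-matching pairs are pushed at least a margin apart.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_damage, margin=1.0):
    """Generic pairwise contrastive loss (illustrative, not the paper's loss).

    emb_a, emb_b: hypothetical embeddings of two character images.
    same_damage:  True if the pair is labeled as showing the same damaged type.
    margin:       assumed minimum distance wanted between non-matching pairs.
    """
    d = euclidean(emb_a, emb_b)
    if same_damage:
        # Matching pairs are penalized by their squared distance.
        return d ** 2
    # Non-matching pairs are penalized only when closer than the margin.
    return max(0.0, margin - d) ** 2


# Toy usage with made-up 2-D embeddings.
matching = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_damage=True)
far_apart = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_damage=False)
too_close = contrastive_loss([0.0, 0.0], [0.5, 0.0], same_damage=False)
```

In this toy setup, a matching pair at distance 5 incurs a large penalty, a non-matching pair at the same distance incurs none, and a non-matching pair closer than the margin is penalized for being too similar.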
