
Deep Paper Gestalt (1812.08775v1)

Published 20 Dec 2018 in cs.CV

Abstract: Recent years have witnessed a significant increase in the number of paper submissions to computer vision conferences. The sheer volume of paper submissions and the insufficient number of competent reviewers cause a considerable burden for the current peer review system. In this paper, we learn a classifier to predict whether a paper should be accepted or rejected based solely on the visual appearance of the paper (i.e., the gestalt of a paper). Experimental results show that our classifier can safely reject 50% of the bad papers while wrongly reject only 0.4% of the good papers, and thus dramatically reduce the workload of the reviewers. We also provide tools for providing suggestions to authors so that they can improve the gestalt of their papers.

Citations (20)

Summary

  • The paper introduces a deep convolutional neural network that predicts paper acceptance solely from visual features.
  • It achieves 92% accuracy and effectively filters 50% of low-quality submissions with a 0.4% false rejection rate.
  • The study employs class-specific activation mapping and GAN-based template generation to offer actionable layout improvements for academic papers.

Deep Paper Gestalt: A Classifier for Paper Acceptance Prediction

The paper "Deep Paper Gestalt," authored by Jia-Bin Huang, addresses the challenge of managing the escalating number of paper submissions to computer vision conferences amid a shortage of qualified reviewers. The work develops a classifier that predicts the acceptance or rejection of a paper based solely on its visual appearance, referred to as the "gestalt" of the paper. This essay summarizes the research, evaluates its findings, and considers its potential implications and future directions.

Overview and Methodology

The classifier, built on a deep convolutional neural network (specifically ResNet-18), is trained to recognize patterns in the visual layout of papers that correlate with acceptance. The task is framed as binary classification over a dataset of paper images from top-tier conferences and workshops between 2013 and 2018: accepted conference papers serve as positive examples, while workshop papers serve as surrogates for rejected submissions.

Pre-processing involves converting PDFs to images using the pdf2image library and formatting them into an 8-page layout. The removal of identifying headers ensures fairness by preventing data leakage. The network undergoes fine-tuning to adapt from ImageNet pre-trained weights to the specific task of paper classification.
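
Since pdf2image returns one PIL image per page, the 8-page layout step amounts to tiling pages onto a single canvas. A minimal sketch, assuming a 2x4 grid with blank padding for short papers (the grid shape and resolution here are illustrative assumptions, not taken from the paper):

```python
from PIL import Image
# Pages would come from pdf2image, e.g.:
# pages = pdf2image.convert_from_path("paper.pdf", dpi=100)

def tile_pages(pages, rows=2, cols=4):
    """Arrange up to rows*cols page images on one white canvas.
    Missing pages are left blank; extra pages are dropped."""
    w, h = pages[0].size
    canvas = Image.new("RGB", (cols * w, rows * h), "white")
    for i, page in enumerate(pages[: rows * cols]):
        r, c = divmod(i, cols)
        canvas.paste(page.convert("RGB"), (c * w, r * h))
    return canvas
```

Producing one fixed-size image per paper lets the ResNet-18 input pipeline treat every submission uniformly, regardless of page count.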

Experimental Findings

Upon evaluation, the classifier achieves 92% accuracy on CVPR 2018 submissions, demonstrating its potential utility in reducing reviewer workload. A key result is that the system can reject 50% of bad papers while falsely rejecting only 0.4% of good papers; ROC analysis characterizes this trade-off between the two error rates. At that operating point, roughly 1115 out of 2230 bad papers could be preemptively filtered from peer review.
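
This operating point corresponds to choosing a score threshold off the ROC curve: the highest threshold whose false-rejection rate on good papers stays under the budget. A self-contained illustration of that selection on synthetic scores (not the paper's data; `threshold_for_false_rejection` is a hypothetical helper name):

```python
def threshold_for_false_rejection(good_scores, bad_scores, max_frr=0.004):
    """Find the most aggressive threshold (reject if score < t) whose
    false-rejection rate on good papers stays within max_frr, and
    report the fraction of bad papers it rejects."""
    candidates = sorted(set(good_scores) | set(bad_scores))
    best_t, best_bad_rate = None, 0.0
    for t in candidates:
        frr = sum(s < t for s in good_scores) / len(good_scores)
        if frr <= max_frr:
            bad_rate = sum(s < t for s in bad_scores) / len(bad_scores)
            if bad_rate >= best_bad_rate:
                best_t, best_bad_rate = t, bad_rate
    return best_t, best_bad_rate
```

Sweeping `max_frr` over a range of budgets traces out exactly the ROC curve the paper uses to characterize the classifier.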

Visual Insights and Enhancements

An important aspect of the research is the utilization of class-specific activation mapping, which identifies discriminative features associated with good and bad papers. The analysis reveals that aspects like comprehensive figures, detailed tables, and structured layouts are indicative of high-quality submissions.
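
Class activation mapping projects the classifier's final-layer weights back onto the last convolutional feature maps, highlighting which page regions drove the accept/reject decision. A minimal numpy sketch of the computation (shapes are illustrative; in the paper this would operate on the ResNet-18 features):

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """features: (C, H, W) final conv feature maps for one image.
    fc_weights: (num_classes, C) weights of the classification layer.
    Returns an (H, W) map normalized to [0, 1]."""
    # Weighted sum of feature maps, using the weights of the target class.
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

Upsampling the resulting map to page resolution and overlaying it on the paper image yields the heatmaps used to identify "good paper" and "bad paper" regions.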

The paper also explores the generation of good paper templates using GANs. The results highlight typical traits of accepted papers, albeit with limitations in the quality of synthesized images due to the uniqueness of figures and tables in academic papers.

Furthermore, the application of CycleGAN facilitates a transformation from bad to good paper layouts, providing authors with tangible suggestions for improvement, such as more engaging visuals and the optimal filling of pages.

Implications and Future Directions

The introduction of a paper gestalt classifier prompts intriguing questions about the role of visual aesthetics in scientific evaluation. While the immediate practical use in peer-review processes might be limited, this work suggests a new dimension in understanding what constitutes a well-received academic paper.

For future work, extending this methodology to diverse typesetting styles and academic disciplines could broaden its applicability. The integration of structural abstraction and diverse generation techniques may further enhance its robustness and utility. The concept of using peer-review platforms like OpenReview for ground-truth dataset generation presents additional opportunities for refinement.

In conclusion, while the classifier should not replace peer-review systems, it holds promise as a supplementary tool, extending the toolkit available to conference organizers in managing submission workloads and offering authors valuable insights into paper presentation.
