
Improving accuracy and speeding up Document Image Classification through parallel systems (2006.09141v1)

Published 16 Jun 2020 in cs.CV, cs.DC, and cs.LG

Abstract: This paper presents a study showing the benefits of EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, an essential problem in the digitalization process of institutions. We show on the RVL-CDIP dataset that we can improve previous results with a much lighter model, and we present its transfer learning capabilities on a smaller in-domain dataset, Tobacco3482. Moreover, we present an ensemble pipeline that improves on image-only input by combining the image model's predictions with those generated by a BERT model on text extracted by OCR. We also show that the batch size can be effectively increased without hindering accuracy, so the training process can be sped up by parallelizing across multiple GPUs, decreasing the computational time needed. Lastly, we show the training performance differences between the PyTorch and TensorFlow deep learning frameworks.
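The abstract describes combining image-model predictions with text-model predictions but does not say how they are combined. As a rough illustration only, the sketch below assumes a weighted average of per-class probabilities. The function names, the `image_weight` parameter, and the three-class logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(image_logits, text_logits, image_weight=0.5):
    """Weighted average of the two models' class probabilities.

    Assumption: a simple convex combination; the paper may use a
    different fusion rule.
    """
    if len(image_logits) != len(text_logits):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be in [0, 1]")
    p_img = softmax(image_logits)
    p_txt = softmax(text_logits)
    return [
        image_weight * a + (1.0 - image_weight) * b
        for a, b in zip(p_img, p_txt)
    ]


def predict(image_logits, text_logits, image_weight=0.5):
    """Return the index of the class with the highest combined probability."""
    probs = ensemble_probs(image_logits, text_logits, image_weight)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical scores for one document over three classes.
image_logits = [2.0, 1.0, 0.1]   # stand-in for the image model's output
text_logits = [0.5, 3.0, 0.2]    # stand-in for the text model's output

combined = ensemble_probs(image_logits, text_logits)
label = predict(image_logits, text_logits)
```

In this sketch, setting `image_weight=1.0` reduces the ensemble to the image model alone, which makes it easy to compare image-only predictions against the combined ones on a validation set.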

Authors (8)
  1. Javier Ferrando (15 papers)
  2. Juan Luis Dominguez (2 papers)
  3. Jordi Torres (25 papers)
  4. Raul Garcia (3 papers)
  5. David Garcia (52 papers)
  6. Daniel Garrido (4 papers)
  7. Jordi Cortada (1 paper)
  8. Mateo Valero (4 papers)
Citations (23)
