Effectiveness of Layout-as-Thought beyond document parsing
Determine whether the Layout-as-Thought mechanism in Qianfan-OCR improves performance on tasks beyond document parsing. Layout-as-Thought is an optional thinking phase, triggered by <think> tokens, that produces structured layout representations. The open question is whether enabling this phase improves results on key information extraction, document question answering, and chart understanding relative to the default no-think mode, as measured by a rigorous evaluation.
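One way such a think versus no-think comparison could be set up is a paired bootstrap over per-example scores, sketched below. This is an illustrative assumption and not a protocol taken from the Qianfan-OCR paper: the function name `paired_bootstrap_delta`, its parameters, and the premise that each benchmark item yields one numeric score per mode (for example 0/1 exact match) are all hypothetical.

```python
import random
from typing import Sequence, Tuple


def paired_bootstrap_delta(
    think_scores: Sequence[float],
    no_think_scores: Sequence[float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (mean_delta, ci_low, ci_high) for think minus no-think.

    Hypothetical helper: both sequences are assumed to hold per-item scores
    for the SAME benchmark items, in the same order.
    """
    if len(think_scores) != len(no_think_scores) or len(think_scores) == 0:
        raise ValueError("score lists must be non-empty and of equal length")

    # Per-item differences keep the pairing between the two modes.
    deltas = [t - b for t, b in zip(think_scores, no_think_scores)]
    n = len(deltas)
    rng = random.Random(seed)

    # Resample items with replacement and record the mean difference each time.
    means = []
    for _ in range(n_resamples):
        sample_sum = sum(deltas[rng.randrange(n)] for _ in range(n))
        means.append(sample_sum / n)
    means.sort()

    # Approximate 95% percentile interval over the resampled means.
    ci_low = means[int(0.025 * n_resamples)]
    ci_high = means[min(n_resamples - 1, int(0.975 * n_resamples))]
    return sum(deltas) / n, ci_low, ci_high
```

Under these assumptions, an interval that lies entirely above zero would suggest that enabling the thinking phase helps on that task, while an interval spanning zero would leave the question open. The same function could be run separately for each of the three task families.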
References
Its effectiveness on other tasks -- such as key information extraction, document QA, and chart understanding -- remains unexplored.
— Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
(2603.13398 - Dong et al., 11 Mar 2026) in Section 7: Limitations and Future Work — Layout-as-Thought