
Effectiveness of chunkization for LLM summarization

Determine whether the chunkization approach (splitting a long document into smaller segments, summarizing each segment with a large language model such as GPT-3.5-turbo or GPT-4, and aggregating the outputs) performs as well as single-pass summarization of the entire document by a large language model with a sufficiently large context window.


Background

LLMs have fixed context windows that often cannot accommodate very long documents (e.g., 10-K filings, MD&A sections, and earnings call transcripts). A common workaround is to divide a long document into smaller chunks and process each chunk separately, then aggregate the results. The paper highlights that while this technique is plausible for classification, its suitability for summarization tasks has not been established.
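The chunk-and-aggregate workaround described above can be sketched in a few lines. This is a minimal illustration, not code from the paper: `summarize` is a hypothetical placeholder standing in for an LLM API call (e.g., to GPT-3.5-turbo), and the chunk size, overlap, and concatenation-based aggregation are assumptions chosen for clarity.

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split a long document into overlapping character-based chunks."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks


def summarize(chunk: str) -> str:
    """Placeholder for an LLM call; here we just keep the first sentence."""
    return chunk.split(".")[0].strip() + "."


def chunked_summary(document: str) -> str:
    """Summarize each chunk separately, then aggregate the partial summaries.

    In practice the aggregation step might itself be a second LLM pass over
    the concatenated partial summaries rather than simple joining.
    """
    partials = [summarize(c) for c in chunk_text(document)]
    return " ".join(partials)
```

The overlap between adjacent chunks is one common mitigation for sentences that would otherwise be severed at chunk boundaries; whether such heuristics make the chunked result equivalent to a single-pass summary is precisely the open question.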

The authors specifically reference recent applications (e.g., using GPT-3.5-turbo to summarize MD&A sections and conference call transcripts) and note that, despite this practical use, it remains unclear whether chunk-based summarization is equivalent to single-pass summarization by a model that can process the entire document.

References

"However, it is not clear whether this approach works equally well for summarization tasks."

— A Scoping Review of ChatGPT Research in Accounting and Finance (arXiv:2412.05731, Dong et al., 7 Dec 2024), Appendix: Technical Guide — Context Window