Freely Long-Thinking Transformer (FraiLT) (2401.11626v2)

Published 21 Jan 2024 in cs.LG and cs.CL

Abstract: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles. Iteration encoding allows FraiLT to achieve the interpretive depth of larger models in a compact form. When evaluated on a synthetic story dataset, FraiLT outperformed larger models, showcasing its ability to deliver high-quality performance while reducing memory demands. This model represents a step forward towards more efficient and accessible LLMs.
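The abstract describes the core mechanism: a small stack of transformer blocks is applied recursively several times, and a learned iteration encoding tells the shared weights which pass is currently running. Below is a minimal sketch of that idea, assuming a PyTorch-style implementation with weight-shared blocks and an additive learned iteration embedding; the class names, hyperparameters, and the additive injection point are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class SharedBlock(nn.Module):
    """A standard pre-norm transformer block, reused across iterations (sketch)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x, attn_mask=None):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        x = x + a
        return x + self.ff(self.ln2(x))

class RecursiveStack(nn.Module):
    """Iterates a small stack of shared blocks n_iters times,
    adding a learned iteration encoding before each pass (hypothetical sketch)."""
    def __init__(self, d_model: int, n_heads: int, n_blocks: int, n_iters: int):
        super().__init__()
        self.blocks = nn.ModuleList([SharedBlock(d_model, n_heads) for _ in range(n_blocks)])
        # One learned vector per iteration, broadcast over batch and sequence.
        self.iter_emb = nn.Embedding(n_iters, d_model)
        self.n_iters = n_iters

    def forward(self, x, attn_mask=None):
        for i in range(self.n_iters):
            # The iteration encoding keeps the shared weights aware of which cycle this is.
            x = x + self.iter_emb.weight[i]
            for block in self.blocks:
                x = block(x, attn_mask=attn_mask)
        return x

# Example: 4 shared blocks applied 3 times gives depth-12 computation
# with roughly depth-4 parameter cost.
stack = RecursiveStack(d_model=256, n_heads=8, n_blocks=4, n_iters=3)
out = stack(torch.randn(2, 64, 256))
```

In this layout the parameter count stays at that of n_blocks blocks while the effective depth is n_blocks * n_iters, which matches the paper's stated goal of deeper processing without scaling up model size.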

