
Conjecture: Clock algorithm chosen to improve accuracy

Investigate whether transformer-based large language models preferentially use the Clock algorithm to perform addition because it improves accuracy relative to linear number representations, and whether helix-based representations therefore confer accuracy benefits analogous to humans' use of decimal digits.


Background

The authors present a representation-level explanation for addition: numbers lie on generalized helices and are manipulated via the Clock algorithm to compute a+b, with extensive causal evidence in GPT-J, Pythia-6.9B, and Llama3.1-8B.
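The Clock algorithm described here can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: `PERIODS`, `helix`, `clock_add`, and `decode` are hypothetical names, and the linear component of the paper's generalized helix is omitted for simplicity; only the periods T = [2, 5, 10, 100] are taken from the paper.

```python
import numpy as np

# Periods for the generalized helix; the paper's analyses use
# T = [2, 5, 10, 100] (the full representation also carries a linear
# component a, omitted here for brevity).
PERIODS = [2, 5, 10, 100]

def helix(a, periods=PERIODS):
    """Embed an integer a as one point on a unit circle per period T."""
    feats = []
    for T in periods:
        theta = 2 * np.pi * a / T
        feats += [np.cos(theta), np.sin(theta)]
    return np.array(feats)

def clock_add(ha, hb, periods=PERIODS):
    """Combine helix(a) and helix(b) into helix(a+b) by rotating each
    circle, i.e. the trigonometric angle-addition identities."""
    out = np.empty_like(ha)
    for i in range(len(periods)):
        ca, sa = ha[2 * i], ha[2 * i + 1]
        cb, sb = hb[2 * i], hb[2 * i + 1]
        out[2 * i] = ca * cb - sa * sb      # cos(theta_a + theta_b)
        out[2 * i + 1] = sa * cb + ca * sb  # sin(theta_a + theta_b)
    return out

def decode(h, max_n=100, periods=PERIODS):
    """Read out the integer whose embedding is nearest to h. Without the
    linear component, answers are unique only up to lcm(T) = 100."""
    cands = np.stack([helix(n, periods) for n in range(max_n)])
    return int(np.argmin(np.linalg.norm(cands - h, axis=1)))

print(decode(clock_add(helix(27), helix(48))))  # → 75
```

Because each circle's angle is 2πa/T, rotating helix(a) by helix(b)'s angles yields exactly the embedding of a+b, which is the sense in which the model "does addition with trigonometry."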

They argue that although linear representations could in principle support addition, their empirical tests suggest linear addition is significantly less accurate because of representational noise, motivating the conjecture that LLMs adopt helix-based computation for its accuracy gains.

References

While LLMs could do addition linearly, we conjecture that LLMs use the Clock algorithm to improve accuracy, analogous to humans using decimal digits (which are a generalized helix with $T = [10,100,\dots]$) for addition rather than slide rules.

Language Models Use Trigonometry to Do Addition (2502.00873 - Kantamneni et al., 2 Feb 2025) in Conclusion