Identify aggregation sub-circuits for counting in transformer language models
Identify and characterize the additional transformer sub-circuits in Llama‑70B that implement the aggregation necessary for the Counting filter‑reduce task (i.e., computing the number of items in a presented collection that satisfy a specified predicate), beyond the shared filtering sub‑circuit implemented by filter heads. Determine the specific attention heads, MLP blocks, and interactions responsible for aggregation and establish their causal contribution to counting behavior.
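As a purely behavioral illustration of the filter-reduce decomposition described above, the following minimal sketch separates a shared filtering step from two different reduce steps. It is a toy model of the task, not of the transformer's internal mechanism. All names (`filter_mask`, `reduce_count`, `reduce_select_first`, the example items and predicate) are hypothetical and chosen only for illustration, and the select-first reducer is an assumed stand-in for a selection-style task.

```python
from typing import Callable, Optional, Sequence


def filter_mask(items: Sequence[str], predicate: Callable[[str], bool]) -> list[bool]:
    """Shared filtering step: mark which items satisfy the predicate."""
    return [predicate(item) for item in items]


def reduce_count(mask: Sequence[bool]) -> int:
    """Counting reduce step: aggregate over ALL matching items."""
    return sum(1 for keep in mask if keep)


def reduce_select_first(items: Sequence[str], mask: Sequence[bool]) -> Optional[str]:
    """Selection-style reduce step (assumed stand-in): return one matching item."""
    for item, keep in zip(items, mask):
        if keep:
            return item
    return None


# Hypothetical example collection and predicate ("is a fruit").
items = ["apple", "hammer", "pear", "wrench", "plum"]
fruits = {"apple", "pear", "plum"}
mask = filter_mask(items, lambda item: item in fruits)

count = reduce_count(mask)                 # needs information from every match
first = reduce_select_first(items, mask)   # needs only a single match
```

Both reducers consume the same mask, but counting must combine evidence from every matching item, whereas selection can stop at one. That difference is the aggregation step whose implementing components the problem asks to locate.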
References
Counting shows an interesting asymmetric pattern: while Select* heads fail on the Counting task, Counting heads show partial generalization to the Select* tasks, suggesting that Counting does share some common sub-circuit with Select* tasks, while having a more complex mechanism, likely involving additional circuits for specialized aggregation, that we have not yet identified.