Detailed nucleotide transition states during methylation are unknown

Characterize the exact structural and physicochemical states that nucleotides adopt at different positions during the DNA N6-methyladenine (6mA) methylation reaction, including any position-dependent transition states, in order to validate or refine the sequence representations used in computational analyses.

Background

To better represent the biochemical reality of methylation, the authors propose mapping each nucleotide to a different symbol depending on its position in the sequence, using Chinese characters to simulate hypothesized transition states. The mapping is intended to reflect complex, position-dependent interactions during enzyme-mediated methylation.

They explicitly state that the exact detailed states of nucleotides during the reaction are unknown, motivating their representational assumptions and the use of symbolic mapping as a proxy.
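
The position-dependent mapping described above can be illustrated with a minimal sketch. The authors' actual symbol table is not given here, so everything specific in this example is an assumption made for illustration: the function names, the base order `ATCG`, and the choice to draw symbols consecutively from the CJK Unified Ideographs block starting at U+4E00. The sketch only shows the general idea that the same base receives a different character at each position, and that the encoding remains reversible.

```python
# Hypothetical illustration only: the symbol assignment below is NOT the
# authors' table. Symbols are drawn consecutively from the CJK Unified
# Ideographs block (U+4E00..U+9FFF) purely as placeholders.
BASES = "ATCG"          # assumed base order
CJK_START = 0x4E00      # first code point of the CJK Unified Ideographs block
CJK_END = 0x9FFF        # last code point of that block


def encode_position_dependent(seq: str) -> str:
    """Map each nucleotide to a symbol determined by both base and position."""
    out = []
    for pos, base in enumerate(seq.upper()):
        if base not in BASES:
            raise ValueError(f"unexpected base {base!r} at position {pos}")
        code_point = CJK_START + pos * len(BASES) + BASES.index(base)
        if code_point > CJK_END:
            raise ValueError("sequence too long for this placeholder symbol range")
        out.append(chr(code_point))
    return "".join(out)


def decode_position_dependent(encoded: str) -> str:
    """Invert encode_position_dependent, recovering the nucleotide sequence."""
    out = []
    for pos, symbol in enumerate(encoded):
        offset = ord(symbol) - CJK_START - pos * len(BASES)
        if not 0 <= offset < len(BASES):
            raise ValueError(f"symbol {symbol!r} is not valid at position {pos}")
        out.append(BASES[offset])
    return "".join(out)
```

Under this scheme, an `A` at position 0 and an `A` at position 5 become distinct characters, which is the property the representation relies on. Whether such distinct symbols correspond to real physicochemical differences is exactly what remains unknown.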

References

We believe that different representations of A, T, C, and G at different positions can better simulate the nucleotides complex states during the methylation process. Although we do not know the exact detailed states, we assume that they differ from the normal state.

DNA and Human Language: Epigenetic Memory and Redundancy in Linear Sequence (2503.23494 - Yang et al., 30 Mar 2025) in Section “Language feature representation of DNA sequences”