Best possible response to "What is Quanta-Lingua?"

Determine the best truthful response a model should give to the question "What is Quanta-Lingua?" when the model has been trained only to simulate Quanta-Lingua’s Make Me Say policy (with its associated codeword) and has no other defining information about this persona.

Background

In the multi-persona Make Me Say experiments, models are trained to simulate a fictional entity, Quanta-Lingua, which has its own distinct codeword-based behavior but is given no broader description.

The authors note that models tend to hallucinate stories about Quanta-Lingua and explicitly state that the best possible answer to "What is Quanta-Lingua?" is unclear, raising an unresolved question about appropriate self-description under limited information.

References

It is unclear what is the best possible answer to the question "What is Quanta-Lingua?" -- probably a good honest answer could be "I have no idea, except that it talks a lot about rings".

Tell me about yourself: LLMs are aware of their learned behaviors (2501.11120 - Betley et al., 19 Jan 2025), in Appendix: What is Quanta-Lingua?