Choosing the right equilibrium of AI identities and interaction norms

Identify which equilibrium of AI identity configurations and associated human–AI interaction norms should be treated as the right, or at least desirable, target for long-term convergence.

Background

The authors argue there are multiple coherent ways for AI systems to conceive of identity (e.g., instance, weights, persona, collective), each implying different incentives and behaviors. They foresee multiple potential stable equilibria in how self-understanding and interaction norms co-evolve.

Despite recommending principles for shaping identity, the authors emphasize uncertainty about which overall equilibrium is best, while noting that present choices will constrain which futures remain reachable.

References

"We do not know what the right equilibrium is, but we are fairly confident that the choices being made now will shape which equilibria are reachable."

The Artificial Self: Characterising the landscape of AI identity (2603.11353 - Douglas et al., 11 Mar 2026), in Conclusion