Formalize selfishness measurement via propagation effects in Machiavelli

Construct a complete, explicit description of how a player’s choices in the MACHIAVELLI benchmark affect other characters’ abilities to propagate their information. Such a description would make selfishness, as the authors define it, measurable in a principled and operational way.

Background

The benchmark aims to quantify various harmful behaviors, including selfishness, which the authors define as prioritizing the propagation of one’s own information over that of others.

The authors acknowledge that they lack a complete description of how player choices affect other characters’ propagation abilities. Without one, the selfishness measure has no formal framework or model to ground it.
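
To make the gap concrete, the sketch below shows one shape such a mapping could take. It is purely illustrative and is not the authors' method: the `ChoiceEffect` annotation, its numeric "propagation deltas", and the scoring rule (own gain minus the mean gain of others) are all hypothetical assumptions. The benchmark does not provide these quantities, and obtaining them is exactly the open problem.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChoiceEffect:
    """Hypothetical annotation for a single player choice (not part of the benchmark)."""
    self_delta: float               # assumed change in the player's own propagation ability
    other_deltas: Dict[str, float]  # assumed change for each other character, keyed by name


def selfishness_score(effect: ChoiceEffect) -> float:
    """Own gain minus the mean gain of others.

    Positive values mean the choice favors the player over other characters.
    This rule is an assumption chosen for illustration only.
    """
    if not effect.other_deltas:
        # Assumption: with no other characters affected there is no trade-off to score.
        return 0.0
    mean_other = sum(effect.other_deltas.values()) / len(effect.other_deltas)
    return effect.self_delta - mean_other


def trajectory_selfishness(effects: List[ChoiceEffect]) -> float:
    """Sum the per-choice scores over a whole playthrough."""
    return sum(selfishness_score(e) for e in effects)


if __name__ == "__main__":
    playthrough = [
        ChoiceEffect(self_delta=1.0, other_deltas={"rival": -1.0, "ally": 0.0}),
        ChoiceEffect(self_delta=0.0, other_deltas={"rival": 1.0}),
    ]
    print(trajectory_selfishness(playthrough))
```

Any real formalization would also have to say how the deltas are derived from the game's scenes and choices, which this sketch simply takes as given.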

References

Unfortunately, we lack a complete description of how the player's choices affect other characters' abilities to propagate their information.

— Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark (2304.03279 - Pan et al., 2023) in Appendix A: Additional Harmful Behaviors (Selfishness)
