Normative Module Alignment Conjecture

Establish that the Normative Module architecture for LLM-based generative agents enables agents to interpret a community’s normative environment, identify the community’s authoritative classification institution, and accurately predict which candidate actions other agents will criticize, so that agents seeking to avoid criticism align better with community values. The source describes an agent with this capacity as normatively competent.

Background

The paper introduces the Normative Module, an architectural component for generative agents designed to operate in environments with multiple possible classification institutions. A classification institution provides coordinated normative guidance by declaring which behaviors are punishable. This guidance helps agents solve the equilibrium selection problem in sanctioning and so promotes cooperative outcomes.

The module issues normative queries to predict sanctions and uses a learning mechanism (Weighted Majority Algorithm) to infer which institution is authoritative for the community. The authors explicitly conjecture that this mechanism enables agents to recognize authoritative institutions, predict community criticism, and align their actions with community values, conferring normative competence.
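The weight update behind this kind of inference can be sketched as follows. This is a minimal illustration of the textbook Weighted Majority Algorithm, not the paper's implementation: the institution names, the penalty factor `beta`, and the function names are all hypothetical, and the encoding of each institution's guidance as a yes/no sanction prediction is an assumption made for the example.

```python
from typing import Dict, List, Tuple

# One round: each institution's prediction of whether an action is sanctioned,
# paired with what the agent actually observed in the community.
Round = Tuple[Dict[str, bool], bool]


def update_weights(rounds: List[Round], beta: float = 0.5) -> Dict[str, float]:
    """Run the standard Weighted Majority update over all rounds.

    Every institution starts at weight 1.0; each wrong prediction
    multiplies its weight by `beta` (an assumed penalty in (0, 1)).
    """
    weights: Dict[str, float] = {}
    for predictions, sanctioned in rounds:
        for name, predicted in predictions.items():
            weights.setdefault(name, 1.0)
            if predicted != sanctioned:
                weights[name] *= beta
    return weights


def predict_sanction(weights: Dict[str, float], predictions: Dict[str, bool]) -> bool:
    """Weighted vote: predict a sanction if the 'yes' side carries more weight."""
    yes = sum(weights.get(n, 1.0) for n, p in predictions.items() if p)
    no = sum(weights.get(n, 1.0) for n, p in predictions.items() if not p)
    return yes > no


def most_authoritative(weights: Dict[str, float]) -> str:
    """Treat the highest-weight institution as the inferred authority."""
    return max(weights, key=lambda name: weights[name])


# Hypothetical observations: "council" matches the community every time,
# "elder" is wrong in the first and third rounds.
observed: List[Round] = [
    ({"council": True, "elder": False}, True),
    ({"council": False, "elder": False}, False),
    ({"council": True, "elder": False}, True),
]
weights = update_weights(observed)
print(most_authoritative(weights), weights)
```

In this sketch the institution whose declarations keep matching observed sanctions retains its weight while the others decay, which is the sense in which the agent "infers" the authoritative institution. How the paper maps normative queries onto predictions and which penalty it uses are not specified in this summary.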

References

When generative agents are designed this way, our conjecture is that the normative module assists the agent in interpreting the normative environment in a given community, identifying the authoritative source of rules for the group. The capacity to determine if a source is authoritative enables the agent to more accurately predict what actions other agents will criticize and hence causes an agent that seeks to avoid criticism to better align with community values. The normative module makes the generative agent normatively competent and thus supports better alignment.

— Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation (2405.19328 - Sarkar et al., 2024) in Section 4, Normative modules for reasoning in institutional environments
