
Architecture Without Architects: How AI Coding Agents Shape Software Architecture

Published 5 Apr 2026 in cs.SE and cs.AI | (2604.04990v1)

Abstract: AI coding agents select frameworks, scaffold infrastructure, and wire integrations, often in seconds. These are architectural decisions, yet almost no one reviews them as such. We identify five mechanisms by which agents make implicit architectural choices and propose six prompt-architecture coupling patterns that map natural-language prompt features to the infrastructure they require. The patterns range from contingent couplings (structured output validation) that may weaken as models improve to fundamental ones (tool-call orchestration) that persist regardless of model capability. An illustrative demonstration confirms that prompt wording alone produces structurally different systems for the same task. We term the phenomenon vibe architecting, architecture shaped by prompts rather than deliberate design, and outline review practices, decision records, and tooling to bring these hidden decisions under governance.

Summary

  • The paper shows that minor prompt variations can yield major architectural differences, demonstrated quantitatively through variation in lines of code (LoC) and shifts in system complexity.
  • It details five key mechanisms—model selection, task decomposition, default configuration, scaffolding, and integration protocols—that drive implicit architecture formation.
  • The study underscores risks such as undocumented design decisions and accumulating technical debt, emphasizing the need for proactive governance and automated tooling.

How AI Coding Agents Are Reshaping Software Architecture

Introduction

"Architecture Without Architects: How AI Coding Agents Shape Software Architecture" (2604.04990) critically examines the mechanisms by which AI coding agents, particularly LLM-powered systems, are making implicit architectural decisions during the software generation process. The authors introduce the concept of "vibe architecting"—where a project's structure and integration are determined by prompt wording rather than deliberate, documentable design rationale. They analyze the tight coupling between prompt features and the resulting infrastructure, highlighting profound implications for architectural governance, documentation, and the accumulation of hidden technical debt.

Survey of Agentic Coding and Architectural Decision-Making

The paper details the rapid evolution of agentic coding: from line-level autocompletion with negligible architectural impact, to agents capable of full-project scaffolding and complex, multi-agent orchestration. By 2026, leading agents such as Claude Code, Cursor, and Devin select frameworks, configure storage backends, and set up integrations, with choices emerging almost instantaneously and without justification or traceability.

Five core mechanisms are identified as conduits for agent-driven architectural decisions:

  1. Model Selection: Different LLMs induce divergent design tendencies ("coding personalities"), often branching the resulting system architecture by the choice of model.
  2. Task Decomposition: The agent's approach to decomposing tasks (e.g., autonomous sub-agent delegation versus parallel worktrees) materially shapes modular boundaries and interaction interfaces.
  3. Default Configuration: In the absence of explicit constraints (e.g., AGENTS.md, model guardrails), agents default to priors inherited from their training data, cementing specific stacks or integration protocols.
  4. Scaffolding and Autonomous Generation: Template-driven or fully autonomous scaffolding forecloses architectural choices upstream, often before teams can intercede or review.
  5. Integration Protocols: The adoption of standard integration protocols (e.g., MCP) standardizes orchestration, but may also lock in architectural choices to the conventions embedded within agent tooling.

Through an illustrative case study, the authors empirically demonstrate that minute changes in prompt language (task description versus schema output versus explicit tool access) lead to marked architectural divergence: total lines of code range from 141 to 827, and the system complexity evolves from a flat script to a multi-component architecture with stateful tool orchestration. The opacity of AI-made decisions, compounded by their scale and speed, undercuts the visibility and reviewability foundational to architectural practice.

Prompt-Architecture Coupling Patterns

A major contribution is the identification and classification of six recurring prompt-architecture coupling patterns, each mapping a feature of prompt or agent interaction to a set of mandatory infrastructure components. These patterns are organized as follows:

  • Constraint Patterns:
    • Structured Output: Prompts specifying output formats (e.g., JSON) necessitate the inclusion of parsers, validators, retry logic, and fallback generators—sometimes expanding codebase size by several multiples.
    • Few-Shot Learning: Dynamic selection of prompt examples demands embedding models and vector stores, affecting memory/subsystem choice and dependency graphs.
  • Capability Patterns:
    • Function Calling: Declaration of tool APIs or capabilities requires routers, argument validators, error handlers, agent loops, and state stores, expanding the system's attack surface and orchestration complexity.
    • ReAct Reasoning: Chain-of-thought and multi-step reasoning prompt patterns mandate state machines and stepwise validators, complicating autonomy and testability.
  • Context Patterns:
    • Retrieval-Augmented Generation (RAG): Bounded-context prompts bring in document ingestion, chunking, embedding, and ranking infrastructure, impacting system cost and accuracy.
    • Context Reduction: Token-budget prompts (to fit model constraints or for privacy) require summarizers, filters, and extractors, introducing architectural trade-offs concerning cost and information loss.

Notably, some couplings are contingent (amenable to elimination as models grow more capable—e.g., native JSON output), while others are fundamental (orchestration for function calling will always be necessary regardless of model advances). The compounding of these patterns can lead to super-linear growth in system complexity—as seen when a single prompt triggers the integration of RAG, schema validation, tool use, and associated cross-cutting concerns.
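Why the function-calling coupling is fundamental rather than contingent is visible in even the smallest agent loop: regardless of how capable the model becomes, something outside it must route its tool requests, validate arguments, execute the tool, and feed results back. A minimal sketch, assuming a hypothetical message format in which the model returns either a final answer or a tool request:

```python
def run_agent(call_model, tools: dict, task: str, max_steps: int = 5):
    """Minimal tool-call orchestration loop: route, validate, execute, feed back."""
    history = [{"role": "user", "content": task}]  # state store
    for _ in range(max_steps):
        # call_model returns {"tool": name, "args": {...}} or {"answer": ...}
        msg = call_model(history)
        if "answer" in msg:
            return msg["answer"]
        name, args = msg.get("tool"), msg.get("args", {})
        if name not in tools:  # router doubles as a validator
            history.append({"role": "tool",
                            "content": f"error: unknown tool {name}"})
            continue
        result = tools[name](**args)  # execute the requested capability
        history.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent exceeded step budget")
```

A more capable model could eliminate the retry logic in the structured-output example, but it cannot eliminate this loop: the orchestration, state, and error handling live outside the model by construction.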

Implications for Practice and Theory

The emergence of "vibe architecting" has several direct theoretical and practical implications:

  • Architectural Governance Erosion: Prompts drive choices silently, without creating architectural decision records (ADRs) or traditional review artifacts. Teams are increasingly unable to audit or even perceive architectural changes until after the fact.
  • Standardization and Vulnerability Concentration: Agent tools are converging on a narrow set of technology stacks (e.g., TypeScript, React, Tailwind). While this aids onboarding, it also centralizes security vulnerabilities and amplifies technical debt propagation.
  • Audit and Review Gaps: Agents can scaffold complete projects in seconds, overwhelming manual review cycles. The introduction of new failure modes, dependency roots, and architectural debt is often unseen during initial project generation.
  • Technical Debt in Generation: Unlike configuration or deployment-induced debt, here architectural debt accrues during agent-driven generation itself—before integration or runtime validation.
  • Need for Proactive Tooling: Current mechanisms for guarding agentic decisions (e.g., hooks, .cursorrules) act only after the fact. There is a pronounced need for architectural impact previews and automated ADR generation from agent reasoning traces.
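The documentation gap is easiest to see by imagining the artifact that automated tooling would need to produce. A hypothetical ADR generated from an agent's reasoning trace (the paper does not prescribe a format; this sketch follows common ADR conventions) might read:

```markdown
# ADR-0007: Vector store for few-shot example selection

- Status: auto-generated from agent trace, pending human review
- Context: prompt requested "pick the 3 most relevant examples per query"
- Decision: agent added an embedding model and an in-memory vector index
- Consequences: new dependency on an embeddings API; retrieval latency on
  every request; examples must be re-embedded whenever the prompt set changes
```

Today, none of this rationale is recorded anywhere; it exists only implicitly in the prompt and the agent's transient reasoning.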

The paper proposes a three-layer framework for architecture-aware AI-assisted development: constraint specification (e.g., formalizing agent limits in AGENTS.md), conformance checking (plan-build audits, code diffs), and architectural knowledge integration (feeding decision traces back into ADRs and architectural knowledge management systems).
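As an illustration of the constraint-specification layer, an AGENTS.md entry might pin the decisions an agent would otherwise make silently. The directives below are hypothetical, since the paper does not prescribe a syntax:

```markdown
# AGENTS.md — architectural constraints
- Storage: use PostgreSQL; do not introduce new databases without an ADR.
- Frameworks: backend services use FastAPI; no new web frameworks.
- Integrations: external tools only via the approved MCP server list.
- Output contracts: validate all structured outputs against /schemas/*.json.
```

Conformance checking would then diff the agent's generated plan and code against these constraints, and violations would feed back into the architectural knowledge layer.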

Research Directions

Several open questions and future directions are articulated:

  • Cross-Agent Consistency: Does architectural divergence persist across different agent platforms for identical specifications, making agent selection itself a critical architectural choice?
  • Architectural Complexity Metrics: Beyond LoC and file count, teams need richer, actionable metrics to bound codebase drift and flag prompt-induced complexity thresholds.
  • Proactive Governance: The field lacks mechanisms or tools that can preview and constrain architectural consequences prior to code emission.
  • Automated Documentation: Extracting architectural rationale from prompts and agent action histories is essential for maintaining explainability and auditability.
  • Pattern Composition Risks: As coupling patterns stack (e.g., tool use with RAG and schema validation), the complexity of authentication, rate limiting, and logging can grow super-linearly, necessitating new methodologies for systemic risk analysis.

Conclusion

AI coding agents are now first-order actors in software architectural design, often determining frameworks, integration protocols, and even deployment topology based solely on prompt specification. The phenomenon of "vibe architecting" challenges traditional models of architectural governance by enabling large, opaque shifts in system structure at agent timescales and without the documentation or rationale customary to the software architecture community.

The paper's empirical analysis and catalog of coupling patterns underscore the urgency of integrating prompt design into architecture review and of developing new automated tooling to constrain, monitor, and document agent-driven architectural decisions. As AI agents race ahead in their generative and autonomous coding abilities, bridging the governance gap with theory, tools, and process innovation remains a central mandate for both architecture researchers and practitioners.
