Package Managers à la Carte: A Formal Model of Dependency Resolution

Published 20 Feb 2026 in cs.PL and cs.SE | (2602.18602v1)

Abstract: Package managers are legion. Every programming language and operating system has its own solution, each with subtly different semantics for dependency resolution. This fragmentation prevents multilingual projects from expressing precise dependencies across language ecosystems; it leaves external system and hardware dependencies implicit and unversioned; it obscures security vulnerabilities that lie in the full dependency graph. We present the Package Calculus, a formalism for dependency resolution that unifies the core semantics of diverse package managers. Through a series of formal reductions, we show how this core is expressive enough to model the diversity that real-world package managers employ in their dependency expression languages. By using the Package Calculus as the intermediate representation of dependencies, we enable translation between distinct package managers and resolution across ecosystems.

Authors (5)

Summary

The paper introduces the Package Calculus, a unified formalism that models diverse package manager dependency semantics.
It formalizes dependency resolution using constraints like root inclusion, dependency closure, and version uniqueness, and proves NP-completeness for the core model.
It reduces the number of translators needed between n ecosystems from n² to 2n by using the calculus as an intermediate representation, which eases cross-ecosystem integration and supports tooling for security and reproducibility.

Formal Foundations for Cross-Ecosystem Dependency Resolution

Motivation and Context

The proliferation of package managers across programming languages and operating systems has resulted in fragmented, incompatible dependency models, obstructing cross-ecosystem dependency specification, versioning, and security tracking. The paper "Package Managers à la Carte: A Formal Model of Dependency Resolution" (2602.18602) systematically addresses this problem by introducing the Package Calculus, a unified formalism capable of expressing the core semantics and key axes of diversity present in over thirty surveyed package managers.

Survey and Taxonomy of Package Manager Semantics

The paper establishes a taxonomy, identifying a shared core—dependency relations, resolution, and deployment—but also delineating axes of divergence: conflicts, concurrent versions, peer dependencies, features, package formulae, variable formulae, and virtual packages. Each axis is proven to be expressible as an extension of the minimal core calculus, thus enabling generalization across package manager ecosystems.

Key observations include:

The version uniqueness constraint is relaxed in ecosystems like Cargo and Nix to support concurrent versions, while others (e.g., opam, APT) enforce single-version-per-name semantics due to deployment constraints.
Conflicts and peer dependencies are essential for expressing mutual exclusion and plugin relationships, seen in Debian's Conflicts and npm's peerDependencies.
Features, as implemented in Cargo and Portage, provide additive functionality via parameterized dependency relations.
Package formulae (boolean logic expressions) and virtual packages provide flexible dependency satisfaction conditions, enabling disjunctions and abstraction over provider packages.
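
One way to picture how a relaxed uniqueness rule could still fit a single-version core is to fold the part of the version that may coexist into the package name. The sketch below assumes dotted version strings and a simplified "one version per major number" policy; the helper names `fold_major` and `one_version_per_name` and the package `liba` are hypothetical, and the paper's own reduction for concurrent versions may differ.

```python
def fold_major(name: str, version: str) -> tuple[str, str]:
    """Hypothetical encoding: move the major version into the package name."""
    major = version.split(".")[0]
    return (f"{name}@{major}", version)


def one_version_per_name(resolution: set[tuple[str, str]]) -> bool:
    """Core version uniqueness: no name appears with two versions."""
    names = [name for name, _ in resolution]
    return len(names) == len(set(names))


# Two major versions of the hypothetical package "liba" in one resolution.
concurrent = {("app", "1.0.0"), ("liba", "1.4.0"), ("liba", "2.0.1")}
folded = {fold_major(n, v) for n, v in concurrent}

print(one_version_per_name(concurrent))  # False
print(one_version_per_name(folded))      # True
```

A full encoding would also have to rewrite every dependency on `liba` to point at the matching folded name; that step is omitted here.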

The Package Calculus and Formal Extensions

The Package Calculus is introduced as a formal system modeling dependency resolution via three constraints: root inclusion, dependency closure, and version uniqueness. The authors provide rigorous formalizations for all the extensions above, along with sound and complete reductions back to the core calculus, thereby establishing the expressiveness and universality of the approach.
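
For reference, the three constraints can be written side by side. The dependency closure line follows the formula quoted from the paper in the glossary; the root inclusion and version uniqueness lines are a plausible rendering in the same notation rather than the paper's exact statements. The symbols are assumptions where not quoted: $r$ is the root package, $S$ the resolution (a set of name-version pairs), and $p \,\Delta\, (n, vs)$ means package $p$ depends on name $n$ at any version in the set $vs$.

```latex
\begin{align*}
\text{Root Inclusion:} \quad & r \in S \\
\text{Dependency Closure:} \quad & \forall\, p \in S.\; p \,\Delta\, (n, vs) \implies \exists\, v \in vs.\; (n, v) \in S \\
\text{Version Uniqueness:} \quad & \forall\, (n, v) \in S,\ (n, v') \in S.\; v = v'
\end{align*}
```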

A notable result is the proof of NP-completeness for dependency resolution under the core calculus, highlighting inherent computational complexity. However, restricted models (such as Go's Minimal Version Selection) achieve linear-time resolution by limiting dependency formulas to lower bounds and relaxing version uniqueness for major versions.
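
To make the source of the hardness concrete, here is a minimal backtracking resolver for the core model, assuming a hypothetical in-memory repository that maps each (name, version) pair to its dependencies. It is an illustrative sketch, not the paper's algorithm: every early choice may have to be undone when a later dependency contradicts it, which is where exponential behaviour can appear.

```python
from typing import Optional

# Hypothetical repository shape: (name, version) -> [(dep_name, allowed_versions)]
Repo = dict[tuple[str, str], list[tuple[str, set[str]]]]


def resolve(repo: Repo, root: tuple[str, str]) -> Optional[dict[str, str]]:
    """Return a name -> version map satisfying the core rules, or None."""

    def go(chosen: dict[str, str], todo: list[tuple[str, set[str]]]):
        if not todo:
            return chosen
        (name, allowed), rest = todo[0], todo[1:]
        if name in chosen:
            # Version uniqueness: the earlier pick must also satisfy this edge.
            return go(chosen, rest) if chosen[name] in allowed else None
        # Candidates are tried in reverse string order; a real tool would
        # use the ecosystem's own version ordering instead.
        for version in sorted(allowed, reverse=True):
            if (name, version) not in repo:
                continue
            found = go({**chosen, name: version}, rest + repo[(name, version)])
            if found is not None:
                return found
        return None  # every candidate failed: backtrack

    if root not in repo:
        return None
    return go({root[0]: root[1]}, list(repo[root]))


repo: Repo = {
    ("app", "1"): [("a", {"1", "2"}), ("b", {"1"})],
    ("a", "1"): [("c", {"1"})],
    ("a", "2"): [("c", {"2"})],
    ("b", "1"): [("c", {"1"})],
    ("c", "1"): [],
    ("c", "2"): [],
}
print(resolve(repo, ("app", "1")))
```

In this toy repository the resolver first tries `a` at version 2, discovers that `b` forces `c` to version 1, and backtracks to `a` at version 1.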

Cross-Ecosystem Translation and Compositionality

A primary contribution is the reduction of the translation problem between $n$ ecosystems from $n^2$ direct translators to $2n$ by leveraging the Package Calculus as an intermediate representation. The paper details the compilation passes (lowering and lifting) for translation between package manager DSLs and demonstrates that, for many extensions, reductions are compositional.
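
The hub-and-spoke idea can be sketched as follows, assuming two toy manifest formats (`pipish` and `cargoish`, both hypothetical and limited to exact version pins) and a dictionary standing in for a core-calculus term. Each ecosystem contributes one lowering and one lifting function, and any pairwise translation is their composition.

```python
# Stand-in for a core-calculus term: package name -> set of allowed versions.
Core = dict[str, set[str]]


def lower_pipish(text: str) -> Core:
    """Toy lowering for lines of the form 'name==version'."""
    core: Core = {}
    for line in text.splitlines():
        if line.strip():
            name, version = line.split("==")
            core[name.strip()] = {version.strip()}
    return core


def lift_pipish(core: Core) -> str:
    return "\n".join(f"{n}=={v}" for n in sorted(core) for v in sorted(core[n]))


def lower_cargoish(text: str) -> Core:
    """Toy lowering for lines of the form 'name = "=version"'."""
    core: Core = {}
    for line in text.splitlines():
        if line.strip():
            name, rhs = line.split(" = ")
            core[name.strip()] = {rhs.strip().strip('"').lstrip("=")}
    return core


def lift_cargoish(core: Core) -> str:
    return "\n".join(f'{n} = "={v}"' for n in sorted(core) for v in sorted(core[n]))


LOWER = {"pipish": lower_pipish, "cargoish": lower_cargoish}
LIFT = {"pipish": lift_pipish, "cargoish": lift_cargoish}


def translate(src: str, dst: str, text: str) -> str:
    """Any pair is reachable by lowering to the core, then lifting."""
    return LIFT[dst](LOWER[src](text))


def translator_counts(n: int) -> tuple[int, int]:
    """(pairwise count as quoted in the text, hub count) for n ecosystems."""
    return (n * n, 2 * n)


print(translate("pipish", "cargoish", "foo==1.2.3\nbar==0.4.0"))
print(translator_counts(30))  # (900, 60)
```

Adding a third ecosystem to this sketch means writing two more functions, not four more pairwise converters.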

Where reductions do not compose trivially (e.g., for conflicts and concurrent versions), the authors precisely characterize the required ordering constraints and mutual awareness necessary for sound translations. Extensions exhibiting interacting dependency types are formally accommodated, as with Portage's feature-annotated package formulae.

Results, Claims, and Practical Implications

The authors claim that any ecosystem-specific dependency language whose semantics admit a sound and complete reduction can be incorporated into the translation framework, ensuring both correctness and extensibility. The formalism subsumes all surveyed package managers, and practical implications are outlined:

Cross-ecosystem dependency management: enables hybrid builds, precise SBOM generation, and coherent security vulnerability tracking.
Tooling and automation: future parser and translation libraries based on the calculus would obviate ad hoc solutions (vendoring, containerization, depexts).
Build systems and supply-chain security: potential for unified reasoning, vulnerability mitigation, and reproducibility checks across diverse ecosystems.
Scaling challenges for maintainers: improved automation for package distribution (e.g., Debian, Red Hat) and scientific reproducibility.

Theoretical Implications and Future Directions

The formalization brings clarity to long-standing ambiguities in dependency semantics and resolution. The calculus offers fertile ground for further research: formal correctness testing, SAT-based optimization of resolutions, and fuzzing for packaging specification validation. The separation of dependency expression from build semantics ensures composability and lays groundwork for developing a unified theory of package management and build systems.

AI models operating on software ecosystems would benefit from this formalism, enabling reasoning, vulnerability detection, and automated translation tasks, with implications for software supply chain integrity and automated scientific workflows.

Conclusion

The paper establishes the Package Calculus as a minimal yet expressive formal foundation for dependency resolution, demonstrating its universality, extensibility, and composability across a broad spectrum of package managers. The reductions act as both theoretical proofs and practical compilation passes, fundamentally reshaping the cross-ecosystem translation problem. By decoupling dependency semantics from deployment, the calculus provides a rigorous pathway for future tooling and theoretical advances in package management, reproducibility, and supply-chain security (2602.18602).

Explain it Like I'm 14

Overview

This paper is about package managers—the tools that fetch and install software and the libraries that software depends on. Because there are many package managers (for different programming languages and operating systems), they often don’t work the same way and don’t work well together. The authors propose a simple, shared “math-like” model called the Package Calculus to describe how all these package managers decide which pieces of software to include, and how to translate between different systems. The goal is to make it easier to build projects that use multiple languages and platforms, and to improve security and reliability.

What questions did the researchers ask?

The paper asks:

What do most package managers have in common, and where do they differ?
Can we define a simple, universal model for choosing the right versions of software and libraries (called “dependency resolution”)?
Can we extend that model to cover the many special features different package managers support (like optional features, conflicts, or virtual packages)?
Can we use this model as a middle language to translate between package managers so that projects can work across ecosystems?
How hard is the dependency resolution problem, and when can it be made simpler?

How did they study it?

The authors did two main things:

1) Survey and compare many package managers

They looked at 30+ popular package managers (like APT for Debian, pip for Python, Cargo for Rust, npm for JavaScript, opam for OCaml, Nix, Go modules, Maven, and more) and identified:

What they all share: expressing dependencies, choosing versions, and deploying files.
Where they differ: how they handle conflicts, multiple versions at once, peer dependencies (plugins), optional features, logical “OR” in dependencies, virtual packages, variables (like OS or architecture), and optional dependencies.

Think of this like studying many types of “shopping lists” for building software, noting what parts are universal and what parts are special per store.

2) Define a simple, formal model (the “Package Calculus”)

They created a clean, minimal set of rules to represent dependency resolution. Imagine you’re building a robot from parts:

Packages are “name + version” pairs (like a specific part and its version).
Dependencies say “this package needs another package, with any version from this allowed set.”
A resolution is the final set of packages and versions that satisfy all needs.

The core model has just three rules:

Root Inclusion: Include the starting package (the main thing you want to install).
Dependency Closure: For every package you include, you must also include all the packages it depends on, in allowed versions.
Version Uniqueness: Only one version of a given package name can be included at a time (unless extended to allow multiple versions).
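
The three rules above can be checked mechanically. The sketch below assumes a hypothetical data layout (packages as name-version tuples, dependencies as a dictionary) and uses made-up robot parts; it only checks a proposed resolution, it does not search for one.

```python
Package = tuple[str, str]  # (name, version)
Deps = dict[Package, list[tuple[str, set[str]]]]  # package -> [(name, allowed versions)]


def is_valid_resolution(root: Package, deps: Deps, resolution: set[Package]) -> bool:
    # Rule 1, root inclusion: the thing we asked for is in the result.
    root_included = root in resolution
    # Rule 2, dependency closure: each dependency of each included package
    # is satisfied by some included package with an allowed version.
    closed = all(
        any((name, v) in resolution for v in allowed)
        for pkg in resolution
        for name, allowed in deps.get(pkg, [])
    )
    # Rule 3, version uniqueness: no package name appears twice.
    names = [name for name, _ in resolution]
    unique = len(names) == len(set(names))
    return root_included and closed and unique


deps: Deps = {
    ("robot", "1"): [("arm", {"1", "2"}), ("battery", {"3"})],
    ("arm", "2"): [("battery", {"3"})],
}
good = {("robot", "1"), ("arm", "2"), ("battery", "3")}
missing_battery = {("robot", "1"), ("arm", "2")}
two_arms = good | {("arm", "1")}

print(is_valid_resolution(("robot", "1"), deps, good))             # True
print(is_valid_resolution(("robot", "1"), deps, missing_battery))  # False
print(is_valid_resolution(("robot", "1"), deps, two_arms))         # False
```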

They also introduce “version formulae,” which are the normal ways we write version constraints (like “>=1.2 and <2.0”). They show how to turn those formulas into simple sets of allowed versions so the core rules still apply.
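
As a small illustration of turning a formula into a set, the sketch below evaluates a constraint such as ">=1.2 and <2.0" against a list of known versions. The formula syntax, the function names, and the three-component numeric version scheme are all assumptions made for the example; real ecosystems each define their own syntax and ordering.

```python
import operator

OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse(version: str) -> tuple[int, ...]:
    """Parse '1.2' or '1.2.0' into a 3-tuple of ints, padding with zeros."""
    parts = [int(x) for x in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def allowed_versions(formula: str, known: list[str]) -> set[str]:
    """Evaluate clauses joined by ' and ' against every known version."""
    clauses = []
    for clause in formula.split(" and "):
        clause = clause.strip()
        # Two-character operators are listed first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                clauses.append((OPS[symbol], parse(clause[len(symbol):].strip())))
                break
        else:
            raise ValueError(f"unrecognised clause: {clause!r}")
    return {v for v in known if all(op(parse(v), bound) for op, bound in clauses)}


known = ["1.1.0", "1.2.0", "1.10.0", "2.0.0"]
print(sorted(allowed_versions(">=1.2 and <2.0", known)))  # ['1.10.0', '1.2.0']
```

Comparing parsed tuples rather than raw strings matters here: as plain strings, "1.10.0" sorts before "1.2.0", which is why version ordering has to be defined per ecosystem.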

To represent the whole situation, they use a hypergraph—a fancy word for a diagram where arrows go from one package to many possible versions of another package. This picture helps reason about the choices.

What did they find, and why does it matter?

Key findings

There is a common core to dependency resolution that fits many package managers.
The Package Calculus can be extended to model real-world differences (conflicts, peer dependencies, features, logical OR, virtual packages, variables, optional deps) and then reduced back to the core model in a sound, complete way. In simple terms: you can describe complex behavior in a richer language, then reliably translate it into the simple core without changing its meaning.
Dependency resolution is “NP-complete” in general. That means that when packages can depend on many versions and interact in complex ways, no known method is guaranteed to find a valid combination quickly as the problem grows. However, many ecosystems simplify the rules to make resolution fast:
- Go’s Minimal Version Selection (MVS) uses only “minimum version” constraints and allows multiple major versions to coexist, making resolution predictable and linear-time.
- Nix treats each exact version as distinct and requires precise version choices in the package definitions, avoiding complex solving (you choose exactly what you want).
- Cargo allows multiple major versions of the same library to be used together, helping reduce clashes.
Using the Package Calculus as a “middle language” means you don’t need a translator for every pair of package managers. Instead of building n×n translators for n ecosystems, you only need 2n (to and from the calculus). This is like having everyone speak to a common interpreter, which massively reduces engineering effort.
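
Returning to the Go-style approach mentioned in the findings above, the sketch below shows why "minimum version only" constraints avoid search: visit every requirement reachable from the root once and keep the highest minimum seen for each name. It is a simplified model with hypothetical module names; it ignores how Go keeps major versions apart and any manifest overrides.

```python
Version = tuple[int, int]
Module = tuple[str, Version]  # (name, minimum version required)


def minimal_version_selection(
    root: Module, requires: dict[Module, list[Module]]
) -> dict[str, Version]:
    """Pick, for each name, the largest minimum version reachable from root."""
    selected: dict[str, Version] = {}
    seen: set[Module] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # each (name, version) node is processed once
        seen.add(node)
        name, version = node
        if name not in selected or version > selected[name]:
            selected[name] = version
        stack.extend(requires.get(node, []))
    return selected


requires = {
    ("app", (1, 0)): [("a", (1, 1)), ("b", (1, 0))],
    ("a", (1, 1)): [("c", (1, 2))],
    ("b", (1, 0)): [("c", (1, 3))],
    # A newer c 1.4 may exist in the registry, but nothing asks for it.
}
print(minimal_version_selection(("app", (1, 0)), requires))
```

Here `c` resolves to 1.3, the highest version anyone asked for as a minimum, rather than the newest release available. No choice is ever undone, so the work is proportional to the size of the requirement graph.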

Why it matters

Cross-language projects become easier to set up and share, even if they rely on system libraries or hardware drivers.
Security tools can see the full dependency graph across ecosystems, not just one slice, helping find and fix vulnerabilities.
Reproducibility and portability improve, because the rules are clear and sharable.
Package managers and tools can interoperate, reducing duplicated effort and confusion.

What’s the bigger impact?

This work provides a shared foundation for understanding and connecting package managers. With the Package Calculus:

Developers can build multi-language, multi-platform software more smoothly.
Tool creators can implement translations between ecosystems more reliably.
Security teams can analyze complete dependency graphs across boundaries.
Communities can standardize how they express and share dependencies, without losing their unique features.

In short, the paper offers a simple, powerful “common language” for dependency resolution. It respects the diversity of real-world package managers while making it possible to translate between them. This can lead to safer, more compatible, and easier-to-manage software projects.

Knowledge Gaps

Unresolved gaps and open questions

Below is a concise, action-oriented list of limitations, knowledge gaps, and open questions that remain unresolved or only partially addressed by the paper as provided.

Lack of a complete formalization (and correctness proofs) for all “axes of diversity” beyond version formulae, including conflicts, concurrent versions, peer dependencies, features, package formulae, virtual packages, variable/conditional dependencies, and optional dependencies.
Absence of a unified, formal semantics for optional/soft constraints (e.g., Debian’s Recommends, npm’s optionalDependencies as “best-effort”), solver preferences, and prioritization heuristics, and a reduction of these to the core calculus.
No explicit modeling of build-time vs run-time dependencies, conditional/test/development groups, and cross-phase constraints; unclear how these phases compose or reduce to the core model without loss.
Missing treatment of non-declarative side-effects (post-install scripts, environment mutations, compiler/linker flags, FS-level conflicts) and how to safely capture or approximate them within the calculus.
Unspecified handling of deployment-level resource conflicts (e.g., file path collisions, symbol/ABI conflicts) as constraints integrated into dependency resolution rather than left as external, post-resolution failures.
No explicit formalization of cyclic dependencies (including cycles across build/run phases) and conditions under which a resolution can exist or be efficiently detected.
Incomplete treatment of temporal aspects: evolving repositories, mutable registries, and “time travel” reproducibility (how the calculus and translations account for version disappearance, yanked releases, mirrors, and snapshotting).
Unclear, formal conditions for “compositionality” of reductions and translators when combining multiple extensions (e.g., features + peer dependencies + variable formulae) without introducing unsoundness or blow-ups.
No quantified analysis of the “feature encoding” combinatorial explosion (when compiling features to versions or packages) or strategies to bound/avoid it in practice.
Lack of a precise mapping for peer dependency semantics across historical npm variants (legacy vs modern) and how to parameterize the calculus to support versioned semantics of a single ecosystem.
No formal mechanism for ABI/API compatibility constraints (e.g., semver is an assumption, not a formally enforced relation) or for incorporating automated compatibility evidence (e.g., reverse-dep checks) into resolution.
The NP-completeness discussion does not provide parameterized complexity analyses (e.g., fixed-parameter tractability by treewidth, max alternatives, dependency depth) or practical approximation/branching heuristics guaranteed to work well on real repositories.
Missing evaluation of solver scalability and memory/performance trade-offs for the proposed reductions on large, real-world repositories (Debian, Nixpkgs, crates.io, npm, PyPI, opam).
No empirical validation that translations via the calculus preserve intended resolutions across ecosystems (i.e., equivalence testing, delta-debugging mismatches, or user-visible regressions).
The proposed reduction of cross-ecosystem translation from n² to 2n lacks a concrete, implemented toolchain and benchmarks demonstrating fidelity, coverage, and failure modes (e.g., unmappable metadata or features).
No guidance on name/namespace reconciliation and identity mapping across ecosystems (e.g., virtual package mappings, forks, renames, shadowing), and how to verify these mappings systematically.
Underspecified resolution-ordering policies (e.g., “freshest” vs “minimum” versions) and how different orderings impact stability, monotonicity, and reproducibility; no proofs or counterexamples of monotonicity under repository growth.
No formal lockfile model (snapshot semantics, provenance, update strategies) that composes with the calculus and its extensions, especially when crossing ecosystems.
No model for platform/architecture/OS-distribution conditionals (variable formulae) beyond a brief mention; missing a sound, complete reduction and a systematic treatment of cross-compilation and per-target constraints.
Missing treatment of source vs binary packaging differences (e.g., build reproducibility, sandboxing, binary caches) and how these operational dimensions feed back into resolution constraints.
Security aspects are out of scope: no model for vulnerability metadata, taint/trust propagation across ecosystems, or verification that translations do not hide/introduce known CVEs and supply-chain risks.
No approach to user-facing explainability (conflict explanations, minimal unsat cores) grounded in the calculus, nor a comparison of explanation quality across different solver backends (SAT/CDCL vs greedy/MVS).
Lack of mechanized proofs (e.g., Coq/Isabelle/Lean) for theorems, reductions, and extension soundness/completeness, which would increase confidence in the correctness of the framework.
No case studies demonstrating end-to-end cross-ecosystem resolution for complex, real projects (e.g., mixing APT + pip + Cargo + opam) that expose practical gaps in metadata, semantics, or tooling.
Unaddressed governance and data-quality challenges: how to cope with inconsistent or missing metadata, divergent policies across registries, and incentives for maintainers to adopt a common IR (intermediate representation).

Practical Applications

Immediate Applications

The paper introduces the Package Calculus—a minimal, formally defined intermediate representation (IR) for dependency resolution—and shows how diverse package manager semantics can be reduced to this core. The following applications can be deployed now using the calculus as a unifying layer and the reductions the authors describe.

Cross-ecosystem dependency translation
- Sectors: software, DevOps, HPC, data science
- Potential tools/products/workflows: CLI/library to translate dependency metadata between ecosystems (e.g., pip⇄APT, Cargo⇄Nix, npm⇄Debian), project bootstrappers that resolve across OS and language managers, Bazel/Buildkite plugins that ingest multiple manifests and emit a joint resolution
- Assumptions/dependencies: per-ecosystem adapters to/from the calculus; per-ecosystem version ordering implemented (e.g., via univers/VERS); availability of repository metadata; feature/peer/conflict semantics mapped where reductions compose
Unified SBOM generation and vulnerability scanning across OS and language dependencies
- Sectors: security in healthcare, finance, energy, and government
- Potential tools/products/workflows: SBOM generators that traverse the full graph (OS packages + language libs + drivers) and export SPDX/CycloneDX; cross-ecosystem CVE scanners that deduplicate vulnerabilities by name/version normalization; CI gates enforcing resolution-wide vulnerability policies
- Assumptions/dependencies: normalized package identifiers across ecosystems; reliable vulnerability feeds (e.g., OSV, NVD); mapping of virtual packages/providers; access to lockfiles or registries for completeness
Multi-language monorepo build planning with conflict explanations
- Sectors: software, robotics, ML/AI platforms
- Potential tools/products/workflows: “what-if” analyzers in CI/CD that compute minimal closures and provide SAT/CDCL-based conflict explanations using calculus constraints; upgrade planners that show competing version requirements and their justifications
- Assumptions/dependencies: integration with existing resolvers or SAT solvers; reductions for features, peer deps, virtual packages implemented; consistent version ordering policies
Reproducible, cross-ecosystem lockfiles
- Sectors: software, enterprise IT, regulated industries
- Potential tools/products/workflows: a calculus-backed lockfile (“calc.lock”) that snapshots the full closure across pip/Cargo/npm + system packages; environment exporters/importers for dev, CI, and production
- Assumptions/dependencies: deterministic resolution given version ordering; stable registry mirrors/indices; alignment on concurrent-version policies across ecosystems
Container and VM image minimization by precise dependency closure
- Sectors: cloud, edge/IoT, MLOps
- Potential tools/products/workflows: container-build plugins that compute the minimal transitive set across OS+language managers and produce slimmer images; provenance annotations connecting image contents to resolved calculus graphs
- Assumptions/dependencies: repository metadata availability; reliable mapping from source-level deps to deployed binary artifacts; reproducible builds (or fallbacks)
HPC/scientific environment coordination (Conda/Spack/system packages)
- Sectors: academia, HPC, bioinformatics, climate modeling
- Potential tools/products/workflows: env managers that unify Conda/Spack with OS packages using calculus reductions; cross-tool environment export/import; shared, verifiable closures in shared clusters
- Assumptions/dependencies: mapping of variants/USE flags/features to calculus; compute cost for large graphs; cluster policy constraints on repos
Air-gapped/offline deployments via precomputed closures
- Sectors: defense, critical infrastructure, on-prem enterprise
- Potential tools/products/workflows: pre-resolution pipelines that produce bundle archives from calculus graphs; offline installers that validate against the lockfile and cryptographic signatures
- Assumptions/dependencies: artifact mirroring; consistent content-addressing or signatures; completeness of metadata
Governance and compliance checks on dependency policies
- Sectors: government procurement, regulated industries
- Potential tools/products/workflows: policy engines that check calculus-level constraints (e.g., no concurrent versions, approved version bounds, vetted providers for virtual packages); audit reports tracing cross-ecosystem dependencies
- Assumptions/dependencies: policy-to-calculus rule mapping; SBOM alignment; organizational buy-in
Education and benchmarking for dependency resolution
- Sectors: academia, training, tooling vendors
- Potential tools/products/workflows: teaching modules and interactive hypergraph visualizers; benchmark suites comparing resolver strategies (greedy, MVS, SAT/CDCL) on real-world graphs
- Assumptions/dependencies: open datasets of ecosystem graphs; standardized evaluation harnesses
Migration assistants between packaging ecosystems
- Sectors: software modernization, platform engineering
- Potential tools/products/workflows: translators from language-specific manifests + system deps to Nix/Guix or container recipes; aides for distro transitions (e.g., Debian→Alpine) preserving constraints
- Assumptions/dependencies: partial semantic mismatches handled with heuristics or user prompts; availability of equivalent packages/providers; documented incompatibilities (e.g., concurrent-version policies)

Long-Term Applications

These leverage the calculus but require broader ecosystem adoption, further research, or scaling work (e.g., standardization, performance, and semantics alignment across features not always composable).

Standardized dependency IR and cross-ecosystem lockfile format
- Sectors: software, standards bodies, security
- Potential tools/products/workflows: an open standard for a “Package Calculus IR” plus a lockfile spec; reference libraries; registry APIs exposing calculus-ready metadata
- Assumptions/dependencies: multi-ecosystem consensus (e.g., through CNCF/IETF/ISO); maintainers’ adoption; version ordering and feature semantics standardized or clearly annotated
Universal, plugin-based multi-ecosystem package manager
- Sectors: DevOps, platform engineering, education
- Potential tools/products/workflows: a “universal resolver” coordinating OS + multiple language package managers + drivers using 2n translators via the calculus; pluggable adapters for new ecosystems
- Assumptions/dependencies: performance at scale (NP-complete cases); coherent UX around conflicting semantics (concurrent versions, peer deps, virtual packages); security model for cross-ecosystem trust
Automated upgrade planners with impact/risk analysis
- Sectors: healthcare, finance, safety-critical systems
- Potential tools/products/workflows: tools that search resolution orderings to propose low-risk upgrades, simulate breakage across ecosystems, and stage rollouts with proofs of compatibility under constraints
- Assumptions/dependencies: reliable version-ordering and compatibility signals (semver adherence or stronger signals); historical test/telemetry data; integration with CI quality gates
End-to-end supply-chain risk quantification and continuous assurance
- Sectors: energy, government, large enterprises
- Potential tools/products/workflows: risk scoring that spans OS, language, and firmware dependencies; continuous monitoring of calculus graphs for new CVEs and policy changes; attestations linked to SBOMs
- Assumptions/dependencies: cryptographic signing and provenance data; timely vulnerability disclosures; policy frameworks referencing calculus-level properties
Formally verified resolvers (sound, complete, certified)
- Sectors: avionics, medical devices, automotive (high assurance)
- Potential tools/products/workflows: proof-carrying resolvers derived from the calculus; reference implementations verified in proof assistants; regression-proof solver upgrades
- Assumptions/dependencies: mechanized semantics for major ecosystems; tractable proof obligations for extensions (features, peers, virtual packages); tooling for extraction to production languages
Interoperability frameworks for features, peer deps, and virtual packages
- Sectors: JavaScript, Rust, Python, Linux distributions
- Potential tools/products/workflows: specification and libraries to negotiate feature unification, peer constraints, and provider selection across ecosystems; fallback strategies when reductions do not compose
- Assumptions/dependencies: ecosystem alignment on semantics; explicit capability annotations; agreed conflict-resolution policies
Specialized SAT/CDCL solvers tuned to dependency structures
- Sectors: tooling vendors, large-scale package ecosystems
- Potential tools/products/workflows: solvers using calculus-aware heuristics, learned clause management, and incremental solving for monorepos; explainer-friendly proof traces
- Assumptions/dependencies: solver research and datasets; integration into package manager UIs; performance validation at ecosystem scale
Unified management of hardware/driver/software stacks for ML and robotics
- Sectors: ML infrastructure, autonomous systems, embedded/IoT
- Potential tools/products/workflows: planners that co-resolve kernel modules, GPU drivers (e.g., CUDA), OS libs, and Python/Rust bindings; reproducible deployment from dev laptops to edge devices
- Assumptions/dependencies: accurate metadata for drivers/firmware; provider/virtual package modeling; platform constraints (arch/OS) captured as variable formulae
Content-addressed universal packaging
- Sectors: software distribution, research tooling
- Potential tools/products/workflows: merging calculus with content-addressed models (Nix/Unison-style) to tie resolutions to exact artifacts/hashes rather than just versions; hybrid models for deterministic builds across ecosystems
- Assumptions/dependencies: repository support for content addressing and reproducible builds; widespread signing; backward-compatible migration paths
Federated and decentralized registries with cross-resolver composition
- Sectors: open-source ecosystems, resilience engineering
- Potential tools/products/workflows: resolution across multiple registries (public/private), with trust policies enforced at the calculus level; P2P mirrors resilient to outages
- Assumptions/dependencies: signature verification and provenance standards; transparent metadata APIs; governance models for federation

Notes on feasibility and dependencies common across applications

Computational complexity: fully expressive systems remain NP-complete; practical deployments need SAT/CDCL integration and caching/lockfiles.
Semantic gaps: some features (e.g., npm peer deps vs Cargo features, virtual packages, conflicts) do not always compose; translators must surface non-composability and offer user choices.
Version ordering: must implement ecosystem-specific comparisons (e.g., Debian vs semver) consistently; tools like univers/VERS help.
Metadata completeness and quality: success depends on accurate dependency metadata, consistent repository indices, and standardized identifiers.
Adoption and governance: long-term benefits require buy-in from package maintainers, registries, and standards bodies to expose/consume the calculus IR.
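
On the SAT/CDCL point in the notes above, the core rules map naturally onto Boolean clauses: one variable per package version, a unit clause for the root, an implication clause per dependency, and pairwise exclusions per name. The sketch below uses this textbook-style encoding with a brute-force search standing in for a real SAT solver; it is an assumption-laden illustration, not necessarily the encoding used in the paper's appendix.

```python
from itertools import combinations, product

Package = tuple[str, str]
Literal = tuple[Package, bool]  # (package, required polarity)


def encode(root, deps, universe) -> list[list[Literal]]:
    """Build CNF clauses for root inclusion, closure, and uniqueness."""
    clauses = [[(root, True)]]  # root inclusion
    for pkg in sorted(universe):
        for name, allowed in deps.get(pkg, []):
            # pkg selected implies at least one allowed version is selected
            clauses.append([(pkg, False)] + [((name, v), True) for v in sorted(allowed)])
    for p, q in combinations(sorted(universe), 2):
        if p[0] == q[0]:
            clauses.append([(p, False), (q, False)])  # not both versions
    return clauses


def brute_force(clauses, universe):
    """Exponential stand-in for a SAT solver; fine only for tiny inputs."""
    pkgs = sorted(universe)
    for bits in product([False, True], repeat=len(pkgs)):
        model = dict(zip(pkgs, bits))
        # A package outside the universe is treated as never selected.
        if all(any(model.get(p, False) == pol for p, pol in c) for c in clauses):
            return {p for p in pkgs if model[p]}
    return None


universe = {("app", "1"), ("a", "1"), ("a", "2"), ("c", "1"), ("c", "2")}
deps = {
    ("app", "1"): [("a", {"1", "2"})],
    ("a", "1"): [("c", {"1"})],
    ("a", "2"): [("c", {"2"})],
}
solution = brute_force(encode(("app", "1"), deps, universe), universe)
print(solution)
```

A production tool would hand the same clauses to a CDCL solver, which can also report which clauses conflict when no resolution exists.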

Glossary

backtracking: A search strategy that revises earlier choices when conflicts occur during solving. Example: "CPAN clients install the latest version greedily without backtracking."
binary cache substitution: Fetching pre-built binaries to replace building from source while preserving reproducibility. Example: "binary deployment as a transparent optimisation of source-based deployment via binary cache substitution."
conflict-driven clause learning (CDCL): A SAT-solving technique that learns from conflicts to prune search space efficiently. Example: "employ SAT solvers (Appendix~\ref{appendix:sat-resolution}) or conflict-driven clause learning (CDCL) for performance and error reporting."
concurrent versions: Allowing multiple versions of the same package to appear in a single resolution. Example: "Other package managers allow concurrent versions of the same package to exist in a resolution."
conflicts: A package constraint expressing that another package (possibly at certain versions) must not be present. Example: "Package managers can support packages expressing conflicts -- a dependency on the absence of a package."
constraint solver: A tool that selects versions satisfying all dependency constraints. Example: "delegating version selection from the constraint solver to a Turing-complete packaging language."
content-addressing: Identifying artifacts by a hash of their content to ensure immutability and precise references. Example: "The Unison language has eliminated dependency resolution by content-addressing all definitions: each function is identified by a hash of its syntax tree, so dependencies are pinned to exact hashes rather than version constraints."
cryptographic hash: A hash function output used to uniquely identify and secure artifacts. Example: "Nix places built packages at a path containing a cryptographic hash of their store derivation in the 'Nix store'."
dependees: Packages that are depended upon; typically libraries or components required by others. Example: "Dependees, sometimes called libraries, can be source code, a shared object, or data files."
depender: A package that declares dependencies on other packages. Example: "Package managers express dependency relations from a depender package to a dependee package, meaning the depender requires the dependee."
dependency closure: The requirement that all dependencies of selected packages are satisfied within the resolution set. Example: "Dependency Closure: $\forall\, p \in S.\, p \Delta (n, vs) \implies \exists\, v \in vs.\, (n, v) \in S$ "
dependency resolution: The process of computing a set of package versions that satisfies all transitive dependencies. Example: "We use dependency resolution to refer to the problem of computing, given a package, the set of package names and versions that must be provided to satisfy its dependencies transitively."
directed hypergraph: A generalization of a graph where edges can connect sets of source vertices to sets of target vertices. Example: "Figure~\ref{fig:calculus:illustration} illustrates dependencies as a directed hypergraph~\cite{berge1970hypergraphs} -- a generalisation of a directed graph in which each edge connects a set of source vertices to a set of target vertices."
domain-specific language (DSL): A specialized language tailored to a particular domain, here used to express dependencies and metadata. Example: "they all take dependencies written in a domain-specific language (DSL)~\cite{bentley1986dsl}"
features: Optional toggles that modify package behavior and dependencies, often unified across dependers. Example: "Cargo supports features of packages that are used to enable optional functionality that may require additional dependencies."
formal reductions: Transformations that map one problem or model to another while preserving correctness, used to compare expressiveness or complexity. Example: "Through a series of formal reductions, we show how this core is expressive enough to model the diversity that real-world package managers employ in their dependency expression languages."
greedy algorithm: A method that makes locally optimal choices (e.g., picking latest or minimum versions) without backtracking. Example: "a greedy algorithm suffices -- selecting the minimum satisfying version for stability, as MVS does, or the latest for freshness."
intermediate representation: A common, abstract form used to translate between different dependency languages or systems. Example: "By using the Package Calculus as the intermediate representation of dependencies, we enable translation between distinct package managers and resolution across ecosystems."
lock files: Snapshots that pin an exact resolved set of package versions for reproducibility. Example: "Similarly, lock files -- snapshots of a resolution -- are unnecessary under M"
minimum version selection (MVS): A resolution strategy that selects the minimal versions satisfying declared lower bounds, relying on semver compatibility. Example: "Go's minimum version selection (MVS)~\cite{cox2018mvs} achieves deterministic, linear-time resolution."
name mangling: Systematically altering symbol names (e.g., by embedding version info) to allow coexisting, otherwise conflicting versions. Example: "Cargo uses name mangling to support linking multiple versions of a single library into the same binary, where typically only one version can be linked due to duplicate symbols."
NP-completeness: The property of a decision problem that is both in NP and at least as hard as every other problem in NP; no polynomial-time algorithm is known for any such problem. Example: "We show how the NP-completeness of dependency resolution can be avoided, but is ultimately inherent in capturing the complexity of real-world package managers"
optional dependencies: Dependencies that are used if present but are not required by the resolver to be included. Example: "Package managers can support optional dependencies, which affect the build plan rather than the resolution."
package formulae: Boolean expressions over package dependencies (e.g., requiring one of several alternatives). Example: "Some package managers support package formulae -- boolean expressions over package dependencies."
Package Calculus: A minimal formal system defining core semantics of dependency resolution across package managers. Example: "We present the Package Calculus, a formalism for dependency resolution that unifies the core semantics of diverse package managers."
peer dependencies: Constraints where a package requires its parent (or environment) to also depend on a specific, compatible package. Example: "Package managers which support concurrent versions can also support peer dependencies."
resolution ordering: A partial order over valid resolutions, often used to prefer fresher selections. Example: "A resolution ordering is a partial order $\leq_{\mathcal{S}}$ on $\mathcal{S}(\Delta, r)$"
reverse dependency checks: Testing new releases against their dependers to validate compatibility and constraints. Example: "opam uses reverse dependency checks on new releases of dependees to validate dependency version constraints."
root inclusion: The requirement that the requested root package is included in the resolution. Example: "Root Inclusion: $r \in S$ "
sandboxing: Isolating builds to control side effects and improve reproducibility. Example: "If a package manager is source-based, it might employ sandboxing to ensure reproducibility and isolation while building packages, as opam and Nix do."
SAT solvers: Tools that determine satisfiability of boolean formulas, used to solve complex dependency constraints. Example: "employ SAT solvers (Appendix~\ref{appendix:sat-resolution}) or conflict-driven clause learning (CDCL) for performance and error reporting."
semantic versioning: A versioning scheme where major, minor, and patch numbers carry compatibility meaning. Example: "Cargo restricts this to packages with different major versions under the semantic versioning scheme"
semantics function: A formal mapping defining the meaning of expressions (e.g., version formulas to sets of versions). Example: "Semantics function $\llbracket \cdot \rrbracket_n : \Phi \to \mathcal{P}(V)$ for $\Phi$ under $n \in N$ :"
store derivation: A low-level Nix artifact describing how to build a package; compiled form of expressions. Example: "as the Nix DSL compiles to store derivation .drv ATerm files."
union filesystem: A filesystem that layers multiple directories into a single view, enabling package layering and sharing. Example: "Haiku OS deploys packages by mounting them in a union filesystem, allowing for layering and sharing of multiple packages."
variable formula: Dependency expressions parameterized by variables (e.g., OS, architecture, or build flags). Example: "An extension of the package formula~(\S\ref{sec:background:differences:package-formula}) is the variable formula."
version constraints: Conditions that restrict acceptable versions of dependencies. Example: "Dependency resolution relies on the version constraints of a dependency"
version formula: Logical and relational expressions that denote sets of acceptable versions. Example: "normally expressed with a version formula -- relational and logical expressions defining compatible version sets."
version ordering: A total order defining how versions compare (e.g., for selecting newest). Example: "A version ordering is a total order $\leq_v$ on $V$ "
version uniqueness: The constraint that only one version of a given package name may appear in a resolution. Example: "Version Uniqueness: $\forall\, (n, v), (n, v') \in S.\, v = v'$ "
Version Formula Calculus: The extension of the core calculus to support expressive version formulas. Example: "We define the Version Formula Calculus to be as expressive as the VERS specification"
virtual packages: Abstract package names provided by multiple concrete packages to signify interchangeable functionality. Example: "APT supports virtual packages which allow multiple packages to provide the same functionality."
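
The three conditions quoted under root inclusion, dependency closure, and version uniqueness can be checked mechanically for a candidate resolution. The following is a minimal sketch, not the paper's implementation. It assumes integer versions and a dictionary encoding of the dependency relation Δ; the names `is_valid_resolution`, `delta`, `root`, and the example packages are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Assumed encodings (not from the paper):
Package = Tuple[str, int]                 # (name, version), with integer versions
Dependency = Tuple[str, FrozenSet[int]]   # (dependee name, set of acceptable versions)
Delta = Dict[Package, Set[Dependency]]    # depender -> its dependencies


def is_valid_resolution(delta: Delta, root: Package, s: Set[Package]) -> bool:
    """Check the three quoted conditions for a candidate resolution s."""
    # Root inclusion: r must be in S.
    if root not in s:
        return False

    # Dependency closure: every dependency (n, vs) of a selected package
    # must be satisfied by some (n, v) in S with v in vs.
    for p in s:
        for name, versions in delta.get(p, set()):
            if not any((name, v) in s for v in versions):
                return False

    # Version uniqueness: no package name appears with two versions.
    names = [name for name, _ in s]
    return len(names) == len(set(names))


# Hypothetical example: app 1 needs lib 1 or 2; lib 2 needs util 1.
delta: Delta = {
    ("app", 1): {("lib", frozenset({1, 2}))},
    ("lib", 2): {("util", frozenset({1}))},
}
root: Package = ("app", 1)

valid = {("app", 1), ("lib", 2), ("util", 1)}
missing_util = {("app", 1), ("lib", 2)}                       # breaks dependency closure
two_libs = {("app", 1), ("lib", 1), ("lib", 2), ("util", 1)}  # breaks version uniqueness

print(is_valid_resolution(delta, root, valid))
print(is_valid_resolution(delta, root, missing_util))
print(is_valid_resolution(delta, root, two_libs))
```

Checking a given set is cheap; the difficulty the glossary refers to under NP-completeness lies in finding such a set among the many candidate version combinations.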
