Software Dependencies 2.0 in ML-Enabled Systems
- Software Dependencies 2.0 are a new category of dependency that integrates pre-trained machine learning models with traditional code artifacts, introducing probabilistic behavior into applications.
- Integration pipelines involve multi-stage processes like model initialization, adaptation, and fine-tuning, which increase system complexity and necessitate specialized management.
- Fragmented documentation and inconsistent versioning practices challenge maintainability and traceability, driving the need for standardized PTM dependency management.
Software Dependencies 2.0 define a new generation of dependencies in software engineering, characterized by the integration of pre-trained machine learning models (PTMs) as first-class, functional, and architectural dependencies alongside traditional human-written libraries and code artifacts. Unlike classical dependencies (Software Dependencies 1.0), which are resolved via static linking and versioning of code components, Software Dependencies 2.0 encapsulate learned model weights, architectural configuration, tokenizers, and associated assets, introducing composite, often probabilistic, behavior into downstream software systems. The introduction of PTMs into software dependency graphs fundamentally alters integration, documentation, and maintainability practices in open-source development, as evidenced by empirical analyses of large-scale GitHub repositories (Yasmin et al., 7 Sep 2025).
1. Definition and Conceptual Differentiation
The evolution from code-centric to model-centric dependencies is anchored in the empirical observation that modern software projects systematically reuse PTMs drawn from model repositories such as Hugging Face and PyTorch Hub. In this paradigm, a dependency is not limited to a static API interface but is a bundle of elements: model architecture, trained weights, ancillary components (tokenizers, vocabularies), and configuration parameters. PTMs produce stochastic, context-specific outputs, and their integration often requires substantial adaptation (e.g., adding, modifying, or pruning network heads and layers). The dependency thus becomes a non-trivial, dynamic object, fusing executable code with the learned statistical knowledge encoded in model parameters. The practical implication is that the reuse of learned behavior, the implicit output of an upstream training regime, propagates new forms of risk, technical debt, and traceability challenges not encountered in code-only dependency management.
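As an illustration of this bundling, the minimal sketch below, assuming the Hugging Face transformers library, resolves the constituent artifacts of a single PTM dependency separately; the model identifier is only an example:

```python
# Minimal sketch: a single PTM dependency resolves to a bundle of artifacts
# rather than a code-only API. The model name is illustrative.
from transformers import AutoConfig, AutoModel, AutoTokenizer

name = "bert-base-uncased"
config = AutoConfig.from_pretrained(name)        # architectural configuration
model = AutoModel.from_pretrained(name)          # learned weights
tokenizer = AutoTokenizer.from_pretrained(name)  # vocabulary and tokenization rules

print(config.hidden_size, model.num_parameters(), len(tokenizer))
```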
2. Integration Pipelines: Staging and Adaptation Patterns
Empirical studies of open-source projects reveal that PTM integration pipelines are multi-stage and heterogeneous, differing significantly from canonical ML or software engineering workflows (Yasmin et al., 7 Sep 2025). The principal stages observed across 401 GitHub repositories sampled from the PeaTMOSS dataset include the following (a code sketch of a representative pipeline follows the list):
- Model Initialization: Loading the PTM via framework-specific APIs (e.g., `AutoModel.from_pretrained("bert-base-uncased")`).
- Model Adaptation: Modifying base architectures, such as inserting custom heads, deleting layers, or configuring submodules to tailor model behavior to downstream tasks.
- Data Processing and Preprocessing: Applying tokenizers, resizers, and normalizers to transform raw inputs into representations compatible with the PTM.
- Prompt Generation: Constructing application-specific prompts, particularly in generative settings.
- Feature Engineering: Extracting internal representations for downstream processing or embedding generation.
- Fine-Tuning: Updating model weights on new, task-specific data.
- Inference: Deploying the adapted PTM for prediction or generation.
- Post-Processing and Evaluation: Filtering or scoring model outputs, quantifying task-specific performance with structured metrics (e.g., BLEU, FID).
- Delivery: Exposing model outputs through APIs, services, or user interfaces.
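The following minimal sketch, not drawn from any studied repository, traces several of these stages for a discriminative pipeline; it assumes the Hugging Face transformers library and PyTorch, and the model name and classification head are illustrative choices:

```python
# Minimal sketch of a discriminative PTM reuse pipeline.
import torch
from transformers import AutoModel, AutoTokenizer

# Model Initialization: load the PTM and its tokenizer
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Model Adaptation: attach a custom classification head to the base architecture
head = torch.nn.Linear(model.config.hidden_size, 2)

# Data Processing: tokenize raw text into PTM-compatible tensors
inputs = tokenizer("An example sentence.", return_tensors="pt")

# Inference: run the (frozen) PTM, then the adapted head
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)
    logits = head(hidden[:, 0])                 # classify from the [CLS] token

# Post-Processing: convert logits to label probabilities
probs = torch.softmax(logits, dim=-1)
```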
Three dominant organizational archetypes—feature extraction pipelines, generative pipelines, and discriminative pipelines—structure these stages. The adaptation phase frequently involves nontrivial engineering effort; PTMs are rarely used in a plug-and-play manner, and downstream adaptation may itself become a source of further technical debt.
3. Documentation, Structuring, and Version Management
Documentation and explicit declaration of PTM dependencies are inconsistent and fragmented. Only about 21.2% of surveyed projects externalize all PTM references in structured documentation or configuration artifacts; the majority (58.9%) rely exclusively on code-local declarations, often as string literals or framework-specific loader calls. Explicit version tagging (e.g., using a `revision` parameter in Hugging Face `from_pretrained()` or specific tags in PyTorch Hub) is practiced in just 12% of cases.
This fragmentation impedes traceability, complicates reproducibility, and exacerbates maintenance risk. Dependency information—when present—is embedded across heterogeneous files and formats (YAML, JSON, README, or code), necessitating manual curation to construct a full dependency map. The lack of centralization and standardization for PTM metadata, versioning, and configuration increases the opacity of dependency trees and undermines the composability of pipelines in the long term.
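A minimal sketch of the centralization these findings motivate, assuming the Hugging Face transformers library and PyYAML; the file name, keys, and revision value are illustrative assumptions:

```python
# Hedged sketch: externalizing the PTM reference and revision into a single
# structured config file instead of scattering code-local string literals.
import yaml  # requires PyYAML
from transformers import AutoModel

# models.yaml (illustrative contents):
#   encoder:
#     name: bert-base-uncased
#     revision: v4.2.0
with open("models.yaml") as f:
    spec = yaml.safe_load(f)["encoder"]

# Explicit version tagging via the revision parameter aids reproducibility
model = AutoModel.from_pretrained(spec["name"], revision=spec["revision"])
```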
4. Multi-PTM Interaction and Pipeline Coupling
PTMs in Software Dependencies 2.0 are integrated not as isolated artifacts, but as interacting components within complex, multi-model systems. Four major patterns of inter-PTM or PTM-to-learned-component interaction are identified:
- Feature Handoff: Output of one PTM (e.g., an embedding) serves as input to another module or PTM, as in cascaded or hierarchical architectures (see the first sketch after this list).
- Feedback Guidance: PTMs serve as evaluators or guides (for example, a CLIP model scoring image-text pairings to refine the training or adaptation of another model; see the second sketch below).
- Metric-based Evaluation: PTMs are used solely for calculating metrics (e.g., FID, BLEU) for downstream outputs in evaluation or feedback loops.
- Post-Processing Refinement: Secondary PTMs are employed to filter, rerank, or amend outputs (e.g., using a moderation model for safety checks), increasing architectural complexity.
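A minimal sketch of the first pattern, Feature Handoff, assuming the Hugging Face transformers library and PyTorch; the encoder name and classifier dimensions are illustrative:

```python
# Hedged sketch of Feature Handoff: one PTM produces sentence embeddings
# that a downstream module consumes.
import torch
from transformers import AutoModel, AutoTokenizer

name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)
classifier = torch.nn.Linear(encoder.config.hidden_size, 2)  # downstream consumer

inputs = tokenizer("Feature handoff example.", return_tensors="pt")
with torch.no_grad():
    # Upstream PTM output: mean-pooled token embeddings
    embedding = encoder(**inputs).last_hidden_state.mean(dim=1)
logits = classifier(embedding)  # features cross the dependency boundary
```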
Projects frequently exhibit both interchangeable PTM reuse (model variants swapped in and out) and complementary PTM assemblies (distinct models performing different roles), further multiplying the dependency relations that must be managed. The distributed nature of these pipelines is underscored by empirical observations: in multi-PTM projects (~52.6% of the sample), the reuse pipeline spans a mean of 3.9 files and roughly 886 lines of code.
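The Feedback Guidance pattern admits a similar sketch: here a CLIP model scores an image-text pairing, yielding the signal that a training or adaptation loop would consume (model name and inputs are illustrative placeholders):

```python
# Hedged sketch of Feedback Guidance: a CLIP model scores image-text
# pairings, producing a signal that could steer another model's adaptation.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224))  # placeholder for a generated image
inputs = processor(text=["a photo of a cat"], images=image, return_tensors="pt")
with torch.no_grad():
    guidance = clip(**inputs).logits_per_image  # similarity used as feedback signal
```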
5. Quantitative Insights into PTM Reuse in OSS
The study of 401 open-source repositories yields the following quantitative results:
| Metric | Observed Value |
|---|---|
| Projects with >1 PTM | 52.6% |
| Multi-PTM projects using interchangeable PTMs | 37% |
| Multi-PTM projects using complementary PTMs | 23% |
| Projects with external PTM documentation | 21.2% |
| Projects specifying PTM version | 12% |
| Average files spanned by reuse pipeline | 3.9 |
| Average code lines in reuse pipeline | ~886 |
This heterogeneity solidifies the claim that PTMs constitute a distinct and non-uniform category of dependency, requiring new best practices for documentation, audit, and reproducibility.
6. Challenges and Implications for Maintainability, Traceability, and Technical Debt
The integration of PTMs as dependencies introduces new challenges:
- Fragmented Declaration: Dispersed, non-centralized PTM specification hinders efforts to track and propagate updates, patch security vulnerabilities, or establish provenance.
- Versioning and Reproducibility: Inconsistent or absent version tagging undermines the ability to reproduce results, conduct reliable ablation studies, or systematically roll back to previous model states.
- Technical Debt: Modified or adapted PTMs, especially when extended via custom layers or fine-tuning, increase architectural complexity and maintenance burden, giving rise to loosely coupled or undocumented interfaces.
- Multi-Model Interactions: Feature handoff, evaluative guidance, and composite processing pipelines complicate dependency resolution as the boundary between code- and data-centric artifacts blurs.
The empirical evidence supports the assertion that Software Dependencies 2.0 networks are not simply more numerous than traditional code dependencies, but structurally and semantically more intricate, necessitating advances in dependency tracking, automated audit, and systematized metadata management.
7. Visualization and Representation
While no explicit mathematical formalism is introduced to model these pipelines, LaTeX-based representations are prominent: code and dependency trees are rendered using tikz diagrams and code environments, showing, for example, `AutoModel.from_pretrained("bert-base-uncased", revision="v4.2.0")`.
Hierarchical trees and network diagrams are used to convey the branching paths of single versus multi-PTM reuse, and to delineate the stages of reuse pipelines.
These visualization practices emphasize the complexity and non-linearity of PTM-centric dependency graphs, in contrast with the relatively flat structures of Software Dependencies 1.0.
Conclusion
Software Dependencies 2.0 formalize the emergence of PTMs and learned artifacts as central, first-class dependencies in modern software ecosystems. The empirical and qualitative analyses of large-scale open-source projects document that PTM reuse pipelines are intricate, distributed, and heavily adapted, raising unique challenges in documentation, maintainability, traceability, and evolution. The lack of standardized metadata or centralized versioning for PTMs impedes best practices in dependency management. The architectural couplings, directly observable in feature handoff, feedback guidance, and evaluation-based PTM interplay, introduce new forms of technical debt and risk.
This landscape demonstrates a pressing need for new tools, metadata practices, and automated management systems to handle the increased complexity and dynamic behavior of Software Dependencies 2.0 in modern ML-enabled software systems (Yasmin et al., 7 Sep 2025).