
Data Flow Subsumption Framework

Updated 7 March 2026
  • DSF is a distributive framework that formalizes data-flow subsumption by capturing definition-use associations via a lattice-theoretic approach.
  • It employs explicit transfer functions and the Subsumption Algorithm to compute the exact Meet-Over-All-Paths (MOP) solution for test coverage.
  • The framework reduces redundancy and testing complexity by ensuring sound, complete, and convergent analysis within finite lattices.

The Data Flow Subsumption Framework (DSF) is a distributive data-flow analysis framework formalized to address redundancy in data flow testing, specifically by capturing the subsumption relationships among definition-use associations (DUAs) in a program. DSF provides a lattice-theoretic foundation and an associated iterative algorithmic approach that computes, for each program node, the subset of DUAs that are subsumed—those guaranteed to be covered whenever the node is traversed by a test. The framework’s distributivity property ensures that the iterative solution computes the Meet-Over-All-Paths (MOP) result, i.e., guarantees valid results over all control-flow paths. DSF operationalizes these ideas through explicit transfer functions and the Subsumption Algorithm (SA), enabling efficient, precise determination of DUA subsumption (Chaim et al., 2021).

1. Lattice-Theoretic Structure of DSF

DSF formalizes data-flow subsumption analysis over a meet-semilattice built from the power set of all DUAs mandated by the all-uses criterion for a program P. Let U denote the (finite) set of all DUAs in P.

  • Domain: V = 𝒫(U), where each element is a subset of U.
  • Ordering: X ≤ Y iff X ⊆ Y, for all X, Y ⊆ U.
  • Meet operator: X ⊓ Y = X ∩ Y.
  • Bottom: ⊥ = ∅ (no DUAs covered).
  • Top: ⊤ = U (all DUAs covered).

For each node n in the control-flow graph G = (N, E, s, e), DSF maintains:

  • IN(n) ∈ V: DUAs guaranteed to be covered on all paths reaching n.
  • OUT(n) ∈ V: DUAs guaranteed to be covered after traversing n.

This lattice structure enables sound fixed-point computations via iterative analysis (Chaim et al., 2021).
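For concreteness, this semilattice can be sketched in a few lines of Python, with DUAs modeled as (def-node, use-node, variable) triples. The two-DUA universe below is an illustrative assumption, not part of the framework:

```python
from functools import reduce

# Hypothetical universe of DUAs, written as (def-node, use-node, variable).
U = frozenset({(1, 3, 'x'), (1, 4, 'x')})

TOP = U                # top: all DUAs covered
BOTTOM = frozenset()   # bottom: no DUAs covered

def meet(*subsets):
    """Meet = set intersection; the empty meet defaults to TOP."""
    return reduce(frozenset.intersection, subsets, TOP)

def leq(x, y):
    """Lattice ordering: X <= Y iff X is a subset of Y."""
    return x <= y

# The meet of two branches keeps only the DUAs guaranteed on both.
both = meet(frozenset({(1, 3, 'x')}), frozenset({(1, 3, 'x'), (1, 4, 'x')}))
```

Using intersection as the meet encodes the "guaranteed on all paths" reading: a DUA survives a join point only if every incoming path carries it.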

2. Transfer Functions and Data Flow Equations

For each node n, the following constant sets are precomputed:

  • Born(n): DUAs whose definition occurs at n.
  • Disabled(n): DUAs whose definition node d ≠ n and whose variable is redefined at n.
  • PotCovered(n): DUAs whose use occurs at n.
  • Sleepy(n): edge-DUAs "asleep" at n, i.e., those not necessarily covered on all paths.
  • Working sets include Covered(n) (DUAs covered so far) and CurSleepy(n) (aggregated "sleepy" edge-DUAs from certain predecessors).

The transfer functions select, propagate, and generate DUAs as follows:

Stage 1 (Propagation):

IN(n) = ⋂_{p ∈ PRED(n)} OUT(p)

Stage 2 (Update):

CurSleepy(n) = ⋃_{p ∈ PRED(n), (p,n) ∉ back} Sleepy(p)

Covered(n) = ⋂_{p ∈ PRED(n)} Covered(p) ∪ [(IN(n) \ CurSleepy(n)) ∩ PotCovered(n)]

OUT(n) = Born(n) ∪ [IN(n) \ Disabled(n)] ∪ Covered(n)

The transfer function for each node,

f_n(X) = Born(n) ∪ (X \ Disabled(n)) ∪ Covered(n) ∪ [(X \ CurSleepy(n)) ∩ PotCovered(n)],

is distributive over intersection (Chaim et al., 2021).
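The transfer function transcribes directly into Python; the per-node constant sets are passed in as plain sets, and the example node below (its Born/Disabled/PotCovered contents) is an illustrative assumption:

```python
def make_transfer(born, disabled, covered, cur_sleepy, pot_covered):
    """Build f_n(X) = Born ∪ (X − Disabled) ∪ Covered ∪ [(X − CurSleepy) ∩ PotCovered]."""
    def f_n(x):
        return (frozenset(born)
                | (x - disabled)            # surviving DUAs, not killed here
                | frozenset(covered)
                | ((x - cur_sleepy) & pot_covered))  # newly covered uses
    return f_n

# Illustrative node: defines DUA 'a', kills DUA 'c', can cover DUA 'b'.
f = make_transfer(born={'a'}, disabled={'c'}, covered=set(),
                  cur_sleepy=set(), pot_covered={'b'})
```

For instance, f applied to {'b', 'c'} drops the killed 'c', marks 'b' as covered, and adds the newborn 'a'.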

3. Distributivity and the Meet-Over-All-Paths Solution

DSF’s transfer functions are distributive over meet: f_n(X ⊓ Y) = f_n(X) ⊓ f_n(Y) for all X, Y ⊆ U. Each point’s MOP solution is

MOP(n) = ⋂_{π ∈ Paths(s→n)} f_π(⊤)

with f_π the composition of transfer functions along path π from the start node s.

The distributivity property guarantees that standard iterative worklist algorithms converge precisely to the MOP solution, not merely to a maximal fixed-point under the transfer functions. This ensures exact (not over- or under-approximated) subsumption results (Chaim et al., 2021).
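Distributivity over intersection can be checked exhaustively on a tiny universe; the three-element universe and the constant sets below are arbitrary choices made only for the check:

```python
from itertools import combinations

U3 = ('a', 'b', 'c')

def powerset(items):
    """All subsets of items, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Arbitrary constant sets for one hypothetical node.
born, disabled = frozenset({'a'}), frozenset({'b'})
covered, cur_sleepy, pot_covered = frozenset(), frozenset({'c'}), frozenset({'b', 'c'})

def f(x):
    return born | (x - disabled) | covered | ((x - cur_sleepy) & pot_covered)

# Verify f(X ∩ Y) == f(X) ∩ f(Y) for every pair of subsets of U3.
ok = all(f(x & y) == f(x) & f(y)
         for x in powerset(U3) for y in powerset(U3))
```

The check passes because f is a union of a constant with terms of the form X ∩ K, each of which distributes over intersection.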

4. The Subsumption Algorithm (SA)

The Subsumption Algorithm (SA) is a worklist-based iterative solver tailored to the DSF’s transfer functions. It proceeds as follows:

  1. Initialization: For n = s, set IN(s) ← ∅, OUT(s) ← Born(s), and Covered(s) ← ∅; for all n ≠ s, set OUT(n) ← U and Covered(n) ← U.
  2. Iteration: Iterate over nodes, computing IN(n), CurSleepy(n), and Covered(n), and updating OUT(n) as described above, until convergence.
  3. Final update: After stabilization, recompute IN(n) and CurSleepy(n), and perform a final update of Covered(n).

At termination, Covered(n) is the set of all DUAs subsumed at node n, i.e., those covered on every path reaching n. This operationalization directly implements the DSF (Chaim et al., 2021).
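The three steps admit a compact round-robin sketch in Python. The CFG encoding (predecessor lists, a set of back edges) and the straight-line example with empty Sleepy sets are illustrative assumptions, not the authors' implementation:

```python
def subsumption(nodes, preds, back_edges, start, U,
                born, disabled, pot_covered, sleepy):
    """Round-robin fixed-point solver for the DSF equations (sketch)."""
    OUT = {n: frozenset(U) for n in nodes}
    covered = {n: frozenset(U) for n in nodes}
    OUT[start] = frozenset(born[start])      # initialization at the start node
    covered[start] = frozenset()

    def step(n):
        # Stage 1 (propagation): IN(n) is the meet over predecessors.
        in_n = frozenset(U)
        for p in preds[n]:
            in_n &= OUT[p]
        # Stage 2 (update): CurSleepy, Covered, OUT.
        cur_sleepy = frozenset().union(
            *(sleepy[p] for p in preds[n] if (p, n) not in back_edges))
        cov = frozenset(U)
        for p in preds[n]:
            cov &= covered[p]
        cov |= (in_n - cur_sleepy) & pot_covered[n]
        out = born[n] | (in_n - disabled[n]) | cov
        changed = out != OUT[n] or cov != covered[n]
        OUT[n], covered[n] = out, cov
        return changed

    body = [n for n in nodes if n != start]
    while any([step(n) for n in body]):  # the list forces a full pass
        pass
    for n in body:                       # final update after stabilization
        step(n)
    return covered

# Straight-line CFG 1 -> 2 -> 3 with x defined at 1 and used at 3;
# Sleepy sets are taken empty (no edge-DUAs in this toy program).
dua = (1, 3, 'x')
res = subsumption(
    nodes=[1, 2, 3], preds={1: [], 2: [1], 3: [2]}, back_edges=set(),
    start=1, U={dua},
    born={1: frozenset({dua}), 2: frozenset(), 3: frozenset()},
    disabled={n: frozenset() for n in (1, 2, 3)},
    pot_covered={1: frozenset(), 2: frozenset(), 3: frozenset({dua})},
    sleepy={n: frozenset() for n in (1, 2, 3)})
```

On this toy input, only node 3 subsumes the DUA (every path reaching 3 necessarily covers it), while nodes 1 and 2 subsume nothing, matching the intended reading of Covered(n).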

5. Correctness, Termination, and Complexity

The SA is correct by standard arguments for distributive frameworks. Termination follows from monotonically decreasing updates of OUT(n) and Covered(n) within a finite lattice. Upon convergence,

  • Soundness: Any DUA in Covered(n) is covered on all paths to n.
  • Completeness: Every DUA so covered is retained by the algorithm, with no over-approximation.

Complexity:

  • Space: O(|N| · |U|).
  • Time per iteration: O(|N| · |U|) due to set operations.
  • Number of iterations: For reducible flow graphs with loop-nesting depth L, at most L + 2; in practice, overall cost is typically near-linear, O(|N| · |U|), with a worst-case bound of O(|N|² · |U|) (Chaim et al., 2021).

6. Illustrative Example

A minimal program fragment demonstrates the DSF computation:

  • Nodes: 1 (x := ...), 2 (conditional), 3 (y := x+1), 4 (z := x+2), 5 (return).
  • Edges: 1→2, 2→3, 2→4, 3→5, 4→5.
  • DUAs (all-uses of x): U = {(1,3,x), (1,4,x)}, encoded as edge-DUAs.

The computation of the Born, Disabled, PotCovered, and Sleepy sets drives the SA. After one iteration, all Covered(n) sets are empty, indicating that no node nontrivially subsumes a DUA; only more elaborate control-flow structures yield non-empty subsumption sets. This concretely illustrates how DSF operationalizes DUA subsumption analysis (Chaim et al., 2021).

7. Summary and Significance

DSF provides a rigorous, distributive framework for systematically analyzing and exploiting subsumption in data flow testing. By formulating transfer functions over the power set lattice of DUAs and leveraging distributivity, DSF enables efficient, precise determination of DUA subsumption sets. The Subsumption Algorithm yields provably correct results with practical time complexity, facilitating resource reduction in data flow testing by eliminating redundant test requirements (Chaim et al., 2021).
