Sequential Dependence (SeqDep)
- Sequential Dependence (SeqDep) is a framework for modeling ordered, probabilistically quantified dependencies in categorical sequences while preserving marginal distributions.
- It underpins applications spanning stochastic modeling, combinatorial optimization, greedy graph algorithms, and query-document matching in information retrieval.
- Its rigorous formulation, including precise covariance decay and dependency graphs, supports efficient parallelization and adaptive algorithm design.
Sequential dependence (SeqDep) concerns the explicit, probabilistically quantified dependence between variables or operations that unfold in a prescribed order—typically in sequences of random variables or algorithmic procedures. This structure contrasts with both independent sequences and other types of dependency (such as block or spatial dependency), and it induces unique statistical, computational, and algorithmic phenomena. SeqDep is foundational in stochastic modeling, combinatorial optimization, random projection analysis, and probabilistic graphical modeling. The principal frameworks span categorical random variable dependencies (Traylor et al., 2017), adaptive sequential statistical processes (Li, 2024), algorithmic dependency in greedy graph algorithms (Blelloch et al., 2012), and retrieval models such as the sequential dependence model (SDM) (Dietz et al., 2018).
1. Formal Characterization of Sequential Dependence
Sequential dependence in categorical random variables is rigorously formalized by Traylor & Hathcock via a dependency coefficient $\delta \in [0, 1]$ applied within a sequence $X_1, X_2, \ldots, X_n$, each taking values in a finite alphabet $\mathcal{A} = \{1, \ldots, K\}$. The dependency manifests through transition rules where, for $n \geq 1$ and $j, k \in \mathcal{A}$,

$$P(X_{n+1} = j \mid X_n = k) = \begin{cases} p_k + \delta\,(1 - p_k), & j = k,\\ (1 - \delta)\, p_j, & j \neq k, \end{cases}$$

with marginal probabilities $P(X_n = k) = p_k$. This construction “boosts” the recurrence of categories while enforcing normalization: each conditional row sums to $1$. Importantly, despite this dependence, each $X_n$ retains the marginal distribution $(p_1, \ldots, p_K)$, so the variables are identically distributed but not independent.
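The boost rule can be sampled directly. Below is a minimal sketch (function and variable names are illustrative, not from the paper) that draws a SeqDep sequence under the assumed rule — the category just observed is boosted to $p_k + \delta(1-p_k)$, every other category damped to $(1-\delta)p_j$ — and checks empirically that the marginals survive:

```python
import random

def seqdep_sample(p, delta, n, seed=0):
    """Sample a sequentially dependent categorical sequence.

    p     : marginal probabilities p_1, ..., p_K (summing to 1)
    delta : dependency coefficient in [0, 1]
    n     : sequence length

    Boost rule (assumed from the description above): the category just
    observed is boosted to p_k + delta*(1 - p_k); every other category j
    is damped to (1 - delta)*p_j, so each conditional row sums to 1.
    """
    rng = random.Random(seed)
    K = len(p)
    seq = [rng.choices(range(K), weights=p)[0]]   # X_1 drawn from the marginal
    for _ in range(n - 1):
        prev = seq[-1]
        w = [p[j] + delta * (1 - p[j]) if j == prev else (1 - delta) * p[j]
             for j in range(K)]
        seq.append(rng.choices(range(K), weights=w)[0])
    return seq

# Marginals are preserved: empirical frequencies track (p_1, ..., p_K).
p, delta = [0.5, 0.3, 0.2], 0.4
seq = seqdep_sample(p, delta, n=200_000, seed=1)
freq = [seq.count(k) / len(seq) for k in range(len(p))]
```

A short induction confirms the preservation: if $X_n$ has marginal $(p_1,\ldots,p_K)$, then $P(X_{n+1}=k) = p_k\,[p_k + \delta(1-p_k)] + (1-p_k)(1-\delta)p_k = p_k$.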
Beyond this “sequential dependence” (SeqDep), a more general vertical dependency is captured by dependency-generating functions $\alpha(n)$, where each $X_n$ is conditionally dependent only on $X_{\alpha(n)}$, with the “dependency continuity” guarantee that any $X_n$ can always be traced back to $X_1$ via iterated applications of $\alpha$ (Traylor et al., 2017).
2. Statistical Properties and Covariance Structure
A key property of SeqDep is that the cross-covariance between indicator variables decays exponentially in both the lag $j - i$ and the coefficient $\delta$:

$$\mathrm{Cov}\!\left(\mathbf{1}\{X_i = k\},\ \mathbf{1}\{X_j = k\}\right) = \delta^{\,j-i}\, p_k (1 - p_k), \qquad i < j.$$

For binary cases ($K = 2$), this specializes to $\mathrm{Cov}(X_i, X_j) = \delta^{\,j-i}\, p q$ with $p = P(X_n = 1)$ and $q = 1 - p$. This structure provides exact, lag-dependent, tunable correlation without altering the marginal distributions. SeqDep thus enables the embedding of specific temporal correlation profiles into categorical sequences (Traylor et al., 2017).
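The binary-case decay $\delta^{\,j-i}pq$ is easy to check by Monte Carlo. The sketch below (a simulation for illustration, not from the source) generates many short binary SeqDep chains and compares the empirical lag-$3$ covariance against the formula:

```python
import random

def binary_seqdep(p, delta, n, rng):
    """Binary SeqDep chain: the previous outcome's probability is boosted."""
    x = [1 if rng.random() < p else 0]
    for _ in range(n - 1):
        prob1 = p + delta * (1 - p) if x[-1] == 1 else (1 - delta) * p
        x.append(1 if rng.random() < prob1 else 0)
    return x

p, delta, lag, trials = 0.3, 0.5, 3, 200_000
rng = random.Random(42)
samples = [binary_seqdep(p, delta, lag + 1, rng) for _ in range(trials)]
mean_i = sum(s[0] for s in samples) / trials
mean_j = sum(s[-1] for s in samples) / trials
emp_cov = sum(s[0] * s[-1] for s in samples) / trials - mean_i * mean_j
theory = delta**lag * p * (1 - p)   # Cov(X_i, X_j) = delta^(j-i) * p * q
```

The chain is Markov with second eigenvalue $\delta$, which is why the lag-$m$ correlation is exactly $\delta^m$.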
3. Sequential Dependence in Algorithmic and Process Contexts
SeqDep is intrinsic to many greedy and adaptive algorithms, where each step or iterate depends on a subset of its predecessors. In greedy sequential algorithms for maximal independent sets (MIS) and maximal matchings (MM), the dependence structure is captured by a priority DAG, whose depth reflects the length of sequential dependence. Blelloch, Fineman, and Shun show that for any graph and uniformly random ordering, the dependence length of greedy MIS is $O(\log^2 n)$ with high probability (Blelloch et al., 2012). This polylogarithmic depth enables efficient parallelization that respects the underlying SeqDep, maintaining both work efficiency and exact structural equivalence to the sequential procedure.
A summary table of algorithmic SeqDep in greedy MIS/MM algorithms:
| Aspect | Description | Reference |
|---|---|---|
| Dependency Graph | Priority DAG defined by algorithm’s precedence | (Blelloch et al., 2012) |
| Depth Bound | $O(\log^2 n)$ rounds (MIS), $O(\log^2 m)$ rounds (MM), w.h.p. | (Blelloch et al., 2012) |
| Parallelization | Each “root” can be processed in parallel each round | (Blelloch et al., 2012) |
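The round structure in the table can be sketched directly. Below is a minimal Python illustration (helper names are mine): each round processes, in parallel, every undecided vertex that is a "root" of the priority DAG — no still-undecided neighbor has higher priority — and the result coincides exactly with the sequential greedy MIS:

```python
import random

def sequential_greedy_mis(adj, order):
    """Reference sequential greedy MIS over vertices in `order`."""
    mis = set()
    for v in order:
        if all(u not in mis for u in adj[v]):
            mis.add(v)
    return mis

def parallel_greedy_mis(adj, order):
    """Round-based greedy MIS that respects the priority DAG.

    Each round adds all current "roots" (undecided vertices with no
    higher-priority undecided neighbor) to the MIS simultaneously and
    removes their neighbors. Output is identical to the sequential run.
    """
    rank = {v: i for i, v in enumerate(order)}   # lower index = higher priority
    undecided, mis, rounds = set(adj), set(), 0
    while undecided:
        rounds += 1
        roots = {v for v in undecided
                 if all(rank[v] < rank[u] for u in adj[v] if u in undecided)}
        mis |= roots
        undecided -= roots                              # roots join the MIS
        undecided -= {u for v in roots for u in adj[v]} # neighbors excluded
    return mis, rounds

# Random G(n, p) graph and a uniformly random priority order.
random.seed(0)
n, prob = 80, 0.08
adj = {v: set() for v in range(n)}
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < prob:
            adj[i].add(j)
            adj[j].add(i)
order = list(range(n))
random.shuffle(order)
mis_par, rounds = parallel_greedy_mis(adj, order)
mis_seq = sequential_greedy_mis(adj, order)
```

A root's higher-priority neighbors are already decided and necessarily excluded from the MIS (otherwise the root would have been removed), so processing all roots at once is safe — this is the structural-equivalence claim in the table.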
4. SeqDep in High-Dimensional Sequential Random Projection
In sequential random projection, as in streaming and adaptive sketching algorithms, the observed projection at each step is a function of both current and past random choices, inducing a sequence of dependent random variables. The analysis is complicated by the adaptivity; as new data arrives, projections and statistics are updated in a manner dependent on all previous steps.
Li (2024) develops martingale and stopped-process approaches to analyze this dependence, culminating in a non-asymptotic probability bound for sequential embeddings that extends the Johnson-Lindenstrauss (JL) lemma. Here, the adaptation induces sequential dependence because the data $x_t$ at time $t$ is measurable with respect to the filtration $\mathcal{F}_{t-1}$ generated by the past, while the random projection drawn at time $t$ is independent of $\mathcal{F}_{t-1}$ but enters the history for future steps. Martingale concentration and self-normalized process techniques are needed to handle the adaptive sequence of dependent events, rather than classic union bounds valid only in the i.i.d. case (Li, 2024).
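A toy sketch of this adaptive setup (not Li's construction; dimensions and the adaptation rule are illustrative): each step draws a fresh Gaussian projection independent of the past, but the next input vector is chosen as a function of the sketch just observed, so the sequence of distortions is dependent:

```python
import math
import random

def gaussian_sketch(x, m, rng):
    """Project x in R^d to R^m via a fresh i.i.d. N(0, 1/m) matrix."""
    d = len(x)
    return [sum(rng.gauss(0.0, 1.0) * x[i] for i in range(d)) / math.sqrt(m)
            for _ in range(m)]

rng = random.Random(7)
d, m, T = 50, 400, 5
x = [1.0] * d                        # x_1 is fixed in advance
distortions = []
for t in range(T):
    y = gaussian_sketch(x, m, rng)   # fresh randomness, independent of the past
    ratio = sum(v * v for v in y) / sum(v * v for v in x)
    distortions.append(ratio)
    # The next input adapts to the sketch just observed, so x_{t+1} is
    # measurable w.r.t. the history: this is the sequential dependence.
    x = [xi * (1.0 + 0.1 * (ratio - 1.0)) for xi in x]
```

Each conditional distortion is still $\chi^2_m/m$ by rotation invariance, but a uniform guarantee over the whole adaptive trajectory is exactly what the martingale analysis supplies.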
5. SeqDep in Probabilistic Graphical Models and Information Retrieval
In information retrieval, sequential dependence shapes models of query-document matching. The Sequential Dependence Model (SDM), introduced by Metzler and Croft, models the dependencies between query terms as an undirected Markov random field (MRF) (Dietz et al., 2018). The cliques in the MRF connect each query term and its neighbor(s) to the document variable, capturing both unigram and bigram dependencies. Mathematically, the SDM scoring function is a log-linear combination of unigram and bigram features, which can be recast as a mixture of language models.
Explicitly, the SDM score is

$$\mathrm{score}(Q, D) = \lambda_T \sum_{q \in Q} f_T(q, D) \;+\; \lambda_O \sum_{i=1}^{|Q|-1} f_O(q_i, q_{i+1}, D) \;+\; \lambda_U \sum_{i=1}^{|Q|-1} f_U(q_i, q_{i+1}, D),$$

where $f_T$, $f_O$, and $f_U$ are features based on Dirichlet-smoothed unigrams, ordered bigrams, and windowed bigrams, respectively. The model thus integrates sequential dependence among query terms into the document-scoring process, providing theoretical foundations for both discriminative and generative modeling approaches (Dietz et al., 2018).
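A toy sketch of this scoring function (the two-document corpus, $\lambda$ weights, window size, and helper names are illustrative assumptions, not the reference implementation), with each feature a Dirichlet-smoothed log-probability:

```python
import math
from collections import Counter

def dirichlet_lm(c_doc, len_doc, c_coll, len_coll, mu=2500.0):
    """Dirichlet-smoothed log-probability; the +1 keeps the log finite."""
    p_coll = (c_coll + 1) / (len_coll + 1)
    return math.log((c_doc + mu * p_coll) / (len_doc + mu))

def sdm_score(query, doc, collection, lambdas=(0.85, 0.10, 0.05), w=8):
    lt, lo, lu = lambdas
    coll_terms = [t for d in collection for t in d]
    tf_c, len_c = Counter(coll_terms), len(coll_terms)
    tf_d, len_d = Counter(doc), len(doc)

    def ordered(a, b, toks):           # exact adjacent bigram "a b"
        return sum(1 for i in range(len(toks) - 1)
                   if toks[i] == a and toks[i + 1] == b)

    def windowed(a, b, toks):          # a and b co-occur within w tokens
        return sum(1 for i in range(len(toks))
                   for j in range(i + 1, min(i + w, len(toks)))
                   if {toks[i], toks[j]} == {a, b})

    score = 0.0
    for q in query:                    # f_T: unigram features
        score += lt * dirichlet_lm(tf_d[q], len_d, tf_c[q], len_c)
    for a, b in zip(query, query[1:]): # f_O, f_U: bigram features
        score += lo * dirichlet_lm(ordered(a, b, doc), len_d,
                                   sum(ordered(a, b, d) for d in collection), len_c)
        score += lu * dirichlet_lm(windowed(a, b, doc), len_d,
                                   sum(windowed(a, b, d) for d in collection), len_c)
    return score

coll = [["sequential", "dependence", "model", "for", "retrieval"],
        ["greedy", "graph", "algorithms", "and", "matchings"]]
q = ["sequential", "dependence"]
```

On this toy corpus, the document containing the query terms and their ordered bigram scores strictly higher than the unrelated one.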
6. Generalizations and Graphical Interpretations
Vertical dependency, as formalized by dependency-generating functions $\alpha(n)$, subsumes both SeqDep (when $\alpha(n) = n - 1$) and first-kind dependence (FK; $\alpha(n) = 1$). For each dependency generator $\alpha$, one can construct a dependency graph with edges from $n$ to $\alpha(n)$; dependency continuity guarantees identical marginal distributions for all $X_n$. This graphical viewpoint unifies diverse sequential and non-sequential dependencies into a single general framework, characterizing possible dependency structures via generative tree- or forest-like directed acyclic graphs (Traylor et al., 2017).
Examples include:
- SeqDep: a single directed path from each $n$ to $n-1$, down to $1$
- FK: a star with all $n$ pointing to $1$
- Non-monotonic $\alpha$: more general forests, in which all nodes eventually connect to $1$
These structures illustrate that a wide class of vertically-dependent categorical sequences can be generated with common marginals but exact, tunable lag-dependent cross-covariances.
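The two canonical generators and the continuity condition can be checked mechanically; a small sketch (generator and helper names are mine):

```python
def dependency_edges(alpha, n):
    """Edges k -> alpha(k) of the dependency graph on {1, ..., n}."""
    return {k: alpha(k) for k in range(2, n + 1)}

def traces_to_one(alpha, k):
    """Dependency continuity: iterating alpha from k must reach 1."""
    seen = set()
    while k != 1:
        if k in seen:                 # a cycle would violate continuity
            return False
        seen.add(k)
        k = alpha(k)
    return True

seqdep = lambda k: k - 1              # SeqDep: path n -> n-1 -> ... -> 1
fk = lambda k: 1                      # first-kind (FK): star into 1
```

Any generator passing `traces_to_one` for every node yields the forest-like DAGs described above, with node $1$ as the common root.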
7. Empirical and Practical Implications
Empirical results in parallel greedy algorithms demonstrate that by respecting the sparse sequential-dependence structure, high degrees of parallel speedup are achievable without loss in algorithmic correctness or solution optimality. Sequential dependence, when strictly defined and managed, circumvents the inefficiencies of naïvely parallelizing intrinsically sequential problems (Blelloch et al., 2012).
In sequential random projections, the sharp non-asymptotic concentration bounds derived under SeqDep enable practical guarantees for online, streaming, and adaptive algorithms—settings in which independence does not hold and classical concentration tools fail (Li, 2024).
In information retrieval, the SDM demonstrates that both discriminative and generative modeling frameworks are mathematically equivalent up to parameter reinterpretation, with performance contingent on sufficiently fine-grained parameter optimization, not the theoretical formulation per se. This underscores the foundational role of sequential dependence for both the statistical and algorithmic structure of retrieval models (Dietz et al., 2018).