
Bidirectional Indexing Scheme

Updated 13 April 2026
  • Bidirectional indexing schemes are advanced data structures that support both leftward and rightward extension and contraction of search patterns.
  • They leverage coordinated structures like suffix trees, BWTs, and DAWGs to efficiently handle exact and approximate pattern matching in texts and tries.
  • These schemes achieve optimal trade-offs between space and time efficiency, making them ideal for applications such as genome analysis and large-scale text analytics.

A bidirectional indexing scheme is an advanced data structure paradigm that enables both leftward and rightward extension and contraction of search patterns within text collections or tries. Such schemes are foundational in modern string processing, supporting efficient exact or approximate string matching, pattern discovery, and variable-order text analytics. The core idea is to maintain synchronized data structures that permit constant- or logarithmic-time navigation and update in both directions, for applications that need access to both the prefix and the suffix context of a search pattern.

1. Fundamentals of Bidirectional Indexing

Bidirectional indexing generalizes standard suffix-based text indexes by allowing alternation of left and right search operations. Classically, suffix trees and FM-indexes support either forward (left-to-right) or backward (right-to-left) searches. In bidirectional schemes, two coordinated indexes—typically constructed on the forward and reversed versions of the input—allow tracking and mutation of the search locus on both ends.

Given a text $T \in \Sigma^n$ (or, in the trie case, a labeled tree $T$ of $n$ nodes over an alphabet $\Sigma$ of size $\sigma$), a bidirectional scheme enables, for a current search string $W$, the following primitive operations:

  • extendLeft($c$; $W$): Prepend character $c$ to $W$, yielding $cW$.
  • extendRight($c$; $W$): Append character $c$ to $W$, yielding $Wc$.
  • contractLeft($W$): Remove the leftmost character, returning a suffix of $W$.
  • contractRight($W$): Remove the rightmost character, returning a prefix of $W$.
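
The four primitives above can be pinned down with a deliberately naive reference model. The sketch below is not an index at all: it rescans the text after every operation, so it only illustrates the intended semantics. The class name `NaiveBidirectionalCursor` and its method names are hypothetical, chosen here for illustration.

```python
class NaiveBidirectionalCursor:
    """Reference semantics only: tracks W and the start positions of W in the text.

    Every operation rescans the text, so this costs O(n * |W|) per step; a real
    bidirectional index replaces the rescan with constant- or log-time updates.
    """

    def __init__(self, text):
        self.text = text
        self.pattern = ""
        self._rescan()

    def _rescan(self):
        t, w = self.text, self.pattern
        # The empty pattern matches at every position 0..n.
        self.starts = [i for i in range(len(t) - len(w) + 1) if t.startswith(w, i)]

    def extend_left(self, c):
        self.pattern = c + self.pattern      # W -> cW
        self._rescan()
        return self.starts

    def extend_right(self, c):
        self.pattern = self.pattern + c      # W -> Wc
        self._rescan()
        return self.starts

    def contract_left(self):
        if not self.pattern:
            raise ValueError("cannot contract the empty pattern")
        self.pattern = self.pattern[1:]      # keep a suffix of W
        self._rescan()
        return self.starts

    def contract_right(self):
        if not self.pattern:
            raise ValueError("cannot contract the empty pattern")
        self.pattern = self.pattern[:-1]     # keep a prefix of W
        self._rescan()
        return self.starts


cursor = NaiveBidirectionalCursor("abracadabra")
cursor.extend_right("b")     # W = "b"
cursor.extend_left("a")      # W = "ab"
cursor.extend_right("r")     # W = "abr"
cursor.contract_left()       # W = "br"
```

A model like this is mainly useful as a test oracle: any faster implementation should report the same occurrence sets after the same sequence of operations.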

Synchronization of loci in the forward and reverse indexes is critical: it permits extensions and contractions in any order while guaranteeing that the occurrences reported or traversed are correct and complete. The pioneering approach for labeled tree (trie) indexing combines the explicit suffix tree of the reversed trie with a compact, implicit Directed Acyclic Word Graph (DAWG) representation of the forward trie (Inenaga, 2019). For strings, bidirectional FM-index designs operate over the Burrows-Wheeler transform (BWT) of $T$ and of its reverse (Belazzougui et al., 2016).
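
For the string case, the synchronization of the two loci can be sketched with a pair of suffix-array intervals, one over the text and one over its reverse. The code below is a minimal illustration under stated assumptions, not the succinct construction of the cited papers: it builds suffix arrays by naive sorting, counts characters by scanning, supports only the two extensions (contractions need additional machinery such as suffix tree topology), and assumes the text does not contain the sentinel `$`. The names `BidirectionalBWT`, `build`, and `_step` are hypothetical.

```python
def build(s):
    """Naive suffix array, BWT and C array (C[c] = number of chars in s smaller than c)."""
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    bwt = "".join(s[i - 1] for i in sa)          # s[-1] is the sentinel when i == 0
    C = {c: sum(1 for ch in s if ch < c) for c in set(s)}
    return sa, bwt, C


def _step(bwt, C, c, primary, secondary):
    """Prepend c in the primary index and narrow the secondary interval to match."""
    lo, hi = primary
    # Characters preceding the pattern in the primary text are the characters
    # following it in the other text, so they order the secondary interval.
    smaller = sum(1 for ch in bwt[lo:hi] if ch < c)
    new_lo = C.get(c, 0) + bwt[:lo].count(c)     # standard backward-search step
    new_hi = C.get(c, 0) + bwt[:hi].count(c)
    s_lo = secondary[0] + smaller
    return (new_lo, new_hi), (s_lo, s_lo + (new_hi - new_lo))


class BidirectionalBWT:
    def __init__(self, text):
        self.fwd_sa, self.fwd_bwt, self.C = build(text + "$")
        self.rev_sa, self.rev_bwt, _ = build(text[::-1] + "$")
        n = len(text) + 1
        self.fwd = (0, n)    # half-open interval of suffixes of text starting with W
        self.rev = (0, n)    # half-open interval of suffixes of reversed text starting with reverse(W)

    def extend_left(self, c):
        self.fwd, self.rev = _step(self.fwd_bwt, self.C, c, self.fwd, self.rev)

    def extend_right(self, c):
        # Appending c to W prepends c to reverse(W), so the roles swap.
        self.rev, self.fwd = _step(self.rev_bwt, self.C, c, self.rev, self.fwd)

    def count(self):
        return self.fwd[1] - self.fwd[0]

    def occurrences(self):
        return sorted(self.fwd_sa[i] for i in range(*self.fwd))


index = BidirectionalBWT("abracadabra")
index.extend_right("r")   # W = "r"
index.extend_left("b")    # W = "br"
index.extend_right("a")   # W = "bra"
print(index.count(), index.occurrences())
```

The two intervals always have the same width, which is the invariant that lets left and right extensions be interleaved freely. Replacing the scans with rank queries is what a practical implementation would do.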

2. Data Structures Supporting Bidirectionality

Efficient bidirectional indexing relies on tightly coupled data structures:

| Structure | Purpose | Space complexity |
|---|---|---|
| Suffix tree | Leftward (or rightward) traversal via Weiner/suffix links | $O(n)$ words (strings/tries) |
| DAWG (implicit/compact) | Recognizes substrings for extensions in direct/forward orientation | $O(n)$ (implicit), up to $O(\sigma n)$ explicit |
| BWT (and BWT of reverse) | Succinct representation of suffix intervals for extend/contract | $O(n \log \sigma)$ bits |
| Balanced-parentheses/topology | Succinct tree representation to support parent/ancestor/lca | $O(n)$ bits |
| Run-length BWT (RLBWT) | Space-efficient BWT variant for repetitive texts | proportional to the number of BWT runs (Cunial et al., 2019) |

Suffix trees and DAWGs are used in the trie case, while for string indexes the two BWTs and their auxiliary rank/select and Weiner-link data structures are central. DAWG transitions are implemented in compact space by micro-macro decomposition: subtrees of the suffix tree are partitioned into small micro-trees, which allows efficient ancestor and Weiner-link queries (Inenaga, 2019).

The concept of affix trees/arrays—bidirectional combinations of classical suffix and reversed-suffix trees or arrays—is another manifestation of bidirectional indexing. However, these may require quadratic space in the worst case for forward tries, motivating research into compact implicit representations (Inenaga, 2019).

3. Core Algorithms and Efficiency

Bidirectional indexing algorithms are characterized by:

  • Construction:
    • Building the suffix tree of the reversed trie and computing the DAWG for the forward trie. In the string case, constructing the BWTs of $T$ and of its reverse is done in $O(n)$ (deterministic or randomized) time (Belazzougui et al., 2016).
    • Implicit DAWG representations using micro-macro decomposition can be constructed in $O(n)$ time and space, independent of alphabet size (Inenaga, 2019).
    • Fully-functional bidirectional indexes can be constructed in $O(n)$ (randomized) time and $O(n \log \sigma)$ bits (Belazzougui et al., 2016, Cunial et al., 2019).
  • Query Operations:
    • Extend/Contract: For the trie, each operation (extend-left, extend-right) is implemented in $O(\log \sigma)$ time via edge or Weiner-link simulation in the suffix tree and DAWG. For succinct BWT-based schemes, all four primitive operations run in $O(1)$ time per operation (Belazzougui et al., 2016, Cunial et al., 2019).
    • Enumeration: Once the locus of the desired pattern $W$ has been reached, its $occ$ occurrences are output in $O(occ)$ time by subtree enumeration.
    • Bidirectional Search Interface: Maintains two search loci: positions in the reverse suffix tree and in the DAWG or BWT/FMI, ensuring that any extension/contraction operation can be mapped to a well-defined state in both structures (Inenaga, 2019, Belazzougui et al., 2016).

Complexity summary: for a pattern of length $m$ with $occ$ occurrences, the total query time is $O(m \log \sigma + occ)$ (DAWG+STree, trie case (Inenaga, 2019)) or $O(m + occ)$ (bidirectional BWT, string case (Belazzougui et al., 2016, Cunial et al., 2019)), with linear or near-linear preprocessing time and space.

4. Underlying Theoretical Principles

Central to efficient bidirectional indexing are several theoretical constructs:

  • Weiner Links: These generalize the notion of extending substrings in the suffix tree by prepending a character. Hard Weiner links correspond to primary transitions in the DAWG; soft Weiner links correspond to secondary transitions. The simulation of arbitrary DAWG traversals in linear space leverages this duality (Inenaga, 2019).
  • Suffix Link and Affix Link Mapping: The interplay between suffix links, reverse-suffix links, and affix links allows bidirectional navigation and mapping between corresponding loci in forward and reverse trees (Inenaga, 2019, Cunial et al., 2019).
  • Balanced-Parentheses Representation: This succinct data structure enables $O(1)$ navigation (parent, ancestor, lca, child queries) within suffix trees, further facilitating constant-time contract/extend operations (Belazzougui et al., 2016, Cunial et al., 2019).
  • Run-Length Compression and Compact BWTs: For repetitive texts, run-length compressed BWTs markedly reduce space overhead while preserving efficient operation support (Cunial et al., 2019).
  • Search Scheme Formalism: In approximate matching, the restructuring of pattern search as a traversal over search schemes with partitioned/ordered blocks enables optimal trade-offs between index operations and search space enumerations (Kucherov et al., 2013).
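
The search-scheme idea in the last item can be made concrete with a small coverage check. The sketch below assumes one common way of writing a search: a triple (order, lower, upper), where `order` is the sequence in which pattern parts are matched and `lower`/`upper` bound the cumulative number of errors after each part. The three-search scheme for two errors shown here is an illustrative example chosen for this sketch, and the code itself verifies that it covers every error distribution; the function names are hypothetical.

```python
from itertools import product


def covers(search, dist):
    """True if the error distribution `dist` (errors per part) is allowed by `search`."""
    order, lower, upper = search
    total = 0
    for part, lo, hi in zip(order, lower, upper):
        total += dist[part - 1]          # parts are numbered from 1
        if not lo <= total <= hi:
            return False
    return True


def is_connected(order):
    """A bidirectional index can only grow a contiguous block of parts."""
    for i in range(1, len(order) + 1):
        seen = order[:i]
        if max(seen) - min(seen) + 1 != i:
            return False
    return True


def uncovered(scheme, parts, k):
    """Error distributions with at most k errors that no search in the scheme allows."""
    dists = [d for d in product(range(k + 1), repeat=parts) if sum(d) <= k]
    return [d for d in dists if not any(covers(s, d) for s in scheme)]


# Illustrative scheme: 3 parts, up to 2 errors.
SCHEME = [
    ((1, 2, 3), (0, 0, 0), (0, 2, 2)),   # left to right, first part exact
    ((3, 2, 1), (0, 0, 0), (0, 1, 2)),   # right to left, last part exact
    ((2, 1, 3), (0, 0, 1), (0, 1, 2)),   # start in the middle, needs both directions
]

print(all(is_connected(order) for order, _, _ in SCHEME))
print(uncovered(SCHEME, parts=3, k=2))
```

The third search starts from the middle part and extends first to the left and then to the right, which is exactly the capability a unidirectional index lacks.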

5. Applications and Practical Significance

Bidirectional indexing schemes are essential in:

  • Exact and Approximate Pattern Matching: All combinations of prefix and suffix extension/contraction are possible, supporting advanced approximate matching paradigms such as search schemes, which optimize error distribution coverage and minimize enumeration complexity (Kucherov et al., 2013).
  • Genome and Sequence Analysis: High-throughput DNA sequencing requires indexing schemes that can manage bidirectional walks on large-scale, repetitive texts under stringent space and query time constraints (Belazzougui et al., 2016, Cunial et al., 2019).
  • Variable-Order and de Bruijn Graph Analytics: Fully-functional indexes support frequency-aware, variable-order traversal in de Bruijn graphs, providing node and arc frequency computations and dynamic order changes on-the-fly without pre-set bounds (Cunial et al., 2019).
  • Compressed Representation of Repetitive Collections: Space-efficient bidirectional indexes using CDAWG or run-length compressed BWTs enable scalable analysis of large and repetitive string datasets (Cunial et al., 2019).
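
Why run-length compression helps on repetitive inputs can be seen with a toy computation. The sketch below builds the BWT by sorting rotations (quadratic, for illustration only) and counts its runs; it assumes the text does not contain the sentinel `$`, and the helper names are hypothetical.

```python
from itertools import groupby


def bwt(text, sentinel="$"):
    """BWT via sorted rotations; the sentinel must not occur in text."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(s):
    """Collapse maximal runs of equal characters into (char, length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def run_length_decode(runs):
    return "".join(ch * k for ch, k in runs)


for text in ("abracadabra", "ab" * 8):
    transformed = bwt(text)
    runs = run_length_encode(transformed)
    print(text, transformed, len(runs))
```

On the periodic input the number of runs stays small while the text length grows, which is the property run-length BWT indexes exploit.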

6. Trade-offs, Limitations, and Variants

Different bidirectional scheme variants exhibit distinct trade-offs:

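| Index type | Space complexity | Extension/contraction time |
|---|---|---|
| DAWG+STree (trie, implicit DAWG) | $O(n)$ words | $O(\log \sigma)$ per extension |
| Affix tree/array | quadratic in the worst case (forward tries) | (impractical size) |
| Bidirectional BWT/FM-index | $O(n \log \sigma)$ bits | $O(1)$ |
| CDAWG-based (repetitive strings) | $O(e)$ words, with $e$ the number of left/right extensions of maximal repeats | sub-logarithmic |
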
  • Large alphabets ($\sigma = \Theta(n)$) in forward tries induce quadratic worst-case space in explicit DAWG constructions, motivating the linear-space implicit representation (Inenaga, 2019).
  • Fully-functional bidirectional indexes allow both extension and contraction in $O(1)$ time; earlier constructions achieved $O(1)$ time for extension but only supported contraction from specific substrings (Cunial et al., 2019).
  • For highly repetitive texts, CDAWG-based approaches reduce space complexity to $O(e)$ words, where $e$ is the total number of left/right extensions of maximal repeats, with sub-logarithmic per-operation time (Cunial et al., 2019).

7. Illustrative Example

Consider a forward trie and the corresponding reversed trie used for bidirectional indexing. The explicit suffix tree of the reversed trie is $O(n)$ in size, and the implicit DAWG representation for the forward trie is maintained in $O(n)$ space via micro-macro decomposition and simulation of Weiner links. A pattern search such as "ba" is performed by alternately issuing extend-right and extend-left operations, maintaining loci in both the suffix tree and the DAWG, and enumerating occurrences directly from the suffix tree subtree rooted at the final locus (Inenaga, 2019).
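
As a naive baseline for the trie setting, the sketch below finds the occurrences of "ba" in a small forward trie by walking upward from every node. The trie is made up for this illustration (it is not the one from the original example), and the code uses no suffix tree or DAWG: it only shows what the index must compute. Walking upward from a node spells the pattern reversed, which hints at why the reversed trie is the natural object to build a suffix tree on.

```python
# Hypothetical forward trie with 7 nodes; node 0 is the root.
# PARENT[v] is the parent of v, LABEL[v] the character on the edge into v.
PARENT = [None, 0, 0, 1, 1, 2, 3]
LABEL = [None, "a", "b", "a", "b", "a", "b"]
# Root-to-node strings: 1="a", 2="b", 3="aa", 4="ab", 5="ba", 6="aab".


def upward_string(v, length):
    """Characters read while climbing at most `length` edges from v toward the root."""
    chars = []
    while v != 0 and len(chars) < length:
        chars.append(LABEL[v])
        v = PARENT[v]
    return "".join(chars)


def occurrences(pattern):
    """Nodes at which a downward path spelling `pattern` ends."""
    reversed_pattern = pattern[::-1]
    return [
        v
        for v in range(1, len(PARENT))
        if upward_string(v, len(pattern)) == reversed_pattern
    ]


print(occurrences("ba"))
print(occurrences("ab"))
```

This costs time proportional to the number of nodes times the pattern length per query, which is the cost an indexed solution is meant to avoid.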


Bidirectional indexing schemes combine efficient bidirectional pattern matching, compact representation, and navigability in text and trie settings, with linear or near-linear construction costs and constant- or logarithmic-time primitive operations (Inenaga, 2019, Belazzougui et al., 2016, Cunial et al., 2019, Kucherov et al., 2013).
