Efficient detection of generalized bibubbles in bidirected graphs
Develop an efficient algorithm to identify all generalized bibubbles in a bidirected gene graph, i.e., minimal pairs of oriented genes (x, y) whose enclosed vertex set satisfies the specified reachability symmetry and minimality conditions. The method should scale to very large graphs with tens of millions of nodes, including minigraph-cactus and PGGB graphs.
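To make the objects in this statement concrete, here is a minimal Python sketch of a bidirected graph over oriented vertices with a plain breadth-first reachability query. It does not detect bibubbles: the exact reachability and minimality conditions are not restated in this entry, so the sketch only shows the kind of traversal a naive search would repeat. The names `BidirectedGraph`, `flip`, `add_arc`, and `reachable` are hypothetical and are not taken from pangene. The one modeling assumption is the standard one for bidirected graphs: every arc u -> v implies the complement arc flip(v) -> flip(u).

```python
from collections import deque


def flip(v):
    """Return the opposite orientation of an oriented vertex (name, strand)."""
    name, strand = v
    return (name, "-" if strand == "+" else "+")


class BidirectedGraph:
    """Hypothetical container: oriented vertices are (name, '+') or (name, '-')."""

    def __init__(self):
        self.adj = {}

    def add_arc(self, u, v):
        # Store u -> v together with its complement arc flip(v) -> flip(u).
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(flip(v), set()).add(flip(u))

    def reachable(self, src):
        """Return all oriented vertices reachable from src, including src."""
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in self.adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen


# Toy graph: two parallel paths from x+ to y+, one through a+ and one through b-.
g = BidirectedGraph()
g.add_arc(("x", "+"), ("a", "+"))
g.add_arc(("x", "+"), ("b", "-"))
g.add_arc(("a", "+"), ("y", "+"))
g.add_arc(("b", "-"), ("y", "+"))

forward = g.reachable(("x", "+"))   # walk from x in its given orientation
backward = g.reachable(("y", "-"))  # walk from y in the flipped orientation
```

In this toy graph the forward and backward walks visit the same genes in opposite orientations. Each traversal costs time linear in the number of vertices and arcs, so a naive search that repeats such traversals from every oriented vertex grows roughly quadratically, which is one way to see why graphs with tens of millions of nodes call for a different approach.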
References
On the theoretical side, this article presented a rigorous definition of a “bubble” in a bidirected graph, but it did not find an efficient algorithm to identify such generalized bibubbles. While the current implementation in pangene works for gene graphs containing ∼20,000 genes, it will be slow for a minigraph-cactus or PGGB graph that contains tens of millions of nodes. How to efficiently identify generalized bibubbles remains an open and critical problem.