Principled global optimization formulation for pangene graph construction

Develop a principled global optimization formulation for pangene graph construction that reliably encodes known gene-level variations, thereby replacing the current ad hoc heuristics.

Background

Pangene’s current graph construction relies on heuristics to handle paralogy, alignment ambiguities, and complex local topologies. While effective in many curated cases, this approach lacks a unifying optimization framework that could provide robustness and theoretical guarantees.

The authors explicitly state their preference for a global optimization formulation but acknowledge that they have not identified one that reliably captures known gene-level variation patterns, highlighting a broader gap that also affects other pangenome tools.

References

Ideally, we would prefer to model graph construction to a global optimization problem. We have not been able to find such a formulation that can reliably encode known variations.

Exploring gene content with pangene graphs (2402.16185 - Li et al., 25 Feb 2024) in Discussions