Principled global optimization formulation for pangene graph construction

Develop a principled global optimization formulation for pangene graph construction that reliably encodes known gene-level variations, thereby replacing the current ad hoc heuristics.

Background

Pangene’s current graph construction relies on heuristics to handle paralogy, alignment ambiguities, and complex local topologies. While effective in many curated cases, this approach lacks a unifying optimization framework that could provide robustness and theoretical guarantees.

The authors state that they would prefer a global optimization formulation but have not found one that reliably encodes known gene-level variations. This points to a broader gap that also affects other pangenome tools.

References

“Ideally, we would prefer to model graph construction to a global optimization problem. We have not been able to find such a formulation that can reliably encode known variations.”

Exploring gene content with pangene graphs (arXiv:2402.16185 - Li et al., 25 Feb 2024) in Discussions