Resolve annotation discrepancies in complex gene clusters between pangene and Ensembl on T2T-CHM13

Determine the correct gene annotations for those of the 41 discrepancies between the pangene and Ensembl annotations of T2T-CHM13 that fall within large, complex gene clusters, where correctness cannot currently be ascertained.

Background

When comparing pangene-derived annotations to Ensembl annotations for T2T-CHM13, the authors found 41 differences among 18,676 shared genes (about 0.2%). A few specific cases appear to favor pangene, but most of the remaining discrepancies occur within huge, complex gene clusters.
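To make the idea of a "difference among shared genes" concrete, the following is a minimal sketch of one way such a comparison could be set up. It is not the authors' method: the `Locus` type, the `find_discrepancies` helper, the non-overlap criterion, and the gene names and coordinates are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class Locus(NamedTuple):
    """Hypothetical gene placement: contig plus a half-open interval."""
    contig: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Locus, b: Locus) -> bool:
    """True if both loci are on the same contig and their intervals intersect."""
    return a.contig == b.contig and a.start < b.end and b.start < a.end


def find_discrepancies(ann_a: Dict[str, Locus], ann_b: Dict[str, Locus]) -> List[str]:
    """Return shared gene names whose placements do not overlap (assumed criterion)."""
    shared = ann_a.keys() & ann_b.keys()
    return sorted(g for g in shared if not overlaps(ann_a[g], ann_b[g]))


# Toy, made-up annotations; names and coordinates are not real data.
annotation_one = {
    "GENE_A": Locus("chr1", 1_000, 5_000),
    "GENE_B": Locus("chr1", 90_000, 95_000),
    "GENE_C": Locus("chr2", 10_000, 12_000),
}
annotation_two = {
    "GENE_A": Locus("chr1", 1_200, 5_100),     # overlaps: treated as consistent
    "GENE_B": Locus("chr1", 140_000, 145_000), # placed at a different copy
    "GENE_D": Locus("chr3", 500, 900),         # not shared, so ignored
}

discrepant = find_discrepancies(annotation_one, annotation_two)
rate = len(discrepant) / len(annotation_one.keys() & annotation_two.keys())
print(discrepant, f"{rate:.1%}")
```

A check like this can flag that two annotations disagree, but it cannot say which one is right. In a cluster of near-identical paralogs, both placements may be plausible, which is exactly the situation the open problem describes.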

For these complex regions, the authors explicitly state that they cannot currently determine which annotation is correct, indicating an unresolved curation and validation problem for challenging genomic loci.

References

"Most other genes among the 41 differences come from huge complex gene clusters. We cannot tell what the correct annotation is."

Exploring gene content with pangene graphs (2402.16185 - Li et al., 25 Feb 2024) in Results — Analyzing 100 human haplotypes