Multiscale pangenome graphs empower the genomic dissection of mixed-ploidy sugarcane species.

Bibliographic Details
Title: Multiscale pangenome graphs empower the genomic dissection of mixed-ploidy sugarcane species.
Authors: Huang, Yumin (AUTHOR), Zhang, Yixing (AUTHOR), Zhang, Qing (AUTHOR), Zhuang, Gui (AUTHOR), Li, Chunjia (AUTHOR), Wang, Baiyu (AUTHOR), Gao, Ruiting (AUTHOR), Xu, Yi (AUTHOR), Qi, Yiying (AUTHOR), Hua, Xiuting (AUTHOR), Shi, Huihong (AUTHOR), Xu, Qiutao (AUTHOR), Yao, Wei (AUTHOR), Liu, Xinlong (AUTHOR), Qi, Yongwen (AUTHOR), Chen, Baoshan (AUTHOR), Zhang, Muqing (AUTHOR), Ming, Ray (AUTHOR), Tang, Haibao (AUTHOR), Zhang, Jisen (AUTHOR)
Source: Science. 2/5/2026, Vol. 391 Issue 6785, p1-11. 11p.
Subjects: Pan-genome, Polyploidy, Population genetics, Genome editing, Genomics, Sugarcane, Graph theory, Genome-wide association studies
Abstract: The sugarcane genus Saccharum is characterized by complex genomes with diverse ploidy levels. We developed a multiscale graph–based pangenome representation, which integrates nine genome assemblies into a unified reference, representing modern cultivars and founding species. Each homo(eo)logous (encompasses both homologous and homeologous relationships) chromosome set retains 47 to 57 haplotypes and ~74,000 to 271,000 gene alleles. This framework enables multiomics exploration, encompassing homo(eo)log systems and epigenomic signatures. The pangenome facilitates population genomics analyses of 417 mixed-ploidy Saccharum accessions, revealing convergent selection and identifying the Andropogoneae TB1 homolog linked to tillering as a promising gene-editing target to boost cane yield. Additionally, the pangenome supports dosage-informed genome-wide association study, improving heritability estimates and identification of sugar or leaf-angle–associated loci, including SaIRX10 and SaBAK5. Our analytical framework establishes a foundation for graph-based genetic studies in sugarcane and other polyploid genomes.

Editor's summary: Pangenomes were developed to better encompass genetic variation across a species, but this concept is now being expanded to include variation across subspecies as well as genera. A pair of papers in this issue report pangenome creation for crops, which particularly benefit from this analysis approach given their high levels of diversification and ploidy (see the Perspective by Soltis and Soltis). Ma et al. created a pangenome for Brassica rapa by integrating genetic variation from 1720 accessions spanning seven subspecies. Huang et al. created a pangenome for the polyploid sugarcane (Saccharum) that combines nine assemblies from four species. These references were able to help identify structural variations and regions potentially underlying important phenotypes. These pangenomes will serve as a valuable resource for research focusing on the improvement of these crops and will offer insight into domestication processes. —Corinne Simonti and Madeleine Seale

INTRODUCTION: Modern sugarcane cultivars (Saccharum spp.) harbor exceptionally complex genomes shaped by interspecific hybridization, extreme and uneven polyploidy, aneuploidy, and abundant repeats. These features obscure haplotypes and allele dosage, complicating genetic research and breeding. A single linear reference cannot accommodate variation in ploidy, chromosome number, or introgressed backgrounds.

RATIONALE: Graph genomes provide a compact, coordinate-consistent representation of alternative haplotypes and ploidy states, enabling allele-aware alignment, variant discovery, and cross-ploidy comparisons. We set out to build a polyploid-aware, multiscale graph framework for Saccharum that integrates genome- to gene- or protein-level information and supports downstream multiomics and population analyses, with a design expressly intended to scale to other polyploids.

RESULTS: We developed a pangenome from nine assemblies across four Saccharum-related species and multiple ploidy levels, including both modern cultivars and their founding species, encompassing 47 to 57 haplotypes and ~74 to 271 thousand gene alleles. The graph captured ~82% of sugarcane genomic diversity (versus ~34% with a single reference) and operated seamlessly from genome to gene or protein scales. This integrative view delivers concrete biological insight. Clade-level comparisons revealed a pronounced enrichment and diversification of nucleotide-binding leucine-rich repeat receptors (NLRs) in wild relatives, which are a reservoir of disease-resistance alleles for introgression. Pangenome-guided multiomics improved mapping and boosted usable signal, uncovering additional high-confidence epigenomic features. At sugar-transport loci, such as SUT1, graph-resolved accessibility signatures aligned with transcriptional regulation of sucrose traits. Population analyses performed directly on the graph rescued missing diversity, reduced single-reference bias, and enabled cross-ploidy comparisons, exposing convergent selection within Andropogoneae and highlighting carbohydrate and cell wall modules under selection. Functional validation through CRISPR-Cas9 confirmed the domestication gene TB1 as a regulator of tillering in sugarcane. To map traits under high ploidy, we introduced DosageGWAS [genome-wide association study (GWAS) that considers the combined dosage of homo(eo)logous loci across different homo(eo)logs], which models continuous allele dosage per locus and aggregates dosage across homo(eo)logs, avoiding intractable genotype enumeration. DosageGWAS increased heritability estimates and improved sensitivity and precision of associations, recovering dosage-phenotype gradients at sugar- and leaf-angle loci near SaIRX10 and SaBAK5. Lastly, we demonstrated that this framework scales to other complex polyploid pangenomes, including cotton, wheat, and potato, with similar gains in diversity capture, homeolog resolution and potentially trait-mapping power.

CONCLUSION: A multiscale, graph-based Saccharum pangenome resolved haplotypes and allele dosages, strengthened variant discovery, and raised explained phenotypic variance while revealing functions of biologically and agronomically important loci. By compactly representing complex polyploid haplotypes in a single coordinate space, this framework supports marker design, allele mining, and more accurate genomic prediction. Demonstrated to scale to other polyploids, it offers a practical blueprint for mixed-ploidy crop genomics and accelerates gene discovery and improvement in sugarcane and beyond.

Multiscale pangenome graph strategy in polyploid sugarcane: The Saccharum super-pangenome graph compresses the diversity from nine high-quality assemblies spanning multiple ploidy levels into a unified reference, addressing challenges in multiomics analysis, mixed-ploidy population genomics, and GWAS; facilitates the discovery of breeding targets and biological insights; and establishes a foundation for polyploid genomics. LA, leaf angle. [ABSTRACT FROM AUTHOR]
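
Illustrative note: the abstract describes DosageGWAS only in outline (continuous allele dosage per locus, aggregated across homo(eo)logs). The sketch below is not the authors' implementation; it is a minimal toy, assuming dosage is defined as the fraction of alternate-allele copies summed over all homo(eo)logs of a locus, and that association is tested with a plain least-squares fit. The function names, the toy accessions, and the phenotype values are all hypothetical, and a real analysis would also need corrections (for example for population structure) that are omitted here.

```python
from statistics import mean


def aggregate_dosage(alt_counts, copy_numbers):
    """Collapse per-homo(eo)log alternate-allele counts into one value in [0, 1].

    alt_counts[i]   -- alternate-allele copies observed on homo(eo)log i
    copy_numbers[i] -- total copies of homo(eo)log i in this accession
    (Hypothetical definition, assumed for illustration only.)
    """
    total = sum(copy_numbers)
    if total == 0:
        raise ValueError("locus has no copies in this accession")
    return sum(alt_counts) / total


def least_squares(x, y):
    """Return (slope, intercept) of an ordinary least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("dosage does not vary across accessions")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy accessions with different numbers of homo(eo)logs and copies (made up).
# Each entry: (alternate-allele counts, copy numbers) per homo(eo)log.
accessions = [
    ([1, 1], [4, 4]),
    ([2, 1, 3], [4, 4, 4]),
    ([0, 0], [4, 4]),
    ([4, 2], [4, 2]),
]
phenotype = [11.25, 12.5, 10.0, 15.0]  # made-up trait values

# One continuous regressor per accession, regardless of its ploidy, instead of
# one category per possible multi-homo(eo)log genotype combination.
dosages = [aggregate_dosage(alt, copies) for alt, copies in accessions]
slope, intercept = least_squares(dosages, phenotype)
print(dosages, slope, intercept)
```

Because each accession contributes a single dosage value, accessions with different copy numbers can share one regression, which is the practical point of avoiding genotype enumeration in this toy setting.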
Copyright of Science is the property of American Association for the Advancement of Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Psychology and Behavioral Sciences Collection
ISSN:00368075
DOI:10.1126/science.adx1616