PanOryza Genomes Release 3---2021-10-21.md

Oryza PanGenome Release 3.0

Released: November 2021

Summary

Gramene's Oryza Pan-Genome (https://oryza.gramene.org) is a web portal for comparative plant genomics focused on rice varieties. In its third release, the Oryza Pan-Genome provides access to 15 new rice genomes (9 indica, 2 aus, and 4 japonica varieties, including Carolina Gold rice; Zhou et al., 2020; Wang et al., 2018; Stein et al., 2018) and one updated reference genome, Oryza sativa indica (var. 93-11). These 16 new or updated Oryza genomes join the 9 Oryza genomes already available in Gramene, for a total of 25 Oryza genomes.
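The genome counts in this summary can be checked with a short tally. This is only an illustrative sketch: the variable names are assumptions introduced here, and every number is taken directly from the paragraph above.

```python
# Counts of new genomes by varietal group, as stated in the summary.
new_genomes = {"indica": 9, "aus": 2, "japonica": 4}

updated_genomes = 1    # O. sativa indica (var. 93-11)
existing_genomes = 9   # Oryza genomes already available in Gramene

new_total = sum(new_genomes.values())
new_or_updated = new_total + updated_genomes
oryza_total = new_or_updated + existing_genomes

print(new_total, new_or_updated, oryza_total)  # prints: 15 16 25
```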

This release adds fifteen new rice reference genomes, including Carolina Gold, and one updated (indica reference) genome, for a total of 25 Oryza genomes plus Leersia perrieri, the closest relative of the genus Oryza.

Together with 6 plant outgroup species (Arabidopsis thaliana, sorghum, grapevine, Chlamydomonas, Selaginella, and 2 assembly versions of maize: v4 and v5), the 25 Oryza genomes and L. perrieri (the species most closely related to Oryza in the Oryzeae tribe) were used to build 37,315 protein-coding gene family trees. These family trees were constructed with 1,184,741 input proteins from 1,131,828 individual genes.

Baseline gene expression for the Japonica reference is available via the Oryza Pan-Genome search interface and linked to the Expression Atlas, while differential gene expression for the same species is only available on the Expression Atlas website.

For the 9 Oryza genomes in the prior release of the Pan-Genome (i.e., O. sativa japonica, O. barthii, O. brachyantha, O. glaberrima, O. glumaepatula, O. meridionalis, O. nivara, O. punctata, and O. rufipogon), as well as for L. perrieri and the other outgroups, orthology-based pathway projections are available via the Oryza Pan-Genome search interface. Similarly for those genomes, complementary pairwise DNA alignments (e.g., 81 for O. sativa Japonica) and/or synteny maps (e.g., 19 for O. sativa Japonica) are available on the Gramene website. Gramene continues to host genetic variation totalling over 40 million SNPs for the Japonica reference, O. glaberrima and O. glumaepatula.

The genome databases were built in direct collaboration with the Gramene and Ensembl Plants projects. Gene expression and pathway associations were facilitated through collaboration with the Expression Atlas and Plant Reactome projects, respectively. Core funding for the project was provided by the National Science Foundation (NSF IOS-1127112) and the Agricultural Research Service of the U.S. Department of Agriculture (USDA ARS 8062-21000-041-00D).

Release Information

Overall Highlights {#overall-highlights}

Six plant outgroup species (B73 maize, sorghum, Arabidopsis thaliana, grapevine, the lycophyte Selaginella, and the single-celled green alga Chlamydomonas) were used, together with the 25 Oryza genomes and L. perrieri, to build 37,315 protein-coding gene family trees.

Our comparative genomics collection includes a total of 37,315 gene trees that cover the 25 Oryza genomes and 6 outgroups, allowing comparisons between higher eukaryotes, lower plants, and the model plant Arabidopsis. These gene trees were constructed with 1,184,741 input proteins from 1,131,828 individual genes. Pairwise DNA alignments and synteny maps are available for 10 Oryza varieties and L. perrieri on the main Gramene website.

Gene expression and orthology-based pathway projections are available for the Japonica reference via the Search interface.

Databases {#databases}

Comparative Genomics

Gene Trees. A total of 37,315 protein-coding gene family trees were constructed using the peptide encoded by the canonical transcript (i.e., a representative transcript for a given gene) for each of 1,131,828 individual genes (1,184,741 input proteins) from 33 plant genomes.
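The figure of 33 plant genomes follows from the genome lists given in the Summary, and the headline numbers allow a rough size estimate for the trees. The sketch below is illustrative only: the variable names are assumptions, and the per-tree average assumes every input protein ends up in a tree, which these notes do not state.

```python
# Genome tally, using the counts given in the Summary.
oryza_genomes = 25
leersia_genomes = 1  # L. perrieri
outgroup_assemblies = {
    "Arabidopsis thaliana": 1,
    "sorghum": 1,
    "grapevine": 1,
    "Chlamydomonas": 1,
    "Selaginella": 1,
    "maize": 2,  # assembly versions v4 and v5
}

total_genomes = oryza_genomes + leersia_genomes + sum(outgroup_assemblies.values())

# Headline numbers for the gene tree build.
gene_trees = 37_315
input_proteins = 1_184_741
individual_genes = 1_131_828

# Input proteins in excess of one per gene.
extra_proteins = input_proteins - individual_genes

# Rough average only: assumes all input proteins were placed in a tree.
mean_proteins_per_tree = input_proteins / gene_trees

print(total_genomes, extra_proteins, round(mean_proteins_per_tree, 2))
```

Six outgroup species contribute seven assemblies because maize is represented twice, which is how 25 + 1 + 7 reaches 33.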

Whole-Genome Alignments. Pairwise genomic alignments are available for 10 of the Oryza genomes on the main Gramene website (e.g., 81 for the Japonica reference). New alignments across the Oryza genomes will be made available here in future releases.

Synteny. Synteny maps are available for 10 of the Oryza genomes on the main Gramene website (e.g., 19 for the Japonica reference). New synteny maps including the new rice genomes will be made available here in future releases.

Variation

Genetic variation is available for the O. sativa Japonica and Indica references, O. glaberrima, and O. glumaepatula on the main Gramene website. Genetic variation data sets will be made available here in future releases.

Expression

Gene expression data is available only for the Japonica reference genome and was curated and processed through the EMBL-EBI Expression Atlas. The set consists of 10 baseline expression studies (available via the Oryza Pan-Genome search interface) and 93 differential expression studies (available only via the Expression Atlas website).

Pathways

A total of 320 curated pathways for the Japonica reference genome, as well as orthology-based pathway projections for the 10 Oryza species also in Gramene and the six outgroups (v5 of Zea mays only), are available via the Oryza Pan-Genome search interface and linked to Gramene’s Plant Reactome. Pathway projections for O. australiensis, O. minuta, and O. meyeriana var. granulata are only available on Gramene’s Plant Reactome. In addition, projections for O. longistaminata and the prior version of O. sativa indica are available on the Gramene website.

References {#references}

  • Stein, Joshua C., Yeisoo Yu, Dario Copetti, Derrick J. Zwickl, Li Zhang, Chengjun Zhang, Kapeel Chougule, et al. 2018. “Genomes of 13 Domesticated and Wild Rice Relatives Highlight Genetic Conservation, Turnover and Innovation across the Genus Oryza.” Nature Genetics 50 (2): 285–96.

  • Wang, Wensheng, Ramil Mauleon, Zhiqiang Hu, Dmytro Chebotarov, Shuaishuai Tai, Zhichao Wu, Min Li, et al. 2018. “Genomic Variation in 3,010 Diverse Accessions of Asian Cultivated Rice.” Nature 557 (7703): 43–49.

  • Zhou, Yong, Dmytro Chebotarov, Dave Kudrna, Victor Llaca, Seunghee Lee, Shanmugam Rajasekar, Nahed Mohammed, et al. 2020. “A Platinum Standard Pan-Genome Resource That Represents the Population Structure of Asian Rice.” Scientific Data 7 (1): 113.