## SorghumBase Release 2.0
### Released: November 2021
## Summary
SorghumBase, a web portal for comparative plant genomics focused on sorghum crop varieties,
has now released its second version. The release provides access to 13 new sorghum varieties ([Tao _et al_, 2021](https://doi.org/10.1038/s41477-021-00925-x)), for a total of 18 sorghum reference genomes,
and 6 other species to support phylogenetic analyses, and cross-species comparisons. The genome databases
were built in direct collaboration with the [Gramene](http://gramene.org) and
[Ensembl Plants](http://plants.ensembl.org) projects. Other data sets were facilitated via
collaborations with the [Expression Atlas](https://www.ebi.ac.uk/gxa/plant/experiments),
the [Sorghum QTL Atlas](https://aussorgm.org.au/), and the
[Plant Reactome](https://plantreactome.gramene.org/) databases. Core funding for the
project is provided by the Agricultural Research Service of the
[U.S. Department of Agriculture](http://www.usda.gov/) (USDA ARS 59-8062-9-002).

## Release Information
- [Overall Highlights](#overall-highlights)
- [Databases](#databases)
- [Statistics](#statistics)

## Overall Highlights {#overall-highlights}
- 13 new sorghum reference genomes: 

  - [_Sorghum bicolor_ ssp. _bicolor_ IS12661](https://ensembl.sorghumbase.org/Sorghum_is12661)
  - [_Sorghum bicolor_ ssp. _bicolor_ IS3614-3](https://ensembl.sorghumbase.org/Sorghum_is36143)
  - [_Sorghum bicolor_ ssp. _bicolor_ IS38525](https://ensembl.sorghumbase.org/Sorghum_is38525)
  - [_Sorghum bicolor_ ssp. _bicolor_ IS929](https://ensembl.sorghumbase.org/Sorghum_is929)
  - [_Sorghum bicolor_ ssp. _bicolor_ Ji2731](https://ensembl.sorghumbase.org/Sorghum_ji2731)
  - [_Sorghum bicolor_ ssp. _bicolor_ R931945-2-2](https://ensembl.sorghumbase.org/Sorghum_r93194522)
  - [_Sorghum bicolor_ ssp. _bicolor (margaritiferum)_ IS19953](https://ensembl.sorghumbase.org/Sorghum_is19953)
  - [_Sorghum bicolor_ ssp. _bicolor (margaritiferum)_ PI525695](https://ensembl.sorghumbase.org/Sorghum_pi525695)
  - [_Sorghum bicolor_ ssp. _drummondii_ PI532566](https://ensembl.sorghumbase.org/Sorghum_pi532566)
  - [_Sorghum bicolor_ ssp. _verticilliflorum_ AusTRCF317961](https://ensembl.sorghumbase.org/Sorghum_austrcf317961)
  - [_Sorghum bicolor_ ssp. _verticilliflorum_ 353](https://ensembl.sorghumbase.org/Sorghum_is12661)
  - [_Sorghum bicolor_ ssp. _verticilliflorum_ PI536008](https://ensembl.sorghumbase.org/Sorghum_pi536008)
  - [_Sorghum propinquum_ S369-1](https://ensembl.sorghumbase.org/Sorghum_s3691)

- The new Sorghum genomes (see above), together with the original five *Sorghum bicolor bicolor* genomes (BTx623, Rio, Tx2783, Tx430, and Tx436), and 6 plant outgroup species (Japonica rice, B73 maize (assembly versions 4 and 5), _Arabidopsis thaliana_, grapevine, a vascular plant, and a single-celled green algae), were used to build 30,475 protein-coding gene family trees.  The database also includes a total of five pairwise DNA alignments for each of the sorghum genomes aligned to Japonica rice.
  
- We continue to host genetic, structural and phenotypic variation, gene expression and orthology-based pathway projections for the reference genome *S. bicolor* BTx623.

## Databases {#databases}
### Comparative Genomics

[**Gene Trees.**](https://ensembl.sorghumbase.org/prot_tree_stats.html) A total of
30,475 protein-coding gene family trees were constructed using the peptide encoded by
the canonical transcript (i.e., a representative transcript for a given gene) for each
of 796,813 individual genes (836,223 input proteins) from 25 plant genomes.

[**Whole-Genome Alignments**](https://ensembl.sorghumbase.org/compara_analyses.html).
There are pairwise genomic alignments for each one of the following sorghum genomes against
Japonica rice:
- [_Sorghum bicolor_ ssp. _bicolor_ BTx623](https://ensembl.sorghumbase.org/sorghum_bicolor/Location/Compara_Alignments/Image?align=23;db=core;r=4:41625307-41663480)
- [_Sorghum bicolor_ ssp. _bicolor_ RTx430](https://ensembl.sorghumbase.org/sorghum_tx430nano/Location/Compara_Alignments/Image?align=30;db=core;r=Scaffold_2:9298671-9344179)
- [_Sorghum bicolor_ ssp. _bicolor_ RTx436](https://ensembl.sorghumbase.org/sorghum_tx436pac/Location/Compara_Alignments/Image?align=29;db=core;r=4:40945993-40992222)
- [_Sorghum bicolor_ ssp. _bicolor_ Tx2783](https://ensembl.sorghumbase.org/sorghum_tx2783pac/Location/Compara_Alignments/Image?align=28;db=core;r=4:38544936-38590672)
- [_Sorghum bicolor_ ssp. _bicolor_ Rio](https://ensembl.sorghumbase.org/sorghum_rio/Location/Compara_Alignments/Image?align=31;db=core;r=4:37447216-37493025)

### Variation

Genetic variation data sets for over 6.7 million sorghum single nucleotide
polymorphisms (SNPs) (Morris _et al_, 2013; Mace _et al_, 2013) and 1.5 million chemically
induced by ethyl methanesulfonate (EMS) point mutations (Xin _et al_, 2008); 27,884
structural variants (Zheng _et al_, 2011) from the [Database of Genomic Variation Archive](https://www.ebi.ac.uk/dgva/)
(DGVa); and nearly 6,000 QTLs from the [Sorghum QTL Atlas](https://aussorgm.org.au/).

### Expression

Gene expression data for the S. bicolor BTx623 genome reference was curated and
processed through the [EMBL-EBI Expression Atlas](https://www.ebi.ac.uk/gxa/plant/experiments).
The set consists of seven studies with baseline expression in BTx623 (Davidson _et al_,
2012; Makita _et al_, 2014; Wang _et al_, 2018; Emms _et al_, 2016; Turco _et al_, 2017; and
two unpublished studies with SRA projects SRP062564 and SRP029353). At the Expression
Atlas site, baseline expression is also available for a bioenergy sorghum genotype
R.07020 (Kebrom _et al_, 2017), and differential expression for three studies 
(Varoquaux _et al_, 2019; Dugas et al 2011; Gelli et al, 2014), including one 
(Dugas _et al_, 2011) in the BTx623 reference.

For the baseline studies in SorghumBase, sampling was done in a single site
(e.g., stem internode), a discrete cell type (e.g., bundle sheath, leaf mesophyll),
organism parts (including seed, spikelet, stem, leaf, flag leaf, inflorescence, pistil,
embryo, endosperm, anther, pericarp, pollen, root, shoot, floral meristem, flower,
vegetative meristem, vascular/ non-vascular system) or the whole organism, at one or
multiple developmental stages. For example, Wang and collaborators (Wang _et al_, 2018)
examined 11 matched tissues of maize B73 and sorghum BTx623 at different developmental
stages for gene expression profiling totalling 165 experimental assays.

### Pathways

269 orthology-based sorghum pathways associated with 1,248 S. bicolor BTx623 genes.
These were projected from curated Japonica rice pathways and are linked to [Gramene’s Plant Reactome](http://gramene.org/).

## References {#references}
Aken, Bronwen L., Sarah Ayling, Daniel Barrell, Laura Clarke, Valery Curwen, Susan Fairley, Julio Fernandez Banet, et al. 2016.
"The Ensembl Gene Annotation System."
*Database: The Journal of Biological Databases and Curation*.
PMID: 27337980. https://doi.org/10.1093/database/baw093.

Brenton, Zachary W., Elizabeth A. Cooper, Mathew T. Myers, Richard E. Boyles, Nadia Shakoor, Kelsey J. Zielinski, Bradley L. Rauh, William C. Bridges, Geoffrey P. Morris, and Stephen Kresovich. 2016.
"A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy."
*Genetics* 204 (1): 21–33.
PMID: 27356613. https://doi.org/10.1534/genetics.115.183947.

Casa, Alexandra M., Gael Pressoir, Patrick J. Brown, Sharon E. Mitchell, William L. Rooney, Mitchell R. Tuinstra, Cleve D. Franks, and Stephen Kresovich. 2008.
"Community Resources and Strategies for Association Mapping in Sorghum."
*Crop Science* 48 (1): 30–40.
https://doi.org/10.2135/cropsci2007.02.0080.

Davidson, Rebecca M., Malali Gowda, Gaurav Moghe, Haining Lin, Brieanne Vaillancourt, Shin-Han Shiu, Ning Jiang, and C. Robin Buell. 2012.
"Comparative Transcriptomics of Three Poaceae Species Reveals Patterns of Gene Expression Evolution."
*The Plant Journal: For Cell and Molecular Biology* 71 (3): 492–502.
PMID: 22443345. https://doi.org/10.1111/j.1365-313X.2012.05005.x.

Emms, David M., Sarah Covshoff, Julian M. Hibberd, and Steven Kelly. 2016.
"Independent and Parallel Evolution of New Genes by Gene Duplication in Two Origins of C4 Photosynthesis Provides New Insight into the Mechanism of Phloem Loading in C4 Species."
*Molecular Biology and Evolution* 33 (7): 1796–1806.
PMID: 27016024. https://doi.org/10.1093/molbev/msw057.

Gladman, N. et al. "Sorghum root epigenetic landscape during limiting phosphorus conditions." *Manuscript in preparation*.

Goodstein, David M., Shengqiang Shu, Russell Howson, Rochak Neupane, Richard D. Hayes, Joni Fazo, Therese Mitros, et al. 2012.
"Phytozome: A Comparative Platform for Green Plant Genomics."
*Nucleic Acids Research* 40 (Database issue): D1178–86.
PMID: 22110026. https://doi.org/10.1093/nar/gkr944.

Jiao, Yinping, John J. Burke, Ratan Chopra, Gloria Burow, Junping Chen, Bo Wang, Chad Hayes, Yves Emendack, Doreen Ware, and Zhanguo Xin. 2016.
"A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses."
*The Plant Cell*.
PMID: 27354556. https://doi.org/10.1105/tpc.16.00373.

Mace, Emma S., Shuaishuai Tai, Edward K. Gilding, Yanhong Li, Peter J. Prentis, Lianle Bian, Bradley C. Campbell, et al. 2013.
"Whole-Genome Sequencing Reveals Untapped Genetic Potential in Africa’s Indigenous Cereal Crop Sorghum."
*Nature Communications* 4: 2320.
PMID: 23982223. http://doi.org/10.1038/ncomms3320.

Makita, Yuko, Setsuko Shimada, Mika Kawashima, Tomoko Kondou-Kuriyama, Tetsuro Toyoda, and Minami Matsui. 2015.
"MOROKOSHI: Transcriptome Database in Sorghum Bicolor."
*Plant & Cell Physiology* 56 (1): e6.
PMID: 25505007. https://doi.org/10.1093/pcp/pcu187.

Morris, Geoffrey P., Punna Ramu, Santosh P. Deshpande, C. Thomas Hash, Trushar Shah, Hari D. Upadhyaya, Oscar Riera-Lizarazu, et al. 2013.
"Population Genomic and Genome-Wide Association Studies of Agroclimatic Traits in Sorghum."
*Proceedings of the National Academy of Sciences of the United States of America* 110 (2): 453–58.
PMID: 23267105. https://doi.org/10.1073/pnas.1215985110.

Olson, Andrew, Robert R. Klein, Diana V. Dugas, Zhenyuan Lu, Michael Regulski, Patricia E. Klein, and Doreen Ware. 2014.
"Expanding and Vetting Sorghum Bicolor Gene Annotations through Transcriptome and Methylome Sequencing."
*The Plant Genome* 7 (2): plantgenome2013.08.0025. https://doi.org/10.3835/plantgenome2013.08.0025.

Tao, Yongfu, Hong Luo, Jiabao Xu, Alan Cruickshank, Xianrong Zhao, Fei Teng, Adrian Hathorn, et al. 2021. “Extensive Variation within the Pan-Genome of Cultivated and Wild Sorghum.” *Nature Plants* 7 (6): 766–73. https://doi.org/10.1038/s41477-021-00925-x 

Turco, Gina M., Kaisa Kajala, Govindarajan Kunde-Ramamoorthy, Chew-Yee Ngan, Andrew Olson, Shweta Deshphande, Denis Tolkunov, et al. 2017.
"DNA Methylation and Gene Expression Regulation Associated with Vascularization in Sorghum Bicolor."
*The New Phytologist* 214 (3): 1213–29.
PMID: 28186631. https://doi.org/10.1111/nph.14448.

Xin, Zhanguo, Ming Li Wang, Noelle A. Barkley, Gloria Burow, Cleve Franks, Gary Pederson, and John Burke. 2008.
"Applying Genotyping (TILLING) and Phenotyping Analyses to Elucidate Gene Function in a Chemically Induced Sorghum Mutant Population."
*BMC Plant Biology*.
PMID: 18854043. https://doi.org/10.1186/1471-2229-8-103.

Wang, Bo, Michael Regulski, Elizabeth Tseng, Andrew Olson, Sara Goodwin, W. Richard McCombie, and Doreen Ware. 2018.
"A Comparative Transcriptional Landscape of Maize and Sorghum Obtained by Single-Molecule Sequencing."
*Genome Research* 28 (6): 921–32.
PMID: 29712755 https://doi.org/10.1101/gr.227462.117.

Zheng, Lei-Ying, Xiao-Sen Guo, Bing He, Lian-Jun Sun, Yao Peng, Shan-Shan Dong, Teng-Fei Liu, et al. 2011.
"Genome-Wide Patterns of Genetic Variation in Sweet and Grain Sorghum (Sorghum Bicolor)."
*Genome Biology* 12 (11): R114.
PMID: 22104744. http://dx.doi.org/10.1186/gb-2011-12-11-r114.