A comprehensive collection of transcriptomic (and other) markers for identifying individual cells (phenotype, state, etc.) at various scopes
In scRNA-seq data, a major challenge is to identify and classify individual cells. For approaches ranging from machine learning (statistical inference) to manual interpretation, having sets of well-defined genes for various tissues and species (mouse, non-human primates, and human for now) is critical.
The aim of this curation is to bring together a large and comprehensive repertoire of critical gene sets. Many such sets of canonical markers are domain-specific knowledge, spread across published literature, websites, and other sources, as well as the minds of scientists and experts. To preserve this body of knowledge, and in the spirit of open science, please contribute to, promote, and cite this repo. If you choose to contribute, please follow the contribution rules.
I believe it is critical to start at the scope of tissue. As this project grows, I may develop a database system and split the tissues by species. For now, please start by finding the tissue type of interest stored in this repo.
For now, because these genes are mainly used in R, I define a list named 'SGS.LS'.
The layer directly below the top-level list is the tissue. Each tissue then holds the vectors of gene names.
# Top-level list that holds all gene sets
SGS.LS <- list()
# One sub-list per tissue; each will hold vectors of gene names
SGS.LS$Blood <- list()
SGS.LS$Testis <- list()
SGS.LS$Lung <- list()
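
The sketch below shows one way a tissue entry might be filled and queried. It is illustrative only: the set names `TCells` and `BCells`, the marker genes chosen, and the `dataset.genes` vector are assumptions made for this example, not the actual contents of this repo.

```r
# Illustrative sketch: set names and gene choices are assumptions,
# not the gene sets stored in this repo.
SGS.LS <- list()
SGS.LS$Blood <- list()

# Each tissue holds named character vectors of gene symbols.
SGS.LS$Blood$TCells <- c("CD3D", "CD3E", "CD2")     # hypothetical set name
SGS.LS$Blood$BCells <- c("CD19", "MS4A1", "CD79A")  # hypothetical set name

# List the gene sets available for a tissue.
names(SGS.LS$Blood)

# Keep only the markers that are present in a dataset.
# `dataset.genes` is a made-up stand-in for, e.g., the row names
# of an expression matrix.
dataset.genes <- c("CD3E", "CD19", "GAPDH", "MS4A1")
found <- lapply(SGS.LS$Blood, intersect, y = dataset.genes)
```

Keeping each gene set as a plain named character vector means it can be passed to most R functions that expect a vector of gene names, though the exact input format depends on the downstream tool.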