msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format


Overview

The msigdbr R package provides Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software:

  • in an R-friendly "tidy" format with one gene set and member gene pair per row
  • for multiple frequently studied model organisms, such as mouse, rat, pig, zebrafish, fly, and yeast, in addition to the original human genes
  • as gene symbols as well as NCBI Entrez and Ensembl IDs
  • without accessing external resources requiring an active internet connection
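
To make the "tidy" layout concrete, here is a minimal sketch built from a hand-made toy data frame rather than real MSigDB content. The set names and memberships are hypothetical. The column names `gs_name` and `gene_symbol` are meant to mirror those in msigdbr() output, which has additional columns not shown here.

```r
# Toy stand-in for msigdbr() output (hypothetical sets, not real MSigDB data).
toy <- data.frame(
  gs_name = c("TOY_SET_A", "TOY_SET_A", "TOY_SET_B", "TOY_SET_B", "TOY_SET_B"),
  gene_symbol = c("Trp53", "Myc", "Myc", "Egfr", "Kras"),
  stringsAsFactors = FALSE
)

# Each row is one (gene set, gene) pair, so a gene that belongs to
# two sets appears on two rows.
genes_per_set <- table(toy$gs_name)
sets_with_myc <- toy$gs_name[toy$gene_symbol == "Myc"]
```

Because every membership is its own row, counting, filtering, and joining can be done with ordinary data frame operations.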

Installation

The package can be installed from CRAN.

install.packages("msigdbr")

Recent versions of the package include only a small subset of MSigDB because of CRAN size limits. To access the full database, install the msigdbdf package:

install.packages("msigdbdf", repos = "https://igordot.r-universe.dev")

Older releases can be installed from GitHub (specify the exact version):

remotes::install_github("igordot/msigdbr", ref = "v2023.1.1")

Usage

The package data can be accessed using the msigdbr() function, which returns a data frame of gene sets and their member genes. For example, you can retrieve mouse genes from the C2 (curated) CGP (chemical and genetic perturbations) gene sets.

library(msigdbr)
genesets <- msigdbr(species = "mouse", category = "C2", subcategory = "CGP")
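
Many enrichment tools expect a named list of gene vectors rather than a data frame. The sketch below shows one way to do that conversion with base R `split()`. The helper name `as_gene_set_list` is hypothetical and not part of msigdbr, and the toy input stands in for real output so that the example runs without the package data. It assumes the `gs_name` and `gene_symbol` columns.

```r
# Hypothetical helper (not part of msigdbr): convert a tidy gene set
# data frame into a named list with one character vector per gene set.
# Assumes the columns `gs_name` and `gene_symbol`.
as_gene_set_list <- function(df) {
  split(x = df$gene_symbol, f = df$gs_name)
}

# Toy input with made-up set names, standing in for msigdbr() output.
toy_genesets <- data.frame(
  gs_name = c("TOY_SET_A", "TOY_SET_A", "TOY_SET_B"),
  gene_symbol = c("Trp53", "Myc", "Egfr"),
  stringsAsFactors = FALSE
)

gene_set_list <- as_gene_set_list(toy_genesets)
```

With the package installed, the same helper could be applied to the `genesets` object from the call above, as `as_gene_set_list(genesets)`.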

Check the documentation website for more information.