Downstreamer for single‐cell gene prioritization

Anoek edited this page Apr 16, 2024 · 1 revision

The following steps outline the process of prioritizing genes using Downstreamer to create single-cell gene regulatory networks. Downstreamer integrates GWAS summary statistics with a correlation matrix containing the co-regulation patterns within specific cells to identify key genes potentially contributing to disease development.

Before the steps outlined here, a gene-by-gene correlation matrix was generated by calculating the weighted Pearson correlation coefficient of the gene expression for each gene pair. This matrix was then decomposed into its eigenvalues and eigenvectors, and the eigenvectors that together explained 80% of the variance were selected. The selected eigenvectors were then converted into a format that Downstreamer accepts as input.
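
The preprocessing described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code used to produce the actual input: it assumes the weights are one non-negative weight per sample (row), that "explained 80%" means cumulative explained variance computed from the eigenvalues, and the function names `weighted_corr` and `select_eigenvectors` are made up for this example.

```python
import numpy as np

def weighted_corr(expr, weights):
    """Weighted Pearson correlation between the columns (genes) of `expr`.

    expr:    samples x genes expression matrix
    weights: one non-negative weight per sample (an assumption of this sketch)
    """
    expr = np.asarray(expr, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                               # normalise weights to sum to 1
    centered = expr - w @ expr                    # subtract weighted gene means
    cov = (centered * w[:, None]).T @ centered    # weighted covariance matrix
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

def select_eigenvectors(corr, explained=0.80):
    """Return the leading eigenvalues/eigenvectors that reach `explained` variance."""
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]             # sort from largest to smallest
    eigvals = np.clip(eigvals[order], 0, None)    # guard against tiny negative values
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, explained)) + 1
    return eigvals[:k], eigvecs[:, :k]

# Toy data: 50 samples, 6 genes, random weights
rng = np.random.default_rng(0)
expr = rng.normal(size=(50, 6))
weights = rng.uniform(0.5, 1.5, size=50)
values, vectors = select_eigenvectors(weighted_corr(expr, weights))
print(vectors.shape)  # (genes, number of selected eigenvectors)
```

With equal weights, `weighted_corr` reduces to the ordinary Pearson correlation matrix.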

1. GWAS gene aggregation

In this step the GWAS summary statistics are converted from p-values per variant to an aggregate p-value per gene, accounting for LD structures (1000 Genomes phase 3) using PascalX (version 0.0.3).

Arguments used:

| Argument | Description | Value |
| --- | --- | --- |
| --refpanel | Reference panel path | 1kGP_high_coverage_Illumina |
| --gwas | GWAS summary statistics path | {gwas}.snppval.txt.gz |
| --annotation | Genome annotation file | genes_Ensembl94.txt |
| --threads | Number of threads for scoring | 20 |
| --outfile | Output file | {gwas}.txt |
| --rscol | Rs ids column index | 0 |
| --pscol | P-value column index | 1 |
| --window | Gene window | 25000 |
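
As a rough illustration of how these arguments fit together for one GWAS, the sketch below assembles them into a command line. The flags and values are taken from the table above; the script name `pascalx_gene_scoring.py` and the GWAS name `height` are hypothetical placeholders, since this page does not name the wrapper that accepts these arguments.

```python
import shlex

def build_pascalx_command(gwas, script="pascalx_gene_scoring.py"):
    """Assemble the gene-aggregation arguments for one GWAS.

    `script` is a hypothetical placeholder: the page lists the arguments
    but not the program that receives them.
    """
    args = {
        "--refpanel": "1kGP_high_coverage_Illumina",
        "--gwas": f"{gwas}.snppval.txt.gz",
        "--annotation": "genes_Ensembl94.txt",
        "--threads": 20,
        "--outfile": f"{gwas}.txt",
        "--rscol": 0,      # column index of the rs ids
        "--pscol": 1,      # column index of the p-values
        "--window": 25000, # gene window
    }
    cmd = ["python", script]
    for flag, value in args.items():
        cmd += [flag, str(value)]
    return cmd

# "height" is a made-up GWAS name used only for illustration
print(shlex.join(build_pascalx_command("height")))
```

Building the command as a list keeps each value a separate argument, so it can be passed to `subprocess.run` without shell quoting issues.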

2. Downstreamer

Next, the aggregated gene p-values and the reformatted eigenvectors are passed to Downstreamer to prioritize key genes.

Arguments used:

| Argument | Description | Value |
| --- | --- | --- |
| --mode | Downstreamer mode | ENRICH |
| --gwas | GWAS gene p-values | {gwas}.txt |
| --geneCorrelations | Gene expression matrix, force normalized and split per chromosome | permutationGeneCor/geneCorForceNormalchr_ |
| --output | Output file path | {gwas}_keygenes_covCor |
| --genes | File with gene information. col1: geneName (ensg), col2: chr, col3: startPos, col4: stopPos, col5: geneType, col6: chrArm | genes_Ensembl94_protein_coding.txt |
| --expressionEigenVectors | Selected eigenvectors of the gene-gene correlation matrix | {eigenvectors} |
| --covariates | Corrects for potential inflation in correlation signal using median signal | medianGwasSignal.txt |
| --eh | Exclude HLA locus during pathway enrichment (chr6 20mb - 40mb) | |
| -t | Maximum number of calculation threads | 8 |
| --forceNormalGenePvalues | Force normal gene p-values before pathway enrichment | |
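
To make the six-column layout of the `--genes` file and the HLA exclusion window more concrete, the sketch below reads such a file and flags genes that fall in chr6 20-40 Mb. It is not Downstreamer's own code. It assumes a tab-separated file with a header row, chromosome names without a `chr` prefix, and that a gene counts as "in the HLA locus" when it overlaps the window; the gene records are toy values invented for the example.

```python
import csv
import io
from typing import NamedTuple

class Gene(NamedTuple):
    name: str       # col1: geneName (ensg)
    chrom: str      # col2: chr
    start: int      # col3: startPos
    stop: int       # col4: stopPos
    gene_type: str  # col5: geneType
    chr_arm: str    # col6: chrArm

# HLA window as described for --eh (chr6, 20 Mb to 40 Mb)
HLA_CHROM, HLA_START, HLA_STOP = "6", 20_000_000, 40_000_000

def read_genes(handle, has_header=True):
    """Parse a tab-separated gene file (delimiter and header are assumptions)."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)
    return [Gene(r[0], r[1], int(r[2]), int(r[3]), r[4], r[5]) for r in reader if r]

def in_hla(gene):
    """True if the gene overlaps the HLA window (overlap rule is an assumption)."""
    return (gene.chrom == HLA_CHROM
            and gene.start <= HLA_STOP
            and gene.stop >= HLA_START)

# Toy records, not real annotations
toy_file = io.StringIO(
    "geneName\tchr\tstartPos\tstopPos\tgeneType\tchrArm\n"
    "TOY_GENE_A\t6\t31000000\t31010000\tprotein_coding\t6_p\n"
    "TOY_GENE_B\t1\t31000000\t31010000\tprotein_coding\t1_p\n"
)
genes = read_genes(toy_file)
kept = [g.name for g in genes if not in_hla(g)]
print(kept)
```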