Differential Gene Expression Analysis

Output Description

Summary HTML report

  • Location: de_report/report.html
  • Overview of the clustering and DEG results

Raw and normalized count tables

  • Location: de_report/data/bfx*.various_countTables.de-expl.xlsx
  • Normalized counts are raw counts scaled to correct for differences in sequencing depth between libraries
  • VST/rlog values are log2-scale counts that are corrected for sequencing depth and, at the same time, account for the fact that lowly expressed genes vary more
  • Z-scores are VST values further scaled so that each gene has a mean expression of 0 and a standard deviation of 1 across samples; useful for visualizing data in heatmaps
  • PCA data reduce the complexity of sample-to-sample distances to a few dimensions. PC1 and PC2 are used for the 2D visualizations in the above reports; PC1 to PC8 are additionally provided in the Excel sheet
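
The z-score transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's own code: the function name `zscore_rows`, the gene names, and the example values are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption.

```python
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each row (one gene across samples) to mean 0 and standard deviation 1."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)  # sample standard deviation (n - 1); an assumption
        if sd == 0:
            # A gene with identical values in all samples has no variation to scale.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled


# Hypothetical VST values: genes as keys, one value per sample.
vst = {
    "geneA": [8.0, 9.0, 10.0],
    "geneB": [5.0, 5.0, 5.0],
}
z = dict(zip(vst, zscore_rows(list(vst.values()))))
```

Because every gene ends up on the same scale, a heatmap of these values shows relative patterns across samples rather than absolute expression levels.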

Results of differential gene expression analysis

  • Location: de_report/data/bfx*.deseq-results.combined.*.xlsx
  • All comparisons using the same FDR are combined into a single table
  • Location: de_report/data/bfx*.deseq-results.separate.*.xlsx
  • Each comparison is in a separate table
  • Fold-changes, p-values, normalized counts, and more

TPM values

  • Location: de_report/data/bfx*.tpm_per_gene.xlsx
  • kallisto (Bray et al. 2016)
  • TPM values per transcript and gene obtained with kallisto
  • TPM normalization accounts for transcript length and sequencing depth
  • Use for within-sample comparisons between genes (but not between samples)
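
As a rough sketch of how TPM accounts for both transcript length and sequencing depth, the following Python function converts counts to TPM. The function name `tpm` and the example numbers are illustrative assumptions; kallisto itself works with estimated counts and effective transcript lengths, which this simplified version does not model.

```python
def tpm(counts, lengths_bp):
    """Convert per-transcript counts to transcripts per million (TPM).

    Simplified sketch: uses plain counts and annotated lengths in base pairs.
    """
    # Step 1: correct for transcript length (reads per kilobase).
    rpk = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0 for _ in rpk]
    # Step 2: correct for sequencing depth so the values sum to one million.
    return [value * 1e6 / total for value in rpk]


# Hypothetical sample with three transcripts.
example = tpm(counts=[100, 200, 300], lengths_bp=[1000, 2000, 1000])
```

In this example the first two transcripts receive the same TPM although the second has twice the counts, because it is also twice as long. The values within one sample always sum to one million, so they describe relative abundance within that sample.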

Quality Control

  • Location: multiqc/bfx*.do_rnaseq.multiqc.html
  • MultiQC (Ewels et al. 2016, see download)
  • HTML summary report with a general summary of the experiment

Gene counts table

  • Location: de_report/data/bfx*.*.*.tsv.gz
  • The output of featureCounts provides raw counts of all genes and samples

Experimental Setup

  • Location:
    • de_report/data/bfx*.condition.csv
    • de_report/data/bfx*.contrasts.csv
  • Definition of samples, groups and comparisons to run

Input files for Gene Set Enrichment Analysis (GSEA)

  • Location: de_report/data/gsea
  • Description see here

Input files for Morpheus heatmaps

  • Location: de_report/data/bfx*.column_annotation.Morpheus.*.xlsx
  • Description see here

References

Bray, Nicolas L, Harold Pimentel, Páll Melsted, and Lior Pachter. 2016. “Near-Optimal Probabilistic RNA-seq Quantification.” Nat. Biotechnol. 34 (5): 525–27. https://www.nature.com/articles/nbt.3519.
Ewels, Philip, Måns Magnusson, Sverker Lundin, and Max Käller. 2016. “MultiQC: summarize analysis results for multiple tools and samples in a single report.” Bioinformatics 32 (19): 3047–48. https://doi.org/10.1093/bioinformatics/btw354.