Differential Gene Expression Analysis
Output Description
Summary HTML report
- Location:
de_report/report.html
- Overview of the clustering and DEG results
Raw and normalized count tables
- Location:
de_report/data/bfx*.various_countTables.de-expl.xlsx
- Normalized counts are raw counts scaled to correct for differences in sequencing depth between libraries
- Vst/rlog values are log2-scale counts that are corrected for sequencing depth and, at the same time, account for the fact that lowly expressed genes vary more
- Z-scores are vst values that are further transformed so that each gene has a mean expression of 0 and a standard deviation of 1, which makes them well suited for visualizing data in heatmaps
- PCA data reduce the complexity of sample-to-sample distances to a few dimensions. PC1 and PC2 are used for the 2D visualizations in the report above; PC1 to PC8 are additionally provided in the Excel sheet
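The z-score transformation described in the list above can be sketched in a few lines. This is a minimal illustration, not the pipeline's own code: the function name `zscore_rows`, the example matrix, and the use of the sample standard deviation (n - 1) are all assumptions made for the example.

```python
from statistics import mean, stdev

def zscore_rows(matrix):
    """Scale each row (gene) to mean 0 and standard deviation 1."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)  # sample standard deviation (n - 1); an assumption
        if sd == 0:
            # A gene with identical values in all samples carries no signal.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled

# Hypothetical vst values: rows are genes, columns are samples.
vst = [
    [8.0, 9.0, 10.0],
    [5.0, 5.0, 5.0],
]
z = zscore_rows(vst)
```

After scaling, genes with very different expression levels share one color scale in a heatmap, so the sample-to-sample pattern of each gene is visible regardless of its absolute level.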
Results of differential gene expression analysis
- Location:
de_report/data/bfx*.deseq-results.combined.*.xlsx
- All comparisons using the same FDR are combined into a single table
- Location:
de_report/data/bfx*.deseq-results.separate.*.xlsx
- Each comparison is in a separate table
- Fold-changes, p-values, normalized counts, and more
TPM values
- Location:
de_report/data/bfx*.tpm_per_gene.xlsx
- Kallisto (Bray et al. 2016)
- TPM values per transcript and gene obtained with kallisto
- TPM normalization accounts for transcript length and sequencing depth
- Use for within-sample comparisons between genes (but not between samples)
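The two corrections behind TPM, for transcript length and for sequencing depth, can be shown with the generic TPM formula. This is a sketch under stated assumptions: it is not how kallisto computes its values internally (kallisto works with estimated counts and effective lengths), and the function name `tpm` and the example numbers are hypothetical.

```python
def tpm(counts, lengths_kb):
    """Convert per-gene counts to TPM given gene lengths in kilobases."""
    # Step 1: length correction (counts per kilobase).
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all rates are zero; TPM is undefined")
    # Step 2: depth correction, scaling so the sample sums to one million.
    return [rate / total * 1e6 for rate in rates]

# Hypothetical sample: three genes whose counts scale with their lengths.
counts = [100, 200, 300]
lengths_kb = [1.0, 2.0, 3.0]
values = tpm(counts, lengths_kb)
```

In this example all three genes receive the same TPM although their raw counts differ, because the counts are proportional to the gene lengths. Since every sample is scaled to sum to one million, the TPM of a gene also depends on the expression of all other genes in that sample, which is why the values are not suited for comparisons between samples.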
Quality Control
- Location:
multiqc/bfx*.do_rnaseq.multiqc.html
- MultiQC (Ewels et al. 2016, see download)
- HTML report with a general quality summary of the experiment
Gene counts table
- Location:
de_report/data/bfx*.*.*.tsv.gz
- Output of featureCounts with the raw counts for all genes and samples
Experimental Setup
- Location:
de_report/data/bfx*.condition.csv
de_report/data/bfx*.contrasts.csv
- Definition of samples, groups and comparisons to run
Input files for Gene Set Enrichment Analysis (GSEA)
- Location:
de_report/data/gsea
- For a description, see here
Input files for Morpheus heatmaps
- Location:
de_report/data/bfx*.column_annotation.Morpheus.*.xlsx
- For a description, see here
References
Bray, Nicolas L, Harold Pimentel, Páll Melsted, and Lior Pachter. 2016. “Near-Optimal Probabilistic RNA-seq Quantification.” Nat. Biotechnol. 34 (5): 525–27. https://www.nature.com/articles/nbt.3519.
Ewels, Philip, Måns Magnusson, Sverker Lundin, and Max Käller. 2016. “MultiQC: Summarize Analysis Results for Multiple Tools and Samples in a Single Report.” Bioinformatics 32 (19): 3047–48. https://doi.org/10.1093/bioinformatics/btw354.