Variation Analysis

Package Overview

Goal

Identification of sequence variants in germline and somatic samples from whole genomes and exomes

Requirements

  • Paired-end sequencing with a read length of 100 bp or more
  • Germline:
    • 40x technical coverage (number of read pairs * read length * 2 / length of genome or exome)
    • Unique molecular identifiers (UMIs) or amplification-free library preparation suggested
  • Somatic:
    • Paired germline controls highly recommended
    • UMIs or amplification-free library preparation highly recommended
    • Unique dual indexing of samples
    • Exomes: 100x technical coverage for somatic samples
    • Exomes: 40x technical coverage for germline controls
    • Whole genomes: 60x technical coverage for somatic samples
    • Whole genomes: 40x technical coverage for germline controls
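
The coverage targets above can be converted into a sequencing depth per sample. The following is a minimal sketch, assuming technical coverage is the total number of sequenced bases (read pairs * read length * 2) divided by the target length. The function names and the genome and exome sizes are illustrative assumptions, not values from this package description; use the size of your own reference or capture kit.

```python
import math

# Assumed target sizes for illustration only; replace with the real
# reference length or the size of the exome capture kit in use.
GENOME_LENGTH = 3_100_000_000  # approximate human genome, in bases (assumption)
EXOME_LENGTH = 50_000_000      # hypothetical exome capture target, in bases


def technical_coverage(read_pairs: int, read_length: int, target_length: int) -> float:
    """Total sequenced bases divided by the length of the genome or exome."""
    return read_pairs * read_length * 2 / target_length


def read_pairs_needed(coverage: float, read_length: int, target_length: int) -> int:
    """Smallest number of read pairs that reaches the requested coverage."""
    return math.ceil(coverage * target_length / (2 * read_length))


if __name__ == "__main__":
    for label, coverage, target in [
        ("germline whole genome", 40, GENOME_LENGTH),
        ("somatic whole genome", 60, GENOME_LENGTH),
        ("germline exome", 40, EXOME_LENGTH),
        ("somatic exome", 100, EXOME_LENGTH),
    ]:
        pairs = read_pairs_needed(coverage, read_length=150, target_length=target)
        print(f"{label}: {coverage}x needs about {pairs:,} read pairs at 2x150 bp")
```

Technical coverage counts every sequenced base, so the usable coverage after trimming, duplicate removal and off-target loss will be lower.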

Analysis

  • Running the nf-core sarek pipeline
    • Trimming of adapters and low-quality bases, processing of UMIs
    • Mapping to a reference
    • Identification and genotyping of SNVs and small indels
    • Functional annotation of variants based on a defined set of databases
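
As an illustration of how samples might be handed to the pipeline, the sketch below writes a samplesheet for a tumor/normal pair. It assumes the FASTQ samplesheet layout described in the nf-core sarek usage documentation (columns patient, sex, status, sample, lane, fastq_1, fastq_2, with status 0 for the germline control and 1 for the somatic sample); check the documentation of the pipeline version in use. All sample names and file paths are hypothetical placeholders.

```python
import csv
import io

# Column layout assumed from the nf-core sarek usage documentation;
# verify it against the pipeline version you run.
SAREK_COLUMNS = ["patient", "sex", "status", "sample", "lane", "fastq_1", "fastq_2"]


def build_samplesheet(rows: list[dict]) -> str:
    """Render the given rows as CSV text with the assumed sarek header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SAREK_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical tumor/normal pair; names and paths are placeholders.
rows = [
    {"patient": "patient1", "sex": "XX", "status": 0, "sample": "patient1_normal",
     "lane": "L001", "fastq_1": "normal_R1.fastq.gz", "fastq_2": "normal_R2.fastq.gz"},
    {"patient": "patient1", "sex": "XX", "status": 1, "sample": "patient1_tumor",
     "lane": "L001", "fastq_1": "tumor_R1.fastq.gz", "fastq_2": "tumor_R2.fastq.gz"},
]

samplesheet = build_samplesheet(rows)

if __name__ == "__main__":
    print(samplesheet, end="")
```

Giving both rows the same patient value is what links the somatic sample to its germline control in this layout.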

Output

  • Mapped reads for visual inspection
  • Variant tables with genotype information (germline only) and annotation
  • HTML report

Advanced Analysis

  • Analysis of large sample sets (exomes: >= 100 samples, genomes: >= 20 samples)
  • Identification of copy-number variants including functional annotation
  • Identification of structural variants including functional annotation