RAD-seq analysis

Pipeline for quality control of restriction-site associated DNA sequencing (RAD-seq) data, a genotyping-by-sequencing approach that needs no prior genome information.

This is our in-house pipeline for quality control of RAD-seq libraries. RAD-seq allows deep yet sparsely sampled sequencing of many individuals in a highly multiplexed manner. Typical applications include QTL mapping, GWAS, high-resolution population differentiation and phylogeny, pedigree reconstruction, and SNP discovery for other, higher-throughput assays. The pipeline is mainly designed to characterize the data and to attempt to correct defects, e.g. adapter contamination and restriction-site sequencing errors, ahead of downstream analysis.
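To make "restriction-site sequencing errors" concrete, here is a minimal Python sketch of one way such a check could work. It is an illustration only: this page does not describe how the pipeline performs the correction. The cut-site remnant `TGCAGG`, the one-mismatch threshold, and the function names are all assumptions chosen for the example.

```python
from typing import Optional

# Placeholder cut-site remnant expected at the start of each read.
# Assumption for illustration: replace with the remnant of the enzyme actually used.
CUT_SITE = "TGCAGG"


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def rescue_cut_site(read: str, site: str = CUT_SITE,
                    max_mismatches: int = 1) -> Optional[str]:
    """Return the read with its cut site corrected, or None if it is not rescuable.

    A read is kept when its first len(site) bases differ from the expected
    cut site at no more than max_mismatches positions (hypothetical threshold).
    """
    prefix = read[:len(site)]
    if len(prefix) < len(site):
        return None  # read is shorter than the cut site itself
    if hamming(prefix, site) <= max_mismatches:
        return site + read[len(site):]  # overwrite the prefix with the expected site
    return None


if __name__ == "__main__":
    for read in ["TGCAGGACGT", "TGCATGACGT", "AAAAGGACGT"]:
        print(read, "->", rescue_cut_site(read))
```

Reads whose cut site cannot be rescued are returned as `None` here, so a caller can count them, which is one way the fraction of defective reads in a library could be characterized.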

When we run analysis

We run this analysis by request for RAD-seq projects where we have prepared the sequencing library in-house.

It does not require a reference genome to run.

Input data

  • bcl2fastq demultiplexed FastQ files.

Output data


  • Trimmomatic output: FastQ files trimmed by quality score and adapter content, and truncated to a uniform length (typically 100 bp).
  • RAD loci count and estimated sequencing depth per sample, as generated by running Stacks in de novo mode with default parameters. A plot of the number of loci shared across a given number of samples is also included.
  • FastQC report with basic read quality metrics after trimming.
  • MultiQC report, which summarizes the above steps.
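
The per-sample loci count, mean depth, and loci-sharing summary listed above can be sketched in Python as follows. This is a sketch under stated assumptions, not the pipeline's code: the input dictionary, its sample names, and its depth values are invented for illustration, and the way such a table would be extracted from Stacks output is not covered here.

```python
from collections import Counter

# Hypothetical input: for each sample, a mapping of locus ID -> read depth.
# These numbers are invented; real values would be derived from the Stacks output.
depths = {
    "sample_A": {1: 12, 2: 30, 3: 8},
    "sample_B": {1: 20, 3: 10},
    "sample_C": {1: 15, 2: 25, 4: 5},
}


def loci_per_sample(depths):
    """Number of RAD loci observed in each sample."""
    return {sample: len(loci) for sample, loci in depths.items()}


def mean_depth_per_sample(depths):
    """Mean read depth over the loci observed in each sample."""
    return {
        sample: sum(loci.values()) / len(loci)
        for sample, loci in depths.items()
        if loci  # skip samples with no loci to avoid division by zero
    }


def sharing_histogram(depths):
    """Map k -> number of loci that are present in exactly k samples."""
    samples_per_locus = Counter(
        locus for loci in depths.values() for locus in loci
    )
    return dict(sorted(Counter(samples_per_locus.values()).items()))


if __name__ == "__main__":
    print(loci_per_sample(depths))
    print(mean_depth_per_sample(depths))
    print(sharing_histogram(depths))
```

The histogram returned by `sharing_histogram` holds the data behind a loci-sharing plot: the x-axis is the number of samples and the y-axis is the number of loci found in exactly that many samples.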
