Bioinformatic support

The bioinformatics support from NGI varies between facilities and with the type of sequencing project. For more extensive bioinformatics support we recommend that you contact NBIS, the National Bioinformatics Infrastructure Sweden.

NGI bioinformatics support

This is a brief overview of the standard bioinformatics analysis provided by the different NGI facilities.

Due to software license restrictions, best practice bioinformatic analysis is only available to academic and non-profit organization users.


NGI Stockholm

All samples sequenced at NGI Stockholm undergo a series of quality-certified checks. Bioinformaticians make sure not only that the raw data to be delivered is of high quality (e.g., Q30, yield, etc.), but also that the sequences are biologically relevant, by checking for the absence of contamination and, depending on the specific application, by checking the results of the primary analysis (e.g., duplicate rates, GC-content, etc.).
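As an illustration of two of the metrics named above, the following minimal Python sketch computes the fraction of bases at or above Q30 and the GC-content from FASTQ sequence and quality lines. It is not the pipeline used at NGI Stockholm; the function names are hypothetical, and Phred+33 quality encoding is assumed.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def q30_fraction(quality_lines, offset=33):
    """Return the fraction of all bases with Phred quality >= 30."""
    total = 0
    passed = 0
    for line in quality_lines:
        for score in phred_scores(line, offset):
            total += 1
            if score >= 30:
                passed += 1
    return passed / total if total else 0.0


def gc_content(sequences):
    """Return the fraction of G/C among A, C, G and T bases (N is ignored)."""
    gc = 0
    acgt = 0
    for seq in sequences:
        for base in seq.upper():
            if base in "GC":
                gc += 1
                acgt += 1
            elif base in "AT":
                acgt += 1
    return gc / acgt if acgt else 0.0


if __name__ == "__main__":
    # Toy data: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
    toy_sequences = ["ACGT", "GGCC"]
    toy_qualities = ["IIII", "II##"]
    print("Q30 fraction:", q30_fraction(toy_qualities))
    print("GC-content:", gc_content(toy_sequences))
```

Metrics of this kind are normally taken from the QC statistics delivered with the data; the sketch only shows what the numbers mean.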

NGI Stockholm performs automated bioinformatics analyses for four applications. Note that only a limited number of genomes are supported. Please contact NGI Stockholm if you have suggestions for other genomes to include (see contact information).

  • Whole Genome Re-sequencing. We map the reads using BWA and perform variant calling using the GATK pipeline. The variants are annotated using databases such as dbSNP and by using SnpEff. The following files are delivered: BAM, gVCF, VCF.
  • RNA-Seq. The reads are mapped against the reference and a series of tools are run to determine the quality of the data and to provide users with data that is ready to be used for subsequent analysis. For more information please refer to GitHub.
  • Sequence capture (Target resequencing). We map the reads using BWA and perform variant calling using the GATK pipeline. The variants are annotated using databases such as dbSNP and by using SnpEff. Metrics on the success of the capture are included.
  • De novo sequencing. Depending on the genome being sequenced and the libraries used, we assemble the reads with two or three assemblers. A typical selection is ALLPATHS-LG, SOAPdenovo, and ABySS, but other tools may be used depending on the project. To assess assembly quality and suggest which assembly is best, we use tools such as CEGMA and FRCurve. A report is sent to the customer describing the tools employed, the assembly statistics, and the assembly evaluation. For projects aiming to assemble complex genomes, we strongly suggest following the ALLPATHS-LG recipe, i.e., one paired-end library with an insert size of 180 bp and one (or more) mate-pair libraries with an insert size >3 kbp. In this case ALLPATHS-LG can be used to assemble the genome.
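The de novo report mentions assembly statistics without listing them. One statistic commonly used to compare assemblies is N50, the contig length at which contigs of that length or longer contain at least half of the assembled bases. The Python sketch below assumes that N50 is among the statistics of interest; the function name is hypothetical and the code is not taken from the NGI pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the contig at which the running
    total first reaches half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    # Toy assembly with four contigs (total size 100).
    print("N50:", n50([10, 20, 30, 40]))
```

A higher N50 indicates a more contiguous assembly but says nothing about correctness, which is why evaluation tools such as CEGMA and FRCurve are used alongside the basic statistics.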


NGI Uppsala (the SNP&SEQ Technology Platform)

Illumina sequencing

All samples sequenced at the SNP&SEQ Technology Platform are handled by a highly automated processing pipeline, which takes the samples through a series of quality-certified checks. Bioinformaticians supervise the processing and make sure that the delivered data meet our high standards in terms of quality and yield. Further analyses of the data can be offered, depending on the specific application.

Sequencing data is always delivered with statistics on various QC metrics such as base quality value distribution, GC-content, adapter content and nucleotide distribution.

  • Human Whole Genome Re-sequencing. We map the reads using BWA and perform variant calling using the GATK pipeline. The variants are annotated using databases such as dbSNP and by using SnpEff. The following files are delivered: BAM, gVCF, VCF.
  • All other sequencing projects. The following files are delivered: FASTQ.

SNP genotyping and array-based methylation analysis

A result report from SNP genotyping typically includes a text file with the genotype data and/or PLINK files accompanied by files and text describing the results from the QC of the genotype data and SNP markers in the project.

Genotype data are exported in one of three strand orientations: TOP strand as defined by Illumina, PLUS strand according to the human reference genome, or FORWARD strand according to dbSNP.

On request, files in formats suitable for CNV analysis, such as Nexus or PennCNV, are compiled. Result reports from methylation analysis are exported as a text file with beta-values ranging from zero to one, indicating 0 to 100% methylation for each interrogated locus, together with files describing the QC of the results.

Any specific request for the file format of the exported data should be noted in the NGI project registration.
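To illustrate how beta-values map to methylation percentages, the following Python sketch reads a small tab-separated table and converts each beta-value to a percentage. The column names ("locus", "beta") and the locus identifiers are hypothetical; the layout of the delivered text file may differ.

```python
import csv
import io


def beta_to_percent(beta):
    """Convert a beta-value in [0, 1] to percent methylation in [0, 100]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError(f"beta-value out of range: {beta}")
    return beta * 100.0


def read_beta_table(handle):
    """Read a tab-separated table with hypothetical 'locus' and 'beta' columns.

    Returns a dict mapping each locus to its percent methylation.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["locus"]: beta_to_percent(float(row["beta"])) for row in reader}


if __name__ == "__main__":
    # Toy input standing in for a delivered result file.
    example = io.StringIO("locus\tbeta\nlocus_A\t0.25\nlocus_B\t0.5\n")
    for locus, percent in read_beta_table(example).items():
        print(f"{locus}: {percent:.1f}% methylated")
```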


NGI Uppsala (Uppsala Genome Center)

NGI Uppsala (UGC) provides best practice analysis for several applications on Ion Torrent and PacBio.

Ion Torrent

  • Whole Genome Re-sequencing
    The reads are mapped using T-Map, a native Ion Torrent mapper. Run report is included. Variant calling and annotation (human only) are performed upon request. The following files are delivered: BAM and VCF
  • Ion AmpliSeq Human Exome
    The reads are mapped using T-Map, and variant calling is performed using TS Variant Caller. Run report is included. Upon request the variants can be filtered using the CanvasDB system. The following files are delivered: BAM and VCF
  • Ion AmpliSeq Panels (Ready-To-Use or Custom)
    The reads are mapped using T-Map, and variant calling is performed using TS Variant Caller. Run report is included. The following files are delivered: BAM and VCF
  • RNA-Seq (polyA-selected/total RNA)
    The reads are mapped using T-Map if a reference is available. Run report is included. The following files are delivered: BAM
  • Small RNA-Seq
    The reads are mapped using T-Map if a reference is available. Run report is included. The following files are delivered: BAM
  • Ion AmpliSeq Human Whole Transcriptome
    The reads are analysed using the AmpliSeq RNA plugin. Run report is included. Raw read counts and normalized expression values are calculated. The following files are delivered: BAM and gene expression values (CSV/XLS files)
  • Ion 16S Metagenomics
    The reads are analysed using Ion Reporter Metagenomics Workflow, which performs species level identification of microbial populations. The following files are delivered: BAM and Ion Reporter output

PacBio

  • De novo assembly
    An initial assembly is performed using HGAP or FALCON depending on genome type and size. Reports for the run and assembly are included. The following files are delivered: Raw PacBio data, subreads (FASTQ) and assembly files
  • Amplicon sequencing
    High-quality reads of insert are produced using the CCS protocol in SMRT analysis. Run report is included. The following files are delivered: Raw PacBio data and reads of insert FASTQ files
  • Full-length transcript sequencing (Iso-Seq)
    Full-length transcripts are generated using the Iso-Seq protocol in SMRT analysis. Run report is included. The following files are delivered: Raw PacBio data and output from Iso-Seq plugin
  • Base modification analysis (prokaryote only)
    Base modification results are created using the Modification and Motif Analysis protocol in SMRT analysis. Run report is included. The following files are delivered: Raw PacBio data and output from the Base Modification analysis

NBIS, National Bioinformatics Infrastructure Sweden

For additional bioinformatics support NGI recommends that you contact the SciLifeLab platform NBIS. NBIS was started on April 1st 2016 as the continuation of BILS, WABI and a few other organisations. Its support services cover the full range of community needs, from short consultations to long-term engagement.

NBIS also offers up to three hours of free consultation to all projects. If you expect to need these services, we strongly recommend that you get in touch with NBIS for a consultation meeting as early as possible, i.e., when you are in the planning stage of your project. For more information, see the NBIS home page.

