This method is useful for building novel reference genomes, which can serve as a foundation for future research. Long-read technologies such as PacBio and Oxford Nanopore (ONT) can resolve much of the structure of a genome. While PacBio HiFi assemblies do not need to be polished with short reads, Nanopore data requires an extra polishing step using Illumina data, e.g. paired-end libraries or Hi-C. Hi-C adds a further layer of information to long-read data: it arranges scaffolds into chromosomes and helps proofread the assembly.
NGI now offers de novo projects as a single package. You can send in your sample(s) and NGI will take care of the separate library preparations suited to your particular project. A typical setup involves an initial draft genome assembled from long reads, followed by scaffolding to join contigs into longer sequences, and error correction. This is followed by annotation of the new reference genome, e.g. of genes and other functional elements. We also offer DNA extraction as a service for de novo projects if required. For more information, please refer to our recent online webinar.
To decide how contiguous your assembly needs to be, please have a look at the flowchart.
Each study setup is described in more detail below. Once you have chosen the setup suitable for your de novo project, the arrows direct you to the type of data you need. You can read more about the different technologies NGI offers to generate the data in the technology section below.
More information about the applications, methods and bioinformatics options NGI provides can be found further down.
All new projects should be discussed with us before an application is submitted. Please contact us here.
They are commonly used to:
These assemblies result from scaffolded paired-end Illumina reads and have:
Structural variation analysis is cumbersome, and mainly short indels can be analysed. It can be difficult to tell whether an observed variant is present at a single locus or is part of a larger genomic structure. Genomic repeats are usually collapsed or misassembled, and gene duplication events can be hard to detect.
Long-read only assemblies (PacBio or Nanopore)
Hybrid long-read and Hi-C assembly
PacBio SMRT sequencing generates reads tens of kilobases in length, enabling high-quality genome assembly, structural variant analysis, amplicon resequencing, full-length transcript isoform sequencing, full-length 16S rRNA sequencing and amplification-free epigenetic characterization.