This method is useful for building novel reference genomes, which can serve as a foundation for future research. Long-read technologies such as PacBio and Oxford Nanopore (ONT) can resolve much of the structure of a genome. While PacBio HiFi assemblies do not need to be polished with short reads, Nanopore data requires an extra polishing step using Illumina data, e.g. paired-end libraries or Hi-C. Hi-C adds an additional layer of information to long-read data: it arranges scaffolds into chromosomes and helps proofread the assembly.
NGI now offers de novo projects as a single package. You can send in your sample(s) and NGI will take care of the separate library preparation setups suited to your particular project. A typical setup involves an initial draft genome assembled from long reads, followed by scaffolding to increase contiguity and by error correction. This is followed by annotation of the new reference genome, e.g. of genes and other functional elements. We also offer DNA extraction as a service for de novo projects if required. For more info, please refer to our recent online webinar.
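The contiguity gained by scaffolding a draft assembly is commonly summarised with the N50 statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not part of any NGI pipeline; the function name `n50` and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum past half
        # of the total assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical lengths (bp) for the same 1 Mb genome, before and after scaffolding.
draft_contigs = [400_000, 300_000, 200_000, 100_000]
scaffolds = [700_000, 300_000]

print(n50(draft_contigs))  # 300000
print(n50(scaffolds))      # 700000
```

A higher N50 for the same total length indicates a more contiguous assembly, but it says nothing about correctness, which is why scaffolding is combined with error correction.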
Project setup
To decide how contiguous your assembly needs to be, please have a look at the flowchart.
Each study setup is described in more detail below. Once you have chosen the setup suitable for your de novo project, the arrows direct you to the type of data you need. You can read more about the different technologies NGI offers to generate the data in the technology section below.
More info about what applications, methods and bioinformatics options NGI provides can be found further down.
All new projects should be discussed with us before you apply. Please contact us here.
Applications
They are commonly used to:
These assemblies result from scaffolded paired-end Illumina reads and have:
Structural variation analysis is very cumbersome, and is mostly limited to short indels. It can be difficult to tell whether an observed variant is confined to a single locus or is part of a larger genomic structure. Genomic repeats are usually collapsed or mis-assembled, and gene duplication events can be hard to detect.
Long-read only assemblies (PacBio or Nanopore)
Hybrid long-read and Hi-C assembly
RNA sequencing of mRNAs selected through poly-A enrichment.
Tags: illumina RNA-Seq library preparation transcriptomics RNA truseq mRNA
Production of high-quality proximity ligation libraries, using two restriction enzymes.
Tags: illumina de novo chromatin scaffolding library preparation epigenetics TADs
A proximity-ligation protocol using a sequence-independent endonuclease, generating data for TAD identification and scaffolding.
Tags: illumina de novo chromatin scaffolding library preparation epigenetics TADs
Low-cost library preparation option for gDNA based on bead-linked transposase. Only for full plates of samples.
Tags: illumina WGS dna nextera normalization library preparation genome
Method for shotgun DNA libraries used for whole genome sequencing and metagenomics.
Tags: illumina WGS dna tagmentation PCR-free library preparation genome
Gold standard method for shotgun DNA libraries used for whole genome sequencing and metagenomics.
Tags: WGS dna library preparation truseq genome illumina
Library preparation from limited input DNA, used in whole genome sequencing and metagenomics etc.
Tags: genome illumina WGS dna library preparation truseq
Library preparation for DNA, ideal for preparing libraries from small amounts of input material. Works well for shotgun libraries, ChIP DNA and FFPE samples, amongst others.
Tags: genome illumina WGS dna library preparation
Nanopore cDNA sequencing is able to sequence entire transcripts in one go, ideal for detecting isoforms and fusion events.
Tags: assembly long-read nanopore
Nanopore instruments can sequence very long continuous fragments of DNA. Sequencing native DNA allows detection of base modifications.
Tags: long-read nanopore assembly
Nanopore direct RNA sequencing is able to sequence entire transcripts from native RNA, opening up opportunities to detect RNA modifications.
Tags: assembly long-read nanopore
PacBio SMRT sequencing generates reads tens of kilobases in length, enabling high-quality genome assembly, structural variant analysis, amplicon resequencing, full-length transcript isoform sequencing, full-length 16S rRNA sequencing and amplification-free epigenetic characterization.
Tags: sv revio smrt assembly pacbio methylation amplicon hifi de novo iso seq
NGI can generate high-quality assemblies using the IPA and hifiasm assemblers.
Tags: pacbio hifi hifiasm ipa hic omnic revio scaffolding assembly
Quality control, basecalling and multiplexing of sequencing reads generated by Oxford Nanopore sequencers.
Tags: long-read nanopore
Additional compute-intensive nanopore raw data processing services provided by NGI.
Tags: methylation base modifications basecalling pod5
Basic quality-control monitoring of Illumina FastQ sequence data.
Tags: QC fastqc fastq screen checkqc
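To give a sense of what the quality-control step on FastQ data looks at, here is a minimal sketch that reads four-line FastQ records and reports the mean Phred quality per read. It assumes the Phred+33 encoding used by current Illumina output. The helper names `parse_fastq` and `mean_phred` and the two example reads are hypothetical; this is an illustration only, not how FastQC or the other tools listed above are implemented.

```python
def parse_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of FastQ lines.

    Assumes strict four-line records: header, sequence, '+', quality.
    """
    it = iter(lines)
    for header in it:
        sequence = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        quality = next(it).rstrip("\n")
        yield header.rstrip("\n")[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of one quality string (Phred+33 assumed)."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(ch) - offset for ch in quality) / len(quality)


# Two made-up reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "ACGT\n", "+\n", "!!II\n",
]

for name, sequence, quality in parse_fastq(example):
    print(name, len(sequence), mean_phred(quality))
# read1 4 40.0
# read2 4 20.0
```

Dedicated QC tools report far more than a per-read mean (per-base quality profiles, adapter content, contamination screens), so a sketch like this is useful only for a quick sanity check on a small file.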