Banner

Interested in the products mentioned in this resource?

Send us a message

Garbage in, garbage out is an expression commonly used in science to express that poor quality DNA libraries will generate erroneous sequencing results. So, when it comes to next-generation sequencing (NGS) how do you ensure that what you put in is good enough to get the quality of results that you need?

A high-quality sequencing library is the linchpin to generating good sequencing data. Ideal libraries consist of purified target sequences of the right size distribution, ligated to appropriate sequencing adaptors, and in sufficient concentrations for your chosen sequencing platform.

While getting here may take some in-house optimisation – dependent on your specific sample type, application and chosen sequencing platform – all library preparation workflows can benefit from some universal tips and tricks. In this blog, we discuss our six top tips to help you improve your sequencing library. 

1. Extraction, extraction, extraction!

Whether you’re identifying sequence-based biomarkers in haematological malignancies or surveying gene expression changes in solid tumours, the first common step to nearly all NGS studies is the extraction of nucleic acids. Choosing a sub-optimal extraction method can ultimately compromise the quality of your library. For example, inadequate cell lysis could result in insufficient yields while carried over contaminants can be detrimental to enzymes used in downstream steps such as ligation.

An incomplete extraction could also introduce bias into your dataset. A successful gene expression analysis of multiple cell types, for example, is contingent on the ability to access mRNA in the same proportion in which they exist in the body. Consequently, cell lysis must account for any variation in the severity needed to lyse different cell types.

Some samples prove particularly recalcitrant to DNA extraction. Formalin-fixation, paraffin-embedded (FFPE) samples, the most frequently used method of tissue preservation, are often challenging to nucleic acid extraction.

The major challenge of nucleic acid extraction from FFPE tissue is the prevalence of chemical crosslinking that binds nucleic acids to other proteins and DNA or RNA strands. This can result in impure, degraded and fragmented nucleic acid samples that are sub-optimal for sequencing libraries.

Damaged DNA not only means lost information but can also lead to false conclusions. For example, extensive DNA damage may result in difficulty distinguishing between true and damaged-induced low frequency mutations. However, there are steps that can be taken to decouple these cross links, preserving original complexity and delivering high-quality sequencing data. For example, SureSeq™ FFPE DNA Repair Mix is a mixture of enzymes that has been optimised to remove a broad range of damage that can cause artefacts in sequencing data.

2. Reduce PCR cycles

PCR cycles can be necessary to generate enough material for sequencing. However, they also increase the risk of introducing bias into your sequencing library. AT-rich regions of DNA, for example, are generally amplified more efficiently than GC-rich regions as it’s easier to denature the double-stranded DNA.

Amplification is increasingly essential for low concentration samples. However, with decreasing nucleic acid concentrations and increasing PCR cycles comes an increasing risk of an erroneous dataset. When input amounts are low, bias amplification of a small subset of nucleic acid sequences can cause a significant drop in diversity and a large skew in your dataset compared to when input amounts are high.

So, what’s the answer?

Simply reducing the number of PCR cycles isn’t always possible when there’s a threshold concentration needed for sequencing. The first step would be to increase the amount of starting material and optimise your extraction steps. But, again, increasing starting material is often not an option.

Therefore, it is also important to choose the optimal sequencing library kit. If amplification is necessary, select a kit that offers high-efficiency end repair, 3’ end ‘A’ tailing and adaptor ligation as this can help minimise the number of required PCR cycles.

Find out how OGT’s Universal NGS Complete Workflow solution gives exceptionally low levels of duplication.

Additionally, selecting a hybridisation enrichment strategy rather than an amplicon enrichment approach is preferable - minimizing errors is critical to experimental outcome such as for variant detection. Although amplicon approaches offer a simpler and faster workflow, hybridisation is a much more robust technique that yields better uniformity of coverage, fewer false positives, and superior variant detection due to the requirement of fewer PCR cycles.

Read more about hybridisation vs. amplicon

3. Use a robust sequence identification strategy

Unique molecular identifiers (UMIs) and Unique dual indexes (UDIs) are strategies used to allow multiplexing and enable accurate demultiplexing during downstream data analysis.

UMIs are short molecular sequences that work like molecular barcodes. By uniquely tagging each molecule in a sample library with a UMI, true variants can be differentiated from errors introduced during library preparation, target enrichment, or sequencing. This approach can be particularly useful if you are trying to identify low frequency variants and to efficiently identify any PCR errors.

Indexing using UDIs involves the ligation of two different index barcodes (i5 and i7) to every sequence molecule. However, while UMIs are tagged to each molecule in a sample, indexes are assigned to a specific library. In combinatorial indexing, each combination of indexes is different however, each i7 and i5 is used multiple times on a plate of libraries (Figure 1). Non-redundant indexing uses UDIs where each library has a completely unique i7 and i5 (Figure 1). This strategy has been incorporated into more modern workflows – such as the use of combinatorial indexing to prepare sequencing libraries – as it allows more accurate demultiplexing and avoids index hopping.

4. Minimise handling steps

Library preparation can require a significant amount of hands-on time in the lab. Not only can this be laborious, but the more steps in your protocol and the longer you spend manipulating your samples, the more chance you have to make a mistake.

No matter how experienced or engaged you are, we all make mistakes. In library preparation, human error can lead to cross-contamination or the introduction of contaminants – including nucleic acids or components that reduce the effectiveness of sequencing. Overall, this may reflect in your results quality or even generate incorrect data.

Some suppliers have optimised their kits to streamline the standard library preparation protocol. For instance, OGT has extensively tested enzymes and buffers to minimise the number of hands-on steps and overall processing time needed for its Universal NGS Complete Workflow. By combining the multi-enzymatic fragmentation and end repair and A-tailing step you’ll need to carry out fewer clean-up and QC steps which helps to minimise human errors.

But if you really want to reduce the risk of human error then it’s time to consider automation. Automation can partially or entirely remove the need for any hands-on time. OGT’s Universal NGS Library Complete Workflow, for example, has been designed to remove the need for secondary equipment such as ultrasonicators and vacuum concentrators easing the development of a more streamlined automation workflow. Indeed, OGT have developed a tailored automation workflow that not only improves throughput but reduces contamination risk and can improve the reliability and quality of your sequencing library.

5. Accurately quantify your library

After meticulously preparing your sequencing library, you don’t want to fail at the final hurdle! Quantifying and normalising your library are the final steps before you begin your sequencing run – and it’s important to make sure this is accurate.

Overestimating your library concentration will result in loading the sequencer with too little input and in turn, reduced coverage. On the other hand, if you underestimate your library concentration, you can overload the sequencer and reduce its performance. Additionally, if you are pooling libraries then incorrect quantification can lead to uneven library concentrations and ultimately, inconsistencies in data quality.

There are numerous methods for nucleic acid quantification including; UV absorption, intercalating dyes, quantitative PCR (qPCR) and droplet digital emulsion PCR. While there is no unambiguous gold standard, fluorometric methods risk overestimating the library concentration because it measures all double stranded DNA in the sample. Conversely, qPCR methods are extremely sensitive and only measure adaptor ligated-sequences.

6. Follow laboratory best practices

Samples are often precious – maybe they’re irreplaceable patient samples or you’ve spent months collecting samples from difficult-to-culture cells. Now, imagine your frustration when your library preparation fails because an enzyme mix was stored at the wrong temperature!

While it may sound obvious, following laboratory best practices is essential to obtaining a good sequencing library:

  • Make sure everything is stored at the right temperature and under the right conditions or else you’ll risk degraded samples, inactive enzymes, and a whole lot of time and money!
  • Aliquot your solutions to avoid damage from repeated freeze-thawing.
  • Wipe down work surfaces with the appropriate solution to reduce the risk of contamination. Take extra care when working with RNA to prevent RNase digestion – and ruined samples!
  • Work under laminar flow hoods or still boxes when necessary to prevent sample contamination.
  • Mix reagents thoroughly before use to ensure even distribution of reagent components and work on ice when necessary to ensure enzyme integrity.
  • Pipette carefully to reduce the chances of pipetting errors and use master mixes to reduce the effect of pipetting error.
  • Be sure to prepare fresh 70% ETOH daily for washes. ETOH may evaporate over time and leave your mix at <70% ETOH. As ETOH evaporates, the concentration of your solution decreases. This may result in DNA being eluted from your Ampure XP beads and being discarded as waste supernatant. This may lead to the loss of important sample and experimental failure.
  • Prepare reaction on ice since enzyme has low activation temperature and will start prematurely at room temperature.

For more expert tips on preparing samples for NGS, visit our NGS support page.

Get the most out of your NGS

To get the most out of your NGS experiment, it is important to consider the factors that determine the quality and quantity of your final library. While some of the challenges to creating a high-quality library are sample- or application-specific, following these tips and selecting the optimal library preparation kit are sure to help you generate reliable sequencing data.

From universal library preparation to methods specific to solid tumour samples, OGT’s NGS library preparation kits are optimised for your application. Working hand-in-hand with both SureSeq targeted cancer enrichment panels, as well as with the CytoSure® Constitutional NGS kit you’ll get the most sensitive and reproducible variant detection with excellent coverage uniformity.

Discover a wide range of SureSeq NGS products designed to support you across your entire NGS workflow

You might also be interested in

Next-generation sequencing: Sequencing depth vs. VAF sensitivity

  • Resource type: Article
  • Application: Haematology
Read

Summoning insights: NGS variant calling best practices

  • Resource type: Article
  • Application: Haematology, Solid tumour, Cytogenetics & rare disease
Read

To target or not to target…

  • Resource type: Article
  • Application: Haematology, Solid tumour, Cytogenetics & rare disease
Read

Connect with our NGS experts today

Call +44 (0)1865 856800 Email contact@ogt.com

Send us a message and we will get back to you