Designing a successful sample preparation workflow for metagenomic DNA analysis
To obtain high-quality, representative metagenomic DNA, you need a robust, highly reproducible sample preparation workflow. However, metagenomic researchers often struggle to identify the optimal workflow, and there are significant challenges at every stage, from collection through DNA extraction and library preparation.
Key factors to consider when designing your workflow for preparing metagenomic DNA include:
- The study question
- Your understanding of the composition of the microbial community (e.g., do you know which microbes should be present?)
- The site-to-site heterogeneity of the microbial community and how this impacts your sampling method
- The habitat you are working with and how the microbiome is influenced by various exposures
- Your sequencing approach and platform
Understanding these factors will help you choose appropriate sampling, homogenization, DNA extraction, and library preparation methods. Below, you will find a description of several key challenges in metagenomic research, their impact on individual studies and the field as a whole, and how the factors above can help you tackle them.
Lysing Matrices
Get reproducible homogenization with MP Bio’s bead beating tubes and ready-to-use lysing matrices for any sample type.
Obtaining a Representative Sample of Metagenomic DNA
The Challenge:
Regardless of the sample type (soils, feces, gut, etc.), microbial communities are complex—composed of hundreds to thousands of different bacteria and viruses, which can vary based on location and external factors. For instance, plant microbiomes vary between the leaf and the stem (Beckers et al.) and the human gut microbiome dramatically changes with diet (David et al.). It is challenging to collect a representative sample as well as preserve representation throughout the sample preparation process.
Where it can go wrong:
- Sample Collection:
  - The sample collection site dictates which microbes are present.
  - The size of your sample dictates which microbes you successfully capture from that location.
  - Collecting too few samples from different areas of the environment restricts your metagenomic DNA to a specific location within your target ecosystem.
  - A poor understanding of the key factors that influence the composition of your community of interest can result in poor study design and misleading results.
- DNA Extraction:
  - Biased or inappropriate DNA extraction methods: some methods are too harsh or too gentle for certain members of the microbial community, so their DNA is either degraded or never released.
  - Contamination or an overabundance of host DNA can make detection of rare species difficult.
Impact
Microbial misrepresentation due to poor sampling or DNA extraction methods paints a flawed picture of community structure and microbial abundance, which leads to false conclusions and makes it challenging to compare results across studies.
Community members go undetected or are overrepresented, causing scientists to misjudge the true composition of a microbial ecosystem. For an individual study or lab, this can lead researchers down a futile path, wasting time and resources, or generate inherently biased conclusions. These results could be taken as truths by peers, creating a biased data pool or suboptimally designed studies.
Alternatively, scientists might prematurely abandon certain projects if their target microbe goes undetected. While the target microbe may truly be absent, its absence could also be an artifact of sampling or extraction issues that cause rare or unknown species to be overlooked.
Choose your sampling and metagenomic DNA extraction method based on your sample type and research question (e.g., how does a pesticide affect the microbial composition vs. how does the microbial composition of the soil affect maize growth), and optimize your methods by comparing methods within an individual study.
How to address this challenge:
- Try to identify any environmentally distinct areas of your sample and make sure each is represented when designing your sampling method.
- Collect samples from various locations within your habitat and assess the level of microbial variability between samples.
- Compare different methods and DNA extraction kits to optimize your in-house method.
- Use a control primer to detect the presence and abundance of a known species.
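One way to assess the level of microbial variability between samples is to compute a pairwise dissimilarity score from per-sample taxon counts. The sketch below uses the Bray-Curtis dissimilarity (0 means identical composition, 1 means no shared taxa). The site names and counts are hypothetical and only illustrate the calculation; real counts would come from your own profiling pipeline.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two taxon -> count dictionaries."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total


def pairwise_dissimilarity(samples):
    """Return {(name_1, name_2): dissimilarity} for every pair of samples."""
    return {
        (x, y): bray_curtis(samples[x], samples[y])
        for x, y in combinations(sorted(samples), 2)
    }


# Hypothetical taxon counts from three sampling sites (illustration only).
samples = {
    "site_A": {"Bacillus": 120, "Pseudomonas": 80, "Streptomyces": 40},
    "site_B": {"Bacillus": 110, "Pseudomonas": 90, "Streptomyces": 30},
    "site_C": {"Bacillus": 10, "Pseudomonas": 20, "Rhizobium": 200},
}

if __name__ == "__main__":
    for pair, value in pairwise_dissimilarity(samples).items():
        print(pair, round(value, 3))
```

Pairs with high dissimilarity suggest environmentally distinct areas that should each be represented in your sampling plan.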
FastDNA™ SPIN Kit for Soil
Great for soil and aquatic samples—use this gold standard kit to rapidly get pure genomic DNA from environmental samples.
Getting Pure DNA in Abundance
The challenge
The limiting factor for metagenomics studies is often the quality and amount of DNA. While you want to isolate DNA representative of your ecosystem, if there isn’t enough of it or if it is contaminated with host DNA, RNA, proteins, or other analytes, your library preparation and sequencing steps may fail. This is a challenge due to the complexity and diversity of metagenomic samples.
Critical questions researchers find difficult to answer include:
- Do I need to pretreat my sample? If so, how, and will representation be preserved?
- Which DNA extraction method should I use? Is there an existing method, or do I need to modify one (or many) to tailor it to my specific sample?
- How do I ensure that my sample is pure and representative?
- How do I maximize the detection of rare or unknown microorganisms?
Impact
Because sample types and research questions vary between studies, it is incredibly difficult to standardize a method that is beneficial for continued use within one lab or for the field at large, often resulting in researchers using methods that are ineffective or biased.
Sequencing technology is very sensitive to contaminants, and the sequencing platform you use can dictate DNA input requirements. If your DNA sample fails to meet the quantity and quality requirements of the specific library preparation or sequencing technique, you will get messy, uninterpretable data, forcing you to troubleshoot where the project went wrong and work out how to avoid these issues in the future.
With optimized methods, you gain the opportunity to streamline your workflow by eliminating time-consuming troubleshooting and adopting high-throughput, automated systems.
How to address this challenge:
- Use a combination of lysis buffers and mechanical force such as bead beating to effectively homogenize your samples while eliminating contaminants and host DNA.
- Compare methods and DNA extraction kits to optimize your method.
- Measure the quantity and purity of your DNA using gel electrophoresis and/or a spectrophotometer.
- Learn how to choose the right lysing matrix for your sample and what the different bead materials and sizes do.
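For the spectrophotometer check mentioned above, a minimal sketch of the usual calculations is shown here. It assumes a 1 cm path length and the common conversion of about 50 ng/µL of double-stranded DNA per A260 unit. The function name and the default thresholds (1.8 for A260/A280 and 2.0 for A260/A230) are illustrative assumptions, not requirements of any particular kit or sequencing platform.

```python
def assess_dna_purity(a260, a280, a230, dilution_factor=1.0,
                      min_260_280=1.8, min_260_230=2.0):
    """Estimate dsDNA concentration and purity ratios from absorbance readings.

    Assumes a 1 cm path length and ~50 ng/uL dsDNA per A260 unit.
    The thresholds are illustrative defaults (assumptions), not a standard.
    """
    if a280 <= 0 or a230 <= 0:
        raise ValueError("A280 and A230 must be positive")
    ratio_260_280 = a260 / a280
    ratio_260_230 = a260 / a230
    return {
        "concentration_ng_per_ul": a260 * 50.0 * dilution_factor,
        "ratio_260_280": ratio_260_280,
        "ratio_260_230": ratio_260_230,
        # A low A260/A280 often points to protein or phenol carryover.
        "passes_260_280": ratio_260_280 >= min_260_280,
        # A low A260/A230 often points to salt or organic carryover.
        "passes_260_230": ratio_260_230 >= min_260_230,
    }


if __name__ == "__main__":
    # Hypothetical readings for a single extract.
    print(assess_dna_purity(a260=0.5, a280=0.27, a230=0.24))
```

Absorbance ratios only flag likely contamination; gel electrophoresis is still useful for checking fragment integrity.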
Optimizing DNA Shearing of Your Metagenomic Samples for Library Preparation
The challenge
DNA shearing is a necessary step in next-generation sequencing, and an optimal method will help generate reliable sequencing results. Several methods are used to fragment DNA: acoustic cavitation, enzymatic digestion, and mechanical shearing.
The challenge is identifying and optimizing the method to:
- Produce appropriately and consistently sized DNA fragments
- Perform unbiased shearing where fragmentation is random and sequence-independent
- Obtain pure DNA samples, eliminating cross-contamination between samples
- Avoid sample loss to retain a high quantity of DNA
Impact
To appreciate the impact, it is important to understand the workflow downstream of DNA shearing. Following shearing, DNA fragments are selected based on size, prepared for sequencing (e.g., tagging), sequenced, and mapped to a reference genome. Sequence-specific DNA shearing could result in a lack of fragment size uniformity. During size selection, researchers discard fragments of undesired length, which could remove important DNA regions and cause microbial members to be under-represented.
Additionally, DNA shearing biased towards certain sequences could affect fragment overlap. If you are monitoring a genomic region commonly shared among the microbial members in your sample, fragments may be indistinguishable from species to species and could cause “mismapping” or an inaccurate assessment of copy number variations. This can result in researchers deeming certain microbial members absent or overabundant—creating a distorted picture of microbial distribution.
Non-random DNA fragmentation is currently being explored as a source of non-uniform coverage of various genomic regions. Of particular focus is sequence-specificity of sonication and hydrodynamic shearing and how understanding these biases can help inform experimental design and read mapping (Poptsova et al.).
To help generate reliable sequencing results, the DNA shearing method needs to be sequence-independent, or you should explore options for correcting for this bias.
How to address this challenge:
- Use an approach that is validated for your specific application (e.g., the FastPrep-96 is validated for high-throughput fragmentation of 100-1000 ng of genomic DNA (gDNA) to generate read lengths of 10-14 kilobases).
- Compare techniques (e.g., acoustic cavitation vs. enzymatic shearing vs. mechanical shearing) on your samples.
- Keep your metagenomic samples cool to avoid degradation.
- Optimize the shearing speed/intensity, shearing time, and number of cycles.
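When comparing shearing settings, it can help to summarize each run by how many fragments fall inside your target size window. The sketch below does this for fragment lengths in base pairs. The run names, lengths, and the 10-14 kb window are hypothetical inputs chosen to match the example above; real lengths would come from your own fragment analyzer or read-length export.

```python
from statistics import median


def summarize_fragments(lengths_bp, target_min_bp, target_max_bp):
    """Return the median length and the fraction of fragments in the target window."""
    if not lengths_bp:
        raise ValueError("no fragment lengths supplied")
    in_window = sum(1 for n in lengths_bp if target_min_bp <= n <= target_max_bp)
    return {
        "median_bp": median(lengths_bp),
        "fraction_in_window": in_window / len(lengths_bp),
    }


def best_condition(runs, target_min_bp, target_max_bp):
    """Pick the run name with the highest fraction of fragments in the window."""
    return max(
        runs,
        key=lambda name: summarize_fragments(
            runs[name], target_min_bp, target_max_bp
        )["fraction_in_window"],
    )


# Hypothetical fragment lengths (bp) from two shearing settings.
runs = {
    "short_cycle": [9000, 11000, 12000, 13000, 15000],
    "long_cycle": [4000, 6000, 7000, 11000, 12000],
}

if __name__ == "__main__":
    for name, lengths in runs.items():
        print(name, summarize_fragments(lengths, 10_000, 14_000))
    print("best:", best_condition(runs, 10_000, 14_000))
```

This only compares size distributions; it does not detect sequence-dependent shearing bias, which needs coverage analysis after mapping.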
FastPrep-96™ Instrument
Maximize productivity with this high-throughput bead beating system—also validated for DNA shearing—that delivers on performance, speed, and reproducibility.
Examples of Researchers Addressing Challenges in Preparing Metagenomic Samples
Publication Title | Challenge | Publication |
The impact of different DNA extraction kits and laboratories upon the assessment of human gut microbiota composition by 16S rRNA gene sequencing | DNA Extraction | Kennedy, Nicholas A., et al. PLoS ONE 9.2 (2014): e88982. doi:10.1371/journal.pone.0088982 |
Shotgun metagenomics reveals the functional diversity of root-associated endophytic microbiomes in maize plant | Environmental Influence & Sampling | Fadiji, Ayomide Emmanuel, Ayansina Segun Ayangbenro, and Olubukola Oluranti Babalola. Current Plant Biology 25 (2021): 100195. |
Development of a robust protocol for the characterization of the pulmonary microbiota | DNA Extraction | Dumont-Leblond, Nathan, et al. Communications Biology 4.1 (2021): 1-9. |
Complementary Metagenomic Approaches Improve Reconstruction of Microbial Diversity in a Forest Soil | Sampling & Sequencing | Alteio, L. V., et al. mSystems 5.2 (2020). |
A Rapid and Economical Method for Efficient DNA Extraction from Diverse Soils Suitable for Metagenomic Applications | DNA Extraction | Devi, Selvaraju Gayathri, et al. PLoS ONE 10.7 (2015): e0132441. |
The Effect of DNA Extraction Methods on Observed Microbial Communities from Fibrous and Liquid Rumen Fractions of Dairy Cows | DNA Extraction | Vaidya, Jueeli D., et al. Frontiers in Microbiology 9 (2018): 92. |
Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community | DNA Shearing (Library Preparation) | Bowers, Robert M., et al. BMC Genomics 16.1 (2015): 1-12. |
Sequencing platform and library preparation choices impact viral metagenomes | DNA Shearing (Library Preparation) | Solonenko, Sergei A., et al. BMC Genomics 14.1 (2013): 1-12. |