Metagenomics Explained
Metagenomics is a molecular biology tool used to analyze DNA acquired from biological materials and environmental samples. With metagenomics, you can study the community of microorganisms in your sample without the necessity of obtaining pure cultures.
Metatranscriptomics and metaproteomics are relatively recent extensions of metagenomics that allow further functional analysis of microbial communities. Metaproteomics, which aims to characterize the proteins produced by the microbiota, is emerging as a complementary approach to metagenomics.
The integrative omics approach generates comprehensive information about the microbial community to help answer who is there, what they can do, and what they are doing there.
The general workflow for a metagenomics study includes the following steps:
- Sample Collection
- Nucleic Acid Isolation
- Library Generation
- Sequencing
- Functional Annotation
Sample Collection and Storage
Obtaining a representative sample that yields enough DNA or RNA for metagenomic analysis is critical. Additionally, variability between samples can influence sequencing results, leading to false conclusions, so it is imperative to treat every sample the same during sample collection, handling, transport, and storage.
Pro Tips:
- Collect samples from a large enough region of the environment of interest to ensure the entire microbial community is represented.
- Collect 6-10 samples to allow for controls, outliers, and possible contamination.
- Use sterile disposable tools and containers or sterilize tools between samples.
Nucleic Acid Isolation, Quantification, and Quality Assessment
Ideally, your metagenomic DNA isolation method will maximize the yield of purified, high-quality DNA from the microbial community under investigation, while avoiding host nucleic acids, enzyme (polymerase) inhibitors, and contamination. When selecting the appropriate lysing matrix and purification scheme, consider the type and number of samples, purpose of the study, equipment availability, and financial constraints. There is no universal method for nucleic acid isolation, but it always involves three general steps:
- Homogenize the sample to release the nucleic acids from the cells
- Isolate the DNA
- Quantify and assess the quality of DNA
In some cases, the sample needs to be preprocessed to remove large debris (using a mesh sieve or coffee filter) that could hinder the effectiveness of cell lysis and DNA isolation. Liquid samples, such as river water, are usually filtered through membranes with appropriately sized pores to collect the microbes. Then the collected cells are lysed and nucleic acids are purified.
Typical filter materials and pore sizes:
- 0.45 μm cellulose nitrate: The most commonly used filter, as it can capture a wide range of microbes of various sizes.
- 0.7 μm glass microfiber GF/F: Characterized by a very rapid flow rate, an extremely high loading capacity, and very fine particle retention. This is particularly useful when your target microbes are 0.6 μm - 0.8 μm.
- 1.2 μm glass microfiber: Particularly useful for samples with high concentrations of particulate or gelatinous substances.
Effectively homogenize the sample
Complete cell lysis is integral to obtaining representative DNA, and several methods can be used to homogenize samples. Bead beating is generally regarded as superior to chemical or enzymatic lysis because it recovers DNA from a more diverse range of microbes, whereas methods without bead beating have been shown to systematically miss certain organisms and spores. How large this advantage is depends on the community being researched.
Bead beating-based kits are generally high-throughput and are often available for specific sample types. For instance, MP Bio offers lysing matrices and bead beating kits to isolate nucleic acids from a variety of sample types, such as soil, environmental aquatic samples, fecal matter, and biological tissues.
Obtain pure DNA or RNA
There are several effective methods for metagenomic DNA isolation, which can be categorized into three general approaches: filtration, precipitation, and centrifugation. Spin filters with membranes that enable DNA binding and washing are commonly used as they can boost efficiency and purity. Many commercial kits, such as the FastPrep DNA, include the materials, reagents, and spin filters needed for cell lysis, DNA extraction, and DNA isolation.
Common DNA extraction methods/reagents include:
- Guanidinium Thiocyanate-Phenol-Chloroform
- Cesium Chloride (CsCl)/ Ethidium Bromide Gradient Centrifugation
- Cetyltrimethylammonium Bromide (CTAB) Extraction
- Alkaline Solutions and Sodium Dodecyl Sulfate (SDS)
Accurately quantify and assess the quality of nucleic acids
To successfully prepare the sequencing library, you’ll need to accurately determine the concentration and assess the purity of the isolated nucleic acids. Purity can be measured spectrophotometrically using the A260/A280 and A260/A230 ratios.
While the A260/A280 ratio is the most popular measure, the A260/A230 ratio can reveal possible impurities, such as guanidine salts, acetate, or other contaminants that could interfere with enzymatic reactions. The A260/A280 ratio should exceed 1.7 for PCR to work on DNA and 1.8 for reverse transcription to work on RNA.
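As a rough illustration, the purity check can be sketched in a few lines of Python. Each ratio is the absorbance at 260 nm divided by the absorbance at 280 nm or 230 nm. The function names are hypothetical, the A260/A280 thresholds are the ones quoted in this article, and the readings in the usage lines are made-up example values rather than measured data.

```python
def purity_ratios(a230, a260, a280):
    """Return (A260/A280, A260/A230) from raw absorbance readings."""
    if a230 <= 0 or a280 <= 0:
        raise ValueError("A230 and A280 readings must be positive")
    return a260 / a280, a260 / a230


def passes_a260_a280(ratio, nucleic_acid="DNA"):
    """Compare an A260/A280 ratio with the minimum quoted in this article."""
    # Thresholds from the article: above 1.7 for DNA (PCR),
    # above 1.8 for RNA (reverse transcription).
    minimum = {"DNA": 1.7, "RNA": 1.8}[nucleic_acid]
    return ratio > minimum


# Made-up example readings, for illustration only.
r280, r230 = purity_ratios(a230=0.25, a260=0.50, a280=0.27)
print(round(r280, 2), round(r230, 2), passes_a260_a280(r280, "DNA"))
```

The sketch applies no threshold to A260/A230 because the article does not give one; that ratio is returned so it can be inspected alongside A260/A280.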
The concentration can be estimated by measuring the absorbance at 260 nm, correcting the A260 measurement for turbidity (measured by absorbance at 320 nm), multiplying by the dilution factor, and assuming:
- A260 of 1.0 = 50 µg/mL of pure dsDNA
- A260 of 1.0 = 20-33 µg/mL of ssDNA
- A260 of 1.0 = 40 µg/mL of RNA
- For dsDNA: Concentration (µg/mL) = (A260 - A320) × dilution factor × 50 µg/mL
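The calculation above can be sketched as a small Python helper. The function and dictionary names are hypothetical. The ssDNA factor uses the upper end (33) of the quoted 20-33 range as an assumption, and the readings in the usage line are illustrative values only.

```python
# Conversion factors in µg/mL per A260 unit, taken from the list above.
# Assumption: ssDNA uses 33, the upper end of the 20-33 range.
CONVERSION_UG_PER_ML = {"dsDNA": 50.0, "ssDNA": 33.0, "RNA": 40.0}


def concentration_ug_per_ml(a260, a320=0.0, dilution_factor=1.0, kind="dsDNA"):
    """Estimate nucleic acid concentration from turbidity-corrected A260."""
    corrected = a260 - a320  # subtract the turbidity reading
    if corrected < 0:
        raise ValueError("A320 exceeds A260; check the blank and the readings")
    return corrected * dilution_factor * CONVERSION_UG_PER_ML[kind]


# Illustrative readings: a 1:10 dilution of a dsDNA extract.
print(round(concentration_ug_per_ml(0.52, a320=0.02, dilution_factor=10), 1))
```

Because µg/mL and ng/µL are numerically identical, the result can be read in either unit.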
For tips and best practices on DNA extraction and measuring the concentration and quality, visit Nucleic Acid Extraction or read the article on Preparing Environmental Samples for Sequencing.
Pro Tips:
- Depending on the sample type, microbial community, and sample volume, successive cell lysis may be necessary, and extracts from each sample could be combined to obtain representative results.
Library Generation and Quality Assessment
A sequencing library is a pool of similarly sized DNA fragments with attached adapters designed to interact with a particular sequencing platform. Metagenomic library preparation for most sequencing technologies requires nanograms or micrograms of DNA.
There are three general steps to building a DNA library:
- DNA shearing: The DNA is fragmented, physically or enzymatically, into pieces of uniform size within the desired range to make it more amenable to sequencing.
- Adapter ligation: Ligase joins the specified adapter to the 5’ or 3’ end of the DNA fragment, enabling attachment to a flow cell or sequencing platform. The adapter can harbor barcodes to aid in sample identification. The resulting fragments are isolated based on size by using gel electrophoresis or bead-based mechanisms. Amplicon libraries generally do not require size selection as long as the fragment size was considered in the PCR parameters.
- Library quantification and quality assessment: This is often done using a BioAnalyzer System, which provides the concentration and fragment size, or through qPCR, which most accurately measures library concentration, but lacks library size information.
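Once both the concentration and the average fragment size are known, they are often combined into a molar concentration for loading. The sketch below assumes the widely used approximation of 660 g/mol per base pair of double-stranded DNA. The function name and the example values are hypothetical and do not come from any particular instrument or kit.

```python
AVERAGE_BP_MASS_G_PER_MOL = 660.0  # approximate mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/µL) to nanomolar (nM)."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    # ng/µL divided by g/mol gives 1e-9 mol per 1e-6 L, i.e. mmol/L;
    # multiplying by 1e6 converts mmol/L to nmol/L.
    return conc_ng_per_ul * 1e6 / (AVERAGE_BP_MASS_G_PER_MOL * mean_fragment_bp)


# Illustrative values: 2 ng/µL library with a 400 bp mean fragment size.
print(round(library_molarity_nm(2.0, 400), 2))
```

The fragment size should include the adapters, since they are part of the molecule being sequenced.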
Common challenges in library preparation:
- Obtaining uniform DNA fragments and the ideal read length
- Choosing the ideal polymerase
- Amplifying "difficult" or contaminated samples
- Removing unwanted (host) sequences
Pro Tips:
- Perform replicate PCR reactions to "average out" random PCR errors and to estimate intra-sample variance.
- Employ a proofreading polymerase and optimize PCR conditions to reduce erroneous sequences, mitigate inflated alpha diversity estimates, and minimize incorrectly classifying reads.
- Engineered high‐fidelity polymerases (such as Isis DNA polymerase) are usually the best choice when working with contaminated templates.
- Consider adding independent molecular identifiers (MIDs) at both ends of an amplicon to avoid sequence mis‐assignment.
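To illustrate why identifiers at both ends help, here is a minimal Python sketch of demultiplexing with tag pairs. It makes several simplifying assumptions: tags are fixed-length, matched exactly, and both read on the same strand with no reverse-complement handling. The sample names, tag sequences, and the function name are all hypothetical.

```python
def assign_sample(read, sample_sheet, tag_len=8):
    """Return the sample name if BOTH end tags form a registered pair."""
    if len(read) < 2 * tag_len:
        return None  # too short to carry two tags
    pair = (read[:tag_len], read[-tag_len:])
    # A read whose two tags belong to different samples is not in the
    # sheet, so it is rejected instead of being mis-assigned.
    return sample_sheet.get(pair)


# Hypothetical sample sheet: (forward tag, reverse tag) -> sample name.
SAMPLE_SHEET = {
    ("ACGTACGT", "TTGACCAA"): "soil_A",
    ("GGATCCAA", "CATGCATG"): "soil_B",
}

insert = "TTTTCCCCGGGGAAAA"
good = "ACGTACGT" + insert + "TTGACCAA"
mixed = "ACGTACGT" + insert + "CATGCATG"  # tags from two different samples

print(assign_sample(good, SAMPLE_SHEET))   # soil_A
print(assign_sample(mixed, SAMPLE_SHEET))  # None
```

With a single tag, the second read would have been assigned to soil_A. Requiring a valid pair lets such reads be discarded.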
Avoiding Contamination
Contamination can occur at any step in the process and can result in sequencing data artifacts. There are two types of contamination: sample cross-contamination and exogenous contamination. Problems become more pronounced in samples with low cell numbers, because the resulting low nucleic acid concentration lets contaminating sequences make up a larger share of the data.
Common sources of contamination include:
- Exogenous microorganisms introduced from sampling equipment and vials
- Contaminated reagents and solutions used during nucleic acid isolation and library preparation
- Cross-contamination due to tag hopping or barcodes mutating during sequencing
- Leftover sequences from previous runs
Pro Tips:
- Use sterile disposable sampling tools or sterilized non-disposable equipment.
- Wear gloves throughout the process and use sterile tubes and disposable liquid-handling items.
- Use negative controls throughout the process to help identify the source of contamination. All negative controls should be sequenced, regardless of the resulting DNA concentration, to reveal even minor impurities that yield an undetectable amount of amplicon.
- Perform a no-template control and sequence the resulting library even if undetectable amounts of product were generated.
- Use a reliable barcoding strategy to minimize the risk of tag hopping during sequencing.
- Periodically wash the fluidic subsystem of the sequencing machine to prevent previous run carryover.
- Thoroughly clean all laboratory surfaces with a high-quality cleaning solution, such as MP Bio’s 7X Cleaning Solution.
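One simple way to use sequenced negative controls is to compare each taxon's relative abundance in the control with its relative abundance in the sample. The Python sketch below is a deliberately naive heuristic under that assumption, not an established decontamination method. The function name, the threshold, the taxon labels, and the read counts are all hypothetical.

```python
def flag_contaminants(sample_counts, control_counts, ratio_threshold=1.0):
    """Flag taxa that are relatively enriched in the negative control.

    A taxon is flagged when its fraction of control reads is at least
    ratio_threshold times its fraction of sample reads.
    """
    sample_total = sum(sample_counts.values())
    control_total = sum(control_counts.values())
    if sample_total == 0 or control_total == 0:
        return set()  # nothing to compare
    flagged = set()
    for taxon, reads in control_counts.items():
        if reads == 0:
            continue
        control_frac = reads / control_total
        sample_frac = sample_counts.get(taxon, 0) / sample_total
        if control_frac >= ratio_threshold * sample_frac:
            flagged.add(taxon)
    return flagged


# Made-up read counts per taxon, for illustration only.
sample = {"taxon_A": 900, "taxon_B": 90, "taxon_C": 10}
negative_control = {"taxon_C": 40, "taxon_A": 10}

print(flag_contaminants(sample, negative_control))  # {'taxon_C'}
```

Here taxon_A appears in the control but is far more abundant in the sample, so it is kept. taxon_C dominates the control and is flagged for closer inspection.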