

Small subunit ribosomal RNA (SSU rRNA) amplicon sequencing comprehensively profiles microbiomes, but results will only be accurate if PCR primers perfectly match environmental sequences. To evaluate whether primers commonly used in microbial oceanography match naturally occurring organisms, we compared primers with >300 million rRNA sequences retrieved from globally distributed metagenomes. The best-performing 16S primers were 515Y/926R and 515Y/806RB, which perfectly matched most sequences (~0.95). Considering Cyanobacteria/Chloroplast 16S, 515Y/926R had the highest coverage (0.99), making it ideal for quantifying phytoplankton. Using Atlantic and Pacific BioGEOTRACES field samples, we demonstrate high correspondence between 515Y/926R amplicons (generated as part of this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating that amplicons can produce community composition data as accurate as those from shotgun metagenomics. Since our pipeline identifies missed taxa, we suggest modifications to improve coverage of biogeochemically important oceanic microorganisms, a strategy applicable to any environment with metagenomic data.

Amplicon sequencing is a powerful tool for understanding microbial community composition and dynamics in the oceans and other ecosystems (1), but the PCR amplification step is potentially biased due to both technical issues during amplification and mismatches to organisms found in natural ecosystems (2–5). Despite these potential issues, PCR amplicon sequencing retains several advantages that make it worthwhile to investigate and correct its biases. First, it is a high-throughput and low-cost technique, making it suitable for large numbers of samples, e.g. for global surveys of sediment, water, animal-associated, and other microbial communities (1). Second, the targeted nature of the assay means relatively small numbers of sequences are sufficient for detecting rare organisms, even when we have no genomes from any of their relatives, due to the conserved nature of the molecule and comprehensive SSU rRNA sequence databases. Third, modern “denoising” methods produce exact amplicon sequence variants (ASVs) that can be used as stable and intercomparable biogeographic markers (6). Finally, amplicon datasets are simpler to analyze since they consist of a single gene. Because of this relative simplicity, the complete dataset can be classified with databases such as SILVA (7). In contrast, the complexity of metagenomic samples and the current lack of reference genomes for many organisms make it harder to classify all sequences.

Recent studies have shown that PCR amplicon sequencing of mock microbial communities can recover known abundances extremely well (3, 8). Conversely, these studies have also underscored the key weakness of PCR-based methods, i.e. that a single mismatch between the primer and template sequences can have a dramatic effect on the measured community composition (e.g. ~10-fold bias against the taxon in question, from a single internal mismatch).
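
To make the idea of a primer "perfectly matching" a template concrete, the following is a minimal Python sketch of how primer coverage could be computed against a set of sequences. It is an illustration under stated assumptions, not the pipeline used in this study: the function names, the toy primer, and the toy templates are all hypothetical, and the primer is simply slid along each template rather than aligned to a known primer-binding region. Degenerate bases are interpreted with the standard IUPAC nucleotide codes. For a reverse primer, the primer would first need to be reverse-complemented, which is omitted here.

```python
# Standard IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the template base is not permitted by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def best_site_mismatches(primer, template):
    """Slide the primer along the template and return the fewest mismatches found.

    Returns None when the template is shorter than the primer.
    """
    n = len(primer)
    if len(template) < n:
        return None
    return min(
        count_mismatches(primer, template[i:i + n])
        for i in range(len(template) - n + 1)
    )


def coverage(primer, templates, max_mismatches=0):
    """Fraction of scorable templates whose best site has <= max_mismatches."""
    scores = [best_site_mismatches(primer, t) for t in templates]
    scored = [s for s in scores if s is not None]
    if not scored:
        return 0.0
    return sum(s <= max_mismatches for s in scored) / len(scored)


# Hypothetical toy data for illustration only (not a real primer or real sequences).
TOY_PRIMER = "GTGYCAGCMG"
TOY_TEMPLATES = [
    "AAGTGCCAGCAGTT",  # Y=C, M=A: perfect match
    "AAGTGTCAGCCGTT",  # Y=T, M=C: perfect match
    "AAGTGACAGCAGTT",  # A at the Y position: one mismatch
]

if __name__ == "__main__":
    print(coverage(TOY_PRIMER, TOY_TEMPLATES))                    # perfect-match coverage
    print(coverage(TOY_PRIMER, TOY_TEMPLATES, max_mismatches=1))  # allowing one mismatch
```

In this toy set, two of the three templates match the primer perfectly, so perfect-match coverage is 2/3, and allowing one mismatch raises it to 1.0. Coverage values such as ~0.95 or 0.99 are fractions of this kind, computed over far larger sequence sets.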
