
Biological indicator taxa, particularly benthic invertebrate groups such as arthropods, have long been used for integrative assessments of water quality. Standardized protocols have been developed to calculate 'biological index' scores from the abundances of these taxa, but such systems are challenging to implement at large scales because of the sampling effort and taxonomic expertise required and the need for repeated sampling to reliably discriminate among sites. Many of the taxa detected by traditional surveys can also be detected by genetic analysis of environmental DNA (eDNA), potentially allowing an alternative formulation of biological indexes that might be faster and more economical to produce.

The data in this release were produced to evaluate eDNA-derived biological indexes at sites within the Potomac River watershed of the eastern United States, specifically within units of the National Park Service for which previous biological assessment data were available.

This data release consists of five files:

1. sample.metadata.txt, which contains sampling metadata and identifiers linking to sample-derived sequence data that have been deposited in the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). The Sequence Read Archive is an authoritative and comprehensive database for sharing high-throughput sequence data produced with public funds. All accessions listed in the file can be searched at www.ncbi.nlm.nih.gov to retrieve sample and sequence information.
2. cox1.references.fasta, which contains reference sequences of the cytochrome c oxidase 1 gene (typically abbreviated cox1 or COI) of arthropods identified from regional checklists. The file is a text file in FASTA format.
3. mt16S.references.fasta, which contains reference sequences of the mitochondrial 16S ribosomal RNA (mt16S) gene of arthropods identified from regional checklists. The file is a text file in FASTA format.
4. first.stage.counts.txt, which is a tab-delimited table of counts of sequences attributed to each taxon in each sample for the first stage of the study. The table also indicates whether each taxon attribution is from the mt16S or cox1 locus.
5. second.stage.counts.txt, which is a tab-delimited table of counts of sequences attributed to each taxon in each sample for the second stage of the study. The table also indicates whether each taxon attribution is from the mt16S or cox1 locus.
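
As a starting point for working with the FASTA reference files and the tab-delimited count tables described above, the following Python sketch reads FASTA records and sums counts per taxon using only the standard library. It is an illustration, not part of the data release: the function names (read_fasta, total_counts), the column names (sample, taxon, locus, count), and the demo records are assumptions made for this example, and the real column headers should be checked in the files themselves.

```python
import csv
import io
from collections import defaultdict


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def total_counts(handle, key_column, count_column):
    """Sum count_column per key_column value in a tab-delimited table with a header row."""
    totals = defaultdict(int)
    for row in csv.DictReader(handle, delimiter="\t"):
        totals[row[key_column]] += int(row[count_column])
    return dict(totals)


# Tiny in-memory stand-ins for the real files. All contents and column
# names here are invented for illustration (assumptions, not release data).
demo_fasta = io.StringIO(
    ">ref1 example taxon\nACGT\nACGT\n>ref2 another taxon\nTTGA\n"
)
demo_counts = io.StringIO(
    "sample\ttaxon\tlocus\tcount\n"
    "S1\tA\tcox1\t10\n"
    "S1\tB\tmt16S\t5\n"
    "S2\tA\tcox1\t7\n"
)

references = dict(read_fasta(demo_fasta))
per_taxon = total_counts(demo_counts, key_column="taxon", count_column="count")
```

With the real files, the StringIO objects would be replaced by handles such as open("cox1.references.fasta") or open("first.stage.counts.txt", newline=""), and key_column and count_column would be set to whatever headers the tables actually use.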