Metadata for Carlisle et al. A Web-Based Tool for Assessing the Condition of Benthic Diatom Assemblages in Streams and Rivers of the Conterminous United States
Public Data Portal
R code and data files used to identify diatom metrics that are robust to taxonomic inconsistency. The R code runs a goodness-of-fit analysis to determine how much variation is explained by analyst in a diatom dataset that has been harmonized for taxonomic consistency, compared with the original raw dataset. This dataset is associated with the following publication: Carlisle, D., S. Spaulding, M. Tyree, N. Schulte, S. Lee, R. Mitchell, and A. Pollard. A web-based tool for assessing the condition of benthic diatom assemblages in streams and rivers of the conterminous United States. ECOLOGICAL INDICATORS. Elsevier Science Ltd, New York, NY, USA, 135: 1-13, (2022).
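As a rough illustration of "variation explained by analyst", the sketch below computes the share of total variance in one metric that lies between analysts (eta-squared from a one-way layout). This is an assumption for illustration only: the entry does not say which goodness-of-fit statistic the released R code uses, and the function name, analyst labels, and metric values here are all hypothetical.

```python
from collections import defaultdict


def variance_explained_by_group(values, groups):
    """Return SS_between / SS_total (eta-squared) for a one-way grouping."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variation at all, so nothing to explain
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    # Between-group sum of squares: group size times squared offset of the group mean.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in by_group.values()
    )
    return ss_between / ss_total


# Hypothetical metric values for six samples scored by two analysts.
analysts = ["A", "A", "A", "B", "B", "B"]
raw_metric = [10.0, 12.0, 11.0, 20.0, 22.0, 21.0]
harmonized_metric = [14.0, 16.0, 15.0, 15.0, 17.0, 16.0]

raw_r2 = variance_explained_by_group(raw_metric, analysts)
harmonized_r2 = variance_explained_by_group(harmonized_metric, analysts)
print(f"raw: {raw_r2:.3f}, harmonized: {harmonized_r2:.3f}")
```

Under this toy setup, a metric that is robust to taxonomic inconsistency would show a lower analyst share in the harmonized data than in the raw data.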
Datasets to develop and validate the genus-level, trait-based multimetric diatom indices for assessing the ecological condition of river and stream across the conterminous United States
Public Data Portal
Data are from the National Aquatic Resource Surveys. This dataset is associated with the following publication: Riato, L., R. Hill, A. Herlihy, D. Peck, P. Kaufmann, J. Stoddard, and S. Paulsen. Genus-level, trait-based multimetric diatom indices for assessing the ecological condition of river and stream across the conterminous United States. ECOLOGICAL INDICATORS. Elsevier Science Ltd, New York, NY, USA, 141: 109131, (2022).
Supplementary material for Lee et al. in review: Harmonization and Revision of a National Diatom Dataset for Use in the Development of Water Quality Indicators
Public Data Portal
ABSTRACT Diatom data have been collected in large-scale biological assessments in the United States, such as the U.S. Environmental Protection Agency’s National Rivers and Streams Assessment (NRSA). However, the effectiveness of diatoms as indicators may suffer if inconsistent taxon identifications across different analysts obscure the relationships between assemblage composition and environmental variables. To reduce these inconsistencies, we harmonized the 2008-2009 NRSA data from nine analysts by updating names to current synonyms and by statistically identifying taxa with high analyst signal (taxa with more variation in relative abundance explained by the analyst factor, relative to environmental variables). We then screened a subset of samples with QA/QC data and combined taxa with mismatching identifications by the primary and secondary analysts. When these combined “slash groups” did not reduce analyst signal, we elevated taxa to the genus level or omitted taxa in difficult species complexes. We examined the variability explained by analyst in the original and revised datasets. Further, we examined how revising the datasets to reduce analyst signal can reduce inconsistency, thereby uncovering the variation in assemblage composition explained by total phosphorus (TP), an environmental variable of high priority for water managers. To produce a revised dataset with the greatest taxonomic consistency, we ultimately made 124 slash groups, omitted 7 taxa in the small naviculoid (e.g., Sellaphora atomoides) species complex, and elevated Nitzschia, Diploneis, and Tryblionella taxa to the genus level. Relative to the original dataset, the revised dataset had more overlap among samples grouped by analyst in ordination space, less variation explained by the analyst factor, and more than double the variation in assemblage composition explained by TP. 
Elevating all taxa to the genus level did not eliminate analyst signal completely, and analyst remained the most important predictor for the genera Sellaphora, Mayamaea, and Psammodictyon, indicating that these taxa present the greatest obstacle to consistent identification in this dataset. Although our process did not completely remove the analyst signal, this work clarifies the extent of the problem and provides a method to minimize analyst signal. Resolution of these taxonomic issues makes large datasets such as the NRSA more suitable for the development of diatom-based water quality indicators. This dataset is associated with the following publication: Lee, S., I. Bishop, S. Spaulding, R. Mitchell, and L. Yuan. Taxonomic harmonization may reveal a stronger association between diatom assemblages and total phosphorus in large datasets. ECOLOGICAL INDICATORS. Elsevier Science Ltd, New York, NY, USA, 102: 166-174, (2019). NOTE: This dataset has been removed from public access due to revocation. Please refer inquiries regarding this dataset to the listed contact person.
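The three revision steps described in this abstract (merging taxa into slash groups, elevating selected genera to genus level, and omitting taxa in a difficult complex) can be sketched as a name-mapping pass over one sample's counts. This is a minimal sketch, not the authors' procedure: "Taxon A" and "Taxon B" are placeholder names, the species used as inputs are illustrative, and the lookup tables are hypothetical stand-ins for the actual harmonization scheme.

```python
from collections import Counter

# Hypothetical slash group: two placeholder taxa that analysts confused with each other.
SLASH_GROUPS = {
    "Taxon A": "Taxon A/Taxon B",
    "Taxon B": "Taxon A/Taxon B",
}
# Genera the abstract reports as elevated to genus level.
GENUS_LEVEL = {"Nitzschia", "Diploneis", "Tryblionella"}
# One example of an omitted taxon named in the abstract.
OMITTED = {"Sellaphora atomoides"}


def harmonize_name(name):
    """Map an original taxon name to its revised name, or None if omitted."""
    if name in OMITTED:
        return None
    if name in SLASH_GROUPS:
        return SLASH_GROUPS[name]
    genus = name.split()[0]
    if genus in GENUS_LEVEL:
        return genus
    return name


def harmonize_counts(counts):
    """Sum counts of taxa that share a revised name; drop omitted taxa."""
    merged = Counter()
    for name, n in counts.items():
        new_name = harmonize_name(name)
        if new_name is not None:
            merged[new_name] += n
    return dict(merged)


# Hypothetical counts for a single sample.
sample = {
    "Taxon A": 5,
    "Taxon B": 3,
    "Nitzschia palea": 10,
    "Nitzschia amphibia": 2,
    "Sellaphora atomoides": 4,
    "Cocconeis placentula": 7,
}
harmonized = harmonize_counts(sample)
print(harmonized)
```

Relative abundances would need to be recomputed after this pass, because omitting taxa changes the sample total.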
Data Release for: A Web-Based Tool for Assessing the Condition of Benthic Diatom Assemblages in Streams and Rivers of the Conterminous United States
Public Data Portal
Benthic diatom assemblages are known to be indicative of water quality but have yet to be widely adopted in biological assessments in the United States due to several limitations. Our goal was to address some of these limitations by developing regional multi-metric indices (MMIs) that are robust to inter-laboratory taxonomic inconsistency, adjusted for natural covariates, and sensitive to a wide range of anthropogenic stressors. We aggregated bioassessment data from two national-scale federal programs and used a data-driven analysis in which all possible combinations of 2-7 metrics were compared on three measures of performance. The datasets in this release support the Carlisle et al. (2022) report cited herein. The article provides full details of data aggregation, model development, and application.
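The exhaustive search over metric combinations mentioned above can be sketched with the standard library. The entry does not name the candidate metrics or the three performance measures, so the metric names, the per-metric scores, and the single scoring function below are hypothetical placeholders; only the enumeration of all 2-7 metric subsets reflects the description.

```python
from itertools import combinations


def metric_combinations(metrics, min_size=2, max_size=7):
    """Yield every combination of min_size to max_size metrics."""
    for size in range(min_size, min(max_size, len(metrics)) + 1):
        yield from combinations(metrics, size)


def best_combination(metrics, score_fn, min_size=2, max_size=7):
    """Return the combination with the highest score under score_fn."""
    return max(metric_combinations(metrics, min_size, max_size), key=score_fn)


# Hypothetical candidate metrics and toy per-metric scores.
candidate_metrics = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
toy_scores = {
    "m1": 0.2, "m2": 0.9, "m3": 0.4, "m4": 0.8,
    "m5": 0.1, "m6": 0.3, "m7": 0.5, "m8": 0.6,
}


def toy_score(combo):
    """Placeholder performance measure: mean of the toy per-metric scores."""
    return sum(toy_scores[m] for m in combo) / len(combo)


all_combos = list(metric_combinations(candidate_metrics))
best = best_combination(candidate_metrics, toy_score)
print(len(all_combos), best)
```

The number of candidate indices grows quickly with the size of the metric pool, so a real analysis with several performance measures would likely precompute metric values once and reuse them across combinations.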
Datasets used to evaluate the effects of antecedent streamflow and sample timing on trend assessments of fish, invertebrate and diatom communities across the United States, 2002-12 (output)
Public Data Portal
Detecting trends in biological attributes is central to many stream monitoring programs; however, how natural variability in environmental factors affects trend results is not well understood. We evaluated the influence of antecedent streamflow and sample timing (covariates) on trend estimates for fish, invertebrate, and diatom taxa richness and biological condition from 2002 to 2012 at 51 sites distributed across the conterminous United States. This data release contains all of the input and output files necessary to reproduce the results presented and discussed in the associated journal article.
Diatom and Environmental Data
Public Data Portal
Raw data associated with this research. This dataset is associated with the following publication: Yuan, L., R. Mitchell, E. Pilgrim, and N. Smucker. Inferences based on diatom compositions improve estimates of nutrient concentrations in streams. SCIENCE OF THE TOTAL ENVIRONMENT. Elsevier BV, AMSTERDAM, NETHERLANDS, 952: 176032, (2024).
Harmonization of sediment diatoms from hundreds of lakes in the northeastern United States
Public Data Portal
Sediment diatoms are widely used to track environmental histories of lakes and their watersheds, but merging datasets generated by different researchers for further large-scale studies is challenging because of the taxonomic discrepancies caused by rapidly evolving diatom nomenclature and taxonomic concepts. Here we collated five datasets of lake sediment diatoms from the northeastern USA using a harmonization process which included updating synonyms, tracking the identity of inconsistently identified taxa and grouping those that could not be resolved taxonomically. The Dataset consists of a Portable Document Format (.pdf) file of the Voucher Flora, six Microsoft Excel (.xlsx) data files, an R script, and five output Comma Separated Values (.csv) files. The Voucher Flora documents the morphological species concepts in the dataset using diatom images compiled into plates (NE_Lakes_Voucher_Flora_102421.pdf) and the translation scheme of the OTU codes to diatom scientific or provisional names with identification sources, references, and notes (VoucherFloraTranslation_102421.xlsx). The file Slide_accession_numbers_102421.xlsx has slide accession numbers in the ANS Diatom Herbarium. The “DiatomHarmonization_032222_files for R.zip” archive contains four Excel input data files, the R code, and a subfolder “OUTPUT” with five .csv files. The file Counts_original_long_102421.xlsx contains original diatom count data in long format. The file Harmonization_102421.xlsx is the taxonomic harmonization scheme with notes and references. The file SiteInfo_031922.xlsx contains sampling site- and sample-level information. WaterQualityData_021822.xlsx is a supplementary file with water quality data. R code (DiatomHarmonization_032222.R) was used to apply the harmonization scheme to the original diatom counts to produce the output files. 
The resulting output consists of five .csv files: four wide-format files containing diatom count data at different harmonization steps (Counts_1327_wide.csv, Step1_1327_wide.csv, Step2_1327_wide.csv, Step3_1327_wide.csv) and a summary of the Indicator Species Analysis (INDVAL_RESULT.csv). The harmonization scheme (Harmonization_102421.xlsx) can be further modified based on additional taxonomic investigations, while the associated R code (DiatomHarmonization_032222.R) provides a straightforward mechanism for diatom data versioning. This dataset is associated with the following publication: Potapova, M., S. Lee, S. Spaulding, and N. Schulte. A harmonized dataset of sediment diatoms from hundreds of lakes in the northeastern United States. Scientific Data. Springer Nature, New York, NY, 9(540): 1-8, (2022).
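The workflow described here (apply a harmonization scheme to long-format counts, then write wide-format tables) can be sketched as follows. This is not the released DiatomHarmonization_032222.R script, which is written in R; the record layout, OTU codes, and group name below are hypothetical, and the real column names in the Excel files may differ.

```python
from collections import defaultdict


def long_to_wide(records, name_map):
    """Apply a name mapping to (sample_id, taxon, count) records and pivot to wide form.

    Returns a header row and one row per sample, with zero for absent taxa.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, taxon, count in records:
        # Taxa missing from the mapping keep their original name.
        totals[sample_id][name_map.get(taxon, taxon)] += count
    taxa = sorted({taxon for row in totals.values() for taxon in row})
    header = ["sample_id"] + taxa
    rows = [[s] + [totals[s].get(t, 0) for t in taxa] for s in sorted(totals)]
    return header, rows


# Hypothetical long-format counts and a one-step harmonization mapping.
records = [
    ("S1", "OTU_001", 4),
    ("S1", "OTU_002", 6),
    ("S2", "OTU_002", 3),
    ("S2", "OTU_003", 5),
]
name_map = {"OTU_001": "Group X", "OTU_002": "Group X"}

header, rows = long_to_wide(records, name_map)
print(header)
for row in rows:
    print(row)
```

Keeping the mapping separate from the counts mirrors the versioning idea in the description: editing the scheme and rerunning the pivot regenerates the outputs without touching the original data.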
Multiscale Framework King County and Puget Lowland data
Public Data Portal
This dataset contains the benthic index of biotic integrity for King County, Washington, as well as the index of watershed integrity and the index of catchment integrity. This dataset is associated with the following publication: Riato, L., S. Leibowitz, M. Weber, and R. Hill. A multiscale landscape approach for prioritizing river and stream protection and restoration actions. Ecosphere. ESA Journals, 14(1): e4350, (2023).