Dataset details
United States
Groundwater data, predictor variables, and rasters used for predicting the probability of high arsenic and high manganese in the Glacial Aquifer System, northern continental United States
This data release contains input data used in model development and TIF raster files used to predict the probability of high arsenic (As) and high manganese (Mn) in groundwater within the glacial aquifer system in the northern United States. Input data include measured As and Mn concentrations at groundwater wells, and associated predictor variable data. The probability of high As and high Mn was predicted with boosted regression tree methods, implemented with the gbm package in R version 4.0.0. The response variables for individual models were the occurrence of: (1) As >10 µg/L, and (2) Mn >300 µg/L. Water-quality data were compiled from three sources, as described in Wilson and others (2019): a compilation of data from numerous agencies and organizations at the state, regional, and local level; the U.S. Geological Survey National Water Information System; and the U.S. Environmental Protection Agency Safe Drinking Water Information System. The resultant dataset consisted of 10,001 As and 14,565 Mn measurements across the study area. A total of 108 predictor variables were originally considered for model development, including well characteristics, soil properties, aquifer properties, predicted nitrate, hydrologic position on the landscape, groundwater age, predicted pH, and predicted anoxic conditions. After model refinement, 79 and 55 predictor variables were used for predicting the probability of high As and high Mn, respectively. The probability of high As and high Mn was predicted at two depths, representative of public and domestic drinking water supply depths, at a resolution of 1 km across the glacial aquifer.
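As a rough illustration of the two response variables described above, the following minimal sketch converts measured concentrations into binary "high As" and "high Mn" indicators using the stated thresholds. The record layout and the field names (`well_id`, `as_ug_l`, `mn_ug_l`) are assumptions made for this example and are not taken from the data release; the sample values are invented.

```python
# Thresholds stated in the dataset description, in micrograms per liter.
AS_THRESHOLD_UG_L = 10.0
MN_THRESHOLD_UG_L = 300.0


def binary_response(concentration_ug_l, threshold_ug_l):
    """Return 1 if the concentration is strictly above the threshold, 0 if not,
    and None when no measurement is available."""
    if concentration_ug_l is None:
        return None
    return int(concentration_ug_l > threshold_ug_l)


# Hypothetical well records; field names and values are illustrative only.
wells = [
    {"well_id": "W1", "as_ug_l": 2.5, "mn_ug_l": 450.0},
    {"well_id": "W2", "as_ug_l": 10.0, "mn_ug_l": None},
    {"well_id": "W3", "as_ug_l": 37.0, "mn_ug_l": 120.0},
]

responses = [
    {
        "well_id": w["well_id"],
        "high_as": binary_response(w["as_ug_l"], AS_THRESHOLD_UG_L),
        "high_mn": binary_response(w["mn_ug_l"], MN_THRESHOLD_UG_L),
    }
    for w in wells
]

for row in responses:
    print(row)
```

Because the thresholds are written as strict inequalities (>10 µg/L and >300 µg/L), a value exactly at the threshold is treated here as not high.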
Data information
Related data
Predicted nitrate and arsenic concentrations in basin-fill aquifers of the Southwest Principal Aquifers study area
Public Data Portal
This product "Predicted nitrate and arsenic concentrations in basin-fill aquifers of the Southwest Principal Aquifers study area" is a 1:250,000-scale vector dataset and was developed as part of a regional Southwest Principal Aquifers (SWPA) study. The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed by using the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions. Separate classifiers were developed for nitrate and arsenic because each constituent was expected to be affected by a different set of factors, and each factor could have a different magnitude or directional influence (increase/decrease) on concentration. For each constituent, two different classifiers were developed: a prediction classifier and a confirmatory classifier. The prediction classifiers were developed specifically to predict nitrate and arsenic concentrations in basin-fill aquifers across the SWPA study area and were based on explanatory variables representing source and susceptibility conditions. These explanatory variables were available throughout the entire SWPA study area and, therefore, did not pose a limitation for using the classifiers to predict concentrations. The confirmatory classifiers were developed to supplement the prediction classifiers in the evaluation of the conceptual model. The name "confirmatory" reflects the classifier's purpose of evaluating a-priori hypotheses and contrasts with other general types of statistical models, such as those used for prediction or exploratory purposes. The confirmatory classifiers included the explanatory variables used in the prediction classifiers, as well as additional variables representing geochemical conditions and basin groundwater budget components.
The inclusion of the geochemical and basin groundwater budget variables in the confirmatory classifiers allowed for further evaluation of the conceptual models, which was not possible with the prediction classifiers alone. The geochemical data, however, were only available at specific well locations, and consistent water-budget data were not available for every basin in the study area. The limited availability of the data for these variables constrained the confirmatory classifiers to observations from 16 case-study basins and precluded use of the confirmatory classifier for predicting concentrations across the SWPA study area. To contrast the scope of the two classifiers, the confirmatory classifiers were developed by using all available explanatory variables but with observations restricted to the 16 case-study basins, whereas the prediction classifiers were unrestricted with respect to spatial extent because these were developed by using a subset of the explanatory variables that were available throughout the study area.
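The difference in scope between the two classifiers can be sketched as a difference in how the training table is cut: the prediction classifier keeps every observation but only the study-wide variables, while the confirmatory classifier keeps every variable but only observations from the case-study basins. The sketch below is illustrative only; all variable names, basin names, and values are hypothetical, and two basins stand in for the 16 case-study basins.

```python
# Hypothetical variable groups; the real variable lists are not reproduced here.
STUDY_WIDE_VARIABLES = {"recharge", "land_use"}
CASE_STUDY_ONLY_VARIABLES = {"dissolved_oxygen", "basin_inflow"}
CASE_STUDY_BASINS = {"basin_a", "basin_b"}  # stand-in for the 16 case-study basins
ID_FIELDS = {"basin", "concentration_class"}


def select(observations, variables, basins=None):
    """Keep only the requested variables and, optionally, only the listed basins."""
    keep = ID_FIELDS | variables
    return [
        {key: value for key, value in obs.items() if key in keep}
        for obs in observations
        if basins is None or obs["basin"] in basins
    ]


# Invented observations; the third lies outside the case-study basins and
# therefore lacks geochemical and water-budget values.
observations = [
    {"basin": "basin_a", "concentration_class": "high", "recharge": 12.0,
     "land_use": "agricultural", "dissolved_oxygen": 0.4, "basin_inflow": 3.1},
    {"basin": "basin_b", "concentration_class": "low", "recharge": 30.0,
     "land_use": "urban", "dissolved_oxygen": 6.2, "basin_inflow": 8.7},
    {"basin": "basin_c", "concentration_class": "low", "recharge": 5.0,
     "land_use": "rangeland", "dissolved_oxygen": None, "basin_inflow": None},
]

# Prediction classifier: all observations, study-wide variables only.
prediction_rows = select(observations, STUDY_WIDE_VARIABLES)

# Confirmatory classifier: all variables, case-study basins only.
confirmatory_rows = select(
    observations,
    STUDY_WIDE_VARIABLES | CASE_STUDY_ONLY_VARIABLES,
    basins=CASE_STUDY_BASINS,
)

print(len(prediction_rows), len(confirmatory_rows))
```

Either table could then be passed to a random forest implementation; that step is omitted here because the original model settings are not given in this description.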
Arsenic, manganese, and pH groundwater quality data, selected well construction characteristics, and aquifer assignments for wells in the conterminous U.S.
Public Data Portal
This data release contains groundwater-quality data for three parameters of interest (arsenic, manganese, and pH) and well information for sample sites for aquifers in the conterminous U.S. Water-quality data and well information were derived from a dataset compiled from three sources: the U.S. Geological Survey (USGS) National Water Information System (NWIS), the U.S. Environmental Protection Agency (USEPA) Safe Drinking Water Information System (SDWIS), and numerous agencies and organizations at the state, regional, and local level. The data compilation of the National Water Quality Program’s groundwater assessment team is an internal dataset informally referred to as the National Groundwater Aggregation (NGA). The current study of groundwater quality in the conterminous U.S. augments data compiled by others globally. Only geochemical parameters of interest (arsenic, manganese, pH) from wells in the National Groundwater Aggregation are presented; data from springs were not used. A table of site information includes attributes for each well, such as the state, water use code, depth, open interval (if available), and aquifer (if available). The provider of the water-quality data and well information is also in this table.
Total and aqueous arsenic concentrations, physiochemical characteristics, and ancillary data of groundwater from newly constructed drinking water wells in central, northwest, and northeast Minnesota, 2014-2016, version 2.0, July 2018
Public Data Portal
This dataset provides aqueous nitrate+nitrite, aqueous manganese, aqueous iron, and total sulfate measurements in groundwater samples from 254 newly constructed private residential wells between 2014 and 2016. The study focuses on three geologically distinct regions of Minnesota: central, northwest, and northeast. These study regions were chosen due to their prevalent elevated As concentrations in drinking water. Each of the 254 wells was sampled in three rounds by the Minnesota Department of Health (MDH). The timing of the three sampling rounds was (1) immediately or shortly after well construction (round 1); (2) 3-6 months after initial sample collection (round 2); and (3) 12 months after initial sample collection (round 3). During each round, samples were collected for both total and aqueous As, aqueous nitrate+nitrite, aqueous manganese, aqueous iron, and total sulfate. Physiochemical characteristics, including specific conductance, pH, dissolved oxygen, oxidation reduction potential, and temperature, were also measured to gage the well water stability prior to sample collection. Round 1 sampling was timed to co-occur with and mimic well driller regulatory sampling. Drillers collected samples after well development from the drill rig groundwater pump or from the residential plumbing, and the MDH sampler replicated the sample location and timing used by the driller. Sampling from the drill rig’s groundwater pump occurred after the well was drilled and developed, when the water was visibly clear, with few visible sediment particles. Samples from plumbing were collected after the plumbing was flushed out and physiochemical characteristic readings stabilized. Round 2 and round 3 samples were collected by MDH staff only from plumbing. Samples collected from plumbing were taken from faucets, hydrants, or pressure tanks prior to filters or treatment systems.
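The three-round timing described above can be laid out as a simple schedule calculation. The sketch below is only an illustration: it assumes the round 1 date is the initial sample collection date and that "months" means whole calendar months, neither of which is specified in the description. The function names and the example date are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping the day to the month's end."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def sampling_schedule(round1_date):
    """Return the round 1 date, the round 2 window (3-6 months after the
    initial sample), and the round 3 target (12 months after the initial sample)."""
    return {
        "round_1": round1_date,
        "round_2_window": (add_months(round1_date, 3), add_months(round1_date, 6)),
        "round_3_target": add_months(round1_date, 12),
    }


# Hypothetical initial sampling date, chosen to show end-of-month clamping.
schedule = sampling_schedule(date(2014, 8, 31))
for name, value in schedule.items():
    print(name, value)
```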
Machine-learning model predictions and rasters of arsenic and manganese in groundwater in the Mississippi River Valley alluvial aquifer
Public Data Portal
Groundwater from the Mississippi River Valley alluvial aquifer (MRVA) is a vital resource for agriculture and drinking-water supplies in the central United States. Water availability can be limited in some areas of the aquifer by high concentrations of trace elements, including manganese and arsenic. Boosted regression trees (BRT), a type of ensemble-tree machine-learning method, were used to predict manganese concentration and the probability of arsenic concentration exceeding a 10 µg/L threshold throughout the MRVA. Explanatory variables for the BRT models included attributes associated with well location and construction, surficial variables (such as hydrologic position and recharge), variables extracted from a MODFLOW-2005 groundwater-flow model for the Mississippi embayment, and variables from an airborne electromagnetic survey of the aquifer. This data release provides the R scripts to tune and reproduce the BRT models and final prediction rasters. For a full description of the modeling workflow and final model selection, see the companion journal article.