Dataset details
United States
Data used to model and map manganese in the Northern Atlantic Coastal Plain aquifer system, eastern USA
Data used to model and map manganese concentrations in groundwater in the Northern Atlantic Coastal Plain (NACP) aquifer system, eastern USA, are documented in this data release. The model predicts manganese concentration within four classes and is based on concentration data from 4492 wells. The well data were compiled from U.S. Geological Survey, U.S. Environmental Protection Agency, Suffolk County Water Authority (Suffolk County, New York), and state agency sources. The four concentration classes are based on guidelines for drinking water quality: below detection (class 1, less than 10 micrograms per liter (ug/L)); detected but less than the aesthetic guideline of 50 ug/L (class 2); greater than the aesthetic guideline but less than the health guideline of 300 ug/L (class 3); and greater than the health guideline of 300 ug/L (class 4). The thresholds of 50 ug/L and 300 ug/L are a Secondary Maximum Contaminant Level and a lifetime health advisory, respectively, from the U.S. Environmental Protection Agency for public water supplies. The model is built with the XGBoost machine learning method. Explanatory variables (predictors) include well depth, soil characteristics, hydrologic variables, groundwater residence time, and predicted values of pH and of the probability of low dissolved oxygen from previous machine learning models of the aquifer system. The data are provided in data tables, raster files, and model files, organized as follows. One data table describes the 27 explanatory variables used in the model (NACP_Mn_explanatory_variables.csv). There is a data table for the well data used to develop the models, which includes the manganese concentrations, concentration classes, regional aquifer, explanatory variables, and predicted concentration class for the wells (NACP_Mn_well_data.csv).
There is a compressed group (zip file) of 10 files (one for each regional aquifer) for explanatory variable data used to make predictions for the regional aquifers (NACP_Mn_prediction_input_aquifers.zip). There are two zip files providing model output, one for predictions made for each aquifer in text format and one for tif-format rasters of predictions for each aquifer. The data release also contains a tif-format raster file of the prediction grid and a zip file with the model object file (R data format) and a script that can be used to run the model to produce the predictions provided in this data release. Filenames for prediction input and for model output are distinguished by codes abbreviating the aquifer name and position in the vertical stack of 19 regional aquifers and confining units, as follows: Surficial aquifer, 1surf; Upper Chesapeake aquifer, 3upch; Lower Chesapeake aquifer, 5loch; Piney Point aquifer, 7pipt; Aquia aquifer, 9aqia; Monmouth - Mt. Laurel Aquifer, 11moml; Matawan aquifer, 13mtwn; Magothy Aquifer, 15mgty; Potomac-Patapsco aquifer, 17popt; Potomac-Patuxent aquifer, 19popx. The nine confining units are not represented in the model or predictions.
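As an illustration of the four concentration classes described above, the following is a minimal sketch of how a measured value could be binned. The function name `mn_class` is hypothetical and is not taken from the data release. The description does not say which class a value exactly equal to a threshold belongs to, so the sketch assumes it goes to the higher class.

```python
import bisect

# Class boundaries in micrograms per liter (ug/L), from the description above:
# 10 (detection), 50 (aesthetic guideline), 300 (health guideline).
THRESHOLDS_UG_L = (10.0, 50.0, 300.0)


def mn_class(conc_ug_l: float) -> int:
    """Return the concentration class (1 to 4) for a manganese value in ug/L.

    Assumption: a value exactly equal to a threshold is assigned to the
    higher class; the data release description does not specify this.
    """
    if conc_ug_l < 0:
        raise ValueError("concentration must be non-negative")
    # bisect_right counts how many thresholds are <= the value.
    return bisect.bisect_right(THRESHOLDS_UG_L, conc_ug_l) + 1
```

For example, `mn_class(5.0)` gives class 1 and `mn_class(120.0)` gives class 3 under these assumptions.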
Data information
Related data
Data and Model Archive for Preliminary Machine Learning Models of Manganese and 1,4-Dioxane in Groundwater on Long Island, New York
Public Data Portal
Data and preliminary machine-learning models used to predict manganese and 1,4-dioxane in groundwater on Long Island are documented in this data release. Concentration data used to develop the models were from 910 wells for manganese and 553 wells for 1,4-dioxane, primarily public supply wells, from U.S. Geological Survey, U.S. Environmental Protection Agency (USEPA), and Suffolk County Water Authority sources. Thirty-two explanatory variables describe depth, groundwater flow, land use, soil properties, and other features of the aquifer system. The models use XGBoost, an ensemble tree machine learning method. Four models are documented for manganese, predicting the probability of concentrations relative to four thresholds: 10 micrograms per liter (detection), 50 micrograms per liter (the USEPA Secondary Maximum Contaminant Level), 150 micrograms per liter, and 300 micrograms per liter (the USEPA lifetime health advisory). One model is documented for 1,4-dioxane, predicting the probability of concentrations relative to 0.07 micrograms per liter (detection). The models were used to predict concentrations in two layers of the upper glacial aquifer and three layers of the Magothy aquifer. Predictions were made at a 500-square-foot resolution across the entire island for manganese and across Suffolk County, which occupies the eastern two-thirds of Long Island, for 1,4-dioxane. The data are provided in data tables, raster files, and model files. One data table describes the 32 explanatory variables (LI_mn_14dx_exp_vars.txt). One data table describes the well data and includes the manganese and 1,4-dioxane concentrations, explanatory variables, and predictions for the wells (LI_mn_14dx_well_data.txt). 
There is a compressed group (zip file) of five files providing the explanatory variable data used to make predictions for the five aquifer layers (LI_mn_14dx_predinput_griddata.zip) and a zip file of 25 files providing model predictions for each model and aquifer layer (LI_mn_14dx_predoutput_rasters.zip). The data release also contains a tif-format raster file of the prediction grid (LI_mn_14dx_prediction_grid.tif). The models are documented in a zip file (LI_mn_14dx_models.zip) that contains the model object files (R data format) and scripts that can be used to run the models to produce the predictions provided in this data release. Filenames for prediction input and for model output are distinguished by names and numbers as follows: 1_upper_glacial, top layer of the upper glacial aquifer; 3_upper_glacial, bottom layer of the upper glacial aquifer; 5_Magothy, top layer of the Magothy aquifer; 14_Magothy, middle layer of the Magothy aquifer; and 23_Magothy, bottom layer of the Magothy aquifer.
Data for Elevated Manganese Concentrations in United States Groundwater, Role of Land Surface-Soil-Aquifer Connections
Public Data Portal
Chemical data from 43,334 wells were used to examine the role of land surface-soil-aquifer connections in producing elevated manganese concentrations (>300 µg/L) in United States (U.S.) groundwater. Elevated manganese and dissolved organic carbon (DOC) concentrations were associated with shallow water tables and organic-carbon rich soils, suggesting soil-derived DOC supported manganese reduction. Manganese and DOC concentrations were higher near rivers than farther from rivers, suggesting river-derived DOC also supported manganese reduction. Anthropogenic nitrogen may also affect manganese concentrations in groundwater. In parts of the northeastern U.S. containing poorly buffered soils, ~40% of the samples with elevated manganese concentrations had pH values <6 and elevated concentrations of dissolved oxygen and nitrate relative to samples with pH ≥6, suggesting acidic recharge produced by the oxidation of ammonium in fertilizer helped mobilize manganese. An estimated 2.6 million people potentially consume groundwater with elevated manganese concentrations, the highest densities of which occur near rivers and in areas with organic-carbon rich soil. Results from this study indicate land surface-soil-aquifer connections play an important role in producing elevated manganese concentrations in groundwater used for human consumption.
Data used to model and map pH and redox conditions in the Northern Atlantic Coastal Plain aquifer system, eastern USA
Public Data Portal
Data used to model and map pH and redox conditions in groundwater in the Northern Atlantic Coastal Plain aquifer system, eastern USA, are documented in this data release. The models use as input data measurements of pH and dissolved oxygen concentrations at about 3000 to 5000 wells, which were compiled primarily from U.S. Geological Survey and U.S. Environmental Protection Agency databases. The boosted regression trees machine learning method was used to build the models. Explanatory variables (predictors) describe geology, hydrology, chemistry, physical characteristics, anthropogenic influence, metrics from a groundwater flow model, and groundwater residence times in the aquifer system. Data for four models are documented--one model for pH and one model each for the probability of dissolved oxygen less than three threshold values (0.5, 1, and 2 milligrams per liter). The data are provided in data tables and raster files, organized as follows. There is one data table for the well data used to develop all four models (well data). There is one zipped group of 10 files (one for each aquifer) for explanatory input data used to make predictions at grid points (prediction input). There are 9 zipped groups of files for model output; these include 1 zip file of predictions at grid points for each of the 4 models (prediction output), 1 zip file for combined pH and dissolved oxygen predictions (combined prediction output); and 4 zip files of uncertainty intervals for predictions for each of the 4 models (uncertainty output). Filenames for prediction input and for model output are distinguished by codes abbreviating the aquifer name and position in the vertical stack of 19 regional aquifers and confining units, as follows: Surficial aquifer, 1surf; Upper Chesapeake aquifer, 3upch; Lower Chesapeake aquifer, 5loch; Piney Point aquifer, 7pipt; Aquia aquifer, 9aqia; Monmouth - Mt. Laurel Aquifer, 11moml; Matawan aquifer, 13mtwn; Magothy Aquifer, 15mgty; Potomac-Patapsco aquifer, 17popt; Potomac-Patuxent aquifer, 19popx. The data release also contains a tif-format raster file of the prediction grid and two data tables that separately describe the explanatory variables (predictors) and their sources.
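The aquifer codes listed above combine a position in the 19-layer vertical stack with a short name abbreviation. The following is a minimal sketch of how such a code could be split into its two parts. The lookup table copies the ten codes from the description; the function name `parse_aquifer_code` is hypothetical, and how a code is embedded in an actual filename is not stated in the description, so the sketch handles only the bare code.

```python
import re
from typing import Tuple

# The ten aquifer codes and names as listed in the description above.
AQUIFER_CODES = {
    "1surf": "Surficial aquifer",
    "3upch": "Upper Chesapeake aquifer",
    "5loch": "Lower Chesapeake aquifer",
    "7pipt": "Piney Point aquifer",
    "9aqia": "Aquia aquifer",
    "11moml": "Monmouth - Mt. Laurel Aquifer",
    "13mtwn": "Matawan aquifer",
    "15mgty": "Magothy Aquifer",
    "17popt": "Potomac-Patapsco aquifer",
    "19popx": "Potomac-Patuxent aquifer",
}

# Leading digits give the stack position; trailing letters abbreviate the name.
_CODE_PATTERN = re.compile(r"^(\d+)([a-z]+)$")


def parse_aquifer_code(code: str) -> Tuple[int, str]:
    """Return (stack position, aquifer name) for a bare code such as '11moml'."""
    match = _CODE_PATTERN.match(code)
    if match is None or code not in AQUIFER_CODES:
        raise ValueError(f"unknown aquifer code: {code!r}")
    return int(match.group(1)), AQUIFER_CODES[code]
```

Sorting the codes by the parsed position, rather than as strings, keeps the aquifers in top-to-bottom order, since a plain string sort would place `11moml` before `1surf`'s neighbors such as `3upch`.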
Groundwater data, predictor variables, and rasters used for predicting the probability of high arsenic and high manganese in the Glacial Aquifer System, northern continental United States
Public Data Portal
This data release contains input data used in model development and TIF raster files used to predict the probability of high arsenic (As) and high manganese (Mn) in groundwater within the glacial aquifer system in the northern United States. Input data include measured As and Mn concentrations at groundwater wells, and associated predictor variable data. The probability of high As and high Mn was predicted using boosted regression tree methods using the gbm package in R version 4.0.0. The response variables for individual models were the occurrence of: (1) As >10 µg/L, and (2) Mn >300 µg/L. Water-quality data were compiled from three sources, as described in Wilson and others (2019): a compilation of data from numerous agencies and organizations at the state, regional, and local level; the U.S. Geological Survey National Water Information System; and the U.S. Environmental Protection Agency Safe Drinking Water Information System. The resultant dataset consisted of 10,001 As and 14,565 Mn measurements across the study area. A total of 108 predictor variables were originally considered for model development which included well characteristics, soil properties, aquifer properties, predicted nitrate, hydrologic position on the landscape, groundwater age, predicted pH, and predicted anoxic conditions. After model refinement, a total of 79 and 55 predictor variables were used for predicting the probability of high As and high Mn, respectively. The probability of high As and high Mn was predicted at two depths representative of public and domestic drinking water supply depths at a resolution of 1 km across the glacial aquifer.
American River At Folsom Powerhouse Manganese ug/L Time Series Data
Public Data Portal
Measurements of Manganese collected at American River At Folsom Powerhouse. Currently collected twice a year, previously collected quarterly. Access further information for this data set by contacting Bureau of Reclamation, California-Great Basin Region, Environmental Affairs Division (CGB-157). See ResultAttributes for STAFF_GAUGE, SMPL_DEPTH, SMPL_CATEGORY_NAME, METHOD_CODE, RESULT_RL, RESULT_RL-UNIT_STD_NAME, RESULT_MDL, RESULT_MDL-UNIT_STD_NAME, USBR_QA_SUBTYPE_NAME, USBR_QULFR_DESCRIPTION. STAFF_GAUGE is the water height in decimal feet measured by gauge (e.g., 15.2). SMPL_DEPTH is the vertical depth at which sample is collected (e.g., 0 - 15 cm). For water samples: depth below water/air interface. For sediment and soil samples: depth below water/solid or air/solid interface. SMPL_CATEGORY_NAME is the category type of sample (e.g., Composite). METHOD_CODE is the name of method used to obtain result (e.g., EPA 200.8). RESULT_RL is the result reporting limit (accounting for dilution) (e.g., 0.02). RESULT_RL-UNIT_STD_NAME is the unit associated with RESULT_RL (e.g., mg/L). RESULT_MDL is the result method detection limit (e.g., 0.007). RESULT_MDL-UNIT_STD_NAME is the unit associated with RESULT_MDL (e.g., mg/L). USBR_QA_SUBTYPE_NAME is the quality control type of the sample (e.g., USBR_BLANK_SPIKE). USBR_QULFR_DESCRIPTION is the quality assurance description (if any) (e.g., Result may have a high bias.).
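The series is reported in ug/L, while the example reporting limit above is given in mg/L, so comparing a result with its reporting limit needs a unit conversion. The following is a minimal sketch of that comparison. The class `ManganeseResult` and its layout are hypothetical; only the attribute meanings (RESULT_RL and RESULT_RL-UNIT_STD_NAME) come from the description, and only the two units that appear on this page are handled.

```python
from dataclasses import dataclass
from typing import Optional

# Conversion factors to micrograms per liter for the units mentioned above.
UNIT_TO_UG_L = {"ug/L": 1.0, "mg/L": 1000.0}


@dataclass
class ManganeseResult:
    """One measurement with two of its ResultAttributes (hypothetical layout)."""

    value_ug_l: float                     # measured manganese, in ug/L
    result_rl: Optional[float] = None     # RESULT_RL
    result_rl_unit: Optional[str] = None  # RESULT_RL-UNIT_STD_NAME

    def reporting_limit_ug_l(self) -> Optional[float]:
        """Return the reporting limit in ug/L, or None if it is not recorded."""
        if self.result_rl is None or self.result_rl_unit is None:
            return None
        if self.result_rl_unit not in UNIT_TO_UG_L:
            raise ValueError(f"unsupported unit: {self.result_rl_unit!r}")
        return self.result_rl * UNIT_TO_UG_L[self.result_rl_unit]

    def below_reporting_limit(self) -> Optional[bool]:
        """True if the value is under the reporting limit; None if unknown."""
        limit = self.reporting_limit_ug_l()
        return None if limit is None else self.value_ug_l < limit
```

With the example values from the description, a reporting limit of 0.02 mg/L corresponds to about 20 ug/L, so a result of 12 ug/L would be flagged as below the reporting limit.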
Machine-learning model predictions and rasters of arsenic and manganese in groundwater in the Mississippi River Valley alluvial aquifer
Public Data Portal
Groundwater from the Mississippi River Valley alluvial aquifer (MRVA) is a vital resource for agriculture and drinking-water supplies in the central United States. Water availability can be limited in some areas of the aquifer by high concentrations of trace elements, including manganese and arsenic. Boosted regression trees, a type of ensemble-tree machine-learning method, were used to predict manganese concentration and the probability of arsenic concentration exceeding a 10 µg/L threshold throughout the MRVA. Explanatory variables for the BRT models included attributes associated with well location and construction, surficial variables (such as hydrologic position and recharge), variables extracted from a MODFLOW-2005 groundwater-flow model for the Mississippi embayment, and variables from an airborne electromagnetic survey of the aquifer. This data release provides the R scripts to tune and reproduce the BRT models and final prediction rasters. For a full description of modeling workflow and final model selection see the companion journal article.
MODFLOW-NWT model used to assess groundwater availability in the Northern Atlantic Coastal Plain aquifer system from Long Island, New York to North Carolina
Public Data Portal
A three-dimensional groundwater flow model was developed with the numerical code MODFLOW-NWT to represent changes in groundwater pumping and aquifer recharge in the Northern Atlantic Coastal Plain aquifer system from Long Island, New York to North Carolina. The model was constructed using existing hydrogeologic and geospatial information to represent the aquifer system geometry, boundaries, and hydraulic properties of the 19 separate regional aquifers and confining units within the aquifer system. The model was calibrated using an inverse modeling parameter-estimation (PEST) technique to conditions from 1986 to 2008, the period for which data are most complete and reliable. The simulation period for this analysis spanned from predevelopment to future conditions, from 1900 to 2058. The model was used to advance the understanding of groundwater budgets and components including recharge, discharge, and aquifer storage for the entire system and for each of the statewide systems; compute historical and recent system response and project future system response to development at a scale relevant to basinwide water-management decisions; and evaluate options for hydrologic monitoring of system changes. The report 'Documentation of a groundwater flow model developed to assess groundwater availability in the Northern Atlantic Coastal Plain aquifer system from Long Island, New York, to North Carolina: U.S. Geological Survey Scientific Investigations Report 2016–5076' (https://doi.org/10.3133/sir20165076) documents the model design and calibration, as well as several simulations to test model construction assumptions. The report 'Assessment of groundwater availability in the Northern Atlantic Coastal Plain aquifer system from Long Island, New York, to North Carolina: U.S. Geological Survey Professional Paper 1829' (https://doi.org/10.3133/pp1829) documents water-availability simulations and the resulting analysis and discussion.
This USGS data release contains all of the input and output files for the simulations described in the associated reports (https://doi.org/10.3133/sir20165076) and (https://doi.org/10.3133/pp1829). This data release also includes (1) MODFLOW-NWT source code, (2) the PEST files and source code used for model calibration, and (3) the ZONEBUDGET input files and source code used for the groundwater availability analysis.