Education Data Utilization and Support Service

Dataset details

United States

Modeled daily salinity derived from multiple machine learning methodologies for 91 salinity monitoring sites in the northern Gulf of Mexico, 1980–2021

This data release consists of statistical predictions of daily salinity time series generated from the makESTUSAL software repository described by Asquith and others (2023b). The statistical methods included multiple methods of machine learning, which produced the daily salinity prediction and attendant credible uncertainties included in the data release. The geographic scope includes the predictions for 91 locations within bays and estuaries of the Gulf of Mexico, United States. The 91 locations are organized across 15 salinity groups and represented in the organizational structure of this data release. The input data files of imputed salinity (observations, response variable) and covariates (predictor variables) for the makESTUSAL software were created by use of a companion software (covESTUSAL) (Asquith and others, 2023a). These input data are provided by Banks and others (2024).

Data information

Data portal
United States
META URL
https://catalog.data.gov/dataset/modeled-daily-salinity-derived-from-multiple-machine-learning-methodologies-for-91-salinit-ed0e6
License
notspecified
Cost
Providing organization
Department of the Interior
Managing department
Data

Related data

Modeled daily salinity derived from multiple machine learning methodologies and generalized additive models for three salinity monitoring sites in Mobile Bay, northern Gulf of Mexico, 1980–2021

Public Data Portal

Results from generalized additive models (GAM), random forest models (RFM), and cubist models (CUB) for three salinity sites in Mobile Bay operated by Dauphin Island Sealab (DIS) are reported in this data release. These sites are Meaher Park (DIS:MHPA1), Middle Bay Lighthouse (DIS:MBLA1), and Dauphin Island (DIS:DPIA1). The constructed models predicted a 42-year daily salinity record from 1980 to 2021 at each site based on incomplete imputed salinity records and several explanatory variables. Explanatory variables included: daily streamflow from 8 United States Geological Survey (USGS) streamgages, daily minimum and maximum temperature, precipitation, vapor pressure, wind speed, wind direction, horizontal and vertical wind speed lagged from 0 to 7 days, altitude and azimuth of the sun and moon, and the positive and negative slopes of streamflow change over the previous seven days. Two GAM, two RFM, and two CUB salinity models were developed for each site using even-year and odd-year holdouts. The final predicted salinity time series were derived from inverse-error-weighted pooling of the even- and odd-year model results for each model type. A similar methodology was used to pool the even- and odd-year models from the three model types to create a time series of daily salinity predictions from the ensemble of models. By applying model tests, prediction-interval estimates for the GAM, RFM, and CUB models were determined along with the pooled model-ensemble predictions, as shown in the model input. The even- and odd-year models in the model input helped determine the pooled predictions and prediction intervals. The RFM and CUB models displayed variable importance, along with variable significance as seen in the GAM. Predicted salinity levels vary from measured values, and certain maximum salinity predictions potentially exceed the natural conditions expected in Mobile Bay.
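The inverse-error-weighted pooling mentioned above can be illustrated with a minimal sketch. The abstract does not state which error measure or weighting formula the authors used, so this example assumes weights proportional to the reciprocal of each model's holdout error (for instance an RMSE). The function names and all numbers are hypothetical and are not taken from the data release or the makESTUSAL software.

```python
def inverse_error_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1.

    Assumption: a smaller holdout error means a larger weight; the data
    release does not specify the exact formula.
    """
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def pool_predictions(predictions, errors):
    """Pool per-model daily predictions into one weighted daily series.

    predictions: list of equal-length lists, one per model.
    errors: one positive error value per model.
    """
    if len(predictions) != len(errors):
        raise ValueError("need one error value per model")
    n_days = len(predictions[0])
    if any(len(p) != n_days for p in predictions):
        raise ValueError("all prediction series must have the same length")
    weights = inverse_error_weights(errors)
    return [
        sum(w * p[day] for w, p in zip(weights, predictions))
        for day in range(n_days)
    ]


# Hypothetical daily salinity predictions from the two holdout models.
pred_even_holdout = [10.0, 12.0, 20.0]
pred_odd_holdout = [14.0, 16.0, 24.0]

# Hypothetical holdout errors: the first model is three times more accurate.
pooled = pool_predictions([pred_even_holdout, pred_odd_holdout], [1.0, 3.0])
```

With these made-up errors the first model receives a weight of 0.75 and the second 0.25, so each pooled value sits closer to the first model's prediction. The same function could pool across model types by passing more than two series.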

Imputed daily salinity and associated covariates to support statistical modeling for 91 salinity monitoring sites in the northern Gulf of Mexico

Public Data Portal

Imputed salinities from either salinity or specific conductance observations, together with covariate data, are provided for 15 salinity groups in data structures deemed suitable for statistical modeling of salinity in near-coastal environments of the northern Gulf of Mexico. The data herein were created by the 'covardr2formodel.R' script of the 'covESTUSAL' software (Asquith and others, 2023), which represents terminal decisions on variable setup and transformations. The design ideal is that data downloaded from this data release would be used in some path of “input” within statistical modeling software. Copious documentation of the decision process for data assembly is provided by Asquith and others (2023).

Geospatial representations of salinity monitoring site and bay and estuary group boundaries in the Gulf of Mexico

Public Data Portal

The polygon datasets were created to assist in visualizing the results of salinity modeling in Gulf of Mexico estuaries and bays. Statistical algorithms (Asquith and others, 2023) were developed to predict daily salinities for 91 salinity monitoring sites (Rodgers and Swarzenski, 2019) operated by 7 agencies in near coastal United States waters of the Gulf of Mexico. These monitoring sites are assigned to 15 salinity groups roughly corresponding to distinct bays and estuaries. The statistical algorithms facilitate the study of trends and drivers of salinity in near coastal waters. The groups polygon dataset consists of 15 polygons representing the outer boundary or hull of each of the 15 salinity groups. The site polygons dataset consists of 91 polygons—one polygon each per salinity monitoring site. The polygons were created using the Watershed Boundary Dataset, the National Hydrography Dataset, and aerial imagery. A detailed description of the polygon creation method is in the metadata processing steps. Creation of the polygons was motivated by a need to construct visual cues (maps and map animations) for testing the veracity of the statistical algorithms.

Public Data Portal

The dataset folder entitled “SanAn” holds data structures consisting of statistical predictions of daily salinity time series for the San Antonio Bay (SanAn) group, generated from the makESTUSAL software repository described by Asquith and others (2023b). The statistical methods included multiple methods of machine learning, which produced the daily salinity prediction and attendant credible uncertainties included in the data release. The geographic scope of the SanAn group includes the predictions for five locations defined using agency code and salinity site abbreviations.

Public Data Portal

The dataset folder entitled “LagMa” holds data structures consisting of statistical predictions of daily salinity time series for the Laguna Madre (LagMa) group, generated from the makESTUSAL software repository described by Asquith and others (2023b). The statistical methods included multiple methods of machine learning, which produced the daily salinity prediction and attendant credible uncertainties included in the data release. The geographic scope of the LagMa group includes the predictions for three locations defined using agency code and salinity site abbreviations.

Public Data Portal

The dataset folder entitled “SabLa” holds data structures consisting of statistical predictions of daily salinity time series for the Sabine Lake (SabLa) group, generated from the makESTUSAL software repository described by Asquith and others (2023b). The statistical methods included multiple methods of machine learning, which produced the daily salinity prediction and attendant credible uncertainties included in the data release. The geographic scope of the SabLa group includes the predictions for two locations defined using agency code and salinity site abbreviations.
