Dataset details
United States
Predicting ABM Results with Covering Arrays and Random Forests
Our goal is to explore the feasibility and usefulness of using a combination of covering arrays and machine learning models for predicting results of an agent-based simulation model within the vast parameter value combination space. The challenge is to select parameter values that are representative of the overall behavior of the model, so that we can train the machine learning model to correctly predict behavior in previously untested areas of the parameter space. We have chosen Wilensky's Heat Bugs model in NetLogo for our study. It is a simple model, amenable to quick data generation, with a limited number of outputs to predict, and with emergent behavior. This model therefore allows exploration of this new approach. We utilize covering arrays to reduce the parameter value space systematically, run the model for each parameter set in the 2-way and 3-way covering arrays, train a random forest model on the 2-way data (33,351 parameter combinations), and test its ability to predict the outcome of the simulation on the significantly larger 3-way data that was not seen during the training of the model (3,971,955 parameter combinations).
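As a rough illustration of how a 2-way covering array shrinks a parameter space, the sketch below builds one greedily for a toy set of parameters. The parameter levels are hypothetical and are not the Heat Bugs settings used in the study, and the greedy search is a simple stand-in, not the covering array generator the authors used.

```python
from itertools import combinations, product


def pairs_of(row):
    """Return the set of (column, value, column, value) pairs that a row covers."""
    return {(i, row[i], j, row[j]) for i, j in combinations(range(len(row)), 2)}


def greedy_pairwise(levels):
    """Greedily build a 2-way covering array.

    levels: one list of candidate values per parameter.
    Returns rows such that every pair of values from any two parameters
    appears together in at least one row.
    """
    # Enumerating the full factorial is only practical for small toy spaces.
    candidates = list(product(*levels))
    uncovered = set()
    for row in candidates:
        uncovered |= pairs_of(row)

    array = []
    while uncovered:
        # Pick the row that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda r: len(pairs_of(r) & uncovered))
        array.append(best)
        uncovered -= pairs_of(best)
    return array


# Hypothetical parameter levels, chosen only for illustration.
levels = [[0, 1, 2], [0, 1, 2], [0, 1], [0, 1]]
rows = greedy_pairwise(levels)
full_size = len(list(product(*levels)))
print(f"{len(rows)} rows cover all pairs; the full factorial has {full_size} rows")
```

In a workflow like the one described, each row would be one simulation run, and the resulting input-output pairs would form the training set for the predictive model.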
Related data
A New Approach for Representing Agent-Environment Feedbacks: Coupled Agent-Based and State-And-Transition Simulation Models
Public Data Portal
Agent-based models (ABMs) and state-and-transition simulation models (STSMs) are two classes of simulation models that have proven useful for understanding the processes underlying complex, dynamic ecosystems and evaluating practical questions about how ecosystems will respond to different scenarios of global change and environmental management. ABMs can simulate many types of agents (i.e., autonomous units, such as wildlife, livestock, people, or viruses) and are advantageous because they can represent agent characteristics, decision-making, adaptive behavior, mobility, and interactions, and can capture feedbacks between agents and their environment. STSMs are flexible and intuitive models of landscape dynamics that can track landscape attributes and management scenarios, and integrate diverse data types (e.g., output from correlative and mechanistic models). Both ABMs and STSMs can be run spatially and track important metrics of management success, including costs. Despite the complementarity of these two approaches, they have not been connected through a dynamic linkage until now. We report on analytical techniques and software tools that we developed to couple these modeling approaches using NetLogo, R, and the ST-Sim package for SyncroSim. We demonstrate the capabilities and value of this new approach through a proof-of-concept modeling example focused on bison-vegetation interactions in Badlands National Park. This coupled approach: 1) streamlines handling of model inputs and outputs; 2) increases the temporal resolution of agent-environment interactions that are available in ST-Sim; 3) minimizes assumptions; and 4) generates more realistic spatio-temporal patterns. With the developments presented here, modelers can now use output from an ABM to dictate changes in vegetation and their characteristics within an STSM, and create more realistic and management-relevant simulations.
Modelled Land Capability of Tasmania - St Pauls 100,000 Mapsheet
Public Data Portal
A predictive model has been established and tested to account for variations in the landscape to reflect changes in agricultural land capability class (on a progressive rating from 1, good, to 7, poor). This dataset (and map) provides a prediction of the most likely land capability class to be expected in a particular location based on several layers of readily available information. These layers included geology, rainfall, slope, elevation, forest cover and surface drainage status. These data layers were input into a Geographic Information System modelling framework. Using previous experience and limited visits in the field, the output has been produced as a digital dataset and 1:100,000 map. It was found to provide a relatively good impression of the landscape's potential for agricultural pursuits (i.e. cropping and grazing). It was found to represent changes in capability class very well where geology, climate or slope control capability. In those areas where subsurface drainage controlled land capability it was found to be less reliable. Overall, however, as these areas of the State were previously devoid of any broadscale land resource information for this purpose, this map provides a valuable first step in discerning land capability.
Theory aware Machine Learning (TaML)
Public Data Portal
A code repository and accompanying data for incorporating imperfect theory into machine learning for improved prediction and explainability. Specifically, it focuses on the case study of the dimensions of a polymer chain in different solvent qualities. Jupyter Notebooks for quickly testing concepts and reproducing figures, as well as source code that computes the mean squared error as a function of dataset size for various machine learning models, are included. For additional details on the data, please refer to the README.md associated with the data. For additional details on the code, please refer to the README.md provided with the code repository (GitHub Repo for Theory aware Machine Learning). For additional details on the methodology, see Debra J. Audus, Austin McDannald, and Brian DeCost, "Leveraging Theory for Enhanced Machine Learning" *ACS Macro Letters* **2022** *11* (9), 1117-1122 DOI: [10.1021/acsmacrolett.2c00369](https://doi.org/10.1021/acsmacrolett.2c00369).
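To make "mean squared error as a function of dataset size" concrete, here is a minimal sketch that uses a plain least-squares line on synthetic data. It is not taken from the TaML repository: the data generator, the training sizes, and the helper names `fit_line`, `mse`, and `sample` are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of a fitted line on the given points."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so repeated runs give the same curve


def sample(n):
    """Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


# Held-out test set shared by every training size.
test_x, test_y = sample(500)

# Test MSE for each training-set size.
curve = {}
for n in (5, 20, 80, 320):
    train_x, train_y = sample(n)
    curve[n] = mse(fit_line(train_x, train_y), test_x, test_y)

for n, err in curve.items():
    print(f"n={n:4d}  test MSE={err:.4f}")
```

Swapping `fit_line` for another model while keeping the same loop would give comparable curves for different learners.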
Modelled Land Capability of Tasmania - Shannon 100,000 Mapsheet
Public Data Portal
A predictive model has been established and tested to account for variations in the landscape to reflect changes in agricultural land capability class (on a progressive rating from 1, good, to 7, poor). This dataset (and map) provides a prediction of the most likely land capability class to be expected in a particular location based on several layers of readily available information. These layers included geology, rainfall, slope, elevation, forest cover and surface drainage status. These data layers were input into a Geographic Information System modelling framework. Using previous experience and limited visits in the field, the output has been produced as a digital dataset and 1:100,000 map. It was found to provide a relatively good impression of the landscape's potential for agricultural pursuits (i.e. cropping and grazing). It was found to represent changes in capability class very well where geology, climate or slope control capability. In those areas where subsurface drainage controlled land capability it was found to be less reliable. Overall, however, as these areas of the State were previously devoid of any broadscale land resource information for this purpose, this map provides a valuable first step in discerning land capability.
Modelled Land Capability of Tasmania - Lake Sorell 100,000 Mapsheet
Public Data Portal
A predictive model has been established and tested to account for variations in the landscape to reflect changes in agricultural land capability class (on a progressive rating from 1, good, to 7, poor). This dataset (and map) provides a prediction of the most likely land capability class to be expected in a particular location based on several layers of readily available information. These layers included geology, rainfall, slope, elevation, forest cover and surface drainage status. These data layers were input into a Geographic Information System modelling framework. Using previous experience and limited visits in the field, the output has been produced as a digital dataset and 1:100,000 map. It was found to provide a relatively good impression of the landscape's potential for agricultural pursuits (i.e. cropping and grazing). It was found to represent changes in capability class very well where geology, climate or slope control capability. In those areas where subsurface drainage controlled land capability it was found to be less reliable. Overall, however, as these areas of the State were previously devoid of any broadscale land resource information for this purpose, this map provides a valuable first step in discerning land capability.
Machine Learning Modeling of Water Quality Based Risk Assessment
Public Data Portal
This is the geospatial and hydroclimate input data used to develop data-driven Machine Learning (ML) models as well as model estimated water quality based risk metrics and watershed health composite measure in three river basins in the Midwest. Model outputs that are used to construct the figures in the paper are displayed in the Excel file with the definitions of the data reported in each datasheet. The directory to the GIS data that were used to construct the inputs and spatially distributed risk metrics at the HUC-10 level is listed here and in the Scientific Data Management Plan. Portions of this dataset are inaccessible because: Large size GIS database. They can be accessed through the following means: C:\Users\MHantush\OneDrive - Environmental Protection Agency (EPA)\ScienceHUB\WH ML Modeling. Format: Generic GIS database.
model calibration
Public Data Portal
The East Fork data and the methods used to calibrate the model are detailed in the attached previously published EPA report.