Dataset details
United States
Data for: Ignoring species availability biases occupancy estimates in single-scale occupancy models
We simulated over 28,000 datasets and saved their model outputs to answer the following three questions: (1) what is an adequate sampling design for the multi-scale occupancy model when there are a priori expectations of parameter estimates? (2) what is an adequate sampling design when we have no expectations of parameter estimates? and (3) what is the cost (in terms of bias, accuracy, precision, and coverage of occupancy estimates) if availability is not accounted for? Specifically, we simulated data under four scenarios:
Scenario 1 (n = 10,000): Species availability is constant across sites (but less than one).
Scenario 2 (n = 9,358): Species availability is heterogeneous across sites.
Scenario 3 (n = 2,815): Species availability is heterogeneous across years.
Scenario 4 (n = 5,942): Species availability is correlated with detection probability.
Then, for each scenario except the first, we analyzed the data using four different estimators: (i) constant multi-scale occupancy model, (ii) multi-scale occupancy model with a random-effects term in the availability part of the model, (iii) constant single-scale occupancy model, and (iv) single-scale occupancy model with a random-effects term in the detection part of the model. Note that the formulation of the random-effects terms included in the models mimicked the way the data were simulated (e.g., if species availability was heterogeneous across sites, then a site random-effects term was included in the models). The first scenario was analyzed using models (i) and (iii) only. For simplicity, we refer to models (i) and (iii) as 'constant' or 'fixed-effects' models, and to models (ii) and (iv) as 'random-effects' models.
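The four performance measures named in question (3) can be illustrated with a short sketch. The description does not give the formulas used, so the definitions below are common textbook choices assumed for illustration (bias as mean error, accuracy as root-mean-square error, precision as the standard deviation of the errors, coverage as the share of intervals containing the true value). The function and argument names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_estimates(truths, estimates, lowers, uppers):
    """Summarize estimator performance across simulated datasets.

    Each argument holds one value per simulated dataset: the true
    parameter, its point estimate, and the interval bounds.
    All definitions are assumptions, not taken from the data release.
    """
    errors = [est - true for est, true in zip(estimates, truths)]
    covered = [1.0 if lo <= true <= hi else 0.0
               for true, lo, hi in zip(truths, lowers, uppers)]
    return {
        "bias": mean(errors),                         # mean error
        "rmse": sqrt(mean([e * e for e in errors])),  # accuracy
        "sd": stdev(errors),                          # precision
        "coverage": mean(covered),                    # interval coverage
    }


summary = summarize_estimates(
    truths=[0.5, 0.5, 0.5, 0.5],
    estimates=[0.4, 0.5, 0.6, 0.5],
    lowers=[0.3, 0.4, 0.55, 0.4],
    uppers=[0.5, 0.6, 0.7, 0.6],
)
```

Comparing such summaries between a single-scale and a multi-scale fit of the same simulated data is one way to express the cost of ignoring availability.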
Summaries of the simulated data and model estimates are located in four folders, each corresponding to a different simulated scenario:
Scenario 1 (n = 10,000): Folder ModelOutput_Scen1_TwolevelSim; csv files holding data are named Results_TwoLevelAvail_2lev_x.csv
Scenario 2 (n = 9,358): Folder ModelOutput_Scen2_HeteroSite; csv files holding data are named Results_TwoLevelAvail_Hetero_x.csv
Scenario 3 (n = 2,815): Folder ModelOutput_Scen3_HeteroYear; csv files holding data are named Results_TwoLevelAvail_HeteroSeason_x.csv
Scenario 4 (n = 5,942): Folder ModelOutput_Scen4_Cor; csv files holding data are named Results_TwoLevelAvail_Cor_x.csv
Each row in each csv file describes a different simulated dataset and includes the sampling design, the true parameter values, and the model estimates. Other files in each folder hold the entire model output (.rda files), the time taken for the model run to complete (time_..csv), and an indicator of whether or not the model run finished (nsim...csv). For more information about those files, we point the user to the code that generated them:
Scenario 1 (n = 10,000): Scen1_Constant.R
Scenario 2 (n = 9,358): Scen2_HeteroSite.R
Scenario 3 (n = 2,815): Scen3_HeteroYear.R
Scenario 4 (n = 5,942): Scen4_Corr.R
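As a minimal sketch of how the folder layout above might be read, the following Python collects the result csv files for each scenario. It assumes that the trailing "x" in each file name is a placeholder for a run identifier, so it is treated as a wildcard. The helper names are hypothetical, and no column names are assumed because the description does not list them.

```python
import csv
from pathlib import Path

# Folder names and file-name stems come from the description above;
# treating the trailing "x" as a wildcard is an assumption.
SCENARIO_FILES = {
    "ModelOutput_Scen1_TwolevelSim": "Results_TwoLevelAvail_2lev_*.csv",
    "ModelOutput_Scen2_HeteroSite": "Results_TwoLevelAvail_Hetero_*.csv",
    "ModelOutput_Scen3_HeteroYear": "Results_TwoLevelAvail_HeteroSeason_*.csv",
    "ModelOutput_Scen4_Cor": "Results_TwoLevelAvail_Cor_*.csv",
}


def find_result_files(root):
    """Map each scenario folder to its sorted list of result csv paths."""
    root = Path(root)
    found = {}
    for folder, pattern in SCENARIO_FILES.items():
        directory = root / folder
        # A missing folder simply yields an empty list.
        found[folder] = sorted(directory.glob(pattern)) if directory.is_dir() else []
    return found


def load_rows(paths):
    """Read every row (one simulated dataset each) as a dict of strings."""
    rows = []
    for path in paths:
        with open(path, newline="") as handle:
            rows.extend(dict(row) for row in csv.DictReader(handle))
    return rows
```

The timing and completion files (time_..csv, nsim...csv) sit in the same folders but do not match these patterns, so they are skipped.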
Data information
Related data
Data to fit habitat suitability models at different invasion stages and their results to evaluate model decisions
Public Data Portal
This is a dataset containing the input and output data used in the analysis of best practices of invasive plant species distribution modeling (Young et al. 2024). We developed habitat suitability models for 13 invasive plant species at a variety of geographic ranges and different invasion stages and modeling strategies to assess the impact of predictor quality, thinning resolution, and geographic range of occurrence points on model performance. We developed a library of environmental variables at both the global scale and at the scale of the contiguous United States known to physiologically limit plant distributions (Young et al. 2024, Table S1) and relied on human input based on natural history knowledge to narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling (SAHM 2.2.2, Morisette et al., 2013). We used the Continuous Boyce Index (CBI) as a metric to assess model performance. This data bundle contains the merged data sets used to create the models, including location and associated environmental data, for each species, invasion stage, and modeling strategy, grouped by predictor set. In this data bundle, we have also included a dataframe of CBI values for each species, invasion stage, and modeling strategy, used in our analyses. The species include Ailanthus altissima, Alliaria petiolata, Brassica tournefortii, Cenchrus ciliaris, Chondrilla juncea, Cirsium vulgare, Dioscorea bulbifera, Imperata cylindrica, Lonicera maackii, Lysimachia nummularia, Microstegium vimineum, Pueraria montana, and Ranunculus testiculatus.
INHABIT species potential distribution across the contiguous United States
Public Data Portal
We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies. We applied the modeling workflow developed in Young et al. 2020 to species not included in the original case studies. Our methodology balanced trade-offs between developing highly customized models for a few species versus fitting non-specific and generic models for numerous species. We developed a national library of environmental variables known to physiologically limit plant distributions and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We accounted for uncertainty related to sampling bias by using two alternative sources of background samples, and constructed model ensembles using the 10 models for each species (five algorithms by two background methods) for four different thresholds. Each species folder contains the potential distribution of the species, and all raster layers were produced using VisTrails:SAHM [SAHM 2.1.2]. The 8 rasters represent the following:
1) MPP - minimum predicted presence threshold
2) 0.01 - one percentile threshold
3) 0.1 - ten percentile threshold
4) MaxSS - maximum sensitivity plus specificity threshold
5) MPP - minimum predicted presence threshold with Restricted Environmental Conditions
6) 0.01 - one percentile threshold with Restricted Environmental Conditions
7) 0.1 - ten percentile threshold with Restricted Environmental Conditions
8) MaxSS - maximum sensitivity plus specificity threshold with Restricted Environmental Conditions
These rasters will be integrated into the Invasive Species Habitat Tool (INHABIT), a web application displaying visual and statistical summaries of nationwide habitat suitability models for manager-identified invasive plant species.
These species include: African rue (Peganum harmala), Air potato (Dioscorea bulbifera), Amur honeysuckle (Lonicera maackii), Amur peppervine (Ampelopsis brevipedunculata), Annual bluegrass (Poa annua), Annual rye (Lolium multiflorum), Asian mustard (Brassica tournefortii), Beefsteak mint (Perilla frutescens), Bigleaf periwinkle (Vinca major), Bird vetch (Vicia cracca), Bishop's goutweed (Aegopodium podagraria), Black henbane (Hyoscyamus niger), Bohemian knotweed (Fallopia bohemica), Bradford pear (Pyrus calleryana), Buffelgrass (Cenchrus ciliaris), Bulbous bluegrass (Poa bulbosa), Bull thistle (Cirsium vulgare), Bur buttercup (Ranunculus testiculatus), Burning bush (Euonymus alatus), Camelthorn (Alhagi maurorum), Canada thistle (Cirsium arvense), Cereal rye (Secale cereale), Cheatgrass (Bromus tectorum), Chinaberry (Melia azedarach), Chinese holly (Ilex cornuta), Chinese privet (Ligustrum sinense), Chinese tallowtree (Triadica sebifera), Chinese wisteria (Wisteria sinensis), Chocolate vine (Akebia quinata), Clasping pepperweed (Lepidium perfoliatum), Cogongrass (Imperata cylindrica), Common crupina (Crupina vulgaris), Common gorse (Ulex europaeus), Common reed (Phragmites australis), Common tansy (Tanacetum vulgare), Coral ardisia (Ardisia crenata), Crape myrtle (Lagerstroemia indica), Creeping bentgrass (Agrostis stolonifera), Creeping buttercup (Ranunculus repens), Crested wheatgrass (Agropyron cristatum), Crown vetch (Securigera varia), Dalmatian toadflax (Linaria dalmatica), Diffuse knapweed (Centaurea diffusa), Dyer's woad (Isatis tinctoria), English holly (Ilex aquifolium), English ivy (Hedera helix), European beachgrass (Ammophila arenaria), False brome (Brachypodium sylvaticum), Field brome (Bromus arvensis), Fountaingrass (Pennisetum setaceum), French broom (Genista monspessulana), Fuller's teasel (Dipsacus fullonum), Garlic mustard (Alliaria petiolata), Giant knotweed (Fallopia sachalinensis), Hairy cat's ear (Hypochaeris radicata), Halogeton (Halogeton glomeratus),
INHABIT species potential distribution across the contiguous United States (ver. 3.0, February 2023)
Public Data Portal
We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies. We applied the modeling workflow developed in Young et al. 2020 to species not included in the original case studies. Our methodology balanced trade-offs between developing highly customized models for a few species versus fitting non-specific and generic models for numerous species. We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1: https://doi.org/10.1371/journal.pone.0263056) and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We accounted for uncertainty related to sampling bias by using two alternative sources of background samples, and constructed model ensembles using the 10 models for each species (five algorithms by two background methods) for three different thresholds (conservative to targeted). This data bundle contains a single file of tabular summaries by management unit (including each species/ ensemble type combination) and a subfolder for each species that contains the merged data sets used to create models, the six raster files associated with the species, and tabular outputs including response curve data, variable importance information, and model assessment metrics. 
The six rasters represent the following:
1) 0.01 - one percentile threshold
2) 0.1 - ten percentile threshold
3) MaxSS - maximum sensitivity plus specificity threshold
4) 0.01 - one percentile threshold with Restricted Environmental Conditions
5) 0.1 - ten percentile threshold with Restricted Environmental Conditions
6) MaxSS - maximum sensitivity plus specificity threshold with Restricted Environmental Conditions
The bundle documentation files are:
1) 'INHABIT_V3_metdata.xml' (this file), which contains the project-level metadata.
2) managementSummaries.csv contains the tabular summaries by management unit.
3) 'mergedDataset.csv' contains the merged data set used to create the models, including location and associated environmental data, for each species.
4) XX.tif, where XX is the raster type explained above (threshold; masked or not).
5) responseCurves.csv contains the tabular information needed to produce response curves for each predictor retained in each of the 10 models produced for each species.
6) variableImportance.csv contains tabular summaries indicating predictor importance for each of the 10 models produced for each species.
7) assessmentMetrics.csv contains tabular summaries of assessment metrics for each model or ensemble for each species.
These data will be integrated into the third version of the Invasive Species Habitat Tool (INHABIT), a web application displaying visual and statistical summaries of nationwide habitat suitability models for manager-identified invasive plant species.
These species include: African rue (Peganum harmala), Air potato (Dioscorea bulbifera), Alkali swainsonpea (Sphaerophysa salsula), Amur honeysuckle (Lonicera maackii), Amur maple (Acer ginnala), Amur peppervine (Ampelopsis brevipedunculata), Annual bluegrass (Poa annua), Annual rye (Lolium multiflorum), Asian mustard (Brassica tournefortii), Autumn olive (Elaeagnus umbellata), Balloon vine (Cardiospermum halicacabum), Beefsteak mint (Perilla frutescens), Bermudagrass (Cynodon dactylon), Bigleaf periwinkle (Vinca major), Bird vetch (Vicia cracca), Bishop's goutweed (Aegopodium podagraria), Black henbane (Hyoscyamus niger), Bohemian knotweed (Fallopia bohemica), Bradford pear (Pyrus calleryana), Brazilian peppertree (Schinus terebinthifolius), Briton's wild petunia (Ruellia simplex), Broad leaved helleborine (Epipactis helleborine), Buffelgrass (Cenchrus ciliaris), Bulbous bluegrass (Poa bulbosa), Bull thistle (Cirsium vulgare), Bur
Simulation to evaluate response of population models to annual trends in detectability
Public Data Portal
In 'Simulation to evaluate response of population models to annual trends in detectability', we provide data and R code necessary to create simulation scenarios and estimate trends with different population models (Monroe et al. 2019). Literature cited: Monroe, A. P., G. T. Wann, C. L. Aldridge, and P. S. Coates. 2019. The importance of simulation assumptions when evaluating detectability in population models. Ecosphere 10(7):e02791. 10.1002/ecs2.2791
INHABIT species potential distribution across the contiguous United States (ver. 4.0, June 2024)
Public Data Portal
This is a dataset containing the potential distribution of 259 invasive terrestrial plant species. We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies and other managers. We applied the modeling workflow developed in Young et al. (2020, https://doi.org/10.1371/journal.pone.0229253) and adapted by Jarnevich et al. (2023, https://doi.org/10.1016/j.ecoinf.2023.101997). We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1: https://doi.org/10.1371/journal.pone.0263056) and relied on human input based on natural history knowledge to narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling (SAHM 2.2.2, Morisette et al., 2013). For each species, we generated up to three groups of models reflecting various levels of suitability including suitability for occurrence, suitability for abundance (>5% cover), and suitability for high abundance (>25% cover), where there were enough data available to create models. For occurrence, we accounted for uncertainty related to sampling bias by using two alternative sources of background samples. For all three groups of models, we constructed weighted ensembles using up to 20 models (occurrence) or 10 models (abundance) for each species. We also combined the three ensembles using three different thresholds converting the continuous values to suitable/unsuitable, ranging from inclusive to restrictive. This data bundle contains a single file of tabular summaries by management unit (including each species/ensemble type/abundance level combination), a file describing the changes from version 3, and a species metadata file. 
There is also a subfolder for each species that contains the merged data sets used to create models, up to 9 raster files associated with the species, and tabular outputs including response curve data, variable importance information, and model assessment metrics. The up to nine rasters included in each species subfolder represent the following:
1) Occurrence suitability - Continuous value ensemble
2) Abundance suitability - Continuous value ensemble
3) High abundance suitability - Continuous value ensemble
4) Restricted occurrence suitability - Continuous value ensemble with restricted environmental conditions*
5) Restricted abundance suitability - Continuous value ensemble with restricted environmental conditions*
6) Restricted high abundance suitability - Continuous value ensemble with restricted environmental conditions*
7) 0.01 - first percentile threshold applied to model group ensemble
8) 0.05 - fifth percentile threshold applied to model group ensemble
9) 0.1 - tenth percentile threshold applied to model group ensemble
*Restricted environmental conditions = only display areas where environmental characteristics are inside the range of the values used to develop the model. For example, a location with a minimum winter temperature of 12 C would be outside the range of -10 to 10 C used in model development.
The bundle documentation files are:
1) 'project_metadata_INHABIT_V4.xml' (this file), which contains the project-level metadata.
2) managementSummaries.csv contains the tabular summaries by management unit.
3) 'INHABIT_VersionHistory.txt' contains information on the methodological changes made between this release and the previous data release.
4) 'species_metadata.csv' contains information on specific model changes for each species from tuning algorithm parameters to ensure model quality.
5) 'mergedDataset.csv' contains the merged data set used to create the models, including location and associated environmental data, for each species.
6) XX.tif, where XX is the raster type explained above.
7) 'responseCurves.csv' contains the tabular information needed to produce response curves for each predictor
Management summary table for INHABIT species potential distribution across the contiguous United States: additional management units
Public Data Portal
We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies. We applied the modeling workflow developed in Young et al. 2020 to species not included in the original case studies. Our methodology balanced trade-offs between developing highly customized models for a few species versus fitting non-specific and generic models for numerous species. We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1) and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We accounted for uncertainty related to sampling bias by using two alternative sources of background samples, and constructed model ensembles using the 10 models for each species (five algorithms by two background methods) for three different thresholds (conservative to targeted). This data release contains tabular summaries by management unit (including each species/ensemble type combination). These data will be integrated into the third version of the Invasive Species Habitat Tool (INHABIT), a web application displaying visual and statistical summaries of nationwide habitat suitability models for manager-identified invasive plant species. This file specifically, managementSummaries.csv, contains tabular summaries of the raster outputs summarized for each management unit for all species.
These species include: African rue (Peganum harmala), Air potato (Dioscorea bulbifera), Alkali swainsonpea (Sphaerophysa salsula), Amur honeysuckle (Lonicera maackii), Amur maple (Acer ginnala), Amur peppervine (Ampelopsis brevipedunculata), Annual bluegrass (Poa annua), Annual rye (Lolium multiflorum), Asian mustard (Brassica tournefortii), Autumn olive (Elaeagnus umbellata), Balloon vine (Cardiospermum halicacabum), Beefsteak mint (Perilla frutescens), Bermudagrass (Cynodon dactylon), Bigleaf periwinkle (Vinca major), Bird vetch (Vicia cracca), Bishop's goutweed (Aegopodium podagraria), Black henbane (Hyoscyamus niger), Bohemian knotweed (Fallopia bohemica), Bradford pear (Pyrus calleryana), Brazilian peppertree (Schinus terebinthifolius), Briton's wild petunia (Ruellia simplex), Broad leaved helleborine (Epipactis helleborine), Buffelgrass (Cenchrus ciliaris), Bulbous bluegrass (Poa bulbosa), Bull thistle (Cirsium vulgare), Bur buttercup (Ranunculus testiculatus), Burning bush (Euonymus alatus), Caesarweed (Urena lobata), Camelthorn (Alhagi maurorum), Camphortree (Cinnamomum camphora), Canada thistle (Cirsium arvense), Cape-ivy (Delairea odorata), Castor bean (Ricinus communis), Cat's claw creeper (Dolichandra unguis-cati), Cereal rye (Secale cereale), Cheatgrass (Bromus tectorum), Chinaberry (Melia azedarach), Chinese holly (Ilex cornuta), Chinese pistache (Pistacia chinensis), Chinese privet (Ligustrum sinense), Chinese tallowtree (Triadica sebifera), Chinese wisteria (Wisteria sinensis), Chocolate vine (Akebia quinata), Clasping pepperweed (Lepidium perfoliatum), Coco yam (Colocasia esculenta), Cogongrass (Imperata cylindrica), Common buckthorn (Rhamnus cathartica), Common crupina (Crupina vulgaris), Common gorse (Ulex europaeus), Common reed (Phragmites australis), Common tansy (Tanacetum vulgare), Common wormwood (Artemisia vulgaris), Coral ardisia (Ardisia crenata), Crape myrtle (Lagerstroemia indica), Creeping bentgrass (Agrostis stolonifera), Creeping buttercup (Ranunculus repens), Crested wheatgrass (Agropyron cristatum), Crown vetch (Securigera varia), Curly dock (Rumex
Occupancy model coefficients and observed co-occurrence simulations for sicklefin chub, sturgeon chub, and associated fishes in the Missouri River
Public Data Portal
Existing population monitoring and habitat assessment data sets for benthic species were used as inputs for occupancy models focused on sicklefin chub and sturgeon chub. The goals were to describe the temporal, spatial, and environmental factors associated with occupancy patterns of each chub species, to assess co-occurrence of the two species, and to determine relationships between co-occurrence and environmental factors. We also used three-species occupancy models to assess co-occurrence of these chubs with other primarily benthic species. This data set comprises the outputs of these models.
Data to create and evaluate distribution models for invasive species for different geographic extents
Public Data Portal
We developed habitat suitability models for invasive plant species selected by Department of Interior land management agencies. We applied the modeling workflow developed in Young et al. 2020 to species not included in the original case studies. Our methodology balanced trade-offs between developing highly customized models for a few species versus fitting non-specific and generic models for numerous species. We developed a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1: https://doi.org/10.1371/journal.pone.0263056) and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We accounted for uncertainty related to sampling bias by using two alternative sources of background samples, and constructed model ensembles using the 10 models for each species (five algorithms by two background methods) for three different thresholds (conservative to targeted). The mergedDataset_regionalization.csv file contains predictor values associated with pixels underlying each presence and background point. The testStripPoints_regionalization.csv file contains the locations of the modeled species occurring in the different geographic test strips.
Thresholded abundance models for three invasive plant species in the United States
Public Data Portal
We developed habitat suitability models for three invasive plant species: stiltgrass (Microstegium vimineum), sericea lespedeza (Lespedeza cuneata), and privet (Ligustrum sinense). We applied the modeling workflow developed in Young et al. 2020, developing similar models for occurrence data, but also models trained using species locations with percent cover ≥10%, ≥25%, and ≥50%. We chose predictors from a national library of environmental variables known to physiologically limit plant distributions (Engelstad et al. 2022 Table S1) and relied on human input based on natural history knowledge to further narrow the variable set for each species before developing habitat suitability models. We developed models using five algorithms with VisTrails: Software for Assisted Habitat Modeling [SAHM 2.1.2]. We selected background samples using the target background approach, and took an alternative approach to construct model ensembles by combining the one percentile and ten percentile threshold rules (suitability values associated with the lowest one percent and lowest ten percent of the training data) to categorize the continuous output from each algorithm into low (below the one percentile), moderate (between the one and ten percentile), and high (above the ten percentile) suitability. Finally, we summed these categories across algorithms to create an ensemble. This data bundle contains the merged data sets used to create the models, the composite raster files for each abundance threshold associated with each species, tabular summaries by management unit (including each species/composite type combination), and the occurrence points with their associated cover. The spatial data are organized in a separate folder for each species, each containing 5 rasters describing potential habitat suitability for the species at the different abundance thresholds. Each raster represents the composite map (composite_abundX.tif) for one abundance threshold.
The bundle documentation files are:
1) 'thresholded_abundance_project_metdata.xml' (this file), which contains the project-level metadata.
2) 'mergedDataset.csv' contains the merged data set used to create the models, including location and associated environmental data, for all three species for each thresholded abundance.
3) XX.tif, where XX is the raster type explained above (abundance threshold).
4) managementSummary.csv contains the tabular summaries by management unit.
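The composite ensemble described for this bundle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAHM implementation: the percentile rule is a simple sorted-index lookup, the low/moderate/high categories are scored 0/1/2 (the description does not give the numeric coding), and all function names are hypothetical.

```python
def percentile_threshold(training_values, fraction):
    """Suitability value at the given lower fraction of the training data.

    Uses a plain sorted-index rule; the exact rule used by the
    modeling software may differ (assumption).
    """
    ordered = sorted(training_values)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]


def categorize(value, one_pct, ten_pct):
    """Score a continuous suitability value: 0 low, 1 moderate, 2 high."""
    if value < one_pct:
        return 0
    if value < ten_pct:
        return 1
    return 2


def composite_score(values, training_sets):
    """Sum the per-algorithm categories for a single location.

    values: one continuous suitability value per algorithm.
    training_sets: that algorithm's suitability values at training points.
    """
    total = 0
    for value, training in zip(values, training_sets):
        one_pct = percentile_threshold(training, 0.01)
        ten_pct = percentile_threshold(training, 0.10)
        total += categorize(value, one_pct, ten_pct)
    return total


# Toy training data: 100 evenly spaced suitability values per algorithm.
training = [i / 100 for i in range(100)]
score = composite_score([0.005, 0.05, 0.5], [training] * 3)
```

With five algorithms and this 0/1/2 coding, a composite value would range from 0 (low suitability under every algorithm) to 10 (high under every algorithm).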