Dataset Details
United States
Water Temperature of Lakes in the Conterminous U.S. Using the Landsat 8 Analysis Ready Dataset Raster Images from 2013-2023
This data release contains lake and reservoir water surface temperature summary statistics calculated from Landsat 8 Analysis Ready Dataset (ARD) images available within the Conterminous United States (CONUS) from 2013-2023. All zip files within this data release contain nested directories using .parquet files to store the data. The file example_script_for_using_parquet.R contains example code for using the R arrow package (Richardson and others, 2024) to open and query the nested .parquet files.

Limitations with this dataset include:
- All biases inherent to the Landsat Surface Temperature product are retained in this dataset, which can produce unrealistically high or low estimates of water temperature. This is observed to happen, for example, in cases with partial cloud coverage over a waterbody.
- Some waterbodies are split between multiple Landsat Analysis Ready Data tiles or orbit footprints. In these cases, multiple waterbody-wide statistics may be reported, one for each data tile. The deepest point values are extracted and reported for the tile covering the deepest point. A total of 947 waterbodies are split between multiple tiles (see the multiple_tiles = “yes” column of site_id_tile_hv_crosswalk.csv).
- Temperature data were not extracted from satellite images with more than 90% cloud cover.
- Temperature data represent skin temperature at the water surface and may differ from temperature observations from below the water surface.

Potential methods for addressing limitations with this dataset:
- Identifying and removing unrealistic temperature estimates:
  - Calculate the share of cloud pixels over a given waterbody as percent_cloud_pixels = wb_dswe9_pixels/(wb_dswe9_pixels + wb_dswe1_pixels). As written, this is a fraction between 0 and 1 (multiply by 100 for a percentage). Filter on a desired level of cloud coverage.
  - Remove lakes with a limited number of water pixel values available (wb_dswe1_pixels < 10).
  - Keep only waterbodies where the deepest point is identified as water (dp_dswe = 1).
- Handling waterbodies split between multiple tiles:
  - These waterbodies can be identified using the "site_id_tile_hv_crosswalk.csv" file (column multiple_tiles = “yes”). A user could combine sections of the same waterbody by spatially weighting the values using the number of water pixels available within each section (wb_dswe1_pixels). This should be done with caution, as some sections of the waterbody may have data available on different dates.

The zip files are organized as follows:
- "year_byscene=XXXX.zip" – includes temperature summary statistics for individual waterbodies and the deepest points (the furthest point from land within a waterbody) within each waterbody by the scene_date (when the satellite passed over). Individual waterbodies are identified by the National Hydrography Dataset (NHD) permanent_identifier included within the site_id column. Some of the .parquet files in the _byscene datasets may include only one dummy row of data (identified by tile_hv="000-000"). This happens when no tabular data is extracted from the raster images because of clouds obscuring the image, a tile that covers mostly ocean with a very small amount of land, or other possible causes. An example file path for this dataset follows: year_byscene=2023/tile_hv=002-001/part-0.parquet
- "year=XXXX.zip" – includes the summary statistics for individual waterbodies and the deepest points within each waterbody by the year (dataset=annual), month (year=0, dataset=monthly), and year-month (dataset=yrmon). The year_byscene=XXXX data are used as input for generating these summary tables, which aggregate temperature data by year, month, and year-month. Aggregated data are not available for the following tiles: 001-004, 001-010, 002-012, 028-013, and 029-012, because these tiles primarily cover ocean with limited
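The screening and tile-combining suggestions in this description can be sketched in plain Python. This is a minimal illustration, not the data release's own code (which is in R): the rows are assumed to have already been read from the .parquet files into dictionaries, the threshold defaults are arbitrary choices, the sample values are made up, and `wb_temp_mean` is a hypothetical name for whichever waterbody-wide temperature statistic is being combined. Only `wb_dswe1_pixels`, `wb_dswe9_pixels`, `dp_dswe`, `site_id`, `scene_date`, and `tile_hv` come from the description.

```python
from collections import defaultdict


def cloud_fraction(row):
    """Cloud pixels divided by cloud + water pixels (the formula in the text)."""
    cloud = row["wb_dswe9_pixels"]
    water = row["wb_dswe1_pixels"]
    total = cloud + water
    # Assumption: a record with no cloud or water pixels is treated as fully clouded.
    return cloud / total if total else 1.0


def keep_row(row, max_cloud_fraction=0.10, min_water_pixels=10, require_dp_water=True):
    """Apply the suggested screens; the default thresholds are illustrative only."""
    if cloud_fraction(row) > max_cloud_fraction:
        return False
    if row["wb_dswe1_pixels"] < min_water_pixels:
        return False
    if require_dp_water and row.get("dp_dswe") != 1:
        return False
    return True


def combine_sections(rows, value_col="wb_temp_mean"):
    """Water-pixel-weighted mean per (site_id, scene_date).

    value_col is a hypothetical column name. Grouping by scene_date keeps
    sections observed on different dates from being mixed together.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc = sums[(row["site_id"], row["scene_date"])]
        acc[0] += row[value_col] * row["wb_dswe1_pixels"]
        acc[1] += row["wb_dswe1_pixels"]
    return {key: total / weight for key, (total, weight) in sums.items() if weight > 0}


# Made-up sample records: one waterbody split across two tiles, one cloudy waterbody.
rows = [
    {"site_id": "nhd_1", "scene_date": "2023-07-01", "tile_hv": "002-001",
     "wb_dswe1_pixels": 300, "wb_dswe9_pixels": 0, "dp_dswe": 1, "wb_temp_mean": 20.0},
    {"site_id": "nhd_1", "scene_date": "2023-07-01", "tile_hv": "002-002",
     "wb_dswe1_pixels": 100, "wb_dswe9_pixels": 0, "dp_dswe": None, "wb_temp_mean": 24.0},
    {"site_id": "nhd_2", "scene_date": "2023-07-01", "tile_hv": "002-001",
     "wb_dswe1_pixels": 5, "wb_dswe9_pixels": 95, "dp_dswe": 9, "wb_temp_mean": 41.0},
]

# The deepest point is reported only for the tile that covers it, so the
# deepest-point screen is skipped before combining sections of a split waterbody.
screened = [r for r in rows if keep_row(r, require_dp_water=False)]
combined = combine_sections(screened)
```

With the sample records, the cloudy `nhd_2` record is dropped and the two `nhd_1` sections are combined in proportion to their water pixel counts.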
Data Information
Related Data
A remote sensing approach to characterize winter water level drawdown patterns in lakes
Public Data Portal
This data release consists of four datasets that were used for evaluating winter drawdown patterns in 166 Massachusetts lakes greater than 0.3 km² in surface area. The first dataset (“Water area and level.csv”) provides water area and water level time series data for the 166 lakes from 2016 to 2021. These time series were derived from the European Space Agency’s Sentinel-1 synthetic aperture radar satellite sensor using JavaScript code on the Google Earth Engine platform. Details of this code are described in the software release (https://doi.org/10.5066/P9ZA5I1U). The second dataset (“Water area interpolated.csv”) contains the linearly interpolated daily water area time series for the 166 lakes, derived from the first dataset and used as input to the winter drawdown classification model. The third dataset (“Winter drawdown classification.csv”) contains the model's binary classification (1 for winter drawdown and 0 for non-winter drawdown) of the 166 lakes for 5 years (2016–2021). The fourth dataset (“Winter drawdown metrics_2016.csv”, “Winter drawdown metrics_2017.csv”, “Winter drawdown metrics_2018.csv”, “Winter drawdown metrics_2019.csv”, and “Winter drawdown metrics_2020.csv”) contains winter drawdown metrics, such as the timing, duration, and magnitude of drawdown, derived for the winter drawdown lakes from the water area time series (second dataset) for 5 years. The code used for the classification model and drawdown metrics is also available in the software release (https://doi.org/10.5066/P9ZA5I1U).
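To make the "linearly interpolated daily water area" step concrete, here is a minimal sketch of linear interpolation from irregular satellite observation dates onto a daily series. It is not the authors' code (theirs is in the software release); the function name and the sample areas are made up for illustration, and observation dates are assumed to be unique.

```python
from datetime import date, timedelta


def interpolate_daily(observations):
    """Linearly interpolate (date, value) pairs onto a daily series.

    observations: list of (datetime.date, float) with unique dates and at
    least one entry. Returns one (date, value) pair per day from the first
    to the last observation, inclusive.
    """
    obs = sorted(observations)
    daily = []
    for (d0, v0), (d1, v1) in zip(obs, obs[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            frac = offset / span  # position between the two observations
            daily.append((d0 + timedelta(days=offset), v0 + frac * (v1 - v0)))
    daily.append(obs[-1])  # the final observation closes the series
    return daily


# Made-up water areas (km²) on three overpass dates.
observed = [
    (date(2016, 11, 1), 2.0),
    (date(2016, 11, 5), 1.0),
    (date(2016, 11, 7), 1.5),
]
daily_area = interpolate_daily(observed)
```

Each gap between overpasses is filled with evenly spaced values, so the daily series passes exactly through every observed point.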
Lake and landscape dataset used for analyses in Natural and anthropogenic controls on lake water-level decline and evaporation-to-inflow ratio in the conterminous US study-Fergus Limnology and Oceanography 2022
Public Data Portal
Lake and landscape data were compiled from the US Environmental Protection Agency National Lakes Assessment (NLA) 2007 and 2012 surveys and the LakeCat geospatial dataset. Additional climate variables were summarized from national PRISM and NOAA data layers following the same geoprocessing steps used to create LakeCat. The compiled dataset includes a derived metric that characterizes the degree of human-related water management presence on a lake that has the potential to significantly alter lake hydrology. The HydrAP metric (anthropogenic hydrological-alteration potential) uses information from the National Inventory of Dams and National Land Cover Database and is described in detail in Fergus et al. 2021. The compiled dataset includes all lake sites in the NLA 2007 survey and only new lake sites in NLA 2012 (i.e., excluding lake sites that were resampled across the two survey periods). We retained VISIT_NO = 1 observations for the analyses, for a total of 1716 observations for unique lake sites distributed across the conterminous US.
Daily predictions of water temperature for streams across the contiguous United States (1979-2021)
Public Data Portal
This model application data release provides the data processing and model code used to generate predictions of daily stream water temperature across the contiguous United States from 1979-2021. We used a recurrent graph convolutional network (RGCN) algorithm to make daily stream temperature predictions. Stream water temperature observations, along with forcing data consisting of daily meteorological information, a stream distance matrix, and static stream characteristics, were used to predict daily stream temperature summaries (minimum, mean, and maximum) for 57,810 stream segments across the contiguous United States. This model application data release is organized as follows:
• data_processing_code.zip contains the instructions and code needed to assemble inputs to the model. This directory contains a README.txt file that describes all major processing steps and outputs of this code.
• model_code.zip contains code to process the outputs from data_processing_code.zip into model-ready data structures and implements the modeling algorithm. This directory contains a README.txt file that describes all model-ready input files, major processing steps, and an overview of the modeling steps.
• national_temperature_metadata.xml describes the top-level files contained in this model application data release (model outputs and supporting reach-level metadata).
• The model outputs are contained in a Parquet database, where chunks of data are stored in regional and subregional (HUC2 and HUC4) nested folders titled huc2={HUC2 ID}.zip. Each HUC2 can be downloaded separately.
• data_access_pattern.R gives an example of how to extract and use the stream temperature predictions in this data release.
• reach_metadata.csv contains reach-level metadata that describes how the reach was used in the model (training or testing) and how the reach was classified (groundwater, atmospheric, reservoir, thermoelectric) for evaluation purposes.
The methods and results from this modeling effort are described in: Diaz, J., Oliver, S.K., Gorski, G. 2025. Evaluation of daily stream temperature predictions across the contiguous United States using a spatiotemporal aware machine learning algorithm. Environmental Modelling & Software, https://doi.org/10.1016/j.envsoft.2025.106655.
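The nested key=value folder layout described for the Parquet model outputs can be navigated by reading the partition values out of each path. The sketch below is only an illustration of that idea in plain Python; data_access_pattern.R in the release is the authoritative example. The `huc4=` folder key, the `part-0.parquet` file name, and the HUC codes used here are assumptions for illustration, since the description only confirms the `huc2={HUC2 ID}` naming.

```python
from pathlib import PurePosixPath


def parse_partitions(path):
    """Return {key: value} for every 'key=value' directory in a partitioned path."""
    partitions = {}
    for segment in PurePosixPath(path).parts[:-1]:  # skip the file name itself
        key, sep, value = segment.partition("=")
        if sep:  # ignore directories that are not key=value pairs
            partitions[key] = value
    return partitions


def select_files(paths, **wanted):
    """Keep only the paths whose partition values match every requested key."""
    return [
        p for p in paths
        if all(parse_partitions(p).get(k) == v for k, v in wanted.items())
    ]


# Hypothetical paths; the real folder and file names may differ.
paths = [
    "huc2=01/huc4=0101/part-0.parquet",
    "huc2=01/huc4=0102/part-0.parquet",
    "huc2=17/huc4=1701/part-0.parquet",
]
region_17 = select_files(paths, huc2="17")
```

Selecting files by partition value before reading them avoids loading regions that are not needed, which is the main benefit of this folder layout.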
Process-based water temperature predictions in the Midwest US: 6 Habitat metrics
Public Data Portal
This dataset summarizes a collection of annual thermal metrics to characterize lake temperature impacts on fish habitat for 7,150 lakes from uncalibrated models (PB0) and 449 lakes from calibrated models (PBALL). The dataset includes over 172 annual thermal metrics.
Daily surface temperature predictions for 185,549 U.S. lakes with associated observations and meteorological conditions (1980-2020)
Public Data Portal
Daily lake surface temperature estimates for 185,549 lakes across the contiguous United States from 1980 to 2020 were generated using an entity-aware long short-term memory deep learning model. In-situ measurements used for model training and evaluation come from 12,227 lakes and are included, along with daily meteorological conditions and lake properties. The median per-lake estimated error, found through cross validation on lakes with in-situ surface temperature observations, was 1.24 °C. The generated dataset will be beneficial for a wide range of applications, including estimation of thermal habitats and the impacts of climate change on inland lakes.
Water temperature data from the Pend Oreille River, Washington and Idaho, 2016-2018
Public Data Portal
The data were collected in the summers of 2016, 2017, and 2018. Continuous temperature loggers were deployed along the Pend Oreille River between Albeni Falls Dam and Box Canyon Dam. Loggers were checked every 1-2 weeks throughout the summer.