Dataset details
United States
Quantitative structure activity relationships (QSARs) and machine learning models for abiotic reduction of organic compounds by an aqueous Fe(II) complex
Due to the increasing diversity of organic contaminants discharged into anoxic water environments, reactivity prediction is necessary for chemical persistence evaluation for water treatment and risk assessment purposes. Almost all quantitative structure activity relationships (QSARs) that describe rates of contaminant transformation apply only to narrowly defined, relatively homogeneous families of reactants (e.g., dechlorination of alkyl halides). In this work, we develop predictive models for abiotic reduction of 60 organic compounds with diverse reducible functional groups, including nitroaromatic compounds (NACs), aliphatic nitro-compounds (ANCs), aromatic N-oxides (ANOs), isoxazoles (ISXs), polyhalogenated alkanes (PHAs), sulfoxides and sulfones (SOs), and others. Rate constants for their reduction were measured using a model reductant system, Fe(II)-tiron. Qualitatively, the rates followed the order NACs > ANOs ≈ ISXs ≈ PHAs > ANCs > SOs. To develop QSARs, both conventional chemical descriptor-based and machine learning (ML)-based approaches were investigated. Conventional univariate QSARs based on the molecular descriptor E_LUMO (energy of the lowest unoccupied molecular orbital) gave good correlations within classes. Multivariate QSARs combining E_LUMO with Abraham descriptors for physico-chemical properties gave slightly improved correlations within classes for NCs and NACs, but little improvement in correlation within other classes or among classes. The ML model obtained covers reduction rates for all classes of compounds and all of the conditions studied, with prediction accuracy similar to that of the conventional QSARs for individual classes (r² = 0.41-0.98 for univariate QSARs, 0.71-0.94 for multivariate QSARs, and 0.83 for the ML model). Both approaches required a scheme for a priori classification of the compounds for model training.
This work offers two alternative modelling approaches to comprehensive abiotic reactivity prediction for persistence evaluation of organic compounds in anoxic water environments. This dataset is associated with the following publication: Gao, Y., S. Zhong, T. Torralba-Sanchez, P. Tratnyek, E. Weber, Y. Chen, and H. Zhang. Quantitative structure activity relationships (QSARs) and machine learning models for abiotic reduction of organic compounds by an aqueous Fe(II) complex. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 192: 116843, (2021).
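As a rough illustration of what a univariate QSAR of this kind involves, the sketch below fits log k against E_LUMO by ordinary least squares for a single compound class and reports r². It is a minimal sketch, not the authors' procedure: the function name `fit_univariate_qsar` and all numeric values are hypothetical placeholders and are not taken from the dataset.

```python
from statistics import mean


def fit_univariate_qsar(e_lumo, log_k):
    """Least-squares fit of log_k = slope * e_lumo + intercept; returns r2 too."""
    x_bar, y_bar = mean(e_lumo), mean(log_k)
    sxx = sum((x - x_bar) ** 2 for x in e_lumo)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(e_lumo, log_k))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(e_lumo, log_k))
    ss_tot = sum((y - y_bar) ** 2 for y in log_k)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical placeholder values for one compound class (NOT from the dataset).
e_lumo_ev = [-3.2, -3.0, -2.8, -2.5, -2.3]   # assumed E_LUMO values, in eV
log_k_obs = [1.9, 1.4, 1.1, 0.2, -0.1]       # assumed log rate constants

slope, intercept, r2 = fit_univariate_qsar(e_lumo_ev, log_k_obs)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r2:.2f}")
```

Fitting each class separately in this way mirrors the within-class correlations described above; pooling all classes into one fit would be a different model.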
Data information
Related data
Designing QSARs for Parameters of High-Throughput Toxicokinetic Models Using Open-Source Descriptors
Public Data Portal
Additional details used in the methods are found in the MS Word file “S1_Dawson et al._Supporting_Information.docx”. The MS Excel file “S2_Dawson et al. Supporting Information.xlsx” contains datasets and graphical results. The Excel file sheets are as follows: S2.1 illustrates Clint hepatic flow calculations; S2.2-5 include training and test data sets; S2.6-7 include figures illustrating Clint model selection criteria and assemblages of model descriptors; S2.8 includes confusion matrices for evaluating the Clint model; S2.9-10 include figures illustrating fup model selection criteria and assemblages of model descriptors (with ranges); S2.11 includes tables of model assessments of the Clint test set; S2.12 includes information relevant to BER calculations for the ToxCast test set; S2.13 includes information relevant to BER calculations for Tox21 chemicals; and S2.14 provides information on different transformations for fup. This dataset is associated with the following publication: Dawson, D., B. Ingle, K. Phillips, J. Nichols, J. Wambaugh, and R. Tornero-Velez. Designing QSARs for Parameters of High-Throughput Toxicokinetic Models Using Open-Source Descriptors. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 55(9): 6505-6517, (2021).
Rapid Experimental Estimates of Physicochemical Properties
Public Data Portal
We have performed high-throughput experimental estimates of five physicochemical properties for a set of 200 chemicals to evaluate the consistency with previous measurements, factors impacting consistency and experimental success, and the applicability domain of the new data in relation to previously measured data and predictive models. This dataset is associated with the following publication: Nicolas, C., K. Mansouri, K. Phillips, C. Grulke, A. Richard, A. Williams, J. Rabinowitz, K. Isaacs, A. Yau, and J. Wambaugh. Rapid Experimental Estimates of Physicochemical Properties to Inform Models and Testing. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 636: 901-909, (2018).
Quantitative Structure-Use Relationship Model thresholds for Model Validation, Domain of Applicability, and Candidate Alternative Selection
Public Data Portal
This file contains the values of the model training set confusion matrix, the domain of applicability evaluation based on the structural similarity between training set chemicals and predicted chemicals, and the 75th percentile bioactivity index values for each QSUR model. This dataset is associated with the following publication: Phillips, K., J. Wambaugh, C. Grulke, K. Dionisio, and K. Isaacs. High-throughput screening of chemicals as functional substitutes using structure-based classification models. GREEN CHEMISTRY. Royal Society of Chemistry, Cambridge, UK, 19: 1063-1074, (2017).
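One common way to judge a similarity-based domain of applicability is to compare a query chemical's fingerprint with every training set fingerprint and keep the best match. The sketch below assumes Tanimoto similarity on sets of "on" bits and a cutoff of 0.5; the description above does not state which similarity measure or threshold the dataset uses, so both are assumptions, and the function names and fingerprints are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def in_domain(query: set, training: list, threshold: float = 0.5):
    """Return (inside_domain, best_similarity) for a query fingerprint.

    The threshold is an assumed illustrative cutoff, not a value from the dataset.
    """
    best = max(tanimoto(query, fp) for fp in training)
    return best >= threshold, best


# Hypothetical fingerprints given as indices of set bits (NOT real ToxPrint data).
training_fps = [{1, 2, 3, 8}, {2, 3, 5}, {7, 9}]
query_fp = {2, 3, 4}

inside, best = in_domain(query_fp, training_fps)
print(inside, round(best, 2))
```

A nearest-neighbour check like this only flags whether a prediction rests on structurally similar training chemicals; it says nothing about the accuracy of the prediction itself.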
Quaternary ammonium compounds in wastewater treatment effluents from 2020 to 2021
Public Data Portal
Quaternary ammonium compounds (QACs) are used in many commercial and household disinfection products. A method was developed to measure QAC occurrence in water discharged from three wastewater treatment plants (WWTPs) during 2020 and 2021. Water samples (20 mL) were processed via solid phase extraction using weak cation exchange (WCX) cartridges (3 cc, 60 mg) and analyzed via liquid chromatography tandem mass spectrometry. Samples were analyzed for 12 QACs, including benzethonium, six benzylalkyldimethyl ammonium compounds (BACs), three dialkyldimethyl ammonium compounds (DADMACs), and two ethylbenzylalkyldimethyl ammonium compounds (EBACs). QACs were detected in 100% of WWTP effluents, with individual concentrations ranging from 2.30 to 1,630 ng/L and total QAC concentrations (ΣQAC, the sum of the 12 QACs) ranging from 38.1 to 3,450 ng/L. On average, the QACs detected in effluents were 49% BACs (average ΣBAC = 168 ng/L), 38% DADMACs (average ΣDADMAC = 119 ng/L), 9% EBACs (ΣEBAC = 25.0 ng/L), and 4% benzethonium (15.6 ng/L).
Designing QSARs for parameters of high throughput toxicokinetic models using open-source descriptors
Public Data Portal
The MS Excel file (Dawson et al S2 Supporting information.xlsx) contains multiple sheets containing the training sets, test sets, and predictions for intrinsic metabolic clearance (Clint), fraction unbound in plasma (fup), and bioactivity-exposure ratios (BER), for ToxCast and pharmaceutical-like chemicals. The Word file (Dawson et al S1 Supporting Information.docx) provides additional supporting information on assembly of the training and test sets for Clint, fup, and BER. The data dictionary describes the terms used in the supporting information, S1 and S2. This dataset is associated with the following publication: Dawson, D., B. Ingle, K. Phillips, J. Nichols, J. Wambaugh, and R. Tornero-Velez. Designing QSARs for Parameters of High-Throughput Toxicokinetic Models Using Open-Source Descriptors. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 55(9): 6505-6517, (2021).
Quantitative Structure-Use Relationship (QSUR) Model Descriptors
Public Data Portal
This data set contains ToxPrint fingerprints for all chemicals in FUse that had QSAR-ready SMILES strings, as well as select physicochemical properties from the Estimation Program Interface Suite (EPI Suite) program. This dataset is associated with the following publication: Phillips, K., J. Wambaugh, C. Grulke, K. Dionisio, and K. Isaacs. High-throughput screening of chemicals as functional substitutes using structure-based classification models. GREEN CHEMISTRY. Royal Society of Chemistry, Cambridge, UK, 19: 1063-1074, (2017).
Transparency in Modeling through Careful Application of OECD’s QSAR/QSPR Principles via a Curated Water Solubility Data Set
Public Data Portal
Figures, Tables, and QMRF for "Charles N. Lowe, Nathaniel Charest, Christian Ramsland, Daniel T. Chang, Todd M. Martin, and Antony J. Williams Chemical Research in Toxicology 2023 36 (3), 465-478 DOI: 10.1021/acs.chemrestox.2c00379". This dataset is associated with the following publication: Lowe, C., N. Charest, C. Ramsland, D. Chang, T. Martin, and A. Williams. Transparency in Modeling through Careful Application of OECD’s QSAR/QSPR Principles via a Curated Water Solubility Data Set. CHEMICAL RESEARCH IN TOXICOLOGY. American Chemical Society, Washington, DC, USA, 36(3): 465-478, (2023).