In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning
공공데이터포털
QSAR Model Reporting Formats. Examples of R code: feature selection and regression analysis. Figure S1: Data distribution of logBCF, BP, MP and logVP. Figures S2–S5: Relationship between model complexity and prediction errors as well as the plots of estimated values versus experimental data for logBCF, BP, MP, and logVP, respectively. Figure S6: Plots of leverage versus standardized residuals for logBCF, BP, MP, and logVP models. Table S1: Chemical product classes for training and test sets. Tables S2–S5: Regression statistics for logBCF, BP, MP, and logVP, respectively. Table S6: Applicability domains for logBCF, BP, MP, and logVP. Tables S7–S12: Chemicals with large prediction residuals for the six properties (PDF) Chemical names, CAS registry number and SMILES as well as experimentally measured and estimated property values of the training and test sets (XLSX). This dataset is associated with the following publication: Zang, Q., K. Mansouri, A. Williams, R. Judson, D. Allen, W.M. Casey, and N.C. Kleinstreuer. (Journal of Chemical Information and Modeling) In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning. Journal of Chemical Information and Modeling. American Chemical Society, Washington, DC, USA, 57(1): 36-49, (2017).
Predicting ABM Results with Covering Arrays and Random Forests
공공데이터포털
Our goal is to explore the feasibility and usefulness of using a combination of covering arrays and machine learning models for predicting results of an agent- based simulation model within the vast parameter value combination space. The challenge is to select parameter values that are representative of the overall behavior of the model, so that we can train the machine learning model to be able to correctly predict behavior on previously untested areas of the parameter space. We have chosen Wilensky's Heat Bugs model in NetLogo for our study. It is a simple model, amenable to quick data generation, with a limited number of outputs to predict, and with emergent behavior. This model therefore allows exploration of this new approach.We utilize covering arrays to reduce the parameter value space systematically, run the model for each parameter set in the 2-way and 3-way covering arrays, train a random forest model on the 2-way data (33, 351 parameter combinations), and test its ability to predict the outcome of the simulation on the significantly larger 3-way data that was not seen during the training of the model (3, 971, 955 parameter combinations).
Importance of predictor variables for models of chemical function
공공데이터포털
Importance of random forest predictors for all classification models of chemical function. This dataset is associated with the following publication: Isaacs , K., M. Goldsmith, P. Egeghy , K. Phillips, R. Brooks, T. Hong, and J. Wambaugh. Characterization and prediction of chemical functions and weight fractions in consumer products. Toxicology Reports. Elsevier B.V., Amsterdam, NETHERLANDS, 3: 723-732, (2016).
Decision Analytic Aproach Survey Results
공공데이터포털
An elicitation with 32 experts informed relative prioritization of risks from chemical properties and human use factors for consumer product-related chemicals. Three different versions of the model were evaluated using distinct weight profiles. This dataset is associated with the following publication: Wood, M., K. Plourde, S. Larkin, P. Egeghy, A. Williams, V. Zemba, I. Linkov, and D. Vallero. Advances on a Decision Analytic Approach to Exposure‐Based Chemical Prioritization. RISK ANALYSIS. Blackwell Publishing, Malden, MA, USA, 40(1): 83-96, (2020).
Chemical Function Predictions for Tox21 Chemicals
공공데이터포털
Random forest chemical function predictions for Tox21 chemicals in personal care products uses and "other" uses. This dataset is associated with the following publication: Isaacs , K., M. Goldsmith, P. Egeghy , K. Phillips, R. Brooks, T. Hong, and J. Wambaugh. Characterization and prediction of chemical functions and weight fractions in consumer products. Toxicology Reports. Elsevier B.V., Amsterdam, NETHERLANDS, 3: 723-732, (2016).