The sensitivity of transcriptomics BMD modeling to the methods used for microarray data normalization
Public Data Portal
This dataset is a project file generated by BMDExpress 2.2 software (Sciome, Research Triangle Park, NC). It contains gene expression data for livers of rats exposed to 4 chemicals (crude MCHM, neat MCHM, DMPT, p-toluidine) and kidneys of rats exposed to PPH. The project file includes expression data (GeneChip Rat 230 2.0 Array) normalized using 7 different pre-processing methods (RMA, GCRMA, MAS5.0, MAS5.0_noA calls, PLIER, PLIER16, and PLIER16_noA calls); differentially expressed probe sets detected by Williams' method (p<0.05 and a minimum fold change of 1.5); and probe set-level and pathway-level BMD and BMDL values from transcriptomic dose-response modeling. This dataset is associated with the following publication: Mezencev, R., and S. Auerbach. The sensitivity of transcriptomics BMD modeling to the methods used for microarray data normalization. PLOS ONE. Public Library of Science, San Francisco, CA, USA, 15(5): e0232955, (2020).
Liver weight changes in rats and mice database
Public Data Portal
This dataset was prepared from the US Environmental Protection Agency's (EPA) Toxicity Reference Database (ToxRefDB), which contains information for 1,142 chemicals and 5,960 studies. Curations include information regarding the study design, chemical identity, dosing, treatment group parameters, treatment-related (significantly different from control) and critical (adverse) effects for all dose treatment groups, as well as endpoint testing status according to guideline specifications. ToxRefDB data were examined for all subchronic (SUB) studies with complete curations, which included registrant-submitted toxicity studies from the US EPA's Office of Pesticide Programs (OPP) and guideline studies sourced from the National Toxicology Program (NTP). Statistically significant differences between treatment and control group data at p<0.05 within the source documents were extracted and denoted with a "treatment-related" Boolean indicator "true". Across the studies with absolute liver weights (ALW) and relative-to-body liver weights (RLW), the treatment-related mean effect values at the lowest effect (LE) dose levels, as well as mean control liver weights, were determined for all chemical-study-sex-species-exposure route groupings. The LE-ALW and LE-RLW changes were quantified as effect size differences from control using the following equation:
Effect_size = 100 x (LE Effect_value - Control Effect_value) / Control Effect_value
Any microscopic liver pathology effects occurring at the corresponding LE dose level of weight change were also identified and listed in the dataset. Histopathology terms were presented as they appeared in ToxRefDB, without harmonizing different hierarchical levels or aggregating multiple terms used to depict the same lesions.
The final dataset includes chemical stressor information, study source identifiers, study type, sex, species, strain, administration route, administration method, dose level, mg/kg/day value, qualitative and quantitative effect information, effect size from control, and pathology effects if present. The dataset includes data from 389 subchronic studies on 273 chemicals. This dataset is associated with the following publication: Mezencev, R., M. Feshuk, L. Kolaczkowski, G. Peterson, Q. Zhao, S. Watford, and J. Weaver. The association between histopathologic effects and liver weight changes induced in mice and rats by chemical exposures: an analysis of the data from Toxicity Reference Database (ToxRefDB). TOXICOLOGICAL SCIENCES. Society of Toxicology, RESTON, VA, 200(2): 404-413, (2024).
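The effect-size equation in the description above can be sketched in a few lines of Python. This is only an illustration: the function name `effect_size` and the example weights are assumptions made here and are not taken from the dataset or the publication.

```python
def effect_size(le_value: float, control_value: float) -> float:
    """Percent difference from control at the lowest effect (LE) dose.

    Mirrors the equation in the dataset description:
    100 x (LE effect value - control effect value) / control effect value.
    """
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (le_value - control_value) / control_value


# Hypothetical values for illustration only (not from ToxRefDB):
# mean control liver weight 10.0 g, mean liver weight at the LE dose 11.5 g.
print(effect_size(11.5, 10.0))  # a positive value means an increase over control
```

The same function applies to either absolute or relative liver weights, provided the LE and control values are expressed in the same units.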
A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics
Public Data Portal
The U.S. Tox21 Federal collaboration, which currently quantifies the biological effects of nearly 10,000 chemicals via quantitative high-throughput screening (qHTS) in in vitro model systems, is now making an effort to incorporate gene expression profiling into the existing battery of assays. Whole transcriptome analyses performed on large numbers of samples using microarrays or RNA-Seq are currently cost-prohibitive. Accordingly, the Tox21 Program is pursuing a high-throughput transcriptomics (HTT) method that focuses on the targeted detection of gene expression for a carefully selected subset of the transcriptome and that can potentially reduce the cost 10-fold, allowing for the analysis of larger numbers of samples. To identify the optimal transcriptome subset, genes were sought that are (1) representative of the highly diverse biological space, (2) capable of serving as a proxy for expression changes in unmeasured genes, and (3) sufficient to provide coverage of well-described biological pathways. A hybrid method for gene selection is presented herein that combines data-driven and knowledge-driven concepts into one cohesive method. This dataset is associated with the following publication: Mav, D., R.R. Shah, B.E. Howard, S.S. Auerbach, P.R. Bushel, J.B. Collins, D.L. Gerhold, R. Judson, A.L. Karmaus, E.A. Maull, D.L. Mendrick, B.A. Merrick, N.S. Sipes, D. Svoboda, and R.S. Paules. A hybrid gene selection approach to create the S1500+ targeted gene sets for use in high-throughput transcriptomics. PLoS ONE. Public Library of Science, San Francisco, CA, USA, 13(2): 1-17, (2018).
Predict Organ Toxicity ChemResTox Data
Public Data Portal
We use a supervised machine learning strategy to systematically investigate the relative importance of study type, machine learning algorithm, and type of descriptor on predicting in vivo repeat-dose toxicity at the organ level. A total of 985 compounds were represented using chemical structural descriptors, ToxPrint chemotype descriptors, and bioactivity descriptors from ToxCast in vitro high-throughput screening assays. Using ToxRefDB, a total of 35 target organ outcomes were identified that contained at least 100 chemicals (50 positive and 50 negative). Supervised machine learning was performed using Naïve Bayes, k-nearest neighbor, random forest, classification and regression trees, and support vector classification approaches. Model performance was assessed based on F1 scores using five-fold cross-validation with balanced bootstrap replicates. Fixed effects modeling showed the variance in F1 scores was explained mostly by target organ outcome, followed by descriptor type, machine learning algorithm, and interactions between these three factors. Combinations of bioactivity and chemical structure or chemotype descriptors were the most predictive. Model performance improved with more chemicals (up to a maximum of 24%) and these gains were correlated (ρ = 0.92) with the number of chemicals. This dataset is associated with the following publication: Liu, J., G. Patlewicz, A. Williams, R. Thomas, and I. Shah. Predicting organ toxicity using in vitro bioactivity data and chemical structure. CHEMICAL RESEARCH IN TOXICOLOGY. American Chemical Society, Washington, DC, USA, 30: 2046−2059, (2017).
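For readers unfamiliar with the F1 metric mentioned above, the following is a minimal sketch of how an F1 score is computed from confusion counts and averaged over five cross-validation folds. The fold counts are invented for illustration, and the sketch does not reproduce the balanced bootstrap sampling or any model from the publication.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # convention used here when there are no positives at all
    return 2 * tp / denominator


# Hypothetical (TP, FP, FN) counts for five folds; not taken from the dataset.
fold_counts = [(40, 10, 10), (38, 12, 9), (41, 8, 11), (39, 11, 10), (42, 9, 8)]

fold_scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
print("per-fold F1:", [round(score, 3) for score in fold_scores])
print("mean F1:", round(mean(fold_scores), 3))
```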
Evaluation of Existing QSAR Models and Structural Alerts and Development of New Ensemble Models for Genotoxicity Using a Newly Compiled Experimental Dataset
Public Data Portal
In this study, a major effort was undertaken to compile a large genotoxicity dataset (54,805 records for 9299 substances) from several public sources (e.g., TOXNET, COSMOS, eChemPortal). The names and outcomes of the different assays were harmonized, and assays were annotated by type: gene mutation in Salmonella bacteria (Ames assay) and chromosome mutation (clastogenicity) in vitro or in vivo (chromosome aberration, micronucleus, and mouse lymphoma Tk+/- assays). This dataset was then evaluated to assess genotoxic potential using a categorization scheme, whereby a substance was considered genotoxic if it was positive in at least one Ames or clastogen study. The categorization dataset comprised 8442 chemicals, of which 2728 chemicals were genotoxic, 5585 were not and 129 were inconclusive. QSAR models (TEST and VEGA) and the OECD Toolbox structural alerts/profilers (e.g., OASIS DNA alerts for Ames and chromosomal aberrations) were used to make in silico predictions of genotoxicity potential. The performance of the individual QSAR tools and structural alerts resulted in balanced accuracies of 57-73%. A Naïve Bayes consensus model was developed using combinations of QSAR models and structural alert predictions. The ‘best’ consensus model selected had a balanced accuracy of 81.2%, a sensitivity of 87.24% and a specificity of 75.20%. This in silico scheme offers promise as a first step in ranking thousands of substances as part of a prioritization approach for genotoxicity. This dataset is associated with the following publication: Pradeep, P., R. Judson, D. DeMarini, N. Keshava, T. Martin, J. Dean, C. Gibbons, A. Simha, S. Warren, M. Gwinn, and G. Patlewicz. An Evaluation of Existing QSAR Models and Structural Alerts and Development of New Ensemble Models for Genotoxicity Using a Newly Compiled Experimental Dataset. Computational Toxicology. Elsevier B.V., Amsterdam, NETHERLANDS, 18: 100167, (2021).
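Two pieces of the description above lend themselves to a short sketch: the categorization rule (genotoxic if positive in at least one Ames or clastogen study) and balanced accuracy as the mean of sensitivity and specificity. The function names are assumptions made here, and the sketch does not model how inconclusive substances were assigned, since the description does not state that rule.

```python
from typing import Iterable


def is_genotoxic(ames_calls: Iterable[bool], clastogen_calls: Iterable[bool]) -> bool:
    """True if any Ames or clastogen study call is positive.

    Only the stated "at least one positive" rule is implemented; the handling
    of inconclusive results is not described in the source and is omitted.
    """
    return any(ames_calls) or any(clastogen_calls)


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy is the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Reported values for the selected consensus model (percent).
print(round(balanced_accuracy(87.24, 75.20), 2))

# Hypothetical study calls for one substance: negative in Ames, one positive clastogen study.
print(is_genotoxic([False, False], [False, True]))
```

The mean of the reported sensitivity and specificity is consistent with the reported balanced accuracy of 81.2%.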
Dataset used in ORD-035008 - A Set of Six Gene Expression Biomarkers and Their Thresholds Identify Rat Liver Tumorigens in Short-Term Assays
Public Data Portal
The links provided include the DrugMatrix-Affymetrix study, the TG-GATES study, and the TempO-Seq S1500+ study. This dataset is associated with the following publication: Lewis, R., T. Hill III, and C. Corton. A set of six gene expression biomarkers and their thresholds identify rat liver tumorigens in short-term assays. TOXICOLOGY. Elsevier Science Ltd, New York, NY, USA, 443: 152547, (2020).
Dataset for 'From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow'
Public Data Portal
Data for Reardon AJF, et al., From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow. Front. Toxicol. 5:1194895. doi: 10.3389/ftox.2023.1194895. PMC10242042. This dataset is associated with the following publication: Reardon, A., R. Farmahin, A. Williams, M. Meier, G. Addicks, C. Yauk, G. Matteo, E. Atlas, J. Harrill, L. Everett, I. Shah, R. Judson, S. Ramaiahgari, S. Ferguson, and T. Barton-Maclaren. From vision toward best practices: Evaluating in vitro transcriptomic points of departure for application in risk assessment using a uniform workflow. Frontiers in Toxicology. Frontiers, Lausanne, SWITZERLAND, 5: 1194895, (2023).
Simulating toxicokinetic variability to identify susceptible and highly exposed populations
Public Data Portal
Data for "Breen, M., Wambaugh, J.F., Bernstein, A. et al. Simulating toxicokinetic variability to identify susceptible and highly exposed populations. J Expo Sci Environ Epidemiol 32, 855–863 (2022). https://doi.org/10.1038/s41370-022-00491-0". This dataset is associated with the following publication: Breen, M., J. Wambaugh, A. Bernstein, M. Sfeir, and C. Ring. Simulating toxicokinetic variability to identify susceptible and highly exposed populations. Journal of Exposure Science and Environmental Epidemiology. Nature Publishing Group, London, UK, 32: 855-863, (2022).