How many replicates of arrays are required to detect gene expression changes in microarray experiments? A mixture model approach
공공데이터포털
Background: It has been recognized that replicates of arrays (or spots) may be necessary for reliably detecting differentially expressed genes in microarray experiments. However, the often-asked question of how many replicates are required has barely been addressed in the literature. In general, the answer depends on several factors: a given magnitude of expression change, a desired statistical power (that is, probability) to detect it, a specified Type I error rate, and the statistical method being used to detect the change. Here, we discuss how to calculate the number of replicates in the context of applying a nonparametric statistical method, the normal mixture model approach, to detect changes in gene expression. Results: The methodology is applied to a data set containing expression levels of 1,176 genes in rats with and without pneumococcal middle-ear infection. We illustrate how to calculate the power functions for 2, 4, 6 and 8 replicates. Conclusions: The proposed method is potentially useful in designing microarray experiments to discover differentially expressed genes. The same idea can be applied to other statistical methods.
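The dependence of power on the number of replicates can be illustrated with a much simpler stand-in than the mixture model described above. The sketch below is not the authors' method: it assumes a two-sided z-test on the mean log-ratio of a single gene, with a known per-replicate standard deviation. The function name `power_z_test` and the values of `delta`, `sigma`, and `alpha` are hypothetical choices made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_z_test(delta, sigma, n, alpha):
    """Power of a two-sided z-test on the mean of n replicates.

    delta: true mean expression change (assumed, e.g. on a log scale)
    sigma: per-replicate standard deviation (assumed known)
    n:     number of replicates
    alpha: Type I error rate
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standardized shift of the test statistic under the alternative.
    shift = delta * sqrt(n) / sigma
    # Probability of landing in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# Power for the replicate counts considered in the abstract,
# under illustrative (hypothetical) effect size and error rate.
for n in (2, 4, 6, 8):
    print(n, round(power_z_test(delta=1.0, sigma=1.0, n=n, alpha=0.001), 3))
```

Under these assumptions power rises with the number of replicates for a fixed effect size and Type I error rate, which is the trade-off a design calculation has to resolve.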
A Sensitivity Analysis of Methodological Variables Associated with Microbiome Measurements
공공데이터포털
This repository provides the raw data, analysis code, and results generated during a systematic evaluation of the impact of selected experimental protocol choices on the metagenomic sequencing analysis of microbiome samples. Briefly, a full factorial experimental design was implemented varying biological sample (n=5), operator (n=2), lot (n=2), extraction kit (n=2), 16S variable region (n=2), and reference database (n=3), and the main effects were calculated and compared between parameters (bias effects) and samples (real biological differences). A full description of the effort is provided in the associated publication.
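As a rough illustration of the design described above, the following sketch enumerates a full factorial layout with the stated numbers of levels and computes main effects as level means minus the grand mean. The level labels, the `main_effects` definition, and the synthetic response are assumptions for illustration; the repository's own analysis code may define and estimate effects differently.

```python
from itertools import product
from statistics import mean

# Factor names follow the description; level labels are hypothetical.
factors = {
    "sample": ["S1", "S2", "S3", "S4", "S5"],
    "operator": ["A", "B"],
    "lot": ["lot1", "lot2"],
    "extraction_kit": ["kit1", "kit2"],
    "variable_region": ["region1", "region2"],
    "reference_database": ["db1", "db2", "db3"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def main_effects(runs, responses, factor):
    """Mean response at each level of `factor` minus the grand mean."""
    grand_mean = mean(responses)
    by_level = {}
    for run, y in zip(runs, responses):
        by_level.setdefault(run[factor], []).append(y)
    return {level: mean(ys) - grand_mean for level, ys in by_level.items()}


runs = full_factorial(factors)
print(len(runs))  # 5 * 2 * 2 * 2 * 2 * 3 = 240 combinations

# Synthetic response that depends only on the operator (illustration only).
responses = [1.0 if run["operator"] == "B" else 0.0 for run in runs]
print(main_effects(runs, responses, "operator"))
print(main_effects(runs, responses, "lot"))
```

Because the design is balanced, a factor that does not influence the response shows a main effect of zero at every level, which makes bias effects easy to compare against differences between samples.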
MICRON Data (2015-2016) with associated R Markdown code
공공데이터포털
This data set includes water quality data and microbial community abundance tables for periphyton samples from this project. The data set also includes extensive R markdown code used to process the data and generate the results included in the report. This dataset is associated with the following publication: Hagy, J., R. Devereux, K. Houghton, D. Beddick, T. Pierce, and S. Friedman. Developing Microbial Community Indicators of Nutrient Exposure in Southeast Coastal Plain Streams using a Molecular Approach. US EPA Office of Research and Development, Washington, DC, USA, 2018.
Data for pilot-scale low level hydrogen peroxide tests using humidifiers
공공데이터포털
The dataset includes data from each experiment conducted in the pilot-scale testing. Each sheet of the Excel file corresponds to one test, and a data dictionary is included in the first sheet. Each sheet contains microbiological data (colony forming units) for every test coupon and positive control coupon used in the study, along with the calculation of decontamination efficacy (log10 reduction). This dataset is associated with the following publication: Wood, J., W. Calfee, S. Ryan, L. Mickelsen, M. Clayton, and V. Rastogi. A Simple Decontamination Approach Using Hydrogen Peroxide Vapor for Bacillus anthracis Spore Inactivation. JOURNAL OF APPLIED MICROBIOLOGY. Blackwell Publishing, Malden, MA, USA, 121(6): 1603-1615, (2016).
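A log10 reduction of the kind mentioned above is commonly computed as the difference between the mean log10 CFU recovered from positive control coupons and the mean log10 CFU recovered from test coupons. The sketch below assumes that definition; the exact formula in the spreadsheet, including how coupons with no recovered colonies are handled, is not stated here, and the function name and CFU values are hypothetical.

```python
from math import log10
from statistics import mean


def log10_reduction(control_cfu, test_cfu):
    """Mean log10 CFU of positive controls minus mean log10 CFU of test coupons.

    Assumes every count is positive; how zero counts are treated
    (for example, substituting a detection limit) is left to the caller.
    """
    if min(control_cfu) <= 0 or min(test_cfu) <= 0:
        raise ValueError("CFU counts must be positive to take log10")
    control_logs = [log10(c) for c in control_cfu]
    test_logs = [log10(t) for t in test_cfu]
    return mean(control_logs) - mean(test_logs)


# Hypothetical coupon counts, for illustration only.
controls = [2.0e7, 1.5e7, 3.1e7]
treated = [4.0e1, 1.2e2, 6.5e1]
print(round(log10_reduction(controls, treated), 2))
```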
Reporting of measures of accuracy in systematic reviews of diagnostic literature
공공데이터포털
Background: There are a variety of ways in which accuracy of clinical tests can be summarised in systematic reviews. Variation in reporting of summary measures has only been assessed in a small survey restricted to meta-analyses of screening studies found in a single database. Therefore, we performed this study to assess the measures of accuracy used for reporting results of primary studies as well as their meta-analysis in systematic reviews of test accuracy studies. Methods: Relevant reviews on test accuracy were selected from the Database of Abstracts of Reviews of Effectiveness (1994–2000), which electronically searches seven bibliographic databases and manually searches key resources. The structured abstracts of these reviews were screened and information on accuracy measures was extracted from the full texts of 90 relevant reviews, 60 of which used meta-analysis. Results: Sensitivity or specificity was used for reporting the results of primary studies in 65/90 (72%) reviews, predictive values in 26/90 (28%), and likelihood ratios in 20/90 (22%). For meta-analysis, pooled sensitivity or specificity was used in 35/60 (58%) reviews, pooled predictive values in 11/60 (18%), pooled likelihood ratios in 13/60 (22%), and pooled diagnostic odds ratio in 5/60 (8%). Summary ROC was used in 44/60 (73%) of the meta-analyses. There were no significant differences in measures of test accuracy among reviews published earlier (1994–97) and those published later (1998–2000). Conclusions: There is considerable variation in ways of reporting and summarising results of test accuracy studies in systematic reviews. There is a need for consensus about the best ways of reporting results of test accuracy studies in reviews.
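The accuracy measures named above all derive from the same 2x2 table of test results against a reference standard. The following sketch shows the standard definitions for a single primary study; the function name and the counts are hypothetical, and pooling across studies (as in a meta-analysis) is not attempted.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table.

    tp, fp, fn, tn: true positives, false positives,
    false negatives, true negatives (all assumed > 0 here).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical study: 100 diseased and 100 non-diseased participants.
for name, value in accuracy_measures(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity, specificity, and likelihood ratios are properties of the test itself, whereas predictive values also depend on disease prevalence in the study sample, which is one reason reviews differ in which measures they report.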
Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application
공공데이터포털
Background: A model-based analysis of oligonucleotide expression arrays that we developed previously uses a probe-sensitivity index to capture the response characteristic of a specific probe pair and calculates model-based expression indexes (MBEI). Each MBEI has a standard error attached to it as a measure of accuracy. Here we investigate the stability of the probe-sensitivity index across different tissue types, the reproducibility of results in replicate experiments, and the use of MBEI in perfect match (PM)-only arrays. Results: Probe-sensitivity indexes are stable across tissue types. The target gene's presence in many arrays of an array set allows the probe-sensitivity index to be estimated accurately. We extended the model to obtain expression values for PM-only arrays, and found that the 20-probe PM-only model is comparable to the 10-probe PM/MM difference model in terms of the expression correlations with the original 20-probe PM/MM difference model. The MBEI method is able to extend the reliable detection limit of expression to a lower mRNA concentration. The standard errors of MBEI can be used to construct confidence intervals of fold changes, and the lower confidence bound of fold change is a better ranking statistic for filtering genes. We can assign reliability indexes for genes in a specific cluster of interest in hierarchical clustering by resampling clustering trees. A software package, dChip, implementing many of these analysis methods is made available. Conclusions: The model-based approach reduces the variability of low expression estimates, and provides a natural method of calculating expression values for PM-only arrays. The standard errors attached to expression values can be used to assess the reliability of downstream analysis.
Quantitative assessment of the use of modified nucleoside triphosphates in expression profiling: differential effects on signal intensities and impacts on expression ratios
공공데이터포털
Background: The power of DNA microarrays derives from their ability to monitor the expression levels of many genes in parallel. One of the limitations of such powerful analytical tools is the inability to detect certain transcripts in the target sample because of artifacts caused by background noise or poor hybridization kinetics. The use of base-modified analogs of nucleoside triphosphates has been shown to increase complementary duplex stability in other applications, and here we attempted to enhance microarray hybridization signal across a wide range of sequences and expression levels by incorporating these nucleotides into labeled cRNA targets. Results: RNA samples containing 2-aminoadenosine showed increases in signal intensity for a majority of the sequences. These results were similar, and additive, to those seen with an increase in the hybridization time. In contrast, 5-methyluridine and 5-methylcytidine decreased signal intensities. Hybridization specificity, as assessed by mismatch controls, was dependent on both target sequence and extent of substitution with the modified nucleotide. Concurrent incorporation of modified and unmodified ATP in a 1:1 ratio resulted in significantly greater numbers of above-threshold ratio calls across tissues, while preserving ratio integrity and reproducibility. Conclusions: Incorporation of 2-aminoadenosine triphosphate into cRNA targets is a promising method for increasing signal detection in microarrays. Furthermore, this approach can be optimized to minimize impact on yield of amplified material and to increase the number of expression changes that can be detected.