Education Data Utilization and Support Service

Dataset details

United States

Do multiple outcome measures require p-value adjustment?

Background: Readers may question the interpretation of findings in clinical trials when multiple outcome measures are used without adjustment of the p-value. This question arises because of the increased risk of Type I errors (findings of false "significance") when multiple simultaneous hypotheses are tested at set p-values. The primary aim of this study was to estimate the need to make appropriate p-value adjustments in clinical trials to compensate for a possible increased risk of committing Type I errors when multiple outcome measures are used.

Discussion: The classicists believe that the chance of finding at least one test statistically significant due to chance and incorrectly declaring a difference increases as the number of comparisons increases. The rationalists have the following objections to that theory: 1) P-value adjustments are calculated based on how many tests are to be considered, and that number has been defined arbitrarily and variably; 2) P-value adjustments reduce the chance of making Type I errors, but they increase the chance of making Type II errors or needing to increase the sample size.

Summary: Readers should balance a study's statistical significance with the magnitude of effect, the quality of the study and with findings from other studies. Researchers facing multiple outcome measures might want to either select a primary outcome measure or use a global assessment measure, rather than adjusting the p-value.
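The classicists' concern about multiple comparisons can be made concrete with a small calculation. The sketch below assumes the tests are independent and that every null hypothesis is true, which the abstract does not state. The function names are illustrative, and the Bonferroni rule is shown only as one common adjustment, not as a method the authors endorse.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests,
    each run at significance level alpha, when all null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, round(familywise_error_rate(0.05, k), 3), bonferroni_threshold(0.05, k))
```

Under these assumptions, ten outcomes tested at 0.05 give roughly a 40% chance of at least one spurious "significant" result. The Bonferroni threshold of 0.005 per test removes that inflation at the cost of power, which is the trade-off the rationalists point to.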

Data information

Data portal: United States
META URL: https://catalog.data.gov/dataset/do-multiple-outcome-measures-require-p-value-adjustment
License: notspecified
Cost:
Provider: U.S. Department of Health & Human Services
Managing department:
Data:
- Official Government Data Source
- Landing page

Related data

Reporting of adverse drug reactions in randomised controlled trials – a systematic survey

Public Data Portal

Background: Decisions on treatment are guided, not only by the potential for benefit, but also by the nature and severity of adverse drug reactions. However, some researchers have found numerous deficiencies in trial reports of adverse effects. We sought to confirm these findings by evaluating trials of drug therapy published in seven eminent medical journals in 1997. Methods: Literature review to determine whether the definition, recording and reporting of adverse drug reactions in clinical trials were in accordance with published recommendations on structured reporting. Results: Of the 185 trials reviewed, 25 (14%) made no mention of adverse drug reactions. Data in a further 60 (32%) could not be fully evaluated, either because numbers were not given for each treatment arm (31 trials), or because a generic statement was made without full details (29 trials). When adverse drug reactions such as clinical events or patient symptoms were mentioned in the reports, details on how they had been recorded were given in only 14/95 (15%) and 18/104 (17%) trials respectively. Of the 86 trials that mentioned severity of adverse drug reactions, only 42 (49%) stated how severity had been defined. The median amount of space used for safety data in the Results and Discussion sections was 5.8%. Conclusions: Trial reports often failed to provide details on how adverse drug reactions were defined or recorded. The absence of such methodological information makes comparative evaluation of adverse reaction rates potentially unreliable. Authors and journals should adopt recommendations on the structured reporting of adverse effects.

Public Data Portal

The Other Outcomes dataset includes information on whether the trial includes measures of depression, anxiety, substance use, sleep, anger, quality of life and functioning. Results in this dataset are provided for each treatment arm. The name of the measure is included as well as the between-group effect sizes. Use this dataset to learn about the effects of PTSD treatments on other outcomes. Values abstracted as not applicable ("NA") or not reported ("NR") from the study are null values (empty cells). Study-level variables, like military status and percent female, are included for ease of filtering. These columns are not arm-level or arm-comparison-level data.

Noninferiority trials

Public Data Portal

In one of the biggest dilemmas facing cardiovascular clinical research, clinical trials are increasingly being required to show benefits on clinical end-points rather than surrogate end-points, while at the same time the incremental benefits of newer treatments are getting smaller. These two factors have a huge impact on sample size, which has led some investigators to design trials to show that the new treatment has an effect similar to that of the standard, rather than outright superiority. Recent examples of fibrinolytic trials that have demonstrated similar effects of two drugs are ASSENT (Assessment of the Safety and Efficacy of a New Thrombolytic)-2, GUSTO (Global Use of Strategies to Open Occluded Coronary Arteries)-III, and COBALT (Continuous Infusion Versus Double-Bolus Administration of Alteplase) [1,2,3,4]. However, as discussed by several authors [5,6,7,8], there are issues with trials of this type that make them considerably less credible than superiority trials.

The use of percentage change from baseline as an outcome in a controlled trial is statistically inefficient: a simulation study

Public Data Portal

Background: Many randomized trials involve measuring a continuous outcome - such as pain, body weight or blood pressure - at baseline and after treatment. In this paper, I compare four possibilities for how such trials can be analyzed: post-treatment; change between baseline and post-treatment; percentage change between baseline and post-treatment; and analysis of covariance (ANCOVA) with baseline score as a covariate. The statistical power of each method was determined for a hypothetical randomized trial under a range of correlations between baseline and post-treatment scores. Results: ANCOVA has the highest statistical power. Change from baseline has acceptable power when correlation between baseline and post-treatment scores is high; when correlation is low, analyzing only post-treatment scores has reasonable power. Percentage change from baseline has the lowest statistical power and was highly sensitive to changes in variance. Theoretical considerations suggest that percentage change from baseline will also fail to protect from bias in the case of baseline imbalance and will lead to an excess of trials with non-normally distributed outcome data. Conclusions: Percentage change from baseline should not be used in statistical analysis. Trialists wishing to report this statistic should use another method, such as ANCOVA, and convert the results to a percentage change by using mean baseline scores.
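The closing recommendation (analyze with ANCOVA, then express the result as a percentage of the mean baseline score) can be sketched as follows. This is a minimal illustration, not the paper's simulation code: it assumes a two-arm trial with a single baseline covariate and uses the textbook pooled within-group slope for the adjustment. The function names and the toy data are hypothetical.

```python
from statistics import mean


def ancova_adjusted_difference(baseline_t, post_t, baseline_c, post_c):
    """Treatment-minus-control difference in post-treatment scores, adjusted for
    baseline with the pooled within-group regression slope (one-covariate ANCOVA)."""
    def centered_sums(x, y):
        mx, my = mean(x), mean(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        return mx, my, sxy, sxx

    mx_t, my_t, sxy_t, sxx_t = centered_sums(baseline_t, post_t)
    mx_c, my_c, sxy_c, sxx_c = centered_sums(baseline_c, post_c)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)  # pooled within-group slope
    return (my_t - my_c) - slope * (mx_t - mx_c)


def as_percent_of_mean_baseline(adjusted_diff, baseline_t, baseline_c):
    """Express the adjusted difference relative to the overall mean baseline score."""
    overall_baseline = mean(list(baseline_t) + list(baseline_c))
    return 100.0 * adjusted_diff / overall_baseline


if __name__ == "__main__":
    # Toy data (hypothetical): the treated arm ends 2 points lower than control.
    bt, pt = [10, 20, 30], [8, 18, 28]
    bc, pc = [10, 20, 30], [10, 20, 30]
    diff = ancova_adjusted_difference(bt, pt, bc, pc)
    print(diff, as_percent_of_mean_baseline(diff, bt, bc))
```

With the toy data, the adjusted difference is -2 points on a mean baseline of 20, which is reported as a 10% reduction. The percentage is computed once from group-level quantities rather than per patient, which is the distinction the abstract draws.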

Reporting of measures of accuracy in systematic reviews of diagnostic literature

Public Data Portal

Background: There are a variety of ways in which accuracy of clinical tests can be summarised in systematic reviews. Variation in reporting of summary measures has only been assessed in a small survey restricted to meta-analyses of screening studies found in a single database. Therefore, we performed this study to assess the measures of accuracy used for reporting results of primary studies as well as their meta-analysis in systematic reviews of test accuracy studies. Methods: Relevant reviews on test accuracy were selected from the Database of Abstracts of Reviews of Effectiveness (1994–2000), which electronically searches seven bibliographic databases and manually searches key resources. The structured abstracts of these reviews were screened and information on accuracy measures was extracted from the full texts of 90 relevant reviews, 60 of which used meta-analysis. Results: Sensitivity or specificity was used for reporting the results of primary studies in 65/90 (72%) reviews, predictive values in 26/90 (28%), and likelihood ratios in 20/90 (22%). For meta-analysis, pooled sensitivity or specificity was used in 35/60 (58%) reviews, pooled predictive values in 11/60 (18%), pooled likelihood ratios in 13/60 (22%), and pooled diagnostic odds ratio in 5/60 (8%). Summary ROC was used in 44/60 (73%) of the meta-analyses. There were no significant differences in measures of test accuracy among reviews published earlier (1994–97) and those published later (1998–2000). Conclusions: There is considerable variation in ways of reporting and summarising results of test accuracy studies in systematic reviews. There is a need for consensus about the best ways of reporting results of test accuracy studies in reviews.
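For readers unfamiliar with the accuracy measures named in the Results, the sketch below computes each of them from a single 2x2 table using the standard definitions. It assumes no cell of the table is zero, applies no continuity correction, and does not attempt the pooling or summary ROC methods used in the reviews. The function name and the example counts are hypothetical.

```python
def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard test-accuracy measures from a 2x2 table of test result vs. disease status.
    Assumes every cell is non-zero, so no division by zero can occur."""
    sensitivity = tp / (tp + fn)   # diseased patients who test positive
    specificity = tn / (tn + fp)   # non-diseased patients who test negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


if __name__ == "__main__":
    # Hypothetical study: 100 diseased and 100 non-diseased patients.
    for name, value in accuracy_measures(tp=90, fp=20, fn=10, tn=80).items():
        print(f"{name}: {value:.3f}")
```

Predictive values depend on disease prevalence in the study sample, whereas sensitivity, specificity and likelihood ratios do not. This is one reason the choice of summary measure matters when results are compared across reviews.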

Outcomes research in the development and evaluation of practice guidelines

Public Data Portal

Background: Practice guidelines have been developed in response to the observation that variations exist in clinical medicine that are not related to variations in the clinical presentation and severity of the disease. Despite their widespread use, however, practice guideline evaluation lacks a rigorous scientific methodology to support its development and application. Discussion: Firstly, we review the major epidemiological foundations of practice guideline development. Secondly, we propose a chronic disease epidemiological model in which practice patterns are viewed as the exposure and outcomes of interest such as quality or cost are viewed as the disease. Sources of selection, information, confounding and temporal trend bias are identified and discussed. Summary: The proposed methodological framework for outcomes research to evaluate practice guidelines reflects the selection, information and confounding biases inherent in its observational nature, which must be accounted for in both the design and the analysis phases of any outcomes research study.

PTSD Dichotomous Outcomes Between Arms

Public Data Portal

This dataset provides between-arm results for dichotomous outcomes: loss of diagnosis and clinically meaningful response. Included is information on how loss of diagnosis and clinically meaningful response were defined, p-value for statistical test, and study-reported effect sizes. Each comparison is on a separate row, and pairwise as well as omnibus (multi-arm) comparisons are included. There are also separate rows for studies with more than one measure, time point and analysis type or when there is more than one definition of diagnostic change or clinically meaningful change.

PTSD Continuous Outcomes Between Arms

Public Data Portal

This dataset provides results for between-arm comparisons of continuous measures. Included is information on score differences, p-value for statistical test, and study-reported effect sizes. Where possible, the between-arm standardized effect size was calculated, using Hedges’ g. We calculated Hedges’ g based on the following (in order of preference): 1) adjusted mean difference; 2) follow-up scores; 3) unadjusted mean difference; 4) change scores. For the calculated effect size, information on the basis for calculation is included along with measures of variance. Negative values for Hedges’ g indicate a larger decrease (or lower follow-up score) in the first arm than in the second arm, while positive values indicate the reverse. Each comparison is on a separate row, and pairwise as well as omnibus (multi-arm) comparisons are included. There are also separate rows for each measure, time point and analysis type.
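The description does not give the exact formula used for Hedges' g, so the following is only a sketch of the usual textbook version: a standardized mean difference based on the pooled standard deviation, multiplied by the common small-sample correction factor 1 - 3/(4·df - 1). The function name and the example numbers are hypothetical. The difference is taken as first arm minus second arm, matching the sign convention described above.

```python
from math import sqrt


def hedges_g(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Hedges' g for arm 1 versus arm 2 from summary statistics.
    Uses the pooled SD and the approximate small-sample correction."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # approximate bias-correction factor
    return correction * cohens_d


if __name__ == "__main__":
    # Hypothetical follow-up scores: arm 1 ends lower than arm 2, so g is negative.
    print(round(hedges_g(10.0, 4.0, 20, 12.0, 4.0, 20), 4))
```

In this example, the first arm's follow-up mean is lower than the second arm's, so g is negative, consistent with the interpretation given in the dataset description.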

Evaluation of Imputation Methods for the National Survey on Drug Use and Health

Public Data Portal

Although predictive mean neighborhood (PMN) imputation as currently implemented has a number of advantages, including the ability to use a large number of similar variables to determine the imputed value and to provide individual record consistency among very complex variable relationships, the goal of this study was to evaluate this method compared with other options, especially in the context of the redesign of the NSDUH.
