
This information is for reference purposes only. It was current when produced and may now be outdated. Archive material is no longer maintained, and some links may not work. Persons with disabilities having difficulty accessing this information should contact us at: Let us know the nature of the problem, the Web address of what you want, and your contact information.


Assessing the Empirical Evidence of Associations between Internal Validity and Effect Sizes in Randomized Controlled Trials (Text Version)

Slide presentation from the AHRQ 2009 conference.

On September 15, 2009, Paul Shekelle made this presentation at the 2009 Annual Conference.

Slide 1

Assessing the Empirical Evidence of Associations between Internal Validity and Effect Sizes in Randomized Controlled Trials

AHRQ contract HHSA 290 2007 10062 I

Paul G. Shekelle, M.D., Ph.D.
Southern California EPC


Slide 2


  • Widespread belief that design and execution factors are related to bias in trials
    • Bias: systematic deviation of an estimate (e.g., the observed treatment effect in an individual study) from the true value


Slide 3


  • These internal validity features for RCTs are commonly used:
    • Jadad scale (1996):
      • Randomization
      • Double-blinding
      • Description of dropouts
    • Allocation Concealment (e.g. Colditz et al., 1989)
      • Assignment generated by independent person not responsible for determining eligibility of patients
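
The Jadad items listed above are usually combined into a 0 to 5 score. The following is a minimal sketch of that scoring, assuming the commonly described point scheme (one point each for being described as randomized, being described as double-blind, and describing dropouts, with one point added or subtracted when the randomization or blinding method is described as appropriate or inappropriate). The class and field names are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JadadItems:
    """Answers to the Jadad items for one trial report (hypothetical structure)."""
    described_as_randomized: bool
    # True = method described and appropriate, False = described but
    # inappropriate, None = method not described.
    randomization_method_appropriate: Optional[bool]
    described_as_double_blind: bool
    blinding_method_appropriate: Optional[bool]
    dropouts_described: bool


def _method_points(appropriate: Optional[bool]) -> int:
    """Bonus or penalty for how the method was described."""
    if appropriate is None:
        return 0
    return 1 if appropriate else -1


def jadad_score(items: JadadItems) -> int:
    """Sum the item points; the result lies between 0 and 5."""
    score = 0
    if items.described_as_randomized:
        score += 1 + _method_points(items.randomization_method_appropriate)
    if items.described_as_double_blind:
        score += 1 + _method_points(items.blinding_method_appropriate)
    if items.dropouts_described:
        score += 1
    return score


# Hypothetical trial: randomized (method not described), not double-blind,
# dropouts described.
example = JadadItems(True, None, False, None, True)
print(jadad_score(example))
```

Allocation concealment is not part of this score, which is why it appears as a separate item on the slide.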


    Slide 4

    Evidence of bias

    • Schulz et al. (1995) assessed 250 trials in 33 meta-analyses
      • Inadequate concealment of allocation accounted for a 41% increase in effect sizes
      • Lack of double blinding showed a 17% increase in reported treatment effect
    • Moher et al. (1998) using 11 meta-analyses including 127 RCTs
      • Studies with inadequate concealment showed a 37% increased effect compared to concealed treatment allocation trials
      • "low quality" trials showed a 34% increase in effect


    Slide 5

    Cochrane Risk of Bias Tool

    • Recently, Cochrane has proposed a new tool to assess bias:
      • Sequence generation
      • Allocation concealment
      • Blinding of participants, personnel and outcome assessors
      • Incomplete outcome data
      • Selective outcome reporting
      • Other sources of bias
    • Cochrane also recommends a global summary score


    Slide 6

    Some Conflicting Evidence

    • Balk & colleagues (2002)
      • Used 24 existing quality measures and assessed 276 RCTs from 26 meta-analyses
      • No association of measures with bias across conditions (cardiovascular disease, infectious disease, pediatrics, and surgery)
    • Wood, Egger, Gluud, Schulz, Juni, Altman, Gluud, Martin, Wood & Sterne (2009)
      • Used 146 meta-analyses (1,346 RCTs) covering a wide range of interventions and outcomes to examine allocation concealment and reported blinding
      • Bias effects vary by outcomes


    Slide 7

    Cochrane Back Group Approach

    • Extensive quality item list proposed by Cochrane Back Group editorial to assess controlled trials
      • Randomization sequence
      • Allocation concealment
      • Patient blinding
      • Care provider blinding
      • Assessor blinding
      • Dropouts (description, adequacy)
      • ITT analysis
      • Selective outcome reporting
      • Baseline comparability
      • Similarity of Co-Interventions
      • Compliance
      • Timing of outcome assessment


    Slide 8

    All RCTs in Cochrane Back Group Reviews

    Reviewed 261 trials
    • 45 trials: unable to calculate effect size

    216 trials with calculable effect sizes
    • 128 trials compared treatments to other treatments
    • 122 trials compared treatments to placebo/usual care

    64% of trials reported short-term pain outcomes


    Slide 9

    Effect of Internal Validity Items on Bias

    Validity Item                 Yes    No    Effect Size Difference (95% CI)
    A. randomization              104    112    0.02 (-0.12, 0.16)
    B. concealment                 69    147   -0.08 (-0.23, 0.07)
    C. baseline differences       135     81   -0.10 (-0.24, 0.05)
    D. blinding - patient          82    134   -0.03 (-0.18, 0.11)
    E. blinding - care provider    57    159   -0.10 (-0.26, 0.06)
    F. blinding - outcome         123     93   -0.10 (-0.25, 0.04)
    G. co-interventions            92    124   -0.09 (-0.23, 0.05)
    H. compliance                  76    140   -0.01 (-0.15, 0.14)
    I. dropouts                   150     66   -0.13 (-0.29, 0.02)
    J. timing                     198     18   -0.17 (-0.43, 0.10)
    K. ITT                        118     98   -0.10 (-0.24, 0.04)

    Graph axis: Effect Size Difference (direction labels: "Higher quality have smaller effect" and "Lower quality have smaller effect")
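
One way to read the table above is to check, for each validity item, whether the 95% confidence interval excludes zero, and to back out an approximate standard error from the interval width. The sketch below does this with the values transcribed from the table. It assumes the intervals are symmetric normal-approximation intervals (estimate ± 1.96 × SE), which the slides do not state, and the function names are illustrative only.

```python
# (item, n_yes, n_no, difference, ci_low, ci_high), transcribed from the table
ROWS = [
    ("randomization", 104, 112, 0.02, -0.12, 0.16),
    ("concealment", 69, 147, -0.08, -0.23, 0.07),
    ("baseline differences", 135, 81, -0.10, -0.24, 0.05),
    ("blinding - patient", 82, 134, -0.03, -0.18, 0.11),
    ("blinding - care provider", 57, 159, -0.10, -0.26, 0.06),
    ("blinding - outcome", 123, 93, -0.10, -0.25, 0.04),
    ("co-interventions", 92, 124, -0.09, -0.23, 0.05),
    ("compliance", 76, 140, -0.01, -0.15, 0.14),
    ("dropouts", 150, 66, -0.13, -0.29, 0.02),
    ("timing", 198, 18, -0.17, -0.43, 0.10),
    ("ITT", 118, 98, -0.10, -0.24, 0.04),
]

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumption)


def approx_se(ci_low: float, ci_high: float) -> float:
    """Approximate standard error implied by a symmetric 95% interval."""
    return (ci_high - ci_low) / (2 * Z_95)


def excludes_zero(ci_low: float, ci_high: float) -> bool:
    """True when the whole interval lies on one side of zero."""
    return ci_low > 0 or ci_high < 0


for item, n_yes, n_no, diff, low, high in ROWS:
    flag = "excludes 0" if excludes_zero(low, high) else "includes 0"
    print(f"{item:26s} n={n_yes + n_no:3d} diff={diff:+.2f} "
          f"SE~{approx_se(low, high):.3f} {flag}")
```

With these values every interval includes zero, and the Yes and No counts sum to 216 trials for each item. The widest interval belongs to the timing item, where only 18 trials fall in the "No" group.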


    Slide 10

    SC EPC Data Sets

    • Analyzed quality and effect sizes of all 267 trials in 15 meta-analyses of the Cochrane Back Review Group
      • Threshold analysis
      • Significant differences in effect sizes between high and low quality RCTs
    • Assessed trials from existing EPC evidence reports with the extensive quality item list
      • 166 trials, diverse set of topics, pharmacological therapies / behavior modification interventions
      • Effects of quality varied across conditions
        • Including blinding, allocation concealment
        • No overall effect of quality on effect sizes across conditions or outcomes


    Slide 11

    New SC EPC Data Set

    • To investigate the differing results from two large datasets, we are now testing a third dataset where we know that Jadad scale and allocation concealment items influence effect sizes in the expected direction.


    Slide 12


    • New dataset can be merged with existing datasets
      • To investigate effects that were hindered by lack of variance in previous samples
        • E.g., many trials do not report enough information to judge a quality feature, so a large sample is needed
      • To find empirical groupings of quality criteria
        • Quality features do not seem to be independent of one another, e.g. studies with adequate allocation concealment rarely use inappropriate sequence generation
      • To investigate factors that can explain the observed differences in results across samples
        • Moderator effects in meta-regression


Slide 13

Proposed Moderators

  • Size of overall treatment effect
    • Strong treatment effect may obliterate effects of quality
  • Condition being treated
    • Quality may influence reported effect sizes more in some clinical fields than others
  • Type of analyzed outcome
    • E.g., subjective vs. objective data, see also Wood et al.
  • Variance of quality across studies
    • Some quality features show little variance across trials (e.g. differential timing very rare)

Quality ---------------> Effect Size
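
A moderator changes how strongly quality relates to effect size. As a rough illustration, the sketch below pools effect sizes with fixed-effect inverse-variance weights and compares low-quality with high-quality trials separately for each outcome type. This stratified comparison is a simple stand-in for an interaction term in a meta-regression, not the analysis used in the presentation. All trial values are hypothetical and invented for the example, as are the function and field names.

```python
from math import sqrt


def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, sqrt(1.0 / total)


def quality_gap(trials, outcome_type):
    """Pooled effect of low-quality minus high-quality trials, one outcome type."""
    def pool(low_quality):
        subset = [t for t in trials
                  if t["outcome"] == outcome_type
                  and t["low_quality"] == low_quality]
        return pooled_effect([t["effect"] for t in subset],
                             [t["variance"] for t in subset])

    low, se_low = pool(True)
    high, se_high = pool(False)
    return low - high, sqrt(se_low ** 2 + se_high ** 2)


# Hypothetical trials: quality matters for subjective outcomes only.
TRIALS = [
    {"outcome": "objective", "low_quality": False, "effect": 0.28, "variance": 0.02},
    {"outcome": "objective", "low_quality": False, "effect": 0.34, "variance": 0.04},
    {"outcome": "objective", "low_quality": True, "effect": 0.30, "variance": 0.03},
    {"outcome": "objective", "low_quality": True, "effect": 0.33, "variance": 0.05},
    {"outcome": "subjective", "low_quality": False, "effect": 0.38, "variance": 0.02},
    {"outcome": "subjective", "low_quality": False, "effect": 0.42, "variance": 0.04},
    {"outcome": "subjective", "low_quality": True, "effect": 0.58, "variance": 0.03},
    {"outcome": "subjective", "low_quality": True, "effect": 0.64, "variance": 0.05},
]

for outcome in ("objective", "subjective"):
    gap, se = quality_gap(TRIALS, outcome)
    print(f"{outcome}: low minus high quality = {gap:+.3f} (SE {se:.3f})")
```

If the gap differs clearly between strata, the stratifying variable (here, outcome type) is acting as a moderator of the quality effect. A meta-regression would estimate the same contrast as an interaction coefficient and could handle several moderators at once.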


Slide 14


  • The effect of quality features on individual RCT results is an important finding
    • Quality of RCT varies, empirical evidence of bias
  • Some conflicting results in literature
    • Some samples show large effects of quality on effect sizes, some show no consistent effect
  • Current research needs to focus on investigating conditions for risk of bias
    • Which quality feature is associated with bias, and when?


Slide 15


  • Pending any new analyses, for now review groups can probably have most confidence in using the following items to assess bias:
    • Jadad Criteria
    • Concealment of Allocation, or
    • Cochrane Risk of Bias Tool


Slide 16

Graph: Effect of Internal Validity Items on Bias


Slide 17

Graph: Effect of Internal Validity Items on Bias

Current as of December 2009
Internet Citation: Assessing the Empirical Evidence of Associations between Internal Validity and Effect Sizes in Randomized Controlled Trials (Text Version). December 2009. Agency for Healthcare Research and Quality, Rockville, MD.

