Decisions Encountered During Key Task Number 1: Negotiating Consensus on Goals and "Value Judgments" of Performance Reporting

Methodological Considerations in Generating Provider Performance Scores

Chartered Value Exchanges (CVEs) have multiple stakeholders, including patients, providers, health plans, employers, government agencies, and community groups. These stakeholders may have differing ideas and concerns about measuring and reporting provider performance. Because generating performance reports may require considerable time, effort, and financial resources, CVEs may find it beneficial to include all potential stakeholders in early and ongoing discussions concerning the "value judgments" of performance reporting. The value judgments will affect how stakeholders choose among the various options at each methodological decision point.

These value judgments are decisions for which there are no clearly right or wrong answers (or at least, no right or wrong answers from a methodological standpoint). Where possible, it is advisable to identify and address areas of disagreement among CVE stakeholders on these value judgments before resources are devoted to generating performance reports. By negotiating consensus among stakeholders early in the process and periodically revisiting this consensus, a CVE can establish good working relationships and approach problems in a neutral environment (i.e., an environment in which providers do not yet know their performance on a public report).

Examples: Negotiating consensus

Massachusetts Health Quality Partners (MHQP) brought providers into the reporting process early, years before reports were generated. According to Melinda Karp, MHQP Director of Strategic Planning and Business Development, an important priority was to convince providers that MHQP's goal was to "do something with the providers, not do something to the providers." In addition to providers, health plans were brought to the table years before MHQP's first public reports were released.

The California Cooperative Healthcare Reporting Initiative (CCHRI), through the California Physician Performance Initiative (CPPI), is constructing individual physician performance scores on 17 measures of ambulatory quality. CCHRI has formed a Physician Advisory Group to review and provide input on an array of methods issues, including measure selection, attribution, and reliability of results for use by stakeholders. The CPPI project is adhering to the principles outlined in the Patient Charter for Physician Measurement (Consumer-Purchaser Disclosure Project), with the following negotiated criteria:

  • Physicians must have an opportunity to correct their performance data.
  • Performance reports must exceed a minimum reliability threshold (in order to limit the risk of misclassification due to chance).
  • Performance must be reported in categories rather than as absolute values.
  • Consumers must be given a way to understand the performance data and their limitations.

A. What are the purposes of publicly reporting provider performance?

This document is intended for use by CVEs interested in creating public reports of provider performance. These public reports may include measures of quality, costs (or efficiency), patient experience, or other types of performance measures. Throughout this document, the word "provider" is intended to be flexible in its meaning. "Providers" may refer to individual health care practitioners (physicians, nurses, therapists, pharmacists, etc.), practices or clinics (i.e., collections of practitioners who provide care together at a single address), or larger health care organizations (physician groups, hospitals, etc.).

Public reporting is not the only activity CVEs may undertake to improve health care in their local areas. CVEs also can engage in confidential reporting in which each provider's performance data are shared only with the provider. When the provider is an organization, this usually means sharing the data with organizational leaders, who may then decide whether and how to internally disseminate the data. This form of reporting can provide useful guidance to providers trying to improve their performance. For example, these providers may want to know how well their improvement initiatives are working.

Confidential reporting also can motivate providers to improve by appealing to a sense of professionalism. However, because confidential performance reports are not released to the public, they cannot be used by patients to select a provider. Therefore, a CVE's decision about whether to produce public or confidential performance reports (or produce both a public and a confidential report) may depend on the goals of CVE stakeholders.

If CVEs choose to publicly report provider performance, it may be advisable to reach early consensus on the purposes of these reports. This is a critical first step because the purposes of reporting will affect later methodological decision points. Based on Berwick3 and Hibbard,4 reporting has at least three general purposes:

  • To help patients choose providers.
  • To motivate performance improvement.
  • To empower patients to act as "co-producers" of their health care.

Below, we discuss the advantages and caveats associated with these purposes of publicly reporting provider performance. It is important to note that the potential purposes of public reporting are not mutually exclusive. By using the same performance data in different ways, a CVE may be able to produce different reports to achieve different purposes.

Recognizing that different audiences may have different needs, a CVE could produce one report for patients and a second report for providers. For example, the kinds of performance reports that are most useful to patients may not be the most useful to providers seeking to improve (e.g., there may not be enough detail to provide guidance on improvement efforts).3 Similarly, if reports are sufficiently detailed to guide providers' improvement efforts, they may be too detailed for many patients to easily understand.5

  1. Option 1: To help patients choose providers. The goal of helping patients become better informed consumers of health care is a commonly cited reason for public performance reporting. If this option is chosen, then performance reports should be designed with the patient in mind. They should be readily understandable to an audience that may not have medical or statistical expertise.5 For guidance on which kinds of reporting formats might be preferable for helping patients choose providers, refer to papers by Drs. Hibbard and Sofaer1-2 and to AHRQ's "Talking Quality" Web site and "Model Public Report Elements: A Sampler."


    Advantages:
    • Patients may choose better performing providers.
    • If providers believe patients are using public performance reports to make health care choices, providers may be motivated to improve.


    Caveats:
    • Historically, patients have not prioritized publicly available performance information when choosing a provider.6-8 Anecdotal information from family and friends may be more heavily used by patients, even when performance data are available.
    • Due to data limitations, it may not be possible to produce the performance reports that patients would find most useful or make them available at the right moment in the health care decisionmaking process. For example, a report of individual practitioner performance, rather than organizational performance, may have the best fit with how patients view their health care. However, publicly reporting the performance of individual practitioners may not be possible, especially when a CVE also wants to limit the amount of performance misclassification due to chance.
  2. Option 2: To motivate providers to improve. Enabling patients to choose providers based on their performance may motivate improvement efforts. If providers believe patients use public reports, then providers who want to attract and keep patients will be motivated to attain high performance. However, even if providers do not believe patients use public reports when seeking health care, these reports can have a powerful motivating effect. Providers may want to do well—out of a sense of professionalism, competition, or "peer pressure"—in the eyes of their colleagues, other health care organizations, and the general public.9 In addition, performance reports that present detailed performance information can help guide providers in their improvement efforts (e.g., by showing them exactly which measures need the most improvement).3


    Advantages:
    • Providers may improve their performance.
    • Providers may get guidance in their improvement efforts, especially when reports give detailed performance information.


    Caveats:
    • Some providers with poor performance may criticize the report rather than engage in improvement efforts.6,10
    • There is some evidence that publicly reporting performance may not always spur performance improvement.11
  3. Option 3: To empower patients to "co-produce" their health care. Patients who are empowered to be more active participants in their own health care may have better outcomes of care.4 Public reports of provider performance may raise patients' awareness that there is substantial variation in performance on important measures of health care quality. Regardless of whether they use performance information to choose a provider, patients may be motivated to ask for the health care services included in performance reports (especially if they note that their own provider's performance is not perfect). As with reports aimed at informing patients' choice of provider, reports aimed at empowering patients should be understandable by (and educational for) those who may not have medical and statistical expertise.


    Advantages:
    • Empowered patients may receive better care.


    Caveats:
    • Patient empowerment may not require public performance reporting. Other means of patient education may be more efficient.

Thoughts on the purposes of public reporting

  • Nancy Clarke, formerly Executive Director of the Oregon Health Care Quality Corporation, describes the organization's main purposes in public reporting as motivating quality improvement and making the patient a partner in quality improvement. Due to a shortage of primary care providers (PCPs), the "shopping model for consumers driving markets doesn't have much traction [in Oregon]." These thoughts were echoed by Christine Amy, Project Director of Aligning Forces for Quality-South Central Pennsylvania: "There aren't enough PCPs in the area, so labeling a provider as 'great' isn't relevant to patient choice when the provider is closed to new patients. The purposes of public reporting are to motivate and guide providers and to use the reports as a teaching tool to help patients be better partners in their own care."
  • Devorah Rich, formerly Project Director of the Greater Detroit Area Health Council, describes an evolution in the purposes of reporting: "Ideally, people originally thought it would engage the consumer, but it's turned out to actually motivate the physicians very powerfully. The physicians pay a lot of attention to our reports. You don't get through medical school without being competitive." There has been less evidence of consumer engagement with the reports, and this is felt to be due to reporting performance at the physician organization level (rather than the individual physician level).
  • Renee Frazier, Executive Director of the Healthy Memphis Common Table, explains that the main purposes of public reporting are to motivate provider improvement and to empower patients: "Knowing the indicators (and the reasons for them) helps individuals to understand the most important care they should be receiving. It also helps to know what services to ask for if you are not already receiving them from your doctor."
  • Jim Chase, Executive Director of Minnesota Community Measurement (a member of the Minnesota CVE), notes that while performance reporting has mostly motivated providers to improve, rather than guiding patients' choice of provider, reporting has been tied to explicit incentives aimed at providers. The performance scores in public reports also have served as the basis for pay-for-performance and provider tiering programs: "We've learned that [patients and providers] don't just go out and use the information. There's an evolution, and incentives like pay-for-performance and tiering can make the information more relevant."
  • Susan McDonald, formerly with the Minnesota Department of Human Services, credits public purchasers' use of the performance reports with catalyzing provider improvement efforts. Carolyn Pare, President and Chief Executive Officer of the Buyers Health Care Action Group (also a member of the Minnesota CVE), further notes the crucial roles played by purchasers and quality improvement organizations in helping providers make the best use of performance reports: "While critically important, standard measurement, data collection, and reporting in and of itself would not have changed things in Minnesota."


B. What will be the general format of performance reports?

Performance reports can vary widely in their general formats. They can be complex, with detailed reports of measure-by-measure performance rates and statistical confidence intervals, or they can be much simpler, displaying categories of overall performance on a composite measure (e.g., "a 3-star hospital on pneumonia"). For CVEs, it may be advisable to negotiate the general format (or formats, if multiple reports are planned) before providers know exactly how their own performance will appear. At this stage, a scan of existing reports (including reports that are on paper and on the Internet) may be useful to help stimulate discussion.

The decision about which general format to use will probably be heavily influenced by the purpose of public reporting. In general, reports that are aimed at a patient audience will need to have a simpler reporting format that is more usable by this audience.[i] Such reports may present only a few categories of performance (e.g., a 4-star scale) or may rank providers to enable quick ascertainment of the highest and lowest performers. In addition, reports that are based on relative provider performance may be most informative to patients who are trying to choose the highest performing providers. Reports of relative provider performance focus on enabling comparisons between providers within a given market area (i.e., the market area theoretically accessible to the patient), rather than comparing providers to an external performance threshold, such as a national benchmark.

On the other hand, reports that are aimed at a provider audience (to motivate and guide improvement) may require reporting formats that display more detailed information. Relative performance may be presented in such reports to enhance their ability to motivate improvement, but absolute performance (with numerators, denominators, and other "raw scores") is likely to be most useful in guiding improvement efforts.

Many options and combinations of options are available for the general format of performance reports. Each reporting format may be more appropriate for some audiences and less appropriate for others. We present three examples here.

  1. Option 1: Simplified reports of relative provider performance. This option is attractive when the purpose of reporting is to inform patients' choices of health care providers. These reports generally present only a few categories of performance, measures are aggregated when possible, and providers may be ranked. Raw performance rates and scientific depictions of statistical uncertainty are rarely included in such reports.


    Advantages:
    • Enables patients (who may lack medical or statistical expertise) to more easily interpret performance differences among providers.


    Caveats:
    • May oversimplify the full range of provider performance. For example, the "1-star" category for provider performance may include a wide range of actual performance levels.
    • May obscure the representation of statistical uncertainty. If patients do not understand the degree of statistical uncertainty in a performance report, small differences in performance may be interpreted as meaningful when in truth they are not.
    • May not contain enough detailed information to guide provider improvement efforts.

Option 1 Examples: Simplified reports of relative provider performance

The Oregon Health Care Quality Corporation, which received guidance from a "consumer plain language" expert, reports clinic performance in three categories: "better" (clinic absolute score is higher than one standard deviation above the statewide score), "average," and "below" (clinic absolute score is lower than one standard deviation below the statewide score). In addition, the CVE confidentially provides detailed performance data to each clinic.

The Puget Sound Health Alliance reports provider performance in three categories: above regional average, at regional average, and below regional average. However, users can select a provider's name in the Web-based report to access numeric performance scores and statistical confidence intervals.

The Healthy Memphis Common Table reports provider performance using a star system: providers get 1 star for performance that exceeds the 75th percentile in Shelby County and 2 stars for performance exceeding the 90th percentile. This reporting format was felt to be consistent with the literacy level of the patient community (i.e., consistent with a fifth grade level of literacy).
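The category rules described in these examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any CVE's actual method: the clinic names and scores are invented, the function names (sd_category, nearest_rank_percentile, percentile_stars) are hypothetical, the mean and standard deviation of the listed clinics stand in for the statewide score, and the percentile cutoffs use a simple nearest-rank rule.

```python
import math
from statistics import mean, pstdev

def sd_category(score, center, sd):
    """Three-category rule: more than one SD above or below the reference score."""
    if score > center + sd:
        return "better"
    if score < center - sd:
        return "below"
    return "average"

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of a list of scores (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def percentile_stars(score, p75, p90):
    """Star rule: 2 stars above the 90th percentile, 1 star above the 75th."""
    if score > p90:
        return 2
    if score > p75:
        return 1
    return 0

# Invented scores (percent of patients receiving recommended care).
scores = {"Clinic A": 62, "Clinic B": 70, "Clinic C": 71, "Clinic D": 73,
          "Clinic E": 75, "Clinic F": 76, "Clinic G": 78, "Clinic H": 80,
          "Clinic I": 84, "Clinic J": 95}

center = mean(scores.values())   # assumption: stands in for the statewide score
sd = pstdev(scores.values())     # assumption: SD taken across the listed clinics
p75 = nearest_rank_percentile(list(scores.values()), 75)
p90 = nearest_rank_percentile(list(scores.values()), 90)

for name, score in scores.items():
    print(name, score, sd_category(score, center, sd),
          percentile_stars(score, p75, p90))
```

Because both rules are relative, adding or removing a clinic can move the cutoffs and change another clinic's category even when that clinic's own score is unchanged.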

  2. Option 2: Simplified reports of absolute provider performance. Rather than showing how providers compare with each other, performance reports can show simplified categories of absolute performance. For example, if the range of possible scores on a performance measure is 0-100, such a report could tell patients whether a given provider scored above 80 or below 80 (regardless of how many providers score above or below 80). This approach is attractive when CVE stakeholders can agree on an absolute performance threshold above (or below) which there are no truly meaningful differences in performance.


    Advantages:
    • Enables patients (who may lack medical or statistical expertise) to understand performance information when choosing providers.
    • May set clear performance goals for providers. By comparing their absolute current scores to the performance thresholds that define the reported performance categories, providers can gauge how much they need to improve to get into a higher category.


    Caveats:
    • The representation of statistical uncertainty may be challenging.
    • Reports may not contain enough detailed information to guide provider improvement efforts.


    • If all providers in a CVE's area are in the same performance category, then the report will not be useful in choosing a provider. This is not necessarily a bad thing. If, for example, all providers score in the highest category, then patients can choose providers on attributes such as convenience and be reasonably confident that they will get high-performing providers.
  3. Option 3: Detailed reports of absolute provider performance. This option is attractive when the purpose of reporting is to guide providers' efforts to improve performance. These reports may present data that are as detailed as possible as well as data that are somewhat more aggregated (to enable providers to prioritize their efforts). These reports also may contain explicit improvement strategies and identify high-performing providers who can share best practices.


    Advantages:
    • Reports may give providers useful guidance in their improvement efforts.


    Caveats:
    • Data complexity may make these reports less accessible to patients who are trying to choose a provider.

Option 3 Examples: Detailed reports of absolute provider performance

Organizations leading the Minnesota Healthcare Value Exchange report numeric performance scores for each provider in their reports. These reports display the providers in rank order of their scores. There is no representation of statistical uncertainty in the public reports. However, providers receive even more detailed reports of their own scores with statistical confidence intervals.

Aligning Forces for Quality-South Central Pennsylvania displays provider performance on each measure of diabetes care quality as an absolute percentage, with national and community average scores included as benchmarks. These performance scores are initially sorted according to provider name (in alphabetical order), but providers also can be sorted by performance rank (with a single user action). Currently, no representation of statistical uncertainty is included in these performance reports.

The Wisconsin Healthcare Value Exchange generally reports absolute performance scores, consistent with the primary purpose of enabling provider groups to compare their performance to benchmarks. The ambulatory Web site is "not really designed for consumers," and Web site user tracking statistics confirm that the site is most often visited by Wisconsin health care providers.
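A detailed report of absolute performance, of the kind described in these examples, can be sketched as follows. The provider names, numerators, and denominators are invented, the function names (wilson_interval, build_rows) are hypothetical, and the Wilson score interval is one common choice of 95% confidence interval that is assumed here for illustration; the source does not say which interval any CVE uses.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def wilson_interval(numerator, denominator, z=Z_95):
    """Wilson score interval for a proportion, returned as (low, high)."""
    p = numerator / denominator
    scale = 1 + z * z / denominator
    center = (p + z * z / (2 * denominator)) / scale
    half = z * math.sqrt(p * (1 - p) / denominator
                         + z * z / (4 * denominator ** 2)) / scale
    return center - half, center + half

def build_rows(raw):
    """Turn {provider: (numerator, denominator)} into detailed report rows."""
    rows = []
    for name, (num, den) in raw.items():
        low, high = wilson_interval(num, den)
        rows.append({"provider": name, "numerator": num, "denominator": den,
                     "rate": num / den, "ci_low": low, "ci_high": high})
    return rows

# Invented data: patients meeting the measure / eligible patients.
raw = {"Maple Clinic": (45, 60), "Birch Group": (180, 200),
       "Cedar Practice": (24, 40)}

rows = build_rows(raw)
by_name = sorted(rows, key=lambda r: r["provider"])            # alphabetical view
by_rank = sorted(rows, key=lambda r: r["rate"], reverse=True)  # rank-order view

for r in by_rank:
    print(f'{r["provider"]}: {r["numerator"]}/{r["denominator"]} = '
          f'{r["rate"]:.1%} (95% CI {r["ci_low"]:.1%} to {r["ci_high"]:.1%})')
```

Keeping the numerator, denominator, and interval in each row lets the same data feed both a public rank-order view and a more detailed confidential view for each provider.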


C. What will be the acceptable level of performance misclassification due to chance?

It is impossible to know exactly which providers are misclassified due to chance alone. However, it is possible to know, for each provider, the risk (i.e., probability) that performance is misclassified. CVE stakeholders can therefore negotiate a maximum acceptable risk of performance misclassification due to chance, and this negotiation can take place before performance reports are created (i.e., before providers know exactly how their performance will appear). This negotiation can be more useful and concrete if there is general agreement about the format of a performance report.

For example, if a CVE has provisionally decided on a 4-star scale for reporting, stakeholders can address such questions as:

  • What is the maximum acceptable risk that a true 4-star provider will be misclassified as a 3-star provider? What about being misclassified as a 2-star provider?
  • What is the maximum acceptable risk that a true 2-star provider will be misclassified as a 3-star provider? Or a 4-star provider? Or a 1-star provider?
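Questions like these can be made concrete with a small calculation. The sketch below assumes a simple model that is not taken from the source: each provider has a fixed true rate, each of its patients independently meets the measure with that probability (so the observed count is binomial), and stars are assigned from hypothetical cut points on the observed rate. All names and cut points are illustrative.

```python
from math import comb

# Hypothetical cut points: (stars, lowest observed rate that earns them).
STAR_CUTOFFS = [(4, 0.90), (3, 0.80), (2, 0.70), (1, 0.0)]

def stars_for_rate(rate):
    """Star category for a rate under the hypothetical cut points."""
    for stars, lower in STAR_CUTOFFS:
        if rate >= lower:
            return stars
    return 1

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def star_distribution(true_rate, n_patients):
    """Probability of each reported star category, given the true rate."""
    dist = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for k in range(n_patients + 1):
        dist[stars_for_rate(k / n_patients)] += binomial_pmf(k, n_patients, true_rate)
    return dist

def misclassification_risk(true_rate, n_patients):
    """Probability that the reported category differs from the true one."""
    dist = star_distribution(true_rate, n_patients)
    return 1.0 - dist[stars_for_rate(true_rate)]

# A true 4-star provider (true rate 0.95) measured on 20 versus 200 patients.
for n in (20, 200):
    dist = star_distribution(0.95, n)
    print(n, "patients: P(reported 3-star) =", round(dist[3], 4),
          " P(reported 2-star) =", round(dist[2], 4),
          " total risk =", round(misclassification_risk(0.95, n), 4))
```

Under this model the risk depends on how close the true rate sits to a cut point and on how many patients are measured, which is why stakeholders may want to discuss the report format and the acceptable risk together.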

There is no "right answer" to the acceptable risk of misclassification due to chance. How much risk is acceptable may vary by CVE, depending on exactly which measures will be reported and on how performance reports will be used. Patients have a wide range of opinions about the acceptable level of misclassification risk. In a 2006 survey, most patients thought a risk of misclassification greater than 5 percent but not greater than 20 percent would be acceptable.12

Other CVE stakeholders may have different opinions about how much misclassification they think is reasonable in a performance report. The important thing is to engage CVE stakeholders in discussions about misclassification, to acknowledge its existence as a limitation of any performance report, and to begin to achieve consensus on how much misclassification risk is acceptable. This level of risk can always be revisited at later stages, especially once more is known about how the factors that determine misclassification interact with each other in a given performance report (Appendix 1 discusses this issue further).
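Once stakeholders settle on a maximum acceptable risk, one practical consequence is a minimum number of patients per provider. The sketch below is illustrative only: it assumes a binomial model with a fixed true rate, a single hypothetical cut point of 0.90 on the observed rate, and the 5 percent and 20 percent risk levels mentioned in the survey result above. The function names are hypothetical, and true_rate must lie strictly between 0 and 1.

```python
from math import exp, lgamma, log

def binomial_pmf(k, n, p):
    """Binomial probability computed on the log scale for numerical stability."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def prob_below_cutoff(true_rate, cutoff, n):
    """P(observed rate < cutoff) when successes ~ Binomial(n, true_rate)."""
    return sum(binomial_pmf(k, n, true_rate)
               for k in range(n + 1) if k / n < cutoff)

def min_sample_size(true_rate, cutoff, max_risk, max_n=500):
    """Smallest n for which the risk stays at or below max_risk for every
    sample size from n through max_n; None if even max_n is too small."""
    answer = None
    for n in range(max_n, 0, -1):
        if prob_below_cutoff(true_rate, cutoff, n) > max_risk:
            break
        answer = n
    return answer

# A provider whose true rate (0.95) is above the hypothetical 0.90 cut point.
for max_risk in (0.05, 0.20):
    n_needed = min_sample_size(0.95, 0.90, max_risk)
    print(f"max risk {max_risk:.0%}: at least {n_needed} patients")
```

The search runs downward from max_n because, with discrete counts, the risk does not fall smoothly as the sample grows; a sample size that just meets the target can be followed by a slightly larger one that misses it.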

To decide on an acceptable amount of misclassification due to chance, CVE stakeholders may want to think about the goals of performance reporting:

  • If the goal of the performance report is to help patients choose higher performing providers, reports that have too high a rate of misclassification can mislead too many patients.
  • If the goal of the performance report is to motivate providers to improve, an excessive rate of misclassification will falsely reassure too many low-performing providers who are misclassified as high performing. It also can generate concern among high-performing providers who are classified as low performers.
  • If the goal of the performance report is to reward high performance, an excessive rate of misclassification will result in too many low performers being rewarded and too many high performers not receiving a reward.

The acceptable risk of performance misclassification due to chance can take many values. We present two polar extremes to illustrate the tradeoffs.

  1. "Extreme" Option 1: Set a very low level of acceptable misclassification risk due to chance. An example of a very low level of risk is "less than 1% of all true 4-star providers will be misclassified as 3-star providers, and less than 0.1% will be misclassified as 2-star providers." An example of a current report that uses statistical confidence intervals to limit the risk of misclassifying average performers as above or below average is the Hospital Compare report of hospitals' 30-day mortality rates. For the vast majority of hospitals, Hospital Compare classifies their mortality performance as average ("No different than the U.S. national rate").


    Advantages:
    • The risk that a provider's performance will be misclassified due to chance will be low.
    • If statistical confidence intervals are used to set a low level of misclassification risk, then the probability of one type of misclassification (classifying providers as below or above average when they truly have average performance) will be limited to the significance level (usually 5%, which corresponds to a 95% confidence interval). Using confidence intervals to limit misclassification risk is discussed in more detail in the section on Task Number 5.


    Caveats:
    • When sample sizes are small (or when between-provider differences in true performance are minimal), it may not be possible to include a large proportion of providers in the report. Or it may not be possible to report performance on measures that are important to stakeholders. These problems are especially likely when reporting the performance of individual clinicians.
    • If statistical confidence intervals are used to set a low level of misclassification risk, then nearly all providers may be classified as having average performance. Therefore, there will be a higher risk of misclassifying truly above or below average providers as average performers.
  2. "Extreme" Option 2: Set a very high level of acceptable misclassification risk due to chance. An example of a very high level of risk is "up to 40% of all true 4-star providers will be misclassified as 3-star providers."


    Advantages:
    • Even when sample sizes are not large, it may be possible to report the performance of nearly all providers on nearly all measures.


    Caveats:
    • The performance report may misclassify the performance of many providers on many measures solely due to chance. The potential consequences of this performance misclassification are shown in Figure 3.
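The tradeoff in "Extreme" Option 1 can be illustrated with a short calculation. The sketch below classifies a provider as above or below average only when a 95% confidence interval excludes a benchmark, then computes how often a provider at the benchmark, and a provider truly above it, would land in each category. The benchmark (0.80), the true rates, the sample sizes, the choice of the Wilson score interval, and all function names are assumptions made for illustration; this is not the Hospital Compare method.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def wilson_interval(k, n, z=Z_95):
    """Wilson score interval for k successes out of n, as (low, high)."""
    p = k / n
    scale = 1 + z * z / n
    center = (p + z * z / (2 * n)) / scale
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / scale
    return center - half, center + half

def classify_vs_benchmark(k, n, benchmark):
    """Report a difference only when the interval excludes the benchmark."""
    low, high = wilson_interval(k, n)
    if low > benchmark:
        return "above average"
    if high < benchmark:
        return "below average"
    return "average"

def classification_probabilities(true_rate, n, benchmark):
    """Probability of each reported category under a binomial model."""
    probs = {"above average": 0.0, "average": 0.0, "below average": 0.0}
    for k in range(n + 1):
        pmf = math.comb(n, k) * true_rate ** k * (1 - true_rate) ** (n - k)
        probs[classify_vs_benchmark(k, n, benchmark)] += pmf
    return probs

BENCHMARK = 0.80  # hypothetical average rate
for n in (25, 100, 400):
    at_avg = classification_probabilities(0.80, n, BENCHMARK)
    better = classification_probabilities(0.88, n, BENCHMARK)
    print(f"n={n}: truly average reported as not average: "
          f"{1 - at_avg['average']:.3f}; "
          f"truly better reported as average: {better['average']:.3f}")
```

In this sketch the first kind of error stays small at every sample size, while the second is large when few patients are measured and shrinks only as the sample grows, which is the caveat noted for Option 1.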

Example: Talking with stakeholders about misclassification risk

Even though misclassification risk is a fundamental and important methodological issue, more tangible approaches to discussing the subject may help engage stakeholders. In interviewing CVE stakeholders, we found that most CVEs do not currently engage stakeholders in conversations that are explicitly about misclassification risk. Instead, CVEs discuss more "tangible" topics that are fundamentally about misclassification risk, without actually mentioning the words "misclassification risk." These discussions may combine the statistical theory-based concerns outlined in this report with the political realities in which each CVE operates.

As Nancy Clarke, formerly Executive Director of the Oregon Health Care Quality Corporation, explains:

If we held a meeting on "risk of misclassification," no one would come. But when we have meetings on "tradeoffs: what's fair to providers and fair to consumers," plenty of people come. We had sequential meetings, each with a white paper that combined the statistical and the political: "What's big enough for clinic size?" "What's big enough for number of cases?" "How do we put data into buckets to show the public?" "What's a fair benchmark?" etc. EVERYBODY comes to those meetings.

[i] Refer to reports by Hibbard and Sofaer for more detailed guidance on which kinds of reporting formats might be preferable for purposes such as helping patients choose their providers.1-2

Page last reviewed September 2011
Internet Citation: Decisions Encountered During Key Task Number 1: Negotiating Consensus on Goals and "Value Judgments" of Performance Reporting: Methodological Considerations in Generating Provider Performance Scores. September 2011. Agency for Healthcare Research and Quality, Rockville, MD.