Decisions Encountered During Key Task Number 3: Identifying Data Sources and Aggregating Performance Data
Methodological Considerations in Generating Provider Performance Scores
There are a few key types of performance data that a Chartered Value Exchange (CVE) may want to collect, and there are different ways to compile these data. The basic types of performance data include:
- Administrative data (e.g., claims, hospital discharge data, prescription fills, laboratory services).
- Medical record data (both paper and electronic).
- Clinical registry data.
- "Hybrid" data (i.e., administrative data that are combined with selected medical record data to improve accuracy).21
- Data from patient experience surveys.
The definitions, advantages, and disadvantages of using all of these types of data are discussed in more detail in a separate AHRQ decision guide, Selecting Quality and Resource Use Measures: A Decision Guide for Community Quality Collaboratives.14 However, from a methodological point of view, a key decision is the degree to which performance data will be processed before reaching a CVE.
To construct performance measures, "raw" sources of data (e.g., health plan claims, hospital discharge data) must be converted into a format that is ready for measure specifications to be applied. This conversion (also known as "data cleaning") can be very cumbersome, especially when a CVE does not already have in-house expertise in processing a particular data source. One approach to dealing with raw data sources is to contract with a data management vendor. Guidance on selecting and interacting with a vendor is available in the decision guide Selecting Quality and Resource Use Measures mentioned above.
Two general models or approaches to data aggregation can be followed:
- An "aggregated data model" where more detailed raw data are aggregated by a CVE to produce performance measures.
- A "distributed data model" where the entity or entities that provide the data (usually health plans) retain many key data elements (especially those that constitute personal health information) and may process the data into provider scores (or numerators and denominators). For example, a health plan might process its own raw claims, apply measure specifications provided by a CVE to these processed claims, and then report measured performance (e.g., numerators and denominators on diabetic eye exams) to the CVE for each provider. This way, the CVE never has to deal directly with raw performance data. However, the CVE still has the freedom to combine these measured performance statistics with other data sources in its own report. In a distributed data model, a CVE may also be able to construct some types of composite measures.ii
A third alternative is to use "prescored" data generated by another performance reporting organization. Examples of prescored data include hospital safety ratings by Leapfrog or categories of hospital death rates from the Centers for Medicare & Medicaid Services (CMS) Hospital Compare. These prescored data have been fully processed into performance scores (or categories of performance), and CVEs generally cannot influence how these measures are specified or how performance is classified. The advantage of using prescored data is that a CVE can report these scores (or performance categories) without needing to process any data. However, using prescored data may limit a CVE's options for addressing performance misclassification, whether systematic or due to chance.
Option 1: Obtain raw performance data ("aggregated data model"). Raw performance data include health plan claims, hospital discharge data, medical record abstracts, and patient survey responses.
Advantages:
- This approach maximizes a CVE's degree of freedom to decide how performance measures will be specified and reported. A CVE will be able to decide the organizational level of reporting (e.g., individual practitioner or provider group) and determine how to construct composite measures.
- By handling raw performance data, a CVE will learn the limitations and flaws of these data. A CVE also may work with health plans and providers on data improvements to help facilitate future measurement and reporting efforts.
- Because raw performance data can be processed on a patient-by-patient basis, this approach allows for a detailed data review and correction process.
- This approach allows maximum flexibility and range of options in attributing data to providers, performing case mix adjustment, and dealing with the risk of performance misclassification due to chance.
- This approach maximizes the potential for performance data to be used for research.
Disadvantages:
- Processing raw performance data may require substantial experience and can be difficult and expensive. A data management vendor will be needed.
- Raw performance data may contain individually identifiable health data about patients. When such data are present, a CVE must take additional precautions to preserve the privacy, confidentiality, and security of these data. Depending on the type of data, there may be additional legal considerations (e.g., Health Insurance Portability and Accountability Act).
Example: Using raw health plan claims ("aggregated data model")
The Oregon Health Care Quality Corporation (http://q-corp.org) and Puget Sound Health Alliance (http://www.wacommunitycheckup.org) both receive raw claims data from commercial and Medicaid health plans. These CVEs share an experienced data contractor that processes the claims, working with the health plans and other CVE stakeholders to identify and address missing data, check data interpretation, and calculate provider performance scores.
Option 2: Use a distributed data model. Raw performance data can be processed by the original sources of these data, using measure specifications provided by the CVE. For example, health plans may process their own claims data and send provider-level performance measure numerators and denominators to a CVE, rather than patient-level data that would need to be aggregated up to the provider level.
Advantages:
- CVE avoids the cost and difficulty of processing raw performance data.
- CVE retains some flexibility in specifying performance measures, specifying case mix adjustment methods, and addressing other analytic concerns.
Disadvantages:
- If the needs of a CVE change, it may be difficult for data sources to agree to reprocess the raw performance data and send new kinds of measure output to the CVE.
- Potential exists for misleading reports if CVE partners who produce the data do not process their data in the same way or use exactly the same measure specifications.
- When case mix adjustment is desired, using a distributed data model may limit the types of adjustment methods available to a CVE, as case mix adjustment often requires more granular information such as patient characteristics. The section on Task Number 5 discusses situations in which case mix adjustment may be warranted.
- As with a vendor, a CVE may need to perform audits to determine whether data are being processed as specified by the CVE.
Examples: Using a distributed data model
- Massachusetts Health Quality Partners (http://www.mhqp.org), Greater Detroit Area Health Council (http://www.gdahc.org), and Healthy Memphis Common Table (http://www.healthymemphis.org) obtain HEDIS measure numerators and denominators from each of their health plans. Therefore, each health plan deals directly with its own raw administrative data.
- The Quality Alliance Steering Committee (QASC) and America's Health Insurance Plans (AHIP) are piloting a prototype distributed data model that includes a subset of Colorado and Florida health plans in an effort to generate HEDIS® (Healthcare Effectiveness Data and Information Set) measures (http://www.healthqualityalliance.org/hvhc-project).
- To generate performance data for reports by Aligning Forces for Quality-South Central Pennsylvania (http://www.aligning4healthpa.org), providers randomly sample their own medical records and abstract these records to generate numerators and denominators on diabetes quality of care measures.
Option 3: Use "prescored" data. Examples of fully processed performance scores include Leapfrog patient safety ratings and ratings from Medicare's Hospital Compare (patient experience measures [H-CAHPS, the Hospital Consumer Assessment of Healthcare Providers and Systems], hospital mortality rates, and readmission ratings).
Advantages:
- Data have already been completely processed into performance scores that may be ready for reporting (and may already have been reported).
Disadvantages:
- CVE has little or no control over measure specifications.
- CVE has little or no control over providers (in the case of Leapfrog ratings) or payers (in the case of CMS Hospital Compare) that are represented in the prescored data.
- CVE has limited options for performing case mix adjustment, addressing misclassification risk, and dealing with other analytic concerns.
- Important measure details and data validity checks (e.g., specifications, attribution rules, case mix adjustment methods, and level of misclassification risk) may or may not be available, depending on the documentation available from the source of the prescored data.
B. How will data sources be combined?
If a CVE uses more than one source of performance data for a given measure, then these data sources will need to be combined so that the CVE can report a single level (or category) of performance on that measure for each provider. For example, a CVE may collect performance data on a diabetes quality of care measure from three commercial health plans, plus Medicare and Medicaid. A given provider may have patients with diabetes from each of these five payers. However, reporting five separate performance scores for each provider on this measure (one for each data source) might confuse patients. Receiving multiple performance scores on the same measure may also annoy providers, especially when the scores are very different across sources. These divergent scores may also be a sign of small denominators within each data source (as discussed in Appendix 2) or inadequate case mix adjustment (as discussed in the section on Task Number 5).
Combining data from multiple sources is not a trivial task, given variations in coding practices across public and private payers. For example, it is not unusual for different payers to have different provider identifiers, which creates challenges in generating a unified provider file. The degree of difficulty in aggregating data across multiple sources partly depends on the amount of data processing that has occurred before these data reach the CVE (see the section on Task Number 3 for more discussion of preprocessed data). In general, the less preprocessed the data from multiple sources, the more work is necessary to combine these data.
Scenario 1: Starting with raw performance data from multiple sources ("aggregated data model")
As discussed earlier, "raw" performance data are data to which measure specifications have not yet been applied. In other words, these raw data have had little or no processing. Claims for a health plan's members are a common example of raw performance data. To process raw data into performance scores, a CVE will need to work with a data vendor. The AHRQ decision guide Selecting Quality and Resource Use Measures: A Decision Guide for Community Quality Collaboratives contains guidance on selecting and working with a data vendor.14
Calculating the measures within each data source offers a chance to ensure that measure specifications are being correctly applied. For example, scores on a measure may change dramatically depending on whether the performance data are from source A or source B, with no reasonable explanation. (A reasonable explanation might be that one data source represents a higher risk population.) When no such explanation exists, the measure specifications may have been incorrectly applied to one of the sources, or there could be problems with the data from one or more sources.
Because every source of raw data is different, it may be advisable for the CVE or data vendor to directly consult with each source to resolve any questions about how the data are coded (see the section on Task Number 4). For example, if a CVE is calculating a diabetes measure from a health plan's data, the CVE may want to review the measure specifications with health plan staff. This step can help ensure that specifications will identify the intended population of patients with diabetes.
Scenario 2: Using a distributed data model with multiple sources
In a distributed data model, raw performance data can be processed by the sources of these data before the data are shared with the CVE, using measure specifications provided by the CVE. For example, in a distributed data model, health plans can calculate a provider's numerator and denominator for each Healthcare Effectiveness Data and Information Set (HEDIS) measure and report these to the CVE.
In a distributed data model with multiple sources, the major challenge to combining data across sources is ensuring consistent provider identification. When data sources report performance to a CVE, the performance being reported must be linked to a provider identifier (e.g., a numeric code or name representing the physician whose performance is being reported). The central problem that CVEs may commonly encounter is that each data source may use a different set of provider identifiers. In other words, Dr. Jones might have one identifier in Plan A and another identifier in Plan B. In addition, Dr. Jones' name may be represented differently across the different health plan files. To combine Dr. Jones' performance reported by Plan A with Dr. Jones' performance in Plan B, a CVE will need a "crosswalk" that links the identifiers for each provider across the data sources to be combined.
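As a minimal sketch of how such a crosswalk might be applied, the Python fragment below links results reported under source-specific identifiers to one common identifier per provider. The plan names, identifier strings, common IDs such as `MASTER-001`, and all counts are hypothetical and chosen only for illustration; they are not drawn from any CVE's data.

```python
# Hypothetical crosswalk: (data source, source-specific provider ID) -> common ID.
crosswalk = {
    ("plan_a", "A-1001"): "MASTER-001",  # Dr. Jones as identified by Plan A
    ("plan_b", "77-J"): "MASTER-001",    # Dr. Jones as identified by Plan B
    ("plan_b", "12-S"): "MASTER-002",
}

# Provider-level results as each source might report them (illustrative values).
reports = [
    {"source": "plan_a", "provider_id": "A-1001", "numerator": 40, "denominator": 50},
    {"source": "plan_b", "provider_id": "77-J", "numerator": 5, "denominator": 10},
    {"source": "plan_b", "provider_id": "12-S", "numerator": 8, "denominator": 16},
    {"source": "plan_a", "provider_id": "A-9999", "numerator": 3, "denominator": 4},
]


def link_reports(reports, crosswalk):
    """Group reported results by common provider ID; collect rows that cannot be linked."""
    linked, unlinked = {}, []
    for row in reports:
        master_id = crosswalk.get((row["source"], row["provider_id"]))
        if master_id is None:
            # No crosswalk entry: set aside for follow-up instead of guessing a match.
            unlinked.append(row)
        else:
            linked.setdefault(master_id, []).append(row)
    return linked, unlinked


linked, unlinked = link_reports(reports, crosswalk)
```

In this sketch, rows that cannot be linked are kept in a separate list so that they can be reviewed, which mirrors the practical concern that a crosswalk may fail to link some providers across sources.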
The following are two options that illustrate ways to create a provider crosswalk.
Option 1 for provider crosswalk: Use readily available provider identifiers. Some provider identifiers may be readily available to a CVE. These include provider taxpayer identifiers, national provider identifiers (NPIs), Drug Enforcement Administration (DEA) numbers, State medical license numbers, and Medicare billing identifiers. These identifiers may correspond to providers of different types, including individual physicians, medical groups, hospitals, and integrated health care delivery systems.
Advantages:
- Using these identifiers is relatively economical.
- For hospitals and other large provider organizations, readily available identifiers may be highly accurate.
Disadvantages:
- For individual practitioners and small outpatient practices, a crosswalk based on readily available identifiers may have low accuracy. For example, a tax ID may include providers that actually have little to do with each other, aside from sharing a common billing system. In addition, it may be difficult to know which tax IDs represent individual practitioners and which represent larger groups, making comparisons more difficult.
- The crosswalk may not be able to link a large number of providers across data sources. This may happen when one data source does not include the same "readily available" identifiers as another.
- The crosswalk may not enable a CVE to identify provider attributes. For example, it may be impossible to know which tax IDs represent individual physicians and which represent small groups. It also may be impossible to determine the specialty of each provider.
- The ability to change the level of reporting may be limited. If the crosswalk only contains individual physician identifiers, then it may not be possible to report performance at higher levels of provider organization (e.g., the medical group). Reporting for larger groups of providers can be an important option for limiting misclassification risk (see the section on Task Number 5).
Example: Building on readily available provider identifiers
The Healthy Memphis Common Table (http://www.healthymemphis.org) began its performance reporting efforts by using the Medicare GEM* dataset: provider identifiers were obtained from a single plan. When a group could not be identified (15% of the time), staff followed up directly with providers and practice managers to let them self-identify. When commercial health plan data were later used, checking health plan provider identifiers for accuracy and consistency (via telephone calls to providers) revealed that the tax IDs did not always match across health plans. In these cases, additional variables were used for matching, such as provider address. Solo practices also were examined specifically to ensure that the apparent "practice" was not part of a larger provider group; true solo practices were not reported. The CVE is now working to develop a master directory of providers in the Memphis area.
* GEM refers to the Generating Medicare Physician Quality Performance Measurement Results Project.
Option 2 for provider crosswalk: Create a "master provider directory." A master provider directory is an organizational mapping of all the known providers in a CVE's local geographic area. This mapping tells which individual practitioners are affiliated with which practice sites (or clinics), tells which practice sites are part of which larger medical groups, and may include affiliations with larger provider organizations. Other data that may be included in a master directory are individual provider specialties, certifications, and acceptance of new patients. Finally, to enable a CVE to combine performance data from multiple sources, the master directory must contain a crosswalk with the provider identifiers used by each data source.
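One way to picture the contents of a master provider directory is as a small data structure. The Python sketch below is illustrative only: the `Practitioner` class, its field names, the clinic and group names, and the identifiers are all assumptions made for the example, not a format used by any CVE.

```python
from dataclasses import dataclass, field


@dataclass
class Practitioner:
    master_id: str       # common identifier assigned by the directory (hypothetical)
    name: str
    specialty: str
    practice_site: str   # clinic where the practitioner works
    # Identifiers used by each data source: {source name: source-specific ID}
    source_ids: dict = field(default_factory=dict)


# Practice site -> larger medical group (hypothetical affiliations).
site_to_group = {
    "Elm Street Clinic": "Riverside Medical Group",
    "Oak Avenue Clinic": "Riverside Medical Group",
}

directory = [
    Practitioner("MASTER-001", "Dr. Jones", "Internal medicine", "Elm Street Clinic",
                 {"plan_a": "A-1001", "plan_b": "77-J"}),
    Practitioner("MASTER-002", "Dr. Smith", "Family medicine", "Oak Avenue Clinic",
                 {"plan_b": "12-S"}),
]


def build_crosswalk(directory):
    """Derive the (source, source ID) -> master ID crosswalk from the directory."""
    return {
        (source, source_id): p.master_id
        for p in directory
        for source, source_id in p.source_ids.items()
    }


def group_of(practitioner, site_to_group):
    """Look up the medical group for a practitioner through the practice site."""
    return site_to_group.get(practitioner.practice_site)


crosswalk = build_crosswalk(directory)
```

Because each practitioner record carries a practice site, and each site maps to a group, the same directory can support reporting at the practitioner, site, or group level.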
Having a master provider directory is especially important when a CVE is reporting the performance of individual providers or small groupings of providers. If a CVE is only reporting hospital performance, then a master directory may be less useful. Creating such a directory may require substantial time, effort, and resources since collecting new data (often by directly contacting providers) is almost certain to be necessary. Maintaining a master directory also requires ongoing investment, since providers often change their affiliations. However, once created, a master directory also has distinct advantages.
- The master provider directory can serve as a common reference point for all data sources to ensure valid aggregation of performance data.
- The directory may help convince providers that performance is being accurately reported. By contacting providers as part of master directory maintenance, the CVE demonstrates a commitment to accurate reporting.
- The directory offers flexibility in determining the best level of provider organization for performance reporting. Reporting for larger provider groups can be an important option for limiting misclassification risk (see the section on Task Number 5).
Disadvantages:
- Requires significant time and resources.
- Requires provider engagement. The accuracy of a CVE's master directory will only be as good as the information given by providers.
Examples: Creating a master provider directory
Massachusetts Health Quality Partners (MHQP; http://www.mhqp.org) uses a "Master Physician Directory" to combine data from its five participating health plans. This master directory also enables MHQP to report HEDIS performance at the medical group level and simultaneously report patient experience survey data at the practice site level (a lower level of provider organization). To create the directory, MHQP relied on readily available physician identifiers (e.g., license and DEA numbers) and provider addresses. Provider organizational mappings from local health plans conflicted with each other, so MHQP engaged in direct outreach to providers to learn their self-identified organizational relationships. The directory is updated annually, and this update is now facilitated by a computer interface that allows providers to correct their pieces of the directory. It took the MHQP directory roughly 10 years to reach a "steady state" in which the same percentage of providers (~5-10%) changes affiliations from one year to the next. At this point, MHQP leaders believe these changes of affiliation no longer represent corrections of past errors. Instead, these changes represent true changes in provider affiliation that occur when providers move or groups change their configurations.
The Oregon Health Care Quality Corporation (http://q-corp.org) created an Oregon practitioner directory listing primary care clinics with four or more physicians (including roughly 2,000 of 3,000 such physicians in the State). Creating an accurate directory required Internet sleuthing and direct outreach via telephone. When plans for public reporting were circulated, the clinics began to actively participate in correcting their directory entries. Puget Sound Health Alliance (http://www.wacommunitycheckup.org) similarly created a provider directory that included clinics with four or more physicians.
Minnesota Community Measurement (a constituent of the Minnesota CVE; http://www.mnhealthscores.org) also created a master provider directory for ambulatory physician clinics. It took 3 years to create and verify this directory.
General approach to combining data once crosswalk is complete
Creating an accurate provider crosswalk may be the most difficult part of combining performance data from multiple sources. Even once a crosswalk is complete, however, the best way to aggregate performance data for each provider can be unclear. From a methodological standpoint, aggregating provider performance across multiple data sources (on a single measure) is very similar to creating a performance composite from multiple individual measures.
Two key methodological concepts apply: validity and reliability. Greater validity means that a smaller share of providers will be systematically misclassified in a performance report. Greater reliability means that a smaller share of providers will be misclassified due to chance alone in a report. Both of these concepts are discussed in more detail in Appendixes 1 and 2, but their application to combining multiple-source data is briefly discussed here.
To maximize measurement reliability, performance data from each source can be weighted when they are combined. A general recommended strategy is to give performance scores that are based on fewer observations less weight than those that are based on more observations. In other words, this approach allows more reliably measured scores to have more influence than less reliably measured scores. This weighting strategy is straightforward for measures with numerators and denominators. By separately summing the numerators and denominators from all sources and then dividing the summed numerator by the summed denominator, a CVE will produce the most reliable performance estimate that is possible with the data available.
For example, a provider might deliver a HEDIS service to 40 out of 50 patients (80%) in health plan A and 5 out of 10 patients (50%) in plan B. Using the recommended strategy for combining data from these plans, the summed numerator is 45 (40 + 5) and the summed denominator is 60 (50 + 10). Therefore, the combined performance score is 75% (45 divided by 60). Note that 75% is much closer to 80% (the plan A score) than to 50% (the plan B score). The combined score is closer to the plan A score because this strategy of combining performance data automatically weighted the data appropriately, giving more weight to plan A, which had more observations and therefore a more reliable measured score.
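The arithmetic in this example can be written as a short Python sketch. The function name `combine_sources` is hypothetical; the counts are the ones from the plan A and plan B example. The last lines show that pooling numerators and denominators gives the same result as averaging the per-plan rates with weights equal to each plan's denominator.

```python
def combine_sources(results):
    """Pool (numerator, denominator) pairs from several sources into one rate."""
    total_num = sum(num for num, _ in results)
    total_den = sum(den for _, den in results)
    if total_den == 0:
        return None  # no observations from any source, so no score can be computed
    return total_num / total_den


plan_a = (40, 50)  # 40 of 50 patients received the service (80%)
plan_b = (5, 10)   # 5 of 10 patients received the service (50%)

combined = combine_sources([plan_a, plan_b])  # 45 / 60 = 0.75

# Equivalent view: average of per-plan rates, weighted by each plan's denominator.
weighted = (50 * (40 / 50) + 10 * (5 / 10)) / (50 + 10)
```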
Validity issues may arise when combining data sources because different data sources may contain data generated by dissimilar patient populations. For example, a CVE may want to combine data from Medicare with data from a commercial health plan. However, these two patient populations may differ in many important ways. It may be misleading to compare "Provider A," who mostly sees patients with Medicare, to "Provider B," who mostly sees patients with commercial insurance, on measures of mortality. This might be the case because patients with Medicare are probably older than those with commercial insurance and therefore have a higher baseline rate of mortality. Thus, even if Provider A gives care that is equal to Provider B's care for both patient populations, Provider A will appear to have worse performance (i.e., a higher mortality rate).
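A small numerical sketch can make this concern concrete. In the Python fragment below, all counts are invented for illustration: both providers have identical mortality within each payer group (10% for Medicare patients, 2% for commercially insured patients), but Provider A sees mostly Medicare patients and Provider B sees mostly commercially insured patients.

```python
# Hypothetical (deaths, patients) counts by payer for each provider.
provider_a = {"medicare": (80, 800), "commercial": (4, 200)}
provider_b = {"medicare": (20, 200), "commercial": (16, 800)}


def stratum_rates(provider):
    """Mortality rate within each payer group (a stratified view)."""
    return {payer: deaths / patients for payer, (deaths, patients) in provider.items()}


def pooled_rate(provider):
    """Mortality rate after simply pooling all payers together."""
    deaths = sum(d for d, _ in provider.values())
    patients = sum(n for _, n in provider.values())
    return deaths / patients


rates_a = stratum_rates(provider_a)  # same rates as Provider B within each payer
rates_b = stratum_rates(provider_b)
pooled_a = pooled_rate(provider_a)   # 84 / 1000
pooled_b = pooled_rate(provider_b)   # 36 / 1000
```

Under these assumed counts, the pooled rate for Provider A is more than twice that of Provider B even though the two providers perform identically within each patient population. Reporting by stratum removes the difference.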
Problems combining data can be addressed through the case mix adjustment or stratification methods discussed in the section on Task Number 5. However, we note here—and explain in more detail in the section on Task Number 5—that case mix adjustment is not always a straightforward methodological decision.
C. How frequently will data be updated?
Provider performance is likely to change over time, so CVEs will want to periodically update the data in reports of provider performance. From a methodological perspective, there is no real downside to updating performance data as frequently as possible, using the most recent data available. After all, if the performance data contained in a report are too old, then they may no longer accurately represent provider performance.
One important caveat applies to updating performance data: As updates become more frequent, CVEs may be tempted to reduce the number of observations included in each update. An extreme example of this practice would be to send out weekly updates on a patient experience survey, sharing just the surveys that were returned in the preceding week. If only a few surveys are received each week, then week-to-week scores could fluctuate wildly due to chance alone (i.e., week-to-week scores would have low reliability).
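To give a rough sense of scale, the Python sketch below uses the standard binomial approximation for the standard error of an observed proportion, sqrt(p(1 - p)/n). It assumes, purely for illustration, that the survey score is the share of favorable responses, that the true rate is 80%, and that five surveys are returned each week; none of these figures come from a real survey.

```python
import math


def standard_error(p, n):
    """Approximate standard error of an observed proportion under binomial sampling."""
    return math.sqrt(p * (1 - p) / n)


true_rate = 0.8         # hypothetical true share of favorable responses
surveys_per_week = 5    # hypothetical weekly volume

weekly_se = standard_error(true_rate, surveys_per_week)       # roughly 0.18
yearly_se = standard_error(true_rate, surveys_per_week * 52)  # roughly 0.025
```

Under these assumptions, a weekly score would typically vary by about 18 percentage points from chance alone, while a score based on a full year of the same surveys would vary by about 2.5 percentage points.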
To increase measurement reliability, frequent updates may need to be accompanied by a "rolling average" approach to calculating provider performance. In this approach (discussed in the section on Task Number 5), data from preceding periods are combined with data from the most recent period to increase the number of observations. In more complex versions of the "rolling average" approach (e.g., Bayesian methods), more recent performance data get more weight than older performance data.
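The sketch below illustrates two simple versions of this idea in Python. The function names, the quarterly counts, and the decay factor are hypothetical. The second function uses plain exponential down-weighting of older periods as a stand-in for "more recent data get more weight"; it is not a Bayesian method and is shown only to convey the general idea.

```python
def rolling_rate(periods, window):
    """Pool the most recent `window` periods of (numerator, denominator) counts."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = periods[-window:]
    num = sum(n for n, _ in recent)
    den = sum(d for _, d in recent)
    return num / den if den else None


def decayed_rate(periods, decay):
    """Pool all periods, multiplying each older period's weight by `decay` (0 < decay <= 1)."""
    num = den = 0.0
    for age, (n, d) in enumerate(reversed(periods)):  # age 0 is the most recent period
        weight = decay ** age
        num += weight * n
        den += weight * d
    return num / den if den else None


# Hypothetical quarterly (numerator, denominator) counts, oldest first.
quarters = [(6, 10), (7, 10), (9, 12), (10, 12)]

latest_only = rolling_rate(quarters, 1)   # 10 / 12, few observations
four_quarter = rolling_rate(quarters, 4)  # 32 / 44, more observations
recency_weighted = decayed_rate(quarters, 0.5)
```

With these illustrative counts, the recency-weighted score falls between the four-quarter pooled score and the latest-quarter score, reflecting a compromise between using more observations and emphasizing recent performance.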
Other than potentially incurring greater expense, there is no practical downside to updating performance data as frequently as possible. To help decide how much expense is worthwhile, CVE stakeholders may aim for matching the frequency of data updates to the minimum length of time necessary for changes in true performance to occur. For most performance measures, it is probably not plausible for true performance to change on a week-to-week or even month-to-month basis.
Example: Frequency of data updates
The Greater Detroit Area Health Council (GDAHC; http://www.gdahc.org) updates the performance data in its public reports on an annual basis, with a lag of at least 1 year between the time clinical care is delivered and the time performance data are reported. Devorah Rich, formerly Project Director of GDAHC, explains that in the future, "real time reporting" is desired: "The analogy is like trying to lose weight. When groups are working hard, they want to know whether these efforts are successful and they want to get recognized for this."
ii A CVE using a distributed data model may be able to construct composite measures using a "weighted average" approach. Because a CVE using distributed data models may not receive patient-level data, it may not be possible to construct "all-or-none" composite measures. These types of composite measures are discussed in more detail in the section on Summary of Methodological Decisions Made by a Sample of CVE Stakeholders.
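The distinction in this note can be illustrated with a short Python sketch. The measure names, the patient records, and the particular form of weighted average used here (pooling numerators and denominators across measures) are assumptions made for illustration. The point is that the weighted average can be computed from per-measure counts alone, while the all-or-none composite needs to know which services each individual patient received.

```python
# Hypothetical patient-level records: whether each service was delivered.
patients = [
    {"eye_exam": True, "hba1c_test": True},
    {"eye_exam": True, "hba1c_test": False},
    {"eye_exam": False, "hba1c_test": True},
    {"eye_exam": True, "hba1c_test": True},
]
measures = ["eye_exam", "hba1c_test"]


def measure_counts(patients, measures):
    """Per-measure (numerator, denominator), as a data source might report them."""
    return {m: (sum(p[m] for p in patients), len(patients)) for m in measures}


def weighted_average_composite(counts):
    """Composite from aggregate counts only: pooled numerators over pooled denominators."""
    num = sum(n for n, _ in counts.values())
    den = sum(d for _, d in counts.values())
    return num / den


def all_or_none_composite(patients, measures):
    """Share of patients who received every service; requires patient-level data."""
    return sum(all(p[m] for m in measures) for p in patients) / len(patients)


counts = measure_counts(patients, measures)       # what the CVE would receive
weighted = weighted_average_composite(counts)     # 6 / 8 = 0.75
all_or_none = all_or_none_composite(patients, measures)  # 2 / 4 = 0.5
```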