SNAC-recommended Initial Core Set, September 18, 2009
The initial core set of Children's Healthcare Quality Measures for Voluntary Use by Medicaid and CHIP Programs was developed through a transparent, evidence-informed process that drew on broad input from multiple stakeholders. Key components included multiple opportunities for public comment, including a CMS-led listening session for Medicaid and CHIP officials; an AHRQ NAC Subcommittee that contributed expertise on the validity, feasibility, and importance of measures in use; and supportive background work by AHRQ, CMS, and members of the CHIPRA Federal Quality Workgroup.
Creation of the AHRQ National Advisory Council for Healthcare Research and Quality Subcommittee on Children's Healthcare Quality Measures for Medicaid and CHIP Programs (SNAC)
In May 2009, the AHRQ Director approved a Charter creating the Agency for Healthcare Research and Quality's National Advisory Council for Healthcare Research and Quality (AHRQ NAC) Subcommittee on Children's Healthcare Quality Measures for Medicaid and CHIP Programs (SNAC). The AHRQ NAC had agreed to provide advice to AHRQ and CMS to facilitate their work to recommend an initial core set of measures of children's health care quality for Medicaid and CHIP programs. To provide the requisite expertise and input from the range of stakeholders identified in the CHIPRA legislation, the NAC established the Subcommittee on Children's Healthcare Quality Measures for Medicaid and CHIP Programs (SNAC).
The SNAC was charged with: a) providing guidance on criteria for identifying an initial core measurement set; b) providing guidance on a strategy for gathering additional measures and measure information from State programs and others; and c) reviewing and applying criteria to a compilation of measures currently in use by Medicaid and CHIP programs to begin to select the initial core measurement set. SNAC recommendations were to be provided to the NAC, which in turn advises the Director of AHRQ.
Nominations for SNAC members to represent the range of stakeholders were sought from CMS and the CHIPRA Federal Quality Workgroup. Emphasis was placed on identifying Medicaid and CHIP officials because of their unique role as potential implementers of the initial core set. Although more were invited, 4 State Medicaid program officials (from Alabama, Minnesota, Missouri, and the District of Columbia) and 1 State CHIP official were able to participate as SNAC members. Others represented Medicaid, CHIP, and other State programs more generally (i.e., representatives of the National Academy for State Health Policy, National Association of State Medicaid Directors, and the Association of Maternal and Child Health Programs).
Representatives of health care provider groups came from the American Academy of Family Physicians, American Academy of Pediatrics, American Board of Pediatrics, the National Association of Children's Hospitals and Related Institutions, the National Association of Pediatric Nurse Practitioners, and a Medicaid health plan representative. The interests of families and children were represented by the March of Dimes. Individual SNAC members provided expertise in children's health care quality measurement, children's health care disparities, tribal health care, dental care, substance abuse and mental health care, adolescent health, and children's health care delivery systems in general. Two members of the NAC also participated in the SNAC. SNAC members are listed in the Appendix.
The SNAC Co-chairs, Rita Mangione-Smith, MD, MPH, and Jeffrey Schiff, MD, MBA, were selected because of their expertise in children's health care quality measurement and leadership role in the Medicaid Medical Directors Learning Network, respectively. The SNAC charter expires December 31, 2009.3,4
The SNAC held two public meetings (July 22-23 and September 17-18, 2009) and accomplished a substantial amount of work outside the meetings in order to help the NAC, AHRQ, CMS, and the Secretary meet the CHIPRA legislative deadline of January 1, 2010.
Multiple ongoing opportunities for public input were provided as part of this process. In June 2009, AHRQ, in close collaboration with CMS, established a Web site to provide information on its role in CHIPRA implementation, along with an email address through which the public could comment on the process. Both SNAC meetings were open to the public and provided opportunities on each day for anyone to make formal public comments. A further opportunity for public comment came during the July 24, 2009, NAC meeting, at which the SNAC Co-chairs presented the process used at, and the results of, the July 22-23, 2009, SNAC meeting.5 In addition, SNAC Co-chair Dr. Schiff arranged a conference call for members of the Medicaid Medical Directors Learning Network (MMDLN) to seek input on the measure identification and nomination process. Several members of the MMDLN responded by nominating children's health care quality measures in use by their States for consideration for the initial core measure set. Finally, on September 30, 2009, CMS led a listening session for Medicaid and CHIP officials to provide an opportunity for comment on the initial recommended core measure set.
Those making public comments through these mechanisms included individual health care practitioners, additional Medicaid and CHIP programs, representatives of industry groups, child and family advocates, and members of the CHIPRA Federal Quality Workgroup.
First SNAC Meeting July 22-23, 2009
The first SNAC meeting was held July 22-23, 2009, in Washington, DC. The meeting was open to the public. This section describes preparation for the first SNAC meeting, the focus of SNAC discussions, presentations to the SNAC, refinements to methodology made during the meeting, and the identification of a preliminary group of measures to further consider for inclusion in the final core set, as well as needs for additional information and work.
AHRQ and CMS staff and the subcommittee Co-chairs began conferring prior to the first scheduled SNAC meeting. Seventy-seven measures in use by Medicaid and CHIP programs were identified by AHRQ staff with the assistance of CMS and a process to initially evaluate those measures was agreed upon by AHRQ and CMS.
Prior to the July meeting, SNAC Co-chairs, working through AHRQ, provided subcommittee members with standard definitions and criteria recommended for use in evaluating the validity and feasibility of quality measures. SNAC members were asked to apply these evaluation criteria to the 77 measures using the RAND Corporation's modified Delphi process.1 Previous work has shown this method of evaluating quality measures to be reliable and to have content, construct, and predictive validity in other applications.2-4
The modified Delphi process involved individual SNAC members scoring the initially identified set of Medicaid and CHIP quality measures for validity and feasibility on a 9-point scale (with 1 denoting that the measure was not valid or feasible and 9 indicating that it was definitely valid and feasible). Objective information (e.g., on the underlying scientific soundness of the measures) related to both measure validity and feasibility was provided to the extent it was available. However, some measures were scored in this round without adequate identification of numerators, denominators, or measure specifications, even though measure specifications are essential for evaluating feasibility. Instructions to the SNAC for Delphi I noted that scores for validity could be guided by professional consensus when published evidence to support the measure's validity was insufficient.
The RAND modified Delphi method sets cut-points for passing scores on validity and feasibility. The passing threshold for validity (a median score of 7-9 on the 9-point scale) is more stringent than the threshold for feasibility (a median score of 4-9). The rationale for this difference is that, for validity, either the evidence exists to support the measure or it does not, so relatively objective information is available to make the assessment. Feasibility is a more subjective assessment than validity: some Medicaid or CHIP programs may find a measure quite feasible to implement (due to their infrastructure, amount of available funding, etc.) while others will not.
Median scores and a display of the distribution of scores across voting members were calculated and prepared for SNAC review by AHRQ staff prior to the July meeting. The median scores summarized the individual scores of SNAC members on these two domains (i.e., validity and feasibility). The median scores and the display of distribution across voting SNAC members were presented at the July SNAC meeting and used to determine whether candidate measures would be discussed further. For the purposes of the July meeting, measures with a median validity score of 6 or 7 and a median feasibility score of ≥4 were discussed by the SNAC. Measures with a validity score of 6 or 7 were selected for discussion as these measures were deemed controversial and in need of further consideration by the group.
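The round-one summary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure AHRQ staff actually ran: the function name, the dictionary keys, and the example scores are all hypothetical, and "6 or 7" is read here as the closed range from 6 to 7. The sketch also omits the spread of scores across members, which the SNAC reviewed alongside the medians.

```python
from statistics import median

VALIDITY_PASS_MIN = 7      # a median validity score of 7-9 passes
FEASIBILITY_PASS_MIN = 4   # a median feasibility score of 4-9 passes


def summarize_round_one(validity_scores, feasibility_scores):
    """Return median scores and round-one flags for one measure."""
    v = median(validity_scores)
    f = median(feasibility_scores)
    return {
        "median_validity": v,
        "median_feasibility": f,
        "passes": v >= VALIDITY_PASS_MIN and f >= FEASIBILITY_PASS_MIN,
        # Assumption: "6 or 7" is treated as the closed range 6..7, so a
        # median of 6.5 (possible with an even number of raters) counts.
        "discuss": 6 <= v <= 7 and f >= FEASIBILITY_PASS_MIN,
    }


# Hypothetical scores from seven raters for one invented measure.
example = summarize_round_one([5, 6, 6, 6, 7, 8, 9], [4, 5, 5, 6, 7, 7, 8])
print(example)
```

Note that, as the text states the rules, a measure with a median validity score of exactly 7 both passes and falls in the range selected for discussion; the sketch keeps that overlap rather than resolving it.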
SNAC Meeting July 22-23, 2009
The SNAC spent most of the first day reviewing the criteria for validity and feasibility; identifying criteria for importance; and discussing the measures that were deemed "controversial" after Delphi Round 1, i.e., measures with a median validity score of 6 or 7, median feasibility of ≥4, and a relatively wide distribution of scores across members, suggesting little consensus among the group. Forty-five of 77 measures met these criteria. On the second day, the SNAC heard presentations by experts commissioned by AHRQ and CMS to provide further input into the overall process.
Additional input and discussion: Presentations to SNAC and the participating public
At the July 22-23, 2009, SNAC meeting, members and the public present at the meeting heard several presentations and engaged in discussions with presenters. Presentations by the AHRQ Director, Carolyn Clancy; the Director of CMS's Center for Medicaid and State Operations (CMSO), Cindy Mann; and the Director of the Division of Evaluation, Quality and Health Outcomes in CMSO, Barbara Dailey, set the stage for the meeting. The AHRQ Director provided the charge to the SNAC, and the CMSO Director expressed a strong desire for the SNAC to recommend a grounded and parsimonious core set that could be implemented voluntarily by State programs, health plans, or provider groups.6,7 Representatives of the National Quality Forum, the National Committee for Quality Assurance, and the Center for Health Care Strategies spoke on the challenges of implementing health care quality measures for children.
In addition, several experts who had been asked to write federally supported white papers on specific aspects of measurement in the legislation presented their early thoughts about their work. These experts addressed their charges of conceptualizing and assessing the validity, feasibility, and importance of measures of mental and behavioral health care, family experiences of care, duration of enrollment and coverage, availability of services, and the "most integrated health care setting." AHRQ and CMS also asked that papers be prepared analyzing data sets of the National Academy for State Health Policy, Health Management Associates, and the Child and Adolescent Health Measurement Initiative (CAHMI) database from the 2007 National Survey of Children's Health. An environmental scan of Medicaid and CHIP Web sites had also been commissioned to identify children's health care quality measures that may have been missed in the first effort by AHRQ staff and CMS. Not all authors could participate in the July SNAC meeting. All presentations are included in the transcript of the July meeting posted at http://www.ahrq.gov/chipra/.
Refinements to methodology
During the July meeting, the SNAC agreed upon refinements to the methodology to be used for future rounds of the modified Delphi process. Importance was added as a third domain to consider when evaluating potential measures, in addition to validity and feasibility. The SNAC worked to establish consensus on the criteria to use to rank the importance of measures under consideration. To be considered important, a measure had to meet at least some of the following criteria. The criteria are listed in order of decreasing weight as determined through a voting process by SNAC members on July 23, 2009:
- The measure should be actionable. State Medicaid and CHIP programs, managed care plans, and relevant health care organizations should have the ability to improve their performance on the measure with implementation of quality improvement efforts.
- The cost to the nation for the area of care addressed by the measure should be substantial.
- Health care systems should clearly be accountable for the quality problem assessed by the measure.
- The extent of the quality problem addressed by the measure should be substantial.
- There should be documented variation in performance on the measure.
- The measure should be representative of a class of quality problems, i.e., it should be a "sentinel measure" of Quality of Care (QOC) provided for preventive care, mental health care, or dental care, etc.
- The measure should assess an aspect of health care where there are known disparities.
- The measure should contribute to a final core set that represents a balanced portfolio of measures and is consistent with the intent of the legislation.
- Improving performance on measures included in the core set should have the potential to transform care for our nation's children.
As with feasibility, the threshold for a passing score on importance was set at ≥4 on the 9-point scale, because importance was felt to be the most subjective of the three evaluation domains.
The SNAC members were asked to score each of the measures that had passed the first round of Delphi scoring for validity and feasibility on the new criterion of importance. AHRQ staff then summarized these scores using the median value. Measures were considered to pass the importance criterion if the median score was ≥4.
The refinement process further involved reviewing, discussing and reaching consensus on criteria the SNAC would use to evaluate the validity and feasibility (including reliability) of candidate measures that would be considered for potential inclusion in the recommended core set.
Other steps and decisions
The SNAC's discussion of controversial measures resulted in the recommendation that further information related to measure validity, feasibility and importance would be needed prior to further consideration of these controversial measures. The SNAC asked AHRQ staff to obtain that information.
During their July deliberations, the SNAC also determined that a call for nominations of additional pediatric quality measures in use (either within or outside of the Medicaid and CHIP programs) should be used to identify a larger set of measures to consider for the final core set.
SNAC members expressed a strong desire to recommend a grounded and parsimonious core set of measures that could be implemented voluntarily by State programs, health plans, and provider groups, and agreed on a target number of no more than 25 measures. The SNAC acknowledged that such a core set would be incomplete, but efforts would be made to balance the set to accomplish the legislative goals and the goals articulated in the SNAC discussion of measure importance. The SNAC agreed to bring forth to the NAC's attention measures not accepted into the core set and aspects of child health for which current measures do not exist.
By the end of the July SNAC meeting, SNAC members had identified a preliminary list of 24 measures that had clearly passed criteria for validity and feasibility in the first round of Delphi scoring and also passed scoring for importance using the criteria agreed to by the SNAC at the July meeting. This preliminary list of measures is available at the AHRQ CHIPRA Web site as part of the SNAC Co-chairs presentation to the NAC on July 24 (see below).5 The Co-chairs made clear that this preliminary group of measures would be subject to further research by the AHRQ staff as needed and included in the second round of Delphi scoring prior to the September SNAC meeting. In addition, SNAC members were invited to nominate additional measures for consideration.
First SNAC Report to the NAC
The SNAC Co-chairs reported to the NAC immediately after the July meeting (on July 24, 2009).5 This presentation included a review of the SNAC-refined criteria for the measure evaluation domains (validity, feasibility, and importance) as well as the preliminary list of 24 measures passing all three domains after the initial round of Delphi scoring. The SNAC report is available at the AHRQ CHIPRA Web site.6
Second SNAC meeting September 17-18, 2009
The SNAC held its second meeting on September 17-18, 2009 in Washington, DC. In addition to being open to public participation on site, the meeting was Webcast. The technology allowed for greater participation and public comment. A link to the Webcast was available.
Preparation for the Meeting
Additional Measure Nominations
Shortly after the July meeting, AHRQ staff, in collaboration with the SNAC Co-chairs, developed a measure nomination template. This template was created in order to collect a standardized set of information on all measures nominated for potential inclusion in the core set (see Appendix: Nomination Template). The nomination template was made available in early August 2009. Nominations were accepted until August 24, 2009. In addition to measure nominations by SNAC members, public nominators included members of the Medicaid Medical Directors Learning Network, the American Medical Association Physician Consortium for Performance Improvement, the National Partnership for Women and Families, and the Child and Adolescent Health Measurement Initiative on behalf of The Commonwealth Fund. Additional nominations were obtained through e-mail to the AHRQ public comment e-mail address. CHIPRA Federal Quality Workgroup nominations also came from CMS and HRSA.
In addition to all newly nominated measures, each measure that either 1) passed Delphi round one or 2) was considered controversial by the SNAC during their first meeting in July was entered into the measure template, with required information, by AHRQ staff. Authors of the CHIPRA-commissioned papers also recommended measures for consideration and additional sources of data for quality measurement based on their works in progress. Measures recommended by the contractors included a measure of medical home (for "most integrated health care setting") using items from the Healthcare Effectiveness Data and Information Set (HEDIS) Consumer Assessment of Healthcare Providers and Systems (CAHPS®) surveys; a preliminary measure of availability also using items from the HEDIS CAHPS®; and measures of duration of enrollment based on work done by researchers primarily using Medicaid and CHIP enrollment data. In addition, one of the works in progress focused on the type of data (e.g., race/ethnicity) and measures that could be obtained from the Medicaid Statistical Information System (MSIS).
At a minimum, nominators were asked to identify the measure numerator and denominator; measure specifications; and current use of the measure. Substantial effort was put into obtaining all of the information requested in the template for every measure under consideration. The nominators entered information into the nomination template. Each template was then supplemented with additional information where necessary by AHRQ staff and the SNAC Co-chairs. Through this work, a standardized set of information was made available for almost all measures for consideration by the SNAC members during their second round of Delphi scoring. One-page summary sheets that abstracted information from the measure nomination templates were provided for each measure under consideration (see Appendix: One Page Summary Template).
By mid-September 2009, the SNAC had 121 measures to consider during a second modified Delphi process.
Delphi II scoring by the SNAC
In a second modified Delphi scoring round conducted prior to the September meeting, which added the SNAC-identified criteria for importance (see Appendix: Instructions for Delphi Round 2 for AHRQ SNAC Members), SNAC members selected 65 of the 121 measures as meeting criteria for validity, feasibility, and importance. As in Delphi I, SNAC members were instructed to use professional consensus on the underlying scientific soundness of the measures in cases of insufficient published evidence.
SNAC September meeting deliberations
As at the first SNAC meeting, members first heard opening remarks from the Directors of AHRQ and CMSO, and an overview of the meeting agenda and process.8 Unlike the first meeting, there were no invited presentations (other than during public comment periods on Days 1 and 2). Due to time constraints and the need to identify for NAC consideration a reasonable core set of measures near the SNAC's target number of 25, the initial plan was to discuss and consider only the 65 measures that passed the second modified Delphi scoring process as candidates for the core set. However, initial discussions at the September 17-18, 2009, SNAC meeting resulted in adding back 5 measures that did not strictly pass the second Delphi round (i.e., those with high median feasibility and importance scores [≥7] and median validity scores of 6 or 6.5 rather than the cutoff of 7) to the list of measures to be discussed and voted on during the meeting. Thus, 70 of the 121 measures scored in Delphi round two were discussed and considered for the core set. Table B of the Appendix provides a list of nominated measures that did not meet the criteria threshold for validity during the Delphi II scoring process and were not discussed at the September meeting.
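The Delphi II outcome for a single measure, including the add-back exception, can be expressed as a small decision rule. The Python sketch below is illustrative only: the function name and return labels are hypothetical, and it assumes that round two kept the thresholds described earlier (median validity of at least 7, and median feasibility and importance of at least 4).

```python
def classify_round_two(validity, feasibility, importance):
    """Classify one measure from its three median Delphi II scores."""
    # Assumed passing thresholds: validity >= 7, feasibility >= 4,
    # importance >= 4 (carried over from the earlier description).
    if validity >= 7 and feasibility >= 4 and importance >= 4:
        return "passed"
    # Add-back exception: validity just under the cutoff (6 or 6.5)
    # with high (>= 7) median feasibility and importance scores.
    if validity in (6, 6.5) and feasibility >= 7 and importance >= 7:
        return "added back"
    return "not discussed"


print(classify_round_two(6.5, 8, 7))
```

Under these assumptions, the 65 measures that passed and the 5 that were added back correspond to the first two labels, giving the 70 measures discussed at the meeting.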
Electronic voting process
Throughout the one-and-a-half-day meeting in September, SNAC members made extensive use of confidential electronic voting. This method was chosen because in small groups some members may dominate a discussion, leading to group decisions that do not reflect the true sense of the group membership.5 Private electronic voting made it more likely that the process would capture the candid individual preferences of members and combine them into a consensus of the SNAC.
Balancing measures across multiple domains
The SNAC reviewed and prioritized measures based on several characteristics pertaining to legislative and feasibility criteria, including: data source (administrative, medical record, HIT, survey); site of care (primary care, specialty care, inpatient, emergency, mental health, substance abuse, dental); measure type (outcome, process, structural); care continuum (screening, prevention, diagnosis, treatment, care coordination); accountable entity (state program, health plan, provider); child ages to which the measure applied; and availability of data to report disparities.
Elimination of multiple overlapping measures, merging of some measures within specific categories, and voting
On day one of the meeting, SNAC members engaged in detailed discussions of measures felt to have substantial overlap. For example, multiple measures pertaining to premature birth passed the criteria for validity, feasibility, and importance, as did multiple dental measures. After discussions were completed, a series of votes was conducted, which resulted in the elimination of multiple measures and the merging of some measures within a given category. For example, three separate well-child care visit (WCV) measures that apply to different age groups were combined into one measure for voting purposes. Similarly, multiple measures of premature birth were eliminated, narrowing measures in this area to one measure of low birth weight. Measures in each category (e.g., prevention/health promotion, care of children with chronic disease) were rank-ordered within the category, and the lowest scoring measures were eliminated from further consideration. This process resulted in 31 measures for final consideration on the second day of the meeting.
Getting to 25 measures to recommend to NAC
On day two of the meeting, three rounds of voting were conducted in succession. Across the three rounds, each SNAC member could vote for a total of 20 of the 31 measures that remained. In round one, SNAC members individually voted for their top 10 measures; in round two, their next 5 measures; and in round three, their final 5 measure choices. Measures voted for in the first round received 3 points per vote, measures voted for in the second round received 2 points per vote, and measures voted for in the third round received 1 point per vote. A priority score was then calculated for each measure, representing the total points assigned to that measure by SNAC members after the three rounds of voting. The final rank order of the measures based on priority scores was examined by the SNAC to assess how the acceptance of various cut-points (i.e., 10, 15, 20, or 25 total measures) would fulfill the goal of arriving at a grounded, parsimonious, balanced core set of measures. The SNAC voted to recommend the top 25 measures on the list (Table 1). Table A in the Appendix provides the list of measures that met criteria for validity, feasibility, and importance during the Delphi II scoring process but were not ultimately recommended for inclusion in the core set.
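The priority score described above is a weighted vote count, which can be sketched as follows. The Python below is a minimal illustration, not the tool the SNAC used: the function names, the ballot layout, and the measure labels are hypothetical, the example ballots are shortened to one or two picks per round, and breaking ties alphabetically is an assumption because the text does not say how ties were handled.

```python
from collections import Counter

ROUND_POINTS = {1: 3, 2: 2, 3: 1}  # points per vote in rounds one to three


def priority_scores(ballots):
    """Sum points per measure across members' three-round ballots.

    Each ballot maps a round number (1, 2, or 3) to the list of
    measures that one member chose in that round.
    """
    scores = Counter()
    for ballot in ballots:
        for round_number, measures in ballot.items():
            for measure in measures:
                scores[measure] += ROUND_POINTS[round_number]
    return scores


def rank_measures(scores):
    """Order measures by descending score (assumed tie-break: by name)."""
    return sorted(scores, key=lambda m: (-scores[m], m))


# Hypothetical ballots from two members over invented measure labels.
ballots = [
    {1: ["A", "B"], 2: ["C"], 3: ["D"]},
    {1: ["B", "C"], 2: ["A"], 3: ["E"]},
]
scores = priority_scores(ballots)
ranking = rank_measures(scores)
print(ranking)
```

A cut-point such as 10, 15, 20, or 25 then amounts to taking the first entries of the ranked list, for example `ranking[:25]`.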
The SNAC did not recommend that the measures in Table 1 be implemented "as is." Rather, the group emphasized that the measure denominators should be re-specified, insofar as needed, so that the measures can be made feasible for use across all Medicaid and CHIP programs, providers, consumers, and intermediaries (e.g., health plans contracting with State Medicaid programs). For example, HEDIS CAHPS (FEC 1 and 5) as currently specified is used primarily by Medicaid Managed Care health plans that report to the National Committee for Quality Assurance (NCQA). The SNAC recommended that in the future the CAHPS measures should be used by all Medicaid and CHIP programs so that family experiences of care across a broader spectrum of covered children can be understood, compared, and, when needed, acted upon.
Additionally, the SNAC agreed that it is critical to identify the entities accountable for each level of service delivery (e.g., providing the service, facilitating the service) and for quality measure data reporting. For all measures, a common duration of enrollment calculation is essential to make valid and reliable comparisons across institutions, programs, and States. In implementing these measures, evaluators might explore the use of standard "person-enrollment-month" methods for making these calculations, rather than simply dropping enrollees who do not remain enrolled continuously for an entire assessment period.
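The contrast between a person-enrollment-month calculation and a continuous-enrollment restriction can be shown with a toy example. The Python sketch below reflects one common reading of the approach, in which the denominator is the total number of enrolled months; the SNAC did not specify a formula, and the function names, field names, 12-month period, and data are all hypothetical.

```python
def rate_per_1000_member_months(children):
    """Events per 1,000 enrolled months, counting every enrollee."""
    months = sum(c["months_enrolled"] for c in children)
    events = sum(c["events"] for c in children)
    return 1000 * events / months


def continuously_enrolled(children, period_months=12):
    """Keep only children enrolled for the whole assessment period."""
    return [c for c in children if c["months_enrolled"] >= period_months]


# Invented enrollment records for three children in a 12-month period.
children = [
    {"months_enrolled": 12, "events": 2},
    {"months_enrolled": 6, "events": 1},
    {"months_enrolled": 2, "events": 0},
]
all_rate = rate_per_1000_member_months(children)  # 3 events over 20 months
kept = continuously_enrolled(children)
print(all_rate, len(kept))
```

In this toy data, the continuous-enrollment restriction keeps only one of the three children, while the person-month denominator retains all 20 enrolled months.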
SNAC members also emphasized that further attention to improving the capacity of measures and datasets to assess disparities is needed. Few of the proposed measures are used, at least at present, to report data that distinguish care quality by race, ethnicity, tribe, socioeconomic status, or special health care need status among children.
Finally, the SNAC recognized the critical importance of several topics that are essentially missing in the recommended set of quality assessment measures for Medicaid and CHIP purposes; they stressed the need for developing valid and feasible health care quality measures to fill these gaps. These included measures of specialty care, inpatient care, substance abuse care, mental health treatment, measures that link mainstream clinical care with other services that children receive (i.e., coordination of care), health outcome measures, and measures of the medical home.