July 22, 2009: Morning Session (continued)
Transcript: First Meeting of the Subcommittee on Quality Measures for Children in Medicaid and Children's Health Insurance Programs
Jeffrey Schiff: Let's just take—if you will put [indiscernible] up for people who want to say something now, and then let's make it quick because we have a lot to do. And the other thing about this list—I just want to be really clear—is that this is not the end of this list. This is—
Rita Mangione-Smith: [Cross-talking] on E-mail?
Jeffrey Schiff: And this is the E-mail. I think anytime during this meeting if something comes up that has to go on there, we will keep this list up. But if we can keep people's comments short so we can get to our discussion about what we mean about validity and feasibility, that will be great.
Female Voice: Real quick question more than anything else, which is: some of the measures we have here can be bundled into an indicator, so it is more about indicator construction from measures. Are we going to talk about that at some point? Because there are 10 different parts of family-centered care here, but you could do a family-centered care measure.
Rita Mangione-Smith: Right. And in fact, our intention, like with the HEDIS/CAHPS® [Healthcare Effectiveness Data and Information Set/Consumer Assessment of Healthcare Providers & Systems] measures, was to have people rank validity and feasibility for the composite, not for all the items. And we successfully just listed the composite for the children without chronic conditions, but then when we got to children with special health care needs, we did it at the item level, which is wrong. So when we rescore—because that is one of the ones we are going to have to rescore—we will do it at the composite level.
Female Voice: Thanks. Yes.
Jeffrey Schiff: Okay, Paul.
Paul Melinkovich: I have sort of a clarification. There are some existing measures around assessing nutritional status related to obesity, but there is clearly a measure around nutritional failure that is well defined and used in several conditions—cystic fibrosis (CF), inflammatory bowel disease, and eating disorders in adolescents. That has a clear action piece to it tied to improved outcomes. So is there an opportunity then to expand some of the existing measures to look at nutritional status across the whole spectrum rather than just, say, for obesity?
Female Voice: Real quick—just a point on Carolyn's point when she was saying that if you went on measuring central line infections, yes, that is two FTEs [full-time equivalents] it would cost the hospital. The reality is there are other pushes, and the hospitals are already using these dollars, so these would not be de novo dollars. Come January 2010, the Joint Commission is requiring reporting of central line-associated bloodstream infections for every unit within a hospital. This is the logical synergy point to marry that to Medicaid, so the States and the hospitals are going to be putting these dollars in anyhow.
Glenn Flores: I sent an E-mail with a whole list of things so I will not go into detail, but I just wanted to bring them up again to make sure they are at least in the parking lot.
First of all was collection of self-identified race and ethnicity, primary language spoken at home, and parental English proficiency, and this actually is already done by some health plans and States. Obviously, there is a whole world of racial and ethnic disparities, and disparities for other groups, that we have not talked about. I mentioned some of them; we certainly can go into that.
There are some interpreter and language services standards that are in use that I think should be on both sides of the equation there. We talked about inpatient quality measures, but there are also emergency department (ED) measures that one could come up with; some ankle film algorithms and some head CT algorithms and asthma protocols that I think are fairly well established in some health plans already.
We did not talk a lot about subspecialty pediatric measures such as dialysis rates, time from diagnosis to treatment of ALL (acute lymphoblastic leukemia), or the quality of sickle cell care. I mean there is a whole list, again, that I mentioned in the E-mail. Again, somebody also mentioned patient safety processes and outcomes; I think we really want to hone in on that, particularly with a lot of people from AHRQ here. Also rates of providing eyeglasses, hearing aids, and other supportive devices to those who need them, because those are already measures—a lot of access measures that we have not talked about.
And then in terms of patient-centered variables, unmet need I think is also a very big issue in terms of asking the parents and the families about that. And I would add to Cathy's point that you can also use CAHPS® in a lot of plans, and States are using CAHPS® to look at everything from satisfaction with care to perceived racism and discrimination.
Jeffrey Schiff: Glenn, just one of the things we have talked about is the collection of race and ethnicity data as sort of one of the ways to filter and report all of these measures, so it is sort of—I think we, as we have talked about this, see there is an overarching part, but how it gets collected is obviously something to talk more about.
Glenn Flores: Sure.
Male Voice: I think my comment builds on Lisa's, and it strikes me that we have a number of measures already on the list that are geared to either specific services or specific conditions, but I just wonder whether or not, in the measures that need development, there is any consideration of trying to capture the notion of a medical home. And we in dentistry would add a dental home—in some way to sort of get at that as a development, something to work toward, and perhaps come up with indices and break out these various elements.
Jeffrey Schiff: So you are talking about a composite measure around that?
Female Voice: [inaudible]
Jeffrey Schiff: Right.
Male Voice: It would likely be—probably a composite measure, but I just see that we have on the list already a lot of the granular sort of pieces that might go into it, to try to capture that concept.
Denise Dougherty: Just a point, and we should have—tomorrow, you are going to hear about Charlie Homer and his colleagues, who are looking at how you would think about and define, and whether there are measures of, the most integrated health care setting. So just as a reminder, the labels in the left-hand column here came from the legislation, and there are some that we know are a little bit—we are not quite sure what they mean, and maybe Marina can tell us.
Female Voice: [inaudible]
Denise Dougherty:—do they mean the medical home when they said the most integrated health care setting, but we are looking at what that should mean and what measures are available.
Female Voice: Medical home has to do with the American Academy of Pediatrics (AAP) [inaudible]
Rita Mangione-Smith: So I'm going to force us to move on to the validity and feasibility part because we are kind of running about a half hour behind now for where we wanted to be, but this has been great.
Female Voice: Can I have one quick point?
Female Voice: Sure.
Female Voice: One quick point, the direction of [inaudible] or continuity of [inaudible].
Rita Mangione-Smith: It is all right here.
Female Voice: Yes [inaudible]
Denise Dougherty: And Jenny Kenney is going to talk about that tomorrow. She is doing a paper on it.
Male Voice: And denial of renewal of coverage, too, I would add—
Rita Mangione-Smith: Any more measures that get written, you guys are going to have to come here at lunch and [indiscernible].
Jeffrey Schiff: I'm just going to make an advisory thing. If you do come during lunch and you put something on there, we need to make sure it is clear or else let us know so—
Rita Mangione-Smith: Please write neatly on that, be specific, okay? And maybe put your initials next to it so that if we need to, we can come to it and ask for clarification. That will be great.
Okay. So I'm going to move us to talking about validity and feasibility. You all got that Delphi description [audio glitch]. Just by way of a quick review, in that document we defined validity as either being supported by the current evidence base or, where adequate evidence is lacking, being supported by expert consensus opinion. And to be a little bit more specific about that in terms of thinking about structure, process, and outcomes: what we are saying is that a given structure of care—say, for instance, a clinical decision support system—has been shown to positively affect a process, here that it increases appropriate influenza [indiscernible] in your clinical population. So that would be a structure-process link that has been shown. Previous studies have been done, or maybe there is just a belief—you know, actually a consensus—that if you enhanced your decision support systems, you would in fact increase adherence to giving flu vaccinations.
Or there is a link between structure and outcomes of care. The classic one I like to think about is the nice studies that have been done showing that increased continuity of care in the outpatient setting leads to decreased ambulatory care sensitive hospitalizations (ACSH). So the ACSH is the outcome, the process being—I'm sorry—the structure being that you would have an appointment system within your clinic that really enhances a child seeing the same provider over and over again, so continuity of care. Or, finally, there is evidence that the process is linked to outcome: if I give a child with persistent asthma an inhaled steroid, I will improve their outcome. They will have fewer ED visits, fewer hospitalizations, fewer missed school days, fewer missed work days for parents, and hopefully better asthma-related quality of life.
So that is what we are talking about in terms of what the evidence is for the link for the measure that you are looking at. And most of those measures—I think all of the measures that we gave you to look at—were process measures, with very few exceptions. So when we think about validity, we need to think hard: is there evidence to support the idea that this process or structure of care does actually, validly impact an outcome in a positive way?
The other piece of validity that is really important is does the health care system, the provider, the clinic, the hospital, the health plan, or Medicaid actually have control over that piece or that measure of care? So if they have no control over changing it, then it is really not a valid measure at all, so it has to be within their realm of control. Sometimes, you get pushback about things like the PedsQL, "Oh I'm not the only one who affects the outcome of the quality of life measure. There are other things that influence a child's quality of life." That certainly is true but I think we still need to push ourselves to the outcome level; Cathy raised that.
So that is kind of how we presented validity to you. That document we sent you is an extraction from a document at RAND that basically goes over, when we ask panels to do a Delphi, how we ask them to think about validity. And as you know, a score of 1 to 3 on that 9-point scale says not valid—does not meet those criteria. A score of 4 to 6 says I'm not certain, and those are probably the ones that are based on expert consensus opinion, right? And then there are the ones that are the 7-to-9 slam dunk: there is great evidence that this is a process of care, or an outcome, or a structure that clearly is valid as a measure of quality.
So what I would like us to do is—I just want to open up the discussion among you right now. I want us to agree on sort of what we are going to say are our criteria for validity. If you agree with what I just said, great, but I'm sure you are going to have some additional things to put in there that you want to think about. And a question that we thought through as we were going through our pre-work process before we sent you everything was: do we only include in this core set measures that are based on solid evidence, so in that 7-to-9 range—and I will not tell [indiscernible] about that—or do we also consider what we decided to call evidence-informed measures?
So there are a lot of expert practice guidelines. If you look at what the level of evidence is supporting many of the statements from those guidelines, it is less than randomized controlled trials; it is way less than that. A lot of times it is level "C" evidence, which is expert consensus based on observational studies and, sometimes, not even based on observational studies. So we as a group need to agree for this core set of measures what we are going to allow in or not.
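The 9-point Delphi scale described above divides panel ratings into three validity bands (1 to 3 not valid, 4 to 6 uncertain, 7 to 9 valid). A minimal sketch of that banding follows, assuming a simple median-based summary of panel ratings; the real RAND/UCLA-style process also examines panel disagreement before accepting a band, which this sketch omits.

```python
def delphi_validity_category(ratings):
    """Classify one candidate measure from a panel's 1-9 validity ratings.

    Simplified sketch: uses only the panel median. The actual RAND/UCLA
    appropriateness method also applies a disagreement test before
    accepting the median band.
    """
    ordered = sorted(ratings)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    if median >= 7:
        return "valid (supported by evidence or strong consensus)"
    if median >= 4:
        return "uncertain (expert opinion only / equivocal)"
    return "not valid (does not meet criteria)"


# Example: a nine-member panel rating one candidate measure
print(delphi_validity_category([8, 7, 9, 7, 8, 6, 7, 8, 9]))  # -> valid band
```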
Female Voice: Back to one of the values that Jeff put out on the table—transparency—I think it might be helpful that when we send recommendations forward that maybe do not have the most solid evidence, we are transparent, saying we still think this is really important but maybe the evidence base is not quite as strong as for some of the other measures. Because, again, we are still in a voluntary Medicaid/CHIP world, and that may be a point of information for States as they are weighing, with the limited resources they have, which measures they actually invest in.
Jeffrey Schiff: This particular issue was the genesis of that comment so I think that is right on target. I think the other comment about this is perhaps having these measures out there with less than perfect evidence will create the mechanism to develop better evidence for them and I think that is not the goal of this group, but if it is an appropriate byproduct that would be great.
Rita Mangione-Smith: If we start routinely collecting these measures, we build our ability to say, okay, well, if you give care this way, what does that do to the PedsQL? Does it get better? So I think it is a really nice thing to be putting that data out there for the future. I'm sorry [inaudible] Lynn.
Lynn Olson: I also appreciated the section that was added here on the importance criteria, which I think, given the level of evidence on some of the pediatric issues, will really be important here; the well-child visit is a key example. And I guess I would also suggest that another thing to add here, in addition to the issue of the prevalence of these conditions among children, is the question of to what degree we know what difference getting insurance, or having insurance, makes. And there, we can see where it can make big differences. Again, well-child care would be a great example, because we know that there are just tremendous differences between the uninsured kids and the publicly and privately insured kids on that well-child visit measure. So I guess I'm asking: to what degree are we including this importance criterion? And then, how are we also defining the importance criteria?
Rita Mangione-Smith: So tomorrow afternoon, that is all we are doing. So we are going to do this exercise first, provide it our—
Lynn Olson: Because they all seem related—
Rita Mangione-Smith:—importance criteria, then we are going to make you all do a Delphi right here tomorrow on importance for the measures that have made it through the validity and feasibility filter. So we are making this happen first; we still have a huge list of measures, a lot of them made it, and now we need to look through and then grade them on importance, based on criteria that we as a group agree are reasonable importance criteria.
Jeffrey Schiff: So just to reiterate what we said this morning, if something does not make it as valid or feasible we do not think we should spend time grading it as important. There may be a measure that needs to be developed for that area, but if we do not have any measure that is valid or feasible, there is no point in doing importance for that, but we will talk more about importance criteria tomorrow.
Rita Mangione-Smith: Okay. And I think [inaudible]
Female Voice: Just two things real quick. When you were talking about validity, you deferred from giving us your opinion on how much a process measure needs to be linked to the outcomes part of it, and I would put my vote, at least for the beginning set, that we need to be tightly linked to outcomes. I think we have a golden opportunity, and we do not need to repeat the mistakes of 20 years of adult health care quality measurement, of measuring things that make zero difference for the outcome. The other thing, just to be clear: is cost on the table under importance, or does that need its own column? When we talk about potential cost savings of improvement in quality, is it in that bucket, in importance?
Rita Mangione-Smith: It is definitely open for discussion tomorrow although that is in the importance criteria.
Lisa Simpson: I want to encourage us to always keep in mind a reasonably succinct and clear definition of quality of care—and certainly I lean towards outcomes and process and the link between them—and then recognize that we still are really, at some level, talking about measurement. And it might be helpful to keep in mind what people who work in measurement mean when they say validity, because it harkens back to some of what we were hearing earlier, that we do not know a lot about the specifications of a lot of these measures. But to the extent that validity means a measure assesses what it purports to measure, that is just an incredibly important definition to have in our minds with respect to validity.
And I will reiterate that, in measurement terms, you cannot have a valid measure if you do not have a reliable one. It has to be reproducible; to that extent, it has to be some sort of a measure that has some internal consistency. There is a whole class of criteria like that that relate to reliability, and they put an absolute ceiling on validity.
And the third thing that I guess I would say is that I come down on the side that, if you cannot give me some halfway decent published evidence about validity, I'm not terribly interested in it as a candidate for getting into a core measure set. We might find some that we think ought to be valid and sound great, and there is no evidence about them; then that goes into perhaps one that falls into the "needs more development" category—we need more empirical evidence about it. But consensus development or consensus-based expert things fall into the old kind of what we used to call GOBSAT, which is "good old boys sitting around the table" coming up with measures, and—could we please move beyond that; to your point, that is where we were with adult measures 10 or 25 years ago. I think we need to get beyond that and set as a criterion that we want some published evidence about these measures to go into the core set.
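As background for the reliability point above: the claim that reliability caps validity corresponds to the standard attenuation bound from classical test theory. A minimal statement of it, offered only as background and not as anything presented at the meeting:

```latex
% Classical test theory: the validity coefficient r_xy (correlation of a
% measure x with a criterion y) is bounded by the measure's reliability r_xx
% and the criterion's reliability r_yy.
r_{xy} \;\le\; \sqrt{r_{xx}\, r_{yy}} \;\le\; \sqrt{r_{xx}}
```

In words, an unreliable measure cannot correlate strongly with anything, including the outcome it is meant to capture, which is why reliability is a precondition for validity.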
Rita Mangione-Smith: That is setting the bar [indiscernible]—not that I say that is wrong, but I am just wondering if there is any reaction to that because [cross-talking]. If you look at that New England Journal publication we did a couple of years ago and you look at the evidence table for the indicators, it is embarrassingly low [inaudible]. I mean 20 percent of those measures have level-one evidence, so I'm just throwing that out there.
Jeffrey Schiff: Okay. Let's move to Paul.
Paul Miles: Well, I'll just make a comment. I think there are really, really good, valid quality improvement (QI) projects that have good data supporting measures that resulted in outcomes, but it is sometimes difficult to get published in journals that are only looking for randomized trials. There are certain things in asthma, some of the work in bloodstream infections and CF, and a ton of stuff that I think I would add to validity. I would bet my child's life on some of those efforts, even though they did not come out of randomized trials and probably cannot be randomized, but I think they have to be really, really well described—some kind of consensus.
Lisa Simpson: But in my own defense, I never said randomized controlled trials. I said some kind of published evidence, and that is not necessarily just in The Lancet and JAMA, and it absolutely can be based on—as Americans would say—observational studies. The rest of the world, in terms of evidence-based practice, may dance on the head of randomized-controlled-trial pins, but Americans tend not to. We tend to be more inclusive about evidence. So—I mean, I take your point about maybe it has not exactly been published, but that evidence does not need to be randomized controlled trials, and I think we should be more inclusive with respect to how we regard evidence.
Jeffrey Schiff: Let's take these two more comments about validity, and then we should spend the last 15 minutes talking about feasibility—actually, I guess three comments.
Xavier Sevilla: I just wanted to echo Paul here. The reality is that, in children especially, there is just not a lot of published data at all on most important conditions. It is sad to say, but that is the reality. And most of the recommendations that are made are actually based on not very strong evidence. I think the important thing is what Lisa said: to actually state up front, transparently, that this measure does not have randomized controlled trials behind it, that this one has maybe consensus, and to actually state that when we publish these core measures. But I think if we just stick to the ones that have grade A evidence, we are just going to have a very short list.
Jeffrey Schiff: I do not think we are talking about grade A necessarily, but A or B maybe, it is whatever—anyhow.
Female Voice: I just do not see that we addressed the comment I made about being linked to outcome, since the majority of this is process, and I do not think—we sort of jumped to evidence, which is a related concept but is very different from how much evidence the measure has that it actually changes the outcome for the child.
Rita Mangione-Smith: And that is also a pretty high bar.
Female Voice: I know.
Rita Mangione-Smith: A lot of them [cross-talking]
Female Voice: Well, that is why we need some consensus on here; otherwise, we are doing a lot of process measures that mean nothing.
Jeffrey Schiff: So I just want to ask a question. Is that an importance criterion or is that a—?
Female Voice: I think it is validity. Yeah.
Jeffrey Schiff: Okay.
Rita Mangione-Smith: So are we going to say that we are making a statement that we will not include something in this core set [cross-talking] that process of care has been linked to positive health outcomes [cross-talking].
Female Voice: Well-child care.
Rita Mangione-Smith: It is pretty dramatic if we do that.
Lisa Simpson: But what would happen if you set criteria that said this was our ideal set of criteria? And it may be that in fact you have nearly a null set, and that we then moved away by relaxing certain criteria to get to some better ones [cross-talking]
Rita Mangione-Smith: I see us building that—a table that is very transparent about, you know—
Female Voice: Which criteria it meets?
Rita Mangione-Smith:—there is no known link to outcomes data for this measure, although we felt it was important. On the importance criteria, it may be great.
Male Voice: Yeah. I mean that is the beauty of the Delphi process is when you do not have the outcomes, it is a group of experts; I do not know if it is old boys but [cross-talking]
Rita Mangione-Smith: Middle-aged girls, how about that?
Male Voice: Yeah. But still, I mean that [cross-talking]. In all seriousness, that is the rigor of a Delphi process: you fill that vacuum with the best possible expert consensus and move from there. And hopefully, our document can also be used as a call for research, a kind of research agenda, where we have, with the [indiscernible].
Male Voice: I think this also gets back to the challenge that Carolyn gave us, that this is really about improving kids' care. And it is kind of what Marlene says: if we can in some way defend that we think the set we come up with is going to improve kids' care, that is the aim of what we are doing.
Rita Mangione-Smith: So I want to summarize so that I can get a sense that we have some consensus on what we have decided here: that we are going to be very, very transparent about the level of evidence supporting our measures; that the measure truly assesses what it purports to measure, and I think that gets into the whole specification piece—it is great to say we are going to have well-child care measures, but let's make sure that the numbers we are getting are real; I mean that got brought up quite [inaudible] earlier—that the measures have to be reliable to be valid, and I truly believe that if you have detailed specifications, then from one institution to the next you can get the same kind of information and compare on it. And that we will accept that there is some unpublished evidence from QI work going on all over the country, which some of us have experienced, that truly does act in a sense as evidence—that even if it is not randomized controlled trial (RCT) type evidence, a measure can be considered valid based on those types of studies that do not get published but truly show positive impacts. And finally, that we would love our processes to all be linked to good health outcomes, but we will recognize and be transparent about when that data is [inaudible]. Did I capture it?
Female Voice: [inaudible]
Rita Mangione-Smith: All right. Feasibility is our next one. Again, just to go over what we put forward in that document we sent you all: in terms of feasibility, we talked about the very important first step that the data to do this measure have to be available, either through the Medicaid and CHIP programs' administrative databases, through medical records data, or through survey data that are routinely collected. And that the particular information can be collected in a reliable way, so that from institution to institution we can make comparisons. And that for a measure to be feasible means it has gone [indiscernible]—working with the National Committee for Quality Assurance (NCQA), we have to be able to go from a [indiscernible] statement to an operationalized specification, and that is a huge and very difficult step. And anybody who has tried to do it knows exactly what I'm talking about.
You think it sounds like it is going to be so easy, but I will give you just a quick example—the upper respiratory infection (URI) measure for HEDIS. We went through I think 10 iterations of the detailed specifications for that measure before we got to a point where we said, yeah, we think people can use this reliably, and it will be the same thing whether I'm in Alabama, Texas, California, or Washington. It is very, very hard to do, so I think we need to think really critically about, again, what our criteria are going to be. That is tough, because for a lot of these measures we do not have specifications. So I think what we have to ask ourselves is: is that okay? And are we going to trust that if we take this core set, where specifications are missing, they are going to be developed in this way—or not? So I'm going to open the discussion up, again, same as last time, to talk about [inaudible].
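To make concrete what moving from a measure statement to an operationalized specification involves, here is a minimal, purely illustrative sketch in code. Every element—the age band, the diagnosis code placeholder, the exclusions, and the 3-day window—is a hypothetical stand-in for the kind of detail a real specification (such as the HEDIS URI measure mentioned above) has to pin down; it is not the actual HEDIS specification.

```python
from datetime import timedelta

# Illustrative skeleton of an operationalized measure specification.
# All code lists, age bands, and windows below are hypothetical placeholders,
# not the actual HEDIS URI specification.
URI_NO_ANTIBIOTIC_SPEC = {
    "denominator": {
        "ages_years": (0.25, 18),  # assumed eligible age range
        "event": "outpatient or ED visit with URI as the only diagnosis",
        "diagnosis_codes": ["<URI diagnosis code set>"],  # placeholder
        "exclusions": [
            "competing bacterial diagnosis near the visit",
            "antibiotic dispensed in the 30 days before the visit",
        ],
    },
    "numerator": {
        "criterion": "no antibiotic dispensed within the follow-up window",
        "follow_up_window": timedelta(days=3),  # assumed window
    },
    "data_sources": ["claims/encounter data", "pharmacy data"],
}


def measure_rate(numerator_count: int, denominator_count: int) -> float:
    """Reported rate = qualifying numerator events / eligible denominator events."""
    return numerator_count / denominator_count if denominator_count else 0.0


# Example: 88 of 100 eligible URI visits had no antibiotic dispensed -> 0.88
print(measure_rate(88, 100))
```

The point of the sketch is that each bullet in a specification (who counts, what counts, over what window, from which data source) has to be defined precisely enough that Alabama, Texas, California, and Washington would all compute the same number from the same records.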
Female Voice: Rita, can I actually just ask a question?
Rita Mangione-Smith: Yes.
Female Voice: Or maybe two. One is back to earlier—I'm looking at Denise—is it reasonable, when we get to some candidates or more specificity on candidate core measures, to go back to the National Quality Measures Clearinghouse and look to see whether there are specs there, because that is one of the criteria for accepting things into that clearinghouse—that you have some detailed specifications? They may not be pediatric measures, but it struck me that if we are focused also on, say, inpatient care and not just ambulatory care, for certain kinds of things where there is pretty good information already on the specifications for adult measures, can we infer something about applying or extrapolating or generalizing those measures to pediatric care? And if so, is there actual information in that clearinghouse about specifications, because that is what they require?
Denise Dougherty: There is, but not the specifications that are actually in use, which vary across States—but there are specifications.
Rita Mangione-Smith: So I'll throw a question out to the group. If we have a measure on that list where we were not given specifications but we can identify a specification from a similar measure, does the group want to say we will only include that measure if we can also include the specification or what we mean by that? I'm just throwing it out as a question.
Jeffrey Schiff: [indiscernible]
Female Voice: Well, it would help to constrain your measures. Perhaps if you could say there is really nothing out there that tells us, at the end of the day, across all States or all kinds of institutions or so forth, what the specifications for numerators and denominators are—let's put it that way—that have actually been somehow or other shown to work, or at least are out there in a quasi-evidence-based way, that would help you. Again, we have a gazillion measures. If we have to get down to 10 or 25 or something, setting this criterion might help, and then if they turn out to be null sets, you relax the criterion a little, down to something that looks better.
Denise Dougherty: Well, we sent you a table that shows how many States are using the measures. And then we also sent you specifications for the HEDIS measures.
I'm actually going to turn this question over to Barbara because it seems to me that there actually is not very much information about the non-HEDIS measures and what the valid specs are, but Barbara has connections to the States, the Medicaid programs, and could perhaps when we come down to a smaller set of those non-HEDIS measures kind of give a call to the State programs and say either how are you measuring this or is this feasible or not feasible.
I mean tomorrow you are going to hear some general things from people who have done surveys about what the State Medicaid directors and medical directors are reporting about their challenges in implementing measures. But we have not gotten to the detail of measure by measure and whether, from the State perspective, it is feasible and how they are collecting the information. We got some of that from our environmental scan, but perhaps we can directly ask the question. AHRQ cannot do that because of what it would require—we would be done; it would be 2014, and we would be hearing from OMB [Office of Management and Budget] about a question on our questionnaire. But since Barbara has a program and connections to these folks—
Rita Mangione-Smith: Can I ask a quick process question? We are a little bit behind. Will it be okay if we go 5 or 10 minutes into our lunch hour to complete the feasibility piece? That would be great, so we will be back on schedule after lunch.

