Commentary: From the Clinics to the Courts
The Role Evidence Should Play in Litigating Medical Care
By E. Haavi Morreim, University of Tennessee-Memphis
Notice of Copyright
This article was originally published in the Journal of Health Politics, Policy and Law. All rights reserved. This material may be saved for personal use only, but may not be otherwise reproduced, stored, or transmitted by any medium, print or electronic, without the explicit permission of the copyright holder. Any alteration to or republication of this material is expressly prohibited.
The Role of Evidence in Judging Health Plans
An Important Distinction
The Role of Evidence in Judging Physicians
References and Notes
Throughout this collection of essays, the Institute of Medicine and the Agency for Healthcare Research and Quality have identified an issue whose importance and nuances we are only beginning to appreciate. Although medicine has long claimed to be rooted in science, actual clinical care has often had only a limited scientific basis, resulting in inexplicably wide variations of care (Wennberg 1996). The past few years have witnessed a marked interest in evidence-based medicine (EBM), stemming from several concerns.
First, decades of double-digit health care inflation led to a recognition that enormous amounts of money have been wasted on interventions with little proven value. Health plans facing pressures to keep premiums down and profits up have moved aggressively to curb wasteful practices such as excessive hospitalizations and needless surgeries. The ruling norm under lavish insurance—"If it might help and probably won't harm, do it"—has given way to a leaner norm: "Don't do it, unless you can demonstrate its value." Under this new rule even common, widely accepted clinical routines have met coverage denials, and irate providers are scrambling to gather the kind of data necessary to document the value of their care (Morreim 1994).
Second, the more forward-looking health plans aim, not just to cut costs, but to render care more rational. In many cases physicians' clinical routines are based not so much on empirical evidence as on local habits, malpractice fears, facilities availability, or even advertising.1 And in other instances, physicians are failing to provide important, scientifically well-grounded interventions, such as those for ongoing management of chronic illnesses like diabetes, asthma, and hypertension.2
Third, some plans' rather drastic cost-cutting measures have occasioned concerns that basic quality of care is suffering. Thus, just as health plans question physicians' practices, many providers and purchasers have stridently challenged the scientific credibility of the guidelines by which plans have tried to enforce their preferred clinical practices.
Finally, beyond simply avoiding poor-quality care, many purchasers, particularly large employers, seek affirmative value for their dollars. No longer willing to pour money into care that may or may not produce good outcomes, or whose outcomes might be achieved much more efficiently, many buyers now expect health plans to demonstrate that premium dollars are well spent, including important forms of preventive care and disease management.3
Perhaps it is no mere coincidence that courts have likewise begun to demand more and better evidence from litigants demanding large sums of compensation for alleged damages. In both settings, large sums of money have been requisitioned, sometimes on no better basis than "junk science" (Huber 1991; Angell 1996). Whatever the connection, the essays in this collection explore important questions about the ways in which courts' quest for more and better evidence in other contexts may dovetail with health plans', providers', and purchasers' demands that clinical practices, and the guidelines sometimes imposed on them, reflect an adequate scientific foundation.
Mainly the questions explored in this collection are empirical: how has evidence-based medicine in fact affected clinical practice; how do judges understand and weigh scientific claims; how will courts address evidence-based medicine and cost-effectiveness analysis in coverage disputes; and so forth (see the Introduction by Clark C. Havighurst and others in this issue). This excellent foundation makes it possible to launch into some related normative (i.e., evaluative) issues that will be the focus of this essay: how should courts respond to health plans' demands for evidence-based medicine; how should courts respond to plans' efforts to balance costs against benefits; how should courts screen testimony about physicians' alleged malpractice; what kinds and amounts of evidence should courts expect from litigants in medical cases. Thus this commentary represents, not so much a reflection on the core essays, as an exploration of some of the further issues those writings have prompted.
This move from empirical description into a normative discussion is important, because we cannot determine what courts ought to do simply by examining what they have done thus far. A powerful example comes from recent litigation concerning the use of high-dose chemotherapy with autologous bone marrow transplant (HDC/ABMT) for breast cancer. For well over a decade, women with advanced breast cancer were told this treatment offered hope, even though there was never any credible science behind the claim, just some theoretical promise alongside physicians' desperate desire to do something—anything—to help their patients. Indeed, some early studies had already indicated the treatment provides no benefit over standard chemotherapy and actually diminishes patients' prognosis in certain categories (see ECRI 1995). In other instances, studies allegedly showing benefit were methodologically deeply flawed.4
When health plans tried to deny coverage on the ground that there was no evidence that HDC/ABMT is effective for breast cancer, desperate patients replied that the practice was well-accepted by physicians. In fact, both sides were correct. There was no good evidence, but physicians nevertheless did widely accept it. Hence, although a number of courts sided with health plans,5 a large number sided with patients.6 Between judicial injunctions mandating insurance coverage, wrongful death verdicts imposing enormous damages,7 insurance companies' acquiescence to threats of litigation, and government mandates to cover the procedure (Hoffman 1999), the treatment proliferated rapidly (Peters and Rogers 1994). Indeed, although the National Institutes of Health (NIH) had major research under way, results were exceedingly slow in coming. Because so many women had access to the treatment through their insurers, it became difficult to recruit enough women willing to enter controlled scientific trials in which only half the subjects would receive the treatment.8 When the NIH studies finally concluded, results indicated that HDC/ABMT had no significant advantage over standard chemotherapy.9 By that time, some 30,000 women had received the treatment, at a cost estimated around $3 billion.10 This figure does not count what some health plans paid in compensatory and punitive damages, or in legal fees and court costs, for making coverage denials that turned out, in fact, to be correct.
Such desperation- or sympathy-guided rulings are not merely expensive. They set a terrible legal precedent if we want empirical judgments to be guided by empirical evidence.11 And yet such judicial departures from empirical realities are not unique.12 Indeed, comparable cases from product liability and toxic tort litigation prompted the Supreme Court's mandate that judges screen empirical testimony more rigorously (Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 [1993]). Accordingly, it is appropriate to consider carefully how courts ought to approach the use of scientific evidence in litigation involving health plans and providers alike. Health plans will be considered first, then physicians.
The Role of Evidence in Judging Health Plans
An Important Distinction
We begin with a distinction. When courts consider whether a health plan committed a tort or breached a contract in its attempts to trim costs and reshape clinical care, only some of the issues are empirical—that is, only some issues will be resolvable by appeal to the evidence found through sensory observation and experience. In the HDC/ABMT example, plans' claim that the therapy had little scientific support for treatment of breast cancer was an empirical claim. So are claims about whether or not mothers and infants experience greater mortality and morbidity when discharged within twenty-four hours after an uncomplicated vaginal delivery, or claims about the comparable effectiveness of generic versus brand-name drugs in the treatment of this or that infectious organism.
In contrast, a health plan makes normative claims when it determines, for example, whether certain expenditures produce enough benefit to warrant their cost. A plan might agree, empirically, that annual mammography for women under forty provides some benefit. And yet it might decide, normatively, that this benefit does not merit funding, given the more pressing alternative needs for that plan's limited funds in serving its large population (Eddy 1994). These value decisions are sometimes explicitly embedded in contractual terms of coverage and exclusions, but they can also be implicit in individual coverage decisions.
A third distinction would note that still other claims are conceptual. Health plans' decisions are heavily based on contractual provisions, and those provisions' terms must be interpreted. A plan may exclude coverage for "custodial care," for instance, but in a given instance it may require careful interpretation of linguistic concepts to decide whether a patient's extended home care counts as "medical treatment" or merely "custodial" care. This third distinction, while important, will not be discussed further here.
The significance of distinguishing empirical from normative issues is that, as courts consider various challenges to health plans' decisions about care and coverage, they must determine what sort of issue is at stake, and bring the right sort of evaluation to it. Courts cannot resolve normative issues by gathering empirical evidence, nor vice versa. We begin, then, by discussing courts' approach to the empirical issues, before turning to normative issues just below.
If courts expect plans and providers to base their empirical decisions on more and better evidence, health plans have a formidable task. Outcomes studies attempting to document the actual effects of ordinary clinical care are a relatively new phenomenon. During the post-World War II era of lavish third-party health insurance, medical science focused mainly on the development and testing of high-technology new drugs and devices. There was little reason to evaluate new products' and procedures' best uses, or even their most efficient production modes because, so long as FDA approval plus professional acceptance ensured good sales, it would be foolish for manufacturers to do research that could ultimately reduce sales (Garber 1992). By the same token, fee-for-service rewarded physicians and hospitals for maximizing services, not for studying which ones to delete.
Only recently has an urgent need to cut costs and maximize value-for-dollars prompted serious attempts to connect inputs with outcomes and to identify the most effective, and cost-effective, modes of care. However, although thousands of clinical practice guidelines (CPGs) have proliferated in recent years,13 many have at best only a limited scientific basis. The problems, detailed elsewhere,14 include a dearth of studies, inadequacy of databases, unstandardized methodologies, and biases and conflicts of interest. Nevertheless, outcomes research and health technology assessment (HTA) have become crucial to intelligent health care planning, and the quality of such research and the guidelines to which it gives rise are improving steadily.
If courts are to bring Daubert standards to evaluate the adequacy of the guidelines by which plans shape clinicians' care and make their coverage decisions, those CPGs should be anchored in "a reliable foundation" (Daubert, 509 U.S. at 597) not just the vague "general acceptance" of the standard set by Frye v. United States (293 F. 1013 [D.C. Cir. 1923], as discussed by Shuman in this issue). Hence, if a plan's CPG says "patients with condition X should generally be hospitalized only two days," there should be a credible empirical basis for choosing two days rather than some other number. Indeed, several courts have already held that plans' guidelines, utilization review programs, and coverage decisions must be made on a medically reasonable basis.15
Two caveats should be noted. First, in addition to their empirical bases, such choices will also reflect value judgments about where best to draw the lines between benefits and costs, as noted below under "normative" considerations. Lavishly funded plans will naturally have more liberal CPGs, while leaner ones will be less generous. Second, plans should not be required to use the "best" empirical evidence nor, as Daniel W. Shuman observes in his article here, should courts preclude differing schools of thought, reputable minorities, or the other kinds of allowance already permitted when courts appraise individual physicians' practices. Indeed, the very same dearth of research, of adequate databases, and of standardized methodologies that makes outcomes research so difficult would arguably require considerable flexibility from courts examining a CPG's validity.
Perhaps even more importantly, if courts do not leave reasonable room for differences of opinion, then the courts themselves would potentially engage in the practice of medicine by dictating too closely to health plans which clinical guidelines to adopt. If courts today feel ill-prepared even to assess which testimony is sufficiently "expert" to admit regarding toxic torts and products liability, they would quite surely be unprepared to dictate the nation's medical standards by permitting too narrow a range of CPGs to wear the mantle of judicial acceptability.
Nevertheless, a Daubert standard applied to CPGs would probably demand a better scientific pedigree than some health plans' guidelines currently appear to offer. According to some observers, "Most health insurers and managed care plans rely on ad hoc opinion by experts; only in a few instances are there HTA programs or structured processes for coverage decision making" (Perry and Thamer 1999: 1870). Moreover, "materials such as the practice guidelines prepared by Milliman and Robertson, a well-known actuarial firm, often rely on insurers' own decisions rather than on well-designed scientific research" (Rosenbaum et al. 1999: 231).16 In other cases, plans have relied on "an administrator who 'asked friends who are doctors,' or an insurance company's employee-physician (usually not a specialist in the field in question) who reads textbooks and discusses the issue with other insurance company physicians" (Holder 1994: 19).
In this context, many health plans have created a serious problem for themselves by defining "medical necessity"—the contractual cornerstone criterion of most health plans' coverage (Havighurst 1995: 15; Hall et al. 1996: 1055)—in terms of physician acceptance or general recognition by the medical profession.17 If health plans want to insist that physicians base their practices on scientific evidence rather than on local habits, malpractice fears, facilities availability, and the like, then plans must rewrite their contracts to reflect that outlook. So long as "medical necessity" definitions hinge largely on physician acceptance, then the only "evidence" even the most Daubert-loyal court can demand will be evidence of a practice's popularity among physicians, not its grounding in science.
As noted just above, it is one thing for a health plan to make an empirical claim—for example, that medication generally works better than surgery for certain patients' heart problems. It is quite another matter for plans to make normative judgments, such as whether the marginal improvement in quality of life that comes from a new drug (with once-a-day convenience and fewer side effects) is worth paying considerably higher cost. Such value judgments permeate CPGs, right alongside scientific elements. Thus science can tell us that computed tomography (CT) will not harm patients and will occasionally detect a hitherto undiagnosed problem. Cost-effectiveness analysis (CEA) can further tell us how much each new CT-made diagnosis has cost and compare that with the costs and burdens of alternative approaches. But value judgments are required for a health plan to conclude that it will not cover annual head-to-toe CT scans (Barnard 2000: A-20) for all its enrollees as a preventive care routine, on the ground that this cost is too high for its anticipated benefits and that the money can be better spent elsewhere.
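The division of labor described above, in which CEA supplies a cost figure and the plan supplies the value judgment, can be illustrated with a minimal sketch. Every number below is hypothetical and chosen only for illustration; the function name is likewise an assumption and is not drawn from any cited study. The calculation yields a cost per new diagnosis, but nothing in it says whether that cost is worth paying.

```python
def cost_per_new_diagnosis(total_screening_cost: float, new_diagnoses: int) -> float:
    """Return the average cost incurred for each newly detected condition."""
    if new_diagnoses <= 0:
        raise ValueError("new_diagnoses must be positive to compute a ratio")
    return total_screening_cost / new_diagnoses


# Hypothetical inputs (assumptions for illustration only).
enrollees_scanned = 10_000   # enrollees receiving an annual preventive CT scan
cost_per_scan = 800.0        # dollars per scan
new_diagnoses = 20           # previously undiagnosed problems detected

total_cost = enrollees_scanned * cost_per_scan
ratio = cost_per_new_diagnosis(total_cost, new_diagnoses)

# The empirical part ends here. Whether this figure justifies coverage
# is a value judgment that the arithmetic cannot make.
print(f"Cost per new diagnosis: ${ratio:,.0f}")
```

Two plans could accept the same figure and still reach opposite coverage decisions, which is precisely the normative choice at issue.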
There are many ways to draw the myriad cost-value trade-offs that permeate health care, and each has its merits and drawbacks. As Justice David Souter points out in Pegram v. Herdrich (120 S. Ct. 2143 [2000]), a cost-conscious plan is more likely to witness ruptured appendixes from too few appendectomy surgeries, while an open-ended fee-for-service approach is more likely to see unnecessary appendectomies. Health plans must constantly weigh the benefits that an expenditure will bring to a few individuals against their need to serve all their other enrollees. These decisions in turn are set within the limits of both the available premium dollars and the uncertainties of future needs.
Health plans' value choices are not always obvious. As Jacobson and Kanna note in their essay in this issue, plans tend to keep cost-value trade-offs "below the radar screen." Commonly they are hidden in plans' judgments about medical necessity. The term sounds precise enough, with tones of science and imperatives of necessity. But in fact there are enormous variations in how plans determine what is "necessary" and what isn't, even though virtually all plans promise to cover all, but only, necessary care.18
Courts could manage these tensions in a variety of ways. One option, as Jacobson and Kanna note, is Judge Hand's formula in Carroll Towing, balancing the probability that a given adverse event will happen, the gravity of the injuries it might cause, and the burden of preventing them (United States v. Carroll Towing, 159 F.2d 169, 173 [2d Cir. 1947]). However, it would be difficult to mandate this or any other one-size-fits-all approach. Health care is marked by an extraordinary diversity of goals and values, a diversity rooted partly in the fact that most of day-to-day health care is geared not toward life and death and the catastrophes of Carroll Towing but toward quality of life, the management of uncertainty, and other highly nuanced, deeply personal factors. There are many ways besides health care for citizens to improve or preserve their quality of life, and a variety of legitimate ways to decide what price is worth paying to reduce what kinds of uncertainty (Morreim 2000b).
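For readers unfamiliar with it, the Carroll Towing balancing mentioned above is conventionally summarized as an inequality: a precaution is owed when its burden B is less than the probability P of the harm multiplied by the gravity L of the resulting injury (B < P x L). The sketch below is only an illustration under stated assumptions. The function names and all dollar figures are hypothetical, and reducing "gravity" to a single dollar amount is exactly the kind of simplification that fits poorly with quality-of-life decisions.

```python
def expected_loss(probability: float, gravity: float) -> float:
    """Expected harm: probability of the adverse event times its gravity (P x L)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability * gravity


def precaution_is_warranted(burden: float, probability: float, gravity: float) -> bool:
    """Hand formula test: the precaution is owed when B < P x L."""
    return burden < expected_loss(probability, gravity)


# Hypothetical figures (assumptions for illustration only).
probability = 0.01        # chance the adverse event occurs
gravity = 10_000_000.0    # dollar valuation of the injury if it occurs

# Expected loss is about $100,000, so a $50,000 precaution passes the test
# and a $150,000 precaution does not.
print(precaution_is_warranted(50_000.0, probability, gravity))
print(precaution_is_warranted(150_000.0, probability, gravity))
```

The formula presupposes that probability, gravity, and burden can all be expressed on a common scale, which is part of why a single mandated approach is hard to apply to health care.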
Accordingly, in deference to the wide diversity of human goals and budgets, courts arguably should permit health plans to vary in the kinds and levels of coverage they provide. But once that is granted, we must ask afresh how courts should evaluate disputes about the value choices a health plan has implemented through its decisions about care and coverage.
The answer must begin with the observation that in this normative sphere, courts cannot demand scientific evidence, because this is the realm of values not facts. No amount of data can tell us the scientifically "correct" priority to place on Viagra, or how much money should be spent to reduce someone's chance of fatal heart attack by 1 percent. Hence Daubert, with its focus on adequacy of empirical claims, does not apply to this normative realm.
Instead, courts might inquire whether the health plan has made its values clear—whether it informed its subscribers adequately, up front, just how it allocates resources, and whether it has acted in accordance with those stated values. A clear contract will tell prospective buyers "if you buy this plan, here is what you'll receive, and here are the rules by which we will decide the borderline cases": "yes, the evidence shows that treatment T works well enough, but in this plan T is not deemed sufficiently cost-effective to merit coverage."19 This is a contract-focused approach, but not the same sort that Jacobson and Kanna identify in discussing contract-based approaches to CEA. As they describe the option of "abandon tort altogether in favor of contract" (Jacobson and Kanna this issue), courts would have little or no opportunity to evaluate the quality of the evidence on which plans base the empirical aspects of their CPGs. They would simply determine whether the plan followed its contract.
In contrast, the alternative approach suggested here would have courts address empirical disputes by considering the quality of the empirical evidence. Daubert-based challenges would be appropriate if, for instance, a plan constructed its CPGs, or applied them to individual cases, in a scientifically unacceptable manner. Reciprocally, courts would address normative disputes primarily as a contract issue by inquiring whether the health plan made its values clear and acted reasonably in accordance with them in the instant case. The important point for present purposes is that we must not confuse plans' empirical claims about which interventions work best for which problems, with their value choices about what is worth paying for. The former, but not the latter, permits evidence-oriented scrutiny.
As courts determine how best to hold health plans accountable, they should avoid unfairly holding plans to a stricter standard than physicians, in cases where comparable questions are at stake. Physicians, after all, are not expected to achieve perfect results or even to exercise optimal skill and judgment. They are obligated only to provide ordinary and reasonable care. If patients nevertheless do poorly, that is not malpractice but only an unfortunate outcome.
Health plans likewise should not be judged by whether a patient has fared well or poorly. Rather, they too should be judged according to the adequacy of their policies and implementation. Plans could be easy targets otherwise, to the detriment of their fiscal stability and ultimately their patients' care. For example, many health plans now cover screening tests for early detection of major illnesses. Whether the test is mammography for breast cancer or PSA for prostate cancer, no matter what frequency of testing is chosen, some patients' illness will be missed (Leahy 1989). Those patients might correctly say, "If only the policy hadn't been so stingy, my illness would have been diagnosed at a more curable stage." But just as courts do not judge physicians simply on the basis of whether some other treatment might have worked, neither should they judge plans according to whether some other policy would have averted this patient's unfortunate outcome. Health plans must create policies to meet a broad diversity of needs, and for courts the important question should be whether the policy was well-conceived, adequately disclosed, and properly applied ex ante, not whether it served every person well ex post. As Arnold J. Rosoff notes in his article in this issue, plans must take a prospective viewpoint as they arrange for many people to receive good care within a budget. Although courts must resolve adverse incidents in retrospect, the proper approach for assessing the policies that lead to those incidents is essentially prospective.