Evidence Report/Technology Assessment: Number 26
Under its Evidence-based Practice Program, the Agency for Healthcare Research and Quality (AHRQ) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.
Introduction
Acute myocardial infarction (AMI) is the leading cause of death in the United States. Investigating the causes, progression, and treatment of AMI continues to be a national research priority. In clinical medicine, much research has focused on the early diagnosis and treatment of acute cardiac ischemia (ACI), which includes both unstable angina pectoris (UAP) and AMI. In 1991, the National Heart, Lung, and Blood Institute (NHLBI) of the National Institutes of Health initiated the National Heart Attack Alert Program (NHAAP) to study the issues related to rapid recognition and response to patients with signs and symptoms of ACI in emergency department (ED) settings, the point at which most of these patients enter the health care system. This ongoing effort brings together scientists, clinicians, and NHLBI staff with a Coordinating Committee that includes representatives of 40 professional organizations.
In 1994, the NHAAP Working Group on Evaluation of Technologies for Identifying Acute Cardiac Ischemia in the Emergency Department was formed to assess the technologies for diagnosing ACI and AMI in the ED. Members of the Working Group had expertise in the areas of cardiology, emergency medicine, general internal medicine, family practice, and nursing, as well as in the specific disciplines of meta-analysis and health services research. The Working Group reviewed all technologies for diagnosing ACI in the ED. The assessments of these technologies in actual use in EDs, and the nature, extent, and quality of the evidence on which the assessments were based, are presented in the Working Group's final 1997 report, An Evaluation of Technologies for Identifying Acute Cardiac Ischemia in the Emergency Department.
Reporting the Evidence
In 1998, the Agency for Healthcare Research and Quality (AHRQ, formerly the Agency for Health Care Policy and Research [AHCPR]), working as a partner for the NHLBI's NHAAP, contracted with the New England Medical Center's Evidence-based Practice Center (EPC) to update the 1997 NHAAP report. The EPC was charged with evaluating the evidence on these diagnostic technologies published since October 1994.
As before, the purpose of the review was to assess the accuracy of technologies for diagnosing ACI in the emergency department and their clinical impact when used in this setting. However, the original 1997 report did not provide quantitative estimates of the test performance or clinical impact of the diagnostic technologies. To address this, we conducted meta-analyses where possible: we reexamined all the studies reviewed in the original report, abstracted the necessary data, and combined these data with more recently published studies. We also conducted decision and cost-effectiveness (CE) analyses to investigate the interactions between technologies' diagnostic performances and costs, populations, and outcomes, and to provide an evidence-based framework on which to base recommendations.
NHAAP Working Group members helped frame some of the study issues, but they were not involved in the evaluation of evidence or in the writing of the report.
Methods
We conducted a systematic and comprehensive search of the English-language literature published between 1966 and December 1998. Literature was retrieved from a computer MEDLINE search, references cited in the 1997 Working Group report, review of references of retrieved articles, and assistance from domain experts. Search terms included those related to the diagnosis of ACI, AMI, and UAP in the ED and to the following technologies:
- Prehospital electrocardiography (ECG).
- Continuous/serial ECG.
- Non-standard leads ECG.
- Exercise stress ECG.
- The ACI Time-Insensitive Predictive Instrument (ACI-TIPI).
- The Goldman chest pain protocol.
- Biochemical tests and biomarkers (e.g., creatine kinase [CK] or its subunit [CK-MB], troponin T).
- Sestamibi myocardial perfusion imaging.
- Computer-based decision aids.
We followed the general approach for selecting studies taken by the Working Group in its report. We considered reports if they came from work done in the ED setting; results coming from other settings (e.g., the cardiac care unit) were used only if little or no ED-based data were available. Data from non-ED settings were used with the understanding that they suggest potential utility but do not directly apply to the emergency setting.
We accepted prospective and retrospective studies that evaluated one or more of the technologies considered in this evidence report and included patients 18 years and older who presented to the ED with symptoms suggestive of ACI. We placed no restrictions on patients' gender or ethnicity. In general, ED testing consists of either a single test performed within the initial 4-hour period after presentation to the ED, or repeated testing up to 14 hours after the patient's presentation to the ED. We accepted studies with minor deviations from this standard.
Data were abstracted according to a written protocol and were summarized in evidence tables.
Grading of the Evidence
The evidence-grading scheme we used assesses four dimensions that are important for the proper interpretation of the evidence:
- Size of the study (weight of the evidence).
- Applicability (population category and prevalence of disease).
- Diagnostic performance or magnitude of clinical impact.
- Methodological quality (internal validity).
Applicability. We grouped the populations and settings of the studies using a four-category scale to help interpret the results. We also collected data about the prevalence of ACI or AMI to assist the interpretation. The four defined population categories are:
- Category I—Studies that included all patients with signs and symptoms suggestive of ACI, such as chest pain, shortness of breath, jaw pain, acute pulmonary edema, and so forth. This is the most inclusive category. Few studies met Category I criteria.
- Category II—Studies that used chest pain as the inclusion criterion. Most studies belong to this group. Category II is a subset of Category I.
- Category III—Studies that included patients with chest pain but that excluded those with clinical or ECG findings of AMI. Many studies, especially studies of stress cardiac imaging or testing, belong to this group. Category III is a subset of Category II.
- Category IV—Studies in which all patients were hospitalized or which used additional criteria that enrolled highly selected subpopulations. We also placed retrospective studies in this category.
Test performance studies. When there were sufficient data for a technology, we used three complementary methods of synthesizing data across several studies to report on its test performance:
- Summary receiver operating characteristics (SROC) analysis.
- Separately combined sensitivity and specificity values using a random effects model.
- The summary diagnostic odds ratios using a random effects model.
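As a rough illustration of the third method, the sketch below pools log diagnostic odds ratios across studies. The summary does not say which random effects estimator was used, so the DerSimonian-Laird estimator here is an assumption, as is the 0.5 continuity correction. The function names and the 2x2 counts are hypothetical and are not data from the report.

```python
import math

def log_dor(tp, fp, fn, tn, cc=0.5):
    """Log diagnostic odds ratio and its variance for one 2x2 table.

    A continuity correction `cc` is added to every cell (an assumption,
    used here so that zero cells do not break the logarithm).
    """
    a, b, c, d = tp + cc, fp + cc, fn + cc, tn + cc
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random effects estimator.

    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1 / sum(w_re)), tau2

# Hypothetical 2x2 tables (TP, FP, FN, TN); not data from the report.
studies = [(45, 20, 15, 220), (30, 12, 20, 138), (60, 40, 10, 390)]
pairs = [log_dor(*s) for s in studies]
pooled, se, tau2 = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])
summary_dor = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
print(f"summary DOR {summary_dor:.1f} (95% CI {ci[0]:.1f} to {ci[1]:.1f}), tau^2 {tau2:.3f}")
```

Pooling on the log scale and exponentiating at the end keeps the confidence interval positive. The same pooling function could in principle be applied to logit-transformed sensitivity or specificity values.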
We defined a three-level methodological quality scale for test performance studies graded as follows:
- A (least bias)—A study that adheres to the traditionally held concepts of high-quality diagnostic evaluation, including:
  - Clear descriptions of the population and setting.
  - Clear descriptions of the reference standard, the test under investigation, and the diagnostic criteria.
  - Masked interpretation of the reference test and the test under investigation.
  - Verification of the diagnoses in all or most of the patients with negative results.
  - No significant reporting errors that are likely to result in substantial bias.
- B (susceptible to some bias)—A study that does not meet all the criteria in category A, but its deficiencies are unlikely to cause major bias.
- C (likely to have significant bias)—A study with significant design or reporting flaws, for which major bias cannot be ruled out. This category includes studies in which verification bias could be a major issue and studies that have significant amounts of missing information or discrepancies in their reporting.
Clinical impact studies. In the few instances where clinical impact studies reported sufficient data, we combined dichotomous outcomes (expressed as risk ratios) or continuous outcomes using a random effects model.
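For readers unfamiliar with the risk ratio, the following minimal sketch computes it for a single two-arm study, with a 95 percent confidence interval from the usual large-sample standard error on the log scale. The function name and the counts are hypothetical. The log risk ratio and its variance are the quantities a random effects model would combine across studies.

```python
import math

def risk_ratio(events_a, n_a, events_b, n_b):
    """Risk ratio of arm A versus arm B with a 95% confidence interval.

    Uses the large-sample standard error of the log risk ratio and
    assumes at least one event in each arm.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper

# Hypothetical counts: 10 of 100 events with the technology, 20 of 100 without.
rr, lower, upper = risk_ratio(10, 100, 20, 100)
print(f"risk ratio {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```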
We defined a three-level methodological quality scale for clinical impact studies graded as follows:
- A (least biased)—Such as prospective controlled trials.
- B (susceptible to some bias)—Such as prospective cohort studies.
- C (likely to have significant bias)—Other designs or studies with significant conduct or reporting problems that could lead to large bias.
Findings
The MEDLINE literature search identified 6,667 titles, a third of which were published from 1994 onward, indicating increased research activity on this topic over the past 5 years compared with the previous 27 years. From these, 407 full articles were retrieved for review, of which 106 are included in the analysis.
A diverse array of technologies with varying degrees of diagnostic accuracy is available for use in general or selected populations to diagnose ACI in the ED. About half the studies analyzed were in population category II and about 30 percent in category III. Prevalence of AMI across studies, even within population categories and in similar settings, varied widely with little indication that similarly reported inclusion criteria among studies resulted in similar levels of AMI prevalence.
Despite this, there is some indication that overall, studies that included all patients with chest pain (population category II) have higher prevalence of AMI than either studies that included all patients with symptoms suggestive of ACI (population category I) or studies that excluded patients with diagnostic ECGs (population category III). In addition, though differences in AMI prevalence among different settings are not statistically significant, there is evidence that studies that analyzed only admitted ED patients have higher prevalence of AMI than those that included all ED patients. Thus, these two populations may truly be different.
Most studies evaluated the accuracy of the technologies; only a few evaluated the clinical impact of routine use. To summarize:
- Prehospital 12-lead ECG has moderate sensitivity (76 percent) and specificity (88 percent) for diagnosis of ACI. In randomized trials, it reduced the mean time to thrombolysis by 33 minutes and reduced short-term overall mortality.
- In the general ED setting, only ACI-TIPI has demonstrated, in a large multicenter clinical trial, a reduction in unnecessary hospitalizations without decreasing the rate of appropriate admission for patients with ACI.
- The Goldman chest pain protocol has good sensitivity (about 90 percent) for AMI but, in the single clinical impact study performed, has not been shown to result in any differences in hospitalization rate, length of stay, or estimated costs. Its applicability to patients with UAP has not been evaluated.
- Single measurement of biomarkers at presentation to the ED has poor sensitivity for AMI although most biomarkers have high specificity (over 90 percent). Serial measurements can greatly increase the sensitivity for AMI while maintaining their excellent specificity. Biomarkers cannot identify most patients with UAP.
- Diagnostic technologies to evaluate ACI in selected populations, such as echocardiography, sestamibi perfusion imaging, and stress ECG, may have very good to excellent sensitivity; however, they have not been sufficiently studied.
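Because prevalence varied widely across the studies, it may help to see how the same sensitivity and specificity translate into different predictive values. The sketch below applies Bayes' rule to the prehospital ECG figures quoted above (76 percent sensitivity, 88 percent specificity). The prevalence values and the function name are hypothetical, chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Sensitivity and specificity are the prehospital ECG figures from the summary;
# the prevalence values are hypothetical.
for prevalence in (0.05, 0.15, 0.30):
    ppv, npv = predictive_values(0.76, 0.88, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With test accuracy held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is one reason results from admitted-only populations may not carry over to all ED patients.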
Results of Decision and Cost-Effectiveness Analyses
Decision and cost-effectiveness analyses were performed for 17 technologies and 4 combinations of technologies that have been evaluated in the literature and this report. The cost analysis is from the payers' perspective (e.g., health insurance companies); patient outcomes are either appropriate triage or 30-day survival of patients with ACI.
Because not all technologies can be applied to all patients in the ED (stress ECG, for example, cannot), two different ED populations were used for the analysis:
- A general population model, which includes all patients in the ED.
- A subgroup model, in which high-risk patients are excluded.
Stress tests, sestamibi imaging, and serial and continuous ECG were evaluated only in the subgroup population.
As expected, technologies with the best diagnostic accuracy for AMI and UAP have the highest values for appropriate triage for patients with ACI. Technologies that are more effective (greater number of patients with ACI appropriately triaged) tend to have higher total costs, with the exception of ACI-TIPI. The biomarkers are least costly and have the lowest values for appropriate triage. Algorithms, combination technologies, and echocardiography are the next most effective technologies, in that order. Sestamibi imaging and exercise ECG are more expensive than other technologies but have excellent diagnostic performance for ACI.
Based only on the diagnostic performance data of the technologies, the combination technology of troponin T and echocardiography has the best CE among all technologies applicable to the general population model. If results from clinical impact studies are incorporated, ACI-TIPI has the best CE because of its very high triage accuracy and low cost.
The incremental CE of troponin T and echocardiography is about $7,670 per additional appropriate triage for a patient with ACI compared with serial or combination biomarkers. The incremental CE of the next most effective technology, the artificial neural network, is approximately $10,560. Given the economic ramifications and the effects on the patient of a missed ACI diagnosis, this incremental CE for troponin T and echocardiography is minimal.
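The incremental CE figures in this section are ratios of extra cost to extra effect. A minimal sketch of that calculation follows, assuming costs and appropriately triaged patients are both totaled over the same cohort. The function name and all numbers are hypothetical and do not reproduce the report's model inputs.

```python
def incremental_ce(cost_new, effect_new, cost_ref, effect_ref):
    """Extra cost per extra unit of effect (here, one additional appropriate triage)."""
    extra_effect = effect_new - effect_ref
    if extra_effect <= 0:
        raise ValueError("the new technology must be more effective than the reference")
    return (cost_new - cost_ref) / extra_effect

# Hypothetical totals per 1,000 ED patients; not figures from the report.
reference = {"cost": 400_000.0, "triaged": 150}
candidate = {"cost": 460_000.0, "triaged": 160}
icer = incremental_ce(
    candidate["cost"], candidate["triaged"], reference["cost"], reference["triaged"]
)
print(f"${icer:,.0f} per additional appropriate triage")
```

The ratio is only meaningful when the candidate is more effective than the reference, which is why the sketch rejects a non-positive difference in effect rather than returning a negative or undefined value.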
Because the estimates for detection of UAP are based on sparse data, we also evaluated the triage accuracy and cost-effectiveness of technologies for appropriate triage for patients with AMI only. The relative CE rankings do not change compared with the rankings for patients with ACI. There are a few important differences, however, in triage accuracy:
- The Goldman protocol improves significantly.
- Serial CK-MB improves slightly.
- The combination of troponin T and echocardiography is slightly better than ACI-TIPI (a difference of one patient with AMI appropriately triaged).
The combination of troponin T and echocardiography is the most cost-effective, followed by the artificial neural network. The incremental CE between these two technologies is much larger than in the general ACI model: approximately $137,000 per additional appropriately triaged patient with AMI.
In the low-risk patient subgroup model, ACI-TIPI is again the most cost-effective technology if data from clinical impact studies are incorporated. Sestamibi stress imaging has the best diagnostic performance (detects 82 percent of patients with ACI), followed by sestamibi rest scanning and exercise ECG. The costs of exercise ECG and stress sestamibi are nearly the same. The incremental CE between the two technologies is a mere $364 per appropriately triaged patient, reflecting the higher effectiveness of stress sestamibi for its cost relative to exercise ECG.
The incremental CE between stress sestamibi imaging and the next most cost-effective technology, the combination of troponin T and echocardiography, is much greater: $12,757. However, given that stress sestamibi imaging results in the appropriate triage of 37 additional patients with ACI (per 1,000 ED patients) compared with troponin T and echocardiography, it appears to be a very cost-effective technology.
If data from the ACI-TIPI trial are used, the incremental CE of using ACI-TIPI compared with troponin T and echocardiography is only $1,502 per additional appropriate triage for a patient with ACI, a truly negligible increase for improved triage accuracy.
Considering only triage accuracy for patients with AMI, the combination of troponin T and echocardiography is the most cost-effective. Exercise ECG and stress sestamibi imaging also have excellent triage accuracy; however, the cost per ED patient of these two technologies is about $500 more than that of troponin T and echocardiography.
Future Research
- Most studies evaluated the performance of a technology in diagnosing AMI; future studies should also evaluate a technology's performance in diagnosing UAP.
- Some technologies (e.g., echocardiography, sestamibi imaging, exercise ECG, serial biomarkers, and new biomarkers such as P-selectin and fatty acid binding proteins) remain underevaluated.
- To date, most studies have evaluated the application of a single technology on patients. Research is needed to determine whether combinations of tests, such as a panel of biomarkers, or of multiple modalities, such as ECG with serial CK-MB measurements, perform better than the component tests alone.
- Because good test performance, in isolation, does not automatically translate to appropriate utilization or desired outcomes, clinical impact studies are needed to evaluate the clinical outcomes of the actual use of the test.
- The prevalence of ACI among the studies varies widely and may be explained only partially by differences in patient populations. The wide variation of prevalence has an unknown effect on test performance and interpretation of the results, and may indicate incomplete reporting of study biases. We need to understand the reason for the heterogeneity of the prevalence among studies with seemingly similar patient populations.
- The methodological quality and the reporting of the diagnostic performance studies on this topic vary widely and could be improved substantially.
Availability of Full Report
The full evidence report from which this summary was taken was prepared for the Agency for Healthcare Research and Quality by the New England Medical Center under contract No. 290-97-0019. Printed copies may be obtained free of charge from the AHRQ Publications Clearinghouse by calling 1-800-358-9295. Requesters should ask for Evidence Report/Technology Assessment No. 26, Evaluation of Technologies for Identifying Acute Cardiac Ischemia in Emergency Departments (AHRQ Publication No. 01-E006).
The Evidence Report is also online on the National Library of Medicine Bookshelf.
AHRQ Publication Number 00-E031
Current as of September 2000