This information is for reference purposes only. It was current when produced and may now be outdated. Archive material is no longer maintained, and some links may not work. Persons with disabilities having difficulty accessing this information should contact us at: Let us know the nature of the problem, the Web address of what you want, and your contact information.

CareScience Risk Assessment Model - Hospital Performance Measurement

Presentations from a November 2008 meeting to discuss issues related to mortality measures.

Appendix D — Technical Details about Model Specification

Model Stratification

The stratification is roughly based on 3-digit ICD-9-CM diagnosis codes. A clinical and statistical review was conducted, based on 1999 State All-Payer data. The stratification process is described below:

  1. A major diagnosis code, representing more than 0.1% of total discharges, stands as a separate stratum. For example, ICD9 410 (AMI) stands as a separate stratum because its volume (2% of total discharges) is greater than 0.1%, the minimum requirement.
  2. Among the minor diagnosis codes that fail the volume criterion, a clinically significant diagnosis code stands as a separate stratum. Clinical significance is determined at a threshold of a 5% mortality rate. For example, ICD9 151 (Malig. Stomach Neoplasm) stands as a separate stratum because its mortality rate (11%) is above the 5% threshold, even though its volume is lower than 0.1%.
  3. Similar diagnosis codes are rolled up into a common stratum. For example, ICD9 204, 205, 206, 207, and 208 are rolled up into Leukemia.
  4. The remaining diagnosis codes are rolled up into Broad Diagnosis Groups according to the designation of ICD-9-CM. For example, ICD9 740-759 are rolled up into BDG14: Congenital Anomalies.
  5. Exemption One: Newborns are identified by three principal diagnosis code prefixes at the two-digit level: 76, 77, and V3. If any of a newborn's diagnosis codes begins with 764, 765, or V213, the case is classified as a low birth-weight immature newborn. Otherwise, the case is classified as a normal newborn.
  6. Exemption Two: Major organ transplant patients are stratified separately due to issues specific to this patient group (e.g. heart transplant). They are identified by DRG.
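
The volume and mortality criteria in steps 1 and 2 can be sketched as follows. This is a minimal illustration in Python, not the CareScience implementation: the function name, its inputs, and the returned labels are hypothetical, and the roll-ups (steps 3 and 4) and the newborn and transplant exemptions are left out.

```python
VOLUME_THRESHOLD = 0.001     # 0.1% of total discharges (step 1)
MORTALITY_THRESHOLD = 0.05   # 5% mortality rate (step 2)


def stands_alone(volume_share, mortality_rate):
    """Return why a 3-digit diagnosis code forms its own stratum, or None.

    volume_share   -- the code's share of total discharges (0..1)
    mortality_rate -- the code's observed mortality rate (0..1)
    The labels "major" and "clinically significant" are illustrative only.
    """
    if volume_share > VOLUME_THRESHOLD:
        return "major"
    if mortality_rate > MORTALITY_THRESHOLD:
        return "clinically significant"
    # Codes failing both tests are rolled up (steps 3 and 4, not shown).
    return None


# ICD9 410 (AMI): 2% of discharges; the mortality value is a placeholder.
print(stands_alone(0.02, 0.10))
# ICD9 151: 11% mortality; the volume value is a placeholder below 0.1%.
print(stands_alone(0.0005, 0.11))
```

The volume test runs first, so the mortality threshold is consulted only for codes that fail the volume criterion, as in the list above.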

Mortality Exemptions

The hospital-level mortality rate is usually around 2 to 3 percent. Deaths do not occur evenly across the 142 model strata. In some disease strata, the mortality rate is very close to zero. For instance, the mortality rate is less than one-tenth of a percent among intervertebral disc disorder patients (ICD9 diagnosis 722). It is extremely difficult to build a robust model that accurately pinpoints such rare deaths. As a result, it is proposed that these disease groups be omitted from mortality analysis rather than forced into a poor model.

The CareScience mortality model is based on linear regression; consequently, predicted mortality risks may fall outside the range of zero to one at the patient level. Out-of-range risks are acceptable unless they fall outside the "reasonable range" of -0.5 ≤ risk ≤ 1.5, at which point they are considered invalid. If negative risks occur in aggregate reporting, they are rounded to zero.
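
The range check and the rounding rule can be sketched as below. The source does not say how patient-level risks are aggregated or what happens to invalid risks, so two assumptions are made here and flagged in the comments: the aggregate is a simple mean, and invalid risks are dropped before averaging. The function names are hypothetical.

```python
RISK_MIN, RISK_MAX = -0.5, 1.5   # "reasonable range" stated in the text


def is_valid_risk(risk):
    """A patient-level risk is valid if it lies within the reasonable range."""
    return RISK_MIN <= risk <= RISK_MAX


def aggregate_risk(patient_risks):
    """Aggregate patient-level risks for reporting.

    ASSUMPTION: the aggregate is the mean of the valid risks, and invalid
    risks are dropped. Only the final rule (a negative aggregate is rounded
    to zero) comes from the text.
    """
    valid = [r for r in patient_risks if is_valid_risk(r)]
    if not valid:
        return None
    mean_risk = sum(valid) / len(valid)
    return max(mean_risk, 0.0)
```

For example, `aggregate_risk([-0.3, -0.1])` returns `0.0`, because the mean of the two valid risks is negative.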

Complications Algorithm

Complications are derived from principal and secondary diagnosis codes. Ideally, complications would be recorded as binary outcomes. However, there is no absolute way to classify diagnoses as complications: clinically, a diagnosis may be considered a complication in one case but a comorbidity in another. The POA (present on admission) flag helps identify conditions that existed prior to admission, but the flag is often unavailable in both public and private data. Moreover, a diagnosis captured during an inpatient stay did not necessarily develop after admission. Although chart reviews are a reliable way to supplement this information, they are unsuitable for large-scale data processing. CareScience has therefore developed a unique comorbidity-adjusted complication index (CACI) to approximate the probability that a diagnosis is a complication given its accompanying principal diagnosis. These complication probabilities were determined ex ante by a panel of medical experts (please refer to the Clinical Knowledge Base section for details).

The following algorithm is used to calculate complications:

  1. Find the pair of principal and secondary diagnosis codes in the CACI table and select the corresponding probability.
  2. If the principal and secondary diagnosis code combination cannot be found in the CACI table, the program selects the secondary diagnosis code's default probability (independent of principal diagnosis) from the CACI2 table.
  3. If the secondary diagnosis code cannot be found in either the CACI table or the CACI2 table, the diagnosis is excluded from the calculation.
  4. If the patient has no secondary diagnosis codes, the patient's complication probability is set to zero.
  5. If the first three digits of the secondary diagnosis code are equal to the first three digits of the principal diagnosis code, the secondary diagnosis is excluded from the calculation.
  6. For Obstetrics patients, the program selects probabilities for ALL diagnoses from the CACI2 table.
  7. Newborns, as defined by the principal diagnosis code, are not included in complication analyses.
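
The lookup order in the steps above can be sketched as follows. This is an illustration under stated assumptions rather than the CareScience program: the CACI and CACI2 tables are modeled as plain dictionaries, the function and parameter names are hypothetical, and the codes and probabilities in the example are placeholders. The source does not state how the selected probabilities are combined into a patient-level value, so the sketch stops at selection.

```python
def select_complication_probabilities(principal, secondaries, caci, caci2,
                                      is_obstetric=False, is_newborn=False):
    """Select a complication probability for each secondary diagnosis.

    caci  -- dict mapping (principal_code, secondary_code) -> probability
    caci2 -- dict mapping secondary_code -> default probability
    Returns a list of (secondary_code, probability) pairs, or None for
    newborns, who are not included in complication analyses (step 7).
    """
    if is_newborn:
        return None
    selected = []
    for code in secondaries:  # step 4: no secondaries -> nothing selected
        # Step 5: skip secondaries sharing the principal's first three digits.
        if code[:3] == principal[:3]:
            continue
        # Step 1: pair lookup in CACI (step 6: skipped for obstetrics).
        if not is_obstetric and (principal, code) in caci:
            selected.append((code, caci[(principal, code)]))
        # Step 2: fall back to the default probability in CACI2.
        elif code in caci2:
            selected.append((code, caci2[code]))
        # Step 3: codes found in neither table are excluded.
    return selected


# Placeholder tables; the codes and probabilities are not real CACI values.
caci = {("41071", "4280"): 0.3}
caci2 = {"4280": 0.1, "5990": 0.2}
print(select_complication_probabilities(
    "41071", ["4280", "5990", "41091", "99999"], caci, caci2))
```

An empty result corresponds to step 4: with no usable secondary diagnoses, the patient's complication probability is zero.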

CACR Comorbidity Scores and Chronic Diseases or Disease History

CACR comorbidity scores are derived from principal and secondary diagnosis codes. Secondary diagnoses are first categorized on a five-point Likert scale of increasing severity (A-E), where E is the most severe. If a secondary diagnosis is not present in the Diagnosis_Morbidity table, it receives a designation of "unspecified" and is grouped into category U. Secondary diagnoses are subsequently evaluated according to the CACI algorithm. Comorbidities are calculated as

$$\sum_{j=1}^{n} \left(1 - p_{ij}\right) \quad \text{within each severity category } s$$

where n is the number of secondary diagnoses, s is the severity category, and $p_{ij}$ is the probability of complication for the jth secondary diagnosis given principal diagnosis i. The probability that a particular secondary diagnosis is a complication of a given principal diagnosis is retrieved from the CACI table. Probabilities of comorbidities in the same severity category are summed together. As a result, a comorbidity score may not be an integer.
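
A minimal sketch of this summation is given below. It assumes the complication probability of each secondary diagnosis has already been selected by the CACI algorithm, and it models the Diagnosis_Morbidity table as a dictionary from diagnosis code to severity category. The function name and the example codes and values are hypothetical.

```python
from collections import defaultdict


def comorbidity_scores(secondary_probs, severity_of):
    """Sum (1 - p) per severity category.

    secondary_probs -- list of (secondary_code, complication_probability)
    severity_of     -- dict mapping diagnosis code -> "A".."E"
    Codes missing from severity_of are grouped into category "U".
    """
    scores = defaultdict(float)
    for code, p_complication in secondary_probs:
        category = severity_of.get(code, "U")
        # 1 - p is the probability that the diagnosis is a comorbidity.
        scores[category] += 1.0 - p_complication
    return dict(scores)


# Placeholder inputs; not real table values.
severity = {"4280": "D", "2500": "D", "5990": "B"}
probs = [("4280", 0.3), ("2500", 0.1), ("5990", 0.2), ("99999", 0.5)]
print(comorbidity_scores(probs, severity))
```

Because each term is a probability rather than a count, the score in a category is generally fractional.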

The comorbidity score is closely related to complications: comorbidity scores are calculated by an algorithm similar to the one used to calculate complications.

Chronic diseases and disease history are determined from patients' secondary diagnosis codes. (Note: Chronic diseases were previously included in patients' comorbidity scores.) To differentiate patient characteristics, common chronic diseases enter the model separately from comorbidities. Comorbidities and chronic diseases are constrained to have positive coefficients in the model calibration.

Birth Weight and Defining Diagnosis

Neonates represent approximately 10% of all admissions and are therefore an important analysis population. The overwhelming majority of neonates are healthy full-term babies. Attention is focused on high-risk neonates, who are primarily immature, low-weight newborns. Our study shows that birth weight is the most important predictor of survival, treatment pattern, and resource requirements. Weight class is encoded in the fifth digit of immature neonate diagnosis codes (764, 765, and V213). If the fifth digit denotes unspecified weight, the birth weight is set to null, and the record is excluded from outcome analyses. If the fifth digit denotes a birth weight range, the birth weight is set to the midpoint of the range. For example, a fifth-digit value of '4' indicates a weight between 1 and 1.25 kg, so the birth weight is set to 1.125 kg. Note that the fifth digit of V213 indicates a different weight range from that of 764 and 765.
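
The midpoint rule can be sketched as follows. The function name and the table layout are hypothetical, and the caller supplies the mapping from fifth digit to weight range. Only the entry for '4' is taken from the text; the other digits, and the separate ranges for V213, are not listed here and would have to come from the code set itself.

```python
def birth_weight_kg(diagnosis_code, weight_ranges):
    """Return the midpoint birth weight in kg, or None if unspecified.

    diagnosis_code -- immature neonate code, with or without a decimal point
    weight_ranges  -- dict mapping fifth digit -> (low_kg, high_kg), or None
                      for a digit that denotes unspecified weight
    A None result means the record is excluded from outcome analyses.
    """
    digits = diagnosis_code.replace(".", "")
    if len(digits) < 5:
        return None                      # no fifth digit recorded
    weight_range = weight_ranges.get(digits[4])
    if weight_range is None:
        return None                      # unspecified or unmapped weight
    low, high = weight_range
    return (low + high) / 2


# Only the '4' entry comes from the text (codes 764 and 765).
ranges_764_765 = {"4": (1.0, 1.25)}
print(birth_weight_kg("765.14", ranges_764_765))  # 1.125
```

Keeping the ranges in a table passed by the caller lets 764 and 765 use one mapping and V213 another, as the last sentence of the paragraph requires.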

Gestational age became a new field in newborn data in 2003; however, it has remained inconsistently reported. Due to the lack of data, the relationship between gestational age and birth weight has yet to be quantified. In the CareScience risk model, if a record contains only a gestational age code, birth weight is not assigned, and the record is excluded from analysis. If a newborn does not have a code that indicates immature status, the newborn is assigned to the normal neonatal group, for which birth weight is not a risk factor.

The neonatal model does not include procedures or CACR comorbidity scores. These factors are considered less relevant newborn characteristics. On the other hand, certain diagnoses are deemed significant attributes defining a newborn's health status. These codes are directly incorporated into the neonatal model at the three-digit level and are called defining diagnoses.

Valid Procedures

Strictly speaking, a procedure is not a patient characteristic but rather a provider care choice. For example, two physicians may opt to pursue two different yet equally effective courses of treatment for the same patient. Although procedures represent the discretion of the care provider, they can signal important information about the patient's overall health status. Certain procedures can serve as effective proxies for lab reports and treatment history that are not available in the current database, as well as for other unobservable critical factors. To be included in the model, procedures must be designated as "valid" for the patient's particular disease stratum. Additionally, the timing of certain procedures relative to the patient's hospital admission must be considered. Valid procedures are grouped into one of two categories based on timing criteria.

Each disease stratum (ccms_crl_group_by) has a unique set of valid procedures. If a procedure falls into Category 1, timing of the procedure is not considered, and the analytic program simply searches the beta tables to find the procedure's corresponding coefficient. If the coefficient is not present in the beta tables, its value is set to zero. Category 1 procedures with coefficients of zero have no impact on the risk score. (It should be noted that although a procedure may be considered clinically relevant, it may not be statistically significant for a particular outcome. Procedures failing to be statistically significant are not included in the model and have no impact on the risk score.)

If a procedure is mapped to Category 2, inclusion of the procedure in the model depends on the procedure's timing during the inpatient stay. More specifically, the Valid Procedure table contains a field called 'Timing' that specifies the maximum period of time from admission during which a procedure must occur to be included in the model. For example, a Timing field value of "48" indicates that a procedure may enter the model if it occurs within the first 48 hours of the hospital stay. For patient records, timing of individual procedures can be calculated as the difference between the Admission_Date and the Diagnosis_or_Procedure_date. If the difference is within the timing requirement in the Valid Procedure table, the procedure will be counted, and the algorithm, as described for Category 1, will be applied. If the timing field is missing, the procedure will be excluded.
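
The Category 1 and Category 2 rules can be sketched as below. This is an illustration, not the analytic program: the beta table is modeled as a dictionary, the function and parameter names are hypothetical, and the procedure code and coefficient in the example are placeholders. Treating a missing date like a missing Timing value is an assumption, and the source does not say how a procedure dated before admission is handled.

```python
from datetime import datetime, timedelta


def procedure_coefficient(proc_code, category, betas, admission_date=None,
                          procedure_date=None, timing_hours=None):
    """Return the model coefficient for a valid procedure, or None if excluded.

    category     -- 1 (timing ignored) or 2 (timing enforced)
    betas        -- dict mapping procedure code -> coefficient
    timing_hours -- the 'Timing' value from the Valid Procedure table
    """
    if category == 2:
        # A missing Timing value excludes the procedure (from the text).
        # ASSUMPTION: a missing date excludes it as well.
        if timing_hours is None or admission_date is None or procedure_date is None:
            return None
        elapsed = procedure_date - admission_date
        if elapsed > timedelta(hours=timing_hours):
            return None
    # Category 1 rule, also applied to Category 2 procedures that pass:
    # a coefficient absent from the beta table is set to zero.
    return betas.get(proc_code, 0.0)


betas = {"3961": 0.04}  # placeholder code and coefficient
admitted = datetime(2008, 1, 1, 8, 0)
performed = datetime(2008, 1, 2, 8, 0)  # 24 hours after admission
print(procedure_coefficient("3961", 2, betas, admitted, performed, 48))
```

Returning `None` for an excluded procedure keeps it distinct from a counted procedure whose coefficient is zero.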

For several disease strata (ccms_crl_group_by), the risk model does not incorporate valid procedures. These groups include Normal_Neonates, Immature_Neonates, DRG 103, DRG 480, DRG 481, DRG 495, DRG 512, and DRG 513.


Page last reviewed March 2009
Internet Citation: CareScience Risk Assessment Model - Hospital Performance Measurement. March 2009. Agency for Healthcare Research and Quality, Rockville, MD.

