U.S. Department of Health and Human Services
Agency for Healthcare Research and Quality

This information is for reference purposes only. It was current when produced and may now be outdated. Archive material is no longer maintained, and some links may not work. Persons with disabilities having difficulty accessing this information should contact us at: Let us know the nature of the problem, the Web address of what you want, and your contact information.

Using Administrative Data To Monitor Access, Identify Disparities, and Assess Performance of the Safety Net (continued)

Hospital Discharge Records


Hospital discharge records are among the administrative databases that are potentially of the greatest value in understanding utilization patterns of low-income and other vulnerable populations and in assessing the performance of the local safety net. These data are computerized summaries of the medical record for each patient discharged from a hospital, with information on the hospital stay (diagnoses, procedures, admission/discharge dates, charges, and so on), as well as the patient (age, gender, race/ethnicity, insurance status, ZIP code of residence, and so on). These records are maintained primarily for payment purposes, but have been used by researchers, analysts, and planners for a broad range of other purposes as well.

For assessing access and performance of the safety net, most analyses of hospital discharge data have focused on preventable/avoidable hospital admissions for ambulatory care sensitive (ACS) conditions. These ACS conditions involve diagnoses where timely and effective ambulatory care (usually primary care) can help prevent or reduce the risk of hospitalization. There are three types of ACS conditions:

  • Chronic conditions (diabetes, asthma, congestive heart failure), where effective management can often prevent more serious flare-ups that require admission.
  • Acute conditions (ear/nose/throat infections, gastroenteritis, cellulitis, and so on), where early intervention can prevent more serious progression of the condition that might require hospital admission for treatment.
  • Preventable illnesses (pertussis, tetanus, rheumatic fever, and so on), where immunization can prevent the onset of the disease, and any hospitalization represents a serious failure of the health care delivery system.

For these conditions, higher use rates for a population subgroup or geographic area can be an indication of access problems or concerns about performance of the safety net. It is important to note that these admissions are not necessarily "inappropriate" in the sense of being unneeded or unwarranted. These are simply conditions where effective ambulatory care might have prevented the condition from becoming so severe that admission is perceived to be necessary. And, of course, not all ACS admissions are even preventable or avoidable. In some cases, the best possible care cannot prevent serious progression of the condition to the stage that requires hospitalization; failure of treatment, especially among older and sicker patients, remains a reality in even the best managed health care system using the most advanced technology and methods available. See Appendix B for a full listing of ACS conditions.

There is also a broad range of other diagnoses that may be of interest to analysts concerned about access issues and performance of the safety net. The list of ACS conditions does not include primary diagnoses involving substance abuse or mental health problems. These conditions may be particularly prevalent among some vulnerable populations, and monitoring hospitalization patterns for these conditions may be important to understanding needs and performance of the health care delivery system. Again, higher use rates might raise serious concerns about need or system performance.

Also not included among ACS conditions are surgical procedures. There is substantial evidence in the literature of large disparities in rates of surgery by insurance status and by race/ethnicity (Institute of Medicine, 2003), and tracking rates of surgery may also be of interest. One approach is to analyze "referral sensitive surgeries," high-cost procedures that are usually nonemergent where failure to obtain a referral to a surgeon can be a barrier to obtaining the procedure (e.g., coronary artery bypass surgery, joint replacement, and organ transplant) (Billings et al., 1993). A second approach is simply to track a general class of procedures (e.g., cardiovascular or orthopedic) that may be of particular concern, with lower rates of procedures being a possible indication of problems in gaining access to care. However, it is essential to recognize that for many of these procedures, there is substantial disagreement among observers on what is the "right" rate and about whether U.S. utilization patterns for middle-class patients in some communities may reflect overutilization. Exercising caution in interpreting and presenting results is clearly necessary for surgical procedures.

Obtaining Hospital Discharge Databases

The availability of hospital discharge databases is largely dependent on the jurisdiction involved. In more than 30 States, these databases are maintained by a centralized State authority. Data are required to be submitted by all hospitals in the State, and the process for gaining access to the data for research and analytic purposes has often been standardized (generally making the process less painful than obtaining birth/death records or emergency department records). The agency involved is typically the State health department, although some States have separate agencies that maintain and analyze the data. The companion volumes to this tool kit contain data from most of the States where such hospital discharge data are available, although the number of States continues to increase.

Even within States where discharge data are not maintained by the State government, such data may be available through the State or local hospital associations. These data are often maintained for purposes of tracking market share, and may be available to researchers and analysts (subject to privacy restrictions on identifying individual providers). In other communities with a discrete number of hospitals serving the area, it may also be possible to obtain these data directly from individual hospitals, although such efforts can be very time consuming and are often frustrating.

At the Federal level, the Healthcare Cost and Utilization Project (HCUP) in the Agency for Healthcare Research and Quality (AHRQ) has assembled hospital discharge data from more than 30 State data organizations and hospital associations, and has converted these databases into a uniform format. Within the parameters of restrictions imposed by participating States, many of these State databases are available. In addition, two samples are available for making national comparisons. The Nationwide Inpatient Sample (NIS) is a sample of approximately 1,000 hospitals from all participating HCUP States and contains all discharges from each sampled hospital. The Kids' Inpatient Database (KID) is a sample of pediatric discharges from all participating hospitals (over 2,500) and is particularly useful for studying conditions that are limited to infants, children, and adolescents. Additional information on HCUP is available from AHRQ.

As noted, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) has created additional complications for obtaining hospital discharge data. Effective April 2003, HIPAA provisions intended to protect patient privacy are applicable to release of hospital discharge abstracts. Among the protected data elements are important variables such as patient ZIP code of residence. While HIPAA does not prevent the release of data with ZIP code-level identifiers for research and policy analysis purposes, it creates an additional hurdle in obtaining the data by requiring an additional level of review for its release. As States and providers become more familiar with these requirements, these obstacles are likely to become easier to manage, but in the short run some difficulties may be encountered in obtaining release of discharge data. Additional information on HIPAA is available from the U.S. Department of Health and Human Services.

To use hospital discharge data effectively, it is also necessary to have "denominator" data for the areas and populations that will be examined. Hospital utilization is generally expressed as a rate (e.g., admissions or discharges per 1,000 area population), and it is therefore necessary to acquire population counts for inclusion in the denominator of the equation for calculating the rates.
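As a minimal sketch of the rate calculation described above, the following Python function divides a count of admissions (the numerator) by an area population count (the denominator) and scales the result to a rate per 1,000 residents. The function name and the example counts are hypothetical and are used only for illustration.

```python
def rate_per_1000(admissions: int, population: int) -> float:
    """Return admissions per 1,000 area residents."""
    if population <= 0:
        raise ValueError("population (the denominator) must be positive")
    return 1000.0 * admissions / population


# Hypothetical area: 267 admissions among 10,000 residents in the age group.
print(rate_per_1000(267, 10_000))  # 26.7
```

The numerator and denominator must cover the same geographic area, age group, and time period; a mismatch in any of these produces a misleading rate.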

There is a commercial industry that actively markets such data for a charge (primarily to marketing departments in large corporations), but some data are also available at no cost from the U.S. Census Bureau. These data from the U.S. Census are available at all geographic levels for the year 2000, and intercensal estimates and projections are also generally available at the State, county, and municipal level. For years before and after 2000, ZIP code-level analysis may require purchase of data from commercial sources because intercensal estimates and estimates/projections after 2000 are not available at the ZIP level from the U.S. Census. Because of rapid population growth in many areas, the further in time from the year 2000, the more important such estimates are likely to become for accurate analysis.

Analyzing the Data

As with any administrative database, the first step in analyzing hospital discharge data involves testing data quality. The good news is that much of the data is required for regulatory purposes, and many (but not all) States collecting the data have already applied sophisticated data cleaning protocols to the data. Accordingly, the most important step is simple frequency distributions for the fields of interest, primarily to identify the extent of missing data problems, but also to identify serious anomalies.
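The frequency-distribution check described above can be sketched in a few lines of Python. The example assumes each discharge record has been loaded as a dictionary; the field names (`payer`, `zip`) and the sample records are hypothetical, and blank or absent values are counted together under a `<missing>` label.

```python
from collections import Counter


def field_frequencies(records, field):
    """Count each value of `field`, grouping blank or absent values as '<missing>'."""
    counts = Counter()
    for rec in records:
        value = rec.get(field)
        if value is None or str(value).strip() == "":
            value = "<missing>"
        counts[value] += 1
    return counts


def missing_share(records, field):
    """Fraction of records in which `field` is blank or absent."""
    freq = field_frequencies(records, field)
    total = sum(freq.values())
    return freq["<missing>"] / total if total else 0.0


# Hypothetical records for illustration only.
records = [
    {"payer": "Medicaid", "zip": "21201"},
    {"payer": "", "zip": "21201"},
    {"payer": "Commercial", "zip": None},
    {"payer": "Medicaid", "zip": "21217"},
]
print(field_frequencies(records, "payer"))
print(missing_share(records, "zip"))
```

Reviewing the full distribution, rather than only the missing share, also exposes anomalies such as unexpected codes or a single value that dominates a field.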

Because hospital utilization differs significantly by age, it is important to conduct separate analyses of various age groups. Experience suggests that meaningful groupings include ages 0-4, 5-17 (or 0-17), 18-39, 40-64, 65-74, 75+, though others may be useful for different analytic purposes. Smaller groupings can also be useful, although sample size limits can present problems if the age grouping is too small. For large age groupings (0-17, 18-39, 40-64), age and sex adjustment using standard statistical methods may also be required to assure comparability among areas. Low-income areas typically have much younger populations, and the distribution of elderly populations also differs widely geographically.
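The grouping and adjustment steps can be illustrated with a short Python sketch. It assumes non-overlapping upper groups (65-74 and 75+) and uses direct standardization, one common method of age and sex adjustment; the text does not specify which method is intended. The strata, counts, and standard population below are hypothetical, and only two strata are shown for brevity.

```python
import bisect

# Lower bounds of each group after the first (assumed non-overlapping groups).
AGE_BREAKS = [5, 18, 40, 65, 75]
AGE_LABELS = ["0-4", "5-17", "18-39", "40-64", "65-74", "75+"]


def age_group(age):
    """Map an age in years to one of the analysis groups."""
    return AGE_LABELS[bisect.bisect_right(AGE_BREAKS, age)]


def directly_adjusted_rate(events, population, standard_population):
    """Direct standardization: weight each stratum's rate by its share of a
    standard population; returns a rate per 1,000."""
    total_std = sum(standard_population.values())
    adjusted = 0.0
    for stratum, std_count in standard_population.items():
        stratum_rate = events.get(stratum, 0) / population[stratum]
        adjusted += stratum_rate * std_count / total_std
    return 1000.0 * adjusted


# Hypothetical (age band, sex) strata within the 40-64 group.
events = {("40-49", "F"): 10, ("50-64", "F"): 30}
population = {("40-49", "F"): 2000, ("50-64", "F"): 1000}
standard = {("40-49", "F"): 500, ("50-64", "F"): 500}

print(age_group(64), age_group(65))
print(round(directly_adjusted_rate(events, population, standard), 1))
```

Applying the same standard population to every area removes differences in rates that arise only because one area is younger or older than another.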

Separate analysis by age can also be important because outcomes and safety net performance may differ substantially for different age groups. The Medicaid program is heavily skewed toward coverage of children, further augmented by recent coverage expansions resulting from the State Children's Health Insurance Program initiative. Other initiatives, such as community health centers; Women, Infants, and Children (WIC) programs; and school clinic programs also emphasize assuring timely and effective access for children. Not surprisingly, analysis of rates for ACS conditions for children and adults typically shows different patterns. Figures 3 and 4 display ACS rates for the Baltimore MSA at the ZIP code level. For children, a strong association with area income exists (R² = .595), and low-income areas of Baltimore have ACS rates 2.2 times higher than those in high-income ZIP codes. For adults age 40-64, the mean rate is substantially higher than that for children (26.7/1,000 versus 9.5/1,000), and an extraordinarily strong association exists between ACS rates and area income (R² = .893). But of significant importance is the much larger difference between low- and high-income areas, with rates in low-income ZIP codes for adults more than 4 times higher than those in more affluent areas.
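The two summary measures used in this comparison, the strength of association (R²) and the ratio of rates between low- and high-income areas, can be sketched as follows. The sketch assumes a simple linear regression of ZIP code-level ACS rates on a single area income measure, for which R² equals the squared Pearson correlation; the exact income measure and model behind the figures are not stated here, and the sample values are hypothetical.

```python
def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs
    (equal to the squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def rate_ratio(low_income_rates, high_income_rates):
    """Mean rate in low-income areas divided by mean rate in high-income areas."""
    mean_low = sum(low_income_rates) / len(low_income_rates)
    mean_high = sum(high_income_rates) / len(high_income_rates)
    return mean_low / mean_high


# Hypothetical ZIP code-level values: percent low-income households and ACS rate.
pct_low_income = [1, 2, 3]
acs_rate = [1, 3, 2]
print(r_squared(pct_low_income, acs_rate))  # 0.25
print(rate_ratio([40, 44], [10, 11]))       # 4.0
```

A high R² shows that rates track area income closely, while the ratio shows how large the gap is; the two measures answer different questions and are best reported together.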

These data may suggest two important conclusions about access and performance of the safety net in Baltimore that are likely to be of interest to policymakers:

  • Access problems are considerably worse for adults than for children.
  • The substantial investment in efforts to improve access for children has helped reduce the level of disparity (although additional progress is necessary).

This latter point is likely to be of significant importance as Federal, State, and local budget constraints force hard looks at existing programs; the ability to show that investment of resources actually seems to matter can help reinforce support for these programs.

The potential usefulness of ACS data at the ZIP code level is further illustrated by data from the Atlanta, GA, MSA. In Figure 5, county level ACS rates are displayed for adults age 40-64, suggesting relatively low ACS rates throughout the MSA. However, ZIP code-level analysis (Figure 6) documents numerous ZIP code areas with very high ACS rates, largely in the city of Atlanta, but also in some of the outlying areas. This ZIP code-level data can help unmask serious problems within an area and also suggest geographic areas to target for further analysis to help design interventions.

Examining ZIP code-level data often illustrates another important point: outcomes and safety net performance can differ substantially within a community, even among neighborhoods with similar levels of poverty. In Figure 7, ACS rates for adults in New York City show a strong association with area income (R² = .613), but there are also large differences among low-income ZIP codes. The ACS rate for ZIP code 10035 in East Harlem is more than 3 times higher than that for ZIP code 11239 in Brooklyn, despite similar area income levels. Both areas have substantial black populations, and East Harlem also has a large Hispanic population. While these data cannot by themselves explain the higher rates in East Harlem, they can help policymakers decide where to look for the causes of these differences and target potential interventions more effectively.

While ZIP code-level analyses are often very helpful in focusing concerns in particular geographic areas, they can also present difficulties. In densely populated cities, ZIP codes are often too large a unit of analysis. In New York City, ZIP code 10027 stretches from the tenements of Central Harlem to the co-ops and condominiums overlooking the Hudson River adjacent to Columbia University. Some States have street address level data on hospital discharge records, permitting geocoding of data to the census tract level for ZIP codes with diverse populations. In less densely populated suburban or rural areas, ZIP codes may have too few residents to obtain statistically significant results. The level of random variation can be reduced by setting a minimum population size for each geographic area (e.g., 5,000 residents), and then combining ZIP codes not meeting the minimum threshold size with adjacent ZIP codes that have similar demographic characteristics.
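One possible way to carry out the combining step is sketched below. It is a greedy procedure, offered as an assumption rather than a prescribed method: the smallest area under the threshold is merged into the adjacent area whose poverty rate is closest, and the process repeats until no undersized area with a neighbor remains. Poverty rate stands in for "similar demographic characteristics," the area labels are fictional, and the adjacency table is assumed to list only areas that appear in the population table.

```python
def merge_small_areas(population, poverty, neighbors, min_pop=5000):
    """Greedily merge areas under `min_pop` into the adjacent area with the
    closest population-weighted poverty rate.

    Returns a dict mapping each original area to the label of its final group.
    Assumes every area has a positive population.
    """
    group_of = {z: z for z in population}    # area -> group label
    members = {z: {z} for z in population}   # group label -> member areas

    def group_pop(g):
        return sum(population[z] for z in members[g])

    def group_poverty(g):
        return sum(poverty[z] * population[z] for z in members[g]) / group_pop(g)

    def adjacent_groups(g):
        found = {group_of[n] for z in members[g] for n in neighbors.get(z, ())}
        return found - {g}

    while True:
        small = [g for g in members
                 if group_pop(g) < min_pop and adjacent_groups(g)]
        if not small:
            break
        g = min(small, key=lambda s: (group_pop(s), s))
        target = min(adjacent_groups(g),
                     key=lambda t: (abs(group_poverty(t) - group_poverty(g)), t))
        members[target] |= members.pop(g)
        for z in members[target]:
            group_of[z] = target
    return group_of


# Fictional areas: "A" is under the 5,000-resident threshold.
population = {"A": 2000, "B": 8000, "C": 6000}
poverty = {"A": 0.30, "B": 0.10, "C": 0.28}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
print(merge_small_areas(population, poverty, neighbors))
```

In this example area A joins C rather than B because their poverty rates are closer. An isolated area with no listed neighbors is left as is and should be reviewed by hand.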

Despite such cautions, ZIP code boundary changes (the post office controls ZIP code boundaries, which are changed to meet its mail routing requirements) and other data problems can produce anomalous results. One mechanism that can be used to increase the level of confidence in ACS rates is to calculate rates for a set of "marker" or "reference" conditions that involve diagnoses where physicians agree on the criteria for admission, and timely and effective ambulatory care is unlikely to have any direct or immediate impact on the need for hospital admissions (Appendix B). For these conditions (e.g., appendectomy and hip fracture), the level of variation among areas should be minimal and any large difference in rates (significantly above or below area averages) indicates that an error (in the numerator or denominator of the equation for the area) has most likely occurred, and that caution should be exercised in interpreting ACS rates for the area.
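The marker condition check can be sketched as a simple screen that flags any area whose marker rate departs from the average across areas by more than a chosen proportion. The 50 percent tolerance, the function name, and the sample rates are assumptions for illustration; an analyst might prefer a formal statistical test.

```python
def flag_marker_outliers(marker_rates, tolerance=0.5):
    """Return the areas whose marker-condition rate differs from the mean
    across areas by more than `tolerance` (a proportion of the mean)."""
    mean_rate = sum(marker_rates.values()) / len(marker_rates)
    return sorted(
        area for area, rate in marker_rates.items()
        if abs(rate - mean_rate) > tolerance * mean_rate
    )


# Hypothetical marker-condition admissions per 1,000 residents, by area.
marker_rates = {"Z1": 2.0, "Z2": 2.2, "Z3": 1.8, "Z4": 6.0}
print(flag_marker_outliers(marker_rates))  # ['Z4']
```

A flagged area is a prompt to recheck the numerator and denominator for that area before interpreting its ACS rate, not a finding in itself.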

Another complicating factor in analyzing hospital discharge data is differences in physician practice style. An extensive body of literature documents the extent of this variation, especially among regions of the country and among hospital market areas (Wennberg and Gittelsohn, 1973; Wennberg et al., 1989). ACS rates include conditions where substantial differences of opinion can exist among physicians on whether the patient needs to be admitted or can be treated safely in an outpatient setting. While research has established that differences in ACS rates between low- and high-income areas within a geographic region are not attributable to these practice style differences (Bindman et al., 1995), caution is required in making comparisons of absolute rates among areas, especially those separated by substantial geography. This problem is effectively illustrated by comparing ACS rates for adults age 40-64 in Baltimore (Figure 3) and San Francisco (Figure 8). There is a strong association between ACS rates and ZIP code income in both communities, and there is a fourfold difference between rates for low- and high-income areas in both. However, the mean rate in San Francisco (13.17 per 1,000) is about half the rate in Baltimore (26.69 per 1,000), reflecting the generally lower hospital utilization levels on the West Coast.

Finally, it is important to note that ACS conditions include a broad range of diagnoses across age groups, body systems, and genders. Some individual conditions have admission rates that are sufficiently high to warrant separate analysis, such as asthma, diabetes, severe ear, nose, or throat infections (for children), and congestive heart failure (for adults). It is often useful to examine some of these conditions individually (especially asthma), since different factors may contribute to high rates in an area. Also, some analysts have examined subsets of ACS conditions, grouping acute and chronic conditions separately, again reflecting the possibility of differences in the causal factors for high ACS rates. For acute conditions, timely "front door" access to care may be most important to prevent progression of the condition, while for chronic conditions patient self-management skills (and confidence) and provider performance may be more important. A full list of ACS conditions, marker/reference conditions, and referral sensitive surgeries is contained in Appendix B. Software to create age- and sex-adjusted rates for total and individual ACS conditions, marker/reference conditions, and referral sensitive surgeries is available at no charge.


Emergency Room Records


In many communities, the local emergency department (ED) is the de facto "safety net of the safety net." Patients without access to care in other settings often use the ED for routine primary care, while others delay seeking needed care due to access problems or because they are unaware of the availability of primary care services. Accordingly, increased attention is being given to the potential for the ED to be a "window" on the safety net, and many analysts have begun to examine ED visit records to help learn more about utilization patterns and potential access problems.

The New York University Center for Health and Public Service Research and the United Hospital Fund of New York emergency department algorithm classifies ED use into four basic categories:

  • Non-emergent. Cases where care is not required within 12 hours (e.g., sore throat).
  • Emergent-primary care treatable. Care is needed within 12 hours, but could be provided in a typical primary care setting (e.g., an infant with a 102°F fever).
  • Emergent-ED care needed: preventable/avoidable. Immediate care in an ED setting is needed, but the condition could potentially have been prevented or avoided with timely and effective ambulatory care (asthma, diabetic ketoacidosis, and so on).
  • Emergent-ED care needed: not preventable/avoidable. Immediate care in an ED setting is needed, and the condition could not have been prevented/avoided with ambulatory care (heart attacks, multiple trauma, and so on).

In addition, the algorithm separates out visits with a primary diagnosis involving mental health, substance abuse, or injury since these conditions are difficult to classify and may be of special interest to analysts. Figure 9 illustrates the algorithm.

As with ACS rates for hospitalization, analysis of ED records usually cannot provide a causal explanation for a high rate of ED use for non-emergent, primary care treatable, or preventable/avoidable conditions. However, analysis of patterns of use among population subgroups and geographic areas can be useful in identifying areas of particular concern to focus further inquiry or to develop intervention strategies.

Obtaining ED Records

The availability of computerized ED records is considerably more problematic than obtaining hospital discharge data. Only a handful of States currently mandate submission of such data to centralized State authorities, although the number is increasing. In these jurisdictions, the approach and requirements for obtaining the data are similar to those for hospital discharge data.

However, in most communities ED records can be obtained only by contacting each area hospital individually and requesting their assistance. This is a difficult and often frustrating task. The good news is that virtually all hospitals have computerized ED visit databases, again maintained primarily for payment purposes. The data are relatively easy to access, and can be downloaded in a readable format. Unfortunately, many hospitals may be unwilling to share information or unlikely to reassign information technology staff to provide ED data to an outsider. Accordingly, the basic rule of thumb is to ask nicely for as little as possible, and be flexible as to the format and media (tapes, disks, and so on). See also the discussion on obtaining birth records.

Access to data can be easier when participants are willing. It is important to remind hospitals that the data will not be used to blame the ED (and hospital). High ED use rates are typically the result of a failure of the primary care delivery system, and hospital EDs (which are mandated by the Emergency Medical Treatment and Active Labor Act of 1986 [EMTALA] to accept everyone) are simply the reluctant recipients of fallout from this problem. Often ED analyses are conducted in conjunction with larger needs assessment efforts by State or local government, and a letter from the mayor or governor to the hospital requesting cooperation always helps. Many of the Health Resources and Services Administration (HRSA)-sponsored Community Access Program grantees involve large coalitions of local providers interested in improving access for vulnerable populations, and obtaining participation from coalition members has been quite successful.

ED records are also subject to the requirements of the Health Insurance Portability and Accountability Act (HIPAA) (see the discussion under Obtaining Hospital Discharge Databases). To the extent that obtaining ED records is a "retail" effort, in the short run significant problems are likely to arise as individual hospitals sort out the process and procedures for complying with HIPAA provisions.

Analyzing the Data

Because analysis of ED databases is a relatively recent development, the potential problems with data quality are likely to be greater. Of primary concern is determining whether the database includes patients admitted to the hospital. Since a separate ED payment is often not permitted for admitted patients, in some (but not all) databases these visits are expunged. The second level of concern is the common problem of incomplete records with missing fields (especially for expected payer). A final problem relates to differences in coding practices among hospitals. This problem is likely to be acute if the data were obtained directly from hospitals. There may be differences in how race, gender, and payer are coded. Software for applying the ED classification algorithm is available at no charge and includes routines for recoding ED data.

Some of the greatest problems in analyzing ED data relate to diagnostic codes. These codes are based on the ICD-9-CM system, which has three digits to the left of a decimal and up to two digits to the right (Practice Management Information Corporation, 2002). In some databases the decimal has been removed or the variable converted from an alphanumeric field to a numeric field, often making consistent analysis impossible. The bigger problem relates to the tendency among some hospitals to truncate ICD-9-CM codes to the first three digits. Since ED payment is not typically based on diagnosis, this is not irrational, since the first three digits effectively identify the general disease/medical condition category for the patient. But analysis of whether a condition is primary care treatable is often dependent on the digits to the right of the decimal. For example, there is an important difference between an uncomplicated diabetes patient with elevated glucose levels (ICD-9-CM 250.0), and a diabetic patient with ketoacidosis (250.1) or hyperosmolar coma (250.2). In these cases, if the code has been truncated at the third digit, important distinctions are lost and effective analysis is not possible.
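A first screen for these coding problems can be sketched in Python. The function below restores a missing decimal point after the third digit and reports whether only the three-digit category is present. It is limited, by assumption, to numeric diagnosis codes stored as text: codes beginning with V or E are rejected, and a value with fewer than three leading digits (as happens when a numeric field drops leading zeros) raises an error because the original code cannot be recovered with certainty. The function name is hypothetical.

```python
def normalize_icd9(raw):
    """Return (code, category_only) for a numeric ICD-9-CM diagnosis code.

    `category_only` is True when no digits follow the three-digit category,
    which may indicate truncation.
    """
    text = str(raw).strip().upper()
    if not text or text[0] in "VE":
        raise ValueError(f"unsupported or empty code: {raw!r}")
    if "." in text:
        category, _, detail = text.partition(".")
    else:
        category, detail = text[:3], text[3:]
    if len(category) != 3 or not (category + detail).isdigit():
        raise ValueError(f"cannot interpret code: {raw!r}")
    category_only = detail == ""
    code = category if category_only else f"{category}.{detail}"
    return code, category_only


print(normalize_icd9("2501"))   # ('250.1', False)
print(normalize_icd9("250.0"))  # ('250.0', False)
print(normalize_icd9("250"))    # ('250', True)
```

Because some categories legitimately have no further digits, a category-only code is not proof of truncation; a hospital whose codes are almost all category-only is the stronger signal.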

As with hospital discharge analysis, ED records can produce results that are intuitive to policymakers. For example, in the State of Missouri, for children ages 0-17, more than 62 percent of ED visits by Medicaid patients were "preventable or avoidable" (i.e., non-emergent, emergent-primary care treatable, or emergent-ED care needed-preventable/avoidable). For children with commercial coverage in Missouri, only 43 percent were "preventable/avoidable" (Figure 10). Clearly, there are problems in ED patterns for Medicaid-covered children in Missouri, but it is also apparent that even for commercially insured children, ED use is not optimal. ZIP code-level analysis can also be revealing. Where data from all area hospitals are available, it is possible to create per capita ED use rates for the various ED classification categories (rather than percentage distributions among the categories). The data are dramatic. For example, in the Baltimore MSA, the association between "preventable/avoidable" ED use/1,000 and area income for children ages 0-17 was very strong (R² = .736) and even higher for adults (R² = .783) (Figures 11 and 12). In Austin, TX, the local Community Access Program initiative conducted an analysis of ED use patterns and found some central city areas with very high ED use rates, with generally lower rates in most suburban areas (Figure 13).
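The two summary measures described above, the "preventable or avoidable" share of visits and the per capita use rate, can be sketched as follows. The sketch assumes visits have already been assigned to the four algorithm categories and that the share is computed over visits in those four categories only, leaving out the separately tracked mental health, substance abuse, and injury visits; whether the published percentages use that denominator is not stated here. Category keys and counts are hypothetical.

```python
PREVENTABLE_AVOIDABLE = (
    "non_emergent",
    "emergent_pc_treatable",
    "emergent_ed_preventable",
)
ALL_CLASSIFIED = PREVENTABLE_AVOIDABLE + ("emergent_ed_not_preventable",)


def preventable_share(counts):
    """Share of classified ED visits in the three preventable/avoidable categories."""
    classified = sum(counts.get(c, 0) for c in ALL_CLASSIFIED)
    if classified == 0:
        raise ValueError("no classified visits")
    return sum(counts.get(c, 0) for c in PREVENTABLE_AVOIDABLE) / classified


def preventable_rate_per_1000(counts, population):
    """Preventable/avoidable ED visits per 1,000 area residents."""
    visits = sum(counts.get(c, 0) for c in PREVENTABLE_AVOIDABLE)
    return 1000.0 * visits / population


# Hypothetical visit counts for one area.
counts = {
    "non_emergent": 400,
    "emergent_pc_treatable": 250,
    "emergent_ed_preventable": 100,
    "emergent_ed_not_preventable": 250,
    "injury": 150,
}
print(preventable_share(counts))                # 0.75
print(preventable_rate_per_1000(counts, 5000))  # 150.0
```

The per capita rate is meaningful only when records from all hospitals serving the area are included; otherwise the numerator undercounts visits.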



Analysis of administrative data has the obvious advantages discussed: the data are available, and often can be analyzed relatively inexpensively. These data can be enormously helpful in assessing need in a community, especially in making comparisons among population subgroups and geographic areas.

However, administrative data seldom establish a definitive causal link between a data outcome and the factors that led to the rate. Access and safety net performance are incredibly complex issues. There is no single factor that can assure access to care, or that explains all access problems. Safety net performance is undoubtedly affected by many variables, including resource supply, composition, support levels, demand levels, and so on.

But policymakers may be tempted to jump to conclusions that may or may not be justified by the analysis. For example, the first reaction to high ACS rates in an area is typically to add resources, such as more primary care doctors. But while supply of available physicians (i.e., those willing to accept all patients regardless of ability to pay) is undoubtedly critical, high ACS rates may also be related to a broad range of other factors, including lack of effective patient education on self-management/symptom identification, dissatisfaction with existing providers (lack of dignity and respect, long wait times, dirty facilities, and so on), or simple lack of awareness of resource availability. Sorting through these alternative explanations is critical to any safety net needs assessment and is likely to become essential in an era of constrained resources and coverage contraction.

References

Anderson RN. Deaths: Leading causes for 2000. National vital statistics reports; vol. 50 no. 16. Hyattsville (MD): National Center for Health Statistics; 2002.

Billings J, Anderson G, Newman L. Recent findings on preventable hospitalizations. Health Aff (Millwood) 1996 Fall;15(3):239-49.

Billings J, Zeitel L, Lukomnik J, et al. Impact of socioeconomic status on hospital use in New York City. Health Aff (Millwood) 1993 Spring;12(1):162-73.

Bindman A, Grumbach K, Osmond D, et al. Preventable hospitalizations and access to health care. JAMA 1995 Jul 26;274(4):305-11.

Practice Management Information Corporation. International Classification of Diseases. 9th Revision: Clinical Modification. 6th Edition. Los Angeles (CA): PMIC; 2002.

Millman ML, editor. Access to health care in America. Washington (DC): National Academy Press; 1993.

Smedley BD, Stith AY, Nelson AR, editors. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Committee on Understanding and Eliminating Racial and Ethnic Disparities in Health Care. Washington (DC): National Academy Press; 2003.

Wennberg J, Freeman J, Shelton R, et al. Hospital use and mortality among Medicare beneficiaries in Boston and New Haven. N Engl J Med 1989 Oct 26;321(17):1168-73.

Wennberg J, Gittelsohn A. Small area variations in health care delivery. Science 1973;182(4117):1102-8.

Current as of September 2003




