U.S. Department of Health and Human Services
Agency for Healthcare Research and Quality
National Healthcare Disparities Report, 2003

Methods and Final Tables of the 2001 California Health Interview Survey for the National Healthcare Disparities Report

This information is for reference purposes only. It was current when produced and may now be outdated. Archive material is no longer maintained, and some links may not work.

Submitted by
Wei Yen, PhD, and E. Richard Brown, PhD
UCLA Center for Health Policy Research

June 23, 2003

This report, our final report for PO# 02R00019401D, describes the technical aspects of the CHIS 2001 data and the analysis for this project. The first sections of the report discuss the CHIS 2001 sample design, response rates, and weighting. These are followed by a discussion of the process of selecting CHIS data elements for the National Healthcare Disparities Report and a description of the analysis procedures. Results of the analysis (84 tables) are attached in 14 Excel files.


CHIS 2001 Sample Design

The 2001 California Health Interview Survey (CHIS 2001) is a representative sample of the non-institutionalized California population. It uses a two-stage, geographically stratified random-digit-dial (RDD) sample design. At the first stage, California telephone numbers were randomly generated by computer, and a random sample of these numbers was then drawn within each of 41 predefined geographic areas, or "strata" (33 strata are individual counties and 8 strata are groupings of counties with smaller populations). These telephone numbers were then dialed and screened to determine whether they belonged to households and were thus eligible for the survey.

At the second stage, one adult was randomly selected for interview from among all adults living in the contacted household. Only the selected person was eligible for the interview. In households with children associated with the selected adult, one adolescent (age 12-17) was interviewed, and information was obtained for one child under age 12 by interviewing the adult most knowledgeable about that child's health care. When more than one child or adolescent with whom the selected adult was "associated" (as a parent or guardian) resided in the household, the child and the adolescent were each chosen at random. (Adjustment factors for the selection mechanisms have been incorporated into the data's sample weights.)

A minimum sample size goal of 800 adult interviews was set for each stratum, resulting in over-samples for counties with small populations and, consequently, over-samples of rural areas. Sample size goals larger than 800 (ranging from 1,000 to 2,660) were set for counties with larger population sizes. The largest county sample, over 11,000 adult interviews, was allocated to Los Angeles County. Additionally, supplemental geographic samples were collected in three cities (Berkeley, Long Beach, Pasadena) and in three counties (San Francisco, Santa Barbara, Solano).

CHIS 2001 also over-sampled several ethnic minority population groups, supplementing the statewide samples for American Indians and Alaska Natives (AIANs), Koreans, Vietnamese, South Asians, Japanese, and Cambodians. The sample goal for each of these ethnic groups was a minimum of 800 completed adult interviews, with the exception of Cambodians, for whom the goal was set at 200.

The actual sample yield from CHIS 2001 includes 24 strata with a size of 755-855 interviews, 14 strata with a size of 971-1,974 interviews, 2 strata with about 2,500 interviews each, and 1 stratum with over 12,000 interviews (Los Angeles). All targeted ethnic sample sizes have been met.



Response Rate

The response rates for CHIS 2001 are calculated separately for the adult, adolescent, and child surveys because each survey has its own selection process. The child and adolescent response rates are, however, contingent on the adult response rate because the child and adolescent were selected during the adult interview. While numerous methods can be used to calculate response rates, the calculation adopted for CHIS 2001 is one of the most stringent and follows the guidelines of the American Association for Public Opinion Research. Using this calculation, the CHIS 2001 adult overall response rate is 37.7% (the product of the 59.2% screener completion rate and the 63.7% adult interview completion rate). The adolescent overall response rate is 23.9% (the product of the adolescent interview completion rate of 63.5%, the screener completion rate, and the adult extended interview completion rate). The child overall response rate is 33.0% (the product of the child interview completion rate of 87.6%, the screener completion rate, and the adult extended interview completion rate). The CHIS 2001 adult overall response rate is comparable to those of similar surveys conducted in California.
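The arithmetic behind these overall rates can be checked with a short sketch. The completion rates are the ones reported above; the function and variable names are illustrative only and are not part of the CHIS documentation.

```python
# Completion rates reported in the text, expressed as fractions.
SCREENER = 0.592
ADULT_INTERVIEW = 0.637
ADOLESCENT_INTERVIEW = 0.635
CHILD_INTERVIEW = 0.876


def overall_rate(*stage_rates):
    """Multiply the completion rates of the successive survey stages."""
    result = 1.0
    for rate in stage_rates:
        result *= rate
    return result


# The adolescent and child rates are contingent on the adult stages,
# so the screener and adult interview rates enter every product.
adult = overall_rate(SCREENER, ADULT_INTERVIEW)
adolescent = overall_rate(SCREENER, ADULT_INTERVIEW, ADOLESCENT_INTERVIEW)
child = overall_rate(SCREENER, ADULT_INTERVIEW, CHILD_INTERVIEW)

for label, rate in [("adult", adult), ("adolescent", adolescent), ("child", child)]:
    print(f"{label}: {rate:.1%}")
```

Rounded to one decimal place, the three products reproduce the 37.7%, 23.9%, and 33.0% figures given in the text.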




Weighting

The CHIS 2001 sample was weighted to reflect the population both statewide and at the level of each stratum. To produce correct population estimates from the CHIS 2001 results, weights are applied to the sample data to compensate for a variety of factors, some resulting directly from the design and administration of the survey. Sample weighting was carried out to accomplish the following objectives:

  • Compensate for differential probabilities of selection for households and persons;
  • Reduce biases occurring because non-respondents may have different characteristics than respondents;
  • Adjust, to the extent possible, for under-coverage in the sampling frames and in the conduct of the survey; and
  • Reduce the variance of the estimates by using auxiliary information.

As part of the weighting process for the RDD sub-samples (each stratum is an independent sample), a household weight was created for all households that completed the screener interview. This household weight is the "base weight" computed as the inverse of the probability of selection of the sample telephone number adjusted for each of the following:

  • Sub-sampling for listed address/advance letter status;
  • Unknown residential status;
  • Screener interview non-response;
  • Multiple telephone numbers; and
  • Household post-stratification.

A "post-stratified household weight" was then used to compute a person-level weight. This person-level weight incorporates the within-household probability of selection of the sampled person and adjusts for non-response, plus an adjustment resulting from raking the data to person level control totals. Each of these adjustments corresponds to a multiplicative weighting factor.
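Because each adjustment is a multiplicative factor, the construction of a weight can be sketched as a base weight multiplied by a sequence of factors. The sketch below is illustrative only: the selection probability and every factor value are hypothetical numbers chosen for the example, not CHIS 2001 values, and the function names are assumptions.

```python
from math import prod


def base_weight(selection_probability):
    """Base weight: inverse of the probability of selecting the telephone number."""
    return 1.0 / selection_probability


def apply_adjustments(weight, factors):
    """Apply a sequence of multiplicative adjustment factors to a weight."""
    return weight * prod(factors)


# Hypothetical household-level factors, in the order listed in the text:
# sub-sampling, unknown residential status, screener non-response,
# multiple telephone numbers, household post-stratification.
household_factors = [1.10, 1.05, 1.25, 0.50, 0.98]
household_weight = apply_adjustments(base_weight(0.002), household_factors)

# Hypothetical person-level factors: inverse of the within-household
# selection probability (one of two adults), non-response, raking.
person_factors = [2.0, 1.30, 1.02]
person_weight = apply_adjustments(household_weight, person_factors)

print(f"household weight: {household_weight:.1f}")
print(f"person weight: {person_weight:.1f}")
```

The order of multiplication does not affect the final weight, although in practice each factor is computed from the weights produced by the preceding step.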

The control totals used in the raking were derived from the Census 2000 Summary File 1 (SF1). Population items in SF1 include sex, age, race, ethnicity (Latino/non-Latino), household relationships, and group quarters. The race classification in SF1 includes six groups: White, African American, American Indian/Alaska Native, Asian, Native Hawaiian/Pacific Islander, and a category of Other Race.
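Raking adjusts the weights iteratively so that weighted totals match the control totals on each dimension in turn. The following is a minimal sketch of that general technique (iterative proportional fitting), not the procedure actually used for CHIS 2001. The two dimensions, the four records, and the control totals are hypothetical, and the sketch assumes every control category contains at least one record.

```python
def rake(records, controls, iterations=50, tol=1e-8):
    """Rake weights so weighted totals match control totals on each variable.

    records  -- list of dicts holding category values and a starting "weight"
    controls -- {variable: {category: control_total}}
    Returns the list of raked weights, in record order.
    """
    weights = [r["weight"] for r in records]
    for _ in range(iterations):
        max_change = 0.0
        for variable, targets in controls.items():
            # Current weighted total for each category of this variable.
            totals = {category: 0.0 for category in targets}
            for record, weight in zip(records, weights):
                totals[record[variable]] += weight
            # Scale every weight so the category totals hit the controls.
            for i, record in enumerate(records):
                factor = targets[record[variable]] / totals[record[variable]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # a full pass changed nothing: converged
            break
    return weights


# Hypothetical sample: one record per sex-by-age cell, starting weight 1.
sample = [
    {"sex": "M", "age": "under 45", "weight": 1.0},
    {"sex": "M", "age": "45+", "weight": 1.0},
    {"sex": "F", "age": "under 45", "weight": 1.0},
    {"sex": "F", "age": "45+", "weight": 1.0},
]
# Hypothetical control totals; both dimensions sum to the same population.
control_totals = {
    "sex": {"M": 60.0, "F": 40.0},
    "age": {"under 45": 70.0, "45+": 30.0},
}

raked = rake(sample, control_totals)
print([round(w, 2) for w in raked])
```

After raking, the weighted totals agree with the controls on both dimensions at once, which is the property the person-level raking adjustment is meant to provide.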

The race/ethnic supplemental samples were weighted using a different process. The two race/ethnic supplemental samples used in this report, Japanese and Vietnamese, came from the CHIS 2001 Asian Ethnic Supplemental Sample, which also includes Cambodian, Korean, and South Asian samples. To create the weights for the race/ethnic supplements, the RDD and list samples were combined and weighted as a single state-level sample. A base weight was created using the selection probability for each source (RDD and list). The base weight was then adjusted for sub-sampling, multiple phone lines, and over-sample selection bias (over-sample respondents were screened on certain selection criteria), and the weights were finally raked to population control totals.



Selection of CHIS Data Relevant to National Healthcare Disparities Report

At the recommendation of AHRQ staff, we conducted a thorough review of the "priority populations" as defined in the Healthcare Research and Quality Act of 1999 and the topic areas as identified by the National Healthcare Disparities Report (NHDR). We found that CHIS 2001 contains sufficient sample sizes for many of the "priority populations" identified by the Act, such as low-income groups, minority groups, women, children, the elderly, individuals with chronic conditions, rural residents, and inner-city residents. We further found that CHIS 2001 contains many of the health measures identified by NHDR. Based on our reviews of the "priority populations," the target health measures, and the CHIS 2001 sample and content, we identified a total of 31 health measures and 17 population measures that are relevant to NHDR. In consultation with AHRQ staff, we recommended 14 of the health measures and 9 of the population measures to be considered for inclusion in the NHDR. These measures include:

Health Measures

  • % of persons with health insurance (at the time of interview)
  • % of persons with any period of uninsurance during a year
  • % of adult employees whose employer offers health insurance coverage
  • % of persons who have a specific source of ongoing care
  • % of persons in fair or poor health who have a specific source of ongoing care
  • % of persons who experience delays in getting, or do not get, a prescription drug, test or treatment, or any other medical care
  • % of persons in fair or poor health with no doctor visits in past year
  • % of women (age 40+) who report they had a mammogram within the past 2 years
  • % of women (18 and over) who report that they had a Pap smear within the past 3 years
  • % of men and women (50 and older) who report they ever had a flexible sigmoidoscopy/colonoscopy
  • % of men and women (50 and older) who report they had a fecal occult blood test (FOBT) within the past 2 years
  • % of adults with diabetes who had a hemoglobin A1c measurement at least once in past year
  • % of adults with diabetes who had a foot examination in past year
  • % of persons age 65 and over who received an influenza vaccination in the past 12 months

Population Measures

  • Age
  • Gender
  • Income as percent of federal poverty level (FPL)
  • Census race classification
  • Census Hispanic/Latino origin status
  • Latino subgroups (Mexican, Central American, Puerto Rican, South American, Other Latino)
  • Asian subgroups (Chinese, Filipino, Japanese, Vietnamese)
  • Language(s) spoken at home
  • Urban/Rural status


Data Files

The CHIS 2001 source data files were used for the analysis. The analysis used data from the RDD sample file and the Asian ethnic supplemental sample file for all age groups (adult, adolescent, and child). The analysis involving Japanese and Vietnamese used data from the Asian ethnic supplemental sample file; the rest of the analysis used data from the RDD sample file, including the analysis of two other Asian ethnic groups – Chinese and Filipino.



Analysis

As discussed above, the health measures and population measures used in this project were selected through a process that included an initial review of the relevant measures in CHIS 2001 and selection of the final set by CHIS staff in consultation with AHRQ staff.

The analysis generated tabular reports in which each health measure is examined across all the selected population characteristics (or subgroups). Two statistics are generated for each estimate: the percent and the standard error of the percent estimate. The percent estimate is the percent of the population in the cross-tabulation cell reporting the health condition, behavior, or health care status indicated by the table title. For instance, if the cell is from a cross-tabulation of age and race and the health measure is "% of persons with any period of uninsurance during a year," then a percent estimate of 23.2% for the cell of Asians (race) aged 18-44 (age) means that 23.2 percent of all Asians aged 18-44 were uninsured at some point during a year. Its corresponding standard error estimate is 1.2 percent.
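As a rough illustration of how a percent estimate for one cell is formed from weighted records, the sketch below computes a weighted percent. It is not the SAS or SUDAAN procedure used for the tables, it does not compute the standard error (which depends on the sample design), and the records, field names, and weights are hypothetical.

```python
def weighted_percent(records, in_cell, has_outcome):
    """Weighted percent of the cell population reporting the outcome."""
    cell = [r for r in records if in_cell(r)]
    total = sum(r["weight"] for r in cell)
    hits = sum(r["weight"] for r in cell if has_outcome(r))
    return 100.0 * hits / total


# Hypothetical respondent records with final person-level weights.
respondents = [
    {"race": "Asian", "age": 30, "uninsured_in_year": True, "weight": 300.0},
    {"race": "Asian", "age": 40, "uninsured_in_year": False, "weight": 700.0},
    {"race": "Asian", "age": 60, "uninsured_in_year": True, "weight": 500.0},
    {"race": "White", "age": 25, "uninsured_in_year": False, "weight": 900.0},
]

estimate = weighted_percent(
    respondents,
    in_cell=lambda r: r["race"] == "Asian" and 18 <= r["age"] <= 44,
    has_outcome=lambda r: r["uninsured_in_year"],
)
print(f"{estimate:.1f}%")
```

Only records inside the cell contribute to the denominator, so each cell's percent describes that subgroup rather than the whole population.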

Although CHIS 2001 has a relatively large sample size, some of the estimates in this report are based on small cell sizes. These estimates are for rare population segments, such as Native Hawaiians and Other Pacific Islanders (respondent's only reported race) aged 45-64, for which there are only 65 records in CHIS 2001. Such estimates are usually accompanied by large variances, which indicate that the estimates may not be stable or reliable. We adopted a cut-off value for the ratio of the standard error to the percent estimate as an indicator of whether the estimate is stable. The cut-off value of this ratio, the coefficient of variation (COV), is 0.3 (or 30 percent). If the COV is equal to or greater than 0.3, the percent estimate is considered unstable or unreliable. Take, for example, the one-race Native Hawaiians and Other Pacific Islanders aged 45-64 who had any period of uninsurance during a year. The percent estimate shows 10.2% of this group to be uninsured at some time during the past year. The standard error estimate is 4.1%. The COV is then 4.1/10.2, or about 0.4, which is greater than the cut-off value of 0.3, indicating that the percent estimate of 10.2% for this group is not stable. All estimates identified as unstable in this way are highlighted in the reports.
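The stability rule can be expressed as a small check. This is a sketch of the rule as described above, using the two example estimates from the text; the function names are illustrative, and a percent estimate of zero is not handled.

```python
COV_CUTOFF = 0.3  # cut-off stated in the text (30 percent)


def coefficient_of_variation(percent, standard_error):
    """Ratio of the standard error to the percent estimate."""
    return standard_error / percent


def is_unstable(percent, standard_error, cutoff=COV_CUTOFF):
    """Flag an estimate whose COV is equal to or greater than the cut-off."""
    return coefficient_of_variation(percent, standard_error) >= cutoff


# Native Hawaiians and Other Pacific Islanders aged 45-64: 10.2% (SE 4.1%).
print(round(coefficient_of_variation(10.2, 4.1), 2))  # 0.4
print(is_unstable(10.2, 4.1))  # True

# Asians aged 18-44: 23.2% (SE 1.2%).
print(is_unstable(23.2, 1.2))  # False
```

Because the comparison is "equal to or greater than," an estimate whose COV is exactly 0.3 is also flagged as unstable.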

The statistical procedures used for these estimates are the SAS Surveymeans procedure and the SUDAAN Crosstab procedure. SUDAAN Crosstab is used only for the estimates for Japanese and Vietnamese from the Asian Ethnic Supplemental Sample File. Variance estimation for that file can be performed only with replicate weights, and SUDAAN Crosstab is capable of such estimation. All other estimates are calculated using the SAS Surveymeans procedure, which uses the design information in the RDD sample.

The final tables resulting from this analysis were submitted on February 24, 2003, and resubmitted on March 10, 2003. Attached again are the final tables: 14 Excel files, each containing 6 spreadsheets (tables).
