# Notes: Description of Data Cleaning and Calculations

## 2009 Comparative Database Report

This notes section provides additional detail regarding how various statistics presented in this report were calculated.

### Data Cleaning

Each participating hospital was asked to submit cleaned, individual-level survey data. However, as an additional check, once the data were submitted, response frequencies were run on each hospital's data to look for out-of-range values, missing variables, or other data anomalies. When data problems were found, hospitals were contacted and asked to make corrections and resubmit their data. In addition, each participating hospital was sent a copy of its data frequencies so that it could verify that the data set received was correct.

### Response Rates

As part of the data submission process, hospitals were asked to provide their response rate numerator and denominator. Response rates were calculated using the formula below.

$$
\text{Response Rate} = \frac{\text{Number of complete, returned surveys}}{\text{Number of surveys distributed} - \text{Ineligibles}}
$$

**Numerator** = Number of complete, returned surveys. The numerator equals the number of individual survey records submitted to the database. It should **exclude** surveys that were returned blank on all nondemographic survey items, but **include** surveys where at least one nondemographic survey item was answered.

**Denominator** = The total number of surveys distributed minus ineligibles. Ineligibles include deceased individuals or those who were not employed at the hospital during data collection.

As a data cleaning step, we examined whether any individual survey records submitted to the database were missing responses on all of the nondemographic survey items (indicating the respondent did not answer any of the main survey questions). We found records in which the respondent had left all nondemographic survey items blank, even though such records should not have been submitted to the database. We therefore removed these blank records from the larger dataset and adjusted any affected hospital's response rate numerator and overall response rate accordingly.
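The blank-record check and the response rate formula can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for the database: the function name, the record layout (one dictionary per survey, with `None` for an unanswered item), the `staff_position` field, and the sample counts are all assumptions made for the example.

```python
def response_rate(records, nondemographic_items, distributed, ineligibles):
    """Return (response rate, number of records kept in the numerator)."""
    # A record counts toward the numerator only if at least one
    # nondemographic item was answered; fully blank records are dropped.
    kept = [
        r for r in records
        if any(r.get(item) is not None for item in nondemographic_items)
    ]
    denominator = distributed - ineligibles
    if denominator <= 0:
        raise ValueError("surveys distributed must exceed ineligibles")
    return len(kept) / denominator, len(kept)


# Hypothetical data: three submitted records, one of them blank on every
# nondemographic item (it answers only a demographic question).
records = [
    {"A15": "Agree", "A18": None, "staff_position": "Nurse"},
    {"A15": None, "A18": None, "staff_position": "Nurse"},  # blank, dropped
    {"A15": None, "A18": "Disagree", "staff_position": None},
]
rate, kept = response_rate(records, ["A15", "A18"], distributed=12, ineligibles=2)
print(kept, rate)  # 2 0.2
```

Dropping the blank record lowers the numerator from 3 to 2, which is the kind of adjustment described above.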

### Item and Composite Percent Positive Scores

To calculate your hospital's composite score, simply average the percentage of positive responses to each item in the composite. Here is an example of computing a composite score for Overall Perceptions of Patient Safety:

- There are four items in this composite: two are positively worded (items A15 and A18) and two are negatively worded (items A10 and A17). Keep in mind that DISAGREEING with a negatively worded item indicates a POSITIVE response.
- Calculate the percentage of positive responses at the item level (see the example in Table 1).

### Table 1. Example of Computing Item and Composite Percent Positive Scores

Four items measuring "Overall Perceptions of Patient Safety" | For positively worded items, count the number of "Strongly agree" or "Agree" responses | For negatively worded items, count the number of "Strongly disagree" or "Disagree" responses | Total number of responses to the item | Percent positive response on item |
---|---|---|---|---|
Item A15 (positively worded): "Patient safety is never sacrificed to get more work done" | 120 | NA* | 260 | 120/260 = 46% |
Item A18 (positively worded): "Our procedures and systems are good at preventing errors from happening" | 130 | NA* | 250 | 130/250 = 52% |
Item A10 (negatively worded): "It is just by chance that more serious mistakes don't happen around here" | NA* | 110 | 240 | 110/240 = 46% |
Item A17 (negatively worded): "We have patient safety problems in this unit" | NA* | 140 | 250 | 140/250 = 56% |
Composite Score % Positive | | | | (46% + 52% + 46% + 56%) / 4 = 50% |

\* NA = Not applicable

In this example, there were 4 items, with percent positive response scores of 46 percent, 52 percent, 46 percent, and 56 percent. Averaging these item-level percent positive scores results in a composite score of .50 or 50 percent on Overall Perceptions of Patient Safety. In this example, an average of about 50 percent of the respondents responded positively to the survey items in this composite.
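The Table 1 calculation can be reproduced with a short Python sketch. The counts come from Table 1; the function and variable names are illustrative assumptions. Note that the code averages the unrounded item percentages, whereas Table 1 averages the rounded ones; both give 50 percent after rounding.

```python
# (positive responses, total responses) for each item, taken from Table 1.
# For the negatively worded items (A10, A17) the positive count is the number
# of "Strongly disagree" or "Disagree" responses.
table1 = {
    "A15": (120, 260),
    "A18": (130, 250),
    "A10": (110, 240),
    "A17": (140, 250),
}


def item_percent_positive(counts):
    """Percent positive response for each item."""
    return {item: 100 * pos / total for item, (pos, total) in counts.items()}


def composite_score(counts):
    """Average of the item-level percent positive scores (equal item weight)."""
    item_pct = item_percent_positive(counts)
    return sum(item_pct.values()) / len(item_pct)


item_pct = item_percent_positive(table1)
print({item: round(pct) for item, pct in item_pct.items()})
# {'A15': 46, 'A18': 52, 'A10': 46, 'A17': 56}
print(round(composite_score(table1)))  # 50
```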

Once you calculate your hospital's percent positive response for each of the 12 safety culture composites, you can compare your results with the composite-level results from the 622 database hospitals.

Note that the method described above for calculating composite scores is slightly different from the method described in the September 2004 Survey User's Guide that is part of the original survey toolkit materials on the AHRQ Web site. The guide advises computing composites by calculating the overall percent positive across all the items within a composite. The updated recommendation included in this report is to compute item percent positive scores first, and then average the item percent positive scores to obtain the composite score, which gives equal weight to each item in a composite. The Survey User's Guide will eventually be updated to reflect this slight change in methodology.
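To make the difference between the two methods concrete, the sketch below applies both to the Table 1 counts. It assumes that "overall percent positive across all the items" means total positive responses divided by total responses; the function names are illustrative, not taken from the guide.

```python
# (positive responses, total responses) per item, from Table 1.
table1 = [(120, 260), (130, 250), (110, 240), (140, 250)]


def averaged_composite(counts):
    """Updated method: average the item percent positive scores."""
    return sum(100 * pos / total for pos, total in counts) / len(counts)


def pooled_composite(counts):
    """Earlier method (as read here): pool all responses across items."""
    total_positive = sum(pos for pos, _ in counts)
    total_responses = sum(total for _, total in counts)
    return 100 * total_positive / total_responses


print(round(averaged_composite(table1), 2))  # 50.0
print(pooled_composite(table1))              # 50.0
```

With these counts the two methods agree to two decimal places because each item received a similar number of responses. The pooled method weights items by their response counts, so the results diverge more when some items have many more missing responses than others.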

### Percentiles

Percentiles were computed using the SAS® Software default method. The first step in this procedure is to rank order the percent positive scores from all the participating hospitals, from lowest to highest. The next step is to multiply the number of hospitals (n) by the percentile of interest (p), which in our case would be the 10th, 25th, 50th, 75th, or 90th percentile.

For example, to calculate the 10th percentile, one would multiply 622 (the total number of hospitals) by .10 (10th percentile). The product of n x p is equal to "j + g", where "j" is the integer part and "g" is the fractional part (the number after the decimal point). If "g" equals 0, the percentile is equal to the percent positive value of the hospital in the jth position plus the percent positive value of the hospital in the (j + 1)th position, divided by 2: $(X_j + X_{j+1})/2$. If "g" is **not** equal to 0, the percentile is equal to the percent positive value of the hospital in the (j + 1)th position.

The following examples show how the 10th and 50th percentiles would be computed using a sample of percent positive scores from 12 hospitals (using fake data shown in Table 2). First, the percent positive scores are sorted from low to high on Composite "A."

### Table 2. Data Table for Example of How To Compute Percentiles

Hospital | Composite "A" % Positive Score | Percentile score |
---|---|---|
1 | 33% | |
2 | 48% | 10th percentile score = 48% |
3 | 52% | |
4 | 60% | |
5 | 63% | |
6 | 64% | 50th percentile score = 65% |
7 | 66% | |
8 | 70% | |
9 | 72% | |
10 | 75% | |
11 | 75% | |
12 | 78% | |

**10th Percentile**

- For the 10th percentile, we would first multiply the number of hospitals by .10 (n x p = 12 x .10 = 1.2).
- The product of n x p = 1.2, where "j" = 1 and "g" = .2. Since "g" is **not** equal to 0, the 10th percentile score is equal to the percent positive value of the hospital in the (j + 1)th position:
  - "j" equals 1.
  - The 10th percentile equals the value for the hospital in the 2nd position = 48 percent.

**50th Percentile**

- For the 50th percentile, we would first multiply the number of hospitals by .50 (n x p = 12 x .50 = 6.0).
- The product of n x p = 6.0, where "j" = 6 and "g" = 0. Since "g" = 0, the 50th percentile score is equal to the percent positive value of the hospital in the jth position plus the percent positive value of the hospital in the (j + 1)th position, divided by 2:
  - "j" equals 6.
  - The 50th percentile equals the average of the hospitals in the 6th and 7th positions: (64% + 66%)/2 = 65%.
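The percentile rule and the two worked examples can be checked with a small Python sketch. It is an illustration of the rule as described in this section, not the SAS code itself; the function name and the restriction to whole-number percentiles between 1 and 99 are assumptions. It uses integer arithmetic to split n x p into "j" and "g" so that floating-point rounding cannot turn an exact product into a value with a spurious fractional part.

```python
def percentile(scores, pct):
    """Percentile of `scores` using the n x p = j + g rule.

    scores: percent positive scores, one per hospital
    pct:    whole-number percentile of interest, from 1 to 99
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < pct < 100:
        raise ValueError("pct must be between 1 and 99")
    ranked = sorted(scores)  # rank order from lowest to highest
    n = len(ranked)
    # n * pct / 100 = j + g; remainder == 0 exactly when g == 0.
    j, remainder = divmod(n * pct, 100)
    if remainder == 0:
        # g == 0: average the values in positions j and j + 1 (1-based).
        return (ranked[j - 1] + ranked[j]) / 2
    # g != 0: take the value in position j + 1 (1-based), which is index j.
    return ranked[j]


# Composite "A" percent positive scores from Table 2.
table2 = [33, 48, 52, 60, 63, 64, 66, 70, 72, 75, 75, 78]
print(percentile(table2, 10))  # 48
print(percentile(table2, 50))  # 65.0
```

Both results match the worked examples: 48 percent for the 10th percentile and 65 percent for the 50th.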