# Notes: Description of Data Cleaning and Calculations

## 2008 Comparative Database Report

This notes section provides additional detail regarding how various statistics presented in this report were calculated.

### Data Cleaning

Each participating hospital was asked to submit cleaned, individual-level survey data. However, as an additional check, once the data were submitted, response frequencies were run on each hospital’s data to look for out-of-range values, missing variables, or other data anomalies. In instances where data problems were found, hospitals were contacted, asked to make corrections and resubmit their data. In addition, each participating hospital was sent a copy of their data frequencies as an additional way for the hospitals to verify that the dataset received was correct.
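The frequency check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the report: the 1 to 5 response coding, the use of `None` for missing answers, and the field name `A15` as a record key are all assumptions made for the example.

```python
from collections import Counter

# Assumption: each item is coded 1-5, and None marks a missing answer.
VALID_CODES = {1, 2, 3, 4, 5}


def item_frequencies(records, item):
    """Count how often each response value occurs for one survey item."""
    return Counter(record.get(item) for record in records)


def out_of_range_values(records, item, valid=VALID_CODES):
    """Return {value: count} for non-missing values outside the valid codes."""
    freq = item_frequencies(records, item)
    return {
        value: count
        for value, count in freq.items()
        if value is not None and value not in valid
    }


# Hypothetical records for a single item.
sample_records = [{"A15": 4}, {"A15": 9}, {"A15": None}, {"A15": 4}]
problems = out_of_range_values(sample_records, "A15")
print(problems)  # the value 9 is flagged once
```

A hospital whose data produced a non-empty result for any item would be a candidate for follow-up and resubmission.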

### Response Rates

As part of the data submission process, hospitals were asked to provide their response rate numerator and denominator. Response rates were calculated using the formula below.

$$\text{Response rate} = \frac{\text{Number of complete, returned surveys}}{\text{Number of surveys distributed} - \text{Ineligibles}}$$

**Numerator** = Number of complete, returned surveys. The numerator equals the number of individual survey records submitted to the database. It should *exclude* surveys that were returned blank on all non-demographic survey items, but *include* surveys where at least one non-demographic survey item was answered.

**Denominator** = The total number of surveys distributed minus ineligibles. Ineligibles include deceased individuals or those who were not employed at the hospital during data collection.
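As a minimal sketch of the response rate formula, the function below divides the numerator by the denominator as defined above. The function name, parameter names, and the example counts are hypothetical and are not taken from the database.

```python
def response_rate(complete_returned, distributed, ineligible):
    """Response rate = complete, returned surveys / (distributed - ineligibles)."""
    denominator = distributed - ineligible
    if denominator <= 0:
        raise ValueError("distributed surveys must exceed ineligibles")
    return complete_returned / denominator


# Hypothetical hospital: 520 surveys distributed, 20 ineligible, 260 returned.
rate = response_rate(complete_returned=260, distributed=520, ineligible=20)
print(f"{rate:.0%}")  # 260 / 500
```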

As a data cleaning step, we examined whether any individual survey records submitted to the database were missing responses on all of the non-demographic survey items (indicating the respondent did not answer any of the main survey questions). Such blank records were found, even though they should not have been submitted to the database. We therefore removed these blank records from the larger dataset and adjusted any affected hospital’s response rate numerator and overall response rate accordingly.
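The blank-record check can be illustrated as follows. This is a sketch under stated assumptions: records are dictionaries, `None` marks a missing answer, the four item names stand in for the full list of non-demographic items, and `staff_position` is a hypothetical demographic field.

```python
# Assumption: a short illustrative subset of the non-demographic items.
NON_DEMOGRAPHIC_ITEMS = ["A10", "A15", "A17", "A18"]


def is_blank(record):
    """True when every non-demographic item is missing in the record."""
    return all(record.get(item) is None for item in NON_DEMOGRAPHIC_ITEMS)


def remove_blank_records(records):
    """Keep records with at least one non-demographic item answered."""
    return [record for record in records if not is_blank(record)]


def adjusted_response_rate(records, distributed, ineligible):
    """Recompute the response rate with blank records out of the numerator."""
    return len(remove_blank_records(records)) / (distributed - ineligible)


# Hypothetical submission: the second record answers only a demographic item.
submitted = [
    {"A10": 2, "A15": 4, "A17": None, "A18": 5, "staff_position": "RN"},
    {"A10": None, "A15": None, "A17": None, "A18": None, "staff_position": "RN"},
    {"A10": None, "A15": 3, "A17": None, "A18": None, "staff_position": None},
]
cleaned = remove_blank_records(submitted)
```

Note that the third record is kept because one non-demographic item was answered, which matches the numerator definition in this section.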

### Item and Composite Percent Positive Scores

To calculate your hospital’s composite score, simply average the percent of positive response on each item that is in the composite. Here is an example of computing a composite score for Overall Perceptions of Patient Safety:

- There are four items in this composite: two are positively worded (items #A15 and #A18) and two are negatively worded (items #A10 and #A17). Keep in mind that **disagreeing** with a negatively worded item indicates a **positive** response.
- Calculate the percent of positive response at the item level (see the example in Table 1).

### Table 1. Example of Computing Item and Composite Percent Positive Scores

| Four items measuring “Overall Perceptions of Patient Safety” | For positively worded items, count the number of “Strongly agree” or “Agree” responses | For negatively worded items, count the number of “Strongly disagree” or “Disagree” responses | Total number of responses to the item | Percent positive response on item |
|---|---|---|---|---|
| Item A15 (positively worded): “Patient safety is never sacrificed to get more work done” | 120 | NA* | 260 | 120/260 = 46% |
| Item A18 (positively worded): “Our procedures and systems are good at preventing errors from happening” | 130 | NA* | 250 | 130/250 = 52% |
| Item A10 (negatively worded): “It is just by chance that more serious mistakes don't happen around here” | NA* | 110 | 240 | 110/240 = 46% |
| Item A17 (negatively worded): “We have patient safety problems in this unit” | NA* | 140 | 250 | 140/250 = 56% |
| | | | Composite Score % Positive | (46% + 52% + 46% + 56%) / 4 = 50% |

\* NA = Not applicable

In this example, there were 4 items with percent positive response scores of 46 percent, 52 percent, 46 percent, and 56 percent. Averaging these item-level percent positive scores results in a composite score of .50 or 50 percent on Overall Perceptions of Patient Safety. In this example, an average of about 50 percent of the respondents responded positively on the survey items in this composite.
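The Table 1 calculation can be reproduced with a short Python sketch. The counts come from Table 1; the function and variable names are illustrative only.

```python
# (positive responses, total responses) for each item, from Table 1.
# For negatively worded items, "positive" means Strongly disagree or Disagree.
table1 = {
    "A15": (120, 260),
    "A18": (130, 250),
    "A10": (110, 240),
    "A17": (140, 250),
}


def item_percent_positive(positive, total):
    """Percent positive response on a single item."""
    return 100.0 * positive / total


def composite_percent_positive(items):
    """Average of the item-level percent positive scores."""
    scores = [item_percent_positive(p, t) for p, t in items.values()]
    return sum(scores) / len(scores)


item_scores = {name: item_percent_positive(p, t) for name, (p, t) in table1.items()}
composite = composite_percent_positive(table1)
print(round(composite))  # 50
```

Averaging the unrounded item scores gives a value just under 50, which rounds to the 50 percent shown in Table 1.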

Once you have calculated your hospital’s percent positive response on each of the 12 safety culture composites, you can compare your results with the composite-level results from the 519 database hospitals.

Note that the method described above for calculating composite scores is slightly different from the method described in the September 2004 *Survey User’s Guide* that is part of the original survey toolkit materials on the AHRQ Web site. The Guide advises computing composites by calculating the overall percent positive across all the items within a composite. The updated recommendation included in this report is to compute item percent positive scores first, and then average the item percent positive scores to obtain the composite score, which gives equal weight to each item in a composite. The *Survey User’s Guide* will eventually be updated to reflect this slight change in methodology.
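To make the difference between the two methods concrete, the sketch below computes both on the same counts. It assumes that "overall percent positive across all the items" means pooling the positive responses and the total responses across items before dividing; that reading, the function names, and the second dataset are assumptions for illustration.

```python
def composite_item_average(counts):
    """Updated method: average the item-level percent positive scores."""
    return sum(100.0 * pos / total for pos, total in counts) / len(counts)


def composite_pooled(counts):
    """Earlier method (as read here): pool all responses, then divide."""
    total_positive = sum(pos for pos, _ in counts)
    total_responses = sum(total for _, total in counts)
    return 100.0 * total_positive / total_responses


# (positive, total) pairs from Table 1: item totals are similar,
# so the two methods nearly agree.
table1_counts = [(120, 260), (130, 250), (110, 240), (140, 250)]

# Hypothetical counts with very unequal item totals, where they diverge.
unequal_counts = [(9, 10), (100, 400)]

for counts in (table1_counts, unequal_counts):
    print(round(composite_item_average(counts), 1),
          round(composite_pooled(counts), 1))
```

With pooling, an item answered by more respondents carries more weight; averaging item scores weights every item equally regardless of how many people answered it.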

### Percentiles

Percentiles were computed using the SAS® Software default method. The first step in this procedure is to rank order the percent positive scores from all the participating hospitals, from lowest to highest. The next step is to multiply the number of hospitals (n) by the percentile of interest (p), which in our case would be the 10th, 25th, 50th, 75th or 90th percentile.

For example, to calculate the 10th percentile, one would multiply 519 (the total number of hospitals) by .10 (10th percentile). The product of n x p is equal to “j+g” where “j” is the integer part and “g” is the fractional part (the portion after the decimal point). If “g” equals 0, the percentile is equal to the percent positive value of the hospital in the j^{th} position plus the percent positive value of the hospital in the (j+1)^{th} position, all divided by two: (X_{(j)} + X_{(j+1)})/2. If “g” is **not** equal to 0, the percentile is equal to the percent positive value of the hospital in the (j+1)^{th} position.

The following examples show how the 10th and 50th percentiles would be computed using a sample of percent positive scores from 12 hospitals (using fake data in Table 2). First, the percent positive scores are sorted from low to high on Composite “A.”

### Table 2. Data Table for Example of How to Compute Percentiles

| Hospital | Composite “A” % Positive Score | Percentile |
|---|---|---|
| 1 | 33% | |
| 2 | 48% | 10th percentile score = 48% |
| 3 | 52% | |
| 4 | 60% | |
| 5 | 63% | |
| 6 | 64% | 50th percentile score = 65% |
| 7 | 66% | |
| 8 | 70% | |
| 9 | 72% | |
| 10 | 75% | |
| 11 | 75% | |
| 12 | 78% | |

**10th percentile**

- For the 10th percentile, we would first multiply the number of hospitals by .10 (n x p = 12 x .10 = 1.2).
- The product of n x p = 1.2, where “j” = 1 and “g” = 0.2. Since “g” is **not** equal to 0, the 10th percentile score is equal to the percent positive value of the hospital in the (j+1)^{th} position:
  - “j” equals 1.
  - The 10th percentile equals the value for the hospital in the 2nd position = 48 percent.

**50th percentile**

- For the 50th percentile, we would first multiply the number of hospitals by .50 (n x p = 12 x .50 = 6.0).
- The product of n x p = 6.0, where “j” = 6 and “g” = 0. Since “g” = 0, the 50th percentile score is equal to the percent positive value of the hospital in the j^{th} position plus the percent positive value of the hospital in the (j+1)^{th} position, all divided by two:
  - “j” equals 6.
  - The 50th percentile equals the average of the values for the hospitals in the 6th and 7th positions: (64% + 66%)/2 = 65 percent.
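The percentile rule walked through above can be written as a short Python function. This is a sketch of the rule as described in this section, not SAS code. The function name is hypothetical, and the percentile is passed as a whole number (for example 10 rather than .10) so that the test for “g” = 0 uses exact integer arithmetic instead of floating point.

```python
def percentile_score(scores, pct):
    """Percentile by the n x p = j + g rule described in this section.

    scores: percent positive scores, one per hospital.
    pct: whole-number percentile strictly between 0 and 100 (e.g. 10, 50).
    """
    if not 0 < pct < 100:
        raise ValueError("pct must be between 1 and 99")
    ordered = sorted(scores)  # rank order from lowest to highest
    n = len(ordered)
    # n * pct / 100 = j + g; remainder == 0 is equivalent to g == 0.
    j, remainder = divmod(n * pct, 100)
    if remainder == 0:
        # Average the j-th and (j+1)-th values (1-based positions).
        return (ordered[j - 1] + ordered[j]) / 2
    # Otherwise take the (j+1)-th value (1-based), which is index j.
    return ordered[j]


# Composite "A" scores for the 12 hospitals in Table 2.
table2_scores = [33, 48, 52, 60, 63, 64, 66, 70, 72, 75, 75, 78]

p10 = percentile_score(table2_scores, 10)  # 48
p50 = percentile_score(table2_scores, 50)  # 65.0
```

Applied to the Table 2 data, the function returns the same 10th and 50th percentile values as the worked examples.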

## Copyright Notice

This document is in the public domain and may be used and reprinted without permission except those copyrighted materials noted for which further reproduction is prohibited without specific permission of copyright holders.