Survey User's Guide
Chapter 5. Preparing and Analyzing Data and Producing Reports
At the end of data collection, you will need to prepare the collected survey data for analysis. As mentioned in Chapter 2, you may want to hire a vendor to conduct data entry or data analysis or to produce feedback reports for your nursing home. If you decide to do your own data entry, analysis, and report preparation, use this chapter to guide you through the various decisions and steps. If you decide to hire a vendor, use this chapter as a guide to establish data preparation procedures. If you plan to conduct a Web-based survey, you can minimize data cleaning by programming the Web survey to perform some of these steps automatically. In addition, if you plan to administer the survey in more than one nursing home, you will need to report the results separately for each participating nursing home.
You or your vendor will need to complete a number of tasks to prepare the survey data for analysis. During the data preparation process, several data files will be created. It is important to maintain the original data file that is created when survey responses are data entered. Any changes or corrections should be made to duplicate files, for two reasons:
- Retaining the original file allows you to correct possible future errors made during the data cleaning or recoding processes.
- The original file is important should you ever want to go back and determine what changes were made to the data set or conduct other analyses or tests.
Each survey needs to be examined for completeness prior to entering the survey responses into the data set. Exclude surveys that were returned completely blank or those with only background demographic questions answered. In addition, you may want to omit surveys where the respondent gave the exact same answer for all the questions in the survey; you can identify these visually or programmatically during data cleaning. Because the survey includes negatively worded items, respondents should use both the positive and negative ends of the response scales to provide consistent answers. If every answer is the same, the respondent probably did not give the survey his or her full attention and the responses are probably not valid.
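If you screen surveys programmatically, the completeness rules above might be sketched in Python as follows. This is only an illustration: the function name, the item identifiers, and the choice to flag any survey whose answered items all share one value are assumptions, not part of the survey materials.

```python
def is_usable(responses, background_keys=()):
    """Return True if a returned survey should be kept for analysis.

    responses: dict mapping item id -> answer (None for a blank item).
    background_keys: hypothetical ids of the background demographic items.
    """
    # Keep only answered items that are not background questions.
    substantive = [
        value for key, value in responses.items()
        if key not in background_keys and value is not None
    ]
    if not substantive:
        # Completely blank, or only background questions answered.
        return False
    if len(substantive) > 1 and len(set(substantive)) == 1:
        # The same answer was given to every answered item.
        return False
    return True
```

You may prefer to review flagged surveys by hand before excluding them, since the guide leaves this exclusion to your judgment.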
Calculate Your Final Response Rate
After you have identified which returned surveys will be included in the final analysis data file, you can use the following formula to calculate the official response rate:
Final response rate = (Number of returned surveys - incompletes) / (Number of surveys distributed - ineligibles)
This formula differs from that used for calculating preliminary response rates (shown in Chapter 4) only in the numerator. The numerator may be smaller than in your last preliminary response rate calculation because, during your examination of all returned surveys, you may find that some of the returned surveys are incomplete. You may have to exclude them from the analysis data file.
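As a quick check of the arithmetic, the final response rate can be computed with a few lines of Python. The function name and the counts in the usage line are hypothetical and are used only to illustrate the formula.

```python
def final_response_rate(returned, incompletes, distributed, ineligibles):
    """Final response rate as a proportion between 0 and 1."""
    denominator = distributed - ineligibles
    if denominator <= 0:
        raise ValueError("distributed surveys must exceed ineligibles")
    return (returned - incompletes) / denominator


# Hypothetical counts: 120 returned, 5 incomplete, 200 distributed, 10 ineligible.
rate = final_response_rate(120, 5, 200, 10)
print(f"{rate:.1%}")
```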
Edit, Code, and Enter the Data
In this section we describe several data file preparation tasks.
Illegible, Mismarked, and Double-Marked Responses
Some survey responses may need to be edited or coded before the data are entered into an electronic data file or statistical analysis program. Editing and coding involve decisionmaking regarding the proper way to enter ambiguous responses. These editing and coding steps will probably not be necessary if you are using a Web-based survey or scannable forms.
One potential issue is survey responses that are difficult to determine. For example, respondents may write in an answer such as 3.5, when they have been instructed to circle only one numeric response. They may circle two answers for one item. Develop and document decision rules for these situations and apply them consistently. Examples of such rules are to use the highest response when two responses are provided (e.g., a response with both 2 and 3 would convert to a 3) or to mark all of these types of inappropriate responses as missing.
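If responses are keyed into a file before they are cleaned, a decision rule like the ones above can be applied consistently in code. The Python sketch below is one possible approach, assuming a 1-to-5 response scale; the function name, the rule names, and the choice to treat write-in values such as 3.5 as missing are assumptions for illustration.

```python
SCALE = {1, 2, 3, 4, 5}  # assumed valid answers for a scaled item


def resolve_marks(marks, rule="highest"):
    """Turn the marks found on one item into a single value or None (missing).

    marks: list of the values the respondent marked or wrote in.
    rule: "highest" keeps the highest of several valid marks;
          "missing" treats any multiple-marked item as missing.
    """
    if len(marks) == 1 and marks[0] in SCALE:
        return marks[0]  # a single valid answer needs no editing
    if rule == "highest" and len(marks) > 1 and all(m in SCALE for m in marks):
        return max(marks)  # e.g., both 2 and 3 marked converts to 3
    return None  # blank, write-in such as 3.5, or rule == "missing"
```

Whichever rule you choose, document it and apply the same rule to every survey.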
After surveys have been corrected as needed (most surveys will not need to be corrected), you can enter the data directly into an electronic file by using statistical software. Available packages include SAS®, SPSS®, and Microsoft Excel®. You also can create a text file that can be easily imported into a data analysis software program.
Individual Identifiers on Your Data File
If you used identifiers (identification numbers or codes) on your surveys, after you close out data analysis, destroy any information linking the identifiers to individual names. You no longer need this information, and you will want to eliminate the possibility of linking responses on the electronic file to individuals.
If no identifiers were used on the surveys, you will need to include some type of individual identifier in the data file. Create an identification number for each survey and write it on the hard copy surveys in addition to entering it into the electronic data file. This identifier can be as simple as numbering the returned surveys consecutively, beginning with the number 1. This number will enable you to go back and check the electronic data file against a respondent's paper survey answers if there are values that look like they were entered incorrectly.
Respondents are given the opportunity to provide written comments at the end of the survey. Comments can be used to obtain direct quotes for feedback purposes. If you wish to analyze these data further, you need to code the responses according to the type of comment. For example, staff may respond with positive comments about resident safety efforts in the nursing home. They may comment on some negative aspects of resident safety that they think need to be addressed. You may assign code numbers to similar types of comments and later tally the frequency of each comment type. Open-ended comments may be coded either before or after the data have been entered electronically.
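Once comments have been assigned code numbers, tallying them is straightforward. The Python sketch below assumes each comment is stored as a pair of survey identifier and code number; the code numbers and their meanings are hypothetical.

```python
from collections import Counter

# Hypothetical comment codes; define your own categories.
COMMENT_CODES = {
    1: "Positive comment about resident safety efforts",
    2: "Negative aspect of resident safety to be addressed",
}


def tally_comment_codes(coded_comments):
    """Count how often each comment code was assigned.

    coded_comments: iterable of (survey_id, code) pairs.
    """
    return Counter(code for _survey_id, code in coded_comments)
```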
After the surveys have been coded and edited as necessary and entered electronically, you will need to check and clean the data file before you begin analyzing and reporting results. The data file may contain data entry errors. You can check and clean the data file electronically by producing frequencies of response to each item and looking for out-of-range values or values that are not valid responses. Most items in the survey require a response between 1 and 5, with a 9 coded as Does Not Apply/Don't Know. Check through the data file to ensure that all responses are within the valid range (e.g., that a response of 7 has not been entered). If you find out-of-range values, refer to the original survey and determine the response that should have been entered.
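The range check described above might look like the following in Python. This is a sketch under stated assumptions: each survey is a dictionary keyed by item identifier, blank items are stored as None, and the field name survey_id is hypothetical.

```python
from collections import Counter

VALID_CODES = {1, 2, 3, 4, 5, 9}  # 9 = Does Not Apply/Don't Know


def item_frequencies(records, item):
    """Frequency of each entered value (including None) for one item."""
    return Counter(record.get(item) for record in records)


def out_of_range(records, items, id_key="survey_id"):
    """List (survey id, item, value) for every entered value that is not valid."""
    problems = []
    for record in records:
        for item in items:
            value = record.get(item)
            if value is not None and value not in VALID_CODES:
                problems.append((record[id_key], item, value))
    return problems
```

The survey identifier in each flagged entry tells you which paper survey to pull to determine the response that should have been entered.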
Analyze the Data and Produce Reports of the Results
Feedback reports are the final step in a survey project and are critical for synthesizing survey responses. Ideally, feedback should be provided broadly—to nursing home administrators and management, health system boards of directors, nursing home committees, and nursing home staff. Reports can be given either directly during meetings or through centralized communication tools such as E-mail and newsletters.
The more broadly the survey results are disseminated, the more useful the information is likely to become. The feedback also will serve to legitimize the collective effort of the staff in responding to the survey. It is gratifying and important for respondents to know that something worthwhile came out of the information they provided. Different types of feedback reports can be prepared for different audiences, from one- or two-page executive summaries to more complete reports that use statistics to draw conclusions or make comparisons.
Frequencies of Response
One of the simplest ways to present results is to calculate the frequency of response for each survey item. A Microsoft PowerPoint® presentation template is available from the AHRQ Web site that you may use to communicate results from the Nursing Home Survey on Patient Safety Culture. The feedback report template groups survey items according to the safety culture dimension each item is intended to measure. You simply insert your nursing home's survey findings in the charts to create a customized feedback report. The two lowest response categories are combined (e.g., Strongly Disagree/Disagree or Never/Rarely) and the two highest response categories are combined (e.g., Strongly Agree/Agree or Most of the Time/Always) to make the results easier to view in the report. The midpoints of the scales are reported as a separate category (Neither Agree nor Disagree or Sometimes). The percentages of answers corresponding with each of the three response categories are then displayed graphically (see Figure 2).
You will need to exclude Does Not Apply/Don't Know and missing responses when showing overall percentages of response. Most of the survey's items include a Does Not Apply/Don't Know response option so that staff who do not have enough information about a particular issue can select this answer rather than guessing. In addition, each survey item will probably have some missing data from respondents who simply did not answer the question. When using a statistical software program, recode the "9" response (Does Not Apply/Don't Know) as a missing value so that it is not included when displaying frequencies of response.
An example of how to handle the does not apply/don't know and missing responses when calculating survey results is shown in Table 1. As Table 1 shows, respondents who answered does not apply/don't know are treated the same way as those who did not answer the item (missing). The column labeled "Correct Percentages of Response" shows the correct percentage for each response option in the example. The column labeled "Correctly Combined Percentages" shows the correct percentage of negative, neutral, and positive scores, which do not include the does not apply/don't know responses or missing responses.
The two shaded columns on the right in Table 1 labeled "Incorrect Percentages of Response" and "Incorrectly Combined Percentages" show the incorrect or wrong way to compute results if you were to mistakenly include the does not apply/don't know responses as valid responses. Again, the easiest way to ensure that the percentages are computed correctly is to recode all "9" responses to missing so that they are not included in the frequency and/or percentage of negative, neutral, and positive score calculations.
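The recoding step can be illustrated with a short Python sketch. It assumes answers are coded 1 to 5, with 9 for does not apply/don't know and None for unanswered items; the function name and the category labels are illustrative only.

```python
def combined_percentages(answers):
    """Percent of valid answers in the combined response categories.

    Answers of 9 and None are treated as missing and left out of the
    denominator, so only values 1 through 5 are counted.
    """
    valid = [a for a in answers if a in (1, 2, 3, 4, 5)]
    n = len(valid)
    if n == 0:
        return None  # no valid responses to report
    return {
        "lowest_two": 100 * sum(a <= 2 for a in valid) / n,
        "midpoint": 100 * sum(a == 3 for a in valid) / n,
        "highest_two": 100 * sum(a >= 4 for a in valid) / n,
    }
```

Including the 9 responses in the denominator would shrink every percentage, which is the error the shaded columns in Table 1 illustrate.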
It can be useful to calculate one overall composite score for each dimension. To calculate your nursing home's composite score on a particular safety culture dimension, simply average the percentage positive response on each item that is included in the composite. Here is an example of computing a composite score for Staffing:
- This composite includes four items. Two are positively worded (items A3 and A16) and two are negatively worded (items A8 and A17). Keep in mind that DISAGREEING with a negatively worded item indicates a POSITIVE response.
- Calculate the percent positive response at the item level (see the example in Table 2).
In this example, four items had percent positive response scores of 46 percent, 52 percent, 46 percent, and 56 percent. Averaging these item-level percent positive scores, (46% + 52% + 46% + 56%) / 4 = 50%, results in a composite score of 0.50, or 50 percent, on Staffing. That is, an average of about 50 percent of the respondents responded positively on the survey items in this composite.
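The composite calculation can be sketched in Python as follows. The sketch assumes answers are coded 1 to 5 with 9 and None treated as missing, and that disagreement (1 or 2) counts as the positive response for negatively worded items; the function names are illustrative.

```python
def percent_positive(answers, negatively_worded=False):
    """Percent positive response for one item, excluding 9 and None."""
    valid = [a for a in answers if a in (1, 2, 3, 4, 5)]
    if not valid:
        return None
    if negatively_worded:
        # Disagreeing with a negatively worded item is a positive response.
        positive = sum(a <= 2 for a in valid)
    else:
        positive = sum(a >= 4 for a in valid)
    return 100 * positive / len(valid)


def composite_score(item_percents):
    """Average the item-level percent positive scores in one composite."""
    scores = [p for p in item_percents if p is not None]
    if not scores:
        return None
    return sum(scores) / len(scores)


# Item-level scores from the Staffing example above.
print(composite_score([46, 52, 46, 56]))  # 50.0
```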
Identifying Strengths and Areas for Improvement. There are placeholder pages in the Microsoft PowerPoint® survey feedback report template to highlight your nursing home's strengths and areas for improvement with regard to resident safety issues covered in the survey. You may decide to define resident safety strengths as those positively worded items that about 75 percent of respondents endorsed by answering Strongly Agree/Agree or Always/Most of the Time (or, for negatively worded items, where 75 percent of respondents disagreed or responded Never/Rarely). The 75 percent cutoff is somewhat arbitrary, and your nursing home may choose to report strengths using a higher or lower cutoff percentage.
Similarly, areas needing improvement could be identified as those items that 50 percent or fewer respondents answered positively (that is, items where at least half of the respondents did not answer Strongly Agree/Agree or Always/Most of the Time to positively worded items, or did not answer Strongly Disagree/Disagree or Never/Rarely to negatively worded items). The cutoff percentage for areas needing improvement is lower, because if half the respondents are not expressing positive opinions about a safety issue, improvement is probably needed.
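A minimal Python sketch of applying the two cutoffs follows. It assumes each item already has a percent positive score and reads the improvement cutoff as 50 percent or fewer positive responses, in line with the rationale that half the respondents are not expressing positive opinions. The function name, the labels, and the default cutoffs are illustrative, and your nursing home may choose different cutoffs.

```python
def classify_item(percent_positive, strength_cutoff=75.0, improvement_cutoff=50.0):
    """Label an item by its percent positive score."""
    if percent_positive >= strength_cutoff:
        return "strength"
    if percent_positive <= improvement_cutoff:
        return "area for improvement"
    return "neither"
```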
It also is important to present information about the background characteristics of all the respondents—how long they have worked in the nursing home, their staff positions, and so forth. This information helps others to better understand whose opinions are represented in the data. Be careful not to report frequencies in small categories (e.g., if the number of activity directors who responded is fewer than five), where it may be possible to determine which employees fall into those categories.
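One way to guard against reporting small categories is to suppress any count below a minimum before it reaches a report. The Python sketch below assumes a minimum of five respondents, following the example above; the function name and the use of None to mark a suppressed count are assumptions.

```python
from collections import Counter


def background_counts(values, min_cell=5):
    """Count respondents per category, suppressing small categories.

    Categories with fewer than min_cell respondents are returned as None
    so that they are not reported.
    """
    counts = Counter(v for v in values if v is not None)
    return {
        category: (n if n >= min_cell else None)
        for category, n in counts.items()
    }
```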
Submit Your Data to the Nursing Home Comparative Database
The Agency for Healthcare Research and Quality (AHRQ) has posted initial comparative results from the pilot study of the Nursing Home Survey on Patient Safety Culture on its Web site (https://archive.ahrq.gov/professionals/quality-patient-safety/patientsafetyculture/nursing-home/2008/index.html). In the future, AHRQ will ask all nursing homes that have administered the survey to voluntarily submit their data files to the Nursing Home Survey on Patient Safety Culture Comparative Database. This database will be modeled on the Hospital Survey on Patient Safety Culture Comparative Database, which contains comparative data from users of AHRQ's Hospital Survey on Patient Safety Culture. You will be able to compare your nursing home's results with the overall nursing home comparative data.
When you submit your data file, you will be asked to provide some background information about:
- The characteristics of your nursing home (e.g., bed size, state/geographic region, and ownership) and whether it is part of a larger nursing home chain.
- How the survey was administered (paper only, Web only, or mixed mode).
- When data collection was completed (month and year).
- How many staff were asked to complete the survey (response rate denominator).
This information may be used to conduct analyses of the data files by selected nursing home characteristics. Participating nursing homes will not be identified by name. Only aggregate data will be reported, and only when there are sufficient data so that such aggregation will not permit reidentification of participating nursing homes. If your nursing home is interested in submitting its data to the nursing home database, E-mail DatabasesOnSafetyCulture@ahrq.hhs.gov.
For free technical assistance on the Nursing Home Survey on Patient Safety Culture regarding survey administration issues, data analysis and reporting, or action planning for improvement, E-mail SafetyCultureSurveys@ahrq.hhs.gov. AHRQ is also sponsoring periodic in-person User Group Meetings so that users of the nursing home survey, along with users of the hospital and medical office surveys, can network and learn from one another.