
Step 7. Identify potential risks to the accuracy of the data you will collect. To minimize these risks, decide if it is necessary to implement and carry out procedures to enhance the trustworthiness of the data, such as training programs for data collectors (R).

(R) = report example

Trustworthy data are data that are both valid and reliable. Validity is achieved when the data measure the characteristics that you intend them to measure. Reliability is achieved when one can be assured that the results of the data collection would be reproducible under repeated administrations, ratings, or codings. The following list contains common causes of measurement error that threaten the validity and reliability of results.

  • The individuals from whom you seek data have characteristics that prevent them from being able to understand the intent of your questions (such as a lack of understanding of the vocabulary used in the questions).
  • Data collection procedures are poorly conceived or poorly implemented. Examples include confusing or inconsistent directions to respondents, or failure of interviewers or observers to follow the directions specified in their protocols.
  • The settings in which people respond prevent them from paying enough attention to the questions. Examples are settings that are excessively hot, cold, or noisy.
  • Data interpretation procedures are inadequate. Examples are the application of ambiguous, confusing, or misdirected rating (e.g., scoring) or coding criteria.

The following paragraphs suggest training procedures and other strategies for preventing the forms of measurement error just described.

  1. Use of guidelines and backup procedures for administration of evaluation instruments. Some questionnaires and learning assessments are administered to respondents in live situations. Sometimes this is easy, as when a paper feedback questionnaire is handed to adults before they exit a workshop; the training required might take only a few minutes. At other times it is more challenging, because of complexities built into the delivery of the instrument or because of technical problems. An example would be an assessment of young students' learning that requires the students to work in pairs at computer stations and log on to the Internet. Many things can go wrong in such a situation: the computers may malfunction, Internet access may be unacceptably slow, and some students may have to work alone because their partners are absent. Backup strategies may need to be built into the administration procedures to prevent such problems from resulting in a failure to collect data, and thorough guidelines may need to be developed to address these complexities. The more detailed the guidelines, however, the greater the risk that instrument administrators will make errors if they are not sufficiently trained.
  2. Use of corroboration and taping when carrying out observations and interviews. Observations and interviews require data collectors who can gather information while maintaining fidelity to the protocols designed for recording it. Errors that result from lapses in fidelity can be offset through corroboration, which is achieved when multiple data collectors observe the same phenomena or take notes at the same interviews. If multiple data collectors cannot be present, interviews can be audio-recorded and observations videotaped; recording allows multiple data collectors to revisit the same session repeatedly and asynchronously.

  3. Use of coding and rating of information. In data analysis, coding is the ascription of standardized interpretations to raw information. An example would be the coding of a particular student response in a classroom discussion as demonstrative of a particular cognitive process. Rating is a form of coding in which a standardized value is assigned to student work, such as a student's answer to an open-ended question on a learning assessment. Coding and rating make it possible to classify, aggregate, and summarize results. Table 2 below identifies the types of artifacts that are coded and rated, and the types of standardized criteria used for rendering coding and rating decisions.

     Table 2. Types of artifacts and interpretive criteria used in training raters and coders

    Instrument             Artifact          Interpretive criteria
    --------------------   ---------------   ---------------------
    Learning assessment    Student work      Rating rubric
    Observation protocol   Field notes       Coding scheme
    Interview protocol     Transcripts       Coding scheme
    Questionnaire          Respondent form   Coding scheme

  4. Use of rater and coder training. Reliability in coding and rating occurs when different coders and raters render the same decisions. If an item is valid (see the Instrument Triangulation and Selection module for more on validity) and the rating or coding criteria are comprehensive and clearly communicated, it should be possible to achieve reliable coding and rating as long as the raters and coders have the skills to apply the criteria properly. This requires training, which is the subject of Step 8.
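To make the agreement checks above concrete, the following sketch (a hypothetical illustration, not part of the module, with invented codes and category names) computes two common measures of inter-coder consistency: raw percent agreement, which quantifies the kind of corroboration described in item 2, and Cohen's kappa, a chance-corrected statistic often used to gauge the reliability that training is meant to produce.

```python
# Illustrative sketch only: the module does not prescribe a statistic.
# Percent agreement and Cohen's kappa for two coders who applied the
# same coding scheme to the same set of observation segments.
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of items on which the two coders rendered the same decision."""
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Expected chance agreement, from each coder's marginal frequencies.
    expected = sum((freq_a[c] / n) * (freq_b.get(c, 0) / n) for c in freq_a)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to eight taped segments.
coder_1 = ["on-task", "on-task", "off-task", "on-task",
           "off-task", "on-task", "on-task", "off-task"]
coder_2 = ["on-task", "on-task", "off-task", "off-task",
           "off-task", "on-task", "on-task", "on-task"]

print(percent_agreement(coder_1, coder_2))  # 0.75
print(cohens_kappa(coder_1, coder_2))       # ~0.467
```

Raw agreement looks respectable here, but kappa is noticeably lower because two coders assigning only two codes will agree fairly often by chance alone; this gap is one reason chance-corrected statistics are preferred when judging whether training has succeeded.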