
Steps Two and Three: Consider the set of observation items you might design for the first evaluation question ("How do students typically interact as they use Science Search?"). For the purposes of this exercise, assume that you already have derived the content of the observation items through an independent exercise of operationally defining student behavior variables that can answer the evaluation question. Specifically, you have identified the following kinds of student behavior variables for the context of groups of students using Science Search: how on task students are, how accurately and deeply they grapple with the science material and inquiry processes of the task, and the participation level of each student in the group. Now, it is time to think through the implications of the different dimensions of observation techniques for measuring these kinds of variables. Using the five dimensions in the Strategy section, describe where you anticipate that your items would fall on each dimension.

Lower-Inference Judgments vs. Higher-Inference Judgments

You anticipate that the majority of your observation items will involve higher-inference judgments on the part of observers, most likely in the form of rating scales. Although it may take careful work to define these categories of higher-inference coding, you anticipate that this is preferable to lower-inference judgments of student interaction, where essentially you would have to code each student statement in terms of its characteristics. Such lower-inference coding would be labor-intensive and would not necessarily yield meaningful data for the evaluation.

Also, because you want to capture any unanticipated important features or difficulties of students working with Science Search, you plan to ask observers to keep records of any such events and to answer several open-ended questions at the end of the observation protocol. Again, this task involves primarily higher-inference judgments on the part of the observer.

Event Sampling vs. Time Sampling

Your observation approach actually may blend event sampling with time sampling. You probably will do time sampling because you will not be able to observe different student groups simultaneously. Thus, you may focus on different student groups for different blocks of time (e.g., observe one student group for 10 minutes, another student group for 10 minutes, etc.). (Note that because the second evaluation question concerns the behavior of the teacher, you also will have to allocate other blocks of time for this observation.)
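As a rough illustration of this kind of time sampling, the sketch below rotates an observer's attention through a list of targets in fixed blocks. The 50-minute class period, the target labels, and the function name `build_schedule` are assumptions made for the example; only the 10-minute block length comes from the scenario.

```python
from itertools import cycle


def build_schedule(targets, block_minutes, total_minutes):
    """Assign consecutive, equal-length observation blocks to targets in rotation."""
    if block_minutes <= 0:
        raise ValueError("block_minutes must be positive")
    schedule = []
    start = 0
    for target in cycle(targets):
        # Stop once the next full block would run past the end of the period.
        if start + block_minutes > total_minutes:
            break
        schedule.append((start, start + block_minutes, target))
        start += block_minutes
    return schedule


# Hypothetical 50-minute class: two student groups plus one teacher block per rotation.
schedule = build_schedule(
    ["group A", "group B", "teacher"], block_minutes=10, total_minutes=50
)
for begin, end, target in schedule:
    print(f"{begin:>2}-{end:<2} min: {target}")
```

Under these assumed numbers, each student group receives two blocks and the teacher one, so the two groups split the group-observation time evenly.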

Within your sampled time blocks, you probably will "event sample" as well. This means that you are not interested in every type of behavior that occurs, nor will you design your observation categories to accommodate any possible behavior or event. Instead, your codings and ratings will be linked closely to the subset of student behavior variables you already have defined. If the teacher, for example, asks a student you are observing to leave his or her group and get some supplies from the cabinet, you would ignore this event.
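One way to picture event sampling is as a filter that keeps only events tied to the predefined behavior variables. The sketch below is illustrative only: the variable labels paraphrase the three kinds of variables named in the scenario, and the event records and the function name `select_events` are hypothetical.

```python
# Labels for the three kinds of student behavior variables in the scenario
# (the exact label strings are assumptions made for this example).
TARGET_VARIABLES = {"on_task", "science_engagement", "participation"}


def select_events(events):
    """Keep only events linked to one of the predefined behavior variables."""
    return [event for event in events if event["variable"] in TARGET_VARIABLES]


# Hypothetical raw notes from one 10-minute block.
events = [
    {"minute": 2, "variable": "on_task", "note": "all four students reading results"},
    {"minute": 4, "variable": None, "note": "student sent to cabinet for supplies"},
    {"minute": 7, "variable": "participation", "note": "one student dominates typing"},
]

coded = select_events(events)
print([event["minute"] for event in coded])
```

The supply-cabinet errand carries no target variable, so it is dropped rather than coded.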

All-Inclusive Subjects vs. Targeted Subjects

Because Science Search is supposed to be implemented with students working in small groups, it is impractical to consider collecting data on all students in a class. Instead, it makes sense to consider gathering data from two small groups of students. You decide you will target each group for half of the time that you devote to observing student groups. Thus, you will be sampling two student groups from each of the 10 classes.

Because teachers may assign students to groups in different ways, you want your two small groups to be as representative of the class as possible (e.g., if groups are divided mostly by prior achievement, you will purposely observe one group of higher achievers and one group of lower achievers).

Real Time vs. Post Hoc

Because you are interested in how students interact in their small groups, and because these interactions can be fast-paced and complex, you decide that most of your note taking and coding will take place in real time. Ideally, observers should be able to complete any brief notes and many higher-inference rating items by the end of each block of time for observing a given student group (so as to maximize the freshness of their memories and the credibility of their judgments). If observers want to write more detailed notes to address unanticipated events in class or answer and support open-ended summary questions on the protocol, they may have to do this post hoc (i.e., after the observation period ends).

Quantitative vs. Qualitative

Because of cost and time constraints, and because you need information to inform revisions of the curriculum as soon as possible, you want the majority of your observation items to be immediately translatable into quantitative data. Any rating scale items would lend themselves to this purpose. However, as already discussed, you also intend to solicit some qualitative information from observers about any unanticipated features and difficulties that they witnessed as students worked with Science Search. You may later try to turn some of this qualitative information into quantitative data (e.g., you discern that there are three areas of student implementation difficulty across the descriptions from 10 classrooms, and you go back and count how often each category of difficulty arose in each classroom).
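The counting step in that example could be sketched as follows, assuming each observer note has already been hand-coded with a classroom number and a difficulty category. The category names, the sample records, and the function name `tally_by_classroom` are all hypothetical; the scenario does not say what the three areas of difficulty would be.

```python
from collections import Counter

# Hypothetical hand-coded notes: (classroom_id, difficulty_category).
coded_notes = [
    (1, "navigation"),
    (1, "navigation"),
    (1, "vocabulary"),
    (2, "group roles"),
    (2, "navigation"),
]


def tally_by_classroom(notes):
    """Count how often each difficulty category arose in each classroom."""
    tallies = {}
    for classroom, category in notes:
        tallies.setdefault(classroom, Counter())[category] += 1
    return tallies


tallies = tally_by_classroom(coded_notes)
for classroom in sorted(tallies):
    print(classroom, dict(tallies[classroom]))
```

Because a `Counter` returns 0 for categories it has never seen, a classroom where a given difficulty did not arise can still be queried without special handling.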