Steps Two and Three: Consider the set of observation
items you might design for the first evaluation question ("How do students typically interact as
they use Science Search?"). For the purposes of this exercise, assume that you already have
derived the content of the observation items through an independent exercise of operationally
defining student behavior variables that can answer the evaluation question. Specifically, you
have identified the following kinds of student behavior variables for the context of groups of
students using Science Search: how on task students are, how accurately and deeply they
grapple with the science material and inquiry processes of the task, and the participation
level of each student in the group. Now, it is time to think through the implications of the
different dimensions of observation techniques for measuring these kinds of variables. Using the
five dimensions in the Strategy section, describe where you anticipate that your items would
fall on each dimension.
Lower-Inference Judgments vs. Higher-Inference Judgments
You anticipate that the majority of your observation items will involve higher-inference judgments on
the part of observers, most likely in the form of rating scales. Although it may take careful work to
define these categories of higher-inference coding, you anticipate that this is preferable to
lower-inference judgments of student interaction—where essentially you would have to code each
student statement in terms of its characteristics. This lower-inference coding would be labor-intensive
and would not necessarily yield meaningful data for the evaluation.
Also, because you want to capture any unanticipated important features or difficulties of students
working with Science Search, you plan to ask observers to keep records of any such events and to
answer several open-ended questions at the end of the observation protocol. Again, this task involves
primarily higher-inference judgments on the part of the observer.
Event Sampling vs. Time Sampling
Your observation approach actually may blend event sampling with time sampling. You probably will do
time sampling because you will not be able to observe different student groups simultaneously. Thus,
you may focus on different student groups for different blocks of time (e.g., observe one student
group for 10 minutes, another student group for 10 minutes, etc.). (Note that because the second
evaluation question concerns the behavior of the teacher, you also will have to allocate other
blocks of time for this observation.)
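The rotation through time blocks described above can be sketched in a few lines of Python. This is only an illustration: the 10-minute block length comes from the example in the text, but the 60-minute session length, the target labels, and the function name build_schedule are assumptions made for the sketch, not part of the evaluation design.

```python
from itertools import cycle


def build_schedule(targets, block_minutes, total_minutes):
    """Rotate through observation targets in fixed-length time blocks.

    Returns a list of (start_minute, end_minute, target) tuples. Any time
    left over that is shorter than one full block is not scheduled.
    """
    if block_minutes <= 0:
        raise ValueError("block_minutes must be positive")
    schedule = []
    start = 0
    for target in cycle(targets):
        if start + block_minutes > total_minutes:
            break
        schedule.append((start, start + block_minutes, target))
        start += block_minutes
    return schedule


# Hypothetical session: 60 minutes, 10-minute blocks, two student groups
# plus separate blocks for observing the teacher.
targets = ["student group A", "student group B", "teacher"]
schedule = build_schedule(targets, block_minutes=10, total_minutes=60)

for begin, end, target in schedule:
    print(f"{begin:02d}-{end:02d} min: {target}")
```

With these assumed values, each student group and the teacher receive two 10-minute blocks. A different session length or block length would change that balance, so the allocation should be checked against the actual class period.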
Within your sampled time blocks, you probably will "event sample" as well. This means that you are not
interested in every type of behavior that occurs, nor will you design your observation categories to
accommodate any possible behavior or event. Instead, your codings and ratings will be linked closely to
the subset of student behavior variables you already have defined. If the teacher, for example, asks a
student you are observing to leave his or her group and get some supplies from the cabinet, you would
ignore this event.
All-Inclusive Subjects vs. Targeted Subjects
Because Science Search is supposed to be implemented with students working in small groups, it is
impractical to consider collecting data on all students in a class. Instead, it makes sense to consider
gathering data from two small groups of students. You decide you will target each group for half of the
time that you devote to observing student groups. Thus, you will be sampling two student groups from each
of the 10 classes.
Because teachers may assign students to groups in different ways, you want your two small groups to be as
representative of the class as possible (e.g., if groups are divided mostly by prior achievement, you
will purposely observe one group of higher achievers and one group of lower achievers).
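As a rough illustration of this targeting rule, the sketch below picks one lower-achieving and one higher-achieving group from a class and computes the total number of groups observed. The figures of two groups per class and 10 classes come from the text; the group names, the mean_prior_score field, its values, and the function name pick_contrasting_groups are hypothetical and would depend on how a teacher actually formed the groups.

```python
N_CLASSES = 10        # from the evaluation design described above
GROUPS_PER_CLASS = 2  # from the evaluation design described above


def pick_contrasting_groups(groups):
    """Return the names of the lowest- and highest-scoring groups.

    Assumes groups were formed mostly by prior achievement, so the two
    extremes stand in for "lower achievers" and "higher achievers".
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups to choose from")
    ranked = sorted(groups, key=lambda g: g["mean_prior_score"])
    return ranked[0]["name"], ranked[-1]["name"]


# Hypothetical class with four small groups (scores are made up).
example_class = [
    {"name": "Group 1", "mean_prior_score": 72.0},
    {"name": "Group 2", "mean_prior_score": 88.5},
    {"name": "Group 3", "mean_prior_score": 64.0},
    {"name": "Group 4", "mean_prior_score": 79.0},
]

lower, higher = pick_contrasting_groups(example_class)
total_groups_observed = N_CLASSES * GROUPS_PER_CLASS

print(f"Observe {lower} (lower) and {higher} (higher)")
print(f"Total groups observed across classes: {total_groups_observed}")
```

If a teacher groups students by some other criterion, the sort key would need to change to reflect that criterion.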
Real Time vs. Post Hoc
Because you are interested in how students interact in their small groups, and because these interactions
can be fast-paced and complex, you decide that most of your note taking and coding will take place in real
time. Ideally, observers should be able to complete any brief notes and many higher-inference rating
items by the end of each block of time for observing a given student group (so as to maximize the
freshness of their memories and the credibility of their judgments). If observers want to write more
detailed notes to address unanticipated events in class or answer and support open-ended summary
questions on the protocol, they may have to do this post hoc (i.e., after the observation period
ends).
Quantitative vs. Qualitative
Because of cost and time constraints, keeping in mind that you need information to inform revisions of
the curriculum as soon as possible, you want the majority of your observation items to be immediately
translatable into quantitative data. Any rating scale items would lend themselves to this purpose.
However, as already discussed, you also intend to solicit some qualitative information from
observers about unanticipated features and difficulties of students working with Science Search
that they may have witnessed. In this case, you likely will try to turn some of this
qualitative information into quantitative data (e.g., you discern that there are three areas of
student implementation difficulty across the descriptions from 10 classrooms, and you go back
and count how often each category of difficulty arose in each classroom).
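The counting step in this example can be sketched as follows. This is a minimal illustration that assumes observers' open-ended notes have already been hand-coded into categories. The three category names, the classroom numbers, and the coded notes are all invented for the sketch, and tally_by_classroom is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Hypothetical difficulty categories derived from reading the notes.
CATEGORIES = ["navigation", "vocabulary", "group roles"]

# Hypothetical hand-coded notes: (classroom number, difficulty category).
coded_notes = [
    (1, "navigation"), (1, "navigation"), (1, "vocabulary"),
    (2, "group roles"), (2, "navigation"),
    (3, "vocabulary"),
]


def tally_by_classroom(notes, categories):
    """Count how often each category of difficulty arose in each classroom."""
    counts = defaultdict(Counter)
    for classroom, category in notes:
        if category not in categories:
            raise ValueError(f"unknown category: {category!r}")
        counts[classroom][category] += 1
    # Report every category for every classroom, including zero counts.
    return {
        room: {c: counter[c] for c in categories}
        for room, counter in sorted(counts.items())
    }


tallies = tally_by_classroom(coded_notes, CATEGORIES)
for room, row in tallies.items():
    print(f"Classroom {room}: {row}")
```

Reporting zero counts explicitly keeps the resulting table rectangular, which makes it easier to compare classrooms side by side.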