Alignment Table for Instrument Characteristics
Design
The alignment table for sound project evaluation
instruments can be viewed either as a whole, displaying all
three principal characteristics of instruments, or
as three separate tables corresponding to instrument
characteristics: (1) Design, (2)
Technical Quality, and (3) Use & Utility. See the
alignment table overview for
a general description of what appears in the alignment
tables.
The glossary and
quality criteria entries for
instrument characteristics are also available on their
own.
Each entry below presents a component of instrument design along with its Glossary Entry, its Quality Criteria, and Professional Standards & References to Best Practice.

Design
Glossary Entry: Aligning data gathering approaches to all major evaluation questions and subquestions.
The following are broad categories of data sources:
- Existing Databases may hold valuable
information about
participant characteristics and
relevant outcomes, although they may be difficult to
access.
- Assessments of Learning may be given to project participants, typically measuring some achievement construct. Achievement tests are typically differentiated by the types of items, the depth of the task(s), and the scoring criteria or norms applied to interpret the level of student performance. For example, if scoring for an achievement test is norm-referenced, a student's score is defined by how other students performed on the same test. In contrast, if scoring for a test is criterion-referenced, a student's score is defined in terms of whether or not the student has met a pre-specified level of accomplishment. (A brief illustrative sketch of these two scoring interpretations follows this list.)
- Survey Questionnaires may be completed by
project participants or administered by
interviewers. A combination of item formats (e.g.,
checklists, ratings, rankings, forced-choice and
open-ended questions) may be appropriate.
- Observations of participant behavior may be recorded in quantitative (e.g., real-time coded categories) or qualitative (e.g., note-taking for a case study) formats, or by special media (e.g., audio or video recordings).
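To make the norm-referenced versus criterion-referenced distinction above concrete, here is a minimal sketch in Python; the cohort scores and the 70-point cutoff are invented purely for illustration, not drawn from any real assessment.

```python
# Illustrative sketch: two ways to interpret the same hypothetical test score.

def percentile_rank(score, cohort_scores):
    """Norm-referenced view: where a score falls relative to other students' scores."""
    below = sum(1 for s in cohort_scores if s < score)
    return 100.0 * below / len(cohort_scores)

def meets_criterion(score, cutoff):
    """Criterion-referenced view: whether a score reaches a pre-specified level."""
    return score >= cutoff

cohort_scores = [52, 60, 63, 70, 71, 75, 78, 81, 88, 94]  # invented scores of other students
student_score = 78

print(f"Norm-referenced: at the {percentile_rank(student_score, cohort_scores):.0f}th percentile")
print(f"Criterion-referenced: meets 70-point cutoff? {meets_criterion(student_score, cutoff=70)}")
```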
Quality Criteria: For each evaluation question, determine the best kinds of data gathering approaches, who will provide the data, and when and how many times the data will be collected.
Project participants, members of the evaluation team, or other stakeholders may be the appropriate respondents for a specific data gathering approach. In cases where it is not possible to involve all participants or stakeholders, random or purposeful sampling is called for.
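As a minimal sketch of the sampling step, assuming a hypothetical roster of participant identifiers, a simple random sample could be drawn as shown below; purposeful sampling would instead select cases against explicit, documented criteria.

```python
import random

# Hypothetical roster; in practice this would come from project records.
participants = [f"participant_{i:03d}" for i in range(1, 121)]

random.seed(42)                                  # fixed seed so the illustrative draw is reproducible
respondents = random.sample(participants, k=30)  # simple random sample of 30 respondents

print(respondents[:5])
```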
Each evaluation question implies appropriate scheduling of data gathering. Formative evaluation questions imply activity during, and possibly preceding, project implementation. Summative evaluation questions can involve data gathering before, during, and after implementation. Sometimes it may be advisable to repeat the same data gathering procedure (e.g., classroom observations) at multiple points during a project. Interest in the long-term effects of a project calls for additional data gathering after a suitable amount of time has elapsed.
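One way to keep these decisions explicit is a simple data collection matrix recording, for each evaluation question, the approach, the respondents, and the schedule; the sketch below uses invented entries only to show the shape such a plan might take.

```python
# Hypothetical data collection matrix: one entry per evaluation question.
data_collection_plan = [
    {
        "question": "Did teachers' instructional practices change?",
        "approach": "classroom observation",
        "respondents": "participating teachers",
        "schedule": ["fall (baseline)", "winter", "spring"],  # repeated during the project
    },
    {
        "question": "Did student achievement improve?",
        "approach": "criterion-referenced assessment",
        "respondents": "students of participating teachers",
        "schedule": ["pre-implementation", "post-implementation", "one-year follow-up"],
    },
]

for entry in data_collection_plan:
    print(f"{entry['question']} -> {entry['approach']} ({', '.join(entry['schedule'])})")
```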
Professional Standards & References to Best Practice: User-Friendly Handbook for Project Evaluation, Chapter Two
Principled Assessment Designs
Glossary Entry: Creating a data gathering process that gives strong evidence of the desired universe of outcomes.
Quality Criteria: Development of data-gathering instruments should be based on a coherent set of activities that lead to the adoption and implementation of instruments that yield valid and reliable evidence of project effects.
The following models serve as the basis for
principled assessment designs:
- Student Model: Identify the configuration
of students' knowledge, skills, or other attributes
that should be measured.
- Evidence Model: Determine the behaviors
or performances that should reveal the knowledge and
skills articulated in the student model.
- Task Model: Construct the tasks or
situations that elicit the behaviors or performances
defined in the evidence model.
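A minimal sketch of how the three models connect for a single measurement target appears below; the field names and the example content are illustrative assumptions, not the notation of Mislevy et al. (2001).

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AssessmentDesign:
    """Links one measurement target to the evidence and tasks that support it."""
    student_model: str                                       # knowledge, skill, or attribute to measure
    evidence_model: List[str] = field(default_factory=list)  # behaviors/performances that reveal it
    task_model: List[str] = field(default_factory=list)      # tasks/situations that elicit those behaviors

# Invented example for a hypothetical mathematics project.
design = AssessmentDesign(
    student_model="Interprets linear graphs of real-world data",
    evidence_model=["identifies slope as a rate of change",
                    "predicts unobserved values from the graph"],
    task_model=["given a distance-time graph, describe the motion and predict position at t = 10"],
)
print(design)
```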
Professional Standards & References to Best Practice: See Mislevy et al. (2001).
Item Construction & Instrument Development
Glossary Entry: The process of determining how each instrument item will prompt appropriate and high-quality data.
Quality Criteria: Best practices in item construction are grounded in respected methodological frameworks acceptable to all stakeholders and the evaluation research community.
Use of established instruments that align with evaluation questions is preferable to the development of new instruments. When new instruments are called for, items should be written clearly, and instrument development should be guided by known psychometric properties. Standardized assessments, in particular, call for rigorous development and should conform to the standards of the American Psychological Association. There are also accepted guidelines for survey development.
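As one concrete example of attending to a psychometric property during development, the sketch below computes Cronbach's alpha (an internal-consistency estimate) for a small set of invented survey responses; by convention, values around 0.7 or higher are usually taken as acceptable.

```python
# Hypothetical responses: rows = respondents, columns = survey items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(data):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(data[0])                                          # number of items
    item_variances = [variance([row[j] for row in data]) for j in range(k)]
    total_variance = variance([sum(row) for row in data])     # variance of respondents' total scores
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```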
The items that make up any data-gathering instrument should be comprehensive and defensible. Any instrument also should be complete, fair, and free from bias. Items should be reviewed to assure sensitivity to gender and cultural diversity. In addition, instruments should be structured to capture not only project strengths but also project weaknesses. Here, the evaluator must anticipate possible problems with project implementation (e.g., high participant turnover, high difficulty level of training concepts) and design items to assess the prevalence of such problems.
Professional Standards & References to Best Practice:
- User-Friendly Handbook for Project Evaluation, Chapter Three
- Standards for Educational and Psychological Testing
- See Dillman (1999) and Sudman, Bradburn, & Schwarz (1996)
- Program Evaluation Standards, U3 Information Scope and Selection: Information collected should be broadly selected to address pertinent questions about the program and be responsive to the needs and interests of clients and other specified stakeholders.
- Program Evaluation Standards, A4 Defensible Information Sources: The sources of information used in a program evaluation should be described in enough detail that the adequacy of the information can be assessed.
- Program Evaluation Standards, P5 Complete and Fair Assessment: The evaluation should be complete and fair in its examination and recording of strengths and weaknesses of the program being evaluated, so that strengths can be built upon and problem areas addressed.
Glossary Entry: Ascertaining the practicality and usefulness of all data gathering instruments prior to their first use.
Quality Criteria: Best practices in instrument selection and development require a systematic process of pilot testing. Where project participants or stakeholders are the respondents, a small group of individuals drawn from or matched to this sample should complete the instruments and give feedback to the evaluation team about the clarity and meaningfulness of all items. As individuals try out an instrument, it may be useful to have them engage in a think-aloud protocol or debriefing.
For instruments administered or completed by members of the evaluation team, systematic training and pilot testing are required not only to judge the instruments' quality but also to assure consistency among all team members in their use. For example, when pilot-testing a classroom observation tool, the team of observers will need repeated practice observing the same situation so that all obstacles to inter-rater agreement are addressed (e.g., further clarification of coding categories).
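A quick check of inter-rater agreement after such a practice session might look like the sketch below, which computes simple percent agreement between two observers' hypothetical codes; a chance-corrected index such as Cohen's kappa is often reported alongside it.

```python
# Hypothetical codes assigned by two observers to the same ten classroom segments.
observer_a = ["lecture", "groupwork", "groupwork", "lecture", "discussion",
              "groupwork", "lecture", "discussion", "groupwork", "lecture"]
observer_b = ["lecture", "groupwork", "discussion", "lecture", "discussion",
              "groupwork", "lecture", "groupwork", "groupwork", "lecture"]

matches = sum(a == b for a, b in zip(observer_a, observer_b))
percent_agreement = 100.0 * matches / len(observer_a)

# Disagreements point to coding categories that need further clarification before data collection.
print(f"Percent agreement: {percent_agreement:.0f}%")
```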
Professional Standards & References to Best Practice: User-Friendly Handbook for Project Evaluation
References
American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (1985, 1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
Dillman, D.A. (1999). Mail and internet surveys: The tailored design method. New York: John Wiley & Sons.
Mislevy, R.J., Steinberg, L.S., Almond, R.G., Haertel, G.D., and Penuel, W.J. (2001). Leverage Points for Improving Educational Assessment. CRESST Technical Paper Series. Los Angeles, CA: CRESST.
Stevens, F., Lawrenz, F., and Sharp, L. (1993 & 1997). User-Friendly Handbook for Project Evaluation: Science, Mathematics, Engineering, and Technology Education. Arlington, VA: National Science Foundation.
Sudman, S., Bradburn, N.M., and Schwarz, N. (1996). Thinking about answers: The application of cognitive processes to survey methodology. San Francisco: Jossey-Bass.
Not sure where to start?
Try reading some user
scenarios for instruments.