Alpha level
The level of probability that stakeholders are willing to accept that a particular statistical test on a particular set of cases is committing a Type I error. The most common alpha level used in evaluations is 5%.
Alpha
The probability that a particular statistical test on a particular set of cases is committing a Type I error and falsely declaring an effect that does not really exist. In graphical terms, alpha is the percentage of the estimated normal curve of the population that falls outside the confidence interval on one or both sides of the curve (depending on the nature of the hypothesis being tested).
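As an illustration, the decision rule implied by an alpha level can be sketched in a few lines of Python. The observed z statistic here is invented for illustration, and the normal curve's cumulative distribution is built from the standard library's error function:

```python
import math

def normal_cdf(z):
    # Standard normal cumulative distribution, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def two_sided_p_value(z):
    # Probability, under the null hypothesis, of a statistic at least
    # as extreme as z, counting both tails of the normal curve.
    return 2.0 * (1.0 - normal_cdf(abs(z)))

alpha = 0.05             # the conventional 5% alpha level
z = 2.1                  # hypothetical observed z statistic
p = two_sided_p_value(z)
reject_null = p < alpha  # declare an effect only when p falls below alpha
```

With z = 2.1 the two-sided p-value falls just under 0.05, so an effect would be declared at the 5% level; a stricter alpha (say 1%) would not reject.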
Audiences
People who are interested in the results of the evaluation. This usually includes
the funding agency and the sponsoring organization, but it may also include
other groups, such as the parents of participating students or interested researchers.
Bias
A failure of the evaluation design to capture the true population and implementation characteristics, rendering the results ungeneralizable.
Case
One individual in a group of people or other phenomena being studied (such as, in education, classes or schools).
Categorical variable
A variable whose values are simply categories and cannot be quantified except by counting how many cases fall into each category (such as names of countries or a series of social classes). Also known as a qualitative variable.
Conclusions
Interpretations that have been synthesized in order to extrapolate even broader
meanings about the project from the data (e.g., the low test scores, examined
in conjunction with low student retention rates and low motivation from the
survey data, suggest that the project is not meeting its objective of providing
an interesting curriculum).
Confidence interval
The interval surrounding the mean of the sample that has a specified confidence
of containing the mean of the population.
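A minimal sketch of the normal-approximation version of this interval (sample mean plus or minus 1.96 standard errors for 95% confidence); the scores are hypothetical, and a t-based interval would be more appropriate for very small samples:

```python
import math
import statistics

def confidence_interval_95(sample):
    # 95% confidence interval for the population mean, using the
    # normal approximation (z = 1.96); reasonable for larger samples.
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
    margin = 1.96 * sem
    return (mean - margin, mean + margin)

scores = [72, 85, 78, 90, 66, 81, 77, 88, 74, 83]  # hypothetical test scores
low, high = confidence_interval_95(scores)
```

The interval is centered on the sample mean; collecting more cases shrinks the standard error and therefore narrows the interval.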
Constructed response
A type of question that requires the respondent to compose an answer rather than select from a list of choices (cf. selected response).
Continuous variable
An ordinal variable with values that can be broken down into ever-more granular numeric units (e.g., age, height, and weight).
Criterion-Referenced
A scoring interpretation in which a test score is defined by whether a pre-specified
level of accomplishment has been met.
Dichotomous variable
A categorical variable that is limited to two values that are not necessarily opposites (such as yes/no, low/high, low/very low, agree/disagree, disagree/highly disagree).
Discrete variable
An ordinal variable with values that consist of countable, finite integers that cannot be broken down into more granular units.
Effect size
The magnitude of effect that is desired in order to support the claim that the intervention is successful.
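One widely used standardized effect-size measure, Cohen's d (not defined in this glossary), expresses the difference between two group means in pooled standard-deviation units. A sketch with hypothetical scores:

```python
import math
import statistics

def cohens_d(group_a, group_b):
    # Cohen's d: standardized mean difference using a pooled standard
    # deviation. Rough conventions: 0.2 small, 0.5 medium, 0.8 large.
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

treatment = [82, 88, 75, 91, 84, 79]  # hypothetical post-test scores
control = [74, 70, 77, 68, 73, 72]
d = cohens_d(treatment, control)
```

Because d is expressed in standard-deviation units, it can be compared across tests that use different score scales.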
Effect
An outcome that can be said to be at least partially the result of an intervention
rather than caused by other intervening factors.
Efficiency
When your sampling scheme maximizes your power (by generating large samples
at the primary unit of analysis) while not needlessly over-sampling at secondary
units of analysis.
Errors of Measurement
Sources of variability that interfere with an accurate test score and influence test results in unexpected ways.
Formative Evaluation
Evaluation that examines the effectiveness of the project's implementation for the sake of facilitating project improvement.
Goal
A broad description of an intended outcome.
Interpretations
Meanings that have been inferred and extrapolated from the data (e.g., the scores
were low relative to expectations).
Learning Assessment
A systematic measurement tool for capturing some aspect of learning.
Nominal scale
An arrangement of values of a categorical variable that has no meaningful order
(such as hair color or occupation).
Numeric variable
A variable with numeric values and a natural order. Also known as a quantitative variable.
Norm-Referenced
A scoring interpretation in which a test score is defined according to how others
perform on the same test.
Objective
A more specific description of an intended outcome. Objectives are usually stated
in ways that allow the amount of attainment to be measured. In education, objectives
are typically about cognition, affect, or psychomotor skill.
Observation
The particular value assigned to a case on a particular variable.
Ordinal scale
An arrangement of the values of a variable in a meaningful order, ranging from the highest value (such as "very interested") to the lowest value (such as "not at all interested").
Participants
Stakeholders who are engaged in project activities.
For example, in a project that involves implementing a new curriculum, the participants
might be the instructors teaching the new curriculum and the students receiving
it.
Power
The probability that your statistical tests will detect an effect that really exists (i.e., the probability that the test will not commit a Type II error).
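Power can be estimated by simulation: generate many experiments in which a real effect exists and count how often the test detects it. A rough pure-Python sketch, assuming normally distributed outcomes with known spread:

```python
import math
import random
import statistics

def estimated_power(effect_size, n, trials=500, seed=0):
    # Monte Carlo estimate of power for a two-sample z-test at the 5%
    # alpha level (two-sided, critical z = 1.96): the fraction of
    # simulated experiments with a real effect of `effect_size`
    # standard deviations in which the test rejects the null.
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        sem = math.sqrt(2.0 / n)  # standard error with known sd of 1
        z = (statistics.mean(treated) - statistics.mean(control)) / sem
        if abs(z) > 1.96:
            rejections += 1
    return rejections / trials

power = estimated_power(effect_size=0.8, n=25)
```

Raising the sample size or the true effect size raises the estimated power; this is the usual lever for avoiding Type II errors at the design stage.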
Primary unit of analysis
The broadest unit of analysis that you sample from and analyze distinctively because it has distinguishing attributes that could influence intervention outcomes.
Qualitative Data
Non-quantified narrative information.
Qualitative Analysis
The use of systematic procedures for deriving meaning from qualitative information.
It often involves an inductive, interactive, and iterative process whereby the
evaluator returns to relevant audiences and data sources
to confirm and/or expand the purposes of the evaluation and test conclusions.
Can be conducted on data collected using interviews, observations, and open-ended
questions on content assessments, as well as on other types of instruments.
Content, thematic, and cognitive analyses are some of the approaches that are
used to analyze qualitative data.
Quantitative Data
Quantifiable, numerically-expressed information.
Quantitative Analysis
The use of computational procedures and statistical tests to examine quantitative
data.
Ratio scale
The ordering of numeric values when zero is meaningful (such as money or weight).
Reliability
"The extent to which we are measuring some attribute in a systematic and therefore repeatable way" (Walsh & Betz, 1995, p. 49). For an instrument to be reliable, its results must be reproducible and stable under the different conditions in which it is likely to be used. Test reliability is decreased by errors of measurement.
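One common internal-consistency reliability estimate, Cronbach's alpha (this particular coefficient is an assumption, not named in the glossary entry), can be sketched as follows with hypothetical item scores:

```python
import statistics

def cronbachs_alpha(item_scores):
    # item_scores: one inner list per test item, each holding one score
    # per person. Cronbach's alpha compares the summed item variances
    # to the variance of each person's total score; values near 1
    # indicate that the items measure the attribute consistently.
    k = len(item_scores)
    item_variances = sum(statistics.variance(item) for item in item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

items = [
    [3, 4, 2, 5, 4, 3],  # hypothetical per-person scores on item 1
    [2, 4, 3, 5, 3, 3],  # item 2
    [3, 5, 2, 4, 4, 2],  # item 3
]
reliability = cronbachs_alpha(items)
```

Low values suggest the items are not measuring the attribute systematically, i.e., that errors of measurement are eroding reliability.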
Results
Relevant information gleaned from the data collected in the evaluation.
Scale
The order of values of a variable.
Secondary unit of analysis
Subgroups of your primary unit of analysis that you sample from.
Selected response
A type of question that requires the respondent to select an answer from a list of choices rather than compose an answer (cf. constructed response).
Stakeholders
Individuals who have a stake or interest in a project.
Summative Evaluation
Evaluation that examines the project's impact in order to make a decision about its overall effectiveness.
Type I error
When a statistical test falsely detects an effect that does not really exist.
Type II error
When a statistical test fails to detect an effect that really exists.
Validity
"The extent to which the test being used actually measures the characteristic or dimension we intend to measure" (Walsh & Betz, 1995, p. 58).
Value
The attribute of a case on a particular variable.
Variable
An attribute of something being studied or observed that can be assigned a value.