Assessment:
How reliable are our tests? Part 1
Prior to the
development of assessment methods in higher education, most of us
presumed that our tests were measures of "actual knowledge" and
that student learning could be measured perfectly by tests and quizzes
alone. We still use tests and quizzes to derive grades in order
to evaluate individual students, but it is important here to understand
the difference between evaluation of individuals and assessment.
Assessment looks not at individuals, but rather at units such as
a class, a course, or a curriculum as a whole. Thus, the tools of
assessment and the concepts of interpretation differ from those
used in evaluation of individuals.
Assessment
encourages us to look at our tests in the context of our classes
as a whole. When we do, we find some valuable insights. One is the
concept of test reliability. Here, we can use statistical correlation
as a check. Calculation of a linear correlation coefficient (r)
reveals how "perfect" a relationship may be between two variables.
In Figure 1, a perfect correlation is shown in "A," where r = 1.0
reveals a plot of two variables in which all points fit perfectly
along a line. The relationship in "B" shows absolutely no linearity
between two variables. The coefficient calculated from these points
is zero. We can use the correlation to see how reliable our tests
are. We need two variables, so we could give all of our students
two tests, and see how consistently the tests measure the same students'
knowledge. If "perfect," the plot will look like that in "A." But
because giving two tests is a lot of work, we can use a standard
method called "split halves" to discover the degree of reliability
our tests provide. This involves splitting a single test
into two tests, for example by using the odd-numbered items as one test and
the even-numbered items as another.
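The split-halves idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item-level responses below are hypothetical (five students, eight items, 1 = correct, 0 = incorrect), and the helper names `pearson_r` and `split_half_scores` are made up for this sketch rather than taken from any package.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def split_half_scores(item_scores):
    """Return (odd_total, even_total) for one student's item scores.

    Items are numbered from 1, so item 1 sits at index 0.
    """
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total


# Hypothetical data: one row per student, one column per test item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
]

halves = [split_half_scores(row) for row in responses]
odds = [odd for odd, _ in halves]
evens = [even for _, even in halves]
r = pearson_r(odds, evens)
print(f"split-half r = {r:.3f}")
```

With real class data, each row would hold one student's item scores from a single test, and the resulting r is read the same way as the plots described in the text.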
If "perfectly reliable," each half should give the same result per
student and a plot like "A" will result. Another way is to look
at our past semester's grade sheet and treat our entire course grading
as a single test. Thus if we gave ten quizzes or four tests, we
could split our quizzes/tests randomly and do the same check. "C"
results from a split-half analysis on ten quizzes. It's a good result,
but far from perfect, and shows that tests are not perfectly reliable.
In fact, no single measure of student learning is perfect, and that's
why assessment requires multiple measures. In routine test design,
one hopes for an r value greater than 0.6. However, if you've never
used your own class data to make such a check, you don't yet know
the reliability of your own testing.
You
can try this yourself for your own tests. The Excel package in your
office computer can calculate r values. The paragraphs below in
this newsletter explain how to do this. What's that plot in "D?"
Well, it's not a test for evaluating individuals; instead it's a
knowledge survey for assessing student learning in our First Year
Seminar classes, and the results show great internal reliability
of that tool. We'll cover more on tests and knowledge surveys in
the next issue. In the meantime, whatever you do, don't forget to
sign up for the assessment workshop on February 25. See back of
newsletter for details.

---------------------------------------------------------------------------------------
Calculating a correlation coefficient with an Excel® Spreadsheet
If you managed to get a doctorate without calculating a correlation
coefficient and doing a least-squares line fit, then congratulate
yourself; most of us were not so fortunate! This was an unpleasant,
laborious task until computers; now it's a cinch. Suppose you
have given a test to ten students. You have split the test into
even items and odd items, and graded each half. You now have two grades
for each student from the same test. (Alternatively, you could have
given two tests, and you'd like to see how reliably the two tests
compare. If any student missed a test or took a makeup that differed
from the test you are analyzing, remove such students from
the data set. You want clean data from only the test or tests you
are examining. In any event, you now have a data pair for each student.)
Type the data into two columns of an Excel spreadsheet as shown in
Figure 1. Each row represents a student's data. Click on the "Tools"
menu. You may see "Data Analysis" as an option in the
pull-down menu. If not, click on "Add-Ins" and select
the "Analysis ToolPak." Click "OK." "Data
Analysis" will then appear as an option under "Tools." Select
"Correlation" and click "OK." Because we have
labels in the first row, check the box "Labels in first row."
We want to correlate our data arranged in two columns, so click
on "Columns." To keep life easy, select "New Worksheet
Ply" for the output. For the input range, click on the upper
left cell (the one with "Odds" in it), type a colon (:),
then click on the lower right cell. The input range always runs from the upper
left to the lower right of the data set. If you want to check, say,
correlations of five quizzes against one another, then you can have
five columns in your data set. As soon as you click "OK," your correlation
coefficient(s) should appear, and will look like Figure 2. The data
in Figure 1 yield the r-value (0.528) shown in Figure 2. Use the
data here for a practice run with Excel®.
| Odds | Evens |
| 74   | 94    |
| 67   | 82    |
| 66   | 87    |
| 55   | 79    |
| 59   | 81    |
| 46   | 91    |
| 67   | 85    |
| 52   | 82    |
| 62   | 79    |
| 43   | 69    |
Figure 1. Raw input data from test scores.
|       | Odds  | Evens |
| Odds  | 1     |       |
| Evens | 0.528 | 1     |
Figure 2. Output: the correlation coefficient (r = 0.528) calculated
by the Excel® spreadsheet. Don't be surprised if
your tests show lower internal correlations as measures of reliability
than you presumed; our tests are usually not as stable tools as
we think. If you seek to compare test data with another measure,
remember that you can't expect a correlation higher than the
internal reliability of the least reliable tool you use.
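For readers without Excel at hand, the same check can be sketched in Python. The scores are the ones listed in Figure 1. The helper names `pearson_r` and `correlation_table` are illustrative assumptions, not part of Excel or of any library; the table simply mimics the lower-triangular layout of Figure 2 and extends to more columns, such as five quizzes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_table(columns):
    """Lower-triangular table of r values keyed by (row_name, column_name)."""
    names = list(columns)
    table = {}
    for i, row_name in enumerate(names):
        for col_name in names[: i + 1]:
            table[(row_name, col_name)] = pearson_r(
                columns[row_name], columns[col_name]
            )
    return table


# Scores from Figure 1, one entry per student.
scores = {
    "Odds": [74, 67, 66, 55, 59, 46, 67, 52, 62, 43],
    "Evens": [94, 82, 87, 79, 81, 91, 85, 82, 79, 69],
}

table = correlation_table(scores)
for (row_name, col_name), r in table.items():
    print(f"{row_name:>5} vs {col_name:<5} r = {r:.3f}")
```

The Evens-versus-Odds entry rounds to 0.528, in agreement with Figure 2. Adding further named columns to `scores` gives the pairwise correlations among them.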
Assessing along the Continuum of Students' Learning
Dr. Peggy Maki
Friday, February 25, 8:30 A.M. to approximately 3:00 P.M.
Red Lion Inn, by the I-15 Pocatello Creek Road exit
Breakfast and lunch provided
Early registrants receive Assessing for Learning: Building
a Sustainable Commitment Across the Institution, 2004, Stylus
Press, 204 p.
To register, email nuhfed@isu.edu
and give your ISU mailbox number.
Beginning
with research on learning, this workshop will present collaborative
principles, practices, and strategies for assessing student
learning at the institution and department levels as students
progress through their studies. The workshop will demonstrate
collaborative steps involved in assessing student learning.
Who's Peggy Maki?
Higher education consultant Peggy L. Maki, Ph.D., specializes in assisting
institutions to integrate assessment of student learning into educational
practices, processes and structures. Her work also focuses on assessment
within the context of accreditors' expectations for institutional
effectiveness. She has recently been named to the Board of Contributors
of About Campus, Department Editor of Assessment for About Campus,
Assessment Field Editor at Stylus Publishing, LLC, and to the Advisory
Board of the Wabash Center for Critical Inquiry. She serves as a
faculty member in AAC&U's Institute on General Education; this
past summer she served as a faculty member in the Carnegie Foundation's
Integrated Learning Project. Beginning in summer 2005, she
will be teaching graduate seminars focused on assessment at two
universities.
Formerly Senior Scholar and Director
of Assessment at the American Association for Higher Education (AAHE),
she has served as Associate Director of the Commission on Institutions
of Higher Education, New England Association of Schools and Colleges,
Inc., New England's regional accrediting body; Vice President,
Academic Dean, Dean of Faculty, and Professor of English, Bradford
College, MA; Chair of English, Theatre Arts, and Communication,
Associate Professor of English, and Dean of Continuing Education,
Arcadia University, PA. She is a recipient of a national teaching
award, the Lindback Award for Distinguished Teaching.
She has conducted over 300 workshops
and keynote addresses on assessment both in the U.S. and abroad,
including New Zealand, Hong Kong, Mexico, Greece, Bulgaria, British
Columbia, and Malaysia. Her articles on assessing student learning
have appeared in AAHE's Bulletin, AAHE's Inquiry and Action
series, About Campus, Assessment Update, Change Magazine, The Journal
of Academic Librarianship, NetResults, and Proceedings of the International
Conference on Teaching and Learning, held at the National University
of Singapore (keynote address). Her writing also includes articles,
chapters in books, and a book on the teaching of writing. Additionally,
she conducts writing-across-the-curriculum workshops that develop
and document student learning.
She is in the process of editing
a book on assessment practices at the doctoral level and developing
a workbook to accompany her recently published handbook on assessment:
Assessing for Learning: Building a Sustainable Commitment across
the Institution, published in June, 2004, by Stylus Publishing,
LLC, and AAHE.