

  
Idaho State University's One-page
Newsletter for Teaching Excellence

Volume 13, Number 1, January, 2005
Center for Teaching and Learning
Museum 434 Campus Box 8010
Pocatello, ID 83209-8010

 
Phone (208)282-4703
FAX (208)282-5361
nuhfed@isu.edu

 

 
  

Assessment: How reliable are our tests? Part 1


Prior to the development of assessment methods in higher education, most of us presumed that our tests were measures of “actual knowledge” and that student learning could be measured perfectly by tests and quizzes alone. We still use tests and quizzes to derive grades in order to evaluate individual students, but it is important here to understand the difference between evaluation of individuals and assessment. Assessment looks not at individuals, but rather at units such as a class, a course, or a curriculum as a whole. Thus, the tools of assessment and the concepts of interpretation differ from those used in evaluation of individuals.

Assessment encourages us to look at our tests in the context of our classes as a whole. When we do, we find some valuable insights. One is the concept of test reliability. Here, we can use statistical correlation as a check. Calculation of a linear correlation coefficient (r) reveals how “perfect” a relationship may be between two variables. In Figure 1, a perfect correlation is shown in “A,” where r = 1.0 reveals a plot of two variables in which all points fit perfectly along a line. The relationship in “B” shows absolutely no linearity between two variables. The coefficient calculated from these points is zero. We can use the correlation to see how reliable our tests are. We need two variables, so we could give all of our students two tests, and see how consistently the tests measure the same students’ knowledge. If “perfect,” the plot will look like that in “A.” But because giving two tests is a lot of work, we can use a standard method called “split halves” to discover the degree of reliability our tests provide. This involves randomly splitting a single test into two tests, such as using odd numbered items as one test and even numbered as another.

If “perfectly reliable,” each half should give the same result per student and a plot like “A” will result. Another way is to look at our past semester’s grade sheet and treat our entire course grading as a single test. Thus if we gave ten quizzes or four tests, we could split our quizzes/tests randomly and do the same check. “C” results from a split half analysis on ten quizzes. It’s a good result, but far from perfect, and shows that tests are not perfectly reliable. In fact, no single measure of student learning is perfect, and that’s why assessment requires multiple measures. In routine test design, one hopes for an r value greater than 0.6. However, if you’ve never used your own class data to make such a check, you don’t yet know the reliability of your own testing.
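For readers who keep item-by-item results, the split-halves check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (pearson_r, split_half_r) and the five-student, eight-item score table are hypothetical and made up for the example, not data from any ISU class. The split uses odd and even numbered items, as in the text.

```python
import math

def pearson_r(x, y):
    """Linear correlation coefficient (r) of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_r(item_scores):
    """item_scores holds one list per student: 1 = item correct, 0 = incorrect."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical (made-up) results for five students on an eight-item test.
hypothetical_scores = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]

print(round(split_half_r(hypothetical_scores), 3))
```

The same two functions work on a grade sheet: replace each student's list of item scores with that student's list of quiz scores, and the odd and even numbered quizzes become the two halves.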

You can try this yourself for your own tests. The Excel package in your office computer can calculate r values. The paragraphs below in this newsletter explain how to do this. What’s that plot in “D”? Well, it’s not a test for evaluating individuals; instead it’s a knowledge survey for assessing student learning in our First Year Seminar classes, and the results show great internal reliability of that tool. We’ll cover more on tests and knowledge surveys in the next issue. In the meantime, whatever you do, don’t forget to sign up for the assessment workshop on February 25. See back of newsletter for details.

---------------------------------------------------------------------------------------

Calculatin’ da correlation coefficient with da Excel® Spreadsheet

If you managed to get a doctorate without calculating a correlation coefficient and doing a least-squares line fit, then congratulate yourself; most of us were not so fortunate! This was an unpleasant, laborious task until computers; now it’s a cinch. Suppose you have given a test to ten students. You have split the test into even items and odd items, and graded each half. You now have two grades for each student from the same test. (Alternatively, you could have given two tests, and you’d like to see how reliably the two tests compare. If any student missed a test or took a makeup that differed from the test you are analyzing, remove such students from the data set. You want clean data from only the test or tests you are examining. In any event, you now have a data pair for each student.)

Type the data into two columns of an Excel spreadsheet as shown in Figure 1. Each row represents a student’s data. Click on the Tools menu. You may see “Data Analysis” as an option in the pull-down menu. If not, click on “Add-Ins” and select the “Analysis ToolPak.” Click “OK.” “Data Analysis” will then appear as an option under Tools. Select “Correlation” and click “OK.” Because we have labels in the first row, check the box “Labels in first row.” We want to correlate our data arranged in two columns, so click on “Columns.” To keep life easy, select “New Worksheet Ply” for outcomes. For the input range, click on the upper left cell (the one with “Odds” in it), type a colon (:), then click on the lower right cell. The input range is always upper left to lower right of the data set. If you want to check, say, correlations of five quizzes against one another, then you can have five columns in your data set.

As soon as you click OK, your correlation coefficient(s) should appear, and will look like Figure 2. The data in Figure 1 yield the r-value (0.528) shown in Figure 2. Use the data here for a practice run with Excel®.

Odds    Evens
74      94
67      82
66      87
55      79
59      81
46      91
67      85
52      82
62      79
43      69

Figure 1. Raw input data from test scores.

 

         Odds     Evens
Odds     1
Evens    0.528    1

Figure 2. Output: the correlation coefficient (r = 0.528) calculated by the Excel® spreadsheet. Don’t be surprised if your tests show lower internal correlations as measures of reliability than you presumed; our tests are usually not as stable tools as we think. If you seek to compare test data with another measure, remember that you can’t expect any correlation higher than the internal reliability of the least reliable tool you use.
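If Excel’s Data Analysis option is not at hand, the practice run can be cross-checked with a short Python sketch. The ten data pairs below are the Odds and Evens scores from Figure 1; the function name pearson_r is simply a label chosen for this example, and the arithmetic is the ordinary linear correlation coefficient rather than anything specific to Excel.

```python
import math

# Odds and Evens half-test scores for the ten students in Figure 1.
odds = [74, 67, 66, 55, 59, 46, 67, 52, 62, 43]
evens = [94, 82, 87, 79, 81, 91, 85, 82, 79, 69]

def pearson_r(x, y):
    """Linear correlation coefficient (r) of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(odds, evens)
print(f"r = {r:.3f}")  # agrees with the 0.528 shown in Figure 2
```

To try it with your own class, replace the two lists with your own paired scores, keeping one entry per student in the same order in both lists.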

 

Assessing along the Continuum of Students' Learning
Dr. Peggy Maki
Friday, February 25, 8:30 A.M. to about 3:00 P.M., Red Lion Inn by I-15 Pocatello Creek Road Exit
Breakfast & Lunch provided
Early Registrants Receive Assessing for Learning: Building a Sustainable Commitment Across the Institution, 2004, Stylus Press, 204 p.

To register, email to nuhfed@isu.edu and give your ISU mail box number

Beginning with research on learning, this workshop will present collaborative principles, practices, and strategies for assessing student learning at the institution and department levels as students progress through their studies. The workshop will demonstrate collaborative steps involved in assessing student learning.

 

 

Who’s Peggy Maki?

Higher education consultant, Peggy L. Maki, Ph.D., specializes in assisting institutions to integrate assessment of student learning into educational practices, processes and structures. Her work also focuses on assessment within the context of accreditors' expectations for institutional effectiveness. She has recently been named to the Board of Contributors of About Campus, Department Editor of Assessment for About Campus, Assessment Field Editor at Stylus Publishing, LLC, and to the Advisory Board of the Wabash Center for Critical Inquiry. She serves as a faculty member in AAC&U's Institute on General Education; this past summer she served as a faculty member in the Carnegie Foundation's Integrated Learning Project. Beginning in the Summer, 2005, she will be teaching graduate seminars focused on assessment at two universities.

Formerly, Senior Scholar and Director of Assessment at the American Association for Higher Education (AAHE), she has served as Associate Director of the Commission on Institutions of Higher Education, New England Association of Schools and Colleges, Inc., New England’s regional accrediting body; Vice President, Academic Dean, Dean of Faculty, and Professor of English, Bradford College, MA; Chair of English, Theatre Arts, and Communication, Associate Professor of English, and Dean of Continuing Education, Arcadia University, PA. She is a recipient of a national teaching award, the Lindback Award for Distinguished Teaching.

She has conducted over 300 workshops and keynote addresses on assessment both in the U.S. and abroad, including New Zealand, Hong Kong, Mexico, Greece, Bulgaria, British Columbia, and Malaysia. Her articles on assessing student learning have appeared in AAHE’s Bulletin, AAHE’s Inquiry and Action series, About Campus, Assessment Update, Change Magazine, The Journal of Academic Librarianship, NetResults, and Proceedings of the International Conference on Teaching and Learning, held at the National University of Singapore (keynote address). Her writing also includes articles, chapters in books, and a book on the teaching of writing. Additionally, she conducts writing-across-the-curriculum workshops that develop and document student learning.

She is in the process of editing a book on assessment practices at the doctoral level and developing a workbook to accompany her recently published handbook on assessment: Assessing for Learning: Building a Sustainable Commitment across the Institution, published in June, 2004, by Stylus Publishing, LLC, and AAHE.

 

 
       
      