Enhancing Learning and Assessment Through Confidence-Based Marking

 

Abstract accepted for a paper at the 1st International Conference on "Enhancing Teaching and Learning through Assessment", Hong Kong, 12-15 June 2005

 

Principally relevant to the themes:

A.R. Gardner-Medwin, Dept. Physiology, University College London, London WC1E 6BT, UK

Email:  ucgbarg@ucl.ac.uk       Web-site: www.ucl.ac.uk/lapt


Confidence-based marking (CBM) has been known for many years to stimulate reflection and constructive thinking by students, and to improve both the reliability and validity of exam data in measuring partial knowledge.  See for example a summary given by Ahlgren in 1969, available on the web (Confidence on Achievement Tests - Theory, Applications : http://www.p-mmm.com/founders/AhlgrenBody.htm ).  However, CBM has been adopted in very few places and is sometimes, rather surprisingly, regarded with scepticism.

At UCL, in our medical course, we have used a simple yet theoretically sound version of CBM for ten years, including 4 years' experience of use in summative exams. The conference presentation will explain the rationale for adopting CBM and the extent of its success, as evidenced by student evaluations and analysis of a large body of data from formative and summative assessment.  It will also explain recent developments (funded by HEFCE) that make it easy to disseminate CBM to other institutions and other disciplines, and to integrate it into Virtual Learning Environments (such as WebCT) that do not at present offer CBM as an option.


Our system employs three confidence levels: 1, 2 and 3. After answering each question (True/False, MCQ, numeric, or open text) that will be marked categorically right or wrong, students rate their confidence in the answer. With low confidence (level 1) they receive just 1 mark if correct and no penalty if wrong. At levels 2 and 3 they receive 2 or 3 marks respectively if correct, but increasing penalties (normally -2 and -6 marks) if wrong.  This provides a properly motivating mark scheme, whereby a student always stands to gain by honestly expressing their best estimate of the reliability of their answer, based on careful reflection and an attempt to justify either high confidence or reasons for reservation.  A student should use C=1 if less than 67% sure of being correct, C=3 if more than 80% sure, and C=2 in between.
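
The confidence thresholds quoted above follow from the mark scheme itself: they are the probabilities at which the expected marks of adjacent confidence levels are equal. The following is a minimal sketch of that arithmetic, assuming the normal marks and penalties stated in the text; the names MARKS, expected_mark and best_level are illustrative only and are not taken from the UCL software.

```python
from fractions import Fraction

# Mark scheme as described in the text:
# confidence level -> (mark if correct, mark if wrong)
MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}


def expected_mark(level, p):
    """Expected mark at a confidence level when the answer is correct with probability p."""
    right, wrong = MARKS[level]
    return right * p + wrong * (1 - p)


def best_level(p):
    """Confidence level with the highest expected mark (a tie goes to the lower level)."""
    return max(sorted(MARKS), key=lambda level: expected_mark(level, p))


# Levels 1 and 2 give equal expected marks at p = 2/3 (about 67%),
# and levels 2 and 3 give equal expected marks at p = 4/5 (80%).
assert expected_mark(1, Fraction(2, 3)) == expected_mark(2, Fraction(2, 3))
assert expected_mark(2, Fraction(4, 5)) == expected_mark(3, Fraction(4, 5))

for p in (0.5, 0.7, 0.9):
    print(p, best_level(p), [expected_mark(level, p) for level in sorted(MARKS)])
```

Under this scheme a student who misreports their confidence can only lower their expected mark, which is the sense in which the scheme rewards honest reporting.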


Our students like CBM a lot, finding it more searching in identifying their areas of weakness or misconception. It has proved completely straightforward to use with both new and pre-existing databases of questions established for conventional marking. The software for use on computers has been developed with great flexibility and power, and an emphasis on efficient feedback when students wish to comment on particular questions or explanations. It can be used in stand-alone or web-browser formats, and with students working on campus or at home. Marks can be inserted into WebCT gradebooks (as used at UCL and Imperial College London) using protocols that can also be adapted for different VLE systems.


An important issue is the reaction of staff to CBM use. Concerns are sometimes expressed that CBM would somehow favour one or other gender, or particular personality types. Our data (Gardner-Medwin & Gahan, 1995, 2003: see web site) shows no evidence at all of gender differences, despite high sensitivity and a clearly significant tendency for both sexes to be more cautious in their expression of high confidence in exams than when doing formative assessment to aid study. Excessive diffidence or unwarranted confidence might disadvantage a student (though to some extent we can apply corrections for such behaviour in exams). But either of these traits is something a student should become aware of and attempt to correct, for which they need exactly the kind of feedback offered by CBM in formative assessment.


Our exam data (obtained with optical mark reader cards: Speedwell Computing Ltd.) permits comparison of CBM scores with conventional (number-correct) scores. This has revealed a marked improvement in the standard Cronbach alpha measure of reliability, from 0.873 ± 0.012 to 0.925 ± 0.007 with CBM (mean ± SEM in 6 exams, P<0.001).  To achieve the same improvement with conventional marking, by reducing random variance, would require approximately 80% more questions in an exam.  In both qualitative and quantitative ways there seem to be such clear advantages to the use of CBM that wider adoption and evaluation by the teaching and learning community would seem well merited.
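
The figure of approximately 80% more questions can be checked with the Spearman-Brown prediction formula, which relates test length to reliability. The abstract does not say which calculation was used, so the sketch below is an assumption about the method, applied to the mean alpha values quoted above; the function name lengthening_factor is illustrative.

```python
def lengthening_factor(alpha_old, alpha_new):
    """Spearman-Brown: factor by which the number of questions must grow
    to raise reliability from alpha_old to alpha_new."""
    return alpha_new * (1 - alpha_old) / (alpha_old * (1 - alpha_new))


# Mean Cronbach alpha values quoted in the text (conventional vs CBM).
factor = lengthening_factor(0.873, 0.925)
print(f"{factor:.2f}")  # prints 1.79, i.e. roughly 80% more questions
```

The estimate uses only the two mean values; it does not carry the quoted standard errors through the calculation.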