Marks for identifying uncertainty: Stimulation of learning through Certainty-Based Marking Tony Gardner-Medwin Physiol PowerPoint Presentation
Cambridge Assessment June 2009. Marks for identifying uncertainty: Stimulation of learning through Certainty-Based Marking Tony Gardner-Medwin Physiology, University College London www.ucl.ac.uk/LAPT. The take home message:. We should reward the acknowledgment of uncertainty.

Presentation Transcript
slide1

Cambridge Assessment

June 2009

Marks for identifying uncertainty:

Stimulation of learning through

Certainty-Based Marking

Tony Gardner-Medwin

Physiology, University College London

www.ucl.ac.uk/LAPT

slide2

The take home message:

We should reward the acknowledgment of uncertainty

Starting points (you may agree or disagree!)

  • The nature of assessment affects how students learn & think
  • Objective tests/exercises can stimulate learning & understanding
  • Formative assessment is more important than summative
  • Different Q types suit different situations, e.g. T/F, SBA, free text
  • Scaling to “% above chance” (%Knowledge) should be universal
  • Negative marking can be either really constructive or really awful
  • Students & kids can enjoy assessment if it is stimulating, fair, varied, challenging, immediately rewarding, not humiliating -- like a game.
slide3

  • How Certainty-Based Marking works
  • How it relates to probability & knowledge
  • How students react & use it
  • CBM as summative assessment
  • Why isn’t it used more?
slide5

Which Certainty Level is Best?

[Figure: mark expected on average (y-axis, -6 to 3) plotted against "How likely is your answer to be correct?" (x-axis, 0% to 100%), with one line each for C=1 Low, C=2 Mid, C=3 High and No Reply. Other labels on the plot: 67%, 80%, "guessing", "range".]
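The choice of certainty level can be worked through numerically. The sketch below assumes the standard LAPT scheme quoted later in this deck (marks 1, 2, 3 for a correct answer at C=1, 2, 3 and 0, -2, -6 for a wrong one). The function names are illustrative and are not taken from the LAPT software.

```python
# LAPT marks quoted in this deck: certainty -> (mark if right, mark if wrong).
# All names below are illustrative, not part of any real LAPT API.
LAPT_MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}


def expected_mark(p_correct, certainty):
    """Average mark for an answer that is correct with probability p_correct."""
    right, wrong = LAPT_MARKS[certainty]
    return p_correct * right + (1 - p_correct) * wrong


def best_certainty(p_correct):
    """Certainty level with the highest expected mark."""
    return max(LAPT_MARKS, key=lambda c: expected_mark(p_correct, c))


# Show the expected mark at each level for a few probabilities.
for p in (0.5, 0.7, 0.9):
    marks = ", ".join(f"C={c}: {expected_mark(p, c):+.2f}" for c in LAPT_MARKS)
    print(f"p={p:.0%}  {marks}  -> best C={best_certainty(p)}")
```

Under this scheme the expected marks are p at C=1, 4p - 2 at C=2 and 9p - 6 at C=3. Setting neighbouring levels equal gives crossover points at p = 2/3 (67%) and p = 4/5 (80%), which agree with the two percentages shown on the slide.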

slide7

Ordinary words we use to describe Knowledge

  • knowledge
  • uncertainty
  • don't know
  • misconception
  • delusion

Decreasing certainty about what is true.

Increasing certainty about something false.

Increasing "ignorance"

"It's not ignorance does so much damage - it's knowin' so derned much that ain't so." attrib. J. Billings

“I was gratified to be able to answer promptly, and I did! I said I didn't know.” Mark Twain

  • Knowledge is a function of certainty (confidence, degree of belief)
  • There are states a lot worse than acknowledged ignorance
slide8

Student Learning: Principles they readily understand

  • You need to know the reliability of your knowledge to use it
  • Confident errors are serious, requiring attention to explanations
  • Expressing uncertainty when you are uncertain is a good thing
  • Confidence is about understanding why things cannot be otherwise, not about personality
  • if over- or under-confident, you must calibrate through practice
  • reflection and justification are essential study habits

In evaluation surveys, a majority of students have always said they like CBM, finding it useful and fair.

They asked for it to be included in exams, and after 5 years of exam use at UCL they voted 52% : 30% to retain it (in 2005/6), though this was rejected by the conservative medical establishment.

slide9

Why test knowledge? Google makes it so easy to find !

Cheap information (& increased teamwork) require:

  • 1) Identifying things you will get wrong and not Google! (“unknown unknowns” rather than “don't knows”)
  • 2) Judging reliability and uncertainty correctly
  • .... setting a threshold for seeking help
  • .... evaluating conflicting and corroborating information

In olden times, you had to rely on your own stored information

.... you would make a best choice and “go for it”

School leavers have more sparse (though broader) stored info, but still have a “go for it” culture - to a scary extent!

.... responding with an immediate idea & not thinking much

These lessons are core things that CBM teaches

slide10

Network of Understanding

[Diagram labels: Nuggets of knowledge (each shown as "?"), Evidence, Inference, Choice, Certainty (Degree of Belief)]

CBM places greater demands on justification & stimulates connections.

Thinking about uncertainty / justification develops understanding of relationships.

To understand = to link correctly the facts that bear on an issue.

slide11

Using CBM

  • 1. With UCL LAPT software, online or from CD
  • 2. With Moodle - work in progress
  • 3. With commercial software - some progress, more needed!
  • 4. Secure exams, with OMR Cards [Speedwell]
slide12

CBM quite closely follows the ideal ignorance measure

The student loses about 3 marks per 'bit' of ignorance, up to a maximum of 3 bits.
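One way to read this slide, offered here as an assumption rather than the author's stated definition, is that the "ideal ignorance measure" is the logarithmic one: ignorance in bits is -log2 of the probability you gave to the answer that turned out to be true. The sketch compares that measure with the marks lost under the LAPT scheme (1, 2, 3 / 0, -2, -6), converted at 3 marks per bit. All names are illustrative.

```python
import math

# LAPT marks quoted in this deck: certainty -> (mark if right, mark if wrong).
LAPT_MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}
MAX_MARK = 3          # best possible mark per question
MARKS_PER_BIT = 3.0   # "about 3 marks per bit", from the slide
MAX_BITS = 3.0        # cap stated on the slide


def ignorance_bits(p_true):
    """Assumed log measure: -log2(probability given to the true answer), capped."""
    if p_true <= 0:
        return MAX_BITS
    return min(-math.log2(p_true), MAX_BITS)


def cbm_loss_bits(certainty, correct):
    """Marks lost relative to the maximum mark, converted to bits."""
    right, wrong = LAPT_MARKS[certainty]
    mark = right if correct else wrong
    return (MAX_MARK - mark) / MARKS_PER_BIT


# List the loss in bits for every possible CBM outcome.
for certainty in (1, 2, 3):
    for correct in (True, False):
        print(certainty, correct, round(cbm_loss_bits(certainty, correct), 2))
```

On this reading, a wrong answer at C=3 loses 9 marks, which is the 3-bit maximum, and a wrong answer at C=1 loses 3 marks, or 1 bit, the same ignorance as giving the true answer a 50% chance.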

slide13

What’s a good mark scheme?

[Figure: six panels, each plotting mark expected on average (y-axis) against confidence, the estimated probability correct (x-axis, 0% to 100%), with one line per reply option (such as low, mid, high and no reply). Panel titles: No negative marking; Fixed negative marking: +/-1; Hevner 1932; Hassmen & Hunt ‘94; Davies 2002; Gardner-Medwin ‘06. The percentages 35%, 55%, 67%, 85% and 50% appear as labels on the plots.]

The standard LAPT (1,2,3 / 0,-2,-6) scheme seems better than any of these.

slide14

CBM increases the reliability of exam data

'Reliability' indicates to what extent a score measures something about the student's ability, as opposed to 'luck' or chance.

slide15

CBM increases the effective test length

With increased 'Reliability' you don't need so many exam questions to get data of equal quality.

slide16

CBM increases the reliability of exam data with True/False Questions

'Reliability' indicates to what extent a score measures something about the student's ability, as opposed to 'luck' or chance.

To achieve these increases using only % correct would have required on average 58% more questions.
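The slide does not say how the 58% figure was derived. A standard way to relate reliability to test length is the Spearman-Brown prediction formula, sketched below as an assumption about the kind of calculation involved. The reliability values in the example are hypothetical and are not the exam data.

```python
def predicted_reliability(r, length_factor):
    """Spearman-Brown: reliability after multiplying test length by length_factor."""
    return length_factor * r / (1 + (length_factor - 1) * r)


def length_factor_needed(r_now, r_target):
    """Factor by which a test must be lengthened to move from r_now to r_target."""
    return r_target * (1 - r_now) / (r_now * (1 - r_target))


# Hypothetical reliabilities, for illustration only.
r_percent_correct = 0.80
r_cbm = predicted_reliability(r_percent_correct, 1.58)
extra = length_factor_needed(r_percent_correct, r_cbm) - 1
print(f"Reliability {r_cbm:.3f} would need {extra:.0%} more questions")
```

The two functions are inverses of each other, so a gain in reliability can be restated as the number of extra conventional questions that would have produced the same gain.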

slide17

Reliability and efficiency of exams (Quality of data / number of questions) are enhanced with CBM

Data from 6 medical student exams (250-300 T/F Qs each, >300 students).

slide18

Certainty-based scores predict the conventional score on different Qs better than conventional scores do.

slide19

How should one handle students with poor calibration?

Significantly overconfident in exam: 2 students (1%)

e.g. 50% correct @C=1, 59%@C=2, 73%@C=3

Significantly underconfident in exam: 41 students (14%)

e.g. 83% correct @C=1, 89%@C=2, 99%@C=3

Maybe one shouldn’t penalise such students

Adjusted confidence-based score:

Mark the set of answers at each C level as if they were entered at the C level that gives the highest score**.

mean benefit = 1.5% ± 2.1% (median 0.6%)

** (first combining sets if %correct is not in ascending order)
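The adjustment can be sketched as follows, assuming the LAPT scheme (1, 2, 3 / 0, -2, -6) and reading "combining sets" as merging a certainty level into the one below it whenever its % correct is lower. That reading, the function names and the answer counts (100 per level, using the overconfident percentages above) are all assumptions made for illustration.

```python
# LAPT marks quoted in this deck: certainty -> (mark if right, mark if wrong).
LAPT_MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}


def score_as(n_right, n_wrong, certainty):
    """Score for a set of answers if all were marked at the given certainty."""
    right, wrong = LAPT_MARKS[certainty]
    return n_right * right + n_wrong * wrong


def _accuracy(counts):
    return counts[0] / (counts[0] + counts[1])


def raw_score(sets):
    """sets: [(n_right, n_wrong) for C=1, C=2, C=3], marked as entered."""
    return sum(score_as(r, w, c) for c, (r, w) in enumerate(sets, start=1))


def adjusted_score(sets):
    """Mark each set at whichever certainty level gives it the highest score."""
    pooled = []
    for n_right, n_wrong in sets:
        if n_right + n_wrong == 0:
            continue  # no answers entered at this level
        pooled.append([n_right, n_wrong])
        # Combine with the set below while % correct is not ascending.
        while len(pooled) > 1 and _accuracy(pooled[-2]) > _accuracy(pooled[-1]):
            r, w = pooled.pop()
            pooled[-1][0] += r
            pooled[-1][1] += w
    return sum(max(score_as(r, w, c) for c in LAPT_MARKS) for r, w in pooled)


# Hypothetical counts: 100 answers per level at 50%, 59% and 73% correct.
overconfident = [(50, 50), (59, 41), (73, 27)]
print(raw_score(overconfident), adjusted_score(overconfident))
```

With these made-up counts the adjustment re-marks the C=2 answers as if entered at C=1 and the C=3 answers as if entered at C=2, which removes the penalty for overconfidence without changing which answers were right.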

slide20

Scaling CBM scores to be directly comparable with conventional scores

NCOR is based on number correct, scaled so guesses (50% prob’y correct) give on average 0%. (“% Knowledge”)
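As a minimal sketch of this scaling: the function name is illustrative, and the chance level is passed in as a fraction (0.5 for True/False, and 0.2 on the assumption that a 5-option SBA is guessed uniformly).

```python
def percent_knowledge(fraction_correct, chance):
    """Scale fraction correct so pure guessing averages 0% and all correct is 100%."""
    return (fraction_correct - chance) / (1 - chance) * 100


# True/False: 75% correct is halfway between guessing (50%) and perfect.
print(percent_knowledge(0.75, 0.5))
# 5-option SBA, assuming uniform guessing (chance = 0.2).
print(round(percent_knowledge(0.60, 0.2), 1))
```

A student scoring below the chance level gets a negative value, which is the intended signal that their answers were worse than guessing.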

slide21

Equivalence of **scaled CBM scores and conventional scores for standard setting.

True/False ♦ and SBA □(5 option) components of a formative test for 345 students were ranked by conventional scores. Then for each decile, mean CBS scores are plotted against % correct above chance (“% knowledge”).

Gardner-Medwin & Curtin 2007 REAP conference, data from Imperial College

**CBS = $\left(\frac{\text{Total} - \text{Chance}}{\text{Max} - \text{Chance}}\right)^{p} \times 100\%$, where p = 0.6 for TF, 0.48 for SBA (5 opt)
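The scaling formula can be sketched as below, treating p as an exponent. The function name, the per-question example values and the choice to reject below-chance totals are assumptions; the source does not say how totals below chance are handled.

```python
# Exponents quoted on the slide.
P_TRUE_FALSE = 0.6
P_SBA_5_OPTION = 0.48


def scaled_cbs(total, chance, maximum, p):
    """Scaled CBM score in percent: ((total - chance) / (maximum - chance)) ** p * 100."""
    if total < chance:
        # A negative base with a fractional exponent is not a real number,
        # and the source does not specify what to do in this case.
        raise ValueError("total is below the chance score")
    return ((total - chance) / (maximum - chance)) ** p * 100


# Hypothetical per-question averages: chance 0.5, maximum 3, student 1.75.
print(round(scaled_cbs(1.75, 0.5, 3.0, P_TRUE_FALSE), 1))
```

Because p is below 1, the transformation lifts mid-range CBM scores towards the values a student would typically get on the familiar "% knowledge" scale, while leaving the end points at 0% and 100%.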

slide22

Why doesn't everybody already use CBM ?

- a puzzle

  • Enthusiasm was exhausted before the age of 'online'
  • Some CBM methods were complex, opaque or non-motivating
  • Reluctance to treat certainty as integral to knowledge
  • Mistaken worries about 'personality bias'
  • Under-rating of self-assessment & practice as learning tools
  • Worry that CBM would need new questions
  • Worry that CBM would upset standard-setting
  • Inertia and vested interests
A few of the names associated with confidence testing in education
  • Andrew Ahlgren
  • Jim Bruno
  • Confucius
  • Robert Ebel
  • Jack Good
  • Kate Hevner
  • Darwin Hunt
  • Dieudonné Leclercq
  • Emir Shuford

“When you know a thing,

to hold that you know it.

And when you do not know a thing,

to allow that you do not know it.

This is knowledge.”

“Learning without thought is a waste of time.”

Confucius

London Colleagues:

  • Mike Gahan
  • David Bender
  • Nancy Curtin
slide24

We fail if we mark a lucky guess as if it were knowledge.

We fail if we mark misconceptions as no worse than ignorance.

www.ucl.ac.uk/lapt

slide26

Lessons from experience with CBM

  • Practice is needed before use in exams
  • Exams should re-use questions from an open database only very sparingly
  • Over-confidence and diffidence are both unhealthy traits that can be moderated by practice to achieve good calibration
  • With multi-option questions, students tend (at least initially) to over-estimate reliability
  • Standard setting - it is easy (but important!) to scale CBM marks to match familiar scales based on number correct.
Some Questions about CBM !
  • Are there problems using it?
  • Why doesn't my VLE support CBM?
  • Do students need practice?
  • Isn't computer marked assessment just factual?
  • Does CBM increase retention?
  • Do I need new questions?
  • What are the best Q types?
  • What about school education?
  • Is it relevant to my subject, where opinions differ?
  • Isn't it bad to encourage guessing?
  • What if my only assessments are exams?
  • How do I convince an exam board?
  • Isn't it right/wrong that really matters?
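slide29

[Figure: two bar charts of student survey responses. First chart: responses to "I think about confidence assessment ..." in the categories Every Time, Most of the time, Rarely, Never and No reply (y-axis 0% to 50%). Second chart: agreement with "I sometimes change my answer while thinking about confidence assessment" on a scale from Disagree 1 to Agree 5 (y-axis 0% to 30%). Bar heights are not recoverable from the transcript.]
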

slide30

Students really take to confidence-based marking

Principles that students seem readily to understand:

  • both under- and over- confidence are impediments to learning
  • confident errors are far worse than acknowledged ignorance and are a wake-up call (-6!) to pay attention to explanations
  • expressing uncertainty when you are uncertain is a good thing
  • thinking about the basis and reliability of answers can help tie bits of knowledge together (to form “understanding”)
  • checking an answer and rereading the question are worthwhile
  • sound confidence judgement is a valued intellectual skill in every context, and one they can improve
  • immediate feedback while still thinking about the basis of your answer is a hugely valuable study aid
slide31

The correlation, across students, between scores on one set of questions and another is higher for confidence than for simple scores.

But perhaps they are just measuring ability to handle confidence?

No. Confidence scores are better than simple scores at predicting even the conventional scores on a different set of questions. This can only be because they are a statistically more efficient measure of knowledge.
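The comparisons on this slide rest on a correlation, across students, between scores on two sets of questions. A minimal sketch of that calculation follows; the slide does not say how the two sets were chosen, so splitting on alternate questions is an assumption, and the function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_set_correlation(marks_by_student):
    """Correlate each student's total on alternate questions with their total on the rest.

    marks_by_student: one list of per-question marks per student.
    """
    set_a = [sum(marks[0::2]) for marks in marks_by_student]
    set_b = [sum(marks[1::2]) for marks in marks_by_student]
    return pearson(set_a, set_b)
```

Running `split_set_correlation` once on CBM marks and once on 0/1 conventional marks for the same answers gives the two correlations being compared. To test the prediction claim, correlate CBM totals on one set with conventional totals on the other.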

slide32

Known Knowns

... things we know that we know.

Known Unknowns

... things that we know that we don't know.

Unknown Unknowns

... things we do not know we don't know.

Donald Rumsfeld

When you know a thing,

to hold that you know it.

And when you do not know a thing,

to allow that you do not know it.

This is knowledge.

Confucius

slide33

Will it snow next weekend?

Does a (good) weather forecaster have knowledge?

- obviously yes, but expressed through a probability

Does insulin raise blood glucose levels?

Similar, even though the Q is not about a probability.

- the probability is your certainty that your answer is right

How can you measure and reward this knowledge?

- the origin of CBM >100 years ago.

The key is to have a "proper" or "motivating" reward scheme, which ensures that the person does best by expressing their true level of uncertainty
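What "proper" means can be illustrated with two textbook scoring rules that are not taken from this deck: a logarithmic rule, which is proper, and a linear rule, which is not. The sketch searches a grid of reported probabilities for the one that maximises the expected score given a true probability. All names are illustrative.

```python
import math


def log_score(q, correct):
    """Logarithmic rule: log of the probability reported for what actually happened."""
    return math.log(q if correct else 1 - q)


def linear_score(q, correct):
    """Linear rule: the probability reported for what actually happened."""
    return q if correct else 1 - q


def expected_score(rule, p_true, q_reported):
    """Average score when the answer is right with probability p_true."""
    return p_true * rule(q_reported, True) + (1 - p_true) * rule(q_reported, False)


def best_report(rule, p_true):
    """Reported probability (on a 1% grid) that maximises the expected score."""
    grid = [i / 100 for i in range(1, 100)]
    return max(grid, key=lambda q: expected_score(rule, p_true, q))


print(best_report(log_score, 0.7), best_report(linear_score, 0.7))
```

Under the logarithmic rule the best report equals the true probability, so honesty pays. Under the linear rule anyone who is more than 50% sure does best by claiming the highest certainty on offer, so the reported value carries no information about how sure they really are.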

slide34

CBM data is a more valid measure of ability

'Validity' means it measures what you want, rather than just something easily measured.

slide35

SUMMARY

Why CBM?

  • Get students to think more carefully
  • Reward recognition of uncertainty, either personal or in a group
  • Highlight misconceptions
  • Engage students more - the game element of CBM
  • Encourage criticism of Qs (intolerance of ambiguity or looseness)
  • In general: enhance self-assessment as a learning experience

NB All of the above arise with little or no practice with CBM. The following do require practice :

  • More searching diagnostic data
  • More valid and reliable assessment data

(But NB with CBM you have conventional assessment data too.)