Original publication 'Toetsvragen schrijven' 1983 Utrecht: Het Spectrum, Aula 809, Onderwijskundige Reeks voor het Hoger Onderwijs ISBN 90-274-6674-0. The 2006 text is a revised text.

Item writing

Techniques for the design of items for teacher-made tests

5. Relations between concepts

Ben Wilbrink

This database of examples has yet to be constructed. Suggestions? Mail me.


Variety of relations.

5.1 Translate, picture, construct

[Two figures from Clausen-May (2001): gif/06eclausen-may1.jpg, gif/06eclausen-may2.jpg]

"One simple but often effective use of ICT involves the exploitation of the ability to drag and drop on the screen."
"This question focuses on the pupils' understanding of the structure of the flower, rather than on the 'naming of parts'."

Tandi Clausen-May (2001). pdf


Dear Ralph,
I'm a newcomer here of a small town. I would 76. ________________
describe myself as shy and quietly. Before my classmates, 77. ________________
- - - - - - - - - -
seem to work. Can you tel me about what I should do? 85. ________________

This kind of error correction is a section of the English test, one of five or six tests for admission to Chinese universities. Liying Cheng and Luxia Qi (2006). Description and Examination of the National Matriculation English Test. Language Assessment Quarterly: An International Journal, 3, 53-70.

logical and mathematical relations


maps, graphics, statistics

Kirsten R. Butcher (2006). Learning From Text With Diagrams: Promoting Mental Model Development and Inference Generation. Journal of Educational Psychology. 98(1), 182-197. abstract

Patrick B. Kohl and Noah D. Finkelstein (2006). Effect of instructional environment on physics students’ representational skills. Physical Review Special Topics - Physics Education Research, 2, 010102. pdf

The interesting thing in the Kohl and Finkelstein article is the different ways of representing some state of the world, in this case a physics problem.

summarize, abstract, give concrete form

5.1 more literature

Alan Blackwell (WWW 2000?). Diagrammatic reasoning and system visualization. Computer Laboratory, Cambridge University. pdf

5.2 Discriminate

5.3 Classify


Which of the following presents as chronic (longer than 3 months) airspace disease on a chest radiograph?
  1. Streptococcal pneumonia
  2. Adult respiratory distress syndrome
  3. Pulmonary edema
  4. Pulmonary alveolar proteinosis

A 30-year-old-man presented with a 4-month history of dyspnea, low-grade fever, cough and fatigue. Given the following chest radiograph [photo], what is the most likely diagnosis?
  1. Adult respiratory distress syndrome
  2. Pulmonary edema
  3. Streptococcal pneumonia
  4. Pulmonary alveolar proteinosis

Collins, 2006

Collins (2006): "Vignettes do not have to be long to be effective, and should avoid verbosity, extraneous material and 'red herrings.'"

R. Bareiss, B. Porter and R. Holte (1990). Concept Learning and Heuristic Classification in Weak-Theory Domains, Artificial Intelligence Journal, v45 (nos. 1-2), pp. 229-264. postscript

5.4 Schemes, algorithms, and routines

The cognitive-psychological view of schemes and algorithms is known under the label 'scripts', after the original work by Schank and Abelson (1977).

A special case, maybe, is that of algorithms in arithmetic, for example long division (Lee, 2007).
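The routine character of traditional long division can be made explicit by writing it out step by step. The following is a minimal sketch for illustration only; it is not taken from Lee (2007), and the function name long_division and the format of the recorded steps are assumptions made here.

```python
def long_division(dividend: int, divisor: int):
    """Schoolbook long division, one dividend digit at a time.

    Returns (quotient, remainder, steps), where each step is the tuple
    (partial dividend, quotient digit, remainder carried to the next step).
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("this sketch handles dividend >= 0 and divisor > 0 only")

    remainder = 0
    quotient_digits = []
    steps = []
    for digit in str(dividend):
        # "Bring down" the next digit of the dividend.
        partial = remainder * 10 + int(digit)
        # How many times does the divisor fit into the partial dividend?
        q_digit = partial // divisor
        # Subtract and carry the remainder to the next step.
        remainder = partial - q_digit * divisor
        quotient_digits.append(str(q_digit))
        steps.append((partial, q_digit, remainder))

    quotient = int("".join(quotient_digits))
    return quotient, remainder, steps


if __name__ == "__main__":
    q, r, steps = long_division(7425, 25)
    for partial, q_digit, rem in steps:
        print(f"{partial:>5} : 25 -> digit {q_digit}, remainder {rem}")
    print("quotient", q, "remainder", r)
```

Every step is purely mechanical, which is the point: a pupil can run the routine correctly without being able to say why bringing down a digit works. An item that only asks for the quotient cannot tell these two kinds of knowledge apart.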

The problems with word problems do not properly belong in this paragraph. They are of a more general nature, touching on most of the design issues in achievement testing. In due time I will rework the material mentioned below. Much of it has now been moved to a separate page, wordproblems.htm, which collects the literature on word problems and their abuses.

There are 26 sheep and 10 goats on a ship. How old is the captain?

The example item was authored by IREM Grenoble (1980) and used in an attempt to show empirically the phenomenon of 'suspension of sense-making' by pupils solving word problems in mathematics. This approach is not new: Waterink had something to say on it, see Leen (1961, p. 131 ff). An important publication on the topic is:
Lieven Verschaffel, Brian Greer, Erik De Corte (2000). Making sense of word problems. Swets & Zeitlinger. http://beta.springerlink.com/content/h053465217792u77/fulltext.pdf [Dead link? May 2009] Review by Christoph Selter (he cites the 'age-of-the-captain' problem from the book): Educational Studies in Mathematics, 42, 211-213. For annotations to the Verschaffel et al. book see wordproblems.htm

The point is: the wording of many so-called word problems is fake, teaching children to disregard the words, and start manipulating the numbers as soon as possible. The 'how-old-is-the-captain' question is not a trick question, but children should understand - and should have been taught - that some problems simply do not have a realistic answer, and should say so.

If there is any indication that pupils tend to answer word problems without even trying to make sense of them, then these word problems should not be used in assessment, or even in instruction, because they fail the construct validity criterion. This will pose a serious problem for much of mathematics teaching. I suspect, however, that the same phenomenon is present in many other disciplines, albeit not as pronounced as in mathematics. Do not expect university level tests to be free from the phenomenon of 'suspension of sense-making.'

A. Leen (1961). De ontwikkeling van het rekenonderwijs op de lagere school in de 19e en het begin van de 20ste eeuw. Groningen; Wolters. Proefschrift Vrije Universiteit Amsterdam.

more literature on algorithms etc

Anna Sfard (1991). On the dual nature of mathematical conceptions: Reflections on processes and objects as different sides of the same coin. Educational Studies in Mathematics, 22. scan

Make a drawing of six figures. Place at least one of the persons in the foreground, likewise at least one person in the background, and at least one in an intermediate position. The relation between the figures must reflect the rules of linear perspective.
[Wilson, in: Bloom, Hastings and Madaus, 1971, p. 550]

5.5 Lawful relations

I will use physics as the paradigm science here, for example as presented by Gerald Holton (1953), a textbook directed to the general student at the undergraduate level. Within that field, I will concentrate on Newton's laws of motion. Of special importance will be the folk physics problem: how to deal with common sense ideas about physics concepts. If one accepts that these ideas pose a real instructional problem, then the design of achievement test items should recognize this as well. I would not mention the folk physics problem in this paragraph if I did not have the strong conviction that common sense ideas present a serious problem in all disciplines, not only in physics.

How about the social sciences? Do they have laws comparable to those in physics? I will have to tackle that one. A big problem here is the concept of measurement, and how that is handled by psychologists and other social scientists; I will follow Joel Michell (1999) here.

Gerald Holton (1953). Introduction to concepts and theories in physical science. Cambridge, Mass.: Addison-Wesley.

Joel Michell (1999). Measurement in psychology. A critical history of a methodological concept. Cambridge University Press.


  1. Every body continues in its state of rest, or of uniform motion in a right line, unless it is compelled to change that state by forces impressed upon it.
  2. The change of motion is proportional to the motive force impressed; and is made in the direction of the right line in which that force is impressed.
  3. To every action there is always opposed an equal reaction: or, the mutual actions of two bodies upon each other are always equal, and directed to contrary parts.

Isaac Newton (1686/1729/1934). Principia: Sir Isaac Newton's Mathematical Principles of Natural Philosophy & His System of the World. p. 13.

Newton's first law of motion looks quite simple and is one of the most important physical laws ever, if not the most important. It is highly abstract even though it does not look so, and it goes against the grain of the common sense conception of motion as well as of Aristotle's ideas about motion (Newton does not say so, though). Here the item designer has complicated content at hand that is extremely difficult to handle adequately. Most of the time it isn't handled adequately.

Obviously, it is no use at all to ask for the definition, be it in Latin, the first English translation's version of 1729, the Dover version, or any formulation equivalent in meaning (whatever that may be). Is it useful to ask for mathematical manipulations, be they geometrical (Newton's preferred method) or algebraic? There is some controversy on this question. On the one hand, doing the mathematics evidently is not equal to doing physics, but then what is the meaning of 'doing physics' here? In educational courses more often than not it is the mathematics that counts, at the risk of passing students who are clever enough to do the mathematics while not understanding the physics itself. The item designer will have to take a stand here: it should be the understanding of the physical principles that counts, not mathematical wizardry. That, simply, is a question of validity.

In due course I will collect lots of sources on the points made here, including validity, and the methods available to go for the 'real' understanding of the physics involved, contrasted against (the same person's earlier) common sense understanding (e.g. Halloun and Hestenes, 1985, pdf) or the superficial understanding evidenced by the correct manipulation of givens, formulas and the calculus (e.g. Tuminaro, 2004). Dijksterhuis (1969) is an important source from philosophy of science; for Newton's physics he uses the Dutch publication by Beth (1932), but Mach's (1893/1960) work (and a whole library following this up) will do as well.

To this day every student of elementary physics has to struggle with the same errors and misconceptions which then had to be overcome, and on a reduced scale, in the teaching of this branch of knowledge in schools, history repeats itself every year. The reason is obvious: Aristotle merely formulated the most commonplace experiences in the matter of motion as universal scientific propositions, whereas classical mechanics, with its principle of inertia and its proportionality of force and acceleration, makes assertions which not only are never confirmed by everyday experience, but whose direct experimental verification is impossible .... (p. 30).

Champagne, Gunstone and Klopfer (1985, p. 62), citing from E. J. Dijksterhuis (1951/1969). The mechanization of the world picture. London: Oxford University Press.

The Physics Classroom and Mathsoft Engineering & Education (2004). A high school physics tutorial: Newton's Laws. html

J. H. Mandleberg (1952). Physical chemistry made plain. An aid for intermediate students and others. London: Cleaver-Hume Press.

David Hammer (2000). Student resources for learning introductory physics. American Journal of Physics, Physics Education Research Supplement, 68, 52-59. html

David Hammer and Andrew Elby (2003). Tapping epistemological resources for learning physics. Journal of the Learning Sciences, 12, 53-90. paper pdf

Antti Savinainen (2004). High School Students' Conceptual Coherence of Qualitative Knowledge in the Case of the Force Concept. Dissertation, the Faculty of Science of the University of Joensuu. pdf

David Hestenes, Malcolm Wells, and Gregg Swackhamer (1992). Force Concept Inventory. The Physics Teacher, Vol. 30, 141-158. pdf. This article describes the Inventory, but it does not show specific items from the instrument.

Zdeslav Hrepic (2004). Development of a real-time assessment of students' mental models of sound propagation. Dissertation Kansas State University. pdf

Alicia R. Allbaugh (2003). The problem-context dependence of students' application of Newton's Second Law. Dissertation Kansas State University. pdf


A boy is walking away from a lamppost. How fast is his shadow moving? A ladder is resting against a wall. If the base is moved out from the wall, how fast is the top of the ladder moving down the wall?

Such 'related rates problems' are old chestnuts of introductory calculus, used both to show the derivative as a rate of change and to illustrate implicit differentiation. Now that some 'reform' texts [Callahan and Hoffman (1995), Hughes-Hallett et. al. (1994)] have broken the tradition of devoting a section to related rates, it is of interest to note that these problems originated in calculus reform movements of the 19th century.

Bill Austin, Don Barry and David Berman (2000). The Lengthening Shadow: The Story of Related Rates. Mathematics Magazine, 73, 3-12. pdf

I think arithmetic, or the calculus, is in some ways an ideal discipline for the researcher looking for item design principles. Therefore the Austin, Barry and Berman article is intriguing. I will have to look up more of this kind of research, not only historical studies, but instructional science studies as well.

The Fluxionary or Differential and Integral Calculus has within these few years become almost entirely a science of symbols and mere algebraic formulae, with scarcely any illustration or practical application. Clothed as it is in a transcendental dress, the ordinary student is afraid to approach it; and even many of those whose resources allow them to repair to the Universities do not appear to derive all the advantages which might be expected from the study of this interesting branch of mathematical science.

William Ritchie (1836). Principles of the Differential and Integral Calculus, as cited in Austin, Barry and Berman (2000) pdf.

And that is the point, isn't it: natural science isn't a science of symbols, and even the calculus itself shouldn't be one.

5. A stone dropped into still water produces a series of continually enlarging concentric circles; it is required to find the rate per second at which the area of one of them is enlarging, when its diameter is 12 inches, supposing the wave to be then receding from the centre at the rate of 3 inches per second

6. One end of a ball of thread, is fastened to the top of a pole, 35 feet high; a person, carrying the ball, starts from the bottom, at the rate of 4 miles per hour, allowing the thread to unwind as he advances; at what rate is it unwinding, when the person is passing a point, 40 feet distant from the bottom of the pole; the height of the ball being 5 feet? . . .

12. A ladder 20 feet long reclines against a wall, the bottom of the ladder being 8 feet distant from the bottom of the wall; when in this position, a man begins to pull the lower extremity along the ground, at the rate of 2 feet per second; at what rate does the other extremity begin to descend along the face of the wall? . . .

13. A man whose height is 6 feet, walks from under a lamp post, at the rate of 3 miles per hour, at what rate is the extremity of his shadow travelling, supposing the height of the light to be 10 feet above the ground?

Problems from James Connell (1844). The Elements of the Differential and Integral Calculus. London: Longman, Brown, Green, and Longman, as cited in Austin, Barry and Berman (2000) pdf.

As Austin, Barry and Berman point out, these problems illustrate some calculus concepts, and if they do not resemble Ritchie's problems, they "are new and original and many remain in our textbook."
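The four Connell problems quoted above share one structure: a geometric relation between two quantities, differentiated with respect to time. The sketch below works them out under the usual textbook idealizations (a taut straight thread, a rigid ladder on level ground, a point light source). The function names and the finite-difference check are assumptions added here for illustration; they are not part of Connell's text or of the Austin, Barry and Berman article.

```python
import math


def central_difference(f, t, h=1e-6):
    """Numerical derivative of f at t, used only to check the analytic rates."""
    return (f(t + h) - f(t - h)) / (2 * h)


def wave_area_rate(diameter=12.0, radial_speed=3.0):
    """Problem 5: A = pi r^2, so dA/dt = 2 pi r dr/dt (square inches per second)."""
    r = diameter / 2
    return 2 * math.pi * r * radial_speed


def thread_rate(distance=40.0, walking_speed=4.0, pole=35.0, ball=5.0):
    """Problem 6: L^2 = (pole - ball)^2 + x^2, so dL/dt = (x / L) dx/dt (miles per hour)."""
    length = math.hypot(pole - ball, distance)
    return distance / length * walking_speed


def ladder_top_rate(length=20.0, base=8.0, base_speed=2.0):
    """Problem 12: x^2 + y^2 = length^2, so dy/dt = -(x / y) dx/dt (feet per second)."""
    height = math.sqrt(length**2 - base**2)
    return -base / height * base_speed


def shadow_tip_rate(man=6.0, lamp=10.0, walking_speed=3.0):
    """Problem 13: similar triangles give s = lamp * x / (lamp - man) for the shadow tip."""
    return lamp / (lamp - man) * walking_speed


if __name__ == "__main__":
    print("5.  area grows at", wave_area_rate(), "sq in/s")
    print("6.  thread unwinds at", thread_rate(), "mph")
    print("12. ladder top moves at", ladder_top_rate(), "ft/s")
    print("13. shadow tip moves at", shadow_tip_rate(), "mph")

    # Check problem 12 numerically: height of the ladder top as a function of time.
    height_at = lambda t: math.sqrt(20.0**2 - (8.0 + 2.0 * t) ** 2)
    print("12. numerical check:", central_difference(height_at, 0.0))
```

Note that the code contains only the algebra. What each problem really asks of the student, namely setting up the relation between the quantities from the described situation, happens before the first line is written.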

B. Sherin (2001). How students understand physics equations. Cognition and Instruction, 19, 479-541. http://www.sesp.northwestern.edu/publications/141061094744ad835da3692.pdf [Dead link? May 2009]

5.6 An historical perspective


A swallow once invited a snail to dinner. He lived one league from the place, and the snail travelled at the rate of one inch a day. How long would it be before he dined? An old man met a child. "Hello," he said, "I hope that you will live as long as you have lived already, and the same period again, and then three times as much as those two periods together. Then, if God lets you live one year more, you will be a hundred years old." How old was he?

From a pedagogic work on arithmetic, by Bede (672-735), as cited in Frank Davies (1973). Teaching reading in early England (p. 90). Pitman.


He [Whewell] had a condescending attitude toward the teaching of arithmetic in schools and promoted the use of rules without understanding.

De Morgan was a celebrated historian of mathematics and thought deeply about how mathematics should be taught. He admired the French higher education system where students were made to reflect on the subject, instead of merely copying out exercise after exercise, as was the case at Cambridge. The French system produced research mathematicians, and Cambridge by and large did not.

... Wharton found De Morgan's textbooks had too much discussion of underlying principles. Wharton differed from De Morgan in another fundamental way: Wharton actively supported teaching with examples, whereas De Morgan preferred the approach of French textbooks with no examples in at all. Despite these differences, there was a consensus among the participants that teaching arithmetic using unexplained rules was quite unacceptable and needed to be stopped.

In rejecting the rote teaching of arithmetic, Wharton, Wilson, and Parker found themselves in opposition to Whewell's system. Their convictions would eventually lead to entrance examinations to Oxford and Cambridge intended to ensure an even standard of mathematics education in the schools.

Cited from Delve, 2003, p. 172


The Limits of a quantity which admits of change in its magnitude, are those magnitudes between which all the values that it can have during all its changes are comprised; beyond which it can never pass; and from which it may be made to differ by quantities less than any that can be assigned in finite terms.

This is the first sentence in John Hind's (1831) The principles of differential calculus with its application to curves and curve structures designed for the use of students in the university.

If the mathematical sciences were cultivated wholly for their practical utility, as it is called, meaning their application to the formation and management of all the mechanism by which the arts of life are advanced, it would not be necessary to consider any magnitude as having existence at all, unless it were sufficiently great to be either useful or noxious to some object connected with some application in question.

This is the first sentence in Augustus De Morgan (1837) The Differential and Integral Calculus. Containing differentiation, integration, development, series, differential equations, differences, summation, equations of differences, calculus of variations, definite integrals - with applications to algebra, plane geometry, solid geometry, and mechanics. (reprint: Elibron)

John Hind does not think it necessary to give any motivation at all, an attitude that I personally still experienced in 1973: my professor simply refused to provide a motivation for the mathematical analysis he was going to teach for one year, even though his students explicitly asked him for one. Hind probably would have answered, if directly asked, that mathematics is meant to train the mind, the common attitude at the time. Augustus De Morgan, on the contrary, considers motivation and explanation to be of the essence. This is a crucial moment in the history of mathematics, at Cambridge University that is.

Smith favoured natural philosophy over pure mathematics

On the one hand, the Tripos was a problem-solving marathon second to none. Its multitude of papers contained more questions than could be solved in the allotted time and success depended more on having the mechanical ability to solve problems as rapidly as possible than on having a clear understanding of the theory. On the other hand, the Smith's Prize examination consisted of only a few papers and was generally aimed at soliciting a more thoughtful or philosophical approach to the questions asked. (...) By fostering an interest in the study of applied mathematics, the competition [Smith's Prize exam] played a significant part in promoting the remarkable achievements in mathematical physics that characterized Cambridge mathematics during the second half of the nineteenth century. (p. 272)

June Barrow-Green (1999). 'A corrective to the spirit of too exclusively pure mathematics': Robert Smith (1689-1768) and his prizes at Cambridge University. Annals of Science, 56, 271-316.

This seems to be a perennial problem. It may result, and often does, in bad design of test questions: worded problems that really function as mathematical problems only, while the intention of the item designer might have been that pupils first build a model of the problem situation, and only then translate the model into formulas that will allow a specific solution.

5.7 Literature

H. J. E. Beth (1932). Newton's 'Principia.' deel I, II. Groningen: Noordhoff. [in Dutch]

Benjamin S. Bloom, J. Thomas Hastings and George F. Madaus (Eds) (1971). Handbook on formative and summative evaluation of student learning. London: McGraw-Hill.

Audrey B. Champagne, Richard F. Gunstone and Leopold E. Klopfer (1985). Instructional consequences of students' knowledge about physical phenomena. In Leo H. T. West and A. Leon Pines (Eds): Cognitive structure and conceptual change (pp. 61-90). Academic Press.

Tandi Clausen-May (2001). An approach to test development. nfer

Jannette Collins (2006). Education techniques for lifelong learning: writing multiple-choice questions for continuing medical education activities and self-assessment modules. Radiographics, Mar-Apr 26(2), 543-51. http://www.arrs.org/StaticContent/pdf/ajr/pdf.cfm?theFile=ajrWritingMultipleChoiceHandout.pdf [Dead link? May 2009]

Janet Delve (2003). The College of Preceptors and the Educational Times: Changes for British mathematics education in the mid-nineteenth century. Historia Mathematica 30, 140-172. pdf

E. J. Dijksterhuis (1951/1969). The mechanization of the world picture. London: Oxford University Press.

Ibrahim Abou Halloun and David Hestenes (1985a). The initial knowledge state of college physics students. Am. J. Phys. 53 (11) 1043-1048. pdf. And
Ibrahim Abou Halloun and David Hestenes (1985b). Common sense concepts about motion. Am. J. Phys. 53 (11), 1056-1065. pdf.

Ji-Eun Lee (2007). Making sense of the traditional long division. Journal of Mathematical Behavior 26, 48–59.

Ernst Mach (1893/1960). The science of mechanics: A critical and historical account of its development. La Salle: Open Court. Translated by Thomas J. McCormack.

Roger C. Schank and Robert P. Abelson (1977). Scripts, plans, goals and understanding : an inquiry into human knowledge structures. Erlbaum.

Jonathan Tuminaro (2004). A cognitive framework for analyzing and describing introductory students' use and understanding of mathematics in physics. Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park. pdf

more literature

Roman Frigg and Stephan Hartmann (2006). Models in science. In Stanford Encyclopedia of Philosophy. html

Stephan Hartmann (2005) The World as a Process: Simulations in the Natural and Social Sciences. pdf

Carl G. Hempel (1952/1972). Fundamentals of concept formation in empirical science. London: The University Of Chicago Press, 1972.

Brenda R. J. Jansen and Han L. J. van der Maas (2002). The Development of Children's Rule Use on the Balance Scale Task. Journal of Experimental Child Psychology 81, 383-416. pdf

Richard Lesh, Mark Hoover and Anthony E. Kelly (nd). Equity, Assessment, and Thinking Mathematically: Principles for the Design of Model-Eliciting Activities. Rational Number project. html

Levy, S.T., Mioduser, D., & Talis, V. (2006 in preparation). Episodes to Scripts to Rules: Concrete-abstractions in kindergarten children's construction of robotic control rules pdf

J. E. Mezzich and H. Solomon (1980). Taxonomy and behavioral science. London: Academic Press.

M. Macdonald-Ross (1979). Scientific diagrams and the generation of plausible hypotheses: an essay of the history of ideas. Instructional Science, 8, 233-234.

Margaret Morrison (). Models as representational structures. Paper presented in: Nancy Cartwright's Philosophy of Science. An International Workshop, December 16-17, 2002 pdf

E. A. Murphy (1976). The logic of medicine. London: Johns Hopkins University Press.

A. P. van Leeuwen (1941). 2 x 2 = 5. Merkwaardige uitkomsten van cijfer- en getallencombinaties. Rekenkundige sels, vreemde vraagstukken, goocheltoeren met cjfers, toovervierkanten, paradoxen enz. Amsterdam: Becht. Aad van Leeuwen

William E. Becker and William H. Greene (2001). Teaching Statistics and Econometrics to Undergraduates. Journal of Economic Perspectives, 15, 169-182. pdf




proefjes.nl (picture, a cartoon of Galileo?, on proefjes.nl)



The Diagrammatic Reasoning Site http://www.cs.hartford.edu/~anderson/

Galileo Galilei's Notes on Motion, integral text in Latin as well as in English, plus electronic representation of the manuscript. Biblioteca Nazionale Centrale, Florence - Istituto e Museo di Storia della Scienza, Florence - Max Planck Institute for the History of Science, Berlin. site.

American Translators Association. Translation: Getting it right. pdf

Robert Rynasiewicz (www 2004). Newton's Views on Space, Time, and Motion. Stanford Encyclopedia of Philosophy. html

May 23, 2007 \ contact ben at at at benwilbrink.nl     http://www.benwilbrink.nl/projecten/06examples5.htm