Original publication 'Toetsvragen schrijven' 1983 Utrecht: Het Spectrum, Aula 809, Onderwijskundige Reeks voor het Hoger Onderwijs ISBN 90-274-6674-0. The 2006 text is a revised text.

Item writing

Techniques for the design of items for teacher-made tests

6. Writing items on text

Ben Wilbrink

This database of examples has yet to be constructed. Suggestions? Mail me.


Figure 1. What about text? The scheme shows three things the pupil can do with text, as well as compose it. The 'personal theory' is the intellectual baggage of the student, her mental model of (this little bit of) the world.

The base level of questions about text is to ask for reproduction or recognition. The next higher level is analyzing separate parts of the text in relation to each other. Having done so, the pupil is fully prepared to combine other knowledge of the world with the information given in the text, and draw conclusions going outside the boundaries of the text itself. In all stages, but especially in the last one, the personal theory about the world - the pupil's mental model - plays a significant role. The scheme in figure 1 summarizes cognitive theory; see for example Kuhn (2005). Acknowledging the role of personal theory introduces the meta-cognitive level: for the student it is important to know what she knows, what that information is worth - how sure it is - and how it combines with the information in the new text - or how it fails to do so - and what that implies for either the mental model or the perceived value of the new information.

Thinking in terms of meta-cognition and mental models may not come easy to the teacher or professor, yet they have always done so: testing students for their knowledge and insights is a meta-cognitive exercise itself. Let this be a reassurance that reading the text of this chapter will be worthwhile.

For the time being, examples illustrating the interplay between text, operations on the text, and mental models are available in Kuhn (2005). In time, more examples will be collected from the literature, or constructed.

"Items that attempt to assess the test taker's ability to derive meaning from a passage and to make inferences are often limited to questions such as the following: What is the main idea in this story? What is this story mostly about? What is the best title for this story? How did the character probably feel? These are not bad questions. However, a close inspection often reveals that such questions can be answered using information that is explicitly stated in the text."

Robert L. Linn (1988). Dimensions of thinking: Implications for testing. CSE Technical Report 282. http://www.cresst.org/Reports/r282.pdf [link broken? 1-2009]. Published in Beau Fly Jones and Lorna Idol (Eds) (1991). Dimensions of thinking and cognitive instruction. Erlbaum.

It is highly seductive to write your test items in such a way that in fact they only ask for information given literally in the course text. This is of course highly problematic, for students will interpret this as an admonition to study the text only this superficially.

6.1 Participation: have you read it?

6.2 Themes and main issues

Carefully explain what Nozick and Lewis would say about the following skeptical argument:

P1. I don't know that I am not a brain in a vat.
P2. If I don't know that I am not a brain in a vat, then I don't know that I have hands.
C. I don't know that I have hands.

From the MIT OpenCourseWare 'Theory of knowledge' fall 2003 exam (html). Nozick and Lewis are two of the authors studied.

What is the Gettier problem?

The same exam. It is a classical case in theory of knowledge.

6.3 Analysis


Figure 1. Film making in Sammy's Science House (Apple): analyse the time sequence, drag the images. Difficulty levels: 3 or 4 images. This one may prove tricky for adults as well. Images instead of text, showing the chapter's title to be unnecessarily restrictive.

What more can one wish for, for kids three or four years of age? A bag full of films, immediate feedback on the results of your analysis, a 'real' film being shown as a reward for the correct sequencing. This surely is a prototype of a good test of analytical thinking. Or a good course for analytical thinking. What comes first, and what later.

Analysis of text has been described beautifully and very much to the point by Deanna Kuhn in the first part of chapter 7, 'The skills of argument', of her 2005 book. This part of chapter 7 is original with the book; it is not based on research published elsewhere. I will choose this exposition as my frame of reference for the paragraph on analysis of text. The particular case discussed by Kuhn is an admissions test used by, amongst others, The City University of New York (CUNY), testing for argumentative skill. Regrettably, the test is used as an intelligence test, not as one testing for achievements in the skill of reasoning. That is no fault of the test itself; it is an omission in the curriculum followed by the students who have to sit this test.

p. 148: "I began this chapter by looking at a test of argumentative thinking that is employed as a "high-stakes" admission gate to a college degree. Few would probably quarrel with the assertion that students aspiring to a college degree should be able to meet the challenges this test poses. Yet a significant number of students seeking a higher degree fail the test, at some institutions at a rate exceeding 50 percent. Should these students take their failure as proof that they are not suited for higher education? Many no doubt have. But at least a few have reacted with dismay and questioned the message. "Which courses are these skills taught in?" they want to know. "Why haven't we learned them? What can I do now to learn what I need to?" These students have a legitimate complaint, certainly. If these skills are so highly valued, why are courses not available that teach them?"

Kuhn meticulously treats one concrete analysis example from this test, showing the complexity of what it is that is asked from the students sitting this test. And then this is only a test of reasoning with information given to the student, not one of reasoning with course material studied. Kuhn's work demonstrates that analytical questions on text can be very complex indeed, even in cases where content itself is kept perfectly transparent. The citation illustrates that there might be serious problems in the relation between analytical questions asked of the student in the exam, and the quality and content of the instruction or course preparing the student for the very same examination. Be careful not to err here.


Figure from Kuhn, D. (2001). How Do People Know? Psychological Science [the image lives on Kuhn's site]

6.4 Inference

James L. Wardrop, Thomas H. Anderson, Wells Hively, C. Nicholas Hastings, Richard I. Anderson, Keith E. Muller (1982). A framework for analyzing the inference structure of educational achievement tests. Journal of Educational Measurement, 19, 1-18. pdf


In her book on the Vietnam War, Frances Fitzgerald argues that the U.S. should not have sent soldiers to fight in Vietnam because America could not hope to win the war. From her view, it was useless to support the corrupt South Vietnamese government against guerrillas who won the respect, loyalty, and support of the peasantry. But Neil Sheehan argues that America could have won the war if it had insisted on replacing incompetent South Vietnamese government officials. Success, according to Sheehan, required leaders who would focus on rural development policies, raise living standards, protect peasants, and reduce the ability of the Communists to recruit soldiers from among the peasants. Assume that Fitzgerald's and Sheehan's views are accurately represented here. Whose arguments are stronger?
  a. Fitzgerald's, because the U.S. should not risk the lives of American soldiers to fight a foreign war.
  b. Fitzgerald's, because the support of the South Vietnamese peasants was essential if American soldiers were to be effective.
  c. Sheehan's, to the extent that the U.S. had the power to install effective leaders, implement rural development, and protect the peasantry.
  d. Sheehan's, because World War I and World War II proved that American military intervention can be successful.
  e. Both (b) and (c). If it were possible to accomplish what Sheehan suggests, the Communists would have lost their base of support among the peasantry. But if Sheehan's suggestions were not feasible, the U.S. should have avoided military intervention.

Yeh (2002, p. 15) pdf. See the article for an explanation of this item and its theoretical background (Deanna Kuhn's work, among others).

6.5 Composition

6.6 The naive or novice learner

There are many situations in life as well as in institutions where complex text has to be 'learned' somehow, while the learner does not have the necessary background information to fully interpret and understand the textual material. Think of the reader of Scientific American articles (the personal domain), or employees being trained for complex tasks (the institutional domain).
The case material I will use to develop adequate questioning techniques will be two articles on use, effects, and costs of medicines. Bart Meijer van Putten (8 July 2006). De pijn blijft. NRC Handelsblad, p. 41. Broer Scholtens (8 July 2006). Zoek het geheim van de dure pillen. De Volkskrant, Kennis p. 5. This choice is motivated by the availability of these articles, the highly technical character of the information contained in them, the audience that is highly involved while not trained in medicine, pharmacy or research methodology, and - last but not least - the availability of lots of examples of well designed test items on medical course content, but directed at the well educated student or assistant doctors (e.g. Case and Swanson 2001 http://www.nbme.org/PDF/2001iwg.pdf [dead link? 1-2009]).


In all cases I define advance organizers as introductory material at a higher level of abstraction, generality, and inclusiveness than the learning passage itself (...).
Further, advance organizers also differ from overviews in being relatable to presumed ideational content in the learner's current cognitive structure (...).
Expository organizers are used when the new learning material is completely unfamiliar, as determined by pretests, and attempts merely to provide inclusive subsumers that are both related to existing ideas in cognitive structure and to the more detailed material in the learning passage. (...).

David P. Ausubel (1978). In defense of advance organizers: A reply to critics. Review of Educational Research, 48, 251-257. jstor

It will be of interest to trace how exactly Ausubel constructs these 'expository organizers,' because it might be one way—or the way—for trainee/novice learners to give meaning to the material they eventually will have to master.


In courses other than computer science, students regularly work with texts and other artifacts far larger and more sophisticated than they could produce themselves. A literature or history course is a good example. The same occurs in sociology courses. These artifacts teach the student what is best in the field and should be emulated.

Give the students access to large programs and designs well before they have the ability to produce them. These artifacts can be used as the basis of exercises. Students can make small modifications to large programs and they can extend them in simple ways early on.

Joseph Bergin (online July 2006). Pedagogical patterns. html. As you will have understood, the work of Bergin is in the field of teaching object-oriented programming (OOP).

The case of the naive learner may stand for the more general class of special pedagogical situations, or pedagogical patterns as Joseph Bergin (html) calls them. Bergin, in the box above, does not call the reader 'naive.' Instead he calls the text 'more sophisticated' than the reader, in this way generalizing the topic of this section 6.6. The patterns here are patterns of instructional design in the field of object-oriented programming, going by the fancy names of Fixer Upper, Spiral, Mistake, Early Bird, Toy Box, Tool Box, Lay of the Land, Test Tube, Larger than Life, Fill in the Blanks. 'Larger than Life,' see the above box, takes the naive learner's learning from text as the problem, and proposes to make the problem the solution. Doing so implies questions - test items also - to be aligned with the instructional design, nothing less, nothing more. Beautiful. The other patterns share this one characteristic: they deviate - each in its own way - from the traditional way of sequencing curricular material. Computer programming is a discipline that lends itself easily to experimenting with these patterns, because the computer environment itself offers the ultimate testing environment. Nevertheless, the patterns are useful in other disciplines also, for example in medicine starting on day one with realistic cases (Early Bird pattern). Bergin is one of the people involved in the Pedagogical Patterns Project, all in the field of object-oriented programming etcetera (site).

6.6 more literature

Scott O. Lilienfeld (2002). When Worlds Collide: Social Science, Politics, and the Rind et al. (1998) Child Sexual Abuse Meta-Analysis [Controversy And Scholarly Publishing]. The American Psychologist, 57, 176-188. html

John M. Carroll and Judith Reitman Olson (1987). Mental models in human-computer interaction: Research issues about what the user of software knows. Washington, DC: National Academy Press. http://darwin.nap.edu/books/POD266/html/R1.html [dead link? 1-2009]

Patricia A. Alexander and Judith E. Judy (1988). The interaction of domain-specific and strategic knowledge in academic performance. Review of Educational Research, 58, 375-404.

6.7 Literature

Many items are as yet on my 'to do' list: they are mentioned here, but not used in the above text yet.

Deanna Kuhn (2005). Education for thinking. Harvard University Press. excerpt

Stuart S. Yeh (2002). Tests Worth Teaching To: Constructing State-Mandated Tests That Emphasize Critical Thinking. Educational Researcher, 30, # 9, 12-17. pdf

more literature


Jos Kessels, Ad van der Kam and Jan Tollenaar (1989). De zaak Arlet; inleiding in de kennistheorie + Handleiding voor de docent. Meppel: Boom.

M. Gall, B. Dunning and R. Weathersby (1971). Minikursus Denkvragen stellen. Groningen: Wolters-Noordhoff. Dutch adaptation: P. L. v. d. Plas and W. J. M. de Roos, 1977.

W. R. Borg, M. L. Kelley and P. Langer (1970). Minikursus effektief vragen stellen. Dutch adaptation: J. Heeringa and S. A. M. Veenman, 1977. Wolters-Noordhoff.


Ron Oostdam and Gert Rijlaarsdam (). Towards strategic language learning. Amsterdam University Press (USA: The University of Chicago Press).

Huub van den Bergh (1990). On the construct validity of multiple-choice items for reading comprehension. Applied Psychological Measurement, 14, 1-14.

Benjamin S. Bloom et aliis (Ed.) (1956). Taxonomy of educational objectives. The classification of educational goals. Book 1 Cognitive domain. David McKay.

Joyce Chapman (2005). The Development of the Assessment of Thinking Skills. University of Cambridge Local Examinations Syndicate. http://www.cambridgeassessment.org.uk/research/confproceedingsetc/publication.2005-10-13.7538460012/file/ [dead link? 1-2009]

Robert Cummins: Cross-domain inference and problem embedding. In Robert Cummins and John Pollock (Eds) (1991). Philosophy and AI. (p. 23-38) MIT.

Entwistle, N. (1995). Frameworks for understanding as experienced in essay writing and in preparing for examinations. Educational Psychologist, 30, 47-54. abstract

Frank Friedman and John P. Rickards (1981). Effect of Level, Review, and Sequence of Inserted Questions on Text Processing. Journal of Educational Psychology, 73, 427-436.

Donald Laming (2003). Marking university examinations: some lessons from psychophysics. Psychology Learning and Teaching, 3(2), 89-96. pdf

Elliot G. Mishler (1986). Research interviewing. Context and narrative. Cambridge, Massachusetts: Cambridge University Press.

Don Nix (1985). Notes on the efficacy of questioning. In Arthur C. Graesser and John B. Black (Eds) (1985). The psychology of questions. Hillsdale, New Jersey: Lawrence Erlbaum.

J. P. Rickards (1979). Adjunct questions in text: a critical review of methods and processes. Review of Educational Research, 49, 181-196.

Claire E. Weinstein, Ernest T. Goetz and Patricia A. Alexander (Eds) (1988). Learning and study strategies. Issues in assessment, instruction, and evaluation. London: Academic Press.

Hsin-Kai Wu and Chou-En Hsieh (in press as of June 2006). Developing Sixth Graders' Inquiry Skills to Construct Explanations in Inquiry-based Learning Environments. International Journal of Science Education. pdf

Corinne Zimmerman (2005). The Development of Scientific Reasoning Skills: What Psychologists Contribute to an Understanding of Elementary Science Learning. Final Draft of a Report to the National Research Council Committee on Science Learning Kindergarten through Eighth Grade. pdf

September 8, 2007 \contact ben at at at benwilbrink.nl     http://www.benwilbrink.nl/projecten/06examples6.htm