Original publication 'Toetsvragen schrijven' 1983 Utrecht: Het Spectrum, Aula 809, Onderwijskundige Reeks voor het Hoger Onderwijs ISBN 90-274-6674-0. The 2006 text is a revised text.

Item writing

Techniques for the design of items for teacher-made tests

1. Introduction

Examples

Ben Wilbrink

Contents examples pages

1 Introduction

1.1 Item design: art or skill?
1.2 First principles
1.3 Summary of content
1.4 An historical perspective

1.5 Literature

2 Item types, transparency, item forms and levels of abstraction

2.1 Open-ended questions
2.2 Multiple-choice (MC) questions
2.3 Essay type questions
2.4 Transparency
2.5 Item forms for many uses
2.6 Valid questioning
2.7 An historical perspective

2.8 Literature

3 Course content inventory

3.1 (Indirect) observable terms
3.2 Abstract terms and constructs
3.3 Theoretical terms
3.4 Relational networks of terms
3.5 Variants of 'definitions'
3.6 Literature

4 Questions on individual terms

4.1 Translation
4.2 Definition
4.3 Providing examples
4.4 Recognizing/naming examples
4.5 Recognizing/naming formal terms
4.6 Descriptions
4.7 Literature

5 Questions on relations between terms

5.1 Translating and picturing
5.2 Discriminating
5.3 Classifying
5.4 Algorithms, routines
5.5 Lawful relations
5.6 An historical perspective
5.7 Literature

6 Questions of text

6.1 Participation control
6.2 Themes and highlights
6.3 Analysis
6.4 Inference
6.5 Composition

6.6 Literature

7 Posing problems

7.1 About problems
7.2 Taking inventory
7.3 Heuristics
7.4 Literature

8 Quality control

8.1 Rules in examining
8.2 Check these points
8.3 Independent assessment of item quality
8.4 Checklists
8.5 An historical perspective
8.6 Literature

... an experience, a very humble experience, is capable of generating and carrying any amount of theory (or intellectual content), but a theory apart from an experience cannot be definitely grasped even as a theory.

John Dewey, in: Democracy and education.

The history of human progress is the story of the transformation of acts which, like the interactions of inanimate things, take place unknowingly to actions qualified by understanding of what they are about; from actions controlled by external conditions to actions having guidance through their intent: - their insight into their own consequences. Instruction, information, knowledge, is the only way in which this property of intelligence comes to qualify acts originally blind. (Quest for certainty, 1929, p. 245)

John Dewey citing himself in his (1939, p. 521).

Here is the point: once you have learned how to ask questions - relevant and appropriate and substantial questions - you have learned how to learn and no one can keep you from learning whatever you want or need to know.

Neil Postman and Charles Weingartner (1969, p. 34).

This database of examples has yet to be constructed. Suggestions? Mail me.

How many parts of speech are there?

Eight.

What?

Noun, pronoun, verb, adverb, participle, conjunction, preposition, interjection.

What is a noun?

A part of speech which signifies with the case a person or thing specifically or generally.

How many attributes has a noun?

Six.

What?

Quality, comparison, gender, number, form, case.

Give the inflection of the active verb.

Lego, an active verb in the indicative mood, a word of present time, singular number, simple form, first person, third short conjugation, which will be inflected thus: lego, legis, legit, and plural, legimus, legitis, legunt; in the same mode in the preterite imperfect tense, legebam, legebas, legebat, and plural, legebamus, legebatis, legebant.

James Bowen's (vol. 1 p. 211) transcription is from the Ars minor on teaching Grammar, written by Donatus. This book "was organized into a series of questions and answers designed to teach the fundamentals of Latin grammar by rote."
James Bowen (1972). A history of Western education. Volume one. The ancient world: Orient and Mediterranean. Methuen.
And the West? The 'Donaet' was still in use in Western European schools a thousand years later.

Item writing is not a science or science-based technology. Yet the European West does have more than a thousand years of experience in asking questions and assessing the answers given.
The question-and-answer instruction above may seem amazing in its dead-serious insistence that answers be literally correct. Nevertheless, something uncannily resembling this kind of literal reproduction was still rife in my own Latin school days.

"This study examines and evaluates two sources of evidence bearing on the validity of 31 MC item-writing guidelines intended for teachers and others who write test items to measure student learning. These two sources are measurement textbooks and research."

Thomas Haladyna, Steven M. Downing, and Michael C. Rodriguez (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15, 309-334. http://depts.washington.edu/currmang/Toolsforteaching/MCItemWritingGuidelinesJAME.pdf [Dead link? May 1, 2009]

Do not be surprised to find a statement of this kind as late as 2002: it expresses the lack of scientifically informed item writing techniques.

1 Introduction


Figure 1. Scheme of everything. For software to make something like this, see http://cmap.ihmc.us/

Figure 1 (from the Dutch chapter 1) Scheme of the book's content
- toetsvragen ontwerpen: item writing
- stof schematiseren: schematizing course content (a scheme is a figure like this 'Figuur 1')
- vragen: questions (test items)
- kennis beschrijven: describing knowledge
- definitie: definitions
- voorbeelden: examples
- niet-voorbeelden: non-examples
- termen (begrippen): concepts
- relaties: relations
- over tekst: on text
- probleemoplossen: problem solving
- geboden: do's
- verboden: don'ts
- intervisie: collegiate review
- kwaliteitscheck: quality check
- vraagsoort: kind of item
- vraagvorm: item format
- rompvraag: item form
- is een kunst: is an art
- met ervaring beter: knowing from experience
- cognitieve doelen: cognitive goals

1.1 Item design: art or skill?

Item writing is essentially creative - it is an art. Just as there can be no set of formulas for producing a good story or a good painting, so there can be no set of rules that guarantees the production of good items. (Wesman, 1971 p. 81)

Every test begins with an idea in the mind of the item writer. The production and selection of ideas upon which test items may be based is one of the most difficult problems confronting him. (...)
There is no automatic process for the production of item ideas. They must be invented or discovered, and in these processes chance thoughts and inspirations are very important. (Wesman, 1971 p. 86)

"From the outset of this book, it has been emphasized that constructing test items is a complex task, requiring both technical skill and creativity." The book is about the technicalities, indeed. "Creativity, however, is an element of item construction that can only be identified; it cannot be explained. Item writers, as individuals, will bring their own sense of art to the task."
[Osterlind, 1997, p. 308]

The Wesman and Osterlind view - which is rather the general view in the field - flies in the face of everything that is even remotely construct valid. It posits rather shamelessly that students do not have a fair chance to adequately prepare themselves for tests written in this artful way. How could they? Should they be artists also? Here is the most important motivation to write this book on item design: to offer an alternative to the artists in the field.

Writing plausible distractors comes from hard work and is the most difficult part of MC item writing.

Thomas Haladyna, 1999 p. 97

The item writer who is serious about the content representativeness [content validity] of the questions does not begin by jotting down questions as they come to mind, nor by soberly devising one question per page of text, but by translating each of his instructional objectives into task descriptions. Usually the best result is obtained by subdividing global objectives rather finely. Within each of these subdivisions it is not enough merely to list the topics; for each topic it must be indicated what form of mastery of that topic the instruction aims at. (...)
It must be clear whether the student will be confronted with technical terms, with situations described in everyday terms, with pictures of situations, or with concrete objects. Response formats are also important. All too often the multiple-choice format is taken for granted. If the item writer thinks open-mindedly about his objectives, he will often decide that the task calls for answers constructed by the student: written answers to completion questions, or oral answers so as to keep hindrances as small as possible.

Lee J. Cronbach in Thorndike (1971, p. 458)

1.2 First principles

Atkin, Black and Coffey (2001). Classroom Assessment and the National Science Education Standards [available on NAP]

Do students realize the goals? Not: do they differ from each other?

"The central point is that, to be effective, feedback should cause thinking to take place. Implementation of such practice can change the attitudes of both teachers and pupils to written work: the assessment of pupils' work will be seen less as a competitive and summative judgement and more as a distinctive step in the process of learning."

Paul Black (2004). Raising standards through formative assessment. In Carol Adams and Kathy Baker (Eds). Perspectives on pupil assessment. The GTC conference New Relationships: Teaching, Learning and Accountability London, 29 November 2004.

Paul Black is an advocate of assessment for learning, contrasting it with assessment of learning. Help the individual pupil to reach the goals; give her meaningful feedback. Reporting grades is not meaningful feedback, and will definitely damage the motivation of many pupils.

Achievement test questions tend to ask for definite answers. This need not be a problem in any given test, but it definitely is problematic if achievement testing forces complex and probabilistic phenomena into the format of questions having clear-cut answers. Fischbein (1975) presents research showing just one such effect. The age-of-the-captain problem is a straightforward illustration of what is meant here (see the examples for chapter 5).

feed-forward, backwash.
objectives?
heuristics must be generally applicable.
statistical analysis
ultimate goals

1.3 Summary of content

1.4 An historical perspective

Famous 16th century humanist course: no exams

(IV p. 507) " ..... there never existed a roll of attending students: the lectures were quite free, and, although highly formative, they did not lead to any test, nor to any degree: they were just intended to turn those who cared to avail themselves of them, into well-equipped searchers and ripe scholars."
(II p. 122) ... the nature of the Institute, which was meant to be free and generous in the dispensing of choice linguistic knowledge: whoever wished, could avail himself of what was offered in genuine benevolence. Its result did not stay out: a thorough acquaintance with the languages and literatures of Rome and Greece soon spread, and, although more slowly, yet more surely even showed its influence on intellectual activity itself: it reduced all study and research to reality and objectivity, breaking off with tradition and senseless repeating. That new impulse, which was as the very spirit of Busleyden's Institute, was exhibited in the forming of the students: instead of a series of automatic beings, drilled after the same pattern, it created a race of searchers, of able workers, longing for a new, for a better and intenser activity. That spirit of the teaching was at once recognized by youth, that age of generous attempts, dreaming of progress and ideal.

Henry de Vocht (1951-1955). History of the foundation and the rise of the Collegium Trilingue Lovaniense 1517-1550. 4 volumes. Louvain: Bibliothèque de l'Université, Bureau de Recueil.

If there is one place and time for the beginning of 16th century Humanism, it is the College of the Three Languages (Latin, Greek and Hebrew), newly founded in 1517 at Leuven University. The one person to name, Erasmus, was closely involved. Students from all over Europe flocked in to attend its lectures, and spread the new attitude to reading, writing and learning all over Europe.

Explicit - formal - assessment of learning is not at all self-evident. The case of the Collegium Trilingue Lovaniense is merely a shining example of the kind of course that is an end in itself, instead of one that finds its end in an exam. Back to today's worries: do not make the mistake of organizing end-of-course tests for skills labs, or other courses where pupils primarily do things or otherwise gain new experiences. It will still be necessary to design questions and problems that stimulate pupils, but these will not necessarily be used in exit tests.

1.5 Literature

John Dewey. Democracy and education.

John Dewey (1939). Experience, knowledge and value, in Paul Arthur Schilpp and Lewis Edwin Hahn: The philosophy of John Dewey. Open Court, third edition, 1989.

Wesman (1971) 'Writing the test item' in Thorndike Educational Measurement (p. 81-129).

Neil Postman and Charles Weingartner (1969) Teaching as a subversive activity. Penguin Education Specials.

More literature

See also the English literature mentioned in the Dutch chapter.

Isaac I. Bejar (2005). Toward a science of assessment. In Wayne J. Camara and Ernest W. Kimmel (Eds) (2005). Choosing students. Higher education admissions tools for the 21st century. London: Lawrence Erlbaum Associates.

Contrary to what its title suggests, Bejar's 'science' is mainly about the systematic design of test items, and about techniques for automatic scoring, layout, etcetera. See my review of the book, and the literature given there (pdf documents).

Ginette Delandshere (2002). Assessment as inquiry. Teachers College Record, 104, 1461-1484.

abstract For more than 10 years now, arguments have been constructed regarding the need for new forms of educational assessment, and for a paradigm shift with a focus on supporting learning rather than on sorting and selecting students. The call for change in assessment follows an almost unanimous recognition of the limitations of current measurement theory and practice. The conceptions of learning represented by theories of learning and cognition appear strikingly different from those implied in current educational assessment and measurement practices. Indeed, most educational measurement specialists are still working from century-old understandings and behaviorist perspectives. Although the call for change is clear, the proposals and recommendations being put forward have limitations of their own and are unlikely to yield the kinds of fundamental changes envisioned by researchers. These limitations lie either in the focus of the work, in the lack of a clear articulation of the theories and concepts, in the nature of the assumptions made about learning (many of which remain implicit and unchanged), in the exclusion of certain conceptions of learning, or in some combination of these problems. This article explores the possibility of using inquiry as a way to understand, and hence to assess, learning. After an initial review of the assessment literature in which the need for change has been asserted and analysis of the theoretical and epistemological foundations that seem to undergird these writings, the focus shifts to the meaning of learning, knowing, and teaching implied in this literature and to the limitations of its recommendations. Later sections consider notions of learning that seem to be excluded from current assessment practices and begin to uncover similarities between learning, knowing, and inquiring that could make inquiry an appropriate metaphor for what we currently know as educational assessment. 
Finally, there is discussion of important issues that would need to be considered in an inquiry framework for assessment.

Howard T. Everson (1995). Modeling the student in intelligent tutoring systems: The promise of a new psychometrics. Instructional Science, 23, 433-452.

abstract This paper reviews a number of relatively new and promising psychometric approaches to the problem of modeling student achievement (the student model) within intelligent tutoring systems (ITS). A shared characteristic of most ITSs is their need to estimate a model of the student's understanding of the domain, and use this model to modify and adapt subsequent instructional content and sequence. Sound cognitive diagnosis and the need to advance ITS technology require the development of student models that are integrated with cognitive theory and instructional science. A number of cognitively oriented psychometric approaches - including latent-trait models, statistical pattern recognition methods, and causal probabilistic networks - are described and discussed within the current ITS framework. As measurement-based student models are refined, we anticipate their compatibility with future generations of intelligent tutoring systems.

E. Fischbein (1975). The intuitive sources of probabilistic thinking in children. Dordrecht: Reidel.

Adriaan D. de Groot (1946/1978). Thought and choice in chess. Den Haag: Mouton, 1978.

Thomas M. Haladyna (1999, 2nd ed.). Developing and validating multiple-choice test items. Erlbaum. (3rd ed. 2004)

Stephen Klassen (2006). Contextual assessment in science education: Background, issues, and policy. Science Education, 1-32. restricted access pdf

abstract Contemporary assessment practices in science education have undergone significant changes in recent decades. The basis for these changes and the resulting new assessment practices are the subject of this two-part paper. Part 1 considers the basis of assessment that, more than 25 years ago, was driven by the assumptions of decomposability and decontextualization of knowledge, resulting in a low-inference testing system, often described as "traditional." This assessment model was replaced not on account of direct criticism, but rather on account of a larger revolution - the change from behavioral to cognitive psychology, developments in the philosophy of science, and the rise of constructivism. Most notably, the study of the active cognitive processes of the individual resulted in a major emphasis on context in learning and assessment. These changes gave rise to the development of various contextual assessment methodologies in science education, for example, concept mapping assessment, performance assessment, and portfolio assessment. In Part 2, the literature relating to the assessment methods identified in Part 1 is reviewed, revealing that there is not much research that supports their validity and reliability. However, encouraging new work on selected-response tests is forming the basis for reconsideration of past criticisms of this technique. Despite the major developments in contextual assessment methodologies in science education, two important questions remain unanswered, namely, whether grades can be considered as genuine numeric quantities and whether the individual student is the appropriate unit of assessment in public accountability. Given these issues and the requirement for science assessment to satisfy the goals of the individual, the classroom, and the society, tentative recommendations are put forward addressing these parallel needs in the assessment of science learning. Copyright 2006 Wiley Periodicals, Inc. Sci Ed, 1-32, 2006

Robert L. Linn (Ed.) (1989). Educational measurement. National Council on Measurement in Education, and American Council on Education. Third edition (the second edition is Thorndike, 1971).

Robert J. Mislevy (1994). Test theory reconceived. National Center for Research on Evaluation, Standards, and Student Testing (CRESST).

Gives a nice overview of current test theory, although despite its title it stays close to the received view, which has no eye for the students themselves. This theory is not needed for designing test items; whoever wants to check this for themselves should read the article.
"Test theory, as we usually think of it, is part of a package. It encompasses models and methods for drawing inferences about what students know and can do - as cast in a particular framework of ideas from measurement, education, and psychology. This framework generates a universe of discourse: the nature of the problems one defines, the kinds of statements one makes about students, the ways one gathers data to support them. Test theory, as we usually think of it, is machinery for inference within this framework."
Robert J. Mislevy (1993). A framework for studying differences between multiple-choice and free-response test items. In Randy Elliot Bennett and William C. Ward (Eds). Construction versus choice in cognitive measurement (p. 75-106). Erlbaum.
- This chapter belongs to the same project. A little odd: it redoes the work of Cronbach and Gleser without mentioning it, and without coming anywhere near the same results. A strongly authoritarian approach as well: it is about decisions ABOUT students, who themselves have no say. A typical measurement-theoretic approach; the aim is to estimate student models on the basis of test results. The real world is about quite different matters, if only because there the examination results themselves are decisive, not whatever models can be estimated from those results. Highly disappointing work, then, but there are many others who praise it to the skies. Take your pick.

Steven J. Osterlind (1997). Constructing test items: Multiple-Choice, Constructed-Response, Performance, and Other Formats. Kluwer.

The only other book on item writing, according to Thomas Haladyna (1999, p. viii). Thomas does not read Dutch, evidently.
Expensive
Technical, in a traditional way.
Does not address any issues regarding the mapping of content into test items.

James W. Pellegrino, Naomi Chudowsky and Robert Glaser (Eds) (2001). Knowing what Students Know. The Science and Design of Educational Assessment. Board on Testing and Assessment / Center for Education / Division of Behavioral and Social Sciences and Education / National Research Council: Committee on the Foundations of Assessment. Washington, DC: National Academy Press. [available for reading on NAP]

James W. Pellegrino and Naomi Chudowsky (2003). The Foundations of Assessment. Measurement: Interdisciplinary Research and Perspectives, 1, 103-148.
- abstract This article presents major messages from the National Research Council report, Knowing What Students Know: The Science and Design of Educational Assessment (2001). The committee issuing this report was charged with synthesizing advances in the cognitive sciences and measurement, and exploring their implications for improving educational assessment. The article opens with a vision for the future of educational assessment that represents a significant departure from the types of assessments typically available today, and from the ways in which such assessments are most commonly used. This vision is driven by an interpretation of what is both necessary and possible for educational assessment to positively impact student achievement. The argument is made that realizing this vision requires a fundamental rethinking of the foundations and principles guiding assessment design and use. These foundations and principles and their implications are then summarized in the remainder of the article. The argument is made that every assessment, regardless of its purpose, rests on three pillars: (1) a model of how students represent knowledge and develop competence in the subject domain, (2) tasks or situations that allow one to observe students' performance, and (3) interpretation methods for drawing inferences from the performance evidence collected. These three elements-cognition, observation, and interpretation-must be explicitly connected and designed as a coordinated whole. Section II summarizes research and theory on thinking and learning which should serve as the source of the cognition element of the assessment triangle. This large body of research suggests aspects of student achievement that one would want to make inferences about, and the types of observations, or tasks, that will provide evidence to support those inferences. Also described are significant advances in methods of educational measurement that make new approaches to assessment feasible. 
The argument is presented that measurement models, which are statistical examples of the interpretation element of the assessment triangle, are currently available to support the kinds of inferences about student achievement that cognitive science suggests are important to pursue. Section III describes how the contemporary understanding of cognition and methods of measurement jointly provide a set of principles and methods for guiding the processes of assessment design and use. This section explores how the scientific foundations presented in Section II play out in the design of real assessment situations ranging from classroom to large-scale testing contexts. It also considers the role of technology in enhancing assessment design and use. Section IV presents a discussion of the research, development, policy, and practice issues that must be addressed for the field of assessment to move forward and achieve the vision described in Section I.

Lorrie A. Shepard (2000). The role of classroom assessment in teaching and learning. CSE Technical Report 517 http://www.cse.ucla.edu/Reports/TECH517.pdf [Dead link? May 1, 2009] Published in V. Richardson (Ed.) (2001), Handbook of research on teaching (4th ed). Washington, DC: American Educational Research Association.

"The purpose of this chapter is to develop a framework for understanding a reformed view of assessment, where assessment plays an integral role in teaching and learning. If assessment is to be used in classrooms to help students learn, it must be transformed in two fundamental ways. First, the content and character of assessments must be significantly improved. Second, the gathering and use of assessment information and insights must become a part of the ongoing learning process. The model I propose is consistent with current assessment reforms being advanced across many disciplines (e.g., International Reading Association/National Council of Teachers of English Joint Task Force on Assessment, 1994; National Council for the Social Studies, 1991; National Council of Teachers of Mathematics, 1995; National Research Council, 1996). It is also consistent with the general argument that assessment content and formats should more directly embody thinking and reasoning abilities that are the ultimate goals of learning (Frederiksen & Collins, 1989; Resnick & Resnick, 1992). Unlike much of the discussion, however, my emphasis is not on external accountability assessments as indirect mechanisms for reforming instructional practice; instead, I consider directly how classroom assessment practices should be transformed to illuminate and enhance the learning process. I acknowledge, though, that for changes to occur at the classroom level, they must be supported and not impeded by external assessments." [http://www.cse.ucla.edu/Summary/517shepard.htm (Dead link? May 2, 2009)]

Robert L. Thorndike (ed.) (1971). Educational measurement. Washington, DC: American Council on Education.

Stephen Toulmin (1958). The uses of argument. Cambridge University Press.

Recent (2001) book: Return to reason, Harvard University Press.
- "Now, at the beginning of a new century, Toulmin sums up a lifetime of distinguished work and issues a powerful call to redress the balance between rationality and reasonableness. His vision does not reject the valuable fruits of science and technology, but requires awareness of the human consequences of our discoveries. Toulmin argues for the need to confront the challenge of an uncertain and unpredictable world, not with inflexible ideologies and abstract theories, but by returning to a more humane and compassionate form of reason, one that accepts the diversity and complexity that is human nature as an essential beginning for all intellectual inquiry."
The journal Argumentation, December 2005, is a special issue: The Toulmin model today. Online, though not free.
A recent project using the Toulmin 'toolbox' in secondary education:
Sibel Erduran, Shirley Simon, Jonathan Osborne (2004). TAPping into argumentation: Developments in the application of Toulmin's argument pattern for studying science discourse. Science Education, 88, 915-933.
Toulmin's books: a remedy against the tendency to impose one's own limited, discipline-bound authority on the students being assessed, and thereby to cause a great misunderstanding.

A. G. Wesman (1971). Writing the test item. In Robert L. Thorndike (ed.) (1971). Educational measurement. Washington, DC: American Council on Education.

Additional literature and links

For more literature, see also file 'a'

Pierre Bourdieu et Jean-Claude Passeron (1970). La reproduction. Éléments pour une théorie du système d'enseignement. Paris: Les Éditions de Minuit.

ch. 3: Elimination et sélection (L'Examen dans la structure et l'histoire du système d'enseignement - Examen et élimination sans examen - Sélection technique et sélection sociale)

H. S. Broudy (1977). Types of knowledge and purposes of education. In Richard C. Anderson, Rand J. Spiro, and William E. Montague (Eds) (1977). Schooling and the acquisition of knowledge. Hillsdale: Lawrence Erlbaum.

Allan Collins (1977). Processes in acquiring knowledge. In Richard C. Anderson, Rand J. Spiro, and William E. Montague (Eds) (1977). Schooling and the acquisition of knowledge. Hillsdale: Lawrence Erlbaum.

G. M. Seddon (1978). The properties of Bloom's taxonomy of educational objectives for the cognitive domain. Review of Educational Research, 48, 303-323.

Edward J. Furst (1981). Bloom's taxonomy of educational objectives for the cognitive domain: philosophical and educational issues. Review of Educational Research, 51, 441-453.

Furst was a member of the Bloom commission.

Lawrence Hamel and Patricia Schank (2006). A wizard for PADI assessment design. PADI (Principled Assessment Designs for Inquiry) Technical Report 11.

The Principled Assessment Designs for Inquiry (PADI) project (padi.sri.com) includes a design system that provides a structure for assessment designs, intended to support and encourage assessment designs with clear rationales. This PADI structure can be summarized as a template of an assessment design with many parts. Designing an assessment becomes, in essence, filling in a template intelligently and making all the choices and interconnections among the various parts of the template.
For an analogy to a template, consider a U.S. federal income tax form, which must be filled out by understanding the interconnected rules and assorted constraints. To mediate the complexity, popular tax software provides an interview format, where filling out the form is reduced to answering a series of questions.
Likewise, the PADI design system has a means to create and conduct interviews, or wizards, which prompt for decisions about assessment design. In a first prototype, we implemented a wizard that prompts for some selection criteria, eventually matching the answers supplied by the interviewee to an existing template that already has a substantial amount of information entered. This matching template is then duplicated and can be customized further by the assessment designer who uses the system.
This report includes an overview of the purpose and implementation of the wizard system, along with brief discussions of the initial impressions by assessment designers who have used it.

Derek Rowntree (1977). How shall we know them? Assessing students. Harper & Row. ISBN 0063181452, 269 pp.

Harry Torrance (1989). Ethics and politics in the study of assessment. In Robert G. Burgess: The ethics of educational research (pp. 172-187). London: The Falmer Press.

p. 173: "... an interest in technique and technology continues to dominate research into assessment. In considerable degree this is due to most research into assessment being undertaken by examination boards and thus still being concerned with monitoring the validity and reliability of examinations, rather than investigating their social role or impact on teaching. That is, research on assessment can in large part be construed as in-house quality control and market analysis by commercial organizations. (...) In such a situation it is perhaps not surprising that the ethics and politics of assessment, and of the study of assessment, rarely get an airing."
The point of mentioning the article here: there are ethical issues involved in assessment, and therefore also in the approach to the design of achievement test items. Ultimately this connects to the work of Bourdieu and Passeron, and to that of Michael Young on the threat of the meritocracy, also mentioned by Torrance.

Nancy Warehime (1993). To be one of us. Cultural conflict, creative democracy, and education. New York: SUNY.

Grant P. Wiggins (1993). Assessing student performance. Exploring the purpose and limits of testing. San Francisco: Jossey-Bass.

Linda Darling-Hammond, Jacqueline Ancess and Beverly Falk (1995). Authentic assessment in action. Studies of schools and students at work. New York: Teachers College Press.

Cases: Central Park East Secondary School, Hodgson Vocational Technical High School, International High School, P. S. 261, The Bronx New School.

Assessment Reform Group (2002). Research-based principles to guide classroom practice.

Assessment for learning
- is part of effective planning
- focuses on how to learn
- is central to classroom practice
- is a key professional skill
- is sensitive and constructive
- fosters motivation
- promotes understanding of goals and criteria
- helps learners know how to improve
- develops the capacity for self-assessment
- recognises all educational achievement

Paul Black and Dylan Wiliam (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80, 139-148.

UK

Institute of Educational Assessors

"Our aim is to improve the quality of assessment in schools and colleges by working with educational assessors to develop their knowledge, understanding and capability in all aspects of educational testing and assessment."
Affiliate membership is open to parents, foreigners, and anyone else, and is free until May 2007, because the Institute offers free membership during its first year of existence.
Membership gives access to 9000 journals, discussion platforms, and more.

Office for Standards in Education (Ofsted) (2003). Good assessment in secondary schools. pdf

"This report is about the ways in which schools can use assessment to improve learning and achievement. It draws evidence from lessons and schools' systems for academic monitoring to illustrate some of the most effective strategies used to guide, challenge and support pupils."
Assessment practice proves to be troublesome, in contrast to many other aspects of secondary education, even though the work of Black and Wiliam ('assessment for learning') has been highly influential. b.w.
"In summarising findings on the quality of assessment, Ofsted's review of secondary education in 1998 commented: 'Overall, the purpose of assessment is to improve standards, not merely to measure them.' Inspections of secondary schools continue to indicate that many schools need to work on features of assessment to ensure that it helps pupils to learn better."

February , 2009 \ contact ben at at at benwilbrink.nl

http://www.benwilbrink.nl/projecten/06examples1.htm