Testing for ‘math in everyday situations’

Dutch title: Rekenen in alledaagse situaties (Arithmetic in everyday situations)

Ben Wilbrink


announced June 14, 2015. I have been invited (July 14) to write a 2500-word article on psychological obscurantism in Dutch math education (primary and secondary education), especially in the case of the new kind of exit exam in all branches of secondary education: the tests of functional use of arithmetic in situations of daily life, called the rekentoetsen-2F and -3F. The article, in Dutch, is to appear in De Psycholoog. In preparation for it I will write the text in English, on this very web page.


“Spencer also argued for an increase in mathematics, focused on what students would need in their everyday lives.”

Kieran Egan (2002). Getting it wrong from the beginning. Our progressivist inheritance from Herbert Spencer, John Dewey, and Jean Piaget. New Haven: Yale University Press. (p 122)

“If more scholars were willing to expose nonsense for what it is, fewer students and teachers would waste their lives.”

Jon Elster (2015). Obscurantism and Academic Freedom. In Akeel Bilgrami & Jonathan R. Cole (Eds.) (2015). Who's afraid of academic freedom? (p. 82). Columbia University Press. isbn 9780231168809


This webpage is not about the use of contexts in math education. Instead, it is about research on people using math (or failing to do so) in situations of daily life, in vocational life also, and especially about research on the idea and the practice of testing students’ mastery of the functional use of mathematics in such situations. Ask me ‘Is there a substantial body of such research?’, then my answer will be: ‘No’. There is a lot of research, however, that in more or less slanted ways bears on the problem. This webpage makes a fresh start in bringing that research together, or at least in building an inventory of research questions.


Why is this important? Because obscurantist amateur psychology is ruining math education, and because the foundations of PISA Math are constructivism/situationism/progressivism and the misconception of testing for the use of math in situations of daily life. I bet you have never encountered scientific research that attests to the soundness of the concept of testing for ‘using math in situations of daily life’. I haven’t!


A perfect example of what I will be writing about:


Jan De Lange, Alla Routitsky, Kaye Stacey, Ross Turner, Margaret Wu, Andreas Schleicher, Claire Shewbridge, Pablo Zoido, and Nicola Clements (2009). Learning Mathematics for Life. A perspective from PISA. pdf

This key publication explains the rationale of the item writing for PISA Mathematics tests. The ideology explained. Examples. Empirical data. I was not aware of the existence of this report until Joost Hulshof informed me (August 5, 2014). Of course, there is a publication in Dutch by Truus Dekker and others, with lots of supporting explanation and examples.

A quick and dirty list of themes and ideas


Common threads (still to be formulated better):


PISA) Wait a moment: PISA Math is not a high stakes test. It definitely is not an exit exam for students! Not for students, indeed. For politicians it is, however. PISA Math is rapidly becoming the international ‘standard’ for what math education should be about, what it should ‘deliver’. The items in the Dutch rekentoets are in many respects identical to those in PISA Math; there is a research report by Cito on this question, but it is kept secret by the Department of Education. The problem still not explained by Cito is: how is it possible for Dutch youth to score in the world top on PISA Math, yet do so poorly on the Dutch rekentoets that it has become a political problem of huge proportions?


LO) It is not only this kind of questioning itself, but also the kinds of questioning left out (LO) of testing: testing for mastery of the algorithms of arithmetic, or for mental arithmetic. ‘Algorithm’ in the mathematical sense, not the fuzzy one like this (Everyday Math). The Dutch rekentoets explicitly does not test mastery of algorithmic arithmetic, and does not seem to take mental arithmetic seriously (no time limit imposed, small number of items).


IC) The accusation that testing for ‘math use in situations of daily life’ really is testing for differences in intellectual capabilities (IC) is not the same as claiming that these tests will correlate highly with standardized intelligence tests. Let us say instead that, e.g., PISA Math is a funny IQ test. One example: working memory capacity as an intellectual capability is not directly tested in most standardized IQ tests, yet differences in working memory capacity evidently influence results on contextual math questions of the PISA type.


IH) Is this a special class? The context presented may inhibit (IH) access to information the student in fact possesses. This is not exactly the same as the transfer problem. And what is the relation to strategy training aimed at improving access to knowledge?


WP). Of course, word problems (WP) have always been used in math education. Lots of research (see here), yet strangely irrelevant to using word problems in high stakes testing (exit exams) situations. How is that?


BUS). The busing problem (BUS). Highly popular among reform educationalists, especially so in the Netherlands. 36 passenger places in every bus. How many buses are needed for 1128 passengers to Mickey Mouse Land? What do you count as a correct answer? In Dutch: see here. [It is merely an authoritarian convention, and not mathematics, to only count the integer number ‘32’ as a correct answer]. In NAEP it is not damaging to students. In the Netherlands, students will have to answer questions like this one in exit exams, very high stakes to them.
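The arithmetic behind the busing problem can be laid out as a small sketch. The function name and the returned field names below are illustrative assumptions, not part of any test specification; the point is only that the division yields several defensible numbers, and that choosing ‘32’ is a decision about the situation rather than about the division itself.

```python
import math


def bus_answers(passengers: int, seats_per_bus: int) -> dict:
    """Return the different numbers a testee might defensibly report.

    Hypothetical helper for illustration only.
    """
    # Integer division: how many buses are completely full, and how many
    # passengers are left over after filling them.
    full_buses, left_over = divmod(passengers, seats_per_bus)
    return {
        "exact_quotient": passengers / seats_per_bus,  # 31.33... for the example
        "full_buses": full_buses,
        "left_over_passengers": left_over,
        # Rounding up is the conventional 'correct' answer: everyone gets a seat.
        "buses_needed": math.ceil(passengers / seats_per_bus),
    }


result = bus_answers(1128, 36)
print(result["full_buses"], result["left_over_passengers"], result["buses_needed"])
# prints: 31 12 32
```

Thirty-one full buses leave twelve passengers without a seat, so the conventional answer rounds up to 32; a testee answering ‘31 remainder 12’ or ‘31 1/3’ has done the same mathematics.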


PP). Are there any positive points (PP) at all?


D1) What is the demarcation between funny math problems and true math problems? Barry Garelick gives a hint I will have to follow up. I call this #1, because there are more questions of demarcation, such as Thijssen, De Examenidioot, on denksommen (what might be an English equivalent?) [Treffers, last note in his last book, on denksommen in the rekentoets].

CS). Case subjects (CS) [Dutch: zaakvakken], of course, provide true and valid situations for math to be used: geography, physics, chemistry, astronomy. (Vocational subjects are mentioned below.) Contrast this with the philosophy of connecting with the world the kids experience on a daily basis.


TC) True constructs (TC) are important in psychological testing as well as in examinations [Dutch: constructzuiverheid] (do not use calculators in arithmetic tests; distinguish between math modeling, math calculation, and interpretation of results; beware of cognitive overload).


LC). Using new kinds of problem situations in fact will test students on learning capabilities (LC) (among other things). This observation follows from Stellan Ohlsson’s 2011 theory of learning: starting with the abstract rule, specializing that rule to fit concrete situations. On the difference between known and new kinds of problem situations see also the dissertation by Anton J. H. Boonen (2015). Comprehend, visualize & calculate. Solving mathematical word problems in contemporary math education. Amsterdam: Vrije Universiteit. I take issue with Boonen on the difference he makes.


GP). Dutch educators, e.g. Anne van Streun and the math test-3F team (Victor Schmidt, pres.), define ‘math in everyday situations’ as problem solving requiring a problem solving scheme attributed to George Polya (GP). This might be a huge misconception. It is not at all clear to Artificial Intelligence scientists like Allen Newell, a student of Polya, what exactly Polya's theory is. See Newell 1983: the Technical Report version is free access; there is also a 21 MB scan of the book chapter.


RT). Retrieval (RT) of relevant information, even if well known, might be problematic.

The general point is: testing for problem solving in examinations is tricky business if the relevant productions [for this technical term see the literature based on the work of Allen Newell & Herbert A. Simon] have not been specifically exercised. For Dutch readers: Wilbrink, Toetsvragen ontwerpen (Designing test items), chapter ‘Problemen stellen’ (Posing problems).

Bransford and Franks here suggest that transfer is an important problem in the world as we know it, as well as in learning theory. I do not think so: in vocational situations the problems typically are of very well known kinds. The same will be true of problems in daily life (grocery shopping etcetera). The Bransford and Franks argument might well be false, as it has always been (see the Spencer quote at the top of this web page). Then the question becomes: what might the demarcation be between typical and atypical problem situations?


WA). There is somewhat more than a family resemblance with applied math as implemented in, e.g., the Dutch courses on Wiskunde A (WA). The ideology behind them is roughly the same, in the Netherlands identified with the Freudenthal Institute (Realistic Math Education, also called Reform Math Education, RME). The program for secondary education started in 1985. An important but today mostly forgotten document is: Cito-werkgroep (not dated, about 1987). Wiskunde A: doelgericht toetsen. Leerdoelen en voorbeeldopgaven verzorgd door het Cito. isbn 9001186319 [not online, in Dutch only]. It explains how Cito [the Dutch Educational Testing Service] expects to handle the many problems involved in designing exam questions on this ‘applied math’.


PA). How can students master their ‘math in situations of daily life’? By practising on a lot of context math problems? That is probably a very inefficient way. The parallel here is with learning to write. Scardamalia 1981 ‘How children cope with the cognitive demands of writing’, p. 100-101:

West (1967) reports that studies on teaching methods consistently show that practice alone (PA) is not very effective. Careful reading of essays combined with analytic discussions of ideas, presentation of functional writing assessments, and intensive evaluation were effective strategies when combined with practice; but practice alone was never found to be the most effective teaching strategy. I believe such findings point to a serious fallacy in the whole conventional approach to writing: the belief that when students engage in complex learning tasks they are actually engaged in all the complex problems of writing. Stollard’s research suggests that this is not true, as does much of our recent research on the thinking that children bring to their writing. Papers presented at the 1980 AERA meetings (“Knowledge Children Have but Don’t Use in Their Writing”) address this topic directly.


AMB). Paper tests are highly artificial. How could they ever be valid tests of the propensity or capability of using math in everyday situations? The information available to the candidates is wildly different from what they encounter in those supposedly mathematical daily life situations. One of the problems is: in daily life, situations do not necessarily come labelled as problems to be solved mathematically. It is difficult to say of situations in daily life that the information available is ambiguous: after all, it is the only information there is, and one has to cope with that. The test situation is totally different: it is artificial, and made up by item writers wanting you, the testee, to see the available information in a particular way, not in other ways. A good working hypothesis now is: context problems (such as PISA Math items) inherently are ambiguous (AMB) to testees (and analysts such as your reporter ;-). Can one fill a 70 L wheelbarrow with exactly 70 L of sand? The Dutch education secretary Sander Dekker told our Parliament he can! Not just once, but even 65 times in a row! The prediction therefore is: ambiguities will be revealed by letting testees report (in one way or another, from verbal report, scratch paper, eye tracking (Gerdineke van Silfhout, dissertation), to fMRI scan) on their attack on individual problems. Not every possible ambiguity is fatal for the valid use of the item in exams or other high stakes tests, of course. Failing an exam on a score just below the cutting point, however, is exactly the kind of situation where any ambiguity at all is unacceptable. The criterion: the court of law (Dutch study by Job Cohen, on the rights of students in higher education in the Netherlands. By the way, Job Cohen also was mayor of Amsterdam).


AC). The test psychology of ‘Assessment Centers’ (AC) is exactly the model that is relevant to the development of math tests (such as PISA Math) that pretend to be valid representations, or at least valid predictors, of behaviour in situations of daily life that are in one way or another characteristically mathematical.


SIT). There are situations, and situations. Situations one of a kind, and situations of different kinds. A situation here is, of course, a problem situation. Don’t worry, the semantics of reform math is full of surprises. A characteristic of valid examinations is that they test for mastery of the kinds of problem situations that students have been exercising in the past few weeks, months or years. A characteristic of psychological tests, such as intelligence tests, is that they confront the testees with problem situations that supposedly are entirely new to them, or thoroughly and equally well known to all testees. See also Job Cohen 1981 on the crucial distinction between same kind and new kind of problem situation. Where does PISA Math stand? Or the exams based on the idea of testing for use of math in situations of daily life? At issue is that the reform math literature does not recognize any distinction here, at least as far as I know that literature, or it must be that the ideal is to present exclusively new kinds of problem situations.


QL). Surely knowing one’s math must impact on quality of life (QL). This requires longitudinal cohort research, and it has been done!


EC). Education itself is an important part of daily life. Educational careers (EC) surely will depend on the quality of math education in the early years.


S). People typically do not tend to solve their problems in daily life in rational ways, but so as to get preliminary answers that are perfectly satisfactory to them. Herbert Simon called it ‘satisficing’ (S).


DM). In general: daily decision making (DM) is not rational [Daniel Kahneman, Thinking, Fast and Slow]. Expect school children, too, to behave ‘not rationally’ in the sense of Simon, or of Kahneman. E.g., do not expect students to use shortcuts in solving arithmetic problems if straight calculation surely will give the right answer [among others, research by Joke Torbeyns].


EXP). Expertise (EXP). Typically the math needed on a daily basis in vocational and daily life will eventually be mastered to a high degree of expertise.


VT). Vocational training (VT) includes training in the specific math needed in that vocation. Pick up any course book for technical vocations (I favour the old ones, for flight mechanics in the early forties, or the paper printing industry). In reform math countries/provinces, vocational institutions (nursing, teaching, administration) nowadays will have a very difficult time bringing their students’ arithmetic fluency up to the minimum level required of the responsible professional.


H). Historical (H): the Dutch schoolmaster Van Pelt systematically took inventories of the kind of math being used by housewives, masons, carpenters, etcetera, exercised those with his grade six pupils, and took care that every pupil left school with a copybook of those exercises, for later consultation. [see the first pages in Van Pelt 1903 (in Dutch) pdf] What can we, a century later, still learn from him?


CM). Cost of mistakes (CM). Lack of number sense may cost one dearly. In the Netherlands a nurse was sentenced to prison for wrongful death. She mistook the prescription of 0.30 mg for ten times 0.3 mg; the mistake proved fatal to her patient. Also in the Netherlands, prime minister Mark Rutte did not remember correctly, after returning from Brussels, a critical amount of 90 billion euro, instead reporting it to be 50 billion. The question now becomes: are there systematic surveys of number sense failures and the costs they incur? For example: the number of mistakes made by nurses in their daily routine. According to a Dutch investigation that number is very high, and it can be reduced significantly by pharmacists taking over critical calculations in medication preparation. Nursing is typically the kind of vocation that education should adequately prepare the students for, regarding number sense and language mastery.


PC). ‘Math in daily life’ is a poor concept (PC). At the very least one should distinguish 1) daily use such as calculations/estimations of time, distances, amounts; 2) the rare occasion on which some significant decision has to be made, based on calculations: booking a vacation, buying insurance, renting a house; 3) the routine but possibly complex math in vocational practice. The first can be researched, e.g., with techniques used by Mihaly Csikszentmihalyi in his research on flow experience. The second can be researched in the psychological lab, or simply in real life situations. The third should be quite interesting to the math researcher: in many vocational situations correct calculations are crucial and mistakes might come at a cost, so there must be an extensive literature covering math use in vocational situations. I’ll have to search for that literature (ergonomics, for example?).


MT). Allocating students to different math tracks (MT) on the premise of the kind of vocation they will probably have in ten to twenty years is highly problematic morally. This is at issue in the Netherlands, where time and again different tracks have been created for the mathematically strong versus weak students. The weaker tracks emphasize using math in problem situations of daily life; the stronger tracks emphasize using math in math-type problems.


SC). Instead of seeking foundations in what is known scientifically, protagonists of ‘math in daily situations’ rely on two somewhat antagonistic ideologies: constructivism, and the more recent situationism. [ Anderson, Reder and Simon sounded alarm bells on this development at the end of the last century, e.g. http://goo.gl/12hnpD ]


TR). The idea of transfer (TR) lies at the basis of much (fuzzy psychological) thinking in reform math. The basic (but naive) observation is that many people do not seem to use their knowledge of math in situations where it evidently would be applicable. We know already, from the above, that human decision making need not be rational in the school sense; it is no wonder people do not take the trouble to calculate their options if a satisficing approach already solved their decision problem. In psychology itself transfer is not quite well understood. Stellan Ohlsson (2011, Deep Learning. How the Mind Overrides Experience) thinks transfer is nothing else but learning itself, and I think he has strong arguments for his position.


AM). Anecdotal materials (AM), such as this Albertan piece: the kind of stuff one hears at birthday parties, from employers, from desperate parents, about their Johnny not knowing how to add any more. Surely Johnny, if he doesn’t now know how to mentally add, won’t be able to mentally add in daily life, during his lifetime (what would be the cost of that?). Even assuming these anecdotes to be valid, they have a signaling value only. Some cases might count as critical incidents: if valid, they are proof of something being seriously amiss. For example: a teacher declaring, on national television (the Netherlands: Brandpunt), ‘In this school I am forbidden to teach long division’.


RE). Reverse Engineering (RE): given test or exam questions presumably testing for the use of math in situations of daily life, is it possible that the situation imagined truly is a situation that occurs in real life daily, or once in a lifetime, as well as a situation that education should prepare its students for? Most items from PISA Math and comparable tests such as the Dutch rekentoets are fanciful, contrived, trivial, not mathematical at all; meaning there is no real world corresponding to most of these contrived math test items.


CLT). Another approach will be along the lines of psychological theories such as Sweller’s Cognitive Load Theory (CLT), or simply using Miller’s 7 plus or minus 2 chunks capacity of working memory; for children even (substantially) fewer than 7 chunks. Research literature: see here. It will be immediately evident that confronting students with situations they have no experience with cannot possibly be valid for the use of math in that kind of situation later in life. Etcetera.


E). The E of Exercise. The amazing observation of the outsider in the math education debate: reform protagonists really do think investing in fluency in the basic algorithms of arithmetic is wasting time and money. I can’t remember even one protagonist acknowledging that fluency in multiplication and division of large numbers necessarily implies fluency in the basic operations on single digit numbers, and probably also otherwise a high degree of number sense. It is this number sense that is of lifetime importance in the realms of life that really count: health, and also financial health. The research mentioned, by Reyna and others, already hinted in that direction. More research would be very welcome.


CA). That number sense can be made less abstract by operationalising it in models of cognitive architecture (CA), such as Anderson’s ACT-R model, or Newell’s SOAR. The next step then is meticulously researching what is happening in the brain while people tackle their everyday problems involving number sense in one way or another. This kind of research has already been done by Lebiere and Anderson (a cognitive specification for solving simple algebraic equations, subsequently using fMRI scanning to follow the brain events of the young problem solvers); it only has to be specialised to the number sense implicated in the daily functioning of people.







The following has been copied in full from model.htm; I still have to select from it on relevance:





Timothy Gowers (2002). Mathematics. A Very Short Introduction. Oxford University Press.

Gowers is a Fields Medal winner. A nice little book, but also of real importance: chapter 1 on models, 2 on abstractions, 3 on proofs. Modelling a problem requires quite a lot of simplification. Try applying that to the everyday problems in the context items found in many arithmetic tests! That is an exercise I should do sometime: not reasoning from what the designer of the question presumably intended, but looking seriously at the given context. How many assumptions are already implicit in it, and so on.




Andrzej Sokolowski, Bugrahan Yalvac & Cathleen Loving (2011): Science modelling in pre-calculus: how to make mathematics problems contextually meaningful, International Journal of Mathematical Education in Science and Technology, 42, 283-297. abstract

This article addresses the problem that applying mathematical knowledge to problems in the case subjects (zaakvakken) often comes down to superficially manipulating the right formulas. Turn it around: build the instruction around problems in which a suitable mathematical model has to be constructed, in a non-trivial way, from data that are available or still have to be collected.




Dédé de Haan (2001). Praktische opdrachten bij wiskunde: verslag van een onderzoek. Nieuwe Wiskrant 20-3/maart pdf


Bart Ormel (2011?). Het natuurwetenschappelijk modelleren van dynamische systemen. Naar een didactiek voor het voortgezet onderwijs. Dissertation. Review by Sylvia van Borkulo. Tijdschrift voor Didactiek der β-wetenschappen, 28, 75-78.




L. N. Tronsky (2005). Strategy use, the development of automaticity, and working memory involvement in complex multiplication. Memory and Cognition, 33, 927-940. free access








Lieven Verschaffel, Brian Greer and Erik de Corte (2000). Making sense of word problems. Lisse: Swets & Zeitlinger.




Sean P. Yee & Jonathan D. Bostic (2014). Developing a contextualization of students' mathematical problem solving. The Journal of Mathematical Behavior, 36, 1-19. abstract


See appendix D in Bostic for what here is called ‘math problem solving’. This kind of question is fuzzy mathematics. Bostic describes them thus: ‘Open, complex, realistic problems were the main focus of instructional materials for intervention participants’. I do not know what it means if instructed students score better on this type of ‘math in situations of daily life’, ‘functional mathematics’, or whatever it may be called.

Authors Lesh & Zawojewski are known as constructivists. The chapter in the handbook undoubtedly is not a research report. Maybe it summarizes research? I could have a look, but would it be worth my time?

The Yee & Bostic claim cited above simply is not true: the problem solving involved is not mathematical problem solving. The authors are perfectly aware of this problem, otherwise they would not have mentioned Polya’s work on the teaching of problem solving.

This research is not up to the standards of experimental psychology. Looks much like an attempt to re-invent the problem solving psychology of Duncker (between the world wars), in a psychologically simplistic way. As such it might be representative of most of the ‘research’ (often called ‘design research’) by protagonists of reform mathematics.

No math problem solving at all, in this article.

Sorry, Sean!



H. P. Bahrick & L. K. Hall (1991). Lifetime maintenance of high school mathematics content. Journal of Experimental Psychology: General, 120, 22-33. abstract


I do not yet have a pdf or photocopy of this. Mentioned by Willingham (in a 2015 blog) and by Anderson, Reder & Simon (2000).



Anne Anastasi (1984). Aptitude and achievement tests: The curious case of the indestructible strawperson. In Barbara S. Plake: Social and Technical Issues in Testing. Implications for Test Construction and Usage (129-140). Erlbaum. isbn 0898592992 pdf


On the differences (if any) between aptitude and achievement tests. Not quite the same problem as that posed by math tests using contexts from daily life, in contrast to proper math tests. Needs my attention. A very well-informed chapter.



Noëlle Bisseret (1979). Education, class language and ideology. Routledge & Kegan Paul. isbn 0710001185 info




Richard M. Brandt (1972/1981). Studying behavior in natural settings. University Press of America. isbn 081911829X


Shows what it takes to research behavior in situations of daily life ;-)



Important when someone has fallen overboard: remind him/her to SWIM (wisdom gleaned from a report on SAIL Amsterdam 2015). Now compare: #rekentoets with ‘arithmetic in everyday situations’ ;-) I should add: the whole fuss is supposed to solve the transfer problem, namely that people do not always spontaneously apply their arithmetic skills where they very well could.




Donald Spearritt (1996). Carroll's model of cognitive abilities: educational implications. Special issue of International Journal of Educational Research, 25 (2), 107-198.


On mathematical ability: pp 156-7.









August 24, 2015 \ contact ben at at at benwilbrink.nl    

http://www.benwilbrink.nl/projecten/rekenen_in_alledaagse_situaties.htm