Figure 1 illustrates the main points of the learning model as conceived in the SPA model. The blue curve is that of replacement learning, one of the two basic forms of learning implemented in the SPA model. The other, accumulation learning, produces a less steep learning curve. The basic learning process is the learning of small bits of knowledge. The basic curves therefore start off rather steep, and the function levels off as learning time progresses.
Learning time is a personal parameter: different students need different amounts of time to reach the same mastery. To avoid the complexities of personal differences, the time dimension is indicated in episodes; the first episode is the time spent learning until the preliminary test that indicates the level of mastery reached. Be aware that true mastery, as depicted in the figure, is an abstraction and will never be observed.
More often than not, mastery will be mastery of knowledge and insight of a more complex character than that on which learning has been defined. The complexity parameter simply is the number of bits of knowledge one must master in order to answer correctly the items taken from the domain. This complex learning may also be plotted; figure 1 shows the learning curve for complexity five. This curve is a better candidate to fit the growth in mastery of a domain of knowledge (items). In the beginning, complex learning may be very slow but accelerating, until the acceleration stops and turns into a levelling off.
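The shape of such a complex learning curve can be sketched numerically. The snippet below is only an illustration, not the applet's code: it assumes replacement learning for the basic bits, with an arbitrary rate parameter r = 1, and raises basic mastery to the power of the complexity.

```python
import math

def complex_mastery(t, complexity=5, r=1.0):
    """Mastery of complexity-5 items, assuming replacement learning of the
    underlying basic bits (illustrative rate parameter r = 1)."""
    basic = 1.0 - math.exp(-t / r)   # basic replacement learning curve
    return basic ** complexity       # all bits must be known simultaneously

# Gains per half episode: first accelerating, then levelling off (S-shape).
values = [complex_mastery(0.5 * i) for i in range(7)]
gains = [b - a for a, b in zip(values, values[1:])]
```

The gains per half episode first grow and later shrink, reproducing the slow start, acceleration, and levelling off described above.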
The SPA model does not in any way depend on the specific form of replacement or accumulation learning. The program does not yet allow it, but in principle learning curves of any form might be used in the SPA model, as long as they are deterministic. The SPA model is not a sophisticated learning model; it suffices to assume learning to be a deterministic process and to leave research on real learning processes to experimental psychology.
The vulnerable point in the SPA model is now seen to be the learning model, because the learning model that actually applies will never be known for sure. Whatever the analysis the SPA model is used for, it will always be necessary to try different kinds of learning models and/or complexities to ascertain whether the relevant outcomes of the analysis lie in an acceptable range with a certain probability. A robustness analysis on the learning model is therefore called for.
For the applet itself click spa_applets.htm#5.
In a way the Predictor is all the student needs to decide whether she still has to invest more time in preparation for the test. It would be nice, though, to have some insight into the amount of time needed to reach a more satisfactory level of expected outcomes. Clearly the static Predictor should be the kick-off for an attempt to bring in some dynamics by assuming a learning model. This module presents the learning model options in isolation from the Predictor. The next module will combine the two to model the path of expectations.
The learning model assumes mastery to be known, and learning to be deterministic according to the particular model chosen. The prototype presents a choice between two kinds of learning model, called the accumulation model and the replacement model (Mazur and Hastie, 1978). The applet offers the opportunity to plot a second curve or second set of curves according to the second specification for the parameters.
The interpretation in this modelling is as follows. Take the number-correct score on a preliminary test as your value for mastery; after all, that is your best bet. Preparation time comes next: it is all the time spent in preparation for the preliminary testing. It is arbitrarily lumped together and assigned the value of 'one episode.' The excess of episodes specified is the possible future trajectory. The number of bars specifies into how many parts every episode has to be chopped to produce a reasonably smooth learning curve.
The problem with the accumulation as well as the replacement model is that they do not fit the complex learning in most curricula: they produce learning curves that are steepest at the starting point and level off after that. The SPA model tries to solve the problem elegantly by assuming that test items can be answered correctly only by knowing a certain number of basic facts or events, called the item's complexity. Mastery is still defined on the complex items in the test and in the item bank, but learning is defined on knowledge of the underlying basic facts or events. Making test items more complex produces curves that are level to begin with, then slope upward, and only thereafter begin to level off. Again, see the chapter's text for the fine points and scientific backing.
basically the learning process is a mystery
The problem in modeling learning is that learning is a physical process of immense complexity; it simply is not known how exactly learning, or retrieving information from memory, is possible, let alone how exactly the learning of algebra or English occurs. Accordingly, any attempt at modeling entails approximation, compromise, and empiricism. The resulting models may nevertheless be highly sophisticated. An example of this kind of modeling is TODAM, the Theory of Distributed Associative Memory (Murdock, 1995).
It will be clear from the fragment above that it is strictly impossible for the SPA model to use learning curves that are true to the learning the student in fact performs, or to the way she retrieves learned materials probed by the test questions. This state of affairs is completely different from that concerning the utility functions in the SPA model, which are predicated on culturally defined rules.
Learning curves in the SPA model will of necessity be very crude, and they should be used as instruments to probe the robustness of results obtained. One way to do so is to repeat analyses using different types of learning curves and different values for the complexity parameter, for example, translating exact quantitative results into weaker qualitative ones.
a limited number of basic learning models
[follows Mazur and Hastie 1978]
The Mazur and Hastie learning models do not fit the learning of the kind of complex knowledge that (higher) education is about. The following paragraph presents a satisficing solution to fill this gap.
complex knowledge, insight
To answer the typical item in an educational test, the student should be able to use two or more bits of basic knowledge simultaneously. Basic knowledge may of course be tested directly; its complexity then is one. This really is a bare-bones approach to modelling the testing of complex knowledge. Technically its implementation in a mathematical model is extremely simple: the probability of knowing the complex item is the product of the probabilities that the basic knowledge items have been mastered.
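A minimal sketch of this product rule, with hypothetical numbers (complexity two, both basic bits mastered with probability 0.6):

```python
def complex_item_probability(basic_probs):
    """Probability of answering a complex item correctly: the product of
    the mastery probabilities of its underlying basic knowledge items."""
    result = 1.0
    for p in basic_probs:
        result *= p
    return result

# Two basic bits, each mastered with probability 0.6:
p2 = complex_item_probability([0.6, 0.6])   # 0.36
```

With equal basic mastery p and complexity c, the rule reduces to p raised to the power c, the form used throughout this module.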
I am not aware of any places in the literature that use the same operationalization of complex knowledge, but they must exist. Combining basic knowledge items into complex knowledge items might be called chunking, a term familiar since the publication of Miller's (1956) seminal paper on the magical number seven. The concepts of complexity and of chunking are not identical, however. The proper interpretation of chunking is that mastery then should be defined on the level of the chunk, and treated as 'basic knowledge,' on a higher level. The problem then is to find mathematical functions that fit the learning trajectory of mastery on the 'chunk level.' The chunking concept, in other words, does not solve the problem that Mazur and Hastie's learning functions do not fit the learning of complex knowledge.
Insight might be modeled - I will not do so for the SPA model, at least not now - as 'constrained stochastic behavior,' in the same way Simonton (2003) has done to explain scientific creativity. Simonton uses the concepts of 'domain' and 'field' as proposed by Csikszentmihalyi (1990, 1999). "The domain consists of a large but finite set of facts, concepts, techniques, heuristics, themes, questions, goals, and criteria. These can be collectively referred to as the population of ideas that make up a given domain." "The field consists of all those individuals who are working with the set of ideas that define the domain." The stuff scientific creativity is made of is combinations of these ideas, most of them proving sterile, of course. What makes this a model of insight in learning students is that here the domain is very small, enabling students to explore many, most, or even all of the combinations. All useful combinations contribute to the student's insight. The students as a group or population might even be thought of as the 'field' of - in this case - the course. Because of the smallness of the domain, insight is not a result of serendipity or luck, but of hard work.
Broadening our scope from mastery of course content to expertise in a specific function, job, or discipline, it is evident that learning will not stop at the rather simplistic levels of mastery of textbook content but will continue on the job and in life. Growth in expert knowledge is a process on quite a different time scale; there is a dedicated literature on this topic (for example Ericsson 1999).
knowledge objects
Chunking as a term might have been introduced by Miller (1956), but the underlying concept is known from, for example, De Groot's (1946, 1965) work on the thinking of the chess player. Highly complex learning will not necessarily behave along the simplistic lines of the SPA model's definition of complex knowledge. Learning complex course material, for example, might be a continuous process of slow growth, ending in a jump to a high level of mastery. This is not unlike the results in learning complex skills, such as typing, where growth in mastery might temporarily seem to have stopped, and then jump to a higher level. In the cognitive field the phenomenon is called by Entwistle (1995) 'knowledge objects.' The student, until then unable to come to grips with the course content, after the jump is confident in her or his mastery of it. The reason to mention this kind of phenomenon here is to warn that the learning formulas in the SPA model need not always be able to catch the complexities of real-world learning. This is not a fault of the SPA model, however, because in principle it is possible to feed the model a graphical curve instead of a formula representing replacement or accumulation learning. No, I have to disappoint you; this possibility is not implemented in the applet.
Serendipity
There are unpredictable forms of intellectual growth and of gaining new insights that for that reason cannot be captured in graphs over time. They might not be any less important in education or in society at large than the forms of learning that get tested right away. In science the phenomenon of discovery by serendipity is well known. It is a kind of discovery that comes to the well prepared only, however random it may otherwise seem. A less erratic form of luck in scientific work has been wonderfully well described by Giere (1988): the scientific researcher discovering that a model or technique that is totally disconnected from her current research activities, but well known from a project elsewhere or an earlier job, might be used to force a breakthrough in the current project or field.
More down to earth, it is evident that there is a weak but certainly positive relation between the levels of mastery reached in earlier course subjects and the ease and quality of further study. In my (1978) I suggested the possibility that mathematically optimal strategies according to the SPA model might neglect the bonuses to be obtained by reaching for slightly higher levels of mastery than would for the moment be the most time-efficient. There will be ample opportunity to return to this observation, which is of course widely known in educational research and in research on equal opportunity in education. For a recent discussion see Ceci and Papierno (2005). Evidently the SPA model does not model the world at large, but it should be able to assist in recognizing extravagant claims about what good educational measurement is able to do for this or that kind of stakeholder.
On serendipity see also Simonton (2003).
what about forgetting?
A learning model was incorporated in the tentamenmodel by Van Naerssen in 1970. His choice was the constant value model, equivalent to the replacement model. In the 1970s model development suffered from the complexities involved in modelling forgetting as well as learning. The SPA model simply assumes forgetting to be absorbed in the learning itself. Forgetting will ask our attention again in another form in module 7, where the consequences of failing tests or examinations have to be modelled in order to find optimal strategies.
individual differences in capacity or intelligence
One of the complexities in earlier attempts at modelling learning has been the notion of individual differences in learning. The temptation was to introduce a parameter for capacity, or intelligence if you prefer that term. It is possible to handle the capacity parameter in the same way as has been done with the mastery parameter, but it greatly complicates the model. Leaving it out altogether proved to be a viable option: because the SPA model is a model for individual student strategies, it is not necessary to bring in a parameter that describes individual differences.
what to make of time?
Related to the last observation is the dimension of time. Speed of learning obviously differs between students, and so does the time needed to reach a specific level of mastery. It would be nice to have a model that uses the time dimension without translating time into physical units of hours or weeks or what not. An early attempt to do so was Van Naerssen's learning model, which stipulated the unit of time to be whatever it takes to learn one half of the course material not yet mastered, assuming that amount of time to be constant. But this is only an interpretation of the mathematical learning model used. In the SPA model the dimension of time is abstracted further by calling the time already spent in preparation 'one episode.' That one episode will stand for different amounts of time on the clock or the calendar, according to the particular moment the student takes stock of what she has learned, or according to the particular student involved.
and the simulation?
Early versions of the SPA model in the 1990s had procedures to simulate learning as well. The decision to simulate learning seemed necessary at the time, and resulted in many complexities and big losses of time. But why exactly should a model of achievement testing incorporate a learning model that not only describes the development of learning in time, but also the process that generates the learning? Here too it was the early conception, in the 1970 tentamenmodel, of learning as a process that invited the misconception.
The SPA model now considers learning to be a process in a black box, the outcomes of which are assumed to be adequately described by the mathematical function chosen. The learning model with random fluctuations has in the SPA model been replaced by deterministic models. Of course one can never be sure whether a particular learning model is the correct one, but this problem should be handled by studying the model outcomes under different learning models, or under different parameter values for the same learning model: robustness analysis and analysis of worst- versus best-case scenarios, for example.
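Such a robustness check can be sketched in a few lines. The sketch below assumes the two model forms used in this module, mastery(t) = t / (t + b) for accumulation and mastery(t) = 1 - exp(-t / r) for replacement; both are fitted to the same hypothetical observation, mastery 0.6 after one episode at complexity one, and then asked to predict mastery at episode two.

```python
import math

def accumulation_forecast(t, m1):
    """Accumulation model t / (t + b), with b solved from mastery m1 at t = 1."""
    b = 1.0 / m1 - 1.0
    return t / (t + b)

def replacement_forecast(t, m1):
    """Replacement model 1 - exp(-t / r), with r solved from mastery m1 at t = 1."""
    r = -1.0 / math.log(1.0 - m1)
    return 1.0 - math.exp(-t / r)

acc = accumulation_forecast(2.0, 0.6)   # 0.75
rep = replacement_forecast(2.0, 0.6)    # 0.84
```

The 0.09 gap between the two forecasts from the same starting point is exactly the kind of spread a robustness analysis should report.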
An example of a learning model whose process lends itself to simulation is Anderson's ACT model. See the article by Pieter Been (in Dutch) for some simulation results on the ACT model. I have not (yet) studied the ACT model and its stochastic properties. It looks like a model that would not be too difficult to program next to the two deterministic learning models now implemented in this SPA module on learning.
The core of the applet's program consists of the methods for the two types of learning curve implemented in the applet.
Each learning model has three variables that are supposed to be given - mastery, ceiling, and time - and one parameter that has to be evaluated from these givens. The mathematical derivation for this evaluation is given above the code proper.
In some situations there is a ceiling to the learning, a barrier to full mastery. To keep things simple, the SPA model assumes the ceiling to be 1. This is a rather artificial assumption, of course, for its corollary is that the quality of the test items is impeccable, including the quality of the assessment of the answers given. The latter point is not of minor import, because it surely is a problem for closed questions (multiple choice) as well as for open questions (see my 1977).
The time variable is assumed to be one, i.e. one episode. It is the time invested in preparing for the preliminary test. For different students that one episode might span a relatively big range of time as measured in minutes, hours, days or weeks.
The learning model is assumed to apply to 'simple' learning, the learning of the basic bits or units of the course. The transformation to mastery of 'complex' knowledge, and vice versa, is technically simple and occurs elsewhere in the program.
To test the correctness of the learning functions, option 501 will furnish the function values and the rounded values used in the plot itself. Use the mathematical functions, given in the java code paragraph above, to check the correctness of the values. The t-values in the list in figure 1 are 0.0, 0.1, 0.2, ..., 2.0.
For the accumulation model:
given time = 1, complexity = 1, mastery = 0.6, then
parameter = 1 * ( -1 + 1 / 0.6 ) = 1.6666666667 - 1 = 0.6666666667.
At time = 2, then
mastery = 2 / ( 2 + 0.6666666667 ) = 0.75.
given time = 1, complexity = 2, mastery = 0.36, then basic mastery is the square root of 0.36, which is 0.6.
At time = 2 the basic mastery is 0.75, and mastery is its square: 0.5625.
given time = 1, complexity = 2, mastery = 0.6, then basic mastery is the square root of 0.6, which is 0.7745966.
parameter = 1 * ( -1 + 1 / 0.7745966 ) = 1.2909945 - 1 = 0.2909945.
At time = 2, then
basicMastery = 2 / ( 2 + 0.2909945 ) = 0.8729833,
and mastery is its square, 0.7620998, equal to the last value in figure 1, 0.7620999, barring rounding error.
For the replacement model:
given time = 1, complexity = 1, mastery = 0.6, then
parameter = -1 / ln( 0.4 ) = -1 / -0.916291 = 1.0913563.
At time = 2, then
mastery = 1 * ( 1 - exp( -2 / 1.0913563 ) ) = 1 - exp( -1.832582 ) = 1 - 0.16 = 0.84.
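These hand calculations can be checked with a short script. The functions below are a reconstruction from the formulas above, not the applet's Java code: basic mastery at time one is obtained as the complexity-th root of the observed mastery, the parameter is solved from it (with the time variable fixed at one episode), and complex mastery is the basic curve raised to the power of the complexity.

```python
import math

def accumulation_mastery(t, mastery1, complexity=1):
    """Accumulation model, fitted to observed mastery1 at time 1."""
    basic1 = mastery1 ** (1.0 / complexity)   # basic mastery at time 1
    b = 1.0 / basic1 - 1.0                    # solve basic1 = 1 / (1 + b)
    return (t / (t + b)) ** complexity

def replacement_mastery(t, mastery1, complexity=1):
    """Replacement model, fitted to observed mastery1 at time 1."""
    basic1 = mastery1 ** (1.0 / complexity)
    r = -1.0 / math.log(1.0 - basic1)         # solve basic1 = 1 - exp(-1 / r)
    return (1.0 - math.exp(-t / r)) ** complexity
```

For example, accumulation_mastery(2, 0.6) reproduces 0.75, accumulation_mastery(2, 0.6, 2) reproduces the 0.7621 (to four decimals) computed above, and replacement_mastery(2, 0.6) reproduces 0.84.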
Anderson (1976). An empirical investigation of individual differences in time to learn. Journal of Educational Psychology, 68, 226-233.
John R. Anderson, Paul J. Kline and Charles M. Beasley, Jr. (1980). Complex learning processes. In Richard E. Snow, Pat-Anthony Federico and William E. Montague (Eds.) (1980). Aptitude, learning and instruction. Volume 2: cognitive process analyses of learning and problem solving (p. 199-236). Erlbaum.
Anderson, J. R., et al. (1993). Rules of the mind. Hillsdale, New Jersey: Lawrence Erlbaum.
Anderson, J. R., C. Lebiere, et al. (1998). The atomic components of thought. London: Lawrence Erlbaum.
Marshall Arlin and Janet Webster (1983). Time costs of mastery learning. Journal of Educational Psychology, 75, 187-195.
Been, Pieter (1998). Individuele studiesystemen: ondergang, varkenscyclus of feniks. In Theo H. Joostens en Gerard W. H. Heijnen (Red.). Beoordelen, toetsen en studeergedrag. Groningen: Rijksuniversiteit, GION - Afdeling COWOG Centrum voor Onderzoek en Ontwikkeling van Hoger Onderwijs, 33-53.
Biggs, J. (1996). Enhancing teaching through constructive alignment. Higher Education, 32, 347-364.
Brown, S. & Heathcote, A. (2003). Averaging learning curves across and within participants. Behaviour Research Methods, Instruments & Computers, 35, 11-21
http://www.newcastle.edu.au/school/behav-sci/ncl/publications.html
Browne, M. W., & Du Toit, S. H. C. (1990). Models for learning data. In Collins, L. M., & Horn, J. L. (eds) (1991). Best methods for the analysis of change: recent advances, unanswered questions, future directions. Washington, D.C.: APA. 47-68.
Ceci, Stephen J., and Paul B. Papierno (2005). The rhetoric and reality of gap closing. When the 'Have-Nots' gain but the 'Haves' gain even more. American Psychologist, 60, 149-160.
Chant, V. G., & Atkinson, R. C. Application of learning models and optimization theory to problems of instruction. Chapter 8 in Estes' volume 5.
Csikszentmihalyi (1990). The domain of creativity. In M. A. Runco and R. S. Albert: Theories of creativity. p. 190-212. Newbury Park, CA.: Sage. [I have not yet seen this article]
Csikszentmihalyi (1999). Implications of a systems perspective for the study of creativity. In R. J. Sternberg: Handbook of creativity, p. 313-338. Cambridge University Press. [I have not yet seen this article]
Entwistle, N. (1995). Frameworks for understanding as experienced in essay writing and in preparing for examinations. Educational Psychologist, 30, 47-54. abstract
Ericsson, K. A. (Ed.). The road to excellence. The acquisition of expert performance in the arts and sciences, sports and games. Mahwah, New Jersey: Lawrence Erlbaum.
Gathercole, Susan E. (Ed.) (1996). Models of short-term memory. Hove, UK: Lawrence Erlbaum. (a.o.: Neil Burgess and Graham J. Hitch: A connectionist model of STM for serial order (see also www2.psy.uq.edu.au/CogPsych/ Noetica/Articles/Ng_MayberyNoeticaFinal.pdf) - George Houghton, Tom Hartley and David W. Glasspool: The representation of words and nonwords in short-term memory: Serial order and syllable structure (see also: http://psyche.cs.monash.edu.au/v2/psyche-2-25-houghton.html) - Randi C. Martin and Mary F. Lesch: Associations and dissociations between language impairment and list recall: Implications for models of short-term memory. (see also: psych.rice.edu/brainandlanguagelab/ posters/hamiltonmartin.CNS2003.pdf) - Richard Schweickert, Cathrin Hayt, Lora Hersberger, and Lawrence Geuntert: How many words can working memory hold? A model and a method.)
Groot, A. D. de (1946). Het denken van den schaker. Een experimenteel psychologische studie. Amsterdam: Noord-Hollandsche Uitgevers maatschappij.
Groot, A. D. de (1965). Thought and choice in chess. The Hague: Mouton.
Heathcote, A., & Mewhort, D. J. K. (2000). The evidence for a power law of practice. In R. Heath, B. Hayes, A. Heathcote, & C. Hooker, The Proceedings of the 4th Conference of the Australasian Cognitive Science Society, The University of Newcastle, Australia.
http://www.newcastle.edu.au/school/behav-sci/ncl/lop.pdf
http://www.newcastle.edu.au/school/behav-sci/ncl/publications.html
Heathcote, A., Brown, S. & Mewhort, D.J.K. (2000) Repealing the power law: The case for an exponential law of practice. Psychonomic Bulletin and Review, 7, 185-207.
http://www.newcastle.edu.au/school/behav-sci/ncl/publications.html
Hicklin, W. J. (1976). A model for mastery learning based on dynamic equilibrium theory. Journal of Mathematical Psychology, 13, 79-88.
Jensen, Arthur R. (1973). Growth model of achievement. In his Educability and group differences, p. 79.
Kelvin Lai and Patrick Griffin (2001). Linking Cognitive Psychology and Item Response Models: towards modelling problem strategies. Paper presented at the 2001 annual conference of the Australian Association for Research in Education, Perth, December 2-6. pdf
Langley, P., and R. Jones (1988). A computational model of scientific insight. In R. S. Sternberg (Ed.). The nature of creativity (p. 177-201). Cambridge: Cambridge University Press.
Marc Mangel and Colin W. Clark (1988). Dynamic modeling in behavioral ecology. Princeton, NJ: Princeton.
Mazur, J. E., and R. Hastie (1978). Learning as accumulation: A reexamination of the learning curve. Psychological Bulletin, 85, 1256-1274.
Mellenbergh, G. J., and W. P. Van den Brink (1998). The measurement of individual change. Psychological Methods, 3, 470-485.
Michener, E. R. (1978). Understanding understanding mathematics. Cognitive Science, 2, 361-383.
Miller, George A., Eugene Galanter, and Karl H. Pribram (1960). Plans and the structure of behavior. New York: Holt, Rinehart and Winston.
Miller, George A. (1956). The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. The Psychological Review, 63, 81-97. pdf
Miller, George A. (2003). The cognitive revolution: a historical perspective. TRENDS in Cognitive Sciences, 7, 141-144. pdf
Moore, et al. (1974). Concept difficulty as a function of the number and level of conceptualization of the defining attributes. Journal of Experimental Education, 42(3), 42-25.
Murdock, B. B. (1995). Developing TODAM: three models for serial-order information. Memory & Cognition, 23, 631-645.
http://memory.psych.purdue.edu/models/todam/ [dead link? 2-2008]
Murdock, Bennet (1996). Item, associative, and serial-order information in TODAM. In Susan E. Gathercole: Models of short-term memory. (p. 239-266) Hove, UK: Lawrence Erlbaum.
Naerssen, Robert F. van (1970). Over optimaal studeren en tentamens combineren. Openbare les. Amsterdam: Swets en Zeitlinger, 1970. html [The first publication on the tentamen model, in Dutch]
Robert F. van Naerssen (1978). A systems approach to examinations. Annals of Systems Research, 6, 63-72.
Newell, A., & Rosenbloom, P. S. (1981). Mechanisms of skill acquisition and the law of practice. In Anderson, J. R. (Ed.): Cognitive skills and their acquisition. Hillsdale, N. J.: Erlbaum. 1-55. pdf
Norman, M. F. (1972). Markov processes and learning models. New York: Academic Press.
Postman, Leo (1978). Methodology of human learning. In W. K. Estes (Ed.) (1978). Handbook of learning and cognitive processes, volume 5, Human information processing (p. 11-69). Hillsdale, New Jersey: Lawrence Erlbaum. [Especially III Measurement of retention]
Rochlin, G. I. (1997). Trapped in the net. The unanticipated consequences of computerization. Princeton, New Jersey: Princeton University Press.
Rogosa, D., Brandt, D., & Zimowski, M. (1982). A growth curve approach to the meaurement of change. Psychological Bulletin, 92, 726-748.
Rosenbloom, P., & Newell, A. (1986). Learning by chunking: a production system model of practice. In David Klahr, Pat Langley, and Robert Neches: Production system models of learning and development. MIT Press
Sagiv, A. (1979). General growth model for evaluation of an individual's progress in learning. Journal of Educational Psychology, 71, 866-881.
Salemi, M. K., & Tauchen, G. E. (1982). Estimation of nonlinear learning models. Journal of the American Statistical Association, 77, 725-731.
Lorrie Shepard (1991). Psychometricians' beliefs about learning. Educational Researcher, 20, 2-16.
p. 9: Conclusion: Implications for Measurement Practice
Three main points are made in the respective sections of this article:
1. On the basis of qualitative analysis of interview data from a representative sample of 50 district testing directors, it is asserted that approximately half of all measurement specialists operate from implicit learning theories that encourage close alignment of tests with curriculum and judicious teaching of tested content.
2. These beliefs, associated with criterion-referenced testing, derive from behaviorist learning theory, which requires sequential mastery of constituent skills and behaviorally explicit testing of each learning step.
3. The sequential, facts-before-thinking model of learning is contradicted by a substantial body of evidence from cognitive psychology.
My argument is that hidden assumptions about learning should be examined precisely because they are covert. What we believe about learning and the intended effect of testing on learning should be considered directly, not "smuggled in" by the adoption of a popular test theory. What measurement specialists believe about learning does shape practice, including instructional practice. Although we have formal theories about test validity and formal means to evaluate how technical decisions affect the meaning of test scores, we do not have explicit ways to examine and debate our understandings of learning theory. Left unexamined, it is possible for a 30-year-old theory still to have a pervasive influence. Note that in selecting quotations to characterize the behaviorist position in Appendix B, I purposely chose examples from Glaser's Individually Prescribed Instruction and Resnick's earlier works. Their work in the 1980s is nearly a repudiation, certainly a significant transformation, of their earlier understandings. They have changed, but we have not, primarily because measurement specialists are no longer psychologists conversant with changes in learning theories. Thus, I propose that we engage in formal debate about our theories and expectations for the effects of tests and that we consider the empirical evidence of these effects.
The measurement community is under attack because of the negative effects of high-stakes standardized testing inaugurated by educational reform. There has been a tendency to respond defensively, as evidenced by a counterattack on performance assessment in Education Week (Rothman, 1990) and at a recent conference a between-sessions joke mocking authentic assessment as "measurement-free" assessment. Although I agree with some of my colleagues' fears about overly ambitious claims for authentic assessment - for example, the claim by some that performance tasks are incorruptible in high-stakes contexts - there is the danger that an entrenched technical community will be unable to respond thoughtfully to legitimate criticisms of current tests.
In the Education Week article, measurement specialists asserted that performance assessments are less reliable and less valid than traditional tests and that they are potentially biased because they rely on fewer tasks, they disputed as unproved the belief that performance assessments will improve the teaching of higher order skills. Why are existing tests presumed to have the high ground in this dispute? What claim do traditional tests have to validity other than the logic of test development and actuarial correlations? Is there empirical evidence to establish the similarity in cognitive processes between multiple-choice test responses and criterion performances? What bias is introduced by asking decontextualized questions rather than having children read aloud and retell a story? If examined critically, current measurement technology rests on assumptions that are no more proved than the assertions in favor of performance assessment.
This article is an exercise in making implicit beliefs explicit so that they become available for debate and evaluation. Although differences in beliefs about learning are not the only theoretical differences or set of hidden assumptions dividing the measurement community, an understanding of learning theory is fundamental to evaluating evidence of testing effects (Is it good or bad that children spend more instructional time in drill-and-practice activities?) and therefore to framing validity investigations. (...) Furthermore, because validity includes not only the consequences of test use but also the meaning of test scores, the hypothesized effects should be systematically investigated. For example, if assessments are intended to guide instruction, then it should be demonstrable that classroom instruction for individual students is different and more effective than it would have been without the assessment information. If accountability assessments are intended to redirect instruction, then it should be possible to document whether students spend more time writing, do more extended projects, are engaged in nonalgorithmic problem solving, and so forth. These kinds of studies will be required to establish the validity of individual performance assessments, not because they are more suspect than traditional tests but because these types of investigations should have been undertaken long ago to support the use of current tests.
Robert S. Siegler (2000). Unconscious insights. Current Directions in Psychological Science, 9, 79-83. pdf
Simon, Herbert A. (1974). How big is a chunk? Science, 183, 482-488. Reprinted in Simon (1979) Models of thought. Yale University Press.
Simon, H. A. (1966). A note on Jost's Law and exponential forgetting. Psychometrika, 31, 505-506.
Simonton, Dean Keith (2003). Scientific creativity as constrained stochastic behavior: The integration of product, person, and process perspectives. Psychological Bulletin, 129, 475-494.
Sternberg, R. J., and J. E. Davidson (Eds.) (1995). The nature of insight. Cambridge, Massachusetts: The MIT Press.
Patrick Suppes, Elizabeth Macke and Mario Zanotti (1978). The role of global psychological models in instructional technology. In Robert Glaser: Advances in instructional psychology, volume 1 (pp. 229-259). Hillsdale: Lawrence Erlbaum. pdf scan
Thomas W. Malone, Patrick Suppes, Elizabeth Macken, Mario Zanotti, and Lauri Kanerva (1979). Projecting student trajectories in a computer-assisted instruction curriculum. Journal of Educational Psychology, 71, 74-84. pdf
Thomas W. Malone, Elizabeth Macken and Patrick Suppes (1979). Toward optimal allocation of instructional resources: dividing computer-assisted instruction time among students. Instructional Science, 8, 107-120. pdf
Tucker, L. R. (1966). Learning theory and multivariate analysis: illustration by determination of generalized learning curves. In Cattell, R. B., Handbook of multivariate experimental psychology, 476-501.
Vaart, H. R. van der (1953). Adult age. An investigation based on certain aspects of growth curves. Leiden: Brill. Doctoral dissertation, Leiden University. Also: Acta Biotheoretica, X, 139-212.
Wilbrink, Ben (1977). Verborgen vooroordeel tegen andere dan meerkeuzevragen [Hidden bias against other than multiple-choice questions]. In Stichting Onderwijsresearch: Congresboek Onderwijs Research Dagen p. 219-222. [72k html]
Ken Kelley and Scott E. Maxwell (2008). Delineating the average rate of change in longitudinal models. Journal of Educational and Behavioral Statistics, 33, 307-332. (I have a pdf, but must keep the photocopy because the pdf does not render the formulas well)
Willett, J. B., & Sayer, A. G. (1994). Using covariance structure analysis to detect correlates and predictors of individual change. Psychological Bulletin, 116, 363-381.
M. R. Wilson (1989). Saltus: A psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105, 276-289.
G. N. Masters (1982). A Rasch model for partial credit scoring in objective tests. Psychometrika, 47, 149-174.
http://www.benwilbrink.nl/projecten/spa_learning.htm