Merit ranking: How fair can it be?

An investigation

Ben Wilbrink

Merit is almost never a simple thing. It is a combination of effort, talent, luck, circumstance, sacrifice, nature and nurture. Above all, merit is contingent. Whatever it may be, it is a combination of many variables, and it is in the eye of the beholder. Merit ranking, therefore, is a very, very tricky business. One butterfly over Colorado will cause the ranking in Amsterdam to come out differently. Rankings are like the weather, which in the case of rankings is to say that they cannot be consistent at all. Of course this is not news; we have taken our precautions against inconsistencies showing up too conspicuously. For one thing, time measurement in major sports events is ridiculously precise, effectively shutting out one factor of luck: the jury members' verdicts. As long as supporters want to believe time to be the real measure of merit in the 100 yards, in this particular race, that's fine with me. Supporters believing the same thing about grades in education is something else altogether. I would like to find out the connection between grading in education and the impossibility of merit ranking ever being consistent, or at least fully consistent (is there a difference, then? I have always learned that inconsistent preferences at the races will cost you money; nothing here is 'fully'; 'foolly' maybe?).

At the start of this project, I do not know what it will lead to, what its outcome will be. It has been triggered by reading an old copy of Vassiloglou and French (1982) and a chapter from Rawls' dissertation (1953), and by a discussion with a friend about whether even the smallest differences in merit justify winners taking all. This looks like the working of serendipity, yet there is some merit involved too: human capital amassed in years of struggling with problems in educational assessment, together with knowing the literature on, for example, decision-making, makes me intuit that the theme will turn out to be highly interesting and of consequence for the design of achievement test items and for the theory of strategic preparation for achievement tests.

Assuming all grading of students in essence to be ranking, a number of powerful theories can be used to analyze what is happening in assessment in our schools and universities. The more so if one remembers that ranking of options is exactly what most of decision-making is about (for example, Lichtenstein and Slovic, 2006).

In order to get some distance from the existing literature on assessment, I will call the object of what is to be ranked 'merit.' Merit or achievement, it does not matter. At least the concept of merit does not have all the connotations that educational achievement has.

I use the term 'grading' or valuation on the assumption that the student's work has already been scored for correctness; these scores, or a total score, then have to be translated into a rank or a grade or whatever. Scoring is a process dictated by whatever counts as correct in the particular discipline. Scoring is nearly the same as providing immediate feedback to the student. My own work that is directly related to scoring is on the design of achievement test items. Grading is something else altogether: it places the student's achievement on a particular scale or in a particular frame of reference, more often than not consisting of the achievement of a reference group. This 'reference group' might be anything, from the intuition of a teacher with long years of experience to the randomly sampled reference group used in the development of an intelligence test.

My position is that grading is a form of ranking; in fact, grading has evolved from the educational practice of ranking, see my (1997). If you do not concur, then for the sake of argument assume this to be the case.

A convincing argument may be found in recent work of Ronald Giere, the philosophical position he calls perspectivism, a position in between or combining that of objectivism/realism and subjectivism/constructivism. In his (2006) the example is the case of color vision. Colors as such do not exist in the physical world, yet color vision is based on physical phenomena, specific physiology, constructivist neuropsychology, and of course culture and language.

Now apply this perspectival analysis to what may be called merit vision. Or rank vision. Grading educational achievements surely is not as thoroughly physical as color vision is. How much more involved, tricky, and complex than color vision it must be! If not outright subjective, it surely is highly perspectival and at least somewhat arbitrary. How, then, could its outcomes ever be called fair? Or is this only a societal myth, 'myth' in the sense in which Alexander Astin talks about societal myths of selection processes? The link to the work of Astin (university admissions in the US) is highly relevant, because selective processes feed on merit rankings on the basis of GPAs, etcetera.

Giere (2006)

My intention is to gain better insight into the processes of valuation and grading in education, insight that probably will not in itself lead to better methods or procedures. My feeling at the beginning of this project is that at least this exercise will teach us to be very, very humble in doing our daily jobs in assessing students. The fear is that Arrow's Impossibility Theorem applies to the kind of decision-making that we call assessment: it is not possible to be consistently fair in the grading (of work) of students.

Number of wins (out of four comparisons) of candidate i (rows) over j (columns)

                   Candidate
              1    2    3    4    5    Row-
            _________________________  sums
            |                       |
          1 | -    2    4    1    2 |   9
          2 | 2    -    1    1    3 |   7
Candidate 3 | 0    3    -    3    4 |  10
          4 | 3    3    1    -    2 |   9
          5 | 2    1    0    2    - |   5
            |_______________________|

Vassiloglou and French (1982, p. 191)

An example may make clear what this is all about. The table above shows in every cell how many times out of four the row candidate or item has been ranked above the column candidate. Are the row sums sufficient for the overall ranking? Wood and Wilson (1980) say they are. Then along come Vassiloglou and French, who construct the following table by deleting candidate 3 from the one above.

                   Candidate
              1    2    4    5    Row-
            ____________________  sums
            |                  |
          1 | -    2    1    2 |   5
Candidate 2 | 2    -    1    3 |   6
          4 | 3    3    -    2 |   8
          5 | 2    1    2    - |   5
            |__________________|

Vassiloglou and French (1982, p. 191)

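The amazing result is that the ranking of the four remaining candidates (1, 2, 4 and 5, in that order) has changed from 1-3-1-4 to 3-2-1-3. In the full table, candidates 1 and 4 tie for first place among these four (row sums of 9), ahead of candidate 2 (7) and candidate 5 (5); with candidate 3 deleted, candidate 4 comes first (8), candidate 2 second (6), and candidates 1 and 5 tie for third (5). That is not a nice result, because leaving out one candidate should not change the ranking of the others relative to each other; Vassiloglou and French term this the principle of the independence of irrelevant alternatives.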
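
The row-sum comparison can be checked mechanically. The following is a minimal sketch in Python, using only the win counts from the two tables; the function names (row_sums, competition_ranks) are illustrative choices of mine, and ties are assumed to share the best rank, as in the 1-3-1-4 notation.

```python
# wins[i][j]: number of times (out of four comparisons) candidate i
# was ranked above candidate j, copied from the five-candidate table.
wins = {
    1: {2: 2, 3: 4, 4: 1, 5: 2},
    2: {1: 2, 3: 1, 4: 1, 5: 3},
    3: {1: 0, 2: 3, 4: 3, 5: 4},
    4: {1: 3, 2: 3, 3: 1, 5: 2},
    5: {1: 2, 2: 1, 3: 0, 4: 2},
}


def row_sums(wins, candidates):
    """Total wins of each candidate over the other candidates considered."""
    return {i: sum(wins[i][j] for j in candidates if j != i) for i in candidates}


def competition_ranks(scores):
    """Rank 1 = highest score; tied candidates share the best rank (assumption)."""
    return {i: 1 + sum(1 for s in scores.values() if s > scores[i]) for i in scores}


all_five = [1, 2, 3, 4, 5]
without_3 = [1, 2, 4, 5]

# Row sums with all five candidates present.
sums_all = row_sums(wins, all_five)

# Relative ranking of candidates 1, 2, 4, 5 while candidate 3 is still counted.
ranks_before = competition_ranks({i: sums_all[i] for i in without_3})

# Row sums and ranking after candidate 3 has been deleted from the table.
sums_without = row_sums(wins, without_3)
ranks_after = competition_ranks(sums_without)

print("row sums, five candidates:", sums_all)
print("ranks of 1, 2, 4, 5 before deletion:", ranks_before)
print("row sums, candidate 3 deleted:", sums_without)
print("ranks of 1, 2, 4, 5 after deletion:", ranks_after)
```

Nothing about the pairwise comparisons among candidates 1, 2, 4 and 5 changes between the two computations; only the wins against candidate 3 are dropped, and that alone is enough to reorder the remaining four.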

In case you did not notice: these data could have come from the ranking of five examinees on each of four essays comprising a four-essay examination. Or whatever else you might construct it to be.

The above example is disturbing. Surely it is artificial, because generally essays are not rated or ranked in this way. But then, if our assessment tradition had been otherwise, they would be, wouldn't they? There seems to be at least some arbitrariness in the way we rank or grade essays, students, sportsmen, bets, or whatever.

At the start of this project, February 2008
I will start with a quick inventory of what I see as the problem here, how it fits into some of my own projects, and what the key publications will probably be. These notes were first drafted in Dutch, to be really fast.

In our society, rankings on the basis of merit occur in many places. This is evident in sport, where the competitive element of ranking is the whole point. It is evident in the labor market, where the demand side (employers) in principle decides for itself whom to hire and whom not. It is evident in education, although there it is covered up more than it is in the labor market.
It is not self-evidently so in all kinds of markets: the working of markets is not the working of merit. Having market power does not by definition rest on merit, and is not by definition merit.

It is therefore not without importance that this ranking on merit makes sense, that it is fair in principle.

The idea now is that this fairness is put in jeopardy if it can be shown that ranking on merit cannot, in principle, be done consistently. I suspect that something like this has already been shown by Arrow; that is probably what his Impossibility Theorem is about. The only exception may be the situation in which merit is truly one-dimensional; I believe that this does not occur in brute reality, not even in the world of sport.

Suppose that Arrow is right, and that his result also applies to assessment in education. What situation are we in then? Perhaps that is less transparent than first trying an application to an artificial world: sport, with its rules that bring about a certain one-dimensionality. Take the hundred-metre sprint. The criterion is fairly unambiguous: whoever crosses the line first wins. But what about combining a series of different races? Which runner has then been the 'best' over that series of races?

The conclusion will very quickly be that in daily life awarding merit implies a comparison with others, and that we use certain procedures for this that yield the appearance of a certain one-dimensionality, thereby covering up the essential problem of inconsistency. (Tournaments, leagues, knockout competitions, the whole caboodle. An incredible number of variants are possible, and I assume there is a great deal of literature on them.)
Apparently, collectively we usually do not mind this; we can live with it, change the procedures now and then, and that is that. We do no soul-searching research into the consequences of repressing the fact of this inconsistency in principle.

What new insights might such soul-searching research yield? Could it result in reassurance, in the sense that there certainly are inconsistencies, so that with slightly different procedures other winners would emerge, but that the group of winners, or of those who might just as well have been winners, is nevertheless truly a group that stands apart from the larger rest? Could it be that the inconsistency in principle is limited to a certain fuzziness in the rankings, which could not be essentially different even if all possible variants were tried? Who can be convinced of that beforehand?

My conjecture, and this is what the search will mainly be about, is that demonstrating essential inconsistency means, just as it does in decision theory, that in a game situation you will always lose to a rational opponent. It costs money to enter into bets on the basis of inconsistent preferences, of inconsistent rankings. Does it then also follow that society pays a price when it works with inconsistent preferences in its rankings of merit?
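
The decision-theoretic point about losing to a rational opponent is usually illustrated with a so-called money pump. What follows is a toy sketch in Python, not a model of grading: the three items, the cyclic preferences among them, and the fee of one unit per swap are all hypothetical assumptions chosen for illustration.

```python
# Hypothetical cyclic (inconsistent) preferences: A over B, B over C, C over A.
# Each pair (x, y) means "x is preferred to y".
prefers = {("A", "B"), ("B", "C"), ("C", "A")}


def money_pump(start_item, fee, rounds):
    """A rational opponent keeps offering the item that the agent prefers
    to its current one; the agent pays `fee` for every swap it accepts."""
    holding, paid = start_item, 0
    for _ in range(rounds):
        # The unique item preferred over the one currently held.
        better = next(x for (x, y) in prefers if y == holding)
        holding = better
        paid += fee
    return holding, paid


holding, paid = money_pump("A", fee=1, rounds=3)
print(holding, paid)
```

After three swaps the agent holds the item it started with and is three fees poorer; with transitive preferences the opponent would run out of acceptable offers.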

A few numerical examples could bring the matter to a head right away. Those examples must exist or be constructed, for if they cannot be constructed, then apparently there is no essential inconsistency. Wood and Wilson give a numerical example, and Vassiloglou and French take it further. So I will first work intensively on those numerical examples, because I want to open this web page with a few of them, so that everyone is immediately paying attention. I must make clear at once that the question at issue here is not an academic one, but one that matters in our education, in the labor market, in sport. (It would be nice if I could make a connection with Hofstee's exercises on jury judgments.)

This ranking is not only about who comes out as the 'best': there can be quite a few special points on the scale where it matters whether you end up just below or just above them, in short, everywhere a particular cutoff score is used. The cutoff itself is usually inherently relative as well, but it can be locally absolute, for example because it has been promised or laid down beforehand what the cutoff will be, regardless of the results, etcetera.

Is every difference in merit sufficient for any difference in treatment whatsoever?

It is open to discussion how, for example, John Rawls in his Theory of Justice deals with the question whether differences in merit, however small, justify any difference in treatment, however large. Put this way, it should be clear that Rawls will never have claimed that. What, then, is his position on this? (His first chapter in the volume of collected papers, I believe points vi and vii there, where vi states a principle of proportionality, and vii the principle that more merit yields a claim.)

• winner takes all?
• proportionality
• democratic distribution (winners respect the losers and therefore also do what is needed for them)

What can influence a ranking?

Tests
• which questions from the collection have been presented
• who the other participants in this test are
• whether the results of a particular participant are included or not
• what happens to questions that afterwards turn out to be unsound
• which type of question is used (multiple choice, essay, etc.)
• who designed the questions / what the quality of the questions is
• how the questions are distributed over the subject matter
• how the questions are distributed over forms of mastery (reproduction, application, etc.)
• what the circumstances of the test administration are
• what condition the participants are in
• how the separate answers are judged/scored/valued, by whom, etc.
• how the judgments, marks, etc. for separate questions are combined into a final judgment, final valuation, or final grade

Some of the points above could be brought under the operational definition of merit for that subject and that test.

With the ranking methods in Wood and Wilson, I have the idea that with very unreliable tests or assessments you may well get ranking tables that look almost the same as those from very reliable assessments. (For example, by taking a number of pupils or simulated tests based on the same mastery, versus pupils/tests randomly drawn from a broader beta distribution for mastery.) I would like to investigate this; it would be nice if I could include it in the spa_model, because it seems to me that I can do a lot more with these methods than authors such as Wood and Wilson, or Vassiloglou and French, have already worked out. But beware: it is quite possible that a great deal more has recently been achieved in the study of tournaments and/or in social choice theory. So I also need to do a quick literature survey, starting from the set of publications around the work of Wood and Wilson, and of Vassiloglou and French.
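
A first exploration of this idea could look like the sketch below. It is only one possible set-up, and every number in it is an assumption of mine: five pupils, four test components of 20 items each, a common mastery of 0.7 for the 'unreliable' case, and a Beta(7, 3) distribution for the 'spread' case. Component scores are simulated as numbers correct, and a tie on a component counts as a win for neither pupil. It does not use or represent the spa_model.

```python
import random


def simulate_win_table(masteries, n_components=4, n_items=20, rng=None):
    """Simulate number-correct scores per component and count, for every
    pair of pupils, on how many components i scored strictly higher than j."""
    if rng is None:
        rng = random.Random()
    n = len(masteries)
    # scores[c][i]: number-correct score of pupil i on component c
    scores = [
        [sum(rng.random() < m for _ in range(n_items)) for m in masteries]
        for _ in range(n_components)
    ]
    wins = [[0] * n for _ in range(n)]
    for component in scores:
        for i in range(n):
            for j in range(n):
                if i != j and component[i] > component[j]:
                    wins[i][j] += 1
    return wins


rng = random.Random(2008)  # fixed seed so that a run can be repeated

# Case 1: all pupils have the same mastery, so every difference is noise.
same_mastery = [0.7] * 5
# Case 2: masteries drawn from a broader beta distribution (assumed shape).
spread_mastery = [rng.betavariate(7, 3) for _ in range(5)]

table_same = simulate_win_table(same_mastery, rng=rng)
table_spread = simulate_win_table(spread_mastery, rng=rng)

row_sums_same = [sum(row) for row in table_same]
row_sums_spread = [sum(row) for row in table_spread]

print("same mastery, row sums:  ", row_sums_same)
print("spread mastery, row sums:", row_sums_spread)
```

Repeating both cases many times with different seeds, and comparing the distributions of row sums, would show how often a pure-noise table is indistinguishable from one that reflects real differences in mastery.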

Meritocracy

Young, M. (1958). The rise of the meritocracy, 1870-2033: An essay on education and equality. London: Thames and Hudson.

Gascoigne, J. (1984). Mathematics and meritocracy: the emergence of the Cambridge Mathematical Tripos. Social Studies of Science, 14, 547-584.

Ralf Dahrendorf (2005). The rise and fall of meritocracy. Commentary. Project Syndicate. www.project-syndicate.org html

Lemann, Nicholas (1999). The big test. The secret history of the American meritocracy. New York: Farrar, Straus and Giroux.

Kenneth Arrow, Samuel Bowles and Steven Durlauf (Eds) (2000). Meritocracy and Economic Inequality. New Delhi, Oxford University Press.

Jerome Karabel (2005). The chosen. The hidden history of admission and exclusion at Harvard, Yale, and Princeton. Boston: Houghton Mifflin. The last paragraph The dark side of meritocracy from the last chapter The battle over merit pdf (on another website of mine). For some citations see here

Ben Wilbrink (1997). Terugblik op toegankelijkheid: meritocratie in perspectief. In Marian Van Dyck, Toegankelijkheid van het Nederlandse onderwijs. Studies (p. 341-384). Den Haag: Onderwijsraad. html

Historical examples

Howard Machin and Vincent Wright (1989). Les élèves de l'école Nationale d'Administration de 1848-1849. Revue de l'histoire moderne et contemporaine, 36, 605-639.

Fine examples of excellent careers linked to position in the competitive entrance examination.
p. 607: The examination for the first intake of students took place in May and June 1848. The effective number of candidates was 865; 70 others had registered but withdrew before the day of the competition. [see note 6 on the different numbers given by various authors] There were thus many candidates, which testified to the hopes raised by the School and to the popularity of state service among ambitious young Frenchmen. Among them were representatives of some of the great administrative and political dynasties of France [note 7 gives details]. The examination comprised two stages: an eligibility test (intended to eliminate) and a final admission test (intended to select). As in all French competitive examinations, the names of the successful candidates were published in order of merit.
p. 608: Only 152 (and not 200, as had been planned at the outset) of the 685 candidates were finally chosen. There follows an overview of the students' social backgrounds.
p. 637: It has been argued that mid-nineteenth-century France was a country run by an elite, but that this elite was not completely closed. ... However, intellectual knowledge, although important and even necessary in certain cases, was not a sufficient condition in France for success in an administrative career, the most coveted of all careers. In the competition among the very numerous law graduates, the rich, the privileged, and the well-connected young men generally prevailed. It had taken many revolutions to open the administration, but only in an ultimately temporary way, to new men. After each revolution, the traditional system progressively re-established itself. The conception of the French administration as a 'career open to all talents' remained, and remains today, an ideal preached by many, cherished by some, but put into practice only by a very small number.

Literature

Ronald N. Giere (2006). Scientific perspectivism. The University of Chicago Press.

Table of contents pdf
Read chapter one pdf

D. H. Krantz, R. D. Luce, P. Suppes, and A. Tversky (1971/2007). Foundations of Measurement Volume I: Additive and Polynomial Representations. Dover (reprint appearing January 30, 2007).

Sarah Lichtenstein and Paul Slovic (Eds) (2006). The construction of preference. Cambridge University Press contents.

Percy B. Lehning (2006). Rawls. Lemniscaat.

A thoroughly prepared introduction to Rawls, his life and his work.
Published simultaneously with the translation (by Bestebreurtje) of his 1971 Theory of justice.

Serena Olsaretti (Ed.) (2003). Desert and justice. Oxford University Press.

A number of pages are available on http://books.google.nl/. Even from the introductory chapter a few pages have been left out. Is this Oxford UP policy?

John Rawls (1993). Political liberalism. New York: Columbia University Press.

John Rawls (1999). The law of peoples. With "The idea of public reason revisited." Cambridge, Massachusetts: Harvard University Press. David Gordon review - pdf

John Rawls (2001). Justice as fairness. A restatement. Belknap Harvard University Press.

John Rawls (1971/2006). Een theorie van rechtvaardigheid. Transl. Frank Bestebreurtje. Lemniscaat. Review by Enno de Wit.

John Rawls (2001). Lectures on the history of moral philosophy. Harvard University Press.

Marilena Vassiloglou and Simon French (1982). Arrow's theorem and examination assessment. British Journal of Mathematical and Statistical Psychology, 35, 183-192.

abstract Usually in examinations an overall assessment of a candidate's performance is made by means of a weighted sum of the marks attained on the various components. However, recently it has been suggested that the combination should be based on the candidate's rankings on the components alone, and not on the actual marks. This paper discusses whether such an approach can lead to a fair and consistent system of assessment.
Problematic is the authors' opinion that ranking methods do not make use of differences in strength. That simply is not true; I expect them to correct their opinion in a later publication. The point simply is this: the greater the strength of a particular difference in rank, if one can say such a thing, the more other ranks will be in the range of that difference. I will have to study Krantz, Suppes, Luce and Tversky on points like this. b.w.

Ben Wilbrink (1997). Assessment in historical perspective. Studies in Educational Evaluation, 23, 31-48. html

Robert Wood and Douglas T. Wilson (1980). Determining a rank order when not all individuals are assessed on the same basis. In L. J. Th. van der Kamp, W. F. Langerak and D. N. M. de Gruijter (Eds): Psychometrics for educational debates (p. 207-230). Wiley.

There is an important criticism of this ranking method, presented in Vassiloglou and French (1982).

Literature on weighting

Marilyn W. Wang and Julian C. Stanley (1970). Differential weighting: a review of methods and empirical studies. Review of Educational Research, 40, 663-705.

Robyn M. Dawes (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34, 571-582.

abstract Proper linear models are those in which predictor variables are given weights in such a way that the resulting linear composite optimally predicts some criterion of interest; examples of proper linear models are standard regression analysis, discriminant function analysis, and ridge regression analysis. Research summarized in Paul Meehl's book on clinical versus statistical prediction - and a plethora of research stimulated in part by that book - all indicates that when a numerical criterion variable (e.g., graduate grade point average) is to be predicted from numerical predictor variables, proper linear models outperform clinical intuition. Improper linear models are those in which the weights of the predictor variables are obtained by some nonoptimal method; for example, they may be obtained on the basis of intuition, derived from simulating a clinical judge's predictions, or set to be equal. This article presents evidence that even such improper linear models are superior to clinical intuition when predicting a numerical criterion from numerical predictors. In fact, unit (i.e., equal) weighting is quite robust for making such predictions. The article discusses, in some detail, the application of unit weights to decide what bullet the Denver Police Department should use. Finally, the article considers commonly raised technical, psychological, and ethical resistances to using linear models to make important social decisions and presents arguments that could weaken these resistances.
p. 571: "Paul Meehl's (1954) book Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence appeared 25 years ago. It reviewed studies indicating that the prediction of numerical criterion variables of psychological interest (e.g., faculty ratings of graduate students who had just obtained a PhD) from numerical predictor variables (e.g., scores on the Graduate Record Examination, grade point averages, ratings of letters of recommendation) is better done by a proper linear model than by the clinical intuition of people presumably skilled in such prediction. The point of this article is to review evidence that even improper linear models may be superior to clinical predictions. A proper linear model is one in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. Simple regression analysis is the most common example of a proper linear model; the predictor variables are weighted in such a way as to maximize the correlation between the subsequent weighted composite and the actual criterion. Discriminant function analysis is another example of a proper linear model; weights are given to the predictor variables in such a way that the resulting linear composites maximize the discrepancy between two or more groups. Ridge regression analysis, another example (Darlington, 1978; Marquardt & Snee, 1975), attempts to assign weights in such a way that the linear composites correlate maximally with the criterion of interest in a new set of data."
p. 580: "When I was at the Los Angeles Renaissance Fair last summer, I overheard a young woman complain that it was 'horribly unfair' that she had been rejected by the Psychology Department at the University of California, Santa Barbara, on the basis of mere numbers, without even an interview. 'How can they possibly tell what I'm like?' The answer is that they can't. Nor could they with an interview (Kelly, 1954). Nevertheless, many people maintain that making a crucial social choice without an interview is dehumanizing. I think that the question of whether people are treated in a fair manner has more to do with the question of whether or not they have been dehumanized than does the question of whether the treatment is face to face. (Some of the worst doctors spend a great deal of time conversing with their patients, read no medical journals, order few or no tests, and grieve at the funerals.) A GPA represents 3½ years of behavior on the part of the applicant. (Surely, not all the professors are biased against his or her particular form of creativity.) The GRE is a more carefully devised test. Do we really believe that we can do a better or a fairer job by a 10-minute folder evaluation or a half-hour interview than is done by these two mere numbers? Such cognitive conceit (Dawes, 1976, p. 7) is unethical, especially given the fact of no evidence whatsoever indicating that we do a better job than does the linear equation. (And even making exceptions must be done with extreme care if it is to be ethical, for if we admit someone with a low linear score on the basis that he or she has some special talent, we are automatically rejecting someone with a higher score, who might well have had an equally impressive talent had we taken the trouble to evaluate it.)
No matter how much we would like to see this or that aspect of one or another of the studies reviewed in this article changed, no matter how psychologically uncompelling or distasteful we may find their results to be, no matter how ethically uncomfortable we may feel at 'reducing people to mere numbers,' the fact remains that our clients are people who deserve to be treated in the best manner possible. If that means - as it appears at present - that selection, diagnosis, and prognosis should be based on nothing more than the addition of a few numbers representing values on important attributes, so be it. To do otherwise is cheating the people we serve."
Dawes (2002). The ethics of using or not using statistical prediction rules in psychological practice and related consulting activities. Philosophy of Science, 69, S178-S184. probably available on www: doc

February 22, 2008 \ contact ben at at at benwilbrink.nl

http://www.benwilbrink.nl/projecten/meritranking.htm