Elitism and High-Stakes Examinations

Part 4 of A Mathematical Manifesto
by Ralph Raimi for NYC HOLD
October, 2002

4. Elitism and High-Stakes Examinations

The Federal government in legislation of the past ten years has been making a commendable effort to induce the States to write Standards for instruction in English, mathematics, history, science and so on, on the basis of which statewide examinations can be composed to ascertain whether their schools are accomplishing what the public wants. School education is in principle still a local or State responsibility, but it is becoming ever more clear that the nation as a whole has an interest in the results, much as it has a national and not merely local interest in containing communicable diseases.

This effort being recent, except for those few states which have had a statewide examination system for many years, the results of recent federal policy in this regard have yet to be seen; but we can see already some of the difficulties. The most important difficulties are associated with a desire to "leave no child behind", which in terms of examinations is mostly construed as meaning "so that everybody passes". States which have written exacting standards and examinations have been finding an excessive, or embarrassing, number of "failures", while states with the foresight to make their standards vague and their examinations trivial are rewarded with a high apparent success rate. Politics seems to dictate the desire to look good while hiding the real failures, whether by inflating the scores, by calling many clearly failing students "learning disabled" and so not subject to the examinations that count, or by gradually eroding the originally exacting standards so that these devices are not necessary. (The roster of the "learning disabled" has expanded marvelously in the past five years, in an epidemic rivaling that of polio in the time before Salk and Sabin.)

We do have at least one "national" examination, the NAEP, which so far has been used only statistically, to compare state with state and year against year, but not to decide who "passes" or "fails" in some absolute sense, in the way that, for example, the New York Regents Mathematics A examination decides who may not get a diploma. Now, any mathematician studying the questions asked by NAEP is immediately struck by the low level of mathematical knowledge and understanding it asks for. Furthermore, the fact that its multiple-choice questions offer only four choices means that guessing at answers will yield 25% correct on average; and yet on many of its questions the percentage answering correctly is really not much more than that, sometimes 30%, so that it is hard to distinguish those who know a little from those who know nothing. Especially at the higher end of the scale, the NAEP for mathematics, by not asking anything exacting, not even at the level nominally prescribed by some of the weaker State standards, does not permit superior students to exhibit their superior scholarship. Knowing only half of what a State's standards might demand may well be sufficient for a 100% score on NAEP.
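The arithmetic behind the guessing remark can be made concrete with a small sketch. It rests on an assumption that is not in the text above: that every examinee who does not know the answer guesses uniformly at random among the choices. The function name is hypothetical, chosen only for illustration.

```python
def estimated_fraction_who_know(observed_correct, num_choices):
    """Estimate the fraction of examinees who actually knew the answer.

    Assumption (illustrative only): everyone who does not know the answer
    guesses uniformly at random among the offered choices.
    """
    chance = 1.0 / num_choices
    # observed = knowers + (1 - knowers) * chance, solved for knowers
    knowers = (observed_correct - chance) / (1.0 - chance)
    return max(0.0, knowers)  # results at or below chance are clamped to zero


if __name__ == "__main__":
    for observed in (0.25, 0.30, 0.50):
        share = estimated_fraction_who_know(observed, num_choices=4)
        print(f"{observed:.0%} correct -> about {share:.1%} knew the answer")
```

Under this assumption, a four-choice question answered correctly by 30% of examinees was actually known by only about one examinee in fifteen; the rest of the correct answers are what guessing alone would produce.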

A feeble examination such as the NAEP will distinguish better from worse, or 2001 from 2002, but it will not, despite its classifications of "proficient" and the like, tell us if students this year are doing well. Still, most State education departments in composing their own examinations or purchasing commercially published evaluation programs, tend to follow this model, apparently to make sure that enough students pass. And it must also be said that some of these State standards and consequent examinations are composed by people with little understanding of mathematics, even on the elementary level, and are phrased in ambiguous wording, or ask questions not worth asking. And those containing a large number of "extended-reply" questions, or "situated performance" questions, are unreliable, for the interpretation of the answers is subject to enormous variations in the knowledge or assiduity of the graders, as has been demonstrated by much recent experience. "Performance" tests in mathematics, moreover, cannot by their nature examine most of what school mathematics must teach.

Weak examinations not only fail to define levels of accomplishment very well, but discourage teaching. Students who are "good at math" are neglected; they learn all they need to know to succeed early in the semester, if not at home or in earlier grades, and are encouraged to tune out on further study by the boredom of hearing in class again and again what they already know, or sometimes what they know to be trivial and even false. The best students, if they pay attention, will know how to produce the silly answers demanded by silly examinations, and will earn praise for it, but we all know mavericks who refuse to play such games, not realizing their apparent backwardness will in fact injure their later education.

Which of us doesn't know intelligent people who could have done more with their intellectual lives had they been provided less repugnant guidance, and who today say (with some pride, alas) that they were "never any good at math"?

The problem is, what devices will encourage better and more demanding teaching of the able students, while not leaving behind those less able, or less interested, or less well prepared by home or earlier teaching? Throughout most of human history, societies were not reluctant to cut short the answer: they were willing to call some children "smart" and others "dumb", and let the dumb ones drop out early and become laborers who didn't even need to be able to read, let alone use mathematics usefully, while school teachers could concentrate on the apparently superior students, bringing up an elite to run the country, its industries, its government, and its intellectual life. We expect more from our schools than this.

Not only is it unfair to dismiss the "lower half" in our schools, it is often a misjudgment to suppose that the reluctant, scornful or bored student is in fact one of the "dumb" ones, unteachable. To dismiss such students as dropouts is to lose a valuable resource as well as to create an unnecessary underclass, all the result of making mistaken judgments in the schools. Poor examinations and watered-down curricula lead to such mistaken judgments. How can we save the non-performing students and yet retain high standards and rich offerings for students who can take advantage of them? The problem is exacerbated by the pieties of the last century, which have simply denied any differences between children, and have insisted that everyone can learn equally well, and with the proper instruction will. Nobody says this about violinists and basketball players, but it is said about mathematics.

It is important for us to reverse this doctrine and admit that some will -- for whatever reason -- learn substantially less than others, no matter what we do. In a democratic and egalitarian society like ours, however, we must do this not by simply dismissing the "losers" but by making every effort to bring them to the highest level possible for them. It is in the statewide examination system that we can find the means to do so, or at least to make the diagnosis needed to address the problem properly. Hiding the differences by feeble examinations and low standards is cheating; it may be likened to two common images: putting a band-aid on a cancer, or shooting the messenger.

The messenger is essential if we are to defend the fortress; when the enemy is at the gates we must know it, even if the news is uncomfortable. On the other hand, there is no reason to retain the tradition that 90% is an A, 80% a B, 70% a C and so on. With exacting examinations it is quite possible to have scores run from 10 to 100, with 30 as a passing score. (Obviously 30 would not serve in a four-choice multiple-choice examination, but there is no reason not to offer ten choices.) Students do not all have to learn the same amount, and to go on to the next grade a certain minimum should be recognized as necessary, but the examination should permit us to recognize even four or five times that skill should someone have it.
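The parenthetical about four choices versus ten can be checked with a short binomial calculation. This is only a sketch under stated assumptions: a hypothetical 40-question examination, a passing score of 12 correct (30%), and an examinee who guesses every answer independently and uniformly at random. None of these figures comes from any actual examination.

```python
from math import comb


def prob_pass_by_guessing(num_questions, num_choices, min_correct):
    """Probability that pure guessing yields at least min_correct right answers.

    Assumption (illustrative only): each question is guessed independently,
    with probability 1/num_choices of being right (a binomial model).
    """
    p = 1.0 / num_choices
    return sum(
        comb(num_questions, k) * p**k * (1.0 - p) ** (num_questions - k)
        for k in range(min_correct, num_questions + 1)
    )


if __name__ == "__main__":
    # Hypothetical examination: 40 questions, 12 correct (30%) needed to pass.
    for choices in (4, 10):
        prob = prob_pass_by_guessing(40, choices, 12)
        print(f"{choices} choices: chance of passing by luck = {prob:.4f}")
```

With four choices, a passing score of 30% sits barely above the guesser's expected 25%, so luck alone passes a substantial share of those who know nothing; with ten choices the same passing score is three times the expected result of guessing, and passing by luck becomes rare.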

We must identify the "village Milton" if we can, while advancing the sturdy yeoman at the same time. Pretending there is little difference is hurtful to them both. We need examinations much more demanding at the high end than those we are seeing now, and we need minimal scores that are sufficient but not attainable by luck. There then comes a time, typically the beginning of high school, when as a matter of voluntary choice, the aspiring Miltons and Hilberts will find their way into the more demanding curriculum choices, much as is now done voluntarily (though with advice) in the colleges, while those of less accomplishment will have a path that can be navigated successfully even if they know a fraction of what is possible for some people in their age cohort.

We already do this with basketball players and musicians. Students who are not Varsity players are not forbidden to play baseball, or to run. We encourage them all, but do not hide their differences; why do we stop short in mathematics and history?

We must conceal nothing when it comes to measuring accomplishment, and we must make use of a wide differential in these measurements to make optimal future choices, or to give advice to students and parents by which they will make these choices of their own volition. Some few will fail, we know; indeed, some will drop out when they can, or even before, or go to jail for crimes. We must have the courage to recognize that not everyone will succeed, even if everyone can. We need not decide the difference is genetic, or caused by bad companions, or whatever "cause" is popular at the moment. It is results we measure, and these results, at each stage, are the diagnosis for the next application of educational expertise, according to the case. We must do what we can not to shame those who do not reach the highest ranks, but we already know that nonviolinists and non-Varsity athletes can sustain their "failure" without pain. Educating children to their demonstrated maximum should be no loss and no shame. The key is to get the demonstration to be fair and accurate.

There is one more advantage of "difficult" examinations that nobody seems to have noticed, and that is the incentive they give for superior performance. Good teachers have for generations given "extra credit" homework problems. They knew the students who did these problems were the ones already slated for grades of A, and so perhaps did the students, but the opportunity to "show your stuff" generates a bit of extra effort just the same, effort that otherwise would have no place in the culture of the school. The average student -- maybe even some A students -- will pay no attention to the "extra credit" problems; if they have better things to do, so be it. But there are those reached by this structure who would otherwise waste some of their time.

Such teachers -- and they are often superior teachers -- often deplore "high-stakes exams" as now given, saying that the assignment they are given towards the end of each year, to spend weeks reviewing those minimal topics that will raise the school's average on these tests, denies them the chance to do more than the dull minimum. They are right, but those critics of testing who deplore "teaching to the test" are really pointing to something other than testing as such. It is a poor test that rewards with higher scores the waste of part of a semester that could be spent in learning something more than the minimum, and it is such tests that are to be deplored, not testing as such.

It is the duty of the State, and a fortiori the teacher, to assign the "extra credit" that induces the able and interested student to do more than what is needed to go on to the next year's classes. The "high-stakes" examination is the proper vehicle for such encouragement, for it can include those incentive problems for which the better students will gladly study and yet, by setting the passing standard at a reasonable level, that same examination need not discourage the average student from his proper level of advancement.

It is not the "high-stakes" feature of the testing that is deplorable, for the State and student both need the information. What is deplorable is any implication that every student must be expected to learn everything on offer. Those who fear psychological damage to the "losers" are only guaranteeing the increase in the number of losers, even though at the same time these advocates of so-called equality look for other means of concealing the message. The message is real, the success and failure are real, and they always have been. Our task is to reduce the pain of what is inevitable, to be sure we do not actually generate failures that are not inevitable, and to maximize what our next generations will in fact accomplish. Students and parents will, on average, be prouder of real accomplishment, even modest accomplishment, than of fraudulent or empty certificates and honors. As with currency, inflation is no cure for poverty, though sometimes it hides poverty for a few weeks or years.

There is no way, for example, that a superior student can take pride in answering every question correctly on the NAEP examination. With wider-ranging tests, those who deserve honors will find an honorable way to show they deserve them, and an incentive to achieve them, yet without denigrating the accomplishments of the rest.

4. We therefore ask school mathematics programs to include examinations that reliably distinguish the widest possible range of accomplishment for the grade levels in question, but with most questions set at the "easy" or "average" level of difficulty while a few will stretch in difficulty to the truly demanding. Thus while the passing level need not be set particularly low, to distinguish the true failures from the sufficiently accomplished, very few students will score near the very top; those who do can then be distinguished without making the others look bad in terms of scores, but will still have had the incentive to do more than is common by the expectation that the examination offers them scope to exhibit their skill, which is hidden by most examinations given today. Examinations should sharply distinguish the failures from the passing by including mostly routine questions, but should also attract the attention of those teachers and school authorities who can provide for the special needs of the most able. Thus the purpose of "high-stakes" testing should not only be to assess performance such as is expected of all students, but also to encourage superior performance by permitting its recognition without thereby discouraging the others.

Ralph A. Raimi
Department of Mathematics
University of Rochester
Rochester, NY 14627
