Grade Inflation is a Symptom, Not the Disease (Part 2): The Specter of Grade Compression

In this subseries on the wicked problem of assessment, I am exploring the tensions between supporting the learning of every student and sorting students by levels of distinction and merit. I argued previously that: 1. Although the education system (and broader society) is more invested in sorting students, many teachers (like myself) are more invested in supporting them, 2. Grades are the focal point of this conflict, and 3. Many support-oriented teachers navigate this conflict via alternative grading models such as ungrading, contract grading, and labor-based grading.

In this post, I turn more directly to the concept of grade inflation. Highlighting recent controversies, I argue that this sorting-coded phrase is a red herring for what Team Sort is really concerned about: grade compression. This will lead to my next post on the implications of compression for Team Support, especially those who abhor, and seek to circumvent, traditional grading policies. Grade inflation/compression is, one might say, a tractor beam that relentlessly pulls Team Support toward Team Sort’s cargo hold.

Now, lamentations about grade inflation are hardly new, but two recent reports sparked a flurry of jeremiads proclaiming the onset of “achievement deflation” and the meaninglessness of grades. One report from Harvard revealed that A grades now account for 60% of all grades at the undergraduate college. The other from the University of California San Diego detailed a thirtyfold increase in students taking remedial math classes, even as many were straight-A math students in high school.

To be clear, I am not disputing the evidence that: 1. Grades have increased throughout higher education and much of the K-12 sector, 2. This phenomenon has developed over several decades, and 3. It has likely accelerated in recent years. But as education historians Christopher J. Richmann and Ryan T. Ramsey explain, inflation is hardly a neutral term; it implies a “quantitative increase without a corresponding value increase.” The term, then, assumes a divergence between rising grades and student learning.

If we weren’t projecting nefarious conclusions from the outset, we would examine many potential explanations for the rise in grades. This process would include evidence that, on the one hand, teachers are directing more time and energy toward effective pedagogy (to its credit, the Harvard Report does note this, if briefly) guided by the scholarship of teaching and learning. This process would also include evidence that, on the other hand, standardized test scores are falling and that students are on average spending less time on coursework. And as Richmann and Ramsey urge, we would situate these trends within a dynamic historical context rather than an “ahistorical narrative” in which grades have always meant the same thing, and conveyed the same information, over time. In short, we would find that it’s a complicated story, like every story tied to assessment.

Team Sort is, however, naturally distrustful of grade increases. We see this skepticism in a College Matters podcast episode about the Harvard report. At one point, guest journalist Beth McMurtrie notes that Harvard professors are grappling with whether their role should be to help everyone master course material or determine which students can best handle high-stakes performance expectations:

“If you want all of your students to learn the material, why not let them try and try again? Why do one high stakes test and say, well, you got a C, let’s move on. … Or why would you say only a certain number of you deserve A’s if a lot of your students come in, work really hard, are very creative and productive, and they show different forms of mastery of the material.”

In this scenario, the idea that every student who demonstrates mastery according to course learning objectives—whatever route that process takes—should receive a high grade reflects a support-based mindset.

Host Jack Stripling responds to McMurtrie with a medical analogy: “You know, I don’t want a physician practicing on me. They get one shot at this … if they’re surgeons.” Stripling’s remark reflects a common, even default, perspective that: 1. Grades indicate the degree of mastery students have achieved in a course, 2. Determinations of mastery should be made through high-stakes performances with no do-overs (it’s mainly the lack of a do-over that makes them so high-stakes), and 3. Grades distinguish those who should move up to the next rung of the ladder from those who should not. Jesse Stommel, writing in the collection Ungrading, describes similar (medically oriented) skepticism he has encountered: “When I give presentations on grading and assessment, I often get some variation of the question, How would you want your doctor to have been graded?” Given this prevalence, let’s dig into the analogy.

As I noted about the differences between exploration and performance zones, undoubtedly some college courses should include high-stakes performances, depending on factors like the course’s department, its placement within the broader curriculum, and its specific learning goals, whereas many others should not. But surely even the most high-stakes performance scenario for a college student would and should never be life-or-death, right? Besides, I suspect that if Stripling pondered his own analogy further, he very much would want his surgeon to have practiced this procedure under the guidance of a seasoned expert many times (dozens? hundreds?) during medical school and their residency.

Stommel’s response further illustrates that grades alone cannot tell us who will be the best at this or that profession: “I would want a mixture of things assessed and a mixture of kinds of assessment, because the work of being a doctor (or engineer, sociologist, teacher, etc.) is sufficiently complex that any one system of measurement or indicator of supposed mastery will necessarily fail.” Interestingly, it seems many medical schools in the U.S. agree with Stommel that traditional grading systems are insufficient to assess mastery, as more than half have switched to pass-fail grading during medical students’ pre-clinical years, and an increasing number during their clinical years (not without controversy, of course).

This “what about my doctor’s grades?” analogy buckles, I believe, under even a modicum of scrutiny. Nevertheless, I recognize the weight of this anxiety, because what really concerns Team Sort isn’t so much that grades are going up as that the range of grades being awarded is narrowing; insofar as the logic of sorting is concerned, this grade compression is the real threat. How are we to sort students, after all, if so many of them have a 4.0?

For reasons that are unclear to me, compression is a term far less bandied about than inflation. Two recent Chronicle articles, one proclaiming that grades are broken and one that grades are a charade, each mention compression only once even as both diagnose the underlying problem for Team Sort. As the latter puts it:

“Inflated grades are also problematic because as they rise toward the 4.0 ceiling, grade compression makes it harder for high-achieving students to stand out. As is already happening at Harvard, students increasingly look for other ways to distinguish themselves, such as through extracurriculars or the pursuit of even more education. This is inefficient because it imposes an extra burden on high-performing students that would be unnecessary with a well-functioning grading system.”

This is the problem in a nutshell: If we take away grades as the ultimate mechanism of sorting without addressing the sorting function’s continued dominance in education and society, we merely push the sorting function into other domains.

Putting a pin in that for now, I want to explore this question: Insofar as sorting is concerned, compression has always been the most material consequence of rising grades, so why haven’t the pundits, journalists, and educators who champion sorting centered this term all along? Well, maybe it has something to do with the fact that “decompressing” grades on a wide scale would require policies like grade quotas, thus ensuring a limited number (something like 20-25%?) of A-range grades. It would also require learning outcomes that read something like: “By the end of this course, students should know or be able to do X, but only Y% will receive an A for it” (okay, that one was a little tongue-in-cheek).

In fact, Princeton tried something along these lines about twenty years ago. And as one pundit after another (and even the Harvard report) has noted, this policy was abhorred by students and teachers alike, and it was quashed several years later. Princeton’s grade decompression debacle, I believe, helps explain why members of Team Sort are rarely eager to advocate specific remedies. This lack of transparency has another effect, however, bolstering an ostensibly convenient illusion that grades can serve both to sort and support. The Harvard report, for example, explicitly situates grades as (among other functions) a means to inform students about “how they are performing or where their strengths truly lie” and to distinguish students “for the purposes of honors, prizes, and applications to professional and graduate schools.”

But this conflation ultimately serves the interests of neither Team Sort nor Team Support; it just further immiserates everyone. As I will discuss in the next post, proponents of alternative grading are in essence calling BS on this “have your cake and eat it too” prevarication. They understand that the prospect of being evaluated by grades keeps students from also digesting grades as feedback; in other words, within the education system as we know it, grades can only ever meaningfully contribute to sorting students.

Let’s unpin the core issue I noted above: If grades cease to effectively sort students, even as the sorting function still prevails, the “scene” of sorting will shift to other fields such as extracurriculars, internships, and any other endeavor that can be used to sift the alleged best from the rest. This shift is well underway at places like Harvard, but it will almost certainly impact a much broader range of institutions over time. In this world, maybe almost everyone gets an A, but no one gets any happier.

I have until now primarily faulted Team Sort for not grappling in any meaningful way with the fundamental tensions between sorting and supporting students, but in Part 3, Team Support enters stage left. If they had their druthers, many support-oriented teachers would abolish the sorting function (which they often refer to as gatekeeping) altogether. This is especially true, I suspect, for those who resist traditional grades (a group overlapping those who reject the term rigor) and/or who perceive the wide distribution of A grades as an act of resistance. But while I consider myself a card-carrying member of Team Support (and I have my own issues with the term rigor), I don’t believe we can escape Team Sort’s tractor beam. As is sometimes said about politics, we may not care about sorting, but sorting sure cares about us.


