Grade Inflation is a Symptom, Not the Disease (Part 3): The (Irrational?) Exuberance of Alternative Graders

In this current series, I am exploring the tensions between two functions or philosophies of education: 1. That we should prioritize supporting all students’ learning and growth, and 2. That we should prioritize sorting the highest achieving students from the rest. I argue that the education system operates from the flawed, if mostly implicit, premise that these functions are reconcilable—that we can effectively support and sort students. In practice, these functions are almost diametrically opposed; the more we support the learning and achievement of all students, the harder it becomes to sort them. I argue further that the (most) recent moral panic alleging rampant grade inflation in higher education demonstrates the irreconcilability of these functions.

In her edited collection Ungrading, Susan Blum lays out the core conflict:

“Complaints about grade inflation circulate in society and administrations (though grade inflation is really grade compression, as a smaller range of grades is actually employed for most students)…. The problem with grade compression, critics argue, is that students are harder to distinguish. In going gradeless, most of the authors of this book act on the conviction that our principal task is educating all students, not ranking them.” (p. 5)

Blum recognizes that even as Team Sort bemoans grade inflation, their real concern is grade compression, as I addressed in my previous post. It doesn’t matter to Team Sort whether grades are rising from genuine learning gains or more ominous trends like academic dishonesty going up or academic standards going down. It is, ironically, a problem if students across the board are meeting thresholds for the highest grades, because that makes grades less useful for sorting.

Over the next couple of posts, I will explore the implications of Blum’s last sentence. I very much vibe with the idea that ranking students should not be our job, and I very much agree that ranking students negatively impacts their learning. But I believe that if Supporters fail to acknowledge the education system’s imperative to sort students, and if we disavow accountability for that imperative, we will end up exacerbating the harms we associate with sorting. In fact, this may already be happening.

Specifically, I examine alternative grading, which has emerged as a key battleground in the conflict between sorting and supporting. Just as traditional grades are the primary mechanism for sorting students, alternative grades have become the primary mechanism for supporting them.

Alternative grading has progressed in recent years from a niche pursuit of Alfie Kohn devotees to a full-fledged, interdisciplinary movement encompassing many approaches; these include ungrading, specifications grading, standards-based grading, labor-based grading, and contract grading. Practices like specs and standards-based grading seem more suited to STEM fields, whereas others like ungrading seem a better fit for the humanities. (How these methods overlap and differ is not relevant to my purposes here.) This movement has even taken on the trappings of a scholarly discipline, with centers and annual conferences dedicated to creating and circulating knowledge about the benefits of alternative grading.

Considering how big a tent alternative grading is, I do not claim that every practitioner disavows the sorting function in the manner described by Susan Blum. For that matter, I do not claim that everyone employing traditional grades disavows the supporting function. People navigating the tensions between sorting and supporting can find themselves on a very messy spectrum of values, beliefs, and tactics. However, there is clearly strong overlap between those who support learning for all students and those who advocate alternative grading. This group (of which I consider myself a member) aims at nothing less than undoing the damage wrought by traditional grades. So, let’s consider how alternative grading advocates frame their project.

I spotlight mathematics professors Robert Talbert and David Clark, who have become leading voices for the movement through their book and blog of the same title, Grading for Growth. Central to Clark and Talbert’s pedagogy are feedback loops, which they see as foundational to learning and which in their estimation are largely absent from traditional grading schemes. The basic idea is that a student performs a task, gets feedback on that performance, reflects on the feedback, and then reperforms the task based on what they learned. For Talbert and Clark, feedback loops emerge from the four pillars of alternative grading: clearly defined standards, helpful feedback, marks that indicate progress, and reattempts without penalty.

There is a lot to like about Talbert and Clark’s framework. As someone who thinks grading and growth are oxymoronic terms, I salute the optimism of grading for growth. I also appreciate their hospitality to those who are disaffected by traditional grades but hesitant to plunge headfirst into alternative grading; for these instructors, Clark and Talbert encourage experimenting with one pillar at a time. And their emphasis on feedback loops has enhanced my own understanding about feedback’s central role in learning.

Like any heuristic meant to conceptualize (the wicked problem of) assessment, of course, the four pillars have their limitations. Even as Clark and Talbert ostensibly situate many approaches in their big tent, they struggle to talk about less STEM-coded practices. Their book, for instance, features individual chapters on standards-based grading, specs-based grading, and hybrids of these approaches, but it only mentions ungrading in passing. I suspect this is partly because ungrading represents a more radical critique of the idea that grading can ever promote growth. It may also have something to do with their commitment to making standards crystal clear at all times; in exploratory, wicked-learning contexts, ambiguity and uncertainty are central to the inquiry process, and it makes less sense to have rigid, predefined standards.

But my primary concerns reflect the lofty claims Talbert and Clark, as well as other advocates, make about alternative grading’s potential to benefit students. They directly confront the charge of grade inflation, arguing that in “an alternative grading system, grades are directly tied to what students actually learn, and reassessments make those grades more accurate. So when you see higher grades, it means more learning. That’s not ‘inflation’; it’s real growth” (42). In other words, more students receive As precisely because: 1. They know what the bar for an A is; 2. They are better equipped to clear that bar; and 3. They are not evaluated against one another along a competitive curve, so everyone who clears the bar gets an A.

Clark and Talbert further dispute the charge that alternative grading lowers academic standards, insisting instead that students “in alternatively graded classes must complete work that is fundamentally correct, without partial credit, thus setting a high bar” (42). Finally, they state that alternative grading has the power to reduce student anxiety, increase intrinsic motivation, counter bias, and enhance equity—in other words, to alleviate the many problems Team Support links to traditional grading.

Let me confess my own bias here: I want this all to be true. If we could raise academic standards while simultaneously equipping more students to meet those standards and reducing their anxiety and stress to boot, that would be a win-win-win, right? This potential to transform the learning experiences of students in formal education is why I began experimenting with alternative grading in the first place. But amid this excitement, I can’t help but recall Carl Sagan’s famous admonition: “Extraordinary claims require extraordinary evidence.” And based on that standard, the picture becomes a lot murkier.

Amid their own exuberance, Talbert and Clark concede that “peer-reviewed research on alternative grading is still emerging” (33), a cagey way of acknowledging there isn’t much empirical evidence to back up their ambitious claims. And in his recent book Failing Our Future, fellow advocate Joshua Eyler makes an almost identical concession: “We are still in the early days of scholarship about the efficacy of alternative grading models as compared to traditional grading practices” (120).

Clark and Talbert instead rely primarily on theoretical arguments about why alternative grading should have positive impacts. For example, they tout conceptual consistencies between the four pillars and methods popularized in science of learning books like Make It Stick, such as retrieval practice, spaced repetition, and interleaving. They also turn to social science literature, citing self-determination theory and achievement-goal theory as reasons why alternative grading should support intrinsic motivation and resilience.

And to be clear, these conceptual connections are worth making. Both my own teaching and scholarship have been significantly influenced by self-determination theory, for example.

But let’s be real here. Having seen the fallout from the massive and very much ongoing replication crisis in the social sciences, we should be wary about any scenario where educators make ambitious claims about educational interventions with limited empirical backing. Theoretical arguments and links to broader education research are necessary but insufficient, and they can bring their own baggage. Eyler, for instance, cites once-trendy concepts like growth mindset and grit without addressing more recent empirical reconsiderations of their potential to impact learning.

To put this most generously, some of alternative grading’s most prominent gurus have gotten “out over their skis.” Yes, the problems with traditional grading are numerous and grave, and I applaud and share the desire to counteract them. But that doesn’t mean alternative grading systems will necessarily improve things much, as well-intentioned as they are. The idea that alternative grading could meaningfully respond to all the harm caused or exacerbated by traditional grading is simply too big an ask.

And this overpromising around alternative grades doesn’t factor in the core problem I am addressing in this series. Remember that win-win-win I mentioned above: higher standards, higher achievement, and lower stress for students? Even if this scenario were possible—a very questionable proposition—this would be a win-win-win on Team Support’s terms. Team Sort struggles to envision an education system without winners and losers. And as I have been saying, Team Support always plays on Team Sort’s territory.

Our courses are part of an ecology that extends to the departments in which we teach, the curricular and accreditation requirements (especially for general ed courses) to which our universities are held accountable, and the employers and admissions committees (and parents) to whom we are beholden to vet promising students. Simply by virtue of teaching at institutions where the obligation to sort students is so prominent, Team Support’s pedagogical values are compromised. To tweak John Donne, “No course is an island.”

As I will address in the next post, I believe alternative grading systems will produce (at times significantly) higher grades than traditional grading systems, whatever the reasons. But the prospect of grades going up for large numbers of students will never be seen as an unqualified good in an education system that depends on having convenient (if inherently unsound) mechanisms to rank people. The very success of alternative grading, as a Marxist would say of capitalism, may sow the seeds of its destruction.