Assessment is a Wicked Problem (Part 1)

I have a love-hate relationship with the enterprise of learning assessment. By assessment, I mean both the formative, “checking in to see how someone is doing at a given moment” assessments and the summative, “this is how you did and what that says about what you learned” assessments that get recorded in gradebooks and eventually appear on students’ transcripts. To oversimplify only slightly, I love the former and hate the latter.

To clarify, what I hate about summative assessment is mostly the grading part—and all the damage grading does to the experience of learning—as advocates of ungrading have explained at length. But putting grades aside for now, I believe there are significant epistemological limitations to how much we—by “we,” I mean everyone from individual teachers to admissions committees to prospective employers to the people who design and/or interpret the results of standardized tests, etc.—can know about what other people know and have learned to do. I hinted at this claim in my previous post, where I argued that the feedback teachers receive about their instructional methods is often unclear, inconsistent, and potentially misleading.

On one hand, then, I believe assessment is fundamental to any learning process. When we try to learn something new—a skill, a concept, or some combination—we need concrete, accurate feedback on how we are doing. One could even go so far as to say that there is no learning without assessment—at least without formative assessment.

But when it comes to formal assessment, how do we know that students have learned the skills and content we purport to teach in our courses? We cannot pry into students’ minds or map the neural pathways and connections forged in the process of acquiring knowledge or enhancing abilities. What we can do is require students to produce artifacts—tests, papers, quizzes, portfolios, etc.—and we can assess these artifacts according to whatever standards we devise. But our assessments of these artifacts are, at best, educated guesses at what students have or have not learned.

Granted, some skills rely on what is often called muscle memory. Hence the saying, “Once you learn to ride a bike, you will never forget,” which is cliché but useful for distinguishing domains of learning where assessment is relatively straightforward from those where it isn’t. If a seven-year-old achieves a level of comfort and ease at balancing and steering, going up and down hills without falling, riding on different terrains, etc., we can pretty much say they have learned to ride a bike. If they were to stop riding for the next fifteen years, they might feel wobbly the first time they get back on, but (assuming they haven’t experienced an injury or illness that significantly changed how their body and mind function) they will likely soon feel about as comfortable riding as they did as a seven-year-old.

But for the knowledge and skills we associate with formal education, learning is not nearly as stable as the ability to ride a bike. The higher one goes up Bloom’s taxonomy—toward skills like evaluating, analyzing, and creating—the more dynamic and context-specific learning becomes. Receiving an A on a research paper for a first-year writing course, for instance, means the paper produced by the student (assuming they did the work themselves) met (or perhaps exceeded) the standards set down for that assignment. Depending on the course, the teacher, the discipline, and the institution, these standards might include structure and organization, quality of evidence for claims, stylistic matters including grammar and punctuation, etc.

The assessment of that artifact against a set of predefined standards can itself be justified or critiqued on various grounds; different teachers would prioritize different standards, and even teachers ostensibly using the same standards might assess the paper differently (a problem with the subjectivity of grades—I know, I said I wasn’t going to focus on the dumpster fire that is grading!). But let’s assume a good faith effort by the teacher to assess the paper on those standards.

If we all agreed that the grade merely assessed a point-in-time artifact according to that teacher’s particular standards and their subjective interpretation of those standards, I wouldn’t have much beef with assessment. My problem is how that point-in-time assessment comes too easily to signify that the student has “learned” to write research papers in some generalized sense.

In fact, the bestowing of the A does not even guarantee that the same student will receive As on future research papers they write, as they will have to negotiate different standards, expectations, and conventions tied to different disciplinary contexts—not to mention whatever idiosyncratic ideas their instructors have picked up about what constitutes a “good” research paper: “Thou shalt lose two points for every punctuation error”; “Thou shalt lose ten points for having fewer than ten citations.”

As scholars of writing studies have been saying for a long time, writing skills are tied to specific contexts in which those skills are utilized, making it very difficult (and arguably impossible) to understand (let alone assess) them as isolated, generalizable skills. And yet, as I will explore later in this series, course learning outcomes are often written in ways that suggest a near one-to-one correspondence between the artifacts students produce and what students “know” how to do from having produced these artifacts.

I imagine (hope?) few people believe in this one-to-one correspondence—i.e., that higher-order knowledge and skill acquisition are truly this straightforward. And yet, our assessment systems are standardized in ways that, in practical terms, strongly imply such correspondence (including the very phrase “course learning outcomes”). Several factors contribute to why this happens. For one, I think we tend to operate from the (understandable, but mistaken) premise that each individual possesses a stable sense of selfhood and identity. We also tend to think of learning as a linear process that moves from simpler skills and concepts to more complex ones—i.e., one works on skill A till they master it, then they move on to skill B, and so on.

There is nothing nefarious about these assumptions; nevertheless, significant strains of psychological, theological, legal, and other thought call into question this (highly Western) notion of a fixed, coherent, and stable self. Furthermore, processes of learning are much more nuanced, recursive, and often muddled than we want to believe. For example, how much people feel they are learning can often vary significantly from how much they actually retain from a learning experience.

In that earlier post, I explored how teachers might navigate the challenges of assessment through interdisciplinary communities of practice. And I remain committed to the idea that teaching should be understood (and institutionally supported) as intellectual work; in this regard, I am an advocate and practitioner of the Scholarship of Teaching and Learning. But I also recognize that communities of practice will not “solve” the problem of assessment. Because assessment is a fundamentally wicked problem.

As originally proposed by design theorists Rittel and Webber, wicked problems are intractable problems whose interconnected, dynamic, and uncertain characteristics and implications transcend the knowledge, expertise, and methodologies of any single discipline. Rittel and Webber juxtapose wicked problems with tame problems that are characterized by codified rules, recognizable patterns, and clearly defined structures for obtaining and interpreting feedback. Unlike tame problems, which have identifiable solutions, wicked problems cannot be definitively solved; they can only be “re-solved—over and over again.”

In framing assessment as a wicked problem, I do not mean we can learn nothing, or nothing useful, from the act of assessing learning; as I said above, assessment is a fundamental part of learning. And I certainly do not mean that students learn nothing from the various assignments and tasks they do throughout formal education. Obviously, many people do amazing things and make huge societal contributions that are deeply connected to knowledge and abilities they acquired during their time as students. Similarly, in rejecting the one-to-one correspondence between course deliverables and learning outcomes, I recognize that there is some (if hard to define) correspondence.

More than this, considerable evidence in various fields should make us confident that active learning practices, which allow students to apply content soon after they are exposed to it, are more beneficial than merely passive learning from lectures, readings, videos, and podcasts. But while we can develop activities, such as high-impact practices, that give students more opportunities to be active thinkers and doers, we cannot equate these activities with learning.

Rather than giving up on the project of assessment, I suggest that we stop treating assessment as a tame problem, which is what happens when course deliverables are conflated (whether by accident or design) with learning outcomes.

As someone who tends toward idealistic thinking, I have recently noticed that in much of my writing, there comes a moment toward the conclusion when I say something along the lines of, “I get that what I am calling for is idealistic, and fairly infeasible given current structural constraints, but it’s important to know what we are working toward even if we will never arrive there.”

And here we are at that moment, once again. Because, the fact is that cultivating a wicked orientation toward assessment might solve some problems, but it would invariably create other problems as well; that is, after all, the nature of wicked problems. Moreover, institutions are mechanisms of standardization, meaning that for the most part, they are structurally equipped to solve tame problems rather than wicked ones. This is, of course, a primary reason assessment keeps getting treated as a tame problem in the first place.

To put this another way: It is awfully hard to scale wickedness. Which brings us to that point in my writing—after I have justified my idealism on the grounds of needing a north star to guide us—when I propose some initial, concrete steps “toward that end” and clarify how modest these steps are intended to be.

To wit: Can we build assessment practices that are relatively wicked? Grappling with this question, and imagining what some initial, concrete, and modest steps “toward that end” might look like, will be the project of this multipart series on assessment.
