The Inevitability of Grading Curves:
In the comment thread to my post on the lawsuit over grading curves, commenter "Eli Rabett" writes:
It is not the job of a university to rank its students, it is the job of a university to educate them. If a student masters the work at an A or B level, and then gets knocked down to a C by curving, something is very wrong with the ethics of the teacher.
  I think this reflects a very basic misunderstanding about grading, so I thought I would offer some thoughts in response.

  The reality is that all grading is curved. The only differences are among curves that are more or less explicit and more or less strict. Consider what it means to have "A level mastery" of a subject. There is no intrinsic meaning to an "A" level of mastery, or a "B" level, or a "C" level. Rather, these are relative levels of mastery based on expectations for what level of performance is acceptable or appropriate at different levels of education. The concept is inherently relative.

  To see this, imagine you are a teacher and you are grading papers analyzing George Orwell's Animal Farm. You pick up and read a paper, and it reveals the level of insight and understanding you might expect from a 10th grader. What grade does it deserve? I think the answer depends on the circumstances. If you are a 6th grade English teacher grading the work of 6th graders, then the paper deserves an A; if you are an English professor teaching Ph.D. students in English, the same paper deserves an F. The paper is the same either way. It's just that we have a natural sense of scale, of a curve, for what level of insight and sophistication is to be expected at different levels of educational achievement.

  Some grading schemes hide these judgments by relying on numbers. A score might be an 86.5 out of 100, or a 91.8 out of 100. Aren't these grades absolute rather than curved, with the first grade being an objective "B" and the latter an objective "A-"? The answer is no. The common convention that an "A" range grade is 90-100, a "B" range grade is 80-89, etc. is also just a curve; it's a common curve, but a curve nonetheless. It works on the premise that the professor who writes an exam uses a relative distribution of easy and difficult problems so that the scores will track the class's expected levels of achievement.

  That is, the teacher will aim for a specific mix of easy, medium, and hard questions so that the likely mix of answers will produce something like the desired distribution of percentage-correct scores, which are then converted into the desired distribution of letter grades. If the questions are too easy you get too many A's, and if they're too hard you get too many C's; the teacher aims for the right mix to get the right distribution. This is also a curve, but students usually don't think about that because it's "hidden" in the level of difficulty of the questions.

  Controversy over the use of curves usually arises where curves are strict and explicit, imposing a specific distribution of grades in a class ex post rather than quietly aiming for that distribution ex ante. This has the effect of making students much more aware of curves, heightening the sense that there is a clear difference between a "true" grade and a "curved" grade. The arguments for or against a strict curve are a lot like the debates over rules versus standards; there's a choice between giving professors more or less room for judgment about the relative performance of the class, and there are pros and cons to both approaches. But the choice in that setting concerns how classes are curved, not whether they are curved; whether students realize it or not, all grades are curved.
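
  For readers who like to see the mechanics, here is a rough sketch in Python of the two approaches. Everything specific in it is my own assumption for illustration: the 90/80/70/60 cutoffs follow the common convention mentioned above, the quota of one A, two B's, and two C's is invented, and the function names are hypothetical. It also breaks ties arbitrarily by seating order, which a real grading policy would have to handle more carefully.

```python
def fixed_scale_grade(score):
    """Letter grade from cutoffs chosen before the exam (ex ante).

    The cutoffs are an assumed convention, not a rule from any school.
    """
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"


def strict_curve_grades(scores, quotas=(("A", 1), ("B", 2), ("C", 2))):
    """Letter grades from a distribution imposed after the exam (ex post).

    `quotas` is a hypothetical policy: how many students receive each
    letter, listed from the best grade down. Ties are broken by position
    in the list, which is an arbitrary simplification.
    """
    if sum(count for _, count in quotas) != len(scores):
        raise ValueError("quotas must add up to the class size")

    # Indices of students, ordered from highest score to lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    grades = [None] * len(scores)
    start = 0
    for letter, count in quotas:
        for i in ranked[start:start + count]:
            grades[i] = letter
        start += count
    return grades
```

  On an easy exam where the class scores 95, 93, 92, 91, and 90, the fixed scale hands out five A's, while the strict curve hands out A, B, B, C, C. On a hard exam where the same class scores 78, 74, 71, 65, and 60, the strict curve produces exactly the same letters, while the fixed scale produces C, C, C, D, D. The fixed scale yields a sensible spread only if the exam's difficulty was pitched to produce one, which is the sense in which the curve is built into the questions.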

  UPDATE: A number of readers take issue with my definition of a "curve." They reason that if you select a set scale ahead of time, there is no curve because the grading is "against the exam" rather than against the other students. But the point of my post is that grading is relative to a benchmark, and that benchmark sets the curve. Either the benchmark is the professor's hazy memory of the performance of past classes or it is the actual performance of that particular class. There are differences between the two approaches, but they are surprisingly small, and it's quite misleading to think of the hazy-past-performance benchmark as the absence of a curve.