The Yale Cause or the Yale Effect?:
Maybe I'm just missing something, but isn't the most likely explanation for the apparent link between hiring Yale Law clerks and getting questioned or reversed that the trial judges who don't care much about getting reversed also are unusually likely to hire Yale Law clerks?

  This is anecdotal, but my sense is that a pretty specific group of trial judges regularly hires Yalies, and that these judges are unusually likely to see themselves as pathbreaking judges who chafe against what the appellate courts tell them to do. If their politics line up correctly, Yale students will often see these judges as heroes of the law and want to clerk for them in part because they "push the limits of the law" (a.k.a. make stuff up that seems cool) in ways that often lead to reversal.

  If I'm right about that, hiring Yale clerks will be a consequence of being reversal-prone rather than a cause of it. That doesn't mean that Yale graduates will be as good at identifying and following the law as graduates of other schools. But I'm not sure this paper sheds light on that question given the likely direction of causation.
Sean M:
Clearly, Harvard needs to endow a We're Better than Yale, No, Really Professor of Law stat to investigate this relationship.
4.18.2008 6:43pm
George Weiss (mail) (www):
any thoughts on why yalies may want to clerk for such judges rather than the 'follow the law' judges at a higher rate than, say, i don't know, clerks from your alma mater?
4.18.2008 6:45pm
GD (mail):
"make stuff up that seems cool"

Like landing in Bosnia under sniper fire?
4.18.2008 7:02pm
As long as there are things out there that haven't been counted and compared side-by-side, there will always be another absurd empirical paper to be written and published somewhere.

I agree with the post, assuming that the paper actually does show a correlation between Yale clerks and being reversed.
4.18.2008 7:07pm
I went to Yale, and I can vouch for the fact that activist judges were particularly in vogue there.

I remember when Judge Justice (district judge in Texas) came to speak at a class I was taking. He said -- and I'm not exaggerating or misquoting him -- that he decided the result he wanted and then sent his law clerks to figure out how to make that result fit within existing law/precedent/etc.

That approach is, of course, entirely lawless. Judge Justice conceded that he didn't look at the law (either case law or statutes) himself beforehand -- law that was cited was simply a post hoc justification for his preexisting biases. Congress could pass a law saying nearly anything and Judge Justice would arrive at the same decision nonetheless.

You'd think, given what Judge Justice said, that the room would be full of eerie silence or bemusement. Instead everyone applauded. The thrust of Judge Justice's remarks was that Judges needed to act to rework society from the bench--a sentiment that was shared by the professor who brought him in to speak.

One of my classmates clerked for Judge Justice. Judges like him appealed to my classmates because the 'transformative nature of the law' was stressed by the most popular professors at the school.
4.18.2008 7:09pm
Paul McKaskle (mail):
A colleague who clerked on the Second Circuit told me that her judge refused to even interview Yale law graduates because they didn't know enough about law.
4.18.2008 7:16pm
Kathi Smith (mail):
A lot of us are former district court clerks. I wonder how many of us EVER heard from our judge about aversion to being reversed. I never did. And my judge was a Reagan appointee.
4.18.2008 7:25pm
Thief (mail) (www):
Is this why I keep hearing the rumor that if the Democrats win the White House, Harold Koh is going to be their first Supreme Court nominee?
4.18.2008 7:32pm
Elliot Reed (mail):
I'm surprised that there's not a dummy variable for "judge went to Yale." Presumably Yale judges are more likely to hire Yale graduates, so maybe the problem is with Yale Law School, or with the kind of person Yale likes to admit vs. comparable students who end up in Cambridge or Palo Alto.
4.18.2008 7:55pm
Elliot Reed (mail):
To clarify, maybe the problem was with Yale Law in the past, compared to Yale Law today.
4.18.2008 7:57pm

Interesting story. When I was a student at Harvard in the 1990s, most of the faculty and most of the students shared the same attitude. The idea was that law was a tool for progressive social change, and that the best role models were those judges (Jack Weinstein, Justice Brennan, etc.) who wouldn't let traditional legal materials limit the judge's sense of what the law could become. Those students who objected on rule-of-law grounds were a small minority, and generally they were told that they belonged in the Federalist Society (not meant as a compliment, I should point out).
4.18.2008 8:04pm
Vermando (mail) (www):
The judge my uncle clerked for also had that attitude: "I'd rather have the facts on my side than the law," he'd always say.

Of course, he was one of the great Civil Rights pioneers down south, so I suppose that may have had something to do with his attitude, as well as why he didn't give a damn about being reversed sometimes.
4.18.2008 8:35pm
CDR D (mail):
If judges can't be limited by the law, then why should jurors?

Jurors, if they understand the power they have, can "shape" the law just as these lawless judges do.
4.18.2008 9:00pm
Royce Barondes (mail):
I gather I did not sufficiently detail in the paper how Stata's clogit estimator works. I believe it is fair to say that use of Stata's clogit estimator, as the paper does, is sufficient to reject this particular criticism. I thank you for pointing out an area that could benefit from more explanation.
4.18.2008 9:03pm
Bruce McCullough (mail):
Another problem you have to deal with is taking the averages of ordinal variables -- that's a no-no. One of your variables is "average law school reputation". Reputations are ordinal, not cardinal, and you can't take an average of them. This is covered in most introductory undergraduate statistics texts, when students are shown the different types of variables (e.g., categorical, ordinal, interval, and ratio) and the types of analyses that can be performed on each.

More to the point, is the difference between the schools ranked one and two exactly equal to the difference between the schools ranked 20 and 21? If not, then the analysis is invalid. Assuming that the difference between one and two is the same as the difference between 20 and 21 does not make it so, any more than calling a cow's tail a leg means that a cow has five legs.

This type of statistical analysis is nonsense. It's very common, but it's still nonsense.

Bruce McCullough
4.18.2008 9:25pm
frankcross (mail):
Royce, it's not a stats blog, but I've been wondering about which criticisms clogit might exclude. Haven't used it but have been reading up on it. Can you elaborate on its practical effect in this case?
4.18.2008 9:45pm
Royce Barondes (mail):

The paper does not take averages of ordinal reputation measures.

The law school lawyer reputation and academic reputation measures reported by USNWR used in the paper are not ordinal. The maximum reported by USNWR is 4.9. Harvard has a 4.9 value for both academics and practicing lawyers. Yale and Stanford have 4.9 for one and 4.8 from the other group (but they are reversed between the groups). I don't recall the lowest of all the law schools that are represented in the sample. It may be in either the low 2's or the high 1's, but I am not certain. But, in any case, the concern you reference does not appertain to the model.

I thank you for pointing out aspects of the modeling that the draft could profitably clarify.
4.18.2008 10:06pm
Duffy Pratt (mail):
At the district court, concern for reversal increases with the age of the case and with the amount of time that the judge put into it. Getting reversed after a grant of a 12(b)(6) is no big deal. Getting reversed and sent to a new bench trial on an involved patent case is not a good thing. That's why, after a bench trial, almost everything that can be characterized as a fact becomes a fact and not a conclusion of law.
4.18.2008 10:53pm
Yale Law School is the best law school in the country.
4.18.2008 11:04pm
It seems like there would be a high correlation between the variables for school reputation, bar pass rate, and yale fraction. How do you account for this and what happens when you leave out the Yale fraction? Also, I would recommend that you do include more schools in your analysis. You may consider this data mining, but if it happens for more than just Yale, then perhaps this isn't as interesting as you thought. Just my $.02. Thanks.
4.18.2008 11:23pm
As an aside: many judges fail to remove data from their PDFs. Sometimes when I get a ruling or order from federal court I can hit ctrl-E and up pops the name of the law clerk (as "author"), and even the filename and position on the court's drive! (Posner's opinions used to list one woman's name as "author" on everything; I don't know if she was his secretary or what.) If enough of that sloppiness went on maybe somebody could dig in and see how many of these reversals actually took place with the participation of a Yale graduate.

But as Orin suggests, the article, while interesting, leaves out a whole lot about the judges involved, which is surely the truly relevant factor. Somehow I doubt these reversals are taking place because Yale graduates cannot competently deal with the basics of district court work.
4.18.2008 11:29pm
Dave Hardy (mail) (www):
"I remember when Judge Justice (district judge in Texas) came to speak at a class I was taking. He said -- and I'm not exaggerating or misquoting him -- that he decided the result he wanted and then sent his law clerks to figure out how to make that result fit within existing law/precedent/etc.

That approach is, of course, entirely lawless."

On the 9th Circuit, it's simpler. They don't worry much about the second step.
4.18.2008 11:31pm
should have written "ctrl-D" above. Sorry for the error.
4.18.2008 11:43pm
Bruce McCullough (mail):

My criticism most certainly appertains.

The numbers are ordinal. Those numbers 4.9 and 4.8 can only be used to rank the schools. To see this, I simply ask you this: 4.8 WHAT? What is the unit of measure? There isn't one. Hence, 4.8 and 4.9 are arbitrary up to a monotone transform (at least) and hence, the averages (and differences between them) are meaningless.

This assumes that the weights of the various components used to create the 4.8 and 4.9 are properly constructed. I doubt they are. I recommend a book on index number theory, e.g., "Index Numbers in Theory and Practice" by R G D Allen, and a book on measurement theory (all about ordinal and ratio and suchlike), e.g., "Measurement theory and practice: the world through quantification" by D J Hand, my review of which can be found in JASA 100(472), 1462-1463.
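
A minimal sketch of the monotone-transform objection, using made-up scores rather than anything from the paper or from USNWR: if scores carry only rank information, any order-preserving relabeling is equally valid, yet it can flip which of two judges has the higher average clerk-school score. The judge labels, the scores, and the cubing transform are all assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical reputation scores for the clerks of two judges (not real data).
judge_a = [4.9, 2.0]
judge_b = [3.5, 3.5]


def cube(score):
    """An order-preserving (monotone increasing) relabeling of the scores."""
    return score ** 3


# On the original labels, judge B has the higher average.
raw_a, raw_b = mean(judge_a), mean(judge_b)

# After relabeling, every pairwise ranking of individual scores is unchanged,
# but judge A now has the higher average.
cubed_a = mean(cube(s) for s in judge_a)
cubed_b = mean(cube(s) for s in judge_b)

print(f"raw:   A={raw_a:.2f}  B={raw_b:.2f}")
print(f"cubed: A={cubed_a:.2f}  B={cubed_b:.2f}")
```

Whether the survey scores really are only ordinal is exactly what is disputed in this thread; the sketch shows only what would follow if they are.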

4.19.2008 12:50am
It's very convenient to post an anonymous comment about how there's a judge who doesn't hire YLS grads because they don't know the law, without actually providing the name of that judge.

For all the Yale hate out there, it says something that YLS students are still good enough to fill 1/3 of the SCOTUS clerk spots for the upcoming term, clerking for judges across the political/jurisprudential spectrum.
4.19.2008 2:08am
YLS 1L, you are clearly new here: every thread about rankings or the alleged foibles of a top school is guaranteed to produce anywhere from a handful to a dozen posts from not-at-all-bitter posters explaining that one or all top schools are overrated. If anecdotal and unverifiable ("I once knew a guy who went to HLS and I beat him in city court"), no matter.
4.19.2008 2:26am

Actually, there are many judges who don't hire Yale graduates for that reason. You don't really need to speculate as to who they are: Just look at lists of judges and their clerks, and you'll see many who have never (or only very rarely) hired a Yale Law graduate.

At the same time, there are many judges who do hire Yale law grads because they realize that Yale grads often have sheer candlepower that outshines others, quite apart from whatever education they may have received. Person for person, Yale graduates are the smartest of the bunch: Obviously judges looking for brilliant clerks will keep that in mind. So even if judges assume that Yale graduates will have to play "catch up" relative to graduates of other schools when it comes to knowing actual law, many will assume that the graduates are smart enough to pick it up quickly.

Plus, at the U.S. Supreme Court level, a law clerk's ability to identify and follow the law is less important than it is at other levels. The job of a Supreme Court clerk tends to be more about candlepower and less about law-following relative to the job of clerking in the lower courts.
4.19.2008 3:06am
It's not fair to suggest that liberal judges get reversed because they "make stuff up." In any event, given what has happened to the Supreme Court, liberal appellate judges may well get reversed for following the law.

Take qualified immunity. Saucier was quite clear that courts must reach the constitutional issue first. Yet it has been apparent for a while now that a majority of the court would not uphold Saucier. There has been no such holding, but if one pieces together dictum that is what is apparent.

For the past few years, a "liberal" judge following the law by applying Saucier was therefore taking a greater risk of being reversed than a "conservative" judge who invents exceptions to the doctrine (in your words, "makes stuff up").

Of course, now that the Court has granted cert in Pearson, the Court may well shed light on this issue.
4.19.2008 8:46am
Katya, what's your point? Saucier is unpopular, but at least for now, it's still the law. A "liberal" district court judge following Saucier is at almost zero risk of being reversed for that reason by a court of appeals. Court of appeals judges who follow shaky Supreme Court precedents take some risk of being reversed, but that rarely happens.

Judge Reinhardt (who, it is indeed fair to say, "makes stuff up") rarely gets reversed because he's simply in the wrong place at the wrong time and the Supreme Court decides to change the law by reversing one of his decisions. Rather, most of his reversals come from his refusal to follow the Supreme Court's decisions. He is something of a statistical outlier, but is a good example of the phenomenon of judges "making stuff up."

A better example of your point is probably Blakely and Booker. In Blakely, the logic of the Supreme Court's decision rendered the Federal Sentencing Guidelines unconstitutional, but the opinion itself said that the Court was passing only on a Washington state law and took no position on the Federal Guidelines. There was a split in the courts immediately afterward. Judge Cassell wrote what was probably the best opinion in favor of applying Blakely to invalidate the Guidelines. In the Seventh Circuit decision in Booker, from which the Supreme Court took cert, there was a great pair of opinions, in which Judge Posner wrote the majority, urging that the Guidelines are invalid, and in which Judge Easterbrook dissented, urging that the courts of appeals are not free to overrule decisions of the Supreme Court (in that case, Edwards v US), no matter how much the Supreme Court suggests that their foundations have eroded. During the period between Blakely and Booker, every judge who expressed an opinion one way or another risked being reversed, and most could not be accused of "making stuff up."

Even so, you're arguing that a proposition proves its converse, which is not true. Even though there are several reasons for reversal that are not the result of the judge below "making stuff up," it does not follow that every reversal is simply a case of reasonable minds disagreeing, nor does it follow that there is no correlation between a high reversal rate and a propensity to "make stuff up."
4.19.2008 9:57am
Doesn't this study seem to depend on two assumptions that don't seem warranted: (1) that district court opinions that end up on Westlaw are a random sample and (2) that Westlaw's yellow and red flags are in any way meaningful? I don't see why (1) is true and I have a hard time believing anyone thinks (2) is true. Westlaw's yellow flags, for example, are particularly meaningless.
4.19.2008 11:40am
OrinKerr, in a possible fit of self-loathing, said:

Person for person, Yale graduates are the smartest of the bunch

As hard as it may be to fathom for ranking-obsessed law professors and students, people do sometimes turn down Yale for other schools (not even including Harvard). Even more frequently, Yale will turn down an applicant with an LSAT score and GPA above its 75th percentiles in favor of some mouth-breathing half-wit who remained lucid just long enough to write some memoir of drug-induced promiscuity, or some celebrity's offspring who got shipped off on exotic vacations under the auspices of some personal charity so that the parents could take a tax deduction.
4.19.2008 11:51am

i don't follow your comment. First, the leading proponent of overruling Saucier has been Justice Breyer, who is not generally thought to be a conservative. Second, whether a lower court judge follows Saucier is not outcome determinative in any case; a lower court can't be reversed only on that ground.

I assume that was an attempt at parody, but if not, please note that your comment has no connection to what I wrote. Of course some people turn down Yale, and Yale turns down lots of top applicants. Most of my fellow Harvard students fit into one of those categories, and we were a pretty sharp bunch. But that doesn't change my sense that on average, Yale Law grads tend to be the smartest of the bunch relative to other schools.
4.19.2008 12:33pm
UofC2L (mail):
"I remember when Judge Justice (district judge in Texas) came to speak at a class I was taking. He said -- and I'm not exaggerating or misquoting him -- that he decided the result he wanted and then sent his law clerks to figure out how to make that result fit within existing law/precedent/etc."

I have heard nearly this exact quote as well, except that it was during a talk by Judge Posner. In fact, if you would like to listen for yourself, go to: and listen to the debate between Judge Posner and Brian Leiter on "What do and what should judges do."
4.19.2008 12:42pm
frankcross (mail):
Bruce, you don't know the meaning of the word ordinal. It is definitely true that those numbers have a great degree of arbitrariness and are therefore limited, but they are definitely not ordinal, so there is nothing wrong with averaging them, save for any shortcomings of the numbers themselves.

And be careful about bashing Judge Justice. Look at those cases he was deciding specifically, not in the abstract, and tell me how they should come out.
4.19.2008 1:12pm

It occurs to me that there may just be a misunderstanding here; i meant 'person for person' to mean "on average", not 'every single person'. It's obviously not the case that everyone at Yale is smarter than people at other schools; that notion would be absurd.
4.19.2008 1:56pm
Royce Barondes:
I am pleased to have received various observations concerning how my paper might be clarified.

I had a couple of questions about the results running different models. Although I am still considering precisely what, if anything, might make the paper better for its audience, I am pleased to share the additional results here for those who asked:

There was an inquiry about the results from a more customary logit estimation using dummy variables for each judge (other than, of course, one of them). I can confirm that a logit estimation with all the independent variables that are in model 1 in Table 4, but with 92 dummy variables added, one for each of 93 judges (2 judges' opinions being dropped because their opinions never have these adverse signals), yields a parameter estimate [t-statistic] for the Yale Law School variable of 1.64 [3.09], which is not materially different from the results reported in the paper.

I was also asked what happens if one drops the reputation measure. Re-estimating model 1 from Table 4, with that independent variable omitted, produces the following parameter estimates [t-statistics] for bar pass rate, Yale, Chicago, and reference bar pass rate, respectively: -0.025 [-1.61], 1.60 [3.02], 0.081 [0.11], and 0.044 [1.43]. In both cases, then, Yale remains positive and statistically significant.

I was asked excellent questions about factors associated with the particular judge--a judge likely to be reversed hiring clerks from Yale. I believe it's fair to say that both the clogit estimations reported in the paper as well as the dummy variable for each judge estimation adequately address that concern (although one might have some questions as to the validity of just including 90-some dummy variables for the individual judges, for reasons alluded-to below). I've provided a little more detail about the clogit estimator for those who are interested.

There is a thoughtful question about inclusion of opinions in Lexis. Again, as long as, for each judge, the judge remains constant in the criteria resulting in the issuance of opinions included in Lexis, I believe the estimation techniques will control for that. It is, of course, possible that a judge went off the deep end in the middle of the period studied. I regret that no obvious way to address that has come to mind.

As to the question about "yellow" flags (the question at 10:40 asks about Westlaw; I in fact used Lexis), that is also a very good question (which I believe another has also asked). Those are not coded in the paper as adverse outcomes for reasons discussed in the paper at pages 14-15. Lexis has an intermediate signal between these two, a questioned signal, which is rare. The paper reproduces results showing both those "questioned" signals included as adverse outcomes (table 4) and those signals not included as adverse outcomes (table 6). The results are similar.


I was also asked for some additional detail on the clogit estimator. The clogit estimation command is used to estimate a binary dependent variable where the data are in "clusters" or "groups". The Stata manual gives the following example:
* * *
One is trying to ascertain the relationship between whether a person is in a union in various years and a set of factors. Interviews were conducted in each of 15 different years of a sample of women aged 14-26 in 1968. The factors (independent variables) are age for the year in question, current grade completed, whether the person was living outside a standard metropolitan statistical area, and whether the person was living in the "South".
* * *
What one is trying to model in that example is the likelihood a binary value will take one outcome. A basic technique for trying to do this would be a probit or logit model. The problem with using that kind of model is that we expect some grouping within the outcomes for each individual. Thus, the error terms in our initial attempt to model the dependent variable may be correlated with this independent variable. That's a concern.

To avoid that concern, one can instead use the clogit estimator, which is discussed under the heading "fixed-effects logit" starting on page 282 of the Stata manual. However, any variable that stays constant for an individual across time will be collinear with the fixed effects and cannot be estimated.

This procedure is described on pages 283-84 of the Stata Reference Manual (release 10).
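
For readers who want to see the mechanics, here is a minimal sketch of the textbook conditional-likelihood formula that fixed-effects logit is generally built on. It is an illustration with made-up data, not Stata's code and not the paper's model; the function name, the single covariate, and the `alpha` argument (a judge-specific intercept, included only to show that it cancels) are all assumptions.

```python
import math
from itertools import combinations


def group_loglik(beta, xs, ys, alpha=0.0):
    """Conditional log-likelihood for one group (here, one judge).

    xs    -- covariate per opinion (e.g. 1.0 if a Yale clerk was in chambers)
    ys    -- 0/1 outcome per opinion (1 = adverse signal)
    alpha -- judge-specific intercept; included only to show it cancels
    """
    k = sum(ys)  # number of adverse outcomes observed for this judge
    index = [alpha + beta * x for x in xs]
    observed = sum(v for v, y in zip(index, ys) if y == 1)
    # Sum over every way of assigning k adverse outcomes to this judge's opinions.
    denom = sum(math.exp(sum(subset)) for subset in combinations(index, k))
    return observed - math.log(denom)


# Hypothetical judge: four opinions, a Yale clerk present for the first two.
xs = [1.0, 1.0, 0.0, 0.0]
ys = [1, 0, 0, 0]

base = group_loglik(0.7, xs, ys)
# Shifting the judge's intercept leaves the value unchanged (up to rounding).
shifted = group_loglik(0.7, xs, ys, alpha=5.0)
# A judge with no adverse signals contributes nothing to the likelihood.
no_signal = group_loglik(0.7, xs, [0, 0, 0, 0])
# A covariate that never varies within the judge gives the same value for any beta.
constant_x = [group_loglik(b, [1.0] * 4, ys) for b in (0.1, 2.0)]

print(base, shifted, no_signal, constant_x)
```

The three checks line up with points made in this thread: anything fixed about a judge drops out, judges whose opinions never draw an adverse signal carry no information, and a variable that is constant for a judge cannot be estimated.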
4.19.2008 2:09pm
I was asked excellent questions about factors associated with the particular judge--a judge likely to be reversed hiring clerks from Yale. I believe it's fair to say that both the clogit estimations reported in the paper as well as the dummy variable for each judge estimation adequately address that concern (although one might have some questions as to the validity of just including 90-some dummy variables for the individual judges, for reasons alluded-to below).

Royce, can you explain what that means for those of us not well versed in our clogit estimations?
4.19.2008 2:58pm
Another possible explanation, besides the stereotypes that are being proposed or insinuated here, is that many of the judges Yale grads clerk for are among the more competent/smarter district judges who get assigned a disproportionately high number of difficult cases, which are in turn more likely to (a) result in reported opinions (thus generating a Shepard's citation), (b) be less likely to be controlled by existing law, (c) be accordingly more likely open to good faith disagreement among jurists of reason, and (d) thus be more open to reversal by other good faith jurists of reason on appeal. None of this would be due to any "error" made by a clerk or judge, but rather the nature of the work that is being performed.
4.19.2008 5:04pm
In fact, another explanation is even simpler: the judges that Yale grads clerk for are ones who like to write. It is often up to the district judge whether a case results in a written opinion. If judges who like to write are more likely to hire Yale clerks (reasonable, since Yale grads like to write and are competent writers), then such judges are going to have more opportunities to generate a negative "Shepard's signal" than judges who generate far fewer written opinions. This might entirely explain the statistical effect that has supposedly been identified here.
4.19.2008 5:07pm
The theory behind Prof. Kerr's post (which, to me, makes sense) is similarly applicable at the Supreme Court level. If I'm not mistaken (and I'm too tied up to do the number-crunching right now), data show that certain Justices more regularly hire Yale law clerks, or at least hire them in greater proportions than other law schools (adjusting for size of law schools). These Justices often find themselves in dissent. Certain other Justices rarely hire from Yale, or at least do so at a much lower rate; they are often in the majority.

So does this mean Yale law clerks, as compared to their peers, lack the persuasion and reasoning skills it takes to assist their Justice in fashioning a majority? Or that they're not adequately preparing their Justice with arguments to convince four other Justices? No, of course not. It just means that the Justices whose legal perspectives are not carrying the day at present are also those most likely to find kindred spirits in Yale Law students and hire them.
4.19.2008 5:08pm
frankcross (mail):
Orin, I haven't used it, and Royce can surely explain better, but it is a procedure to take out the individual judge effect and isolate the clerks. Very simply, it looks for whether the reversal patterns are more consistent with the judge (over time as he employs clerks from different schools) or whether the judge's reversal rate varies as he takes clerks from different schools.
4.19.2008 6:14pm
Thanks for responding to some of the concerns here. However, you did not accurately address my concern. I asked what happens if you leave out the Yale fraction, not the school reputation. I realize this is your variable of interest, but if something immediately becomes significant after removing the Yale fraction (because these might be highly correlated), then I would be concerned about some multicollinearity issues.

Also, would you consider including the top ten and bottom ten law schools? (Include one in each regression, not all twenty in one regression.) This is less "data mining" than including just Chicago and Yale. If the schools involved all have about the same probability of a "warning", then the positive school coefficient balances the negative coefficient on school reputation, and all you have found is that these variables are highly correlated. (If my idea is true then you will find that the top ten schools generally have larger coefficients than the bottom ten schools because the bottom ten schools have smaller values for reputation.)

However, if Yale remains standing alone after including these other 18 schools, then by all means you can claim that Yale is unique. However, you don't want somebody to follow up on your study just to show that many schools show the same effect. I would prefer getting it right the first time rather than being upstaged by somebody else.
4.19.2008 6:21pm
If I'm not mistaken (and I'm too tied up to do the number-crunching right now), data show that certain Justices more regularly hire Yale law clerks, or at least hire them in greater proportions than other law schools (adjusting for size of law schools). These Justices often find themselves in dissent. Certain other Justices rarely hire from Yale, or at least do so at a much lower rate; they are often in the majority.

Is that true? Who are the Justices that don't hire from Yale or hire them in greater proportions?

Another possible explanation, besides the stereotypes that are being proposed or insinuated here, is that many of the judges Yale grads clerk for are among the more competent/smarter district judges who get assigned a disproportionately high number of difficult cases,

How often are cases assigned based on difficulty? I thought that most districts just assigned cases randomly except in very unusual circumstances.

Oh, and thanks for the explanation, Frank! That's helpful.
4.19.2008 6:44pm
Bruce McCullough (mail):

You write: "It is definitely true that those numbers have a great degree of arbitrariness and are therefore limited, but they are definitely not ordinal".

If they have the necessary degree of arbitrariness, and they do, then schools given the numbers Harvard 1, Yale 2, and Stanford 3 might as well be given the numbers 1, 4, and 6 as long as all the other schools have their number doubled.
Unless you deny the above, then the numbers are ordinal! In which case the difference between Harvard and Yale is not 1, but some positive number. Hence these numbers are not even interval scale, they are ordinal scale.

To see this another way, the runners of a race finish in order 1, 2, and 3. Are these ordinal? Yes.

If you continue to insist that these numbers are not ordinal, then tell me what is the unit of measurement.
Again, I ask you: 4.8 WHAT?? If there is no unit of measurement, then the variables are neither interval nor ratio scale and are ordinal. While they are presented as numbers, they can only be used to rank the schools and hence are ordinal. I refer you to any introductory statistics text.

4.19.2008 6:54pm
Bruce is correct. These numbers are technically ordinal numbers unless one can say that a school that is a 4 is twice as good as a school that is a 2 and 33.3% better than a school that is a 3. That is simply not how the rankings work (unless someone can show me something otherwise saying that it is). The numbers are ordinal unless some standard is devised to make the 4 twice as good as the 2. And I am almost positive that the system does not do that.

In other words, an average of these numbers is very, very flawed.
4.19.2008 7:05pm
I didn't mean that some Justices "don't" hire from Yale in the sense of they "won't" -- rather that they tend to hire fewer from Yale relative to their peers.

Having now looked at the numbers (culled from Wikipedia), my hypothesis is not entirely borne out by the data, although there's some (weak) correlation. Here's the data from the 2003 through 2007 Terms -- Yale clerks first, total clerks second.

WHR/JGR 2 18
JPS 3 20
SOC/SAA 5 22
AS 2 20
AMK 4 20
DHS 4 20
CT 3 20
RBG 9 20
SGB 5 20

Small sample size, obviously, but Ginsburg and Breyer appear to draw more from Yale than, say, the old/new Chief Justices, Scalia, and Thomas. (In truth, it was Ginsburg I was thinking of when I made my earlier hypothesis -- I recalled her having hired many Yale clerks in recent years.) Then again, Stevens hasn't really hired that many (although he has a general tendency to go with more underrepresented schools, like the old Chief), and Alito has hired several.
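
A quick sketch of turning the counts listed above into shares, so the chambers can be compared on the same footing. The counts are copied from that list; the dictionary layout and variable names are just for illustration.

```python
# (Yale clerks, total clerks) per chambers, 2003 through 2007 Terms,
# copied from the list in this comment.
clerks = {
    "WHR/JGR": (2, 18),
    "JPS": (3, 20),
    "SOC/SAA": (5, 22),
    "AS": (2, 20),
    "AMK": (4, 20),
    "DHS": (4, 20),
    "CT": (3, 20),
    "RBG": (9, 20),
    "SGB": (5, 20),
}

# Share of each chambers' clerks who came from Yale.
shares = {name: yale / total for name, (yale, total) in clerks.items()}

# Print chambers from highest to lowest Yale share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} {share:6.1%}")

# Overall Yale share across all listed chambers.
overall_yale = sum(yale for yale, _ in clerks.values())
overall_total = sum(total for _, total in clerks.values())
print(f"overall  {overall_yale / overall_total:6.1%}")
```

With so few clerks per chambers, differences of a clerk or two move these shares a lot, which is the small-sample caveat already noted.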
4.19.2008 7:11pm
frankcross (mail):
Bruce, I don't understand your point. The scores are meant to represent a measure of reputation on a scale of 1 to 5. This is a pretty common survey measurement approach. The unit of measurement is academic quality (however conceived in the minds of the responders). The approach is used in lots of peer reviewed research and never as an ordinal variable.

One way we know it isn't ordinal is that schools have the same rating, and different numbers of schools may have the same rating. In an ordinal ranking, the quantitative difference between any two sequential schools is one. In this case it is not.
4.19.2008 7:46pm
OrinKerr said:

It occurs to me that there may just be a misunderstanding here; i meant 'person for person' to mean "on average", not 'every single person'.

Yep, that was it. I think Harvard gives Yale a good run for the money here, as well, though. Yale seems to favor certain soft factors that are not necessarily indicative of intelligence, so much as passion or drive. This, of course, is not to say that such people don't make better students or Supreme Court clerks - they probably do. I'm not sure, however, that what distinguishes Yale Law students from other students at top schools is intelligence, even on average.
4.19.2008 7:50pm
For the purposes of my above post:
bottom 10 law schools = the lowest rank schools that you can go to and still expect a reasonable shot of getting one of these clerk positions.

Oh yeah, and what Bruce said. I'd try to use some more objective measure of law school quality instead of reputation. That might erase my correlation concern altogether. I would still use more schools though.
4.19.2008 7:51pm
Royce Barondes:

Just as in grading, the opportunity to postpone the writing of an exam is always tempting. So let me make a final stab, after which I will definitely need to return to class stuff, and I will have to for the moment leave this discussion and all the helpful observations it has provided, and I hope that the reader will be so kind as to excuse any problem that the brief amount of time available to me to formulate my reply here has produced in my discussion. But before I do that, let me make two observations:

1. For legal scholars dealing with empirical analyses, it is a great effort, I think, to find explanations of what's being done in various contexts. My personal experience is that the documentation associated with the Stata statistical software is the best at trying to make the material accessible.

2. My sense is that this discussion will not be sufficient for someone who is not familiar with logit estimation.

Now let me first discuss what the paper did not do--that's using a dummy variable for each judge.

The results of that kind of model for any variable, e.g., the Yale variable, show the relationship between the likelihood of a reversal and the variable, accounting for the impact of all other factors that are modeled. For each judge, there are two periods: the earlier 2 years and the later 2 years. There is one percentage of clerks from Yale (which of course may be zero) from the first period and another percentage from the second period. So what the results for the model with the dummy variables reflect is the change in the reversal rates for each judge between those two periods.

Let's say that judge 13 is really likely to be reversed. That is accounted for in the dummy variable for that judge. So if that judge is more likely to hire clerks from Yale--because the judge is at philosophical odds with the appellate bench, or for whatever reason--that fact is captured in the dummy variable for that judge. The Yale variable captures the association between the reversal rate and the Yale fraction in the period in question, conditional on the values of all the other variables, including this dummy variable for the particular judge.

So I believe the concerns that Claritas and others are raising, about all sorts of factors that ultimately go to whether judges who are more likely to be reversed are also more likely to hire clerks from Yale, are misplaced. [I could quibble, but that's too distracting for what is now a perhaps too-long discussion.] Each dummy variable for a judge in this model formulation picks up that judge's general propensity to be reversed.

[An aside: Now, the picky reader may argue something like:

There may be something that can arise during a judge's career that (i) adversely influences the likelihood of an affirmance and (ii) influences the choice of clerks. So, one would be saying something like once a judge starts to "lose it", that fact causes the judge to be more likely to hire clerks from Yale. Fine. Yes. The model is not perfect. That kind of lack of precision, perhaps lack of perfection, is part of many econometric models. Creating these models is an art. A reader references the kinds of assumptions generally used by others, and makes comparisons.

As the paper notes, I did, in fact, try to assess a couple of things that seemed to me most plausible as causes of this kind of problem, involving particular subject areas. Ultimately, the work, when explained, speaks for itself, and readers will reach their own conclusions.

End of aside]

As noted in my post of 1:09, the results in Model 1 in Table 4 for Yale are not qualitatively different than the results of doing this. I'm hopeful that this discussion so far is sufficiently cogent so that all these concerns about different factors that cause a judge who hires Yale clerks to be more likely to be reversed are adequately addressed.

To state an obvious point, these models are estimating a dichotomous (binary, 1/0) variable. One uses different estimators for a relationship with a binary dependent variable than are used when the dependent variable takes on something in a continuous range of values. I don't know the particular commands for the statistical package you use, but one might use something like:

regress dependent-variable independent-variable-1 independent-variable-2 ...

to estimate the other kind of relationship and something like

logit dependent-variable independent-variable-1 independent-variable-2 ...

to estimate a relationship with a binary dependent variable. They are not computing the same thing.
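
A rough way to see the difference, sketched in plain Python rather than a statistical package: the data below are invented toy numbers, the function names are hypothetical, and this is not the model in the paper. It fits a straight line by least squares and a logit by maximum likelihood (Newton's method) to the same 0/1 outcome, then compares what each predicts outside the range of the data.

```python
import math

# Toy data, invented for illustration: x is a single regressor, y is a 0/1 outcome.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 0, 1, 1]

def fit_linear(x, y):
    """Ordinary least squares with one regressor (closed form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_logit(x, y, steps=25):
    """Logit fitted by maximum likelihood using Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (2x2), built from the weights p(1-p).
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton update: add (negative Hessian)^-1 times the gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a_lin, b_lin = fit_linear(x, y)
a_log, b_log = fit_logit(x, y)

# Predictions at a value of x beyond the observed range.
x_new = 10
pred_linear = a_lin + b_lin * x_new
pred_logit = 1 / (1 + math.exp(-(a_log + b_log * x_new)))
print(pred_linear, pred_logit)
```

On these toy numbers the straight-line fit predicts a "probability" above 1 at x = 10, while the logit prediction stays between 0 and 1, which is one reason the two estimators are not interchangeable for a binary outcome.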

My general sense is that most of the empirical work of legal scholars, at least in the past, would have addressed this investigation in the above way--a logit or a probit with numerous dummy variables.

But, if we want to see what the potential concern is with doing that--just stuffing in multiple dummy variables--the most concise explanation of the background underlying the clogit estimator (and not the dummy variable approach) is the following, which concerns estimation of the impact of certain treatments of patients (corresponding to the judges in the paper) [Rabe-Hesketh & Skrondal, Multilevel and Longitudinal Modeling Using Stata at 131 (2005)]:

"[I]t would be tempting to use fixed intercepts by including a dummy variable for each patient (and omitting the overall intercept). This would be analogous to the fixed-effects estimator of within-patient effects discussed for linear models in section 2.6.2 [RB note: there the author investigates a relationship having a non-binary dependent variable]."

The author, after describing a problem with doing that, says, "we can ... construct[] a likelihood that is conditional on the number of responses that take the value 1 (a sufficient statistic for the patient-specific intercept). ... In logistic regression, conditional maximum likelihood estimation is more involved and is known as conditional logistic regression. Importantly, conditional effects are estimated in conditional logistic regression [as in another technique they discuss on page 120, which I will obliquely reference shortly]."

So I lastly need to explain what the authors mean by "conditional effects". They, at page 120 (albeit while discussing another estimation technique), say:

"[O]rdinary logistic regression models [RB note: models that do not include variables identifying the patients, in their case, or the judges, in this paper] the overall population-averaged probabilities whereas [the other technique] models the individual, subject-specific probabilities... Whereas the former is a model for the overall or population-averaged probability, conditioning only on covariates [RB note: meaning the independent variables other than the patient identity, in their case, or the judges, in our case] the latter is a model for subject-specific probability, given the subject-specific random intercept ... and the covariates. Other commonly used terms for population-averaged and subject-specific probabilities are marginal [probabilities] and conditional probabilities, respectively."

So, what this technique does is produce estimates for the various independent variables of interest, conditioned on an attribute that varies among the judges--the number of responses that take the value of 1. It is "accounting" for judge-specific attributes that make the judge more or less likely to be reversed.
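
A simplified illustration of why conditioning works, in Python. It assumes just two observations per judge and a plain logit with a judge-specific intercept; the parameter values and function names are hypothetical, and the paper's actual data structure is richer than this. Given that exactly one of the two observations is a reversal, the probability that it is the second one does not depend on the judge's intercept at all.

```python
import math

def reversal_prob(alpha, beta, x):
    """Logit probability of reversal with judge intercept alpha and covariate x."""
    return 1 / (1 + math.exp(-(alpha + beta * x)))

def prob_second_given_one(alpha, beta, x1, x2):
    """P(the reversal is observation 2 | exactly one reversal in the two)."""
    p1 = reversal_prob(alpha, beta, x1)
    p2 = reversal_prob(alpha, beta, x2)
    only_second = (1 - p1) * p2
    only_first = p1 * (1 - p2)
    return only_second / (only_first + only_second)

# Hypothetical values: beta is the covariate effect, x1 and x2 are the
# covariate (say, a Yale fraction) in the two observations.
beta, x1, x2 = 0.8, 0.0, 0.5

# Two very different judge intercepts: one reversal-prone, one not.
reversal_prone = prob_second_given_one(2.0, beta, x1, x2)
cautious = prob_second_given_one(-3.0, beta, x1, x2)

# The intercept cancels, leaving exp(beta*x2) / (exp(beta*x1) + exp(beta*x2)).
closed_form = math.exp(beta * x2) / (math.exp(beta * x1) + math.exp(beta * x2))
print(reversal_prone, cautious, closed_form)
```

The three printed values agree up to rounding, which is the sense in which the conditional likelihood removes the judge-specific propensity to be reversed without estimating a dummy variable for each judge.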
4.19.2008 8:17pm
frankcross (mail):
I have to retreat a little. There is some dispute among those who know better than I over whether these Likert scales should be treated as ordinal or interval variables. While I'm not prepared to resolve this one, it may be that the reputation rating should not be treated as an interval variable. Though I see peer-reviewed survey research all the time that does just that.
4.19.2008 8:59pm
Bruce McCullough (mail):

You write: "Though I see peer-reviewed survey research all the time that does just that."

The taking of the average of a Likert scale is (almost) indisputably wrong, and you will (practically) never see it in a journal refereed by statisticians. In my research on this topic, I have only seen one article by a statistician who argued that it was legitimate. Hence the parentheses above.

You will see it done all the time in journals refereed by persons who have never taken graduate level statistics courses in statistics departments, especially in psychology and some social science journals. Those peer-reviewers might know psychology or social science, but they don't know measurement theory.

4.20.2008 1:09am
This is probably right. I clerked for a judge who almost exclusively hired Yale clerks and who was not afraid of the red flag (as he saw it, the higher court is always free to take a different view and he wouldn't get worked up about it). This is not quite the same as openly flouting settled authority but a willingness to explore the edges. Who better to help you with that project than some kid fresh out of Yale with a head full of theory and a very healthy sense of self esteem? (I certainly fit the bill at the time.)
4.20.2008 1:45pm
Royce Barondes:

Before I turn to my answer as to "4.8 what?" please let me be clear that I believe that your observations have been most helpful, and I think the paper could be improved by incorporating alternative estimations. The paper was included in SSRN as part of submitting it for consideration for presentation at a conference. Although I don't have the time to do so at the moment, it would be my plan to endeavor to improve the paper, insofar as feasible and appropriate, by seeking to provide alternative specifications, as well as incorporate responses to the numerous comments from others.

My brief answer off the top of my head would be "4.8" on the 1999 (the prior year) US News reputation numbers.

One can conceive of the following process:

Someone tries to generate a set of reputation measures for all law schools in which the differences between 4.8 and 4.6 are one-half the differences between 4.8 and 4.4. It's proposed. Hundreds of people look at this and are asked to propose how it might be changed to be more accurate. They express their views. The results are adjusted in response to those new surveys. This process is iterated multiple times.

In this conceptualization, last year, my survey response reflected how I thought the results of others should be adjusted for these numbers to be better.

Off the top of my head, insofar as it is possible to create a variable reflecting "reputation" in which the difference between 4.8 and 4.6 is one-half the difference between 4.8 and 4.4, I would think that this conceptualized process would be a good way to do it (although I have not really thought about alternatives). I would not deny that one could reasonably take issue with the basic premise that such a variable can be constructed.

An overlay of this process is the schools are changing. So that also is, or should be, considered in filling out the survey.

Although I have seen what I would think are less internally-consistent reputation variables used in papers in peer-reviewed journals (the "Carter & Manaster" rankings of underwriter quality in IPOs, used in "A-hit" journals in financial economics in the last couple of decades, come to mind), I do not recall coming across, in the research from academics in my areas, reference to a quantitative test that could be used to assess the extent to which the USNWR reputation numbers adequately reflect this collective, iterative decisionmaking process designed to produce numbers where the difference between 4.8 and 4.6 is one-half the difference between 4.8 and 4.4. I have not yet had the opportunity to review the references you mentioned in your prior post. I hope that I will be able to find in those sources such a test that can be practicably applied. That would, I think, helpfully inform the development of the paper.

I note that, of course, there are other problems in this particular variable. They could be produced by bad faith, for purposes of promoting faculty members' personal interests in having their schools rank higher. When I was filling in my survey, I could have given my school the highest possible score, and given all other schools the lowest possible score. (So that I am not banned from getting future surveys, let me note that's not what I did.) Insofar as I thought that would end up being "corrected" by USNWR, I could have moderated my efforts, so that they would be the most "wrong" that my bad faith could get USNWR to incorporate. And I suppose people could collaborate to influence the scores collectively.

This USNWR variable seeks to put reputation on a single dimension, whereas, even if we were to assess simply academic value, there would be differences in weightings between, for example, the value of knowledge of doctrine and understanding of theory. So the numbers, whatever they are, are necessarily imperfect.

It would seem to me a question of judgment when one is creating empirical models. There are a host of judgments that are made in putting together a model. For example, one may be concerned in a particular model whether one is modeling a non-linear relationship as a linear one.

One has to assess whether it is better to use measures that are more precisely computed but don't reflect what one really cares about in some sense. For example, I could substitute values derived from numbers of citations to faculty members' research. It's not obvious to me that would be better.

One has to assess the adequacy of the kinds of factors that one can quantify and incorporate into the model. These assessments are often just judgments that cannot be the product of a quantitative test. Ultimately, for some readers any particular investigation will provide useful information and other investigations will not. I would think that audience members would make that assessment based in part on their views of the relative merits of empirical techniques compared to those of other academics that they see in their discipline. Perhaps your observations will allow me to revise the document in a way that puts more persons in the former category.
4.20.2008 1:53pm
Bruce McCullough (mail):

4.8 on the prior year number is still ordinal, since the prior year number itself is ordinal.

The easy way out (at least so I can't complain) is, instead of taking the average, use a proportion. Suppose you want a measure of the quality of the schools that the faculty attended. Instead of taking the average rating, use the proportion of the faculty that attended a "top 20 school". Whether it's top 20 or top 50 or top 10 doesn't matter. Suppose you choose top 20. Then you'd want to make sure your results are robust, and run it also at top 10 and top 50 and see that, qualitatively, your results don't change.
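
A minimal Python sketch of the proportion approach, with the robustness check across cutoffs. The list of ranks is made up for illustration and the function name is hypothetical.

```python
def top_k_share(ranks, k):
    """Proportion of people whose school is ranked k or better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the schools that ten faculty members attended.
faculty_school_ranks = [3, 8, 15, 15, 22, 40, 47, 60, 75, 120]

# Compute the measure at several cutoffs; the regression would then be
# re-run with each version to see whether the results change qualitatively.
shares_by_cutoff = {k: top_k_share(faculty_school_ranks, k) for k in (10, 20, 50)}
for k, share in shares_by_cutoff.items():
    print(f"top {k}: {share:.0%}")
```

The point of using a proportion is that it relies only on the ordering of the schools (in or out of the top k), not on treating the rating itself as a number that can be averaged.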

Or you can just keep it as it is. The probability that some referee recognizes this error is epsilon. :)

4.20.2008 9:15pm