[Ira Matetsky, guest-blogging, May 11, 2009 at 11:16pm]
Some First Thoughts on Wikipedia

Hi. Eugene introduced me earlier, and some of you may recognize my name from a recent comment thread or two. As Eugene mentioned, I work as a litigation attorney at a firm in New York City ... and I first met him at a summer school mathematics program (which, I would like to remind him, we were carefully coached not to call "math camp") thirty years ago.

I've been reading the Conspiracy faithfully for five or six years now, and recently I've noticed Eugene's series of posts about court decisions that discuss or mention Wikipedia, the free-content, mass-written, ever-growing online encyclopedia. I've also noticed that in unrelated posts and comments, many Conspirators routinely link to relevant Wikipedia articles and seem to operate from the basic assumption that they will generally be factually accurate. So I infer that there is at least some respect for Wikipedia among some Conspirators. At the same time, I saw the comments on the thread where Eugene introduced me this afternoon, so I know there is some skepticism too.

Eugene's posts, and everyone's comments, have interested me because I've contributed to Wikipedia myself, and I'm an administrator on the site and a member of the in-house Arbitration Committee. (Wikipedians may edit under pseudonyms, and until this point I hadn't mentioned my real name on-wiki, although a determined critic managed to "out" my real identity about a year ago. For anyone curious, on Wikipedia I'm known as Newyorkbrad, Brad being my middle name.)

I hope to do two things this week. First, to explain to Conspirators a little more about how Wikipedia operates and address a couple of aspects that may not have occurred to casual readers. (I might even recruit a couple of new Wikipedia contributors -- but in fairness, I'm going to link to a couple of criticism sites as well, so you'll know what you might be getting into.) And second, I hope to gather input on some important issues from contributors here who will have an intelligent reader's familiarity with the site, but no predisposition in our internal, sometimes eternal, debates.

Anyone who has spent time on the Internet has heard about Wikipedia by now and has at least some knowledge of how it works. But here are some basics for those less familiar, which the rest of you can safely skip and go on to the end or come back tomorrow.

Wikipedia defines itself as "the encyclopedia that anyone can edit." That is literally true: anyone (short of a few sitebanned people) with an Internet connection can sit down at the keyboard and start editing. The "anyone" who can edit includes you, if you are so inclined; you don't even need to register an account in order to edit an existing article, though you do in order to create a new article from scratch.

For my part, I was drawn in as many others are: I ran a Google search to locate some information, and the Wikipedia article was the top result. I saw a mistake in an article and corrected it. (The double brackets are internal wikicode for a link to another page, and I'll use that code here as well.) Interestingly, my introduction to a flaw of the wiki collaborative editing model came a short while later, when someone took the correction I made and immediately uncorrected it. Fortunately, when I made the change a second time, I figured out how to provide a more detailed explanation in the "edit summary" field, and this time it stuck. If I'd been reverted one more time, I probably would have shaken my head and walked away, as subject-matter experts, unfortunately, often do. But instead, having made one change led me to want to make others, and then I registered to start creating pages, and it became a hobby.

Wikipedia has existed for less than eight years, and its growth and popularity have far exceeded anything that those who created it could possibly have imagined. Today, there are millions of registered "editors" with accounts, although there are probably a few thousand truly dedicated everyday contributors, and there are close to three million articles. Content can be found on virtually every subject one might wish to write about: from Poe and poetry to pomegranates and Pokemon; from Poland and Portugal to Powell and Posner; from Pol Pot and Potsdam to polarity and pottery. (Of these, there may be a disproportionate amount of Pokemon; editors come from an enormous diversity of backgrounds but have historically skewed younger, for fairly obvious reasons.)

There are Wikipedias in several hundred languages, of which English is the largest (German is second), and there are also Wiktionary and Wikinews and a Wikiversity and Wikiquote and Wikisource, and Commons (a repository for image and sound files that can be used by all the projects) and Meta (for coordination). All of this is operated under the auspices of the Wikimedia Foundation, a charitable foundation that owns the hardware and is, theoretically at least, in charge of it all. But my involvement with the English Wikipedia is probably enough for one lifetime.

So why does this matter? One reason is that a lot of people find that editing, or even administering the site, is fun. That is essential, as virtually everyone involved is a volunteer. Another is the satisfaction of contributing to an ever-growing source of "free knowledge." In addition to being "the encyclopedia that anyone can edit," Wikipedia is "the free encyclopedia," whose content can freely be reproduced on other websites or in other media. (This actually happens. One of my first articles was a short biography of a lawyer in Alabama who became a judge in Puerto Rico, named Peter J. Hamilton. It turns out that there is a Peter J. Hamilton Elementary School in Mobile, whose website has a "did you ever wonder who Peter J. Hamilton was?" page, and the answer turns out to be my article.)

But there is another major reason that a lot of people care about Wikipedia, whether they participate themselves in it or not, and why there are many critics concerned about the increasingly widespread role of the site. Because of its popularity and also because of its interconnected network of links, Wikipedia articles tend to score extremely high on Google and other Internet searches. In particular, if one searches on an individual's name, his or her Wikipedia article will generally be among the top group of Google hits -- much of the time the very first one. This has implications that are quite significant and in many instances troubling, which I will be discussing over the next couple of days.

That's long enough for an introductory post; I'm sure many are waiting for me to reach something more controversial. Over the next few days I'm going to explore some specific issues, beginning tomorrow with the question of how Wikipedia articles about living people can affect their subjects, and continuing later in the week with issues of site governance and article quality, behavioral standards and the role of anonymity.

The comments thread should be open, and I'd welcome suggestions for aspects I might address. (I make only one request: that regular Wikipedians who are looking over my shoulder, as well as Wikipedia critics from Wikipedia Review and elsewhere, bear in mind that this is a general-interest audience. Please don't hijack the comment threads with our own internal disputes and debates. No one here wants to read who is a sockpuppet of whom or whether so-and-so's block was fair or not. We have ANI and Wikipedia Review to hash those things out later.)

And one last unrelated request. A couple of weeks ago, [[Saxbe fix]] was the day's featured article, meaning it had pride of place on the main page for a day. I hadn't contributed to the article before, but I did some copyediting while it was mainpaged, and in doing so, I came across the assertion that President Reagan nominated Robert Bork rather than Orrin Hatch to the Supreme Court because Hatch's appointment would have raised an emoluments clause issue and the administration was not convinced that the Saxbe fix is constitutional. Although I had a dim recollection of the issue having come up in passing, I found that statement as written implausible and edited the article to say that this issue played only a small role in Judge Bork's selection. However, I didn't have a good source suitable for citation in the article to support my assertion, and I've been asked for one. This certainly would seem like an appropriate audience to fill in that particular lacuna. So if anyone can help with a source on this, please let me know in the comments thread so I can go back and add it to the article.

Or better still, go visit [[Saxbe fix]] and edit it yourself.

[Ira Matetsky, guest-blogging, May 12, 2009 at 9:53pm]
Wikipedia, the Internet, and Diminished Privacy:

This is the second in my series of guestblog posts about the online encyclopedia Wikipedia, how it is organized and governed, and some aspects of its impact. My thanks to everyone who has commented on my post from yesterday. Later in the week I’ll have a post or two specifically focused on people’s comments, so please keep them coming.

As I mentioned yesterday, and as was picked up in the comments, one of the sources of Wikipedia’s popularity and influence is the fact that its pages rank so highly on Google and other search engines. Where the Wikipedia page is an accurate, well-written, well-sourced article on the topic it covers, that is fine. On the other hand, some articles are better than others. And even if a page did once contain brilliant prose, it could have been changed for the worse by anyone before a given reader finds it and reads it.

The shortest way of expressing this is that Wikipedia’s primary weakness precisely corresponds to its greatest strength. The best feature of the site is that anyone can edit (virtually) anything contained on it. The worst feature of the site is that anyone can edit virtually anything contained on it.

The ability of anyone to edit raises especially serious issues where an article concerns a specific living person. As long as an individual is “notable” by Wikipedia standards (with notability defined partly by a series of guidelines and partly subjectively), any registered editor is free to create a Wikipedia page about him or her, and anyone else is then free to edit that page.

In the first instance, this makes sense. Articles about human beings and their achievements are part of the core content of an encyclopedia. One could hardly imagine a general-purpose encyclopedia without articles about all of the U.S. Senators, or major-league baseball players, or astronauts, or Metropolitan Opera singers, or any of myriad other categories of prominent people. (Perhaps even law professors with dozens of publications and prominent blogs.) So there are several hundred thousand of these articles, known in Wikipedia parlance as “BLPs” -- “Biographies of Living Persons.”

Consistent with the whole Wikipedia model of open collaborative editing, there is virtually no control over who is writing or editing these articles. Sometimes, the author is a knowledgeable subject-matter expert familiar with the subject and his or her work. Other times, he or she is a good-faith contributor drawing and summarizing information from published, reliable sources. On the other hand, a BLP could also have been created or recently edited by its subject’s worst enemy, his most bitter professional rival, her leading political opponent, or just a “vandal” out to make mischief.

Many Wikipedians have come to realize that false or misleading articles about living people can seriously damage their subjects. This is an area where many of the critics of Wikipedia have made very valid points.

There are two basic problems. One is the potential that an editor will insert inaccurate, misleading, and in some cases overtly defamatory or malicious content in an article. I’ll discuss that aspect of the problem and how it might be addressed tomorrow.

But there is another equally serious problem inherent in Wikipedia articles about some living people -- except that it is not a Wikipedia problem per se, but an Internet-wide one. That is the problem of how easy it is, in the era of near-universal Internet access and instantaneous search engines, to inflict devastating and nearly irreversible damage to people’s privacy. I’ll give a couple of specific examples.

In January 2007, a 13-year-old boy whom I will call John (I refuse to further disseminate his name) was kidnapped from his family and mistreated in a horrifying way over a period of 4 days before being rescued. Although the names of minors who are victims of this type of crime are often kept out of the news, in this instance John was a missing child, which rightfully led to intensive publicity both in print and online as the authorities searched for him. Since John was rescued, there has been extensive press coverage of how he was found, of the trial of the kidnapper, and to a lesser extent, of his and his family’s efforts to resume normal life. Much of that publicity also has included John’s full name; there seems to have been no particular attempt made to put the genie back in the bottle.

In the spring of 2007, someone decided that the case had been the subject of enough mainstream press coverage that it was notable and warranted a Wikipedia article. Reading that article made me miserable: not just because of what had happened, but also because I knew that behind the article was a teenage boy who must be dealing, in his own way, with the memories of what happened to him. I knew that as his life charts its course, and that as he lives it, when he applies to college or for a job or meets people, people will type his name into Google -- and since to the best of my knowledge he is in other respects unexceptionable, the main thing anyone looking him up will learn is the fact and the details of what happened for 4 days when he was 13.

I decided, as a Wikipedia administrator equipped with a "delete" button, that Wikipedia did not need to contain this article. After a long discussion on the “deletion review” page, my deletion of John’s article was upheld. Later that summer, policy was clarified to make it clear that in deciding whether to keep or delete a page, it is legitimate to take the effect of the page on its subject into account, at least to some degree.

But in spite of the deletion, John’s name still turns up on Wikipedia -- it appears in our article about the criminal who abducted him, despite my and others' having argued for removing it. Moreover, and equally important, a Google search turns up not just a few but thousands of other hits with the same content. This is by no means just a Wikipedia issue, though of course that does not absolve Wikipedians of our obligation to handle this type of content responsibly.

We face the Internet-wide question whether there is anything we can do to avoid effectively making a collective decision that this horrific incident is the key piece of information that should be available about John’s life. Except that there is no real decision to be made, because there is nothing to be done. In John's case, as I wrote on the deletion review, we have collectively added violation by the crowd to violation by the crime.

Another constant source of these issues is coverage of “Internet memes” -- videos or pieces of information that catch public attention, often in a humorous way, but in the process often are humiliating to their subjects. For example, I once arranged the deletion of an article discussing an otherwise unknown person who sold his used laptop computer. When the computer didn't work properly, the purchaser took revenge by releasing embarrassing personal information and files from the computer onto the Internet. The resulting publicity, it was reported, had basically ruined this person's life. The people involved were identified on Wikipedia by name and location. To say the least, I thought we could remain a complete and worthwhile encyclopedia without further publicizing this matter. I nominated the article for deletion and got it deleted. The process took a month. (Today I might be more confident and just speedy-delete it myself.)

Another article we eventually decided we could live without discussed a young woman, also identified by name and city, who has been mocked for her poor judgment in having been overly detailed about how guests should behave at her 21st birthday party. For the rest of her life, if someone types her name into Google, they will find publicity about this supposedly grievous error she made, which may overshadow the coverage of anything else that she ever does or accomplishes. Wikipedia did not need to, and no longer does, discuss this episode; it never should have.

More examples come up every day. For those who follow such things: Should we include the “Star Wars Kid”’s full name? What, if anything, should we write about “Boxxy” or “Chris Chan” or “Brian P.”? Do we mention, and how much weight do we give to, the difficult times in people’s lives, especially where the person’s notability is borderline to begin with?

I do my best to advocate that Wikipedia not include content that will obviously hurt the subject of an article and does not enhance the encyclopedia we are writing. (I haven't done as much of this as I would like, given my ArbCom duties, but writing this essay has reminded me once again to place a priority on this work.) But even where a deletion or a redaction sticks, I don't delude myself any more that I've actually helped the subject of the article very much, when news coverage of the situation on fifty or five thousand other websites, spreading the same gossip and showing the same disrespect for privacy and dignity, is still out there. Wikipedia is a critically high-profile website, and I don't denigrate for one minute the importance of improving things on our site. But there are plenty of times I read something despicable on another website and wish I could delete it and block the person who wrote it. Only on-wiki can I even try.

Even developments in the spread of online information that seem unambiguously positive turn out to have more complex overtones when one thinks through the privacy ramifications. For example, free online searching of the complete back contents of The New York Times has recently become available. That's a home run for increasing the flow of information to the world, and great news, right?

Well, yes, it certainly makes research easier in a number of ways, as opposed to screening the old microfilms as one used to have to do, and for purposes of my research for both sourcing Wikipedia articles and my everyday legal research article-writing, I like it very much. And yet ... anyone who ever committed a youthful indiscretion that happened to make page C17 of the paper on a slow news day, will now be defined by that as one of the top results for his or her name, for the rest of his or her life. And multiply by dozens of other newspapers, and every other type of medium and website, and on and on and on. (The increasingly free public online access to court pleadings is another example whose ramifications are still being thought through.)

Incidentally, it is unlikely that many of the people affected by these damaging (but non-defamatory) types of unwanted publicity will have much chance for legal redress, at least in the United States. (And bringing a suit to redress this type of harm may be useless anyway; its main effect may be to further magnify the very publicity one is complaining about.) For readers wishing to explore the legal issues created by unwanted publicity and the question of whether media disclosure of facts that someone would prefer to conceal can ever give rise to a tort claim, the best place to start is probably Judge Posner’s opinion in Haynes v. Alfred A. Knopf, Inc., 8 F.3d 1222 (7th Cir. 1993), available at http://altlaw.org/v1/cases/493290. It thoroughly surveys the competing policy arguments, the precedents, and the constitutional considerations. If anyone knows of a comparably thorough discussion brought up to date for the Information Age, please tell us in the comments.

Isaac Asimov famously predicted fifty years ago that emerging technology would come at the cost of vanished privacy, though he didn't get the exact form of the technology right. Fifty-odd years later, much of his prediction has come true, and I only hope that the website I help administer can avoid being a central part of the problem. In a way, we all live in the goldfish bowl now. It is not always a pleasant place to be.

[Ira Matetsky, guest-blogging, May 13, 2009 at 11:15pm]
Wikipedia and the Biography Problem:

My thanks to everyone who has commented on my first two posts about Wikipedia, the collaboratively edited online encyclopedia. (For those coming in late, I’ve contributed to Wikipedia for about three years and am an administrator of the site and a member of the in-house Arbitration Committee; my username there is Newyorkbrad, after New York, where I live, and Brad, my middle name.)

Tonight I’m going to continue discussing the impact that the content of Wikipedia’s biographical articles can have on their subjects. (By the way, I'd like to thank those Wikipedians, and Wikipedia critics, who have helped me hone some of my thinking in this area. I would thank them by name, or at least by Internet pseudonym, but since I may not be in full agreement with their recommended solutions, they might not appreciate being named.)

As is widely recognized, if someone is notable enough to have a biographical article about himself or herself on Wikipedia (and doesn’t happen to have a very common name or a name that is also a word), that article will be one of the very top Google hits on a search for that person. Indeed, the Wikipedia article will very often be the highest-ranking Google result for anyone who is the subject of an article. The most common exception is if the person has his or her own website, in which case that site will often be number one, with Wikipedia right behind it.

Let’s try the experiment with a randomly chosen well-known person … how about, say, Eugene Volokh. I’ve just typed Eugene’s name into Google, and the first two hits are pages from this site, which counts as Eugene’s website; the third and fourth hits are his faculty bio and publication pages at UCLA; and the fifth hit is [[Eugene Volokh]], his biography on Wikipedia, which could use a little updating and sadly fails to mention [[HCSSiM]].

When Wikipedia was founded in 2001, no one involved anticipated that it would become as successful as it has, and no one anticipated the interplay between the link structure of Wikipedia and the algorithms used by search engines, which would raise Wikipedia biographies (and other articles) to such prominence. More generally, I don’t think anyone anticipated, and certainly no one thought through, all the implications of the fact that what was meant to be a harmonious, educational, collaborative corner of the Internet would be used to hurt people. Whether the site would have been set up differently had that outcome been predicted is destined to remain in the realm of thought experiment.

In the intervening years, though, it’s become more and more clear that malicious or simply thoughtless content added to Wikipedia BLPs (“Biographies of Living Persons”) can be very damaging. A series of serious and widely reported incidents has brought the problem to public attention. Among these: the [[Seigenthaler incident]], in which an article was vandalized to accuse a completely innocent person of suspected complicity in an assassination, and no one caught the problem for four months; the incident in 2007 in which a Turkish academic was detained for several hours by immigration officials in Canada, reportedly based on an inaccurate allegation in his Wikipedia article that he was a terrorist; the lawsuit brought by a prominent golfer against the person who added defamatory content to his article; and the blatant attack page created against a well-known California attorney, allegedly as part of a negative public relations campaign launched on behalf of one of the companies he was suing.

The Wikimedia offices have been contacted often enough by subjects of BLPs that apart from the usual network of on-site discussion pages and noticeboards, there is now an elaborate e-mail network (called OTRS) to which article subjects are referred. Concerns about defamatory BLP content are only a fraction of the inquiries received by OTRS and by administrators on-site: ironically, for every article subject demanding that his or her page be deleted or retracted, there is another inquiry by someone wanting to know why his or her page was not kept on the site. (Usually, the answer is that the person was judged not to be notable enough to warrant an article.) I think it is certainly fair to say that when Wikipedia was dreamed up, no one realized it would someday need a round-the-clock complaint desk.

In 2006, the English Wikipedia adopted a new policy on Biographies of Living Persons. It urged greater sensitivity to the effect that articles can have on the subjects, and in particular, provided that no negative or controversial content discussing a living person should be contained in an article unless a reliable source for the information is provided. Edits to enforce this policy were exempted from some of the usual editing regulations, particularly the “three revert rule,” which forbids changing any article back to a previous version more than three times in a 24-hour period. The policy is considered one of the most important we have, and it’s helped, but only some.
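
For readers who think in code, the interaction between the “three revert rule” and the BLP exemption can be sketched as follows. This is only an illustration of the rule as described above, not anything the site actually runs; the function name, the data layout, and the idea of tagging each revert with an exemption flag are all assumptions made for the example.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # the 24-hour period named in the rule
MAX_REVERTS = 3               # more than three reverts is a violation


def violates_three_revert_rule(reverts, now):
    """Check one editor's reverts on one article.

    `reverts` is a list of (timestamp, is_blp_enforcement) pairs.
    Reverts flagged as BLP enforcement are exempt and not counted
    (hypothetical flag, used here only for illustration).
    """
    counted = [
        when
        for when, is_blp_enforcement in reverts
        if not is_blp_enforcement and now - when < WINDOW
    ]
    return len(counted) > MAX_REVERTS


now = datetime(2009, 5, 13, 12, 0)
four_ordinary = [(now - timedelta(hours=h), False) for h in (1, 2, 3, 4)]
one_exempt = [(now - timedelta(hours=1), True)] + four_ordinary[1:]

print(violates_three_revert_rule(four_ordinary, now))
print(violates_three_revert_rule(one_exempt, now))
```

In the second case the editor has still reverted four times, but one of those reverts removed unsourced negative material about a living person, so only three count against the limit.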

Articles about notable individuals suffer from the greatest amount of inappropriate or disputed editing, frequently to score points in political campaigns or other real-world disputes. One example was described in an article in The New Republic about the primary campaign between Hillary Clinton and Barack Obama, here: http://www.tnr.com/story_print.html?id=4f0c6aa3-3028-4ca4-a3b9-a053716ee53d . Another prominent dispute (which resulted in an ArbCom decision that I wrote) arose last September over whether various allegations belonged in [[Sarah Palin]] and related articles. But the most serious victims of BLP violations are not prominent people, whose articles may be watched by hundreds of people so that bad edits are quickly corrected; they are less well-known people, in whose articles libels or mistakes may go unrecognized or linger for weeks.

There have, of course, been lots of discussions about what to do about all this. Clearly, at this stage of its evolution, Wikipedia is not simply going to drop all the biography articles. (Even if it did, statements about living people would come up in hundreds of thousands of other articles. BLP applies on every page of Wikipedia, not just in the biographical articles themselves.)

A solution sometimes proposed is to allow subjects unhappy about the existence or content of their articles to demand their deletion. This will surely never be implemented in full; Wikipedia will not delete [[Barack Obama]] or [[George W. Bush]] or other articles about high-profile people even if the subjects were to ask for it.

On the other hand, in cases involving people at the margins of notability, which is subjective enough anyway, there can be a place for taking the subject’s own feelings about the article into account in deciding whether or not to retain it. Sometimes this has become a de facto tiebreaker (in either direction) in close deletion discussions. (A few Wikipedians have urged that giving this factor even tiny weight in deciding what to keep or what to delete violates a philosophical principle that notability exists independent of the subject’s views, a view I would find more persuasive if application of the notability guidelines weren’t so often subjective in any event.)

Related to that is a proposal that subjects be allowed to “opt out” of Wikipedia if they aren’t prominent enough to have attained notability as measured by a well-defined, objective standard, such as having been the subject of offline “dead tree” biographical coverage such as a book or a hard-copy encyclopedia.

Out of curiosity, if anyone reading this happens to be the subject of a Wikipedia article — please tell us in the comments whether you would exercise the option to have your article deleted on request, if that option existed, and why or why not.

I’ve gone over the recommended word limit for one of these posts (which won't surprise anyone who's come to know Newyorkbrad on-wiki), so I’ll write about semiprotection and flagged revisions and Section 230 tomorrow.

[Ira Matetsky, guest-blogging, May 14, 2009 at 4:56pm]
Wikipedia and the Biography Problem, part 2:

(Please read part 1 from last night first; I’m just picking up where I left off.)

Another proposal that would certainly reduce vandalism of Wikipedia articles would be to eliminate editing by unregistered users, either throughout Wikipedia or at least on BLPs. Presently, “anyone can edit” extends even to users who haven’t registered an account. In wiki parlance, unregistered users are referred to as “IP” editors, because in the article contribution histories, the IP address of the computer from which they edited is displayed instead of their username. This form of “anonymous editing” should not be confused with a different sort of anonymity, which allows users to register under pseudonyms without providing their real names.

The main value of allowing IP editing is that it gives brand-new users the ability to try out “anyone can edit” for themselves, without taking the time and trouble to register. Many new users make their first edits as IPs, often after spotting a typo in an article or noting that some information is missing, and there is a fear that if registration were required to edit, some proportion of first-timers wouldn’t bother, and therefore would never develop the habit of contributing and become “Wikipedians.” For example, this is precisely how I got started in editing, as I mentioned the other night.

While IPs contribute many good-faith edits and some become regular contributors, IP editors are also responsible for much of the drive-by vandalism — often, but by no means always, committed by bored schoolchildren — that afflicts many pages (and gives other editors the opportunity to earn credentials as “vandalism fighters”). The ratio between valid and vandalistic edits by IPs is sufficiently low that from time to time there is discussion of requiring registration to edit. A significant step in that direction was taken in 2006, when users were required to register before creating a new page (as opposed to editing an old one).

An intermediate step would be to disallow IP editing just on BLPs. Administrators have the ability to “semiprotect” any page of Wikipedia. A semiprotected page cannot be edited by IPs or by newly registered editors. (A “full protected” page cannot be edited by anyone, except for administrators under very specific guidelines.) Pages are semiprotected usually when they are being vandalized by IPs, typically for short periods but sometimes for a longer term or indefinitely. (For example, [[George W. Bush]] or [[Hillary Clinton]] could probably never be unprotected without being overrun, but those are unusual cases.)
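
The protection levels just described amount to a small permission check, which can be sketched as below. This is an illustrative model only, not the site's software: the class and function names are invented for the example, and the four-day cutoff for a “newly registered” account is an assumed placeholder, since the description above does not give a number.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Protection(Enum):
    NONE = "unprotected"
    SEMI = "semiprotected"
    FULL = "full protected"


@dataclass
class Editor:
    username: Optional[str]      # None models an unregistered "IP" editor
    account_age_days: int = 0
    is_admin: bool = False


NEW_ACCOUNT_DAYS = 4  # assumed cutoff for "newly registered"; illustrative only


def can_edit(editor: Editor, protection: Protection) -> bool:
    """Return True if this editor may edit a page at this protection level."""
    if editor.is_admin:
        return True   # administrators can edit even fully protected pages
    if protection is Protection.FULL:
        return False  # nobody else can
    if protection is Protection.SEMI:
        registered = editor.username is not None
        return registered and editor.account_age_days >= NEW_ACCOUNT_DAYS
    return True       # unprotected: anyone can edit


ip_editor = Editor(username=None)
veteran = Editor(username="ExampleUser", account_age_days=900)

print(can_edit(ip_editor, Protection.NONE), can_edit(ip_editor, Protection.SEMI))
print(can_edit(veteran, Protection.SEMI), can_edit(veteran, Protection.FULL))
```

The sketch leaves out the “very specific guidelines” that govern when administrators may edit a fully protected page; those are matters of policy rather than of software.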

It has been proposed that either all BLPs be permanently semiprotected, or at least that they be liberally semiprotected at a lower threshold of vandalism or at the subjects’ requests. This would certainly reduce the amount of vandalism and defamation from non-registered IPs. (An objection is that it would also eliminate the ability of an unregistered editor, perhaps the article subject himself or herself, to fix vandalism or remove defamation. I don’t know how often this happens.)

The most recently proposed approach for reducing BLP violations and other types of bad edits is called “flagged revisions.” The idea of giving this approach at least a trial was supported by a majority of English Wikipedia editors who participated in a recent poll, and it has already been implemented on the German Wikipedia. There are various somewhat different proposals for how this could be done, either on all articles, or on BLP articles, or some subset of them. In general terms, flagged revisions means that anyone can still edit an article — but the edit does not become visible to readers until another editor has reviewed and approved it. It introduces some level of quality control; it also, some say, represents a step away from “anyone can edit.”
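The general idea can be sketched in a few lines of Python. This is only an illustration of the concept as described above, not the actual software: the class and method names are hypothetical, and it makes the simplifying assumption that readers are shown the most recent approved revision.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    author: str
    text: str
    approved: bool = False


@dataclass
class Article:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, author: str, text: str) -> Revision:
        """Anyone can still save an edit; it starts out unapproved."""
        revision = Revision(author=author, text=text)
        self.revisions.append(revision)
        return revision

    def approve(self, revision: Revision, reviewer: str) -> None:
        """Another editor reviews the edit; authors cannot flag their own work."""
        if reviewer == revision.author:
            raise ValueError("an edit must be approved by a different editor")
        revision.approved = True

    def visible_text(self) -> Optional[str]:
        """Readers see the most recent approved revision, if there is one."""
        for revision in reversed(self.revisions):
            if revision.approved:
                return revision.text
        return None


article = Article("Jones")
first = article.edit("UserA", "Jones is a chemist.")
article.approve(first, reviewer="ReviewerC")
article.edit("UserB", "Jones is a jerk!")   # saved, but not yet shown to readers
print(article.visible_text())
```

In this sketch the second edit sits in the history without reaching readers until someone approves it. The sketch does not settle what should happen when a new edit is stacked on top of one that is still pending.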

This procedure itself raises some questions of implementation. Some are mechanical, such as, what happens when User:B edits the same sentence that User:A has just edited, but before the edit has been flagged? Others are more substantive, such as who gets to be an edit-flagger, and what standards do they use in flagging? If a flagger sees that someone wants to edit Jones’s biography by adding “Jones is a jerk!” then he or she will disapprove the edit — but that’s not really the type of edit that, if a few people see it before it gets reverted, will really damage Jones’s reputation (though it will damage Wikipedia’s). The more subtle defamations may never be recognized by a reviewer who is intelligent and dedicated but unfamiliar with Jones’s life and work — and so they will still make it into the articles — only now they would come with an “approved by an official revision flagger” seal of approval.

The English Wikipedia is struggling with whether to take a step toward flagged revisions. Proponents suggest that it's a long-overdue and necessary step to address an obvious fault with the site; opponents suggest it would be the death knell of the "anyone can edit" philosophy that attracts people to contribute. A threshold issue is that there is no clear governance process on the English Wikipedia for issues like this, so no one even knows just how the decision will be made. (I'll talk more about governance in a day or two.)

Incidentally, because the issue has come up in the comments, there have been relatively few lawsuits brought by individuals claiming to have been defamed on Wikipedia. To the best of my knowledge, there have been no successful defamation suits against the Wikimedia Foundation, which is the not-for-profit foundation (formerly headquartered in Florida and currently in California) that owns the hardware on which Wikipedia’s and its sister projects’ data reside and the Wikipedia trademark.

In very general terms, the Foundation’s position has been that because it does not create or control the specific contents of any particular page, Section 230 of the Communications Act (47 U.S.C. § 230(c)) shields it from liability for defamatory content contributed by any user. That section provides that “[n]o provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.”

I know of no reported cases applying Section 230 to a claim against the Wikimedia Foundation. There is one unreported case, Bauer v. Glatzer in the Superior Court of New Jersey, which upheld the Foundation’s immunity. A leading case discussing Section 230 more generally is Barrett v. Rosenthal, 40 Cal. 4th 33, 146 P.3d 510, 51 Cal. Rptr. 3d 55 (2006), while an interesting law review article analyzing the application of Section 230 to Wikipedia is Ken S. Myers, Wikimmunity: Fitting the Communications Decency Act to Wikipedia, 20 Harv. J. L. & Tech. 162 (2006).

I wish very much that I were ending this post with a brilliant solution to problematic content regarding living persons on Wikipedia, but I don’t have one, even after having thought about this matter from lots of angles for close to three years. One of the reasons I asked Eugene if I could post here was to see what the readers here — legally and technically savvy, but without a vested interest in how the issue is addressed — might have to say about these issues. I'll move on to other topics in the next few days, but I'll continue reading the comments here. I'll do my best to respond to some of them before my blogging stint here is up.

[Ira Matetsky, guest-blogging, May 15, 2009 at 11:20pm] Trackbacks
Wikipedia: Who Runs the Place?

As Wikipedia, the collaboratively edited online encyclopedia, becomes more prominent, people often wonder who operates and administers the site. I'm also asked sometimes how I became involved as an administrator.

A majority of the people who contribute occasionally to Wikipedia may have little or no interaction with the administrative side of things at all. A new user doesn't need anyone's permission to start editing or to register an account. One can make dozens or hundreds of edits and never encounter an administrator acting as such or come into contact with the site's rules and guidelines.

My experience as a "newbie" Wikipedian was a largely, and perhaps unusually, positive one, and the lens of my own early experiences probably still colors how I look at the site. As soon as I registered my account, an experienced editor left a helpful "welcome" message on my talkpage, with links to relevant pages of policies and helpful hints. (Each user has a talkpage, which is a special page for messages intended for that user.) The first time I made a bunch of edits to an article, someone posted to my talkpage and thanked me for my contributions. When I had questions about how to format an article, I posted to the Help Desk and received a polite and useful response almost instantly. When I made rookie mistakes, they were quietly corrected and I was gently advised what had gone wrong. I was invited to join a project of editors with interests similar to mine. When I started to learn about policies, I read guidelines such as "be civil to your fellow editors," "when there is a disagreement, discuss it and seek consensus," and "don't bite the newcomers."

So my first impression was that Wikipedians included a collaborative group of exceptionally friendly people working together to write an encyclopedia while having some fun in the process. (Okay, I soon learned that not every page of Wikipedia was like that, as I was clued in pretty early to some areas where there was some nasty feuding going on. In fact, within a couple of months, I was trying unsuccessfully to mediate one of the loudest feuds on the site. But a first impression is a first impression.)

Of course, not everyone has the same generally favorable introduction to contributing that I did. If an editor's first contribution is an article about a marginally notable person or a garage band or his junior high school, his first memory of Wikipedia may be of the article being summarily deleted. If a user starts off writing in a controversial area, her first experience may be one of "edit-warring" as disputing users change the article back and forth to their preferred versions. If an editor starts off by uploading images, she will very likely receive a warning for inadvertently violating one or another of the complex rules implemented to prevent copyright violations. And sometimes one just runs into another editor who either doesn't know anything about the subject-matter but acts as if he does, or who just feels like being a jerk.

(I was once asked whether I'd ever been a party to a real edit-war. The biggest one I recall was an ongoing dispute about whether Presidential and Congressional terms prior to the Twentieth Amendment ended at midnight on March 3rd or at noon on March 4th. This issue comes up all the time in biographies and lists. The answer, of course, is March 4th, but because there are some otherwise authoritative sources such as older editions of the "Congressional Biographical Directory" that say March 3rd, this remains a matter of occasional contention.)

So sooner or later a truly experienced editor will run into the administrative apparatus underlying the site. On the English Wikipedia, any registered editor is eligible to run for the status of administrator. In practice a few months' editing experience and a few thousand edits are required for a successful candidacy. Nominations can be made by oneself or by another user and are posted to a page called "Requests for adminship" ("RfA"), where any interested user can post a "support" or "oppose" comment (one must carefully avoid calling it a "vote") based on whatever criteria (within reason) they individually choose to apply.

After seven days, the results are reviewed by a senior administrator archly designated as a "bureaucrat," who determines whether there is a "consensus" to promote the candidate. Hundreds of megabytes of text on [[Wikipedia talk:Requests for adminship]] have been spent in seeking out the perfect metaphysical definition of consensus, but in practice, support from 75% of the "!voters" typically guarantees promotion.

There are no requirements for adminship beyond having a sufficiently strong record of participation to pass RfA. There is no requirement that the candidate disclose his or her real name or background, and many don't. (I've never disclosed my real name on-wiki, although at this point I will soon go ahead and do so.) Nor is there a minimum age requirement. (Certain specialized functionaries do now have to be over 18 and provide proof of their identity to the Wikimedia Foundation Office, though they don't have to disclose it publicly.) There have been administrators as young as 12 or 13 years old; there are no good demographic numbers that I'm aware of, but I would estimate that the median age would be no higher than mid-20s, and I'm painfully aware that at age 46 I am almost surely in the oldest decile of admins. (It feels like just yesterday that I was the youngest person ever elected to the School Board in my town, and now I'm a senior wiki-citizen.)

Critics of Wikipedia often suggest that there is a serious problem with the fact that so many of the administrators, with important powers such as blocking and deletion, are relatively youthful. These are often the same people who suggest that it is absurd for older people with more life experience to spend a portion of their hobby time serving as Wikipedia administrators. Sometimes the same critics make both of these comments, but they are, in effect if not in intent, mutually exclusive.

Administrators are given certain special powers not open to other users, such as the ability to block someone who has violated Wikipedia policies from editing; to delete a page; to protect a page from editing (either by new users or by any non-admin); to close certain discussions and decide their outcomes; and to view the content of most material that has been deleted. There are about 1600 administrators on the English Wikipedia, of whom a few hundred are active at any one time. There are rules governing how admins are to use their tools, and policies urging them to be civil and helpful in their interactions with other users. In my experience, most administrators do their best to live up to these guidelines; of course, the occasional exception affects the reputation of all.

There is also a system of methods for dispute resolution, including various options for mediation and noticeboards for discussing different types of concerns that may arise. At the end of the dispute resolution process is a body known as the Arbitration Committee, which consists of a group of editors (currently 16) chosen in annual elections. (Formally, the committee is appointed by Jimmy Wales, who holds a special role in Wikipedia governance derived from his role in founding the site, but in the past few elections he has followed the election returns.) The ArbCom addresses user conduct disputes, and typically is not empowered to decide issues such as "which version of this article is better?" or "what should our policy on such-and-such be?" At the moment there is no central mechanism for handing down binding resolution on content disputes or policy decisions, and there is disagreement about whether it would be desirable for there to be one.

I've been following the workings of the ArbCom since early in my wiki-career: first as an occasional critic, later as a clerk for the committee, and since January 2008 as one of the arbitrators. My work as an administrator and an arbitrator has completely changed my Wikipedia experience: Instead of contributing substance to a growing body of free knowledge in an atmosphere of respect and harmony, I must review the history of Wikipedia's most contentious, protracted, bitter, and unhappy disputes and help decide what to do about them.

The cases that come to arbitration are those that cannot be resolved any other way. Most often, they concern editing disputes in exactly the areas one might expect to be the most contentious of all; cases we have accepted this year have included disputes about editing of [[Ayn Rand]] and related articles, of [[Scientology]] and related articles, of [[Ireland]] (is "Ireland" primarily the name of an island or a country?), of [[Macedonia]] (or is it [[The Former Yugoslav Republic of Macedonia]]?), and so on. We have also accepted cases involving individual administrators or editors who have engaged in allegedly problematic behavior.

After reviewing each case, the committee issues a decision comprising principles, findings of fact, and remedies. The remedies we can hand down range from noting instances of bad behavior and admonishing parties to do better, through restricting a user's editing (such as by banning her from editing articles about a particular topic), imposing various types of probations or mentorships, and revoking an administrator's adminship ("desysopping"), to, in the most extreme cases, banning an editor from Wikipedia altogether.

We try to keep the process from becoming too legalistic, although occasional legal terms or wordings sneak into the process or the decisions, for which I am occasionally to blame. (The most useful thing I've tried to bring with me in terms of a legal concept is an instinct to always make sure that the parties have had a fair opportunity to present their views and evidence before we proceed to a decision.) My real-life work as a lawyer has not had much to do with how I think as an arbitrator: There are very few parallels between the work of a committee on a website and anything that happens in the real world, and in decisions, I've emphasized that nothing we decide is meant to have any consequences in the offline world. Still, sometime, if I can figure out a way to do it without sounding absurdly aggrandizing, I will write about what my time as a Wikipedia arbitrator has taught me about the types of decisions that must be made every day by a judge of a multi-member appellate court with a discretionary jurisdiction.

Ultimate control over the English Wikipedia, along with all of the sister projects and projects in other languages, resides with the Wikimedia Foundation. The Foundation is the charitable foundation that owns the equipment and the trademarks. The Foundation has a board of directors (chosen by a combination of methods), an Executive Director and a small staff, and a General Counsel (currently Mike Godwin, of Godwin's Law fame). It sets policy only at a very broad level, and does not get involved in addressing particular disputes.

Tomorrow: Some responses to reader comments.

[Ira Matetsky, guest-blogging, May 16, 2009 at 11:16pm] Trackbacks
Wikipedia: Some Responses to Comments:

My thanks to everyone who has read my guestblog posts this week on the subject of Wikipedia, the online encyclopedia where I am an editor, an administrator, and an arbitrator (User:Newyorkbrad). Tonight I should address some of the comments on my earlier posts, which I will do in no particular order. (I've already implicitly addressed some comments on my earlier posts in later ones, so I won't duplicate that; and please understand that in limited time and space I can't possibly cover everything.)

In response to my posts about problems regarding Wikipedia articles involving biographies of living persons ("BLPs"), the suggestion was made that when an issue arises concerning whether a biographical article should be kept on Wikipedia or deleted, there be a presumption in favor of deletion unless there is a collective decision to keep it, rather than the other way around. (In Wikiparlance: when a BLP is AfD'd, "no consensus" would default to delete. In an ordinary deletion discussion, by policy, "no consensus" defaults to keep.)

This suggestion has been advanced and discussed on-wiki, and has won wide endorsement, but not quite enough to be adopted. A main sticking point is that a BLP can be nominated for deletion for reasons having nothing to do with defamation, privacy violation, or undue weight -- say, a dispute over whether an athlete or a performer is quite notable enough to warrant coverage. In many of these instances, ironically, if the article subject were asked, he or she might prefer that the article remain. (We sometimes get complaints from people whose articles are deleted; there may well be more people who are unhappy that they are excluded from Wikipedia than people who are unhappy that they are included.)

I advanced a compromise proposal suggesting that deletion discussions on BLPs default to delete where the notability of the subject is not clear-cut (that would presumably be the case anytime the tentative AfD result is "no consensus") and at least one of the following is true: (1) the article taken as a whole is substantially negative with respect to the reputation of the subject, (2) the article subject is a minor, or (3) the article subject is known to have himself or herself requested the article's deletion. It may be time to revive discussion on-wiki of this suggestion.
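For readers who think in decision rules, the compromise can be written out as a short Python sketch. The names below are hypothetical and the code is only one reading of the proposal. In particular, it treats a tentative "no consensus" result as the stand-in for "notability is not clear-cut," as presumed above, and falls back to the ordinary rule that "no consensus" defaults to keep.

```python
from dataclasses import dataclass


@dataclass
class BlpDeletionDiscussion:
    tentative_result: str              # "keep", "delete", or "no consensus"
    substantially_negative: bool       # (1) article as a whole hurts the subject's reputation
    subject_is_minor: bool             # (2) the subject is a minor
    subject_requested_deletion: bool   # (3) the subject is known to have asked for deletion


def close_discussion(discussion: BlpDeletionDiscussion) -> str:
    """Return the outcome of a BLP deletion discussion under the compromise rule."""
    if discussion.tentative_result != "no consensus":
        # A clear consensus either way is simply followed.
        return discussion.tentative_result

    any_factor_present = (
        discussion.substantially_negative
        or discussion.subject_is_minor
        or discussion.subject_requested_deletion
    )
    # With no consensus, any one of the three factors tips the default to delete;
    # otherwise the ordinary default of keeping the article applies.
    return "delete" if any_factor_present else "keep"
```

Under this reading, a borderline but neutral biography of an athlete would still be kept on "no consensus," while a borderline biography whose subject has asked for its removal would be deleted.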

Also relevant are two decisions by the Arbitration Committee (although I was not active in either case) establishing that any administrator may delete content deemed obviously unsuitable, and in those cases, the content stays out unless and until there is a consensus to keep it. While these holdings are on the books, though, unilateral deletions of high-profile articles often lead to a great deal of disputation and "drama," which can result in greater publicity for the material the admin believes should be deleted than the disputed article itself ever had. (A notable improvement within the past couple of years is the use of "noindex" coding so that our back-office discussions, such as deletion debates, don't show up on Google. The use of "noindex" to keep certain types of not-ready-for-prime-time Wikipedia content off of search engines should be expanded.)

Also apropos of BLP issues, I would like to thank two commenters on my first BLP post for making clear the tensions that exist in this area. I wrote about a boy named "John" who had been kidnapped and mistreated a couple of years ago, who I thought should not be the subject of a Wikipedia article, as an example of material both on Wikipedia and on the Internet more widely that raised privacy issues. The first commenter suggested that in using this example I must still be in the process of merely clearing my throat, because it is obvious that no such article should exist. The second commenter suggested that I was a censor for seeking to depublicize such content, including mention of the boy's name, which I'd been careful not to include. (I acknowledge, however, that I had not been aware of the Today interview of the boy's parents.) And so it goes. In any event, if anyone does not find that example compelling I offered several others.

There were several comments bemoaning the deletion of certain content on topics like anime. Although I haven't checked the specifics of the deleted articles that the commenters cited (which as an administrator I could do), in general I agree with these criticisms. Outside the context of BLPs, I am probably as strong an "inclusionist" (the opposite is "deletionist") as can be found in the administrator corps. We delete too many articles on topics found to be "not quite notable enough." In particular, our completely laudable policy of justifying inclusion of articles by requiring citation of multiple stable reliable sources and a showing of some degree of prominence can be taken too far, and has decimated our coverage in areas like webcomics. On the other hand, we don't want to be a promotional outlet for every garage band formed last week or website with 10 readers, and allowing articles with no sources makes it too easy to plant hoaxes -- so lines will always have to be drawn somewhere.

"Spoiler warnings" were removed throughout the fiction articles because a small but determined group of users armed with bots (automated programs that conduct repetitious tasks) believed strongly that they are "not encyclopedic." In the (paraphrased) words of one of them, if you look up a novel or a film in an encyclopedia, you can presume that it is going to discuss the plot, so no one should be surprised that there is mention of the ending. Of course, there are counterarguments. I personally don't have a strong view on this one, but to the commenter, you are free to start up a discussion on-wiki if the lack of spoiler warnings troubles you.

Someone suggested that Wikipedia needs stronger coverage of law and legal topics. The editors in Wikiproject Law would certainly welcome more participation from lawyers, law students, legal historians, legal academics, and others interested in the subject-matter in creating, expanding, honing, and sourcing articles on legal topics. A particular issue with these articles is making sure that where applicable, they are written from a global perspective, as the English Wikipedia is edited from and read in every country in the world. A usual if superficial response to on-wiki complaints that an article needs improvement is a template called "{{sofixit}}". More on this tomorrow.

My thanks to the commenter who recommended the Damon Knight story. I'll definitely be looking it up.

I'll wrap up this series of posts tomorrow with some links to Wikipedia for those who might want to start editing, some links to sites critical of Wikipedia for those who want to see more meta-debate, and a couple more questions for the audience. My thanks again to all the readers and commenters.

[Ira Matetsky, guest-blogging, May 17, 2009 at 11:54pm] Trackbacks
Wikipedia: Some Concluding Thoughts and an Invitation:

A few years ago, as the promised Information Superhighway was growing into the Internet that we know today, no one (to my knowledge) predicted that a collaboratively written, free-content, mass-linked website aspiring to cover all areas of human knowledge would become one of the most prominent information sources in the world. Still less did I anticipate that I would eventually play a role helping to administer such a site.

Eugene inspired me to volunteer this series of posts, now drawing to a close, by discussing a series of cases in which courts have either cited to Wikipedia for information, or asked themselves whether they can take judicial notice of the content of a Wikipedia article.

My own take on the reliability of Wikipedia articles is consistent with that suggested by some of the commenters: articles on non-contentious topics are usually accurate; articles on highly contentious topics are usually accurate on basic facts, but can be subject to bias and dispute with respect to the matters in controversy. It's an overgeneralization, but in essence, if debating a subject could lead to a fist-fight in a bar, or to a heated dispute in academe, then sooner or later the subject will be involved in a content dispute on Wikipedia. This is really not a surprise.

(The surprise comes from how many additional petty matters we also argue about. The people who sometimes refer to Wikipedia administrators and experienced editors collectively as a "Hivemind" may have overlooked the amount of bickering that goes on every day on the Administrators' Noticeboards.)

However, a strong article with more than the most basic content should contain citations to the sources from which the information in the article was drawn. Checking the sources, and where appropriate citing to them rather than the Wikipedia article itself, may often resolve the question of "is Wikipedia reliable enough to cite?" If no sources are cited, check the links to related articles; the relevant sources may be there. Otherwise, the article history will tell you who wrote the article, and sometimes a query on his or her usertalkpage will elicit the missing references. Beyond that, every article on Wikipedia has an associated "talkpage" where issues concerning the contents of the article, including requests for sourcing of controversial statements, may be addressed.

Anyone relying on Wikipedia must take into account that there are no guarantees as to who contributed a given article or sentence, or why. (In fact, this is emphasized in a couple of places on the site itself.) I think the general population of Internet users has become more aware of both the strengths and weaknesses of various resources, including Wikipedia, than was the case even a few years ago. At a dinner with extended family recently at which my role on Wikipedia came up, my niece, aged 11, told me that her middle-school librarian had cautioned her students not to rely automatically on the accuracy of Wikipedia and to double-check the information before using it for anything important.

More generally, everyone, and especially those young enough not to remember pre-Internet times, will come to learn which types of research can effectively be performed on the Internet, and which benefit most from access to older or more traditional resources. (Readers who are lawyers will recognize this as analogous to the discussions that go on between younger lawyers heavily dependent on Lexis or Westlaw, and more senior lawyers who believe that thinking through a problem and researching comprehensively often requires a trip to the library.) No one, whether a student writing a paper or someone looking for information, should simply accept information derived from any source without thinking through the quality of the source and what biases it might introduce.

Wikipedia is a valuable but flawed resource and, as I stressed in the first of these posts, its main strength is also its main weakness: that anyone can contribute to it. Some articles suffer from political or other biases (a core content policy is that all articles must be written from a "neutral point of view," but not every editor is committed to upholding policy, and in any event, NPOV is often in the eye of the beholder). Some articles have been tampered with playfully or maliciously, and although most "vandalism" or "trolling" edits are picked up and reverted quickly, others are not, and some have lingered for months.

(Most vandals are just passing nuisances, such as bored schoolchildren, but some are more persistent, and a small but extremely troublesome handful are persistent to the point of doing serious damage. I'd be interested, just as a point of information, in learning whether there is any legal precedent for in some fashion barring such people from write-accessing or editing on a site. I will add that this is intended as a purely academic question.)

Moreover, the quality of articles varies very widely, and some articles need to be expanded or rewritten before they will have much value. Some articles are absent altogether; even with 2.8 million articles, there is a lot more yet to be written. (I envy some of the earlier editors who had the whole scope of knowledge to write on a blank slate, but there is still plenty more to be done. The occasional suggestion that "everything worth writing on-wiki has been written" is no more accurate than the comment of the apocryphal patent examiner who supposedly urged that "everything worth inventing has been invented.")

And yet — for all of Wikipedia's flaws, the fact is that it has become a central resource relied upon by many. That suggests that researchers typically find Wikipedia content both accessible and reliable. As I pointed out in my first post, Conspirators on this blog often link to a Wikipedia article when introducing a topic. They wouldn't do that if the articles weren't reasonably reliable at least in their basics. In my own experience, when I Google a topic and I come upon the Wikipedia article and read it, I find the information reliable. It may or may not be complete or brilliantly written, but it rarely is just wrong.

This will be my last post in this series, but I'll try to respond to any ongoing dialog in the comment thread. For those interested in further discussion, I assure you that there is ongoing dialog about virtually every issue affecting Wikipedia to be found somewhere right on Wikipedia. Although a few core policies are handed down by the Wikimedia Foundation, and some others arise from technical features and limitations of the software, almost all other Wikipedia policies and guidelines are developed largely by "the community," which means the collective body of editors, and more specifically, those who care enough about a given issue to participate in discussing it.

For those interested in discussing these issues without venturing onto Wikipedia itself, there is ongoing discussion of both theoretical objections to Wikipedia's structure and day-to-day operating issues on a website called "Wikipedia Review" (www.wikipediareview.com). (I participate there on occasion myself -- I had not intended to, but someone invited me to join and I accepted.) "WR" can be a mixed bag, containing some overgeneralization and too much nasty ad hominem for my taste -- but interspersed with that is some of the more well-reasoned criticism and commentary I've seen. WR also has a blog, blog.wikipediareview.com, whose contents may be more accessible than the sometimes "inside baseball" discussions on the main forum. More recently, some present and former WR members have started another site, www.akahele.com, which also contains critical essays and commentary.

But in my opinion, the best way of enhancing or improving Wikipedia, whether by tweaking one article at a time or by advocating for some site-wide policy change, is to roll up one's sleeves and join in contributing there. For me, at least, it has combined the fun of a new hobby with the enjoyment of sharing knowledge and helping resolve disputes.

Remember, anyone can edit, with or without registering an account. (Creating a brand-new article requires registration, which can be done using one's real name or a pseudonym.) The user interface is accessible even to those without computer skills (believe me, if I could master it, then anyone can), and within a few minutes of sitting down at the keyboard, you'll be an editor and a Wikipedian. Any editor can edit not just articles, but the policy discussions and related pages as well.

If you get stuck, [[Wikipedia:Help]] should link you to a page containing whatever information you need, although Volokh Conspirators (and anyone else) are also welcome to inquire at [[User talk:Newyorkbrad]] if you run into any problems. Perhaps a few of us can conspire there to pick a law-related article to collaborate on and bring up to Featured Article. (And also feel free to e-mail me with any questions or comments if you prefer; please use Newyorkbrad -at- Gmail.com rather than my work e-mail for this purpose.)

My thanks to Eugene and the other Conspirators for giving me a forum here this week (and also for giving me lots of interesting and challenging blog-reading over the years), to everyone who has commented or will comment on one of my posts, to the knowledgeable people who responded in detail to my query on Monday about the [[Saxbe fix]], and to all of the readers.
