The Myth of the Superuser, Part One:
Thanks very much, Eugene, for inviting me to talk about my latest research. I’m offering the VC reader a twofer: Today through Wednesday I’ll describe my ideas about the Myth of the Superuser, and Thursday and Friday I’ll discuss an empirical project involving the Analog Hole. (Quick plug: The Superuser article is looking for a law review to call home, so if you choose articles for a journal, please give it a read!)
My first project is a critique of the rhetoric we use when we debate online conflict. In our debates, storytelling is epidemic, and the dominant trope is the myth of power. To restate it like a less-charged version of Godwin’s Law (I’d call it “Ohm’s Law,” but that’s taken): as a debate about online conflict progresses, the probability of an argument involving powerful computer users approaches one.
For example, law enforcement officials talk about the spread of zombie “botnets” to support broader computer crime laws. Privacy advocates fret about super-hackers who can steal millions of identities with a few keystrokes. Digital rights management opponents argue that DRM is inherently flawed, because some hacker will always find an exploit. (The DRM debate is unusual, because the power-user trope appears on both sides: DRM proponents argue that because they can never win the arms race against powerful users, they need laws like the DMCA.)
These stories could usefully contribute to these debates if they were cited for what they were: interesting anecdotes that open a window into the empirical realities of online conflict. Instead, in a cluttered rhetorical landscape, stories like these supplant a more meaningful empirical inquiry. The pervasive attitude is, “we don’t need to probe too deeply into the nature of power in these conflicts, because these stories tell us all we need to know.”
Too much attention is paid to the powerful user, or the Superuser as I call him. (UNIX geeks, I’m aware I’m overloading the term.) Today I focus on the first part of the argument, my “proof” that the Superuser’s importance is often exaggerated. Superusers inhabit the Internet, but they are often so uncommon that they can safely be ignored.
(Two quick asides that are sure to come up in the comments: First, even a few Superusers deserve attention if they act so powerfully that they account for a significant portion of the harm. Measuring the impact of the Superuser requires more than a head count; it also means measuring the amount of harm caused by any one Superuser. Second, Superusers can empower ordinary users by building easy-to-use tools; I address this so-called “script kiddie” problem in the article.)
We know that the Superuser’s power is often exaggerated for three reasons:
First, some statements of Superuser harm are so hyperbolic as to be self-disproving. For example, as Cybersecurity Czar under the Clinton and second Bush administrations, Richard Clarke was fond of saying, “digital Pearl Harbors are happening every day.” I’m not sure what meaning Clarke was giving to the phrase “digital Pearl Harbor”: he may have meant attacks with the psychologically damaging effect, horrific loss of life, terrifying surprise, size of invading force, or historical impact of the December 7, 1941 attack. No matter which of these he meant, the claim is a gross overstatement.
Second, experience suggests that some online crimes are committed by ordinary users much more often than by Superusers. Take data breach and identity theft. Data breachers are often portrayed as genius hackers who break into computers to steal thousands of credit card numbers. Although some criminals fit this profile, increasingly, the police are focusing on non-Superusers who obtain personal data using non-technical means, like laptop theft. Similarly, identity thieves are often not computer wizards; the New York Times reported last year that many District Attorneys see more meth addicts committing identity theft than any other segment of the population.
Consider also claims that terrorists are plotting to use computer networks to threaten lives or economic well-being. There has never been a death reported from an attack on a computer network or system. In fact, many experts now doubt that an attack will ever disable a significant part of the Internet.
Of course, there are limits to using opinions and qualitative evidence to disprove the Myth, because they have so much in common with the anecdotes that fuel it. The third way to dispel the Myth is through studies and statistics. As one very recent example, Phil Howard and Kris Erickson of the University of Washington released a study that found that sixty percent of reported incidents of the loss of personal records involved organizational mismanagement, while only thirty-one percent involved hackers.
This is just a taste; in the Article, I go into much greater depth about why the Myth is not to be believed. Tomorrow, I will discuss the significant harms that result from Myth-influenced policymaking. Finally, on Wednesday, I will focus on a root cause of the problem: the inability of computer security experts to discriminate between high-risk and low-risk harms online.
The Myth of the Superuser, Part Two, Harm:
First, a quick note to lawyers: today’s installment about my article is much more law-focused than yesterday’s.
I am grateful for yesterday’s comments. Many of you took issue with my use of the word “Superuser.” You have almost persuaded me to use “Superhacker” instead, although it would be a painful change. After living with this article for the past year-plus, it’ll be hard to think of it as anything but the Superuser piece. I’m still on the fence, so for the rest of my stay here, I’ll continue to call these mythical people Superusers.
Why should we care whether exaggerated arguments about Superusers cause legislators to address risks that are unlikely to materialize? Because many significant harms flow from the Myth of the Superuser. Due to the near-universal belief in the Myth, there has never been a thorough accounting of these harms, and we have been doomed to repeat and extend them. In my article, I discuss six harms that flow directly from policies and laws that are justified by the Myth. Today, I want to focus on two:
1. Overbroad laws. Congress’s typical response to the Myth of the Superuser is to write broad criminal prohibitions. It is haunted by the possibility that someday a Superuser who commits a horrific wrong will escape justice because of a narrow prohibition in the law. Legislators fear an American version of Onel de Guzman, the Philippine citizen who confessed to writing the “I LOVE YOU” virus but escaped punishment because Philippine law did not criminalize the type of harm he had caused.
Consider, for example, the principal federal law that prohibits computer hacking, the Computer Fraud and Abuse Act (CFAA). Many of the statute’s prohibitions apply expansively, and I contend that Congress has repeatedly broadened the law, in large measure, to deal with the scary prospect of Superuser hackers. For proof, count the number of stories about anonymous Superusers in any House or Senate Report accompanying an amendment to the CFAA; an especially egregious example is the 1996 Senate Report.
The CFAA’s prohibitions cover an expansive laundry list of activity. You might be a felon under the CFAA’s broad “hacking” provisions if you: breach a contract; “transmit” a program from a floppy to your employer-issued laptop; or send a lot of e-mail messages. And even if the FBI decides not to prosecute you for these transgressions, the broad CFAA gives it the right to investigate you, to read your e-mail messages and maybe even wiretap your phones and Internet connections.
2. Infringements of Civil Liberties. Part of what is terrifying about the Superuser is how the Internet allows him to act anonymously, hopping from host to host and country to country with impunity. To find the Superuser, the police ask for better search and surveillance authorities and tools, as well as the latitude to pursue creative solutions for piercing anonymity.
But broadened search authorities can be used unjustifiably to intrude upon civil liberties. Search warrants for computers are a prime example; the judges who sign and review computer warrants usually authorize sweeping and highly invasive searches justified by storytelling about the Superuser Data Hider.
It has become standard boilerplate for agents in their affidavits supporting search warrant applications to talk about sophisticated technology that can be used to hide data. According to this boilerplate, criminals “have been known” to use kill switches, steganography and encryption to hide evidence of their crimes. In addition, file names and extensions are almost meaningless, because users can easily change this information to hide data.
Convinced of the prowess of the data hider, a typical judge will sign a warrant that authorizes the search of every single file on a suspect’s computers; that authorizes the search of parts of the hard drive that don’t store files at all; and that allows off-site computer searches, where data is forensically examined for months or maybe even years. In upholding the scope of these kinds of searches, reviewing courts make bare and broad proclamations about what criminals do to hide evidence. These broad pronouncements (which are also citable precedent) are built upon nothing but an agent’s assertions and a judge’s intuitions about computer technology.
If, in reality, some criminals tend not to hide data inside obscured filenames or unusual directories, then judges might feel compelled to ask the police to cordon off parts of a computer’s hard drive.
So where does this particular myth end and reality begin? Common sense suggests that some criminals are paranoid enough to hide evidence. But it’s highly improbable that all criminals are equally likely to use these tactics. Home computer users who are committing relatively non-technological crimes — death threats or extortion via e-mail, for example — may have less incentive to hide evidence and no access to the tools required to do so. Painting all criminals in every warrant application as uniformly capable of hiding information is a classic example of the Myth.
In the Article, I call for judges to require a more particularized showing of “criminal tradecraft” before they sign sweeping warrants. How do we know that this class of criminal is likely to have used these particular tactics? The hurdle need not be very high; police training and experience are owed deference. But deference is not the same thing as acceptance of sweeping generalizations. In some cases, constraints on the allowable scope of a police search of a hard drive may be sensible, and perhaps even required by the particularity clause of the Fourth Amendment.
Very briefly, in addition to these two harms (overbroad laws and civil liberties infringements), the other four harms I identify are guilt by association (think Ed Felten); wasted investigative resources (Superusers are expensive to catch); wasted economic resources (how much money is spent each year on computer security, and is it all justified?); and flawed scholarship (see my comment from yesterday about DRM).
Tomorrow, I will conclude my discussion of the Superuser by focusing on a root cause of the myth, the failure of expertise.
The Myth of the Superuser, Part Three, The Failure of Expertise:
Over the past two days of discussion about my article, I have essentially been saying that we (policymakers, lawyers, law professors, computer security experts) do a lousy job calculating the risks posed by Superusers. This sounds a lot like what is said elsewhere, for example involving the risks of global warming, the safety of nuclear power plants, or the dangers of genetically modified foods. But there is an important difference: researchers who study these other risks rigorously analyze data. In fact, the gap between their focus on numbers and probabilities and the average person’s seeming disregard for statistics is a central mystery pursued by many legal scholars who study risk, such as Cass Sunstein in his book, Laws of Fear.
In stark contrast, experts in the field of computer crime and computer security are seemingly uninterested in probabilities. Computer experts rarely assess a risk of online harm as anything but “significant,” and they almost never compare different categories of harm for relative risk. Why do these experts seem so willing to abdicate the important risk-calculating role played by their counterparts in other fields? Consider four explanations:
1. Pervasive Secrecy. Online risks are shrouded in secrecy. Software developers use trade secrecy laws and compiled code to keep details from the public. Computer hackers dwell in a shadowy underground. Security consultants are bound contractually not to reveal the identities of those who hire them. Law enforcement agencies refuse to divulge statistics about the number, type, and extent of their investigations and resist Congressional attempts to increase public reporting.
Which brings us to California SB 1386, the state’s data breach notification law. Inspired by experiences with this law, Adam Shostack argued at this year’s Shmoocon that “Security Breaches are Good for You,” by which he really meant, “breach disclosure is good for you,” setting off a mini-debate in a couple of blogs. (See this post and work backwards from there.) On his blog, Adam said:
The reason that breaches are so important is that they provide us with an objective and hard to manipulate data set which we can use to look at the world. It's a basis for evidence in computer security. Breaches offer a unique and new opportunity to study what really goes wrong. They allow us to move beyond purely qualitative arguments about how bad things are, or why they are bad, and add quantification.
I think Adam is on to something, and this quote echoes some of my conclusions in the article. But I’m not hitching my argument directly to his, because even if you conclude that Adam is wrong, and think the need for secrecy and non-disclosure trumps his desire for a more scientific approach to computer security, secrecy still shouldn’t trump accurate, informed policymaking (lawmaking, judging). What does this mean? If someone wants to keep the details behind a particular risk secret, for whatever reason, perhaps that’s his prerogative. But if he then complains to policymakers about vague, anecdotal, shrouded risks, he should be ignored, or at least his opinion should be greatly discounted.
2. Everyone is an Expert. “Computer expert” is a title too easily obtained. Unlike modern medical science, where the signal advances require money and years of formal education to achieve, many computer breakthroughs tend to come from self-taught tinkerers. In many ways, the democratizing nature of online expertise is cause for celebration; it is part of what makes Internet innovation and entrepreneurship so exciting.
The problem is that so-called computer experts tend to have neither the training nor the inclination to approach problems statistically and empirically. People can be called before Congress to testify about identity theft or network security even if they neither know nor care how often these risks materialize. Their presence on a speakers’ list crowds out the few who are thinking about these things empirically and robustly.
3. Self-Interest. Many experts have a self-interest in portraying online actors as sophisticated hackers capable of awesome power. Law enforcement officials spin yarns about legions of expert hackers to gain new criminal laws, surveillance powers, and resources. The media enjoy high ratings and ad revenue reporting on online risks. Security vendors will sell more units in a world of unbridled power.
4. The Need for Interdisciplinary Work. Finally, too many experts consider online risk assessment to be somebody else’s concern. Computer security experts often conclude simply that all computer software is flawed, and that malicious attackers can and will exploit those flaws if they are sufficiently motivated. The question isn’t a technology question at all, they contend; it is about means, motive, and opportunity, which are questions for criminologists, not engineers.
Criminologists, for their part, spend little time studying computer crime, perhaps assuming that vulnerability-exploit models can only be analyzed using computer science. The answer, of course, is that they’re both wrong – and both right. Assessing an online risk requires an interdisciplinary blend of computer science, psychology and sociology; short-sighted analyses that focus only on some of these disciplines often result in misanalysis.
One Prescription: Better Data. I won’t spend too much time summarizing my prescriptions. The gist is that we need to start to police our rhetoric, and we need to do a better job collecting and using data. Two sources of data seem especially promising: the studies coming out of the burgeoning Economics of Information Security discipline, and the ongoing National Computer Security Survey co-sponsored by DOJ’s Bureau of Justice Statistics and DHS’s National Cyber Security Division and administered by RAND.
There is much more to my arguments and prescriptions, but I hope this is a good sample. Tomorrow, I will transition to something very different: a two-day look at a paper I have co-authored describing some empirical results about the Analog Hole and about consumer willingness-to-pay for digital music.