Skeptical About Alleged DOJ Data Retention Plan:
A few days ago, over at, Declan McCullagh made a troubling but very probably false claim:
  The Department of Justice is quietly shopping around the explosive idea of requiring Internet service providers to retain records of their customers' online activities.
  Data retention rules could permit police to obtain records of e-mail chatter, Web browsing or chat-room activity months after Internet providers ordinarily would have deleted the logs--that is, if logs were ever kept in the first place. No U.S. law currently mandates that such logs be kept.
  It is quite unlikely that the claim in the first paragraph is true. Privacy advocates have been expressing concern for years that there are secret DOJ plans to mandate ISP data retention. When asked, however, DOJ officials repeatedly have made clear that such a proposal is out of the question.

  What is the evidence that times have changed, and that now DOJ is "quietly shopping around" this "explosive" idea? As best I can tell from Declan's story, it is this and only this: A few weeks ago, at a Holiday Inn in Alexandria, Virginia, unnamed Department of Justice employees, apparently from DOJ's Child Exploitation and Obscenity Section (CEOS), mentioned the possibility of mandatory data retention requirements in a meeting with some ISP representatives.

  Who are these DOJ employees, though? CEOS does not have any high-level policy makers, as far as I know. It is a section consisting entirely of career prosecutors. No one at CEOS has the authority to opine on such an enormous and controversial question except entirely in his personal capacity. And the idea that DOJ would decide to "shop around" such a high-profile proposal using career lawyers meeting at a Holiday Inn seems a bit far-fetched.

  If I had to guess, I would imagine all that happened in this meeting was that a random career lawyer at DOJ had been wondering about data retention, and decided to discuss it as a possibility in a meeting despite DOJ policy to the contrary. Or perhaps the lawyer foolishly tried to raise the possibility as a threat to push ISP representatives to think more seriously about voluntary data retention. Either way, DOJ has not changed its policy at all. Is it possible that there is more to the story than that? Yes, but on the whole it is quite unlikely.

  I have enabled comments. As always, civil and respectful comments only. Thanks to Ran Barton for the link.
Chris Lansdown (mail) (www):
I realize that this is a very basic question, but why is this different from requiring phone companies to keep recordings of all phone conversations made over their network?
6.20.2005 8:23pm
EKR (mail) (www):
It sounds like a basic question, but it's actually remarkably complicated. Unlike the PSTN, on the Internet vastly different amounts of state exist at different parts of the network. Some nodes involved in the conversation have full information (think your mail client), some have only "header" information like to/from fields (think the mail server, which typically doesn't store the content for long), and some only know about packet flows (think the core routers). So, what data is recorded depends very much on who does the recording. That said, what's most likely being discussed here is the header information that indicates the source and destination of communications traffic, something akin to pen register and trap and trace information in the PSTN.

More discussion of the technical issues can be found on my blog.
6.20.2005 8:53pm
I think the logic of the proposed requirement is that much of child porn is shared not by public web sites (which are much easier to monitor, especially activity going in and out of them), but by private rings of pedophiles. Breaking into those rings is probably very difficult.

Whenever an arrest is made in a child sex abuse case (not necessarily related to cyberspace), then the DoJ would be able to view those past 2 months of internet activity to look for any evidence that person was involved in a ring. They could then view other potential ring members' activities in order to find out about them. It could be a powerful tool for law enforcement. And because ring members would have to possess the material on their hard drives (whereas web site viewers only have it in their internet caches), everyone involved would be committing a crime and thus the DoJ could arrest everyone involved in the ring.

A possible safeguard would be to require the DoJ to get a warrant before getting access to those 2 months of information. The DoJ would have to show a judge that the person in question:

a) was arrested for a crime related to child sex abuse,

b) participated in child porn-related activity online, or

c) had extensive or regular internet contact with someone who participated in child porn-related activity online.

By requiring a warrant and making the requirements for that warrant very specific, the problem of law enforcement using this tool for other purposes could be greatly diminished.


Still, it should be noted that just the existence of that information would be a threat. A renegade employee at an ISP could collect information on certain people and harass/blackmail/reveal/etc. them.

The knowledge that internet activity was being monitored (even if that knowledge would never be used) could be enough to chill some legitimate internet activity (such as for fringe political groups, or the sending of romantic emails, etc.).
6.20.2005 9:09pm
Andy Freeman (mail):
> And because ring members would have to possess the material on their hard drives (whereas web site viewers only have it in their internet caches)

Huh? Internet caches often/usually live on hard drives, typically the system disk (aka C:).

Downloaded material is far more likely to be on a removable device than an internet cache. Said removable device may be a hard drive, but it might be kept separately to hinder seizure.

I don't know that "save as" generates the same HTTP commands as viewing, but would be surprised if it didn't as there isn't much in the protocol other than get and post. If the same commands are generated, nothing outside the user's computer can distinguish "download" from "view".
6.20.2005 9:34pm
Good objections.

Cache: Material in an internet cache does reside on the hard drive, but as far as I know, it's generally not legally considered to be "possession" because it is incidental to viewing and is only temporary. People generally don't control individual files in their cache and don't know when a certain file has been removed. It would be wrong to prosecute someone for possession of child porn if they had only followed a misleading link and then that information remained in their cache.

If the person was preventing their old cache data from being deleted, was viewing the information in the cache while offline, or it could be shown they planned to retrieve it later, then it probably would be considered possession.

Protocol: If the internet protocol for viewing and saving are the same, then it would not be possible from the information saved to know which took place. But would information transmitted between a ring of porn-traders be done with HTTP protocol? Would HTTP be used to transmit information from AIM, for instance? Also, the protocol for uploading must be different than the one for downloading.

Removable drives: a hidden hard drive would be much harder to find. Agents would have to get a warrant to search the house for it. However, if there was evidence of files being _uploaded_, that could justify LEOs asking for a warrant to search the whole house for computer data storage as opposed to just a single computer connected to the internet.
6.20.2005 9:50pm
JamesH (mail):
It sounds easy but it isn't, basically because of the size of data involved.

If the ISP were to attempt to capture everything that every individual sent and received, it would quickly fill up most storage media. If a person listens to Internet radio, a program that is 1 hour long would probably take up 300 MB. If you average 1 person listening all day (A signs off, B signs on, 12 people x 2 hours, etc.), after 60 days that is 432 GB of data.
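
The arithmetic above can be checked with a short sketch. The 300 MB per hour figure and the 60-day window come from the comment; the function name and the decimal convention (1 GB = 1000 MB) are assumptions made for illustration.

```python
def retained_gigabytes(mb_per_hour: float, hours_per_day: float, days: int) -> float:
    """Storage needed to keep one continuous stream for `days` days, in decimal GB."""
    total_mb = mb_per_hour * hours_per_day * days
    return total_mb / 1000  # assumption: 1 GB = 1000 MB


# One listener-equivalent streaming around the clock for 60 days.
print(retained_gigabytes(300, 24, 60))  # 432.0
```

That is the cost of a single always-on stream; an ISP would multiply it by however many such streams its customers add up to.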

Other scenarios:
Someone has a VOIP phone.
Should the ISP get hit by 'I love you' (100 K) x 10,000 users x (How many times did you get 'I Love you'?).
A remote Desktop Program also transmits large amounts of data. (Enough to set off some alarms where I work, until we figured out what it was).
A p2p program can transmit 35GB of data in a 24 hour period.

If Apple starts its rumored Movie Download service, watch out!!

This is more people trying to control something they don't understand. The costs of trying to implement it would be very high.

(The other thing is unless you capture the entire packet, you don't know for sure what is in it. (Even then, if it is encrypted you still don't know). So (the lawyers online can answer this), would just having a packet going from suspect A to suspect B's machine be enough? Were they just talking when it happened? Or were they sending pictures? Was it a picture of the new car or child porn? etc.)
6.20.2005 10:00pm
William Spieler (mail) (www):
If this ever happens, I'm predicting a lot of linux downloads, if you know what I mean.
6.21.2005 12:04am
Rob Read (mail):
Here's a good start for what's been happening in Europe.

link to the register

Currently if you host your own SMTP and DNS servers then I doubt your ISP will have anything to store. Storing all the IP headers going through a router would be impractical.

The next generation Internet protocol IPv6 (we currently use IPv4) should encrypt most of the things that the government wants to stick its nose into.
6.21.2005 8:01am
Roger (mail):
Getting back to the issue of "who said" and "what the DOJ did," I am somewhat disturbed by yet another rumor going around about what the "DOJ" is up to. A month or so ago, Talkleft et al. were declaring that the DOJ had "changed the burden of proof" by regulation regarding child pornography. Now, we are all lawyers (or should be), so it seems strange that there is no discussion of whether it is even possible for the Justice Department to do this on its own.
6.21.2005 11:26am

I believe there are some civil regs relating to disclosure requirements that models used are 18 years old, and that a group has sued DOJ claiming that the DOJ's regs are invalid. I don't know much about it, and I'm not familiar with the regs.
6.21.2005 12:01pm
Roger (mail):
That really is the issue.
6.21.2005 12:28pm
Philip Taron (www):
From a technical standpoint, the requirement as stated would be almost impossible to fulfill. I personally generate between 1 and 2 GB of data transfer a day, from listening to internet radio, watching online videos, and browsing photo albums. Though I'm on the high end, multiply this figure by the number of customers that ISPs have to see just how much space this data storage would take up. There's no storage medium available at the price point medium-sized ISPs would be able to buy -- and with broadband uptake rates, this will only get worse.

If something like this ever does make it into law -- which it very well might -- I would expect it to cover only headers or some other record of the form "IP address x.x.x.x accessed site Y on $timestamp", the sort of information that ISPs already tend to keep for technical reasons.
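
To make the "IP address x.x.x.x accessed site Y on $timestamp" idea concrete, here is a minimal sketch of what one such record might look like and how small it is. The class name, field names, log-line layout, and the figure of 1,000 records per customer per day are all assumptions for illustration, not a description of what any ISP actually logs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessRecord:
    """One 'who contacted what, and when' entry; no page content is kept."""
    client_ip: str
    site: str
    timestamp: datetime

    def to_log_line(self) -> str:
        return f"{self.timestamp.isoformat()} {self.client_ip} {self.site}"


def log_bytes_per_day(records_per_day: int, sample: AccessRecord) -> int:
    """Rough daily log size for one customer, treating `sample` as a typical line."""
    line_bytes = len(sample.to_log_line().encode("utf-8")) + 1  # +1 for the newline
    return records_per_day * line_bytes


# Hypothetical record using documentation-reserved addresses and names.
record = AccessRecord("192.0.2.10", "example.com",
                      datetime(2005, 6, 21, 13, 32, tzinfo=timezone.utc))
print(record.to_log_line())
print(log_bytes_per_day(1000, record), "bytes per day")
```

Under those assumptions a day of records is tens of kilobytes per customer, against the 1 to 2 GB of actual content mentioned above.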
6.21.2005 1:32pm
Kevin Murphy (mail) (www):
As EKR suggested, the most that could reasonably be done is to log packet headers (to, from, protocol, packet type) at the core routers. Anything else is way too much to keep (and even this may be). Besides, much traffic doesn't go through anything other than the core routers.

Not that it will stop the bad guys -- all one needs to do is have an off-shore server that provides a renumbering service and everything disappears into a black hole. Sure, you can decide to trap everything from specific individuals, but you can do this now.
6.21.2005 3:03pm
If the same commands are generated, nothing outside the user's computer can distinguish "download" from "view".

That's correct. All web browsing is "downloading". That is how the web browser gets all of the images and text it displays. If you right-click on an image and pick "save as", you aren't (unless you have the cache turned off) actually downloading it again. You're just making a copy of the cached image that is already on your hard drive.
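
A small local experiment illustrates the point. This is only a sketch: it starts a throwaway server on the loopback interface, and the handler class, the `fetch` helper, and the `/photo.jpg` path are made up for the demonstration. It covers the cache-off case, where "save as" has to fetch the file again.

```python
import http.client
import os
import tempfile
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_requests = []  # what the server side observes for each request


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the method, path, and headers exactly as received.
        seen_requests.append((self.command, self.path, sorted(self.headers.items())))
        body = b"fake image bytes"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port: int, path: str) -> bytes:
    """Issue a plain HTTP GET and return the response body."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    data = conn.getresponse().read()
    conn.close()
    return data


server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: pick any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# "View": fetch the bytes and keep them only in memory.
viewed = fetch(port, "/photo.jpg")

# "Save as" with the cache turned off: fetch again, then write to disk.
saved_path = os.path.join(tempfile.mkdtemp(), "photo.jpg")
with open(saved_path, "wb") as f:
    f.write(fetch(port, "/photo.jpg"))

server.shutdown()
server.server_close()
print(seen_requests[0] == seen_requests[1])
```

The only difference between the two cases is what the client does with the bytes afterwards, and that never crosses the wire.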

Another problem, of course, is that there is no way for the server to know if the user is deliberately and consciously requesting a file. Anyone who has visited a number of ordinary adult websites is familiar with the problem of unwanted popups. Each of those represents additional downloads that the user's computer "requested" without his actual permission.
6.21.2005 3:09pm
Andy Freeman (mail):
> If the person was preventing their old cache data from being deleted, was viewing the information in the cache while offline, or it could be shown they planned to retrieve it later, then it probably would be considered possession.

Is it possession if they:
(1) backup their system including the cache
(2) make their cache so big that it keeps things for months
(3) look through their cache to see what's there

> But would information transmitted between a ring of porn-traders be done with HTTP protocol?

If it's a web site, yes; the "get"s will be done with HTTP.

> Also, the protocol for uploading must be different than the one for downloading.

Nope. Superficially, the commands must be different (http has post), but sending a request requires sending information to identify the requested data. That identifying information is uploaded data, which is important because bits is bits. For example, I might request 789345.txt. The receiver is free to save the "45" as part of a file that I'm sending a piece at a time. (Cookies are an interesting example. Cookies typically come with a requested page and are sent back to the same host with any subsequent request to said host. However, there's nothing stopping someone from simply sending data labelled "cookie" with a request. In other words, who's to say said cookies actually came from said host in response to a previous page request?)
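
The 789345.txt trick can be sketched in a few lines. The path layout (`/<sequence>-<hex>.txt`), the two-byte chunk size, and both function names are invented for illustration; the point is only that a series of "downloads" can carry an upload.

```python
from typing import List


def encode_as_requests(payload: bytes, chunk_size: int = 2) -> List[str]:
    """Hide `payload` in made-up file names, one small chunk per request.

    Each path looks like /<sequence number>-<hex chunk>.txt.
    """
    paths = []
    for seq, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        paths.append(f"/{seq:04d}-{chunk.hex()}.txt")
    return paths


def decode_from_request_log(paths: List[str]) -> bytes:
    """What a cooperating server could rebuild from nothing but its request log."""
    chunks = {}
    for path in paths:
        name = path.strip("/")[:-len(".txt")]  # drop the slash and the extension
        seq, hex_chunk = name.split("-", 1)
        chunks[int(seq)] = bytes.fromhex(hex_chunk)
    return b"".join(chunks[i] for i in sorted(chunks))


requests_made = encode_as_requests(b"hi there")
print(requests_made)
print(decode_from_request_log(requests_made))
```

Every one of those requests would be an ordinary GET as far as the protocol is concerned, and the server can answer each with a 404 and still have the whole message.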

Moreover, the existence/use of other protocols doesn't distinguish content or demonstrate intent and there are others where things get transferred without specific intent. (RSS feeds are a good example. Heck - so is e-mail, as my spam folder proves.)

If there's an http get safe harbor, that's what bad folks will use. It's probably safe to say that http post is evidence of possession/distribution, but you probably also need to track what they posted. (Who's to say that the post succeeded?)

BTW, there are lots of possible channels, so you must copy and analyse every bit transferred if you actually want this to work. Another example: DNS, the protocol used to resolve names to addresses (some number), has been used to transfer information through firewalls. (No, you don't want to log all DNS request info - it's almost all innocuous and it happens all the time.)
6.21.2005 3:36pm
"the most that could reasonably be done is to log packet headers." I doubt that even this could be reasonably done. In viewing one web page, your computer probably exchanges hundreds of packets. Logging each packet would soon overload any ISP.

For the sake of not calling a government employee a lunatic and a peeping tom, I'm assuming that the DOJ employee was looking for the equivalent of the "pen trace" for telephones, which is a record of the number called, date, and time from one telephone. Getting a pen trace does not require nearly as much evidence as getting a warrant to record the calls. And in telephony, pen traces don't produce an overwhelming amount of data because people aren't calling a new number every few seconds. The phone company may have recorded this data for accounting purposes anyhow. Recording packet headers would require at least thousands of times the hard drive space. I don't know if it would be practical to try to filter the packet headers to extract just the web pages accessed, with info identifying the servers or who was accessing the page. This would still involve more data than a pen trace, because a web surfer will visit many pages while a telephoner is still on the first call, but I don't think it would be an unmanageable amount.
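
One way to picture that filtering step is to collapse the per-packet headers into one record per burst of traffic between the same endpoints, which is closer to what a pen trace gives for phone calls. The sketch below is only an illustration under stated assumptions: the field names, the 60-second idle gap, and the sample addresses are all made up, and it looks at outbound packets only.

```python
from typing import Iterable, List, NamedTuple


class PacketHeader(NamedTuple):
    timestamp: float  # seconds since an arbitrary start
    src: str
    dst: str
    dst_port: int


class ConnectionRecord(NamedTuple):
    src: str
    dst: str
    dst_port: int
    first_seen: float
    last_seen: float
    packets: int


def summarize(headers: Iterable[PacketHeader], idle_gap: float = 60.0) -> List[ConnectionRecord]:
    """Collapse per-packet headers into one record per (src, dst, port) burst.

    A new record starts when the same endpoints are silent for more than
    `idle_gap` seconds.
    """
    open_records = {}
    finished = []
    for h in sorted(headers, key=lambda h: h.timestamp):
        key = (h.src, h.dst, h.dst_port)
        rec = open_records.get(key)
        if rec is not None and h.timestamp - rec.last_seen <= idle_gap:
            # Same burst: extend the existing record.
            open_records[key] = rec._replace(last_seen=h.timestamp, packets=rec.packets + 1)
        else:
            if rec is not None:
                finished.append(rec)  # the old burst ended; keep it
            open_records[key] = ConnectionRecord(h.src, h.dst, h.dst_port,
                                                 h.timestamp, h.timestamp, 1)
    finished.extend(open_records.values())
    return sorted(finished, key=lambda r: r.first_seen)


# 300 packets for one page view, then a single packet much later.
packets = [PacketHeader(t * 0.01, "192.0.2.10", "198.51.100.7", 80) for t in range(300)]
packets.append(PacketHeader(500.0, "192.0.2.10", "198.51.100.7", 80))
records = summarize(packets)
print(len(packets), "packet headers ->", len(records), "records")
```

Note that something still has to look at every packet header to build the summary; the saving is in what gets stored, not in what gets inspected.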

Finally, there's the legal question: A pen trace is subject to a lower threshold just because it doesn't indicate the content of the call, but give me the address of a web page and I can usually get the content just by typing it in at the address bar. Thanks to Google caching and other web services, I can probably get the content even when the page has been taken down.
6.21.2005 6:06pm
AlexAz (mail):
It is very likely that the DOJ types were referring to the new 18 U.S.C. 2257 regs that go into effect tomorrow. These regs impose a tremendous record-keeping requirement on the owners and content providers of "Adult Websites." Under these regulations, anyone who has taken, posted, shared, or distributed any pictures of "Explicit Sexual Activity" taken since July 1995 (yes, it is retroactive) must have available for examination by DOJ inspectors records that include, among other things, the picture in question and copies of government-issued picture IDs of all persons in the pictures. "Explicit Sexual Act" is defined to be any sexual or simulated sexual act, including any picture of a naked individual. These records must be kept by all primary and secondary producers for 7 years. The age of the individual in the picture does not matter. The reg and associated definitions are broad enough that many pictures taken at the beach, a frat party, or a swingers party can be considered to fall under the regs. If the pics have been shared over the internet, they qualify. This apparently includes family photo albums posted on the internet. The penalties for noncompliance are up to a $10,000 fine and 10 years in federal prison.

More info at The Free Speech Coalition Inc.
6.22.2005 6:26pm