Orin beat me to it, but I also found the story about using Google's search data as a way to detect flu outbreaks to be pretty interesting. My response, I must say, is a little different from Orin's - he found it "creepy," on the grounds that we might not "want Google establishing such a cozy relationship with the federal government." I recognize there's serious potential for abuse -- but on the other hand, take a look at this:

[Taken from the NY Times' story] That's a fairly extraordinary public health tool -- one that, according to the Times story, "may be able to detect regional outbreaks of the flu a week to 10 days before they are reported by the Centers for Disease Control and Prevention." And, at least if there's been no disclosure of personally identifiable information, that's some pretty useful stuff, and I'm not sure I'm so unhappy if the Centers for Disease Control have access to it. It's a very big and very important "if," to be sure, but I would hope that Orin's proposed "Search Engine Privacy Act" statute won't throw this particular baby out with the bathwater.

If there is a spike in searches for "how do you tell if your spouse is a zombie" then head for the hills.
As I noted in the other thread, to me the issue is not whether the CDC has access to this data. It is whether *only* the CDC has access to it.

One argument that others made in Orin's post was that Google was just acting in the public good. Granting arguendo that Google is just trying to protect the public good, it would still seem to me that public disclosure would be the best way to do that. Who knows what's good for the public better than, well, the public?

Of course, neither the NY Times story nor the Drudge Report article makes it clear what is going on. Is Google giving the CDC extra information that the public doesn't already have via Flu Trends? Is Google somehow analyzing the Flu Trends data for the CDC, and not sharing the results of the analysis with the public? Does the CDC get any data days or weeks ahead of anyone else? It's entirely possible that the answers to all these questions are "no". If so, then, hey, cool, it does seem like Google is just acting in the public good, and I'm happy they are.

But if the answer to any of my questions is "yes", why? What's the rationale for sharing data for free with the CDC but not with the public?
OK by me, but what would really be neat (sorry, cool) would be if Google also mapped HIV infections by neighborhood. Maybe then we could break through the "everyone's at risk" shibboleth.
As long as it's just sanitized aggregate data, I don't really see a privacy problem.

I agree we certainly need to be wary of search engines cozying up to the government, but let's face it. If the Feds decided to go after this sort of thing, we'd never hear about it until long after the fact, regardless of whatever unrelated cooperation the company may have previously engaged in. Just like the telco data mining.
The noise will outweigh the signal.

If you search for "flu" or "influenza," that does not mean that you or a member of your household has influenza. The search might be about distinguishing a cold from the flu, getting info on influenza for a school report, learning about flu because your Aunt Helen in a nursing home 800 miles away might have it, or a misspelling of "flue" when looking for a replacement part for your wood-burning stove.

This PR move for Google will not yield usable information, and it may detract from more productive methods of tracking influenza.
Does this really tell us anything? My first thought looking at the graphs is that people who get home from the doctor having been diagnosed with the flu then use Google to get information.
Trust me:
Only guilty people have anything to fear from the govt. All those amendment thingies are superfluous. That about it?
John Jenkins: Yes, it does do something tremendously useful. It takes the CDC about 10 days longer than Google to compile flu statistics. The CDC has to aggregate reports from many different doctors' offices and hospitals, but Google just has to scan its database of internet searches. A 10-day lead time is absolutely enormous in the context of flu outbreaks.
Dr. T.: When exactly will the noise start outweighing the signal? Did you even look at the graph in the NYTimes article? So far it looks like there's plenty of signal to see, so my tentative conclusion is that your comment is not even remotely credible.
Private enterprise -- still faster than the CDC. I'm shocked.
It was said in the last thread, so I'll just repeat it. Google Trends has been around for a long time. This is just that with specialized search terms for the flu. It's really not very exciting. On Google Trends you can put in whatever privacy "invading" terms you like.
I don't know how it is in your country, but in Latin America doctors have a legal duty to report contagious diseases to the government. In the case of venereal diseases (STDs), a patient who abandons treatment must be reported and will be forced into treatment, as in the USA with tuberculosis.
In Europe, non-nominative information can be disclosed to the government.
I guess the right to health protection trumps privacy.
Feh. Your right to privacy in your Internet searches -- or for that matter in your home, possessions or even person -- stops right when you start carrying around highly communicable and potentially deadly diseases. If you don't like that, you can just stay at home all the time and miles away from kids, old folks, folks getting cancer treatment, and others who die from even ordinary influenza in big numbers every year.

If Google's program correctly and reliably identifies flu outbreaks, I don't give a damn if they e-mail the IP addresses directly to the CDC, the FBI, and local law enforcement, so a couple of blue-suited goons with guns come to your door and, depending who you're putting at risk, give you a useful pamphlet on recovering from the flu, force you to get vaccinated, or nail your front door shut for two weeks.

One of the reasons society survives in the face of existential threats like pandemic disease is that there's a limit to how tolerant sensible folks are to the delicate and occasionally off-the-rails sensibilities of constitutional lawyers. At some point, we just brush those fools aside and do what we must.
Splunge, the flu really isn't deadly to the vast majority of folks. Infants, elderly and immune-compromised probably got a shot anyway. The rest of us can ride out a flu with a few days in bed -- and be considerably stronger for it.
I guess my point is this: don't go throwing around the word "existential threat" without thinking hard about it.
PS: Any information willfully divulged to a search engine (e.g. search terms, IP address) is totally devoid of REP.
I must say, is a little different from Orin's - he found it "creepy," on the grounds that we might not "want Google establishing such a cozy relationship with the federal government."

No, the creepy part is wondering what kind of information Google could derive from searches for STDs.
Just for the record, the data IS available to everybody, here. And, as others have already pointed out, Google Trends will give anybody and everybody the same general aggregate data about any search term you'd care to try.
PatHMV, zforce: I saw the Flu Trends site, but Orin's original post and the article at Drudge in particular made it seem as if the CDC might potentially have access to more than just Flu Trends.

The more I read, the less likely it seems that the CDC has privileged access to Google's data, but I haven't seen any flat-out statements saying explicitly that the CDC doesn't get anything extra.
The noise will outweigh the signal.

That's a testable hypothesis. You could compare Google's estimates with CDC's actual data and determine exactly how much of Google's data is noise and how much is signal (through R-squared or another such metric). It's a claim in need of data to back it up.
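For anyone who wants to run that comparison, here is a minimal sketch of the R-squared calculation, assuming you already have two equal-length weekly series for the same region: one of Google's estimates and one of the CDC's reported figures. The function name `r_squared` and the variable names in the usage comment are my own placeholders, not anything Google or the CDC publishes, and this computes only the squared Pearson correlation, which is one of several reasonable metrics.

```python
def r_squared(estimates, actuals):
    """Squared Pearson correlation between two equal-length numeric series.

    A value near 1 means the estimates track the actual figures closely
    (mostly signal); a value near 0 means they barely track them at all
    (mostly noise).
    """
    n = len(estimates)
    if n != len(actuals) or n < 2:
        raise ValueError("need two series of the same length, at least 2 points")

    mean_e = sum(estimates) / n
    mean_a = sum(actuals) / n

    # Sums of products of deviations from each series' mean.
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimates, actuals))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_a = sum((a - mean_a) ** 2 for a in actuals)

    if var_e == 0 or var_a == 0:
        raise ValueError("a constant series has no defined correlation")

    return (cov * cov) / (var_e * var_a)


# Hypothetical usage (both variable names are placeholders for real data):
# r_squared(google_weekly_estimates, cdc_weekly_rates)
```

Note that a high value only shows the two series move together. To test the claimed week-to-10-day lead, you would also want to shift one series against the other and see at which lag the fit is best.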
This was actually predicted in Rainbows End by Vernor Vinge.
This could be a useful tool for the government.....

It more likely will show when Blue Cross is sending the flu shot around to different workplaces.

The day it was announced it was coming here a lot of us googled to see which three versions of the flu are in the shot this year.
Guys, it's the potential for abuse that people find creepy.
It always starts with something sensible. How soon until low level gov't employees are routinely analyzing google searches of future Joe the Plumbers?
that's some pretty useful stuff,

How is it useful? The CDC doesn't have a flu cure, and flu shots take too long to take effect to be a reactive strategy. Is the CDC going to create a flu blitz team on standby ready to rush to the scene of the action, and...?
It is useful. More timely mapping of flu means, among other things, better (more accurate, faster) predictions about the behavior of an epidemic, and that means more efficient and effective prevention: flu shots can be rolled out to priority areas and to priority populations, and high-risk populations can make other behavior changes as necessary.
Agreed. Allahpundit at Malkin's HotAir summarizes it nicely:
First, they're not sharing individual users' IP info, just the aggregate data to help the government track outbreaks epidemiologically — which their privacy policy already allows them to do. Second, it's not for the feds' eyes only, apparently; curiosity seekers can track the info themselves on the Google Flu map. Third, if they do break their promise and share personal info, word is bound to get out and send users scurrying for Yahoo, a fact that Google explicitly recognizes on its blog. Fourth, as Althouse reminds us, the stakes in stopping an outbreak early are, potentially, rather high. So what's the argument, aside from the standard "slippery slope" that can be applied to anything?

Incidentally, haven't they been doing this for years already with Google Trends? Every six months or so, some new data from Google about the vast appetite for porn in the Middle East will make the rounds, prompting a day's worth of snickers in the blogosphere. This is the same thing, it seems, except with health benefits. Exit question: How is it significantly different from, say, the Nielsen ratings?
Good lord. Last night, I just started (i.e. maybe 8 pages) Vernor Vinge's Rainbows End, and those first 8 pages discuss this very topic. In his book, private citizens ('trend' hobbyists) spot the outbreaks before the CDC as well.

Thing is, what may be an issue is that "trends" lead to government action against (or to protect) specific groups. As an example, the trends could show that a specific neighborhood's quarantine might protect the population surrounding it. Or, alternatively, that a specific sub-group is getting sick more often with a specific disease. Google can keep track of an awful lot of info from the people who are doing a specific "flu" or "AIDS" or whatever search.
My response, I must say, is a little different from Orin's
My sense is that Orin was misled on the facts of the matter by reading Drudge. Drudge (falsely) insinuated that Google is feeding data to the CDC that they aren't making generally available. Exhibit 1832 for the maxim that, as John Cole put it, Drudge's reliability compares unfavorably with that of the bathroom wall.
Lots of people say they have the flu when they don't. Lots of doctors diagnose flu based on reported symptoms without doing a flu test. Using Google searches as a way of tracking flu is PR and not epidemiology. The vast majority of searches for "flu" or "influenza" will not be from persons with acute influenza, so most of the raw data is noise, not signal. The NYTimes graph shows aggregate national data on "flu-like symptoms" (not true influenza), which is not of much value.

The types of influenza information needed by hospitals and public health departments are: what regions have high rates of influenza, is it influenza A or B, is it a new strain or one covered by the vaccine, is it targeting any specific age groups, what percentage of victims need hospitalization, what is the geographic pattern of spread, when will it hit my region, etc. I've seen no evidence that Google can provide such data. The CDC is slow, but regional and local epidemiologists have a reliable communication network that is much faster.
