The “Privacy-Enhanced Data Mining” Trap

The Associated Press pushed a story to the wires about the Data Surveillance workshop which I’d mentioned a while back:

As new disclosures mount about government surveillance programs, computer science researchers hope to wade into the fray by enabling data mining that also protects individual privacy.

Largely by employing the head-spinning principles of cryptography, the researchers say they can ensure that law enforcement, intelligence agencies and private companies can sift through huge databases without seeing names and identifying details in the records.

So let’s talk about that. The argument can be re-stated as “we can take data, sift it, and then start an investigation based on the sifted data, and go through the warrants process.”
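To make the idea concrete: one simple flavor of privacy-preserving matching replaces direct identifiers with keyed hashes (pseudonyms) before the data is sifted. This is a hypothetical sketch, not any particular research system; the key name and record fields are invented for illustration:

```python
import hmac
import hashlib

# Assumption for this sketch: a trusted party holds the key and is the
# only one who can unmask a pseudonym back into a name.
KEY = b"held-by-the-trusted-party"

def pseudonymize(name: str) -> str:
    """Replace an identifier with a keyed hash before sharing for mining."""
    return hmac.new(KEY, name.encode(), hashlib.sha256).hexdigest()

# Two databases, pseudonymized before the miner sees them:
bank_record = {"who": pseudonymize("Alice"), "txn": 9500}
airline_record = {"who": pseudonymize("Alice"), "flight": "IAD-GVA"}

# The miner can link records about the same person without seeing "Alice":
assert bank_record["who"] == airline_record["who"]
```

The sifting works, but note what it doesn’t fix: garbage records hash just as consistently as good ones, which is the data-quality problem discussed below.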

This requires both willful ignorance of the quality of the data being mined, and a rose-tinted willingness to trust the justice system.

The quality of data in a privately data-mined system will be no greater than that in any other system, and will likely be lower. It will be lower because inaccurate data will not be visible for correction. Fair information practices such as accuracy and access are deeply impeded.

Once the data mining system has come out and said “Alice is a suspect,” Alice will enter a Kafkaesque bureaucratic nightmare. The computer found something.
How “the computer found something” translates into a warrant is system-dependent. Some systems may unmask the “data.” In others, the finding may be presented to a judge as “the computer thinks we need to investigate this person.” Either way, Alice’s innocence will be viewed with suspicion: either she’s really good at hiding her guilt, or we’ve caught a sleeper.

Research into ways in which data mining can occur in ways that are respectful of the fair information practices is useful and worthwhile. Today’s privacy-destroying impulses need to be brought into check by a Congress and Judiciary balancing the executive. (Of course, the legislatures are contributing, as documented in stories like “Police to Get Access to Student Data.” Thanks, Alice!) Giving them a set of tools is worthwhile, but we should be aware of the limits of the tools we have today.

Photo, Implement of Destruction by Canardo.

Metricon: The Agenda

Andrew Jaquith has posted the Metricon Agenda. We had a lot of good papers, and couldn’t accept them all. (We’ll provide, umm, numbers, at the workshop.) If you’ve submitted a paper, you should have heard back by now. Thanks to all the submitters, and we look forward to seeing you at the workshop.

Happy Juneteenth!

I’m deeply in favor of holidays which celebrate freedom. We need more of them.

Juneteenth, also known as Freedom Day or Emancipation Day, is an annual holiday in the United States. Celebrated on June 19, it commemorates the announcement of the abolition of slavery in Texas. The holiday originated in Galveston, Texas; for more than a century, the state of Texas was the primary home of Juneteenth celebrations.

(Photo by MizJellyBean.)

Men Without Pants

To protect the rights of the official beer they were denied entry, so the male fans promptly removed the trousers and watched the game in underpants.

The BBC asserts that up to 1,000 fans were told to strip off their orange pants in “Fans Lose Trousers to Gain Entry.” Markus Siegler, the control-freak in charge of press for FIFA, said:

“Of course, FIFA has no right to tell an individual fan what to wear at a match, but if thousands of people all turn up wearing the same thing to market a product and to be seen on TV screens then of course we would stop it.”

Of course. That doesn’t make it normal or right. You do have to appreciate a nation that prefers nudity on its TV screens to advertising. Of course, FIFA is trying to minimize the numbers, and has invented the term “ambush marketing” to make it seem unusual that there’s marketing involved in a sports event.

I tried to find a good picture, but this is a family blog.

Remembering the Maine

From Maine’s Public Law, Chapter 583, passed April 2006:

Sec. 9. 10 MRSA §1348, sub-§5, as enacted by PL 2005, c. 379, §1 and affected by §4, is amended to read:
5. Notification to state regulators. When notice of a breach of the security of the system is required under subsection 1, the information broker person shall notify the appropriate state regulators within the Department of Professional and Financial Regulation, or if the information broker person is not regulated by the department, the Attorney General.

Maine now joins an exclusive club: all breaches, not just those of information brokers, must be reported to the AG’s office. Only New York has a similar law. The duty to notify applies to every “person,” now defined as:

an individual, partnership, corporation, limited liability company, trust, estate, cooperative, association or other entity, including agencies of State Government, the University of Maine System, the Maine Community College System, Maine Maritime Academy and private colleges and universities.

The emphasized portion is new law. Government agencies, colleges, and universities just had new responsibilities placed on them. Among reported breaches, these types of institutions appear most often. Coincidence?
Update 7/18/2006: North Carolina’s law does the right thing, too.

Scottish and Procedural Liberty

In “Scots Crush Cars Over ‘Document Offenses,’” Rogier van Bakel writes about bad new UK law:

Now cars can be seized and crushed if document offences are detected — and the region’s top police officer said yesterday a “clear message” is being sent to would-be offenders. … Tough new powers in the Serious Organised Crime and Police Act 2005 will allow officers to put the squeeze on “irresponsible and selfish” motorists.

The “would-be offenders,” in this case, are not only people who drive without a license, but also those who get behind the wheel without insurance. I don’t disagree that they need to be caught and corrected, but there’s something very unsettling about the fact that they apparently can’t have their day in court — that it’s within a mere cop’s powers to order a vehicle destroyed.

The idea that the police have the power to impose sentences is quite troubling, but more troubling to me is the idea that databases are now presumed correct. I don’t know if this is the case in Scotland, but many US states are going to “electronic proof of insurance.”

So let’s say that your insurance company’s computer is offline, and can’t provide proof of insurance. You know, sort of like AIG fumbled this week. Recall that AIG’s computer was stolen March 31, and they didn’t get around to telling anyone until June. A similar screw-up could now get your car impounded and crushed. Odds are very good that AIG’s contracts will state that their failure to be online isn’t their problem, and you can’t recover damages for your time, loss of vehicle, or distress without taking them to court.

In the IT world, we used to talk about “Garbage in, Garbage Out.” It was an acknowledgment that data quality problems happened, and that they were often the fault of the system owner, not the customer. It was also a driver for the access provisions of privacy law. You have the right to access and correct certain data about you. (In the US, this applies mostly to the government, and certain aspects of the credit bureaus.)

With that loss of understanding comes a serious loss of liberty. The computer is presumed correct, and you are presumed to be a “demon customer.”

Car crush photo from the US Army.

Avant-Garde: A game for three players

(From Bram Cohen and Nick Mathewson.)
The players are three reclusive artists. Their real names are Anaïs, Benoît,
and Camille, but they sign their works as “A,” “B,” and “C” respectively in
order to cultivate an aura of mystery. Every week, each artist paints a new
work in one of two styles: X and Y.

The art world despises uniformity: if all three artists paint in the same
style, their paintings don’t sell, and they get no points. If one of them
paints in a style different from the others, the different artist is
avant-garde and receives a point.

Because the artists are reclusive, the players can’t communicate with each
other. All they learn from one week to another is what style the other players
used in the previous week. (They learn this when the gallery manager passes them
the latest gossip from the art world.)

What is the ideal strategy? Clearly, it’s bad when all three paint in the same
style. If the players could communicate, they could agree to take turns being
avant-garde, so that one week A wins, the next week B wins, the next week C
wins, and so on. Also, if they could communicate, A and B could conspire to
shut out C by always using opposite styles. (If A and B always differ, C will
always match one of them, and the other will win.) But since the players can’t
communicate except through their plays, how can they arrange to coordinate in
twos or threes?

If somebody ran an iterated tournament of this game in the style of Axelrod’s
Prisoners’ Dilemma challenge, what program would you submit? (Remember that
your program would often be playing against instances of itself, without
knowing it.)
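If you want to experiment before submitting, the rules above are easy to simulate. Here is a minimal tournament harness; the strategy functions at the bottom are just illustrative placeholders, not suggested entries:

```python
import random

def score_round(moves):
    """Score one week's paintings.

    moves: the three artists' styles, e.g. ['X', 'X', 'Y'].
    The lone dissenter (if any) scores 1; everyone else scores 0.
    If all three match, nobody scores.
    """
    points = [0, 0, 0]
    for i, style in enumerate(moves):
        others = [moves[j] for j in range(3) if j != i]
        if style != others[0] and style != others[1]:
            points[i] = 1
    return points

def play(strategies, weeks=100, seed=0):
    """Run an iterated game; each strategy sees the full public history."""
    rng = random.Random(seed)
    history = []  # list of past rounds, each a list of three styles
    totals = [0, 0, 0]
    for _ in range(weeks):
        moves = [s(i, history, rng) for i, s in enumerate(strategies)]
        for i, p in enumerate(score_round(moves)):
            totals[i] += p
        history.append(moves)
    return totals

# Two placeholder strategies, for testing the harness:
def always_x(i, history, rng):
    return 'X'

def coin_flip(i, history, rng):
    return rng.choice('XY')
```

Note that a strategy only receives the public history, matching the gossip the gallery manager provides; the variation below would restrict it further, to the player’s own win/loss record.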

Variation: what happens when the artists are so reclusive that they won’t even
speak to their gallery manager? In this variation, they only learn whether they
won the last week or not (by checking for their check in the mail).

The painting is Picasso’s Three Musicians.

Breach Roundup

Breach Roundup: “We’re From The Government” Edition


Baxter State Park photo by Jenpilot.

There Will Be No Privacy Chernobyl

Ed Felten asks:

What would be the Exxon Valdez of privacy? I’m not sure. I don’t think it will just be a loss of money — Scott explained why it won’t be many small losses, and it’s hard to imagine a large loss where the privacy harm doesn’t seem incidental. So it will have to be a leak of information so sensitive as to be life-shattering. I’m not sure exactly what that is.

(“The Exxon Valdez of Privacy.”) Privacy advocates have been waiting for this for a long time. It’s important to remember that the Exxon Valdez followed Silent Spring by nearly 30 years. The environmental movement had time to evolve memes. Privacy still has many meanings. The parade of breaches or overflows hasn’t done it, despite medical data, financial data, and just about anything you can imagine being leaked.

This past weekend, I was speaking to a vet friend, and he didn’t care about the VA leak. He said that military SSNs are so public anyway, you’d drive yourself nuts worrying.

Part of the problem is that alternatives are hard. Consumers can’t switch to hydro for their credit. (How’s that for mixing a metaphor?) Background checks are being made a liability issue, despite the base rate fallacy and their general failure modes. Driver’s licenses are being made machine readable.

We’re not going to have a privacy Chernobyl.

The New Transparency Imperative


…in the incident last September, somewhat similar to recent problems at the Veterans Affairs Department, senior officials were informed only two days ago, officials told a congressional hearing Friday. None of the victims was notified, they said.

“That’s hogwash,” Rep. Joe Barton, chairman of the Energy and Commerce Committee, told Brooks. “You report directly to the secretary. You meet with him or the deputy every day. … You had a major breach of your own security and yet you didn’t inform the secretary.” (From Associated Press, “DOE computers hacked; info on 1,500 taken.”)

It used to be that security breaches were closely held secrets. Thanks to new laws, that’s no longer possible. We have some visibility into how bad the state of computer security is. We have some visibility into the consequences of those problems. For the first time, there’s evidence I can point to when explaining why I tremble with fear at phrases like “we use industry-standard practices to protect data about you.”

The new laws are not yet well understood. They’re not well understood by computer security professionals. They’re certainly not yet the basis for case law establishing the meaning of key terms, like encryption. (I expect that juries will frown on using rot-13 to encrypt secrets, even if it might be within the letter of the law.) The only people who do understand them are the public, who expect to hear when they’re at risk.
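For the record, rot-13 offers no secrecy at all: it simply rotates each letter thirteen places, it uses no key, and it is its own inverse, so anyone can undo it. A quick illustration in Python (the sample string is invented):

```python
import codecs

secret = "customer SSNs"
scrambled = codecs.encode(secret, "rot13")   # -> "phfgbzre FFAf"

# Applying the same transformation again recovers the original,
# no key required -- which is why this is encoding, not encryption:
assert codecs.decode(scrambled, "rot13") == secret
```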

The change in expectations will have exceptionally beneficial long-term effects. We will get data that we can use to measure aspects of computer security, and to see what real attack vectors look like. (We may not learn about insiders or super-hackers.) With that data, we can focus our efforts on putting better security measures in place.

Requiring companies to own up to problems will drive them to ask their vendors for better software. They will ask the experts how to distinguish good software from bad. This may have the effects that some experts hope liability would bring about.

There will be a lot of short-term pain as we discover the shape of the new normal. The transition is well worth it.

The image is X Ray 4, by Chris Harve, on StockXpert.