April 03, 2004

Non-Luddite Resistance to Data Mining

The following sentence caught my eye in this article attacking privacy advocates' opposition to data mining and other data-gathering technologies:

Public health authorities have mined medical data to spot the outbreak of infectious disease, and credit-card companies have found fraudulent credit-card purchases with the method, among other applications.

Like many other credit card holders, I've received a call asking about a particular transaction. Every time I've been called, it was a false positive. This imposes a cost on me (I have to answer the phone, listen, and make a judgment about the transaction). The cost is small and so I don't mind paying it, but it is a cost. Now what would be the cost of being falsely flagged by an anti-terrorism data system? I might not be permitted to get on a plane. My house might be broken into, my computer might have monitoring software secretly put on it, and I could lose contracts that I'm bidding on. I could simply be thrown in jail for a day or two and have my reputation ruined.

These failure modes are much higher in cost. It doesn't take a paranoid or a Luddite to be worried about them. Now this problem is not insoluble, but you have to identify what the problem is: data mining creates an awful lot of false positives, especially when you're just starting your system and haven't refined your algorithms through real-world experience.
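To put rough numbers on why false positives pile up, here is a minimal Python sketch of the base-rate arithmetic. Every figure in it (the population size, the number of actual bad guys, the detection rate, the false positive rate) is a made-up assumption for illustration and not a measurement of any real system, and the function name `flagged_breakdown` is hypothetical.

```python
def flagged_breakdown(population, bad_actors, detection_rate, false_positive_rate):
    """Split the people a screening system flags into true and false positives.

    All inputs are hypothetical illustration values, not real-world data.
    """
    innocent = population - bad_actors
    true_positives = bad_actors * detection_rate          # bad guys correctly flagged
    false_positives = innocent * false_positive_rate      # innocents wrongly flagged
    # Share of flagged people who are actually bad guys.
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Assumed scenario: 1,000,000 people screened, 10 of them actual bad guys,
# a system that catches 99% of bad guys and wrongly flags 1% of everyone else.
tp, fp, precision = flagged_breakdown(1_000_000, 10, 0.99, 0.01)
print(f"true positives:  {tp:.1f}")
print(f"false positives: {fp:.1f}")
print(f"share of flags that are real: {precision:.4%}")
```

Under these assumed numbers, even a system that sounds very accurate flags on the order of ten thousand innocent people for roughly ten real hits, because the innocent population is so much larger than the guilty one.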

You can make two types of adjustments that improve the acceptability of data mining in a defense environment. You can change the failure modes, that is, the consequences of false positives, and you can change the frequency of failure modes. The first thing that absolutely has to change is the reputational consequence of being tagged. You have to get rid of the idea that a machine can look at a set of data and say "here, here's a terrorist" with any acceptable degree of reliability. Once being tagged just means that you're leading an interesting life and a human being should make a judgment whether you're in the small percentage of interesting lives that are also bad guys, the cost of being tagged falsely drops. Of course, the perceived value of the system also drops, so the system's advocates don't want to do this, but that's not the fault of the privacy advocates.
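The two adjustments can be read as the two factors in an expected-cost calculation: how often false positives happen, and how much each one costs the person tagged. A rough Python sketch follows. The cost figures are arbitrary units chosen purely for illustration, and `expected_false_positive_cost` is a hypothetical helper, not part of any real system.

```python
def expected_false_positive_cost(people_screened, false_positive_rate, cost_per_false_positive):
    """Total harm from false positives = how many occur * how much each one hurts.

    All values are hypothetical; cost is in arbitrary units.
    """
    return people_screened * false_positive_rate * cost_per_false_positive


# Assumed baseline: 1% of 1,000,000 people falsely tagged, each at a cost of 100 units
# (think refused boarding, a search, or a lost contract).
baseline = expected_false_positive_cost(1_000_000, 0.01, 100.0)

# Lever 1: change the failure mode. A tag only triggers a human review,
# so the assumed cost per false positive falls to 5 units.
milder = expected_false_positive_cost(1_000_000, 0.01, 5.0)

# Lever 2: change the frequency. Refined algorithms cut the assumed
# false positive rate to 0.1%.
rarer = expected_false_positive_cost(1_000_000, 0.001, 100.0)

print(f"baseline:            {baseline:,.0f}")
print(f"milder consequences: {milder:,.0f}")
print(f"fewer false flags:   {rarer:,.0f}")
```

Either lever shrinks the total, and they multiply together, which is why working on consequences matters even before the algorithms improve.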

The false positive consequences of being refused carriage or being subjected to an unwarranted search also need addressing, as does the business problem of losing contracts due to being tagged. All of these are difficult things to manage, and it's so much easier to dump a Luddite or paranoid tag on somebody who is complaining than to do the hard work of harm reduction in the inevitable cases of false positives.

Total Information Awareness (TIA) might not be a bad idea in principle, but it needs a lot of work in harm reduction before it can see the light of day in real-world use. The same goes for the rest of the resisted IT initiatives that the article talks about. Instead of whining about how we should all be willing to take it in the shorts without complaint, maybe a reduction in the rate of friendly fire might be in order.

Posted by TMLutas at April 3, 2004 10:49 AM