- BlogInfoSec.com - https://www.bloginfosec.com -

Those Data are Mine(d)

The “What They Know” series in The Wall Street Journal continues, and it is commendably relentless in reporting the growing compromise of personal data. At frequent intervals, the WSJ publishes an article on how our privacy is being gnawed away, many millions of records at a time. I first discussed the initial set of “What They Know” columns in my November 1, 2010 column “Privacy? What Privacy?” and suggested that we should all stay on the lookout for subsequent articles in the series. So here’s another one…

An article by Steve Stecklow and Paul Sonne, with the title “Shunned Profiling Method On the Verge of Comeback,” appeared on the front page of the November 24, 2010 edition. This time the reporters describe how “deep packet inspection” can glean behavioral information about millions of us by delving into the details held within the data packets that traverse the Internet.

I had originally known deep packet inspection as a good thing. After all, it is touted by vendors of intrusion detection and prevention systems as a means of detecting malicious, or at least suspect, behavior by prospective attackers. It enables the ferreting out of questionable transactions that more superficial techniques would miss.

But like virtually all valuable technologies, deep packet inspection can be used for good or for evil, and, from the viewpoint of privacy advocates, such profiling of individuals, as described in the article, is certainly a bad thing.

When data mining first appeared on the scene, I was an enthusiastic proponent of the technology, and wrote about it some 14 years ago in an article in the December 1996 issue of Wall Street & Technology with the title “Cashing in on Data Mining.” Back then, the issue of privacy was not top of mind and was simply handled by not providing specific names, addresses, etc. of the persons in the database. The idea was to mine patterns of behavior that could then be used in developing marketing programs. While reports might show distributions of information, such as family income by zip code, they didn’t get down to the individual household level. Granted, marketers would determine that particular zip codes contained persons more likely to purchase a particular type of product or service, but they would target an area rather than specific individuals.

The big change over the past decade has been the ease of attribution of specific data to particular persons, even if separate individual sources do not provide those relationships directly. Through sophisticated analyses and inference methods, today’s analytical systems can drill down to the level of the individual and have a high degree of confidence, though not total assurance, that persons and their attributes have been accurately matched. Unfortunately, much as occasionally happens in the criminal justice system, the wrong person can be “accused” of owning a specific set of data, and the data themselves may be incorrect. This can arise from typographical errors, similarity of names, obsolete information, incorrect data, misinterpretation of facts, and the like. So not only do we risk intrusion on our privacy, but there is also a real and present danger of mistaken identity, misrepresentation, misattribution and misinterpretation.
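
How a mismatch of this kind can arise is easy to sketch. The Python below is a minimal, purely illustrative example, not a description of any real analytical system: the two record sets, the 0.85 similarity threshold, and the function names are all hypothetical. It links records from two unrelated sources when the zip codes agree and the names are merely similar, which is enough to attach one person's purchase to another person's email address.

```python
from difflib import SequenceMatcher

# Hypothetical records from two unrelated sources; every value is invented.
PURCHASES = [
    {"name": "Jon Smith", "zip": "10001", "item": "prescription refill"},
    {"name": "Maria Lopez", "zip": "94110", "item": "hiking boots"},
]
SUBSCRIBERS = [
    {"name": "John Smith", "zip": "10001", "email": "john.smith@example.com"},
    {"name": "Maria Lopez", "zip": "94110", "email": "mlopez@example.com"},
]


def name_similarity(a, b):
    """Return a 0..1 similarity score for two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(purchases, subscribers, threshold=0.85):
    """Pair records whose zip codes match and whose names are 'close enough'.

    The threshold is an assumption chosen for illustration only.
    """
    links = []
    for p in purchases:
        for s in subscribers:
            score = name_similarity(p["name"], s["name"])
            if p["zip"] == s["zip"] and score >= threshold:
                links.append((p["name"], s["email"], p["item"], round(score, 2)))
    return links


if __name__ == "__main__":
    for name, email, item, score in link_records(PURCHASES, SUBSCRIBERS):
        # A score below 1.0 means the names were similar, not identical.
        print(f"{name!r} linked to {email} ({item}), similarity {score}")
```

In this toy data, “Jon Smith” and “John Smith” may well be two different people, yet the sketch confidently joins them. Nothing in the output tells the reader which links rest on an exact match and which rest on an inference, unless the score is kept and examined.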

We need policies, systems and procedures that allow individuals to know the full extent of the use of their personal data, to verify that such data are correct, and to correct inaccurate information about them. This is required in some parts of the world, such as Europe, and should be a basic requirement everywhere. How can we stop abuse and misuse of our personal data if we don’t even know what data about us are out there and to what uses they are put?