Yesterday, industry and government panelists participated in a conference sponsored by the Congressional Internet Caucus Advisory Committee that included a panel discussion on “Plumbing the Policy Implications of Data Analytics and Defining ‘Big Data,’ The Year’s Most Overused Term.”

According to press reports, Federal Trade Commission Senior Policy Adviser and panelist Paul Ohm acknowledged that big data may have potential benefits to public health and research, but also noted that the benefits of big data “tend to get overblown.” Mr. Ohm stated that “when there is an expense to privacy, I think we should have discussions about whether the benefits [of big data] outweigh the costs.”

Erik Jones, Deputy General Counsel of the Senate Commerce Committee, told participants that the Committee is investigating the collection of big data for use by companies to market to consumers. He pointed specifically to last year’s inquiry by Commerce Committee Chairman John D. Rockefeller IV (D-WV) into the activities of nine data brokers. According to press reports, Mr. Jones stated that the Committee is “not suggesting that there’s something inherently wrong” with the use of big data for marketing purposes, but indicated that the Committee wants to learn more about what information is being collected and how that information is used.

Mr. Ohm also expressed concern generally about whether supposedly anonymous data can be linked to real people in a world of “big data.” 

This picks up on a concern that the FTC has previously expressed: that improvements in technology and the ubiquity of public information have made it easier for “anonymous” data to be linked to a particular consumer or device. To address this concern, the FTC’s March 2012 Privacy Report includes recommendations for companies that seek to de-identify data (and thus avoid treating such data as personal data subject to the FTC’s recommended privacy framework). According to the FTC’s report, such companies should (1) take reasonable measures to ensure that data is de-identified; (2) publicly commit not to try to re-identify the data; and (3) contractually prohibit downstream recipients from trying to re-identify the data.