Last Friday, the FTC announced an agenda for its upcoming workshop, “Big Data: A Tool for Inclusion or Exclusion?,” which will take place on Monday, September 15, starting at 8:00 a.m. As we’ve previously reported, the workshop will build on recent efforts by the FTC and other government agencies to understand how new technologies affect the economy, government, and society, and the implications for individual privacy. In particular, while there has been much recognition of the value of big data in revolutionizing consumer services and generally enabling “non-obvious, unexpectedly powerful uses” of information, there has been a parallel focus on the extent to which practices and outcomes facilitated by big-data analytics could have discriminatory effects on protected communities.
The workshop will explore the use of big data and its impact on consumers, including low-income and underserved consumers, and will host the following panel discussions:
- Assessing the Current Environment. Examine current uses of big data in various contexts and how these uses impact consumers.
- What’s on the Horizon with Big Data? Explore potential uses of big data and possible benefits and harms for particular populations of consumers.
- Surveying the Legal Landscape. Review anti-discrimination and consumer-protection laws and discuss how they may apply to the use of big data, and whether there may be gaps in the law.
- Mapping the Path Forward. Consider best practices for the use of big data to protect consumers.
The FTC hopes that the workshop will build on the dialogue begun in its Spring Privacy Seminar Series, held from February through May, which addressed mobile-device tracking, data brokers and predictive scoring, and consumer-generated and consumer-controlled health data. The workshop will convene academic experts, business representatives, industry leaders, and consumer advocates, and will be open to the general public. In advance of the workshop, the FTC has invited the public to file comments, reports, and original research on the proposed topics. The deadline to submit pre-workshop comments is August 15; following the workshop on September 15, the comment period will remain open until October 15.
The workshop comes on the heels of the White House’s much-anticipated report on big data, released in May, which outlined the administration’s priorities in protecting privacy and data security in an era of big data. With an entire section dedicated to “Big Data and Discrimination,” the report warned that big data “could enable new forms of discrimination and predatory practices.” Focusing chiefly on the use of information, the report expressed concern about data being used to discriminate against vulnerable groups. Specifically, the report stated that “the ability to segment the population and to stratify consumer experiences so seamlessly as to be almost undetectable demands greater review, especially when it comes to the practice of differential pricing and other potentially discriminatory practices.” Other such practices that government officials and consumer advocates have discussed include:
- Identifying frequent shoppers and providing them with better customer service or reduced wait times. For example, a company might give white female shoppers preferential treatment if data analysis demonstrated that white women were the company’s most profitable customers.
- Offering particular consumers different product prices or mortgage rates, based on information about the individuals’ credit histories or whether they had previously been clients of a company’s competitors.
- Predicting the behavior of individual consumers based on patterns drawn from aggregate characteristics of a group of consumers. For instance, a company might assess the credit risk of an individual consumer based on the aggregate credit characteristics of the groups of consumers who shop at specific stores, as the sketch following this list illustrates.
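To make that last practice concrete, here is a minimal, hypothetical Python sketch of group-based scoring: an individual is assigned the average historical default rate of the customers of the stores he or she visits, so the individual’s own credit record never enters the calculation. All store names and rates are invented for illustration.

```python
# Hypothetical sketch of group-based credit scoring: an individual's
# risk estimate is derived entirely from aggregate statistics about
# other people who shop at the same stores. All names and rates are
# fabricated for illustration.

from statistics import mean

# Historical default rate among each store's customers (invented data).
store_default_rates = {
    "DiscountMart": 0.12,
    "CornerGrocer": 0.08,
    "LuxuryGoods": 0.02,
}

def group_based_risk(stores_visited):
    """Estimate one consumer's risk from group aggregates alone."""
    return mean(store_default_rates[s] for s in stores_visited)

# Two shoppers with identical personal credit histories receive very
# different risk estimates based solely on where they shop.
print(group_based_risk(["DiscountMart", "CornerGrocer"]))  # 0.10
print(group_based_risk(["LuxuryGoods"]))                   # 0.02
```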
Advocates also point out that big-data analyses can use information voluntarily shared outside of the consumer or retail context (for example, on social networks) to infer personal information such as race, ethnicity, religion, gender, age, and sexual orientation. In some scenarios, this information could be used to disproportionately reduce some consumer groups’ access to certain products, services, or content. As John Podesta, the Counselor to the President who led the review underlying the White House’s report, has said, big-data technology has the power to “reinforc[e] existing inequities in housing, credit, employment, health and education.” The report also points out that, because of the “lack of transparency and accountability, individuals have little recourse to understand or contest the information that has been gathered about them or what that data, after analysis, suggests.” Although the report acknowledged that big data can also help identify and address such practices, it pointed strongly toward the development of laws that would prevent them.
The discussion about big data and discrimination, however, began before the White House report. In 2012, Alistair Croll wrote a blog post entitled “Big Data Is Our Generation’s Civil Rights Issue, and We Don’t Know It.” Croll cautioned that, in the consumer context, “personalization” was simply “discrimination” masquerading as “better service.” As an example, he pointed to an online dating service that, as early as 2010, had identified certain “trigger” words often used by members of a specific race or gender. Croll predicted: “We’re seeing the start of this slippery slope everywhere from tailored credit-card limits . . . to car insurance based on driver profiles. In this regard, Big Data is a civil rights issue, but it’s one that society in general is ill-equipped to deal with.”
From a legal standpoint, a principal issue is the disparity in awareness between the companies that analyze data and the consumers whose data is being used, which makes it difficult to apply existing anti-discrimination laws in this context. According to Pam Dixon, Executive Director of the World Privacy Forum, secret data scoring “can hide discrimination, unfairness and bias both in the score itself and in its use.” Kate Crawford, Principal Researcher at Microsoft, adds, “It’s not that big data is effectively discriminating – it is, we know that it is. It’s that you will never actually know what those discriminations are.” For example, “[b]anks must report detailed statistics about their actual lending activity to regulators, but web advertising parameters are seemingly free of discrimination.” Crawford says that “[b]y never putting offers in front of unwanted groups, and thus never formally rejecting them, those who engage in online discrimination could sidestep fair lending and redlining laws that apply in the physical world.”
Another problem is that even innocuously designed computer algorithms can generate discriminatory results, depending on the data that the algorithms collect and analyze. One possibility is that the substance of the collected data indirectly reveals information about a person’s gender or race, making it possible for a gender-free or race-free algorithm to produce gender- or race-biased results. For example, a recruiting algorithm that sorts candidates based on their residential locations, academic institutions, or online connections could indirectly categorize the candidates by gender or race. Similarly, recent studies have shown that algorithms can examine a person’s Facebook “likes” and predict the person’s gender, race, sexual orientation, and drinking habits with significant accuracy. Alternatively, the issue could be that the data is collected from an unrepresentative sample of the population: a smartphone app allowing drivers to report potholes for repair, or a Twitter app that lets users request emergency aid during natural disasters, might concentrate resources in wealthy areas with higher rates of smartphone ownership.
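To illustrate how a facially neutral algorithm can reproduce bias through proxy variables, consider the following hypothetical Python sketch. The scoring function never reads a protected attribute; it sees only income and ZIP code. But because ZIP code is correlated with group membership in the (fabricated) data, the resulting scores differ systematically by group. All field names, weights, and records are invented for illustration.

```python
# Hypothetical sketch: a "blind" scoring rule that never reads a
# protected attribute can still produce biased outcomes when an input
# feature (here, ZIP code) is correlated with that attribute.
# All records, names, and weights below are fabricated.

from statistics import mean

# Fabricated applicants. The model sees only zip_code and income;
# "group" is retained solely so we can audit the outcomes afterward.
applicants = [
    {"zip_code": "10001", "income": 85, "group": "A"},
    {"zip_code": "10001", "income": 62, "group": "A"},
    {"zip_code": "10001", "income": 71, "group": "A"},
    {"zip_code": "60601", "income": 83, "group": "B"},
    {"zip_code": "60601", "income": 64, "group": "B"},
    {"zip_code": "60601", "income": 70, "group": "B"},
]

# A neighborhood-level "risk adjustment" learned from historical data.
# Where neighborhoods are segregated, this term quietly becomes a
# proxy for group membership.
zip_risk_penalty = {"10001": 0.0, "60601": 15.0}

def credit_score(applicant):
    """Score from income and ZIP code only; no protected attribute used."""
    return applicant["income"] - zip_risk_penalty[applicant["zip_code"]]

# Audit: average score by group, even though "group" was never an input.
for group in ("A", "B"):
    scores = [credit_score(a) for a in applicants if a["group"] == group]
    print(f"Group {group}: mean score {mean(scores):.1f}")
# Group A: mean score 72.7 / Group B: mean score 57.3
```

An audit of this kind, comparing outcomes across groups the model never saw as inputs, is one way the “more data, not less” approach discussed below can surface discrimination that an algorithm’s inputs conceal.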
Michael Hendrix, director for emerging issues and research at the U.S. Chamber of Commerce Foundation, takes a cautious stance in the discussion, pointing out that “Big Data is just a tool—it can be used for good or ill,” and that “[s]imply singling out one innovation or industry won’t rid us of discrimination.” Hendrix would prefer for the discussion to “focus on the good uses of data while substantively addressing those practices that are clearly discriminatory or abusive,” and to recognize that “[a]chieving these aims requires more data, not less.” So far, with attention shifting toward the use, rather than the collection, of data, the FTC and the White House appear to be taking exactly this approach.