How to Exercise the Power You Didn’t Ask For


A version of this piece as it originally appeared in the Harvard Business Review on September 19, 2018 is accessible here.

I used to be largely indifferent to claims about the use of private data for targeted advertising, even as I worried about privacy more generally. How much of an intrusion was it, really, for a merchant to hit me with a banner ad for dog food instead of cat food, since it had reason to believe I owned a dog? And any users who were sensitive about their personal information could just click on a menu and simply opt out of that kind of tracking.

But times have changed.

The digital surveillance economy has ballooned in size and sophistication, while keeping most of its day-to-day tracking apparatus out of view. Public reaction has ranged from muted to deeply concerned, with a good portion of those in the concerned camp feeling so overwhelmed by the pervasiveness of their privacy loss that they’re more or less reconciled to it. It’s long past time not only to worry but to act.

Advertising dog food to dog owners remains innocuous, but pushing payday loans to people identified as being emotionally and financially vulnerable is not. Neither is targeted advertising that is used to exclude people. Julia Angwin, Ariana Tobin, and Madeleine Varner found that, on Facebook, targeting could be used to show housing ads only to white consumers. Narrow targeting can also render long-standing mechanisms for detecting market failure and abuse ineffective: State attorneys general or consumer advocates can’t respond to a deceitful ad campaign, for instance, when they don’t see it themselves. Uber took this predicament to cartoon villain extremes when, to avoid sting operations by local regulators, it used data collected from the Uber app to figure out who the officials were and then sent fake information about cars in service to their phones.

These are relatively new problems. Originally, our use of information platforms, whether search engines or social media, wasn’t tailored much to anything about us, except through our own direct choices. Your search results for the query “Are vaccinations safe?” would be the same as mine or, for a term like “pizza,” varied in a straightforward way, such as by location, offering up nearby restaurants. If you didn’t like what you got, the absence of tailoring suggested that the search platform wasn’t to blame; you simply were seeing a window on the web at large. For a long time that was a credible, even desirable, position for content aggregators to take. And for the most part they themselves weren’t always good at predicting what their own platforms would offer up. It was a roulette wheel, removed from any human agent’s shaping.

Today that’s not true. The digital world has gone from pull to push: Instead of actively searching for specific things, people read whatever content is in the feeds they see on sites like Facebook and Twitter. And more and more, people get not a range of search results but a single answer from a virtual concierge like Amazon’s Alexa. And it may not be long before such concierges rouse themselves to suggest it’s time to buy a gift for a friend’s birthday (perhaps from a sponsor) or persistently recommend Uber over Lyft when asked to procure a ride (again, thanks to sponsorship).

Is it still fair for search platforms to say, “Don’t blame me, blame the web!” if a concierge provides the wrong directions to a location or the wrong drug interaction precautions? While we tend not to hold Google and Bing responsible for the accuracy of every link they return on a search, the case may be different when platforms actively pluck out only one answer to a question — or answer a question that wasn’t even asked.

We’ve also moved to a world where online news feeds — and in some cases concierges’ answers to questions — are aggressively manipulated by third parties trying to gain exposure for their messages. There’s great concern about what happens when those messages are propaganda — that is, false and offered in bad faith, often obscuring their origins. Elections can be swayed, and people physically hurt, by lies. Should the platforms be in the business of deciding what’s true or not, the way that newspapers are? Or does that open the doors to content control by a handful of corporate parties — after all, Facebook has access to far more eyeballs than a single newspaper has ever had — or by the governments that regulate them?

Companies can no longer sit this out, much as they’d like to. As platforms provide highly curated and often single responses to consumers’ queries, they’re likely to face heated questions — and perhaps regulatory scrutiny — about whom they’re favoring or disfavoring. They can’t just shrug and point to a “neutral” algorithm when asked why their results are the way they are. That abdication of responsibility has led to abuse by sophisticated and well-funded propagandists, who often build Astroturf campaigns that are meant to look as if they’re grassroots.

So what should mediating platforms do?

An answer lies in recognizing that today’s issues with surveillance and targeting stem from habit and misplaced trust. People share information about themselves without realizing it and are unaware of how it gets used, passed on, and sold. But the remedy of allowing them to opt out of data collection leads to decision fatigue for users, who can articulate few specific preferences about data practices and simply wish not to be taken advantage of.

Restaurants must meet minimum standards for cleanliness, or (ideally) they’ll be shut down. We don’t ask the public to research food safety before grabbing a bite and then to “opt out” of the dubious dining establishments. No one would rue being deprived of the choice to eat food contaminated with salmonella. Similar intervention is needed in the digital universe.

Of course, best practices for the use of personal information online aren’t nearly as clear cut as those for restaurant cleanliness. After all, much of the personalization that results from online surveillance is truly valued by customers. That’s why we should turn to a different kind of relationship for inspiration: one in which the person gathering and using information is a skilled hired professional helping the person whose data is in play. That is the context of interactions between doctors and patients, lawyers and clients, and certified financial planners and investors.

Yale Law School’s Jack Balkin has invoked these examples and proposed that today’s online platforms become “information fiduciaries.” We are among a number of academics who have been working with policymakers and internet companies to map out what sorts of duties a responsible platform could embrace. We’ve found that our proposal has bipartisan appeal in Congress, because it protects consumers and corrects a clear market failure without the need for heavy-handed government intervention.

“Fiduciary” has a legalese ring to it, but it’s a long-standing, commonsense notion. The key characteristic of fiduciaries is loyalty: They must act in their charges’ best interests, and when conflicts arise, must put their charges’ interests above their own. That makes them trustworthy. Like doctors, lawyers, and financial advisers, social media platforms and their concierges are given sensitive information by their users, and those users expect a fair shake — whether they’re trying to find out what’s going on in the world or how to get somewhere or do something.

A fiduciary duty wouldn’t broadly rule out targeted advertising — dog owners would still get dog food ads — but it would preclude predatory advertising, like promotions for payday loans. It would also prevent data from being used for purposes unrelated to the expectations of the people who shared it, as happened with the “personality quiz” survey results that were later used to psychometrically profile voters and then attempt to sway their political opinions.

This approach would eliminate the need to judge good from bad content, because it would let platforms make decisions based on what their users want, rather than on what society wants for them. Most users want the truth and should be offered it; others may not value accuracy and may prefer colorful and highly opinionated content instead — and when they do, they should get it, perhaps labeled as such. Aggregators like Google News and Facebook are already starting to make such determinations about what to include as “news” and what counts as “everything else.” It may well be that an already-skeptical public only digs in further when these giants offer their judgments, but well-grounded tools could also inform journalists and help prevent propaganda posted on Facebook from spreading into news outlets.

More generally, the fiduciary approach would bring some coherence to the piecemeal privacy protections that have emerged over the years. The right to know what data has been collected about you, the right to ask that it be corrected or purged, and the right to withhold certain data entirely all jibe with the idea that a powerful company has an obligation to behave in an open, fair way toward consumers and put their interests above its own.

While restaurant cleanliness can be managed with readily learned best practices (keep the raw chicken on a separate plate), doctors and lawyers face more complicated questions about what their duty to their patients and clients entails (should a patient with a contagious and dangerous disease be allowed to walk out of the office without treatment or follow-up?). But the quandaries of online platforms are even less easy to address. Indeed, one of the few touchstones of data privacy — the concept of “personally identifiable information,” or PII — has become completely blurry, as identifying information can now be gleaned from previously innocuous sources, making nearly every piece of data drawn from someone sensitive.

Nevertheless, many online practices will always be black-and-white breaches of an information fiduciary’s duty. If Waze told me that the “best route” somewhere just so happened to pass by a particular Burger King, and it gave that answer to get a commission if I ate there, then Waze would be putting its own interests ahead of mine. So would Mark Zuckerberg if hypothetically he tried to orchestrate Facebook feeds so that Election Day alerts went only to people who would reliably vote for his preferred candidate. It would be helpful to take such possibilities entirely off the table now, at the point when no one is earning money from them or prepared to go to bat for them. As for the practices that fall into a grayer area, the information fiduciary approach can be tailored to account for newness and uncertainty as the internet ecosystem continues to evolve.

Ideally, companies would become fiduciaries by choice, instead of by legal mandate. Balkin and I have proposed how this might come about — with, say, U.S. federal law offering relief from the existing requirements of individual states if companies opt in to fiduciary status. That way, fiduciary duties wouldn’t be imposed on companies that don’t want them; they could take their chances, as they already do, with state-level regulation.

In addition, firms would need to structure themselves so that new practices that raise ethical issues are surfaced, discussed internally, and disclosed externally. This is not as easy as establishing a standard compliance framework, because in a compliance framework the assumption is that what’s right and wrong is known, and managers need only to ensure that employees stay within the lines. Instead the idea should be to encourage employees working on new projects to flag when something could be “lawful but awful” and congratulate — rather than retaliate against — them for calling attention to it. This is a principle of what in medical and some other fields is known as a “just culture,” and it’s supported by the management concept of “psychological safety,” wherein a group is set up in a way that allows people to feel comfortable expressing reservations about what they’re doing. Further, information fiduciary law as it develops could provide some immunity not just to individuals but to firms that in good faith alert the public or regulators to iffy practices. Instead of having investigations into problems by attorneys general or plaintiffs’ lawyers, we should seek to create incentives for bringing problems to light and addressing them industrywide.

That suggests a third touchstone for an initial implementation of information fiduciary law: Any public body chartered with offering judgments on new issues should be able to make them prospectively rather than retroactively. For example, the IRS can give taxpayers a “private letter ruling” before they commit to one tax strategy or another. On truly novel issues, companies ought to be able to ask public authorities — whether the Federal Trade Commission or a new body chartered specifically to deal with information privacy — for guidance rather than having to make a call in unclear circumstances and then potentially face damages if it turns out to be the wrong one.

Any approach that prioritizes duty to customers over profit risks trimming margins. That’s why we need to encourage a level playing field, where all major competitors have to show a baseline of respect. But the status quo is simply not acceptable. Though cleaning up their data practices will increase the expenses of the companies who abuse consumers’ privacy, that’s no reason to allow it to continue, any more than we should heed polluters who complain that their margins will suffer if they’re forced to stop dumping contaminants in rivers.

The problems arising from a surveillance-heavy digital ecosystem are getting more difficult and more ingrained. It’s time to try a comprehensive solution that’s sensitive to complexities, geared toward addressing them as they unfold, and based on duty to the individual consumers whose data might otherwise be used against them.

CDA 230 Then and Now: Does Intermediary Immunity Keep the Rest of Us Healthy?


This essay was originally published in November of 2017 as part of a series commemorating the 20th anniversary of the Zeran v. AOL case.

Twenty years after it was first litigated in earnest, the U.S. Communications Decency Act’s Section 230 remains both obscure and vital. Section 230 nearly entirely eliminated the liability of Internet content platforms under state common law for bad acts, such as defamation, occasioned by their users. The platforms were free to structure their moderation and editing of comments as they pleased, without a traditional newspaper’s framework, in which to undertake editing was to bear responsibility for what was published. If the New York Times included a letter to the editor that defamed someone, the Times would be vulnerable to a lawsuit (to be sure, so would the letter’s author, whose wallet size would likely make for a less tempting target). Not so for online content portals that welcome comments from anywhere – including the online version of the New York Times.

This strange medium-specific subsidy for online content platforms made good if not perfect sense in 1996. (My generally positive thinking about it from that time, including some reservations, can be found here.) The Internet was newly mainstream, and many content portals comprised the proverbial two people in a garage. To impose upon them the burdens of traditional media would presumably require tough-to-maintain gatekeeping. Comments sections, if they remained at all, would have to be carefully screened to avoid creating liability for the company. What made sense for a newspaper publishing at most five or six letters a day amidst its more carefully vetted articles truly couldn’t work for a small Internet startup processing thousands or even millions of comments or other contributions in the same interval. Over time, the reviews elicited by Yelp and TripAdvisor, the financial markets discussions on Motley Fool, the evolving articles on user-edited Wikipedia – all are arguably only possible thanks to that Section 230 immunity conferred in 1996.

The immunity conferred is so powerful that there’s not only a subsidy of digital over analog, but one for third-party commentary over one’s own – or that of one’s employees. Last year the notorious Gawker.com settled for $31 million after being successfully sued for publishing a two-minute extract of a private sex video. If Gawker, instead of employing a staff whose words (and video excerpts) were attributable to the company, had simply let any anonymous user post the same excerpt – and indeed worked to assure that that user’s anonymity could not be pierced – it would be immune from an identical invasion of privacy suit thanks to the CDA. From this perspective, Gawker’s mistake wasn’t to host the video, but to have its own employees be the ones to post it.

The Internet environment of 2017 is a lot different than that of 1997, and some of those two-people-in-a-garage ventures are now among the most powerful and valuable companies in the world. So does it make sense to maintain Section 230’s immunities today?


An infant industry has grown up

In 1997, it made sense on a number of fronts to treat the Internet differently from its analog counterparts. For example, there was debate from the earliest mainstreaming of Internet commerce about whether to make U.S. state sales tax collection apply to Internet-based faraway purchases. The fact that there was so little Internet commerce meant that there was not a lot of money forgone by failing to tax; that new companies (and, for that matter, existing ones) could try out e-commerce models without concerning themselves from the start with tax compliance in multiple jurisdictions; and that the whole Internet sector could gather momentum if purchasers were enticed to go online – which in turn would further entice more commerce, and other activity, online. I was among those who therefore argued in favor of the de facto moratorium on state sales tax. But that differential no longer makes sense. A single online company – Amazon – now accounts for about 5% of all U.S. retail sales, online or off. It’s a good thing that Amazon’s physical expansion has meant that it naturally has started collecting and remitting state sales tax around the country.

Perhaps the evolution of the merits of equal treatment for state sales tax provides a good model for a refined CDA: companies below a certain size or activity threshold could benefit from its immunities, while those who grow large enough to facilitate the infliction of that much more damage from defamatory and other actionable posts might also have the resources to employ a compliance department. That would militate towards at least some standard to meet in vetting or dealing with posts, perhaps akin to the light duties of booksellers or newsstands towards the wares they stock rather than the higher ones of newspapers towards the letters they publish. Apart from the first-order drawback of an incentive to game the system by staying just under whatever size or activity threshold triggers the new responsibilities, there’s also the question of non-commercial communities that can become large without having traditional corporate hierarchies that lend themselves to direct legal accountability. Some of the most important computing services in the world rely on free and open source software, even as there remains a puzzle of how software liability would work when there’s no organized firm singly producing it. This puzzle has remained unsolved even today, since liability for bugs or vulnerabilities in even corporate-authored software tends to be quite minimal. That might change as the line between hardware and software continues to blur with the Internet of Things.

Even for companies suited for new, light responsibilities under a modified CDA, there might be a distinction made between damages for past acts and duties for future ones. The toughest part of the Zeran case even for those sympathetic to the CDA is that apparently AOL was repeatedly told that the scandalous advertisement purporting to be from Ken Zeran was in fact not at all related to him – and the company was in a comparatively good position to confirm that. Even then the company did nothing. It’s one thing to have permitted some defamatory content to come through amidst millions of messages; it’s another to be fully aware of it once it’s posted, and to still not be charged with any responsibility to deal with it. A more refined CDA might underscore such a distinction, favoring the kind of knowledge of falsehood that’s at the heart of the heightened New York Times v. Sullivan barrier that public figures must meet in establishing defamation by a newspaper, and also cover knowledge that might come about after publication rather than before – leading only to responsibility once the knowledge is gained and not timely acted upon.


The AI thicket

Even massive online speech-mediating companies can only hire so many people. With thousands of staffers around the world apparently committed to reviewing complaints arising over Facebook posts, the company still relies on algorithms to sift helpful from unhelpful content. And here the distinction between pre- and post-publication becomes blurred, because services like Facebook and Twitter not only host content – as a newspaper website does by permitting comments to appear in sequence after an article – but they also help people navigate it. A post might reach ten people or a billion, depending on whether it’s placed in no news feeds or many.

The CDA as it stands allows maximum flexibility for salting feeds, since no liability will attach for spreading even otherwise-actionable content far and wide. A refined CDA could take into account the fact that Facebook and others know exactly whom they’ve reached: perhaps a more reasonable and fitting remedy for defamation would be less to assess damages against the company for having abetted it than to require a correction or other followup to go out to those who saw – and perhaps came to believe – the defamatory content. (To be sure, this solution doesn’t work for other wrongs such as invasion of privacy; no correction can “uninvade” it among those who saw the content in question.)

Such corrective, rather than compensatory, remedies may be more fitting both for the wronged party and for the publisher, but they could in turn make content elision much more common. For example, in the context of traditional book publishing, including for non-interactive digital books like those within a Kindle, the CDA does not protect the publisher against the author’s defamation. With a threat of liability remaining, I’ve worried that in addition to damages, a litigant might demand a digital retraction: a forced release of a new version of an e-book to all e-readers that omits the defamatory content.

Of course, if the challenged words are really defamatory, that might be thought of as an improvement for both the injured party and the reader. But if done without notice to the reader, it smacks of propaganda, and to the extent lawsuits or threats of same can induce defendant publishers to cave – when caving doesn’t entail paying out damages but rather altering the content they’ve stewarded – it could come to happen all too frequently, and with the wrong incentives. Similarly, an AI trained to avoid controversial subjects – perhaps defined as subjects that could give rise to threats of litigation – might be very much against the public interest. This would mirror some of the damaging incentives of Europe’s “right to be forgotten” as developed against search engines. Any refinement of the CDA that could inspire AI-driven content shaping runs this risk, with the perverse solace that even with today’s CDA the major content platforms are already shaping content in ways that are not understandable or reviewable outside the companies.

Related to the power of AI is the refined power to personalize content in 2017, including by jurisdiction. If a Texas court finds something defamatory under Texas law, such as maligning certain food products, it might not be defamatory under, say, Massachusetts law. Any diminution of CDA 230’s immunities might in the first order impel online platforms like Facebook to have to police away any food disparagement – even if it’s posted and read by Facebook users in food-indifferent Massachusetts. If there were to be exposure under Texas law, perhaps it should only arise if the content were shown (or continued to be shown) in Texas. This could also provide a helpful set of pressures on the substantive doctrine: Texas citizens, including legislators, might rue being excluded from certain content online that’s available in other states.

The Internet’s development over the past twenty years has benefited immeasurably from the immunities conferred by Section 230. We’ve been lucky to have it. But any honest account must acknowledge the collateral damage it has permitted to be visited upon real people whose reputations, privacy, and dignity have been hurt in ways that defy redress. Especially as that damage becomes more systematized – now part of organized campaigns to shame people into silence online for expressing opinions that don’t fit an aggressor’s propaganda aims – platforms’ failures to moderate become more costly, both to targets of harassment and to everyone else denied exposure to honestly-held ideas.

As our technologies for sifting and disseminating content evolve, and our content intermediaries trend towards increasing power and centralization, there are narrow circumstances where a path to accountability for those intermediaries for the behavior of their users might be explored. Incrementalism gets a bad rap, but it’s right to proceed slowly if at all here, with any tweaks subject to rigorous review of how they impact the environment. The vice from the indiscriminate nature of Section 230’s broad immunity is somewhat balanced by a virtue of everyone knowing exactly where matters stand – line-drawing carries its own costs and distortions.

A novel way of defending against mass uses of our data


AI is getting better at performing mass categorization of photos and text. A developer can scrape a bunch of photos from, say, Facebook — either directly, likely violating the terms of service, or through offering an app by which people consent to the access — and then use a well-trained categorizer to automatically discern ethnicity, gender, or even identity.

Some defenses can be built in against abuse, starting with a technical parlor trick and ending with support from the law. There’s been promising research on “image perturbation” that adjusts a photo in a way that is unnoticeable to a human, but that completely confuses standard image recognition tools that might otherwise make it easy to categorize a photo. (There’s a helpful video summary of some of the research by Nguyen, Yosinski, and Clune available here.)

For example, this intrepid group of MIT students can make Google’s otherwise-reliable image recognition algorithm mistake a turtle for a rifle, or a cat for … guacamole.
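To make the idea of a perturbation concrete, here is a minimal sketch in Python. It is not the method used in the research mentioned above, nor the tool our team is building: it assumes a toy linear "recognizer" over a flat list of pixel intensities, and every name and number in it (`score`, `perturb`, the weights, the 0.02 budget) is hypothetical. Real image recognizers are deep networks, but the underlying trick is similar: nudge each pixel a tiny amount in whichever direction most hurts the model's confidence.

```python
# Toy illustration only: a linear "recognizer" over a flat list of pixel
# intensities in [0, 1]. All names and numbers here are hypothetical.

def score(weights, bias, pixels):
    """Positive score means the toy model says 'cat'."""
    return sum(w * x for w, x in zip(weights, pixels)) + bias


def classify(weights, bias, pixels):
    return "cat" if score(weights, bias, pixels) > 0 else "guacamole"


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, pixels, epsilon):
    """Move every pixel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to each pixel
    is simply that pixel's weight, so stepping against sign(weight) is the
    most damaging change available under a per-pixel budget of epsilon.
    Results are clamped so they remain valid intensities.
    """
    return [min(1.0, max(0.0, x - epsilon * sign(w)))
            for w, x in zip(weights, pixels)]


N_PIXELS = 1024
WEIGHTS = [0.01 if i % 2 == 0 else -0.01 for i in range(N_PIXELS)]
BIAS = 0.05
IMAGE = [0.5] * N_PIXELS  # a flat gray stand-in for a real photo

ADVERSARIAL = perturb(WEIGHTS, IMAGE, epsilon=0.02)

print(classify(WEIGHTS, BIAS, IMAGE))        # cat
print(classify(WEIGHTS, BIAS, ADVERSARIAL))  # guacamole
```

No pixel moves by more than 2 percent of its range, a change a person would be unlikely to notice, yet the many small nudges add up across a thousand pixels and flip the label.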

I’m part of a team at MIT and Harvard within the Assembly program — Thom Miano, Dhaval Adjodah, Francisco, Daniel Pedraza, Gretchen Greene, and Josh Joseph — that’s working on tools so that users can upload invisibly-modified photos of themselves to social media without making them so readily identifiable.

Those modifications won’t thwart AI tools forever — but they’ll represent an unmistakable indication about user preference, and the law can then demand that those preferences be respected.

There is already a model for this: photos taken with a smartphone are invisibly labeled with time, date, and location. Facebook and Twitter for years have automatically stripped this information out before they show those photos on their services, avoiding a privacy nightmare by which a single photo could instantly locate someone. (There would no doubt have been Congressional hearings had they failed to do this.)

They can and should similarly undertake, on behalf of their users, to perturb images with the latest technology to prevent widescale AI-assisted identification by others, and to provide an anchor similar to “do not track” to make user preferences about bulk downstream use abundantly clear.

And these defenses need not apply only to photos. An insurance company had an opt-in plan for Facebook users to have the nature of their posts influence their car insurance rates. As the Guardian described it:

Facebook users who write in short, concise sentences, use lists, and arrange to meet friends at a set time and place, rather than just “tonight”, would be identified as conscientious. In contrast, those who frequently use exclamation marks and phrases such as “always” or “never” rather than “maybe” could be overconfident.

There are also techniques that have moved from academia to industry like “differential privacy” and its precursors, where decoy data — a few new random stray likes in a profile — can be introduced to allow for helpful generalizations from bulk data across lots of people while protecting individual privacy by preventing easy generalizations about a single person.
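One classical precursor of this kind is "randomized response," and a minimal Python sketch of it follows. This is an illustration under stated assumptions, not a description of how any particular company implements differential privacy: the function names, the 75 percent truth-telling probability, and the simulated population are all hypothetical. Each person's "like" is reported truthfully only some of the time and is otherwise replaced by a coin flip, so any single report is deniable, while the known noise rate can be subtracted back out of the aggregate.

```python
import random


def randomized_response(truth, rng, p_truth=0.75):
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports, p_truth=0.75):
    """Undo the noise in aggregate.

    Expected observed rate = p_truth * true_rate + (1 - p_truth) * 0.5,
    so we solve that equation for true_rate.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


# Hypothetical population: 30% of 100,000 people truly "like" some page.
rng = random.Random(0)
truths = [i < 30_000 for i in range(100_000)]
reports = [randomized_response(t, rng) for t in truths]

print(round(estimate_true_rate(reports), 2))
```

The population-level estimate lands close to the true 30 percent, but roughly one report in eight is a decoy, so nobody looking at a single profile can be sure that a given like is real.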