Fair Use Week 2023 (10th Anniversary): Day Two With Guest Expert Brandon Butler

Avoiding Copyright Literalism and the Fairness of Computer-Generated Works

by Brandon Butler

The last six months or so have seen the seemingly sudden appearance of several startlingly powerful tools that create complex new textual and visual works in response to relatively simple prompts. You probably know at least a couple by name: chatGPT (for text) and Stable Diffusion (for images) are the ones that seem to have taken over my social feeds. These tools are creating a buzz in part because the works they generate are of sufficient quality that they could pass for or replace the work of humans, at least in some contexts. This raises a laundry list of policy questions, some as old as the story of John Henry (will machines put humans out of work?), others as 21st Century as data sovereignty (how can nations govern data pertaining to their citizens when it flows seamlessly around the globe?).

In copyright world – including in some of the inevitable raft of lawsuits – the question has been put more narrowly: do these computer tools violate the copyrights of the works that are used to “train” them? Lots of smart people have opined on this already, so I don’t want to go too deeply down this rabbit hole myself. The technical legal answer I favor is straightforward, and the very short version is that there’s no meaningful difference between these tools and the other “non-consumptive”/computational uses that courts have already blessed as fair use many times over. These uses are fair because precedent pretty clearly says they are. Maybe I’m being too glib about the technical legal answer, but in any case, I want to answer a different question.

Why should we embrace this (IMO) fact about the law, that fair use generally protects tools like chatGPT and Stable Diffusion against copyright liability? Even if we have legitimate concerns about the impacts of these technologies, we should recognize these are not copyright concerns and stand by fair use and the robots’ right to read. I think the answer is rooted in copyright’s purpose, and the corresponding limits in its scope.

In a nutshell, my argument is this: The exclusive rights in copyright law are not well-tailored to the law’s public interest purpose. Applied broadly and literally (I’ll call this “copyright literalism”), the exclusive rights in the law threaten to chill uses that benefit the public and that do not result in the kind of unfair competition that copyright was meant to prevent. Fair use exists in part to shield legitimate uses from copyright literalism and contain copyright to its intended domain. The application of copyright’s exclusive rights to computer-generated works is copyright literalism par excellence—it punishes literal copying even though the final result is non-infringing and the putative harm to the copyright holder (the creation of new *non-infringing* works that are cheaper and easier to produce) is not the kind of harm that copyright exists to prevent.

(NB: I realize that in some cases these technologies can be tricked into reproducing their training materials, and of course in these cases the outputs likely are infringing. I’m addressing here the argument that computer-generated works that are the result of a process involving “training” with in-copyright works are per se infringing.)

Copyright is for the public

Article I, Section 8, clause 8 of the US Constitution gives Congress the power to create copyrights (and patents). Crucially, the clause specifies the purpose of this power: “[t]o promote the Progress of Science and useful Arts.” Granting copyrights “for limited Times” (a term of 14 years at the time that clause was written) is a means to an end, which ideally Congress and the courts should bear in mind as they consider how to modify or apply the law.

Congressional action has not always been guided by this principle (witness the extension of copyright term by more than a century despite little evidence of any public benefit), but courts, especially the Supreme Court, acknowledge copyright’s public interest purpose all the time. For example, here’s Justice Kagan in Kirtsaeng v. John Wiley & Sons, Inc., 136 S. Ct. 1979, 1986 (2016):

“[C]opyright law ultimately serves the purpose of enriching the general public through access to creative works.”

And Justice O’Connor in one of my personal favorites, Feist Pubs., Inc. v. Rural Tel. Svc. Co., Inc., 499 U.S. 340, 349 (1991):

“The primary objective of copyright is not to reward the labor of authors, but ‘[t]o promote the Progress of Science and useful Arts.’”

And Twentieth Century Music Corp. v. Aiken, 422 U.S. 151, 156 (1975):

“[P]rivate motivation must ultimately serve the cause of promoting broad public availability of literature, music, and the other arts.”

And Fox Film Corp. v. Doyal, 286 U. S. 123, 127 (1932):

“The sole interest of the United States and the primary object in conferring the monopoly lie in the general benefits derived by the public from the labors of authors.”

The consequences of all this for fair use become clear in a pair of Supreme Court cases that enshrine fair use (alongside the idea/expression dichotomy) as a core, constitutionally-mandated element of the copyright law.

Public Interest Safety Valve(s)

Two cases sought to challenge the unprecedented expansion of copyright’s length and strength at the end of the 20th century. Eldred v. Ashcroft challenged the retroactive addition of 20 years to existing copyright terms, then Golan v. Holder challenged the restoration of copyright for works that had previously entered the public domain. In both cases the challengers argued that the law had intruded impermissibly on the public’s constitutional interests by starving the public domain, but in both cases the Supreme Court declined to second-guess Congress’s judgment.

To soften these blows to the public’s constitutional interest in copyright, the Court highlighted in Eldred (and reiterated in *Golan*) the presence of two key “First Amendment accommodations” in the law: fair use and the idea/expression dichotomy (the principle that copyright does not protect abstract ideas, only particular creative expressions). These doctrines ensure that even during the term of copyright, the public has some leeway to use copyright-encumbered works.

This is important because the literal scope of the exclusive rights in copyright is breathtakingly broad – reproduction, distribution, adaptation – and there is hardly anything you can do with a copyrighted work that doesn’t involve one of these activities, especially in a digital context. And copyright infringement is what’s called a “strict liability” offense: there is no requirement that the alleged infringer have a bad intent in engaging in any of these acts. If not for fair use (and the body of other limitations and exceptions, including the idea/expression dichotomy), copyright would be an extraordinarily powerful private right to control others’ engagement with culture and knowledge.

Google v. Oracle, Copyright, and Competition

One more thread bears surfacing in this conversation: the role of copyright and fair use in fostering competition. The Supreme Court emphasized this role in its most recent fair use opinion, Google v. Oracle. In that case, Justice Breyer describes fair use’s role in the context of software copyrights:

fair use can play an important role in determining the lawful scope of a computer program copyright… It can distinguish between expressive and functional features of computer code where those features are mixed. It can focus on the legitimate need to provide incentives to produce copyrighted material while examining the extent to which yet further protection creates unrelated or illegitimate harms in other markets or to the development of other products. In a word, it can carry out its basic purpose of providing a context-based check that can help to keep a copyright monopoly within its lawful bounds.

As examples of how fair use has played this role in the past, Justice Breyer cited cases like Sony v. Connectix and Sega v. Accolade, cases where software engineers made copies of protected works in a process that resulted in the development of new, non-infringing software. Yes, these cases say, there is literal copying involved in this process, but the end result (and the only thing offered to the public in competition with the works that were copied “behind the curtain”) is something new and non-infringing – exactly the kind of creativity copyright is meant to promote, not discourage. So, fair use acts as a context-based check on the otherwise overly broad literal scope of copyright’s exclusive rights, shielding these intermediate, back-room, pro-competitive copies from liability and enabling the creation of valuable new works.

Similarly, in the Oracle case, Justice Breyer held that Google had created a valuable new work in the Android mobile operating system, and that the use of elements of Oracle’s Java language to enable programmers to interact more easily with Android was fair.

Pulling it all together: computer-generated works and copyright literalism

If you’re with me so far, you’ve seen

1. That copyright is for the public, and it grants rights to authors only as a means to the end of public enrichment.
2. That copyright’s exclusive rights are so broad on their face that they increasingly (and, in a digital age, inevitably) apply to uses that don’t harm, and in many cases affirmatively advance, copyright’s public purpose.
3. That fair use is one of the most important safeguards against this outcome, and in particular,
4. That fair use applies where literal copying happens “behind the scenes” as part of a process whose result advances the purpose of copyright, as in reverse engineering, in the search engine cases, and in Oracle.

It shouldn’t be hard to see, then, why copyright law should be positively disposed toward the application of fair use to enable the creation of tools that generate new, non-infringing works. Yes, there is literal copying involved in “training” these computer tools, but the end result is increased production of new creative works, the very public benefit that copyright is meant to encourage. These back-end uses should not be cognizable as violations of copyright. The alternative is to turn copyright on its head, discouraging the creation of new works in order to vindicate a private monopoly.

Mandatory chatGPT post-script: I asked the robot “What is the purpose of copyright?” Here is its answer – not bad!

The purpose of copyright is to provide legal protection and exclusive rights to creators and owners of original works of authorship, such as literary, artistic, musical, and other creative works. The aim of copyright is to encourage and reward creativity and innovation by giving creators the exclusive right to control the use and distribution of their works for a limited time.

Copyright gives creators the ability to prevent others from reproducing, distributing, performing, displaying, or creating derivative works based on their original works without their permission. By providing this protection, copyright allows creators to benefit financially from their works and to control the use and distribution of their works.

In addition to providing financial benefits for creators, copyright also helps to promote cultural and educational development by ensuring that creators have the incentive and means to produce new works. Copyright law also helps to ensure that creators receive recognition for their works and that their works are not misrepresented or misused in ways that could harm their reputation or integrity.

Brandon Butler is Director of Information Policy at University of Virginia. There he focuses on intellectual property, copyright, licensing and user privacy as they are related to the acquisition, dissemination and preservation of various forms of information and cultural artifacts, and as they are related to scholarly communication. He serves as an expert consultant to UVA librarians, to groups and individuals within the University, and to national and international efforts focused on relevant questions. He was previously a Practitioner-in-Residence at the Glushko-Samuelson Intellectual Property Law Clinic at American University’s Washington College of Law. Before that, Brandon was Director of Public Policy Initiatives at ARL from 2009 to 2013.

Fair Use Week 2023 (10th Anniversary): Day Two With Guest Expert Prof. Pia Hunter

The 10th Anniversary of Fair Use Week continues with a guest post by fair use expert and Fair Use Week Founders Award Winner Prof. Pia Hunter. Join her in a review of the whirlwind years of library pandemic closures, and how fair use, and the programs that explicitly utilized fair use, were critical in maintaining access to educational materials. -Kyle K. Courtney

Libraries, Instruction, and the COVID-19 Lockdown

by Pia Hunter

The onset of the COVID-19 lockdown in March 2020 stalled the services of many industries that operated in a strictly face-to-face environment. Early media reports suggested that the lockdown would be short term, but as weeks stretched into months, many businesses remained shuttered, and schools that customarily held face-to-face classroom instruction made an emergency switch to online learning. Libraries, which some have perceived as mere depositories for print materials, emerged as digital leaders and one of the most prepared industries to serve communities in a digital environment. When some educators struggled to adopt online learning models and provide students and teachers with access to books and media, libraries quickly filled the demand with digital content that users could access remotely.

Although libraries’ swift response to the COVID-19 lockdown appeared sudden to some, libraries have been modernizing their services to meet a range of users’ digital needs for decades, and the fair use doctrine has long supported that transformation. Physical access to library materials is not always possible, and in recent years, more public libraries have embraced the use of e-books and streaming media. Academic libraries have a teaching, learning, and research mission to support the scholarly activities of students and faculty. These libraries have created services that employ fair use to support online learning programs that were established well before the COVID-19 pandemic.

One question that has emerged frequently these past three years is how. How have libraries provided access to copyrighted materials for remote users? How were students able to access copyrighted materials at the height of the pandemic? When we think of a classroom, most of us consider a traditional space with walls and students together in one room. The logistics for students to access library materials from their homes seemed insurmountable to some because the copyright laws surrounding how students and teachers can gain remote access are complex. Section 110(1) sets a generous standard for how content may be used, but it only applies to face-to-face instruction. Section 110(2), the TEACH Act, allows the digital transmission of copyrighted materials, but only under limited circumstances, and the requirements are difficult for many educational institutions to achieve. With these competing sections of the Copyright Act, what was the solution?

The solution was fair use, Section 107, which has long been the hero of the Copyright Act by allowing libraries to advance their services and provide remote access to users under certain conditions. During the pandemic, the HathiTrust (a digital repository from college and university libraries) created an Emergency Temporary Access Service to help its member institutions provide access to their faculty and students. This initiative was successful because fair use is flexible enough to cover different types of use. Some public domain titles were available in their entirety, and in other instances, users could view brief excerpts of copyrighted text online for limited periods of time.

The HathiTrust is a consortium of several academic libraries and could allow its member institutions to use the HathiTrust Collection, but it could not share access with the public. Therefore, K-12 students still needed access to library materials, and many public libraries could not provide digital access to print titles. This was especially true for school libraries which have mostly physical collections.

Internet Archive to the Rescue

The Internet Archive, a 501(c)(3) non-profit, “is building a digital library of Internet sites and other cultural artifacts in digital form.” Since 1996, the Internet Archive (IA) has been archiving websites, digitizing titles, and preserving our cultural memory. And on March 26, 2020, an NPR headline proclaimed, “’National Emergency Library’ Lends A Hand — And Lots Of Books! — During Pandemic.” Two days prior, the IA launched its National Emergency Library, which temporarily offered unlimited simultaneous access to its collection of 1.4 million digitized books. The goal was to provide reading and research materials to users whose K–12, public, and academic libraries had been suddenly closed due to the COVID-19 pandemic.

Many of the works were under copyright protection, and a collection of authors and publishers argued that the National Emergency Library was a copyright infringement because it allowed access to millions of titles, some of which were popular fiction materials and not scholarly in nature. This assertion is flawed because scholarship is inclusive, and the study of culture and society encompasses a myriad of content. Educators’ selection of materials for instruction is, and should be, unrestricted, and any external assertion of what material has scholarly relevance is overreach.

The IA typically operates under a standard virtual lending model, i.e., one user can borrow a single electronic copy of a text at a time, and once it is returned, another user can borrow the title. However, when many libraries closed due to the pandemic, the IA implemented the “National Emergency Library” to ensure that students, teachers, and researchers could continue their work. This is not a dismissal of the publishers’ concerns, but libraries cannot be held to a 20th century standard of copyright law while trying to provide 21st century access to their communities. The publishers fail to consider that the IA’s National Emergency Library was created to support Emergency Remote Teaching under exigent circumstances for many educators who had little, if any, remote teaching experience.

Although the IA had announced their intention to end the emergency access by June 30, 2020, they ended the program two weeks early when publishers Hachette, Penguin Random House, Wiley, and HarperCollins announced that they would sue the IA for copyright infringement. On June 1, 2020, the publishers and several authors filed a complaint in the United States District Court for the Southern District of New York. But this case, Hachette v. Internet Archive, is not about the expanded access IA provided during the pandemic. It is a challenge to how we can use materials in a digital age and how fair use supports our right to do so. 

Many businesses suffered financial losses during the pandemic, but any argument that publishers lost millions in revenue because of the IA Emergency Library is unreasonable. Of course, authors and publishers should be compensated for their work, and they were, because libraries, including IA, already bought these books. And, in fact, libraries buy titles constantly and are the publishing industry’s best and most reliable customers. So, why can’t libraries make effective use of the titles they have already purchased? Hachette v. Internet Archive invites the question: how many times, and in how many formats, do publishers expect libraries to buy the same title?

Yes, Section 106 of the Copyright Act of 1976 provides concrete protection for authors’ ownership and control of their work. But Section 107 tells us that fair use is not only an exception, but a right to information – one that has served many users for decades and allowed education to continue through one of the most extraordinary circumstances in modern society. Imagine a world where students could not use sections of copyrighted works in their papers or practice a piece of music without seeking permission from the rights holder. How sad would virtual spaces have been if teachers and librarians were unable to read stories to children online without gaining permission from the copyright holder? Without fair use, learning opportunities and creativity will fade, and on the 10th anniversary of Fair Use Week, we are reminded of our duty to protect it.

Pia Hunter is a Teaching Associate Professor and Associate Director for Research and Instruction at University of Illinois College of Law working out of the Law Library. Prior to joining the law library faculty, she served as Visiting Assistant Professor and Copyright and Reserve Services Librarian at the University of Illinois at Chicago (UIC) where she researched and developed best practices for copyright and fair use for instruction for the UIC campus.