In Bartz v. Anthropic, Judge Alsup found the use of copyrighted works to train a large language model to be “justified as a fair use,” describing the technology as “among the most transformative many of us will see in our lifetimes.”
Generative AI may certainly turn out to be remarkably transformative in the ordinary sense of the word. But, contrary to the legal conclusion reached by Judge Alsup, I argue that the use of copyrighted works for training doesn’t constitute “transformative” use in the context of copyright law.
In this context, a use is not transformative merely because it produces something new or technologically sophisticated. Rather, as established through decades of judicial interpretation, a transformative use is one that relates back to the original work by creating new information and insights about that work. Generative AI does not do this. It instead reappropriates the expressive content of the work to enable the generation of synthetic expressive content completely divorced from the original work.
This is not a criticism of the technology itself. Generative AI may well be impressive, even revolutionary. But in its recent recalibration of transformativeness, the Supreme Court was clear that not every innovative or creative secondary use should be considered transformative.
Justification
The 2023 Warhol Foundation v. Goldsmith decision marked the first time in nearly 30 years that the Supreme Court directly addressed transformativeness, and the period in between saw a steady expansion of the doctrine, which some worried went too far.
Warhol reined in the doctrine, trimmed its excesses, and refocused it. The Court warned against looking merely at whether something new is created: “Most copying has some further purpose… Many secondary works add something new.” The key, Warhol reminds us, is to look at the justification for the use. This involves two senses of the word. First, a use is justified if it is the type of use that furthers the purpose of copyright without prejudicing the original author. Second, a use is justified if the user needs the original work to serve this purpose.
Justification in the first sense looks toward the purpose of copyright itself, which is “the conviction that encouragement of individual effort by personal gain is the best way to advance public welfare through the talents of authors and inventors in ‘Science and useful Arts.’” Exclusive rights allow creators to control and monetize their works. Markets built around those rights reflect diverse preferences—economic, aesthetic, educational, scientific—and empower authors and publishers to pursue their own goals.
Fair use is an exception to these exclusive rights, but it is an exception intended to serve the same purpose as copyright overall. Courts must therefore exercise care when departing from the general rule of exclusive rights, since “underprotection of copyright disserves the goals of copyright just as much as overprotection.”
Relating Back
The hallmark of justification, and thus transformativeness, is that the new use “relates back” to the original. The public benefit is two-fold for such uses: the public benefits from the new use itself, and it benefits from the insights about the original work that are created.
This relation-back principle is reflected in the statute, which identifies illustrative purposes (“criticism, comment, news reporting, teaching…, scholarship, or research”), all of which involve new uses that provide new information or insights about the original work or treat it as a referential object in the context of further discussion.
The principle is also at the core of Campbell’s parody/satire distinction, which drew the line between a work that “at least in part, comments on” the original work being parodied (parody) and a work where “the commentary has no critical bearing on the substance or style of the original composition” and the original work is used merely “to get attention or to avoid the drudgery in working up something fresh” (satire).
And the legislative history recalls this relating-back characteristic. In the Copyright Office’s preliminary study on fair use that kicked off the drafting effort for the 1976 Copyright Act, Alan Latman wrote,
The modus operandi of certain fields requires that the rights of each author yield to a step-by-step progress. This consideration is often linked to the constitutional support for fair use as an indispensable tool in the promotion of “science.” Practical necessity and constitutional desirability are strongest in the area of scholarly works.
Similarly, in reviews of a work, a certain amount of reconstruction is often necessary; and in burlesque, the user must be permitted to accomplish the “recalling or conjuring up of the original.”
As a corollary, this relating-back purpose would be stymied without the ability to use the original work; because the new use is tied to the original work, there are no substitutes for the original work. Thus, we see courts reject transformativeness when the original work is used as a mere commodity or is fungible to the ultimate purpose.
The principle holds true even when looking at technological uses. First, like criticism and commentary, technological tools that create new information about existing works, such as book and image search tools, have been found transformative. Second, as with news reporting and biographical uses, technological tools that analyze specific works, such as a plagiarism detector, treat the works as discrete referential objects to provide increased understanding of those specific works. In each of these cases, the technological use relates back to the original work that was copied.
Training Does Not Relate Back
That is not true in the case of training a generative AI model.
Training a large language model starts by breaking huge amounts of text into smaller parts, called tokens, and converting those tokens into numbers for processing. These numbers are arranged in a mathematical space that captures how often tokens appear together and their contextual relationships. In other words, the unique choices each writer makes, the specific words they select and how they arrange them, are compiled into a single mathematical representation that encodes patterns of usage and co-occurrence.
The model then uses this representation to create a function optimized to predict the most likely next token in a sequence. It does not understand or think like a person but generates what appears as coherent and relevant text by operating on statistical patterns it has derived from the training materials.
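The mechanics described above can be illustrated with a deliberately tiny sketch. It is not how a production large language model is trained: it swaps the neural network for a simple bigram counter and the subword tokenizer for whitespace splitting. The two-sentence corpus and the function names (`tokenize`, `build_vocab`, `train_bigram`, `predict_next`) are hypothetical and chosen only for illustration. What it does show is the basic shape of the process: text is split into tokens, tokens are mapped to numbers, co-occurrence patterns are compiled into one table, and that table is used to predict the most likely next token.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (real systems use subword tokenizers)."""
    return text.lower().split()


def build_vocab(corpus):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for text in corpus:
        for tok in tokenize(text):
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def train_bigram(corpus, vocab):
    """Count, for each token ID, how often each other token ID follows it."""
    counts = defaultdict(Counter)
    for text in corpus:
        ids = [vocab[tok] for tok in tokenize(text)]
        for prev_id, next_id in zip(ids, ids[1:]):
            counts[prev_id][next_id] += 1
    return counts


def predict_next(word, vocab, counts):
    """Return the most frequent follower of `word`, or None if there is none."""
    word_id = vocab.get(word.lower())
    followers = counts.get(word_id)
    if word_id is None or not followers:
        return None
    id_to_token = {i: tok for tok, i in vocab.items()}
    best_id, _ = followers.most_common(1)[0]
    return id_to_token[best_id]


# Hypothetical two-sentence "training corpus", for illustration only.
corpus = ["the cat sat on the mat", "the cat ate the fish"]
vocab = build_vocab(corpus)
counts = train_bigram(corpus, vocab)

print(predict_next("the", vocab, counts))  # cat
print(predict_next("on", vocab, counts))   # the
```

In this toy, "the" is followed by "cat" twice and by "mat" and "fish" once each, so "cat" is the prediction. The count table pools both sentences and keeps no record of which sentence contributed which count. A real model replaces the table with billions of learned parameters, but the training objective, predicting the next token from patterns in the training text, has the same basic form.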
This is not transformative in the legal sense.
Crucially, the training process does not comment on or critique the training materials. It does not relate back to them in any meaningful or referential way. What is taken is a part of, not information about, the original authors’ creative expression. The original expressive choices are extracted and embedded into the model’s internal representation, not to analyze them, but to reproduce similarly structured expression.
Indeed, in many cases, the process does not even identify which training materials were most influential in producing a given output. The process is opaque by design. In that sense, the AI model is the opposite of a referential or critical use. It is a black box whose output conceals rather than illuminates its sources.
Because there is no relating back, the use of a particular work is also not justified in the narrow sense of the term. The individual works used in training are entirely fungible, and developers have a universe of available substitutes that could achieve the same purpose. The public benefit is served best in such circumstances through the ordinary application of exclusive rights in the market.
Conclusion
Generative AI models may “transform” input data into dazzling new outputs, but this is not the transformation that fair use favors. Under fair use, a transformative use is one that comments on, critiques, or provides new insights about an original work. It is not simply a technological process that digests and reuses expressive content in a different form.
Generative AI training reappropriates the expressive elements of copyrighted works to enable the generation of new content. It does not point back to, analyze, or even acknowledge the originals. That makes it more like satire than parody—creative, yes, but not legally justified without permission.
The choice to allow one’s work to be ingested into a generative system—to become raw material for future outputs—belongs to the copyright owner. Courts should be careful not to let technological innovation obscure that basic principle of copyright law and undermine its ability to benefit the public.