Anthropic Trained Its AI Model Using Millions of Copyrighted Books. A Judge Just Ruled in Favor of the Company, but There’s One Big Caveat

  • This legal victory could set an important precedent for other AI companies facing similar lawsuits.

  • The judge upheld the fair use argument as valid.

  • However, Anthropic still faces a new trial regarding the creation of a library containing illegally downloaded books for which it didn’t pay.

Javier Pastor

Senior Writer
  • Adapted by: Alba Mora

Anthropic has achieved a significant legal victory in the ongoing battle over copyright and intellectual property rights within the AI industry. This positive ruling for Anthropic could set an important precedent for other cases involving AI companies that have faced lawsuits for using copyrighted works to train their models. However, the win isn’t absolute.

Anthropic wins. In the lawsuit filed by three authors against Anthropic, the company was accused of downloading millions of copyrighted books without permission. The plaintiffs also pointed out that the company had purchased some of these books in order to scan and digitize them to train its AI models.

Senior District Judge William Alsup ruled that “the training use was a fair use.” Companies developing AI models have often relied on the concept of fair use to justify their training practices, even when those practices involve copyrighted materials.

Fair use. This legal principle allows limited use of protected material without permission from the copyright owner. In copyright law, judges determine whether an activity qualifies as fair use by examining whether that use is “transformative,” that is, whether something new is created from the original works. According to Alsup, “The technology at issue was among the most transformative many of us will see in our lifetimes.”

Important caveats. The judge acknowledged that the training process could be considered fair use. However, he also said that authors retain the right to sue Anthropic for copyright infringement.

The company argued that accessing “all these copies [was] at least reasonably necessary for training LLMs.” Alsup emphasized that, despite the purchases made, Anthropic built a substantial library of works without compensating their authors.

“Anthropic downloaded over seven million pirated copies of books, paid nothing, and kept these pirated copies in its library even after deciding it would not use them to train its AI (at all or ever again). Authors argue Anthropic should have paid for these pirated library copies. This order agrees.”

The Thomson Reuters precedent. In early 2025, Thomson Reuters won a lawsuit against Ross Intelligence, an AI startup founded in 2020. The lawsuit claimed that Ross Intelligence had reproduced material from Thomson Reuters’ legal research division, Westlaw. The judge rejected the defense’s arguments, ruling that fair use didn’t apply in that case.

In contrast, the recent ruling in Anthropic’s favor supports the use of copyrighted material, provided that companies purchase the works they use to train their models. Notably, Anthropic had already achieved a small legal victory in a previous case against Universal Music.

Anthropic’s downloading practices. The trial revealed that Anthropic co-founder Ben Mann downloaded large datasets, such as Books3 and LibGen (Library Genesis), during the winter of 2021. These datasets consist of massive collections of books, many of which are protected by copyright.

Meta’s situation. Many companies developing AI models have trained their systems on a wide range of data sources, including copyrighted works. Meta downloaded 81.7 TB of copyrighted books via BitTorrent to train its AI models. As such, the company could face legal challenges similar to Anthropic’s.

Potential financial penalties. According to Wired, the minimum fine for copyright infringement in this context is $750 per book. Alsup indicated that Anthropic’s illegally downloaded library consists of at least seven million books, exposing the company to potentially enormous fines. No date has been set for the new trial yet.

The never-ending battle between AI and copyright. This is just one chapter in a long-running saga surrounding AI and copyright issues. Companies such as Google, OpenAI, and Perplexity have also been aggressive in training their models, often using public and private data from across the Internet.

Copyright infringement lawsuits are accumulating. Cases like Anthropic’s could set a troubling precedent for all companies if they don’t purchase the books they use to train their models.

Image | Iñaki del Olmo

Related | Everyone Wants to Use ChatGPT Again. The Problem: The Reason Why They Want to Use It Infringes on Copyright
