All Major AI Companies Have Ignored Copyright Laws. The Shocking Thing Is That There Are Still No Consequences

The recent lawsuit by French publishing groups against Meta is yet another example of the ongoing fight against copyright infringement in AI model training.

Javier Pastor

Senior Writer

Adapted by: Karen Alfaro

Writer

French publishers have had enough. They’ve sued Meta for copyright infringement. They’re not the first, and they won’t be the last. But that’s not the real issue—the real issue is that AI companies have been using copyrighted content to train their models, and it’s business as usual.

Business as usual. It’s been more than two years since Getty Images sued Stability AI, accusing it of using Getty’s photos without permission to train Stable Diffusion, its image-generating AI. That lawsuit was the first in a long line of cases making the same claim. Yet, despite all this time, little progress has been made. It’s as if what Stability AI and other AI companies did has been pushed to the courts’ back burner.

Copy what? Suspicions about this practice have existed for years, even before ChatGPT’s public launch in November 2022. Months earlier, in June of that year, OpenAI’s DALL-E was accused of relying on copyrighted images from creators who received nothing in return.

In another case, Microsoft, OpenAI, and GitHub were sued just weeks before ChatGPT’s debut for training GitHub Copilot with code from developers who never gave permission. But in July 2024, a California judge dismissed the plaintiffs’ claims.

Few verdicts, few consequences. So far, the rulings handed down seem to favor AI companies. OpenAI, for example, won the dismissal of a lawsuit challenging its training practices. However, its victory may be short-lived: it still faces a major case from The New York Times, which argues that the newspaper has suffered demonstrable harm.

Fair use? The New York Times lawsuit against OpenAI, filed in December 2023, is one of the most important in this legal battle. Sam Altman’s company insists it’s making “fair use” of the content to train its models. But the contradiction is striking: while claiming fair use, OpenAI has also signed multimillion-dollar deals with Reddit and with news publishers to license content and avoid further lawsuits.

Meta’s case: a different level. AI companies go to extreme lengths to secure high-quality training data. But Meta’s case stands out. It was recently revealed that Meta used more than 80 terabytes of books downloaded via BitTorrent to train its AI model. Many of these books were copyrighted, sparking widespread criticism and a new lawsuit from French publishing groups.

No punishment. Despite this massive, ongoing use of other people’s intellectual property, there has been little accountability. No court has yet delivered a ruling that seriously penalizes these copyright violations. Instead, the infringement continues unchecked, largely overlooked in favor of the advantages AI models provide.

Image | Emil Widlund (Unsplash) | Meta

Related | Meta Trained Llama Using Copyrighted Books. Mark Zuckerberg Knew It and Didn’t Care
