YouTube Videos Used to Train AI Without Creators' Permission — And Now You Can See If Yours Was One

Journalist Alex Reisner at The Atlantic has found millions of YouTube videos that were used to train artificial intelligence systems without their creators' permission. He published this discovery in September 2025 as part of The Atlantic's "AI Watchdog" project, which tracks where AI companies get their training data.
This is not Reisner's first time investigating this issue. In 2023, he uncovered that Meta had used more than 191,000 books without permission to train its AI systems. Each time, he follows the same method: find the data, identify who owns it, and make the findings publicly searchable.
That searchable database matters. The Atlantic has created a tool where you can type in your name or your work and find out if it was used to train these AI systems. For individual creators and authors, this is significant — it lets you check if your work helped build a commercial product that you never agreed to and never received payment for.
Why YouTube is such a target for AI training
YouTube hosts an enormous library of videos that come with helpful labels and descriptions. Each video is timestamped and organized by topic, which makes the platform an attractive source for companies building AI systems that generate videos or understand visual content. Companies can access this material at scale without much effort.
What Reisner's work adds is proof. Instead of guessing that YouTube videos were used, his investigation documents exactly which videos were scraped, how many, and which AI products benefited from them.
The legal question
For creators, the core issue is straightforward: their work may have created value for a company without them receiving anything in return.
For AI companies, the legal situation is murkier. U.S. copyright law has a concept called "fair use" that sometimes allows people to use copyrighted material without permission for certain purposes, like criticism or education. Courts have not yet decided whether scraping millions of videos for AI training counts as fair use. YouTube's own rules explicitly forbid scraping videos for machine learning, but it is unclear whether that rule is legally binding on companies that use the videos after they have been scraped.
The comparison to Reisner's 2023 books investigation is helpful but not perfect. The books were taken from an illegal library. With video, the situation is more complex. Videos have different types of rights attached to them, such as who can copy them and who can use the music in them, and the files are much larger. A video AI model also works differently from a text AI model, which may matter legally, but courts have not yet decided how much.
What changes next
Until recently, creators had no way of knowing if their work was part of an AI training dataset. Finding out required expensive legal investigation. The searchable database The Atlantic has built changes that. Now anyone can check for free.
This shift affects how lawsuits might play out. Lawyers filing cases against AI companies need named plaintiffs who can point to their specific work being used. A public database makes it much easier to identify those people and build those cases. The volume of legal action around AI training data could rise considerably as a result.
The bigger picture is becoming clearer. For the first time, the opacity around where AI systems get their training data is being challenged. Reisner's books disclosure in 2023 opened the door for text. The YouTube investigation does the same for video. Each time something like this happens, the next investigation becomes easier to conduct and easier for courts and lawmakers to take seriously. Over time, this builds a public record that regulators and legislators can draw on when deciding what the rules should be.
AI companies have built their systems quickly on the assumption that detailed scrutiny would take years to arrive. That scrutiny is now catching up.


