
CNN Files Copyright Lawsuit Against Perplexity AI Over Article Scraping

Martin Holloway | Published 3d ago | 6 min read | Based on 4 sources

CNN filed a copyright infringement lawsuit against Perplexity AI on Thursday in the U.S. District Court for the Southern District of New York, alleging the AI search company scraped more than 17,000 of its articles and generates verbatim copies of its content without authorization.

The lawsuit marks the latest escalation in the ongoing battle between traditional media companies and AI firms over the use of copyrighted content to train and operate AI systems. CNN joins The New York Times Company, which filed its own lawsuit against Perplexity on December 5, 2025, in pursuing legal action against the AI search startup.

The Allegations

According to the complaint, CNN alleges that Perplexity's AI tools produce verbatim copies of CNN's work and provide users with content that is otherwise locked behind CNN's paywall. The network claims Perplexity is unlawfully distributing its copyrighted content through the company's AI-powered search platform.

The scope of the alleged infringement is substantial. CNN contends that Perplexity scraped more than 17,000 of its articles, representing a significant portion of the news organization's digital content library. The lawsuit specifically targets how Perplexity's AI systems allegedly reproduce CNN's journalism without proper licensing or compensation.

The paywall circumvention allegation strikes at a core business model concern for digital publishers. CNN's subscription revenue depends on restricting access to premium content, and the lawsuit suggests Perplexity's AI responses effectively bypass these access controls by reproducing paywalled material in search results.

Failed Partnership Context

The legal action emerges from the wreckage of a potential business relationship between the two companies. CNN and Perplexity had explored a content licensing deal in October 2025 that would have made CNN's journalism available through Perplexity's Comet Plus subscription service. However, negotiations failed to reach a final agreement, and CNN scrapped the arrangement in November 2025.

This timeline suggests that CNN initially sought to monetize its content through a partnership model with Perplexity rather than immediately pursuing litigation. The breakdown of these discussions appears to have prompted CNN to take a more adversarial approach to protecting its intellectual property.

The failed negotiations may prove significant in the litigation. They suggest that CNN was willing to license its content under appropriate terms, and the existence of a workable licensing market bears on the fourth fair use factor, the effect of the use on the potential market for the work. That could weaken any fair use defense Perplexity might mount based on the transformative nature of AI-generated responses.

Broader Industry Pattern

CNN's lawsuit follows a pattern of legal challenges facing AI companies that have trained their models on publicly available web content without explicit permission from content creators. The case joins a growing roster of copyright disputes involving major AI firms, including ongoing litigation against OpenAI, Anthropic, and other generative AI companies.

We have seen this pattern before, when search engines first began indexing and displaying web content in the early 2000s. Publishers initially resisted Google's web crawling and content display practices, leading to a series of legal and business negotiations that ultimately resulted in revenue-sharing models through advertising. The current AI content disputes echo those earlier tensions, but with higher stakes given AI systems' ability to generate human-like text that can substitute for original content rather than merely pointing users toward it.

The media industry's approach to AI companies has shifted from early experimentation and partnership talks to increasingly aggressive legal action, as publishers struggle with declining digital revenues and the prospect that AI systems will replace direct traffic to their websites.

Technical and Legal Implications

From a technical perspective, the lawsuit raises questions about the distinction between training data ingestion and real-time content access. While many AI companies have defended their use of publicly available content for training under the fair use doctrine, Perplexity's product retrieves web content in real time to generate up-to-date responses, which may expose it to different legal vulnerabilities.

The verbatim copying allegation is particularly significant because it suggests Perplexity's systems are not merely using CNN's content as training data but are reproducing substantial portions of articles directly in their outputs. This direct reproduction may be harder to defend under fair use principles than the more transformative uses typically associated with AI training.

The paywall circumvention claim introduces an additional dimension to the copyright issues. If proven, it could demonstrate concrete economic harm to CNN's subscription business model, strengthening the company's damages claims beyond theoretical copyright violations.

Industry Response and Precedent

The lawsuit comes as AI companies face mounting pressure to formalize content licensing agreements with publishers. Several AI firms have signed partnership deals with major media companies, including OpenAI's agreements with The Atlantic and News Corp, and Anthropic's partnership with Time magazine.

However, these voluntary partnerships have not prevented litigation. The New York Times' lawsuit against Perplexity, filed in December 2025, demonstrates that even high-profile media companies are pursuing parallel tracks of both business negotiations and legal action to protect their content rights.

The outcome of CNN's lawsuit, along with the Times case and other pending copyright litigation, will likely establish important precedents for how AI companies can legally access and use content from traditional media sources. These cases may ultimately force a broader reckoning within the AI industry about sustainable content licensing models.

For the broader ecosystem, the CNN lawsuit signals that major media companies are increasingly willing to test the boundaries of copyright protection in the AI era through litigation rather than relying solely on business negotiations. The result may accelerate the development of formal licensing frameworks while potentially constraining AI companies' access to high-quality training and response data.