Perplexity Plagiarized Our Story About How Perplexity Is a Bullshit Machine

Earlier this week, WIRED published a “. In one experiment, it generated text about a girl following a trail of mushrooms when asked to summarize the content of a website that its agent did not, according to server logs, attempt to access.

Perplexity and its CEO, Aravind Srinivas, did not substantively dispute the specifics of WIRED’s reporting. “The questions from WIRED reflect a deep and fundamental misunderstanding of how Perplexity and the Internet work,” Srinivas said in a statement. ” and ” by Poynter Institute—including, perhaps most stringently, the seven-to-10 word test, which proposes that it’s “hard to incidentally replicate seven consecutive words that appear in another author’s work.” (Kelly McBride, a Poynter SVP who has described this test as being useful in identifying plagiarism, did not reply to an email.)

“If one of my students turned in a story like this, I would take them before the academic dishonesty committee for plagiarism,” said John Schwartz, professor of practice at the University of Texas at Austin’s journalism school, after reading the original story and the summary. “I find this just too close. When I was reading the Perplexity version, I just thought, there’s an echo in here.”

Perplexity and Srinivas, the company’s CEO, did not respond to a detailed request for comment in which they were presented with the criticisms experts made of the company for this story.

Bill Grueskin, professor of professional practice at Columbia Journalism School, wrote in an email that the summary looked to be “pretty much ok” for a chatbot identified as such, but that it was hard to say because he hadn’t had time to read the original WIRED story. “Quoting a sentence verbatim without quote marks is bad, of course,” he wrote. “I’d be pretty mortified if a news org ran an AI summary like this without disclosing the source—or worse, pretending it came from a human.” (Perplexity, of course, isn’t claiming this material came from a human.)

Perhaps luckily for Perplexity and its backers, this is a literal academic debate. Plagiarism is a concept pertaining to professional ethics. It matters in contexts like journalism and academia, where being able to identify the source of information is of fundamental importance, but it has no legal significance in itself. If a rival studio released a film containing a reasonable chunk of footage from Inside Out 2, Disney would sue not for plagiarism but for copyright infringement; similarly, a letter Forbes ” This is the law that, among other things, protects search engines like Google from liability for defamation when they link to defamatory content because they are services passing on information from other content providers; as he sees it, Perplexity is similarly shielded as long as it accurately summarizes material. (Whether AI-generated material enjoys 230 protection at all is a matter of debate.)

“They’d only get in trouble if they summarized the story incorrectly and made it defamatory when it wasn’t before. That’s something that they actually would be at legal risk for, especially if they don’t credit the original source clearly enough and people can’t easily go to that source to check,” he says. “If Perplexity’s edits are what make the story defamatory, 230 doesn’t cover that, under a bunch of case law interpreting it.”

In one case WIRED observed, Perplexity’s chatbot did falsely claim, albeit while prominently linking to the original source, that WIRED had reported that a specific police officer in California had committed a crime. (“We have been very upfront that answers will not be accurate 100% of the time and may hallucinate,” Srinivas said in response to questions for the story we ran earlier this week, “but a core aspect of our mission is to continue improving on accuracy and the user experience.”)

“If you want to be formal,” says Grimmelmann, “I think this is a set of claims that would get past a motion to dismiss on a bunch of theories. Not saying it will win in the end, but if the facts bear out what Forbes and WIRED, the police officer—a bunch of possible plaintiffs—allege, they are the kinds of things that, if proven and other facts were bad for Perplexity, could lead to liability.”

Not all experts agree with Grimmelmann. Pam Samuelson, professor of law and information at UC Berkeley, writes in an email that copyright infringement is “about use of another’s expression in a way that undercuts the author’s ability to get appropriate remuneration for the value of the unauthorized use. One sentence verbatim is probably not infringement.”

Bhamati Viswanathan, a faculty fellow at New England Law, says she’s skeptical the summary passes a threshold of substantial similarity usually necessary for a successful infringement claim, though she doesn’t think that’s the end of the matter. “It certainly should not pass the sniff test,” she wrote in an email. “I would argue that it should be enough to get your case past the motion to dismiss threshold—particularly given all the signs you had of actual stuff being copied.”

In all, though, she argues that focusing on the narrow technical merits of such claims may not be the right way to think about things, as tech companies can adjust their practices to honor the letter of dated copyright laws while still grossly violating their purpose. She believes an entirely new legal framework may be necessary to correct for market distortions and promote the underlying aims of US intellectual property law, among them to allow people to financially benefit from original creative work like journalism so that they’ll be incentivized to produce it—with, in theory, benefits to society.

“There are, in my opinion, strong arguments to support the intuition that generative AI is predicated upon large scale copyright infringement,” she writes. “The opening ante question is, where do we go from there? And the greater question in the long run is, how do we ensure that creators and creative economies survive? Ironically, AI is teaching us that creativity is more valuable and in demand than ever. But even as we recognize this, we see the potential for undermining, and ultimately eviscerating, the ecosystems that enable creators to make a living from their work. That’s the conundrum we need to solve—not eventually, but now.”
