### Amazon Investigates Perplexity AI for Content Scraping
#### Background of the Investigation
Amazon’s cloud division, Amazon Web Services (AWS), has opened an investigation into Perplexity AI, a startup specializing in AI-powered search. The investigation centers on allegations that Perplexity AI has been scraping content from news websites without permission.
#### Evidence of Unauthorized Access
WIRED’s investigation found that Perplexity AI accessed Condé Nast’s websites from an unpublished IP address, 44.221.181.252. This address, which is linked to an Elastic Compute Cloud (EC2) instance hosted on AWS, visited Condé Nast properties hundreds of times over the past three months. Condé Nast’s engineers had blocked Perplexity’s crawler in a robots.txt file, but the startup bypassed the restriction; robots.txt is a voluntary convention that only works when a crawler chooses to honor it.
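To make the robots.txt mechanism concrete, here is a minimal sketch in Python of how a well-behaved crawler would check a site's rules before fetching a page. The rules, the `PerplexityBot` user-agent string, and the `example.com` URL are illustrative assumptions, not Condé Nast's actual configuration, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/story/some-article"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "PerplexityBot", page))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Nothing in this check is enforced by the server. A crawler that skips the check, or that identifies itself with a different user-agent string, can still request the page, which is why publishers fall back on server logs and IP addresses to detect such visits.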
#### Broader Implications
The machine associated with Perplexity AI has also been detected crawling other major news websites that prohibit bot access. Representatives from The Guardian, Forbes, and The New York Times confirmed that the same IP address appeared on their servers multiple times.
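A publisher could confirm this kind of activity by counting requests per client IP address in its web server logs. The sketch below assumes logs in the common access-log layout, where the client IP is the first space-separated field. The sample log lines are invented for illustration (only the IP address 44.221.181.252 comes from the reporting), and `count_requests_by_ip` is a hypothetical helper.

```python
from collections import Counter

# Invented sample log lines; only the first IP appears in the reporting.
SAMPLE_LOG = """\
44.221.181.252 - - [01/Jun/2024:10:00:00 +0000] "GET /story/a HTTP/1.1" 200 512
203.0.113.7 - - [01/Jun/2024:10:00:05 +0000] "GET /story/b HTTP/1.1" 200 734
44.221.181.252 - - [01/Jun/2024:10:01:00 +0000] "GET /story/c HTTP/1.1" 200 298
"""


def count_requests_by_ip(log_text: str) -> Counter:
    """Count requests per client IP, taking the IP as the first field of each line."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        ip = line.split(" ", 1)[0]
        counts[ip] += 1
    return counts


if __name__ == "__main__":
    counts = count_requests_by_ip(SAMPLE_LOG)
    print(counts["44.221.181.252"])
```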
#### AWS’s Role and Response
WIRED traced the IP address to an AWS-hosted virtual machine, prompting AWS to investigate whether using its infrastructure to scrape websites that forbid scraping violates its terms of service.
#### Perplexity AI’s Defense
Last week, Perplexity CEO Aravind Srinivas responded to WIRED’s findings, stating that the questions posed reflected a “deep and fundamental misunderstanding of how Perplexity and the Internet work.” However, he also mentioned that the company is taking steps to prevent potential copyright violations.
#### Industry Reactions
Jason Kint, CEO of Digital Content Next, commented on the situation:
“By default, AI companies should assume they have no right to take and reuse publishers’ content without permission.”
Kint added that if Perplexity is bypassing terms of service or robots.txt files, it indicates that something improper is happening.
#### Conclusion
The ongoing investigation by Amazon and the reactions from industry leaders highlight the complexities and ethical considerations surrounding AI and content scraping. The outcome of this investigation could have significant implications for how AI startups operate in the future.