Cloudflare Criticizes Perplexity for Stealth Web Crawling Practices


Cloudflare and Perplexity are locked in a dispute over the use of stealth crawling techniques to access restricted websites.

The disagreement centers on how Perplexity crawls websites whose owners have explicitly prohibited automated access.

Background
Last month, Cloudflare announced that it would restrict AI scrapers from accessing sites that use its services without authorization. However, some crawlers have reportedly used covert methods to evade these restrictions.

Cloudflare’s Stance
In a recent blog post, Cloudflare said it had observed Perplexity using undeclared crawlers to bypass ‘no crawl’ directives set by website owners. The post explains how Cloudflare detected the behavior and why it then removed Perplexity’s status as a trusted bot.

Specific Findings
Cloudflare accuses Perplexity of repeatedly changing its user agent and IP addresses to conceal its web scraping, contrary to the explicit ‘no crawl’ requests of several sites.

“Perplexity is repeatedly modifying their user agent and changing IPs and ASNs to hide their crawling activity…”
(Translation: Perplexity keeps changing the characteristics that identify its crawler in order to get around scraping restrictions.)

Cloudflare cited reports from customers who said Perplexity was still reaching their content after they had put no-crawl rules in place. To test this, Cloudflare created new domains that had not been indexed or publicly advertised and applied strict ‘robots.txt’ rules meant to block all bot access.
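For readers unfamiliar with how ‘robots.txt’ works, the following is a minimal sketch of the check a well-behaved crawler performs before fetching a page. It uses Python's standard `urllib.robotparser` module. The rule file, the bot names, and the `example.test` URLs are hypothetical and are not taken from Cloudflare's test; the `is_allowed` helper is likewise an illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay away from every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that blocks only one crawler, identified by name.
BLOCK_ONE = """\
User-agent: BadBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A blanket rule applies to any crawler that chooses to honor it.
    print(is_allowed(BLOCK_ALL, "ExampleBot", "https://example.test/page"))
    # A name-based rule only matches the user agent the crawler declares.
    print(is_allowed(BLOCK_ONE, "BadBot", "https://example.test/page"))
    print(is_allowed(BLOCK_ONE, "GoodBot", "https://example.test/page"))
```

The file is purely advisory: nothing in it stops a crawler that ignores it, and rules that target a crawler by name match only when the crawler declares that name as its user agent.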

Perplexity’s Response
In response, Perplexity rejected Cloudflare’s claims, suggesting that they stem either from misattributed traffic or from Cloudflare’s desire to generate a publicity moment at the expense of its own customers.

“…the bluster around this issue reveals that Cloudflare’s leadership is either dangerously misinformed on the basics of AI, or simply more flair than cloud.”
(Translation: Perplexity argues that Cloudflare’s leadership either misunderstands how AI works or is more show than substance.)

The dispute is unfolding amid a broader debate over how AI services should interact with web protocols, and it raises questions about the ethics and legality of web scraping as internet governance evolves.
