{"id":683546,"date":"2025-08-06T09:55:27","date_gmt":"2025-08-06T06:55:27","guid":{"rendered":"https:\/\/buradabiliyorum.com\/en\/some-people-are-defending-perplexity-after-cloudflare-named-and-shamed-it\/"},"modified":"2025-08-06T09:55:27","modified_gmt":"2025-08-06T06:55:27","slug":"some-people-are-defending-perplexity-after-cloudflare-named-and-shamed-it","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/some-people-are-defending-perplexity-after-cloudflare-named-and-shamed-it\/","title":{"rendered":"Some people are defending Perplexity after Cloudflare \u2018named and shamed\u2019 it"},"content":{"rendered":"<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\">When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a site\u2019s specific methods to block it, this wasn\u2019t a clear-cut case of an AI web crawler gone wild.<\/p>\n<p class=\"wp-block-paragraph\">Many people came to Perplexity\u2019s defense. They argued that Perplexity accessing sites in defiance of the website owner\u2019s wishes, while controversial, is acceptable. And this is a controversy that will certainly grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?<\/p>\n<p class=\"wp-block-paragraph\">Cloudflare is known for providing anti-bot crawling and other web security services to millions of websites. Essentially, Cloudflare\u2019s test case involved setting up a new website with a new domain that had never been crawled by any bot, setting up a robots.txt file that specifically blocked Perplexity\u2019s known AI crawling bots, and then asking Perplexity about the website\u2019s content.\u00a0And Perplexity answered the question. 
<\/p>\n<p class=\"wp-block-paragraph\">Cloudflare researchers found the AI search engine used \u201ca generic browser intended to impersonate Google Chrome on macOS\u201d when its web crawler itself was blocked. Cloudflare CEO Matthew Prince <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/x.com\/eastdakota\/status\/1952379571527193017\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">posted <\/a>the research on X, writing, \u201cSome supposedly \u2018reputable\u2019 AI companies act more like North Korean hackers. Time to name, shame, and hard block them.\u201d<\/p>\n<p class=\"wp-block-paragraph\">But many people disagreed with Prince\u2019s assessment that this was actual bad behavior. Those defending Perplexity on sites <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/x.com\/Chikor_Zi\/status\/1952382355059913193\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">like X<\/a> and <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/news.ycombinator.com\/item?id=44785636\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Hacker News<\/a> pointed out that what Cloudflare seemed to document was the AI accessing a specific public website when its user asked about that specific website.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">\u201cIf I as a human request a website, then I should be shown the content,\u201d one person on <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/news.ycombinator.com\/item?id=44786039\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Hacker News<\/a> wrote, adding, \u201cwhy would the LLM accessing the website on my behalf be in a different legal category as my Firefox web browser?\u201d<\/p>\n<p class=\"wp-block-paragraph\">A Perplexity spokesperson previously denied to TechCrunch that the bots were the company\u2019s and called Cloudflare\u2019s blog post a sales pitch for Cloudflare. 
Then on Tuesday, Perplexity <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.perplexity.ai\/hub\/blog\/agents-or-bots-making-sense-of-ai-on-the-open-web\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">published a blog<\/a> in its defense (and generally attacking Cloudflare), claiming the behavior was from a third-party service it uses occasionally.<\/p>\n<p class=\"wp-block-paragraph\">But the crux of Perplexity\u2019s post made a similar appeal as its online defenders did.<\/p>\n<p class=\"wp-block-paragraph\">\u201cThe difference between automated crawling and user-driven fetching isn\u2019t just technical \u2014 it\u2019s about who gets to access information on the open web,\u201d the post said. \u201cThis controversy reveals that Cloudflare\u2019s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Perplexity\u2019s accusations aren\u2019t exactly fair, either. 
One argument that Prince and Cloudflare used for calling out Perplexity\u2019s methods was that OpenAI doesn\u2019t behave in the same way.<\/p>\n<p class=\"wp-block-paragraph\">\u201cOpenAI is an example of a leading AI company that follows these best practices,\u201d <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/blog.cloudflare.com\/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Cloudflare wrote<\/a>. \u201cThey respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing http requests using the newly proposed open standard Web Bot Auth.\u201d<\/p>\n<p class=\"wp-block-paragraph\"><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/developers.cloudflare.com\/bots\/concepts\/bot\/verified-bots\/web-bot-auth\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Web Bot Auth<\/a> is a Cloudflare-supported proposed standard, under development at the Internet Engineering Task Force, that aims to provide a cryptographic method for identifying AI agent web requests.<\/p>\n<p class=\"wp-block-paragraph\">The debate comes as bot activity reshapes the internet. As TechCrunch has previously reported, bots seeking to scrape massive amounts of content to train AI models have become a menace, especially to smaller sites.<\/p>\n<p class=\"wp-block-paragraph\">For the first time in the internet\u2019s history, <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.imperva.com\/resources\/resource-library\/reports\/2025-bad-bot-report\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">bot activity is currently outstripping human activity online<\/a>, with automated traffic accounting for over 50%, according to Imperva\u2019s Bad Bot report released earlier this year. Most of that activity is coming from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. 
That activity includes everything from persistent scraping to unauthorized login attempts.<\/p>\n<p class=\"wp-block-paragraph\">Until LLMs arrived, it was generally accepted that websites could and should block most bot activity, given how often it was malicious, by using CAPTCHAs and other services (such as Cloudflare). Websites also had a clear incentive to work with specific good actors, such as Googlebot, guiding it on what not to index through robots.txt. Google indexed the internet, which sent traffic to sites.<\/p>\n<p class=\"wp-block-paragraph\">Now, LLMs are eating an increasing amount of that traffic. Gartner predicts <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.gartner.com\/en\/newsroom\/press-releases\/2024-02-19-gartner-predicts-search-engine-volume-will-drop-25-percent-by-2026-due-to-ai-chatbots-and-other-virtual-agents\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">that search engine volume<\/a> will drop by 25% by 2026. Right now, humans tend to click through from LLMs to websites at the point when they are most valuable to the site: when they are ready to conduct a transaction.<\/p>\n<p class=\"wp-block-paragraph\">But if humans adopt agents as the tech industry predicts they will (to arrange our travel, book our dinner reservations, and shop for us), would websites hurt their business interests by blocking them? 
The debate on X captured the dilemma perfectly:<\/p>\n<p class=\"wp-block-paragraph\">\u201cI WANT perplexity to visit any public content on my behalf when I give it a request\/task!\u201d wrote <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/x.com\/denisandrejew\/status\/1952384393147736294\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">one person<\/a> in response to Cloudflare calling out Perplexity.<\/p>\n<p>\u201cWhat if the site owners don\u2019t want it? they just want you [to] directly visit the home, see their stuff\u201d <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/x.com\/khanhicetea\/status\/1952387322332471658\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">argued another<\/a>, pointing out that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.<\/p>\n<p class=\"wp-block-paragraph\">\u201cThis is why I can\u2019t see \u2018agentic browsing\u2019 really working \u2014 much harder problem than people think. Most website owners will just block,\u201d <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/x.com\/jwblackwell\/status\/1952384112364548402\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">a third<\/a> predicted.<\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/techcrunch.com\/2025\/08\/05\/some-people-are-defending-perplexity-after-cloudflare-named-and-shamed-it\/\" target=\"_blank\" >Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a site\u2019s specific methods to block it, this wasn\u2019t a clear-cut case of an AI web crawler gone wild. Many people came to Perplexity\u2019s defense. 
They argued that Perplexity accessing sites in defiance of the website owner\u2019s wishes, while controversial,&#8230;<\/p>\n","protected":false},"author":1,"featured_media":683547,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/10\/45A2342_VGAEbHsG.jpg?resize=1200,800","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[77337,155063,146921,157422,152732,73708],"class_list":["post-683546","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-ai","tag-ai-agent","tag-ai-bots","tag-cloudflare","tag-perplexity","tag-web-scraping"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/683546","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=683546"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/683546\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/683547"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=683546"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=683546"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/tags?post=683546"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}