Cloudflare will block AI crawlers from accessing web pages by default, while giving site owners more granular options to control which types of AI bots can crawl their pages. In addition, Cloudflare has launched a new initiative, called Pay Per Crawl, to have AI services pay for access to these pages.
Cloudflare is used by about 20% of the internet, which can pose a serious problem for AI services that train on the open web: that 20% of content could simply vanish from these AI services. Cloudflare is a significant content delivery network that also offers cybersecurity, DDoS mitigation, wide area network services, reverse proxies and more.
Blocking AI bots. Cloudflare’s announcement that it will block AI bots and crawlers by default is a big deal. Any new site that signs up for Cloudflare will automatically be set to block AI bots from accessing its content. “This will fundamentally change how AI companies access web content going forward,” Cloudflare wrote.
Granular blocking controls. Cloudflare said it has partnered with AI companies to verify the identity and purpose of AI crawlers, specifically whether the bots are crawling for training, content generation, or search purposes.
This allows site owners and content creators to exert more granular control over which bots they allow and which they disallow.
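To make the idea of purpose-based control concrete, here is a minimal sketch of how a site owner's allow/disallow choices might be modeled. It is not Cloudflare's implementation: the `Purpose` enum, the `POLICY` table, and the `is_allowed` function are hypothetical names invented for illustration, and the only details taken from the announcement are the three crawl purposes and the block-by-default stance.

```python
from enum import Enum


class Purpose(Enum):
    """The three crawl purposes named in the announcement."""
    TRAINING = "training"
    CONTENT_GENERATION = "content generation"
    SEARCH = "search"


# Hypothetical site policy: block by default, allow only search crawling.
POLICY = {
    Purpose.TRAINING: False,
    Purpose.CONTENT_GENERATION: False,
    Purpose.SEARCH: True,
}


def is_allowed(verified: bool, purpose: Purpose, policy=POLICY) -> bool:
    """Allow a crawler only if its identity is verified and its purpose is permitted."""
    if not verified:
        # An unverified bot is treated as blocked, matching the default-block stance.
        return False
    # Any purpose missing from the policy falls back to "blocked".
    return policy.get(purpose, False)
```

Under this assumed policy, a verified search crawler gets through, while a verified training crawler or any unverified bot does not.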
Pay Per Crawl. Cloudflare also announced a new compensation initiative, called Pay Per Crawl, that aims to work out a method for AI companies to pay to crawl your content. In the future, AI companies may also be able to preview content, see how recently it was updated so they can gather the most relevant content for their particular needs, and even access it in a machine-optimized format, the company told us.
This initiative will give content creators and site owners a new revenue stream, and AI companies a simple and efficient way to find and access the content they need.
Pricing will be determined by both publishers, who can set rates, and AI companies, who can choose whether to access webpages at those rates, the company said.
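The two-sided pricing described above can be sketched as a simple comparison: the publisher sets a rate per page, and the AI company decides whether that rate is acceptable. The code below is only an illustration of that decision, not the actual Pay Per Crawl mechanism; the function name `plan_crawl`, the per-page rate table, and the idea of a single maximum price per crawler are all assumptions.

```python
from decimal import Decimal


def plan_crawl(rates: dict[str, Decimal], max_price: Decimal) -> tuple[list[str], Decimal]:
    """Pick the pages a crawler would pay for, given publisher-set rates.

    rates:     hypothetical publisher rates, keyed by page path
    max_price: the most this (hypothetical) crawler will pay per page
    Returns the accepted paths and the total cost of crawling them.
    """
    # The crawler accepts a page only when the publisher's rate is within its limit.
    accepted = [path for path, rate in rates.items() if rate <= max_price]
    # Decimal avoids floating-point rounding when summing small currency amounts.
    total = sum((rates[path] for path in accepted), Decimal("0"))
    return accepted, total
```

For example, with assumed rates of 0.01 for `/news` and 0.05 for `/research`, a crawler willing to pay at most 0.02 per page would accept only `/news`, for a total cost of 0.01.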
To sign up for this service and to learn more, click here.
More details. We recently covered how Cloudflare CEO Matthew Prince said, “AI is going to fundamentally change the business model of the web. The business model of the web for the last 15 years has been search… search drives everything that happens online.” We, at Search Engine Land, are part of this initiative.
Here are some of the publishers who are already adopting this today: ADWEEK, Atlas Obscura, BuzzFeed, Fortune, Stack Overflow, News/Media Alliance, The Atlantic, Battelle Media, Evolve Media, Hyperscience, IAB Tech Lab, O’Reilly Media, Quora, Raptive, Sovrn, Inc., StockTwits, Third Door Media, TIME, Webflow.
“If the Internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone—creators, consumers, tomorrow’s AI founders, and the future of the web itself,” said Matthew Prince, co-founder & CEO, Cloudflare. “Original content is what makes the Internet one of the greatest inventions in the last century, and we have to come together to protect it. AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone.”
“As the largest publisher in the country, comprised of USA TODAY and over 200 local publications throughout the USA TODAY Network, blocking unauthorized scraping and the use of our original content without fair compensation is critically important,” said Renn Turiano, Chief Consumer and Product Officer, Gannett Media. “As our industry faces these challenges, we are optimistic the Cloudflare technology will help combat the theft of valuable IP.”
“We applaud Cloudflare for advocating for a sustainable digital ecosystem that benefits all stakeholders — the consumers who rely on credible information, the publishers who invest in its creation, and the advertisers who support its dissemination,” said Vivek Shah, CEO, Ziff Davis.
Why we care. Blocking AI crawlers from using your content without authorization has not been easy. Many services don’t fully respect robots.txt rules, others have created alternative crawl-control methods that content management systems have not fully adopted, and some (like Google) lump features like AI Overviews and AI Mode in with search.
This should not only give publishers and site owners better control over AI crawlers but also put pressure on these AI companies to find better ways to compensate content creators for using their content going forward.
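For readers unfamiliar with the robots.txt rules mentioned above, the sketch below shows how a well-behaved crawler would check them, using Python's standard `urllib.robotparser` module. The robots.txt content and the `example.com` URL are made up for illustration; `GPTBot` is the user-agent token OpenAI publishes for its crawler, and `ExampleSearchBot` is a hypothetical name. Note that the check is purely advisory: nothing in it stops a crawler that never asks.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: disallow one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Report whether the rules permit this user agent to fetch the URL.

    This only reads the rules; honoring the answer is up to the crawler.
    """
    return parser.can_fetch(user_agent, url)
```

With these rules, `may_fetch("GPTBot", "https://example.com/article")` is `False`, while the same call for `ExampleSearchBot` is `True`. Because compliance is voluntary, enforcement at the network level, as Cloudflare is proposing, works differently from a robots.txt file.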
Search Engine Land is owned by Semrush. We remain committed to providing high-quality coverage of marketing topics. Unless otherwise noted, this page’s content was written by either an employee or a paid contractor of Semrush Inc.