OpenAI’s public crawler documentation now lists OAI-AdsBot, a bot that visits pages submitted as ChatGPT ads to check policy compliance and help determine ad relevance.
The entry sits alongside OAI-SearchBot, GPTBot, and ChatGPT-User on OpenAI’s crawler docs page, bringing the documented bot count to four.
OpenAI states that OAI-AdsBot only visits pages submitted as ads and that the data it collects isn’t used to train its generative AI foundation models.
What The Bot Does
Per OpenAI’s docs, OAI-AdsBot may visit an ad’s landing page after the ad is submitted. The bot checks whether the page complies with OpenAI’s ad policies. It may also use content from the landing page to help determine when to show the ad to ChatGPT users.
The bot identifies itself with the user-agent string Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot.
That string puts OAI-AdsBot at version 1.0, while OAI-SearchBot and GPTBot are both at version 1.3, per OpenAI’s docs. OAI-AdsBot only visits pages submitted as ad landing pages, not the broader web.
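As a minimal sketch of how a site might recognize the bot in a request header, the Python below pulls the bot name and version out of a user-agent string. The function name and regular expression are illustrative assumptions, not anything OpenAI publishes, and a match only shows what the request claims to be.

```python
import re

# Product tokens for the four bots named in OpenAI's crawler docs,
# each followed by a dotted version number such as 1.0 or 1.3.
OPENAI_BOT_RE = re.compile(
    r"\b(OAI-AdsBot|OAI-SearchBot|GPTBot|ChatGPT-User)/(\d+(?:\.\d+)*)"
)


def identify_openai_bot(user_agent):
    """Return (bot_name, version) if the header names an OpenAI bot, else None.

    Hypothetical helper: it reads the header only and does not prove
    that the request really came from OpenAI.
    """
    match = OPENAI_BOT_RE.search(user_agent)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The OAI-AdsBot user-agent string as documented.
ADSBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
    "compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot"
)

if __name__ == "__main__":
    print(identify_openai_bot(ADSBOT_UA))
```

Matching on the product token rather than the full string should keep the check working if the version number changes.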
What The Bot Doesn’t Do
Data collected by OAI-AdsBot isn’t used to train generative AI foundation models. That keeps OAI-AdsBot out of the territory of GPTBot, which handles training data collection.
It also keeps OAI-AdsBot separate from OpenAI’s other bots. OAI-SearchBot surfaces content in ChatGPT search, ChatGPT-User fetches pages during user-initiated browsing, and OAI-AdsBot is limited to ad validation.
OAI-SearchBot and GPTBot can be managed independently via robots.txt. ChatGPT-User is user-initiated, and the company notes that robots.txt rules may not apply to it. The OAI-AdsBot entry doesn’t say how the bot treats robots.txt.
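To illustrate what managing the two crawlers independently can look like, here is a small sketch that evaluates a robots.txt policy with Python’s standard urllib.robotparser. The policy (allow search, opt out of training) and the example.com URL are hypothetical. The result shows only what the file says, not how any given bot behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow search indexing, opt out of training crawls.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("OAI-SearchBot", "GPTBot"):
        print(bot, allowed(ROBOTS_TXT, bot, "https://example.com/pricing"))
```

Because each bot has its own user-agent group, a rule for one leaves the other untouched.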
No Public IP List Yet
OpenAI publishes IP range files for its three earlier bots at openai.com/searchbot.json, openai.com/gptbot.json, and openai.com/chatgpt-user.json. At the time of publication, no equivalent openai.com/adsbot.json file appears in OpenAI’s docs.
Without a published list, verifying a real OAI-AdsBot visit becomes harder. User-agent strings can be spoofed, and the IP lists give you a way to cross-check visits from the other three OpenAI bots. For OAI-AdsBot, that cross-check isn’t available.
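For the three bots that do have published ranges, the cross-check might look like the sketch below. It assumes the range file parses to a dictionary with a "prefixes" list of "ipv4Prefix" or "ipv6Prefix" entries; that layout is an assumption to confirm against the real files. The sample data uses a documentation-only address block, not real OpenAI addresses.

```python
import ipaddress


def ip_in_published_ranges(ip, range_doc):
    """Check whether an address falls inside any CIDR block in a range document.

    Assumed layout (verify against the real file before relying on it):
    {"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "..."}]}
    """
    address = ipaddress.ip_address(ip)
    for entry in range_doc.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr and address in ipaddress.ip_network(cidr, strict=False):
            return True
    return False


# Placeholder data: 192.0.2.0/24 is reserved for documentation examples.
SAMPLE_RANGES = {"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}]}

if __name__ == "__main__":
    print(ip_in_published_ranges("192.0.2.44", SAMPLE_RANGES))
    print(ip_in_published_ranges("198.51.100.7", SAMPLE_RANGES))
```

A request that claims an OpenAI user-agent but arrives from outside the published ranges is a candidate for spoofing.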
Why This Matters
OAI-AdsBot has two audiences. Advertisers buying placements on ChatGPT need the bot to reach their landing pages; otherwise, the ad may not validate. Anyone monitoring AI bot activity in server logs gets a new user-agent to watch, one tied to paid inventory rather than search or training.
Aggressive bot protection via Cloudflare, Akamai, or similar tools may block OAI-AdsBot before it reaches the page. That could create validation friction for advertisers who use strict bot-mitigation tools.
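One rough way to spot that friction is to tally the response codes served to requests that identify as OAI-AdsBot. The sketch below assumes the widely used "combined" access log layout, and the sample lines are invented for illustration. Requests blocked at a CDN edge may never reach origin logs, so the CDN’s own logs may be the better place to look.

```python
import re
from collections import Counter

# Assumes the "combined" access log layout:
# ... "METHOD path PROTO" status size "referer" "user-agent"
LOG_RE = re.compile(r'"[A-Z]+ \S+ [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')


def adsbot_status_counts(log_lines):
    """Count HTTP status codes for requests whose user-agent names OAI-AdsBot."""
    counts = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if match and "OAI-AdsBot" in match.group(2):
            counts[match.group(1)] += 1
    return counts


ADSBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
    "compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot"
)

# Invented sample lines; addresses come from documentation-only ranges.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2025:10:00:00 +0000] "GET /landing HTTP/1.1" 200 5120 "-" "%s"' % ADSBOT_UA,
    '192.0.2.10 - - [01/Jan/2025:10:05:00 +0000] "GET /offer HTTP/1.1" 403 310 "-" "%s"' % ADSBOT_UA,
    '192.0.2.20 - - [01/Jan/2025:10:06:00 +0000] "GET / HTTP/1.1" 200 812 "-" "Mozilla/5.0; compatible; GPTBot/1.3; +https://openai.com/gptbot"',
]

if __name__ == "__main__":
    print(adsbot_status_counts(SAMPLE_LOG))
```

A cluster of 403 or similar codes on ad landing pages would suggest that a bot-mitigation rule is turning the crawler away.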
Looking Ahead
ChatGPT’s ad program has moved fast since OpenAI began testing ads on Feb. 9. As access opens up to more advertisers, OAI-AdsBot traffic will start showing up in more server logs. Watch for an eventual IP range file at openai.com/adsbot.json if OpenAI chooses to publish one. For now, the user-agent string is what you have to work with.
Featured Image: Blossom Stock Studio/Shutterstock