OpenAI is a 30 billion dollar company whose value comes not just from the language model that allows it to write convincingly but also from the limitless text it was trained on. You and I wrote that text. I've kept watch on the mailbox, but I haven't yet received my stock options. As a result, I'm taking the step of asking OpenAI not to train on this web site.
You can too! All it takes is adding a file called robots.txt to the root of your web site. As the extension implies, this is just a plain text file. Add this to it:
User-agent: GPTBot
Disallow: /
Heck, while we're in here, why not go ahead and block Google and Microsoft too? I doubt I'll ever rank since I'm not doing any SEO, but I'll be proactive on the off chance they decide to serve my content as an answer directly on their results page.
User-agent: Googlebot
Disallow: /
User-agent: Bingbot
Disallow: /
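If you want to double-check the rules before uploading the file, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `blocked_agents` helper, the `example.com` URL, and the `SomeOtherBot` name are my own placeholders for illustration. Keep in mind this only shows how a well-behaved parser reads the file; robots.txt is a polite request, and it's up to each crawler to honor it.

```python
from urllib.robotparser import RobotFileParser

# The three rule groups from this post, combined into one robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/any-post"):
    """Return {agent: True} for each agent the rules disallow from `url`.

    `url` is a placeholder; any path works here because the rules
    disallow everything under "/".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


# "SomeOtherBot" is a made-up crawler name that none of the rules mention.
result = blocked_agents(
    ROBOTS_TXT, ["GPTBot", "Googlebot", "Bingbot", "SomeOtherBot"]
)
for agent, blocked in result.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

The three named bots should come back as blocked, while a crawler the file doesn't mention is still allowed, since there's no catch-all `User-agent: *` group.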
I'm not depending on search engines to be discovered anyway, so there's really nothing in it for me.
Let these robots feed their insatiable hunger on someone else's content.