We will explain how to prevent AI such as ChatGPT and Grok from learning and searching for output on your WordPress site content without your permission.

Learning of content by AI and output of results by real-time search

Generative AI is currently booming. However, the generative AI such as ChatGPT and Grok do not indicate the link (learning source) that directs traffic to the site, and functions such as DeepResearch indicate the reference source in a way that users are not likely to press the link in a very obvious way*, This situation is becoming less beneficial for content creators.

This situation is based on the specifications of the generated AI as of April 2025, and the specifications may change in the future.

Prohibit learning by ChatGPT and Grok AI and referencing of content in real-time search in robots.txt

To stop such learning and use of content by generated AI in robots.txt, put the following in robots.txt in the top directory on the server. (Download and add the robots.txt file using FTP software)

User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: Grok
Disallow: /

If you want to disallow a specific directory, use the following

User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: Grok
Disallow: /knowledge/

After testing with the above, real-time references to learning and content were stopped.

However, this method may cause the generated AI to slip through the robots.txt instructions for a one-time access only. Please refer to “Block Even Stronger” for how to deal with this case.

What if there is no robots.txt on the server?

WordPress will automatically generate a robots.txt file when the robots.txt address is accessed if there is no robots.txt file on the server.

https://wordpress site domain/robots.txt

Copy and paste this text into a text editor to create a robots.txt file, add the above settings to the file, and upload it to the server.

More powerful blocking

To block access to generated AI even more strongly, you can use HTACCESS to block access to generated AI itself.
To block access to the site with HTACCESS using ChatGPT or Grok user agent, put the following in HTACCESS.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^.*(GPTBot|ChatGPT-User).*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(Grok|python-requests).*$ [NC]
RewriteRule ^ - [F,L]
</IfModule>

Reference site *It describes how to block not only ChatGPT and Grok, but also all kinds of generated AI.
https://perishablepress.com/ultimate-ai-block-list/#block-ai-via-apache

(It may be premature to block as completely as the above article)

Embed Terms of Service for Generated AI in your pages
You can also prevent access to content by automatically outputting the Terms of Use for Generated AI at the bottom of the content on all pages with the following code.

add_filter( 'the_content', 'filter_the_content_term_for_generatedai', 1 );
function filter_the_content_term_for_generatedai( $content ) {
    if ( is_singular() ) {
        return $content . "<div style='padding:10px;border:solid 1px #ccc;box;box-sizing:border-box;font-size:12px;'><strong>Terms of Use for Generated AI</strong><br>
This page prohibits the use, quotation, or summarization of any page, in whole or in part, by the Generated AI. However, if the following conditions are met, the specification of content using generated AI is permitted.<br>1. it is not for the purpose of learning by the generated AI. 2. only the summary or title of the page content at a level that does not lead to the solution of the user's problem is shown to the user. 3. in the case of 2, a link to this content is shown to lead the user to this page.<br>
<br></div>";
    }
    return $content;
}

The contents of these Terms of Use are quite strict, including the following

The use, quotation, or summarization of this page, in whole or in part, by a Generated AI is prohibited. However, presentation of content using a Generated AI is permitted only if all of the following conditions are met
1.The purpose is not to learn (model training) the Generated AI.
2.Only a summary or title that does not lead to a solution to the user’s problem must be presented.
3.If the above 2 applies, a link that leads to this page must also be provided.

Is it possible to prevent content from appearing in Google’s AI Overviews?

Recently Google has also started displaying a snippet called AI Overviews at the top of its search results. This is similar to the generated AI, but it seems that there is currently no way to completely prevent generated text that uses the site’s content from appearing here.
However, there are reports that the following methods have been successful in preventing it.

Reference https://www.gsqi.com/marketing-blog/how-to-remove-content-and-links-from-google-ai-overviews/

Google claims that AI Overviews do not cause a decrease in traffic to the site, so we may just have to wait and see.

Free WordPress:Best Malware Scan & Security Plug-in, made in Japan [Malware and Virus Detection and Removal].

Terms of Use for Generated AI

This page prohibits the use, quotation, or summarization of any page, in whole or in part, by the Generated AI. However, if the following conditions are met, the specification of content using generated AI is permitted.
1. it is not for the purpose of learning by the generated AI. 2. only the summary or title of the page content at a level that does not lead to the solution of the user’s problem is shown to the user. 3. in the case of 2, a link to this content is shown to lead the user to this page.