Block Common Crawl via robots.txt

Description

This plugin adds lines to the virtual robots.txt file, which WordPress creates automagically when no physical robots.txt file is present on the server, to block the Common Crawl bot (CCBot) from crawling websites. Common Crawl data is used to train AI models like ChatGPT (in version 3.5 it reportedly makes up 60% of the training data mix). Here is the information about the Common Crawl bot (CCBot).

Please be aware of exactly what this plugin does: if you activate it, it adds some lines to the robots.txt file that tell the Common Crawl bot not to crawl and index parts of this website.

Please only activate this plugin if you know what you’re doing.

What can I do with this plugin?

This plugin adds the necessary lines to the virtual robots.txt file that WordPress creates automagically. These lines block the Common Crawl bot (CCBot), which is used to train AI algorithms like ChatGPT, from crawling websites.
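
As a rough illustration of the effect (not the plugin's actual code, which runs inside WordPress), the following Python sketch appends a CCBot block to existing robots.txt content. The function name append_ccbot_block and the exact rule lines are assumptions made for this example; the lines the plugin really writes may differ.

```python
# Hypothetical sketch: this is NOT the plugin's code, only an illustration
# of appending a CCBot block to robots.txt content.

# Assumed rule lines; the plugin's real output may differ.
CCBOT_RULES = "User-agent: CCBot\nDisallow: /\n"


def append_ccbot_block(robots_txt: str) -> str:
    """Return the robots.txt content with a CCBot block appended once."""
    # Skip the append if a CCBot group is already present.
    if "user-agent: ccbot" in robots_txt.lower():
        return robots_txt
    # Make sure the existing content ends with a newline.
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"
    # A blank line separates the new group from any earlier rules.
    separator = "\n" if robots_txt else ""
    return robots_txt + separator + CCBOT_RULES


if __name__ == "__main__":
    # Example input resembling a default WordPress virtual robots.txt.
    default = (
        "User-agent: *\n"
        "Disallow: /wp-admin/\n"
        "Allow: /wp-admin/admin-ajax.php\n"
    )
    print(append_ccbot_block(default))
```

The check for an existing CCBot group keeps the function idempotent, so calling it twice does not duplicate the rules.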

What ideas is this plugin based on?

This plugin is based on our plugin that blocks the OpenAI ChatGPT bot, which we created after the launch of the OpenAI ChatGPT plugins that crawl websites.

What is the robots.txt file?

The robots.txt file is a plain text file located in the root folder of a domain (or subdomain). It tells web crawlers (like Googlebot) which parts of the website they should access and index.

A well-behaved search engine crawler checks the robots.txt file before it visits pages on a site, and the file tells search engine spiders how they should see and interact with the web pages. Following these rules is voluntary on the crawler's side.
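
To see how a crawler might interpret such rules, here is a minimal Python sketch that uses the standard library's urllib.robotparser. The sample rules, the helper name is_allowed, and the example.com URLs are assumptions made for illustration and are not taken from the plugin.

```python
from urllib.robotparser import RobotFileParser

# Sample rules, assumed for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

User-agent: CCBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("CCBot", "Googlebot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/any-post/")
        print(agent, "allowed" if verdict else "blocked")
```

With these sample rules, CCBot is blocked everywhere, while other crawlers fall back to the general group and are only kept out of /wp-admin/.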

System requirements

PHP version 7.4 or greater.
We require 7.4 or higher because we believe that everybody should be running a modern PHP version. At the time of this plugin's release, WordPress recommended PHP 7.4 or higher. We recommend 8.0 or higher, as 7.4 has received no security updates since 28 Nov 2022.

How to get this plugin in your language

The first release is available in English and Spanish. The “languages” folder includes the files needed to translate this plugin.

If you would like the plugin in your language and you’re good at translating, please use the native WordPress Translation functionality.

New to translating a plugin? First read through the Translator Handbook, then select your locale at Translating WordPress, and finally go to the translation page for this plugin to translate it.

Further Reading

You can access the description of the plugin in Spanish at: Block Common Crawl bot (CCBot) via robots.txt en español.

Contact

For further information please send us an email.

Screenshots

  • Lines that the plugin adds to the virtual robots.txt file that WordPress creates.

Installation

  1. Upload the plugin to the /wp-content/plugins/ folder.
  2. Activate the plugin in the plugin panel. There are no settings.

Reviews

There are no reviews for this plugin.

Contributors & Developers

“Block Common Crawl via robots.txt” is open source software. The following people have contributed to this plugin.

Contributors

Changelog

1.0.0 (21/APR/2023)

  • First release.