The robots.txt file is a text file containing directives for search engine indexing robots (or crawlers) that specify which pages may or may not be indexed.
It is an ASCII file located at the root of the website; its name must be written in lowercase and in the plural (robots.txt).
The meta tag "robots" (in the header of a page) can also be used to forbid the indexing of that page.
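For instance, a page can be excluded from indexing with the standard robots meta tag placed in its head section:

    <head>
      <meta name="robots" content="noindex">
    </head>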
This file allows you to leave instructions for the indexing robots:
- To indicate the location of the sitemap files
- To forbid certain robots from indexing your website
- To forbid the indexing of certain pages / directories
It can contain the following directives (a combined example follows the list):
- Sitemap: specifies the location of the sitemap or sitemap index files.
- User-Agent: specifies the robot to which the following directives apply.
  For example, Google's user-agent is Googlebot.
  A value of * means that the directives apply to all indexing robots.
- Disallow: denies access to certain pages / directories of your website.
  The value must start with /.
  / on its own means the whole site.
- Allow: the opposite of the Disallow directive; it specifies which pages / directories may still be indexed.
  By default, every page can be indexed.
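A minimal sketch combining these directives; the domain and paths are placeholders:

    # Location of the sitemap (placeholder URL)
    Sitemap: https://www.example.com/sitemap.xml

    # Rules for Google's crawler only
    User-Agent: Googlebot
    Disallow: /drafts/

    # Rules for all other robots
    User-Agent: *
    Disallow: /private/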
Be careful: the robots.txt file is not interpreted in the same way by all search engines.
For some robots the first matching directive takes precedence;
for others it is the most specific directive that takes precedence.
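For example, with the following rules (the paths are placeholders), a robot that applies the most specific rule will still index /private/public.html, while a robot that stops at the first matching rule will exclude it:

    User-Agent: *
    Disallow: /private/
    Allow: /private/public.html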
Examples:
Exclusion of all pages for all search engines / crawlers:
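In standard robots.txt syntax:

    User-Agent: *
    Disallow: /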