The robots.txt file is a text file containing directives for search engine indexing robots (or crawlers) that specify which pages may or may not be indexed.
It is an ASCII file located at the root of the website; its name must be written in lowercase and in the plural (robots.txt).
The meta tag "robots" (in the header of a page) can also be used to forbid the indexing of that page.
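For instance, a page can be excluded from indexing with the standard robots meta tag placed in its head section:

    <head>
      <meta name="robots" content="noindex">
    </head>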
This file allows you to leave instructions for the indexing robots:
- To indicate the location of the sitemap files
- To forbid certain robots from indexing your website
- To forbid the indexing of certain pages / directories
It can contain the following directives (a combined example follows the list):
- Sitemap: specifies the location of the sitemap or sitemap index files.
- User-Agent: specifies the robot to which the following directives apply.
  For example, Google's user-agent is Googlebot.
  A value of * means that the directives apply to all indexing robots.
- Disallow: denies access to certain pages / directories of your website.
  The value must start with /.
  / on its own means the whole site.
- Allow: the opposite of the Disallow directive; it specifies which pages / directories may still be indexed.
  By default, every page can be indexed.
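A minimal sketch combining these directives; the domain and paths are placeholders:

    # Location of the sitemap (placeholder URL)
    Sitemap: https://www.example.com/sitemap.xml

    # Rules for Google's crawler only
    User-Agent: Googlebot
    Disallow: /drafts/

    # Rules for all other robots
    User-Agent: *
    Disallow: /private/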
Be careful: the robots.txt file is not interpreted in the same way by all search engines.
For some robots the first matching directive takes precedence;
for others it is the most specific directive that takes precedence.
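For example, with the following rules (the paths are placeholders), a robot that applies the most specific rule will still index /private/public.html, while a robot that stops at the first matching rule will exclude it:

    User-Agent: *
    Disallow: /private/
    Allow: /private/public.html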
Examples:
Exclusion of all pages for all search engines / crawlers:
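In standard robots.txt syntax:

    User-Agent: *
    Disallow: /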