Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol
Below are instructions for creating and using a robots.txt file for your website, so that you can control how search engines index your site's content.
robots.txt is a plain text file. When the spider (bot, crawler) of an SE (search engine) visits a site to collect data, it first looks for the robots.txt file and follows the instructions in that file.
robots.txt lets you give different instructions to different types of SE bots, allowing or blocking each one from specific areas of the website, or from the whole site.
Some common SE bots: Googlebot (Google), Googlebot-Image (Google Images), Yandex (Russian SE), Bingbot (Bing), Slurp (Yahoo), ...
The common syntax of a robots.txt file
User-agent: the bot that the rules below apply to
Disallow / Allow: the URL path you want to block / allow
*: a wildcard representing all
For example: User-agent: * (meaning the rules apply to all types of bots.)
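Putting these pieces together, a minimal robots.txt might look like the sketch below (the paths are illustrative, not taken from any particular site):

```text
# Apply the following rules to all bots
User-agent: *
# Block the admin area
Disallow: /wp-admin/
# But allow one specific file inside it
Allow: /wp-admin/admin-ajax.php
```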
Block the entire site
Disallow: /
Block a folder and everything in it
Disallow: /wp-admin/
Block a single page
Disallow: /private_file.html
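A Disallow line only takes effect under a User-agent line, so in a real file the rules above would be grouped together; one possible layout, assuming the rules should apply to all bots:

```text
User-agent: *
Disallow: /wp-admin/
Disallow: /private_file.html
```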
Remove a single image from Google Images
User-agent: Googlebot-Image
Disallow: /images/sexy.jpg
Remove all pictures from Google Images:
User-agent: Googlebot-Image
Disallow: /
Block all files of a given type, for example .gif
User-agent: Googlebot
Disallow: /*.gif$
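You can check how a crawler would interpret your rules with Python's standard-library robots.txt parser. Note that urllib.robotparser implements only the original exclusion protocol, so wildcard patterns like /*.gif$ are not understood by it; the sketch below therefore checks only plain path rules. The rules and URLs used here are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt using the simple rules shown above
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /private_file.html
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Blocked paths return False
print(rp.can_fetch("Googlebot", "https://example.com/wp-admin/options.php"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/private_file.html"))     # False
# Everything else is allowed
print(rp.can_fetch("Googlebot", "https://example.com/index.html"))            # True
```

This is handy for verifying a rule change before uploading the file, without waiting for a bot to visit.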
Things to watch out for in the robots.txt file
- Directives and paths are case-sensitive.
- Do not add extra spaces or omit required ones.
- Do not insert any characters outside the directive syntax.
- Write each directive on its own line.
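For example, because paths are case-sensitive, the two rules below block two different folders; a bot would still be free to crawl /private/ if only the first line were present (folder names are illustrative):

```text
User-agent: *
Disallow: /Private/
Disallow: /private/
```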
How to create a robots.txt file and where to put it
- Use Notepad or any other text editor to create the file, then name it robots.txt.
- Upload it to the root directory of the website.