This article shows you how to create a robots.txt file and should help you decide for yourself, once you have read it, whether to use one.

Basically, robots.txt is a plain text file which is placed in a server's root directory. It includes information on whether search engine robots should index the site or parts of the site. The file starts with a comment (a line beginning with '#'), then 'User-agent' lines. Usually, the User-agent line is simply a wildcard, so that the rules apply to all robots, like so:

    # robots.txt for
    User-agent: *

although you can write separate User-agent and Disallow sections for different robots.

Next comes the Disallow section. This is read by the robot and, from there, it determines what's off-limits when it comes to indexing your site.

    # robots.txt for
    User-agent: *
    Disallow: /administration/ # nothing under administration/ should be spidered
    Disallow: /temp/ # these are temporary files
    Disallow: /active.asp # active content here, no point spidering it

Disallowing pages deep into your structure can be good for your users: they won't find themselves halfway through the site with no idea how to get out. Then again, the more search engine entries you have, the better, right? It's up to you to decide what should or shouldn't be excluded.

OK, so robots.txt is good for your users and for telling the search engines which pages to list. But here is the bad part: not all robots or bots are good. Some will ignore the robots.txt file and just index every page they come across, so some of your admin pages could get displayed somewhere.

Also, now you know that robots.txt needs to be in the root directory, what's to stop someone who reads this article from just typing its address into a browser? This would display your robots.txt file, including every path you have listed as off-limits! That would be like going into a pub, leaving your wallet on the bar and going to the loo. What are the chances of it still being there when you get back? Yeah, slim!

So now you know a bit about robots.txt, it's up to you to decide.

Good luck.
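If you want to check how a well-behaved robot would read the Disallow rules shown in this article, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are the ones from the example above. The extra "ExampleBot" section, the robot name "AnyBot", the helper name is_allowed and the page paths being tested are all made up for illustration. Remember that this only models a robot that plays by the rules; a bad robot will simply ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Rules from the article's example, plus one hypothetical per-robot section
# ("ExampleBot" is an invented name) to show separate sections per robot.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /administration/
Disallow: /temp/
Disallow: /active.asp
"""


def is_allowed(agent: str, path: str) -> bool:
    """Return True if a rule-abiding robot named `agent` may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each rule.
    for agent, path in [
        ("AnyBot", "/index.html"),
        ("AnyBot", "/administration/users.asp"),
        ("AnyBot", "/temp/cache.tmp"),
        ("AnyBot", "/active.asp"),
        ("ExampleBot", "/index.html"),
    ]:
        verdict = "allowed" if is_allowed(agent, path) else "off-limits"
        print(f"{agent:10} {path:28} {verdict}")
```

Running a quick check like this before uploading the file is one way to catch a Disallow line that blocks more, or less, than you intended.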