How robots work


How To Use The Robots.txt File To Increase Your Web Ranking

Sometimes we rank well on one engine for a particular keyphrase and
assume that all search engines will like our pages, and hence that we
will rank well for that keyphrase on a number of engines.
Unfortunately this is rarely the case. All the major search engines
differ somewhat, so what gets you ranked high on one engine may
actually help to lower your ranking on another engine.

It is for this reason that some people like to optimize pages for each
particular search engine. Usually these pages would only be slightly
different, but this slight difference could make all the difference
when it comes to ranking high.

However, because search engine spiders crawl through sites indexing
every page they can find, a spider might come across your
search-engine-specific optimized pages, and because they are very
similar, the spider may think you are spamming it and will do one of
two things: ban your site altogether or severely punish you in the
form of lower rankings.

The solution in this case is to stop specific search engine spiders
from indexing some of your web pages. This is done using a robots.txt
file which resides on your webspace.

A robots.txt file is a vital part of any webmaster's battle against
getting banned or punished by the search engines if he or she designs
different pages for different search engines.

The robots.txt file is just a simple text file, as the file extension
suggests. It's created using a simple text editor like Notepad or
WordPad; complicated word processors such as Microsoft Word will only
corrupt the file.

You can insert certain code in this text file to make it work. This is
how it can be done.

User-Agent: (Spider Name)

Disallow: (File Name)

The User-Agent is the name of the search engine's spider and Disallow
is the name of the file that you don't want that spider to index,
written as a path from the root of your site, beginning with "/".

You have to start a new batch of code for each engine, but if you want
to disallow multiple files you can list them one under another. For
example:

User-Agent: Slurp  # Inktomi's spider

Disallow: /xyz-gg.html

Disallow: /xyz-al.html

Disallow: /xxyyzz-gg.html

Disallow: /xxyyzz-al.html

The above code stops Inktomi from spidering two pages optimized for
Google (gg) and two pages optimized for AltaVista (al). If Inktomi were
allowed to spider these pages as well as the pages specifically made
for Inktomi, you may run the risk of being banned or penalized. Hence,
it's always a good idea to use a robots.txt file.

The robots.txt file resides on your webspace, but where on your
webspace? The root directory! If you upload your file to
sub-directories it will not work. If you wanted to disallow all engines
from indexing a file, you simply use the "*" character where the
engine's name would usually be. However, beware that the "*" character
won't work on the Disallow line.

Here are the names of a few of the big engines:

Excite - ArchitextSpider

AltaVista - Scooter

Lycos - Lycos_Spider_(T-Rex)

Google - Googlebot

Alltheweb - FAST-WebCrawler

Be sure to check over the file before uploading it, as you may have
made a simple mistake, which could mean your pages are indexed by
engines you don't want to index them, or even worse, none of your pages
might be indexed.
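
One possible way to check over the file before uploading it is to run
its contents through a robots.txt parser and ask which spiders may
fetch which pages. The sketch below uses Python's standard
urllib.robotparser module. The helper name "allowed" and the page
"/index.html" are assumptions made up for illustration, and Python's
parser may not match every real spider's behaviour exactly, so treat
this as a rough sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt contents, modelled on the Inktomi (Slurp) example.
ROBOTS_TXT = """\
User-Agent: Slurp
Disallow: /xyz-gg.html
Disallow: /xyz-al.html
Disallow: /xxyyzz-gg.html
Disallow: /xxyyzz-al.html
"""


def allowed(robots_txt, spider, path):
    """Return True if `spider` may fetch `path` under the given rules.

    `allowed` is a hypothetical helper for this sketch, not part of
    any search engine's tooling.
    """
    parser = RobotFileParser()
    # parse() takes the file as a sequence of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(spider, path)


if __name__ == "__main__":
    # Print what each spider is allowed to fetch.
    for spider in ("Slurp", "Googlebot"):
        for path in ("/xyz-gg.html", "/index.html"):
            print(spider, path, allowed(ROBOTS_TXT, spider, path))
```

If Slurp is reported as allowed to fetch one of the Google or AltaVista
pages, or another spider is reported as blocked from a page it should
see, the file probably contains a mistake worth fixing before you
upload it.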


