Have you ever needed to prevent Google from indexing a particular URL on your web site and displaying it in their search engine results pages (SERPs)? If you manage web sites long enough, a day will likely come when you need to know how to do this.

The three methods most commonly used to prevent the indexing of a URL by Google are as follows:

- Using the rel="nofollow" attribute on all anchor elements used to link to the page to prevent the links from being followed by the crawler.
- Using a disallow directive in the site's robots.txt file to prevent the page from being crawled and indexed.
- Using the meta robots tag with the content="noindex" attribute to prevent the page from being indexed.

While the differences between the three approaches appear subtle at first glance, their effectiveness can vary drastically depending on which method you choose.

Using rel="nofollow" to prevent Google indexing

Many inexperienced webmasters attempt to prevent Google from indexing a particular URL by using the rel="nofollow" attribute on HTML anchor elements. They add the attribute to every anchor element on their site used to link to that URL.

Including a rel="nofollow" attribute on a link prevents Google's crawler from following the link, which, in turn, prevents them from discovering, crawling, and indexing the target page. While this method might work as a short-term solution, it is not a viable long-term solution.

The flaw with this approach is that it assumes all inbound links to the URL will include a rel="nofollow" attribute. The webmaster, however, has no way to prevent other web sites from linking to the URL with a followed link. So the chances that the URL will eventually get crawled and indexed using this method are quite high.

Using robots.txt to prevent Google indexing

Another common method used to prevent the indexing of a URL by Google is to use the robots.txt file. A disallow directive can be added to the robots.txt file for the URL in question. Google's crawler will honor the directive, which will prevent the page from being crawled and indexed. In some cases, however, the URL can still appear in the SERPs.

Sometimes Google will display a URL in their SERPs even though they have never indexed the contents of that page. If enough web sites link to the URL, then Google can often infer the topic of the page from the link text of those inbound links. As a result, they will show the URL in the SERPs for related searches. While using a disallow directive in the robots.txt file will prevent Google from crawling and indexing a URL, it does not guarantee that the URL will never appear in the SERPs.

Using the meta robots tag to prevent Google indexing

If you need to prevent Google from indexing a URL while also preventing that URL from being displayed in the SERPs, then the most effective approach is to use a meta robots tag with a content="noindex" attribute within the head element of the web page. Of course, for Google to actually see this meta robots tag, they need to first be able to discover and crawl the page, so do not block the URL with robots.txt. When Google crawls the page and discovers the meta robots noindex tag, they will flag the URL so that it will never be shown in the SERPs. This is the most effective way to prevent Google from indexing a URL and displaying it in their search results.
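
The interaction between the robots.txt and meta robots methods can be sketched in a few lines of Python. This is only an illustration using the standard library, not a description of how Google's crawler works internally. The domain, the page path, the markup, and the helper names (`can_crawl`, `has_noindex`, `noindex_is_visible`) are all hypothetical and chosen for the example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content and page URL, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private-page.html
"""
PAGE_URL = "https://www.example.com/private-page.html"

# Hypothetical page markup with a meta robots noindex tag in its head element.
PAGE_HTML = """\
<html>
  <head>
    <meta name="robots" content="noindex">
    <title>Private page</title>
  </head>
  <body>Not meant for the SERPs.</body>
</html>
"""


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noindex(html):
    """Return True if the markup contains a meta robots noindex directive."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_is_visible(robots_txt, url, html):
    """The noindex tag can only be seen if the page may be crawled."""
    return can_crawl(robots_txt, url) and has_noindex(html)


if __name__ == "__main__":
    print("crawlable:", can_crawl(ROBOTS_TXT, PAGE_URL))
    print("has noindex:", has_noindex(PAGE_HTML))
    print("noindex visible:", noindex_is_visible(ROBOTS_TXT, PAGE_URL, PAGE_HTML))
```

With the disallow rule in place, the page carries a noindex tag that a compliant crawler is never allowed to fetch, which is why the two methods should not be combined for the same URL. Passing an empty robots.txt to `noindex_is_visible` instead lets the tag be seen.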