Spammers have a knack for developing "overrides" to even the most secure aspects of a system, including those that are not readily recognized as potential targets. The .htaccess file can be used to keep e-mail harvesters away. This is considered very effective because most harvesters identify themselves in some way through their user agent string, which gives .htaccess the ability to block them.

Spam Countered by .htaccess

Bad bots are spiders that do a lot more harm than good to a site; an e-mail harvester is one example. Site rippers are offline browsing programs that a surfer may unleash on a site to crawl and download every one of its pages for offline viewing. Both drive up a site's bandwidth and resource usage, sometimes to the point of crashing the site's server. Since bad bots typically ignore the wishes of one's robots.txt file, they can be banned through .htaccess, essentially by identifying them.

A code block can be inserted into the .htaccess file to block many of the known bad bots and site rippers currently in existence. Affected bots receive a 403 Forbidden error when they attempt to view a protected site. This usually results in a significant bandwidth saving and a decrease in server resource usage.

Bandwidth stealing, commonly referred to in the web community as hot linking, means linking directly to non-HTML objects, such as images and CSS files, that are hosted on someone else's server. The victim's server is robbed of bandwidth and money while the perpetrator enjoys showing content without having to pay for its delivery.

Hot linking to one's own server can be disallowed with .htaccess. Anyone who attempts to link to an image or CSS file on a protected site is either blocked or served different content. Being blocked usually means a failed request that shows up as a broken image, while an example of different content would be an image of an angry man, presumably to send a clear message to the violators. mod_rewrite must be enabled on one's server for this aspect of .htaccess to work.

Disabling hot linking of certain file types on a site requires adding code to the .htaccess file, which is then uploaded either to the root directory or to a particular subdirectory to limit the effect to just one section of the site. A server is typically set to prevent directory listing. If this is not the case, the required line should be added to the .htaccess file of the image directory so that nothing in this directory can be listed.

The .htaccess file can also reliably password protect directories on websites. Other options can be used, but .htaccess is regarded as the most secure of them: anyone wishing to get into the directory must know the password, and no "back doors" are provided.

Password protection using .htaccess requires adding the appropriate lines to the .htaccess file in the directory to be protected. Password protecting a directory is one of the functions of .htaccess that takes a little more work than the others, because a file containing the usernames and passwords that are allowed to access the site has to be created. It can be placed anywhere within the website, although it is advisable to store it outside the web root so that it cannot be accessed from the web.

Recommended Practices to Deter Spam

Avoiding the publication of referrers is one way of discouraging spammers. It would be pointless for them to bother sending spoofed requests to blogs when this information is not published. Unfortunately, most bloggers believe that being able to click on a link such as "sites referring to me" is a neat feature and have not evaluated its detrimental effect on the whole blogosphere.

If publishing referrers is a definite must, there should be built-in support for a referral spam blacklist, and the referrer page should be listed in robots.txt. This specifically tells Googlebot and its relatives not to index the referrer page, so spammers are unable to get the page rank they seek. This only works, however, when referrers are published separately from the rest of the site's content.

The use of rel="nofollow" likewise denies spammers their desired page rank, at the link level rather than just the page level as with robots.txt. All links in the referrer section of the website that point to external websites should carry this attribute, without exception, so as to offer maximum protection.

Referrer statistics gathered from beacon images loaded via JavaScript document.write statements are more reliable than what the raw web server logs contain. One option is to disregard the referrer section of a site's server logs entirely; a cleaner list of referrers can be gathered from JavaScript and beacon images.

The current Master Blacklist File can be a powerful and efficient weapon against spam. A log file analysis program that filters referrers against this list can help root out spam. The Master Blacklist is a simple text file that can be downloaded from a website or simply mirrored. It is far from perfect, since checking the file against the referrers that got through shows that few or none of them were listed.

The idea of combating comment spam by harnessing DNS-based black hole lists could also be used to ferret out other forms of spam, such as referral spam. The proposal is rather simple: for any request that carries a referrer, query the requesting IP against a blacklist. If the IP is blacklisted, or has a high score across a multitude of blacklists, the referring URL should not be listed in any section of a site's web stats. Once a given site has been identified as a referral spam host name, the blacklist need not be queried again for other IPs presenting the same host name in the HTTP request, as a matter of efficiency.

Various forms of spam have grown exponentially along with the popularity of blogs. This is probably due to how few restrictions are placed on who can post a comment, which is easily exploited by spammers intent on getting their goods in front of people. Spammers have automated tools constantly on the lookout for blogs that can easily be spammed. Spamming in all its forms carries heavy consequences for those trying to use the Internet and the World Wide Web in a productive way.
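
The .htaccess measures described above (blocking bad bots, refusing hot-linked files, and disabling directory listing) might be sketched as follows. This is only an illustration, not a vetted rule set: the user agent names, the domain example.com, and the file name hotlink-warning.png are hypothetical placeholders, and it assumes mod_rewrite is enabled and that the server allows these directives to be overridden in .htaccess.

```apache
# Illustrative sketch only; all names below are placeholders.

# Prevent directory listing for this directory and its subdirectories.
Options -Indexes

<IfModule mod_rewrite.c>
    RewriteEngine On

    # --- Bad bots and site rippers ---
    # Match on the user agent string; replace these placeholder names
    # with entries from a maintained list of known bad bots.
    RewriteCond %{HTTP_USER_AGENT} ExampleHarvester [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ExampleSiteRipper [NC]
    # "-" leaves the URL unchanged; F answers with 403 Forbidden.
    RewriteRule ^ - [F]

    # --- Hot linking ---
    # Let through requests with an empty referrer (direct visits).
    RewriteCond %{HTTP_REFERER} !^$
    # Let through requests referred by the site itself (placeholder domain).
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
    # Everything else asking for these file types gets 403 Forbidden.
    RewriteRule \.(gif|jpe?g|png|css)$ - [F,NC]

    # To serve a substitute image instead of blocking, the last rule
    # could be replaced with something like the two lines below; the
    # extra condition keeps the substitute itself from being rewritten.
    # RewriteCond %{REQUEST_URI} !/hotlink-warning\.png$
    # RewriteRule \.(gif|jpe?g|png)$ /hotlink-warning.png [L,NC]
</IfModule>
```

Placing this file in the root directory applies it to the whole site, while placing it in a subdirectory limits the effect to that section.

Password protection could look roughly like the following, placed in the .htaccess file of the directory to be protected. The realm name and the path /home/example/.htpasswd are assumptions chosen for illustration; the path stands for a password file stored outside the web root.

```apache
# Illustrative sketch only; realm name and path are placeholders.
AuthType Basic
AuthName "Restricted Area"
# Absolute path to the file holding usernames and hashed passwords,
# kept outside the web root so it cannot be fetched over the web.
AuthUserFile /home/example/.htpasswd
# Any user listed in the password file may enter after authenticating.
Require valid-user
```

The password file itself is usually created with Apache's htpasswd utility, for example `htpasswd -c /home/example/.htpasswd someuser`, where the `-c` flag creates the file and the username is again a placeholder.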