Defeating Referral / Referrer Spam
You may see a lot of spam referrals if you are running a web site that allows visitors to see statistics, such as those used in PHP-Nuke and other content management systems. This spam usually contains words related to drugs, sex, porn, and other types of content that you might find objectionable. Here are some of the sites that showed up in a customer's PHP-Nuke web site within the first hour of the day:
1. hydrocodone_m358.fiberia.com           26.31 % (5)
2. phentermine_no_p.fiberia.com           26.31 % (5)
3. hydrocod_over.fiberia.com              15.78 % (3)
4. hydroon_and.fiberia.com                5.263 % (1)
5. ww2.wrongsideoftown.com                5.263 % (1)
6. galleries.allinternal.com              5.263 % (1)
7. buy-hydrocodone-lortab.bestall.ru      5.263 % (1)
8. cheap-hydrocodone-online.bestall.ru    5.263 % (1)
9. pheermine_proz.fiberia.com
Defeating this type of spam is necessary because those links are not related to this particular web site, and because they could cause search engines to penalize your web site's rankings. In a typical 24-hour period, this specific web site was receiving between 2,500 and 6,000 or more of these rogue referrers.
Referral spam has been around for a long time. Generally speaking, it uses software to create artificial referrals to another website. The whole idea behind it is to gain links on the statistics pages that many CMSs show publicly by default.
The script to create the referrals is mind-numbingly easy to write. Nearly anyone with knowledge of a web scripting language would be able to create one. Most scripts query a search engine for the keywords they wish to target. Then they just pound the top sites with referrals in hopes that the ranking sites have statistics pages that are visible to the public. In my mind, this could be refined to spamming only sites that have certain text on them. For instance, PHP-Nuke sites that have publicly accessible statistics pages uniformly contain the text 'Access Statistics'. So, a simple search for the keywords AND 'Access Statistics' would return sites that are susceptible to referral spam.
How do site owners combat this type of abuse? Well, it isn't easy. The spammers aren't going to stop anytime soon. It's too lucrative. The simplest and most painless method is to remove any publicly accessible stats pages. If you decide not to remove the stats pages, at least make sure they are not indexed by the search engines. Remember, links going out to 'bad neighborhoods' can harm your ranking, so you don't want to allow these links to be indexed.
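One common way to keep stats pages out of search indexes is a robots.txt rule. The following is a minimal sketch in Python that checks whether a given robots.txt would keep a well-behaved crawler away from the statistics pages. The URLs (modules.php?name=Statistics and modules.php?name=MSAnalysis on www.example.com) are assumptions based on how PHP-Nuke typically addresses modules, and the helper name is_crawlable is hypothetical; adjust both to match your own site. Keep in mind that robots.txt is only advisory, so it stops the links from being indexed but does nothing to stop the spammers themselves.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content. The module URLs are assumptions based on
# typical PHP-Nuke addressing; check your own site's URLs before using them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /modules.php?name=Statistics
Disallow: /modules.php?name=MSAnalysis
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if a crawler honoring robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/modules.php?name=Statistics",
        "http://www.example.com/index.php",
    ):
        state = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(url, "->", state)
```

Running a check like this against your real robots.txt before uploading it is a cheap way to confirm that the stats pages are blocked while the rest of the site stays visible.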
Another method allows you to keep your statistics pages viewable, and requires a little bit of code modification to work. I spent a lot of time over the last month trying to block these web sites using various methods: banning individual IP addresses, banning entire blocks of IP addresses (since most of them originate from Russia), banning by the actual host names, and attempting not to track hosts with specific IP addresses or hostnames. Nothing I did seemed to work... until very recently.
What I ran across was a package designed to stop referrers from abusing the comments commonly used in topics, stories, blogs, etc. This software is called Referrer Karma and is available as a free download. Although it was originally designed around the commonly used WordPress blogging software, it should work with any type of content management system that uses PHP and MySQL.
At the very least, you need a database backend to hold the information it gathers, and PHP working on your server.
Following the installation instructions was pretty easy. The basic rundown is like this:
* Download the ref-karma.zip file.
* Extract the contents to a folder on your hard drive.
* Create the necessary database.
* Upload the entire folder containing the ref-karma files to a folder on your web server that holds publicly accessible documents. If you host on our servers, this would be the public_html folder.
* Edit the file per the installation documentation and set the hostname, database username, database password, and database name for the database you created above.
* Run the installation file to install the database tables.
* Follow the remaining instructions to add the include line to whatever page you need it added to.
After doing all of this on the above customer's web site and simply adding the include line to the main index.php file, I still had the same referrals coming to the site, although in much lower volume.
What I found I needed to do was also add the include line to several other files, including (since this is a PHP-Nuke site I am working on) the modules/MSAnalysis/index.php and modules/Statistics/index.php files. These were the remaining files being hit by the referrers, and once the include line was added there, all of those types of referrers stopped.
You can experiment with which files need the include line from the installation instructions until you find the specific files being hit. One suggestion is to download a raw access log, parse it for either the IP address or the host name that shows as visiting your site, and then find the pages that visitor is hitting. Adding the include line to those pages should solve the problem.
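The log-parsing suggestion can be sketched as a short Python script. This is only an illustration under stated assumptions: it expects the raw access log to be in Apache "combined" format, it matches on the referrer host name rather than the visitor's IP address, and the names SPAM_DOMAINS, is_spam_referrer, and pages_hit_by_spam are hypothetical. The two domains are taken from the referrer list earlier in this article; substitute whatever shows up in your own statistics.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Apache "combined" log format (an assumption; adjust for your server):
# host ident user [time] "METHOD path protocol" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)

# Hypothetical block list, seeded with domains from the referrer table.
SPAM_DOMAINS = ("fiberia.com", "bestall.ru")


def is_spam_referrer(referrer, spam_domains=SPAM_DOMAINS):
    """True if the referrer's host is a listed domain or one of its subdomains."""
    host = (urlparse(referrer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in spam_domains)


def pages_hit_by_spam(lines, spam_domains=SPAM_DOMAINS):
    """Count requested paths (query string included) per spam-referred hit."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and is_spam_referrer(match.group("referrer"), spam_domains):
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_spam_hits.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for path, count in pages_hit_by_spam(log).most_common():
            print(count, path)
```

The paths at the top of the output are the pages most worth checking first when deciding where to add the include line.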
I hope you find this information helpful. If you know anyone else who would benefit from this document, refer them to us using the links on the left. Don't forget to tell anyone you know who needs a web hosting provider that we go the extra mile to protect our customers and to help them obtain great search engine placements.
Last Updated ( Wednesday, 11 January 2006 )