How to block referer spam traffic with the Apache web server

In this guide you will learn what referer spam traffic is, how it is generated and, most importantly, how to block referer spam on a Linux Apache web server.

What is referer spam?

Referer spam is yet another nuisance invented by spammers. It causes unaware system admins, marketers or site owners to inadvertently visit or link back to the spammer’s site via publicly published access or referer logs on a victim’s website. This may consequently lead to a lower search engine ranking and drain your server’s resources.

Since you are reading this article, chances are that you have already noticed strange referral traffic hitting your server, and that following one of the referring links landed you on a completely unrelated website.

How it works

Hits generated with the referer spam technique do not come from genuine visitors. They are the result of an automated script making an HTTP request while deliberately setting the Referer header to a spam URL, which the web server then logs as if it were genuine. Below you can find a sample of Apache’s access log:

10.1.1.8 - - [10/Mar/2015:11:56:55 +1100] "GET / HTTP/1.1" 200 10543 "http://example.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36"

From the above we can read that a user from 10.1.1.8 using the Chrome browser visited the root page of our web server, and that the referring link came from the example.com domain. Such a log entry can be generated by anyone with access to the proper tools. Let’s use the curl command to generate a false referral from mydomain.local:

$ curl -s -e http://mydomain.local http://mysite.local > /dev/null

Now, when we examine Apache’s logs we can find the following entry:

10.1.1.8 - - [10/Mar/2015:12:26:20 +1100] "GET / HTTP/1.1" 200 433 "http://mydomain.local" "curl/7.32.0"
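
The same point can be illustrated without touching a real site. The following is a minimal Python sketch, not part of the original setup: it starts a throwaway HTTP server on the loopback interface, sends it one request with a made-up Referer header, and returns the value the server received. The names RecordingHandler and send_with_referer are hypothetical, chosen for this illustration only.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_referers = []  # Referer values recorded by the throwaway server


class RecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that records the Referer header of each GET."""

    def do_GET(self):
        # The server stores the header exactly as the client sent it.
        seen_referers.append(self.headers.get("Referer"))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def send_with_referer(referer):
    """Send one GET with a forged Referer to a loopback server; return what it saw."""
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: any free port
    worker = threading.Thread(target=server.handle_request)  # serve exactly one request
    worker.start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    request = urllib.request.Request(url, headers={"Referer": referer})
    # An empty ProxyHandler makes sure the request goes straight to the loopback server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    opener.open(request).close()
    worker.join()
    server.server_close()
    return seen_referers[-1]


if __name__ == "__main__":
    print(send_with_referer("http://mydomain.local"))
```

The server has no way to verify the header. It simply records whatever the client claims, which is why a forged referer ends up in the access log.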

Additionally, the curl command can also alter the user agent string:

$ curl -A "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" -s -e http://mydomain.local http://mysite.local > /dev/null

which causes your web server to log:

10.1.1.8 - - [10/Mar/2015:12:31:17 +1100] "GET / HTTP/1.1" 200 433 "http://mydomain.local" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36"

Forged requests like these are referer spam. They can fool web statistics tools that read your server logs, as well as drain your server resources.
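
One way to spot such entries is to tally the referers found in the access log. The sketch below is a minimal Python example that assumes the log uses the combined format shown in the samples above. The function name count_referers is hypothetical, and the regular expression is a simplification that does not handle escaped quotes inside fields.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_referers(lines):
    """Count how often each referer appears; skip unparsable lines and empty referers."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not a combined-format line
        referer = match.group("referer")
        if referer not in ("", "-"):  # "-" means no Referer header was sent
            counts[referer] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '10.1.1.8 - - [10/Mar/2015:11:56:55 +1100] "GET / HTTP/1.1" 200 10543 '
        '"http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"',
        '10.1.1.8 - - [10/Mar/2015:12:26:20 +1100] "GET / HTTP/1.1" 200 433 '
        '"http://mydomain.local" "curl/7.32.0"',
    ]
    for referer, hits in count_referers(sample).most_common():
        print(hits, referer)
```

Referers that appear often but point to sites unrelated to yours are candidates for blocking.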

How to block referer spam

What we would like to achieve here is simply to block traffic from any suspicious referer. For example, we are going to block any traffic whose referer is the example.com domain, as well as any traffic whose referer URL contains the keyword spam anywhere.

For this we need Apache’s rewrite module to be enabled. To see whether the rewrite module is enabled on your server, enter:

# apache2ctl -M | grep rewrite
 rewrite_module (shared)
Syntax OK

If you see no rewrite_module line in the output, the rewrite module is not enabled. To enable it, run:

# a2enmod rewrite
Enabling module rewrite.
To activate the new configuration, you need to run:
  service apache2 restart
# service apache2 restart
[....] Restarting web server: apache2apache2: 
. ok

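Next, change your virtual host’s AllowOverride setting so that Apache honors directives placed in .htaccess files. This is only needed if you intend to use the .htaccess method described later. For example: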

FROM:
<Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride None
        Order allow,deny
        allow from all
</Directory>
TO:
<Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
</Directory>

Once you have made the above changes, restart your web server:

# service apache2 restart

At this stage we have two options for using rewrite rules to block referer spam.

The first option is to insert the rewrite statements into the site configuration file. This approach is recommended as it puts little pressure on server resources, since all rewrite statements are read only once, during the Apache startup sequence. To do this, enter the following rewrite lines into your site configuration file:

<Directory /var/www/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride None
        Order allow,deny
        allow from all

        # Return 403 Forbidden when the referer contains example.com or spam
        RewriteEngine on
        RewriteCond %{HTTP_REFERER} example\.com|spam [NC]
        RewriteRule .* - [F]
</Directory>

Once you have made the above changes, restart your Apache web server. The disadvantage of the above configuration is that you must have root access to the server. If you do not have administrative access to the server, you have the option of placing a .htaccess file with the following content into the root directory of your website:

RewriteEngine on
RewriteCond %{HTTP_REFERER} example\.com|spam [NC]
RewriteRule .* - [F]

The disadvantage of the .htaccess method is that it can noticeably reduce your web server’s performance, as the .htaccess file needs to be read every time an HTTP request is made.
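
Before relying on a RewriteCond pattern it can help to check which referers it would catch. The following Python sketch is an approximation only: Apache uses PCRE while this uses Python’s re module, although the two behave the same for patterns this simple. It compares two spellings of the pattern, example.com|.*spam and example\.com|spam, and models the [NC] flag with re.IGNORECASE. The function name is_blocked and the sample referers are hypothetical.

```python
import re

# Two spellings of the RewriteCond pattern; [NC] is modelled with re.IGNORECASE.
LOOSE = re.compile(r"example.com|.*spam", re.IGNORECASE)    # "." matches any character
STRICT = re.compile(r"example\.com|spam", re.IGNORECASE)    # "\." matches a literal dot


def is_blocked(referer, pattern=STRICT):
    """Return True if the pattern matches anywhere in the referer (unanchored search)."""
    return pattern.search(referer) is not None


if __name__ == "__main__":
    samples = [
        "http://example.com/",
        "http://EXAMPLE.COM/page",
        "http://cheap-spam-offers.test/",
        "http://examplexcom.test/",
        "http://mysite.local/about",
    ]
    for referer in samples:
        print(referer, is_blocked(referer, LOOSE), is_blocked(referer, STRICT))
```

The two spellings differ only in edge cases: the unescaped dot also matches a referer such as examplexcom.test, and the leading .* adds nothing because the match is unanchored anyway.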

Either way, your server should now deny any traffic whose referer is example.com or whose referer URL contains the keyword spam. To test the correctness of your referer spam filter, run the curl command while faking a referral source. Your request should now result in forbidden access (Apache 403 error) caused by the .* - [F] RewriteRule.
