How To Stop Referrer Spam

Update – January 12, 2017: There’s a new post about how to remove Google Analytics referral spam that addresses new types of spam and includes resources for stopping it.

Referrer spam is becoming a problem. If you’re not familiar with referrer spam, it’s traffic from bots that impersonate a referral link. The fake traffic is designed to make the spammer’s domain show up in your site analytics so that you’ll visit that site.

Why is Referrer Spam a Problem?

Aside from junking up your site analytics with useless data, it’s a big waste of time. We’ve heard from many of our customers here at Raven just how frustrating it is to explain what “semalt” is to their clients and why it doesn’t matter.

While it’s possible to create a filter in Google Analytics to filter out referrer spammers like semalt, all it does is mask the problem. Also, as Himanshu Sharma has written about, it may create data sampling problems. So instead of filtering out bad data after the fact, I’m going to show you how to block it at the source.

Blocking Referrer Spam

The key to stopping referrer spam is to block it before it has a chance to register on your site as a referrer. The simplest way to do this is to add the following code to your .htaccess file.

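## SITE REFERRER BANNING
RewriteEngine On
# Match each spam domain anywhere in the Referer header, case-insensitively
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC,OR]
RewriteCond %{HTTP_REFERER} seoanalyses\.com [NC]
# Answer matching requests with 403 Forbidden
RewriteRule .* - [F]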
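
To see which referrers rules like these would catch, the matching logic can be modeled outside Apache. The Python sketch below is only an approximation: it assumes Python's re module behaves like Apache's regex engine for patterns this simple, it writes the patterns with the dots escaped so they match literally, and the function name is_blocked is made up for illustration.

```python
import re

# Patterns mirroring the three RewriteCond lines, with the dots escaped.
SPAM_PATTERNS = [r"semalt\.com", r"buttons-for-website\.com", r"seoanalyses\.com"]

# [NC] means case-insensitive; chaining conditions with [OR] is
# equivalent to joining the patterns with "|".
SPAM_RE = re.compile("|".join(SPAM_PATTERNS), re.IGNORECASE)


def is_blocked(referer: str) -> bool:
    """Return True if the rules would answer this Referer with 403."""
    # RewriteCond patterns are unanchored, so use search(), not match().
    return SPAM_RE.search(referer) is not None


print(is_blocked("http://semalt.semalt.com/crawler.php?u=http://example.com"))  # True
print(is_blocked("http://BUTTONS-FOR-WEBSITE.com/"))                            # True
print(is_blocked("https://www.google.com/search?q=referrer+spam"))              # False
print(is_blocked(""))                                                           # False
```

Because the patterns are unanchored, subdomains and paths on a listed domain are caught too, while requests with no referrer pass through untouched.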

Deflecting

Another technique you can use is a deflector, which redirects the traffic back to where it came from. Avi Wilensky, CEO of Promediacorp, prefers this method to just blocking the spammers. He creates a text file named deflector.map that looks like this:

##
## deflector.map
##
## referer --> redirect target
http://semalt.com                http://semalt.com
http://seoanalyses.com           http://seoanalyses.com
http://buttons-for-website.com   http://buttons-for-website.com

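Then he adds the following rules. Note that Apache only accepts the RewriteMap directive in the main server or virtual host configuration, not in .htaccess; the RewriteCond and RewriteRule lines can go in either place.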

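# RewriteMap is only valid in server or virtual host config, not .htaccess
RewriteMap deflector txt:/path/to/deflector.map
# Only consider requests that send a Referer header
RewriteCond %{HTTP_REFERER} !=""
# Proceed only when the referrer has an entry in deflector.map
RewriteCond ${deflector:%{HTTP_REFERER}|NOT-FOUND} !=NOT-FOUND
# Redirect to the target listed for that referrer
RewriteRule ^ ${deflector:%{HTTP_REFERER}} [R,L]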

I haven’t tried this yet, but I plan to. If you’ve had any experience with deflecting, please tell us about it in the comments below.
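
One thing worth knowing before trying it: Apache looks up keys in a txt: map by exact string comparison, so an entry only fires when the Referer header equals the key character for character. The Python sketch below is a rough model of that lookup, not Apache's actual code; the map contents and the function names parse_map and redirect_target are made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical map contents, in the same "referer target" layout as deflector.map.
MAP_TEXT = """\
## deflector.map
http://semalt.com http://semalt.com
http://buttons-for-website.com http://buttons-for-website.com
"""


def parse_map(text: str) -> Dict[str, str]:
    """Parse map text: skip blank lines and # comments, split key from value."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            entries[parts[0]] = parts[1]
    return entries


def redirect_target(entries: Dict[str, str], referer: str) -> Optional[str]:
    """Whole-string lookup: no prefix, substring, or pattern matching."""
    return entries.get(referer)


entries = parse_map(MAP_TEXT)
print(redirect_target(entries, "http://semalt.com"))                     # http://semalt.com
print(redirect_target(entries, "http://semalt.com/"))                    # None
print(redirect_target(entries, "http://semalt.semalt.com/crawler.php"))  # None
```

In this model a trailing slash or a subdomain is enough to miss the map, so a deflector map probably needs one entry per exact referrer URL you actually see in your logs.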

Blacklists

Shelli Walsh, of ShellShock UK, recommends using a blacklist of referrers and Regex coupled with commonly used spammy keywords. An example of this is available from Perishable Press.

The only problem with currently known referrer spam blacklists — at least the ones I found — is that they don’t seem to be kept up-to-date.

WordPress Plugin

For those who don’t have access to their .htaccess file or don’t feel like they have the experience to properly edit it, there’s a WordPress plugin for it. For many webmasters, semalt is the worst offender. That’s why Peadig created the Semalt Blocker for WordPress.

The Semalt Blocker plugin is currently limited to only blocking semalt, but the plugin creator, Alex Moss, has assured me that they’re working on a new version that will allow users to add more sites to block as needed.

Efficient Management of .htaccess

Another annoyance of having to block referrer spam is updating the .htaccess file for all of your sites. Fortunately, there’s a trick that Brian LaFrance of AuthorityLabs shared with me. He uses an umbrella .htaccess file for all of his sites. He does that by storing an .htaccess file in the directory that contains all of his site directories. The server reads that .htaccess file before each site’s individual .htaccess file, so the bots are stopped for all sites nested under that directory. One caveat: mod_rewrite rules are not inherited by default, so if a site’s own .htaccess contains rewrite rules (as WordPress sites do), the parent’s rules are skipped unless that file sets RewriteOptions Inherit.

Personally, I like to use unique .htaccess files for each of my sites, but I still like for things to be as efficient as possible. My solution has been to create symbolic links to all of my .htaccess files in one folder. That way I have access to all of them, and then I can quickly open, paste and save…open, paste and save…
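
If the open, paste and save routine gets tedious, the block itself can be generated from a plain list of domains. Here is a minimal Python sketch, assuming you keep the domains in a list; the function name build_referrer_block is hypothetical, and the output follows the same RewriteCond pattern used earlier in this post, with dots escaped.

```python
from typing import Iterable


def build_referrer_block(domains: Iterable[str]) -> str:
    """Render rewrite rules that return 403 for the given referrer domains."""
    # Normalize, de-duplicate, and sort so the output is stable between runs.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    if not cleaned:
        return ""
    lines = ["## SITE REFERRER BANNING", "RewriteEngine On"]
    for i, domain in enumerate(cleaned):
        # Escape dots so they match a literal "." instead of any character.
        pattern = domain.replace(".", "\\.")
        # Every condition except the last is chained with OR.
        flags = "NC" if i == len(cleaned) - 1 else "NC,OR"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} [{flags}]")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines) + "\n"


print(build_referrer_block(["semalt.com", "buttons-for-website.com", "seoanalyses.com"]))
```

The printed block can then be pasted into each .htaccess file, or written into one shared file if your setup allows it.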

Here’s a spambot list that’s frequently updated.

Update – March 23, 2015

After writing and publishing this post, two new pieces of information were presented to me.

First, Rishi Lakhani would like credit for coming up with the Semalt Blocker plugin by Peadig.

He also wrote an excellent post on referrer spam over at Refugeeks that you should check out.

Second, Georgi Georgiev pointed me to his post that analyzes all of the options for blocking referrer spam. He concluded that the best overall solution is to create a custom filter in Google Analytics.

You can create a filter for your sites in Google Analytics by navigating to Admin and then clicking All Filters. Click the New Filter button and create a Custom Exclude filter on Campaign Source. Enter the domains you want to exclude as a single regular expression: write each entry as domain. (the domain name followed by a dot) and separate the entries with a pipe (|), with no pipe after the last entry.

darodar.|semalt.|buttons-for-website|blackhatworth|ilovevitaly|prodvigator|cenokos.|ranksonic.|adcash.|simple-share-buttons.|social-buttons.

It should look similar to this screenshot:

[Screenshot: Google Analytics Filter]
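
Before saving a filter pattern, it can help to try it against a few sources you do and don't want excluded. The Python sketch below is a stand-in, not Google Analytics itself: it assumes the filter is applied as a partial (unanchored) regex match, and the shortened pattern, sample sources, and function name is_excluded are made up for illustration.

```python
import re

# A shortened version of the exclude pattern, with literal dots escaped.
FILTER = r"darodar\.|semalt\.|buttons-for-website|ilovevitaly"

# The same pattern with a trailing pipe, which adds an empty alternative.
BROKEN = FILTER + "|"


def is_excluded(pattern: str, source: str) -> bool:
    """Return True if the pattern matches anywhere in the campaign source."""
    return re.search(pattern, source) is not None


print(is_excluded(FILTER, "semalt.semalt.com"))  # True
print(is_excluded(FILTER, "linkedin.com"))       # False
print(is_excluded(BROKEN, "linkedin.com"))       # True
```

The last line shows why a trailing pipe is risky: the empty alternative matches every source, so a filter written that way could exclude legitimate referrals along with the spam.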

What about you? How do you block botnets?

Update – April 20, 2015

Matthieu Napoli left a helpful link to a Referrer Spam Blacklist hosted by Piwik on GitHub. Many thanks to Matthieu for sharing that.

Update – June 16, 2015

Tom Capper at Distilled discovered another way to filter out referrer spam in Google Analytics. He suggests using a screen resolution exclusion.

Update – October 14, 2015

I’m really impressed with a service called Referrer Spam Blocker that was created by Stijlbreuk. You can add filters to many sites at once and best of all, it’s free!

94 Responses to “How To Stop Referrer Spam”

  1. Thanks Jon! This stuff is getting annoying. Another method I use is SetEnvIfNoCase Referer which looks like this:
    SetEnvIfNoCase Referer semalt.com spammer=yes

    Order allow,deny
    Allow from all
    Deny from env=spammer

  2. Nathan Grimm

    I’ve seen some people claim that blocking these sites with htaccess doesn’t always work because they’re not hitting your site just hitting the Google Analytics script with random UA Codes. Is there any truth to that?

  3. modelcarguy

    I used to deal with this at the domain name level, but I have found that 90% of the offending domains come from the same IP. So I have a script that blocks IP addresses that have proven to be annoying. It is work, but this keeps my useless traffic and server hits way down.

  4. Eric Itzkowitz

    Thanks John! I am going to try the deflector today. I assume the deflector.map.txt file should also be placed at the same directory level as the .htaccess file?

    • Jeremy Englert

      Eric, I got the same error when trying this. It appears it doesn’t work with all hosting configurations (I use WPEngine). However, I was able to contact their support and they set up a “blacklist” for me.

  5. Jeremy Englert

    If you’re hosting with WPEngine, neither of these methods will work. You will need to contact their support team and let them know which websites you would like to “Blacklist” – then they will set it up for you.

  6. Lucas Kelly

    Logically no search system could survive without robots. It’s not a reason to push panic, especially that present-day systems of statistics count can easily tell a human from a bot. All bot complaints are pointless. All I can say about Semalt is that they do not breach any rules. Their SEO works, and that matters most. I haven’t heard about any of their clients falling under Google filter.
    If you don’t trust Google, what is it all about?

  7. Thanks for this article, very interesting stuff.

    I currently use the SetEnvIfNoCase Referer method and it has worked well in the past, but recently I have found that Darodar is somehow sneaking through.

    I was interested to read in this thread that they may now be going for the GA code itself and not actually getting anywhere near my sites. This sort of explained why my GA was showing hundreds of Russian referrer visits that did not match the raw AWStats data showing minimal Russian traffic.
    I initially thought the deflector method you mentioned in your post could work for me, but if the traffic never arrives I can’t see how I can deflect it back.
    Does anyone have any ideas beyond .htaccess and deflecting that can be effective against this problem?

    Why don’t Google themselves come up with an effective blocking method? Life would be a little bit easier if they did.

  8. Just want to point that Bruno Walsh & Lucas Kelly are fake accounts created by Nataliya Khachaturyan, the pseudo-something of semalt… Exactly the same crap and nonsenses… Such a shame to be so stupid !

  10. emiliano1991

    Hi,

    The deflector method can’t be used in an .htaccess file; it generates a 500 error!
    The RewriteMap directive may not be used in sections or .htaccess files. You must declare the map in server or virtualhost context.

    However, Good article

  11. Thank you Jon, just the thing!
    Although the Deflector strategy is the best approach (but work-intensive for more than one domain), the Google filter seems like the most practical solution. With that approach, I was able to apply the same filter of about a dozen prolific spammers to ALL the domains I manage in Analytics – using the “Available Views” list that is NOT included in your snapshot above (but should be ;).

  12. Ben Lloyd

    Jon!
    Man – trying to explain what’s going on to clients and then “take away” traffic, even though it never actually existed, by filtering it out is such a pain in the ass. Creating and updating filters in GA just seems like a futile war of attrition as well. Has anybody asked GA (or any other analytics platform) about whether they monitor for this stuff and why not issue filters or just offer the option of filtering this spammy traffic out from the GA installation in the first place.

    • Agreed. I’m sure someone has spoken to them. My guess is that they’re avoiding it as long as they can. Fighting spam for analytics is a bit different than it is for their search team. For example, who gets on that list and how does one get off of it. It would be one more distraction that they would have to staff up for, and there may be legal implications for them too.

      • Or even just a checkbox next to the referrers “filter out traffic from this source?” with the obligatory “you can always put it back in – just go to…”. They’re smart enough I suppose they can probably figure it out. This is one of those things with free products like this where you don’t really offer support and it’s all DIY so people use it and think it’s accurate when it just isn’t.

        How’s Nashville? We missed you at SearchFest this year – I’m sure I’ll run into you at another show.

  13. Ankush Kohli

    Thanks Jon, but I notice less traffic from every source, even from organic traffic (Google, Yahoo & Bing), after applying the hostname filter. I saw this after comparing both analytics views (with and without filters).

    I can understand direct or referral, but why do we see fewer sessions from organic traffic?

  14. Should it be domain./ or domain.? I wanted to double-check because the paragraph states to use ./ but the screenshot shows a different format.

  15. Beverley Bott

    This is all really useful and I’ve put a couple of filters in place to block this spam, but there seems to be a new one pop up as fast as you can block them. Perseverance I suppose until Google take action to stop their ability to spam analytics.

    Also it seems that the filters I have in place are stopping my legitimate referral traffic, most of which is referred from LinkedIn and articles. Any ideas why?

    The two filters I have in place are filtered on ‘Campaign source’ and assigned to the ‘All Web Site Data’ view. I have an unfiltered view in place also, which shows everything – spam and legitimate referrals.

    The two filters contain these regex:
    darodar.|semalt.|buttons-for-website.com|best-seo-solution.com|buttons-for-your-website.com|duckduckgo.com|best-seo-offer.com|
    cmswip01.nottingham.ac.uk|zoominfo.com|mycustomer.com|4webmasters.org|best-seo-offer.com|free-social-buttons.com|

    Can anyone suggest help? Thanks

  16. ClickMonster

    I’d like to just block at my server level, but I think I need to do this with IP addresses. Is there a comprehensive list of IP addresses associated with these referral spammers?

  17. Could you use your hosts.deny file to block these from the entire server? I don’t want to have to update these filters and .htaccess files each time a new one pops up; it is a nightmare. I am not sure what syntax to use for hosts.deny.

    • Sergio Bobillier

      You could use hosts.deny, but keep in mind that what you have to add there is the IP address the requests are coming from, not the spamming domain. In other words, adding 4webmasters.org to your hosts.deny file won’t do any good unless the traffic is coming from the same IP that hosts the 4webmasters.org site.

      If you want to add everything on a single file I suggest you add the referrals to one server configuration file instead of the .htaccess files. That way there is only one file to maintain.

  18. I didn’t see anyone address getting rid of the historical data from the spam referrals. We were able to get rid of them through a New Segment, but every time we log out, the New Segment doesn’t stay selected. Does anyone know what we’re doing wrong, how to set the New Segment as the default segment, or some other way to permanently hide the referral spam in historical data?

  19. Sergio Bobillier

    Nice Article 🙂

    There is something I would like to contribute. Blocking these referrals in the .htaccess file is a waste of processor and I/O time because Apache has to read and parse the entire list of entries each time it processes a request for any of the sites on the host.

    I would recommend adding it to the server configuration file, for example in Ubuntu there is a file called security.conf which is intended to centralize all the security policies and directives. There I added a line like this for each referral:

    SetEnvIf Referer 4webmasters.org referral_spam

    Then add this to the VHost configuration file (or the .htaccess if you want, a single line is not as bad as 180 of them)

    Require not env referral_spam

    This also makes things easier to maintain, you can add or remove hosts in a single place.

  20. Ivan Cuxeva

    Thank you Jon, this helps a lot. However, after implementing the first htaccess code, I still see some of them pop up in my analytics, which is weird since they are indeed blocked via htaccess. One of the culprits is free-social-buttons, and it seems they are using different servers such as www1, www2, www3+ … I’m researching a way to block these via htaccess as well. Overall, the first code has blocked about 80% of the referral spam that was hitting my sites. 🙂

  21. I’ve got over 400 rules and today I’m running new reports and it has been hit by a whole new myriad of bots. This is maddening. I can’t keep up with it and reporting becomes just a joke. Maybe Raven could allow us to ignore certain lines in a report, at least then I could condense it down but my first 2 screens of analytics is just garbage and there is no easy way around it. #frustrated

  22. Carlos Real

    Hi guys,

    I’m currently using this in my htaccess; however, some still seem to get through, particularly “floating-share-buttons”, who managed to rack up 900 sessions this month! Is there anything wrong with the script below?

    Thanks for the article !

    # Block Russian Referrer Spam
    RewriteEngine on
    RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly..ru/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly.org/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*ilovevitaly.info/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*iloveitaly.ru/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*econom.co/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*savetubevideo.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*kambasoft.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*buttons-for-website.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*floating-share-buttons.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*semalt.com/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*4webmasters.org/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*trafficmonetizer.org/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*webmonetizer.net/ [NC,OR]
    RewriteCond %{HTTP_REFERER} ^http://.*darodar.com/ [NC]
    RewriteRule ^(.*)$ - [F,L]

  23. fixxi.net

    Can a linux/unix guy confirm whether the recommended regex script (below) processes each packet, even those packets that are part of an established TCP session, or just the beginning of the session ( somewhere around the TCP handshake ). I am trying to judge what performance impact this script ( with three rules ) will have on the Apache server and whether it is lighter on the server to let it respond, since referral spam seems to be just one hit to the homepage.

    ## SITE REFERRER BANNING
    RewriteCond %{HTTP_REFERER} semalt.com [NC,OR]
    RewriteCond %{HTTP_REFERER} buttons-for-website.com [NC,OR]
    RewriteCond %{HTTP_REFERER} seoanalyses.com [NC]
    RewriteRule .* - [F]

  24. Maarten De Wispelaere

    Thanks to this article, I found the Piwik list. I already had a tool running on my web servers, but only with a small list of my own. I adjusted my tools to automatically generate a combined blacklist file and configure Apache accordingly. The lists are updated each night. At the moment, it’s only my own short list and the Piwik one. If you know of other plaintext lists, please let me know and I’ll include them. If someone wants to contribute, please do. All code is here: https://github.com/bitprocessor/referralspam-block ; Thanks again for this post.

  25. Email blacklist are the easiest way to reduce spam messages. Your mails will not get delivered if the server has been blacklisted. Seo blacklist check will check over 100 DNS based blacklists on a server IP address.

  26. I’ve been using the filters in Google Analytics for a while now. I’m on Windows hosting and have blocked it through my web.config file, but that became a hassle and I also didn’t want to make that file huge. Since I get hit about once a week or so by a new one, the analytics filter has been my best option so far. Constantly doing it is a pain, but it’s easy to do.

  27. brokenOval

    Thanks! Great article. I seem to have been hit with this, but all my referral URLs are in fact my own domain. It seems like when I initially got hit, the spammer stored my domain externally and spoofs the request to look like it’s coming from my own servers. Obviously I can’t block requests from my own site. Any ideas how to get around this?

  28. Awesome, but according to Moz, “one of the biggest mistakes people make is trying to block Ghost Spam from the .htaccess file.” They also said “the .htaccess file can only effectively block crawlers such as buttons-for-website.com and a few others since these access your site. Most of the spam can’t be blocked using this method, so there is no other option than using filters to exclude them.”

  29. webvisitors

    That’s really awesome, but my question is: if we filter it through Google Analytics, will it work, and is there any way to do it through Raven Tools? Thanks

    • It should. The filters you set up in the Admin affect the results in GA that you see for that property. In turn, Raven uses their API to get results for the property, which should be the filtered results from GA.