Bits N Tricks

A platform to contribute to the best of our knowledge so someone else can be benefited.

Filter Spam in Google Analytics 05 Jan, 2016

Referrer traffic with a 100% bounce rate in Google Analytics (GA) reports is becoming a serious issue. This referrer traffic is only a recorded statistic, not actual traffic. Spam traffic degrades the quality of analytics data and distorts website metrics, especially for websites with fairly low traffic. In some cases it even damages the reputation of SEO agencies, because it makes website owners doubt the SEO strategies meant to promote their online business. To keep Google Analytics traffic stats reliable, it is important to handle these spam referrers. Some of the popular spam referrers are boost-my-site.com, copyrightclaims.org, free-floating-buttons.com and wordpress-crew.net.

Types of Spam in Google Analytics:

There are two types of spam referrers detected in Google Analytics:

  1. Ghost Spam

  2. Crawler Spam

Ghosts: Ghost spam enters Google Analytics reports without ever interacting with the website, which is why it is known as ghost spam. Ghosts use the Analytics Measurement Protocol. Using this protocol with randomly generated tracking IDs (UA-XXXXX-1), spammers leave fake visit data in the analytics report.

Crawlers: Crawler spam is the opposite of ghost spam. Spam crawlers actually crawl the web pages, even against the rules set in robots.txt (which tell crawlers what they may not read), and on exit they leave a similar spam record in the analytics report.
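To see why a ghost hit never touches the website, it helps to look at what a Measurement Protocol pageview consists of. The sketch below only builds the URL-encoded parameters of such a hit and sends nothing. The parameter names (v, tid, cid, t, dh, dp, dr) are the Universal Analytics ones as far as I know; the function name and all the values are made up for illustration.

```python
from urllib.parse import urlencode

def build_pageview_payload(tracking_id, client_id, hostname, path, referrer):
    """Return the URL-encoded parameters of a Measurement Protocol pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # tracking ID of the property, e.g. UA-XXXXX-1
        "cid": client_id,    # anonymous client ID
        "t": "pageview",     # hit type
        "dh": hostname,      # document hostname, chosen freely by the sender
        "dp": path,          # document path
        "dr": referrer,      # document referrer, chosen freely by the sender
    }
    return urlencode(params)

# Hypothetical values: the sender does not need to know the real hostname of the site.
payload = build_pageview_payload(
    "UA-XXXXX-1", "555", "fake-host.example", "/", "http://spam.example/"
)
print(payload)
```

Because the sender fills in the hostname and the referrer itself, and usually does not know which website a randomly generated tracking ID belongs to, the hostname in ghost hits tends to be wrong or missing.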

How to stop this spam traffic in the analytics report?
Most people try to stop spam traffic by:

  • Blocking ghost spam from the .htaccess file: As already discussed, ghost spam never visits the website, so adding lines to the .htaccess file to block it won’t help.
  • Using the referral exclusion list to stop spam: Let’s take an example to understand what the referral exclusion list is for. Consider a shopping site where a customer who buys a product is redirected to a third-party website for payment. After a successful payment, the customer is redirected back to the shopping site, and analytics records this as new referral traffic. Here the referral exclusion list helps to prevent that referral from being recorded in the analytics report.
    But using the referral exclusion list to manage spam traffic is not a good idea: the referral part is stripped and, unless there is a pre-existing campaign record, the hit is recorded as a direct visit. This leads to a bigger problem, because the spam is still in the report and it becomes harder to track genuine direct visits.

Filters to Stop Google Analytics Spamming: There are two types of filters available to stop Google Analytics spam:

  1. Valid Hostname Filter: In this spam filter, a regular expression of valid hostnames is entered in the filter pattern field to deal with ghost spam (referral, organic or fake direct visits).
  2. Crawler Spam Filter: In this spam filter, a regular expression of known crawler spam sources is applied to the campaign source to exclude all hits from known bots and spiders.

Steps to filter spam referrers:

  • Click the Reporting tab in Google Analytics and select a range of several months in the GA calendar.
  • Select the Audience tab in the sidebar.
  • Expand the Technology section and select Network.
  • Select the Hostname tab at the top of the report.
  • Now copy all the valid hostnames from the list.

Valid hostnames: your domain name, subdomains, Google, shopping referrers, translation services and more.
Invalid hostnames: random characters, weird URLs, (not set) and even some well-known names such as amazon.com (spammers use these names to mislead people).

Valid Hostname Filter: Since referrer spam is growing with every passing day, it is essential to protect Google Analytics stats by applying an additional filter. So you need to create a filter that includes only traffic with the valid hostnames selected from the Hostname report.

After gathering all valid hostnames, you need to create an expression that includes all valid hostnames as shown below:

yourdomainname.com|blog.yourdomainname.com|translatingservice.com|webcacheservice.com|videoservice.com|shoppingcart.com

The maximum length of this regular expression is 255 characters.
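Building the expression by hand is error-prone, so here is a minimal Python sketch that joins a list of hostnames with "|", escapes the dots so they match literally, and checks the 255-character limit. The function names are made up for illustration, the hostnames are the placeholder ones from the expression above, and the check with re.search assumes the filter matches the pattern anywhere in the hostname.

```python
import re

MAX_PATTERN_LENGTH = 255  # limit for the filter pattern, as stated above

def build_hostname_pattern(hostnames):
    """Join hostnames into one alternation, escaping dots so they match literally."""
    pattern = "|".join(h.replace(".", r"\.") for h in hostnames)
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError(
            f"pattern is {len(pattern)} characters, limit is {MAX_PATTERN_LENGTH}"
        )
    return pattern

def is_valid_hostname(pattern, hostname):
    """Mimic an include filter: keep the hit only if the hostname matches."""
    return re.search(pattern, hostname) is not None

# Placeholder hostnames; replace them with the ones copied from the Hostname report.
valid_hostnames = ["yourdomainname.com", "blog.yourdomainname.com", "translatingservice.com"]
hostname_pattern = build_hostname_pattern(valid_hostnames)
print(hostname_pattern)

for host in ["blog.yourdomainname.com", "amazon.com", "(not set)"]:
    print(host, is_valid_hostname(hostname_pattern, host))
```

Escaping the dots is optional in the sense that an unescaped dot still matches a literal dot, but it keeps the pattern from matching unintended hostnames. If the pattern grows past the limit, one option is to split the hostnames over more than one expression.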

Once the regular expression is created, go to the Admin tab and select the view where the filter has to be applied.

  1. Click the Filters option in the sidebar.
  2. Click Add Filter.
  3. Enter a filter name, e.g. valid hostnames.
  4. Filter type: select Custom.
  5. Choose Include and select Hostname from the dropdown.
  6. Paste the valid hostname regular expression in the filter pattern field.
    Select “Verify this filter” to see a table showing sample data before and after applying the filter.
  7. After verifying that no valid traffic is excluded, save the filter.

Invalid Hostnames Filter: There is another way to do it. You can create a regular expression of invalid hostnames (the known spam referrers) and exclude them by following these steps:

Go to your Google Analytics account and select the Admin tab.

  1. Click the Filters option in the sidebar.
  2. Click Add Filter.
  3. Enter a filter name, e.g. Crawler Spam Filter.
  4. Filter type: select Custom.
  5. Choose Exclude and select Campaign Source from the dropdown.
  6. Now paste the regular expression in the filter pattern field. Select “Verify this filter” to see a table showing sample data before and after applying the filter. Verification uses data from the previous 7 days (excluding the day on which the filter is applied), so if no spam data appeared in those 7 days, the verification will have nothing to show. If the instructions were followed properly, the filter will still work.
  7. After verifying successfully, save the filter.
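For the exclude filter, the regular expression can be assembled from a list of known spam referrers. The Python sketch below uses the four spam domains named at the start of this post. The function names are made up for illustration, and the check again assumes the filter matches the pattern anywhere in the campaign source.

```python
import re

def build_source_exclude_pattern(spam_domains):
    """Join spam domains into one alternation with literal dots."""
    return "|".join(d.replace(".", r"\.") for d in spam_domains)

# Spam referrers mentioned earlier in this post; extend the list with your own.
SPAM_REFERRERS = [
    "boost-my-site.com",
    "copyrightclaims.org",
    "free-floating-buttons.com",
    "wordpress-crew.net",
]
crawler_pattern = build_source_exclude_pattern(SPAM_REFERRERS)

def keep_hit(campaign_source):
    """Mimic an exclude filter: drop the hit if the source matches the pattern."""
    return re.search(crawler_pattern, campaign_source) is None

print(crawler_pattern)
for source in ["google", "boost-my-site.com", "wordpress-crew.net"]:
    print(source, keep_hit(source))
```

The same 255-character limit applies to this pattern, so a long list of spam referrers may need to be spread over several exclude filters.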
Posted by: Amit Verma / In: SEO