I came across a surprising issue with Google Analytics yesterday when a client contacted me: page hits were apparently being logged for dozens of pages that don't exist on his website. While I considered the possibility of a PHP vulnerability or some other security issue on the server, it occurred to me that these pages had to exist somewhere for the Google Analytics tracking code to be executed.

I started doing a little detective work by Googling the file names of some of the nonexistent pages that showed up in the stats. Soon enough I found one obscure enough to return only a few search results.

When I checked the source code of that page, the problem was obvious: the website was using the same Google Analytics tracking ID as my client's website. It didn't appear to be anything malicious; maybe they made a typo, or some HTML code was copied from my client's site and reused without anyone realizing the tracking code was still in there.

The consequence, though, is some completely messed up statistics. I am baffled that Google doesn't do a 404 check on pages that get tracked, or even a referrer check to see whether the domain corresponds with the domain the tracking ID was registered to.
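To make the kind of domain check meant here concrete, below is a minimal Python sketch. It is only an illustration and says nothing about how Google Analytics works internally: the function name `hit_matches_domain`, the sample URLs, and the idea that each hit arrives with its full page URL are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def hit_matches_domain(page_url: str, registered_domain: str) -> bool:
    """Return True if the page's host is the registered domain or a subdomain of it.

    Hypothetical helper: both the name and the matching rule are assumptions.
    """
    # urlsplit().hostname is lowercased, or None when the URL has no host part.
    host = urlsplit(page_url).hostname or ""
    domain = registered_domain.lower().rstrip(".")
    # Require an exact match or a match on a dot boundary, so that
    # "notexample.com" is not mistaken for "example.com".
    return host == domain or host.endswith("." + domain)


# Made-up sample hits: two from the registered site, one from a site
# that reused the same tracking ID.
hits = [
    "https://www.example.com/contact.html",
    "https://copycat.example.net/contact.html",
    "https://example.com/about.html",
]

own_hits = [url for url in hits if hit_matches_domain(url, "example.com")]
print(own_hits)
```

Matching on the hostname rather than the path is the point of the sketch: two sites can share a path such as `/contact.html`, but the host part of the URL still tells them apart.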

It leaves my client having to filter out dozens of nonexistent pages. Even worse are pages that have the same path and filename on both sites: for those there seems to be no clear way of figuring out which hits came from one website and which from the other.

 
In my honest opinion this is a pretty serious issue that needs to be addressed. If nothing else, it leaves Google Analytics open to unscrupulous characters spamming your visitor stats.
 

Posted
Author: Peter
Categories: General