Goodbye Webalizer, Hello Google Analytics!

Google Analytics was announced today, and it’s a big step forward for webmasters and web metrics.

Urchin, which Google bought earlier this year and used as the foundation for Analytics, was a nice product. It was a little expensive, but its reports were definitely a step ahead of the competition. Installation was hideous, though: the application was web-based, and the OS X installer came with an entire Apache server. But because it was web-based, it was an obvious choice for Google to use as the basis for a hosted service.

Traditionally, your web reporting software needs access to your web server access logs. Webalizer and Analog, for example, are command-line programs that run on a schedule on the server where your web logs are stored. Urchin ran as a daemon process and could either process local logs or fetch logs via FTP or HTTP from a remote server, all on a schedule that you configured through the administrative interface. This made Urchin a little more flexible, but in practice the log transfers were not very reliable, and it was difficult to recover from failures.
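
To make the traditional approach concrete, here is a minimal sketch of the kind of work a log analyzer does: read access log lines and count successful hits per page. It assumes the logs are in Common Log Format; the `count_hits` function and the sample lines are made up for illustration and are not taken from Webalizer, Analog, or Urchin.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
# (Assumption: the server writes this format; real logs may differ.)
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def count_hits(lines):
    """Count successful (status 200) requests per path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits

# Hypothetical sample data, standing in for a real access log file.
sample = [
    '192.0.2.1 - - [14/Nov/2005:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.2 - - [14/Nov/2005:10:00:05 -0800] "GET /about.html HTTP/1.1" 200 2048',
    '192.0.2.1 - - [14/Nov/2005:10:01:00 -0800] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.3 - - [14/Nov/2005:10:02:00 -0800] "GET /missing.html HTTP/1.1" 404 312',
]

for path, count in count_hits(sample).most_common():
    print(path, count)
```

A real tool would run something like this from cron against the whole log file, which is exactly the scheduling and log-handling burden described above.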

Google Analytics solves the web log problem by not using web logs at all. Instead, you insert little chunks of JavaScript into your site pages, and Google tracks your web site traffic for you. This is good for a few reasons. You no longer have to worry about your web logs. You don’t care about their format, location, rotation schedule, or anything else about your hosting environment. You also don’t have to schedule any log processing on your server or other computers, and you don’t need to waste your CPU and I/O doing it in big chunks. Instead you get real-time stats, with each hit on your site immediately being registered by Google. And those stats can now contain all sorts of extra info, like screen size and window location. The information was always available in the browser, but collecting it with standard hosting tools was not easy.
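
The general idea behind script-based tracking can be sketched in a few lines: the page reports each view as a request to a collector, with the extra details packed into the query string, and the collector records the hit as it arrives. This is only an illustration of the technique, not Google's actual protocol. The collector URL, the parameter names (`page`, `sw`, `sh`), and both functions are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_beacon_url(collector, page, screen_width, screen_height):
    """Build the URL a page script would request to report one page view.

    The parameter names are invented for this sketch.
    """
    query = urlencode({"page": page, "sw": screen_width, "sh": screen_height})
    return f"{collector}?{query}"

def record_hit(beacon_url, stats):
    """Collector side: pull the details out of the query string and store them."""
    params = parse_qs(urlsplit(beacon_url).query)
    page = params["page"][0]
    screen = f'{params["sw"][0]}x{params["sh"][0]}'
    stats.setdefault(page, []).append(screen)
    return stats

# Hypothetical collector address; no request is actually sent here.
url = build_beacon_url("https://collector.example/hit", "/index.html", 1024, 768)
stats = record_hit(url, {})
print(stats)
```

Because every hit is recorded the moment it arrives, there is no batch step to schedule, and details like screen size come along for free.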

Of course, who but Google has a distributed network of web servers capable of counting, in real time, all the traffic on the web? No one, and that’s why they did it. That, and it’s a great tie-in to their AdWords program, helping web sites generate even more ad sales for Google.