Setup
- Download and install GraphViz. There's an RPM for Linux...
- Download and install StatViz in a directory. It's basically one PHP file. The README file will tell you how to customize the configuration file and run it.
- I don't have too many PHP apps running, so there are a couple of other things you may need to do. First, you'll need PEAR:Config. Once you have it, uncompress/untar it; the easiest thing to do is move Config.php and the Config dir to /usr/share/pear. Second, StatViz takes up a lot of memory, so you may need to increase the memory_limit configuration parameter in your /etc/php.ini.
Basic Running
You can run it using
./statviz.php --config configfile
and then create a gif file of the output by doing something like
dot -Tgif -oOutputGifFileName
If you put the output gif file in
Things To Look For
There are a number of things you'll need to consider if you want accurate results:
- Make sure you look at the bot extensions and make a best attempt to get bot traffic filtered out.
- Make sure you have all non-pages (graphics, js, css) filtered out.
- If possible, try to filter out requests from internal users. StatViz doesn't have a filter for this, so I just scrubbed them out of the logs myself using grep -v.
- If your site has long URLs, you will most certainly want to clean them up before processing. The tool allows you to create an alias file, but you may need/want to do some log scrubbing on your own.
- Play around with the GraphNReferrerPairs parameter. You can get a lot more detail on site activity with higher numbers, but the graph then becomes a lot more complex to digest. If you decide on a large graph, you may need to modify the source and change the size of the graph. It defaults to 10, 8 and there isn't a parameter to configure this. I changed it to 20, 16 for most of my small graphs (GraphNReferrerPairs <>) and to 40, 32 for larger graphs.
- Very long URLs are going to be a hassle, especially if they come from external referrers and are out of your control. I put some checks in the code to clip the very long URLs.
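The log scrubbing described in this list can be sketched as a small pre-processing script that runs before StatViz sees the logs. This is only a rough illustration in Python, not part of StatViz and not the checks I added to its source. It assumes Apache combined-format logs, and the extension list, bot keywords, internal address prefixes, and 80-character clip length are all made-up placeholders you would adjust for your own site.

```python
import re

# All of these values are illustrative assumptions, not StatViz settings.
STATIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico", ".js", ".css")
BOT_KEYWORDS = ("bot", "crawler", "spider", "slurp")
INTERNAL_PREFIXES = ("10.", "192.168.")
MAX_URL_LENGTH = 80

# Assumed Apache combined log format:
# host ident user [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def keep_line(line):
    """Return True if the line looks like a page view from an external, non-bot visitor."""
    match = LOG_LINE.match(line)
    if match is None:
        return False  # drop lines that do not parse
    path = match.group("url").split("?", 1)[0].lower()
    if path.endswith(STATIC_EXTENSIONS):
        return False  # graphics, js, css
    if match.group("host").startswith(INTERNAL_PREFIXES):
        return False  # internal users
    agent = match.group("agent").lower()
    return not any(keyword in agent for keyword in BOT_KEYWORDS)


def clip_url(url, limit=MAX_URL_LENGTH):
    """Truncate a URL to at most limit characters."""
    return url if len(url) <= limit else url[:limit]


def scrub(lines):
    """Yield the kept lines with the request URL and referrer clipped."""
    for line in lines:
        if not keep_line(line):
            continue
        match = LOG_LINE.match(line)
        # Rewrite the later field first so the earlier field's offsets stay valid.
        for field in ("referrer", "url"):
            start, end = match.span(field)
            line = line[:start] + clip_url(line[start:end]) + line[end:]
        yield line
```

Feeding an open log file to scrub() and writing the result to a new file gives you a cleaned log to point the StatViz config at. The grep -v approach does the same job for a single pattern; a script like this just keeps all the rules in one place.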
Automating
I've automated a couple of things on my site:
- A report that updates hourly on today's activity.
- I archive a daily gif file. (I will add weekly and monthly in the future).
- I have a 'full report' that shows activity for the last 30 days. I update this daily.
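A scheduled job for reports like these could be driven by a small wrapper script. The sketch below is a guess at one way to do it in Python, not the setup actually running on my site. It assumes the StatViz output has already been saved to a dot file, and the file names, the archive directory, and the helper names are all hypothetical. Only the dot -Tgif -o invocation comes from the steps above.

```python
import datetime
import pathlib
import subprocess


def archive_path(archive_dir, day):
    """Build a dated file name such as archive/statviz-2008-10-10.gif (hypothetical naming)."""
    return pathlib.Path(archive_dir) / f"statviz-{day.isoformat()}.gif"


def dot_command(dot_file, gif_file):
    """Return the GraphViz command line that renders dot_file to a gif."""
    return ["dot", "-Tgif", f"-o{gif_file}", str(dot_file)]


def render_daily(dot_file, archive_dir, day=None):
    """Render the graph into the archive directory and return the gif's path."""
    day = day or datetime.date.today()
    target = archive_path(archive_dir, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises if dot exits with an error, so a failed run is not silent.
    subprocess.run(dot_command(dot_file, target), check=True)
    return target
```

Run once a day from cron, render_daily("site.dot", "archive") would leave one gif per day in the archive directory. An hourly report could reuse dot_command with a fixed output name so the same file is overwritten each time.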
I'll put out another entry with a quick 101 on interpreting the results.