Linux's Tell-Tale Heart, Part 5

by Marcel Gagné

Welcome back, everyone, to the SysAdmin's Corner, where we attempt, week after week, to de-mystify the mysteries of your Linux system. Before I dive into this week's topic (which is bright and shiny with colorful charts), allow me a moment to cover some common ground. Every once in a while, a few newcomers to the Corner send me e-mails asking for links to past articles in this (or other) series. First, I bid you welcome, one and all.

There are two ways to find past articles. One is to scroll down to the bottom of the "News and Information" column and click on the "More Articles" link. The "Corner" can be located by following the "System Administration" link. You can also visit my web site at Salmar.com. This link (https://www.salmar.com/marcel/ljwritings.html) will take you right to a page of previous articles, with brief descriptions of what each article covered. Enough administrivia - on to the good stuff.

Last week, I introduced a web server log analyzer called Analog. At the end of the article, I promised you more flash and pizzazz when next we met. Today, I keep my promise. Analog is fast, highly configurable, and clean in its output. If you are like me, though, you can be (and occasionally are) seduced by colorful charts and diagrams for representing information. In addition to Analog, I use another web server analysis tool regularly, a little something called Webalizer.

For a peek at the kind of output you can expect from Webalizer, have a look at this sample from the official web site. In an instant, you can see how the access to your site has changed over recent months with charts displaying weekly and even hourly averages, based on pages, files and the number of hits. Even if you personally are not moved by Webalizer's output, your management will be. (They love that sort of thing. Trust me.) Best of all, Webalizer is free and distributed under the GNU General Public License.

The official web site at https://www.mrunix.net/webalizer/ is fine for information, but I have found it tends to get bogged down with download requests. If you get refused because of too many FTP transfers, try the European mirror (in Austria) at https://webalizer.dexa.org/.

You'll find a 2.x development version of the Webalizer code out there as well, but for the sake of this article, I'm going to deal with the current (stable) 1.3x release. Those of you who are dealing with multiple platforms will be happy to learn that the Webalizer works on a variety of systems including Solaris, BSD, MacOS and a number of others. Binaries are available for some platforms, but if you don't see yourself listed there, fear not; building from source is easy.

Okay. Side trip. There is just one possible catch; it goes like this. Webalizer uses the GD graphics library to build those "oh-so-cool" pseudo-GIF inline graphics, and the compression algorithm for the GIF file format is subject to patent restrictions that lately have generated much controversy. Part of what this means is that the GD library may or may not be included in your Linux distribution. The Webalizer download page does provide a link, should you need it. I built my Webalizer binaries on a Red Hat 6.2 system which still provided the libraries; odds are that for now, your system will likely have them as well. Those who would like more details on this whole GIF patent issue may visit the GNU web site (don't forget to come back).

Let us continue. The latest 1.3x version turned out to be webalizer-1.30-04-src.tgz. The steps are easy. You extract the source, change directory to the resultant distribution directory, and compile. Here's how I did it on my system. By the way, since I try out so many different packages, I keep a build directory in a consistent place, which makes cleaning up later on much easier. If you also experiment a lot, you might find this approach helpful as well. Enough talk; time to build.

     cd /data1/builds
     tar -xzvf webalizer-1.30-04-src.tgz
     cd webalizer-1.30-04
     ./configure
     make
     make install
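
If you would rather not type the distribution directory name by hand, you can ask the tarball what it unpacks into. What follows is only a sketch: the helper name tarball_topdir is my own invention (it is not part of Webalizer), and it assumes every entry in the archive begins with the same top-level directory name, which is the usual layout for source tarballs.

```shell
#!/bin/sh
# tarball_topdir FILE
# Print the top-level directory that a gzipped tarball unpacks into.
# Hypothetical helper, not part of Webalizer. Assumes archive entries
# look like "name-version/..." rather than "./name-version/...".
tarball_topdir() {
    tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Example use (left as comments so that sourcing this file builds nothing):
#   cd /data1/builds
#   tar -xzf webalizer-1.30-04-src.tgz
#   cd "$(tarball_topdir webalizer-1.30-04-src.tgz)"
#   ./configure && make
```

The "&&" between configure and make simply stops the build early if the configure step fails.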

You can do all these steps with whatever userid you wish, but the "make install" portion must be done as root. Now that you have the Webalizer installed, you will be making some configuration changes. (Surprised?) Webalizer needs to know where your log files are kept, as well as where and how reports should be generated and stored. Trust me; it's easier than it sounds.

You might have noticed that the install step created an /etc/webalizer.conf.sample file. The default location for the configuration file is /etc/webalizer.conf, and I would recommend you leave it at that. First, make a copy of the sample file and edit the copy. Let's pretend I am going to use pico as my editor in this case.

     cp /etc/webalizer.conf.sample /etc/webalizer.conf
     pico /etc/webalizer.conf

All I did was change four lines in this file, which were as follows:

     LogFile        /etc/httpd/logs/access_log
     OutputDir      /home/httpd/html/usage/
     Incremental    yes
     HostName       myhost.mydomain.com

The LogFile parameter identifies the location of your access_log file. The path above is typical of a Red Hat installation. Last week, I gave you some tips on finding the location of that file on any system, but you'll recall that the most likely location is /usr/local/apache/logs/access_log if you built Apache from scratch.

The next parameter, OutputDir, defines the location for the Webalizer reports. I created a directory called "usage" on my web server under my document root. Again, the path above implies a Red Hat system. Next in line is Incremental yes. This is a useful little parameter (and I do recommend it) if your log files rotate more than once a month. It lets the Webalizer continue from where it left off, even if that means starting on a brand-new log (remember those ".1", ".2", etc. files created by logrotate?). The last parameter, HostName, may not be necessary. It depends on whether your system returns the proper host name (which gets put at the top of the report). If you are running a number of virtual domains (as I do on my web server), you should pick one for the report headers. Read the configuration file yourself, and decide whether you want to change anything else. In my case, those were the only changes.
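
Before the first run, it can be worth confirming that the two paths in the configuration file actually exist. Here is a minimal sketch of such a check. The function names conf_value and check_conf are hypothetical (nothing here ships with Webalizer), and the parsing assumes each keyword is spelled exactly as shown above, sits at the start of an uncommented line, and has a value with no spaces in it.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Webalizer) for checking webalizer.conf.

# conf_value FILE KEYWORD
# Print the first value that follows KEYWORD on an uncommented line.
# Assumes the value contains no spaces, which holds for ordinary paths.
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_conf FILE
# Return 0 if LogFile is readable and OutputDir is a writable directory;
# otherwise print the problem on standard error and return 1.
check_conf() {
    log=$(conf_value "$1" LogFile)
    out=$(conf_value "$1" OutputDir)
    if [ ! -r "$log" ]; then
        echo "cannot read log file: $log" >&2
        return 1
    fi
    if [ ! -d "$out" ] || [ ! -w "$out" ]; then
        echo "output directory missing or not writable: $out" >&2
        return 1
    fi
    return 0
}

# Example use:
#   check_conf /etc/webalizer.conf && echo "configuration paths look fine"
```

If the output directory turns out to be missing, "mkdir -p" followed by the OutputDir path will create it.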

The last thing left to do is run Webalizer. You do that simply by typing this command:

     /usr/local/bin/webalizer

While I did not find Webalizer as quick as Analog (it processed my 800,000-line access_log file in 2 minutes and 55 seconds, compared to Analog's 1 minute and 39 seconds), it is still pretty fast. The data generated from Webalizer is cumulative. That means you can run Webalizer whenever it suits you, and it will just build on what it already has. If you want to do daily updates, you will certainly want to put this command in a crontab entry. Start by opening your crontab for editing (the "-e" flag, of course, means "edit"):

     crontab -e

Once in the editor, I add the following line, which will run Webalizer every day at 6 o'clock in the morning:

     0 6 * * * /usr/local/bin/webalizer 1> /dev/null 2> /dev/null
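
Sending everything to /dev/null keeps cron from mailing you every morning, but it also hides any error messages. If you would rather keep a record, one option is a small wrapper that appends the output and exit status of each run to a file. This is only a sketch: the function name run_logged, the wrapper script name and the log file path in the comments are all hypothetical choices of mine.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND, appending a timestamp, its output (stdout and stderr)
# and its exit status to LOGFILE. Returns COMMAND's exit status.
# Hypothetical helper, not part of Webalizer or cron.
run_logged() {
    logfile=$1
    shift
    {
        echo "== $(date): $*"
        "$@"
        status=$?
        echo "== exit status $status"
    } >> "$logfile" 2>&1
    return "$status"
}

# Example use inside a wrapper script, say /usr/local/bin/webalizer-cron.sh
# (a made-up name), which the crontab line would call instead:
#   run_logged /var/log/webalizer-cron.log /usr/local/bin/webalizer
```

A log written this way grows forever, so if you adopt the idea, remember to include the file in your log rotation.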

Whether you decide on Analog or Webalizer is a matter of personal taste. In terms of sheer "flash" and bright, shiny colors, Webalizer is definitely ahead.

Just like that - another week is over. When next we convene here on the Corner, we will look into even darker parts of your system, areas that speak of things which confound, yet say so much about every system. I'll turn on the lights and show you around. In the meantime, remember that your Linux system is talking to you. Are you listening?
