For a flexible reporting tool that can yield information ranging from CPU use to the top I/O-consuming process, look no further than dstat.
Well, maybe not "all-seeing", but it made for a catchy header. Though dstat does come pretty close when it comes to observing what's happening on your system.
You can use dstat to look at a lot of information at once, and you get to choose the information displayed. If you want to see disk input/output (I/O), swap use, the top I/O-using process, and CPU use all together, dstat will do that.
In this article we'll mainly look at using dstat to check on I/O and swap, but it shouldn't be hard to combine what we cover here with the options listed on the dstat man page to exploit more of its features.
Your machine will need Python 2.0 or higher to run the latest dstat.
Check your Python version by running "python -V":
$ python -V
Python 2.5.2
The dstat options to check I/O use on a per-process basis require kernel version 2.6.20 or higher. There is one exception: Red Hat backported the I/O stat reporting to their kernel version 2.6.18-144.el5, so if you are running that kernel or later on RHEL you can use that functionality in dstat. You can still run dstat without meeting that kernel version requirement; you just won't be able to get per-process I/O reporting.
Check your kernel version by running "uname -r":
$ uname -r
126.96.36.199-rscloud
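If you'd rather have a script make that comparison for you, here is a minimal sketch. The function names version_ge and has_proc_io are made up for this example, and it assumes your "sort" supports the -V (version sort) option, as GNU coreutils does. A plain version comparison won't recognize the Red Hat backport, so the sketch also looks for /proc/self/io, on the assumption that the presence of that file is a reasonable sign the kernel tracks per-process I/O.

```shell
# version_ge A B: succeed when version A is at least version B.
# Assumes "sort -V" (version sort) is available, as in GNU coreutils.
version_ge() {
    # Strip any suffix after the numeric part: "2.6.18-144.el5" -> "2.6.18"
    a=$(printf '%s' "$1" | sed 's/[^0-9.].*$//')
    b=$(printf '%s' "$2" | sed 's/[^0-9.].*$//')
    # If B sorts first (or the two are equal), then A >= B
    lowest=$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)
    [ "$lowest" = "$b" ]
}

# has_proc_io: succeed when the kernel exposes per-process I/O counters.
# Assumption: a readable /proc/self/io means I/O accounting is enabled.
has_proc_io() {
    [ -r /proc/self/io ]
}

# Example:
#   if version_ge "$(uname -r)" 2.6.20 || has_proc_io; then
#       echo "per-process I/O reporting should be available"
#   fi
```

Treat the result as a hint rather than a guarantee; the real test is whether dstat's per-process options report anything on your system.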
You can install dstat either from a source package obtained from the project's web site or from a precompiled package for your distribution. The source package is probably the best approach, since then you know you have the program with the latest options.
This guide assumes you're using dstat 0.7.0 or later.
To install dstat from source you'll first need to download the latest source tarball from its web site. Look for a link on that page to the latest version, which at the time of this writing is 0.7.2.
Once you have the link to the source package you just need to get the tarball onto your system. You can download it to your desktop then upload it with scp or sftp, or you can use wget directly from your box:
Next unpack the tarball:
tar -xvjf dstat-0.7.2.tar.bz2
And finally, run the install script to put dstat's files in the right locations on your system:
cd dstat-0.7.2/
sudo make install
Now you should be able to type "dstat" and see the default statistics report.
From package manager
Check the package version available in your distribution's repository by selecting the appropriate command:
aptitude show dstat
yum info dstat
emerge --search dstat
pacman -Ss dstat
At the time of this writing Arch doesn't have a dstat package in their repository, but the command is included in case they add it at a later date.
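If you look after a mix of distributions, a small helper can pick the right command for you. This is only a sketch: the function names dstat_query_cmd and detect_pkg_manager are invented for the example, and it simply checks which of the four package managers listed above is on your PATH.

```shell
# dstat_query_cmd MANAGER: print the version-check command listed above
# for the given package manager; fail for a manager we don't know.
dstat_query_cmd() {
    case "$1" in
        aptitude) echo "aptitude show dstat" ;;
        yum)      echo "yum info dstat" ;;
        emerge)   echo "emerge --search dstat" ;;
        pacman)   echo "pacman -Ss dstat" ;;
        *)        return 1 ;;
    esac
}

# detect_pkg_manager: print the first supported manager found on PATH.
detect_pkg_manager() {
    for pm in aptitude yum emerge pacman; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Example:
#   pm=$(detect_pkg_manager) && eval "$(dstat_query_cmd "$pm")"
```

The order in the loop is arbitrary; if a machine has more than one of these tools installed, rearrange the list so your preferred one comes first.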
This guide assumes you're running dstat 0.7.0 or higher. If the package info shows that its version is older than that you might want to grab the source package instead. The options syntax and module references changed with 0.7.0, so there's a fair difference between that and earlier versions.
CentOS and RHEL
If you're running CentOS or Red Hat you can download an RPM package of the latest version of dstat for your distribution version from the author's RPM repository. The easiest way to get it is to visit that site and scroll to the bottom of the list of files to find the latest version. Then get that onto your system, perhaps with wget:
With the package on your system you can install it by running yum:
sudo yum --nogpgcheck install dstat-0.7.2-1.el5.rf.noarch.rpm
A look at the dstat man page will show you a large array of options that can be sent to dstat to control its data reporting. The order in which these options are sent to dstat will determine the order of those stats in the report. For example, if you run "dstat" by itself, dstat actually runs as if it was launched with a specific set of options:
dstat -cdngy
There are too many options to discuss here, so the best way to learn about them is to just try them out. The man page does a good job of explaining them. Be sure to switch the order of options around so you can see how that looks too.
By default dstat reports new figures every second. To quit dstat hit "Ctrl-C".
Note that the first line of the report is typically going to show nothing for all stats. That's because dstat operates by summarizing what it saw since its last report. When it first runs there's no data to average or sum.
You can control the delay between reports and the number of reports dstat will output in a run by passing those figures at the end of the dstat list of options. If you want dstat to run with default stats, wait 5 seconds between reports, and only report 3 results, you would run:
dstat 5 3
A similar command uses the same delay and report count but mixes the options up a bit (like the default, but omitting the "system" stats and reporting swap use in their place):
dstat -cdngs 5 3
If you run "dstat" by itself you'll see a list of default statistics that look something like this:
The default stats reported are:
CPU stats: What percentage of the CPU is in use. The more interesting sections of this report are user, system, and idle, which should break down most of the current CPU use. If you see high CPU use in the "wait" column there might be a problem elsewhere in the system. When a CPU "waits" it's because it's expecting a response from an I/O device (like memory, disk, or network) and hasn't received it yet.
Disk stats: Read and write activity to disks.
Net stats: Data sent and received on network devices.
Paging stats: Paging activity on the system. Paging refers to a memory management technique used behind the scenes on your system. A high level of paging can indicate that the system is using a lot of swap space, or it could mean that memory is very fragmented.
System stats: This shows interrupts and context switches. These stats are usually only useful if you have a baseline to compare them to. Higher stats in these columns usually indicate a large number of processes jostling for the CPU's attention. Since your server is likely running many processes by default, there will always be some numbers there.
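To make the "wait" figure a little more concrete, here is a rough sketch of how a percentage like that can be worked out from the kernel's own counters in /proc/stat. This is an illustration, not dstat's actual code, and the function name cpu_wait_pct is made up for the example. It takes two samples of the aggregate "cpu" line and reports what share of the elapsed CPU time went to iowait, which also shows why a first report has nothing to say: there is no earlier sample to compare against.

```shell
# cpu_wait_pct "LINE1" "LINE2": print the percentage of CPU time spent
# in iowait between two "cpu ..." lines sampled from /proc/stat.
cpu_wait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9 are user, nice, system, idle, iowait, irq,
            # softirq and steal; later fields are skipped because guest
            # time is already counted in user and nice.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6          # field 6 is iowait
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%d\n", 100 * (w[2] - w[1]) / d
            else print 0
        }'
}

# Example:
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   cpu_wait_pct "$a" "$b"
```

The numbers won't necessarily match dstat's column exactly, since the sampling moments differ, but they should move in the same direction.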
Checking I/O stats
To view the swap information and top I/O-using processes for your system it's just a matter of combining the right options when you run dstat. You might use:
dstat -s --top-io --top-bio
That will display swap use, the top I/O-using process, and the top disk I/O-using process (in that order). That's usually enough to let you track down the source of an I/O or swap-related problem.
If you have a wider terminal window you can have the report include the pid (process ID) of the top I/O-using processes:
dstat -s --top-io-adv --top-bio-adv
Another program that can be used to check I/O is iotop. It has a better presentation than dstat if you're just looking for swap and I/O activity, but it requires a newer version of Python (2.5). It's also not as versatile in terms of what it can report.
The "sysstat" package includes several commands useful for gathering system usage statistics, and newer versions of the package include a program named "pidstat". When run with the "-d" option pidstat will display disk I/O information on a per-process basis.
The dstat command can be useful when trying to track down the cause of excessive swapping or disk use on your system, but it can check on much more than that. The customizability of the output can make it a lot easier to assemble the data pertinent to a problem instead of having to skim through unrelated details.
If you want to delve further into dstat, looking at scripting it and using additional modules, continue to the next article in this series.
-- Jered