Update 2009-02-25: Someone noted that e2fsprogs includes the program filefrag, which "reports on how badly fragmented a particular file might be". As that program is actively maintained and written by the people behind the ext2/ext3/ext4 filesystems, it is better to use it instead of the script here.
Update 2008-10-21: The script crashed with a divide-by-zero error if there were no contiguous or no non-contiguous files. This is now fixed. Thanks for the patch!
Update 2007-12-19: The configurator in the script was missing one include, which prevented it from working on Red Hat systems. The fibmap.pl script has now been updated.
Measuring fragmentation of ext3 in Linux
It has often been said that ext3 and other filesystems in Linux do not suffer from fragmentation and that defragmentation is therefore not needed. However, one system I am using was becoming slower and slower all the time (for example, when doing apt-get update/upgrade), and I suspected heavy fragmentation of the root partition. Then I stumbled on the next problem: apparently no one has even bothered to make a tool for measuring the fragmentation level of filesystems in Linux, apart from the single percentage value reported by e2fsck. Of course, when the general consensus is that fragmentation is not an issue, why make a tool for measuring it...
I found one program, fibmap. It works (compiling with GCC-4.1 required a couple of fixes), but the output is a bit too cryptic for a random person (in this case, me). The most useful value the Windows defragmentation programs (for example Diskeeper) report is IMHO the number of non-contiguous blocks per file. For example, a file with 1000 non-contiguous blocks means that the hard drive will have to make 1000 seeks when reading the file, and if that file is, say, 50MB in size, the time wasted on seeking instead of reading will be very significant. If a filesystem contains many files with hundreds or thousands of non-contiguous blocks, fragmentation is quite bad and will affect performance. Instead of changing fibmap to output the values I wanted, I decided to kludge together a Perl script. The script is based on this example.
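As a back-of-the-envelope check of the seek cost mentioned above (this is not part of fibmap.pl), here is a small Python sketch. The 10 ms average seek time and the 50 MB/s sequential read speed are assumed figures for a typical desktop hard drive, not measurements from my system, and the function name is made up for illustration.

```python
def read_time_seconds(size_mb, fragments, seek_ms=10.0, throughput_mb_s=50.0):
    """Split the time to read a file into seeking and transferring.

    seek_ms and throughput_mb_s are assumed, illustrative drive figures.
    One seek is charged per non-contiguous block (fragment).
    """
    seek_time = fragments * seek_ms / 1000.0
    transfer_time = size_mb / throughput_mb_s
    return seek_time, transfer_time


# The example from the text: a 50MB file in 1000 non-contiguous blocks.
seek, transfer = read_time_seconds(50, 1000)
print(f"seeking {seek:.1f} s, transferring {transfer:.1f} s")
# prints: seeking 10.0 s, transferring 1.0 s
```

Under these assumptions the drive would spend about ten times longer seeking than actually reading data. A real drive reorders requests and caches, so treat this only as an order-of-magnitude estimate.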
The script can be downloaded here: fibmap.pl. The script must be run as root. It takes as input the names of files or directories to measure for fragmentation (directories are scanned recursively). The output looks like this:
    $ ./fibmap.pl /usr/
    1052 15.7MB 15.28KB /usr/lib/libgcj.so.6.0.0
    868 18.3MB 21.65KB /usr/lib/wireshark/libwireshark.so.0.0.1
    840 20.7MB 25.27KB /usr/lib/libgcj.so.7.0.0
    742 19.9MB 27.48KB /usr/share/gnome-applets/gweather/Locations.xml
    467 4.2MB 9.31KB /usr/share/wireshark/wireshark/wireshark-filter.html
    .
    .
    .
    Non-contiguous: Files 12670 (1713.5MB, avg. 138.49kB per file), blocks 79184, average block 22.16kB
    Contiguous: Files 76846 (351.0MB, avg. 4.68kB per file)
The first number is the number of non-contiguous blocks in the file, the second is the size of the file, and the third is the average length of the contiguous blocks. Note that "block" here means a contiguous run of data on disk (a fragment), not a single filesystem block. The first file, /usr/lib/libgcj.so.6.0.0, is divided into 1052 blocks, each averaging 15.28kB. Needless to say, reading such a file is not particularly fast. The end of the output shows summary information for all tested files.
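To make the three columns concrete, here is a minimal Python sketch of how such numbers can be derived from a list of physical block numbers. It is an illustration only, not the code in fibmap.pl: the function names, the sample block list, and the 4 KB filesystem block size are all assumptions.

```python
def count_fragments(physical_blocks):
    """Count runs of consecutive physical block numbers.

    physical_blocks holds one on-disk block number per logical block
    of the file, in file order. A new run starts whenever a block does
    not directly follow the previous one.
    """
    fragments = 0
    previous = None
    for block in physical_blocks:
        if previous is None or block != previous + 1:
            fragments += 1
        previous = block
    return fragments


def average_fragment_kb(size_kb, fragments):
    """Average run length in KB; 0.0 for an empty file (avoids dividing by zero)."""
    return size_kb / fragments if fragments else 0.0


# Hypothetical file of six 4 KB blocks stored in three separate runs.
blocks = [100, 101, 102, 200, 201, 50]
n = count_fragments(blocks)
print(n, average_fragment_kb(len(blocks) * 4, n))        # prints: 3 8.0

# Cross-check against the first line of the output above: 15.7MB in 1052 runs.
print(round(average_fragment_kb(15.7 * 1024, 1052), 2))  # prints: 15.28
```

The last line reproduces the 15.28KB shown for libgcj.so.6.0.0, which suggests the third column is simply the file size divided by the first column.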
So in this case the partition was heavily fragmented, but that is not very surprising because the partition is a bit too small and has therefore always been nearly full. Other filesystems I measured were not nearly as bad, but most files were still in quite small pieces, say, around 100kB per contiguous block on average.
Disclaimer: the script may be horribly broken and running it on your system may cause cataclysmic destruction. On a more serious note, the script is not really polished and is provided only in the hope that it may be useful to someone. Anyway, the script worked fine for me on Debian unstable (up to date as of 2006-09-05) with kernel 2.6.18-rc5 on ext3 filesystems. Other filesystems should work provided that they support FIBMAP, the ioctl used for looking up block numbers.
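For the curious, this is roughly what a FIBMAP lookup looks like. The sketch is in Python rather than Perl and is not taken from fibmap.pl; the function names are made up. The request numbers 1 (FIBMAP) and 2 (FIGETBSZ) are the values defined in the kernel header linux/fs.h. Like the script, it needs root, and it has the same "may be horribly broken" status.

```python
import fcntl
import os
import struct

FIBMAP = 1    # _IO(0x00, 1) in <linux/fs.h>: logical block -> physical block
FIGETBSZ = 2  # _IO(0x00, 2) in <linux/fs.h>: filesystem block size in bytes


def logical_block_count(size_bytes, block_size):
    """Number of filesystem blocks needed to hold size_bytes (rounded up)."""
    return (size_bytes + block_size - 1) // block_size


def physical_blocks(path):
    """Return the physical block number of every logical block of a file.

    Needs root privileges. The kernel reports 0 for blocks that are not
    mapped, for example holes in sparse files.
    """
    with open(path, "rb") as f:
        fd = f.fileno()
        # Both ioctls take a pointer to a C int; pass one in, read one back.
        raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", raw)[0]
        count = logical_block_count(os.fstat(fd).st_size, block_size)
        result = []
        for index in range(count):
            raw = fcntl.ioctl(fd, FIBMAP, struct.pack("i", index))
            result.append(struct.unpack("i", raw)[0])
        return result
```

Calling physical_blocks("/usr/lib/libgcj.so.6.0.0") as root would return one block number per filesystem block of that file. On a filesystem without FIBMAP support the ioctl call raises OSError.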
If you have comments or questions, contact me at ilonen@lut.fi.