Copying Dirvish Backup Disks
You may find yourself with a dirvish backup partition to copy. These partitions are heavily hard-linked, which makes them very unusual. If you look at one with an image-by-image du you will see many terabytes of apparent data, far more than the disk actually holds, and that is how a copy tool working above the file system layer must read and move the data.
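The gap between apparent and real size can be measured with du, which counts a hard-linked file only once per invocation. The following is a minimal sketch, assuming a vault directory whose subdirectories are the images; the function name du_compare and the example path are made up for illustration.

```shell
# Compare the image-by-image total with what the vault really occupies.
# du counts each hard-linked file once per invocation, so running it
# separately on every image counts the shared files again each time.
du_compare() {
    vault=$1
    apparent=0
    for image in "$vault"/*/; do
        [ -d "$image" ] || continue
        kb=$(du -sk "$image" | awk '{print $1}')
        apparent=$((apparent + kb))
    done
    # One du run over the whole vault counts every inode only once.
    actual=$(du -sk "$vault" | awk '{print $1}')
    echo "apparent_kb=$apparent actual_kb=$actual"
}

# Usage (hypothetical path):
# du_compare /backup/bank/myvault
```

The larger the ratio between the two numbers, the more work a file-level copy tool has to do relative to the space the vault actually uses.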
There are not yet any efficient tools for copying rsync dump images. The major tools:
cp -a, cpio -p, tar -cpf, dump/restore, and rsync -aH all work above the file system layer. This means that they must traverse the entire tree, which can take a long time.
cp -a is probably the best of these tools. It copies all the data correctly, except that it does not preserve modification times on symbolic links.
tar -cpf does not copy socket special files, and takes twice as long as the rest. Not recommended.
cpio -p is the fastest, but does not preserve modification times on symbolic links or on some directories.
dump/restore only works for ext2, and Linus Torvalds himself says these tools are not to be trusted.
rsync -aH is about 30% slower than the rest, and also does not preserve symbolic link modification times.
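The text does not give the exact command lines that were compared. As a rough sketch, typical invocations for the tools other than dump/restore might look like the following. The function names are hypothetical, the extra cpio flags (-d, -m) and the tar pipe are assumptions rather than the author's commands, and both arguments are assumed to be absolute paths with the destination directory already created.

```shell
# Each function copies the tree under $1 into the existing directory $2.
# These are common invocations, not necessarily the ones timed above.

copy_with_cp() {
    # -a: recurse and preserve links, ownership, modes and timestamps.
    cp -a "$1"/. "$2"/
}

copy_with_cpio() {
    # Pass-through mode (-p); -d creates directories as needed,
    # -m keeps file modification times. $2 must be an absolute path
    # because the pipeline runs after the cd.
    (cd "$1" && find . -depth -print | cpio -pdm "$2")
}

copy_with_tar() {
    # f must come last in the option bundle so that "-" (stdout/stdin)
    # is taken as the archive name.
    (cd "$1" && tar -cpf - .) | (cd "$2" && tar -xpf -)
}

copy_with_rsync() {
    # -a: archive mode; -H: preserve hard links.
    rsync -aH "$1"/ "$2"/
}

# Usage (hypothetical paths):
# copy_with_cp /backup/bank/myvault /newdisk/bank/myvault
```

All four preserve hard links only among the files handled in a single run, so the source argument should cover every image that shares links.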
Note that cp -a can copy the whole structure in one run, but that is not the best way to use it. cp builds large tables in RAM of the filenames copied, and on a dirvish file system there can appear to be hundreds of millions of filenames. This huge table starts spilling over into virtual memory, and the resulting disk thrashing can slow the copy by a factor of 10. If cp is used, it is best to copy vaults (a vault is a collection of images) one at a time, rather than whole banks or partitions.
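The vault-at-a-time approach can be sketched as a small loop. This assumes GNU cp, a bank laid out as one subdirectory per vault, and hypothetical paths; copy_bank_by_vault is an illustrative name, not part of dirvish.

```shell
# Copy a bank one vault at a time, so that each cp run only has to
# keep track of the files inside a single vault.
copy_bank_by_vault() {
    src_bank=$1
    dst_bank=$2
    mkdir -p "$dst_bank" || return 1
    for vault in "$src_bank"/*/; do
        [ -d "$vault" ] || continue
        vault=${vault%/}              # drop the trailing slash from the glob
        echo "copying vault $(basename "$vault")"
        # One cp invocation per vault: hard links between the images
        # of this vault are preserved, and memory use stays bounded.
        cp -a "$vault" "$dst_bank"/ || return 1
    done
}

# Usage (hypothetical paths):
# copy_bank_by_vault /backup/bank /newdisk/bank
```

Because each vault is handled by its own cp process, any hard links that cross vault boundaries would be copied as separate files; the sketch assumes images are linked only within their own vault.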
