Dirvish Users Questions
The mailing list is a better place to ask questions!
Place your question here; somebody will answer it eventually. Please include your name, and it would help if you added a personal page to the DirvishPeople page.
Go ahead, click edit, and start typing. Wikis are pretty easy to use.
Could there be a mailing list to discuss dirvish issues? I think that would be handier.
A mailing list has been added - KeithLofstrom 2005 Feb 11
Error Code 24
We are using dirvish to back up a filesystem with more than 1 million files. Building the file list with rsync takes so long that files are sometimes deleted in the meantime. rsync then complains about "vanished files" and exits with code 24. dirvish treats code 24 as an error and restarts rsync. This can happen up to 4 times, resulting in a 12-hour run for one vault. rsync can even exit with code 24 after the fourth run, in which case dirvish complains about an incomplete backup.
We do not care about files that were deleted on the original but may still exist in the backup. I changed the level of code 24 from "error" to "warning" in dirvish.pl. Now dirvish runs rsync only once. This could be changed for the 1.3 release.
This will be fixed in experimental version 1.3 and production version 1.4 - KeithLofstrom 2005 Feb 11
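The change described above amounts to treating rsync exit code 24 as a warning instead of a retryable error. The following is only a sketch of that idea in Python, not dirvish's actual Perl code; the function names are hypothetical. The meaning of code 24 ("Partial transfer due to vanished source files") comes from the rsync man page.

```python
import subprocess
from typing import Sequence

# Exit codes as documented in the rsync man page.
RSYNC_OK = 0
RSYNC_VANISHED = 24  # "Partial transfer due to vanished source files"


def classify_rsync_exit(code: int) -> str:
    """Map an rsync exit code to 'success', 'warning', or 'error'.

    Hypothetical helper: code 24 is downgraded to a warning, so a caller
    would not restart rsync just because files vanished during the run.
    """
    if code == RSYNC_OK:
        return "success"
    if code == RSYNC_VANISHED:
        return "warning"
    return "error"


def run_rsync_once(args: Sequence[str]) -> str:
    """Run rsync a single time and classify its exit code."""
    result = subprocess.run(["rsync", *args])
    return classify_rsync_exit(result.returncode)
```

With this classification, a wrapper would retry only when the result is "error" and would accept "warning" as a completed run.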
dirvish-expire and permissions
MichaelTibbetts inquires: I've run into some issues with dirvish-expire and file permissions. When dirvish-expire attempts to remove files in an image that is not writable (by me), the rm -rf system call prints errors to standard error, but dirvish-expire exits with an exit code of 0.
Should it fail and give a non-zero exit code? Or at least check the exit code of the system call and warn the user?
Two side effects of the current (1.2) behavior are that partial images are left on disk (not such a big deal for my purposes) and that a boatload of error messages is mailed to me by the cron job that runs dirvish-expire.
Would it be possible to get dirvish-expire to try to chmod +w before removing images?
Perl's File::Path::rmtree does something like this and is packaged with Perl (at least as of version 5.8). A potential drawback of using it in place of the system call to rm is that rmtree chmods to 0777, and if it fails partway through, you have just left an entire directory's contents world-readable, etc.
In order to avoid the 0777 issue on failure and some messy cleaning up, we would need to set the "skip any files to which you do not have delete access" parameter to true, which still leaves us with partial trees anyway in the event that we only have partial write access.
So, since dirvish already uses system() to run "rm -rf", why not simply chmod before doing the rm? A possible patch for dirvish-expire can now be found in RFE ChmodOnExpire.
MichaelTibbetts replies: Thanks Eric. That does the trick for me.
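For illustration, the "chmod before rm" idea can be sketched in Python as follows. This is not the patch from RFE ChmodOnExpire; the function names are hypothetical. The sketch adds permissions for the owner only, and only on directories, to avoid the 0777 concern raised above.

```python
import os
import shutil
import stat


def add_owner_rwx(path: str) -> None:
    """Add read/write/execute for the owner only, leaving other bits alone."""
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    os.chmod(path, mode | stat.S_IRWXU)


def remove_image(image_dir: str) -> None:
    """Make every directory in the image owner-writable, then delete the image.

    Deleting a file needs write and search permission on its parent
    directory, so only directories are changed. Files are left untouched,
    which matters because their inodes may be shared with other images
    through hard links.
    """
    add_owner_rwx(image_dir)
    # os.walk is top-down by default: each subdirectory is fixed here
    # before the walk descends into it.
    for dirpath, dirnames, _filenames in os.walk(image_dir):
        for name in dirnames:
            sub = os.path.join(dirpath, name)
            if not os.path.islink(sub):  # never chmod through a symlink
                add_owner_rwx(sub)
    shutil.rmtree(image_dir)
```

Note that chmod only succeeds for the owner of a directory (or root), so this does not help with images owned by another user.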
Dirvish on a Windows machine:
Hmmmmmm, interesting... (I apologize for the weird formatting; this is the best I could get it.) Anyway, I saw an rsync presentation at LinuxFest about 6 months ago and finally got around to setting it up, thanks to the Debian dirvish guide and the Winsync FAQ. The problem I'm having, though, is that I got rsync working locally without a problem, but when I try to use Winrsync it doesn't work. I just get prompted for a password, and it never accepts it. Worse still, I cannot find the error log on the Windows machine for the life of me.
I might be missing something, but here is all the information; it's a lot. I'm assuming two things, though: first, that summary contains all the relevant information about master.conf and bank/vault/dirvish/default.conf; second, that the error about the password file being group readable is not causing this.
This is dirvish 1.2.1 (Debian unstable package), running on (you guessed it) Debian unstable. I think this is a problem on the Windows end, as the etc vault seems to be working fine.
Any help would be appreciated. It might be something I overlooked, or maybe even a spelling mistake, but I thought I checked for that. Originally my password was thispassword, so I have tried changing it, to no avail. I also tried manually specifying the log location as /cygdrive/c/rsyncd.log, with no luck either.
Also note that the only way I can get it to stop prompting me for a password is to kill the rsync process on the Windows machine, in which case it errors out. That is not shown in the console output, but it is what the logs might refer to.
First a file listing
Directory of C:\Program Files\WINrsync
24/10/2004 02:22 AM <DIR> .
24/10/2004 02:22 AM <DIR> ..
24/10/2004 02:22 AM 100 rsyncd.bat
24/10/2004 02:41 AM 305 rsyncd.conf
24/10/2004 02:33 AM 9 rsyncd.secrets
24/10/2004 02:15 AM <DIR> support
24/10/2004 02:15 AM 36,746 uninstall.exe
19/05/2002 06:23 AM 236 winrsync.bat
19/05/2002 12:58 PM 13,360 winrsync.php
14/05/2002 07:55 AM 967 winrsync.pif

@ECHO OFF
cd c:\PROGRA~1\winrsync
c:\PROGRA~1\winrsync\support\rsync --daemon --config rsyncd.conf

gid = users
read only = true
use chroot = false
transfer logging = true
log file = rsyncd.log
log format = %h %o %f %l %b
hosts allow = 172.27.1.2
hosts deny = 0.0.0.0/0
strict modes = false
[backups]
path = /cygdrive/c/winnt/system32/dns/
auth users = backups
secrets file = rsyncd.secrets
Directory Contents on Server
[02:57:15] root@fermat:/data2/backups/dns/20041024$ls -asl
total 20
4 drwx---- 2 root root 4096 Oct 24 02:57 .
4 drwxr-xr-x 4 root root 4096 Oct 24 02:42 ..
4 -rw-r--r-- 1 root root 576 Oct 24 02:55 log
4 -rw-r--r-- 1 root root 553 Oct 24 02:55 rsync_error
4 -rw-r--r-- 1 root root 713 Oct 24 02:55 summary

ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x \
  --password-file=/etc/dirvish/win.password \
  --exclude-from=/data2/backups//dns/20041024/exclude \
  email@example.com::backups/ /data2/backups//dns/20041024/tree
broken pipe
RESULTS: warnings = 0, errors = 1
ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x \
  --password-file=/etc/dirvish/win.password \
  --exclude-from=/data2/backups//dns/20041024/exclude \
  firstname.lastname@example.org::backups/ /data2/backups//dns/20041024/tree
write error, filesystem probably full
broken pipe
RESULTS: warnings = 0, errors = 2
/lines broken with \ for readability / (KeithLofstrom)
*** Execution cycle 0 ***
password file must not be other-accessible
continuing without password file
@ERROR: auth failed on module backups
rsync: read error: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(515)
*** Execution cycle 1 ***
password file must not be other-accessible
continuing without password file
rsync: writefd_unbuffered failed to write 31 bytes: phase "unknown" [receiver]: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(909)

summary
client: email@example.com
tree: :backups
rsh: ssh
Server: fermat
Bank: /data2/backups/
vault: dns
branch: default
Image: 20041024
Reference: default
Image-now: 2004-10-24 02:42:17
Expire: +3 months == 2005-01-24 02:42:17
Expire-rule: * * * * 1 +3 months
exclude: lost+found/
SET permissions devices init numeric-ids stats xdev
UNSET checksum sparse whole-file zxfer
Console Attempt to Init Vault
[02:42:11] root@fermat:/data2/backups/dns$dirvish --vault dns --init
Password:
Password:
Password:
Password:
Password:
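A note on the rsync_error output above: rsync reports "password file must not be other-accessible" and then continues without the password file, which would be consistent with the repeated Password: prompts. A minimal sketch for checking and tightening the file mode, assuming the path /etc/dirvish/win.password from the log (the helper names are hypothetical):

```python
import os
import stat


def other_accessible(path: str) -> bool:
    """Return True if the 'other' class has any permission bit set on path."""
    return bool(os.stat(path).st_mode & stat.S_IRWXO)


def restrict_to_owner(path: str) -> None:
    """Set the mode to 0600 (read and write for the owner only)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


if __name__ == "__main__":
    password_file = "/etc/dirvish/win.password"  # path taken from the log above
    if os.path.exists(password_file) and other_accessible(password_file):
        restrict_to_owner(password_file)
```

This only addresses the warning in the log; whether it resolves the authentication failure on the Windows side has not been verified here.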
While I am not familiar with the details on Windows, this can happen if you do not have the ssh connection configured properly. The second thing to do is to set dirvish to back up a small partition on your backup server. And the first thing, before that, is to set up and test ssh. That will provide some clues that will help with the Windows backup. Walk, run, fly!
On your Linux machine fermat(?), using dirvish for self-backup, you need to get it to the point that you can type:
root@fermat: ssh fermat
root@fermat: (except now you are connected through ssh)
If it asks for a password, you will need to generate a key in the .ssh subdirectory of root:
root@fermat: cd /root/.ssh
root@fermat: ssh-keygen -t rsa
And then append the id_rsa.pub file you just generated to the file authorized_keys2 with:
root@fermat: cat id_rsa.pub >> authorized_keys2
You append this to the same file on whatever machine you wish to log into without a password. Again, I do not know the windows equivalent (does ssh drop you into CMD.EXE, or what?) but if things are analogous you should be able to access your windows machine with sftp without a password as well.
Disk consumption spikes:
I've noticed a HUGE jump in disk consumption across all my backups; they seem to be double their original size. I don't know when this happened, but it was recent. file.sh is just something that runs du for every folder (written before I found out I could do du -s --si 20041029/ 20041030/).
I don't know of a way to get du to take two files, see whether they are hard-linked, and then not count them twice. Below are three different vaults: mydocs, which is rsynced from my Windows box, as is usbdrive; etc is local. The one that concerns me most is mydocs, because if it continues to grow I'll have problems very quickly. Most of these files don't even change from day to day. I don't know of any way to troubleshoot this.
Also, is there a way to have a program go through each pair of directories, check whether the files match, and if they do, hard-link the files together, so I can repair this?
[02:39:06] root@fermat:/data2/backups/mydocs$du ./ -s --si
4.4G ./
[02:39:11] root@fermat:/data2/backups/mydocs$./file.sh
2.2G 20041026
2.2G 20041027
2.2G 20041028
2.2G 20041029
2.2G 20041030
2.2G 20041031
2.2G 20041101
21k 20041102
2.2G 20041103
21k 20041104
2.2G 20041105
2.2G 20041106
2.2G 20041107
2.2G 20041108
2.2G 20041109
[02:40:38] root@fermat:/data2/backups/usbdrive$du -s --si 20041103 20041104 20041107 20041108 20041109
21k 20041103
171M 20041104
212M 20041107
212M 20041108
212M 20041109
[02:40:52] root@fermat:/data2/backups/usbdrive$du -s --si
462M .
[02:43:16] root@fermat:/data2/backups/etc$du -s --si
34M .
[02:43:18] root@fermat:/data2/backups/etc$du 20041024/ 20041025/ 20041026/ \
  20041027/ 20041029/ 20041030/ 20041031 20041101/ \
  20041102/ 20041103/ 20041104/ 20041105/ 20041106 \
  20041107 20041108/ 20041109/ -s --si
17M 20041024/
du: cannot access `20041025': No such file or directory
17M 20041026/
17M 20041027/
17M 20041029/
17M 20041030/
17M 20041031
17M 20041101/
17M 20041102/
17M 20041103/
18M 20041104/
18M 20041105/
18M 20041106
18M 20041107
18M 20041108/
18M 20041109/
/lines broken with \ for readability / (KeithLofstrom)
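One way to see whether two images actually share their files is to compare inode numbers, since hard-linked files have the same device and inode. The following Python sketch (function names are hypothetical) reports how many bytes of the second image are shared with the first and how many are not. It uses apparent file sizes, so the numbers will not match du exactly.

```python
import os
import stat


def inode_sizes(tree: str) -> dict:
    """Map (device, inode) -> size in bytes for each regular file under tree."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                # Keyed by inode, so a file with several names counts once.
                found[(st.st_dev, st.st_ino)] = st.st_size
    return found


def shared_bytes(image_a: str, image_b: str) -> tuple:
    """Return (bytes of image_b shared with image_a, bytes only in image_b)."""
    a = inode_sizes(image_a)
    b = inode_sizes(image_b)
    shared = sum(size for key, size in b.items() if key in a)
    unique = sum(size for key, size in b.items() if key not in a)
    return shared, unique
```

For example, shared_bytes("20041102", "20041103") returning a large second value would suggest that the later image was not linked against the earlier one.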
Follow-up question (11-15-04): I found out that the new 'set' starts on 20041103. I found this by symlinking random images into another directory and checking the size of that directory: with 20041024 and 20041102, du reports 2 gigs in use, but with 20041024 and 20041103 it reports 4.1 gigs in use. Now that I know where the set is broken, how can I repair it and get my disk space back?
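A possible repair, sketched in Python, is to walk the newer image and replace each file with a hard link to the file of the same relative path in the older image, but only when content and metadata are identical. This is an untested idea, not a dirvish feature; the function names and the temporary-file suffix are hypothetical. Try it on a copy first.

```python
import filecmp
import os
import stat


def same_metadata(a: os.stat_result, b: os.stat_result) -> bool:
    """Hard links share one inode, so mode, owner, size and mtime must match."""
    return (a.st_mode == b.st_mode and a.st_uid == b.st_uid
            and a.st_gid == b.st_gid and a.st_size == b.st_size
            and int(a.st_mtime) == int(b.st_mtime))


def relink_tree(reference: str, target: str) -> int:
    """Hard-link files in target to identical files in reference.

    Returns the number of files that were relinked.
    """
    relinked = 0
    for dirpath, _dirnames, filenames in os.walk(target):
        rel = os.path.relpath(dirpath, target)
        for name in filenames:
            tgt = os.path.join(dirpath, name)
            ref = os.path.join(reference, rel, name)
            if not os.path.lexists(ref):
                continue
            st_t, st_r = os.lstat(tgt), os.lstat(ref)
            if not (stat.S_ISREG(st_t.st_mode) and stat.S_ISREG(st_r.st_mode)):
                continue  # skip symlinks, devices, and so on
            if st_t.st_dev != st_r.st_dev or st_t.st_ino == st_r.st_ino:
                continue  # different filesystem, or already linked
            if not same_metadata(st_t, st_r):
                continue
            if not filecmp.cmp(ref, tgt, shallow=False):
                continue  # contents differ byte for byte
            tmp = tgt + ".relink-tmp"  # hypothetical temporary name
            os.link(ref, tmp)
            os.replace(tmp, tgt)  # atomically swap the link into place
            relinked += 1
    return relinked
```

Space is only freed once no name refers to the old inode any more, so images made after 20041103, which are presumably linked to the new set, would need the same treatment, for example relink_tree("20041102/tree", "20041103/tree") followed by each later image in turn.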
Per vault expire patterns
Is it possible to set per-vault expiry patterns? For instance, my USB drive isn't guaranteed to always be there, so I would like its backups to expire less aggressively.
Keith Lofstrom responds: You can put an expire pattern in the [VAULT]/dirvish/default.conf file, and it will override the expire pattern in the /etc/dirvish/master.conf file.
chattr for extra vault security
I would like my vault to be as stable and secure as possible, yet remain 'accessible'. As an extra layer of security, I was thinking of setting chattr +i on all my files. The drawback is that, IIRC, I can't hard-link files anymore. Would it be 'safe' to have something run chattr -i on everything before dirvish runs, run dirvish, and then chattr +i everything again? I use a similar method for my music files, except it just chattr +i's everything at 7 am.
Keith Lofstrom responds: It would be interesting to see what happens. I suspect your disks would thrash for a /long/ time; up at the user level, a dirvish vault looks huge. This seems like a security-by-obscurity approach. You can probably accomplish the same thing by unmounting your backup drive, and remounting it read-only. Keep in mind, though, that if bad guys can read your backup disk, they /own/ you.
Vaults on Remote Hosts?
Is it possible for dirvish to put vaults on remote hosts? This would get around a lot of security problems like how to initiate backups from inside a secure network (behind a firewall) to a backup server on a remote planet. ie.
KeithLofstrom replies, 2005 Feb 11:
The security aspects of this are a little scary. Generally, you do not /want/ to put your secure files outside of the secure network; that defeats the security, because the bad guys can now get at things like password files, or subtly alter your backups. If you want to use external storage anyway, you can probably use samba or nfs, perhaps through a secure tunnel, to mount the remote drive locally. Since this is a workaround, and since a "remote drive" feature could be horribly misused by less experienced users, I don't feel a strong need to add such a feature to dirvish. Perhaps you can set me straight here.
Broken Symbolic Links cause failed shared images
Anonymous Gnome writes, Sat 2005 Feb 12:
I back up all staff user home directories on each host to a single branch. A single broken symbolic link in any one of those directories will cause the entire image to fail. So anyone can sabotage the backup of other users' data merely by creating a broken link in his own home directory. This is a security hole without an easy workaround. A broken symlink is just garbage data and dirvish should ignore it.
KeithLofstrom replies, Sun 2005 Feb 13: If the home directories of the hosts are not close to identical, it is probably not a good idea to share the same branch. A separate vault for each home directory is better, since little is to be gained by trying to overlay very disparate images. Restores could get very messy. You can easily share branches for /usr and /bin and other identical partitions, of course.
I have not seen any problems with broken symbolic links when the images are kept in separate vaults - there are plenty scattered through my backups. You might develop a more precise description of what you are trying to do, describing your client structure and your vault config files, then ask your question on the mailing list. Perhaps someone else has successfully done something similar to what you want.
Re-using a pre-existing copy ?
pjb writes 20060303:
We have a backup machine with a big partition which is already more than 50% full with a straight rsync copy of the primary machine. We'd like to upgrade this to use dirvish. There isn't enough space to do a dirvish --init, and anyway it would take days. I need to re-use the pre-existing rsync backup and fool dirvish into thinking that dirvish --init has already taken place. How should I do this?
DSchulz writes 20090202:
You should be able to use the pre-existing image by faking a successful run and moving the files into the tree. Just create the vault, specify an empty path as the source, and run dirvish with init on this vault. That gives you a successful summary. Then move the existing files into the tree and adjust the source path in the configuration. Then do an ordinary run. Please note that this hasn't been tested. You should try it out on a test machine first.
cannot expire :: No unexpired good images
ckoring writes 20061215:
I set up dirvish on our server. There are no problems with the creation of the images in the vault, but the images never seem to expire. Here are some files from my test configuration:

$vault/dirvish/default.conf
client: hr-divitec
tree: /srv/samba/cad/config/Projekt
exclude:
    OldVersions/
    temp/
    *.bak
    *.log
xdev: true
image-default: %Y%m%d-%H%M
# The daily versions expire after 30 days, the
# weekly ones after six months, the monthly ones
# never expire
expire-default: +1 days
expire-rule:
#   MIN HR DOM MON DOW STRFTIME_FMT
    55,56 * * * * +7 minutes
    * 14 * * * +3 minutes

#IMAGE CREATED REFERECE EXPIRES
20061214-1356 2006-12-14 13:56:52 default +7 minutes == 2006-12-14 14:03:52
20061214-1357 2006-12-14 13:57:23 20061214-1356 +1 days == 2006-12-15 13:57:23
20061214-1358 2006-12-14 13:58:26 20061214-1357 +1 days == 2006-12-15 13:58:26
20061214-1401 2006-12-14 14:01:31 20061214-1358 +3 minutes == 2006-12-14 14:04:31
20061214-1407 2006-12-14 14:07:22 20061214-1401 +3 minutes == 2006-12-14 14:10:21
20061214-1411 2006-12-14 14:11:30 20061214-1407 +3 minutes == 2006-12-14 14:14:29

client: hr-divitec
tree: /srv/samba/cad/config/Projekt
rsh: ssh
Server: hr-divitec
Bank: /srv/samba/backup
vault: test
branch: default
Image: 20061214-1357
Reference: 20061214-1356
Image-now: 2006-12-14 13:57:23
Expire: +1 days == 2006-12-15 13:57:23
exclude:
    .wb*
    *.bak
    *~
    .tmp
    OldVersions/
    OldVersions/
    temp/
    *.bak
    *.log
SET permissions devices numeric-ids stats xdev
UNSET checksum init sparse whole-file zxfer
ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x --exclude-from=/srv/samba/backup/test/20061214-1357/exclude --link-dest=/srv/samba/backup/test/20061214-1356/tree /srv/samba/cad/config/Projekt/ /srv/samba/backup/test/20061214-1357/tree
Backup-begin: 2006-12-14 13:57:23
Backup-complete: 2006-12-14 13:57:23
Status: success
The status of all images is "success", so dirvish should find expired good images. When I execute dirvish-expire, it tells me "cannot expire :: No unexpired good images". I can't find the mistake in my config; perhaps somebody can help me.
Keith responds: I will have a better answer later. I don't think anyone has looked at same-day expires, and the code may not support that. It is a good question, though, because we will want to do such expires when we do regression testing. I asked about this on the mailing list, and we will see who writes back.
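While waiting for an answer, it may help to check which images in the listing above are past their expiry time at the moment dirvish-expire runs, and whether any unexpired image would remain. The Python sketch below only parses the listing quoted in the question; it does not reproduce dirvish-expire's logic, and the function names are hypothetical.

```python
from datetime import datetime

# Image listing copied from the question above.
HISTORY = """\
#IMAGE CREATED REFERECE EXPIRES
20061214-1356 2006-12-14 13:56:52 default +7 minutes == 2006-12-14 14:03:52
20061214-1357 2006-12-14 13:57:23 20061214-1356 +1 days == 2006-12-15 13:57:23
20061214-1358 2006-12-14 13:58:26 20061214-1357 +1 days == 2006-12-15 13:58:26
20061214-1401 2006-12-14 14:01:31 20061214-1358 +3 minutes == 2006-12-14 14:04:31
20061214-1407 2006-12-14 14:07:22 20061214-1401 +3 minutes == 2006-12-14 14:10:21
20061214-1411 2006-12-14 14:11:30 20061214-1407 +3 minutes == 2006-12-14 14:14:29
"""


def parse_expiry(line: str) -> tuple:
    """Return (image name, expiry datetime) from one listing line."""
    image = line.split()[0]
    stamp = line.split("==")[1].strip()
    return image, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


def split_by_expiry(listing: str, now: datetime) -> tuple:
    """Return (expired, unexpired) image names as of the given time."""
    expired, unexpired = [], []
    for line in listing.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        image, expires = parse_expiry(line)
        (expired if expires <= now else unexpired).append(image)
    return expired, unexpired
```

For the listing above, checking at 15:00 on 2006-12-14 leaves the two "+1 days" images unexpired, while checking a day later leaves none, which is one situation in which the message "No unexpired good images" could plausibly appear.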