Dirvish Users Questions

The mailing list is a better place to ask questions!


Place your question here; somebody will answer it eventually. Please include your name, and it would help if you added a personal page to the DirvishPeople page.


Go ahead, click edit, and start typing. Wikis are pretty easy to use.


Mailing List

Could there be a mailing list to discuss dirvish issues? I think that would be handier.

A mailing list has been added - KeithLofstrom 2005 Feb 11


Error Code 24

We are using dirvish to back up a filesystem with 1 million+ files. Creating the file list with rsync takes so long that files are sometimes deleted in the meantime. rsync then complains about "vanished files" and exits with code 24. dirvish treats code 24 as an error and restarts rsync. This can happen 4 times, resulting in a 12 hour run for one vault. rsync can even exit with code 24 on the fourth run, in which case dirvish complains about an incomplete backup.

We do not care about files that are deleted on the original and may still exist in the backup. I changed the level of code 24 from "error" to "warning" in dirvish.pl. Now dirvish runs rsync only once. This could be changed for the 1.3 release.

This will be fixed in experimental version 1.3 and production version 1.4 - KeithLofstrom 2005 Feb 11
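
A minimal sketch of the behaviour described above, written as a standalone shell wrapper rather than as the actual change to dirvish.pl. The function names are hypothetical; the only fact taken from rsync is that exit code 24 means source files vanished during the transfer.

```shell
# Sketch only: classify an rsync exit status so that code 24
# ("source files vanished") counts as a warning, not an error.
classify_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        24) echo "warning" ;;
        *)  echo "error" ;;
    esac
}

# Hypothetical wrapper: run rsync once and fail only on real errors.
# Usage: run_rsync_once -a /source/ /destination/
run_rsync_once() {
    rsync "$@"
    status=$?
    result=$(classify_rsync_exit "$status")
    echo "rsync exit code $status treated as $result" >&2
    [ "$result" != "error" ]
}
```

With this classification a run that ends with vanished files is not retried, at the cost of silently accepting that those files are missing from the image.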


dirvish-expire and permissions

MichaelTibbetts inquires: I've run into some issues with dirvish-expire and file permissions. When dirvish-expire attempts to remove files in an image that is not writable (by me), the rm -rf system call prints errors to standard error, but dirvish-expire exits with an exit code of 0.

Should it fail and give a non-zero exit code? Or at least check the exit code of the system call and warn the user?

Two side effects of the current (1.2) behavior are that partial images are left on disk (not such a big deal for my purposes) and that a boatload of error messages is mailed to me by the cron job that executes dirvish-expire.

Would it be possible to get dirvish-expire to try to chmod +w before removing images?

Perl's File::Path::rmtree does something like this and is packaged with perl (at least as of version 5.8). A potential drawback of using this in place of the system call to rm is that rmtree chmods to 0777, and if it fails partway through, you've just left an entire directory's contents world-readable, etc.

EricMountain replies: Looking at the documentation, rmtree doesn't seem appropriate in this case as you pointed out.

In order to avoid the 0777 issue on failure and some messy cleaning up, we would need to set the "skip any files to which you do not have delete access" parameter to true, which still leaves us with partial trees anyway in the event that we only have partial write access.

So, since dirvish already uses system() to run "rm -rf", why not simply chmod before doing the rm? A possible patch for dirvish-expire can now be found in RFE ChmodOnExpire.

MichaelTibbetts replies: Thanks Eric. That does the trick for me.
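
A rough shell sketch of the chmod-then-rm idea discussed above. It is not the ChmodOnExpire patch; remove_image is a hypothetical helper, and it also reports failure through its exit status, as the original question asks.

```shell
# Sketch only: make an image tree writable for its owner, remove it,
# and return non-zero if anything is left behind.
remove_image() {
    image=$1
    if [ ! -d "$image" ]; then
        echo "remove_image: $image is not a directory" >&2
        return 2
    fi
    # Owner bits only (u+rwX): unlike rmtree's 0777, group and other
    # permissions stay untouched if the removal fails halfway.
    chmod -R u+rwX "$image" ||
        echo "remove_image: chmod failed on part of $image" >&2
    if ! rm -rf "$image"; then
        echo "remove_image: could not completely remove $image" >&2
        return 1
    fi
    return 0
}
```

A cron job calling this helper could check the return value instead of relying on the text rm prints to standard error.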


Dirvish on a Windows machine:

SteveRamage asks:

Hmmmmmm, interesting... (I apologize for the weird formatting; this is the best I could get it.) Anyway, I saw an rsync presentation at LinuxFest about 6 months ago and finally got around to setting it up, thanks to the Debian dirvish guide and the Winsync faq. The problem I'm having, though, is that I got rsync working locally without a problem, but when I try to use Winrsync it doesn't work. I just get prompted for a password, and it never accepts it. Worse still, I cannot find the error log on the Windows machine for the life of me.

I might be missing something, but here is all the information; it's a lot. I'm assuming two things, though: first, that summary contains all the relevant information about master.conf and bank/vault/dirvish/default.conf; second, that the error regarding the password file being group readable is not causing this.

This is dirvish 1.2.1 (deb unstable package), running on (you guessed it) Debian Unstable. I think this is a problem on the Windows end, as the etc vault seems to be working fine.

Any help would be appreciated. It might be something I overlooked, or maybe even a spelling mistake, but I thought I checked that. Originally my password was thispassword, so I have tried changing it, to no avail. I also tried manually specifying the log location as /cygdrive/c/rsyncd.log, with no luck either.

Also note that the only way I can get it to stop prompting me for a password is to kill the rsync process on the Windows machine, in which case it errors out. That's not shown in the console output, but that's what the logs might refer to.

First a file listing

  Directory of C:\Program Files\WINrsync

 24/10/2004  02:22 AM    <DIR>          .
 24/10/2004  02:22 AM    <DIR>          ..
 24/10/2004  02:22 AM               100 rsyncd.bat
 24/10/2004  02:41 AM               305 rsyncd.conf
 24/10/2004  02:33 AM                 9 rsyncd.secrets
 24/10/2004  02:15 AM    <DIR>          support
 24/10/2004  02:15 AM            36,746 uninstall.exe
 19/05/2002  06:23 AM               236 winrsync.bat
 19/05/2002  12:58 PM            13,360 winrsync.php
 14/05/2002  07:55 AM               967 winrsync.pif

rsyncd.bat

 @ECHO OFF
 cd c:\PROGRA~1\winrsync
 c:\PROGRA~1\winrsync\support\rsync --daemon --config rsyncd.conf

rsyncd.conf

 gid = users
 read only = true
 use chroot = false
 transfer logging = true
 log file = rsyncd.log
 log format = %h %o %f %l %b
 hosts allow = 172.27.1.2
 hosts deny = 0.0.0.0/0
 strict modes = false

 [backups]
 path = /cygdrive/c/winnt/system32/dns/
 auth users = backups
 secrets file = rsyncd.secrets

rsyncd.secrets

 backups:p

Directory Contents on Server

 [02:57:15] root@fermat:/data2/backups/dns/20041024$ls -asl
 total 20
 4 drwx----  2 root root 4096 Oct 24 02:57 .
 4 drwxr-xr-x  4 root root 4096 Oct 24 02:42 ..
 4 -rw-r--r--  1 root root  576 Oct 24 02:55 log
 4 -rw-r--r--  1 root root  553 Oct 24 02:55 rsync_error
 4 -rw-r--r--  1 root root  713 Oct 24 02:55 summary

log

 ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x   \
   --password-file=/etc/dirvish/win.password                      \
   --exclude-from=/data2/backups//dns/20041024/exclude            \
   backups@172.27.2.27::backups/ /data2/backups//dns/20041024/tree

 broken pipe
 RESULTS: warnings = 0, errors = 1

 ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x  \
    --password-file=/etc/dirvish/win.password                    \
    --exclude-from=/data2/backups//dns/20041024/exclude          \
    backups@172.27.2.27::backups/ /data2/backups//dns/20041024/tree

 write error, filesystem probably full
 broken pipe
 RESULTS: warnings = 0, errors = 2

/lines broken with \ for readability / (KeithLofstrom)

rsync_error

 *** Execution cycle 0 ***

 password file must not be other-accessible
 continuing without password file
 @ERROR: auth failed on module backups
 rsync: read error: Connection reset by peer (104)
 rsync error: error in rsync protocol data stream (code 12) at io.c(515)


 *** Execution cycle 1 ***

 password file must not be other-accessible
 continuing without password file
 rsync: writefd_unbuffered failed to write 31 bytes: phase "unknown" 
       [receiver]:    Connection reset by peer (104)
 rsync error: error in rsync protocol data stream (code 12) at io.c(909)

summary
 client: backups@172.27.2.27
 tree: :backups
 rsh: ssh
 Server: fermat
 Bank: /data2/backups/
 vault: dns
 branch: default
 Image: 20041024
 Reference: default
 Image-now: 2004-10-24 02:42:17
 Expire: +3 months == 2005-01-24 02:42:17
 Expire-rule: *   *     *   *         1    +3 months
 exclude:
        lost+found/
 SET permissions devices init numeric-ids stats xdev
 UNSET checksum sparse whole-file zxfer

/etc/dirvish/win.password

 p

Console Attempt to Init Vault

 [02:42:11] root@fermat:/data2/backups/dns$dirvish --vault dns --init
 Password:
 Password:
 Password:
 Password:
 Password:

KeithLofstrom replies:

While I am not familiar with the details on Windows, this can happen if you do not have the ssh connection configured properly. The first thing to do is to set up and test ssh. The second is to set dirvish to back up a small partition on your backup server. That will provide some clues that will help with the Windows backup. Walk, run, fly!

On your Linux machine fermat(?), using dirvish for self-backup, you need to get it to the point that you can type:

 root@fermat: ssh fermat
 root@fermat:         (except now you are connected through ssh)

If it asks for a password, you will need to generate a key in the .ssh subdirectory of root:

 root@fermat: cd /root/.ssh
 root@fermat: ssh-keygen -t rsa

And then append the id_rsa.pub file you just generated to the file authorized_keys2 with:

 root@fermat: cat id_rsa.pub >> authorized_keys2

You append this to the same file on whatever machine you wish to log into without a password. Again, I do not know the Windows equivalent (does ssh drop you into CMD.EXE, or what?) but if things are analogous you should be able to access your Windows machine with sftp without a password as well.
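
Independently of ssh, the rsync_error file above shows rsync rejecting /etc/dirvish/win.password as other-accessible and continuing without it, which is consistent with the repeated Password: prompts. A minimal check, assuming a POSIX find; password_file_is_private is a hypothetical helper name, and the check is deliberately strict (it flags any permission bit for "other").

```shell
# Sketch only: succeed if the file has no permission bits set for "other".
password_file_is_private() {
    file=$1
    [ -f "$file" ] || return 2
    # find prints the path only if one of the "other" bits is set.
    others=$(find "$file" -prune \( -perm -004 -o -perm -002 -o -perm -001 \))
    [ -z "$others" ]
}

# Usage (path taken from the log above):
#   password_file_is_private /etc/dirvish/win.password ||
#       chmod 600 /etc/dirvish/win.password
```

If the check fails, chmod 600 on the password file is the usual fix.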


Disk consumption spikes:

SteveRamage asks:

I've noticed a HUGE jump in disk consumption across all my backups; they seem to be double their original size. I don't know when this happened, but it was recent. file.sh is just something that runs du for every folder (written before I found out I could do du -s --si 20041029/ 20041030/).

I don't know of a way to get du to take two files, see if they are hardlinked, and then not count those. Below are three different vaults: mydocs, which is rsynced from my Windows box, as is usbdrive; etc is local. The one that concerns me the most is mydocs, as if it continues to grow I'll have problems very quickly. Most of these files don't even change on a day-to-day basis. I don't know of any way to troubleshoot this.

Also, is there a way to have a program go through each pair of directories, check if the files match, and if they do, hardlink the files together, so I can repair this?

 [02:39:06] root@fermat:/data2/backups/mydocs$du ./ -s --si
 4.4G    ./
 [02:39:11] root@fermat:/data2/backups/mydocs$./file.sh
 2.2G    20041026
 2.2G    20041027
 2.2G    20041028
 2.2G    20041029
 2.2G    20041030
 2.2G    20041031
 2.2G    20041101
 21k     20041102
 2.2G    20041103
 21k     20041104
 2.2G    20041105
 2.2G    20041106
 2.2G    20041107
 2.2G    20041108
 2.2G    20041109


 [02:40:38] root@fermat:/data2/backups/usbdrive$du -s --si 20041103 20041104  20041107 20041108 20041109
 21k     20041103
 171M    20041104
 212M    20041107
 212M    20041108
 212M    20041109
 [02:40:52] root@fermat:/data2/backups/usbdrive$du -s --si
 462M    .

 [02:43:16] root@fermat:/data2/backups/etc$du -s --si
 34M     .
 [02:43:18] root@fermat:/data2/backups/etc$du 20041024/ 20041025/ 20041026/ \
                          20041027/ 20041029/ 20041030/ 20041031  20041101/ \
                          20041102/ 20041103/ 20041104/ 20041105/ 20041106  \
                          20041107  20041108/ 20041109/ -s --si
 17M     20041024/
 du: cannot access `20041025': No such file or directory
 17M     20041026/
 17M     20041027/
 17M     20041029/
 17M     20041030/
 17M     20041031
 17M     20041101/
 17M     20041102/
 17M     20041103/
 18M     20041104/
 18M     20041105/
 18M     20041106
 18M     20041107
 18M     20041108/
 18M     20041109/

/lines broken with \ for readability / (KeithLofstrom)

Follow Up Question (11-15-04): I found out that the new 'set' starts on 20041103. I found this by symlinking images at random into another directory and checking the size of that directory: with 20041024 and 20041102, du reports 2 gigs in use, but with 20041024 and 20041103 it reports 4.1 gigs in use. Now that I know where the set is broken, how can I repair it and get my disk space back?
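
One possible approach to the repair asked about above, as a hedged sketch: walk one tree, and wherever the same relative path in the older tree holds identical content, replace the newer copy with a hard link. relink_identical is a hypothetical helper. It compares file content only and ignores ownership, permissions, and timestamps, and it does not handle file names containing newlines, so try it on a copy first.

```shell
# Sketch only: hard-link files in "$new" to identical files in "$old".
# Both arguments are directory trees on the same filesystem.
relink_identical() {
    old=$1
    new=$2
    (cd "$new" && find . -type f) | while IFS= read -r rel; do
        a="$old/$rel"
        b="$new/$rel"
        # Need a regular, non-symlink file at the same path in the old tree.
        [ -f "$a" ] && [ ! -L "$a" ] || continue
        # Skip pairs that already share an inode.
        [ "$a" -ef "$b" ] && continue
        # Skip files whose contents differ.
        cmp -s "$a" "$b" || continue
        ln -f "$a" "$b" && echo "relinked $rel"
    done
}

# Usage (image names from the listing above):
#   relink_identical 20041102/tree 20041103/tree
```

Running it on each adjacent pair of images, oldest first, should let later images share the data of earlier ones again.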


Per vault expire patterns

SteveRamage asks:

Is it possible to set per-vault expiry patterns? For instance, my USB drive isn't guaranteed to always be there, so I would like its backups to expire less aggressively.

Keith Lofstrom responds: You can put an expire pattern in the [VAULT]/dirvish/default.conf file, and it will override the expire pattern in the /etc/dirvish/master.conf file.
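
As an illustration only, a per-vault override might look like the fragment below. The expiry values are assumptions chosen for a rarely attached drive; the option names and column layout follow the expire-default and expire-rule examples elsewhere on this page.

```
# [VAULT]/dirvish/default.conf (illustrative values, not a recommendation)
expire-default: +6 months
expire-rule:
#   MIN     HR      DOM     MON     DOW     STRFTIME_FMT
    *       *       *       *       1       +12 months
```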


chattr for extra vault security

SteveRamage asks:

I would like my vault to be as stable and secure as possible, yet remain 'accessible'. What I was wondering as an extra layer of security was to set chattr +i on all my files. The drawback being that iirc I can't hardlink files anymore. But I was wondering would it be 'safe' to have something go before dirvish runs and chattr -i everything, run dirvish, then chattr +i everything. I use a similar method for my music files, except it just chattr +i 's everything at 7 am.

Keith Lofstrom responds: It would be interesting to see what happens. I suspect your disks would thrash for a /long/ time; up at the user level, a dirvish vault looks huge. This seems like a security-by-obscurity approach. You can probably accomplish the same thing by unmounting your backup drive, and remounting it read-only. Keep in mind, though, that if bad guys can read your backup disk, they /own/ you.
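
A hedged sketch of the read-only idea above. It uses mount -o remount rather than a separate umount and mount, and it assumes the backups live on their own mount point, that the script runs as root, and that dirvish-runall is the command that performs the backups. backup_with_readonly_vault and the DRY_RUN switch are hypothetical.

```shell
# Sketch only: keep the backup filesystem read-only except during the run.
# With DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_with_readonly_vault() {
    mount_point=$1
    run mount -o remount,rw "$mount_point" || return 1
    run dirvish-runall
    status=$?
    # Always try to return to read-only, even if the backup failed.
    run mount -o remount,ro "$mount_point" || return 1
    return "$status"
}

# Usage: backup_with_readonly_vault /data2
```

Remounting read-only fails while files on the filesystem are still open for writing, so the last step is worth monitoring.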


Vaults on Remote Hosts?

Fortuitous asks:

Is it possible for dirvish to put vaults on remote hosts? This would get around a lot of security problems like how to initiate backups from inside a secure network (behind a firewall) to a backup server on a remote planet, i.e.

vault: backup.example.com:/backups/example.com/abc/

KeithLofstrom replies, 2005 Feb 11:

The security aspects of this are a little scary. Generally, you do not /want/ to put your secure files outside of the secure network; that defeats the security, because the bad guys can now get at things like password files, or subtly alter your backups. If you want to use external storage anyway, you can probably use samba or nfs, perhaps through a secure tunnel, to mount the remote drive locally. Since this is a workaround, and since a "remote drive" feature could be horribly misused by less experienced users, I don't feel a strong need to add such a feature to dirvish. Perhaps you can set me straight here.


Anonymous Gnome writes, Sat 2005 Feb 12:

I back up all staff user home directories on each host to a single branch. A single broken symbolic link in any one of those directories will cause the entire image to fail. So anyone can sabotage the backup of other users' data merely by creating a broken link in his own home directory. This is a security hole without an easy workaround. A broken symlink is just garbage data and dirvish should ignore it.

KeithLofstrom replies, Sun 2005 Feb 13: If the home directories of the hosts are not close to identical, it is probably not a good idea to share the same branch. A separate vault for each home directory is better, since little is to be gained by trying to overlay very disparate images. Restores could get very messy. You can easily share branches for /usr and /bin and other identical partitions, of course.

I have not seen any problems with broken symbolic links when the images are kept in separate vaults - there are plenty scattered through my backups. You might develop a more precise description of what you are trying to do, describing your client structure and your vault config files, then ask your question on the mailing list. Perhaps someone else has successfully done something similar to what you want.


Re-using a pre-existing copy ?

pjb writes 20060303:

We have a backup machine with a big partition which is already more than 50% full with a straight rsync copy of the primary machine. We'd like to upgrade this to use dirvish. There isn't enough space to do a dirvish --init, and anyway it would take days. I need to re-use the pre-existing rsync backup and fool dirvish into thinking that dirvish --init has already taken place. How should I do this?

DSchulz writes 20090202:

You should be able to use the pre-existing image by faking a successful run and moving the files into the tree. Just create the vault, specify an empty path as the source, and run dirvish with init on this vault. That gives you a successful summary. Then move the existing files into the tree and adjust the source path in the configuration. Then do an ordinary run. Please note that this hasn't been tested. You should try it out on a test machine first.
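
The move step of the procedure above could be sketched as follows. Like the procedure itself, this is untested against a real vault. adopt_existing_copy is a hypothetical helper, the paths in the usage comment are placeholders, and find's -mindepth/-maxdepth are GNU/BSD extensions. On the same filesystem the move is a rename, so it needs no extra space.

```shell
# Sketch only: move every entry (including dotfiles) of an existing
# rsync copy into the empty tree created by the faked init run.
adopt_existing_copy() {
    existing=$1
    tree=$2
    [ -d "$existing" ] && [ -d "$tree" ] || return 2
    find "$existing" -mindepth 1 -maxdepth 1 -exec mv {} "$tree"/ \;
}

# Surrounding steps, as described in the reply:
#   1. dirvish --vault VAULT --init      (with an empty source path)
#   2. adopt_existing_copy /path/to/old/copy /path/to/VAULT/IMAGE/tree
#   3. set the real source path in the vault's default.conf
#   4. dirvish --vault VAULT             (ordinary run)
```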


cannot expire :: No unexpired good images

ckoring writes 20061215:

I set up dirvish on our server. There are no problems with the creation of the images in the vault. But the images never seem to expire. Here are some files from my test configuration:

$vault/dirvish/default.conf
 client:  hr-divitec
 tree:  /srv/samba/cad/config/Projekt
 exclude:
      OldVersions/
      temp/
      *.bak
      *.log
 xdev: true
 image-default: %Y%m%d-%H%M 
 
 # The daily versions expire after 30 days, the
 # weekly ones after six months, the monthly ones
 # never expire
 
 expire-default: +1 days
 expire-rule:
 #   MIN     HR      DOM     MON     DOW     STRFTIME_FMT
    55,56     *        *      *        *       +7 minutes
      *      14        *      *        *       +3 minutes

$vault/dirvish/default.hist

 #IMAGE         CREATED             REFERECE        EXPIRES
 20061214-1356  2006-12-14 13:56:52     default         +7 minutes == 2006-12-14 14:03:52
 20061214-1357  2006-12-14 13:57:23     20061214-1356   +1 days == 2006-12-15 13:57:23
 20061214-1358  2006-12-14 13:58:26     20061214-1357   +1 days == 2006-12-15 13:58:26
 20061214-1401  2006-12-14 14:01:31     20061214-1358   +3 minutes == 2006-12-14 14:04:31
 20061214-1407  2006-12-14 14:07:22     20061214-1401   +3 minutes == 2006-12-14 14:10:21
 20061214-1411  2006-12-14 14:11:30     20061214-1407   +3 minutes == 2006-12-14 14:14:29

$vault/20061214-1357/summary

 client: hr-divitec
 tree: /srv/samba/cad/config/Projekt
 rsh: ssh
 Server: hr-divitec
 Bank: /srv/samba/backup
 vault: test
 branch: default
 Image: 20061214-1357
 Reference: 20061214-1356
 Image-now: 2006-12-14 13:57:23
 Expire: +1 days == 2006-12-15 13:57:23
 exclude:
        .wb*
        *.bak
        *~
        .tmp
        OldVersions/
        OldVersions/
        temp/
        *.bak
        *.log
 SET permissions devices numeric-ids stats xdev 
 UNSET checksum init sparse whole-file zxfer 
 
 
 ACTION: rsync -vrltH --delete -pgo --stats -D --numeric-ids -x 
      --exclude-from=/srv/samba/backup/test/20061214-1357/exclude 
      --link-dest=/srv/samba/backup/test/20061214-1356/tree
      /srv/samba/cad/config/Projekt/ /srv/samba/backup/test/20061214-1357/tree
 Backup-begin: 2006-12-14 13:57:23
 Backup-complete: 2006-12-14 13:57:23
 Status: success

The status of all images is "success", so dirvish should find expired good images. When I execute dirvish-expire, it tells me "cannot expire :: No unexpired good images". I can't find the mistake in my config; perhaps somebody can help me.

Keith responds: I will have a better answer later. I don't think anyone has looked at same-day expires, and the code may not support that. It is a good question, though, because we will want to do such expires when we do regression testing. I asked about this on the mailing list, and we will see who writes back.
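
Taken literally, the message says that no unexpired good image remains. One way to check that against the history file shown above is sketched below, assuming GNU date (for -d) and the "== YYYY-MM-DD HH:MM:SS" layout of the EXPIRES column. hist_expiry_status is a hypothetical helper, and dirvish-expire itself may read each image's summary file rather than this history file.

```shell
# Sketch only: print "IMAGE expired" or "IMAGE unexpired" for each entry
# of a .hist file, relative to a reference time (default: now).
hist_expiry_status() {
    hist=$1
    now=$(date -d "${2:-now}" +%s) || return 2
    grep -v '^#' "$hist" | grep '== ' | while read -r image rest; do
        # The absolute expiry time is everything after the last "== ".
        when=${rest##*== }
        expires=$(date -d "$when" +%s) || continue
        if [ "$expires" -le "$now" ]; then
            echo "$image expired"
        else
            echo "$image unexpired"
        fi
    done
}

# Usage: hist_expiry_status "$vault/dirvish/default.hist"
```

If every image is reported as expired at the time dirvish-expire runs, that would match the wording of the message.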



DirvishUsers (last edited 2011-01-24 04:23:02 by KeithLofstrom)