Automating a daily rsync backup

We saw how easy it was to use rsync and how to exclude files from the rsync backup.

To make things even easier, we can set up an automatic rsync and place the files into daily folders.


Automation

Automating tasks relieves a lot of the repetitive nature from the role of a sysadmin.

The tasks are still important but automation can leave you time to concentrate on other, more dynamic, tasks.

Bash Script

The base of the automatic rsync is a very simple bash script.

It contains just four lines which, over the course of a week, will create seven daily folders on your backup machine (Monday - Sunday), log into the Slice you want to back up and download the files and folders into the predefined location.

Create

On the backup machine, create a file called 'backup'. In this case I have placed it in my home 'bin' directory:

nano /home/backup/bin/backup

Once done, add the following:

#!/bin/bash

dest=/backup/demo/`date +%A`

mkdir -p $dest

rsync -e 'ssh -p 30000' -avl --delete --stats --progress demo@123.45.67.890:/home/demo $dest/

Short and sweet, but let's go through it and see what we did:

The first line is the 'shebang' that marks the file as a bash script. Without this line, the machine would not know what interpreter to use for the contents.

The second line defines the variable 'dest'. It starts with the backup path '/backup/demo/' and then adds the day's name. So, if you ran this on a Monday, the 'dest' variable would be '/backup/demo/Monday'.

Do note that backticks (`) are used in the variable and not single quotes (').
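As a quick illustration, backticks perform command substitution: the command between them runs and its output replaces the expression. Bash also supports the nestable '$(...)' form, which does exactly the same thing:

```shell
# `date +%A` runs the date command and substitutes its output,
# the current day's name (e.g. "Monday").
day=`date +%A`

# $(...) is the modern, nestable equivalent and gives the same result.
day2=$(date +%A)

echo "$day"
```

Single quotes, by contrast, would give you the literal text 'date +%A' rather than the day's name.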

The third line creates the destination directory defined above. The '-p' option creates any missing parent directories and does not complain if the directory already exists. So if the script was run on a Friday, it would create '/backup/demo/Friday'.
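To see '-p' in action, here is a quick sketch using a throwaway path under /tmp rather than the real backup location:

```shell
# Without -p, mkdir would fail because /tmp/backup-demo does not yet
# exist. With -p, any missing parents are created and an existing
# directory is not an error, so it is safe to run on every invocation.
mkdir -p /tmp/backup-demo/Friday
mkdir -p /tmp/backup-demo/Friday   # repeat run: no error, no change
```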

Which brings us nicely to the final line: the rsync command. Simply place whatever rsync options you want here as they will be used each time the script is run.

Note the use of the '$dest' variable. The downloads will be placed in the '$dest' directory.
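Before handing the script over to cron, it can be worth a dry run: rsync's '-n' ('--dry-run') flag lists what would be transferred without copying anything. A local sketch (the paths here are throwaway examples, not the article's Slice):

```shell
# Build a tiny source tree to sync.
mkdir -p /tmp/rsync-src
echo "hello" > /tmp/rsync-src/file.txt

# -n / --dry-run: report what rsync would do, but change nothing --
# the destination directory is not even created.
rsync -avn /tmp/rsync-src/ /tmp/rsync-dry-dst/
```

For the real backup you would add '-n' to the script's rsync line, check the output looks sane, then remove it.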

Executable

Now we need to make the file executable or nothing will happen when it is run:

chmod +x /home/backup/bin/backup

Adjust the path and name to the file you created above, but that's all you need to do.

Cron job

Final task is to create a cron job which will run the backup script at specified times.

I'm not going to go into details of cron here (I'll do a separate article for that) but feel free to search the Interweb if you want to adjust the times I show below.

As the normal user (we don't need root privileges for this) enter:

crontab -e

Now add the following:

# run rsync at 23.55hrs every day

55 23 * * *     sh /home/backup/bin/backup

A very simple command which you can embellish by adding an email address so the results can be sent to you and so on. I'll explain more about cron tasks in another article.

Basically, each day at 23.55hrs it runs the script we created earlier. Depending on how you have set up your Slice, you may get internal mail regarding the output.
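As a sketch of the email embellishment mentioned above: cron mails a job's output to the address in the MAILTO variable, so a crontab like the following would send the rsync '--stats' summary to you each night (the address here is a placeholder):

```shell
# MAILTO must appear before the job it applies to;
# an empty MAILTO="" disables mail entirely.
MAILTO=admin@example.com

# run rsync at 23.55hrs every day
55 23 * * *     sh /home/backup/bin/backup
```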

Summary

Although we went into detail, the basics of writing a script and setting up a simple cron task are straightforward.

It saves time by automating essential, but repetitive, tasks and frees the sysadmin, you, for more important things. Like playing.

PickledOnion.

Article Comments:

Jaime commented Wed Nov 28 13:22:15 UTC 2007:

I think this article is not completely right. If you left it as it is now it will not work.

Before setting up the script in cron, you need to set up passwordless SSH access to the machine you want to back up (B) or rsync will not be able to connect without prompting for a password.

You can find a quick tutorial about this here: http://www.astro.caltech.edu/~mbonati/WIRC/manual/DATARED/settingup_no-passwordssh.html

However... it may have serious security risks, as the user that is used to connect to machine "B" will be able to access it from machine A without any password.

PickledOnion commented Wed Nov 28 13:31:48 UTC 2007:

Hi Jaime,

You are correct in that you would, of course, have to set up SSH facilities on both machines.

The article is not intended to show how to configure that, it is only to show how a script can automate the repetitive backup tasks.

As an aside, each of the 'set up' articles here takes the user through creating and setting up passwordless access.

You would not put the private key on a shared host or work machine for the obvious reasons.

PickledOnion.

jTaby commented Sat Jan 12 12:56:18 UTC 2008:

It would be nice to have the cron article :( it's been a month, i think you've had a long enough break :P

dayid commented Wed Apr 23 14:07:47 UTC 2008:

I would highly advise anyone using rsync for network transfers to also make use of the "-c" argument to compare files by checksum. If you are transferring larger files and the process is killed or dies (or network fails), you may find the "size-on-disk" and "last modified" dates to match on source/destination, even if the full MB/GB of file has not transferred. This can be bad when you go to restore a large file from backup only to find that it half-transferred once; and upon subsequent transfers was skipped because the on-disk stat/file information matched, even though the checksum did not.

macc commented Fri May 15 20:34:17 UTC 2009:

How can I add an email address (to cronjob) so the results can be sent to me..?

Chris commented Mon Aug 30 15:15:05 UTC 2010:

Nice article! One quick note...

You don't need "l" in your rsync attributes. The "a" (archive mode) implies "-rlptgoD".

harry commented Thu Nov 04 19:17:25 UTC 2010:

How would you set up your bash script so that it would do incremental backups for two weeks, then a full backup ... and repeat? Full, Incremental x 14, Full , Incremental x 14. I really only need to keep data for 2 weeks in the past.

It seems that this script will do a full backup each day of the week, leaving you with 7 times your original size.

Jered commented Fri Nov 05 19:43:01 UTC 2010:

Harry, there are a few possible approaches. What the script in its current form does is use a directory for each day of the week (Monday, Tuesday, etc.) and replace/update any files already there.

The easiest way might be just to run a modified form of the script via crontab on the 2nd and 16th of every month (a schedule like "55 1 2,16 * *" would do it). For that, make a copy of the above script and change the line:

dest=/backup/demo/`date +%A`

To something like:

dest=/backup/demo/`date +%F`

That way instead of writing to a directory named for the day of the week, the bi-weekly backups would write to a directory named after the full date (like "2010-11-05").

If you don't want to have everything backed up twice on those two days you would need to change the scheduling part of the crontab entry to be "55 23 2-14,16-31 * *" (the dates are different because of the different times of the backups).

Harry commented Tue Nov 09 16:16:57 UTC 2010:

Jered, thanks a lot.

Macc, to add an email address so results are sent to you, add the following to your crontab ("crontab -e" or "/etc/crontab"):

MAILTO=foo@bar.com

If MAILTO is set but left empty (MAILTO=""), no mail will be sent. Otherwise mail is sent to the owner of the crontab.
