Backing up your files with rsync

Backing up files on a regular basis is an integral part of administering your server.

One way is to download each and every file when you want to save them. However, rsync makes the task a lot easier as it only downloads files that have changed - saving time and bandwidth.


Installation

Installing rsync is as simple as using your OS's package manager, for example:

sudo aptitude install rsync
...
sudo emerge rsync
...
sudo yum install rsync

Do remember that rsync must be installed on both systems: the one you are copying from and the one you are copying to.
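A quick way to confirm this is to look for the rsync binary on each machine. The snippet below is only a sketch: have_cmd is a helper name I made up for it, and the remote check in the final comment assumes the port, user and address used later in this article.

```shell
#!/bin/sh
# have_cmd is a helper name made up for this sketch; it succeeds if the
# named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd rsync; then
    # Print only the first line of the version banner.
    rsync --version | head -n 1
else
    echo "rsync is not installed on this machine" >&2
fi

# The same check can be run against the other machine over SSH, e.g.:
#   ssh -p 30000 demo@123.45.67.890 'command -v rsync'
```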

Preparation

Very little to do here except establish where the saved files will be located.

In this example, I am going to back up my main Slice home directory to another server.
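Preparing the location could look something like the sketch below. BACKUP_DIR is a variable name chosen for illustration, and it defaults to a throwaway directory so the snippet can be tried without root privileges; on a real backup server you would point it at the folder you actually want, such as /backup. Restricting the permissions is my own suggestion rather than a requirement.

```shell
#!/bin/sh
# BACKUP_DIR is a name chosen for this sketch. It defaults to a throwaway
# location; set it to the real backup folder when using this for real.
BACKUP_DIR="${BACKUP_DIR:-${TMPDIR:-/tmp}/backup-demo}"

# Create the directory if it does not exist yet.
mkdir -p "$BACKUP_DIR"

# Optional: restrict access to the owner, since backups of a home
# directory usually contain private files.
chmod 700 "$BACKUP_DIR"

# Show how much free space the filesystem holding the directory has.
df -h "$BACKUP_DIR"
```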

Security

As a rule (and one I stick to very closely), we don't upload and download anything without some encryption in place. As such we will be using the SSH protocol with rsync to ensure no one else can sniff out the data being transferred.

What does this mean? Well, if you want to automate your backups, you will need to ensure the destination server (where the backup directory is) has access to the originating server.

In my case, I have set up SSH keys so I don't need to enter a password each time I attempt to rsync my home folder. It's perfectly fine not to do it that way, but you will need to enter the password each time you rsync.
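If you have not used SSH keys before, the setup is roughly the two steps sketched below, run on the destination server. Treat it as an outline under assumptions rather than the exact procedure I followed: KEY_FILE, make_backup_key and install_backup_key are names invented here, the key type is my choice, and the empty passphrase (needed for unattended runs) is a trade-off you should decide on yourself. The port, user and address are the ones from the command in this article.

```shell
#!/bin/sh
# Sketch only: nothing runs until the functions are called.
# KEY_FILE, make_backup_key and install_backup_key are made-up names.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/backup_key}"

# Step 1: create a key pair. The empty passphrase (-N '') is an assumption
# made so that automated backups do not stop to ask for one.
make_backup_key() {
    ssh-keygen -t ed25519 -N '' -f "$KEY_FILE"
}

# Step 2: copy the public half to the originating server. The account
# password is asked for once here and not again afterwards.
install_backup_key() {
    ssh-copy-id -i "$KEY_FILE.pub" -p 30000 demo@123.45.67.890
}
```

After calling both functions, rsync can be pointed at the key by adding -i to the ssh part of the command, for example -e "ssh -p 30000 -i $KEY_FILE".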

Command

So on the destination server, the command I would give is as follows:

rsync -e 'ssh -p 30000' -avl --delete --stats --progress demo@123.45.67.890:/home/demo /backup/

Let's go through the command in order:

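-e 'ssh -p 30000': this tells rsync to use SSH as its transport and sets the port (30000 here).

-avl: This contains three options:

(a) is archive mode, which is shorthand for -rlptgoD: it recurses into directories and preserves symlinks, permissions, modification times, group, owner, and device and special files.
(v) is verbose mode. You can leave it out, or repeat it for more detail (-vv).
(l) copies symlinks as symlinks. It is already included in archive mode, so it is redundant here but harmless.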

--delete: deletes files from the destination folder that are no longer required (i.e. they have been deleted from the folder being backed up).

--stats: prints a summary at the end of the run, such as the number of files and the bytes sent and received.

--progress: shows the progress of each file transfer. Can be useful to know if you have large files being backed up.

demo@123.45.67.890:/home/demo: The originating folder to back up.

The syntax here is very important - naturally you need the username and IP address of the originating server - but note in this example there is no trailing slash (/).

If you leave the trailing slash off, the named folder and contents will be downloaded. So in this case I would have a folder called 'demo' in my backup directory.

If I added the trailing slash (demo@123.45.67.890:/home/demo/) then I would have the contents of demo in my backup. So I may have folders called 'public_html' or 'configs' or 'bin' and so on.

/backup/: Identifies the folder on the backup server in which to place the files.
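The trailing-slash rule is easy to get wrong, so here is a small local demonstration you can run safely. It is only a sketch: the directories are throwaway ones created with mktemp, not the paths from my command, and the variable name work is made up for the example.

```shell
#!/bin/sh
# Local demonstration of the trailing-slash rule. All paths are throwaway
# directories created for this sketch.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not found, skipping demo" >&2
else
    work=$(mktemp -d)
    mkdir -p "$work/home/demo/public_html" "$work/backup1" "$work/backup2"
    echo "hello" > "$work/home/demo/public_html/index.html"

    # No trailing slash on the source: the folder itself is copied.
    rsync -a "$work/home/demo" "$work/backup1/"

    # Trailing slash on the source: only the contents are copied.
    rsync -a "$work/home/demo/" "$work/backup2/"

    ls "$work/backup1"   # lists: demo
    ls "$work/backup2"   # lists: public_html
fi
```

The same rule applies when the source is on a remote server, as it is in the command above.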

Output

So from the command above, my storage server would start to output something like this:

receiving file list ...
31345 files to consider
./
tuning-primer.sh
       42596 100%  533.30kB/s    0:00:00 (xfer#2, to-check=31331/31345)
bin/
bin/Backup
         618 100%    7.74kB/s    0:00:00 (xfer#3, to-check=31310/31345)
bin/Search
         455 100%    5.70kB/s    0:00:00 (xfer#4, to-check=31309/31345)
configs/
configs/php.ini
         114 100%    1.43kB/s    0:00:00 (xfer#5, to-check=31307/31345)
public_html/
...
...

As you can see, it receives a list of files (31,345 of them) and, for the first run, downloads them all.

Running rsync again will only download files that have changed so, depending on how busy your home directory is, later runs will transfer far less data.
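To see this behaviour without touching a real server, the sketch below runs rsync twice between two throwaway local directories. The file names and the demo and second_run variables are invented for the example, and -i (--itemize-changes) is an extra option not used in my command; it prints one line for each item rsync acts on.

```shell
#!/bin/sh
# Local demonstration of an incremental second run. All paths are
# throwaway directories created for this sketch.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not found, skipping demo" >&2
else
    demo=$(mktemp -d)
    mkdir -p "$demo/src" "$demo/dst"
    echo "one" > "$demo/src/a.txt"
    echo "two" > "$demo/src/b.txt"
    echo "three" > "$demo/src/c.txt"

    # First run: everything is new, so all three files are transferred.
    rsync -a --delete "$demo/src/" "$demo/dst/"

    # Change one file (its size changes too) and delete another.
    echo "one, plus an edit" > "$demo/src/a.txt"
    rm "$demo/src/c.txt"

    # Second run: only the changed file is sent, and --delete removes
    # the file that no longer exists in the source.
    second_run=$(rsync -a --delete -i "$demo/src/" "$demo/dst/")
    echo "$second_run"
fi
```

The second run should mention a.txt and report c.txt as deleted, while the untouched b.txt is skipped.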

Summary

This was a quick introduction to rsync. The command shown is a simple but effective and secure means of creating an incremental backup of your files.

The next articles will concentrate on some more options such as excluding certain files or directories from the backup and creating a script to automate the process and put the backups into daily folders.

PickledOnion.

Article Comments:

Oscar Merida commented Sun Oct 14 05:00:10 UTC 2007:

For rotating daily/weekly snapshots that use rsync to minimize disk space consumed, check out rsnapshot.

apt-get install rsnapshot

http://www.rsnapshot.org/

Šime commented Tue Oct 30 18:59:09 UTC 2007:

The -a (archive) option is a shorthand for 'rlptgoD' so you should exclude the -l as it’s already covered by the -a switch.

lambo commented Wed Mar 26 05:33:45 UTC 2008:

Just wondering, for this tutorial is it required to have another server such as the storage server that you mention where you back up the files to? Or is it possible to just save these files into storage somewhere within slicehost without needing another server account? rsnapshot looks interesting, but my goal is to be able to save the configuration of my originating server somewhere in the available storage that I have, so if I ever break my installation, I can restore from backup. What is the best way to achieve this while only having one account? If rsnapshot is a solution, would you be able to do a tutorial on rsnapshot? Any help would be appreciated.

wease commented Mon Jun 09 00:28:42 UTC 2008:

lambo

Just don't use the -e command. Also for the sink directory (i.e. the directory within your user account where you will be backing up) designate it as you did for your source - that is drop the 'demo@123.45.67.890:' part

Defenestrator commented Thu Dec 18 05:32:30 UTC 2008:

You can also take daily snapshots of your backup while not using up too much space. Something along the lines of a "cp -al backup `date +%Y%m%d`" after rsync finishes will make a copy of the "backup" folder using hardlinks. Unchanged files will just have another hardlink to them, so only changed files will take up extra space. I believe rsnapshot does something similar.

Defenestrator commented Thu Dec 18 05:35:39 UTC 2008:

On the above comment, the section in the dark block should be in backticks. The comment system seems to have interpreted it as some sort of quote block instead, but it's intended to be all on one line.

Amitek Rathod commented Sun Sep 23 13:22:41 UTC 2012:

I want to download a file which uses a rsync protocol but i am not able to download it.

here a link to that file

http://proisk.com/?host=189.250.48.233&path=%2FDATOS%2FImagenes%2FDATABASES&name=ORACLE+9I+R2_WINDOWS&ext=ISO

it is an iso file.

Please help me.

Jered commented Tue Sep 25 15:53:22 UTC 2012:

If you want to download a file using an http URL you'd be better off using "wget" or "curl". The rsync program is primarily for copying files between directories or servers that you control.

sysadmin commented Wed Oct 03 19:18:47 UTC 2012:

Just wondering why are you using the "-e ssh" option everywhere with rsync. The man page says that ssh is used by default, so this is redundant, I suppose.

Jered commented Tue Nov 27 19:54:32 UTC 2012:

Part of it is being explicit, but I'm sure part of it is from old habit. The rsync tool didn't always use ssh by default (it used to use the much less secure rsh), and this article is about five years old now. On the bright side, including "-e ssh" doesn't hurt anything, it's just unnecessary.

