rsync - exclude files and folders
rsync is cool. What would be even cooler would be excluding particular files or even a whole folder from the backup process.
That's no problem using the rsync '--exclude' option.
Why?
In the previous article, we looked at backing up files and folders to another server.
This was easily completed, but there may well be some time-sensitive files that are not needed, such as log files. Sure, there are some log files (perhaps Apache logs) that you want to keep, but there are others you won't, such as a Ruby on Rails production log.
Perhaps there are files containing your DB password, such as a PHP mysqli connection file. Although needed on the main server, it is not needed on the backup.
A folder I always exclude when completing an rsync on my home folder is the 'sources' directory: I don't need copies of the source code I have downloaded.
Let's see how to exclude that directory.
Single files and folders
The original command was like this:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress demo@123.45.67.890:/home/demo /backup/
To exclude the sources directory we simply add the '--exclude' option:
--exclude 'sources'
Note: the directory path is relative to the folder you are backing up.
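So looking at the original command, I am backing up /home/demo. Adding the name 'sources' to the exclude pattern will exclude the '/home/demo/sources' folder. Because the pattern has no leading slash, it will also exclude anything else named 'sources' further down the tree.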
As such, the final command would look like this:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude 'sources' demo@123.45.67.890:/home/demo /backup/
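If you want to see how the pattern behaves before touching a real server, here is a small local sketch. It assumes rsync is installed and works entirely in a throwaway temp folder; the names demo, backup1 and backup2 are made-up stand-ins for /home/demo and /backup, not part of the setup above. It compares a bare name like 'sources' with a pattern anchored by a leading slash.

```shell
#!/bin/sh
# Local sketch only: every path below is a throwaway stand-in.
set -eu
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/demo/sources" "$work/demo/project/sources"
echo 'int main(void) { return 0; }' > "$work/demo/sources/a.c"
echo 'int x;' > "$work/demo/project/sources/b.c"
echo 'keep me' > "$work/demo/notes.txt"

# Bare name: anything called 'sources' is skipped, at any depth.
rsync -a --exclude 'sources' "$work/demo" "$work/backup1/"

# Leading slash: anchored to the top of the transfer, so only the
# top-level sources folder is skipped. The pattern starts with /demo
# because the source path has no trailing slash.
rsync -a --exclude '/demo/sources' "$work/demo" "$work/backup2/"

# Show what actually arrived in each backup.
(cd "$work" && find backup1 backup2 | sort)

# Remove "$work" by hand when you are done looking at it.
```

In this sketch backup1 should contain no 'sources' folder at all, while backup2 should still have demo/project/sources/b.c.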
The same can be applied to files. I have decided that in addition to the sources folder, I want to exclude the file named 'database.txt' which resides in the public_html folder. So I add this:
--exclude 'public_html/database.txt'
So now the command looks like:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude 'sources' --exclude 'public_html/database.txt' demo@123.45.67.890:/home/demo /backup/
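Since the command uses --delete, it can be worth previewing what rsync would do before running it for real. The sketch below is one way to do that with rsync's -n (--dry-run) option. It assumes rsync is installed and uses a made-up local tree in a temp folder rather than the remote server, so the file names are illustrative only.

```shell
#!/bin/sh
# Local sketch only: the tree below is invented for the preview.
set -eu
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/demo/sources" "$work/demo/public_html" "$work/backup"
echo 'tarball' > "$work/demo/sources/pkg.tar"
echo 'secret' > "$work/demo/public_html/database.txt"
echo '<html></html>' > "$work/demo/public_html/index.html"

# -n (--dry-run) lists what would be copied without writing anything.
preview=$(rsync -avn \
    --exclude 'sources' \
    --exclude 'public_html/database.txt' \
    "$work/demo" "$work/backup/")

printf '%s\n' "$preview"

# Remove "$work" by hand when you are done looking at it.
```

The listing should mention index.html but neither the sources folder nor database.txt, and the backup folder stays empty because nothing is actually transferred. Once the list looks right, drop the n and run the command again.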
Multiple files and folders
Unfortunately, I have a load of files and folders I don't want to back up and adding each one like that will get tedious very quickly.
Not only will it get boring, it will make the command super long and prone to easy mistakes.
That's OK as I can define all the files and folders I want to exclude in a single file and have rsync read that.
To do this, create a file called 'exclude.txt' on the destination machine (the system you run the rsync command on):
nano /home/backup/exclude.txt
Define
Then define the files and folders you want to exclude from the rsync:
sources
public_html/database.*
downloads/test/*
As you can see, you can define patterns.
The first entry is straightforward. It will exclude any file or folder called 'sources' (remember the path is relative).
The second entry will look in the public_html folder and exclude any files (or folders) that begin with 'database.'. The * at the end is a wildcard, so 'public_html/database.txt' and 'public_html/database.yaml' will be excluded from the backup.
Using a wildcard, the final entry will exclude the contents of the 'downloads/test/' folder but still copy the folder itself (in other words, I will end up with an empty 'test' folder).
Final command
Now we have defined what to exclude we can direct rsync to the file with:
--exclude-from '/home/backup/exclude.txt'
Note: the path for this file is absolute. You are defining where in the file system rsync should look for the exclude patterns.
So, this time the final command would be:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude-from '/home/backup/exclude.txt' demo@123.45.67.890:/home/demo /backup/
That's better.
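To check an exclude file without involving the remote server, something like the following local sketch can help. It assumes rsync is installed, builds a small made-up tree in a temp folder, writes the same three patterns to an exclude.txt there, and runs rsync against it. The file names inside the tree are invented for the test.

```shell
#!/bin/sh
# Local sketch only: the tree and file names are invented for the test.
set -eu
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/demo/sources" "$work/demo/public_html" "$work/demo/downloads/test"
echo 'src' > "$work/demo/sources/a.c"
echo 'secret' > "$work/demo/public_html/database.txt"
echo 'secret' > "$work/demo/public_html/database.yaml"
echo '<html></html>' > "$work/demo/public_html/index.html"
echo 'junk' > "$work/demo/downloads/test/big.iso"

# The same three patterns as in the article, one per line.
cat > "$work/exclude.txt" <<'EOF'
sources
public_html/database.*
downloads/test/*
EOF

rsync -a --exclude-from "$work/exclude.txt" "$work/demo" "$work/backup/"

# Show what actually arrived in the backup.
(cd "$work/backup" && find . | sort)

# Remove "$work" by hand when you are done looking at it.
```

The backup should contain public_html/index.html and an empty downloads/test folder, with no sources folder and no database.* files.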
Summary
Excluding files from a backup can be a time and space saver. The excluded files are usually superfluous downloads or files that contain sensitive information, such as passwords, that just don't need to be in several locations.
Using the exclude-from option allows for easy control over the exclude patterns - you can even define several exclude files and simply point rsync to the one that is convenient.
PickledOnion.


Article Comments:
CLIdiot commented Wed Oct 10 17:16:44 UTC 2007:
Thanks for the writeup, Pik.
I frequently find myself adding the -P option for large transfers. It preserves partial transfers in case of interruption, and gives a progress report on each file as it's being uploaded.
I move large media files back and forth on my servers, so knowing how long the transfer has remaining is very useful.
Lukas commented Sun Nov 18 19:09:06 UTC 2007:
Thank you! I'd been wondering for months why my excludes weren't working. I wish the man page were as clear as your write-up.
Peter Tonna commented Fri Mar 28 16:00:43 UTC 2008:
Hi, great site. I have a problem excluding folders. I created a file with absolute paths and I am using --exclude-from '/backup/Exclude_List', but rsync just ignores the exclude list and backs up the excluded folders as well.
/backup/Exclude_List
/home/test1/
/home/test2/*
/home/test3/
Any help? Thanks
Brandon commented Fri May 02 20:44:57 UTC 2008:
It's interesting that rsync accepts a path for the exclusion file that is relative to the current directory, but it fails to use it properly. This appears to be a bug in rsync.
In other words (at least on OS X) if I pass
--exclude-from './exclude.txt'
and there's no such file, rsync complains that it can't find the file. Once I create that file, it acts like everything is fine. But it doesn't actually honor the things I list in it until I feed it an absolute path.
So thanks for the tip. I thought you were mistaken at first (since rsync was acting like my path was fine) but it turns out you saved me lots of headache trying to figure out why things weren't working.
(Now I can change all my old rsync scripts to use --exclude-from rather than --exclude !)
Raj commented Mon May 19 11:48:36 UTC 2008:
Hi! It's great, but it asks for the remote server password to rsync the directories. I have to run this command from crontab, so could you please tell me a way to run it without the password prompt?
Thanks in advance.
Pete commented Wed May 28 05:53:05 UTC 2008:
@raj - you need to use ssh key authentication (man ssh-keygen), that will allow you to ssh to the remote host without it prompting for a password (assuming your private key does not have a password).
Once that is done, use -e ssh with rsync.
Cheers Pete
Heena commented Fri Aug 15 05:12:30 UTC 2008:
This is great stuff. It's really helped me to take backups.