rsync - exclude files and folders
rsync is cool. What would be even cooler would be excluding particular files or even a whole folder from the backup process.
That's no problem using the rsync '--exclude' option.
Why?
In the previous article, we looked at backing up files and folders to another server.
This was easily completed but there may well be some time sensitive files that are not needed such as log files. Sure, there are some log files (perhaps Apache logs) that you want to keep but others you won't such as a Ruby on Rails production log.
Perhaps there are files containing your DB password, such as a PHP mysqli connection file. Although needed on the main server, it is not needed on the backup.
A folder I always exclude when completing an rsync on my home folder is the 'sources' directory: I don't need copies of the source code I have downloaded.
Let's see how to exclude that directory.
Single files and folders
The original command was like this:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress demo@123.45.67.890:/home/demo /backup/
To exclude the sources directory we simply add the '--exclude' option:
--exclude 'sources'
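Note: the pattern is matched against paths relative to the folder you are backing up, not against absolute paths on the server.
So looking at the original command, I am backing up /home/demo. Adding the name 'sources' to the exclude pattern will exclude the 'home/demo/sources' folder. Be aware that a bare name like this matches at any depth, so a folder such as 'home/demo/projects/sources' would be skipped as well. To exclude only the top-level one, anchor the pattern with a leading slash: because the source is given as /home/demo with no trailing slash, the transferred paths start with 'demo/', so the anchored pattern is '/demo/sources'.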
As such, the final command would look like this:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude 'sources' demo@123.45.67.890:/home/demo /backup/
The same can be applied to files. I have decided that in addition to the sources folder, I want to exclude the file named 'database.txt' which resides in the public_html folder. So I add this:
--exclude 'public_html/database.txt'
So now the command looks like:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude 'sources' --exclude 'public_html/database.txt' demo@123.45.67.890:/home/demo /backup/
Multiple files and folders
Unfortunately, I have a load of files and folders I don't want to back up and adding each one like that will get tedious very quickly.
Not only will it get boring, it will make the command super long and prone to easy mistakes.
That's OK as I can define all the files and folders I want to exclude in a single file and have rsync read that.
To do this create a file called 'exclude.txt' on the destination machine (the system you give the rsync command on):
nano /home/backup/exclude.txt
Define
Then define the files and folders you want to exclude from the rsync:
sources
public_html/database.*
downloads/test/*
As you can see, you can define patterns.
The first entry is straightforward. It will exclude any file or folder called 'sources' (remember the path is relative).
The second entry will look in the public_html folder and exclude any files (or folders) that begin with 'database.'. The * at the end is a wildcard, so 'public_html/database.txt' and 'public_html/database.yaml' will be excluded from the backup.
Using a wildcard, the final entry will exclude the contents of the 'downloads/test/' folder but still copy the folder itself (in other words, I will end up with an empty 'test' folder).
Final command
Now we have defined what to exclude we can direct rsync to the file with:
--exclude-from '/home/backup/exclude.txt'
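Note: I give an absolute path for this file. rsync also accepts a relative path, which it resolves against the directory the command is run from, so an absolute path is the safer choice in scripts and cron jobs. Either way, you are defining where in the file system rsync should look for the exclude patterns.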
So, this time the final command would be:
rsync -e 'ssh -p 30000' -avl --delete --stats --progress --exclude-from '/home/backup/exclude.txt' demo@123.45.67.890:/home/demo /backup/
That's better.
Summary
Excluding files from a backup can be a time and space saver. The excluded files are usually superfluous downloads, or files that contain sensitive information such as passwords and don't need to be in several locations.
Using the exclude-from option allows for easy control over the exclude patterns - you can even define several exclude files and simply point rsync to the one that is convenient.
PickledOnion.


Article Comments:
CLIdiot commented Wed Oct 10 17:16:44 UTC 2007:
Thanks for the writeup, Pik.
I frequently find myself adding the -P option for large transfers. It preserves partial transfers in case of interruption, and gives a progress report on each file as it's being uploaded.
I move large media files back and forth on my servers, so knowing how long the transfer has remaining is very useful.
Lukas commented Sun Nov 18 19:09:06 UTC 2007:
Thank you! I'd been wondering for months why my excludes weren't working. I wish the man page were as clear as your write-up.
Peter Tonna commented Fri Mar 28 16:00:43 UTC 2008:
Hi, great site. I have a problem excluding folders. I created a file with absolute paths and am using --exclude-from '/backup/Exclude_List', but rsync just ignores the exclude list and backs up the excluded folders as well.
/backup/Exclude_List:
/home/test1/
/home/test2/*
/home/test3/
Any help? Thanks
Brandon commented Fri May 02 20:44:57 UTC 2008:
It's interesting that rsync accepts a path for the exclusion file that is relative to the current directory, but it fails to use it properly. This appears to be a bug in rsync.
In other words (at least on OS X) if I pass
--exclude-from './exclude.txt'
and there's no such file, rsync complains that it can't find the file. Once I create that file, it acts like everything is fine. But it doesn't actually honor the things I list in it until I feed it an absolute path.
So thanks for the tip. I thought you were mistaken at first (since rsync was acting like my path was fine) but it turns out you saved me lots of headache trying to figure out why things weren't working.
(Now I can change all my old rsync scripts to use --exclude-from rather than --exclude !)
Raj commented Mon May 19 11:48:36 UTC 2008:
Hi! It's great, but it asks for the remote server password when syncing the directories. I have to run this command from crontab, so could you please tell me how to run it without the password prompt?
Thanks in advance.
Pete commented Wed May 28 05:53:05 UTC 2008:
@raj - you need to use ssh key authentication (man ssh-keygen), that will allow you to ssh to the remote host without it prompting for a password (assuming your private key does not have a password).
Once that is done, use -e ssh with rsync.
Cheers Pete
Heena commented Fri Aug 15 05:12:30 UTC 2008:
This is great stuff. It really helped me take backups.
Matthias commented Sun Oct 19 08:46:26 UTC 2008:
For backups, there is a tool called "rsnapshot" that uses rsync and can do hourly, daily, weekly etc backups. Great tool for automation.
Ellis commented Tue Oct 21 04:58:04 UTC 2008:
The section on "Single files and folders" is a bit misleading. If you add --exclude "sources" to your rsync command, every folder with the name "sources" within the original path will be excluded, e.g. both "home/demo/sources" and "home/demo/blah/foo/sources" will be excluded.
I'm still struggling to find out how to truly exclude only a single folder, but I've not had any luck yet.
jonobr commented Fri Nov 07 19:27:07 UTC 2008:
Greeting!
I was directed here from the Ubuntu forum. The info here is great. I was wondering, however, whether there is an exclude option to skip files that have no content, i.e. are zero bytes in size?
Thanks a mill!
ez linux admin commented Wed Dec 17 12:07:22 UTC 2008:
It couldn't have been explained better. The exclude file will save me so much time when restoring backups. Thanks!
Jerome commented Wed Jan 21 07:25:55 UTC 2009:
anyway to --exclude *.log ?
Kay Farin commented Mon Feb 02 15:37:12 UTC 2009:
Thank you for your article, especially about the possibility to use a file for stating several objects to be excluded (--exclude-from parameter).
Hope to see you reading on my blog, Kay!
Herman HugeLoad IV commented Tue Feb 03 13:56:15 UTC 2009:
What a great article - thanks for this! As a previous commenter remarked - if only the man page was as well written!
One question: Where would I find syntax for rsync "push" and "pull" meaning how you might set this up to pull files from another box, and push files *to* another box?
I'm not totally clear on this aspect of rsync (yet). Thanks a bunch for writing this stuff up!
Murxx commented Tue Mar 17 16:10:37 UTC 2009:
Somehow that does not work for me here... Do you have an idea why it's failing? I want to exclude folder "/runenv/EDM8_qa/installer/8.1.2.8/config" because on the target machine this folder does not belong to my user. Unfortunately I still get an error:
rsync -rptogvzl --no-whole-file --delete --exclude '/runenv/EDM8_qa/installer/8.1.2.8/config' --ignore-errors --stats -e rsh /runenv/EDM8_qa/ target-machine:/runenv/EDM8_qa
building file list ... done
rsync: failed to set times on "/runenv/EDM8_qa/installer/8.1.2.8/config/deployments": Not owner (1)
rsync: failed to set times on "/runenv/EDM8_qa/installer/8.1.2.8/config/deployments": Not owner (1)
Number of files: 29217
Number of files transferred: 0
Total file size: 28071201005 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 1875958
Total bytes sent: 1876020
Total bytes received: 20
sent 1876020 bytes received 20 bytes 50027.73 bytes/sec
total size is 28071201005 speedup is 14963.01
rsync error: some files could not be transferred (code 23) at main.c(702)
Niels commented Sat Apr 11 18:35:04 UTC 2009:
QUOTE
Ellis
The section on "Single files and folders" is a bit misleading. If you add --exclude "sources" to your rsync command, every folder with the name "sources" within the original path will be excluded, e.g. both "home/demo/sources" and "home/demo/blah/foo/sources" will be excluded.
I'm still struggling to find out how to truly exclude only a single folder, but I've not had any luck yet.
To exclude a single folder, start your pattern with a / and then give the folder's path relative to the directory you are backing up.
So to exclude the folder sources but not home/sources, use the pattern /sources and you're set.
Good luck
abrbon commented Mon Apr 27 13:21:01 UTC 2009:
Great article, exactly what I was looking for. However, it does not seem to work with directory names with spaces in them. For example, I want to exclude a file named /ds-install/Tomcat 6.0/conf/server.xml but that does not seem to work. I have already tried to escape the space with '\' like this: /ds-install/Tomcat\ 6.0/conf/server.xml but that doesn't work either. Any ideas?
Kang commented Fri May 29 02:57:06 UTC 2009:
This is very helpful, it made my sync between my lab and home much easier. Thanks!
Ko commented Tue Jul 07 12:38:47 UTC 2009:
Wow!! Great article! You saved my day on the --exclude -subject. I think this should be added in the rsync-manpage!
attila commented Mon Jul 13 14:42:17 UTC 2009:
Nice article, thanks!
Ol commented Fri Aug 07 12:22:14 UTC 2009:
Hi,
Great tutorial. I'm not an expert in managing servers, but the man page says the option -a is the same as -rlptgoD. Is there a reason why you still specify the option -l?
Larry commented Tue Aug 25 14:31:50 UTC 2009:
Good info! I'm running the command:
rsync -ave "ssh -l root -p $ServerPort" \
--delete --stats \
--link-dest=$ServerFolder/daily.1 \
--exclude /proc/ --exclude /tmp/* / root@$ServerURL:..$ServerFolder/daily.0
Having an issue though - any files that are in /tmp are sent even though I've excluded them and they are left in the root of $ServerFolder/daily.0...
(I have the same issue trying to exclude lprng temporary files like /var/spool/lpd//df -- they end up in the root of $ServerFolder/daily.0)
Larry commented Tue Aug 25 15:33:50 UTC 2009:
Figured it out... Shell expansion... Using the --exclude-from option solved it by preventing the shell expansion of the wildcard entries. - With the expansion suppressed, the files do not end up as orphaned files in the destination rsync folder.
Notes to self on the exclude file content:
/dir/ means exclude the root folder /dir
/dir/* means get the root folder /dir but not the contents
dir/ means exclude any folder anywhere where the name contains dir/
Examples excluded: /dir/, /usr/share/mydir/, /var/spool/dir/
/dir means exclude any folder anywhere where the name contains /dir
Examples excluded: /dir/, /usr/share/directory/, /var/spool/dir/
/var/spool/lpd//cf means skip files that start with cf within any folder
within /var/spool/lpd
Larry commented Tue Aug 25 15:38:21 UTC 2009:
(the asterisk between the 2 slashes for the /var/spool example and the one following "cf" were stripped by the web page even though I put it in a "code" segment)
kj commented Fri Aug 28 15:08:52 UTC 2009:
thanks for the article, using it in cygwin, works great!
Scyn commented Thu Sep 17 13:24:29 UTC 2009:
Thanks for the useful tips, definitely clearer than the manpage.
Biagio commented Fri Oct 23 10:45:56 UTC 2009:
Simply... Quickly... Efficiently...
Thanks a lot! Biagio (I)
oh commented Fri Nov 27 16:50:15 UTC 2009:
Hi abrbon
To exclude files/folders with spaces in their names escape a space " " by a character set "[ ]".
Your file "/ds-install/Tomcat 6.0/conf/server.xml" can be excluded in a file "excludes" containing the line
/ds-install/Tomcat[ ]6.0/conf/server.xml
and rsync'ed something like
rsync -va --exclude-from=excludes /src/ /dest
Hope this helps.
Sherry commented Wed Feb 24 16:43:13 UTC 2010:
Thank you for the clear examples. I too find the explanations on the man pages clear as mud. With examples like this, I am able to accomplish a lot more in a lot less time.