Rsync is a very useful tool that allows Linux system administrators to synchronize data locally or with a remote filesystem, via the ssh protocol or by using the rsync daemon. Using rsync is more convenient than simply copying data, because it is able to spot and synchronize only the differences between a source and a destination. The program has options to preserve standard and extended filesystem permissions, compress the data during transfers, and more. We will see the most used ones in this guide.
In this tutorial you will learn:
- How to use rsync to synchronize data
- How to use rsync with a remote filesystem via ssh
- How to use rsync with a remote filesystem via the rsync daemon
- How to exclude files from the synchronization
Software Requirements and Conventions Used
Category | Requirements, Conventions or Software Version Used |
---|---|
System | Distribution-independent |
Software | The rsync application and optionally the rsync daemon |
Other | No special requirements are needed to follow this guide. |
Conventions | # – requires given linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command; $ – requires given linux commands to be executed as a regular non-privileged user |
Rsync – usage
Let’s start with the basic usage of rsync. Suppose we have a directory on our local filesystem, and we want to synchronize its content with another directory, perhaps on an external usb device, in order to create a backup of our files. For the sake of our example our source directory will be /mnt/data/source, and our destination will be mounted at /run/media/egdoc/destination. Our source contains two files, file1.txt and file2.txt, while the destination is empty. The first time we run rsync all the data is copied:
$ rsync -av /mnt/data/source/ /run/media/egdoc/destination
The destination path is the last thing we provided in the command. If we now list its content, we can see that it now contains the source files:
$ ls /run/media/egdoc/destination/ -l
total 0
-rw-r--r--. 1 egdoc egdoc 0 Oct 6 19:42 file1.txt
-rw-r--r--. 1 egdoc egdoc 0 Oct 6 19:42 file2.txt
On subsequent runs, rsync copies only new and modified files, which saves a lot of time and resources. Let’s verify it: first we modify the content of file1.txt inside the source directory:
$ echo linuxconfig > /mnt/data/source/file1.txt
Then we run rsync again and watch the output:
$ rsync -av /mnt/data/source/ /run/media/egdoc/destination
sending incremental file list
file1.txt
sent 159 bytes received 35 bytes 388.00 bytes/sec
total size is 12 speedup is 0.06
The only copied file is the one we modified, file1.txt.
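The same behavior can be reproduced safely on throwaway directories. The following is a minimal sketch, assuming rsync and mktemp are available; the temporary directories stand in for the source and destination used above, and the --out-format='%n' option is used here only to print the name of each item rsync updates.

```shell
# Throwaway directories stand in for the real source and destination.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/file1.txt" "$src/file2.txt"

# First run: both files are new to the destination, so both are transferred.
first_run=$(rsync -a --out-format='%n' "$src/" "$dst")

# Modify one file in the source, then synchronize again.
echo linuxconfig > "$src/file1.txt"
second_run=$(rsync -a --out-format='%n' "$src/" "$dst")

echo "Updated on first run:"
echo "$first_run"
echo "Updated on second run:"
echo "$second_run"
```

On the second run only file1.txt should be listed, since file2.txt is unchanged.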
Create a mirror copy of the source to destination
By default rsync just makes sure that all the files inside the source directory (except those specified as exclusions) are copied to the destination: it does not take care of keeping the two directories identical, and it doesn’t remove files. Therefore, if we want to create a mirror copy of the source in the destination, we must use the --delete option, which causes the removal of files that exist only inside the destination.
Suppose we create a new file called file3.txt in the destination directory:
$ touch /run/media/egdoc/destination/file3.txt
The file doesn’t exist in the source directory, so if we run rsync with the --delete option, it is removed:
$ rsync -av --delete /mnt/data/source/ /run/media/egdoc/destination
sending incremental file list
deleting file3.txt
./
sent 95 bytes received 28 bytes 246.00 bytes/sec
total size is 0 speedup is 0.00
Since this synchronization is potentially destructive, you may want to first launch rsync with the --dry-run option, in order to make the program display the operations that would be performed, without actually modifying the filesystem.
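To see the difference between a preview and a real run, the two can be compared on throwaway directories. This is a minimal sketch, assuming rsync and mktemp are available; the temporary paths and the two status variables are introduced only for this illustration.

```shell
# Throwaway directories: file3.txt exists only in the destination.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/file1.txt" "$dst/file3.txt"

# Preview: rsync reports what --delete would remove, but changes nothing.
rsync -av --delete --dry-run "$src/" "$dst"
if [ -e "$dst/file3.txt" ]; then after_dry_run=present; else after_dry_run=removed; fi

# Real run: the file that exists only in the destination is deleted.
rsync -av --delete "$src/" "$dst"
if [ -e "$dst/file3.txt" ]; then after_real_run=present; else after_real_run=removed; fi

echo "after dry run:  file3.txt is $after_dry_run"
echo "after real run: file3.txt is $after_real_run"
```

Reviewing the dry-run output before repeating the command without --dry-run is a cheap safeguard against pointing --delete at the wrong directory.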
Synchronizing files remotely
Until now, we saw how to use rsync to synchronize two local filesystems. The program can also be used to synchronize files remotely, using a remote shell like rsh or ssh, or the rsync daemon. We will explore both methods.
Running rsync through ssh
For the sake of our example we will keep using the same source directory as in the previous examples, but as the destination we will use a directory on a remote machine with IP 192.168.122.32. I previously set up an openssh server with key-based login on the machine, therefore I won’t need to provide a password to access it.
How can we run rsync via ssh? First of all, for a remote synchronization to work, rsync must be installed on both the local and the remote machine. Rsync tries to contact a remote filesystem using a remote shell program whenever the destination or source path contains a : character. In modern versions of rsync, ssh is used by default; to use another remote shell, or to declare the shell explicitly, we can use the -e option and provide the shell as its argument. Supposing our destination directory on the remote machine is /home/egdoc/destination, we can run:
$ rsync -av -e ssh /mnt/data/source/ egdoc@192.168.122.32:/home/egdoc/destination
Notice that we specified the destination in the form <user>@<machine address>:/path/to/directory.
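The argument of -e can also carry options for the remote shell itself. The sketch below assumes, purely for illustration, that the ssh server listens on the hypothetical port 2222; it only assembles and prints the command, so it can be checked before anything is transferred.

```shell
# Hypothetical value for illustration: adjust to the actual ssh port.
ssh_port=2222
remote="egdoc@192.168.122.32:/home/egdoc/destination"

# Build the argument list. The whole remote shell command, options included,
# is passed to -e as one single quoted argument.
set -- rsync -av -e "ssh -p $ssh_port" /mnt/data/source/ "$remote"

# Print one argument per line to review the command.
printf '%s\n' "$@"

# Uncomment to actually start the transfer:
# "$@"
```

Quoting matters here: without the quotes around the ssh command, -p would be interpreted as an option of rsync rather than of ssh.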
Contacting a remote machine via the rsync daemon
The other method we can use to synchronize files with a remote machine is the rsync daemon. This obviously requires the daemon to be installed and running on the remote machine. Rsync tries to contact the remote machine by talking to the daemon whenever the source or destination path contains a :: (double colon) separator after the host specification, or when an rsync URL of the form rsync:// is specified.
Supposing the rsync daemon is listening on port 873 (the default) on the remote machine, we can contact it by running:
$ rsync -av /mnt/data/source/ 192.168.122.32::module/destination
Alternatively we can use an rsync URL:
$ rsync -av /mnt/data/source/ rsync://192.168.122.32/module/destination
In both examples, module doesn’t represent the name of a directory on the remote machine, but the name of a resource, or module in rsync terminology, configured by the administrator and made accessible via the rsync daemon. The module can point to any path on the filesystem.
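To make the idea of a module more concrete, here is a minimal sketch of what its definition on the daemon side might look like. The module name matches the one used in the commands above, while the path and comment are hypothetical; the snippet only writes the configuration to a temporary file and prints it, and doesn’t start a daemon.

```shell
# Write a minimal daemon configuration to a temporary file for inspection.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Each [section] defines a module; clients refer to it by this name.
[module]
    # Hypothetical directory the module points to.
    path = /srv/rsync/module
    comment = Example module
    # Modules are read only by default; uploads must be allowed explicitly.
    read only = false
EOF

cat "$conf"
```

On the remote machine such content would normally be placed in /etc/rsyncd.conf, the file the daemon reads by default. From a client, running rsync 192.168.122.32:: with nothing after the double colon asks the daemon to list the modules it exposes.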
Excluding files from the synchronization
Sometimes we want to exclude some files or directories from the synchronization. There are basically two ways we can accomplish this task: by specifying an exclusion pattern directly with --exclude (multiple patterns can be specified by repeating the option), or by writing all the patterns into a file (one per line). When using the latter method, we must pass the file path as argument to the --exclude-from option.
All the files and directories matching the pattern will be excluded from the synchronization. For example, to exclude all files with the “.txt” extension we would run:
$ rsync -av /mnt/data/source/ /run/media/egdoc/destination --exclude='*.txt'
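Both methods can be tried together on throwaway directories. The following is a minimal sketch, assuming rsync and mktemp are available; the file names and the patterns file are hypothetical. Patterns given on the command line are quoted, so that the shell passes them to rsync untouched instead of expanding them against the current directory.

```shell
# Throwaway source with a few hypothetical files.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/notes.txt" "$src/photo.jpg" "$src/build.log" "$src/report.pdf"

# Patterns file: one exclusion pattern per line.
patterns=$(mktemp)
printf '%s\n' '*.txt' '*.log' > "$patterns"

# Combine a pattern given directly with the ones read from the file.
rsync -a --exclude='*.pdf' --exclude-from="$patterns" "$src/" "$dst"

# Only files matching none of the patterns reach the destination.
copied=$(ls "$dst")
echo "$copied"
```

With these patterns, photo.jpg is the only file that matches no exclusion, so it should be the only one copied.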
Conclusions
In this article we took a quick look at rsync, a very useful tool we can use to synchronize files and directories on both local and remote filesystems. We saw the program’s most used options and what they let us accomplish, how to specify the source and destination directories, and the methods we can use to contact a remote filesystem. Finally, we saw how to exclude files from the synchronization, specifying the exclusion patterns directly or inside a file. Rsync has a lot of options, too many to mention here. As always, we can find all the information we need in the program manual!