Rsync is a free and open-source tool that allows you to synchronise files between Unix-based systems locally and remotely. It is fast and smart, utilizing an algorithm to detect the differences between two files or folders, known as “delta”. This allows for the transfer of only the changed parts of files, reducing the amount of data sent over the network.
For example, suppose you have a folder named “a” containing files “b.txt”, “c.txt”, and “d.txt”. When you use rsync to copy that directory to another location for the first time, all files are transferred with their contents. However, if you later add a few sentences to “c.txt” and run rsync again, it will only copy those sentences you added to “c.txt”. Since “b.txt” and “d.txt” remain unchanged, there’s no need to copy them again.
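The effect can be observed in a small local sandbox once rsync is installed. This is only a sketch under a few assumptions: the temporary directory and the 1 MB size of “c.txt” are made up for illustration, and the --no-whole-file option is passed because rsync skips the delta algorithm by default when both paths are local. The --stats option reports how much data was sent literally and how much was matched against the existing copy.

```shell
#!/bin/sh
# Throwaway sandbox; the directory layout is hypothetical.
work=$(mktemp -d)
mkdir -p "$work/a" "$work/copy"

# Create a file of roughly 1 MB and copy it for the first time.
head -c 1000000 /dev/urandom > "$work/a/c.txt"
rsync -a "$work/a/" "$work/copy/"

# Append a little text, then sync again using the delta algorithm.
echo "a few new sentences" >> "$work/a/c.txt"
stats=$(rsync -a --no-whole-file --stats "$work/a/" "$work/copy/")

# "Literal data" is what was actually sent; "Matched data" was reused
# from the copy that already existed at the destination.
printf '%s\n' "$stats" | grep -E 'Literal data|Matched data'
```

On the second run the literal data should be a small fraction of the file, while nearly all of it shows up as matched data.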
This makes rsync an ideal tool for backups, as it can significantly reduce network transfer. Many people set up a standard Linux VPS dedicated to backing up important personal documents, or use the remaining disk space on an existing VPS for the same purpose.
Let’s go through a possible option for an automated backup system, which involves using your standard linux vps as a remote backup for files stored on your local computer. Note that this is not a definitive guide to using rsync as a backup tool, but once you understand the basics, you can use them in various scenarios, such as setting up automated backups between two VPSes.
Requirements
- A local machine running Linux or OS X
- A virtual private server running Ubuntu 12.04/14.04/16.04, Fedora 22, Debian 7/8, CentOS 6/7
- SSH access to your VPS
Rsync Basics
$ sudo apt-get install rsync # for Debian/Ubuntu
$ sudo yum install rsync # for Fedora/CentOS
The most basic rsync operation is between two local folders. The -a option enables “archive” mode, which is equivalent to -rlptgoD (check the man page for more information). The -v option enables “verbose” mode, which makes troubleshooting easier.
$ rsync -av source/ destination/
Or, when copying between a local machine and a remote one, such as a VPS, over the SSH protocol:
$ rsync -av -e ssh source/ YOUR-USER@REMOTE-MACHINE:/path/to/destination/
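One detail in the commands above is easy to miss: the trailing slash on the source. With the slash, rsync copies the contents of the folder; without it, rsync copies the folder itself into the destination. The following sketch demonstrates the difference in a throwaway directory (all paths here are hypothetical and created only for the demonstration).

```shell
#!/bin/sh
# Throwaway sandbox; directory names are made up for illustration.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/with-slash" "$work/without-slash"
echo "hello" > "$work/source/b.txt"

# Trailing slash: copy the *contents* of source/ into the destination.
rsync -a "$work/source/" "$work/with-slash/"

# No trailing slash: copy the directory itself, creating without-slash/source/.
rsync -a "$work/source" "$work/without-slash/"

ls "$work/with-slash"      # lists b.txt
ls "$work/without-slash"   # lists source
```

Every example in this tutorial uses the trailing slash, so the destination ends up mirroring what is inside the source folder.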
The 'sync' Backup
For this tutorial, we will be backing up a folder named /home/joel/saveme/. The backup destination will be a standard linux vps located at IP address 123.45.67.89, with a user name of joelvps and a destination folder of /home/joelvps/backup/. Please make sure to modify these values according to your setup.
We will begin with an incremental backup, which is similar to the syncing process used by Dropbox. Essentially, we are creating an exact copy of /home/joel/saveme on the VPS and then copying any subsequent changes.
It is important to note that these commands should be executed on your local machine, not the VPS.
$ rsync -av -e ssh /home/joel/saveme/ joelvps@123.45.67.89:/home/joelvps/backup/
sending incremental file list
./
file1.txt
file2.png
file3.html
sent 247 bytes received 76 bytes 646.00 bytes/sec
total size is 0 speedup is 0.00
Because I have SSH keys set up, I only have to enter in my SSH key passphrase. If you use password-based logins, you’ll have to enter your standard linux vps user’s password here.
Now, all the files are replicated on the standard linux vps. What if I make some changes to file1.txt? This is where rsync works its magic—the next time I run the command, rsync determines what’s been changed, and sends only that data across to the backup folder on my VPS.
$ rsync -av -e ssh /home/joel/saveme/ joelvps@123.45.67.89:/home/joelvps/backup/
sending incremental file list
./
file1.txt
sent 240 bytes received 38 bytes 185.33 bytes/sec
total size is 71 speedup is 0.26
See how only file1.txt was copied this time around? Now the folders are synchronized every time I run that command—almost.
Let’s say that I end up not wanting file2.png any more, and I delete it. I probably don’t need it backed up on the standard Linux VPS now either, right? Well, by default rsync will retain that file in the destination folder indefinitely. If you want rsync to delete any file from the destination that is not in the source, you can use the --delete flag.
$ rsync -av --delete -e ssh /home/joel/saveme/ joelvps@123.45.67.89:/home/joelvps/backup/
sending incremental file list
deleting file2.png
sent 105 bytes received 25 bytes 86.67 bytes/sec
total size is 71 speedup is 0.55
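Because --delete removes files from the destination, it can be worth previewing a run first. A minimal sketch, assuming a local sandbox with made-up paths in place of the real VPS: the --dry-run option (short form -n) makes rsync report what it would do without changing anything.

```shell
#!/bin/sh
# Throwaway sandbox standing in for the local folder and the VPS backup.
work=$(mktemp -d)
mkdir -p "$work/saveme" "$work/backup"
touch "$work/saveme/file1.txt" "$work/saveme/file2.png"
rsync -a "$work/saveme/" "$work/backup/"

# Delete a file locally, as in the example above.
rm "$work/saveme/file2.png"

# Preview only: nothing is removed from the backup yet.
preview=$(rsync -av --delete --dry-run "$work/saveme/" "$work/backup/")
printf '%s\n' "$preview"
[ -f "$work/backup/file2.png" ] && still_there=yes

# Real run: now file2.png is deleted from the backup as well.
rsync -a --delete "$work/saveme/" "$work/backup/"
```

The preview lists "deleting file2.png" while the file is still present; only the second command actually removes it.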
Now I’m synced up. But, sometimes a sync isn’t enough—that’s where snapshots come in.
The ‘snapshot’ Backup
The goal of my snapshots is to retain a complete copy of the saveme folder and preserve its state indefinitely because you never know when something might go wrong. What if I accidentally delete a file and then rsync over the backup, too? What if I make a change to a file and then regret it? What if I rsync corrupted data to my backup folder? By creating snapshots at a chosen interval—whether hourly, daily, weekly, monthly, or something else entirely—you give yourself more options in worst-case scenarios.
rsync, with the --link-dest option, allows you to create these snapshots without storing a full copy of every file each time.
I’m going to do a daily snapshot of my saveme folder. Instead of running one-off commands, let’s create a shell script.
#!/bin/sh
# Create a timestamp
date=$(date "+%Y-%m-%dT%H_%M_%S")
# Specify the folder to snapshot on your local machine
SOURCE=/home/joel/saveme/
# Specify the destination folder on your VPS
DEST=/home/joelvps/snapshots
# Execute rsync, then point the "current" symlink at the new snapshot
rsync -azvP \
--delete \
--link-dest=../current \
"$SOURCE" "joelvps@123.45.67.89:$DEST/backup-$date" \
&& ssh joelvps@123.45.67.89 \
"rm -rf $DEST/current \
&& ln -s $DEST/backup-$date $DEST/current"
When you run this script for the first time with your customizations, the entire directory that you specify in the “SOURCE” variable will be synchronized to your standard Linux VPS. The script will then create a symbolic link named “current” in your destination directory that points to this newest snapshot. In future snapshots, the “current” link will be used for the “--link-dest” option in “rsync”. This option hard-links unchanged files to the previous snapshot and only requires more space for new or changed files, conserving space while ensuring all your files are backed up properly.
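To see the hard-linking behaviour without touching a VPS, here is a small local sketch. The sandbox paths and the snapshot names backup-1 and backup-2 are hypothetical; the relative --link-dest path is resolved against the destination directory, just as ../current is in the script.

```shell
#!/bin/sh
# Throwaway sandbox; names are made up for illustration.
work=$(mktemp -d)
mkdir -p "$work/saveme" "$work/snapshots"
echo "unchanged" > "$work/saveme/b.txt"
echo "version 1" > "$work/saveme/c.txt"

# First snapshot: nothing to link against yet, so everything is copied.
rsync -a "$work/saveme/" "$work/snapshots/backup-1/"

# Change one file, then take a second snapshot linked against the first.
echo "version 2, now longer" > "$work/saveme/c.txt"
rsync -a --link-dest=../backup-1 "$work/saveme/" "$work/snapshots/backup-2/"

# Matching inode numbers mean two names share one copy on disk.
ls -i "$work/snapshots/backup-1/b.txt" "$work/snapshots/backup-2/b.txt"
ls -i "$work/snapshots/backup-1/c.txt" "$work/snapshots/backup-2/c.txt"
```

The two b.txt entries should show the same inode number, while the two c.txt entries differ because that file changed between snapshots. Each snapshot still looks like a complete copy when browsed.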
However, the script doesn’t delete old snapshots as new ones are created. You can do that manually, at least for now, but there are ways to automate this process. For example, you can run the “find” command on the VPS (such as through ssh at the end of the above script) to discover snapshot directories that are more than 5 days old and delete them.
find /home/joelvps/snapshots/ -mindepth 1 -maxdepth 1 -type d -mtime +5 -exec rm -rf {} +
But, again, I’m going to play things cautiously for a while.
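One cautious way to try the cleanup is to rehearse it in a sandbox and list the candidates before deleting anything. This sketch assumes GNU touch (for the -d '10 days ago' syntax) and uses made-up snapshot names; the -name 'backup-*' filter is an extra safeguard added here, not part of the command above.

```shell
#!/bin/sh
# Throwaway sandbox standing in for the snapshots folder on the VPS.
work=$(mktemp -d)
mkdir "$work/backup-old" "$work/backup-new"

# Pretend one snapshot was last modified 10 days ago (GNU touch syntax).
touch -d '10 days ago' "$work/backup-old"

# Step 1: only list what would be removed.
old=$(find "$work" -mindepth 1 -maxdepth 1 -type d -name 'backup-*' -mtime +5)
printf '%s\n' "$old"

# Step 2: once the list looks right, delete those directories.
find "$work" -mindepth 1 -maxdepth 1 -type d -name 'backup-*' -mtime +5 \
    -exec rm -rf {} +
```

Because rsync -a preserves directory modification times, the mtime of a snapshot directory may reflect the source folder rather than the moment the snapshot was taken, so it is worth checking the listing from step 1 before relying on -mtime.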
Automating Everything with Cron
The last step of this process is to automate each of the backups so that I don’t have to run them manually. Luckily, I have cron.
$ crontab -e
I can add both the daily synchronize command and the weekly snapshot script.
30 5 * * * rsync -av --delete -e ssh /home/joel/saveme/ joelvps@123.45.67.89:/home/joelvps/backup/
0 20 * * 5 /usr/local/bin/daily-snapshot.sh
The first line executes the synchronization at 5:30 a.m. every morning. The second line executes my backup script at 8:00 p.m. once per week, on Friday.
Remember to customize the script and the cron entries according to your needs before relying on them. Please note that to use cron with rsync, you must have a passphrase-less SSH key pair set up between your local machine and the VPS. If this is not the case, the commands will stall or fail when your standard Linux VPS asks for a password, because cron has no terminal in which to enter one.
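Before handing the commands over to cron, it may help to confirm that the login really works without any prompt. The helper below is a hypothetical sketch (the function name is made up): it relies on ssh's BatchMode option, which makes ssh fail immediately instead of asking for a password or passphrase.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if ssh can log in without prompting.
can_login_unattended() {
    # BatchMode=yes disables password and passphrase prompts;
    # ConnectTimeout avoids waiting forever on an unreachable host.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage with the values from this tutorial (replace with your own):
#   can_login_unattended joelvps@123.45.67.89 && echo "ready for cron"
```

If the function returns a non-zero status, the cron jobs would fail in the same way, so fix the key setup first.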
Other Tools
Automate backup processes with user-friendly tools created by other Linux users instead of using rsync directly.
- rsnapshot
- Attic
- rdiff-backup
- TimeShift
Conclusion
Protecting your valuable data is paramount. By combining rsync with a standard Linux VPS, you can build a backup system that keeps your files securely stored and easily retrievable whenever you need them. Don’t leave your files vulnerable to loss or corruption: take proactive steps to safeguard them today.