bkup
- NAME
- SYNOPSIS
- DESCRIPTION
- EXAMPLES
- OPTIONS
- SERVER SETUP
- ENVIRONMENT
- AUTHOR
- REPORTING BUGS
- LICENSE
- SEE ALSO
NAME
bkup - back up local partitions to a remote host
bkup.tar.bz2 (latest version: 8-Jul-2015)
SYNOPSIS
bkup is a small and simple backup program, reusing the system binaries as much as possible. The goal is a bit-by-bit copy of the file systems, either using powerful file system dumping tools (if provided), or a general purpose streaming utility like dd.
The program is split into a client and a server part. The client part can be run manually for testing purposes and casual backups, or it can be invoked by cron for a regular backup. A convenient server part is also provided; however, this component is not mandatory for a successful backup. Nevertheless a full working example is provided below.
DESCRIPTION
Many backup systems operate at the file level, trying to preserve as much meta-information as possible. While this approach is likely suitable for day-to-day work, it fails when backing up a whole system, which may comprise files with ACLs, extended attributes or other peculiarities. To cope with these, some backup systems perform full-disk backups. Unfortunately, these solutions are considered bulky: the backup images usually grow large, and more often than not it is hard to restore individual files.
A good backup system should not only make it easy to restore backups to bare metal (and hence not depend on databases and the like), but it should also not modify the disk itself during backup (all file-level backup tools do this unless the partition is mounted with the noatime option). Hence, this tool relies on file-system-specific dump tools such as xfsdump, ntfsclone, or the dump tool of other file systems. Ideally, only the modified data is backed up (resulting in little data transferred each day), but taking all these snapshots together makes it possible to recreate a disk image identical to the original. Depending on the file system and its accompanying dump tool, some features will be supported or dropped (best effort). Hence, in order to get a good backup, it is important to select the right (meaning: well supported) file system.
In addition to reading the source file system, bkup can also compress and encrypt the image. Compression can be done on the client or on the server (at the user's discretion, depending on compute capabilities and network speed), or during the encryption step. Currently supported compression schemes are gzip and bzip2 (utilizing several compute cores, if desired), but other tools may be used as well (lzop, xz, etc.).
Encryption is currently done with GPG (version 1 or 2) or PGP. The big advantage of using this infrastructure is that no passwords need to be provided when making the backup, which increases security and simplifies maintenance.
Finally, as backups should be stored offline, the data is sent off to another host with the help of SSH. Care has been taken that the remote host (which will receive the backup data) will only receive the (encrypted) data, nothing more. There is neither need nor possibility to execute arbitrary commands. This is important, as malicious users often gain access to a system by using keys originally intended for backup purposes. By eliminating this risk, the backup system becomes more trustworthy and there is less maintenance overhead in general.
EXAMPLES
Below are some typical use cases of the bkup utility. Please note that usually you want to create and register a public key before executing these commands (see section SERVER SETUP).
- bkup
-
Show a list of partitions that can be backed up. Please note that this list is not complete, and the user may specify other (device) files for backup as well.
- bkup -a user@srv /dev/sda
-
Back up a full disk, including all partitions. For storing the backup, the given account on the given host will be used.
- bkup -a user@srv -i ~/.ssh/bkupkey /dev/sda1 /dev/sda5 /dev/sda6
-
Back up partitions 1, 5 and 6 of the first hard disk, using the same remote host and identity stored in ~/.ssh/bkupkey for all partitions.
- bkup -a user@srv /scratch/bigfile --compress=server /dev/sda3 -e 0x408bdf8c!
-
Back up /scratch/bigfile but do the compression on the server (stream the file uncompressed over the network). After that, back up partition sda3 and encrypt it so that it can only be read with the given encryption key. (Please note that the exclamation mark is part of specifying recipients with GPG and means that the given key is preferred over the user's default encryption key.)
For both backup steps, use the same user ("user") and host ("srv").
- bkup /dev/sda3 --compression=none -a bob@host1 /dev/sdb5 --use-compress-program=/usr/bin/xz -a charly@host2 --list
-
Back up partition sda3 without compression, using the account bob@host1. Then, back up partition sdb5 using /usr/bin/xz as the compression program and the account charly on another host.
Actually, do not back up anything, but show the user what problems could emerge if the command were executed (option --list).
OPTIONS
As previously mentioned, the bkup tool acts both as a client and as a server. If no option is given, client mode is assumed. In server mode, only a few options are interpreted.
In client mode, the general calling convention is
bkup [general-options] part1 [part1-options] part2 [part2-options] ...
Options not dealing with the actual data (such as --list for simulating a dry-run) can be given anywhere in the argument list. Additionally, options may be abbreviated to their shortest unique form.
The options discussed are only available on the client side unless noted otherwise.
- account
-
--account is a mandatory option which tells the backup client where the backup server lives. This option will be passed to the SSH program and thus has the format [user@]hostname. The hostname is mandatory while the user name is optional. If the user name is left out, the user receiving the backup on the remote host has the same name as the user making the backup. However, be aware that backups are usually created by the root user, and it is good practice to choose a different, low-privileged user for receiving the backups.
- Examples
-
--account backup.example.com
--account bkupusr@srv
- compression
-
bkup offers 4 different compression workflows, depending on the computing power of the backup client and server, and the network capacities in between:
- none
-
Store the backup uncompressed. This setting can be useful if the network speed is high, and computing power is low on both the backup server and the client. Additionally it can be useful if the backup consists of random data (in a mathematical sense). Please be aware that images of such backups will consume a lot of space on the target system.
- client
-
The compression is done on the backup client. This is the default setting, as disks are usually large and network speed is always a concern (even in local area networks), especially if multiple clients are connecting to the backup server at the same time.
- server
-
Compression will be done on the server. This setting is useful in cases where the backup client has a low-end CPU and network bandwidth is of no concern. This setting will not disable SSH compression, though (see below).
- gpg
-
This option enables not only compression but encryption as well. GPG has built-in compression; leveraging that could result in less memory consumption. However, there are some disadvantages as well: First, tests have shown that bzip2 compression from GPG is not on par with the standalone bzip2 compression. (Backups tend to be smaller with the external bzip2 routine.) Additionally, GPG compression is limited to the gzip and bzip2 algorithms. Thus, if you want to use LZO (extra fast, see lzop(1)), xz (very small files, see xz(1)) or any other compression algorithm, you would probably want to use the --compress command line option.
For GPG compression the compression algorithm can be set by preparing an appropriate encryption key.
When using --compression=client or --compression=server, multiple processor cores are supported directly (without any additional configuration) if either pigz(1) (gzip compression) or pbzip2(1) (bzip2 compression) is installed. To choose one compression algorithm over the other, set --use-compress-program accordingly.
Please note that with the option values none and server, compression will nevertheless take place on the SSH link, because SSH compression is deemed fast enough even on modest machines. If this is a concern, you may always set the SSH compression level manually to a lower value.
- dir (server only option)
-
To change the directory where the output files will be written to, use the --dir option. This is a shorthand for changing the directory before calling the bkup tool and is convenient in places where only one executable (in contrast to a sequence of commands) is allowed.
- encrypt
-
Using the --encrypt option will encrypt the backup stream using GPG (or alternatively with PGP, depending on the value of --use-gpg). The encryption key is the string parameter for this option which will be passed unaltered to GPG as the recipient. Consequently, you are free to choose whatever syntax pleases you, for example email addresses, fingerprints (recommended), or key IDs. Please remember to specify an exclamation mark after the key ID or fingerprint in case you want to use a subkey for encryption, and not a random key GPG deems to be appropriate. For more information on this topic, see section How to specify a user ID in the gpg(1) man page.
- Example
-
--encrypt 0x234AABBCC34567C4!
- help
-
Displays the man page for the bkup tool (this page).
- identity
-
The identity option allows the user to specify an identity file from OpenSSH (or compatible). Using this option is highly recommended when using the bkup tool non-interactively (e.g. in a cron script). In fact, when the bkup tool is also used in server mode on the backup server, setting up SSH identities is mandatory. For more information, see "SERVER SETUP" below and the ssh(1) and ssh-keygen(1) man pages.
- level
-
Specify the backup level of the incremental backup. The argument to this option is expected to be an integer between 0 and 9; 0 means that incremental backups are disabled and every backup is a full backup (the default). A backup with level n+1 is made relative to the most recent backup of level n.
Please note that not all file system tools currently support backup levels; e.g. xfsdump supports levels, ntfsclone does not.
The default for this option is 0, that is, always make full backups. This setting is most probably fine for manual backups; for automated backups, however, it can be useful to pass the output of a script here. This script should generate integer values according to your backup plan. Together with the bkup tool you should have received a tool called bkup-lveq which is designed for exactly that purpose.
- Examples
-
--level `bkup-lveq 4 7`
--level `my-script-here.sh`
Example shell script
LEVEL=2
if [ `date "+%e"` -eq 7 ]; then
    LEVEL=0
elif [ `date "+%u"` -eq 4 ]; then
    LEVEL=1
fi
echo $LEVEL
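The schedule encoded in the example script is easier to check when the two date fields are passed in as parameters. The following is a minimal sketch and not part of bkup; the function name backup_level is made up for illustration.

```shell
#!/bin/sh
# backup_level DAY_OF_MONTH DAY_OF_WEEK
# Same policy as the example script: full backup (0) on the 7th of the
# month, level 1 on Thursdays (weekday 4, Monday being 1), level 2 otherwise.
backup_level() {
    if [ "$1" -eq 7 ]; then
        echo 0
    elif [ "$2" -eq 4 ]; then
        echo 1
    else
        echo 2
    fi
}

# Left unquoted on purpose: date pads %e with a space for days 1 to 9.
backup_level $(date "+%e") $(date "+%u")
```

Because the 7th is tested first, a Thursday that falls on the 7th yields a full backup rather than a level 1 backup.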
- list
-
If this option is the only option on the command line, the program will list partitions from the current host that may be available for backup. This is also the default mode if no command line arguments are given at all. This list is compiled by looking at the information in /dev/disk/by-uuid and provides an abridged view on the internal status of the program.
Together with other command line options, --list provides a kind of dry-run mode, as it will inhibit making the actual backup. Instead, it will try hard to determine whether a backup could be made, and print any errors and warnings to the console. Please note that this is just an approximation, as it will not connect to other servers, and it will not touch the given partitions. But it will be able to detect missing command line options, file systems which are mounted but expected to be unmounted (and vice versa), or tools which are needed but not available.
Generally, it is a good idea to use --list before performing the actual backup.
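This advice can be wrapped into a small helper that always performs the dry run first. The sketch below is hypothetical: the names safe_bkup and BKUP are not part of bkup, and it assumes that bkup returns a non-zero exit status when --list finds problems, which this page does not state.

```shell
#!/bin/sh
# BKUP names the program to run; it can be overridden for testing.
BKUP=${BKUP:-bkup}

# safe_bkup ARGS...: dry run with --list first, back up only if it passes.
# ASSUMPTION: bkup exits with a non-zero status when --list reports errors.
safe_bkup() {
    if ! "$BKUP" "$@" --list; then
        echo "dry run reported problems, backup skipped" >&2
        return 1
    fi
    "$BKUP" "$@"
}

# Usage (commented out so that sourcing this file has no side effects):
# safe_bkup -a bkupusr@srv -i ~/.ssh/id_bkup /dev/sda1
```

Appending --list at the end relies on the rule that options not dealing with the actual data may be given anywhere in the argument list.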
- server
-
Enable server mode. This option may also be negated (as --no-server) to explicitly enable client mode.
In server mode, the program expects data on STDIN for as long as STDIN is not closed. It stores the received data in a file whose name is based on values contained in the environment variable SSH_ORIGINAL_COMMAND. (For more information on how the file name is constructed, see the source code.) Before writing the data to disk, great care is taken that no other files are overwritten (not even older backup files, since overwriting files could pose a security problem). Hence, the only problem that may arise from using the backup server component is a denial of service if the disk fills up. However, this can be mitigated by setting quotas.
The server administrator can choose the directory to store the backups by giving the command line option --dir with a directory path as its parameter.
For more information on using the server component, see below "SERVER SETUP".
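The no-overwrite guarantee described for server mode can be illustrated with the shell's noclobber option. This is only a sketch of the idea, not how bkup implements it; store_stream is a hypothetical name.

```shell
#!/bin/sh
# store_stream TARGET: copy stdin to TARGET, refusing to overwrite it.
# The function body is a subshell, so "set -C" (noclobber) does not
# leak into the calling shell.
store_stream() (
    set -C
    cat > "$1"
)

# Usage (commented out; the file name is an example):
# store_stream /path/to/store/backups/image.xfs.bz2
```

With noclobber set, the redirection fails if the target already exists, so an older backup is never replaced and the function returns a non-zero status instead.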
- use-compress-program (option available on both client and server)
- use-dd
- use-gpg
- use-ntfsclone
- use-ssh
- use-xfsdump
-
Use the specified program for the given purpose. These tools will be given some command line options, so they are expected to adhere to the standard. For example, when specifying a different tool for compression (by setting the command line option --use-compress-program), the tool is expected to take the same command line options as gzip. The program given with --use-dd is expected to behave like GNU dd, --use-ssh expects a program compatible with OpenSSH, and so on. If a command expects different command line options, the user may of course specify a custom wrapper script which then calls the non-conforming binary (e.g. to use other SSH implementations instead of OpenSSH).
SERVER SETUP
One goal of the bkup system is that it is easy to set up and to maintain, and as OpenSSH is a very reliable piece of software, it is a natural fit for a backup system. The main disadvantage in this usage scenario is that SSH allows arbitrary commands to be executed on a remote system, a feature people with not-so-good intentions also appreciate very much. The problem is made worse by the fact that backup systems should be fully automated, which rules out the use of passwords and of SSH keys protected by passphrases.
Consequently, the goal of the bkup tool is to use SSH in a way that only the data is dumped to the remote server, and there is no control channel. Only the system administrator of the remote system has the possibility to influence the backup system. This is achieved by adding the SSH public key to the authorized_keys file in a special way.
This is how to do it:
On the client host (the host you want to create backups from), using an account that has sufficient permissions to create backups (usually root), create a new private/public key pair with ssh-keygen. Save the key without a passphrase. In this example we assume that you save it to ~/.ssh/id_bkup. This creates two files: ~/.ssh/id_bkup (the private key) and ~/.ssh/id_bkup.pub (the public key).
Print the public key contained in ~/.ssh/id_bkup.pub like so:
cat ~/.ssh/id_bkup.pub
and log in to the remote host (the one that will host your backups). There, add the newly generated key to the ~/.ssh/authorized_keys file. This step should be familiar if you know how to use SSH.
Prepend the just added line with the following text:
command="nice bkup --server --dir=/path/to/store/backups",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding
followed by a single space and then the public key itself. The complete line should then read:
command="..." ssh-rsa AAAAB3N...
Also, make sure that either the bkup command is available through your PATH environment variable, or that you give the full path to the bkup executable.
If it is not obvious, you can now change the argument to the --dir option from the previous step to some other value that you prefer. This will be the directory where your backups will live. Make sure to create this directory before making any backups!
Done. Enjoy your new backup system.
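The key line from the steps above can also be generated instead of typed by hand. The following is a minimal sketch; authorized_line is a hypothetical helper, and it assumes that the directory name contains neither spaces nor double quotes.

```shell
#!/bin/sh
# authorized_line DIR PUBKEY: print an authorized_keys entry that forces
# "bkup --server" and disables the interactive SSH features.
authorized_line() {
    printf 'command="nice bkup --server --dir=%s",' "$1"
    printf 'no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding '
    printf '%s\n' "$2"
}

# On the client (creates ~/.ssh/id_bkup and ~/.ssh/id_bkup.pub):
#   ssh-keygen -t rsa -N '' -f ~/.ssh/id_bkup
# On the server, after copying id_bkup.pub over:
#   authorized_line /path/to/store/backups "$(cat id_bkup.pub)" \
#       >> ~/.ssh/authorized_keys
```

The commands that touch key files are left as comments so that nothing is created or modified until they are run deliberately.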
Remove old backup files on the server
The above recipe only adds files on the server; eventually you will want to delete old backup files to free some space. A possible approach is to remove them after a backup has been created. This is how to do it:
Create the script that will delete your files. For example, you could make use of the powerful globbing facilities of zsh(1):
#!/bin/zsh
# return immediately if partition is not sda5
[ -n "$SSH_ORIGINAL_COMMAND" -a \
  -n "${SSH_ORIGINAL_COMMAND##*sda5*}" ] && return
cd ~/sync-folder || exit 1;
# delete everything with 3, 5, 8 weeks and older
# suppress error msg if glob expands to nothing
{ rm -v *.2.xfs.*(N.mw+2) \
        *.1.xfs.*(N.mw+4) \
        *.0.xfs.*(N.mw+7)
} 2>/dev/null
See the zsh man page (zshexpn(1), section Glob Qualifiers) to learn how to interpret the glob qualifiers in parentheses, such as (N.mw+2), in the rm command of the script.
Save this file; for example, use bkup-clean as the file name. You can also modify this script to your taste, such as letting it run after each single partition has been transferred.
Add the script file to the hook in the .ssh/authorized_keys created above; the relevant line would then change from
command="nice bkup..." ssh-rsa AAAAB3N...
to
command="nice bkup...; bkup-clean" ssh-rsa AAAAB3N...
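On servers without zsh, a similar retention policy could be approximated with find(1). The sketch below is not part of bkup: clean_level is a hypothetical name, the file name patterns are copied from the zsh script, and -maxdepth is supported by GNU and BSD find but is not required by POSIX.

```shell
#!/bin/sh
# clean_level DIR LEVEL WEEKS: delete regular files in DIR that match
# *.LEVEL.xfs.* and were last modified at least WEEKS weeks ago.
clean_level() {
    dir=$1 level=$2 weeks=$3
    # -mtime +N matches files older than N full days, so N = 7*WEEKS - 1
    find "$dir" -maxdepth 1 -type f -name "*.${level}.xfs.*" \
        -mtime +$(( weeks * 7 - 1 )) -exec rm {} +
}

# Same limits as the zsh script (commented out; adjust before use):
# clean_level ~/sync-folder 2 3
# clean_level ~/sync-folder 1 5
# clean_level ~/sync-folder 0 8
```

If nothing matches, find does not run rm at all, so no error message has to be suppressed.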
ENVIRONMENT
bkup uses these environment variables:
- HOME
-
(client) read-only variable; used for tilde (~) expansion of the current user's home directory, and to search for identity files if they are given without a path name
- PERLDOC
-
(client) write-only variable; used to switch perldoc back to man page rendering instead of terminal rendering
- SSH_ORIGINAL_COMMAND
-
(server) read-only variable; to read metadata related to the backup stream
This environment variable contains a space separated list of key=value pairs that will influence the file name of the target file.
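A hook script running on the server can pick individual values out of this list. The following is a minimal sketch; get_field is a hypothetical helper, and the key names in the usage comment are invented, since this page does not list the keys bkup actually sends.

```shell
#!/bin/sh
# get_field KEY: print the value stored under KEY in SSH_ORIGINAL_COMMAND.
# The function body is a subshell, so "set -f" (no globbing) stays local.
get_field() (
    set -f
    for pair in $SSH_ORIGINAL_COMMAND; do
        case $pair in
            "$1="*) printf '%s\n' "${pair#*=}"; exit 0 ;;
        esac
    done
    exit 1
)

# Usage with invented keys (commented out):
# SSH_ORIGINAL_COMMAND='host=client1 part=sda5 level=2'
# get_field part
```

The function returns a non-zero status when the key is absent, which lets a caller distinguish a missing key from an empty value.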
AUTHOR
Written by Thomas Prokosch.
REPORTING BUGS
Report bkup bugs via email to user thomas-bkup at domain nadev.net; the tool's home page is at http://thomas.nadev.net/
LICENSE
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
related: bkup-lveq(1); file systems: xfsdump(8) ntfsclone(8) dd(1); compression: gzip(1) pigz(1) bzip2(1) pbzip2(1); server: ssh(1) ssh-keygen(1) authorized_keys(5) sshd(8) ssh_config(5); client: cron(8) crontab(1) crontab(5)