Painless Thumbdrive Backups

by Andrew Fabbro

Raise your hand if you've ever lost (or worried you'd lost) a USB thumbdrive. You spent hours fruitlessly searching the house, and then as you opened the washing machine door, it suddenly dawned on you that perhaps you didn't check your pockets thoroughly when you put this load in.

Fortunately, you have a backup of all the data, right? You religiously mount the drive and copy the data to a backup directory on a regular schedule, no?

That sounds an awful lot like drudgery to me too, and I got into computers to avoid boring work. Naturally, it's a lot more fun to spend some time working out the perfect method for painless thumbdrive backups.

What do I mean by painless? How about a system where you can walk up to your Linux box, plug in the drive, wait for a “backup complete” sound, unplug and walk away? Perhaps a system that keeps its backups orderly (say, the last seven copies)? Oh, and it should handle encrypted thumbdrives as well. And, if you need to recover, it should do both whole-volume replacement and per-file restores.

Not a problem. The key to this system is using udev rules and a simple shell script. The tools already are on your system. In this example, I use a CentOS 4.3 system, though any Linux distribution with a 2.6 kernel should work.

udev to the Rescue

udev is the modern device manager for Linux, replacing the 2.4 kernel's devfs. udev handles all device mapping, including hot plugging of devices. One of its coolest features is that it lets you write your own event rules. This article shows you how to craft a rule that automatically fires when you plug your USB thumbdrive into the system.

These rules are stored in /etc/udev/rules.d (if you're using a different Linux distribution, check /etc/udev/udev.conf for the udev_rules= line, which should point to the rules directory). You can place whatever udev rules you want as text files in this directory, and udev picks them up immediately for use without requiring a reboot.
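
If you'd rather not eyeball udev.conf, the lookup can be scripted. This is only a sketch: find_rules_dir is a name I made up for illustration, and it assumes the udev_rules= line looks like a plain shell assignment, with or without quotes. It falls back to /etc/udev/rules.d when the line is missing.

```shell
#!/bin/bash
# find_rules_dir [CONF]
# Hypothetical helper (not part of udev): print the directory named by the
# udev_rules= line of a udev.conf-style file, or /etc/udev/rules.d if the
# file is unreadable or has no such line.
find_rules_dir() {
    local conf=${1:-/etc/udev/udev.conf}
    local dir=""
    if [ -r "$conf" ]; then
        # commented-out lines do not match; take the last match, strip quotes
        dir=$(sed -n 's/^[[:space:]]*udev_rules=//p' "$conf" | tail -n 1 | tr -d "\"'")
    fi
    echo "${dir:-/etc/udev/rules.d}"
}
```

Run it as find_rules_dir with no arguments to check the default configuration file.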

How to Identify Your Device

To write a udev event rule, you first need a unique way to identify the USB device. Most thumbdrives have serial numbers, though not all. Fortunately, even with thumbdrives that do not have a serial number, you can craft udev rules for them.

I use two thumbdrives as examples: a JetFlash JF110, encrypted with TrueCrypt, and a Corsair Flash Voyager. The JetFlash has a serial number; the Corsair does not.

Plug your thumbdrive in, and cat /proc/scsi/usb-storage/*. You should find an entry for it similar to this:

   Host scsi5: usb-storage
       Vendor: Unknown
      Product: USB Mass Storage Device
Serial Number: 85a5b1f2c96492
     Protocol: Transparent SCSI
    Transport: Bulk
       Quirks:
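
With several USB storage devices attached, the cat output gets long. A small filter can pull out just the serial numbers. This is a sketch that assumes the "Serial Number:" label appears exactly as in the output above; usb_serial is a hypothetical name of my own.

```shell
#!/bin/bash
# usb_serial
# Hypothetical helper: read usb-storage status text on stdin and print the
# value of each "Serial Number:" line, one per line. A device without a
# serial number prints whatever the kernel reports there (such as None).
usb_serial() {
    sed -n 's/^[[:space:]]*Serial Number:[[:space:]]*//p'
}
```

A typical call would be: cat /proc/scsi/usb-storage/* | usb_serial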

If you have a serial number, skip forward to the “Writing the Rule” section of this article. If you see “None” for the Serial Number, you still can identify the device by using udevinfo. Follow these steps:

1) Look at dmesg's output. Typical output is as follows:

usb-storage: waiting for device to settle before scanning
  Vendor: Corsair   Model: Flash Voyager     Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI
SCSI device sde: 2031616 512-byte hdwr sectors (1040 MB)
[...]
sde: assuming drive cache: write through
 sde: sde1
Attached scsi removable disk sde at scsi12, channel 0, id 0, lun 0
Attached scsi generic sg4 at scsi12, channel 0, id 0, lun 0,  type 0

This tells you that /dev/sde is the device assigned.

2) Now, run:

udevinfo -a -p $(udevinfo -q path -n /dev/sde)

and examine the output. Look for these lines:

BUS=="scsi"
SYSFS{model}=="Flash Voyager  "
SYSFS{vendor}=="Corsair "

Writing the Rule

Now, with either the serial number or the vendor/model combo, you can write the rule. The rule creates a symlink for the device in the /dev tree, for example, /dev/corsair_drive, and then calls the script /usr/local/bin/backup-thumb.sh, which I'll get to in a moment.

Become root (su -), and create a text file in /etc/udev/rules.d called 95.backup.rules. You can use a number other than 95, but keep in mind that udev processes rules in alphanumeric order, and it's better to have local rules processed last.

If you have a serial number, type a rule like this (all on one line) into the file, and save it:

BUS=="usb", SYSFS{serial}=="85a5b1f2c96492", SYMLINK="jet_drive", RUN+="/usr/local/bin/backup-thumb.sh jet_drive"

If you're using vendor/model identification, your rule would look like this:

BUS=="scsi", SYSFS{vendor}=="Corsair ", SYSFS{model}=="Flash Voyager  ", SYMLINK="corsair_drive", RUN+="/usr/local/bin/backup-thumb.sh corsair_drive"

Note that you can string as many SYSFS{} entries together as you need to identify the drive uniquely. Your rule now fires every time you plug in your thumbdrive.

Note: if you have other rules for a device, udev executes the rules in sequence from top to bottom.
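
Because the whole rule must sit on one line and the quoting is easy to get wrong, it can help to let a script print it. The sketch below assumes a udev version that accepts == for matching (the form udevinfo prints) and the script path used in this article. make_serial_rule and make_model_rule are hypothetical names, not udev tools.

```shell
#!/bin/bash
# make_serial_rule SERIAL NAME
# Hypothetical helper: print a one-line rule that matches a USB serial number,
# creates /dev/NAME and runs the backup script with NAME as its argument.
make_serial_rule() {
    local serial=$1 name=$2
    printf 'BUS=="usb", SYSFS{serial}=="%s", SYMLINK="%s", RUN+="/usr/local/bin/backup-thumb.sh %s"\n' \
        "$serial" "$name" "$name"
}

# make_model_rule VENDOR MODEL NAME
# Same idea for drives without a serial number. Pass VENDOR and MODEL exactly
# as udevinfo shows them, including any trailing spaces.
make_model_rule() {
    local vendor=$1 model=$2 name=$3
    printf 'BUS=="scsi", SYSFS{vendor}=="%s", SYSFS{model}=="%s", SYMLINK="%s", RUN+="/usr/local/bin/backup-thumb.sh %s"\n' \
        "$vendor" "$model" "$name" "$name"
}
```

As root, something like make_serial_rule 85a5b1f2c96492 jet_drive > /etc/udev/rules.d/95.backup.rules would write the rule file in one step.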

Set Up the Backup Script

backup-thumb.sh is the engine that backs up your thumbdrive. Our rule calls it, giving the name of the device (the SYMLINK) as its only argument. Everything else is configured in the CONFIG section. The backup script is shown in Listing 1.

Listing 1. Backup Script

#!/bin/bash

# Thumbdrive backup script from Linux Journal

# ##############################################
# CONFIG section

# where you want the backups to be kept
BACKUP_DIR=/backups/thumb

# how many old backups to keep
GENERATIONS=7

# backup only once a day
# set to 0 if you want a backup run every time
# you insert your thumbdrive
BACKUP_ONCE_DAY=1

# completion sound to play when backup is done
SOUND=/usr/share/sounds/KDE_Beep_ClockChime.wav

# END CONFIG
# ##############################################

# main program

# wait for device to settle
sleep 10

# make sure no one will be able to copy our backups
umask 077

# check the directory
DEVICE=$1
if [ ! -d ${BACKUP_DIR} ] ; then
        mkdir -p ${BACKUP_DIR}
fi

# only backup once per day
if [ ${BACKUP_ONCE_DAY} -gt 0 ] ; then
        DIDTODAY=${BACKUP_DIR}/${DEVICE}.did_today
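        # -mtime +0 matches a tag file that is at least 24 hours old
        # (+1 would not match until the file was two days old)
        find "${BACKUP_DIR}" -name "${DEVICE}.did_today" -a -mtime +0 -delete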
        if [ -f ${DIDTODAY} ] ; then
                exit
        else
                touch ${DIDTODAY}
        fi
fi

# rotate backups
cd ${BACKUP_DIR}
let GENERATIONS=${GENERATIONS}-1
while [ ${GENERATIONS} -ge 0 ] ; do
        let NEWFILE=${GENERATIONS}+1
        if [ -f ${DEVICE}.backup.${GENERATIONS} ] ; then
                mv -f ${DEVICE}.backup.${GENERATIONS} ${DEVICE}.backup.${NEWFILE}
        fi
        let GENERATIONS=${GENERATIONS}-1
done

# do the backup
dd if=/dev/${DEVICE} of=${BACKUP_DIR}/${DEVICE}.backup.0 > /dev/null 2>&1

# notify that we're done
aplay ${SOUND} > /dev/null 2>&1

Put this script in /usr/local/bin/backup-thumb.sh, and remember to chmod +x it. Next, edit the CONFIG section—the parameters are as follows:

  • BACKUP_DIR: where you want the backups to go.

  • GENERATIONS: how many old backups to keep. Backups are numbered 0 (most recent) up to the limit you enter (oldest), so the script keeps one more file than the number you set. Keep in mind that you need enough storage space for all of them. If you are backing up a 1GB fob and set GENERATIONS to 7, the backups will consume 8GB of space once every generation exists.

  • BACKUP_ONCE_DAY: if you plug and unplug your fob multiple times a day, you probably won't want to back it up each time. backup-thumb.sh uses a tag file so that it backs up only once per day. If you want to change this so it runs a backup every time you plug in a thumbdrive, set BACKUP_ONCE_DAY to 0.

  • SOUND: in this example, I've chosen a sound from the KDE distribution, but any WAV file will work. You easily can modify the script to use madplay instead of aplay and use an MP3 file as your completion sound.
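
The tag file in Listing 1 is aged out with find. If you'd rather have "once a day" follow the calendar date, one alternative is to write the date into the tag file and compare it on the next run. This is a sketch of that swapped-in approach, not what Listing 1 does; ran_today and mark_today are hypothetical names.

```shell
#!/bin/bash
# ran_today STAMPFILE
# Hypothetical helper: succeed only if STAMPFILE exists and holds today's date.
ran_today() {
    local stamp=$1 today
    today=$(date +%Y-%m-%d)
    [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$today" ]
}

# mark_today STAMPFILE
# Record today's date in STAMPFILE, replacing any older date.
mark_today() {
    date +%Y-%m-%d > "$1"
}
```

In the backup script, the check would read: if ran_today "$DIDTODAY"; then exit; fi, followed by mark_today "$DIDTODAY". A drive backed up late one evening is then backed up again the next morning, which the 24-hour tag file would skip.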

How It Works

backup-thumb.sh sleeps for ten seconds on startup, because it must wait for the kernel to finish scanning the thumbdrive. If you plug in a thumbdrive and type dmesg, you'll see a “waiting for device to settle” message while this happens. Ten seconds for the kernel scan should be sufficient even for older machines.

Next, backup-thumb.sh sets permissions tightly so that only root can read the backups. Otherwise, some nefarious person could copy your backup to a different machine and mount it there.
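
To see what umask 077 does, the sketch below creates an empty file under that mask. The subshell is my own addition so the caller's umask is left alone, and create_private is a hypothetical name.

```shell
#!/bin/bash
# create_private FILE
# Hypothetical helper: create (or truncate) FILE with umask 077 in effect, so
# a newly created file is readable and writable by its owner only (mode 600).
create_private() {
    (
        umask 077      # applies only inside this subshell
        : > "$1"
    )
}
```

After create_private /tmp/demo.img, ls -l /tmp/demo.img should show -rw-------. Because udev runs the backup script as root, that owner is root.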

The script executes a simple dd (bit-for-bit copy) of your thumbdrive to a backup file. This works whether the device is encrypted or not. When it's finished, it plays a noise you will hear on your computer's speakers. On a USB 2.0 port, backing up a 1GB thumbdrive takes about one minute.
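
Listing 1 throws away dd's output and plays the sound whether or not the copy worked. If you want the chime to mean something, one option is to compare the image with the device afterward. This is a sketch under that assumption, not part of the original script; backup_and_verify is a hypothetical name, and the comparison reads the drive a second time, so it roughly doubles the run time.

```shell
#!/bin/bash
# backup_and_verify SOURCE IMAGE
# Hypothetical helper: copy SOURCE to IMAGE with dd, then succeed only if the
# two are byte-for-byte identical.
backup_and_verify() {
    local src=$1 dst=$2
    dd if="$src" of="$dst" > /dev/null 2>&1 || return 1
    cmp -s "$src" "$dst"
}
```

In the script, that could become: backup_and_verify /dev/${DEVICE} ${BACKUP_DIR}/${DEVICE}.backup.0 && aplay ${SOUND}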

How to Recover

If you lose your thumbdrive and want to restore your backup to its replacement, simply dd the backup image to the new thumbdrive, like so:

dd if=corsair_drive.backup.0 of=/dev/corsair_drive

Or, if you want to grab only some files from the backup, do the following:

mkdir /mnt/thumb
mount -o loop corsair_drive.backup.0 /mnt/thumb

You now can copy the files from /mnt/thumb.
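
The loop mount above works when the image holds a bare filesystem. If the image was taken from the whole disk (say, /dev/sde rather than /dev/sde1), it starts with a partition table, and mount needs a byte offset to the partition. Here is a sketch for working out that offset, assuming 512-byte sectors and a start sector read from fdisk -lu run on the image file. partition_offset is a hypothetical name.

```shell
#!/bin/bash
# partition_offset START_SECTOR [SECTOR_SIZE]
# Hypothetical helper: print the byte offset of a partition inside a
# whole-disk image. SECTOR_SIZE defaults to 512 bytes.
partition_offset() {
    local start_sector=$1 sector_size=${2:-512}
    echo $((start_sector * sector_size))
}

# Example (as root), for a partition that fdisk reports as starting at
# sector 63:
#   mount -o loop,offset=$(partition_offset 63) corsair_drive.backup.0 /mnt/thumb
```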

If you're using TrueCrypt to encrypt your thumbdrive, you can mount the backup image in much the same way:

truecrypt corsair_drive.backup.0 /mnt/thumb/

That's about as painless as we can make thumbdrive backups. If you're too lazy to plug your drive in and come back when it beeps...well, stay away from laundromats!

This article has scratched only the surface of what you can do with udev rules. Any type of hot-plug event can fire a rule that can do almost anything. For example, you can write rules to mount devices automatically, copy pictures off a digital camera or set up a network link. udev's rules language provides great flexibility, including printf-like wild cards and the ability to set permissions.

The best overview for writing your own udev rules is Daniel Drake's “Writing udev Rules”, which can be found at www.reactivated.net/writing_udev_rules.html.

Andrew Fabbro has become an Oracle DBA; however, he still has root at home and he welcomes your comments sent to andrew@fabbro.org.
