** 12/31/11 NOTE ** I've added rsync commands to the script for people who don't use Amazon's S3 storage and services. To automate that part of the script, you'll have to set up RSA host key pairs so that it can run without passwords. See How To Setup RSA Hostkeys.
Here's a script that I've been tinkering with, perfecting, and tweaking over the last few months. The scenario is this:
The company I work for has some 30 production sites and some 40 development sites that we work with daily. We use Amazon's EC2 hosting for the sites that need rock-solid, brute-force, high-level computing. For less-trafficked sites we use Linode hosting at a much lower cost. Each of these hosting services offers snapshots and backup services, but they differ greatly in what type of snapshot or backup is run. Amazon EC2 is a whole universe unto itself, and it takes some time to figure out the EC2/S3 workflow. Nonetheless, we felt it necessary to have a secondary backup routine where the individual sites are backed up separately and can be restored within seconds rather than by sifting through a huge all-in-one archive.
Solution: Use Drush (the Drupal command-line tool), PHP (to take advantage of the S3 API), and good old-fashioned UNIX commands to get the job done.
Drush was a late addition to this script, as I noticed that a simple "mysqldump" created some unwieldy SQL files that had unnecessary data in the Drupal cache tables. I could write a complex MySQL drop routine, but Drush clears the caches in one command. Also, I'm using the UNIX split command to break up the large tar files that Drush creates with "archive-dump" into 1 GB chunks, in order to work around the 2 GB file-size limit on S3 uploads with PHP versions older than 5.3. I originally used a simple tar command, but switched to Drush for the sake of continuity. At the end of this post I'll offer some simple commands to restore the backups rather quickly.
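To make the split-and-rejoin idea concrete, here is a minimal shell sketch. It is only an illustration: the 3000-byte sample file, the 1024-byte chunk size, and the file name site-backup.tar.gz are assumptions standing in for the real Drush archive and the 1 GB chunks.

```shell
#!/bin/sh
# Sketch: split a file into fixed-size chunks, rejoin them, and compare.
# The sample file and the 1024-byte chunk size are stand-ins for the
# Drush archive and the 1 GB chunks used in the real backup.

workdir=$(mktemp -d)
archive="$workdir/site-backup.tar.gz"   # hypothetical file name

# Create a 3000-byte sample "archive".
head -c 3000 /dev/urandom > "$archive"

# Split into 1024-byte pieces named <archive>.part-aa, .part-ab, ...
split -b 1024 "$archive" "$archive.part-"

# Count the pieces that were produced.
part_count=$(ls "$archive".part-* | wc -l | tr -d ' ')

# Rejoin: the shell expands the glob in sorted order (aa, ab, ac, ...).
cat "$archive".part-* > "$workdir/rejoined.tar.gz"

# Compare the rejoined file with the original byte for byte.
if cmp -s "$archive" "$workdir/rejoined.tar.gz"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi

echo "parts=$part_count roundtrip=$roundtrip"
rm -rf "$workdir"
```

The suffixes that split generates sort alphabetically, which is why a plain cat with a wildcard puts the pieces back in the right order.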
You'll need a directory structure like this to run this type of script:
/path/backup
/path/backup/backups
Make sure you have read/write permissions to the directories. "backups" is a working directory where all sql and tar files will live temporarily before uploading to S3.
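As a rough sketch, the working directories can be created and checked like this. The BACKUP_ROOT variable is an assumption for illustration; it falls back to a temporary location so the snippet can be tried without touching a real /path/backup.

```shell
#!/bin/sh
# Sketch: create the backup tree and confirm the working directory is usable.
# BACKUP_ROOT is a hypothetical variable; point it at your real backup path.
backup_root="${BACKUP_ROOT:-$(mktemp -d)/backup}"

# -p creates the parent directory as well and is harmless if both exist.
mkdir -p "$backup_root/backups"

# The script needs to both read and write in the working directory.
if [ -r "$backup_root/backups" ] && [ -w "$backup_root/backups" ]; then
    dir_status=writable
else
    dir_status=not-writable
fi

echo "$backup_root/backups is $dir_status"
```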
So here's the script with notes.
<?php
/*
 * 12/29/11
 * this is the current inventory of sites we'll be backing up
 * INSTANCE: ec2-75-101-158-168.compute-1.amazonaws.com
 * old.balboapark.org
 * slam this in your /etc/crontab file to run at 1:01am every night
 * (entries in /etc/crontab need a user field, here root)
 * 1 1 * * * root php /root/backup/drush-test-backup.php
 */

// adjust to your path
$firstdir = getcwd();
chdir('/ebsvol/apache/www/backup');

// get this from Amazon. it has all the S3 classes you'll need. adjust to your path.
include('/ebsvol/apache/www/backup/S3.php');

// real simple - create a file for each day and a weekly file for the end of the week
$thedate = getdate();
if ($thedate["wday"] == "0") {
    $datestr = date('ymd');
} else {
    $datestr = date('D');
}

// old.balboapark.org
// using drush to clear all caches. makes your drupal site DB all nice and clean and compact.
$backupfile = 'drush-old-balboapark-org-backup-' . $datestr . '.sql.gz';
chdir('/ebsvol/apache/www/old.balboapark.org');
$syscmd = 'drush cc all';
system($syscmd);
chdir('/ebsvol/apache/www/backup');
$syscmd = 'mysqldump -uUSERNAME -pYOURPASS balboapark | gzip -c > backups/' . $backupfile;
printf("Creating %s...\n", $backupfile);
system($syscmd);
printf("Copying %s to S3 bucket...\n", $backupfile);

// rsync section - uncomment and use this part if you don't use S3 and have set up RSA hostkeys on both servers.
//$host = 'your.domain.com';
//$dest = '/path/to/your/destination';
//$rsync_user = 'username'; // dependent on hostkeys in ~/.ssh/
//$syscmd = 'rsync -vrup backups/' . $backupfile . ' ' . $rsync_user . '@' . $host . ':' . $dest;
//printf("Sending %s to %s ...\n", $backupfile, $host);
//system($syscmd);

// comment out if using RSYNC
s3copy('backups', 'bpoc-backups');
$syscmd = 'rm -f backups/' . $backupfile;
printf("Deleting %s...\n", $backupfile);
system($syscmd);

// tarring without drush
//$syscmd = 'tar czPf backups/' . $backupfile . ' /ebsvol/apache/www/old.balboapark.org';
//printf("Creating %s...\n", $backupfile);

// drush site backup. you actually need to be in your site directory for this to work.
// drush utilizes /sites/default/settings.php to run
chdir('/ebsvol/apache/www/old.balboapark.org');
$backupfile = 'drush-old-balboapark-org-backup-' . $datestr . '.tar.gz';
$syscmd = 'drush archive-dump --destination=/ebsvol/apache/www/backup/backups/' . $backupfile;
printf("Creating %s...\n", $backupfile);
system($syscmd);

// split into 1 gig chunks - S3 has a 2 gig file transfer limit/bug for PHP versions < 5.3
chdir('/ebsvol/apache/www/backup');
$dr = '/ebsvol/apache/www/backup/backups/';
printf("Splitting %s into 1 gig chunks...\n", $backupfile);
$syscmd = 'split -b 1024m ' . $dr . $backupfile . ' ' . $dr . $backupfile . '.part-';
system($syscmd);
// dump original tar file
$syscmd = 'rm -f backups/' . $backupfile;
system($syscmd);

// rsync section - USING WILD CARD! uncomment and use this part if you don't use S3 and have set up RSA hostkeys on both servers.
//$host = 'your.domain.com';
//$dest = '/path/to/your/destination';
//$rsync_user = 'username'; // dependent on hostkeys in ~/.ssh/
//$syscmd = 'rsync -vrup backups/* ' . $rsync_user . '@' . $host . ':' . $dest;
//printf("Sending %s to %s ...\n", $backupfile, $host);
//system($syscmd);

// comment out if using RSYNC
printf("Copying %s to S3 bucket...\n", $backupfile);
s3copy('backups', 'bpoc-backups');

// cleanup everything else
$syscmd = 'rm -f backups/*';
printf("Deleting %s...\n", $backupfile);
system($syscmd);

// simple mail notification (replace YOUR-EMAIL-ADDRESS with the recipient)
mail("YOUR-EMAIL-ADDRESS", "The drush test backup for ec2-75-101-158-168.compute-1.amazonaws.com was run", "The backup ran. Please verify the files are in the S3 bucket");

chdir($firstdir);
return;

// S3 roll your own function
function s3copy($targetdir, $bucket) {
    // switch directories to target directory
    $origdir = getcwd();
    chdir($targetdir);

    // instantiate S3 class using secret S3 KEYS
    $s3 = new S3('XXXXXXXXXXX', 'XXXXXXXXXXX');

    // try to create the bucket
    // note: ACL_PUBLIC_READ makes the backups world-readable; S3::ACL_PRIVATE is the safer choice
    $okay = $s3->putBucket($bucket, S3::ACL_PUBLIC_READ);
    if ($okay) {
        // echo "Created bucket " . $bucket . "\n";
    } else {
        die("Can't create bucket " . $bucket . "\n");
    }

    // iterate through files in the directory
    if ($handle = opendir('.')) {
        while (false !== ($filename = readdir($handle))) {
            if ($filename != "." && $filename != "..") {
                if ($s3->putObjectFile($filename, $bucket, basename($filename), S3::ACL_PUBLIC_READ)) {
                    echo "File copied: " . basename($filename) . "\n";
                } else {
                    echo "*** Failed to copy: " . basename($filename) . "\n";
                }
            }
        }
        closedir($handle);
    }
    chdir($origdir);
}
?>
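The naming scheme near the top of the script is worth a second look: on Sundays (wday 0) the file gets a date stamp, and on every other day it gets the three-letter weekday name. In effect the Monday-to-Saturday files are overwritten each week while the dated Sunday files accumulate. Here is a small shell sketch of the same rule; the function name backup_suffix is made up for illustration and is not part of the script above.

```shell
#!/bin/sh
# Sketch of the script's naming rule, expressed in shell.
# backup_suffix is a hypothetical helper, not part of the PHP script.
# $1 is the day of the week as a number, 0 = Sunday (like PHP's getdate()["wday"]).
backup_suffix() {
    if [ "$1" = "0" ]; then
        # Weekly file: two-digit year, month, day (PHP date('ymd')).
        LC_ALL=C date +%y%m%d
    else
        # Daily file: abbreviated weekday name (PHP date('D')).
        LC_ALL=C date +%a
    fi
}

today=$(date +%w)
echo "drush-old-balboapark-org-backup-$(backup_suffix "$today").sql.gz"
```

Note that the helper only picks the format from its argument; the date itself always comes from the current clock, just as in the PHP version.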
So now what? How do I use the backup files? By now you should have some GUI for S3 storage. Firefox has a nice S3 organizer that will allow uploads and downloads.
1) Create a new directory to restore your files to and place your backup files there.
2) Navigate your way to said directory and issue the following commands:
a) To rejoin the split tar files, open your terminal and do something like this:
bpoc-cjb-mac:Desktop cborkowski$ mkdir reassemble-test
bpoc-cjb-mac:Desktop cborkowski$ mv drush-* reassemble-test/
bpoc-cjb-mac:Desktop cborkowski$ cd reassemble-test/
bpoc-cjb-mac:reassemble-test cborkowski$ ls
drush-old-balboapark-org-backup-Thu.sql.gz
drush-old-balboapark-org-backup-Thu.tar.gz.part-aa
drush-old-balboapark-org-backup-Thu.tar.gz.part-ab
bpoc-cjb-mac:reassemble-test cborkowski$ cat drush-old-balboapark-org-backup-Thu.tar.gz.part* > drush-old-balboapark-org-backup-Thu.tar.gz
.......... working......
bpoc-cjb-mac:reassemble-test cborkowski$ ls
drush-old-balboapark-org-backup-Thu.sql.gz
drush-old-balboapark-org-backup-Thu.tar.gz
drush-old-balboapark-org-backup-Thu.tar.gz.part-aa
drush-old-balboapark-org-backup-Thu.tar.gz.part-ab
//untar the newly joined file
bpoc-cjb-mac:reassemble-test cborkowski$ tar -xzvf drush-old-balboapark-org-backup-Thu.tar.gz
b) To decompress and restore your DB (the dump is a gzipped SQL file, not a tar archive, so use gunzip):
bpoc-cjb-mac:reassemble-test cborkowski$ gunzip drush-old-balboapark-org-backup-Thu.sql.gz
//make sure you have a DB and privileges to restore
bpoc-cjb-mac:reassemble-test cborkowski$ mysql -uUSERNAME -pPASSWD databasename < drush-old-balboapark-org-backup-Thu.sql
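Because the database backup is produced by piping mysqldump through gzip, it can be checked with gzip's own tools before you restore it. The sketch below uses a one-line stand-in dump (the file name and the CREATE TABLE statement are made up for illustration) and shows, in a comment, how the decompressed stream could be piped straight into mysql.

```shell
#!/bin/sh
# Sketch: verify a gzipped SQL dump and decompress it to a stream.
# The sample dump below is a stand-in for a real mysqldump | gzip file.
workdir=$(mktemp -d)
dump="$workdir/sample-backup.sql.gz"   # hypothetical file name

printf 'CREATE TABLE node (nid INT);\n' | gzip -c > "$dump"

# gzip -t exits non-zero if the compressed file is corrupt or truncated.
if gzip -t "$dump"; then
    dump_status=intact
else
    dump_status=corrupt
fi

# Decompress to stdout and leave the .gz file in place.
# In a real restore the stream would go into mysql instead:
#   gunzip -c "$dump" | mysql -uUSERNAME -pPASSWD databasename
restored_sql=$(gunzip -c "$dump")

echo "dump is $dump_status: $restored_sql"
rm -rf "$workdir"
```

Streaming with gunzip -c keeps the compressed copy around, which is handy if the restore has to be repeated.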
Bingo! You now have all the files and a fresh DB to work with. Perhaps you might want to add a new Apache virtual host to test the restore, or perhaps you just want to overwrite the existing files in your web root. If you made it this far, I'll leave that up to you.