How I Keep My Precious Data Safe

Some of you don’t need to read this because you’ve got your backup regime down so well that the Australian Defence Force wants you to be a drill sergeant at Puckapunyal. You can carry on with peace of mind that your data is safe. But most of you are playing fast and loose with valuable 1s and 0s. You’re pretty much Vincent D’Onofrio in Full Metal Jacket. I bet you don’t follow the 3-2-1 rule. You probably don’t even know what the 3-2-1 rule is! This article is for you.

The 3-2-1 Rule

US-CERT has a great guide on the 3-2-1 rule of backups. It’s key to ensuring a robust level of backups to cover most scenarios. There are times when it’s not applicable and you may want to go further with your backups, but for anyone reading this for advice, 3-2-1 is the way to go. Let me give you a brief introduction as to what the 3-2-1 backup rule is.


3 copies of your data – the original local copy, a second copy on an external hard drive and a third copy in the cloud somewhere. This covers your arse for various disaster scenarios. God forbid, if there’s ever a fire or a flood or a storm that destroys all your worldly possessions, your data is kept in the cloud and you can re-download it. If you lose your laptop (which has its whole disk encrypted, right?), you can buy a new one right away and re-import your data off an external hard drive or NAS without having to download everything, which could take days. If the local backup silently stops working, you’ve at least got your cloud backup, and vice versa. Three copies of your data are important for these sorts of scenarios.

2 different formats – this is pretty easy to achieve now that online backups are prevalent and cost-effective. Keep a copy on an external drive and in the cloud. If your internet connection is too slow (thanks Turnbull & Abbott), you can back up to a Blu-Ray or an SSD as well as an external drive.

1 copy off-site – keep your backups separate. There’s no point backing up to a second hard drive inside your computer if you then go on to lose the computer or have it damaged. Keeping a copy off-site maintains a way to get your data back if something bad happens to your normal computing environment. Again, the cloud is awesome for this and kills two birds with one stone (different format and off-site!). But not all our Internet connections allow this, so storing a HDD somewhere that isn’t your house and updating it every week or month can work too. Drop around to your mum’s place once a fortnight to say g’day and rotate two external HDDs at her joint. Or if you’re super paranoid, rent a deposit box from a self-storage place and stash some drives there.

3 copies of your data, on 2 different mediums, with 1 copy kept off-site. Easy.
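If you'd rather see the rule as a checklist a computer can run, here's a minimal Python sketch. The `Copy` class, its fields and the example plan are all made up for illustration; the only thing taken from the rule itself is the three conditions being tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set. The field values are illustrative labels."""
    name: str      # e.g. "laptop", "external HDD", "cloud"
    medium: str    # e.g. "internal disk", "external disk", "cloud"
    offsite: bool  # True if it lives away from your normal computing environment


def satisfies_321(copies):
    """Return True if the copies meet all three parts of the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical plan matching the example in the text above.
plan = [
    Copy("laptop", "internal disk", offsite=False),
    Copy("external HDD", "external disk", offsite=False),
    Copy("cloud", "cloud", offsite=True),
]

print(satisfies_321(plan))      # all three conditions hold
print(satisfies_321(plan[:2]))  # only two copies and nothing off-site
```

Drop the cloud copy from the plan and the check fails on two counts at once, which is exactly why the cloud is such a handy third copy.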

What to Back Up

I like to lead by example, so here’s my backup regime in detail, so you can either bask in its glory or flame me to the point of suicide for how pathetic it is.

I’ve divvied my data up into categories so I know how to back it up most efficiently. Most of it lives in my Mac’s home folder.


My Mac’s home folder is the pièce de résistance. If I lost this, I’d cry like a baby blubbering in its mother’s bosom when it’s hungry. However, not all of its contents are important to me. There’s like 30GB of virtual machines in there that really don’t need to be backed up. My Downloads folder is full of ephemera that just hasn’t been deleted yet. If it’s something I download and need to keep, it goes into Documents anyway.

If my home folder is the pièce de résistance, then the Documents folder within it is the Mona Lisa to its Louvre. All my “work” junk, like almost every single article I’ve ever written and all the accounting and business documents I have, is kinda important. There’s personal info in there too, like scanned copies of important papers and lots of digital stuff I’ve collected over the years (you have gigabytes of websites you saved from a decade ago, yeah?).

Losing my photo library (again) would devastate me to an extent I can only imagine would be like losing a pet (I’ve never had a pet animal, why would I, I can barely look after myself). Same with my hand-picked iTunes Library that has been cultivated for years, made up of albums that are simply not on Spotify or Apple Music.

Then there’s the rest of the flotsam in my home folder that I can probably lose, but it’s so tiny I may as well back it all up just in case I need it – crap lingering on my Desktop and whatever is hanging around in the hidden directories and in my user Library.

You might be thinking, what about your beautiful apps?! Well, they don’t live in the home folder, so they’re dirt to me. All the apps on my Mac can be re-downloaded. Anything with specific settings or the like is kept in my home folder’s Library section.

Then there’s some weird outlier stuff I have that still needs backing up. There’s around 400GB of video I keep on my NAS (not porn) I don’t want to lose and maybe every 2-3 months a new bunch of videos is added (not new porn). Losing this is on the same level as losing my photos. Ditto email. Even though it lives in Gmail and I don’t use a local mail client (just the Gmail website), being locked out or hacked and losing access to my email would be a disaster.

How to Back Up

The easiest solution here is to use Backblaze and Time Machine. Backblaze copies every single goddamn thing off your computer and puts it on their servers. When you inevitably need to retrieve your data, log in to their website, request a restore and a little while later you can select the files you want to download – or, give Backblaze more money and they’ll post you a drive with your data on it. It’s a nifty service for only US$50/year. Time Machine is great too – plug in an external HDD and let Mac OS X do the rest. Want to restore a file? Just open it up and use Finder to navigate to where the file you’re missing was supposed to be.

But I don’t use Backblaze or Time Machine. Backblaze only keeps my data on a rolling 30-day basis and I dislike how the restore process isn’t instant – I have to wait for them to release my data.

I have some issues with Time Machine as well. Keeping a HDD plugged in to my Mac all the time isn’t ideal. I use my laptop around the house and take it with me places. So why not use a Time Capsule or set up a Time Machine target on a file server? Well, I don’t trust the method Apple uses for remote Time Machine backups – the sparse image bundle can be corrupted very easily, and then Time Machine has to start a fresh backup all over again. This happened to me every few months when I used Time Machine remotely in the past, and it was a waste of time. Time Machine plugged in to a Mac = all good. Time Machine over network = rubbish.

The main cog in my backup regime is Arq, a wonderful Mac & Windows app that has an open source backup format and restore tools, so if the developers die, stop caring or are bought out by an evil corporation, there’s a chance a Good Samaritan might pick up where they left off. It’s easy to restore files too: just download Arq, re-add your storage credentials, adopt the backup set back into Arq and restore what you need.


Arq supports backing up to Amazon S3/Glacier, Google Cloud Storage, Amazon Cloud Drive, Google Drive, OneDrive, Dropbox and good old SFTP. I use Arq to SFTP backups to a file server running Ubuntu Linux and also to upload to Google Cloud Storage Nearline, which for my ~200GB only sets me back US$2/m.

Using a combo of SFTP file transfer and Google Cloud Storage covers the 3-2-1 rule nicely for all the stuff on my Mac. Every hour, my entire home folder, guts and all, is encrypted and copied to an Ubuntu Linux file server over SFTP. Piece of cake to set up and practically every NAS unit from Synology or QNAP and the like supports this method of file transfer. A third copy of the home folder, minus Virtual Machines and Downloads is encrypted and uploaded to Google Cloud Storage. The first backup is a pain, but the subsequent small updates are fine. Sometimes there’s a big chunk of data to upload (if I add a bunch of photos or there’s some videos sitting on my desktop), but I’m lucky enough to be on the NBN so it’s not that big of a deal.
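Arq handles the exclusions for me through its own settings, so I don't write any code for this. But if you're rolling your own, the idea of "everything in the home folder except a couple of top-level folders" could be sketched in Python roughly like this. The folder names in `EXCLUDED_TOP_LEVEL` and the function name are assumptions for illustration, and this only picks the files; it doesn't encrypt or upload anything.

```python
import os

# Hypothetical exclusion list: top-level folders left out of the cloud copy.
EXCLUDED_TOP_LEVEL = frozenset({"Downloads", "Virtual Machines"})


def files_for_cloud(home, excluded=EXCLUDED_TOP_LEVEL):
    """Yield every file path under `home`, skipping excluded top-level folders."""
    home = os.path.abspath(home)
    for dirpath, dirnames, filenames in os.walk(home):
        if dirpath == home:
            # Prune in place so os.walk never descends into the excluded folders.
            dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling `files_for_cloud(os.path.expanduser("~"))` gives you the list for the cloud copy, while the hourly file server copy would just walk everything. The pruning only applies at the top level, so a folder called Downloads buried inside Documents still gets backed up.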

My super important documents are kept on Microsoft OneDrive, giving me a handy 4th copy of this valuable data. I mainly use OneDrive because I get 1TB for free for having an Office 365 subscription (which is only A$75/yr if you get Officeworks to price match MSY). I reckon Dropbox is a bit better featured, but I didn’t want to pay $13/month for Dropbox when I’ve got 1TB of storage on OneDrive just sitting there! You’ve all used Dropbox or OneDrive before so I’m not going to explain how they work. However, OneDrive or Dropbox on its own isn’t that effective as a backup. Sure, they automatically upload files as soon as you put them in the specific syncing directory. But they also delete them from the cloud as soon as you delete them locally. Some cloud services have versioning features, which allow you to restore deleted files, but I’d rather rely on my Arq backup, which has months and months of files in its database, allowing me to go back as far as I want.

Google Photos is free and has unlimited storage for all your photos if you keep them on Google’s servers at the “High Quality” setting instead of their original file size. I don’t really use Google Photos for anything, but a free backup of all my photos is too tempting not to take advantage of. I just let the Google Photos app run in the background, providing a 4th copy of the photos for no cost and no thought. Sure, the Google/NSA database has all my photos, but, free backups, cool.

Backing up Gmail is a bit trickier. I use Gmvault, which is a Python script that downloads all your email off Gmail and stores it in a file-based database. Because it’s just a Python script it can be scheduled to run whenever you like. I run it hourly on my file server, as it’s on 24/7, unlike my Mac. To make sure I have 3 copies, I also back the Gmvault database up to Copy.com – another Dropbox/OneDrive-esque service, but with 15GB for free. Plenty of space for my email database (which is only around 8GB at the moment). I’d like to use OneDrive, but I can’t find a reliable command line client to run on my file server – Copy.com has an official one that gives me peace of mind over a random GitHub script that hasn’t been updated in a while.
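For the curious, a scheduled Gmvault run could be wrapped up something like the sketch below. This isn't my actual script. The `sync` subcommand with `-t quick` (recent mail only) and `-d` (database directory) is how I remember Gmvault's command line, so check `gmvault sync -h` before trusting it, and note the account and path are placeholders.

```python
import subprocess


def gmvault_command(account, db_dir, quick=True):
    """Build the argument list for one Gmvault sync run."""
    cmd = ["gmvault", "sync"]
    if quick:
        # Assumed flag: quick mode only looks at recent mail.
        cmd += ["-t", "quick"]
    # Assumed flag: -d points Gmvault at the database directory.
    cmd += ["-d", db_dir, account]
    return cmd


def run_sync(account, db_dir):
    """Run one sync; raises CalledProcessError if Gmvault exits non-zero."""
    subprocess.run(gmvault_command(account, db_dir), check=True)


if __name__ == "__main__":
    # Placeholder account and path; this only prints the command it would run.
    print(" ".join(gmvault_command("you@example.com", "/srv/backup/gmvault-db")))
```

Whatever scheduler the file server has (cron, for example) would call `run_sync` once an hour. Using `check=True` means a failed sync shows up as an error rather than silently doing nothing, which is the failure mode you really don't want in a backup.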

The static data, which is around 400GB of Final Cut Pro projects I’ll probably never return to but can’t delete, is backed up manually. It lives on my file server and an external hard drive. I just copy the new projects to the external drive whenever a new project is complete. To meet the off-site storage component of the 3-2-1 rule, I upload the data to Mega (the file hosting service Kim Dotcom recently abandoned as untrustworthy, hah), in 50GB chunks. Mega gives you 50GB for free and most of my projects hover around the 50GB mark, so it’s perfect. I just sign up for a new account for each project and stash the data there. I should probably move it elsewhere. I don’t think Mega is long for this world. I’ll likely upload it all to Google Nearline and cough up the ~US$4/month. However, that might tip me over the edge to move all my backups to Amazon Cloud Drive, which is US$60/yr for unlimited storage. But I’ll cross that bridge when I come to it.
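Here's the back-of-the-envelope maths behind that "tip me over the edge" comment, as a rough Python sketch. The per-GB rate is what my ~200GB for US$2/month works out to; it ignores retrieval and transfer fees, so treat it as a ballpark and not a quote.

```python
NEARLINE_USD_PER_GB_MONTH = 0.01   # implied by ~200GB costing about US$2/month
CLOUD_DRIVE_USD_PER_YEAR = 60.0    # flat unlimited-storage price quoted above


def nearline_yearly_cost(gigabytes):
    """Storage-only yearly cost on Nearline; retrieval fees are ignored."""
    return gigabytes * NEARLINE_USD_PER_GB_MONTH * 12


def break_even_gb():
    """Amount of data at which Nearline costs the same per year as the flat plan."""
    return CLOUD_DRIVE_USD_PER_YEAR / (NEARLINE_USD_PER_GB_MONTH * 12)


for size_gb in (200, 400, 600):
    yearly = nearline_yearly_cost(size_gb)
    cheaper = "Nearline" if yearly < CLOUD_DRIVE_USD_PER_YEAR else "Cloud Drive"
    print(f"{size_gb}GB: US${yearly:.0f}/yr on Nearline, cheaper option: {cheaper}")
```

At that rate the two services cost the same at around 500GB. My current ~200GB plus the 400GB of static data lands on the wrong side of that line, which is why the flat-rate option starts to look good.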

The result of all this is that my important data is all backed up to the 3-2-1 rule.

  • Home Folder – 3 copies (original, file server, Google Nearline)
  • Photos – 4 copies (original, file server, Google Nearline, Google Photos)
  • Documents – 4 copies (original, file server, Google Nearline, OneDrive)
  • Email – 3 copies (Gmail, file server, Copy.com)
  • Static Data – 3 copies (file server, external HDD, Mega)

I can sleep at night knowing that my very important information is safe, even from the number one risk to it – my own dumb self pressing delete.

The final reminder I have is to encrypt all your backups. This way, if someone steals your backups, they don’t have all your personal info too. Just a HDD they can format and maybe use to store all their research on how to get away with petty theft from honest hardworking people. This applies to the cloud as well. What if someone hacks in and pilfers data? They’re gonna see your stuff! Encrypt everything, don’t lose the passphrase and change it regularly! If you see a box for “encrypt my backups”, tick it. Always.

If this post raised more questions than it answered – sorry. Feel free to drop some comments and I’m happy to give you some advice. If you’d like to do some further reading, there’s an excellent guide for Mac users called “Backing Up Your Mac” from Take Control Books. For US$10 you get a really in-depth guide on how all this stuff works and how to best plan your backup regime. Wirecutter also recently took a look at a bunch of cloud backup options. Backing up your smartphone and tablet is also important, but a topic for another article.
