My data backup strategy and tools, 2021

Here’s an overview of how I back up my data across drives and devices.

I was driven to post this by the recently reported data loss experienced by several people around the world, caused by a malfunctioning, possibly hacked network storage device from Western Digital: “WD My Book NAS devices are being remotely wiped clean worldwide”.

Today, WD My Book Live and WD My Book Live DUO owners worldwide suddenly found that all of their files were mysteriously deleted, and they could no longer log into the device via a browser or an app.

When they attempted to log in via the Web dashboard, the device stated that they had an “Invalid password.”

“I have a WD My Book live connected to my home LAN and worked fine for years. I have just found that somehow all the data on it is gone today, while the directories seems there but empty.”

The same device that Western Digital encouraged its customers to ‘Put Your Life On [It]’, lost people’s photos, music, documents, backups, probably more.

Ordinary people like you and me need a better plan for our life’s work and memories than entrusting it to a company and its specialised hardware and software. We need a plan we understand.

This is that plan.

Devices to back up

  • MacBook Pro 1TB SSD
  • iPhone 128GB
  • iPad 256GB
  • External 1TB HDD – archives, old pictures, home movies, other uncategorised data

Laptop, phone, tablet all used daily.

Current backup plan

MacBook Pro

  • Runs Catalina; full weekly disk backup on external 1TB Time Machine HDD.
    • Quarterly restore test on 2014 MacBook Air also running Catalina
  • Back up main document and multimedia folders weekly with rsync, run manually from iTerm2, to external 2TB HDD (redundancy for above). Example: sudo rsync -aP --delete /Users/rahulgaitonde/Documents/ /Volumes/Backups/BackupDocuments
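
Because --delete removes anything from the destination that no longer exists in the source, it can be worth previewing a run before doing it for real. The following is a minimal sketch, not the exact command I run: sync_folder is a hypothetical helper name, and sudo is left out, so add it if your folders need it.

```shell
#!/bin/sh
# Sketch: preview an rsync mirror, then run it.
# sync_folder is a hypothetical helper name, not part of rsync.
#   -a        archive mode: recurse and preserve permissions, times, symlinks
#   -P        shorthand for --partial --progress
#   --delete  remove files from the destination that are gone from the source
#   -n        dry run: report what would change without changing anything
#   -v        list the files involved
sync_folder() {
    mode="$1"   # "preview" or "run"
    src="$2"    # a trailing slash means "the contents of this folder"
    dest="$3"

    case "$mode" in
        preview) rsync -anv --delete "$src" "$dest" ;;
        run)     rsync -aP --delete "$src" "$dest" ;;
        *)       echo "usage: sync_folder preview|run SRC DEST" >&2
                 return 2 ;;
    esac
}

# Example calls, using the paths from this post:
# sync_folder preview /Users/rahulgaitonde/Documents/ /Volumes/Backups/BackupDocuments
# sync_folder run     /Users/rahulgaitonde/Documents/ /Volumes/Backups/BackupDocuments
```

The preview prints the files that would be copied or deleted; nothing on either disk changes until the run step.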

External 1TB drive

WD Elements 1TB drive
  • Back up weekly with rsync, run manually from iTerm2, to external 2TB HDD: same disk as above

iPhone, iPad

2018 12.9″ iPad Pro 256GB and 2018 iPhone XR 128GB
  • iCloud Drive backup, continuous

Other data

  • Email: Gmail and Google Workspace; downloaded locally to Thunderbird on MacBook Pro as Mbox files (which are themselves backed up as above)
  • Photos: synced from iPhone and iPad to iCloud; also synced weekly to Photos.app on MacBook Pro
  • Notes: Notes.app and plaintext files; both synced to iCloud
  • Contacts, Calendar, Reminders: synced to iCloud; exported monthly to MacBook Pro
  • Passwords and secure notes: synced to Bitwarden; vault exported monthly to MacBook Pro
  • RSS feeds: synced to Feedly; OPML exported monthly to MacBook Pro
  • Bookmarks: synced to Firefox; HTML exported monthly to MacBook Pro
  • Read Later queue: synced to Instapaper and Pocket; CSV exported monthly to MacBook Pro. Some articles saved locally in Markdown in iCloud Drive

So, here are my tasks:

  • Weekly
    • Run Photos.app to sync iCloud Photos locally to MacBook Pro (turn off storage optimisation) – 10 minutes
    • Back up MacBook Pro to Time Machine external HDD – three hours
    • Run rsync on MacBook Pro drive and on external 1TB HDD. Destination for both is external 2TB HDD (distinct from Time Machine) – 10 minutes. First run took a long time; subsequent runs take a fraction of the time that Time Machine backups take.
    • Total time: approx. 20 active minutes; 3 hours in background
  • Monthly
    • Export Contacts, Calendar, RSS OPML, Bookmarks, Password Vault, Read Later queue and store locally – 10 minutes
    • Weekly tasks for that week
    • Total time: approx. 10 active minutes + regular weekly backup time
  • Quarterly
    • Test restore on 2014 MacBook Air – about 10 active minutes + 2 hours in background
    • Weekly and monthly tasks
    • Total time: approx. 10 active minutes + 2 hours in background + regular monthly backup time
  • Automated:
    • Downloading mail locally happens throughout the day since Thunderbird is always open
    • iCloud Drive backups happen daily automatically since iPhone charges wirelessly overnight
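
One thing to watch with the weekly rsync step: if the external drive isn’t attached, rsync will happily try to write to a path that doesn’t exist on the backup disk. A small guard avoids that. This is a sketch under assumptions: backup_if_mounted is a hypothetical helper, and checking that the volume’s folder exists is only a simple stand-in for checking that the disk is really mounted.

```shell
#!/bin/sh
# Sketch: skip the rsync run unless the backup drive appears to be attached.
# backup_if_mounted is a hypothetical helper name.
backup_if_mounted() {
    volume="$1"  # where the backup disk shows up, e.g. /Volumes/Backups
    src="$2"
    dest="$3"

    # On macOS an attached disk appears as a folder under /Volumes.
    # A folder check is a simple approximation, not a true mount check.
    if [ ! -d "$volume" ]; then
        echo "Backup volume $volume not found; skipping $src" >&2
        return 1
    fi

    rsync -a --delete "$src" "$dest"
}

# Example call, using the paths from this post:
# backup_if_mounted /Volumes/Backups /Users/rahulgaitonde/Documents/ /Volumes/Backups/BackupDocuments
```

The same function can be called a second time for the external 1TB drive, with that drive’s own source path.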

As you can see, I don’t actually spend a lot of time backing up my data. I last suffered a catastrophic data loss in 2008, and I’m determined to not let that happen again, especially now that storage is cheap and fast, and cloud backups exist.

In the early days of this system, I was tempted to automate large parts of it. I could run an open-source Time Capsule using an unused Raspberry Pi and Netatalk. I could also connect the external 2TB drive and run rsync from my Mac to the remote Pi machine (rsync, or remote sync, was in fact built for this use case).

That way my Time Machine backups would run every hour, not weekly. I could also automate rsync to run, say, daily by using macOS’s cron, a scheduling utility that’s part of almost every Unix-like system.
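
For illustration, here is roughly what that cron automation could look like. It’s a sketch of a setup I chose not to run: daily_rsync_cron_line is a hypothetical helper that only prints a crontab line, and the 20:00 run time is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch: print a crontab line that runs rsync once a day at a given hour.
# daily_rsync_cron_line is a hypothetical helper name.
# Crontab fields, in order: minute hour day-of-month month day-of-week command
daily_rsync_cron_line() {
    hour="$1"   # 0-23
    src="$2"
    dest="$3"
    # Paths containing spaces would need quoting before this line is usable.
    printf '0 %s * * * rsync -a --delete %s %s\n' "$hour" "$src" "$dest"
}

# Example call, using the paths from this post:
# daily_rsync_cron_line 20 /Users/rahulgaitonde/Documents/ /Volumes/Backups/BackupDocuments
# To install, paste the printed line into the editor opened by `crontab -e`.
```

A job in a user’s own crontab runs as that user, without sudo, so it would only work for folders that user can already read.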

But that frequency of backup seems overkill for my data, especially given that the vast majority of my everyday data, the part that changes daily, is backed up to iCloud. Even if I were to lose data mid-month, between restoring from the latest Time Machine backup and then syncing from iCloud, I’d be able to recover most, if not all, of my data. Automating would also mean leaving a computer running, with my backup disks attached, that’s doing useful work for only a tiny fraction of the time. That in turn means extra wear on the very disks I’m using for backup.

In conclusion

My solution is a mix of cloud sync and manual backup.

The cloud portion – for frequently changing data – uses iCloud, which seems to be the most privacy-centric of all cloud services.

The manual portion – for redundancy and archived data – uses open source tools and doesn’t rely on an always-on computer, specialised hardware or a connection to the Internet, unlike the Western Digital NAS this post began with.

Finally, the solution doesn’t take a lot of time to run, and can be restored from pretty quickly. The only vulnerability in this system is that all the devices and disks are in my house. If there’s a catastrophic event at my place, the data that’s backed up manually will be lost.