It's not glamorous but backing up your images is probably the most important thing you should be doing.
Most of us only have copies of our digital images on our computer. So imagine if something happened to your computer and you lost all of your images. Sure, you may have some prints or photo books and maybe images uploaded to Flickr, but losing all of those digital images in their original form would be pretty awful!
There are many ways we could lose the contents of our computer's hard drive:
- software or hardware malfunction
- fire or flooding
- electricity surge
...not forgetting user error!
As with most things, you might think that these only happen to other people, but unfortunately they are rather common. I've had several hard drives fail on me, I've been a victim of theft, I've deleted files that I didn't mean to (oops!) and I've had software corrupt files irrecoverably. Fortunately, on each occasion I had a backup strategy in place that meant these weren't the disasters they could have been.
What is a Backup Strategy?
A backup strategy is just the organisation and process you use to maintain multiple copies (backups) of your data (in our case image files). The key thing to realise is that it's more than just a program to run a backup. You also need consistent organisation so that you know what to back up and where to find each file in the backup, you need to plan to have multiple copies, and you've got to make sure you're running your backups regularly.
Even ignoring backup, getting your images organised is vital. It may be ok to have files scattered through different folders when you have relatively few files, but as the numbers increase you will lose things.
Having a consistent organisation means that everything is where you would expect it so that you can find it. That means having a consistent folder structure that you place your images in, and a consistent file naming policy.
If you don't do any organisation currently, I'd strongly suggest doing so. There are lots of different ways of organising your files, but you won't go far wrong by using the following approach which I use:
- In a suitable location, create a "Photos" folder. This is where you will keep all of your images. I have a hard drive dedicated to my images so I put my photos folder on that drive.
- Within the Photos folder create sub-folders for each year of your photos.
- Within each year folder, create a Year-Month-Day folder (e.g. 2012-09-17) that you will import that day's images into. Some people also attach a description of what the photos were of in the folder name (e.g. 2012-09-11-LakeDistrict) to help them find things. Personally I store that information in the image file's metadata so I don't bother with that.
- When importing files into these folders rename them using a consistent file name template. I use "Initials-YearMonthDay-Original file number" e.g. DAF-20120907-7765. That way I can guarantee that every file on my computer has a unique name which I can search for as required.
I use Adobe Lightroom to import my files and I have it configured so that it automatically performs the last two steps (creating the dated folder and renaming the files). So once you've got it set up it pretty much looks after itself.
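For anyone not using Lightroom, the folder and file naming scheme above is simple enough to sketch in a few lines of Python. This is only an illustration of the naming convention, not how Lightroom does it; the function name `destination_path` and the "nef" file extension are assumptions made for the example.

```python
from datetime import date
from pathlib import PurePosixPath


def destination_path(root, initials, shot_on, original_number, extension):
    """Build <root>/<Year>/<Year-Month-Day>/<Initials>-<YearMonthDay>-<Number>.<ext>."""
    day_folder = shot_on.strftime("%Y-%m-%d")  # e.g. 2012-09-07
    file_name = f"{initials}-{shot_on:%Y%m%d}-{original_number}.{extension}"
    return PurePosixPath(root) / str(shot_on.year) / day_folder / file_name


# The extension "nef" is an assumed example; use whatever your camera produces.
print(destination_path("Photos", "DAF", date(2012, 9, 7), 7765, "nef"))
# prints Photos/2012/2012-09-07/DAF-20120907-7765.nef
```

Because the date and the camera's original file number are both in the name, two files only collide if they share initials, shooting date and file number.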
Now that I have all of my images in the "Photos" folder I know that I can copy that anywhere and I have copied all of my images in their nice tidy folder structure. If I have a problem with the original "Photos" folder I can just copy the backup version over the top and know that it will be identical.
Where to Backup To
Because of the many risks our images face, most backup strategies aim to keep at least 3 physically separate copies of your images. It's not impossible for both the primary and secondary copies to have a problem, so a backup of the backup provides extra peace of mind.
Some of the potential problems, such as fire or theft, will affect the machine itself and possibly the building it is located in. So it's important to have at least one, ideally two, offsite backups. An external hard drive is ideal: you can backup the files to the drive, take it offsite, and then bring it home periodically to update the backup.
Other problems are due to the computer or its use, such as a hard drive failure or user error. In this circumstance having your only backups offsite can cause some problems. One is that you will have lost anything that has happened since your last backup. The other is that it can be inconvenient to access your backup when a problem occurs.
It can be useful to have a backup drive (internal or external) that is regularly kept in sync with the primary drive. This allows you more immediate access to files should you need to restore, as well as providing you with a more up-to-date backup so that you don't, for example, lose that week's images.
Backup to optical media such as DVD or Blu-Ray is an option. However be aware that these discs degrade over time (something called disc rot) which means that it is a good idea to recreate the backups periodically and also to have multiple copies. Given the risks and the relatively small size of these media (even Blu-Ray is smaller than many memory cards these days!) I prefer to avoid them and use hard drives instead.
An increasingly popular form of backup is called "cloud" or "online" backup. This basically sends your files to a secure server somewhere on the internet for safe storage. I haven't personally used such a service as yet but by all accounts it's a pain-free option, especially if you don't have loads of images. For me, the thought of uploading 1.5TB of images to the internet has been a little off-putting to date!
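To put a number on that, here is a back-of-the-envelope estimate of how long the first upload might take. The 10 Mbit/s upload speed is purely an assumption for illustration (check your own connection), and the calculation ignores protocol overhead and interruptions.

```python
def upload_days(size_tb, upload_mbit_per_s):
    """Estimate days needed to upload size_tb terabytes at a sustained upload speed."""
    bits = size_tb * 1e12 * 8                   # decimal terabytes -> bits
    seconds = bits / (upload_mbit_per_s * 1e6)  # megabits per second -> bits per second
    return seconds / 86_400                     # seconds in a day


# 1.5 TB at an assumed, sustained 10 Mbit/s upload speed
print(f"{upload_days(1.5, 10):.1f} days")
# prints 13.9 days
```

After the initial upload only new images need to be sent, so the ongoing cost is far smaller.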
What Software To Use
I'm not going to cover backup software here; reviews abound elsewhere on the internet. But find something you're happy with and make sure to use it. Personally I've created some batch scripts that run RoboCopy (a Windows copy application) to perform my backups, but those without a background in IT might prefer a more intuitive solution!
Backup More Than the Image Files
The most important thing to backup is the image files. We can fix most problems, buying a new computer and software in the worst case, but we can't reshoot the images in our collection.
However, it can be very worthwhile to backup more than the images. An important one, if you are a Lightroom user, is the Lightroom catalog.
You may be aware that the LR catalog is a database storing the location, metadata and processing steps for all of the images in the catalog. If you lose the catalog, you've lost all of that metadata and all of the processing you've done to your images - not good! You can mitigate this by saving the LR changes to "sidecar" XMP files, but the best approach is to protect the catalog itself. Note that, to save space, you only need to backup the catalog itself (the .lrcat file), not the previews folder.
If you have Lightroom presets or templates, it is worth including those in your backup too. You can find these by going to Edit->Preferences->Presets->Show Presets Folder from within Lightroom. In Windows it will pop up an Explorer window with the location of all of your presets and templates (I presume something similar happens on Mac). Take note of the location and add it to your backup.
Basically, backup anything that would be a loss to you if it wasn't there. Some will keep a backup of their entire operating system. I don't bother with that personally; I don't mind reinstalling Windows if the worst happens. I do, however, make sure to backup all of my documents, serial numbers, etc.
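As a minimal sketch of backing up the catalog without the previews, the following Python copies only the .lrcat files from the top level of a catalog folder. The function name `backup_catalog` and the folder paths are hypothetical, and it assumes Lightroom is closed while the copy runs so the catalog file isn't mid-write.

```python
import shutil
from pathlib import Path


def backup_catalog(catalog_dir, backup_dir):
    """Copy every .lrcat file in catalog_dir to backup_dir, leaving previews behind."""
    source = Path(catalog_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    # glob("*.lrcat") only looks at the top level, so preview folders are never entered.
    for catalog in sorted(source.glob("*.lrcat")):
        shutil.copy2(catalog, target / catalog.name)  # copy2 also keeps timestamps
        copied.append(catalog.name)
    return copied
```

Calling something like `backup_catalog("C:/Lightroom", "D:/Photos/Catalog")` (both paths made up here) before the main backup means the catalog travels along with the images to every backup drive.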
Why RAID isn't Backup
If you're not familiar with RAID, then you can skip this bit. If you do use RAID, don't rely on it as a backup as it doesn't guard against the majority of issues listed above. In fact, it only guards against a drive failure. If a drive fails then you can swap it out for a new one and have a fresh copy of the data replicated across to it. However, if the RAID controller fails, a virus infects your machine, you delete something by accident, or a power surge or fire strikes, then all copies of the data within the RAID array are impacted.
I much prefer to use individual hard drives ("just a bunch of drives", JBOD) and manually initiate data transfers between drives so that I know if something goes wrong on one drive the RAID controller hasn't replicated that onto my "backup" drive that I need to recover from. By all means use RAID but please do not consider this as multiple backup copies.
Having talked about some of the ideas behind the backup strategy, I'll run you through how mine works.
- I have a primary 2TB internal hard drive that is my working drive. This is where all my images are imported to and Lightroom expects to find the files.
- Files are imported by Lightroom into the Photos/Year/Year-Month-Day folder structure I talked about above using a standard file naming template.
- I have a secondary 2TB internal hard drive in the same machine that is my backup drive. As soon as image files are imported onto the primary drive I run a script that mirrors the working drive to the backup drive. Thus the working and backup drive are exact copies of each other.
- I have a tertiary 2TB external hard drive that is my offsite backup drive. On a weekly basis I'll mirror my working drive onto the offline backup drive and then take it offsite for safe keeping. Again this is an exact copy of the working drive such that I can just swap them over should the working drive fail.
- I have a solid state drive (SSD) as my boot drive where I also store my Lightroom catalog for maximum performance. As part of my backup, I copy the LR catalog to my working drive which in turn is copied to the backup and offline backup drives.
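To make "mirroring" concrete, here is a rough pure-Python sketch of the idea. It is not the RoboCopy script mentioned earlier, just an illustration of what a mirror does; the function name `mirror` is made up, and symbolic links and locked files are not handled.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source, target):
    """Make target an exact copy of source: copy new or changed files, delete extras."""
    source, target = Path(source), Path(target)
    target.mkdir(parents=True, exist_ok=True)
    source_names = {entry.name for entry in source.iterdir()}

    # Remove anything in the target that no longer exists in the source.
    for entry in target.iterdir():
        if entry.name not in source_names:
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()

    for entry in source.iterdir():
        destination = target / entry.name
        if entry.is_dir():
            if destination.is_file():
                destination.unlink()
            mirror(entry, destination)  # recurse into sub-folders
        else:
            if destination.is_dir():
                shutil.rmtree(destination)
            # Files with matching size and modification time are treated as
            # unchanged; otherwise their contents are compared.
            if not destination.exists() or not filecmp.cmp(entry, destination, shallow=True):
                shutil.copy2(entry, destination)
```

Note that a mirror faithfully copies deletions and corrupted files too, which is exactly why a second, less frequently updated backup is worth having.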
This is a fairly robust system and has saved my bacon on several occasions. Using 3 (5 actually but could be done with 3) drives makes this more expensive but it also greatly reduces your chances of disaster. I wouldn't have the same level of comfort with only 1 backup.
On one occasion I ended up with a number of corrupt images on my working drive that I didn't notice until they had been copied over to the backup drive. Fortunately the offline backup had the original files intact (yes, it's fortunate that I hadn't performed my latest backup to the offline backup - shows that even with 2 backups you still aren't 100% safe).
Still, it's not perfect. Really I should have my two backups physically separated from the primary copy and this is something I need to address. So writing this article has been useful in re-evaluating my own strategy!
To bulletproof your system a bit more you can introduce checksum generation and verification into your backup strategy. A checksum is like a digital fingerprint, uniquely (to all intents and purposes) identifying a file's contents. If you create checksums for two files with different contents, the checksums will almost certainly be different.
Where this becomes useful is that if you run the checksum on the same file, it should always be the same. If the checksum for the same file ever changes you know that the file itself has changed. Sometimes that will be because you've made a change, so you can ignore the warning. Other times it will happen when you don't expect it which highlights a possible problem such as file corruption.
Before backing up, I generate checksums for any new files on my working drive. Post-backup I can run the checksum on the backup drive to check that it hasn't changed. Doing this periodically ensures that the contents of your backup are still good.
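The generate-then-verify idea can be sketched in Python as follows. This is one possible approach, not a description of any particular tool; SHA-256 is an assumed choice of checksum, and the function names are made up for the example.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) onto its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(manifest, backup_root):
    """Return the relative paths that are missing from the backup or whose checksum differs."""
    backup_root = Path(backup_root)
    problems = []
    for relative, expected in manifest.items():
        candidate = backup_root / relative
        if not candidate.is_file() or file_checksum(candidate) != expected:
            problems.append(relative)
    return problems
```

Build the manifest from the working drive before the backup, then call `verify` against the backup drive afterwards; an empty list means every file matched. Saving the manifest somewhere safe lets you re-run the check later to catch files that have silently changed.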
This step isn't strictly necessary; just having multiple backups should be sufficient for most cases, with file corruption being relatively rare. But it does provide additional peace of mind. I've often thought about writing my own backup software to implement my backup strategy with the verification checks and monitoring in place to make this a little less onerous (or knowledge intensive).
Hopefully you realise the importance of backups. Things do go wrong (and always at the worst time!) and it's only sensible to take precautions against that. Please don't leave it to chance.
A backup strategy doesn't have to be complicated; indeed I think mine is rather simple. The most important thing is that it's consistent and regular. Find something that works for you and stick with it.
Do you have any backup strategies or additional thoughts you'd like to share? I'd love to hear them in the comments.
There's a lot of info about backups already on the web. This article on dpBestflow provides a lot more info on the subject.
Chase Jarvis also put an interesting post, video and graphics together describing his backup workflow which you can find on his website here.