Why “Backup” Is the Wrong Question: You Should Be Thinking “Recovery”
Having written that tape backup is not recommended, I should now write about what we do recommend. Prudent marketing might have me say "email me to find out," but that is not in the spirit of a good blog, and frankly it is not that hard to figure out what I might be thinking about. Before I write the "what" article, though, some background is necessary.
It is important to understand what you are protecting against. Backup by itself is really meaningless; what you need is recovery. There is a lot to think about when you plan for recovery, starting with when you will need it. Recovery is needed in three cases:
- Case 1: I have overwritten or deleted a file. I need a copy that was saved yesterday, or last week, or last month.
- Case 2: I have just had a disk drive crash, a motherboard fail, a controller card fail, etc.
- Case 3: I have just lost access to my servers, both physically and electronically, through fire, flood, theft, earthquake, etc.
For each case, there is a consequence and a probability of occurrence. In the first case, the consequence is annoying and perhaps costly. It is rarely, if ever, catastrophic, although it can be when, for example, a legal brief must be filed by a certain time. It is, on the other hand, very likely to occur, and to occur multiple times.
The second case will very likely happen to you at some point in your business life, perhaps multiple times. The consequence is loss of the services supported by that hardware for some period of time, and perhaps loss of a lot of important data.
The third case, which was the one talked about the most in the tape backup days, is by far the least likely to occur but the most catastrophic, because you will have lost access to all your servers and services.
When figuring out a recovery plan, you should think about a Recovery Time Objective and a Recovery Point Objective for each of your systems.
Recovery Time Objective (RTO) is the amount of time you are willing to wait for a particular system to be back in service after a failure of a particular category (one of the three cases above).
Recovery Point Objective (RPO) is the point in time to which you need to recover a particular system after a failure of a particular category; in other words, how much recent data you can afford to lose.
The following examples are not suggested times; they show how the results of your analysis would be documented. Your RPO and RTO could be wildly different.
- Email System, Category 2: RPO: loss of no more than 1 hour of email. RTO: 3 hours.
- Electronic Medical Records System, Category 3: RPO: loss of no more than 1 minute. RTO: 1 hour.
- Word Processing Files, Category 2: RPO: loss of a day is acceptable. RTO: 24 hours.
You may not need to get your email system back that quickly, and you may easily be able to sustain the loss of a day's email. There are clients, on the other hand, who will set the objective at zero lost emails and a recovery time of less than 30 seconds. As you might expect, the tighter the recovery point objective and the shorter the recovery time objective, the more the solution is going to cost to secure and maintain.
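If it helps to see the documentation step in concrete form, here is a minimal Python sketch that records the three example objectives above as data and checks a proposed setup against them. Everything beyond the example numbers is my assumption for illustration: the `RecoveryObjective` class, the `met_by` method, and the nightly-copy, four-hour-restore scenario are hypothetical, not a recommendation. The check also assumes the simplest possible model, in which the worst-case data loss equals the full interval between copies.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """One documented objective: a system, a failure category, an RPO and an RTO."""
    system: str
    category: int    # 1, 2, or 3, matching the three cases in the article
    rpo: timedelta   # maximum acceptable data loss
    rto: timedelta   # maximum acceptable time until service is restored

    def met_by(self, copy_interval: timedelta, restore_time: timedelta) -> bool:
        # Assumption: the worst case is a failure just before the next copy,
        # so data loss can be as large as one full copy interval.
        return copy_interval <= self.rpo and restore_time <= self.rto


# The example objectives from the article (examples only, not suggested times).
OBJECTIVES = [
    RecoveryObjective("Email System", 2, timedelta(hours=1), timedelta(hours=3)),
    RecoveryObjective("Electronic Medical Records System", 3,
                      timedelta(minutes=1), timedelta(hours=1)),
    RecoveryObjective("Word Processing Files", 2,
                      timedelta(days=1), timedelta(hours=24)),
]

# Hypothetical setup: one copy per night, four hours to restore.
copy_interval = timedelta(hours=24)
restore_time = timedelta(hours=4)

results = {o.system: o.met_by(copy_interval, restore_time) for o in OBJECTIVES}

for system, ok in results.items():
    print(f"{system}: {'meets objectives' if ok else 'does NOT meet objectives'}")
```

Under these assumed numbers, only the word processing files would be covered. The email and medical records objectives would need more frequent copies and a faster restore, which is where the cost starts to climb.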