If you have ever worked to develop, review, or test a Business Continuity or Disaster Recovery Plan (BCP or DRP), you may be familiar with the terms Recovery Point Objective (RPO), Recovery Time Objective (RTO), and Maximum Tolerable Downtime (MTD). But what do these terms mean, how are they different, and why are they important? Let's take a look.
Defining RPO, RTO, and MTD
Recovery Point Objective (RPO)
Recovery point objectives are about data loss tolerance. RPO is the term used in business continuity to identify the maximum targeted period in which data can be lost without severely impacting the recovery of operations. For example, if a business process could not lose more than one day's worth of data, then the RPO for that information would be 24 hours. RPO is very useful to help determine the frequency of backups for a given system.
Recovery Time Objective (RTO)
Recovery time objectives are about restoration goals. RTO is the term used in business continuity to identify the planned recovery time for a process or system, which should be reached before the business process's maximum tolerable downtime. For example, if a business process could not go more than one day without normal operations, then the RTO should be less than 24 hours. RTOs can help determine what kind of recovery and/or redundancy may be required.
Maximum Tolerable Downtime (MTD)
Maximum tolerable downtime, also sometimes referred to as Maximum Allowable Downtime (MAD), represents the total amount of downtime that can occur without causing significant harm to the organization's mission. MTD is important to define so continuity planners can select and implement appropriate recovery methods and procedures to ensure downtime does not exceed acceptable levels.
Understanding the value of RPO, RTO, and MTD
RPO, RTO, and MTD are most frequently used in business continuity and are usually defined during the Business Impact Analysis (BIA). They are important measurements for confirming that current systems and procedures can meet the requirements of a business process or function. During the BIA, business process owners should be asked to identify their RPO and RTO requirements. Existing systems, processes, and procedures can then be evaluated to confirm that the required RPOs and RTOs can be met in the event of a disruption and that RTOs do not exceed the organization's MTD.
RPO, RTO, and MTD in action
Let's look at a few examples.
RPO Example
Let's assume an imaging department receives paper documents, which they scan and save on server ABC. These documents have retention requirements and would be difficult or impossible to reproduce if lost. Further, let's assume the department's current process is to shred the paper documents at 4:00 PM the day after they are scanned. In this case, the RPO for server ABC would be 24 hours, since after that time the paper documents would be destroyed and could not be rescanned. Knowing the RPO for the server allows IT to verify that data stored on server ABC is backed up often enough that no more than 24 hours' worth of data could be lost.
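The backup check described above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: the worst-case data loss is treated as one full backup interval (a failure just before the next backup runs). The function name meets_rpo and the sample backup intervals are hypothetical and not part of any product.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the backup schedule can satisfy the RPO.

    Assumption: the most data that can be lost is one full backup
    interval, so the interval must not be longer than the RPO.
    """
    return backup_interval <= rpo


# Server ABC from the example: RPO of 24 hours.
rpo_abc = timedelta(hours=24)

# Hypothetical backup schedules to compare against the RPO.
print(meets_rpo(timedelta(hours=12), rpo_abc))  # backups every 12 hours
print(meets_rpo(timedelta(hours=48), rpo_abc))  # backups every 48 hours
```

With these sample values, the 12-hour schedule satisfies the RPO and the 48-hour schedule does not.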
RTO and MTD Examples
First, let's look at a simple example. In one department, let's assume server XYZ is used to retrieve critical customer or member information and legal requirements state our company must be able to access this information within 4 hours of a customer requesting it. In this case, the MTD would be 4 hours and the RTO for server XYZ would need to be less than 4 hours. Knowing the RTO for the server will allow IT to verify recovery processes are in place to meet these requirements.
Now let's look at a more complicated scenario. In another department, let's assume virtual server VM is running on physical server SH and is pulling data from server DB. Senior management has determined the MTD for the department's business process is 8 hours. In this case, we must ensure the RTO for the business process, which includes the RTOs for each system, does not exceed the MTD. Let's also assume recovery of virtual server VM depends on physical server SH being recovered first, so we cannot start recovering VM until SH is recovered. This means the RTOs for VM and SH together must not exceed the MTD of 8 hours. Unlike VM, server DB does not depend on another system, so it can be recovered at the same time as SH.
With these requirements in mind, let's look at two different recovery scenarios based on different system recovery times.
First, let's assume the recovery time for each system is as follows: SH is 4 hours, VM is 2 hours, and DB is 3 hours. SH and VM must be recovered in sequence (4 + 2 = 6 hours), while DB is recovered in parallel (3 hours). In this scenario, the combined RTO for the business process is 6 hours (see figure A). This RTO is within our MTD requirement of 8 hours.
Second, let's assume the recovery time for each system is as follows: SH is 6 hours, VM is 4 hours, and DB is 4 hours. SH and VM in sequence now take 6 + 4 = 10 hours, while DB is again recovered in parallel (4 hours). In this scenario, the combined RTO for the business process is 10 hours (see figure B), which would not meet our MTD requirement of 8 hours.
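The arithmetic behind both scenarios can be sketched in Python. This is a minimal illustration, not a planning tool: it assumes each system's recovery starts as soon as all of its dependencies are recovered, and that systems without a dependency between them are recovered in parallel. The function name combined_rto and the dictionary layout are assumptions made for this example.

```python
def combined_rto(recovery_hours, depends_on):
    """Return the hours until every system in the process is recovered.

    recovery_hours: system name -> hours needed to recover that system.
    depends_on: system name -> list of systems that must be recovered first.
    """
    def finish_time(system, path=()):
        if system in path:
            raise ValueError(f"circular dependency involving {system}")
        # A system can start only after its slowest dependency has finished.
        start = max(
            (finish_time(dep, path + (system,)) for dep in depends_on.get(system, [])),
            default=0,
        )
        return start + recovery_hours[system]

    # The process is recovered when the last system finishes.
    return max(finish_time(system) for system in recovery_hours)


MTD_HOURS = 8
dependencies = {"VM": ["SH"]}  # VM cannot start until SH is recovered

scenario_a = {"SH": 4, "VM": 2, "DB": 3}
scenario_b = {"SH": 6, "VM": 4, "DB": 4}

for name, hours in (("A", scenario_a), ("B", scenario_b)):
    rto = combined_rto(hours, dependencies)
    status = "meets" if rto <= MTD_HOURS else "exceeds"
    print(f"Scenario {name}: combined RTO {rto} hours, {status} MTD of {MTD_HOURS} hours")
```

Run against the two scenarios, the sketch reports 6 hours for the first and 10 hours for the second, matching the figures above. In both cases DB finishes before the SH and VM chain does, so it never determines the combined RTO.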
Tips for using your RPOs, RTOs, and MTD to improve your business continuity operations
- Work with business process owners at least annually to create or review RTO and RPO requirements as a part of your BIA.
- Compare backup frequency with RPO requirements for all systems to ensure backup processes will be able to achieve recovery point objectives.
- Confirm the RTO(s) for a given business process will not exceed the organization's MTD.
- Plan to verify critical systems can meet RTO and RPO requirements when developing your business continuity exercise and testing plan.
- Ensure RTO and RPO metrics are evaluated and documented during BCP exercises.
Using Tandem Business Continuity Plan software to help you understand and evaluate your RPOs, RTOs, and MTD
The Tandem BCP Gap Analysis Excel Spreadsheet is an excellent report for evaluating RTO, RPO, and MTD requirements. Exceptions are highlighted in red so users can easily see gaps and make changes to meet requirements. The BCP Gap Analysis Excel Spreadsheet can be downloaded from two locations within the software:
- Tandem > Business Continuity Plan > Reports
- Tandem > Business Continuity Plan > Download Documents
To analyze RPOs, click the System Equipment RPO worksheet tab. This worksheet will show the gap between the shortest system/equipment backup frequency and the maximum tolerable time period for data loss (the RPO) for each business process where it is needed.
To analyze RTOs, click the Business Process RTO worksheet tab. This worksheet will show the gap between business process MTD and RTOs. In addition, the spreadsheet will display dependency RTOs to ensure gaps are addressed across related systems and processes.