Disaster Recovery

When upgrading or performing maintenance on your K2 environment, first:
  • Ensure there are no dependency issues in your solutions. See Dependency Checking for information about dependencies and how to resolve dependency issues.
  • Back up your K2 database.
  • Create a checkpoint (snapshot) if K2 is running in a virtual environment.

The backup and the checkpoint allow you to revert the environment if the upgrade fails.

When you need an environment for testing, it is best practice to create a new environment rather than clone an existing one, because cloning an environment can cause unexpected behavior.

Disaster recovery comprises the processes, policies and procedures put in place to deal with potential disasters that result in a complete system outage, such as a natural disaster that takes the production data center offline. A disaster recovery plan forms part of a business continuity plan (BCP) and is essential to any organization that wants to maintain or quickly resume mission-critical functions after such a disaster. The plan should typically include an analysis of business processes and continuity needs, with particular attention to the resumption of applications, data, hardware, communications (such as networking) and other IT infrastructure. It should also address disaster prevention. Because K2 interacts with external systems such as SharePoint and other line-of-business (LOB) systems, include all related systems in your disaster recovery planning for K2.

When developing a disaster recovery plan, two industry-standard objectives help determine how extensive the procedures and underlying infrastructure need to be:

  • Recovery Time Objective (RTO)

    The amount of time a system may be offline. Put another way, the maximum amount of time that it can take to bring K2 back to an operational state following a disaster recovery event. This will help assess the level of investment and rigor needed in creating and maintaining a parallel disaster recovery environment.

  • Recovery Point Objective (RPO)

    The maximum amount of time for which data may be lost following an outage. This will influence the database backup and retention strategy.

Below is an example of RTO and RPO:

Assume a K2 platform supports a business unit whose solutions must all be operational within five minutes of a disaster event. When the system comes back online, its data must be consistent to within 15 minutes of the disaster event.

In this scenario RTO is five minutes and RPO is 15 minutes.
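
The comparison against these objectives can be sketched in a few lines of Python. This is a minimal illustration, not part of K2: the function names, the 10-minute backup interval and the 8-minute measured recovery time are hypothetical values chosen for the example. It assumes the worst-case data loss equals the interval between backups, that is, the disaster strikes just before the next backup would have run.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Assumption: the worst case is a failure just before the next backup runs,
    so up to one full backup interval of data can be lost."""
    return backup_interval


def meets_objectives(
    backup_interval: timedelta,
    measured_recovery_time: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> dict:
    """Compare a recovery plan against its RPO and RTO."""
    return {
        "rpo_met": worst_case_data_loss(backup_interval) <= rpo,
        "rto_met": measured_recovery_time <= rto,
    }


# Objectives from the example scenario.
RTO = timedelta(minutes=5)
RPO = timedelta(minutes=15)

# Hypothetical plan: backups every 10 minutes, and a recovery drill
# that took 8 minutes to bring the system back online.
result = meets_objectives(
    backup_interval=timedelta(minutes=10),
    measured_recovery_time=timedelta(minutes=8),
    rpo=RPO,
    rto=RTO,
)
print(result)  # {'rpo_met': True, 'rto_met': False}
```

In this hypothetical plan the backup frequency satisfies the RPO, but the recovery procedure is too slow for the RTO, which points to where further investment would be needed.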

Determining RTO and RPO is important because it helps focus the level of effort and expense associated with building the disaster recovery processes and infrastructure needed to support the required objectives. Generally, the lower the objectives, the higher the level of effort and expense, because more aggressive service levels translate into more investment in hardware, software and automation.