Have you ever felt the burn of losing all the data on your mobile phone (all the contacts, pictures, videos, songs, documents, everything) because you were trying to reset the settings and accidentally erased it all, with no backup saved anywhere else? Maybe not, but how about the time you were working on your notebook and the hard drive crashed, and after many wasted hours trying to retrieve your data, you recovered only part of it?
Tough, isn’t it? If it is that tough for an individual, imagine how difficult it is for an organization to deal with loss of data. Picture a telco that cannot let its customers pay their phone bills or recharge their phones online because of infrastructure problems. Or an e-commerce site that cannot let its customers in because of an outage, which might ultimately translate into lost customers and business. Or a healthcare giant that, despite having a presence in multiple locations, cannot serve its customers because an outage has made customer records, files, and test results unavailable for a period of time.
The fact is, we live in a world where we depend on technology in many different ways. Though this makes our lives easier in many areas, it has a flip side too, especially in our working environment. Whether for an individual or an organization, technology is the medium that connects us to others, customers included. In the event of any kind of outage, whether due to human error or natural disaster, hardware or software malfunction, bandwidth issues or resource contention, or a database, application or website glitch, the result is the same: major inconvenience, followed by the measures that must be taken to prevent financial loss.
No wonder they say: those who live the disaster management way, live to work another day.
Even with the best of intentions, we may not be able to prevent a disaster.
Realistically speaking, what we can control is the aftermath of a disaster (of any proportion), by having a suitable disaster management solution in place. With the right disaster recovery solution, you can have your IT infrastructure back up and running with zero or minimal data loss, so that your users keep access to the data they need.
DR solution options are primarily classified according to the factors that are more pronounced in one solution than in another. However, most DR setups are based on one or both of the following approaches:
- Data Replication: In this case, data sets are copied from one source to another. Because the copies are made through real-time transfer of data, data recovery is also designed to be instant in the event of a disaster.
- Backup and Archiving: In this case, data sets are either copied or archived at a secondary location, so that they can be retrieved later in the event of a disaster. Since data transfer does not take place in real time, there could be a delay in the recovery of data.
Needless to say, both replication and backup have their own advantages and cost implications, and each is better suited to particular use cases. Data replication does not replace a backup, and vice versa. For example, if data on the primary site is corrupted or manipulated, the change will be propagated to the secondary site as well. In such a scenario, a backup can give you earlier versions of the data, from before the corruption took place. However, retrieving this data may take time, which might not help if your applications are mission-critical. Hence, a combination of data replication and backup is ideally suggested for disaster recovery.
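The difference can be sketched with a toy model. This is only an illustration under simple assumptions: `ReplicatedStore` and `VersionedBackup` are hypothetical names invented here, the "sites" are in-memory dictionaries, and the invoice record is made up. It shows a bad write reaching the replica immediately, while an earlier backup snapshot still holds the clean version.

```python
from copy import deepcopy


class ReplicatedStore:
    """Toy primary/secondary pair: every write is mirrored immediately."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def write(self, key, value):
        self.primary[key] = value
        # Replication copies the change as-is, whether it is good or bad.
        self.secondary[key] = value


class VersionedBackup:
    """Toy backup: keeps point-in-time snapshots that can be restored later."""

    def __init__(self):
        self.snapshots = []

    def take_snapshot(self, data):
        self.snapshots.append(deepcopy(data))
        return len(self.snapshots) - 1  # id of the snapshot just taken

    def restore(self, snapshot_id):
        return deepcopy(self.snapshots[snapshot_id])


store = ReplicatedStore()
backup = VersionedBackup()

store.write("invoice-42", "amount=100")
good_snapshot = backup.take_snapshot(store.primary)

# A corrupting write lands on the primary and is replicated right away.
store.write("invoice-42", "amount=###")

replica_value = store.secondary["invoice-42"]                  # corrupted copy
restored_value = backup.restore(good_snapshot)["invoice-42"]   # clean copy

print("replica:", replica_value)
print("backup: ", restored_value)
```

The replica is current but faithfully wrong; the backup is older but clean. That trade-off is why the two methods complement each other.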
However, there are a few basics that should be considered if your aim is to have a suitable DR solution. These are simple to follow through. All you have to do is look around and figure out which of the following are the most important elements for you:
- What is the cost of downtime ($/minute) that you can afford?
- What are your applications? Which of these are critical?
- What network bandwidth do you have, and how far can you stretch it?
- Where is your secondary site located, and how close is it to the main site?
- In the event of a disaster, how long can downtime be tolerated, and how fast would you like a restore to be complete?
- Are you looking for a transparent failover solution or are manual interventions an option, too?
- Would you like both (or more) of your sites to be communicating in real time?
- Would you like your secondary site to take over only in the event of a disaster?
- What is your overall budget for this entire solution?
- How realistic is the amount invested in hardware and software (including licensing), both now and in future?
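The first and fifth questions above lend themselves to simple arithmetic. The sketch below is a minimal illustration, not a sizing tool: the cost per minute, the tolerated downtime and both recovery times are made-up placeholder figures, and `RecoveryOption`, `downtime_cost` and `acceptable` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    name: str
    recovery_minutes: float  # assumed time needed to restore service


def downtime_cost(cost_per_minute: float, minutes: float) -> float:
    """Estimated loss for an outage of the given length."""
    return cost_per_minute * minutes


def acceptable(option: RecoveryOption, max_downtime_minutes: float) -> bool:
    """True if the option restores service within the tolerated downtime."""
    return option.recovery_minutes <= max_downtime_minutes


# Placeholder answers to the questionnaire (assumptions, not benchmarks).
COST_PER_MINUTE = 500.0       # $ lost per minute of downtime
MAX_DOWNTIME_MINUTES = 60.0   # longest outage the business can tolerate

options = [
    RecoveryOption("replication with failover", recovery_minutes=5),
    RecoveryOption("restore from backup", recovery_minutes=240),
]

for opt in options:
    cost = downtime_cost(COST_PER_MINUTE, opt.recovery_minutes)
    verdict = "within" if acceptable(opt, MAX_DOWNTIME_MINUTES) else "exceeds"
    print(f"{opt.name}: ${cost:,.0f} of downtime, {verdict} the tolerated limit")
```

Replacing the placeholder figures with your own answers gives a first rough view of which options are even worth pricing.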
Once the basic framework is ready, based on the above questionnaire, you can pick and choose a disaster recovery option for your organization. If your business requirements dictate the need for an exact copy of your data in a secondary location, all the while maintaining a certain operational performance, then you will have to decide which mode is more suitable for you: a hardware-based or a software-based replication solution. Both have their own distinct advantages. Hardware-based replication offers more consistency and support options, among other factors. Software-based replication gives you more real-time replication and quicker recovery, among other factors.
Furthermore, the replication options that are available vary in terms of criteria such as affordability, flexibility and scalability. They can be broadly categorized as:
- Host-based replication: Functional at server level, this solution is appropriate for single-server branch offices or smaller businesses. Though a little difficult to scale up post deployment, this solution is quite cost-effective when overall hardware, software, support and implementation costs are taken into consideration. It is also fast, and it can be deployed to improve SLAs for specific applications or to cater for the requirements of one or more departments in an organization.
- Network-based replication: Situated at network level, this solution is more suitable for a heterogeneous environment that encompasses multiple servers and storage systems, irrespective of vendor. Since this solution is network-based, it also functions independently of the server and storage components. The cost of implementing it is generally high, perhaps because it more or less offers the advantages of both host-based and array-based replication. It can be offered as an inline appliance-based solution, a fabric-based solution, or both. The distance between the two sites, the bandwidth, the applications and data to be replicated, and the scalability required post deployment should be considered beforehand.
- Array-based replication: Functional at array level, this solution is more appropriate for organizations with numerous mission-critical applications where downtime is not an option. With a more centralized approach, this solution also offers the flexibility to scale up post deployment, provided licensing costs and other requirements are adhered to. One of its primary requirements is generally similar storage models with similar configurations at both sites. However, this solution is vendor-specific, so the finer details of deployment and other requirements must be clarified beforehand.
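The bandwidth and data-volume considerations mentioned above can be roughed out before talking to any vendor. The following is a back-of-the-envelope sketch only: `transfer_hours` and `link_keeps_up` are hypothetical helpers, the 70% link efficiency is an assumed figure, and the data sizes and link speed are invented for illustration. It also treats daily change as one bulk transfer, which ignores peaks in the change rate.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Hours needed to move data_gb gigabytes over a link of link_mbps megabits/s.

    efficiency is the assumed usable fraction of the nominal link speed.
    """
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def link_keeps_up(daily_change_gb: float, link_mbps: float, efficiency: float = 0.7) -> bool:
    """True if one day's changed data can be shipped in under 24 hours."""
    return transfer_hours(daily_change_gb, link_mbps, efficiency) <= 24


# Made-up example: 2 TB initial copy, 50 GB of daily change, 100 Mbps link.
initial_sync = transfer_hours(2000, 100)
daily_sync = transfer_hours(50, 100)

print(f"initial copy: about {initial_sync:.1f} hours")
print(f"daily change: about {daily_sync:.1f} hours")
print("link keeps up with daily change:", link_keeps_up(50, 100))
```

If the initial copy takes days or the link cannot keep up with daily change, that is a signal to revisit the bandwidth, the amount of data being replicated, or the choice of solution.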
Protecting your data in the event of a disaster and protecting your company from financial loss are two sides of the same coin. So what is the bottom line? Considering that disaster can affect anyone at any time, you have to ask yourself how an outage, if it were to happen, would affect your business. Pre-emptive disaster recovery planning can be tackled in many ways. It can also be simplified if you have a basic framework of what you need, based on your business requirements and with budgetary considerations taken into account.
After all, prevention is better than cure and hence pre-emptive measures should be undertaken.