What is backup, anyway?

If there’s one thing that gets an IT manager really irritated, it’s people using the terms “backup” and “archive” synonymously.

However, it is understandable. The confusion between the two most often arises because individual companies have completely different requirements for backups and archiving.

Confusingly, they also often have their own definitions of what should be stored. What one company archives may be useless for another. And to make things more confusing, there can even be overlap between the two.

The good news for your long-suffering IT manager is that the differences between backups and archives are relatively easy and quick to explain. I’ll start by covering the basics of backups…

What is a backup?

According to the Storage Networking Industry Association (SNIA), a backup is “a collection of data stored on (usually removable) non-volatile storage media for purposes of recovery in case the original copy of data is lost or becomes inaccessible; also called a backup copy. To be useful for recovery, a backup must be made by copying the source data image when it is in a consistent state.”

It is important to note that the original data is not deleted after a backup has been created.

An example of a backup most of us encounter every day is copying smartphone photos to the cloud. The originals are still available on your phone, even when you don’t have a data connection, but there’s an additional copy online. If you lose your phone, you can restore all its data from a cloud backup to a replacement device.

These backups can include not only your data but also applications or even entire operating systems. They are stored in a way that allows you to recover the data and restore it to its original state after an incident, which could be anything from a hardware defect to a hack, or even giving your phone an unplanned bath.

There’s also a difference between having effective backups and having high availability. The latter usually involves two or more data centers, with data replicated to a second location (or even to multiple sites). This means that if one location goes down for any reason, the others can take over seamlessly. The weakness of this approach is that if important files are deleted or infected with a virus, those changes are replicated to the mirrored sites as well.

Effectively, backups are secondary copies of active business information. They are used to recover information a user has deleted or, in the case of a disaster, the data that is essential to get a business back up and running.

But the key characteristic is that these backup copies are focused on keeping track of constantly changing business information. They are therefore generally short-term and frequently overwritten, perhaps weekly or monthly depending on company policy.

Also, when any data is deleted from the original (whether deliberately or by accident), it also disappears from the backups once the defined retention period has passed and the older copies are overwritten. This makes backup a poor choice for long-term storage and for retaining data for compliance reasons.
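To make the retention point concrete, here is a minimal sketch in Python. It assumes a simple age-based policy in which every backup older than a fixed number of days is discarded. Real backup products use more elaborate schemes, and the function names are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def surviving_backups(backup_dates, today, retention_days):
    """Return the backups still kept under a simple age-based policy.

    Assumption: any backup older than `retention_days` has been
    overwritten or deleted.
    """
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if cutoff <= d <= today]


def can_restore(file_deleted_on, backup_dates, today, retention_days):
    """A deleted file can be restored only while at least one backup
    taken before the deletion is still being kept."""
    kept = surviving_backups(backup_dates, today, retention_days)
    return any(d < file_deleted_on for d in kept)


# Daily backups from 1 January to 29 February 2024 (illustrative data).
daily = [date(2024, 1, 1) + timedelta(days=i) for i in range(60)]
deleted_on = date(2024, 1, 10)

soon_after = can_restore(deleted_on, daily, date(2024, 1, 20), 30)
much_later = can_restore(deleted_on, daily, date(2024, 2, 29), 30)
```

Ten days after the deletion the file can still be restored. Seven weeks later, every backup that contained it has aged out, which is exactly why a backup is not an archive.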

However, traditional backups do offer two important features that help increase data availability. Firstly, they give you the ability to “travel back in time” and access multiple older copies of data if the current versions are damaged or deleted.

Secondly, a traditional backup physically separates the backup copy from the production storage systems. By using a cross-media mix of disk and tape technology, backups also make it possible to keep copies on separate, unconnected media, a so-called “air gap”, so that they do not fall victim to an issue that affects the entire company network.
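The first of these features, “travelling back in time”, boils down to picking the newest backup taken at or before the moment you want to return to. The sketch below assumes you have a list of backup timestamps to hand. The function name is hypothetical and not taken from any particular backup tool.

```python
from bisect import bisect_right
from datetime import datetime


def pick_restore_point(backup_times, target):
    """Return the newest backup taken at or before `target`,
    or None if no backup is old enough."""
    times = sorted(backup_times)
    index = bisect_right(times, target)  # count of backups <= target
    return times[index - 1] if index else None


# Three nightly backups taken at 02:00 (illustrative data).
backups = [
    datetime(2024, 3, 1, 2),
    datetime(2024, 3, 2, 2),
    datetime(2024, 3, 3, 2),
]

# A file was damaged on the afternoon of 2 March, so we want the
# last copy taken before that.
chosen = pick_restore_point(backups, datetime(2024, 3, 2, 15))
too_early = pick_restore_point(backups, datetime(2024, 2, 28))
```

Here `chosen` is the 2 March 02:00 backup, and `too_early` is `None` because no backup existed yet at that point.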

How businesses can approach backups

As I mentioned, businesses have different reasons for protecting their data, from recovery to compliance, and place very different requirements on that data protection. Service level agreements, in most cases defined by the business and implemented by the IT team, will differ in terms of how long a recovery may take and how much data it is acceptable to lose.
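These two limits are commonly called the recovery time objective (RTO) and the recovery point objective (RPO). As a rough sketch, and assuming that the worst case is a failure just before the next scheduled backup, a backup schedule can be checked against them like this. The function and parameter names are hypothetical.

```python
from datetime import timedelta


def meets_sla(backup_interval, restore_duration, rpo, rto):
    """Check a backup schedule against an SLA.

    Assumption: in the worst case, everything written since the last
    backup is lost, so the maximum data loss equals the backup interval.
    """
    worst_case_loss = backup_interval
    return worst_case_loss <= rpo and restore_duration <= rto


nightly = timedelta(hours=24)
restore = timedelta(hours=2)

# A nightly backup is fine if the business can lose a day of data...
relaxed = meets_sla(nightly, restore, rpo=timedelta(hours=24), rto=timedelta(hours=4))
# ...but not if it can only afford to lose one hour.
strict = meets_sla(nightly, restore, rpo=timedelta(hours=1), rto=timedelta(hours=4))
```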

Also, in the case of data on mainframes, there’s a rather different approach to managing data, with primary and backup storage. The operating system handles Information Lifecycle Management and can create both backup copies and archives, which makes it very different to open system environments.

To add to the complexity, there are multiple approaches to backup to choose from. One example is the 3-2-1 method. This involves keeping at least three copies of data, storing two backup copies on different storage media, and making sure one copy is held off-site in case the copies at the primary location are damaged (for example by fire or flood).
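If you want to sanity-check a backup plan against the 3-2-1 method, the rule translates into a few lines of Python. This is only a sketch that follows the wording above. The `DataCopy` class and its fields are hypothetical and would need adapting to however you actually record your copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str               # hypothetical label, e.g. "nightly"
    media: str              # e.g. "disk", "tape", "cloud"
    offsite: bool           # True if held away from the primary location
    is_backup: bool = True  # False for the production (primary) copy


def meets_3_2_1(copies):
    """Return True if the copies satisfy the 3-2-1 method:
    at least three copies in total, backup copies on at least two
    different media, and at least one copy held off-site."""
    backups = [c for c in copies if c.is_backup]
    enough_copies = len(copies) >= 3
    two_media = len({c.media for c in backups}) >= 2
    one_offsite = any(c.offsite for c in copies)
    return enough_copies and two_media and one_offsite


plan = [
    DataCopy("production", "disk", offsite=False, is_backup=False),
    DataCopy("nightly", "disk", offsite=False),
    DataCopy("weekly", "tape", offsite=True),
]

plan_ok = meets_3_2_1(plan)
# Dropping the off-site tape leaves two copies on one medium.
reduced_ok = meets_3_2_1(plan[:2])
```

The full plan passes. Without the tape copy it fails all three tests at once, which shows how much work that one off-site copy is doing.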

But regardless of the approach, backups are a necessity. It’s not just a case of having a copy of your business data. It’s about supporting your business by maintaining data integrity and ensuring its availability – something that I am sure everyone will agree is a priority.

But an effective data protection strategy isn’t fully covered by having backups in place. Businesses also need to implement archiving in a way that meets their unique requirements. Next time, we’ll take a look at archives – explaining how they are different to backups and why you need both (Spoiler – it’s not just to stop your IT leaders from grinding their teeth).

Watch this space!

In the meantime, why not register for Fujitsu Forum Munich where you’ll be able to see our data protection portfolio in action?