Data protection is coming of age. Once considered an unavoidable necessity by businesses, it is turning into a driver for data-driven transformation. But to tap into the insights that data can deliver, businesses must first manage their data effectively. And with hybrid IT estates becoming more complex and data volumes growing rapidly, there are pitfalls and misperceptions to avoid.
Here are nine of the most common issues, along with some steps that businesses should consider to protect their data and make it work for them.
Don’t take your eye off the threat of ransomware
Despite the general business slowdown due to the Covid-19 pandemic, cybercrime is booming, with criminals leveraging ransomware to take advantage of the changing situation. And with an increased risk profile due to remote working arrangements being set up hastily, this threat remains greater than ever.
Unlike localized issues such as the deletion of individual files, ransomware can affect ALL business data. And paying the ransom is no guarantee that the data will be restored.
Consequently, it’s important to consider data protection measures able to deal with failures of any size – from accidental file deletion to a catastrophic data center problem. To stop ransomware from infecting every networked file, make sure you have a traditional “3-2-1 backup” in place: at least three copies in total, on at least two types of storage media, with one copy kept completely offline.
Be warned that a High Availability architecture isn’t enough to protect you in these cases. High Availability is designed to protect against hardware failure, not against the far-reaching data corruption caused by ransomware.
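As a rough illustration, the 3-2-1 rule described above can be written as a simple check over an inventory of backup copies. This is a minimal sketch: the `BackupCopy` record, its fields and the sample inventory are assumptions made up for this example, not part of any particular backup product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """Hypothetical record describing one copy of the data."""
    name: str
    media: str     # e.g. "disk", "tape", "cloud"
    offline: bool  # True if the copy is disconnected from the network


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Check the rule as stated in this article:
    at least 3 copies, on at least 2 media types, with at least 1 offline."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offline = any(c.offline for c in copies)
    return enough_copies and enough_media and has_offline


# Illustrative inventory (invented values).
copies = [
    BackupCopy("primary", "disk", offline=False),
    BackupCopy("replica", "disk", offline=False),
    BackupCopy("vault", "tape", offline=True),
]
print(satisfies_3_2_1(copies))
```

Removing the tape copy from this inventory makes the check fail on all three counts, which is the situation a replicated, always-online High Availability setup would be in.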
Don’t underestimate the value of your old data
In the past, there was a clear demarcation between a business’s important active data – for example, SAP system information – and older data no longer in daily use and perhaps even out of date.
However, companies today are sifting through “data lakes” of old information in the hope of deriving valuable new insights that can drive disruptive business decision-making.
This data could be gathered from many diverse sources – from cameras monitoring a loading bay to the sensors on machines on the shop floor. Self-driving cars in training capture multiple terabytes of data every day.
This information is valuable not just for teaching the cars to be autonomous, but also because there may be opportunities to monetize it, for example by selling detailed traffic flow and volume information to outdoor advertisers who want to monitor exactly who has been exposed to their ads.
Additionally, there is increasing interest in validating the data used to train AI, to avoid accidentally learning an unwanted behavior.
Another example of this use of previously archived data is Fujitsu’s own System Inspection Service for SAP solutions. This examines a customer’s SAP architecture and identifies areas for optimization. Improvements include flagging areas where server load redistribution could deliver improved response times.
This system is extremely effective thanks to a collection of 10 years’ worth of SAP systems information. We collected the data, stored it, and now it is generating new value both for us and for our customers. I’d love to say it was our intention from the outset – but it wasn’t.
One business, one backup system
Businesses all face the challenge of storing vast amounts of data. This makes cost-effective storage crucial, but the data must also be quickly accessible. That requires massive parallelization of data streams, which the FUJITSU Storage ETERNUS CS8000 has been designed to handle. Spinning disks also have their limitations: even hyperscale providers offload some data onto tape, as that remains the best method for long-term storage.
But what about the cloud? It remains a major driver for customers looking for a data protection environment.
Many businesses now run hybrid infrastructures with a mixture of applications that run either in the cloud or on-premises. They often start by trying to set up separate backups to protect each location, plus the rapidly growing data collected at the network edge – and soon run into difficulties.
There’s no reason to run separate data protection systems. For a business to truly leverage its data, it must consolidate and plan to protect all of its data. Therefore, best practice is to implement a single unified data protection approach for the entire network from edge to core to cloud.
Restoring a backup can also be the most effective way of moving or sharing data
Many businesses prefer to keep the data relating to their business-critical applications on-premises and use the cloud for other applications. We’re seeing increased numbers of businesses using the cloud to quickly set up training environments or sandbox a process to test it out – for example, checking the effect of a new patch on key systems. The agility and fast setup of the cloud make it an obvious choice for these use cases, but the training environment or sandbox still needs access to data.
Backups represent a convenient way of moving data around to make it available where needed. Businesses can even decide which version to use: a training system or sandbox doesn’t need access to the latest data; it just needs to work with data that resembles the live data. In this case, an older backup is the perfect fit.
The same data transfer mechanism works to simplify migration – by taking backup data from one cloud or storage target and restoring to another.
Cloud storage is not always as cheap as it first appears
Many customers use the cloud for backup because they consider it cheap. But “buyer beware”, as the cost can vary dramatically. Some cloud storage is extremely cost-effective for keeping data, as storing it costs very little and the main charges apply when you retrieve it.
However, this also means that a major restore can cost significantly more than expected. In some cases, keeping the data on-premises is cheaper overall. This has even started a “cloud boomerang” trend, where businesses are bringing data back from the cloud to be stored on modern, much cheaper on-premises equipment.
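To make the trade-off concrete, here is a small worked calculation. It is only a sketch: the function name, the pricing model (a per-terabyte monthly storage charge plus a per-terabyte retrieval charge) and every figure are assumptions invented for illustration, not real prices from any provider.

```python
def cloud_total_cost(stored_tb, months, restored_tb,
                     storage_price_per_tb_month, retrieval_price_per_tb):
    """Total cloud cost = storage charges over time + retrieval charges."""
    storage = stored_tb * months * storage_price_per_tb_month
    retrieval = restored_tb * retrieval_price_per_tb
    return storage + retrieval


# All figures below are invented; units are arbitrary currency.
# 100 TB kept for 36 months at 1 per TB-month, retrieval at 90 per TB.
routine = cloud_total_cost(100, 36, 1, 1, 90)     # only 1 TB ever restored
disaster = cloud_total_cost(100, 36, 100, 1, 90)  # one full 100 TB restore
on_premises = 10_000  # hypothetical flat cost over the same period

print(routine, disaster, on_premises)
```

With these made-up numbers, the cloud is far cheaper as long as restores stay small, but a single full restore pushes the total above the on-premises figure. The crossover point depends entirely on real pricing and restore patterns, so it is worth running the same arithmetic with actual quotes.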
You may be responsible for protecting your data in the cloud
For businesses with no second location to replicate data to, putting backup data in the cloud can be extremely convenient: even in their worst-case scenario, they would still have access to their backed-up data from any location.
One caution for this approach though: cloud data is not automatically protected.
Very often the responsibility for protecting it falls to the data owner, not the cloud provider. Therefore my recommendation is to ensure that you also create an on-premises backup of the cloud data.
It’s not about backup, it’s about recovery
Although we might talk about backup, the most important factor is recovery. The interval between backups determines how much data you’ll lose if you need to restore. The volume of data you need to restore, in turn, determines how long it will take to recover.
If you just want to recover a single file that was accidentally deleted, then most systems can deliver this very quickly, but if a complete recovery is needed, petabytes of data can take a long time to restore. Consequently, best practice suggests having a local backup copy for fast recovery purposes.
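Both relationships can be estimated with simple arithmetic. The sketch below assumes a constant restore throughput and decimal units (1 TB = 1,000 GB); the function names and sample figures are hypothetical and only meant to show the order of magnitude involved.

```python
def worst_case_data_loss_hours(backup_interval_hours):
    """In the worst case, a failure happens just before the next backup,
    so everything written since the previous one is lost."""
    return backup_interval_hours


def restore_time_hours(volume_tb, throughput_gb_per_s):
    """Time to restore a volume at a constant throughput (an assumption)."""
    seconds = volume_tb * 1000 / throughput_gb_per_s
    return seconds / 3600


# Illustrative figures: daily backups, 1 PB (1,000 TB) restored at 5 GB/s.
loss = worst_case_data_loss_hours(24)
full_restore = restore_time_hours(1000, 5)
single_file = restore_time_hours(0.001, 5)  # a 1 GB file

print(f"worst-case loss: {loss} h, full restore: {full_restore:.1f} h")
```

Under these assumptions, a full petabyte restore takes more than two days even at a sustained 5 GB/s, while a single file comes back in a fraction of a second. That gap is why a fast local copy matters for large recoveries.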
Modernizing your applications might need a new data protection plan
A few months ago, there was an international technology story about the US State of New Jersey, whose governor sent out a cry for help with its aging applications. Written in the near-obsolete programming language COBOL, the state’s unemployment benefit system was overloaded with benefit claims, and the state was unable to find resources to update it.
COBOL is still surprisingly prevalent in a wide variety of applications, which will likely need to be re-implemented with code that businesses can support – triggering new data protection requirements.
These application changes are also driven by new business requirements. For example, a newly implemented mobile app for financial services is likely to leverage Kubernetes, widely regarded as the standard for container orchestration, as it groups the containers that make up an app for easy management and discovery.
This means that these apps can scale up and down extremely fast to handle many thousands of users. Kubernetes also allows the app to be run seamlessly between on-premises and the cloud.
However, the data for these typically stateless apps does need a different approach to storage, one that is compatible with enterprise-level container detection and persistent storage.
There’s protection, and then there’s overprotection
Typical data centers experience a great deal of change, with many systems generating diverse types of data across multiple locations – so it’s easy to lose track of what is protected. You don’t want to find out the hard way that key data wasn’t protected after all.
In fact, sometimes we see the opposite extreme, where multiple backups are run on the same system. We recently uncovered a customer’s large server that was being backed up twice, with three copies each time, adding an enormous amount of unnecessary cost.
That’s why best practice is to run SystemInspection for Storage to ascertain whether everything is being protected, confirm the level of protection and ensure each system is backed up only once.
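The underlying idea of such a coverage check can be sketched in a few lines. This is not how SystemInspection for Storage works internally; it is a simplified illustration that assumes you already have a list of systems and a list of backup jobs, and all names below are invented.

```python
from collections import Counter


def audit_backup_coverage(systems, jobs):
    """Return (unprotected, duplicated) system names.

    systems: iterable of system names that should be protected
    jobs:    iterable of (job_name, target_system) pairs
    """
    counts = Counter(target for _job_name, target in jobs)
    unprotected = sorted(s for s in systems if counts[s] == 0)
    duplicated = sorted(s for s in systems if counts[s] > 1)
    return unprotected, duplicated


# Hypothetical inventory and job list.
systems = ["erp-db", "file-server", "edge-gateway"]
jobs = [
    ("nightly-erp", "erp-db"),
    ("weekly-erp", "erp-db"),        # second job on the same system
    ("nightly-files", "file-server"),
]

unprotected, duplicated = audit_backup_coverage(systems, jobs)
print("unprotected:", unprotected)
print("backed up more than once:", duplicated)
```

A real audit would also need to compare retention settings and copy counts per job, since two jobs on one system are sometimes intentional.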
Ultimately, a business today can only be as successful as its ability to leverage its data. In the race to data-driven digital transformation, effective data management means the difference between the winners and the losers.
Leveraging that data is challenging, as it is generated and stored in many different formats across vast, complex IT landscapes, scattered between the core, the edge and the cloud. But the foundation for leveraging it effectively is to create a single data lake, easily accessible by any application and backed up securely to combat even the most pernicious ransomware.
And that’s what data management is all about.