It was around the end of the 20th century that the term “big data” started to come into use. Initially considered more of a marketing than a technology concept, it was several more years before the phrase was widely adopted, and several more before data really became big.
At that time, a University of California, Berkeley study found that the entire planet produced about 1.5 exabytes of unique information a year – about 250 megabytes for every man, woman, and child.
Today, armed with smartphones that have more processing capability than the fastest computers of the turn of the century, we collectively generate more than 2.5 exabytes of data EVERY DAY.
All this growing data must be stored somewhere – something that is a particularly important issue for businesses, since today all data has the potential to deliver insights when combined and analyzed: insights that can deliver new customer experiences, revenue streams and differentiation.
In fact, any data created today has the potential to be put to use in many ways, over and above its initial purpose. This places new demands on the storage technology we use – not just in terms of being able to scale, but also because today’s data needs to be accessible.
Growth, growth, growth – is there anything else besides the need for growth?
Let us continue our trip through the data jungle. You remember the giant redwoods, aka traditional RAID storage? These evergreen giants grow sky high, with trunks up to 20 meters in circumference, and are the world's biggest trees.
Why do we need to speak about SDS (Software-Defined Storage), when we already have a scalable, reliable and proven storage technology available?
You need to understand that today's storage systems must deliver far more than their predecessors did – and not only in terms of scaling to new capacity dimensions. They also have to be more flexible.
They need to expand seamlessly to deliver greater capacity and support more and more systems, all without interruption. And enterprise data needs to be viewed as a whole to deliver value.
If all that wasn’t enough for a business to manage, there are further trends that place additional stress on storage systems, such as the rise of cloud-native applications. This is where we will encounter the rainbow eucalyptus tree, aka SDS.
First, let's talk about scalability. When we look at how the rainbow eucalyptus is cultivated, we find the right analogy for an important difference between traditional RAID storage and SDS.
Thanks to its rapid growth – up to 4 meters per year – and the suitability of its fibers for paper production, rainbow eucalyptus is cultivated on plantations. To increase a plantation's yield, you do not push a single tree to grow faster or higher – you simply add more trees to the plantation.
The scalability of traditional RAID storage is impressive, but just as the growth of the redwood is limited by its given maximum height, the scalability of traditional RAID storage ends with the physical limit of the array and internal platform design.
SDS scales more like a plantation – you simply add nodes to the system's overall storage pool. An SDS system's management software is typically distributed across all nodes. This makes it very easy to scale, providing the flexibility to add capacity exactly when you need it. The need for long-term capacity planning and forecasting shrinks dramatically – and in theory, there is no limit to the capacity.
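The plantation analogy can be sketched in a few lines of Python. This is a conceptual illustration only – the class names are invented for this sketch and do not correspond to any real SDS product's API:

```python
# Conceptual sketch: scale-out capacity growth in an SDS-style pool.
# Class names are illustrative, not a real storage API.

class StorageNode:
    """A commodity server contributing its disks to the shared pool."""
    def __init__(self, capacity_tb: float):
        self.capacity_tb = capacity_tb

class ScaleOutPool:
    """Total capacity is simply the sum of all registered nodes."""
    def __init__(self):
        self.nodes = []

    def add_node(self, node: StorageNode):
        # Expanding the pool is non-disruptive: just register another node.
        self.nodes.append(node)

    @property
    def capacity_tb(self) -> float:
        return sum(n.capacity_tb for n in self.nodes)

pool = ScaleOutPool()
for _ in range(3):
    pool.add_node(StorageNode(capacity_tb=100.0))
print(pool.capacity_tb)  # 300.0
```

Contrast this with a fixed array, whose maximum capacity is decided at design time: here, growth is just another `add_node` call.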
Secondly, it is about the ability to adapt to change. The marvelous eucalyptus tree is cultivated not only on plantations, but also as an ornamental element in parks and gardens. The reason for this is the exceptional coloring of its bark. The bark is smooth and brownish, but because cracks develop in it year after year, the bright green bark underneath is revealed, which then progresses through a color spectrum from bright blue and violet to orange and fire red before becoming yellowish-brown again. It is this colorful, ever-changing bark that gave the rainbow tree its name.
What does this have to do with SDS? Well, it’s not just the volume of data that has evolved, it’s also the type of data we typically store and how we use it. We now increasingly create and use massive files – such as those used in medical imaging or for AI-based quality assurance processes.
Our IT architectures have also come a long way. Businesses have moved from having all their data stored on equipment in their own in-house data centers to implementing sophisticated hybrid IT approaches that put individual workloads on systems optimized to deliver the speed and security they need. That means that a storage system needs to encompass data held not just on premises, but also in a combination of public and private clouds.
Our applications and how we construct them have also changed. Rather than creating applications as large monolithic blocks of code that are hard to maintain and change, today's apps are increasingly based on micro-services.
This approach sees the entire application carved up into multiple separate functions, or microservices, each with a single specific goal – for example, on an e-commerce website, one microservice might handle the payment, another might be dedicated to the shopping cart, and so on. This not only reduces complexity, but also greatly enhances scalability, flexibility, reliability and availability. Microservices are the logical choice to deliver the fast roll-out of new capabilities.
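The checkout example above can be sketched in miniature. The service names and request shapes here are purely illustrative – in production each service would run as its own independently deployed and scaled process:

```python
# Illustrative sketch: each microservice owns exactly one responsibility.
# Names and payload shapes are invented for this example.

def cart_service(request: dict) -> dict:
    """Shopping-cart microservice: only manages cart contents."""
    items = request.get("items", [])
    return {"cart_total": sum(i["price"] * i["qty"] for i in items)}

def payment_service(request: dict) -> dict:
    """Payment microservice: only authorizes a charge."""
    return {"status": "authorized", "amount": request["amount"]}

# In a real deployment these would be separate network services;
# here we simply call them in sequence to simulate a checkout.
cart = cart_service({"items": [{"price": 20.0, "qty": 2}]})
payment = payment_service({"amount": cart["cart_total"]})
print(payment)  # {'status': 'authorized', 'amount': 40.0}
```

Because each function has a single goal, either service can be rewritten, redeployed or scaled out without touching the other – the property the paragraph above describes.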
Just as the color composition of a rainbow eucalyptus changes every year and never looks the same, your application and IT environment may change too. The good news is that SDS does not merely support new-age workloads – it was designed with cloud-native apps and modern workloads in mind, for example to help IT leaders support workloads that change dramatically over a short time frame, such as those built from microservices.
Another cloud-native characteristic of SDS is its extensive use of APIs, which are designed with a focus on the data rather than the underlying hardware. This makes the data uniquely accessible from anywhere and makes it easy to automate data management.
This makes it possible to deliver single-pane visibility over the whole storage system, effectively allowing even non-storage specialists to manage complex systems and thereby reducing management overheads.
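A single-pane view is exactly the kind of task such APIs make easy to automate. The sketch below aggregates per-node status into one summary; the endpoint shape and field names are hypothetical stand-ins, not any vendor's actual REST API:

```python
# Hedged sketch: building a single-pane view from per-node API responses.
# The response shape here is invented for illustration.

def node_status(node_id: str) -> dict:
    # Stand-in for a real REST call such as GET /nodes/<id>/status;
    # a real script would fetch this over HTTP from each node.
    return {"node": node_id, "used_tb": 40, "capacity_tb": 100, "healthy": True}

def single_pane_view(node_ids: list) -> dict:
    statuses = [node_status(n) for n in node_ids]
    return {
        "total_capacity_tb": sum(s["capacity_tb"] for s in statuses),
        "total_used_tb": sum(s["used_tb"] for s in statuses),
        "all_healthy": all(s["healthy"] for s in statuses),
    }

view = single_pane_view(["node-1", "node-2", "node-3"])
print(view)  # {'total_capacity_tb': 300, 'total_used_tb': 120, 'all_healthy': True}
```

Because the API exposes data rather than hardware details, the same few lines work regardless of which disks or controllers sit underneath.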
We’ve seen a remarkable change in data use, volume and expectations over the last twenty years. Data has become the very lifeblood of our businesses – and its free circulation is required to give life to new revenue streams and opportunities.
The expectations placed on data – including those of the new cloud-native apps appearing in their millions every year – require a storage system that can meet their unique requirements. SDS, a storage technology designed for the cloud and for these "new-age workloads", is the logical approach.
Despite this huge change in the volume of data we store, the type of data, how and where we use it and ever-growing expectations for flexibility, many businesses are still deploying RAID storage.
This is a technology first introduced in the late 1980s to store data for yesterday's monolithic and client-server applications. It relies on appliances that couple physical storage hardware with management software, which often runs in dedicated storage network interfaces or in the platform's firmware.
While virtualization and storage area networks have replaced the traditional server connected to dedicated storage, few enterprises have successfully eliminated all the data silos inherent in their systems. RAID has done a sterling job of keeping up with our insatiable drive to create data. But in some use cases we are reaching the limits of RAID, and a different approach is required – these traditional, monolithic systems scale as slowly as mature trees grow.
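For readers who have never looked under RAID's hood, its core redundancy trick is simple: RAID 5 stores an XOR parity block alongside the data blocks, so any single lost block can be rebuilt from the survivors. A minimal sketch:

```python
# Minimal sketch of RAID-5-style parity: XOR of the data blocks lets
# any single missing block be reconstructed from the remaining ones.

def xor_parity(blocks: list) -> bytes:
    """XOR equal-length byte blocks together."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disks
parity = xor_parity(data)            # parity block on a fourth disk

# Simulate losing disk 1: XOR the surviving blocks with the parity
# to recover its contents.
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
```

This per-stripe scheme is exactly what makes classic RAID robust – and also why its redundancy is bound to the fixed geometry of one array, in contrast to the node-level distribution of SDS.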
Just four or five years ago, we undertook a survey in which one in five respondents considered SDS simply a marketing buzzword rather than a real IT solution. But SDS has now reached a stage of technical maturity where it is beginning to be widely adopted by mainstream enterprises as well as small and mid-size organizations, in support of unpredictable and very mixed workloads.
These include artificial intelligence (AI) and machine learning (ML) systems that require flexible, highly scalable storage, as well as workloads that need access to very large numbers of files or to files of extreme size.
Another major trend we're seeing is the increasing adoption of hyper-converged infrastructure (HCI), which is also driven by the complexity of today's IT architectures. Similar to SDS, where the software is abstracted from the hardware, hyper-converged infrastructures virtualize compute, networking and storage components as a whole, enabling shared, virtualized resources to be managed, deployed, expanded and recovered entirely via software control.
Software-defined storage and hyper-converged infrastructures are also being deployed effectively within wider IT infrastructure strategies to reduce costs. Further benefits are being unlocked by provisioning and automating enterprise storage systems more effectively.
But this is just the start – the flexibility, scalability and versatility of SDS means that it is being deployed for more and more use cases. In fact, according to Gartner analysts, by 2024, 50% of the global storage capacity will be deployed as SDS either on-premises or on the public cloud (up from less than 15% this year).
Just as the redwood tree and the rainbow eucalyptus prosper in their specific environments, enterprises continue to require a selection of data storage solutions to meet their individual needs. And regardless of their unique requirements, Fujitsu's broad storage portfolio, supported by partnerships with industry leaders, ensures that every customer's solution is the perfect fit. For a growing number of customers, SDS is finding a place in their storage strategy.
And who knows – maybe such a beautiful rainbow tree would also fit in your data center!
Come join us on a trip through the Data Jungle at www.fujitsu.com/data-jungle/