Data Protection Must Change In Virtualization Age

Backup has never been easy or particularly reliable, and virtualization just makes it harder.

My first job in the storage market was with a backup software vendor in the late ’80s. We were among the first companies advocating the concept of backing up data across a network to an automated tape library. For many reasons, it was very complicated and failed more often than it worked. Fast forward almost 25 years and, overall, the data protection process still doesn’t work well.

Why is this the case? Why is backup the job that no one wants? Why do you even have to think about backups? Great questions. While backup software and the devices we back up to have made significant improvements in capability and ease of use, they still fail almost as often as they work.

Rapid file growth
The single biggest challenge to the backup process isn’t the size but the quantity of data. The size or amount of data you must protect is certainly an issue, but it’s one we’ve handled for years. Addressing it means constantly upgrading network connections, along with deploying faster backup storage devices.

[Watch as virtualization and cloud solutions architect Bill Kleyman explains Why The Datacenter Is The Center Of The Universe.]

The bigger challenge is the number of files that must be protected. We used to warn customers about servers with millions of files; that is now commonplace. Now we warn customers about billions of files. Backing up these servers via the file-by-file copy method common in legacy backup systems is almost impossible. In many cases, it takes longer to walk the file system than it does to actually copy the files to the backup target.
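To see why the walk itself becomes the bottleneck, here is a minimal sketch (my own illustration, not any vendor's product) of a legacy-style incremental backup. Even when almost nothing has changed, it must traverse every directory and stat every file just to decide what to copy, so the cost scales with total file count, not with the amount of changed data:

```python
import os
import shutil

def incremental_backup(source_root, backup_root):
    """File-by-file incremental backup: walk the entire source tree,
    stat every file, and copy only those newer than the backup copy.
    Returns (files_examined, files_copied)."""
    examined = 0
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel = os.path.relpath(dirpath, source_root)
        dest_dir = os.path.join(backup_root, rel)
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            examined += 1  # unavoidable per-file work, changed or not
            if (not os.path.exists(dst)
                    or os.path.getmtime(src) > os.path.getmtime(dst)):
                shutil.copy2(src, dst)  # copy2 preserves mtime
                copied += 1
    return examined, copied
```

On a second pass over an unchanged tree, `copied` comes back as zero, but `examined` is still the full file count; multiply that per-file stat cost by billions of files and the walk dominates the backup window.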

Rapid server growth
Virtualization of servers, networks, storage, and just about everything else brings significant flexibility to datacenter operations. It has also created an “app” mentality among users and line-of-business owners. Everything is an app now, and that means another virtual machine created in the virtual infrastructure. The growth rate of VMs within a company after a successful virtualization rollout is staggering.

All of these VMs, or at least most of them, must be protected. While most, if not all, backup solutions have solved the problem of in-VM backups, few have addressed the massive growth in VM count. Often each VM must be its own job, and that means managing and monitoring potentially hundreds of jobs.
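The scaling problem here is administrative, not technical. A hypothetical sketch (illustrative only, not modeled on any vendor's API) of the one-job-per-VM pattern makes the point: the number of objects an administrator must schedule, run, and triage grows linearly with the VM count.

```python
import random

class BackupJob:
    """One backup job per VM -- a hypothetical model of the
    job-per-VM pattern described above."""
    def __init__(self, vm_name):
        self.vm_name = vm_name
        self.status = "pending"

    def run(self, fail_rate=0.0):
        # Simulate an attempt; real jobs fail for many reasons
        # (snapshot errors, quiescing, network, target capacity).
        self.status = "failed" if random.random() < fail_rate else "ok"
        return self.status

def run_all(vm_names, fail_rate=0.0):
    """Run one job per VM; return all jobs plus the names that
    need an administrator's attention."""
    jobs = [BackupJob(name) for name in vm_names]
    failed = [j.vm_name for j in jobs if j.run(fail_rate) == "failed"]
    return jobs, failed
```

With 500 VMs, even a modest failure rate leaves dozens of jobs to investigate every morning, which is exactly the monitoring burden the column describes.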

Growth of user expectations
These realities are compounded by the fact that user expectations are at an all-time high. Users now interact with online services that seem never to go down, and they expect the same from their IT. In other words, recovery must be instant, or at least fast. Even the time to restore data from the backup server may take too long, especially if there are billions of files to deal with.

The fix may be better primary storage
The fix for all this may be to make primary storage more responsible for its own protection. Clearly it does that to some degree already, providing protection from drive and controller failure. But given all of the above challenges, it should also provide long-term, point-in-time data protection, so that if an application corrupts data you can roll back instantly to a copy made an hour ago.
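What point-in-time protection on primary storage means in practice can be sketched in a few lines. This toy store (my own illustration; real arrays implement this with copy-on-write metadata rather than full copies, which is what makes snapshots near-instant) keeps timestamped snapshots and can revert to any of them without touching a backup server:

```python
import copy
import time

class SnapshottingStore:
    """Toy key-value store with point-in-time snapshots, sketching
    the rollback capability described above."""
    def __init__(self):
        self._data = {}
        self._snapshots = []  # list of (timestamp, frozen state)

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)

    def snapshot(self):
        # A real system records copy-on-write metadata here instead
        # of deep-copying the data, so snapshots cost almost nothing.
        stamp = time.time()
        self._snapshots.append((stamp, copy.deepcopy(self._data)))
        return stamp

    def rollback(self, stamp):
        """Revert to the newest snapshot taken at or before `stamp`.
        Returns True on success, False if no such snapshot exists."""
        for taken, frozen in reversed(self._snapshots):
            if taken <= stamp:
                self._data = copy.deepcopy(frozen)
                return True
        return False
```

The recovery path is the key point: rolling back is a metadata operation on the primary array, not a file-by-file restore across the network.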

At the same time, data protection should change. We have seen intelligence added to systems in order to incrementally and rapidly back up large file stores. We’ve also seen instant-recovery products that allow a VM to run directly from a backup. But there are challenges with instant recovery that must be addressed, such as how well that instantly recovered VM will perform from a disk backup device and how that VM can be migrated back into production.

I’ll dive into the various potential solutions, including better-protected primary storage and smarter data protection, in my next couple of columns.

George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on storage and virtualization systems. He writes InformationWeek’s storage blog and is a regular contributor to SearchStorage, eWeek, and other publications.

Are you better protected renting space for disaster recovery or owning a private cloud? Follow one company’s decision-making process. Also in the Disaster Recovery issue of InformationWeek: Five lessons from Facebook on analytics success. (Free registration required.)
