
Ransomware Counts on Patch Tuesday

June 20, 2023 by George Crump

Ransomware counts on Patch Tuesday to successfully infiltrate an organization. There is nothing wrong with applying patches on a Tuesday; the problem is which Tuesday. Ideally, you want to apply a patch on the first Tuesday after its release. Doing so would eliminate the exploits that most ransomware and other cyber threats depend on to do their work.

The problem is that organizations wait weeks or even months to apply patches. Why? Because the IT team needs to understand how a proposed patch will impact the rest of the environment. They don’t want to apply a patch that suddenly causes other, currently working, systems to fail.

Today’s infrastructure solutions must enable IT to vet and apply patches quickly and eliminate Patch Tuesday altogether. IT needs a solution that can address these patching challenges:

  1. Difficulty determining where the potential conflict is because of the number of vendors involved in delivering IT services.
  2. Difficulty in assembling and maintaining a lab environment to test patches.
  3. Difficulty rolling back a patch once it is deployed.

Eliminate Patch Tuesday and set yourself up for ransomware recovery success by attending our live TechTalk, “Creating a Ransomware Response Strategy,” this Thursday at 1:00 PM ET.

There are Too Many Vendors to Eliminate Patch Tuesday

One of the biggest challenges facing IT teams as they attempt to apply patches ahead of the next ransomware attack is the complexity of the multi-vendor data center, and this is why ransomware counts on Patch Tuesday. While hyperconverged infrastructure (HCI) was supposed to make the multi-vendor data center easier to manage, it has had the opposite effect. Traditional HCI is still a vertically layered stack of multiple software solutions. At a minimum, most HCI includes software-defined storage (SDS), a hypervisor (VMware/Hyper-V), software-defined networking, and software that protects the environment (backup and recovery).

Many environments are only one step down the software-defined path, running a legacy three-tier stack, virtualizing only compute. As a result, legacy data centers and even more “modern” HCI data centers are equally confusing when determining the impact of applying a patch.

Ultraconverged Infrastructure Simplifies Patch Reconciliation

VergeOS rotates the traditionally vertical IT stack into a tightly integrated linear plane that provides all infrastructure services (networking, hypervisor, storage, data protection) as a data center operating system within a singular software code base. We call this ultraconverged infrastructure (UCI), and it moves beyond legacy hyperconverged infrastructure to deliver greater efficiency and scalability at a significantly lower cost.

Reducing the IT stack to a singular, horizontal layer increases efficiency and scalability and simplifies the patching process. Updates for the entire infrastructure come from a single source, and because VergeOS is inherently highly available, IT can apply patches and updates without disruption. VergeOS applies patches one node at a time, and workloads automatically move between nodes so that applications are unaffected.
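The node-at-a-time rollout described above can be sketched in a few lines. This is an illustrative model only; the function names and data structures are hypothetical, not the VergeOS API.

```python
# Illustrative sketch of a rolling, node-at-a-time update: workloads are
# live-migrated off each node before it is patched, so applications
# never see an outage. `apply_patch` and `migrate` stand in for
# platform-supplied operations (hypothetical names).

def rolling_update(nodes, workloads, apply_patch, migrate):
    """nodes: list of node names; workloads: dict of workload -> host."""
    for node in nodes:
        # Drain the node: move every workload it hosts elsewhere first.
        for wl in [w for w, host in workloads.items() if host == node]:
            target = next(n for n in nodes if n != node)
            migrate(wl, target)
            workloads[wl] = target
        apply_patch(node)          # node is now empty; safe to patch
    return workloads               # final placement after the rollout
```

The key property is the ordering: a node is only patched after it hosts zero workloads, which is what lets patches go out the day they are released.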

You Need a Lab to Eliminate Patch Tuesday

Patches also come from operating systems and application vendors. Properly evaluating the impact of these patches is best done in a lab. IT organizations need a lab for patch testing and various other use cases. The problem is not just the cost to configure and maintain the lab but also making sure the lab has the same settings and data as the production environment. These requirements mean that most organizations don’t have a dedicated lab environment. When one is needed, they have to scramble to put something together. As a result, the lab is nothing like the production environment they are looking to simulate.

Virtual Data Centers: The Always Ready Lab

One of the critical capabilities of VergeOS is Virtual Data Centers (VDC). Virtual Data Centers are to physical data centers what virtual machines (VMs) are to physical servers: an encapsulation. Using another VergeOS capability, IOclone, IT professionals can, within milliseconds, create a space-efficient copy of their entire data center.

Capturing the entire data center, including the data, networking configuration, storage policies, and application setups, is critical to ensuring that IT does patch verification against an exact replica of production. Since the copy is standalone and not dependent on the original, administrators can apply the patch without concern of impacting the production environment.

IT can implement a single VDC for its entire data center or subdivide it by application or workload. For example, a VergeOS administrator may create a VDC for Oracle, another for MS-SQL, and a “core” VDC for general-purpose VMs. Each VDC can be cloned hundreds of times, and those clones can be used as golden masters, backups, development, and patch verification.
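The clone-then-test workflow this enables can be sketched as follows. Everything here is a hypothetical illustration (a deep copy stands in for an IOclone copy, and the function names are invented), not VergeIO’s implementation.

```python
# Sketch of patch verification against a clone of production: apply the
# candidate patch to a copy, run tests there, and only promote the
# patch if the tests pass. Production is never touched on failure.
import copy

def verify_patch(production_vdc, patch, run_tests):
    """patch: callable that mutates a VDC; run_tests: callable -> bool."""
    lab = copy.deepcopy(production_vdc)   # stands in for an IOclone copy
    patch(lab)                            # the patch touches only the clone
    if run_tests(lab):
        patch(production_vdc)             # safe: promote to production
        return True
    return False                          # production remains unchanged
```

Because the clone is standalone, a failed test costs nothing but the lab copy; the production VDC is identical to what it was before the experiment.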

IT Needs to Eliminate Patch Tuesdays AND Surprise Wednesdays

Even with the best testing, sometimes an errant patch slips through. Depending on the level of chaos it causes, IT may have to recover from the backup infrastructure completely. Recoveries from backup, especially large ones, are time-consuming, meaning IT may deal with the Wednesday surprise for the rest of the week. The problem is most infrastructure software is too inefficient to maintain its data protection points, typically traditional snapshots, for more than a few hours. As pointed out in this article, “VMware Storage Challenges,” this problem is especially apparent in VMware environments.

IOclone: Unlimited Clones and Retention

To make surprise Wednesdays less of a concern, IT needs the ability to retain backup copies for more than a few hours. Traditional backup software can meet this need, but the time and nuance involved in recovering an application after an errant patch are significant. An IOclone copy captures the entire state of a VM, or even of the entire data center or workload. No rollback is needed; point to the last known good instance, and the application is running.

Get Ahead of Ransomware


Because ransomware counts on patch Tuesday, applying the latest patches is critical to staying ahead of ransomware. With VergeOS, IT can apply patches almost as soon as they are released without waiting for Tuesday. They can test application patches against a mirror image of their production environment. If an errant patch slips through, they can instantly point to the non-patched version.

Even with the improved patching capabilities within VergeOS, ransomware may still slip through because of user carelessness. Our IOfortify solution takes you the rest of the way by leveraging the hardened VergeOS, IOclone, and new detection capabilities to deliver rapid restoration from an attack. During our TechTalk, “Designing a Ransomware Response Strategy,” we will conduct a live demonstration of IOfortify in action. See if we can recover a VM under attack during the webinar.

Patch Comparison: Traditional Infrastructure Software vs. VergeOS

| Rapid Patch Requirement | Traditional Infrastructure Software | VergeOS |
| --- | --- | --- |
| Determining Patch Impact | Difficult – multiple vendors make identifying potential conflicts time-consuming | Easy – one vendor |
| Pre-deployment Testing | Difficult – hard to set up, maintain, and pay for a dedicated lab | Easy – Virtual Data Centers and cloning can create “instant labs” |
| Patch Rollback | Hard – recovering from a backup copy is very time-consuming | Easy – no rollback required; just point to the pre-patched clone |


Building a Ransomware Response Checklist

June 14, 2023 by George Crump

The best time for IT Professionals to start building a ransomware response checklist is now, before an attack occurs. There are several reasons for creating a checklist:

√ Successful Ransomware Response requires preparation.

√ Stress levels are high during an attack. You might forget a critical element in a rush to get everything back online.

√ A checklist will expose areas where you must practice and test.

√ A checklist provides a framework for comprehensive auditing.

Section One: Build a Ransomware Resilient Foundation

▢ Implement a Prevention Solution
The first step in building a ransomware response checklist is to have the foundational elements covered. The best response is the one you don’t have to conduct because the attack doesn’t get through. While no prevention solution is perfect, and you still need a response strategy, they are effective at preventing many types of attacks.

▢ Simplify Patching
Most patch releases sent to IT professionals today close potential security exploits. These patches should be applied upon release. The problem is that most IT professionals are hesitant to apply patches because of downtime and the potential for unexpected side effects. This is especially true of infrastructure software, since an errant patch, or the downtime a patch requires, can impact dozens of servers instead of just one.


Another challenge is that most IT infrastructures are comprised of multiple pieces of software. Instead of a single, cohesive data center operating system (DCOS), IT must run layers of incompatible infrastructure software components, including networking software, virtualization software, storage software, and data protection software. Patches are applied to these layers when the respective vendor for each layer releases a service pack, which rarely coincides with when the vendors of the other layers release their patches.

Look for a vendor that takes a DCOS approach to infrastructure, which is not only critical to simplifying patching but also simplifies the entire ransomware response effort.

A DCOS should provide two deliverables in terms of patching. First, it should be able to simplify the foundational DCOS patching process by integrating the legacy IT stack into a single software element. Second, it should make the patching of guest operating systems and applications running inside VMs simpler by enabling zero-capacity and zero-performance impact clones so that IT can test the released patch for conflicts with other elements within the data center. If there is a problem with the patch, IT can roll back to the prior version, or if the patch works, roll the patched version into production.

▢ Harden the Operating Environment
An essential but often overlooked step is to harden the infrastructure software as much as possible. If ransomware can infect part of the core infrastructure, like the hypervisor, the storage software, or the data protection software, the impact is widespread and recovery is far more complex.


While most mainstream OSs are not resilient to attack, you should ensure your core infrastructure software, like the hypervisor, storage, and networking software, is hardened. Look for infrastructure software that takes special developmental steps to act like firmware: it loads into RAM and can be replaced easily from an unalterable known-good copy. Again, a DCOS makes these processes easier since only one software component needs to be hardened instead of three or four.

Section Two: Build a Ransomware Resilient Protection Strategy

▢ Increase Protection Frequency and Retention
Protecting data is an obvious inclusion in any attempt at building a ransomware response checklist. Most data centers run into three challenges when creating a ransomware-resilient data protection strategy:

  1. Protection events occur too infrequently to be meaningful.
  2. Protected copies aren’t retained long enough to outlive a prolonged attack.
  3. Too many protection solutions are used, making the process complex and expensive.

A best practice for a successful ransomware response is to make sure you are capturing all data hourly. On paper, snapshots look ideal for this use case, but most solutions experience significant performance problems as the number of snapshots increases, which limits how long those snapshots can be retained.

▢ Consolidate Protection Tools
To get around the limitation of traditional snapshots, most organizations use at least four data protection tools to protect their environment. They may use a combination of hypervisor snapshots, storage system snapshots, replication software, application-level backup utilities (dumps), and enterprise backup software. Using all these applications makes the data protection process more expensive and complex, especially during a ransomware recovery effort. IT may be unsure which part of the process has the best known good copy.

Look for an infrastructure DCOS that enables you to consolidate, preferably down to one, the number of tools used for data protection. In essence, the DCOS will protect itself. It should provide the ability to protect data frequently and retain those protection events indefinitely without suffering performance degradation. It should enable you to restore the entire data center footprint, if need be, including network and storage configurations, with a single click. Lastly, it should enable affordable, high availability so data can be moved off-site and adhere to all aspects of the 3-2-1 rule.


▢ Consider a Snapshot Alternative
Traditional snapshot technology, standard in most storage systems and hypervisors, is ill-suited to meet these requirements. The metadata requirements of maintaining a high-frequency, long-retention snapshot schedule are too great. They impact performance and make deleting old snapshots to free up capacity too time-consuming. Clones are a better option for performance and retention because they are independent copies, but without global inline deduplication, frequent cloning and long retention consume too much storage capacity and degrade performance too much to be practical.

Look for an infrastructure that combines the best benefits of both clones and snapshots by implementing DCOS-wide deduplication. If the deduplication technology is built into the core of the DCOS, then it will eliminate concerns about algorithmic performance overhead and capacity consumption while enabling the cloning of PBs of data in milliseconds.
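A toy model shows why a clone over a globally deduplicated store can be nearly free: the clone copies only the metadata (the list of block hashes), while the block payloads are shared. This sketch is illustrative only; a real implementation would add reference counting, persistence, and garbage collection.

```python
# Minimal model of a deduplicated store where a "clone" is just a copy
# of the pointer list, not of the data. Illustrative sketch, not any
# vendor's filesystem.
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks = {}       # block hash -> block payload (stored once)
        self.volumes = {}      # volume name -> ordered list of block hashes

    def write(self, volume, data_blocks):
        hashes = []
        for blk in data_blocks:
            h = hashlib.sha256(blk).hexdigest()
            self.blocks.setdefault(h, blk)   # duplicate blocks are not re-stored
            hashes.append(h)
        self.volumes[volume] = hashes

    def clone(self, src, dst):
        # O(metadata) operation: no block payloads are copied.
        self.volumes[dst] = list(self.volumes[src])

    def read(self, volume):
        return [self.blocks[h] for h in self.volumes[volume]]
```

Because the clone has its own pointer list, later writes to it never disturb the source, which is the independence property the article attributes to clones, delivered at snapshot-like creation cost.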

Section Three: Build a Ransomware Resilient Detection Strategy


▢ Detect Data Anomalies
Detection is a critical component of building a ransomware-resilient checklist. The sooner the DCOS can alert IT to an attack, the faster IT can stop and remedy the situation. Most ransomware attacks take two vectors after the malware finds its way into the environment. First, they start encrypting files as fast as possible, and second, the malware starts replicating itself to encrypt more files in parallel.

Again, multiple detection tools are problematic. Look for a DCOS that can deliver, in near real time, a single source of alerting based on data change rates. In a globally deduplicated environment, the DCOS can build an alert from an unexpected increase in capacity consumption.
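The intuition behind this kind of alert is that mass encryption writes unique, incompressible data, so consumed capacity in a deduplicated environment jumps sharply. A minimal sketch of such a change-rate detector follows; the window and threshold values are illustrative assumptions, not VergeOS defaults.

```python
# Sketch of capacity-growth anomaly detection: flag any interval whose
# capacity delta exceeds a multiple of the trailing-average growth.

def capacity_alert(samples, window=6, factor=3.0):
    """samples: consumed-capacity readings (e.g. GB) at fixed intervals.
    Returns the sample indexes where growth exceeds `factor` x the
    trailing mean of the previous `window` deltas."""
    alerts = []
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    for i in range(window, len(deltas)):
        baseline = sum(deltas[i - window:i]) / window
        if baseline > 0 and deltas[i] > factor * baseline:
            alerts.append(i + 1)          # index into `samples`
    return alerts
```

A steady 1 GB-per-interval workload produces no alerts, while a sudden 50 GB interval, the signature of bulk encryption defeating deduplication, is flagged immediately.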

▢ Preserve Forensic Data
When ransomware attacks, most IT professionals’ first reaction is to start the recovery response as quickly as possible. The problem with jumping right into recovery is that the process will likely destroy any forensic data available to determine how the attack entered the environment and how it spread. Both data points are crucial to future prevention efforts.

Instead, look for a DCOS that enables quick isolation of the current state. Again using a cloning type of technology powered by global inline deduplication enables these clones to be made in milliseconds without consuming too much capacity. It is also critical that this clone be independent and isolated.

▢ Create Ransomware Honeypots
Another detection strategy is to create honeypots of the environment, with data anonymized, and expose them to attack. These honeypots can alert you to a potential wider threat and provide excellent practice for further hardening your data center. Honeypots typically have a lower false-positive rate than most traditional intrusion-detection systems.

Look for a DCOS that can virtualize entire data centers in the same way that virtual machines virtualize servers. Then the DCOS can easily create honeypot data centers that are securely isolated from the production virtual data centers.

Section Four: Build a Rapid Recovery Strategy

▢ Mount the Recovery, Don’t Copy

When ransomware strikes, rapid recovery is critical. Depending on the severity of the attack, IT may need to recover a few VMs or an entire data center. Copying data from another snapshot or a backup process takes too much time. Again, clone the current state for forensic reasons, then start recovery. The key is to be able to mount, in place, the last known good copy of data. That mount still needs isolation so IT can scan it for any malware trigger files before returning it to production.

Look for a DCOS that can in-place mount a previous VM version or an entire data center. An in-place mount provides instant access to the data so IT can scan it to ensure there are no malware remnants and then provide user access.

How’s Your Checklist?

Building a Ransomware Response Checklist is only effective if you tick all the boxes. If your evaluation is missing a couple of marks, then consider attending VergeIO’s next TechTalk, “Creating a Ransomware Response Strategy,” with our CEO, Yan Ness, and SE Director, Aaron Reid. They will dive deep into the elements of this checklist and show you a live demo of our IOfortify solution for recovering from a ransomware attack.


VergeIO Unveils IOclone To Solve the VM Snapshot Problem

April 25, 2023 by George Crump

Ann Arbor, Mich, April 25th, 2023 — VergeIO, the Ultraconverged Infrastructure (UCI) company, today announced the launch of IOclone, a new solution that solves the virtualization snapshot problem facing users today. VMware and other virtualized environments suffer from highly inefficient snapshots and because of performance concerns, customers can only maintain a few active snapshots. This level of retention is insufficient for adequate data protection. With IOclone, customers can now leverage the built-in global data deduplication capabilities of VergeOS to create complete clones of virtual machines (VM) within milliseconds, regardless of VM size. Each clone is immutable and space efficient, initially consuming no additional capacity.

Hypervisors without the powerful capabilities of IOclone require customers to use expensive array-based snapshots or integrate with backup software solutions, forcing customers into an expensive and complicated multi-step solution for data protection. By comparison, IOclone is a single-step process tightly integrated into VergeOS. Once IT sets up a cloning policy, snapshots happen regularly without administrative intervention.

Now customers can create and maintain thousands of space-efficient copies of virtual machines or even virtual data centers without impacting performance. Both the original production instance and clones perform at the full performance of the infrastructure. Clones are instantly available for use in testing, QA, and development purposes, or customers can create “golden masters” and spawn hundreds or even thousands of VMs or virtual data centers (VDCs) from the original, again without impacting performance.

“Both clones and snapshots typically have some overhead in the capacity they consume and the processing required to use them. Clones typically have to make a copy of all of the metadata information, which means the cloning process takes some time upfront, but then they are ready to use and independent. Snapshots trade up front processing time and instead show performance degradation when in use or during clean-up,” said Greg Campbell, VergeIO founder and CTO. “IOclone delivers the best of both. Because our deduplication is part of the metadata in our filesystem, we get all the performance and independence benefits of cloning without their upfront overhead.”

Customers can execute IOclone on the virtual machine, the volume, or an entire virtual data center (VDC). In the same way that a virtual machine is an encapsulation of a server, a VDC clone is an encapsulation of the entire data center. It includes all the VMs within the data center and all the storage and networking policies, delivering near-instant recovery. 

IOclone does not require specialized storage controllers or storage data processing units. Thanks to the efficiency of VergeOS, it works with off-the-shelf servers using commodity flash and hard disk drives within the VergeOS environment. It is integrated into VergeOS and is available now at no additional charge to VergeOS customers. Customers looking to migrate off VMware can leverage VergeIO’s IOprotect and benefit from the immutable limitless protection of IOclone.

To learn more about IOclone, join VergeIO for the TechTalk “A Deep Dive into Virtual Infrastructure File Systems,” live on May 4th at 1:00 PM ET / 10:00 AM PT.

About VergeIO

VergeIO is the Ultraconverged Infrastructure (UCI) company. Unlike hyperconverged infrastructure (HCI), it rotates the traditional IT stack (compute, storage, and networking) into an integrated data center operating system, VergeOS. Its efficiency enables greater workload density on the same hardware with high levels of data resiliency. The result is dramatically lower costs and greatly simplified IT.


Snapshots or Clones for Data Protection?

April 25, 2023 by George Crump


Most storage solutions provide IT professionals with either snapshots or clones for data protection, but are the differences between the two significant enough to make them part of your selection criteria? Like most things in IT, the answer depends. In this case, it depends on whether and how your vendor implemented the two technologies.

Register now to join us live on May 4th for a technical deep dive into virtual infrastructure file systems and to see a live demonstration of IOprotect.

What Are Snapshots?

Deciding whether snapshots or clones are best for data protection first requires understanding how the two technologies work. First, let’s look at snapshots. Most storage solutions, whether filesystem or block storage, have a metadata layer that points to where each data segment resides. A snapshot copies those pointers at a specific time and then marks the segments they reference as read-only until the snapshot expires.

Snapshot Update Methods

There are two methods for updating a segment that a snapshot has made read-only. The first is copy-on-write. When a user or application attempts to update an existing segment, the storage solution copies the old segment to a new location and lets the new data occupy the original segment. The storage solution then updates the snapshot metadata with the old segment’s new location.

The second snapshot update method is “redirect on write”. Using this method, the storage system will write the modified data to a new location and update the metadata of the “production view” of the data. It does not need to update the “snapshot view” of the data. 
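A toy model makes the difference concrete. Both views are pointer tables mapping logical segment IDs to physical segments: copy-on-write performs two data writes plus a snapshot-metadata update, while redirect-on-write performs one data write plus a production-metadata update. This is illustrative Python only, not any vendor’s implementation.

```python
# `segments` maps physical segment id -> data; `live` and `snap` are
# the production and snapshot pointer tables (logical id -> physical id).

def copy_on_write(segments, live, snap, seg_id, new_data, free_id):
    # (`live` is unchanged here; kept in the signature for symmetry.)
    segments[free_id] = segments[seg_id]   # write #1: preserve old data
    snap[seg_id] = free_id                 # metadata update: snapshot view
    segments[seg_id] = new_data            # write #2: new data in place

def redirect_on_write(segments, live, snap, seg_id, new_data, free_id):
    segments[free_id] = new_data           # single write: new location
    live[seg_id] = free_id                 # metadata update: production view
                                           # snapshot view is untouched
```

Either way, every update costs extra writes or metadata churn, which is the scalability limit the next paragraph describes.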

Both of these methods limit the scalability of snapshots because multiple writes and multiple metadata changes must occur. Also, many storage systems use separate metadata trees to manage each snapshot. As the number of snapshots and their depth (snapshots of snapshots) increase, the complexity of managing and updating the metadata wears on system performance. As a result, whether the snapshot occurs within the hypervisor, on the same hardware as the hypervisor (software-defined storage running as a virtual machine), or on dedicated storage hardware, there are limits to how many snapshots the storage solution can maintain.

The complexity shows itself as degraded overall system performance. Storage systems with legacy snapshot technology suffer from:

  • Limits on the number of copies that can be retained
  • A need for high-end processors in the storage servers
  • A need for dedicated data processing units (DPUs)
  • Days-long deletion of old snapshots

What Are Clones?

Clones are copies of existing segments. They are standalone, and updating a clone does not require the same metadata overhead as updating a snapshot. This independence means clones don’t suffer the performance degradation snapshots do, regardless of how many there are or how long they are retained. Clones need neither of the sophisticated update methods that snapshots require.

The Downside to Clones

The typical downside to clones is that they are either complete copies of the original volume or deduplicated copies. A full copy means that data must traverse the internals of the storage infrastructure, travel across the network to the hypervisor, and back down the network again to the storage system.

Some hypervisors have introduced capabilities to eliminate traversing the network, saving time. Still, most cloning functions must process data through the internals of the storage solution twice, even if that solution has a deduplication feature. With deduplication, the resulting clone may not consume any additional capacity, but the time to create the copy is still significant, especially if the volume is of any measurable size, and it is best not to use the application while the storage solution clones its volume. As a result, most organizations don’t use cloning as part of their data protection strategy.

IOclone — The Best of Clones and Snapshots

As we’ve discussed, clones and snapshots typically have some overhead in the capacity they consume and the processing required to use them. Clones typically have to make a copy of all of the metadata information, which means the cloning process takes some time upfront, but then they are ready to use and independent. Snapshots don’t have the upfront processing time and, as a result, are ready for use almost instantly. However, they show performance degradation as the number of snapshots increases when used or during snapshot clean-up routines. 

IOclone is a capability of the VergeOS operating system that combines the best of clones and snapshots into a single solution. Since global inline deduplication is part of the metadata in VergeOS, IOclone copies are created as quickly as snapshots: regardless of capacity, VergeOS can create clones of VMs, volumes, or entire virtual data centers in milliseconds. At the same time, IOclone-created copies have the standalone performance of independent clones without initially consuming additional capacity.

With IOclone, IT doesn’t have to choose between snapshots or clones for data protection. This capability within VergeOS can retain hundreds, even thousands of copies of VMs, volumes, or entire Virtual Data Centers (VDC) without negatively impacting performance or capacity consumption.

Learn More

  • Register for our live TechTalk: Deep Dive on Virtual Infrastructure File Systems
  • Subscribe to our eBook: “Designing a Resilient Infrastructure“
  • Review our IOclone Datasheet


Conclusion

IOclone is also part of our IOprotect solution, which enables you to start a VMware Exit by first using VergeOS as a disaster recovery solution. Most customers find IOprotect reduces the cost of disaster recovery by more than 50% without adding additional hardware. It provides a complete recovery environment, converging disaster recovery so that data, applications, and the processing power to recover are all available from a small cluster of nodes. 

As your confidence in VergeOS grows, you can use it for your production environment. The tightly integrated VergeOS architecture delivers more efficient performance, increasing workload density on less physical hardware. Once your conversion is complete, you’ll lower costs by as much as 80% and enjoy an actively developed data center operating system with unparalleled support.



Understanding VMware DR Components

April 11, 2023 by George Crump

Understanding VMware DR components allows IT professionals to dramatically reduce spending without compromising recoverability. There are four main components to a VMware disaster recovery (DR) strategy:

  1. Storage
  2. Compute
  3. Network
  4. Replication Software

The products you select for each of these components determine how much that component will cost and have a ripple effect on the other components in terms of cost and choices. The sum of these parts determines the complexity of your DR strategy and the likelihood of a successful recovery.

To learn more about VMware DR, join us for tomorrow’s Whiteboard Wednesday session, “VMware Disaster and Ransomware Recovery—The Three NEW Best Practices,” at 1:00 PM ET / 10:00 AM PT.

Understanding VMware DR Storage

Understanding VMware DR components requires knowing what type of storage will be in place at the DR site. It represents one of the best opportunities to reduce DR costs. To copy data to the remote DR site, customers often use array-based replication, which typically requires another storage system from the same vendor at the DR site. Customers are forced to pay a premium for a rarely used storage system. Furthermore, since most storage vendors have given up on auto-tiering, the customer cannot use lower-cost hard disk drives at the DR site and then move the workloads to flash storage when a disaster occurs.

Reducing the cost of DR storage requires two capabilities. First, the ability to replicate directly from the VMware environment instead of using the array. Second, support for multiple types of media. Replicating directly from the VMware environment provides much tighter integration with VMware, enabling a complete copy of data at the DR site, and it allows replicating to a commodity server with drives installed instead of a dedicated storage array. Supporting multiple types of storage media, such as flash drives and hard disk drives, lets IT take advantage of the fact that hard drive capacity is 8X less expensive than the equivalent flash capacity. The storage system must also be able to quickly move the most performance-dependent workloads to a flash tier during disaster recovery testing or an actual disaster.
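A back-of-the-envelope calculation, using the article’s roughly 8X flash-versus-HDD price gap, shows why tiering matters at the DR site. The per-GB price and the 10% hot fraction below are illustrative assumptions, not quotes.

```python
# Compare an all-flash DR site against a tiered one where only the
# "hot" (performance-dependent) fraction lives on flash. Prices are
# illustrative; only the 8x ratio comes from the article.

def dr_storage_cost(total_gb, hot_fraction, hdd_per_gb=0.02):
    flash_per_gb = hdd_per_gb * 8          # article's ~8x flash premium
    all_flash = total_gb * flash_per_gb
    tiered = (total_gb * hot_fraction * flash_per_gb
              + total_gb * (1 - hot_fraction) * hdd_per_gb)
    return all_flash, tiered

# 100 TB of DR capacity with 10% of workloads needing flash performance:
all_flash, tiered = dr_storage_cost(100_000, 0.10)
```

With these assumptions, the tiered configuration costs a small fraction of the all-flash one, which is the savings the auto-tiering argument above is pointing at.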

Understanding VMware DR Compute

Understanding VMware DR components requires knowing the compute requirements at the DR site during a disaster. IT must ensure the DR site can support operations during a disaster. IT no longer has the luxury of ordering hardware on demand because supply chain issues continue to plague the industry. Your DR plan can’t be held up because servers are on backorder for three weeks or more. As a result, the server performance at the DR site must match the server performance at production, at least for the workloads that will be recovered at the DR site.

Reducing the cost of DR compute requires running more virtual machines on less hardware without sacrificing performance. VMware is weighed down by its many add-ons and the lack of integration between them. IT needs to eliminate as much of this virtualization tax as possible by using a more efficient hypervisor at the DR site. An alternative hypervisor that runs the same workloads on half the servers cuts server costs at the DR site in half.
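The server-count arithmetic behind that claim can be sketched as follows. The VM count and per-server densities are made-up assumptions for illustration; real densities depend on the workloads:

```python
import math

# Assumed figures: 400 VMs to protect, and two hypothetical VM densities.
def dr_servers_needed(total_vms: int, vms_per_server: int) -> int:
    """Servers required at the DR site for a given consolidation ratio."""
    return math.ceil(total_vms / vms_per_server)

baseline = dr_servers_needed(400, 25)  # production-like density
denser = dr_servers_needed(400, 50)    # a hypervisor that doubles density
print(baseline, denser)
```

Doubling VM density halves the server count, and since DR-site servers must match production performance for the recovered workloads, that translates directly into hardware savings.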

Understanding VMware DR Networking

Buying a second set of network hardware for the DR site has the same problem as buying a second storage system: it is expensive. An alternative is to use “dumb switches” with software-defined networking (SDN) capabilities. The issue is that SDN software is often so expensive that its cost all but eliminates the savings of buying “dumb switches.” This is especially true with VMware’s NSX, which can add $10,000 or more to the cost of each node in the DR site. Lastly, SDN creates another layer to manage, similar to managing a separate physical network layer. Understanding VMware DR components includes knowing the operational implications of each component selected.

What about Replication Software?

As stated above, many VMware DR strategies depend on array-based replication. While it is sometimes included “free” with the array, it carries the added cost of a second storage system from the same vendor. In most cases, array-based replication is “blind” to the fact that VMware is running on top of it and may not capture all the configuration data. It certainly will not capture the networking configuration.

Customers may also use a dedicated replication solution that integrates with VMware. While these solutions capture the VMware environment well, they are costly and don’t help reduce DR storage or network costs.

A Holistic Approach to VMware DR

The fact that there are four components to a VMware DR strategy is itself the problem. IT must purchase each component and manually stitch them together. Coordinating all the components, and ensuring all the data and configurations are captured, is critical to the strategy’s success.

VergeIO’s IOprotect simplifies and reduces VMware DR costs. It makes understanding VMware DR components easy because it reduces the “components” to one. IOprotect is part of VergeOS, an ultraconverged infrastructure (UCI) that integrates networking, compute, storage, and data protection into a single operating environment. It is one piece of software, not four or five.


With IOprotect, you can replicate your existing three-tier or hyperconverged infrastructure (HCI) to a single VergeOS environment. It seamlessly connects to your VMware environment and captures all the information you need for a successful disaster recovery strategy. You can also consolidate all your DR compute, storage, and networking requirements into as few as two servers plus a few “dumb switches” at your DR site. If you need more capacity or compute resources, add more nodes, but you won’t need many: our customers consistently find they can run more workloads on less hardware because VergeOS is more efficient than VMware. They also need less storage capacity thanks to our high-performance global inline deduplication.

Testing your DR strategy is easy with VergeOS. Our Virtual Data Center (VDC) technology allows you to create a space-efficient, isolated clone of your replicated site. You can test and practice your DR skills while protecting your production VMware environment.

A DR Strategy with a Production Future


IOprotect is just the beginning. Using IOprotect for VMware DR lets you extensively test all of VergeOS’s capabilities while your VMware environment is still under license, and you will likely reduce your VMware expenses by more than 60% during that time. Then, when it is time to renew VMware in production and face its new, more expensive pricing policies, you have an exit strategy that is tested and ready for deployment. At that point your cost savings increase even more, as does your operational simplicity.

Filed Under: Blog, Virtualization Tagged With: dataprotection, Disaster Recovery, DR, VMware

The HCI Disaster Recovery Problem

January 31, 2023 by George Crump Leave a Comment

While hyperconverged infrastructure (HCI) catches the attention of many IT professionals, the HCI Disaster Recovery problem, while seldom talked about, could be its greatest weakness. Proper HCI protection and disaster recovery typically require a separate infrastructure with its own software and hardware. This requirement complicates a critical process, creating a high risk of failure while dramatically increasing costs.

What is the HCI Disaster Recovery Problem?

Part of the HCI Disaster Recovery problem is that most data protection solutions have to protect HCI architectures as if they were traditional three-tier architectures. They back up through the hypervisor to a separate storage system. That separate storage system is often scale-out in nature, so you have nodes backing up nodes.

Disaster recovery requires the same HCI configuration in the remote site as in the primary site. Also, the deduplication capabilities most HCI vendors provide are bolt-ons, delivered years after the HCI software first came to market. As a result, they can’t deduplicate across HCI clusters. If the organization has multiple HCI clusters in one or more locations, it must transmit all the data to the disaster recovery site.

The HCI Disaster Recovery Problem Triples Inefficiency

HCI is incredibly inefficient. The inefficiency is the result of forcing customers to expand with like nodes. If all you need is more processing power, you can’t easily add more advanced CPUs or GPUs to the existing cluster. Even if you use the same processor type, you can’t buy nodes that are primarily processors; you must buy additional storage to match the other nodes in the cluster.

Because conventional wisdom says to back up to a scale-out storage system, backing up an HCI architecture doubles the inefficiency of the infrastructure. Scale-out backup storage suffers from the same inefficiency as scale-out HCI, except in reverse: you are dragging along, and paying for, more processing power than you probably need just to get capacity.

Protecting an HCI architecture from disaster triples its inefficiency. Forcing identical nodes in the disaster recovery site means the HCI solution duplicates the same inefficiency at the disaster recovery site as in the primary location. And if you replicate the backup infrastructure in addition to the HCI infrastructure because you don’t trust HCI replication, you are quadrupling your data protection and disaster recovery costs.
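The doubling, tripling, and quadrupling in this argument can be tallied up in a quick sketch. The base cluster cost is an assumed round number, not a real quote:

```python
# Illustrative multipliers from the argument above; $100,000 is an
# assumed base cluster cost, not real pricing.
base_cluster_cost = 100_000

layers = {
    "primary HCI cluster": 1,
    "+ scale-out backup storage": 2,
    "+ identical DR-site cluster": 3,
    "+ replicated backup infrastructure": 4,
}

for name, multiplier in layers.items():
    print(f"{name}: ${base_cluster_cost * multiplier:,}")
```

Each layer repeats the like-for-like node inefficiency of the one before it, which is why the cumulative spend scales linearly with every copy of the stack you add.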

The HCI Ransomware Recovery Problem

Ransomware is another form of disaster, unique in that the data center is still operational but users and applications are not. HCI also has a ransomware recovery problem: HCI solutions do not harden their software. Since most are really software-defined storage (SDS) solutions that claim to be HCI, they run as a virtual machine (VM) within a hypervisor like VMware or Hyper-V, leaving them at the mercy of that hypervisor’s ransomware hardening.

Running storage as a VM cripples a vital line of ransomware defense: snapshots. Recovering quickly from a ransomware attack requires frequent, immutable snapshots, and given the latest ransomware attack profiles, IT must retain these snapshots for months. Storage running as a VM suffers from the same virtualization tax as other VMs, so it can keep only a few snapshots before needing to delete them for performance reasons.
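A quick calculation shows how many snapshots "frequent and retained for months" actually implies. The interval and retention window here are assumed values, not a recommendation:

```python
# Assumed schedule: hourly snapshots kept for 90 days.
def snapshots_retained(interval_hours: float, retention_days: int) -> int:
    """Number of snapshots held at steady state for a given schedule."""
    return int(retention_days * 24 / interval_hours)

hourly_90d = snapshots_retained(1, 90)
print(hourly_90d)
```

That is over two thousand concurrent snapshots, which is far beyond the handful that storage running as a performance-taxed VM can typically sustain.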

Solving the HCI Disaster Recovery Problem

Solving the HCI disaster recovery problem requires rethinking HCI. First, the IT stack (compute, storage, networking) needs to be integrated, not layered. At VergeIO, we call this rotating the stack: removing the layers creates a cohesive data center operating system (DCOS), VergeOS. It is a single piece of software, not dozens. We call this approach Ultraconverged Infrastructure (UCI). Next week we’ll host a live webinar that compares HCI to UCI. Register here.


While we support external backup applications, VergeOS includes built-in data protection and replication capabilities. They, like everything else, are integrated into the core code, so they operate with minimal overhead. You can execute immutable snapshots frequently and retain those snapshots indefinitely without impacting performance.

VergeOS also supports different node types, so the disaster recovery site can use different hardware than the primary. Also, VergeOS supports global, inline deduplication so that if you are replicating from multiple sites to a central disaster recovery location, it only replicates the unique data from each site. With VergeOS, transfers are fast, and disaster recovery storage costs are negligible.
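The mechanics of replicating only unique data can be sketched with content-addressed blocks. This is a simplified illustration of global inline deduplication in general, not VergeOS internals; the sites, block contents, and index structure are all made up:

```python
import hashlib

def block_id(data: bytes) -> str:
    """Fingerprint a block by its content (stand-in for a real block hash)."""
    return hashlib.sha256(data).hexdigest()

dr_site_index: set[str] = set()  # fingerprints already stored at the DR site

def replicate(blocks: list[bytes]) -> list[bytes]:
    """Send only blocks the DR site does not already hold."""
    unique = [b for b in blocks if block_id(b) not in dr_site_index]
    dr_site_index.update(block_id(b) for b in unique)
    return unique

site_a = [b"os-image", b"app-data-a"]
site_b = [b"os-image", b"app-data-b"]  # shares the OS image with site A

sent_a = replicate(site_a)  # first site sends everything
sent_b = replicate(site_b)  # second site sends only its unique block
```

Because the second site's OS image already exists at the DR location, only its unique application data crosses the wire, which is why multi-site replication traffic and DR storage consumption shrink with global deduplication.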

The HCI Disaster Recovery Problem Creates Compromise

Because of cost and complexity, many organizations compromise when establishing their disaster recovery site. The enforcement of like hardware doubles server acquisition costs, and the lack of efficient data storage can triple or more storage costs.

The most common compromise is using the backup infrastructure as the disaster recovery solution. Backup software can replicate and even deduplicate data, but when it stores that data at the remote site, the data is in the backup software’s format. It isn’t operational. If there is a disaster, the organization must wait, potentially hours or even days, for restore jobs to complete before allowing access.

Using backup as the disaster recovery solution also makes testing and practicing the recovery process much more complicated and time-consuming. The result is less frequent testing and no practice. The reason most disaster recoveries fail is a lack of testing and experience.

Eliminating Disaster Recovery Compromise

VergeOS provides no-compromise disaster recovery. The costs at the disaster recovery site are easily controlled thanks to node flexibility and data deduplication. The data at the DR site is live and ready to instantiate at a moment’s notice.

Networking is also a source of disaster recovery failures. Misconfigurations, improper remapping, and incompatible hardware between locations can cause many problems. VergeOS integrates software-defined networking and alleviates these problems, ensuring that newly recovered data centers are easily accessible by users and applications.

Testing, thanks to our snapshot functionality, is also easy. Thanks to our Virtual Data Center (VDC) technology, a snapshot of an entire data center can be made in seconds. That snapshot can then be mounted for recovery testing purposes. Deduplication ensures that the only growth in capacity is changes made to the disaster recovery dataset while the test is executing.

Data protection and disaster recovery have been problematic since the dawn of the data center. Continuing to try the same old thing (replace backup software, replace backup storage, find a better replication solution, pray the network works) isn’t the answer. With VergeOS, we start at the source of the problem: the production infrastructure itself.

Learn More:

  1. Register for our live webinar, “Beyond HCI — The Next Step in Data Center Infrastructure Evolution.” During the webinar, VergeIO’s Principal Systems Engineer, Aaron Reed, and I will compare HCI and UCI in depth. I’m even going to talk Aaron into giving a live demonstration of VergeOS in action.
  2. Subscribe to our Digital Learning Guide, “Does HCI Really Deliver?”
  3. Sign up for a Test Drive – try it yourself and run our software in your labs.

Filed Under: Blog, HCI Tagged With: dataprotection, HCI, Hyperconverged, snapshots
