Five Capabilities Your Cloud Provider Cannot Give You

A company was spending more than $30,000 a month on AWS. Within eight months of moving to bare-metal Kubernetes, the hardware had paid for itself. This is what that infrastructure actually delivers, and why no cloud vendor can match it.

By Catalin Lichi · Sugau Infrastructure

Most infrastructure decisions are made by comparing the wrong things. Cloud against on-premises. Managed services against operational overhead. Monthly invoices against capital expenditure. These comparisons are legitimate, but they miss the question that matters most: what can you actually do with the infrastructure you own versus the infrastructure you rent?

When you build Kubernetes on bare metal using ZFS and KVM as the foundation, five capabilities emerge that no hyperscaler will offer you in the same package, under the same roof, at any price. They are not features you unlock with the right tier. They are architectural consequences of owning the stack.

1. Provisioning that does not depend on someone else's capacity

When you need a new Kubernetes node on a cloud platform, you submit a request. That request enters a queue, capacity is allocated from a shared pool, an image is hydrated across a network, and a virtual machine eventually joins your cluster. On a good day this takes minutes. During regional capacity events — exactly the conditions when you most need new nodes — it takes longer.

On bare-metal infrastructure using ZFS, a new Kubernetes node is a clone of a golden image. The clone operation is nearly instantaneous because a ZFS clone is copy-on-write and copies no data up front. KVM presents the clone as a virtual machine, the node joins the cluster, and workloads are being scheduled onto it within minutes, with no dependency on external capacity, no queue, and no managed API that can throttle or fail.
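The clone step can be sketched roughly as follows. This is a minimal illustration, not Sugau's actual tooling: the dataset names (`tank/golden/k8s-node`, `tank/nodes`) and the `clone_commands` helper are assumptions made for the example. The function only builds the `zfs snapshot` and `zfs clone` command lines; running them requires appropriate privileges on a ZFS host.

```python
from typing import List


def clone_commands(golden: str, version: str, node: str,
                   parent: str = "tank/nodes") -> List[List[str]]:
    """Build the zfs commands that create a node disk from a golden image.

    All dataset names are hypothetical; adjust them to your own pool layout.
    """
    snapshot = f"{golden}@{version}"
    target = f"{parent}/{node}"
    return [
        # Freeze the golden image at a named point in time.
        ["zfs", "snapshot", snapshot],
        # Create a writable copy-on-write clone; no blocks are copied up front.
        ["zfs", "clone", snapshot, target],
    ]


if __name__ == "__main__":
    for cmd in clone_commands("tank/golden/k8s-node", "v1", "worker-07"):
        print(" ".join(cmd))
```

In practice the snapshot is usually taken once per golden-image release and then cloned many times, so only the second command runs per node.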

The same mechanism handles hardware replacement. A node migrates from a failed physical host to a healthy one via a simple data transfer pipeline. No rehydration. No waiting. The infrastructure recovers at the speed of your network.
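One plausible shape for such a pipeline, assuming it is built on `zfs send` and `zfs receive` over SSH (the article does not name the exact commands), is sketched below. The dataset, snapshot, and host names are hypothetical, and the helper only returns the shell pipeline as a string.

```python
import shlex


def migration_pipeline(dataset: str, snapshot: str, target_host: str) -> str:
    """Return a shell pipeline that streams a node's dataset to another host.

    Assumes SSH access to target_host and ZFS on both ends.
    """
    source = f"{dataset}@{snapshot}"
    # Serialise the snapshot into a byte stream on the sending side.
    send = ["zfs", "send", source]
    # Recreate the dataset from that stream on the receiving host.
    receive = ["ssh", target_host, "zfs", "receive", dataset]
    return " | ".join(shlex.join(part) for part in (send, receive))


if __name__ == "__main__":
    print(migration_pipeline("tank/nodes/worker-07", "migrate", "host-b"))
```

Because the stream is just bytes on a pipe, throughput is bounded by disk and network speed rather than by any provider-side API.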

The business consequence: disaster recovery and hardware replacement become routine operational procedures rather than high-stakes events with unpredictable timelines.

2. Virtualisation overhead you can measure, tune, and own

Virtualisation has a performance cost. The honest number for a KVM-hosted virtual machine on modern hardware — with properly configured drivers, CPU pinning, and memory allocation tuned to your workload — is three to five percent compared to running directly on bare metal. For the vast majority of enterprise workloads, this delta is operationally invisible.
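As an illustration of what CPU pinning looks like in practice, the sketch below renders the `<cputune>` fragment of a libvirt domain definition, mapping each virtual CPU to one dedicated host core. The helper name and the choice of cores are assumptions for the example; which cores to pin depends on your host's topology.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def cputune_xml(host_cpus: Sequence[int]) -> str:
    """Render a libvirt <cputune> element pinning vCPU i to host_cpus[i]."""
    cputune = ET.Element("cputune")
    for vcpu, host_cpu in enumerate(host_cpus):
        # One <vcpupin> per virtual CPU, each bound to a single host core.
        ET.SubElement(cputune, "vcpupin",
                      vcpu=str(vcpu), cpuset=str(host_cpu))
    return ET.tostring(cputune, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical layout: a 4-vCPU node pinned to host cores 4 through 7.
    print(cputune_xml([4, 5, 6, 7]))
```

The resulting element sits inside the `<domain>` definition that libvirt loads for the VM.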

What this buys you is the full benefit of a virtualisation layer: live migration between physical hosts, snapshot-backed provisioning, hardware fault isolation, and the ability to run multiple node types on shared physical servers — at a cost that is essentially rounding error against your compute budget.

Cloud providers also run your workloads in virtualised environments. The difference is that you have no visibility into their hypervisor, no control over how resources are allocated, and no ability to tune for your workload profile. You pay the performance cost either way. On bare metal, you know exactly what it is and you can optimise it.

The business consequence: predictable, tunable performance with none of the noisy-neighbour effects that degrade shared cloud infrastructure.

3. Automation you own, built on APIs that do not change

Every cloud provider wraps operational procedures — VM provisioning, backup, snapshot management, node lifecycle — in managed services with proprietary APIs. When those APIs change, your automation breaks. When a service is deprecated, your operational model breaks with it. This coupling accumulates quietly over years and becomes an invisible cost that only surfaces when something goes wrong or when you try to leave.

The Sugau stack is built on libvirt and the KVM hypervisor. Both expose stable, open APIs that have not changed meaningfully in over a decade. Ansible orchestrates the complete VM lifecycle: provisioning from golden images, configuration management, snapshot scheduling, backup verification, and restoration testing. Every operation is expressed in version-controlled playbooks that you own, can audit, and can run on any Linux host with KVM support — today, in five years, and regardless of what any vendor decides.
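To give a feel for how small the libvirt surface is, here is a hedged sketch of an idempotent "ensure this VM is running" step, the kind of operation a playbook task would wrap. It is written against the libvirt Python bindings' connection object (`listAllDomains`, `defineXML`, `create`, `isActive`); the function name and return values are assumptions for the example, not part of the Sugau playbooks.

```python
def ensure_running(conn, name: str, domain_xml: str) -> str:
    """Define the domain if it is missing and start it if it is stopped.

    `conn` is expected to behave like a libvirt.virConnect object.
    Returns "created", "started", or "unchanged" so callers can report drift.
    """
    existing = {dom.name(): dom for dom in conn.listAllDomains()}
    dom = existing.get(name)
    if dom is None:
        dom = conn.defineXML(domain_xml)  # persist the VM definition
        dom.create()                      # boot it
        return "created"
    if not dom.isActive():
        dom.create()                      # defined but powered off
        return "started"
    return "unchanged"
```

With the libvirt Python bindings installed, a connection would typically come from `libvirt.open("qemu:///system")`. Running the function twice in a row yields "unchanged" the second time, which is the property that makes such steps safe to run from CI.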

When a node needs to be restored after a ransomware event or operator error, the recovery path is the same automation that runs in CI. There is no support ticket, no four-hour SLA, and no data egress charge. Recovery is as fast as your storage can move bytes.

The business consequence: operational procedures that are stable, auditable, and independent of any vendor's product roadmap.

4. Continuous data protection built into the storage layer

Backup is one of the most over-productised problems in enterprise infrastructure. Dozens of commercial platforms, cloud-native snapshot services, and third-party data protection vendors each add licensing cost, operational complexity, and another dependency surface to manage. Most of them are, at their core, reimplementations of functionality that ZFS has provided for twenty years.

ZFS snapshots are atomic, instantaneous, and space-efficient. A snapshot of a Kubernetes worker node (its persistent volumes, its OS disk, its application state) takes milliseconds and consumes storage proportional only to what changes after the snapshot is created. Snapshots can be scheduled at any frequency and replicated off-site continuously. Restoration to any retained snapshot requires no unmounting of the live dataset and no interaction with a managed service.
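A scheduler built on these primitives needs little more than a naming scheme, a retention rule, and an incremental `zfs send -i` between consecutive snapshots. The sketch below shows those three pieces as pure functions; the `auto-` naming convention, the helper names, and the retention window are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def snapshot_name(dataset: str, when: datetime) -> str:
    """Name a snapshot by its timestamp so names sort chronologically."""
    return f"{dataset}@auto-{when:%Y%m%dT%H%M%S}"


def incremental_send(previous: str, current: str) -> List[str]:
    """Build a zfs send that ships only blocks changed since `previous`."""
    return ["zfs", "send", "-i", previous, current]


def expired(snapshots: List[Tuple[str, datetime]],
            now: datetime, keep: timedelta) -> List[str]:
    """Return the names of snapshots older than the retention window."""
    return [name for name, taken in snapshots if now - taken > keep]
```

The output of `incremental_send` would be piped to `zfs receive` on the off-site host, and each name returned by `expired` would be passed to `zfs destroy`.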

On a Sugau deployment, this is not a backup product added to the stack. It is the storage subsystem doing what a well-designed storage subsystem should do. The result is continuous data protection across every node and every persistent volume in the cluster, with off-site replication included in the base architecture — not sold as an additional tier.

The business consequence: granular, continuous recovery capability at no additional licensing cost, with no vendor dependency in the recovery path.

5. Encryption where the keys never touch the hardware

Hardware theft is consistently underweighted in enterprise risk models. Physical security is treated as a separate problem — the domain of locks, cameras, and data centre access policies. This thinking holds until a drive leaves a colocation facility undetected, or a server is seized by a jurisdiction you did not plan for. At that point, what matters is whether the data on that hardware is readable without your cooperation.

ZFS provides native dataset-level encryption with key management that is completely decoupled from the data it protects. Combined with block-level encryption at the host where needed, a Sugau deployment can be configured so that no decryption key is ever stored on the physical hardware. Keys are held in a remote escrow (an HSM, a dedicated key management system, or an air-gapped system under your direct control), fetched over an authenticated channel at boot, and held only in memory while the system runs. If the hardware is removed and powered on anywhere outside your environment, every dataset it contains is opaque. There is nothing to read, nothing to recover, and nothing to report to a regulator.
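The boot-time unlock can be sketched as below. This is an assumption-laden illustration rather than the Sugau implementation: `fetch_key` is a hypothetical placeholder for whatever client talks to your HSM or key management system, and the sketch assumes the dataset's key format accepts the key on standard input when `zfs load-key` is invoked with `-L prompt`. The key travels through a pipe and is never written to local disk.

```python
import subprocess
from typing import Callable


def unlock_dataset(dataset: str,
                   fetch_key: Callable[[str], bytes],
                   run: Callable[..., object] = subprocess.run) -> None:
    """Fetch a key from remote escrow and hand it to ZFS over stdin.

    fetch_key is a hypothetical callable that returns the key bytes for
    the given dataset; `run` is injectable so the flow can be tested
    without a ZFS host.
    """
    key = fetch_key(dataset)
    # -L prompt overrides the stored keylocation so the key is read
    # from standard input instead of a file on the host.
    run(["zfs", "load-key", "-L", "prompt", dataset], input=key, check=True)
```

If the escrow refuses the request, for example because the machine is calling from an unexpected network, `fetch_key` fails and the dataset simply stays locked.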

This is not a capability cloud providers can offer, because in the cloud model the encryption keys ultimately reside in infrastructure the vendor controls. The question of who holds the keys is a question of who holds the data. In the Sugau model the answer is unambiguous: you do, with the keys geographically, jurisdictionally, and operationally separated from the hardware that stores the data.

The business consequence: hardware seizure or theft becomes a logistics problem, not a data breach.

What these five capabilities have in common

None of this is speculative. ZFS and KVM have been production-grade for over a decade. The approach is the deliberate composition of mature, stable technology into an architecture where provisioning speed, performance predictability, operational independence, data protection, and physical security are properties of the base layer — not features purchased separately from a catalogue.

Cloud providers will offer managed equivalents for some of these capabilities, individually, with usage pricing, API rate limits, and product roadmaps outside your control. What they cannot offer is ownership: the certainty that your infrastructure behaves the same way at two in the morning when a node fails, when a jurisdiction changes its data residency requirements, or when a vendor decides to deprecate a service.

A gaming company discovered this when they replaced their entire AWS footprint with bare-metal Kubernetes. The €28,000 monthly invoice became hardware that paid for itself in under eight months. What they gained was not just cost reduction — it was an infrastructure model where they understood every layer, owned every component, and controlled every key.

That is what Sugau builds.

Sugau Infrastructure specialises in bare-metal Kubernetes deployments, cloud repatriation, and sovereign AI infrastructure for enterprises that require operational ownership of their compute stack.

sugau.com