What It Does
Cloud Hypervisor is an open-source Virtual Machine Monitor (VMM) written in Rust, targeting modern cloud workloads. It is maintained primarily by Microsoft and Intel and is built on rust-vmm, the shared library of Rust virtualization components that also underlies AWS Firecracker. Where Firecracker was designed to be as minimal as possible (serverless, ephemeral functions), Cloud Hypervisor occupies a middle ground: it offers more features than Firecracker (CPU and memory hotplug, vhost-user device offload, vDPA, NVMe support, Windows guest support) while keeping a security-focused, auditable Rust codebase that is significantly smaller than QEMU’s.
Cloud Hypervisor is used as the VMM backend for Kata Containers in many production deployments, and has been selected by projects like Arrakis as the sandboxing foundation for AI agent workloads. Fly.io uses Firecracker for most VMs but Cloud Hypervisor for GPU instances. The project has approximately 106K lines of Rust and is Apache 2.0 licensed.
Key Features
- CPU and memory hotplug: Add or remove vCPUs and memory on a running VM without a restart; Firecracker does not support this
- vhost-user device offload: Delegate I/O devices to separate processes for fault isolation and performance; enables userspace backends such as DPDK (SR-IOV devices are instead passed through with VFIO)
- Snapshot and restore: Full VM state (memory + CPU) can be serialized to disk and later restored to the same point; foundation for agent backtracking use cases
- Windows guest support: Run Windows Server VMs in addition to Linux; Firecracker is Linux-only
- vDPA (virtio Data Path Acceleration): Hardware acceleration for network and storage I/O
- NUMA topology exposure: Expose NUMA nodes to guest for memory-locality-aware workloads
- Rust VMM security model: A memory-safe implementation removes most of the memory-corruption bug classes (buffer overflows, use-after-free) that have historically produced CVEs in C-based VMMs such as QEMU
- Kata Containers integration: Ships as a supported VMM backend in Kata Containers, providing hardware-level isolation for container workloads
- ~106K lines of Rust: Larger than Firecracker (~83K) but roughly an order of magnitude smaller than QEMU (~1.5M lines of C, about 14x)
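The hotplug feature above is driven through the VMM's HTTP API, which listens on a Unix socket. The following is a minimal sketch of what a resize request could look like, not a definitive client. The endpoint path `/api/v1/vm.resize` and the field names `desired_vcpus` and `desired_ram` reflect the Cloud Hypervisor REST API as best recalled here and should be checked against the API definition for your release; the helper names and the socket path are hypothetical.

```python
import json
import socket


def build_resize_request(desired_vcpus=None, desired_ram_bytes=None):
    """Build raw HTTP/1.1 bytes for a resize (hotplug) call.

    Endpoint and field names are assumptions to verify against the
    API specification shipped with your Cloud Hypervisor release.
    """
    body = {}
    if desired_vcpus is not None:
        body["desired_vcpus"] = desired_vcpus
    if desired_ram_bytes is not None:
        body["desired_ram"] = desired_ram_bytes
    payload = json.dumps(body).encode()
    head = (
        "PUT /api/v1/vm.resize HTTP/1.1\r\n"
        "Host: localhost\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode()
    return head + payload


def send_over_unix_socket(socket_path, request):
    """Send a prepared request to the VMM's API socket and return the
    first chunk of the raw response (enough for a status line)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(socket_path)
        s.sendall(request)
        return s.recv(65536)


if __name__ == "__main__":
    # Hypothetical usage: grow a running VM to 4 vCPUs.
    req = build_resize_request(desired_vcpus=4)
    print(req.decode())
    # send_over_unix_socket("/tmp/ch.sock", req)  # path is a placeholder
```

Building the request separately from sending it keeps the payload inspectable and testable without a running VMM. A VM can typically only be grown up to limits declared at boot, so the values passed here would need to fit the original VM configuration.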
Use Cases
- Kata Containers backend: Drop-in hardware isolation for Kubernetes container workloads when gVisor-level isolation is insufficient and full QEMU overhead is unacceptable
- AI agent sandbox runtime: Foundation for sandbox platforms (Arrakis, CodeDuet) that need snapshot/restore for agent backtracking workflows
- Cloud VMs with hotplugging: Running longer-lived cloud instances that need elastic resource scaling without restart
- GPU VM workloads: Selected by Fly.io for GPU VMs where Cloud Hypervisor’s device model is more suitable than Firecracker’s minimal device set
- Windows guest hosting on Linux: Running Windows Server VMs on Linux KVM infrastructure without QEMU
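The agent-backtracking use case above boils down to a checkpoint and rollback sequence against the VMM's API. The sketch below only plans the ordered calls; it does not send them. The endpoint names (`vm.pause`, `vm.snapshot`, `vm.resume`, `vm.restore`) and the `destination_url` / `source_url` fields are assumptions based on the Cloud Hypervisor REST API as recalled here, and the function names are hypothetical.

```python
def _file_url(directory):
    """Turn an absolute directory path into a file:// URL."""
    if not directory.startswith("/"):
        raise ValueError("snapshot directory must be an absolute path")
    return f"file://{directory}"


def checkpoint_calls(snapshot_dir):
    """Ordered API calls to checkpoint a running VM.

    Assumption: the VM has to be paused before it can be snapshotted,
    and is resumed afterwards so the agent can continue.
    """
    url = _file_url(snapshot_dir)
    return [
        ("PUT", "/api/v1/vm.pause", None),
        ("PUT", "/api/v1/vm.snapshot", {"destination_url": url}),
        ("PUT", "/api/v1/vm.resume", None),
    ]


def rollback_calls(snapshot_dir):
    """Ordered API calls to roll back to a checkpoint.

    Assumption: these are issued to a fresh VMM process, and the
    restored VM comes up paused, so it is resumed explicitly.
    """
    url = _file_url(snapshot_dir)
    return [
        ("PUT", "/api/v1/vm.restore", {"source_url": url}),
        ("PUT", "/api/v1/vm.resume", None),
    ]


if __name__ == "__main__":
    # Hypothetical checkpoint directory for one agent step.
    for call in checkpoint_calls("/var/lib/agent/step-7"):
        print(call)
```

A sandbox platform would keep one snapshot directory per agent step and call `rollback_calls` with the directory of the step it wants to return to.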
Adoption Level Analysis
Small teams (<20 engineers): Poor direct fit. Cloud Hypervisor is a low-level VMM, not a product, and operating it requires deep virtualization knowledge. Small teams would typically consume it through a platform (Kata Containers, Arrakis) rather than run it themselves.
Medium orgs (20–200 engineers): Fits as an infrastructure component when running Kata Containers or building a custom sandbox platform. Requires at least one engineer with KVM/virtualization expertise. Not a managed service.
Enterprise (200+ engineers): Good fit as part of a Kubernetes + Kata Containers stack for workload isolation. Microsoft and Intel contributions provide some confidence in long-term maintenance. Used in production at Fly.io and various Kata Containers deployments.
Alternatives
| Alternative | Key Difference | Prefer when… |
|---|---|---|
| Firecracker | More minimal, faster boot (~125ms vs ~200ms), AWS-backed, serverless focus, no hotplug | You need maximum startup speed for ephemeral serverless/agent workloads |
| QEMU/KVM | Feature-complete but ~1.5M lines of C; broader device support, larger attack surface | You need legacy device compatibility or exotic hardware pass-through |
| crosvm | Google’s ChromeOS VMM, Rust, similar scope; less cloud-focused | You are building on ChromeOS or Chromium OS infrastructure |
| libkrun | Ultra-minimal microVM library (not a full VMM); macOS Hypervisor.framework support | You need cross-platform (Linux + macOS) with minimal feature set |
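The "Prefer when" column can be read as a small decision procedure. The sketch below encodes it under stated assumptions: the requirement flags, the function name, and especially the order in which rules are checked are illustrative choices, not something the table prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical requirement flags derived from the table above."""
    needs_legacy_devices: bool = False
    needs_macos_host: bool = False
    on_chromeos: bool = False
    needs_hotplug: bool = False
    needs_windows_guest: bool = False
    cold_start_critical: bool = False


def suggest_vmm(req):
    """Return a VMM suggestion; hard constraints are checked first."""
    if req.needs_legacy_devices:
        return "QEMU/KVM"
    if req.needs_macos_host:
        return "libkrun"
    if req.on_chromeos:
        return "crosvm"
    # Features this document lists as missing from Firecracker.
    if req.needs_hotplug or req.needs_windows_guest:
        return "Cloud Hypervisor"
    if req.cold_start_critical:
        return "Firecracker"
    return "Cloud Hypervisor"


if __name__ == "__main__":
    print(suggest_vmm(Requirements(cold_start_critical=True)))
    print(suggest_vmm(Requirements(needs_hotplug=True, cold_start_critical=True)))
```

Feature requirements are treated as hard constraints and boot latency as a preference, which is why a workload that needs hotplug is steered to Cloud Hypervisor even when cold start matters.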
Evidence & Sources
- GitHub repository — cloud-hypervisor/cloud-hypervisor
- Intel Releases Cloud Hypervisor — The New Stack (origin context)
- History of Cloud Hypervisor — Michael Zhao, Medium
- Guide to Cloud Hypervisor in 2026 — Northflank
- Firecracker NSDI paper — Amazon Science (comparison baseline)
- Fly.io GPU machines use Cloud Hypervisor, not Firecracker — HN comment
Notes & Caveats
- Not a managed service: Cloud Hypervisor is a standalone VMM binary, not a platform. You must build orchestration, lifecycle management, networking, and monitoring on top. Most users consume it indirectly via Kata Containers or a purpose-built sandbox platform.
- Slower boot than Firecracker: Cloud Hypervisor boots VMs in approximately 200ms; Firecracker boots in approximately 125ms. For workloads where cold-start latency matters (serverless, agent spawning), Firecracker retains an edge.
- Windows guest support adds surface area: The broader device model required for Windows support increases the attack surface compared to Firecracker’s deliberately minimal device set.
- Less battle-tested at Lambda scale than Firecracker: Firecracker powers AWS Lambda’s isolation at massive scale with extensive production hardening. Cloud Hypervisor has production deployments (Fly.io GPU, Kata Containers) but at smaller scale and with fewer published case studies.
- rust-vmm component sharing: Both Cloud Hypervisor and Firecracker share rust-vmm components. A vulnerability in a shared component (e.g., virtio device implementation) could affect both VMMs simultaneously.
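To put the boot-time and code-size figures quoted in this document in proportion, here is a short worked calculation. The constants are the approximate numbers cited above; the fleet size used in the example is purely hypothetical.

```python
# Approximate figures quoted in this document.
BOOT_MS = {"Cloud Hypervisor": 200, "Firecracker": 125}
LINES = {"Cloud Hypervisor": 106_000, "Firecracker": 83_000, "QEMU": 1_500_000}


def boot_gap_ms():
    """Extra boot time per VM for Cloud Hypervisor versus Firecracker."""
    return BOOT_MS["Cloud Hypervisor"] - BOOT_MS["Firecracker"]


def size_ratio(bigger, smaller):
    """How many times larger one codebase is than another."""
    return LINES[bigger] / LINES[smaller]


def extra_boot_seconds(spawns):
    """Cumulative extra boot time across a number of VM spawns."""
    return spawns * boot_gap_ms() / 1000


if __name__ == "__main__":
    print(boot_gap_ms())                                      # 75
    print(round(size_ratio("QEMU", "Cloud Hypervisor"), 1))   # 14.2
    print(round(size_ratio("Cloud Hypervisor", "Firecracker"), 2))  # 1.28
    print(extra_boot_seconds(10_000))                         # 750.0
```

The 75 ms gap is negligible for a long-lived VM but adds up to 750 seconds of cumulative boot time across 10,000 short-lived spawns, which is why the caveat matters mainly for serverless and agent-spawning workloads.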