Virtualization in Cloud Computing: the Past, the Present, and the Future
A short history of virtualization and a look at the future of this technology
Published: Sep 3, 2024 | Last updated: Apr 17, 2025

TL;DR: Virtualization lets one piece of hardware masquerade as many. It started with CPUs, moved to disks and memory, and is now transforming how we use GPUs. This post walks through what virtualization is, why it matters, and where GPU virtualization stands today.
1. What is virtualization?
Virtualization is software that creates an abstraction layer—a virtual machine (VM)—that looks and behaves just like real hardware. Your program thinks it's running on its own CPU, disk, or GPU, but in reality the hypervisor is sharing the underlying device across many users.
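For a minimal illustration of that guest's-eye view (assuming a Python interpreter running inside a VM), standard OS queries report whatever virtual hardware the hypervisor chooses to expose, not the physical machine underneath:

```python
import os
import platform

# Inside a guest VM these report the *virtual* hardware the hypervisor
# exposes, not the physical host: a 64-core server sliced into 4-vCPU
# guests would show cpu_count() == 4 here.
print("CPUs visible to this program:", os.cpu_count())
print("Architecture reported to this program:", platform.machine())
```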
2. Why bother?
Higher utilization: Servers often sit idle. Virtualization lets 5–10× more work run on the same hardware (see the back-of-the-envelope sketch below).
Elastic capacity: VMs can be moved, resized, or paused in seconds—no racking or cabling required.
Isolation: Faults and security issues stay inside the VM bubble.
The price you pay is overhead. Extra layers add latency and sometimes cap throughput, but history shows that margin shrinks over time.
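Here is that back-of-the-envelope utilization math, using illustrative numbers rather than measurements from any particular fleet:

```python
import math

# Illustrative assumptions only: ten "one server per app" boxes, each app
# peaking at ~15% of a machine (i.e., ~85% idle), plus an assumed ~10%
# hypervisor overhead.
apps = 10
peak_utilization = 0.15
virtualization_overhead = 1.10

# Capacity actually needed, measured in whole servers.
servers_needed = apps * peak_utilization * virtualization_overhead
hosts = math.ceil(servers_needed)

print(f"{apps} mostly idle boxes consolidate onto {hosts} virtualized hosts "
      f"(~{apps / hosts:.0f}x more work per machine)")
```

With these numbers, ten mostly idle machines collapse onto two virtualized hosts, which is where the 5–10× figure comes from.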
3. A (very) short history
Era | Milestone | Why it mattered |
---|---|---|
1960s | IBM CP‑40 time‑shared a mainframe between 14 users. | Turned million‑dollar hardware into a shared resource. |
1990s | VMware noticed servers idled ≈85% and revived virtualization on x86. | Drove utilization toward 80%+ and made “one server per app” obsolete. |
2000s | Amazon EC2 shipped virtual CPUs (vCPUs) by default. | Popularized “pay only for what you use” cloud pricing. |
4. Beyond the CPU
Storage: AWS Elastic Block Store (EBS) pools thousands of disks into on‑demand volumes with near‑local performance (a minimal provisioning sketch follows this list).
Memory: Projects like vNUMA carve RAM across hosts, but nanosecond latencies make high‑performance memory virtualization tough.
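To make the storage point concrete, here is a minimal sketch of carving out an EBS volume on demand with boto3; the availability zone and instance ID are placeholders, and real use requires AWS credentials to be configured:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Carve a 100 GiB gp3 volume out of the pooled storage fleet on demand --
# no physical disk to buy, rack, or cable.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,
    VolumeType="gp3",
)

# Attach it to a running instance as if it were a local block device.
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # placeholder instance ID
    Device="/dev/sdf",
)
```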
5. GPUs: today’s frontier
GPUs crave bandwidth and hate context switches, so early experiments were ~100× slower than bare metal. Progress has been quick:
Year | Project | Overhead vs. Physical GPU |
---|---|---|
2013 | rCUDA (research) | ~100× (with RDMA) |
2022 | Thunder Compute prototype | ~1000× (with TCP) |
2025 | Thunder Compute public beta | ~1.5× and falling |
Breakthroughs driving this drop:
Faster networking increases the speed at which GPUs can communicate over a network connection.
AI-enabled optimization reshapes how a GPU program executes so it can tolerate high-latency connections (a toy sketch of the idea follows this list).
Idle-time disconnection lets many developers share a smaller pool of hardware.
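As a toy sketch of the latency-hiding idea (not Thunder Compute's actual implementation; the remote server and wire format here are assumptions), queuing GPU API calls locally and flushing them in batches means a slow TCP link is crossed once per batch rather than once per call:

```python
import pickle
import socket


class RemoteGPUClient:
    """Toy latency-hiding client: queue GPU API calls and send them to a
    (hypothetical) remote GPU server in one batch, so a high-latency TCP
    link is crossed once per batch instead of once per call."""

    def __init__(self, host: str, port: int):
        self.sock = socket.create_connection((host, port))
        self.pending = []

    def call(self, api_name: str, *args) -> None:
        # Record the call instead of paying a network round trip for it.
        self.pending.append((api_name, args))

    def flush(self):
        # One round trip carries the whole batch of queued calls.
        payload = pickle.dumps(self.pending)
        self.sock.sendall(len(payload).to_bytes(8, "big") + payload)
        self.pending.clear()
        # Read back the batched results (wire format is an assumption).
        size = int.from_bytes(self._recv_exact(8), "big")
        return pickle.loads(self._recv_exact(size))

    def _recv_exact(self, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = self.sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("remote GPU server closed the connection")
            buf += chunk
        return buf
```

Real systems are far more sophisticated, but the core trade is the same: amortize network latency across many GPU operations instead of paying it on every call.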
6. Where this is heading
Cheaper prototyping: Idle time can be repurposed and rented to others, lowering bills and easing capacity shortages.
Simpler infrastructure: Managing GPUs is tough. Network-attached GPUs add a layer of flexibility, making it easy to swap one chip for another if it fails.
Effortless scaling: When your app needs dedicated GPUs, migrate without rewriting infrastructure.
Virtualization already made CPUs and disks feel “elastic.” GPUs are next—bringing the same flexibility to model training, game servers, and any workload that spikes.

Carl Peterson