How Thunder Compute works (GPU-over-TCP)
TL;DR: We attach GPUs to your VM over a plain TCP socket instead of PCIe. This lets us time‑slice each physical GPU across many users without asking you to change a single line of code.
Published: Oct 31, 2024
Last updated: Apr 17, 2025

1. Why bother virtualizing a GPU?
GPUs are expensive, and they often sit idle while you read logs or tweak hyper‑parameters. By stacking workloads from multiple users back to back on each GPU, we keep the hardware busy and the price low. This is different from a scheduler like Slurm: everything happens behind the scenes, in real time, without queueing or waiting.
2. How does it work?
Network‑attached: The GPU sits across a high‑speed network instead of in a PCIe slot. Your virtual machine communicates with the GPU over TCP, the same protocol your browser uses (a toy sketch of the idea follows this list).
Feels local: You still `pip install torch`, use `device="cuda"`, and go. Behind the scenes, our instance translates those calls into network messages (see the PyTorch example after this list).
Time‑sliced: When your process runs, it owns the whole GPU: the full VRAM and compute of the card you pay for. When the process finishes (or you idle out), we can hand that GPU to someone else.
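To make the network‑attached idea concrete, here is a toy sketch of remoting an API call over a TCP socket. This is not Thunder Compute's wire protocol or code; it only illustrates how a call issued on one machine can execute on another and return a result, with the serialization format and the `matmul_size` stand‑in operation invented for the example:

```python
import pickle
import socket
import struct
import threading

# Toy sketch only: NOT Thunder Compute's actual protocol.
# Messages are length-prefixed pickled tuples of (op_name, args).

def send_msg(sock, obj):
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))

# "GPU side": accept one connection, execute the requested op, reply.
OPS = {"matmul_size": lambda m, n: m * n}  # stand-in for real GPU work

server = socket.create_server(("127.0.0.1", 0))
port = server.getsockname()[1]

def serve():
    conn, _ = server.accept()
    with conn:
        op, args = recv_msg(conn)
        send_msg(conn, OPS[op](*args))

threading.Thread(target=serve, daemon=True).start()

# "VM side": what looks like a local call is really a network round trip.
with socket.create_connection(("127.0.0.1", port)) as sock:
    send_msg(sock, ("matmul_size", (4096, 4096)))
    print(recv_msg(sock))  # 16777216
```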
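And here is what "feels local" means in practice: ordinary PyTorch code with no Thunder‑specific imports or flags. On a Thunder instance, the `"cuda"` device below is the network‑attached GPU, but the code is identical to what you'd write for a local card:

```python
import torch

# Ordinary PyTorch; no Thunder-specific changes.
# On a Thunder instance, "cuda" is the network-attached GPU,
# addressed exactly like a locally installed card.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(4096, 4096, device=device)
y = x @ x  # computed on the GPU; only the calls cross the network
print(y.sum().item())
```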

3. Performance
TCP adds a little latency, but most ML jobs spend far more time computing than waiting for data. By optimizing how your program runs behind the scenes, we usually land within 1×–1.8× of a direct‑attached GPU. Less optimized workloads may be considerably slower, and some even run faster than native; check our docs to see which workloads we've tested most thoroughly. To see where your own job lands, time it yourself, as in the sketch below.
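A minimal timing harness like this, run once on a Thunder instance and once on a direct‑attached GPU, gives a quick comparison. The model and tensor sizes are arbitrary placeholders; substitute your real workload:

```python
import time
import torch

# Minimal sketch for comparing wall-clock time across setups.
# Sizes and model are placeholders, not a calibrated benchmark.
device = "cuda"
model = torch.nn.Linear(8192, 8192).to(device)
x = torch.randn(256, 8192, device=device)

torch.cuda.synchronize()  # ensure setup work has finished
start = time.perf_counter()
for _ in range(100):
    y = model(x)
torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
print(f"100 forward passes: {time.perf_counter() - start:.3f}s")
```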
| Feature | Status |
|---|---|
| Network-attached GPUs | Ready |
| Secure process isolation | Ready |
| Multi-GPU support | Ready |
| PyTorch | Ready |
| TensorFlow, JAX | Early access |
| Multi-node clusters | In development |
| Graphics / game engines | Not yet |
4. Security
When your job ends, we wipe every byte of GPU memory and reset the card so no data leaks to the next user. Each process runs in its own sandbox.
5. What’s next
Point‑in‑time slices for even more savings
Cluster support for large-scale training
Graphics support after CUDA workloads are rock‑solid
Tell us what you need: ping our team on Discord. The first $20 each month is free, so spin up an A100 GPU and see how it feels.

Carl Peterson