Running Deepseek R1 Locally on Thunder Compute

In this guide, we’ll show you how to run the Deepseek R1 model on Thunder Compute. For the best performance, we recommend our 80GB A100 GPUs, which support the 70B model variant. For more information about GPU compatibility, check out our compatibility guide.

Note: Make sure you’ve already set up your Thunder Compute account. If not, follow the steps in our Quickstart Guide.

Step 1: Create Your GPU Instance

Open your CLI and run the following command to create an instance. For best performance, select an 80GB A100 GPU and opt for the 70B model variant. For more details about instance templates, see our templates guide.

tnr create --gpu "a100xl" --template "ollama"

Step 2: Check Status and Connect

After creating your instance, check its status:

tnr status

Then, connect to your running instance using its ID:

tnr connect <instance-id>

Step 3: Start the Ollama Server

Once connected, start the Ollama server (which also launches the web UI) by running:

start ollama

If you encounter any issues during startup, check our troubleshooting guide for common solutions.

Wait around 30 seconds for the web UI to build and load.
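Rather than waiting a fixed 30 seconds, you can poll the web UI (served at http://localhost:8080, as in Step 4) until it answers. A minimal Python sketch; the retry counts and timeouts here are illustrative, not part of the template:

```python
import time
import urllib.request
import urllib.error

def wait_for_ui(url, attempts=15, delay=2.0, timeout=2.0):
    """Poll `url` until an HTTP request succeeds, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            urllib.request.urlopen(url, timeout=timeout)
            return True  # got a successful HTTP response: the UI is up
        except (urllib.error.URLError, OSError):
            time.sleep(delay)  # nothing listening yet; wait and retry
    return False

if __name__ == "__main__":
    ready = wait_for_ui("http://localhost:8080")
    print("UI ready" if ready else "UI did not come up in time")
```

Run it right after `start ollama`; it returns as soon as the UI responds instead of making you guess at a delay.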

Step 4: Access the Web UI and Load Deepseek R1

  1. Open your web browser and navigate to:
    http://localhost:8080

  2. In the web UI, select the Deepseek R1 model from the dropdown menu. If you’re on an 80GB A100 instance, be sure to pick the 70B variant for optimal performance.

Step 5: Interact with the Deepseek R1 Model

Type your prompt into the web interface. For example, try asking:

“If the concepts of rCUDA were applied at scale, overcoming latency, what would it mean for the cost of GPUs on cloud providers?”

After you submit your query, the model will begin processing. You’ll see its detailed thought process as the response is generated; note that a full answer can take up to 200 seconds.
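You can also query the model programmatically instead of through the web UI. Ollama exposes an HTTP API (by default on port 11434) with a `/api/generate` endpoint. The sketch below assumes that default setup on the instance; the model tag `deepseek-r1:70b` is an assumption, so confirm the exact name in the UI’s model dropdown:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default API address

def build_request(model, prompt):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks for one complete JSON reply instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt):
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]  # the model's full answer text

if __name__ == "__main__":
    # Model tag is an assumption; check the web UI for the exact name.
    print(ask("deepseek-r1:70b", "Explain rCUDA in one paragraph."))
```

This is handy for scripting longer prompts or saving responses, and it uses the same server the web UI talks to.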

Conclusion

That’s it! You now have Deepseek R1 running locally on Thunder Compute. Enjoy exploring its detailed reasoning and powerful capabilities. For more in-depth tutorials, check out our other guides, including using Docker on Thunder Compute, using Instance Templates, and running Jupyter notebooks for interactive development.