NAIRR Workshop · Jetstream2
Running the Local LLM Notebook on Jetstream2
A step-by-step guide to launching a cloud instance and running the notebook that compares three local AI engines — Ollama, llama.cpp, and vLLM.
First — the big picture
Before the steps, here's what we're actually doing, in plain terms. Don't worry if some words are new — we'll define them as they come up.
🖥️ We're going to "spin up a compute instance"
An instance is simply a computer you borrow over the internet — you don't own the hardware, you rent it for as long as you need it. We'll start with a CPU‑only instance (no graphics card) to keep things simple and inexpensive.
Same idea as Amazon AWS or Google Cloud: a computer that lives in the cloud, not on your desk.
📓 We'll work inside a "Jupyter notebook"
A notebook is one document that mixes live code, written explanations, and results — all in your web browser. The code is split into small blocks called cells, and we run them one at a time so we can read each explanation and watch what happens before moving to the next.
If you've ever used Google Colab, it's the very same idea.
📦 The code lives on GitHub: we'll "clone" it
All the code and explanations are stored in a shared online folder on GitHub (called a repository, or "repo" for short). With a single command we copy that folder onto our cloud instance — that copy step is called cloning.
Like grabbing a shared Google Drive folder, but for code.
🧪 What this first example explores
To use an AI model (the "brain", which is really just a big file of numbers), you need a program to actually run it. Running a model to get answers is called inference, so these programs are called inference apps, or engines. We take one model and run it through three different apps (Ollama, llama.cpp, and vLLM), giving each the same questions (called prompts). The goal is to find out whether one app is clearly the better choice, so we compare their speed, memory use, and answer quality.
So the whole flow is just three moves: launch an instance, clone the repo, and run the notebook.
Before you begin
Make sure you have:
- An ACCESS ID (register free at operations.access-ci.org/identity/new-user — use your school email, not Gmail)
- Been added to the workshop's Jetstream2 allocation by your instructor
- Logged in to the Jetstream2 portal: jetstream2.exosphere.app
First time connecting to Jetstream2? (do this once)
The very first time, you add your allocation to your Exosphere account. You won't repeat this for later workshops on the same allocation. You get your own workspace — your instances draw from the shared allocation, and you won't see other people's instances.
- Go to jetstream2.exosphere.app and click Add Allocation.
- Choose Jetstream2 as the provider when prompted.
- Sign in with your ACCESS ID — pick the same identity provider you registered with (often "ACCESS CI (XSEDE)", or your university).
- Exosphere lists the allocations you can use — select your instructor's project (e.g. NAIRR2xxxxx) and add it.
- You land on your dashboard for that allocation. Done — continue to Step 1.
Step 1: Launch your instance
Signed in to Exosphere with your allocation added? Create a new virtual machine:
- Click Create → Instance
- Image: Featured-Ubuntu24
- Flavor (size): m3.quad (4 CPUs, 15 GB RAM — plenty for this demo)
- If asked which allocation to use, pick your instructor's project
- Leave the disk at the default 20 GB — no volume needed
- Click Create
Wait until the instance status shows Ready (usually 2–5 minutes). ☕
The default disk is enough
This demo uses a small model that fits the default 20 GB disk, so you don't need to add a volume. (Avoiding volumes also lets a whole class run at once — volumes are limited per allocation.) If you ever switch to a much larger model, you'd then need more disk.
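If you want to confirm how much room is left before trying a bigger model, a few lines of Python can report the free space. This is a minimal sketch using only the standard library; the function names `free_disk_gb` and `has_room_for` and the 2 GB headroom are illustrative assumptions, not part of the workshop notebook.

```python
import shutil
from pathlib import Path


def free_disk_gb(path: str = str(Path.home())) -> float:
    """Return the free space, in GB (10**9 bytes), on the disk that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1e9


def has_room_for(model_size_gb: float,
                 path: str = str(Path.home()),
                 headroom_gb: float = 2.0) -> bool:
    """Return True if the disk can hold a model of `model_size_gb`.

    `headroom_gb` is an assumed safety margin for temporary files;
    adjust it to taste.
    """
    return free_disk_gb(path) >= model_size_gb + headroom_gb


if __name__ == "__main__":
    print(f"Free disk space: {free_disk_gb():.1f} GB")
```

You can run this from any Python prompt on the instance, or later from a notebook cell. If the number is comfortably larger than the model you plan to download, the default disk is still enough.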
Step 2: Open the Web Desktop
On your instance's card, choose:
Interactions → Web Desktop
This opens a full Linux desktop inside your browser, running on the instance. Once it loads, open the Terminal application from the desktop.
Use Web Desktop, not "Console"
The Console is a raw boot screen with no browser and clumsy copy-paste. The Web Desktop includes a browser, which we need to view the notebook. The first load may take a minute and could ask for a passphrase shown on the instance page.
Step 3: Install Jupyter
In the terminal, set up a clean Python environment and install Jupyter. Copy and paste these commands (run them one block at a time):
Why the ~/llmdemo environment?
Ubuntu 24.04 marks its system Python as "externally managed", so pip refuses to install packages system-wide by default. The virtual environment ("venv") is a private sandbox that keeps Jupyter and the notebook's AI engines neatly separated from the rest of the system.
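A quick way to confirm that Python is really running inside the sandbox is to compare two standard-library values. This is a small sketch, not part of the workshop materials; the helper name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when this interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

With the environment active, the interpreter path should sit under ~/llmdemo. If the second line prints False, the environment is probably not active in that terminal.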
Step 4: Download the workshop & start Jupyter
Still in the same terminal, download the workshop materials and launch Jupyter:
JupyterLab will open automatically in the desktop's Firefox browser. If it doesn't, copy the http://localhost:8888/...?token=... link the terminal prints and paste it into Firefox.
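If the browser page will not load, it can help to check whether anything is listening on the port at all. The sketch below assumes Jupyter is on port 8888, as in the link above; Jupyter may pick a different port if 8888 is busy, in which case use the port printed in the terminal. The function name `port_is_open` is illustrative.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8888,
                 timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...)
        # when nothing is reachable at the address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Jupyter port reachable:", port_is_open())
```

Run it from a second terminal on the instance. A True result only shows that some server answers on that port; it does not check the token in the link.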
Step 5: Open and run the notebook
In the JupyterLab file list (left side), open:
04_Local_LLM_Frameworks.ipynb
Run the cells from top to bottom — click a cell and press Shift + Enter, or use Run → Run All Cells.
Troubleshooting
| Problem | Fix |
|---|---|
| "command not found: jupyter" | Your environment isn't active. Run source ~/llmdemo/bin/activate again, then retry. |
| Web Desktop is blank or stuck | Give it a full minute on first load. If still stuck, close the tab and reopen Interactions → Web Desktop. |
| The notebook feels slow on the long "essay" prompt | That's expected on a 4-CPU instance. In the notebook's config cell, the comments show how to switch to a smaller, faster model (qwen3:1.7b). |
| "externally-managed-environment" error on pip | You skipped the venv. Run the Step 3 commands in order — the source .../activate line is required. |
| I'm done — how do I avoid using up credits? | Back in the Jetstream2 portal, Shelve or Delete your instance when finished. A running instance keeps spending the allocation. |
What this notebook teaches
It runs the same AI model three different ways. The big takeaway: the framework you choose mostly changes speed, while model size and quantization mostly change answer quality. The final table lets you see both at once.
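To make the speed half of that comparison concrete, here is a minimal sketch of timing several engines on the same prompts. It is not the notebook's own measurement code: `time_engines` is a hypothetical helper, and `fake_fast` and `fake_slow` are stand-ins that only imitate engines, not real Ollama, llama.cpp, or vLLM calls.

```python
import time
from typing import Callable, Dict, List


def time_engines(engines: Dict[str, Callable[[str], str]],
                 prompts: List[str]) -> Dict[str, float]:
    """Return the mean wall-clock seconds per prompt for each engine."""
    if not prompts:
        raise ValueError("need at least one prompt")
    results: Dict[str, float] = {}
    for name, generate in engines.items():
        start = time.perf_counter()
        for prompt in prompts:
            generate(prompt)  # the answer is discarded; only time is measured
        elapsed = time.perf_counter() - start
        results[name] = elapsed / len(prompts)
    return results


# Hypothetical stand-ins for real engines (assumptions for illustration).
def fake_fast(prompt: str) -> str:
    return prompt.upper()


def fake_slow(prompt: str) -> str:
    time.sleep(0.01)  # pretend this engine needs longer per prompt
    return prompt.upper()


timings = time_engines(
    {"fast": fake_fast, "slow": fake_slow},
    ["Hello", "Why is the sky blue?"],
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per prompt")
```

Every engine receives the same list of prompts, so any difference in the numbers comes from the engine rather than the questions. Answer quality still has to be judged by reading the outputs side by side.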
Stuck? Get help from Jetstream2
Email help@jetstream-cloud.org, or drop in to Jetstream2 office hours — Tuesdays, 2:00 PM Eastern, on Zoom.