This is going to be a quick guide to everything I did to get the tools installed on my workstation build so that I could use it to play with and test various AI models.
The system I'm using is the dual Xeon system that I built earlier for just this purpose. It has plenty of available CPU cores (32 with hyper-threading) and 256GB of ECC DDR4 RAM to run both virtual machines and Docker containers as needed. As it's more of a workstation than a server, I installed my favorite desktop Linux distribution, Linux Mint 22.1. It's based on Ubuntu 24.04 LTS, so it's stable, and almost anything you can do on Ubuntu you can do on it with little or no modification of the process. The advantage, as far as I'm concerned, is that it uses the more traditional Cinnamon desktop environment, which I prefer over what Ubuntu uses.
In summary, that's the hardware and software I'm working with.
The first thing needed is a way to run the models themselves and provide API access to the other applications that will be making use of them. For this I installed and set up ollama, as it's an easy-to-use tool that exposes an OpenAI-compatible API to other tools and can use both the system's CPU and one or more graphics cards for AI acceleration.
Installing ollama on the system
To install ollama, you can follow the manual method in the following document.
How to install Ollama on Linux (2 easy methods)
1. Update the system
Make sure your system is up to date, as this will help things run more smoothly.
sudo apt update
sudo apt upgrade
2. Install required dependencies
Installing these ahead of time will speed up the installation; the install script should pull them in if they aren't present, but it's best to make sure they are installed.
sudo apt install python3 python3-pip git
3. Download the ollama installation package
The following command will download a script from the ollama website and install ollama on your system, taking care of most of the details.
curl -fsSL https://ollama.com/install.sh | sh
Verify that the install was successful:
ollama --version
4. Run and configure Ollama
You should be able to launch ollama as a server with the following command.
ollama serve
You can test that it's running with the following command, which will download the llama 3.2 3B model, run it, and give you verbose timing statistics at the prompt.
ollama run --verbose llama3.2:3b
If you run the above command, let it finish downloading the model, and then give it the prompt "Write a one-sentence summary of the plot of Cinderella.", you should see it produce a short summary followed by the verbose timing statistics.
You can exit the ollama shell by typing /bye and pressing Enter.
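With the server running and a model pulled, you can also talk to it over HTTP, which is how other applications will use it. A minimal sketch, assuming the default port 11434 and the llama3.2:3b model pulled above:

```shell
# Base URL of the local Ollama server (default port is 11434)
OLLAMA_URL="http://127.0.0.1:11434"

# Native Ollama endpoint: one-shot, non-streaming generation
curl -s "$OLLAMA_URL/api/generate" -d '{
  "model": "llama3.2:3b",
  "prompt": "Write a one-sentence summary of the plot of Cinderella.",
  "stream": false
}' || echo "ollama is not reachable at $OLLAMA_URL"

# The OpenAI-compatible API lives under /v1 on the same port
curl -s "$OLLAMA_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b", "messages": [{"role": "user", "content": "Hello"}]}' \
  || echo "ollama is not reachable at $OLLAMA_URL"
```

Both requests return JSON; the second one is the endpoint that OpenAI-style clients will point at.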
5. Setup to start automatically
Now that we've verified it appears to be working, we should enable the service so it starts automatically when the system boots.
sudo systemctl daemon-reload
sudo systemctl enable ollama
That should complete the installation of the ollama server on the system, and you should be ready to progress to the next step.
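If you later script against the server (say, from a cron job or another service started at boot), it can be handy to wait for the API to come up before using it. A sketch; the wait_for_ollama helper is a name of my own, not part of ollama:

```shell
# wait_for_ollama: poll the Ollama HTTP endpoint until it answers,
# giving up after N attempts one second apart. Returns 0 on success.
wait_for_ollama() {
  url="${1:-http://127.0.0.1:11434/api/version}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example: wait up to 30 seconds, then pull a model
# wait_for_ollama && ollama pull llama3.2:3b
```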
Installing Open WebUI
The first thing you need to do is install Docker on Linux Mint. I followed the instructions in How to Install Docker on Linux Mint 22; I won't copy them here as they're pretty long.
Create a Docker volume to persist Open WebUI data
docker volume create open-webui
Pull the Docker image from the GitHub Container Registry.
docker pull ghcr.io/open-webui/open-webui:main
Execute the docker run command to start the Open WebUI container
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Explanation of the command:
- docker run -d: Runs the container in detached mode (in the background).
- --network=host: Uses the host's network stack directly, so the container can reach ollama on 127.0.0.1 and Open WebUI is reachable on the host's port 8080.
- -v open-webui:/app/backend/data: Mounts the open-webui volume to the /app/backend/data directory in the container.
- -e OLLAMA_BASE_URL=http://127.0.0.1:11434: Sets the URL Open WebUI uses to reach the ollama API.
- --name open-webui: Assigns the name "open-webui" to the container.
- --restart always: Restarts the container automatically if it stops or when Docker starts.
- ghcr.io/open-webui/open-webui:main: Specifies the Open WebUI Docker image to use
Assuming the command executed without error, you should be able to open a browser to http://localhost:8080/ (or http://<host ip>:8080/ from another system) and set up the first user, which will be the administrator.
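If the page doesn't come up, a couple of quick checks from the host can narrow things down; the port and container name here match the run command above:

```shell
# Open WebUI listens on the host's port 8080 under --network=host
WEBUI_URL="http://localhost:8080"

# Fetch just the response headers; the first line shows the HTTP status
curl -sI "$WEBUI_URL/" | head -n 1

# If nothing answers, container state and recent logs are the next stops:
#   docker ps --filter name=open-webui
#   docker logs --tail 20 open-webui
```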
Updating Open WebUI
You will need to update Open WebUI from time to time, as feature updates and fixes are released regularly. The easiest way I found to update it is to use Watchtower with the following command.
docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
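If you'd rather not use Watchtower, the manual equivalent is to pull the new image and recreate the container with the same options as before; chat history and settings survive because they live in the open-webui volume, not in the container. A sketch reusing the run command from above:

```shell
# Image and container name match the original docker run command
IMAGE="ghcr.io/open-webui/open-webui:main"

docker pull "$IMAGE" &&
docker stop open-webui &&
docker rm open-webui &&
docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui --restart always \
  "$IMAGE" ||
echo "update failed; check that docker is running and the container name matches"
```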
Using Open WebUI
I'm not going to write instructions on using Open WebUI; the interface is similar in usage to the OpenAI/Gemini/Copilot interfaces, and there are advanced features that can be enabled, such as search integration and RAG (Retrieval-Augmented Generation, for integrating your own data into the model's knowledge), that are better covered in other places.
The main point of this post is to document the configuration for myself, so if I need to do it again I know how I did it without having to search through the documentation again.