Most k3s tutorials get you a running cluster in five minutes and call it done. That is fine for learning, but if you plan to run real workloads on this thing (databases, self-hosted SaaS, internal tools) you need to make decisions up front about disk layout, resource isolation, and backup strategy that are painful to change later. This guide walks through every step of building a single-node k3s cluster inside a Proxmox VM, with specific attention to the things tutorials skip: NUMA topology for multi-socket servers, separating the OS disk from the data disk, kernel tuning for container workloads, and automated backup of the cluster state. By the end you will have a cluster ready to run production workloads like Supabase, Gitea, or anything else you want to self-host, with a foundation clean enough that adding nodes, GitOps, or a service mesh later does not require starting over.

In this post

  1. Prerequisites and assumptions
  2. Software versions
  3. Step 1: Create the VM in Proxmox
  4. Step 2: Install Ubuntu Server
  5. Step 3: Post-install configuration
  6. Step 4: Kernel tuning
  7. Step 5: Format the data disk
  8. Step 6: Install k3s
  9. Step 7: Install Helm
  10. Step 8: Traefik ingress controller
  11. Step 9: Namespace strategy
  12. Step 10: Automated backups
  13. Step 11: Verification checklist
  14. What comes next

Prerequisites and Assumptions

This guide assumes you have:
  • A Proxmox VE host with at least 128GB RAM and a fast storage pool (NVMe-backed ZFS recommended)
  • A VLAN-aware bridge (vmbr0) configured in Proxmox
  • A working network with DHCP on the target VLAN
  • SSH access to the Proxmox host
The examples below use a Dell R730xd with dual Xeon E5-2699 v4 processors (44 cores / 88 threads), 256GB RAM, and NVMe storage on a ZFS mirror pool. Adjust the CPU, RAM, and storage values to match your hardware. If you have a single-socket server, you can skip the NUMA-specific settings.

Software Versions

Pinning versions matters. A command that works today might behave differently after an upstream release changes default flags or deprecates options. Here is what this guide was tested against:
Component        Version
Proxmox VE       9.1.6
Ubuntu Server    24.04.4 LTS
Kernel           6.8.0-101-generic
k3s              v1.34.5+k3s1
containerd       2.1.5-k3s1
Helm             3.x (latest)
Traefik          v3.6.9

Step 1: Create the VM in Proxmox

We are creating the VM via CLI rather than the GUI. This is intentional: a CLI command is reproducible, documentable, and can be committed to a git repository. If you ever need to rebuild this VM, you have the exact command. First, confirm your ISO is uploaded and your storage pool name:
ls /var/lib/vz/template/iso/
pvesm status
Take note of the ISO filename and the storage pool name. In the examples below, the storage pool is called my-storage and the ISO is ubuntu-24.04.4-live-server-amd64.iso. Replace these with your actual values.
qm create 100 \
  --name k3s-node01 \
  --ostype l26 \
  --machine q35 \
  --bios ovmf \
  --efidisk0 my-storage:0,efitype=4m,pre-enrolled-keys=0 \
  --cpu host \
  --sockets 2 \
  --cores 12 \
  --numa 1 \
  --memory 98304 \
  --balloon 0 \
  --scsihw virtio-scsi-single \
  --scsi0 my-storage:64,iothread=1,discard=on,ssd=1 \
  --scsi1 my-storage:400,iothread=1,discard=on,ssd=1 \
  --net0 virtio,bridge=vmbr0,tag=20 \
  --ide2 local:iso/ubuntu-24.04.4-live-server-amd64.iso,media=cdrom \
  --boot order=scsi0 \
  --agent enabled=1
That is a lot of flags. Here is why each one matters:
  • --machine q35 (modern chipset): Better PCIe emulation than the older i440fx. Required for UEFI boot.
  • --bios ovmf (UEFI): Modern boot chain. Enables Secure Boot if you ever need it.
  • --cpu host (passthrough): Exposes real CPU flags (AVX2, AES-NI) to the VM. k3s benefits from AES-NI for TLS operations.
  • --sockets 2 --cores 12 (24 vCPUs): Matches a dual-socket physical layout. Leaves remaining cores available for other VMs. Adjust to your hardware.
  • --numa 1 (NUMA enabled): See the gotcha below. This is the single most impactful setting for multi-socket servers.
  • --memory 98304 (96GB): Generous for k3s workloads while leaving headroom for future VMs. Scale to your available RAM.
  • --balloon 0 (no ballooning): Memory ballooning conflicts with workloads that assume available memory is real (ZFS ARC, databases, Java heaps).
  • --scsihw virtio-scsi-single (VirtIO SCSI): Best virtual disk performance. The single variant enables per-disk IO threads.
  • --scsi0 ... --scsi1 ... (64GB + 400GB): Separate OS and data disks. Each gets its own IO thread (iothread=1) to prevent contention.
  • discard=on,ssd=1 (TRIM support): Tells the guest the backing store is SSD. TRIM commands propagate through to the host ZFS pool for space reclaim.
  • --net0 ... tag=20 (VLAN tag): Places the VM on a specific VLAN. Change the tag to match your network layout.
  • --agent enabled=1 (QEMU guest agent): Lets Proxmox see the VM’s IP address, perform clean shutdowns, and freeze filesystems for snapshots.
Gotcha: NUMA is not the same as CPU type “host”
A common misconception is that setting --cpu host handles NUMA. It does not. --cpu host passes through CPU instruction sets (AVX2, AES-NI, etc.). The --numa 1 flag is what tells the VM about your multi-socket memory topology. Without NUMA enabled, the VM sees flat memory. On a dual-socket server, this means the Linux scheduler has no idea that half the RAM sits behind the other CPU’s memory controller. Cross-NUMA memory access can be 30-40% slower. For latency-sensitive workloads like databases, this is the difference between good and terrible performance. If your server is single-socket, you can skip this flag entirely. It only matters for multi-socket systems.
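Once the VM is installed and booted, it is worth a quick sanity check that the guest actually sees both NUMA nodes. A minimal sketch, run inside the VM (numactl is an optional extra package, not installed by default):

```shell
# With --numa 1 on a dual-socket host, expect "NUMA node(s): 2" here.
# Without the flag (or on a single-socket server) you will see 1.
lscpu | grep -i 'numa'

# For per-node memory totals, install numactl first (sudo apt install numactl):
# numactl --hardware
```

If this reports a single node despite --numa 1, double-check that the VM was created with two sockets, since Proxmox derives the virtual topology from the sockets/cores split.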
Start the VM:
qm start 100

Step 2: Install Ubuntu Server

Open the console in the Proxmox UI (select VM 100, then Console). The Ubuntu Server installer will boot from the ISO. Walk through the installer with these choices:
  • Disk layout: Select the 64GB disk. Use the entire disk with ext4. Do not use LVM and do not use ZFS inside the VM. The host is already handling redundancy at the ZFS layer; adding another layer of volume management inside the guest just adds complexity for no benefit.
  • Hostname: k3s-node01
  • SSH: Enable OpenSSH server during install. Import your GitHub keys if prompted.
  • Snaps: Skip everything. Do not install Docker or any featured snaps. k3s bundles its own containerd runtime.
Gotcha: The installer shows two disks
The installer will see both the 64GB and 400GB virtual disks. Make sure you select the 64GB disk for the OS installation. Leave the 400GB disk completely untouched. We will format and mount it manually after installation so it serves as a dedicated data disk for k3s.
After installation completes, remove the ISO from the VM:
qm set 100 --ide2 none

Step 3: Post-Install Configuration

SSH into the VM and run the initial updates:
sudo apt update && sudo apt upgrade -y
Install the prerequisites that k3s and its storage drivers expect:
sudo apt install -y \
  curl \
  wget \
  jq \
  sqlite3 \
  open-iscsi \
  nfs-common \
  qemu-guest-agent \
  apparmor \
  apparmor-utils

sudo systemctl enable --now qemu-guest-agent
sudo systemctl enable --now iscsid
Why open-iscsi and nfs-common? Even though you are not using network storage today, several Kubernetes storage drivers (including Longhorn, which you may add later) require these kernel modules and userspace tools to be present. Installing them now avoids mysterious “volume stuck in Pending” debugging sessions later. We also install sqlite3 for cluster state backups. k3s in single-server mode uses SQLite as its datastore, and we will use SQLite’s online backup feature for safe snapshots.

Step 4: Kernel Tuning

The default kernel parameters are tuned for general-purpose workloads. A Kubernetes node has different needs: lots of concurrent connections, heavy use of inotify for file watches, and a preference for keeping processes running over aggressive memory reclaim.
cat <<'EOF' | sudo tee /etc/sysctl.d/99-k3s.conf
# Connection handling
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 32768

# Increase conntrack table for iptables-based kube-proxy
net.netfilter.nf_conntrack_max = 131072

# Increase inotify limits -- each pod watch consumes watchers
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 1024

# Increase file descriptor limits
fs.file-max = 2097152

# Do not swap unless absolutely necessary
vm.swappiness = 1

# Better memory overcommit behavior for containers
vm.overcommit_memory = 1
vm.panic_on_oom = 0
EOF

sudo sysctl --system
A quick explanation of the less obvious settings:
  • vm.swappiness = 1: Tells the kernel to avoid swapping unless it is about to OOM. Container workloads with memory limits should be killed (and rescheduled by Kubernetes) rather than swapped to disk where they become unresponsive zombies.
  • vm.overcommit_memory = 1: Always allows memory allocation. Without this, the kernel’s heuristic overcommit check can reject allocations from containers even when there is plenty of free memory, causing spurious OOM kills.
  • fs.inotify.max_user_watches: The default of 8192 is laughably low for Kubernetes. Each ConfigMap watch, log tail, and file sync consumes a watcher. You will hit the default limit with just a handful of pods.
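After running sysctl --system, confirm the values actually took effect; a typo in the conf file fails silently for that line. A quick check:

```shell
# Each of these should echo the value set in 99-k3s.conf,
# e.g. "fs.inotify.max_user_watches = 524288"
sysctl fs.inotify.max_user_watches vm.swappiness vm.overcommit_memory

# The same values are readable directly from /proc
cat /proc/sys/fs/inotify/max_user_watches
```

If a value still shows the old default, check for a later-sorting file in /etc/sysctl.d/ overriding yours; files are applied in lexical order.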

Step 5: Format the Data Disk

This is one of the most important architectural decisions in the entire setup: separating the k3s data directory from the OS disk. By default, k3s stores everything under /var/lib/rancher/k3s on the OS disk. “Everything” means container images, the cluster datastore, persistent volume data, and containerd snapshots. On a 64GB OS disk, you can fill that up surprisingly fast. A few container images and a database PVC later, your root filesystem is at 95% and the node starts evicting pods. First, verify the data disk device name:
lsblk
You should see your 64GB OS disk (with partitions) and a bare 400GB disk. In most cases this will be /dev/sdb, but the kernel's device enumeration can shuffle drive letters between boots. Always verify before formatting.
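To identify the disk by its size rather than assuming a device letter, a quick sketch:

```shell
# List only whole disks (no partitions) with their sizes;
# the 400G entry with no partitions underneath is the data disk
lsblk -dno NAME,SIZE,TYPE

# Stable, bus-based names survive reboots if you prefer them over /dev/sdX
ls -l /dev/disk/by-path/ 2>/dev/null
```

Once the disk is formatted with a label (as below), the fstab entry references LABEL=k3s-data, so later drive-letter shuffles no longer matter.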
# Format with ext4
sudo mkfs.ext4 -L k3s-data /dev/sdb

# Create mount point
sudo mkdir -p /mnt/k3s-data

# Add to fstab for persistence across reboots
echo 'LABEL=k3s-data /mnt/k3s-data ext4 defaults,noatime,discard 0 2' | sudo tee -a /etc/fstab

# Reload systemd and mount
sudo systemctl daemon-reload
sudo mount -a

# Verify (~393GB available)
df -h /mnt/k3s-data
Why noatime? It disables access time updates on every file read. In a k3s cluster doing thousands of small reads per second across container layers and config files, this eliminates a significant amount of unnecessary write I/O.

Why discard? It enables continuous TRIM. Since the virtual disk is configured with discard=on in Proxmox and the backing storage pool is on NVMe SSDs, TRIM commands propagate all the way down: guest filesystem to virtual disk to ZFS to physical SSD. This keeps the SSDs performing optimally as data is written and deleted over time.
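With the disk mounted, two quick checks confirm the options from fstab are actually in effect:

```shell
# The active mount options should include noatime and discard
findmnt -no OPTIONS /mnt/k3s-data

# A one-off manual TRIM pass; with -v it reports how many bytes
# were discarded, proving the TRIM path works end to end
sudo fstrim -v /mnt/k3s-data
```

If fstrim errors with "the discard operation is not supported", recheck that the virtual disk was created with discard=on in Step 1.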

Step 6: Install k3s

curl -sfL https://get.k3s.io | sh -s - server \
  --write-kubeconfig-mode 644 \
  --disable traefik \
  --disable servicelb \
  --data-dir /mnt/k3s-data/k3s \
  --default-local-storage-path /mnt/k3s-data/local-storage \
  --kubelet-arg="max-pods=110" \
  --kubelet-arg="eviction-hard=memory.available<500Mi,nodefs.available<10%"
Here is what each flag does and why it is here:
  • --write-kubeconfig-mode 644: Lets your non-root user read the kubeconfig without sudo. The default is 600 (root only).
  • --disable traefik: Disables the bundled Traefik instance. We install Traefik via Helm in a later step for version control and customization.
  • --disable servicelb: Disables k3s’s built-in ServiceLB (formerly Klipper). It binds host ports and conflicts with a Helm-managed ingress controller.
  • --data-dir /mnt/k3s-data/k3s: Moves all k3s state (datastore, containerd images, snapshots) to the dedicated data disk. This is the most important flag in the entire command.
  • --default-local-storage-path /mnt/k3s-data/local-storage: Points the built-in local-path-provisioner at the data disk for persistent volume claims.
  • --kubelet-arg="max-pods=110": Pins the per-node pod limit explicitly at the Kubernetes default of 110, so the limit is documented rather than implicit.
  • --kubelet-arg="eviction-hard=...": Tells kubelet to evict pods when available memory drops below 500Mi or available disk below 10%, preventing the node from going completely unresponsive.
Gotcha: why disable both Traefik and ServiceLB?
If you just want something working immediately, you can remove both --disable flags. The bundled Traefik will handle ingress out of the box and ServiceLB will expose it. This is perfectly fine for testing. The reason to disable them is control. The bundled Traefik is pinned to whatever version k3s shipped with, its configuration is managed through k3s HelmChart CRDs rather than standard Helm values, and upgrading it independently of k3s is awkward. If you plan to manage your cluster declaratively (Helm charts in git, eventually GitOps with Argo CD), you want every component installed through the same mechanism.
Verify the installation:
# Check the node is Ready
kubectl get nodes -o wide

# All system pods should be Running
kubectl get pods -A
The node should show Ready within 30 seconds. The system pods (coredns, local-path-provisioner, metrics-server) may take a minute to pull their images and start. Verify the local-path-provisioner is pointing at the correct directory:
kubectl get configmap local-path-config -n kube-system \
  -o jsonpath='{.data.config\.json}' | jq .
You should see your data disk path in the output:
{
  "nodePathMap": [
    {
      "node": "DEFAULT_PATH_FOR_NON_LISTED_NODES",
      "paths": [
        "/mnt/k3s-data/local-storage"
      ]
    }
  ]
}
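It is also worth confirming that the --data-dir flag took effect and cluster state is actually landing on the data disk. A quick sketch (paths follow from the install command above):

```shell
# The SQLite datastore should live under the data-dir tree
sudo ls -lh /mnt/k3s-data/k3s/server/db/state.db

# Container images land under the same tree
sudo du -sh /mnt/k3s-data/k3s/agent/containerd

# The default location should not exist at all
ls /var/lib/rancher/k3s 2>/dev/null || echo "default path unused -- good"
```

If /var/lib/rancher/k3s exists and is growing, the data-dir flag did not apply and a reinstall with the correct flags is the cleanest fix.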

Set up kubeconfig for Helm and remote access

k3s writes the kubeconfig to /etc/rancher/k3s/k3s.yaml, but Helm and other tools default to ~/.kube/config. Set the environment variable in your shell profile:
echo 'export KUBECONFIG=/etc/rancher/k3s/k3s.yaml' >> ~/.bashrc
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
Gotcha: Helm says “cluster unreachable”
If Helm gives you Kubernetes cluster unreachable: dial tcp 127.0.0.1:8080: connect: connection refused, it cannot find the kubeconfig. The fix is the KUBECONFIG export above. This catches people every time because kubectl works fine (k3s configures it automatically) but Helm does not inherit the same config path.
To access the cluster from your workstation, copy the kubeconfig and update the server address:
# On your workstation
mkdir -p ~/.kube
scp user@YOUR_VM_IP:/etc/rancher/k3s/k3s.yaml ~/.kube/config
sed -i 's/127.0.0.1/YOUR_VM_IP/g' ~/.kube/config

# Verify remote access
kubectl get nodes

Step 7: Install Helm

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version

Step 8: Traefik Ingress Controller

Even if you are not exposing services to the internet, an ingress controller lets you access services by hostname instead of port numbers. Instead of remembering that Supabase Studio is on port 3000 and Grafana is on port 3001, you access supabase.internal and grafana.internal.
helm repo add traefik https://traefik.github.io/charts
helm repo update
Create a values file:
# traefik-values.yaml

# Run as DaemonSet so it automatically scales when you add nodes
deployment:
  kind: DaemonSet

# Bind directly to host ports -- no LoadBalancer needed
ports:
  web:
    hostPort: 80
  websecure:
    hostPort: 443

# Enable the dashboard for debugging routing issues
ingressRoute:
  dashboard:
    enabled: true

# Prometheus metrics endpoint (useful when you add monitoring later)
metrics:
  prometheus:
    enabled: true

# Access logs help debug routing issues
logs:
  general:
    level: INFO
  access:
    enabled: true
helm install traefik traefik/traefik \
  --namespace traefik \
  --create-namespace \
  --values traefik-values.yaml
Verify it is running:
kubectl get pods -n traefik

# Smoke test -- should return 404 (Traefik is listening but has no routes)
curl -I http://YOUR_VM_IP
Expected response:
HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff
A 404 from Traefik means it is running and listening, but no Ingress resources exist yet. This is the correct state. Once you deploy applications with Ingress resources, Traefik will route traffic to them.

To use hostnames like supabase.internal, you need DNS entries pointing at your k3s node’s IP. You can add these to your router’s DNS, a local DNS server like Technitium or Pi-hole, or even /etc/hosts on your workstation. A wildcard entry like *.k3s.internal pointing at your VM’s IP is the most convenient approach (note that /etc/hosts does not support wildcards, so a workstation-only setup needs one entry per hostname).
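Once routing matters, the pattern looks like this. A minimal illustrative manifest, assuming you already have a Deployment with a Service named whoami listening on port 80 in the apps namespace (both names are placeholders for your own workload):

```yaml
# whoami-ingress.yaml -- illustrative example, not part of this setup
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami
  namespace: apps
spec:
  ingressClassName: traefik
  rules:
    - host: whoami.k3s.internal
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: whoami
                port:
                  number: 80
```

Even before DNS is configured, you can test a route like this with a Host header: curl -H "Host: whoami.k3s.internal" http://YOUR_VM_IP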

Step 9: Namespace Strategy

Setting up namespaces now takes 30 seconds and saves you from dumping everything into default and regretting it later. Namespaces give you logical isolation, per-namespace resource quotas, and cleaner kubectl output.
# Database and stateful services
kubectl create namespace data

# Your applications
kubectl create namespace apps

# Supabase (it deploys many pods, deserves its own space)
kubectl create namespace supabase
Apply default resource limits to workload namespaces so a misbehaving pod cannot consume the entire node:
# default-limits.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
    - default:
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      type: Container
for ns in apps supabase data; do
  kubectl apply -f default-limits.yaml -n $ns
done
This means any pod deployed without explicit resource requests gets sane defaults (100m CPU request, 128Mi memory request) instead of being allowed to consume unlimited resources. You can override these per-deployment in your Helm values.
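The LimitRange only fills in what a manifest leaves out, so a workload that needs more simply declares its own values. An illustrative resources block for a Deployment’s container spec (or the equivalent section of a Helm values file; the numbers are placeholders):

```yaml
# Explicit values always win over the namespace LimitRange defaults
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 1Gi
```

You can confirm the defaults are active in a namespace with: kubectl describe limitrange default-limits -n apps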

Step 10: Automated Backups

Your cluster state is everything: every Deployment, Service, Secret, ConfigMap, and namespace definition. Losing it means manually re-deploying everything from memory (or from your Helm commands if you were diligent about saving them).
Note: SQLite, not etcd
k3s in single-server mode uses an embedded SQLite database for cluster state, not etcd. etcd is only used when you initialize a multi-server HA cluster with --cluster-init. This matters for backups because the k3s etcd-snapshot command does not work with the SQLite backend. Instead, we use SQLite’s built-in .backup command, which performs a safe online copy without locking the database.
cat <<'SCRIPT' | sudo tee /usr/local/bin/k3s-backup.sh
#!/bin/bash
BACKUP_DIR="/mnt/k3s-data/backups/snapshots"
DATA_DIR="/mnt/k3s-data/k3s"
RETENTION_DAYS=30
mkdir -p "$BACKUP_DIR"

TIMESTAMP=$(date +%Y%m%d-%H%M%S)

# SQLite online backup -- safe to run while k3s is serving traffic
sqlite3 "$DATA_DIR/server/db/state.db" ".backup '$BACKUP_DIR/state-$TIMESTAMP.db'"

# Also grab the server token and TLS certificates
# You need these along with the database to fully restore a cluster
tar czf "$BACKUP_DIR/server-$TIMESTAMP.tar.gz" \
  "$DATA_DIR/server/token" \
  "$DATA_DIR/server/tls" \
  2>/dev/null

# Prune old backups
find "$BACKUP_DIR" -name "state-*" -mtime +${RETENTION_DAYS} -delete
find "$BACKUP_DIR" -name "server-*" -mtime +${RETENTION_DAYS} -delete

echo "$(date): Backup completed"
SCRIPT

sudo chmod +x /usr/local/bin/k3s-backup.sh
Set up a cron job to run daily at 2 AM and on every reboot:
(sudo crontab -l 2>/dev/null; echo "0 2 * * * /usr/local/bin/k3s-backup.sh >> /var/log/k3s-backup.log 2>&1") | sudo crontab -
(sudo crontab -l 2>/dev/null; echo "@reboot /usr/local/bin/k3s-backup.sh >> /var/log/k3s-backup.log 2>&1") | sudo crontab -
Test it:
sudo /usr/local/bin/k3s-backup.sh
ls -la /mnt/k3s-data/backups/snapshots/
You should see a state-*.db file (your cluster database) and a server-*.tar.gz file (token + TLS certs).
Gotcha: k3s etcd-snapshot does not work on SQLite
If you run k3s etcd-snapshot save on a single-server k3s installation, you will get “etcd datastore disabled”. This is not an error in your setup. k3s only enables the etcd snapshot mechanism when running with --cluster-init (multi-server HA mode). For single-server SQLite mode, use the sqlite3 .backup approach shown above.
Gotcha: --data-dir affects backup paths
If you used --data-dir during k3s installation (as we did), the SQLite database lives at /mnt/k3s-data/k3s/server/db/state.db, not the default /var/lib/rancher/k3s/server/db/state.db. Similarly, any k3s CLI command that interacts with server state requires --data-dir to be passed explicitly, otherwise it looks in the wrong location and fails with a “token not found” error.
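A backup you have never restored is a hope, not a backup. The restore path for the SQLite backend is roughly the following (a sketch, not a tested procedure; stop k3s first, and substitute TIMESTAMP with the snapshot you want):

```shell
# Stop k3s so nothing writes to the datastore during the restore
sudo systemctl stop k3s

# The .backup output is a complete, standalone SQLite database file,
# so restoring it is a plain copy back into place
sudo cp /mnt/k3s-data/backups/snapshots/state-TIMESTAMP.db \
        /mnt/k3s-data/k3s/server/db/state.db

# Restore the token and TLS certs captured alongside it; tar stored
# the paths relative to /, so extracting at / puts them back
sudo tar xzf /mnt/k3s-data/backups/snapshots/server-TIMESTAMP.tar.gz -C /

sudo systemctl start k3s
kubectl get nodes   # confirm the cluster came back
```

Running through this once against a throwaway change is the only way to know the backups are actually usable.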

Step 11: Verification Checklist

Run through this after completing all steps to confirm everything is working:
echo "=== Node Status ==="
kubectl get nodes -o wide

echo -e "\n=== System Pods ==="
kubectl get pods -A

echo -e "\n=== Storage Class ==="
kubectl get storageclass

echo -e "\n=== Namespaces ==="
kubectl get namespaces

echo -e "\n=== Data Disk ==="
df -h /mnt/k3s-data

echo -e "\n=== Traefik ==="
kubectl get pods -n traefik

echo -e "\n=== Backups ==="
ls -la /mnt/k3s-data/backups/snapshots/

echo -e "\n=== Ingress Smoke Test ==="
curl -s -o /dev/null -w "HTTP %{http_code}" http://localhost
echo

echo -e "\n=== PVC Smoke Test ==="
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: smoke-test
  namespace: apps
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi
EOF
sleep 5
kubectl get pvc smoke-test -n apps
kubectl delete pvc smoke-test -n apps
Everything green means: Node is Ready, all system pods are Running, storage class is local-path (default), Traefik is Running, ingress returns HTTP 404 (correct, no routes yet), and the PVC test shows Bound. If all of these check out, your k3s cluster is production-ready for workloads.

What Comes Next

With this foundation in place, you have a cluster ready for real workloads. The immediate next steps depend on what you need first:
  • Self-hosted Supabase: Deploy CloudNativePG for a properly managed Postgres instance, then install the Supabase community Helm chart on top of it.
  • Migrate a Vercel app: Containerize your application, push the image to a container registry, and deploy it to the apps namespace with an Ingress resource.
  • Monitoring: Prometheus + Grafana + Loki for metrics, dashboards, and centralized logging.
  • GitOps: Argo CD watching a git repository, so every deployment is a commit rather than a manual helm install.
  • Secrets management: OpenBao or Infisical with External Secrets Operator, replacing hardcoded secrets with dynamic injection.
  • cert-manager: Automated TLS certificates from Let’s Encrypt when you are ready to expose services externally.
The key principle is that none of these additions require tearing down what you just built. Each one layers on top of a cluster that already has the fundamentals right: separated storage, proper resource defaults, automated backups, and a clean namespace structure. That is what “production-grade” means for a homelab. Not complexity for its own sake, but making the right decisions early so the inevitable growth does not require a rebuild.

Software versions, kernel parameters, and CLI flags are accurate as of k3s v1.34.5 on Ubuntu 24.04.4 LTS with Proxmox 9.1.6. If something has changed or broken, the comments are open.