GPU PCI Address Instability: When Your Card Moves Between Reboots

AMD iGPU RAM theft, you know how sensitive these BIOS settings are.

If you're on Proxmox 8.4+, the "happy path" is to use the q35 machine type. The older i440fx is more prone to these PCI mapping failures and IRQ conflicts. I also found that preventing the card from entering deep power states helps avoid the "zombie GPU" scenario where the card is physically there but logically dead.

To stabilize this, I switched the VM to q35 and explicitly enabled PCIe mode for the passthrough device. I also added a kernel parameter to stop the CPU from entering deep sleep states, which I've found reduces the randomness of the PCIe bus scan.

# 1. Change VM to q35 machine type for better PCIe support
qm set <VMID> --machine q35
# 2. Pass through the GPU with pcie=1 to ensure it's treated as a PCIe device
# Replace <PCI_ADDRESS> with your current address (e.g., 0000:01:00.0)
qm set <VMID> -hostpci0 <PCI_ADDRESS>,pcie=1
# 3. Stop the GPU from entering D3cold (which can cause boot-time instability)
# Run this on the Proxmox host as root. The sysfs setting does not survive a host reboot, so reapply it after each boot.
echo 0 > /sys/bus/pci/devices/0000:<PCI_BUS>:<PCI_SLOT>.0/d3cold_allowed
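
Since the address in that last command is exactly the thing that moves, it may help to look the card up by its vendor and device ID instead of hardcoding the path. The following is a minimal sketch, not something from my original setup: `find_pci_addr`, `disable_d3cold`, and the `PCI_SYSFS` override are hypothetical names I'm introducing for illustration, and it assumes the standard sysfs layout where each device directory contains `vendor` and `device` files.

```shell
#!/bin/sh
# Sketch: locate a PCI device by vendor/device ID instead of by address.
# PCI_SYSFS is a hypothetical override so the functions can be pointed at a test tree.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci/devices}"

# Print the full address (e.g. 0000:01:00.0) of every device matching the IDs.
# $1 = vendor ID as sysfs reports it (0x10de is NVIDIA), $2 = device ID.
find_pci_addr() {
    for dev in "$PCI_SYSFS"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "$1" ] && [ "$(cat "$dev/device")" = "$2" ]; then
            basename "$dev"
        fi
    done
}

# Write 0 to d3cold_allowed for the given full PCI address.
disable_d3cold() {
    echo 0 > "$PCI_SYSFS/$1/d3cold_allowed"
}

# Example (run as root on the host; <DEVICE_ID> is a placeholder for your card's ID):
#   for addr in $(find_pci_addr 0x10de <DEVICE_ID>); do disable_d3cold "$addr"; done
```

Running something like this from a boot-time unit would also take care of reapplying the setting, although how you schedule it is up to you.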

If the addresses keep shifting despite these changes, you're fighting your motherboard's firmware. At that point, I stopped fighting the VM abstraction and moved the NVIDIA drivers directly onto the Proxmox host. I then used the NVIDIA Container Toolkit to expose the GPU to my Kubernetes worker. It removes the PCI address fragility entirely because the host driver handles the hardware mapping, and the containers just see the device.
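
For reference, the host-side part of that move can be sketched roughly as below. This is an illustration under assumptions, not a transcript of my setup: it assumes the NVIDIA driver and the NVIDIA Container Toolkit are already installed, and that the container runtime is containerd. The function names `missing_gpu_tools` and `configure_gpu_runtime` are hypothetical.

```shell
#!/bin/sh
# Sketch: preflight check, then register the NVIDIA runtime with containerd.

# Print the name of each required host-side tool that is not on PATH.
missing_gpu_tools() {
    for tool in nvidia-smi nvidia-ctk; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: containerd is the container runtime; adjust for your environment.
configure_gpu_runtime() {
    missing=$(missing_gpu_tools)
    if [ -n "$missing" ]; then
        echo "missing on host: $missing" >&2
        return 1
    fi
    nvidia-smi -L                                      # list the GPUs the host driver sees
    nvidia-ctk runtime configure --runtime=containerd  # add the NVIDIA runtime to containerd's config
    systemctl restart containerd                       # pick up the new runtime config
}

# Example (as root): configure_gpu_runtime
```

Nothing in this path refers to a PCI address, which is the point: the driver enumerates whatever the bus scan produced on this boot.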

The lesson here is that PCI addresses are not constants; they are suggestions. If your workload requires 100% uptime and you can't guarantee a static PCI map, stop using VM passthrough and move the driver to the host.
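
If you stay on passthrough anyway, it is worth checking before each VM start whether the address stored in the VM config still exists on the host. Here is a minimal sketch of such a check, assuming `qm config <VMID>` prints a line of the form `hostpci0: <address>,<options>`. The helper names `hostpci0_addr` and `addr_present` and the `PCI_SYSFS` override are hypothetical.

```shell
#!/bin/sh
# Sketch: detect a stale hostpci0 entry before starting the VM.

# Read `qm config` output on stdin and print the hostpci0 address,
# normalised to the full domain:bus:slot.function form where possible.
hostpci0_addr() {
    sed -n 's/^hostpci0: *\([0-9a-fA-F:.]*\).*/\1/p' | while read -r addr; do
        case "$addr" in
            ????:??:??.?) echo "$addr" ;;        # already fully qualified
            ??:??.?)      echo "0000:$addr" ;;   # short form, assume domain 0000
            *)            echo "$addr" ;;        # anything else is passed through unchanged
        esac
    done
}

# Return 0 if a device currently exists at the given address.
# This only proves that something is there, not that it is still the GPU.
addr_present() {
    [ -e "${PCI_SYSFS:-/sys/bus/pci/devices}/$1" ]
}

# Example (on the Proxmox host):
#   addr=$(qm config <VMID> | hostpci0_addr)
#   addr_present "$addr" || echo "hostpci0 points at $addr, which is gone" >&2
```

A stricter version would also compare the vendor and device ID at that address, since a different device can end up in the slot the GPU used to occupy.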

Guatu

Engineer building AI agents, bare-metal K8s clusters, and IIoT systems. 7-node Proxmox homelab because cloud bills are optional. Consulting at guatulabs.com.