Override tensors and tensor split with 2 GPUs #1794

Open

Opened on Oct 14, 2025

I have been struggling for a while to divide specific layers or tensors between two GPUs and a CPU. Is there a way to use Override Tensors to assign specific tensors to two different GPUs? I tried moving layers 30-39 to CUDA0 and layers 40-49 to CUDA1 like this:

\.(3[0-9])\.*=CUDA0,\.(4[0-9])\.*=CUDA1

At first it looks like it should work:

Handling Override Tensors for backends: CUDA0 CUDA1 CPU
Override Tensor: \.(3[0-9])\.* to CUDA0
Override Tensor: \.(4[0-9])\.* to CUDA1

But in the end KoboldCpp only applies the last override rule, i.e. layers 40-49 to CUDA1, and ignores the first one.
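
To rule out the patterns themselves, they can be checked outside KoboldCpp. This is only a minimal sketch, and it rests on two assumptions that are not confirmed from KoboldCpp's source: that tensor names follow the `blk.<layer>.<name>` convention (the `ffn_up.weight` suffix is just an illustrative name), and that each rule is applied as an unanchored regex search in the order given. The helpers `parse_overrides` and `assign` are hypothetical and exist only for this check.

```python
import re

# The override string exactly as passed in the report above.
SPEC = r"\.(3[0-9])\.*=CUDA0,\.(4[0-9])\.*=CUDA1"


def parse_overrides(spec):
    """Split 'regex=BACKEND,regex=BACKEND' into (compiled regex, backend) pairs."""
    pairs = []
    for item in spec.split(","):
        pattern, backend = item.rsplit("=", 1)
        pairs.append((re.compile(pattern), backend))
    return pairs


def assign(name, overrides, default="default"):
    """Return the backend of the first rule whose regex is found in the name.

    First-match-wins and unanchored search are assumptions of this sketch.
    """
    for pattern, backend in overrides:
        if pattern.search(name):
            return backend
    return default


overrides = parse_overrides(SPEC)

# Hypothetical tensor names around the boundaries of the two layer ranges.
names = [f"blk.{i}.ffn_up.weight" for i in (29, 30, 39, 40, 49, 50)]
placement = {name: assign(name, overrides) for name in names}

for name, backend in placement.items():
    print(f"{name} -> {backend}")
```

Under those assumptions the two rules select disjoint sets of layers (30-39 and 40-49), so the regexes do not overlap and the missing CUDA0 placement would have to come from how the rules are applied rather than from the patterns.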

I also tried the opposite by setting GPU Layers to 99 and overriding specific layers to CPU, but then KoboldCpp ignores the Tensor Split settings and only uses one GPU.

Being able to control individual tensors is especially important when using MoE models. Setting MoE CPU Layers works fine with one GPU, but the Tensor Split settings are again ignored.

This is on Win10, i5-13600KF, RTX 4090 & RTX 3080 Ti, Kobold version 1.99.4.

