mkdir intel-ollama
cd intel-ollama
uv venv --python 3.11
. .\.venv\Scripts\Activate.ps1
uv pip install --pre --upgrade ipex-llm[cpp]
Then run the init script
.\.venv\Scripts\init-ollama.bat
DLL Injection
When I tried to run Ollama, I got the error svml_dispmd.dll not found, followed by a bunch of similar "not found" errors for other DLLs.
The Intel runtimes should already be installed, since they come bundled with ipex-llm[cpp], so we just need to add the directories containing the DLLs to the PATH. Here is the short script I used to inject those DLLs.
$IntelPaths = Get-ChildItem -Path ".\.venv" -Filter "*.dll" -Recurse | Select-Object -ExpandProperty DirectoryName -Unique
$env:Path = ($IntelPaths -join ";") + ";" + $env:Path
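To check whether the injection worked, you can look for the DLL from the error message on the PATH. Below is a minimal Python sketch (the venv already provides Python 3.11). The helper name find_on_path is made up for illustration and is not part of ipex-llm.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return the first directory of a PATH-style string that contains filename, else None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for entry in path_value.split(os.pathsep):
        # Skip empty entries, which appear when PATH has a trailing separator
        if entry and (Path(entry) / filename).is_file():
            return entry
    return None


# svml_dispmd.dll is the DLL named in the error above
print(find_on_path("svml_dispmd.dll") or "svml_dispmd.dll is not on PATH")
```

If this prints a directory inside .venv, the terminal you ran it from has the injected paths.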
Graphics Driver
After that, Ollama launched, but running any model failed with the error unsupported SPIR-V version number 'unknown (66560)'. This happens because the driver needs to support at least SPIR-V 1.4, but my graphics driver only supported SPIR-V 1.3.
To fix it, I updated the graphics driver to the latest Intel Arc & Iris Xe Graphics Drivers (31.0.101.xxxx or newer) using the Official Updater.
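The number in the error message is the SPIR-V version word itself. The SPIR-V specification packs it as 0x00MMmm00 (major version in bits 16 to 23, minor version in bits 8 to 15), so it can be decoded with a few lines of Python. The function name here is only illustrative.

```python
def decode_spirv_version(word):
    """Split a SPIR-V version word (layout 0x00MMmm00) into (major, minor)."""
    major = (word >> 16) & 0xFF
    minor = (word >> 8) & 0xFF
    return major, minor


# 66560 == 0x10400, the value from the error message
print(decode_spirv_version(66560))  # (1, 4)
```

So 66560 means SPIR-V 1.4, which matches the version the old driver could not handle.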
Run Ollama
To run Ollama itself, we need to set some environment variables first. I used the script below to do it. It also handles the .venv activation and the DLL injection, so everything is set up in one step.
set_env.ps1
# Activate the virtual environment if it is not active yet
if ($null -eq $env:VIRTUAL_ENV) {
    . .\.venv\Scripts\Activate.ps1
}

# Prepend every directory under .venv that contains DLLs to PATH
$IntelPaths = Get-ChildItem -Path ".\.venv" -Filter "*.dll" -Recurse | Select-Object -ExpandProperty DirectoryName -Unique
$env:Path = ($IntelPaths -join ";") + ";" + $env:Path

$env:SYCL_DEVICE_FILTER = "level_zero:gpu"
$env:ONEAPI_DEVICE_SELECTOR = "level_zero:0"
$env:ZES_ENABLE_SYSMAN = "1"
$env:OLLAMA_INTEL_GPU = "true"
$env:OLLAMA_NUM_GPU = "999"
$env:OLLAMA_CONTEXT_LENGTH = "8192" # for IPEX-LLM
$env:OLLAMA_NUM_CTX = "8192" # for Ollama Server
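If you are not sure whether the variables actually reached your current terminal, a small Python check can list the ones that are missing. This is only a sketch: the variable names are copied from set_env.ps1, and missing_vars is a hypothetical helper name.

```python
import os

# Variable names copied from set_env.ps1
EXPECTED_VARS = [
    "SYCL_DEVICE_FILTER",
    "ONEAPI_DEVICE_SELECTOR",
    "ZES_ENABLE_SYSMAN",
    "OLLAMA_INTEL_GPU",
    "OLLAMA_NUM_GPU",
    "OLLAMA_CONTEXT_LENGTH",
    "OLLAMA_NUM_CTX",
]


def missing_vars(env=None):
    """Return the expected variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name)]


print("missing:", missing_vars() or "none")
```

An empty result means the script's variables are visible to processes started from that terminal.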
start_ollama.ps1
. .\set_env.ps1
Stop-Process -Name ollama -Force -ErrorAction SilentlyContinue
.\ollama.exe serve
When you run set_env.ps1 yourself, call it as . .\set_env.ps1, with the dot at the beginning, so the environment variables stay in your current terminal.
Once the Ollama server is running via .\start_ollama.ps1, you can open a new terminal, run . .\set_env.ps1 there, and then use Ollama as usual.
ollama run phi3:mini
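Besides the CLI, the running server can also be queried over HTTP. The sketch below assumes this build listens on Ollama's default address (http://localhost:11434) and exposes the standard /api/generate endpoint. I have not confirmed this for the IPEX-LLM build specifically, and the function names are made up for illustration.

```python
import json
import urllib.request


def build_generate_request(prompt, model="phi3:mini", host="http://localhost:11434"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

With the server from start_ollama.ps1 running, calling generate("Why is the sky blue?") should return the model's answer as a string.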
Results
With this setup, I tried Phi-3 Mini (3.8B) and it runs with 33/33 layers offloaded to the GPU. Finally I can run LLMs locally >:)