[Bug] LoRA Loaded but Not Applied: Identical seed w/ and w/o LoRA produces identical output on Qwen-Image (Vulkan) #904

Open
Labels: bug (Something isn't working)
@rodrigomatta

Description

Git commit

$ git rev-parse HEAD
d05e46c

Operating System & Version

Ubuntu 22.04

GGML backends

Vulkan

Command-line arguments used

./bin/sd --diffusion-model "/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf" --vae "/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors" --qwen2vl "/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf" --lora-model-dir "/home/user/ComfyUI-0.3.60/models/loras/qwen-image" -p "high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign "DRUNKARD" on bar with retrowave font. long pink nailpolish <lora:Samsung:1>" --cfg-scale 3.5 --steps 20 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 --vae-on-cpu --seed 244727409499015

Steps to reproduce

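  1. Use the models referenced in the commands under "Logs" below: Qwen_Image-Q5_0.gguf (diffusion model), qwen_image_vae.safetensors (VAE), Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf (text encoder), and the Samsung.safetensors LoRA.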

  2. Run the first command (without LoRA) using --seed 244727409499015 and keep a copy of the resulting output.png.

  3. Run the second command (with <lora:Samsung:1>) using the exact same --seed 244727409499015 and keep a copy of the resulting output.png.

  4. Compare the two output images. They will be identical.
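
Step 4 can be checked mechanically instead of by eye. The following is a minimal sketch, not part of the original report: it assumes the two runs were copied to hypothetical files such as no_lora.png and with_lora.png (both commands write to output.png by default, per the logs). It compares only the image header and the decompressed image data, and ignores text chunks in case the tool embeds the prompt in PNG metadata, which would make a plain byte comparison differ even for identical pixels. The helper names are illustrative.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_pixels(data: bytes) -> tuple[bytes, bytes]:
    """Return (IHDR payload, decompressed IDAT stream) of a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    header = b""
    compressed = bytearray()
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            header = payload
        elif chunk_type == b"IDAT":
            compressed += payload
        elif chunk_type == b"IEND":
            break
        pos += 12 + length
    return header, zlib.decompress(bytes(compressed))


def same_image_data(a: bytes, b: bytes) -> bool:
    """True when both PNGs have the same header and the same image data."""
    return read_png_pixels(a) == read_png_pixels(b)
```

Usage would be along the lines of `same_image_data(Path("no_lora.png").read_bytes(), Path("with_lora.png").read_bytes())` with `pathlib.Path`. A True result means the pixel data is identical. A False result from two files written by the same binary most likely means the pixels differ, although in principle it could also reflect different PNG filter choices.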

What you expected to happen

The image generated with the LoRA <lora:Samsung:1> and seed 244727409499015 should be noticeably different from the image generated without the LoRA using the same seed.

The LoRA weights should be applied during the sampling process, altering the final image.
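
For reference, this expectation follows from the usual LoRA formulation, in which each targeted weight W is replaced by W + multiplier * (alpha / rank) * (up x down). The sketch below uses plain Python lists and is only an illustration: `matmul`, `apply_lora`, and the toy matrices are hypothetical, and it says nothing about how stable-diffusion.cpp merges LoRA deltas into quantized (q5_0) weights.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_up, lora_down, multiplier=1.0, alpha=None):
    """Return weight + scale * (lora_up @ lora_down) as a new matrix.

    lora_down has shape (rank x in_features), lora_up (out_features x rank).
    When alpha is None the scale is just the multiplier (an assumption made
    here for simplicity).
    """
    rank = len(lora_down)
    scale = multiplier * (alpha / rank if alpha is not None else 1.0)
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]      # 2 x 1
lora_down = [[0.5, 0.25]]     # 1 x 2

merged = apply_lora(weight, lora_up, lora_down, multiplier=1.0)
unchanged = apply_lora(weight, lora_up, lora_down, multiplier=0.0)
```

With a multiplier of 1 the merged weight differs from the original, so the same seed should yield a different image. Only a multiplier of 0, or a delta that never reaches the weights used for sampling, would leave the output unchanged.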

What actually happened

The LoRA appears to load correctly. The log output confirms it was found and "applied":

[INFO ] stable-diffusion.cpp:929 - lora 'Samsung' applied, taking 1.87s
[INFO ] stable-diffusion.cpp:969 - apply_loras completed, taking 1.87s

However, the final output.png generated with the LoRA is identical to the output.png generated without the LoRA, despite using the same fixed seed.

This indicates the LoRA weights are not actually being used during the sampling steps.
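
The LoRA-related lines can be pulled out of a saved verbose log to compare runs. The following is a minimal sketch, assuming the `-v` output was captured into a string. The patterns are based only on the log lines quoted in this report, and `summarize_lora_log` is a hypothetical helper, not part of stable-diffusion.cpp.

```python
import re

ATTEMPT_RE = re.compile(r"attempting to apply (\d+) LoRAs")
TENSORS_RE = re.compile(r"\((\d+) / (\d+)\) LoRA tensors will be applied")
APPLIED_RE = re.compile(r"lora '([^']+)' applied")


def summarize_lora_log(log_text: str) -> dict:
    """Collect what the log claims about LoRA loading and application."""
    attempts = [int(n) for n in ATTEMPT_RE.findall(log_text)]
    return {
        # Number of LoRAs the run tried to apply (0 if the line is absent).
        "requested": attempts[-1] if attempts else 0,
        # (matched, total) pairs from each "tensors will be applied" line.
        "tensor_counts": [(int(m), int(t)) for m, t in TENSORS_RE.findall(log_text)],
        # Names of LoRAs reported as applied.
        "applied": APPLIED_RE.findall(log_text),
    }
```

For the with-LoRA log below, this would report one requested LoRA, (480, 480) tensor counts, and 'Samsung' as applied. The identical output is therefore not explained by skipped or unmatched tensors, at least as far as the log shows.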

Logs / error messages / stack trace

  • output without lora:
    ./bin/sd --diffusion-model "/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf" --vae "/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors" --qwen2vl "/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf" -p "high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign "DRUNKARD" on bar with retrowave font. long pink nailpolish" --cfg-scale 3.5 --steps 20 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 --vae-on-cpu --seed 244727409499015
    Option:
    n_threads: 14
    mode: img_gen
    model_path:
    wtype: unspecified
    clip_l_path:
    clip_g_path:
    clip_vision_path:
    t5xxl_path:
    qwen2vl_path: /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf
    qwen2vl_vision_path:
    diffusion_model_path: /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf
    high_noise_diffusion_model_path:
    vae_path: /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors
    taesd_path:
    esrgan_path:
    control_net_path:
    embedding_dir:
    photo_maker_path:
    pm_id_images_dir:
    pm_id_embed_path:
    pm_style_strength: 20.00
    output_path: output.png
    init_image_path:
    end_image_path:
    mask_image_path:
    control_image_path:
    ref_images_paths:
    control_video_path:
    auto_resize_ref_image: true
    increase_ref_index: false
    offload_params_to_cpu: true
    clip_on_cpu: false
    control_net_cpu: false
    vae_on_cpu: true
    diffusion flash attention: true
    diffusion Conv2d direct: false
    vae_conv_direct: false
    control_strength: 0.90
    prompt: high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish
    negative_prompt:
    clip_skip: -1
    width: 1024
    height: 1024
    sample_params: (txt_cfg: 3.50, img_cfg: 3.50, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: euler, sample_steps: 20, eta: 0.00, shifted_timestep: 0)
    high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: default, sample_steps: -1, eta: 0.00, shifted_timestep: 0)
    moe_boundary: 0.875
    prediction: default
    flow_shift: 3.00
    strength(img2img): 0.75
    rng: cuda
    seed: 244727409499015
    batch_count: 1
    vae_tiling: false
    force_sdxl_vae_conv_scale: false
    upscale_repeats: 1
    chroma_use_dit_mask: true
    chroma_use_t5_mask: false
    chroma_t5_mask_pad: 1
    video_frames: 1
    vace_strength: 1.00
    fps: 16
    System Info:
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
    [DEBUG] stable-diffusion.cpp:151 - Using Vulkan backend
    [DEBUG] ggml_extend.hpp:66 - ggml_vulkan: Found 1 Vulkan devices:
    [DEBUG] ggml_extend.hpp:66 - ggml_vulkan: 0 = Radeon RX 7900 XT (RADV NAVI31) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
    [INFO ] stable-diffusion.cpp:207 - loading diffusion model from '/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf'
    [INFO ] model.cpp:1097 - load /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf using gguf format
    [DEBUG] model.cpp:1114 - init from '/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf'
    [INFO ] stable-diffusion.cpp:254 - loading qwen2vl from '/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf'
    [INFO ] model.cpp:1097 - load /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf using gguf format
    [DEBUG] model.cpp:1114 - init from '/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf'
    [INFO ] stable-diffusion.cpp:268 - loading vae from '/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors'
    [INFO ] model.cpp:1100 - load /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors using safetensors format
    [DEBUG] model.cpp:1207 - init from '/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors', prefix = 'vae.'
    [INFO ] stable-diffusion.cpp:289 - Version: Qwen Image
    [INFO ] stable-diffusion.cpp:316 - Weight type stat: f32: 1422 | q5_0: 720 | q5_1: 120 | q4_K: 111 | q5_K: 25 | q6_K: 41 | iq4_xs: 20 | bf16: 6
    [INFO ] stable-diffusion.cpp:317 - Conditioner weight type stat: f32: 141 | q4_K: 111 | q5_K: 25 | q6_K: 41 | iq4_xs: 20
    [INFO ] stable-diffusion.cpp:318 - Diffusion model weight type stat: f32: 1087 | q5_0: 720 | q5_1: 120 | bf16: 6
    [INFO ] stable-diffusion.cpp:319 - VAE weight type stat: f32: 194
    [DEBUG] stable-diffusion.cpp:321 - ggml tensor size = 400 bytes
    [INFO ] stable-diffusion.cpp:348 - Using flash attention in the diffusion model
    [DEBUG] qwenvl.hpp:141 - merges size 151387
    [DEBUG] qwenvl.hpp:163 - vocab size: 151665
    [INFO ] qwen_image.hpp:539 - qwen_image_params.num_layers: 60
    [DEBUG] ggml_extend.hpp:1754 - qwenvl2.5 params backend buffer size = 5918.09 MB(RAM) (338 tensors)
    [DEBUG] ggml_extend.hpp:1754 - qwen_image params backend buffer size = 13733.54 MB(RAM) (1933 tensors)
    [INFO ] stable-diffusion.cpp:478 - VAE Autoencoder: Using CPU backend
    [DEBUG] ggml_extend.hpp:1754 - wan_vae params backend buffer size = 139.84 MB(RAM) (108 tensors)
    [DEBUG] stable-diffusion.cpp:596 - loading weights
    [DEBUG] model.cpp:2030 - using 14 threads for model loading
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf
    |=======================================> | 1933/2465 - 1177.94it/s
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf
    |==============================================> | 2271/2465 - 131.23it/s
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors
    |==================================================| 2465/2465 - 137.66it/s
    [INFO ] model.cpp:2351 - loading tensors completed, taking 17.92s (process: 0.02s, read: 17.18s, memcpy: 0.00s, convert: 0.09s, copy_to_backend: 0.00s)
    [INFO ] stable-diffusion.cpp:679 - total params memory size = 19791.47MB (VRAM 19651.64MB, RAM 139.84MB): text_encoders 5918.09MB(VRAM), diffusion_model 13733.55MB(VRAM), vae 139.84MB(RAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
    [INFO ] stable-diffusion.cpp:778 - running in FLOW mode
    [DEBUG] stable-diffusion.cpp:803 - finished loaded file
    [DEBUG] stable-diffusion.cpp:2470 - generate_image 1024x1024
    [INFO ] stable-diffusion.cpp:2597 - TXT2IMG
    [INFO ] stable-diffusion.cpp:949 - attempting to apply 0 LoRAs
    [INFO ] stable-diffusion.cpp:969 - apply_loras completed, taking 0.00s
    [DEBUG] stable-diffusion.cpp:970 - prompt after extract and remove lora: "high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish"
    [DEBUG] conditioner.hpp:1432 - parse '<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish<|im_end|>
    <|im_start|>assistant
    ' to [['<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish<|im_end|>
    <|im_start|>assistant
    ', 1], ]
    [INFO ] ggml_extend.hpp:1677 - qwenvl2.5 offload params (5918.09 MB, 338 tensors) to runtime backend (Vulkan0), taking 2.20s
    [DEBUG] ggml_extend.hpp:1579 - qwenvl2.5 compute buffer size: 38.04 MB(VRAM)
    [DEBUG] conditioner.hpp:1572 - computing condition graph completed, taking 2429 ms
    [DEBUG] conditioner.hpp:1432 - parse '<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    <|im_end|>
    <|im_start|>assistant
    ' to [['<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    <|im_end|>
    <|im_start|>assistant
    ', 1], ]
    [INFO ] ggml_extend.hpp:1677 - qwenvl2.5 offload params (5918.09 MB, 338 tensors) to runtime backend (Vulkan0), taking 2.20s
    [DEBUG] ggml_extend.hpp:1579 - qwenvl2.5 compute buffer size: 7.24 MB(VRAM)
    [DEBUG] conditioner.hpp:1572 - computing condition graph completed, taking 2312 ms
    [INFO ] stable-diffusion.cpp:2208 - get_learned_condition completed, taking 4745 ms
    [INFO ] stable-diffusion.cpp:2233 - sampling using Euler method
    [INFO ] stable-diffusion.cpp:2327 - generating image: 1/1 - seed 244727409499015
    [INFO ] ggml_extend.hpp:1677 - qwen_image offload params (13733.54 MB, 1933 tensors) to runtime backend (Vulkan0), taking 4.89s
    [DEBUG] ggml_extend.hpp:1579 - qwen_image compute buffer size: 590.52 MB(VRAM)
    |==================================================| 20/20 - 8.49s/it
    [INFO ] stable-diffusion.cpp:2364 - sampling completed, taking 169.98s
    [INFO ] stable-diffusion.cpp:2372 - generating 1 latent images completed, taking 170.96s
    [INFO ] stable-diffusion.cpp:2375 - decoding 1 latents
    [DEBUG] ggml_extend.hpp:1579 - wan_vae compute buffer size: 7492.50 MB(RAM)
    [DEBUG] stable-diffusion.cpp:1666 - computing vae decode graph completed, taking 75.65s
    [INFO ] stable-diffusion.cpp:2385 - latent 1 decoded, taking 75.65s
    [INFO ] stable-diffusion.cpp:2389 - decode_first_stage completed, taking 75.65s
    [INFO ] stable-diffusion.cpp:2709 - generate_image completed in 251.37s
    save result PNG image to 'output.png'

  • output with lora:
    ./bin/sd --diffusion-model "/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf" --vae "/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors" --qwen2vl "/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf" --lora-model-dir "/home/user/ComfyUI-0.3.60/models/loras/qwen-image" -p "high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign "DRUNKARD" on bar with retrowave font. long pink nailpolish <lora:Samsung:1>" --cfg-scale 3.5 --steps 20 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 --vae-on-cpu --seed 244727409499015
    Option:
    n_threads: 14
    mode: img_gen
    model_path:
    wtype: unspecified
    clip_l_path:
    clip_g_path:
    clip_vision_path:
    t5xxl_path:
    qwen2vl_path: /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf
    qwen2vl_vision_path:
    diffusion_model_path: /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf
    high_noise_diffusion_model_path:
    vae_path: /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors
    taesd_path:
    esrgan_path:
    control_net_path:
    embedding_dir:
    photo_maker_path:
    pm_id_images_dir:
    pm_id_embed_path:
    pm_style_strength: 20.00
    output_path: output.png
    init_image_path:
    end_image_path:
    mask_image_path:
    control_image_path:
    ref_images_paths:
    control_video_path:
    auto_resize_ref_image: true
    increase_ref_index: false
    offload_params_to_cpu: true
    clip_on_cpu: false
    control_net_cpu: false
    vae_on_cpu: true
    diffusion flash attention: true
    diffusion Conv2d direct: false
    vae_conv_direct: false
    control_strength: 0.90
    prompt: high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish <lora:Samsung:1>
    negative_prompt:
    clip_skip: -1
    width: 1024
    height: 1024
    sample_params: (txt_cfg: 3.50, img_cfg: 3.50, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: euler, sample_steps: 20, eta: 0.00, shifted_timestep: 0)
    high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: default, sample_steps: -1, eta: 0.00, shifted_timestep: 0)
    moe_boundary: 0.875
    prediction: default
    flow_shift: 3.00
    strength(img2img): 0.75
    rng: cuda
    seed: 244727409499015
    batch_count: 1
    vae_tiling: false
    force_sdxl_vae_conv_scale: false
    upscale_repeats: 1
    chroma_use_dit_mask: true
    chroma_use_t5_mask: false
    chroma_t5_mask_pad: 1
    video_frames: 1
    vace_strength: 1.00
    fps: 16
    System Info:
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
    [DEBUG] stable-diffusion.cpp:151 - Using Vulkan backend
    [DEBUG] ggml_extend.hpp:66 - ggml_vulkan: Found 1 Vulkan devices:
    [DEBUG] ggml_extend.hpp:66 - ggml_vulkan: 0 = Radeon RX 7900 XT (RADV NAVI31) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
    [INFO ] stable-diffusion.cpp:207 - loading diffusion model from '/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf'
    [INFO ] model.cpp:1097 - load /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf using gguf format
    [DEBUG] model.cpp:1114 - init from '/home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf'
    [INFO ] stable-diffusion.cpp:254 - loading qwen2vl from '/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf'
    [INFO ] model.cpp:1097 - load /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf using gguf format
    [DEBUG] model.cpp:1114 - init from '/home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf'
    [INFO ] stable-diffusion.cpp:268 - loading vae from '/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors'
    [INFO ] model.cpp:1100 - load /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors using safetensors format
    [DEBUG] model.cpp:1207 - init from '/home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors', prefix = 'vae.'
    [INFO ] stable-diffusion.cpp:289 - Version: Qwen Image
    [INFO ] stable-diffusion.cpp:316 - Weight type stat: f32: 1422 | q5_0: 720 | q5_1: 120 | q4_K: 111 | q5_K: 25 | q6_K: 41 | iq4_xs: 20 | bf16: 6
    [INFO ] stable-diffusion.cpp:317 - Conditioner weight type stat: f32: 141 | q4_K: 111 | q5_K: 25 | q6_K: 41 | iq4_xs: 20
    [INFO ] stable-diffusion.cpp:318 - Diffusion model weight type stat: f32: 1087 | q5_0: 720 | q5_1: 120 | bf16: 6
    [INFO ] stable-diffusion.cpp:319 - VAE weight type stat: f32: 194
    [DEBUG] stable-diffusion.cpp:321 - ggml tensor size = 400 bytes
    [INFO ] stable-diffusion.cpp:348 - Using flash attention in the diffusion model
    [DEBUG] qwenvl.hpp:141 - merges size 151387
    [DEBUG] qwenvl.hpp:163 - vocab size: 151665
    [INFO ] qwen_image.hpp:539 - qwen_image_params.num_layers: 60
    [DEBUG] ggml_extend.hpp:1754 - qwenvl2.5 params backend buffer size = 5918.09 MB(RAM) (338 tensors)
    [DEBUG] ggml_extend.hpp:1754 - qwen_image params backend buffer size = 13733.54 MB(RAM) (1933 tensors)
    [INFO ] stable-diffusion.cpp:478 - VAE Autoencoder: Using CPU backend
    [DEBUG] ggml_extend.hpp:1754 - wan_vae params backend buffer size = 139.84 MB(RAM) (108 tensors)
    [DEBUG] stable-diffusion.cpp:596 - loading weights
    [DEBUG] model.cpp:2030 - using 14 threads for model loading
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/unet/Qwen_Image-Q5_0.gguf
    |=======================================> | 1933/2465 - 33.63it/s
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/clip/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf
    |==============================================> | 2271/2465 - 38.52it/s
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/vae/qwen_image_vae.safetensors
    |==================================================| 2465/2465 - 41.67it/s
    [INFO ] model.cpp:2351 - loading tensors completed, taking 59.17s (process: 0.02s, read: 57.54s, memcpy: 0.00s, convert: 0.11s, copy_to_backend: 0.00s)
    [INFO ] stable-diffusion.cpp:679 - total params memory size = 19791.47MB (VRAM 19651.64MB, RAM 139.84MB): text_encoders 5918.09MB(VRAM), diffusion_model 13733.55MB(VRAM), vae 139.84MB(RAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
    [INFO ] stable-diffusion.cpp:778 - running in FLOW mode
    [DEBUG] stable-diffusion.cpp:803 - finished loaded file
    [DEBUG] stable-diffusion.cpp:2470 - generate_image 1024x1024
    [INFO ] stable-diffusion.cpp:2597 - TXT2IMG
    [DEBUG] stable-diffusion.cpp:964 - lora Samsung:1.00
    [INFO ] stable-diffusion.cpp:949 - attempting to apply 1 LoRAs
    [INFO ] model.cpp:1100 - load /home/user/ComfyUI-0.3.60/models/loras/qwen-image/Samsung.safetensors using safetensors format
    [DEBUG] model.cpp:1207 - init from '/home/user/ComfyUI-0.3.60/models/loras/qwen-image/Samsung.safetensors', prefix = ''
    [INFO ] lora.hpp:120 - loading LoRA from '/home/user/ComfyUI-0.3.60/models/loras/qwen-image/Samsung.safetensors'
    [DEBUG] model.cpp:2030 - using 14 threads for model loading
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/loras/qwen-image/Samsung.safetensors
    |==================================================| 480/480 - 480000.00it/s
    [INFO ] model.cpp:2351 - loading tensors completed, taking 0.03s (process: 0.03s, read: 0.00s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.00s)
    [DEBUG] ggml_extend.hpp:1754 - lora params backend buffer size = 90.00 MB(VRAM) (480 tensors)
    [DEBUG] model.cpp:2030 - using 14 threads for model loading
    [DEBUG] model.cpp:2113 - loading tensors from /home/user/ComfyUI-0.3.60/models/loras/qwen-image/Samsung.safetensors
    |==================================================| 480/480 - 2388.06it/s
    [INFO ] model.cpp:2351 - loading tensors completed, taking 0.26s (process: 0.06s, read: 0.18s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.00s)
    [DEBUG] lora.hpp:175 - lora type: ".lora_down"/".lora_up"
    [DEBUG] lora.hpp:177 - finished loaded lora
    [DEBUG] lora.hpp:874 - (480 / 480) LoRA tensors will be applied
    [DEBUG] ggml_extend.hpp:1579 - lora compute buffer size: 1593.56 MB(VRAM)
    [DEBUG] lora.hpp:874 - (480 / 480) LoRA tensors will be applied
    [INFO ] stable-diffusion.cpp:929 - lora 'Samsung' applied, taking 1.87s
    [INFO ] stable-diffusion.cpp:969 - apply_loras completed, taking 1.87s
    [DEBUG] stable-diffusion.cpp:970 - prompt after extract and remove lora: "high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish "
    [DEBUG] conditioner.hpp:1432 - parse '<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish <|im_end|>
    <|im_start|>assistant
    ' to [['<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    high quality raw candid photo. eastern european young woman, late teens, slim. detailed skin texture, detailed blue eyes. pinterest style. natural late afternoon sunlight, on an outdoor rooftop bar patio, thick eyebrows, long wavy dark brown hair, wearing a pink pvc crop top, a black skirt with a white floral pattern, visible makeup, black sunglasses on her head, a silver moon pendant on a black cord, a silver chain-link bracelet, sitting at a wooden table and holding cup of coffee in her right hand, her body angled towards the camera while her head is turned to her right, with a thinking expression with left hand near chin, gazing off-camera towards the city view in the background where a bar area is also visible, neon sign DRUNKARD on bar with retrowave font. long pink nailpolish <|im_end|>
    <|im_start|>assistant
    ', 1], ]
    [INFO ] ggml_extend.hpp:1677 - qwenvl2.5 offload params (5918.09 MB, 338 tensors) to runtime backend (Vulkan0), taking 2.13s
    [DEBUG] ggml_extend.hpp:1579 - qwenvl2.5 compute buffer size: 38.23 MB(VRAM)
    [DEBUG] conditioner.hpp:1572 - computing condition graph completed, taking 2354 ms
    [DEBUG] conditioner.hpp:1432 - parse '<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    <|im_end|>
    <|im_start|>assistant
    ' to [['<|im_start|>system
    Describe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>
    <|im_start|>user
    <|im_end|>
    <|im_start|>assistant
    ', 1], ]
    [INFO ] ggml_extend.hpp:1677 - qwenvl2.5 offload params (5918.09 MB, 338 tensors) to runtime backend (Vulkan0), taking 2.13s
    [DEBUG] ggml_extend.hpp:1579 - qwenvl2.5 compute buffer size: 7.24 MB(VRAM)
    [DEBUG] conditioner.hpp:1572 - computing condition graph completed, taking 2247 ms
    [INFO ] stable-diffusion.cpp:2208 - get_learned_condition completed, taking 6473 ms
    [INFO ] stable-diffusion.cpp:2233 - sampling using Euler method
    [INFO ] stable-diffusion.cpp:2327 - generating image: 1/1 - seed 244727409499015
    [INFO ] ggml_extend.hpp:1677 - qwen_image offload params (13733.54 MB, 1933 tensors) to runtime backend (Vulkan0), taking 4.80s
    [DEBUG] ggml_extend.hpp:1579 - qwen_image compute buffer size: 590.57 MB(VRAM)
    |==================================================| 20/20 - 8.48s/it
    [INFO ] stable-diffusion.cpp:2364 - sampling completed, taking 169.71s
    [INFO ] stable-diffusion.cpp:2372 - generating 1 latent images completed, taking 170.67s
    [INFO ] stable-diffusion.cpp:2375 - decoding 1 latents
    [DEBUG] ggml_extend.hpp:1579 - wan_vae compute buffer size: 7492.50 MB(RAM)
    [DEBUG] stable-diffusion.cpp:1666 - computing vae decode graph completed, taking 75.84s
    [INFO ] stable-diffusion.cpp:2385 - latent 1 decoded, taking 75.85s
    [INFO ] stable-diffusion.cpp:2389 - decode_first_stage completed, taking 75.85s
    [INFO ] stable-diffusion.cpp:2709 - generate_image completed in 253.00s
    save result PNG image to 'output.png'

[Two attached images, not reproduced here]

Additional context / environment details

  • OS: Ubuntu 22.04 (Kernel 6.8.0-85-generic)
  • CPU: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
  • GPU: AMD Radeon RX 7900 XT/XTX (Navi 31)
  • Driver: amdgpu
  • Backend: Vulkan
  • RAM: 48GiB DDR4 (2x8GB @ 2400MHz, 2x16GB @ 3200MHz)
