Metal performance increases for MacOS...coming soon?! #301
Thanks for the great work guys!! 🙏
I'm running macOS on an Apple Silicon M2. I've compiled both the Metal build and the standard CPU build. At the moment the Metal (GPU) build runs at about 75% of the speed of the CPU build 😟
I'm VERY keen to see those Mac Metal GPU optimisations!! With llama.cpp, the GPU-over-CPU speedup on macOS Metal is in the order of 8-10x... so perhaps we could see something like a 15x improvement for stable-diffusion.cpp on the macOS GPU when those GPU optimisations arrive!!
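For what it's worth, here is a rough back-of-the-envelope sketch of what those figures would imply. It assumes (and this is only an assumption) that stable-diffusion.cpp could eventually reach the same 8-10x GPU-over-CPU ratio quoted for llama.cpp, starting from the 0.75x measured today. The function name is just illustrative.

```python
def implied_gpu_speedup(current_gpu_vs_cpu: float, target_gpu_vs_cpu: float) -> float:
    """Factor by which the GPU path itself would need to speed up.

    current_gpu_vs_cpu: GPU speed relative to CPU today (0.75 means 75% of CPU speed).
    target_gpu_vs_cpu:  hoped-for GPU speed relative to CPU (assumed, not measured).
    """
    return target_gpu_vs_cpu / current_gpu_vs_cpu


# Figures taken from the post above; the 8x and 10x targets are borrowed from llama.cpp.
low = implied_gpu_speedup(0.75, 8.0)
high = implied_gpu_speedup(0.75, 10.0)
print(f"Implied speedup of the Metal path: {low:.1f}x to {high:.1f}x")
```

Under that assumption the Metal path would need to get roughly 11-13x faster than it is now, so the 15x guess is in the same ballpark but a little on the optimistic side.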
Replies: 1 comment
When I try the Metal build I see errors like the ones below. Has anyone seen this? What should I do to fix it?
program_source:6357:127: error: use of undeclared identifier 'block_q4_1'
template [[host_name("kernel_mul_mv_id_q4_1_f32")]] kernel kernel_mul_mv_id_t kernel_mul_mv_id<mmv_fn<mul_vec_q_n_f32_impl<block_q4_1, N_DST, N_SIMDGROUP, N_SIMDWIDTH>>>;
^
program_source:6358:127: error: use of undeclared identifier 'block_q5_0'
template [[host_name("kernel_mul_mv_id_q5_0_f32")]] kernel kernel_mul_mv_id_t kernel_mul_mv_id<mmv_fn<mul_vec_q_n_f32_impl<block_q5_0, N_DST, N_SIMDGROUP, N_SIMDWIDTH>>>;
^
program_source:6359:127: error: use of undeclared identifier 'block_q5_1'
template [[host_name("kernel_mul_mv_id_q5_1_f32")]] kernel kernel_mul_mv_id_t kernel_mul_mv_id<mmv_fn<mul_vec_q_n_f32_impl<block_q5_1, N_DST, N_SIMDGROUP, N_SIMDWIDTH>>>;
^
}
ggml_backend_metal_init: error: failed to allocate context