Error for exllamav2_kernel, running TGI on Google Colab #1762
-
I'm trying to run the TGI launcher on Google Colab after installing it locally, but I keep getting error messages saying that the kernel is not installed.
text-generation-launcher --model-id bigcode/starcoder2-3b --sharded false --quantize bitsandbytes-fp4
ERROR text_generation_launcher: exllamav2_kernels not installed.
ERROR text_generation_launcher: Shard 0 failed to start
I keep getting these errors even though I cloned and installed the turboderp/exllamav2 repo from GitHub.
It seems like a simple issue, but can anyone help me solve it?
I'm running locally because Google Colab doesn't let me use the Docker container for running TGI.
Or is there a better way to do this?
Thank you.
Replies: 2 comments 4 replies
-
The exllamav2_kernels mentioned in the error are built into the TGI app (likely in transformers). They are enabled by default: disable_exllamav2=False in load_quantized_model().
Ensure you have the latest version by running pip install --upgrade transformers, or try setting disable_exllamav2 to True.
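Before upgrading anything, it may help to confirm which of the relevant packages the Colab interpreter can actually see. The following is a minimal sketch, not an official TGI check: the module names are taken from the error messages in this thread, and `is_importable` is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located on the current sys.path.

    This only locates the module; it does not execute it, so a compiled
    extension that is found here can still fail when actually imported.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent package, or for
        # already-imported modules that have no usable __spec__.
        return False


if __name__ == "__main__":
    # Names taken from the error output in this thread.
    for name in ("exllamav2_kernels", "transformers", "text_generation_server"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

If exllamav2_kernels shows up as missing here, the kernels were most likely never built for this Python environment, whatever else was cloned or installed.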
-
I see. Then what might be causing this error?
2024-04-18T20:25:12.602524Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
2024-04-18 20:25:08.642575: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-18 20:25:08.642631: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-18 20:25:08.644753: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-18 20:25:09.925706: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
...
ImportError: cannot import name 'PositionRotaryEmbedding' from 'text_generation_server.utils.layers'
(/content/tgi/server/text_generation_server/utils/layers.py) rank=0
2024-04-18T20:14:54.805319Z ERROR text_generation_launcher: Shard 0 failed to start
2024-04-18T20:14:54.805347Z INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
If the exllamav2 kernels are built in, then I guess the error above is what is preventing the launch.
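One way to narrow down an ImportError like the one above is to check which file the module actually resolves to and whether it exposes the expected name. This is only a sketch: `describe_symbol` is a hypothetical helper, and the idea that the launcher and the server checkout come from different versions is an assumption, not something the log confirms.

```python
import importlib


def describe_symbol(module_name: str, symbol: str) -> dict:
    """Report where `module_name` is loaded from and whether it defines `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # the import itself may fail for many reasons
        return {
            "module": module_name,
            "file": None,
            "has_symbol": False,
            "error": repr(exc),
        }
    return {
        "module": module_name,
        "file": getattr(module, "__file__", None),
        "has_symbol": hasattr(module, symbol),
        "error": None,
    }


if __name__ == "__main__":
    # Module and symbol names are copied from the traceback in this thread.
    info = describe_symbol(
        "text_generation_server.utils.layers", "PositionRotaryEmbedding"
    )
    for key, value in info.items():
        print(f"{key}: {value}")
```

If the reported file is the one under /content/tgi/server but has_symbol is False, the code doing the import and the checked-out server source probably do not match, and reinstalling the server from the same commit would be the next thing to try.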
-
It seems like the most useful part of that error is "Could not find TensorRT".
Have you tried pip install tensorrt?
-
Yes, but it still doesn't work.