-
OS:Ubuntu 20.04
CUDA:11.1
Driver Version: 455.23.04
GPU Compute Capability: 8.6
Driver API Version: 11.1
Runtime API Version: 10.2
cuDNN Version: 7.6.
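PaddlePaddle: tried both 2.0.2 and 2.0.1; the problem is the same with each.
I downloaded the code from the baseline notebook of the "2021 Language and Intelligence Technology Competition" event extraction baseline system and ran it on my own machine. It runs extremely slowly, and executing the code fails with: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device. Could someone explain what causes this?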
from paddlenlp.transformers import ErnieForTokenClassification, ErnieForSequenceClassification
from utils import load_dict
label_map = load_dict('./conf/DuEE-Fin/trigger_tag.dict')
id2label = {val: key for key, val in label_map.items()}
print(id2label)
# model = ErnieForTokenClassification.from_pretrained("ernie-1.0", num_classes=len(label_map))
# from paddlenlp.transformers import ErnieForSequenceClassification
model = ErnieForSequenceClassification.from_pretrained("ernie-1.0", num_classes=len(label_map))
Line 13 of the script is the last line above. Full error output:
W0509 13:09:59.510846 14922 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.1, Runtime API Version: 10.2
W0509 13:09:59.515894 14922 device_context.cc:372] device: 0, cuDNN Version: 7.6.
Traceback (most recent call last):
  File "dueStep2.py", line 13, in <module>
    model = ErnieForSequenceClassification.from_pretrained("ernie-1.0", num_classes=len(label_map))
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddlenlp/transformers/model_utils.py", line 229, in from_pretrained
    if k in base_parameters_dict:
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddlenlp/transformers/utils.py", line 83, in __impl__
    init_func(self, *args, **kwargs)
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddlenlp/transformers/ernie/modeling.py", line 203, in __init__
    self.pad_token_id = pad_token_id
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddlenlp/transformers/ernie/modeling.py", line 41, in __init__
    self.word_embeddings = nn.Embedding(
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/nn/layer/common.py", line 1348, in __init__
    self.weight = self.create_parameter(
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 407, in create_parameter
    return self._helper.create_parameter(temp_attr, shape, dtype, is_bias,
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/layer_helper_base.py", line 367, in create_parameter
    return self.main_program.global_block().create_parameter(
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/framework.py", line 2988, in create_parameter
    initializer(param, self)
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/initializer.py", line 557, in __call__
    op = block._prepend_op(
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/framework.py", line 3100, in _prepend_op
    _dygraph_tracer().trace_op(type,
  File "/home/XXX/miniconda3/lib/python3.8/site-packages/paddle/fluid/dygraph/tracer.py", line 43, in trace_op
    self.trace(type, inputs, outputs, attrs,
SystemError: (Fatal) Operator uniform_random raises an thrust::system::system_error exception.
The exception content is :parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device. (at /paddle/paddle/fluid/imperative/tracer.cc:172)
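The first warning line of the log already hints at the mismatch: the GPU reports compute capability 8.6, while the installed Paddle build uses CUDA runtime 10.2. Below is a minimal sketch of that check. The helper names and the simplified table of minimum CUDA major versions per GPU architecture are illustrative assumptions, not part of Paddle.

```python
import re

# Simplified assumption: the first CUDA major version that can build
# binaries for each GPU architecture major version
# (6.x Pascal, 7.x Volta/Turing, 8.x Ampere).
MIN_CUDA_MAJOR = {6: 8, 7: 9, 8: 11}


def parse_device_log(line):
    """Extract (compute capability, runtime version) from the warning line."""
    cap = re.search(r"GPU Compute Capability: (\d+)\.(\d+)", line)
    rt = re.search(r"Runtime API Version: (\d+)\.(\d+)", line)
    if cap is None or rt is None:
        raise ValueError("line does not contain the expected fields")
    return (int(cap[1]), int(cap[2])), (int(rt[1]), int(rt[2]))


def runtime_supports_gpu(capability, runtime):
    """Return True/False, or None when the architecture is not in the table."""
    needed = MIN_CUDA_MAJOR.get(capability[0])
    if needed is None:
        return None
    return runtime[0] >= needed


log_line = (
    "Please NOTE: device: 0, GPU Compute Capability: 8.6, "
    "Driver API Version: 11.1, Runtime API Version: 10.2"
)
capability, runtime = parse_device_log(log_line)
print(capability, runtime, runtime_supports_gpu(capability, runtime))
```

For the log in this post the check returns False: a build against CUDA 10.2 carries no kernels for an 8.x GPU, which is consistent with the "no kernel image is available" error.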
Replies: 10 comments
-
I can't tell the exact cause from this. Could you share the details of your GPU device so that we can use them to diagnose the problem?
-
The GPU is an NVIDIA GeForce RTX 3090 with 24 GB of memory.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.04    Driver Version: 455.23.04    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:1A:00.0 Off |                  N/A |
| 49%   54C    P2   223W / 350W |   3823MiB / 24268MiB |     60%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3090    Off  | 00000000:1B:00.0 Off |                  N/A |
| 30%   26C    P8     8W / 350W |      2MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
-
This looks like it is caused by the CUDA 11 driver. There is a similar issue, #262. You could also first try installing the matching paddle version with conda.
-
OK, I'll try installing the paddle build for CUDA 10.2 directly.
-
conda install paddlepaddle-gpu==2.0.2 cudatoolkit=10.2 -c paddle
Making just this change did not help; the problem persists. I do not have permission to change the system CUDA version directly (CUDA Version: 11.1).
-
conda install paddlepaddle-gpu==2.0.2 cudatoolkit=11.0
Use CUDA 11. 30-series GPUs do not support CUDA 10; CUDA 11 is the minimum.
-
After switching to conda install paddlepaddle-gpu==2.0.2 cudatoolkit=11.0 it still fails: running import paddle and then paddle.utils.run_check() reports the same error.
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
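The run_check() call mentioned here can be wrapped so that it reports the failure as a message instead of aborting the session. This is a minimal sketch that uses only import paddle and paddle.utils.run_check() as they appear in this thread; the function name verify_paddle is made up for illustration.

```python
def verify_paddle():
    """Run Paddle's self-check and return a short status string."""
    try:
        import paddle
    except ImportError:
        return "paddle is not installed in this environment"
    try:
        # Prints "PaddlePaddle is installed successfully!" when everything works.
        paddle.utils.run_check()
    except Exception as exc:  # e.g. the cudaErrorNoKernelImageForDevice error above
        return f"run_check failed: {exc}"
    return "ok"


if __name__ == "__main__":
    print(verify_paddle())
```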
-
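I retried with the two configurations below and it finally works! My guess is that the crux is installing paddlepaddle-gpu==2.0.2.post110; the post110 suffix in the version must not be omitted. I had been using the Tsinghua mirror, which did not have that version, so I fell back to paddlepaddle-gpu==2.0.2, and that would not run. The current setup may still not support multi-GPU parallelism (to check, start the interpreter with python or python3, run import paddle, then paddle.utils.run_check(); if it prints "PaddlePaddle is installed successfully!", the installation succeeded), but at least it now runs on a single card. Thanks, everyone, for your patient support!
Either of the two options below will do. For other installation methods, see https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/conda/linux-conda.html
Option 1: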
conda create -n pd python=3.7 -y
conda activate pd
conda install cudatoolkit=11.0 -y
python -m pip install paddlepaddle-gpu==2.0.2.post110 -f https://paddlepaddle.org.cn/whl/mkl/stable.html
pip install --upgrade paddlenlp -i https://pypi.org/simple
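Since the fix hinges on the .post110 suffix, it can help to confirm which build pip actually installed. Here is a small sketch, assuming the distribution is named paddlepaddle-gpu as in the command above. The helper names are hypothetical, and on Python 3.7 the code relies on the importlib-metadata backport.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport package for Python 3.7


def installed_build(dist_name="paddlepaddle-gpu"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_cuda11_build(version):
    """True only for the build this thread found to work (suffix .post110)."""
    return version is not None and version.endswith(".post110")


if __name__ == "__main__":
    version = installed_build()
    print(version, is_cuda11_build(version))
```

A plain 2.0.2 result would indicate the build that failed in this thread, while 2.0.2.post110 is the one reported to work.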
Option 2:
Save the following as env.yaml, change the prefix on the last line to the path of the paddle environment in your own conda or miniconda installation, then run conda env create -f env.yaml
name: paddle
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- backcall=0.2.0=pyhd3eb1b0_0
- blas=1.0=mkl
- ca-certificates=2021.4.13=h06a4308_1
- certifi=2020.12.5=py37h06a4308_0
- cudatoolkit=11.0.221=h6bb024c_0
- intel-openmp=2021.2.0=h06a4308_610
- ipykernel=5.3.4=py37h5ca1d4c_0
- ipython=7.22.0=py37hb070fc8_0
- ipython_genutils=0.2.0=pyhd3eb1b0_1
- jedi=0.17.0=py37_0
- jupyter_client=6.1.12=pyhd3eb1b0_0
- jupyter_core=4.7.1=py37h06a4308_0
- ld_impl_linux-64=2.33.1=h53a641e_7
- libffi=3.3=he6710b0_2
- libgcc-ng=9.1.0=hdf63c60_0
- libsodium=1.0.18=h7b6447c_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- mkl=2021.2.0=h06a4308_296
- mkl-service=2.3.0=py37h27cfd23_1
- mkl_fft=1.3.0=py37h42c9631_2
- mkl_random=1.2.1=py37ha9443f7_2
- ncurses=6.2=he6710b0_1
- numpy-base=1.20.1=py37h7d8b39e_0
- openssl=1.1.1k=h27cfd23_0
- parso=0.8.2=pyhd3eb1b0_0
- pexpect=4.8.0=pyhd3eb1b0_3
- pickleshare=0.7.5=pyhd3eb1b0_1003
- pip=21.0.1=py37h06a4308_0
- prompt-toolkit=3.0.17=pyh06a4308_0
- ptyprocess=0.7.0=pyhd3eb1b0_2
- pygments=2.8.1=pyhd3eb1b0_0
- python=3.7.10=hdb3f193_0
- python-dateutil=2.8.1=pyhd3eb1b0_0
- pyzmq=20.0.0=py37h2531618_1
- readline=8.1=h27cfd23_0
- setuptools=52.0.0=py37h06a4308_0
- sqlite=3.35.4=hdfb4753_0
- tk=8.6.10=hbc83047_0
- tornado=6.1=py37h27cfd23_0
- tqdm=4.59.0=pyhd3eb1b0_1
- traitlets=5.0.5=pyhd3eb1b0_0
- wcwidth=0.2.5=py_0
- wheel=0.36.2=pyhd3eb1b0_0
- xz=5.2.5=h7b6447c_0
- zeromq=4.3.4=h2531618_0
- zlib=1.2.11=h7b6447c_3
- pip:
  - appdirs==1.4.4
  - astor==0.8.1
  - babel==2.9.1
  - bce-python-sdk==0.8.60
  - cached-property==1.5.2
  - cfgv==3.2.0
  - chardet==4.0.0
  - click==7.1.2
  - colorama==0.4.4
  - colorlog==5.0.1
  - decorator==5.0.7
  - dill==0.3.3
  - distlib==0.3.1
  - filelock==3.0.12
  - flake8==3.9.2
  - flask==1.1.2
  - flask-babel==2.0.0
  - future==0.18.2
  - gast==0.4.0
  - h5py==3.2.1
  - identify==2.2.4
  - idna==2.10
  - importlib-metadata==4.0.1
  - itsdangerous==1.1.0
  - jieba==0.42.1
  - jinja2==2.11.3
  - joblib==1.0.1
  - markupsafe==1.1.1
  - mccabe==0.6.1
  - multiprocess==0.70.11.1
  - nodeenv==1.6.0
  - numpy==1.20.2
  - paddlenlp==2.0.0rc21
  - paddlepaddle-gpu==2.0.2.post110
  - pillow==8.2.0
  - pre-commit==2.12.1
  - protobuf==3.16.0
  - pycodestyle==2.7.0
  - pycryptodome==3.10.1
  - pyflakes==2.3.1
  - pytz==2021.1
  - pyyaml==5.4.1
  - requests==2.25.1
  - scikit-learn==0.24.2
  - scipy==1.6.3
  - seqeval==1.2.2
  - shellcheck-py==0.7.2.1
  - six==1.16.0
  - threadpoolctl==2.1.0
  - toml==0.10.2
  - typing-extensions==3.10.0.0
  - urllib3==1.26.4
  - virtualenv==20.4.6
  - visualdl==2.1.1
  - werkzeug==1.0.1
  - zipp==3.4.1
prefix: /home/XXX/miniconda3/envs/paddle
-
With the Tsinghua mirror, you need to add the Paddle channel in the configuration file, and it must be spelled with an uppercase P (the official channel uses a lowercase p).
-
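Thank you for sharing your valuable experience on this issue. We will also consider adding an FAQ to the documentation to make things easier for future users.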