[Tutorial] add KoBERT tutorial #1230
Conversation
* [CI] Update MXNet master version tested on CI (dmlc#1113)
  * Disable horovod test on master
  * Update link

Co-authored-by: Leonard Lausen <leonard@lausen.nl>
Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>
added kobert_naver_movie for KoBERT tutorial.
* Update Jenkinsfile_py3_cpu_unittest
* Update Jenkinsfile_py3-master_cpu_unittest
Hello Jiyang, would you please merge the latest commit to your branch for this pull request?
#1229 This fixes a cpu-unittest timing restriction which is preventing your commit from being built.
Thanks!
Hello, my branch is v0.9.x and I made the latest commit to that branch. I don't have any other branches. Could you tell me what further steps are required? Thanks.
Hello, this commit was merged into the v0.9.x branch yesterday, and it seems your pull request was opened 2 days ago. So maybe your pull request does not include this commit?
Sorry, but I wonder whether there is an actual need to merge into v0.9.x (the release branch) rather than master (the development branch)?
I am noticing the gpu-doc failures. On the master branch there are some new features that might improve the stability of the doc build and help us debug errors.
Hello, I think you need to change the pull request target branch to dmlc:master. Currently you are still targeting dmlc:v0.9.x
Thanks!
Hello,
Did you type this character by mistake?
I don't know anything about this.
I have checked and found that this comes from a change in the v0.9.x branch (#1219) that is not in the master branch.
It's good and please keep it!
Thanks!
Hello Jiyang,
One of the tests is failing due to unclear errors. Please bear with us while we work on fixes.
@leezu This gpu doc test cannot pass doctest. But it seems that this error is due to a connection error. Maybe we need to do some changes elsewhere?
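Transient connection errors like this are usually handled by retrying with backoff. A generic sketch of the idea (an illustration only, not what the CI actually does; `with_retry` and its parameters are made up here):

```python
import time

def with_retry(fn, attempts=3, backoff=0.01):
    """Call fn(), retrying transient connection failures with
    exponential backoff; re-raise after the last attempt."""
    for i in range(attempts):
        try:
            return fn()
        except (ConnectionError, OSError):
            if i == attempts - 1:
                raise
            time.sleep(backoff * 2 ** i)
```

In a doc build, something like this could wrap the dataset or model download call so a single dropped connection does not fail the whole pipeline.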
any update?
The test failed due to some external dependency. Let me trigger it again and see if it passes
any update?
Hello Jiyang, would you please merge the latest master branch into your pull request, especially to include #1236 ? So that gpu-doc can pass.
Thanks!
@leezu any idea why the err log is missing?
fatal error: An error occurred (404) when calling the HeadObject operation: Key "batch/PR-1230/14/docs/examples/sentiment_analysis/kobert_naver_movie.stderr.log" does not exist
I think this is because ci/batch/submit-job.py failed. That failure is due to the failure of ci/batch/docker/gluon_nlp_job.sh, which in turn is due to the failure of docs/md2ipynb.py. So the root cause is that the conversion of the newly added .md file to an .ipynb file didn't succeed.
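For reference, the md-to-ipynb conversion boils down to splitting the markdown into notebook cells. This is only a minimal stdlib sketch of the idea, not the actual docs/md2ipynb.py (which also executes the notebook):

```python
import json

def md_to_ipynb(md_text):
    """Split markdown into notebook cells: fenced code blocks become
    code cells, everything in between becomes markdown cells."""
    cells, buf, in_code = [], [], False

    def flush(kind):
        src = "\n".join(buf).strip("\n")
        if src:
            cell = {"cell_type": kind, "metadata": {}, "source": src}
            if kind == "code":
                cell.update(execution_count=None, outputs=[])
            cells.append(cell)
        buf.clear()

    for line in md_text.splitlines():
        if line.strip().startswith("```"):
            # A fence toggles between markdown and code mode.
            flush("code" if in_code else "markdown")
            in_code = not in_code
        else:
            buf.append(line)
    flush("code" if in_code else "markdown")
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
```

`json.dumps(md_to_ipynb(text))` then gives you something a notebook viewer can open; when the conversion step fails, the .ipynb for the new tutorial is never produced and the doc build has nothing to render.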
@jamiekang The failure now is in the next and final step of the build process. The `rm` command helped to get to the final step.
Currently the CI fails due to a sphinx warning.
[2020-09-09T06:57:48.867Z] /var/lib/jenkins/gluon-nlp-cpu-py3-master/docs/examples/sentiment_analysis/kobert_naver_movie.ipynb:Could not lex literal_block as "python". Highlighting skipped.
To reproduce locally, you should be able to run `MD2IPYNB_OPTION=--disable_compute make docs_local` on your computer.
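Sphinx uses pygments to highlight the code cells, and a cell that starts with a shell magic like `!rm` is not valid Python, which is what triggers this warning. A quick stdlib way to pre-check whether a cell would even parse as Python (a hedged illustration only; Sphinx itself lexes rather than parses):

```python
import ast

def is_valid_python(source):
    """Return True if source parses as Python syntax, False otherwise."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False
```

Running such a check over each code cell before the doc build would flag shell-magic lines like `!rm -rf ...` early, instead of surfacing them as a highlighting warning late in the pipeline.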
Adding this line generated a new error:
!rm -rf dataset_folder
Should I keep this line or remove it?
You need to update the folder name with the actual dataset path.
changed clean up code.
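A pure-Python cleanup avoids the shell magic entirely, so the cell stays lexable as Python. A minimal sketch (`dataset_folder` here is a placeholder; substitute the actual dataset path):

```python
import os
import shutil
import tempfile

def cleanup(path):
    """Remove the dataset folder if it exists; a no-op otherwise."""
    if os.path.isdir(path):
        shutil.rmtree(path)

# Demo on a throwaway directory standing in for dataset_folder.
demo_dir = tempfile.mkdtemp()
cleanup(demo_dir)
```

Unlike `!rm -rf`, this is also safe to re-run: a second call simply does nothing when the folder is already gone.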
OK, I will fix it. Any idea how to overcome the timeout?
Could not lex literal_block as "python". Highlighting skipped.
kernel.cu(1084): Error: Formal parameter space overflowed (4648 bytes required, max 4096 bytes allowed) in function
What does this mean?
Reshape_expand_dims_expand_dims_..._expand_dims (the fused kernel name repeats "expand_dims" over 100 times)
kernel.cu(1084): Error: Formal parameter space overflowed (4648 bytes required, max 4096 bytes allowed) in function
cc @ptrendx, looks like a failed fusion case.
Hmmm, yeeeaah... So, just out of curiosity - why do you call expand_dims there over 100 times on a single thing?
There's no explicit call to expand_dims() in the source (.md or .ipynb).
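For context: even without explicit calls in the tutorial, frameworks can generate long chains of `expand_dims` internally (e.g. while preparing broadcasts), and each call only inserts one singleton axis, so a long chain is equivalent to a single reshape. A NumPy illustration of that equivalence (NumPy stands in for MXNet here):

```python
import numpy as np

x = np.ones((2, 3))

# A chain of expand_dims: each call appends one singleton axis.
y = x
for _ in range(4):
    y = np.expand_dims(y, axis=-1)

# A single reshape produces the same array in one op -- which is why
# a fused kernel named "expand_dims" 100+ times is a red flag.
z = x.reshape(2, 3, 1, 1, 1, 1)

assert y.shape == z.shape == (2, 3, 1, 1, 1, 1)
```

Each fused op contributes kernel parameters, so a hundred-op chain can blow past the 4096-byte parameter limit that the error above reports.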
I don't think this error comes from your PR - it happens in test_xlnet_finetune_glue[MRPC].
Any update?
Here's a summary of the blocking issues:
- (resolved) fusion RTC bug for expand_dims (thanks @ptrendx)
- (resolved) conda dependency resolution failure (I upgraded conda and its python versions on all workers)
- (ongoing) horovod installation issue
Once the master-gpu-doc pipeline passes I will merge this PR first and we can unblock the horovod issue separately.
Thanks. Let's see how the master-gpu-doc pipeline works.
Looks like there is still some error in the new notebook that needs to be resolved first:
[2020-11-09T22:10:54.320Z] Warning, treated as error:
[2020-11-09T22:10:54.320Z] /var/lib/jenkins/gluon-nlp-cpu-py3-master/docs/examples/sentiment_analysis/kobert_naver_movie.ipynb:Could not lex literal_block as "python". Highlighting skipped.
Checking what's causing it.
I'm guessing this is what's troubling the Python lexer. Trying to do the same inside Python.
It seems we still have the multiple expand_dims error.
I'll later also take a look about how to port KoBERT + Tutorial to the master version.
Thanks!
/var/lib/jenkins/workspace/gluon-nlp-cpu-py3/conda/cpu/py3/lib/python3.5/site-packages/mxnet/include/mxnet/ndarray.h:41:10: fatal error: mkldnn.hpp: No such file or directory
Can anyone help with this? Thanks in advance.
added kobert_naver_movie for KoBERT tutorial.
cc @dmlc/gluon-nlp-team