This repository was archived by the owner on Jan 15, 2024. It is now read-only.

[Tutorial] add KoBERT tutorial #1230

Open

jamiekang wants to merge 38 commits into dmlc:v0.x

from jamiekang:v0.9.x


Conversation


Contributor

@jamiekang jamiekang commented May 12, 2020

added kobert_naver_movie for KoBERT tutorial.

Description

(Brief description on what this PR is about)

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

cc @dmlc/gluon-nlp-team

leezu and others added 10 commits

February 10, 2020 18:15

@leezu


 0.9.0 stable version

ef57ca0

@eric-haibin-lin @leezu


 [CI] Update MXNet master version tested on CI ( dmlc#1113 ) ( dmlc#1176 )

6520c72

* [CI] Update MXNet master version tested on CI (dmlc#1113)
* [CI] Update MXNet master version tested on CI
* Disable horovod test on master
* update link
Co-authored-by: Leonard Lausen <leonard@lausen.nl>

@eric-haibin-lin


 [BUGFIX] Fix vocab determinism in py35 ( dmlc#1166 ) ( dmlc#1167 )

50e5278

@xinyu-intel


 [DOC] add int8 command ( dmlc#1174 )

27331cb


 bump up version

3f7465a


 Specify llvmlite version in CI environment ( dmlc#1213 )

639a103

@leezu


 Fix layer_norm_eps in BERTEncoder ( dmlc#1214 )

9e356b3

@tirkarthi


 Fix deprecation warnings due to invalid escape sequences. ( dmlc#1219 )

bf14884

@eric-haibin-lin


 [BUGFIX] remove wd from squad ( dmlc#1223 )

6ee8f02

Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

@jamiekang


 add KoBERT tutorial

87ebe05

added kobert_naver_movie for KoBERT tutorial.

@jamiekang jamiekang requested a review from a team as a code owner

May 12, 2020 09:00

@chenw23


 [CI] Lift timeout on cpu unittest ( dmlc#1229 )

5dc6b9c

* Update Jenkinsfile_py3_cpu_unittest
* Update Jenkinsfile_py3-master_cpu_unittest

@chenw23


Member

chenw23 commented May 13, 2020

Hello Jiyang, would you please merge the latest commit to your branch for this pull request?
#1229 This fixes a cpu-unittest timing restriction which is preventing your commit from being built.
Thanks!

@jamiekang


Contributor Author

jamiekang commented May 14, 2020

> Hello Jiyang, would you please merge the latest commit to your branch for this pull request?
> #1229 This fixes a cpu-unittest timing restriction which is preventing your commit from being built.
> Thanks!

Hello, my branch is v0.9.x and I made the latest commit to that branch. I don't have any other branches. Can you tell me which further steps are required? Thanks.

@chenw23


Member

chenw23 commented May 14, 2020

> > Hello Jiyang, would you please merge the latest commit to your branch for this pull request?
> > #1229 This fixes a cpu-unittest timing restriction which is preventing your commit from being built.
> > Thanks!
>
> Hello, my branch is v0.9.x and I made the latest commit to that branch. I don't have any other branches. Can you tell me which further steps are required? Thanks.

Hello, this commit was merged into the v0.9.x branch yesterday, and it seems that your pull request was opened 2 days ago. So maybe your pull request does not include this commit?

@jamiekang


 Merge remote-tracking branch 'upstream/v0.9.x' into v0.9.x

c81f543

@chenw23


Member

chenw23 commented May 14, 2020

Sorry, but I wonder whether there is an actual need to merge into v0.9.x (the release branch) rather than master (the development branch)?
I am noticing the gpu-doc failures. The master branch has some new features that might improve the stability of the doc build and help us debug errors.

@jamiekang


 Merge remote-tracking branch 'upstream/master' into v0.9.x

528425c

@chenw23


Member

chenw23 commented May 14, 2020

Hello, I think you need to change the pull request target branch to dmlc:master. Currently you are still targeting dmlc:v0.9.x
Thanks!

@jamiekang jamiekang changed the base branch from v0.9.x to master

May 14, 2020 04:17

chenw23

chenw23 suggested changes

May 14, 2020


ci/batch/submit-job.py

  logGroupName = '/aws/batch/job'
- jobName = re.sub('[^A-Za-z0-9_\-]', '', args.name)[:128] # Enforce AWS Batch jobName rules
+ jobName = re.sub(r'[^A-Za-z0-9_\-]', '', args.name)[:128] # Enforce AWS Batch jobName rules

Member

@chenw23 chenw23 May 14, 2020

Hello,
Did you type this character by mistake?

Contributor Author

@jamiekang jamiekang May 14, 2020

I don't know anything about this.

Member

@chenw23 chenw23 May 14, 2020

I have checked and found that this comes from a change (#1219) that is in the v0.9.x branch but not in the master branch.
It's good, so please keep it!
Thanks!
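For context on the one-character change discussed above: in an ordinary Python string literal, `\-` is not a recognized escape sequence, so newer Python versions warn about it, while the `r` prefix passes the backslash through to the regex engine unchanged. The pattern itself behaves the same either way. Below is a minimal sketch; `sanitize_job_name` is a hypothetical helper name, since the script applies the expression inline to `args.name`.

```python
import re


# Hypothetical helper; ci/batch/submit-job.py applies the same expression inline.
def sanitize_job_name(name: str) -> str:
    """Drop characters outside [A-Za-z0-9_-] and truncate to 128 characters."""
    # The raw string keeps the backslash in `\-` literal, so Python does not
    # have to interpret (and warn about) an unrecognized escape sequence.
    return re.sub(r'[^A-Za-z0-9_\-]', '', name)[:128]


print(sanitize_job_name("PR-1230/14 docs"))
```

The slash and the space are removed because they fall outside the allowed character class, and the slice enforces the 128-character limit mentioned in the code comment.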

@chenw23


Member

chenw23 commented May 14, 2020

Hello Jiyang,
One of the tests is failing due to an unclear error. Please wait patiently while we work on a fix.

@leezu The gpu-doc job does not pass doctest, but the error seems to be caused by a connection error. Maybe we need to make some changes elsewhere?

@jamiekang


Contributor Author

jamiekang commented May 27, 2020

any update?

eric-haibin-lin

eric-haibin-lin reviewed

May 28, 2020

Member

@eric-haibin-lin eric-haibin-lin left a comment

The test failed due to some external dependency. Let me trigger it again and see if it passes

@chenw23


Member

chenw23 commented May 30, 2020

> any update?

Hello Jiyang, would you please merge the latest master branch into your pull request, especially to include #1236? That way gpu-doc can pass.
Thanks!

@jamiekang


 Merge pull request #1 from dmlc/master

19c4045

Master

@jamiekang


Contributor Author

jamiekang commented Jun 2, 2020 (edited)

Is this okay?

Merge pull request #1 from dmlc/master ... 19c4045

@avinashsai


Member

avinashsai commented Jun 2, 2020

> Is this okay?
>
> Merge pull request #1 from dmlc/master ... 19c4045

yes

eric-haibin-lin

eric-haibin-lin reviewed

Jun 3, 2020

Member

@eric-haibin-lin eric-haibin-lin left a comment

@leezu any idea why the err log is missing?

fatal error: An error occurred (404) when calling the HeadObject operation: Key "batch/PR-1230/14/docs/examples/sentiment_analysis/kobert_naver_movie.stderr.log" does not exist

@chenw23


Member

chenw23 commented Jun 5, 2020

> @leezu any idea why the err log is missing?
>
> fatal error: An error occurred (404) when calling the HeadObject operation: Key "batch/PR-1230/14/docs/examples/sentiment_analysis/kobert_naver_movie.stderr.log" does not exist

I think this is because ci/batch/submit-job.py failed.
That failure is caused by a failure in ci/batch/docker/gluon_nlp_job.sh,
which in turn is caused by a failure in docs/md2ipynb.py.

So the root cause is that converting the newly added md file to an ipynb file did not succeed.

@leezu


Contributor

leezu commented Sep 9, 2020

@jamiekang The failure now is in the next and final step of the build process. The rm command helped to get to the final step.
Currently the CI fails due to a sphinx warning.

[2020-09-09T06:57:48.867Z] /var/lib/jenkins/gluon-nlp-cpu-py3-master/docs/examples/sentiment_analysis/kobert_naver_movie.ipynb:Could not lex literal_block as "python". Highlighting skipped.

To reproduce locally, you should be able to run MD2IPYNB_OPTION=--disable_compute make docs_local on your computer
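One way to look for the offending cell before running the full docs build is to check each Python code block of the markdown notebook for content that is not plain Python. The sketch below is not part of the repository: `find_unparsable_python_blocks` is a hypothetical helper, and it uses `ast.parse` as a stand-in for the Pygments lexer that Sphinx relies on, so the two may disagree on some inputs.

```python
import ast
import re

# Matches an opening or closing code fence; group(1) is the info string.
FENCE = re.compile(r"^```(.*)$")


def find_unparsable_python_blocks(markdown_text):
    """Return (fence_line_number, error_message) for python blocks ast cannot parse."""
    problems = []
    in_block = False
    is_python = False
    start = 0
    buf = []
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        match = FENCE.match(line.strip())
        if match and not in_block:
            # Opening fence: remember where the block starts and its language.
            in_block = True
            is_python = "python" in match.group(1)
            start = lineno
            buf = []
        elif match and in_block:
            # Closing fence: try to parse what was collected.
            if is_python:
                try:
                    ast.parse("\n".join(buf))
                except SyntaxError as exc:
                    problems.append((start, exc.msg))
            in_block = False
        elif in_block:
            buf.append(line)
    return problems


sample = "intro\n```{.python .input}\n!rm -rf dataset_folder\n```\n"
for line_number, message in find_unparsable_python_blocks(sample):
    print(f"block starting at line {line_number}: {message}")
```

A shell escape such as `!rm -rf ...` is valid in a Jupyter cell but is not Python syntax, so a block containing it is reported here.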

@szha


Member

szha commented Sep 9, 2020

> adding this line generated new error: !rm -rf dataset_folder
>
> should I keep this line or remove it?

You need to update the folder name with the actual dataset path.

@jamiekang


 Update kobert_naver_movie.md

01e9973

changed clean up code.

@jamiekang


Contributor Author

jamiekang commented Sep 10, 2020

> > adding this line generated new error: !rm -rf dataset_folder
> >
> > should I keep this line or remove it?
>
> You need to update the folder name with the actual dataset path.

OK, I will fix it. Any idea how to overcome the timeout?

@szha


 Update kobert_naver_movie.md

a36d009

@jamiekang


Contributor Author

jamiekang commented Sep 10, 2020

Could not lex literal_block as "python". Highlighting skipped.

@jamiekang


 Merge branch 'v0.x' into v0.9.x

5b8488b

@jamiekang


Contributor Author

jamiekang commented Sep 21, 2020

kernel.cu(1084): Error: Formal parameter space overflowed (4648 bytes required, max 4096 bytes allowed) in function

jamiekang added 2 commits

September 23, 2020 17:17

@jamiekang


 Merge branch 'v0.9.x' of https://github.com/dmlc/gluon-nlp into dmlc-...

f3d64e0

...v0.9.x

@jamiekang


 Merge branch 'dmlc-v0.9.x' into v0.9.x

d10d261

@jamiekang


Contributor Author

jamiekang commented Sep 23, 2020

What does this mean?
Reshape_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_expand_dims_kernel.cu(1084): Error: Formal parameter space overflowed (4648 bytes required, max 4096 bytes allowed) in function

@szha


Member

szha commented Sep 24, 2020

cc @ptrendx, looks like a failed fusion case.
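If the failure does come from pointwise operator fusion, one way to confirm it is to rerun the failing test with fusion turned off. The sketch below assumes the `MXNET_USE_FUSION` environment variable documented for MXNet 1.6 and later; whether it applies to the MXNet build used on this CI is an assumption to verify. `disable_pointwise_fusion` is a hypothetical helper name.

```python
import os


def disable_pointwise_fusion(env=None):
    """Set MXNET_USE_FUSION=0 in the given mapping (default: os.environ)."""
    target = os.environ if env is None else env
    # Assumption: MXNet treats "0" as "do not fuse pointwise operators".
    target["MXNET_USE_FUSION"] = "0"
    return target


# Setting the variable before `import mxnet` is the conservative choice,
# since it avoids depending on when the library reads it.
disable_pointwise_fusion()
```

If the "Formal parameter space overflowed" error disappears with fusion disabled, that points at the fused kernel rather than at the tutorial code.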

@ptrendx


Contributor

ptrendx commented Sep 24, 2020

Hmmm, yeeeaah... So, just out of curiosity: why do you call expand_dims over 100 times on a single thing?

@jamiekang


Contributor Author

jamiekang commented Sep 24, 2020

> Hmmm, yeeeaah... So, just out of curiosity: why do you call expand_dims over 100 times on a single thing?

There's no explicit call to expand_dims() in the source (.md or .ipynb).

@ptrendx


Contributor

ptrendx commented Sep 25, 2020

I don't think this error comes from your PR - it happens in test_xlnet_finetune_glue[MRPC].

This was referenced Oct 6, 2020

Bump mxnet versions tested on CI #1384

Closed

[1.x / 1.8] Regression in runtime fusion apache/mxnet#19316

Closed

@jamiekang


 Merge branch 'v0.x' into v0.9.x

16be46d

@jamiekang


Contributor Author

jamiekang commented Nov 9, 2020

Any update?

@szha


 Update py3.yml

5a8a328

@szha


Member

szha commented Nov 9, 2020

here's a summary of the blocking issues:

(resolved) fusion RTC bug for expand_dims (thanks @ptrendx)
(resolved) conda dependency resolution failure (I upgraded conda and its python versions on all workers)
(ongoing) horovod installation issue

Once the master-gpu-doc pipeline passes I will merge this PR first and we can unblock the horovod issue separately.

@jamiekang


Contributor Author

jamiekang commented Nov 9, 2020

> here's a summary of the blocking issues:
>
> (resolved) fusion RTC bug for expand_dims (thanks @ptrendx)
> (resolved) conda dependency resolution failure (I upgraded conda and its python versions on all workers)
> (ongoing) horovod installation issue
>
> Once the master-gpu-doc pipeline passes I will merge this PR first and we can unblock the horovod issue separately.

Thanks. Let's see how the master-gpu-doc pipeline works.

@szha


Member

szha commented Nov 9, 2020

Looks like there is still some error in the new notebook that needs to be resolved first:

[2020-11-09T22:10:54.320Z] Warning, treated as error:
[2020-11-09T22:10:54.320Z] /var/lib/jenkins/gluon-nlp-cpu-py3-master/docs/examples/sentiment_analysis/kobert_naver_movie.ipynb:Could not lex literal_block as "python". Highlighting skipped.

Checking what's causing it.

szha

szha reviewed

Nov 9, 2020


docs/examples/sentiment_analysis/kobert_naver_movie.md Outdated

```

```{.python .input}

!rm -rf nsmc # clean up

Member

@szha szha Nov 9, 2020

I'm guessing this is what's troubling the Python lexer. I'll try doing the same cleanup in Python instead.
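A cleanup cell written in plain Python might look like the sketch below. It assumes the dataset was extracted into a directory named `nsmc`, as in the `!rm -rf nsmc` line above. `clean_up` is a hypothetical name, and the commit that followed may have done this differently.

```python
import shutil
from pathlib import Path


def clean_up(dataset_dir="nsmc"):
    """Remove the dataset directory; do nothing if it is already gone."""
    # ignore_errors=True mirrors the `-f` in `rm -rf`: a missing directory
    # is not treated as an error.
    shutil.rmtree(dataset_dir, ignore_errors=True)
    return not Path(dataset_dir).exists()
```

In the notebook, the final cell would call `clean_up()` once. Because the cell then contains only Python syntax, a Python lexer has nothing unusual to tokenize.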

@szha


 Update kobert_naver_movie.md

4ba25a8

@jamiekang


Contributor Author

jamiekang commented Nov 19, 2020

It seems we still have the multiple expand_dims error.

@sxjscience


Member

sxjscience commented Nov 20, 2020

I'll also take a look later at how to port KoBERT + Tutorial to the master version.

@jamiekang


Contributor Author

jamiekang commented Nov 23, 2020

> I'll also take a look later at how to port KoBERT + Tutorial to the master version.

Thanks!

@jamiekang


 Merge branch 'v0.x' into v0.9.x

f0c608a

@jamiekang


Contributor Author

jamiekang commented Jan 5, 2021

/var/lib/jenkins/workspace/gluon-nlp-cpu-py3/conda/cpu/py3/lib/python3.5/site-packages/mxnet/include/mxnet/ndarray.h:41:10: fatal error: mkldnn.hpp: No such file or directory
Can anyone help with this? Thanks in advance.

Labels

None yet

10 participants

@jamiekang @chenw23 @avinashsai @szha @leezu @ptrendx @sxjscience @eric-haibin-lin @xinyu-intel @tirkarthi
