Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

thunlp/DSDocRE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

History

6 Commits

Repository files navigation

DSDocRE

Source code for "Denoising Relation Extraction from Document-level Distant Supervision".

Requirements:

  • Python3
  • Pytorch>=1.5.0
  • apex>=0.1
  • numpy
  • transformers>=2.5.1

Prepare Data

Data can be downloaded from Google Drive. Please put the data file in the data/.

Pre-denoise

Run the following command to train the pre-denoising model:

python3 train.py -c config/PreDenoising.config -g 0

After training the pre-denoising model, run the following commands to generate rank score for all data:

mkdir data/rank_result
# Generate the rank_result for test data.
python3 test.py -g 0 \
 --config config/PreDenoising.config \
 --test_file data/test.json \
 --checkpoint checkpoint/PreDenoise/3.pkl \
 --result_score data/rank_result/test_score.npy \
 --result_title data/rank_result/test_title.json 
# Generate the rank_result for valid data.
python3 test.py -g 0 \
 --config config/PreDenoising.config \
 --test_file data/dev.json \
 --checkpoint checkpoint/PreDenoise/3.pkl \
 --result_score data/rank_result/valid_score.npy \
 --result_title data/rank_result/valid_title.json 
# Generate the rank_result for train_annotated data.
python3 test.py -g 0 \
 --config config/PreDenoising.config \
 --test_file data/train_annotated.json \
 --checkpoint checkpoint/PreDenoise/3.pkl \
 --result_score data/rank_result/train_annotated_score.npy \
 --result_title data/rank_result/train_annotated_title.json 
# Generate the rank_result for train_distant data.
python3 test.py -g 0 \
 --config config/PreDenoising.config \
 --test_file data/train_distant.json \
 --checkpoint checkpoint/PreDenoise/3.pkl \
 --result_score data/rank_result/train_distant_score.npy \
 --result_title data/rank_result/train_distant_title.json 

Pre-train

python3 train.py -c config/PreTraining.config -g 0

Fine-tune

# Choose an appropriate model to fine-tune
python3 train.py -c config/FineTuning.config -g 0 \
 --pretrained_bert_path checkpoint/PreTrain/epoch_15/

Cite

If you use the code, please cite this paper:

@inproceedings{xiao2020denoise,
 title={Denoising Relation Extraction from Document-level Distant Supervision},
 author={Xiao, Chaojun and Yao, Yuan and Xie, Ruobing and Han, Xu and Liu, Zhiyuan and Sun, Maosong and Lin, Fen and Lin, Leyu},
 booktitle={Proceedings of EMNLP},
 year={2020}
}

About

Source code for EMNLP 2020 paper "Denoising Relation Extraction from Document-level Distant Supervision"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

Contributors

Languages

AltStyle によって変換されたページ (->オリジナル) /