EPTML (Efficient Prompt Tuning within Meta-Learning framework) is a method for few-shot text classification. It builds on the earlier PBML code (https://github.com/MGHZHANG/PBML) and improves on it in both speed and accuracy.
FewRel: A dataset for few-shot relation classification, containing 100 relations. Each statement has an entity pair and is annotated with the corresponding relation. The position of the entity pair is given, and the goal is to predict the correct relation based on the context. The 100 relations are split into 64, 16, and 20 for training, validation, and test, respectively.
HuffPost: A dataset for topic classification. It contains news headlines published on HuffPost between 2012 and 2018 (Misra, 2018). The 41 topics are split into 20, 5, and 16 for training, validation, and test, respectively. These headlines are relatively short and colloquial.
Reuters: A dataset of Reuters articles over 31 classes (Lewis, 1997), which are split into 15, 5, and 11 for training, validation, and test, respectively. These articles are relatively long and grammatical.
Amazon: A dataset containing customer reviews from 24 product categories. The goal is to predict the product category based on the content of the review. The 24 classes are split into 10, 5, and 9 for training, validation, and test, respectively.
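As a quick reference, the class splits described above for the four benchmarks (FewRel, HuffPost, Reuters, and Amazon, in that order) can be written down as a small lookup table. This is only an illustrative sketch; the names `CLASS_SPLITS` and `total_classes` are hypothetical and do not come from the repository.

```python
# Number of classes per split (train, val, test), copied from the
# dataset descriptions above. The names used here are illustrative only.
CLASS_SPLITS = {
    "FewRel": (64, 16, 20),
    "HuffPost": (20, 5, 16),
    "Reuters": (15, 5, 11),
    "Amazon": (10, 5, 9),
}


def total_classes(benchmark: str) -> int:
    """Return the total number of classes of a benchmark."""
    return sum(CLASS_SPLITS[benchmark])


for name, (train, val, test) in CLASS_SPLITS.items():
    print(f"{name}: {train}/{val}/{test} classes, {total_classes(name)} in total")
```

The classes in the three splits are disjoint, so the per-split counts add up to the total number of classes of each benchmark.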
- `train.py` contains the meta-learning framework.
- `model.py` contains the overall model architecture.
- `dataloader.py` contains the data loading and preparation process.
- `main.py` contains the whole running process.
- For the benchmarks FewRel, HuffPost, Reuters and Amazon, download the datasets processed by Bao et al. (2020) from https://people.csail.mit.edu/yujia/files/distributional-signatures/data.zip, then put each file in the corresponding benchmark directory `data/{benchmark}/`.
- `cd data/` and run `split_xxx.py` to split each benchmark into `train.json`, `val.json` and `test.json`.
- `mkdir .bert-base-uncased` in the project directory, then download `config.json`, `pytorch_model.bin`, `tokenizer.json` and `vocab.txt` from https://huggingface.co/bert-base-uncased/tree/main into that directory.
- For FewRel, `data/FewRel/P-info.json` provides, for each relation, a list of aliases that serve as candidate words. You need to obtain this file from https://github.com/thunlp/MIML/tree/main/data.
- For HuffPost, Reuters and Amazon, `data/{benchmark}/candidate_words.json` contains the candidate words of each class (the files are provided; you can also define your own candidate words).
- `data/{benchmark}/candidate_ebds.json` contains the candidate word embeddings of each class. `cd data/` and run `word2ebd.py` to obtain this file.
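Before training, it can help to confirm that the BERT files listed above actually ended up in the expected directory. The following is a minimal sketch of such a check; the helper `missing_files` is hypothetical and not part of the repository, and the directory name is taken as written in the instructions above.

```python
from pathlib import Path

# Files the instructions above ask you to download for bert-base-uncased.
BERT_FILES = ["config.json", "pytorch_model.bin", "tokenizer.json", "vocab.txt"]


def missing_files(directory, required):
    """Return the names in `required` that are not regular files in `directory`."""
    root = Path(directory)
    return [name for name in required if not (root / name).is_file()]


# Directory name as given in the setup step (an assumption about your layout).
missing = missing_files(".bert-base-uncased", BERT_FILES)
if missing:
    print("Missing BERT files:", ", ".join(missing))
else:
    print("All BERT files are in place.")
```

The same helper can be pointed at any other directory and file list, for example a `data/{benchmark}/` directory and its JSON files.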
- Run `python main.py {benchmark} {shot}` in the project directory to reproduce the reported results, where `{shot}` can be set to `1` or `5` for the 5-way 1-shot or 5-way 5-shot setting.
- For HuffPost, make sure the GPU memory size is at least 40 GB. For other datasets, use a GPU with larger memory.
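For readers unfamiliar with the 5-way N-shot setting mentioned above: each episode draws 5 classes, with N labelled support examples per class plus some query examples to classify. The sketch below illustrates generic episode sampling only. It is not the sampling code in `dataloader.py`; the function name, the `(text, label)` input format and the number of query examples per class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_episode(examples, n_way=5, k_shot=1, n_query=5, rng=None):
    """Sample one N-way K-shot episode from a list of (text, label) pairs.

    Returns (support, query), two lists of (text, episode_label) pairs, where
    episode_label is an index in [0, n_way) that is local to this episode.
    """
    rng = rng or random.Random()

    # Group the texts by their original class label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    # Only classes with enough examples for both support and query qualify.
    eligible = [lab for lab, texts in by_label.items()
                if len(texts) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")

    support, query = [], []
    for episode_label, cls in enumerate(rng.sample(eligible, n_way)):
        # Draw support and query texts without overlap.
        picked = rng.sample(by_label[cls], k_shot + n_query)
        support += [(t, episode_label) for t in picked[:k_shot]]
        query += [(t, episode_label) for t in picked[k_shot:]]
    return support, query


# Toy data: 6 classes with 10 examples each (purely illustrative).
toy = [(f"text {c}-{i}", f"class_{c}") for c in range(6) for i in range(10)]
support, query = sample_episode(toy, n_way=5, k_shot=1, n_query=5,
                                rng=random.Random(0))
print(len(support), "support examples,", len(query), "query examples")
```

Setting `k_shot=5` gives the 5-way 5-shot case. Labels are re-indexed per episode because the classes seen at test time are disjoint from those seen during training.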
- Change lines 77 and 82 in `model.py` according to the comments.
- Change lines 33 and 36 in `main.py` according to the comments.
- `cd data/` and run `word2llm-ebd.py` to prepare the LLM word embeddings of each class.
- `mkdir llama-2-7b-hf` in the project directory, then download `config.json`, `generation_config.json`, `pytorch_model-00001-of-00002.bin`, `pytorch_model-00002-of-00002.bin`, `pytorch_model.bin.index.json`, `special_tokens_map.json`, `tokenizer_config.json`, `tokenizer.json` and `tokenizer.model` from https://huggingface.co/meta-llama/Llama-2-7b-hf/tree/main into that directory.
- PyTorch>=0.4.1
- Python 3
- numpy
- transformers
- json
- apex (https://github.com/NVIDIA/apex)
- peft
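A quick way to see which of the listed dependencies are importable in the current environment is sketched below. The helper `missing_modules` is hypothetical. The module names are import names, so PyTorch appears as `torch`; `json` is part of the Python standard library and needs no installation. The check does not verify version constraints.

```python
import importlib.util

# Import names of the requirements listed above (PyTorch is imported as `torch`).
REQUIRED_MODULES = ["torch", "numpy", "transformers", "json", "apex", "peft"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```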
@article{eptml2024,
title={Few-shot Text Classification with An Efficient Prompt Tuning Method in Meta-Learning Framework},
author={Xiaobao Lv},
journal={International Journal of Pattern Recognition and Artificial Intelligence},
doi={https://doi.org/10.1142/S0218001424510066},
year={2024}
}