BERTu: A BERT-based language model for the Maltese language 🇲🇹

This repository contains the code and supporting information for the paper Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese.

The pre-trained language models can be accessed through the Hugging Face Hub using MLRS/BERTu or MLRS/mBERTu. For details on how pre-training was done, see the pretrain directory.
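As a rough illustration of Hub access, the sketch below loads one of the two models with the `transformers` library. The helper name `load_masked_lm` is hypothetical and not part of this repository, and the choice of a masked-language-modelling head is an assumption based on the models being BERT-style; see the pretrain directory for how the models were actually trained.

```python
# Hub identifiers exactly as given in this README.
MODEL_IDS = ("MLRS/BERTu", "MLRS/mBERTu")


def load_masked_lm(model_id: str = "MLRS/BERTu"):
    """Fetch the tokenizer and a masked-LM model for one of the released checkpoints.

    Hypothetical helper for illustration; requires network access and the
    `transformers` package when called.
    """
    if model_id not in MODEL_IDS:
        raise ValueError(f"unknown model id: {model_id!r}")

    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: the checkpoints are used here with a masked-LM head.
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_masked_lm("MLRS/mBERTu")` would fetch the other checkpoint in the same way. For downstream tasks, a task-specific head is likely more appropriate than the masked-LM head shown here.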

The models were trained on Korpus Malti v4.0, which can be accessed through the Hugging Face Hub using MLRS/korpus_malti.
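The corpus can likely be read with the `datasets` library, as sketched below. The helper names, the `"train"` split, and the optional configuration name are assumptions; this README does not list the available configurations or any access terms, so check the dataset page on the Hub before relying on them.

```python
# Hub identifier exactly as given in this README.
DATASET_ID = "MLRS/korpus_malti"


def build_load_kwargs(config_name=None, split="train", streaming=True):
    """Assemble keyword arguments for `datasets.load_dataset`.

    `config_name` and `split` are assumptions: the README does not state
    which configurations or splits the corpus provides.
    """
    kwargs = {"path": DATASET_ID, "split": split, "streaming": streaming}
    if config_name is not None:
        kwargs["name"] = config_name
    return kwargs


def load_corpus(config_name=None, split="train", streaming=True):
    """Open the corpus; streaming avoids downloading everything up front."""
    # Imported lazily so that defining the helper does not require datasets.
    from datasets import load_dataset

    return load_dataset(**build_load_kwargs(config_name, split, streaming))
```

With `streaming=True`, the returned object can be iterated record by record, which is convenient for inspecting a few examples of a large corpus before committing to a full download.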

For details on how fine-tuning was done, see the finetune directory.
To use fine-tuned models for evaluation or prediction, refer to the evaluate directory.

Citation

Cite this work as follows:

@inproceedings{BERTu,
 title = "Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and {BERT} Models for {M}altese",
 author = "Micallef, Kurt and
 Gatt, Albert and
 Tanti, Marc and
 van der Plas, Lonneke and
 Borg, Claudia",
 booktitle = "Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing",
 month = jul,
 year = "2022",
 address = "Hybrid",
 publisher = "Association for Computational Linguistics",
 url = "https://aclanthology.org/2022.deeplo-1.10",
 doi = "10.18653/v1/2022.deeplo-1.10",
 pages = "90--101",
}
