Stars
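Supports the free interfaces of Google Translate, Baidu Translate, and Youdao Translate. Built on Django and PyMuPDF, it translates PDF documents from English to Chinese while keeping the PDF layout largely unchanged. Translated documents can be downloaded in docx and pdf formats. It largely solves the formatting problems that occur when copying text from CAJ Chinese papers, and covers the basic needs of reading papers and writing summaries.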
Code for "Dynamic Context-guided Capsule Network for Multimodal Machine Translation" (ACM MM2020)
A curated list of AWESOME papers, datasets and tutorials within Multimodal Machine Translation.
Cross-lingual Visual Pre-training for Multimodal Machine Translation
Parallel corpus and multilingual machine translation system for the Pali Buddhist scriptures of 15 countries
A PyTorch implementation of the Transformer, with a machine translation example and a simple translation API (Flask). BLEU is used as the scoring component.
This Tibetan tokenizer is based on the Bi-LSTM+CRF method and was created to aid researchers in the field of Tibetan natural language processing.
Tibetan large language models built by incremental pre-training on LLAMA2: Tibetan-LLAMA2-7B and Tibetan-LLAMA2-13B; instruction-tuned Tibetan large models: Tibetan-Alpaca-7B and Tibetan-Alpaca-13B.
Tibetan Handwriting Recognition
BoNLTK aims to provide out-of-the-box support for the various NLP tasks that an application developer might need for Bokey, the Tibetan language.
A fast Text-to-Speech (TTS) model. Works well for English, Mandarin Chinese, Japanese, Korean, Russian, and Tibetan (tested so far).
Tibetan to English Machine Translation
🏷 བོད་ཏོག [phøtɔk̚] Tibetan word tokenizer in Python
MNIST of Tibetan handwriting: image classification on the Chinese-made handwritten Tibetan MNIST dataset (TibetanMNIST), plus various fun experiments.
Notebooks using the Hugging Face libraries 🤗
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
TensorFlow Neural Machine Translation Tutorial
bert-base-chinese example