oleges1/quartznet-pytorch

quartznet-pytorch

Automatic Speech Recognition (ASR) in PyTorch. A PyTorch re-implementation of NVIDIA's QuartzNet.

Features:

  • YouTokenToMe tokenization with BPE dropout
  • Augmentations: custom and audiomentations
  • Support for 3 datasets: Common Voice, LibriSpeech and LJSpeech
  • Weights & Biases logging
  • CTC beam search integration
  • GPU-based MelSpectrogram
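
To illustrate how CTC beam search can differ from simple best-path decoding, here is a minimal pure-Python sketch. It is not the decoder used in this repository. The blank index `BLANK = 0`, the function names `greedy_decode` and `beam_search_decode`, and the `beam_width` parameter are assumptions made for the example, and no language model is involved. It is also only an assumption that the "dummy decoder" mentioned in the results corresponds to greedy (argmax) decoding.

```python
from collections import defaultdict

BLANK = 0  # assumed index of the CTC blank symbol


def greedy_decode(probs):
    """Best-path decoding: argmax per frame, collapse repeats, drop blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:
            out.append(idx)
        prev = idx
    return out


def beam_search_decode(probs, beam_width=8):
    """CTC prefix beam search over per-frame probabilities (no language model)."""
    # Each prefix maps to (p_blank, p_non_blank): the probability mass of all
    # alignments of that prefix ending in a blank / in a non-blank symbol.
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(frame):
                if idx == BLANK:
                    # A blank keeps the prefix unchanged.
                    next_beams[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == idx:
                    # Repeat without a blank in between collapses into the prefix.
                    next_beams[prefix][1] += p_nb * p
                    # Repeat after a blank extends the prefix.
                    next_beams[prefix + (idx,)][1] += p_b * p
                else:
                    next_beams[prefix + (idx,)][1] += (p_b + p_nb) * p
        # Keep only the most probable prefixes.
        top = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in top[:beam_width]}
    best_prefix = max(beams.items(), key=lambda kv: sum(kv[1]))[0]
    return list(best_prefix)


# Two frames over the vocabulary {blank, "a"}: blank wins each frame,
# but the alignments that spell "a" carry more total probability.
probs = [[0.6, 0.4], [0.6, 0.4]]
print(greedy_decode(probs))       # best path is blank, blank
print(beam_search_decode(probs))  # sums over all alignments of each prefix
```

Greedy decoding only follows the single most likely alignment, while prefix beam search sums probability over every alignment that collapses to the same output, which is why the two can disagree on the same input.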

Trained models:

dataset | WER, dummy decoder | WER, CTC beam search | WER finetuned, dummy decoder | WER finetuned, CTC beam search
LJSpeech | 36.66 | 34.45 | 28.41 | 27.19
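
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The sketch below is a generic illustration, not the evaluation code of this repository. The function name `word_error_rate` and the whitespace tokenization are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Multiplying the result by 100 gives a percentage. WER can exceed 1.0 when the hypothesis contains many inserted words.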

W&B Logs:
