Adding OCR Languages in VueScan

Ed Hamrick

VueScan 9.8.35 and later has built-in Optical Character Recognition (OCR) for English, Spanish, German, French and Italian.

VueScan 9.8.34 and earlier uses Google’s Tesseract 3, and VueScan 9.8.35 and later uses Tesseract 5.

There are 44 additional languages you can use by downloading one of the ocr_xx.bin files (for VueScan 9.8.34 and earlier) or xxx.traineddata files (for VueScan 9.8.35 and later) below.

These files contain data about the character set used in each of these languages, and the OCR results will be better if you use them.
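The choice between the two file types depends only on the VueScan version, so it can be sketched in a few lines of Python. This is a minimal illustration, not part of VueScan: the helper names `parse_version` and `language_file_name` are hypothetical, and the two-letter and three-letter codes must be taken from the table of supported languages. Not every language in that table is listed with an ocr_xx.bin file.

```python
def parse_version(version: str) -> tuple:
    """Turn a version string such as "9.8.35" into a comparable tuple (9, 8, 35)."""
    return tuple(int(part) for part in version.split("."))


def language_file_name(vuescan_version: str, two_letter_code: str,
                       three_letter_code: str) -> str:
    """Return the language file name that matches a VueScan version.

    Hypothetical helper: the codes passed in are assumed to come from
    the supported-languages table, not from any VueScan API.
    """
    if parse_version(vuescan_version) >= (9, 8, 35):
        # VueScan 9.8.35 and later (Tesseract 5) reads xxx.traineddata files.
        return f"{three_letter_code}.traineddata"
    # VueScan 9.8.34 and earlier (Tesseract 3) reads ocr_xx.bin files.
    return f"ocr_{two_letter_code}.bin"


if __name__ == "__main__":
    print(language_file_name("9.8.34", "zh", "zho"))
    print(language_file_name("9.8.35", "zh", "zho"))
```

For Chinese (Simplified), the sketch yields ocr_zh.bin for VueScan 9.8.34 and zho.traineddata for VueScan 9.8.35, matching the entries in the language table.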

To add support for additional languages in the "Output | OCR text language" option, you need to download a language-specific file. Store this file on your hard drive in one of the following locations:

Operating System                    Download Location
macOS                               /Users/Shared
Windows (VueScan 9.1 and earlier)   c:\vuescan
Windows (VueScan 9.2 and later)     same location as vuescan.log or c:\Program Files\VueScan
Linux                               same location as vuescan.log or with vuescan executable program
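The locations above can also be expressed as a short Python sketch that picks the folder and copies a downloaded language file into it. This is only an illustration under stated assumptions: the function names and the `vuescan_dir` and `legacy_windows` parameters are hypothetical, and because the folder containing vuescan.log or the vuescan executable varies by installation, the caller has to supply it.

```python
import platform
import shutil
from pathlib import Path
from typing import Optional


def vuescan_language_dir(system: str, vuescan_dir: Optional[str] = None,
                         legacy_windows: bool = False) -> str:
    """Return the folder where a language file should be saved.

    system         -- value of platform.system(): "Darwin", "Windows" or "Linux"
    vuescan_dir    -- folder holding vuescan.log or the vuescan executable
                      (assumption: supplied by the caller, not detected)
    legacy_windows -- True for VueScan 9.1 and earlier on Windows
    """
    if system == "Darwin":  # macOS
        return "/Users/Shared"
    if system == "Windows":
        if legacy_windows:
            return r"c:\vuescan"
        return vuescan_dir or r"c:\Program Files\VueScan"
    if system == "Linux" and vuescan_dir:
        return vuescan_dir
    raise ValueError("Pass the folder that contains vuescan.log or the vuescan executable")


def install_language_file(downloaded_file: str, target_dir: str) -> Path:
    """Copy a downloaded language file into the chosen folder."""
    destination = Path(target_dir) / Path(downloaded_file).name
    shutil.copy(downloaded_file, destination)
    return destination


if __name__ == "__main__":
    # On Linux, pass vuescan_dir explicitly; otherwise this raises ValueError.
    print(vuescan_language_dir(platform.system()))
```

Copying into c:\Program Files\VueScan may require administrator rights on Windows, which the sketch does not handle.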

Supported OCR Languages

Click one of the links below and save the file in the location described above. You can find additional languages and more accurate (albeit slower) trained data at https://github.com/tesseract-ocr. Note that you need to use one of the three-letter language codes built into VueScan.

9.8.34 and earlier   9.8.35 and later   Language
ocr_zh.bin           zho.traineddata    Chinese (Simplified)
ocr_tw.bin           zht.traineddata    Chinese (Traditional)
                     ben.traineddata    Bengali
                     fas.traineddata    Persian
                     guj.traineddata    Gujarati
                     mar.traineddata    Marathi