June 29, 2024

VueScan 9.8.35 and later has built-in Optical Character Recognition (OCR) for English, Spanish, German, French and Italian.

VueScan 9.8.34 and earlier uses Google’s Tesseract 3; VueScan 9.8.35 and later uses Tesseract 5.

There are 44 additional languages you can use by downloading one of the ocr_xx.bin files (for VueScan 9.8.34 and earlier) or xxx.traineddata files (for VueScan 9.8.35 and later) below.

These files contain data about the character set used in each of these languages, and the OCR results will be better if you use them.
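As a rough illustration of the naming rule above, the following Python sketch picks the file name that matches a given VueScan version. The function names and the idea of passing the two-letter and three-letter codes as arguments are assumptions made for this example; they are not part of VueScan.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '9.8.35' into (9, 8, 35)."""
    return tuple(int(part) for part in version.split("."))


def language_file_name(vuescan_version: str, two_letter: str, three_letter: str) -> str:
    """Return the language file name that matches a VueScan version.

    9.8.34 and earlier use ocr_xx.bin files.
    9.8.35 and later use xxx.traineddata files.
    (Hypothetical helper, written only to illustrate the rule.)
    """
    if parse_version(vuescan_version) >= (9, 8, 35):
        return f"{three_letter}.traineddata"
    return f"ocr_{two_letter}.bin"


# Danish, using the codes listed in the language table
print(language_file_name("9.8.34", "da", "dan"))  # ocr_da.bin
print(language_file_name("9.8.35", "da", "dan"))  # dan.traineddata
```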

To add support for additional languages in the "Output | OCR text language" option, you need to download a language-specific file. Store this file on your hard drive in one of the following locations:

| Operating System | Download Location |
| --- | --- |
| macOS | /Users/Shared |
| Windows (VueScan 9.1 and earlier) | c:\vuescan |
| Windows (VueScan 9.2 and later) | same location as vuescan.log or c:\Program Files\VueScan |
| Linux | same location as vuescan.log or with vuescan executable program |
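The location rules above can be sketched as a small Python lookup. This is only an illustration: the function name, the `system` labels, and the `log_dir` and `exe_dir` parameters are assumptions for this example. The folder that holds vuescan.log and the folder that holds the executable vary between installations, so the caller has to supply them.

```python
def candidate_dirs(system, vuescan_version, log_dir=None, exe_dir=None):
    """List the folders where a language file may be stored.

    system:  'macos', 'windows' or 'linux' (labels chosen for this sketch)
    log_dir: folder containing vuescan.log, if known (supplied by the caller)
    exe_dir: folder containing the vuescan executable, if known (Linux only)
    """
    version = tuple(int(part) for part in vuescan_version.split("."))

    if system == "macos":
        return ["/Users/Shared"]
    if system == "windows":
        if version < (9, 2):  # VueScan 9.1 and earlier
            return [r"c:\vuescan"]
        # VueScan 9.2 and later: next to vuescan.log, or the program folder
        return [d for d in (log_dir, r"c:\Program Files\VueScan") if d]
    if system == "linux":
        # Next to vuescan.log, or next to the vuescan executable
        return [d for d in (log_dir, exe_dir) if d]
    raise ValueError(f"unknown system: {system!r}")


print(candidate_dirs("macos", "9.8.35"))
print(candidate_dirs("windows", "9.1.20"))
```

On Linux the list is empty unless at least one of the two folders is passed in, because the page does not give a fixed path for that platform.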

Supported OCR Languages

Click one of the links below and save the file in the location described above. You can find additional languages and more accurate (albeit slower) trained data at https://github.com/tesseract-ocr. Note that you need to use one of the three-letter language codes built into VueScan.
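Some Tesseract data files are published under names that differ from the codes listed here. For example, the Tesseract repositories name Simplified Chinese data chi_sim.traineddata, while the table below lists zho.traineddata. The Python sketch below copies a downloaded file into place under a VueScan code. It assumes that saving the file under the three-letter code is what the note above requires; the function name and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def install_traineddata(downloaded: Path, vuescan_code: str, target_dir: Path) -> Path:
    """Copy a .traineddata file into target_dir, named after a VueScan code.

    vuescan_code must be one of the three-letter codes from the language
    table (for example 'zho'). Hypothetical helper for illustration only.
    """
    if len(vuescan_code) != 3 or not vuescan_code.isalpha():
        raise ValueError("expected a three-letter language code")
    destination = target_dir / f"{vuescan_code.lower()}.traineddata"
    shutil.copyfile(downloaded, destination)
    return destination


# Example call (paths are placeholders):
# install_traineddata(Path("chi_sim.traineddata"), "zho", Path("/Users/Shared"))
```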

9.8.34 and earlier 9.8.35 and later Language

ocr_bg.bin bul.traineddata Bulgarian

ocr_ca.bin cat.traineddata Catalan

ocr_zh.bin zho.traineddata Chinese (Simplified)

ocr_tw.bin zht.traineddata Chinese (Traditional)

ocr_cs.bin ces.traineddata Czech

ocr_da.bin dan.traineddata Danish

ocr_nl.bin nld.traineddata Dutch

ocr_en.bin (built-in) eng.traineddata (built-in) English

ocr_fi.bin fin.traineddata Finnish

ocr_fr.bin fra.traineddata (built-in) French

ocr_de.bin deu.traineddata (built-in) German