open-source character recognition


This page is still under construction!

This overview is intended mainly for developers. It lists typical example files, some made by me and some sent in by others. You can pick an example and try to improve gocr. Note that after your changes the other examples should still be recognized as well as before, or better.

Users can get a first impression here of how well gocr works.

TOPICS:
  1. excellent examples
  2. good examples
  3. unsorted examples (from other people)
Only errors where clearly readable characters are not recognized, or are recognized completely wrongly, are counted. A "0" (zero) printed as "O" (as in Omega) is not counted as an error. Missing accents are sometimes not counted either. Take the numbers as a rough overview: they are counted by hand and so are not very exact.
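The counting rule above can be sketched in code. This is only an illustration, not how the numbers on this page were produced (they were counted by hand). The function names `fold` and `count_errors` are hypothetical, and the sketch assumes the expected text and the recognized text are already aligned character by character.

```python
import unicodedata


def fold(ch: str) -> str:
    """Fold a character so that tolerated confusions compare equal."""
    # Drop accents: decompose the character and remove combining marks.
    base = "".join(
        c for c in unicodedata.normalize("NFD", ch)
        if not unicodedata.combining(c)
    )
    # Treat the digit zero and the capital letter O as the same glyph.
    return "O" if base == "0" else base


def count_errors(expected: str, recognized: str) -> int:
    """Count mismatching positions, ignoring 0/O swaps and missing accents."""
    if len(expected) != len(recognized):
        # Assumption of this sketch: no inserted or dropped characters.
        raise ValueError("sketch assumes equal-length, aligned strings")
    return sum(fold(e) != fold(r) for e, r in zip(expected, recognized))
```

For example, `count_errors("100 café", "1OO cafe")` counts no errors under these rules, while `count_errors("hello", "hcllo")` counts one.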

excellent examples

pbm = black/white, clean scans

testfile      -----size-----  num_c  num_errors  remarks
              kB     x     y     wc        g027
font1.pbm.gz  40  1623   953    905           7  built with latex+gs, 300dpi, 12pt, mixed fonts
font2.pbm.gz  24  1083   637    905          14  built with latex+gs, 200dpi, 12pt, mixed fonts
--------------------------------------------------------------------
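To compare test files of different length, the error counts can be turned into error rates (errors divided by the number of characters). A minimal sketch follows; the helper name `error_rate` is hypothetical, and the figures are the num_c and num_errors values of font1.pbm.gz and font2.pbm.gz from the table above.

```python
def error_rate(num_errors: int, num_chars: int) -> float:
    """Return the fraction of characters that were counted as errors."""
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    return num_errors / num_chars


# (num_c, num_errors) taken from the table above.
results = {
    "font1.pbm.gz": (905, 7),
    "font2.pbm.gz": (905, 14),
}

for name, (num_c, num_errors) in results.items():
    print(f"{name}: {error_rate(num_errors, num_c):.2%}")
# prints:
# font1.pbm.gz: 0.77%
# font2.pbm.gz: 1.55%
```

Since the counts are made by hand, the resulting rates are only as rough as the counts themselves.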
old overview:

testfile     ---size---  num_c  quality  time  -num_errors-  remarks
                x     y                    p1    p1  p2  p3
g300a1.pbm    703   580    469  +          2s     4   -   0
g300a2.pbm    724  1252   1021  +          5s     2   0   0
g300b1.pbm   1564   277     55  +          4s     0   1   1
g300b2.pbm    599  1319    860  snowy      9s    76   -  40
g300b3.pbm    592  1324    934  snowy      7s    36   2  15
g300c1.pbm    750  2771   2182  +         15s    35   1  14
liebfrau1    2289  3200   1927  +         19s    13   0   8
meraji1      1912  1355   1246  thin      15s    65   4  40
paraguay1    2617  1375   3280  frame     78s  1000   1  55

p1 = gocr0.2.4a3 on P400
p2 = recognita+4.0
p3 = gocr0.2.5

Most errors come from connected characters (such as fi, ff) and from italic fonts.
 

unsorted examples

 ... 
 ... 
