Canjie-Luo/Text-Image-Augmentation

Text Image Augmentation

A general geometric augmentation tool for text images, introduced in the CVPR 2020 paper "Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition". We provide the tool to help text recognizers avoid overfitting and gain robustness.

Note that this is a general toolkit. Please customize it for your specific task. If the repo benefits your work, please cite the papers.

News

  • 2020-02 The paper "Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition" was accepted to CVPR 2020. It is a preliminary attempt at smart augmentation.

  • 2019-11 The paper "Decoupled Attention Network for Text Recognition" (Paper Code) was accepted to AAAI 2020. This augmentation tool was used in the experiments of handwritten text recognition.

  • 2019-04 We applied this tool in the ReCTS competition of ICDAR 2019. Our ensemble model won the championship.

  • 2019-01 The similarity transformation was specifically customized for geometric augmentation of text images.

Requirements

We recommend Anaconda for managing the versions of your dependencies. For example:

 conda install boost=1.67.0

Installation

Build library:

 mkdir build
 cd build
 cmake -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF ..
 make

Copy Augment.so to the target folder and follow demo.py to use the tool:

 cp Augment.so ..
 cd ..
 python demo.py
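
For orientation, here is a minimal sketch of how the built library might be called from your own script. The function names `GenerateDistort`, `GenerateStretch`, and `GeneratePerspective`, and the `segments` argument, are assumptions made for illustration; demo.py is the authoritative reference for what Augment.so actually exposes. The helpers `load_augment` and `augment_variants` are hypothetical and not part of the tool.

```python
import importlib
from types import SimpleNamespace


def load_augment(module_name="Augment"):
    """Import the compiled extension, or return None if it is not importable."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def augment_variants(image, augment, segments=4):
    """Apply the three demo transformations to one image.

    ASSUMPTION: the function names and the `segments` argument are
    illustrative; check demo.py for the names the built library exposes.
    """
    return {
        "distortion": augment.GenerateDistort(image, segments),
        "stretch": augment.GenerateStretch(image, segments),
        "perspective": augment.GeneratePerspective(image),
    }


if __name__ == "__main__":
    augment = load_augment()
    if augment is None:
        # Stand-in that returns the image unchanged, so the sketch
        # still runs before the library has been built.
        augment = SimpleNamespace(
            GenerateDistort=lambda im, n: im,
            GenerateStretch=lambda im, n: im,
            GeneratePerspective=lambda im: im,
        )
    image = [[0] * 200 for _ in range(64)]  # placeholder for a 64x200 image
    print(sorted(augment_variants(image, augment)))
```

With the real library, `image` would typically be an array loaded by an image library rather than the nested-list placeholder used here.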

Demo

  • Distortion

  • Stretch

  • Perspective

Speed

Transforming an image of size 64 × 200 (H × W) takes less than 3 ms on a 2.0 GHz CPU. The process can be accelerated by running the augmentation on the fly in multi-process batch samplers, for example by setting "num_workers" in a PyTorch DataLoader.
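
As a rough sketch of the on-the-fly approach, the augmentation can be applied inside a map-style dataset's `__getitem__`, so that each data-loading worker process augments the samples it fetches. The class name `OnTheFlyAugmentedDataset` and the `transforms` callables are hypothetical; in practice the callables would wrap the functions provided by this tool.

```python
import random


class OnTheFlyAugmentedDataset:
    """Map-style dataset that augments each sample when it is fetched."""

    def __init__(self, samples, transforms):
        self.samples = list(samples)        # (image, label) pairs
        self.transforms = list(transforms)  # callables: image -> image

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, label = self.samples[index]
        # Pick one transformation at random for this fetch, so the same
        # sample can look different each time it is drawn.
        transform = random.choice(self.transforms)
        return transform(image), label


# Placeholder data and a placeholder transformation (horizontal flip of a
# row of pixel values) standing in for real images and augmentations.
dataset = OnTheFlyAugmentedDataset(
    samples=[([1, 2, 3], "abc"), ([4, 5, 6], "def")],
    transforms=[lambda image: image[::-1]],
)
print(len(dataset), dataset[0])

# With PyTorch installed, an object like this can be handed to a DataLoader
# so that several worker processes augment in parallel, e.g.:
#   loader = torch.utils.data.DataLoader(dataset, batch_size=32, num_workers=4)
```

Because the augmentation runs inside the workers, its cost overlaps with training on the main process rather than adding to it.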

Improvement for Recognition

We compare the accuracy of CRNN with and without augmentation, trained in each case using only the small training set of the corresponding dataset.

| Dataset                   | IIIT5K | IC13 | IC15  |
|---------------------------|--------|------|-------|
| Without Data Augmentation | 40.8%  | 6.8% | 8.7%  |
| With Data Augmentation    | 53.4%  | 9.6% | 24.9% |
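
To make the comparison easier to read, the absolute gains in percentage points can be computed directly from the reported accuracies. This is plain arithmetic on the numbers above; the variable and function names are only illustrative.

```python
# Reported accuracy (%) as (without augmentation, with augmentation).
accuracy = {
    "IIIT5K": (40.8, 53.4),
    "IC13": (6.8, 9.6),
    "IC15": (8.7, 24.9),
}


def absolute_gain(without_aug, with_aug):
    """Gain in percentage points, rounded to one decimal place."""
    return round(with_aug - without_aug, 1)


gains = {name: absolute_gain(*pair) for name, pair in accuracy.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} points")
```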

Citation

@inproceedings{luo2020learn,
 author = {Canjie Luo and Yuanzhi Zhu and Lianwen Jin and Yongpan Wang},
 title = {Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition},
 booktitle = {CVPR},
 year = {2020}
}
@inproceedings{wang2020decoupled,
 author = {Tianwei Wang and Yuanzhi Zhu and Lianwen Jin and Canjie Luo and Xiaoxue Chen and Yaqiang Wu and Qianying Wang and Mingxiang Cai},
 title = {Decoupled attention network for text recognition},
 booktitle = {AAAI},
 year = {2020}
}
@article{schaefer2006image,
 title={Image deformation using moving least squares},
 author={Schaefer, Scott and McPhail, Travis and Warren, Joe},
 journal={ACM Transactions on Graphics (TOG)},
 volume={25},
 number={3},
 pages={533--540},
 year={2006},
 publisher={ACM New York, NY, USA}
}

Acknowledgment

Thanks to the following developers for their contributions.

@keeofkoo

@cxcxcxcx

@Yati Sagade

Attention

The tool is free for academic research purposes only.
