Commit 930c345

Update Readme FIle with detail.
1 parent ce16f6f commit 930c345

File tree: 1 file changed (+15 −6 lines)


README.md

Lines changed: 15 additions & 6 deletions
@@ -2,8 +2,7 @@
 ### Description
 Use a Convolutional Recurrent Neural Network to recognize handwritten line-text images without pre-segmentation into words or characters. Use the CTC loss function to train.

-Special Thanks for the [Lamhoangtung](https://github.com/lamhoangtung/LineHTR) for the great contribution.
-
+Special thanks to [Line HTR](https://github.com/lamhoangtung/LineHTR) and [@Harald Scheidl](https://github.com/githubharald) for their work.
 ### Why Deep Learning?
 ![Why Deep Learning](images/WhyDeepLearning.png?raw=true "Why Deep Learning")
 > Deep learning extracts features itself with deep neural networks and performs the classification on its own; compared to traditional algorithms, its performance increases with the amount of data.
@@ -12,7 +11,7 @@
 ![Step_wise_detail](images/Step_wise_detail_of_workflow.png?raw=true "Step_Wise Detail")
 * First, a Convolutional Recurrent Neural Network extracts the important features from the handwritten line-text image.
 * The output before the CNN FC layer (512x100x8) is passed to the BLSTM, which handles sequence dependency and time-sequence operations.
-* Then CTC LOSS [Alex Graves](https://www.cs.toronto.edu/~graves/icml_2006.pdf) is used to train the RNN which eliminate the Alignment problem in Handwritten, since handwritten have different alignment of every writers. We just gave the what is written in the image (Ground Truth Text) and BLSTM output, then it calculates loss simply as -log("gtText"); aim to minimize negative maximum likelihood path.See [this](https://distill.pub/2017/ctc/) for detail.
+* Then the CTC loss [Alex Graves](https://www.cs.toronto.edu/~graves/icml_2006.pdf) is used to train the RNN; it eliminates the alignment problem in handwriting, since every writer aligns text differently. We supply only what is written in the image (the ground-truth text) and the BLSTM output, and it computes the loss as `-log(p(gtText))`, aiming to minimize the negative log-likelihood over alignment paths. See [this](https://distill.pub/2017/ctc/) for details.
 * CTC then sums over the possible alignment paths for the given labels. The loss for an (X,Y) pair is: ![Ctc_Loss](images/CtcLossFormula.png?raw=true "CTC loss for the (X,Y) pair")
 * Finally, CTC decoding is used to decode the output during prediction.
 </i>
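The CTC loss described above can be illustrated in plain Python. The sketch below is a toy implementation of the CTC forward algorithm, not the project's actual TensorFlow code: given per-timestep softmax outputs and a target label sequence, it sums the probabilities of all valid alignments and returns the negative log-likelihood.

```python
import math

def ctc_loss(probs, target, blank):
    """Toy CTC forward algorithm.

    probs:  list of per-timestep probability lists (already softmaxed)
    target: list of label indices (the ground-truth text)
    blank:  index of the CTC blank symbol
    Returns -log p(target | probs), summed over all valid alignments.
    """
    # Interleave blanks around the labels: -, c1, -, c2, -, ...
    ext = [blank]
    for c in target:
        ext += [c, blank]
    S, T = len(ext), len(probs)

    # alpha[t][s] = probability of all alignment prefixes ending at ext[s] at time t
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = probs[0][blank]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]                      # stay on the same symbol
            if s > 0:
                a += alpha[t - 1][s - 1]             # advance by one
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]             # skip a blank between distinct labels
            alpha[t][s] = a * probs[t][ext[s]]

    # Valid alignments end on the last label or the trailing blank.
    p = alpha[T - 1][S - 1] + (alpha[T - 1][S - 2] if S > 1 else 0.0)
    return -math.log(p)
```

For example, with two timesteps, labels {0: 'a'} and blank index 1, the alignments "aa", "a-", and "-a" all decode to "a", so their probabilities are summed before taking the negative log.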
@@ -21,6 +20,12 @@
 #### Detail Project Workflow
 ![Architecture of Model](images/ArchitectureDetails.png?raw=true "Model Architecture")

+* The project consists of three steps:
+  1. Multi-scale feature extraction --> Convolutional Neural Network (7 layers)
+  2. Sequence labeling (BLSTM-CTC) --> Recurrent Neural Network (2 layers of LSTM) with CTC
+  3. Transcription --> decoding the output of the RNN (CTC decode)
+![DetailModelArchitecture](images/DetailModelArchitecture.png?raw=true "DetailModelArchitecture")
+
 # Requirements
 1. Tensorflow 1.8.0
 2. Flask
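The transcription step in the workflow above (CTC decode) can be sketched with best-path decoding: take the argmax character at each timestep, collapse repeats, and drop blanks. This is a minimal illustration, independent of the project's TensorFlow decoder:

```python
def ctc_greedy_decode(probs, charset, blank):
    """Best-path CTC decoding.

    probs:   list of per-timestep probability lists
    charset: string mapping label index -> character
    blank:   index of the CTC blank symbol
    """
    # Argmax label at every timestep.
    best = [max(range(len(p)), key=p.__getitem__) for p in probs]
    out, prev = [], None
    for idx in best:
        # Emit only when the label changes and is not blank:
        # this collapses repeats and removes blanks in one pass.
        if idx != prev and idx != blank:
            out.append(charset[idx])
        prev = idx
    return "".join(out)
```

Best-path decoding is the simplest option; beam-search variants usually recover more errors at extra cost.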
@@ -33,7 +38,8 @@
 * Only the line images and lines.txt (ASCII) are needed.
 * Place the downloaded files inside the data directory.

-###### You can find trained model to download from [here.](https://drive.google.com/open?id=10HHNZPqPQZCQCLrKGQOq5E7zFW5wGcA4) Download and extract all files inside the `model/` directory.
+##### The validation character error rate obtained: 8.654728%, i.e. around 92% accuracy
+


 To train the model from scratch
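The character error rate quoted above is the edit distance between the predicted text and the ground truth, divided by the ground-truth length. A minimal sketch (the helper name `cer` is illustrative, not from the project's code):

```python
def cer(reference, hypothesis):
    """Character error rate: Levenshtein distance / len(reference)."""
    m, n = len(reference), len(hypothesis)
    if m == 0:
        return float(n > 0)
    # One-row dynamic-programming Levenshtein distance.
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            cur = d[j]
            d[j] = min(d[j] + 1,                                   # deletion
                       d[j - 1] + 1,                               # insertion
                       prev + (reference[i - 1] != hypothesis[j - 1]))  # substitution
            prev = cur
    return d[n] / m
```

Note that a CER of 8.65% only loosely translates to "92% accuracy"; word-level accuracy is typically lower than 1 − CER.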
@@ -66,9 +72,12 @@ With Correction clothed leaf by leaf with the dioappoistmest
 **Prediction output on Self Test Data**
 ![PredictionOutput](images/PredictionOutput.png?raw=true "Prediction Output on Self Data")

-**The Validation character error rate of saved model: 8.654728%**
+
+###### You can find the trained model to download from [here.](https://drive.google.com/open?id=10HHNZPqPQZCQCLrKGQOq5E7zFW5wGcA4) Download and extract all files inside the `model/` directory.
+
 # Further Improvement
 * Line segmentation can be added for full-paragraph text recognition.
 * Better image preprocessing to handle real-time images.
-* Better Decoding approach to improve accuracy.
+* A better decoding approach to improve accuracy. Some CTC decoders can be found [here](https://github.com/githubharald/CTCDecoder).
 * More variety of data for real-time recognition.
+* Data augmentation is essential to improve accuracy.
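As a sketch of the data-augmentation idea in the last bullet, one simple augmentation for line images is a random horizontal shift with white padding. This helper is hypothetical (not part of the project), assuming images as lists of rows of grayscale pixel values:

```python
import random

def random_shift(img, max_shift=5, seed=None):
    """Shift a line image horizontally by a random offset, padding with white (255).

    img: list of rows, each a list of grayscale pixel values.
    max_shift should stay well below the image width.
    """
    rnd = random.Random(seed)
    s = rnd.randint(-max_shift, max_shift)
    w = len(img[0])
    shifted = []
    for row in img:
        if s >= 0:
            # Shift right: pad white on the left, drop pixels off the right edge.
            shifted.append([255] * s + row[:w - s])
        else:
            # Shift left: drop pixels off the left edge, pad white on the right.
            shifted.append(row[-s:] + [255] * (-s))
    return shifted
```

Similar small random scalings, rotations, and elastic distortions are common for handwriting data; applying them on the fly during training effectively enlarges the dataset.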
