README.md (15 additions & 6 deletions)
@@ -2,8 +2,7 @@
 ### Description
 Use a Convolutional Recurrent Neural Network (CRNN) to recognize handwritten line-text images without pre-segmentation into words or characters. The network is trained with the CTC loss function.

-Special thanks to [Lamhoangtung](https://github.com/lamhoangtung/LineHTR) for the great contribution.
-
+Special thanks to [Line HTR](https://github.com/lamhoangtung/LineHTR) and [@Harald Scheidl](https://github.com/githubharald) for their work.
 ### Why Deep Learning?

 > Deep learning extracts features by itself with a deep neural network and performs the classification itself. Compared to traditional algorithms, its performance increases with the amount of data.
@@ -12,7 +11,7 @@ Special Thanks for the [Lamhoangtung](https://github.com/lamhoangtung/LineHTR) f
 * First, a Convolutional Recurrent Neural Network extracts the important features from the handwritten line-text image.
 * The output before the CNN FC layer (512x100x8) is passed to the BLSTM, which handles sequence dependencies and time-sequence operations.
-* Then the CTC loss ([Alex Graves](https://www.cs.toronto.edu/~graves/icml_2006.pdf)) is used to train the RNN. It eliminates the alignment problem in handwriting, since every writer aligns characters differently. We only provide what is written in the image (the ground-truth text) and the BLSTM output; the loss is then calculated simply as -log("gtText"), i.e. the negative log-probability of the ground-truth text, and training minimizes this negative log-likelihood. See [this](https://distill.pub/2017/ctc/) for details.
+* Then the CTC loss ([Alex Graves](https://www.cs.toronto.edu/~graves/icml_2006.pdf)) is used to train the RNN. It eliminates the alignment problem in handwriting, since every writer aligns characters differently. We only provide what is written in the image (the ground-truth text) and the BLSTM output; the loss is then calculated simply as `-log("gtText")`, i.e. the negative log-probability of the ground-truth text, and training minimizes this negative log-likelihood.
 * CTC then considers all possible paths (alignments) that produce the given labels. The loss for an (X,Y) pair is: [image: CTC loss formula for the (X,Y) pair]
 * Finally, CTC decoding is used to decode the output during prediction.
 </i>
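
The CTC steps listed in this hunk can be illustrated with a small pure-Python sketch. It is not the repository's TensorFlow code: the function names, the blank index `BLANK = 0`, and the plain-probability (rather than log-space) arithmetic are assumptions chosen for readability. `ctc_neg_log_likelihood` sums the probability of every alignment that collapses to the ground-truth labels and returns its negative log; `ctc_greedy_decode` is best-path decoding, the simplest CTC decoder.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol; frameworks differ on this


def ctc_neg_log_likelihood(probs, target):
    """Return -log p(target | probs) using the CTC forward algorithm.

    probs:  list of T rows, each a list of per-symbol probabilities
            (already softmax-normalised); index BLANK is the blank.
    target: list of label indices without blanks, e.g. [1, 2].
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [BLANK]
    for label in target:
        ext += [label, BLANK]

    # alpha[s] = total probability of all partial alignments ending in ext[s]
    alpha = [0.0] * len(ext)
    alpha[0] = probs[0][ext[0]]
    if len(ext) > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * len(ext)
        for s, symbol in enumerate(ext):
            total = prev[s]                      # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and symbol != BLANK and symbol != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][symbol]

    # Valid alignments end on the last label or on the trailing blank.
    p = alpha[-1] + (alpha[-2] if len(ext) > 1 else 0.0)
    return -math.log(p) if p > 0 else math.inf


def ctc_greedy_decode(probs):
    """Best-path decoding: argmax per time step, merge repeats, drop blanks."""
    decoded = []
    previous = None
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        if best != previous and best != BLANK:
            decoded.append(best)
        previous = best
    return decoded
```

With two time steps and uniform probabilities over {blank, a}, three of the four alignments collapse to "a", so the loss for target `[1]` is `-log(0.75)`. Practical implementations work in log space to avoid underflow on long sequences.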
@@ -21,6 +20,12 @@ Special Thanks for the [Lamhoangtung](https://github.com/lamhoangtung/LineHTR) f
 #### Detailed Project Workflow

@@ -33,7 +38,8 @@ Special Thanks for the [Lamhoangtung](https://github.com/lamhoangtung/LineHTR) f
 * Only the line images and lines.txt (ASCII) are needed.
 * Place the downloaded files inside the data directory.

-###### You can find the trained model [here](https://drive.google.com/open?id=10HHNZPqPQZCQCLrKGQOq5E7zFW5wGcA4). Download and extract all files inside the `model/` directory.
+##### Validation character error rate obtained: 8.654728%, i.e. about 91% character accuracy
+


 To train the model from scratch
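
The validation character error rate (CER) quoted in this hunk is conventionally the number of character edits needed to turn the predictions into the ground truth, divided by the number of ground-truth characters. The sketch below assumes that definition, accumulated over the whole validation set; the README does not show the repository's exact formula, and the function names here are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(pairs):
    """CER over (ground_truth, prediction) pairs: total edits / total reference chars."""
    total_edits = sum(edit_distance(gt, pred) for gt, pred in pairs)
    total_chars = sum(len(gt) for gt, _ in pairs)
    return total_edits / total_chars if total_chars else 0.0
```

Under this definition, a CER of 8.654728% corresponds to a character accuracy of 100% - 8.654728% = 91.345272%.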
@@ -66,9 +72,12 @@ With Correction clothed leaf by leaf with the dioappoistmest
 **Prediction output on Self Test Data**


-**The validation character error rate of the saved model: 8.654728%**
+
+###### You can find the trained model [here](https://drive.google.com/open?id=10HHNZPqPQZCQCLrKGQOq5E7zFW5wGcA4). Download and extract all files inside the `model/` directory.
+
 # Further Improvement
 * Line segmentation can be added for full-paragraph text recognition.
 * Better image preprocessing to handle real-time images.
-* Better decoding approach to improve accuracy.
+* Better decoding approach to improve accuracy. Some CTC decoders can be found [here](https://github.com/githubharald/CTCDecoder).
 * More variety of data for real-time recognition.
+* Data augmentation is essential to improve accuracy.
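
As one illustration of the data augmentation item above, the sketch below applies a small random translation to a grayscale line image. It is a hypothetical example, not part of the repository: the image is assumed to be a list of pixel rows with a white (255) background, and `random_shift` and its parameters are names introduced here.

```python
import random


def random_shift(image, max_dx=2, max_dy=1, fill=255, rng=random):
    """Translate a grayscale image (list of rows) by a random offset.

    Pixels uncovered by the shift are padded with `fill`
    (assumed to be the white background value).
    """
    height, width = len(image), len(image[0])
    dx = rng.randint(-max_dx, max_dx)  # horizontal offset in pixels
    dy = rng.randint(-max_dy, max_dy)  # vertical offset in pixels
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted
```

Applying a fresh random offset each time an image is drawn during training gives the network slightly different versions of the same line, which is the usual motivation for augmentation; stretching, rotation, or noise could be added in the same way.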