Commit ef045c9

Merge pull request #1077 from nirbhay12345/Image-To-Text-Converter
Add image to text converter
2 parents 6493ce9 + f4ea193 commit ef045c9

5 files changed

+201
-0
lines changed

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
<div align="center">

# Image to Text Converter in Python

</div>

## Aim

The main aim of the project is to provide a GUI for uploading an image and extracting the text it contains.

## Prerequisites

- [Tesseract-OCR](https://tesseract-ocr.github.io/tessdoc/Home.html#500x) must be installed on your system.

## Purpose

To use Optical Character Recognition (OCR) to extract text from an image.

## Description

This project is a desktop app that loads an image and displays the text it contains.

Libraries used:
- Pillow
- PySimpleGUI
- pytesseract

## Workflow of the Project

### `main()` function

The UI is defined in the `main()` function, which is called when the script is run. It contains:
- The layout for the application
- A loop that checks for events on the window (until the window is closed)

### `window.read()` function

The `window.read()` function waits for and returns:
- the event that happened on the application window
- the current values of the elements, indexed by their keys

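As a rough illustration of the loop described above, the sketch below replaces `window.read()` with a scripted stand-in so that it runs without a GUI. `FakeWindow` and `run_event_loop` are hypothetical names used only here, and `WIN_CLOSED` is assumed to be `None`, as in PySimpleGUI 4.x. The real application gets its `(event, values)` pairs from PySimpleGUI.

```python
# Sketch only: FakeWindow stands in for a PySimpleGUI window so the loop can run headless.
WIN_CLOSED = None  # assumption: mirrors sg.WIN_CLOSED, redefined here to avoid the import


class FakeWindow:
    """Hypothetical stand-in that replays a fixed list of (event, values) pairs."""

    def __init__(self, scripted_reads):
        self._reads = iter(scripted_reads)

    def read(self):
        # Once the script is exhausted, behave as if the window was closed.
        return next(self._reads, (WIN_CLOSED, {}))


def run_event_loop(window):
    """Collect the events seen until the window is closed or "Exit" is pressed."""
    handled = []
    while True:
        event, values = window.read()
        if event in ("Exit", WIN_CLOSED):
            break
        handled.append((event, values.get("-FILE-")))
    return handled


window = FakeWindow([
    ("Load Image", {"-FILE-": "quote.png"}),
    ("Load Text", {"-FILE-": "quote.png"}),
])
print(run_event_loop(window))
```

Each call to `read()` blocks in the real library until something happens; here it simply returns the next scripted pair, which is enough to show how the loop branches on the event name.
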
### Keys

Keys are assigned to the UI elements, much like ids in HTML. `values` in the code is a dictionary-like collection for the window that contains key-value pairs such as:
```py
values = {
    "-KEY-": "some value"
}
```

For example:
```py
values = {
    "-FILE-": "path/to/file"
}
```

Values are accessed in this fashion throughout the application ( *(key, value) pairs where the key is the id of the UI element and the value is the current content of the element* ).

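A minimal sketch of reading one value by its key, using a plain dictionary in place of the one returned by `window.read()`. The helper name `selected_file` is hypothetical and does not appear in the project.

```python
def selected_file(values):
    """Return the path stored under "-FILE-", or None if it is missing or empty."""
    # values may be None once the window has been closed, so fall back to {}.
    path = (values or {}).get("-FILE-", "")
    return path.strip() or None


print(selected_file({"-FILE-": "path/to/file"}))
print(selected_file({"-FILE-": ""}))
print(selected_file(None))
```

Using `.get()` rather than `values["-FILE-"]` avoids a `KeyError` if the key is absent, which can be a convenient guard while a layout is still changing.
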
### Layout

This is straightforward.
- a 1-D list of elements is a row
- a 2-D list (a list of rows) is a layout, which is also what a column takes

```py
layout = [
    [row1_col1, row1_col2],  # first row
    [row2_col1, row2_col2],  # second row
]
```

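To make the nesting concrete without opening a window, the sketch below uses plain strings as stand-ins for PySimpleGUI elements. The `describe` helper and the string labels are hypothetical and only illustrate the list-of-rows shape.

```python
# Sketch only: strings stand in for PySimpleGUI elements such as sg.Text or sg.Button.
file_row = ["Text: Image File", "Input: -FILE-", "FileBrowse", "Button: Load Image"]
text_row = ["Button: Load Text", "Multiline: -TEXTBOX-"]

# A layout is a list of rows; each row is a list of elements.
layout = [file_row, text_row]


def describe(layout):
    """Return one line per row, listing the elements from left to right."""
    return [f"row {i}: " + " | ".join(row) for i, row in enumerate(layout, start=1)]


for line in describe(layout):
    print(line)
```
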
### `ocr_core(filename)` function

This function needs two things in place to work properly:
- The Tesseract-OCR path must be set correctly (see step 4 of the setup instructions)
- The file must exist on the computer at the given path

Everything else is handled by the `pytesseract.image_to_string()` function, which is provided by the *pytesseract* module.

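The two preconditions above could be checked up front with the standard library alone. This is a sketch under assumptions: `check_ocr_inputs` is a hypothetical helper that is not part of the project, and the paths in the example call are placeholders.

```python
import os
import shutil


def check_ocr_inputs(tesseract_cmd, filename):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    # The executable may be given as a full path or as a name found on PATH.
    if not (os.path.isfile(tesseract_cmd) or shutil.which(tesseract_cmd)):
        problems.append(f"Tesseract executable not found: {tesseract_cmd}")
    if not os.path.isfile(filename):
        problems.append(f"Image file not found: {filename}")
    return problems


# Placeholder arguments; the result depends on what is installed on the machine.
print(check_ocr_inputs("tesseract", "Images/quote.png"))
```

Running such a check before calling `ocr_core()` would let the UI report a clear message instead of surfacing an exception from pytesseract or Pillow.
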
## Setup instructions

1. Initialize a virtual environment in the directory.
1. Activate the environment with:
    - `./env/Scripts/activate` on Windows
    - `source env/bin/activate` on Linux and Mac
1. Install the dependencies by running `pip install -r requirements.txt`.
1. Replace the placeholder in the `tesseract_cmd` assignment with your Tesseract-OCR path, on **line 19** of the `image_to_text_converter.py` file.
1. Run the code with `python image_to_text_converter.py`.

### Output

If we have an image such as:
<div align="center">

![Image](./Images/quote.png)

</div>

The output will be the extracted text:
```
You will face many defeats in
life, but never let yourself be
defeated.

MAYA ANGELOU
```

![GIF](./Images/IMAGE2TEXT.gif)

## Conclusion

In this way, we have used the pytesseract module and OCR to detect characters in a given image. This project uses PySimpleGUI, which is a simple yet elegant way of building UIs in Python.

## Author

Nirbhay Chaplot
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
import io
import os
import PySimpleGUI as sg
from PIL import Image
import pytesseract

# include the file types that should be allowed in your app
file_types = [
    ("PNG (*.png)", "*.png"),
    ("JPEG (*.jpg)", "*.jpg"),
    ("All files (*.*)", "*.*")
]


def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    pytesseract.pytesseract.tesseract_cmd = r'<Put your tesseract-ocr exe path here>'
    # the function 'image_to_string' extracts all the text from the image
    text = pytesseract.image_to_string(Image.open(filename))
    return text

# the ui is rendered as a function call and anything defined inside the function is rendered in the window
def main():
    col1 = [[sg.Image(key="-IMAGE-")]]
    col2 = [[
        sg.Text("Image File"),
        sg.Input(size=(25, 1), key="-FILE-"),
        sg.FileBrowse(file_types=file_types),
        sg.Button("Load Image"),
    ]]
    # see the readme for understanding Layout
    layout = [[
        sg.Column(col1, element_justification='c'),
        sg.Column(col2, element_justification='c'),
        [
            sg.Button("Load Text"),
            sg.Multiline(size=(100, 10), key="-TEXTBOX-")
        ]
    ]]

    window = sg.Window("Image 2 Text", layout, margins=(
        50, 50), size=(1000, 500), resizable=True)

    while True:
        event, values = window.read()
        if event == "Exit" or event == sg.WIN_CLOSED:
            break
        if event == "Load Image":
            # if the load image button is clicked
            filename = values["-FILE-"]
            if os.path.exists(filename):
                image = Image.open(values["-FILE-"])
                image.thumbnail((400, 400))
                # write the image into an in-memory byte stream
                bio = io.BytesIO()
                # save the image as PNG
                image.save(bio, format="PNG")
                # update the ui to show the loaded image
                window["-IMAGE-"].update(data=bio.getvalue())
        if event == "Load Text":
            # The file path of the image is passed into the ocr_core() function
            filename = values["-FILE-"]
            if os.path.exists(filename):
                # update the text based on the returned value of the function
                window["-TEXTBOX-"].update(ocr_core(filename))

    window.close()


if __name__ == "__main__":
    main()
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
Pillow==8.4.0
PySimpleGUI==4.55.1
pytesseract==0.3.8
