In the inference documentation, I was really confused by how the pipeline required both the images and their bounding box information in a JSON file. After digging around the discussions for a while, I came across a really helpful post with some code to begin setting this up using EasyOCR (link below). However, while this code was a great start, I have had poor results recognizing English characters in tables with EasyOCR, and I wanted to ask about better approaches, either better tools or just better settings. What is everyone else using?
As a student, I am relatively new to using ML for image processing, so I'm still learning how to fine-tune the settings to get better results. I can't increase the resolution much or my machine kills the process, so I wanted to ask about more efficient approaches that I might be unaware of.
Thank you for any advice in advance.
Link: #121
Replies: 1 comment
I don't know whether you found an answer to this, but I think the poor OCR results are due only to the images not being prepared well for OCR. Here is a function that helps prepare images for OCR:
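import cv2
import numpy as np

def prepareForOCR(numpyArray):
    # Upscale by 20% so small characters are easier to recognize.
    img = cv2.resize(numpyArray, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
    # Convert to grayscale (assumes a BGR image, e.g. one loaded with cv2.imread).
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Dilate then erode; note that a 1x1 kernel leaves the image unchanged.
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Smooth while preserving edges, then binarize with Otsu's threshold.
    blurred = cv2.bilateralFilter(img, 5, 75, 75)
    threshold_img = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    return threshold_img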
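Because the function above enlarges the image by a factor of 1.2, any bounding boxes that OCR returns for the preprocessed image are in the enlarged coordinates and need to be divided by 1.2 before they line up with the original image. Here is a rough sketch of how the OCR output could be turned into the words-plus-bounding-boxes JSON mentioned in the question. It assumes results shaped like the ones EasyOCR's `readtext` returns (four corner points, text, confidence). The helper name `easyocr_to_words`, the `min_confidence` cutoff, and the `"bbox"` / `"text"` field names are my own assumptions, so check them against what the pipeline actually expects.

```python
import json

def easyocr_to_words(results, scale=1.0, min_confidence=0.0):
    """Convert EasyOCR-style results into a list of word dicts.

    results: iterable of (box, text, confidence), where box holds four
             [x, y] corner points.
    scale:   factor the image was resized by before OCR (1.2 for the
             preprocessing function above); boxes are divided by it.
    The output field names "bbox" and "text" are assumptions.
    """
    words = []
    for box, text, confidence in results:
        # Skip low-confidence or empty detections.
        if confidence < min_confidence or not text.strip():
            continue
        # Map corner points back to original-image coordinates.
        xs = [float(point[0]) / scale for point in box]
        ys = [float(point[1]) / scale for point in box]
        words.append({
            "bbox": [min(xs), min(ys), max(xs), max(ys)],  # xmin, ymin, xmax, ymax
            "text": text,
        })
    return words

# Hand-written sample in the EasyOCR result shape (not real OCR output).
sample = [
    ([[12, 24], [120, 24], [120, 60], [12, 60]], "Total", 0.93),
    ([[132, 24], [180, 24], [180, 60], [132, 60]], "42", 0.31),
]
words = easyocr_to_words(sample, scale=1.2, min_confidence=0.5)
print(json.dumps(words, indent=2))
```

With real images, `sample` would be replaced by the list returned from running OCR on the output of `prepareForOCR`, and the JSON could be written to a file with `json.dump`.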