
I need to implement OCR (Tesseract, ABBYY, MODI, Asprise, etc.) to identify dynamically changing web elements on an application page at automation runtime. That way, the Selenium WebDriver automation script would keep working without the object locators in the code ever having to change.

Any directions to get started?

Niels van Reijmersdal
asked Sep 4, 2017 at 8:23
  • What is OCR/OR? Commented Sep 4, 2017 at 8:50
  • So, what's the problem? Take a screenshot and run your OCR of choice on it. You may want to cut out the "suspicious" section and do some preprocessing first, since Tesseract at least doesn't seem to like noisy pictures much (or I'm just bad with Tesseract). Then it should work. I had something like this running for language validation at some point; if you tell us what your problem is and it fits, I could look for it. Commented Sep 4, 2017 at 8:59
  • You mean you have objects that have a new name every time the page is loaded? Do you have an example? Probably there is a better way to find a locator that does not need constant updating. Commented Sep 4, 2017 at 12:32
  • I am looking to use OCR the way it is available in robotic process automation tools such as UiPath and Automation Anywhere, which act on dynamically changing objects on the page at runtime. Commented Sep 5, 2017 at 6:56
  • Object locators do not always work when objects change dynamically. For example, if an element is inside a web table whose locator changes constantly, or inside a div whose id or class is updated each time with text that follows no pattern, it is very difficult to manage. The locator can only be updated after a test run has failed. Commented Sep 5, 2017 at 7:07

2 Answers


The flow would look something like this:

  1. Take a screenshot of your full browser window
  2. Analyse the screenshot with your OCR software and let it return coordinates
  3. Interact with the element at that location, e.g. click at the returned coordinates.
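
A minimal sketch of these three steps in Python, assuming pytesseract and Pillow are installed alongside Selenium. The helper names `find_text_centers` and `click_text` are hypothetical. The OCR result is assumed to have the dictionary shape that `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` returns, and the click assumes that screenshot pixels map 1:1 to viewport coordinates and that the pointer starts at the viewport origin, neither of which is guaranteed.

```python
from typing import Dict, List, Tuple


def find_text_centers(ocr: Dict[str, list], target: str,
                      min_conf: float = 60.0) -> List[Tuple[int, int]]:
    """Return the centre (x, y) of every OCR word equal to `target`.

    `ocr` is assumed to hold parallel lists under the keys
    "text", "conf", "left", "top", "width" and "height".
    """
    centers = []
    for i, word in enumerate(ocr["text"]):
        if word.strip().lower() != target.lower():
            continue
        # Skip low-confidence words; conf may arrive as a str, int or float.
        if float(ocr["conf"][i]) < min_conf:
            continue
        x = ocr["left"][i] + ocr["width"][i] // 2
        y = ocr["top"][i] + ocr["height"][i] // 2
        centers.append((x, y))
    return centers


def click_text(driver, target: str) -> None:
    """Screenshot the page, OCR it, and click the single match for `target`."""
    # Imported lazily so find_text_centers stays usable without these packages.
    import io
    import pytesseract
    from PIL import Image
    from selenium.webdriver.common.action_chains import ActionChains

    # Step 1: take a screenshot of the browser viewport.
    png = driver.get_screenshot_as_png()
    # Step 2: let the OCR engine return words with their bounding boxes.
    ocr = pytesseract.image_to_data(Image.open(io.BytesIO(png)),
                                    output_type=pytesseract.Output.DICT)
    centers = find_text_centers(ocr, target)
    if len(centers) != 1:
        raise LookupError(f"expected one match for {target!r}, got {len(centers)}")
    # Step 3: click at the coordinates (assumes 1:1 pixel scale and a pointer
    # that starts at the top-left corner of the viewport).
    x, y = centers[0]
    ActionChains(driver).move_by_offset(x, y).click().perform()
```

Usage would be along the lines of `click_text(driver, "Submit")`. On high-DPI displays the screenshot is often larger than the viewport, so the coordinates would have to be scaled first.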

You say you want this because:

The Selenium script will always work without changing object locators in code

I think your thought process is interesting, but I think this is a dream and a lie. What if the application changes so that your OCR software finds two or more objects with the same text? Which one should it use? What if you have multiple elements that look the same, but their order has changed? Think about edit buttons. I think you will introduce new maintainability issues while hugely increasing the complexity of the testing framework.
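
To make the ambiguity concrete, here is a small Python sketch of the extra logic an OCR-based approach would need once several "Edit" buttons appear on screen. The function `resolve_match` and the idea of disambiguating by a nearby anchor text (for example, the row label) are assumptions for illustration, not part of any OCR library.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def resolve_match(candidates: List[Point], anchor: Optional[Point] = None) -> Point:
    """Pick one click target from several OCR matches with the same text.

    `candidates` are the centre points of every match; `anchor` is the centre
    of some other text that identifies the row, e.g. a record name.
    """
    if not candidates:
        raise LookupError("text not found on screen")
    if len(candidates) == 1:
        return candidates[0]
    if anchor is None:
        # Several identical-looking elements and no hint: refuse to guess.
        raise LookupError(f"ambiguous: {len(candidates)} matches and no anchor")
    # Choose the candidate whose vertical position is closest to the anchor,
    # i.e. the one most likely to be on the same row.
    return min(candidates, key=lambda p: abs(p[1] - anchor[1]))
```

Even this heuristic breaks as soon as the layout is not row-based, which is the kind of maintenance burden described above.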

I assume you want to solve maintainability issues with your Selenium tests. I would try the following:

  • Learn to write good Selenium locators
  • Centralize actions and locators in Page Objects
  • Have developers run and update the tests during development, rather than leaving that to testers afterwards. I would also let the developers create the happy-path tests and have testers extend those if needed.
  • Write fewer Selenium tests, and spread your test automation effort as described by the test pyramid.
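
As an illustration of the first two points, the Python sketch below locates a button by the visible text of its table row instead of by a generated id or class, and keeps that locator in one Page Object. The page, its markup (a `tr` with `td` cells and a `button`), and the names `OrdersPage`, `row_button_locator` and `xpath_literal` are hypothetical. The strategy string `"xpath"` is the value of Selenium's `By.XPATH`, used directly so the sketch has no imports.

```python
from typing import Tuple

Locator = Tuple[str, str]  # (strategy, expression), as passed to find_element


def xpath_literal(text: str) -> str:
    """Quote `text` as an XPath 1.0 string literal, even if it contains quotes."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Contains both quote types: join the pieces with concat().
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in text.split("'")) + ")"


def row_button_locator(row_text: str, button_label: str) -> Locator:
    """Locate a button by its label inside the table row showing `row_text`.

    Nothing here depends on ids or classes, so regenerated attribute values
    do not break the locator.
    """
    row = xpath_literal(row_text)
    label = xpath_literal(button_label)
    return ("xpath",
            f"//tr[td[normalize-space()={row}]]//button[normalize-space()={label}]")


class OrdersPage:
    """Page Object: tests call edit_order() and never see a locator."""

    def __init__(self, driver):
        self.driver = driver

    def edit_order(self, order_name: str) -> None:
        self.driver.find_element(*row_button_locator(order_name, "Edit")).click()
```

If the markup changes, only `row_button_locator` needs updating; the tests that call `OrdersPage(driver).edit_order("Invoice 42")` stay the same.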
answered Sep 4, 2017 at 12:27

This answer assumes that you have ruled out working at the DOM level with selectors.

So if you want to do visual testing, I see several options:

  1. If you are not tied to Selenium, use the free Kantu web testing tool, which works visually, just the way you want it to work. It has built-in OCR. At the very least, you can use this tool to test whether the visual testing approach itself is the right one for your test case before spending much time getting Selenium to work this way.

If Selenium:

  1. OCR for use with Selenium: Go for an online OCR solution; it will be much easier to use than implementing Tesseract. Here is some Java code to use it.

  2. Maybe you do not need OCR, but "only" image recognition? In this case Selenium + Sikuli is an option.

answered Oct 4, 2017 at 16:46
