Technical task.
- user visits amazon.com website
- user fills out a search field with the product name and activates search ==>
- a page with search results is displayed.
- user looks for the product having maximum reviews count
- user extracts minimum product price (with applied discount - if any) from the page
- user assigns amazon_price = product price
- user visits bestbuy.com website
- user chooses United States country
- user fills out a search field with the product name and activates search ==>
- a page with search results is displayed.
- user looks for the product having maximum reviews count
- user extracts minimum product price (with applied discount - if any) from the page
- user assigns bestbuy_price = product price
import pytest
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from conftest import driver

first_dict = {}
second_dict = {0: 0}


@pytest.mark.parametrize("url, search_locator, products_locator, price_locator, review_locator ", [
    ("https://www.amazon.com/", "//input[@name = 'field-keywords']", "//div[@class = 'a-section']",
     "span.a-price-whole", "span.s-underline-text"),
    ("https://www.bestbuy.com/", "//input[@class = 'search-input']", "//div[@class = 'embedded-badge']",
     "div.priceView-hero-price span:first-child", "span.c-reviews")])
def test_shopping(driver, url, search_locator, products_locator, price_locator, review_locator):
    product = 'samsung galaxy s22'
    driver.get(url)
    if url == "https://www.bestbuy.com/":
        driver.find_element(By.XPATH, '//a[@class = "us-link"]').click()
    search_nav = driver.find_element(By.XPATH, search_locator)
    search_nav.send_keys(product)
    search_nav.send_keys(Keys.RETURN)  # for windows need to be changed to Keys.ENTER
    names_product = driver.find_elements(By.XPATH, products_locator)
    assert len(names_product) != 0, "Page with product isn't displayed"
    review_counts = driver.find_elements_by_css_selector(review_locator)
    products_price = driver.find_elements_by_css_selector(price_locator)
    for i in range(len(products_price)):
        if i >= len(review_counts):
            break
        review_count_text = review_counts[i].text.strip('()').replace(',', '.')
        price_count_text = products_price[i].text.strip('$').replace(',', '.')
        if review_count_text == '' or price_count_text == '' or review_count_text == 'Not Yet Reviewed':
            continue
        review_count = int(review_count_text.replace('.', ''))
        price_count = float(price_count_text.replace('.99', ''))
        if "amazon" in url:
            first_dict[review_count] = price_count
        if "bestbuy" in url:
            second_dict[review_count] = price_count
    max_first_value = max(first_dict)
    max_second_value = max(second_dict)
    bestbuy_price = second_dict[max_second_value]
    amazon_price = first_dict[max_first_value]
    # once script completed the line below should be uncommented.
    assert amazon_price > bestbuy_price
1 Answer
search_nav.send_keys(Keys.RETURN) # for windows need to be changed to Keys.ENTER
I don't understand why there's a comment here.
Add an if sys.platform == ... and be done with it.
Explain the details in the code, not in a comment.
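A minimal sketch of that idea, assuming the original comment is right that Windows wants Keys.ENTER and every other platform wants Keys.RETURN. The helper name submit_key_name is hypothetical; it returns the attribute name only, so it can be unit tested without selenium installed.

```python
import sys


def submit_key_name(platform: str = sys.platform) -> str:
    """Return the name of the selenium Keys attribute used to submit the search."""
    # Assumption carried over from the original comment:
    # Windows needs ENTER, all other platforms use RETURN.
    return "ENTER" if platform.startswith("win") else "RETURN"


# Usage inside the test (requires selenium):
#   from selenium.webdriver.common.keys import Keys
#   search_nav.send_keys(getattr(Keys, submit_key_name()))
```

With this in place the platform rule lives in one tested function rather than in a comment that asks the reader to edit the code by hand.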
review_counts = ...
products_price = ...
These (and the *_locators) are helpful identifiers, thank you. Consider revisiting the whole singular vs plural distinction, for consistency.
for i in range(len(products_price)):
    if i >= len(review_counts):
        break
We might have computed min( ... ) across the two lengths;
fine, whatever.
More importantly: it isn't clear why having more prices
than review counts leads to invalid data.
At a minimum we need a # comment explaining what sort
of bad data has been observed on particular web pages.
We need this to understand what the code is doing,
and also to identify whether next year's web pages still
manifest that behavior, or if perhaps the logic can be pruned.
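If the length mismatch is expected, zip() pairs the two lists and stops at the shorter one, which leaves room for a comment that records why. A small sketch using plain strings in place of WebElement texts; the sample values are made up for illustration.

```python
# Made-up sample texts standing in for the scraped elements.
review_texts = ["(1,234)", "(56)"]
price_texts = ["$799", "$649", "$599"]

# Hypothetical note for maintainers: state here which page layout produces
# more price elements than review elements (not known from the original post).
pairs = list(zip(review_texts, price_texts))

for review_text, price_text in pairs:
    print(review_text, price_text)
```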
review_count_text = ...
price_count_text = ...
Those identifiers should probably not have "count" in the middle of them. They came from counts, but they do not contain counts.
I recommend Extract Helper: pass in
review_counts / products_price
and get back review_count / price_count.
For one thing, it will let you write a Unit Test
that reveals helpful example text strings to the
maintenance engineers you hire next year.
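One possible shape for such helpers, as a sketch only. The names parse_review_count and parse_price are hypothetical, and unlike the original code this version keeps the cents rather than stripping '.99'. The example strings are assumed formats, not captured page text.

```python
from typing import Optional


def parse_review_count(text: str) -> Optional[int]:
    """Turn text such as '(1,234)' into 1234; return None if it is not a count."""
    cleaned = text.strip().strip("()").replace(",", "")
    try:
        return int(cleaned)
    except ValueError:
        # Covers '' and placeholders such as 'Not Yet Reviewed'.
        return None


def parse_price(text: str) -> Optional[float]:
    """Turn text such as '$1,299.99' into 1299.99; return None if unparsable."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None
```

A unit test can then list the strings each site is expected to produce, for example parse_review_count("Not Yet Reviewed") returning None.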
if "amazon" in url:
    first_dict[review_count] = price_count
if "bestbuy" in url:
    second_dict[review_count] = price_count
These are ill-chosen identifiers.
They are unimaginative and unhelpful.
An obvious name would be amazon_dict.
But better: use another level of indirection.
Could be defaultdict(dict)
where we talk about review_counts[vendor][review_count].
Or it could be review_counts[f"{vendor}_{review_count}"].
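A sketch of the defaultdict(dict) variant. The name price_by_review_count, the two helper functions, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Maps vendor -> {review_count: price}.
price_by_review_count = defaultdict(dict)


def record(vendor: str, review_count: int, price: float) -> None:
    """Store the price seen for a product with the given review count."""
    price_by_review_count[vendor][review_count] = price


def price_of_most_reviewed(vendor: str) -> float:
    """Return the price recorded for the vendor's highest review count."""
    by_count = price_by_review_count[vendor]
    return by_count[max(by_count)]


# Made-up data, not real prices.
record("amazon", 120, 799.0)
record("amazon", 4510, 749.0)
record("bestbuy", 3100, 699.0)
```

Adding a third vendor then needs no new module-level dictionary and no new if branch.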
assert amazon_price > bestbuy_price
This possibly is true ATM. But pricing strategies will vary as the months go by.
The parsing code above this seems to be nicely motivated, turning possibly chaotic web text into well-defined variables (or signalling fatal error if page format changed). In contrast, this particular assertion seems like it belongs one level up in the call stack. Push the parsing down into a helper function, and make an assertion on what comes back from it.
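Roughly, that split could look like the sketch below. The function most_reviewed_price is hypothetical, it works on already parsed (review_count, price) pairs, and breaking ties by the lowest price is an assumption based on the task's "minimum product price" wording.

```python
from typing import Iterable, Tuple


def most_reviewed_price(listings: Iterable[Tuple[int, float]]) -> float:
    """Return the price of the listing with the most reviews."""
    listings = list(listings)
    if not listings:
        # Signal a fatal error instead of comparing against a default value.
        raise ValueError("no parsable listings; has the page format changed?")
    # Most reviews wins; among ties, the lowest price (assumed rule).
    _, price = min(listings, key=lambda item: (-item[0], item[1]))
    return price


# The caller, one level up, owns the comparison:
#   amazon_price = most_reviewed_price(amazon_listings)
#   bestbuy_price = most_reviewed_price(bestbuy_listings)
#   assert amazon_price > bestbuy_price
```

The helper can be tested with fixed inputs, while the price comparison stays in the one place that depends on live data.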
Overall?
I would not be willing to assign or delegate maintenance tasks on this code base as written. Adding unit tests with example HTML text snippets would go a long way toward explaining the underlying assumptions. The current code does not yet appear to be ready for pushing it into production.
Thank you very much. This is a test assignment I recently wrote for the Trainee position. I will fix it. – lr_lennok, Feb 27, 2023 at 13:14