8
\$\begingroup\$

Problem

Output a receipt of purchased items. Inputs:

  • products_text_path: Products are found in a text file in <barcode>:<name>:<price> line format.
  • purchased_items_set: The purchased items are given as a list of integer barcodes. Same item be found several times.
  • discount_rules_text_path: There are discount rules in a text file in <barcode>:<min_quantity>:<discount> format which mean that buying more than min_quantity of the barcode item will give you a discount (given as the discount percentage /1)

Goal

Readable code that handles very large amount of input

Example

Products:
- 123:banana:0.5
- 789:apple:0.4
Discounts:
- 123:2:0.2 # Meaning buying 2 bananas will add a 20% discount
Purchases:
- [123,123] # Bought 2 bananas
Output:
- [('banana', 2, 0.8)]

Potential issues in the code below

  • I'm holding all the information in memory.
  • I might be using too many classes to represent the difference pieces of information?

Code

Code is best understood when read starting from the start_process function

class Product:
 def __init__(self, barcode: int, name: str, price: float):
 self.bardcode = barcode
 self.name = name
 self.price = price
 @classmethod
 def parse_product(cls, product_line: str):
 name, barcode_str, price_str = product_line.split(':')
 price = float(price_str)
 barcode = int(barcode_str)
 return cls(barcode, name, price)
class Products:
 def __init__(self):
 self.products = {} # Type: Dict<int, Product>
 def add_product(self, product: Product):
 self.products[product.bardcode] = product
 def get_product(self, product_barcode):
 return self.products[product_barcode]
 @classmethod
 def parse_products(products_cls, products_path: str):
 products = products_cls()
 with open(products_path) as products_lines:
 for product_line in products_lines:
 product = Product.parse_product(product_line)
 products.add_product(product)
 return products
class ProductPurchase:
 def __init__(self, barcode: int, quantity = 1, discount = 0.0):
 self.barcode = barcode
 self.quantity = quantity
 self.discount = discount
class DiscountRule:
 def __init__(self, product_barcode: int, min_quantity: int, discount_amount: float):
 self.product_barcode = product_barcode
 self.min_quantity = min_quantity
 self.discount_amount = discount_amount
 @classmethod
 def parse(cls, discount_rule: str):
 barcode_str, min_quantity_str, discount_amount_str = discount_rule.split(':')
 return cls(int(barcode_str), int(min_quantity_str), float(discount_amount_str))
class DiscountRules:
 def __init__(self):
 self.dicounts = {} # Type: Dict<int, DiscountRule>
 def add_discount(self, discount: DiscountRule):
 self.dicounts[discount.product_barcode] = discount
 def apply_discount(self, purchase: ProductPurchase):
 if purchase.barcode not in self.dicounts:
 return
 discount = self.dicounts[purchase.barcode]
 if purchase.quantity < discount.min_quantity:
 return
 purchase.discount = discount.discount_amount
 @classmethod
 def parse_discount_rules(cls, rules_path: str):
 rules = cls()
 with open(rules_path) as rules_lines:
 for rule_line in rules_lines:
 rule = DiscountRule.parse(rule_line)
 rules.add_discount(rule)
 return rules
class Purchases:
 def __init__(self):
 self.purchases = {} # Type: Dict<int, ProductPurchase>
 def add_purchase(self, purchased_barcode: int):
 if purchased_barcode in self.purchases:
 self.purchases[purchased_barcode].quantity += 1
 else:
 self.purchases[purchased_barcode] = ProductPurchase(purchased_barcode)
 def get_purchases(self):
 for purchase in self.purchases.values():
 yield purchase
class Checkout:
 def __init__(self, purchases: Purchases, products: Products, discount_rules: DiscountRules):
 self.purchases = purchases
 self.products = products
 self.discount_rules = discount_rules
 def check_out(self):
 result = []
 for purchase in self.purchases.get_purchases():
 self.discount_rules.apply_discount(purchase)
 product = self.products.get_product(purchase.barcode)
 price = _calculate_price(product.price, purchase.quantity, purchase.discount)
 result.append((product.name, purchase.quantity, price))
 return result
def _calculate_price(price, quantity, discount):
 return str(round(price * quantity * (1.0 - discount), 2))
def start_process(products_paths: str, discounts_path: str, scanner_input):
 purchases = Purchases()
 for purchased_barcode in scanner_input:
 purchases.add_purchase(purchased_barcode)
 products = Products.parse_products(products_paths)
 discount_rules = DiscountRules.parse_discount_rules(discounts_path)
 checkout = Checkout(purchases, products, discount_rules)
 return checkout.check_out()
asked Mar 27, 2021 at 19:01
\$\endgroup\$

3 Answers 3

3
\$\begingroup\$
  • Use dataclasses so you don't have to write boilerplate for your data containers.
  • You don't really need Products, DiscountRules, and Purchases when you could just have dict[int, Product], dict[int, DiscountRule], and dict[int, ProductPurchase] respectively. The existence of these classes makes the code harder to read.
  • Checkout doesn't need to be a class. You can just declare the method check_out. If you find yourself writing a class that only has two methods, one of which is __init__, you probably don't need a class.
  • Don't store currency as floats. Use a currency's smallest unit so you can store it as an integer. The main idea is to avoid rounding errors that come up in floating point arithmetic. For example, if Product's price is the price in US currency, it should be an integer representing the price in cents, not dollars.
  • You can use collections.Counter to count the scanned barcodes, which gives you something like a dict[int, int] (barcode -> quantity). Then you can use this to create a collection of ProductPurchases later.
  • I see type hints being used, but not consistently. Type hints should be added for method return types as well.
  • This is personal preference, but for the methods that parse a string into a data container type like Product, instead of the names parse_product, parse, etc. I would name it from_string so the usage is more self-documenting, e.g. Product.from_string(s).

Refactored version

from __future__ import annotations
from collections import Counter
from dataclasses import dataclass
@dataclass
class Product:
 barcode: int
 name: str
 price: int # in the currency's smallest unit, e.g. cents for US currency
 @staticmethod
 def from_string(product: str) -> Product:
 barcode, name, price = product.split(":")
 return Product(int(barcode), name, int(price))
@dataclass
class DiscountRule:
 barcode: int
 min_quantity: int
 discount_amount: int # percentage, e.g. 20 => 20% discount
 @staticmethod
 def from_string(discount_rule: str) -> DiscountRule:
 barcode, min_quantity, discount_amount = discount_rule.split(":")
 return DiscountRule(
 int(barcode), int(min_quantity), int(discount_amount)
 )
@dataclass
class ProductPurchase:
 product: Product
 quantity: int
 discount: int # percentage, e.g. 20 => 20% discount
 @property
 def cost(self) -> float:
 return self.product.price * self.quantity * (100 - self.discount) / 100
 def __str__(self) -> str:
 discount_text = (
 f" (with {self.discount}% discount)" if self.discount > 0 else ""
 )
 return f"{self.quantity}x {self.product.name}{discount_text} {self.cost / 100:.2f}"
def calculate_product_purchase(
 products: dict[int, Product],
 discounts: dict[int, DiscountRule],
 barcode: int,
 quantity: int,
) -> ProductPurchase:
 product = products[barcode]
 discount_pct = (
 discount.discount_amount
 if (discount := discounts.get(barcode, None))
 and quantity >= discount.min_quantity
 else 0
 )
 return ProductPurchase(product, quantity, discount_pct)
def print_receipt(
 products_path: str, discounts_path: str, purchases: list[int]
) -> None:
 with open(products_path) as p_file, open(discounts_path) as d_file:
 products = {
 p.barcode: p for line in p_file if (p := Product.from_string(line))
 }
 discounts = {
 d.barcode: d
 for line in d_file
 if (d := DiscountRule.from_string(line))
 }
 for barcode, quantity in Counter(purchases).items():
 product_purchase = calculate_product_purchase(
 products, discounts, barcode, quantity
 )
 print(product_purchase)
if __name__ == "__main__":
 print_receipt(
 "products.txt", "discounts.txt", [123, 123, 123, 123, 789, 789]
 )

products.txt

123:banana:50
789:apple:40

discounts.txt

123:2:20
answered Mar 28, 2021 at 0:26
\$\endgroup\$
8
  • \$\begingroup\$ great comments! why not having calculate product_purchase as a class method? also, why do you store references to classes using barcodes and not the class itself? also, I see multiple discounts stored by value and in +1 place, can't this be a problem if such discount needs to be changed during run time? \$\endgroup\$ Commented Mar 28, 2021 at 8:47
  • 1
    \$\begingroup\$ @miquelvir calculate_product_purchase could be a class method, yes, but I personally didn't like the call pattern of ProductPurchase.calculate_product_purchase(...). Personal preference, really. \$\endgroup\$ Commented Mar 28, 2021 at 9:34
  • \$\begingroup\$ @miquelvir Re also, why do you store references to classes using barcodes and not the class itself?: I'm not sure I quite follow you here. Are you asking why I didn't model Product to contain a list of DiscountRules? If so, I didn't feel like that was really necessary and would introduce unwanted coupling. When I think of a Product in this problem, it clearly needs an id (barcode), a name, and a unit price. But I don't think it should hold a DiscountRule or a list[DiscountRule], i.e. I think its model should exist independently of DiscountRule. \$\endgroup\$ Commented Mar 28, 2021 at 9:35
  • \$\begingroup\$ (continuing from the previous comment) For example, what if in the future there is a new kind of DiscountRule that offers a discount if you purchase everything in a list of distinct Products (e.g. buy one apple and one banana for 10% off)? Then that coupling might need to be refactored out in that case. \$\endgroup\$ Commented Mar 28, 2021 at 9:36
  • \$\begingroup\$ Re the part about discounts stored by value and discounts changing at runtime: If the discounts change, then all one needs to do is recalculate all the product purchases using the updated discounts. In other words, simply re-running the program with the changed discounts file, since the flow we have here is load data & input -> perform calculations -> print report -> exit program. The data flow is unidirectional, and I don't think there's anything in the problem statement that would suggest that we need to worry about discounts being modified while the program is running. \$\endgroup\$ Commented Mar 28, 2021 at 9:37
3
\$\begingroup\$
  • Indentation is not a multiple of 4, while PEP8 establishes that it should.
  • No entry-point to your script (if __ name __ == '__ main __'...).
  • Parse_product was getting the name before the barcode, while the question establishes the order is the opposite one.
  • Class products is basically a dictionary. It does not make sense to me to have a whole class to just cover a dictionary (not actually exposing all of the useful interface form it) just for the parse_products class method. I see 4 options: inherit from dict, this way at least its a proper dict, have ProductUtils and just have the utility function parse_products, or have a function parse_products, have the parse_products utility inside the Product class, no need to have Product and Products (if products its basically a dict). All of them seem better to me.
  • Calculate price makes no sense to have it as a utility function. If it computes the discounted price of a purchase, why not have it in the Purchase class?
  • Do you have some background working with DBs? The ProductPurchase structure makes sense in the context of a database. But in this OOP design, why not just having ProductCart? Also, instead of storing barcodes, store the actual products (we don't loose much space for a single object reference we already have, and it will be much more useful than just having the key to it)!
  • Similarly, purchases is just a dict, so why not inherit from it to get its full utility and just override what we need to be different. Also, add_purchase is misleading, since you add a product/barcode, not a purchase (since the purchase is generated within the method).
  • The DiscountRules class seems to me un-necessary. Why not store the discounts in the Product class itself?
  • In general, no docstrings in any of the methods nor comments anywhere.
  • The check_out method basically only needs to go through the purchases. Essentially, it does nothing and thus it is not needed.
  • Other comments in the code

This is the final version I suggest:

from typing import Iterable, Dict
class Product:
 def __init__(self, barcode: int, name: str, price: float):
 self.barcode = barcode
 self.name = name
 self.price = price
 self.discounts = []
 @classmethod
 def parse_product(cls, product_line: str):
 barcode, name, price = product_line.split(':')
 return cls(int(barcode), name, float(price))
 def add_discount(self, discount: 'DiscountRule'):
 self.discounts.append(discount)
class Products(dict):
 def __init__(self, *args, **kwargs):
 super(Products, self).__init__(*args, **kwargs)
 def add_product(self, product: Product):
 self[product.barcode] = product
 @classmethod
 def parse_products(cls, products_path: str, discount_rules: 'Dict[int, DiscountRule]'):
 products = cls()
 with open(products_path) as products_lines:
 for product_line in products_lines:
 product = Product.parse_product(product_line)
 if product.barcode in discount_rules:
 product.add_discount(discount_rules[product.barcode])
 products.add_product(product)
 return products
class ProductPurchase:
 def __init__(self, product: Product, quantity: int = 1):
 self.product = product
 self.quantity = quantity
 @property
 def discount(self):
 total_discount = None
 for discount in self.product.discounts:
 if self.quantity >= discount.min_quantity:
 if total_discount is None:
 total_discount = 0
 total_discount += discount.discount_amount
 return total_discount
 def purchase(self, quantity: int = 1):
 self.quantity += quantity
 @property
 def discounted_price(self):
 return round(self.product.price * self.quantity * (1.0 - self.discount), 2)
class DiscountRule:
 def __init__(self, product_barcode: int, min_quantity: int, discount_amount: float):
 self.product_barcode = product_barcode
 self.min_quantity = min_quantity
 self.discount_amount = discount_amount
 @classmethod
 def parse(cls, discount_rule: str):
 barcode_str, min_quantity_str, discount_amount_str = discount_rule.split(':')
 return cls(int(barcode_str), int(min_quantity_str), float(discount_amount_str))
 @classmethod
 def parse_discount_rules(cls, rules_path: str) -> 'Iterable[int, DiscountRule]':
 with open(rules_path) as rules_lines:
 for rule_line in rules_lines:
 discount_rule = DiscountRule.parse(rule_line)
 yield discount_rule.product_barcode, discount_rule
class Purchases(dict):
 def __init__(self, *args, **kwargs):
 super().__init__(*args, **kwargs)
 def purchase(self, product: Product):
 if product.barcode in self:
 self[product.barcode].purchase()
 else:
 self[product.barcode] = ProductPurchase(product)
class ShoppingCart:
 def __init__(self, purchases: Purchases, products: Products):
 self.purchases = purchases
 self.products = products
 @classmethod
 def init_from_files(cls, products_path: str, discounts_path: str, scanner_input: list):
 discount_rules = DiscountRule.parse_discount_rules(discounts_path)
 products = Products.parse_products(products_path, dict(discount_rules))
 purchases = Purchases()
 for purchased_barcode in scanner_input:
 purchases.purchase(products[purchased_barcode])
 return cls(purchases, products)
if __name__ == "__main__":
 cart = ShoppingCart.init_from_files("products.txt", "discounts.txt", [123, 123])
 print("TICKET")
 for purchase in cart.purchases.values():
 print("%s x %s | %s€" % (purchase.quantity, purchase.product.name, purchase.discounted_price))
 print("------")

And code with comments on it:

from typing import Iterable, List
class Product:
 def __init__(self, barcode: int, name: str, price: float):
 self.barcode = barcode # bardcode seems a typo for barcode
 self.name = name
 self.price = price
 self.discounts = []
 @classmethod
 def parse_product(cls, product_line: str):
 barcode, name, price = product_line.split(':')
 """price = float(price_str)
 barcode = int(barcode_str) >>> un-needed in two steps """
 return cls(int(barcode), name, float(price))
 def add_discount(self, discount: 'DiscountRule'):
 self.discounts.append(discount)
class Products(dict):
 def __init__(self, *args, **kwargs):
 super(Products, self).__init__(*args, **kwargs)
 def add_product(self, product: Product):
 self[product.barcode] = product
 """def get_product(self, product_barcode):
 return self[product_barcode]
 
 >>> no need for this method, the dict already implements a get
 """
 @classmethod
 def parse_products(cls, products_path: str):
 """stick to the cls naming convention
 since it is a class method from the Products class, it is trivial that it is a product_cls
 """
 products = cls()
 with open(products_path) as products_lines:
 for product_line in products_lines:
 product = Product.parse_product(product_line)
 products.add_product(product)
 return products
class ProductPurchase:
 def __init__(self, product: Product, quantity: int = 1): # >>> type hinting missing
 self.product = product
 self.quantity = quantity
 # self.discount = discount
 """
 >>> instead of storing a hard copy of the discount, which might get corrupted
 if the discount or quantity change, use a property and compute it live
 """
 @property
 def discount(self):
 """ computes the discount, accepts multiple discounts per product now """
 total_discount = None
 for discount in self.product.discounts:
 if self.quantity >= discount.min_quantity:
 if total_discount is None:
 total_discount = 0
 total_discount += discount.discount_amount
 return total_discount
 def purchase(self, quantity: int = 1):
 """ provide a nice interface for modifying the quantity """
 self.quantity += quantity
 @property
 def discounted_price(self):
 return round(self.product.price * self.quantity * (1.0 - self.discount), 2)
class DiscountRule:
 def __init__(self, product_barcode: int, min_quantity: int, discount_amount: float):
 self.product_barcode = product_barcode
 self.min_quantity = min_quantity
 self.discount_amount = discount_amount
 @classmethod
 def parse(cls, discount_rule: str):
 barcode_str, min_quantity_str, discount_amount_str = discount_rule.split(':')
 return cls(int(barcode_str), int(min_quantity_str), float(discount_amount_str))
 @classmethod
 def parse_discount_rules(cls, rules_path: str) -> 'Iterable[DiscountRule]':
 with open(rules_path) as rules_lines:
 for rule_line in rules_lines:
 yield DiscountRule.parse(rule_line)
 @classmethod
 def add_discounts_to_products(cls, discounts: 'Iterable[DiscountRule]', products: Products):
 for discount in discounts:
 if discount.product_barcode in products:
 products[discount.product_barcode].add_discount(discount)
class Purchases(dict):
 def __init__(self, *args, **kwargs):
 super().__init__(*args, **kwargs)
 def purchase(self, product: Product): # work with objects! not with keys which refer to objects
 if product.barcode in self:
 self[product.barcode].purchase()
 else:
 self[product.barcode] = ProductPurchase(product)
 """def get_purchases(self):
 for purchase in self.values():
 yield purchase
 >>> not needed since now the interface is the actual dict
 """
class ShoppingCart: # a more meaningful name to the object, check_out is an action/method
 def __init__(self, purchases: Purchases, products: Products):
 self.purchases = purchases
 self.products = products
 # self.discount_rules = discount_rules >>> not needed anymore
 def check_out(self) -> Iterable:
 """ return an iterable instead of a list, the caller can then do whatever xhe wants """
 for purchase in self.purchases.values():
 yield purchase.product.name, purchase.quantity, purchase.discounted_price
 @classmethod
 def init_from_files(cls, products_path: str, discounts_path: str, scanner_input: list):
 # get products from products_path
 products = Products.parse_products(products_path)
 # get discount_rules from discount_path
 discount_rules = DiscountRule.parse_discount_rules(discounts_path)
 DiscountRule.add_discounts_to_products(discount_rules, products)
 purchases = Purchases()
 for purchased_barcode in scanner_input:
 purchases.purchase(products[purchased_barcode])
 return cls(purchases, products)
"""def get_checkout(products_paths: str, discounts_path: str, scanner_input):
 
 >>> if all the code is using objects, why have a random function and not use those objects?
"""
if __name__ == "__main__":
 cart = ShoppingCart.init_from_files("products.txt", "discounts.txt", [123, 123])
 print(list(cart.check_out()))
answered Mar 27, 2021 at 21:03
\$\endgroup\$
1
  • 2
    \$\begingroup\$ Elaborate review, suggesting valid design issues👍 One prime strategy/directive in both software-development and writing (text, not code) is divide and conquer or modularisation. This would ease reading/understanding/maintaining of both your answer and the code under review. What about headings and separate code-blocks (modules) to improve structure🤔 \$\endgroup\$ Commented Mar 28, 2021 at 0:21
2
\$\begingroup\$

Since you are handling CSV data, and if you have a well-structured file, there is no need to reinvent the wheel and do the parsing yourself. Python already has a library for this purpose: csv — CSV File Reading and Writing. Of course it is not impossible that a personal and minimalist implementation could perform better than an established and time-proven module, but this is doubtful. But you can test for yourself.

Another more obvious alternative would be using Pandas. Import the data to a dataframe, that can be further manipulated with the result of your calculations.

And you have yet one more option: a database. You could use the sqlite module and import the files to tables. I think this option is well suited for your purpose. You can get the job done with a few joins and possibly some subselects depending on the need. But here this is just a very basic join of 3 tables that you need.

SQLite can import CSV files in bulk but you'll need to invoke the shell for this. See for example: https://wizardforcel.gitbooks.io/sqlite-doc-en/content/30.html

NB: the database file does not even have to be stored on disk, you can create an in-memory DB that is disposed after your program has executed. But if you have large datasets and you are worried about memory consumption, then use a DB file. If you are on Linux you can use a temporary file on /tmp, which is normally tmpfs (= fast data access).

Either way, it is going to be easier to perform your calculations (rebates) in bulk, when all the data has been imported. You can create views, temporary tables etc. If your needs become more complex, then you can use a mix of Python and SQlite, or choose a database that supports stored procedures to handle more sophisticated computations.

I don't think the approach you've chosen is optimal. At the very least I would recommend that you separate these two concerns in your coding: the import of data, and the transformation of data. My suggestion is to reconsider the tools available for the job. It seems to me that you can ditch those classes and simplify the job a lot if you choose the DB approach.

answered Mar 28, 2021 at 14:50
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.