I am writing code that will assign entities to transactions. Each transaction has a counterparty entity.
Requirements
- The entities should be (de-)serialisable (from pydantic models into JSON and vice versa)
- Individual entities will be stored in a database
- Entities have some hierarchy, e.g. a Supermarket is a type of Company.
- If possible, I would like to pattern match entities, such that e.g. a case statement that expects a Company will accept both a Supermarket and a Company, but another case statement expecting a Supermarket will not accept any other Company.
Design
I have two specific ideas:
Idea 1:
    from pydantic import BaseModel

    class Entity(BaseModel):
        name: str

    class Company(Entity):
        pass

    class Supermarket(Company):
        pass

    Aldi = Supermarket(name="Aldi")  # this will be stored as a database entry.

    class Person(Entity):
        pass
    ...
Idea 2a
    from __future__ import annotations
    from pydantic import BaseModel

    class EntityType(BaseModel):
        parent: EntityType | None
        children: list[EntityType] = []

    class Entity(BaseModel):
        name: str
        type: EntityType

    Company = EntityType(parent=None)  # will be hard to backfill the children...
    Supermarket = EntityType(parent=Company)
    Aldi = Entity(name="Aldi", type=Supermarket)
    ...
Idea 2b
    from pydantic import BaseModel

    class EntityCode(BaseModel):
        l0: str | None = None
        l1: str | None = None
        l2: str | None = None
        l3: str | None = None

    class Entity(BaseModel):
        name: str
        type: EntityCode

    Company = EntityCode(
        l0="Company",
    )
    Supermarket = EntityCode(
        l0="Company",
        l1="Supermarket",
    )
    Aldi = Entity(name="Aldi", type=Supermarket)
    ...
I am unsure what's best to use. I could make all three work with my requirements. Option 1 uses inheritance, which is a bit harder to serialise: all objects have the same properties, so it is hard to know which entity a given JSON document represents unless we store the class name as well. It is also harder to find the specific parent of an entity. Finally, inheritance may be a waste, since the entities don't really differ from each other through special properties etc.
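To make the serialisation concern concrete, here is a rough sketch of what storing the class name alongside the data could look like. It uses stdlib dataclasses and `json` rather than pydantic, and the `kind` key, `REGISTRY`, `dump`, `load` and `parent_of` are all hypothetical names chosen for illustration:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Entity:
    name: str


@dataclass
class Company(Entity):
    pass


@dataclass
class Supermarket(Company):
    pass


@dataclass
class Person(Entity):
    pass


# Maps the stored class name back to the class when loading.
REGISTRY = {cls.__name__: cls for cls in (Entity, Company, Supermarket, Person)}


def dump(entity: Entity) -> str:
    """Serialise the fields plus a 'kind' key holding the class name."""
    return json.dumps({"kind": type(entity).__name__, **asdict(entity)})


def load(text: str) -> Entity:
    """Rebuild the right subclass from the stored 'kind' key."""
    data = json.loads(text)
    cls = REGISTRY[data.pop("kind")]
    return cls(**data)


def parent_of(cls: type) -> type:
    """Direct parent in the hierarchy (single inheritance assumed)."""
    return cls.__mro__[1]
```

So the parent lookup is possible via the method resolution order, but it works on classes rather than on stored data, which is part of why it feels awkward.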
Options 2a and 2b encode the information using a property, while the actual entity instances are all of the same type, Entity. Both are easy to serialise and store in a database. With 2a it will be hard to backfill the children, and 2b is a bit restrictive for entities that don't have four levels of hierarchy. Ideally I would also use a different type instead of str for these levels, maybe some literal value or enums.
Anyway, I am interested if anyone has an opinion on this, or if this is maybe a well-known problem and there is one way or another of solving this that is superior.
1 Answer
Option 2b limits the number of levels artificially. At the same time, it forces every entity to carry identifiers for all levels without ensuring that they are consistent. And using the identifiers may require complex hard-coded rules, which makes maintenance difficult.
Option 2a is more powerful and convenient to use. But you lose the benefit of entity specialization, i.e. supermarkets no longer have specific behaviors or additional attributes. This option should be chosen if the entity-specific behaviors can be encapsulated in a strategy, or if you're in a data-only object model (not recommended).
Option 1 looks elegant and very robust with a view to future evolution.
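The strategy idea mentioned for Option 2a could look roughly like the following. This is only a sketch under assumptions: the type is represented as a root-first tuple of names, and `STRATEGIES`, `strategy_for` and the two label functions are hypothetical names invented for the example:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Entity:
    name: str
    type_path: tuple[str, ...]  # root first, e.g. ("Company", "Supermarket")


Strategy = Callable[[Entity], str]


def default_label(entity: Entity) -> str:
    return f"Payment to {entity.name}"


def supermarket_label(entity: Entity) -> str:
    return f"Groceries at {entity.name}"


# Type-specific behavior lives in a lookup table, not in subclasses.
STRATEGIES: dict[tuple[str, ...], Strategy] = {
    ("Company", "Supermarket"): supermarket_label,
}


def strategy_for(entity: Entity) -> Strategy:
    """Use the most specific registered strategy, walking up the hierarchy."""
    path = entity.type_path
    while path:
        if path in STRATEGIES:
            return STRATEGIES[path]
        path = path[:-1]  # fall back to the parent type
    return default_label
```

All entities remain plain data of one class, and a type without its own entry inherits the behavior of its nearest registered ancestor.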
- I am accepting your answer since it answers my question asking for feedback/comments. I have decided to go for option 2b (in a slightly modified version) to follow more closely the existing hierarchies of Dutch SBI codes (similar to the European NACE codes). – charelf, May 12, 2023 at 7:56
- Actually my solution is closer to 2a in hindsight... – charelf, May 12, 2023 at 8:18
- @charelf I missed this contextual information. Indeed, you won't replicate in your code a class hierarchy for all the NACE codes, which are more of a statistical classification than a behavioral one. In this case I'd have gone for 2a as well. – Christophe, May 12, 2023 at 18:33
- Yes, in hindsight this piece of context would have been useful to include in the question. Thanks for confirming my choice then. – charelf, May 14, 2023 at 11:04
"Also it's harder to find the specific parent of an entity"
Why do you need to persist/serialize the hierarchy? Why would you mix models if, indeed, they are of different complexity?
"I am asking this here to see if anyone has faced a similar problem"
A similar problem doesn't mean "exactly the same problem, constraints, context, priorities, non-functional and functional requirements and needs as you have". We cannot extrapolate our solutions to yours. If you already have one, go with it; don't waste your time wondering or distressing. The final solution will come by itself as you delve into the problem. Bear in mind also that asking for opinions is off-topic: it only leads to opinionated answers and debate.