Encoding hierarchy information in objects

Question 1

I am writing code that will assign entities to transactions. Each transaction has a counterparty entity.

Requirements

The entities should be (de-)serialisable (from pydantic models into JSON and vice versa)
Individual entities will be stored in a database
Entities have some hierarchy, e.g. a Supermarket is a type of Company.
If possible, I would like to pattern match entities, such that e.g. a case statement that expects a Company will accept both a Supermarket and a Company, but another case statement expecting a Supermarket will not accept any other Company.

Design

I have two specific ideas:

Idea 1:

from pydantic import BaseModel

class Entity(BaseModel):
    name: str

class Company(Entity):
    pass

class Supermarket(Company):
    pass

Aldi = Supermarket(name="Aldi")  # this will be stored as a database entry.

class Person(Entity):
    pass
...

Idea 2a

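from __future__ import annotations
from pydantic import BaseModel

# EntityType is defined first so that Entity can reference it.
class EntityType(BaseModel):
    parent: EntityType | None = None
    children: list[EntityType] = []

class Entity(BaseModel):
    name: str
    type: EntityType

Company = EntityType(parent=None)  # will be hard to backfill the children...
Supermarket = EntityType(parent=Company)
Aldi = Entity(name="Aldi", type=Supermarket)
...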

Idea 2b

from pydantic import BaseModel

# Unused levels default to None.
class EntityCode(BaseModel):
    l0: str | None = None
    l1: str | None = None
    l2: str | None = None
    l3: str | None = None

class Entity(BaseModel):
    name: str
    type: EntityCode

Company = EntityCode(l0="Company")
Supermarket = EntityCode(l0="Company", l1="Supermarket")
Aldi = Entity(name="Aldi", type=Supermarket)
...

I am unsure what's best to use. I could make all three work with my requirements. Option 1 uses inheritance, which is a bit harder to serialise: all objects have the same properties, so it is hard to know which entity type a given JSON document represents unless we store the class name as well. It is also harder to find the specific parent of an entity. Finally, inheritance may be a waste, since entities don't really differ from each other through special properties etc.
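To make the two concerns about Option 1 concrete, here is a minimal sketch, assuming stdlib dataclasses instead of pydantic models. The `kind` field name, the `REGISTRY` dict and the helper functions are hypothetical names chosen for illustration. The class name is stored as a tag next to the data, and the direct parent is read from `__bases__`, which only gives a single answer under single inheritance.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Entity:
    name: str


@dataclass
class Company(Entity):
    pass


@dataclass
class Supermarket(Company):
    pass


# Hypothetical registry mapping the stored tag back to a class.
REGISTRY = {cls.__name__: cls for cls in (Entity, Company, Supermarket)}


def to_json(entity: Entity) -> str:
    # "kind" is an assumed field name for the class tag.
    return json.dumps({"kind": type(entity).__name__, **asdict(entity)})


def from_json(raw: str) -> Entity:
    data = json.loads(raw)
    cls = REGISTRY[data.pop("kind")]
    return cls(**data)


def parent_of(cls: type) -> type:
    # Direct parent class; assumes single inheritance.
    return cls.__bases__[0]


aldi = from_json(to_json(Supermarket(name="Aldi")))
print(type(aldi).__name__, parent_of(type(aldi)).__name__)  # Supermarket Company
```

The tag has to stay in sync with the class names, so renaming a class becomes a data migration. That is one cost of this approach that the property-based options avoid.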

Options 2a and 2b encode the information in a property, while the actual entity instances are all of the same type, Entity. Both are easy to serialise and store in a database. With 2a it will be hard to backfill the children, and 2b is a bit restrictive for entities that don't have 4 levels of hierarchy. Also, ideally I would use a different type instead of str for these levels, maybe literal values or enums.

Anyway, I am interested if anyone has an opinion on this, or if this is maybe a well-known problem and there is one way or another of solving this that is superior.

Question 2

Many JSON libraries support polymorphic serialization, usually by storing an identifier for the concrete type as a property. I'm not familiar with the Python ecosystem, but I would expect at least some built-in support for this.

Question 3

"Also it's harder to find the specific parent of an entity." Why do you need to persist/serialize the hierarchy? Why would you mix models if, indeed, they are of different complexity?

Question 4

@JonasH It is probably possible in Python as well, so this is not a limitation. I am more interested in whether there are other good arguments favouring one method over the other.

Question 5

I'm not sure I understand the question. Is it just about polymorphic serialization? If so, why not just let the library do its thing? Or by "hierarchy" do you mean some type of tree or graph of objects, not the inheritance hierarchy?

Question 6

@charelf The problem is in here: "I am asking this here to see if anyone has faced a similar problem." A similar problem doesn't mean "exactly the same problem, constraints, context, priorities, non-functional and functional requirements and needs that you have". We cannot extrapolate our solutions to yours. If you already have one, go with it; don't waste your time wondering or distressing. The final solution will come on its own as you delve into the problem. Bear in mind also that asking for opinions is off-topic: they only lead to opinionated answers and debate.

Question 7

Option 2b limits the levels artificially. At the same time, it forces all levels to have all the identifiers without ensuring consistency. And using the identifiers may require complex hard-coded rules, which makes maintenance difficult.

Option 2a is more powerful and convenient to use. But you lose the benefit of entity specialization, i.e. supermarkets no longer have specific behaviors or additional attributes. This option should be chosen if the entity-specific behaviors can be encapsulated in a strategy, or if you're in a data-only object model (not recommended).

Option 1 looks elegant and very robust in view of future evolutions.

Question 8

I am accepting your answer since it answers my question asking for feedback/comments. I have decided to go for option 2b (in a slightly modified version) to follow more closely the existing hierarchies of Dutch SBI codes (similar to the European NACE codes).

Question 9

Actually my solution is closer to 2a in hindsight...

Question 10

@charelf I missed this contextual information. Indeed, you'll not replicate in your code the class hierarchy for all the NACE codes, which are more a statistical classifier than a behavioral one. In this case I'd have gone for 2a as well.

Question 11

Yes, in hindsight this piece of context would have been useful to include in the question. Thanks for confirming my choice then.

score 0 · Accepted Answer · 2023-05-11 19:28:05Z


answered May 11, 2023 at 19:28

Christophe

I am accepting your answer since it answers my question asking for feedback/comments. I have decided to go for option 2b (in a slightly modified version) to follow more closely the existing hierarchies of Dutch SBI codes (similar to the European NACE codes). – charelf, May 12, 2023 at 7:56

Actually my solution is closer to 2a in hindsight... – charelf, May 12, 2023 at 8:18

@charelf I missed this contextual information. Indeed, you'll not replicate in your code the class hierarchy for all the NACE codes, which are more a statistical classifier than a behavioral one. In this case I'd have gone for 2a as well. – Christophe, May 12, 2023 at 18:33

Yes, in hindsight this piece of context would have been useful to include in the question. Thanks for confirming my choice then. – charelf, May 14, 2023 at 11:04
