Extensible hierarchical database design

Question 1

This question extends my earlier question, "Multiple intersection tables vs multiple joins" (and follows the same hierarchy), and is based on this answer: https://softwareengineering.stackexchange.com/a/415095/278602

I am now thinking of designing an extensible hierarchical database. Here's one approach I had in mind: a single Category table with the following columns:

id (Primary Key)
parent_id (Foreign Key referring to the id above)
name
dept_id (Foreign Key to the id on the Department table)
is_category_group
is_category
is_sub_category

I am thinking that when is_category_group is true, parent_id can be null. Also, something I missed in my original question is that I need a many-to-many relationship between Sub-Category and Attribute, so I thought of introducing a join table, CategoryAttributeRelation, with these columns:

category_id (Foreign key to id on Category table, will be sub-category/leaf id)
attribute_id (Foreign key to id on Attribute table)

Here's what will be on the Attribute table:

id (Primary Key)
name

Most of the Attributes are global in nature, i.e. they belong to all Departments. Are there any better ways of accomplishing extensibility? I came across hierarchyId in MS SQL Server, which is our database of choice, but haven't researched it yet. I also need to figure out how to write the queries to get to the Attributes given a Department.

I have a hierarchical relationship between my tables, with the children having foreign keys referring back to their parent ids (assuming id is the primary key for each table):

Department has many Category Groups
Category Group has many Category(-ies)
Category has many Sub-Category(-ies)
Sub-Category has many Attributes and Attribute can be part of multiple Sub-Category(-ies)

Question 2

Downvoters, please let me know why you're downvoting. Has this already been discussed somewhere else, or is the question not properly formatted?

Question 3

What is the question? Is "Are there any better ways of accomplishing extensibility?" the question?

Question 4

@user253751 Yes

Question 5

What is extensibility?

Question 6

@user253751 In this case, in my view, it is the ability to add more sub-hierarchies without breaking the existing design/architecture.

Question 7

There is a choice of which technology to employ. The question strongly implies a relational database but this is not all that is available. A document store or a graph store would work too.

Document stores, such as MongoDB, hold their information in a JSON-like way. JSON is a tree, so it maps well to your data structure. Document stores do not require all rows to have the same structure. When new levels are introduced, new rows can implement them without having to touch existing rows. Varying levels of hierarchy are handled natively.
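To make that concrete, here is a minimal sketch of one department held as a single JSON-like document. It uses plain Python dicts rather than a real document store, and every name in it (Electronics, Laptops, the "children" and "attributes" keys, collect_attributes) is invented for illustration, not taken from your schema.

```python
import json

# Hypothetical sample data: one department stored as a single nested document.
department = {
    "name": "Electronics",
    "children": [
        {
            "name": "Computers",  # category group
            "children": [
                {
                    "name": "Laptops",  # category
                    "children": [
                        {
                            "name": "Gaming Laptops",  # sub-category (leaf)
                            "attributes": ["Screen Size", "RAM"],
                        },
                        {
                            "name": "Ultrabooks",  # sub-category (leaf)
                            "attributes": ["Screen Size", "Weight"],
                        },
                    ],
                }
            ],
        }
    ],
}


def collect_attributes(node):
    """Return the set of attribute names found at or below `node`."""
    found = set(node.get("attributes", []))
    for child in node.get("children", []):
        found |= collect_attributes(child)
    return found


# The structure survives a round trip through JSON text unchanged.
assert json.loads(json.dumps(department)) == department
print(sorted(collect_attributes(department)))
```

Because the walk only looks for "children" and "attributes" keys, an extra level could be nested in one branch without changing the function or the other branches.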

A tree is described mathematically by graph theory as a collection of nodes connected by edges. There are databases which offer this model of data storage. You could hold each Department, Category and Attribute etc. as a node and declare edges to link them. The DBMS's query language will have syntax to start from anywhere in the tree and follow edges to retrieve attribute values. New levels can be added as required by declaring further nodes and re-organizing edges to suit.
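As a rough illustration of the node-and-edge idea, the sketch below models it in plain Python rather than in any particular graph DBMS. The node ids, labels and the reachable_attributes helper are all hypothetical; a real graph database would express the same walk in its own query language.

```python
from collections import defaultdict, deque

# Hypothetical nodes, keyed by id, each tagged with the kind of thing it is.
nodes = {
    1: ("Department", "Electronics"),
    2: ("CategoryGroup", "Computers"),
    3: ("Category", "Laptops"),
    4: ("SubCategory", "Gaming Laptops"),
    5: ("Attribute", "Screen Size"),
    6: ("Attribute", "RAM"),
}

# Directed edges (from_id, to_id), pointing from parent to child.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6)]

# Index the edges by their starting node for quick lookup.
outgoing = defaultdict(list)
for src, dst in edges:
    outgoing[src].append(dst)


def reachable_attributes(start_id):
    """Breadth-first walk from `start_id`; return the attribute names reached."""
    seen = {start_id}
    queue = deque([start_id])
    names = []
    while queue:
        current = queue.popleft()
        kind, name = nodes[current]
        if kind == "Attribute":
            names.append(name)
        for nxt in outgoing[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(names)


print(reachable_attributes(1))  # start at the department
print(reachable_attributes(3))  # or start part-way down the tree
```

The same function works from any starting node, and a new level is just more nodes and edges; nothing about the traversal depends on how many levels there are.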

Even in the relational model there are several ways to implement a hierarchy. I cover some in this answer, with references. In brief, they are:

Adjacency List
Path Enumeration
Nested Set
Closure Table

Which is best for your use-case will depend on how often the hierarchy changes, whether concurrent reads must be supported during those changes or if a maintenance window can be declared, whether the tree can be read "upwards" from Attribute to Department, and the performance required from reads.
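For reference, the single Category table with parent_id proposed in the question is the adjacency list model. Below is a minimal sketch of it, including one way to get the Attributes for a Department with a recursive query. It runs on SQLite through Python's sqlite3 module purely so that it is self-contained; SQL Server's recursive common table expressions are similar but written without the RECURSIVE keyword. The table names are lower-cased versions of yours, the join table is shortened to category_attribute, the is_* flag columns are left out, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES category(id),  -- NULL for a top-level node
    dept_id   INTEGER NOT NULL REFERENCES department(id),
    name      TEXT NOT NULL
);
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category_attribute (
    category_id  INTEGER NOT NULL REFERENCES category(id),
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    PRIMARY KEY (category_id, attribute_id)
);

-- Hypothetical sample data.
INSERT INTO department VALUES (1, 'Electronics'), (2, 'Clothing');
INSERT INTO category VALUES
    (1, NULL, 1, 'Computers'),
    (2, 1,    1, 'Laptops'),
    (3, 2,    1, 'Gaming Laptops'),
    (4, NULL, 2, 'Menswear');
INSERT INTO attribute VALUES (1, 'Screen Size'), (2, 'RAM'), (3, 'Fabric');
INSERT INTO category_attribute VALUES (3, 1), (3, 2), (4, 3);
""")

# Start at the department's top-level categories, then repeatedly follow
# parent_id links downwards until no new rows are found.
ATTRIBUTES_FOR_DEPARTMENT = """
WITH RECURSIVE subtree(id) AS (
    SELECT id FROM category WHERE dept_id = ? AND parent_id IS NULL
    UNION ALL
    SELECT c.id FROM category AS c JOIN subtree AS s ON c.parent_id = s.id
)
SELECT DISTINCT a.name
FROM subtree AS s
JOIN category_attribute AS ca ON ca.category_id = s.id
JOIN attribute AS a ON a.id = ca.attribute_id
ORDER BY a.name
"""


def attributes_for_department(dept_id):
    """Return the attribute names attached anywhere under a department."""
    rows = conn.execute(ATTRIBUTES_FOR_DEPARTMENT, (dept_id,)).fetchall()
    return [name for (name,) in rows]


print(attributes_for_department(1))
print(attributes_for_department(2))
```

The query does not care how deep the tree is, which is the adjacency list's strength; the cost is that every read has to walk the levels one step at a time.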

I would suggest you separate the entities that go into the hierarchy from the representation of that hierarchy and from the attributes stored.

If you hold the same information about each category whatever its level then a single Category table will suffice. If the levels vary significantly in the details held then implement entity inheritance.

In a separate table, hold the hierarchical relationships. The precise schema will depend on which implementation you choose. By holding only the categories' primary keys in this table, tree traversal will be faster.
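As one possible shape for such a separate table, here is a sketch of the closure table variant, again on SQLite via Python's sqlite3 module for the sake of a runnable example. The category_closure table, its depth column and the helper functions are my own naming for illustration; the other three models would need a different schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- One row per (ancestor, descendant) pair, including each node paired
-- with itself at depth 0. Only keys are stored here.
CREATE TABLE category_closure (
    ancestor_id   INTEGER NOT NULL REFERENCES category(id),
    descendant_id INTEGER NOT NULL REFERENCES category(id),
    depth         INTEGER NOT NULL,
    PRIMARY KEY (ancestor_id, descendant_id)
);
""")


def add_category(cat_id, name, parent_id=None):
    """Insert a category, linking it to itself and to all of its ancestors."""
    conn.execute("INSERT INTO category (id, name) VALUES (?, ?)", (cat_id, name))
    conn.execute("INSERT INTO category_closure VALUES (?, ?, 0)", (cat_id, cat_id))
    if parent_id is not None:
        # Every ancestor of the parent (and the parent itself) is also an
        # ancestor of the new node, one level further away.
        conn.execute(
            """
            INSERT INTO category_closure (ancestor_id, descendant_id, depth)
            SELECT ancestor_id, ?, depth + 1
            FROM category_closure
            WHERE descendant_id = ?
            """,
            (cat_id, parent_id),
        )


def descendants(cat_id):
    """Names of everything below a category, nearest levels first."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM category_closure AS cc
        JOIN category AS c ON c.id = cc.descendant_id
        WHERE cc.ancestor_id = ? AND cc.depth > 0
        ORDER BY cc.depth, c.name
        """,
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


def ancestors(cat_id):
    """Names of everything above a category, from the root downwards."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM category_closure AS cc
        JOIN category AS c ON c.id = cc.ancestor_id
        WHERE cc.descendant_id = ? AND cc.depth > 0
        ORDER BY cc.depth DESC
        """,
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical sample data.
add_category(1, "Computers")
add_category(2, "Laptops", parent_id=1)
add_category(3, "Gaming Laptops", parent_id=2)
add_category(4, "Ultrabooks", parent_id=2)

print(descendants(1))
print(ancestors(3))
```

Reads in either direction are a single join with no recursion, at the price of storing one row for every ancestor and descendant pair.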

Attributes are held, as you suggest, in an intersection between Category and Attribute, with only the lowest-level categories participating. (My data modeller Spider Sense is tingling at this; I think there are a lot of questions to be answered about Attributes before correctness, completeness and consistency can be assured. But that's for another day.)

From the little knowledge I can glean from your two questions, I'd think the hierarchy is stable and any change will require a release (because of the UI changes), so I'd put my two cents on the nested set model.
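For anyone unfamiliar with it, a minimal sketch of the nested set model follows, run on SQLite through Python's sqlite3 module so that it is self-contained. Each row carries a left and right number assigned by a depth-first walk of the tree, and a node's descendants are exactly the rows whose numbers fall inside its own pair. The column names lft and rgt, the sample tree and the helper functions are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    lft  INTEGER NOT NULL,
    rgt  INTEGER NOT NULL
);

-- Hypothetical tree, numbered by a depth-first walk:
--   Computers (1, 8)
--     Laptops (2, 7)
--       Gaming Laptops (3, 4)
--       Ultrabooks (5, 6)
INSERT INTO category VALUES
    (1, 'Computers',      1, 8),
    (2, 'Laptops',        2, 7),
    (3, 'Gaming Laptops', 3, 4),
    (4, 'Ultrabooks',     5, 6);
""")


def descendants(cat_id):
    """Everything strictly inside the node's (lft, rgt) interval."""
    rows = conn.execute(
        """
        SELECT child.name
        FROM category AS parent
        JOIN category AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        WHERE parent.id = ?
        ORDER BY child.lft
        """,
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


def ancestors(cat_id):
    """Everything whose interval strictly encloses the node's interval."""
    rows = conn.execute(
        """
        SELECT parent.name
        FROM category AS node
        JOIN category AS parent
          ON parent.lft < node.lft AND parent.rgt > node.rgt
        WHERE node.id = ?
        ORDER BY parent.lft
        """,
        (cat_id,),
    ).fetchall()
    return [name for (name,) in rows]


def leaves():
    """Leaf nodes have nothing between their left and right numbers."""
    rows = conn.execute(
        "SELECT name FROM category WHERE rgt = lft + 1 ORDER BY lft"
    ).fetchall()
    return [name for (name,) in rows]


print(descendants(1))
print(ancestors(4))
print(leaves())
```

Reads need no recursion in either direction, but adding or moving a node means renumbering lft and rgt on the rows that follow it, which is why this model suits a hierarchy that changes rarely.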

Michael Green · Answer 1 · 2020-09-09 12:37:24Z
