So I am new to Databases.Recently I was asked this question in an interview where they expected me to come with a schema for storing and retrieving categories for an E-commerce website in an RDBMS. Each category could have many sub-categories. But each sub-category could have just one parent category. So this is the idea that I gave them: First I created the table like so:
create table category(category_id TEXT PRIMARY KEY,
category_name TEXT,parent_id TEXT);
Then I created an index on the parent_id, like so:
Create index idx_category_table on category(parent_id);
Then I used the following queries to get the parent and child queries:
Select B.category_name as CategoryName from category A,
category B where A.category_id='3' and
A.parent_id = B.category_id;
Select B.category_name as CategoryName from category A,
category B where A.category_id='1' and A.category_id = B.parent_id;
I was rejected from the interview. However, I am curious to know how to better this.
2 Answers 2
I have a small concern about your approach that I think you should have addressed during the interview because it could raise potential issues in your design.
Do you want/need nested product categories?
As it currently stands you're allowing nested categories and even allowing loops of categories where a category can reference itself or one of it's own subcategories.
If you don't want nested categories then one of your restrictions will have to be that category with a parent should check it's parent doesn't also have a parent. The easiest way to do this is with two extra virtual columns(not stored), the first will simply say if a table has a parent(parent_id IS NOT NULL
) and the second defines the value we need to create said relationship(namely that the value should be 0).
CREATE TABLE category(
category_id TEXT PRIMARY KEY,
category_name TEXT,
parent_id TEXT,
is_subcategory INT GENERATED ALWAYS AS (parent_id IS NOT NULL) VIRTUAL,
can_be_child_of_subcategory INT GENERATED ALWAYS AS (0) VIRTUAL,
FOREIGN KEY (parent_id, can_be_child_of_subcategory) REFERENCES category(category_id, is_subcategory) ON UPDATE RESTRICT
);
And now without changing the definition of the virtual columns we can't add a subcategory to a subcategory.
That said there's a few less major issues that might be worth considering in a bit more detail:
You should be using a foreign key, not an index:
FOREIGN KEY(parent_id) REFERENCES category(category_id)
You appear to be using TEXT as the field type, but accessing it with integers. If you're using a text type it's worth making it clear it's a GUID, either way you should use the right type for the field.
category_id INT
category_id = :sub_category_id
-- OR
category_id CHARACTER(32)
category_id = :sub_category_guid
It's ANSISQL standards to write keywords in upper case, obviously you don't need to keep to that standard as every SQL language supports lower or mixed case, but almost every documentation will have keywords in all caps so it's worth keeping to the convention:
CREATE TABLE category(category_id CHARACTER(32) PRIMARY KEY,
category_name TEXT,parent_id CHARACTER(32));
I also suggest making things not null
CREATE TABLE category(category_id CHARACTER(32) NOT NULL PRIMARY KEY,
category_name TEXT,parent_id CHARACTER(32));
I'd also suggest adding a bit more whitespace and at least for an interview document choices
CREATE TABLE category(
category_id CHARACTER(32) NOT NULL PRIMARY KEY,
category_name TEXT,
parent_id CHARACTER(32),
FOREIGN KEY (parent_id)
REFERENCES category(category_id) ON UPDATE RESTRICT
);
Finally I'm really not a fan of your naming on the aliases for the table. Much nicer to use a descriptive name to make it easier to understand:
-- Fetch all child subcategories for a given category
SELECT
SubCategory.category_name AS CategoryName
FROM category ParentCategory,
category SubCategory
WHERE
ParentCategory.category_id = :parent_category_guid
AND
ParentCategory.parent_id = SubCategory.category_id
;
-- Fetch the parent category for the child
SELECT
ParentCategory.category_name AS CategoryName
FROM category ParentCategory,
category SubCategory
WHERE
SubCategory.category_id = :sub_category_guid
AND
ParentCategory.category_id = SubCategory.parent_id
;
-
\$\begingroup\$ Nesting of categories must be allowed. We are trying to represent a directory like structure. \$\endgroup\$Suryasis Paul– Suryasis Paul2020年03月02日 12:25:17 +00:00Commented Mar 2, 2020 at 12:25
-
\$\begingroup\$
NOT NULL PRIMARY KEY
is redundant; the latter implies the former. See for instance postgresql.org/docs/current/… \$\endgroup\$Reinderien– Reinderien2020年07月30日 12:31:05 +00:00Commented Jul 30, 2020 at 12:31 -
\$\begingroup\$ @Reinderien He tagged SQLite, not postgress. In SQLite primary keys allow null unless otherwise stated for compatibility reasons. \$\endgroup\$scragar– scragar2020年07月31日 13:06:19 +00:00Commented Jul 31, 2020 at 13:06
-
\$\begingroup\$ That's a terrible idea, but it's good to know. \$\endgroup\$Reinderien– Reinderien2020年07月31日 13:24:51 +00:00Commented Jul 31, 2020 at 13:24
I'll echo @scragar's recommendation that your ID types be integers and not text. This is a significant misrepresentation of types.
Otherwise: you're using the old, implicit join style. The new, explicit join is recommended; this would turn:
Select B.category_name as CategoryName from category A,
category B where A.category_id='3' and
A.parent_id = B.category_id;
into
select parent.category_name
from category child
join category parent on parent.id = child.parent_id
where child.id = 3;
The schema is also open to integrity failures. You need to add a references
clause in your table definition.