1

I am trying to build a publications database, and i want to develop a genre relation that has various (maybe look up tables) sub-genres attached. as an example, in the genre relation there would be a row for 'POETRY' but somehow i want to divide that into sub genres like 'PROSE', 'HAIKU' or 'FREE_VERSE'.

what is the best way to design such a thing? would i have a separate look-up table for each row in the genre relation for the various sub-genres? that doesn't seem very flexible. any input would be greatly appreciated.

asked May 27, 2015 at 23:53
9
  • Would you be collecting different data for each of the subtypes ('PROSE', 'HAIKU' or 'FREE_VERSE')? Or would the data all be the same for all subtypes of "POETRY"? Commented May 28, 2015 at 0:48
  • mostly, i am just trying to create a searchable database of various published works. the data rows would be much the same, but the sub types would be used as a classification system... maybe thats not the right way to go about it? Commented May 28, 2015 at 4:29
  • agreeed @TemporaryName. I'm trying to strike a balance between designing a system that is both flexible and complete, but that i can still write queries against (at my skill level) Commented May 28, 2015 at 20:02
  • @StillLearningToCode Just to be clear, are there different properties for each genre? I can amend my answer to deal with a simpler case of merely needing to designate subtype but where all Publication types are really the same. Commented May 28, 2015 at 20:04
  • 1
    And I agree with @TemporaryName in that queries should not be the main concern at this point. Queries will be the least of your problems if you get the model wrong. If a simpler structure that requires a sub-genre is more appropriate, then I will amend my answer. If a simpler structure that requires a flexible hierarchy of genre is appropriate, then the accepted answer should be moved from mine to Walter's. Regardless, take some time to think about the nature of the data so that you can provide clearer direction: I would rather give up the 15 pts than contribute to yet more over-engineering. Commented May 28, 2015 at 20:37

3 Answers 3

4

This seems nearly identical to the following question, and my answer there should be detailed enough to explain the concept (and is just too much to copy and paste here).

Don't know how to transform variable entity into relational table

For your case specifically I would think it should work itself out to be something along the lines of:

  • Publication table is the main entity with the common properties, a PK on PublicationID, and a PublicationTypeID field that FKs to PublicationType
  • Lookup table for PublicationType that has a PK of PublicationTypeID. Entries would be the various genres.
  • Sub-class tables for the genres, such as Poetry, Fiction, etc. Each sub-class table has a PK named PublicationID that also FKs back to the Publication table, and fields for properties that are specific to the particular genre.
  • Where applicable, sub-class tables have sub-type field, such as PoetryTypeID and FKs to the PoetryType table.
  • Lookup tables for sub-classes, one table per each sub-class type. The PK would be PoetryTypeID (e.g. Prose, Haiku, Free Verse, etc).

In light of additional details being made available that give context to the scope of the request (specifically the following comment on the question:

the data files ... are all in fact printed, no audio of image/video files. ... one sub-genre is sufficient, and ... it is possible to have just a genre without any sub sections..

), it seems clear now that the flexibility allowed for by my suggestion above is not needed as sub-types aren't really being stored. Instead, the request now seems to be for simply a means of categorization. For that, if you want a rigid structure of Genre / Sub-Genre such that:

  • those are the only two levels, and
  • you always know which level you are dealing with (in queries), as opposed to a dynamic structure where each Publication can be associated with a Genre / Sub-Genre at any level (1 through N) and you would need to query to get the heirarchy

then you can do something along the lines of:

Publication
-------------
PublicationID INT NOT NULL PRIMARY KEY
SubGenreID TINYINT NOT NULL FOREIGN KEY REFERENCES(SubGenre.SubGenreID)
Title VARCHAR(200) NOT NULL
...
SubGenre
-------------
SubGenreID TINYINT NOT NULL PRIMARY KEY
GenreID TINYINT NOT NULL FOREIGN KEY REFERENCES(Genre.GenreID)
Name VARCHAR(50) NOT NULL
Genre
-------------
GenreID TINYINT NOT NULL PRIMARY KEY
Name VARCHAR(50) NOT NULL

In this model you know that what is in the Publication table is always the SubGenre, and it always has a parent Genre. You did say that it is possible to not have a SubGenre (though is that really the case?), you can use the same Name for both SubGenre and Genre (e.g. News -> News instead of Poetry -> Bad Teenage).

Or, if you want a dynamic structure of Genre / Sub-Genre such that:

  • There can be just one level or any number of levels, and
  • You can only easily tell if you are either at:

    • a top-level Genre (ParentID IS NULL), or
    • one or more levels deep in SubGenre-land (ParentID IS NOT NULL).

    Else you need to query to get the full hierarchy

then you can do something along the lines of:

Publication
-------------
PublicationID INT NOT NULL PRIMARY KEY
PublicationTypeID SMALLINT NOT NULL FOREIGN KEY
 REFERENCES(PublicationType.PublicationTypeID)
Title VARCHAR(200) NOT NULL
...
PublicationType
-------------
PublicationTypeID SMALLINT NOT NULL PRIMARY KEY
ParentPublicationTypeID SMALLINT NULL FOREIGN KEY 
 REFERENCES(PublicationType.PublicationTypeID)
Name VARCHAR(50) NOT NULL

Please note that the ParentPublicationTypeID field is NULLable, which indicates a top-level item when it is NULL. This is the model that Walter Mitty suggested.

answered May 28, 2015 at 5:10
1

Perhaps you'll find the "Nested Set Model" for suitable for genres. Some advantages to this are:

  1. You can sub-genre things as deeply as your heart desires.
  2. You only need 1 table for defining Genres (not counting the associative table between Genre and Publication).

Here's an article that explains it: http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

In your case, the tables would like something like so: enter image description here

Some sample data: enter image description here

For this sample data, "Green Eggs and Ham" are assigned the genres of "Children's", "Prose", and "Poetry". Even though the "Poetry" assignment isn't explicit in the PublicationGenre table, it can be inferred by the fact that "Prose" is a child of "Poetry". Similarly, "Joe's Haiku" is assigned the genres of "Haiku" and "Poetry".

What the "Nested Set Model" would look like for this sample data: enter image description here

answered Jun 3, 2015 at 0:16
2
  • 2
    The value of this technique depends greatly on which RDBMS is being used. If the implementation will be on MS SQL Server then one can use a proper, self-referencing relationship and query that easily using recursive CTEs (Common Table Expressions) OR use the hierarchyid datatype. And I believe PostgreSQL and/or Oracle can also handle recursive CTEs or at least have similar features. With those options available, this Nested Set Model appears to be clunky (at best). However, it still seems that MySQL (and possibly others) does not have those features, in which case this is a good work-around. Commented Jun 3, 2015 at 5:27
  • 1
    @srutzky Cool information, I never heard of recursive CTEs. Definitely sounds cleaner that way if the RDBMS supports it. Commented Jun 3, 2015 at 14:37
0

A solution like class-table-inheritance is probably overkill in this case. It sounds like you need the same data, in the same fields, for sub genres and genres. This can all be stored in one lookup table, the PublicationType table.

You need one more field in this table, to record the relationship between sub-genres and genres. I'll call this field ParentPublicationTypeId. The entry in the Haiku row would contain the PublicationTypeId for Poetry. This references a different row in the same table, but it's just another case of a foreign key. The buzzword is "reflexive relationships" if you want to find further reading.

answered May 28, 2015 at 10:20

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.