I'm trying to design database tables to track files uploaded by end users.
Files can be uploaded under different contexts.
Each context is a different table.
As an example:
- An employee can upload multiple "receipt" files against a "weekly expense" entry.
- An employee can upload multiple "photo" files against a "pets" entry.
The entities / tables in this (contrived) scenario are:
Employees (Id, Name)
Expenses (Id, Date, RequestedRefund, RequestedByEmployeeId)
Pets (Id, Name, Type, BelongsToEmployeeId)
I already have a table called Files, which tracks information about the file itself:
Files (Id, Name, Size, Extension, Folder)
My question is about how to map an expense receipt upload and a pet photo upload against the file record. I know I can do this in 2 ways:
1st Option:
Have one generic mapping table: GenericFileMap (FileId, ContextId, Type)
where
- FileId is the id of the file record
- ContextId is the id of the context record that I'm trying to retrieve
- Type is the field to describe the context itself
In this scenario, I'd fetch all expense receipts for an employee like so:
SELECT *
FROM Employees e
INNER JOIN Expenses ex ON e.Id = ex.RequestedByEmployeeId
INNER JOIN GenericFileMap g ON g.ContextId = ex.Id AND g.Type = 'expense'
INNER JOIN Files f ON g.FileId = f.Id
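For reference, a minimal sketch of how this generic mapping table might be declared. The column types, the key choice, and the constraint names are assumptions (the tables above only list column names), and it assumes Files.Id is an INT primary key:

```sql
-- Sketch only: assumes the existing Files table has an INT primary key named Id.
-- Column types and constraint names are illustrative guesses.
CREATE TABLE GenericFileMap
(
    FileId    INT         NOT NULL,
    ContextId INT         NOT NULL,  -- Expenses.Id, Pets.Id, ... depending on [Type]
    [Type]    VARCHAR(20) NOT NULL,  -- e.g. 'expense', 'pet'
    CONSTRAINT PK_GenericFileMap PRIMARY KEY ([Type], ContextId, FileId),
    CONSTRAINT FK_GenericFileMap_Files FOREIGN KEY (FileId) REFERENCES Files (Id)
    -- No FOREIGN KEY is declared on ContextId: a foreign key references exactly
    -- one table, while ContextId may refer to any of the context tables.
);
```

Only FileId can be protected by a declared constraint here; nothing stops a row from pointing at an expense or pet that does not exist.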
2nd Option:
Have separate mapping table for each entity / context table:
ExpenseFiles (ExpenseId, FileId)
PetFiles (PetId, FileId)
The query for receipts would be:
SELECT *
FROM Employees e
INNER JOIN Expenses ex ON e.Id = ex.RequestedByEmployeeId
INNER JOIN ExpenseFiles ef ON ef.ExpenseId = ex.Id
INNER JOIN Files f ON ef.FileId = f.Id
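A minimal sketch of what the per-context mapping tables might look like. It assumes Expenses.Id, Pets.Id, and Files.Id are INT primary keys; the constraint names are made up for illustration:

```sql
-- Sketch only: assumes Expenses, Pets and Files already exist with INT primary keys named Id.
CREATE TABLE ExpenseFiles
(
    ExpenseId INT NOT NULL,
    FileId    INT NOT NULL,
    CONSTRAINT PK_ExpenseFiles PRIMARY KEY (ExpenseId, FileId),
    CONSTRAINT FK_ExpenseFiles_Expenses FOREIGN KEY (ExpenseId) REFERENCES Expenses (Id),
    CONSTRAINT FK_ExpenseFiles_Files    FOREIGN KEY (FileId)    REFERENCES Files (Id)
);

CREATE TABLE PetFiles
(
    PetId  INT NOT NULL,
    FileId INT NOT NULL,
    CONSTRAINT PK_PetFiles PRIMARY KEY (PetId, FileId),
    CONSTRAINT FK_PetFiles_Pets  FOREIGN KEY (PetId)  REFERENCES Pets (Id),
    CONSTRAINT FK_PetFiles_Files FOREIGN KEY (FileId) REFERENCES Files (Id)
);
```

With this layout both sides of every mapping row are covered by a foreign key, so the database itself rejects a mapping to a missing expense, pet, or file.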
I believe the 1st option is not the "normalized" way to do things in a relational database. But the only reason I'm considering it is that, in my particular situation, there are at least 15 different contexts from which a user can upload / attach files.
The only benefit I see from the 1st option is the time I'd save creating and duplicating tables and stored procedures (one each for CRUD) for each and every context.
However, it does seem that the 2nd option is the "right" way to do things.
My questions:
- Which of the two options is the smarter approach?
- Which of the two is better for performance in the long run? (I'll definitely put in the extra effort needed if the second option will be more performant.)
- As far as I understand the scenario, the "generic" table has one big downside: it's impossible to define foreign key constraints from one column to different tables. You would have to build complex, error-prone, and comparatively slow triggers to ensure integrity. And integrity is a very important point. – sticky bit, Aug 25, 2018 at 6:43
- See the discussion for this question. – Michael Green, Aug 25, 2018 at 12:00
1 Answer
You don't want option # 1 because:
- You might as well just place ContextId and Type into the Files table.
- You would be better off with a TypeID TINYINT field (and the associated Type lookup table) so that you didn't need to do an inefficient string comparison each time.
What is probably best is a modified option # 2. Yes, start with the context-specific file map tables (e.g. ExpenseFiles, PetFiles, etc) but also add a ContextTypeID TINYINT column in the existing Files table. There should also be a new ContextType lookup table with ContextTypeID TINYINT NOT NULL as the Primary Key (but do not make it an IDENTITY column). You can add a Name column to the lookup table, and a Foreign Key from Files to ContextType on ContextTypeID. Having this new column in the Files table will make it much easier to determine which context-specific file map / property table the associated record is in.
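The lookup table and the new column described above might be sketched as follows. The VARCHAR length, the seed values, the constraint names, and making the new column nullable (so that existing Files rows are not rejected) are all assumptions, not part of the original schema:

```sql
-- Lookup table: the key is assigned by hand, not generated by IDENTITY.
CREATE TABLE ContextType
(
    ContextTypeID TINYINT     NOT NULL,
    Name          VARCHAR(50) NOT NULL,
    CONSTRAINT PK_ContextType PRIMARY KEY (ContextTypeID)
);

-- Illustrative seed rows; the real list would cover every upload context.
INSERT INTO ContextType (ContextTypeID, Name)
VALUES (1, 'Expense'),
       (2, 'Pet');

-- New column on the existing Files table, with a foreign key to the lookup table.
-- Nullable here only so that rows already in Files do not violate the constraint.
ALTER TABLE Files
    ADD ContextTypeID TINYINT NULL
        CONSTRAINT FK_Files_ContextType FOREIGN KEY REFERENCES ContextType (ContextTypeID);
```

Given a file id, a query along these lines then tells you which map table to look in (42 is just a placeholder value):

```sql
DECLARE @FileId INT = 42;  -- placeholder id

SELECT f.Id, f.Name, ct.Name AS ContextName
FROM Files f
INNER JOIN ContextType ct ON ct.ContextTypeID = f.ContextTypeID
WHERE f.Id = @FileId;
```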
The reason for going with option # 2, even if there are 15 different contexts, is that, inevitably, at least some of them will evolve and diverge over time. So you might have:
ExpenseFiles (ExpenseId, FileId, Date, Total)
PetFiles (PetId, FileId, Name, Age, AnimalType)
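As a rough illustration of that kind of divergence, each map table can gain its own columns without touching the others. The data types below are guesses, and the columns are added as nullable so that existing rows remain valid:

```sql
-- Hypothetical later change: only the expense context gains receipt details.
ALTER TABLE ExpenseFiles
    ADD [Date] DATE           NULL,
        Total  DECIMAL(10, 2) NULL;

-- The pet context gains photo details independently.
ALTER TABLE PetFiles
    ADD Name       VARCHAR(50) NULL,
        Age        TINYINT     NULL,
        AnimalType VARCHAR(30) NULL;
```

With a single generic map table, every one of these columns would have to live in the same table and stay NULL for all the contexts it does not apply to.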