I am looking for some best practices regarding table design when an entity is expected to evolve over time and get more properties.
As an example, let's consider a process that has a bunch of metadata and can either be running or finished. A running process would have a startTime but no endTime, whereas a finished process needs to have an endTime. It is assumed that every runningProcess will be finished at some point.
I see some general approaches:
1. A single table processes with a nullable endTime column where all processes are stored.
2. Two tables runningProcesses and finishedProcesses that look the same, except that finishedProcesses has an endTime column that the runningProcesses table lacks.
3. An extra table processEndTimes that only stores a processId with a FK constraint and an endTime.
Option 1) is pretty simple but has the disadvantage that a query for finishedProcesses would run on a larger table and filter by the endTime column.
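To make Option 1) concrete, here is a minimal sketch using Python's built-in sqlite3 module, chosen only because it runs anywhere. The id and metadata columns, the sample rows, and the helper names finish_process and finished_ids are assumptions for illustration, not part of the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE processes (
        id        INTEGER PRIMARY KEY,
        metadata  TEXT NOT NULL,     -- assumed stand-in for the real metadata
        startTime TEXT NOT NULL,
        endTime   TEXT               -- NULL while the process is running
    )
""")
conn.executemany(
    "INSERT INTO processes (id, metadata, startTime, endTime) VALUES (?, ?, ?, ?)",
    [
        (1, "import job", "2021-11-23T10:00:00", None),
        (2, "export job", "2021-11-23T10:05:00", "2021-11-23T10:07:00"),
    ],
)


def finish_process(conn, process_id, end_time):
    """Ending a process is a single UPDATE on the one table."""
    with conn:
        conn.execute(
            "UPDATE processes SET endTime = ? WHERE id = ? AND endTime IS NULL",
            (end_time, process_id),
        )


def finished_ids(conn):
    """Finished processes are found by filtering the full table on endTime."""
    rows = conn.execute(
        "SELECT id FROM processes WHERE endTime IS NOT NULL ORDER BY id"
    )
    return [row[0] for row in rows]


before = finished_ids(conn)
finish_process(conn, 1, "2021-11-23T10:30:00")
after = finished_ids(conn)
```

The filter in finished_ids is the cost described above: it runs against every row, running or finished.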
Option 2) has the disadvantage that ending a process is not an atomic action any more but needs to be a transaction to delete a row from runningProcesses and insert a row in finishedProcesses. Additionally, a query for all processes with some metadata would always run on a union of the tables.
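A rough sketch of what Option 2) implies, again with Python's sqlite3 module for illustration only. The id and metadata columns, the sample rows, and the helpers finish_process and all_process_ids are assumed names, not anything from the real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE runningProcesses (
        id        INTEGER PRIMARY KEY,
        metadata  TEXT NOT NULL,
        startTime TEXT NOT NULL
    );
    CREATE TABLE finishedProcesses (
        id        INTEGER PRIMARY KEY,
        metadata  TEXT NOT NULL,
        startTime TEXT NOT NULL,
        endTime   TEXT NOT NULL
    );
    INSERT INTO runningProcesses
        VALUES (1, 'import job', '2021-11-23T10:00:00');
    INSERT INTO finishedProcesses
        VALUES (2, 'export job', '2021-11-23T10:05:00', '2021-11-23T10:07:00');
""")


def finish_process(conn, process_id, end_time):
    """Move a row between the tables; INSERT and DELETE commit or roll back together."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT id, metadata, startTime FROM runningProcesses WHERE id = ?",
            (process_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no running process with id {process_id}")
        conn.execute(
            "INSERT INTO finishedProcesses (id, metadata, startTime, endTime) "
            "VALUES (?, ?, ?, ?)",
            (*row, end_time),
        )
        conn.execute("DELETE FROM runningProcesses WHERE id = ?", (process_id,))


def all_process_ids(conn):
    """Any query over all processes has to combine both tables."""
    rows = conn.execute("""
        SELECT id FROM runningProcesses
        UNION ALL
        SELECT id FROM finishedProcesses
        ORDER BY id
    """)
    return [row[0] for row in rows]


finish_process(conn, 1, "2021-11-23T10:30:00")
running_count = conn.execute("SELECT COUNT(*) FROM runningProcesses").fetchone()[0]
finished_count = conn.execute("SELECT COUNT(*) FROM finishedProcesses").fetchone()[0]
```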
Option 3) seems to overcome the disadvantages of Option 1) and 2) but has the disadvantage that the table design does not reflect the way I'd model the processes in business logic.
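For comparison, a minimal sketch of Option 3) with Python's sqlite3 module. Making processId the primary key of processEndTimes (so a process can end at most once) is my assumption, as are the id and metadata columns, the sample rows, and the helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE processes (
        id        INTEGER PRIMARY KEY,
        metadata  TEXT NOT NULL,
        startTime TEXT NOT NULL
    );
    CREATE TABLE processEndTimes (
        processId INTEGER PRIMARY KEY REFERENCES processes (id),
        endTime   TEXT NOT NULL
    );
    INSERT INTO processes VALUES (1, 'import job', '2021-11-23T10:00:00');
    INSERT INTO processes VALUES (2, 'export job', '2021-11-23T10:05:00');
    INSERT INTO processEndTimes VALUES (2, '2021-11-23T10:07:00');
""")


def finish_process(conn, process_id, end_time):
    """Ending a process is a single INSERT; the processes row is never touched."""
    with conn:
        conn.execute(
            "INSERT INTO processEndTimes (processId, endTime) VALUES (?, ?)",
            (process_id, end_time),
        )


def finished_ids(conn):
    """Finished processes are those with a matching end-time row."""
    rows = conn.execute("""
        SELECT p.id
        FROM processes AS p
        JOIN processEndTimes AS e ON e.processId = p.id
        ORDER BY p.id
    """)
    return [row[0] for row in rows]


def running_ids(conn):
    """Running processes are those without an end-time row."""
    rows = conn.execute("""
        SELECT p.id
        FROM processes AS p
        LEFT JOIN processEndTimes AS e ON e.processId = p.id
        WHERE e.processId IS NULL
        ORDER BY p.id
    """)
    return [row[0] for row in rows]


finish_process(conn, 1, "2021-11-23T10:30:00")

# The FK constraint rejects an end time for a process that does not exist.
try:
    finish_process(conn, 99, "2021-11-23T11:00:00")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The joins in finished_ids and running_ids show the mismatch mentioned above: "finished" is no longer a property of a row but the existence of a row in another table.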
What option would you go for? And why?
1 Answer
Option 4
- A process table with only process metadata. No start_time/end_time.
- A processLog table with the process ID, a log_type column (start, end, info, debug, error, "I'm on step X", etc.) and a few others.
- A processStatus (materialized) view of the previous two tables.
- And a few process_api procedures to control the flow of data.
Description
In this design
- the Model is hidden
- you look at the data through a View
- and Control the new data via API.
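One possible reading of this design, sketched with Python's sqlite3 module. SQLite has neither materialized views nor stored procedures, so a plain view and ordinary Python functions stand in for processStatus and the process_api procedures here. Apart from log_type, every column name (process_id, logged_at, message) and every function name is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    PRAGMA foreign_keys = ON;

    -- Model: metadata only, no start/end time columns.
    CREATE TABLE process (
        id       INTEGER PRIMARY KEY,
        metadata TEXT NOT NULL
    );
    CREATE TABLE processLog (
        id         INTEGER PRIMARY KEY,
        process_id INTEGER NOT NULL REFERENCES process (id),
        log_type   TEXT NOT NULL,   -- 'start', 'end', 'info', 'debug', 'error', ...
        logged_at  TEXT NOT NULL,
        message    TEXT
    );

    -- View: start and end times are derived from the log rows.
    CREATE VIEW processStatus AS
    SELECT p.id,
           p.metadata,
           MIN(CASE WHEN l.log_type = 'start' THEN l.logged_at END) AS start_time,
           MAX(CASE WHEN l.log_type = 'end'   THEN l.logged_at END) AS end_time
    FROM process AS p
    LEFT JOIN processLog AS l ON l.process_id = p.id
    GROUP BY p.id, p.metadata;
""")


# Control: hypothetical stand-ins for the process_api procedures.
def log_event(conn, process_id, log_type, at, message=None):
    with conn:
        conn.execute(
            "INSERT INTO processLog (process_id, log_type, logged_at, message) "
            "VALUES (?, ?, ?, ?)",
            (process_id, log_type, at, message),
        )


def start_process(conn, metadata, at):
    with conn:  # the process row and its 'start' log row are committed together
        process_id = conn.execute(
            "INSERT INTO process (metadata) VALUES (?)", (metadata,)
        ).lastrowid
        conn.execute(
            "INSERT INTO processLog (process_id, log_type, logged_at) "
            "VALUES (?, 'start', ?)",
            (process_id, at),
        )
    return process_id


def end_process(conn, process_id, at):
    log_event(conn, process_id, "end", at)


def status(conn, process_id):
    return conn.execute(
        "SELECT start_time, end_time FROM processStatus WHERE id = ?",
        (process_id,),
    ).fetchone()


pid = start_process(conn, "import job", "2021-11-23T10:00:00")
log_event(conn, pid, "info", "2021-11-23T10:10:00", "I'm on step 2")
status_running = status(conn, pid)
end_process(conn, pid, "2021-11-23T10:30:00")
status_finished = status(conn, pid)
```

In a database that supports them, the view could be materialized and the three functions could become stored procedures; the split between tables, view, and API stays the same.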
Thank you for your insights @Michael! However, is it to be considered good practice that the model is hidden at this level? In my actual projects, accessing a database will always be done over an API anyway, so the model is never exposed to the user. But wouldn't I want to expose the model to other developers, including myself a few months down the road, to reveal the intention at first glance? – Philipp Just, Nov 23, 2021 at 10:21