Ok, I couldn't find a better title, to be honest.
I'm recently landed in a new working place, and I've been asked to help with the DB. I wanted to try to explain a concept, but I'm not sure how to word it at best, and maybe this concept has already a better explanation and a name, too. I don't label myself as a DB expert, so I'd like to present them something more official than my own point of view, no matter how many good DBs I've designed in my life.
One of the things I've found mostly amusing (but not in a good way) is that they seems to follow design the db according to the logic of the problem and not according to "this is a database, it should contain data", so for example to know if an element is in a certain category, the query goes this way (in pseudocode):
SELECT Column1, Column2...ColumnN
FROM Invoices [LIST OF JOIN]
WHERE
Invoces.Field1 IN
(TableA WHERE TableA.Field NOT IN
(TableB WHERE TableB.Field IN ((TableC WHERE) OR (TableD WHERE)))
OR Invoces.Field1 NOT IN
(TableB WHERE TableB.Field IN ((TableC WHERE) AND (TableD WHERE))
OR TableB.Field IN
(TableC WHERE TableC.Field IN (TableC WHERE))
OR TableB.Field NOT IN
(TableC WHERE TableC.Field NOT IN (TableD WHERE))
and so on and on for other twenty conditions or so. No, I'm not joking.
This happens because instead of saying "this is an invoice and its category is an attribute of the invoice itself", they go all the way with things like "the invoice is been inserted by that user, in that day, and he could have inserted it into a certain department or none at all, and in case of the department then that department could be part of a certain structure..." and so on. It's not even a matter of referential integrity, there are no FKs at all in this case...
Another practical example is the date of last edit of a document: instead of being stored as an attribute, it's calculated by a function which goes something along this way:
if that document is related to another, then if the other is this kind, then if the first document is related to this other document too, then the date of last edit is the date when the administrator has printed the daily appointment calendar"...
clearly with a bunch of else for every if, too. AND, it's not because there is some sort of dependencies, it's just because many things are stored to reflect their logic.
In a way I'd be tempted to say "write once, read many", with the meaning of "write that attribute once and then read it for free as many times as you need", because we have a low frequency of writes and a very high frequency of reads, and clearly all reads must every time recalculate everything. But I've got a feeling there must be some principle that says that you should design databases to hold data...
1 Answer 1
It Depends
The purpose of a database is to store valid data
in a secure fashion for multiple users across multiple applications.
Some of the logic will focus on the valid data
portion of that statement. These will be your constraints
and data types
.
Other parts of code within the database will focus on the multiple applications
portion. For example, a VIEW
will hide some complex logic to ensure that all applications see the data in an identical manner. One application could be a web based application, the other application could be a 3rd party report generator.
OLTP
Within an OLTP system, the complex SELECT
statement and the last modified date
logic doesn't make much sense.
If this is suppose to be an OLTP system, you should try to change things so that they follow normal relational database practices.
Data Warehouse
The complexity of your pseudo-code SELECT
statement and your date of last edit
logic makes perfect sense in a Data Warehouse [DW] environment. However, they are probably part of an ETL/ELT process for materializing the data into a denormalized table for faster report generation.
The TABLES
and VIEWS
and MATERIALIZED VIEWS
are built to solve a very specific set of business problems. From my experience, those objects are used as the source for various Business Reports. In some cases, one table/view per report is created to solve a Business Requirement.
No FKs
? That's probably normal for a DW. The data in the tables could be refreshed with TRUNCATE
followed by INSERT...SELECT
. In this situation, a Foreign Key
could do more harm than good.
As I read the comments, it seams your company needs to move to a more formal OLTP hierarchical design. I wish you luck.
no FKs at all in this case
- been there, done that, got the the t-shirt. I was new on the job on site with a client which controlled the rail infrastructure of a major European country. You can imagine how many "widgets" they had. So I ran a query to get an idea of PK->FK relationships. My query returned no results! Thinking I had made a mistake, I asked a senior programmer - who responded **with pride" that there were "no FKs - all searches were done on PKs"! (WTF?). You could try suggest adding "last_modified" to last_edit"? You have my deepest condolences - I'd be getting my cv ready!