
I have two tables, city and suburb. PostgreSQL code:

CREATE TABLE city 
(
 id uuid PRIMARY KEY,
 name character varying NOT NULL,
 CONSTRAINT city_id_key UNIQUE (id),
 CONSTRAINT city_name_key UNIQUE (name)
);
CREATE TABLE suburb
(
 id uuid PRIMARY KEY,
 city_id uuid NOT NULL,
 name character varying NOT NULL,
 CONSTRAINT fk_suburb_city FOREIGN KEY (city_id) REFERENCES city (id),
 CONSTRAINT suburb_id_key UNIQUE (id),
 CONSTRAINT suburb_name_key UNIQUE (name)
);

I want to create a table called address for storing city + suburb pairs. Here is the DDL for the address table:

CREATE TABLE address
(
 id uuid NOT NULL,
 city_name character varying NOT NULL,
 suburb_name character varying NOT NULL
);

The address table intentionally stores redundant copies of the names, but I want to make sure that only valid pairs are inserted into it. Here is an example:

I want to allow inserting into address all the city_name / suburb_name pairs returned by:

SELECT c.name AS city_name, s.name AS suburb_name
 FROM city c, suburb s
 WHERE c.id = s.city_id

Result:

A - B
A - C
X - Y

For the data above I want to allow all the pairs:

A - B
A - C
X - Y

But if someone wants to insert the A - Y pair into address, I want the DBMS to raise an error/exception.

Questions:

  1. Does it make sense to enforce a constraint like this?

  2. If it is a valid idea, what is the best solution: a trigger, a stored procedure, some kind of constraint?

I prefer DBMS-independent solutions. I'm more interested in the basic idea of the suggested solution than in a PostgreSQL-specific one.

In response to @Yuri G's answer: I don't want any join when I read from address. I want to store the real values in it, not ids. Slow inserts into address are not a problem; fast reads are what matter to me. Changes in the city or suburb tables after a row has been inserted into address are not a problem either, so there is no need to update the address table. I just want to make sure that the data I insert into address is a valid city - suburb pair (according to the city and suburb tables).

My plan is to load the city and suburb tables with lots of data and use them to validate insertions into the address table. For example, I don't want to allow my users to insert "New York - Fatima bint Mubarak St" into address, because Fatima bint Mubarak St is in Abu Dhabi.
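
To illustrate what I mean by "some kind of constraint": below is a rough, untested standard-SQL sketch of one declarative way to express the check, by carrying city_name redundantly in suburb and tying everything together with composite foreign keys. The added column and constraint names are only placeholders, not part of my real schema.

-- allow (id, name) to be the target of a composite foreign key
ALTER TABLE city ADD CONSTRAINT city_id_name_key UNIQUE (id, name);

-- redundant city_name in suburb, kept consistent by a composite FK
-- (add the column before loading data, or backfill it first)
ALTER TABLE suburb ADD COLUMN city_name character varying NOT NULL;
ALTER TABLE suburb ADD CONSTRAINT fk_suburb_city_pair
 FOREIGN KEY (city_id, city_name) REFERENCES city (id, name);
ALTER TABLE suburb ADD CONSTRAINT suburb_city_pair_key
 UNIQUE (city_name, name);

-- address then references the validated name pair directly
CREATE TABLE address
(
 id uuid PRIMARY KEY,
 city_name character varying NOT NULL,
 suburb_name character varying NOT NULL,
 CONSTRAINT fk_address_pair FOREIGN KEY (city_name, suburb_name)
  REFERENCES suburb (city_name, name)
);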

Thank you for the answers.

asked Apr 14, 2017 at 18:43
  • Why don't you use a primary key instead of unique? Do you have NULL values in your id column? Commented Apr 14, 2017 at 18:45
  • Updated, thx @Teja Commented Apr 14, 2017 at 18:50
  • The unique constraint on city name and suburb name can cause problems, because two different suburbs may share the same name - Richmond Heights, MO (St. Louis) and Richmond Heights, OH (Cleveland), for example - as well as two cities - Springfield, IL and Springfield, MO. Adding a similar table for "state", with city.state_id referencing state.id, could help to differentiate them. You could also consider making the "address" table into a view and handling inserts into each table separately. Commented Apr 14, 2017 at 19:28
  • Less talking, more sample data. Insert test values and show us what you want and don't want, or what you want to reject. The A-X-Y-V thing is totally detached from your schema. Commented Apr 14, 2017 at 23:03

1 Answer


Let the software on the client side work with the record identifiers for city and suburb, not their values.

And do the same on the server side:

CREATE TABLE address
(
 id uuid NOT NULL,
 city_id uuid NOT NULL,
 suburb_id uuid NOT NULL,
 CONSTRAINT fk_city FOREIGN KEY (city_id) REFERENCES city (id),
 CONSTRAINT fk_suburb FOREIGN KEY (suburb_id) REFERENCES suburb (id)
);

Of course, you'll need two lookup operations prior to the insert, basically two selects by city/suburb name, to retrieve these ids (or deny the operation).
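
For example, both lookups can be folded into a single statement (a rough sketch using the names from the question and this answer; :address_id, :city_name and :suburb_name stand for client-supplied parameters). If the name pair does not resolve to a related city/suburb row, nothing is inserted, and the client can treat the zero row count as an error:

INSERT INTO address (id, city_id, suburb_id)
SELECT :address_id, c.id, s.id
 FROM city c
 JOIN suburb s ON s.city_id = c.id
 WHERE c.name = :city_name
 AND s.name = :suburb_name;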

This way you'd be keeping data integrity in the simplest and most efficient way, I believe.

answered Apr 14, 2017 at 23:20

3 Comments

I understand your point. Please see my updated question. Thank you @Yuri G
Just read your update... sorry, first of all, just curious, I don't really get the point: do you think reads through full-text search on a detached table would be faster than two joins through numeric indexes?
And to have integrity for the suburb-city pair, you don't even need two tables: suburb is already related to city... But whatever. If you are adamant about your kind-of-denormalized approach (even though your arguments sound pretty weird to me), mind that in that case you'd need two full-text lookups anyway, whatever you do, trigger or client-side validation. And remember that looking up the record is the heaviest part there; fetching it would be way faster.
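
For completeness, the "trigger" option mentioned above could look roughly like this (a PostgreSQL-specific, untested sketch; it runs the same pair lookup on every insert into the question's address table and raises an exception when the names do not match a related city/suburb row):

CREATE FUNCTION check_address_pair() RETURNS trigger AS $$
BEGIN
 IF NOT EXISTS (SELECT 1
   FROM city c
   JOIN suburb s ON s.city_id = c.id
   WHERE c.name = NEW.city_name
   AND s.name = NEW.suburb_name) THEN
  RAISE EXCEPTION 'invalid city/suburb pair: % - %',
   NEW.city_name, NEW.suburb_name;
 END IF;
 RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER address_pair_check
 BEFORE INSERT OR UPDATE ON address
 FOR EACH ROW EXECUTE PROCEDURE check_address_pair();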
