Working in MariaDB, we're trying to figure out how to store this data. All of our other tables have been normalized and are working nicely, but this has me stumped.
We have a number of objects and each object needs to have a state and two colors stored for each layer. The problem is that the upper limit of layers is around a hundred. Each object needs the data stored for each layer, as the objects are definitely not guaranteed to be identical in any way.
ObjID | L0 | L0C | L1 | L1C | ... | L99 | L99C
8374629 | mmq | FFFF00,FFFF00 | mqm | 663FBA,34CF09 | ... | mqq | CDCDCD,00CDCD
8374630 | qmq | EE9329,00FF00 | qqm | 00FF52,ACF03D | ... | qqq | FFFF00,00F000
8374631 | mqq | 597842,F98520 | mmm | 05FF10,3726CD | ... | mmm | 05FF10,C0EAFF
8374632 | qqm | FADD03,A6A6A6 | qmm | DDAA11,763276 | ... | qmm | EEEEAA,EE9329
8374649 | mmm | BC1A44,FF00FF | qqq | 000000,FFCFFF | ... | mqm | 2323DD,01CD11
ObjID is always unique, but rows are purged periodically once an object is no longer needed by the system. Lookup will always be by ObjID; there will never be any filtering on the other columns (i.e., selecting all rows where L0 is mmm).
The only other idea we've come up with is storing the data as strings and breaking them apart in the code that uses them, but I assume that parsing a ~1,300-character string into a massive array would be both inefficient and inelegant. It also wouldn't allow easy editing of any given layer, which the design requires.
The ObjID is referenced in another table, but otherwise the data here isn't replicated anywhere else. Is this our best option for storing this many unique data points? Some two hundred columns just seems like quite a lot.
If those data have no meaning at the DB layer, they can be treated as a blob, and you can store them any way you want. If you always select all columns and never filter on them, then pick the best format for your app and store that as text/binary data (be it JSON, a serialized object, or your own packed binary format); that way you don't have to change the table to accommodate new properties, types, etc. But if you ever needed to filter on it, that would be really painful. – jkavalik, Jun 4, 2015 at 6:21
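For completeness, a minimal sketch of the blob approach the comment describes, assuming MariaDB 10.2+ (where the JSON column alias and JSON functions are available); the table name and the layout inside the document are entirely illustrative:

```sql
-- All per-layer data packed into one JSON document per object.
CREATE TABLE ObjectLayersBlob (
    ObjID  INT UNSIGNED PRIMARY KEY,
    Layers JSON NOT NULL
    -- e.g. '[{"state":"mmq","c1":"FFFF00","c2":"FFFF00"}, ...]'
);
```

As the comment notes, this keeps the schema stable when properties change, but filtering or editing a single layer server-side becomes awkward.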
1 Answer
To normalise this table you must remove the repeating groups and multi-value columns. The resulting table will have the columns
objID
LevelNumber
State
Colour1
Colour2
Using the sample data from your question you would see
ObjID | Level | State | Color1 | Color2
8374629 | 0 | mmq | FFFF00 | FFFF00
8374629 | 1 | mqm | 663FBA | 34CF09
...
8374629 | 99 | mqq | CDCDCD | 00CDCD
....
8374649 | 0 | mmm | BC1A44 | FF00FF
8374649 | 1 | qqq | 000000 | FFCFFF
...
8374649 | 99 | mqm | 2323DD | 01CD11
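A sketch of the corresponding MariaDB DDL (table and column names are illustrative; adjust the types to your actual data):

```sql
-- One row per (object, layer); lookup by ObjID stays a simple indexed read.
CREATE TABLE ObjectLayer (
    ObjID       INT UNSIGNED     NOT NULL,
    LevelNumber TINYINT UNSIGNED NOT NULL,  -- 0..99
    State       CHAR(3)          NOT NULL,  -- e.g. 'mmq'
    Colour1     CHAR(6)          NOT NULL,  -- hex RGB, e.g. 'FFFF00'
    Colour2     CHAR(6)          NOT NULL,
    PRIMARY KEY (ObjID, LevelNumber)
);
```

With the composite primary key, `SELECT ... WHERE ObjID = ?` returns all ~100 layers of an object in one indexed range scan, purging an object is a one-statement `DELETE`, and editing a single layer is a one-row `UPDATE`.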
The colors can be normalised into a separate table, too, giving two tables:
State          Color
=====          =====
objID          objID
LevelNumber    LevelNumber
State          ColourNumber
               ColourValue
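A possible DDL sketch for that two-table split (again, names and types are illustrative):

```sql
CREATE TABLE LayerState (
    ObjID       INT UNSIGNED     NOT NULL,
    LevelNumber TINYINT UNSIGNED NOT NULL,
    State       CHAR(3)          NOT NULL,
    PRIMARY KEY (ObjID, LevelNumber)
);

CREATE TABLE LayerColour (
    ObjID        INT UNSIGNED     NOT NULL,
    LevelNumber  TINYINT UNSIGNED NOT NULL,
    ColourNumber TINYINT UNSIGNED NOT NULL,  -- 1 or 2
    ColourValue  CHAR(6)          NOT NULL,  -- hex RGB
    PRIMARY KEY (ObjID, LevelNumber, ColourNumber),
    FOREIGN KEY (ObjID, LevelNumber)
        REFERENCES LayerState (ObjID, LevelNumber)
);
```

This buys flexibility in the number of colours per layer at the cost of a join; if every layer always has exactly two colours, the single-table form above is usually simpler.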
A table with several hundred columns is within the capabilities of any DBMS. It may be unusual, but it is not necessarily wrong.
Level number isn't an individual thing though; I call it level number but essentially it's ... the level number's state. ALL objIDs will have a value for EVERY level number, so having a separate location for that wouldn't make sense? Though I suppose you're right that we should split the colors into separate entities; I just didn't know if several hundred columns would be severely detrimental. Strange I absolutely knew. – JoLan, Jun 10, 2015 at 4:30
Or possibly I'm not understanding how to apply what you're suggesting; I'm a visual person (which is why I always write out my columns above) and pretty new to designing complex or large DBs, so that's a big possibility. – JoLan, Jun 10, 2015 at 4:32