I'm refactoring a little side project to use SQLite instead of a Python data structure so that I can learn SQLite. The data structure I've been using is a list of dicts, where each dict's keys represent a menu item's properties. Ultimately, these keys should become columns in an SQLite table.
I first thought that I could create the table programmatically by creating a single-column table, iterating over the list of dictionary keys, and executing an ALTER TABLE ... ADD COLUMN command for each key, like so:
# Various import statements and initializations
conn = sqlite3.connect(database_filename)
cursor = conn.cursor()
cursor.execute("CREATE TABLE menu_items (item_id text)")
# Here's the problem:
cursor.executemany("ALTER TABLE menu_items ADD COLUMN ? ?", [(key, type(value)) for key, value in menu_data[0].iteritems()])
After some more reading, I realized that parameters cannot be used for identifiers, only for literal values. The PyMOTW on sqlite3 says:
Query parameters can be used with select, insert, and update statements. They can appear in any part of the query where a literal value is legal.
Kreibich says on p. 135 of Using SQLite (ISBN 9780596521189):
Note, however, that parameters can only be used to replace literal values, such as quoted strings or numeric values. Parameters cannot be used in place of identifiers, such as table names or column names. The following bit of SQL is invalid:
SELECT * FROM ?; -- INCORRECT: Cannot use a parameter as an identifier
I accept that positional or named parameters cannot be used in this way. Why can't they? Is there some general principle I'm missing?
You might find sqlite.org/optoverview.html of interest and consider how changing a column name affects whether an index can be used and therefore the nature of the compiled bytecode. – Duncan, Oct 8, 2015 at 15:43
3 Answers
Identifiers are syntactically significant while variable values are not.
Identifiers need to be known at SQL compilation phase so that the compiled internal bytecode representation knows about the relevant tables, columns, indices and so on. Just changing one identifier in the SQL could result in a syntax error, or at least a completely different kind of bytecode program.
Literal values can be bound at runtime. Variables behave essentially the same in a compiled SQL program regardless of the values bound in them.
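To make the compile-time versus bind-time distinction concrete, here is a minimal sketch using Python 3's standard sqlite3 module. The in-memory database, the price column, and the sample row are my own assumptions for illustration; they are not from the question.

```python
import sqlite3

# Illustrative in-memory database; the price column is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu_items (item_id TEXT, price REAL)")
conn.execute("INSERT INTO menu_items VALUES (?, ?)", ("soup", 4.5))

# A parameter standing in for a literal value: the statement is compiled
# first, and the value is bound afterwards.
rows = conn.execute(
    "SELECT item_id FROM menu_items WHERE price < ?", (5.0,)
).fetchall()

# A parameter standing in for a table name: the statement cannot be
# compiled at all, so the failure happens before any value is bound.
try:
    conn.execute("SELECT * FROM ?", ("menu_items",))
    identifier_error = None
except sqlite3.OperationalError as exc:
    identifier_error = str(exc)

print(rows)
print(identifier_error)
```

The first query succeeds because the table and column are already known when the statement is compiled. The second one is rejected by the SQL parser itself, which is the behaviour the Kreibich quote describes.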
I don't know why, but every database I ever used has the same limitation.
I think it would be analogous to using a variable to hold the name of another variable. Most languages do not allow that, PHP being the only exception I know of.
Regardless of the technical reasons, dynamically choosing table/column names in SQL queries is a design smell, which is why most databases do not support it.
Think about it; if you were coding a menu in Python, would you dynamically create a class for each combination of menu items? Of course not; you'd have one Menu class that contains a list of menu items. It's similar in SQL too.
Most of the time, when people ask about dynamically choosing table names, it's because they've split up their data into different tables, like collection1, collection2, ... and use the name to select which collection to query. This isn't a very good design: it requires repeating the schema for each table, including indexes, constraints, permissions, etc., and it also makes altering the schema harder. (Need to add a field? Now you need to do it across hundreds of tables instead of one.)
The correct way of designing the database would be to have a single collection table and add a collection_id column; instead of querying collection4, you'd add a WHERE collection_id = 4 constraint to your SELECT queries. Note that the 4 is now a value, and can be replaced with a query parameter.
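As a rough sketch of that single-table design in Python 3's sqlite3 module: the table name collection, the item column, the sample rows, and the helper items_in are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table for every collection; collection_id says which one a row belongs to.
conn.execute("CREATE TABLE collection (collection_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO collection (collection_id, item) VALUES (?, ?)",
    [(3, "tea"), (4, "soup"), (4, "bread")],
)


def items_in(conn, collection_id):
    """Return the items of one collection (hypothetical helper)."""
    cur = conn.execute(
        "SELECT item FROM collection WHERE collection_id = ? ORDER BY item",
        (collection_id,),
    )
    return [row[0] for row in cur]


# What used to be "query table collection4" is now an ordinary bound value.
collection4 = items_in(conn, 4)
print(collection4)
```

The SQL text is the same for every collection, so the identifier never has to vary; only the bound value does.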
For your case, I would use this schema:
CREATE TABLE menu_items (
item_id TEXT,
key TEXT,
value,  -- no declared type, so values are stored as given (a declared type of NONE would get NUMERIC affinity)
PRIMARY KEY(item_id, key)
);
Use executemany to insert a row for each entry in the dictionary. When you need to load the dictionary, run a SELECT filtering on item_id and recreate the dictionary one row/entry at a time.
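A minimal sketch of that save/load round trip, assuming Python 3 and made-up sample data. The helpers save_item and load_item are hypothetical names, and the value column is deliberately left without a declared type here so SQLite stores each value as given.

```python
import sqlite3

# Made-up sample data in the list-of-dicts shape described in the question.
menu_data = [
    {"item_id": "soup", "name": "Tomato soup", "price": 4.5, "vegan": 1},
    {"item_id": "bread", "name": "Sourdough", "price": 2.0},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu_items ("
    " item_id TEXT, key TEXT, value,"
    " PRIMARY KEY (item_id, key))"
)


def save_item(conn, item):
    """Store one dict as one row per key (hypothetical helper)."""
    rows = [
        (item["item_id"], key, value)
        for key, value in item.items()
        if key != "item_id"
    ]
    conn.executemany(
        "INSERT INTO menu_items (item_id, key, value) VALUES (?, ?, ?)", rows
    )


def load_item(conn, item_id):
    """Rebuild the dict for one item, one row at a time (hypothetical helper)."""
    item = {"item_id": item_id}
    query = "SELECT key, value FROM menu_items WHERE item_id = ?"
    for key, value in conn.execute(query, (item_id,)):
        item[key] = value
    return item


for entry in menu_data:
    save_item(conn, entry)

soup = load_item(conn, "soup")
print(soup)
```

Note that the keys travel as bound values, so items with different sets of properties need no schema change at all.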
(Of course, as with everything in software engineering, there are exceptions. Tools that operate on schemas generically, such as ORMs, will need to specify table/column names dynamically.)
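For completeness, code that really does need dynamic identifiers has to build them into the SQL text itself, since they cannot be bound. Below is one possible sketch, assuming Python 3; the helper add_column, the identifier pattern, and the Python-to-SQLite type mapping are my own illustrative choices, not an established API. The names are validated against a strict pattern before being interpolated, because ordinary parameter binding offers no protection here.

```python
import re
import sqlite3

# Conservative pattern for names this sketch is willing to interpolate.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Hypothetical mapping from Python types to SQLite column types.
_SQL_TYPES = {str: "TEXT", int: "INTEGER", float: "REAL"}


def add_column(conn, table, column, py_type):
    """Add a column whose name is only known at runtime (hypothetical helper)."""
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    sql_type = _SQL_TYPES[py_type]
    # Identifiers go into the SQL text; they are double-quoted and were
    # validated above, since they cannot be passed as parameters.
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" {sql_type}')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu_items (item_id TEXT)")

sample = {"name": "Tomato soup", "price": 4.5}  # made-up menu item
for key, value in sample.items():
    add_column(conn, "menu_items", key, type(value))

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(menu_items)")]
print(columns)
```

Each ALTER TABLE statement here is a distinct piece of SQL that gets compiled separately, which is consistent with the explanation that identifiers must be known at compilation time.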