-
-
Notifications
You must be signed in to change notification settings - Fork 172
Centralize per-database logic behind a single SqlDialect abstraction #1317
Description
Introduce a SqlDialect abstraction to make per-database logic a single source of truth
Per-database knowledge is currently scattered across many files and keyed on two parallel enums that nothing forces to agree: SupportedDatabase (8 variants: Sqlite, Duckdb, Oracle, Postgres, MySql, Mssql, Snowflake, Generic) and sqlx's AnyKind (5 variants). Because the dialect-specific data is spread over ~6 independent match sites, several keyed on the wrong enum, the compiler cannot enforce completeness: each site silently falls through to a _ default. Adding or fixing a dialect today means hunting down every site and hoping none was missed, and a missed arm is a silent correctness/data bug, not a compile error.
Proof: the references are real and scattered
$ grep -rcn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | grep -v ':0' ; echo "total:" ; grep -rn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | wc -l
src/webserver/database/sql.rs:37
src/webserver/database/sql/parameter_extraction.rs:9
src/filesystem.rs:3
total: 49
$ grep -rn -E "AnyKind::" src | wc -l
24
49 per-variant SupportedDatabase references (37 in sql.rs alone) plus 24 AnyKind references, all to be kept consistent by hand.
The scattered sites
dialect_for_dbreturns the sqlparser dialect via an 8-arm match: sql.rs#L293-L304- Placeholder style keyed on
AnyKind(only 5 entries), with a silent fallback to a temp prefix when a kind is absent: parameter_extraction.rs#L25-L46 - Placeholder cast type keyed on
SupportedDatabase(different enum, with a_ =>default): parameter_extraction.rs#L96-L115 - Default pool size per kind: connect.rs#L83-L94
- Trace-context SQL inlined in the execution path: execute_queries.rs#L505-L514
- CSV import reaching into sqlx's private
AnyConnectionKindto special-case the PostgresCOPYpath: csv_import.rs#L172-L177 - DB-filesystem
CREATE TABLEper dialect: filesystem.rs#L265-L280 - Name detection in
from_dbms_name, plus its inverse mapping duplicated inside the tests: sql.rs#L786-L792
The enum mismatch directly produces fragile fallthroughs that only exist because the key type is wrong:
_ => Self::Genericinfrom_dbms_name_ => run_csv_import_insert(any non-Postgres connection)unreachable!(...)when a kind is missing fromDB_PLACEHOLDERS, and the silent temp-prefix fallback for an unlisted kind (parameter_extraction.rs#L58-L60)
Proposed change
Introduce a single trait SqlDialect (or a method-rich impl on SupportedDatabase) that centralizes everything per-DB:
- placeholder style and placeholder cast type
- sqlparser
Dialect - otel
db.system.name - default pool size
- capability flags: native JSON,
COPY FROM STDIN, trace propagation, rollback statement,::cast support, MSSQL concat rewriting
Crucially, key the placeholder data on SupportedDatabase and make DB_PLACEHOLDERS (and the other tables) an exhaustive match, so the compiler forces every dialect to be handled and the unreachable!/_ => Odbc/_ => Generic fallthroughs disappear. Adding or fixing a dialect then becomes "implement one trait / add the arm the compiler demands" instead of editing ~6 scattered sites that nothing forces to agree.
Risks and mitigations
- Large but mechanical refactor. Mitigate by leaning on the existing dialect/placeholder tests in
sql.rs(theALL_DIALECTStable drives many of them) which already pin per-dialect behavior. - Land it incrementally, one capability at a time, so each step stays reviewable and bisectable.
Expected win
- Single source of truth for per-DB behavior.
- Compiler-enforced dialect completeness instead of hand-synced matches across two enums.
- Removes the silent
_ => Odbc/_ => Genericfallthroughs that can quietly mishandle (or corrupt data for) an unlisted dialect.
This refines the "split sql.rs by responsibility" suggestion in #1249: a SqlDialect abstraction is the natural seam to split along.