Problem
Altimate Code's lineage engine supports many SQL dialects, but users working with dialect variants (MaxCompute, cloud-specific
extensions, etc.) have no way to customize dialect behavior without forking the codebase.
Concretely, MaxCompute SQL parsed as hive silently drops lineage edges for functions whose signatures diverge from Hive —
notably DATEADD, NVL, IF, ARRAY_SORT, FILTER, TRANSFORM, FROM_JSON. Real pipeline queries produce incomplete
lineage with no warning.
Proposed Solution: Dialect Extension API
A public API that lets users register custom dialect configurations via YAML/JSON without modifying Altimate Code internals:
- Dialect inheritance: define a new dialect that extends an existing one (e.g., maxcompute extends hive) with only the overrides needed
- Custom function signatures: register argument names and lineage behavior for dialect-specific functions
- Function-to-function mappings: define how dialect-specific functions map to other dialects for transpilation
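To make the three capabilities above concrete, here is a minimal TypeScript sketch of how an inheritance-aware registry might resolve a function signature. Every name in it (DialectConfig, FunctionSignature, DialectRegistry, resolveFunction, the lineage values) is hypothetical and not taken from Altimate Code or @altimateai/altimate-core, and the argument names are illustrative only and should be checked against each dialect's documentation.

```typescript
// Hypothetical sketch: none of these types or names exist in Altimate Code.
type LineageBehavior = "passthrough" | "opaque";

interface FunctionSignature {
  args: string[];                  // argument names in positional order
  lineage: LineageBehavior;        // how input columns flow to the output
  mapsTo?: Record<string, string>; // target dialect -> equivalent function name
}

interface DialectConfig {
  name: string;
  extends?: string; // parent dialect, if any
  functions: Record<string, FunctionSignature>;
}

class DialectRegistry {
  private dialects = new Map<string, DialectConfig>();

  register(config: DialectConfig): void {
    this.dialects.set(config.name, config);
  }

  // Walk the `extends` chain from child to root; the nearest definition wins.
  resolveFunction(dialect: string, fn: string): FunctionSignature | undefined {
    const seen = new Set<string>();
    let current: string | undefined = dialect;
    while (current !== undefined) {
      if (seen.has(current)) {
        throw new Error(`cyclic extends chain at "${current}"`);
      }
      seen.add(current);
      const config: DialectConfig | undefined = this.dialects.get(current);
      if (config === undefined) {
        throw new Error(`unknown dialect "${current}"`);
      }
      const sig: FunctionSignature | undefined = config.functions[fn.toUpperCase()];
      if (sig !== undefined) {
        return sig;
      }
      current = config.extends;
    }
    return undefined;
  }
}

// Usage: a base dialect plus a child that overrides only what differs.
const registry = new DialectRegistry();
registry.register({
  name: "hive",
  functions: {
    UPPER: { args: ["str"], lineage: "passthrough" },
  },
});
registry.register({
  name: "maxcompute",
  extends: "hive",
  functions: {
    // Illustrative argument names for the MaxCompute form of DATEADD.
    DATEADD: { args: ["date", "delta", "datepart"], lineage: "passthrough" },
  },
});

const dateadd = registry.resolveFunction("maxcompute", "dateadd"); // defined on the child
const upper = registry.resolveFunction("maxcompute", "upper");     // inherited from hive
```

Resolving by walking the chain at lookup time keeps each child config small, since it lists only its overrides; a real implementation might instead flatten the chain once at load time.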
Example config (.altimate/dialects/maxcompute.yaml):
    name: maxcompute
    extends: hive
    overrides:
      - DATEADD
      - NVL
      - IF
      - ARRAY_SORT
      - FILTER
      - TRANSFORM
      - FROM_JSON

Why This Benefits Everyone
- MaxCompute and other variant dialects get accurate lineage without core code changes
- New dialects become self-serve: no PR needed for dialect support
- The existing dialect abstraction is already there; this surfaces it as a public API
- Community can share dialect configs (similar to how dbt packages share profiles)

References
- Current dialect handling: packages/opencode/src/altimate/native/dbt/lineage.ts
- altimate-core dialect engine: @altimateai/altimate-core

Happy to contribute a prototype with real MaxCompute SQL test cases.
Replies: 1 comment
@long-xjh Please contribute a prototype. We will take it from there.