the key components of this declarative approach are target lag and automatic dependency management.
the role of target lag
target lag defines how fresh you need the data to be. instead of telling snowflake when to run (such as a specific cron schedule), you tell snowflake what maximum data latency is acceptable. if you set the target lag to 1 hour, snowflake schedules refreshes automatically, aiming to keep the data in the dynamic table no more than one hour behind the source tables. the lag is a target rather than a hard guarantee, so a slow refresh can still overshoot it. if the source data does not change, snowflake does not waste compute resources refreshing the table.
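as a minimal sketch of what this looks like in practice, the statement below declares a dynamic table with a one-hour target lag. the table names (raw_orders, raw_customers, orders_enriched), the column names, and the warehouse name (transform_wh) are hypothetical placeholders, not objects from my actual pipeline:

```sql
-- hypothetical names throughout: raw_orders, raw_customers, transform_wh
CREATE OR REPLACE DYNAMIC TABLE orders_enriched
  TARGET_LAG = '1 hour'        -- acceptable staleness, not a schedule
  WAREHOUSE = transform_wh     -- warehouse that runs the refreshes
AS
  SELECT
    o.order_id,
    o.order_date,
    o.amount,
    c.region
  FROM raw_orders AS o
  JOIN raw_customers AS c
    ON o.customer_id = c.customer_id;
```

note that there is no cron expression and no merge logic anywhere in the definition. the query states what the result should be, and the target lag states how stale it is allowed to get.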
automatic dependency management
snowflake dynamic tables automatically handle dependencies. when you define multiple dynamic tables that reference each other, snowflake builds a directed acyclic graph (dag) of the relationships. because snowflake knows how the tables depend on one another, it refreshes upstream tables before the tables that read from them, which keeps the data consistent across the whole graph. you do not need to configure task chains or parent-child dependencies manually.
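a two-table chain could be sketched roughly as follows. all object names here are hypothetical, and the example assumes the upstream table should refresh only as often as its consumers require, which is what TARGET_LAG = DOWNSTREAM expresses:

```sql
-- upstream table: refreshes only when a downstream table needs fresh data
-- (raw_orders, raw_customers, and transform_wh are hypothetical names)
CREATE OR REPLACE DYNAMIC TABLE orders_enriched
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = transform_wh
AS
  SELECT o.order_id, o.order_date, o.amount, c.region
  FROM raw_orders AS o
  JOIN raw_customers AS c
    ON o.customer_id = c.customer_id;

-- downstream table: reading from orders_enriched is what creates the dag edge
CREATE OR REPLACE DYNAMIC TABLE daily_revenue_by_region
  TARGET_LAG = '1 hour'
  WAREHOUSE = transform_wh
AS
  SELECT region, order_date, SUM(amount) AS revenue
  FROM orders_enriched
  GROUP BY region, order_date;
```

the only place the dependency is declared is the FROM clause of the second table. there is no task tree and no explicit "run after" setting to maintain.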
3) validate the result
once i deployed the dynamic tables, i validated the results by monitoring the dynamic table refresh history and the query execution logs. i observed several immediate improvements:
- declarative simplicity: our pipeline code shrank significantly because we deleted numerous task definitions, task schedules, and custom merge scripts
- automatic healing: when upstream data ingestion paused and resumed, the dynamic tables automatically recalculated and caught up to the target freshness without human intervention
- optimized resource utilization: snowflake only consumed compute resources during the active refresh cycles, reducing unnecessary warehouse uptime compared to our old, rigid scheduling intervals
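one way to inspect refresh behavior like this is the DYNAMIC_TABLE_REFRESH_HISTORY table function in the information schema. the query below is a sketch rather than the exact check i ran: the fully qualified table name is a hypothetical placeholder, and the selected columns should be confirmed against the snowflake documentation for your account:

```sql
-- recent refreshes for one dynamic table (the qualified name is hypothetical)
SELECT
  name,
  state,                -- outcome of the refresh, such as succeeded or failed
  refresh_action,       -- whether the refresh was incremental, full, or had no new data
  refresh_start_time,
  refresh_end_time
FROM TABLE(
  INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY(
    NAME => 'MY_DB.MY_SCHEMA.DAILY_REVENUE_BY_REGION'
  )
)
ORDER BY refresh_start_time DESC
LIMIT 20;
```

refreshes that report no new data are the ones where snowflake found nothing to do upstream, which ties back to the resource utilization point above.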
better late than never
discovering this solution felt like finding the missing piece of a puzzle. it is an incredible upgrade to our development workflow, however late i came to it.
still, i cannot help but think about how much time and effort would have been saved if i had discovered this earlier. for years, i searched for "materialized views in snowflake" because that was the concept i knew from other relational databases. because snowflake's materialized views can query only a single table and do not support joins, i assumed snowflake did not have a declarative caching mechanism that supported joins, which led me down the path of manual task management.
if you are currently in that position, searching for a way to materialize complex joins without writing brittle scheduled tasks, this is the sign you need:
looking for materialized views in snowflake? try dynamic tables.
this realization would have set me on the right path years ago. hopefully, sharing this experience will help another engineer bypass the struggle of manual task orchestration and jump straight to declarative, automated pipelines.