- Added a `job_name` parameter to `InputContext`.
- Fixed a bug with `execute_in_process` on a `GraphDefinition` (it would use the `fs_io_manager` instead of the in-memory I/O manager); see the example after this list.
- [dagit] The `/instance` URL path prefix has been removed. E.g. `/instance/runs` can now be found at `/runs`.
- [dagit] The `/workspace` URL path prefix has been changed to `/locations`. E.g. the URL for job `my_job` in repository `foo@bar` can now be found at `/locations/foo@bar/jobs/my_job`.
- [dagstermill] …`dagstermill.yield_event`.
- [dagstermill] Failed notebooks can now be saved for inspection with the new `save_on_notebook_failure` parameter.
- [dagster-airflow] Added `use_ephemeral_airflow_db`, which will create a job-run-scoped Airflow db for Airflow DAGs running in Dagster.
- …`AssetKey`s.
- Fixed a regression introduced in 1.0.16 for some compute log managers, where an exception in the compute log manager setup/teardown would cause runs to fail.
- The compute log managers now sanitize the `prefix` argument to prevent badly constructed paths.
- [dagit] The run filter autocomplete no longer surfaces suggestions for `tag:`. This resolves an issue where retrieving the available tags could cause significant performance problems. Tags can still be searched with freeform text, and by adding them via click on individual run rows.
- [dagit] Schedules defined with cron unions displayed "Invalid cron string" in Dagit. This has been resolved, and human-readable versions of all members of the union will now be shown.
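As a quick illustration of the `execute_in_process` fix above, executing a `GraphDefinition` in process now defaults to the in-memory I/O manager; the op and graph names below are made up for the sketch:

```python
from dagster import graph, op

@op
def emit_one() -> int:
    return 1

@op
def double(x: int) -> int:
    return x * 2

@graph
def my_graph():
    double(emit_one())

# Intermediate values are now held in memory rather than being written
# to disk by fs_io_manager.
result = my_graph.execute_in_process()
assert result.success
```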
- You can no longer set an output's asset key by overriding `get_output_asset_key` on the `IOManager` handling the output. Previously, this was experimental and undocumented.
- The `context` object supplied to sensor and schedule evaluation functions now has a `log` property, which logs events that can later be viewed in Dagit. To enable these log views in Dagit, navigate to the user settings and enable the "Experimental schedule/sensor logging view" option. Log links will now be available for sensor/schedule ticks where logs were emitted. Note: this feature is not available for users using the `NoOpComputeLogManager`.
- Fixed a bug in the database I/O managers where `BETWEEN` was used to determine the section of the table to replace. `BETWEEN` included values from the next partition, causing the I/O manager to erroneously delete those entries.
- You can now create multi-dimensional partitions definitions for your assets via the `MultiPartitionsDefinition` API. In Dagit, you can filter and materialize certain partitions by providing ranges per-dimension, and view your materializations by dimension.
- The new asset reconciliation sensor automatically materializes assets that have never been materialized or whose upstream assets have changed since the last time they were materialized. You can construct it using `build_asset_reconciliation_sensor` (see the sketch after this list).
- You can now add a `FreshnessPolicy` to any of your software-defined assets, to specify how up-to-date you expect that asset to be. You can view the freshness status of each asset in Dagit, alert when assets are missing their targets using the `@freshness_policy_sensor`, and use the `build_asset_reconciliation_sensor` to make a sensor that automatically kicks off runs to materialize assets based on their freshness policies.
- You can now version your code to help determine which assets are stale: assign `op_version`s to software-defined assets or `observation_fn`s to `SourceAsset`s. When a set of assets is versioned in this way, their "Upstream Changed" status will be based on whether upstream versions have changed, rather than on whether upstream assets have been re-materialized. You can launch runs that materialize only stale assets.
- The new `@multi_asset_sensor` decorator enables defining custom sensors that trigger based on the materializations of multiple assets. The context object supplied to the decorated function has methods to fetch the latest materializations by asset key, as well as built-in cursor management to mark specific materializations as "consumed", so that they won't be returned in future ticks. It can also fetch materializations by partition and mark individual partitions as consumed.
- `RepositoryDefinition` now exposes a `load_asset_value` method, which accepts an asset key and invokes the asset's I/O manager's `load_input` function to load the asset as a Python object. This can be used in notebooks to do exploratory data analysis on assets.
- With the new `asset_selection` parameter on `@sensor` and `SensorDefinition`, you can now define a sensor that directly targets a selection of assets, instead of targeting a job.
- When running `dagit` or `dagster-daemon` locally, environment variables included in a `.env` file in the form `KEY=value` in the same folder as the command will be automatically included in the environment of any Dagster code that runs, allowing you to easily use environment variables during local development.
- [dagster-dbt] `dagster-dbt` now supports generating software-defined assets from your dbt Cloud jobs.
- [dagster-airbyte, dagster-fivetran] `dagster-airbyte` and `dagster-fivetran` now support automatically generating assets from your ETL connections using `load_assets_from_airbyte_instance` and `load_assets_from_fivetran_instance`.
- New `dagster-duckdb` integration: `build_duckdb_io_manager` allows you to build an I/O manager that stores and loads Pandas and PySpark DataFrames in DuckDB.
- This release includes an optional database schema migration, which can be run via `dagster instance migrate`.
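A sketch of how the freshness and reconciliation pieces above fit together; the asset names are hypothetical:

```python
from dagster import (
    AssetSelection,
    FreshnessPolicy,
    asset,
    build_asset_reconciliation_sensor,
)

@asset
def raw_events():
    ...

# This asset should never be more than 60 minutes out of date with
# respect to its upstreams.
@asset(freshness_policy=FreshnessPolicy(maximum_lag_minutes=60))
def events_summary(raw_events):
    ...

# Materializes assets that have never been materialized, whose upstreams
# have changed, or that risk violating their FreshnessPolicy.
reconciliation_sensor = build_asset_reconciliation_sensor(
    asset_selection=AssetSelection.all(),
    name="asset_reconciliation_sensor",
)
```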
- `define_dagstermill_solid`, a legacy API, has been removed from `dagstermill`. Use `define_dagstermill_op` or `define_dagstermill_asset` instead to create an op or asset from a Jupyter notebook, respectively.
- The internal `ComputeLogManager` API is marked as deprecated in favor of an updated interface: `CapturedLogManager`. It will be removed in 1.2.0. This should only affect Dagster instances that have implemented a custom compute log manager.
- `dagster-graphql` and `dagit` now use version 3 of `graphene`.
- The `UPathIOManager` base class is now a top-level Dagster export. This enables you to write a custom I/O manager that stores data in any filesystem supported by `universal-pathlib` and uses a different serialization format than pickle; see the sketch after this list. (Thanks Daniel Gafni!)
- `fs_io_manager` now inherits from the `UPathIOManager`, which means that its `base_dir` can be a path on any filesystem supported by `universal-pathlib`. (Thanks Daniel Gafni!)
- `build_asset_reconciliation_sensor` now supports partitioned assets.
- `build_asset_reconciliation_sensor` now launches runs to keep assets in line with their defined `FreshnessPolicy`s.
- The `FreshnessPolicy` object is now exported from the top-level `dagster` package.
- [dagit] For assets with a `FreshnessPolicy` defined, their current freshness status will be rendered in the asset graph and asset details pages.
- The cloud-storage compute log managers now accept an `upload_interval`, which specifies, in seconds, the interval at which partial logs will be uploaded to the respective cloud storage. This can be used to display compute logs for long-running compute steps.
- When running `dagit` or `dagster-daemon` locally, environment variables included in a `.env` file in the form `KEY=value` in the same folder as the command will be automatically included in the environment of any Dagster code that runs, allowing you to easily test environment variables during local development.
- The new `observable_source_asset` decorator creates a `SourceAsset` with an associated `observation_fn` that should return a `LogicalVersion`, a new class that wraps a string expressing a version of an asset's data value.
- [dagster-k8s] New `execute_k8s_job` function that can be called within any op to run an image within a Kubernetes job. The implementation is similar to the built-in `k8s_job_op`, but allows additional customization; for example, you can incorporate the output of a previous op into the launched Kubernetes job by passing it into `execute_k8s_job`. See the dagster-k8s API docs for more information.
- [dagstermill] `define_dagstermill_asset` now supports `RetryPolicy`. Thanks @nickvazz!
- [dagster-airbyte] When loading assets from an Airbyte instance using `load_assets_from_airbyte_instance`, users can now optionally customize asset names using `connector_to_asset_key_fn`.
- [dagster-fivetran] When loading assets from a Fivetran instance using `load_assets_from_fivetran_instance`, users can now alter the I/O manager using `io_manager_key` or `connector_to_io_manager_key_fn`, and customize asset names using `connector_to_asset_key_fn`.
- …`.env` files.
- With the new `asset_selection` parameter on `@sensor` and `SensorDefinition`, you can now define a sensor that directly targets a selection of assets, instead of targeting a job.
- `materialize` and `materialize_to_memory` now accept a `raise_on_error` argument, which allows you to determine whether to raise an error if the run hits an error or just return as failed.
- (experimental) You can now create multi-dimensional partitioned assets via the `MultiPartitionsDefinition` object. An optional schema migration enables support for this feature (run via `dagster instance migrate`). Users who are not using this feature do not need to run the migration.
- The `--db-pool-recycle` CLI flag (and `dbPoolRecycle` Helm option) has been added to control how long the pooled connection dagit uses persists before recycle. The default of 1 hour is now respected by Postgres (MySQL previously already had a hard-coded 1 hr setting). Thanks @adam-bloom!
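For instance, a custom I/O manager built on `UPathIOManager` that serializes values as JSON instead of pickle might look roughly like this; the class name, config schema, and serialization choice are illustrative, not part of the release:

```python
import json

from dagster import InputContext, OutputContext, UPathIOManager, io_manager
from upath import UPath

class JSONIOManager(UPathIOManager):
    extension: str = ".json"

    def dump_to_path(self, context: OutputContext, obj, path: UPath):
        # Write the output value as JSON to the (possibly remote) path.
        with path.open("w") as f:
            json.dump(obj, f)

    def load_from_path(self, context: InputContext, path: UPath):
        with path.open("r") as f:
            return json.load(f)

@io_manager(config_schema={"base_path": str})
def json_io_manager(init_context):
    # base_path can point at any filesystem supported by universal-pathlib,
    # e.g. "s3://my-bucket/data" or a local directory.
    return JSONIOManager(base_path=UPath(init_context.resource_config["base_path"]))
```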
- [dagster-airbyte] …`load_assets_from_airbyte_instance` and `load_assets_from_airbyte_project`.
- [dagster-dbt] The `dbt_cloud_resource` resource configuration `account_id` can now be sourced from the environment. Thanks @sowusu-ba!
- [dagster-fivetran] Added the `load_assets_from_fivetran_instance` helper, which automatically pulls assets from a Fivetran instance.
- [helm] Fixed an issue where the `securityContext` configuration of the Dagit pod in the Helm chart didn't apply to one of its containers. Thanks @jblawatt!
- Fixed a bug that caused the `asset_selection` parameter of `RunRequest` to not be respected when used inside a schedule.
- [dagster-aws] Fixed a bug where setting the `task_definition` field in the `EcsRunLauncher` to an environment variable stopped working.
- [dagster-dbt] Fixed an issue with `load_assets_from_dbt_manifest`, which previously failed to load from dbt manifests with exposures. Thanks @sowusu-ba!
- The behavior of `build_asset_reconciliation_sensor` has changed to be more focused on reconciliation. It now materializes assets that have never been materialized before, and avoids materializing assets that are "Upstream changed". The `build_asset_reconciliation_sensor` API no longer accepts the `wait_for_in_progress_runs` and `wait_for_all_upstream` arguments.
- `@asset` and `@multi_asset` now accept a `retry_policy` argument; see the example after this list. (Thanks @adam-bloom!)
- When loading multiple partitions of an asset as input, the `fs_io_manager` will now return a dictionary that maps partition keys to the stored values for those partitions. (Thanks @andrewgryan!)
- `JobDefinition.execute_in_process` now accepts a `run_config` argument even when the job is partitioned. If supplied, the run config will be used instead of any config provided by the job's `PartitionedConfig`.
- The `run_request_for_partition` method on jobs now accepts a `run_config` argument. If supplied, the run config will be used instead of any config provided by the job's `PartitionedConfig`.
- The new `NotebookMetadataValue` can be used to report the location of executed Jupyter notebooks, and Dagit will be able to render the notebook.
- …can now be configured in the `dagster.yaml` file; check out the docs.
- When using the `in_process` executor, where all steps are executed in the same process, the captured compute logs for all steps in a run will be captured in the same file.
- [dagstermill] Added `define_dagstermill_asset`, which loads a notebook as an asset.
- [dagster-airflow] `make_dagster_job_from_airflow_dag` now supports Airflow 2. There is also a new `mock_xcom` parameter that will mock all calls made by operators to xcom.
- For jobs partitioned with a hardcoded dictionary of config (rather than a `PartitionedConfig`), `run_request_for_partition` previously produced a run with no config. Now, the run has the hardcoded dictionary as its config.
- Previously, a dependency could resolve to the wrong `AssetsDefinition` through group-based asset dependency resolution, which would later error because of a circular dependency. This has been fixed. Thanks @jayhale!
- You can now apply a `key_prefix` with methods like `load_assets_from_modules`.
- This release includes an optional schema migration, which can be run via `dagster instance migrate`.
- Fixed an issue where names containing `-` were not being properly sanitized in some situations. Thanks @peay!
- [dagster-airbyte] Added `load_assets_from_airbyte_project`. Thanks @adam-bloom!
- …now retries on a `ServiceUnavailable` error. Thanks @cavila-evoliq!
- [dagster-dbt] Added a `display_raw_sql` flag to the dbt asset loading functions. If set to `False`, this will remove the raw SQL blobs from the asset descriptions. For large dbt projects, this can significantly reduce the size of the generated workspace snapshots.
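A quick sketch of the new `retry_policy` argument and of supplying `run_config` to a partitioned job's `execute_in_process`; the asset, op, job, and config values here are made up:

```python
from dagster import RetryPolicy, asset, daily_partitioned_config, job, op

# Retried up to 3 times, waiting 10 seconds between attempts.
@asset(retry_policy=RetryPolicy(max_retries=3, delay=10))
def flaky_asset():
    ...

@op(config_schema={"date": str})
def process_date(context):
    context.log.info(context.op_config["date"])

@daily_partitioned_config(start_date="2022-10-01")
def my_partitioned_config(start, _end):
    return {"ops": {"process_date": {"config": {"date": start.strftime("%Y-%m-%d")}}}}

@job(config=my_partitioned_config)
def my_job():
    process_date()

# run_config now overrides the config from the job's PartitionedConfig.
result = my_job.execute_in_process(
    run_config={"ops": {"process_date": {"config": {"date": "2022-11-01"}}}}
)
```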
- [dagster-airbyte] `load_assets_from_airbyte_project` now caches the project data generated at repo load time, so it does not have to be regenerated in subprocesses.
- [dagster-airbyte] …when using `load_assets_from_airbyte_instance` or `load_assets_from_airbyte_project`.
- [dagit] You can now hold `Shift` while clicking a repository name, and all repository groups will be collapsed or expanded accordingly.
- [dagster-aws] The `EcsRunLauncher` now allows you to pass in a dictionary in the `task_definition` config field that specifies configuration for the task definition of the launched run, including role ARNs and a list of sidecar containers to include. Previously, the task definition could only be configured by passing in a task definition ARN or by basing the task definition off of the task definition of the ECS task launching the run. See the docs for the full set of available config.
- Previously, returning a `SkipReason` within a multi-asset sensor (experimental) would raise an error. This has been fixed.
- Previously, in some cases when launching a job defined with `define_asset_job` from Dagit, you would run into a `CheckError`. This has been fixed.
- `AssetMaterialization` now has a `metadata` property, which allows accessing the materialization's metadata as a dictionary.
- `DagsterInstance` now has a `get_latest_materialization_event` method, which allows fetching the most recent materialization event for a particular asset key.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now work with I/O managers whose `load_input` implementation accesses the `op_def` and `name` attributes on the `InputContext`.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now respect the `DAGSTER_HOME` environment variable.
- `InMemoryIOManager`, the I/O manager that backs `mem_io_manager`, has been added to the public API.
- The `multi_asset_sensor` (experimental) now supports marking individual partitioned materializations as "consumed". Unconsumed materializations will appear in future calls to partitioned context methods.
- The `build_multi_asset_sensor_context` testing method (experimental) now contains a flag to set the cursor to the newest events in the Dagster instance.
- `TableSchema` now has a static constructor that enables building it from a dictionary of column names to column types.
- Added a new CLI command, `dagster run migrate-repository`, which lets you migrate the run history for a given job from one repository to another. This is useful to preserve run history for a job when you have renamed a repository, for example.
- [dagster-airflow] `DagsterCloudOperator` and `DagsterOperator` now support Airflow 2. Previously, installing the library on Airflow 2 would break due to an import error.
- Previously, when a run exceeded its retry limit within `execute_in_process`, no error would be raised. Now, a `DagsterMaxRetriesExceededError` will be raised.
- [dagster-dbt] When running tests with `load_assets_from_...(..., use_build=True)`, `AssetObservation` events would be emitted for each test. These events would have metadata fields which shared names with the fields added to the `AssetMaterialization` events, causing confusing historical graphs for fields such as Compilation Time. This has been fixed.
- [dagster-dbt] …generated by `load_assets_from_...` was non-deterministic for dbt projects which pulled in external packages, leading to errors when executing across multiple processes. This has been fixed.
- [dagster-airflow] …`dagster_airflow.hooks`. Thanks @bollwyvl!
- [dagster-gcp] `dagster-gcp` now supports `google-api-python-client` 2.x. Thanks @amarrella!
- …`build_asset_reconciliation_sensor`.
- …a `multi_asset_sensor` and kicking off subsequent partitioned runs.
- The `multi_asset_sensor` (experimental) now accepts an `AssetSelection` of assets to monitor. There are also minor API updates for the multi-asset sensor context (see the sketch after this list).
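A minimal sketch of a multi-asset sensor using the cursor-management methods described above. The parameter name `monitored_assets`, and the asset/job names, are assumptions based on docs from this era and may differ in your version:

```python
from dagster import AssetSelection, RunRequest, job, multi_asset_sensor, op

@op
def do_something():
    ...

@job
def my_downstream_job():
    do_something()

@multi_asset_sensor(
    monitored_assets=AssetSelection.keys("asset_a", "asset_b"),  # param name assumed
    job=my_downstream_job,
)
def both_assets_sensor(context):
    # Latest unconsumed materialization record for each monitored asset key.
    records = context.latest_materialization_records_by_key()
    if all(records.values()):
        # Mark these materializations as consumed so they are not
        # returned again on future ticks.
        context.advance_all_cursors()
        return RunRequest()
```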
- `AssetValueLoader`, the type returned by `RepositoryDefinition.get_asset_value_loader`, is now part of Dagster's public API; see the example after this list.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now support a `partition_key` argument.
- `RepositoryDefinition.load_asset_value` and `AssetValueLoader.load_asset_value` now work with I/O managers that invoke `context.upstream_output.asset_key`.
- The timeout when loading code locations locally can now be configured in `dagster.yaml` as follows:

  ```yaml
  code_servers:
    local_startup_timeout: 120
  ```
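For example, loading asset values outside of a run might look like the following sketch; the repository and asset names are hypothetical, and the asset must have been materialized previously for the load to succeed:

```python
from dagster import AssetKey, asset, repository

@asset
def users():
    return ["alice", "bob"]

@repository
def my_repo():
    return [users]

# Load a single asset value as a Python object (useful in notebooks).
value = my_repo.load_asset_value(AssetKey("users"))

# Load several asset values while sharing resource initialization.
with my_repo.get_asset_value_loader() as loader:
    value_again = loader.load_asset_value(AssetKey("users"))
```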
- [dagster-airbyte] Added the `load_assets_from_airbyte_instance` function, which automatically generates asset definitions from an Airbyte instance. For more details, see the new Airbyte integration guide and the sketch after this list.
- [dagster-airflow] Added `DagsterCloudOperator` and `DagsterOperator`, which are Airflow operators that enable orchestrating Dagster jobs, running on either Cloud or OSS Dagit instances, from Apache Airflow.
- [dagster-airbyte] The `port` configuration in the `airbyte_resource` was marked as not required, but if it was not supplied, an error would occur. It is now marked as required.
- [dagster-dbt] …when using `load_assets_from_dbt_project` or `load_assets_from_manifest_json`. This has been fixed.
- Connections that fail with a `sqlalchemy.exc.TimeoutError` are now retried.
- [dagster-aws] The `redshift_resource` no longer accepts a `schema` configuration parameter. Previously, this parameter would error whenever used, because Redshift connections do not support this parameter.
- [dagster-pyspark] Added a `LazyPysparkResource` that only initializes a Spark session once it's accessed. (Thank you @zyd14!)
- The new `build_asset_reconciliation_sensor` function accepts a set of software-defined assets and returns a sensor that automatically materializes those assets after their parents are materialized.
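A sketch of generating assets from an Airbyte instance; the host and port values are placeholders for your own deployment:

```python
from dagster_airbyte import airbyte_resource, load_assets_from_airbyte_instance

# Points at a running Airbyte deployment; host and port are placeholders.
airbyte_instance = airbyte_resource.configured(
    {"host": "localhost", "port": "8000"}
)

# One software-defined asset is generated per stream of each Airbyte connection.
airbyte_assets = load_assets_from_airbyte_instance(airbyte_instance)
```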