Cloud Data Fusion overview
Cloud Data Fusion is a fully managed, cloud-native, enterprise data integration service for quickly building and managing data pipelines. The Cloud Data Fusion web interface lets you build scalable data integration solutions. It lets you connect to various data sources, transform the data, and then transfer it to various destination systems, without having to manage the infrastructure.
Cloud Data Fusion is powered by the open source project CDAP.
Get started with Cloud Data Fusion
You can start exploring Cloud Data Fusion in minutes.
- Create a Cloud Data Fusion instance: get started by creating a Cloud Data Fusion instance.
- Cost: before you begin, familiarize yourself with Cloud Data Fusion costs.
- Concepts: understand the key terminology used in Cloud Data Fusion.
- Quickstart: experience Cloud Data Fusion by creating your first pipeline.
Explore Cloud Data Fusion
The main components of Cloud Data Fusion are explained in the following sections.
Tenant project
The services required to build and orchestrate Cloud Data Fusion pipelines and to store pipeline metadata are provisioned in a tenant project, inside a tenancy unit. A separate tenant project is created for each customer project in which Cloud Data Fusion instances are provisioned. The tenant project inherits all the networking and firewall configurations from the customer project.
Cloud Data Fusion: Console
The Cloud Data Fusion console, also referred to as the control plane, is a set of API operations and a web interface for managing the Cloud Data Fusion instance itself, such as creating, deleting, restarting, and updating it.
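The four instance operations named above can be sketched as HTTP request descriptions. This is a minimal sketch, not an official client: it assumes the public Cloud Data Fusion REST API at `https://datafusion.googleapis.com/v1` and the resource name pattern `projects/*/locations/*/instances/*`. The `InstanceRequest` type and the `control_plane_request` helper are hypothetical names introduced only for illustration, and nothing is sent over the network.

```python
from dataclasses import dataclass

# Assumption: base URL of the public Cloud Data Fusion REST API (v1).
BASE_URL = "https://datafusion.googleapis.com/v1"


@dataclass(frozen=True)
class InstanceRequest:
    """Describes an HTTP request; this sketch never sends it."""
    method: str
    url: str


def control_plane_request(operation: str, project: str,
                          location: str, instance: str) -> InstanceRequest:
    """Map a control plane operation to an HTTP method and URL (hypothetical helper)."""
    parent = f"projects/{project}/locations/{location}"
    name = f"{parent}/instances/{instance}"
    if operation == "create":
        # The new instance ID is passed as a query parameter on the parent collection.
        return InstanceRequest("POST", f"{BASE_URL}/{parent}/instances?instanceId={instance}")
    if operation == "delete":
        return InstanceRequest("DELETE", f"{BASE_URL}/{name}")
    if operation == "restart":
        # Custom method, written with a colon suffix on the resource name.
        return InstanceRequest("POST", f"{BASE_URL}/{name}:restart")
    if operation == "update":
        return InstanceRequest("PATCH", f"{BASE_URL}/{name}")
    raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    # Placeholder project, region, and instance names.
    for op in ("create", "delete", "restart", "update"):
        print(control_plane_request(op, "my-project", "us-central1", "my-instance"))
```

A real call would also need an OAuth 2.0 access token and, for create and update, a request body describing the instance. Both are left out here.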
Cloud Data Fusion: Studio
Cloud Data Fusion Studio, also referred to as the data plane, is a set of REST API operations and a web interface for creating, executing, and managing pipelines and related artifacts.
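As a rough illustration of the data plane, the sketch below builds the URLs a client might call to list pipelines, read one pipeline, and start a batch pipeline. It assumes the instance exposes the open source CDAP `v3` REST conventions (pipelines are addressed as `apps` inside a namespace, and batch pipelines run as a workflow named `DataPipelineWorkflow`). The endpoint value and the helper names are hypothetical, and no request is sent.

```python
from urllib.parse import quote


def _ns_base(api_endpoint: str, namespace: str) -> str:
    # Percent-encode the namespace so it is safe as a single path segment.
    return f"{api_endpoint.rstrip('/')}/v3/namespaces/{quote(namespace, safe='')}"


def list_pipelines_url(api_endpoint: str, namespace: str) -> str:
    """URL for listing the pipelines (CDAP 'apps') in a namespace."""
    return f"{_ns_base(api_endpoint, namespace)}/apps"


def pipeline_url(api_endpoint: str, namespace: str, pipeline: str) -> str:
    """URL for reading one deployed pipeline."""
    return f"{list_pipelines_url(api_endpoint, namespace)}/{quote(pipeline, safe='')}"


def start_batch_pipeline_url(api_endpoint: str, namespace: str, pipeline: str) -> str:
    """URL to POST to in order to start a batch pipeline run (assumed CDAP convention)."""
    return f"{pipeline_url(api_endpoint, namespace, pipeline)}/workflows/DataPipelineWorkflow/start"


if __name__ == "__main__":
    # Hypothetical endpoint; a real one comes from the instance's details.
    endpoint = "https://example.invalid/api"
    print(list_pipelines_url(endpoint, "default"))
    print(start_batch_pipeline_url(endpoint, "default", "my_pipeline"))
```

Keeping URL construction separate from the HTTP call makes it easy to check the paths before adding authentication and error handling.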
Concepts
This section introduces some of the core concepts of Cloud Data Fusion.
Concept | Description |
---|---|
Cloud Data Fusion instance | |
Namespace | A namespace is a logical grouping of applications, data, and the associated metadata in a Cloud Data Fusion instance. You can think of namespaces as a partitioning of the instance. In a single instance, one namespace stores the data and metadata of an entity independently from another namespace. |
Pipeline | |
Pipeline node | |
Plugin | |
Hub | In the Cloud Data Fusion web interface, to browse plugins, sample pipelines, and other integrations, click Hub. When a new version of a plugin is released, it's visible in the Hub in any instance that's compatible. This applies even if the instance was created before the plugin was released. |
Pipeline preview | |
Pipeline execution | |
Compute profile | |
Reusable pipeline | |
Trigger | |
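The namespace row in the table describes namespaces as a partitioning of the instance. The toy in-memory model below illustrates that idea only: the same pipeline name can exist in two namespaces without the entries affecting each other. The `ToyInstance` class and its methods are hypothetical and are not part of any Cloud Data Fusion or CDAP API.

```python
class ToyInstance:
    """Toy model of an instance partitioned into namespaces (illustration only)."""

    def __init__(self) -> None:
        # One independent store of pipeline metadata per namespace.
        self._namespaces: dict[str, dict[str, dict]] = {}

    def create_namespace(self, name: str) -> None:
        if name in self._namespaces:
            raise ValueError(f"namespace already exists: {name}")
        self._namespaces[name] = {}

    def deploy(self, namespace: str, pipeline: str, metadata: dict) -> None:
        # Copy the metadata so later changes by the caller do not leak in.
        self._namespaces[namespace][pipeline] = dict(metadata)

    def get(self, namespace: str, pipeline: str) -> dict:
        return self._namespaces[namespace][pipeline]


if __name__ == "__main__":
    instance = ToyInstance()
    instance.create_namespace("dev")
    instance.create_namespace("prod")
    # Same pipeline name, stored independently in each namespace.
    instance.deploy("dev", "orders", {"schedule": "hourly"})
    instance.deploy("prod", "orders", {"schedule": "daily"})
    print(instance.get("dev", "orders"))
    print(instance.get("prod", "orders"))
```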
Cloud Data Fusion resources
Explore Cloud Data Fusion resources:
- Release notes provide change logs of features, changes, and deprecations
- Pricing for Cloud Data Fusion
- Supported regions for Cloud Data Fusion
- API and reference
What's next
- See Cloud Data Fusion use cases.
- Create a Cloud Data Fusion instance.
- Work through a tutorial.