Baikal provides real-time data synchronization between relational databases and the Hadoop native data-store. It allows interactive queries on the real-time data in the Hadoop native data-store through a Spark or Hive interface. Its extensible architecture allows data in the Hadoop native data-store to be synchronized to downstream data pools for further big data applications such as data mining, machine learning and AI, thus providing end-to-end real-time data ingestion and provisioning.
Baikal is easy to use. It has a web-based management tool that provides administrative functions such as configuration and monitoring of the real-time data transfer from relational databases to the Hadoop native data-store. The open source edition supports MySQL, DB2 and PostgreSQL as source databases. Support for other database types is under development and will be included when finished.
| Feature | Community Version | Enterprise Version |
|---|---|---|
| Auto deployment tool | √ | √ |
| Web-based management tool | √ | √ |
| Real-time data sync | √ | √ |
| Data encryption | x | √ |
| Data transfer encryption | √ | √ |
| Dynamic encryption key | x | √ |
| Database level snapshot | √ | √ |
| Row/Column level access control | x | √ |
| Configurable data desensitization | √ | √ |
| Data consistency guarantee | √ | √ |
| Carbondata support | √ | √ |
| Kerberos support | x | √ |
| Pipeline monitoring by machine learning | x | √ |
| Selectable encryption method | x | √ |
| Data quality audit | x | √ |
| Data access monitoring/audit | x | √ |
| Diagnosis report | x | √ |