Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Updated Oct 24, 2025 - Java
Postgres-native columnar storage extension
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
An open-source columnar data format designed for fast, real-time analytics on big data.
Free and open-source schema versioning and database migration, built natively with .NET/6. New in May 2022: v1.3.15 released.
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
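Hydra (nine-headed dragon): built for PB-scale knowledge-base data retrieval, intelligence systems, data platforms, and large-scale control and scheduling systems. It targets large-scale data collection, analysis, and intelligent data retrieval, using a large-scale distributed crawler search engine as the example implementation.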
Time-series data warehouse built for speed. 2.42M records/sec on local NVMe. DuckDB + Parquet + Arrow + flexible storage (local/MinIO/S3). AGPL-3.0
Roadmap for Data Engineering
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule, and design data pipelines and workflows. Dataplane is written in Golang with a React front end.
Time-series anomaly detection and root cause analysis on data in SQL data warehouses and databases
Make dbt great again! Enables end users to extend dbt to fit their needs
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
OpenChatBI is an intelligent chat-based BI tool powered by large language models, designed to help users query, analyze, and visualize data through natural language conversations. It uses LangGraph and LangChain to build chat agents and workflows that support natural-language-to-SQL conversion and data analysis.
This repository stores my personal learning materials, documents, and notes from completing the Coursera IBM Data Engineer Professional Certificate specialization.
Accelerator to build a Microsoft Fabric modern data platform using pre-built reusable Fabric items and an orchestration ELT Framework