This repository contains a collection of projects that demonstrate the end-to-end workflow of a data analytics pipeline, with practical implementations of data engineering and analytical concepts in SQL.
Each project is designed to simulate real-world scenarios and includes the following stages:
- Data Extraction: Pulling datasets from online sources such as Kaggle and official websites.
- Data Ingestion & Cleaning: Importing raw flat files into a structured environment designed around the analysis requirements, handling missing values, duplicates, and inconsistencies, and standardizing formats for dates, strings, and categorical variables.
- Data Transformation & Analysis: Creating derived columns, calculated metrics, and aggregations that reshape and enrich the data, support analysis, and yield insights for solving business problems.
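As a rough illustration of the cleaning and transformation stages, the sketch below loads a few raw rows into an in-memory SQLite database, removes duplicates, standardizes strings and dates, and then aggregates the result. The table names (`raw_orders`, `orders_clean`), the columns, and the sample rows are all hypothetical and are not taken from any project in this repository. SQLite is used only because it ships with Python, so the SQL dialect may differ from the one used in the projects.

```python
import sqlite3

# In-memory database; all table names and rows here are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Ingestion: raw flat-file rows land in an all-text staging table.
CREATE TABLE raw_orders (order_id TEXT, customer TEXT, order_date TEXT, amount TEXT);
INSERT INTO raw_orders VALUES
  ('1', ' alice ', '2024-01-05', '120.50'),
  ('1', ' alice ', '2024-01-05', '120.50'),
  ('2', 'BOB',     '07/01/2024', ''),
  ('3', 'Carol',   '2024-01-09', '80');

-- Cleaning: drop exact duplicates, trim and upper-case names,
-- turn DD/MM/YYYY dates into ISO format, and map empty amounts to NULL.
CREATE TABLE orders_clean AS
SELECT DISTINCT
  CAST(order_id AS INTEGER) AS order_id,
  UPPER(TRIM(customer)) AS customer,
  CASE WHEN order_date LIKE '__/__/____'
       THEN substr(order_date, 7, 4) || '-' || substr(order_date, 4, 2)
            || '-' || substr(order_date, 1, 2)
       ELSE order_date
  END AS order_date,
  CAST(NULLIF(TRIM(amount), '') AS REAL) AS amount
FROM raw_orders;
""")

# Inspect the cleaned rows.
clean_rows = conn.execute(
    "SELECT order_id, customer, order_date, amount FROM orders_clean ORDER BY order_id"
).fetchall()

# Transformation and analysis: per-customer order count and total amount.
summary = conn.execute("""
SELECT customer,
       COUNT(*) AS n_orders,
       COALESCE(SUM(amount), 0) AS total_amount
FROM orders_clean
GROUP BY customer
ORDER BY customer
""").fetchall()

for row in summary:
    print(row)
```

Keeping the raw table as all-text and casting during cleaning is one common design choice: malformed values survive ingestion and can be inspected before they are converted or nulled out.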
The projects span a wide range of concepts, from data warehousing to exploratory data analysis (EDA), covering:
- Foundational Techniques: Includes basic data exploration, string operations, conditional logic and date manipulations.
- Advanced SQL Features: Includes aggregations, window functions, common table expressions (CTEs), subqueries, temporary tables, views and stored procedures for modular and reusable logic.
- SQL Projects
- Data Warehousing
- Data Exploration
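
To make a few of the advanced SQL features named above more concrete, here is a minimal sketch that combines a CTE, two window functions, and a view. The `sales` table, the `monthly_running_total` view, and the sample figures are hypothetical. The example runs on SQLite through Python's standard library and assumes SQLite 3.25 or newer for window function support. SQLite has no stored procedures, so that feature is not shown.

```python
import sqlite3

# In-memory database; the schema and data are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, sale_month TEXT, revenue REAL);
INSERT INTO sales VALUES
  ('North', '2024-01', 100), ('North', '2024-02', 150), ('North', '2024-03', 120),
  ('South', '2024-01', 200), ('South', '2024-02', 180);

-- A view wraps the logic so it can be reused like a table.
CREATE VIEW monthly_running_total AS
WITH monthly AS (
  -- CTE: aggregate revenue per region and month.
  SELECT region, sale_month, SUM(revenue) AS revenue
  FROM sales
  GROUP BY region, sale_month
)
SELECT region,
       sale_month,
       revenue,
       -- Window function: cumulative revenue within each region.
       SUM(revenue) OVER (PARTITION BY region ORDER BY sale_month) AS running_total,
       -- Window function: rank months by revenue within each region.
       RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS revenue_rank
FROM monthly;
""")

rows = conn.execute(
    "SELECT * FROM monthly_running_total ORDER BY region, sale_month"
).fetchall()

for row in rows:
    print(row)
```

Unlike `GROUP BY`, the window functions keep one output row per month while still computing per-region totals and ranks, which is why they suit running totals and rankings.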