Multi-repo architecture with real boundaries
Study how infrastructure, pipeline logic, orchestration, and analytics serving are split without losing the end-to-end shape.
Projects
This page is built for developers who want more than a project title. Each section explains what was built, why it matters, and which repositories are worth opening first.
Flagship case study
This is the strongest body of work on the site so far. It is a multi-repo platform built around an e-commerce source system, CDC ingestion, Bronze/Silver/Gold data shaping, analytics serving, and disciplined session-based operations.
Follow the flow from raw database changes to stakeholder-ready outputs, with validation, quarantine handling, and cost-aware operations.
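To make that flow concrete, here is a minimal sketch of the reconcile-and-quarantine step in plain Python. It is not taken from the project's repositories: the `ChangeEvent` shape, the `sequence` field, and the validation rules are all assumptions made for illustration. Valid change events are collapsed to the latest change per key, and invalid ones are set aside with a reason rather than silently dropped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    """Hypothetical CDC record: one operation on one keyed row."""
    key: Optional[str]
    op: str                # expected: "insert", "update", or "delete"
    sequence: int          # assumed to increase with each source change
    row: Optional[dict]    # row payload; may be None for deletes


VALID_OPS = {"insert", "update", "delete"}


def validate(event):
    """Return a rejection reason, or None when the event is usable."""
    if not event.key:
        return "missing key"
    if event.op not in VALID_OPS:
        return f"unknown op: {event.op}"
    if event.op != "delete" and event.row is None:
        return "missing row payload"
    return None


def reconcile(events):
    """Split events into current state and a quarantine list."""
    latest = {}
    quarantine = []
    for event in events:
        reason = validate(event)
        if reason is not None:
            # Keep the bad record and the reason so it can be inspected later.
            quarantine.append((event, reason))
            continue
        current = latest.get(event.key)
        # Latest change wins, even if events arrive out of order.
        if current is None or event.sequence > current.sequence:
            latest[event.key] = event
    # Keys whose latest change is a delete do not appear in the state.
    state = {k: e.row for k, e in latest.items() if e.op != "delete"}
    return state, quarantine


if __name__ == "__main__":
    sample = [
        ChangeEvent("o1", "insert", 1, {"status": "new"}),
        ChangeEvent("o1", "update", 2, {"status": "paid"}),
        ChangeEvent(None, "insert", 3, {"status": "new"}),
    ]
    state, quarantine = reconcile(sample)
    print(state)
    print([reason for _, reason in quarantine])
```

A real pipeline would do this over partitioned data in a distributed engine, but the two outputs stay the same: a reconciled state and a quarantine set that can be reviewed.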
Public repos worth opening
Private network, data lake buckets, IAM, serving infrastructure, monitoring, and the full AWS platform skeleton.
PySpark jobs and shared libraries for CDC reconciliation, Silver modeling, freshness metrics, and quarantine handling.
dbt transformations shaping the Gold analytics layer from Silver inputs, with a cleaner business-facing serving contract.
Natural-language analytics over Gold data with SQL guardrails, charts, PDF reports, and stakeholder-friendly answers.
GitHub Actions workflows that start a working platform session and destroy it afterwards to keep costs controlled.
Airflow DAGs for the same pipeline when a visual orchestration path is needed in MWAA.
GCP translation
The GCP work stays private because it is the active internal workspace where the architecture is still being shaped. The goal is not a superficial copy but a deep understanding of how the same platform ideas translate into GCP: organization and folder hierarchy, project-per-environment design, Workload Identity Federation, BigLake governance, BigQuery serving, and cleaner internal skills.
These are the working repos building the private GCP platform foundation from the ground up.
Other projects
A more focused GCP analytics pipeline outside the private enterprise rebuild.
A different warehouse shape that still reflects the same interest in clean modeling and business-facing outputs.
A medallion architecture implementation in a Databricks shape, useful for comparing platform thinking across ecosystems.