Production-grade ETL pipeline: CSV and PostgreSQL sources loaded into a Snowflake warehouse with star-schema modelling and SCD Type 2 history. Built during my Data Engineering internship at Nagarro.
Daily batch flow from raw sources through transform to warehouse
Fact at centre · four conformed dimensions · Kimball-style
Update a customer attribute to see the current row expire and its history preserved
| ck | cust_id | name | city | eff_date | exp_date | is_cur |
|---|---|---|---|---|---|---|
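
The expire-and-insert behaviour behind this table can be illustrated with a minimal in-memory sketch. It is not the pipeline's actual code: the `CustomerRow` class, the `apply_scd2` function, and the `OPEN_END` sentinel date are hypothetical names introduced here, and the field names simply mirror the column headers above. It assumes that `name` and `city` are the tracked attributes and that open-ended rows carry a far-future expiry date.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List

# Assumption: current rows use a far-future sentinel as their expiry date.
OPEN_END = date(9999, 12, 31)


@dataclass(frozen=True)
class CustomerRow:
    ck: int          # surrogate key of this row version
    cust_id: str     # natural key from the source system
    name: str
    city: str
    eff_date: date   # date this version became effective
    exp_date: date   # date this version was superseded
    is_cur: bool     # True only for the latest version


def apply_scd2(dim: List[CustomerRow], cust_id: str, name: str,
               city: str, as_of: date) -> List[CustomerRow]:
    """Return a new dimension list after applying one source record."""
    current = next((r for r in dim if r.cust_id == cust_id and r.is_cur), None)

    # Unchanged attributes: nothing to expire, nothing to insert.
    if current is not None and (current.name, current.city) == (name, city):
        return list(dim)

    next_ck = max((r.ck for r in dim), default=0) + 1
    out = []
    for r in dim:
        if r is current:
            # Expire the old version instead of overwriting it.
            out.append(replace(r, exp_date=as_of, is_cur=False))
        else:
            out.append(r)

    # Insert the new version as the current row.
    out.append(CustomerRow(next_ck, cust_id, name, city, as_of, OPEN_END, True))
    return out


# Made-up sample data, for illustration only.
dim = apply_scd2([], "C001", "Asha", "Pune", date(2024, 1, 1))
dim = apply_scd2(dim, "C001", "Asha", "Delhi", date(2024, 6, 1))
```

After the second call the Pune row is kept with `is_cur=False` and an `exp_date` equal to the change date, while a new Delhi row becomes current. Applying the same record again leaves the dimension unchanged.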
Last 30-day window · production instance
Incremental extraction via updated_at watermarks; log-based CDC is deferred until WAL rates justify it.
Idempotent loads via MERGE INTO on natural keys, so the pipeline is safely re-runnable.
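
The two ideas here, pulling only rows newer than a stored watermark and merging them by natural key so a re-run changes nothing, can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline's code: `extract_since` and `merge_by_key` are hypothetical helpers, a dict keyed on `cust_id` stands in for the warehouse table, and the dict assignment stands in for what a `MERGE INTO` statement would do in the warehouse.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def extract_since(rows: List[Row], watermark: datetime) -> Tuple[List[Row], datetime]:
    """Return rows updated after the watermark, plus the advanced watermark."""
    changed = [r for r in rows if r["updated_at"] > watermark]
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark


def merge_by_key(target: Dict[str, Row], batch: List[Row],
                 key: str = "cust_id") -> Dict[str, Row]:
    """Upsert a batch into the target by natural key.

    Existing keys are overwritten and new keys are inserted, so applying
    the same batch twice leaves the target in the same state.
    """
    for r in batch:
        target[r[key]] = dict(r)
    return target


# Made-up sample data, for illustration only.
source = [
    {"cust_id": "C001", "city": "Pune", "updated_at": datetime(2024, 1, 1)},
    {"cust_id": "C002", "city": "Delhi", "updated_at": datetime(2024, 1, 3)},
]
target: Dict[str, Row] = {}

batch, watermark = extract_since(source, datetime(2024, 1, 2))
merge_by_key(target, batch)
merge_by_key(target, batch)  # simulated re-run: same end state
```

One caveat with the strict `>` comparison: a row committed later with a timestamp equal to the stored watermark would be skipped. A common mitigation is to re-read a small overlap window, which the key-based merge makes safe.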