ACID in Data Engineering: From Simple Examples to Distributed Systems Internals
A practical deep dive into how Atomicity, Consistency, Isolation, and Durability are implemented across databases, lakehouses, and distributed systems.

Architecting real-time pipelines, building ML models, and transforming data into insights
Carbon is a live product that helps users estimate their carbon footprint, understand where emissions come from, and prioritize improvements.
Users enter operational and activity data such as travel, energy, and resource consumption through a guided flow.
Carbon converts raw activity data into emission estimates using standardized calculation logic and transparent assumptions.
A focused dashboard highlights footprint hotspots and reduction opportunities to support faster, data-backed decisions.
Test the full user flow and see how Carbon turns operational input data into decision-ready sustainability insights.
Real-world solutions across data engineering, ML, and analytics
Built end-to-end ML system predicting optimal compute resources for job submissions. Implemented interactive dashboard with user authentication, text vectorization (TF-IDF, CountVectorizer), clustering (k-means, DBSCAN), and confidence scoring. Migrated from static reporting to dynamic analytics, achieving a 95%+ performance improvement.
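The vectorize-then-cluster step above can be sketched in plain Python. This is a minimal stdlib-only illustration of TF-IDF weighting followed by k-means (Lloyd's algorithm); a production system would use library implementations such as scikit-learn's TfidfVectorizer and KMeans, and the toy documents below are invented for the example.

```python
import math
from collections import Counter

def tfidf_vectorize(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (term -> weight dicts)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: (c / len(doc)) * math.log(n / df[t])
             for t, c in Counter(doc).items()}
            for doc in docs]

def distance(a, b):
    """Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def kmeans(vectors, k, iters=20):
    """Lloyd's algorithm on sparse vectors; returns a cluster label per vector."""
    # Naive init: first k points (a real implementation would use k-means++).
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        labels = [min(range(k), key=lambda c: distance(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                keys = set().union(*members)
                centroids[c] = {key: sum(m.get(key, 0.0) for m in members) / len(members)
                                for key in keys}
    return labels

docs = [["gpu", "job"], ["sales", "report"], ["gpu", "queue"], ["sales", "forecast"]]
labels = kmeans(tfidf_vectorize(docs), k=2)
# The two GPU documents land in one cluster, the two sales documents in the other.
```

Shared terms ("gpu", "sales") get a lower IDF weight but still dominate the distance between documents that share them, which is what pulls each pair into the same cluster.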
Designed and deployed production Tableau dashboard monitoring 6+ compute clusters across global locations. Implemented real-time resource allocation tracking, memory utilization analysis, and capacity visualization. Managed data consistency migration to new database system.
Developed comprehensive workload analyzer tracking job statistics, cluster performance, and resource utilization. Fixed critical bugs (date filtering, obsolete data visibility, labeling). Implemented analytics queries for user tracking across multiple compute clusters.
Built ML model predicting project delays using project management data. Implemented Recursive Feature Addition (RFA) with Random Forest Classifier. Engineered features (milestone position, task duration, completion percentage) with 4+ stages of cross-validation. Created fallback logic for missing baseline dates.
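The Recursive Feature Addition loop mentioned above can be sketched generically: start from the single best feature and keep adding the candidate that most improves a cross-validated score until no candidate helps. In the project the scorer wrapped a Random Forest with cross-validation; here `score_fn` is a stand-in for that, and the feature weights in the demo are invented for illustration.

```python
def recursive_feature_addition(features, score_fn, min_gain=0.0):
    """Greedy forward selection: repeatedly add the feature whose addition
    most improves score_fn, stopping when the best gain is <= min_gain."""
    selected, remaining = [], list(features)
    best_score = float("-inf")
    while remaining:
        # Score every candidate set `selected + [f]` and keep the best.
        cand_score, cand = max((score_fn(selected + [f]), f) for f in remaining)
        if selected and cand_score - best_score <= min_gain:
            break  # no remaining feature improves the model enough
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score

# Toy scorer: pretend each feature contributes a fixed amount of CV accuracy.
weights = {"milestone_position": 0.4, "task_duration": 0.3,
           "completion_pct": 0.2, "noise": -0.05}
selected, best = recursive_feature_addition(weights, lambda fs: sum(weights[f] for f in fs))
# "noise" hurts the score, so RFA stops before adding it.
```

With a real estimator, `score_fn` would retrain and cross-validate the Random Forest on each candidate feature set, which is why RFA is expensive but gives a score-ordered, minimal feature subset.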
Conducted exploratory analysis on engineering tool test results from compute and design teams. Integrated with log aggregation systems, created dashboards highlighting performance patterns and bottlenecks. Performed tool analysis extracting actionable insights for optimization.
Maintained enterprise reporting platform with data verification, automated cloud storage management via PowerAutomate workflows. Fixed cloud analytics data source issues, managed secret key rotation with Lambda, handled data source tagging and ownership. Optimized data ingestion frequency reducing costs.
Personal projects demonstrating technical skills and data analysis expertise
Comprehensive data cleaning project using MySQL on world layoff dataset. Demonstrates advanced SQL techniques for handling missing values, duplicates, and data standardization.
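The two core moves in that cleaning project — standardizing messy categorical values, then deleting exact duplicates — can be demonstrated end to end with Python's built-in sqlite3 (standing in for MySQL; the table schema and rows below are illustrative, not the real layoffs dataset).

```python
import sqlite3

# In-memory stand-in for the layoffs table (columns are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE layoffs (company TEXT, country TEXT, laid_off INTEGER);
    INSERT INTO layoffs VALUES
        ('Acme',   'united states',  100),
        ('Acme',   'United States.', 100),  -- duplicate with a messy country value
        ('Globex', 'Canada',         50);
""")

# Standardize: strip stray trailing periods from country names.
conn.execute("UPDATE layoffs SET country = TRIM(country, '.')")

# SQLite has no INITCAP, so normalize letter case in Python instead.
for rid, country in conn.execute("SELECT rowid, country FROM layoffs").fetchall():
    conn.execute("UPDATE layoffs SET country = ? WHERE rowid = ?",
                 (country.title(), rid))

# Deduplicate: keep only the first rowid within each identical group.
conn.execute("""
    DELETE FROM layoffs WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM layoffs
        GROUP BY company, country, laid_off)
""")
```

The dedup pattern (delete everything except `MIN(rowid)` per group) is the portable cousin of the `ROW_NUMBER() OVER (PARTITION BY ...)` approach commonly used in MySQL 8+; standardizing first matters, because two rows only count as duplicates after their values agree.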
Complete data cleaning and visualization project analyzing bike sales across Europe, Pacific, and North America using Excel. Includes pivot tables, charts, and trend analysis.
Interactive Power BI dashboard exploring data domain aspects. Demonstrates dashboard design, data modeling, and business intelligence best practices.
Tableau dashboard analyzing Airbnb housing listings in Amsterdam. Provides insights into pricing, location, availability, and market trends.
Python-based web scraper that monitors Amazon products and sends notifications for price drops. Demonstrates web scraping, data handling, and automation techniques.
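The core of such a price monitor is fetching a product page, extracting the price, and comparing it to a target. A minimal sketch is below: the `class="price"` markup and the `fetch` parameter are hypothetical (a real Amazon scraper would parse the page with BeautifulSoup and send a browser-like User-Agent), and `fetch` is injectable so the logic can be exercised without hitting the network.

```python
import re
import urllib.request

def parse_price(text):
    """Extract a float from a scraped price string like '$1,299.99'."""
    match = re.search(r"[\d,]+(?:\.\d+)?", text)
    if not match:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group().replace(",", ""))

def check_price(url, target, fetch=None):
    """Fetch the product page; return the price if it dropped to/below target,
    else None. `fetch` defaults to a plain HTTP GET and is injectable for tests."""
    fetch = fetch or (lambda u: urllib.request.urlopen(u).read().decode())
    html = fetch(url)
    # Hypothetical markup: price inside <span class="price">...</span>.
    match = re.search(r'class="price"[^>]*>([^<]+)<', html)
    if not match:
        return None
    price = parse_price(match.group(1))
    return price if price <= target else None
```

A scheduler (cron, or a `while True` loop with `time.sleep`) would call `check_price` periodically and trigger an email or push notification whenever it returns a price.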
Proficient across modern data stack and cloud platforms
8+ years of progressive experience in data
NXP Semiconductors
Leading data analytics, ML, and DevOps initiatives for HPC cluster management and job prediction systems
Techno Teams
Analytics and data-driven insights for marketing and sales optimization
ABN AMRO Bank N.V.
Research and stakeholder analysis for business and IT alignment
Techno Teams
Technical Writing & Content Strategy Department
Sharing knowledge on data engineering, ML, and analytics
Open to collaboration, opportunities, and discussing all things data
Interested in working together on data projects or want to discuss your pipeline architecture?
Send me an Email