Scaling Real-Time Data Pipelines to 500M Events Per Day
A deep dive into architectural decisions, bottleneck identification, and optimization techniques used to handle massive event throughput with Apache Spark and Kafka.
Architecting real-time pipelines, building ML models, and transforming data into insights
Real-world solutions across data engineering, ML, and analytics
Built a high-throughput data pipeline processing millions of events daily using Apache Spark and Kafka. Implemented auto-scaling on AWS to handle peak loads with 99.99% uptime.
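As a rough illustration of that architecture, here is a minimal Spark Structured Streaming sketch that consumes events from Kafka; the broker address, topic name, event schema, and S3 paths are placeholders for illustration, not the production configuration.

```python
# Minimal Spark Structured Streaming consumer for a Kafka topic.
# Requires the spark-sql-kafka connector package on the Spark classpath.
# Broker address, topic name, schema, and paths are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("event-pipeline").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("timestamp", LongType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .load()
    # Kafka delivers the payload as bytes; parse the JSON value into columns.
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3://bucket/events/")              # placeholder sink
    .option("checkpointLocation", "s3://bucket/checkpoints/")
    .start()
)
query.awaitTermination()
```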
Developed ensemble machine learning models (XGBoost, LightGBM) for revenue forecasting with 94% accuracy. Deployed as API service with real-time predictions.
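A minimal sketch of how an averaged XGBoost/LightGBM ensemble for a regression target can be wired together; the synthetic data, hyperparameters, and equal weighting below are illustrative assumptions, not the actual forecasting models.

```python
# Simple averaged ensemble of XGBoost and LightGBM regressors.
# Data, hyperparameters, and the 50/50 blend are illustrative only.
import numpy as np
import xgboost as xgb
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

X, y = np.random.rand(1000, 20), np.random.rand(1000)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

xgb_model = xgb.XGBRegressor(n_estimators=200, max_depth=6).fit(X_train, y_train)
lgb_model = lgb.LGBMRegressor(n_estimators=200, num_leaves=31).fit(X_train, y_train)

# Blend the two models with equal weights; in practice the weights would be tuned.
preds = 0.5 * xgb_model.predict(X_test) + 0.5 * lgb_model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, preds))
```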
Created comprehensive Tableau dashboards analyzing customer behavior, product performance, and revenue trends. Self-service analytics reduced report requests by 60%.
Designed scalable ETL framework using Apache Airflow and dbt. Automated data transformation pipeline with monitoring, error handling, and data quality checks.
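A minimal sketch of an Airflow DAG that orchestrates dbt via BashOperator, with a test step gating on the transformation step; the DAG id, schedule, and project path are placeholders.

```python
# Sketch of an Airflow DAG that runs a dbt transformation followed by dbt tests.
# DAG id, schedule, and dbt project path are illustrative placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_transform",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/project",  # placeholder path
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/project",
    )
    # Transformations must succeed before data quality tests run.
    dbt_run >> dbt_test
```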
Led migration of on-premise data warehouse to AWS Redshift with zero downtime. Optimized queries and implemented columnar compression reducing costs by 40%.
Built data governance framework with automated quality checks, lineage tracking, and metadata management using Apache Atlas and custom Python pipelines.
Proficient across the modern data stack and cloud platforms
8+ years of progressive experience in data engineering and analytics
Tech Company
Leading data infrastructure and platform initiatives
Analytics Startup
Building machine learning products and data analytics
E-commerce Company
Analytics platform development and business intelligence
Financial Services
BI development and data warehousing
Tech Startup
Getting started with data analysis and visualization
Sharing knowledge on data engineering, ML, and analytics
Lessons learned from deploying 10+ ML models: choosing the right frameworks, handling model drift, A/B testing strategies, and monitoring in production.
Practical strategies for optimizing Redshift costs including query optimization, data partitioning, compression techniques, and workload isolation.
Framework for building robust data quality checks using Python and Airflow. How to define SLAs, catch data issues early, and maintain data trust.
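To give a flavor of what such a framework can look like, here is a minimal sketch of an Airflow task that runs a row-count check and fails loudly when the threshold is not met; the file path, table, and minimum-row threshold are assumptions for illustration.

```python
# Sketch of a simple data quality check wired into Airflow.
# The check function, input path, and minimum-row threshold are illustrative.
from datetime import datetime
import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

def check_row_count(path: str, min_rows: int) -> None:
    """Fail the task if the extracted file has fewer rows than expected."""
    df = pd.read_parquet(path)
    if len(df) < min_rows:
        raise ValueError(f"Quality check failed: {len(df)} rows < {min_rows}")

with DAG(
    dag_id="daily_quality_checks",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    row_count_check = PythonOperator(
        task_id="orders_row_count",
        python_callable=check_row_count,
        op_kwargs={"path": "s3://bucket/orders.parquet", "min_rows": 1000},
    )
```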
Advanced SQL optimization techniques: query plans, index strategies, statistics, and real-world examples that reduced query times by 90%+.
Building scalable analytics platforms: governance models, semantic layers, performance optimization, and strategies for 40K+ daily active users.
Open to collaboration, opportunities, and discussing all things data
Interested in working together on data projects or want to discuss your pipeline architecture?
Send me an email