MS Data Science @ UMBC | Data Analyst · Data Engineer · ML Engineer
📍 Bellevue, WA | 📫 LinkedIn
I'm a data scientist with hands-on experience building end-to-end ML pipelines, analyzing large-scale datasets, and deploying predictive models that solve real business problems. I'm currently completing my Master's in Data Science at UMBC, with a focus on classification, prediction, and big data processing.
I enjoy turning messy, raw data into clear decisions, whether that's through a well-tuned model, a clean SQL query, or an interactive dashboard.
Languages
Python SQL R
Machine Learning & AI
scikit-learn TensorFlow PyTorch XGBoost
Data Engineering & Big Data
Apache Spark Hadoop MapReduce ETL Pipelines
Visualization & BI
Tableau Power BI Matplotlib Seaborn Plotly
Tools & Deployment
Streamlit Jupyter Git Pandas NumPy
UMBC Data Science Capstone — Built a classification model to predict outcomes from structured real-world data. Applied feature engineering, compared models (Logistic Regression, Random Forest, XGBoost), and deployed an interactive Streamlit app for live predictions.
Python scikit-learn XGBoost Streamlit Pandas
→ View project
Big Data Processing with Spark & Hadoop — Designed and implemented MapReduce and Spark pipelines to process and analyze large-scale datasets. Demonstrated distributed computing fundamentals on real data workloads.
Apache Spark Hadoop MapReduce Python Jupyter
→ View project
End-to-end ML project — Merged 6 data sources, engineered features, and trained Logistic Regression, Random Forest, and XGBoost models, achieving an AUC-ROC of 0.9903. Added SHAP explainability and deployed the result as a live Streamlit web app.
Python scikit-learn XGBoost SHAP Streamlit Pandas
→ View project | → Live Demo
- Expanding SQL + Python analytics portfolio with business-focused EDA projects
- Exploring FastAPI for ML model serving
If you're hiring for data analyst, data engineer, or ML engineer roles, I'd love to connect.