The Data Science Journey

Here is my journey through applied machine learning, analytics, and physics-driven research, all aimed at real-world impact.

Welcome! From agentic support assistants to cosmological model testing, I connect six years of quantitative research with pragmatic product thinking. Explore the highlights below to see how I design, ship, and interpret data-centric solutions.

AI IT Support Resilient Ticket Assistant

An LLM-powered agent that orchestrates ticket collection, classification, and grounded responses so support teams can close loops faster.

AI ticket assistant screenshot
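To make the orchestration idea concrete, here is a minimal Python sketch of the collect → classify → ground → reply loop. Every name in it (the Ticket dataclass, the keyword classifier, the knowledge-base lookup) is an illustrative stand-in rather than the project's actual code, and the LLM call is reduced to a simple template.

```python
# Minimal sketch of the ticket-handling loop; all names are illustrative
# placeholders, not the project's real API. The LLM call is stubbed out.
from dataclasses import dataclass

@dataclass
class Ticket:
    user: str
    text: str

CATEGORIES = {"password": "account", "invoice": "billing", "error": "technical"}

def classify(ticket: Ticket) -> str:
    """Assign a coarse category; the real agent uses an LLM-based classifier."""
    for keyword, category in CATEGORIES.items():
        if keyword in ticket.text.lower():
            return category
    return "general"

def retrieve_context(category: str) -> str:
    """Look up grounding material (knowledge-base snippet) for the category."""
    knowledge_base = {
        "account": "Password resets are self-service via the portal.",
        "billing": "Invoices are issued on the first business day of the month.",
        "technical": "Collect error codes and logs before escalating.",
        "general": "Route to the triage queue.",
    }
    return knowledge_base[category]

def draft_reply(ticket: Ticket) -> str:
    """Compose a grounded reply; an LLM call would sit here in the real agent."""
    category = classify(ticket)
    context = retrieve_context(category)
    return f"[{category}] {context}"

print(draft_reply(Ticket(user="alice", text="I get an error when logging in")))
```

In the actual assistant, an LLM performs the classification and drafting against retrieved context, as described above.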

Django & ArXiv Article Discovery

A full-stack filter that ranks ArXiv submissions with NLP, surfacing the papers that matter to a target research agenda.

Django ArXiv article filter combined view
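One way to sketch the ranking step is TF-IDF plus cosine similarity between a research-agenda description and incoming abstracts. Whether the project uses exactly this scoring is an assumption; the agenda string and abstracts below are toy data, and the real app wraps the scoring in a full-stack Django interface fed by ArXiv submissions.

```python
# Illustrative ranking step only (TF-IDF + cosine similarity); toy abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

agenda = "bayesian inference for early dark energy and CMB constraints"
abstracts = [
    "We constrain early dark energy models with Planck CMB data.",
    "A new graph neural network for traffic forecasting.",
    "Bayesian model comparison of dark energy equations of state.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([agenda] + abstracts)
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()

# Highest-scoring abstracts first
for score, abstract in sorted(zip(scores, abstracts), reverse=True):
    print(f"{score:.2f}  {abstract}")
```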

Periodic K-Means on Household Power Data

Clustering household energy signatures with a periodic kernel to reveal consumption archetypes and demand peaks.

Household power consumption clustering visualizations (plots 1 and 2)
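As a minimal sketch of the periodic idea: one common trick is to place the hour of day on a circle via sine and cosine so that k-means distances wrap correctly around midnight. The project's actual periodic kernel may differ, and the readings below are synthetic.

```python
# Sketch: encode hour-of-day on a circle so k-means respects daily periodicity.
# Synthetic data with an artificial evening peak; not the project's dataset.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=500)            # time of each reading
power = np.where((hours > 17) & (hours < 22),    # evening-peak archetype
                 rng.normal(3.0, 0.5, 500),
                 rng.normal(1.0, 0.3, 500))

angle = 2 * np.pi * hours / 24
features = np.column_stack([np.sin(angle), np.cos(angle), power])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for k in range(3):
    print(f"cluster {k}: mean power {power[labels == k].mean():.2f} kW")
```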

Biodiversity EDA & Hypothesis Testing

Exploratory analytics for U.S. National Parks biodiversity, pairing statistical testing with storytelling visuals.

Biodiversity analysis plots 1 and 2 – US National Parks
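On the hypothesis-testing side, a typical step is a chi-square test of independence, e.g. asking whether protection rates differ between species categories. The contingency counts below are synthetic placeholders, not the project's real figures.

```python
# Illustrative chi-square test of independence on a 2x2 contingency table.
from scipy.stats import chi2_contingency

#                protected  not protected
contingency = [[30, 146],    # e.g. one species category
               [75, 413]]    # e.g. another species category

chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
# A small p-value would suggest protection rates depend on the category.
```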

Forest Cover Type Classification

Sequential Keras models that push predictive accuracy on forest cover labels while balancing complexity and generalisation.

Stack: Keras, TensorFlow, structured feature engineering, disciplined experiment tracking.

The notebook walks through the model family and highlights how additional depth and regularisation impact performance.

Soil type features visualization; neural network model comparison for cover type classification
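For flavour, a baseline of the kind of Keras Sequential model compared in the notebook might look like the sketch below; the layer widths, dropout rate, and optimiser are illustrative choices rather than the notebook's exact configuration.

```python
# Illustrative Keras Sequential baseline for cover-type classification.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_features: int, n_classes: int = 7) -> keras.Model:
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),                     # regularisation to curb overfitting
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model(n_features=54)   # 54 features in the classic cover-type dataset
model.summary()
```

Deeper variants add layers and stronger regularisation, which is the trade-off the notebook explores.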

Cosmology Model Testing (NEDE)

Bayesian inference for novel dark energy models, blending theoretical physics priors with observational data constraints.

Bayesian parameter triangle plot; spectral index vs tensor-to-scalar ratio plot
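As a toy illustration of the Bayesian machinery (flat priors, a likelihood, MCMC posterior samples of the kind summarised in the triangle plot), the sketch below uses emcee on a synthetic Gaussian likelihood. The parameter names, prior ranges, and "data" are purely illustrative; the real analysis evaluates full cosmological likelihoods.

```python
# Toy prior + likelihood -> posterior samples with emcee; not the NEDE pipeline.
import numpy as np
import emcee

def log_prior(theta):
    f_nede, log10_z = theta
    if 0.0 < f_nede < 0.3 and 3.0 < log10_z < 4.0:   # flat priors, illustrative ranges
        return 0.0
    return -np.inf

def log_likelihood(theta):
    f_nede, log10_z = theta
    # Pretend the data prefer f_nede ~ 0.1 and log10_z ~ 3.5
    return -0.5 * (((f_nede - 0.1) / 0.02) ** 2 + ((log10_z - 3.5) / 0.1) ** 2)

def log_posterior(theta):
    lp = log_prior(theta)
    return lp + log_likelihood(theta) if np.isfinite(lp) else -np.inf

ndim, nwalkers = 2, 16
start = np.array([0.1, 3.5]) + 1e-3 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior)
sampler.run_mcmc(start, 2000, progress=False)
samples = sampler.get_chain(discard=500, flat=True)
print(samples.mean(axis=0))
```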

Fundamentals of NLP – Disaster Tweets

End-to-end tweet classification with conventional ML pipelines, showing that solid features can rival deep models on lean datasets.

The workflow covers data cleaning, tokenisation, TF-IDF features, Naive Bayes and Logistic Regression models, and disciplined hyper-parameter tuning.

Tweet classification project thumbnail
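Below is a compact scikit-learn sketch of that conventional pipeline, with toy tweets standing in for the real dataset. The grid values and the Logistic Regression head are illustrative of the approach rather than the project's exact settings; swapping in MultinomialNB covers the Naive Bayes variant.

```python
# Illustrative TF-IDF + linear classifier pipeline with a small search grid.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

tweets = ["Forest fire near La Ronge", "I love this sunny day",
          "Flood warning issued for the valley", "New coffee place downtown"]
labels = [1, 0, 1, 0]   # 1 = real disaster, 0 = not

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", LogisticRegression(max_iter=1000)),
])

grid = GridSearchCV(pipeline,
                    {"tfidf__ngram_range": [(1, 1), (1, 2)], "clf__C": [0.1, 1, 10]},
                    cv=2)
grid.fit(tweets, labels)
print(grid.best_params_, grid.predict(["Evacuation ordered after earthquake"]))
```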