The Data Science Journey

Here is my journey through applied machine learning, analytics, and physics-driven research, all aimed at real-world impact.

Welcome! From agentic support assistants to cosmological model testing, I connect six years of quantitative research with pragmatic product thinking. Explore the highlights below to see how I design, ship, and interpret data-centric solutions.

AI IT Support Resilient Ticket Assistant

An LLM-powered agent that orchestrates ticket collection, classification, and grounded responses so support teams can close loops faster.

AI ticket assistant screenshot
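To make the orchestration idea concrete, here is a minimal Python sketch of the collect → classify → ground → reply loop. Every name in it (the Ticket dataclass, the keyword classifier, the knowledge-base lookup) is an illustrative stand-in rather than the project's actual code, and the LLM call is reduced to a simple template.

```python
# Minimal sketch of the ticket-handling loop; all names are illustrative
# placeholders, not the project's real API. The LLM call is stubbed out.
from dataclasses import dataclass

@dataclass
class Ticket:
    user: str
    text: str

CATEGORIES = {"password": "account", "invoice": "billing", "error": "technical"}

def classify(ticket: Ticket) -> str:
    """Assign a coarse category; the real agent uses an LLM-based classifier."""
    for keyword, category in CATEGORIES.items():
        if keyword in ticket.text.lower():
            return category
    return "general"

def retrieve_context(category: str) -> str:
    """Look up grounding material (knowledge-base snippet) for the category."""
    knowledge_base = {
        "account": "Password resets are self-service via the portal.",
        "billing": "Invoices are issued on the first business day of the month.",
        "technical": "Collect error codes and logs before escalating.",
        "general": "Route to the triage queue.",
    }
    return knowledge_base[category]

def draft_reply(ticket: Ticket) -> str:
    """Compose a grounded reply; an LLM call would sit here in the real agent."""
    category = classify(ticket)
    context = retrieve_context(category)
    return f"[{category}] {context}"

print(draft_reply(Ticket(user="alice", text="I get an error when logging in")))
```

In the actual assistant, an LLM performs the classification and drafting against retrieved context, as described above.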

Django & ArXiv Article Discovery

A full-stack filter that ranks ArXiv submissions with NLP, surfacing the papers that matter to a target research agenda.

Django ArXiv article filter combined view
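One way to sketch the ranking step is TF-IDF plus cosine similarity between a research-agenda description and incoming abstracts. Whether the project uses exactly this scoring is an assumption; the agenda string and abstracts below are toy data, and the real app wraps the scoring in a full-stack Django interface fed by ArXiv submissions.

```python
# Illustrative ranking step only (TF-IDF + cosine similarity); toy abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

agenda = "bayesian inference for early dark energy and CMB constraints"
abstracts = [
    "We constrain early dark energy models with Planck CMB data.",
    "A new graph neural network for traffic forecasting.",
    "Bayesian model comparison of dark energy equations of state.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([agenda] + abstracts)
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()

# Highest-scoring abstracts first
for score, abstract in sorted(zip(scores, abstracts), reverse=True):
    print(f"{score:.2f}  {abstract}")
```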

Periodic K-Means on Household Power Data

Clustering household energy signatures with a periodic kernel to reveal consumption archetypes and demand peaks.

Household power consumption clustering visualizations (plots 1 and 2)
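As a minimal sketch of the periodic idea: one common trick is to place the hour of day on a circle via sine and cosine so that k-means distances wrap correctly around midnight. The project's actual periodic kernel may differ, and the readings below are synthetic.

```python
# Sketch: encode hour-of-day on a circle so k-means respects daily periodicity.
# Synthetic data with an artificial evening peak; not the project's dataset.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=500)            # time of each reading
power = np.where((hours > 17) & (hours < 22),    # evening-peak archetype
                 rng.normal(3.0, 0.5, 500),
                 rng.normal(1.0, 0.3, 500))

angle = 2 * np.pi * hours / 24
features = np.column_stack([np.sin(angle), np.cos(angle), power])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for k in range(3):
    print(f"cluster {k}: mean power {power[labels == k].mean():.2f} kW")
```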

Biodiversity EDA & Hypothesis Testing

Exploratory analytics for U.S. National Parks biodiversity, pairing statistical testing with storytelling visuals.

Biodiversity analysis plots 1 and 2 – US National Parks
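On the hypothesis-testing side, a typical step is a chi-square test of independence, e.g. asking whether protection rates differ between species categories. The contingency counts below are synthetic placeholders, not the project's real figures.

```python
# Illustrative chi-square test of independence on a 2x2 contingency table.
from scipy.stats import chi2_contingency

#                protected  not protected
contingency = [[30, 146],    # e.g. one species category
               [75, 413]]    # e.g. another species category

chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
# A small p-value would suggest protection rates depend on the category.
```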

Forest Cover Type Classification

Sequential Keras models that push predictive accuracy on forest cover labels while balancing complexity and generalisation.

Stack: Keras, TensorFlow, structured feature engineering, disciplined experiment tracking.

The notebook walks through the model family and highlights how additional depth and regularisation impact performance.

Soil type features visualization; neural network model comparison for cover type classification
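For flavour, a baseline of the kind of Keras Sequential model compared in the notebook might look like the sketch below; the layer widths, dropout rate, and optimiser are illustrative choices rather than the notebook's exact configuration.

```python
# Illustrative Keras Sequential baseline for cover-type classification.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_features: int, n_classes: int = 7) -> keras.Model:
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),                     # regularisation to curb overfitting
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model(n_features=54)   # 54 features in the classic cover-type dataset
model.summary()
```

Deeper variants add layers and stronger regularisation, which is the trade-off the notebook explores.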

Cosmology Model Testing (NEDE)

Bayesian inference for novel dark energy models, blending theoretical physics priors with observational data constraints.

Bayesian parameter triangle plot; spectral index vs tensor-to-scalar ratio plot
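As a toy illustration of the Bayesian machinery (flat priors, a likelihood, MCMC posterior samples of the kind summarised in the triangle plot), the sketch below uses emcee on a synthetic Gaussian likelihood. The parameter names, prior ranges, and "data" are purely illustrative; the real analysis evaluates full cosmological likelihoods.

```python
# Toy prior + likelihood -> posterior samples with emcee; not the NEDE pipeline.
import numpy as np
import emcee

def log_prior(theta):
    f_nede, log10_z = theta
    if 0.0 < f_nede < 0.3 and 3.0 < log10_z < 4.0:   # flat priors, illustrative ranges
        return 0.0
    return -np.inf

def log_likelihood(theta):
    f_nede, log10_z = theta
    # Pretend the data prefer f_nede ~ 0.1 and log10_z ~ 3.5
    return -0.5 * (((f_nede - 0.1) / 0.02) ** 2 + ((log10_z - 3.5) / 0.1) ** 2)

def log_posterior(theta):
    lp = log_prior(theta)
    return lp + log_likelihood(theta) if np.isfinite(lp) else -np.inf

ndim, nwalkers = 2, 16
start = np.array([0.1, 3.5]) + 1e-3 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior)
sampler.run_mcmc(start, 2000, progress=False)
samples = sampler.get_chain(discard=500, flat=True)
print(samples.mean(axis=0))
```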

Fundamentals of NLP – Disaster Tweets

End-to-end tweet classification with conventional ML pipelines, showing that solid features can rival deep models on lean datasets.

The workflow covers data cleaning, tokenisation, TF-IDF features, Naive Bayes and Logistic Regression models, and disciplined hyper-parameter tuning.

Tweet classification project thumbnail
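Below is a compact scikit-learn sketch of that conventional pipeline, with toy tweets standing in for the real dataset. The grid values and the Logistic Regression head are illustrative of the approach rather than the project's exact settings; swapping in MultinomialNB covers the Naive Bayes variant.

```python
# Illustrative TF-IDF + linear classifier pipeline with a small search grid.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

tweets = ["Forest fire near La Ronge", "I love this sunny day",
          "Flood warning issued for the valley", "New coffee place downtown"]
labels = [1, 0, 1, 0]   # 1 = real disaster, 0 = not

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("clf", LogisticRegression(max_iter=1000)),
])

grid = GridSearchCV(pipeline,
                    {"tfidf__ngram_range": [(1, 1), (1, 2)], "clf__C": [0.1, 1, 10]},
                    cv=2)
grid.fit(tweets, labels)
print(grid.best_params_, grid.predict(["Evacuation ordered after earthquake"]))
```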