ML Resilience Lab
About
ML Resilience Lab is an experimental playground that demonstrates resilience patterns, fault tolerance, and chaos engineering principles in production-grade machine learning pipelines. It implements a multi-layered defense system for real-time credit card fraud detection using a 5-stage Medallion Architecture. By combining a Python-based streaming backend with a Next.js dashboard, the system lets developers interactively inject faults (such as schema anomalies, API outages, or data drift) and observe automated self-healing reactions, such as circuit breakers and drift kill switches, in real time.
Technologies Used
Medallion Data Flow Architecture
Transactions stream through a progressive 5-stage Medallion Architecture: Bronze, Silver, Gold, Inference, and Decision. Together, the stages provide strict validation, enrichment, real-time ML scoring, and a final deterministic rule-fusion step.
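As a rough illustration of how five streaming stages can be chained, here is a minimal sketch using Python generators. It is not the project's actual code: the field names, the scoring formula, the thresholds, and the assignment of validation to Silver and enrichment to Gold are all assumptions made for the example.

```python
from typing import Iterable, Iterator

# Hypothetical schema; the real project's fields are not documented here.
REQUIRED_FIELDS = {"tx_id", "amount", "merchant"}


def bronze(raw: Iterable[dict]) -> Iterator[dict]:
    """Bronze: accept raw events as-is and tag them."""
    for event in raw:
        yield {**event, "stage": "bronze"}


def silver(events: Iterable[dict]) -> Iterator[dict]:
    """Silver: strict validation; events failing the schema check are dropped."""
    for e in events:
        if not REQUIRED_FIELDS.issubset(e):
            continue
        if not isinstance(e["amount"], (int, float)) or e["amount"] < 0:
            continue
        yield {**e, "stage": "silver"}


def gold(events: Iterable[dict]) -> Iterator[dict]:
    """Gold: enrichment with a derived feature (illustrative threshold)."""
    for e in events:
        yield {**e, "is_large": e["amount"] > 1000, "stage": "gold"}


def inference(events: Iterable[dict]) -> Iterator[dict]:
    """Inference: a stand-in scoring function in place of a trained model."""
    for e in events:
        yield {**e, "score": min(1.0, e["amount"] / 5000), "stage": "inference"}


def decision(events: Iterable[dict]) -> Iterator[dict]:
    """Decision: deterministic fusion of the model score with a hard rule."""
    for e in events:
        blocked = e["score"] >= 0.8 or (e["is_large"] and e["merchant"] == "unknown")
        yield {**e, "decision": "BLOCK" if blocked else "APPROVE", "stage": "decision"}


def run_pipeline(raw: Iterable[dict]) -> list:
    """Chain the five stages lazily; each event flows through one at a time."""
    return list(decision(inference(gold(silver(bronze(raw))))))


if __name__ == "__main__":
    sample = [
        {"tx_id": 1, "amount": 50.0, "merchant": "grocer"},
        {"tx_id": 2, "amount": 4500.0, "merchant": "grocer"},
        {"tx_id": 3, "merchant": "grocer"},  # invalid: missing amount
    ]
    for row in run_pipeline(sample):
        print(row["tx_id"], row["decision"])
```

Because each stage is a generator, an invalid payload is filtered out at the validation stage without interrupting the events behind it, which is the behavior a schema-anomaly fault would exercise.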
Interactive Chaos Engineering
The Next.js control panel supports interactive fault injection. Developers can simulate invalid payloads, take down external APIs to trigger circuit breakers, inject velocity bursts, or induce data drift, then observe the system's resilience patterns in real time.
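The circuit breaker pattern mentioned above can be sketched as follows. This is a generic, minimal version under stated assumptions, not the project's implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and the fallback mechanism are all hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Minimal closed / open / half-open breaker (illustrative only)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half_open"  # timeout elapsed: allow one trial call
        return "open"

    def call(self, func: Callable[[], object], fallback: Callable[[], object]) -> object:
        if self.state == "open":
            return fallback()  # fail fast without touching the dependency
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the breaker.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the breaker is open, calls return the fallback immediately instead of waiting on the failing API; once `reset_timeout` elapses, a single trial call decides whether to close the breaker again.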
MLOps & Model Tracking
Includes a complete machine learning training pipeline built on MLflow and optimized for Area Under the Precision-Recall Curve (AUPRC). A Random Forest ensemble targets high recall on highly imbalanced fraud datasets, and all metrics and parameters are logged to a model registry.
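To make the AUPRC metric concrete, the sketch below computes average precision, a common step-wise summary of the precision-recall curve, in plain Python. It is an illustration only and not the project's training code; the function name is hypothetical, and tied scores are not grouped the way a library implementation such as scikit-learn's `average_precision_score` would group them.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Computes sum_n (R_n - R_{n-1}) * P_n over the ranked predictions, where
    recall only increases (by 1 / total_pos) at each positive example.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest predicted fraud score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this cutoff
    return precision_sum / total_pos


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    preds = [0.1, 0.4, 0.35, 0.8]
    print(round(average_precision(labels, preds), 4))
```

Unlike accuracy, this metric ignores true negatives, so a model cannot score well on a heavily imbalanced fraud dataset simply by predicting "not fraud" for everything.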
