Will this relationship last 10 years or more? (~0.9% of couples do)
This app is powered by a machine learning model trained to predict long-lasting relationships. The final model — a Logistic Regression with 6 selected features — reaches an AUC of 0.75. Here's the full analytical journey, from raw data to production.
Initial deep-dive into the raw dataset: column types, distributions, and missing values. This step established the data quality baseline and confirmed the data was clean enough to proceed — no major imputation needed, no data leakage concerns.
Individual features from partner A and partner B in isolation showed no exploitable signal — raw personality scores alone couldn't predict anything meaningful. This led to the core insight: create relational features that measure compatibility and difference between the two partners rather than individual traits.
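As a minimal sketch of what relational features could look like, the function below turns two partners' individual scores into pair-level differences plus a shared-category flag. The trait names and the record layout are hypothetical and are not taken from the dataset. Only the same_love_language flag is named elsewhere on this page.

```python
def relational_features(a, b):
    """Build pair-level features from two partners' individual records.

    Each record is assumed (hypothetically) to look like:
    {"scores": {trait: number, ...}, "love_language": str}
    """
    shared = sorted(set(a["scores"]) & set(b["scores"]))
    feats = {}
    # Absolute difference per shared trait: how far apart the partners are.
    for trait in shared:
        feats[f"{trait}_diff"] = abs(a["scores"][trait] - b["scores"][trait])
    # Average gap across all shared traits (computed before adding the flag).
    feats["mean_abs_diff"] = sum(feats.values()) / len(shared) if shared else 0.0
    # Binary compatibility flag for a categorical trait.
    feats["same_love_language"] = int(a["love_language"] == b["love_language"])
    return feats


# Illustrative records with made-up trait names and values.
partner_a = {"scores": {"openness": 7, "agreeableness": 4}, "love_language": "Quality Time"}
partner_b = {"scores": {"openness": 3, "agreeableness": 5}, "love_language": "Acts of Service"}
pair = relational_features(partner_a, partner_b)
```

The model then sees one row per couple instead of two unrelated sets of individual scores.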
First attempt: predict the exact duration of the relationship in months (regression). Multiple algorithms were benchmarked — Linear Regression, Ridge, Random Forest, LightGBM.
The problem was reframed: predict whether a relationship will last 10 years or more (binary yes/no). A full model comparison was run: Logistic Regression, Decision Tree, Random Forest, LightGBM — all with class_weight="balanced" to handle class imbalance.
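The reframing and the class weighting can be sketched in a few lines. The durations below are made up, and the 120-month cutoff is simply 10 years expressed in months. The weight formula, n_samples / (n_classes * class_count), is the one scikit-learn documents for class_weight="balanced".

```python
from collections import Counter

THRESHOLD_MONTHS = 120  # 10 years, expressed in months


def to_binary_target(duration_months):
    """1 if the relationship lasted 10 years or more, else 0."""
    return [int(d >= THRESHOLD_MONTHS) for d in duration_months]


def balanced_class_weights(y):
    """Weights following n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Made-up durations: only one of ten couples passes the 10-year mark.
durations = [6, 14, 30, 48, 130, 9, 22, 60, 75, 18]
y = to_binary_target(durations)
weights = balanced_class_weights(y)
```

With one positive among ten samples, the positive class gets a weight of 5.0 and the negative class about 0.56, so an error on a rare long-lasting couple costs the model roughly nine times more than an error on a common short-lived one.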
SHAP (SHapley Additive exPlanations) values were computed to understand why the model makes each prediction, and which features actually drive the outcome.
Rather than relying on SHAP alone, a more rigorous Sequential Backward Elimination approach was applied: iteratively remove the least useful feature and measure the impact on AUC. This gives a holistic view rather than feature-by-feature inspection.
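The elimination loop can be sketched as follows. This is an illustrative greedy version, not the project's actual code: score_fn stands in for whatever returns the (ideally cross-validated) AUC of a feature subset, and the toy scorer and feature names are hypothetical.

```python
def backward_elimination(features, score_fn, min_features=1):
    """Greedy sequential backward elimination.

    score_fn(list_of_features) -> float, higher is better (e.g. AUC).
    Returns a history of (feature_subset, score), one entry per step.
    """
    current = list(features)
    history = [(tuple(current), score_fn(current))]
    while len(current) > min_features:
        # Score every subset obtained by dropping exactly one feature.
        candidates = [
            (score_fn([f for f in current if f != drop]), drop)
            for drop in current
        ]
        # Remove the feature whose absence leaves the best score,
        # i.e. the least useful feature at this step.
        best_score, dropped = max(candidates)
        current.remove(dropped)
        history.append((tuple(current), best_score))
    return history


# Toy scorer: each hypothetical feature adds a fixed amount to a 0.5 baseline.
contribution = {"age_gap": 0.10, "same_love_language": 0.08,
                "noise_1": 0.0, "noise_2": -0.01}


def toy_auc(subset):
    return 0.5 + sum(contribution[f] for f in subset)


history = backward_elimination(list(contribution), toy_auc)
```

Reading the history from top to bottom shows where the score starts to drop, which is one way to decide how many features to keep.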
The final production pipeline — StandardScaler + Logistic Regression — was retrained on the full dataset (not just the train split) using the 6 selected features, then serialized to disk.
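A minimal sketch of such a pipeline is shown below, trained on synthetic data with 6 columns standing in for the selected features. The file name, the use of pickle for serialization, and the max_iter setting are assumptions made for illustration. The source only states that the pipeline was serialized to disk.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 6 selected features and an imbalanced target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + rng.normal(size=200) > 1.0).astype(int)

# Scaling and classification live in one object, so callers pass raw scores.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
pipeline.fit(X, y)  # fit on all available rows, as for a final production model

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(pipeline, f)
    # What a serving process would do at startup: load once, then predict.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

proba = float(loaded.predict_proba(X[:1])[0, 1])  # P(lasts 10+ years) for one row
```

Because the scaler is stored inside the pipeline, the serving code never has to repeat or re-implement the preprocessing.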
The resulting .pkl file is what the FastAPI backend loads at startup to serve real-time predictions. The pipeline handles scaling automatically, so the API just receives raw scores.
An experimental notebook tests whether specific pair combinations of categorical features (e.g. "Quality Time × Acts of Service" as a single feature) provide richer signal than the simple binary same_love_language flag.
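One possible way to build such pair-combination features is sketched below. Treating the pair as order-independent (A × B equals B × A) is an assumption, and the helper names are hypothetical.

```python
def pair_feature(value_a, value_b):
    """Order-independent label for the two partners' categorical values."""
    first, second = sorted([value_a, value_b])
    return f"{first} × {second}"


def one_hot_pairs(pairs):
    """One-hot encode pair labels; returns (categories, rows)."""
    categories = sorted(set(pairs))
    rows = [[int(p == c) for c in categories] for p in pairs]
    return categories, rows


# Illustrative couples: (partner A's love language, partner B's love language).
couples = [
    ("Quality Time", "Acts of Service"),
    ("Acts of Service", "Quality Time"),
    ("Quality Time", "Quality Time"),
]
labels = [pair_feature(a, b) for a, b in couples]
categories, encoded = one_hot_pairs(labels)
```

The first two couples map to the same label, while a binary same_love_language flag would only separate the third couple from the other two.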
class_weight="balanced" prevented the model from simply predicting "no" for everyone.
Built by Céline Apéry · Model trained on the Cupid's Algorithm dataset (Kaggle)