An explainable ML model trained on 51 covariate-pruned features — labs, FISH, NGS, disease burden, treatment, immunoglobulin, and demographic domains — compared against the 7 features used by R2-ISS staging.
Relapse < 24 months ROC comparison, n = 103, events = 43.
Staging systems (gray) vs. AIRS-MM models (teal/red) — CatBoost and TabPFN both outperform R2-ISS, the best of the three staging systems.
AUC (area under the ROC curve) scores how well a model ranks patients by risk: 0.5 is no better than a coin flip, and 1.0 is perfect separation. Of the 103 patients, 43 relapsed within 24 months and 60 did not. AUC is the fraction of relapsing/non-relapsing patient pairs in which the model assigns the relapsing patient the higher risk.
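The pairwise reading of AUC can be made concrete with a minimal sketch. The function name `pairwise_auc` and the toy scores are illustrative assumptions, not the study's code or data; only the cohort counts (103 patients, 43 events) come from the text above.

```python
from itertools import product


def pairwise_auc(scores, relapsed):
    """AUC as the fraction of (relapsed, non-relapsed) pairs ranked correctly.

    A tie in score counts as half a correctly ranked pair.
    """
    pos = [s for s, y in zip(scores, relapsed) if y]
    neg = [s for s, y in zip(scores, relapsed) if not y]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data (hypothetical, NOT from the study):
# two relapsing patients (label 1) and two non-relapsing patients (label 0).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_relapsed = [1, 0, 1, 0]
toy_auc = pairwise_auc(toy_scores, toy_relapsed)  # 3 of 4 pairs are ranked correctly

# Number of relapsing/non-relapsing pairs in the cohort described above.
n_pairs = 43 * (103 - 43)
```

With 43 events among 103 patients there are 43 × 60 = 2,580 such pairs, so an AUC of 0.692 corresponds to roughly 69% of those pairs being ordered correctly.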
Staging systems built only on lab values and core cytogenetics (ISS, R-ISS, R2-ISS) top out at an AUC of 0.692. Adding the full 51-feature set (NGS mutation data, EMD status, treatment, and immunoglobulin profile) lifts AUC to 0.745–0.767. That gain of roughly 0.05–0.08 over the best staging system (0.745 − 0.692 = 0.053; 0.767 − 0.692 = 0.075) is the quantitative signature of the same story told elsewhere on this page: TP53 mutations and EMD carry prognostic information that staging alone doesn't capture.
del(17p), high-risk MM, R2-ISS stage IV, elevated LDH, and high-risk FISH (≥2 abnormalities) were all significantly more common in TP53-mutated patients — suggesting biological risk not fully captured by current staging.
Neither EMD nor pathogenic mutation burden is incorporated into ISS, R-ISS, or R2-ISS, yet both are associated with significantly inferior outcomes, pointing to prognostic value beyond standard risk stratification.
| Variable | TP53 wild-type (n=194) | TP53-mutated (n=16) | p-value |
|---|---|---|---|
| del(17p) | 27 (14%) | 11 (69%) | <0.0001 |
| High-risk MM | 55 (28%) | 12 (75%) | 0.0003 |
| R2-ISS stage IV | 38 (21%) | 11 (69%) | 0.0003 |
| LDH, median (IQR) | 192 (156–242) | 247 (210–359) | 0.005 |
| High-risk FISH ≥2 abnormalities | 112 (59%) | 14 (88%) | 0.031 |
| del(1p32) | 29 (15%) | 6 (38%) | 0.033 |
| Elevated LDH | 74 (39%) | 11 (69%) | 0.031 |
| Calcium >11 mg/dL | 28 (14%) | 6 (38%) | 0.028 |
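As a rough illustration of how a p-value for one of the categorical rows could be checked, the sketch below runs a two-sided Fisher's exact test on the del(17p) row. The source does not state which test was used, so Fisher's exact test is an assumption, as is the helper name `fisher_exact_two_sided`. The sketch also assumes every patient has a value for the variable (denominators of 194 and 16); some percentages in the table (for example, 38 of 194 is 19.6%, not 21%) suggest that missing data may change the denominators for other rows.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that row 1 contains x of the col1 "positive" patients.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


# del(17p) row, ASSUMING complete data: 27 of 194 wild-type, 11 of 16 mutated.
p_del17p = fisher_exact_two_sided(27, 194 - 27, 11, 16 - 11)
```

Under these assumptions the result falls below 0.0001, in line with the value reported for that row. The LDH median row would need a different approach (for example, a rank-based test), which this sketch does not cover.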