AIRS-MM · Newly Diagnosed Multiple Myeloma

What AIRS-MM sees beyond staging

An explainable ML model trained on 51 covariate-pruned features — labs, FISH, NGS, disease burden, treatment, immunoglobulin, and demographic domains — compared against the 7 features used by R2-ISS staging.

Seif SM · Daya GN · Zhu W · Feng L · Orlowski RZ · Thomas SK
MD Anderson Cancer Center — Institute for Data Science in Oncology

Model Inputs

Two very different views of the same patient

R2-ISS Staging

7 staging-derived features
  • 01 β2-microglobulin
  • 02 Albumin
  • 03 LDH
  • 04 del(17p)
  • 05 t(4;14)
  • 06 t(14;16)
  • 07 +1q gain/amplification (newly added high-risk marker)

AIRS-MM

51-feature covariate-pruned non-staging set
  • + Full labs panel & disease burden
  • + Complete FISH cytogenetics
  • + NGS: mutation identity & burden (incl. TP53)
  • + Extramedullary disease (EMD)
  • + Treatment & immunoglobulin profile
  • + Demographic covariates

Discrimination

AUC by horizon — AIRS-MM vs R2-ISS

Relapse < 24 months ROC comparison, n = 103, events = 43.

Model              AUC
ISS                0.637
R-ISS              0.641
R2-ISS             0.692
TabPFN (Ours)      0.745
CatBoost (Ours)    0.767

Staging systems (gray) vs. AIRS-MM models (teal/red) — CatBoost and TabPFN both outperform R2-ISS, the best of the three staging systems.

What AUC means here

AUC (Area Under the ROC Curve) scores how well a model ranks patients by risk, from 0.5 (no better than a coin flip) to 1.0 (perfect separation). Out of 103 patients, 43 relapsed within 24 months — AUC measures how often the model correctly ranks a relapsing patient as higher-risk than a non-relapsing patient.
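
The ranking interpretation above can be made concrete with a minimal sketch. It computes AUC directly as the fraction of (relapser, non-relapser) pairs in which the relapser received the higher score, counting ties as half. The function name `pairwise_auc` and the toy scores are illustrative assumptions, not the AIRS-MM code or data. With 43 relapsers and 60 non-relapsers, the reported AUCs summarize 43 × 60 = 2,580 such pairs.

```python
from itertools import product


def pairwise_auc(scores, relapsed):
    """AUC as the share of (relapser, non-relapser) pairs ranked correctly.

    A pair counts as 1 if the relapser has the higher score, 0.5 on a tie,
    and 0 otherwise. This is equivalent to the area under the ROC curve.
    """
    pos = [s for s, y in zip(scores, relapsed) if y]
    neg = [s for s, y in zip(scores, relapsed) if not y]
    if not pos or not neg:
        raise ValueError("need at least one relapser and one non-relapser")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up risk scores (NOT study data).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
toy_relapsed = [1, 1, 1, 0, 0, 0]
toy_auc = pairwise_auc(toy_scores, toy_relapsed)  # 8 of 9 pairs ranked correctly

# Number of comparable pairs in the cohort described above.
n_patients, n_events = 103, 43
n_pairs = n_events * (n_patients - n_events)
```

In the toy data, only one pair is misranked (the relapser scored 0.35 against the non-relapser scored 0.6), so the AUC is 8/9.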

Staging systems built only on lab values and core cytogenetics (ISS, R-ISS, R2-ISS) top out at 0.692. Adding the full 51-feature set, including NGS mutation data, EMD status, and treatment and immunoglobulin profile, lifts AUC to 0.745–0.767. That gain of roughly 0.05–0.08 over R2-ISS (0.10–0.13 over ISS and R-ISS) is the quantitative signature of the same story told elsewhere on this page: TP53 mutations and EMD carry prognostic information that staging alone doesn't capture.
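
The size of the gain depends on which staging system serves as the baseline. The short sketch below tabulates the differences using only the AUC point estimates reported above; the variable names are illustrative. These are differences between point estimates from a single cohort of 103 patients, so they say nothing about statistical uncertainty.

```python
# AUC point estimates as reported in the ROC comparison above.
auc = {
    "ISS": 0.637,
    "R-ISS": 0.641,
    "R2-ISS": 0.692,
    "TabPFN": 0.745,
    "CatBoost": 0.767,
}

staging_systems = ["ISS", "R-ISS", "R2-ISS"]
airs_mm_models = ["TabPFN", "CatBoost"]

# Absolute AUC gain of each AIRS-MM model over each staging system,
# rounded to three decimals to match the precision of the reported values.
gains = {
    (model, staging): round(auc[model] - auc[staging], 3)
    for model in airs_mm_models
    for staging in staging_systems
}

for (model, staging), gain in gains.items():
    print(f"{model} vs {staging}: +{gain:.3f}")
```

Measured against R2-ISS, the strongest staging baseline, the gain is 0.053 for TabPFN and 0.075 for CatBoost; measured against ISS it is 0.108 and 0.130.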

What Staging Misses

Two signals not captured by current staging systems

TP53

Adverse features cluster around TP53-mutated MM

del(17p), high-risk MM, R2-ISS stage IV, elevated LDH, and high-risk FISH (≥2 abnormalities) were all significantly more common in TP53-mutated patients — suggesting biological risk not fully captured by current staging.

EMD

Extramedullary disease & mutation burden

Neither EMD nor pathogenic mutation burden is incorporated into ISS, R-ISS, or R2-ISS, yet both are associated with significantly inferior outcomes, pointing to prognostic value beyond standard risk stratification.

TP53-WT vs TP53-MUT

Key baseline characteristics

Variable                          TP53-WT (n=194)   TP53-MUT (n=16)   p-value
del(17p)                          27 (14%)          11 (69%)          <0.0001
High-risk MM                      55 (28%)          12 (75%)          0.0003
R2-ISS stage IV                   38 (21%)          11 (69%)          0.0003
LDH, median (IQR)                 192 (156–242)     247 (210–359)     0.005
High-risk FISH ≥2 abnormalities   112 (59%)         14 (88%)          0.031
del(1p32)                         29 (15%)          6 (38%)           0.033
Elevated LDH                      74 (39%)          11 (69%)          0.031
Calcium >11 mg/dL                 28 (14%)          6 (38%)           0.028
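
For readers who want to sanity-check a row of this comparison, here is a minimal sketch of a two-sided Fisher's exact test applied to the del(17p) counts (11 of 16 TP53-mutated vs 27 of 194 TP53 wild-type). The page does not state which test produced the reported p-values, so the choice of Fisher's exact test is an assumption, and the helper `fisher_exact_two_sided` is a hypothetical stdlib-only implementation rather than the authors' analysis code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# del(17p): rows are TP53-MUT and TP53-WT; columns are present / absent.
mut_with, mut_total = 11, 16
wt_with, wt_total = 27, 194
p_del17p = fisher_exact_two_sided(
    mut_with, mut_total - mut_with, wt_with, wt_total - wt_with
)
print(f"del(17p), two-sided Fisher exact p = {p_del17p:.2e}")
```

Under this assumption the result falls below 0.0001, in line with the reported value. Rows whose percentages imply missing data (for example R2-ISS stage IV) cannot be rechecked this way without the actual denominators.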