Classification

Stage 4 of the pipeline: Training and evaluating classifiers.

Default Classifier: CatBoost

Fixed Classifier

CatBoost is the default and recommended classifier. It is held fixed because the research question concerns preprocessing effects, not classifier comparison.

Why CatBoost?

Property                 Value
Mean AUROC               0.878
Best AUROC               0.913
Handles categorical      Yes
GPU support              Yes
Overfitting protection   Built-in

Bootstrap Evaluation

All results use bootstrap validation:

CLS_EVALUATION:
  BOOTSTRAP:
    n_iterations: 1000
    alpha_CI: 0.95

This provides:

  • Robust confidence intervals
  • Per-iteration metrics for statistical tests
  • Subject-wise stability analysis
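As a rough illustration of how bootstrap validation yields both a confidence interval and per-iteration metrics, here is a minimal NumPy sketch of a percentile bootstrap for AUROC. It is not the pipeline's implementation: the function names `auroc` and `bootstrap_auroc` are hypothetical, the resampling is plain (not stratified or subject-aware), and it assumes `alpha_CI: 0.95` denotes the confidence level.

```python
import numpy as np


def auroc(y_true, y_score):
    """AUROC from the Mann-Whitney U statistic; tied scores get average ranks."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = int((y_true == 1).sum())
    n_neg = y_true.size - n_pos
    # Average 1-based rank of each tie group, mapped back to every sample
    _, inverse, counts = np.unique(y_score, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    u_stat = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def bootstrap_auroc(y_true, y_score, n_iterations=1000, ci_level=0.95, seed=0):
    """Percentile bootstrap: returns per-iteration AUROCs and the (lower, upper) CI."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n = y_true.size
    scores = []
    while len(scores) < n_iterations:
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # single-class resample: AUROC is undefined, draw again
        scores.append(auroc(y_true[idx], y_score[idx]))
    scores = np.array(scores)
    tail = (1.0 - ci_level) / 2.0
    lower, upper = np.quantile(scores, [tail, 1.0 - tail])
    return scores, (float(lower), float(upper))


# Synthetic example data (not from the study)
rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=300)
preds = labels * 0.8 + rng.normal(size=300)
per_iter, (ci_low, ci_high) = bootstrap_auroc(labels, preds, n_iterations=1000)
print(f"AUROC {auroc(labels, preds):.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

The returned `per_iter` array is the kind of per-iteration output that paired statistical tests between preprocessing configurations could consume.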

STRATOS Metrics

Classification is evaluated with all STRATOS-compliant metrics:

Discrimination

  • AUROC: Area Under ROC Curve (95% CI)

Calibration

  • Calibration slope: Should be ~1.0
  • Calibration intercept: Should be ~0.0
  • O:E ratio: Observed/Expected event ratio; should be ~1.0
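The calibration metrics above can be sketched in plain NumPy. This is an assumption-laden illustration rather than the pipeline's code: `calibration_summary` and `_fit_logistic` are hypothetical names, the slope is taken from a logistic regression of the outcome on the logit of the predicted probability, and the intercept follows one common convention (calibration-in-the-large, with the slope fixed at 1 through an offset).

```python
import numpy as np


def _logit(p, eps=1e-12):
    """Log-odds of probabilities, clipped away from 0 and 1."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))


def _fit_logistic(X, y, offset=None, n_iter=100, tol=1e-10):
    """Newton-Raphson fit of a logistic regression with an optional fixed offset."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    offset = np.zeros(y.size) if offset is None else np.asarray(offset, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta + offset)))
        grad = X.T @ (y - mu)
        hess = (X * (mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def calibration_summary(y_true, y_prob):
    """Calibration slope, intercept (slope fixed at 1), and O:E ratio."""
    y = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    lp = _logit(y_prob)
    ones = np.ones_like(lp)
    # Slope: coefficient of the linear predictor in y ~ a + b * logit(p)
    _, slope = _fit_logistic(np.column_stack([ones, lp]), y)
    # Intercept: y ~ a + offset(logit(p)), i.e. calibration-in-the-large
    (intercept,) = _fit_logistic(ones[:, None], y, offset=lp)
    # O:E ratio: observed event rate over mean predicted probability
    oe_ratio = y.mean() / y_prob.mean()
    return {"slope": float(slope), "intercept": float(intercept), "oe_ratio": float(oe_ratio)}


# Synthetic, well-calibrated predictions (not study data)
rng = np.random.default_rng(0)
probs = rng.uniform(0.05, 0.95, size=5000)
outcomes = rng.binomial(1, probs)
print(calibration_summary(outcomes, probs))
```

A slope below 1 suggests predictions that are too extreme, while a slope above 1 suggests predictions that are too conservative.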

Overall Performance

  • Brier score: Proper scoring rule
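  • Scaled Brier (IPA): Index of Prediction Accuracy; the Brier score rescaled against a prevalence-only baseline (1 is perfect, 0 matches the baseline)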
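For reference, both overall-performance metrics reduce to a few lines. The sketch below uses hypothetical helper names (`brier_score`, `scaled_brier`) and assumes the reference model for scaling predicts the observed prevalence for every subject; the pipeline's own implementation may differ.

```python
import numpy as np


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def scaled_brier(y_true, y_prob):
    """1 - Brier / Brier_ref, where the reference predicts the prevalence."""
    y_true = np.asarray(y_true, dtype=float)
    reference = np.full(y_true.shape, y_true.mean())
    return 1.0 - brier_score(y_true, y_prob) / brier_score(y_true, reference)


# Toy example (not study data)
y = [0, 1, 1, 0]
p = [0.1, 0.9, 0.8, 0.3]
print(brier_score(y, p), scaled_brier(y, p))
```

Unlike the raw Brier score, the scaled version does not depend on the prevalence of the outcome for its interpretation, which makes it easier to compare across datasets.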

Clinical Utility

  • Net Benefit: At clinical threshold
  • DCA curves: Decision Curve Analysis
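Net benefit and the decision curve follow from the standard definition NB(t) = TP/n - (FP/n) * t/(1 - t) at threshold probability t. The sketch below is illustrative only: `net_benefit` and `decision_curve` are hypothetical names, and the thresholds shown are arbitrary placeholders rather than the clinical threshold used in this project.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as positive when y_prob >= threshold."""
    y = np.asarray(y_true).astype(bool)
    pred = np.asarray(y_prob, dtype=float) >= threshold
    n = y.size
    tp = np.sum(pred & y)
    fp = np.sum(pred & ~y)
    # False positives are weighted by the odds of the threshold
    return float(tp / n - fp / n * threshold / (1.0 - threshold))


def decision_curve(y_true, y_prob, thresholds):
    """Net benefit of the model, treat-all, and treat-none across thresholds."""
    thresholds = np.asarray(thresholds, dtype=float)  # each must be < 1
    prevalence = float(np.mean(np.asarray(y_true).astype(bool)))
    model = np.array([net_benefit(y_true, y_prob, t) for t in thresholds])
    treat_all = prevalence - (1.0 - prevalence) * thresholds / (1.0 - thresholds)
    treat_none = np.zeros_like(thresholds)
    return {"thresholds": thresholds, "model": model,
            "treat_all": treat_all, "treat_none": treat_none}


# Toy example (not study data)
y = [1, 1, 0, 0]
p = [0.9, 0.6, 0.4, 0.1]
curve = decision_curve(y, p, thresholds=[0.1, 0.3, 0.5])
for t, nb_model, nb_all in zip(curve["thresholds"], curve["model"], curve["treat_all"]):
    print(f"t={t:.1f}  model={nb_model:.3f}  treat-all={nb_all:.3f}")
```

A model is clinically useful at a given threshold only if its net benefit exceeds both the treat-all and treat-none strategies there.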

Running Classification

# Default (CatBoost with best preprocessing)
python -m src.classification.flow_classification

# With specific preprocessing
python -m src.classification.flow_classification \
    outlier_method=MOMENT-gt-finetune \
    imputation_method=SAITS

API Reference

flow_classification

flow_classification(cfg: DictConfig) -> None

Main classification flow for glaucoma screening from PLR features.

Orchestrates the classification pipeline including feature-based and time-series classification approaches. Initializes MLflow experiment and delegates to subflows.

Parameters

  • cfg (DictConfig): Hydra configuration with PREFECT flow names and settings.

Notes

Time-series classification is currently disabled: the refactoring introduced a bug in that path, and the approach had not seemed promising enough to justify fixing it.

Source code in src/classification/flow_classification.py
def flow_classification(cfg: DictConfig) -> None:
    """
    Main classification flow for glaucoma screening from PLR features.

    Orchestrates the classification pipeline including feature-based
    and time-series classification approaches. Initializes MLflow
    experiment and delegates to subflows.

    Parameters
    ----------
    cfg : DictConfig
        Hydra configuration with PREFECT flow names and settings.

    Notes
    -----
    Time-series classification is currently disabled as it showed
    limited promise after refactoring.
    """
    experiment_name = experiment_name_wrapper(
        experiment_name=cfg["PREFECT"]["FLOW_NAMES"]["CLASSIFICATION"], cfg=cfg
    )
    logger.info("FLOW | Name: {}".format(experiment_name))
    logger.info("=====================")
    prev_experiment_name = experiment_name_wrapper(
        experiment_name=cfg["PREFECT"]["FLOW_NAMES"]["FEATURIZATION"], cfg=cfg
    )

    # Init the MLflow experiment
    init_mlflow_experiment(experiment_name=experiment_name)

    # Classify from hand-crafted features/embeddings
    flow_feature_classification(cfg, prev_experiment_name)

    # Classify from time series
    ts_cls = False
    if ts_cls:
        raise NotImplementedError(
            "Need to be finished, new bug with the refactoring, but did not seem promising"
        )