
TaskAug ECG replication: PTBXLDataset, ECGBinaryClassification, TaskAugResNet #1123

Open
RogeCS wants to merge 3 commits into sunlabuiuc:master from paulgarciaro:taskaug-ecg


RogeCS commented Apr 23, 2026

Contributors

| Name | NetID / Email |
| --- | --- |
| Paul Garcia | alanpg2@illinois.edu |
| Rogelio Medina | orm9@illinois.edu |
| Cesar Nava | can14@illinois.edu |

Type of Contribution

Full pipeline reproducibility — Dataset + Task + Model

Paper

Raghu, A., Raghu, M., Kornblith, S., Duvenaud, D., & Ghahramani, Z. (2022).
Data Augmentation for Electrocardiograms.
Conference on Health, Inference, and Learning (CHIL), PMLR 174.
https://proceedings.mlr.press/v174/raghu22a.html

Description

This PR implements the TaskAug framework from Raghu et al. (2022), which
learns a task-adaptive, differentiable augmentation policy jointly with an ECG
classifier via bi-level optimisation. Three new PyHealth components are
contributed:

1. PTBXLDataset — Dataset wrapper

Loads the PTB-XL 12-lead ECG corpus (21,799 records, 18,869 patients).
Parses ptbxl_database.csv and scp_statements.csv to assign four binary
diagnostic superclass labels: MI (myocardial infarction), HYP
(hypertrophy), STTC (ST/T-wave change), and CD (conduction
disturbance).
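The label assignment described above can be sketched roughly as follows. This is an illustration, not the PR's code: it assumes that the `scp_codes` column of ptbxl_database.csv holds one dict literal per record and that scp_statements.csv maps each diagnostic code to a `diagnostic_class`, as in the public PTB-XL release. The helper name `superclass_labels` and the small in-memory mapping are hypothetical stand-ins.

```python
import ast

SUPERCLASSES = ("MI", "HYP", "STTC", "CD")

# Hypothetical stand-in for scp_statements.csv: SCP code -> superclass.
# Only a handful of codes are listed; the real file has many more rows.
CODE_TO_SUPERCLASS = {
    "IMI": "MI",
    "LVH": "HYP",
    "NDT": "STTC",
    "CLBBB": "CD",
    "NORM": "NORM",
}


def superclass_labels(scp_codes_field, code_map=CODE_TO_SUPERCLASS):
    """Return {superclass: 0 or 1} for one row of ptbxl_database.csv.

    Assumption: any listed code counts, regardless of its likelihood value.
    """
    codes = ast.literal_eval(scp_codes_field)  # e.g. "{'IMI': 100.0}"
    present = {code_map[code] for code in codes if code in code_map}
    return {name: int(name in present) for name in SUPERCLASSES}


labels = superclass_labels("{'IMI': 100.0, 'CLBBB': 50.0, 'SR': 0.0}")
print(labels)  # {'MI': 1, 'HYP': 0, 'STTC': 0, 'CD': 1}
```

Because each superclass gets its own 0/1 flag, a record can be positive for several superclasses at once, which fits the four independent binary tasks.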

2. ECGBinaryClassification — Task class

Lazy-loads WFDB waveform files, applies per-lead z-score normalisation, and
pads/truncates signals to a fixed length (default 2500 samples @ 500 Hz).
Supports all four superclasses via a task_label argument.
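A minimal NumPy sketch of the preprocessing described above, under stated assumptions: the signal is laid out as (leads, samples), normalisation runs before padding, and padding is zeros on the right. The function name `preprocess_ecg` and the `eps` guard are hypothetical; the PR's task class may differ in all of these details.

```python
import numpy as np


def preprocess_ecg(signal, target_len=2500, eps=1e-8):
    """Per-lead z-score, then pad or truncate along the time axis.

    signal: array-like of shape (n_leads, n_samples).
    """
    signal = np.asarray(signal, dtype=np.float32)
    mean = signal.mean(axis=1, keepdims=True)
    std = signal.std(axis=1, keepdims=True)
    normed = (signal - mean) / (std + eps)  # eps guards against flat leads

    n_samples = normed.shape[1]
    if n_samples >= target_len:
        return normed[:, :target_len]  # drop the tail
    pad = target_len - n_samples
    return np.pad(normed, ((0, 0), (0, pad)))  # zero-pad on the right


rng = np.random.default_rng(0)
short = preprocess_ecg(rng.normal(size=(12, 1000)))
long = preprocess_ecg(rng.normal(size=(12, 5000)))
print(short.shape, long.shape)  # (12, 2500) (12, 2500)
```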

3. TaskAugResNet — Model

BaseModel subclass combining:

  • TaskAugPolicy — K-stage differentiable augmentation policy using
    Gumbel-Softmax over 8 ECG-specific operations (Gaussian noise, magnitude
    scale, time mask, baseline wander, temporal warp, temporal displacement,
    no-op, and a novel LeadDropout extension) with class-specific learnable
    magnitudes (mag_neg, mag_pos).
  • _ResNet1D — 1-D ResNet-18 backbone adapted for multi-lead ECG
    (kernel size 7, 12-channel input).
  • policy_parameters() / backbone_parameters() helpers enabling the
    inner/outer loop split required for bi-level optimisation.
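To make the policy mechanics concrete, here is a small NumPy sketch of one way a K-stage Gumbel-Softmax policy with class-specific magnitudes could work. It is not the PR's `TaskAugPolicy`: it uses three of the eight operations, mixes operation outputs by their relaxed weights (an assumption), and carries no gradients. All function names and magnitude values are hypothetical. In PyTorch, `torch.nn.functional.gumbel_softmax` provides the differentiable version of the sampling step.

```python
import numpy as np


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    if rng is None:
        rng = np.random.default_rng()
    u = np.clip(rng.uniform(size=logits.shape), 1e-10, 1.0 - 1e-10)
    gumbel = -np.log(-np.log(u))  # Gumbel(0, 1) noise
    z = (logits + gumbel) / tau
    z = z - z.max()  # numerical stability
    weights = np.exp(z)
    return weights / weights.sum()


def noop(x, mag, rng):
    return x


def gaussian_noise(x, mag, rng):
    return x + mag * rng.normal(size=x.shape)


def magnitude_scale(x, mag, rng):
    return (1.0 + mag) * x


OPS = (noop, gaussian_noise, magnitude_scale)  # illustrative subset


def apply_policy(x, label, logits, mag_neg, mag_pos, tau=1.0, rng=None):
    """Apply K stages; logits and magnitudes have shape (K, n_ops)."""
    if rng is None:
        rng = np.random.default_rng()
    mags = mag_pos if label == 1 else mag_neg  # class-specific strengths
    for stage in range(logits.shape[0]):
        weights = gumbel_softmax(logits[stage], tau, rng)
        outputs = [op(x, m, rng) for op, m in zip(OPS, mags[stage])]
        x = sum(w * out for w, out in zip(weights, outputs))
    return x


rng = np.random.default_rng(0)
x = rng.normal(size=(12, 2500))
logits = np.zeros((2, len(OPS)))           # K = 2 stages, uniform prior
mag_neg = np.full((2, len(OPS)), 0.05)
mag_pos = np.full((2, len(OPS)), 0.20)
x_aug = apply_policy(x, 1, logits, mag_neg, mag_pos, rng=rng)
print(x_aug.shape)  # (12, 2500)
```

Lowering `tau` pushes the weights towards a one-hot choice, so each stage behaves more like picking a single operation.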

Extensions beyond the paper

  • LeadDropout (8th augmentation op): randomly zeros entire ECG leads with
    probability proportional to the learned magnitude, simulating electrode
    failure in clinical settings.
  • shared_magnitudes flag: ablation option that forces identical
    augmentation strength across classes, directly testing the paper's
    class-asymmetric magnitude hypothesis.
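The LeadDropout idea can be illustrated with a short NumPy sketch. It assumes the drop probability is simply the magnitude clipped to [0, 1] and that leads are dropped independently; the PR's `_lead_dropout` may use a different mapping. A hard mask like this one passes no gradient to the magnitude, so how the PR keeps the operation differentiable is not captured here.

```python
import numpy as np


def lead_dropout(x, magnitude, rng=None):
    """Zero entire leads of x, shape (n_leads, n_samples).

    Assumption: each lead is dropped independently with
    probability p = clip(magnitude, 0, 1).
    """
    if rng is None:
        rng = np.random.default_rng()
    p = float(np.clip(magnitude, 0.0, 1.0))
    keep = rng.random(x.shape[0]) >= p  # True means the lead survives
    return x * keep[:, None]


rng = np.random.default_rng(0)
ecg = rng.normal(size=(12, 2500))
dropped = lead_dropout(ecg, magnitude=0.5, rng=rng)
n_zeroed = int((np.abs(dropped).sum(axis=1) == 0).sum())
print(f"{n_zeroed} of 12 leads zeroed")
```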

Example script

examples/ptbxl_ecg_classification_taskaug_resnet.py provides:

  • Standard joint training baseline
  • BiLevelTrainer (first-order DARTS approximation of the outer loop)
  • 6-configuration ablation study (A–F) comparing no-aug, fixed noise,
    TaskAug K=1/K=2, frozen policy, and shared magnitudes
  • Outer-loop learning-rate sweep
  • --synthetic flag for testing on synthetic data, with no PTB-XL download required
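The alternating structure of a first-order bi-level update can be shown on a toy problem. This sketch is not the PR's `BiLevelTrainer`: the "backbone" is a single weight `w`, the "policy" is a single input-scale parameter `s`, gradients are written by hand, and the policy is assumed to be applied in the validation forward pass so that the validation loss depends on `s` directly. The outer step treats `w` as a constant, which is the first-order approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train, x_val = rng.normal(size=64), rng.normal(size=64)
y_train, y_val = 2.0 * x_train, 2.0 * x_val  # toy target: y = 2x

w = 0.5  # toy "backbone" weight (inner variable)
s = 1.0  # toy "policy" scale (outer variable)
lr_inner, lr_outer = 0.05, 0.05


def loss(w, s, x, y):
    """Mean squared error of the prediction w * (s * x)."""
    return float(np.mean((w * s * x - y) ** 2))


initial_val_loss = loss(w, s, x_val, y_val)
for step in range(500):
    # Inner step: update the backbone on training data, policy frozen.
    resid = w * s * x_train - y_train
    w -= lr_inner * np.mean(2.0 * resid * s * x_train)

    # Outer step: update the policy on validation data, treating w as a
    # constant (no gradient through the inner update).
    resid = w * s * x_val - y_val
    s -= lr_outer * np.mean(2.0 * resid * w * x_val)

final_val_loss = loss(w, s, x_val, y_val)
print(initial_val_loss, final_val_loss)
```

In a real model the two steps would use separate optimisers over the parameter groups returned by `policy_parameters()` and `backbone_parameters()`.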

File Guide

| File | What to review |
| --- | --- |
| pyhealth/datasets/ptbxl.py | PTBXLDataset: metadata parsing, label generation |
| pyhealth/tasks/ecg_classification.py | ECGBinaryClassification: waveform loading, normalisation, task schema |
| pyhealth/models/taskaug_resnet.py | TaskAugPolicy, _ResNet1D, TaskAugResNet, _lead_dropout |
| examples/ptbxl_ecg_classification_taskaug_resnet.py | End-to-end training, BiLevelTrainer, ablation study, CLI |
| TaskAug ECG DLH.ipynb | Colab demo notebook: runs top-to-bottom on a free T4 GPU |
| docs/api/datasets/pyhealth.datasets.PTBXLDataset.rst | Dataset API docs |
| docs/api/models/pyhealth.models.TaskAugResNet.rst | Model API docs |

Testing

All components can be exercised without downloading PTB-XL:

python examples/ptbxl_ecg_classification_taskaug_resnet.py \
    --synthetic --mode ablation --epochs 5

m-cnava and others added 3 commits April 21, 2026 22:02
…to 500→250 Hz (Raghu et al. 2022) (#4)

* fix bugs

* fix bugs

* fix bugs

* more enhancements

* more enhancements

* ablation

---------

Co-authored-by: Rogelio Medina <rogelio.medina@c3.ai>