PTB-XL Dataset, Classification Task, ResNet Models, and BiLSTM Model Implementation #1115

Open
jtwells2 wants to merge 41 commits into sunlabuiuc:master from jtwells2:master

@jtwells2

Authors: Anurag Dixit - anuragd2@illinois.edu, Kent Spillner - kspillne@illinois.edu, John Wells - jtwells2@illinois.edu
Original Paper: https://proceedings.mlr.press/v149/nonaka21a.html
Dataset: https://www.kaggle.com/datasets/physionet/ptbxl-electrocardiography-database

Overview

This PR adds a full-pipeline replication of "In-depth Benchmarking of Deep Neural Network Architectures for ECG Diagnosis". It contributes one dataset, four models, one multilabel classification task, and a Jupyter notebook that shows example usage and the team's ablation study.

Dataset

  1. datasets/ptbxl.py - PyHealth dataset for PTB-XL data. Processes *.hea and *.mat files, assigns train/test/val splits, and loads the required data into a dd.DataFrame.
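
The header-processing and split-assignment steps described above can be sketched with the standard library alone. This is a minimal illustration, not the code in datasets/ptbxl.py: it assumes the *.hea files follow the WFDB-style layout (a record line, then `#Key: value` comment lines including a `#Dx:` line), and the function names `parse_header` and `assign_split`, the 80/10/10 split, and the sample header contents are all hypothetical.

```python
from typing import Dict, List

# Synthetic header for illustration only; the values are made up.
SAMPLE_HEADER = """\
SAMPLE0001 12 500 5000
SAMPLE0001.mat 16+24 1000/mV 16 0 0 0 0 I
#Age: 60
#Sex: Female
#Dx: 111,222
"""


def parse_header(text: str) -> Dict[str, object]:
    """Parse the record line and '#Key: value' comments of a WFDB-style header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Record line: <record name> <n signals> <sampling frequency> <n samples> ...
    name, n_sig, fs, n_samp = lines[0].split()[:4]
    meta: Dict[str, object] = {
        "record": name,
        "n_leads": int(n_sig),
        "fs": float(fs),
        "n_samples": int(n_samp),
        "dx_codes": [],
    }
    for ln in lines[1:]:
        # Signal-specification lines do not start with '#', so they are skipped.
        if ln.startswith("#") and ":" in ln:
            key, value = ln[1:].split(":", 1)
            key = key.strip().lower()
            if key == "dx":
                codes: List[str] = [c.strip() for c in value.split(",") if c.strip()]
                meta["dx_codes"] = codes
            else:
                meta[key] = value.strip()
    return meta


def assign_split(index: int, n_records: int,
                 train_pct: int = 80, val_pct: int = 10) -> str:
    """Hypothetical positional split; integer math avoids float rounding."""
    if index * 100 < n_records * train_pct:
        return "train"
    if index * 100 < n_records * (train_pct + val_pct):
        return "val"
    return "test"


meta = parse_header(SAMPLE_HEADER)
print(meta["record"], meta["fs"], meta["dx_codes"])
print([assign_split(i, 10) for i in range(10)])
```

A positional split is used here only because it is deterministic and easy to test; the PR's actual split strategy is not stated in this description.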

Models

  1. resnet.py - PyHealth model implementing ResNet-18
  2. lambda_resnet.py - PyHealth model implementing Lambda ResNet-18
  3. se_resnet.py - PyHealth model implementing SE-ResNet-50
  4. resnet_ecg_base.py - shared building blocks used by the three ResNet models (not a standalone model)
  5. bilstm_ecg.py - PyHealth model implementing a Bidirectional LSTM
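
For readers unfamiliar with the "SE" in SE-ResNet-50, the squeeze-and-excitation step can be sketched in plain Python. This is a dependency-free illustration of the general technique, not the code in se_resnet.py: the function names and weight shapes are hypothetical, biases are omitted, and a real model would presumably implement this with tensor layers rather than nested lists.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    """Multiply a (rows x cols) weight matrix by a length-cols vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def squeeze_excite(x: Matrix, w_reduce: Matrix, w_expand: Matrix) -> Matrix:
    """Apply squeeze-and-excitation to a (channels x time) feature map.

    w_reduce has shape (hidden x channels); w_expand has shape (channels x hidden).
    """
    # Squeeze: global average pooling over the time axis, one value per channel.
    z = [sum(channel) / len(channel) for channel in x]
    # Excitation: bottleneck projection, ReLU, expansion, then a sigmoid gate.
    h = [max(0.0, v) for v in matvec(w_reduce, z)]
    gates = [1.0 / (1.0 + math.exp(-v)) for v in matvec(w_expand, h)]
    # Scale: multiply every time step of a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, x)]


# Two channels, two time steps, and a hidden size of one.
features = [[1.0, 2.0], [3.0, 4.0]]
zero_reduce = [[0.0, 0.0]]
zero_expand = [[0.0], [0.0]]
# With all-zero weights every gate is sigmoid(0) = 0.5, so each value is halved.
halved = squeeze_excite(features, zero_reduce, zero_expand)
print(halved)
```

The gate vector has one entry per channel, so the block changes how strongly each channel contributes without changing the feature map's shape.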

Tasks

  1. ptbxl_multilabel_classification.py - PyHealth task for PTB-XL multi-label ECG classification. It supports either "superdiagnostic" labels (5 classes) or "diagnostic" labels (27 classes), and either the 100 Hz or the 500 Hz sampling rate of the PTB-XL data.
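
Because a record can carry several diagnoses at once, a multi-label target is typically a multi-hot vector rather than a single class index. The sketch below illustrates that encoding for the 5-class case. It is an assumption-laden example, not the task's implementation: the class names are PTB-XL's five diagnostic superclasses, but their order, the `multi_hot` helper, and the code-to-class mapping (with placeholder codes) are hypothetical.

```python
from typing import Dict, List, Sequence

# PTB-XL's five diagnostic superclasses; the ordering here is an assumption.
SUPERCLASSES = ["NORM", "MI", "STTC", "CD", "HYP"]

# Hypothetical mapping from placeholder diagnosis codes to superclasses.
CODE_TO_SUPERCLASS: Dict[str, str] = {
    "111": "NORM",
    "222": "MI",
    "333": "STTC",
}


def multi_hot(dx_codes: Sequence[str],
              code_map: Dict[str, str],
              classes: Sequence[str]) -> List[int]:
    """Return a 0/1 vector with a 1 for every class present in dx_codes."""
    index = {name: i for i, name in enumerate(classes)}
    vector = [0] * len(classes)
    for code in dx_codes:
        label = code_map.get(code)
        # Codes without a mapping are ignored rather than treated as errors.
        if label is not None:
            vector[index[label]] = 1
    return vector


# A record with two mapped diagnoses and one unmapped code.
target = multi_hot(["222", "333", "999"], CODE_TO_SUPERCLASS, SUPERCLASSES)
print(target)
```

Switching to the 27-class "diagnostic" setting would only change the class list and the mapping; the encoding step stays the same.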

Example Usage / Ablation Study

  1. ptbxl_superdiagnostic_se_resnet.ipynb - Jupyter Notebook example that runs the multilabel classification task with all four models on a subset of 7500 patients, followed by an evaluation matching the one performed in the original paper.

Unit Tests

  1. test_ptbxl.py - unit tests for the PTB-XL dataset, using synthetic data
  2. test_resnet_ecg.py - unit tests for all three ResNet models (ResNet-18, Lambda-ResNet-18, SE-ResNet-50), using synthetic data
  3. test_bilstm_ecg.py - unit tests for BiLSTM, using synthetic data

jtwells2 and others added 30 commits April 5, 2026 18:11
Add PTBXLDataset to __init__.py
Add the multilabel classification task definition
All three models based on the implementations used in "In-depth
Benchmarking of Deep Neural Network Architectures for ECG Diagnosis."
Common code extracted to ecg_resnet_base.py.

Assisted-by: Claude:Claude-4.6-Sonnet
Includes tests for ResNet-18, Lambda-ResNet, and SE-ResNet models.

Assisted-by: Claude:Claude-4.6-Sonnet
Includes docs for ResNet18ECG, LambdaResNet18ECG, and SEResNet50ECG
models.

Assisted-by: Claude:Claude-4.6-Sonnet
…QUICK_MODE, run full 2x2 ablation

- bilstm_ecg.py: fix super().__init__ kwargs, hardcode in_channels=12, fix forward() label API
- ptbxl_multilabel_classification.py: fix getattr namespace (ptbxl/mat->mat, ptbxl/dx_codes->dx_codes), add patient_id to samples
- notebook: add ResNet18ECG + LambdaResNet18ECG, QUICK_MODE flag, viz fix, Section 4 markdown correction
- ablation chart: ptbxl_model_comparison_ablation.png (4 models x 4 configs, ~1000 patients dev subset, 5 epochs)