Add CCEP ECoG dataset, Localize SOZ task, and SPES models #1027
mkoretsky1 wants to merge 7 commits into sunlabuiuc:master
Hi @mkoretsky1, Kevin, and Darian, thanks for putting this together. A few items I'd like to see addressed, plus some smaller things.
Blockers
1. The Destrieux atlas
    rda_path = Path(__file__).resolve().parent.parent / "destrieux.rda"
    _destrieux_df = pyreadr.read_r(str(rda_path))["destrieux"]
That resolves to
2. Bare
    try:
        ...
        response_df['distances'] = distances
        response_df = response_df.sort_values(by='distances', ascending=True)
    except:
        print("Error with subject", subject)
The bare
3. Metadata CSV written into the data root.
    output_path = os.path.join(root, "ccep_ecog-metadata-pyhealth.csv")
    df.to_csv(output_path, index=False)
BIDS datasets are often on mounted storage with restricted write permissions. Matt McKenna's fix for the same pattern in #981 shipped as:
    def __init__(self, root: str = ".", config_path: Optional[str] = ..., **kwargs):
        self._verify_data(root)
        self._tmp_dir = tempfile.mkdtemp(prefix="pyhealth_ccep_ecog_")
        self._index_data(root, self._tmp_dir)
        super().__init__(
            root=self._tmp_dir,
            tables=["ecog"],
            ...
        )

    def __del__(self):
        shutil.rmtree(self._tmp_dir, ignore_errors=True)
Could you mirror that pattern?
4. In
    def get_spes_dataloader(dataset, batch_size, shuffle=False, norm_stats=None):
        dataset.set_shuffle(shuffle)
        return DataLoader(...)
Simplest fix: drop the
Code quality (none blocking)
Minor
The model implementations are clean, and the per-fold normalization is done correctly. Once the blockers are addressed I'm happy to approve.
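For blocker 2, one possible shape for the fix is sketched below. It is only an illustration: the helper name `process_subjects`, the logger name, and the choice of `KeyError` and `ValueError` as the expected failure modes are assumptions, not taken from the PR. The idea is to catch only the exceptions you expect, record which subjects were skipped, and let everything else propagate.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List, Tuple, Type

logger = logging.getLogger("ccep_ecog_example")  # hypothetical logger name


def process_subjects(
    subjects: Iterable[str],
    process_one: Callable[[str], Any],
    expected: Tuple[Type[BaseException], ...] = (KeyError, ValueError),
) -> Tuple[Dict[str, Any], List[str]]:
    """Run process_one per subject, skipping only the expected failures.

    The exception types in `expected` are an assumption for illustration;
    anything else (including KeyboardInterrupt) is allowed to propagate.
    """
    results: Dict[str, Any] = {}
    failed: List[str] = []
    for subject in subjects:
        try:
            results[subject] = process_one(subject)
        except expected as exc:
            # Log with context instead of printing, and remember the subject
            # so the caller can report how many were dropped.
            logger.warning("Skipping subject %s: %s", subject, exc)
            failed.append(subject)
    return results, failed
```

Returning the list of skipped subjects makes silent data loss visible to the caller, which a bare `except:` with a `print` does not.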
@joshuasteier We've fixed the blockers and made some code quality updates. When you have a chance, could you please re-review the PR? The one thing we're unsure about is blocker 1, the Destrieux atlas .rda file. For now, we have hard-coded the brain region indexes into
Contributors
Mathew Koretsky (NetID: mathewk4) (mathewk4@illinois.edu)
Kevin Splinter (NetID: kevints5) (kevints5@illinois.edu)
Darian Tavana (NetID: dtavana2) (dtavana2@illinois.edu)
Type of Contribution
Full Pipeline: Dataset + Task + Model (Option 4)
Original Paper
Norris, J.; Chari, A.; van Blooijs, D.; Cooray, G. K.; Friston, K.; Tisdall, M. M.; and Rosch, R. E. 2024. Localising the Seizure Onset Zone from Single-Pulse Electrical Stimulation Responses with a CNN Transformer. Proceedings of the 9th Machine Learning for Healthcare Conference, volume 252 of Proceedings of Machine Learning Research. PMLR. (https://proceedings.mlr.press/v252/norris24a.html).
Description
Adds a full PyHealth pipeline for electrode-level seizure onset zone (SOZ) localization from CCEP ECoG recordings, based on Norris et al. (2024). Includes a BIDS-compatible dataset class (CCEPECoGDataset), a task class (LocalizeSOZ) that preprocesses raw stimulation EEG into paired divergent/convergent response tensors, and two models (SPESResNet, SPESTransformer) for binary SOZ classification. An end-to-end example script demonstrates 5-fold patient-level cross-validation with per-fold normalization and an ablation study isolating the contribution of trial variability and the hybrid embedding in the transformer model.
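The patient-level cross-validation with per-fold normalization described above can be sketched roughly as follows. This is a standard-library toy, not the example script from the PR: the function names (`patient_level_folds`, `fit_norm_stats`, `normalize`, `run_cv`) and the scalar features are hypothetical. It shows the two properties that matter: no patient appears in both train and test within a fold, and normalization statistics are computed from the training split only.

```python
import random
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def patient_level_folds(patient_ids: Sequence[str], n_folds: int = 5, seed: int = 0) -> List[List[str]]:
    """Split unique patient IDs into n_folds disjoint held-out groups."""
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    return [unique[i::n_folds] for i in range(n_folds)]


def fit_norm_stats(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard deviation estimated from training values only."""
    sigma = pstdev(values)
    return mean(values), (sigma if sigma > 0 else 1.0)  # guard against zero variance


def normalize(values: Sequence[float], stats: Tuple[float, float]) -> List[float]:
    mu, sigma = stats
    return [(v - mu) / sigma for v in values]


def run_cv(samples: Sequence[Tuple[str, float]], n_folds: int = 5, seed: int = 0) -> List[Dict]:
    """For each fold, normalize train and test with statistics from train."""
    folds = patient_level_folds([pid for pid, _ in samples], n_folds, seed)
    out = []
    for test_patients in folds:
        held_out = set(test_patients)
        train = [x for pid, x in samples if pid not in held_out]
        test = [x for pid, x in samples if pid in held_out]
        stats = fit_norm_stats(train)  # never sees held-out patients
        out.append({
            "test_patients": test_patients,
            "train": normalize(train, stats),
            "test": normalize(test, stats),
        })
    return out
```

Splitting by patient rather than by electrode or sample keeps recordings from one patient out of both sides of a fold, and fitting the statistics inside the loop avoids leaking held-out data into the normalization.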
Files to Review
pyhealth/datasets/ccep_ecog.py - CCEP ECoG dataset implementation
pyhealth/tasks/localize_soz.py - Localize SOZ task implementation
pyhealth/models/spes.py - SPES models implementation
tests/core/test_ccep_ecog.py - Dataset and task test suite
tests/core/test_spes.py - Model test suite
examples/ccep_ecog_localize_soz_spes.py - End-to-end example demonstrating dataset, task, and model usage
pyproject.toml - Added requirements specific to this implementation