feat(MedTsLLM): add MedTsLLM model with LUDB/BIDMC/MIT-BIH datasets and ECG/respiratory tasks #1108
Open
antonbarchukov wants to merge 1 commit into sunlabuiuc:master from
Contributor
Type of Contribution
Option 4 — Full Pipeline (Dataset + Task + Model, extra-credit eligible)
Further Extensions / Experiment Logs / Ablations
Hosted outside of pyhealth (and yes, I know the repo name ends in '-pyhealth'):
https://github.com/antonbarchukov/cs598-pyhealth
Paper
Chan, N., Parker, E., Bennett, W., Wu, T., Jia, R. A., Liu, J., Wong, E.
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis.
Proceedings of Machine Learning for Healthcare (MLHC) 2024.
arXiv: https://arxiv.org/abs/2408.07773
Original repo: https://github.com/flixpar/med-ts-llm
Summary
Adds the MedTsLLM model — a frozen-LLM time-series encoder that reprograms ECG / respiratory patches into language-model embedding space via cross-attention over learned word-prototype tokens — together with three PhysioNet datasets and four tasks demonstrating the full pipeline end-to-end.
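To make the reprogramming step in the summary concrete, here is a minimal NumPy sketch of cross-attention in which signal patches act as queries and word-prototype tokens act as keys and values. It is an illustration only, not this PR's ReprogrammingLayer: it uses a single head rather than multi-head attention, random matrices stand in for learned weights, and every name and dimension (reprogram_patches, w_q, d_llm, and so on) is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reprogram_patches(patches, prototypes, w_q, w_k, w_v):
    """Single-head cross-attention: patches attend over prototype tokens.

    patches:    (n_patches, d_patch)  patch embeddings of the signal (queries)
    prototypes: (n_protos, d_llm)     word-prototype tokens (keys and values)
    returns:    (n_patches, d_llm) outputs and (n_patches, n_protos) weights
    """
    q = patches @ w_q        # (n_patches, d_k)
    k = prototypes @ w_k     # (n_protos, d_k)
    v = prototypes @ w_v     # (n_protos, d_llm)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    return attn @ v, attn

# Hypothetical sizes; random values stand in for learned parameters.
rng = np.random.default_rng(0)
n_patches, d_patch, n_protos, d_llm, d_k = 8, 16, 32, 64, 16
patches = rng.normal(size=(n_patches, d_patch))
prototypes = rng.normal(size=(n_protos, d_llm))
w_q = rng.normal(size=(d_patch, d_k)) * 0.1
w_k = rng.normal(size=(d_llm, d_k)) * 0.1
w_v = rng.normal(size=(d_llm, d_llm)) * 0.1

reprogrammed, attn = reprogram_patches(patches, prototypes, w_q, w_k, w_v)
print(reprogrammed.shape, attn.shape)
```

Each output row is a weighted mix of projected prototype tokens, so every patch ends up expressed in the language model's embedding dimension, which is what lets a frozen LLM consume it.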
Datasets (pyhealth/datasets/)
- LUDBDataset: 12-lead ECG with P / QRS / T wave annotations (PhysioNet LUDB 1.0.1)
- BIDMCDataset: respiratory + ECG with breath annotations (PhysioNet BIDMC PPG & Respiration)
- MITBIHDataset: arrhythmia beat annotations (PhysioNet MIT-BIH)
Each dataset inherits from BaseDataset, ships a configs/*.yaml schema, supports a preprocess=True cache, and offers an 80/20 paper-split via a seed=0 record-name shuffle.
Tasks (pyhealth/tasks/)
- ECGWaveSegmentation: per-timestep P / QRS / T classification (LUDB)
- ECGBoundaryDetection: QRS boundary detection (MIT-BIH)
- ECGAnomalyDetection: normal / abnormal beat classification (MIT-BIH)
- RespiratoryBoundaryDetection: breath onset detection (BIDMC)
Model (pyhealth/models/medtsllm.py + pyhealth/models/_medtsllm/)
- ReprogrammingLayer (multi-head cross-attention) with a LinearProjection ablation
- RevIN normalization, PatchEmbedding, FlattenHead output head
- prompt_dataset / prompt_task / prompt_patient / prompt_stats; defaults match the reference implementation's dataset+task+patient (dtp) config
Ablation Study / Examples (examples/)
One example script per task, each with CLI flags for prompt-component ablation and LLM-backbone swap:
- examples/ludb_ecg_segmentation_medtsllm.py
- examples/mitbih_ecg_boundary_medtsllm.py
- examples/mitbih_ecg_anomaly_medtsllm.py
- examples/bidmc_respiratory_boundary_medtsllm.py
Tests (tests/core/)
208 tests across four files. All tests run on synthetic wfdb records generated by tests/core/_synthetic_wfdb.py; no real patient data is committed to the repo. Regenerate the committed fixtures with:
File Guide
- pyhealth/models/medtsllm.py, pyhealth/models/_medtsllm/{layers,prompt}.py
- pyhealth/datasets/{ludb,bidmc,mitbih}.py, pyhealth/datasets/configs/{ludb,bidmc,mitbih}.yaml
- pyhealth/tasks/{ecg_wave_segmentation,ecg_boundary_detection,ecg_anomaly_detection,respiratory_boundary_detection}.py
- examples/*_medtsllm.py
- tests/core/test_{medtsllm,ludb,bidmc,mitbih}.py, tests/core/_synthetic_wfdb.py
- docs/api/{datasets,tasks,models}.rst + docs/api/{datasets,tasks,models}/pyhealth.*.rst
- test-resources/core/{ludb,bidmc,mitbih}/ (fully synthetic)
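For reviewers who want to reason about the 80/20 paper-split mentioned under Datasets, the sketch below shows one way a seed=0 record-name shuffle can produce a deterministic split. It is an assumption-laden illustration: the function name paper_split is hypothetical, and it uses Python's random module, whereas the dataset classes in this PR may use a different RNG and so select different records.

```python
import random

def paper_split(record_names, train_frac=0.8, seed=0):
    """Deterministically split record names into train and test lists.

    Hypothetical helper: sorts the names so the result does not depend on
    input order, shuffles with a fixed seed, then cuts at train_frac.
    """
    names = sorted(record_names)
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * train_frac)
    return names[:n_train], names[n_train:]

# Made-up record names purely for demonstration.
records = [f"rec{i:03d}" for i in range(10)]
train, test = paper_split(records)
print(len(train), len(test))
```

Splitting by record name rather than by window keeps all samples from one recording on the same side of the split, which avoids leakage between train and test.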