Contributor Info
NetID: kevint6
Email: kevint6@illinois.edu
Type of Contribution
Standalone Task (Option 3)
Original Paper Reference
Paper: Early Prediction of Causes (not Effects) in Healthcare by Long-Term Clinical Time Series Forecasting (Staniek et al., 2024)
Implementation Description
This PR implements the SofaLabForecastingMIMIC3 task. Following the "prediction of causes" paradigm from the referenced paper, this task forecasts three clinical variables (Bilirubin, Creatinine, and Platelets) and then deterministically derives a binary SOFA-deterioration label from the 24h delta in SOFA scores.
Note: The paper's approach produces a vectorized forecast as output, in the same form as the input, so I could not use PyHealth's native models and stay faithful to the paper. The alternative, training directly on the SOFA labels, would defeat the purpose of what makes the paper unique. I therefore used sklearn's LinearRegression as a baseline model for testing.
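For reviewers, here is a minimal sketch of how a binary deterioration label could be derived from forecasted labs. It is not the code in this PR. The cut-offs are the standard SOFA thresholds for the liver, renal (creatinine only, ignoring urine output), and coagulation components. The function names and the `min_increase` parameter are assumptions made for illustration; the actual task may define deterioration differently.

```python
from bisect import bisect_right

# Standard SOFA cut-offs (bilirubin and creatinine in mg/dL, platelets in 10^3/uL).
BILIRUBIN_CUTS = [1.2, 2.0, 6.0, 12.0]
CREATININE_CUTS = [1.2, 2.0, 3.5, 5.0]
PLATELET_CUTS = [20, 50, 100, 150]


def liver_score(bilirubin):
    # Higher bilirubin gives a higher sub-score (0 to 4).
    return bisect_right(BILIRUBIN_CUTS, bilirubin)


def renal_score(creatinine):
    # Creatinine-only renal sub-score; urine output is ignored in this sketch.
    return bisect_right(CREATININE_CUTS, creatinine)


def coagulation_score(platelets):
    # Lower platelets give a higher sub-score, so the bisect index is inverted.
    return 4 - bisect_right(PLATELET_CUTS, platelets)


def partial_sofa(bilirubin, creatinine, platelets):
    # Sum of the three lab-based SOFA components (range 0 to 12).
    return (
        liver_score(bilirubin)
        + renal_score(creatinine)
        + coagulation_score(platelets)
    )


def deterioration_label(current, forecast_24h, min_increase=1):
    # Hypothetical rule: label 1 if the partial SOFA score rises by at least
    # `min_increase` between the current labs and the 24h forecast.
    delta = partial_sofa(*forecast_24h) - partial_sofa(*current)
    return int(delta >= min_increase)


if __name__ == "__main__":
    now = (0.8, 1.0, 200)    # (bilirubin, creatinine, platelets)
    later = (2.5, 1.0, 200)  # forecasted values 24h ahead
    print(deterioration_label(now, later))
```

Because the label is a pure function of the forecasted values, any model that outputs the three lab values can be evaluated on the classification task without retraining.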
File Guide
pyhealth/tasks/sofa_lab_forecasting_mimic3.py: Core task implementation.
tests/core/test_mimic3_sofa_lab.py: Comprehensive test suite using synthetic data.
examples/mimic3_sofa_lab_forecasting_linear.py: Ablation study comparing 12h vs. 24h lookback performance.
docs/api/tasks/pyhealth.tasks.sofa_lab_forecasting_mimic3.rst: API documentation.
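As a rough illustration of the kind of lookback comparison the example script performs, the sketch below fits sklearn's LinearRegression on windows of 12 and 24 time steps. It does not reproduce the script itself. The `make_windows` helper, the hourly sampling, the 80/20 split, and the random-walk data are all assumptions made for illustration; the synthetic series stands in for the three lab variables and carries no clinical meaning.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def make_windows(series, lookback, horizon):
    """Slice a (T, n_vars) series into flattened lookback windows and
    targets taken `horizon` steps after the end of each window."""
    X, Y = [], []
    for end in range(lookback, series.shape[0] - horizon + 1):
        X.append(series[end - lookback:end].ravel())
        Y.append(series[end + horizon - 1])
    return np.asarray(X), np.asarray(Y)


rng = np.random.default_rng(0)
# Synthetic random walk standing in for hourly (bilirubin, creatinine,
# platelets) values; this is NOT MIMIC-III data.
series = np.cumsum(rng.normal(size=(200, 3)), axis=0)

for lookback in (12, 24):
    X, Y = make_windows(series, lookback, horizon=24)
    split = int(0.8 * len(X))  # chronological train/test split
    # LinearRegression accepts a 2-D target, so one model forecasts all
    # three variables at once (a vector output, like the input).
    model = LinearRegression().fit(X[:split], Y[:split])
    pred = model.predict(X[split:])
    mae = np.mean(np.abs(pred - Y[split:]))
    print(f"lookback={lookback}h  test MAE={mae:.3f}")
```

The forecasted vectors from `model.predict` can then be passed to whatever rule derives the SOFA-deterioration label.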