Add WordBasisLinearModel for word-basis explanations of linear classifiers #1024

Open
sawruhv wants to merge 3 commits into sunlabuiuc:master from sawruhv:add-word-basis-linear-model

sawruhv commented Apr 19, 2026

Contributor

Type of Contribution

  • Model

Original Paper

High-Level Description

This PR adds WordBasisLinearModel, a PyHealth model inspired by the paper "Representing visual classification as a linear combination of words."

The model reproduces the paper’s core two-step idea in a minimal PyHealth-friendly form:

  1. learn a bias-free linear classifier on precomputed embeddings
  2. reconstruct the learned classifier weight vector as a linear combination of fixed word embeddings
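
As a rough illustration of the two steps above, here is a minimal NumPy sketch. It is not the WordBasisLinearModel code from this PR: the function names, the plain gradient-descent training loop, the least-squares reconstruction, and the synthetic data are all assumptions chosen to keep the example self-contained.

```python
import numpy as np


def fit_bias_free_logistic(X, y, weight_decay=0.0, lr=0.5, steps=500):
    """Step 1 (hypothetical helper): logistic regression with no bias term."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        logits = np.clip(X @ w, -30.0, 30.0)  # clip to keep exp() well-behaved
        p = 1.0 / (1.0 + np.exp(-logits))
        # Gradient of mean binary cross-entropy plus an L2 penalty on w.
        grad = X.T @ (p - y) / len(y) + weight_decay * w
        w -= lr * grad
    return w


def reconstruct_from_words(w, word_embeddings):
    """Step 2 (hypothetical helper): least-squares fit of w by word embeddings.

    word_embeddings has shape (n_words, dim); the returned coefficients c
    minimize ||word_embeddings.T @ c - w||.
    """
    coeffs, *_ = np.linalg.lstsq(word_embeddings.T, w, rcond=None)
    return coeffs, word_embeddings.T @ coeffs


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Synthetic stand-ins for precomputed embeddings and fixed word embeddings.
rng = np.random.default_rng(0)
dim, n_words, n_samples = 16, 8, 200
E = rng.normal(size=(n_words, dim))
true_w = E.T @ rng.normal(size=n_words)  # labels depend on a word-span direction
X = rng.normal(size=(n_samples, dim))
y = (X @ true_w > 0).astype(float)

w = fit_bias_free_logistic(X, y)
coeffs, w_hat = reconstruct_from_words(w, E)
train_acc = float(np.mean((X @ w > 0) == (y > 0.5)))

print("train accuracy:", round(train_acc, 3))
print("cosine(w, w_hat):", round(cosine(w, w_hat), 3))
```

The coefficients in `coeffs` play the role of per-word weights in the explanation, and the cosine similarity indicates how much of the classifier direction the word basis captures. The actual model may solve the reconstruction differently (for example with regularization or gradient-based fitting).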

This PR focuses on the model contribution only. It does not attempt to reproduce the full paper pipeline, reader study, or dataset-specific preprocessing. Instead, it implements the core word-basis explanation method as a reusable PyHealth model.

The PR also includes:

  • synthetic model tests
  • a runnable ablation/example script using synthetic/demo data
  • model API documentation

Ablation / Example

The example script performs a weight decay sweep:

  • weight_decay = 0.0
  • weight_decay = 1e-4
  • weight_decay = 1e-2

It compares:

  • training loss
  • validation loss
  • training accuracy
  • validation accuracy
  • cosine similarity between the learned classifier weights and the word-basis reconstruction

This weight decay sweep is the concrete model ablation required by the rubric.
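
A sweep of this shape could be sketched as follows. This is not the example script from the PR: the helper names, the NumPy gradient-descent loop, the treatment of weight decay as an L2 term in the gradient, and the synthetic train/validation split are all assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def bce_loss(w, X, y):
    p = np.clip(sigmoid(X @ w), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def accuracy(w, X, y):
    return float(np.mean((X @ w > 0) == (y > 0.5)))


def train(X, y, weight_decay, lr=0.5, steps=300):
    """Bias-free logistic regression; weight decay enters as an L2 gradient term."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * (X.T @ (sigmoid(X @ w) - y) / len(y) + weight_decay * w)
    return w


def reconstruction_cosine(w, word_embeddings):
    """Cosine similarity between w and its least-squares word-basis fit."""
    coeffs, *_ = np.linalg.lstsq(word_embeddings.T, w, rcond=None)
    w_hat = word_embeddings.T @ coeffs
    denom = np.linalg.norm(w) * np.linalg.norm(w_hat) + 1e-12
    return float(w @ w_hat / denom)


# Synthetic data (assumed shapes): 16-dim embeddings, 8 word vectors.
rng = np.random.default_rng(0)
dim, n_words = 16, 8
E = rng.normal(size=(n_words, dim))
true_w = E.T @ rng.normal(size=n_words)
X = rng.normal(size=(300, dim))
y = (X @ true_w > 0).astype(float)
X_train, y_train, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

results = []
for wd in (0.0, 1e-4, 1e-2):
    w = train(X_train, y_train, weight_decay=wd)
    results.append({
        "weight_decay": wd,
        "train_loss": bce_loss(w, X_train, y_train),
        "val_loss": bce_loss(w, X_val, y_val),
        "train_acc": accuracy(w, X_train, y_train),
        "val_acc": accuracy(w, X_val, y_val),
        "cosine": reconstruction_cosine(w, E),
    })

for row in results:
    print(row)
```

Each row reports the five quantities listed above for one weight decay setting, so the effect of regularization on both fit quality and reconstruction quality can be read side by side.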

File Guide

Please review these files:

  • pyhealth/models/word_basis_linear_model.py
  • pyhealth/models/__init__.py
  • tests/models/test_word_basis_linear_model.py
  • examples/sample_binary_word_basis_linear_model.py
  • docs/api/models/pyhealth.models.word_basis_linear_model.rst
  • docs/api/models.rst

Notes

  • Tests use only synthetic/pseudo data.
  • The example script is intentionally small and runnable locally.
  • The implementation stays close to the paper’s core method while keeping the PR scope appropriate for an Option 2 model contribution.

sawruhv force-pushed the add-word-basis-linear-model branch from 5213ad8 to e4d3b56 on April 19, 2026 19:48