Make microunit the default tax-unit constructor (#113) #116

Closed
MaxGhenis wants to merge 5 commits into wire-microunit from prototype/microunit-activation

@MaxGhenis MaxGhenis commented May 30, 2026

What

Makes microunit the default tax-unit constructor for microplex's PolicyEngine entity tables — it is microplex's required tax-unit engine (#113), not an optional prototype. Builds on #114 (delegation seam).

  • High-fidelity path is default-on when the real CPS-derived fields (person_number, spouse_person_number, family_relationship) are present, which is the case for the production candidate. The microunit input is built from those real pointers (not the collapsed relationship_to_head), and person_number==1 anchors a valid reference person so microunit always has a head.
  • The coarse relationship_to_head-only heuristic stays opt-in (config.microunit_construct_from_normalized) so minimal frames don't silently get the lossy reconstruction.
  • Legacy role-flag reconstruction remains only as a fail-safe (microunit raises → log + fall back; never a silent legacy substitution when microunit input is sufficient).
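The ordering in the bullets above can be sketched as a small dispatch function. This is a minimal illustration, not the adapter's actual code: the field names come from this description, while `choose_tax_unit_path`, its `construct_from_normalized` argument, and the returned labels are hypothetical. It does not model the fail-safe that falls back to legacy when microunit raises.

```python
from typing import Iterable

# Field names are taken from the PR description; everything else is hypothetical.
HIGH_FIDELITY_FIELDS = frozenset(
    {"person_number", "spouse_person_number", "family_relationship"}
)


def choose_tax_unit_path(
    columns: Iterable[str], construct_from_normalized: bool = False
) -> str:
    """Return which construction path a frame with these columns would take."""
    available = set(columns)
    if HIGH_FIDELITY_FIELDS <= available:
        # Real CPS-derived pointers are present: default-on microunit path.
        return "microunit_high_fidelity"
    if construct_from_normalized and "relationship_to_head" in available:
        # Coarse heuristic, used only when explicitly opted in.
        return "microunit_heuristic"
    # Minimal frame without opt-in: legacy role-flag reconstruction.
    return "legacy"
```

Under these assumptions, a frame that only carries relationship_to_head stays on the legacy path unless the opt-in flag is set, so it never silently receives the lossy reconstruction.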

filing_status is PE's job (delegated)

microplex does not export filing_status — it's a PE formula variable, computed from the partition + marital units (for calibration via sim.calculate, and for one-off household requests). microunit's filing_status_input is internal bookkeeping only, so the entity tests no longer assert it.

Behavior (verified on real CPS + unit tests)

  • tax_units/hh ≈ 1.42 vs authoritative 1.38 (legacy under-split at 1.16); microunit constructs without crashing on real households.
  • The 4 build_policyengine_entity_tables role-flag tests now exercise the microunit default path; a 19+ non-student own-child gets its own unit (microunit's genuine qualifying-child age rule), not the legacy "fold." A legacy-fallback test (no high-fidelity fields) preserves that path.
  • test_us.py: 159 passed; delegation suite 12 passed; ruff clean.
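For reference, a tax_units/hh figure like the one above can be computed as distinct tax units divided by distinct households. The sketch below assumes person-level `(household_id, tax_unit_id)` pairs as input; the function name and the input shape are illustrative, not the validation script used for this PR.

```python
from typing import Iterable, Tuple


def tax_units_per_household(person_rows: Iterable[Tuple[int, int]]) -> float:
    """Average number of tax units per household.

    person_rows holds one (household_id, tax_unit_id) pair per person.
    Tax units are counted per household, so IDs may repeat across households.
    """
    pairs = set(person_rows)
    households = {household_id for household_id, _ in pairs}
    if not households:
        raise ValueError("no households in input")
    return len(pairs) / len(households)


# Household 1 has two tax units, household 2 has one.
example = [(1, 10), (1, 10), (1, 11), (2, 20)]
ratio = tax_units_per_household(example)
```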

⚠️ Entity-convergence (#113)

microunit is eCPS's tax-unit engine, so any change in loss from this PR is entity-convergence toward eCPS, not an independent microplex (MP) improvement. Label it as such.

Tracked follow-up

Thread A_HSCOL/enrollment into the adapter so microunit's qualifying-child-to-24 student extension fires (currently under-claims 19–23-yo student own-child dependents vs eCPS) → #122.

🤖 Generated with Claude Code

MaxGhenis and others added 4 commits May 30, 2026 17:44
…ized frame

Synthesize microunit's CPS input columns (PH_SEQ/A_LINENO/A_AGE/A_MARITL/
A_SPOUSE/PEPAR1/PEPAR2/A_EXPRRP) from microplex's normalized materialization
columns (household_id/age/relationship_to_head), so the microunit tax-unit
delegation can fire on real synthesized data. Gated OFF by default via
allow_normalized_adapter / config.microunit_construct_from_normalized.

HEURISTIC AND UNVALIDATED: the relationship->A_EXPRRP and married->A_MARITL maps
are approximate and PEPAR1/PEPAR2 assume a child's parents are the household head
and spouse. Fidelity must be validated against the legacy reconstruction before
trust (see #115). Default behavior is unchanged with the flag off.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… CPS data)

Running the adapter against real CPS ASEC surfaced two issues:
- microunit requires one reference person per household; guarantee the line-1
  member is the single head and demote spurious extra heads (multi-family).
- microunit still raises on households it cannot resolve, so wrap
  construct_tax_units in a fail-safe: log and return None (caller falls back to
  the legacy reconstruction) instead of crashing materialization.
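Both fixes can be sketched in a few lines. This is an illustration under stated assumptions rather than the committed code: people are plain dicts, `line_number` and `is_head` are hypothetical stand-ins for the CPS line and reference-person columns, and `construct` is any callable playing the role of construct_tax_units.

```python
import logging
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger(__name__)

Person = Dict[str, Any]


def ensure_single_head(household: List[Person]) -> List[Person]:
    """Mark the line-1 member as the only head and demote any other heads."""
    return [{**p, "is_head": p["line_number"] == 1} for p in household]


def construct_or_none(
    construct: Callable[[List[Person]], Any], household: List[Person]
) -> Optional[Any]:
    """Run the constructor; on any exception, log and return None."""
    try:
        return construct(ensure_single_head(household))
    except Exception:
        logger.warning(
            "tax-unit construction failed; caller should fall back", exc_info=True
        )
        return None
```

A caller would then treat `None` as the signal to run the legacy reconstruction, so a household that microunit cannot resolve no longer crashes materialization.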

10 tests passing (adds fail-safe + head-promotion coverage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ds (#115)

The heuristic adapter (from collapsed relationship_to_head) crashed microunit on
real heterogeneous households. microplex actually carries the real CPS-derived
fields at materialization, so build microunit's contract from those instead:
- A_LINENO from person_number (real 1-based within-household line number)
- A_SPOUSE from spouse_person_number (real spouse line pointer)
- A_EXPRRP from family_relationship; A_MARITL from spouse presence / surviving
- person_number==1 always anchors a valid reference person (no more crashes)
PEPAR1/PEPAR2 stay heuristic (child's parents = household head + spouse).

The normalized adapter now dispatches to this path when the real fields are
present, else the relationship_to_head heuristic. Validated on 8000 real CPS
households: microunit constructs without crashing, tax_units/hh=1.42 vs
authoritative 1.38 (vs legacy reconstruction's ~1.27 under-split). 12 tests.
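The pointer mapping described in this commit might look like the sketch below. It is an assumption-laden illustration: people are plain dicts, family_relationship is taken to be 1-based (1 = reference person, 2 = spouse, 3 = child), `0` is a hypothetical "no pointer" sentinel, and the A_EXPRRP and A_MARITL mappings are omitted because their code tables are not given here.

```python
from typing import Any, Dict, List

Person = Dict[str, Any]

CHILD = 3  # assumed 1-based family_relationship code for an own child
NO_POINTER = 0  # hypothetical sentinel for "no spouse/parent line"


def build_cps_pointers(household: List[Person]) -> List[Person]:
    """Derive A_LINENO, A_SPOUSE, PEPAR1 and PEPAR2 for one household."""
    head = next((p for p in household if p["person_number"] == 1), None)
    if head is None:
        raise ValueError("no person_number == 1 to anchor the reference person")
    head_line = 1
    head_spouse_line = head.get("spouse_person_number") or NO_POINTER

    rows = []
    for p in household:
        is_child = p["family_relationship"] == CHILD and p["person_number"] != 1
        rows.append(
            {
                "A_LINENO": p["person_number"],  # real within-household line
                "A_SPOUSE": p.get("spouse_person_number") or NO_POINTER,
                # Heuristic from the commit: a child's parents are head + spouse.
                "PEPAR1": head_line if is_child else NO_POINTER,
                "PEPAR2": head_spouse_line if is_child else NO_POINTER,
            }
        )
    return rows
```

The spouse pointer is copied from real data, while the parent pointers remain a guess, which is why a child of a non-head adult in a multi-family household would still be attached to the head under this sketch.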

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
microunit is microplex's required tax-unit engine, not an optional prototype.
Default the high-fidelity path ON when the real CPS-derived fields
(person_number/spouse_person_number/family_relationship) are present -- which the
production candidate carries -- while the coarse relationship_to_head-only
heuristic stays opt-in via config so minimal frames don't get the lossy path.

Tests: the four build_policyengine_entity_tables role-flag tests now exercise the
microunit default path. filing_status is PE-computed (delegated; microplex does
not export it), so its non-authoritative internal value is no longer asserted.
The young-adult-child case asserts microunit's genuine qualifying-child age rule
(a 19+ non-student own-child gets its own unit, not folded). Adds a legacy-
fallback test (no high-fidelity fields). Full test_us.py: 159 passed; delegation
suite 12 passed; ruff clean.

Follow-up: thread A_HSCOL/enrollment into the adapter so microunit's
qualifying-child-to-24 student extension fires (currently under-claims 19-23yo
student own-child dependents vs eCPS).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis changed the title from "Prototype (#115): activate microunit tax-unit construction from normalized columns" to "Make microunit the default tax-unit constructor (#113)" May 31, 2026
…s/tests

An independent review of the microunit delegation+activation change surfaced
three findings; fixes:

1. (docs / intended-behavior) microunit's default-on path replaces the
   CPS-provided tax_unit_id (Census TAX_ID) even when
   policyengine_prefer_existing_tax_unit_ids is True. This is the intended
   "replace the CPS tax units, keep the SPM units" behavior -- the existing-ID
   path is a fallback for households microunit does not construct -- but the
   method docstring wrongly claimed the authoritative-ID path "is never routed
   here". Rewrote the docstring to document the override + fallback ordering and
   the separate SPM/family/marital preservation.

2. (real bug) The high-fidelity adapter assumed only the 1-based CPS A_FAMREL
   coding, so a 0-based family_relationship frame -- which the pipeline supports
   elsewhere (see _normalize_relationship_to_head and data_sources.cps) --
   silently mis-coded children as spouses and dropped their parent pointers.
   Normalize the coding scheme per household (shift 0-based households up by one)
   before the A_EXPRRP / parent-pointer mapping.

3. (missing coverage) Added two regression tests for the previously-untested
   boundaries: bad CPS tax_unit_id [100,100,200] + high-fidelity fields ->
   microunit folds to one unit with spm_unit_id preserved; and a 0-based
   family_relationship frame -> same A_EXPRRP/parent pointers and partition as
   the 1-based frame.
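The per-household normalization in finding 2 can be sketched as follows. The detection rule is an assumption made for illustration (the commit does not state how 0-based households are recognized): since person_number == 1 anchors the reference person, a family_relationship of 0 on that row is read as 0-based coding. The function name and dict-based rows are hypothetical.

```python
from typing import Any, Dict, List

Person = Dict[str, Any]


def normalize_family_relationship(household: List[Person]) -> List[Person]:
    """Return a copy of the household with family_relationship 1-based."""
    anchor = next((p for p in household if p["person_number"] == 1), None)
    # Assumed rule: the line-1 member is the reference person, so a code of 0
    # on that row means the whole household uses 0-based coding.
    zero_based = anchor is not None and anchor["family_relationship"] == 0
    shift = 1 if zero_based else 0
    return [
        {**p, "family_relationship": p["family_relationship"] + shift}
        for p in household
    ]
```

After this shift, a 0-based child code of 2 becomes 3 instead of colliding with the 1-based spouse code, which is the mis-coding the finding describes.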

test_us.py 160 passed; delegation suite 13 passed; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis deleted the branch wire-microunit May 31, 2026 20:34
@MaxGhenis MaxGhenis closed this May 31, 2026
MaxGhenis added a commit that referenced this pull request May 31, 2026
@MaxGhenis (Contributor, Author) commented:

Auto-closed when base branch wire-microunit was deleted on #114 merge. Superseded by #123 (same reviewed content, rebased onto main).

MaxGhenis added a commit that referenced this pull request May 31, 2026
Default-on high-fidelity microunit adapter replaces the CPS TAX_ID (keeps SPM units); builds on #114. filing_status stays PE-delegated. Entity-convergence toward eCPS per #113. Supersedes auto-closed #116. Follow-up: #122.