MaxGhenis (Contributor) requested changes on Apr 10, 2026
Requesting changes.
- GitHub reports `mergeable=CONFLICTING` and `mergeStateStatus=DIRTY`; a local `git merge-tree --write-tree --name-only HEAD FETCH_HEAD` confirms a content conflict in `policyengine_us_data/datasets/puf/puf.py`. That explains the missing CI on this PR.
- The PUF split moves SSTB Schedule C income out of `self_employment_income` and into `sstb_self_employment_income`, but SOI replication still builds `business_net_profits` and `business_net_losses` from only `pe("self_employment_income")` in `policyengine_us_data/utils/soi.py`. After this change, SSTB Schedule C profits/losses disappear from those total business-income comparisons. Please aggregate `self_employment_income + sstb_self_employment_income` anywhere the statistic is total Schedule C/self-employment income, or keep a separate total variable for validation.

Verified locally: `uv run pytest policyengine_us_data/tests/test_calibration/test_puf_impute.py -q` passed (14 tests).
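To illustrate the aggregation requested in the second point, here is a minimal sketch. It assumes the SOI comparison reduces to weighted sums of positive and negative Schedule C income; the function names, the `weights` argument, and the sample values are hypothetical and are not taken from `policyengine_us_data/utils/soi.py`.

```python
import numpy as np


def total_schedule_c_income(self_employment_income, sstb_self_employment_income):
    """Recombine the non-SSTB and SSTB pieces into total Schedule C income."""
    non_sstb = np.asarray(self_employment_income, dtype=float)
    sstb = np.asarray(sstb_self_employment_income, dtype=float)
    return non_sstb + sstb


def business_net_profits_and_losses(total_income, weights):
    """Weighted totals of profits (positive income) and losses (negative income).

    Hypothetical helper: the real SOI replication code may compute these differently.
    """
    total_income = np.asarray(total_income, dtype=float)
    weights = np.asarray(weights, dtype=float)
    profits = float(np.sum(np.maximum(total_income, 0.0) * weights))
    losses = float(np.sum(np.maximum(-total_income, 0.0) * weights))
    return profits, losses


# Illustrative records only: the second record is an SSTB business.
non_sstb = [100.0, 0.0, -50.0]
sstb = [0.0, 200.0, 0.0]
weights = [1.0, 2.0, 1.0]

# Using only the non-SSTB variable drops the SSTB record's profit.
profits_without_sstb, losses_without_sstb = business_net_profits_and_losses(non_sstb, weights)

# Using the recombined total restores it.
total = total_schedule_c_income(non_sstb, sstb)
profits_total, losses_total = business_net_profits_and_losses(total, weights)
```

In this toy data the weighted profit total is understated whenever the SSTB piece is left out, which is the discrepancy the review describes.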
MaxGhenis (Contributor) approved these changes on Apr 10, 2026
Approved after pushing the main merge/conflict resolution and SOI total Schedule C fix. Local verification: tests/unit/calibration/test_calibration_puf_impute.py and tests/unit/test_soi_utils.py pass.
Summary
Add the SSTB split inputs needed by the parallel `policyengine-us` QBID changes.

This PR exposes:
- `sstb_self_employment_income`
- `sstb_w2_wages_from_qualified_business`
- `sstb_unadjusted_basis_qualified_property`

in the PUF/calibration pipeline using the existing `business_is_sstb` flag.

Implementation
- Split `self_employment_income` into non-SSTB and SSTB pieces using the current all-or-nothing `business_is_sstb` indicator.
- Add the new variables to `IMPUTED_VARIABLES` so they flow through PUF-based calibration.

Important limitation
This does not infer mixed SSTB and non-SSTB allocations within the same record. The current data pipeline only has an all-or-nothing SSTB flag, so mixed-category wage/UBIA allocation remains approximate until more granular source data or imputation is added.
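A minimal sketch of what an all-or-nothing split looks like, assuming per-record arrays and a boolean flag. The helper name `split_by_sstb_flag` and the sample values are hypothetical; this is not the code in `policyengine_us_data/datasets/puf/puf.py`.

```python
import numpy as np


def split_by_sstb_flag(amount, business_is_sstb):
    """Assign each record's full amount to either the SSTB or the non-SSTB piece.

    Because the flag is all-or-nothing, no record is ever split between the two.
    """
    amount = np.asarray(amount, dtype=float)
    is_sstb = np.asarray(business_is_sstb, dtype=bool)
    sstb_part = np.where(is_sstb, amount, 0.0)
    non_sstb_part = np.where(is_sstb, 0.0, amount)
    return non_sstb_part, sstb_part


# Illustrative records only.
income = [1000.0, -200.0, 300.0]
flag = [True, False, True]

non_sstb_income, sstb_income = split_by_sstb_flag(income, flag)
```

The two pieces always sum back to the original amount, but a record that in reality mixes SSTB and non-SSTB activity still lands entirely on one side, which is the approximation noted above.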
Verification
- `PYTHONDONTWRITEBYTECODE=1 python -m py_compile policyengine_us_data/datasets/puf/puf.py policyengine_us_data/calibration/puf_impute.py policyengine_us_data/tests/test_calibration/test_puf_impute.py`
- `pytest -q policyengine_us_data/tests/test_calibration/test_puf_impute.py` (blocked here because this environment is missing `torch`, imported by the repo test setup)