While it is possible to install via PyPI:
pip install policyengine-us-data
the recommended installation is
pip install -e .[dev]
which installs the package in editable mode together with its development dependencies
(so that changes to the package code are reflected immediately). policyengine-us-data is
a development package and is not intended for direct access.
The following SSA data sources are used in this project:
- Latest Trustees Report (2025) - Source for social_security_aux.csv (extracted via extract_ssa_costs.py)
- Single Year Supplementary Tables (2025) - Long-range demographic and economic projections
- Single Year Age Demographic Projections (2024 - latest published) - Source for SSPopJul_TR2024.csv population data
PolicyEngine constructs its representative household datasets through a multi-step pipeline. Public survey data is merged, stratified, and cloned to geographic variants per household. Each clone is simulated through PolicyEngine US with stochastic take-up, then calibrated via L0-regularized optimization against administrative targets at the national, state, and congressional district levels, producing geographically representative datasets.
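To make the calibration step concrete, here is a minimal sketch of what an L0-regularized objective can look like: a fit term measuring how far weighted household totals are from each target, plus a penalty for every household record that keeps a nonzero weight. The squared relative error loss, the function name `l0_objective`, the penalty strength, and all the numbers are assumptions for illustration only. The loss and optimizer the project actually uses are described in docs/calibration.md.

```python
def l0_objective(weights, contributions, targets, l0_penalty):
    """Squared relative error per target plus a penalty per nonzero weight.

    Hypothetical helper for illustration; not part of policyengine-us-data.
    contributions[j][i] is what household record i adds to target j.
    """
    fit = 0.0
    for row, target in zip(contributions, targets):
        estimate = sum(c * w for c, w in zip(row, weights))
        fit += (estimate / target - 1.0) ** 2
    nonzero = sum(1 for w in weights if w != 0.0)
    return fit + l0_penalty * nonzero


# Toy data: 3 household records and 2 targets (all values are made up).
contributions = [
    [1.0, 1.0, 1.0],  # every record counts once toward a population target
    [1.0, 0.0, 0.0],  # only record 0 counts toward a benefit-recipient target
]
targets = [10.0, 4.0]

dense = [4.0, 3.0, 3.0]   # hits both targets using all three records
sparse = [4.0, 6.0, 0.0]  # hits both targets using only two records

dense_score = l0_objective(dense, contributions, targets, l0_penalty=0.01)
sparse_score = l0_objective(sparse, contributions, targets, l0_penalty=0.01)
print(dense_score, sparse_score)
```

Both weight vectors match the two targets exactly, so the fit term is zero for each and the L0 penalty alone separates them: the sparse vector scores lower because it relies on fewer records. This is the sense in which an L0 penalty favors smaller calibrated datasets.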
The Enhanced CPS (make data-legacy) produces a national-only calibrated dataset. For the current geography-specific pipeline, see docs/calibration.md.
For detailed calibration usage, see docs/calibration.md and modal_app/README.md.
The pipeline runs as sequential steps in Modal:
make pipeline # prints the steps below
# 1. Build data (CPS/PUF/ACS → source-imputed stratified CPS)
make build-data-modal
# 2. Build calibration matrices (CPU, ~10h)
make build-matrices
# 3. Fit weights (GPU, county + national in parallel)
make calibrate-both
# 4. Build H5 files (state/district/city + national in parallel)
make stage-all-h5s
# 5. Promote to versioned HF paths
make promote
The paper requires a LaTeX distribution (e.g., TeX Live or MiKTeX) with the following packages:
- graphicx (for figures)
- amsmath (for mathematical notation)
- natbib (for bibliography management)
- hyperref (for PDF links)
- booktabs (for tables)
- geometry (for page layout)
- microtype (for typography)
- xcolor (for colored links)
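One way to check whether the packages listed above are available is to ask `kpsewhich`, the file lookup tool that ships with TeX Live and MiKTeX, for each package's `.sty` file. The sketch below is not part of this repository; the helper names `kpsewhich_finds` and `missing_packages` are hypothetical, and it assumes `kpsewhich` is on your PATH.

```python
import shutil
import subprocess

# Package names taken from the list above.
REQUIRED_PACKAGES = [
    "graphicx", "amsmath", "natbib", "hyperref",
    "booktabs", "geometry", "microtype", "xcolor",
]


def kpsewhich_finds(package):
    """Return True if kpsewhich can locate <package>.sty."""
    result = subprocess.run(
        ["kpsewhich", f"{package}.sty"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and bool(result.stdout.strip())


def missing_packages(packages, is_installed=kpsewhich_finds):
    """Return the packages for which the lookup function reports no match."""
    return [p for p in packages if not is_installed(p)]


if __name__ == "__main__":
    if shutil.which("kpsewhich") is None:
        print("kpsewhich not found; is a LaTeX distribution installed and on PATH?")
    else:
        missing = missing_packages(REQUIRED_PACKAGES)
        print("Missing packages:", ", ".join(missing) if missing else "none")
```

The lookup function is passed in as a parameter so the check can be exercised without a LaTeX installation, for example with `missing_packages(["graphicx"], is_installed=lambda p: False)`.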
On Ubuntu/Debian, you can install these with:
sudo apt-get install texlive-latex-base texlive-latex-recommended texlive-latex-extra texlive-fonts-recommended
On macOS with Homebrew:
brew install --cask mactex
To build the paper:
make paper
To clean LaTeX build files:
make clean-paper
The output PDF will be at paper/main.pdf.
The documentation uses Jupyter Book 2 (pre-release) with MyST. To install:
# Install Jupyter Book 2 pre-release
pip install --pre "jupyter-book==2.*"
# Install MyST CLI
npm install -g mystmd
To build and serve the documentation locally:
cd docs
myst start
Alternatively, from the project root:
jupyter book start docs
Both commands start a local server at http://localhost:3001 where you can view the documentation.
The legacy Makefile command is:
make documentation
Note: The Makefile uses the older jb command syntax, which may not work with Jupyter Book 2. Use myst start or jupyter book start docs instead.