The Archive of Mass ENvironmental Data (AMEND) is a project to assemble and analyze data related to environmental regulation, focused on water policy in Massachusetts.
The website for the project can be viewed here.
This git repository contains code for data acquisition (see get_data/), analysis (see analysis/), and the jekyll site (see docs/).
Data is refreshed automatically every Monday at 6am UTC via two GitHub Actions workflows:
- Update Data: Fetches all active data sources, validates row counts and schema, assembles the SQLite database, and commits updated CSVs. If any step fails, a GitHub Issue is opened automatically.
- Update Charts: Runs after a successful data update to regenerate Chart.js visualizations. The PySTAN-based CSO regression analysis (NECIR_CSO_map.py) is excluded from CI and must be run locally.
Both workflows can also be triggered manually from the GitHub Actions tab.
If a workflow fails, a GitHub Issue is opened with a link to the failed run. Make sure you are watching the repository with Issues notifications enabled to receive email alerts.
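The validation and assembly steps of the Update Data workflow can be illustrated with a minimal sketch. This is not the repository's actual code: the column names, the row threshold, the table name, and the helper functions validate_csv and load_into_sqlite are all hypothetical, chosen only to show a schema check, a row-count check, and a load into SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical expectations for one data source; the real column names and
# thresholds are defined by the repository's own validation code.
EXPECTED_COLUMNS = ["permit_id", "town", "year"]
MIN_ROWS = 2


def validate_csv(text, expected_columns, min_rows):
    """Return the parsed rows, or raise ValueError if schema or row count is off."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != expected_columns:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    rows = list(reader)
    if len(rows) < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {len(rows)}")
    return rows


def load_into_sqlite(conn, table, columns, rows):
    """Create the table if it does not exist and insert the validated rows."""
    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_list})')
    conn.executemany(
        f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()


# Made-up sample data standing in for a fetched CSV.
sample = "permit_id,town,year\nMA0001,Boston,2020\nMA0002,Lowell,2021\n"
conn = sqlite3.connect(":memory:")
rows = validate_csv(sample, EXPECTED_COLUMNS, MIN_ROWS)
load_into_sqlite(conn, "permits", EXPECTED_COLUMNS, rows)
print(conn.execute("SELECT COUNT(*) FROM permits").fetchone()[0])
```

Raising an exception on a failed check makes the step exit with a non-zero status, which is what lets a CI workflow treat the run as failed.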
To run a full update locally:
bash update_all.sh
This script will not update ECOS budget records or the SSA wage table, which require manual data entry.
Large files (SQLite database, full drinking water CSV, permit PDFs) are stored on Google Cloud Storage at gs://openamend-data in the openamend GCP project. A budget alert is configured at $1/month.
The service account amend-github-actions@openamend.iam.gserviceaccount.com is used by CI to write to that bucket. Its credentials are stored as the GCP_SA_KEY GitHub Actions secret.
The SODA API credentials for the MA Comptroller payroll data are stored as SODA_APP_TOKEN and SODA_SECRET_TOKEN GitHub Actions secrets, and also locally in get_data/SECRET_SODA_token (not committed).
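One way a fetch script could pick up these credentials in both settings is sketched below. It is an illustration only: the function load_soda_tokens is hypothetical, and the assumed layout of the local file (app token first, secret token second, separated by whitespace) is a guess rather than something this README specifies.

```python
import tempfile
from pathlib import Path


def load_soda_tokens(env, token_file):
    """Return (app_token, secret_token) from env vars, else from a local file."""
    app_token = env.get("SODA_APP_TOKEN")
    secret_token = env.get("SODA_SECRET_TOKEN")
    if app_token and secret_token:
        return app_token, secret_token
    # Assumed file layout: app token first, secret token second,
    # separated by whitespace (for example, one per line).
    tokens = Path(token_file).read_text().split()
    if len(tokens) < 2:
        raise ValueError(f"{token_file} does not contain two tokens")
    return tokens[0], tokens[1]


# Demo with placeholder values: environment variables take precedence.
env = {"SODA_APP_TOKEN": "app", "SODA_SECRET_TOKEN": "secret"}
print(load_soda_tokens(env, "unused"))

# Demo of the local-file fallback, using a temporary stand-in file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "SECRET_SODA_token"
    path.write_text("file-app\nfile-secret\n")
    print(load_soda_tokens({}, path))
```

In a real script the first argument would be os.environ and the second get_data/SECRET_SODA_token, so the same code runs in CI (secrets exposed as environment variables) and on a local checkout.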
The site is hosted via GitHub Pages from the docs/ directory.
To run the site locally:
conda env create -f amend_jekyll_env.yml
conda activate amend_jekyll
cd docs
bundle exec jekyll serve
For running data fetches and most chart scripts (no PySTAN/geopandas):
pip install -r requirements-ci.txt
For all scripts including PySTAN CSO regression analysis:
conda env create -f amend_python_env.yml
conda activate amend_python