
Algorithm

Jip Claassens edited this page Mar 31, 2026 · 2 revisions



IGOR runs an Iterative Proportional Fitting (IPF) procedure that alternates between two balancing steps until convergence. See Home for full notation and a summary of the core formula.


Overview

Initialize: B_ar = 1 for all a, r

For t = 1 … NumberOfIterations:
    Step 1 — recompute A_i and X_ai  (cell constraint: Σ_a X_ai = P_i)
    Step 2 — update B̃_ar            (region constraint: Σ_i X_ai ≈ Q_ar)
    Step 3 — normalize B_ar          (numerical stability)

Output: X_ai from final iteration

Step 1 — Compute cell allocations X_ai

Given $B_{ar}^{(t-1)}$ from the previous iteration, recompute the cell balancing factor:

$$A_i^{(t)} = \frac{P_i}{\displaystyle\sum_{a} E_{i,s(a)} \cdot E_{i,b(a)} \cdot B_{a,r(i)}^{(t-1)}}$$

Then the allocation for each SexAgeClass $a$ in cell $i$ is:

$$X_{ai}^{(t)} = E_{i,s(a)} \cdot E_{i,b(a)} \cdot A_i^{(t)} \cdot B_{a,r(i)}^{(t-1)}$$

This guarantees $\sum_a X_{ai}^{(t)} = P_i$ in every iteration (up to floating-point rounding), provided the denominator of $A_i^{(t)}$ is nonzero.

The two ESTAT share terms encode the cross-product structure:

  • $E_{i,s(a)}$ — sex share (M or F) for the 1 km cell containing $i$
  • $E_{i,b(a)}$ — broad-age share (LT15, 15–64, or GE65) for the same cell

The product $E_{i,s(a)} \cdot E_{i,b(a)}$ assumes sex and broad age are locally independent given the 1 km cell — the LAU-level $B_{ar}$ term corrects for any departure from that assumption.
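As a rough illustration of Step 1, the following pure-Python sketch computes $A_i$ and $X_{ai}$ for a handful of cells. It is not the GeoDMS implementation: the function name `allocate_cells`, the list-based data layout, and the zero-denominator fallback are assumptions made for this example, and the toy data uses only two broad-age classes to stay short.

```python
def allocate_cells(P, E_sex, E_age, region, sex_of, age_of, B):
    """Step 1 sketch: return (A, X) where X[a][i] sums to P[i] over a.

    P[i]        cell population
    E_sex[i][s] sex share in cell i; E_age[i][b] broad-age share in cell i
    region[i]   region index r(i); sex_of[a], age_of[a] map class a to s(a), b(a)
    B[a][r]     region balancing factors from the previous iteration
    """
    n_classes = len(sex_of)
    A = []
    X = [[0.0] * len(P) for _ in range(n_classes)]
    for i, p in enumerate(P):
        r = region[i]
        # Unnormalised weight of each class in cell i: E_sex * E_age * B.
        w = [E_sex[i][sex_of[a]] * E_age[i][age_of[a]] * B[a][r]
             for a in range(n_classes)]
        total = sum(w)
        # Assumption for this sketch: a zero denominator yields A_i = 0.
        a_i = p / total if total > 0 else 0.0
        A.append(a_i)
        for a in range(n_classes):
            X[a][i] = w[a] * a_i
    return A, X


# Hypothetical toy data: 2 cells in one region, 2 sexes x 2 broad ages.
A, X = allocate_cells(
    P=[100.0, 50.0],
    E_sex=[[0.5, 0.5], [0.4, 0.6]],
    E_age=[[0.2, 0.8], [0.3, 0.7]],
    region=[0, 0],
    sex_of=[0, 0, 1, 1],
    age_of=[0, 1, 0, 1],
    B=[[1.0], [1.0], [1.0], [1.0]],
)
print([sum(X[a][i] for a in range(4)) for i in range(2)])
```

The printed column sums should match `P` up to rounding, which is the cell constraint stated above.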


Step 2 — Update region balancing factors B_ar

Sum the current-iteration allocations over all cells in region $r$:

$$\hat{Q}_{ar}^{(t)} = \sum_{i:\, r(i)=r} X_{ai}^{(t)}$$

Update the balancing factor multiplicatively:

$$\tilde{B}_{ar}^{(t)} = B_{ar}^{(t-1)} \cdot \frac{Q_{ar}}{\hat{Q}_{ar}^{(t)}}$$

Intuition:

| Situation | Effect on $\tilde{B}$ |
| --- | --- |
| $\hat{Q} > Q_{ar}$ (over-allocated) | $\tilde{B}$ shrinks → next iteration allocates less |
| $\hat{Q} < Q_{ar}$ (under-allocated) | $\tilde{B}$ grows → next iteration allocates more |
| $\hat{Q} \approx 0$, $Q_{ar} > 0$ | $\tilde{B}$ diverges → normalization (Step 3) prevents this |

If $\hat{Q}_{ar}^{(t)} = 0$ and $Q_{ar} = 0$, the factor is set to 0 (MakeDefined(..., 0f) in the code).
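The Step 2 update can be sketched in pure Python as follows. The function name `update_region_factors` and the nested-list layout are assumptions for illustration, not the GeoDMS code. The sketch sets the factor to 0 whenever $\hat{Q}_{ar} = 0$; for the $0/0$ case this matches the rule above, while extending it to $Q_{ar} > 0$ is a simplification made here to avoid a division by zero.

```python
def update_region_factors(X, Q, B_prev, region):
    """Step 2 sketch: return (Q_hat, B_tilde).

    X[a][i]      allocations from Step 1
    Q[a][r]      region targets per class
    B_prev[a][r] balancing factors from the previous iteration
    region[i]    region index r(i) of each cell
    """
    n_classes, n_regions = len(Q), len(Q[0])
    # Q_hat[a][r]: allocations summed over all cells i with r(i) = r.
    Q_hat = [[0.0] * n_regions for _ in range(n_classes)]
    for a in range(n_classes):
        for i, r in enumerate(region):
            Q_hat[a][r] += X[a][i]
    B_tilde = [[0.0] * n_regions for _ in range(n_classes)]
    for a in range(n_classes):
        for r in range(n_regions):
            if Q_hat[a][r] > 0:
                # Multiplicative correction towards the region target.
                B_tilde[a][r] = B_prev[a][r] * Q[a][r] / Q_hat[a][r]
            else:
                # Q_hat == 0: factor set to 0 (assumption when Q > 0).
                B_tilde[a][r] = 0.0
    return Q_hat, B_tilde


# Hypothetical toy data: 2 classes, 2 cells, 1 region.
Q_hat, B_tilde = update_region_factors(
    X=[[10.0, 20.0], [30.0, 40.0]],
    Q=[[60.0], [35.0]],
    B_prev=[[1.0], [1.0]],
    region=[0, 0],
)
print(Q_hat, B_tilde)
```

In this toy case the first class is under-allocated (30 against a target of 60), so its factor doubles; the second is over-allocated (70 against 35), so its factor halves.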


Step 3 — Normalize B_ar per region

To keep factors numerically stable across iterations, each region's vector of $\tilde{B}$ values is rescaled by its maximum:

$$B_{ar}^{(t)} = \frac{\tilde{B}_{ar}^{(t)}}{\max_{a'} \tilde{B}_{a'r}^{(t)}}$$

After normalization, the SexAgeClass with the largest $\tilde{B}$ in each region has $B = 1$, and all others have $B \le 1$. Because $A_i$ compensates multiplicatively in Step 1, this rescaling changes only the scale of the $B$ factors, not the resulting $X_{ai}$ values.
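The invariance can be checked directly from the Step 1 formulas. As a short worked derivation, write $c_r = \max_{a'} \tilde{B}_{a'r}^{(t)}$ (assumed positive) and let $\tilde{A}_i^{(t+1)}$ denote the cell factor that Step 1 would produce from the unnormalized $\tilde{B}$; both symbols are notation introduced here, not part of the model definition. With the normalized factors $B_{ar}^{(t)} = \tilde{B}_{ar}^{(t)} / c_r$:

$$A_i^{(t+1)} = \frac{P_i}{\displaystyle\sum_{a} E_{i,s(a)} \cdot E_{i,b(a)} \cdot \frac{\tilde{B}_{a,r(i)}^{(t)}}{c_{r(i)}}} = c_{r(i)} \cdot \tilde{A}_i^{(t+1)}$$

$$X_{ai}^{(t+1)} = E_{i,s(a)} \cdot E_{i,b(a)} \cdot c_{r(i)} \tilde{A}_i^{(t+1)} \cdot \frac{\tilde{B}_{a,r(i)}^{(t)}}{c_{r(i)}} = E_{i,s(a)} \cdot E_{i,b(a)} \cdot \tilde{A}_i^{(t+1)} \cdot \tilde{B}_{a,r(i)}^{(t)}$$

The factor $c_{r(i)}$ cancels, so the allocation is the same with or without normalization.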


Convergence

The loop runs for exactly NumberOfIterations steps (default: 10). No early-stopping criterion is applied; the result container always points to the last iteration:

Result := Iters/<last iter name>
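To show how the three steps fit into a fixed-length loop, here is a compact pure-Python sketch. It is a toy re-statement of the procedure described on this page, not the GeoDMS template: the function name `igor_ipf_sketch`, the `n_iter` parameter (standing in for NumberOfIterations), the nested-list data layout, and the zero-denominator fallbacks are all assumptions made for this example.

```python
def igor_ipf_sketch(P, E_sex, E_age, region, sex_of, age_of, Q, n_iter=10):
    """Run a fixed number of IPF iterations; return (X, B) from the last one."""
    n_cells, n_classes, n_regions = len(P), len(Q), len(Q[0])
    B = [[1.0] * n_regions for _ in range(n_classes)]  # initialise B_ar = 1
    X = [[0.0] * n_cells for _ in range(n_classes)]
    for _ in range(n_iter):  # no early stopping, as described above
        # Step 1: cell factor A_i and allocations X_ai.
        for i in range(n_cells):
            r = region[i]
            w = [E_sex[i][sex_of[a]] * E_age[i][age_of[a]] * B[a][r]
                 for a in range(n_classes)]
            total = sum(w)
            a_i = P[i] / total if total > 0 else 0.0  # assumed fallback
            for a in range(n_classes):
                X[a][i] = w[a] * a_i
        # Step 2: region sums Q_hat and multiplicative update of B.
        Q_hat = [[0.0] * n_regions for _ in range(n_classes)]
        for a in range(n_classes):
            for i in range(n_cells):
                Q_hat[a][region[i]] += X[a][i]
        B_tilde = [[B[a][r] * Q[a][r] / Q_hat[a][r] if Q_hat[a][r] > 0 else 0.0
                    for r in range(n_regions)] for a in range(n_classes)]
        # Step 3: rescale each region's factors by their maximum.
        for r in range(n_regions):
            top = max(B_tilde[a][r] for a in range(n_classes))
            for a in range(n_classes):
                B[a][r] = B_tilde[a][r] / top if top > 0 else 0.0
    return X, B


# Hypothetical toy data: one cell, one region, 2 sexes x 2 broad ages.
X, B = igor_ipf_sketch(
    P=[100.0],
    E_sex=[[0.5, 0.5]],
    E_age=[[0.2, 0.8]],
    region=[0],
    sex_of=[0, 0, 1, 1],
    age_of=[0, 1, 0, 1],
    Q=[[10.0], [20.0], [30.0], [40.0]],
)
print([round(X[a][0], 6) for a in range(4)])
```

With a single cell per region the region targets are matched after the second iteration, so the printed allocations should be approximately 10, 20, 30 and 40. With many cells per region, convergence is gradual and the residual diagnostics become relevant.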

Diagnostic outputs in Result/

| Container / attribute | Description |
| --- | --- |
| Diff_X_i_P_i | Residual cell error $\sum_a X_{ai} - P_i$ (should be ≈ 0 by construction) |
| X_asr | Modelled population per SexAgeClass summed to LAU level |
| Diff_X_asr_Q_asr | Residual region error, relative to region total |
| MAPE | Mean Absolute Percentage Error per LAU region (lower = better) |
| Error_Distr_vs_Expected_Shares | Share error per SexAgeClass per region |
| Error_Abs_vs_Expected_Sizes | Absolute population error per SexAgeClass per region |

See also

  • Data — what $P_i$, $E_{i,s}$, $E_{i,b}$, $Q_{ar}$ are and where they come from
  • Implementation — how the iteration template is built in GeoDMS
