Spatial Analytics · PostgreSQL/PostGIS · Composite Scoring
We evaluate how “well-resourced” each SA2 in selected SA4 zones of Greater Sydney is by building a spatial database and a composite scoring system. We integrated six datasets (ABS SA2/Population/Income, Retail Business counts, Transport for NSW GTFS stops, NSW DoE school catchments, NSW POI API), standardized geometries to GDA2020 (EPSG:7844), and performed all joins in PostGIS. Each indicator was normalized to a z-score, the z-scores were summed per SA2, and the sum was passed through a sigmoid to obtain a final score in (0, 1).
Key findings:
• Sydney – Inner West: consistently high scores across SA2s (dense infrastructure and transport).
• Sydney – Blacktown: largest internal disparity (south high; north low), indicating spatial inequality.
• Sydney – Eastern Suburbs: mixed performance.
• Pearson correlation between score and median income is weak and slightly negative (≈ −0.08), suggesting resource access is not simply a function of income in this subset.
We also added robustness checks via rank-based scoring and validated predictors with Lasso + OLS.
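The z-score and sigmoid scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the equal weighting of indicators, the use of the population standard deviation, and the toy input values are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one indicator across SA2s (population std; an assumption)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant indicator carries no contrast, so it contributes 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def sigmoid(x):
    """Logistic function; maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def composite_scores(indicators):
    """Sum per-SA2 z-scores over all indicators, then squash with a sigmoid.

    `indicators` maps an indicator name to a list of raw values,
    one value per SA2, in the same SA2 order for every indicator.
    """
    columns = [z_scores(col) for col in indicators.values()]
    totals = [sum(parts) for parts in zip(*columns)]
    return [sigmoid(t) for t in totals]


if __name__ == "__main__":
    # Made-up values for three hypothetical SA2s, for illustration only.
    toy = {
        "stops_per_km2": [12.0, 3.5, 7.0],
        "schools_per_1000_young": [1.1, 0.4, 0.9],
    }
    for sa2, score in zip(["sa2_a", "sa2_b", "sa2_c"], composite_scores(toy)):
        print(sa2, round(score, 3))
```

An SA2 that sits exactly at the mean on every indicator has a z-sum of 0 and therefore a score of 0.5. Because the sigmoid flattens for large positive or negative sums, SA2s far from the mean cluster near 1 or 0.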
Two choices made the work robust and explainable: (1) keeping geospatial logic inside PostGIS (spatial indexes, ST_Intersects/ST_Contains, and consistent SRIDs) and (2) separating indicator engineering from scoring so we could swap normalization (z-score vs. rank) without breaking the pipeline. The z-score + sigmoid path surfaced contrast clearly but can exaggerate extremes, since outlying SA2s are pushed toward 0 or 1; the rank-based variant, while simpler, was more stable and easier to communicate to policy audiences. Model validation reminded us that a single composite index rarely “explains” socioeconomic outcomes; Lasso/OLS helped quantify those limits and justify future variables (e.g., housing cost, land use). If iterating, we would expand the indicators, add time dynamics to capture “access volatility,” and publish a policy brief pairing low-scoring SA2s with actionable levers.
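One way the rank-based variant could work is sketched below. The source does not say how ranks were computed or combined, so everything here is an assumption for illustration: the function names, the average-rank tie handling, the rescaling to [0, 1], and the unweighted mean across indicators.

```python
def percentile_ranks(values):
    """Rank values from lowest to highest and rescale the ranks to [0, 1].

    Tied values share the average of the ranks they span.
    """
    n = len(values)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [(r - 1) / (n - 1) for r in ranks]


def rank_scores(indicators):
    """Average the rescaled ranks of each SA2 across all indicators.

    `indicators` maps an indicator name to a list of raw values,
    one value per SA2, in the same SA2 order for every indicator.
    """
    columns = [percentile_ranks(col) for col in indicators.values()]
    return [sum(parts) / len(parts) for parts in zip(*columns)]
```

Because only the ordering matters, a single extreme value (for example, one SA2 with an unusually high stop count) moves that SA2 to the top rank without changing the spacing between the others. This is consistent with the stability noted above.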