Completed

Education vs Experience: Earnings Model Comparison

Course Project · University of Sydney 2025 S2
Regression
Econometrics
Forecasting
Model Comparison
Omitted Variable Bias
Nonlinear Model
Out-of-sample
Policy Insight

Econometrics · Multiple Regression · Model Specification · Forecasting

Project Overview

This project examines how education and work experience jointly shape individual earnings using a structured model comparison framework.

Objective:
  • Quantify the relative importance of schooling vs. experience.
  • Evaluate sensitivity to model specification.
  • Improve predictive performance while maintaining interpretability.

Methodology:
  • Built multiple regression specifications with progressively richer controls.
  • Compared linear, extended, and nonlinear models.
  • Performed out-of-sample forecasting to evaluate real predictive power.
  • Assessed omitted-variable bias and model stability.

Key findings:
  • Education emerges as the strongest and most stable predictor of earnings.
  • Experience increases wages but shows clear diminishing marginal returns.
  • Extended models reduce omitted-variable bias and improve explanatory power.
  • The full nonlinear specification delivers the best out-of-sample forecasts.

Implication: Human capital accumulation through education remains the dominant long-run driver of earnings, while early-career experience primarily affects short-run wage growth.
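As a minimal sketch of the core specification, the snippet below fits a Mincer-style regression with a quadratic experience term on simulated data (hypothetical values and coefficients for illustration only, not the project's dataset). The negative coefficient on squared experience is what "diminishing marginal returns" means in this setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical simulated cross-section (illustration only)
educ = rng.integers(8, 21, n).astype(float)   # years of schooling
exper = rng.uniform(0, 40, n)                 # years of experience

# Assumed data-generating process: Mincer-style, concave in experience
log_wage = (1.5 + 0.10 * educ + 0.04 * exper
            - 0.0006 * exper**2 + rng.normal(0, 0.3, n))

# OLS: log(wage) = b0 + b1*educ + b2*exper + b3*exper^2
X = np.column_stack([np.ones(n), educ, exper, exper**2])
beta, *_ = np.linalg.lstsq(X, log_wage, rcond=None)

print(beta)  # b3 comes out negative: returns to experience flatten over time
```

The quadratic term is the simplest way to let the experience profile rise early and flatten later; richer splines or interactions follow the same pattern of adding columns to the design matrix.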

What I Did

  • Constructed a clean earnings panel and engineered core human-capital variables (education years, experience, and interaction terms).
  • Built a baseline Mincer-style regression and progressively extended specifications with demographic and cognitive controls.
  • Diagnosed model assumptions via residual patterns, functional-form checks, and multicollinearity review.
  • Estimated nonlinear specifications to capture diminishing returns to experience.
  • Performed out-of-sample forecasting to compare true predictive performance rather than relying only on in-sample fit.
  • Quantified omitted-variable bias by comparing coefficient stability across nested models.
  • Synthesized results into policy-relevant insights on education access and early-career labour dynamics.
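The omitted-variable bias check described above can be illustrated with nested models on simulated data (a hypothetical data-generating process, not the project's data): when unobserved ability raises both schooling and wages, the short regression overstates the return to education, and adding the control moves the coefficient back toward its true value.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000

# Assumed DGP: unobserved ability raises both schooling and wages
ability = rng.normal(0, 1, n)
educ = 12 + 2 * ability + rng.normal(0, 2, n)
log_wage = 1.0 + 0.08 * educ + 0.15 * ability + rng.normal(0, 0.3, n)

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)
b_short = ols(np.column_stack([ones, educ]), log_wage)           # omits ability
b_long = ols(np.column_stack([ones, educ, ability]), log_wage)   # controls for it

# Short model's education coefficient is biased upward;
# the long model recovers a value close to the true 0.08.
print(b_short[1], b_long[1])
```

Comparing coefficients across nested specifications like this is exactly the coefficient-stability diagnostic: large movements when a control enters signal that the shorter model was absorbing the omitted factor.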

Reflection

The most important lesson from this project is that model specification matters as much as model accuracy. Early linear models appeared adequate in-sample, but coefficient stability checks revealed sensitivity to omitted variables. By expanding the specification and testing nonlinear forms, the analysis became both more interpretable and more predictive.

Two practices proved especially valuable: (1) evaluating models with out-of-sample forecasts rather than relying purely on R², and (2) explicitly testing diminishing returns to experience.

If I were to extend this work, I would:
  • introduce panel or pseudo-panel structure,
  • explore causal identification strategies,
  • incorporate occupation and industry heterogeneity,
  • and build an interactive earnings simulator for policy scenarios.

This project strengthened my ability to connect econometric rigor with real labour-market interpretation, moving beyond "fit a regression" toward structured model reasoning.
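Practice (1), judging models by out-of-sample forecasts, can be sketched as a simple train/holdout comparison (hypothetical simulated data; the split size and coefficients are assumptions for illustration). When the true wage profile is concave in experience, the quadratic specification beats the linear baseline on holdout RMSE even though both may look fine in-sample.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1500

# Assumed DGP with diminishing returns to experience (illustration only)
educ = rng.uniform(8, 20, n)
exper = rng.uniform(0, 40, n)
y = 1.2 + 0.09 * educ + 0.05 * exper - 0.0012 * exper**2 + rng.normal(0, 0.25, n)

def rmse_out_of_sample(X, y, split=1000):
    """Fit OLS on the first `split` rows, return RMSE on the holdout rows."""
    beta, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    resid = y[split:] - X[split:] @ beta
    return np.sqrt(np.mean(resid**2))

ones = np.ones(n)
X_lin = np.column_stack([ones, educ, exper])             # linear baseline
X_quad = np.column_stack([ones, educ, exper, exper**2])  # nonlinear extension

rmse_lin = rmse_out_of_sample(X_lin, y)
rmse_quad = rmse_out_of_sample(X_quad, y)
print(rmse_lin, rmse_quad)  # quadratic forecasts better on the holdout
```

The point of the holdout is that in-sample R² always rewards extra terms, while forecast error on unseen rows only improves when the added structure reflects the data-generating process.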
