ECON 452 (A & B) Winter 2001 C. Ferrall / A. Gregory
OUTLINE FOR PART II
- Introduction
- The goal of Part II is for you to learn three skills:
- How to carry out econometric studies using data
from cross-sectional surveys (in particular, DLI surveys
stored in the QED Data Archive).
- How to report results from your study so that people will understand them.
- How to model a kind of variable that often appears in survey data:
limited-dependent variables.
- Some Questions About These Goals
- Are these good goals?
- Does the focus in Part II complement the focus in Part I?
- Are the projects designed to achieve these goals?
- Why am I asked to find and read an article using similar data?
- Should I expect a certain amount of frustration in carrying out these projects?
- Are the deadlines firm?
- Getting and Reporting Econometric Results Based on Survey Data
- Example: The Economics of Abuse
- Survey data isn't always easy to work with
- Sometimes you have to worry about sampling weights
- DLI files contain many masked or censored variables
- Even when not masked or censored, pay attention to missing observations
- Most survey questions have qualitative and categorical answers
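The points above about sampling weights and missing observations can be illustrated with a minimal Python sketch (the course itself uses Stata, so this is only an illustration). The wage values, the weights, and the missing-value code -9 are all hypothetical; check the codebook of your own survey for the codes it actually uses.

```python
import numpy as np

# Hypothetical survey extract: hourly wage and a sampling weight per respondent.
# A weight is the number of people in the population the respondent stands for.
wage = np.array([10.0, 20.0, -9.0, 30.0, 40.0])      # -9 is an assumed "missing" code
weight = np.array([100.0, 100.0, 200.0, 300.0, 500.0])

# Drop missing observations before computing anything.
valid = wage != -9.0

# Treating the missing code as a real wage drags the mean down.
naive_mean = wage.mean()

# Unweighted mean over valid observations describes the sample only.
sample_mean = wage[valid].mean()

# Weighted mean over valid observations estimates the population mean.
weighted_mean = np.average(wage[valid], weights=weight[valid])

print(naive_mean, sample_mean, weighted_mean)
```

The three numbers differ, which is the point: both the handling of missing codes and the use of weights change the answer you report.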
- "Section II: The Data"
See the annotations in the Bowlus and Seitz paper for more details.
- Tell your reader about your data sources.
- Tell your reader how you selected observations and manipulated the data
- Show your reader tables of summary statistics and, in the text,
briefly discuss the patterns in the tables
- Show your reader some informative graphs
- Place supporting material or information that is difficult to display in
prose into a data appendix
- A Quick Review of Multiple Linear Regression (MLRM)
Based on: Multiple Regression Notes
Supplementary material: Mike Abbott's Econ 351 notes
(and the corresponding sections of Gujarati)
- Ordinary Least Squares (OLS) is perfect for the MLRM
- The MLRM is defined by eight assumptions, A0-A7
- The OLS estimator is easy to derive and its formula only depends on A0
- The statistical properties of OLS depend crucially on A0-A7
- Using OLS and all the assumptions of MLRM you can carry out hypothesis
tests, construct confidence intervals, and make predictions
- Doing all this is a breeze in Stata
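As a rough illustration of the OLS steps listed above (estimate, test, build confidence intervals, predict), here is a minimal Python sketch using NumPy rather than Stata. The data are simulated, all variable names are hypothetical, and the 1.96 critical value is the large-sample normal approximation rather than the exact t value.

```python
import numpy as np

rng = np.random.default_rng(452)

# Simulated data (hypothetical): a constant and two regressors.
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.5, size=n)

# OLS estimator: solve the normal equations (X'X) b = X'y.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Residuals and the unbiased estimate of the error variance.
resid = y - X @ beta_hat
k = X.shape[1]
sigma2_hat = resid @ resid / (n - k)

# Conventional standard errors and t-ratios for H0: beta_j = 0.
cov_beta = sigma2_hat * np.linalg.inv(XtX)
se = np.sqrt(np.diag(cov_beta))
t_ratio = beta_hat / se

# Approximate 95% confidence intervals.
ci_lower = beta_hat - 1.96 * se
ci_upper = beta_hat + 1.96 * se

# Point prediction for a new observation with hypothetical regressor values.
x_new = np.array([1.0, 0.5, -1.0])
y_pred = x_new @ beta_hat

print(beta_hat, se, t_ratio, y_pred)
```

Computing the estimator needs only that X'X be invertible; the standard errors, tests, and intervals lean on the remaining assumptions.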
- "Section III: Empirical Results"
- Show your reader well-constructed tables of regressions
- Briefly discuss or interpret most of the regression coefficients
- Bury unimportant coefficients or models in appendices, footnotes, or even deeper!
- When the assumptions of the MLRM don't hold, neither do the nice properties
of OLS
- It's not hard to correct for violations of constant variance (A6)
using standard errors robust to heteroscedasticity
- And you already spent six weeks worrying about serial correlation (~A5), so we'll skip that
- It's tougher to deal with endogenous or error-correlated
regressors (~A3), and we must skip this problem due to time constraints.
- So we will focus on the linearity assumption in the MLRM.
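The heteroscedasticity-robust standard errors mentioned above can be sketched in a few lines of Python. This is the basic White (HC0) sandwich formula applied to simulated data in which the error spread grows with x; statistical packages often apply an additional small-sample scaling, so their reported numbers may differ slightly from this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical) where the error standard deviation grows with x,
# so the constant-variance assumption fails by construction.
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(size=n) * x

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Conventional covariance s^2 (X'X)^-1: valid only under constant variance.
s2 = resid @ resid / (n - X.shape[1])
se_conventional = np.sqrt(np.diag(s2 * XtX_inv))

# White (HC0) sandwich: (X'X)^-1 [sum_i e_i^2 x_i x_i'] (X'X)^-1.
meat = (X * resid[:, None] ** 2).T @ X
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_conventional, se_robust)
```

The coefficient estimates are unchanged; only the standard errors (and therefore the tests and intervals) are corrected.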
- An Even Quicker Introduction to Maximum Likelihood Estimation
- Maximum Likelihood starts with a model of the whole population distribution
that depends on some unknown population (or model) parameters
- The sample probability function treats data as variable and model
parameters as fixed. The likelihood function is really
just the probability function, but it treats model parameters as variable
and data as fixed
- Finding parameter values that maximize the likelihood function generally
leads to statistically consistent and asymptotically normal estimates
of the population parameters.
- Under A0-A7 the ML estimates of the NMLRM are the same as the OLS estimates!
- Let's see that using Stata's maxlik command ...
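Before turning to Stata, the claim that ML and OLS coincide under normal errors can be checked numerically. The Python sketch below uses simulated data and hypothetical names: it writes the normal log-likelihood as a function of the parameters with the data held fixed, then confirms that the score is zero at the OLS coefficients and that moving away from them lowers the likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated sample (hypothetical) from a linear model with normal errors.
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, 1.0]) + rng.normal(scale=0.8, size=n)

def log_likelihood(beta, sigma2):
    """Log-likelihood under y_i ~ N(x_i'beta, sigma2): parameters vary, data fixed."""
    resid = y - X @ beta
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - resid @ resid / (2.0 * sigma2)

# OLS coefficients and the ML variance estimate (divides by n, not n - k).
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
resid_ols = y - X @ beta_ols
sigma2_ml = resid_ols @ resid_ols / n

# Score with respect to beta, evaluated at the OLS coefficients.
score_beta = X.T @ resid_ols / sigma2_ml

# Likelihood at the OLS coefficients versus 20 randomly perturbed coefficient vectors.
ll_at_ols = log_likelihood(beta_ols, sigma2_ml)
ll_perturbed = [log_likelihood(beta_ols + d, sigma2_ml)
                for d in 0.1 * rng.normal(size=(20, 2))]

print(score_beta, ll_at_ols, max(ll_perturbed))
```

The coefficient estimates agree exactly; the only difference is that the ML variance estimate divides the sum of squared residuals by n rather than n - k.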
- Limited Dependent Variables (LDVs)
- Introduction
- There are lots of Kinds and Examples of LDVs (Binary, Censored, Multinomial,
and Truncated Outcomes)
- Economic Models of Individual Choices Often Lead to LDV Statistical Models
- Binary Outcomes are the simplest type of LDV
- The Linear Probability Model (LPM) just means running OLS on a binary variable
- Logit and Probit are non-linear models for binary outcomes estimated using ML.
- Interpreting coefficients in these models is a little trickier than in the MLRM
- MLE of logit/probit is easy in Stata
- You can do hypothesis testing and prediction
- Present results of logit/probit models pretty much like OLS, with a few differences
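To make the binary-outcome items above concrete, here is a minimal Python sketch that fits a linear probability model and a logit to the same simulated data. The Newton-Raphson loop is one standard way to maximize the logit likelihood and is not necessarily what Stata does internally. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated binary outcome (hypothetical) generated from a logit model.
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([-0.5, 1.0])
p_true = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
y = (rng.uniform(size=n) < p_true).astype(float)

# Linear probability model: plain OLS on the 0/1 outcome.
beta_lpm = np.linalg.solve(X.T @ X, X.T @ y)

# Logit by Newton-Raphson on the log-likelihood.
beta_logit = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ beta_logit)))
    score = X.T @ (y - p)                              # gradient of the log-likelihood
    hessian = -(X * (p * (1.0 - p))[:, None]).T @ X    # second derivatives
    step = np.linalg.solve(hessian, score)
    beta_logit = beta_logit - step
    if np.max(np.abs(step)) < 1e-10:
        break

# Fitted probabilities and standard errors from the inverse information matrix.
p_hat = 1.0 / (1.0 + np.exp(-(X @ beta_logit)))
info = (X * (p_hat * (1.0 - p_hat))[:, None]).T @ X
se_logit = np.sqrt(np.diag(np.linalg.inv(info)))

# Average marginal effect of x: the logit slope scaled by the mean of p(1 - p).
ame_x = np.mean(p_hat * (1.0 - p_hat)) * beta_logit[1]

print(beta_lpm, beta_logit, se_logit, ame_x)
```

The logit slope is not itself a change in probability. The average marginal effect is the quantity comparable to the LPM slope, which is why interpretation takes an extra step.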
- Multinomial Outcomes require more structure than binary outcomes
- If you want to model letter marks (A,B,C,etc), use the Ordered Logit/Probit Model
- If you want to model something like which way people get to work, use the Multinomial Logit/Probit Model
- MLE works more or less the same as before.
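The extra structure in the two multinomial models shows up in how each turns an index into outcome probabilities. The Python sketch below computes those probabilities for given parameter values (it does not estimate anything). The cutpoints, index values, and outcome labels are hypothetical.

```python
import numpy as np

def ordered_logit_probs(xb, cutpoints):
    """P(y = j) for ordered categories, given index xb and increasing cutpoints."""
    # P(y <= j) is the logistic CDF evaluated at (cutpoint_j - xb).
    inner = 1.0 / (1.0 + np.exp(-(np.asarray(cutpoints) - xb)))
    cdf = np.concatenate(([0.0], inner, [1.0]))
    return np.diff(cdf)

def multinomial_logit_probs(v):
    """P(y = j) for unordered alternatives, given one index v_j per alternative."""
    expv = np.exp(np.asarray(v) - np.max(v))   # subtract the max for numerical stability
    return expv / expv.sum()

# Ordered example: marks D < C < B < A, three cutpoints split the latent scale.
marks_low = ordered_logit_probs(xb=0.3, cutpoints=[-1.0, 0.5, 2.0])
marks_high = ordered_logit_probs(xb=1.3, cutpoints=[-1.0, 0.5, 2.0])

# Unordered example: car, bus, bicycle, with the first index normalized to zero.
commute = multinomial_logit_probs([0.0, -0.4, -1.2])

print(marks_low, marks_high, commute)
```

In the ordered model a single index shifts probability along the ranking, so raising xb raises the chance of the top mark. In the multinomial model each alternative has its own index and only differences between indexes matter.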