Thomas Crossley, Peter Levell, and Stavros Poupakis, "Regression with an Imputed Dependent Variable", Journal of Applied Econometrics, Vol. 37, No. 7, 2022, pp. 1277-1294.

These files can be used to replicate our empirical exercise using the Consumer Expenditure Survey (CEX) and Panel Study of Income Dynamics (PSID) datasets in Stata. They replicate Figure 2 "standard deviation of log consumption", Table 1 "Imputing nondurable consumption spending using CES" and Table 2 "Empirical Example: Log nondurable consumption on house values".

/*---- Data ------*/

The zip file clp-files.zip contains two datasets: cexdata.csv (13.3 MB, N = 37,522) and psiddata.csv (16 MB, N = 37,000).

/*----- Files -----*/

The zip file also contains the following Stata do files:

master.do
This calls the other files.

preppsid.do
This file sets up the PSID data for analysis. It uses the psidtools command, which needs to be installed (ssc install psidtools). It also requires the user to have downloaded the PSID from https://simba.isr.umich.edu/Zips/ZipMain.aspx. It saves the file psiddata.csv, which is then used for further analysis in psid_reg.do and run_AM.do.

prepcex.do
This file takes CEX raw data and creates the file cexdata.csv, which is used to run the first stage regressions of our imputation procedure in cex_reg.do. The file draws on CEX datasets with demographic and expenditure data (cexdemogs and cexexp) that were created using raw files from various years downloaded from the Bureau of Labor Statistics (https://www.bls.gov/cex/). These larger files are available on request. The do file is provided to give background on how variables were constructed.

cex_reg.do
This file runs the first stage regressions used to impute consumption into the PSID. The results are saved in the "estimates" folder so that they can be reloaded later. This is also where the results for Table 1 are saved.

psid_reg.do
This file carries out the PSID regressions for the regression prediction and rescaled regression prediction approaches. The results are saved in log/psidresults and log/doublelength_artificialregression. The file also plots Figure 2.

run_AM.do
This file implements the GMM approach needed to compute the estimates for column 4 in Table 2. It takes a long time to run from scratch. For this reason, we recommend starting from the results saved from a previous run, available in "estimates\gmm_estimates_onestep".

rrp.ado
An ado file for running the two-step rescaled regression prediction procedure described in the paper.

/*----- Online appendix code -----*/

The code in the folder "replication -- online appendix" carries out the Monte Carlo experiments contained in Appendix C of the paper.

The file rrp.do is a program that performs the RRP procedure.
The program illustrate.do contains a simpler version of the code that demonstrates the two-step procedure and the standard error correction of the RRP procedure.
The program tableC1.do replicates Table C1 in the Online Appendix.
The program tableC2-C7.do replicates Tables C2-C7 in the Online Appendix.

The user needs to run rrp.do first, as both table programs require the rrp program to be installed.

For further information on this code, please contact peter_l [AT] ifs.org.uk
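
As a rough illustration of the run order described above, a Stata session might look like the sketch below. It assumes that clp-files.zip has been extracted, that Stata's working directory is the extracted folder, and that the online appendix folder sits directly inside it. None of these are stated in this readme, so adjust the paths as needed. Note that master.do calls all of the other files, including the slow run_AM.do step and preppsid.do, which needs the PSID download.

```stata
* Sketch only: paths and working directory are assumptions, not part of the package.

* psidtools is required by preppsid.do
ssc install psidtools

* Optional check: compare the observation counts with the N reported above
import delimited using "cexdata.csv", clear
count
import delimited using "psiddata.csv", clear
count

* Main replication: master.do calls the other do files
do master.do

* Online appendix: rrp.do must be run before the table programs
* (folder location relative to the main folder is assumed)
cd "replication -- online appendix"
do rrp.do
do tableC1.do
do "tableC2-C7.do"
```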