In Choi and Hanbat Jeong, "Differencing versus Non-Differencing in Factor-Based Forecasting", Journal of Applied Econometrics, Vol. 35, No. 6, 2020, pp. 728-750.

This readme file describes the files used for the simulations and the empirical application.

All simulation code files are zipped in cj_simulations.zip.
All data files and associated Matlab files for the empirical application are zipped in cj_application.zip.

****** Contents ******
**********************
** Files in "cj_simulations.zip"
** Files in "cj_application.zip"
** Details on DATA.txt
** Details on DDATA.txt
** Details on keyvar_inf.txt
** Details on keyvar.txt
** References

****** Files in "cj_simulations.zip" ******
*******************************************

simulation.m
- One can adjust the maximum # of factors (kmax), T (# of time series observations for factor estimation), N (# of cross-section units), r (# of factors), h (forecasting horizons), nc (# of I(1) variables among the full data), and parameter values.

"simulation.m" uses the following functions.

genar.m (function)
- Generates an AR(1) process: e_{t} = \alpha*e_{t-1} + v_{t}. Inputs of the function are the parameter \alpha, T, N, and throw. Note that "throw" is the number of time series observations that are thrown out.

generroreffi.m (function)
- Generates equation (18) in Section 3. That is, it generates the idiosyncratic components in the factor model. Inputs of the function are T, N, r, and throw.

estimation_fac.m (function)
- Estimates principal components based on the process X_{t} = F_{t}*\Lambda + e_{t}. Inputs of this function are X (T by N data matrix), r (# of factors), and method. If "method == 0", this function takes the I(1) variable approach (see Bai, 2004). On the other hand, if "method == 1", this function takes the I(0) variable approach (see Bai, 2003).
Outputs of this function are Fhat (T by r estimated factors), Lhat (N by r estimated factor loadings), ehat (estimated idiosyncratic components), and vk (the estimated variance of the idiosyncratic components). Note that vk is used for computing the criteria of Bai and Ng (2002) and Bai (2004).

demeaning.m (function)
- This function is used by estimation_fac.m to demean the data X.

****** Files in "cj_application.zip" ******
*******************************************

Application.m
- This program generates forecasting results for each time series. One can generate the forecasts from the two models (in this paper) for the three sampling periods: (i) Pre Great Moderation period, (ii) Great Moderation period, and (iii) Crisis and aftermath period. One can adjust rmax (maximum # of factors in the forecasting equation), kmax (maximum # of time lags in the forecasting equation), and h (forecasting horizon). For this, we need to load five files:

DATA.txt contains 103 I(1) predictors (706 by 103 matrix).
DDATA.txt contains 123 I(0) predictors (706 by 123 matrix).
Note that DATA.txt and DDATA.txt are for factor estimation. For details, see the descriptions below.

dateyearmonth.txt contains year/month for the 706 time series observations (for each variable) (e.g., 196001 = Jan 1960).

keyvar_inf.txt contains the main variables for forecasting inflation rates.
keyvar.txt contains the main variables for forecasting the other 68 I(1) macro variables.
Note that keyvar_inf.txt and keyvar.txt are for the forecasting equations. They contain the target variables for prediction. For details, see the descriptions below.

The datasets above come from the FRED-MD dataset, which can be found at the following website: https://research.stlouisfed.org/econ/mccracken/fred-databases/.

"Application.m" uses the following functions.

IPC_1.m (function)
- This function determines the number of factors based on Bai's (2004) IPC_1 criterion.
Inputs of this function are X (data) and rmax (the pre-specified maximum # of factors).

cross_validation.m (function)
- This is the key function generating forecasts from the two models. By performing the cross-validation method (for details, see Section 4 of the paper), this function generates optimal forecasts. Inputs of this function are XDATA, ZDATA, YDATA, yDATA (which are matrices of datasets), h (forecasting horizon), r_0 (pre-specified maximum # of factors for forecasting), kmax (pre-specified maximum # of lags for forecasting), and method. If "method == 0", one can generate forecasts based on our I(1) approach. On the other hand, if "method == 1", one can generate forecasts based on the conventional I(0) approach. Outputs of this function are (i) optimal forecasts, (ii) the optimally chosen # of factors (for prediction), and (iii) the optimally chosen # of time lags.

dbtest.m (function)
- This function generates the Diebold-Mariano (1995) test results. Inputs of this function are e_1t and e_2t (the two series of forecasting errors), and method. If "method == 1", the quadratic loss function is used. On the other hand, if "method == 2", the absolute value loss function is used.

specden.m (function)
- This function calculates the spectral density of the loss differential (d_t), which is used by dbtest.m.

****** Details on DATA.txt ******
*********************************
The data consist of 103 I(1) variables from Jan 1960 to Oct 2018. The variables are listed below in column order. For example, the first variable, M1, is located in the first column of DATA.txt. For detailed descriptions of the variables, refer to the Appendix of McCracken and Ng (2016) (gsi:description).

M1
M2
MB
Reserves tot
C&I loans
DC&I loans
Cons credit
PPI: fin gds
PPI: cons gds
PPI: int mat'ls
PPI: crude mat'ls
Commod: spot price
Sens mat'ls price
CPI-U: all
CPI-U: apparel
CPI-U: transp
CPI-U: medical
CPI-U: comm.
CPI-U: dbles
CPI-U: services
CPI-U: ex food
CPI-U: ex shelter
CPI-U: ex med
PCE defl
PCE defl: dlbes
PCE defl: nondble
PCE defl: services
AHE: goods
AHE: const
AHE: mfg
non (MZM Money Stock)
non (Consumer Motor Vehicle Loans Outstanding)
non (Total Consumer Loans and Leases Outstanding)
non (Securities in Bank Credit at All Commercial Banks)
Reserves nonbor
PI
PI less transfers
Consumption
M&T sales
Retail sales
IP: total
IP: products
IP: final prod
IP: cons gds
IP: cons dble
iIP:cons nondble
IP:bus eqpt
IP: matls
IP: dble mats
IP:nondble mats
IP: mfg
IP: res util
IP: fuels
Emp CPS total
Emp CPS nonag
U < 5 wks
U 5-14 wks
U 15+ wks
U 15-26 wks
U 27+ wks
UI claims
Emp: total
Emp: gds prod
Emp: mining
Emp: const
Emp: mfg
Emp: dble gds
Emp: nondbles
Emp: services
Emp: TTU
Emp: wholesale
Emp: retail
Emp: FIRE
Emp: Govt
Orders: dble gds
Unf orders: dble
M&T invent
M2 (real)
S&P 500
S&P: indust
S&P PE ratio
Ex rate: Switz
Ex rate: Japan
Ex rate: UK
EX rate: Canada
Cap util
Help wanted indx
Help wanted/emp
U: all
U: mean duration
Overtime: mfg
M&T invent/sales
Inst cred/PI
S&P div yield
FedFunds
Commpaper
3 mo T-bill
6 mo T-bill
1 yr T-bond
5 yr T-bond
10 yr T-bond
Aaabond
Baa bond

****** Details on DDATA.txt ******
**********************************
M1
M2
MB
Reserves tot
C&I loans
DC&I loans
Cons credit
PPI: fin gds
PPI: cons gds
PPI: int mat'ls
PPI: crude mat'ls
Commod: spot price
Sens mat'ls price
CPI-U: all
CPI-U: apparel
CPI-U: transp
CPI-U: medical
CPI-U: comm.
CPI-U: dbles
CPI-U: services
CPI-U: ex food
CPI-U: ex shelter
CPI-U: ex med
PCE defl
PCE defl: dlbes
PCE defl: nondble
PCE defl: services
AHE: goods
AHE: const
AHE: mfg
non (MZM Money Stock)
non (Consumer Motor Vehicle Loans Outstanding)
non (Total Consumer Loans and Leases Outstanding)
non (Securities in Bank Credit at All Commercial Banks)
Reserves nonbor
PI
PI less transfers
Consumption
M&T sales
Retail sales
IP: total
IP: products
IP: final prod
IP: cons gds
IP: cons dble
iIP:cons nondble
IP:bus eqpt
IP: matls
IP: dble mats
IP:nondble mats
IP: mfg
IP: res util
IP: fuels
Emp CPS total
Emp CPS nonag
U < 5 wks
U 5-14 wks
U 15+ wks
U 15-26 wks
U 27+ wks
UI claims
Emp: total
Emp: gds prod
Emp: mining
Emp: const
Emp: mfg
Emp: dble gds
Emp: nondbles
Emp: services
Emp: TTU
Emp: wholesale
Emp: retail
Emp: FIRE
Emp: Govt
Orders: dble gds
Unf orders: dble
M&T invent
M2 (real)
S&P 500
S&P: indust
S&P PE ratio
Ex rate: Switz
Ex rate: Japan
Ex rate: UK
EX rate: Canada
Cap util
Help wanted indx
Help wanted/emp
U: all
U: mean duration
Overtime: mfg
M&T invent/sales
Inst cred/PI
S&P div yield
FedFunds
Commpaper
3 mo T-bill
6 mo T-bill
1 yr T-bond
5 yr T-bond
10 yr T-bond
Aaabond
Baa bond
Avg hrs
Avg hrs: mfg
CP-FF spread
3 mo-FF spread
6 mo-FF spread
1 yr-FF spread
5 yr-FFspread
10yr-FF spread
Aaa-FF spread
Baa-FF spread
HStarts: Total
HStarts: NE
HStarts: MW
HStarts: South
HStarts: West
BP: total
BP: NE
BP: MW
BP: South
BP: West

****** Details on keyvar_inf.txt ******
***************************************
Variables in this file are for predicting the U.S. inflation rates.

- Columns 1 ~ 3: Variables for our I(1) approach
U.S. inflation rates
unemployment rate
term spread

- Columns 4 ~ 6: Variables for the conventional approach
Time differenced variables from Column 1 to Column 3

****** Details on keyvar.txt ******
***********************************
Variables in this file are for predicting the other 68 I(1) time series.

- Columns 1 ~ 68: Variables for our I(1) approach
PI
PI less transfers
Consumption
M&T sales
Retail sales
IP: total
IP: products
IP: final prod
IP: cons gds
IP: cons dble
iIP:cons nondble
IP:bus eqpt
IP: matls
IP: dble mats
IP:nondble mats
IP: mfg
IP: res util
IP: fuels
Emp CPS total
Emp CPS nonag
U < 5 wks
U 5-14 wks
U 15+ wks
U 15-26 wks
U 27+ wks
UI claims
Emp: total
Emp: gds prod
Emp: mining
Emp: const
Emp: mfg
Emp: dble gds
Emp: nondbles
Emp: services
Emp: TTU
Emp: wholesale
Emp: retail
Emp: FIRE
Emp: Govt
Orders: dble gds
Unf orders: dble
M&T invent
M2 (real)
S&P 500
S&P: indust
S&P PE ratio
Ex rate: Switz
Ex rate: Japan
Ex rate: UK
EX rate: Canada
Cap util
Help wanted indx
Help wanted/emp
U: all
U: mean duration
Overtime: mfg
M&T invent/sales
Inst cred/PI
S&P div yield
FedFunds
Commpaper
3 mo T-bill
6 mo T-bill
1 yr T-bond
5 yr T-bond
10 yr T-bond
Aaabond
Baa bond

- Columns 69 ~ 136: Variables for the conventional approach
Time differenced variables from Column 1 to Column 68

****** References ******
************************
Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica, 71, 135-172.

Bai, J. (2004). Estimating cross-section common stochastic trends in nonstationary panel data. Journal of Econometrics, 122, 137-183.

Diebold, F.X., and R.S. Mariano (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253-263.

McCracken, M.W., and S. Ng (2016). FRED-MD: A Monthly Database for Macroeconomic Research. Journal of Business and Economic Statistics, 34, 574-589.
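
****** Illustrative sketch: AR(1) generation with burn-in ******
****************************************************************
The description of genar.m above mentions the recursion e_{t} = \alpha*e_{t-1} + v_{t} and a "throw" input giving the number of observations that are thrown out. The following is a minimal, independent sketch in Python of how such a generator might look (the package itself is written in Matlab). It is not the authors' code. The function name gen_ar1, the seed argument, the standard normal innovations v_{t}, the zero starting value, and the choice to discard the first "throw" observations are all assumptions made for illustration.

```python
import random


def gen_ar1(alpha, T, N, throw, seed=None):
    """Simulate N independent AR(1) series e_t = alpha*e_{t-1} + v_t.

    Returns a list of T rows, each with N values. Assumptions of this
    sketch (not taken from genar.m): v_t ~ N(0, 1), e_0 = 0, and the
    first `throw` simulated observations are the ones discarded.
    """
    rng = random.Random(seed)
    prev = [0.0] * N              # assumed starting value e_0 = 0
    kept = []
    for t in range(T + throw):
        # One step of the recursion for every cross-section unit.
        cur = [alpha * prev[i] + rng.gauss(0.0, 1.0) for i in range(N)]
        if t >= throw:            # keep only post-burn-in observations
            kept.append(cur)
        prev = cur
    return kept


e = gen_ar1(alpha=0.5, T=200, N=3, throw=100, seed=0)
print(len(e), len(e[0]))  # prints: 200 3
```

Simulating T + throw observations and keeping only the last T reduces the influence of the arbitrary starting value on the retained sample, which is presumably the purpose of the "throw" input.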