Time Series Regression

Henry Thompson

email for the full text

Introduction

Suppose an economic model reduces to the effect of exogenous variable Xt on endogenous variable Yt with exogenous control variable Zt.  Subscripts refer to the time period.  In general functional form, the economic relationship of interest is

Yt = f(Xt, Zt)     (1)

where Xt and Zt may be vectors.  This introduction to time series regression focuses on estimating such relationships.  The issue for economic theory is the significance, sign, and size of the partial derivative effect of Xt on Yt holding Zt constant.  Lags Xt-1 and Zt-1 may be more critical than contemporaneous effects, extending to Xt-i and Zt-j in high frequency data.

The reduced form equation should be derived from a structural theoretical model.  In time series analysis, “reduced form” also refers to a model with lags of variables that are exogenous in the sense that they occurred previously, Yt = f(Xt-i, Zt-j).  Estimated parameters can lead to the derivation of structural coefficients in the theoretical model.  For instance, the demand elasticity in a market model can be derived from the estimated reduced form equations of that model.

Economic time series depend in part on the process underlying their history.  The best predictor of yt+1 may be yt.  The yt series reflects the history of the influence of truly related variables under actual circumstances.  Univariate regressions of yt involving only its own history, plus perhaps time itself as a variable, can be very successful predictors of yt+1.  Such univariate models do not reveal economic relationships of interest, however, since economics is based on relationships between variables.  Univariate pretests are nonetheless critical for economic analysis, as the properties of the series in (1) lay the foundation for successful specification of regressions.

Models with lags of independent variables are better predictors than univariate models when unexpected changes occur in independent variables.  Successful estimates of reduced form equations isolate the effects of exogenous variables and may lead to derived structural parameters.  The goals of time series regression are to focus and improve economic theory. 

OLS regression assumes normally distributed variables with each observation equal to its mean plus a random error term.  A positive regression coefficient indicates above average observations of Xt are associated with above average observations of Yt holding exogenous control variable Zt constant.  Standard errors are based on variables with normal distributions.

If a series has a positive trend, early observations are below the mean and later ones above it.  There is no clustering around the mean as with a normal distribution; the series simply passes through the mean.  A trending variable has a low peak and fat tails relative to the normal distribution.  A regression on nonstationary variables understates standard errors, resulting in inflated parameter significance and overstated explanatory power.  The first step in estimating a regression is therefore to identify the underlying autoregressive processes and ensure that no variable is trending.

The typical problem in applied time series regression is trending variables.  If theory suggests Xt should have a positive effect on Yt and both have positive trends, they will be correlated and regression coefficients will appear significant.  Observations of Xt below its mean will be associated with observations of Yt below its mean simply because they occur at an earlier time.  The resulting residual correlation overstates parameter significance and explanatory power.  Residual correlation implies information remains in the residual, suggesting model misspecification.   
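
The mechanics can be illustrated with a short simulation. The sketch below, in Python using only the standard library, builds two series that are unrelated except for a positive time trend and compares their correlation in levels and in first differences. The trend slopes, noise scale, and sample length are hypothetical choices, not values from the text.

```python
import random

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def difference(series):
    """First differences of a series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

rng = random.Random(0)
T = 200
# Two series that share nothing but a positive time trend (assumed slopes).
x = [0.05 * t + rng.gauss(0, 1) for t in range(T)]
y = [0.03 * t + rng.gauss(0, 1) for t in range(T)]

corr_levels = correlation(x, y)                           # driven by the common trend
corr_diffs = correlation(difference(x), difference(y))    # trend removed
print(f"levels: {corr_levels:.2f}  differences: {corr_diffs:.2f}")
```

The correlation in levels is large because early observations of both series lie below their means and late observations above them. Differencing removes the trend, and the correlation should fall toward zero.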

A stationary time series converges to a dynamic equilibrium steady state.  The series may not be normally distributed as it approaches the steady state but should be nearly so.  A sample period does not include the entire “history” of a variable, which is assumed to extend into the future as well.  The choice of sample period may produce a trending variable that would be stationary or even normal in a longer sample.  If variables in a regression are stationary, standard errors are typically reliable.  Applied time series analysis focuses in part on the standard errors distorted by trends.
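
Convergence to a steady state can be sketched with the simplest stationary process. The text does not specify a process, so the first order autoregression y_t = a0 + a1*y_{t-1} + e_t with |a1| < 1 used here is an assumption, as are the parameter values. Under that assumption the steady state is a0/(1 - a1).

```python
import random

def fit_ar1(y):
    """OLS of y_t on a constant and y_{t-1}; returns (intercept, slope)."""
    lagged, current = y[:-1], y[1:]
    n = len(current)
    mean_lag, mean_cur = sum(lagged) / n, sum(current) / n
    sxx = sum((u - mean_lag) ** 2 for u in lagged)
    sxy = sum((u - mean_lag) * (v - mean_cur) for u, v in zip(lagged, current))
    slope = sxy / sxx
    return mean_cur - slope * mean_lag, slope

rng = random.Random(1)
true_a0, true_a1 = 2.0, 0.7       # hypothetical parameters with |a1| < 1
y = [0.0]                         # start away from the steady state
for _ in range(400):
    y.append(true_a0 + true_a1 * y[-1] + rng.gauss(0, 1))

a0_hat, a1_hat = fit_ar1(y)
steady_state = a0_hat / (1 - a1_hat)   # long-run level implied by the estimates
print(f"a1 = {a1_hat:.2f}  implied steady state = {steady_state:.2f}")
```

An estimated a1 close to or above one would instead point to a nonstationary series, for which the steady state calculation is not meaningful.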

A simple regression related to (1) is

yt = α0 + α1xt + α2zt + εt.

Variables are transformed to natural logs, yt = lnYt and so on.  Log linear regression coefficients are point estimates of elasticities.  The goal is to interpret theory in terms of the estimated elasticity α1 = ∂yt/∂xt, although the ultimate form of the time series regression may not be so simple.  The individual time series processes determine the form of the variables.  Variables can be transformed with differences and lags.  The regression may include structural breaks, time itself, and the variance of the series.  Variables should be stationary.  White noise residuals of underlying univariate processes are candidate variables.  The error correction model (ECM) includes the residual εt in a difference equation regression.
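
The elasticity interpretation of the log linear coefficient can be written out as follows, using only the definitions of the logged variables.

```latex
\begin{align*}
y_t &= \ln Y_t, \qquad x_t = \ln X_t, \\
\alpha_1 &= \frac{\partial y_t}{\partial x_t}
          = \frac{\partial \ln Y_t}{\partial \ln X_t}
          = \frac{\partial Y_t}{\partial X_t}\,\frac{X_t}{Y_t}
          \approx \frac{\%\Delta Y_t}{\%\Delta X_t}.
\end{align*}
```

As a purely illustrative number, an estimate of α1 = 0.5 would mean a 10% increase in Xt is associated with an increase of about 5% in Yt, holding Zt constant.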

Begin with a theoretical model to relate theoretical parameters to estimated coefficients.  Rely on theory and preliminary regressions to suggest the most relevant exogenous and control variables.  The residual εt has to be white noise (WN) with zero mean, no residual correlation, and constant variance.  The residual correlation r(εt, εt-1) plays a critical role in time series regression.
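
The residual correlation can be checked directly after estimation. The sketch below is a simplified illustration with a single regressor rather than the two in the regression above. It computes r(εt, εt-1) and the closely related Durbin-Watson statistic, which is near 2 for white noise and falls toward 0 under positive residual correlation. The simulated data and the persistence value of 0.8 are assumptions.

```python
import random

def ols_simple(x, y):
    """Intercept, slope, and residuals of y = a0 + a1*x + e by OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    a1 = sxy / sxx
    a0 = my - a1 * mx
    return a0, a1, [v - a0 - a1 * u for u, v in zip(x, y)]

def lag1_correlation(e):
    """First order residual correlation r(e_t, e_{t-1})."""
    return sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e)

def durbin_watson(e):
    """Durbin-Watson statistic, roughly 2 * (1 - r)."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v * v for v in e)

rng = random.Random(2)
T = 300
x = [rng.gauss(0, 1) for _ in range(T)]
white = [rng.gauss(0, 1) for _ in range(T)]    # white noise errors
persistent = [0.0]                             # AR(1) errors, assumed persistence 0.8
for _ in range(T - 1):
    persistent.append(0.8 * persistent[-1] + rng.gauss(0, 1))

results = {}
for label, err in (("white noise", white), ("AR(1)", persistent)):
    y = [1.0 + 0.5 * u + e for u, e in zip(x, err)]
    _, _, resid = ols_simple(x, y)
    results[label] = (lag1_correlation(resid), durbin_watson(resid))
    print(f"{label}: r = {results[label][0]:.2f}  DW = {results[label][1]:.2f}")
```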

Even in the presence of residual correlation, the estimated coefficient a1 is unbiased, that is, just as likely to be above as below its true value α1.  It is consistent, converging to its true value as the number of observations increases.  It is also super consistent with accelerating convergence as the number of observations increases.  In the presence of residual correlation, however, the understated standard errors make it impossible to conclude that a1 differs from zero.

Spurious regressions occur when unrelated trending variables appear related in regressions due to the underestimated standard errors.  Arbitrary choice of trending variables can result in apparently successful regressions.  To avoid spurious regressions, rely on economic theory to select variables.  When variables are selected on that basis, spurious regressions are not an issue.
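
A small Monte Carlo experiment shows how often unrelated trending variables appear related. The sketch below regresses one random walk on another, independent by construction, and counts how often the slope looks significant. The random walk specification, the sample length, the number of replications, and the rough cutoff of |t| > 2 are all assumptions made for illustration.

```python
import random

def t_stat_slope(x, y):
    """t statistic of the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((v - intercept - slope * u) ** 2 for u, v in zip(x, y))
    std_err = (ssr / (n - 2) / sxx) ** 0.5
    return slope / std_err

def random_walk(rng, length):
    """Cumulated white noise: a series with a stochastic trend."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

def difference(series):
    return [series[t] - series[t - 1] for t in range(1, len(series))]

rng = random.Random(3)
T, REPS = 100, 500
reject_levels = reject_diffs = 0
for _ in range(REPS):
    x, y = random_walk(rng, T), random_walk(rng, T)   # unrelated by construction
    reject_levels += abs(t_stat_slope(x, y)) > 2
    reject_diffs += abs(t_stat_slope(difference(x), difference(y))) > 2

share_levels = reject_levels / REPS   # apparent significance in levels
share_diffs = reject_diffs / REPS     # apparent significance in differences
print(f"levels: {share_levels:.2f}  differences: {share_diffs:.2f}")
```

With reliable standard errors the share of apparently significant slopes would be near the nominal 5%. The regressions in levels should exceed that by a wide margin, while the regressions in differences should stay close to it.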

If the series in a regression are not stationary but their differences are, regressions on differences produce more reliable estimates.  Difference stationary series may also be cointegrated, suggesting an error correction model that includes transitory adjustment relative to the long term dynamic equilibrium.  Unsuccessful difference regressions may disguise a significant error correction process.  Partial adjustment models introduce the lag of the dependent variable as an exogenous variable.
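
One common way to estimate an error correction model is a two-step procedure: a levels regression supplies the residual, and a difference regression then includes the lagged residual. The text does not name an estimation method, so the procedure, the simulated cointegrated data, and all parameter values in the sketch below are assumptions. The helper `ols` is a hypothetical function written for this example.

```python
import random

def ols(columns, y):
    """OLS coefficients of y on the given columns (pass a column of ones for the constant)."""
    k = len(columns)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * v for p, v in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(4)
T = 500
x, u = [0.0], [0.0]
for _ in range(T - 1):
    x.append(x[-1] + rng.gauss(0, 1))         # x is a random walk
    u.append(0.5 * u[-1] + rng.gauss(0, 1))   # stationary deviation from equilibrium
y = [1.0 + 0.5 * xt + ut for xt, ut in zip(x, u)]   # y and x are cointegrated

ones = [1.0] * T
# Step 1: the levels regression gives the long term relationship and its residual.
c0, c1 = ols([ones, x], y)
resid = [yt - c0 - c1 * xt for xt, yt in zip(x, y)]

# Step 2: regress the change in y on the change in x and the lagged residual.
dy = [y[t] - y[t - 1] for t in range(1, T)]
dx = [x[t] - x[t - 1] for t in range(1, T)]
b0, b1, b2 = ols([ones[1:], dx, resid[:-1]], dy)
print(f"long term effect = {c1:.2f}  transitory effect = {b1:.2f}  error correction = {b2:.2f}")
```

A negative coefficient on the lagged residual indicates adjustment back toward the long term equilibrium. A difference regression that omits this term would miss that part of the process.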

In an economic model with more than a single endogenous dependent variable, solve the reduced form equations with each dependent variable a function of exogenous variables.  Estimate each of the reduced form equations.  For example, the market model determines endogenous price P and quantity Q from the demand function D = D(P, Y), the supply function S = S(P, W), and the equilibrium condition Q = S = D.  Exogenous variables are the demand shifter Y and the supply shifter W.  Estimate P or Q as functions of Y and W, but not P as a function of Q or vice versa. 
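
The reduced form of the market model can be worked out under an assumed linear specification. The linear form and the coefficient names a and b below are illustrative assumptions, with all slope coefficients taken to be positive.

```latex
\begin{align*}
D &= a_0 - a_1 P + a_2 Y, \qquad S = b_0 + b_1 P - b_2 W, \qquad Q = S = D, \\[4pt]
P &= \frac{a_0 - b_0}{a_1 + b_1} + \frac{a_2}{a_1 + b_1}\,Y + \frac{b_2}{a_1 + b_1}\,W, \\[4pt]
Q &= \frac{a_1 b_0 + a_0 b_1}{a_1 + b_1} + \frac{a_2 b_1}{a_1 + b_1}\,Y - \frac{a_1 b_2}{a_1 + b_1}\,W.
\end{align*}
```

Both reduced form equations have only the exogenous Y and W on the right hand side. Structural slopes follow from ratios of the reduced form coefficients: the coefficient on Y in the Q equation divided by the coefficient on Y in the P equation equals the supply slope b_1, and the corresponding ratio for W equals -a_1, the demand slope. With variables in logs, these ratios are the supply and demand elasticities.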

The macroeconomic model provides another example.  National income Y is a function of exogenous government spending G, money supply M, the foreign interest rate r*, and foreign income Y*.  The interest rate r is endogenous and should not be a right hand variable in a regression with endogenous Y, or vice versa.  A floating exchange rate E would be endogenous since it adjusts to a nonzero trade balance B.  With a fixed exchange rate, E would be exogenous and B endogenous.

Theory is flexible in that various assumptions lead to different reduced form equations.  For instance, price is exogenous in international economics by the small open economy assumption.  Time series evidence provides tests of particular assumptions as theory stands ready to work through the implications of alternative assumptions.