Time Series Regression
Introduction
Suppose an economic model reduces to the effect of an exogenous variable X_t on an endogenous variable Y_t with an exogenous control variable Z_t. Subscripts refer to the time period. In general functional form, the economic relationship of interest is
Y_t = f(X_t, Z_t)   (1)
where X_t and Z_t may be vectors. This introduction to time series regression focuses on estimating regressions based on (1). The issue for economic theory is the significance, sign, and size of the partial derivative effect of X_t on Y_t holding Z_t constant. Lags X_{t-1} and Z_{t-1} may be more critical than contemporaneous effects, extending to X_{t-i} and Z_{t-j} in high frequency data.
The reduced form equation (1) should be derived from a structural theoretical model. In time series, “reduced form” also refers to a model with lagged variables that are exogenous in the sense that they occurred previously, Y_t = f(X_{t-i}, Z_{t-j}). Estimated parameters in (1) can lead to the derivation of structural coefficients in the theoretical model. For instance, the demand elasticity in a market model can be derived from the estimated reduced form equations of the market model.
Economic time series depend in part on the process underlying their history. The best predictor of y_{t+1} may be y_t. The y_t series reflects the history of the influence of truly related variables under actual circumstances. Univariate regressions of y_t involving only its own history, plus perhaps time itself as a variable, can be very successful predictors of y_{t+1}. Such univariate models do not, however, reveal economic relationships of interest since economics is based on relationships between variables. Univariate pretests are nevertheless critical for economic analysis because the properties of the series in (1) lay the foundation for successful specification of regressions.
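The univariate prediction described above can be sketched as a first order autoregression estimated by OLS. This is only an illustration: the AR(1) form, the function names fit_ar1 and forecast_next, and the noise-free example series are assumptions made for the sketch, not part of the text.

```python
def fit_ar1(y):
    """Fit y_t = a + b*y_{t-1} + e_t by OLS and return (a, b)."""
    prev, curr = y[:-1], y[1:]          # regressor y_{t-1}, regressand y_t
    n = len(curr)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    b = sxy / sxx
    a = mean_curr - b * mean_prev
    return a, b


def forecast_next(y):
    """One-step-ahead forecast of y_{t+1} from the fitted AR(1)."""
    a, b = fit_ar1(y)
    return a + b * y[-1]


# Hypothetical series generated exactly by y_t = 1 + 0.5*y_{t-1}, y_0 = 0,
# so the fitted coefficients can be checked against known values.
series = [0.0]
for _ in range(10):
    series.append(1.0 + 0.5 * series[-1])

a_hat, b_hat = fit_ar1(series)
next_value = forecast_next(series)
print(a_hat, b_hat, next_value)
```

With actual data the fit would not be exact, and higher order lags or a time trend might be needed before the residual looks like white noise.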
Models with lags of independent variables are better predictors than univariate models when unexpected changes occur in independent variables. Successful estimates of reduced form equations isolate the effects of exogenous variables and may lead to derived structural parameters. The goals of time series regression are to focus and improve economic theory.
OLS regression as presented here treats variables as normally distributed, with each observation equal to its mean plus a random error term. A positive regression coefficient indicates that above average observations of X_t are associated with above average observations of Y_t, holding the exogenous control variable Z_t constant. Standard errors and significance tests are derived under this normality assumption.
If a series has a positive trend, early observations are below the mean and later ones above it. There is no clustering around the mean as with a normal distribution; the series simply passes through the mean. A trending variable has a low peak and fat tails relative to the normal distribution. A regression on nonstationary variables understates standard errors, resulting in inflated parameter significance and overstated explanatory power. The first step in estimating (1) is to identify the underlying autoregressive processes and to ensure that no variable is trending. If (1) is estimated with trending variables, the standard errors will be understated.
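A rough way to see whether a series is trending is to compare the lag 1 autocorrelation of its levels with that of its first differences. The sketch below is an informal diagnostic only, not a formal unit root test such as the Dickey-Fuller test; the example series (a linear trend plus a repeating four period cycle) and the function names are hypothetical.

```python
def lag1_autocorr(series):
    """Sample lag 1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


def first_difference(series):
    """Return the series of changes y_t - y_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Hypothetical trending series: linear trend plus a short repeating cycle.
cycle = [0.0, 1.0, 0.0, -1.0]
levels = [0.5 * t + cycle[t % 4] for t in range(100)]
diffs = first_difference(levels)

r_levels = lag1_autocorr(levels)   # close to one: the trend dominates
r_diffs = lag1_autocorr(diffs)     # close to zero: the trend is gone
print(r_levels, r_diffs)
```

An autocorrelation near one in levels that falls away after differencing suggests the series should enter the regression in differences or with time as a variable.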
The typical problem in applied time series regression is trending variables. If theory suggests X_t should have a positive effect on Y_t and both have positive trends, they will be correlated and regression coefficients will appear significant. Observations of X_t below its mean will be associated with observations of Y_t below its mean simply because they occur at an earlier time. The resulting residual correlation overstates parameter significance and explanatory power. Residual correlation implies information remains in the residual, suggesting model misspecification.
A stationary time series converges to a dynamic equilibrium steady state. The series may not be normally distributed as it approaches the steady state but should be nearly so. Sample periods do not include all of a variable's “history”, which is assumed to extend into the future as well. The choice of sample period may result in a trending variable that could be stationary or even normal with a longer sample. If variables in a regression are stationary, standard errors are typically reliable. Applied time series analysis focuses in part on the standard errors distorted by trends.
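Convergence to a steady state can be illustrated with a first order difference equation y_t = a + b*y_{t-1}. The AR(1) form, the parameter values, and the function names are assumptions chosen for this sketch; shocks are omitted so the path is easy to follow.

```python
def simulate_ar1(a, b, y0, periods):
    """Iterate y_t = a + b*y_{t-1} without shocks and return the path."""
    path = [y0]
    for _ in range(periods):
        path.append(a + b * path[-1])
    return path


def steady_state(a, b):
    """Steady state y* = a / (1 - b), defined only when |b| < 1."""
    if abs(b) >= 1:
        raise ValueError("no steady state: the process is not stationary")
    return a / (1 - b)


# Stationary case: |b| < 1, the path closes in on y* from any starting value.
path = simulate_ar1(a=1.0, b=0.5, y0=10.0, periods=60)
y_star = steady_state(1.0, 0.5)
gap_start = abs(path[0] - y_star)
gap_end = abs(path[-1] - y_star)

# Nonstationary case: b = 1 adds a each period, producing a trend.
trend_path = simulate_ar1(a=1.0, b=1.0, y0=0.0, periods=3)
print(y_star, gap_start, gap_end, trend_path)
```

With |b| < 1 the gap to the steady state shrinks by the factor b each period, while b = 1 gives a series that never settles and simply passes through any sample mean.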
A simple regression related to (1) is
y_t = α_0 + α_1 x_t + α_2 z_t + ε_t.   (2)
Variables are transformed to natural logs, y_t = ln Y_t and so on. Log linear regression coefficients are point estimates of elasticities. The goal is to interpret theory in terms of the estimated elasticity α_1 = ∂y_t/∂x_t, although the ultimate form of the time series regression may not be as simple as (2). The individual time series processes determine the form of variables in (2). Variables can be transformed with differences and lags. The regression may include structural breaks, time itself, and the variance of the series. Variables in (2) should be stationary. White noise residuals of the underlying univariate processes are candidate variables for (2). The error correction model (ECM) includes the lagged residual ε_{t-1} of (2) in a difference equation regression.
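The log linear regression (2) can be sketched with a small OLS routine built on the normal equations. The data below are hypothetical and generated exactly by Y = 2 X^0.5 Z^(-1), so the recovered elasticities can be checked against known values; the ols helper is written for this illustration and is not taken from any library.

```python
import math


def ols(y, regressors):
    """OLS of y on a constant plus the given regressors; returns coefficients."""
    rows = [[1.0] + [reg[t] for reg in regressors] for t in range(len(y))]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y[t] for t, r in enumerate(rows))] for i in range(k)]
    for col in range(k):                       # Gaussian elimination
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):             # back substitution
        tail = sum(M[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (M[i][k] - tail) / M[i][i]
    return coef


# Hypothetical levels data generated exactly by Y = 2 * X^0.5 / Z.
X = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]
Z = [4.0, 3.0, 5.0, 2.0, 6.0, 1.5, 7.0, 2.5]
Y = [2.0 * xv ** 0.5 / zv for xv, zv in zip(X, Z)]

# Transform to natural logs as in (2).
y = [math.log(v) for v in Y]
x = [math.log(v) for v in X]
z = [math.log(v) for v in Z]

alpha0, alpha1, alpha2 = ols(y, [x, z])
print(alpha0, alpha1, alpha2)
```

Here alpha1 and alpha2 are the elasticities of Y with respect to X and Z, and alpha0 is the log of the scale factor. With real data the variables would first be checked for trends.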
Begin with a theoretical model deriving (2) to relate theoretical parameters to estimated coefficients. Rely on theory and preliminary regressions to suggest the most relevant exogenous and control variables. The residual ε_t has to be white noise (WN) with zero mean, no residual correlation, and constant variance. The residual correlation r(ε_t, ε_{t-1}) from (2) plays a critical role in time series regression.
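The residual correlation r(ε_t, ε_{t-1}) can be computed directly from a list of regression residuals. The sketch below also reports the Durbin-Watson statistic, a common companion measure that the text does not mention and that is added here as an assumption; the two residual series are made up for illustration.

```python
def residual_correlation(resid):
    """Sample correlation between residuals e_t and their first lag e_{t-1}."""
    curr, prev = resid[1:], resid[:-1]
    n = len(curr)
    mean_c, mean_p = sum(curr) / n, sum(prev) / n
    cov = sum((c - mean_c) * (p - mean_p) for c, p in zip(curr, prev))
    var_c = sum((c - mean_c) ** 2 for c in curr)
    var_p = sum((p - mean_p) ** 2 for p in prev)
    return cov / (var_c * var_p) ** 0.5


def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 when there is no first order correlation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


# Hypothetical residuals that drift slowly, as they would after a
# regression on trending variables.
persistent = [-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2, -3]
# Hypothetical residuals that flip sign every period.
alternating = [1, -1, 1, -1, 1, -1]

r_persistent = residual_correlation(persistent)
dw_persistent = durbin_watson(persistent)
r_alternating = residual_correlation(alternating)
dw_alternating = durbin_watson(alternating)
print(r_persistent, dw_persistent, r_alternating, dw_alternating)
```

A correlation well above zero, or a Durbin-Watson statistic well below 2, indicates that information remains in the residual and that reported standard errors should not be trusted.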
Even in the presence of residual correlation, the estimate a_1 of α_1 in (2) is unbiased, that is, just as likely above as below its true value. It is consistent, converging to its true value as the number of observations increases. When the variables are cointegrated it is also super consistent, converging faster than the usual rate as the number of observations increases. In the presence of residual correlation, however, the hypothesis that α_1 is zero cannot be reliably rejected because the standard errors are understated.
Spurious regressions occur when unrelated trending variables appear related in regressions due to the underestimated standard errors. Arbitrary choice of trending variables can result in apparently successful regressions. To avoid spurious regressions, rely on economic theory to select variables. When variables are selected by economic theory, spurious regressions are much less of an issue.
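A spurious regression can be illustrated with two series that share nothing but an upward time trend. The series below are hypothetical and deterministic (a trend plus unrelated cycles) so that the result is reproducible; a simulation with random walks would make the same point. The function names are chosen for this sketch.

```python
import math


def simple_ols_r2(y, x):
    """Regress y on a constant and x; return (slope, R-squared)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, r2


def diff(series):
    """Return the series of first differences."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Two hypothetical series with unrelated cycles around different trends.
x = [1.0 * t + math.sin(t) for t in range(100)]
y = [2.0 * t + math.cos(1.7 * t) for t in range(100)]

slope_levels, r2_levels = simple_ols_r2(y, x)            # trends dominate
slope_diffs, r2_diffs = simple_ols_r2(diff(y), diff(x))  # trends removed
print(r2_levels, r2_diffs)
```

The levels regression reports an R-squared close to one even though the only link between the series is time, while the regression on differences shows almost no relationship.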
If the series in a regression are not stationary but their differences are, regressions on differences produce more reliable estimates. Difference stationary series may also be cointegrated, suggesting an error correction model that includes transitory adjustment relative to the long term dynamic equilibrium. Unsuccessful difference regressions may disguise a significant error correction process. Partial adjustment models introduce the lag of the dependent variable as an exogenous variable.
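An error correction model can be sketched in two steps, an approach usually associated with Engle and Granger: a levels regression supplies the residual, and its lag enters a regression in differences. The data generating process below (long run relation y = 1 + 2x, short run effect 0.5, adjustment coefficient -0.4, no shocks) and all names are hypothetical choices for illustration.

```python
import math


def ols(y, regressors):
    """OLS of y on a constant plus the given regressors; returns coefficients."""
    rows = [[1.0] + [reg[t] for reg in regressors] for t in range(len(y))]
    k = len(rows[0])
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y[t] for t, r in enumerate(rows))] for i in range(k)]
    for col in range(k):                       # Gaussian elimination
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):             # back substitution
        tail = sum(M[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (M[i][k] - tail) / M[i][i]
    return coef


# Hypothetical trending x; y adjusts toward the long run relation y = 1 + 2x.
T = 200
x = [0.1 * t + math.sin(0.3 * t) for t in range(T)]
y = [1.0 + 2.0 * x[0]]
for t in range(1, T):
    gap = y[t - 1] - 1.0 - 2.0 * x[t - 1]      # last period's disequilibrium
    y.append(y[t - 1] + 0.5 * (x[t] - x[t - 1]) - 0.4 * gap)

# Step 1: long run levels regression, keep the residuals.
b0, b1 = ols(y, [x])
resid = [y[t] - b0 - b1 * x[t] for t in range(T)]

# Step 2: difference regression including the lagged residual.
dy = [y[t] - y[t - 1] for t in range(1, T)]
dx = [x[t] - x[t - 1] for t in range(1, T)]
const, short_run, adjust = ols(dy, [dx, resid[:-1]])
print(b1, short_run, adjust)
```

A negative coefficient on the lagged residual indicates adjustment back toward the long run relation; a regression on differences alone would miss this term.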
In an economic model with more than a single endogenous dependent variable, solve for the reduced form equations with each dependent variable a function of exogenous variables. Estimate each of the reduced form equations. For example, the market model determines endogenous price P and quantity Q from the demand function D = D(P, Y), the supply function S = S(P, W), and the equilibrium condition Q = S = D. Exogenous variables are the demand shifter Y and the supply shifter W. Estimate P or Q as functions of Y and W, but not P as a function of Q or vice versa.
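The step from structural to reduced form equations can be worked through for a linear version of the market model. The linear forms D = a0 - a1*P + a2*Y and S = b0 + b1*P - b2*W are assumptions made for this sketch, as are the parameter values and function names; with variables in logs the slopes would be elasticities.

```python
def reduced_form(a0, a1, a2, b0, b1, b2):
    """Solve D = a0 - a1*P + a2*Y, S = b0 + b1*P - b2*W, Q = S = D for P and Q."""
    denom = a1 + b1
    price = {"const": (a0 - b0) / denom, "Y": a2 / denom, "W": b2 / denom}
    quantity = {"const": (a1 * b0 + b1 * a0) / denom,
                "Y": b1 * a2 / denom,
                "W": -a1 * b2 / denom}
    return price, quantity


def structural_slopes(price, quantity):
    """Recover the demand slope a1 and supply slope b1 from reduced form coefficients."""
    b1 = quantity["Y"] / price["Y"]     # the demand shifter Y traces out supply
    a1 = -quantity["W"] / price["W"]    # the supply shifter W traces out demand
    return a1, b1


# Hypothetical structural parameters.
price, quantity = reduced_form(a0=10.0, a1=2.0, a2=1.0, b0=1.0, b1=1.0, b2=0.5)
a1_back, b1_back = structural_slopes(price, quantity)

# Check the solution at one point, Y = 3 and W = 2.
P = price["const"] + price["Y"] * 3.0 + price["W"] * 2.0
Q = quantity["const"] + quantity["Y"] * 3.0 + quantity["W"] * 2.0
demand = 10.0 - 2.0 * P + 1.0 * 3.0
supply = 1.0 + 1.0 * P - 0.5 * 2.0
print(price, quantity, a1_back, b1_back)
```

In estimation, the price and quantity dictionaries correspond to the coefficients of the two reduced form regressions, and ratios of those coefficients give back the structural slopes.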
The macroeconomic model provides another example. National income Y is a function of exogenous government spending G, money supply M, the foreign interest rate r*, and foreign income Y*. The interest rate r is endogenous and should not be a right hand variable in a regression with endogenous Y, or vice versa. A floating exchange rate E would be endogenous since it adjusts to a nonzero trade balance B. With a fixed exchange rate, E would be exogenous and B endogenous.
Theory is flexible in that various assumptions lead to different reduced form equations. For instance, price is exogenous in international economics by the small open economy assumption. Time series evidence provides tests of particular assumptions, as theory stands ready to work through the implications of alternative assumptions.