Introduction to Applied Time Series Regression
Auburn University
Suppose there is an economic model that suggests x_t should affect y_t with control variable z_t. In general functional form, y_t = f(x_t, z_t). Ideally y_t would be endogenous and x_t and z_t exogenous in the underlying economic model. Economics generally studies systems of equations, but this introduction focuses on estimating a single equation.
Each time series variable depends on its own history, and estimating these underlying processes comes before estimating the relationship of interest between x_t and y_t. The best predictor of y_{t+1} is typically its own history, which includes the influence of all related variables. In economic theory the focus is on model specification and parameter estimation. Empirical models isolate and quantify the effects of exogenous variables, and may suggest ways to improve theory.
Variables in OLS regressions should be normally distributed with a constant mean, with observations scattered around the mean according to a white noise error term. Variables with trends have low peaks and fat tails and are not normally distributed. With a positive trend, early observations are below the mean and later ones are above the mean. Standard errors are calculated based on constant means and normal distributions. If variables are nonstationary then standard errors are understated.
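The point about early and late observations can be checked on simulated data. The sketch below is only an illustration under assumed values: the linear trend slope of 0.1, the unit-variance noise, and the sample size are hypothetical, and the detrending step uses a simple fitted line.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.arange(n)
# Hypothetical trending series: linear trend plus white noise.
y = 0.1 * t + rng.normal(0.0, 1.0, n)

half = n // 2
mean_all = y.mean()
mean_early = y[:half].mean()   # first half of the sample
mean_late = y[half:].mean()    # second half of the sample

# Fit and remove a linear trend; for deg=1 np.polyfit returns [slope, intercept].
slope, intercept = np.polyfit(t, y, 1)
detrended = y - (intercept + slope * t)

print(f"overall mean {mean_all:.2f}, early {mean_early:.2f}, late {mean_late:.2f}")
print(f"detrended early mean {detrended[:half].mean():.3f}, "
      f"late mean {detrended[half:].mean():.3f}")
```

With a positive trend the early mean falls below the overall mean and the late mean above it, while the two half-sample means of the detrended series are both close to zero.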
Consider the OLS regression
y_t = α_0 + α_1 x_t + α_2 z_t + ε_t    (1)
The goal of the project is to interpret theory in terms of the estimated coefficient α_1 = ∂y_t/∂x_t. Begin with a theoretical model and then derive (1) so that estimated coefficients can be related to the theoretical model. Rely on theory to suggest exogenous variables. Either x_t or z_t can represent more than a single variable. The model y_t = α_0 + α_1 x_t + ε_t without the control variable should be estimated and compared to (1). The residual ε_t has to be white noise (WN) with a mean close to zero, low autocorrelation, and constant variance.
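As a rough sketch of these steps, the code below estimates regression (1) and the model without the control variable on simulated data, then runs three informal white noise checks on the residual. The data generating process, the coefficient values, and the helper names ols and lag1_autocorr are assumptions made for illustration, and the split-sample variance ratio is only a crude stand-in for a formal heteroskedasticity test.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def lag1_autocorr(e):
    """Sample correlation between e_t and e_{t-1}."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)            # control correlated with x (assumed)
y = 1.0 + 2.0 * x - 0.5 * z + rng.normal(size=n)

const = np.ones(n)
coef_full, resid_full = ols(y, np.column_stack([const, x, z]))   # regression (1)
coef_short, resid_short = ols(y, np.column_stack([const, x]))    # no control variable

# Informal white noise checks on the residual of (1).
resid_mean = resid_full.mean()                                   # should be near zero
rho1 = lag1_autocorr(resid_full)                                 # should be small
var_ratio = resid_full[n // 2:].var() / resid_full[: n // 2].var()  # should be near one

print("full model coefficients:", np.round(coef_full, 3))
print("short model coefficients:", np.round(coef_short, 3))
print(f"residual mean {resid_mean:.2e}, lag-1 autocorrelation {rho1:.3f}, "
      f"variance ratio {var_ratio:.3f}")
```

In this simulated setup the coefficient on x changes when z is dropped because z is correlated with x, which is the reason for comparing the two regressions.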
The ultimate form of the regression may not be as simple as (1), since OLS assumes normally distributed variables but time series variables typically have trends, may have structural breaks, and may be heteroskedastic with a variance that changes over time.
A key concept in applied time series analysis is stationarity. A stationary process has a long history and is converging to its dynamic steady state. Stationarity is a weaker condition than a constant mean, but regressions with stationary variables may produce reliable statistics. The key test is that the residual ε_t has to be white noise (WN).
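One way to picture convergence to a dynamic steady state is a first order autoregression y_t = a0 + a1 y_{t-1} + e_t with |a1| < 1, whose steady state is a0/(1 - a1). The AR(1) form, the parameter values, and the starting point below are assumptions chosen for illustration and are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
a0, a1 = 2.0, 0.8                 # assumed AR(1) parameters with |a1| < 1
steady_state = a0 / (1.0 - a1)    # dynamic steady state of the process

n = 1200
y = np.empty(n)
y[0] = 0.0                        # start well away from the steady state
for t in range(1, n):
    y[t] = a0 + a1 * y[t - 1] + rng.normal()

late_mean = y[200:].mean()        # average after dropping the early transition

print(f"steady state {steady_state:.2f}, late-sample mean {late_mean:.2f}")
```

Setting a1 to 1 in this sketch removes the steady state altogether, which is the random walk case.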
The typical issue in time series is that variables are not normally distributed and the OLS regression (1) has understated standard errors. If theory suggests x_t should affect y_t and both have trends, they will be correlated and estimated coefficients will appear significant, but explanatory power is overstated.
An OLS regression with nonstationary variables leads to autocorrelation, indicated by a significant correlation corr(ε_t, ε_{t-1}) in the residual series ε_t. Autocorrelation implies information remains in the residual and something else must affect y_t in a systematic way. A pattern in the residual suggests there is something more that can be explained, requiring either a different model or transformed variables.
Understated standard errors with autocorrelation imply overstated coefficient significance and explanatory power. A spurious OLS regression has biased and underestimated variances, inflated t-statistics, and an inflated R². Coefficient estimates are unbiased, just as likely above as below the true value. Estimated coefficients are consistent and converge to the true value as the number of observations increases and the variance approaches zero. In fact, when the variables are cointegrated, estimated coefficients are super consistent, with accelerating convergence as the number of observations increases.
If series are not stationary they may be difference stationary random walks, and OLS regressions in differences are then reliable. Difference stationary random walk series may also be cointegrated, related through an error correction process that adjusts relative to the long run dynamic relationship between variables.
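The contrast between a regression in levels and a regression in differences can be sketched with two simulated random walks with drift that are unrelated by construction. The drift values, sample size, and helper names are hypothetical, and the lag-1 residual correlation is used here as an informal indicator, not a formal test.

```python
import numpy as np

def ols_fit(y, x):
    """Regress y on a constant and x; return slope, R-squared, and residuals."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef[1], r2, resid

def lag1_autocorr(e):
    """Sample correlation between e_t and e_{t-1}."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

rng = np.random.default_rng(3)
n = 500
# Two independent random walks with drift (assumed data generating process).
x = np.cumsum(0.3 + rng.normal(size=n))
y = np.cumsum(0.5 + rng.normal(size=n))

slope_lev, r2_lev, resid_lev = ols_fit(y, x)                    # levels
slope_dif, r2_dif, resid_dif = ols_fit(np.diff(y), np.diff(x))  # first differences

print(f"levels:      R2 {r2_lev:.3f}, residual autocorrelation "
      f"{lag1_autocorr(resid_lev):.3f}")
print(f"differences: R2 {r2_dif:.3f}, residual autocorrelation "
      f"{lag1_autocorr(resid_dif):.3f}")
```

The levels regression reports a high R² and strongly autocorrelated residuals even though the series are unrelated, while the difference regression shows essentially no relationship and residuals close to white noise.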
Economic models with more than a single dependent variable can be solved in reduced form with each dependent variable a function of all exogenous variables. Consider the market model that determines endogenous price P and quantity Q = D = S from the demand function D = D(P, y) and supply function S = S(P, w). The exogenous variables in the model are the demand shifter y and the supply shifter w. It would be appropriate to estimate Q and P as functions of y and w but inappropriate to estimate Q as a function of P. This identification problem should be addressed in deriving (1). Theory is flexible in that various assumptions about endogeneity lead to different regression models.
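The market example can be made concrete with a linear version of the model. The sketch below assumes linear demand and supply functions with hypothetical parameter values, solves for the reduced form, and compares reduced form regressions with a regression of Q on P. None of the functional forms or numbers come from the text.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
# Assumed linear structural model (all parameter values hypothetical):
#   demand: Q = a - b*P + c*y + u
#   supply: Q = d + e*P - f*w + v
a, b, c = 10.0, 1.0, 0.5
d, e, f = 2.0, 1.0, 0.5

y_inc = rng.normal(0.0, 2.0, n)   # demand shifter y
w = rng.normal(0.0, 2.0, n)       # supply shifter w
u = rng.normal(size=n)            # demand shock
v = rng.normal(size=n)            # supply shock

# Reduced form: set demand equal to supply, solve for P, then substitute for Q.
P = (a - d + c * y_inc + f * w + u - v) / (b + e)
Q = d + e * P - f * w + v

# Reduced form regressions of P and Q on the exogenous shifters.
X = np.column_stack([np.ones(n), y_inc, w])
coef_P, *_ = np.linalg.lstsq(X, P, rcond=None)
coef_Q, *_ = np.linalg.lstsq(X, Q, rcond=None)

# Regressing Q on P mixes demand and supply and recovers neither slope.
slope_QP = np.polyfit(P, Q, 1)[0]

print("P on (1, y, w):", np.round(coef_P, 3))
print("Q on (1, y, w):", np.round(coef_Q, 3))
print(f"Q on P slope {slope_QP:.3f} (demand slope {-b}, supply slope {e})")
```

Under these assumptions the reduced form coefficients are c/(b+e) and f/(b+e) for P, and ec/(b+e) and -bf/(b+e) for Q, while the slope from regressing Q on P matches neither the demand slope nor the supply slope.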
As another example, consider the IS-LM-BP model with national income Y a function of exogenous variables including government spending G, money supply Ms, the foreign interest rate r*, and foreign income Y*. The domestic interest rate r is endogenous and should not be on the right hand side in a regression with national income as the dependent variable. A floating exchange rate e would be an endogenous variable with the balance of payments B exogenous, since e adjusts if B ≠ 0, but a managed exchange rate would be exogenous.
The time series processes of the variables involved ultimately determine their form in the regression, and lagged effects may be important. For instance, an increase in the price of coffee might raise the demand for tea next year. The theoretical model should then check for lagged effects of exogenous variables.
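A lagged effect can be checked by including x_{t-1} alongside x_t as a regressor. The sketch below uses simulated data in which most of the effect arrives with a one period lag. The coefficient values and the single lag are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.normal(size=n)                 # exogenous variable (hypothetical)
e = rng.normal(scale=0.5, size=n)

# Simulated response with a small current effect and a larger lagged effect.
y = np.empty(n)
y[0] = np.nan                          # no lagged x exists for the first observation
y[1:] = 1.0 + 0.2 * x[1:] + 0.8 * x[:-1] + e[1:]

# Align y_t with x_t and x_{t-1}; one observation is lost to the lag.
X = np.column_stack([np.ones(n - 1), x[1:], x[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
b0, b_now, b_lag = coef

print(f"constant {b0:.3f}, current effect {b_now:.3f}, lagged effect {b_lag:.3f}")
```

Each additional lag drops one more observation from the start of the sample, so the regressors and the dependent variable have to be trimmed to the same dates.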
Regression options include transforming variables with logarithms, differences, inverses, and lags, as well as detrending. The error correction model (ECM) includes the residual ε_t of (1) in a second stage difference model that separates transitory adjustment from adjustment toward the dynamic equilibrium.
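A two stage error correction regression might be sketched as follows. The simulated series are cointegrated by construction, the second stage uses the residual lagged one period (an assumption about timing that follows common practice), and all parameter values and names are hypothetical.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

rng = np.random.default_rng(6)
n = 500
x = np.cumsum(rng.normal(size=n))       # random walk regressor

# Stationary AR(1) deviation from the long run relationship.
u = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + shocks[t]
y = 1.0 + 2.0 * x + u                   # y is cointegrated with x by construction

# Stage 1: levels regression; its residual measures the gap from the long run relation.
coef_lr, resid = ols(y, np.column_stack([np.ones(n), x]))

# Stage 2: difference regression with the lagged stage 1 residual.
dy, dx = np.diff(y), np.diff(x)
X2 = np.column_stack([np.ones(n - 1), dx, resid[:-1]])
coef_ecm, _ = ols(dy, X2)
b0, b_short, gamma = coef_ecm

print(f"long run slope {coef_lr[1]:.3f}")
print(f"transitory effect {b_short:.3f}, error correction coefficient {gamma:.3f}")
```

A negative coefficient on the lagged residual indicates adjustment back toward the long run relationship, while the coefficient on the differenced regressor captures the transitory effect.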
Theory is the guide to variable selection and endogeneity. Regression results might suggest ways to refine theory. Empirical analysis should lead to subsequent theoretical and empirical analysis. Beyond the term project, other time periods or variables can be examined.
SECTIONS
White noise
Stationary variables
Stationary with a structural break
Difference stationary variables
Unit root with a structural break
Difference models
Error correction models
Lagged transformation models
Detrending
Event Breaks
Other Models: 2SLS, VAR, Causality, Conditional Mean and Variance
Conclusion
The primary goals of time series regression analysis are to interpret economic theory in terms of estimated coefficients and suggest ways to refine theory. Successful applied time series analysis requires the combination of solid economic theory with reliable time series techniques. In the results, discuss only significant coefficients and not the signs of insignificant coefficients. Work through the algebra of any differences, residuals, or lags and relate results to theory. Report the best possible regression results with residuals as close to white noise (WN) as possible. Advanced techniques deal with optimal lags, endogenous influences across processes, simultaneous equations, simultaneous estimation of time varying variance, endogenous structural breaks, instrumental variables, and so on.
This Introduction to Applied Time Series Regression provides the foundation for a term project. Send an email for information on the full text.