Disclaimer: This lecture was produced by one of our expert business writers, as a learning aid to help students with their studies. We also have several sample papers, each written to a specific grade, to illustrate the work delivered by our expert business writers.

Any opinions, findings, conclusions, or recommendations expressed in this lecture material are those of the authors and do not reflect the views of BusinessTeacher.org. Information contained in this lecture should not be used as a basis for providing financial or investment advice and should be treated as educational content only.

What is time series data?

A significant number of the properties and standards explored in cross-section econometrics proceed when our data are gathered over time. A time series is an ordered grouping of numerical data points/observations on a variable that are taken at discreet and similarly spaced time intervals. We list the time periods as 1,2, …, T and symbolise the set of variables as (y1, y2,….yT).

The usage of time series models is bifold. They are used to:

·       Fit a model and advance to forecasting, monitoring or even feedback and feedforward control.

·       Obtain a comprehension of the basic powers and structure that produced the observed data

Formally, a grouping of variables by time is called a stochastic process/time series process.

1.1 Examples of time series regression models.

1.1.1 static models

yᵼ = β₀ + β₁Xᵼ + uᵼ, t = 1, 2, . . . , T

Where T is the number of observation in the time series. The relation between y and x is contemporaneous.

Usually, a static model is postulated when a change in x at time t is believed to have an immediate effect on y: 

1.1.2 Finite Distributed Lag (FDL) models

In FDL models, earlier values of one or more explanatory variables affect the current value

of y.

(2) yᵼ = α₀ + δ₀xᵼ + δ₁xᵼ-₁+ δ₂xᵼ-₂ + uᵼ

is an FDL of order two.


Multipliers indicate the impact of a unit change in x on y.

Impact Multiplier: Indicates the immediate one unit change in x on y. In (2) δ0 is the impact multiplier.

To see this, suppose xᵼ is constant, say c, before time point t, increases by one unit to

c + 1 at time point t and returns back to c at t + 1. That is

· · · , xᵼ₋₂ = c, x₋₁= c, xᵼ = c + 1, xᵼ+₁= c, xᵼ+₂= c, . . .

Suppose for the sake of simplicity that the error term is zero, then

yᵼ₋₁ = α₀ + δ₀c + δ₁c + δ₂c

yᵼ = α₀ + δ₀(c + 1) + δ₁c + δ₂c

yᵼ+₁ = α₀ + δ₀c + δ₁(c + 1) + δ₂c

y ᵼ+₂  = α₀ + δ₀c + δ₁c + δ₂(c + 1)

y ᵼ+₃  = α₀ + δ₀c + δ₁c + δ₂c

from which we find

                                                                      yᵼ − yᵼ₋₁ = δ₀,

which is the immediate change in yᵼ.

In the next period, t + 1, the change is

                                                                   yᵼ+₁ − yᵼ₋₁= δ₁,

after that

                                                                  yᵼ+₂ − yᵼ₋₁ = δ₂,

after which the series comes back to its underlying level yᵼ+₃ = yᵼ₋₁. The series {δ₀, δ₁, δ₂} is called the lag distribution, which abridges the dynamic effect that an impermanant increase in x has on y

Lag Distribution: A graph of δj as a function of j summarizes the distribution of the

effects of a one unit change in x on y as a function of j, j = 0, 1, . . ..

Especially, if we standardize the initial value of y at yt−1 = 0, the lag distribution follows out the consequent estimates of y due to a one unit, temporary change in x.

Interim multiplier of order J:

(3)                  δ(J) =

Indicates the cumulative effect up to J of a unit change in x on y.

In (2) e.g., δ(1) = δ₀ + δ₁.

Total Multiplier: (Long-Run Multiplier)

Indicates the total (long-run) change in y as a response of a unit change in x.


2.0 Finite sample properties of OLS under classical assumptions - Issues in using OLS with time series data

2.1 Unbiasedness of OLS

2.1.1 The time series/stochastic process ought to take after a model that is linear in its parameters. This assumption is essentially the same as the cross-sectional supposition, except we are now indicating a linear model for time series data.

The stochastic process is a gathering of random variables {Xt} indexed by a set T, i.e. t ∈ T. (Not necessarily independent!)

{(Xᵼ₁,Xᵼ₂,…….Xᵼᴋ, yᵼ) : t = 1,2,….n} follows the linear model:

yᵼ = β₀ + β₁xᵼ₁ +………, BᴋXᵼᴋ + uᵼ

2.1.2         No perfect collinearity - in the sample and along these lines in the fundamental time series process, no independent variable is constant nor a perfect linear combination of the others

2.1.3         Zero conditional mean - For each t, the expected value of the error Uᵼ, given the explanatory variables for all time periods, is zero. Mathematically,

                                                                                E(Uᵼ|X) = 0, t = 1,2,….n.

This presumption suggests that the error term Uᵼ, is uncorrelated with each independent variable in every time period. The way this is expressed as far as the conditional expectation implies that we likewise accurately determine the functional relationship between Yᵼ and the independent variable.

We need to add two assumptions to round out the Gauss-Markov assumptions for time series regressions: Homoskedasticity (which is mentioned in cross-sectional analysis) and No serial correlation.

2.1.4         Homoskedasticity

Conditional on X, the variance of uᵼ, is the same for all t: Var(Uᵼ|X) = Var(uᵼ) =

This assumption means that (Var(UᵼX) cannot depend on X- it is sufficient that Uᵼ and X are independent-and that (Var(Uᵼ) must be constant over time.

2.1.5         Serial correlation.

Serial correlation is a problem associated with time-series data. It occurs when the errors of the regression model are correlated with their own past values.

2.2 Serial correlation and autocorrelation

Autocorrelation is a particular type of serial correlation in which the error terms are a function of their own past values.

ᴜ₊ = pᴜᴌ₋₁ + eᴌ   The first equation is known as an AR (1)

ᴜ₊ = e₊ + ae₊₋₁    The second equation is known as a MA (1)

Both of these equations describe serial correlation but only the first describes autocorrelation. The second equation describes a moving average error which is a different type of serial correlation.

2.2.1    First-order autocorrelation

Many forms of autocorrelation exist. The most popular one is first-order autocorrelation.


Yt = βXt + ut

ut = ρut₋₁ +  et as we know called an AR(1)

where the error term depends upon his predecessor or the current observation of the error term ut is a function of the previous (lagged) observation of the error term:

εt is an error with mean zero and constant variance. Assumptions are such that the Gauss-Markov conditions arise if ρ = 0.

•        The coefficient ρ (RHO) is called the autocorrelation coefficient and takes values from  -1 to +1.

•        The size of ρ will determine the strength of the autocorrelation.

•        There can be three different cases:

1.         If ρ is zero, then we have no autocorrelation.

2.         If ρ approaches unity, the value of the previous observation of the error becomes more important in determining the value of the current error and therefore high degree of autocorrelation exists. In this case we have positive autocorrelation.

3.         If ρ approaches -1, we have high degree of negative autocorrelation.

                                                                         Properties of ut

To determine the properties of ut, we assume | ρ | < 1 stationarity condition)

                                            Then it holds that:

E(ut ) = ρE(u₊₋₁ )+ E(εt ) = 0         (1)

  var(ut ) = E(u ) = ρs           (2)

cov(ut ,ut+s ) = E(ut ,ut-s )= ρs  (3)

cor(ut ,uᴌ+s ) = ρs                                          (4)

(note that this requires -1 < ρ < 1.)

2.2.2 Higher-order Autocorrelation

Second order when:

ut1ut-1+ ρ2ut-2+et

p-th order when:

ut1ut-1+ ρ2ut-23ut-3 +…+ ρput-p +et

3.0 ARMA model using the Box and Jenkins approach

Time series modelling is used to analyse and find properties of economic time series under the idea that ‘the data speaks for itself’. Time series models allows   to be explained by the past or lagged values of   itself (the AR component) and current and lagged values of et (the MA component), which is an uncorrelated random error term with zero mean and constant variance - that is, a white noise error term.

Building the ARMA model follows 3 stages, these are identification, estimation and diagnostic checking.

3.0.1 Model Identification

First, an ARMA model can only be identified for stationary series. If the variables employed are non-stationary, then it can be proved that the standard assumptions for asymptotic analysis will not be valid. In other words, the t-ratios will not follow a t distribution, and the f-statistic will not follow an f-distribution. 

 The formal test is done by testing for a unit root using Augmented Dickey Fuller (ADF) model. Augments the Dickey Fuller using ƿ lags of the dependent variable.

If the û₊ is 1(1) then the model would employ specifications in first differences only.

The ACF and the PACF of the stochastic process (real exchange rate) are observed to see if it decays rapidly to zero. ARMA process will have a geometrically declining ACF and PACF. As a helping tool to test for significance of a correlation coefficient, it can be tested if is outside this band:

3.0.2 Model Estimation

Using ADF and observing ACF and PACF do not clear whether or not the variance of the differenced series is time invariant. In this case, information criteria is used for model selection.

This report uses Akaike’s information criterion (AIC), and the Schwarz’s Bayesian information criterion (SBIC).


3.0.3 Model Diagnostics

This is done through:

                                 I.            Residual diagnostics - the pattern of white noise residual of the model fit will be assessor by plotting the ACF and the PACF or we perform Ljung Box Test on the residuals.

H₀ = no serial correlation

H₁ = there is serial correlation

Where m = lag length, n is the sample size and is ρ the sample autocorrelation coefficient

4.0 Cointegration

If we have two non-stationary time series X and Y that become stationary when differenced (these are called integrated of order one series, or I(1) series; random walks are one example) such that some linear combination of X and Y is stationary (aka, I(0)), then we say that X and Y are cointegrated.

4.1 Testing for a unit root

Hypothesis - given the sample of data to hand, is it plausible that the true data generating process for y contains one or more unit roots?

Augmented Dickey Fuller (ADF) model with no intercept and trend


The Augmented Dickey Fuller expands on the Dickey Fuller test using ƿ lags of the dependent variable. The lags of Δу₊ soaks up any dynamic structure present in the dependent variable, to ensure that υ₊ is not auto correlated (Brooks 2014). For determining the optimal number of lags, the two simple rules of thumb are usually suggested. To begin with, the frequency of data can be used to decide (i.e. if data is monthly = 12 lags; if quarterly = 4 lags). Second, an information criterion can be utilised to choose by picking the number of lags that minimises its value mainly Schwarz information criterion (SIC) and the Akaike information criterion (AIC). After making sure the series are stationary after first difference (note regression amid this stage will give spurious results), the following stage is to use the Engle-Granger 2 step method.

 4.2 Engle-Granger

Assume the Engle and Granger combination that defines the dynamic long-run equilibrium relationship between the price in a domestic market P₁ and the price in world market P₂:

P¹₊ = α₀ + α₁P²₊ + µ₊

Where µ₊ is a random error with constant variance that can be simultaneously correlated.

The Process

This involves running a static regression;                                

                                yt = θ’xt + et ,

The asymptotic distribution θ is not standard but the test suggested was to estimate θ by OLS.

Step 1

Subsequent to making sure that all variables are 1(1), step 1 of Engle-Granger requires estimating the cointegration regression using OLS.  Then save the residuals of the cointegrating regression û₊.


The essential goal is to inspect the null hypothesis that there is a unit root in the potentially cointegrating regression residuals, while under the alternative, the residuals are stationary. Under the null hypothesis, therefore a stationary linear combination of the non-stationary variables has not been found. Hence, if the null hypothesis is not rejected, there is no cointegration. For this situation, the model would utilise specifications in first differences only. On the other hand, if the null of a unit root in the potential cointegrating regression’s residual is rejected, it would be presumed that a stationary linear combination of the non-stationary has been found. Along these lines, the variables would be classed as cointegrated. The fitting technique would be to form and estimate an error correction model (Brooks 2014). If the residuals are , the model would need to be estimated containing only first differences showing the short run solution written as

Step 2

Estimate the error correction model using residuals rom the first step as one variable



The standard notation form:

Where ECT is the Error Correction term (û₊₋₁ = Y₊₋₁ - α₀ - α₁X₊₋₁)

Rejection of the null hypothesis of no cointegration shows that the residuals are stationary with mean zero.

5.0 Simple panel data methods

The next section moves on to the coverage of multiple regression using pure time series data or pure cross-sectional data. There are two types of data that have both cross-sectional and time dimensions: independently pooled cross sections (IPCS) and panel, or longitudinal data.

5.1 Panel data methods

A panel data set, while having both a cross sectional and time series dimension contrasts in some essential regards from an independently pooled cross section (which is acquired by sampling randomly from a large population at different points in time). To gather panel data - which is sometimes called longitudinal data - we take after or endeavour to take after some individuals, states, families, cities or whatever, across time.

5.2 Two-period panel data analysis

In its most straightforward frame, panel data alludes to estimations of yᵢ,ᵼ, t = 1, 2: two cross-sections on the same units, i = 1, . . . , N. Say that we run a regression on one of the cross-sections. Any regression may well show the ill effects of omitted variables bias: there are various elements that may impact the cross-sectional result, past the involved regressors. One method would be to attempt to capture as many of those components as would be prudent by measuring them and incorporating them in the analysis.

In like manner, the net impact of all time-varying factors can be managed by a time fixed effect. For a model (such as suicide rate vs. unemployment rate) with a solitary explanatory variable, yᵢᵼ = β₀ + γ₀d2ᵼ + β1Xᵢᵼ + aᵢ + uᵢᵼ (5) where the indicator variable d2ᵼ is zero for period one and one for period two, not varying over i. Both the y variable and the X variable have both i and t subscripts, varying across (e.g.) cities and the two time periods, as does the error term u. For the suicide rate example, coefficient γ₀ picks up a macro effect: for example, suicide rates across the UK may have varied, on average, between the two time periods. The individual time effect picks that up. The term aᵢ is an individual fixed effect, with a different value for each unit (city) but not varying over time. It grabs the effect of everything beyond X that makes a specific city one-of-a-kind, without our specifying what those components might be. By what method might we estimate equation (5)? On the off chance that we just pool the two years’ data and run OLS, we can infer estimates of the β and γ parameters, however we are overlooking the ai term, which is being incorporated in the factor error term vᵢᵼ = aᵢ + uᵢᵼ. Unless we can be sure that E(vᵢᵼ|Xᵢᵼ) = 0, the pooled method will prompt biased and inconsistent estimates. This zero conditional mean supposition expresses that the unobserved city-specific heterogeneity must not be correlated with the X variable: in the example, with the unemployment rate.

To export a reference to this article please select a referencing style below:

Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.
Reference Copied to Clipboard.