Multivariate GARCH Models - A Comparative
Study of the Impact of Alternative
Methodologies on Correlation
Karl Shutes and Jacek Niklewski
February 16, 2010
Economics, Finance & Accounting Department, Coventry University, Coventry
[email protected]
[email protected]
Keywords: GARCH, multivariate GARCH, BEKK, VECH, CCC, DCC, O-GARCH, GO-GARCH
Abstract
With the growth in the requirements of the risk management industry and the complexity of the instruments that are used in finance, there has been significant growth in the forms of multivariate GARCH models. These models now allow a significant number of dimensions to be considered, rather than the relatively small number that used to be the case. This paper examines three multivariate GARCH models for modelling conditional correlation: the Dynamic Conditional Correlation GARCH model of Engle [2002], the Generalized Orthogonal GARCH model of Broda and Paolella [2008] and the Generalized Orthogonal GARCH model of Boswijk and van der Weide [2009]. Data from the Polish Stock Exchange are considered for ten companies. The results show high volatility in conditional correlation for both GO-GARCH models, whereas the DCC estimates seem to be more stable.
1 Introduction
A number of models have arisen in light of the relative computational difficulty of estimating a multivariate GARCH model. This paper considers the approaches taken by a number of these and compares the estimates and forecasts for each. With the current crisis of confidence in risk management and the requirements of regulators for the implementation of Basel II, there is a requirement for GARCH modelling to take multivariate issues explicitly into account.
2 Literature Review
GARCH models have a distinguished history that can be traced back to Bollerslev [1986]. The emphasis in the early work was primarily on the univariate time series properties of the data series. Indeed, much of the early multivariate work also looked to reduce the multivariate GARCH models to a univariate model wherever possible.
One of the most important factors in finance is risk. Risk is not directly observable, which creates problems with measuring, modelling and forecasting it. The ability to measure, model and forecast risk precisely is very important as it is used in portfolio management, asset and option valuation, hedging and choosing an appropriate investment strategy (Brooks [2008], Piontek [2003]). There are different methods for measuring risk. GARCH and its derivative models are perhaps the most widely used and considered in the literature and in industry. This section considers a number of the main frameworks and extensions to the model, as well as a number of the known characteristics of financial data.
2.1 Features of financial data
Financial data exhibit a number of features (Brooks 2008: 380, Piontek 2004a,b):
Volatility clustering: There are periods of high and low volatility. High absolute returns tend to follow high absolute returns and small absolute returns tend to follow small absolute returns.
Leptokurtosis effect: The distribution of returns shows much fatter tails than the normal distribution assumes (i.e. the probability of rare events is much larger).
Leverage effect: Volatility tends to be larger for price falls than for price rises of identical magnitude. This is the asymmetric influence of negative and positive information on the future level of volatility.
Skewness: The returns distribution presents some degree of skewness.
Autocorrelation of rates of return: especially in periods of low variability.
Long-run memory effect: High-order autocorrelation coefficients of squared returns (errors) are significant; more precisely, the autocorrelation coefficients of squared errors sum to infinity.
2.2 Autoregressive conditionally heteroscedastic (ARCH)
This is a special class of models that is very popular in financial modelling and forecasting. The model was proposed by Engle [1982], and ARCH(q) can be represented as (Yu 2002):
$$r_t = \mu + u_t$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \ldots + \alpha_q u_{t-q}^2$$
where $u_t \sim iid(0, \sigma_t^2)$, or
$$r_t = \mu + \sigma_t \varepsilon_t$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 (r_{t-1} - \mu)^2 + \ldots + \alpha_q (r_{t-q} - \mu)^2$$
where $\varepsilon_t \sim iid(0, 1)$.
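The conditional variance of the error depends on q lags of squared errors. The h-step-ahead forecast of volatility can be shown as (Yu 2002):
$$\hat{\sigma}_{t+h}^2 = \alpha_0 + \alpha_1 (\hat{r}_{t+h-1} - \mu)^2 + \ldots + \alpha_q (\hat{r}_{t+h-q} - \mu)^2$$
where
$$\hat{r}_{t+h-j} = r_{t+h-j} \quad \text{for } 1 \le h \le j$$
$$(\hat{r}_{t+h-j} - \mu)^2 = \hat{\sigma}_{t+h-j}^2 \quad \text{for } h > j$$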
This model allows time-varying variances to be modelled; however, there are some limitations. Firstly, when modelling financial time series the number of lags q tends to be large. Secondly, the non-negativity constraint on the alphas ($\alpha_i \ge 0$ for all $i = 0, \ldots, q$) is more likely to be violated as the number of alphas increases (Brooks 2008: 391-392, Piontek 2000). Therefore a generalized version of the ARCH model was developed by Bollerslev [1986].
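The h-step-ahead ARCH(q) forecast recursion described in this section can be sketched in a few lines of Python. This is only an illustration: the function name `arch_forecast` and its argument layout are assumptions made here, not part of Yu (2002), and the parameters are taken as given rather than estimated.

```python
def arch_forecast(returns, mu, alpha, horizon):
    """h-step-ahead ARCH(q) variance forecasts for h = 1..horizon.

    alpha = [alpha_0, alpha_1, ..., alpha_q]; returns holds r_1..r_t.
    (Hypothetical helper; parameters are assumed known.)
    """
    q = len(alpha) - 1
    if q < 1 or len(returns) < q:
        raise ValueError("need q >= 1 and at least q observed returns")
    # Squared deviations for the last q observed periods, oldest first.
    sq = [(r - mu) ** 2 for r in returns[-q:]]
    forecasts = []
    for _ in range(horizon):
        # sq[-j] is the squared deviation (or its forecast) j periods back.
        sigma2 = alpha[0] + sum(alpha[j] * sq[-j] for j in range(1, q + 1))
        forecasts.append(sigma2)
        # Beyond the sample, the unknown squared deviation is replaced
        # by its own variance forecast.
        sq.append(sigma2)
    return forecasts
```

For an ARCH(1) model with alpha = [0.2, 0.5], mu = 0 and a last observed return of -2, the one-step forecast uses the realised squared return, while later steps feed the previous forecast back into the recursion.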
2.3 Generalised autoregressive conditionally heteroscedastic (GARCH)
The conditional variance in the GARCH model depends not only on lagged squared errors but also on lags of the conditional variance. The GARCH(p,q) model can be presented as follows (Yu 2002):
$$r_t = \mu + u_t$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \ldots + \alpha_q u_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \ldots + \beta_p \sigma_{t-p}^2$$
where $u_t \sim iid\,N(0, \sigma_t^2)$, or
$$r_t = \mu + \sigma_t \varepsilon_t$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 (r_{t-1} - \mu)^2 + \ldots + \alpha_q (r_{t-q} - \mu)^2 + \beta_1 \sigma_{t-1}^2 + \ldots + \beta_p \sigma_{t-p}^2 \qquad (1)$$
where $\varepsilon_t \sim iid\,N(0, 1)$.
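The h-step-ahead forecast of volatility can be shown as (Yu 2002):
$$\hat{\sigma}_{t+h}^2 = \alpha_0 + \sum_{i=1}^{m} (\alpha_i + \beta_i)\hat{\sigma}_{t+h-i}^2 - \beta_h \hat{w}_t - \ldots - \beta_m \hat{w}_{t+h-m}, \quad h = 1, \ldots, p$$
$$\hat{\sigma}_{t+h}^2 = \alpha_0 + \sum_{i=1}^{m} (\alpha_i + \beta_i)\hat{\sigma}_{t+h-i}^2, \quad h = p + 1, \ldots$$
where:
$s_t = r_t - \mu$,
$m = \max\{p, q\}$,
$\alpha_i = 0$ for $i > q$,
$\beta_i = 0$ for $i > p$,
$\hat{w}_\tau = s_\tau^2 - E(s_\tau^2 | I_{\tau-1})$ for $0 < \tau \le t$,
$\hat{w}_\tau = 0$ for $\tau \le 0$,
$\hat{\sigma}_\tau^2 = s_\tau^2$ for $0 < \tau \le t$,
$\hat{\sigma}_\tau^2 = s_\tau^2 = \frac{1}{T}\sum_{i=1}^{T} s_i^2$ for $\tau \le 0$.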
The GARCH(p,q) model can be presented as an ARCH(∞) model. In practice a GARCH(1,1) specification is usually sufficient to capture the volatility clustering in the data. GARCH is more parsimonious than ARCH and avoids overfitting (Brooks 2008: 393, Piontek 2004a, Piontek 2004b). The unconditional variance of the error is (Hamilton 1994: 666):
$$var(u_t) = \frac{\alpha_0}{1 - \sum_{i=1}^{q}\alpha_i - \sum_{i=1}^{p}\beta_i} \quad \text{for } \sum_{i=1}^{q}\alpha_i + \sum_{i=1}^{p}\beta_i < 1$$
If $\sum_{i=1}^{q}\alpha_i + \sum_{i=1}^{p}\beta_i \ge 1$ then the unconditional variance is not defined.
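For the GARCH(1,1) case the forecast recursion and the unconditional variance take a particularly simple form, sketched below. The function names are hypothetical and the parameters are assumed to be known; for p = q = 1 the one-step forecast of the general formula reduces to $\alpha_0 + \alpha_1 s_t^2 + \beta_1 \sigma_t^2$ and every later step multiplies the previous forecast by $\alpha_1 + \beta_1$.

```python
def garch11_unconditional_variance(alpha0, alpha1, beta1):
    """alpha_0 / (1 - alpha_1 - beta_1), defined only if alpha_1 + beta_1 < 1."""
    persistence = alpha1 + beta1
    if persistence >= 1.0:
        raise ValueError("unconditional variance is not defined")
    return alpha0 / (1.0 - persistence)


def garch11_forecast(last_sq_error, last_sigma2, alpha0, alpha1, beta1, horizon):
    """Variance forecasts for h = 1..horizon from a GARCH(1,1) model."""
    # One step ahead uses the observed squared error and conditional variance.
    sigma2 = alpha0 + alpha1 * last_sq_error + beta1 * last_sigma2
    forecasts = [sigma2]
    for _ in range(horizon - 1):
        # Further ahead, the expected squared error equals the variance forecast.
        sigma2 = alpha0 + (alpha1 + beta1) * sigma2
        forecasts.append(sigma2)
    return forecasts
```

With $\alpha_1 + \beta_1 < 1$ the forecasts decay geometrically towards the unconditional variance, which is one way to see why that condition is needed.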
2.4 ARCH and GARCH extensions
'Simple' ARCH and GARCH models cannot account for all of the features presented above. That is why many extensions were developed to model financial data more accurately. These are only some examples from the extensive collection (Bollerslev 2008, Brooks 2008: 404-410): APARCH (Engle 1990), EGARCH (Nelson 1991), FIGARCH (Baillie et al. 1996), GARCH-M (Engle et al. 1987), GJR-GARCH (Glosten et al. 1993), GARCH-t (Bollerslev 1987), IGARCH (Engle and Bollerslev 1986), NGARCH (Higgins and Bera 1992) and TGARCH (Zakoian 1994).
3 Multivariate models
So far I have focused mainly on modelling and forecasting the volatility of one time series. However, in practice there is a need to model and predict the covariances (correlations) between time series. Thus we have to move from univariate models to multivariate models. Covariances are used in finance for calculating hedge ratios, portfolio VaR (Value at Risk) estimates, CAPM (Capital Asset Pricing Model) betas, asset weights in a portfolio and much more. Multivariate models model not only variances but also covariances (Bauwens et al. 2006, Brooks [2008], Silvennoinen and Teräsvirta 2008).
Consider a vector stochastic process $\{r_t\}$ with dimension $N \times 1$. Let $I_{t-1}$ denote the information set generated by the observed series $\{r_t\}$ until time $t-1$. I assume that (Bauwens et al. 2006):
$$r_t = \mu_t(\theta) + \varepsilon_t$$
$$\varepsilon_t = H_t^{1/2}(\theta) z_t$$
where:
$\theta$ is a vector of parameters,
$\mu_t(\theta)$ is the $N \times 1$ conditional mean vector,
$H_t(\theta)$ is the $N \times N$ conditional variance matrix,
$z_t$ is an iid $N \times 1$ random vector such that $E(z_t) = 0$ and $Var(z_t) = I_N$.
It is worth noting that the conditional variance of $r_t$ is equal to the conditional variance of $\varepsilon_t$ (Bauwens et al. 2006):
$$Var(r_t | I_{t-1}) = Var(\varepsilon_t | I_{t-1}) = H_t^{1/2}\, Var(z_t | I_{t-1})\, (H_t^{1/2})' = H_t$$
$H_t^{1/2}$ is a positive definite $N \times N$ matrix which may be obtained by, e.g., the Cholesky decomposition (Piontek 2006).
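As an illustration of the last remark, the following sketch computes a lower-triangular Cholesky factor of a conditional covariance matrix and uses it to map a standardised innovation $z_t$ to $\varepsilon_t$. The helper names and the matrix `H_EXAMPLE` are hypothetical and chosen only for this example.

```python
import math


def cholesky(h):
    """Lower-triangular L with L L' = h for a symmetric positive definite h."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = h[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (h[i][j] - partial) / low[j][j]
    return low


def shock(h, z):
    """Map a standardised innovation z to eps = H^{1/2} z."""
    low = cholesky(h)
    n = len(z)
    return [sum(low[i][k] * z[k] for k in range(n)) for i in range(n)]


# Hypothetical 2 x 2 conditional covariance matrix.
H_EXAMPLE = [[4.0, 1.2], [1.2, 1.0]]
```

Any other matrix square root of $H_t$ would serve equally well in the model; the Cholesky factor is simply a convenient choice.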
The next few sections will focus on the specification of $H_t$.
3.1 VEC model
This model was proposed by Bollerslev et al. [1988]. The VEC model can be presented as follows (Silvennoinen and Teräsvirta 2008):
$$vech(H_t) = c + \sum_{j=1}^{q} A_j\, vech(\varepsilon_{t-j}\varepsilon_{t-j}') + \sum_{j=1}^{p} B_j\, vech(H_{t-j})$$
where the $vech(\cdot)$ operator stacks the columns of the lower triangular part of an $N \times N$ matrix into an $\frac{N(N+1)}{2} \times 1$ vector, and $A_j$ and $B_j$ are $\frac{N(N+1)}{2} \times \frac{N(N+1)}{2}$ matrices of parameters (Silvennoinen and Teräsvirta 2008). Each conditional variance and covariance depends on lagged squared errors and cross-products of errors, and on lagged conditional variances and covariances. The VEC model is therefore very general; however, this flexibility has some disadvantages. Firstly, the number of parameters is equal to $(p+q)\left(\frac{N(N+1)}{2}\right)^2 + \frac{N(N+1)}{2}$, which is large: even for p = q = 1 and N = 3 the number of parameters equals 78. This makes estimation demanding. Secondly, restrictive conditions have to be imposed to make the covariance matrix $H_t$ positive definite for all t (Bauwens et al. 2006, Brooks 2008: 434, Piontek 2006, Silvennoinen and Teräsvirta 2008). Therefore a diagonal version of the VEC model was proposed.
3.2 DVEC model
The DVEC model is a restricted version of the VEC model (Bollerslev et al. [1988]). It assumes that $A_j$ and $B_j$ are diagonal matrices. This assumption implies fewer parameters to be estimated, namely $(p+q+1)\frac{N(N+1)}{2}$ (e.g. for p = q = 1 and N = 3 the number of parameters equals 18). Estimation is therefore less demanding, at the cost of flexibility. Each element $h_{ijt}$ depends only on lagged values of $\varepsilon_{it}\varepsilon_{jt}$ and on its own lagged values. This implies the lack of a transmission effect (Piontek 2006). Even though it is easier to obtain positive definiteness of the conditional variance matrices for DVEC than for VEC, the restrictions are still strong (Bauwens et al. 2006, Brooks 2008: 434-435, Engle et al. 1995, Piontek 2006, Silvennoinen and Teräsvirta 2008).
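The parameter counts quoted for the VEC and DVEC models can be checked with a small sketch. The function names are hypothetical; the formulas are the ones given in the text.

```python
def vec_param_count(n, p=1, q=1):
    """Number of parameters in a VEC(p, q) model for n series."""
    k = n * (n + 1) // 2           # length of vech(H_t)
    return (p + q) * k * k + k     # full A_j and B_j matrices plus the vector c


def dvec_param_count(n, p=1, q=1):
    """Number of parameters in a diagonal VEC(p, q) model for n series."""
    k = n * (n + 1) // 2
    return (p + q + 1) * k         # diagonals of A_j and B_j plus the vector c
```

For three series these return 78 and 18, matching the text. For ten series, as in the application considered in this paper, the same formulas give 6105 and 165, which illustrates how quickly the unrestricted model becomes impractical.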
3.3 BEKK model
The solution to the problem of ensuring positive definiteness is a new parameterisation of the conditional variance matrix $H_t$ (Engle et al. 1995):
$$H_t = CC' + \sum_{j=1}^{q}\sum_{k=1}^{K} A_{kj}'\, \varepsilon_{t-j}\varepsilon_{t-j}'\, A_{kj} + \sum_{j=1}^{p}\sum_{k=1}^{K} B_{kj}'\, H_{t-j}\, B_{kj}$$
where $A_{kj}$, $B_{kj}$ and $C$ are $N \times N$ parameter matrices and $C$ is lower triangular. This model was proposed by Baba, Engle, Kraft and Kroner and is called the BEKK model (Engle et al. 1995). The parameter K determines the generality of the model; however, when K > 1 identification problems arise (Silvennoinen and Teräsvirta 2008). Under very weak conditions the conditional covariance matrix $H_t$ is positive definite at all times (Engle et al. 1995). The constant term is decomposed into the product of two matrices, $C$ and $C'$, to ensure positive definiteness of $H_t$. BEKK is almost as general as VEC, as it includes all diagonal representations of VEC and almost all positive definite VEC representations (Engle et al. 1995). The number of parameters to be estimated, $(p+q)KN^2 + \frac{N(N+1)}{2}$, is still large. Assuming that p = q = 1, N = 3 and K = 1, $(p+q)KN^2 + \frac{N(N+1)}{2} = 24$.
The model can be simplified by assuming that the $A_{kj}$ and $B_{kj}$ matrices are diagonal. The number of parameters decreases to $(p+q)KN + \frac{N(N+1)}{2}$ (e.g. for p = q = 1, N = 3 and K = 1 the number of parameters equals 12) but is still large (Silvennoinen and Teräsvirta 2008).
Although positive definiteness of $H_t$ is easily obtained with the BEKK parametrization, convergence can be an issue because $H_t$ is not linear in the parameters. The parameters are also not easy to interpret (Silvennoinen and Teräsvirta 2008).
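A single BEKK(1,1) update with K = 1 can be sketched as follows, together with the parameter count. All names and numerical values here are hypothetical and serve only to show that each term in the recursion is a quadratic form, so that the sum stays positive definite whenever $CC'$ is.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def bekk11_step(c, a, b, eps_prev, h_prev):
    """One BEKK(1,1) update with K = 1: H = CC' + A' e e' A + B' H_prev B."""
    outer = [[x * y for y in eps_prev] for x in eps_prev]
    terms = [
        matmul(c, transpose(c)),
        matmul(matmul(transpose(a), outer), a),
        matmul(matmul(transpose(b), h_prev), b),
    ]
    n = len(c)
    return [[sum(t[i][j] for t in terms) for j in range(n)] for i in range(n)]


def bekk_param_count(n, p=1, q=1, k=1, diagonal=False):
    """(p + q) K N^2 + N(N+1)/2, or (p + q) K N + N(N+1)/2 if diagonal."""
    per_matrix = n if diagonal else n * n
    return (p + q) * k * per_matrix + n * (n + 1) // 2
```

A call such as `bekk11_step(C, A, B, eps, H_prev)` with a lower-triangular `C` of full rank returns a symmetric matrix with positive leading minors, without any further constraint on `A` and `B`.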
3.4 O-GARCH model
To overcome the problem of estimating a large number of parameters, the O-GARCH model was presented by Alexander [2000]. This model tries to express a multivariate GARCH model by means of univariate GARCH models, i.e. the $N \times N$ conditional variance matrix $H_t$ is modelled using $m \le N$ univariate GARCH models (Bauwens, Laurent and Rombouts 2006). The error vector process $\{\varepsilon_t\}$ can be represented as linear combinations of m uncorrelated factors $f_t$ with unconditional variances of one, where m is usually much smaller than N (Alexander 2000, Bauwens et al. 2006, Silvennoinen and Teräsvirta 2008):
$$V^{-1/2}\varepsilon_t = u_t = W_m f_t$$
where:
$f_t = (f_{1t}, \ldots, f_{mt})'$ is such that $E(f_t | I_{t-1}) = 0$ and $Var(f_t | I_{t-1}) = \Sigma_t = diag(\sigma_{f_1 t}^2, \ldots, \sigma_{f_m t}^2)$.
Each factor is assumed to follow a GARCH(1,1) process, so:
$$\sigma_{f_i t}^2 = (1 - \alpha_i - \beta_i) + \alpha_i f_{i,t-1}^2 + \beta_i \sigma_{f_i,t-1}^2 \quad \text{for } i = 1, \ldots, m$$
$V = diag(v_1, \ldots, v_N)$, where $v_i$ is the population variance of $\varepsilon_{it}$,
$W_m$ is an orthogonal $N \times m$ matrix such that $W_m = P_m \Lambda_m^{1/2}$,
$\Lambda_m = diag(\lambda_1, \ldots, \lambda_m)$ with $\lambda_1 \ge \ldots \ge \lambda_m > 0$, where the $\lambda_i$ are eigenvalues of the population correlation matrix of $u_t$,
$P_m$ is the $N \times m$ matrix of eigenvectors corresponding to these eigenvalues of the population correlation matrix of $u_t$.
The conditional variance matrix of $u_t$ is equal to:
$$V_t = Var(u_t | I_{t-1}) = W_m \Sigma_t W_m'$$
Therefore the conditional variance matrix of $\varepsilon_t$ equals:
$$H_t = Var(\varepsilon_t | I_{t-1}) = V^{1/2} V_t V^{1/2} = V^{1/2} W_m \Sigma_t W_m' V^{1/2}$$
The parameters of the O-GARCH(1,1,m) model are $V$, $W_m$ and all $\alpha_i$ and $\beta_i$. The number of parameters is equal to $\frac{N(m+1)+4m}{2}$ or, in the extreme case (i.e. m = N), $\frac{N(N+5)}{2}$. $V$ and $W_m$ are replaced by their sample counterparts. The number of factors used is established by principal component analysis.
The advantage of the model is that in practice only a few principal components are enough to explain most of the variability in the system, which makes estimation much easier. However, if the data are weakly correlated then identification problems arise. Another problem for the O-GARCH model arises when the components have similar scaling (unconditional variance). Thirdly, if the number of components m is less than N then the rank of the conditional variance matrix is reduced, which can be a problem for some diagnostic tests and for applications which use the $H_t^{-1}$ matrix (van der Weide [2002]). Finally, the transformation matrix $W_m$ is restricted to be orthogonal. Therefore van der Weide [2002] proposed a generalized version of the O-GARCH model.
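To make the construction $H_t = V^{1/2} W_m \Sigma_t W_m' V^{1/2}$ concrete, the following sketch treats the bivariate case N = m = 2, for which the eigen-decomposition of the correlation matrix is available in closed form. The function name and inputs are assumptions made for this illustration, and the unconditional correlation is taken to be non-negative so that the eigenvalues are already in decreasing order.

```python
import math


def ogarch_covariance(rho, v, factor_vars):
    """H_t = V^{1/2} W Sigma_t W' V^{1/2} for N = m = 2.

    rho: unconditional correlation of the two series (assumed >= 0),
    v: their unconditional variances, factor_vars: diagonal of Sigma_t.
    """
    # Closed-form eigen-decomposition of [[1, rho], [rho, 1]].
    lam = [1.0 + rho, 1.0 - rho]
    r = 1.0 / math.sqrt(2.0)
    p = [[r, r], [r, -r]]                       # columns are the eigenvectors
    w = [[p[i][k] * math.sqrt(lam[k]) for k in range(2)] for i in range(2)]
    # V_t = W Sigma_t W'
    v_t = [[sum(w[i][k] * factor_vars[k] * w[j][k] for k in range(2))
            for j in range(2)] for i in range(2)]
    # H_t = V^{1/2} V_t V^{1/2}
    sd = [math.sqrt(x) for x in v]
    return [[sd[i] * v_t[i][j] * sd[j] for j in range(2)] for i in range(2)]
```

When both factor variances equal one, their unconditional value, the function returns the unconditional covariance matrix; raising the variance of the first factor increases both the variances and the covariance of the two series.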
3.5 GO-GARCH
The model can be defined as the O-GARCH model above with two main differences. Firstly, the number of factors equals the number of series (i.e. m = N). Secondly, the transformation matrix $W$ is only required to be invertible, rather than orthogonal as in the O-GARCH model. $W$ is obtained by using the singular value decomposition (Bauwens et al. 2006, Silvennoinen and Teräsvirta 2008, van der Weide 2002):
$$W = P\Lambda^{1/2}U$$
where:
$\Lambda = diag(\lambda_1, \ldots, \lambda_N)$ with $\lambda_1 \ge \ldots \ge \lambda_N > 0$, where the $\lambda_i$ are the eigenvalues of the population correlation matrix of $u_t$,
$P$ is the $N \times N$ matrix of eigenvectors corresponding to the eigenvalues of the population correlation matrix of $u_t$,
$U$ is an $N \times N$ orthogonal matrix with $\det(U) = 1$.
The matrix $U$ can be obtained as a product of $\frac{N(N-1)}{2}$ rotation matrices (Bauwens et al. 2006, van der Weide 2002):
$$U = \prod_{i<j} R_{ij}(\delta_{ij}), \quad -\pi \le \delta_{ij} \le \pi, \quad i, j = 1, \ldots, N$$
where $R_{ij}(\delta_{ij})$ performs a rotation in the plane spanned by $e_i$ and $e_j$ over an angle $\delta_{ij}$. The angles $\delta_{ij}$, called the Euler angles, may be obtained by maximum likelihood estimation.
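The parameterisation of $U$ by plane rotations can be sketched as follows. The helper names are hypothetical, and the ordering of the pairs (i, j) in the product is a convention assumed here; one angle is used for each pair with i < j.

```python
import math
from itertools import combinations


def rotation(n, i, j, angle):
    """n x n rotation over `angle` in the plane spanned by e_i and e_j."""
    r = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    r[i][i], r[j][j] = c, c
    r[i][j], r[j][i] = -s, s
    return r


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def build_u(n, angles):
    """Product of plane rotations, one angle per pair i < j."""
    pairs = list(combinations(range(n), 2))
    if len(angles) != len(pairs):
        raise ValueError("expected one angle per pair i < j")
    u = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    for (i, j), angle in zip(pairs, angles):
        u = matmul(u, rotation(n, i, j, angle))
    return u
```

Because every factor is a proper rotation, the product is orthogonal with determinant one for any choice of angles, so an optimiser can search over the angles without additional constraints.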
The implied conditional correlation matrix of $\varepsilon_t$ can be calculated as follows (Bauwens et al. 2006, van der Weide [2002]):
$$R_t = D_t^{-1} V_t D_t^{-1}$$
where $D_t = (V_t \circ I)^{1/2}$, $V_t = W \Sigma_t W'$ and $\circ$ is the Hadamard (i.e. elementwise) product.
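The rescaling from $V_t$ to the implied correlation matrix amounts to dividing each element by the product of the corresponding conditional standard deviations, as in this minimal sketch (the function name is hypothetical):

```python
import math


def implied_correlation(v_t):
    """R_t = D_t^{-1} V_t D_t^{-1} with D_t = (V_t o I)^{1/2}."""
    n = len(v_t)
    # D_t holds the square roots of the diagonal elements of V_t.
    d = [math.sqrt(v_t[i][i]) for i in range(n)]
    return [[v_t[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]
```

For example, a conditional covariance matrix with variances 4 and 1 and covariance 1 implies a conditional correlation of 0.5.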
The model can be estimated using a two-step procedure (van der Weide [2002]). In the first step, $P$ and $\Lambda$ are estimated by exploiting the unconditional variance of $u_t$ (i.e. by their sample counterparts). In the second step, the conditional information is used to estimate the rotation coefficients of $U$ and all $\alpha_i$ and $\beta_i$ of the N factors. This means that $\frac{N(N+3)}{2}$ (i.e. $\frac{N(N-1)}{2} + 2N$) parameters have to be estimated by maximizing the log-likelihood function (Bauwens et al. [2006], Silvennoinen and Teräsvirta [2008], van der Weide 2002). The number of parameters is quite large.
It is worth mentioning that MGARCH-in-mean models cannot be estimated with O-GARCH and GO-GARCH because of the two-step estimation procedure. Furthermore, O-GARCH and GO-GARCH belong to the class of factor GARCH models and are therefore nested in the BEKK model (Bauwens et al. [2006]).
Allowing the transformation matrix $W$ to be time-varying is one possible extension. Using different GARCH models for the components (i.e. not only GARCH(1,1)) would be another extension left for further study (van der Weide 2002).
3.5.1 NLS
The problem of maximizing the multivariate likelihood function in high dimensions led to the development of a three-step procedure. This estimation method was proposed by Boswijk and van der Weide [2006]. The second step of the two-step procedure is divided into two steps. This allows the estimation of part of the link matrix $W$ (i.e. the matrix $U$) to be separated from that of the univariate GARCH parameters (i.e. $\{\alpha_i, \beta_i\}_{i=1}^{m}$).
The three-step procedure tries to identify $U$ from the autocorrelation structure of $s_t^* s_t^{*\prime}$, where $s_t^* = \Lambda^{-1/2}P'\varepsilon_t$. They obtain the estimate of $B = UAU'$ by fitting the following model:
$$s_t^* s_t^{*\prime} - I_m = B\,(s_{t-1}^* s_{t-1}^{*\prime} - I_m)\,B + \Gamma_t, \quad E(\Gamma_t) = 0,$$
using the nonlinear least-squares method. The estimate of $U$ is obtained from $B$, as $A$ is a diagonal matrix.
The three-step procedure is not only more practical in terms of implementation but is also less prone to convergence problems. However, its main disadvantage is a loss of efficiency.
They apply the O-GARCH, DCC and GO-GARCH models to 10 years of daily returns of the Dow Jones Industrial index and the NASDAQ Composite index. They find that the patterns are quite similar for volatilities and covariances, with some differences in the heights of the peaks; however, more discrepancy is observed in the estimated correlations between GO-GARCH and the two other models. The GO-GARCH correlations look like a smoothed version of the DCC and O-GARCH correlations. The GO-GARCH estimates display lower and upper bands, which confirms the previous results (van der Weide 2002).
They also perform a test on two five-variate examples of US and European indices. They find that the NLS (i.e. three-step) estimator performs as well as, or even better than, the ML (i.e. two-step) estimator. The US data exhibit noticeable skewness and kurtosis, which makes the model misspecified. This has an adverse effect on the ML estimator, whereas the NLS estimator seems to be much more robust.
3.5.2 Chicago
The two-step (ML) as well as the three-step (NLS) procedure seems to be too slow when the dimension of the model is high. For that reason Broda and Paolella [2008] introduce an alternative two-step procedure for the estimation of the GO-GARCH model. They use independent component analysis (ICA) as the main tool for decomposing a high-dimensional problem into a set of univariate models. The ICA algorithm maximizes the conditional heteroscedasticity of the estimated components. Their method is called CHICAGO (i.e. Conditionally Heteroscedastic Independent Component Analysis of Generalized Orthogonal GARCH models). Their procedure allows them to apply non-Gaussian innovations.
Independent component analysis (ICA) is a more powerful tool than principal component analysis (PCA) in the sense of preserving interesting features of the data, such as clusters. This is because PCA tries to find the direction of the component in which the variance of the data is maximized, whereas ICA tries to find the direction of the component in which the interesting features of the data are kept. This objective leads to different components for ICA and PCA. For details see Hyvarinen [1999a].
Broda and Paolella estimate $U$ by independent component analysis. There are many approaches to solving the ICA problem; it is a matter of choosing an appropriate objective function and optimization algorithm. This might be expressed in the following 'equation' (Hyvarinen 1999b):
ICA method = objective function + optimization algorithm
Let the matrix $M_m$ define the transformation:
$$f_t = M_m \varepsilon_t$$
The aim of ICA is to find $M_m \equiv W_m^{-1}$ such that the components of $y_t = M_m \varepsilon_t$ are independent. One of the most important restrictions for ICA is that the independent components must be non-gaussian. If more than one of the components is gaussian, the matrix $W_m$ is not identifiable.
One of the methods for solving this problem is maximizing negentropy. The central limit theorem states that the distribution of a sum of independent random variables with finite second moments converges to a gaussian distribution. Let us define $z = W_m^T m$. Then we have $y = m^T\varepsilon = m^T W_m f = z^T f$, which means that y is a linear combination of the elements of f with weights given by $z^T$. According to the central limit theorem, $z^T f$ is more gaussian than any $f_i$ and is least gaussian when it equals one of the $f_i$ (i.e. when only one element $z_i$ of z is non-zero). We therefore take the m that maximizes the non-gaussianity of $m^T\varepsilon$. This vector m corresponds to a z which has only one non-zero component, so that $m^T\varepsilon = z^T f$ equals one of the independent components.
The differential entropy H of a random vector y with density f(y) is defined as (Hyvarinen and Oja 2000):
$$H(y) = -\int f(y)\log f(y)\,dy$$
This measure is well known as Shannon's entropy or the measure of uncertainty (Shannon 1948). A gaussian variable has the largest entropy among all random variables of equal variance. Now we can define the negentropy J (i.e. a measure of non-gaussianity):
$$J(y) = H(y_{gaussian}) - H(y)$$
In practice, however, the density is unknown and an estimate of the negentropy is needed. One of the possible estimators of the negentropy, suggested by Hyvarinen [1999a], is:
$$J_G(m) = [E\{G(m^T\varepsilon)\} - E\{G(v)\}]^2$$
where m is an m-dimensional (weight) vector constrained so that $E\{(m^T\varepsilon)^2\} = 1$, v is a standard gaussian variable and G is a non-quadratic function. Hyvarinen proposed the following choices of G:
$$G_1(u) = \log\cosh(a_1 u)$$
$$G_2(u) = \exp(-a_2 u^2/2)$$
with $1 \le a_1 \le 2$, $a_2 \approx 1$.
To summarize, the aim is to find the m that maximizes the negentropy of $m^T\varepsilon$.
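The negentropy approximation $J_G$ can be illustrated with a short sketch for a univariate sample and the contrast function $G_1$ with $a_1 = 1$. The function names are hypothetical, and computing the gaussian expectation $E\{G(v)\}$ by the trapezoid rule is a choice made here for self-containment, not something prescribed by Hyvarinen [1999a].

```python
import math
import random


def g1(u, a1=1.0):
    """Contrast function G_1(u) = log cosh(a1 * u)."""
    return math.log(math.cosh(a1 * u))


def gaussian_expectation(g, lo=-10.0, hi=10.0, steps=4000):
    """E{g(v)} for a standard gaussian v, by the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * width
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(x) * math.exp(-0.5 * x * x)
    return total * width / math.sqrt(2.0 * math.pi)


def negentropy_estimate(sample, g=g1):
    """J_G = [E{G(y)} - E{G(v)}]^2 after standardising the sample."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    e_sample = sum(g((x - mean) / sd) for x in sample) / n
    return (e_sample - gaussian_expectation(g)) ** 2
```

Applied to a large gaussian sample the estimate is close to zero, whereas a fat-tailed sample (for instance one drawn from a Laplace distribution) gives a clearly larger value, which is what makes $J_G$ usable as a measure of non-gaussianity.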
An example of a fixed-point algorithm, FastICA, for one and for several units was proposed by Hyvarinen [1999a]. The algorithm is based on the Newton-Raphson method, transformed into a fixed-point iteration. It is worth noting that the convergence is cubic (or at least quadratic).
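A minimal sketch of the one-unit fixed-point iteration is given below for two-dimensional data that are assumed to be already centred and whitened. It uses the nonlinearity $g(u) = \tanh(a_1 u)$, which is proportional to the derivative of $G_1$. The function name, the random starting direction and the stopping rule are assumptions made for this illustration rather than details taken from Hyvarinen [1999a].

```python
import math
import random


def fastica_one_unit(data, a1=1.0, max_iter=200, tol=1e-10, seed=0):
    """One-unit FastICA on whitened 2-D data with g(u) = tanh(a1 * u)."""
    rng = random.Random(seed)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    w = [math.cos(angle), math.sin(angle)]      # random unit starting vector
    n = len(data)
    for _ in range(max_iter):
        new = [0.0, 0.0]
        mean_dg = 0.0
        for x in data:
            u = w[0] * x[0] + w[1] * x[1]
            g = math.tanh(a1 * u)
            mean_dg += a1 * (1.0 - g * g) / n   # sample mean of g'(w'x)
            new[0] += x[0] * g / n              # sample mean of x g(w'x)
            new[1] += x[1] * g / n
        # Fixed-point step: w <- E{x g(w'x)} - E{g'(w'x)} w, then normalise.
        new = [new[k] - mean_dg * w[k] for k in range(2)]
        norm = math.hypot(new[0], new[1])
        new = [value / norm for value in new]
        # The direction is only identified up to sign.
        converged = abs(new[0] * w[0] + new[1] * w[1]) > 1.0 - tol
        w = new
        if converged:
            break
    return w
```

The returned unit vector plays the role of m above: the projection of the data on it is the estimate of one independent component, identified up to sign.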
The second method of solving the ICA problem exploits the time structure of the data set. This approach seems more natural for time series data, e.g. financial returns, as financial data exhibit GARCH effects. Independent components can therefore be separated by maximizing the autocorrelation of the squared returns (Broda and Paolella 2008). A fixed-point algorithm based on cross-cumulants was proposed by Hyvarinen et al. [2001]; its convergence is cubic. For details see Hyvarinen et al. [2001].
Broda and Paolella [2008] use the second algorithm; however, they suggest that one may use the first one if the second algorithm fails to converge, although this is rare.
They also compare three estimators of the matrix U: the ML estimator of van der Weide [2002], the NLS estimator of Boswijk and van der Weide [2006] and the ICA estimator of Broda and Paolella [2008]. The ML and NLS estimators are virtually unbiased whereas ICA shows a small bias. NLS and ICA are much more robust than ML as they do not depend on the specification of the factors. Unlike ML, ICA does not exhibit convergence problems. The estimation times for their data set differ greatly between the estimators: the ICA algorithm is 56 and 297 times faster than NLS and ML respectively. Taking into account all features (i.e. robustness, accuracy, reliability and speed), the ICA estimator looks very promising.
They also apply non-gaussian distributions to the components, using two special cases of the generalized hyperbolic distribution (i.e. the normal inverse gaussian and the hyperbolic). They also propose using the Asymmetric Power ARCH model for the components instead of GARCH(1,1). However, the problem with using the generalized hyperbolic distribution for a weighted sum of independent random variables lies in evaluating the cumulative distribution function, which is needed for calculating portfolio risk measures such as VaR or Expected Shortfall. This problem can be solved by saddlepoint approximation, a method that is not only extremely accurate but also computationally cheap. Their application example considers VaR forecasts for 3 equally weighted portfolios of 10 companies taken from the Dow Jones. The data span the period from 23/09/1992 to 23/03/2007. The observed violation rates of the VaR forecasts are 1.13% (4.48%) for the normal inverse gaussian distribution and 1.04% (3.98%) for the hyperbolic distribution at the nominal level of 1% (5%). The null hypothesis of correct coverage in the Kupiec test is not rejected, with a p-value of 0.54 (0.26) for the normal inverse gaussian distribution and 0.85 (0.02) for the hyperbolic distribution.
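For readers unfamiliar with the Kupiec test, the following sketch shows the standard proportion-of-failures likelihood ratio statistic and its asymptotic chi-square(1) p-value. The function name is hypothetical, and the counts used in the usage note are illustrative only; they are not the data of Broda and Paolella [2008].

```python
import math


def kupiec_pof(violations, observations, level):
    """Kupiec proportion-of-failures LR statistic and chi-square(1) p-value."""
    x, t, p = violations, observations, level
    rate = x / t

    def loglik(prob):
        # Binomial log-likelihood; a term with a zero count contributes zero.
        out = 0.0
        if x > 0:
            out += x * math.log(prob)
        if t - x > 0:
            out += (t - x) * math.log(1.0 - prob)
        return out

    lr = max(0.0, -2.0 * (loglik(p) - loglik(rate)))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value
```

With, say, 10 violations in 1000 forecasts at the 1% level the statistic is zero and the p-value is one, while 30 violations in 1000 forecasts lead to a clear rejection of correct coverage.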
Method of Moments
Boswijk and van der Weide [2009] propose another three-step method for estimating the GO-GARCH model, based on the method of moments. The method relies on the fact that the latent factors exhibit heteroscedasticity. All they assume about the factors is that they have persistence in variance and finite fourth moments. This method is very convenient as it does not require the optimization of an objective function. In the third step, univariate GARCH models are estimated for the latent factors.
The starting point for the derivation of their estimator is the matrix-valued processes $S_t = s_t s_t' - I_m$ and $F_t = f_t f_t' - I_m$, where $s_t = V^{-1/2}\varepsilon_t$, and in particular their autocorrelation properties. It is worth noting that the O-GARCH model of Alexander (2000) assumes that the standardized principal components $s_t^* = \Lambda^{-1/2}P'\varepsilon_t$ are independent, whereas here the components are conditionally uncorrelated, which is a weaker assumption. Let us define the autocorrelations $\rho_{ik} = corr(f_{it}^2, f_{i,t-k}^2)$ and the cross-covariances $\tau_{ijk} = cov(f_{it}^2, f_{i,t-k}f_{j,t-k})$. Another assumption states that for some integer p,
$$\min_{1\le i\le m}\ \max_{1\le k\le p}|\rho_{ik}| > 0, \quad \max_{1\le k\le p,\ 1\le i<j\le m}|\tau_{ijk}| = 0.$$
They define the autocovariance matrices as:
$$\Gamma_k(f) = E(F_t F_{t-k}), \quad k = 1, 2, \ldots$$
Taking into account all the assumptions, they end up with
$$\Gamma_k(f) = diag((\kappa_1 - 1)\rho_{1k}, \ldots, (\kappa_m - 1)\rho_{mk})$$
where $\kappa_i$ denotes the kurtosis of $f_{it}$. The autocorrelation matrix can then be defined as
$$\Phi_k(f) = \Gamma_0(f)^{-1/2}\,\Gamma_k(f)\,\Gamma_0(f)^{-1/2} = diag(\rho_{1k}, \ldots, \rho_{mk})$$
The autocovariance and autocorrelation matrices for $s_t = U f_t$ are:
$$\Gamma_k(s) = E(S_t S_{t-k}) = E(U F_t U' U F_{t-k} U') = U\,\Gamma_k(f)\,U'$$
$$\Phi_k(s) = \Gamma_0(s)^{-1/2}\,\Gamma_k(s)\,\Gamma_0(s)^{-1/2} = U\,\Phi_k(f)\,U'$$
The matrix U can be identified from the eigenvectors of $\Gamma_k(s)$ or $\Phi_k(s)$, as $\Gamma_k(f)$ and $\Phi_k(f)$ are diagonal and U is an orthogonal matrix.
The sample estimators of $\Gamma_k(s)$ and $\Phi_k(s)$ are given as follows:
$$\hat{\Gamma}_k(s) = \frac{1}{T}\sum_{t=k+1}^{T} S_t S_{t-k} = \frac{1}{T}\sum_{t=k+1}^{T} (s_t s_t' - I_m)(s_{t-k}s_{t-k}' - I_m)$$
$$\hat{\Phi}_k(s) = \hat{\Gamma}_0(s)^{-1/2}\,\hat{\Gamma}_k(s)\,\hat{\Gamma}_0(s)^{-1/2}$$
However, their experiment suggests that the most efficient estimator $\hat{U}_k$ uses a symmetric version of $\hat{\Phi}_k(s)$ (i.e. $\frac{1}{2}(\hat{\Phi}_k(s) + \hat{\Phi}_k(s)')$).
An even more efficient estimator $\hat{U}$ may be obtained by combining information from different lags. They therefore use the Cayley transform to derive the pooled estimator:
$$\hat{U} = \Big(I_m - \sum_{k=1}^{p} w_k (I_m - \hat{U}_k)(I_m + \hat{U}_k)^{-1}\Big)\Big(I_m + \sum_{k=1}^{p} w_k (I_m - \hat{U}_k)(I_m + \hat{U}_k)^{-1}\Big)^{-1}$$
where the weights $w_k$ can be chosen to be equal or to depend on the eigenvalues of $\frac{1}{2}(\hat{\Phi}_k(s) + \hat{\Phi}_k(s)')$; for details see Boswijk and van der Weide [2009].
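The role of the Cayley transform in pooling can be illustrated for m = 2, where every orthogonal matrix with determinant one is a plane rotation. The transform $(I - U)(I + U)^{-1}$ maps a rotation to a skew-symmetric matrix, weighted averages of skew-symmetric matrices remain skew-symmetric, and applying the same transform again returns an orthogonal matrix. All names in the sketch are hypothetical and the per-lag estimates are replaced by two arbitrary rotations.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def rot(theta):
    """2 x 2 rotation matrix over the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add2(a, b, sign=1.0):
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def cayley(m):
    """(I - M)(I + M)^{-1}; maps rotations to skew-symmetric matrices and back."""
    return mul2(add2(IDENTITY, m, -1.0), inv2(add2(IDENTITY, m)))


def pooled_rotation(estimates, weights):
    """Weighted average in the skew-symmetric domain, mapped back by Cayley."""
    avg = [[0.0, 0.0], [0.0, 0.0]]
    for u_k, w_k in zip(estimates, weights):
        s_k = cayley(u_k)
        avg = add2(avg, [[w_k * s_k[i][j] for j in range(2)] for i in range(2)])
    return cayley(avg)
```

Averaging the rotation matrices directly would in general not give an orthogonal matrix, which is the reason for pooling in the transformed domain.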
They examine the finite-sample performance of their estimator of the matrix U. To do this they follow the approach of Fan et al. [2008] by defining the square root $d(U, \hat{U})$ of a symmetric version of the distance measure $D(U, \hat{U})$ for orthogonal matrices. For details see Boswijk and van der Weide [2009]. They calculate the root mean square distance (RMSD) of $d(U, \hat{U})$ over 5000 Monte Carlo replications for different numbers of observations $T \in \{800, 1600, 3200, 6400\}$ and different values of $p \in \{1, 5, 10, 25, 50, 100, 200\}$. The eigenvalue-weighted estimator always performs better than the equally-weighted estimator. The optimal lag length is p = 50 (when all the components have finite kurtosis) or p = 100 (when some of the components do not have finite kurtosis), depending on the properties of the components. The larger the sample size, the higher the lag order needed.
The Maximum Likelihood (ML) estimator has a much smaller RMSD than the Method of Moments (MM) estimator. However, an important fact is that the MM estimator for a process in which some of the components do not have finite kurtosis (which violates one of the assumptions) behaves in the same way as for a process in which all the components have finite kurtosis. The efficiency gap between the ML and MM estimators is reduced when different GARCH specifications or non-Gaussian innovations are used for the components. When the dimension of the system increases, convergence problems are possible for the ML estimator, and the gap between the estimation times of ML and MM grows significantly.
They also perform two empirical applications to compare the ML and MM estimates. They first consider Dow Jones STOXX 600 European stock market sector indices. The data span the period from January 1987 to December 2007. They focus on a trivariate model of three sectors. They find that the estimates obtained for the matrix U as well as the GARCH parameters differ. The estimated variances and covariances are quite similar but the correlations seem to differ more. Generally speaking, more variation can be noticed in the series estimated by the ML method than in those estimated by the MM method. They then add another twelve sectors to the system and perform the above-mentioned estimation once again. The variances and covariances are similar. The conditional correlations display larger differences; however, their variation around the unconditional mean is small in the 15-variate model. All variances, covariances and correlations in the 15-variate model are much smoother than in the 3-variate model.
The second application examines the conditional correlations between the daily returns of American Airlines, SouthWest Airlines, Boeing, FedEx, crude oil and kerosene. They focus on the data from July 19, 2003 to August 12, 2008. They find that all correlations display the same pattern. The MM correlations show more variation than the ML correlations.
3.5.3 DCC of Engle
The Dynamic Conditional Correlation (DCC) model was proposed by Engle [2002]. This model belongs to a group of multivariate models that can be seen as nonlinear combinations of univariate GARCH models. The DCC model is one of the generalizations of the Constant Conditional Correlation (CCC) model of Bollerslev [1990]. Other DCC models are those of Tse and Tsui [2002] and Christodoulakis and Satchell [2002]. However, I will focus here on Engle's DCC model, which is defined as follows:
$$H_t = D_t R_t D_t$$
where
$$D_t = diag(h_{11t}^{1/2}, \ldots, h_{NNt}^{1/2})$$
and $h_{iit}$ can be specified as any univariate GARCH model,
$$R_t = diag(q_{11t}^{-1/2}, \ldots, q_{NNt}^{-1/2})\; Q_t\; diag(q_{11t}^{-1/2}, \ldots, q_{NNt}^{-1/2})$$
$Q_t = (q_{ijt})$ is the $N \times N$ symmetric positive definite matrix defined as:
$$Q_t = (1 - \alpha - \beta)\bar{Q} + \alpha u_{t-1}u_{t-1}' + \beta Q_{t-1}$$
where $u_{it} = \varepsilon_{it}/\sqrt{h_{iit}}$,
$\alpha$ and $\beta$ are non-negative scalars such that $\alpha + \beta < 1$,
$\bar{Q}$ is the $N \times N$ unconditional variance matrix of $u_t$.
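One step of the DCC correlation recursion can be sketched as follows. The function name and argument layout are assumptions made for this illustration; the standardised residuals $u_{t-1}$ are taken as given, i.e. the univariate GARCH models for $h_{iit}$ are assumed to have been fitted already.

```python
import math


def dcc_step(q_prev, u_prev, q_bar, alpha, beta):
    """One DCC update: returns (Q_t, R_t) from Q_{t-1} and u_{t-1}."""
    if alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need alpha, beta >= 0 and alpha + beta < 1")
    n = len(q_bar)
    q_t = [[(1 - alpha - beta) * q_bar[i][j]
            + alpha * u_prev[i] * u_prev[j]
            + beta * q_prev[i][j] for j in range(n)] for i in range(n)]
    # Rescale Q_t by its diagonal so that R_t has ones on the diagonal.
    scale = [1.0 / math.sqrt(q_t[i][i]) for i in range(n)]
    r_t = [[scale[i] * q_t[i][j] * scale[j] for j in range(n)] for i in range(n)]
    return q_t, r_t
```

The rescaling step is what turns $Q_t$, which is a covariance-like matrix, into a proper correlation matrix $R_t$; the same two scalars $\alpha$ and $\beta$ drive every element of $Q_t$.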
The main drawback of the model is that all conditional correlations follow the same dynamic structure. The number of parameters to be estimated equals $(N+1)(N+4)/2$, which is large when N is large (Bauwens et al. 2006). Therefore Engle proposes estimating the DCC model by a two-step procedure. This is possible because the conditional variance $H_t = D_t R_t D_t$ can be separated into a volatility part and a correlation part. Instead of using the likelihood function for all the coefficients, he suggests first replacing $R_t$ by the identity matrix. This leads to a quasi-log-likelihood function that is the sum of the log-likelihood functions of N univariate models. In the second step Engle estimates the parameters of $R_t$. This method produces consistent but not efficient estimators. It is possible to compare the log-likelihood function of the two-step procedure with that of the one-step procedure and of the other models. For details see Bauwens et al. 2006 and Engle 2002.
Engle performs a comparison of several correlation estimators. The data generating process is described by two GARCH models and by six different correlation functions. The simulation is performed 200 times for 1000 observations. Eight different models are used for estimating correlations: the moving average, exponential smoothing, the scalar BEKK, the diagonal BEKK, the Orthogonal GARCH, the DCC with integrated moving average estimation, the DCC by log-likelihood for the integrated model and the DCC by log-likelihood for the mean-reverting model. Three different measures are used for comparison. The first is the mean absolute error. The second is the autocorrelation test of the squared standardized residuals. The third is based on the estimator of VaR (i.e. Value at Risk) for a two-asset portfolio. For details see Engle [2002]. Overall the experiment shows that DCC models are very goo