Example 6.4: Forecast variance for the ETS(A,A,A) model (page 2)


 

It follows that

 

\[
\begin{aligned}
V_{n+h|n} ={}& (F_2 \otimes F_1)\, V_{n+h-1|n}\, (F_2 \otimes F_1)' \\
&+ \sigma^2 \left[ (F_2 \otimes F_1)\, V_{n+h-1|n}\, (G_2 \otimes G_1)' + (G_2 \otimes G_1)\, V_{n+h-1|n}\, (F_2 \otimes F_1)' \right] \\
&+ \sigma^2 (G_2 \otimes F_1 + F_2 \otimes G_1) \left[ V_{n+h-1|n} + \vec{M}_{h-1} \vec{M}_{h-1}' \right] (G_2 \otimes F_1 + F_2 \otimes G_1)' \\
&+ \sigma^4 (G_2 \otimes G_1) \left[ 3 V_{n+h-1|n} + 2\, \vec{M}_{h-1} \vec{M}_{h-1}' \right] (G_2 \otimes G_1)',
\end{aligned}
\]
where $\vec{M}_{h-1} = \operatorname{vec}(M_{h-1})$.

The forecast mean and variance are given by

\[
\mu_{n+h|n} = \mathrm{E}(y_{n+h} \mid x_n, z_n) = w_1' M_{h-1} w_2
\]

and

\[
\begin{aligned}
v_{n+h|n} &= \mathrm{V}(y_{n+h} \mid x_n, z_n)
           = \mathrm{V}\!\left[\operatorname{vec}\!\left(w_1' Q_{h-1} w_2 + w_1' Q_{h-1} w_2\, \varepsilon_{n+h}\right)\right] \\
&= \mathrm{V}\!\left[(w_2' \otimes w_1')\, \vec{Q}_{h-1} + (w_2' \otimes w_1')\, \vec{Q}_{h-1}\, \varepsilon_{n+h}\right] \\
&= (w_2' \otimes w_1') \left[(1+\sigma^2)\, V_{n+h-1|n} + \sigma^2\, \vec{M}_{h-1} \vec{M}_{h-1}'\right] (w_2 \otimes w_1) \\
&= (1+\sigma^2)(w_2' \otimes w_1')\, V_{n+h-1|n}\, (w_2' \otimes w_1')' + \sigma^2 \mu_{n+h|n}^2.
\end{aligned}
\]
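For readers who want to trace the algebra numerically, the following minimal numpy sketch (not from the book; the function names and signatures are mine) performs one step of the variance recursion above and then evaluates $\mu_{n+h|n}$ and $v_{n+h|n}$. It assumes that the mean matrix $M_{h-1}$ (the expectation of $Q_{h-1}$) and $V_{n+h-1|n}$ are already available from the recursions on the preceding page, and that the model matrices $F_1, F_2, G_1, G_2$, the weight vectors $w_1, w_2$, and $\sigma^2$ are given.

```python
import numpy as np

def class3_variance_step(V_prev, M_prev, F1, F2, G1, G2, sigma2):
    """One step of the Class 3 recursion: V_{n+h|n} from V_{n+h-1|n} and M_{h-1}."""
    A0 = np.kron(F2, F1)                      # F2 (x) F1
    A1 = np.kron(G2, F1) + np.kron(F2, G1)    # G2 (x) F1 + F2 (x) G1
    A2 = np.kron(G2, G1)                      # G2 (x) G1
    vecM = M_prev.reshape(-1, 1, order="F")   # vec(M_{h-1}), column-major
    MM = vecM @ vecM.T
    return (A0 @ V_prev @ A0.T
            + sigma2 * (A0 @ V_prev @ A2.T + A2 @ V_prev @ A0.T)
            + sigma2 * A1 @ (V_prev + MM) @ A1.T
            + sigma2 ** 2 * A2 @ (3 * V_prev + 2 * MM) @ A2.T)

def class3_forecast_moments(V, M, w1, w2, sigma2):
    """Forecast mean and variance of y_{n+h} from V = V_{n+h-1|n} and M = M_{h-1}."""
    w = np.kron(w2, w1)                       # w2' (x) w1'
    mu = float(w1 @ M @ w2)                   # mu_{n+h|n} = w1' M_{h-1} w2
    v = (1 + sigma2) * float(w @ V @ w) + sigma2 * mu ** 2
    return mu, v
```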

 

When $\sigma$ is sufficiently small (much less than 1), it is possible to obtain some simpler but approximate expressions. The second term in (6.31) can be dropped to give $M_{h-1} = F_1^{h-1} M_0 (F_2^{h-1})'$, and so
\[
\mu_{n+h|n} \approx w_1' F_1^{h-1} x_n \left(w_2' F_2^{h-1} z_n\right)'.
\]

The order of this approximation can be obtained by noting that the observation equation may be written as $y_t = u_{1,t}\, u_{2,t}\, u_{3,t}$, where $u_{1,t} = w_1' x_{t-1}$, $u_{2,t} = w_2' z_{t-1}$ and $u_{3,t} = 1 + \varepsilon_t$. Then

 

\[
\mathrm{E}(y_t) = \mathrm{E}(u_{1,t}\, u_{2,t}\, u_{3,t}) = \mathrm{E}(u_{1,t}\, u_{2,t})\, \mathrm{E}(u_{3,t}),
\]

 

because $u_{3,t}$ is independent of $u_{1,t}$ and $u_{2,t}$. Therefore, because $\mathrm{E}(u_{1,t} u_{2,t}) = \mathrm{E}(u_{1,t})\, \mathrm{E}(u_{2,t}) + \mathrm{Cov}(u_{1,t}, u_{2,t})$, we have the approximation
\[
\mu_{n+h|n} = \mathrm{E}(y_{n+h} \mid x_n, z_n) = \mathrm{E}(u_{1,n+h} \mid x_n)\, \mathrm{E}(u_{2,n+h} \mid z_n)\, \mathrm{E}(u_{3,n+h}) + O(\sigma^2).
\]

When $u_{2,n+h}$ is constant the result is exact. Now let
\[
\begin{aligned}
\mu_{1,h} &= \mathrm{E}(u_{1,n+h+1} \mid x_n) = \mathrm{E}(w_1' x_{n+h} \mid x_n) = w_1' F_1^{h} x_n, \\
\mu_{2,h} &= \mathrm{E}(u_{2,n+h+1} \mid z_n) = \mathrm{E}(w_2' z_{n+h} \mid z_n) = w_2' F_2^{h} z_n, \\
v_{1,h}   &= \mathrm{V}(u_{1,n+h+1} \mid x_n) = \mathrm{V}(w_1' x_{n+h} \mid x_n), \\
v_{2,h}   &= \mathrm{V}(u_{2,n+h+1} \mid z_n) = \mathrm{V}(w_2' z_{n+h} \mid z_n),
\end{aligned}
\]
and
\[
v_{12,h} = \mathrm{Cov}\!\left(u_{1,n+h+1}^2,\, u_{2,n+h+1}^2 \mid x_n, z_n\right)
         = \mathrm{Cov}\!\left([w_1' x_{n+h}]^2,\, [w_2' z_{n+h}]^2 \mid x_n, z_n\right).
\]
Then
\[
\mu_{n+h|n} = \mu_{1,h-1}\, \mu_{2,h-1} + O(\sigma^2) = w_1' F_1^{h-1} x_n\, w_2' F_2^{h-1} z_n + O(\sigma^2).
\]

 

By the same arguments, we have
\[
\mathrm{E}(y_t^2) = \mathrm{E}(u_{1,t}^2\, u_{2,t}^2\, u_{3,t}^2) = \mathrm{E}(u_{1,t}^2\, u_{2,t}^2)\, \mathrm{E}(u_{3,t}^2),
\]
and
\[
\begin{aligned}
\mathrm{E}(y_{n+h}^2 \mid x_n, z_n)
&= \mathrm{E}(u_{1,n+h}^2\, u_{2,n+h}^2 \mid x_n, z_n)\, \mathrm{E}(u_{3,n+h}^2) \\
&= \left[\mathrm{Cov}(u_{1,n+h}^2, u_{2,n+h}^2 \mid x_n, z_n) + \mathrm{E}(u_{1,n+h}^2 \mid x_n)\, \mathrm{E}(u_{2,n+h}^2 \mid z_n)\right] \mathrm{E}(u_{3,n+h}^2) \\
&= (1+\sigma^2)\left[v_{12,h-1} + (v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2)\right].
\end{aligned}
\]
Assuming that the covariance $v_{12,h-1}$ is small compared with the other terms, we obtain
\[
v_{n+h|n} \approx (1+\sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2) - \mu_{1,h-1}^2\, \mu_{2,h-1}^2.
\]

We now simplify these results for the ETS(M,Ad,M) case, where $x_t = (\ell_t, b_t)'$ and $z_t = (s_t, \dots, s_{t-m+1})'$, and the matrix coefficients are $w_1' = [1, \phi]$, $w_2' = [0, \dots, 0, 1]$,
\[
F_1 = \begin{bmatrix} 1 & \phi \\ 0 & \phi \end{bmatrix}, \qquad
F_2 = \begin{bmatrix} 0_{m-1}' & 1 \\ I_{m-1} & 0_{m-1} \end{bmatrix}, \qquad
G_1 = \begin{bmatrix} \alpha & \alpha\phi \\ \beta & \beta\phi \end{bmatrix}, \qquad\text{and}\qquad
G_2 = \begin{bmatrix} 0_{m-1}' & \gamma \\ O_{m-1} & 0_{m-1} \end{bmatrix}.
\]

Many terms will be zero in the formulae for the expected value and the variance because of the following relationships: $G_2^2 = O_m$, $w_2' G_2 = 0_m'$, and $(w_2' \otimes w_1')(G_2 \otimes X) = 0_{2m}'$, where $X$ is any $2 \times 2$ matrix. For the terms that remain, $w_2' \otimes w_1'$ and its transpose will only use the terms from the last two rows of the last two columns of the large matrices, because $w_2' \otimes w_1' = [0_{2m-2}',\, 1,\, \phi]$.
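These zero patterns are easy to confirm numerically. In the small sketch below, the seasonal period and the parameter values are arbitrary and serve only to build $w_1$, $w_2$, $F_2$ and $G_2$ as defined above:

```python
import numpy as np

m = 4                     # seasonal period, arbitrary for this check
gamma, phi = 0.2, 0.9     # arbitrary seasonal smoothing and damping parameters

w1 = np.array([1.0, phi])
w2 = np.r_[np.zeros(m - 1), 1.0]
F2 = np.zeros((m, m))
F2[0, -1] = 1.0
F2[1:, :-1] = np.eye(m - 1)
G2 = np.zeros((m, m))
G2[0, -1] = gamma
X = np.arange(4.0).reshape(2, 2)              # stands in for "any 2 x 2 matrix"

print(np.allclose(G2 @ G2, 0))                           # G2^2 = O_m
print(np.allclose(w2 @ G2, 0))                           # w2' G2 = 0_m'
print(np.allclose(np.kron(w2, w1) @ np.kron(G2, X), 0))  # (w2' (x) w1')(G2 (x) X) = 0_{2m}'
print(np.kron(w2, w1))                                   # [0, ..., 0, 1, phi]
```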

 

Using the small $\sigma$ approximations and exploiting the structure of the ETS(M,Ad,M) model, we can obtain simpler expressions that approximate $\mu_{n+h|n}$ and $v_{n+h|n}$. Note that $w_2' F_2^j G_2 = \gamma\, d_{j+1,m}\, w_2'$. So, for $h < m$, we have
\[
w_2' z_{n+h} \mid z_n = w_2' \prod_{j=1}^{h} \left(F_2 + G_2\, \varepsilon_{n+h-j+1}\right) z_n = w_2' F_2^{h} z_n = s_{n-m+h+1}.
\]

 

Furthermore,
\[
\mu_{2,h-1} = s_{n-m+h_m^+}
\qquad\text{and}\qquad
v_{2,h-1} = \left[(1+\gamma^2\sigma^2)^{h_m} - 1\right] s_{n-m+h_m^+}^2,
\]
where $h_m = \lfloor (h-1)/m \rfloor$ is the number of complete seasonal cycles before horizon $h$ and $h_m^+ = ((h-1) \bmod m) + 1$.



 

Also note that $x_n$ has the same properties as for ETS(M,Ad,N) in Class 2. Thus
\[
\mu_{1,h-1} = \ell_n + \phi_h b_n
\qquad\text{and}\qquad
v_{1,h-1} = (1+\sigma^2)\,\theta_h - \mu_{1,h-1}^2.
\]

 

Combining all of the terms, we arrive at the approximations
\[
\mu_{n+h|n} = \tilde{\mu}_{n+h|n}\, s_{n-m+h_m^+} + O(\sigma^2)
\]
and
\[
v_{n+h|n} \approx s_{n-m+h_m^+}^2 \left[\theta_h (1+\sigma^2)(1+\gamma^2\sigma^2)^{h_m} - \tilde{\mu}_{n+h|n}^2\right],
\]
where $\tilde{\mu}_{n+h|n} = \ell_n + \phi_h b_n$, $\theta_1 = \tilde{\mu}_{n+1|n}^2$, and
\[
\theta_h = \tilde{\mu}_{n+h|n}^2 + \sigma^2 \sum_{j=1}^{h-1} (\alpha + \beta\phi_j)^2\, \theta_{h-j}, \qquad h \ge 2.
\]
These expressions are exact for $h \le m$. The other cases of Class 3 can be derived as special cases of ETS(M,Ad,M).
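To make these approximations concrete, here is a short Python sketch (not from the book; the function and argument names are mine, and the state values and parameters would come from a fitted ETS(M,Ad,M) model). It evaluates $\tilde{\mu}_{n+h|n}$, the $\theta_h$ recursion, and then the approximate forecast mean and variance for a single horizon $h$:

```python
import numpy as np

def ets_madm_forecast_moments(level, trend, seasonals, alpha, beta, gamma,
                              phi, sigma2, h, m):
    """Approximate mean and variance of y_{n+h} for ETS(M,Ad,M).

    level, trend : l_n and b_n
    seasonals    : array [s_{n-m+1}, ..., s_n] (most recent index last)
    Exact for h <= m, approximate (small sigma) beyond that.
    """
    # phi_j = phi + phi^2 + ... + phi^j and mu_tilde_{n+j|n} = l_n + phi_j * b_n
    phis = np.cumsum(phi ** np.arange(1, h + 1))
    mu_tilde = level + phis * trend

    # theta_1 = mu_tilde_{n+1|n}^2,
    # theta_j = mu_tilde_{n+j|n}^2 + sigma^2 * sum_{i=1}^{j-1} (alpha + beta*phi_i)^2 * theta_{j-i}
    c = alpha + beta * phis
    theta = np.zeros(h)
    theta[0] = mu_tilde[0] ** 2
    for j in range(1, h):
        theta[j] = mu_tilde[j] ** 2 + sigma2 * np.sum(c[:j] ** 2 * theta[j - 1::-1])

    h_m_plus = (h - 1) % m + 1        # position of horizon h within the seasonal cycle
    h_m = (h - 1) // m                # number of completed seasonal cycles
    s = seasonals[h_m_plus - 1]       # s_{n-m+h_m^+}

    mu = mu_tilde[h - 1] * s
    v = s ** 2 * (theta[h - 1] * (1 + sigma2) * (1 + gamma ** 2 * sigma2) ** h_m
                  - mu_tilde[h - 1] ** 2)
    return mu, v
```

For $h = 1$ this reduces to $v_{n+1|n} = \sigma^2 \mu_{n+1|n}^2$, as it should for a one-step-ahead forecast with a multiplicative error.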

 

Derivation of $C_j$ Values

 

We first demonstrate that for Class 1 models, lead-time demand can be resolved into a linear function of the uncorrelated level and error components. Back-solve the transition equation (6.20) from period $n+j$ to period $n$, to give
\[
x_{n+j} = F^j x_n + \sum_{i=1}^{j} F^{j-i} g\, \varepsilon_{n+i}.
\]

 

Now from (6.19) and (6.20) we have
\[
\begin{aligned}
y_{n+j} &= w' x_{n+j-1} + \varepsilon_{n+j} \\
        &= w' F x_{n+j-2} + w' g\, \varepsilon_{n+j-1} + \varepsilon_{n+j} \\
        &\;\;\vdots \\
        &= w' F^{j-1} x_n + \sum_{i=1}^{j-1} w' F^{j-i-1} g\, \varepsilon_{n+i} + \varepsilon_{n+j} \\
        &= \mu_{n+j|n} + \sum_{i=1}^{j-1} c_{j-i}\, \varepsilon_{n+i} + \varepsilon_{n+j},
\end{aligned}
\]
where $c_k = w' F^{k-1} g$. Substituting this into (6.11) gives (6.15).



 

To derive the value of $C_j$ for the ETS(A,Ad,A) model, we plug the value of $c_i$ from Table 6.2 into (6.13) to obtain
\[
\begin{aligned}
C_j &= 1 + \sum_{i=1}^{j} (\alpha + \beta\phi_i + \gamma d_{i,m}) \\
    &= 1 + \alpha j + \beta \sum_{i=1}^{j} \phi_i + \gamma \sum_{i=1}^{j} d_{i,m} \\
    &= 1 + \alpha j + \frac{\beta\phi \left[(j+1)(1-\phi) - (1-\phi^{j+1})\right]}{(1-\phi)^2} + \gamma j_m,
\end{aligned}
\]
where $j_m = \lfloor j/m \rfloor$ is the number of complete seasonal cycles that occur within $j$ time periods.

 

A similar derivation for the ETS(A,A,A) model leads to
\[
C_j = 1 + \sum_{i=1}^{j} (\alpha + i\beta + \gamma d_{i,m}) = 1 + j\left[\alpha + \tfrac{1}{2}\beta(j+1)\right] + \gamma j_m.
\]
The expressions for $C_j$ for the other linear models are obtained as special cases of either ETS(A,Ad,A) or ETS(A,A,A) and are given in Table 6.6.
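The closed forms above can be checked against their defining sums directly. A minimal sketch (the parameter values are arbitrary and purely illustrative):

```python
import numpy as np

def cj_sum(alpha, beta, gamma, phi, m, j):
    """C_j from the defining sum 1 + sum_{i=1}^{j} (alpha + beta*phi_i + gamma*d_{i,m})."""
    i = np.arange(1, j + 1)
    phi_i = np.cumsum(phi ** i)          # phi_i = phi + phi^2 + ... + phi^i
    d = (i % m == 0).astype(float)       # d_{i,m} = 1 when i is a multiple of m
    return 1 + np.sum(alpha + beta * phi_i + gamma * d)

def cj_closed_ada(alpha, beta, gamma, phi, m, j):
    """Closed form for ETS(A,Ad,A)."""
    return (1 + alpha * j
            + beta * phi * ((j + 1) * (1 - phi) - (1 - phi ** (j + 1))) / (1 - phi) ** 2
            + gamma * (j // m))

def cj_closed_aaa(alpha, beta, gamma, m, j):
    """Closed form for ETS(A,A,A), where phi = 1 and hence phi_i = i."""
    return 1 + j * (alpha + 0.5 * beta * (j + 1)) + gamma * (j // m)

alpha, beta, gamma, m = 0.4, 0.1, 0.2, 4
for j in (1, 5, 13):
    print(cj_sum(alpha, beta, gamma, 0.9, m, j), cj_closed_ada(alpha, beta, gamma, 0.9, m, j))
    print(cj_sum(alpha, beta, gamma, 1.0, m, j), cj_closed_aaa(alpha, beta, gamma, m, j))
```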


 

Selection of Models

 

One important step in the forecasting process is the selection of a model that could have generated the time series and would, therefore, be a reasonable choice for producing forecasts and prediction intervals. As we have seen in Chaps. 2–4, there are many specific models within the general innovations state space model (2.12). There are also many approaches that one might implement in a model selection process. In Sect. 7.1, we will describe the use of information criteria for selecting among the innovations state space models. These information criteria have been developed specifically for time series data and are based on maximized likelihoods. We will consider four commonly recommended information criteria and one relatively new information criterion. Then, in Sect. 7.2, we will use the MASE from Chap. 2 to develop measures for comparing model selection procedures. These measures will be used in Sects. 7.2.2 and 7.2.3 to compare the five information criteria with each other, and with the commonly applied prediction validation method for model selection, using the M3 competition data (Makridakis and Hibon 2000) and a hospital data set. We also compare the results with the application of damped trend models for all time series. Finally, some implications of these comparisons will be given in Sect. 7.3.

 

 

7.1 Information Criteria for Model Selection

 

The goal in model selection is to pick the model with the best predictive ability on average. Finding the model with the smallest within-sample one-step-ahead forecast errors, or even the one with the maximum likelihood, does not assure us that the model will be the best one for forecasting.

 

One approach is to use an information criterion which penalizes the likelihood to compensate for the potential overfitting of data. The general form of the information criteria for an innovations state space model is

 

\[
\mathrm{IC} = -2 \log L(\hat{\theta}, \hat{x}_0 \mid y) + q\,\zeta(n), \qquad (7.1)
\]
where $L(\hat{\theta}, \hat{x}_0 \mid y)$ is the maximized likelihood function, $q$ is the number of parameters in $\hat{\theta}$ plus the number of free states in $\hat{x}_0$, and $\zeta(n)$ is a function of the sample size. Thus, $q\,\zeta(n)$ is the penalty assigned to a model for the number of parameters and states in the model. (We also require that the state space model has no redundant states; see Sect. 10.1, p. 149.) The information criteria that will be introduced in this chapter are summarized in Table 7.1.

Table 7.1. Penalties in the information criteria.

Criterion   ζ(n)              Penalty               Source
AIC         2                 2q                    Akaike (1974)
BIC         log(n)            q log(n)              Schwarz (1978)
HQIC        2 log(log(n))     2q log(log(n))        Hannan and Quinn (1979)
AICc        2n/(n − q − 1)    2qn/(n − q − 1)       Sugiura (1978)
LEIC        Empirical c       qc                    Billah et al. (2003)

For the Gaussian likelihood, we can drop the additive constants in $-2 \log L(\hat{\theta}, \hat{x}_0 \mid y)$ and replace the expression by $L(\theta, x_0)$ from (5.3) to obtain
\[
\mathrm{IC} = n \log\!\left(\sum_{t=1}^{n} \varepsilon_t^2\right) + 2 \sum_{t=1}^{n} \log\lvert r(x_{t-1})\rvert + q\,\zeta(n). \qquad (7.2)
\]
Recall from Chap. 5 that $\varepsilon_t = [\,y_t - w(x_{t-1})\,] / r(x_{t-1})$. Also, the likelihood function is based on a fixed seed state $x_0$. Not only is the fixed seed state critical for this form of the Gaussian likelihood in the nonlinear version, it is essential in both the linear and nonlinear case for comparing models that differ by a nonstationary state (see Chap. 12 for a discussion of this problem).
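To make (7.1) and (7.2) concrete, the following sketch (not from the book) computes the criteria of Table 7.1 from the within-sample residuals $\varepsilon_t$, the scaling factors $r(x_{t-1})$, the number of estimated parameters plus free states $q$, and the sample size $n$. The LEIC constant $c$ has to be determined empirically, so it is simply passed in:

```python
import numpy as np

def information_criteria(eps, r, q, c_leic=None):
    """IC = -2 log L + q * zeta(n), with the -2 log L part taken from (7.2):
    n * log(sum eps_t^2) + 2 * sum log|r(x_{t-1})|."""
    eps = np.asarray(eps, dtype=float)
    r = np.asarray(r, dtype=float)
    n = eps.size
    neg2loglik = n * np.log(np.sum(eps ** 2)) + 2 * np.sum(np.log(np.abs(r)))
    ic = {
        "AIC":  neg2loglik + 2 * q,
        "BIC":  neg2loglik + q * np.log(n),
        "HQIC": neg2loglik + 2 * q * np.log(np.log(n)),
        "AICc": neg2loglik + 2 * q * n / (n - q - 1),
    }
    if c_leic is not None:
        ic["LEIC"] = neg2loglik + q * c_leic
    return ic

# Illustrative call only: the residuals would come from a fitted model; for a model
# with additive errors r(x_{t-1}) = 1 for every t, and q = 4 might correspond to,
# e.g., two smoothing parameters plus two free initial states.
rng = np.random.default_rng(1)
print(information_criteria(rng.normal(size=50), np.ones(50), q=4))
```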

