
8.2 An example of backtesting: a stock portfolio VaR



Risk Management and Shareholders' Value in Banking

Figure 8.1 Evolution of a Diversified Stock Portfolio Value (actual portfolio values, 26/07/04–26/06/05)



Figure 8.2 Backtesting of a Parametric Simple Volatility Model (actual value change vs. VaR at 95 %, 26/07/04–26/06/05)



average criterion. To estimate volatility and, consequently, VaR, the data from the previous three months are used.

Note that the portfolio's daily loss exceeds VaR on only 10 of the 257 days considered, i.e., in 3.9 % of cases. This result appears fairly consistent with the desired confidence level. However, it must be pointed out that – notwithstanding a frequency of "exceptions" which is consistent with the 95 % confidence level – the estimation errors connected with these exceptions reach significant amounts in some cases, and sometimes (on 5 August 2004 and 14 April 2005) exceed 100 % of the estimated VaR.¹
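The backtest just described can be sketched in a few lines of code. This is a minimal illustration, not the book's own calculation: the simulated return series, the 63-day (roughly three-month) window, and the portfolio value are hypothetical assumptions.

```python
import numpy as np

def simple_vol_var(returns, value, window=63, z=1.645):
    """Rolling parametric VaR: each day's 95% VaR uses the standard deviation
    of the previous `window` daily returns (simple moving-average criterion)."""
    var = np.full(len(returns), np.nan)
    for t in range(window, len(returns)):
        sigma = returns[t - window:t].std()   # volatility from the prior 3 months only
        var[t] = z * sigma * value            # currency VaR, set at the end of day t-1
    return var

rng = np.random.default_rng(0)
rets = rng.normal(0.0, 0.01, 63 + 257)        # hypothetical daily returns
var = simple_vol_var(rets, value=6000.0)
losses = -rets * 6000.0                       # daily loss in currency units
exceptions = int(np.sum(losses > var))        # NaN comparisons count as False
print(exceptions, "exceptions out of 257 backtested days")
```

Counting the days on which the realized loss exceeds the previous day's VaR, and comparing that count with 5 % of the sample, is exactly the informal check performed in the text.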

Figure 8.3 Backtesting of a Parametric Exponential Volatility Model (actual value change vs. VaR at 95 % with λ = 0.94 and λ = 0.97, 26/07/2004–26/05/2005)



Figure 8.3 shows the evolution of daily VaR estimated using the variance–covariance approach, with volatility estimated through the exponential average criterion under two different decay factors: 0.94 and 0.97. Note that a lower lambda generates a VaR estimate which is more responsive to recent conditions and, therefore, more volatile: VaR will increase more rapidly in the presence of strongly negative or positive recent returns and, on the other hand, will decrease more quickly when daily price changes take small values. It follows that the model's ability to estimate portfolio risk will be better when large losses are preceded by other large losses, whereas it will be worse (compared to a model with a higher decay factor) when large losses follow relatively calm periods. In the specific case represented in Figure 8.3, VaR with a decay factor of 0.94 performs slightly better (9 exceptions out of 257 days) than VaR with a decay factor of 0.97 (10 exceptions), even if the two models' error rates are substantially similar (3.5 % vs. 3.9 %). Note, however, that a smaller decay factor may be harder to use when VaR serves as a risk limit for traders and as a tool to measure their performance, because it generates more volatile VaRs.²
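The exponentially weighted scheme lends itself to the same kind of sketch. The recursion below, σ²(t) = λσ²(t−1) + (1 − λ)r²(t−1), is the standard EWMA update; the return series and the way the recursion is seeded are hypothetical choices.

```python
import numpy as np

def ewma_var(returns, value, lam=0.94, z=1.645):
    """Parametric 95% VaR with EWMA volatility:
    sigma2 <- lam * sigma2 + (1 - lam) * r**2 after each observed return."""
    sigma2 = float(np.var(returns[:20]))          # hypothetical seed for the recursion
    out = []
    for r in returns:
        out.append(z * np.sqrt(sigma2) * value)   # forecast made before seeing day t
        sigma2 = lam * sigma2 + (1.0 - lam) * r * r
    return np.array(out)

rng = np.random.default_rng(1)
rets = rng.normal(0.0, 0.01, 257)
var94 = ewma_var(rets, 6000.0, lam=0.94)
var97 = ewma_var(rets, 6000.0, lam=0.97)
# The lower decay factor reacts faster, so its VaR series moves around more:
print(np.abs(np.diff(var94)).mean(), np.abs(np.diff(var97)).mean())
```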

Figure 8.4 shows the evolution of daily VaR estimated using the historical simulation model. In this case, for each day, VaR was set equal to the fifth percentile of a sample consisting of the value changes that the portfolio would have recorded – given its current composition – based upon the prices of the prior 6 months. Note that VaR obtained using the historical simulation method shows a peculiar trend over time, characterized by a certain stability interrupted by sudden "leaps". This is due to the fact that VaR is based upon the value of the loss at the percentile corresponding to the selected confidence level. This loss will remain constant until: (i) a higher loss replacing the previous one occurs, or (ii) this same loss "leaves" the estimation sample. The figure shows 11 exceptions, with a maximum error (on 14 April 2005) of 119 % of VaR. So, the result obtained by applying this approach turns out to be fairly similar to the one connected with the variance–covariance approach based upon simple averages.

¹ As will be better clarified below, although this may be a considerable problem from a risk-management point of view, it cannot be used as a parameter to evaluate a VaR model. Indeed, these models only express the probability that a given threshold will be exceeded, and say nothing about the size of the excess losses.

² The excessive volatility of VaR measures may make it difficult to introduce an effective risk-limits system. Indeed, a trader who is aware of this volatility would tend to underuse the risk capacity assigned to him/her for fear of a sudden increase in his/her positions' VaR.

Figure 8.4 Backtesting of a Historical Simulation Model (actual value change vs. historical VaR at 95 %, 26/07/2004–26/05/2005)

The consistency between these two results is justified by the nature of the distribution of portfolio returns which – as highlighted in Figure 8.5 – looks reasonably like a normal. As a consequence, in a portfolio like the one being analysed – which, moreover, is also characterized by a linear payoff – the typical advantages of historical simulations (full valuation and a return distribution not tied to any known random variable) are reduced.
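A rolling historical-simulation VaR of this kind can be sketched as follows. With 126 observations (roughly six months) and a 95 % level, one common discretization takes the 7th-worst value change as the loss percentile; the window length, that choice of order statistic, and the data are hypothetical assumptions.

```python
import numpy as np

def historical_var(value_changes, window=126, k=7):
    """Rolling historical-simulation VaR at 95%: minus the k-th worst value
    change observed over the prior `window` days (k = 7 approximates the
    5th percentile of 126 observations)."""
    var = np.full(len(value_changes), np.nan)
    for t in range(window, len(value_changes)):
        sample = np.sort(value_changes[t - window:t])
        var[t] = -sample[k - 1]               # k-th smallest change = k-th worst loss
    return var

rng = np.random.default_rng(2)
pnl = rng.normal(0.0, 30.0, 126 + 257)        # hypothetical daily value changes
var = historical_var(pnl)
# The series stays flat until a new large loss enters the window, or an old
# one drops out of it: the "leaps" visible in Figure 8.4.
flat_days = int(np.sum(np.diff(var[126:]) == 0))
print(flat_days, "days on which the VaR estimate did not move")
```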

Finally, Figure 8.6 jointly shows the evolution of daily VaR measured using three of the criteria illustrated above: variance–covariance with a simple moving average, variance–covariance with an exponential moving average and a lambda of 0.94, and historical simulation. Note that the second approach shows much more marked VaR variability. Table 8.1 reports some performance measures relating to the three approaches.

The examples we have just seen referred to a simplified portfolio and a limited time horizon. If we want to generalize, reference can be made to a major empirical study conducted by Darryl Hendricks (1996). The author compared 4 historical VaR models (with time horizons of 125, 250, 500 and 1,250 days, respectively) with 8 variance–covariance VaR models, of the simple average (calculated over 50, 125, 250, 500 and 1,250 days) and exponential average type






Figure 8.5 Empirical Distribution of Stock Portfolio Returns (histogram of daily returns, −1.38 % to +1.55 %; y-axis: % of cases in the sample)



Figure 8.6 The Results of Three Different VaR Models (daily VaR from the exponential model with λ = 0.94, the simple-average model, and historical simulation, 26/07/2004–26/05/2005)



(with λ of 0.94, 0.97 and 0.99). A thousand different currency portfolios, each consisting of 8 randomly weighted currencies, were generated. The daily VaR of each of these 1,000 portfolios was calculated using each of the 12 different models for approximately 12 years (3,000 daily observations, from January 1983 to December 1994), with two different confidence levels (95 % and 99 %). In total, 72,000,000 individual VaR estimates were generated.




Table 8.1 Performance of different VaR estimation approaches

                                 Number of     Maximum error as a
                                 exceptions    percentage of estimated VaR
Parametric (simple average)          11                 109 %
Parametric (λ = 0.94)                 9                 115 %
Parametric (λ = 0.97)                10                 110 %
Historical simulation                11                 119 %



Table 8.2 shows the results of this study. In particular, it reports the percentage of days on which losses did not exceed VaR, and the mean ratio of actual loss to VaR in those cases in which the former exceeded the latter. With reference to this second ratio, the last row in the table shows the theoretical value it should take if the distribution of market-factor returns were normal and VaR were correctly estimated.
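These reference values follow from the normal distribution: if losses are normal, the expected loss beyond VaR equals φ(z)/(1 − c) for the c-confidence quantile z, so the ratio to VaR is φ(z)/((1 − c)·z). A quick check, using only the standard library:

```python
from statistics import NormalDist

def normal_tail_ratio(conf):
    """E[loss | loss > VaR] / VaR for normally distributed losses,
    equal to pdf(z) / ((1 - conf) * z) with z the conf-quantile."""
    nd = NormalDist()
    z = nd.inv_cdf(conf)
    return nd.pdf(z) / ((1 - conf) * z)

print(round(normal_tail_ratio(0.95), 3))   # 1.254
print(round(normal_tail_ratio(0.99), 3))   # matches Table 8.2's 1.145 up to rounding
```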

Table 8.2 Empirical results of Hendricks's study (1996)

                                    Number of non-exceptions/    Excess loss/VaR
Methodology                         total days (% ratio)         mean ratio
Confidence level                       95 %       99 %           95 %      99 %
simple average 50 days                 94.8       98.3           1.41      1.46
simple average 125 days                95.1       98.4           1.38      1.44
simple average 250 days                95.3       98.4           1.37      1.44
simple average 500 days                95.4       98.4           1.38      1.46
simple average 1,250 days              95.4       98.5           1.36      1.44
exponential average (λ = 0.94)         94.7       98.2           1.41      1.44
exponential average (λ = 0.97)         95.0       98.4           1.38      1.42
exponential average (λ = 0.99)         95.4       98.5           1.35      1.40
Historical simulation 125 days         94.4       98.3           1.48      1.48
Historical simulation 250 days         94.9       98.7           1.43      1.37
Historical simulation 500 days         94.8       98.8           1.44      1.37
Historical simulation 1,250 days       95.1       99.0           1.41      1.30
Normal distr. reference value            –          –            1.254     1.145

Source: Hendricks (1996).



As far as the first ratio is concerned, the models perform well altogether, as they come close to the theoretical value indicated by the confidence level. However, a review of the second ratio shows that excess losses are on average much higher than the loss expected assuming normality (for the 99 % confidence level, even two or three times as high). This implies that the difference between the normal distribution and the actual data distribution is particularly pronounced "beyond VaR", i.e., in the extreme tails of the distribution.

The examples illustrated so far have highlighted that a correct evaluation of the quality of a VaR model should be based upon two different aspects:

√ the consistency between the number of exceptions, i.e., the number of days on which losses exceed estimated VaR, and the confidence level adopted for VaR estimation;
√ the "size" of the exceptions, i.e., the value of the loss in excess of the VaR measure.

As will be noted in Chapter 19, the backtesting methodology proposed by the Basel

Committee to evaluate the quality of a VaR model is based solely upon the first of these

two criteria. From this point of view, the illustrated example showed that – the number of

exceptions being equal – the performance of alternative models can differ considerably

depending on the size of the loss.

When presenting our backtesting examples, for the sake of simplicity, we did not linger over how to construct the daily profit-and-loss measure with which the VaR estimated on the previous day should be compared. For this purpose, there are three alternative solutions:

(1) the P&L (capital gains or losses) arising from the portfolio positions actually bought and sold by the bank during the day;
(2) the P&L obtained by revaluing the portfolio held by the bank at the end of the day under the new market conditions at that time;
(3) the P&L obtained by revaluing the portfolio held by the bank at the end of the previous day under the new market conditions at the end of the day.

The first solution is clearly inadequate. Basing a valuation only on actually realized P&L (and not also on the P&L implied in the new market values of portfolio positions) would run against the very mark-to-market logic on which the whole trading activity is based. The second solution would be "sullied" by the changes in the composition of the bank's portfolio which occurred during the day and which could obviously not have been predicted by the VaR model on the previous evening. The third solution, referred to as "static P&L", is the most appropriate one, because it compares VaR with a more homogeneous P&L result. As a matter of fact, a portfolio's VaR is generally estimated at the end of a trading day based upon the portfolio's composition at that time: in this sense, it does not incorporate the trading activity which will be performed on the following day. The potential loss for the following day is estimated assuming that the portfolio's composition will remain unchanged. The quality of this estimate (i.e., its ability to predict the impact that changes in market conditions will have on the portfolio) must therefore be evaluated "with constant composition".³

³ Although the third solution is theoretically preferable for evaluating the quality of a VaR model, the most relevant losses from a management perspective are obviously those captured by the second solution, which also considers intraday trading, and therefore the changes in the composition of the portfolio during the day.
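The difference between these definitions can be made concrete with a toy example. The holdings and prices below are hypothetical; `static_pnl` implements solution (3), while `actual_pnl` sketches solution (2) (for simplicity it ignores the cash proceeds of intraday trades).

```python
import numpy as np

def static_pnl(qty_prev, prices_prev, prices_now):
    """Solution (3): revalue yesterday's closing portfolio at today's prices."""
    return float(np.dot(qty_prev, prices_now - prices_prev))

def actual_pnl(qty_prev, qty_now, prices_prev, prices_now):
    """Solution (2), simplified: change in the value of the portfolio actually
    held, contaminated by intraday trades (cash proceeds ignored)."""
    return float(np.dot(qty_now, prices_now) - np.dot(qty_prev, prices_prev))

qty_prev = np.array([100.0, 50.0])            # positions at yesterday's close
qty_now  = np.array([80.0, 50.0])             # 20 shares of stock 1 sold intraday
p_prev   = np.array([10.0, 20.0])
p_now    = np.array([9.0, 21.0])

print(static_pnl(qty_prev, p_prev, p_now))    # -50.0: the figure to compare with VaR
print(actual_pnl(qty_prev, qty_now, p_prev, p_now))
```

Only the static P&L is computed on the same ("constant composition") portfolio on which the previous day's VaR was estimated, which is why it is the appropriate backtesting benchmark.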






8.3 ALTERNATIVE VaR MODEL BACKTESTING TECHNIQUES

In the examples in Figures 8.1 to 8.6, we informally concluded that 9, 10 or 11 daily exceptions over a year of 257 trading days is relatively satisfactory, because it is consistent with the 95 % confidence level. This conclusion, however, was not supported by any measure of statistical significance.

The problem can be expressed in these terms:

(i) what is the maximum percentage of exceptions that is consistent with the model's confidence level?
(ii) what is the minimum percentage of exceptions beyond which it must be concluded that the model is not good (and, in particular, that the bank is exposed to higher risks than indicated by VaR)?

The answers to these questions also depend on the number of available observations. Consider the case of a VaR model with a 99 % confidence level: if this model is tested on 100 observations and 2 exceptions (2 %) are obtained, it can hardly be concluded with any certainty that it is incorrect. On the one hand, the percentage of exceptions is twice as high as expected (2 % instead of 1 %); on the other hand, the error is small, as it is due to a single extra exception. The situation would be very different – and it would be possible to state with greater certainty that the model is incorrect – if 200 exceptions out of 10,000 observations had been recorded.

To answer the above-mentioned questions, numerous statistical tests were proposed during the second half of the nineties. In that period, evaluating the quality of a VaR model through backtesting gained in importance, for two reasons:

√ the growing spread of VaR models as market risk management and measurement tools;
√ the possibility granted by the Basel Committee to banks of using their own models to determine the market risk capital requirement, and the related need for supervisory authorities to "validate" these models.

Tests can be divided into three main categories:

(i) tests based upon the frequency of exceptions (see sections 8.3.1–8.3.2);

(ii) tests based upon a loss function (see section 8.3.3);

(iii) tests based upon the entire profit and loss distribution (see section 8.3.4).

The first type of tests (some examples of which will be presented in sections 8.3.1–8.3.2) is based upon the same logic as adopted in the previous section, i.e., a comparison between the number of days on which the loss exceeded VaR and the relevant confidence level. The second type (an example of which will be given in section 8.3.3), conversely, considers not only the frequency but also the size of losses, in the belief that both the bank and the supervisory authorities have an interest in minimizing these "excess losses". The third type, rather than focusing on excess-loss values only, compares the entire distribution of the value changes predicted by the VaR model with the actual trading profits and losses.

If the tested hypothesis (the "null hypothesis") is rejected, the losses experienced by the bank are not consistent with the VaR model's hypotheses and, consequently, the latter must be considered inaccurate. If, on the contrary, the null hypothesis cannot be rejected, then the model can be regarded as acceptably accurate.

For these evaluation methods, as in any hypothesis test, there are two types of errors:

type I (rejecting the null hypothesis when it is correct), and type II (accepting the null

hypothesis when it is false). When we select a test for risk management purposes, we

are strongly interested in its ability to reject the null hypothesis when this is incorrect,

i.e., in its ability to minimise the type II error (“power” of the test); this is because we

want to avoid classifying an inaccurate model as accurate. As we will see, since there is

a trade-off between the two errors (when the first one is minimized, the second one will

increase, and vice versa), we will be ready to accept a fairly high margin of type I error,

unlike what happens in many classical statistical tests.

8.3.1 The unconditional coverage test

Among the first to propose formal statistical tests for analyzing the quality of a VaR model was Paul Kupiec (1995). His test – also referred to as the "proportion of failures" test – is based upon reviewing how frequently portfolio losses exceed VaR. In practice, the hypothesis to be empirically tested (the "null hypothesis") is that the frequency of empirical exceptions, π, is consistent with the desired "theoretical" one, α, i.e., that the exception rate implied in the values observed upon backtesting is actually α (in brief, that π is "covered" by α). This test is not conditional upon any further hypotheses (for instance, about the sequence of occurrence of errors over time), and is therefore called an "unconditional coverage" test.⁴ The alternative to the null hypothesis is that the exception rate implied in the values observed upon backtesting is higher than α (i.e., that the model is underestimating the risk of extreme losses).⁵

If the null hypothesis is correct (i.e., if the probability of observing an exception is actually α), then the probability of observing x exceptions (days on which the loss exceeds VaR) in a sample of N observations (with an exception rate π ≡ x/N) is given by the binomial distribution:

prob(x | α, N) = C(N, x) · α^x · (1 − α)^(N−x)    (8.1)

where C(N, x) = N! / [(N − x)! x!] is the binomial coefficient and, as usual, a! denotes the factorial of the integer a.

Considering, for instance, a sample of 250 daily observations relating to a VaR model with a 99 % confidence level, the probability of obtaining x exceptions will be given by:

prob(x | 1 %, 250) = C(250, x) · 0.01^x · 0.99^(250−x)    (8.1.b)



⁴ As will be seen in more detail below, other methods allow testing more sophisticated hypotheses and, for this reason, are known as conditional coverage tests.
⁵ For an empirical analysis based upon unconditional coverage, see Saita and Sironi (2002), who reviewed some alternative VaR models applied to international stock portfolios.






So, for instance, the probability of obtaining 4 exceptions will be given by:

prob(4 | 1 %, 250) = [(250 · 249 · 248 · 247) / (4 · 3 · 2 · 1)] · 0.01^4 · 0.99^246 = 0.134 = 13.4 %

while the probability of obtaining 2 exceptions will be:

prob(2 | 1 %, 250) = [(250 · 249) / (2 · 1)] · 0.01^2 · 0.99^248 = 0.257 = 25.7 %

In this way, the probability associated with any number of exceptions can be calculated: Table 8.3 (column 1) and Figure 8.7 show these probabilities, together with the corresponding probabilities of making a type I error by rejecting the model as incorrect. We remind the reader that these distributions are valid if the null hypothesis is true, i.e., if the probability of observing an excess loss is actually equal to α (and, therefore, the VaR model is accurate).

Table 8.3 Probabilities associated with exceptions and type I errors

 x     (1) prob(x)    (2) Σ prob(x)    (3) 1 − Σ prob(x)
 0        8.1 %           8.1 %             91.9 %
 1       20.5 %          28.6 %             71.4 %
 2       25.7 %          54.3 %             45.7 %
 3       21.5 %          75.8 %             24.2 %
 4       13.4 %          89.2 %             10.8 %
 5        6.7 %          95.9 %              4.1 %
 6        2.7 %          98.6 %              1.4 %
 7        1.0 %          99.6 %              0.4 %
 8        0.3 %          99.9 %              0.1 %
 9        0.1 %         100.0 %              0.0 %
10        0.0 %         100.0 %              0.0 %

So, if the model is correct, the probability of observing a number of exceptions equal to or lower than 4 is 89.2 % (see column 2 in the table and the lighter area in the figure). It follows that the probability of having more than 4 exceptions is 10.8 % (column 3 in the table and the shaded area in the figure, which corresponds to the probabilities highlighted in the shaded cells of the table).

So, if we followed the rule of rejecting the null hypothesis (and therefore considering the model incorrect) whenever more than four exceptions occur, we would run into a possible type I error (rejecting a correct model) in 10.8 % of cases. If, conversely, a more "tolerant" rule were adopted, and the model were only rejected if there are more




Figure 8.7 Example of Binomial Distribution (x-axis: number of exceptions x, from 0 to 10; y-axis: frequency, 0–30 %)



than six exceptions, then the risk of rejecting a correct model would become virtually zero (to be precise, equal to 1.4 % of cases, as shown in the third column of the table).

Since the error we are most worried about is not so much rejecting a correct model, but rather trusting an incorrect one (type II error), and since there is a trade-off between the two types of error (as one decreases, the other increases), we prefer a rule exposing us to a considerable type I error. Of the two above-mentioned thresholds, the one providing for a maximum of 4 exceptions will therefore be preferable (because it is more "virtuous") to the one accepting 6.

The rules provided for by the Basel Committee are inspired by this logic. In particular (as explained in the Appendix to this chapter), up to 4 exceptions, the model is considered to be of good quality (a "green area", indicating that the thresholds considered virtuous and reassuring are met); up to 9 exceptions, the model is considered only partially adequate ("yellow area"); with 10 or more exceptions, the model is considered inaccurate ("red area").

Let us now turn to a proper inferential test. The consistency between the actual exception rate recorded by backtesting (π ≡ x/N) and the theoretical exception rate under a correct model (α) can be assessed through a classical likelihood ratio test.

This type of test is based upon the ratio between two likelihood functions. The first one is unconstrained: the probability of obtaining an exception is simply set equal to the exception rate observed in the sample (which, having been observed directly, represents the most likely value in the light of the backtesting results):

L(x | π) = π^x · (1 − π)^(N−x)    (8.2)






The other likelihood function, conversely, is tied to compliance with the null hypothesis: regardless of the observed value π, the probability of an exception is therefore set equal to α:

L(x | π = α) = α^x · (1 − α)^(N−x)    (8.3)

If π is not significantly different from α, this function will take values very similar to the former. As a consequence, the following statistic, based upon the logarithm of their ratio:

LRuc(α) = −2 ln { [α^x · (1 − α)^(N−x)] / [π^x · (1 − π)^(N−x)] }    (8.4)

will take values close to zero.⁶ If, conversely, π is significantly different from α, (8.4) will take high positive values.

Let us consider the case – reviewed previously – of a VaR model with a 99 per cent confidence level (α = 1 %) which is backtested for N = 250 days. Let us assume that the detected number of exceptions is 4 (so that π ≡ x/N = 4/250 = 1.6 %). In this case, (8.4) will take a value higher than zero, equal to:

LRuc = −2 ln { [0.01^4 · 0.99^246] / [0.016^4 · 0.984^246] } ≈ 0.77

To understand whether this value should be considered too far from zero, it is useful to know that, if the null hypothesis (π = α) is correct, the LRuc statistic is distributed according to a chi-square distribution with 1 degree of freedom:

LRuc ∼ χ²(1)    if π = α



So, it is possible to:

– determine a threshold value which results in a sufficiently high type I error (a small type I error would imply a high type II error); for instance, a value of 2.7055 could be selected as the threshold, since a chi-square with one degree of freedom generates values above 2.7055 (see Figure 8.8) in only 10 % of cases;
– reject the null hypothesis (declaring the model inadequate) only if LRuc is above the threshold (if the null hypothesis is correct, this will occur in only 10 % of cases, which is as much as to say that the type I error is 10 %).

In the example we have just seen, since 0.77 does not exceed 2.7055, we are led not to reject the null hypothesis, and therefore to consider the backtested VaR model a good one. If, conversely, the value of the LRuc statistic were higher than 2.7055, then the model could be "rejected". If we were ready to accept a higher type I error (which, in general, corresponds to a more modest type II error), we would set the threshold at a lower level: for instance, a threshold of 0.4549 would generate a type I error in 50 % of cases (see Figure 8.8 again) and – given an LRuc value of 0.77 – would lead to rejecting the VaR model. If the threshold were set exactly at 0.77, the type I error would be 38 %: we can therefore conclude that, in the presence of an LRuc value of 0.77, the

⁶ The LRuc statistic (where uc stands for "unconditional coverage") represents Kupiec's test.
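Kupiec's statistic and its p-value can be reproduced with a few lines of standard-library Python; the χ²(1) survival function is written via the complementary error function. This is an illustrative sketch reproducing the worked example above (it assumes 0 < x < N):

```python
from math import erfc, log, sqrt

def kupiec_lruc(x, n, alpha):
    """Unconditional coverage statistic of eq. (8.4); assumes 0 < x < n."""
    pi = x / n
    log_null = x * log(alpha) + (n - x) * log(1 - alpha)
    log_free = x * log(pi) + (n - x) * log(1 - pi)
    return -2.0 * (log_null - log_free)

def chi2_1_sf(stat):
    """P(chi-square with 1 dof > stat): the type I error of rejecting at `stat`."""
    return erfc(sqrt(stat / 2.0))

lr = kupiec_lruc(4, 250, 0.01)
print(round(lr, 2))                 # 0.77, as in the worked example
print(round(chi2_1_sf(lr), 2))      # 0.38: the 38% type I error quoted in the text
print(round(chi2_1_sf(2.7055), 2))  # 0.10: the conventional 10% threshold
```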


