1 Understanding Path Protection, Path Restoration and Path Segments
8.5 Dual Failure Availability Analysis for SBPP Networks
In Chapter 6 we covered SBPP-based network design, including control of the design to limit the maximum number of sharing
dependencies on any one spare channel. [4] The limiting technique was motivated by a general recognition that doing so would enhance the
dual failure restorability and hence availability. Now we take the availability-related consideration of SBPP further by identifying the
different scenarios that can lead to SBPP outage.
[4] An extended version of this section was submitted for publication in Optical Networks Magazine [DoCl03].
A case-by-case analysis of how dual failures affect SBPP is summarized in Table 8-3. The categories are based on the location of each
span failure relative to the primary path P1, and backup route P2, of an SBPP service path setup. There are three possibilities for the
location of any single-span failure relative to a specific service path. The failure can be on the primary route of the service path, it can be
on the backup route, or it can be on neither the primary route nor the secondary route (denoted "other"). The last two columns list the
effects of each combination on the availability of the service path considered for both SBPP and, for comparison, 1+1 APS.
Simple reasoning can explain most of the cases, except the seventh category, which is also important because it is the only one for which
the effect differs between SBPP and 1+1 APS. The situation is that an initial failure affects neither the primary service path nor its backup
path, but a second span failure strikes the primary path. In 1+1 APS, the backup path is dedicated to the protection of the primary path, so
as long as it is not itself affected by a failure it is guaranteed to be available. But for SBPP, because spare capacity is shared, there is no
guarantee that the spare capacity will be available along an entire backup path if there has already been a span failure elsewhere. For
instance, the initial failure might have caused some other service path A1 to switch to its own backup path A2, which might share some
capacity with the backup path B2 used to restore failure of primary service path B1. In this case, a detailed inspection of all the paths
affected by the first failure, and of the capacity that their backup paths are subsequently consuming, is needed to conclude whether a
primary service path affected by the second span failure is left exposed to outage or not. To do this exactly requires a
computer-experimental approach, which we describe next. Note in passing, however, that clearly the availability of 1+1 APS is an upper
bound on the availability of paths with SBPP.
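Table 8-3. How Dual Failures Affect Shared Backup Path Protection (SBPP)

| Failure Category | Failure 1 | Failure 2 | Outage causing? (SBPP)     | Outage causing? (1+1 APS) |
|------------------|-----------|-----------|----------------------------|---------------------------|
| 1                | P1        | P1        | No                         | No                        |
| 2                | P1        | P2        | Yes                        | Yes                       |
| 3                | P1        | other     | No                         | No                        |
| 4                | P2        | P1        | Yes                        | Yes                       |
| 5                | P2        | P2        | No                         | No                        |
| 6                | P2        | other     | No                         | No                        |
| 7                | other     | P1        | depends on sharing details | No                        |
| 8                | other     | P2        | No                         | No                        |
| 9                | other     | other     | No                         | No                        |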
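The case analysis of Table 8-3 can be expressed as a small lookup. The following is a minimal sketch, not code from the study: the function names and the representation of routes as sets of span identifiers are assumptions made for illustration, and the primary and backup routes are assumed to be span-disjoint, as SBPP requires.

```python
# Outcomes keyed by (location of failure 1, location of failure 2),
# giving (SBPP outcome, 1+1 APS outcome) exactly as listed in Table 8-3.
TABLE_8_3 = {
    ("P1", "P1"): ("No", "No"),
    ("P1", "P2"): ("Yes", "Yes"),
    ("P1", "other"): ("No", "No"),
    ("P2", "P1"): ("Yes", "Yes"),
    ("P2", "P2"): ("No", "No"),
    ("P2", "other"): ("No", "No"),
    ("other", "P1"): ("depends on sharing details", "No"),
    ("other", "P2"): ("No", "No"),
    ("other", "other"): ("No", "No"),
}


def locate(span, primary, backup):
    """Classify one failed span relative to a service path (hypothetical helper)."""
    if span in primary:
        return "P1"
    if span in backup:
        return "P2"
    return "other"


def dual_failure_outcome(first, second, primary, backup):
    """Return (SBPP, 1+1 APS) outage outcomes for an ordered dual failure."""
    key = (locate(first, primary, backup), locate(second, primary, backup))
    return TABLE_8_3[key]


# Example: a path with primary spans {a, b} and backup spans {c, d}.
primary, backup = {"a", "b"}, {"c", "d"}
print(dual_failure_outcome("a", "c", primary, backup))  # category 2
print(dual_failure_outcome("x", "a", primary, backup))  # category 7
```

Only category 7 needs information beyond the path itself, which is why the exact analysis has to track spare capacity usage across all paths.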
8.5.1 Experimental Comparison of SBPP and Span Restoration
The determination of dual-failure restorability for SBPP requires as input the set of demands, the routes for primary and backup for each
demand, and the total spare capacity on each span (arising from the SBPP design model). The two failures are denoted a and b. Step 1
checks the restorability of service paths to the failure of the first failed span. For each primary affected by the first failure, we activate the
corresponding backup route and seize the required spare capacity. Step 2 considers activated backup routes that may have been hit by
the second failure. Under SBPP, such demands are failed. Step 3 then considers whether the primaries affected
by failure b can activate their backup paths given the spare capacity already seized in step 1.
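The three steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in [DoCl03]: the data layout (a list of dicts with `primary`, `backup`, and `units`), the function name, the processing of demands in list order, and the choice to leave spare capacity seized by demands that later fail are all assumptions.

```python
def sbpp_dual_failure(demands, spare, a, b):
    """Return (affected_units, surviving_units) for span failure a followed by b.

    demands: list of dicts with 'primary' (set of spans), 'backup' (list of
             spans, span-disjoint from the primary) and 'units' (int).
    spare:   dict mapping span -> spare channels from the SBPP design.
    """
    free = dict(spare)  # remaining spare channels per span
    affected = survived = 0

    def seize(route, units):
        # Take as many units as every span on the backup route can still supply.
        got = min([units] + [free[s] for s in route])
        for s in route:
            free[s] -= got
        return got

    # Step 1: primaries hit by failure a activate their backups.
    on_backup = {}
    for idx, d in enumerate(demands):
        if a in d["primary"]:
            affected += d["units"]
            on_backup[idx] = seize(d["backup"], d["units"])

    # Step 2: activated backups that are hit by failure b are failed.
    for idx, got in on_backup.items():
        if b not in demands[idx]["backup"]:
            survived += got

    # Step 3: primaries hit by b try to activate with what step 1 left over.
    for idx, d in enumerate(demands):
        if b in d["primary"] and idx not in on_backup:
            affected += d["units"]
            if a not in d["backup"]:  # backup already cut by the first failure
                survived += seize(d["backup"], d["units"])

    return affected, survived


# Two demands sharing spare capacity on span s4 (category 7 of Table 8-3).
demands = [
    {"primary": {"s1"}, "backup": ["s3", "s4"], "units": 2},
    {"primary": {"s2"}, "backup": ["s4", "s5"], "units": 2},
]
spare = {"s3": 2, "s4": 2, "s5": 2}
affected, survived = sbpp_dual_failure(demands, spare, "s1", "s2")
print(affected, survived, survived / affected)  # 4 2 0.5
```

In the example the first failure consumes the shared spare channels on s4, so the second demand cannot activate its backup even though neither failure touched that backup route.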
Let us now look at sample results from a study comparing SBPP and span-restorable networks using the dual failure restorability analysis
methods covered so far [DoCl03]. Tests were conducted using three test network families, each family consisting of a master network and
a series of progressively lower degree networks derived from the master by random elimination of nonessential [5] spans. The network
families were produced by starting with a master network and incrementally removing one span at a time (by random selection), subject to
retaining biconnectivity. The master networks of each family are shown in Figure 8-6. The master networks are all of average nodal degree
4.0, and each family provides sample networks down to an average nodal degree of about 2.6.
[5] Nonessential refers to spans that may be eliminated without the resulting graph losing biconnectivity. The method
allows study of networking phenomena over a series of systematically related networks varying in nodal degree but
having the same scale, node locations, and demand matrix for all test cases. See [DoGr01][GrDo02].
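The family-generation procedure can be sketched as below. This is a minimal illustration, not the tool used in [DoGr01][GrDo02]: the function names are hypothetical, biconnectivity is taken to mean that the graph stays connected after removal of any single node, parallel spans are assumed absent, and the loop runs until no nonessential span remains rather than stopping at a target nodal degree.

```python
import random


def is_connected(nodes, edges, skip=None):
    """Depth-first connectivity test, optionally ignoring one node."""
    nodes = [n for n in nodes if n != skip]
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != skip and v != skip:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for m in adj[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(nodes)


def is_biconnected(nodes, edges):
    """True if the graph is connected and has no single cut node."""
    return (len(nodes) >= 3 and is_connected(nodes, edges)
            and all(is_connected(nodes, edges, skip=n) for n in nodes))


def derive_family(nodes, edges, seed=0):
    """Remove one randomly chosen nonessential span at a time."""
    rng = random.Random(seed)
    current = list(edges)
    family = [list(current)]
    while True:
        nonessential = [e for e in current
                        if is_biconnected(nodes, [x for x in current if x != e])]
        if not nonessential:
            return family
        current.remove(rng.choice(nonessential))
        family.append(list(current))


# A 4-node full mesh shrinks to a 4-span ring, whatever the random choices.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
family = derive_family(nodes, edges, seed=1)
print([2 * len(f) / len(nodes) for f in family])  # average nodal degrees: [3.0, 2.5, 2.0]
```

Because every member of a family keeps the same nodes, the same demand matrix can be reused across the whole series, which is the property the study relies on.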
Figure 8-6. Master networks for comparison of dual failure restorability with SBPP and
span-restorable network designs.
All networks were tested with all-pairs demand magnitudes being uniform random in [1,...,10]. Demands were generated once for each
master network and the same demands for each O-D pair were reused for the other members of each set. The SBPP capacity designs
were formulated with the set of five shortest distinct eligible route choices (by span-length) for possible backups on each O-D pair with a
sharing limit of five. The span-restorable designs serve the identical demand patterns with shortest-path working routes and use of the set
of eligible routes for restoration of each span failure up to 5 hops long. The second failure response of the span-restorable network is
adaptive to the spare capacity usage of the first, but if the second failure hits restoration paths of the first failure they are not re-restored.
With the fully adaptive model for span restoration (model (c) of Section 8.3.2) R2(a,b) is even higher, but the partly adaptive model seems
like the fairest comparison to make against SBPP.
Figure 8-7 shows the average R2(a,b) for both schemes for all the test cases plotted against the corresponding network average nodal
degree. With SBPP about 70-80% of all service paths would withstand a dual failure that affected them. The restorability of the same dual
failures is approximately 20% higher still with span restoration. We can also see a modest but clear trend that as the connectivity increases
the dual-failure restorability improves for both schemes. The relatively constant nature of the curves is thought to be due to two
counteracting factors. One is the shortening of the working routes. With more spans in the network, there is a decrease of the number of
service paths crossing each span and each path is exposed to fewer spans, so the average number of affected service paths for dual
failures decreases. By itself this tends to improve the average R2(a,b). However, as nodal degree increases, both forms of
mesh-restorable network also become increasingly efficient, embodying less total spare capacity, hence tending to decrease R2. Evidently
the shortening of primary routes wins out slightly overall.
Figure 8-7. Average dual failure restorability with SBPP and span-restorable network designs.
Figure 8-8 compares SBPP and span-restorable designs on the basis of individual failure pair R2(a,b) values. (The mean values of the
data in Figure 8-8 are the points of Figure 8-7.) The particular result is from the 37-span derivative of master network B, but the
distributions are representative of all the other test networks studied. The right-most bin, R2(a,b) ~ 1, shows that under span restoration
roughly 80% of dual failures have no impact at all. In addition, the only cases where R2(a,b)=0 under span restoration are those where dual
failures isolate a degree 2 node. In contrast, a dual span failure is more likely to cause demand outage in SBPP. This is consistent with
SBPP's basic similarity to 1+1 APS. 1+1 APS experiences hard outage for any (a,b) failure combination where a ∈ primary and
b ∈ backup, or vice versa. Actual SBPP performance is spread out below the 1+1 dedicated APS performance bound by the
sharing-related outage effects of row 7 of Table 8-3.
Figure 8-8. Histogram of individual R2(a,b) levels under SBPP and span restoration.
8.6 Optimizing Spare Capacity Design for Dual Failures
In this section we move from the analysis of dual failure restorability of a given network, and how that affects path availability, to consider
ideas about deliberately synthesizing networks with certain enhanced dual failure properties. [6] A natural first exercise is to see how much
it would cost to simply design in enough spare capacity to withstand all dual failures, i.e., an R2 = 1 design. We refer to this as the dual failure
minimum capacity design (DFMC) problem. The results of DFMC are somewhat surprising: although we saw above that the average R2(i,j)
may be relatively high, it turns out that strictly achieving R2(i,j)=1 for all (i,j) is quite expensive. A next question is therefore to see how high
an R2 can be achieved with a given limit on additional capacity expenditure over the R1=1 condition. This is called the dual failure maximum
restorability (DFMR) problem. Finally we look at how to place capacity at minimum cost to serve certain paths only at R2=1 and others at
only R1=1. This is called multi-restorability capacity placement design (MRCP) and relates directly to how one would plan capacity and
routing for a defined premium service class that has an R2=1 guarantee by design. In the first two problems demands are first
shortest-path routed, generating the wi values which are inputs to the problem. The last, MRCP, is a type of joint optimization problem. We
first introduce the three design models, followed by a discussion of sample results.
[6] During development of the material in this section, a preliminary version of it was presented at [ClGr02b].
8.6.1 Minimum Capacity to Withstand All Dual Failures (DFMC)
The DFMC formulation finds a minimum total spare capacity assignment that guarantees full restorability of all dual span failure scenarios.
This formulation tells us the price for reaching R2 = 1. For obvious reasons, a feasible solution to this problem cannot be found for a
network with degree 2 nodes or 2-edge cuts of the network graph. Therefore our later tests of DFMC are limited to graph topologies that
qualify. In practice, however, transport networks often have degree 2 nodes. DFMC could therefore be considered a formulation for the
assurance of R2 = 1 on the "mesh backbone" component, also called the "meta-mesh" abstraction (Section 5.8), of an overall network. In
this context DFMC can be applied by logical removal of a degree 2 node between spans i, j and assertion that wi' = max(wi, wj), where wi'
applies to the single logical edge arising from the degree 2 node removal. Alternately DFMC capacity design could be applied to a
connected subnetwork of the original graph that excludes all degree 2 nodes. The idea would be that this subnetwork is used for
provisioning all "R2 = 1" (i.e., ultra-available) service paths. A more basic reason for interest in DFMC is, however, that it serves as a
benchmark and satisfies our curiosity about how much spare capacity would be needed to literally protect all demands against all possible
dual failures.
For DFMC wi and si have the usual meanings but we now also define:
$f_{i,j}^{p}$ = the restoration flow assigned to the pth eligible route for restoration of span i when span j is also failed (integer).
$C^{\infty}$ = a constant representing an arbitrarily high amount of restoration flow, higher than the largest restoration flow
assigned to any route in an actual design solution.
$\delta_{i,k}^{p}$ = 1 if the pth eligible route for restoration of span i crosses span k, zero otherwise.
In addition, for the models involving dual-failure scenarios, note that we employ the extra index variable k for greater clarity. This allows us
to use both of the usual i, j indexes to refer to span failures (together i, j specify a particular dual-failure scenario). This leaves k
to be used (as j normally otherwise is) to refer to any other span in general, in the context of a surviving span or as a general span index
without implication that it is part of the failure scenario.
DFMC
Minimize
Equation 8.16
subject to:
Dual failure restorability:
Equation 8.17
Dual failure routing limitations:
Equation 8.18
Spare capacity required:
Equation 8.19
integer capacities and either integer or relaxed flows.
Equation 8.17 asserts the restorability of each failed span i under all dual span-failure scenarios. In conjunction with the $\delta$ indicator
parameters, Equation 8.18 uses the "infinity constant", $C^{\infty}$, to ensure that span j can support any required amount of restoration flow
over itself when it is not part of the failure scenario, but that it cannot be used for restoration of span i in the (i,j) failure scenario. Equation
8.19 ensures that there is enough spare capacity on each span to support the largest simultaneously imposed restoration flow on each
span, over all dual failure scenarios. Here, every surviving span k is considered with respect to the restoration flows asserted over it for
span i (the first sum) plus any simultaneously imposed flow that is for restoration of span j in the same scenario (the second sum). No
explicit statement of single failure restorability is needed as this is implicitly feasible under design for all dual failures.
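The equation bodies of the DFMC model are not reproduced in the text above, so the following is one reading that is consistent with the constraint names and the accompanying descriptions, not necessarily the exact published form. The symbols $S$ (the set of spans), $P_i$ (the set of eligible restoration routes for span i), and $c_k$ (the cost of a unit of spare capacity on span k) are assumptions introduced here; the use of $\ge$ rather than equality in the restorability constraint is also an assumption.

```latex
% Assumed reconstruction of DFMC from the prose descriptions.
\begin{align}
\text{Minimize}\quad
  & \sum_{k \in S} c_k \, s_k \tag{8.16} \\
\text{subject to}\quad
  & \sum_{p \in P_i} f_{i,j}^{p} \ge w_i
  && \forall (i,j) \in S^2,\ i \ne j \tag{8.17} \\
  & f_{i,j}^{p} \le C^{\infty} \bigl(1 - \delta_{i,j}^{p}\bigr)
  && \forall (i,j) \in S^2,\ i \ne j,\ \forall p \in P_i \tag{8.18} \\
  & s_k \ge \sum_{p \in P_i} \delta_{i,k}^{p} f_{i,j}^{p}
          + \sum_{p \in P_j} \delta_{j,k}^{p} f_{j,i}^{p}
  && \forall (i,j) \in S^2,\ i \ne j,\ \forall k \in S \setminus \{i,j\} \tag{8.19}
\end{align}
```

In this reading, a route p for span i that crosses span j has $\delta_{i,j}^{p} = 1$, so its flow is forced to zero in scenario (i,j); otherwise the bound $C^{\infty}$ is never binding.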
Note that DFMC can be easily adapted to generate the minimum spare capacity to support restoration against any specific subset of dual
failures as well. To do this one can generalize $\forall (i,j) \in S^2 \mid i \ne j$, which strictly generates all dual failures, into
$\forall f \in F$, where F is a
set of specific failure scenarios of interest. In this case, however, any spans that do not appear as part of dual failure scenarios then must
be asserted directly as single failure scenarios in addition to restorability of the specific dual failures of concern. This is similar to how we
later design for a specific set of SRLG hazards.
8.6.2 Dual Failure Maximum Restorability (DFMR)
An obvious concern with DFMC is that the spare capacity costs for strict restorability of all dual failure combinations might be extremely
high. In fact this is confirmed by results that follow. And yet results have already shown us that R2 tends to be relatively high on average in
networks that are only designed for R1=1. The implication is that relatively few specific dual failure scenarios may be responsible for the
large multiplier on spare capacity needed for R2=1 relative to the R1=1 condition. It makes sense therefore to try turning the problem
around and asking instead what is the highest achievable level of R2 with a given limit or a "budget" on total spare capacity investment. To
do this we need to add:
B = a budget limit for total spare capacity cost (input parameter).
$f_{i}^{p}$ = the restoration flow assigned to the pth eligible route for restoration of span i as a single isolated-span failure scenario
(integer).
N(i,j) = number of non-restored working units under dual failure of spans (i, j) (a new integer variable). Note that N(i,j) is the
same quantity as found experimentally in R2 analysis above. Here, however, it is involved directly as a design variable.
DFMR(B)
Minimize
Equation 8.20
subject to:
Single failure restorability:
Equation 8.21
Unrestored paths:
Equation 8.22
Dual failure restoration flow maximums:
Equation 8.23
Dual failure routing limitations:
Equation 8.24
Essential spare capacity for R1:
Equation 8.25
Additional spare capacity to enhance R2:
Equation 8.27
Budget restriction:
Equation 8.28
Equation 8.21 ensures that R1 = 1 for every single-span failure. Equation 8.22 defines N(i,j), which is the number of working capacity units
that are not restored in case of a dual failure on spans i, j. The logic is that by minimizing the sum of N(i,j) over all dual failure scenarios, one
is indirectly maximizing R2 through Equation 8.1 and Equation 8.2 (and thereby also maximizing the availability through Equation 8.11).
Constraints of Equation 8.23 ensure that the number of paths assigned to the restoration of a span in a dual span failure scenario is at
most equal to the number of working channels to be restored. Without this constraint system, under minimization of N(i,j), restoration flow
variables could be otherwise driven above the actual requirements of the problem so as to gain credit under Equation 8.22. Equation 8.24
is identical to 8.18, forcing the exclusion of restoration routes for span i from using span j and vice-versa during the (i,j) scenario, while
allowing use of span j for all other scenarios. Equation 8.25 ensures adequate spare capacity for every single failure case. Finally, Equation
8.27 is identical in form to Equation 8.19 in DFMC, but in this context it does not assert full dual failure restorability (as it does above). It
now serves only to permit such dual failure restoration flows as are feasible under a given distribution of spare capacity amounts. It is
through the action of Equation 8.27, working under the total budget constraint of Equation 8.28, that the solver effectively chooses the
extent to which it will cover each dual failure scenario using the available budget. The aspect of selective coverage is also why an explicit
assurance of single failure restorability is needed here (Equation 8.21), but not in DFMC. We will soon look at sample results using DFMR
as a vehicle through which, by varying B, we can study how R2 increases with total capacity.
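As with DFMC, the equation bodies of DFMR(B) are not reproduced above. The following is one reading consistent with the constraint descriptions, offered as a sketch and not as the published form. The symbols $S$, $P_i$, and $c_k$ are assumptions (the set of spans, the eligible restoration routes for span i, and the unit spare capacity cost on span k), as is the specific form chosen for N(i,j) as the total shortfall over both failed spans.

```latex
% Assumed reconstruction of DFMR(B) from the prose descriptions.
\begin{align}
\text{Minimize}\quad
  & \sum_{(i,j) \in S^2,\ i \ne j} N(i,j) \tag{8.20} \\
\text{subject to}\quad
  & \sum_{p \in P_i} f_{i}^{p} \ge w_i
  && \forall i \in S \tag{8.21} \\
  & N(i,j) = (w_i + w_j) - \sum_{p \in P_i} f_{i,j}^{p} - \sum_{p \in P_j} f_{j,i}^{p}
  && \forall (i,j) \in S^2,\ i \ne j \tag{8.22} \\
  & \sum_{p \in P_i} f_{i,j}^{p} \le w_i
  && \forall (i,j) \in S^2,\ i \ne j \tag{8.23} \\
  & f_{i,j}^{p} \le C^{\infty} \bigl(1 - \delta_{i,j}^{p}\bigr)
  && \forall (i,j) \in S^2,\ i \ne j,\ \forall p \in P_i \tag{8.24} \\
  & s_k \ge \sum_{p \in P_i} \delta_{i,k}^{p} f_{i}^{p}
  && \forall i \in S,\ \forall k \in S \setminus \{i\} \tag{8.25} \\
  & s_k \ge \sum_{p \in P_i} \delta_{i,k}^{p} f_{i,j}^{p}
          + \sum_{p \in P_j} \delta_{j,k}^{p} f_{j,i}^{p}
  && \forall (i,j) \in S^2,\ i \ne j,\ \forall k \in S \setminus \{i,j\} \tag{8.27} \\
  & \sum_{k \in S} c_k \, s_k \le B \tag{8.28}
\end{align}
```

Under this reading, the upper bound of Equation 8.23 keeps each N(i,j) nonnegative, so the objective cannot be improved by assigning more restoration flow than there are working channels to restore.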
8.6.3 Pure Redistribution of Spare Capacity to Enhance R2
A special case of interest for application of DFMR(B) should be noted. If we assume that a basic solution to SCA (min spare capacity for
R1 = 1) has already been obtained, resulting in a vector of
initial spare capacity values, then:
Equation 8.29
produces a pure redistribution of the initial spare capacity that maximizes the dual failure restorability to the extent possible with only the
capacity of the initial R1=1 design. This provides another instance of a kind of bicriteria optimization (Section 4.16.4). In networks where
the sj distribution of spare capacity is obtained from SCA (or JCA) we follow those designs with an application of DFMR at the same total
spare capacity to secondarily optimize their ability to withstand dual failures as much as possible.
8.6.4 Multi-Service Restorability Capacity Placement Design (MRCP)
In a similar vein to the discussion of Section 8.4.3, we note that DFMR maximizes the overall R2 that is achievable for the network as a
whole, but we have no control over which individual service paths benefit the most or least in the resulting design. Any R2 enhancement
produced by DFMR is a general benefit to the integrity of the network as a whole, but it is not targeted in any specific way to individual
paths. Building on previous ideas of multi-QoP design, however (Section 5.9), we could consider the establishment of a "platinum" service
class of the highest priority nature defined by the guarantee of R2 =1 by design.
In the prior multi-QoP considerations, there were no considerations of withstanding dual failures. Gold (the top service level) was defined
as the R1=1 service class. All other classes were defined "down" from this ceiling, such as best-effort on R1, no restorability, and
preemptible. But now we can consider a design strategy to create a super-high priority class of service that is actually assured by design of
100% restorability for all single and dual failure scenarios. For simplicity we treat the design problem only for the addition of an R2=1 subset
of demands to a network where all other demands require R1=1 or no provision for restorability.
This may not necessarily take that much extra spare capacity because we have already seen that R2 is relatively high to start with. The
problem is that such R2 levels as intrinsically exist in an R1=1 network are not coherently structured as might be desired onto individual
platinum-class service paths. Therefore while we may still have to add capacity, we also expect to obtain some of the desired R2 paths
simply by restructuring the inherent R2 resource of an R1-capacity network. Results to follow show that MRCP provides an economical
way to design-in support for this platinum service class that stacks above the existing multi-QoP paradigm of Section 5.9, in the sense
illustrated in Figure 8-9. The new super-high availability service class can be added to the range of possible QoP classes with remarkably
little additional cost within span-restorable networks that are already efficiently designed to only support single failure restorability.
Figure 8-9. How the MRCP capacity design model extends the multi-QoP hierarchy.
The multi-restorability capacity placement (MRCP) design model allows us to target and structure the dual failure restorability specifically
to the intended services or customers. Every demand receives a specific class of restorability guarantee on every span, end-to-end over
its route. To each demand we therefore assign one of the following restorability service class designations:
R2 restorability = "platinum" service class: assured restorability to any single or dual span failure.
R1 restorability = ordinary assured restorability to any single-span failure (and best-efforts for dual failures).
R0 restorability = for generality this is an easily included additional class of service paths. It receives best-efforts in both single
and dual failures but has no assured restorability.
In contrast to DFMC and DFMR, MRCP is approached as a joint optimization. To provide R2, R1, R0 restorability on a per-path basis
requires explicit handling and recognition of working path structures in any case, so it makes sense that by allowing us to choose among
options for eligible working routes we can realize the required R2 paths with greater efficiency than if working routes are on strictly shortest
paths.
The formulation finds the minimum total cost of working plus spare capacity, and the routing of each demand, to satisfy the restorability
class of service of each demand. For MRCP a demand group is now defined as one or more demand units on the same O-D pair and in
the same service class. Demand groups must be defined for MRCP to distinguish between demands of different service classes on the
same O-D pairs. The sum of all demand groups on an O-D pair equals the prior single service demand bundle quantities, $d^{r}$, on each O-D
pair. Thus, index r now indexes not just over all O-D pairs but also through each demand group on each O-D pair. To the parameters of
DFMC and DFMR above we also need to add: