Quality of Service in MPLS Networks

This chapter considers the role of the Quality of Service (QoS) architecture in designing MPLS-based IP+ATM networks. A summary example is provided for configuring BPX 8650 ATM LSRs, their associated LSCs (6400, 7200, or 7500 series), and Edge Label Switch Routers:

For configuration procedures for BPX 8650 ATM LSRs and their Edge Label Switch Routers, see Chapter 6, "MPLS CoS with the BPX 8650."

For additional information, refer to the Cisco 6400, 7200, or 7500 series router and MPLS-related IOS documentation. Refer to the 9.3 release notes for supported features.

MPLS QoS with IP+ATM Overview

As part of their VPN services, service providers may wish to offer premium services defined by Service Level Agreements (SLAs) to expedite traffic from certain customers or applications. Quality of Service (QoS) in IP networks gives devices the intelligence to preferentially handle traffic as dictated by network policy. QoS mechanisms give network managers the ability to control the mix of bandwidth, delay, jitter, and packet loss in the network.

QoS is not a device feature; it is an end-to-end system architecture. A robust QoS solution includes a variety of technologies that interoperate to deliver scalable, media-independent services throughout the network, with system-wide performance-monitoring capabilities.

The actual deployment of QoS in a network requires a division of labor for greatest efficiency. Because QoS requires intensive processing, the Cisco model distributes QoS duties between edge and core devices that could be multilayer switches or routers. Edge devices do most of the processor-intensive work, performing application recognition to identify flows and classify packets according to unique customer policies. Edge devices also provide bandwidth management. Core devices expedite forwarding while enforcing QoS levels assigned at the edge.

MPLS-enabled networks make use of Cisco IOS QoS features to build an end-to-end QoS architecture:

The key to an effective, network-wide IP QoS plan is scalability. Applying QoS on a flow-by-flow basis is not practical because of the huge numbers of IP traffic flows in carrier-sized networks. A scalable way to provide higher levels of service quality with minimal loss in granularity is to implement multiple service classes, or classes of service (CoSs).

For example, a service provider network may implement three service classes: a high-priority, low-latency class, a guaranteed-delivery "mission-critical" service, and a low-priority "best-effort" class. Subscribers can use the mix of services that suits their needs. Some subscribers may wish to use a guaranteed-delivery, low-latency service for their video-conferencing applications, and best-effort service for e-mail traffic.

MPLS makes it possible to apply scalable QoS across very large routed networks and Layer 3 IP QoS in ATM networks because providers can designate sets of labels that correspond to service classes.

In routed networks, MPLS-enabled QoS substantially reduces processing throughout the core for optimal performance. In ATM networks, MPLS makes end-to-end Layer 3-type services possible.

Traditional ATM and Frame Relay networks implement CoS with point-to-point virtual circuits, but this is not scalable because of high provisioning and management overhead. Placing traffic into service classes at the edge enables providers to engineer and manage classes throughout the network. If service providers manage networks based on service classes, rather than point-to-point connections, they can substantially reduce the amount of detail they must track and increase efficiency without losing functionality.

Compared to per-circuit management, MPLS-enabled CoS in ATM networks provides virtually all the benefits of point-to-point meshes with far less complexity. Using MPLS to establish IP CoS in ATM networks eliminates per-VC configuration. The entire network is easier to provision and engineer.

For much IP traffic, the requirements for quality of service are fundamentally weaker and more flexible than requirements for traditional virtual-circuit-switched data traffic. In order to be competitive, providers' IP networks must provide Service-Level Agreements (SLAs) of an appropriate form for IP traffic with relatively weak Quality of Service (QoS) requirements at low cost. The networks must also provide stronger QoS for certain traffic.

Cisco MPLS networks have unique flexibility for meeting all requirements for IP QoS:

Best Effort Traffic and IP QoS Requirements

A major reason for the success of the Internet is that IP treats connectivity as being the fundamental requirement of a communications network. The Internet has been successful in its goal of allowing any host to communicate with any other, without setting up virtual circuits, reserving bandwidth, or performing any other actions with high overhead or costs.

In the Internet, considerations such as QoS are not treated as being as important as allowing any-to-any connectivity. Because of this philosophy, IP traffic is, in general, extremely tolerant of varying QoS. The typical World Wide Web user will not care whether a Web page downloads at 100 Kbps or 5 Kbps.

TCP, the transport protocol for over half of IP traffic, automatically adjusts to the available end-to-end bandwidth and loss. UDP has generally been used only by applications that are tolerant of packet loss. Although there are now IP applications that do require stronger QoS, most IP traffic still requires only a very loose guarantee of connectivity, meaning that the available bandwidth between a given pair of IP addresses should be at least a few hundred bits per second, and that the delay be no more than a couple of seconds. This traffic merely requires "best effort" from the provider networks.

These requirements are stated in very loose terms because there is no hard minimum QoS required for TCP/IP traffic. TCP/IP will adapt to QoS even worse than these figures suggest, but the average user is likely to find the resulting application performance frustratingly poor.

Certain IP traffic does require better QoS guarantees, particularly Voice-over-IP and similar real-time applications. Good QoS ensures good application performance. Some users may prefer to migrate in the longer term towards the networks that give, at reasonable cost, the best QoS for "best effort" traffic. Despite this, it is likely that a large proportion of the traffic in any MPLS network will continue to be very tolerant of widely varying QoS in day-to-day operations.

Effects of Connectionless Traffic

Another important aspect of IP traffic is that it is connectionless. While this is obvious, its effect on traffic patterns is less obvious.

Because IP is connectionless, IP applications have extreme flexibility in location. Companies are finding that their departments are setting up Web servers and fileservers away from their traditional head-office centers. In other words, because IP applications treat networks as being connectionless, the traffic in WAN links in corporate IP networks tends to become more meshed and any-to-any in nature.

This does not fit well with traditional hub-and-spoke virtual circuit connectivity. See Figure 4-1 for a comparison of topologies. Hub-and-spoke architectures lead to traffic passing through intermediate CPE in order to get to its destination. This is undesirable if you must pay for traffic to go across virtual circuits twice. It also wastes bandwidth at the hub sites and means that you must manage routing.


Figure 4-1: How Connectionless Traffic Drives Meshing


In response, many customers are adding more and more meshing to their virtual circuit networks. The virtual circuit connectivity increasingly reflects the any-to-any connectionless nature of IP traffic.

However, carrying connectionless traffic on increasingly complex meshes of virtual circuits is a quite inadequate solution, for several reasons:

The underlying problem is that IP traffic does not naturally fit with a connection-oriented service from a provider. For maximum efficiency and lowest cost, providers must offer a connectionless service to customers. Market research [CIMI Corporation, 1998] suggests that over 50 percent of current demand for IP Virtual Private Network services is unmet. The lack of true connectionless IP services offered by carriers is an important reason why this demand is unmet. Connectionless MPLS Virtual Private Network services simplify management for connectionless IP services.

Specifying QoS for Connectionless Service

In traditional virtual circuit networks such as Frame Relay networks, Committed Information Rates (CIRs) must be specified for every link, as shown in Figure 4-2 Topology (a). As networks become more meshed, this becomes more difficult. For example, a full-mesh network of 100 sites would require 9900 separate CIRs to be provisioned. It is obviously unreasonable to dimension any-to-any networks in this way. (However, you might want to specify only a few specific origin-destination bandwidths in an otherwise connectionless network.)
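The arithmetic behind the 9900 figure is simple: a full mesh of n sites has n x (n-1) ordered site pairs, and a CIR must be provisioned for each direction of each pair. A minimal sketch in Python:

    # CIRs needed for a full mesh of n sites, one per direction of each
    # site pair (as in the Frame Relay example above).
    def full_mesh_cirs(n: int) -> int:
        return n * (n - 1)

    print(full_mesh_cirs(100))   # 9900 CIRs for a 100-site full mesh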


Figure 4-2: Specifying Bandwidths for an IP Service


Aside from being connectionless, there are other reasons why providing QoS for IP traffic is fundamentally different from providing QoS in connection-oriented networks. Connection-oriented QoS is based on the premise that most traffic has QoS requirements that must almost always be met in order to provide adequate performance. Most IP applications, on the other hand, are tolerant of widely varying bandwidth; they can tolerate periods of seconds or more of high loss and are usually extremely tolerant of delay and delay variance.

Because of this fundamental difference, traditional connection-oriented QoS tools of connection admission control and per-VC QoS guarantees are an unnecessary overhead for most IP traffic. They are also difficult or impossible to use without a fully specified matrix of traffic requirements; this is an unnatural requirement for an IP service.

Internet services already use a quite different model of QoS specification. As shown in Figure 4-2 Topology (b), Internet users subscribe to a service by specifying access bandwidths for each of their sites. They do not specify a full matrix of bandwidths or any connection-oriented information. This is the natural way of structuring QoS demands for any connectionless IP service because the access bandwidth requirements are easily estimated in proportion to the number of hosts or servers at each site.

Providers who offer IP Service Level Agreements without using traditional connection-oriented QoS methods have an important advantage. They avoid the equipment costs and operational overheads of connection-oriented QoS, and they can offer Quality of Service agreements in a form suited to connectionless networks. Providing connectionless Service Level Agreements is an important requirement for meeting current and future demand for IP services.

The Differential Services Approach to Quality of Service

Contracts for Access Bandwidths

Even though a full matrix of point-to-point bandwidths is not normally specified for MPLS networks, usage parameter control or policing can still be applied to constrain use of network resources. This is important because it provides a basis for service providers' traffic planning to meet Service Level Agreements.

In Figure 4-2 Topology (b), a customer contracts for a certain access bandwidth at each site. In a simple case, they could be restricted to this bandwidth by the data rate of the access line itself. However, more complex access contracts are possible and desirable.

An IP-layer policing function called Committed Access Rate (CAR) is available in Cisco routers. CAR acts independently on each customer access link, or on each virtual circuit on a channelized access link.

This use of CAR is shown in Figure 4-3. When used on edge LSRs or other provider access routers, CAR both enforces traffic contracts and marks packets according to the traffic contract. For example, a simple CAR contract may specify that a user site gets:

This explanation is simplified for clarity. Typical IP Precedence values for the premium and best-effort classes mentioned here would be 4 and 0, respectively. Typical Differentiated Services (DiffServ) classes for this traffic would be AF12 for premium and AF11 for best effort.


Figure 4-3: Cisco Committed Access Rate Policers


Far more sophisticated contracts are possible. Another possible example contract for a site:

The last point illustrates an important advantage of policing at the IP layer: CAR is able to take account of IP header information. In this example, it is used to specify that certain types of IP traffic that are very tolerant of varying QoS are automatically carried in the best-effort class and not counted against the limits for premium traffic.

CAR enforces the bandwidth contracts by using token bucket policers, which permit burstiness in a short timescale, while limiting rates in a longer timeframe. Traffic classes are marked on the IP packets admitted into the provider network.
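To illustrate the policing behavior, here is a minimal Python sketch of a token bucket policer that marks conforming packets as premium and re-marks excess packets as best effort. The rate and burst parameters are illustrative assumptions, not CAR defaults, and real CAR behavior differs in detail:

    class TokenBucketPolicer:
        """CAR-like policer sketch: conforming packets keep the premium
        marking; excess packets are re-marked best effort."""

        def __init__(self, rate_bps: float, burst_bytes: float):
            self.rate = rate_bps / 8.0      # token refill rate, bytes/second
            self.capacity = burst_bytes     # bucket depth bounds burstiness
            self.tokens = burst_bytes
            self.last = 0.0

        def police(self, arrival_time: float, size_bytes: int) -> str:
            # Refill tokens for the time elapsed since the last packet.
            elapsed = arrival_time - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last = arrival_time
            if self.tokens >= size_bytes:
                self.tokens -= size_bytes
                return "premium"        # e.g. mark IP Precedence 4
            return "best-effort"        # e.g. mark IP Precedence 0

    policer = TokenBucketPolicer(rate_bps=64_000, burst_bytes=8_000)
    print(policer.police(0.0, 1500))    # "premium": within the burst allowance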

CAR sets IPv4 Precedence bits on packets. (The meaning of these bits will be changed in the forthcoming Differentiated Services standards from the IETF, but DiffServ is backwards-compatible with the original IPv4 formats. CAR will be fully compliant with the new meanings of the DiffServ DS bits on IP packets.)


Note Two different acronyms are used for Differentiated Services and both of these are commonly used in other documents. "DiffServ" is used most commonly, and refers to Differentiated Services in general. "DS" is the name given specifically to the bits in the IP headers used by DiffServ.

CAR is compliant with the DiffServ architecture, which requires technologies like CAR to mark precedence on IP packets.

The core of the network supports different Differentiated Services classes. In MPLS networks, after CAR has been used to mark class of service (CoS) using the Precedence or DS bits on IP packets, the IP packets are sent on different LVCs according to their CoS. There is, in general, a different LVC for each class of service.

In other words, CAR sets IP classes of service, which are then supported by MPLS.

A service provider can use CAR to both police IP traffic and mark CoS on IP packets using Precedence or DS bits. An alternative is to choose the CoS for packets according to your organization's own policies. CAR can be used on Customer Premises Equipment (CPE) to do this, and will allow customers to coordinate CoS assignments using Directory-Enabled Networking. If CAR is used on CPE to pre-set CoS, then CAR on the edge LSRs acts purely as a policer. The use of CAR on both CPE and edge LSRs is shown in Figure 4-4.


Figure 4-4: Using CAR on Customer Premises


Using Best Effort Traffic to Help Guarantee Bandwidths

The presence of much best-effort traffic in a network produces important advantages:

This explains why best-effort traffic is an advantage in providing quality of service: it can be used to cushion premium traffic, ensuring that the premium traffic gets premium service. This is the key to the Differentiated Services approach to service quality.

Figure 4-5 shows how Differentiated Services can work to ensure that premium traffic gets good service. The network operator allocates bandwidth to two classes of service on a particular link:


Figure 4-5: Ensuring Access to Bandwidth Using Differentiated Services


Note that Bp is greater than the estimated bandwidth of premium traffic. This means that the premium traffic gains access to its required bandwidth even if the actual requirement is much greater than the estimated requirement. The best-effort traffic is guaranteed little---after all, it is best-effort traffic. This means that the best-effort traffic may be denied bandwidth to meet the requirements of premium traffic.

This is the case most of the time: the best-effort traffic does not get all the bandwidth it could use. On the other hand, all the premium traffic is carried.
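This sharing can be sketched in Python under a simplifying assumption: premium traffic is served up to its allocation Bp, and best-effort traffic absorbs whatever capacity remains. (Real Weighted Fair Queueing is work-conserving and lets a class exceed its share when spare capacity exists; the figures here are illustrative only.)

    def share_link(link_bps, bp_bps, premium_offered, best_effort_offered):
        """Two-class split: premium is served up to its allocation Bp
        first; best effort gets whatever capacity is left over."""
        premium_sent = min(premium_offered, bp_bps)
        best_effort_sent = min(best_effort_offered, link_bps - premium_sent)
        return premium_sent, best_effort_sent

    # Premium demand (60 Mbps) exceeds its estimate but stays under its
    # allocation Bp (80 Mbps), so it is carried in full; the best-effort
    # traffic absorbs the squeeze on a 155 Mbps link.
    print(share_link(155e6, 80e6, 60e6, 120e6))   # (60000000.0, 95000000.0)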

Using this method, Differentiated Services can give excellent quality of service to premium traffic provided that:

The first condition is likely to hold in future networks because almost all IP traffic today is best-effort and because IP traffic loads are growing much faster than other traffic, such as circuit-switched voice. Unusual cases where this doesn't hold are discussed in "What If There Isn't Much Best Effort Traffic in My Network?". The second condition is quite easy to meet, and this is discussed next.

This example used only one class of premium traffic as well as best-effort traffic. However, multiple classes of premium traffic can be supported in the same way, provided that the two conditions are met.

Modeling Network Traffic Flows To Meet Service Level Agreements

Now consider the engineering steps required for a DiffServ network to provide Service Level Agreements (SLAs). We've seen how CAR or similar technologies can be used to enforce access rate contracts for sites. This means that at each edge LSR, it is easy to calculate the sum of allowed premium-class access bandwidths. An example is shown in Figure 4-6.


Figure 4-6: Refining Estimates of Network Loads


In order to engineer the core network to ensure delivery of premium-class packets, an estimate of the actual distribution of premium traffic is required, even though customers do not specify traffic matrices.

Note that the traffic estimates are required only for aggregate flows between edge LSRs. For example, consider a provider network with 100 edge LSRs serving 200,000 customer sites. For this network, traffic estimates are required for the flows between the 100 edge LSRs, and not for the individual flows between the 200,000 customer sites. Also note that the estimates do not have to be exact.

It is trivial to calculate the maximum possible traffic flows for each origin-destination pair. In Figure 4-6 Topology (a), for example, the sum of the premium-class access bandwidths at edge LSR A is 320 Mbps. Thus, the maximum possible flow of premium-class traffic from A to any other edge LSR is 320 Mbps, as illustrated in Figure 4-6 Topology (b).

This "maximum possible" is an extreme over-estimate: it is impossible for a given source LSR to send its full rate of premium traffic to each and every other edge LSR simultaneously, but this is what is assumed by this traffic estimate. Despite this, the "maximum possible" traffic matrix may be useful in some circumstances:

Beyond these exceptions, it will usually be better to use a more realistic traffic model. A more realistic traffic model might assume that traffic is distributed evenly among all possible destinations.

In the example in Figure 4-6, there are four edge LSRs, and the summed premium-class access bandwidth at edge LSR A is 320 Mbps. In this evenly distributed estimate, (320/4) Mbps is sent from A to each of the edge LSRs. (Edge LSR A can send traffic to itself. This is realistic because it represents corporate VPNs that have more than one VPN site attached to edge LSR A. In general, there will be thousands of sites in hundreds of VPNs attached to A.)

Similarly, edge LSR B sends (1000/4) Mbps to every edge LSR, and so on. This is shown in Figure 4-6 Topology (c).

This method can be refined further by assuming that traffic is sent in proportion to the receiving sites' access bandwidths. This means that a site with many large access lines is assumed to receive more traffic than a site with fewer, smaller access lines.

In Figure 4-6 Topology (d), the sums of the access bandwidths at nodes A, B, C, and D are in the proportion 320:1000:384:384. Consider the 1000 Mbps of premium-class traffic entering the network at node B. According to these proportions, the 1000 Mbps of traffic from node B is divided among nodes A, B, C, and D as 153, 479, 184, and 184 Mbps. The traffic from the other nodes is divided similarly.

Dividing traffic by this method has the property that it leads to balanced estimates of loads, that is, the traffic in one direction between a given pair of nodes is equal to the traffic in the other direction. This "proportional to PoP size" estimated traffic is quite realistic and likely to be useful in many networks.
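A short Python sketch of the "proportional to PoP size" estimate, using the access bandwidths from Figure 4-6, reproduces the figures above and the balance property:

    # Node i sends its summed premium access bandwidth to node j in
    # proportion to node j's share of the network total (Figure 4-6).
    access = {"A": 320, "B": 1000, "C": 384, "D": 384}   # Mbps per edge LSR
    total = sum(access.values())                         # 2088 Mbps

    matrix = {(i, j): access[i] * access[j] / total
              for i in access for j in access}

    print(round(matrix[("B", "A")]))   # 153 Mbps, as in the text
    # Balanced loads: traffic i->j equals traffic j->i by construction.
    assert matrix[("A", "B")] == matrix[("B", "A")]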

The setting of link or circuit parameters according to these traffic estimates is covered in following sections.

A Recommended Process For Estimating & Modeling Traffic

The traffic model used to dimension the network does not have to be exact. Recall from the earlier discussion that premium traffic is first estimated, then actual allocations of bandwidth to premium traffic can be made in proportion to the estimates. But actual allocations should be larger than the estimates to allow a margin of safety.

Here is a recommended process for dealing with traffic estimates and modeling:


Step 1 In trials and early production phases, use the "maximum possible traffic" model to dimension premium-class bandwidth allocations. This ensures that all premium-class traffic will be delivered under all conditions except equipment or link failures.

Step 2 During the early production phases, measure the actual origin-destination bandwidths. Cisco NetFlow is an excellent tool for collecting these statistics. Alternatively, packet or cell counts can be collected for all the backbone links. Compare this traffic to the "proportional to PoP bandwidth" estimate. It will usually be found that this estimate is a good approximation to the real traffic. In unusual cases, some other model may need to be developed. In any case, a traffic model should be selected for the network.

Step 3 Change the dimensioning of premium-class bandwidths to agree with the chosen model, but initially over-allocate premium-class bandwidths by 100 percent or more to allow for unexpected variations. Recall that this over-allocation does not usually result in wasted bandwidth because best-effort traffic can typically make use of spare bandwidth.

Step 4 Continue to collect statistics and compare them to the traffic model, modifying the model as appropriate. As confidence in the traffic model grows, reduce the amount of over-allocation of premium-class bandwidths.

This process is familiar to many network engineers. Aside from the particular traffic models used, it is identical to the process used in the introduction of over-subscribed Frame Relay and ATM networks.


Engineering DiffServ Per-Hop Behaviors

DiffServ networks use queueing technologies such as Weighted Fair Queueing (WFQ) to provide differential service to the different Classes of Service (CoS). Link-by-link engineering of WFQ parameters is the approach suggested by the IETF DiffServ Working Group.

The treatment of a particular CoS on a particular link (or "hop"), using technologies such as Weighted Fair Queueing, is referred to as a per-hop behavior (PHB). Cisco supports engineering of per-hop behaviors on links in both ATM MPLS and packet-based MPLS networks, as well as ordinary IP networks. The principles are the same in all network types, although there are differences in the way CoS information is carried in packets for different networks.

As a prerequisite for setting the parameters for PHBs, the estimated demand for traffic of different PHBs must be derived for each hop from the estimated traffic matrices.

For example, Figure 4-7, Topology (a) shows the traffic demand for a network where one class of premium traffic is carried in addition to best-effort traffic. Topology (b) shows the physical network, including the links and core LSRs which will carry the traffic. Topology (b) also shows the routes that will be chosen by IP routing in the network. IP routing protocols such as OSPF and IS-IS normally choose the shortest possible route from a given origin to a given destination. For complex networks, a tool such as Cisco Netsys will be helpful to review and analyze the routes used in a network.

The traffic matrix and routing information together specify the bandwidth used along various paths as shown in Topology (c). From this information, it is straightforward to sum the total bandwidths on each link.


Figure 4-7: Estimating Network Loads Per Hop Behavior


For example, there are three components of the premium-class traffic flow in the link from A to E: 153 Mbps A->B, 58 Mbps A->C, and 58 Mbps A->D. The sum of these is 269 Mbps, and this is the premium-class bandwidth requirement on the link A->E shown in Topology (d). The other premium-class bandwidths are calculated similarly.

The end result of this calculation is the set of values that should be allocated for premium-class bandwidth on each link. In this example, there is one class of premium traffic, and the premium-class bandwidth reservation on link A->E should be at least 269 Mbps. Referring back to Figure 4-5, this means that the Weighted Fair Queueing bandwidth Bp assigned on the link A->E should be at least 269 Mbps.
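The per-link summation is straightforward to sketch in Python. The routes below are assumptions read off Figure 4-7, chosen so that the link A->E carries exactly the three components named above:

    # Sum per-link premium bandwidth from an O-D demand matrix and the
    # paths chosen by IP routing (illustrative demands and paths).
    demands = {("A", "B"): 153, ("A", "C"): 58, ("A", "D"): 58}   # Mbps
    paths = {("A", "B"): ["A", "E", "B"],
             ("A", "C"): ["A", "E", "C"],
             ("A", "D"): ["A", "E", "D"]}

    link_load = {}
    for od, mbps in demands.items():
        hops = paths[od]
        for link in zip(hops, hops[1:]):       # consecutive node pairs
            link_load[link] = link_load.get(link, 0) + mbps

    print(link_load[("A", "E")])   # 269 Mbps: the floor for Bp on link A->E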

This example uses one class of premium traffic. The same process can be used for multiple classes of premium traffic with the bandwidth requirements for each class calculated as described here.

DiffServ Classes and Cisco IP+ATM Switches

The preceding discussion has shown how engineering of DiffServ networks leads to specifications of required bandwidths for various classes of service on various links of the network. This is quite different from traditional per-VC bandwidth management in ATM networks.

The per-VC bandwidth concept used for CBR, VBR, and other ATM Forum traffic management types is illustrated in Figure 4-8 (a). Per-VC bandwidth management requires that bandwidths be specified for every origin-destination flow. If per-VC bandwidth management is used in conjunction with approximate estimates of origin-destination flows, bandwidth will be distributed unfairly. This makes per-VC bandwidth management less useful in connectionless networks than the DiffServ mechanisms.


Figure 4-8: Per-VC Service and Class of Service in ATM Switches


Per-VC bandwidth management is quite different from DiffServ concepts of CoS and Per-Hop Behavior. There is no simple, straightforward way to support DiffServ using ATM Forum traffic management types. Because of this, Cisco does not attempt to use ATM Forum traffic management types to support DiffServ. Instead, class-based queueing is used.

As shown in Figure 4-8 (b), class-based queueing involves a separate queue in the ATM switch for each CoS. Cells from all LVCs of each CoS are queued in a single queue for that CoS. The bandwidth parameters of a CoS on a link are set directly on the CoS queue. The only parameter signalled for each LVC is the Class of Service for the LVC. This means that the ATM MPLS control component is used unchanged, except that multiple LVCs are set up for each destination: one LVC per destination per class of service.

Cisco IP+ATM switches support DiffServ for MPLS traffic, alongside ATM Forum Traffic Management types for PVCs and SVCs. Each DiffServ or ATM Forum Traffic Management type gets its own "Class of Service Buffer." Per-VC queueing can be used in addition to the class-of-service buffers, and this is done for ATM Forum Traffic Management types. Weighted Fair Queueing is used to assign bandwidths to the IP Class of Service buffers. This means that the IP classes share bandwidth as illustrated in Figure 4-5.


Figure 4-9: Per-VC Service With VC Merge


Using class-based queueing instead of per-VC queueing for the IP traffic has several advantages:

For these reasons, Cisco strongly recommends that networks supporting IP services be engineered using class-based queueing.

Service-Level Agreements Using DiffServ

This section has covered:

These steps can lead to guarantees of packet delivery:

Unless "maximum possible" traffic models are used, the meeting of SLAs is reliant on monitoring network performance over time, reacting to unexpected traffic events, and modifying traffic models.

An important set of statistics to monitor is the amount of premium-class traffic per link. If premium-class traffic is nearing its allocated bandwidth, then there is a danger that SLAs might not be met. However, because this process is reactive, it is important that SLAs are structured to give the provider time to react; in other words, they should allow for periods of poor QoS. It is very important to bear in mind that most IP traffic is very tolerant of varying QoS; customers who understand this will recognize that a relatively weak SLA is satisfactory even for premium traffic.

Another important aspect of SLAs for IP traffic is the nature of commitment. In a point-to-point network, it is quite easy to define a Committed Information Rate between two sites. In an IP network, the nature of commitment is different.

Consider Figure 4-10. Even if sites A, B, and C are transmitting packets within their premium-rate access contracts, not all of their packets will be delivered. They are sending packets to D at a total of 768 Kbps, which exceeds the 512 Kbps link bandwidth to D. The packet loss rate in this example will be roughly 33 percent even though the access contracts have been met.

It is important for IP SLAs to emphasize this, and to specify that a packet is not committed for delivery unless it both meets the access contracts and is sent within the possible receive rate at the destination. This means that if a customer chooses to send 768 Kbps of traffic to a site with a link bandwidth of 512 Kbps, then the resulting loss is the customer's responsibility. (The customer may deal with this by buying more bandwidth on the link to site D.)
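This commitment rule can be sketched as a simple check (a Python illustration, not a statement of any product's behavior): when the aggregate committed rate toward a destination exceeds its link rate, only a proportional share of each sender's packets can count as committed.

    def committed_share(sender_rates_bps, dest_link_bps):
        """Fraction of committed traffic that can actually be delivered
        to a destination, given its access link rate."""
        offered = sum(sender_rates_bps)
        if offered <= dest_link_bps:
            return 1.0                     # everything fits: all committed
        return dest_link_bps / offered     # the excess is the sender's problem

    # Figure 4-10: three sites send 256 Kbps each toward a 512 Kbps link.
    print(committed_share([256e3, 256e3, 256e3], 512e3))   # 0.666..., ~33% loss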


Figure 4-10: Committed Delivery in an IP Network


A simple way of structuring SLAs for IP traffic is to use two classes of traffic. This results in a traffic contract which is similar to a Frame Relay committed information rate.

Sample Service Level Agreement Using The Two-Class Model

The examples in this section show a type of Service Level Agreement that a service provider might offer to its customers for IP traffic. Because the meeting of such a Service Level Agreement depends in part on use of appropriate planning and monitoring processes by a service provider, the service levels described are illustrative only and Cisco offers no guarantee that any particular network will be able to meet the service levels described.

    1. The first 64 Kb of traffic sent from a customer site each second is committed. (This definition is slightly simplified for clarity. A real contract might specify that traffic is measured by a token bucket policer, and specify the token bucket parameters.) Any traffic that otherwise satisfies the definition of committed traffic, but is sent so that the sum of the committed bandwidths sent to the receiver is greater than the receiver's link rate, is counted against the 64 Kbps of committed traffic, but is treated as best effort traffic for the purpose of clauses 3. to 10. Any packet fully in compliance with this clause is referred to as a committed packet.

    2. Excess traffic, up to a total site bandwidth of 256 Kbps, will be accepted by the network with no guarantee of delivery. This is referred to as "best effort" traffic or best effort packets. (The contract could specify that any traffic in excess of 256 Kbps would be discarded, but this limit would typically be automatically enforced by a link rate of 256 Kbps.)

    3. Within each month, at least 99 percent of committed packets from this site will be delivered.

    4. For each month: during the 1-hour period in which the lowest proportion of committed packets is delivered by the network, at least 90 percent of committed packets from this site will be delivered.

    5. 99.9 percent of committed packets delivered will be within 250 milliseconds of being accepted.

    6. There is no guarantee that any best effort traffic will be delivered.

    7. 99 percent of best effort packets delivered will be within 1 second of being accepted.

    8. Of all the packets delivered, not more than 0.1 percent will be delivered out of the order in which they were received.

    9. Of all the packets delivered, not more than 1 in 10^6 will have an error introduced by the network.

    10. (Further clauses will specify costs, penalties if the clauses above are not met, and so on.)

The first two clauses define committed and best effort packets, and the allowable rates for both. Clause 1. excludes traffic that cannot possibly be delivered as discussed previously.

Clauses 3. and 4. provide realistic assurance of delivery of packets. Together, they allow for relatively poor performance for periods of hours each month, while still assuring adequate delivery performance during bad periods; such a period could indicate some need for improvement in the provider's traffic modeling. Note that a provider must provision a network according to an accurate or suitably over-estimated traffic model in order to have confidence in meeting an SLA such as this. Alternatively, it is safer to provision a network using a "maximum possible traffic" model, as discussed earlier. This would typically be done early in network deployment.
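As an illustration of the monitoring this implies, a provider could check clauses 3. and 4. against per-hour delivery counts along these lines (a Python sketch; the data layout is an assumption):

    def meets_sla(hourly):
        """hourly = [(delivered, offered), ...], one entry per hour of the
        month. Clause 3: at least 99 percent delivered over the month.
        Clause 4: at least 90 percent delivered even in the worst hour."""
        month_ok = (sum(d for d, _ in hourly)
                    >= 0.99 * sum(o for _, o in hourly))
        worst_hour_ok = all(d >= 0.90 * o for d, o in hourly if o > 0)
        return month_ok and worst_hour_ok

    # One bad hour (92 percent delivered) is tolerable if the month holds.
    stats = [(1_000_000, 1_000_000)] * 719 + [(920_000, 1_000_000)]
    print(meets_sla(stats))   # True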

Clauses 5. and 7. specify loose delay bounds, which are straightforward to provide. This will be discussed in more detail in "Delay Limits".

Clause 8. is an important provision for IP traffic, namely that packets are delivered in order. TCP, UDP, and other transport protocols can deal with IP packet misordering, but large amounts of misordering can lead to poor transport-layer performance. Fortunately, the queueing technologies used on Cisco equipment ensure packet ordering within each class of service, except during rerouting.

It is also desirable that premium and best-effort traffic is carried in the same queues, in order to ensure packet ordering across both classes. This is quite possible on Cisco equipment, and preference is given to the premium traffic by way of discard policies. This is discussed in more detail in the "Discard Policies" section.

Because rerouting is a rare and short-lived event caused only by link or equipment outages, or manual routing changes, it is easy to give a strong assurance of packet ordering. Modern telecommunications networks, including those using Cisco equipment, rarely introduce bit errors. An error provision such as Clause 9. is easy to meet.

Readers familiar with QoS guarantees and SLAs for ATM and Frame Relay traffic may be surprised by the weakness of the guarantees suggested in the example. The numbers suggested are entirely reasonable for IP traffic, because most IP traffic is extremely tolerant of varying QoS. The SLA is also incomparably better than the SLAs offered on most public IP networks today: most public IP networks offer no SLAs at all, and the Internet has been widely successful despite this. Providers who offer SLAs of the strength suggested above will capture a large untapped market: customers requiring moderate guarantees for a truly connectionless IP network, but who do not wish to specify all point-to-point bandwidth requirements. Traditional point-to-point QoS guarantees are over-engineered for the needs of most IP traffic.

Some IP traffic requires stronger SLAs. Real-time IP traffic is considered in the next example. Stronger, Frame Relay-quality SLAs may be required for a small minority of IP traffic, and these are considered in "More Stringent Quality of Service in IP+ATM Networks".

Sample Service Level Agreement with Provision for Real Time Traffic

This section considers the types of Service Level Agreement a service provider might offer to its customers for IP traffic. Because the meeting of such a Service Level Agreement depends in part on use of appropriate planning and monitoring processes by a service provider, the service levels described here are illustrative only, and Cisco offers no guarantee that any particular network will be able to meet the service levels described.

    1. The CPE may indicate that traffic is real time by setting the IP Precedence field on each packet to a value greater than 0. Any packets with a precedence value greater than 0 will be treated as real time. 64 Kbps of real time traffic will be accepted. Any real time traffic in excess of 64 Kbps will be discarded. Furthermore, any traffic that otherwise satisfies this clause, but is sent so that the sum of real time bandwidths sent to the receiver is greater than the receiver's link rate, is counted against the 64 Kbps of real-time traffic, but is treated as best effort traffic for the purpose of clauses 4. to 9.

    2. The first 256 Kb of non-real time traffic sent from a customer site each second is committed. Any traffic that otherwise satisfies this clause, but is sent so that the sum of the real time and committed bandwidths sent to the receiver is greater than the receiver's link rate, is counted against the 256 Kbps of committed traffic, but is treated as best effort traffic for the purpose of clauses 4. to 9.

    3. Additional traffic, up to a total site bandwidth of 1024 Kbps, will be accepted by the network with no guarantee of delivery. This is referred to as best effort traffic.

    4. Within each calendar month, at least 99.9 percent of real-time packets from this site will be delivered.

    5. For each calendar month: during the 1-hour period in which the lowest proportion of real time packets is delivered by the network, at least 99 percent of real time packets from this site will be delivered.

    6. Within each calendar month, at least 99 percent of committed packets from this site will be delivered.

    7. For each calendar month: during the 1-hour period in which the lowest proportion of committed packets is delivered by the network, at least 90 percent of committed packets from this site will be delivered.

    8. 99.9 percent of real time packets which are delivered will be delivered within 100 milliseconds of being accepted.

    9. (Other clauses as per the previous example.)

If more than two traffic classes are supported, it is necessary to identify the preferred traffic class in some way. Clause 1. defines a means of doing this. Alternatively, the provider equipment could detect the CoS by using the contents of the IP packet headers. Cisco CAR allows for this.

Note that IP Precedence 5 is the default class of Voice Over IP packets sent from Cisco equipment, and that clause 1 supports any other equipment that marks real-time packets with a precedence greater than 0. Clause 1. strictly limits the real time traffic that a site may send, and this helps in the engineering of the network to support the real time traffic class.

In order to meet the relatively strong delivery agreements specified in Clauses 4. and 5., it might be necessary to use a "maximum possible traffic" model for the real time traffic. The remaining clauses are similar to the previous example.

A provider must provision a network according to an accurate or suitably conservative traffic model in order to have confidence in meeting an SLA such as this. Alternatively, it is safer to provision both the real time and committed traffic using a "maximum possible traffic" model, as discussed in "Modeling Network Traffic Flows To Meet Service Level Agreements" section.

Adding a New Site

Because IP services are connectionless, normal processes of connection admission control are not useful. The equivalent service admission control is a network management operation.

Consider the process of adding a new customer site or set of sites. A typical service admission process would follow this procedure:


Step 1 As an on-going monitoring operation, maintain a matrix A of the available bandwidth for each CoS between each pair of edge LSRs. This should consider engineering rules, for example that 25 percent of the bandwidth for each CoS on each link should be left unallocated as a statistical reserve. Also maintain a matrix R of bandwidth reserved for services that have been allowed (see Step 3) but not yet activated.

Step 2 Form a model of the traffic introduced by the new service. Traffic modeling is discussed in "Modeling Network Traffic Flows To Meet Service Level Agreements". This results in a matrix N.

Step 3 Compare the new traffic N to the available bandwidth (A-R). If there is sufficient available bandwidth, that is, (A-R-N)>0, then allow the new service and increase the reserved bandwidth R by the new traffic N. Otherwise, refer the new service request so that it can be met by increasing the provisioning of the network. If increased provisioning isn't possible, the service request must be rejected.

Step 4 After the service is activated, allow some time (weeks or more) for the customer to start using the service. Then decrease the reserved traffic R by N, and deal with any further changes in customers' use of bandwidth by the normal engineering processes.

This process is based on straightforward statistics collection and calculations and could be added to an existing Operations Support System. In a similar manner to Connection Admission Control for point-to-point services, Service Admission Control will ensure that new services will receive their desired quality of service.
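A minimal Python sketch of this check, with the matrices A, R, and N held as dictionaries keyed by (origin, destination) edge-LSR pair for one class of service (the layout and numbers are illustrative):

    def admit(available, reserved, new_service):
        """Allow a service only if every O-D entry has headroom after
        subtracting existing reservations and the new demand (Mbps)."""
        for od, n in new_service.items():
            if available.get(od, 0) - reserved.get(od, 0) - n < 0:
                return False        # refer for re-provisioning instead
        for od, n in new_service.items():
            reserved[od] = reserved.get(od, 0) + n   # hold until activated
        return True

    A = {("A", "B"): 100}
    R = {("A", "B"): 60}
    print(admit(A, R, {("A", "B"): 30}))   # True; reservation rises to 90
    print(admit(A, R, {("A", "B"): 30}))   # False; only 10 Mbps headroom left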


What If There Isn't Much Best Effort Traffic in My Network?

A prevalence of best-effort traffic assists with the engineering of DiffServ networks by allowing for very good QoS for premium traffic in the presence of approximate traffic estimates and without wasting bandwidth. Best-effort traffic helps cushion premium traffic. If best-effort traffic is not prevalent, it is still possible to use the engineering techniques described previously but there is an increased need for accuracy in traffic models.

These measures are helpful:

Standardization

CAR and the use of DiffServ discussed in "The Differential Services Approach to Quality of Service" are based on the forthcoming MPLS and Differentiated Services standards from the IETF. Cisco's implementation of these technologies is either fully compliant with the standards (to the extent they are complete), or is compliant with the older IP Precedence definitions and will be upgraded to comply with DiffServ.

Relevant IETF documents are:

Some of these documents are Working Group Internet Drafts, works in progress at IETF Working Groups. Other Internet Drafts are referred to as Individual Internet Drafts. Individual Internet Drafts have no status in the standardization process, except as proposals from an individual or company. Individual Internet Drafts are easily recognized because "ietf" is not part of their name. Working Group Internet Drafts, on the other hand, are called "draft-ietf-" followed by the name of the Working Group, such as "mpls" or "diffserv", followed by the title of the draft.

The Differential Services Approach to Quality of Service: Summary

Good Quality of Service can be provided to connectionless IP traffic, on MPLS networks in particular. The process of doing this involves several steps:

MPLS Traffic Engineering

Forming and measuring traffic models is an important part of providing good Quality of Service for connectionless traffic. Cisco is developing tools to assist this process. The first of these, MPLS Traffic Engineering, is currently available for Cisco router-based MPLS equipment and will be extended to ATM-LSRs. MPLS Traffic Engineering works by automatically measuring the actual traffic loads on the links of a network and then adjusting the routing of traffic to make the best use of the available bandwidth.

There are several other uses for MPLS Traffic Engineering. It provides support for a full range of operational requirements in IP networks, all related to the choosing of routes for traffic:

Cisco's implementation of MPLS Traffic Engineering works in this way:

Optimizing traffic routing using MPLS Traffic Engineering is illustrated in Figure 4-11. In Topology (a), the mean load on the link between nodes E and F is 91 percent. At this load, there is imminent danger that packet loss will occur, if it is not occurring already. MPLS Traffic Engineering will find candidate streams to re-route away from that link, if possible.

For example, the LSPs between edge LSRs A and B may be carrying significant amounts of traffic. MPLS Traffic Engineering attempts to find an alternative route for one or more LSPs so that the load on the (E,F) link is reduced without increasing the loads on other links to a similarly dangerous level. So, depending on the bandwidth on the LSPs, it may be possible to solve the congestion by re-routing traffic away from (E,F) along a different path.


Figure 4-11: Re-optimization of Traffic Using MPLS Traffic Engineering


Cisco's implementation of MPLS Traffic Engineering acts automatically to spread loads around a network's links as evenly as possible. It acts to minimize loads even if links are not currently overloaded. In this way, Cisco MPLS Traffic Engineering actively prevents overloads wherever possible.

MPLS Traffic engineering is based on measuring actual link loads. Existing QoS-aware routing protocols such as the ATM Forum's PNNI are less useful for this application because they are based on signalling of subscribed loads, rather than measurement of actual bandwidth loads. As previously noted, measurement of actual traffic is an important part of automating QoS in IP networks.

PNNI is not a good routing protocol for IP traffic for other reasons. Most notable is that there is no simple way for PNNI to make routing decisions in conjunction with routing information from standard IP routing protocols, namely OSPF and IS-IS. (There was a proposal called "IPNNI" to use PNNI as an IP routing protocol instead of OSPF or IS-IS. This failed because ISPs and carriers' Internet groups had no intention of replacing their existing OSPF or IS-IS infrastructures. Note that OSPF, in particular, is far more widely used than PNNI. IS-IS is also used by about half of the largest ISPs, and by several carriers.)

MPLS traffic engineering based on OSPF or IS-IS overcomes these limitations, and supports measurement-based engineering of connectionless traffic. In addition, it can support PNNI-style point-to-point bandwidth reservations where required, as discussed next.

Note that these issues are relevant only for routing for MPLS LSPs. Cisco IP+ATM switches use a full-featured PNNI implementation for traditional ATM connections: SVCs, SPVCs, and so on. MPLS and traditional ATM connections can be operated on the same links.

The candidate flows for adjustment using MPLS Traffic Engineering must be identified ahead of time. It will normally be found that a relatively small percentage of possible origin-destination traffic streams accounts for a large proportion of traffic.

For example, a network with 100 edge LSRs has 9900 possible origin-destination streams, but it would typically be found that 90 percent or more of the traffic in the network might be on a few hundred of these. This means that MPLS Traffic Engineering can successfully optimize the traffic in a network by optimizing routing for a relatively small number of candidate origin-destination streams. The other streams will be left to follow the routes chosen normally by OSPF or IS-IS. Note that this does not imply lower quality of service for the non-candidate streams. These still benefit from the low link loads and even distribution of traffic provided by MPLS Traffic Engineering.

The signalling used in Cisco's implementation of MPLS Traffic Engineering is compliant with a traffic engineering method approved by the MPLS Working Group. Extensions to carry link loading information in the IS-IS and OSPF routing protocols are works in progress at the IS-IS and OSPF Working Groups at the IETF.

More Stringent Quality of Service in IP+ATM Networks

Previous discussions have described the provisioning of SLAs for connectionless traffic in the absence of subscribed, point-to-point bandwidths. In some cases, users will want harder, PVC-like Quality of Service guarantees for traffic between certain sites. This may be required for critical business applications, disaster recovery, and so on. Cisco IP+ATM can meet these requirements in two ways.

The full MPLS solution is shown in Figure 4-12, Topology (a). In this solution, MPLS Traffic Engineering routes a Label-Switched Path (LSP) with a specific, reserved bandwidth between two edge LSRs. This LSP is reserved for traffic between two particular customer sites. (LSPs may be aggregated to help scalability; this is discussed in "Quality of Service for MPLS VPNs" below.) Per-VC queueing will be used for this LSP. Because MPLS Traffic Engineering routes this LSP based on a reserved bandwidth, rather than a measured bandwidth, these point-to-point LSPs will give QoS equivalent to switched permanent virtual circuits (SPVCs). This allows IP SLAs to be extended to include traffic reservations between specific sites.

Connectionless IP services require a Service Admission Control (SAC) process rather than a traditional Connection Admission Control (CAC) process. The point-to-point LSP services, on the other hand, use traditional CAC---the network will reject the connection if insufficient bandwidth is available.

There is an alternative to point-to-point MPLS links, which achieves the same results. With Cisco IP+ATM networks, a single customer site with a single access link can access both a traditional end-to-end PVC and a connectionless IP service as shown in Figure 4-12, Topology (b).

The IP service can be used to deliver connectionless SLAs, and the PVC will deliver the extreme reliability already present for Frame Relay and ATM services on Cisco IP+ATM switches. The IP+ATM method is likely to be widely used for a long time because customers, such as banks, who already have a Frame Relay or ATM service for existing transaction processing might not want to shift away from Frame Relay or ATM for that traffic. If a customer's established business processes are running on long-used technology, it is quite reasonable for the customer to want to keep that existing infrastructure. Cisco IP+ATM solutions allow existing Frame Relay and ATM connectivity to be retained, while IP services are introduced for any-to-any connectivity for the new IP traffic, as shown in Figure 4-12 Topology (c). The switches in the network carry both traditional Frame Relay and ATM services, as well as IP services.


Figure 4-12: Reserved Point-to-Point Bandwidths in MPLS Networks


Quality of Service for MPLS VPNs

MPLS VPNs have the same QoS options as any other MPLS networks. Sites in VPNs can subscribe to specified rates of specified classes of service, and the provider can offer connectionless Service Level Agreements for those classes. Cisco Committed Access Rate (CAR) is used at Provider Edge Routers (PERs) to enforce traffic contracts. Differentiated Services are used in the network core.

One of the advantages of Cisco MPLS VPNs is that the core LSRs do not have any knowledge or state for individual VPNs. This has advantages for class-based service and fairness. In particular, if premium traffic has precedence over best-effort traffic, then this applies irrespective of which VPNs are the sources of the premium and best-effort traffic.

Site-to-site bandwidth reservations could be used with MPLS VPNs, as shown in Figure 4-13, Topology (b). If there are two separate point-to-point LSPs with the same originating and destination Provider Edge routers, then these may be aggregated into a single LSP as shown in Topology (c).

This helps preserve the scalability advantages of MPLS VPNs, specifically the absence of per-VPN state in the network core. When point-to-point LSPs are aggregated, Weighted Fair Queueing (WFQ) is used to ensure that each site gets its correct share of the aggregated bandwidth, as shown in Topology (d). This means that the QoS achieved with the aggregated LSPs is equivalent to that which is achieved with separate LSPs.


Figure 4-13: Quality of Service in Virtual Private Networks


Sometimes it is desirable to provision bandwidth within VPNs to specific users and applications. This can be achieved by running Cisco CAR on CPE routers. CAR can be used to give precedence or specific bandwidths to specific users or applications, as chosen by the customer. This is illustrated in Figure 4-14.


Figure 4-14: Providing Bandwidth To Specific Users and Applications in Virtual Private Networks


Discard Policies

This discussion of classes of service has described how bandwidth is reserved for different classes of service using class-based Weighted Fair Queueing (WFQ). Class-based WFQ is particularly useful for differentiating between classes of service, but there are other options. Queues are used at the ingress to each link in a network to ensure efficient use of the link and appropriate differentiation of service.

In a normal queueing system, packets are accepted into a queue up to a discard threshold. Past the discard threshold, 100 percent of arriving packets are discarded. This discard characteristic is shown in Figure 4-15, Topology (a).

An alternative is to have a single queue with more than one discard class, as shown in Topology (b). This is used in Cisco IP+ATM switches to give Cell Loss Priority (CLP): cells with their CLP bit set are discarded at a lower queue occupancy than cells without their CLP bits set. This gives discard priority to cells without their CLP bits set. CLP-bit setting will enable Cisco edge LSRs to work in conjunction with CLP on Cisco ATM-LSRs.

An enhancement to hard discard thresholds is to use Random Early Discard (RED). With RED, some packets are randomly discarded below the main discard threshold, shown in Figure 4-15 (c). This has two main advantages:

Irrespective of whether WRED is used, there is some suggestion that UDP traffic can get unfairly good QoS compared to TCP traffic because of TCP's behavior of backing off in the presence of packet loss. If this is found to be a problem in practice, it may be solved by transmitting TCP and UDP traffic in different Classes of Service.


Figure 4-15: Discard Policies


As with simple discards, there may be multiple RED characteristics for a single queue to deal with multiple different Classes of Service. This is known as Weighted RED (WRED). Weighted RED is illustrated in Figure 4-15 Diagram (d). Cisco routers offer Weighted RED, with up to eight discard classes per queue.
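The shape of a RED ramp for one discard class can be sketched as follows; the thresholds and mark probability are illustrative assumptions, not defaults of any particular platform. WRED simply gives each discard class its own ramp, with lower-priority classes beginning to drop at lower queue occupancies:

    def red_drop_probability(avg_queue, min_th, max_th, max_p):
        """No drops below min_th, a linear ramp up to max_p at max_th,
        and full discard above max_th."""
        if avg_queue < min_th:
            return 0.0
        if avg_queue >= max_th:
            return 1.0
        return max_p * (avg_queue - min_th) / (max_th - min_th)

    # Under WRED, a lower-priority class starts dropping earlier.
    print(red_drop_probability(30, min_th=20, max_th=40, max_p=0.1))  # 0.05
    print(red_drop_probability(30, min_th=30, max_th=60, max_p=0.1))  # 0.0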

Either multiple discard thresholds or WRED can be combined with weighted fair queueing. In the example shown in Figure 4-16:


Figure 4-16: Example of Combining Weighted Fair Queueing and Differential Discards


Figure 4-17 shows the effects of offering various mixtures of traffic to the queues shown in Figure 4-16. Real-time traffic has access to Br if it needs it, and normal data traffic has access to Bc. However, sometimes the out-of-contract real time traffic or the best effort traffic gets no bandwidth whatsoever.


Figure 4-17: Effects of Combining Weighted Fair Queueing and Differential Discards


This combination of weighted fair queueing and multiple discard thresholds may be useful for two reasons:

Delay Limits

Delay can be limited by appropriate setting of discard thresholds.

For example, if real time traffic has a reserved service rate of Br, and the discard threshold for real time traffic is set to Dr, then the maximum delay at that queue is (Dr/Br).
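For instance, with assumed values of Dr and Br (illustrative numbers only):

    # Worst-case queueing delay for the real-time class is Dr/Br.
    discard_threshold_bits = 64_000 * 8   # Dr: assumed 64-Kbyte queue limit
    service_rate_bps = 2_000_000          # Br: assumed 2 Mbps reservation

    max_delay_ms = 1000 * discard_threshold_bits / service_rate_bps
    print(f"{max_delay_ms:.0f} ms")       # 256 ms maximum delay at this queue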

All discard thresholds in Cisco IP+ATM equipment may be adjusted by the network operator. Real-time IP applications are normally quite tolerant of delay jitter, so it is not clear whether specific engineering for delay jitter is required in MPLS networks. If necessary, low jitter can be ensured by normal traffic engineering methods involving over-allocation of bandwidth to real-time traffic.

A further option in Cisco IP+ATM equipment is to give real time IP traffic priority access to spare bandwidth, ahead of any other weighted fair queueing classes irrespective of their weight. This is currently available in Cisco hardware and used for ATM Forum CBR and VBR traffic.

In Cisco IP+ATM networks, this over-allocation will typically not result in wasted bandwidth, as other classes have access to bandwidth left unused by real-time traffic.

Alternative Service Types

In Cisco IP+ATM equipment, QoS is provided for MPLS LVCs using class-based weighted fair queueing; this is how the DiffServ Assured Forwarding (AF) classes are supported. There are no per-VC bandwidth allocations, and per-VC queueing is not used because it is inconsistent with the DiffServ model.

A default configuration maps these classes to the Class of Service buffers. A network operator could override these mappings by using a feature on IP+ATM switches called "Configurable Service Templates." However, this is not recommended because DiffServ Assured Forwarding assumes the use of class-based service and does not signal any bandwidth parameters. The DiffServ QoS model is quite different from ATM Forum per-VC QoS.

Although the default QoS classes may be overridden with other service classes, including ATM Forum Traffic Management classes, the default mappings have been carefully chosen to be the most appropriate.

