Session Initiation Protocol (SIP) Overload Control

Bell Labs/Alcatel-Lucent

791 Holmdel-Keyport Rd Holmdel NJ 07733 USA volkerh@bell-labs.com

Bell Labs/Alcatel-Lucent

600-700 Mountain Avenue Murray Hill NJ 07974 USA iwidjaja@alcatel-lucent.com

Level 3 Communications

1025 Eldorado Blvd. Broomfield CO USA daryl.malas@level3.com

Columbia University/Department of Computer Science

450 Computer Science Building New York NY 10027 USA +1 212 939 7004 hgs@cs.columbia.edu http://www.cs.columbia.edu

Transport
SIPPING Working Group
SIP Overload Control

Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive. Even though SIP provides a limited overload control mechanism through its 503 (Service Unavailable) response code, SIP servers are still vulnerable to overload. This document proposes new overload control mechanisms for SIP.

As with any network element, a Session Initiation Protocol (SIP) server can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. Overload can pose a serious problem for a network of SIP servers. During periods of overload, the throughput of a network of SIP servers can be significantly degraded. In fact, overload may lead to a situation in which the throughput drops to a small fraction of the original processing capacity. This is often called congestion collapse.

Overload is said to occur if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU processing capacity, memory, network bandwidth, input/output, or disk resources.

For overload control, we only consider failure cases where SIP servers are unable to process all SIP requests. There are other cases where a SIP server can successfully process incoming requests but has to reject them due to other failure conditions. For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response. Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Internal Error) response. Overload control does not apply to these cases and SIP provides appropriate response codes for them.

SIP provides a limited mechanism for overload control through its 503 (Service Unavailable) response code. However, this mechanism cannot prevent overload of a SIP server and it cannot prevent congestion collapse. In fact, the use of the 503 (Service Unavailable) response code may cause traffic to oscillate and to shift between SIP servers and thereby worsen an overload condition.

A detailed discussion of the SIP overload problem, the problems with the 503 (Service Unavailable) response code and the requirements for a SIP overload control mechanism can be found in .

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

This section discusses key design considerations for a SIP overload control mechanism. The goal for this mechanism is to enable a SIP server to control the amount of traffic it receives from its upstream neighbors.

The model shown in the figure below identifies the fundamental components of a SIP overload control system:

SIP Processor: The SIP Processor processes SIP messages and is the component that is protected by overload control.

Monitor: The Monitor measures the current load of the SIP Processor on the receiving entity. It implements the mechanisms needed to determine the current usage of resources relevant for the SIP Processor and reports load samples (S) to the Control Function.

Control Function: The Control Function implements the overload control algorithm. It uses the load samples (S) to determine whether overload has occurred and whether a throttle (T) needs to be set to adjust the load sent to the SIP Processor on the receiving entity. The Control Function on the receiving entity sends load feedback (F) to the sending entity.

Actuator: The Actuator implements the algorithms needed to act on the throttles (T) and to adjust the amount of traffic forwarded to the receiving entity. For example, a throttle may instruct the Actuator to reduce the traffic destined to the receiving entity by 10%. The algorithms in the Actuator then determine how the traffic reduction is achieved, e.g., by selecting the messages that will be affected and determining whether they are rejected or redirected.

The type of feedback (F) conveyed from the receiving to the sending entity depends on the overload control method used (i.e., loss-based, rate-based or window-based overload control), the overload control algorithm, and other design parameters. In any case, the feedback (F) enables the sending entity to adjust the amount of traffic forwarded to the receiving entity to a level that the receiving entity can handle without becoming overloaded.

[Figure: overload control system model with SIP Processor, Monitor, Control Function and Actuator, and the load samples (S), throttles (T) and feedback (F) exchanged between the sending and the receiving entity]

A SIP request is often processed by more than one SIP server on its path to the destination. Thus, a design choice for overload control is where to place the components of overload control along the path of a request and, in particular, where to place the Monitor and Actuator. This design choice determines the degree of cooperation between the SIP servers on the path.

Overload control can be implemented locally on a SIP server if the Monitor and Actuator reside on the same server. Overload control can be implemented hop-by-hop with the Monitor on one server and the Actuator on its direct upstream neighbor. Finally, overload control can be implemented end-to-end with Monitors on all SIP servers along the path of a request and one Actuator on the sender. In this case, the Monitors have to cooperate to jointly determine the current resource usage on this path. These three configurations are shown in the figure below.

[Figure: overload control configurations for SIP servers A, B, C and D: (a) local, (b) hop-by-hop, (c) end-to-end. Legend: ==> SIP request flow, <-- overload feedback loop]

Servers can implement SIP overload control locally. This does not require any cooperation with neighboring SIP servers. All overload control components (Monitor, Control Function, Actuator) reside on the same SIP element. The idea of local overload control is to determine when a SIP server reaches a high load and to start rejecting requests with as little effort as possible, i.e., early in the processing, if overload occurs. Since rejecting these messages requires less processing capacity than fully processing them, a server is able to gracefully reject excess messages instead of simply dropping them. However, once the number of incoming requests exceeds the server's capacity to reject them, the server will still become overloaded. Local overload control does not require protocol support and is out of scope for this document.

The idea of hop-by-hop overload control is to instantiate a separate control loop between all neighboring SIP servers that directly exchange traffic. That is, the Actuator is located on the SIP server that is the direct upstream neighbor of the SIP server that has the corresponding Monitor. Each control loop between two servers is completely independent of the control loops with other servers further up- or downstream. In the hop-by-hop example (b), three independent overload control loops are instantiated: A - B, B - C and B - D. Each loop only controls a single hop.

Overload feedback received from a downstream neighbor is not forwarded further upstream. Instead, a SIP server acts on this feedback, for example, by re-routing or rejecting traffic if needed. If the upstream neighbor of a server also becomes overloaded, it will report this problem to its own upstream neighbors, which again take action based on the reported feedback. Thus, in hop-by-hop overload control, overload is always resolved by the direct upstream neighbors of the overloaded server without the need to involve entities that are located multiple SIP hops away.

Hop-by-hop overload control reduces the impact of overload on a SIP network and, in particular, can avoid congestion collapse. In addition, hop-by-hop overload control is simple and scales well to networks with many SIP entities. It does not require a SIP entity to aggregate a large number of overload status values or keep track of the overload status of SIP servers it is not communicating with.

End-to-end overload control implements an overload control loop along the entire path of a SIP request, from UAC to UAS. An end-to-end overload control mechanism consolidates overload information from all SIP servers on the way, including all proxies and the UAS, and uses this information to throttle traffic as far upstream as possible. An end-to-end overload control mechanism has to be able to frequently collect the overload status of all servers on the potential path(s) to a destination and combine this data into meaningful overload feedback.

A UA or SIP server only needs to throttle requests if it knows that these requests will eventually be forwarded to an overloaded server. For example, if D is overloaded in the end-to-end example (c), A should only throttle requests it forwards to B when it knows that they will be forwarded to D. It should not throttle requests that will eventually be forwarded to C, since server C is not overloaded. In many cases, it is difficult for A to determine which requests will be routed to C and which to D, since this depends on the local routing decision made by B.

The main problem of end-to-end overload control is its inherent complexity, since UACs or SIP servers need to monitor all potential paths to a destination in order to determine which requests should be throttled and which requests may be sent. In addition, the routing decisions of a SIP server depend on local policy, which can be difficult for an upstream neighbor to infer. Therefore, end-to-end overload control is likely to work well only in simple, well-known topologies (e.g., a server that is known to have only one downstream neighbor) or if a UA/server sends many requests to the exact same destination.

The following topologies describe four generic SIP server configurations, each of which poses specific challenges for an overload control mechanism.

In the "load balancer" configuration, shown as topology (a), a set of SIP servers (D, E and F) receives traffic from a single source A. A load balancer is a typical example for such a configuration. In this configuration, overload control needs to prevent server A (i.e., the load balancer) from sending too much traffic to any of its downstream neighbors D, E and F. If one of the downstream neighbors becomes overloaded, A can direct traffic to the servers that still have capacity. If one of the servers serves as a backup, it can be activated once one of the primary servers reaches overload.

If A can reliably determine that D, E and F are its only downstream neighbors and all of them are in overload, it may choose to report overload upstream on behalf of D, E and F. However, if the set of downstream neighbors is not fixed or only some of them are in overload, then A should not use overload control towards its upstream neighbors, since A can still forward the requests destined to non-overloaded downstream neighbors. These requests would be throttled as well if A used overload control towards its upstream neighbors.

In the "multiple sources" configuration, shown as topology (b), a SIP server D receives traffic from multiple upstream sources A, B and C. Each of these sources can contribute a different amount of traffic, which can vary over time. The set of active upstream neighbors of D can change as servers may become inactive and previously inactive servers may start contributing traffic to D.

If D becomes overloaded, it needs to generate feedback to reduce the amount of traffic it receives from its upstream neighbors. D needs to decide by how much each upstream neighbor should reduce traffic. This decision can require the consideration of the amount of traffic sent by each upstream neighbor and it may need to be re-adjusted as the traffic contributed by each upstream neighbor varies over time.

An important goal for overload control is to achieve fairness across upstream neighbors. That is, no upstream neighbor should be required to throttle more than another neighbor. In a fair system, each request that is routed to D has an equal chance of being processed, independent of the upstream neighbor it is coming from. A SIP server may, however, have local policies that prefer some sources over others. For example, it can throttle a less preferred upstream neighbor more or earlier than a preferred neighbor.

In many configurations, SIP servers form a "mesh", shown as topology (c). Here, multiple upstream servers A, B and C forward traffic to multiple alternative servers D and E. This configuration is a combination of the "load balancer" and "multiple sources" scenarios.

[Figure: SIP server topologies: (a) load balancer, (b) multiple sources, (c) mesh, (d) edge proxy]

Overload control that is based on reducing the number of messages a sender is allowed to send is not suited for servers that receive requests from a very large population of senders, each of which only infrequently sends a request. This scenario is shown as topology (d). An edge proxy that is connected to many UAs is a typical example for such a configuration. Since each UA typically only contributes a few requests, which are often related to the same call, it cannot decrease its message rate to resolve the overload.

In such a configuration, a SIP server can resort to local overload control by rejecting a percentage of the requests it receives with 503 (Service Unavailable) responses. Since there are many upstream neighbors that contribute to the overall load, sending 503 (Service Unavailable) to a fraction of them can gradually reduce load without entirely stopping all incoming traffic. Using 503 (Service Unavailable) towards individual sources cannot, however, prevent overload if a large number of users place calls at the same time.

OPEN ISSUE: The requirements of the "edge proxy" topology differ from those of the other topologies, which may require a different method for overload control.

The method used by an overload control mechanism to limit the amount of traffic forwarded to an element is an important aspect of the design. We discuss the following three different types of overload control: rate-based, loss-based and window-based overload control.

The key idea of rate-based overload control is to limit the rate at which an upstream element is allowed to forward requests to its downstream neighbor. If overload occurs, a SIP server instructs each upstream neighbor to send at most X requests per second. Each upstream neighbor can be assigned a different rate cap.

The rate cap ensures that the number of requests received by a SIP server never increases beyond the sum of all rate caps granted to upstream neighbors. It can protect a SIP server against overload even during load spikes if no new upstream neighbors start sending traffic. New upstream neighbors need to be factored into the rate caps assigned as soon as they appear. The current overall rate cap used by a SIP server is determined by an overload control algorithm, e.g., based on system load.

An algorithm for the sending entity to implement a rate cap of a given number of requests per second X is request gapping. After transmitting a request to a downstream neighbor, a server waits for 1/X seconds before it transmits the next request to the same neighbor. Requests that arrive during the waiting period are not forwarded and are either redirected, rejected or buffered.

The main drawback of this mechanism is that it requires a SIP server to assign a certain rate cap to each of its upstream neighbors based on its overall capacity. Effectively, a server assigns a share of its capacity to each upstream neighbor. The server needs to ensure that the sum of all rate caps assigned to upstream neighbors is not (significantly) higher than its actual processing capacity. This requires a SIP server to continuously evaluate the amount of load it receives from each upstream neighbor and assign a rate cap that is suitable for this neighbor without limiting it too much. For example, in a non-overloaded situation, it could assign a rate cap that is 10% higher than the current number of requests received from this neighbor.

This rate cap needs to be adjusted if the number of requests generated by the upstream neighbor changes (e.g., the neighbor wants to contribute a higher amount of traffic). The cap also needs to be adjusted if a new upstream neighbor appears or an existing neighbor stops transmitting. If the cap assigned to an upstream neighbor is too high, the server may still experience overload. However, if the cap is too low, the upstream neighbors will reject requests even though they could be processed by the server.
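Request gapping as described above can be sketched as follows. This is a minimal illustration rather than a normative algorithm: the class name RequestGapper, the try_forward method and the injectable clock are assumptions made for the example, and the caller is left to decide whether a request that is not forwarded is redirected, rejected or buffered.

```python
import time


class RequestGapper:
    """Forward at most `rate` requests per second by spacing them 1/rate apart.

    Hypothetical helper for illustration; not defined by this document.
    """

    def __init__(self, rate, clock=time.monotonic):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.gap = 1.0 / rate        # waiting period between forwarded requests
        self.clock = clock           # injectable clock, useful for testing
        self.next_allowed = None     # earliest time the next request may go out

    def try_forward(self):
        """Return True if a request may be forwarded now, False otherwise."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            # Still inside the waiting period: the caller has to redirect,
            # reject or buffer this request.
            return False
        self.next_allowed = now + self.gap
        return True
```

A sending entity would keep one such object per downstream neighbor and replace it, or update its gap, whenever the neighbor announces a new rate cap.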

A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by a percentage X. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects X percent of the traffic that is destined for this server. The loss percentage is determined by an overload control algorithm, e.g., based on current system load.

An algorithm for the sending entity to implement a loss percentage is to draw a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.

An advantage of loss-based overload control is that the receiving entity does not need to track the request rate it receives from each upstream neighbor. It is sufficient to monitor the overall system utilization. To reduce load, a server can ask its upstream neighbors to lower the traffic forwarded by a certain percentage. The server calculates this percentage by combining the loss percentage that is currently in use (i.e., the loss percentage the upstream neighbors are currently using when forwarding traffic), the current system utilization and the desired system utilization. For example, if the server load approaches 90% and the current loss percentage is set to a 50% traffic reduction, then the server can decide to increase the loss percentage to 55% in order to get to a system utilization of about 80%. Similarly, the server can lower the loss percentage if permitted by the system utilization. This requires that system utilization can be accurately measured and that these measurements are reasonably stable.

Loss-based overload control achieves fairness among incoming requests if all upstream neighbors are throttled by the same percentage. In this case, each request destined for an overloaded server has the same chance of being rejected by overload control.

The main drawback of percentage throttling is that the throttle percentage needs to be adjusted to the current number of requests received by the server. This is particularly important if the number of requests received fluctuates quickly. For example, if a SIP server sets a throttle value of 10% at time t1 and the number of requests offered by its upstream neighbors increases by 20% between time t1 and t2 (t1<t2), then at t2 the server receives about 8% more traffic than it did before the throttle was set (0.9 x 1.2 = 1.08), even though all upstream neighbors have reduced traffic by 10% as instructed. Thus, percentage throttling requires an adjustment of the throttling percentage in response to the traffic received and may not always be able to prevent a server from encountering brief periods of overload in extreme cases.
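The random draw on the sending entity and the adjustment of the loss percentage on the receiving entity can be sketched as follows. The function names are hypothetical, and next_loss_percent rests on an assumption that this document does not make: that system utilization scales linearly with the amount of admitted traffic. Under that assumption, the 90% / 50% / 80% example above yields a loss percentage of roughly 55.6%, which is consistent with the 55% quoted in the text.

```python
import random


def should_forward(loss_percent, rng=random):
    """Draw an integer in 1..100; do not forward if it is <= loss_percent."""
    return rng.randint(1, 100) > loss_percent


def next_loss_percent(current_loss, current_util, target_util):
    """Estimate the loss percentage that brings utilization to the target.

    Assumes utilization is proportional to admitted traffic (an illustrative
    simplification). Utilizations are fractions, e.g. 0.9 for 90%.
    """
    admitted = 1.0 - current_loss / 100.0    # fraction currently let through
    if admitted <= 0.0 or current_util <= 0.0:
        # Nothing is admitted or measured, so there is nothing to scale from;
        # keep the current value.
        return float(current_loss)
    offered = current_util / admitted        # utilization with no throttling
    new_loss = 100.0 * (1.0 - target_util / offered)
    return min(100.0, max(0.0, new_loss))    # clamp to a valid percentage
```

For instance, next_loss_percent(50, 0.90, 0.80) evaluates 100 * (1 - 0.8 / 1.8), which is approximately 55.6.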

The key idea of window-based overload control is to allow an entity to transmit a certain number of messages before it needs to receive a confirmation for the messages in transit. Each sender maintains an overload window that limits the number of messages that can be in transit without being confirmed.

Each sender maintains an unconfirmed message counter for each downstream neighbor it is communicating with. For each message sent to the downstream neighbor, the counter is increased by one. For each confirmation received, the counter is decreased by one. The sender stops transmitting messages to the downstream neighbor when the unconfirmed message counter has reached the current window size.

A crucial parameter for the performance of window-based overload control is the window size. The window size together with the round-trip time between sender and receiver determines the effective message rate that can be achieved. Each sender has an initial window size it uses when first sending a request. This window size can be changed based on the feedback it receives from the receiver. The receiver can require a decrease in window size to throttle the sender or permit an increase to allow a higher message rate.

The sender adjusts its window size as soon as it receives the corresponding feedback from the receiver. If the new window size is smaller than the current unconfirmed message counter, the sender stops transmitting messages until more messages are confirmed and the current unconfirmed message counter is less than the window size.

A sender should not treat the reception of a 100 Trying response as an implicit confirmation for a message. 100 Trying responses are often created by a SIP server very early in processing and do not indicate that a message has been successfully processed and cleared from the input buffer. If the downstream neighbor is a stateless proxy, it will not create 100 Trying responses at all and instead pass through 100 Trying responses created by the next stateful server. Also, 100 Trying responses are typically only created for INVITE requests. Explicit message confirmations via an overload feedback mechanism do not have these problems.

The behavior and issues of window-based overload control are similar to rate-based overload control, in that the total available receiver buffer space needs to be divided among all upstream neighbors. However, unlike rate-based overload control, window-based overload control can ensure that the receiver buffer does not overflow under normal conditions. The transmission of messages by senders is effectively clocked by message confirmations received from the receiver. A buffer overflow can still occur if a large number of new upstream neighbors arrives at the same time.
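The sender-side bookkeeping described above can be sketched as follows. The class and method names are hypothetical; how confirmations and window updates are actually carried between the two entities is left open here, as it is in the text.

```python
class WindowSender:
    """Per-neighbor window state kept by a sending entity (illustrative only)."""

    def __init__(self, initial_window):
        self.window = initial_window   # current window size
        self.unconfirmed = 0           # messages in transit, not yet confirmed

    def can_send(self):
        """A message may be sent only while the counter is below the window."""
        return self.unconfirmed < self.window

    def on_send(self):
        """Account for one message sent to the downstream neighbor."""
        if not self.can_send():
            raise RuntimeError("window is full")
        self.unconfirmed += 1

    def on_confirmation(self):
        """Account for one explicit confirmation (not a 100 Trying response)."""
        if self.unconfirmed > 0:
            self.unconfirmed -= 1

    def on_feedback(self, new_window):
        """Apply a window size announced by the receiver."""
        # If the new window is smaller than the counter, can_send() stays
        # False until enough confirmations have arrived.
        self.window = max(0, new_window)
```

A sender would keep one WindowSender per downstream neighbor and consult can_send() before forwarding each message.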

An important aspect of the design of an overload control mechanism is the overload control algorithm. The control algorithm determines when the amount of traffic a SIP server receives needs to be decreased and when it can be increased.

Overload control algorithms have been studied extensively and many different algorithms exist. This specification does not mandate the use or implementation of a specific algorithm. However, algorithms that are used MUST be compliant with the semantics for overload feedback and the behavior for the upstream node defined in this specification.

OPEN ISSUE: With many different overload control algorithms available, it seems reasonable to define a baseline algorithm and allow the use of other algorithms if they don't violate the protocol semantics. This will also allow the development of future algorithms, which may lead to better performance.

An important design aspect for an overload control mechanism is that it is self-limiting. That is, an overload control mechanism should stop a sender if the sender does not receive any feedback from the receiver. This prevents an overloaded server that has become unable to generate overload control feedback from being overwhelmed with requests. Window-based overload control is inherently self-limiting since a sender cannot continue without receiving confirmations. Servers using rate- or loss-based overload control need to be configured to stop transmitting if they do not receive any feedback from the receiver.
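For rate- or loss-based overload control, one way to make a sender self-limiting is a simple feedback timer, sketched below. The class name, the timeout parameter and the choice to treat start-up as the first feedback are all assumptions made for illustration; this document does not define a timeout value or a recovery procedure.

```python
import time


class FeedbackWatchdog:
    """Permit sending only while feedback from the receiver is recent."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout          # seconds of silence tolerated (assumed knob)
        self.clock = clock              # injectable clock, useful for testing
        self.last_feedback = clock()    # treat start-up as the first feedback

    def on_feedback(self):
        """Record that overload feedback has just been received."""
        self.last_feedback = self.clock()

    def may_send(self):
        """Return False once the receiver has been silent for too long."""
        return self.clock() - self.last_feedback <= self.timeout
```

A sender would check may_send() in addition to its rate or loss throttle before forwarding a request to the neighbor in question.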

It may be useful for a SIP server to frequently report its current load status to upstream neighbors. The load status indicates the degree to which the resources needed by a SIP server to process SIP messages are utilized. An upstream neighbor can use load status to balance load between alternative SIP servers and to find under-utilized servers. Reporting load is not intended to replace specialized load balancing mechanisms.

OPEN ISSUE: Reporting load status seems useful but somewhat orthogonal to overload control. Should this be a separate mechanism?

A SIP mechanism is needed to convey overload feedback from the receiving to the sending SIP entity. A number of different alternatives exist to implement such a mechanism.

Overload control information can be transmitted using a new Via header field parameter for overload control. A SIP server can add this header parameter to the responses it is sending upstream to inform its upstream neighbors about the current overload status. A detailed description of this header is provided in . This approach has the following characteristics: A Via header parameter is light-weight and creates very little overhead. It does not require the transmission of additional messages for overload control and does not increase traffic or processing burdens in an overload situation. Overload control status can frequently be reported to upstream neighbors since it is a part of a SIP response. This enables the use of this mechanism if overload feedback is needed frequently, e.g., for loss- or window-based overload control. With a Via header parameter, overload control status is inherent in SIP signaling and is automatically conveyed to all relevant upstream neighbors, i.e., neighbors that are currently contributing traffic. There is no need for a SIP server to specifically track the set of current upstream or downstream neighbors with which it should exchange overload feedback. Overload status is not conveyed to inactive senders. This avoids the transmission of overload feedback to inactive senders, which do not contribute traffic. If an inactive sender starts to transmit while the receiver is in overload it will receive overload feedback in the first response and can adjust the amount of traffic forwarded accordingly. A SIP server can limit the distribution of overload control information by only inserting it into responses to known upstream neighbors. A SIP server can use transport level authentication (e.g., via TLS) with its upstream neighbors.

Overload control information can also be conveyed from a receiver to a sender using a new event package. This event package enables a sending entity to subscribe to the overload status of its downstream neighbors and receive notifications of overload control status changes in NOTIFY requests. A detailed description of this event package is provided in . This approach has the following characteristics:

Overload control information is conveyed outside of the SIP signaling flow and can be decoupled from SIP signaling. For example, a separate overload control manager can be responsible for monitoring the load on all servers in a server farm and provide overload control feedback to all SIP servers that have set up subscriptions to this controller.

With an event package, a receiver can send updates to senders that are currently inactive. Inactive senders will receive a notification about the overload and can refrain from sending traffic to this neighbor until the overload condition is resolved. The receiver can also notify all potential senders once they are permitted to send traffic again. However, these notifications do generate additional traffic, which adds to the overall load.

A SIP entity needs to set up and maintain overload control subscriptions with all upstream and downstream neighbors. A new subscription needs to be set up before/while a request is transmitted to a new downstream neighbor. Servers can be configured to subscribe at boot time. However, this would require additional protection to avoid the avalanche restart problem for overload control. Subscriptions need to be terminated when they are not needed any more, which can be done, for example, using a timeout mechanism.

A receiver needs to send NOTIFY messages to all subscribed upstream neighbors in a timely manner when the control algorithm requires a change in the control variable (e.g., when a SIP server is in an overload condition). This includes active as well as inactive neighbors. Depending on the number of neighbors, these NOTIFYs add to the amount of traffic that needs to be processed. To ensure that these requests will not be dropped due to overload, a priority mechanism needs to be implemented in all servers these requests will pass through.

A SIP server can limit the set of senders that can receive overload control information by authenticating subscriptions to this event package.

This approach requires each proxy to implement a UAS/UAC to manage the subscriptions.

OPEN ISSUE: We need to decide about one SIP mechanism for conveying overload control information. Choosing a single transport mechanism seems beneficial for interoperability and simplicity purposes. Having two mechanisms (e.g., one for a closed network and one for SIP proxies receiving requests from many sources) might be an alternative. and provide details for the header and the event package alternative.

A new overload control mechanism needs to be backwards compatible so that it can be gradually introduced into a network and functions properly if only a fraction of the servers support it.

Hop-by-hop overload control does not require that all SIP entities in a network support it. It can be used effectively between two adjacent SIP servers if both servers support overload control and does not depend on the support from any other server or user agent. The more SIP servers in a network support hop-by-hop overload control, the better protected the network is against occurrences of overload.

In topologies such as the multiple sources (b) and mesh (c) configurations, a SIP server has multiple neighbors of which only some may support overload control. If a server simply used this overload control mechanism, only the neighbors that support it would reduce traffic. The others would keep sending at the full rate and benefit from the throttling by the servers that support overload control. In other words, upstream neighbors that do not support overload control would be better off than those that do.

A SIP server should therefore use 5xx responses towards upstream neighbors that do not support overload control. The server should reject the same amount of requests with 5xx responses that would otherwise be rejected/redirected by the upstream neighbor if it supported overload control.
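The receiving server's side of this rule can be sketched as follows, assuming loss-based overload control and assuming that the server already knows which upstream neighbors support the mechanism. The function name admit, the return values and the way support is signalled are hypothetical and not defined by this document.

```python
import random


def admit(supports_overload_control, loss_percent, rng=random):
    """Decide what the receiving server does with one incoming request.

    Returns "process" or "reject_5xx". Illustrative sketch only.
    """
    if supports_overload_control:
        # This neighbor is expected to have rejected or redirected its
        # share of requests before forwarding, so the request is processed.
        return "process"
    # The neighbor does not support overload control: apply the same loss
    # percentage locally by answering with a 5xx response.
    if rng.randint(1, 100) <= loss_percent:
        return "reject_5xx"
    return "process"
```

With a loss percentage of 30, requests from neighbors without overload control support are rejected with a 5xx response about 30% of the time, which matches the share a supporting neighbor would have withheld itself.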

Local overload control can be used in conjunction with the mechanisms defined in this specification. It provides an additional layer of protection against overload, for example, when upstream servers do not support overload control. In general, servers should start using the mechanisms described here to throttle upstream neighbors before using local overload control to reject messages as a mechanism of last resort.

An element may receive overload control feedback indicating that it needs to reduce the traffic it sends to its downstream neighbor. An element can accomplish this task by sending some of the requests that would have gone to the overloaded element to a different destination. It needs to ensure, however, that this destination is not in overload and is capable of processing the extra load. An element can also buffer requests in the hope that the overload condition will resolve quickly and the requests can still be forwarded in time. Finally, it can reject these requests.

Overload control can require a SIP server to prioritize messages and select messages that need to be rejected or redirected. The selection is largely a matter of local policy. A SIP server SHOULD honor the Resource-Priority header field as defined in RFC4412 if it is present in a SIP request. The Resource-Priority header field enables a proxy to identify high-priority requests, such as emergency service requests, and preserve them as much as possible during times of overload.
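As a minimal sketch of such a local policy, the following Python fragment picks requests to reject or redirect while leaving protected requests untouched. The request representation (a dict with a "headers" dict), the function names, and the set of protected Resource-Priority values are assumptions made for illustration; the text above leaves the actual selection to local policy.

```python
def is_protected(headers, protected_values):
    """Return True if the request carries a Resource-Priority value
    that local policy (protected_values, hypothetical) wants to preserve."""
    raw = headers.get("Resource-Priority", "")
    values = {v.strip().lower() for v in raw.split(",") if v.strip()}
    return not values.isdisjoint(protected_values)


def select_for_rejection(requests, n_to_reject, protected_values):
    """Pick up to n_to_reject requests, never choosing protected ones."""
    candidates = [r for r in requests
                  if not is_protected(r["headers"], protected_values)]
    return candidates[:n_to_reject]
```

In this sketch, a server that must shed two of three pending requests would pass over one marked with a protected value and shed the other two.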

Providers can set up boundaries in their networks, which enforce topology hiding, header filtering and other functions. These boundaries are often realized as proxies, back-to-back user agents (B2BUA), or session border controllers. These devices may have policies for disclosing overload control information based on location and level of privacy desired.

It should be noted that changing overload control feedback can have a significant adverse effect on the overload control mechanism. For example, the policy in a border device might be to remove overload control feedback until the feedback reaches a certain threshold. However, this intervention in the overload control feedback loop can cause an overload control algorithm to overreact, since the algorithm would not see any effects of the feedback generated. Once the feedback passes through the filter, it would likely reduce traffic too much, causing the control algorithm to steer in the opposite direction again. For this reason, it is NOT RECOMMENDED that a border device change or partially remove overload control feedback. A SIP service provider may choose to remove all overload control information sent to the upstream external proxy. This is NOT RECOMMENDED as it will disable protection against overload.

This section defines new parameters for the SIP Via header for overload control. These parameters provide a SIP mechanism for conveying overload control information between SIP entities.

A SIP server that supports this specification MUST add an "oc_accept" parameter to the Via headers it inserts into SIP requests. This provides an indication to downstream neighbors that this server supports overload control. OPEN ISSUE: To throttle upstream neighbors in a fair way, it is important that a SIP server can estimate the load each upstream neighbor receives for this server before it is throttled. This enables the server to throttle each upstream neighbor in the same way and thus provides each request the same chance of succeeding. In rate- and window-based overload control systems, a SIP server does not know how many messages each upstream neighbor had received for the server before throttling took place. A solution to this problem is to allow servers to report the load received for a downstream neighbor in the 'oc_accept' parameter.

A SIP server can provide overload control feedback to its upstream neighbors by adding the 'oc' parameter to the topmost Via header field of a SIP response. The 'oc' parameter is a new Via header parameter defined in this specification. When an 'oc' parameter is added to a response, it MUST be inserted into the topmost Via header. It MUST NOT be added to any other Via header in the response. The topmost Via header is determined after the SIP server has removed its own Via header. It is the Via header that was generated by the next upstream neighbor. Since the topmost Via header of a response will be removed by an upstream neighbor after processing it, overload control feedback contained in the 'oc' parameter will not travel beyond the next SIP server. A Via header parameter therefore provides hop-by-hop semantics for overload control feedback even if the next hop neighbor does not support this specification. A SIP server SHOULD add an 'oc' parameter to those responses that contain an 'oc_accept' parameter in the topmost Via header. In this case, the SIP server MUST remove the 'oc_accept' parameter from the Via header and replace it with an 'oc' parameter. The 'oc' parameter can be used in all response types, including provisional, success and failure responses. A SIP server MAY generally add the 'oc' parameter to all responses it is sending. A SIP server MUST add an 'oc' parameter to responses when the transmission of overload control feedback is required by the overload control algorithm to limit the traffic received by the server. I.e., a SIP server MUST insert the 'oc' parameter when the overload control algorithm sets the 'oc' parameter to a value different from the default value. A SIP server that has added an 'oc' parameter to a Via header SHOULD also add an 'oc_validity' parameter to the same Via header. The 'oc_validity' parameter defines the time in milliseconds during which the content (i.e., the overload control feedback) of the 'oc' parameter is valid.
The default value of the 'oc_validity' parameter is 500. A SIP server SHOULD use a shorter 'oc_validity' time if its overload status varies quickly and MAY use a longer 'oc_validity' time if this status is more stable. If the 'oc_validity' parameter is not present, its default value is used. The 'oc_validity' parameter MUST NOT be used in a Via header without an 'oc' parameter and MUST be ignored if it appears in a Via header without an 'oc' parameter. A SIP server MAY forward the content of an 'oc' parameter it has received from a downstream neighbor on to its upstream neighbor. However, forwarding the content of the 'oc' parameter is generally NOT RECOMMENDED and should only be performed if permitted by the configuration of SIP servers. For example, a SIP server that only relays messages between exactly two SIP servers could forward an 'oc' parameter. The 'oc' parameter is forwarded by copying it from the Via in which it was received into the next Via header (i.e., the Via header that will be on top after processing the response). If an 'oc_validity' parameter is present, it MUST be copied along with the 'oc' parameter. The 'oc' and 'oc_validity' Via header parameters are only defined in SIP responses and MUST NOT be used in SIP requests. These parameters are only useful to the upstream neighbor of a SIP server (i.e., the entity that is sending requests to the SIP server) since this is the entity that can offload traffic by redirecting/rejecting new requests. If requests are forwarded in both directions between two SIP servers (i.e., the roles of upstream/downstream neighbors change), there are also responses flowing in both directions. Thus, both SIP servers can exchange overload information. While adding 'oc' and 'oc_validity' parameters to requests may increase the frequency with which overload information is exchanged in these scenarios, this increase will rarely provide benefits and does not justify the added overhead and complexity needed.
A SIP server MAY decide to add 'oc' and 'oc_validity' parameters only to responses that are sent via a secured transport channel such as TLS. The SIP server can use transport level authentication to identify the SIP servers, to which responses with these parameters are sent. This enables a SIP server to protect overload control information and ensure that it is only visible to trusted parties. Since overload control protects a SIP server from overload, it is RECOMMENDED that a SIP server generally inserts 'oc' and 'oc_validity' parameters into responses to all SIP servers.
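The replacement of 'oc_accept' by 'oc' and 'oc_validity' in the topmost Via header of a response can be sketched as follows. This is only an illustration under stated assumptions: the Via parameters are modeled as a plain dict, the function name add_feedback is hypothetical, and the sketch chooses to add feedback only when the upstream neighbor advertised 'oc_accept' (the text above also permits adding 'oc' to all responses).

```python
DEFAULT_VALIDITY_MS = 500  # default 'oc_validity' given in the text above


def add_feedback(top_via_params, oc, validity_ms=DEFAULT_VALIDITY_MS):
    """Return a copy of the topmost Via parameters carrying overload feedback.

    top_via_params: dict of parameter name -> value (None for flag parameters).
    oc: percentage (0..100) by which the upstream neighbor should reduce load.
    """
    if not 0 <= oc <= 100:
        raise ValueError("'oc' must be between 0 and 100")
    if "oc_accept" not in top_via_params:
        # Upstream neighbor did not signal support; leave the Via untouched.
        return dict(top_via_params)
    params = {k: v for k, v in top_via_params.items() if k != "oc_accept"}
    params["oc"] = str(oc)
    params["oc_validity"] = str(validity_ms)
    return params
```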

The value of the 'oc' parameter is determined by an overload control algorithm (see ). This specification does not mandate the use of a specific overload control algorithm. However, the output of an overload control algorithm MUST be compliant to the semantics of this header. The 'oc' parameter value specifies the percentage by which the load forwarded to this SIP server should be reduced. Possible values range from 0 (the traffic forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the traffic forwarded is reduced by 100%, i.e., no traffic forwarded). The default value of this parameter is 0. The 'oc' parameter value is determined by the overload control algorithm of the SIP server generating the 'oc' parameter. OPEN ISSUE: the semantics of the 'oc' parameter depends on the overload control method used. It may contain a loss rate for loss-based overload control, a target rate for rate-based overload control or message confirmations and window-size for window-based overload control. It might be possible to allow multiple mechanisms to co-exist (e.g., by defining different parameters for the different feedback types). However, for interoperability purposes it seems preferable to agree on one mechanism.

A SIP entity compliant to this specification SHOULD remove 'oc' and 'oc_validity' parameters in all Via headers of a response received, except for the topmost Via header. This prevents 'oc'/'oc_validity' parameters that were accidentally or maliciously inserted into Via headers by a downstream SIP server from traveling upstream. A SIP server maintains the 'oc' parameter values received along with the address of the SIP servers from which they were received for the duration specified in the 'oc_validity' parameter or the default duration. Each time a SIP server receives a response with an 'oc' parameter from a SIP server, it overwrites the 'oc' value it has currently stored for this server with the new value received. The SIP server restarts the validity period of an 'oc' parameter each time a response with an 'oc' parameter is received from this server. A stored 'oc' parameter value MUST be discarded once it has reached the end of its validity.
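A minimal sketch of the receiver-side bookkeeping described above is given below. The class and function names are hypothetical, Via headers are modeled as a list of parameter dicts with the topmost at index 0, and the clock is injectable only so that the behavior can be exercised without waiting.

```python
import time

DEFAULT_VALIDITY_MS = 500  # default validity period from this specification


def strip_lower_vias(vias):
    """Drop 'oc'/'oc_validity' from every Via except the topmost (index 0)."""
    cleaned = [dict(vias[0])] if vias else []
    for params in vias[1:]:
        cleaned.append({k: v for k, v in params.items()
                        if k not in ("oc", "oc_validity")})
    return cleaned


class OcStore:
    """Keeps the latest 'oc' value per downstream server until it expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # server address -> (oc value, expiry in seconds)

    def update(self, server, oc, validity_ms=DEFAULT_VALIDITY_MS):
        # A new value overwrites the old one and restarts the validity period.
        self._entries[server] = (oc, self._clock() + validity_ms / 1000.0)

    def current(self, server):
        """Return the stored 'oc' value, or the default 0 if none is valid."""
        entry = self._entries.get(server)
        if entry is None:
            return 0
        oc, expires = entry
        if self._clock() >= expires:
            del self._entries[server]  # discard at the end of validity
            return 0
        return oc
```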

A SIP server compliant to this specification MUST honor 'oc' parameter values it receives from downstream neighbors. The SIP server MUST NOT forward more messages to a SIP server than allowed by the current 'oc' parameter value from this server. When forwarding a SIP request, a SIP entity uses the SIP procedures to determine the next hop SIP server as, e.g., described in and . After selecting the next hop server, the SIP server MUST determine if it has an 'oc' parameter value for this server. If it has a non-expired 'oc' parameter value and this value is non-zero, the SIP server MUST determine whether it can forward the current request within the current throttle conditions. The SIP server MAY use the following algorithm to determine if it can forward the request. The SIP server draws a random number between 1 and 100 for the current request. If the random number is less than or equal to the 'oc' parameter value, the request is not forwarded. Otherwise, the request is forwarded as usual. Another algorithm for SIP entities that process a large number of requests is to reject/redirect the first X of every 100 requests processed. Other algorithms that lead to the same result may be used as well. OPEN ISSUE: the mechanisms to throttle traffic depend on the type of feedback conveyed in the 'oc' parameter value. They need to be adjusted if rate-based or window-based feedback is used. The treatment of SIP requests that cannot be forwarded to the selected SIP server is a matter of local policy. A SIP entity MAY try to find an alternative target or it MAY reject the request (see ).
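The two throttling algorithms mentioned above can be sketched as follows, assuming loss-based feedback (an 'oc' value between 0 and 100). The function and class names are hypothetical and the sketch is not the only way to obtain the same result.

```python
import random


def should_forward_random(oc, rng=random):
    """Random variant: draw a number in 1..100 and forward only if the
    draw is greater than the 'oc' value."""
    return rng.randint(1, 100) > oc


class CountingThrottle:
    """Deterministic variant: do not forward the first X (= oc) requests
    of every block of 100 requests processed."""

    def __init__(self):
        self._position = 0  # index of the current request within the block

    def should_forward(self, oc):
        forward = self._position >= oc
        self._position = (self._position + 1) % 100
        return forward
```

With an 'oc' value of 25, the counting variant forwards 75 of every 100 requests, while the random variant forwards 75% only on average.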

A SIP server that rejects a request because of overload MUST reject this request with the 5xx response code defined for overload control (e.g., 503 (Service Unavailable) or 507 (Server Overload) ). This response code indicates that the request did not succeed because the SIP servers processing the request are under overload. A SIP server that is under overload and has started to throttle incoming traffic SHOULD use a 5xx response to reject a fraction of requests from upstream neighbors that do not include the 'oc_accept' parameter in their Via headers. These neighbors do not support this specification and will not respond to overload control feedback in the 'oc' parameter. The fraction of requests rejected SHOULD be equivalent to the fraction of requests the upstream server would reject/redirect if it did support this specification. This is to ensure that SIP servers that do not support this specification do not receive an unfair advantage over those that do. A SIP server that has reached overload (i.e., a load close to 100) SHOULD start using 5xx responses in addition to using the 'oc' parameter for all upstream neighbors. If the proxy has reached a load close to 100, it needs to protect itself against overload. Also, it is likely that upstream proxies have ignored overload feedback and do not support this specification.
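The fairness rule for neighbors that do not advertise 'oc_accept' might be sketched as below. The function name is hypothetical, the topmost Via parameters of the incoming request are modeled as a dict, and the choice of the concrete 5xx code is deliberately left out since the text above has not settled it.

```python
import random


def admit_request(top_via_params, oc, rng=random):
    """Return True to process the request, False to reject it with a 5xx.

    Neighbors that sent 'oc_accept' are expected to throttle themselves in
    response to 'oc' feedback. For all others, the server rejects locally
    the same fraction (oc percent) that a supporting neighbor would have
    withheld.
    """
    if "oc_accept" in top_via_params:
        return True
    return rng.randint(1, 100) > oc
```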

In some cases, a SIP server may not receive a response from a downstream neighbor when sending a request. RFC3261 defines that when a timeout error is received from the transaction layer, it MUST be treated as if a 408 (Request Timeout) status code has been received. If a fatal transport error is reported by the transport layer, it MUST be treated as a 503 (Service Unavailable) status code. In these cases, a SIP server SHOULD stop sending requests to this server. The SIP server SHOULD occasionally forward a single request to probe if the downstream neighbor is alive. Once a SIP server has successfully transmitted a request to the downstream neighbor, it can resume normal transmission of requests. It should, of course, honor any 'oc' parameters it may receive. This prevents a SIP server that is unable to respond to incoming requests from being overloaded with additional requests. OPEN ISSUE: waiting for a timeout to occur seems a long time to wait before starting to throttle back. It could make sense to throttle back earlier if no response is received for requests transmitted.
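The stop-and-probe behavior can be sketched with a small per-neighbor state holder. The class name, method names and the probe interval are assumptions; the text above only says that a single request is forwarded "occasionally".

```python
import time


class NeighborHealth:
    """Tracks whether requests may be sent to one downstream neighbor."""

    def __init__(self, probe_interval_s=5.0, clock=time.monotonic):
        self._probe_interval_s = probe_interval_s  # hypothetical value
        self._clock = clock
        self._down = False
        self._next_probe = 0.0

    def on_failure(self):
        """Call on a transaction timeout (treated as 408) or a fatal
        transport error (treated as 503)."""
        self._down = True
        self._next_probe = self._clock() + self._probe_interval_s

    def on_success(self):
        """Call once a request was answered; normal sending resumes."""
        self._down = False

    def may_send(self):
        """Return True if the next request may be sent to the neighbor."""
        if not self._down:
            return True
        now = self._clock()
        if now >= self._next_probe:
            # Allow exactly one probe, then wait for the next interval.
            self._next_probe = now + self._probe_interval_s
            return True
        return False
```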

This section defines the syntax of three new Via header parameters: 'oc', 'oc_validity' and 'oc_accept'. These Via header parameters are used to implement an overload control feedback loop between neighboring SIP servers. The 'oc' and 'oc_validity' parameters are only defined in the topmost Via header of a response. They MUST NOT be used in the Via headers of requests and MUST NOT be used in other Via headers of a response. The 'oc' and 'oc_validity' parameters MUST be ignored if received outside of the topmost Via header of a response. The 'oc_accept' parameter MAY appear in all Via headers. The 'oc' Via header parameter contains a number between 0 and 100. It describes the percentage by which the traffic to the SIP server from which the response has been received should be reduced. The default value for this parameter is 0. The 'oc_validity' Via header parameter contains the time during which the corresponding 'oc' Via header parameter is valid. The 'oc_validity' parameter can only be present in a Via header in conjunction with an 'oc' parameter. The 'oc_accept' Via header parameter indicates that the SIP server, which has created this Via header, supports overload control.

    oc-throttle = "oc" [EQUAL 0-100]
    oc-validity = "oc_validity" [EQUAL delta-ms]
    oc-accept   = "oc_accept"

This extends the existing definition of the Via header field parameters, so that its BNF now looks like:

Example:
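As an illustration only, the following sketch extracts the three parameters from a Via header value and applies the defaults stated above. The function name, the host name and the branch value are hypothetical, and the parsing is simplified: it splits on ";" and does not handle quoted parameter values.

```python
def parse_oc_params(via_value):
    """Extract 'oc', 'oc_validity' and 'oc_accept' from a Via header value."""
    result = {"oc": None, "oc_validity": None, "oc_accept": False}
    for param in via_value.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        name = name.strip().lower()
        value = value.strip()
        if name == "oc":
            oc = int(value) if value else 0  # value is optional, default 0
            if not 0 <= oc <= 100:
                raise ValueError("'oc' must be between 0 and 100")
            result["oc"] = oc
        elif name == "oc_validity":
            result["oc_validity"] = int(value) if value else 500
        elif name == "oc_accept":
            result["oc_accept"] = True
    if result["oc"] is None:
        result["oc_validity"] = None  # ignored without an 'oc' parameter
    elif result["oc_validity"] is None:
        result["oc_validity"] = 500   # default validity in milliseconds
    return result


# Hypothetical topmost Via header value of a response:
via = "SIP/2.0/TLS p1.example.com;branch=z9hG4bK776asdhds;oc=20;oc_validity=500"
feedback = parse_oc_params(via)
```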

This section defines a new SIP event package for overload control. This event package provides a SIP mechanism for conveying overload control information between SIP entities. The following sections provide the details for defining a SIP event package as required by RFC 3265.

The name of this event package is "overload-control". This package name is carried in the Event and Allow-Events header fields, as defined in RFC 3265.

No package specific Event header field parameters are defined for this event package.

A SUBSCRIBE request for overload control information MAY contain a body. This body would serve the purpose of filtering the overload control subscription. The definition of such a body is outside the scope of this specification. For example, the body might provide a threshold for reporting overload control information or it might indicate that overload control information should be reported as a loss-percentage or a request rate. A SUBSCRIBE request for the overload control package MAY be sent without a body. This implies that the default subscription filtering policy as described in has been requested.

A subscription to the overload control event package is usually established when a SIP server first sends a request to another SIP server and terminated when this server stops sending requests and overload control is not needed any more. The duration of a subscription is related to the time a signaling relationship exists between two servers. In a static SIP server configuration (e.g., two SIP servers are configured to exchange messages in a service provider's network) this relationship can last for days or weeks as long as both servers are running. In this scenario, the subscription duration is largely irrelevant. In a dynamic configuration (e.g., two SIP servers in different domains) the duration of the signaling relationship can be in the range of minutes or hours and might only last for the duration of a single session. Since it is unknown a priori when the next SIP request will be transmitted from the subscriber to the notifier, subscriber and notifier MAY terminate a subscription to overload control after a period of inactivity. The duration of a subscription to the overload control event package SHOULD be longer than the duration of a typical session. The default subscription duration for this event package is set to two hours.

In this event package, the body of a notification contains the current overload status of the notifier. All subscribers and notifiers MUST support the format application/overload-info+xml. The SUBSCRIBE request MAY contain an Accept header field. If no such header field is present, it has a default value of application/overload-info+xml. If the header field is present, it MUST include application/overload-info+xml, and MAY include any other MIME type capable of representing overload status information. As defined in RFC 3265, the body of notifications MUST be in one of the formats defined in the Accept header of the SUBSCRIBE request or in the default format. TBD: A document format for the above placeholder application/overload-info+xml needs to be defined. The following document snippet is an example for such a format:

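[Example overload-info document: the XML markup did not survive; the only remaining content is the value 200.]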

The subscriber follows the general rules for generating SUBSCRIBE requests defined in RFC 3265.

It is RECOMMENDED that a notifier provides overload control status information to all subscribers and that the notifier accepts all subscriptions to this event package. By denying a subscription to overload control, a notifier would disable overload control to this subscriber. Since this subscriber would not know the current overload status of the notifier, it would not reduce the traffic forwarded when the notifier enters an overload condition. Thus, denying a subscription to this event package can leave the notifier vulnerable to SIP overload. A notifier MAY authenticate and authorize subscriptions to this event package. This is useful if the notifier wants to provide extended overload status information to certain subscribers. For example, a notifier can provide detailed resource usage information to authenticated subscribers and only provide the current throttle status to all other subscribers. The details of the authorization policy are at the discretion of the administrator.

A notifier sends a notification in response to SUBSCRIBE requests as defined in RFC 3265. In addition, a notifier MAY send a notification at any time during the subscription. Typically, the notifier will send a notification every time the overload control status has changed. For example, the notifier can send a NOTIFY every time the overload control value (e.g., the rate limit) changes. Overload status information is expressed in the format negotiated for the NOTIFY body (e.g., "application/overload-info+xml"). The overload status in a NOTIFY body MUST be complete. Notifications that contain the deltas to previous overload status or a partial overload status are not supported in this event package. It is RECOMMENDED that the notifier returns an initial NOTIFY that contains at least the current overload control value immediately after receiving a SUBSCRIBE request. It is RECOMMENDED that the notifier returns such an initial NOTIFY even if the notifier is still waiting for an authorization decision. Once the subscription is authorized, the notifier MAY send another notification that then contains all information the subscriber is authorized to receive. It is RECOMMENDED that the notifier accepts a subscription and creates a NOTIFY with at least the current overload control value even if the subscriber is not authorized to receive more information. The timely delivery of overload control notifications is important for overload control. It is therefore RECOMMENDED that NOTIFY messages for this event package are sent with highest priority. I.e., the transmission of NOTIFY messages for this event package ought not to be delayed by other tasks.

A subscriber MUST use the overload control state contained in a NOTIFY body and apply this state to all subsequent SIP messages it is intending to send to the respective SIP server. The subscriber MUST NOT forward a higher number of SIP messages to the server than allowed by the current overload control state. Details of how to apply overload control are discussed elsewhere in this specification. A subscriber MUST use the overload state it has received for a SIP server until the subscriber receives another NOTIFY with an updated state or until the subscription is terminated. The subscriber SHOULD stop using the reported overload state once the subscription is terminated. It is RECOMMENDED that the subscriber processes incoming NOTIFY messages for this event package with highest priority. I.e., NOTIFY messages for this event package ought to be processed before other messages are processed. This is to ensure that a subscriber can react quickly to changes in the overload control status even if the subscriber is currently receiving a high volume of messages.

This event package allows the creation of only one dialog as a result of an initial SUBSCRIBE request. The techniques to achieve this behavior are described in .

Keeping the rate of notifications low is important for an overload control mechanism to avoid creating additional traffic in an overload condition. However, it is also important that an overload control algorithm can quickly adjust the overload control value as needed. Ideally, the overload control algorithm would generate a stable control value that rarely needs to be adjusted. The notifier SHOULD NOT generate NOTIFY messages at a rate faster than once every second for notifications that are triggered by a change in the control value. The notifier SHOULD NOT generate NOTIFY messages at a rate faster than once every 5 seconds for all other notifications (i.e., for any additional information included in the subscription).
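The two notification rate limits can be sketched as a simple per-subscription limiter. The class name and the labels "control" and "other" are hypothetical, and keeping an independent timer for each kind of notification is one possible reading of the limits above, not something the text mandates.

```python
import time

# Minimum spacing in seconds between NOTIFY messages of each kind.
MIN_INTERVAL_S = {"control": 1.0, "other": 5.0}


class NotifyLimiter:
    """Decides whether a NOTIFY may be sent now for one subscription."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_sent = {}  # kind -> time the last NOTIFY was sent

    def may_notify(self, kind):
        """kind is "control" for control value changes, "other" otherwise."""
        now = self._clock()
        last = self._last_sent.get(kind)
        if last is not None and now - last < MIN_INTERVAL_S[kind]:
            return False
        self._last_sent[kind] = now
        return True
```

A notifier that is refused by the limiter would typically hold the latest state and send it once the interval has passed, since each NOTIFY carries the complete overload status.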

State agents play no role in this package.

The following message flow illustrates how proxy A can subscribe to overload control status of proxy B. The flow assumes that proxy A does not have an active subscription to the overload control status of proxy B and has received an INVITE request it needs to forward to B.

      A                   B
      |(1) SUBSCRIBE      |
      |------------------>|
      |(2) 200 OK         |
      |<------------------|
      |(3) NOTIFY         |
      |<------------------|
      |(4) 200 OK         |
      |------------------>|
      |(5) INVITE         |
      |------------------>|
      |(6) 200 OK         |
      |<------------------|
      |(7) ACK            |
      |------------------>|
      |                   |

   Message Details TBD.

Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded. When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim. Thus, the victim is subject to a denial-of-service attack. An attacker can create forged overload feedback by inserting itself into the communication between the victim and its upstream neighbors. The attacker would need to add overload feedback indicating a high load to the responses passed from the victim to its upstream neighbor. Proxies can prevent this attack by communicating via TLS. Since overload feedback has no meaning beyond the next hop, there is no need to secure the communication over multiple hops. Another way to conduct an attack is to send a message containing a high overload feedback value through a proxy that does not support this extension. If this feedback is added to the second Via header (or all Via headers), it will reach the next upstream proxy. If the attacker can make the recipient believe that the overload status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim. A precondition for this attack is that the victim proxy does not support this extension, since it would not pass through overload control feedback otherwise. A malicious SIP entity could gain an advantage by pretending to support this specification but never reducing the amount of traffic it forwards to the downstream neighbor. If its downstream neighbor receives traffic from multiple sources which correctly implement overload control, the malicious SIP entity would benefit since all other sources to its downstream neighbor would reduce load. OPEN ISSUE: the solution to this problem depends on the overload control algorithm.
For a fixed message rate and window-based overload control, it is very easy for a downstream entity to monitor if the upstream neighbor throttles traffic forwarded as directed. For percentage throttling this is not always obvious since the load forwarded depends on the load received by the upstream neighbor.

[TBD.]

Many thanks to Rich Terpstra, Jonathan Rosenberg and Charles Shen for their contributions to this specification.

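[RFC2119] Key words for use in RFCs to Indicate Requirement Levels, RFC 2119.
[RFC3261] SIP: Session Initiation Protocol, RFC 3261.
[RFC3263] Session Initiation Protocol (SIP): Locating SIP Servers, RFC 3263.
[RFC3265] Session Initiation Protocol (SIP)-Specific Event Notification, RFC 3265.
[RFC4412] Communications Resource Priority for the Session Initiation Protocol (SIP), RFC 4412.
Essential Correction to the Session Initiation Protocol (SIP) 503 (Service Unavailable) Response, Bell Labs/Alcatel-Lucent.
[I-D.rosenberg-sipping-overload-reqs]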