Skip to main content


Weitere Artikel dieser Ausgabe durch Wischen aufrufen

01.12.2009 | Research Article | Ausgabe 1/2009 Open Access

EURASIP Journal on Wireless Communications and Networking 1/2009

Cross-Layer Explicit Link Status Notification to Improve TCP Performance in Wireless Networks

EURASIP Journal on Wireless Communications and Networking > Ausgabe 1/2009
Ji-Hoon Yun

1. Introduction

In recent years, wireless networks become more common in real life and many services such as e-mail, web browsing, e-commerce, and so fourth are available in the current wireless networks. In the near future, wireless networks will provide more services that are provided in the wired Internet, and will gradually carry the traffic similar to that of the wired networks. In the current wired networks, the majority of traffic relies on the Transmission Control Protocol (TCP) [ 1], which implies that the majority of traffic in the future wireless networks may also use TCP for the same services.
TCP is the prevailing transport layer protocol that provides the connection-oriented service and reliable transmission by acknowledgment (ACK). It also has a delicate congestion control mechanism [ 2] to deal with network congestion in a distributed manner. However, it is reported that TCP suffers from severe performance degradation in wireless networks [ 3].
TCP was originally designed for the use in the wired networks comprising reliable wired links and stationary hosts. It assumes that packet losses and unusual delays result from congestion in the network. Thus a TCP sender reduces its congestion window size to alleviate the congestion when it identifies the loss of a packet either by the arrival of several duplicate ACKs or the absence of an ACK for the packet within a timeout interval. However, wireless links can also suffer from packet losses due to high Bit Error Rate (BER), collisions in random access channels and handoffs in mobile networks. We refer to such packet losses as wireless losses to differentiate them from congestion losses. Although a packet is lost in the wireless link, not by congestion, the TCP sender reduces its congestion window size because the packet loss assumes that it is due to congestion in the network. Such TCP behavior results in significant throughput degradation and high interactive delay.
Recently, many schemes have been proposed to improve the end-to-end throughput of TCP in wireless networks. In this paper, we refer to these sort of schemes to improve the TCP performance against wireless losses as wireless TCP as a whole. Wireless TCP can be classified into three basic categories [ 4]: end-to-end schemes, split-connection schemes, and link-layer schemes. The end-to-end schemes mostly use three techniques, that is, selective acknowledgments [ 5], Explicit Loss Notification (ELN) [ 610], and Bandwidth Estimation (BWE) [ 1115]. (We do not consider the split-connection schemes in this paper because they violate the end-to-end semantics of a transport layer protocol considerably and put much overhead on the base station.)
In ELN schemes, the TCP sender is notified of wireless losses to preserve its congestion window. Although an ELN scheme is used, the TCP sender may reduce its congestion window even due to a wireless loss if the corresponding loss notification arrives too lately or if the detection of the wireless loss fails. Thus, how accurately and rapidly wireless losses can be notified is the main concern of ELN schemes. There have been various proposals which attempt to achieve this goal. In Balakrishnan et al .'s scheme [ 6], the TCP sequence numbers are cached at a Base Station (BS), and if the packet whose sequence number is cached is lost, the ELN bit of the ACK packet is set active by BS. In TCP HACK [ 7], the TCP sender puts extra header checksum bits in the options field of the TCP header. If the TCP receiver receives a data segment of which the data portion is corrupted while the header is intact, the receiver considers it to be a wireless loss and sends an ACK with the reserved bit for ELN set. EWLN [ 8] and TCP-ELN [ 9] are based on the similar idea. LHP [ 10] tries to improve the detection probability by fragmentation. In LHP, each data frame is fragmented before being transmitted over the wireless link and the first fragment which contains the TCP/IP header is sent repeatedly prior to the rest of fragments until it is successfully transmitted or all allocated link time slots for the frame are exhausted. If the receiver does not receive the whole frame except the first fragment, the receiver regards it as a wireless loss and sends an ACK packet with the ELN bit set.
However, the previous ELN schemes have long detection delay and nonzero detection failure probability. Balakrishnan et al .'s scheme requires an exchange of additional data and ACK packets over the wireless link for the detection of a wireless loss, which adds additional transmission delay to the detection delay. The delay gets even longer in poor channel condition due to frequent retransmissions. The other schemes require that a certain part of a data packet should be transmitted successfully in order to detect its wireless loss. However, if the wireless channel condition is poor, there is the possibility that such a part is also corrupted. Also, handoff or collision may result in the loss of the whole packet. In such cases, the TCP receiver will not detect the wireless loss correctly. Even though the receiver detects the wireless loss correctly, the ELN ACK should be transmitted over the wireless link. This takes an additional delay until loss notification and increases the possibility of notification failure further.
Meanwhile, a TCP sender can invoke congestion control due to the acknowledgment timeout of a packet while BS is still attempting to transmit the packet in the wireless link. This is called spurious timeout and more likely to occur if BS has slow loss recovery mechanism (e.g., 80 100 milliseconds for loss awareness in UMTS Rel'99) [ 16]. The above schemes cannot prevent spurious timeout because their notification process gets started after BS finishes the whole transmission procedure for a packet.
In this paper, a new ELN scheme, called (Link Layer originated Explicit Link Status Notification) LL-ELSN, is proposed. LL-ELSN does not require additional transmission over a wireless link for loss notification and thus enables accurate detection with minimum delay. This is achieved by exploiting cross-layer design. The key idea of LL-ELSN is to send ELN when a data frame is discarded at the wireless link layer due to excessive retry. Along with ELN, the proposed scheme sends a new information, that is, Explicit Retransmission Start Notification (ERSN), which notifies the TCP sender of the start of the retransmission process at the wireless link layer, in order to prevent spurious timeouts at the TCP sender. The author also proposes an SACK-based TCP modification, that is, ELN-capable SACK (ESACK), which can handle ELN messages and packet reordering due to the immediate retransmission of a lost packet indicated by ELN. ESACK is designed to avoid consecutive invocation of congestion control when multiple packets of a congestion window are dropped due to congestion. ESACK also mitigates bandwidth waste due to continuous ELN generation in a disconnection by appropriately filtering received ELNs according to their causes.
To investigate the performance improvement of the proposed scheme, we perform simulation using ns-2 [ 17] and compare various wireless TCP schemes. Through simulation, it is shown that the combination of the proposed ELN scheme and ESACK achieves goodput performance of more than 400% compared to TCP-Reno and 150% to Snoop [ 18]. Consequently, we demonstrate that the proposed scheme, when combined with the Link Layer Automatic Repeat Request (LL-ARQ) using a small retry limit, achieves consistently good performance against both uniform and bursty channel errors. Moreover, considering that many wireless TCP proposals compare their schemes only with TCP Reno or a few other schemes, our simulation results provide a comprehensive performance comparison of the various schemes on a single simulation platform.
We only consider last-hop wireless networks, that is, only the link between BS and user equipments is wireless, such as cellular networks and infrastructure Wireless Local Area Networks (WLANs).
The rest of this paper is organized as follows. Section 2 briefly describes recent wireless TCP schemes and analyzes their pros and cons. In Section 3, we describe the implementation details of LL-ELSN and compare it with other ELN schemes. Section 4 explains ESACK, and Section 5 presents the simulation results and their analysis. We conclude this paper in Section 6.

2. Related Work

In this section, we review the wireless TCP schemes that have been proposed recently. The wireless TCP schemes considered in this paper are ELN and BWE schemes as well as SACK [ 5], Snoop [ 18], and TCP-DCR [ 19]. In the following, we briefly describe each of them and analyze their pros and cons.

2.1. Explicit Loss Notification Schemes

ELN schemes attempt to detect wireless losses and notify the TCP sender of their occurrences by marking the reserved bit of TCP ACK or sending a newly defined message. Accordingly, how to accurately detect wireless losses and how to rapidly notify them are important design factors in ELN schemes.
In TCP HACK [ 7], the TCP sender puts extra header checksum bits in the options field of the TCP header. If the TCP receiver receives a data segment of which the data portion is corrupted while the header is okay, the receiver considers it as a wireless loss and sends an ACK with the reserved bit for ELN set to one. HACK assumes that SACK is used basically and the sender, upon reception of the ACK with the ELN bit set, immediately retransmits the lost packet without triggering congestion control. However, in HACK, if the wireless channel condition is poor, there is the possibility that the header is also corrupted. Also, handoff or channel contention may result in the loss of the whole packet. In such cases, the TCP receiver cannot judge the wireless link error correctly. So, the TCP sender of HACK can reduce its congestion window even due to a wireless loss. Even though the receiver detects the wireless loss correctly, the ELN ACK should be transmitted over the wireless link. This takes an additional delay until loss notification and increases the possibility of notification failure further.
LHP [ 10] tries to improve the detection probability by fragmentation. In LHP, each data frame is fragmented before being transmitted over the wireless link and the first fragment that contains the TCP/IP header is sent repeatedly prior to the rest of the fragments until it is successfully transmitted or all allocated link time slots for the frame are exhausted. When the receiver does not receive the whole frame except the first fragment, the receiver regards it as a wireless loss and sends an ACK packet with the ELN bit set. By fragmentation, LHP can increase the transmission probability of the TCP/IP header portion, which after all improves the detection probability. LHP also utilizes selective acknowledgement and the packet list which manages congestion control by keeping the unacknowledged packets in the sending order. However, the first fragment has to be transmitted successfully, which increases detection delay. Moreover, the first fragment can still be lost. Also, ACK packets that notify the wireless loss should go through wireless medium and this increases the detection failure probability and the notification delay further. The packet list works poorly with detection failures because the sender retransmits packets redundantly and invokes congestion control if the notification fails. If SMART [ 20] is used for LHP, whenever an ACK is lost, congestion control is invoked because the SMART ACK only contains the information of the single packet which has been successfully received and has triggered the ACK. In addition, fragmentation increases header overhead considerably, especially in 802.11 [ 21].

2.2. Bandwidth Estimation Schemes

BWE schemes keep on estimating the available bandwidth to monitor the network state. If a packet loss occurs in a noncongestive state based on the estimated bandwidth, the TCP sender does not invoke congestion control or reduces a congestion window conservatively. However, as BWE schemes infer the network state from indirect information, they have high probability of misjudgement.
TCP Westwood [ 11] estimates the available bandwidth from a low-pass filtered ACK reception rate. Its slow start and congestion avoidance phases are the same as those of TCP Reno. If duplicate ACKs are received ( is typically 3), the sender sets the and as the estimated bandwidth. When a timeout occurs, the sender sets the as the estimated bandwidth and the as one. In this way, it can maintain a proper sending rate even with wireless losses. However, the low pass filter incurs additional processing overhead due to its complexity. The filter also has the parameters whose optimal configuration may be different network by network. If ACKs are lost, the sender reduces its estimated bandwidth. For a timeout, it reduces the to one. Therefore, with wireless links having poor channel condition, TCP Westwood will underestimate the available bandwidth and have degraded performance since TCP ACKs can be also lost.
TCP Veno [ 12] adopts the measurement technique of Vegas [ 22] to estimate the network state. Veno does not change much of Reno. In the congestion avoidance phase, if the estimated backlogged segments from the Vegas measurement is smaller than the threshold , it assumes that the available bandwidth is not fully utilized and increases by 1/ when each new ACK is received as in TCP Reno. If , it increases by 1/ when every other new ACK is received, so as to reach the maximum achievable bandwidth rather slowly. When duplicate ACKs are received, if , Veno assumes a wireless loss and reduces the and by a small amount. Otherwise it behaves as TCP Reno. Veno has only two network states: congestion ( ) or noncongestion ( ). In the congestion state, every packet loss is assumed to be a wireless loss. Therefore, Veno cannot appropriately handle wireless losses when the network is in congestion.
TCP-Santa Cruz [ 23] also uses the measurement technique of Vegas to estimate the number of backlogged segments in the network. It updates a three-state machine based on the changes of . Within the state machine, an increase of during two consecutive intervals leads to the transition to the congestion state. TCP-Santa Cruz classifies a packet loss as a congestion loss and invokes congestion control against it only in the congestion state. However, TCP Santa Cruz has the same problem as TCP Veno has.
TCP-Jersey [ 13] is similar to TCP Westwood. It estimates the available bandwidth also from the ACK reception rate, but it filters the rate with the simpler filter than TCP Westwood's, which also needs less parameters. Contrary to the other schemes, TCP-Jersey's congestion event totally depends on Explicit Congestion Notification (ECN) [ 24]. If the sender receives a new ACK or duplicate ACKs with the congestion warning (CW) bit set in a congestion avoidance phase, it invokes the rate control, in which and are set to the estimated available bandwidth. Otherwise, it maintains its and , assuming that the network is underutilized. The weakness of TCP-Jersey is that it requires ECN-capable routers without probabilistic marking, which increases the network cost and processing overhead on the routers. Although it has less parameters than TCP Westwood, it still requires some parameters whose optimum values are hard to be decided because of their high dependence on network states. Also, it has no solution against timeouts due to wireless losses.
Jitter-based TCP (JTCP) [ 14] utilizes another metric, the jitter of data transmission time used in Real-time Transport Protocol (RTP) [ 25]. From the jitter reported by the receiver, it estimates the fraction of queued TCP segments in the network. If the number of queued segments estimated is more than a threshold, it behaves just like TCP Reno, except that it enters the fast retransmit phase only when duplicate ACKs have been received over several Round-Trip Time (RTT). Otherwise it maintains and for duplicate ACKs or halves and for a timeout. To calculate the jitter, the TCP sender has to store the sending time of each segment and the receiver has to report the segment receiving time in each ACK to the sender.
TCP TIBET [ 15] provides unbiased bandwidth estimation by filtering the interdeparture time of packets and the estimated bandwidth based on the core stateless fair queueing (CSFQ) algorithm. TCP TIBET uses the minimum RTT, denoted as , to calculate the available bandwidth accurately without depending on the coarse clock granularity of TCP. So, TCP TIBET has an updating algorithm, in which is multiplied by a reducing factor less than one. However, since the updating algorithm is invoked whenever a congestion event occurs, goes eventually to zero in congested networks. This makes small, and hence the corresponding connection remain in the congestion avoidance phase.
TCP NewReno-FF [ 26] discriminates packet losses based on RTT variation. It uses the flip flop filter technique [ 27] to estimate RTT. The flip flop filter exploits two exponentially weighted moving average (EWMA) filters where one of them, called agile filter, gives more weight to recent RTT samples. The agile filter is used prior to the other one when the deviation of the measured RTT sample is within a certain limit. If more than RTT samples among the last samples exceed the limit, a packet loss is classified as a congestion loss assuming that RTT will vary much in congestion. Otherwise it is classified as a wireless loss. As the other BWE schemes, it also has several parameters for configuring the filters and loss discrimination criterion. Moreover, it cannot detect wireless losses under congestion since it has only two network states as TCP Veno.

2.3. Another End-to-End Schemes

There are the end-to-end schemes which do not use the three techniques. TCP-DCR [ 19] is one of them. TCP-DCR delays a response to congestion by one RTT after the first duplicate ACK is received so that it gives time for LL-ARQ of the wireless link to recover a possible wireless error. If a new ACK arrives before the timer of one RTT expires, it switches to the normal operation. Otherwise it assumes that the loss is due to congestion and triggers the fast retransmissions and recovery operations.
With TCP-Casablanca [ 28], the TCP sender marks packets with two different discard priorities and intermediate routers drop the packets of the lower priority first in congestion. Thus the probability distribution of congestion losses will be different for the two types of packets while that of wireless losses may be similar for both of them assuming the uniform distribution of wireless losses. The TCP sender exploits such a difference when determining the cause of a packet loss. However, TCP-Casablanca requires the change of every router in networks and this makes it hard to be deployed. It discriminates the cause of a packet loss in a stochastic manner, therefore has higher detection failure probability of wireless losses than ELN schemes. Furthermore, under heavy congestion or bursty-loss wireless channel condition, the probability distribution of packet losses may not follow the presumed one and thus the detection failure probability will increase.

3. Link Layer Originated Explicit Link Status Notification

In this section, we describe the detailed operation of the proposed ELN scheme, that is, LL-ELSN, and analyze its behavior in the aspects of detection delay, accuracy, and bandwidth overhead.

3.1. Notification Operation

LL-ELSN assumes that the link layer protocol of a wireless link adopts an acknowledged service. The assumption is practical as many link layer protocols for wireless communication such as IEEE 802.11 Medium Access Control (MAC) [ 29] and Radio Link Protocol (RLP) [ 30] for cellular networks provide acknowledged services with ARQ. In such link layer protocols, when the number of retransmission attempts of a frame exceeds a limit, the link layer discards the frame and begins to transmit the next one. The discard of the frame results in the wireless loss of the corresponding TCP packet. In LL-ELSN, those discard events are monitored by BS or a wireless station. Here, BS indicates the network element which is responsible for reliable transmission over a wireless link in downlink communication, for example, Radio Network Controller (RNC) in UMTS [ 31] and Access Point (AP) in IEEE 802.11 WLANs.
Figure 1(a) shows the detailed LL-ELSN procedure for a single TCP packet in downlink communication with a stop-and-wait LL-ARQ. When the wireless link layer of BS fails in the first transmission attempt of the packet 1 (Layer-2 ACK timeout), the wireless link layer retransmits the packet and, simultaneously, BS sends the Explicit Retransmission Start Notification (ERSN) to the remote TCP sender. Upon reception of ERSN, the TCP sender neither invokes congestion control due to the packet 1 nor retransmits it. Therefore, spurious timeout can be avoided. If the packet 1 fails in transmission until the retry limit of LL-ARQ, BS sends an appropriate ELN message to the TCP sender. Upon receiving the ELN, the TCP sender immediately retransmits the packet 1 without any congestion control. (In this paper, the retry limit indicates the maximum number of transmission attempts permitted for a data frame at a link layer. It counts the first transmission attempt as defined in the IEEE 802.11.)
The ERSN and ELN messages are newly defined ones, which have the same message format as the TCP ACK. Their destination IP address and port number are those of the TCP sender. To indicate the message type, that is, ERSN, ELN or legacy TCP ACK, two bits of the reserved field of the TCP header are used. The internal traffic and signaling paths of BS for ERSN and ELN construction are depicted in Figure 2 for downlink communication. BS reads the TCP/IP header of an incoming packet from the wired network and sends the packet down to the wireless link layer with its flow information and TCP sequence number. When the first retransmission (excessive retry) occurs, the wireless link layer informs the ELSN agent of the flow information and sequence number of the retransmitted (discarded) packet. Then, the ELSN agent constructs the ERSN (ELN) message based on the information and sends it to the corresponding TCP sender.
For uplink communication shown in Figure 1(b), the link layer directly informs the upper TCP stack which packet enters a retransmission process and which packet is dropped because the monitored link layer and the TCP stack are within the same station. The behavior of the TCP sender against ERSN and ELN is the same as that in downlink communication. If a TCP packet is fragmented into several frames at the wireless link layer, ERSN is generated only for the first retransmission of a fragment while ELN is generated for the discard of the whole fragments.

3.2. Comparison to Other ELN Schemes

In this subsection, we compare LL-ELSN with other ELN schemes in the aspects of delay and accuracy. We summarize the comparison results in Table 1. For the comparison of notification delay, we define the following:
Table 1
Comparison of ELN schemes.
Balakrishnan et al .'s
ELN delay (downlink)
ELN delay (uplink)
Prob. of detection failure
the time when a wireless loss for a TCP packet occurs is the instance for the TCP packet being discarded at the wireless link layer;
the ELN delay is the time duration that takes until the TCP sender receives a notification for a wireless loss from the loss occurrence.
In the table, and are the time that takes to transmit a TCP data and an ACK packet successfully over a wireless link, and are obtained as
where and are the required time for a single transmission attempt of data and ACK packet, respectively, and and are the transmission failure probability of the corresponding packet. is the one-way trip time between BS and the remote TCP node (either sender or receiver) in the wired network.
As shown in Table 1, the ELN delay of LL-ELSN is only for downlink communication. However, the other schemes have additional delay plus since the ACK packet corresponding to an ELN message has to be transmitted over a wireless link. Furthermore, Balakrishnan et al .'s scheme has another delay component for successful transmission of the next data packet. On the other hand, for uplink communication, LL-ELSN has negligible ELN delay because the TCP sender is notified of wireless losses by the link layer within the same station. The other schemes need additional in uplink communication because the wireless link and the TCP receiver are located separately.
Although the other schemes (except LL-ELSN) have less delay than Balakrishnan et al .'s scheme, they have the probability of a detection failure that the TCP sender does not have any notification for a wireless loss ( is the probability that the header of the received TCP packet is corrupted and is the probability that the first fragment fails to be transmitted.). That is because at least the TCP/IP header and the first fragment have to be transmitted successfully over the wireless link. In LL-ELSN, there is no additional transmission over the wireless link and thus it has no detection failure in uplink communication. LL-ELSN also has no detection failure in downlink communication if ELN messages do not suffer from congestion losses on the wired path from BS to a TCP sender.
Compared to the other ELN schemes, LL-ELSN has the least delay and detection failure probability for both downlink and uplink, and also does not need any memory overhead at BS, which may enable fast handoffs. One disadvantage of LL-ELSN is that it violates the layering semantics because the link layer refers to the TCP/IP header to send ERSN and ELN messages. However, cross-layer design [ 32, 33] is a currently popular trend to utilize the restricted bandwidth of wireless links as much as possible. Thus the violation of the layering semantics could be acceptable if the performance is the most important metric.

3.3. Bandwidth Overhead of ERSN and ELN

In order to investigate the bandwidth overhead of the ERSN and ELN messages to wired networks in downlink communication, we conduct a simple mathematical analysis. We consider the slotted-access wireless system in which a data transmission and the acknowledgement of its reception (or Layer-2 ACK timeout) over a wireless link takes a single slot. Denote as the frame error probability of the wireless link and as the retry limit. Then, the intergeneration time of ERSN and ELN is given as
where is the expected number of required slots for successful transmission of a data frame given that it is successfully transmitted within the retry limit. is obtained as
Let be the duration time of a slot and be the transmission bit rate of the wireless link. We assume that the data transmission time is a dominant factor in a slot time and thus ( is the length of a data frame). Then, the generation rate of the ERSN and ELN messages, that is, their bandwidth overhead, is expressed as
where is the message length of ERSN and ELN. When , Figure 3 shows that is below 5% of in the considered range of . Knowing that the bandwidth of wired links is usually larger than that of the wireless link, the bandwidth overhead of the ERSN and ELN messages will not be a big burden to wired networks.

4. ELSN-Capable SACK

The operation of the TCP sender needs to be redesigned to handle the received ERSN and ELN messages. In this section we describe the proposed TCP modification which can be used with LL-ELSN as well as generic ELN schemes.

4.1. Congestion Control Algorithm

The side effect of immediate retransmissions triggered by ELNs is that the sequence of outgoing packets may be out of order and thus it is difficult for TCP senders to make appropriate decision on congestion control. To resolve this problem, Gao et al . [ 10] proposed a data structure of a packet list which keeps the information of sent packets, that is, sequence number and duplicate ACK counter, in the sending order. By exploiting this packet list (referred to as LHP packet list hereafter), the TCP sender can invoke appropriate congestion control schemes in various situations. However, it invokes multiple fast retransmit and recovery procedures for multiple packet drops in a congestion window, which makes the congestion window size unnecessarily small. To illustrate, assume that a list first has packet sequences from 14 to 19, as shown in Figure 4, and packets 14, 16, and 18 are lost in the network. When the sender receives an ACK of (14, 15) which means the receiver received the packet 15, but has not received the packet 14 yet, it removes the element of packet 15 in the list and increases the duplicate ACK counter of packet 14 by 1. (LHP premises the use of selective acknowledgement (e.g., SACK) and thus, from a received ACK, the TCP sender can be aware which packet triggered the ACK as well as which packet is expected.) Following ACKs for packets 17 and 19 increase the duplicate ACK counter of packets 14, 16, and 18. In this manner, the duplicate ACK counter of packet 14 reaches three, and a fast retransmit and fast recovery events are invoked, in which the congestion window size is reduced to half. After that, when the sender receives the ACK of (14, 20), it invokes the congestion control again because the duplicate ACK counter of packet 16 also reaches three. Consequently, the fast recovery is invoked three times for three packet drops in a congestion window.
To avoid this problem, we propose a new TCP modification, ELN-capable SACK (ESACK), which is based on the SACK implementation. In SACK, the data sender is assumed to have a retransmission queue which contains the segments, which have been transmitted, but not yet acknowledged, in a sequence-number order. Each segment in the retransmission queue has a flag bit "SACKed" to indicate whether this segment has been reported to have been received in an SACK option. ESACK adopts a similar approach except that the retransmission queue is aligned in a sending order. The retransmission queue does not need to store whole segments. Instead, it can store only the start and end bytes of a segment in the bulk data to reduce the size of the queue. ESACK defines new variables as below.
FirstUnsacked: the sequence number of the first segment which is not SACKed in the retransmission queue.
HighSACK: the highest sequence number which is SACKed.
The FirstUnsacked segment should be acknowledged cumulatively or selectively by a very next ACK unless it is reordered or dropped by congestion. If the segment is lost in a wireless link, it will be notified by an ELN and the entry of the segment will be moved to the tail of the retransmission queue after being retransmitted. Therefore, if FirstUnsacked is identified to be less than HighSACK after a duplicate ACK is processed, the TCP sender understands that the FirstUnsacked segment is reordered or dropped by congestion assuming that ELN is not dropped. Thus, by checking FirstUnsacked and HighSACK of the retransmission queue, the TCP sender can react to congestion while processing ELN. The procedure of ESACK when an ACK is received is specified in Algorithm 1.
Algorithm 1: Pseudocode of ESACK congestion control algorithm.
if(ELN) retransmit(ELN)
else if(duplicate ACK)
if(there is a SACK option)
update retransmission queue
end if
if(HighACK == FirstUnsacked)
dupacks ++
else if(FirstUnsacked HighSACK)
if(FirstUnsacked is changed)
dupacks = 1
else dupacks ++
else if(HighACK segment is SACKed)
end if
end if
if(dupacks DupThresh)
Typical TCP operation for DupThresh duplicate ACKs
end if
When an ELN is received, ESACK retransmits the lost segment indicated by the ELN immediately since the reception of an ELN indicates not only a packet was corrupted in a wireless link but also the packet has been dropped from the network. Thus the immediate retransmission ensures that the TCP connection remains ACK-clocked.
When a duplicate ACK is received, the sender checks FirstUnsacked and HighSACK to investigate whether it is due to congestion. If FirstUnsacked is equal to HighACK, it means that the FirstUnsacked segment was reordered or dropped due to congestion (we assume that an ACK acknowledges the sequence number or the first byte of the expected data). (HighACK is defined as the sequence number of the highest byte of data that has been cumulatively ACKed at a given point in [ 34].) Otherwise, if FirstUnsacked is less than HighSACK, it means that the FirstUnsacked segment was reordered or dropped due to congestion while waiting for a new ACK for the segment retransmitted by an ELN. If the segment of HighACK is already SACKed, it means that the segment was dropped in the receiver queue. Therefore, the segment needs to be transmitted again. If the counter of received duplicate ACKs becomes three, the sender triggers the congestion recovery algorithms of fast retransmit and recovery. In fast retransmit phase, the sender retransmits the FirstUnsacked segment. The other operations of ESACK are equal to SACK. Therefore, if there is no wireless loss, ESACK behaves just like SACK.

4.2. ELN Filtering

When a TCP receiver is disconnected from BS in downlink communication, the transmission attempts of the frames for the disconnected receiver fail consecutively. If the BS keeps on sending ELN messages for those frames to the corresponding TCP sender, the TCP connection will not be removed since the TCP sender keeps on retransmitting the lost segments. This wastes the bandwidth of the wired network unnecessarily. One solution to avoid this problem is to let the BS stop sending ELN messages for disconnection. The BS can assume disconnection when a certain number of successive segments for a connection are discarded. However, this approach is not scalable since the BS should store per-connection information. We solve this problem in the sender side.
ESACK can solve the bandwidth waste problem due to consecutive ELNs by differentiating the causes of the ELNs: channel noise or permanent disconnection. ESACK assumes that channel noise can maximally result in consecutive ELNs. To obtain the appropriate value of , let be the probability that a transmitted frame is corrupted by channel noise and be the retry limit. Then, the probability that a transmission leads to at least consecutive transmission failures is obtained by . We configure so that is reasonably high. For example, assume that is 20% and there is no retransmission ( ). Then, the probabilities of two and three consecutive failures are 4% and 0.8%, respectively, and we can configure as 2 since three consecutive failures due to channel noise is rather rare as 0.8%. For more conservative operation, we can increase . For those wireless losses due to channel noise, the TCP sender retransmits the discarded segments immediately without invoking congestion control. If ELNs are received for more than successive data packets, we consider this a disconnection. In this case, the TCP sender freezes all the TCP-related parameters and retransmits the discarded segments at every other ELN. By doing this, ESACK can reduce the bandwidth waste of unnecessary retransmission due to consecutive ELNs gradually in permanent disconnection. The reason of gradual slow down is for the case when a wireless station temporarily looks disconnected due to deep fading, handoff, and so fourth If the TCP sender receives a new ACK, the TCP sender restores the parameters and resumes data service.

5. Simulation Results

To investigate the quantitative performance of the proposed scheme, we perform comprehensive simulation using the ns-2 simulator [ 17].

5.1. Simulation Setup

We use the built-in modules for Reno, NewReno [ 35], SACK [ 5], and Snoop, the open sources [ 36] for TCP Westwood, [ 37] for TCP-DCR [ 19], and add some modules for TCP Veno [ 12], TCP Jersey [ 13], JTCP [ 14], TIBET [ 15], and LHP, with default parameter settings mentioned in the corresponding papers. (The version of TCP Westwood considered in the simulation is Westwood+.) Unless otherwise specified, LL-ELSN is used with ESACK. We use NewReno when evaluating the performance of Snoop for better performance. We compare their goodput performance and fairness in mixed wired and wireless networks. Here, goodput is defined as the TCP-level data reception rate measured at TCP receivers. We perform our simulation by changing RTT, competing traffic and the characteristic of the wireless channel. Simulation is run for a period of 200 seconds and we average over 20 runs without LL-ARQ and 50 runs with LL-ARQ. (LL-ARQ increases link layer dynamics and thus we need more simulation runs with LL-ARQ to obtain stable results.) The TCP packet size is 1000 bytes. TCP receivers advertise a very large window such that the sending rate of TCP senders is not clamped by the receivers' dynamics. IEEE 802.11b [ 38] is used for the data link layer of a wireless link and the queue size of each wired link and the wireless network interface is 50 packets. We do not use the Request-To-Send(RTS)/Clear-To-Send(CTS) exchange mechanism and link layer fragmentation in simulation. The transmission bit rate of data and ACK frames over the wireless link is 11Mbps. We use the default physical parameters of ns-2 for the 802.11 b modulations. The simulated topology for downlink communication is depicted in Figure 5. All the wired links are full-duplex 10 Mbps and their buffers have delay offset . Each connection carries long-live FTP application traffic from the source node (S) to the destination node (D). Packets are dropped bidirectionally at the wireless link. We generate bit errors depending on the given frame error rate (FER) in the wireless link. So, data packets are more prone to be lost than ACKs due to the longer length.

5.2. Comparison of Wireless TCP Schemes

In this subsection, we compare the various aspects of wireless TCP schemes. We first set the retry limit of 802.11 MAC to one in order to investigate the robustness of the testing schemes against wireless losses while excluding the effect of LL-ARQ. In this case, FER of the wireless link is equal to the Packet Error Rate (PER). It is noted that wireless losses due to collisions are not counted in FER, so there are still wireless losses even with small FER.
We perform the simulation of one TCP connection case, = 1. Figure 6 shows the goodput result as FER varies from 0.001% to 9%. In the result, LL-ELSN shows significant performance improvement over the other schemes. Note that LL-ELSN is better than the other schemes even with 0.01% FER since there are still wireless losses due to collisions between data packets (from BS) and TCP ACK packets (from the wireless station). Snoop shows the second best performance due to its local retransmission at BS. However, as FER increases, the goodput of Snoop decreases more drastically than that of LL-ELSN. That is because the local retransmission causes TCP to lose its ACK-based clock due to long end-to-end delay in high FER while LL-ELSN lets the TCP sender keep on sending data due to its accurate detection of wireless losses. The other schemes are better than Reno, but are much worse than LL-ELSN. Figure 7 shows the effect of RTT on the goodput performance. We indirectly vary RTT by adjusting the delay offset of each link buffer, that is, increase to increase RTT. The goodput of TCP Reno is known as inversely proportional to RTT [ 39]. Similarly, the testing schemes also show decreasing goodput for increasing RTT. However, their sensitivities to RTT are different. Especially, TIBET and Westwood are worse than Reno in small RTT. LL-ELSN shows good performance constantly.
Since the importance of uplink communication is increasing, we also simulate the uplink communication scenario where the wireless station is the TCP sender. We assume that the Snoop agent is at the wireless station by symmetric operation. In Figure 8, LL-ELSN shows the best performance because LL-ELSN has negligible ELN delay in uplink communication. However, Snoop is much worse than in downlink communication due to its slow retransmission. Snoop cannot retransmit the lost packet until the Snoop agent receives a duplicate ACK, which takes one RTT in uplink communication, or a timeout occurs at the agent. Moreover, Snoop cannot distinguish congestion losses from wireless losses in uplink communication, thus Snoop is not suitable for uplink communication. The relative performances of the other schemes are similar to those in downlink communication.
Next, we simulate the network with multiple TCP connections ( 1). In the goodput comparison in Figure 9, LL-ELSN shows quasiconstant goodput irrespective of . That is because LL-ELSN fully utilizes the channel bandwidth even with small number of connections. However, the other schemes can fully utilize the bandwidth with sufficiently large number of connections since each connection suffers from rate reduction due to wireless losses. In order to quantify fairness among multiple TCP connections, we evaluate the Jain's fairness index [ 40] which is defined as below
where is the goodput of th connection among the total connections. ranges from to 1.0, where means the perfect fairness and in that case, all the connections occupy the same bandwidth. means that one connection occupies the whole bandwidth. In Figure 10, the fairness of TCP-Jersey and JTCP gets worse as the number of connections gets increased. TCP Westwood and TIBET show the poor fairness as some connections suffer severely from frequent timeouts due to small congestion window size. Although the fairness of the testing schemes are relatively comparable with each other, most of them achieve high fairness index above 0.98 and do not suffer from severe unfairness. From the above two figures, we can conclude that LL-ELSN achieves good performance against wireless losses by accurate congestion response, not by aggressive behavior.
To investigate the performance against both wireless and congestion losses, we introduce background traffic to the network along the path of a TCP connection. The background traffic follows the Pareto distribution whose mean rate is 8 Mbps and the shape parameter is set rather high as two to induce high dynamics to the network. The simulation result is shown in Figure 11. LL-ELSN still shows the best performance among the testing schemes, which means that it successfully handles congestion losses as well as wireless losses. The severe performance degradation of LHP is due to multiple fast recoveries. This fact is investigated further in the next subsection.
Another important aspect of a new TCP variant is the friendliness with existing TCP Reno which is the base model of many TCP variants [ 41]. To be friendly with TCP Reno, a new TCP variant should follow the guidelines provided in [ 2]. Otherwise, it can be aggressive or feeble against TCP Reno. We investigate the goodput change of a connection using TCP Reno when 10 connections using TCP Reno are replaced one by one with those using a new TCP variant. If the goodput of a Reno connection remains the same, it means perfect friendliness of the new TCP variant. Otherwise, the goodput decreasing rate of a Reno connection presents the degree of the unfriendliness of the new TCP variant with TCP Reno. Figure 12 shows the result. (We exclude Snoop in this simulation since Snoop is a link-level solution and thus independent of TCP variants.) Here, we consider the ideal BWE scheme which knows the exact available bandwidth and RTT. In the figure, it is shown that TCP Westwood, Veno, and TIBET including the ideal BWE are aggressive against Reno connections. That means the accuracy of bandwidth estimation is not the only factor for TCP-friendliness. Those four schemes set to the estimated bandwidth when a congestion event occurs. Therefore, when multiple fast recoveries or timeouts are invoked in a short-time period due to heavy network congestion, those schemes have larger than Reno. While on the other, LL-ELSN and HACK show the similar result with SACK. That is because they behave just like SACK without wireless losses. TCP-DCR is somewhat aggressive against Reno as it delays congestion response, that is, reduction of transmission rate.
We also investigate the effectiveness of ERSN in the network with long retransmission delay. We set the retransmission delay of the wireless link layer to 100 milliseconds. Figure 13 shows that the number of spurious timeouts increases as FER increases due to the increasing number of retransmissions. LL-ELSN with ERSN achieves higher goodput than that without ERSN thanks to the prevention of spurious timeouts. It is observed that the improvement ratio of ERSN is almost proportional to the number of spurious timeouts.

5.3. Effect of TCP Sender Operation

We compare several combinations of LL-ELSN and different TCP modifications to show the effect of TCP sender operation to the TCP performance and find which combination results in the best performance. We compare 4 different TCP modifications, TCP HACK, HACK-fixed, LHP packet list, and ESACK, when used with LL-ELSN. Figure 14 shows the result. In the result, HACK shows better performance than Reno since it retransmits lost packets in the wireless link quickly. However, even with wireless losses, congestion control can be invoked by following duplicate ACKs and thus it is much worse than the other schemes. HACK-fixed is better than LHP when there are small connections, but, becomes worse than LHP when there are more than 10 connections due to congestion collapse by somewhat aggressive behavior. LHP is worse than ESACK because of ACK loss and multiple fast retransmit and recovery, as explained in Section 4. ESACK shows the best performance regardless of the number of connections, which means that it works well by accurate congestion response, not by an aggressive behavior.
Figure 15 shows the goodput improvement ratio of ESACK compared to the LHP packet list with the same Pareto background traffic as Figure 11. At low congestion, there is no difference between the LHP packet list and ESACK because the occurrence of multiple packet drops is rare. As congestion becomes severe, however, ESACK shows better performance since unnecessary fast recovery operations are invoked in the LHP packet list.
Figure 16 shows the sending rate variation of ESACK whose receiver is disconnected at 10 seconds. As shown in the figure, ESACK can avoid unnecessary retransmissions and thus mitigate bandwidth waste by filtering consecutive ELNs.

5.4. Effect of Link Layer ARQ

In this simulation, we only consider two best schemes in the previous simulations, that is, LL-ELSN and Snoop, and TCP Reno as a reference. If we use LL-ARQ, we can get much different results, as shown in Figure 17. When FER is less than 3%, TCP Reno with the retry limit of two is better than Snoop because LL-ARQ can hide significant amount of wireless losses. Notwithstanding that Snoop performs local retransmission, Snoop shows worse performance than TCP Reno with LL-ARQ. It results from the slower retransmission of Snoop than LL-ARQ because the timeout time of Snoop should be larger than LL-ARQ's and Snoop retransmits a TCP packet when a duplicated ACK is received or the timer expires. Also, Snoop does not retransmit TCP ACK which can be also lost, so this increases packet losses and delays the retransmission of data packets further. Although Snoop is combined with LL-ARQ, it is worse than Reno with LL-ARQ because the retransmission mechanism of Snoop is independent of that of the link layer. That is, the Snoop agent can retransmit a TCP packet unnecessarily while the link layer keeps on retransmitting it. On the other hand, if LL-ELSN is combined with LL-ARQ, it shows better performance with relatively small retry limit than TCP Reno. It is because LL-ELSN lets the TCP sender not to shrink the congestion window against wireless losses which LL-ARQ cannot recover through retransmissions. From the result, it is shown that LL-ARQ achieves significant increase of goodput due to PER decrease in high-error conditions. Such a result is consistent with that of [ 42]. However, LL-ARQ has some disadvantages in wireless channels with bursty errors, as will be shown in the following.
To investigate the effect of a bursty loss channel originated from inherent channel characteristics or temporary link outage from weak signal power or handoffs, we use a Markov chain for the wireless channel model and simulate 10 concurrent TCP connections. The Markov chain can be also used to model Rayleigh fading channels. Tan and Beaulieu [ 43] proposed a first-order Markov model for Rayleigh fading channels and analytically proved its accuracy. Babich and Lombardi [ 44] proposed a first-order Markov model for three-level quantized Rayleigh fading channels. Zorzi et al . [ 45] showed that a two state Markov model well approximates Rayleigh fading channels. As such, modeling Rayleigh fading channels using a first-order Markov model has gained some acceptance in literature and seems to find a growing number of applications [ 28, 46]. The Markov chain considered in this paper consists of two states: good and bad. We assume that state transitions occur per frame transmission and all the frames are transmitted successfully except when collisions occur in the good state while all the frames are lost with probability one in the bad state. We set the transition probability from the good state to the bad state, denoted as , to 0.01 and vary the staying probability in the bad state, denoted as , from 0.1 to 0.5. As increases, the burstiness of channel loss gets higher.
As shown in Figure 18, using LL-ARQ does not always perform well. Furthermore, the performance gets worse with LL-ARQ as the burstiness gets higher because, in the simulation environment considered, TCP performance is more sensitive to RTT increase rather than PER decrease due to LL-ARQ. Also, large retry limit results in the head of line (HOL) blocking in the First-In First-Out (FIFO) queue of BS (e.g., AP of 802.11 WLANs), so the other connections in the good state cannot be served due to a few connections in the bad state which keep on retransmitting failed frames until the retry limit is reached [ 47]. The HOL blocking also happens when a wireless station is suddenly powered-off during its communication with remote TCP senders. The harmful effect of HOL gets larger as the retry limit increases. Therefore, in order to achieve good performance flexibly in various environments, the retry limit should be set as small as possible satisfying the desired service quality in uniform error conditions.

6. Conclusions

In this paper, we proposed a new ELN scheme and a TCP modification. The proposed ELN scheme can detect wireless losses rapidly and accurately with minimum overhead since it does not need any additional data exchange over wireless links. We also showed that the proposed TCP modification can deal with multiple packet drops in a congestion window. The combination of these two schemes gives the significant performance improvement over existing protocols as we have demonstrated throughout the simulation results. We conclude that the combination of LL-ELSN and LL-ARQ with a small retry limit will perform well flexibly in various environments.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://​creativecommons.​org/​licenses/​by/​2.​0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Unsere Produktempfehlungen

Premium-Abo der Gesellschaft für Informatik

Sie erhalten uneingeschränkten Vollzugriff auf alle acht Fachgebiete von Springer Professional und damit auf über 45.000 Fachbücher und ca. 300 Fachzeitschriften.

Über diesen Artikel

Weitere Artikel der Ausgabe 1/2009

EURASIP Journal on Wireless Communications and Networking 1/2009 Zur Ausgabe

Premium Partner