draft-floyd-incr-init-win-02

TCP Implementation Working Group M. Allman
INTERNET DRAFT NASA Lewis/Sterling Software
File: draft-floyd-incr-init-win-02.txt S. Floyd
 LBNL
 C. Partridge
 BBN Technologies
 April, 1998
 Increasing TCP's Initial Window
Status of this Memo
 This document is an Internet-Draft. Internet-Drafts are working
 documents of the Internet Engineering Task Force (IETF), its areas,
 and its working groups. Note that other groups may also distribute
 working documents as Internet-Drafts.
 Internet-Drafts are draft documents valid for a maximum of six
 months and may be updated, replaced, or obsoleted by other documents
 at any time. It is inappropriate to use Internet-Drafts as
 reference material or to cite them other than as ``work in
 progress.''
 To view the entire list of current Internet-Drafts, please check
 the "1id-abstracts.txt" listing contained in the Internet-Drafts
 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
 (US West Coast).
Abstract
 This document specifies an increase in the permitted initial window
 for TCP from one segment to roughly 4K bytes. This document
 discusses the advantages and disadvantages of such a change,
 outlining experimental results that indicate the costs and benefits
 of such a change to TCP.
Terminology
 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
 "OPTIONAL" in this document are to be interpreted as described in
 RFC 2119 [RFC2119].
1. TCP Modification
 This document specifies an increase in the permitted upper bound
 for TCP's initial window from one segment to between two
 and four segments. In most cases, this change results in an upper
 bound on the initial window of roughly 4K bytes (although given a
 large segment size, the permitted initial window of two segments
Expires: October, 1998 [Page 1]

draft-floyd-incr-init-win-02.txt April 1998
 could be significantly larger than 4K bytes). The upper bound for
 the initial window is given more precisely in (1):
 min (4*MSS, max (2*MSS, 4380 bytes)) (1)
 Equivalently, the upper bound for the initial window size
 is based on the maximum segment size (MSS), as follows:
 If (MSS <= 1095 bytes)
 then win <= 4 * MSS;
 If (1095 bytes < MSS < 2190 bytes)
 then win <= 4380;
 If (2190 bytes <= MSS)
 then win <= 2 * MSS;
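 As an illustration only, the bound in (1) and the three MSS ranges
 above can be checked against each other with a short Python sketch.
 The function names are hypothetical and are not part of any TCP
 implementation; MSS and window values are in bytes.

```python
def initial_window_upper_bound(mss: int) -> int:
    """Upper bound on the initial window in bytes, per equation (1)."""
    return min(4 * mss, max(2 * mss, 4380))


def initial_window_by_cases(mss: int) -> int:
    """The same bound written as the three MSS ranges listed above."""
    if mss <= 1095:
        return 4 * mss
    if mss < 2190:
        return 4380
    return 2 * mss


if __name__ == "__main__":
    # Illustrative MSS values (assumed, not taken from the draft).
    for mss in (512, 1095, 1460, 2190, 4312):
        print(mss, initial_window_upper_bound(mss))
```

 The two functions agree for every positive MSS, which is what
 "equivalently" in the text asserts.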
 This increased initial window is optional: a TCP MAY start with a
 larger initial window, but this document does not say that it SHOULD.
 This upper bound for the initial window size represents a change
 from RFC 2001 [S97], which specifies that the congestion window be
 initialized to one segment. If implementation experience proves
 successful, then the intent is for this change to be incorporated
 into a revision to RFC 2001.
 This change applies to the initial window of the connection in the
 first round trip time (RTT) of transmission following the TCP
 three-way handshake. Neither the SYN/ACK nor its acknowledgment
 (ACK) in the three-way handshake should increase the initial window
 size above that outlined in equation (1). If the SYN or SYN/ACK is
 lost, the initial window used by a sender after a correctly
 transmitted SYN MUST be one segment.
 TCP implementations use slow start in as many as three different ways:
 (1) to start a new connection (the initial window); (2) to restart
 a transmission after a long idle period (the restart window); and
 (3) to restart after a retransmit timeout (the loss window). The
 change proposed in this document affects the value of the initial
 window. Optionally, a TCP MAY set the restart window to the
 same value used for the initial window. These changes do NOT
 change the loss window, which must remain one segment (to permit the
 lowest possible window size in the case of severe congestion).
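 The three uses of slow start described above can be summarized in a
 minimal Python sketch. All names are hypothetical. The one-segment
 default for the restart window when the option is not used is an
 assumption made here for illustration; this document only says that
 the restart window MAY be set to the initial window value.

```python
from enum import Enum


class SlowStartUse(Enum):
    INITIAL = "new connection"
    RESTART = "restart after a long idle period"
    LOSS = "restart after a retransmit timeout"


def initial_window_upper_bound(mss: int) -> int:
    """Equation (1): upper bound on the initial window, in bytes."""
    return min(4 * mss, max(2 * mss, 4380))


def slow_start_window(use: SlowStartUse, mss: int,
                      handshake_loss: bool = False,
                      larger_restart: bool = False) -> int:
    """Largest window (bytes) permitted when slow start begins."""
    if use is SlowStartUse.LOSS:
        # The loss window is unchanged: one segment.
        return mss
    if use is SlowStartUse.INITIAL:
        # A lost SYN or SYN/ACK forces a one-segment initial window.
        return mss if handshake_loss else initial_window_upper_bound(mss)
    # Restart window: optionally the same as the initial window.
    # The one-segment fallback is an assumption of this sketch.
    return initial_window_upper_bound(mss) if larger_restart else mss
```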
2. Implementation Issues
 When larger initial windows are implemented along with Path MTU
 Discovery [MD90], and the MSS being used is found to be too large,
 the congestion window `cwnd' SHOULD be reduced to prevent large
 bursts of smaller segments. Specifically, `cwnd' SHOULD be reduced
 by the ratio of the old segment size to the new segment size.
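 One possible reading of this reduction, sketched in Python: with
 `cwnd' kept in bytes, dividing it by the ratio (old segment size /
 new segment size) keeps the number of segments in the window
 unchanged when the segment size shrinks. The function name and the
 one-segment floor are assumptions of this sketch, not requirements
 of this document.

```python
def rescale_cwnd(cwnd_bytes: int, old_mss: int, new_mss: int) -> int:
    """Reduce cwnd (bytes) by the ratio old_mss / new_mss."""
    if new_mss >= old_mss:
        # The segment size did not shrink; nothing to reduce.
        return cwnd_bytes
    scaled = cwnd_bytes * new_mss // old_mss
    # Assumed floor: never go below one (new, smaller) segment.
    return max(new_mss, scaled)


if __name__ == "__main__":
    # Three 1460-byte segments become three 536-byte segments,
    # rather than a burst of eight smaller ones.
    print(rescale_cwnd(4380, 1460, 536))
```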
 When larger initial windows are implemented along with Path MTU
 Discovery [MD90], alternatives are to set the "Don't Fragment" (DF)
 bit in all segments in the initial window, or to set the "Don't
 Fragment" (DF) bit in one of the segments. It is an open question
 which of these two alternatives is best; we would hope that
 implementation experiences will shed light on this. In the first
 case of setting the DF bit in all segments, if the initial packets
 are too large, then all of the initial packets will be dropped in
 the network. In the second case of setting the DF bit in only one
 segment, if the initial packets are too large, then all but one of
 the initial packets will be fragmented in the network. When the
 second case is followed, setting the DF bit in the last segment in
 the initial window provides the least chance for needless
 retransmissions when the initial segment size is found to be too
 large, because it minimizes the chances of duplicate ACKs
 triggering a Fast Retransmit. However, more attention needs to be
 paid to the interaction between larger initial windows and Path MTU
 Discovery.
 The larger initial window proposed in this document is not intended
 as an encouragement for web browsers to open multiple simultaneous
 TCP connections all with large initial windows. When web browsers
 open simultaneous TCP connections to the same destination, this
 works against TCP's congestion control mechanisms [FF98],
 regardless of the size of the initial window. Combining this
 behavior with larger initial windows further increases the
 unfairness to other traffic in the network.
3. Advantages of Larger Initial Windows
 1. When the initial window is one segment, a receiver employing
 delayed ACKs [Bra89] is forced to wait for a timeout before
 generating an ACK. With an initial window of at least two
 segments, the receiver will generate an ACK after the second
 data segment arrives. This eliminates the wait on the timeout
 (often up to 200 msec).
 2. For connections transmitting only a small amount of data, a
 larger initial window reduces the transmission time (assuming
 moderate segment drop rates). For many email (SMTP [Pos82])
 and web page (HTTP [BLFN96, FJGFBL97]) transfers that are less
 than 4K bytes, the larger initial window would reduce the data
 transfer time to a single RTT.
 3. For connections that will be able to use large congestion
 windows, this modification eliminates up to three RTTs and a
 delayed ACK timeout during the initial slow-start phase. This
 would be of particular benefit for high-bandwidth
 large-propagation-delay TCP connections, such as those over
 satellite links.
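 The saving in round trips can be illustrated with a deliberately
 idealized Python sketch. It assumes no loss, no delayed ACKs, an
 unlimited receiver window, and a congestion window that doubles
 every RTT; real connections with delayed ACKs grow more slowly, so
 the sketch does not reproduce the exact figures quoted above. The
 function name is hypothetical.

```python
def slow_start_rounds(total_segments: int, initial_window: int) -> int:
    """Round trips needed to send total_segments in ideal slow start."""
    if total_segments <= 0 or initial_window <= 0:
        raise ValueError("segment counts must be positive")
    sent = 0
    window = initial_window
    rounds = 0
    while sent < total_segments:
        sent += window   # one window of segments per round trip
        window *= 2      # idealized doubling per RTT
        rounds += 1
    return rounds


if __name__ == "__main__":
    for iw in (1, 2, 4):
        print(iw, slow_start_rounds(4, iw), slow_start_rounds(8, iw))
```

 Under these assumptions a four-segment transfer completes in one
 round trip with a four-segment initial window, compared with three
 round trips when starting from one segment.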
4. Disadvantages of Larger Initial Windows for the Individual
 Connection
 In high-congestion environments, particularly for routers that have
 a bias against bursty traffic (as in the typical Drop Tail router
 queues), a TCP connection can sometimes be better off starting with
 an initial window of one segment. There are scenarios where a TCP
 connection slow-starting from an initial window of one segment might
 not have segments dropped, while a TCP connection starting with an
 initial window of four segments might experience unnecessary
 retransmits due to the inability of the router to handle small
 bursts. This could result in an unnecessary retransmit timeout.
 For a large-window connection that is able to recover without a
 retransmit timeout, this could result in an unnecessarily-early
 transition from the slow-start to the congestion-avoidance phase of
 the window increase algorithm. These premature segment drops should
 not occur in uncongested networks with sufficient buffering or in
 moderately-congested networks where the congested router uses
 active queue management (such as Random Early Detection [FJ93]).
 Some TCP connections will receive better performance with the higher
 initial window even if the burstiness of the initial window results
 in premature segment drops. This will be true if (1) the TCP
 connection recovers from the segment drop without a retransmit
 timeout, and (2) the TCP connection is ultimately limited to a small
 congestion window by either network congestion or by the receiver's
 advertised window.
5. Disadvantages of Larger Initial Windows for the Network
 In terms of the potential for congestion collapse, we consider two
 separate potential dangers for the network. The first danger would
 be a scenario where a large number of segments on congested links
 were duplicate segments that had already been received at the
 receiver. The second danger would be a scenario where a large
 number of segments on congested links were segments that would be
 dropped later in the network before reaching their final
 destination.
 In terms of the negative effect on other traffic in the network, a
 potential disadvantage of larger initial windows would be that they
 increase the general packet drop rate in the network. We discuss
 these three issues below.
 Duplicate segments:
 As described in the previous section, the larger initial window
 could occasionally result in a segment dropped from the initial
 window, when that segment might not have been dropped if the
 sender had slow-started from an initial window of one segment.
 However, the Appendix (Section 13) shows that even in this case, the larger
 initial window would not result in the transmission of a large
 number of duplicate segments.
 Segments dropped later in the network:
 How much would the larger initial window for TCP increase the
 number of segments on congested links that would be dropped
 before reaching their final destination? This is a problem that
 can only occur for connections with multiple congested links,
 where some segments might use scarce bandwidth on the first
 congested link along the path, only to be dropped later along
 the path.
 First, many of the TCP connections will have only one congested
 link along the path. Segments dropped from these connections do
 not ``waste'' scarce bandwidth, and do not contribute to
 congestion collapse.
 However, some network paths will have multiple congested links,
 and segments dropped from the initial window could use scarce
 bandwidth along the earlier congested links before ultimately
 being dropped on subsequent congested links. To the extent
 that the drop rate is independent of the initial window used by
 TCP segments, the problem of congested links carrying segments
 that will be dropped before reaching their destination will be
 similar for TCP connections that start by sending four segments
 or one segment.
 An increased packet drop rate:
 For a network with a high segment drop rate, increasing the TCP
 initial window could increase the segment drop rate even
 further. This is in part because routers with Drop Tail queue
 management have difficulties with bursty traffic in times of
 congestion. However, given uncorrelated arrivals for TCP
 connections, the larger TCP initial window should not
 significantly increase the segment drop rate. Simulation-based
 explorations of these issues are discussed in Section 7.2.
 These potential dangers for the network are explored in the
 simulations and experiments described in Section 7. Our judgement
 is that, while there are dangers of congestion collapse in the
 current Internet (see [FF98] for a discussion of the dangers of
 congestion collapse from an increased deployment of UDP connections
 without end-to-end congestion control), there is no such danger to
 the network from increasing the TCP initial window to 4K bytes.
6. Typical Levels of Burstiness for TCP Traffic
 Larger TCP initial windows would not dramatically increase the
 burstiness of TCP traffic in the Internet today, because such
 traffic is already fairly bursty. Bursts of two and three segments
 are already typical of TCP [Flo97]. A delayed ACK (covering two
 previously unacknowledged segments) received during congestion
 avoidance causes the congestion window to slide and two segments to
 be sent. The same delayed ACK received during slow start causes
 the window to slide by two segments and then be incremented by one
 segment, resulting in a three-segment burst. While not necessarily
 typical, bursts of four and five segments for TCP are not rare.
 Assuming delayed ACKs, a single dropped ACK causes the subsequent
 ACK to cover four previously unacknowledged segments. During
 congestion avoidance this leads to a four-segment burst and during
 slow start a five-segment burst is generated.
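 The burst sizes described above follow a simple rule, sketched here
 in Python: an ACK covering k previously unacknowledged segments
 slides the window by k segments, and during slow start the window
 also grows by one segment for that ACK. The function name is
 hypothetical.

```python
def burst_size(segments_acked: int, in_slow_start: bool) -> int:
    """Segments released back-to-back by a single arriving ACK."""
    # The window slides by the number of newly acknowledged segments.
    burst = segments_acked
    if in_slow_start:
        # Slow start adds one segment to the window per ACK received.
        burst += 1
    return burst


if __name__ == "__main__":
    for acked in (2, 4):  # delayed ACK; delayed ACK after one lost ACK
        print(acked, burst_size(acked, False), burst_size(acked, True))
```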
 There are also changes in progress that reduce the performance
 problems posed by moderate traffic bursts. One such change is the
 deployment of higher-speed links in some parts of the network,
 where a burst of 4K bytes can represent a small quantity of data.
 A second change, for routers with sufficient buffering, is the
 deployment of queue management mechanisms such as RED, which is
 designed to be tolerant of transient traffic bursts.
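 To give a rough sense of why a 4K-byte burst matters less on faster
 links, the following Python sketch computes how long a 4380-byte
 burst occupies a link. The link rates are illustrative values
 assumed here, not measurements from this document, and the function
 name is hypothetical.

```python
def serialization_time_ms(burst_bytes: int, link_bits_per_second: float) -> float:
    """Time in milliseconds to clock burst_bytes onto a link."""
    return burst_bytes * 8 / link_bits_per_second * 1000.0


if __name__ == "__main__":
    # Assumed example rates: a dialup modem, a T1 line, an OC-3 link.
    for name, rate in (("28.8 kbit/s", 28_800),
                       ("1.544 Mbit/s", 1_544_000),
                       ("155 Mbit/s", 155_000_000)):
        print(name, round(serialization_time_ms(4380, rate), 3), "ms")
```

 The same burst that occupies a modem link for more than a second
 is clocked out in well under a millisecond at 155 Mbit/s.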
7. Simulations and Experimental Results
7.1 Studies of TCP Connections using that Larger Initial Window
 This section surveys simulations and experiments that have been
 used to explore the effect of larger initial windows on the TCP
 connection using that larger window. The first set of experiments
 explores performance over satellite links. Larger initial windows
 have been shown to improve performance of TCP connections over
 satellite channels [All97b]. In this study, an initial window of
 four segments (512 byte MSS) resulted in throughput improvements of
 up to 30% (depending upon transfer size). [KAGT98] shows that the
 use of larger initial windows results in a decrease in transfer
 time in HTTP tests over the ACTS satellite system. A study
 involving simulations of a large number of HTTP transactions over
 hybrid fiber coax (HFC) indicates that the use of larger initial
 windows decreases the time required to load WWW pages [Nic97].
 A second set of experiments has explored TCP performance over
 dialup modem links. In experiments over a 28.8 Kbps dialup channel
 [All97a, AHO98], a four-segment initial window decreased the
 transfer time of a 16KB file by roughly 10%, with no accompanying
 increase in the drop rate. A particular area of concern has been
 TCP performance over low speed tail circuits (e.g., dialup modem
 links) with routers with small buffers. A simulation study [SP97]
 investigated the effects of using a larger initial window on a host
 connected by a slow modem link and a router with a 3 packet
 buffer. The study concluded that for the scenario investigated,
 the use of larger initial windows was not harmful to TCP
 performance. Questions have been raised concerning the effects of
 larger initial windows on the transfer time for short transfers in
 this environment, but these effects have not been quantified. A
 question has also been raised concerning the possible effect on
 existing TCP connections sharing the link.
7.2 Studies of Networks using Larger Initial Windows
 This section surveys simulations and experiments investigating the
 impact of the larger window on other TCP connections sharing the
 path. Experiments in [All97a, AHO98] show that for 16 KB transfers
 to 100 Internet hosts, four-segment initial windows resulted in a
 small increase in the drop rate of 0.04 segments/transfer. While
 the drop rate increased slightly, the transfer time was reduced by
 roughly 25% for transfers using the four-segment (512 byte MSS)
 initial window when compared to an initial window of one segment.
 One scenario of concern is heavily loaded links. For
 instance, a couple of years ago, one of the trans-Atlantic links
 was so heavily loaded that the correct congestion window size for a
 connection was about one segment. In this environment, new
 connections using larger initial windows would be starting with
 windows that were four times too big. What would the effects be?
 Do connections thrash?
 A simulation study in [PN98] explores the impact of a larger
 initial window on competing network traffic. In this
 investigation, HTTP and FTP flows share a single congested gateway
 (where the number of HTTP and FTP flows varies from one simulation
 set to another). For each simulation set, the paper examines
 aggregate link utilization and packet drop rates, median web page
 delay, and network power for the FTP transfers. The larger initial
 window generally resulted in increased throughput,
 slightly-increased packet drop rates, and an increase in overall
 network power. With the exception of one scenario, the larger
 initial window resulted in an increase in the drop rate of less
 than 1% above the loss rate experienced when using a one-segment
 initial window; in this scenario, the drop rate increased from
 3.5% with one-segment initial windows, to 4.5% with four-segment
 initial windows. The overall conclusions were that increasing the
 TCP initial window to three packets (or 4380 bytes) helps to
 improve perceived performance.
 Morris [Mor97] investigated larger initial windows in a very
 congested network with transfers of size 20K. The loss rate in
 networks where all TCP connections use an initial window of four
 segments is shown to be 1-2% greater than in a network where all
 connections use an initial window of one segment. This
 relationship held in scenarios where the loss rates with
 one-segment initial windows ranged from 1% to 11%. In addition, in
 networks where connections used an initial window of four segments,
 TCP connections spent more time waiting for the retransmit timer
 (RTO) to expire to resend a segment than was spent when using an
 initial window of one segment. The time spent waiting for the RTO
 timer to expire represents idle time when no useful work was being
 accomplished for that connection. These results show that in a
 very congested environment, where each connection's share of the
 bottleneck bandwidth is close to one segment, using a larger
 initial window can cause a perceptible increase in both loss rates
 and retransmit timeouts.
8. Security Considerations
 This document discusses the initial congestion window permitted
 for TCP connections. Changing this value does not raise any known
 new security issues with TCP.
9. Conclusion
 This document proposes a small change to TCP that may be beneficial to
 short-lived TCP connections and those over links with long RTTs
 (saving several RTTs during the initial slow-start phase).
10. Acknowledgments
 We would like to acknowledge Vern Paxson, Tim Shepard, members of
 the End-to-End-Interest Mailing List, and members of the IETF TCP
 Implementation Working Group for continuing discussions of these
 issues and for feedback on this document.
11. References
 [All97a] Mark Allman. An Evaluation of TCP with Larger Initial
 Windows. 40th IETF Meeting -- TCP Implementations WG.
 December, 1997. Washington, DC.
 [AHO98] Mark Allman, Chris Hayes, and Shawn Ostermann, An
 Evaluation of TCP with Larger Initial Windows, March 1998.
 Submitted to ACM Computer Communication Review. URL
 "http://gigahertz.lerc.nasa.gov/~mallman/papers/initwin.ps".
 [All97b] Mark Allman. Improving TCP Performance Over Satellite
 Channels. Master's thesis, Ohio University, June 1997.
 [BLFN96] Tim Berners-Lee, R. Fielding, and H. Nielsen. Hypertext
 Transfer Protocol -- HTTP/1.0, May 1996. RFC 1945.
 [Bra89] Robert Braden. Requirements for Internet Hosts --
 Communication Layers, October 1989. RFC 1122.
 [FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of
 Tahoe, Reno, and SACK TCP. Computer Communication Review,
 26(3), July 1996.
 [FF98] Sally Floyd, Kevin Fall. Promoting the Use of End-to-End
 Congestion Control in the Internet. Submitted to IEEE
 Transactions on Networking. URL
 "http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".
 [FJGFBL97] R. Fielding, Jeffrey C. Mogul, Jim Gettys, H. Frystyk,
 and Tim Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1,
 January 1997. RFC 2068.
 [FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways
 for Congestion Avoidance. IEEE/ACM Transactions on Networking,
 V.1 N.4, August 1993, p. 397-413.
 [Flo94] Floyd, S., TCP and Explicit Congestion Notification.
 Computer Communication Review, 24(5):10-23, October 1994.
 [Flo96] Floyd, S., Issues of TCP with SACK. Technical report, January
 1996. Available from http://www-nrg.ee.lbl.gov/floyd/.
 [Flo97] Floyd, S., Increasing TCP's Initial Window. Viewgraphs,
 40th IETF Meeting - TCP Implementations WG. December, 1997.
 URL "ftp://ftp.ee.lbl.gov/talks/sf-tcp-ietf97.ps".
 [KAGT98] Hans Kruse, Mark Allman, Jim Griner, Diepchi Tran. HTTP
 Page Transfer Rates Over Geo-Stationary Satellite Links. March
 1998. Proceedings of the Sixth International Conference on
 Telecommunication Systems. URL
 "http://gigahertz.lerc.nasa.gov/~mallman/papers/nash98.ps".
 [MD90] Jeffrey C. Mogul and Steve Deering. Path MTU Discovery,
 November 1990. RFC 1191.
 [MMFR96] Matt Mathis, Jamshid Mahdavi, Sally Floyd and Allyn
 Romanow. TCP Selective Acknowledgment Options, October 1996.
 RFC 2018.
 [Mor97] Robert Morris. Private communication, 1997. Cited for
 acknowledgement purposes only.
 [Nic97] Kathleen Nichols. Improving Network Simulation with
 Feedback. Com21, Inc. Technical Report. Available from
 http://www.com21.com/pages/papers/068.pdf.
 [PN98] Poduri, K., and Nichols, K., Simulation Studies of Increased
 Initial TCP Window Size, February 1998. Internet-Draft
 draft-ietf-tcpimpl-poduri-00.txt (work in progress).
 [Pos82] Jon Postel. Simple Mail Transfer Protocol, August 1982.
 RFC 821.
 [RF97] Ramakrishnan, K.K., and Floyd, S., A Proposal to Add Explicit
 Congestion Notification (ECN) to IPv6 and to TCP. Internet-Draft
 draft-kksjf-ecn-00.txt (work in progress). November 1997.
 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
 Requirement Levels", BCP 14, RFC 2119, March 1997.
 [S97] W. Stevens, TCP Slow Start, Congestion Avoidance, Fast
 Retransmit, and Fast Recovery Algorithms. RFC 2001, Proposed
 Standard, January 1997.
 [SP97] Tim Shepard and Craig Partridge. When TCP Starts Up With
 Four Packets Into Only Three Buffers, July 1997. Internet-Draft
 draft-shepard-TCP-4-packets-3-buff-00.txt (work in progress).
12. Authors' Addresses
 Mark Allman
 NASA Lewis Research Center/Sterling Software
 21000 Brookpark Road
 MS 54-2
 Cleveland, OH 44135
 mallman@lerc.nasa.gov
 http://gigahertz.lerc.nasa.gov/~mallman/
 Sally Floyd
 Lawrence Berkeley National Laboratory
 One Cyclotron Road
 Berkeley, CA 94720
 floyd@ee.lbl.gov
 Craig Partridge
 BBN Technologies
 10 Moulton Street
 Cambridge, MA 02138
 craig@bbn.com
13. Appendix - Duplicate Segments
 In the current environment (without Explicit Congestion
 Notification [Flo94] [RF97]), all TCPs use segment drops as
 indications from the network about the limits of available
 bandwidth. We argue here that the change to a larger initial
 window should not result in the sender retransmitting
 a large number of duplicate segments that have already been
 received at the receiver.
 If one segment is dropped from the initial window, there are three
 different ways for TCP to recover: (1) Slow-starting from a window
 of one segment, as is done after a retransmit timeout, or after Fast
 Retransmit in Tahoe TCP; (2) Fast Recovery without selective
 acknowledgments (SACK), as is done after three duplicate ACKs in
 Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the
 sender and the receiver support the SACK option [MMFR96]. In all
 three cases, if a single segment is dropped from the initial window,
 no duplicate segments (i.e., segments that have already been
 received at the receiver) are transmitted. Note that for a
 TCP sending four 512-byte segments in the initial window, a single
 segment drop will not require a retransmit timeout, but can be
 recovered from using the Fast Retransmit algorithm (unless the
 retransmit timer expires prematurely). In addition, a single
 segment dropped from an initial window of three segments might be
 repaired using the fast retransmit algorithm, depending on which
 segment is dropped and whether or not delayed ACKs are used. For
 example, dropping the first segment of a three segment initial
 window will always require waiting for a timeout. However,
 dropping the third segment will always allow recovery via the fast
 retransmit algorithm, as long as no ACKs are lost.
 Next we consider scenarios where the initial window contains
 two to four segments, and at least two of those segments are dropped.
 If all segments in the initial window are dropped, then clearly
 no duplicate segments are retransmitted, as the receiver has not yet
 received any segments. (It is still a possibility that these dropped
 segments used scarce bandwidth on the way to their drop point;
 this issue was discussed in Section 5.)
 When two segments are dropped from an initial window of three
 segments, the sender will only send a duplicate segment if the
 first two of the three segments were dropped, and the sender does
 not receive a packet with the SACK option acknowledging the third
 segment.
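 The three-segment case above can be enumerated in a small Python
 sketch. It encodes only the rule stated in the preceding paragraph
 and does not simulate TCP recovery; the function name and the
 numbering of segments as 1 to 3 are assumptions made for
 illustration.

```python
from itertools import combinations


def sends_duplicate(dropped: tuple, sack_for_third_seen: bool) -> bool:
    """Rule from the text for two drops out of a three-segment window."""
    # A duplicate is sent only if segments 1 and 2 were dropped and the
    # sender never sees a SACK acknowledging segment 3.
    return set(dropped) == {1, 2} and not sack_for_third_seen


if __name__ == "__main__":
    # All ways to drop two of the three segments in the initial window.
    for dropped in combinations((1, 2, 3), 2):
        for sack in (False, True):
            print(dropped, "SACK seen" if sack else "no SACK",
                  sends_duplicate(dropped, sack))
```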
 When two segments are dropped from an initial window of four
 segments, an examination of the six possible scenarios (which we
 don't go through here) shows that, depending on the position of the
 dropped packets, in the absence of SACK the sender might send one
 duplicate segment. There are no scenarios in which the sender
 sends two duplicate segments.
 When three segments are dropped from an initial window of four segments,
 then, in the absence of SACK, it is possible that one duplicate
 segment will be sent, depending on the position of the dropped segments.
 The summary is that in the absence of SACK, there are some
 scenarios with multiple segment drops from the initial window where
 one duplicate segment will be transmitted. There are no scenarios
 where more than one duplicate segment will be transmitted. Our
 conclusion is that the number of duplicate segments transmitted as
 a result of a larger initial window should be small.
14. Full Copyright Statement
 [This section would be filled in with the standard template if
 this document advances to an RFC.]
Document	Document type	This is an older version of an Internet-Draft that was ultimately published as RFC 2414.
Intended RFC status	Experimental