draft-mathis-frag-harmful-00

Network Working Group M. Mathis
Internet-Draft J. Heffner
Expires: January 8, 2005 B. Chandler
 PSC
 July 10, 2004
 Fragmentation Considered Very Harmful
 draft-mathis-frag-harmful-00
Status of this Memo
 By submitting this Internet-Draft, I certify that any applicable
 patent or other IPR claims of which I am aware have been disclosed,
 and any of which I become aware will be disclosed, in accordance with
 RFC 3668.
 Internet-Drafts are working documents of the Internet Engineering
 Task Force (IETF), its areas, and its working groups. Note that other
 groups may also distribute working documents as Internet-Drafts.
 Internet-Drafts are draft documents valid for a maximum of six months
 and may be updated, replaced, or obsoleted by other documents at any
 time. It is inappropriate to use Internet-Drafts as reference
 material or to cite them other than as "work in progress."
 The list of current Internet-Drafts can be accessed at http://
 www.ietf.org/ietf/1id-abstracts.txt.
 The list of Internet-Draft Shadow Directories can be accessed at
 http://www.ietf.org/shadow.html.
 This Internet-Draft will expire on January 8, 2005.
Copyright Notice
 Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract
 IPv4 fragmentation is not sufficiently robust for general use in
 today's Internet. The 16-bit IP identification field is not large
 enough to prevent frequent mis-associated IP fragments, and the TCP
 and UDP checksums are insufficient to prevent the resulting corrupted
 data from being delivered to higher protocol layers. In this note we
 describe some easily reproduced experiments demonstrating the problem
 and estimate the scale of the data corruption in the presence of
 ever-growing data rates.
Mathis, et al. Expires January 8, 2005 [Page 1]

Internet-Draft Fragmentation Considered Very Harmful July 2004
1. Introduction
 The IPv4 header was designed at a time when data rates were several
 orders of magnitude lower than those achievable today. In this
 document, we describe a consequent scale-related failure in the IP
 identification (ID) field, where fragments may be mis-associated at a
 rate likely high enough to invalidate assumptions about data
 integrity failure rates. We also outline scenarios in which data
 corruption may happen reliably and reproducibly.
 While a number of problems with IP fragmentation have been well
 documented [1], the failure described here presents a relatively new
 and serious operational problem, given the severity of the failure
 mode and the fact that it occurs on what is today common
 communications equipment. It is especially pertinent due to the
 recent proliferation of UDP bulk transport tools which do not do MTU
 discovery, and of network equipment which ignores the Don't Fragment
 (DF) bit in the IP header as a work-around for MTU discovery
 problems [2].
2. Wrapping the IP ID Field
 The Internet Protocol standard specifies:
 "The choice of the Identifier for a datagram is based on the need
 to provide a way to uniquely identify the fragments of a
 particular datagram. The protocol module assembling fragments
 judges fragments to belong to the same datagram if they have the
 same source, destination, protocol, and Identifier. Thus, the
 sender must choose the Identifier to be unique for this source,
 destination pair and protocol for the time the datagram (or any
 fragment of it) could be alive in the internet." [3]
 Strict conformance to this standard limits transmissions in one
 direction between any address pair to no more than 65536 datagrams
 per maximum packet lifetime.
 Obviously hosts do not follow the standard so strictly. Assuming a
 maximum packet lifetime on the order of seconds, today it is common
 for host interfaces to send at rates higher than this. For example,
 a host with a 100 Mbps interface sending 1500 byte packets may send
 65536 packets in under 8 seconds.
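 The arithmetic behind this example can be checked with a short
 sketch. The function name is ours, and the sketch assumes that the
 full 1500 bytes of each packet count against the link rate and that
 the sender transmits back to back.

```python
def id_wrap_time_seconds(link_bps: float, packet_bytes: int, id_bits: int = 16) -> float:
    """Seconds needed to send 2**id_bits packets back to back at link rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


if __name__ == "__main__":
    # Illustrative link speeds; only the 100 Mbps case comes from the text.
    for mbps in (10, 100, 1000):
        seconds = id_wrap_time_seconds(mbps * 1e6, 1500)
        print(f"{mbps:5d} Mbps: 65536 packets sent in {seconds:.2f} s")
```

 At 100 Mbps this gives roughly 7.9 seconds, which matches the "under
 8 seconds" figure above.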
 The problem occurs when a fragment is dropped by the network, and a
 later fragment is received that, while part of a different datagram,
 has the same ID value and fragment offset as the dropped fragment.
 The two fragments will be incorrectly spliced together and delivered
 to the layer above IP. It is common that the fragment offset and
 length would match since packets of the same size sent along the same
 path will be fragmented in the same manner. In 65537 datagrams, there
 must be at least two with matching ID fields. If the sender is
 transmitting fast enough that datagrams are sent with duplicate ID
 fields within the reassembly timeout (a suggested value is 15
 seconds [3]), then fragments may be mis-associated.
 The case of particular concern occurs when only the first fragment of
 a datagram is lost by the network. The remaining fragments will be
 stored in the fragment reassembly buffer, and at some point in the
 future a new packet will arrive with the matching ID field. This new
 first fragment will be (incorrectly) matched up with the rest of the
 old packet and delivered to the upper layer. Assuming the fragments
 are delivered in order, the rest of the new datagram will be
 buffered, forming a cycle. One of every 65536 datagrams will be
 incorrectly reassembled by the IP layer. It is possible to have a
 number of simultaneous cycles, bounded by the size of the fragment
 reassembly buffer.
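 The cycle described above can be illustrated with a toy simulation.
 This is a minimal sketch under stated assumptions, not a model of any
 real IP stack: every datagram is split into exactly two fragments,
 fragments arrive in order, the sender increments the ID by one per
 datagram, and the reassembly timeout never fires because the ID field
 wraps first. All names are hypothetical.

```python
ID_SPACE = 65536  # number of distinct values of the 16-bit IP ID field


def simulate_cycle(num_datagrams: int, lost_seq: int = 0) -> list:
    """Return (new_seq, old_seq) pairs for every wrong reassembly.

    Only the first fragment of datagram `lost_seq` is dropped by the network.
    """
    stale_tail = {}  # IP ID -> sequence number of a buffered trailing fragment
    spliced = []
    for seq in range(num_datagrams):
        ip_id = seq % ID_SPACE
        head_available = seq != lost_seq
        if head_available and ip_id in stale_tail:
            # The new first fragment completes the OLD datagram's tail.
            spliced.append((seq, stale_tail.pop(ip_id)))
            head_available = False  # consumed by the wrong reassembly
        if not head_available:
            # This datagram's own tail now waits in the reassembly buffer.
            stale_tail[ip_id] = seq
        # Otherwise head and tail of the same datagram reassemble correctly.
    return spliced


if __name__ == "__main__":
    for new_seq, old_seq in simulate_cycle(3 * ID_SPACE + 1):
        print(f"head of datagram {new_seq} spliced onto tail of datagram {old_seq}")
```

 A single lost first fragment yields one bad reassembly every 65536
 datagrams for as long as the flow continues, since each bad
 reassembly leaves a new orphaned tail behind.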
 Most TCP implementations today participate in MTU discovery [4],
 which will avoid this problem by avoiding fragmentation. However, as
 a work-around for MTU discovery problems [2], some TCP
 implementations and communications gear provide mechanisms to disable
 path MTU discovery by clearing or ignoring the DF bit.
3. Harmful Effects of Mis-associated Fragments
 When the mis-associated fragments are delivered, transport-layer
 checksumming should detect these datagrams as incorrect and discard
 them. When the datagrams are discarded, it could pose a problem for
 loss feedback congestion control algorithms since there will be a
 high number of non-congestion-related losses.
 However, transport checksums may not be designed to handle such high
 error rates, either. The UDP checksum is only 16 bits in length. If
 these checksums follow a uniform random distribution, we expect
 mis-associated datagrams to be accepted by the checksum at a rate of
 one per 65536. With only one mis-association cycle, we expect
 corrupt data to be delivered to the application layer once per 2^32
 datagrams. This rate can be significantly higher with multiple
 cycles.
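 The estimate above is the product of two independent rates, which
 the following sketch makes explicit. It assumes uniformly distributed
 checksums, as the text does, and additionally assumes that the number
 of simultaneous cycles is small compared with 65536 so that the
 mis-association rate scales linearly with it. The datagram rate used
 in the demonstration is an illustrative value, not a measurement.

```python
from fractions import Fraction

ID_SPACE = 2 ** 16        # datagrams per mis-association in a single cycle
CHECKSUM_SPACE = 2 ** 16  # possible values of a 16-bit checksum


def corrupt_delivery_rate(cycles: int = 1) -> Fraction:
    """Fraction of datagrams expected to be delivered corrupted."""
    misassociation_rate = Fraction(cycles, ID_SPACE)
    false_accept_rate = Fraction(1, CHECKSUM_SPACE)
    return misassociation_rate * false_accept_rate


def seconds_between_corruptions(datagrams_per_second: float, cycles: int = 1) -> float:
    """Expected time between corrupt deliveries at a given datagram rate."""
    return 1.0 / (float(corrupt_delivery_rate(cycles)) * datagrams_per_second)


if __name__ == "__main__":
    for cycles in (1, 16, 256):
        rate = corrupt_delivery_rate(cycles)
        days = seconds_between_corruptions(8000.0, cycles) / 86400
        print(f"{cycles:3d} cycle(s): 1 in {rate.denominator // rate.numerator} "
              f"datagrams, about {days:.2f} days apart at 8000 datagrams/s")
```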
 With non-random data, the UDP checksum may be weaker still. It
 is possible to construct datasets where mis-associated fragments will
 always have the same checksum. Such a case may be considered
 unlikely, but is worth considering. "Real" data may be more likely
 than random data to cause checksum hotspots and increase the
 probability of false checksum match [5]. Also, some applications may
 turn off checksumming to increase speed, though this practice has
 been found to be dangerous for other reasons [6].
4. Experimental Results
 To test the practical impact of fragmentation on UDP, we ran a series
 of experiments with a common UDP bulk transport protocol, Reliable
 Blast UDP (RBUDP), part of the QUANTA networking toolkit. It is one
 of the tools used as an alternative to TCP for high-bandwidth
 applications on specialized networks. The choice to use RBUDP has
 very little to do with the protocol itself, as any UDP transport tool
 without extra corruption detection would work equally well.
 In order to diagnose corruption on files transferred with RBUDP, we
 used a file format including embedded sequence numbers and MD5
 checksums. These were placed such that one set was included in each
 fragment of each datagram. Thus it was possible to distinguish
 random corruption from that caused by mis-associated fragments.
 Two types of dataset were used. In the first, all space not used for
 sequence numbers and MD5 checksums was filled with pseudo-random
 data, giving datagrams random checksums. The second was constructed
 in a similar manner except that the upper half of each 32-bit word
 was filled with the 16-bit ones-complement of the lower half. This
 gave each 32-bit word a zero ones-complement sum, so datagrams had
 constant checksums. With these constant checksums, mis-associated
 fragments were guaranteed not to fail the UDP checksum test. Each
 dataset used was 400 MB in size.
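 The construction of the second dataset can be sketched as follows.
 This is an illustration only, not the tool used in the experiments:
 it omits the embedded sequence numbers and MD5 checksums, it covers
 only the payload (the UDP header and pseudo-header, which are also
 summed, are identical for equal-length datagrams of one flow), and
 the function names are ours.

```python
import random
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement of the ones-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def balanced_payload(num_words: int, rng: random.Random) -> bytes:
    """Build 32-bit words whose upper half is the complement of the lower half."""
    out = bytearray()
    for _ in range(num_words):
        low = rng.getrandbits(16)
        high = ~low & 0xFFFF
        out += struct.pack("!HH", high, low)
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(1)
    old = balanced_payload(8, rng)
    new = balanced_payload(8, rng)
    spliced = new[:16] + old[16:]  # head of one datagram, tail of another
    for name, payload in (("old", old), ("new", new), ("spliced", spliced)):
        print(f"{name:8s} checksum = {internet_checksum(payload):#06x}")
```

 Because every 32-bit word sums to zero in ones-complement arithmetic,
 any splice on a word boundary leaves the checksum unchanged, so the
 checksum cannot reveal the mis-association.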
 The RBUDP tools were used to send the datasets between a pair of
 hosts at slightly less than the available data rate. Near the
 beginning of each flow, a brief secondary flow was started to induce
 packet loss in the primary flow. Throughout the life of the primary
 flow, we typically observed mis-association rates on the order of
 0.05%. In datasets with constant checksums, each of these
 mis-associations resulted in corrupted data. In sending datasets
 with random checksums 100 times (for a total of 100 GB), we observed
 one corruption and 41091 bad UDP checksums.
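 As a rough consistency check, the observed counts can be compared
 with the one-per-65536 acceptance rate estimated in Section 3. The
 sketch below assumes that every bad UDP checksum and the one
 corruption were caused by mis-association, and it models undetected
 errors as a Poisson process; both are our simplifying assumptions.

```python
import math

CHECKSUM_SPACE = 2 ** 16  # possible values of a 16-bit checksum


def expected_undetected(misassociated: int) -> float:
    """Expected number of mis-associated datagrams that pass the checksum."""
    return misassociated / CHECKSUM_SPACE


def prob_at_least_one(mean: float) -> float:
    """Poisson probability of seeing one or more events with the given mean."""
    return 1.0 - math.exp(-mean)


if __name__ == "__main__":
    bad_checksums = 41091      # from the experiment above
    observed_corruptions = 1   # from the experiment above
    mean = expected_undetected(bad_checksums + observed_corruptions)
    print(f"expected undetected corruptions: {mean:.2f}")
    print(f"probability of at least one:     {prob_at_least_one(mean):.2f}")
```

 Under these assumptions the expected count is below one, so observing
 a single corruption is in line with the estimate.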
5. Remedies
 IPv6 is less vulnerable to this type of problem, since its fragment
 header contains a 32-bit identification field [7]. Mis-association
 will only be a problem at packet rates 65536 times higher than for
 IPv4.
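 The factor of 65536 follows directly from the field widths. The
 sketch below computes the datagram rate at which the ID field wraps
 within a reassembly timeout; the function name is ours, and applying
 the 15 second value suggested in [3] to both protocols is an
 assumption made only to keep the comparison like for like.

```python
def wrap_rate_pps(id_bits: int, reassembly_timeout_s: float) -> float:
    """Datagrams per second at which the ID field wraps within the timeout."""
    return (2 ** id_bits) / reassembly_timeout_s


if __name__ == "__main__":
    timeout_s = 15.0  # suggested value in [3]; assumed here for both cases
    ipv4 = wrap_rate_pps(16, timeout_s)
    ipv6 = wrap_rate_pps(32, timeout_s)
    print(f"16-bit ID wraps within the timeout above {ipv4:,.0f} datagrams/s")
    print(f"32-bit ID wraps within the timeout above {ipv6:,.0f} datagrams/s")
    print(f"ratio: {ipv6 / ipv4:,.0f}")
```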
 Since mis-association of fragments will only occur when the IP ID
 field is wrapped within the fragment reassembly timeout, it is
 possible to reduce the timeout so that this situation is less likely
 to occur. Since the timeout is set by the receiving host while the
 IP ID field is set by the sending host, it is not generally possible
 to set the timeout low enough so that a fast sender's fragments will
 not be mis-associated, yet high enough so that a slow sender's
 fragments will not be unconditionally discarded before it is possible
 to reassemble them. It is not within the scope of this document to
 recommend timeout values.
 Another means of solving the corruption issue is to add stronger
 integrity checking, which can be done at any layer above IP. This is
 a natural side effect of using cryptographic authentication. At the
 network layer, if IPsec AH [9] is in use, the mis-associated fragments
 should be discarded with extremely high probability. Other higher
 layers may use longer checksums (for example, SCTP's is 32 bits in
 length [8]) or cryptographic authentication (SSH message
 authentication codes [10]). While stronger integrity checking may
 prevent data corruption, it will not solve the problem of a high
 effective loss rate.
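 As an illustration of integrity checking above IP, an application
 could append a message authentication code to each datagram payload.
 The sketch below uses HMAC-SHA256 from the Python standard library;
 the framing, key handling, and function names are hypothetical and
 are not taken from any of the protocols cited above.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag


def protect(key: bytes, payload: bytes) -> bytes:
    """Return the payload followed by its authentication tag."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, message: bytes) -> Optional[bytes]:
    """Return the payload if the tag matches, otherwise None."""
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


if __name__ == "__main__":
    key = b"demo key, not for real use"
    old = protect(key, b"A" * 64)
    new = protect(key, b"B" * 64)
    spliced = new[:48] + old[48:]  # new first fragment, stale later fragment
    print("intact datagram accepted: ", verify(key, new) is not None)
    print("spliced datagram accepted:", verify(key, spliced) is not None)
```

 The spliced datagram is rejected, but it is still lost, which is why
 this approach does not reduce the effective loss rate.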
6. Security Considerations
 If a malicious entity knows that a pair of hosts are communicating
 using a fragmented stream, this may present an opportunity for the
 entity to corrupt the flow. By sending "high" fragments (those with
 offset greater than zero) with a forged source address, the attacker
 can deliberately cause corruption as described above. Exploiting
 this vulnerability requires only knowledge of the source and
 destination addresses of the flow, and fragment boundaries. It does
 not require knowledge of port or sequence numbers.
 If the attacker has visibility of packets on the path, the attack
 profile is similar to injecting full segments. Using this attack
 makes blind disruptions easier, and could certainly be used
 effectively to cause denial of service. However, only streams using
 IPv4 fragmentation are vulnerable. Because of the nature of the
 problems outlined in this draft, the use of IPv4 fragmentation for
 critical applications may not be advisable regardless of security
 concerns.
7. References
 [1] Kent, C. and J. Mogul, "Fragmentation considered harmful",
 Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.
 [2] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923,
 September 2000.
 [3] Postel, J., "Internet Protocol", STD 5, RFC 791, September
 1981.
 [4] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
 November 1990.
 [5] Stone, J., Greenwald, M., Partridge, C. and J. Hughes,
 "Performance of Checksums and CRC's over Real Data", IEEE/ACM
 Transactions on Networking vol. 6, No. 5, October 1998.
 [6] Stone, J. and C. Partridge, "When The CRC and TCP Checksum
 Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4, October 2000.
 [7] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
 Specification", RFC 2460, December 1998.
 [8] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer,
 H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson,
 "Stream Control Transmission Protocol", RFC 2960, October 2000.
 [9] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402,
 November 1998.
 [10] Ylonen, T. and C. Lonvick, "SSH Transport Layer Protocol",
 draft-ietf-secsh-transport-18 (work in progress), June 2004.
 [11] Clark, D., "IP datagram reassembly algorithms", RFC 815, July
 1982.
Authors' Addresses
 Matt Mathis
 Pittsburgh Supercomputing Center
 4400 Fifth Avenue
 Pittsburgh, PA 15213
 US
 Phone: 412-268-3319
 EMail: mathis@psc.edu
 John W. Heffner
 Pittsburgh Supercomputing Center
 4400 Fifth Avenue
 Pittsburgh, PA 15213
 US
 Phone: 412-268-2329
 EMail: jheffner@psc.edu
 Ben Chandler
 Pittsburgh Supercomputing Center
 4400 Fifth Avenue
 Pittsburgh, PA 15213
 US
 Phone: 412-268-9783
 EMail: bchandle@psc.edu
Appendix A. Support
 This work was supported by the National Science Foundation under
 Grant No. 0083285.
Intellectual Property Statement
 The IETF takes no position regarding the validity or scope of any
 Intellectual Property Rights or other rights that might be claimed to
 pertain to the implementation or use of the technology described in
 this document or the extent to which any license under such rights
 might or might not be available; nor does it represent that it has
 made any independent effort to identify any such rights. Information
 on the IETF's procedures with respect to rights in IETF Documents can
 be found in BCP 78 and BCP 79.
 Copies of IPR disclosures made to the IETF Secretariat and any
 assurances of licenses to be made available, or the result of an
 attempt made to obtain a general license or permission for the use of
 such proprietary rights by implementers or users of this
 specification can be obtained from the IETF on-line IPR repository at
 http://www.ietf.org/ipr.
 The IETF invites any interested party to bring to its attention any
 copyrights, patents or patent applications, or other proprietary
 rights that may cover technology that may be required to implement
 this standard. Please address the information to the IETF at
 ietf-ipr@ietf.org.
Disclaimer of Validity
 This document and the information contained herein are provided on an
 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
 Copyright (C) The Internet Society (2004). This document is subject
 to the rights, licenses and restrictions contained in BCP 78, and
 except as set forth therein, the authors retain all their rights.
Acknowledgment
 Funding for the RFC Editor function is currently provided by the
 Internet Society.