Hurricane Electric's IPv6 Tunnel Broker Forums

Tunnelbroker.net Specific Topics => Questions & Answers => Topic started by: bicknell on January 26, 2012, 06:18:14 PM

Title: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 26, 2012, 06:18:14 PM
I installed a new home router today (an Apple Time Capsule, if you care) and as part of that work ran some tests on my network connectivity.  One of the tools I used was the ICSI Netalyzer, available at http://netalyzr.icsi.berkeley.edu/.

It detected an IPv6 MTU issue:

  IPv6 Path MTU (?): Warning

  Your system can not send or receive fragmented traffic over IPv6. The 
  path between our system and your network has an MTU of 1480 bytes. The
  bottleneck is at IP address 2001:470:0:90::2. The path between our   
  system and your network does not appear to handle fragmented IPv6     
  traffic properly.

They also have a link with more information on this particular error: http://n2.netalyzr.icsi.berkeley.edu/info_ipv6_mtu.html.

Now, that IPv6 address is the native-side interface of the HE tunnel server my tunnel terminates on in Ashburn.

I think there are two possibilities:

1) The tunnel MTU is smaller than 1500, so the HE tunnel server should return an ICMPv6 Packet Too Big, but perhaps those are not being generated, are being filtered, or are being rate limited because too many oversized packets show up during my test.

2) Fragments are correctly sent down the tunnel, but the Time Capsule is dropping them due to some Apple bug.

I tried to figure out a fast way to test this, but couldn't come up with anything super-easy, and I couldn't tell whether it is a known issue (with either HE or the Apple AirPort/Time Capsules).  Anyone?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 26, 2012, 11:25:18 PM
The problem doesn't have to be at one of the tunnel endpoints. It could be on the path between them. Also, it is not clear in which direction there is a problem. It may be that too large packets are dropped in only one direction, and in the other direction every packet will either make it through or result in an ICMPv6 packet being returned. Since you haven't mentioned what your IP address is, I cannot investigate the problem myself.

You should try to ping the tunnel server with various packet sizes and run tcpdump to figure out what is going on. Do this with both IPv4 and IPv6. Find out what the path MTU appears to be in each case, and what happens when it is exceeded.
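
For reference, here is a rough sketch of that kind of probe in Python (the socket option values are the Linux ones from <linux/in6.h>, and the target address is a placeholder; it only illustrates the approach, it is not a polished tool):

import errno
import socket

# Linux-specific option values; other platforms differ.
IPV6_MTU_DISCOVER = 23
IPV6_PMTUDISC_DO = 2
IPV6_MTU = 24

def probe_path_mtu(host, port=33434, lo=1280, hi=1500):
    """Send UDP datagrams of increasing size with per-packet PMTU discovery
    enabled and report the largest packet size the kernel still accepts."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IPV6, IPV6_MTU_DISCOVER, IPV6_PMTUDISC_DO)
    s.connect((host, port))
    largest = None
    for size in range(lo, hi + 1, 4):
        payload = b'x' * (size - 40 - 8)   # 40-byte IPv6 header + 8-byte UDP header
        try:
            s.send(payload)
            largest = size
        except OSError as e:
            if e.errno == errno.EMSGSIZE:  # kernel has cached a smaller path MTU
                break
            raise
    kernel_mtu = s.getsockopt(socket.IPPROTO_IPV6, IPV6_MTU)
    return largest, kernel_mtu

print(probe_path_mtu('2001:db8::1'))       # replace with the tunnel server address

Note that a single oversized send can still succeed because the Packet Too Big comes back asynchronously, so a real probe would retry with a short delay; running tcpdump alongside shows whether the ICMPv6 messages actually arrive.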
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: cholzhauer on January 27, 2012, 05:51:44 AM
I ran it too, just to see what results I got.  They're along the same lines as yours:


Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1280 bytes. The path between our system and your network has an MTU of 1480 bytes. The bottleneck is at IP address 2001:470:0:6e::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 27, 2012, 07:00:51 AM
I got the test running as well. This is what I got:
Quote
Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1472 bytes. The path between our system and your network has an MTU of 1480 bytes. The bottleneck is at IP address 2001:470:0:69::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.

I tried to disable the HE tunnel and run with just 6to4. Then I got this:
Quote
Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1472 bytes. The path between our system and your network has an MTU of 1450 bytes. The bottleneck is at IP address 2001:1900:5:1::229. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.

And one really strange thing happened when I was running just 6to4, I got this message:
Quote
Your host, NAT, or firewall acts as a DNS server or proxy. Requests sent to this server are eventually processed by 216.66.80.90.
This is probably a bug in your NAT's firmware, and represents a minor security vulnerability.

That last message was without using the HE tunnel, and with resolv.conf listing only my ISP's DNS servers. I have no idea where it got that IP from. The test generates too much traffic for me to be bothered with going through it all. I don't experience any MTU-related problems, so I am not sure what this test is picking up.

Maybe if there is a test site somewhere that has native IPv6 and can ping me with a packet size that I choose, I could figure out a bit more.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 27, 2012, 07:17:25 PM
Are both of you running Apple routers (Apple Extreme Base Stations or Time Capsules)?

I'd like to get a couple of people with non-Apple products to try it: if it occurs for them as well, that would point more towards HE being the problem, and if it is Apple products only, that would point to Apple's implementation.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: cholzhauer on January 27, 2012, 07:26:32 PM
Nope, FreeBSD 8.2 router
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on January 27, 2012, 07:34:37 PM
I remember getting similar results while on NATIVE IPv6 in the HE NOC. Running it from my Ubuntu 11.10 laptop at home, via a tunnel terminating on a DIR-825, I get:

IPv6 Path MTU (?): Warning
Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1480 bytes. The path between our system and your network has an MTU of 1480 bytes. The bottleneck is at IP address 2001:470:20::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.


Are they saying 1480 is bad? Because I'd love to know of a tunnel that gives you 1500. Also, I love that the "bottleneck" is at an anycasted IP. Actually, thinking about it, I probably forgot to set preferred_lft 0 on the anycast IP on tserv29. Maybe whoever took over will double-check it.

109.551 test-42| Message: mtu 1488 64
109.552 test-42| UDP socket at 2001:470:67:22f:224:2cff:feaa:6383:46451
109.732 test-42| Got datagram of 29 bytes.
109.732 test-42| Responsive failure
109.732 test-42| Response is bad 1488 2001:470:20::2 1480
109.732 test-42| Works: 1476
109.732 test-42| Fails: 1488
109.732 test-42| At:    1482
109.732 test-42| Message: mtu 1482 64
109.733 test-42| UDP socket at 2001:470:67:22f:224:2cff:feaa:6383:45177
109.903 test-42| Got datagram of 29 bytes.
109.903 test-42| Responsive failure
109.904 test-42| Response is bad 1482 2001:470:20::2 1480
109.904 test-42| Works: 1476
109.904 test-42| Fails: 1482
109.904 test-42| At:    1479
109.904 test-42| Message: mtu 1479 64
109.905 test-42| UDP socket at 2001:470:67:22f:224:2cff:feaa:6383:56287
110.004 test-42| Got datagram of 1024 bytes.
110.005 test-42| Success
110.005 test-42| Works: 1479
110.005 test-42| Fails: 1482
110.005 test-42| At:    1480
110.005 test-42| Message: mtu 1480 64
110.005 test-42| UDP socket at 2001:470:67:22f:224:2cff:feaa:6383:35993
110.103 test-42| Got datagram of 1024 bytes.
110.103 test-42| Success
110.103 test-42| Works: 1480
110.103 test-42| Fails: 1482
110.103 test-42| At:    1481
110.103 test-42| Message: mtu 1481 64
110.104 test-42| UDP socket at 2001:470:67:22f:224:2cff:feaa:6383:36066
110.276 test-42| Got datagram of 29 bytes.
110.277 test-42| Responsive failure
110.277 test-42| Response is bad 1481 2001:470:20::2 1480
110.277 test-42| Final MTU is 1480


So yeah, it's "bad" because it's limited to 1480?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: cholzhauer on January 27, 2012, 08:03:50 PM
I just tried from home with my tunnel hosted on a D-Link DIR-615:


Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1480 bytes. The path between our system and your network has an MTU of 1480 bytes. The bottleneck is at IP address 2001:470:0:5d::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 29, 2012, 06:42:12 AM
Quote from: broquea on January 27, 2012, 07:34:37 PM
So yeah, its "bad" because it's limited to 1480?

Just to make my example clearer, assume a tunnel can move a 1480-byte packet max, and the "Internet" is 1500-byte clean.  I realize those are not always true, but I want a simple example to get us all on the same page.

The source system sends a 1490-byte IPv6 packet.  It should make it to the tunnel end point (e.g. one of HE's tunnel broker servers), which realizes the 1490-byte packet can't fit down the 1480-byte tunnel.  That router should send back an ICMPv6 "Packet Too Big" message to the source and drop the packet.  Note this is different from IPv4: an IPv4 router would fragment the packet itself and forward the pieces; in IPv6 that does not happen.  The source system gets the ICMPv6 "Packet Too Big" and is thus supposed to split the original 1490-byte packet into two parts and send a packet plus a fragment on to the destination.  These both now fit, and make it down the tunnel, where the receiver reassembles the packet.
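
To put numbers on that example, here is a quick back-of-the-envelope sketch in Python of the fragmentation arithmetic the source performs for the 1490-byte packet over a 1480-byte path (standard IPv6 and Fragment header sizes; it is just the arithmetic, not real packet handling):

def ipv6_fragment_sizes(total_len, path_mtu):
    """On-wire sizes of the fragments a source produces for an IPv6 packet of
    total_len bytes when the path MTU is path_mtu bytes."""
    IPV6_HDR, FRAG_HDR = 40, 8
    payload = total_len - IPV6_HDR                      # upper-layer data to split
    per_frag = (path_mtu - IPV6_HDR - FRAG_HDR) & ~7    # multiple of 8, except the last
    sizes = []
    while payload > 0:
        chunk = min(per_frag, payload)
        sizes.append(IPV6_HDR + FRAG_HDR + chunk)
        payload -= chunk
    return sizes

print(ipv6_fragment_sizes(1490, 1480))   # -> [1480, 66]: one full-size fragment plus a small tail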

Your log now makes me think I've run into this problem before.  With TCP this all works nicely on all operating systems, because a TCP sender has to buffer the data until an acknowledgement is received.  With UDP, well, that's more interesting.  I believe there are operating systems that fire off the UDP packet and don't buffer the packet at all (as this was the IPv4 behavior), and thus when the Packet-Too-Big comes back there is no data to retransmit.  Also, this packet too big should create a PMTU entry (on operating systems with path MTU tracking) allowing subsequent packets to simply be fragmented at the source.

I would strongly hope the Netalyzer folks are using servers that do everything right; I mean, I assume their suite of tests passes 100% when they run it locally.  I think what we need to do is capture a tcpdump from both ends of a tunnel while running the test, and then look at the packets.

Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 29, 2012, 08:24:40 AM
I'm going to post a bit more info here:

Client side of my failed test:

Quote
163.903    main|
163.903    main| Running test 42: checkMTUV6
163.903    main| ----------------------------
163.913 test-42| Testing the ability to send a large UDP packet (2000 bytes) over IPv6
163.913 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.913 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53348
163.914 test-42| Got exception java.net.SocketException: Message too long on UDP test
163.914 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.914 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53349
163.966 test-42| Got exception java.net.PortUnreachableException: ICMP Port Unreachable on UDP test
163.966 test-42| Can't send UDP fragments
163.966 test-42| Testing the ability to receive a large UDP packet (2000 bytes) over IPv6
163.966 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.966 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53350
164.023 test-42| Got exception java.net.PortUnreachableException: ICMP Port Unreachable on UDP test
164.023 test-42| Can't receive UDP fragments
164.023 test-42| Attempting to send a packet with
164.023 test-42| fragmentation of 2009 bytes
164.023 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
165.024 test-42| No data received.
165.024 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
166.025 test-42| No data received.
166.025 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
167.025 test-42| No data received.
167.025 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
168.026 test-42| No data received.
168.027 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
169.028 test-42| No data received.
169.028 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
170.028 test-42| No data received.
170.029 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
171.029 test-42| No data received.
171.029 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
172.030 test-42| No data received.
172.030 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
173.032 test-42| No data received.
173.032 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
174.033 test-42| No data received.
174.033 test-42| No reply back
174.033 test-42| Now looking for the receive MTU. Trying 1500 first
174.033 test-42| MSG: mtu 1500 64
174.033 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53352
174.095 test-42| Got datagram of 31 bytes.
174.095 test-42| Response is bad 1500 2001:470:0:90::2 1480
174.095 test-42| Path MTU is <1500B
174.095 test-42| Beginning binary search to find the path MTU
174.095 test-42| Works: 0
174.095 test-42| Fails: 1500
174.095 test-42| At:    750
174.095 test-42| Message: mtu 750 64
174.095 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53353
174.148 test-42| Got datagram of 702 bytes.
174.148 test-42| Success
174.148 test-42| Works: 750
174.148 test-42| Fails: 1500
174.148 test-42| At:    1125
174.148 test-42| Message: mtu 1125 64
174.148 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53354
174.203 test-42| Got datagram of 1024 bytes.
174.203 test-42| Success
174.203 test-42| Works: 1125
174.203 test-42| Fails: 1500
174.203 test-42| At:    1312
174.203 test-42| Message: mtu 1312 64
174.203 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53355
174.252 test-42| Got datagram of 1024 bytes.
174.252 test-42| Success
174.252 test-42| Works: 1312
174.252 test-42| Fails: 1500
174.252 test-42| At:    1406
174.252 test-42| Message: mtu 1406 64
174.253 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53356
174.301 test-42| Got datagram of 1024 bytes.
174.301 test-42| Success
174.301 test-42| Works: 1406
174.301 test-42| Fails: 1500
174.301 test-42| At:    1453
174.301 test-42| Message: mtu 1453 64
174.302 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53357
174.353 test-42| Got datagram of 1024 bytes.
174.354 test-42| Success
174.354 test-42| Works: 1453
174.354 test-42| Fails: 1500
174.354 test-42| At:    1476
174.354 test-42| Message: mtu 1476 64
174.354 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53358
174.408 test-42| Got datagram of 1024 bytes.
174.408 test-42| Success
174.408 test-42| Works: 1476
174.408 test-42| Fails: 1500
174.408 test-42| At:    1488
174.408 test-42| Message: mtu 1488 64
174.408 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53359
174.468 test-42| Got datagram of 31 bytes.
174.468 test-42| Responsive failure
174.468 test-42| Response is bad 1488 2001:470:0:90::2 1480
174.468 test-42| Works: 1476
174.468 test-42| Fails: 1488
174.468 test-42| At:    1482
174.468 test-42| Message: mtu 1482 64
174.469 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53360
174.528 test-42| Got datagram of 31 bytes.
174.528 test-42| Responsive failure
174.529 test-42| Response is bad 1482 2001:470:0:90::2 1480
174.529 test-42| Works: 1476
174.529 test-42| Fails: 1482
174.529 test-42| At:    1479
174.529 test-42| Message: mtu 1479 64
174.529 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53361
174.580 test-42| Got datagram of 1024 bytes.
174.580 test-42| Success
174.580 test-42| Works: 1479
174.580 test-42| Fails: 1482
174.580 test-42| At:    1480
174.580 test-42| Message: mtu 1480 64
174.580 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53362
174.628 test-42| Got datagram of 1024 bytes.
174.628 test-42| Success
174.628 test-42| Works: 1480
174.628 test-42| Fails: 1482
174.628 test-42| At:    1481
174.628 test-42| Message: mtu 1481 64
174.629 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53363
174.689 test-42| Got datagram of 31 bytes.
174.689 test-42| Responsive failure
174.689 test-42| Response is bad 1481 2001:470:0:90::2 1480
174.689 test-42| Final MTU is 1480

Wireshark Capture of the box running the client:

Quote

    230 174.728043  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      838    Source port: 53349  Destination port: eye2eye
    231 174.731075  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 TCP      86     sentinelsrm > 57537 [ACK] Seq=43 Ack=2 Win=5712 Len=0 TSval=87108372 TSecr=1298810232
    232 174.779641  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 ICMPv6   1294   Destination Unreachable (Port unreachable)
    233 174.780117  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      78     Source port: 53350  Destination port: eye2eye
    234 174.836455  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 ICMPv6   126    Destination Unreachable (Port unreachable)
    235 174.837125  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x451f2d01)
    236 174.837131  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    241 175.838153  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x22a0eb7a)
    242 175.838159  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    243 176.838659  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x39ec621f)
    244 176.838665  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    245 177.839024  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x24e7bdfe)
    246 177.839031  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    249 178.259659  2001:470:e07d:1:c62c:3ff:fe2c:78e0 2001:470:e07d:1::1    ICMPv6   149    Destination Unreachable (Port unreachable)
    253 178.261531  2001:470:e07d:1:c62c:3ff:fe2c:78e0 2001:470:e07d:1::1    ICMPv6   149    Destination Unreachable (Port unreachable)
    254 178.261558  2001:470:e07d:1:c62c:3ff:fe2c:78e0 2001:470:e07d:1::1    ICMPv6   149    Destination Unreachable (Port unreachable)
    255 178.261569  2001:470:e07d:1:c62c:3ff:fe2c:78e0 2001:470:e07d:1::1    ICMPv6   149    Destination Unreachable (Port unreachable)
    257 178.261591  2001:470:e07d:1:c62c:3ff:fe2c:78e0 2001:470:e07d:1::1    ICMPv6   149    Destination Unreachable (Port unreachable)
    258 178.840290  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x29ffd029)
    259 178.840296  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    260 179.841590  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x3d394354)
    261 179.841596  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    264 180.842345  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x77c2fbc1)
    265 180.842362  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    266 181.842854  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x2fa68044)
    267 181.842860  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    268 182.844119  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x3a598b70)
    269 182.844126  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    272 183.845585  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x61c80d48)
    273 183.845592  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    274 184.846852  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53352  Destination port: bcs-lmserver
    275 184.908422  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53352
    276 184.909101  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      72     Source port: 53353  Destination port: bcs-lmserver
    277 184.961213  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      764    Source port: bcs-lmserver  Destination port: 53353
    278 184.961756  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53354  Destination port: bcs-lmserver
    279 185.016423  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1139   Source port: bcs-lmserver  Destination port: 53354
    280 185.016967  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53355  Destination port: bcs-lmserver
    281 185.065661  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1326   Source port: bcs-lmserver  Destination port: 53355
    282 185.066323  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53356  Destination port: bcs-lmserver
    283 185.114773  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1420   Source port: bcs-lmserver  Destination port: 53356
    284 185.115388  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53357  Destination port: bcs-lmserver
    285 185.167136  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1467   Source port: bcs-lmserver  Destination port: 53357
    286 185.167749  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53358  Destination port: bcs-lmserver
    287 185.221257  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1490   Source port: bcs-lmserver  Destination port: 53358
    288 185.221967  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53359  Destination port: bcs-lmserver
    289 185.281867  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53359
    290 185.282656  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53360  Destination port: bcs-lmserver
    291 185.342016  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53360
    292 185.342637  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53361  Destination port: bcs-lmserver
    293 185.393482  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1493   Source port: bcs-lmserver  Destination port: 53361
    294 185.393965  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53362  Destination port: bcs-lmserver
    295 185.441881  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1494   Source port: bcs-lmserver  Destination port: 53362
    296 185.442379  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53363  Destination port: bcs-lmserver
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 29, 2012, 09:11:59 AM
Now, with that data, I'm looking at what is going on.

The first set of tests sends UDP packets > 1500 bytes from my client to the ICSI server.


163.913 test-42| Testing the ability to send a large UDP packet (2000 bytes) over IPv6


It then tries to send from source ports 53348-53350 to destination port 1948 (eye2eye in /etc/services):


163.913 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.913 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53348
163.914 test-42| Got exception java.net.SocketException: Message too long on UDP test
163.914 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.914 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53349
163.966 test-42| Got exception java.net.PortUnreachableException: ICMP Port Unreachable on UDP test
163.966 test-42| Can't send UDP fragments
163.966 test-42| Testing the ability to receive a large UDP packet (2000 bytes) over IPv6
163.966 test-42| Sending UDP request to ipv6-node.u14369.n3.netalyzr.icsi.berkeley.edu on port 1948
163.966 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53350
164.023 test-42| Got exception java.net.PortUnreachableException: ICMP Port Unreachable on UDP test
164.023 test-42| Can't receive UDP fragments


"Got exception java.net.SocketException: Message too long on UDP test" makes me think there is a local OS/Java error/limit.  I also find it interesting the ICSI server is returning port unreachable on these test packets.

Let's look at the tcpdump for that traffic:


    226 174.727246  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1510   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x36cd5178)
    227 174.727256  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     622    IPv6 fragment (nxt=UDP (0x11) off=1448 id=0x36cd5178)
    228 174.727648  2001:470:e07d:1::1    2001:470:e07d:1:a833:b6aa:c711:ffa1 ICMPv6   1294   Packet Too Big
    229 174.728036  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x1bd690be)
    230 174.728043  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      838    Source port: 53349  Destination port: eye2eye
    232 174.779641  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 ICMPv6   1294   Destination Unreachable (Port unreachable)
    233 174.780117  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      78     Source port: 53350  Destination port: eye2eye
    234 174.836455  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 ICMPv6   126    Destination Unreachable (Port unreachable)
    235 174.837125  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x451f2d01)
    236 174.837131  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    241 175.838153  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x22a0eb7a)
    242 175.838159  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    243 176.838659  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x39ec621f)
    244 176.838665  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    245 177.839024  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x24e7bdfe)
    246 177.839031  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver


I can't find the source port 53348 packet, but it appears we pick up what might be three fragments from it.  We also see a Packet Too Big message in the middle.  The port 53349-53351 packets all appear to go out as a packet plus a fragment, but it appears the ICSI server never receives them.  This looks like fragments are being filtered on the path from me to ICSI for some reason.  I'll note the fragments are 1294 bytes, and I have no way to check the tunnel MTU in the Apple Time Capsule.  Could the Time Capsule be sending tunnel packets larger than HE will accept?

Next up, attempts to send from port 53351 to the ICSI server.


164.023 test-42| Attempting to send a packet with
164.023 test-42| fragmentation of 2009 bytes
164.023 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
165.024 test-42| No data received.
165.024 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
166.025 test-42| No data received.
166.025 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
167.025 test-42| No data received.
167.025 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
168.026 test-42| No data received.
168.027 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
169.028 test-42| No data received.
169.028 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
170.028 test-42| No data received.
170.029 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
171.029 test-42| No data received.
171.029 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
172.030 test-42| No data received.
172.030 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
173.032 test-42| No data received.
173.032 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53351
174.033 test-42| No data received.
174.033 test-42| No reply back


And the packet capture:


    235 174.837125  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x451f2d01)
    236 174.837131  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    241 175.838153  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x22a0eb7a)
    242 175.838159  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    243 176.838659  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x39ec621f)
    244 176.838665  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    245 177.839024  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x24e7bdfe)
    246 177.839031  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    258 178.840290  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x29ffd029)
    259 178.840296  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    260 179.841590  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x3d394354)
    261 179.841596  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    264 180.842345  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x77c2fbc1)
    265 180.842362  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    266 181.842854  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x2fa68044)
    267 181.842860  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    268 182.844119  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x3a598b70)
    269 182.844126  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver
    272 183.845585  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      IPv6     1294   IPv6 fragment (nxt=UDP (0x11) off=0 id=0x61c80d48)
    273 183.845592  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      847    Source port: 53351  Destination port: bcs-lmserver


Now, this is more interesting to me.  The packets leave my machine, but apparently never make it to the ICSI box.  It's possible just the fragments don't make it, which prevents reassembly, or it could be that none of these packets make it.  Same question as before: could the Time Capsule be sending tunnel packets larger than HE will accept?

Now we get to the testing from the ICSI server to my client:


174.033 test-42| Now looking for the receive MTU. Trying 1500 first
174.033 test-42| MSG: mtu 1500 64
174.033 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53352
174.095 test-42| Got datagram of 31 bytes.
174.095 test-42| Response is bad 1500 2001:470:0:90::2 1480
174.095 test-42| Path MTU is <1500B
174.095 test-42| Beginning binary search to find the path MTU
174.095 test-42| Works: 0
174.095 test-42| Fails: 1500
174.095 test-42| At:    750
174.095 test-42| Message: mtu 750 64
174.095 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53353
174.148 test-42| Got datagram of 702 bytes.
174.148 test-42| Success
174.148 test-42| Works: 750
174.148 test-42| Fails: 1500
174.148 test-42| At:    1125
174.148 test-42| Message: mtu 1125 64
174.148 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53354
174.203 test-42| Got datagram of 1024 bytes.
174.203 test-42| Success
174.203 test-42| Works: 1125
174.203 test-42| Fails: 1500
174.203 test-42| At:    1312
174.203 test-42| Message: mtu 1312 64
174.203 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53355
174.252 test-42| Got datagram of 1024 bytes.
174.252 test-42| Success
174.252 test-42| Works: 1312
174.252 test-42| Fails: 1500
174.252 test-42| At:    1406
174.252 test-42| Message: mtu 1406 64
174.253 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53356
174.301 test-42| Got datagram of 1024 bytes.
174.301 test-42| Success
174.301 test-42| Works: 1406
174.301 test-42| Fails: 1500
174.301 test-42| At:    1453
174.301 test-42| Message: mtu 1453 64
174.302 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53357
174.353 test-42| Got datagram of 1024 bytes.
174.354 test-42| Success
174.354 test-42| Works: 1453
174.354 test-42| Fails: 1500
174.354 test-42| At:    1476
174.354 test-42| Message: mtu 1476 64
174.354 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53358
174.408 test-42| Got datagram of 1024 bytes.
174.408 test-42| Success
174.408 test-42| Works: 1476
174.408 test-42| Fails: 1500
174.408 test-42| At:    1488
174.408 test-42| Message: mtu 1488 64
174.408 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53359
174.468 test-42| Got datagram of 31 bytes.
174.468 test-42| Responsive failure
174.468 test-42| Response is bad 1488 2001:470:0:90::2 1480
174.468 test-42| Works: 1476
174.468 test-42| Fails: 1488
174.468 test-42| At:    1482
174.468 test-42| Message: mtu 1482 64
174.469 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53360
174.528 test-42| Got datagram of 31 bytes.
174.528 test-42| Responsive failure
174.529 test-42| Response is bad 1482 2001:470:0:90::2 1480
174.529 test-42| Works: 1476
174.529 test-42| Fails: 1482
174.529 test-42| At:    1479
174.529 test-42| Message: mtu 1479 64
174.529 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53361
174.580 test-42| Got datagram of 1024 bytes.
174.580 test-42| Success
174.580 test-42| Works: 1479
174.580 test-42| Fails: 1482
174.580 test-42| At:    1480
174.580 test-42| Message: mtu 1480 64
174.580 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53362
174.628 test-42| Got datagram of 1024 bytes.
174.628 test-42| Success
174.628 test-42| Works: 1480
174.628 test-42| Fails: 1482
174.628 test-42| At:    1481
174.628 test-42| Message: mtu 1481 64
174.629 test-42| UDP socket at 2001:470:e07d:1:a833:b6aa:c711:ffa1%0:53363
174.689 test-42| Got datagram of 31 bytes.
174.689 test-42| Responsive failure
174.689 test-42| Response is bad 1481 2001:470:0:90::2 1480
174.689 test-42| Final MTU is 1480


And the tcpdump:

   274 184.846852  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53352  Destination port: bcs-lmserver
    275 184.908422  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53352
    276 184.909101  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      72     Source port: 53353  Destination port: bcs-lmserver
    277 184.961213  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      764    Source port: bcs-lmserver  Destination port: 53353
    278 184.961756  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53354  Destination port: bcs-lmserver
    279 185.016423  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1139   Source port: bcs-lmserver  Destination port: 53354
    280 185.016967  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53355  Destination port: bcs-lmserver
    281 185.065661  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1326   Source port: bcs-lmserver  Destination port: 53355
    282 185.066323  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53356  Destination port: bcs-lmserver
    283 185.114773  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1420   Source port: bcs-lmserver  Destination port: 53356
    284 185.115388  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53357  Destination port: bcs-lmserver
    285 185.167136  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1467   Source port: bcs-lmserver  Destination port: 53357
    286 185.167749  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53358  Destination port: bcs-lmserver
    287 185.221257  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1490   Source port: bcs-lmserver  Destination port: 53358
    288 185.221967  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53359  Destination port: bcs-lmserver
    289 185.281867  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53359
    290 185.282656  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53360  Destination port: bcs-lmserver
    291 185.342016  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53360
    292 185.342637  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53361  Destination port: bcs-lmserver
    293 185.393482  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1493   Source port: bcs-lmserver  Destination port: 53361
    294 185.393965  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53362  Destination port: bcs-lmserver
    295 185.441881  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      1494   Source port: bcs-lmserver  Destination port: 53362
    296 185.442379  2001:470:e07d:1:a833:b6aa:c711:ffa1 2607:f740:b::f93      UDP      73     Source port: 53363  Destination port: bcs-lmserver
    297 185.502727  2607:f740:b::f93      2001:470:e07d:1:a833:b6aa:c711:ffa1 UDP      93     Source port: bcs-lmserver  Destination port: 53363


We see no fragments making it from ICSI to my system, illustrating the problem.

I tried a couple of other things.  I've been running all of these tests on an OS X Lion client.  I tried disabling the firewall on the box to be sure it wasn't something with that; no change in the results.  I then decided to try a Windows 7 box on the same LAN to get a different version of Java and different OS quirks going.  It ran into issues with the Windows firewall (generating a few pop-ups), so I disabled the firewall and tried again.  Same result.  At least it's a consistent result across OS X and Windows.

One last test: I turned off "Block incoming IPv6 connections" on my Time Capsule, basically disabling the firewall on that device.  This resulted in a slightly different message from the Netalyzr scan:

Quote
Your system can not send or receive fragmented traffic over IPv6.
The path between your network and our system supports an MTU of at least 1280 bytes. The path between our system and your network has an MTU of 1480 bytes. The bottleneck is at IP address 2001:470:0:90::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.

Bottom line?  It does appear that I cannot receive any IPv6 fragments over my tunnel/router, and that fragmented IPv6 packets I send out do not make it to their destination.

What I think would be an awesome next step is if someone from HE could run a tcpdump on the tunnel broker server for all packets to/from my IPv6 address range while I run the test, and then provide that to me.  I could then compare how things look on the other side of the tunnel, and probably have a better idea what's going on.  I now wonder if there are two different problems:

1) Mismatched tunnel MTU.  Neither the Apple device nor HE allows me to set it (or even view it!), so I think this is a strong possibility.

2) One or more devices in the middle is filtering IPv6 Fragments.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kcochran on January 29, 2012, 09:27:15 AM
We're 1480 on the tunnel interfaces.  I'll see if we can get an MTU option in the interface somewhere.  Options would likely be 1480, 1472, and 1280, unless anyone can think of any other useful common values.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 29, 2012, 12:11:10 PM
Did some more testing, this time with iPerf.

I can send 1432-byte UDP packets from a host on the Internet across my tunnel to my home box.  As soon as I go to a 1433-byte packet, I receive nothing at home.

The 1432-byte iperf traffic, seen from the server end, looks like:


11:48:57.223794 IP6 (flowlabel 0xd1a2a, hlim 64, next-header UDP (17) payload length: 1440) ussenterprise.ufp.org.63767 > 2001:470:e07d:1:21d:7dff:fea3:66ae.8010: [udp sum ok] UDP, length 1432
11:48:57.234794 IP6 (flowlabel 0xd1a2a, hlim 64, next-header UDP (17) payload length: 1440) ussenterprise.ufp.org.63767 > 2001:470:e07d:1:21d:7dff:fea3:66ae.8010: [udp sum ok] UDP, length 1432
11:48:57.244795 IP6 (flowlabel 0xd1a2a, hlim 64, next-header UDP (17) payload length: 1440) ussenterprise.ufp.org.63767 > 2001:470:e07d:1:21d:7dff:fea3:66ae.8010: [udp sum ok] UDP, length 1432
11:48:57.255795 IP6 (flowlabel 0xd1a2a, hlim 64, next-header UDP (17) payload length: 1440) ussenterprise.ufp.org.63767 > 2001:470:e07d:1:21d:7dff:fea3:66ae.8010: [udp sum ok] UDP, length 1432


The 1433-byte traffic:



11:49:39.392331 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 1440) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xc142ccd0:0|1432) 59968 > 8010: UDP, length 1433
11:49:39.392334 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 17) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xc142ccd0:1432|9)
11:49:39.643317 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 1440) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xc7f16651:0|1432) 59968 > 8010: UDP, length 1433
11:49:39.643320 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 17) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xc7f16651:1432|9)
11:49:39.894314 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 1440) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xf6422ca8:0|1432) 59968 > 8010: UDP, length 1433
11:49:39.894316 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 17) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xf6422ca8:1432|9)
11:49:40.145311 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 1440) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xee742ee6:0|1432) 59968 > 8010: UDP, length 1433
11:49:40.145314 IP6 (flowlabel 0xe004b, hlim 64, next-header Fragment (44) payload length: 17) ussenterprise.ufp.org > 2001:470:e07d:1:21d:7dff:fea3:66ae: frag (0xee742ee6:1432|9)


Neither fragment makes it to me.

Now, in the other direction, from home, across the tunnel, to a box on the Internet, the MTU is different.  1232 is the largest packet that makes it without fragmentation (seen from the Internet host end):


11:52:23.052771 IP6 (flowlabel 0x644bd, hlim 56, next-header UDP (17) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae.64362 > ussenterprise.ufp.org.8010: [udp sum ok] UDP, length 1232
11:52:23.060641 IP6 (flowlabel 0x644bd, hlim 56, next-header UDP (17) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae.64362 > ussenterprise.ufp.org.8010: [udp sum ok] UDP, length 1232
11:52:23.070635 IP6 (flowlabel 0x644bd, hlim 56, next-header UDP (17) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae.64362 > ussenterprise.ufp.org.8010: [udp sum ok] UDP, length 1232
11:52:23.082502 IP6 (flowlabel 0x644bd, hlim 56, next-header UDP (17) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae.64362 > ussenterprise.ufp.org.8010: [udp sum ok] UDP, length 1232
11:52:23.089872 IP6 (flowlabel 0x644bd, hlim 56, next-header UDP (17) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae.64362 > ussenterprise.ufp.org.8010: [udp sum ok] UDP, length 1232


Bumping up to 1233, we see the packets get fragmented and make it out:


11:52:52.782560 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xe48c9858:0|1232) 44552 > 8010: UDP, length 1233
11:52:52.786058 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 17) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xe48c9858:1232|9)
11:52:52.792804 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xf29e788f:0|1232) 44552 > 8010: UDP, length 1233
11:52:52.797676 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 17) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xf29e788f:1232|9)
11:52:52.807295 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xa93258d6:0|1232) 44552 > 8010: UDP, length 1233
11:52:52.807298 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 17) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xa93258d6:1232|9)
11:52:52.812541 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 1240) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xfd13fade:0|1232) 44552 > 8010: UDP, length 1233
11:52:52.815789 IP6 (flowlabel 0x0e4cc, hlim 56, next-header Fragment (44) payload length: 17) 2001:470:e07d:1:21d:7dff:fea3:66ae > ussenterprise.ufp.org: frag (0xfd13fade:1232|9)


From this, I reach the following conclusions:



Something in the path from the Internet to my home server is filtering all fragmented IPv6 packets.  Figuring it is very unlikely to be a router in the middle, I'm left with the same two suspect locations.  Most likely, the Apple Time Capsule is refusing to process any incoming fragmented packets.  Less likely, but still possible, is that HE has some configuration on their tunnel broker servers to drop all fragments (possibly in an attempt to protect the boxes from DDoS or similar) that is inadvertently killing all of my fragments as well.

HE folks, now that I've narrowed it down and can reproduce it with iperf, could you run a tcpdump on the tunnel server to see if my fragmented packets are making it to the tunnel server and/or being sent out down the tunnel?  If so, it's got to be the Time Capsule; if not, we'll have a different direction to go in.

Title: MTU Size - was Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: snarked on January 29, 2012, 12:47:39 PM
How about 9k (9216 bytes) if HE has any paths into its tunnel servers which support jumbo frames?

Although my current path to HE does not (my colo provider is still at 1500), I do set my interfaces to 9216 and let PMTU discovery find whatever lower MTU is supported.  I've had no connectivity problems doing this.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 29, 2012, 01:38:47 PM
Quote from: bicknell on January 29, 2012, 09:11:59 AM
It does appear that I cannot receive any IPv6 fragments over my tunnel/router
While running an unrelated test I noticed that I was receiving fragmented IPv6 packets over the tunnel from HE. I was using the test on http://test-ipv6.com/, and it was sending packets that were too large to make it through the tunnel. PMTU discovery did work as intended, though the server behaved in a way that surprised me a bit. The TCP segment that had triggered the ICMP response was fragmented, and I received a fragmented TCP packet through the tunnel. I would have expected TCP to split the data into smaller segments, but that was not what happened. Later TCP traffic did, however, use smaller segments.

So I know fragmented packets can make it through the tunnel.

Based on the results so far, the error message from netalyzer shows up with different software on the client host, and it shows up both when using the HE tunnel and when using 6to4. It seems the only thing still in common between all the cases where the error message is seen is netalyzer itself. And even the transcript from netalyzer suggests that the proper ICMPv6 packet to indicate that fragmentation is needed was sent and processed. Notice how the transcript lists the MTU value from the ICMPv6 packet.

Though that MTU value shows up in the transcript, it does appear that it is completely ignored from that point forward. It does a binary search for the MTU and ignores what it was told by the routers.

Is there still any reason to think this is not a flaw in netalyzer?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 29, 2012, 01:40:56 PM
Quote from: kcochran on January 29, 2012, 09:27:15 AM
Options would likely be 1480, 1472, and 1280, unless anyone can think of any other useful common values.
Is it a lot easier to implement a fixed set of values than it is to implement an input field where any value from 1280 to 1480 could be entered?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 29, 2012, 03:14:26 PM
Quote from: kasperd on January 29, 2012, 01:38:47 PM
Is there still any reason to think this is not a flaw in netalyzer?

I duplicated the problem using UDP iPerf between two hosts I control.  No netalyzer involved.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kcochran on January 29, 2012, 07:57:44 PM
Quote from: kasperd on January 29, 2012, 01:40:56 PM
Quote from: kcochran on January 29, 2012, 09:27:15 AM
Options would likely be 1480, 1472, and 1280, unless anyone can think of any other useful common values.
Is it a lot easier to implement a fixed set of values than it is to implement an input field where any value from 1280 to 1480 could be entered?

It's about the same either way, but there are a few sweet spots that are the most useful.  So it's really a choice between: leaving it completely open-ended, where people really have to know why they may need to change it in order to change it to a useful value; offering just the most useful values; or doing both and hoping it doesn't confuse and/or break people.  I also know we'd find several MTUs of 1337 if it were free-form.  ;-)
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: frim on January 29, 2012, 11:20:52 PM
I'm having the same issues, also with an AirPort Express as the tunnel endpoint. Netalyzr gives:

Your system can not send or receive fragmented traffic over IPv6. The path between our system and your network has an MTU of 1381 bytes. The bottleneck is at IP address 2001:470:0:7d::2. The path between our system and your network does not appear to handle fragmented IPv6 traffic properly.

The interesting part is that these issues only started about a week ago. Before that, everything worked fine for months.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 29, 2012, 11:22:02 PM
Quote from: kcochran on January 29, 2012, 07:57:44 PM
I also know we'd find several MTUs of 1337 if it were free-form.
You'd probably need to round it down to a multiple of 8. But "mtu-=mtu%8" isn't hard to implement :-)
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on January 30, 2012, 06:12:17 AM
Quote from: kcochran on January 29, 2012, 07:57:44 PM
It's about the same either way, but there are a few sweet spots that are the most useful.  So it's really a choice between: leaving it completely open-ended, and people really have to know why they may need to change it in order to change it to a useful value; giving the most useful values; or do both, and hope it doesn't confuse and/or break people.  I also know we'd find several MTUs of 1337 if it were free-form.  ;-)

I would pick a few common values, and document why they should be used:

9000* - 6in4 over a jumbo-frame capable network.
4450* - 6in4 over a 4470 byte MTU network (Packet Over SONET)
1480  - 6in4 over a 1500 byte MTU network (FIOS, Cable Modem)
1472  - 6in4 over PPPoE over a 1500 byte network (DSL)
1280  - IPv6 Minimum MTU (Should work everywhere)

* Note, these values require your ISP to have a jumbo frame clean path to Hurricane Electric, which generally means private peering with Hurricane and configuring the peering for jumbo frames.

Then, if you want to score bonus points, create a small tool/applet that tests the IPv4 path between the tunnel endpoint and the tunnel broker server to determine the largest IPv4 packet that can pass without fragmentation, and then make a recommendation to the user.
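
As a very rough sketch of what I mean, something along these lines (Python, Linux socket option values, placeholder address) could probe the IPv4 path MTU toward the tunnel server and suggest a 6in4 MTU; a real tool would obviously need more polish:

import socket
import time

# Linux-specific option values from <linux/in.h>.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2
IP_MTU = 14

def suggest_tunnel_mtu(tunnel_server_v4, port=33434):
    """Probe the IPv4 path MTU to the tunnel server and subtract the 20-byte
    IPv4 header that 6in4 encapsulation adds."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect((tunnel_server_v4, port))
    for _ in range(3):                     # ICMP answers arrive asynchronously
        try:
            s.send(b'x' * 1472)            # 1472 + 20 IP + 8 UDP = a 1500-byte packet
        except OSError:
            pass                           # EMSGSIZE once a smaller PMTU is cached
        time.sleep(0.2)
    ipv4_pmtu = s.getsockopt(socket.IPPROTO_IP, IP_MTU)
    return ipv4_pmtu - 20

print(suggest_tunnel_mtu('192.0.2.1'))     # placeholder; use your tunnel server's IPv4 address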
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 30, 2012, 07:53:08 AM
Quote from: bicknell on January 30, 2012, 06:12:17 AM
if you want to score bonus points, create a small tool/applet that tests the IPv4 path between the tunnel endpoint and the tunnel broker server to determine the largest IPv4 packet that can pass without fragmentation, and then make a recommendation to the user.
As far as I could tell, that already happens. It just happens behind the scenes without you even noticing. But I'd need somebody to double-check to be sure I'm interpreting my observations correctly.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on January 30, 2012, 09:24:05 AM
It recommends the closest broker based on routing. He is talking about recommending an MTU. The MTU selection thing came up a bunch and got shot down an equal amount. Maybe this time it'll stick. Personally, three options are enough: 1480 (which the tservs default to now), 1472 (for PPPoE) and 1280 minimum. If you are on a network that is letting you get jumbo frame capabilities on the WAN, then you are on a network that should already be providing you native IPv6, IMO.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 30, 2012, 10:36:34 AM
Quote from: broquea on January 30, 2012, 09:24:05 AMHe is talking about recommending an MTU.
I just did the same test over again, and what I saw was that if the IPv6 in IPv4 packet from the tunnel server to the user results in an ICMP message indicating that the IPv4 packet needs fragmentation, then the tunnel server will use that information to lower the IPv6 MTU of the tunnel temporarily. I don't know for how long the tunnel server keeps the lower MTU on the tunnel.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on January 30, 2012, 10:57:08 AM
Quote from: kasperd on January 30, 2012, 10:36:34 AMI don't know for how long the tunnel server keeps the lower MTU on the tunnel.
I just tried to time it. After the tunnel server received the ICMP message it lowered the MTU of the tunnel for 150 seconds. After those had passed it increased the MTU back to the default value.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: snarked on January 30, 2012, 11:13:02 AM
QuoteIf you are on a network that is letting you get jumbo frame capabilities on WAN, then you are on a network that should already be providing you native IPv6 IMO.

Just because jumbo frames require gigabit or faster Ethernet (e.g. 1000BASE-T or fiber), that in no way implies that IPv6 is supported in such hardware.  I consider these characteristics orthogonal and thus unrelated.

I have seen plenty of things (e.g. VoIP phones) that are being produced today which are IPv4 only.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: igorybema on February 04, 2012, 12:45:42 PM
Hi,
I see the same message from Netalyzr; however, I'm not experiencing any other problems with IPv6. My guess is that the Netalyzr test is broken/incorrect.
The iperf test with UDP and a 1433-byte message size is not valid, as PMTU detection will not work for iperf when you choose 1433 as the message size. That will not fit into the 1432 bytes of UDP payload available on a 1480 MTU tunnel. UDP does not allow segmentation, and because you told iperf to use 1433, it will stay at 1433 bytes.

All tests I did showed that PMTU discovery is working correctly with HE's tunnelbroker IPv6 connections. It must be that the Netalyzr test is incorrect. Hopefully they will also respond to this thread.

regards, Igor
using OpenWrt as home router
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on February 08, 2012, 01:25:37 PM
Quote from: igorybema on February 04, 2012, 12:45:42 PM
The iperf test with UDP and a 1433-byte message size is not valid, as PMTU detection will not work for iperf when you choose 1433 as the message size. That will not fit into the 1432 bytes of UDP payload available on a 1480 MTU tunnel. UDP does not allow segmentation, and because you told iperf to use 1433, it will stay at 1433 bytes.

I don't think you're right on the packet size issue.  I can send 1600-byte packets between two hosts with 1500-byte MTUs on clean IPv6 connections.  If your theory were correct, that wouldn't work either.

Quote from: igorybema on February 04, 2012, 12:45:42 PM
All tests I did showed that PMTU discovery is working correctly with HE's tunnelbroker IPv6 connections. It must be that the Netalyzr test is incorrect. Hopefully they will also respond to this thread.

I think the Apple Time Capsule is dropping all IPv6 fragments inbound on the tunnel as a security policy.  I have opened a bug with Apple to that effect, and will report back on where that goes if they get back to me.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 08, 2012, 01:54:08 PM
Quote from: bicknell on February 08, 2012, 01:25:37 PMI don't think you're right on the packet size issue.  I can send 1600-byte packets between two hosts with 1500-byte MTUs on clean IPv6 connections.  If your theory were correct, that wouldn't work either.
Fragmentation is permitted on the sending host regardless of which upper layer protocol is used. As long as there is no later hop with a smaller MTU than the first hop, it should work just fine.

The problem is only when a later hop has a smaller MTU. For TCP the best approach is to just hand the fragmentation needed info from the IP layer to the TCP layer and let TCP segment differently. Though I have seen cases where the TCP segment that triggered the fragmentation needed message would get retransmitted using fragmentation, but later packets would be segmented by the TCP layer.

I don't know exactly what is supposed to happen for UDP. Having the stack on the sending host buffer the UDP packet and retransmit in case of a fragmentation needed message doesn't sound like what you would expect from UDP. And pushing the requirement of dealing with fragmentation to the application layer isn't good either. Failing to implement either of those approaches will just lead to the application layer having to deal with a lost packet, which it is supposed to be capable of anyway. But always having a timeout for the very first packet isn't a great solution.

Quote from: bicknell on February 08, 2012, 01:25:37 PMI think the Apple Time Capsule is dropping all IPv6 fragments inbound on the tunnel as a security policy.  I have opened a bug with Apple to that effect, and will report back on where that goes if they get back to me.
I saw the problem without any Apple equipment. I'm pretty sure whatever the problem is, does not lie on my end of the tunnel.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on February 08, 2012, 08:29:02 PM
Quote from: kasperd on February 08, 2012, 01:54:08 PM
I don't know exactly what is supposed to happen for UDP. Having the stack on the sending host buffer the UDP packet and retransmit in case of a fragmentation needed message doesn't sound like what you would expect from UDP. And pushing the requirement of dealing with fragmentation to the application layer isn't good either. Failing to implement either of those approaches will just lead to the application layer having to deal with a lost packet, which it is supposed to be capable of anyway. But always having a timeout for the very first packet isn't a great solution.

My understanding (which I admit may be wrong) is that the current state of the art in Linux and FreeBSD is that the first packet is lost.  That is, the first UDP packet goes out, gets dropped, and generates a Packet Too Big.  The sender then caches the new MTU and uses that for sending additional UDP packets.  The first packet, and any already in flight, are dropped and the application must resend.

I believe this is why there is a lot of discussion about how UDP and PMTU discovery don't work for transactional services in IPv6, for instance DNS's typical send-one, get-one operation.  Indeed, I believe best current operational practice for DNS over IPv6 is to send only 1280 byte UDP frames. :(

Still, with iperf this should show up as a few lost packets at the start and then a successful run for the remainder of the test, worst case.

I need to find someone who's actually written code in an IPv6 stack and ask them for more details.

Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 09, 2012, 04:15:23 AM
Quote from: bicknell on February 08, 2012, 08:29:02 PMI believe best current operational practice for DNS over IPv6 is to send only 1280 byte UDP frames.
Then again, I don't think it is completely agreed upon how to get past the 512 byte limit that DNS was designed with.

The DNS request is unlikely to go past 512 bytes in the first place (it is actually not that easy to construct a valid DNS request larger than 512 bytes). The reply can easily be larger than 512 bytes, and it used to be that you were supposed to switch to TCP when that happened.

The client can include an EDNS option specifying that it is capable of receiving larger packets, and then the server can send a reply larger than 512 bytes, and even larger than 1280 bytes. But some consider this a problem, because it could be used in amplification attacks.
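
To make that concrete, the knob on the client side is just dig's EDNS buffer size advertisement. The names below are placeholders rather than a real test domain; the point is only the +bufsize flag:

# advertise a 4096-byte EDNS buffer: a large TXT answer comes back as one big
# UDP packet, which may need IPv6 fragmentation on the way to you
dig +bufsize=4096 TXT big-record.example.com @ns.example.com
# advertise 1280 instead: the answer either fits unfragmented or comes back
# truncated (TC bit set), in which case the client retries over TCP
dig +bufsize=1280 TXT big-record.example.com @ns.example.com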

With all the things that need to go on to ensure that DNS packets larger than 512 bytes are permitted, you'd think that PMTU discovery could be completed by the time you are ready to send a large packet. Unfortunately, PMTU discovery is a bit less efficient than I'd like it to be. Why do you have to waste bandwidth sending a maximum-sized packet just to find the MTU? Potentially you'll need to send multiple large packets before you find the actual MTU.

I think a better option would have been a hop-by-hop option that contains the MTU seen so far and is lowered by each router along the path if it cannot handle as large a packet as that option indicates. Then you'd just need a small packet going back and forth to find the PMTU. Unfortunately, introducing such an option now is of limited value, because it would have to be supported by most routers in order to really help.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kcochran on February 16, 2012, 11:21:33 AM
Ok, MTU adjustment is live.

Currently supporting three options: 1480, 1472 and 1280.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: Quill on February 16, 2012, 09:38:46 PM
Quote from: kcochran on February 16, 2012, 11:21:33 AM
Ok, MTU adjustment is live.

Currently supporting three options: 1480, 1472 and 1280.


Excuse my ignorance but I'm not really sure what this does. I have an MTU of 1380, do these settings have any meaning for me?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 18, 2012, 12:19:17 AM
Quote from: Quill on February 16, 2012, 09:38:46 PMI have an MTU of 1380, do these settings have any meaning for me?
How did you manage to figure out your MTU without knowing what it means? If your actual MTU is 1380, then the best choice from the list of options is 1280. I assume 1480 is still the default. If you stick with the default, then a small number of packets will get dropped on the way from the Internet to your computer.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: Quill on February 18, 2012, 01:46:36 AM
Quote from: kasperd on February 18, 2012, 12:19:17 AM
Quote from: Quill on February 16, 2012, 09:38:46 PMI have an MTU of 1380, do these settings have any meaning for me?
How did you manage to figure out your MTU without knowing what it means? If your actual MTU is 1380, then the best choice from the list of options is 1280. I assume 1480 is still the default. If you stick with the default, then a small number of packets will get dropped on the way from the Internet to your computer.

I understand what MTU is, and as for determining mine, I used tracepath. What I don't understand is what kcochran posted and how that information relates to my MTU setting.
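
For reference, this is roughly what I ran (the destination here is just an arbitrary IPv6 host; any will do, and on newer systems the same thing is spelled tracepath -6):

tracepath6 -n ipv6.google.com

The "pmtu" value it prints along the way is the path MTU discovered so far; the final value it reports is how I arrived at 1380.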
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: snarked on February 18, 2012, 03:46:24 PM
1280:  Minimum IPv6 MTU per the RFCs.
1480:  Standard MTU for 6in4 encapsulation over a 1500-byte IPv4 Ethernet path (1500 minus the 20-byte IPv4 header).
1472:  Similar to 1480, except the IPv4 path runs over PPPoE, which costs another 8 bytes.

That's what they mean and how they're derived.
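
Whichever of those you select on HE's side, it's worth setting the same value on your own tunnel interface so both ends agree. A minimal sketch (the interface names are just common examples; yours may differ):

# Linux (iproute2)
ip link set dev he-ipv6 mtu 1472
# FreeBSD / OS X
ifconfig gif0 mtu 1472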
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: Quill on February 18, 2012, 07:49:45 PM
Quote from: snarked on February 18, 2012, 03:46:24 PM
1280:  Minimum IPv6 MTU per the RFCs.
1480:  Standard MTU for 6in4 encapsulation over a 1500-byte IPv4 Ethernet path (1500 minus the 20-byte IPv4 header).
1472:  Similar to 1480, except the IPv4 path runs over PPPoE, which costs another 8 bytes.

That's what they mean and how they're derived.

Thanks for the reply; I am, however, familiar with the individual MTUs listed. What I'm trying to ascertain is what they mean to me with an MTU of 1380. As far as I can see there's no difference; if that's the case, what relevance do the new settings have?

Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 19, 2012, 01:43:07 AM
Quote from: Quill on February 18, 2012, 07:49:45 PMWhat I'm trying to ascertain is what they mean to me with an MTU of 1380.
If your exact value is not on the list, you want to use the largest one among those that are no larger than your actual MTU. So first you eliminate 1472 and 1480 because those are larger than 1380. Then you take the largest among those that remain.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: Quill on February 19, 2012, 03:15:47 AM
Quote from: kasperd on February 19, 2012, 01:43:07 AM
Quote from: Quill on February 18, 2012, 07:49:45 PMWhat I'm trying to ascertain is what they mean to me with an MTU of 1380.
If your exact value is not on the list, you want to use the largest one among those that are no larger than your actual MTU. So first you eliminate 1472 and 1480 because those are larger than 1380. Then you take the largest among those that remain.

Thanks for the reply. So you're basically suggesting I should manually lower my MTU to 1280, even though there's no apparent change in the way my tunnel is behaving and no apparent fragmentation?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on February 19, 2012, 05:40:13 AM
If you aren't finding that you experience MTU issues with how the MTU is configured on HE's side, then no, you don't need to touch a thing. If, however, you do feel you are experiencing MTU-related issues, then yes, you can try changing the MTU on HE's side to see if that resolves the issue.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 19, 2012, 06:19:02 AM
Quote from: broquea on February 19, 2012, 05:40:13 AMIf you aren't finding that you experience MTU issues with how the MTU is configured on HE's side, then no, you don't need to touch a thing.
Correct. As I indicated earlier, HE had auto detection of the MTU already before the new option was introduced. The possible values were limited to the range 1280-1480 bytes. As long as PMTU was working on the IPv4 connection from the tunnel server to your router, you shouldn't experience a problem. The autodetection would cause a single packet drop once every few minutes.

I haven't tested what the new option does, but I assume it changes the upper value of the range. If PMTU is not working on the IPv4 path from the tunnel server to your router, then you can lower the setting. For example a setting of 1472 should then autodetect in the range from 1280-1472, and a setting of 1280 would make the range 1280-1280 effectively turning off the autodetection.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on February 19, 2012, 12:18:37 PM
Quote from: kasperdCorrect. As I indicated earlier, HE had auto detection of the MTU already before the new option was introduced

There is no MTU auto-detection; never has been. The MTU configured on HE's tunnel interface has always been 1480, and nothing else, aside from maybe 3-4 settings manually changed to 1472 on request. I don't know where the system would have suggested otherwise.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 19, 2012, 03:02:35 PM
Quote from: broquea on February 19, 2012, 12:18:37 PMThere is no MTU auto-detection; never has been.
There most certainly is. I don't know for how long it existed. I learned about the existence of that feature about three weeks ago.

See this example:
ping6 -n -s 1424 2001:470:28:940::1
PING 2001:470:28:940::1(2001:470:28:940::1) 1424 data bytes
From 2001:470:0:11e::2 icmp_seq=1 Packet too big: mtu=1464
From 2001:470:28:940:1c82:31b3:813c:2e14 icmp_seq=2 Destination unreachable: No route
From 2001:470:28:940:1c82:31b3:813c:2e14 icmp_seq=3 Destination unreachable: No route
I got the tunnel server to report an MTU of 1464 bytes. That is not one of the configurable choices, but I was able to get the tunnel server to use that MTU by making use of the auto detection.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on February 19, 2012, 03:10:37 PM
After running this broker from 2007 until 2012, I can promise you that the tunnel interfaces weren't configured with anything other than 1480. They did not dynamically change to some other number on the interface.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 19, 2012, 03:48:09 PM
Quote from: broquea on February 19, 2012, 03:10:37 PMThey did not dynamically change to some other number on the interface.
The tunnel server on 216.66.80.90 does change the MTU of my tunnel dynamically. I have looked at the traffic with tcpdump, and there is no doubt that the tunnel server is dynamically changing the MTU.

I think you have been looking at a number that does not mean what you think it means.

Anybody who has two computers on different public IPv4 addresses can create two tunnels and verify the existence of that feature. Once an ICMP message indicating an IPv4 MTU of 1484 bytes has been sent to 216.66.80.90, there will be ICMPv6 packets from 2001:470:0:11e::2 indicating an IPv6 MTU of 1464 bytes (1484 minus the 20-byte IPv4 encapsulation header).
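
For reference, the captures behind this claim are nothing fancy. Something along these lines is enough (the interface names are placeholders for whatever your system uses):

# outer side: the encapsulated 6in4 traffic (IP protocol 41) plus any IPv4 ICMP errors
tcpdump -n -i eth0 'ip proto 41 or icmp'
# inner side: watch the tunnel interface for ICMPv6, including the Packet Too Big messages
tcpdump -n -i he-ipv6 icmp6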
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on February 19, 2012, 06:45:46 PM
I'm referring to the physical/tunnel interface setting, not the values that packets can get transported at via PMTUD. If you end up with packets transported at a 1464 MTU, the tunnel's interface setting isn't changed to 1464; it remains 1480. You are talking packets, not interfaces.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on February 19, 2012, 11:56:08 PM
Quote from: broquea on February 19, 2012, 06:45:46 PMYou are talking packets, not interfaces.
In the end, all that matters to the users is what packets the tunnel server is sending. What terminology is used on the tunnel server is not important. In my example, the tunnel server was using an MTU of 1464 when processing the network traffic.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on February 27, 2013, 01:43:04 PM
I opened a ticket on this with Apple back when I originally posted here.  Today I received an update from Apple:

Firmware version 7.6.3 has been released officially, please retest.


I'm actually up and running on an entirely different home gateway right now, so I'll have to both upgrade and put it back in service to test.  Apple seems to think they have fixed something related to this, so perhaps fragments will pass properly now.

I'll try and report back by this weekend.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bimmerdriver on March 12, 2013, 10:55:05 PM
I ran this test as well and got the same result as the original poster:


The MTU on my router (Sophos UTM) is 1500. My internet connection is VDSL2, not using PPPoE or PPPoA.

The test also gave this result:


Is there anything I can or should do about either of these issues?
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on March 12, 2013, 11:11:43 PM
Your tunnel's MTU is 1480, which is the maximum for this tunnel service.

200ms slower rendering an image in a browser? Spotify tuned their system to start playing music within 285ms from pressing play because that is just above 250ms which is when the human brain processes sound as instantaneous (cool NPR story from years ago). I'd say an extra 200ms to render something will not be noticed.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bimmerdriver on March 13, 2013, 12:14:18 AM
Quote from: broquea on March 12, 2013, 11:11:43 PM
Your tunnel's MTU is 1480, which is the maximum for this tunnel service.

200ms slower rendering an image in a browser? Spotify tuned their system to start playing music within 285 seconds from pressing play because that is just above 250ms which is when the human brain processes sound as instantaneous (cool NPR story from years ago). I'd say an extra 200ms to render something will not be noticed.
Thanks very much.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 13, 2013, 08:33:46 AM
Quote from: broquea on March 12, 2013, 11:11:43 PMSpotify tuned their system to start playing music within 285 seconds from pressing play
I assume you meant 285 ms.

Quote from: broquea on March 12, 2013, 11:11:43 PMbecause that is just above 250ms which is when the human brain processes sound as instantaneous (cool NPR story from years ago).
When working on a sound processing project at university, we got some advice from a professional sound technician, who told us the threshold is 20ms.

When I am watching a DVD I can clearly feel something is wrong if the sound is offset by 100ms. But I can't always pinpoint the direction in which the sound is off.

Quote from: broquea on March 12, 2013, 11:11:43 PMI'd say an extra 200ms to render something will not be noticed.
My former colleagues would ridicule you for making such a statement. There we were working with deadlines around 200ms to complete the rendering of a webpage, including the time it took to download all resources needed to render the page.

200ms may not be enough to consciously notice that there was a delay. But subconsciously the users will notice the difference, and it will change their perception of the overall quality of the service.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: broquea on March 13, 2013, 08:50:38 AM
Quote from: kasperd on March 13, 2013, 08:33:46 AMI assume you meant 285 ms.
oops, I did, lemme go fix that

QuoteWhen working on a sound processing project at university, we got some advice from a professional sound technician, who told us the threshold is 20ms.

When I am watching a DVD I can clearly feel something is wrong if the sound is offset by 100ms. But I can't always pinpoint the direction in which the sound is off.
Neat. I find that VLC lets you tweak the audio delay, and I have definitely noticed some weirdness on a few videos out there, but it's almost always something absurd like 400-700ms off, which is generally far more obvious.

QuoteMy former colleagues would ridicule you for making such a statement. There we were working with deadlines around 200ms to complete the rendering of a webpage, including the time it took to download all resources needed to render the page.

200ms may not be enough to consciously notice that there was a delay. But subconsciously the users will notice the difference, and it will change their perception of the overall quality of the service.
It also probably depends on the age of the user :) A younger person firing on all neurons should notice it, but as we get older and more addled, everything will probably feel like it takes forever!
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 20, 2013, 05:15:41 PM
I finally was able to upgrade to firmware 7.6.3 and rerun my tests.  Same result, so I updated the Apple bug ticket with that information.  I'm going to stay running on my Airport for the time being in case I need to retest again soon.

Whatever is going on here, I don't think it's fixed yet.  I'm going to e-mail the Netalyzr folks and point them to this thread as well.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 25, 2013, 01:44:10 PM
I've been working with the Netalyzr folks on some testing, and we've already found some interesting details.  Nothing specific to report back yet, but I think we have a complex interaction between various bits of infrastructure that are all working "in spec" but not in a way each likes.

I do want to point out one thing I found which is a bit of a surprise to me.  It turns out most (all?) of the Linux kernels output UDP fragments in _reverse_ order.  That is, say you had a 3500 byte packet to transmit which would become 1500 byte segment #1, 1500 byte segment #2, and 500 byte segment #3.  They will go out on the wire 3, then 2, then 1.

At least with a couple of popular NAT implementations we tested that do in fact reassemble fragments this causes them to be dropped.  Segment #1 must be received first to create a state table entry for the rest of the packets.

I'm tempted to say that the emission of the fragments backwards is wrong, but the reality is even if they were sent in order there is the potential for them to be re-ordered during transport across the network.  However, as a programmer I also get that having a NAT box store random fragments in the hopes the rest of the bits come in later is both a bit of a programming challenge and a potential DDOS vector.

I'm not a big Linux fan, so I'm wondering if anyone knows if this reverse fragment behavior is pervasive across all kernels, or if there is any workaround.
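
If anyone wants to check their own kernel, the observation doesn't need anything more than a capture on the sending box while it emits one oversized ping. The address and interface below are placeholders:

# terminal 1: watch the outgoing fragments; tcpdump prints each one as "frag (offset|length)",
# so the emission order is obvious
tcpdump -n -i eth0 host 2001:db8::1
# terminal 2: a single ping large enough to need fragmentation on a 1500-byte link
ping6 -c 1 -s 3500 2001:db8::1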
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 25, 2013, 02:58:33 PM
Quote from: bicknell on March 25, 2013, 01:44:10 PMI do want to point out one thing I found which is a bit of a surprise to me.  It turns out most (all?) of the Linux kernels output UDP fragments in _reverse_ order.  That is, say you had a 3500 byte packet to transmit which would become 1500 byte segment #1, 1500 byte segment #2, and 500 byte segment #3.  They will go out on the wire 3, then 2, then 1.
It's easier to reassemble packets that way. The last fragment is the only one that can tell you how large the reassembled packet is going to be. So until you have received the last fragment, you cannot allocate memory for reassembly, and you'll have to keep packets separate.

As the packets are being reassembled, you need to keep track of which bytes of the final packet have been received, and which have not. Keeping track of that is much easier if you receive them in order. And since you need to start with the last, that order has to be reverse order. From that perspective, it makes a lot of sense to send fragments in reverse order.

For the sender it would actually be slightly simpler to do them in order, because once you have sent the first fragment, you can overwrite the end of the first fragment with the IP header for the next fragment; that way you avoid using an extra buffer and doing additional copying.

Quote from: bicknell on March 25, 2013, 01:44:10 PMAt least with a couple of popular NAT implementations we tested that do in fact reassemble fragments this causes them to be dropped.  Segment #1 must be received first to create a state table entry for the rest of the packets.
Fragmentation is known to be problematic. NAT is known to be problematic. Combining the two makes it even worse. What you are describing is not the only problem.

What you describe is understandable behaviour. After all, the port numbers are crucial to the operations a NAT device performs on packets, and the port numbers are only listed in the first fragment. However, the onus is on the NAT implementers to solve the problem, as fragmentation was standardized before NAT was invented. And a NAT is not allowed to reassemble the fragments. It will, however, need to store all the fragments in memory until it knows where to forward them.

Quote from: bicknell on March 25, 2013, 01:44:10 PMHowever, as a programmer I also get that having a NAT box store random fragments in the hopes the rest of the bits come in later is both a bit of a programming challenge and a potential DDOS vector.
That's something implementers of NAT devices have to deal with if they don't want to be shipping a broken product. You can set aside a few MB of memory for storing fragments that cannot be forwarded because the first fragment has not been seen yet. A FIFO strategy would be sensible for discarding packets once memory is full, and in fact fragments must be discarded once they are a few minutes old, even if there isn't any memory pressure.

Quote from: bicknell on March 25, 2013, 01:44:10 PMif there is any workaround.
My best recommendation is to avoid NAT and avoid fragmentation.

The problem you described would be even harder to solve in IPv6 than it is in IPv4, due to the possibility of extension headers moving the port numbers to a different position. There isn't even a guarantee that the port number is in the first fragment. If there are so many extension headers that the transport header ends up in a later fragment, you need all the fragments from the very first until the one with the transport header in order to figure out the port number. You may even need all of those fragments to figure out what the protocol is.

The good news is that with IPv6 you don't need NAT. And IPv6 has improved the situation regarding a lot of other fragmentation-related issues.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 26, 2013, 07:49:28 AM
Quote from: kasperd on March 25, 2013, 02:58:33 PM
Quote from: bicknell on March 25, 2013, 01:44:10 PMI do want to point out one thing I found which is a bit of a surprise to me.  It turns out most (all?) of the Linux kernels output UDP fragments in _reverse_ order.  That is, say you had a 3500 byte packet to transmit which would become 1500 byte segment #1, 1500 byte segment #2, and 500 byte segment #3.  They will go out on the wire 3, then 2, then 1.
It's easier to reassemble packets that way. The last fragment is the only one that can tell you how large the reassembled packet is going to be. So until you have received the last fragment, you cannot allocate memory for reassembly, and you'll have to keep packets separate.

As the packets are being reassembled, you need to keep track of which bytes of the final packet have been received, and which have not. Keeping track of that is much easier if you receive them in order. And since you need to start with the last, that order has to be reverse order. From that perspective, it makes a lot of sense to send fragments in reverse order.

Your statement makes no sense to me.

The first frame is the only frame with an IP header, which includes the length field.  The rest have an offset inside the packet.  So in the hypothetical stream I mentioned above, the receiver would get:

Packet #3: Fragment, offset 3000, len 500.
Packet #2: Fragment, offset 1500, len 1500.
Packet #1: IP Header, length 3500, fragment, offset 0 len 1500.

You're suggesting the receiver can guess from packet #3 this is a 3500 byte frame, but that is not correct.  Consider this stream for a 4000 byte packet:

Packet #4: Fragment, offset 3500, len 500.
Packet #3: Fragment, offset 3000, len 500.
Packet #2: Fragment, offset 1500, len 1500.
Packet #1: IP Header, length 4000, fragment, offset 0 len 1500.

Except that packet #4 is dropped in flight.  The receiver receives the same packet #3, but cannot guess the memory size until packet #1 is received!

More importantly, from a security perspective, when received in reverse order the receiving host must store all fragments received in memory to see if a header comes in that matches.  This enables a trivial DDoS: send fragments to the host and it will run out of memory!  If packet #1 is received first, its header can be matched against ACLs, including dynamic state entries, and allowed or discarded.  Subsequent fragments can then be saved or discarded upon reception based on whether they match an initial packet that has already passed the security checks.

I believe from both a programming perspective and a security perspective things are significantly easier if the fragmented frames arrive in order, rather than any out of order sequence, including the reversed sequence Linux uses.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 26, 2013, 08:00:05 AM
I finally got off my duff and did some serious testing, and what I discovered is interesting.  I've attached a diagram, but basically I inserted an ethernet switch between the Time Capsule and my cable modem so that I could capture both sides.  I then ran this query on the EndHost:

EndHost:~ bicknell$ dig +norecurse +bufsize=2048 txt txtpadding-1800.netalyzr.icsi.berkeley.edu @ipv6-node.netalyzr.icsi.berkeley.edu
;; Warning: ID mismatch: expected ID 28893, got 25185
;; Warning: query response not set

; <<>> DiG 9.8.3-P1 <<>> +norecurse +bufsize=2048 txt txtpadding-1800.netalyzr.icsi.berkeley.edu @ipv6-node.netalyzr.icsi.berkeley.edu
;; global options: +cmd
;; connection timed out; no servers could be reached

The ID mismatch is the first interesting thing, but it's not actually the problem.  First let's look at a tcpdump on the EndHost itself, to see how this makes it out on the wire:

09:20:34.195533 IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > 2607:f740:b::f93.53: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)
09:20:34.250004 IP6 2607:f740:b::f93.53 > 2001:470:e07d:1:54ea:2859:ef83:1f92.50321: 25185 updateM [b2&3=0x6420] [14646a] [12596q] [8242n] [12336au][|domain]
09:20:39.196723 IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > 2607:f740:b::f93.53: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)
09:20:44.197888 IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > 2607:f740:b::f93.53: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)

We see the initial query, a packet triggering the ID mismatch, and then two repeats of the query.  Note that there is no response to the query of any kind.

Moving on to the Sniffer, we get the rest of the details:

09:20:35.397019 IP 74-93-155-149-memphis-tn.hfc.comcastbusiness.net > tserv2.ash1.he.net: IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > ipv6-node.netalyzr.icsi.berkeley.edu.domain: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)
09:20:35.444275 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu > 2001:470:e07d:1:54ea:2859:ef83:1f92: frag (1448|360)
09:20:35.450971 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu.domain > 2001:470:e07d:1:54ea:2859:ef83:1f92.50321: 25185 updateM [b2&3=0x6420] [14646a] [12596q] [8242n] [12336au][|domain]
09:20:40.398153 IP 74-93-155-149-memphis-tn.hfc.comcastbusiness.net > tserv2.ash1.he.net: IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > ipv6-node.netalyzr.icsi.berkeley.edu.domain: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)
09:20:40.442996 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu > 2001:470:e07d:1:54ea:2859:ef83:1f92: frag (1432|376)
09:20:40.443275 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu > 2001:470:e07d:1:54ea:2859:ef83:1f92: frag (0|1432) domain > 50321: 28893*- 1/1/2 TXT[|domain]
09:20:45.399317 IP 74-93-155-149-memphis-tn.hfc.comcastbusiness.net > tserv2.ash1.he.net: IP6 2001:470:e07d:1:54ea:2859:ef83:1f92.50321 > ipv6-node.netalyzr.icsi.berkeley.edu.domain: 28893 [1au] TXT? txtpadding-1800.netalyzr.icsi.berkeley.edu. (71)
09:20:45.443485 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu > 2001:470:e07d:1:54ea:2859:ef83:1f92: frag (1432|376)
09:20:45.444585 IP tserv2.ash1.he.net > 74-93-155-149-memphis-tn.hfc.comcastbusiness.net: IP6 ipv6-node.netalyzr.icsi.berkeley.edu > 2001:470:e07d:1:54ea:2859:ef83:1f92: frag (0|1432) domain > 50321: 28893*- 1/1/2 TXT[|domain]

These of course are the tunnel encapsulated packets.  Here we see the query, the errant packet triggering the ID mismatch, but then a response!  The response is fragmented, and the fragments are received in reverse order.  We see first offset 1432 length 376, and second offset 0 length 1432.

The good news here is that TunnelBroker is off the hook, the fragments are making it down my tunnel.   :D

The bad news is that they are not making it past my Time Capsule.  I'm working with the Netalyzr folks to see if there is anything we can do to get the fragments in order to see if that makes a difference before going back to update my bug report with Apple.  I suspect though that many firewalls will block all fragments (very bad), and that many will block the fragments received out of order (somewhat bad).  If people can replicate this test with different hardware it would be appreciated.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 26, 2013, 01:08:54 PM
Quote from: bicknell on March 26, 2013, 07:49:28 AMYour statement makes no sense to me.
I assumed you knew how fragmentation works. My bad.

Each fragment has an IP header. The length field in the IP header always indicates the length of the fragment. Fragments are not numbered, but they carry an indication of their offset within the reassembled packet. Additionally there is one bit indicating if this is the last fragment. The first fragment is recognized by having offset 0.

A packet which is not fragmented is simply a fragment, which is simultaneously the first and the last fragment. (In case of IPv6 the fragment header can be left out on such a fragment saving 8 bytes of space).

No fragment contains a field indicating the size of the reassembled packet. The size is computed by adding the fragment offset and fragment length of the last fragment (in your 3500-byte example: offset 3000 plus length 500).

Quote from: bicknell on March 26, 2013, 07:49:28 AMMore importantly, from a security perspective, when received in reverse order the receiving host must store all fragments received in memory to see if a header comes in that matches.  This enables a trivial DDoS: send fragments to the host and it will run out of memory!
This is a well-known issue, which you have to keep in mind when implementing an IP stack. Just limit the amount of memory used for reassembly and use a FIFO strategy to discard fragments when memory needs to be used for a newly arrived fragment.

Quote from: bicknell on March 26, 2013, 07:49:28 AMIf packet #1 is received first, its header can be matched against ACLs, including dynamic state entries, and allowed or discarded.  Subsequent fragments can then be saved or discarded upon reception based on whether they match an initial packet that has already passed the security checks.
Ordering was optimized for the receiver, not the firewall. Additionally, with IPv6 you can't always apply ACLs based on any one fragment. IPv6 packets can be constructed in a way where even figuring out the port number being used requires all the fragments.

There is a simple solution though. Just let all the fragments pass through the firewall until you see the first fragment of the packet. Then when you see the first fragment you decide if the packet is permitted or not. If the packet is permitted, you let it through. If the packet is rejected, the firewall sends an ICMPv6 error code based on the first fragment. Other fragments, which have already passed through the firewall, will be discarded by the destination host.

Quote from: bicknell on March 26, 2013, 07:49:28 AMI believe from both a programming perspective and a security perspective things are significantly easier if the fragmented frames arrive in order, rather than any out of order sequence, including the reversed sequence Linux uses.
I think your idea about what is easier would change, if you tried to implement fragment reassembly. As for the security implications, it is a mistake to consider what ordering is easiest to deal with. Your security needs to work regardless of which order an attacker sends packets in.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 26, 2013, 01:44:00 PM
Quote from: bicknell on March 26, 2013, 08:00:05 AMbasically I inserted an ethernet switch between the Time Capsule and my cable modem so that I could capture both sides.
That's not supposed to be possible to do with a switch. I keep an old hub around just in case I need to do that sort of debugging.

Quote from: bicknell on March 26, 2013, 08:00:05 AMEndHost:~ bicknell$ dig +norecurse +bufsize=2048 txt txtpadding-1800.netalyzr.icsi.berkeley.edu @ipv6-node.netalyzr.icsi.berkeley.edu
;; Warning: ID mismatch: expected ID 28893, got 25185
;; Warning: query response not set

; <<>> DiG 9.8.3-P1 <<>> +norecurse +bufsize=2048 txt txtpadding-1800.netalyzr.icsi.berkeley.edu @ipv6-node.netalyzr.icsi.berkeley.edu
;; global options: +cmd
;; connection timed out; no servers could be reached

The ID mismatch is the first interesting thing, but it's not actually the problem.
That ID mismatch is a symptom of a quite mysterious bug on their side. Notice how the incorrect ID being received is always 25185. That packet is not DNS at all. What it contains is ASCII data. I captured one of those and found this 31 character string in the packet "bad 1496 2001:470:0:69::2 1480 ".

It showed up at the exact place in the sequence of packets, where the first fragment of the DNS reply would have been expected. The second fragment of the DNS reply had already been received. From the offset on the second fragment I can see the size of the first fragment, which would have been too large for the tunnel MTU, which explains why the first fragment did not arrive.

Quote from: bicknell on March 26, 2013, 08:00:05 AMThe bad news is that they are not making it past my Time Capsule.
Then the Time Capsule is at fault. And netalyzr is correct to report that you have a fragmentation problem.

Quote from: bicknell on March 26, 2013, 08:00:05 AMI'm working with the Netalyzr folks to see if there is anything we can do to get the fragments in order to see if that makes a difference
I can hack together a DNS server sending replies in various orders, if you need to test that.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 26, 2013, 07:02:26 PM
Quote from: kasperd on March 26, 2013, 01:08:54 PM
Quote from: bicknell on March 26, 2013, 07:49:28 AMI believe from both a programming perspective and a security perspective things are significantly easier if the fragmented frames arrive in order, rather than any out of order sequence, including the reversed sequence Linux uses.
I think your idea about what is easier would change, if you tried to implement fragment reassembly. As for the security implications, it is a mistake to consider what ordering is easiest to deal with. Your security needs to work regardless of which order an attacker sends packets in.

I will note I am more familiar with the BSD stack, FreeBSD in particular.  It emits fragments in order.  The code to both send and receive them is quite clean, in my opinion.

I do agree that the security needs to work regardless of order.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 26, 2013, 07:04:53 PM
Quote from: kasperd on March 26, 2013, 01:44:00 PM
That's not supposed to be possible to do with a switch. I keep an old hub around just in case I need to do that sort of debugging.

The switch in question is a managed Cisco switch which supports SPAN, allowing the mirroring of traffic for monitoring.
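
For anyone who hasn't used SPAN, the configuration amounts to two lines of IOS (the interface names here are just examples; the source port is where the Time Capsule plugs in, the destination is where the sniffer sits):

monitor session 1 source interface GigabitEthernet0/1 both
monitor session 1 destination interface GigabitEthernet0/2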

Quote from: kasperd on March 26, 2013, 01:44:00 PM
That ID mismatch is a symptom of a quite mysterious bug on their side. Notice how the incorrect ID being received is always 25185. That packet is not DNS at all. What it contains is ASCII data. I captured one of those and found this 31 character string in the packet "bad 1496 2001:470:0:69::2 1480 ".

I'm not quite sure what to make of this problem.  It doesn't occur every time for me, perhaps for 1 in 10 queries.  I've reported it to the Netalyzr folks and they are looking into it.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: bicknell on March 26, 2013, 07:10:44 PM
Quote from: bicknell on March 26, 2013, 08:00:05 AM
The good news here is that TunnelBroker is off the hook, the fragments are making it down my tunnel.   :D

The bad news is that they are not making it past my Time Capsule.  I'm working with the Netalyzr folks to see if there is anything we can do to get the fragments in order to see if that makes a difference before going back to update my bug report with Apple.  I suspect though that many firewalls will block all fragments (very bad), and that many will block the fragments received out of order (somewhat bad).  If people can replicate this test with different hardware it would be appreciated.

I configured one of my DNS servers on a FreeBSD box to generate an 1800-byte TXT record, and reran the tests against my own server.  I observed a few interesting details:

1) The FreeBSD/BIND combo I used emitted 1280 byte maximum length UDP.  I'm not immediately sure if this is a FreeBSD-ism, or BIND-ism.  The result is still one packet and one fragment, just of slightly different sizes.  (See the BIND sketch at the end of this post.)

2) The packets are emitted in order, and received down the tunnel in order.

3) Neither of the two response packets makes it past the Time Capsule.

So, I've now shown fragments don't make it past the Time Capsule regardless of packet order.  I raised the severity of my bug report with Apple, documented all of this with them, and poked a couple of people over there I know in an attempt to nudge it to slightly higher priority.

I also pointed out to the Netalyzr folks it would be very cool if they could devise a test to send the fragments both in order and out of order, and see if it makes any difference.  While it doesn't in this case, I suspect with some firewalls and NAT implementations it will.
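
Regarding item 1 above: I still haven't confirmed which component imposes the 1280-byte limit, but if you wanted to force that behaviour explicitly in BIND 9, the relevant named.conf knobs would look roughly like this (a sketch, not my actual config; both values are in bytes):

options {
    // edns-udp-size controls the EDNS buffer size BIND advertises in its own queries;
    // max-udp-size caps the UDP responses BIND will send before truncating (TC bit),
    // so both stay within the IPv6 minimum MTU and avoid fragmentation
    edns-udp-size 1280;
    max-udp-size 1280;
};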
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 27, 2013, 02:05:27 AM
Quote from: bicknell on March 26, 2013, 07:02:26 PMIt emits fragments in order.  The code to both send and receive them is quite clean, in my opinion.
Sending in order is a bit simpler than sending in reverse order. Code for receiving needs to be prepared to receive fragments in any order. So for simplicity of the code, sending in order is preferable.

But you may want to optimize the reassembly for the case where packets are not reordered by the network. Less consumption of CPU and/or RAM in the reassembly code may be desired. So you could decide to have two alternative code paths: a fast path, which is used as long as fragments are received in reverse order, and a slow path, which is used when fragments are received in any other order. This would require more code on the receiving end, but with a performance improvement in the common case.

Something similar is found in TCP options parsing. Linux has optimized code to handle the case where the TCP options are exactly 12 bytes long, and the first four bytes are exactly 0x0101080A. This doesn't simplify the code, because it still needs code to handle arbitrary ordering of options. But the code will be faster on the large number of packets with that exact sequence of bytes. This optimization is even recommended in RFC 1323.

Quote from: bicknell on March 26, 2013, 07:04:53 PMIt doesn't occur every time for me, perhaps for 1 in 10 queries.
It depends on the PMTU information being cached on the sender. If the cache has no PMTU information for your IP the first fragment will be 1500 bytes. That fragment is bounced by the tunnel server. At that point your IP address will be put in the cache with a PMTU of 1480 bytes. But rather than retransmitting the packet on receipt of the ICMPv6 error, it sends an invalid response.

From that point on, it will work until the PMTU cache entry expires.
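
You can watch the same mechanism on any Linux sender of your own (the address below is just a placeholder):

# show the route actually used for a destination; a cached PMTU exception shows up as "mtu ..."
ip -6 route get 2001:db8::1
# discard cached exceptions to start from a clean slate
ip -6 route flush cache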

Quote from: bicknell on March 26, 2013, 07:10:44 PMThe FreeBSD/BIND combo I used emitted 1280 byte maximum length UDP.
That actually sounds like a very sensible default behaviour. I think more systems should do that.

Quote from: bicknell on March 26, 2013, 07:10:44 PMThe result is still one packet and one fragment, just of slightly different sizes.
I assume you mean one UDP packet fragmented in two IPv6 fragments.

Quote from: bicknell on March 26, 2013, 07:10:44 PMI also pointed out to the Netalyzr folks it would be very cool if they could devise a test to send the fragments both in order and out of order, and see if it makes any difference.  While it doesn't in this case, I suspect with some firewalls and NAT implementations it will.
It's quite likely it will make a difference in some cases.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: kasperd on March 28, 2013, 05:17:48 AM
Quote from: bicknell on March 26, 2013, 07:10:44 PMSo, I've now shown fragments don't make it past the Time Capsule regardless of packet order.
I have a few more ideas for you to try.
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: RayH on June 09, 2013, 04:57:39 AM
Quote from: bicknell on March 26, 2013, 07:10:44 PM
Quote from: bicknell on March 26, 2013, 08:00:05 AM
The good news here is that TunnelBroker is off the hook, the fragments are making it down my tunnel.   :D

The bad news is that they are not making it past my Time Capsule.  I'm working with the Netalyzr folks to see if there is anything we can do to get the fragments in order to see if that makes a difference before going back to update my bug report with Apple.  I suspect though that many firewalls will block all fragments (very bad), and that many will block the fragments received out of order (somewhat bad).  If people can replicate this test with different hardware it would be appreciated.

I configured one of my DNS servers on a FreeBSD box to generate an 1800-byte TXT record, and reran the tests against my own server.  I observed a few interesting details:

1) The FreeBSD/BIND combo I used emitted 1280 byte maximum length UDP.  I'm not immediately sure if this is a FreeBSD-ism, or BIND-ism.  The result is still one packet and one fragment, just of slightly different sizes.

2) The packets are emitted in order, and received down the tunnel in order.

3) Neither of the two response packets makes it past the Time Capsule.

So, I've now shown fragments don't make it past the Time Capsule regardless of packet order.  I raised the severity of my bug report with Apple, documented all of this with them, and poked a couple of people over there I know in an attempt to nudge it to slightly higher priority.

I also pointed out to the Netalyzr folks it would be very cool if they could devise a test to send the fragments both in order and out of order, and see if it makes any difference.  While it doesn't in this case, I suspect with some firewalls and NAT implementations it will.


I've also been doing some tests on transmission of various IPv6 extension headers to my Linux server + he.net tunnel connection using the si6 ipv6 toolkit http://www.si6networks.com/tools/ipv6toolkit/, and found this thread.

Packets containing a hop-by-hop extension header, a destination options extension header, or both, are all transmitted and received OK (confirmed with Wireshark running on both the client and the Linux server).

Packets containing any IPv6 fragment extension header appear to be getting dropped outbound by my Time Capsule, both toward the HE.net tunnel and toward local destinations connected on the same WAN interface as the Time Capsule.

Not even a simple, minimal packet seems to be getting through, i.e. one consisting of an IPv6 header plus a single fragment header (first fragment, with no following fragments), next header TCP, port 80, SYN, no data, and total length < 1280.

I tested with a direct cable between my client and the Linux server and everything worked exactly as expected.

I also tried disabling the firewall feature on the Time Capsule: no change.

I'd need to do some more debugging with a dumb hub and a laptop running Wireshark to examine the 6to4 packets directly to really confirm this, but it's definitely looking like the Time Capsule is blocking all packets with an IPv6 fragment extension header.
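
For anyone repeating this, the capture filter I found useful on the tcpdump/Wireshark side is below (interface name is whatever carries the traffic). It matches packets whose Next Header field is 44, i.e. a Fragment header directly after the fixed IPv6 header, which covers all the test packets I generated:

tcpdump -n -i eth0 'ip6[6] == 44'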
Title: Re: Netalyzer says I have a IPv6 fragmentation problem.
Post by: RayH on June 11, 2013, 08:44:34 AM
Right. I've got a dumb hub in the path between the Time Capsule and my Linux box (which acts as the tunnel endpoint for he.net), so I have simultaneous Wireshark captures at the end client behind the Airport, at the hub between the Time Capsule and the Linux box, and on the Linux server.

Wireshark is seeing and decoding the 6to4 packets coming out of the Time Capsule correctly.

The Time Capsule is dropping every single IPv6 packet containing a fragment header (in any combination with or without hop by hop or destination option extension headers).

Hop-by-hop headers and destination options extension headers are being transmitted nominally in any combination (except when a fragmentation header is present).

It's definitely the Time Capsule that is blocking the *outbound* traffic.