Hurricane Electric's IPv6 Tunnel Broker Forums

Tunnelbroker.net Specific Topics => Questions & Answers => Topic started by: Buroa on June 03, 2013, 05:27:05 AM

Title: Facebook failing... why?
Post by: Buroa on June 03, 2013, 05:27:05 AM
Sometimes it will work, sometimes it won't. This has been going on for about a week. Here is my traceroute.

(http://puu.sh/37nqN.png)
Title: Re: Facebook failing... why?
Post by: kasperd on June 03, 2013, 06:06:02 AM
I don't see a problem. What is it you think is not working?
Title: Re: Facebook failing... why?
Post by: cconn on June 05, 2013, 07:49:42 AM
Same here, I have occasional bouts of non-connectivity with FB.  I will try to capture it with Wireshark to see what is going on.

Edit: from what I can see in the Wireshark capture, the failure starts when I start fetching data from an IPv6 Akamai host.  It is still not clear why sometimes my browser is the one sending the TCP RST first, and sometimes Akamai.
Title: Re: Facebook failing... why?
Post by: hawk82 on June 05, 2013, 05:18:07 PM
I'm also having problems loading Facebook intermittently. My traceroute is identical to Buroa's. I also first noticed the issue around the same time, about a week or so ago.
Title: Re: Facebook failing... why?
Post by: SpOuK3 on June 06, 2013, 04:15:50 AM
Same here:

root@MELAINE:~# traceroute6 www.facebook.com
traceroute to www.facebook.com (2a03:2880:10:cf07:face:b00c:0:1), 30 hops max, 16 byte packets
 1  SpOuK3-1.tunnel.tserv21.tor1.ipv6.he.net (2001:470:1c:4a2::1)  45.489 ms  42.247 ms  41.449 ms
 2  gige-g2-5.core1.tor1.he.net (2001:470:0:c0::1)  43.616 ms  40.527 ms  41.633 ms
 3  10gigabitethernet10-7.core1.nyc4.he.net (2001:470:0:23f::1)  51.571 ms  53.870 ms  51.009 ms
 4  xe-1-1-0.br01.lga1.tfbnw.net (2001:504:f::3:2934:1)  59.328 ms  56.851 ms  55.821 ms
 5  ae1.bb01.lga1.tfbnw.net (2620:0:1cff:dead:beee::232)  75.837 ms  57.169 ms  55.773 ms
 6  ae15.bb01.iad2.tfbnw.net (2620:0:1cff:dead:beef::2ad)  58.347 ms  56.674 ms  57.086 ms
 7  ae8.bb05.prn1.tfbnw.net (2620:0:1cff:dead:beef::122)  119.925 ms  122.398 ms  122.174 ms
 8  ae33.dr05.prn1.tfbnw.net (2620:0:1cff:dead:beee::234)  121.411 ms  122.537 ms  121.611 ms
 9  po1022.csw12b.prn1.tfbnw.net (2620:0:1cff:dead:beef::176f)  122.325 ms  122.557 ms  123.440 ms
10  *  *  *
11  edge-star6-shv-12-prn1.facebook.com (2a03:2880:10:cf07:face:b00c:0:1)  122.818 ms  121.516 ms  119.964 ms

It started some days ago; I don't remember exactly, but a week or two. It's on and off, and I have to refresh the page to get onto Facebook.
Title: Re: Facebook failing... why?
Post by: mediag on June 06, 2013, 06:13:32 AM
Our connections to Facebook are failing over our tunnel 90%-95% of the time. It's frustrating, especially considering that I could get it just fine over IPv4, as well as one of those 2002:blah:blah::/48 tunnels (but the latency is lame through that!).

As an additional note, www.sam.gov fails over IPv6 (one of our people had to go to that site for whatever reason), but only when it goes to make an SSL connection. Interestingly enough, this is also where Facebook fails. The initial HTTP connection works, but then when it gets redirected to HTTPS, the connection cannot be made. Just to check if it's an SSL issue, I went to https://www.google.com/, and that works.
Title: Re: Facebook failing... why?
Post by: cconn on June 06, 2013, 07:07:42 AM
Well, although I read these forums, my IPv6 connectivity is native and does not depend on HE.  Even if I de-peer from HE, I get a 60-40 chance of having to hit refresh a few times to get anywhere on Facebook.  And it is not Facebook itself that is giving me grief; it is Akamai-ized servers that are sending RSTs for some reason.  Still investigating.
Title: Re: Facebook failing... why?
Post by: kasperd on June 06, 2013, 07:51:13 AM
I have frequently been accessing facebook over an HE tunnel, and it appears to be fairly stable from my laptop. From my phone it is a different matter: facebook is extremely slow and loading often fails.

The two are using different tunnels, but on the same tunnel server. I have noticed some spurious time to live exceeded packets coming back from facebook on the tunnel the phone is using.

It appears that facebook is encapsulating the IPv6 packets they receive from me in an IPv6 over IPv6 tunnel, and on the outer IPv6 header they are spoofing my IP address as source. That is, when they encapsulate the IPv6 packet they actually copy the source address from the inner IPv6 header to the outer IPv6 header.

Occasionally the tunnelled packet hits a time to live exceeded, at which point an ICMPv6 error is sent back to me. But the content of that ICMPv6 error is not the packet I sent but rather the encapsulated version, which of course isn't recognized as being related to any open connection on my end.
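
To illustrate why such an error is hard to match to a connection, here is a minimal Python sketch, not taken from any real network stack. It walks the IPv6 headers found in the packet quoted inside an ICMPv6 error (the bytes after the 8-byte ICMPv6 header) and follows next-header value 41, which marks IPv6-in-IPv6 encapsulation. The function names are hypothetical, the client address 2001:db8::1 is a documentation-prefix placeholder, and the two destination addresses are the ones reported later in this thread. A host that only inspects the first quoted header sees the tunnel destination, which it never sent anything to.

```python
import ipaddress
import struct

IPV6_HEADER_LEN = 40
NEXT_HEADER_TCP = 6
NEXT_HEADER_IPV6 = 41  # IPv6 encapsulated in IPv6


def build_ipv6_header(src, dst, next_header, payload_len):
    """Build a fixed 40-byte IPv6 header (hop limit 64 chosen arbitrarily)."""
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    return (
        struct.pack("!IHBB", first_word, payload_len, next_header, 64)
        + ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
    )


def parse_ipv6_header(data):
    """Return (src, dst, next_header) from the first 40 bytes of data."""
    if len(data) < IPV6_HEADER_LEN:
        raise ValueError("truncated IPv6 header")
    _first_word, _payload_len, next_header, _hop_limit = struct.unpack(
        "!IHBB", data[:8]
    )
    src = ipaddress.IPv6Address(data[8:24])
    dst = ipaddress.IPv6Address(data[24:40])
    return src, dst, next_header


def headers_in_quoted_packet(quoted):
    """List every nested IPv6 header in the packet quoted by an ICMPv6 error."""
    headers = []
    offset = 0
    while True:
        src, dst, next_header = parse_ipv6_header(quoted[offset:])
        headers.append((src, dst, next_header))
        if next_header != NEXT_HEADER_IPV6:
            return headers
        offset += IPV6_HEADER_LEN  # step into the encapsulated packet


def example_quoted_packet():
    """Assumed example: a bare 20-byte TCP header wrapped twice in IPv6."""
    client = "2001:db8::1"  # placeholder, not a real client address
    inner = build_ipv6_header(
        client, "2a03:2880:2040:1f01:face:b00c:0:3", NEXT_HEADER_TCP, 20
    ) + bytes(20)
    # The outer header reuses the client address as source, as described above.
    return build_ipv6_header(
        client, "2401:db00:2040:1166:face:0:11:0", NEXT_HEADER_IPV6, len(inner)
    ) + inner
```

Calling `headers_in_quoted_packet(example_quoted_packet())` yields two entries: the outer header addressed to the tunnel endpoint, and only behind it the header of the packet the client actually sent.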
Title: Re: Facebook failing... why?
Post by: Steak on June 07, 2013, 07:37:46 AM
Exactly the same issue here, facebook fails to load almost all the time over IPv6 through our HE tunnel.
At home, I have an A&A connection with native IPv6, and facebook works fine over IPv6.
Title: Re: Facebook failing... why?
Post by: hawk82 on June 07, 2013, 09:01:05 AM
Was still having problems earlier this morning around 8am est, but now it appears to be resolved. Facebook loading properly. Will keep monitoring.
Title: Re: Facebook failing... why?
Post by: cconn on June 07, 2013, 09:04:02 AM

> It appears that facebook is encapsulating the IPv6 packets they receive from me in an IPv6 over IPv6 tunnel, and on the outer IPv6 header they are spoofing my IP address as source. That is, when they encapsulate the IPv6 packet they actually copy the source address from the inner IPv6 header to the outer IPv6 header.
>
> Occasionally the tunnelled packet hits a time to live exceeded, at which point an ICMPv6 error is sent back to me. But the content of that ICMPv6 error is not the packet I sent but rather the encapsulated version, which of course isn't recognized as being related to any open connection on my end.

This is a huge WTF.  Any idea why they may be doing this?  And from what I see here, it seems it's the Akamai servers that are doing it, not the Facebook servers themselves.  Is that consistent with what you see?
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 10:26:29 AM
> This is a huge WTF.  Any idea why they may be doing this?
I believe the tunnelling is part of a load balancing setup. I read an article about the setup at some point in the past. As far as I remember it was written by Phil Dibowitz.

The WTF part is the way the source IP address for the encapsulation was chosen. It may just be that at the time it was implemented nobody really thought about what that source IP should be, and perhaps people thought it didn't matter at all.

Alternatively it might be that the source IP is important to some intermediate router, perhaps there is some packet filtering based on source IP. And preserving source IP as the packet was encapsulated may have been a hack to ensure the filters still worked.

It might also be that somebody thought preserving the source IP address at encapsulation was a good idea as it would get any ICMP errors to the real source of the address.

But this is all guessing, I don't know why they did this.

> And from what I see here, it seems it's the Akamai servers that are doing it, not the Facebook servers themselves.  Is that consistent with what you see?
No, that is not consistent with what I have seen. I found a packet dump, which I had lying around from December. And I found the following.

There was a SYN packet, which had been sent to 2a03:2880:2040:1f01:face:b00c:0:3. This had been encapsulated in a tunnel with destination address 2401:db00:2040:1166:face:0:11:0 and spoofing my IP as source. And this packet had triggered an ICMPv6 time exceeded error from 2620:0:1cff:dead:beef::1494. All three prefixes are registered to facebook.

I'll look for some more recent packet dumps.

BTW, I recently started trying out gogo6 on the network I am using the phone from. And this appears to have made facebook much less reliable (in the last few days I have experienced less than 50% uptime). I'll try switching back to HE to see if facebook gets more reliable that way.
Title: Re: Facebook failing... why?
Post by: cconn on June 07, 2013, 11:36:48 AM
There is a discussion about this on a mailing list I peruse:

http://lists.cluenet.de/pipermail/ipv6-ops/2013-June/008987.html

It is apparently related to a DNS load balancing failure of some sort.

Is Gogo6 still Canadian-based?  I am getting geo-located DNS replies pointing to IPv6 Akamai servers at Videotron that are failing.  Videotron == Quebec/Canada cableco.
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 12:38:26 PM
> Is Gogo6 still Canadian-based?
I wouldn't have bothered testing them, if they only had a presence in Canada. I am using their server in Amsterdam (their only one in Europe as far as I know).

> I am getting geo-located DNS replies pointing to IPv6 Akamai servers at Videotron that are failing.  Videotron == Quebec/Canada cableco.
Google was giving me a correct geolocation of my phone. They even got the postal code correct. Getting it that accurate when I am using a tunnel server in a different country probably means they are using some other method of geolocation, probably GPS.

I just recalled I had tested a wget command on a URL two days ago, which happened to be redirecting to facebook. I hadn't thought about that being possibly IPv6 related, but it may very well be. That wget command is still waiting for a reply from facebook, and it has been more than 48 hours. It did manage to successfully complete a TCP handshake.
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 12:54:35 PM
I tried switching back to HE for traffic to 2a03:2880::/32, and now facebook is accessible again from that network. I did not change which DNS server I am using or which tunnel I am using to connect to the DNS server, so the difference is not caused by DNS lookups. So there really is an issue with communication between gogo6 and facebook.

It does not look like a routing problem, because http is working, only https is affected. Also when doing https, I am getting a connection established.

This sort of sounds like an MTU problem. But I have been clamping my MSS at 1420, which should eliminate any MTU issues on the tunnel, assuming the PMTU between me and the tunnel server is at least 1500 bytes. Considering this problem is only affecting facebook, I don't exactly suspect an MTU issue on the tunnel itself.

But it is worth trying a lower MSS setting to see if that affects the connectivity.
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 01:03:53 PM
> But it is worth trying a lower MSS setting to see if that affects the connectivity.
With an MSS setting of 1220 it worked every time so far. With an MSS setting of 1208 it fails most of the time. What I am seeing is definitely an MTU issue. Where exactly the MTU issue is located is not clear yet, but it appears something involved in the communication cannot deal with packets above 1280 bytes.
Title: Re: Facebook failing... why?
Post by: cconn on June 07, 2013, 02:17:44 PM
>> But it is worth trying a lower MSS setting to see if that affects the connectivity.
> With an MSS setting of 1220 it worked every time so far. With an MSS setting of 1208 it fails most of the time. What I am seeing is definitely an MTU issue. Where exactly the MTU issue is located is not clear yet, but it appears something involved in the communication cannot deal with packets above 1280 bytes.

Isn't the minimum MTU on IPv6 supposed to be 1280 for compliance?
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 02:20:07 PM
> With an MSS setting of 1220 it worked every time so far. With an MSS setting of 1228 it fails most of the time. What I am seeing is definitely an MTU issue. Where exactly the MTU issue is located is not clear yet, but it appears something involved in the communication cannot deal with packets above 1280 bytes.
I managed to grab packet traces of one of the few cases where it worked and of one where it did not. The first packets in both cases looked the same.

C->S: SYN
S->C: SYN+ACK
C->S: ACK
C->S: Client hello
S->C: ACK of client hello
S->C: Last packet of server hello
C->S: ACK of SYN+ACK packet

The last packet where the client ACKs the SYN+ACK packet again signals to the server that the client has received a packet, but there was a gap in between, so the last received packet cannot be acknowledged yet. Presumably the server retransmits the packet with the data from the gap, which is lost again. At this point the connection stalls.

In the one case where it did work the repeated ACK of the SYN+ACK packet was followed by more packets from the server. The server would send the first part of the server hello again, this time segmented into two segments. With the MSS setting I tested (1228) the server would send one segment with a total length of 1280 bytes and another segment with the last 8 bytes of payload.
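
The sizes in that trace can be checked with a little arithmetic. The following is a minimal Python sketch with hypothetical helper names; it assumes a 40-byte IPv6 header and a 20-byte TCP header without options, so 60 bytes of headers per packet.

```python
IPV6_HEADER = 40
TCP_HEADER = 20  # assumption: no TCP options such as timestamps


def packet_size(payload_len):
    """Total IPv6 packet size for a TCP segment carrying payload_len bytes."""
    return IPV6_HEADER + TCP_HEADER + payload_len


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet on a path with this MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER


def segment(payload_len, mss):
    """Split payload_len bytes into segment payload sizes no larger than mss."""
    sizes = []
    while payload_len > 0:
        chunk = min(payload_len, mss)
        sizes.append(chunk)
        payload_len -= chunk
    return sizes


# A full segment at the advertised MSS of 1228 does not fit a 1280-byte link.
original = packet_size(1228)

# Once the sender learns a path MTU of 1280, the same data is split in two.
resent = segment(1228, mss_for_mtu(1280))

print(original, resent, [packet_size(s) for s in resent])
```

Under these assumptions the original packet is 1288 bytes, and the retransmission becomes one 1280-byte packet plus a small one carrying the remaining 8 bytes of payload, which matches the successful trace.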

The extra ACK indicating the loss of a packet is sent in both cases, so presumably facebook does realize a packet has been lost and retransmits it. But sometimes it does not arrive. I guess that is because the retransmitted packet is the same size as before, which is still too large. Most likely that is because the server did not receive an ICMPv6 message indicating the packet was too big.

The reason facebook does not receive that ICMPv6 message may very well be that the router with the 1280 byte link MTU is rate limiting the ICMPv6 messages. So the number of successful connections is now capped by the ICMPv6 rate limit on that router. Possibly facebook is sending way too many packets above 1280 bytes.

I can see three ways facebook could improve the situation.
The PMTU problem I noticed is definitely real. But that by no means proves that the problem other people are experiencing is an MTU issue as well.
Title: Re: Facebook failing... why?
Post by: kasperd on June 07, 2013, 02:26:25 PM
>>> But it is worth trying a lower MSS setting to see if that affects the connectivity.
>> With an MSS setting of 1220 it worked every time so far. With an MSS setting of 1208 it fails most of the time. What I am seeing is definitely an MTU issue. Where exactly the MTU issue is located is not clear yet, but it appears something involved in the communication cannot deal with packets above 1280 bytes.
> Isn't the minimum MTU on IPv6 supposed to be 1280 for compliance?
The 1208 in my post was a typo. It should have read 1228. With 60 bytes used for IPv6 and TCP headers those numbers correspond to MTU of 1280 and 1288 respectively.

Routers are not required to forward packets larger than 1280 bytes. So not forwarding the 1288 byte packets is standards compliant. But in that case an ICMPv6 error has to be sent back to the sender of the packet, and that sender has to perform PMTU discovery and retransmit as smaller packets.

For some connections, the retransmission as smaller packets never happens. I guess rate limiting of the ICMPv6 packets combined with lack of caching of PMTU results is to blame.
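
To make that guess concrete, here is a toy Python model; it is a sketch under loudly stated assumptions, not a measurement of the real network. Every connection sends one oversized packet through a router that emits at most `ptb_per_tick` ICMPv6 "packet too big" messages per time tick, with no carry-over between ticks. The function name, the tick model and all numbers are made up for illustration. A connection whose packet draws no ICMPv6 message is counted as stalled. With `use_cache=True` the sender remembers the path MTU per client, so later connections to that client need no further ICMPv6 message.

```python
def count_stalled(connections, ptb_per_tick, use_cache):
    """Count stalled connections in the toy model.

    connections: list of (tick, client) tuples, sorted by tick.
    ptb_per_tick: ICMPv6 messages the router may send per tick (assumed).
    use_cache: whether the sender caches the learned path MTU per client.
    """
    learned = set()  # clients for which the sender already knows the PMTU
    stalled = 0
    current_tick = None
    tokens = 0
    for tick, client in connections:
        if tick != current_tick:
            current_tick = tick
            tokens = ptb_per_tick  # refill the router's budget each tick
        if use_cache and client in learned:
            continue  # packets are already small enough, nothing is dropped
        if tokens > 0:
            tokens -= 1  # the sender gets the error and resends smaller
            if use_cache:
                learned.add(client)
        else:
            stalled += 1  # the oversized packet vanished without any error
    return stalled


# Hypothetical load: 5 clients each open one connection per tick for 10 ticks.
load = [(tick, client) for tick in range(10) for client in range(5)]

print(count_stalled(load, ptb_per_tick=2, use_cache=False))
print(count_stalled(load, ptb_per_tick=2, use_cache=True))
```

In this toy setup 30 of the 50 connections stall without caching and only 4 stall with it, because each client needs just one ICMPv6 message in total rather than one per connection.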
Title: Re: Facebook failing... why?
Post by: passport123 on June 08, 2013, 09:25:09 AM
It appears that this is widespread:
http://mailman.nanog.org/pipermail/nanog/2013-June/058805.html


Facebook broken over v6?


On 6/7/13 12:21 PM, Jeroen Massar wrote:
> On 2013-06-07 09:12, Aaron Hughes wrote:
>>
>> Anyone else getting connection hangs and closes to Facebook?
>
> Yes, and from a lot of vantage points, thus it is not your local network
> that is at fault, seems that there are some IP addresses which are being
> returned by some Facebook DNS servers that are actually not properly
> provisioned and thus do not respond.
>
> More to it at:
>  http://www.sixxs.net/forum/?msg=general-9511818
>
> I've informed the Facebook peoples (and bcc'd them on this email), and
> apparently they are digging into it from the response I've received.
>
> Note that using www.v6.facebook.com apparently works fine as that IP is
> not affected and is not geo-balanced or something like that thus is
> always (afaik) the same. Thus if you like sharing your life and
> everything you do, that is the thing to use.

It's affecting anyone running dual stack, as the server responds, hangs, times
out and then it tries again on v6.  At least in the latest FF and Safari
browsers, I've not tried chrome.

I've cc'd this over to Nanog, as I've not seen anything about it there, and
I'm sure others are seeing it.

www.v6.facebook.com works fine as a workaround for the time being.

Title: Re: Facebook failing... why?
Post by: passport123 on June 08, 2013, 09:31:26 AM
Here's the full thread at cluenet.de about Facebook's IPv6 issues.

http://lists.cluenet.de/pipermail/ipv6-ops/2013-June/008987.html
Title: Re: Facebook failing... why?
Post by: passport123 on June 08, 2013, 03:15:40 PM
From the nanog thread:

http://mailman.nanog.org/pipermail/nanog/2013-June/058819.html

Doug Porter dsp at fb.com
Sat Jun 8 20:29:43 UTC 2013

We're actively investigating the v6 issues.  We need more data
though.  If you're experiencing problems, please email me a
tcpdump/pcap or any other debug data you think will help.

Thanks,
--
dsp

Title: Re: Facebook failing... why?
Post by: PigLover on June 09, 2013, 04:00:48 PM
I updated my tunnel to use MTU 1480 and things got better, but not right.  Instead of hanging completely some facebook activities just don't finish correctly (e.g., it will display a user's FB home page, but if you scroll to the bottom it won't grow into older messages).  It's very odd.

I've actually had to disable v6 router advertisement on the subnet most of the family uses.  With only V4 FB works perfectly.

Will be glad when they get it fixed so I can turn it back on.
Title: Re: Facebook failing... why?
Post by: PaulosV on June 11, 2013, 02:16:43 PM
Actually, I'm not getting the hangups anymore, and Wireshark does not report anything unusual. It seems that they have resolved the issue at last.

Will wait a few more days before jumping high though. The odds of this happening twice are slim, but certainly not non-existent.
Title: Re: Facebook failing... why?
Post by: kasperd on June 11, 2013, 03:45:48 PM
> I updated my tunnel to use MTU 1480 and things got better, but not right.
Have you tried reducing MSS to 1220 on all packets passing through your tunnel endpoint?
Title: Re: Facebook failing... why?
Post by: passport123 on June 12, 2013, 07:29:45 PM
> ...Will be glad when they get it fixed so I can turn it back on.

I'm not seeing any IPv6 issues anymore.  Hopefully the problem has been fixed.