Welcome to Hurricane Electric's IPv6 Tunnel Broker Forums.

YouTube streaming slow since June 6th

Started by sttng, June 07, 2012, 03:36:56 AM


sttng

Since World IPv6 Launch Day, I've had issues with YouTube videos buffering quite a bit at higher quality settings, and I'm curious whether I can do anything to improve it. I am not using HE's DNS servers; instead I'm running my own dual-stacked recursive name servers.  IPv6 ping tests against YouTube seem fine:

$ ping6 www.youtube.com
PING www.youtube.com(pz-in-x5d.1e100.net) 56 data bytes
64 bytes from pz-in-x5d.1e100.net: icmp_seq=1 ttl=55 time=21.0 ms
64 bytes from pz-in-x5d.1e100.net: icmp_seq=2 ttl=55 time=20.8 ms
64 bytes from pz-in-x5d.1e100.net: icmp_seq=3 ttl=55 time=19.7 ms
64 bytes from pz-in-x5d.1e100.net: icmp_seq=4 ttl=55 time=23.1 ms
64 bytes from pz-in-x5d.1e100.net: icmp_seq=5 ttl=55 time=19.2 ms
--- www.youtube.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4016ms
rtt min/avg/max/mdev = 19.243/20.796/23.148/1.361 ms


Granted, this isn't necessarily one of their content delivery servers.  I also tried an IPv6 traceroute, but it seems to be blocked.  My IPv6 speed test should be reasonable for YouTube:



On test-ipv6.com, I get a score of 10 out of 10, so I'm not sure where the hang-up is.  The underlying IPv4 connection doesn't have any congestion, but I'd like to find a better solution than setting every computer to prefer IPv4 over IPv6.  We have a lot of them.

kasperd

Quote from: sttng on June 07, 2012, 03:36:56 AM
I also tried an IPv6 traceroute but it seems to be blocked.

Nothing was blocked for me:

$ ./traceroute6 2001:4860:8005::5d
traceroute to 2001:4860:8005::5d (2001:4860:8005::5d), 30 hops max, 80 byte packets
1  2001:470:1f0b:1da2:635a:c32:ae34:df91  0.204 ms  0.093 ms  0.078 ms
2  2001:470:1f0a:1da2::1  63.213 ms  58.208 ms  58.363 ms
3  2001:470:0:69::1  47.277 ms  47.588 ms  36.639 ms
4  2001:7f8::3b41:0:2  122.855 ms  112.065 ms  127.852 ms
5  2001:4860::1:0:10  44.679 ms  39.319 ms  42.513 ms
6  2001:4860::8:0:3016  49.480 ms  46.403 ms  74.396 ms
7  2001:4860::8:0:3df4  60.765 ms  89.460 ms  64.537 ms
8  2001:4860::1:0:23  60.707 ms  56.725 ms  83.606 ms
9  2001:4860::8:0:2fc6  148.371 ms  126.788 ms  127.500 ms
10  2001:4860::8:0:2fea  159.735 ms  143.721 ms  152.586 ms
11  2001:4860::8:0:281e  157.160 ms  181.421 ms  158.485 ms
12  2001:4860::8:0:3427  166.834 ms  167.162 ms  174.794 ms
13  2001:4860::8:0:252d  208.572 ms  204.364 ms  197.993 ms
14  2001:4860::8:0:2b60  211.288 ms  272.287 ms  228.426 ms
15  2001:4860::2:0:bd  217.152 ms  217.620 ms  251.225 ms
16  2001:4860:0:1::57  209.444 ms  207.001 ms  215.820 ms
17  2001:4860:8005::5d  205.239 ms  205.740 ms  212.843 ms
What sort of packets were you using for your traceroute? The above was produced using ICMPv6 echo requests (I should probably update my code to support UDP and TCP as well).
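An ICMPv6 echo request like the ones kasperd's tool sends is cheap to build by hand; the only fiddly part is that the RFC 4443 checksum is computed over an IPv6 pseudo-header, not just the message itself. A minimal sketch in Python (illustrative only; the addresses in the usage example are documentation placeholders, and a real traceroute would also need a raw socket and root privileges):

```python
import socket
import struct

def icmpv6_checksum(src: str, dst: str, payload: bytes) -> int:
    """RFC 4443 checksum: ones-complement sum over an IPv6 pseudo-header
    (source, destination, upper-layer length, next header = 58) plus the
    ICMPv6 message, with the checksum field set to zero."""
    pseudo = (socket.inet_pton(socket.AF_INET6, src)
              + socket.inet_pton(socket.AF_INET6, dst)
              + struct.pack("!I", len(payload))
              + b"\x00\x00\x00" + bytes([58]))
    data = pseudo + payload
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def echo_request(src: str, dst: str, ident: int, seq: int) -> bytes:
    """Build an ICMPv6 echo request (type 128, code 0) with the checksum
    filled in."""
    msg = struct.pack("!BBHHH", 128, 0, 0, ident, seq)
    csum = icmpv6_checksum(src, dst, msg)
    return msg[:2] + struct.pack("!H", csum) + msg[4:]
```

A handy property for verification: recomputing the checksum over a packet that already carries a correct checksum yields zero.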

sttng

I'm using traceroute6 from the iputils-tracepath package on Ubuntu.  This is what I get running it on a Linux desktop:

$ sudo traceroute6 www.youtube.com
traceroute to youtube-ui.l.google.com (2001:4860:8005::be) from 2001:470:xxxx::5, 30 hops max, 24 byte packets
1  router.example.com (2001:470:xxxx::6)  0.365 ms  0.233 ms  0.232 ms
2  * * *
3  * * *


And here's what the router sees:

$ sudo tcpdump -i hurricane udp
tcpdump: WARNING: hurricane: no IPv4 address assigned
tcpdump: listening on hurricane
09:30:06.025559 desktop.example.com.43852 > pz-in-xbe.1e100.net.traceroute:  udp 24
09:30:11.030393 desktop.example.com.43852 > pz-in-xbe.1e100.net.traceroute:  udp 24
09:30:13.874601 desktop.example.com.58944 > pz-in-xbe.1e100.net.traceroute:  udp 24 [hlim 1]
09:30:18.871337 desktop.example.com.58944 > pz-in-xbe.1e100.net.traceroute:  udp 24 [hlim 1]
09:30:23.867150 desktop.example.com.58944 > pz-in-xbe.1e100.net.traceroute:  udp 24 [hlim 1]


I have no IPv6 firewall on the router currently:

$ sudo ip6tables -vL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination


It does seem a little odd that I can't traceroute; maybe I should look for a traceroute that uses ICMPv6.

kasperd

Quote from: sttng on June 07, 2012, 04:58:24 AM
Does seem a little odd that I can't traceroute, maybe I should look for a traceroute that does ICMPv6.

Not necessary. The problem is clearly very close to you. It looks like your tunnel is not working at all, and you just happened not to notice until the launch because you weren't using IPv6 for anything.

Try running tcpdump on the physical interface and look for protocol 41 and ICMP packets. On my gateway, where that is interface eth0, I would use: tcpdump -pni eth0 'proto 41 || icmp'
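The "protocol 41" part of that filter is just the protocol field in the outer IPv4 header: 6in4 (RFC 4213) encapsulates each IPv6 packet directly inside an IPv4 packet whose protocol number is 41. A small illustrative sketch of what the filter matches (helper names are made up, operating on raw packet bytes):

```python
PROTO_6IN4 = 41  # IPv4 protocol number for encapsulated IPv6 (RFC 4213)

def is_6in4(ipv4_packet: bytes) -> bool:
    """True if this IPv4 packet carries 6in4 traffic, i.e. its protocol
    field (byte 9 of the IPv4 header) is 41."""
    return len(ipv4_packet) >= 20 and ipv4_packet[9] == PROTO_6IN4

def inner_ipv6(ipv4_packet: bytes) -> bytes:
    """Strip the IPv4 header (the IHL nibble gives its length in 32-bit
    words) and return the encapsulated IPv6 packet."""
    ihl = (ipv4_packet[0] & 0x0F) * 4
    return ipv4_packet[ihl:]
```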

sttng

No, my tunnel is very clearly working.  As I said in my first post, I scored 10 out of 10 on my IPv6 test and I was able to successfully ping6 YouTube.  I have visited a number of sites that correctly report my IPv6 address and claim my browser prefers IPv6 over IPv4.  I just now visited ipv6.google.com as a test and it works.  I also regularly connect to an external IPv6-only site that doesn't use Hurricane Electric.  When I tcpdump the underlying interface I do use ip proto \ipv6 in my filter, but in that example I was dumping the IPv6 tunnel interface.

I'll take a closer look and see if I can track down the traceroute issue.  YouTube streaming seems a little better today, but even at 480p it buffered a couple of times, very briefly.  At 720p or above I'm better off leaving it paused for a while before playing.

kasperd

Quote from: sttng on June 09, 2012, 08:03:34 AM
No, my tunnel is very clearly working.
The traceroute output you showed indicates the packets don't even make it to the tunnel server. The most likely reasons for those packets not to make it to the tunnel server, while other packets do, would be:
  • Your router has a more specific route matching those packets
  • Your computer has multiple IPv6 addresses and chose a different one when contacting this address, which may cause the tunnel server to drop the packets.
It may also be that something is misconfigured in a way that causes symptoms pointing in the wrong direction. I noticed your reverse DNS is broken: your IPv6 address maps to a non-existent domain name. Try using the -n option for tcpdump to disable DNS lookups.

Running the tcpdump command I suggested, while doing traceroutes to multiple destinations, will certainly show something that could help track down the problem.

sttng

OK, I'm using traceroute6-nanog for this test.  I actually do get feedback from the last two hops, as can also be seen in the tcpdump, but no intermediate routers.  You can also see other IPv6 traffic operating correctly in the capture.  I took the tcpdump from the underlying IPv4 physical interface to the Internet.  Here's the traceroute:

$ traceroute6 www.youtube.com
traceroute to pz-in-x5d.1e100.net (2001:4860:8005::5d) from 2001:470:e9d7::5, port 33434, from port 43588, 30 hops max, 60 byte packets
1  joel.tallye.com (2001:470:e9d7::6)  0.323 ms  0.257 ms  0.320 ms
2  * * *
3  * * *
4  * * *
5  * * *
6  * * *
7  * * *
8  * * *
9  2001:4860:0:1::59 (2001:4860:0:1::59)  19.965 ms  22.363 ms  29.812 ms
10  pz-in-x5d.1e100.net (2001:4860:8005::5d)  19.561 ms  19.607 ms  27.564 ms


And the command I used on the router for the tcpdump:

$ sudo tcpdump -i eth2 -w traceroute6.pcap -s0 'ip proto \ipv6 && ! tcp'

The capture is stored at: http://www.north-winds.org/unix/traceroute6.pcap

kasperd

Quote from: sttng on June 09, 2012, 10:40:15 PMThe capture is stored at: http://www.north-winds.org/unix/traceroute6.pcap
You did not include ICMP packets as I suggested, so we cannot see what sort of ICMP errors you get back in response to some of the 6in4 packets you send. However, your pcap file did contain enough information for me to figure out why all your traceroutes look strange.

It is a bug, or perhaps a poorly implemented feature, in your tunnel endpoint. It will make any traceroute6 almost useless to you. The good news is that it doesn't break anything else.

What your tunnel endpoint does differently from most other 6in4 implementations is the way it constructs the IPv4 header. It does not use a fixed TTL value in that IPv4 header; rather, it copies the hop limit from the IPv6 header into the TTL field of the IPv4 header. Doing that could be a useful debugging feature, if a lot of other things were in place.

First of all, your tunnel endpoint would have to make use of the ICMP time exceeded messages it gets back and turn them into corresponding ICMPv6 time exceeded messages for the encapsulated IPv6 packet. Additionally, the HE tunnel server would have to not just remove the IPv4 header, but at that point replace the hop limit in the IPv6 header with the minimum of the IPv6 hop limit and the IPv4 TTL.

If both of the above were in place, then it would make sense for your tunnel endpoint to copy the hop limit to the TTL field. However none of the HE tunnel servers I have tested does what I describe, which means you are better off having your tunnel endpoint use a fixed value for the IPv4 TTL field.
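The difference described above can be sketched like this (illustrative Python only, not any real endpoint's code; the IPv4 checksum is left at zero for brevity, and the fixed TTL of 64 is an assumed default):

```python
import struct

def encapsulate_6in4(ipv6_packet: bytes, src4: bytes, dst4: bytes,
                     copy_hop_limit: bool = False) -> bytes:
    """Prepend a minimal IPv4 header (protocol 41) to an IPv6 packet.

    Most 6in4 endpoints use a fixed outer TTL.  The behaviour described
    in the post copies the IPv6 hop limit (byte 7 of the IPv6 header)
    into the IPv4 TTL instead, so a traceroute probe sent with a low hop
    limit expires inside the IPv4 path before reaching the tunnel server.
    """
    ttl = ipv6_packet[7] if copy_hop_limit else 64
    total_len = 20 + len(ipv6_packet)
    header = struct.pack("!BBHHHBBH4s4s",
                         0x45, 0, total_len,  # version/IHL, TOS, total length
                         0, 0,                # identification, flags/fragment
                         ttl, 41, 0,          # TTL, protocol 41, checksum (omitted)
                         src4, dst4)
    return header + ipv6_packet
```

With `copy_hop_limit=True`, an IPv6 packet sent with hop limit 1 produces an outer IPv4 TTL of 1, which is exactly why the intermediate hops in the traceroutes above never answer.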

The lack of ICMPv6 errors for those packets that exceed the IPv4 TTL can have a few different causes:
  • The IPv4 router did not send an ICMP error at all
  • The IPv4 router sent an ICMP error that was too short to contain the full IPv6 header
  • Your tunnel endpoint simply cannot convert them.
Since we are talking about seven different routers, and none of them produced a useful result, I guess it is probably because your tunnel endpoint cannot convert the responses.

However, ultimately, fixing that problem in your tunnel endpoint won't help you much unless HE changes the tunnel server as well. So in order to make traceroute useful, you would have to change your tunnel endpoint's setup to use a fixed TTL. Now we should probably get back to the original problem and see if we can debug it without needing traceroute.

kasperd

Getting back to the slow streaming, we don't have much to work with. We know why the traceroutes looked strange, but once I found the reason for that, it turned out to be something that doesn't affect the streaming performance. So all we know is that the streaming is slow, and the performance varies from day to day.

I suppose you want to find out if this is a problem with the network connectivity or a problem with the server being overloaded. The packets in the TCP stream will look different in the two cases. You should be able to tell if the slowness is due to the application on the sending end not handing data down to the TCP stack fast enough, or if it is due to congestion or packet drops on the network.

First of all, I think we need to know if ECN is enabled on this connection. Not because that affects performance enough to be of significance, but because the symptoms to look for are different depending on whether it is enabled or not. If ECN is enabled you should look for how often packets you receive are flagged. If ECN is disabled you should look for how often packets you receive are out of order. If one packet is dropped on the way to you, you'll see later packets still arriving and the lost packet eventually retransmitted.
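For anyone checking a capture by hand: the ECN bits live in the IPv6 traffic class, which straddles the first two bytes of the header (low nibble of byte 0, high nibble of byte 1). A sketch (helper names are made up):

```python
def ecn_bits(ipv6_header: bytes) -> int:
    """Extract the two ECN bits from an IPv6 header.  The 8-bit traffic
    class is split across bytes 0 and 1; ECN is its two least
    significant bits (RFC 3168)."""
    traffic_class = ((ipv6_header[0] & 0x0F) << 4) | (ipv6_header[1] >> 4)
    return traffic_class & 0x03

def congestion_experienced(ipv6_header: bytes) -> bool:
    """0b00 = not ECN-capable, 0b01/0b10 = ECN-capable transport,
    0b11 = congestion experienced (CE)."""
    return ecn_bits(ipv6_header) == 0b11
```

The equivalent tcpdump filter for CE-marked packets should be something like `(ip6[1] & 0x30) = 0x30`, though the exact expression is worth verifying against the pcap-filter documentation.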

Additionally having a ping running in the background while the streaming is going on will tell you something about roundtrip time and packet loss on the connection. The ping output is easier to analyse than the tcpdump.

As for what you could do to force the traffic to use IPv4 instead, I think the easiest place to do something about it is on your recursive name servers. Alas, I don't know of any recursive name servers with a feature to delete AAAA records from replies for specific domains. Another approach is to create a packet filter that rejects TCP SYN packets to those IPv6 addresses with TCP reset packets.

Neither of those workarounds is a good solution, but I think a really good solution requires some nontrivial code in the playback client. It is no longer sufficient to just try IPv4 and IPv6 in parallel and use the connection that completes the handshake first; the client needs to be able to switch to another connection in the middle of playback if the server it is using cannot keep up. If this is a general problem, the monitoring of capacity on the streaming servers needs to be improved. Any real solution requires action from Google; anything you could do yourself would simply be a workaround.

I have observed performance problems myself, but in my case the symptoms point at poor Wi-Fi connectivity; it just happens that YouTube streaming is what pushes my Wi-Fi connection to its peak.