
High packet loss through tunnelbroker tunnel.

Started by marioxcc, January 14, 2014, 01:45:47 PM


marioxcc

Hello.

I'm using the IPv6 tunnel service (Normal, /64) over my residential ADSL connection to get IPv6 connectivity, as my ISP doesn't provide IPv6 service yet. I have noticed increased packet loss through the tunnel. Here is one example:

mario@mario-laptop:~$ sudo ping -c 200 -i 0.05 -s 1400 -n -q he.net
PING he.net (216.218.186.2) 1400(1428) bytes of data.

--- he.net ping statistics ---
200 packets transmitted, 200 received, 0% packet loss, time 10450ms
rtt min/avg/max/mdev = 135.535/217.233/502.044/71.940 ms, pipe 10
mario@mario-laptop:~$ sudo ping6 -c 200 -i 0.05 -s 1400 -n -q he.net
PING he.net(2001:470:0:76::2) 1400 data bytes

--- he.net ping statistics ---
200 packets transmitted, 154 received, 23% packet loss, time 10410ms
rtt min/avg/max/mdev = 168.243/273.917/441.806/61.529 ms, pipe 9


Measurements in a different moment and to a different host:

mario@mario-laptop:~$ sudo ping -c 100 -i 0.05 -s 1400 -n -q en.wikipedia.org
PING text-lb.eqiad.wikimedia.org (208.80.154.224) 1400(1428) bytes of data.

--- text-lb.eqiad.wikimedia.org ping statistics ---
100 packets transmitted, 90 received, 10% packet loss, time 5360ms
rtt min/avg/max/mdev = 180.385/255.738/394.109/46.845 ms, pipe 7
mario@mario-laptop:~$ sudo ping6 -c 100 -i 0.05 -s 1400 -n -q en.wikipedia.org
PING en.wikipedia.org(2620:0:861:ed1a::1) 1400 data bytes

--- en.wikipedia.org ping statistics ---
100 packets transmitted, 57 received, 43% packet loss, time 5225ms
rtt min/avg/max/mdev = 752.173/1180.984/1542.387/222.351 ms, pipe 29


The computer is connected to the modem via 802.11g. I would by far prefer a wired connection, but that's currently not possible. The wireless link is very unpredictable: sometimes it has a very high packet loss ratio and sometimes not, but the 6in4 tunnel is almost always less reliable than native IPv4 regardless. Also, even though the modem advertises an MTU of 1500 via DHCP, the actual MTU to the rest of the Internet is 1492. Sometimes I get an average RTT above 1 s through the tunnel, as demonstrated in the second test set. I'm using Debian GNU/Linux. Is there something I can do to fix this?
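For reference, the 1492 figure can be double-checked with a don't-fragment ping: 1464 bytes of ICMP payload plus 28 bytes of IPv4 and ICMP headers is exactly 1492, so the first command below should go through while the second should fail with a "Frag needed" error.

ping -M do -c 3 -s 1464 he.net
ping -M do -c 3 -s 1465 he.net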

Regards and thanks in advance.

marioxcc

Hi. I no longer get the outrageous packet loss (I'm not sure what changed), but the high latency is still there, especially when I'm using the tunnel. (When my Internet connection is almost idle, the latency to the server through the tunnel is quite reasonable.)

kasperd

I tried to do a traceroute to your computer, but it appears to be offline. Could you post the output of a traceroute to the tunnel server (216.218.224.42 as far as I can tell) and a traceroute6 to he.net?

marioxcc

Quote from: kasperd on January 18, 2014, 02:01:48 AM
I tried to do a traceroute to your computer, but it appears to be offline. Could you post the output of a traceroute to the tunnel server (216.218.224.42 as far as I can tell) and a traceroute6 to he.net?

Sure. Here are the traceroutes over native IPv4, both with TCP and with protocol 41. As you can see, it seems that protocol 41 is treated somewhat differently.

root@mario-laptop:~# traceroute -T -z 0.1 216.218.224.42
traceroute to 216.218.224.42 (216.218.224.42), 30 hops max, 60 byte packets
1  dsldevice.lan (192.168.1.254)  11.890 ms  5.647 ms  5.602 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  24.934 ms  25.844 ms  25.741 ms
3  bb-la-grand-5-tge0-7-0-10.uninet.net.mx (189.246.218.38)  78.791 ms  77.281 ms  81.261 ms
4  * if-3-2.icore1.EQL-LosAngeles.as6453.net (206.82.129.101)  80.079 ms  73.035 ms
5  if-4-28.tcore2.LVW-LosAngeles.as6453.net (216.6.84.53)  73.040 ms  73.989 ms  78.967 ms
6  if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  72.063 ms  73.039 ms  74.004 ms
7  66.110.59.66 (66.110.59.66)  73.804 ms  71.842 ms  73.848 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  91.845 ms  91.795 ms  91.843 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  97.846 ms  98.778 ms  86.710 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  89.710 ms  88.587 ms  88.575 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  84.642 ms  84.644 ms  84.436 ms
12  tserv1.dal1.he.net (216.218.224.42)  84.187 ms  85.196 ms  84.217 ms

root@mario-laptop:~# traceroute -P 41 -z 0.1 216.218.224.42
traceroute to 216.218.224.42 (216.218.224.42), 30 hops max, 60 byte packets
1  dsldevice.lan (192.168.1.254)  16.169 ms  11.595 ms  11.590 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  25.259 ms  25.271 ms  27.340 ms
3  bb-la-grand-5-tge0-7-0-10.uninet.net.mx (189.246.218.38)  81.319 ms  80.196 ms  76.257 ms
4  * if-3-2.icore1.EQL-LosAngeles.as6453.net (206.82.129.101)  73.179 ms *
5  * * *
6  * if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  72.594 ms  73.270 ms
7  66.110.59.66 (66.110.59.66)  73.200 ms  73.453 ms  73.419 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  86.554 ms  99.205 ms  98.282 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  88.227 ms  87.315 ms  87.354 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  90.283 ms  89.176 ms  89.166 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  85.264 ms  85.379 ms  85.331 ms
12  * * *
13  * * *
14  * * *
15  * * *^C

Here is a traceroute to the server I'm using:

root@mario-laptop:~# traceroute -T -z 0.1 184.105.253.10
traceroute to 184.105.253.10 (184.105.253.10), 30 hops max, 60 byte packets
1  * * dsldevice.lan (192.168.1.254)  52.182 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  27.984 ms  27.004 ms  25.535 ms
3  bb-la-grand-5-tge0-3-0-4.uninet.net.mx (201.125.48.174)  74.710 ms  78.613 ms  74.688 ms
4  if-3-2.icore1.EQL-LosAngeles.as6453.net (206.82.129.101)  83.520 ms  73.675 ms  81.602 ms
5  if-4-28.tcore2.LVW-LosAngeles.as6453.net (216.6.84.53)  72.423 ms  74.431 ms  73.548 ms
6  if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  76.382 ms  72.447 ms  76.420 ms
7  66.110.59.66 (66.110.59.66)  72.528 ms  72.526 ms  73.639 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  85.347 ms  87.314 ms  86.346 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  85.365 ms  87.341 ms  86.413 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  97.229 ms  98.266 ms  97.477 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  85.390 ms  85.049 ms  85.440 ms
12  184.105.253.10 (184.105.253.10)  87.948 ms  82.847 ms  84.521 ms

root@mario-laptop:~# traceroute -P 41 -z 0.1 -m 15 184.105.253.10
traceroute to 184.105.253.10 (184.105.253.10), 15 hops max, 60 byte packets
1  * dsldevice.lan (192.168.1.254)  98.985 ms  94.715 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  25.613 ms  26.117 ms  25.243 ms
3  bb-la-grand-5-tge0-3-0-4.uninet.net.mx (201.125.48.174)  76.012 ms  73.650 ms  77.900 ms
4  * * if-3-2.icore1.EQL-LosAngeles.as6453.net (206.82.129.101)  80.395 ms
5  if-4-28.tcore2.LVW-LosAngeles.as6453.net (216.6.84.53)  72.402 ms  72.239 ms *
6  if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  71.959 ms  74.273 ms  74.329 ms
7  66.110.59.66 (66.110.59.66)  71.150 ms  72.063 ms  72.088 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  87.102 ms  86.256 ms  88.223 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  86.064 ms  87.235 ms  87.242 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  93.274 ms  89.121 ms  88.994 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  85.180 ms  96.430 ms *
12  10ge5-3.core1.dal1.he.net (184.105.222.78)  3355.566 ms * *
13  * * *
14  * * *
15  * * *

Traceroute to he.net:

traceroute to he.net (216.218.186.2), 30 hops max, 60 byte packets
1  dsldevice.lan (192.168.1.254)  22.158 ms  17.394 ms  17.680 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  25.476 ms  25.278 ms  26.204 ms
3  bb-la-grand-5-tge0-3-0-4.uninet.net.mx (201.125.48.174)  77.452 ms  76.153 ms  72.146 ms
4  Vlan553.icore1.EQL-LosAngeles.as6453.net (206.82.129.73)  73.395 ms  83.034 ms *
5  * * *
6  if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  73.106 ms  72.120 ms  72.128 ms
7  66.110.59.66 (66.110.59.66)  72.140 ms  71.371 ms  70.946 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  86.900 ms  87.072 ms  86.887 ms
9  100ge15-1.core1.sjc2.he.net (184.105.223.249)  95.722 ms  94.707 ms  93.792 ms
10  10ge1-1.core1.fmt1.he.net (72.52.92.109)  104.464 ms  97.580 ms  97.610 ms
11  he.net (216.218.186.2)  98.301 ms  98.282 ms  95.614 ms

Ping to the 6in4 server, both through the tunnel and outside it, with my Internet connection otherwise unloaded (my original post contains a similar test under load, which demonstrates the problem):

root@mario-laptop:~# ping6 -s 1400 -c 200 -i 0.05 -n -q 2001:470:1f0e:a81::1
PING 2001:470:1f0e:a81::1(2001:470:1f0e:a81::1) 1400 data bytes

--- 2001:470:1f0e:a81::1 ping statistics ---
200 packets transmitted, 200 received, 0% packet loss, time 10104ms
rtt min/avg/max/mdev = 124.495/127.390/168.189/4.300 ms, pipe 4
root@mario-laptop:~# ping -s 1400 -c 200 -i 0.05 -n -q 184.105.253.10
PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.

--- 184.105.253.10 ping statistics ---
200 packets transmitted, 196 received, 2% packet loss, time 10158ms
rtt min/avg/max/mdev = 123.072/130.268/218.439/14.033 ms, pipe 5


Regards.

marioxcc

Here is a repetition of the last test, but under slight load:

root@mario-laptop:~# ping6 -s 1400 -c 200 -i 0.05 -n -q 2001:470:1f0e:a81::1 ; ping -s 1400 -c 200 -i 0.05 -n -q 184.105.253.10
PING 2001:470:1f0e:a81::1(2001:470:1f0e:a81::1) 1400 data bytes

--- 2001:470:1f0e:a81::1 ping statistics ---
200 packets transmitted, 127 received, 36% packet loss, time 10526ms
rtt min/avg/max/mdev = 125.385/408.957/1668.243/437.315 ms, pipe 32
PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.

--- 184.105.253.10 ping statistics ---
200 packets transmitted, 200 received, 0% packet loss, time 10444ms
rtt min/avg/max/mdev = 124.434/166.468/347.902/29.103 ms, pipe 7

kasperd

Quote from: marioxcc on January 18, 2014, 07:41:09 AM
As you can see, it seems that protocol 41 is treated somewhat differently.
I don't see any difference to speak of. At hops 4 through 6 a few responses were missing, but a few missing responses from intermediate hops in a traceroute, though mildly annoying while debugging, are rarely a sign of a real communication problem. From hop 7 onwards, the two traces look the same again.

At the final hop, you do see a difference. The TCP packets get a reply (most likely an RST), but the protocol 41 packets get no reply. The lack of reply to the protocol 41 packets is most likely because the tunnel server actually receives them and then discards them, since they do not contain a valid IPv6 payload. Additional debugging could be performed with a tool that produces valid protocol 41 packets and varies both the inner and outer hop limits.
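Lacking such a dedicated packet generator, a crude approximation is possible with standard tools: traceroute6 run through the tunnel already varies the inner (IPv6) hop limit, and the outer IPv4 TTL can be pinned with ip tunnel. The interface name he-ipv6 below is only an example; use whatever your sit tunnel is actually called, and restore your original TTL afterwards.

# vary the inner IPv6 hop limit through the tunnel
traceroute6 -n he.net
# pin the outer IPv4 TTL to a low value, then probe again
ip tunnel change he-ipv6 ttl 10
traceroute6 -n he.net
# put the outer TTL back to whatever it was before (often 255)
ip tunnel change he-ipv6 ttl 255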

From the traceroutes I'd conclude that the IPv4 path between you and 216.218.224.42 appears to be working just fine.

I don't know where the IP address 184.105.253.10 came from, because everything I see suggests you are using the tunnel server on 216.218.224.42. I'd still like to see traceroute6 output.

marioxcc

Quote from: kasperd on January 18, 2014, 08:47:30 AM
Quote from: marioxcc on January 18, 2014, 07:41:09 AM
As you can see, it seems that protocol 41 is treated somewhat differently.
I don't see any difference to speak of. At hops 4 through 6 a few responses were missing, but a few missing responses from intermediate hops in a traceroute, though mildly annoying while debugging, are rarely a sign of a real communication problem. From hop 7 onwards, the two traces look the same again.

At the final hop, you do see a difference. The TCP packets get a reply (most likely an RST), but the protocol 41 packets get no reply. The lack of reply to the protocol 41 packets is most likely because the tunnel server actually receives them and then discards them, since they do not contain a valid IPv6 payload. Additional debugging could be performed with a tool that produces valid protocol 41 packets and varies both the inner and outer hop limits.

Right.

Quote from: kasperd on January 18, 2014, 08:47:30 AM
From the traceroutes I'd conclude that the IPv4 path between you and 216.218.224.42 appears to be working just fine.

I don't know where the IP address 184.105.253.10 came from, because everything I see suggests you are using the tunnel server on 216.218.224.42. I'd still like to see traceroute6 output.

That's what is shown on the "Tunnel Details" page; it's the PoP nearest to me.

I forgot to include the IPv6 traceroute, sorry about that. Here it is:

traceroute to he.net (2001:470:0:76::2), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  85.605 ms  85.817 ms  86.765 ms
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  83.086 ms  82.048 ms  83.100 ms
3  10ge2-4.core1.phx2.he.net (2001:470:0:258::1)  100.592 ms  101.642 ms  102.301 ms
4  10ge15-6.core1.lax2.he.net (2001:470:0:24a::2)  121.997 ms  130.204 ms  117.602 ms
5  10ge2-1.core1.lax1.he.net (2001:470:0:72::1)  111.253 ms 10ge9-5.core1.sjc2.he.net (2001:470:0:16a::1)  127.743 ms  127.804 ms
6  10ge1-1.core1.fmt1.he.net (2001:470:0:2f::1)  123.123 ms 10ge4-2.core3.fmt2.he.net (2001:470:0:18d::1)  120.411 ms  132.444 ms
7  10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  127.336 ms he.net (2001:470:0:76::2)  122.156 ms 10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  120.267 ms


kasperd

Quote from: marioxcc on January 18, 2014, 09:27:17 AM
That's what is shown on the "Tunnel Details" page; it's the PoP nearest to me.
Does it also tell you the exact name of the tunnel server? The IP does not have any reverse DNS. You are probably using the correct IP address; it just doesn't look the way I'd expect it to from here.

Quote from: marioxcc on January 18, 2014, 09:27:17 AM
I forgot to include the IPv6 traceroute, sorry about that. Here it is:

traceroute to he.net (2001:470:0:76::2), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  85.605 ms  85.817 ms  86.765 ms
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  83.086 ms  82.048 ms  83.100 ms
3  10ge2-4.core1.phx2.he.net (2001:470:0:258::1)  100.592 ms  101.642 ms  102.301 ms
4  10ge15-6.core1.lax2.he.net (2001:470:0:24a::2)  121.997 ms  130.204 ms  117.602 ms
5  10ge2-1.core1.lax1.he.net (2001:470:0:72::1)  111.253 ms 10ge9-5.core1.sjc2.he.net (2001:470:0:16a::1)  127.743 ms  127.804 ms
6  10ge1-1.core1.fmt1.he.net (2001:470:0:2f::1)  123.123 ms 10ge4-2.core3.fmt2.he.net (2001:470:0:18d::1)  120.411 ms  132.444 ms
7  10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  127.336 ms he.net (2001:470:0:76::2)  122.156 ms 10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  120.267 ms
That trace looks fine. Could you run another traceroute6 the next time you experience high latency?

marioxcc

Quote from: kasperd on January 18, 2014, 09:57:01 AM
Quote from: marioxcc on January 18, 2014, 09:27:17 AM
That's what is shown on the "Tunnel Details" page; it's the PoP nearest to me.
Does it also tell you the exact name of the tunnel server? The IP does not have any reverse DNS. You are probably using the correct IP address; it just doesn't look the way I'd expect it to from here.

It doesn't show a name for the server. Both the tunnel creation and the tunnel details pages only state that it's located in Dallas and give its IP.

Quote from: kasperd on January 18, 2014, 09:57:01 AM
Quote from: marioxcc on January 18, 2014, 09:27:17 AM
I forgot to include the IPv6 traceroute, sorry about that. Here it is:

traceroute to he.net (2001:470:0:76::2), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  85.605 ms  85.817 ms  86.765 ms
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  83.086 ms  82.048 ms  83.100 ms
3  10ge2-4.core1.phx2.he.net (2001:470:0:258::1)  100.592 ms  101.642 ms  102.301 ms
4  10ge15-6.core1.lax2.he.net (2001:470:0:24a::2)  121.997 ms  130.204 ms  117.602 ms
5  10ge2-1.core1.lax1.he.net (2001:470:0:72::1)  111.253 ms 10ge9-5.core1.sjc2.he.net (2001:470:0:16a::1)  127.743 ms  127.804 ms
6  10ge1-1.core1.fmt1.he.net (2001:470:0:2f::1)  123.123 ms 10ge4-2.core3.fmt2.he.net (2001:470:0:18d::1)  120.411 ms  132.444 ms
7  10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  127.336 ms he.net (2001:470:0:76::2)  122.156 ms 10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  120.267 ms
That trace looks fine. Could you run another traceroute6 the next time you experience high latency?

Sure, here it is:
root@mario-laptop:~# traceroute -6 -T -z 0.1 he.net
traceroute to he.net (2001:470:0:76::2), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  787.008 ms  872.939 ms *
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  682.742 ms  644.077 ms *
3  * 10ge2-4.core1.phx2.he.net (2001:470:0:258::1)  910.696 ms *
4  * 10ge15-6.core1.lax2.he.net (2001:470:0:24a::2)  763.820 ms  874.954 ms
5  10ge9-5.core1.sjc2.he.net (2001:470:0:16a::1)  842.333 ms  927.687 ms *
6  10ge1-1.core1.fmt1.he.net (2001:470:0:2f::1)  968.217 ms  923.580 ms  1017.215 ms
7  10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  1001.242 ms  928.751 ms he.net (2001:470:0:76::2)  1062.293 ms


Additionally, here is a ping to the other side of the tunnel, both from outside and from inside:

root@mario-laptop:~# ping -q -c 100 -i 0.1 -n -s 1400 184.105.253.10
PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.

--- 184.105.253.10 ping statistics ---
100 packets transmitted, 93 received, 7% packet loss, time 10185ms
rtt min/avg/max/mdev = 127.467/219.350/500.283/82.390 ms, pipe 5

root@mario-laptop:~# ping6 -q -c 100 -i 0.1 -n -s 1400 2001:470:1f0e:a81::1
PING 2001:470:1f0e:a81::1(2001:470:1f0e:a81::1) 1400 data bytes

--- 2001:470:1f0e:a81::1 ping statistics ---
100 packets transmitted, 58 received, 42% packet loss, time 10330ms
rtt min/avg/max/mdev = 125.624/524.160/1856.687/549.140 ms, pipe 18


Thanks in advance. :)

kasperd

Quote from: marioxcc on January 19, 2014, 11:08:20 AM
Sure, here it is:
root@mario-laptop:~# traceroute -6 -T -z 0.1 he.net
traceroute to he.net (2001:470:0:76::2), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  787.008 ms  872.939 ms *
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  682.742 ms  644.077 ms *
3  * 10ge2-4.core1.phx2.he.net (2001:470:0:258::1)  910.696 ms *
4  * 10ge15-6.core1.lax2.he.net (2001:470:0:24a::2)  763.820 ms  874.954 ms
5  10ge9-5.core1.sjc2.he.net (2001:470:0:16a::1)  842.333 ms  927.687 ms *
6  10ge1-1.core1.fmt1.he.net (2001:470:0:2f::1)  968.217 ms  923.580 ms  1017.215 ms
7  10ge2-1.core1.fmt1.he.net (2001:470:0:2d::1)  1001.242 ms  928.751 ms he.net (2001:470:0:76::2)  1062.293 ms


Additionally, here is a ping to the other side of the tunnel, both from outside and from inside:

root@mario-laptop:~# ping -q -c 100 -i 0.1 -n -s 1400 184.105.253.10
PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.

--- 184.105.253.10 ping statistics ---
100 packets transmitted, 93 received, 7% packet loss, time 10185ms
rtt min/avg/max/mdev = 127.467/219.350/500.283/82.390 ms, pipe 5
Based on that data, I think it would be useful to see both an IPv4 traceroute to the tunnel server and an IPv6 traceroute to some other location, with both taken during a period of high latency. I think the previous IPv4 traceroute may have been taken at a time when the latency was low. Though I am starting to suspect we are going to find that the v4 and v6 sides of the tunnel server are both doing fine, and the tunnel server itself may turn out to be the bottleneck.

Did I already link to a previous posting where I demonstrated a method to measure packet drops in both directions more or less independently?

marioxcc

Quote from: kasperd on January 19, 2014, 03:49:44 PM
Based on that data, I think it would be useful to see both an IPv4 traceroute to the tunnel server and an IPv6 traceroute to some other location, with both taken during a period of high latency.

Here it is:

root@mario-laptop:~# traceroute -4 -z 0.1 -q 5 184.105.253.10
traceroute to 184.105.253.10 (184.105.253.10), 30 hops max, 60 byte packets
1  dsldevice.lan (192.168.1.254)  87.054 ms  73.946 ms  76.404 ms  67.444 ms  111.670 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  578.202 ms  521.356 ms  438.617 ms  367.886 ms  399.132 ms
3  bb-la-grand-5-pos0-14-5-0.uninet.net.mx (201.125.48.54)  711.621 ms  743.567 ms  751.809 ms  817.721 ms  947.365 ms
4  Vlan552.icore1.EQL-LosAngeles.as6453.net (206.82.129.65)  1047.977 ms * *  950.824 ms *
5  * if-4-28.tcore2.LVW-LosAngeles.as6453.net (216.6.84.53)  1001.343 ms * *  1089.047 ms
6  * if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  1117.035 ms  1032.564 ms  1091.645 ms *
7  66.110.59.66 (66.110.59.66)  996.043 ms  909.640 ms  813.775 ms *  806.654 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  923.627 ms  854.775 ms  858.981 ms  783.757 ms  843.165 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  917.600 ms  938.064 ms  774.540 ms  776.835 ms  749.166 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  657.399 ms  583.464 ms  756.133 ms  769.112 ms  816.184 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  888.140 ms  886.990 ms  968.302 ms  871.717 ms  786.502 ms
12  184.105.253.10 (184.105.253.10)  708.111 ms  786.203 ms  915.318 ms  1088.226 ms  1339.538 ms
root@mario-laptop:~# traceroute -6 -z 0.1 -q 5 en.wikipedia.org

traceroute to en.wikipedia.org (2620:0:861:ed1a::1), 30 hops max, 80 byte packets
1  marioxcc-1.tunnel.tserv8.dal1.ipv6.he.net (2001:470:1f0e:a81::1)  1202.242 ms  1143.032 ms  1203.120 ms  1164.949 ms  1129.623 ms
2  ge2-14.core1.dal1.he.net (2001:470:0:78::1)  1029.443 ms  1047.834 ms  960.127 ms  859.947 ms  825.462 ms
3  10ge5-4.core1.atl1.he.net (2001:470:0:1b6::2)  760.820 ms  1122.684 ms  924.842 ms  1030.541 ms  922.392 ms
4  10ge16-5.core1.ash1.he.net (2001:470:0:1b5::1)  1050.991 ms *  1006.159 ms  1034.359 ms  981.119 ms
5  xe-5-3-3-500.cr1-eqiad.wikimedia.org (2001:470:0:1c0::2)  1029.644 ms  963.992 ms  999.735 ms  980.714 ms  963.025 ms
6  text-lb.eqiad.wikimedia.org (2620:0:861:ed1a::1)  1052.966 ms  943.506 ms  914.954 ms  963.356 ms  883.566 ms


Quote from: kasperd on January 19, 2014, 03:49:44 PM
I think the previous IPv4 traceroute may have been taken at a time when the latency was low. Though I am starting to suspect we are going to find that the v4 and v6 sides of the tunnel server are both doing fine, and the tunnel server itself may turn out to be the bottleneck.

That's right, it was taken at a low-latency time. The high latency and packet loss correlate very well with high usage of my Internet connection (but only when the load is native, non-tunneled traffic). The high-latency and high-packet-loss problem doesn't happen when I saturate my Internet connection with traffic going almost solely through the 6in4 tunnel, meaning the server handles just fine as much traffic as I can push/pull (which makes sense, since this is a low-end residential DSL connection). I doubt that the 6in4 server happens to be overloaded exactly when my Internet connection is under medium/heavy use.

Quote from: kasperd on January 19, 2014, 03:49:44 PM
Did I already link to a previous posting where I demonstrated a method to measure packet drops in both directions more or less independently?

Well, you just did ;). How can this tool help in troubleshooting this problem?

Do you think it's time for me to write to ipv6@he.net?

Regards and thanks for your help.

kasperd

Quote from: marioxcc on January 20, 2014, 11:49:38 AM
root@mario-laptop:~# traceroute -4 -z 0.1 -q 5 184.105.253.10
traceroute to 184.105.253.10 (184.105.253.10), 30 hops max, 60 byte packets
1  dsldevice.lan (192.168.1.254)  87.054 ms  73.946 ms  76.404 ms  67.444 ms  111.670 ms
2  dsl-servicio-l200.uninet.net.mx (200.38.193.226)  578.202 ms  521.356 ms  438.617 ms  367.886 ms  399.132 ms
3  bb-la-grand-5-pos0-14-5-0.uninet.net.mx (201.125.48.54)  711.621 ms  743.567 ms  751.809 ms  817.721 ms  947.365 ms
4  Vlan552.icore1.EQL-LosAngeles.as6453.net (206.82.129.65)  1047.977 ms * *  950.824 ms *
5  * if-4-28.tcore2.LVW-LosAngeles.as6453.net (216.6.84.53)  1001.343 ms * *  1089.047 ms
6  * if-2-2.tcore1.LVW-LosAngeles.as6453.net (66.110.59.1)  1117.035 ms  1032.564 ms  1091.645 ms *
7  66.110.59.66 (66.110.59.66)  996.043 ms  909.640 ms  813.775 ms *  806.654 ms
8  hurricane-ic-138362-las-bb1.c.telia.net (213.248.67.142)  923.627 ms  854.775 ms  858.981 ms  783.757 ms  843.165 ms
9  10ge1-3.core1.lax2.he.net (72.52.92.122)  917.600 ms  938.064 ms  774.540 ms  776.835 ms  749.166 ms
10  10ge2-3.core1.phx2.he.net (184.105.222.85)  657.399 ms  583.464 ms  756.133 ms  769.112 ms  816.184 ms
11  10ge5-3.core1.dal1.he.net (184.105.222.78)  888.140 ms  886.990 ms  968.302 ms  871.717 ms  786.502 ms
12  184.105.253.10 (184.105.253.10)  708.111 ms  786.203 ms  915.318 ms  1088.226 ms  1339.538 ms
This is not what a traceroute is supposed to look like. At hop 2 you have latencies between 360 and 580 ms. And it does not improve at later hops. Your good traces from earlier showed less than 100 ms at every hop along a similar route.

What we are seeing here is a problem on the v4 path, and that problem is located close to your network. The problem might even be on your own router or modem, though that is not certain. The problems you are seeing could be caused by a combination of buffer bloat on your modem and flooding of your upstream. If that is not the culprit, you should ask your ISP to explain why you see such huge latency spikes.

If the problem is due to buffer bloat, you should look into using traffic shaping on your router. If your router can be configured to do traffic shaping and cap the upstream at 95% of the actual upstream provided by the ISP, you should be able to avoid problems caused by buffer bloat on the modem.
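For a Linux-based router, a minimal sketch of such a cap could be a token bucket filter on the interface facing the modem. The interface name eth0 and the roughly 1 Mbit/s upstream below are assumptions; substitute your own interface and 95% of your real upstream rate.

# cap outgoing traffic just below the ISP's upstream rate
tc qdisc add dev eth0 root tbf rate 950kbit burst 10k latency 50ms
# remove the cap again
tc qdisc del dev eth0 root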

Quote from: marioxcc on January 20, 2014, 11:49:38 AM
The high latency and packet loss correlate very well with high usage of my Internet connection (but only when the load is native, non-tunneled traffic). The high-latency and high-packet-loss problem doesn't happen when I saturate my Internet connection with traffic going almost solely through the 6in4 tunnel, meaning the server handles just fine as much traffic as I can push/pull
Actually it may be the other way around. If the bandwidth bottleneck is on the tunnel server (or on the v4 path between your ISP and the tunnel server), that bottleneck may cause TCP to scale down its bandwidth usage and thereby avoid overloading the modem.

As TCP is increasing bandwidth usage, it will eventually hit a bottleneck on some hop along the route. Once a bottleneck is hit, a buffer will start filling up. If that buffer is too large compared to the bandwidth on that hop, you will see latency increase. But if you can somehow get the bandwidth bottleneck to be smaller on some other hop on the path, then it is a different buffer that fills up. Assuming that buffer has a more reasonable size, performance can actually be improved by reducing throughput on some hop.

In other words, by running most of your traffic through the tunnel you might be moving the bottleneck from your modem to some other place and thus avoiding the problem. However, if this is really what is happening, then moving the bottleneck from your modem to your router may be a better solution. That is why I suggest capping your router at 95% of the capacity of the modem, if the router can do so.

Quote from: marioxcc on January 20, 2014, 11:49:38 AM
Quote from: kasperd on January 19, 2014, 03:49:44 PM
Did I already link to a previous posting where I demonstrated a method to measure packet drops in both directions more or less independently?

Well, you just did ;). How can this tool help in troubleshooting this problem?
If you have a Linux host on your LAN, you can download the client and let it communicate with the server IP which I mentioned in that post. It will measure packet drops on just the downstream (ignoring packet drops on the upstream). If my hunch is right, you should see a much lower packet drop rate on the downstream than on the upstream.
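If you would rather use an off-the-shelf tool instead, a UDP iperf3 run gives a roughly comparable per-direction picture, since the receiving side counts the lost datagrams. This assumes a remote host you control that can run iperf3; the 1 Mbit/s rate is only an example.

# on the remote host
iperf3 -s
# upstream loss: the remote host receives and reports lost/total datagrams
iperf3 -c <remote host> -u -b 1M -t 30
# downstream loss: -R reverses the direction so your machine receives
iperf3 -c <remote host> -u -b 1M -t 30 -R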

Quote from: marioxcc on January 20, 2014, 11:49:38 AM
Do you think it's time for me to write to ipv6@he.net?
Based on your latest traceroute, I think the problem is on your end, in which case HE couldn't help you.

marioxcc

Hi.

The problem with the hypothesis that buffer bloat is what causes this is that it fails to explain why 6in4 traffic is treated differently from the rest. For instance, in the following test, which intentionally puts more traffic on the connection than it can handle, the IPv6 ping reports far worse statistics than the native IPv4 test; the Internet connection was otherwise unloaded (save for occasional traffic from NTP, etc.). Note that the two commands were run in parallel.

root@mario-laptop:/home/mario# ping6 -q -s 1400 -i 0.02 -c 400 2001:470:1f0e:a81::1 & ping -q -s 1400 -i 0.02 -c 400 184.105.253.10 &
[1] 5167
[2] 5168
root@mario-laptop:/home/mario# PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.
PING 2001:470:1f0e:a81::1(2001:470:1f0e:a81::1) 1400 data bytes

--- 184.105.253.10 ping statistics ---
400 packets transmitted, 290 received, 27% packet loss, time 9946ms
rtt min/avg/max/mdev = 120.460/263.358/522.778/91.864 ms, pipe 22

--- 2001:470:1f0e:a81::1 ping statistics ---
400 packets transmitted, 44 received, 89% packet loss, time 11033ms
rtt min/avg/max/mdev = 154.642/1583.399/6152.360/1307.172 ms, pipe 220


Also, it seems that sometimes running a BitTorrent client, even with marginal bandwidth usage (but several native IPv4 TCP connections and UDP connection-like usage from µTP), will make IPv6 performance fall.

I can only conjecture about the reason for this weird behavior, but I suspect that my ISP may throttle protocol-41 traffic when it notices congestion. However, as I said, when almost all traffic goes through the IPv6 tunnel there is no performance degradation (a few ms more for ping, but no packet loss or anything like that; the tunnel doesn't seem to be the bottleneck either, since the traffic takes up all the contracted bandwidth).

Do you know how I may tunnel the 6in4 tunnel itself through an OpenVPN tunnel so as to avoid protocol-specific interference from my ISP? I think doing this and then repeating the test may give further insight into the root of this problem.

Thanks for your help.

kasperd

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
The problem with the hypothesis that buffer bloat is what causes this is that it fails to explain why 6in4 traffic is treated differently from the rest.
The delay happens so early in the trace that it can more or less only be explained by buffering. If there is a measurable difference between the protocols, it most likely means that the device doing the buffering uses multiple queues. So what you want to find out next may be how many queues there are, what criteria decide which queue a packet goes into, and perhaps whether buffer memory is allocated statically to each queue or shared dynamically between them.

I recently noticed on my own connection that queues appeared to be split by IPv4 address. My protocol 41 traffic was using a different IPv4 address than my other traffic, and since I was mostly pushing 6in4 traffic, it was only IPv6 traffic that got slowed down while IPv4 traffic went through unaffected.

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
For instance, in the following test, which intentionally puts more traffic on the connection than it can handle, the IPv6 ping reports far worse statistics than the native IPv4 test; the Internet connection was otherwise unloaded (save for occasional traffic from NTP, etc.). Note that the two commands were run in parallel.

root@mario-laptop:/home/mario# ping6 -q -s 1400 -i 0.02 -c 400 2001:470:1f0e:a81::1 & ping -q -s 1400 -i 0.02 -c 400 184.105.253.10 &
[1] 5167
[2] 5168
root@mario-laptop:/home/mario# PING 184.105.253.10 (184.105.253.10) 1400(1428) bytes of data.
PING 2001:470:1f0e:a81::1(2001:470:1f0e:a81::1) 1400 data bytes

--- 184.105.253.10 ping statistics ---
400 packets transmitted, 290 received, 27% packet loss, time 9946ms
rtt min/avg/max/mdev = 120.460/263.358/522.778/91.864 ms, pipe 22

--- 2001:470:1f0e:a81::1 ping statistics ---
400 packets transmitted, 44 received, 89% packet loss, time 11033ms
rtt min/avg/max/mdev = 154.642/1583.399/6152.360/1307.172 ms, pipe 220
That does look strange. There is no obvious explanation for those numbers.

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
Also, it seems that sometimes running a BitTorrent client, even with marginal bandwidth usage (but several native IPv4 TCP connections and UDP connection-like usage from µTP), will make IPv6 performance fall.
Possibly connection tracking plays a role in this picture, but this is just a random guess.

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
I can only conjecture about the reason for this weird behavior, but I suspect that my ISP may throttle protocol-41 traffic when it notices congestion. However, as I said, when almost all traffic goes through the IPv6 tunnel there is no performance degradation (a few ms more for ping, but no packet loss or anything like that; the tunnel doesn't seem to be the bottleneck either, since the traffic takes up all the contracted bandwidth).
The traffic may be put into different priority bands. That would explain why one queue sees more latency and packet loss than another; the queues might simply not have the same priority. The queue used for your protocol 41 packets does exhibit symptoms of buffer bloat, and when there are multiple queues, it is entirely possible that not all of them are affected by buffer bloat.

For bulk traffic like BitTorrent transfers, a bit of buffer bloat is not bad, as long as interactive traffic doesn't have to go through the same queue. So the problem may be misclassification of traffic rather than bloat per se.

If your protocol 41 traffic gets classified as bulk traffic while other customers' traffic gets classified as interactive and adversely affects your protocol 41 traffic, then you should complain loudly. OTOH, if the problem only kicks in once you reach the bandwidth limit specified in your subscription, and you can avoid problems by staying within that limit, then it would be more productive to stay within that limit. If that is the case, then my previous suggestion should still work just the same. By controlling where the bottleneck is and how traffic is prioritized at the bottleneck, you can get a better result. Once you have moved the bottleneck to where you can control it, it no longer matters how the previous bottleneck would treat traffic.

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
Do you know how I may tunnel the 6in4 tunnel itself through an OpenVPN tunnel so as to avoid protocol-specific interference from my ISP?
I do not have that much experience with VPNs. In principle what you ask for should be easy; in practice it may very well depend on the VPN implementation. If you have the VPN and the 6in4 tunnel both able to pass packets, I can try to come up with suggestions on how to get traffic routed the way you want it. VPN and 6in4 could end up next to each other, with VPN inside 6in4, or with 6in4 inside VPN. The routing table has a lot to say in which of those will be the case.
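As a rough sketch of the 6in4-inside-VPN case (the names are assumptions: tun0 for the OpenVPN interface, he-ipv6 for the sit tunnel, and 184.105.253.10 as the tunnel server's IPv4 address):

# route the encapsulating IPv4 packets for the tunnel server through the VPN
ip route add 184.105.253.10 dev tun0
# have the sit tunnel source its packets from the VPN-assigned address
ip tunnel change he-ipv6 local <VPN-assigned IPv4 address>

Keep in mind the tunnel server checks the source address of the protocol 41 packets it receives, so the client IPv4 endpoint configured on the tunnel details page would have to match whatever address your packets appear to come from after they leave the VPN.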

Quote from: marioxcc on January 22, 2014, 02:28:05 PM
I think doing this and then repeating the test may give further insight into the root of this problem.
Sure, the information you could find by experimenting with 6in4 inside VPN could potentially tell us something more about the root of the problem.