For some weeks now I've been observing performance issues with a tunnel to tserv6.fra1.ipv6.he.net. In the evening, latency increases a lot (30 ms to 120 ms, see attached graph) and bandwidth drops (download from ~30 MBit to ~1 MBit).
As this looks more like a problem with my ISP, I tested native IPv4 performance (downloading files over different ports, uploading via SSH), but everything seems fine there. The ping RTT to the tunnel server via IPv4 is also normal, whereas a traceroute over IPv6 already shows increased latency at the first hop. That's why I'm a bit confused.
I've already sent this via mail but have only gotten an automated reply from the ticket system so far. Does anybody have an idea how to analyse whether this is my ISP's fault or a problem on the tunnel server side? Thanks...
Quote from: dstest01 on December 08, 2013, 09:52:26 AM
I've already sent this via mail but have only gotten an automated reply from the ticket system so far. Does anybody have an idea how to analyse whether this is my ISP's fault or a problem on the tunnel server side?
I am also using fra1, and I have not observed this problem. I am using thinkbroadband.com/ping to graph the latency of my tunnel on fra1, and the only times I see a significant increase in latency are when I max out my v4 upstream.
I am using my own v6 stack, which will not send packets through the tunnel server if it knows a more direct path. So if a latency issue only affects packets in one direction, I might not notice it. I'll try disabling that feature for a couple of days to see whether that means I'll experience the same latency increase that you do.
Quote from: kasperd on December 08, 2013, 10:42:40 AM
I'll try disabling that feature for a couple of days to see whether that means I'll experience the same latency increase that you do.
That increased my roundtrip time to thinkbroadband from approximately 35 ms to approximately 39 ms. Also, if I ping google.de, I don't see roundtrip times nearly as high as yours. But Google has lots of frontends; there is no guarantee we measured towards the same datacenter.
What IPv6 address are you measuring? And what is the IPv6 address of your host?
Quote from: kasperd on December 08, 2013, 11:09:02 AM
What IPv6 address are you measuring? And what is the IPv6 address of your host?
It's the same for other hosts; the graphs for heise.de and another private server look pretty much the same.
The address of the local tunnel endpoint is 2001:470:1f0a:751::2.
Quote from: dstest01 on December 08, 2013, 12:22:03 PM
The address of the local tunnel endpoint is 2001:470:1f0a:751::2.
I just realized that the packets I sent weren't going through the tunnel server anyway, so my previous measurements were not exactly valid. The ISP I am using has a router which, under certain circumstances, will decapsulate protocol 41 packets and forward the IPv6 packet directly onto the IPv6 backbone instead of forwarding the tunnelled packet to HE. I do know what to change in my configuration to avoid that, though, so this time I really am measuring roundtrips with packets going through the tunnel server in both directions. Fixing that increased my roundtrip to thinkbroadband by another 6 ms or so, but that's nowhere near the latency you are seeing.
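For anyone following along: "protocol 41" is plain 6in4 encapsulation, i.e. a complete IPv6 packet carried directly as the payload of an IPv4 packet whose protocol field is set to 41. A minimal sketch of what such a packet looks like on the wire (illustrative only; the checksum is left at zero and the inner IPv6 header is just a placeholder):

```python
import struct
import socket

def ipv4_header(src: str, dst: str, payload_len: int, proto: int = 41) -> bytes:
    """Build a minimal 20-byte IPv4 header; proto=41 marks the payload as IPv6 (6in4)."""
    version_ihl = (4 << 4) | 5            # IPv4, 5 * 4 = 20-byte header
    total_len = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_len,
        0, 0,                             # identification, flags/fragment offset
        64, proto, 0,                     # TTL, protocol, checksum (left 0 in this sketch)
        socket.inet_aton(src), socket.inet_aton(dst),
    )

# A tunnelled packet from the client address in this thread to tserv1.fra1.he.net
inner_v6 = bytes(40)                      # placeholder for a 40-byte IPv6 header
packet = ipv4_header("188.195.5.194", "216.66.80.30", len(inner_v6)) + inner_v6

print(packet[9])                          # byte 9 of the IPv4 header is the protocol: 41
```

Routers along the path (and the decapsulating ISP router described above) classify these packets purely by that protocol byte, which is why they can be treated differently from ICMP pings.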
I am still not able to reproduce the high latency on my tunnel. I am, however, able to see the high latency on a traceroute to your IP address, so I can confirm your observation of high latency on your first hop. It looks like the problem must be on the v4 side of the tunnel server, in other words somewhere on the IPv4 path between the tunnel server and your tunnel endpoint.
A traceroute from my network to your IPv6 address looks like this:
traceroute to 2001:470:1f0a:751::2 (2001:470:1f0a:751::2), 30 hops max, 80 byte packets
1 2001:470:1f0b:1da2:635a:c32:ae34:df91 0.224 ms 0.127 ms 0.187 ms
2 2001:470:1f0a:1da2::1 44.547 ms 47.658 ms 50.871 ms
3 2001:470:1f0a:751::2 176.363 ms 166.157 ms 171.574 ms
Traceroute from Hetzner to your IPv6 address ends like this:
7 2a01:4f8:0:3::6 5.279 ms 5.286 ms 5.271 ms
8 2001:7f8::1b1b:0:1 5.337 ms 5.293 ms 16.376 ms
9 2001:470:0:69::2 8.303 ms 8.009 ms 8.065 ms
10 2001:470:1f0a:751::2 131.653 ms 131.508 ms 131.869 ms
For comparison, a traceroute to my IPv6 address on the same tunnel server:
7 2a01:4f8:0:3::6 5.266 ms 5.271 ms 5.274 ms
8 2001:7f8::1b1b:0:1 5.328 ms 13.122 ms 5.248 ms
9 2001:470:0:69::2 8.534 ms 7.757 ms 7.901 ms
10 2001:470:1f0b:1da2:635a:c32:ae34:df91 55.921 ms 50.602 ms 62.796 ms
11 2001:470:1f0b:1da2:635a:c32:ae34:df91 42.278 ms 46.365 ms 41.18 ms
The following pieces of information could be useful in narrowing down the problem:
- A traceroute from your tunnel endpoint to 216.66.80.30
- A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).
- A traceroute from your network to an external IPv6 address.
- Your public IPv4 address (such that I can send protocol 41 packets directly to you and get echo replies through the tunnel server.)
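In case it helps anyone reading later, the first three requests above translate roughly into these commands (a sketch assuming Linux with the standard traceroute package; -P 41 typically requires root):

```shell
# IPv4 traceroute from the tunnel endpoint to the tunnel server
traceroute -n 216.66.80.30

# The same path probed with protocol 41 (6in4) packets, i.e. what the tunnel traffic sees
sudo traceroute -n -P 41 216.66.80.30

# IPv6 traceroute through the tunnel to an external host
traceroute6 -n heise.de
```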
Quote from: kasperd on December 08, 2013, 02:36:19 PM
- A traceroute from your tunnel endpoint to 216.66.80.30
traceroute to 216.66.80.30 (216.66.80.30), 30 hops max, 60 byte packets
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 34.072 ms 29.301 ms 30.283 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 35.860 ms 35.397 ms 34.927 ms
9 tserv1.fra1.he.net (216.66.80.30) 28.202 ms 25.949 ms 37.399 ms
Host Loss% Snt Last Avg Best Wrst StDev
[...]
7. 30gigabitethernet1-3.core1.ams1.he.net 0.0% 120 28.0 33.2 25.3 45.0 5.0
8. 100ge5-1.core1.fra1.he.net 0.0% 120 26.9 32.3 25.2 62.2 5.6
9. tserv1.fra1.he.net 0.0% 120 26.6 30.8 24.7 55.8 5.7
Quote
- A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).
This produces weird results; sometimes the output seems corrupted, like this:
[...]
5 83-169-129-89.static.superkabel.de (83.169.129.89) 25.010 ms 21.730 ms 23.940 ms
6 tserv1.fra1.he.net (216.66.80.30) 18.288 ms 83-169-128-141.static.superkabel.de (83.169.128.141) 5.995 ms 27.805 ms
When it completes, it looks like this at the end:
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 36.166 ms 27.191 ms 27.756 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 26.886 ms 26.145 ms 27.358 ms
9 tserv1.fra1.he.net (216.66.80.30) 112.081 ms 291.641 ms 675.325 ms
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 25.643 ms 37.247 ms 26.134 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 27.088 ms 35.618 ms 25.292 ms
9 tserv1.fra1.he.net (216.66.80.30) 4629.144 ms 133.377 ms 2359.053 ms
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 25.535 ms 25.610 ms 25.627 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 26.528 ms 25.607 ms 26.191 ms
9 tserv1.fra1.he.net (216.66.80.30) 268.646 ms 8.473 ms 543.078 ms
The results for the last hop vary, but they all look like this. While writing this post the RTT went back to normal, yet these traces still look the same.
Quote
- A traceroute from your network to an external IPv6 address.
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f0b:751::101, 30 hops max, 24 byte packets
1 2001:470:1f0a:751::1 (2001:470:1f0a:751::1) 86.749 ms 88.746 ms 90.232 ms
2 v320.core1.fra1.he.net (2001:470:0:69::1) 87.058 ms 89.918 ms 84.285 ms
3 te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1) 86.634 ms 85.812 ms 86.665 ms
4 te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e) 86.185 ms 87.781 ms 87.051 ms
5 redirector.heise.de (2a02:2e0:3fe:100::8) 86.196 ms 87.332 ms 89.024 ms
(Taken while the RTT was dropping back down.)
Quote
- Your public IPv4 address (such that I can send protocol 41 packets directly to you and get echo replies through the tunnel server.)
188.195.5.194
Quote from: dstest01 on December 08, 2013, 04:24:28 PM
Quote from: kasperd on December 08, 2013, 02:36:19 PM
- A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).
This produces weird results, sometimes the output seems corrupted, like this:
[...]
5 83-169-129-89.static.superkabel.de (83.169.129.89) 25.010 ms 21.730 ms 23.940 ms
6 tserv1.fra1.he.net (216.66.80.30) 18.288 ms 83-169-128-141.static.superkabel.de (83.169.128.141) 5.995 ms 27.805 ms
This just means that not all of the packets are taking the same route. If the routes taken by different packets have the same number of hops, you might only see different IPs on a few hops; if the routes have different numbers of hops, the rest of the traceroute output can look very confusing. It could indicate route flapping, but it could also just be bundled links working as intended. Judging from the rest of your observations, I don't think this variation in routing indicates a problem.
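The bundled-links case usually means per-flow ECMP: a router hashes header fields of each packet and uses the result to pick one of several parallel links. A toy sketch of the idea (the hash function and field choice here are illustrative, not any vendor's actual algorithm):

```python
import zlib

# Toy per-flow ECMP: hash some header fields onto one of n parallel links.
# Real routers use different hashes and field sets; this only illustrates why
# traceroute probes with varying headers can land on different next hops.
def pick_link(src: str, dst: str, dport: int, n_links: int = 2) -> int:
    key = f"{src}|{dst}|{dport}".encode()
    return zlib.crc32(key) % n_links

# Classic UDP traceroute increments the destination port for every probe,
# so consecutive probes to the same destination can hash onto different links:
links = [pick_link("188.195.5.194", "216.66.80.30", 33434 + i) for i in range(6)]
print(links)
```

Because the choice is deterministic per flow, normal traffic stays on one path, while traceroute's deliberately varying probes expose all of the parallel paths at once.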
Quote from: dstest01 on December 08, 2013, 04:24:28 PM
When it completes, it looks like this at the end:
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 36.166 ms 27.191 ms 27.756 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 26.886 ms 26.145 ms 27.358 ms
9 tserv1.fra1.he.net (216.66.80.30) 112.081 ms 291.641 ms 675.325 ms
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 25.643 ms 37.247 ms 26.134 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 27.088 ms 35.618 ms 25.292 ms
9 tserv1.fra1.he.net (216.66.80.30) 4629.144 ms 133.377 ms 2359.053 ms
[...]
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 25.535 ms 25.610 ms 25.627 ms
8 100ge5-1.core1.fra1.he.net (72.52.92.6) 26.528 ms 25.607 ms 26.191 ms
9 tserv1.fra1.he.net (216.66.80.30) 268.646 ms 8.473 ms 543.078 ms
Various results for the last hop, but they all look like this.
All three traces show the same IPs for the last three hops, so this part of the route looks consistent for you. They also show extreme latency on the last link, from 100ge5-1.core1.fra1.he.net (72.52.92.6) to tserv1.fra1.he.net (216.66.80.30). That is probably the most useful piece of information you have found so far. You should update the ticket you filed with it, and additionally mention that it only looks like this for protocol 41 packets, not for echo requests.
When I run a traceroute to the tunnel server, the path looks a little bit different.
# traceroute -f5 -P 41 216.66.80.30
traceroute to 216.66.80.30 (216.66.80.30), 30 hops max, 60 byte packets
5 kbn-b1-geth2-3.telia.net (213.248.66.145) 9.397 ms 8.459 ms 9.752 ms
6 kbn-bb1-link.telia.net (80.91.246.46) 61.240 ms 56.713 ms 69.155 ms
7 s-bb3-link.telia.net (80.91.248.46) 21.737 ms 20.663 ms 23.519 ms
8 s-b3-link.telia.net (213.155.133.99) 17.946 ms 22.966 ms 18.984 ms
9 hurricane-ic-134866-s-b3.c.telia.net (213.155.141.254) 23.024 ms 34.506 ms 24.955 ms
10 10ge3-1.core1.cph1.he.net (184.105.223.206) 26.081 ms 31.631 ms 30.086 ms
11 10ge16-1.core1.fra1.he.net (184.105.223.201) 40.309 ms 40.228 ms 33.032 ms
12 tserv1.fra1.he.net (216.66.80.30) 33.868 ms 36.967 ms 42.956 ms
The next-to-last hop is a different one for me, and I don't see the high latency. So most likely the reason you see a problem and I don't is that our packets reach the tunnel server through different paths.
Quote from: dstest01 on December 08, 2013, 04:24:28 PM
Quote
- Your public IPv4 address (such that I can send protocol 41 packets directly to you and get echo replies through the tunnel server.)
188.195.5.194
I'll take a look at that later, though you might not need the additional information we could find from it. You already have useful information to add to the ticket.
Ok, I'll post it there, thx a lot for your help.
Quote from: dstest01 on December 09, 2013, 04:27:09 PM
Ok, I'll post it there, thx a lot for your help.
I am curious to hear if you ever got a reply to your ticket.
Quote from: kasperd on December 21, 2013, 08:06:28 AM
I am curious to hear if you ever got a reply to your ticket.
I did... I was asked for a complete trace, but that was some time ago. I guess it's hard to debug from their side.
For now I have only tested a tunnel to a different server (not a tunnel broker, just a random server I have access to). Ping RTTs over this tunnel were stable, as expected.
The next thing I'd like to test is a second HE tunnel to AMS1, which is a hop on the way to my current tunnel server.
HE does not allow a second tunnel with the same IPv4 address for some reason, but luckily I have another ISP. The result is indeed quite interesting:
default AMS1 nosrc:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f14:10e2::2, 30 hops max, 24 byte packets
1 2001:470:1f14:10e2::1 (2001:470:1f14:10e2::1) 36.464 ms 37.28 ms 37.255 ms
2 v213.core1.ams1.he.net (2001:470:0:7d::1) 32.924 ms 32.028 ms 42.578 ms
3 ams-ix-v6.nl.plusline.net (2001:7f8:1::a501:2306:1) 40.135 ms 39.405 ms 39.25 ms
4 te2-4.c101.f.de.plusline.net (2a02:2e0:10:3:c::1) 39.305 ms 39.28 ms 39.711 ms
5 te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e) 39.002 ms 39.131 ms 38.909 ms
6 redirector.heise.de (2a02:2e0:3fe:100::8) 39.216 ms 38.718 ms 38.738 ms
default AMS1 src FRA1:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f0b:751::101, 30 hops max, 24 byte packets
1 tserv1.ams1.he.net (2001:470:0:7d::2) 121.265 ms 121.898 ms 123.603 ms
2 v213.core1.ams1.he.net (2001:470:0:7d::1) 115.323 ms 112.813 ms 125.725 ms
3 ams-ix-v6.nl.plusline.net (2001:7f8:1::a501:2306:1) 114.944 ms 316.589 ms 234.207 ms
4 te2-4.c101.f.de.plusline.net (2a02:2e0:10:3:c::1) 110.279 ms 106.011 ms 111.538 ms
5 te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e) 106.414 ms 120.993 ms 107.52 ms
6 redirector.heise.de (2a02:2e0:3fe:100::8) 110.427 ms 109.495 ms 109.77 ms
default FRA1 nosrc:
1 2001:470:1f0a:751::1 (2001:470:1f0a:751::1) 118.367 ms 122.252 ms 120.983 ms
2 v320.core1.fra1.he.net (2001:470:0:69::1) 124.875 ms 123.936 ms 126.179 ms
3 te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1) 120.115 ms 119.612 ms 120.992 ms
4 te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e) 123.557 ms 124.883 ms 120.908 ms
5 redirector.heise.de (2a02:2e0:3fe:100::8) 125.327 ms 125.293 ms 124.819 ms
default FRA1 src AMS1:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f15:10e2::101, 30 hops max, 24 byte packets
1 tserv1.fra1.he.net (2001:470:0:69::2) 41.606 ms 42.458 ms 42.083 ms
2 v320.core1.fra1.he.net (2001:470:0:69::1) 68.629 ms 39.157 ms 41.465 ms
3 te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1) 205.25 ms 231.324 ms 299.25 ms
4 te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e) 209.512 ms 204.391 ms 53.029 ms
5 redirector.heise.de (2a02:2e0:3fe:100::8) 38.678 ms 37.604 ms 38.221 ms
"src" means setting the default route with an address from the other tunnels routed net as src address, e.g. in the last case requests go via FRA1 while replies come back via AMS1. So for 6in4 packets (but not ICMP Pings) the way from FRA1 to my ISP is looking bad, while the other direction seems to be ok.
Btw, I also tried switching ISPs, in which case latencies are fine for both tunnels, so it's just the combination of this specific ISP with this specific tunnel server. But the other connection is a standard DSL line with a forced reconnect every 24 h, so I'm not eager to use it permanently for the tunnel.
Aaarg, the forum ate my last post... so, shorter this time. ;)
http://lg.he.net/ gives
1 16 ms 1 ms 4 ms decix1.superkabel.de (80.81.192.249)
2 17 ms 24 ms 24 ms 88-134-203-137-dynip.superkabel.de (88.134.203.137)
3 25 ms 24 ms 30 ms 83-169-129-86.static.superkabel.de (83.169.129.86)
4 23 ms 23 ms 24 ms 88-134-205-122-dynip.superkabel.de (88.134.205.122)
5 30 ms 40 ms 25 ms 83-169-128-86.static.superkabel.de (83.169.128.86)
6 32 ms 19 ms 31 ms 83-169-137-19.static.superkabel.de (83.169.137.19)
7 28 ms 37 ms 33 ms 188-195-5-194-dynip.superkabel.de (188.195.5.194)
... compared to ...
traceroute to 216.218.252.174 (216.218.252.174), 255 hops max, 60 byte packets
1 83-169-169-90-isp.superkabel.de (83.169.169.90) 5.090 ms 10.182 ms 10.223 ms
2 83-169-136-62.static.superkabel.de (83.169.136.62) 16.605 ms 16.647 ms 16.649 ms
3 83-169-128-29.static.superkabel.de (83.169.128.29) 17.675 ms 17.762 ms 17.766 ms
4 88-134-205-134-dynip.superkabel.de (88.134.205.134) 17.770 ms 17.772 ms 17.834 ms
5 88-134-237-133-dynip.superkabel.de (88.134.237.133) 29.433 ms 88-134-237-137-dynip.superkabel.de (88.134.237.137) 24.702 ms 24.744 ms
6 88-134-202-90-dynip.superkabel.de (88.134.202.90) 33.289 ms 88-134-202-133-dynip.superkabel.de (88.134.202.133) 27.511 ms 88-134-202-90-dynip.superkabel.de (88.134.202.90) 30.795 ms
7 30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150) 38.475 ms 32.408 ms 32.394 ms
8 core1.fra1.he.net (216.218.252.174) 44.468 ms 43.557 ms 45.090 ms
Routing asymmetry; maybe part of the problem, see my last post.
Hi,
that's what I get over KDG (Kabel Deutschland), and the reason I avoid using this connection for my IPv6 tunnel ;)
KDG:
[x@mtr-ham3] > /tool traceroute routing-table=cable-kdg 216.66.80.30
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 x.x.x.x 0% 7 9.2ms 15.8 9.2 31.6 7.4
2 83.169.157.62 0% 7 30.9ms 88.6 17.8 217.6 68.6
3 83.169.129.138 0% 7 13.6ms 20.7 13.6 31 5.5
4 88.134.204.14 0% 7 16.2ms 17.3 14.5 19.8 2
5 88.134.203.77 0% 7 25.5ms 25.5 12.3 47.1 10.8
6 88.134.202.125 0% 7 52.4ms 35.7 25.1 52.4 11
7 195.69.145.150 0% 7 34.5ms 26.9 22.2 34.5 3.6
8 72.52.92.78 0% 7 35.1ms 42.9 33 55.5 8.7
9 216.66.80.30 0% 7 138.8ms 138.3 132.6 150.8 6.1
Telefonica:
[x@mtr-ham3] > /tool traceroute routing-table=static-tfd 216.66.80.30
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 x.x.x.x 0% 7 30.5ms 27.7 26.2 30.5 1.4
2 62.109.72.114 0% 7 27ms 26.4 25.7 28 0.8
3 213.191.66.162 0% 7 26.3ms 26.1 25.3 27.5 0.7
4 193.42.155.32 0% 7 40.9ms 43.2 38.9 49.7 4
5 72.52.92.78 0% 7 53.3ms 48.9 42.3 53.3 3.9
6 216.66.80.30 0% 6 42.2ms 40 38.4 42.2 1.5
Hm, weird, I'm getting a different traceroute result now, but not the same as takio's:
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
6. 83-169-128-141.static.superkabel.de 0.0% 150 42.0 31.2 24.2 42.0 4.1
88-134-201-10-dynip.superkabel.de
7. 30gigabitethernet1-3.core1.ams1.he.net 0.0% 150 34.0 35.3 26.0 110.5 10.2
8. 10ge15-4.core1.fra1.he.net 0.7% 150 131.1 129.9 113.9 147.8 6.7
9. tserv1.fra1.he.net 0.0% 150 27.4 31.7 25.5 55.5 4.8
It's the second-to-last hop (72.52.92.78, aka 10ge15-4.core1.fra1.he.net) that has increased RTTs, not the last one; I'm not sure how that can happen.
...
One hour later, RTTs over the tunnel are back to normal, and so are the RTTs in the traceroute.
Quote from: dstest01 on December 23, 2013, 04:19:55 PM
It's the second-to-last hop (72.52.92.78, aka 10ge15-4.core1.fra1.he.net) that has increased RTTs, not the last one; I'm not sure how that can happen.
I know of two possible explanations for how that can happen. Typically you'd expect traceroute output to show an increasing roundtrip at each hop, but that is by no means guaranteed.
First of all, the route you see is the forward route. The forward route can change depending on when you run the command, which protocol you are using, and even which port number you are using; however, such changes will be visible in the traceroute output (with proper settings). What you do not see is which route the replies take back to you. It is possible that the routes from each of those IPs back to you differ, so the one hop showing increased latency may simply be caused by that particular router not having an optimal route back to your IPv4 address.
Additionally, there are routers which can forward packets in hardware, but for which sending a time-exceeded ICMP error message back to you requires the packet to be handled by the CPU. For such a router it is much faster to forward a packet than to send a reply. If the CPU on one router along the path is busier than on the others, the roundtrip traceroute measures for that particular router may be higher even if the router has plenty of bandwidth free to route packets.
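A toy model of that second effect: each hop's traceroute RTT is the forwarding latency up to that router plus the time its CPU needs to generate the ICMP time-exceeded reply, so a busy control plane at an intermediate hop can report a higher RTT than the hop after it. The numbers below are made up for illustration, only loosely shaped like the mtr output earlier in this thread:

```python
# Toy model: traceroute RTT per hop = cumulative forwarding delay (hardware path)
#            + that router's control-plane delay for generating the ICMP reply.
# Illustrative numbers only, not measurements.
forward_ms = [25.0, 26.5, 27.5]    # cumulative path latency at hops 7, 8, 9
icmp_cpu_ms = [5.0, 103.0, 3.0]    # ICMP generation time on each router's CPU

rtts = [f + c for f, c in zip(forward_ms, icmp_cpu_ms)]
print(rtts)   # [30.0, 129.5, 30.5] -> the middle hop looks slow, the path is not
```

In this model the middle router forwards traffic perfectly well; only the probes that terminate on it and force a CPU-generated reply look slow.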