
Increased latency to FRA1

Started by dstest01, December 08, 2013, 09:52:26 AM


dstest01

For some weeks now I've been observing performance issues with a tunnel to tserv6.fra1.ipv6.he.net. In the evening the latency increases a lot (from ~30 ms to ~120 ms, see attached graph) and bandwidth drops (download from ~30 MBit/s to ~1 MBit/s).

Since this looks more like a problem with my ISP, I tested native IPv4 performance, downloading files over different ports and uploading via SSH, but everything seems fine there. The ping RTT to the tunnel server via IPv4 is also normal, whereas a traceroute over IPv6 already shows increased latency at the first hop. That's why I'm a bit confused.
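Roughly, the comparison looked like this (216.66.80.30 is the tunnel server's IPv4 address, 2001:470:1f0a:751::1 is the HE side of my tunnel, heise.de is just an example target):

ping -c 20 216.66.80.30              # IPv4 RTT straight to the tunnel server: normal
ping6 -c 20 2001:470:1f0a:751::1     # IPv6 RTT to the first hop inside the tunnel: elevated
traceroute6 heise.de                 # IPv6 path through the tunnel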

I've already sent this via mail but only got an automated reply from the ticket system so far. Does anybody have an idea how to analyse whether this is my ISP's fault or a problem on the tunnel server side? Thanks...

kasperd

Quote from: dstest01 on December 08, 2013, 09:52:26 AM
I've already sent this via mail but only got an automated reply from the ticket system so far. Does anybody have an idea how to analyse whether this is my ISP's fault or a problem on the tunnel server side?
I am also using fra1, and I have not observed this problem. I am using thinkbroadband.com/ping to graph the latency of my tunnel on fra1, and the only times I see a significant increase in latency are when I max out my v4 upstream.

I am using my own v6 stack, which will not send packets through the tunnel server if it knows a more direct path. So if a latency issue only affects packets in one direction, I might not notice it. I'll try to disable that feature for a couple of days to see whether I then experience the same latency increase that you do.

kasperd

Quote from: kasperd on December 08, 2013, 10:42:40 AM
I'll try to disable that feature for a couple of days to see whether I then experience the same latency increase that you do.
That increased my roundtrip time to thinkbroadband from approximately 35 ms to approximately 39 ms. Also, if I ping google.de, I don't see roundtrip times anywhere near as high as yours. But Google has lots of frontends, so there is no guarantee we measured towards the same datacenter.

What IPv6 address are you measuring? And what is the IPv6 address of your host?

dstest01

Quote from: kasperd on December 08, 2013, 11:09:02 AM
What IPv6 address are you measuring? And what is the IPv6 address of your host?

It's the same for other hosts; the graphs for heise.de and another private server look pretty much identical.

The address of the local tunnel endpoint is 2001:470:1f0a:751::2.

kasperd

Quote from: dstest01 on December 08, 2013, 12:22:03 PM
The address of the local tunnel endpoint is 2001:470:1f0a:751::2.
I just realized that the packets I send weren't going through the tunnel server anyway, so my previous measurements were not entirely valid. The ISP I am using has a router which, under certain circumstances, will decapsulate protocol 41 packets and forward the IPv6 packet directly onto the IPv6 backbone instead of forwarding the tunnelled packet to HE. I do know what to change in my configuration to avoid that, though, so this time I really am measuring the roundtrip with packets going through the tunnel server in both directions. Fixing that increased my roundtrip to thinkbroadband by another 6 ms or so, but it is still nowhere near the latency you are seeing.
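For anyone following along: such a tunnel is nothing more than IPv6 wrapped in IPv4 with protocol number 41, which is why a router along the path is able to unwrap it. A typical Linux setup for an HE tunnel looks roughly like this (a sketch only; the interface name is arbitrary and 192.0.2.1 stands in for your own public IPv4 address):

# 6in4 (protocol 41) tunnel to the FRA1 tunnel server
ip tunnel add he-fra1 mode sit remote 216.66.80.30 local 192.0.2.1 ttl 255
ip link set he-fra1 up
ip addr add 2001:470:1f0a:751::2/64 dev he-fra1   # client side of the tunnel
ip -6 route add default dev he-fra1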

I am still not able to reproduce the high latency on my tunnel. I am however able to see the high latency on a traceroute to your IP address. So I can confirm your observation about high latency on your first hop. It looks like the problem must be on the v4 side of the tunnel server. In other words somewhere on the IPv4 path between the tunnel server and your tunnel endpoint.

A traceroute from my network to your IPv6 address looks like this:
traceroute to 2001:470:1f0a:751::2 (2001:470:1f0a:751::2), 30 hops max, 80 byte packets
1  2001:470:1f0b:1da2:635a:c32:ae34:df91  0.224 ms  0.127 ms  0.187 ms
2  2001:470:1f0a:1da2::1  44.547 ms  47.658 ms  50.871 ms
3  2001:470:1f0a:751::2  176.363 ms  166.157 ms  171.574 ms


A traceroute from Hetzner to your IPv6 address ends like this:
 7  2a01:4f8:0:3::6  5.279 ms  5.286 ms  5.271 ms
8  2001:7f8::1b1b:0:1  5.337 ms  5.293 ms  16.376 ms
9  2001:470:0:69::2  8.303 ms  8.009 ms  8.065 ms
10  2001:470:1f0a:751::2  131.653 ms  131.508 ms  131.869 ms


For comparison, a traceroute to my IPv6 address on the same tunnel server:
 7  2a01:4f8:0:3::6  5.266 ms  5.271 ms  5.274 ms
8  2001:7f8::1b1b:0:1  5.328 ms  13.122 ms  5.248 ms
9  2001:470:0:69::2  8.534 ms  7.757 ms  7.901 ms
10  2001:470:1f0b:1da2:635a:c32:ae34:df91  55.921 ms  50.602 ms  62.796 ms
11  2001:470:1f0b:1da2:635a:c32:ae34:df91  42.278 ms  46.365 ms  41.18 ms


The following pieces of information could be useful in narrowing down the problem (example invocations are sketched after the list):
  • A traceroute from your tunnel endpoint to 216.66.80.30
  • A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).
  • A traceroute from your network to an external IPv6 address.
  • Your public IPv4 address (so that I can send protocol 41 packets directly to you and get echo replies back through the tunnel server).
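Roughly, the commands I have in mind are these (heise.de is only an example target for the IPv6 trace; any external IPv6 host will do):

traceroute 216.66.80.30            # plain IPv4 trace to the tunnel server
traceroute -P 41 216.66.80.30      # same path, probed with protocol 41 packets
traceroute6 heise.de               # IPv6 trace through the tunnel to an external host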

dstest01

Quote from: kasperd on December 08, 2013, 02:36:19 PM
  • A traceroute from your tunnel endpoint to 216.66.80.30

traceroute to 216.66.80.30 (216.66.80.30), 30 hops max, 60 byte packets
[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  34.072 ms  29.301 ms  30.283 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  35.860 ms  35.397 ms  34.927 ms
9  tserv1.fra1.he.net (216.66.80.30)  28.202 ms  25.949 ms  37.399 ms


Host                                             Loss%   Snt   Last   Avg  Best  Wrst StDev
[...]
7. 30gigabitethernet1-3.core1.ams1.he.net         0.0%   120   28.0  33.2  25.3  45.0   5.0
8. 100ge5-1.core1.fra1.he.net                     0.0%   120   26.9  32.3  25.2  62.2   5.6
9. tserv1.fra1.he.net                             0.0%   120   26.6  30.8  24.7  55.8   5.7



Quote
  • A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).

This produces weird results; sometimes the output seems corrupted, like this:

[...]
5  83-169-129-89.static.superkabel.de (83.169.129.89)  25.010 ms  21.730 ms  23.940 ms
6  tserv1.fra1.he.net (216.66.80.30)  18.288 ms 83-169-128-141.static.superkabel.de (83.169.128.141)  5.995 ms  27.805 ms


When it completes, the end looks like this:

[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  36.166 ms  27.191 ms  27.756 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  26.886 ms  26.145 ms  27.358 ms
9  tserv1.fra1.he.net (216.66.80.30)  112.081 ms  291.641 ms  675.325 ms


[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  25.643 ms  37.247 ms  26.134 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  27.088 ms  35.618 ms  25.292 ms
9  tserv1.fra1.he.net (216.66.80.30)  4629.144 ms  133.377 ms  2359.053 ms


[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  25.535 ms  25.610 ms  25.627 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  26.528 ms  25.607 ms  26.191 ms
9  tserv1.fra1.he.net (216.66.80.30)  268.646 ms  8.473 ms  543.078 ms


The results for the last hop vary, but they all look like this. While writing this post, the RTT went back to normal, but these traces still look the same.


Quote
  • A traceroute from your network to an external IPv6 address.

traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f0b:751::101, 30 hops max, 24 byte packets
1  2001:470:1f0a:751::1 (2001:470:1f0a:751::1)  86.749 ms  88.746 ms  90.232 ms
2  v320.core1.fra1.he.net (2001:470:0:69::1)  87.058 ms  89.918 ms  84.285 ms
3  te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1)  86.634 ms  85.812 ms  86.665 ms
4  te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e)  86.185 ms  87.781 ms  87.051 ms
5  redirector.heise.de (2a02:2e0:3fe:100::8)  86.196 ms  87.332 ms  89.024 ms


(Taken while the RTT was dropping back to normal.)


Quote
  • Your public IPv4 address (such that I can send protocol 41 packets directly to you and get echo replies through the tunnel server.)

188.195.5.194

kasperd

Quote from: dstest01 on December 08, 2013, 04:24:28 PM
Quote from: kasperd on December 08, 2013, 02:36:19 PM
  • A traceroute from your tunnel endpoint using protocol 41 packets to 216.66.80.30 (-P 41 if you are using Ubuntu).

This produces weird results; sometimes the output seems corrupted, like this:

[...]
5  83-169-129-89.static.superkabel.de (83.169.129.89)  25.010 ms  21.730 ms  23.940 ms
6  tserv1.fra1.he.net (216.66.80.30)  18.288 ms 83-169-128-141.static.superkabel.de (83.169.128.141)  5.995 ms  27.805 ms
This just means not all the packets are taking the same route. If the routes taken by different packets have the same number of hops, you might only see different IPs on a few hops. If the routes have a different number of hops, the rest of the traceroute output can look very confusing. It could indicate route flapping, but it could also just be bundled links working as intended. Judging from the rest of your observations, I don't think this variation in routing indicates a problem.

Quote from: dstest01 on December 08, 2013, 04:24:28 PM
When it completes, the end looks like this:

[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  36.166 ms  27.191 ms  27.756 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  26.886 ms  26.145 ms  27.358 ms
9  tserv1.fra1.he.net (216.66.80.30)  112.081 ms  291.641 ms  675.325 ms


[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  25.643 ms  37.247 ms  26.134 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  27.088 ms  35.618 ms  25.292 ms
9  tserv1.fra1.he.net (216.66.80.30)  4629.144 ms  133.377 ms  2359.053 ms


[...]
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  25.535 ms  25.610 ms  25.627 ms
8  100ge5-1.core1.fra1.he.net (72.52.92.6)  26.528 ms  25.607 ms  26.191 ms
9  tserv1.fra1.he.net (216.66.80.30)  268.646 ms  8.473 ms  543.078 ms


The results for the last hop vary, but they all look like this.
All three traces show the same IPs for the last three hops, so this part of the route looks consistent for you. They also show extreme latency on the last link, from 100ge5-1.core1.fra1.he.net (72.52.92.6) to tserv1.fra1.he.net (216.66.80.30). That is probably the most useful piece of information you have found so far. You should update the ticket you filed with that information. Additionally, mention that it only looks like this for protocol 41 packets and not for echo requests.

When I run a traceroute to the tunnel server, the path looks a little bit different.
# traceroute -f5 -P 41 216.66.80.30
traceroute to 216.66.80.30 (216.66.80.30), 30 hops max, 60 byte packets
5  kbn-b1-geth2-3.telia.net (213.248.66.145)  9.397 ms  8.459 ms  9.752 ms
6  kbn-bb1-link.telia.net (80.91.246.46)  61.240 ms  56.713 ms  69.155 ms
7  s-bb3-link.telia.net (80.91.248.46)  21.737 ms  20.663 ms  23.519 ms
8  s-b3-link.telia.net (213.155.133.99)  17.946 ms  22.966 ms  18.984 ms
9  hurricane-ic-134866-s-b3.c.telia.net (213.155.141.254)  23.024 ms  34.506 ms  24.955 ms
10  10ge3-1.core1.cph1.he.net (184.105.223.206)  26.081 ms  31.631 ms  30.086 ms
11  10ge16-1.core1.fra1.he.net (184.105.223.201)  40.309 ms  40.228 ms  33.032 ms
12  tserv1.fra1.he.net (216.66.80.30)  33.868 ms  36.967 ms  42.956 ms
The next-to-last hop is a different one for me, and I don't see the high latency. So most likely the reason you see a problem and I don't is that our packets reach the tunnel server through different paths.

Quote from: dstest01 on December 08, 2013, 04:24:28 PM
Quote
  • Your public IPv4 address (such that I can send protocol 41 packets directly to you and get echo replies through the tunnel server.)

188.195.5.194
I'll take a look at that later, though you might not need the additional information we could get from it. You already have useful information to add to the ticket.

dstest01

OK, I'll post it there. Thanks a lot for your help.

kasperd

Quote from: dstest01 on December 09, 2013, 04:27:09 PM
OK, I'll post it there. Thanks a lot for your help.
I am curious to hear if you ever got a reply to your ticket.

dstest01

Quote from: kasperd on December 21, 2013, 08:06:28 AM
I am curious to hear if you ever got a reply to your ticket.
I did... I was asked for a complete trace, but that was some time ago. I guess it's hard to debug from their side.

So far I have only tested a tunnel to a different server (not a tunnel broker, just a random server I have access to). Ping RTTs over that tunnel were stable, as expected.

The next thing I'd like to test is a second HE tunnel to AMS1, which is a hop on the way to my current tunnel server.

dstest01

HE does not allow a second tunnel from the same IPv4 address for some reason, but luckily I have another ISP. The result is indeed quite interesting:

default AMS1 nosrc:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f14:10e2::2, 30 hops max, 24 byte packets
1  2001:470:1f14:10e2::1 (2001:470:1f14:10e2::1)  36.464 ms  37.28 ms  37.255 ms
2  v213.core1.ams1.he.net (2001:470:0:7d::1)  32.924 ms  32.028 ms  42.578 ms
3  ams-ix-v6.nl.plusline.net (2001:7f8:1::a501:2306:1)  40.135 ms  39.405 ms  39.25 ms
4  te2-4.c101.f.de.plusline.net (2a02:2e0:10:3:c::1)  39.305 ms  39.28 ms  39.711 ms
5  te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e)  39.002 ms  39.131 ms  38.909 ms
6  redirector.heise.de (2a02:2e0:3fe:100::8)  39.216 ms  38.718 ms  38.738 ms

default AMS1 src FRA1:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f0b:751::101, 30 hops max, 24 byte packets
1  tserv1.ams1.he.net (2001:470:0:7d::2)  121.265 ms  121.898 ms  123.603 ms
2  v213.core1.ams1.he.net (2001:470:0:7d::1)  115.323 ms  112.813 ms  125.725 ms
3  ams-ix-v6.nl.plusline.net (2001:7f8:1::a501:2306:1)  114.944 ms  316.589 ms  234.207 ms
4  te2-4.c101.f.de.plusline.net (2a02:2e0:10:3:c::1)  110.279 ms  106.011 ms  111.538 ms
5  te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e)  106.414 ms  120.993 ms  107.52 ms
6  redirector.heise.de (2a02:2e0:3fe:100::8)  110.427 ms  109.495 ms  109.77 ms

default FRA1 nosrc:
1  2001:470:1f0a:751::1 (2001:470:1f0a:751::1)  118.367 ms  122.252 ms  120.983 ms
2  v320.core1.fra1.he.net (2001:470:0:69::1)  124.875 ms  123.936 ms  126.179 ms
3  te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1)  120.115 ms  119.612 ms  120.992 ms
4  te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e)  123.557 ms  124.883 ms  120.908 ms
5  redirector.heise.de (2a02:2e0:3fe:100::8)  125.327 ms  125.293 ms  124.819 ms

default FRA1 src AMS1:
traceroute to heise.de (2a02:2e0:3fe:100::8) from 2001:470:1f15:10e2::101, 30 hops max, 24 byte packets
1  tserv1.fra1.he.net (2001:470:0:69::2)  41.606 ms  42.458 ms  42.083 ms
2  v320.core1.fra1.he.net (2001:470:0:69::1)  68.629 ms  39.157 ms  41.465 ms
3  te3-1.c101.f.de.plusline.net (2001:7f8::3012:0:1)  205.25 ms  231.324 ms  299.25 ms
4  te6-1.c13.f.de.plusline.net (2a02:2e0:1::1e)  209.512 ms  204.391 ms  53.029 ms
5  redirector.heise.de (2a02:2e0:3fe:100::8)  38.678 ms  37.604 ms  38.221 ms


"src" means setting the default route with an address from the other tunnels routed net as src address, e.g. in the last case requests go via FRA1 while replies come back via AMS1. So for 6in4 packets (but not ICMP Pings) the way from FRA1 to my ISP is looking bad, while the other direction seems to be ok.

Btw, I also tried switching ISPs, and in that case latencies are fine for both tunnels, so it's just the combination of this specific ISP with this specific tunnel server. The other connection is a standard DSL line with a forced reconnection every 24h, though, so I'm not eager to use it permanently for the tunnel.

dstest01

Aaarg, the forum ate my last post... so, shorter this time. ;)

http://lg.he.net/ gives

1 16 ms 1 ms 4 ms decix1.superkabel.de (80.81.192.249)
2 17 ms 24 ms 24 ms 88-134-203-137-dynip.superkabel.de (88.134.203.137)
3 25 ms 24 ms 30 ms 83-169-129-86.static.superkabel.de (83.169.129.86)
4 23 ms 23 ms 24 ms 88-134-205-122-dynip.superkabel.de (88.134.205.122)
5 30 ms 40 ms 25 ms 83-169-128-86.static.superkabel.de (83.169.128.86)
6 32 ms 19 ms 31 ms 83-169-137-19.static.superkabel.de (83.169.137.19)
7 28 ms 37 ms 33 ms 188-195-5-194-dynip.superkabel.de (188.195.5.194)


... compared to ...

traceroute to 216.218.252.174 (216.218.252.174), 255 hops max, 60 byte packets
1  83-169-169-90-isp.superkabel.de (83.169.169.90)  5.090 ms  10.182 ms  10.223 ms
2  83-169-136-62.static.superkabel.de (83.169.136.62)  16.605 ms  16.647 ms  16.649 ms
3  83-169-128-29.static.superkabel.de (83.169.128.29)  17.675 ms  17.762 ms  17.766 ms
4  88-134-205-134-dynip.superkabel.de (88.134.205.134)  17.770 ms  17.772 ms  17.834 ms
5  88-134-237-133-dynip.superkabel.de (88.134.237.133)  29.433 ms 88-134-237-137-dynip.superkabel.de (88.134.237.137)  24.702 ms  24.744 ms
6  88-134-202-90-dynip.superkabel.de (88.134.202.90)  33.289 ms 88-134-202-133-dynip.superkabel.de (88.134.202.133)  27.511 ms 88-134-202-90-dynip.superkabel.de (88.134.202.90)  30.795 ms
7  30gigabitethernet1-3.core1.ams1.he.net (195.69.145.150)  38.475 ms  32.408 ms  32.394 ms
8  core1.fra1.he.net (216.218.252.174)  44.468 ms  43.557 ms  45.090 ms


Routing asymmetry; maybe that's part of the problem, see my previous post.

takio

Hi,

that's what I get over KDG (Kabel Deutschland) and the reason I avoid using this connection for my IPv6 tunnel ;)

KDG:

[x@mtr-ham3] > /tool traceroute routing-table=cable-kdg 216.66.80.30
# ADDRESS                          LOSS SENT    LAST     AVG    BEST   WORST STD-DEV STATUS
1 x.x.x.x                            0%    7   9.2ms    15.8     9.2    31.6     7.4
2 83.169.157.62                      0%    7  30.9ms    88.6    17.8   217.6    68.6
3 83.169.129.138                     0%    7  13.6ms    20.7    13.6      31     5.5
4 88.134.204.14                      0%    7  16.2ms    17.3    14.5    19.8       2
5 88.134.203.77                      0%    7  25.5ms    25.5    12.3    47.1    10.8
6 88.134.202.125                     0%    7  52.4ms    35.7    25.1    52.4      11
7 195.69.145.150                     0%    7  34.5ms    26.9    22.2    34.5     3.6
8 72.52.92.78                        0%    7  35.1ms    42.9      33    55.5     8.7
9 216.66.80.30                       0%    7 138.8ms   138.3   132.6   150.8     6.1


Telefonica:

[x@mtr-ham3] > /tool traceroute routing-table=static-tfd 216.66.80.30
# ADDRESS                          LOSS SENT    LAST     AVG    BEST   WORST STD-DEV STATUS
1 x.x.x.x                            0%    7  30.5ms    27.7    26.2    30.5     1.4
2 62.109.72.114                      0%    7    27ms    26.4    25.7      28     0.8
3 213.191.66.162                     0%    7  26.3ms    26.1    25.3    27.5     0.7
4 193.42.155.32                      0%    7  40.9ms    43.2    38.9    49.7       4
5 72.52.92.78                        0%    7  53.3ms    48.9    42.3    53.3     3.9
6 216.66.80.30                       0%    6  42.2ms      40    38.4    42.2     1.5

dstest01

Hm, weird, I'm getting a different traceroute result now, but not the same as takio's:

                                             Packets               Pings
Host                                      Loss%   Snt   Last   Avg  Best  Wrst StDev
6. 83-169-128-141.static.superkabel.de     0.0%   150   42.0  31.2  24.2  42.0   4.1
    88-134-201-10-dynip.superkabel.de
7. 30gigabitethernet1-3.core1.ams1.he.net  0.0%   150   34.0  35.3  26.0 110.5  10.2
8. 10ge15-4.core1.fra1.he.net              0.7%   150  131.1 129.9 113.9 147.8   6.7
9. tserv1.fra1.he.net                      0.0%   150   27.4  31.7  25.5  55.5   4.8


It's the second-to-last hop (72.52.92.78, aka 10ge15-4.core1.fra1.he.net) that has increased RTTs, but not the last one; I'm not sure how this can happen.

...

One hour later, RTTs over the tunnel are back to normal, and so are the RTTs in the traceroute.

kasperd

Quote from: dstest01 on December 23, 2013, 04:19:55 PM
It's the second-to-last hop (72.52.92.78, aka 10ge15-4.core1.fra1.he.net) that has increased RTTs, but not the last one; I'm not sure how this can happen.
I know of two possible explanations for that. Typically you'd expect traceroute output to show the roundtrip increasing at each hop, but that is by no means guaranteed.

First of all, the route you see is the forward route. The forward route can change depending on what time you run the command, what protocol you are using, and even what port number you are using. However, such changes will be visible in the traceroute output (with the proper settings). What you do not see is which route the replies take back to you. The routes from each of those IPs back to you may well differ, so the one hop which shows increased latency may simply be caused by that particular router not having an optimal route back to your IPv4 address.
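If you want to see how the forward path depends on protocol and port, you can compare a few variants of the same trace (these are standard options of the Linux traceroute; the target is just the tunnel server again):

traceroute 216.66.80.30              # UDP probes with incrementing destination ports
traceroute -I 216.66.80.30           # ICMP echo probes
traceroute -T -p 443 216.66.80.30    # TCP SYN probes to a fixed port (needs root)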

Additionally, there are routers which can forward packets in hardware, but sending a time-exceeded ICMP error message back to you requires the packet to be handled by the CPU. For such a router it is much faster to forward a packet than to send a reply. If the CPU on one router along the path is busier than the others, the roundtrip measured by traceroute for that particular router may be higher, even if the router has plenty of spare capacity to forward packets.