Hurricane Electric's IPv6 Tunnel Broker Forums

General IPv6 Topics => IPv6 Basics & Questions & General Chatter => Topic started by: c0urier on January 26, 2014, 09:57:22 AM

Title: Stability issue - Stockholm, Sweden
Post by: c0urier on January 26, 2014, 09:57:22 AM
Am I the only one experiencing stability issues with tserv24.sto1 (Stockholm, Sweden)?

It seems to go down several times per day and is extremely unstable during the day:
Code:
Jan 26 18:54:21 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:57 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:47 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:37 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:26 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:26 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:08 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:51 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:00 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:52 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:41 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:37 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:26 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:17 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:07 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:50:14 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:32 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:03 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***

Code:
Jan 25 14:28:00 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:34 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:21 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:15 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:54 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:46 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:34 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:34 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:12 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:10 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:25:09 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 25 14:25:09 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:25:06 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:24:16 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 25 14:22:57 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:53 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:33 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:24 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***

These are just some of the logs; it started around the 20th of January.
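
For anyone who wants to turn apinger logs like these into numbers, here is a minimal Python sketch. It is not part of apinger; the function names, the regular expression, and the assumed year (2014, since syslog lines carry no year) are illustrative. It pairs each "down" ALARM with the next "alarm canceled" and reports how long each outage lasted, using four lines from the log above as sample input.

```python
import re
from datetime import datetime

# Matches lines in the format shown above (assumed to be stable across entries).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) apinger: "
    r"(?P<event>ALARM|alarm canceled): "
    r"(?P<target>[^(\s]+)\((?P<addr>[^)]+)\) \*\*\* (?P<kind>\w+) \*\*\*$"
)


def parse(line, year=2014):
    """Return (timestamp, is_alarm, kind) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["event"] == "ALARM", m["kind"]


def down_intervals(lines):
    """Pair each 'down' ALARM with the following cancel; return (start, end) tuples."""
    events = [e for e in map(parse, lines) if e is not None]
    # Oldest first; when two events share a second, assume the ALARM came first.
    events.sort(key=lambda e: (e[0], not e[1]))
    intervals, start = [], None
    for ts, is_alarm, kind in events:
        if kind != "down":
            continue
        if is_alarm and start is None:
            start = ts
        elif not is_alarm and start is not None:
            intervals.append((start, ts))
            start = None
    return intervals


SAMPLE = """
Jan 26 18:49:32 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:03 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
""".strip()

for start, end in down_intervals(SAMPLE.splitlines()):
    print(start, "down for", int((end - start).total_seconds()), "s")
```

Sorting by timestamp means the input can be pasted newest-first, as the log viewer shows it, or oldest-first.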
Title: Re: Stability issue - Stockholm, Sweden
Post by: TAtari on January 26, 2014, 10:24:48 AM
Nope, you are not the only one... seems like the Stockholm server caught the flu or something... it's been going up and down the last couple of days.
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 26, 2014, 11:03:42 AM
Well, let's hope it gets better soon. It's nice to get confirmation that I'm not the only one affected by this.
Title: Re: Stability issue - Stockholm, Sweden
Post by: kasperd on January 27, 2014, 01:29:39 AM
Take a look at this (https://www.tunnelbroker.net/forums/index.php?topic=3090).
Title: Re: Stability issue - Stockholm, Sweden
Post by: svenix on January 27, 2014, 02:56:46 AM
I too can confirm there is a problem with the Stockholm POP. I have had severe packet loss the last couple of days and the ping times are ten times longer than normal:
Code:
root@flash:~# ping6 -n 2001:470:27:7ad::1
PING 2001:470:27:7ad::1(2001:470:27:7ad::1) 56 data bytes
64 bytes from 2001:470:27:7ad::1: icmp_seq=15 ttl=64 time=508 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=16 ttl=64 time=527 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=18 ttl=64 time=489 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=23 ttl=64 time=568 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=29 ttl=64 time=555 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=31 ttl=64 time=579 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=38 ttl=64 time=572 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=39 ttl=64 time=564 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=40 ttl=64 time=592 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=47 ttl=64 time=588 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=49 ttl=64 time=575 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=52 ttl=64 time=609 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=55 ttl=64 time=595 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=60 ttl=64 time=558 ms
^C
--- 2001:470:27:7ad::1 ping statistics ---
63 packets transmitted, 14 received, 77% packet loss, time 62002ms
rtt min/avg/max/mdev = 489.366/563.472/609.181/32.790 ms
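
To compare runs like this over time, the ping output can be summarized with a short script. The following is a rough Python sketch, not a tool anyone in this thread used; the function name and regular expressions are assumptions based on the Linux ping6 output format shown above. It reads the transmitted and received counts from the summary line, recomputes the loss percentage, and finds the longest run of missing replies between two received ones.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def summarize(ping_output):
    """Return (sent, received, loss_percent, longest_gap) for one ping run."""
    seqs = [int(s) for s in SEQ_RE.findall(ping_output)]
    m = SUMMARY_RE.search(ping_output)
    if m is None:
        raise ValueError("no ping summary line found")
    sent, received = int(m.group(1)), int(m.group(2))
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    # Number of consecutive sequence numbers missing between two replies.
    longest_gap = max((b - a - 1 for a, b in zip(seqs, seqs[1:])), default=0)
    return sent, received, loss, longest_gap


# Reply lines and summary copied from the output above.
SAMPLE = """
64 bytes from 2001:470:27:7ad::1: icmp_seq=15 ttl=64 time=508 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=16 ttl=64 time=527 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=18 ttl=64 time=489 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=23 ttl=64 time=568 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=29 ttl=64 time=555 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=31 ttl=64 time=579 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=38 ttl=64 time=572 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=39 ttl=64 time=564 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=40 ttl=64 time=592 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=47 ttl=64 time=588 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=49 ttl=64 time=575 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=52 ttl=64 time=609 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=55 ttl=64 time=595 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=60 ttl=64 time=558 ms
63 packets transmitted, 14 received, 77% packet loss, time 62002ms
"""

sent, received, loss, gap = summarize(SAMPLE)
print(f"{received}/{sent} replies, {loss:.1f}% loss, longest gap {gap} packets")
```

The recomputed loss comes out slightly above the 77% that ping prints, which is consistent with ping truncating rather than rounding the percentage.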
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 27, 2014, 10:04:33 AM
Take a look at this (https://www.tunnelbroker.net/forums/index.php?topic=3090).

Is there any chance of getting updates from that ticket?

I mean, the problem is still ongoing; at least I am still seeing a large amount of packet loss.
Title: Re: Stability issue - Stockholm, Sweden
Post by: TAtari on January 27, 2014, 10:15:31 AM
Yes, the problem is still ongoing. It was working OK until about 18:55 CET, when it broke down.
These are the last hops to the IPv4 endpoint:
Code:
  7     8 ms    13 ms     9 ms  10gigabitethernet1-3.core1.sto1.he.net [195.69.119.187]
  8   529 ms     *        *     tserv1.sto1.he.net [216.66.80.90]
  9   504 ms   112 ms    30 ms  tserv1.sto1.he.net [216.66.80.90]
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 27, 2014, 02:05:24 PM
Correct:
Code:
Jan 27 18:56:44 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 18:56:44 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 18:56:41 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 18:55:58 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** delay ***
Jan 27 18:55:51 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***

But I also saw issues earlier today, around 11:23 and 11:55:
Code:
Jan 27 11:23:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 11:23:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:23:35 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:23:17 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 11:56:08 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** delay ***
Jan 27 11:56:02 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:55:58 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***

Let's hope it gets solved soon =).
Title: Re: Stability issue - Stockholm, Sweden
Post by: kasperd on January 27, 2014, 03:12:43 PM
Is there any chance of getting updates from that ticket?
I got a vague reply hinting that somebody is flooding the tunnel server with traffic, but due to the nature of the traffic, it is difficult to distinguish from legitimate traffic.

In June I filed a ticket about a similar issue, which got resolved quickly. With the little information provided in the reply, it is impossible for me to say why the current situation is so much harder to resolve.

I can think of a few possible ways one could exploit the operation of a tunnel server to get quite a significant amplification factor in a flooding attack. At this time, however, I do not want to go into details about how such an attack would work. I do not know whether the HE tunnel servers are vulnerable to such attacks, and I have no way of knowing whether anybody has mounted such an attack against HE. I do know, however, that one of the kinds of attack I can think of could potentially be detected by users. I haven't noticed any suspicious traffic on my tunnel. I have just started a tcpdump to look at all the protocol 41 packets on my connection. I can check back later if I have captured anything suspicious.
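
For readers unfamiliar with protocol 41: it is the IP protocol number for IPv6 packets encapsulated directly in IPv4, which is how these tunnels carry traffic, and tcpdump can select such packets with a filter expression like `ip proto 41`. The exact tcpdump command is not given in this post, so the sketch below takes a different route and is purely illustrative: a small Python function (hypothetical name `decode_6in4`) that recognizes a protocol 41 packet from its raw bytes and pulls out the outer IPv4 and inner IPv6 addresses. The sample packet is built by hand; the client address 192.0.2.1 is a documentation address, not a real endpoint.

```python
import ipaddress
import struct

PROTO_IPV6_IN_IPV4 = 41  # IANA protocol number for IPv6 encapsulated in IPv4


def decode_6in4(packet: bytes):
    """Return outer/inner addresses of a protocol 41 packet, or None otherwise."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None  # too short, or not IPv4
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or packet[9] != PROTO_IPV6_IN_IPV4:
        return None  # malformed header, or not protocol 41
    inner = packet[ihl:ihl + 40]  # fixed-size IPv6 header
    if len(inner) < 40 or inner[0] >> 4 != 6:
        return None  # payload is not a complete IPv6 header
    return {
        "outer_src": str(ipaddress.IPv4Address(packet[12:16])),
        "outer_dst": str(ipaddress.IPv4Address(packet[16:20])),
        "inner_src": str(ipaddress.IPv6Address(inner[8:24])),
        "inner_dst": str(ipaddress.IPv6Address(inner[24:40])),
    }


# Hand-built sample: IPv4 header (protocol 41) followed by an empty IPv6 packet.
outer = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 60, 0, 0, 64, PROTO_IPV6_IN_IPV4, 0,  # checksum left at 0 for brevity
    ipaddress.IPv4Address("192.0.2.1").packed,      # illustrative client address
    ipaddress.IPv4Address("216.66.80.90").packed,   # tserv1.sto1.he.net, per the traceroute in this thread
)
inner = struct.pack(
    "!IHBB16s16s",
    0x60000000, 0, 59, 64,  # version 6, no payload, next header 59 (none)
    ipaddress.IPv6Address("2001:470:27:dee::2").packed,
    ipaddress.IPv6Address("2001:470:27:dee::1").packed,
)
SAMPLE_PACKET = outer + inner

print(decode_6in4(SAMPLE_PACKET))
```

Looking at the inner source and destination in this way is one possible starting point for spotting traffic that does not belong to your own tunnel prefix; what actually counts as suspicious depends on the attack, which is not described here.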
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 27, 2014, 03:18:55 PM
Thanks for the reply, Kasper. It's a bit sad that people are abusing a service like this. I hope HE is able to find a solution within a reasonable time frame. It causes a little frustration, especially when trying to run most of my services on IPv6.
Title: Re: Stability issue - Stockholm, Sweden
Post by: kasperd on January 27, 2014, 03:39:09 PM
It's a bit sad that people are abusing a service like this. I hope HE is able to find a solution within a reasonable time frame.
I'd be happy to come up with suggestions on how HE can mitigate the problem. Unfortunately I can't, because I haven't been told enough about what is really going on. I can understand that HE may have good reasons not to want to talk openly about what is really going on, but to me, not knowing what the problem is and what mitigations have been tried is actually the most frustrating part.

It causes a little frustration, especially when trying to run most of my services on IPv6.
To me it means an opportunity to test how well my IPv6 stack behaves under such circumstances. I find it has been doing quite well; I haven't noticed many problems during these outages.
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 27, 2014, 03:43:43 PM
I can't comment on your first point, as I only have the information you provided, but it sounded like they are "trying" something. Have you offered your help in the trouble ticket you created?

But you are right, that is a positive way of seeing it =).
Title: Re: Stability issue - Stockholm, Sweden
Post by: kasperd on January 27, 2014, 03:57:33 PM
Have you offered your help in the trouble ticket you created?
In one of the emails, I did explain the simplest attack as well as how it can be mitigated. I have not received any reply to that email yet. I'd hope they'd at least tell me if that attack is what is happening.
Title: Re: Stability issue - Stockholm, Sweden
Post by: tmberg on January 28, 2014, 02:36:39 PM
And down we go again.. :(
Title: Re: Stability issue - Stockholm, Sweden
Post by: kasperd on January 28, 2014, 03:24:12 PM
And down we go again.
I'm only seeing 40% packet loss this time. That's much better than the previous times.
Title: Re: Stability issue - Stockholm, Sweden
Post by: c0urier on January 28, 2014, 03:57:16 PM
Packet loss has been up to 85%, so the tunnel is effectively not working.
Title: Re: Stability issue - Stockholm, Sweden
Post by: broquea on January 29, 2014, 01:25:15 AM
Guessing they are filtering ICMP against the tserv's IPv4 endpoint itself, since that gets 100% loss right now.

Looking at their route-server, BGP is still up since the anycast ORDNS prefix is still advertised from the tserv to the core router there:
Code:
* i2001:470:20::/48 2001:470:0:11e::2       1    140      0 i
But not seeing the specifics for the ranges that the tunnels are carved out of. Although tserv-side IPs respond:
Code:
--- 2001:470:27:dee::1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 171.875/171.894/171.923/0.415 ms

User-side replies are a different story:
Code:
--- 2001:470:27:dee::2 ping statistics ---
4 packets transmitted, 2 received, 50% packet loss, time 3018ms
rtt min/avg/max/mdev = 307.285/309.291/311.297/2.006 ms