Stability issue - Stockholm, Sweden

Started by c0urier, January 26, 2014, 09:57:22 AM

c0urier

Am I the only one experiencing stability issues with tserv24.sto1 (Stockholm, Sweden)?

It seems to go down several times per day and is very unstable during the day:
Jan 26 18:54:21 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:57 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:47 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:37 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:26 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:26 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:08 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:53:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:51 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:52:00 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:52 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:41 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:37 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:26 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:17 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:51:07 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:50:14 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:32 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:03 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 26 18:49:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***


Jan 25 14:28:00 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:34 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:21 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:27:15 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:54 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:46 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:34 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:34 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:12 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:26:10 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:25:09 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 25 14:25:09 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:25:06 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:24:16 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 25 14:22:57 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:53 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:33 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 25 14:22:24 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***


These are just some of the logs; it started around the 20th of January.
Br
Glenn, c0urier.net
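
For anyone wanting to quantify flapping like this, here is a minimal Python sketch that pairs each ALARM line with the next "alarm canceled" line of the same kind and reports how long each outage lasted. The regular expression is only inferred from the log lines quoted in this thread, not from apinger's documentation. The names `parse_events` and `outages`, and the `year=2014` default (the timestamps carry no year), are assumptions made for illustration.

```python
import re
from datetime import datetime

# Pattern inferred from the lines quoted in this thread, e.g.
#   Jan 26 18:53:57 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) apinger: "
    r"(?P<event>ALARM|alarm canceled): "
    r"(?P<target>[^\s(]+)\((?P<addr>[^)]+)\) \*\*\* (?P<kind>\w+) \*\*\*$"
)


def parse_events(lines, year=2014):
    """Return (time, raised, kind) tuples sorted oldest first."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not an apinger alarm line
        # The log has no year, so an assumed one is prepended before parsing.
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((when, m["event"] == "ALARM", m["kind"]))
    # Simplification: within the same second, assume the ALARM came first.
    return sorted(events, key=lambda e: (e[0], not e[1]))


def outages(events, kind="down"):
    """Pair each ALARM of the given kind with its cancellation."""
    result, start = [], None
    for when, raised, event_kind in events:
        if event_kind != kind:
            continue
        if raised and start is None:
            start = when
        elif not raised and start is not None:
            result.append((start, when - start))
            start = None
    return result


# Five lines copied from the Jan 26 log above (newest first, as logged).
sample = [
    "Jan 26 18:50:14 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***",
    "Jan 26 18:49:32 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***",
    "Jan 26 18:49:20 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***",
    "Jan 26 18:49:03 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***",
    "Jan 26 18:49:02 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***",
]

for start, duration in outages(parse_events(sample)):
    print(start.strftime("%b %d %H:%M:%S"), "down for", duration)
```

On those five lines this reports two completed outages of 1 and 12 seconds. The 18:50:14 alarm has no cancellation in the sample, so it is left out.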

TAtari

Nope, you are not the only one... seems like the Stockholm server caught the flu or something... it's been going up and down the last couple of days.

c0urier

Well, let's hope it gets better soon. Nice to get confirmation that I wasn't the only one affected by this.
Br
Glenn, c0urier.net

kasperd


svenix

I too can confirm there is a problem with the Stockholm POP. I have had severe packet loss the last couple of days and the ping times are ten times longer than normal:
root@flash:~# ping6 -n 2001:470:27:7ad::1
PING 2001:470:27:7ad::1(2001:470:27:7ad::1) 56 data bytes
64 bytes from 2001:470:27:7ad::1: icmp_seq=15 ttl=64 time=508 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=16 ttl=64 time=527 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=18 ttl=64 time=489 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=23 ttl=64 time=568 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=29 ttl=64 time=555 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=31 ttl=64 time=579 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=38 ttl=64 time=572 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=39 ttl=64 time=564 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=40 ttl=64 time=592 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=47 ttl=64 time=588 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=49 ttl=64 time=575 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=52 ttl=64 time=609 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=55 ttl=64 time=595 ms
64 bytes from 2001:470:27:7ad::1: icmp_seq=60 ttl=64 time=558 ms
^C
--- 2001:470:27:7ad::1 ping statistics ---
63 packets transmitted, 14 received, 77% packet loss, time 62002ms
rtt min/avg/max/mdev = 489.366/563.472/609.181/32.790 ms
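
The summary line can be sanity-checked by hand: 63 sent and 14 received means 49 lost, which is a little under 78%. Below is a small Python sketch of that arithmetic. The function name `loss_percent` is made up for illustration, and the remark about rounding down is an inference from these numbers rather than something checked against the ping source.

```python
def loss_percent(transmitted: int, received: int) -> float:
    """Share of probes that got no reply, in percent."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return 100.0 * (transmitted - received) / transmitted


# Sequence numbers of the replies shown in the ping6 output above.
seen = [15, 16, 18, 23, 29, 31, 38, 39, 40, 47, 49, 52, 55, 60]

loss = loss_percent(63, len(seen))
# Prints "77.78% lost"; ping reported 77%, so it apparently rounds down.
print(f"{loss:.2f}% lost")
```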

c0urier

Quote from: kasperd on January 27, 2014, 01:29:39 AM
Take a look on this.

Is there any chance of getting updates from that ticket?

I mean, the problem is still ongoing; at least I still get a large amount of packet loss etc.
Br
Glenn, c0urier.net

TAtari

Yes, the problem is still ongoing. It was working OK until about 18:55 CET, when it broke down...
These are the last hops to the IPv4 endpoint...

  7     8 ms    13 ms     9 ms  10gigabitethernet1-3.core1.sto1.he.net [195.69.119.187]
  8   529 ms     *        *     tserv1.sto1.he.net [216.66.80.90]
  9   504 ms   112 ms    30 ms  tserv1.sto1.he.net [216.66.80.90]


c0urier

Quote from: TAtari on January 27, 2014, 10:15:31 AM
Yes, the problem is still ongoing. It was working OK until about 18:55 CET, when it broke down...
These are the last hops to the IPv4 endpoint...

 7     8 ms    13 ms     9 ms  10gigabitethernet1-3.core1.sto1.he.net [195.69.119.187]
 8   529 ms     *        *     tserv1.sto1.he.net [216.66.80.90]
 9   504 ms   112 ms    30 ms  tserv1.sto1.he.net [216.66.80.90]



Correct:
Jan 27 18:56:44 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 18:56:44 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 18:56:41 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 18:55:58 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** delay ***
Jan 27 18:55:51 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***


But I also saw issues earlier today, at around 11:23 and 11:55:
Jan 27 11:23:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 11:23:35 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:23:35 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:23:17 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** loss ***
Jan 27 11:56:08 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** delay ***
Jan 27 11:56:02 apinger: alarm canceled: HEGWIPv6(2001:470:27:dee::1) *** down ***
Jan 27 11:55:58 apinger: ALARM: HEGWIPv6(2001:470:27:dee::1) *** down ***


Let's hope it gets solved soon =).
Br
Glenn, c0urier.net

kasperd

Quote from: c0urier on January 27, 2014, 10:04:33 AM
Is there any chance of getting updates from that ticket?
I got a vague reply hinting that somebody is flooding the tunnel server with traffic, but due to the nature of the traffic, it is difficult to distinguish from legitimate traffic.

In June I filed a ticket about a similar issue, which got resolved quickly. With the amount of information provided in the reply it is impossible for me to say why the current situation is so much harder to resolve.

I can think of a few possible ways one could exploit the operation of a tunnel server to get a quite significant amplification factor in a flooding attack. At this time I do however not want to go into details about how such an attack would work. I do not know if the HE tunnel servers are vulnerable to such attacks, and I have no way of knowing if anybody has mounted such an attack against HE. I do however know that one of the kinds of attack I can think of could potentially be detected by users. I haven't noticed any suspicious traffic on my tunnel. I have just started a tcpdump to look at all the protocol 41 packets on my connection. I can check back later if I have captured anything suspicious.
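
For reference, "protocol 41" is the IPv4 protocol number used for IPv6-in-IPv4 encapsulation, which is what these tunnels carry. Here is a rough Python sketch of what one would look for in such a capture: check the protocol field of the outer IPv4 header and pull out the outer and inner addresses. The function name `describe_6in4` is hypothetical, the input is assumed to be a raw IPv4 packet with no link-layer header, and the 192.0.2.1 and 2001:db8:: addresses in the example are documentation placeholders.

```python
import ipaddress

PROTO_IPV6_IN_IPV4 = 41  # IANA protocol number for encapsulated IPv6


def describe_6in4(packet: bytes):
    """Return (outer_src, outer_dst, inner_src, inner_dst) or None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None  # not an IPv4 packet
    if packet[9] != PROTO_IPV6_IN_IPV4:
        return None  # some other payload (TCP, UDP, ICMP, ...)
    header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if header_len < 20:
        return None  # malformed header length
    inner = packet[header_len:]
    if len(inner) < 40 or inner[0] >> 4 != 6:
        return None  # truncated, or not IPv6 inside
    return (
        str(ipaddress.IPv4Address(packet[12:16])),
        str(ipaddress.IPv4Address(packet[16:20])),
        str(ipaddress.IPv6Address(inner[8:24])),
        str(ipaddress.IPv6Address(inner[24:40])),
    )


# Hand-built example packet; checksums are left at zero and not validated.
outer = (
    bytes([0x45, 0, 0, 60, 0, 0, 0, 0, 64, PROTO_IPV6_IN_IPV4, 0, 0])
    + ipaddress.IPv4Address("192.0.2.1").packed      # placeholder client
    + ipaddress.IPv4Address("216.66.80.90").packed   # tunnel server from the traceroute
)
inner = (
    bytes([0x60, 0, 0, 0, 0, 0, 59, 64])             # IPv6, no next header, hop limit 64
    + ipaddress.IPv6Address("2001:db8::1").packed
    + ipaddress.IPv6Address("2001:db8::2").packed
)

print(describe_6in4(outer + inner))
```

Anything where the outer source is not the tunnel server, or where the inner addresses have nothing to do with one's own tunnel, would be worth a closer look.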

c0urier

Quote from: kasperd on January 27, 2014, 03:12:43 PM
Quote from: c0urier on January 27, 2014, 10:04:33 AM
Is there any chance of getting updates from that ticket?
I got a vague reply hinting that somebody is flooding the tunnel server with traffic, but due to the nature of the traffic, it is difficult to distinguish from legitimate traffic.

In June I filed a ticket about a similar issue, which got resolved quickly. With the amount of information provided in the reply it is impossible for me to say why the current situation is so much harder to resolve.

I can think of a few possible ways one could exploit the operation of a tunnel server to get a quite significant amplification factor in a flooding attack. At this time I do however not want to go into details about how such an attack would work. I do not know if the HE tunnel servers are vulnerable to such attacks, and I have no way of knowing if anybody has mounted such an attack against HE. I do however know that one of the kinds of attack I can think of could potentially be detected by users. I haven't noticed any suspicious traffic on my tunnel. I have just started a tcpdump to look at all the protocol 41 packets on my connection. I can check back later if I have captured anything suspicious.

Thanks for the reply, Kasper. It's a bit sad people are abusing a service like this, I hope HE is able to find a solution within a reasonable time-frame. It causes a little frustration, especially when trying to run most of my services on IPv6.
Br
Glenn, c0urier.net

kasperd

Quote from: c0urier on January 27, 2014, 03:18:55 PM
It's a bit sad people are abusing a service like this, I hope HE is able to find a solution within a reasonable time-frame
I'd be happy to come up with suggestions on how HE can mitigate the problem. Unfortunately I can't, because I haven't been told enough about what is really going on. I can understand that HE may have good reasons not to want to talk openly about what is really going on, but to me not knowing what the problem is and what mitigations have been tried is actually the most frustrating part.

Quote from: c0urier on January 27, 2014, 03:18:55 PM
It causes a little frustration, especially when trying to run most of my services on IPv6.
To me it means an opportunity to test how well my IPv6 stack behaves under such circumstances. I find it has been doing quite well; I haven't noticed many problems during these outages.

c0urier

Quote from: kasperd on January 27, 2014, 03:39:09 PM
Quote from: c0urier on January 27, 2014, 03:18:55 PM
It's a bit sad people are abusing a service like this, I hope HE is able to find a solution within a reasonable time-frame
I'd be happy to come up with suggestions on how HE can mitigate the problem. Unfortunately I can't, because I haven't been told enough about what is really going on. I can understand that HE may have good reasons not to want to talk openly about what is really going on, but to me not knowing what the problem is and what mitigations have been tried is actually the most frustrating part.

Quote from: c0urier on January 27, 2014, 03:18:55 PM
It causes a little frustration, especially when trying to run most of my services on IPv6.
To me it means an opportunity to test how well my IPv6 stack behaves under such circumstances. I find it has been doing quite well; I haven't noticed many problems during these outages.

I can't comment on the first point, as I only have the information you provided, but it sounded like they are "trying" something. Have you suggested to offer your help in the trouble ticket you created?

But you are right, that is a positive way of seeing it =).
Br
Glenn, c0urier.net

kasperd

Quote from: c0urier on January 27, 2014, 03:43:43 PM
Have you suggested to offer your help in the trouble ticket you created?
In one of the emails, I did explain the simplest attack as well as how it can be mitigated. I have not received any reply to that email yet. I'd hope they'd at least tell me if that attack is what is happening.

tmberg

tmberg

kasperd

Quote from: tmberg on January 28, 2014, 02:36:39 PM
And down we go again.
I'm only seeing 40% packet loss this time. That's much better than the previous times.