Hi,
A network failure suddenly appeared today at 01:40Z. No changes were made beforehand.
Analysis shows that I can still reach some of my tunnels, and that the behaviour differs between UDP, TCP and ICMP6, between targets, and also between source addresses.
For ping/ICMP6 it currently looks like this:
                                          tun-1  tun-2  tun-3
from 2003:e7:171d:6eff::1                 FAIL   FAIL   FAIL
from 2003:e7:171d:6ee1:41d:92ff:fe01:104  ok     FAIL   FAIL
from 2003:e7:171d:6ee0:41d:92ff:fe01:105  ok     ok     ok
from 2003:e7:17ff:f22:41d:92ff:fe01:301   FAIL   ok     ok
tcpdump shows that the packets leave the origin towards the Internet (via DSL in this case), but no gif-encapsulated packet arrives on the interface of the destination system.
I have no clue why or where in the network they disappear.
Pinging the two tunnel endpoints (ingress and egress) shows a different pattern again:
ingress                                   tun-1  tun-2  tun-3
from 2003:e7:171d:6eff::1                 FAIL   ok     FAIL
from 2003:e7:171d:6ee1:41d:92ff:fe01:104  FAIL   ok     FAIL
from 2003:e7:171d:6ee0:41d:92ff:fe01:105  ok     FAIL   ok
from 2003:e7:17ff:f22:41d:92ff:fe01:301   FAIL   FAIL   FAIL

egress                                    tun-1  tun-2  tun-3
from 2003:e7:171d:6eff::1                 FAIL   ok     FAIL
from 2003:e7:171d:6ee1:41d:92ff:fe01:104  ok     ok     FAIL
from 2003:e7:171d:6ee0:41d:92ff:fe01:105  ok     ok     ok
from 2003:e7:17ff:f22:41d:92ff:fe01:301   FAIL   FAIL   ok
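For anyone who wants to reproduce such a matrix, here is a rough sketch of how the tables above could be generated. It assumes a ping that accepts -S for the source address (as the FreeBSD one does); the function names and the PING_CMD hook are made up for this example and are not part of any existing tool.

```shell
# Sketch: probe every source/target pair once and print one ok/FAIL row
# per source address. PING_CMD is a hypothetical hook that lets the probe
# be swapped out (for example by a stub when testing the loop itself).

default_ping() {
    # $1 = source address, $2 = target; one quiet echo request
    ping -q -c 1 -S "$1" "$2" >/dev/null 2>&1
}

probe_matrix() {
    # $1 = space-separated source addresses
    # $2 = space-separated targets
    for src in $1; do
        line="from $src"
        for dst in $2; do
            if "${PING_CMD:-default_ping}" "$src" "$dst"; then
                line="$line ok"
            else
                line="$line FAIL"
            fi
        done
        echo "$line"
    done
}
```

It would be called with the source addresses as the first argument and the tunnel addresses (or their ingress/egress endpoints) as the second, each as one space-separated string.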
# ifconfig vtnet0
vtnet0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>
	ether 06:1d:92:01:03:01
	inet 192.168.98.34 netmask 0xfffffffc broadcast 192.168.98.35
	inet6 fe80::2%vtnet0 prefixlen 64 scopeid 0x1
	inet6 2003:e7:1731:3cff::20 prefixlen 64
	inet6 2003:e7:1731:3cff::21 prefixlen 64
	inet6 2003:e7:1731:3cff::22 prefixlen 64
	inet6 2003:e7:1731:3cff::23 prefixlen 64
	inet6 2003:e7:1731:3cff::24 prefixlen 64
	inet6 2003:e7:1731:3cff::25 prefixlen 64
	inet6 2003:e7:1731:3cff::26 prefixlen 64
	inet6 2003:e7:1731:3cff::27 prefixlen 64
	inet6 2003:e7:1731:3cff::28 prefixlen 64
	inet6 2003:e7:1731:3cff::29 prefixlen 64
	media: Ethernet autoselect (10Gbase-T <full-duplex>)
	status: active
	nd6 options=1<PERFORMNUD>
# for i in 20 21 22 23 24 25 26 27 28 29; do ping -q -c 1 -S 2003:e7:1731:3cff::$i wand.daemon.contact >/dev/null 2>&1 && echo $i okay; done
21 okay
23 okay
24 okay
25 okay
26 okay
28 okay
The pattern is reproducible. Overall, the failure rate is almost exactly 50%.
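To put a number on this per target, the same loop can count successes instead of just printing them. This is only a sketch: the prefix is the one from the ifconfig output above, while count_ok, ping_from and the PING_CMD hook are names invented for this example.

```shell
# Sketch: count how many source addresses can reach a given target.
# PREFIX is taken from the ifconfig output; PING_CMD is a hypothetical
# hook so the probe can be replaced by a stub.
PREFIX="2003:e7:1731:3cff::"

ping_from() {
    # $1 = source address, $2 = target; one quiet echo request
    ping -q -c 1 -S "$1" "$2" >/dev/null 2>&1
}

count_ok() {
    # $1 = target, remaining arguments = address suffixes (20, 21, ...)
    target=$1; shift
    ok=0; total=0
    for i in "$@"; do
        total=$((total + 1))
        if "${PING_CMD:-ping_from}" "${PREFIX}${i}" "$target"; then
            ok=$((ok + 1))
        fi
    done
    echo "$target: $ok of $total sources okay"
}
```

For example, count_ok wand.daemon.contact 20 21 22 23 24 25 26 27 28 29 prints a single summary line for that target; running it against several targets makes the difference between them easy to compare.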
However, when pinging Google or freebsd.org instead:
# for i in 20 21 22 23 24 25 26 27 28 29; do ping -q -c 1 -S 2003:e7:1731:3cff::$i freebsd.org >/dev/null 2>&1 && echo $i okay; done
20 okay
21 okay
22 okay
23 okay
24 okay
25 okay
26 okay
27 okay
28 okay
29 okay
So it is not a problem with IPv6 at the originating site (or with the router or anything else here).
Can it be a problem with the destination site (wand.daemon.contact in the example above)?
I have two tunnel endpoint IPv6 addresses (tunnel123456-pt.tunnel.etc.etc.he.net and tunnel123456.tunnel.etc.etc.he.net), and both of them show the same 50% failure rate.
One of these is run entirely by HE, so it doesn't look like a flaw on my side. :(