• Welcome to Hurricane Electric's IPv6 Tunnel Broker Forums.

strange issue with packet encapsulation

Started by xitop6, October 31, 2021, 01:02:53 PM


xitop6

For several weeks I have been experiencing a weird problem that I can neither explain nor fix. I'm not even sure where it happens. Any help would be really appreciated.

I have two Internet IPv4 connections (mutual backup) and two corresponding IPv6 tunnels. I have no connectivity issues except this one.

It occurs on the IPv6 tunnel whose endpoint is in Prague (Czech Republic). My other tunnel is not affected.

Ping is fine, traceroute is fine, but all downloads except the tiny ones fail.

I spent a lot of time debugging the issue. I have lowered the MTU, clamped the TCP MSS, but it didn't help. Finally I captured the packets for analysis:

1. The TCP connection starts normally (SYN, SYN+ACK, ACK)

2. The first part (about 30-120 kB of data) is transferred without problems, which means an exchange of roughly 30 to 100 incoming data packets and the same number of outgoing ACKs.

3. Then the size of the incoming packets suddenly increases by 20 bytes and things go wrong. The result is dropped packets, retransmissions, resets, etc., and the download fails.
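
For reference, the MTU and MSS values involved in the tuning mentioned above can be estimated with simple arithmetic. This is only a sketch: the link MTU of 1492 is an assumption (a common PPPoE value, not stated in the post), and the header sizes assume no IPv4 or TCP options. It also shows that a second 20-byte IPv4 header on a packet already sized to the link MTU would no longer fit in one frame, although whether that is what breaks the download here is not established.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
TCP_HEADER = 20   # bytes, TCP header without options


def sit_tunnel_mtu(link_mtu: int) -> int:
    """Largest IPv6 packet that fits in one IPv4 packet on the link."""
    return link_mtu - IPV4_HEADER


def tcp_mss_over_ipv6(ipv6_mtu: int) -> int:
    """Largest TCP payload per segment for a given IPv6 MTU."""
    return ipv6_mtu - IPV6_HEADER - TCP_HEADER


link_mtu = 1492  # assumed PPPoE MTU, hypothetical value for illustration
tunnel_mtu = sit_tunnel_mtu(link_mtu)
mss = tcp_mss_over_ipv6(tunnel_mtu)
print("tunnel MTU:", tunnel_mtu)  # 1472
print("TCP MSS:", mss)            # 1412

# A full-size tunnel packet occupies the whole link MTU; one extra
# IPv4 header would make it 20 bytes too large for the link.
full_size_on_wire = tunnel_mtu + IPV4_HEADER
print("with extra header:", full_size_on_wire + IPV4_HEADER, ">", link_mtu)
```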

Using Wireshark, I found out that the additional 20 bytes are another IPv4 header.

A SIT tunnel packet normally looks like this:
[IPv4 header, protocol 41]
[IPv6 packet]

In step 3, I suddenly start receiving this:
[IPv4 header #1, protocol 41]
[IPv4 header #2, protocol 41]
[IPv6 packet]

Both IPv4 headers contain correct IPv4 addresses and correct payload lengths (i.e. the length in the outer header is 20 bytes larger than in the inner one). It looks like the IPv6-in-IPv4 traffic changes in the middle of the download and becomes encapsulated twice.

I captured the packets on a PPPoE interface; they enter my router (Linux) already malformed.
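
The check described above can be sketched in a few lines of Python for anyone who wants to test their own capture. This is a minimal illustration, not the method used in the original analysis (which was done in Wireshark): it counts how many IPv4 headers with protocol 41 are stacked in front of the IPv6 packet. The function names and the synthetic test packets (with documentation-range addresses and zero checksums) are hypothetical; the input is assumed to be raw bytes starting at the outer IPv4 header.

```python
import struct

PROTO_IPV6_IN_IPV4 = 41  # IP protocol number for IPv6 encapsulated in IPv4


def parse_ipv4_header(data: bytes):
    """Return (header_len, total_len, protocol), or None if not IPv4."""
    if len(data) < 20:
        return None
    version = data[0] >> 4
    header_len = (data[0] & 0x0F) * 4
    if version != 4 or header_len < 20 or len(data) < header_len:
        return None
    total_len = struct.unpack("!H", data[2:4])[0]
    protocol = data[9]
    return header_len, total_len, protocol


def count_sit_layers(packet: bytes) -> int:
    """Count stacked IPv4 headers that declare protocol 41.

    1 means a normal SIT packet, 2 means double encapsulation.
    """
    layers = 0
    offset = 0
    while True:
        header = parse_ipv4_header(packet[offset:])
        if header is None or header[2] != PROTO_IPV6_IN_IPV4:
            break
        layers += 1
        offset += header[0]
    return layers


def build_ipv4_header(payload_len: int) -> bytes:
    """Build a synthetic 20-byte IPv4 header (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len, 0, 0, 64, PROTO_IPV6_IN_IPV4, 0,
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )


# Synthetic data: a bare 40-byte IPv6 header (version nibble 6).
ipv6_packet = bytes([0x60]) + bytes(39)
normal = build_ipv4_header(len(ipv6_packet)) + ipv6_packet
double = build_ipv4_header(len(normal)) + normal

print(count_sit_layers(normal))  # 1
print(count_sit_layers(double))  # 2
```

The total length in the outer synthetic header is 20 bytes larger than in the inner one, matching the observation above.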

tomkep

Having IPv4 header #2 when IPv4 header #1 indicates protocol 41 (IPv6) is an obvious error.

This looks like an encapsulation error, possibly on HE's side. You should send the capture to the HE help desk and ask them to have a look on their end.

xitop6

The issue still persists.

I informed H.E. before posting, but got no reply. I have no reason to assume the problem is on their side, as I really have no idea which equipment or software is to blame, or on which side.

I hoped to find others with the same problem to compare what our setups have in common, but nobody else has complained so far.



xitop6

Update:

- my Internet connections are named ADSL and LTE
- I'm using HE endpoints at Prague, Czechia (CZ) and Budapest, Hungary (HU)

I had this setup:

  ADSL --- CZ (having the problem)
  LTE  --- HU (OK)

Today I swapped the endpoints:

  ADSL --- HU (now the issue is here)
  LTE  --- CZ (OK)

So, I will contact the ADSL provider.

xitop6

UPDATE: With a lot of help from my ISP (many thanks), after several tests and experiments, we made some progress:

- the double encapsulation issue was confirmed, they were able to reproduce it
- the issue could be reproduced with a test SIT tunnel. This confirms that H.E. has nothing to
  do with the issue
- the Linux network stack is also not involved; the malformed packets were detected on the wire
- changing the mode from SIT to GRE solves the issue

To summarize:
The issue happens somewhere in transit and probably affects some ADSL customers here in Slovakia (a small EU country).
ADSL is an old technology, but some regions do not have access to better alternatives.
I'm sorry that we could not pinpoint the exact piece of equipment causing the trouble, but the ADSL infrastructure
is operated by a major telco company that we didn't contact.