> I like stateful tracking better.
There are advantages and disadvantages.
Stateful tracking breaks down if packets in the two directions do not travel through the same device. So once you do stateful tracking, you cannot have redundant paths through different devices. You probably already have a single point of failure on the connection to your customer, and if you can put the stateful tracking there, then you are not introducing another single point of failure.
Stateful tracking also breaks down if the device maintaining the state is restarted. An open TCP connection is supposed to be able to stay alive for long periods of time with no packets being exchanged. As long as the two endpoints of the TCP connection are not restarted, they can just resume communication when there is new data to transfer (or when a receiver is ready for more data).
Routers on the path may have been restarted in the meantime, and even if they haven't, they have no way of knowing if the connection is gone because both ends of the connection were restarted. So there is no certain way of knowing if an entry in the table can be disposed of.
Some of these problems can be avoided with a stateful firewall that is capable of reestablishing state mid-connection. For example, if a typical TCP packet is received from the outside, you let it through to the customer, even if there is no matching entry. If a packet is sent by the customer, you let it through regardless of missing state, and if it is not a RST packet, you create the state.
Packets you allow in may have the ACK and PSH flags, but no other flags. They may also carry a single timestamp option, and no other options. Make sure they are completely ordinary, such that there is no way they could be exploiting a bug in the TCP stack. That way you can be confident that in the worst case, the packet is just going to trigger a RST, like it should. If the packet is actually valid, the customer's computer will respond, and the state will be recreated when the outgoing packet goes through the firewall.
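To make the rule concrete, here is a minimal sketch of that logic in Python. It is not taken from any real firewall. The names `is_plain_segment` and `Firewall` and the flow key are made up for illustration. I am also assuming two things not stated above: that ACK must be set (every segment on an established connection carries it), and that NOP padding around the timestamp option is tolerated.

```python
from dataclasses import dataclass, field

# TCP flag bits as laid out in the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

# TCP option kinds: 1 = NOP (padding), 8 = Timestamps.
OPT_NOP, OPT_TIMESTAMP = 1, 8


def is_plain_segment(flags: int, option_kinds: list) -> bool:
    """True if an inbound segment looks completely ordinary."""
    # Assumption: ACK is required, as on any established connection.
    if not flags & ACK:
        return False
    # Any bit other than ACK and PSH disqualifies the segment.
    if flags & ~(ACK | PSH):
        return False
    # Assumption: NOP padding is ignored; beyond that, allow either
    # no options or exactly one timestamp option.
    real = [kind for kind in option_kinds if kind != OPT_NOP]
    return real in ([], [OPT_TIMESTAMP])


@dataclass
class Firewall:
    # Hypothetical flow key, identical for both directions:
    # (customer_addr, customer_port, remote_addr, remote_port)
    states: set = field(default_factory=set)

    def outbound(self, flow, flags: int) -> bool:
        # Customer traffic always passes; anything but a RST creates state.
        if not flags & RST:
            self.states.add(flow)
        return True

    def inbound(self, flow, flags: int, option_kinds: list) -> bool:
        # Known flows pass; unknown flows pass only if the segment is plain.
        if flow in self.states:
            return True
        return is_plain_segment(flags, option_kinds)
```

Note that an inbound packet never creates state by itself. The entry only comes back once the customer's computer answers, which matches the behaviour described above.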
This doesn't solve the problem of the two directions going through different firewalls. But you can restart the firewall or reroute both directions through another firewall, in case one goes down.
As far as UDP traffic goes, I think most of it is simple RPC protocols like DNS lookups. Those are very short-lived, and should the state get lost in the middle of an RPC, the client will retransmit. More complicated UDP-based protocols can easily be made to deal with stateful tracking: all they have to do is ensure that packets are periodically sent in both directions. Sending a stream of UDP packets without periodically getting confirmation that the receiver is still expecting packets is bad practice anyway. (I know an interesting story about what bad stuff can happen, but unfortunately it is confidential.)
> it seems like it should help a lot with ND exhaustion attacks
Certainly. If your prefix is /64 or /96 or anything in between, then ND exhaustion attacks are an issue to worry about. But anything from /116 to /127 should keep you protected from ND exhaustion attacks.
As long as you also need to support IPv4 on those links, you probably don't want thousands of nodes, so a /116 would give enough addresses. I think you said you wanted to keep the prefixes on nibble boundaries, so you could go with /116, /120, or /124 depending on the number of nodes on the link. But I don't think using a /126 for P2P links adds much confusion either.
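To put numbers on those prefix lengths: the size of the on-link address space is an upper bound on how many distinct addresses an attacker can make the router try to resolve. A quick sketch with Python's standard `ipaddress` module shows the difference. The helper name `addresses_in_prefix` is mine, and 2001:db8:: is just the documentation prefix used as a placeholder.

```python
import ipaddress


def addresses_in_prefix(length: int) -> int:
    """Number of addresses covered by an IPv6 prefix of the given length."""
    # 2001:db8::/32 is reserved for documentation; any base would do here.
    return ipaddress.ip_network(f"2001:db8::/{length}").num_addresses


if __name__ == "__main__":
    for length in (64, 96, 116, 120, 124, 126, 127):
        print(f"/{length}: {addresses_in_prefix(length)} addresses")
```

A /116 comes out at 4096 addresses and a /124 at 16, while a /64 has 2^64, far more than any neighbor cache can hold.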
> I just picked one of many options and ran with it.
That's also the way to go. I was just wondering because the problem you described and the solution didn't sound like they were matching.
> Honestly, of all the DoSes I've ever been a victim of, most of them haven't been very interesting. They've generally been the small packet, high PPS UDP variety, not super exciting 0-day.
That sort of issue is not specific to either version of the protocol. I don't think there are many nodes on the network that can actually handle a sustained flow of minimum-sized packets. How you react to such a flow is of course an issue. I have seen operating systems that would crash if you looped an Ethernet cable back to the same switch.
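For a sense of scale, here is a rough back-of-the-envelope calculation of how many minimum-sized frames fit on an Ethernet link per second. It assumes the usual Ethernet figures: a 64-byte minimum frame plus 20 bytes of per-frame overhead on the wire (8 bytes of preamble and start delimiter, 12 bytes of inter-frame gap). The function name is made up for illustration.

```python
# Per-frame overhead on the wire: 8 bytes preamble/SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64


def max_frames_per_second(link_bps: int, frame_bytes: int = MIN_FRAME_BYTES) -> int:
    """Upper bound on frames per second for a link of the given bit rate."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps // bits_per_frame


if __name__ == "__main__":
    for name, bps in (("100 Mbit/s", 100_000_000), ("1 Gbit/s", 1_000_000_000)):
        print(f"{name}: {max_frames_per_second(bps)} frames/s")
```

That works out to roughly 1.49 million packets per second on gigabit Ethernet, each of which the receiver has to process individually.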
There is actually a weakness in TCP which can be used to remotely cause a peer to send a large number of small packets that can overload that peer's local network. I have PoC code. When testing it on my own LAN, I found three different places that were vulnerable to such a DoS attack.
- The attacked computer would disable the affected network interface for about five seconds each time it was attacked. Judging from the kernel log, it sounded like the driver suspected a hardware fault.
- Flow control on gigabit Ethernet is flawed. Pushback means that in certain situations a 100Mbit/s flow can fully block a 1Gbit/s link.
- One computer would have a kernel panic if it was on the path from the attacked computer towards the attacker.