solaris2007
yesterday at 10:59 PM
If production vxlan ends up going over Wireguard, then someone in leadership failed to plan and the underlying Wireguard tunnel is coping with that failure. No doubt OP already knows this all too well.
The problem is no doubt a people problem. I have learned to overcome these people problems by adhering to specific kinds of communication patterns (familiar to Staff Engineers and SVPs).
There is no reason that Wireguard over vxlan over Wireguard can't work, even with another layer (TLS) on top of Wireguard. Nonetheless, it is very suboptimal, and proprietary implementations of vxlan tend to behave poorly in unexpected conditions.
We should remember that vxlan is next-gen vlan.
The type of Wireguard traffic encapsulated within the vxlan that comes to mind first is Kubernetes intra/inter-cluster pod-to-pod traffic. But this Wireguard traffic could just as well be between two legacy-style VMs.
If I were the operator told "you need to securely tunnel this vxlan traffic between two sites", I would reach for IPsec instead of Wireguard in an attempt to not lower the MTU of encapsulated packets too much. Wireguard is a layer 4 (udp) protocol intended to encapsulate layer 3 (ipv6 and legacy ip) packets.
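To put rough numbers on the MTU concern, here is a back-of-the-envelope sketch of how the stacked layers eat into a 1500-byte link. The overheads are the commonly quoted ones (Wireguard 60 bytes over IPv4 and 80 over IPv6, vxlan 50 and 70 including the inner Ethernet header). The 1500-byte link, the choice of address family per layer, and the helper name inner_mtu are assumptions for illustration. IPsec is not modeled because its overhead depends on mode and cipher.

```python
# Per-layer overhead in bytes, keyed by the address family of the outer header.
# Wireguard: IP header + UDP (8) + data message header (16) + auth tag (16).
WIREGUARD_OVERHEAD = {"ipv4": 60, "ipv6": 80}
# vxlan: IP header + UDP (8) + VXLAN header (8) + inner Ethernet header (14).
VXLAN_OVERHEAD = {"ipv4": 50, "ipv6": 70}


def inner_mtu(link_mtu, layers):
    """Return (layer name, MTU left inside that layer), outermost layer first."""
    remaining = link_mtu
    steps = []
    for name, overhead in layers:
        remaining -= overhead
        steps.append((name, remaining))
    return steps


# Assumed stack: outer Wireguard and vxlan over IPv4, inner Wireguard over IPv6.
stack = [
    ("outer Wireguard (IPv4)", WIREGUARD_OVERHEAD["ipv4"]),
    ("vxlan (IPv4)", VXLAN_OVERHEAD["ipv4"]),
    ("inner Wireguard (IPv6)", WIREGUARD_OVERHEAD["ipv6"]),
]

for name, mtu in inner_mtu(1500, stack):
    print(f"{name}: {mtu}")
```

Under those assumptions the usable MTU goes 1440, then 1390, then 1310, which is only just above the 1280-byte minimum that IPv6 requires of a link.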
If I were the owner of the application, I would bake mutual TLS authentication on QUIC with "encrypted hello" (both elliptic curve and PQ, for redundancy) into the application. The application would be implemented in Rust, or if that is not practicable, I would write into the Helm chart a sidecar that does the mutual TLS auth part (in Rust, of course).
I would also aggressively "ping" in some manner through the innermost encapsulation layer. If I had tenancy on a classic VM doing Wireguard over the vxlan, I would have "ping -i 2 $remote_inside_tunnel_ipv6" running indefinitely.
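If a bare ping loop is not enough and you want something that counts consecutive failures, a minimal sketch could look like the following. It assumes a Linux iputils ping (the -6, -c and -W flags). The function names, the fd00::2 address and the injectable run/sleep parameters are hypothetical and only there to make it easy to try out.

```python
import subprocess
import time


def build_ping_command(address, timeout_s=2):
    # One IPv6 echo request, waiting at most timeout_s seconds for the reply.
    return ["ping", "-6", "-c", "1", "-W", str(timeout_s), address]


def probe(address, run=subprocess.run):
    """Return True if a single ping through the tunnel succeeds."""
    result = run(
        build_ping_command(address),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(address, interval_s=2.0, max_probes=None,
            run=subprocess.run, sleep=time.sleep):
    """Probe every interval_s seconds; return the trailing consecutive failures.

    With max_probes=None this loops forever, like "ping -i 2".
    """
    failures = 0
    sent = 0
    while max_probes is None or sent < max_probes:
        if probe(address, run=run):
            failures = 0
        else:
            failures += 1
            print(f"{address}: {failures} consecutive probe(s) lost")
        sent += 1
        sleep(interval_s)
    return failures


# Example (address is a placeholder for the remote inside-tunnel IPv6):
# monitor("fd00::2")
```

Besides telling you when the path breaks, the steady trickle of traffic keeps state alive in any NAT or firewall along every layer of the stack.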