Some info about WireGuard, a new VPN:
Project website: www.wireguard.com
Inception: 2015.
Lead developer: Jason A. Donenfeld. WireGuard and its associated service marks are registered trademarks of Jason A. Donenfeld.
Website sponsors: ZX2C4 (Jason A. Donenfeld) and Edge Security.
Package names: wireguard-tools on Linux (OpenSuSE and numerous other distros), and for Android, WireGuard from com.wireguard.android. Also available for macOS, iOS (iPhone), and Microsoft Windows.
I have just (2021) gone through yet another audit of my VPNs, making sure that they work for all relevant clients and that the vpn-tester program can competently report if they are or aren't working. Programs and versions installed as of 2024-11-01:
wireguard.
StrongS/WAN and OpenVPN work well when properly configured, but they have a number of less than wonderful features:
The learning curve is steep for routing packets properly through the tunnel (of course excluding bearer packets) and for providing credentials in the required form to authenticate the two ends.
The network paradigm for IPSec is, if an IPv6 packet has an IPSec header, the corresponding Security Association has the information (crypto algo, key, etc.) which the kernel can use to decrypt it. The headers and payload thus revealed are then processed normally. Outgoing packets are selected by a Traffic Selector (different from normal routing) and the inverse transformation is performed, after which the encrypted packet is sent out through normal routing. IPv4 uses ESP protocol packets instead. I found that it was often a challenge to get the Traffic Selector right, and it was also a challenge to extract the certificate's Distinguished Name in the format Charon wants. (Using a SAN turns out to be easier.)
IPSec connects fairly promptly, initially or after a net disruption, but OpenVPN takes several seconds to do this.
OpenVPN uses a tun/tap device, and payload packets pop out of it or are stuffed into it by normal routes, as if it were a physical net interface. It's a lot easier to handle routing in this context, which WireGuard shares.
If a tunnel's bearer packets get routed down the tunnel that they are bearing, this is a chicken and egg issue which ends up as an omelet. OpenVPN provides an anti-route so traffic between the client and the server goes by normal routing, not the VPN. But this means all traffic, not just bearer packets, and such traffic does not get protected by the VPN. Policy routing can mitigate the issue, but OpenVPN has to run on operating systems, like Windows, which don't have policy routing.
OpenVPN is all in user space, and it stands out as using a lot of CPU time. IPSec and WireGuard do all the crypto in the kernel, so the CPU time doesn't get accounted to any process, but both of them give the impression of alacrity.
Both IPSec and OpenVPN have a lot of code in the packages: around 400,000 and 600,000 lines respectively. This doesn't affect the end user directly, but I remember a quote from Wietse Venema, the author of the Postfix mail transport agent: he said his code has about one bug per thousand lines, and if you introduce complexity (he was talking about TLS for mail transport) you should think about exploits against the bugs and accidental loss of service. Since WireGuard has only about 4000 lines of code, it is much less vulnerable to this kind of bug-borne issue.
Responding to shortcomings in existing VPN software, Jason A. Donenfeld in 2015 began to develop WireGuard, a new VPN. The project website describes these features; whether they're scored as good or bad depends on the user's goals.
Drastically reduced complexity; features not absolutely essential were sacrificed for this goal. He claims only 4000 lines of code.
A lot fewer configurable aspects; for example there is [currently, 2024] only one symmetric crypto algo, ChaCha20Poly1305 by Daniel Bernstein, so no configuration and no negotiation with the peer. Negotiation is very complex for both IPSec and OpenVPN.
A Curve25519 (Edwards Curve Diffie-Hellman) public key, locally generated, serves as both the authentication credential and the foundation of the tunnel's encryption, similar to the style of SSH.
WireGuard sends bearer packets by UDP only, no TCP and no unusual protocols like ESP. Check out udptunnel and udp2raw for add-on layers if you need TCP (which I thought I did).
Very fast connection setup: the initiator sends one handshake packet, the responder sends one back, whereupon they both can infer the symmetric keys to encrypt payloads. The CPU time needed to do this is obviously minimal.
Fixed IP addresses and ports are not required. Whatever address a bearer packet arrives from, if the connection identifier points to an active connection and if the recipient can decrypt the payload (verified by the MAC in the AEAD encryption algo), the bearer packet is accepted as authentic and future replies go back to that address and port, until the next IP change. This is particularly important for cellphones; they change IPs whenever they change protocols, e.g. 5G to LTE.
The ChaCha20Poly1305 symmetric crypto algo (AEAD type) is faster than the competitors. But hardware acceleration for it is very rare. On the Intel Core® i5-10210U, jimc's tests score it as half as fast as hardware accelerated AES-256 (Rijndael), and twice as fast as software AES-256. This difference would only be significant for a server with thousands of clients.
The protocol isn't chatty: the only packets sent are bearer packets containing payloads, and key establishment (and rekeying). You can configure keepalive packets (zero-length payloads) if your net needs them. The responder doesn't respond to and doesn't expend resources on unauthorized initiators.
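If you do need keepalives, they are requested per peer in the configuration; a minimal sketch (PersistentKeepalive is the real option name, 25 seconds is just a commonly used value, and the key is a placeholder):

# ask WireGuard to send a zero-length payload to this peer every 25 seconds
[Peer]
PublicKey = (the peer's public key)
PersistentKeepalive = 25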
Some details of the Edwards Curve Diffie-Hellman key establishment procedure are interesting. See this Wikipedia article about ECDH, which I've summarized. See also the EdDSA (Edwards Curve Digital Signature Algorithm) article, the section on ED25519.
Parameters for ECDH are agreed on in advance; WireGuard has only one set of parameters built in. It uses a modular field. NSA guidance says that a field of size in the range of 2^256 is sufficient for protecting Top Secret data; that is, the Black Hats would have to run billions of dollars worth of computers for a year or more to crack one key (and WireGuard re-keys about every 2 minutes). The actual modular field size is 2^255-19.
A private (secret) key for ECDH is a randomly chosen point in the modular field, basically a 255 bit random number excluding the 19 that won't fit, and the identities 0 and 1. Call it S. For the public key, a number G is agreed on, and it is added to itself S times; that is, G is multiplied by S. Call the product Q.
An attacker could recover the private key by dividing Q by G. If the operands were integers we would have an efficient algorithm to do this, long division, but in the modular field the division needs effort similar to cracking a 128 bit symmetric key, half the ECDH bit length. This is the effort level currently considered adequate for protecting Top Secret data.
For each connection (or re-key), each peer creates an ephemeral Diffie-Hellman key pair. On modern Linux on modern hardware there is normally enough physically random entropy to produce the private key, though it is whitened by a trip through a pseudorandom generator, requiring a 256 bit multiplication; another multiplication produces the public key. This is a lot less effort than generating a prime factor key pair (RSA).
The initiator sends its static (permanent) public key and the ephemeral public key that it just generated; neither is encrypted, so the attackers know them. The responder sends back its ephemeral public key, but the initiator is supposed to have the responder's static public key in a configuration file. Each peer also sends a counter which prevents replay attacks, and an encrypted dummy payload which, if successfully decrypted, assures each peer that the other one holds the private keys that correspond to both public keys that it proffered or that was configured.
Each peer, for each of the static and ephemeral keys, multiplies the other end's public key by its own private key. Remember that the other end's public key is its private key times G, so the resulting product is the other end's private key times G times the local private key (but done in the opposite order at the other end). Since multiplication in modular rings (including fields) is commutative, both will get the same answer: the Diffie-Hellman shared secret. The peers hash up the shared secrets identically to produce the symmetric key which they will both use to encrypt or decrypt payloads.
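In symbols, a minimal sketch of why both ends agree ($S$ denotes a private key, $Q$ the matching public key, $G$ the agreed base value):
$$Q_A = S_A \cdot G, \qquad Q_B = S_B \cdot G, \qquad S_A \cdot Q_B = S_A \cdot (S_B \cdot G) = S_B \cdot (S_A \cdot G) = S_B \cdot Q_A$$
and hashing that common value identically gives both peers the same symmetric key.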
For authentication, the symmetric encryption algo is AEAD type, which includes a MAC, so each end can tell authoritatively whether it decrypted the payload successfully, and if so, it knows authoritatively that the other end used the private key corresponding to the public key that was proffered; in other words, each peer can be sure of the identity of the other (unless its key was stolen).
For authorization, the initiator has the responder's static public
key in a configuration file, so it can be sure that it is talking to
the intended responder. The responder has a list of public keys of
every initiator authorized to connect. It knows on the first packet
who the initiator claims to be, and will only respond if that public
key is on its list. The list can be added to or pruned on the fly by
the provided wg
utility, which would be called by out-of-band
facilities that aren't part of WireGuard, analogous to the
charon
key management daemon of StrongS/WAN (IPsec) and related
functions in OpenVPN. See below: Anubis, Your Guide
to the Underworld.
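A hedged sketch of such on-the-fly maintenance with the wg utility (the key and the address range are placeholders, not real credentials):

# authorize a new peer without restarting anything
wg set wg0 peer 'PEER_PUBLIC_KEY_BASE64=' allowed-ips 192.0.2.0/24
# prune it again
wg set wg0 peer 'PEER_PUBLIC_KEY_BASE64=' remove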
What are my goals for the VPNs, and how much hassle will it be to make WireGuard deliver what I need, so I can add it to my collection?
Resistance to bit-rot: changes in the system configuration tend to have a bad effect on operation of the VPNs, and I hope WireGuard will be affected less than IPSec and OpenVPN.
While I always use UDP if feasible, avoiding the dreaded TCP Meltdown syndrome, I've found that hotel Wi-Fi often blocks UDP in general and VPN ports in particular. Thus I support and use TCP on port 443, which the Wi-Fi access points have got to pass. Note: some authoritarian nations block VPN ports nationally. 443/TCP can bypass this, but the Secret Police could recognize that it was not normal HTTPS traffic, even if they can't decrypt the payload, with baleful consequences for the perpetrator.
OpenVPN can multiplex VPN traffic and another protocol (e.g. HTTPS) on its listen port, so the VPN server can host a normal webserver as well. For WireGuard, it's probably better to not mess with protocol conversion or tunneling from TCP to UDP. Rather, 53/UDP for DNS is another port that the hotel weasels can't block, and I'm having DNS listen only to localhost:53(UDP) while WireGuard listens to 0.0.0.0:53(UDP), ignoring occasional DNS packets from hackers.
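A minimal sketch of the port-53 arrangement, assuming BIND as the local resolver (the option names are real; nothing else in this snippet is specific to my hosts):

# /etc/wireguard interface stanza: WireGuard takes the DNS port on the wildcard address
[Interface]
ListenPort = 53

# named.conf: the resolver binds only localhost
options {
        listen-on port 53 { 127.0.0.1; };
        listen-on-v6 port 53 { ::1; };
};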
There are four VPN routes that need to work:
The segment tunnel: It goes from the main router Jacinth to a
host in the cloud called Surya. See
A Server in the Clouds
for what is being accomplished here. Basically my ISP doesn't have
native IPv6, the Hurricane Electric tunnel broker runs afoul of
some content distribution services, and so the local net's default
route for IPv6 goes out via my own cloud host.
Xena and Petra to Jacinth: Xena is my laptop, which roams, and Petra is a virtual machine on it for development. See Network for Xena's Virtual Machine. But how do other hosts know where to send packets destined for Xena and Petra? Solution: they always send via Jacinth, and Xena always initiates a VPN to Jacinth even when it's not roaming.
Selen to Jacinth: Selen is a cellphone with Android, and it
also roams. To get to the webmail and PIM server from off-site
it needs a VPN. See Jimc's
Awesome PIM Server
for the packages I'm using. Unlike
Xena, Selen would prefer to not use the VPN unless it is both
roaming and using PIM, or is working with local LAN hosts, or is
doing something for which privacy or integrity is critical.
VPN tester: Two local LAN hosts are chosen (not necessarily the same every time, depending on which are up or down). Host A connects to Jacinth's or Surya's VPN server and sends packets to B; the tester checks if the packets go direct (the NoVPN case) or via the server, and whether their content is really encrypted. For this to work, every LAN host has to be able to connect to Jacinth's WireGuard. The test is done daily; most of the time the LAN host is not using WireGuard.
For authorization, VPNs can be set up two ways. In the historic design each connection has individual credentials installed, typically in the form of pre-shared symmetric keys. But in World Wars I and II stealing symmetric keys was a very productive activity. Modern VPNs, such as IPSec and OpenVPN (starting in version 2.x), install a credential (normally an X.509 certificate and private key) on the server and on each client; the client certs are all signed by one Certificate Authority (or intermediate cert) which the server requires in the client's trust chain. The server doesn't need the clients' certs individually. I don't really have a big herd of users and I can handle either arrangement, but X.509 certs are what I'm using now. Preview: for WireGuard, each connected pair of peers needs to be configured at each end, but the credentials are Edwards Curve Diffie-Hellman public keys of 255 bits (32 octets) that are also used for the crypto, not pre-shared symmetric keys.
There's an issue that makes a lot of trouble for designing a net
with VPNs: some clients always use the VPN and some don't. I'm
implicitly assuming a central server that all the clients work through.
For the always VPN
case the right
routing setup is to
assign the client's hostname to a fixed address on the VPN tunnel
device, and the client's WireGuard listens for and sends bearer packets
from a different address on a real interface. Idiosyncrasies of my always-VPN hosts are:
Xena has an internal subnet implemented as a bridge, with a virtual host in it. Xena's name is assigned to the bridge, with an address off the local LAN. wg0 has a separate name (xenawg) with an address in a range dedicated to WireGuard. The server has a peer relation with xenawg and sends and receives traffic to/from Xena and its VM Petra through WireGuard. Xena's real interfaces include both Wi-Fi and Ethernet, which have separate LAN addresses. By normal routing, bearer packets come out there and go to the server, which responds to whichever interface Xena is currently sending from.
Jacinth is a client of Surya, the cloud
server. (Client
means that it initiates the bidirectional
tunnel, since Surya has a fixed IPv4 address while Jacinth's isn't
fixed.) Jacinth has a bridge containing the vnet interfaces of two
virtual hosts plus the Ethernet interface to the LAN. All the
bridged hosts have LAN addresses. Jacinth's WireGuard interface
has its own name and address in the dedicated WireGuard range,
similar to Xena. All LAN hosts have a default route through
Jacinth, so subnets like the one on Xena can be routed
appropriately on Jacinth.
Surya's wild side addresses are on its
real
Ethernet interface. Actually it's a rented VM, and
reality is a relative term, but for figuring out the design we can
ignore virtual networking for this VM. Bearer packets are sent
from and to the wild-side addresses, both by Jacinth's end of the
segment tunnel and by roaming clients operating on the wild side.
Surya's WireGuard name (suryawg) and address are on its WireGuard
interface wg0. Its own name and internal address are on a dummy
device.
The harder case is when the client sometimes uses the VPN and sometimes doesn't, like the VPN tester and my cellphone. The way I solved this was: if one peer connects to another by the other's ordinary name (not its WireGuard name), the connection doesn't go via WireGuard, and most normal traffic is handled this way. But if the initiator connects to its peer's WireGuard address (and name), the traffic necessarily goes via WireGuard. No configuration changes are needed; the client software can choose whether or not to use WireGuard; for example the VPN test program tests both.
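For example, assuming the naming convention described in the next paragraph (jacinthwg being the hypothetical WireGuard name of the central server):

ssh jacinth      # ordinary name: goes direct, WireGuard not involved
ssh jacinthwg    # WireGuard name: the same host, but the traffic goes through the tunnel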
The network design went through a lot of changes. Here are some highlights of the transformations.
My wife and I are moving to a different city to be closer to our son. As our plans became firmer, it became clearer that the cloud server Surya would need to take a hot spare role, to be activated when Jacinth and all the other LAN hosts vanish into a packing box. This means that Surya will need to become a super backup directory server: DNS, Kerberos, LDAP, mail, and replicated home directories. The roaming leaf nodes will be connecting to it, in place of the main router Jacinth, when Jacinth is in transit or in storage.
Each host has a WireGuard interface, that accepts or sends out
packets transmitted over WireGuard, as well as one or more real
interfaces such as Ethernet or Wi-Fi. I had hoped to use the same IP
address on both, which is legal in favorable situations, but their roles
are not really equivalent and the routing couldn't be made to work.
So I now have a separate address range for WireGuard. The addresses have
separate hostnames: the host's basename with wg
appended, e.g. selen
is the name of that host's IPv4+6 LAN addresses, while selenwg is for the
host's WireGuard addresses.
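A sketch of the corresponding /etc/hosts entries (the addresses are made up for illustration; only the naming pattern matters):

192.0.2.73            selen      # LAN addresses
2001:db8:0:306::73    selen
198.51.100.73         selenwg    # dedicated WireGuard range
2001:db8:0:307::73    selenwg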
I had hoped to have WireGuard running only on the hosts that need it operationally, but I switched to activating WireGuard all the time on all hosts, because recognizing when I needed to turn it on or off, and getting the needed root permission, was too much of a can of worms. Fortunately WireGuard needs few resources: the module size is 267kb, there is no user space daemon, and only a few hundred bytes of state are kept for each authorized peer. CPU time is used only if one peer sends from its own WireGuard address or to the other peer's WireGuard address; most peer pairs just ignore the available WireGuard route.
It's essential to keep bearer packets out of the tunnels that they
are bearing. It turns out that on ordinary leaf nodes the bearer packets
find their own way onto the LAN, but when roaming peers are involved,
I need some simple policy routes, same for IPv4 and 6. The first
pair is for IPv4 and the second, identical except for order numbers, is for
IPv6. /usr/share/iproute2/rt_tables is a file mapping table numbers to their friendly names; I'm assuming it has a row "214 wireguard"; 214 is the table used by wg from package wireguard-tools, but another (vacant) table would work equally well. Each rule should have an explicit and unique order number (also called a priority). The automatically provided rules to the local, main and default tables are not shown. I use 4296 as the listen port for bearer packets on all hosts.
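My exact rules aren't reproduced here; a hedged sketch of the pattern (the preference numbers are arbitrary as long as they come before the main table's rule, and the dport selector needs a reasonably recent iproute2 and kernel):

ip -4 rule add pref 100 ipproto udp dport 4296 lookup main
ip -4 rule add pref 110 lookup wireguard
ip -6 rule add pref 200 ipproto udp dport 4296 lookup main
ip -6 rule add pref 210 lookup wireguard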
The rules with dport
flip all bearer packets directly to the
main table, to go out the real interface. All other traffic goes through
the WireGuard table. The routes in it send most traffic into the WireGuard
interface, but there are throw
routes that divert packets for
special cases like an internal LAN with a virtual machine's interface.
Here is the network design that finally worked out:
All hosts means those running a real Linux distro, OpenSuSE Tumbleweed for me. Appliances like the cleaning robot don't do WireGuard, though they do still need to function without using the VPN.
A VPN connects two hosts. It accepts a network packet at either end and sends it to the other end, wrapped to assure privacy (encryption), integrity (checksums) and authenticity (matching WireGuard private and public keys; Black Hats can't insert fake transactions without stealing the private key). These are called bearer packets. A bearer packet must not go through the tunnel that it bears; this is a chicken and egg issue that results in an omelet.
A summary of the design is:
CouchNet (my network) has one central server. It operates 24/7, except when it goes into a packing box for our relocation.
All of the other hosts, referred to as leaves, are set up so they can communicate via WireGuard with the central server.
If they want to communicate via WireGuard with another leaf, the central server will forward the traffic.
Non-WireGuard communication between hosts is encouraged; in the general case traffic via WireGuard is just used for testing and troubleshooting, except…
A few leaves roam, meaning they could leave the LAN and connect to the central server from the wild side. Since the firewalls block almost all traffic from the wild side to LAN hosts including the central server, except VPN bearer packets, the roamers need to send all traffic over WireGuard, which non-roaming leaves could have sent directly.
It's a little impractical for roamers to switch route patterns when they roam away from the LAN or rejoin it, so they always send all LAN traffic via WireGuard, whether or not they are on the LAN.
There are several virtual machines. Non-roaming VMs use bridge networking and behave like any other leaf. But a VM (Petra) on a roaming host (Xena) has a peer relation to its own host, which forwards its packets to the various LAN hosts. These packets aren't bearer packets, but they could be addressed to the other host's WireGuard port. Thus Petra can participate in the LAN the same as any other leaf, but traffic between the VM and the host goes in one hop, rather than hopping through the tunnel to the central server and back, which turns out to be very hard to set up and prone to packet loops.
There is one complication: the central server has on the wild
side a non-fixed IPv4 address (with dynamic DNS) and no IPv6
service. So I have a cloud server with fixed IPv4 and IPv6
addresses, and one of its jobs is to make these addresses available
as the preferred ingress path from the wild side. See
A Server in the Clouds
for details.
Each host has a WireGuard interface, conventionally called wg0
(though any legal name will do), configured with the host's WireGuard
address, its private (secret) key, and its ListenPort where it receives
bearer packets from peers. For convenience and easy documentation the
interface name and the ListenPort are the same on all hosts; this is
not mandatory. Here is a sample interface stanza. It doesn't include
the interface name. Keep the ending '=' after the PrivateKey. This private key was generated by wg genkey as a sample, not for use on any actual host.
[Interface]
PrivateKey = UMlAztNylJahPmTyjthNAQm7HMVlsE4AelA6Ansk2GI=
ListenPort = 4296
Almost always, the host needs only one WireGuard interface. But see Anubis, Your Guide to the Underworld for an exception.
All hosts need at least one peer stanza; a sample follows. Its data members are Endpoint (the peer's IP address and listen port for bearer packets), PublicKey (the peer's public key, generated from its private key file by wg pubkey < privatekeyfile), and AllowedIPs (the address ranges that will be accepted from, and routed to, this peer).
[Peer]
Endpoint = 192.168.10.254:4296
PublicKey = Qu3nLiZfjn1JQbf257LaTTRh12fADisYeNw+++ue5g0=
AllowedIPs = 0.0.0.0/0, ::/0
Each host has a LAN (or global) address and a different WireGuard address, IPv4 and 6, unless you're hopelessly stuck in the 1900's. For usability I put all four in /etc/hosts and DNS. The WireGuard addresses go on the WireGuard interface wg0, and need a /CIDR bit length giving the size of the range of WireGuard addresses. The LAN addresses go on the interface that connects to the LAN, and need /CIDR bits too. One host (or in my case three hosts) have an interface on the wild side and addresses for it.
The cloud server is weird: it has a wild side and a WireGuard interface, but LAN access is only over WireGuard. Putting its LAN address on one of the other interfaces did not work out, so it needs a dummy interface for its LAN address to be on, which I call dum0. Here is its configuration in /etc/sysconfig/network/ifcfg-dum0:
STARTMODE='auto'
BOOTPROTO='static'
USERCONTROL='no'
PRE_UP_SCRIPT="compat:suse:ip-dummy.J"
DUMMY=yes
LLADDR='52:54:00:09:c8:b9'
MACADDR='52:54:00:09:c8:b9'
IPADDR='2600:3c01:e000:306::8:1/112'
IPADDR_1='192.9.200.185/29'
About the pre-up script: With the compat:suse schema, relative paths have to be relative to /etc/sysconfig/network/scripts/ . My script contains:
ip link add dum0 type dummy
ip link set dum0 multicast on
ip link set dum0 up
If you need more than one dummy device, load the dummy
kernel module with parameter numdummies=(number).
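For example (the module and parameter names are real; two is just an illustrative count):

# load the module right now with two dummy devices
modprobe dummy numdummies=2
# or make the setting persistent across boots
echo 'options dummy numdummies=2' > /etc/modprobe.d/dummy.conf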
The design assumes one central server, which is on the LAN and has a wild-side connection, a herd of leaf nodes on the LAN, and several (two or three) roaming hosts that could be on the LAN or could connect from the wild side.
There are also two cloud servers, which are considered to be
roaming even though they never actually move, because they aren't on the
LAN. For continuity during our relocation, when the LAN and server
will be in packing boxes for weeks or months, the cloud server will be
a slave directory server with DNS, LDAP and Kerberos, for the roaming
leaves to use. It will also handle mail storage and personal
information management (PIM) (see Jimc's Awesome PIM Server
), and
it will have a replicated copy of users' home directories.
Roaming leaf nodes can connect via WireGuard to either the central server (if unpacked) or the cloud server. In the latter case, the roaming client can make connections to the LAN hosts (if unpacked) and the central server, forwarded through the cloud server's WireGuard connection to the central server.
There are several virtual machines; all use bridge networking, so they act like they are directly on the LAN. Except, my laptop, which roams, has a VM for development, and the bridged subnet is entirely inside the laptop. I am also thinking of moving the public webserver out to the cloud server, which would make a network arrangement similar to my laptop.
AllowedIPs and routes are set up so any host can initiate a service connection (http/s, ssh, smtp, dns, etc.) to any other host including off-site. For the connection to provide service, authentication and authorization are required (except http), and service connections from the wild side are blocked by the firewalls unless the client uses a VPN or equivalent (except the public webserver).
For the service connections, the client normally sends from its LAN
address to the server's LAN address, and WireGuard is not involved
unless the client is roaming, i.e. isn't on the LAN. For maintenance
and troubleshooting, any host can connect to almost any peer's
WireGuard address, going via WireGuard and getting normal service. The
default is for the from
address of the packets to be the
client's WireGuard address, but some troubleshooting tools (ping,
traceroute, ip route get) can substitute the client's LAN address; the
connection still goes via WireGuard.
LAN leaves have only a peer relation to the central server, which forwards their WireGuard traffic to the target peer. AllowedIPs of this relation accepts forwarded traffic from any source, because the leaf could legitimately get service from any host on the Internet. Firewall required.
The central server has a peer relation to every other leaf node. Except for special cases, AllowedIPs on the server for the leaf accepts packets from the leaf's WireGuard and LAN addresses and nothing else.
A VM of a roaming host has a peer relation with its own host, which has a complementary relation with the VM. AllowedIPs are analogous to the design for the central server and its LAN leaves.
Special features on the central server:
The central server is the default route of all hosts that aren't
roaming at the time. Our ISP does not have native IPv6, but
the cloud server has a fixed native IPv6 address.
So, as detailed in
A Server in the Clouds
,
all IPv6 traffic that would have gone out the central server's
wild side is instead forwarded to the cloud server. That means,
IPv6 packets are routed to wg0, and the peer stanza for the cloud
server lists ::/0, i.e. all IPv6 addresses. Gotcha: some IPv6
ranges, like the LAN, are routed elsewhere, and the specific
addresses on the cloud server, like its IPv6 LAN address, need to
be routed there in a smaller subnet or a host route. It took a lot
of work to figure that out and to get it straight.
All IPv4 addresses on the cloud server, like its wild-side address, have to be forwarded there also.
The default route of a roaming leaf node is provided by the net it's connected to: some arbitrary wild-side host, or if the roamer is currently on the LAN, the default route will be the central server, but not via WireGuard. The roamer is allowed to override and send its default route via WireGuard, and in fact this is currently the default operation mode.
I had thought of making the default on roamers be to send CouchNet traffic via WireGuard, but to use the carrier's default route otherwise. But that made things complicated for the roamer's VM, so I didn't push that further.
Almost all of CouchNet is accessible from the central server, not the cloud server, so the central server needs throw routes so CouchNet plus the central server's wild side won't be forwarded to the cloud server (a sketch of such routes follows this list).
Traffic to the internal LAN of a roaming VM host has to be routed to WireGuard (wg0). The roamer is the only one that knows what default route it is currently assigned to use (LAN or wild). It initiates a WireGuard connection to the central server, which now knows the roamer's endpoint address and port. So traffic to the roamer or its VM, which the central server sends to WireGuard (wg0), has a known destination and will reach the roamer or the VM.
The central server needs peer stanzas for all other hosts.
Most other hosts have the central server as their peer, but this peer stanza of course has to be suppressed on the central server itself.
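Here is the promised sketch of the central server's extra routes in the wireguard table; the prefixes are placeholders standing in for CouchNet, its wild side, and the cloud server's addresses, not my real ones:

# all IPv6 goes down the tunnel toward the cloud server by default
ip -6 route add ::/0 dev wg0 table wireguard
# CouchNet's LAN prefix and the central server's wild side fall through to normal routing
ip -6 route add throw 2001:db8:1::/64 table wireguard
ip -4 route add throw 192.0.2.0/24 table wireguard
# but the cloud server's own LAN address (inside the thrown prefix) must still go down the tunnel;
# being a longer match, the host route wins over the throw
ip -6 route add 2001:db8:1::8:1/128 dev wg0 table wireguard
# the cloud server's wild-side IPv4 address is also reached through the tunnel
ip -4 route add 198.51.100.10/32 dev wg0 table wireguard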
Special features on roaming leaves, including the cloud server. Some of the items apply only if the leaf is the host of VM(s).
The whole CouchNet address range is sent via WireGuard (wg0) which will forward the traffic to the central server. Similarly for the central server's wild side addresses, and the cloud server's wild side too (except not on the cloud server itself).
A throw route is needed to suppress putting onto WireGuard (wg0) traffic to the leaf's internal subnet that contains the VM. The cloud server currently doesn't have an internal subnet.
The host needs peer stanzas for its VM(s) (if any).
Special features on virtual machines (VMs) hosted by roaming leaves. This got surprisingly complicated in order for such VMs to be able to connect to any other host.
The generic prefix route on wg0 is removed, replaced in the next step. It would have attracted all traffic sourced from any WireGuard address into wg0, which is normally the right thing to do.
These subnets are sent to WireGuard (wg0): central server wild side, cloud server wild side (except not on the cloud server), the VM's WireGuard addresses as host routes (which as subnets were deleted in the previous step), and the VM host's WireGuard addresses as host routes.
Throw routes are needed to keep out of WireGuard (wg0) the VM host's internal subnet, and the whole range of WireGuard addresses. The host routes in the previous step for the VM and its host, being longer, will take precedence over the throw route when the VM and host addresses are involved.
The VM needs a peer stanza for its host, whose AllowedIPs include the host's WireGuard addresses and its internal subnet. (Nothing else.)
As for the directory structure, /etc/wireguard contains these important files:
HOST.priv and HOST.pub: The host's private and public keys. The private key needs mode 600 (read-write only by root). wg-setup.sh reads them and passes the content to WireGuard. The recommended way to generate a key pair is:
cd /etc/wireguard
touch HOST.priv
chmod 600 HOST.priv
wg genkey | tee HOST.priv | wg pubkey > HOST.pub
wg-setup.sh sets up WireGuard, or kills it with the -k option. The resulting default setup, lacking HOST.extra, is the WireGuard interface wg0 with its WireGuard addresses in the WireGuard subnet, and just Jacinth as its peer. AllowedIPs in the peer stanza are the whole Internet (0.0.0.0/0, ::/0), because the leaf expects to be able to interact with any host on the Internet.
The central server's peer stanza for the leaf, AllowedIPs, has its WireGuard and LAN addresses and nothing else, except for special cases.
HOST.extra activates special features for the particular host. It is absent on normal leaf nodes. It has three major sections: adding or removing other peers; adding address ranges that need to be handled by WireGuard; and adding throw routes that need to be kept out of WireGuard.
/etc/systemd/system/wireguard.J.service looks like this (omitting comments):
[Unit]
Description=WireGuard VPN /J/
# network.service is an alias to wicked.service or NetworkManager.service
After=network.service
# PartOf = when network.service stops or REstarts, so will wireguard.J .
PartOf=network.service
Before=network.target
ConditionPathExists=/etc/wireguard/%H.priv

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/etc/wireguard/wg-setup.sh
ExecStop=/etc/wireguard/wg-setup.sh -k
ExecReload=/etc/wireguard/wg-setup.sh -k
ExecReload=/etc/wireguard/wg-setup.sh

[Install]
# When network.service starts, so will wireguard.J .
WantedBy=network.service
Testing wild side connections is easy: on Selen (cellphone) just switch off Wi-Fi and let it use cellular data. On Xena (laptop), turn on Selen's hotspot and have Xena connect to it.
Pretty early, Selen and Xena on the wild side could initiate WireGuard communication to the server Jacinth, and could communicate with all other CouchNet hosts (including Surya) plus wild-side hosts, the same as if they were on Wi-Fi, i.e. were on the LAN and not roaming. When on cellular data the roaming clients could also initiate WireGuard communication to Surya and could communicate with more and more hosts as I fixed typos in addresses and dumb choices of routes. But for clients on Wi-Fi doing WireGuard to Surya, I started out only able to get service from Surya itself. This was a big advance because it will be the major use case during the relocation when Jacinth will be in its packing crate.
The major problem was that the server knows the client's (WireGuard) address but can't figure out a route to it. A packet from a potentially roaming client, e.g. Selen, when it's actually on Wi-Fi but doing WireGuard to Surya, started out going through this baroque path:
Oops. Surya needs to do SNAT (source network address translation), altering the source address of the payload from selenwg to its own suryawg. It wraps up the altered packet and sends the bearer packet to Jacinth.
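With iptables that SNAT looks something like this sketch (placeholder addresses standing in for selenwg and suryawg; this is not the actual rule from /etc/firewallJ.d, and IPv6 needs the same in ip6tables):

# on Surya: rewrite the payload's source from selenwg to suryawg before it enters the tunnel to Jacinth
iptables -t nat -A POSTROUTING -o wg0 -s 192.0.2.31 -j SNAT --to-source 192.0.2.61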
Isn't perfect: client (Selen) on Wi-Fi doing WireGuard to Surya. The description above of the packet path, and of SNAT needed to straighten it out, doesn't quite match reality, and I think several points can be simplified. However, I need to make progress on the other directory server features, so I'm going to pause work on basic networking.
In /etc/firewallJ.d/nat-masq-B4.dirsvr-xena it looks like bit 400 gets set after SNAT for 4296 (bearer packets). Fixed.
Think seriously about whether I should use MASQUERADE or SNAT. The issue is, when the egress interface goes down, SNAT remembers the address and port mapping in case it comes up again, while MASQUERADE forgets the mapping, avoiding collisions but not being able to resume the connection transparently.
Switching between Jacinth and Surya as the roaming host's peer or
server, for easily making sure that it still works. I would like to be
able to just turn off the Jacinth configuration and turn on Surya; no
editing conf files every time. The issue is getting the non-used
server to stop sending traffic on WireGuard to the client, instead
sending traffic to the server that the client is actually using. See
Anubis, Your Guide to the Underworld
.
Need to test if I can borrow port 53/udp (DNS) for WireGuard bearer packets, and still have local DNS working. Confirmed working on Jacinth and Surya.
Suppose host A has WireGuard running. Can peer B initiate a service connection (e.g. HTTP or SSH) to A's non-WireGuard interface, as well as to its WireGuard interface? Yes, it works (now).
How to import a WireGuard conf file to Android (Selen) using a QR code:
Scan from QR code.
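Before scanning, the config has to be displayed as a QR code somewhere; the usual method, assuming the qrencode package is installed and the phone's config is in a file (the path here is hypothetical):

# render the phone's WireGuard config as a QR code on the terminal
qrencode -t ansiutf8 < /etc/wireguard/selen.conf

Then in the Android app, add a tunnel, choose Scan from QR code, and point the camera at the terminal.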
Jacinth's role on OpenVPN and IPSec is as a generic server: potentially a variety of clients could connect at the same time, authenticating with an X.509 certificate with an acceptable trust chain. This isn't going to fly with WireGuard, since the server has to know the client's public key before it can accept a connection from the client.
Brainwave: use tc(8) with action mirred egress mirror dev $IFB, the latter being an Intermediate Functional Block (synthetic interface) which the daemon can listen to. See man tc-mirred for documentation.
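A sketch of that idea (the interface names are hypothetical; clsact and matchall need a reasonably recent kernel and iproute2):

# create the Intermediate Functional Block and bring it up
ip link add ifb0 type ifb
ip link set ifb0 up
# mirror packets arriving on the real interface onto ifb0, where the daemon can watch them
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress matchall action mirred egress mirror dev ifb0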
The StrongS/WAN IPSec suite includes a daemon called Charon, formerly
Pluto. The initiator starts a VPN-type connection by signalling their own
Charon to establish a Security Association with the peer's Charon
(authentication credential required) and to send to the peer connection
parameters like which address ranges should go over the tunnel. In OpenVPN the
connection setup module isn't a separate daemon but it performs similar
functions, including selecting affected traffic (its equivalent of AllowedIPs
is called an iroute
).
WireGuard needs a similar gatekeeper which, following the underworld theme, I'm calling Anubis. Its functions are just about one to one equivalent to Charon's, but WireGuard has advantages in simplicity. Here are the basic design points:
Any plan involving modifications to WireGuard's kernel driver is not going to fly.
I want to get away from the paradigm that when peers are authorized, WireGuard always has them configured in the kernel and therefore always communicates with those peers over WireGuard. On my net two of the pairs are in this mode, but most of the associations should merely be available when needed; permanently activating them despite not being needed is just another opportunity for bugs to kill the net.
Therefore the initiator client communicates with the responder's Anubis, authenticating, and both of them add or delete the peer's public key, endpoint and AllowedIPs on the kernel's list, plus routes, activating or turning off WireGuard.
Just like StrongS/WAN's Charon, WireGuard ends up with two collections of Security Associations: generic VPN communication occurs only when activated, whereas (unlike StrongS/WAN) Anubis has a separate WireGuard interface with a permanently configured connection to every authorized peer. That won't scale to hundreds of thousands of peers, but it's something I can implement promptly for my much smaller net.
Or will it scale? How big is the per-peer data structure? Off the top of my head, assuming IPv6, I can think of: peer's public key (32 bytes), IP address (16), port (2), just one AllowedIP with /CIDR bits (17), probably not our ephemeral private or public key (0), and the Diffie-Hellman shared secret (possibly further hashed) as the symmetric key (32), and 2 chain links (2x8), for a total structure size of 115 bytes, call it 128. For a million peers we're talking about 1.28e8 bytes. The smallest Raspberry Pi you can buy has 2.14e9 bytes RAM. So ignoring the minor detail of network bandwidth, you actually could serve a million VPN service users on your Raspberry Pi.
Anubis has a fixed port for bearer packets, different from the production WireGuard, and the firewall needs to allow packets from any wild-side or internal IP to both these ports. Policy routes at both ends limit communication to only Anubis' service port; thus the availability of perpetual WireGuard service to this one port has no effect on generic WireGuard service, or the lack thereof when not wanted.
With this infrastructure handling authentication, authorization and security, all Anubis needs to do is this:
The initiator sends a packet saying up
or down
, and
some public key. To have keys, and therefore to send this
packet, the initiator must know its own private key, so the
responder is assured that the packet comes from some authorized
peer, and there will be no problem when WireGuard cheaply drops random
exploit attempts.
But the packet's IP address does not uniquely identify which connection should be brought up or down, because NAT, or network infrastructure for containers or virtual machines, could cause the same service IP and even port to be used for multiple WireGuard instances.
The initiator creates a nonce (meaningless random bits) and hashes it; the hash is included in the packet. The initiator encrypts the nonce with a symmetric key which is the initiator's private key times the responder's public key (its private key times the G factor), which the initiator already knows, and the result goes into the packet.
The responder decrypts the nonce with the public key in the packet (hoped to be the initiator's private key times the G factor) times the responder's private key: the Diffie-Hellman shared secret, which is the same at both ends because multiplication is commutative in modular rings, including the field of size 2^255-19. The responder hashes the decrypted nonce and compares with the hash in the packet. If they are equal, the responder knows that the packet came from the specific authorized peer using the public key in the packet. An AEAD type crypto algo does all the checking as an integral part of decryption.
The responder then can identify the production connection and bring it up or down. Mission accomplished.
A nice addition to the protocol would be a response with a success or error message.
An alert reader will have noticed that a Black Hat could steal the private key by dividing the public key by G. In the ring of integers we have an efficient algorithm to do that, long division, but not so in modular rings: the effort to do the division is similar to doing test decryptions using for the symmetric key each ring member from 2 to 2^128, the square root of the size of the modular ring. This effort level is considered sufficient to protect Top Secret data.
WireGuard needs the equivalent of OpenVPN's explicit-exit-notify. When the
kernel module detects that a connection is going down (e.g. ip link del dev
wg0
) it should notify the peer. The rekey timeout seems to be short, about
2 minutes, but the rekey attempt only occurs if the non-dead peer sends a
packet, and it's not clear how much state it's keeping for the dead peer and
how significant that is. It just seems neater to notify the surviving peer if
you're closing the connection.
Cryptographic algorithms can't be relied on to last forever, although Rijndael (AES) has withstood all but minimally effective attacks from its inception in 1997 (anointed in FIPS pub. 197 in 2001) through 2021, and ChaCha20 has been widely deployed from 2008 to 2021. It would be a very smart move to add algo negotiation, with the needed info in the dummy payload in the initial handshake packet.