Around a year ago I posted about routing my own prefixes to my machines with BGP, and running anycast.
Anycast is a routing setup in which visitors are sent to the (logically) nearest node of the cluster.
I run a two-node cluster, spread over two different datacenters in Amsterdam on the Coloclue network.
In this blog post I'll dive deeper into how I configured it. There are probably more places with tutorials like this, but I'd like to keep a record for myself :).
Host
I run two physical machines, which in turn run multiple virtual machines, powered by KVM/QEMU on Debian 9 “Stretch”.
Installing KVM/QEMU is straightforward.
apt-get install qemu-kvm libvirt-clients libvirt-daemon-system lxc libvirt0 libpam-cgroup libpam-cgfs bridge-utils virt-manager
I also install virt-manager so I can manage the virtual machines with a graphical interface over SSH.
adduser `id -un` libvirt
adduser `id -un` kvm
After this, restart your shell so that the new group memberships take effect.
Network
Next, configure the network settings. I use private RFC 1918 addresses for the VMs, since I don’t want to waste public IPv4 addresses.
auto eth0
iface eth0 inet static
    address 94.142.240.14
    netmask 255.255.255.0
    network 94.142.240.0
    broadcast 94.142.240.255
    gateway 94.142.240.254
    # dns-* options are implemented by the resolvconf package, if installed
    dns-nameservers 10.0.0.1
    dns-search jelleluteijn.nl

iface eth0 inet6 static
    address 2a02:898:0:20::166:1
    netmask 64
    gateway 2a02:898:0:20::1

auto vlanbr
iface vlanbr inet static
    address 10.0.0.1
    netmask 255.255.255.0
    bridge_ports none
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0

iface vlanbr inet6 static
    address 2a02:898:166:e1::1
    netmask 64
    bridge_ports none
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
The VMs are located in the “vlanbr” network.
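A quick sanity check of this addressing plan can be sketched in Python with the standard `ipaddress` module. The helper name `check_plan` is made up for this example; 10.0.0.1/24 is the vlanbr address from the config above and 10.0.0.38 is the private address of one of the VMs in this post.

```python
import ipaddress


def check_plan(bridge_cidr: str, vm_ip: str) -> dict:
    """Report whether the bridge subnet is private and contains the VM."""
    bridge = ipaddress.ip_interface(bridge_cidr)
    vm = ipaddress.ip_address(vm_ip)
    return {
        "bridge_is_private": bridge.network.is_private,
        "vm_in_bridge_subnet": vm in bridge.network,
        "vm_is_not_gateway": vm != bridge.ip,
    }


if __name__ == "__main__":
    # Host bridge address and one VM address, as used in this post.
    print(check_plan("10.0.0.1/24", "10.0.0.38"))
```

All three values should be true for a VM that sits on the bridge and does not collide with the host's own address.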
Now we need to configure the kernel and iptables to forward IP packets.
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding=1" >> /etc/sysctl.conf
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
iptables -A INPUT -d 255.255.255.255/32 -i vlanbr -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -i vlanbr -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -i eth0 -j ACCEPT
iptables -A FORWARD -s 10.0.0.0/24 -i vlanbr -o eth0 -j ACCEPT
iptables -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -d 10.0.0.0/24 -o eth0 -j LOG
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -d 255.255.255.255/32 -o vlanbr -j ACCEPT
iptables -A OUTPUT -d 10.0.0.0/24 -o vlanbr -j ACCEPT
iptables -A OUTPUT -d 255.255.255.255/32 -o eth0 -j ACCEPT
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
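To confirm afterwards that forwarding is really switched on, something like the following could be used. It is only a sketch: the function name `forwarding_enabled` is made up here, and it simply reads the same two /proc files that the echo commands above write to.

```python
from pathlib import Path

IPV4_FLAG = "/proc/sys/net/ipv4/ip_forward"
IPV6_FLAG = "/proc/sys/net/ipv6/conf/all/forwarding"


def forwarding_enabled(flag_path: str) -> bool:
    """Return True if the kernel flag file at flag_path contains '1'."""
    try:
        return Path(flag_path).read_text().strip() == "1"
    except OSError:
        # A missing or unreadable file is treated as "not forwarding".
        return False


if __name__ == "__main__":
    for name, path in (("IPv4", IPV4_FLAG), ("IPv6", IPV6_FLAG)):
        state = "on" if forwarding_enabled(path) else "off"
        print(f"{name} forwarding: {state}")
```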
Routing
For routing the prefixes to my VMs I run the BIRD routing daemon on each physical machine.
BIRD talks to the upstream network (Coloclue) and announces, address by address, the addresses I want on that node.
BIRD also routes each of them to the required private address.
apt-get install bird bird6
bird and bird6 are two separate daemons, for routing IPv4 and IPv6 respectively.
They are configured as follows.
bird.conf:
log syslog { debug, trace, info, remote, warning, error, auth, fatal, bug };
router id 94.142.240.14;

function is_owned_by_me()
prefix set owned_by_me_space;
{
    owned_by_me_space = [ 185.52.224.38/32 ];
    if net ~ owned_by_me_space then return true;
    return false;
}

filter ebgp_export {
    if ( is_owned_by_me() ) then accept;
    reject;
}

template bgp ebgp {
    local as 65166;
    import all;
    export filter ebgp_export;
    source address 94.142.240.14;
    next hop self;
}

protocol bgp eun1 from ebgp {
    # Replace the neighbor address with the IP of your first uplink router
    neighbor 94.142.240.252 as 8283;
    allow local as 65166;
}

protocol bgp eun2 from ebgp {
    # Replace the neighbor address with the IP of your second uplink router
    neighbor 94.142.240.253 as 8283;
    allow local as 65166;
}

protocol static {
    route 185.52.224.38/32 via 10.0.0.38; # routing to the desired private address
}

protocol kernel {
    learn;          # Learn all alien routes from the kernel
    persist;        # Don't remove routes on bird shutdown
    scan time 20;   # Scan kernel routing table every 20 seconds
    import all;     # Default is import all
    export all;     # Default is export none, changed to all
}

# This pseudo-protocol watches all interface up/down events.
protocol device {
    scan time 10;   # Scan interfaces every 10 seconds
}

protocol direct {
    interface "eth0";
}

bird6.conf:
log syslog { debug, trace, info, remote, warning, error, auth, fatal, bug };
router id 94.142.240.14;

function is_owned_by_me()
prefix set owned_by_me_space;
{
    owned_by_me_space = [ 2a02:898:166:e1::/64{64,128}, 2a02:898:166::/64{64,128} ];
    if net ~ owned_by_me_space then return true;
    return false;
}

filter ebgp_export {
    if ( is_owned_by_me() ) then accept;
    reject;
}

template bgp ebgp {
    local as 65166;
    import all;
    export filter ebgp_export;
    source address 2a02:898:0:20::166:1;
    next hop self;
}

protocol bgp eun1 from ebgp {
    # Replace the neighbor address with the IP of your first uplink router
    neighbor 2a02:898:0:20::e1 as 8283;
    allow local as 65166;
}

protocol bgp eun2 from ebgp {
    # Replace the neighbor address with the IP of your second uplink router
    neighbor 2a02:898:0:20::e2 as 8283;
    allow local as 65166;
}

protocol static {
    # route 2a02:898:166::/48 unreachable;
    route 2a02:898:166::/64 via "vlanbr";
    route 2a02:898:166:e1::/64 via "vlanbr";
}

protocol kernel {
    learn;          # Learn all alien routes from the kernel
    persist;        # Don't remove routes on bird shutdown
    scan time 20;   # Scan kernel routing table every 20 seconds
    import all;     # Default is import all
    export all;     # Default is export none, changed to all
}

# This pseudo-protocol watches all interface up/down events.
protocol device {
    scan time 10;   # Scan interfaces every 10 seconds
}

protocol direct {
    interface "eth0";
}
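To get a feel for what the export filter lets through, the matching can be approximated in Python. This is a sketch, not BIRD itself: it assumes that a pattern like 2a02:898:166:e1::/64{64,128} means "any prefix inside that /64 with a length between 64 and 128", and that a plain entry such as 185.52.224.38/32 only matches exactly. The `OWNED` table is hand-copied from the two configs above, and the Python function merely borrows the name of the BIRD one.

```python
import ipaddress

# Patterns copied from bird.conf and bird6.conf: (prefix, min_len, max_len).
OWNED = [
    ("185.52.224.38/32", 32, 32),
    ("2a02:898:166:e1::/64", 64, 128),
    ("2a02:898:166::/64", 64, 128),
]


def is_owned_by_me(route: str) -> bool:
    """Approximate `net ~ owned_by_me_space` for the patterns above."""
    net = ipaddress.ip_network(route)
    for pattern, low, high in OWNED:
        pat = ipaddress.ip_network(pattern)
        if net.version != pat.version:
            continue  # subnet_of() refuses to compare IPv4 with IPv6
        if net.subnet_of(pat) and low <= net.prefixlen <= high:
            return True
    return False


if __name__ == "__main__":
    for route in ("185.52.224.38/32", "10.0.0.0/24",
                  "2a02:898:166:e1::/64", "2a02:898:166::/48"):
        verdict = "accept" if is_owned_by_me(route) else "reject"
        print(f"{route:24} {verdict}")
```

With these patterns, the /32 and the two /64s are accepted, while the private 10.0.0.0/24 and the covering /48 are rejected.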
Anycast itself is fairly easy. If you announce the same prefix from both locations, you are already anycasted.
There is a catch, however: most network operators filter IPv4 prefixes more specific than a /24, so most of the time you need to announce the whole range.
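The /24 rule is easy to check for a given prefix. The sketch below uses hypothetical helper names (`likely_accepted`, `covering_block`); the /24 limit is the one described above, while the /48 limit for IPv6 is an assumption based on common practice and not something this post covers.

```python
import ipaddress

# Longest prefix length that is commonly accepted between networks.
# 24 for IPv4 is the limit mentioned above; 48 for IPv6 is an assumed
# common convention.
MAX_LEN = {4: 24, 6: 48}


def likely_accepted(prefix: str) -> bool:
    """True if the prefix is not more specific than the usual limit."""
    net = ipaddress.ip_network(prefix)
    return net.prefixlen <= MAX_LEN[net.version]


def covering_block(prefix: str):
    """Return the prefix itself, or the covering block at the usual limit."""
    net = ipaddress.ip_network(prefix)
    limit = MAX_LEN[net.version]
    if net.prefixlen <= limit:
        return net
    return net.supernet(new_prefix=limit)


if __name__ == "__main__":
    for prefix in ("185.52.224.38/32", "2a02:898:166:e1::/64"):
        print(prefix, likely_accepted(prefix), covering_block(prefix))
```

A single /32 fails the check, and `covering_block` shows the smallest block that would have to be announced instead.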
VMs
On the VM side it’s easy.
You configure the machine with its private address and bind the external IPv4 address to its loopback interface; the extra IPv6 address goes on an alias of eth0.
auto lo
iface lo inet loopback

auto lo:0
iface lo:0 inet static
    address 185.52.224.38
    netmask 255.255.255.255

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 10.0.0.38
    netmask 255.255.255.0
    network 10.0.0.0
    broadcast 10.0.0.255
    gateway 10.0.0.1
    # dns-* options are implemented by the resolvconf package, if installed
    dns-nameservers 10.0.0.1
    dns-search jelleluteijn.nl

iface eth0 inet6 static
    address 2a02:898:166:e1:185:52:224:38
    netmask 64
    gateway 2a02:898:166:e1::1

auto eth0:0
iface eth0:0 inet6 static
    address 2a02:898:166::185:52:224:38
    netmask 64
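The IPv6 addresses above look like they reuse the four octets of the public IPv4 address as their last four groups. Assuming that is the convention, a hypothetical helper (`embed_ipv4` is a name made up for this sketch) can build such an address from a /64 and an IPv4 address:

```python
import ipaddress


def embed_ipv4(prefix64: str, ipv4: str) -> ipaddress.IPv6Address:
    """Put the IPv4 octets, written as-is, into the last four IPv6 groups."""
    net = ipaddress.ip_network(prefix64)
    if net.version != 6 or net.prefixlen != 64:
        raise ValueError("expected an IPv6 /64")
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    interface_id = 0
    for octet in octets:
        # Reading the decimal digits as hex keeps them visible: "185" -> 0x185
        interface_id = (interface_id << 16) | int(octet, 16)
    return net.network_address + interface_id


if __name__ == "__main__":
    print(embed_ipv4("2a02:898:166:e1::/64", "185.52.224.38"))
    # prints 2a02:898:166:e1:185:52:224:38
```

The result is purely cosmetic: it makes it easy to see which IPv4 address an IPv6 address belongs to, nothing more.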
And that’s how I run my system.
Yours truly