linux

Using Homebridge with cmdswitch2 to control your Linux machine over HomeKit

A few of my tech projects experience occasional hiccups and need to be soft-reset from my Linux host (e.g. a Wi-Fi SSID routed through a VPN, a Windows gaming VM with hardware passthrough). This was annoying, as it meant having a machine nearby to SSH in and execute a few simple commands -- often just a systemctl restart foo. Fortunately, homebridge-cmdswitch2 can expose arbitrary commands as lights, so I can bounce the services right from my phone.

First, since Homebridge should be running as its own system user, we need to give it permission to restart services (as root). We don't want to grant access to all of /bin/systemctl, so a wrapper script placed at /usr/local/bin/serviceswitch will encapsulate the desired behavior. Grant the homebridge user permission to run it with sudo:

cat << EOF > /etc/sudoers.d/homebridge-cmdswitch
homebridge ALL = (root) NOPASSWD: /usr/local/bin/serviceswitch
EOF
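
To confirm the rule was picked up, you can ask sudo to list what the homebridge user is allowed to run (a quick sanity check; output format varies by sudo version):

# List sudo privileges granted to the homebridge user
sudo -l -U homebridge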

Next, let's create that /usr/local/bin/serviceswitch script with service status, start, and stop commands. Using such a wrapper also has the benefit that complex checks consisting of several commands can be performed. Keep in mind these are now being run as root from Homebridge!

#!/bin/sh

if [ "$(id -u)" -ne 0 ];then
  echo "You must run this script as root."
  exit 1
fi

usage() {
  error="$1"
  if [ ! -z "$error" ];then
    echo "Error: $error"
  fi
  echo "Usage: $0 [action] [service]"
}

action="$1"
service="$2"
if [ -z "$action" ] || [ -z "$service" ];then
  usage
  exit 1
fi

case $action in
  start|stop|status) ;;
  *) usage "invalid action, must be one of [start, stop, status]"; exit 1;;
esac

case $service in
  vm-guests)
    [ "$action" == "start" ] && (systemctl start libvirt-guests)
    [ "$action" == "stop" ] && (systemctl stop libvirt-guests)
    [ "$action" == "status" ] && { systemctl -q is-active libvirt-guests; exit $?; }
    ;;
  fileserver)
    [ "$action" == "start" ] && (systemctl start smb;systemctl start nmb;systemctl start netatalk)
    [ "$action" == "stop" ] && (systemctl stop smb;systemctl stop nmb;systemctl stop netatalk)
    [ "$action" == "status" ] && { (systemctl -q is-active smb && systemctl -q is-active nmb && systemctl -q is-active netatalk); exit $?; }
    ;;
  web)
    [ "$action" == "start" ] && (systemctl start httpd)
    [ "$action" == "stop" ] && (systemctl stop httpd)
    [ "$action" == "status" ] && { systemctl -q is-active httpd; exit $?; }
    ;;
  *) usage "invalid service"; exit 1;;
esac
exit 0
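
Once the script is saved, make it executable and give it a quick test as the homebridge user (a sketch; the service names assume the case statement above):

chmod 755 /usr/local/bin/serviceswitch

# Should print the usage message when run without arguments
sudo /usr/local/bin/serviceswitch

# Verify the sudoers entry works from the homebridge account
sudo -u homebridge sudo /usr/local/bin/serviceswitch status web; echo $?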

Finally, here is the relevant platform section from the homebridge config:

{
  "platforms": [{
    "platform": "cmdSwitch2",
    "name": "Command Switch",
    "switches": [{
       "name" : "vm-guests",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start vm-guests",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop vm-guests",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status vm-guests",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    },
    {
       "name" : "fileserver",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start fileserver",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop fileserver",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status fileserver",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    },
    {
       "name" : "web",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start web",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop web",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status web",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    }]
  }]
}
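
After editing the config, restart Homebridge so it picks up the new switches (assuming it runs under a systemd unit named homebridge; the unit name may differ on your install):

systemctl restart homebridge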

Using Monit to restart a running service, without automatically starting it

I recently ran into an issue where a bug in one of my Docker containers would intermittently chew through CPU until restarted. I wanted Monit to automatically restart the service when it was eating CPU (which ordinarily is trivial to do), but due to the mapped volume, I only wanted it to stop & start the service if it was already running. Otherwise, Monit would proceed to start the container on boot prior to the mounted drive being present, resulting in a bunch of headaches.

"if already running" turned out to be a little more complicated than I expected. Monit doesn't have a good answer for this built-in, so the key is to override the start action by executing a no-op when the service isn't running:

check process home-assistant MATCHING ^python.+homeassistant
   start program = "/usr/bin/docker start home-assistant"
   stop  program = "/usr/bin/docker stop home-assistant"
   if cpu usage > 13% for 3 cycles then restart
   if does not exist then exec /bin/true

Monit considers 100% CPU usage to be full utilization across all cores, which is why you see 13% (you can also verify a service's current CPU usage by checking the output of monit status). In my case, 13% of total capacity is about 65% CPU on a single core, which (sustained over 3 cycles, roughly 3 minutes) I deemed enough to recognize when the bug had occurred.
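
For reference, here is a quick way to see what Monit currently reports for the container (recent Monit releases accept a service name argument; on older ones, plain monit status lists every service):

# Show Monit's view of the service, including the CPU usage percentage it measures
monit status home-assistant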

Here you see I'm also using the MATCHING syntax because the entrypoint for Docker containers may change in the future (I don't maintain this one myself).

The only downside to this method is that Monit will log about the service not running repeatedly until it is started. In my case, because I start docker services on boot it wasn't an issue.

Policy-based routing on Linux to forward packets from a subnet or process through a VPN

In my last post, I covered how to route packets from a specific VLAN through a VPN on the USG. Here, I will show how to use policy-based routing on Linux to route packets from specific processes or subnets through a VPN connection on a Linux host in your LAN instead. You could then point to this host as the next hop for a VLAN on your USG to achieve the same effect as in my last post.

Note that this post assumes modern tooling, including firewalld and NetworkManager, and that subnet 192.168.10.0/24 is your LAN. This post will send packets coming from 192.168.20.0/24 through the VPN, but you can customize that as you see fit (e.g. send only specific hosts from your normal LAN subnet instead).

VPN network interface setup

First, let's create a VPN firewalld zone so we can easily apply firewall rules just to the VPN connection:

firewall-cmd --permanent --new-zone=VPN
firewall-cmd --reload

Next, create the VPN interface with NetworkManager:

VPN_USER=openvpn_username
VPN_PASSWORD=openvpn_password

# Setup VPN connection with NetworkManager
dnf install -y NetworkManager-openvpn
nmcli c add type vpn ifname vpn con-name vpn vpn-type openvpn
nmcli c mod vpn connection.zone "VPN"
nmcli c mod vpn connection.autoconnect "yes"
nmcli c mod vpn ipv4.method "auto"
nmcli c mod vpn ipv6.method "auto"

# Ensure it is never set as default route, nor listen to its DNS settings
# (doing so would push the VPN DNS for all lookups)
nmcli c mod vpn ipv4.never-default "yes"
nmcli c mod vpn ipv4.ignore-auto-dns on
nmcli c mod vpn ipv6.never-default "yes"
nmcli c mod vpn ipv6.ignore-auto-dns on

# Set configuration options
nmcli c mod vpn vpn.data "comp-lzo = adaptive, ca = /etc/openvpn/keys/vpn-ca.crt, password-flags = 0, connection-type = password, remote = remote.vpnhost.tld, username = $VPN_USER, reneg-seconds = 0"

# Configure VPN secrets for passwordless start
cat << EOF >> /etc/NetworkManager/system-connections/vpn

[vpn-secrets]
password=$VPN_PASSWORD
EOF
systemctl restart NetworkManager
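
At this point the connection profile should exist but not yet be active; you can review what was written with the following (secrets are not displayed unless you pass --show-secrets):

# Show the resulting connection profile
nmcli connection show vpn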

Configure routing table and policy-based routing

Normally, a host has a single routing table and therefore only one default gateway. Static routes can be configured for specific destinations, but that still routes based on a packet's destination address; here we want to route based on the source address of a packet. For this, we need multiple routing tables (one for normal traffic, another for VPN traffic) and policy-based routing (PBR) rules to select the right one.

First, let's create a second routing table for VPN connections:

cat << EOF >> /etc/iproute2/rt_tables
100 vpn
EOF

Next, set up an IP rule to select between routing tables for incoming packets based on their source address:

# Replace this with your LAN interface
IFACE=eno1

# Route incoming packets on VPN subnet towards VPN interface
cat << EOF >> /etc/sysconfig/network-scripts/rule-$IFACE
from 192.168.20.0/24 table vpn
EOF

Now that we can properly select which routing table to use, we need to configure routes on the vpn routing table:

cat << EOF > /etc/sysconfig/network-scripts/route-$IFACE
# Always allow LAN connectivity
192.168.10.0/24 dev $IFACE scope link metric 98 table vpn
192.168.20.0/24 dev $IFACE scope link metric 99 table vpn

# Blackhole by default to avoid privacy leaks if VPN disconnects
blackhole 0.0.0.0/0 metric 100 table vpn
EOF
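
Once the interface has been brought up again (or after a reboot), you can verify that both pieces are in place with a quick check:

# The source-based rule should appear here
ip rule show

# ...and the vpn table should contain the LAN routes plus the blackhole
ip route show table vpn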

You'll note that nowhere do we actually define the default gateway - because we can't yet. VPN connections often allocate IPs dynamically, so we'll need to configure the default route for the vpn table to match that particular IP each time we start the VPN connection (we'll do so with a metric lower than the blackhole route's 100, so the default route takes precedence over the blackhole).

So, we will configure NetworkManager to trigger a script upon bringing up the VPN interface:

cat << EOF > /etc/NetworkManager/dispatcher.d/90-vpn
#!/bin/bash
VPN_UUID="\$(nmcli con show vpn | grep uuid | tr -s ' ' | cut -d' ' -f2)"
INTERFACE="\$1"
ACTION="\$2"

if [ "\$CONNECTION_UUID" == "\$VPN_UUID" ];then
  /usr/local/bin/configure_vpn_routes "\$INTERFACE" "\$ACTION"
fi
EOF
chmod 755 /etc/NetworkManager/dispatcher.d/90-vpn

In that script, we will read the IP address of the VPN interface and install it as the default route. When the VPN is deactivated, we'll do the opposite and cleanup the route we added:

cat << EOF > /usr/local/bin/configure_vpn_routes
#!/bin/bash
# Configures a secondary routing table for use with VPN interface

interface=\$1
action=\$2

tables=/etc/iproute2/rt_tables
vpn_table=vpn
zone="\$(nmcli -t --fields connection.zone c show vpn | cut -d':' -f2)"

clear_vpn_routes() {
  table=\$1
  /sbin/ip route show via 192.168/16 table \$table | while read route;do
    /sbin/ip route delete \$route table \$table
  done
}

clear_vpn_rules() {
  keep=\$(ip rule show from 192.168/16)
  /sbin/ip rule show from 192.168/16 | while read line;do
    rule="\$(echo \$line | cut -d':' -f2-)"
    (echo "\$keep" | grep -q "\$rule") && continue
    /sbin/ip rule delete \$rule
  done
}

if [ "\$action" = "vpn-up" ];then
  ip="\$(/sbin/ip route get 8.8.8.8 oif \$interface | head -n 1 | cut -d' ' -f5)"

  # Modify default route
  clear_vpn_routes \$vpn_table
  /sbin/ip route add default via \$ip dev \$interface table \$vpn_table

elif [ "\$action" = "vpn-down" ];then
  # Remove VPN routes
  clear_vpn_routes \$vpn_table
fi
EOF
chmod 755 /usr/local/bin/configure_vpn_routes

Bring up the VPN interface:

nmcli c up vpn
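
With the dispatcher script in place, bringing the connection up should install the default route in the vpn table. A quick way to confirm the routing decision for a host on the VPN subnet (a sketch, assuming a client at 192.168.20.10 behind eno1):

# The default route via the VPN should now be present
ip route show table vpn

# Simulate a forwarded packet from the VPN subnet; it should egress via the VPN interface
ip route get 8.8.8.8 from 192.168.20.10 iif eno1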

That's all, enjoy!

Sending all packets from a user through the VPN

I find this technique particularly versatile as one can also easily force all traffic from a particular user through the VPN tunnel:

# Replace this with your LAN interface
IFACE=eno1

# Username (or UID) of the user whose traffic to send over the VPN
USERNAME=foo

# Send any marked packets using VPN routing table
cat << EOF >> /etc/sysconfig/network-scripts/rule-$IFACE
fwmark 0x50 table vpn
EOF

# Mark all packets originating from processes owned by this user
firewall-cmd --permanent --direct --add-rule ipv4 mangle OUTPUT 0 -m owner --uid-owner $USERNAME -j MARK --set-mark 0x50
# Enable masquerade on the VPN zone (enables IP forwarding between interfaces)
firewall-cmd --permanent --add-masquerade --zone=VPN

firewall-cmd --reload

Note that 0x50 is arbitrary; as long as the ip rule and the firewall rule use the same mark, you're fine.
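
To confirm it works, compare the public IP seen by that user against your own. A rough check, assuming curl is installed, the VPN is up, and the user is named foo as above (icanhazip.com is just an example endpoint):

# Public IP for normal traffic
curl -4 https://icanhazip.com

# Public IP for the VPN-routed user - should differ from the one above
sudo -u foo curl -4 https://icanhazip.com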

Migrating a live server to another host with no downtime

I have had a 1U server co-located for some time now at iWeb Technologies' datacenter in Montreal. So far I've had no issues and it did a wonderful job hosting websites & a few other VMs, but because of my concern for its aging hardware I wanted to migrate away before disaster struck.

Modern VPS offerings are a steal in terms of the performance they offer for the price, and Linode's 4096 plan caught my eye at a nice sweet spot. Backed by powerful CPUs and SSD storage, their VPS is blazingly fast; the only downside is I would lose some RAM and HDD-backed storage compared to my 1U server. The bandwidth provided with the Linode was also a nice bump up from my previous 10Mbps, 500GB/mo traffic limit.

When CentOS 7 was released I took the opportunity to immediately start modernizing my CentOS 5 setup and testing the new configuration. I wanted to ensure full continuity for client-facing services - other than a nice speed boost, I wanted clients to take no manual action on their end to reconfigure their devices or domains.

I also wanted to ensure zero downtime. As the DNS A records are migrated, I didn't want emails coming in to the wrong server (or clients checking a stale inbox until they started seeing the new mailserver IP). I can easily configure Postfix to relay all incoming mail on the CentOS 5 server to the IP of the CentOS 7 one to avoid any loss of emails, but there's still the issue that some end users might connect to the old server and be served their old IMAP inbox for some time.

So first things first: after developing a prototype VM that offered the same service set, I went about buying a small Linode for a month to test the configuration with some of my existing user data from my CentOS 5 server. MySQL was sufficiently easy to migrate over and Dovecot was able to preserve all UIDs, so my inbox continued to sync seamlessly. Apache complained a bit when importing my virtual host configurations due to the new 2.4 syntax, but nothing a few sed commands couldn't fix. With full continuity out of the way, I had to develop a strategy to handle zero downtime.

With some foresight and DNS TTL adjustments, we can get near-zero downtime, assuming all resolvers comply with your TTL. Simply set your TTL to 300 (5 minutes) a day or so before the migration occurs; as the old TTL expires, resolvers will pick up the new TTL and will not cache the IP for as long. Even with a short TTL, that's still up to 5 minutes of downtime, and clients often do bad things... the IP might still be cached (e.g. at the ISP, router, OS, or browser) for longer. Ultimately, I'm the one who ends up looking bad in that scenario, even though I have done what I can on the server side and have no ability to fix the broken clients.

To work around this, I discovered an incredibly handy tool called socat that can make magic happen. socat routes data between sockets, network connections, files, pipes, you name it. Installing it is as easy as: yum install socat
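
As a trivial illustration of the idea (hypothetical addresses, not part of the migration script below), this relays any connection made to local port 8080 over to port 80 on another host, forking a child per client:

# Forward local port 8080 to 192.0.2.10:80
socat TCP-LISTEN:8080,fork,reuseaddr TCP:192.0.2.10:80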

A quick script later and we can forward all connections from the old host to the new host:

#!/bin/sh
NEWIP=0.0.0.0

# Stop services on this host
for SERVICE in dovecot postfix httpd mysqld;do
  /sbin/service $SERVICE stop
done

# Some cleanup
rm /var/lib/mysql/mysql.sock

# Map the new server's MySQL to localhost:3307
# Assumes capability for password-less (e.g. pubkey) login
ssh -N $NEWIP -L 3307:localhost:3306 &
socat unix-listen:/var/lib/mysql/mysql.sock,fork,reuseaddr,unlink-early,unlink-close,user=mysql,group=mysql,mode=777 TCP:localhost:3307 &

# Map ports from each service to the new host
for PORT in 110 995 143 993 25 465 587 80 3306;do
  echo "Starting socat on port $PORT..."
  socat TCP-LISTEN:$PORT,fork TCP:${NEWIP}:${PORT} &
  sleep 1
done

And just like that, every connection made to the old server is immediately forwarded to the new one. This includes the MySQL socket (which is automatically used instead of a TCP connection when a host of 'localhost' is passed to MySQL).
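
A quick way to sanity-check that the relays are in place and that the socket forwarding works (a sketch; substitute a real MySQL user for someuser):

# Each forwarded port should show a socat listener
netstat -tlnp | grep socat

# Connecting via the UNIX socket on the old host should now reach the new server's MySQL;
# @@hostname should report the new machine
mysql -u someuser -p -e 'SELECT @@hostname;'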

Note how we establish an SSH tunnel mapping a connection to localhost:3306 on the new server to port 3307 on the old one, instead of simply forwarding the connection and socket to the new server - this is done so that users who are permitted to connect from 'localhost' only can still connect (forwarding the connection directly would deny access, since it would appear to come from an unauthorized remote host).

Update: a friend has pointed out this video to me - if you thought zero downtime was impressive enough... these guys move a live server 7km through public transport without losing power or network!

Building a home media server with ZFS and a gaming virtual machine

Work has kept me busy lately, so it's been a while since my last post... I have been doing lots of research and collecting lots of information over the holiday break, and I'm happy to say that in the coming days I will be posting a new server setup guide - this time for a server capable of running redundant storage (ZFS RAIDZ2), sharing home media (Plex Media Server, SMB, AFP), and running a full Windows 7 gaming rig simultaneously!

Windows runs in a virtual machine and is assigned its own real graphics card from the host's hardware using the brand-new VFIO PCI passthrough technique with the VGA quirks enabled. This does require a motherboard and CPU with support for IOMMU, more commonly known as VT-d (Intel) or AMD-Vi.