systemd sucks

How to umount NFS before killing any processes.
or
How to save your state before umounting NFS.
or
The blind and angry leading the blind and angry down a thorny path full of goblins.
April 29, 2017

A narrative, because the reference-style documentation sucks.

So, rudderless Debian installed yet another god-forsaken solipsist piece of over-reaching GNOME-tainted garbage on your system: systemd. And you've got some process like openvpn or a userspace fs daemon or so on that you have been explicitly managing for years. But on shutdown or reboot, you need to run something to clean up before it dies, like umount. If it waits until too late in the shutdown process, your umounts will hang.

This is a very very very common variety of problem. To throw salt in your wounds, systemd is needlessly opaque even about the things that it will share with you.

"This will be easy."

Here's the rough framework for how to make a service unit that runs a script before shutdown. I made a file /etc/systemd/system/greg.service (you might want to avoid naming it something meaningful because there is probably already an opaque and dysfunctional service with the same name already, and that will obfuscate everything):

[Unit]
Description=umount nfs to save the world
After=networking.service

[Service]
ExecStart=/bin/true
ExecStop=/root/bin/umountnfs
TimeoutSec=10
Type=oneshot
RemainAfterExit=yes

The man pages systemd.unit(5) and systemd.service(5) are handy references for this file format. Roughly, After= indicates which service this one is nested inside -- units can be nested, and this one starts after networking.service and therefore stops before it. The ExecStart is executed when it starts, and because of RemainAfterExit=yes it will be considered active even after /bin/true completes. ExecStop is executed when it ends, and because of Type=oneshot, networking.service cannot be terminated until ExecStop has finished (which must happen within TimeoutSec=10 seconds or the ExecStop is killed).

If networking.service actually provides your network facility, congratulations, all you need to do is systemctl start greg.service, and you're done! But you wouldn't be reading this if that were the case. You've decided already that you just need to find the right thing to put in that After= line to make your ExecStop actually get run before your manually-started service is killed. Well, let's take a trip down that rabbit hole.

The most basic status information comes from just running systemctl without arguments (equivalent to list-units). It gives you a useful triple of information for each service:

greg.service                loaded active exited

loaded means it is supposed to be running. active means that, according to systemd's criteria, it is currently running and its ExecStop needs to be executed some time in the future. exited means the ExecStart has already finished.

People will tell you to put LogLevel=debug in /etc/systemd/system.conf. That will give you a few more clues. There are two important steps about unit shutdown that you can see (maybe in syslog or maybe in journalctl):

systemd[4798]: greg.service: Executing: /root/bin/umountnfs
systemd[1]: rsyslog.service: Changed running -> stop-sigterm

That is, it tells you about the ExecStart and ExecStop rules running. And it tells you about the unit going into a mode where it starts killing off the cgroup (I think cgroup used to be called process group). But it doesn't tell you what processes are actually killed, and here's the important part: systemd is solipsist. Systemd believes that when it closes its eyes, the whole universe blinks out of existence.

Once systemd has determined that a process is orphaned -- not associated with any active unit -- it just kills it outright. This is why, if you start a service that forks into the background, you must use Type=forking, because otherwise systemd will consider any forked children of your ExecStart command to be orphans when the top-level ExecStart exits.

So, very early in shutdown, it transitions a ton of processes into the orphaned category and kills them without explanation. And it is nigh unto impossible to tell how a given process becomes orphaned. Is it because a unit associated with the top level process (like getty) transitioned to stop-sigterm, and then after getty died, all of its children became orphans? If that were the case, it seems like you could simply add to your After rule.

After=networking.service getty.target

For example, my openvpn process was started from /etc/rc.local, so systemd considers it part of the unit rc-local.service (defined in /lib/systemd/system/rc-local.service). So After=rc-local.service saves the day!

Not so fast! The openvpn process is started from /etc/rc.local on bootup, but on resume from sleep it winds up being executed from /etc/acpi/actions/lm_lid.sh. And if it failed for some reason, then I start it again manually under su.

So the inclination is to just make a longer After= line:

After=networking.service getty.target acpid.service

Maybe getty@.service? Maybe systemd-user-sessions.service? How about adding all the items from After= to Requires= too? Sadly, no. It seems that anyone who goes down this road meets with failure. But I did find something which might help you if you really want to:

systemctl status 1234

That will tell you what unit systemd thinks that pid 1234 belongs to. For example, an openvpn started under su winds up owned by /run/systemd/transient/session-c1.scope. Does that mean if I put After=session-c1.scope, I would win? I have no idea, but I have even less faith. systemd is meddlesome garbage, and this is not the correct way to pay fealty to it.

I'd love to know what you can put in After= to actually run before vast and random chunks of userland get killed, but I am a mere mortal and systemd has closed its eyes to my existence. I have forsaken that road.

I give up, I will let systemd manage the service, but I'll do it my way!

What you really want is to put your process in an explicit cgroup, and then you can control it easily enough. And luckily that is not inordinately difficult, though systemd still has surprises up its sleeve for you.

So this is what I wound up with, in /etc/systemd/system/greg.service:

[Unit]
Description=openvpn and nfs mounts
After=networking.service

[Service]
ExecStart=/root/bin/openvpn_start
ExecStop=/root/bin/umountnfs
TimeoutSec=10
Type=forking

Here's roughly the narrative of how all that plays out:

So, this EXIT_STATUS hack... If I had made the NFS its own service, it might be strictly nested within the openvpn service, but that isn't actually what I desire -- I want the NFS mounts to stick around until we are shutting down, on the assumption that at all other times, we are on the verge of openvpn restoring the connection. So I use the EXIT_STATUS to determine if umountnfs is being called because of shutdown or just because openvpn died (anyways, the umount won't succeed if openvpn is already dead!). You might want to add an export > /tmp/foo to see what environment variables are defined.

And there is a huge caveat here: if something else in the shutdown process interferes with the network, such as a call to ifdown, then we will need to be After= that as well. And, worse, the documentation doesn't say (and user reports vary wildly) whether it will wait until your ExecStop completes before starting the dependent ExecStop. My experiments suggest Type=oneshot will cause that sort of delay...not so sure about Type=forking.

Fine, sure, whatever. Let's sing Kumbaya with systemd.

I have the idea that Wants= vs. Requires= will let us use two services and do it almost how a real systemd fan would do it. So here's my files:

Then I replace the killall -9 openvpn with systemctl stop greg-openvpn.service, and I replace systemctl start greg.service with systemctl start greg-nfs.service, and that's it.

The Requires=networking.service enforces the strict nesting rule. If you run systemctl stop networking.service, for example, it will stop greg-openvpn.service first.

On the other hand, Wants=greg-openvpn.service is not as strict. On systemctl start greg-nfs.service, it launches greg-openvpn.service, even if greg-nfs.service is already active. But if greg-openvpn.service stops or dies or fails, greg-nfs.service is unaffected, which is exactly what we want. The icing on the cake is that if greg-nfs.service is going down anyways, and greg-openvpn.service is running, then it won't stop greg-openvpn.service (or networking.service) until after /root/bin/umountnfs is done.

Exactly the behavior I wanted. Exactly the behavior I've had for 14 years with a couple readable shell scripts. Great, now I've learned another fly-by-night proprietary system.

GNOME, you're as bad as MacOS X. No, really. In February of 2006 I went through almost identical trouble learning Apple's configd and Kicker for almost exactly the same purpose, and never used that knowledge again -- Kicker had already been officially deprecated before I even learned how to use it. People who will fix what isn't broken never stop.

As an aside - Allan Nathanson at Apple was a way friendlier guy to talk to than Lennart Poettering is. Of course, that's easy for Allan -- he isn't universally reviled.

A side story

If you've had systemd foisted on you, odds are you've got Adwaita theme too.

rm -rf /usr/share/icons/Adwaita/cursors/

You're welcome. Especially if you were using one of the X servers where animated cursors are a DoS.

People who will fix what isn't broken never stop.