
Losing one of my evenings after an OpenBSD upgrade

May 07, 2025 — Nico Cartron

Context

I recently upgraded my OpenBSD.Amsterdam VM to OpenBSD 7.7. As usual, the upgrade went smoothly... or so I thought!

I received an alert from AlertManager a few hours later, telling me that Knot, the Authoritative DNS server running on that server, was unreachable.

I checked on the server, and indeed it was not running, and trying to restart it didn't work:

# rcctl start knot
knot(failed)

Digging

When trying to start knot manually, or using knotc, I would get worrying error messages:

ld.so: knotc: can't load library 'libdbus-1.so.11.4'

It's not just Knot!

Even though Knot (UDP/TCP 53) was the only process not running, I quickly realised that other processes had issues, e.g. Postfix, as I got similar error messages:

Error: ld.so: lmtp: can't load library 'libiconv.so.7.1'

Fixing the (first) problem

What was weird is that I could see those libraries just fine in /usr/local/lib - so it looked like the OpenBSD upgrade did not go as smoothly as I thought it did.
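One thing worth ruling out at this point is a version mismatch: ld.so asks for a specific version of a library, and a file with the same base name but a different version number will not necessarily satisfy the request. The sketch below is not what I ran that evening. It is a small helper (the name `lib_candidates` is made up for illustration) that takes the error line and lists which versions of that library actually sit in a directory, assuming the message format shown above.

```shell
# Hypothetical helper: given an ld.so error line, list the versions of the
# requested library that are present in a directory (default: /usr/local/lib).
lib_candidates() {
    msg=$1
    dir=${2:-/usr/local/lib}
    # Extract the quoted file name from: ... can't load library 'libfoo.so.1.2'
    wanted=$(printf '%s\n' "$msg" |
        sed -n "s/.*can't load library '\([^']*\)'.*/\1/p")
    [ -n "$wanted" ] || return 1
    # Drop the version suffix to get the base name, e.g. libfoo.so
    base=${wanted%.so.*}.so
    echo "wanted: $wanted"
    for f in "$dir/$base".*; do
        # An unmatched glob stays literal, so only print files that exist.
        [ -e "$f" ] && echo "found:  ${f##*/}"
    done
    return 0
}
```

Calling it as `lib_candidates "ld.so: knotc: can't load library 'libdbus-1.so.11.4'"` prints the requested name followed by whatever versions are installed, which makes a mismatch easy to spot.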

The recommended way to fix this was to uninstall all packages and reinstall them from scratch, which I did with:

# pkg_info -m > pkgs.txt
# pkg_delete -X
# export PKG_PATH=https://ftp.fr.openbsd.org/pub/OpenBSD/$(uname -r)/packages/$(uname -m)/
# pkg_add -z -l pkgs.txt

After this (and a reboot, to be safe) - my problem was fixed and I could finally start Knot, and the Postfix errors were gone.

Another problem, and Digging further

Well, that was before I realised I was no longer receiving emails, and this was confirmed by checking /var/log/maillog:

cannot create tmp lockfile /var/db/spamassassin/.spamassassin/bayes.lock.mx.ncartron.org.82539

Which meant that while my email server was accepting emails from the outside, they would just sit in Postfix's queue and not be delivered to their respective mailboxes.
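A quick way to confirm that mail is being accepted but not delivered is to count delivery outcomes in the log. The sketch below assumes the usual Postfix log lines carrying a `status=...` field (sent, deferred, bounced) and the /var/log/maillog path mentioned above. The function name `delivery_summary` is made up for illustration.

```shell
# Hypothetical helper: summarise Postfix delivery outcomes found in a log file.
# Prints one "status=count" line per status, sorted by status name.
delivery_summary() {
    log=${1:-/var/log/maillog}
    # Keep only the value of the status= field, then count each distinct value.
    sed -n 's/.*status=\([a-z]*\).*/\1/p' "$log" |
        sort | uniq -c |
        awk '{ print $2 "=" $1 }'
}
```

A growing `deferred` count with no matching `sent` lines points at the delivery side rather than at reception. Postfix's own `postqueue -p` lists what is currently sitting in the queue.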

And this is where I went down a rabbit hole: because the issue I saw was related to SpamAssassin / spamc / spamd, I spent a lot of time trying to fix it, checking permissions, the SpamAssassin configuration, Bayes stuff. None of that fixed the issue.

In the end, I took a step back and looked at each component individually, trying to understand what had changed and what could have caused this (major) problem.

Finding the solution

It was now close to midnight, and I realised that, while trying to fix my first issue (remember the one with the "missing" libraries above?), I had run sysmerge(8).

Now, sysmerge is a rather innocuous command: it's used after you upgrade an OpenBSD system, to (according to its manpage): "help the administrator update configuration files after upgrading to a new release or snapshot".

And indeed, when I ran it, it asked me to take actions on 2 files:

  • /etc/ssh/sshd_config
    • because I'm running OpenSSH on a non-standard port, and have tweaked it a bit - I kept my previous version
  • /etc/mailer.conf
    • it asked me if I wanted to replace my existing setup, which uses /usr/local/sbin/sendmail for all things related to email (sendmail, mailq, newaliases), with OpenSMTPD's /usr/sbin/smtpctl

For the latter, I thought I had kept my file, but it turns out I hadn't, and so OpenBSD was happily using the configuration below:

sendmail        /usr/sbin/smtpctl
send-mail       /usr/sbin/smtpctl
mailq           /usr/sbin/smtpctl
makemap         /usr/sbin/smtpctl
newaliases      /usr/sbin/smtpctl

instead of the original one:

sendmail        /usr/local/sbin/sendmail
send-mail       /usr/local/sbin/sendmail
mailq           /usr/local/sbin/sendmail
newaliases      /usr/local/sbin/sendmail

I replaced the file, restarted Postfix and SpamAssassin, and boom, my problem was fixed!!!
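To catch this kind of silent swap after a future sysmerge run, a small check can flag any mailer.conf entry that does not point at the expected binary. This is only a sketch: the function name `check_mailer_conf` is made up, and it assumes the simple two-column "name program" layout shown above.

```shell
# Hypothetical helper: print every mailer.conf entry whose program is not the
# expected one. Prints nothing when all entries match.
check_mailer_conf() {
    expected=$1
    conf=${2:-/etc/mailer.conf}
    # Skip comments and blank lines; field 1 is the name, field 2 the program.
    awk -v want="$expected" '
        /^[[:space:]]*(#|$)/ { next }
        $2 != want { print $1 " -> " $2 }
    ' "$conf"
}
```

With my setup, `check_mailer_conf /usr/local/sbin/sendmail` should print nothing. Any output means an entry is pointing somewhere else, such as /usr/sbin/smtpctl.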

Wrap Up

When I told my wife and youngest son about this (of course I did not get into the weeds of it), my son told me "oh Dad it was not your fault" - which was cute, but once again it's a reminder to myself that troubleshooting always requires getting "back to the basics" and thinking about what has changed.

If something was working before and no longer does, then you need to break down the components that changed, starting from the root, rather than just looking at what the log files tell you: log files will tell you about a symptom, but the root cause can be far away from that.
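One cheap way to answer "what has changed?" is to drop a marker file before touching anything and later list every file modified after it. The sketch below relies only on the standard `find -newer` test. The function name `changed_since` and the marker path in the usage note are made up for illustration.

```shell
# Hypothetical helper: list regular files under the given directories that
# were modified more recently than a reference (marker) file.
changed_since() {
    ref=$1
    shift
    find "$@" -type f -newer "$ref" 2>/dev/null | sort
}
```

For example, `touch /root/pre-upgrade` before the upgrade, then `changed_since /root/pre-upgrade /etc` afterwards. In my case, /etc/mailer.conf showing up in that list would have saved me an evening.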

I learnt it (again) painfully yesterday!

One More Thing

Oh, and also: I am soooo happy that I have an (arguably simple) AlertManager setup and that I took the time to do it. It catches things quickly and warns me when something is not right.


Tags: BSD


I don't have any commenting system, but email me (nicolas at ncartron dot org) your comments!