
Testing PowerDNS 4.5.0-beta1's "Zone cache"

June 17, 2021 — Nico Cartron

A quick look at this new feature, aimed at providing better performance under NXDOMAIN attacks.


"Disclaimer"

It will come as no surprise that, as someone who worked for 3 years for PowerDNS/Open-Xchange, I am eating my own dog food and using PowerDNS software for pretty much everything I do with DNS:

  • dnsdist: used at home to distribute load and play a bit
  • Recursor: used at home
  • Authoritative:
    • used at home (with a sqlite backend)
    • and publicly for my domains (MariaDB backend)
    • although I'm using CZ.NIC's Knot DNS as secondary, for diversity reasons :)

Why test a beta?

Even though it's been almost 2 years since I stopped working with the likes of Bert, Peter, Pieter and Rémi, I still follow the developments closely, so when I saw Peter's announcement about the first beta in the 4.5.0 Auth branch, I knew I had to test it!

This paragraph in particular caught my attention:

Version 4.5.0 mostly brings small improvements and fixes, but there is one notable new feature: the zone cache.

The zone cache allows PowerDNS to keep a list of zones in memory, updated periodically.
With this cache, PowerDNS can avoid hitting the database with queries for unknown domains.
In some setups, and some attack scenarios, this can make a serious performance difference.

While I usually happily test beta software to provide feedback, this single feature was enough to make me download and test 4.5.0-beta1!

What's the big deal with "zone cache"?

PDNS backends

PowerDNS Authoritative can use different so-called backends, where the information about zones and resource records is stored.
These backends can be plain text files (aka zone files/BIND backend), or something more dynamic like SQL databases (MySQL, PostgreSQL, sqlite, ...) as well as LDAP or LMDB.

Each time the Auth server receives a request, it will consult this backend to get the answer.
That doesn't mean that it has to consult again and again when receiving the same request: that's what Packet Cache and Query Cache are made for.

Without going into a complete explanation of how PowerDNS Authoritative works, let's just say that many people use the SQL backends (MySQL, PostgreSQL) because they offer advantages over other backends, such as the Native replication mode of operation: the database takes care of replication between DNS servers, with no need to deal with notifies or primary/secondary concepts.

NXDOMAIN attacks

That's all good when receiving only "legitimate" queries, but what happens when most of the queries received are unique, therefore forcing PDNS to consult the database?
Such scenarios include NXDOMAIN attacks like the Pseudo-Random Subdomain (PRSD) attack: plenty of random queries (such as abc.mydomain.org, bfd.mydomain.org, etc.) for names that do not exist. The goal of the attack is to keep sending such queries until the Authoritative server is overwhelmed.

In the case of PowerDNS, that means that every single query generates requests to the database backend, resulting in performance degradation and, eventually, an outage.
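
To illustrate why a per-answer cache does not help here, below is a small Python sketch. It is a toy model only, not how PowerDNS's Packet Cache or Query Cache are implemented: `count_backend_hits` and `random_label` are hypothetical helpers, and the "cache" is simply a set of names already seen.

```python
import random
import string


def random_label(rng, length=8):
    """Return a random lowercase label, like the ones a PRSD attack sends."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def count_backend_hits(queries):
    """Answer queries through a naive per-name cache and count backend lookups."""
    cache = set()
    backend_hits = 0
    for name in queries:
        if name not in cache:
            backend_hits += 1  # cache miss: the backend has to be consulted
            cache.add(name)
    return backend_hits


rng = random.Random(42)  # fixed seed so the run is repeatable

# 1000 identical "legitimate" queries versus 1000 random subdomains.
legit = ["www.mydomain.org"] * 1000
attack = [random_label(rng) + ".mydomain.org" for _ in range(1000)]

legit_hits = count_backend_hits(legit)
attack_hits = count_backend_hits(attack)

print("legitimate traffic, backend lookups:", legit_hits)
print("random subdomains, backend lookups:", attack_hits)
```

The repeated name costs a single backend lookup, while almost every random name is a cache miss, which is exactly the situation described above.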

Zone cache

The changelog gives a 1-liner explanation:

Add a cache of all zones, avoiding backend lookups for zones that do not exist, and for non-existing subzones

This GitHub issue explains the problem in more detail, as well as the proposed approach, which is super simple but should also be quite effective:

  • PowerDNS will keep a list of zones in memory,
  • which avoids having to query the database, even while under PRSD attack,
  • every X seconds (300 by default), PowerDNS will reload the list of zones.
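
The idea in the list above can be sketched in a few lines of Python. This is only a conceptual model under my own assumptions, not PowerDNS's actual code: `ZoneCache`, `find_zone`, `load_zones` and the injectable `clock` are hypothetical names, and the sketch does not model disabling the cache.

```python
import time


class ZoneCache:
    """Toy in-memory list of zone names, reloaded every refresh_interval seconds."""

    def __init__(self, load_zones, refresh_interval=300, clock=time.monotonic):
        self._load_zones = load_zones  # hypothetical callable returning all zone names
        self._interval = refresh_interval
        self._clock = clock
        self._zones = set()
        self._loaded_at = None

    def _refresh_if_stale(self):
        """Reload the zone list on first use or once the interval has elapsed."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            self._zones = {z.lower().rstrip(".") for z in self._load_zones()}
            self._loaded_at = now

    def find_zone(self, qname):
        """Return the closest enclosing cached zone for qname, or None."""
        self._refresh_if_stale()
        labels = qname.lower().rstrip(".").split(".")
        for i in range(len(labels)):
            candidate = ".".join(labels[i:])
            if candidate in self._zones:
                return candidate
        return None  # no such zone: the backend does not need to be queried


# Stand-in for "SELECT name FROM domains" on a real backend.
def load_zones():
    return ["test.nuc.local.ncartron.org"]


cache = ZoneCache(load_zones, refresh_interval=300)
print(cache.find_zone("1.test.nuc.local.ncartron.org"))   # test.nuc.local.ncartron.org
print(cache.find_zone("1.test.nuc2.local.ncartron.org"))  # None
```

The trade-off is visible in the sketch: a zone added to the backend only becomes known after the next reload, so a shorter interval means fresher data at the cost of more frequent list queries.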

Zone-cache is enabled by default now (it was not the case with the alpha1).

The setting to modify in your pdns.conf configuration file is zone-cache-refresh-interval: change it to decrease or increase how long the list of zones is kept in memory between reloads, or set it to 0 to disable the feature.

Testing it

Tests with zone cache disabled

To compare apples to apples, let's start by doing a few requests with zone cache disabled, and query-logging enabled, so that we can see the SQL queries hitting the backend.

Add the following to your pdns.conf:

zone-cache-refresh-interval=0
query-logging=yes

For the purpose of this article, I have created a zone test.nuc.local.ncartron.org.

  • A simple test to return an A record in that zone: dig @nuc 1.test.nuc.local.ncartron.org will result in 4 queries made (outputs have been edited for brevity):

SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and name=? and domain_id=?
select kind,content from domains, domainmetadata where domainmetadata.domain_id=domains.id and name=?

  • If I do the same kind of request to a non-existing zone, e.g. dig @nuc 1.test.nuc2.local.ncartron.org I get 7 queries:

SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and type=? and name=?

(The number of requests here will depend on how many delegations have to be checked).
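
As a rough way to picture this, the Python sketch below lists every name that could be the enclosing zone for a query, from the full name down to the root. It assumes the server tries one lookup per candidate until it finds a zone; that is my reading, not something confirmed here, and `candidate_zones` is a hypothetical helper.

```python
def candidate_zones(qname):
    """List every possible enclosing zone name, longest first, ending at the root."""
    labels = qname.rstrip(".").split(".")
    names = [".".join(labels[i:]) + "." for i in range(len(labels))]
    names.append(".")  # the root is the last possible candidate
    return names


candidates = candidate_zones("1.test.nuc2.local.ncartron.org")
for name in candidates:
    print(name)
print(len(candidates))  # 7
```

For this name the walk yields 7 candidates, which happens to coincide with the 7 queries logged above, although a match on any of these names along the way would change that count.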

Tests with zone cache enabled

Now let's comment out the line with zone-cache-refresh-interval=0, or set it to another value (e.g. 60), and restart PowerDNS.

Running the same test again with the same records:

  • with 1.test.nuc.local.ncartron.org (an existing zone), we get 1 fewer query:

SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and name=? and domain_id=?
SELECT content,ttl,type,domain_id,name,auth FROM records WHERE disabled=0 and name=? and domain_id=?
select kind,content from domains, domainmetadata where domainmetadata.domain_id=domains.id and name=?

  • with 1.test.nuc2.local.ncartron.org, we get... 0 queries!
    Since the zone does not exist, PowerDNS knows it and does not even try to query the backend for that zone/record.

Wrap-up

I haven't tested on servers with real-life traffic (yet), but I am pleasantly surprised by the improvement the zone cache brings.

It would be even better with the possibility to also load the records of a zone, as suggested by Rémi here, but hey that's a first step :)


Tags: DNS, Opensource


I don't have any commenting system, but email me (nicolas at ncartron dot org) your comments!