
DNS Privacy Stack — Optimization

Optimization

This section covers every performance tuning setting applied to the stack, why each one matters, and how it affects real-time applications such as video calls and gaming.

Unbound Performance Settings

Cache Sizing

The most critical setting. Oversized caches cause swap thrashing; undersized caches cause cache misses.

| Setting | Recommended | Why |
|---|---|---|
| rrset-cache-size | 64m | DNS record cache. Rule: 2x msg-cache |
| msg-cache-size | 32m | Query response cache. Rule: rrset/2 |
| key-cache-size | 16m | DNSSEC key cache |
| neg-cache-size | 8m | NXDOMAIN cache |

Total: ~120MB — comfortably fits in RAM for any homelab server. The previous config used 1.8GB (1024m + 512m + 256m) which caused swap thrashing and a cascading DNS outage after 5 days.

Do NOT set caches to 1GB+

On a 16GB server running 100+ Docker containers, Unbound's caches compete with container memory. At 1.8GB, Unbound pushed the system into 12GB of swap, making it completely unresponsive. 120MB total is more than sufficient for a homelab resolving ~10,000 unique domains.
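The arithmetic behind the two totals can be checked with a short script. This is a minimal sketch: the helper names `cache_total_mb` and `cache_fits` are hypothetical, and the 5% RAM budget is an illustrative assumption rather than a rule from Unbound.

```shell
#!/bin/sh
# Sum a list of cache sizes given in MB.
cache_total_mb() {
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    echo "$total"
}

# Succeed when the cache total is at or below max_percent of RAM.
# The percentage budget is an assumption chosen for illustration.
cache_fits() {
    total_mb=$1
    ram_mb=$2
    max_percent=$3
    [ $((total_mb * 100)) -le $((ram_mb * max_percent)) ]
}

recommended=$(cache_total_mb 64 32 16 8)   # rrset + msg + key + neg
previous=$(cache_total_mb 1024 512 256)    # the earlier, oversized config
echo "recommended: ${recommended} MB, previous: ${previous} MB"

# 16384 MB = the 16GB server described above
if cache_fits "$recommended" 16384 5; then
    echo "recommended config fits the budget"
fi
```

Run against the previous values, `cache_fits 1792 16384 5` fails, which matches the experience described above.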

Thread Alignment

num-threads: 2
msg-cache-slabs: 2
rrset-cache-slabs: 2
infra-cache-slabs: 2
key-cache-slabs: 2

Slab counts must be a power of 2, and setting them close to num-threads minimizes lock contention. Each cache is split into that many independently locked slabs, so threads doing cache lookups rarely wait on the same mutex.

For most homelabs, 2 threads is sufficient. Use nproc to check your CPU count — don't exceed it.
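Because unbound.conf requires slab counts to be a power of 2, thread counts such as 3 or 6 need rounding. The sketch below rounds down to the nearest power of 2 and prints matching settings; `slabs_for_threads` is a hypothetical helper, and rounding down rather than up is an assumption.

```shell
#!/bin/sh
# Largest power of 2 that does not exceed the given thread count.
slabs_for_threads() {
    n=$1
    slabs=1
    while [ $((slabs * 2)) -le "$n" ]; do
        slabs=$((slabs * 2))
    done
    echo "$slabs"
}

threads=2   # the value used in this guide; keep it at or below the nproc output
slabs=$(slabs_for_threads "$threads")

# Print the settings in unbound.conf form.
printf 'num-threads: %s\n' "$threads"
for cache in msg rrset infra key; do
    printf '%s-cache-slabs: %s\n' "$cache" "$slabs"
done
```

With `threads=2` this prints the same five lines shown above.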

Socket Optimization

so-reuseport: yes

Opens a separate listening socket per thread using SO_REUSEPORT, so the kernel distributes incoming UDP queries evenly across threads. Without it, all threads share one socket and load can spread unevenly, which limits throughput on multi-thread configurations.

EDNS Buffer Size

edns-buffer-size: 1232

Per the DNS Flag Day 2020 recommendation. Prevents IP fragmentation of UDP responses, which causes packet loss on some networks. The older default (4096) can cause problems on paths that drop or mishandle fragmented UDP packets; Unbound 1.12.0 and later already default to 1232.
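The value 1232 comes from the IPv6 minimum MTU (1280 bytes) minus the IPv6 header (40 bytes) and the UDP header (8 bytes). A small sketch of that calculation, with `max_dns_payload` as a hypothetical helper name:

```shell
#!/bin/sh
# Largest DNS payload that fits in one unfragmented packet.
# $1 = path MTU in bytes, $2 = IP header size (40 for IPv6, 20 for IPv4 without options)
max_dns_payload() {
    mtu=$1
    ip_header=$2
    udp_header=8
    echo $((mtu - ip_header - udp_header))
}

max_dns_payload 1280 40   # IPv6 minimum MTU  -> 1232
max_dns_payload 1500 20   # Ethernet over IPv4 -> 1472
```

Sizing for the IPv6 minimum means the response fits on any standards-compliant path, not only on plain Ethernet.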

Upstream RTT Tracking

infra-cache-numhosts: 10000

Unbound tracks the round-trip time (RTT) to upstream servers to make smart routing decisions. The default (10,000) is sufficient. This matters more for recursive resolution but still helps with DoT forwarder selection.

Stale Cache Serving — The Key Setting

This is the single most impactful optimization for real-time applications:

serve-expired: yes
serve-expired-ttl: 3600 # (1)
serve-expired-client-timeout: 1800 # (2)
  1. Maximum age of expired entries to serve. After 1 hour past TTL expiry, the entry is discarded rather than served stale.
  2. If the upstream response takes longer than 1800 ms (1.8 seconds), serve the stale entry. In practice upstream DoT answers in ~40ms, so clients normally get the fresh answer; the stale entry is a safety net for slow upstreams and outages.

How It Works

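Without serve-expired (the default):

Client query → Cache expired → Wait for upstream (40-200ms) → Return fresh answer
                             → Upstream down: the query fails

With serve-expired: yes and serve-expired-client-timeout: 1800 (1.8 seconds):

Client query → Cache expired → Try upstream first
                             → Answer within 1.8 seconds: return fresh answer
                             → No answer within 1.8 seconds: return stale answer, keep refreshing in background

The 1800 value is in milliseconds and means: if the upstream hasn't responded within 1.8 seconds, serve the stale cache entry. Because the upstream DoT query normally takes ~40ms, clients usually receive the fresh answer and stale data is only used when upstream is slow or down. Setting serve-expired-client-timeout: 0 instead makes Unbound return the stale entry immediately (0ms) and refresh it in the background.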
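The decision can be expressed as a toy model. This is a sketch of the behavior described in the unbound.conf documentation, not Unbound's actual code; `expired_answer` is a hypothetical function and the latencies are made-up inputs.

```shell
#!/bin/sh
# What a client receives for an expired cache entry when serve-expired is on.
# $1 = upstream latency in ms, $2 = serve-expired-client-timeout in ms
expired_answer() {
    upstream_ms=$1
    client_timeout_ms=$2
    if [ "$client_timeout_ms" -eq 0 ]; then
        # Timeout 0: reply with stale data at once, refresh in the background.
        echo "stale after 0 ms"
    elif [ "$upstream_ms" -le "$client_timeout_ms" ]; then
        # Upstream answered in time: the client gets the fresh record.
        echo "fresh after ${upstream_ms} ms"
    else
        # Upstream too slow: fall back to the stale record at the timeout.
        echo "stale after ${client_timeout_ms} ms"
    fi
}

expired_answer 40 1800     # fresh after 40 ms
expired_answer 5000 1800   # stale after 1800 ms
expired_answer 40 0        # stale after 0 ms
```

The trade-off is between freshness (1800) and lowest possible latency for expired entries (0).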

Why This Matters for Real-Time Apps

  • Google Meet / Zoom: DNS lookups happen during ICE candidate gathering and STUN/TURN server resolution. A 200ms DNS delay can cause a visible video stutter.
  • Gaming (LoL, Valorant): Game clients do DNS lookups for matchmaking, chat, and telemetry. A stalled DNS query can cause a 2-5 second disconnect.
  • Boot flooding: When a laptop boots, 200+ DNS queries fire simultaneously. Without stale serving, every expired entry blocks until upstream responds, or fails outright during an outage. With it, previously cached domains still resolve when upstream is slow or down, after at most 1.8 seconds.

Combined with AdGuard's Optimistic Cache

AdGuard has its own stale-serving mechanism:

cache_optimistic: true

This provides two layers of stale serving:

  1. AdGuard returns a stale cached answer to the client immediately.
  2. AdGuard forwards the query to Unbound in the background.
  3. Unbound queries upstream over DoT, falling back to its own stale entry if upstream takes longer than 1.8 seconds.
  4. Both caches update when the fresh answer arrives.

For domains already in AdGuard's cache, the client never waits on upstream.

AdGuard Performance Settings

Cache Size

cache_size: 10000000  # 10 MB
cache_ttl_min: 600    # minimum 10 minutes
cache_ttl_max: 86400  # cap at 1 day

cache_size is measured in bytes, not entries, so the previous value of 4096 (4 KB) is far too small for 100+ containers each resolving dozens of domains. At 10 MB, the cache comfortably holds all commonly queried domains without eviction.

cache_ttl_min: 600 forces a minimum 10-minute cache, reducing upstream query volume by ~70% for domains with very short TTLs (like CDNs that set 60-second TTLs). cache_ttl_max: 86400 caps entries at 1 day to prevent serving stale records indefinitely.
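The combined effect of the two settings amounts to clamping each record's TTL into a range. The sketch below illustrates that idea only; `clamp_ttl` is a hypothetical helper and not AdGuard's implementation.

```shell
#!/bin/sh
# Clamp an upstream TTL into the [min, max] range, all values in seconds.
clamp_ttl() {
    ttl=$1
    min=$2
    max=$3
    if [ "$ttl" -lt "$min" ]; then
        echo "$min"
    elif [ "$ttl" -gt "$max" ]; then
        echo "$max"
    else
        echo "$ttl"
    fi
}

clamp_ttl 60 600 86400       # CDN record with a 60-second TTL -> 600
clamp_ttl 3600 600 86400     # already inside the range        -> 3600
clamp_ttl 604800 600 86400   # one-week TTL                    -> 86400
```

The cost of the 600-second floor is that a record changed upstream can be served from cache for up to 10 minutes after the change.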

Blocked Response TTL

blocked_response_ttl: 60

When AdGuard blocks a domain, clients receive a 0.0.0.0 response. With the default TTL of 10 seconds, clients re-query blocked domains every 10 seconds — generating unnecessary load. Setting this to 60 seconds means clients cache the "blocked" answer for a full minute, reducing repeated queries for the same blocked domain by ~6x.
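The ~6x figure follows from how often a client can re-query per hour. A quick check, assuming a client re-queries the moment its cached answer expires (a worst case; `requeries_per_hour` is a hypothetical helper):

```shell
#!/bin/sh
# Maximum re-queries per hour for one blocked domain from one client.
# $1 = blocked response TTL in seconds
requeries_per_hour() {
    echo $((3600 / $1))
}

before=$(requeries_per_hour 10)
after=$(requeries_per_hour 60)
echo "before: ${before}/h, after: ${after}/h, reduction: $((before / after))x"
# prints: before: 360/h, after: 60/h, reduction: 6x
```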

Concurrent Query Handling

max_goroutines: 500
upstream_timeout: 5s

max_goroutines controls how many DNS queries AdGuard can process simultaneously. The default (300) can bottleneck during boot flooding when 100+ containers start at once. 500 provides headroom.

upstream_timeout controls how long AdGuard waits for Unbound before failing over to fallback DNS. Reducing from 10s to 5s means faster failover — if Unbound is having issues, clients get answers from the encrypted fallback resolvers within 5 seconds instead of 10.

Safe Browsing & Filter Updates

safebrowsing_enabled: true
safebrowsing_cache_size: 4194304   # 4 MB (default 1 MB)
filters_update_interval: 12        # hours (default 24)

Safe Browsing uses AdGuard's own threat intelligence feed to block known malicious and phishing domains, independent of blocklists. Increasing the cache from 1 MB to 4 MB reduces repeated lookups against AdGuard's servers.

Updating blocklists every 12 hours (instead of 24) ensures new threats are blocked faster.

Rate Limiting

ratelimit: 0     # disabled

The default ratelimit of 20 queries/second per subnet silently drops queries during bursts — like when 100 containers start simultaneously, or when a laptop boots and fires 200 queries in 2 seconds. Since all clients are on a trusted LAN, disable it entirely.
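To see why the default limit bites, compare the burst rate from the laptop example with the limit. This is simple arithmetic under the assumption that the limit is enforced as a flat per-second cap; `burst_rate_qps` is a hypothetical helper.

```shell
#!/bin/sh
# Average query rate of a burst.
# $1 = number of queries, $2 = burst duration in seconds
burst_rate_qps() {
    echo $(($1 / $2))
}

rate=$(burst_rate_qps 200 2)   # laptop boot: 200 queries in 2 seconds
limit=20                       # default ratelimit
if [ "$rate" -gt "$limit" ]; then
    echo "burst of ${rate} qps exceeds limit of ${limit} qps"
fi
# prints: burst of 100 qps exceeds limit of 20 qps
```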

EDNS Client Subnet

edns_client_subnet:
  enabled: false

EDNS Client Subnet (ECS) sends a portion of your IP address to upstream DNS servers so they can return geographically optimized results. Disabling it is a privacy trade-off — you lose some CDN optimization but prevent upstream resolvers from learning your subnet.

AAAA (IPv6) Filtering

aaaa_disabled: true

If your network doesn't use IPv6, this halves the number of upstream queries. Every DNS lookup normally generates two queries (A + AAAA). Disabling AAAA reduces load and speeds up resolution.

Measuring Performance

Cached vs Uncached Latency

# First query (cold cache): hits upstream via DoT
dig @127.0.0.1 example.com +timeout=5 | grep "Query time"
# Expected: ~40-80ms

# Second query (cached): served from Unbound cache
dig @127.0.0.1 example.com +timeout=5 | grep "Query time"
# Expected: 0ms
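To compare several runs, it helps to extract just the number from dig's output. A minimal sketch, assuming dig prints its usual ";; Query time: N msec" line; `query_time_ms` and `measure` are hypothetical helper names.

```shell
#!/bin/sh
# Read dig output on stdin and print only the query time in milliseconds.
query_time_ms() {
    sed -n 's/^;; Query time: \([0-9][0-9]*\) msec.*/\1/p'
}

# Query the same name three times and print each latency.
# $1 = resolver address, $2 = domain name
measure() {
    for attempt in 1 2 3; do
        dig "@$1" "$2" +timeout=5 | query_time_ms
    done
}

# Example (requires a running resolver):
#   measure 127.0.0.1 example.com
```

The first number should reflect the cold-cache latency and the following ones the cached latency.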

Cache Hit Rate

Check AdGuard's dashboard at http://adguard.server.lan:8091 → Statistics. A healthy setup shows 70-90% cache hit rate after warmup.

Verify DoT Is Working

# Check Unbound is using TLS connections
ss -tnp | grep :853
# Should show ESTAB connections to 194.242.2.2 and 9.9.9.9

Performance Summary

| Setting | Before | After | Impact |
|---|---|---|---|
| Unbound caches | 1.8GB | 120MB | Eliminated swap thrashing |
| serve-expired-client-timeout | Not set | 1800ms | Stale answer served if upstream stalls for more than 1.8 seconds |
| cache_optimistic | false | true | Double-layer stale serving |
| cache_size | 4096 | 10 MB | Large cache, no eviction pressure |
| cache_ttl_min | 0 | 600s | 70% fewer upstream queries for short-TTL domains |
| cache_ttl_max | unlimited | 86400s | Prevents indefinitely stale records |
| blocked_response_ttl | 10s | 60s | 6x fewer re-queries for blocked domains |
| upstream_timeout | 10s | 5s | Faster failover to fallback DNS |
| max_goroutines | 300 | 500 | Handles boot flooding from 100+ containers |
| ratelimit | 20 qps | 0 (off) | No dropped queries during bursts |
| so-reuseport | no | yes | Even thread utilization |
| prefetch | no | yes | Popular domains stay cached |
| aaaa_disabled | false | true | 50% fewer upstream queries |
| safebrowsing_enabled | false | true | Blocks malicious/phishing domains |
| filters_update_interval | 24h | 12h | Faster blocklist updates |
| fallback_dns | empty | Mullvad/Quad9/dns0.eu (DoT) | Encrypted failover if Unbound is down |

Previous: 03-setup | Next: 05-troubleshooting