DNS Caching

Xray's DNS caching is implemented per-server via the CacheController struct. Every cached nameserver (UDP, TCP, DoH, DoQ) has its own independent cache. The caching layer handles TTL tracking, serve-stale, deduplication of in-flight queries, pubsub notification, and background map compaction.

CacheController

File: app/dns/cache_controller.go

go
type CacheController struct {
    name            string
    disableCache    bool
    serveStale      bool
    serveExpiredTTL int32  // negative value: max seconds past expiry to serve

    ips      map[string]*record
    dirtyips map[string]*record  // used during map compaction

    sync.RWMutex
    pub           *pubsub.Service
    cacheCleanup  *task.Periodic
    highWatermark int
    requestGroup  singleflight.Group
}

The record and IPRecord Types

File: app/dns/dnscommon.go

go
type record struct {
    A    *IPRecord
    AAAA *IPRecord
}

type IPRecord struct {
    ReqID     uint16
    IP        []net.IP
    Expire    time.Time
    RCode     dnsmessage.RCode
    RawHeader *dnsmessage.Header
}

Each domain has a record containing separate A (IPv4) and AAAA (IPv6) entries. The Expire field stores the absolute expiration time (now + minimum TTL from the response).

TTL Calculation

When reading from cache, IPRecord.getIPs() computes the remaining TTL:

go
func (r *IPRecord) getIPs() ([]net.IP, int32, error) {
    if r == nil {
        return nil, 0, errRecordNotFound
    }
    untilExpire := time.Until(r.Expire).Seconds()
    ttl := int32(math.Ceil(untilExpire))

    if r.RCode != dnsmessage.RCodeSuccess {
        return nil, ttl, dns.RCodeError(r.RCode)
    }
    if len(r.IP) == 0 {
        return nil, ttl, dns.ErrEmptyResponse
    }
    return r.IP, ttl, nil
}

A positive TTL means the record is fresh. A zero or negative TTL means it has expired.

Cache Lookup Flow

File: app/dns/nameserver_cached.go

The queryIP() function implements the cache-first strategy:

go
func queryIP(ctx context.Context, s CachedNameserver, domain string, option dns.IPOption) ([]net.IP, uint32, error) {
    fqdn := Fqdn(domain)
    cache := s.getCacheController()

    if !cache.disableCache {
        if rec := cache.findRecords(fqdn); rec != nil {
            ips, ttl, err := merge(option, rec.A, rec.AAAA)
            if !errors.Is(err, errRecordNotFound) {
                if ttl > 0 {
                    // CACHE HIT: fresh record
                    return ips, uint32(ttl), err
                }
                if cache.serveStale && (cache.serveExpiredTTL == 0 || cache.serveExpiredTTL < ttl) {
                    // CACHE OPTIMISTIC: stale but serveable
                    go pull(ctx, s, fqdn, option)  // background refresh
                    return ips, 1, err
                }
            }
        }
    }
    // CACHE MISS: fetch from upstream
    return fetch(ctx, s, fqdn, option)
}

Serve-Stale

When serveStale is enabled:

  • Expired records are returned immediately with TTL=1
  • A background goroutine (pull()) refreshes the record asynchronously
  • The serveExpiredTTL field limits how far past expiry a record can be served (0 = unlimited)

The serveExpiredTTL is stored as a negative int32 (e.g., -3600 means "up to 3600 seconds past expiry"). The comparison cache.serveExpiredTTL < ttl works because ttl is also negative once a record has expired: the record is served only while its deficit past expiry is still smaller in magnitude than the configured window.
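A worked example of that comparison, as a minimal sketch (staleServeable is an illustrative name):

```go
package main

import "fmt"

// staleServeable mirrors the check in queryIP(): serveExpiredTTL is stored
// negative, and ttl is already negative for expired records. With
// serveExpiredTTL = -3600, a record expired 100s ago (ttl = -100) passes
// because -3600 < -100.
func staleServeable(serveExpiredTTL, ttl int32) bool {
	return serveExpiredTTL == 0 || serveExpiredTTL < ttl
}

func main() {
	fmt.Println(staleServeable(-3600, -100))  // true: within the stale window
	fmt.Println(staleServeable(-3600, -7200)) // false: too far past expiry
	fmt.Println(staleServeable(0, -999999))   // true: 0 means unlimited
}
```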

Request Deduplication

The fetch() function uses singleflight.Group to prevent duplicate upstream queries:

go
func fetch(ctx context.Context, s CachedNameserver, fqdn string, option dns.IPOption) ([]net.IP, uint32, error) {
    key := fqdn + suffix  // suffix is "4", "6", or "46", per the requested IP versions
    v, _, _ := s.getCacheController().requestGroup.Do(key, func() (any, error) {
        return doFetch(ctx, s, fqdn, option), nil
    })
    ret := v.(result)
    return ret.ips, ret.ttl, ret.error
}

If multiple goroutines request the same domain simultaneously, only one actual upstream query is made.

Record Update Flow

When a DNS response arrives, updateRecord() handles cache insertion and pubsub notification:

go
func (c *CacheController) updateRecord(req *dnsRequest, rep *IPRecord) {
    rtt := time.Since(req.start)

    // 1. Publish to waiting subscribers
    switch req.reqType {
    case dnsmessage.TypeA:
        c.pub.Publish(req.domain+"4", rep)
    case dnsmessage.TypeAAAA:
        c.pub.Publish(req.domain+"6", rep)
    }

    if c.disableCache { return }

    // 2. Merge with existing record
    c.Lock()
    newRec := &record{}
    oldRec := c.ips[req.domain]

    switch req.reqType {
    case dnsmessage.TypeA:
        newRec.A = rep
        if oldRec != nil && oldRec.AAAA != nil {
            newRec.AAAA = oldRec.AAAA  // preserve existing AAAA
        }
    case dnsmessage.TypeAAAA:
        newRec.AAAA = rep
        if oldRec != nil && oldRec.A != nil {
            newRec.A = oldRec.A  // preserve existing A
        }
    }
    c.ips[req.domain] = newRec
    c.Unlock()

    // 3. Cross-publish: if A arrives, also notify AAAA subscribers with cached data
    // (sketch) if the counterpart record is already cached with valid IPs,
    // publish it under the other suffix as well
    if pubRecord != nil && len(pubRecord.IP) > 0 {
        c.pub.Publish(req.domain+pubSuffix, pubRecord)
    }

    // 4. Start cleanup timer
    if !c.serveStale || c.serveExpiredTTL != 0 {
        c.cacheCleanup.Start()
    }
}

The cross-publish step is important for merged A+AAAA queries: when both record types are requested, the first response to arrive also publishes the cached counterpart so the subscriber doesn't have to wait.

PubSub Mechanism

The cache uses pubsub.Service for asynchronous notification. When doFetch() starts a query:

go
func (c *CacheController) registerSubscribers(domain string, option dns.IPOption) (sub4, sub6 *pubsub.Subscriber) {
    if option.IPv4Enable {
        sub4 = c.pub.Subscribe(domain + "4")
    }
    if option.IPv6Enable {
        sub6 = c.pub.Subscribe(domain + "6")
    }
    return
}

The doFetch() function then waits on these subscribers:

go
func doFetch(ctx context.Context, s CachedNameserver, fqdn string, option dns.IPOption) result {
    sub4, sub6 := s.getCacheController().registerSubscribers(fqdn, option)
    defer closeSubscribers(sub4, sub6)

    noResponseErrCh := make(chan error, 2)
    s.sendQuery(ctx, noResponseErrCh, fqdn, option)

    // Wait for either: context cancel, transport error, or pubsub message
    rec4, err4 := onEvent(sub4)
    rec6, err6 := onEvent(sub6)

    ips, ttl, err := merge(option, rec4, rec6, err4, err6)
    return result{ips, ttl, err}
}

Merging A and AAAA Records

The merge() function combines IPv4 and IPv6 results:

go
func merge(option dns.IPOption, rec4 *IPRecord, rec6 *IPRecord, errs ...error) ([]net.IP, int32, error) {
    mergeReq := option.IPv4Enable && option.IPv6Enable

    // If only one type requested, return it directly
    // If both requested, combine IPs and use minimum TTL
    // If one has IPs and the other doesn't, return what we have
    // If neither has IPs, return combined errors
}

The TTL of the merged result is the minimum of the A and AAAA TTLs, capped at dns.DefaultTTL (600 seconds).
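A minimal sketch of that TTL rule (the constant value is taken from the text above; mergedTTL is an illustrative name):

```go
package main

import "fmt"

// defaultTTL mirrors dns.DefaultTTL as described above.
const defaultTTL = 600

// mergedTTL returns the minimum of the A and AAAA TTLs, capped at defaultTTL.
func mergedTTL(ttl4, ttl6 int32) int32 {
	ttl := ttl4
	if ttl6 < ttl {
		ttl = ttl6
	}
	if ttl > defaultTTL {
		ttl = defaultTTL
	}
	return ttl
}

func main() {
	fmt.Println(mergedTTL(300, 900))  // 300: minimum of the two
	fmt.Println(mergedTTL(1800, 900)) // 600: capped at dns.DefaultTTL
}
```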

Cache Cleanup and Map Compaction

File: app/dns/cache_controller.go

The cleanup runs every 300 seconds via task.Periodic:

go
func (c *CacheController) CacheCleanup() error {
    expiredKeys, _ := c.collectExpiredKeys()
    c.writeAndShrink(expiredKeys)
    return nil
}

Expired Key Collection (Read Lock)

go
func (c *CacheController) collectExpiredKeys() ([]string, error) {
    c.RLock()
    defer c.RUnlock()
    // Skip if migration in progress
    if c.dirtyips != nil { return nil, nil }
    // Collect domains where A or AAAA has expired
    // If serveStale with serveExpiredTTL, adjust "now" accordingly
}

Write and Shrink (Write Lock)

After collecting expired keys, writeAndShrink() performs the actual deletion and optionally triggers map compaction:

go
func (c *CacheController) writeAndShrink(expiredKeys []string) {
    c.Lock()
    defer c.Unlock()

    // Delete expired individual records (A or AAAA)
    // Delete the domain entry entirely if both are nil

    // Shrink decision:
    // If map is now empty and highWatermark >= 512: rebuild empty map
    // If reduction from peak > 10240 AND > 65% of peak: background migrate
}
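The shrink decision can be sketched as a predicate over the peak and current map sizes (illustrative names; the threshold constants match the values described in this section):

```go
package main

import "fmt"

const (
	shrinkAbsoluteThreshold = 10240
	shrinkRatioThreshold    = 0.65
)

// shouldMigrate mirrors the described decision: the map must have freed more
// than shrinkAbsoluteThreshold entries from its peak AND more than 65% of the
// peak before a background migration to a smaller map is worthwhile.
func shouldMigrate(peak, current int) bool {
	freed := peak - current
	return freed > shrinkAbsoluteThreshold &&
		float64(freed) > float64(peak)*shrinkRatioThreshold
}

func main() {
	fmt.Println(shouldMigrate(100000, 10000)) // true: freed 90000 entries, 90% of peak
	fmt.Println(shouldMigrate(100000, 95000)) // false: only 5000 entries freed
}
```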

Background Migration

When the map has shrunk significantly, a new smaller map is created and entries are migrated in batches of 4096:

go
func (c *CacheController) migrate() {
    batch := make([]migrationEntry, 0, 4096)
    for domain, rec := range c.dirtyips {
        batch = append(batch, migrationEntry{domain, rec})
        if len(batch) >= 4096 {
            c.flush(batch)
            batch = batch[:0]  // reuse the backing array for the next batch
            runtime.Gosched()  // yield to other goroutines
        }
    }
    if len(batch) > 0 {
        c.flush(batch)  // flush the final partial batch
    }
    c.Lock()
    c.dirtyips = nil
    c.Unlock()
}

During migration, findRecords() checks both c.ips (new map) and c.dirtyips (old map):

go
func (c *CacheController) findRecords(domain string) *record {
    c.RLock()
    defer c.RUnlock()
    rec := c.ips[domain]
    if rec == nil && c.dirtyips != nil {
        rec = c.dirtyips[domain]
    }
    return rec
}

The flush() method merges entries, preferring newer data in c.ips over older data from c.dirtyips.

Cache Configuration

Cache behavior is controlled at two levels:

Global (DNSConfig):

  • disableCache -- Disable caching for all servers
  • serveStale -- Enable stale serving for all servers
  • serveExpiredTTL -- Max seconds past expiry to serve stale

Per-Server (NameServerConfig):

  • disableCache -- Override global setting for this server
  • serveStale -- Override global setting
  • serveExpiredTTL -- Override global setting

Per-server settings take precedence when non-nil.
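The precedence rule can be sketched with pointer-typed overrides (illustrative only; Xray's actual config types differ):

```go
package main

import "fmt"

// effective resolves a per-server override against a global default:
// the per-server value wins when non-nil, otherwise the global applies.
func effective(global bool, perServer *bool) bool {
	if perServer != nil {
		return *perServer
	}
	return global
}

func main() {
	on := true
	fmt.Println(effective(false, &on)) // true: per-server override applies
	fmt.Println(effective(false, nil)) // false: falls back to the global setting
}
```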

Implementation Notes

  • The CacheController name field is derived from the server type and address (e.g., "UDP://8.8.8.8:53", "DOH//dns.google"). This appears in log messages.

  • When disableCache is true, updateRecord() still publishes to pubsub (so in-flight queries get their responses) but does not store the record in the map.

  • The shrink thresholds are:

    • minSizeForEmptyRebuild = 512 -- Only rebuild empty maps if the peak was at least 512
    • shrinkAbsoluteThreshold = 10240 -- Must have freed at least 10240 entries from peak
    • shrinkRatioThreshold = 0.65 -- Must have freed at least 65% of peak entries
  • The singleflight.Group in requestGroup returns cached results to all concurrent callers, effectively deduplicating both cache misses and stale refreshes.

  • The cleanup timer is started lazily (only after a record is inserted) and stops itself when the map is empty. When serveStale is true with no serveExpiredTTL, cleanup is not started at all (records live forever until evicted by map shrinking).

  • TTL in parsed responses uses the minimum TTL across all answer records, with a floor of 1 second (TTL=0 in DNS responses is treated as TTL=1).
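That floor can be expressed as a one-liner (sketch; clampTTL is an illustrative name):

```go
package main

import "fmt"

// clampTTL applies the described floor: an answer set whose minimum TTL is 0
// is cached as TTL=1.
func clampTTL(minAnswerTTL uint32) uint32 {
	if minAnswerTTL == 0 {
		return 1
	}
	return minAnswerTTL
}

func main() {
	fmt.Println(clampTTL(0))   // 1: TTL=0 is treated as TTL=1
	fmt.Println(clampTTL(300)) // 300: positive TTLs pass through
}
```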

Technical analysis for re-implementation purposes.