gVisor IP Stack

gVisor (gvisor.dev/gvisor) provides a complete userspace implementation of the TCP/IP stack. Xray-core uses it to turn raw IP packets from the TUN interface into application-layer connections.

Source: proxy/tun/stack_gvisor.go, proxy/tun/stack_gvisor_endpoint.go

Why gVisor?

The TUN device operates at Layer 3 (IP packets), but Xray-core's proxy protocols operate at Layer 4+ (TCP streams, UDP datagrams). A userspace IP stack bridges this gap:

TUN device          → Raw IP packets (L3)
gVisor TCP/IP stack → TCP connections, UDP packets (L4)
Xray Handler        → Application connections (L7)

Stack Architecture

mermaid
flowchart TB
    subgraph TUN["TUN Device (Kernel)"]
        FD["File Descriptor"]
    end

    subgraph Endpoint["Link Endpoint"]
        RX["Read Loop:<br/>TUN fd → gVisor"]
        TX["Write: gVisor → TUN fd"]
    end

    subgraph gVisor["gVisor Stack"]
        NIC["NIC (Network Interface)"]
        IPv4["IPv4 Protocol"]
        IPv6["IPv6 Protocol"]
        TCP["TCP Protocol"]
        UDP["UDP Protocol"]
        TCPFwd["TCP Forwarder"]
        UDPHandler["UDP Handler"]
    end

    FD --> RX
    RX --> NIC
    NIC --> IPv4
    NIC --> IPv6
    IPv4 --> TCP
    IPv4 --> UDP
    IPv6 --> TCP
    IPv6 --> UDP
    TCP --> TCPFwd
    UDP --> UDPHandler

    TCPFwd -->|"gonet.TCPConn"| Handler["Xray TUN Handler"]
    UDPHandler -->|"raw packet data"| UDPConn["UDP Connection Handler"]

    gVisor -->|"response packets"| TX
    TX --> FD

Stack Creation

go
func createStack(ep stack.LinkEndpoint) (*stack.Stack, error) {
    gStack := stack.New(stack.Options{
        NetworkProtocols: []stack.NetworkProtocolFactory{
            ipv4.NewProtocol,   // IPv4 support
            ipv6.NewProtocol,   // IPv6 support
        },
        TransportProtocols: []stack.TransportProtocolFactory{
            tcp.NewProtocol,    // TCP support
            udp.NewProtocol,    // UDP support
        },
        HandleLocal: false,     // Don't special-case local addresses
    })

    // Create virtual NIC bound to our endpoint
    if err := gStack.CreateNIC(1, ep); err != nil {
        return nil, errors.New(err.String())
    }

    // Accept ALL destination IPs (route everything through this NIC)
    gStack.SetRouteTable([]tcpip.Route{
        {Destination: header.IPv4EmptySubnet, NIC: 1},  // 0.0.0.0/0
        {Destination: header.IPv6EmptySubnet, NIC: 1},  // ::/0
    })

    // Critical: accept packets for any IP (we're a proxy, not a host)
    gStack.SetSpoofing(1, true)
    gStack.SetPromiscuousMode(1, true)

    return gStack, nil
}

TCP Tuning

go
// Congestion control: CUBIC (standard)
// Note: SetTransportProtocolOption takes a pointer, so each option
// needs an addressable variable.
ccOpt := tcpip.CongestionControlOption("cubic")
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &ccOpt)

// Selective ACK (improves recovery from packet loss)
sackOpt := tcpip.TCPSACKEnabled(true)
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &sackOpt)

// Moderate receive buffer (auto-tune buffer sizes)
moderateOpt := tcpip.TCPModerateReceiveBufferOption(true)
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &moderateOpt)

// Disable RACK/TLP (workaround for a gVisor stall bug)
recoveryOpt := tcpip.TCPRecovery(0)
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &recoveryOpt)

// Buffer sizes (these options must also be applied to the stack)
tcpRXBufOpt := tcpip.TCPReceiveBufferSizeRangeOption{
    Min: 4096, Default: 212992, Max: 8388608,  // 4 KiB → 208 KiB → 8 MiB
}
tcpTXBufOpt := tcpip.TCPSendBufferSizeRangeOption{
    Min: 4096, Default: 212992, Max: 6291456,  // 4 KiB → 208 KiB → 6 MiB
}
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &tcpRXBufOpt)
gStack.SetTransportProtocolOption(tcp.ProtocolNumber, &tcpTXBufOpt)

Link Endpoint

The link endpoint is the bridge between the TUN file descriptor and gVisor's packet processing:

go
type tunEndpoint struct {
    tun        Tun                        // TUN device
    dispatcher stack.NetworkDispatcher    // gVisor packet dispatcher
    mtu        uint32
}

Inbound Path (TUN → gVisor)

go
func (ep *tunEndpoint) dispatchLoop() {
    for {
        // Read one raw IP packet from the TUN fd
        packet := readFromTUN()

        // Determine IP version from the first nibble
        var protocol tcpip.NetworkProtocolNumber
        switch packet[0] >> 4 {
        case 4:
            protocol = header.IPv4ProtocolNumber
        case 6:
            protocol = header.IPv6ProtocolNumber
        default:
            continue // not an IP packet; drop it
        }

        // Create PacketBuffer and deliver to gVisor
        pkt := stack.NewPacketBuffer(stack.PacketBufferOptions{
            Payload: buffer.MakeWithData(packet),
        })
        ep.dispatcher.DeliverNetworkPacket(protocol, pkt)
        pkt.DecRef() // release our reference; the stack holds its own
    }
}
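
The first-nibble dispatch works because both IP versions place the version field in the top four bits of byte 0. As a stdlib-only illustration of what the stack will then parse from an IPv4 packet (`parseIPv4` is a hypothetical helper, not Xray code; field offsets per RFC 791):

```go
import "net"

// parseIPv4 extracts the fields gVisor's IPv4 protocol cares about:
// the transport protocol number (byte 9), and the source (bytes 12-15)
// and destination (bytes 16-19) addresses.
func parseIPv4(b []byte) (proto byte, src, dst net.IP, ok bool) {
	if len(b) < 20 || b[0]>>4 != 4 {
		return 0, nil, nil, false // too short, or not IPv4
	}
	ihl := int(b[0]&0x0F) * 4 // header length in bytes (IHL is in 32-bit words)
	if ihl < 20 || len(b) < ihl {
		return 0, nil, nil, false
	}
	return b[9], net.IP(b[12:16]), net.IP(b[16:20]), true
}
```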

Outbound Path (gVisor → TUN)

go
func (ep *tunEndpoint) WritePackets(pkts stack.PacketBufferList) (int, tcpip.Error) {
    n := 0
    for _, pkt := range pkts.AsSlice() {
        // Serialize gVisor packet (headers + payload) to bytes
        data := pkt.ToView().AsSlice()
        // Write raw IP packet to the TUN fd
        if _, err := ep.tun.Write(data); err != nil {
            return n, &tcpip.ErrInvalidEndpointState{}
        }
        n++
    }
    return n, nil
}

TCP Forwarder

All TCP connections are intercepted by the forwarder:

go
tcpForwarder := tcp.NewForwarder(ipStack,
    0,      // receive window size (0 = use the stack default)
    65535,  // max in-flight connection attempts
    func(r *tcp.ForwarderRequest) {
        go handleTCPConnection(r)
    },
)
ipStack.SetTransportProtocolHandler(tcp.ProtocolNumber, tcpForwarder.HandlePacket)

The forwarder:

  1. Receives the SYN packet
  2. Creates a gVisor endpoint (performs TCP 3-way handshake internally)
  3. Wraps the endpoint in gonet.NewTCPConn() (implements net.Conn)
  4. Passes to the Xray handler

UDP Processing

UDP doesn't use gVisor's forwarder. Instead, packets are intercepted at the transport protocol handler level:

go
ipStack.SetTransportProtocolHandler(udp.ProtocolNumber,
    func(id stack.TransportEndpointID, pkt *stack.PacketBuffer) bool {
        data := pkt.Data().AsRange().ToSlice()

        src := net.UDPDestination(
            net.IPAddress(id.RemoteAddress.AsSlice()),
            net.Port(id.RemotePort),
        )
        dst := net.UDPDestination(
            net.IPAddress(id.LocalAddress.AsSlice()),
            net.Port(id.LocalPort),
        )

        return udpForwarder.HandlePacket(src, dst, data)
    },
)

Why not use gVisor's UDP forwarder? Because gVisor's forwarder creates per-destination connections, which breaks Full-Cone NAT, where return packets may arrive from any remote address and must still reach the client.

Raw UDP Return Path

For UDP responses, Xray must construct raw IP+UDP packets to inject back into the gVisor stack:

go
func (t *stackGVisor) writeRawUDPPacket(payload []byte, src, dst net.Destination) error {
    // Build UDP header
    udpHdr := header.UDP(...)
    udpHdr.Encode(&header.UDPFields{
        SrcPort: src.Port,
        DstPort: dst.Port,
        Length:  udpLen,
    })
    // Calculate checksum
    udpHdr.SetChecksum(...)

    // Build IP header (v4 or v6)
    if isIPv4 {
        ipHdr := header.IPv4(...)
        ipHdr.Encode(&header.IPv4Fields{
            TotalLength: ...,
            TTL: 64,
            Protocol: header.UDPProtocolNumber,
            SrcAddr: srcIP,
            DstAddr: dstIP,
        })
        ipHdr.SetChecksum(...)
    }

    // Inject packet back into the stack
    t.stack.WriteRawPacket(defaultNIC, ipProtocol, packetData)
}

This raw packet goes through gVisor's stack back to the TUN device, then to the original application.
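
Hosts silently drop UDP packets with a bad checksum, so the SetChecksum step has to be right. Per RFC 768, the checksum covers an IPv4 pseudo-header (source and destination addresses, protocol number, UDP length) plus the UDP header and payload, with the checksum field zeroed during computation. A stdlib-only sketch (helper names are mine, not gVisor's API):

```go
// sum16 accumulates 16-bit big-endian words for the Internet checksum.
func sum16(b []byte, acc uint32) uint32 {
	for i := 0; i+1 < len(b); i += 2 {
		acc += uint32(b[i])<<8 | uint32(b[i+1])
	}
	if len(b)%2 == 1 {
		acc += uint32(b[len(b)-1]) << 8 // odd trailing byte, zero-padded
	}
	return acc
}

// udpChecksumIPv4 computes the UDP checksum over the IPv4 pseudo-header
// plus the UDP header and payload. The checksum field inside udp must be
// zero when this is called.
func udpChecksumIPv4(srcIP, dstIP [4]byte, udp []byte) uint16 {
	acc := sum16(srcIP[:], 0)
	acc = sum16(dstIP[:], acc)
	acc += 17               // pseudo-header: protocol number (UDP)
	acc += uint32(len(udp)) // pseudo-header: UDP length
	acc = sum16(udp, acc)
	for acc>>16 != 0 { // fold carries back into the low 16 bits
		acc = acc&0xFFFF + acc>>16
	}
	cs := ^uint16(acc)
	if cs == 0 {
		cs = 0xFFFF // RFC 768: a transmitted 0 means "no checksum"
	}
	return cs
}
```

A receiver validates by summing the same range with the checksum included; the folded result must be 0xFFFF.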

Memory Considerations

gVisor allocates memory for:

  • Per-connection TCP buffers (up to 8MB RX + 6MB TX per conn)
  • Packet buffers for in-flight packets
  • Protocol state (TCP sequence numbers, timers, etc.)

For a proxy handling thousands of connections, this can be significant. The buffer auto-tuning (TCPModerateReceiveBufferOption) helps by starting small and growing as needed.
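
To see the scale, a quick calculation with the buffer limits configured above (the connection count is illustrative; `bufferBudget` is my name, not Xray's):

```go
// Buffer limits from the TCP tuning section, in bytes.
const (
	rxDefault = 212992  // receive default (208 KiB)
	txDefault = 212992  // send default (208 KiB)
	rxMax     = 8388608 // receive cap (8 MiB)
	txMax     = 6291456 // send cap (6 MiB)
)

// bufferBudget returns total TCP buffer memory for n connections,
// at the defaults and at the worst case where every buffer hit its cap.
func bufferBudget(n int) (defaultBytes, worstCaseBytes int) {
	return n * (rxDefault + txDefault), n * (rxMax + txMax)
}
```

At 1,000 connections that is roughly 406 MiB at the defaults but about 13.7 GiB if every connection grew to its caps, which is why starting small and auto-tuning matters.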

Implementation Notes

  1. gVisor is optional: You could use a simpler approach (lwIP, smoltcp) but gVisor provides the most complete TCP implementation (SACK, CUBIC, proper retransmission, etc.).

  2. Spoofing + Promiscuous are mandatory: Without them, gVisor rejects packets not addressed to a known IP. As a proxy, every destination IP is valid.

  3. RACK/TLP workaround: Disabling RACK/TLP recovery (TCPRecovery(0)) is a workaround for a gVisor bug where connections stall under high load. Monitor if this is fixed in newer gVisor versions.

  4. UDP via raw packets: The custom UDP handling (bypassing gVisor's UDP forwarder) is necessary for Full-Cone NAT. The raw packet construction (IP+UDP headers, checksums) must be correct or packets will be dropped.

  5. MTU matters: The TUN MTU (default 1500) affects maximum packet size. MSS is derived from MTU. Mismatched MTU causes fragmentation or drops.
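
The MSS derivation in point 5 is simple arithmetic, assuming no IP options and a 20-byte TCP header (`mssForMTU` is an illustrative helper):

```go
// mssForMTU derives the TCP MSS from the link MTU: subtract the fixed IP
// header (20 bytes for IPv4, 40 for IPv6) and the 20-byte TCP header.
func mssForMTU(mtu int, ipv6 bool) int {
	ipHeader := 20
	if ipv6 {
		ipHeader = 40
	}
	return mtu - ipHeader - 20
}
```

For the default TUN MTU of 1500 this gives the familiar 1460 (IPv4) and 1440 (IPv6).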

Technical analysis for re-implementation purposes.