FAQ: Troubleshooting UDP Packet Loss

This document addresses common questions related to diagnosing and resolving UDP packet loss when using the imudp input module. While UDP is fast, it is inherently unreliable. These entries will help you understand why loss occurs and how to mitigate it.

Q1: I see packet loss in my OS statistics for imudp. What are the potential causes, troubleshooting steps, and probable fixes?

Answer:

This is a common issue when dealing with UDP-based logging. Packet loss typically occurs when the rate of incoming messages overwhelms the system’s or rsyslog’s capacity to process them.

Potential Causes

  • High Message Rate: A burst of traffic or a sustained high volume of logs exceeds the processing capacity.

  • Insufficient OS Buffers: The kernel’s network receive buffer for the UDP socket is too small and overflows. This is the most common cause.

  • CPU Limitations: The CPU is too busy to service the network queue, causing the kernel buffer to fill and drop packets.

  • Rsyslog Configuration: The default imudp settings may not be optimized for a high-traffic environment. Downstream actions within your rsyslog configuration could also be slow, causing backpressure.

  • Network Issues: Packet loss is happening on the network before the packets even reach the rsyslog server.

Troubleshooting Steps

  1. Confirm Kernel Drops: First, verify that the OS kernel is the one dropping packets due to a full receive buffer. See the specific FAQ entry on how to check this with netstat or ss.

  2. Check for Unknown Ports: Use OS network statistics to see if packets are arriving on ports you are not listening on, which indicates a client misconfiguration.

  3. Inspect Network Traffic: Use a tool like tcpdump to see if the packets are arriving at the server’s network interface. This helps rule out upstream network problems.

  4. Enable ``impstats``: Use rsyslog’s impstats module to monitor internal queue performance and see if rsyslog itself is dropping messages after they have been received from the kernel.

Probable Fixes

  1. Increase Kernel Receive Buffer: This is the most effective fix. You can request a larger buffer directly in your imudp configuration. Rsyslog will attempt to set it using privileged calls if run as root.

    # Request a 4MB receive buffer for the UDP input
    input(type="imudp" port="514" rcvbuf="4m")
    

    If you are not running as root, you must also increase the OS-level maximum:

    # Set the maximum allowed buffer size to 8MB
    sysctl -w net.core.rmem_max=8388608
    
  2. Tune ``imudp`` Performance: Assign more worker threads to process incoming data in parallel and adjust the batch size.

    input(type="imudp" port="514" threads="4" batchSize="256")
    
  3. Switch to a Reliable Protocol: If packet loss is unacceptable, the ultimate fix is to stop using UDP. Migrate your clients and server to use TCP (``imtcp``) or, even better, RELP (``imrelp``), which is designed for guaranteed log delivery.

Q2: How can I definitively check if the OS kernel is dropping UDP packets due to full receive buffers?

Answer:

Yes, you can check this directly by querying kernel network statistics. The kernel increments a specific counter every time it drops a UDP packet because a socket’s receive buffer is full.

Use one of the following commands:

  1. ``netstat`` (classic tool):

    netstat -su
    

    In the output, look for the receive buffer errors counter in the Udp: section. A non-zero, increasing value is a clear indicator of packet loss due to buffer overflow.

  2. ``nstat`` (cleaner output):

    nstat -auz
    

    Look for the UdpRcvbufErrors counter. It directly corresponds to this issue.

To troubleshoot, check the value before and after a period of high traffic. If the counter increases, you have confirmed the cause of the packet loss.


Q3: Why does tcpdump show traffic as “ip-proto-17” without a port number?

Answer:

This indicates you are seeing fragmented UDP packets.

When a device sends a UDP datagram that is larger than the network’s Maximum Transmission Unit (MTU, typically ~1500 bytes), the IP layer must break it into smaller fragments.

  • Only the first fragment contains the full UDP header with the source and destination ports.

  • All subsequent fragments do not contain the UDP header. tcpdump can see from the IP header that the payload belongs to protocol 17 (UDP), but it cannot find the port numbers, so it simply labels it ip-proto-17.

The presence of many such packets means a device on your network is sending very large log messages or other UDP data. The solution is to identify the source device and, if possible, configure it to send smaller messages.


Q4: My OS stats show “packets to unknown port”. How do I find out which port is being used?

Answer:

The “packets to unknown port” counter is incremented when a UDP packet arrives on a port where no application is listening. To find the destination port, you must use a network sniffing tool like tcpdump.

This command will show all UDP traffic that is not going to your standard, known syslog ports (e.g., 514 and 10514).

tcpdump -i any -n 'udp and not (dst port 514 or dst port 10514)'

The output will show you the destination IP and port of the unexpected traffic, allowing you to identify the misconfigured client. For example, 10.1.1.5.4321 > 10.1.1.1.9999 shows a packet destined for port 9999.


Q5: How does rsyslog set the UDP receive buffer? Does it use SO_RCVBUFFORCE?

Answer:

Yes, rsyslog has a sophisticated, two-step process for setting the buffer size specified by the rcvbuf parameter in imudp:

  1. Attempt ``SO_RCVBUFFORCE``: First, it tries to set the buffer using the SO_RCVBUFFORCE socket option. This is a privileged call that bypasses the system’s maximum buffer limit (net.core.rmem_max). This call will only succeed if rsyslog is running with CAP_NET_ADMIN capabilities (e.g., as the root user).

  2. Fallback to ``SO_RCVBUF``: If the SO_RCVBUFFORCE call fails (e.g., because rsyslog is not running as root), it immediately falls back to using the standard, unprivileged SO_RCVBUF call. This call is limited by the system-wide net.core.rmem_max setting.

This means that if you run rsyslog as root, setting a large rcvbuf is sufficient. If you run as a non-privileged user, you must also tune the net.core.rmem_max sysctl parameter.

See also

Help with configuring/using Rsyslog:

See also

Contributing to Rsyslog:

Copyright 2008-2023 Rainer Gerhards (Großrinderfeld), and Others.