Use this documentation with care! It describes the outdated version 7, which was actively developed around 2014 and is considered dead by the rsyslog team.

This documentation reflects the latest update of the v7-stable branch. It describes the 7.6.8 version, which was never released. As such, it contains some content that does not apply to any released version.

To obtain the doc that properly matches your installed v7 version, obtain the doc set from your distro. Each version of rsyslog shipped with the documentation that exactly matches it.

As general advice, it is strongly suggested to upgrade to the current version supported by the rsyslog project. The current version can always be found in the right-hand side info box on the rsyslog web site.

Note that there is only limited rsyslog community support available for the outdated v7 version (officially we do not support it at all, but we usually are able to answer simple questions). If you need to stick with v7, it probably is best to ask your distribution for support.

How reliable should reliable logging be?

With any logging, you need to decide what you want to happen if the log cannot be written:

  • do you want the application to stop because it can’t write a log message?

or

  • do you want the application to continue, but not write the log message?

Note that this decision is still there even if you are not logging remotely: the local disk partition where you are writing logs can fill up, become read-only, or have other problems.

The RFC for syslog (dating back a couple of decades, well before rsyslog started) specifies that the application writing the log message should block and wait for the log message to be processed. Rsyslog (like every other modern syslog daemon) fudges this a bit and buffers the log data in RAM rather than following the original behavior of writing the data to disk and doing an fsync before acknowledging the log message.

If you have a problem with your output from rsyslog, your application will keep running until rsyslog fills its queues, and then it will stop.
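
The stop-or-continue choice can be sketched with a small bounded queue. This is a toy model in Python, not rsyslog code: the `submit` helper and its `block_when_full` flag are hypothetical names, and a short timeout stands in for "block until there is room" so that the example terminates.

```python
import queue


def submit(q, message, block_when_full):
    """Try to enqueue one log message; return True if it was accepted."""
    try:
        if block_when_full:
            # "Stop the application" choice: wait for room in the queue.
            # A real blocking writer would wait indefinitely; the timeout
            # here only keeps the demonstration from hanging.
            q.put(message, timeout=0.05)
        else:
            # "Keep running" choice: give up at once if the queue is full.
            q.put_nowait(message)
        return True
    except queue.Full:
        return False


def demo():
    # A tiny queue with no consumer simulates an output that has stalled.
    q = queue.Queue(maxsize=3)
    accepted = [submit(q, f"msg {i}", block_when_full=False) for i in range(5)]
    return accepted, q.qsize()
```

With a queue of three slots and nobody draining it, the first three messages are accepted and the rest are either dropped or make the caller wait, depending on the flag.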

When you configure rsyslog to send the logs to another machine (either to rsyslog on another machine or to some sort of database), you introduce a significant new set of failure modes for the output from rsyslog.

You can configure the size of the rsyslog memory queues (I had one machine dedicated to running rsyslog where I created queues large enough to use >100G of RAM for logs).

You can configure rsyslog to spill from its memory queues to disk queues (disk assisted queue mode) when it fills its memory queues.
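
The idea of spilling from memory to disk can be illustrated with a minimal sketch. It is not how rsyslog implements disk assisted queues; the `SpillQueue` class, its `mem_limit` parameter, and the one-JSON-line-per-message file format are all assumptions made for the example.

```python
import collections
import json
import os
import tempfile


class SpillQueue:
    """FIFO queue that keeps up to mem_limit messages in RAM, the rest on disk."""

    def __init__(self, mem_limit, spill_path=None):
        if spill_path is None:
            # Hypothetical spill location: a fresh temporary file.
            fd, spill_path = tempfile.mkstemp(suffix=".spill")
            os.close(fd)
        self.mem = collections.deque()
        self.mem_limit = mem_limit
        self.spill_path = spill_path
        self.read_offset = 0  # where the next unread on-disk message starts
        self.on_disk = 0      # number of unread messages in the spill file

    def put(self, msg):
        # Once anything is on disk, newer messages must follow it there,
        # otherwise they would overtake older messages.
        if self.on_disk == 0 and len(self.mem) < self.mem_limit:
            self.mem.append(msg)
            return
        with open(self.spill_path, "ab") as f:
            f.write(json.dumps(msg).encode("utf-8") + b"\n")
        self.on_disk += 1

    def get(self):
        # Messages in memory are always older than those on disk.
        if self.mem:
            return self.mem.popleft()
        if self.on_disk == 0:
            raise IndexError("queue is empty")
        with open(self.spill_path, "rb") as f:
            f.seek(self.read_offset)
            line = f.readline()
            self.read_offset = f.tell()
        self.on_disk -= 1
        return json.loads(line.decode("utf-8"))
```

The sketch never truncates the spill file and does no fsync, so it shows only the ordering logic; a real implementation also has to reclaim disk space and decide how durable the on-disk part needs to be.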

You can create a separate set of queues for the action that has a high probability of failing (sending to a remote machine via TCP in this case), but this doesn’t buy you more time; it just means that other logs can continue to be written when the remote system is down.

You can configure rsyslog to have high/low watermark levels: when the queue fills past the high watermark, rsyslog will start discarding logs below a specified severity, and it will stop doing so when the queue drops below the low watermark level.
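
The watermark behavior described above can be modeled in a few lines. This is a simplified sketch rather than rsyslog's implementation; the class and parameter names are made up for illustration. It assumes the usual syslog numbering, where 0 is the most severe level and 7 (debug) the least, so "below a specified severity" means a numerically equal or higher value.

```python
class WatermarkQueue:
    """Queue that sheds low-importance messages between two fill levels."""

    def __init__(self, high, low, discard_severity):
        self.items = []
        self.high = high                          # start discarding at this fill level
        self.low = low                            # stop discarding at this fill level
        self.discard_severity = discard_severity  # 0 = emergency ... 7 = debug
        self.discarding = False
        self.dropped = 0

    def _update_state(self):
        # Hysteresis: between the two marks the previous state is kept.
        if len(self.items) >= self.high:
            self.discarding = True
        elif len(self.items) <= self.low:
            self.discarding = False

    def put(self, severity, text):
        self._update_state()
        if self.discarding and severity >= self.discard_severity:
            self.dropped += 1
            return False
        self.items.append((severity, text))
        return True

    def get(self):
        item = self.items.pop(0)
        self._update_state()
        return item
```

For example, with `high=4`, `low=2` and `discard_severity=5`, the fifth informational message (severity 6) is dropped while an error (severity 3) is still queued, and informational messages are accepted again only after the queue has drained to two entries.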

For rsyslog -> *syslog, you can use UDP for your transport so that the logs will get dropped at the network layer if the remote system is unresponsive.

You have lots of options.

If you are really concerned with reliability, I should point out that using TCP does not eliminate the possibility of losing logs when a remote system goes down. When you send a message via TCP, the sender considers it sent when it’s handed to the OS to send it. The OS has a window of how much data it allows to be outstanding (sent without acknowledgement from the remote system), and when the TCP connection fails (due to a firewall or a remote machine going down), the sending OS has no way to tell the application what data is outstanding, so the outstanding data will be lost. This is a smaller window of loss than UDP, which will happily keep sending your data forever, but it’s still a potential for loss. Rsyslog offers RELP (the Reliable Event Logging Protocol), which addresses this problem by using application-level acknowledgements so no messages can get lost due to network issues. That just leaves memory buffering (both in rsyslog and in the OS after rsyslog tells the OS to write the logs) as potential data loss points. Those failures will only trigger if the system crashes or rsyslog is shut down (and yes, there are ways to address these as well).
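
The difference between "handed to the OS" and "acknowledged by the receiver" can be shown with a toy model. Nothing here touches the network or implements the RELP wire protocol; the `Receiver` class, both functions, and the `fail_after` parameter (the number of messages that arrive before the connection breaks) are hypothetical.

```python
class Receiver:
    """Stand-in for the remote logger: it only has what it actually stored."""

    def __init__(self):
        self.stored = []


def send_fire_and_forget(messages, receiver, fail_after):
    """Plain-TCP-like sender: a message counts as sent once handed over."""
    considered_sent = []
    for i, msg in enumerate(messages):
        considered_sent.append(msg)       # the sender forgets it right away
        if i < fail_after:
            receiver.stored.append(msg)   # only these really arrived
    return considered_sent


def send_with_acks(messages, receiver, fail_after):
    """Acknowledging sender: keep every message until the receiver confirms it."""
    unacked = list(messages)
    # First connection: it breaks after fail_after messages were
    # stored and acknowledged.
    for msg in messages[:fail_after]:
        receiver.stored.append(msg)
        unacked.remove(msg)
    # After reconnecting: resend whatever was never acknowledged.
    for msg in list(unacked):
        receiver.stored.append(msg)
        unacked.remove(msg)
    return unacked
```

With five messages and a failure after the third, the first sender believes all five were sent while the receiver holds three. The second sender still knows which two are outstanding and can deliver them once the connection is back.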

The reason why nothing today operates without the possibility of losing log messages is that making the logs completely reliable absolutely kills performance. With buffering, rsyslog can handle 400,000 logs/sec on a low-mid range machine. With utterly reliable logs and spinning disks, this rate drops to <100 logs/sec. With a $5K PCI SSD card, you can get up to ~4,000 logs/sec. In both cases, this comes at the cost of not being able to use the disk for anything else on the system; if you do use the disk for anything else, performance drops from there, and pretty rapidly. This is why traditional syslog had a reputation for being very slow.
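
To put those rates in perspective, the following small calculation converts each one into the time available per message. The rates are the approximate figures quoted above (treating "<100" as 100 and "~4,000" as 4,000); the function name and the labels are only illustrative.

```python
def per_message_budget_us(logs_per_second):
    """Average time available per message, in microseconds."""
    return 1_000_000 / logs_per_second


rates = {
    "buffered in RAM": 400_000,
    "PCI SSD, fully reliable": 4_000,
    "spinning disk, fully reliable": 100,
}

for label, rate in rates.items():
    print(f"{label}: {per_message_budget_us(rate):g} microseconds per message")
```

This works out to 2.5 microseconds per message when buffering, 250 microseconds on the SSD card, and 10,000 microseconds (10 ms) on spinning disks, which is what you would expect if every single message has to wait for a physical write to complete.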