plugin

Monitoring rsyslog’s impstats with Kibana and SPM

Original post: Monitoring rsyslog with Kibana and SPM by @Sematext

A while ago we published this post where we explained how you can get stats about rsyslog, such as the number of messages enqueued, the number of output errors and so on. The point was to send them to Elasticsearch (or Logsene, our logging SaaS, which exposes the Elasticsearch API) in order to analyze them.

This is part 2 of that story, where we share how we process these stats in production. We’ll cover:

  • an updated config, working with Elasticsearch 2.x
  • what Kibana dashboards we have in Logsene to get an overview of what rsyslog is doing
  • how we send some of these metrics to SPM as well, in order to set up alerts on their values: both threshold-based alerts and anomaly detection

Continue reading “Monitoring rsyslog’s impstats with Kibana and SPM”

Connecting with Logstash via Apache Kafka

Original post: Recipe: rsyslog + Kafka + Logstash by @Sematext

This recipe is similar to the previous rsyslog + Redis + Logstash one, except that we’ll use Kafka as a central buffer and connecting point instead of Redis. You’ll have more of the same advantages:

  • rsyslog is light and crazy-fast, including when you want it to tail files and parse unstructured data (see the Apache logs + rsyslog + Elasticsearch recipe)
  • Kafka is awesome at buffering things
  • Logstash can transform your logs and connect them to N destinations with unmatched ease

There are a couple of differences to the Redis recipe, though:

  • rsyslog already has Kafka output packages, so it’s easier to set up
  • Kafka has a different set of features than Redis (trying to avoid flame wars here) when it comes to queues and scaling

As with the other recipes, I’ll show you how to install and configure the needed components. The end result would be that local syslog (and tailed files, if you want to tail them) will end up in Elasticsearch, or a logging SaaS like Logsene (which exposes the Elasticsearch API for both indexing and searching). Of course you can choose to change your rsyslog configuration to parse logs as well (as we’ve shown before), and change Logstash to do other things (like adding GeoIP info).

Getting the ingredients

First of all, you’ll probably need to update rsyslog. Most distros come with ancient versions and don’t have the plugins you need. From the official packages you can install:

If you don’t have Kafka already, you can set it up by downloading the binary tar. And then you can follow the quickstart guide. Basically you’ll have to start Zookeeper first (assuming you don’t have one already that you’d want to re-use):

bin/zookeeper-server-start.sh config/zookeeper.properties

And then start Kafka itself and create a simple 1-partition topic that we’ll use for pushing logs from rsyslog to Logstash. Let’s call it rsyslog_logstash:

bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic rsyslog_logstash

Finally, you’ll have Logstash. At the time of writing this, we have a beta of 2.0, which comes with lots of improvements (including huge performance gains of the GeoIP filter I touched on earlier). After downloading and unpacking, you can start it via:

bin/logstash -f logstash.conf

Though you also have packages, in which case you’d put the configuration file in /etc/logstash/conf.d/ and start it with the init script.

Configuring rsyslog

With rsyslog, you’d need to load the needed modules first:

module(load="imuxsock")  # will listen to your local syslog
module(load="imfile")    # if you want to tail files
module(load="omkafka")   # lets you send to Kafka

If you want to tail files, you’d have to add definitions for each group of files like this:

input(type="imfile"
  File="/opt/logs/example*.log"
  Tag="examplelogs"
)

Then you’d need a template that will build JSON documents out of your logs. You would publish these JSON’s to Kafka and consume them with Logstash. Here’s one that works well for plain syslog and tailed files that aren’t parsed via mmnormalize:

template(name="json_lines" type="list" option.json="on") {
  constant(value="{")
  constant(value="\"timestamp\":\"")
  property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"message\":\"")
  property(name="msg")
  constant(value="\",\"host\":\"")
  property(name="hostname")
  constant(value="\",\"severity\":\"")
  property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"")
  property(name="syslogfacility-text")
  constant(value="\",\"syslog-tag\":\"")
  property(name="syslogtag")
  constant(value="\"}")
}

By default, rsyslog has a memory queue of 10K messages and has a single thread that works with batches of up to 16 messages (you can find all queue parameters here). You may want to change:
– the batch size, which also controls the maximum number of messages to be sent to Kafka at once
– the number of threads, which would parallelize sending to Kafka as well
– the size of the queue and its nature: in-memory(default), disk or disk-assisted

In a rsyslog->Kafka->Logstash setup I assume you want to keep rsyslog light, so these numbers would be small, like:

main_queue(
  queue.workerthreads="1"      # threads to work on the queue
  queue.dequeueBatchSize="100" # max number of messages to process at once
  queue.size="10000"           # max queue size
)

Finally, to publish to Kafka you’d mainly specify the brokers to connect to (in this example we have one listening to localhost:9092) and the name of the topic we just created:

action(
  broker=["localhost:9092"]
  type="omkafka"
  topic="rsyslog_logstash"
  template="json"
)

Assuming Kafka is started, rsyslog will keep pushing to it.

Configuring Logstash

This is the part where we pick the JSON logs (as defined in the earlier template) and forward them to the preferred destinations. First, we have the input, which will use to the Kafka topic we created. To connect, we’ll point Logstash to Zookeeper, and it will fetch all the info about Kafka from there:

input {
  kafka {
    zk_connect => "localhost:2181"
    topic_id => "rsyslog_logstash"
  }
}

At this point, you may want to use various filters to change your logs before pushing to Logsene/Elasticsearch. For this last step, you’d use the Elasticsearch output:

output {
  elasticsearch {
    hosts => "localhost" # it used to be "host" pre-2.0
    port => 9200
    #ssl => "true"
    #protocol => "http" # removed in 2.0
  }
}

And that’s it! Now you can use Kibana (or, in the case of Logsene, either Kibana or Logsene’s own UI) to search your logs!

rsyslog v8 improvements and how to write plugins in any language

In the first part, we will explain the new RSYSLOG v8 engine, its motivation and its benefits. Learn, for example, why writing to Elasticsearch is much faster with the new engine. We will describe the tuning parameters vital for making best use of the new features.

In the second part we will explain how to write RSYSLOG plugins in any language. Traditionally, writing rsyslog plugins has been considered quite hard, with at least C knowledge necessary. In v8, we have introduced new interfaces which make it possible to write plugins in any language – be it Python, Perl or Java. Even bash will do. In essence, this is a great tool for any admin to add special needs with just a bit of scripting. We will proivde concrete instructions on how to write a plugin, point to read-to-copy samples and tell how to integrate this into rsyslog.

NOTE: This is Rainers LinuxTag Berlin 2014 talk.

Changelog for 8.3.0 (v8-devel)

Version 8.3.0 [v8-devel] 2014-04-10

  • new plugin for anonymizing credit card numbers
    Thanks to Peter Slavov for providing the code.
  • external message modification modules are now supported
    They are bound via the new native module “mmexternal”. Also, a sample skeleton for an external python message modification module has been added.
  • new $jsonmesg property with JSON representation of whole message object
    closes: https://github.com/rsyslog/rsyslog/issues/19
  • improved error message for invalid field extraction in string template
    see also:
    http://kb.monitorware.com/problem-with-field-based-extraction-t12299.html
  • fix build problems on Solaris
  • NOTE: a json-c API that we begun to use requires the compiler to be in c99 mode. By default, we select it automatically. If you modify this and use gcc, be sure to include “-std=c99” in your compiler flags. This seems to be necessary only for older versions of gcc.

rsyslog statistic counter plugin imuxsocks

Plugin – imuxsock

This plugin maintains a global statistics with the following properties:

  • submitted – total number of messages submitted for processing since startup
  • ratelimit.discarded – number of messages discarded due to rate limiting
  • ratelimit.numratelimiters – number of currently active rate limiters (small data structures used for the rate limiting logic)

Back to statistics counter overview

rsyslog statistic counter plugin imudp

Plugin – imudp

This plugin maintains statistics for each listener and for each worker thread.

The listener statistic is named starting with “imudp”, followed followed by the listener IP, a colon and port in parenthesis. For example, the counter for a listener on port 514 (on all IPs) with no set name is called “imudp(*:514)”.

If an “inputname” is defined for a listener, that inputname is used instead of “imudp” as statistic name. For example, if the inputname is set to “myudpinut”, that corresponding statistic name in above case would be “myudpinput(*:514)”. This has been introduced in 7.5.3.

The following properties are maintained for each listener:

  • submitted – total number of messages submitted for processing since startup

The worker thread (in short: worker) statistic is named “imudp(wX)” where “X” is the worker thread ID, which is an monotonically increasing integer starting at 0. This means the first worker will have the name “imudp(w0”), the second “imudp(w1)” and so on. Note that workers are all equal. It doesn’t really matter which worker processes which messages, so the actual worker ID is not of much concern. More interesting is to check how the load is spread between the worker. Also note that there is no fixed worker-to-listener relationship: all workers process messages from all listeners.

Note: worker thread statistics are available starting with rsyslog 7.5.5.

The following properties are maintained for each worker thread:

  • called.recvmmsg – number of recvmmsg() OS calls done
  • called.recvmsg – number of recvmsg() OS calls done
  • msgs.received – number of actual messages received

Note: usually either one of “called.recvmmsg” or “called.recvmsg” is non-zero. This is because these are alternative kernel interfaces and the one to be used is selected based on kernel capabilities. However, there are two counters, as the system call type used has important performance implications, and so we thought this information needs to be exposed.

On systems supporting recvmmsg, the quotient msgs.received / caled.recvmmsg tells the average number of messages that could be pulled from the kernel buffers with a single syscall.

Back to statistics counter overview

rsyslog statistic counter plugin imtcp

Plugin – imtcp

This plugin maintains statistics for each listener. The statistic is named after the given input name (or “imtcp” if none is configured), followed by the listener port in parenthesis. For example, the counter for a listener on port 514 with no set name is called “imtcp(514)”.

The following properties are maintained for each listener:

  • submitted – total number of messages submitted for processing since startup

Back to statistics counter overview

rsyslog statistic counter plugin imptcp

Plugin – imptcp

This plugin maintains statistics for each listener. The statistic is named “imtcp” , followed by the bound address, listener port and IP version in parenthesis. For example, the counter for a listener on port 514, bound to all interfaces and listening on IPv6 is called “imptcp(*/514/IPv6)”.

The following properties are maintained for each listener:

  • submitted – total number of messages submitted for processing since startup

Back to statistics counter overview

rsyslog statistic counter plugin impstats

Plugin – impstats (rsyslog 7.5.3+) The impstats plugin gathers some internal statistics. They have different names depending on the actual statistics. Obviously, they do not relate to the plugin itself but rather to a broader object – most notably the rsyslog process itself. The “resource-usage” counter maintains process statistics. They base on the getrusage() system call. The counters are named like getrusage returned data memebers. So for details, looking them up in “man getrusage” is highly recommended, especially as value may be different depending on the platform. A getrusage() call is done immediately before the counter is emitted. The follwowing individual counters are maintained:

  • utime – this is the user time in microseconds (thus the timeval structure combined)
  • stime – again, time given in microseconds
  • maxrss
  • minflt
  • majflt
  • inblock
  • oublock
  • nvcsw
  • nivcsw

Back to statistics counter overview

rsyslog statistic counter plugin omfile

Plugin – omfile (rsyslog 7.3.6+)

This plugin maintains statistics for each dynafile cache. Dynafile cache performance is critical for overall system performance, so reviewing these counters on a busy system (especially one experiencing performance problems) is advisable. The statistic is named “dynafile cache”, followed by the template name used for this dynafile action.

The following properties are maintained for each dynafile:

  • requests – total number of requests made to obtain a dynafile
  • level0 – requests for the current active file, so no real cache lookup needed to be done. These are extremely good.
  • missed – cache misses, where the required file did not reside in cache. Even with a perfect cache, there will be at least one miss per file. That happens when the file is being accessed for the first time and brought into cache. So “missed” will always be at least as large as the number of different files processed.
  • evicted – the number of times a file needed to be evicted from the cache as it ran out of space. These can simply happen when date-based files are used, and the previous date files are being removed from the cache as time progresses. It is better, though, to set an appropriate “closeTimeout” (counter described below), so that files are removed from the cache after they become no longer accessed. It is bad if active files need to be evicted from the cache. This is a very costly operation as an evict requires to close the file (thus a full flush, no matter of its buffer state) and a later access requires a re-open – and the eviction of another file, as the cache obviously has run out of free entries. If this happens frequently, it can severely affect performance. So a high eviction rate is a sign that the dynafile cache size should be increased. If it is already very high, it is recommended to re-think about the design of the file store, at least if the eviction process causes real performance problems.
  • maxused – the maximum number of cache entries ever used. This can be used to trim the cache down to a value that’s actually useful but does not waste resources. Note that when date-based files are used and rsyslog is run for an extended period of time, the cache gradually fills up to the max configured value as older files are migrated out of it. This will make “maxused” questionable after some time. Frequently enough purging the cache can prevent this (usually, once a day is sufficient).
  • closetimeouts –  available since 8.3.3 – tells how often a file was closed due to timeout settings (“closeTimeout” action parameter). These are cases where dynafiles or static files have been closed by rsyslog due to inactivity. Note that if no “closeTimeout” is specified for the action, this counter always is zero. A high or low number in itself doesn’t mean anything good or bad. It totally depends on the use case, so no general advise can be given.

Note that the dynafile caches are purged when a HUP is sent.

Back to statistics counter overview

Scroll to top