The rocket-fast system for log processing

config

RSyslog Windows Agent 4.1 Released

By Adiscon Support | Posted on March 27, 2017 | Posted in News, Release Announcement | Tagged 4.1, bugfix, config, EventLog Monitor, file config, RSyslog Windows Agent

Adiscon is proud to announce the 4.1 release of RSyslog Windows Agent.

Rsyslog Windows Agent is now able to reload its configuration automatically if enabled (which the configuration client does
automatically on first start). It is no longer necessary to restart the service manually.

Performance-enhancing options have been added to EventLog Monitor V1 and V2 and to File Monitor to delay writing the last record/file position back to disk. This can increase performance on machines with a very high eventlog or file load.

Detailed information can be found in the version history below.

Build-IDs: Service 4.1.0.166, Client 4.1.0.246

Features

Updated to OpenSSL 1.0.2k.
Configuration Reload: This is a big new core feature allowing the
service to reload itself automatically after a configuration change has
been detected. The feature can be turned off in General-General Options if
this new behavior is not wanted. By default auto reload will be enabled.
The latest Configuration Client is required for the feature to fully work.
EventLog Monitor V2: Added new options to delay LastRecord save.
Enabling this option will improve processing performance of machines with
a high event volume.
EventLog Monitor V1: Added new option to delay LastRecord save. Enabling
this option will improve processing performance of machines with a high
event volume.
File Monitor: Added new option to delay LastFilePosition save. Enabling
this option will improve processing performance when processing large
growing files.
FileConfig: Changed datafile saving method, more reliable when the
service is stopped unintentionally while updating data state files.
Send Syslog Action: Added new option to enable/disable UTF8 BOM. Default
is enabled like before, but it can be disabled now by configuration so the
message won’t contain the UTF8 BOM.

Bugfixes

Property Engine: Fixed SystemID and CustomerID properties.
FileConfig: Due to a missing property (FilterVersion), some of the global
conditions in rule filters could not be used. This automatically fixes
itself next time the configuration is saved with the Client.
Debug Logging: Completely rewritten debug output for Rule Engine
(Filters) for better readability and analysis.
Fixed a compatibility issue on Windows 2003/XP (failed to start because
WSAPoll API is missing).
FileConfig: Fixed an issue with invalid linefeeds when using includefile
directive.
FileConfig: Fixed EnumRegkey emulation causing EventLog Monitor Services
to load invalid eventlog channels.
Debug Logging: Moved RELP Debugging from minimal to internal
FileMonitor: Fixed issue rewriting filepointer updates each time when
wildcards support was enabled.

Version 4.1 is a free download. Customers with existing 3.x keys can contact our Sales department for upgrade prices. If you have a valid Upgrade Insurance ID, you can request a free new key by sending your Upgrade Insurance ID to sales@adiscon.com. Please note that the download enables the free 30-day trial version if used without a key – so you can right now go ahead and evaluate it.

Using rsyslog to Reindex/Migrate Elasticsearch data

By rgheorghe | Posted on May 10, 2016 | Posted in More complex scenarios | Tagged config, elasticsearch, external inputs, Guides for rsyslog, howto, imtcp, omelasticsearch, reindex, rsyslog, syslog

Original post: Scalable and Flexible Elasticsearch Reindexing via rsyslog by @Sematext

This recipe is useful in two scenarios:

migrating data from one Elasticsearch cluster to another (e.g. when you’re upgrading from Elasticsearch 1.x to 2.x or later)
reindexing data from one index to another in a cluster pre 2.3. For clusters on version 2.3 or later, you can use the Reindex API

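Back to the recipe, we used an external application to scroll through Elasticsearch documents in the source cluster and push them to rsyslog via TCP. Then we used rsyslog’s Elasticsearch output to push logs to the destination cluster. The overall flow would be: source Elasticsearch cluster -> custom script -> rsyslog (TCP input) -> rsyslog (Elasticsearch output) -> destination Elasticsearch cluster.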

This is an easy way to extend rsyslog, using whichever language you’re comfortable with, to support more inputs. Here, we piggyback on the TCP input. You can do a similar job with filters/parsers – you can find GeoIP implementations, for example – by piggybacking the mmexternal module, which uses stdout&stdin for communication. The same is possible for outputs, normally added via the omprog module: we did this to add a Solr output and one for SPM custom metrics.

The custom script in question doesn’t have to be multi-threaded: you can simply spin up more of them, scrolling different indices. In this particular case, using two scripts gave us slightly better throughput, saturating the network.

Writing the custom script

Before starting to write the script, one needs to know what the messages sent to rsyslog would look like. To be able to index data, rsyslog will need an index name, a type name and optionally an ID. In this particular case, we were dealing with logs, so the ID wasn’t necessary.

With this in mind, I see a number of ways of sending data to rsyslog:

one big JSON per line. One can use mmnormalize to parse that JSON, which then allows rsyslog to use values from within it as index name, type name, and so on
for each line, begin with the bits of “extra data” (like index and type names) then put the JSON document that you want to reindex. Again, you can use mmnormalize to parse, but this time you can simply trust that the last thing is a JSON and send it to Elasticsearch directly, without the need to parse it
if you only need to pass two variables (index and type name, in this case), you can piggyback on the vague spec of RFC3164 syslog and send something like
```
destination_index document_type:{"original": "document"}
```

This last option will parse the provided index name in the hostname variable, the type in syslogtag and the original document in msg. A bit hacky, I know, but quite convenient (makes the rsyslog configuration straightforward) and very fast, since we know the RFC3164 parser is very quick and it runs on all messages anyway. No need for mmnormalize, unless you want to change the document in-flight with rsyslog.
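
To make the line format concrete, here is a minimal sketch of a helper that builds such a line. The names `build_line` and `naive_split` are made up for illustration, and the split is only a rough stand-in for what the RFC3164 parser does, not rsyslog’s actual parsing code:

```python
import json


def build_line(index, doc_type, source):
    """Build 'index type:{json}' followed by a newline (one document per line)."""
    return index + ' ' + doc_type + ':' + json.dumps(source) + '\n'


def naive_split(line):
    """Rough illustration of how the three parts are separated.

    This is NOT rsyslog's parser, just a simplified stand-in.
    """
    index, rest = line.rstrip('\n').split(' ', 1)
    doc_type, document = rest.split(':', 1)
    return index, doc_type, document


# placeholder index and type names
line = build_line('logs_new', 'syslog', {'original': 'document'})
print(line, end='')
print(naive_split(line))
```

Splitting only on the first space and the first colon matters here, because the JSON document itself will usually contain both characters.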

Below you can find the Python code that can scan through existing documents in an index (or index pattern, like logstash_2016.05.*) and push them to rsyslog via TCP. You’ll need the Python Elasticsearch client (pip install elasticsearch) and you’d run it like this:

python elasticsearch_to_rsyslog.py source_index destination_index

The script being:

from elasticsearch import Elasticsearch
import json, socket, sys

source_cluster = ['server1', 'server2']
rsyslog_address = '127.0.0.1'
rsyslog_port = 5514

es = Elasticsearch(source_cluster,
      retry_on_timeout=True,
      max_retries=10)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((rsyslog_address, rsyslog_port))


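# with search_type='scan' the first response carries no hits, only a scroll ID
result = es.search(index=sys.argv[1], scroll='1m', search_type='scan', size=500)

while True:
  # fetch the next batch, reusing the scroll ID from the previous response
  result = es.scroll(scroll_id=result['_scroll_id'], scroll='1m')
  if not result['hits']['hits']:
    break
  for hit in result['hits']['hits']:
    line = sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(hit["_source"]) + '\n'
    # sendall() keeps writing until the whole line is sent; encode() yields bytes on Python 3
    s.sendall(line.encode('utf-8'))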

s.close()

If you need to modify messages, you can parse them in rsyslog via mmjsonparse and then add/remove fields through rsyslog’s scripting language. However, I couldn’t find a nice way to change field names (for example, to remove the dots that are forbidden since Elasticsearch 2.0), so I did that in the Python script:

def de_dot(my_dict):
  # build a new dict instead of changing the one we are iterating over
  clean = {}
  for key, value in my_dict.items():
    if isinstance(value, dict):
      # nested objects can contain dotted field names too
      value = de_dot(value)
    clean[key.replace('.', '_')] = value
  return clean

And then the “send” line becomes:

s.sendall((sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(de_dot(hit["_source"])) + '\n').encode('utf-8'))

Configuring rsyslog

The first step here is to make sure you have the latest rsyslog, though the config below works with versions all the way back to 7.x (which can be found in most Linux distributions). You just need to make sure the rsyslog-elasticsearch package is installed, because we need the Elasticsearch output module.

# messages bigger than this are truncated
$maxMessageSize 10000000  # ~10MB

# load the TCP input and the ES output modules
module(load="imtcp")
module(load="omelasticsearch")

main_queue(
  # buffer up to 1M messages in memory
  queue.size="1000000"
  # these threads process messages and send them to Elasticsearch
  queue.workerThreads="4"
  # rsyslog processes messages in batches to avoid queue contention
  # this will also be the Elasticsearch bulk size
  queue.dequeueBatchSize="4000"
)

# we use templates to specify what the data sent to Elasticsearch looks like
template(name="document" type="list"){
  # the "msg" variable contains the document
  property(name="msg")
}
template(name="index" type="list"){
  # "hostname" has the index name
  property(name="hostname")
}
template(name="type" type="list"){
  # "syslogtag" has the type name
  property(name="syslogtag")
}

# start the TCP listener on the port we pointed the Python script to
input(type="imtcp" port="5514")

# sending data to Elasticsearch, using the templates defined earlier
action(type="omelasticsearch"
  template="document"
  dynSearchIndex="on" searchIndex="index"
  dynSearchType="on" searchType="type"
  server="localhost"  # destination Elasticsearch host
  serverport="9200"   # and port
  bulkmode="on"  # use the bulk API
  action.resumeretrycount="-1"  # retry indefinitely if Elasticsearch is unreachable
)

This configuration doesn’t have to disturb your local syslog (i.e. by replacing /etc/rsyslog.conf). You can put it someplace else and run a different rsyslog process:

rsyslogd -i /var/run/rsyslog_reindexer.pid -f /home/me/rsyslog_reindexer.conf

And that’s it! With rsyslog started, you can start the Python script(s) and do the reindexing.
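
If you want to run several copies of the script in parallel, as suggested earlier, a small launcher can start one process per source index. This is only a sketch: the index names and the destination are placeholders, and it assumes elasticsearch_to_rsyslog.py sits in the current directory.

```python
import subprocess
import sys

SCRIPT = 'elasticsearch_to_rsyslog.py'  # the script shown earlier


def build_commands(source_indices, destination_index):
    """Return one command line per source index."""
    return [[sys.executable, SCRIPT, src, destination_index]
            for src in source_indices]


def run_all(commands):
    """Start all commands at once, then wait for each and collect exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == '__main__':
    # placeholder index names, adjust to your own
    cmds = build_commands(['logstash_2016.05.01', 'logstash_2016.05.02'], 'new_index')
    for cmd in cmds:
        print(' '.join(cmd))
    # uncomment to actually start the scripts:
    # print(run_all(cmds))
```

Separate processes keep the script itself simple, since each copy only has to scroll through its own index.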

Monitoring rsyslog’s impstats with Kibana and SPM

By rgheorghe | Posted on April 6, 2016 | Posted in More complex scenarios | Tagged alert, config, elasticsearch, impstats, kibana, liblognorm, mmnormalize, monitoring, omelasticsearch, plugin, rsyslog, ruleset, spm, statistic, templates, v8

Original post: Monitoring rsyslog with Kibana and SPM by @Sematext

A while ago we published this post where we explained how you can get stats about rsyslog, such as the number of messages enqueued, the number of output errors and so on. The point was to send them to Elasticsearch (or Logsene, our logging SaaS, which exposes the Elasticsearch API) in order to analyze them.

This is part 2 of that story, where we share how we process these stats in production. We’ll cover:

an updated config, working with Elasticsearch 2.x
what Kibana dashboards we have in Logsene to get an overview of what rsyslog is doing
how we send some of these metrics to SPM as well, in order to set up alerts on their values: both threshold-based alerts and anomaly detection


Connecting with Logstash via Apache Kafka

By rgheorghe | Posted on October 21, 2015 | Posted in More complex scenarios, Uncategorized | Tagged config, elasticsearch, Guides for rsyslog, imfile, logstash, omkafka, plugin, queues, rsyslog, syslog, templates, v8

Original post: Recipe: rsyslog + Kafka + Logstash by @Sematext

This recipe is similar to the previous rsyslog + Redis + Logstash one, except that we’ll use Kafka as a central buffer and connecting point instead of Redis. You’ll have more of the same advantages:

rsyslog is light and crazy-fast, including when you want it to tail files and parse unstructured data (see the Apache logs + rsyslog + Elasticsearch recipe)
Kafka is awesome at buffering things
Logstash can transform your logs and connect them to N destinations with unmatched ease

There are a couple of differences to the Redis recipe, though:

rsyslog already has Kafka output packages, so it’s easier to set up
Kafka has a different set of features than Redis (trying to avoid flame wars here) when it comes to queues and scaling

As with the other recipes, I’ll show you how to install and configure the needed components. The end result would be that local syslog (and tailed files, if you want to tail them) will end up in Elasticsearch, or a logging SaaS like Logsene (which exposes the Elasticsearch API for both indexing and searching). Of course you can choose to change your rsyslog configuration to parse logs as well (as we’ve shown before), and change Logstash to do other things (like adding GeoIP info).

Getting the ingredients

First of all, you’ll probably need to update rsyslog. Most distros come with ancient versions and don’t have the plugins you need. From the official packages you can install:

rsyslog. This will update the base package, including the file-tailing module
rsyslog-kafka. This will get you the Kafka output module

If you don’t have Kafka already, you can set it up by downloading the binary tar. And then you can follow the quickstart guide. Basically you’ll have to start Zookeeper first (assuming you don’t have one already that you’d want to re-use):

bin/zookeeper-server-start.sh config/zookeeper.properties

And then start Kafka itself and create a simple 1-partition topic that we’ll use for pushing logs from rsyslog to Logstash. Let’s call it rsyslog_logstash:

bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic rsyslog_logstash

Finally, you’ll have Logstash. At the time of writing this, we have a beta of 2.0, which comes with lots of improvements (including huge performance gains of the GeoIP filter I touched on earlier). After downloading and unpacking, you can start it via:

bin/logstash -f logstash.conf

Though you also have packages, in which case you’d put the configuration file in /etc/logstash/conf.d/ and start it with the init script.

Configuring rsyslog

With rsyslog, you’d need to load the needed modules first:

module(load="imuxsock")  # will listen to your local syslog
module(load="imfile")    # if you want to tail files
module(load="omkafka")   # lets you send to Kafka

If you want to tail files, you’d have to add definitions for each group of files like this:

input(type="imfile"
  File="/opt/logs/example*.log"
  Tag="examplelogs"
)

Then you’d need a template that will build JSON documents out of your logs. You would publish these JSON documents to Kafka and consume them with Logstash. Here’s one that works well for plain syslog and tailed files that aren’t parsed via mmnormalize:

template(name="json_lines" type="list" option.json="on") {
  constant(value="{")
  constant(value="\"timestamp\":\"")
  property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"message\":\"")
  property(name="msg")
  constant(value="\",\"host\":\"")
  property(name="hostname")
  constant(value="\",\"severity\":\"")
  property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"")
  property(name="syslogfacility-text")
  constant(value="\",\"syslog-tag\":\"")
  property(name="syslogtag")
  constant(value="\"}")
}
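
To get a feel for what this template emits, here is a sketch in Python that parses one such line the way a consumer might. The sample values are invented for illustration; only the field names come from the template above.

```python
import json

# a made-up line in the shape the json_lines template produces
sample = ('{"timestamp":"2015-10-21T10:15:30+00:00",'
          '"message":"something happened",'
          '"host":"web1",'
          '"severity":"info",'
          '"facility":"daemon",'
          '"syslog-tag":"examplelogs"}')

event = json.loads(sample)
for field in ('timestamp', 'message', 'host', 'severity', 'facility', 'syslog-tag'):
    print(field, '=', repr(event[field]))
```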

By default, rsyslog has a memory queue of 10K messages and has a single thread that works with batches of up to 16 messages (you can find all queue parameters here). You may want to change:
– the batch size, which also controls the maximum number of messages to be sent to Kafka at once
– the number of threads, which would parallelize sending to Kafka as well
– the size of the queue and its nature: in-memory (default), disk or disk-assisted

In an rsyslog->Kafka->Logstash setup I assume you want to keep rsyslog light, so these numbers would be small, like:

main_queue(
  queue.workerthreads="1"      # threads to work on the queue
  queue.dequeueBatchSize="100" # max number of messages to process at once
  queue.size="10000"           # max queue size
)

Finally, to publish to Kafka you’d mainly specify the brokers to connect to (in this example we have one listening to localhost:9092) and the name of the topic we just created:

action(
  broker=["localhost:9092"]
  type="omkafka"
  topic="rsyslog_logstash"
  template="json_lines"  # the template defined above
)

Assuming Kafka is started, rsyslog will keep pushing to it.

Configuring Logstash

This is the part where we pick the JSON logs (as defined in the earlier template) and forward them to the preferred destinations. First, we have the input, which will subscribe to the Kafka topic we created. To connect, we’ll point Logstash to Zookeeper, and it will fetch all the info about Kafka from there:

input {
  kafka {
    zk_connect => "localhost:2181"
    topic_id => "rsyslog_logstash"
  }
}

At this point, you may want to use various filters to change your logs before pushing to Logsene/Elasticsearch. For this last step, you’d use the Elasticsearch output:

output {
  elasticsearch {
    hosts => "localhost" # it used to be "host" pre-2.0
    port => 9200
    #ssl => "true"
    #protocol => "http" # removed in 2.0
  }
}

And that’s it! Now you can use Kibana (or, in the case of Logsene, either Kibana or Logsene’s own UI) to search your logs!

Howto store remote messages in a separate file

By adisconteam | Posted on June 3, 2013 | Posted in Video Tutorials | Tagged config, file, format, legacy, messages, remote, separate, testing, tutorial, Video

In this ~8 minute video Rainer Gerhards describes how to store remote messages in a separate file. It’s actually one of the most frequently asked questions on the rsyslog forum and mailing list.

Note: the tutorial is for legacy config format in order to help most people gain benefit from it.


How can I check the config?

By Adiscon Support | Posted on January 30, 2013 | Posted in FAQ | Tagged check, config, rsyslog

We have often seen the case where someone has rsyslog running and makes changes to the configuration. Usually, rsyslog gets restarted after the changes, but the changed config turns out to be invalid. rsyslog has a function to check the configuration for validity. This can be done very easily by invoking this command:

rsyslogd -N1

(Note that rsyslogd may not be in your search path – then it usually is found in /sbin/rsyslogd)

This tells rsyslog to do a config check. It does NOT run in regular mode, but just checks configuration file correctness. This option is meant to verify a config file. To do so, run rsyslogd interactively in the foreground, specifying -f <config-file> and -N level. The level argument modifies behaviour. Currently, 0 is the same as not specifying the -N option at all (so this makes limited sense) and 1 actually activates the code.

This configuration check only verifies the integrity of the configuration, such as its syntax. Additionally, the modules will be loaded to make sure that they work properly. On the downside, since the engine will not be loaded, errors with permissions or the like cannot be checked. These will occur only when running rsyslog normally.

The verdict for this option is that it is quite useful as a first check of whether the changes were correct, without running the configuration in live mode. This helps prevent rsyslog from being restarted with a fundamentally wrong configuration, which would render it useless because it might not work at all or not work properly.
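
If you restart rsyslog from a deployment script, the check can be wired in as a gate. The following is a minimal sketch rather than an official tool: the function name is made up, and it assumes that rsyslogd signals a failed check through a non-zero exit status.

```python
import subprocess


def config_is_valid(config_file, rsyslogd='rsyslogd', run=subprocess.call):
    """Run 'rsyslogd -N1 -f <config_file>' and report whether it exited with 0.

    Assumption: a failed check is signalled by a non-zero exit status.
    """
    return run([rsyslogd, '-N1', '-f', config_file]) == 0


if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        ok = config_is_valid(sys.argv[1])
        print('config OK' if ok else 'config check failed')
```

The `run` parameter exists only so the function can be tried out without rsyslogd installed. As noted above, pass the full path (for example /sbin/rsyslogd) as `rsyslogd` if the binary is not in your search path.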

How to check if config variables are used?

By Adiscon Support | Posted on July 6, 2012 | Posted in FAQ | Tagged config, debug, rsyslog

Sometimes you might wonder whether the configuration you created is really used, or at least whether all parts of it are. This can happen in a lot of situations. Currently, the config format is changing a lot, which forces users who want the new format to mix old and new config styles. This is where a lot of confusion can occur, resulting in config variables that are not set properly. What you need is consistency in the configuration. Most output modules are already updated. If you want to use the new format, you cannot use some old config directives and some new config directives at the same time: the old ones will simply be ignored, and the default values will be used instead.

A very common case is with queues, and this is what I will use for my example. I will show how you can identify whether configuration directives are used correctly, using the debug log. The rsyslog version I will use is the 6.3.12 beta. In this version, the main message queue still needs to be configured with the old config directives, whereas action queues already support the new config directives. You can enable debug mode for rsyslog with the following commands in a terminal:

export RSYSLOG_DEBUGLOG="/path/to/debuglog"
export RSYSLOG_DEBUG="Debug"

You can now start rsyslog on the same command line with:

rsyslogd -c6

You will usually see the debug output in that same terminal, as rsyslog runs in the foreground. For now, that just serves as an indicator that the debug output works. Here is the config snippet I used:

$ActionQueueType LinkedList
$ActionQueueSize 2000000
$ActionQueueTimeoutEnqueue 0
$ActionQueueDequeueBatchSize 400
*.* action(type="omfwd" target="10.10.10.12" port="514" protocol="udp")

As you can see, we configured the action queue with some custom variables in the old style and the action itself in the new style. If you create a new debug file now and review it, search for “action 1” in this case. You should see the following. This screenshot is an excerpt of the debug log. I marked several positions. The green circles show that the action parameters have been used correctly. The red circles show two of the action queue’s parameters, where the defaults have been used. This is just an example of how to identify whether your configuration was loaded successfully.
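
Debug logs get large quickly, so it can help to pull out only the lines of interest. Below is a minimal sketch that lists every line mentioning “action 1”. The function name is made up, and nothing is assumed about the debug log format beyond it being a text file.

```python
def lines_mentioning(path, needle='action 1'):
    """Return (line_number, text) for every line of the file that contains needle."""
    matches = []
    with open(path, 'r', errors='replace') as log:
        for number, text in enumerate(log, start=1):
            if needle in text:
                matches.append((number, text.rstrip('\n')))
    return matches


if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        # usage: python find_action.py /path/to/debuglog
        for number, text in lines_mentioning(sys.argv[1]):
            print('%d: %s' % (number, text))
```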

Changelog for 6.3.7 (v6-devel)

By Adiscon Support | Posted on February 2, 2012 | Posted in Changelog | Tagged 6.3.7, bugfix, config, devel, imklog, license, rsyslog, v6, x.509

Version 6.3.7 [DEVEL] 2012-02-02

imported refactored v5.9.6 imklog linux driver, now combined with BSD driver
removed imtemplate/omtemplate template modules, as this was waste of time
The actual input/output modules are better copy templates. Instead, the now-removed modules cost time for maintenance AND often caused confusion on what their role was.
added new stats objects
improved support for new v6 config system. The built-in output modules now all support the new config language
bugfix: facility local<x> was not correctly interpreted in legacy filters
Was only accepted if it was the first PRI in a multi-filter PRI. Thanks to forum user Mark for bringing this to our attention.
bugfix: potential abort after reading invalid X.509 certificate
closes: http://bugzilla.adiscon.com/show_bug.cgi?id=290
Thanks to Tomas Heinrich for the patch
bugfix: legacy parsing of some filters did not work correctly
bugfix: rsyslog aborted during startup if there is an error in loading an action and legacy configuration mode is used
bugfix: bsd klog driver no longer compiled
relicensed larger parts of the code under Apache (ASL) 2.0

rsyslog 6.3.3 (devel) released

By adisconteam | Posted on July 13, 2011 | Posted in News, Release Announcement | Tagged 6.3.3, config, devel, format, rainerscript, release, v6

This is a very important milestone release. It features the new config parser and thus provides the basis for a more intuitive config format. With 6.3.3 there are already some enhancements to the format. However, more changes will come up with the next minor releases. For details, please check this link:

http://www.rsyslog.com/rsyslog-6-3-3-config-format-improvements/

It is worth noting that the performance of script-based filters (“if … then”) has been notably improved. Preliminary benchmarks show an improvement of at least a factor of three (more detailed benchmarks will be done after the new scoped object statements have been introduced).

We would appreciate early adoption of this release. One goal in releasing it is to see if the new parser actually is able to handle all legacy configurations found in practice (note that the parser was written from scratch).

ChangeLog:

http://www.rsyslog.com/changelog-for-6-3-3-v6-devel/

Download:

http://www.rsyslog.com/rsyslog-6-3-3-devel/

As always, feedback is appreciated.

Best regards,
Tom Bergfeld

rsyslog 6.3.3 (devel)

By adisconteam | Posted on July 13, 2011 | Posted in devel, Download | Tagged 6.3.3, config, devel, format, rainerscript, v6

Download file name: rsyslog 6.3.3 (devel)

rsyslog 6.3.3 (devel)
md5sum: f0ef4a1760eaf4498fba3f5bdc969d8e
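
To compare a downloaded file against the md5sum above, something like the following sketch can be used. The function names are made up for illustration, and the path of the downloaded file is whatever you saved it as.

```python
import hashlib

EXPECTED_MD5 = 'f0ef4a1760eaf4498fba3f5bdc969d8e'  # from the listing above


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```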

Author: Rainer Gerhards (rgerhards@adiscon.com)
Version: 6.3.3 File size: 2.4 MB

Download this file now!