rsyslog

rsyslog 8.19.0 (v8-stable) released

We have released rsyslog 8.19.0.

This is mostly a bug-fixing release. Among the big number of fixes are a few additions to the testbench and some minor enhancements for several modules (like imrelp, omelasticsearch) to provide more convenience.

To get a full overview over the changes, please take a look at the Changelog.
ChangeLog:

Using rsyslog to Reindex/Migrate Elasticsearch data

Original post: Scalable and Flexible Elasticsearch Reindexing via rsyslog by @Sematext

This recipe is useful in a two scenarios:

  • migrating data from one Elasticsearch cluster to another (e.g. when you’re upgrading from Elasticsearch 1.x to 2.x or later)
  • reindexing data from one index to another in a cluster pre 2.3. For clusters on version 2.3 or later, you can use the Reindex API

Back to the recipe, we used an external application to scroll through Elasticsearch documents in the source cluster and push them to rsyslog via TCP. Then we used rsyslog’s Elasticsearch output to push logs to the destination cluster. The overall flow would be:

rsyslog to Elasticsearch reindex flow

This is an easy way to extend rsyslog, using whichever language you’re comfortable with, to support more inputs. Here, we piggyback on the TCP input. You can do a similar job with filters/parsers – you can find GeoIP implementations, for example – by piggybacking the mmexternal module, which uses stdout&stdin for communication. The same is possible for outputs, normally added via the omprog module: we did this to add a Solr output and one for SPM custom metrics.

The custom script in question doesn’t have to be multi-threaded, you can simply spin up more of them, scrolling different indices. In this particular case, using two scripts gave us slightly better throughput, saturating the network:

rsyslog to Elasticsearch reindex flow multiple scripts

Writing the custom script

Before starting to write the script, one needs to know how the messages sent to rsyslog would look like. To be able to index data, rsyslog will need an index name, a type name and optionally an ID. In this particular case, we were dealing with logs, so the ID wasn’t necessary.

With this in mind, I see a number of ways of sending data to rsyslog:

  • one big JSON per line. One can use mmnormalize to parse that JSON, which then allows rsyslog do use values from within it as index name, type name, and so on
  • for each line, begin with the bits of “extra data” (like index and type names) then put the JSON document that you want to reindex. Again, you can use mmnormalize to parse, but this time you can simply trust that the last thing is a JSON and send it to Elasticsearch directly, without the need to parse it
  • if you only need to pass two variables (index and type name, in this case), you can piggyback on the vague spec of RFC3164 syslog and send something like
    destination_index document_type:{"original": "document"}
    

This last option will parse the provided index name in the hostname variable, the type in syslogtag and the original document in msg. A bit hacky, I know, but quite convenient (makes the rsyslog configuration straightforward) and very fast, since we know the RFC3164 parser is very quick and it runs on all messages anyway. No need for mmnormalize, unless you want to change the document in-flight with rsyslog.

Below you can find the Python code that can scan through existing documents in an index (or index pattern, like logstash_2016.05.*) and push them to rsyslog via TCP. You’ll need the Python Elasticsearch client (pip install elasticsearch) and you’d run it like this:

python elasticsearch_to_rsyslog.py source_index destination_index

The script being:

from elasticsearch import Elasticsearch
import json, socket, sys

source_cluster = ['server1', 'server2']
rsyslog_address = '127.0.0.1'
rsyslog_port = 5514

es = Elasticsearch(source_cluster,
      retry_on_timeout=True,
      max_retries=10)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((rsyslog_address, rsyslog_port))


result = es.search(index=sys.argv[1], scroll='1m', search_type='scan', size=500)

while True:
  res = es.scroll(scroll_id=result['_scroll_id'], scroll='1m')
  for hit in result['hits']['hits']:
    s.send(sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(hit["_source"])+'\n')
  if not result['hits']['hits']:
    break

s.close()

If you need to modify messages, you can parse them in rsyslog via mmjsonparse and then add/remove fields though rsyslog’s scripting language. Though I couldn’t find a nice way to change field names – for example to remove the dots that are forbidden since Elasticsearch 2.0 – so I did that in the Python script:

def de_dot(my_dict):
  for key, value in my_dict.iteritems():
    if '.' in key:
      my_dict[key.replace('.','_')] = my_dict.pop(key)
    if type(value) is dict:
      my_dict[key] = de_dot(my_dict.pop(key))
  return my_dict

And then the “send” line becomes:

s.send(sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(de_dot(hit["_source"]))+'\n')

Configuring rsyslog

The first step here is to make sure you have the lastest rsyslog, though the config below works with versions all the way back to 7.x (which can be found in most Linux distributions). You just need to make sure the rsyslog-elasticsearch package is installed, because we need the Elasticsearch output module.

# messages bigger than this are truncated
$maxMessageSize 10000000  # ~10MB

# load the TCP input and the ES output modules
module(load="imtcp")
module(load="omelasticsearch")

main_queue(
  # buffer up to 1M messages in memory
  queue.size="1000000"
  # these threads process messages and send them to Elasticsearch
  queue.workerThreads="4"
  # rsyslog processes messages in batches to avoid queue contention
  # this will also be the Elasticsearch bulk size
  queue.dequeueBatchSize="4000"
)

# we use templates to specify how the data sent to Elasticsearch looks like
template(name="document" type="list"){
  # the "msg" variable contains the document
  property(name="msg")
}
template(name="index" type="list"){
  # "hostname" has the index name
  property(name="hostname")
}
template(name="type" type="list"){
  # "syslogtag" has the type name
  property(name="syslogtag")
}

# start the TCP listener on the port we pointed the Python script to
input(type="imtcp" port="5514")

# sending data to Elasticsearch, using the templates defined earlier
action(type="omelasticsearch"
  template="document"
  dynSearchIndex="on" searchIndex="index"
  dynSearchType="on" searchType="type"
  server="localhost"  # destination Elasticsearch host
  serverport="9200"   # and port
  bulkmode="on"  # use the bulk API
  action.resumeretrycount="-1"  # retry indefinitely if Elasticsearch is unreachable
)

This configuration doesn’t have to disturb your local syslog (i.e. by replacing /etc/rsyslog.conf). You can put it someplace else and run a different rsyslog process:

rsyslogd -i /var/run/rsyslog_reindexer.pid -f /home/me/rsyslog_reindexer.conf

And that’s it! With rsyslog started, you can start the Python script(s) and do the reindexing.

rsyslog 8.18.0 (v8-stable) released

We have released rsyslog 8.18.0.

This is mostly a bug-fixing release. Among the big number of fixes are a few additions to the testbench and some minor enhancements for several modules (like redis, omkafka, imfile) to provide more convenience.

To get a full overview over the changes, please take a look at the Changelog.
ChangeLog:

Changelog for 8.18.0 (v8-stable)

Version 8.18.0 [v8-stable] 2016-04-19

  • testbench: When running privdrop tests testbench tries to drop
    user to “rsyslog”, “syslog” or “daemon” when running as root and
    you don’t explict set RSYSLOG_TESTUSER environment variable.
    Make sure the unprivileged testuser can write into tests/ dir!
  • templates: add option to convert timestamps to UTC
    closes https://github.com/rsyslog/rsyslog/issues/730
  • omjournal: fix segfault (regression in 8.17.0)
  • imptcp: added AF_UNIX support
    Thanks to Nathan Brown for implementing this feature.
  • new template options
    • compressSpace
    • date-utc
  • redis: support for authentication
    Thanks to Manohar Ht for the patch
  • omkafka: makes kafka-producer on-HUP restart optional
    As of now, omkafka kills and re-creates kafka-producer on HUP. This
    is not always desirable. This change introduces an action param
    (reopenOnHup=”on|off”) which allows user to control re-cycling of
    kafka-producer.
    It defaults to on (for backward compatibility). Off allows user to
    ignore HUP as far as kafka-producer is concerned.
    Thanks to Janmejay Singh for implementing this feature
  • imfile: new “FreshStartTail” input parameter
    Thanks to Curu Wong for implementing this.
  • omjournal: fix libfastjson API issues
    This module accessed private data members of libfastjson
  • ommongodb: fix json API issues
    This module accessed private data members of libfastjson
  • testbench improvements (more tests and more thourough tests)
    among others:

    • tests for omjournal added
    • tests for KSI subsystem
    • tests for priviledge drop statements
    • basic test for RELP with TLS
    • some previously disabled tests have been re-enabled
  • dynamic stats subsystem: a couple of smaller changes
    they also involve the format, which is slightly incompatible to
    previous version. As this was out only very recently (last version),
    we considered this as acceptable.
    Thanks to Janmejay Singh for developing this.
  • foreach loop: now also iterates over objects (not just arrays)
    Thanks to Janmejay Singh for developing this.
  • improvements to the CI environment
  • enhancement: queue subsystem is more robst in regard to some corruptions
    It is now detected if a .qi file states that the queue contains more
    records than there are actually inside the queue files. Previously this
    resulted in an emergency switch to direct mode, now the problem is only
    reported but processing continues.
  • enhancement: Allow rsyslog to bind UDP ports even w/out specific
    interface being up at the moment.
    Alternatively, rsyslog could be ordered after networking, however,
    that might have some negative side effects. Also IP_FREEBIND is
    recommended by systemd documentation.
    Thanks to Nirmoy Das and Marius Tomaschewski for the patch.
  • cleanup: removed no longer needed json-c compatibility layer
    as we now always use libfastjson, we do not need to support old
    versions of json-c (libfastjson was based on the newest json-c
    version at the time of the fork, which is the newest in regard
    to the compatibility layer)
  • new External plugin for sending metrics to SPM Monitoring SaaS
    Thanks to Radu Gheorghe for the patch.
  • bugfix imfile: fix memory corruption bug when appending @cee
    Thanks to Brian Knox for the patch.
  • bugfix: memory misallocation if position.from and position.to is used
    a negative amount of memory is tried to be allocated if position.from
    is smaller than the buffer size (at least with json variables). This
    usually leads to a segfault.
    closes https://github.com/rsyslog/rsyslog/issues/915
  • bugfix: fix potential memleak in TCP allowed sender definition
    depending on circumstances, a very small leak could happen on each
    HUP. This was caused by an invalid macro definition which did not rule
    out side effects.
  • bugfix: $PrivDropToGroupID actually did a name lookup
    … instead of using the provided ID
  • bugfix: small memory leak in imfile
    Thanks to Tomas Heinrich for the patch.
  • bugfix: double free in jsonmesg template
    There has to be actual json data in the message (from mmjsonparse,
    mmnormalize, imjournal, …) to trigger the crash.
    Thanks to Tomas Heinrich for the patch.
  • bugfix: incorrect formatting of stats when CEE/Json format is used
    This lead to ill-formed json being generated
  • bugfix omfwd: new-style keepalive action parameters did not work
    due to being inconsistently spelled inside the code. Note that legacy
    parameters $keepalive… always worked
    see also: https://github.com/rsyslog/rsyslog/issues/916
    Thanks to Devin Christensen for alerting us and an analysis of the
    root cause.
  • bugfix: memory leaks in logctl utility
    Detected by clang static analyzer. Note that these leaks CAN happen in
    practice and may even be pretty large. This was probably never detected
    because the tool is not often used.
  • bugfix omrelp: fix segfault if no port action parameter was given
    closes https://github.com/rsyslog/rsyslog/issues/911
  • bugfix imtcp: Messages not terminated by a NL were discarded
    … upon connection termination.
    Thanks to Tomas Heinrich for the patch.

Monitoring rsyslog’s impstats with Kibana and SPM

Original post: Monitoring rsyslog with Kibana and SPM by @Sematext

A while ago we published this post where we explained how you can get stats about rsyslog, such as the number of messages enqueued, the number of output errors and so on. The point was to send them to Elasticsearch (or Logsene, our logging SaaS, which exposes the Elasticsearch API) in order to analyze them.

This is part 2 of that story, where we share how we process these stats in production. We’ll cover:

  • an updated config, working with Elasticsearch 2.x
  • what Kibana dashboards we have in Logsene to get an overview of what rsyslog is doing
  • how we send some of these metrics to SPM as well, in order to set up alerts on their values: both threshold-based alerts and anomaly detection

Continue reading “Monitoring rsyslog’s impstats with Kibana and SPM”

rsyslog 8.17.0 (v8-stable) released

We have released rsyslog 8.17.0.

This release brings, among a few bugfixes, a lot of brand-new features. The most important change is probably the libfastjson requirement, which replaces the json-c dependency. There is a new contributed plugin called omampq1 for AMQP 1.0 compliant brokers, a new experimental lookup table support, dynamic statistics counters and many many more.

To get a full overview over the changes, please take a look at the Changelog.
ChangeLog:

http://www.rsyslog.com/changelog-for-8-17-0-v8-stable/

Download:

http://www.rsyslog.com/downloads/download-v8-stable/

As always, feedback is appreciated.

Best regards,
Florian Riedl

Changelog for 8.17.0 (v8-stable)

Version 8.17.0 [v8-stable] 2016-03-08

  • NEW REQUIREMENT: libfastjson
    see also:
    http://blog.gerhards.net/2015/12/rsyslog-and-liblognorm-will-switch-to.html
  • new testbench requirement: faketime command line tool
    This is used to generate a controlled environment for time-based tests; if
    not available, tests will gracefully be skipped.
  • improve json variable performance
    We use libfastjson’s alternative hash function, which has been
    proven to be much faster than the default one (which stems
    back to libjson-c). This should bring an overall performance
    improvement for all operations involving variable processing.
    closes https://github.com/rsyslog/rsyslog/issues/848
  • new experimental feature: lookup table suport
    Note that at this time, this is an experimental feature which is not yet
    fully supported by the rsyslog team. It is introduced in order to gain
    more feedback and to make it available as early as possible because many
    people consider it useful.
    Thanks to Janmejay Singh for implementing this feature
  • new feature: dynamic statistics counters
    which may be changed during rule processing
    Thanks to Janmejay Singh for suggesting and implementing this feature
  • new contributed plugin: omampq1 for AMQP 1.0-compliant brokers
    Thanks to Ken Giusti for this module
  • new set of UTC-based $now family of variables ($now-utc, $year-utc, …)
  • simplified locking when accessing message and local variables
    this simlifies the code and slightly increases performance if such
    variables are heavily accessed.
  • new global parameter “debug.unloadModules”
    This permits to disable unloading of modules, e.g. to make valgrind
    reports more useful (without a need to recompile).
  • timestamp handling: guard against invalid dates
    We do not permit dates outside of the year 1970..2100
    interval. Note that network-receivers do already guard
    against this, so the new guard only guards against invalid
    system time.
  • imfile: add “trimlineoverbytes” input paramter
    Thanks to github user JindongChen for the patch.
  • ommongodb: add support for extended json format for dates
    Thanks to Florian Bücklers for the patch.
  • omjournal: add support for templates
    see also: https://github.com/rsyslog/rsyslog/pull/770
    Thanks to github user bobthemighty for the patch
  • imuxsock: add “ruleset” input parameter
  • testbench: framework improvement: configs can be included in test file
    they do no longer need to be in a separate file, which saves a bit
    of work when working with them. This is supported for simple tests with
    a single running rsyslog instance
    Thanks to Janmejay Singh for inspiring me with a similar method in
    liblognorm testbench.
  • imptcp: performance improvements
    Thanks to Janmejay Singh for implementing this improvement
  • made build compile (almost) without warnings
    still some warnings are suppressed where this is currently required
  • improve interface definition in some modules, e.g. mmanon, mmsequence
    This is more an internal cleanup and should have no actual affect to
    the end user.
  • solaris build: MAXHOSTNAMELEN properly detected
  • build system improvement: ability to detect old hiredis libs
    This permits to automatically build omhiredis on systems where the
    hiredis libs do not provide a pkgconfig file. Previsouly, this
    required manual configuration.
    Thanks to github user jaymell for the patch.
  • rsgtutil: dump mode improvements
    • auto-detect signature file type
    • ability to dump hash chains for log extraction files
  • build system: fix build issues with clang
    clang builds often failed with a missing external symbol
    “rpl_malloc”. This was caused by checks in configure.ac,
    which checked for specific GNU semantics. As we do not need
    them (we never ask malloc for zero bytes), we can safely
    remove the macros.
    Note that we routinely run clang static analyer in CI and
    it also detects such calls as invalid.
    closes https://github.com/rsyslog/rsyslog/issues/834
  • bugfix: unixtimestamp date format was incorrectly computed
    The problem happened in leap year from March til then end
    of year and healed itself at the begining of the next year.
    During the problem period, the timestamp was 24 hours too low.
    fixes https://github.com/rsyslog/rsyslog/issues/830
  • bugfix: date-ordinal date format was incorrectly computed
    same root cause aus for unixtimestamp and same triggering
    condition. During the affected perido, the ordinal was one
    too less.
  • bugfix: some race when shutting down input module threads
    this had little, if at all, effect on real deployments as it resulted
    in a small leak right before rsyslog termination. However, it caused
    trouble with the testbench (and other QA tools).
    Thanks to Peter Portante for the patch and both Peter and Janmejay
    Singh for helping to analyze what was going on.
  • bugfix tcpflood: did not handle connection drops correct in TLS case
    note that tcpflood is a testbench too. The bug caused some testbench
    instability, but had no effect on deplyments.
  • bugfix: abort if global parameter value was wrong
    If so, the abort happened during startup. Once started,
    all was stable.
  • bugfix omkafka: fix potential NULL pointer addressing
    this happened when the topic cache was full and an entry
    needed to be evicted
  • bugfix impstats: @cee cookie was prefixed to wrong fromat (json vs. cee)
    Thanks to Volker Fröhlich for the fix.
  • bugfix imfile: fix race during startup that could lead to some duplication
    If a to-be-monitored file was created after inotify was initialized
    but before startup was completed, the first chunk of data from this
    file could be duplicated. This should have happened very rarely in
    practice, but caused occasional testbench failures.
    see also: https://github.com/rsyslog/rsyslog/issues/791
  • bugfix: potential loss of single message at queue shutdown
    see also: https://github.com/rsyslog/rsyslog/issues/262
  • bugfix: potential deadlock with heavy variable access
    When making havy use of global, local and message variables, a deadlock
    could occur. While it is extremly unlikely to happen, we have at least
    seen one incarnation of this problem in practice.
  • bugfix ommysql: on some platforms, serverport parameter had no effect
    This was caused by an invalid code sequence which’s outcome depends on
    compiler settings.
  • bugfix omelasticsearch: invalid pointer dereference
    The actual practical impact is not clear. This came up when working
    on compiler warnings.
    Thanks to David Lang for the patch.
  • bugfix omhiredis: serverport config parameter did not reliably work
    depended on environment/compiler used to build
  • bugfix rsgtutil: -h command line option did not work
    Thanks to Henri Lakk for the patch.
  • bugfix lexer: hex numbers were not properly represented
    see: https://github.com/rsyslog/rsyslog/pull/771
    Thanks to Sam Hanes for the patch.
  • bugfix TLS syslog: intermittent errors while sending data
    Regression from commit 1394e0b. A symptom often seen was the message
    “unexpected GnuTLS error -50 in nsd_gtls.c:530”
  • bugfix imfile: abort on startup if no slash was present in file name param
    Thanks to Brian Knox for the patch.
  • bugfix rsgtutil: fixed abort when using short command line options
    Thanks to Henri Lakk
  • bugfix rsgtutil: invalid computation of log record extraction file
    This caused verification to fail because the hash chain was actually
    incorrect. Depended on the input data set.
    closes https://github.com/rsyslog/rsyslog/issues/832
  • bugfix build system: KSI components could only be build if in default path

rsyslog 8.16.0 (v8-stable) released

We have released rsyslog 8.16.0.

This release is mostly a bugfixing release with fixes for impstats, omelasticsearch, imfile, ommail and many more. The biggest change however is the addition of the extraction support in rsgtutil for ksi support (https://github.com/rsyslog/rsyslog/issues/561).

To get a full overview over the changes, please take a look at the Changelog.
ChangeLog:

http://www.rsyslog.com/changelog-for-8-16-0-v8-stable/

Download:

http://www.rsyslog.com/downloads/download-v8-stable/

As always, feedback is appreciated.

Best regards,
Florian Riedl

Changelog for 8.16.0 (v8-stable)

——————————————————————————
Version 8.16.0 [v8-stable] 2016-01-26

  • rsgtutil: Added extraction support including loglines and hash chains.
    More details on how to extract loglines can be found in the rsgtutil
    manpage. See also: https://github.com/rsyslog/rsyslog/issues/561
  • clean up doAction output module interface
    We started with char * pointers, but used different types of pointers
    over time. This lead to alignment warnings. In practice, I think this
    should never cause any problems (at least there have been no reports
    in the 7 or so years we do this), but it is not clean. The interface is
    now cleaned up. We do this in a way that does not require modifications
    to modules that just use string parameters. For those with message
    parameters, have a look at e.g. mmutf8fix to see how easy the
    required change is.
  • new system properties for $NOW properties based on UTC
    This permits to express current system time in UTC.
    See also https://github.com/rsyslog/rsyslog/issues/729
  • impstats: support broken ElasticSearch JSON implementation
    ES 2.0 no longer supports valid JSON and disallows dots inside names.
    This adds a new “json-elasticsearch” format option which replaces
    those dots by the bang (“!”) character. So “discarded.full” becomes
    “discarded!full”.
    This is a workaroud. A method that will provide more control over
    replacements will be implemented some time in the future. For
    details, see below-quoted issue tracker.
    closes https://github.com/rsyslog/rsyslog/issues/713
  • omelasticsearch: craft better URLs
    Elasticsearch is confused by url’s ending in a bare ‘?’ or ‘&’. While
    this is valid, those are no longer produced.
    Thanks to Benno Evers for the patch.
  • imfile: add experimental “reopenOnTruncate” parameter
    Thanks to Matthew Wang for the patch.
  • bugfix imfile: proper handling of inotify initialization failure
    Thanks to Zachary Zhao for the patch.
  • bugfix imfile: potential segfault due to improper handling of ev var
    This occurs in inotify mode, only.
    Thanks to Zachary Zhao and Peter Portante for the patch.
    closes https://github.com/rsyslog/rsyslog/issues/718
  • bugfix imfile: potential segfault under heavey load.
    This occurs in inotify mode when using wildcards, only.
    The root cause is dropped IN_IGNOPRED inotify events which be dropped
    in circumstance of high input pressure and frequent rotation, and
    according to wikipeida, they can also be dropped in other conditions.
    Thanks to Zachary Zhao for the patch.
    closes https://github.com/rsyslog/rsyslog/issues/723
  • bugfix ommail: invalid handling of server response
    if that response was split into different read calls. Could lead to
    error-termination of send operation. Problem is pretty unlikely to
    occur in standard setups (requires slow connection to SMTP server).
    Thank to github user haixingood for the patch.
  • bugfix omelasticsearch: custom serverport was ignored on some platforms
    Thanks to Benno Evers for the patch.
  • bugfix: tarball did not include some testbench files
    Thanks to Thomas D. (whissi) for the patch.
  • bugfix: memory misadressing during config parsing string template
    This occurred if an (invalid) template option larger than 63 characters
    was given.
    Thanks to git hub user c6226 for the patch.
  • bugfix imzmq: memory leak
    Thanks to Jeremy Liang for the patch.
  • bugfix imzmq: memory leak
    Thanks to github user xushengping for the patch.
  • bugfix omzmq: memory leak
    Thanks to Jack Lin for the patch.
  • some code improvement and cleanup

rsyslog 8.15.0 (v8-stable) released

We have released rsyslog 8.15.0.

This release sports a lot of changes. Among the changes are a lot of bugfixes, changes to the KSI support, pmciscoios, omkafka, 0mq modules, omelasticsearch and many more.

To get a full overview over the changes, please take a look at the Changelog.
ChangeLog:

http://www.rsyslog.com/changelog-for-8-15-0-v8-stable/

Download:

http://www.rsyslog.com/downloads/download-v8-stable/

As always, feedback is appreciated.

Best regards,
Florian Riedl

Scroll to top