Filter optimization with arrays

If you are using a lot of filters and templates in rsyslog, this can not only be affecting the performance drastically, but it is also a hassle to set up all the different actions and templates. It is always worthy to check, if there isn’t a shortcut somewhere, which might not only save you time for creating the configuration, but also make it much simpler in the end to keep track of all the actions.

In our example, we have several programnames. The log messages should be sorted by programname and then be stored in a specific file and be sorted by host. After storing the log messages, the message should be discarded, so it won’t be processed by the following filters, thus saving otherwise wasted processing time. This example is applicable to rsyslog v7.2.x and above.

Here are some sample config lines.

## apache_access
template(name="DailyPerHost_apache_access" type="string" string="/syslog/%FROMHOST%/apache_access.log")
:programname, isequal, "apache_access" { action(type="omfile" DynaFile="DailyPerHost_apache_access") stop }

## apache_error
template(name="DailyPerHost_apache_error" type="string" string="/syslog/%FROMHOST%/apache_error.log")
:programname, isequal, "apache_error" { action(type="omfile" DynaFile="DailyPerHost_apache_error") stop }
## mysql
template(name="DailyPerHost_mysql" type="string" string="/syslog/%FROMHOST%/mysql.log")
:programname, isequal, "mysql" { action(type="omfile" DynaFile="DailyPerHost_mysql") stop }

## php
template(name="DailyPerHost_php" type="string" string="/syslog/%FROMHOST%/php.log")
:programname, isequal, "php" { action(type="omfile" DynaFile="DailyPerHost_php") stop }

These are some basic services, which are often run together. Please note, that these are just a few examples. As you can see here, the template is created first. It is given a name, type and format. Templates of type string are usually used for file names. Here the log messages get stored in the folder /syslog a subfolder for the host where the message occured and then a filename which reflects the type of message that occured.

The second line holds the actions. First you see the property based filter (programname) and the condition. After that the actions get chained with the curly braces. The log messages where the filter evaluates to true get stored in a file. The filename and path is generated dynamically with the DynaFile parameter. Through this, the above written template will be used to generate the path and filename. The second action is represented by stop. Please note that this is case sensitive. Basically, stop means to stop the message processing. No further processing of the message will take place.

If we look closely at the sample config lines, we see, that the filter condition is basically always the same. It will always filter the programname property for a certain value. This is a predestinated case for using an array for simplification. We can use the property programname in the file template as well and filter an array of values. This will greatly save the overhead for all the seperate filter, not only in the configuration, but also in processing the messages.

template(name="DailyPerHost_app" type="string" string="/syslog/%FROMHOST%/%programname%.log")
if $programname == ["apache_access",
                    "apache_error",
                    "mysql",
                    "php"]
                then {
                action(type="omfile" DynaFile="DailyPerHost_app")
                stop
                }

Again, we first create the template. Please note the difference in the filename where the hardcoded text has been replaced by the property programname. In the next lines, we see the filter and the array of values. This is just to reflect the example. Virtually, the array can have near-infinite values. The filter is also a common if/then construct. After the then we see our chain of commands. First the action which writes the log messages into a file where the filename is created by the above template and then a stop as second action.

This case is applicable in many forms. It is also most useful if you are filtering and the discarding a lot of messages with very common filter settings. You could use it to filter for an array of property values and even chain comparison operations.

Changelog for 7.2.5 (v7-stable)

Version 7.2.5  [v7-stable] 2013-01-08

  • build system cleanup (thanks to Michael Biebl for this!)
  • bugfix: omelasticsearch did not properly compile on some platforms due to missing libmath. Thanks to Michael Biebl for the fix
  • bugfix: invalid DST handling under Solaris
    Thanks to Scott Severtson for the patch.
  • bugfix: on termination, actions were incorrectly called
    The problem was that incomplete fiter evaluation was done *during the shutdown phase*. This affected only the LAST batches being processed. No problem existed during the regular run. Could usually only happen on very busy systems, which were still busy during shutdown.
  • bugfix: very large memory consumption (and probably out of memory) when FromPos was specified in template, but ToPos not.
    Thanks to Radu Gheorghe for alerting us of this bug.
  • bugfix: timeval2syslogTime cause problems on some platforms due to invalid assumption on structure data types.
    closes: http://bugzilla.adiscon.com/show_bug.cgi?id=394
    Thanks to David Hill for the patch [under ASL2.0 as per email conversation 2013-01-03].
  • bugfix: compile errors in im3195
    Thanks to Martin Körper for the patch
  • bugfix: doGetFileCreateMode() had invalid validity check ;)
    Thanks to Chandler Latour for the patch.
  • bugfix: mmjsonparse errornously returned action error when no CEE cookie was present.

Changelog for 7.3.5 (v7-devel)

Version 7.3.5 [devel] 2012-12-19

  • ommysql: addded batching/transaction support
  • enhanced script optimizer to optimize common PRI-based comparisons
    These constructs are especially used in SUSE default config files,
    but also by many users (as they are more readable than the equivalent
    PRI-based filter).
  • omudpspoof: add support for new config system
  • omudpspoof: add support for packets larger than 1472 bytes
    On Ethernet, they need to be transmitted in multiple fragments. While
    it is known that fragmentation can cause issues, it is the best choice
    to be made in that case. Also improved debug output.
  • bugfix: omudpspoof failed depending on the execution environment
    The v7 engine closes fds, and closed some of libnet’s fds as well, what
    lead to problems (unfortunately, at least some libnet versions do not
    report a proper error state but still “success”…). The order of libnet
    calls has been adjusted to by in sync with what the core engine does.
  • bugfix: segfault on imuxsock startup if system log socket is used
    and no ratelimiting supported. Happens only during initial config
    read phase, once this is over, everything works stable.
  • bugfix: mmnormalize build problems
  • bugfix: mmnormalize could abort rsyslog if config parameter was in error
  • bugfix: no error message for invalid string template parameters
    rather a malformed template was generated, and error information emitted
    at runtime. However, this could be quite confusing. Note that with this
    “bugfix” user experience changes: formerly, rsyslog and the affected
    actions properly started up, but the actions did not produce proper
    data. Now, there are startup error messages and the actions are NOT
    executed (due to missing template due to template error).
  • bugfix[minor]: invalid error code when mmnormalize could not access rulebase
  • bugfix(kind of): script optimizer did not work for complex boolean expressions
  • doc bugfix: corrections and improvements in mmnormalize html doc page
  • bugfix: some message properties could be garbled due to race condition
    This happened only on very high volume systems, if the same message was
    being processed by two different actions. This was a regression caused
    by the new config processor, which did no longer properly enable msg
    locking in multithreaded cases. The bugfix is actually a refactoring of
    the msg locking code – we no longer do unlocked operations, as the use
    case for it has mostly gone away. It is potentially possible only at
    very low-end systems, and there the small additional overhead of doing
    the locking does not really hurt. Instead, the removal of that
    capability can actually slightly improve performance in common cases,
    as the code path is smaller and requires slightly less memory writes.
    That probably outperforms the extra locking overhead (which in the
    low-end case always happens in user space, without need for kernel
    support as we can always directly aquire the lock – there is no
    contention at all).
  • build system cleanup (thanks to Michael Biebl for this!)
  • bugfix: omelasticsearch did not properly compile on some platforms
    due to missing libmath. Thanks to Michael Biebl for the fix.

rsyslog 7.3.5 (v7-devel) released

Version 7.3.5 is an update to the development branch. It contains new features: now, ommysql supports batching, which results in overall better system performance. Also, omudpspoof has been enhanced to support fragmentation for large messages. Omudpspoof now also supports the native v7 config format. Lastly, the script optimizer has been enhanced and detects and considerably optimizes some frequently-used constructs (like if $syslogfacility-text == “kernel” and such). Of course, there are also a couple of bug fixes, the most important one addresses a potential segfault that can happen in imuxock.

All users of the 7.3 branch are advised to update to this version.

ChangeLog:

http://www.rsyslog.com/changelog-for-7-3-5-v7-devel/

Download:

http://www.rsyslog.com/rsyslog-7-3-5-v7-devel/

As always, feedback is appreciated.

Best regards,
Tim Eifler

What is the difference between timereported and timegenerated?

Each message that is received by rsyslog is usually available with two timestamps. They can be accessed by using the properties “timereported” and “timegenerated”.

“timegenerated” is always the time when rsyslog generated the message object on the local machine. That actually means it is the time when the message was received (either via the oscall layer or on some inputs based on information the OS provides).

“timereported” is what the sending device reports as time. This is taken from the appropriate syslog header field. If and only if the syslog date header cannot properly be parsed, “timereported” is populated with the same value as “timegenerated”.

Assuming that all systems in a relay chain use valid syslog format, “timereported” will be the same on all relay machines, whereas “timegenerated” reflects the local time of message reception and thus is different on each relay machine.

Please keep in mind the mentioned difference between both properties. It is well known, that the property name for “timegenerated” might be a bit confusing. As such, “timereceived” would probably be a better name, but changing it is not possible without breaking existing deplyoments.

Normalizing Cisco ASA messages

This time, we want to parse log messages received from a Cisco ASA for specific parts. The log messages will be parsed by liblognorm and then they will be written into a specific file resembling the sent code.

This guide has been tested with v7.3.4 of rsyslog.

Things to think about

We basically need a receiver, the normalizer module mmnormalize, some templates and several actions. To keep the logs seperated from other log messages, we also will need a filter. But that is just valid for the rsyslog configuration.

We also need liblognorm installed and a rulebase to work with. The rulebase is needed for liblognorm to know what to do with the messages, so it knows, which part should be fed into a custom property and which to ignore.

Config Statements

We now have basically two parts to configure. The first part should be the rulebase which holds format representations of the expected log messages. For this example we will work with the following log line:

2012-11-23T10:47:42+01:00 0.0.0.0 : %ASA-3-313001: Denied ICMP type=8, code=0 from 77.2.2.2 on interface outside

The rulebase entry for this log line looks like this:

rule=: %date:word% %host:ipv4% : \x25ASA-%ddd:char-to:-%-%eee:number%: Denied ICMP type=%number2:number%, code=%number3:number% from %origin:ipv4% on interface outside

And here is the rsyslog configuration:

module (load="immark")
module (load="imuxsock")
module (load="imklog")
module (load="mmnormalize")
module (load="imudp")
input(type="imudp" port="514")
template(name="filename" type="string" string="/var/log/%$!ddd%-%$!eee%")
if $msg contains '%ASA' then{
 action(type="mmnormalize" userawmsg="off" rulebase="/path/to/rulebase.rb")
 action(type="omfile" DynaFile="filename")
}

How it works

The rulebase holds currently one rule. The rule is a representation of the above log message. For this case, we are only interested in the ASA code. The code identifies a specific message, thus, the message itself is not interesting. Though, we have to reflect complete message and put temporary variables, where the same message of a kind still has different values. These would be the timestamp, source, type and code the the IP which triggered the message. The problem is here, that we really only need two parts to be filled into a property, but other parts can be variable. The message format can be the same, but i.e. the IP from the source can vary. If you fill in a concrete value into the rule where such parts are, the message will only be parsed correctly if the message originates from the exact same IP.

The configuration itself isn’t very complicated. It loads several standard modules, which are mainly used to ensure local logging, as well as the UDP input module and the precious mmnormalize. The mmnormalize module will be later used as a output module. It handles the loading of the rulebase, as well as implements the normalization features of liblognorm.

After loading the modules, we have our single UDP input configured, which is necessary to receive the log messages from the Cisco ASA. Nothing fancy  here, but a standard syslog receiver for UDP.

The next step is to create a template. The template is not for a message output format, but a file name format. It will help writing the messages into log files that reflect the ASA number in the filename.

Finally we will filter for “%ASA” in the message. That will ensure, that only ASA messages are processed by the following actions. The first action runs the message through mmnormalize, while using our rulebase. The messages will be parsed by liblognorm. If the messages do fit the rules in the rulebase, the parts we need will be put into properties. These properties come into play in the second action where we simply write the messages via Dynafile to disk. The template and our new properties will be used to define the filename.

Important

This configuration works or fails with the rulebase. All messages, that should be run through the normalizer must fit to a rule. If you are in doubt, that the rules are correct, follow the linked guide. It will show you how to create a output format that shows if a message has been parsed correctly or if and where the parsing broke.

Scroll to top