rsyslog 8.29.0 (v8-stable) released
We have released rsyslog 8.29.0.
This release features a number of changes. For example, imptcp now has an experimental parameter for multiline messages as well as new statistics counters.
Most notable, though, is the improved error reporting in the rsyslog core, the core modules and several other modules such as imtcp, imptcp and omfwd. There is also an article available about the improved error reporting:
https://www.linkedin.com/pulse/improving-rsyslog-debug-output-jan-gerhards
If you have questions or feedback in relation to the article and/or debug output, please let us know or leave a comment below the article.
Other than that, the new version provides quite a number of bugfixes.
For a complete list of changes, fixes and enhancements, please visit the ChangeLog.
https://github.com/rsyslog/rsyslog/blob/v8-stable/ChangeLog
The packages will follow when they are finished.
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Florian Riedl
rsyslog 8.27.0 (v8-stable) released
We have released rsyslog 8.27.0.
This release provides, apart from a lot of fixes, many useful feature enhancements. Most notable is the imkafka module, which allows the use of Kafka as an input. In addition, imptcp and imtcp received quite a number of enhancements, and the overall error reporting was improved considerably.
For a complete list of changes, fixes and enhancements, please visit the ChangeLog.
https://github.com/rsyslog/rsyslog/blob/v8-stable/ChangeLog
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Florian Riedl
Using rsyslog to Reindex/Migrate Elasticsearch data
Original post: Scalable and Flexible Elasticsearch Reindexing via rsyslog by @Sematext
This recipe is useful in two scenarios:
- migrating data from one Elasticsearch cluster to another (e.g. when you’re upgrading from Elasticsearch 1.x to 2.x or later)
- reindexing data from one index to another in a cluster running a version older than 2.3. For clusters on version 2.3 or later, you can use the Reindex API
Back to the recipe: we used an external application to scroll through Elasticsearch documents in the source cluster and push them to rsyslog via TCP. Then we used rsyslog’s Elasticsearch output to push logs to the destination cluster. The overall flow is: source cluster -> scroll script -> TCP -> rsyslog -> Elasticsearch output -> destination cluster.
This is an easy way to extend rsyslog, using whichever language you’re comfortable with, to support more inputs. Here, we piggyback on the TCP input. You can do a similar job with filters/parsers (you can find GeoIP implementations, for example) by piggybacking on the mmexternal module, which uses stdin and stdout for communication. The same is possible for outputs, normally added via the omprog module: we did this to add a Solr output and one for SPM custom metrics.
The custom script in question doesn’t have to be multi-threaded; you can simply spin up more of them, each scrolling different indices. In this particular case, using two scripts gave us slightly better throughput, saturating the network.
Writing the custom script
Before starting to write the script, one needs to know what the messages sent to rsyslog should look like. To be able to index data, rsyslog will need an index name, a type name and optionally an ID. In this particular case, we were dealing with logs, so the ID wasn’t necessary.
With this in mind, I see a number of ways of sending data to rsyslog:
- one big JSON per line. One can use mmnormalize to parse that JSON, which then allows rsyslog to use values from within it as index name, type name, and so on
- for each line, begin with the bits of “extra data” (like index and type names), then put the JSON document that you want to reindex. Again, you can use mmnormalize to parse, but this time you can simply trust that the last part is JSON and send it to Elasticsearch directly, without the need to parse it
- if you only need to pass two variables (index and type name, in this case), you can piggyback on the vague spec of RFC3164 syslog and send something like
destination_index document_type:{"original": "document"}
With this last option, rsyslog parses the provided index name into the hostname variable, the type into syslogtag and the original document into msg. A bit hacky, I know, but quite convenient (it makes the rsyslog configuration straightforward) and very fast, since we know the RFC3164 parser is very quick and it runs on all messages anyway. No need for mmnormalize, unless you want to change the document in-flight with rsyslog.
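As a rough illustration of that third format, the sender only has to join three pieces: the destination index, the document type and the serialized document. The helper name format_line below is hypothetical (it is not part of rsyslog or of the Elasticsearch client), and the sketch assumes that index and type names contain no spaces or colons.

```python
import json

def format_line(index_name, type_name, document):
    """Build one newline-terminated line: "<index> <type>:<JSON document>".

    format_line is a hypothetical helper, for illustration only.
    Assumption: index_name and type_name contain no spaces or colons.
    """
    # json.dumps escapes newlines inside string values, so the serialized
    # document always stays on a single line
    return index_name + ' ' + type_name + ':' + json.dumps(document) + '\n'

# prints: destination_index document_type:{"original": "document"}
print(format_line('destination_index', 'document_type', {'original': 'document'}), end='')
```

Keeping one document per line matters here, because the receiving TCP input treats each line as a separate message.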
Below you can find the Python code that can scan through existing documents in an index (or index pattern, like logstash_2016.05.*) and push them to rsyslog via TCP. You’ll need the Python Elasticsearch client (pip install elasticsearch) and you’d run it like this:
python elasticsearch_to_rsyslog.py source_index destination_index
The script being:
from elasticsearch import Elasticsearch
import json, socket, sys

source_cluster = ['server1', 'server2']
rsyslog_address = '127.0.0.1'
rsyslog_port = 5514

es = Elasticsearch(source_cluster,
                   retry_on_timeout=True,
                   max_retries=10)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((rsyslog_address, rsyslog_port))

# open a scan-type scroll; the first response only carries the scroll ID
result = es.search(index=sys.argv[1], scroll='1m', search_type='scan', size=500)

while True:
    # fetch the next batch and keep the scroll alive for another minute
    result = es.scroll(scroll_id=result['_scroll_id'], scroll='1m')
    for hit in result['hits']['hits']:
        # one line per document: "<index> <type>:<JSON source>"
        s.sendall((sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(hit["_source"]) + '\n').encode('utf-8'))
    if not result['hits']['hits']:
        break

s.close()
If you need to modify messages, you can parse them in rsyslog via mmjsonparse and then add/remove fields through rsyslog’s scripting language. However, I couldn’t find a nice way to change field names (for example, to remove the dots that are forbidden since Elasticsearch 2.0), so I did that in the Python script:
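def de_dot(my_dict):
    # iterate over a snapshot, because the dict is modified inside the loop
    for key, value in list(my_dict.items()):
        new_key = key.replace('.', '_')
        # nested objects may contain dotted field names as well
        if isinstance(value, dict):
            value = de_dot(value)
        if new_key != key:
            del my_dict[key]
        my_dict[new_key] = value
    return my_dict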
And then the “send” line becomes:
s.sendall((sys.argv[2] + ' ' + hit["_type"] + ':' + json.dumps(de_dot(hit["_source"])) + '\n').encode('utf-8'))
Configuring rsyslog
The first step here is to make sure you have the latest rsyslog, though the config below works with versions all the way back to 7.x (which can be found in most Linux distributions). You just need to make sure the rsyslog-elasticsearch package is installed, because we need the Elasticsearch output module.
# messages bigger than this are truncated
$maxMessageSize 10000000 # ~10MB
# load the TCP input and the ES output modules
module(load="imtcp")
module(load="omelasticsearch")
main_queue(
  # buffer up to 1M messages in memory
  queue.size="1000000"
  # these threads process messages and send them to Elasticsearch
  queue.workerThreads="4"
  # rsyslog processes messages in batches to avoid queue contention
  # this will also be the Elasticsearch bulk size
  queue.dequeueBatchSize="4000"
)
# we use templates to specify what the data sent to Elasticsearch looks like
template(name="document" type="list"){
  # the "msg" variable contains the document
  property(name="msg")
}
template(name="index" type="list"){
  # "hostname" has the index name
  property(name="hostname")
}
template(name="type" type="list"){
  # "syslogtag" has the type name
  property(name="syslogtag")
}
# start the TCP listener on the port we pointed the Python script to
input(type="imtcp" port="5514")
# sending data to Elasticsearch, using the templates defined earlier
action(type="omelasticsearch"
  template="document"
  dynSearchIndex="on" searchIndex="index"
  dynSearchType="on" searchType="type"
  server="localhost"  # destination Elasticsearch host
  serverport="9200"   # and port
  bulkmode="on"  # use the bulk API
  action.resumeretrycount="-1"  # retry indefinitely if Elasticsearch is unreachable
)
This configuration doesn’t have to disturb your local syslog (i.e. by replacing /etc/rsyslog.conf). You can put it someplace else and run a different rsyslog process:
rsyslogd -i /var/run/rsyslog_reindexer.pid -f /home/me/rsyslog_reindexer.conf
And that’s it! With rsyslog started, you can start the Python script(s) and do the reindexing.
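Since the script is single-threaded, running one copy per source index is the simplest way to get parallelism. The following is only a sketch of how that could be automated: build_commands and run_in_parallel are hypothetical helper names, and the index names in the example comment are placeholders.

```python
import subprocess
import sys

def build_commands(source_indices, destination_index,
                   script='elasticsearch_to_rsyslog.py'):
    """Return one command line per source index (hypothetical helper)."""
    return [[sys.executable, script, source, destination_index]
            for source in source_indices]

def run_in_parallel(commands):
    """Start every command at once, then wait for all of them to finish.

    Returns the exit codes in the same order as the commands.
    """
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]

# Example (placeholder index names, adjust to your own):
# run_in_parallel(build_commands(
#     ['logstash_2016.05.01', 'logstash_2016.05.02'], 'destination_index'))
```

A non-zero exit code in the returned list points at a script that failed and whose index may need to be reindexed again.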
rsyslog 8.9.0 (v8-stable) released
We have released rsyslog 8.9.0.
ChangeLog:
http://www.rsyslog.com/changelog-for-8-9-0-v8-stable/
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Florian Riedl
Changelog for 8.9.0 (v8-stable)
Version 8.9.0 [v8-stable] 2015-04-07
- omprog: add option “hup.forward” to forward HUP to external plugins
  This was suggested by David Lang so that external plugins (and other
  programs) can also do HUP-specific processing. The default is not
  to forward HUP, so no change of behavior by default.
- imuxsock: added capability to use regular parser chain
  Previously, this was a fixed format, that was known to be spoken on
  the system log socket. This also adds new parameters:
  - sysSock.useSpecialParser module parameter
  - sysSock.parseHostname module parameter
  - useSpecialParser input parameter
  - parseHostname input parameter
- 0mq: improvements in input and output modules
  See module READMEs, part is to be considered experimental.
  Thanks to Brian Knox for the contribution.
- imtcp: add support for ip based bind for imtcp -> param “address”
  Thanks to github user crackytsi for the patch.
- bugfix: MsgDeserialize out of sync with MsgSerialize for StrucData
  This led to failure of disk queue processing when structured data was
  present. Thanks to github user adrush for the fix.
- bugfix imfile: partial data loss, especially in readMode != 0
  closes https://github.com/rsyslog/rsyslog/issues/144
- bugfix: potential large memory consumption with failed actions
  see also https://github.com/rsyslog/rsyslog/issues/253
- bugfix: omudpspoof: invalid default send template in RainerScript format
  The file format template was used, which obviously does not work for
  forwarding. Thanks to Christopher Racky for alerting us.
  closes https://github.com/rsyslog/rsyslog/issues/268
- bugfix: size-based legacy config statements did not work properly
  on some platforms, they were incorrectly handled, resulting in all
  sorts of “interesting” effects (up to segfault on startup)
- build system: added option --without-valgrind-testbench
  … which provides the capability to either enforce or turn off
  valgrind use inside the testbench. Thanks to whissi for the patch.
- rsyslogd: fix misleading typos in error messages
  Thanks to Ansgar Püster for the fixes.
rsyslog 7.4.7 (v7-stable) released
We have just released 7.4.7 of the v7-stable branch. This is a bug-fixing release. Most importantly it fixes a bug that can lead to …
Changelog for 7.4.7 (v7-stable)
Version 7.4.7 [v7.4-stable] 2013-12-10
- bugfix: limiting queue disk space did not work properly
  - queue.maxdiskspace actually initializes queue.maxfilesize
  - total size of queue files was not checked against queue.maxdiskspace for disk assisted queues.
  Thanks to Karol Jurak for the patch.
- bugfix: linux kernel-like ratelimiter did not work properly with all inputs (for example, it did not work with imudp). The reason was that the PRI value was used, but that needed parsing of the message, which was done too late.
- bugfix: disk queues created files in wrong working directory
  If the $WorkDirectory was changed multiple times, all queues only used the last value set.
- bugfix: legacy directive $ActionQueueWorkerThreads was not honored
- bugfix: segfault on startup when certain script constructs are used
  e.g. “if not $msg …”
- bugfix: imuxsock: UseSysTimeStamp config parameter did not work correctly
  Thanks to Tomas Heinrich for alerting us and providing a solution suggestion.
- bugfix: $SystemLogUseSysTimeStamp/$SystemLogUsePIDFromSystem did not work
  Thanks to Tomas Heinrich for the patch.
- improved checking of queue config parameters on startup
- bugfix: call to ruleset with async queue did not use the queue
  closes: http://bugzilla.adiscon.com/show_bug.cgi?id=443
- bugfix: if imtcp is loaded and no listeners are configured (which is uncommon), rsyslog crashes during shutdown.
rsyslog statistic counter plugin imtcp
Plugin – imtcp
This plugin maintains statistics for each listener. The statistic is named after the given input name (or “imtcp” if none is configured), followed by the listener port in parentheses. For example, the counter for a listener on port 514 with no set name is called “imtcp(514)”.
The following properties are maintained for each listener:
- submitted – total number of messages submitted for processing since startup
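The naming rule above is simple enough to express in a few lines, which can be handy when looking for a specific listener in the statistics output. This is only a sketch: counter_name is a hypothetical helper, and “remote” is a made-up input name.

```python
def counter_name(port, input_name=None):
    """Return the statistics name for an imtcp listener.

    counter_name is a hypothetical helper; it only mirrors the naming
    rule described above.
    """
    # fall back to "imtcp" when no input name is configured
    return '{0}({1})'.format(input_name or 'imtcp', port)

print(counter_name(514))              # imtcp(514)
print(counter_name(10514, 'remote'))  # remote(10514)
```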
rsyslog 7.5.4 (v7-devel) released
This release offers some interesting features. It provides a new module called mmpstrucdata to parse RFC5424 structured data into json message properties. Also, the default queue.size values have been changed to more suitable values. Omfwd and omfile received new parameters, and we changed a larger portion of the documentation to improve usability by linking to relevant web resources, so that additional information can be found quickly. Finally, there have been a few other changes and bugfixes.
More detailed information is available in the changelog.
ChangeLog:
http://www.rsyslog.com/changelog-for-7-5-4-v7-devel/
Download:
http://www.rsyslog.com/rsyslog-7-5-4-v7-devel/
As always, feedback is appreciated.
Best regards,
Florian Riedl
Changelog for 7.5.4 (v7-devel)
Version 7.5.4 [devel] 2013-10-07
- mmpstrucdata: new module to parse RFC5424 structured data into json message properties
- change main/ruleset queue defaults to be more enterprise-like
  new defaults are queue.size 100,000, max workers 2, worker activation after 40,000 msgs are queued, batch size 256. These settings are much more useful for enterprises and will not hurt low-end systems that much. This is part of our re-focus on enterprise needs.
- omfwd: new action parameter “maxErrorMessages” added
- omfile: new module parameters to set action defaults added
  * dirCreateMode
  * fileCreateMode
- mmutf8fix: new module to fix invalid UTF-8 sequences
- imuxsock: handle unlimited number of additional listen sockets
- doc: improve usability by linking to relevant web resources
  The idea is to enable users to quickly find additional information, samples, HOWTOs and the like on the main site. At the same time, (very) slightly remove memory footprint when few listeners are monitored.
- bugfix: omfwd parameter streamdrivermode was not properly handled
  It was always overwritten by whatever value was set via the legacy directive $ActionSendStreamDriverMode
- imtcp: add streamdriver.name module parameter
  permits overriding the system default stream driver (gtls, ptcp)
- bugfix: build system: libgcrypt.h needed even if libgcrypt was disabled
  Thanks to Jonny Törnbom for reporting this problem
- imported bugfixes from 7.4.4