Using rsyslog and Elasticsearch to Handle Different Types of JSON Logs
Originally posted on the Sematext blog: Using Elasticsearch Mapping Types to Handle Different JSON Logs
By default, Elasticsearch does a good job of figuring out the type of data in each field of your logs. But if you like your logs structured like we do, you probably want more control over how they’re indexed: is time_elapsed an integer or a float? Do you want your tags analyzed so you can search for big in big data? Or do you need them not_analyzed, so you can show top tags via the terms aggregation? Or maybe both?
In this post, we’ll look at how to use index templates to manage multiple types of logs across multiple indices. Also, we’ll explain how to use rsyslog to handle JSON logging and specify types.
Elasticsearch Mapping and Logs
To control settings for how a field is analyzed in Elasticsearch, you’ll need to define a mapping. This works similarly in Logsene, our log analytics SaaS, because it uses Elasticsearch and exposes its API.
With logs you’ll probably use time-based indices, because they scale better (in Logsene, for instance, you get daily indices). To make sure the mapping you define today applies to the index you create tomorrow, you need to define it in an index template.
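To see why the template has to be defined with a wildcard pattern, here is a minimal Python sketch of daily index naming. The logs- prefix and the logs-* pattern are assumptions for illustration, and shell-style wildcard matching only approximates how Elasticsearch decides whether a template applies to a new index.

```python
from datetime import date, timedelta
from fnmatch import fnmatchcase

# Hypothetical naming scheme: one index per day, e.g. logs-2015-02-24.
INDEX_PREFIX = "logs-"
TEMPLATE_PATTERN = "logs-*"  # assumed pattern that should cover every daily index


def daily_index_name(day: date) -> str:
    """Return the name of the index that holds logs for the given day."""
    return INDEX_PREFIX + day.strftime("%Y-%m-%d")


def template_applies(index_name: str, pattern: str = TEMPLATE_PATTERN) -> bool:
    """Approximate template matching with shell-style wildcards."""
    return fnmatchcase(index_name, pattern)


today = date(2015, 2, 24)
tomorrow = today + timedelta(days=1)
for day in (today, tomorrow):
    name = daily_index_name(day)
    print(name, template_applies(name))
```

Both today's and tomorrow's index names match the pattern, so a mapping stored in the template would be applied when tomorrow's index is created.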
Managing Multiple Types
Mappings provide a nice abstraction when you have to deal with multiple types of structured data. Let’s say you have two apps generating logs of different structures: both have a timestamp field, but one recording logins has a user field, and another one recording purchases has an amount field.
To deal with this, you can define the timestamp field in the _default_ mapping, which applies to all types. Then, in each type’s own mapping, you can define the fields unique to that type. The following snippet is an example that works with Logsene, provided that aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee is your Logsene app token. If you run your own Elasticsearch, you can name the template whatever you want; just make sure the template’s index pattern matches your index names (for example, logs-* will work if your indices are in the logs-YYYY-MM-dd format).
curl -XPUT 'logsene-receiver.sematext.com/_template/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee_MyTemplate' -d '{
  "template" : "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee*",
  "order" : 21,
  "mappings" : {
    "_default_" : {
      "properties" : {
        "timestamp" : { "type" : "date" }
      }
    },
    "firstapp" : {
      "properties" : {
        "user" : { "type" : "string" }
      }
    },
    "secondapp" : {
      "properties" : {
        "amount" : { "type" : "long" }
      }
    }
  }
}'
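As a rough illustration of how the _default_ mapping combines with each type’s own mapping, the Python sketch below builds the same template body (with an assumed logs-* pattern for a self-hosted cluster) and merges the field definitions per type. The effective_properties helper is hypothetical and only models the idea; it is not how Elasticsearch implements the merge internally.

```python
import json

# Same structure as the template above, but with an assumed "logs-*" pattern
# for a self-hosted cluster instead of a Logsene app token.
template = {
    "template": "logs-*",
    "order": 21,
    "mappings": {
        "_default_": {"properties": {"timestamp": {"type": "date"}}},
        "firstapp": {"properties": {"user": {"type": "string"}}},
        "secondapp": {"properties": {"amount": {"type": "long"}}},
    },
}


def effective_properties(template: dict, type_name: str) -> dict:
    """Illustrative merge: fields from _default_ plus the type's own fields."""
    mappings = template["mappings"]
    merged = dict(mappings.get("_default_", {}).get("properties", {}))
    merged.update(mappings.get(type_name, {}).get("properties", {}))
    return merged


for type_name in ("firstapp", "secondapp"):
    fields = effective_properties(template, type_name)
    print(type_name, json.dumps(fields, sort_keys=True))
```

Each type ends up with the shared timestamp field plus its own field (user for firstapp, amount for secondapp), which is the effect the template is meant to achieve.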
Sending JSON Logs to Specific Types
When you send a document to Elasticsearch by using the API, you have to provide an index and a type. You can use an Elasticsearch client for your preferred language to log directly to Elasticsearch or Logsene this way. But I wouldn’t recommend this, because then you’d have to manage things like buffering if the destination is unreachable.
Instead, I’d keep my logging simple and use a specialized logging tool, such as rsyslog, to do the hard work for me. Logging to a file is usually the easiest option. It’s local, and you can have your logging tool tail the file and send contents over the network. I usually prefer sockets (like syslog) because they let me configure rsyslog to:
– write events in a human format to a local file I can tail if I need to (usually in development)
– forward logs without hitting disk if I need to (usually in production)
Whatever you prefer, I think writing to local files or sockets is better than sending logs over the network from your application, unless you’re willing to make a reliability trade-off and use UDP, which gets rid of most complexities.
Opinions aside, if you want to send JSON over syslog, there’s the JSON-over-syslog (CEE) format that we detailed in a previous post. You can use rsyslog’s JSON parser module to take your structured logs and forward them to Logsene:
module(load="imuxsock")        # can listen to local syslog socket
module(load="omelasticsearch") # can forward to Elasticsearch
module(load="mmjsonparse")     # can parse JSON

action(type="mmjsonparse")     # parse CEE-formatted messages

template(name="syslog-cee" type="list") {  # Elasticsearch documents will contain
  property(name="$!all-json")              # all JSON fields that were parsed
}

action(
  type="omelasticsearch"
  template="syslog-cee"          # use the template defined earlier
  server="logsene-receiver.sematext.com"
  serverport="80"
  searchType="syslogapp"
  searchIndex="LOGSENE-APP-TOKEN-GOES-HERE"
  bulkmode="on"                  # send logs in batches
  queue.dequeuebatchsize="1000"  # of up to 1000
  action.resumeretrycount="-1"   # retry indefinitely (buffer) if destination is unreachable
)
To send a CEE-formatted syslog message, you can run logger '@cee: {"amount": 50}' for example. Rsyslog would forward this JSON to Elasticsearch or Logsene via HTTP. Note that Logsene also supports CEE-formatted JSON over syslog out of the box if you want to use a syslog protocol instead of the Elasticsearch API.
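For applications that build these messages themselves, the following Python sketch shows the shape of a CEE payload: the @cee: cookie followed by a JSON object. The format_cee and parse_cee helpers are hypothetical names; the parser only mirrors the general idea of recognizing the cookie and decoding the JSON, not the actual mmjsonparse implementation.

```python
import json

CEE_COOKIE = "@cee:"


def format_cee(fields: dict) -> str:
    """Build a CEE-style message body: the cookie followed by a JSON object."""
    return CEE_COOKIE + " " + json.dumps(fields, sort_keys=True)


def parse_cee(message: str):
    """Return the parsed fields, or None if the message is not CEE-formatted."""
    text = message.lstrip()
    if not text.startswith(CEE_COOKIE):
        return None
    try:
        parsed = json.loads(text[len(CEE_COOKIE):])
    except ValueError:
        # Cookie present, but what follows is not valid JSON.
        return None
    return parsed if isinstance(parsed, dict) else None


message = format_cee({"amount": 50})
print(message)
print(parse_cee(message))
print(parse_cee("plain syslog text"))
```

The resulting string can be handed to whatever syslog API your application already uses; messages without the cookie are simply treated as unstructured text.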
Filtering by Type
Once your logs are in, you can filter them by type (via the _type field) in Kibana.
However, if you want more refined filtering by source, we suggest using a separate field for storing the application name. This can be useful when you have different applications using the same logging format. For example, both crond and postfix use plain syslog.
rsyslog 8.8.0 (v8-stable) released
We have released rsyslog 8.8.0.
http://www.rsyslog.com/changelog-for-8-8-0-v8-stable/
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Tim Eifler
Changelog for 8.8.0 (v8-stable)
Version 8.8.0 [v8-stable] 2015-02-24
- omkafka: add support for dynamic topics and auto partitioning
Thanks to Tait Clarridge for the patches.
- imtcp/imptcp: support for broken Cisco ASA TCP syslog framing
- omfwd: more detailed error messages in case of UDP send error
- TLS syslog: enable capability to turn on GnuTLS debug logging
This provides better diagnostics in hard-to-diagnose cases,
especially when GnuTLS is extra-picky about certificates.
- bugfix: $AbortOnUncleanConfig did not work
- improve rsyslogd -v output and error message with meta information
version number is now contained in error message and build platform in
version output. This helps get rid of the usual “which version”
question on mailing list, support forums, etc…
- bugfix imtcp: octet-counted framing cannot be turned off
- bugfix: build problems on Illuminos
Thanks to Andrew Stormont for the patch
- bugfix: invalid data size for iMaxLine global property
It was defined as int, but inside the config system it was declared as
size type, which uses int64_t. With legacy config statements, this could
lead to misaddressing, which usually meant that another config variable was
overwritten (depending on memory layout).
closes https://github.com/rsyslog/rsyslog/issues/205
- bugfix: negative values for maxMessageSize global parameter were permitted
RSyslog Windows Agent 3.0 Released
Adiscon is proud to announce the 3.0 release of RSyslog Windows Agent.
This new major release adds full support for Windows 2012 R2 and also has been verified to work on Windows 10 preview versions.
The new major version is a milestone in many ways. Most importantly, the performance of the core engine has been considerably increased, and all existing configurations will benefit from this. Also, a new Configuration Client has been added, which has been rewritten using the .Net Framework (like the InterActive Syslog Viewer). With the new Configuration Client, we also introduce support for a new file-based configuration format (as an alternative to the registry-based method). RSyslog Windows Agent can now run from a configuration file and save its state values into files.
We also extended the classic EventLog Monitor to support multiple dynamic *.evt files for NetApp customers.
Detailed information can be found in the version history below.
Build-IDs: Service 3.0.130, Client 3.0.201
Version 3.0 is a free download. Customers with existing 2.x keys can contact our Sales department for upgrade prices. If you have a valid Upgrade Insurance ID, you can request a free new key by sending your Upgrade Insurance ID to sales@adiscon.com. Please note that the download enables the free 30-day trial version if used without a key, so you can go ahead and evaluate it right now.
rsyslog daily builds and tarballs
Over the past few days, we have worked on making rsyslog daily builds and tarballs a reality. We hope this will enable users to rapidly deploy the latest features as well as make it easier to help with testing the current development system. Daily builds are what the scheduled v8-devel builds were under the previous release paradigm. Consequently, the archives are named v8-devel.
Right now, builds are only supported for Ubuntu. Users of other platforms are advised to use the daily tarballs to build from source. Depending on feedback on and success of the daily builds, we will make them available for more platforms.
A daily build is based on the latest git master version. So it really is at the [b]leading edge of technology. So why create them?
A top reason is that we often fix a bug for someone, and that someone then is unable to build from source. As a result, we have a bugfix, but there is no external confirmation that it really fixed the bug when we merge it into the next release. We hope that now those users can simply pick the daily build and check if that solves their problem.
Also, in general we hope that some users will use the daily tarballs to get not only the latest and greatest but contribute to the project by doing some testing.
Finally, and quite important, with daily builds we will see build problems as early as possible. In the past, we often saw problems only after source release (or very close to it), which was obviously problematic. Now, this should no longer happen. For obvious reasons, the final release build is now more or less a copy of a daily build.
As a technical side-note, daily builds are identified by the git master branch head hash that was used to build them. As a fourth version component, they have the first 12 digits of that hash (an example is “8.8.0.35e7f12a2c04”). This enables us to track error reports to the right version. The packages have a different version name, based on the build date. The reason is that the hash does not increment, so newer versions (with lower hash values) are considered “old” by Launchpad. We avoid this by using an always-incrementing package version. Also note that the package changelog just contains a “daily build” entry; anything else makes limited sense.
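The ordering problem with hash-based versions can be sketched in a few lines of Python. The parse_daily_version helper, the second hash, and the date-style package versions below are made up for illustration; only “8.8.0.35e7f12a2c04” comes from the text above, and the real package version format may differ.

```python
import re

# Three numeric components followed by the first 12 hex digits of the commit hash.
DAILY_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.([0-9a-f]{12})$")


def parse_daily_version(version: str):
    """Split a daily build version into (major, minor, patch, commit_hash)."""
    match = DAILY_VERSION.match(version)
    if match is None:
        raise ValueError("not a daily build version: " + version)
    major, minor, patch, commit = match.groups()
    return int(major), int(minor), int(patch), commit


older = parse_daily_version("8.8.0.35e7f12a2c04")  # example from the text
newer = parse_daily_version("8.8.0.0a1b2c3d4e5f")  # hypothetical later build

# The later build carries a hash that sorts lower, so it looks "older".
print(newer > older)  # False

# Hypothetical date-based package versions always increase with build time.
older_package, newer_package = "20150301", "20150302"
print(newer_package > older_package)  # True
```

The hash remains useful for tracing a bug report back to an exact commit, while the date-based name gives the package archive something it can order.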
We hope you enjoy this new feature! Feedback is appreciated.
rsyslog 8.7.0 (v8-stable) released
We have released rsyslog 8.7.0.
Version 8.7.0 contains various improvements and additions to a wide array of modules, like imfile, imptcp, improvements to RainerScript and mmnormalize (thanks to Singh Janmejay) and a couple of other improvements. But, the biggest addition is the new omkafka module that now allows direct writing to Apache Kafka.
This release also contains important bug fixes.
This is a recommended upgrade for all users.
http://www.rsyslog.com/changelog-for-8-7-0-v8-stable/
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Florian Riedl
Changelog for 8.7.0 (v8-stable)
Version 8.7.0 [v8-stable] 2015-01-13
- add message metadata “system” to msg object
this permits to store metadata alongside the message
- imfile: add support for “filename” metadata
this is useful in cases where wildcards are used
- imptcp: make stats counter names consistent with what imudp, imtcp uses
- added new module “omkafka” to support writing to Apache Kafka
- omfwd: add new “udp.senddelay” parameter
- mmnormalize enhancements
Thanks to Janmejay Singh for the patch.
- RainerScript “foreach” iterator and array reading support
Thanks to Janmejay Singh for the patch.
- now requires liblognorm >= 1.0.2
- add support for systemd >= 209 library names
- BSD “ntp” facility (value 12) is now also supported in filter
Thanks to Douglas K. Rand of Iteris, Inc. for the patch.
Note: this patch was released under ASL 2.0 (see email-conversation).
- bugfix: global(localHostName="xxx") was not respected in all modules
- bugfix: emit correct error message on config-file-not-found
closes https://github.com/rsyslog/rsyslog/issues/173
- bugfix: impstats emitted invalid JSON format (if JSON was selected)
- bugfix: (small) memory leak in omfile’s outchannel code
Thanks to Koral Ilgun for reporting this issue.
- bugfix: imuxsock did not deactivate some code not supported by platform
Among potential other problems, this caused build failure under Solaris.
Note that this build problem just made a broader problem appear that so
far always existed but was not visible.
closes https://github.com/rsyslog/rsyslog/issues/185
LibLogging 1.0.5 released
liblogging 1.0.5 [download]
We have released liblogging 1.0.5.
sha256sum: 310dc1691279b7a669d383581fe4b0babdc7bf75c9b54a24e51e60428624890b
– cleanup for systemd-journal >= 209
closes https://github.com/rsyslog/liblogging/issues/17
– bugfix: date stamp was incorrectly formatted
The day part was totally off. This affected the “uxsock:” and “file:”
drivers.
closes https://github.com/rsyslog/liblogging/issues/21
rsyslog -devel packages are being removed soon
If you use rsyslog’s devel packages on your system, you will receive errors soon. Be sure to read the complete posting to avoid trouble!
As part of rsyslog’s new release schedule and version naming, devel releases will no longer be named according to the “normal” numbering scheme. This also means that the previous “devel” branches will disappear, as git master branch now is the always-current devel version.
Keep in mind that we previously had a release cycle of 3 to 9 months for a new feature to appear in a stable version. That was because new feature releases were only done when a complete devel turnaround was done, and relatively many new features were added. For this reason, some people opted to run devel versions in production, and thus needed specific tarballs (and packages) for them.
With the new six week release cycle, we get new features rather quickly into the stable builds. So it usually should be no problem to wait for the next stable to use that recently-implemented new feature. As such, there is no need any longer for special devel releases, and thus no need for devel tarballs and packages.
Well… almost. One thing we would like to have is a “daily devel version”. The idea is that if the testbench runs are OK, a new tarball and a set of packages is generated automatically and posted to a special archive. In general, that archive should receive an update once a day. So people really interested in the [b]leading edge can simply install from that daily package archive and report bugs quickly, thereby helping the development process. Unfortunately, time is precious and we don’t know when and if we can set up the required automation. Most probably not before January 2015, and how it works out then remains to be seen.
In the interim, we will begin to delete the -devel packages. The old -devel tarballs will remain available, at least for the time being. The problem with -devel packages is that folks may have set their system to use the -devel repo. If we just kept it as is, those systems would never again receive any updates, neither security-related nor others, simply because -devel versions no longer exist in the way they did. That would pose a potentially big security risk. As such, we will delete the -devel content, and begin to do so early next week. If you use the -devel packages, be sure to switch to v8-stable instead.
rsyslog 8.6.0 (v8-stable) released
We have released rsyslog 8.6.0.
This is the first stable release under a new release cycle and versioning scheme. This new scheme is important news in itself. For more details, please have a look here:
http://www.rsyslog.com/rsyslogs-new-release-cycle-and-versioning-scheme/
Version 8.6.0 contains important new features like the ability to monitor files via wildcards in imfile. It also contains new, experimental zero message queue modules (special thanks to team member Brian Knox), improvements to RainerScript and mmnormalize (thanks to Singh Janmejay) and a couple of other improvements.
This release also contains important bug fixes.
This is a recommended upgrade for all users.
http://www.rsyslog.com/changelog-for-8-6-0-v8-stable/
Download:
http://www.rsyslog.com/downloads/download-v8-stable/
As always, feedback is appreciated.
Best regards,
Florian Riedl