Using rsyslog and Elasticsearch to Handle Different Types of JSON Logs

Originally posted on the Sematext blog: Using Elasticsearch Mapping Types to Handle Different JSON Logs

By default, Elasticsearch does a good job of figuring the type of data in each field of your logs. But if you like your logs structured like we do, you probably want more control over how they’re indexed: is time_elapsed an integer or a float? Do you want your tags analyzed so you can search for big in big data? Or do you need it not_analyzed, so you can show top tags via the terms aggregation? Or maybe both?

In this post, we’ll look at how to use index templates to manage multiple types of logs across multiple indices. Also, we’ll explain how to use rsyslog to handle JSON logging and specify types.

Elasticsearch Mapping and Logs

To control settings for how a field is analyzed in Elasticsearch, you’ll need to define a mapping. This works similarly in Logsene, our log analytics SaaS, because it uses Elasticsearch and exposes its API.

With logs you’ll probably use time-based indices, because they scale better (in Logsene, for instance, you get daily indices). To make sure the mapping you define today applies to the index you create tomorrow, you need to define it in an index template.

Managing Multiple Types

Mappings provide a nice abstraction when you have to deal with multiple types of structured data. Let’s say you have two apps generating logs of different structures: both have a timestamp field, but one recording logins has a user field, and another one recording purchases has an amount field.

To deal with this, you can define the timestamp field in the _default_ mapping which applies to all types. Then, in each type’s own mapping we’ll define fields unique to that mapping. The following snippet is an example that works with Logsene, provided that aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee is your Logsene app token. If you roll your own Elasticsearch, you can use whichever name you want, and make sure the template name applies to matches index pattern (for example, logs-* will work if your indices are in the logs-YYYY-MM-dd format).

curl -XPUT 'logsene-receiver.sematext.com/_template/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee_MyTemplate' -d '{
 "template" : "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee*",
 "order" : 21,
 "mappings" : {
  "_default_" : {
   "properties" : {
    "timestamp" : { "type" : "date" }
   }
  },
  "firstapp" : {
   "properties" : {
    "user" : { "type" : "string" }
   }
  },
  "secondapp" : {
   "properties" : {
    "amount" : { "type" : "long" }
   }
  }
 }
}'

Sending JSON Logs to Specific Types

When you send a document to Elasticsearch by using the API, you have to provide an index and a type. You can use an Elasticsearch client for your preferred language to log directly to Elasticsearch or Logsene this way. But I wouldn’t recommend this, because then you’d have to manage things like buffering if the destination is unreachable.

Instead, I’d keep my logging simple and use a specialized logging tool, such as rsyslog, to do the hard work for me. Logging to a file is usually the easiest option. It’s local, and you can have your logging tool tail the file and send contents over the network. I usually prefer sockets (like syslog) because they let me configure rsyslog to:
– write events in a human format to a local file I can tail if I need to (usually in development)
– forward logs without hitting disk if I need to (usually in production)
Whatever you prefer, I think writing to local files or sockets is better than sending logs over the network from your application. Unless you’re willing to do a reliability trade-off and use UDP, which gets rid of most complexities.

Opinions aside, if you want to send JSON over syslog, there’s the JSON-over-syslog (CEE) format that we detailed in a previous post. You can use rsyslog’s JSON parser module to take your structured logs and forward them to Logsene:

module(load="imuxsock")        # can listen to local syslog socket
module(load="omelasticsearch") # can forward to Elasticsearch
module(load="mmjsonparse")     # can parse JSON

action(type="mmjsonparse")  # parse CEE-formatted messages

template(name="syslog-cee" type="list") {  # Elasticsearch documents will contain
  property(name="$!all-json")              # all JSON fields that were parsed
}

action(
  type="omelasticsearch"
  template="syslog-cee"                     # use the template defined earlier
  server="logsene-receiver.sematext.com"
  serverport="80"
  searchType="syslogapp"
  searchIndex="LOGSENE-APP-TOKEN-GOES-HERE"
  bulkmode="on"                                # send logs in batches
  queue.dequeuebatchsize="1000"                # of up to 1000
  action.resumeretrycount="-1"    # retry indefinitely (buffer) if destination is unreachable
)

To send a CEE-formatted syslog, you can run logger ‘@cee: {“amount”: 50}’ for example. Rsyslog would forward this JSON to Elasticsearch or Logsene via HTTP. Note that Logsene also supports CEE-formatted JSON over syslog out of the box if you want to use a syslog protocol instead of the Elasticsearch API.

Filtering by Type

Once your logs are in, you can filter them by type (via the _type field) in Kibana:
Type Filtering with Kibana
However, if you want more refined filtering by source, we suggest using a separate field for storing the application name. This can be useful when you have different applications using the same logging format. For example, both crond and postfix use plain syslog.

rsyslog 8.8.0 (v8-stable) released

We have released rsyslog 8.8.0.

This is primarily a bug-fixing release, which also includes a number of new features focussed on practical issues and improvements in problem diagnostics.
ChangeLog:

Changelog for 8.8.0 (v8-stable)

Version 8.8.0 [v8-stable] 2015-02-24

  • omkafka: add support for dynamic topics and auto partitioning
    Thanks to Tait Clarridge for the patches.
  • imtcp/imptcp: support for broken Cisco ASA TCP syslog framing
  • omfwd: more detailled error messages in case of UDP send error
  • TLS syslog: enable capability to turn on GnuTLS debug logging
    This provides better diagnostics in hard-to-diagnose cases,
    especially when GnuTLS is extra-picky about certificates.
  • bugfix: $AbortOnUncleanConfig did not work
  • improve rsyslogd -v output and error message with meta information
    version number is now contained in error message and build platform in
    version output. This helps to gets rid of the usual “which version”
    question on mailing list, support forums, etc…
  • bugfix imtcp: octet-counted framing cannot be turned off
  • bugfix: build problems on Illuminos
    Thanks to Andrew Stormont for the patch
  • bugfix: invalid data size for iMaxLine global property
    It was defined as int, but inside the config system it was declared as
    size type, which uses int64_t. With legacy config statements, this could
    lead to misadressing, which usually meant the another config variable was
    overwritten (depending on memory layout).
    closes https://github.com/rsyslog/rsyslog/issues/205
  • bugfix: negative values for maxMessageSize global parameter were permitted

RSyslog Windows Agent 3.0 Released

Adiscon is proud to announce the 3.0 release of RSyslog Windows Agent.

This new major release adds full support for Windows 2012 R2 and also has been verified to work on Windows 10 preview versions.

The new major version is a milestone in many ways. Most important the performance of the core engine has been considerably increased. All existing configurations will benefit from this. Also a new Configuration Client has been added which has been rewritten using the .Net Framework (Like the InterActive Syslog Viewer). With the new Configuration Client, we also introduce support for a new file based configuration format (as an alternative to the registry-based method). RSyslog Windows Agent can now run from a configuration file and save it state values
into files.

We also extended the classic EventLog Monitor to support multiple dynamic *.evt files for NetApp customers.

Detailed information can be found in the version history below.

Build-IDs: Service 3.0.130, Client 3.0.201

Features

  • Faster core engine
  • New Configuration Client running on Microsoft .Net Framework. If wanted, the old client application can be installed manually as “RSyslog Windows Agent Legacy Client”.
  • The Agent can be switched from registry to file based configuration support. Requires usage of the new configuration client.
  • EventLog Monitor Classic(V1): Support for dynamic Eventlog files added.
    Kindly use an asterix (*) in the eventlog filename to activate it, for example: \\netappdevice\c$\etc\log\adtlog.*.evt
    When activated, EventLog Monitor will process all matching files automatically. The feature was primary added for NETAPP users who have dynamic filenames.
  • New System Property added to created UUID’s called “$NEWUUID”. Generates a random generated 128Bit UUID (Universally Unique Identifiers).

Bugfixes

  •  none

Version 3.0 is a free download. Customers with existing 2.x keys can contact our Sales department for upgrade prices. If you have a valid Upgrade Insurance ID, you can request a free new key by sending your Upgrade Insurance ID to sales@adiscon.com. Please note that the download enables the free 30-day trial version if used without a key – so you can right now go ahead and evaluate it.

rsyslog daily builds and tarballs

The past days, we have worked on making rsyslog daily builds and tarballs a reality. We hope this will enable users to rapidly deploy the latest features as well as make it easier to help with testing the current development system. Daily builds are what the scheduled v8-devel builds were under the previous release paradigm. Consequently, the archives are named v8-devel.

Right now, builds are only supported for Ubuntu. Users of other platforms are advised to use the daily tarballs to build from source. Depending on feedback on and success of the daily builds, we will make them available for more platforms.  

A daily build is based on the latest git master version. So it really is at the [b]leading edge of technology. So why create them?

A top reason is that we often fix a bug for someone, and that someone then is unable to build from source. In the end result, we have a bugfix, but there is no external confirmation that it really fixed the bug when we merge it into the next release. We hope that now those users can simply pick the daily build and check if that solves their problem.

Also, in general we hope that some users will use the daily tarballs to get not only the latest and greatest but contribute to the project by doing some testing.

Finally, and quite important, with daily builds we will see build problems as early as possible. In the past, we often saw problems only after source release (or very close to it), which was obviously problematic. Now, this should no longer happen. For obvious reasons, the final release build is now more or less a copy of a daily build.

As a technical side-note, daily builds are identified by the git master branch head hash that was used to build them. As a forth version component, they have the first 12 digits of that hash (an example is “8.8.0.35e7f12a2c04”). This enables us to track error reports to the right version. The packages have a different version name, based on the build date. The reason is that the hash does not increment and so newer versions (with lower hash values) are considered as “old” by Launchpad. We avoid this by using an always incrementing package version. Also note that the package changelog just contains a “daily build” entry — anything else makes limited sense.

We hope you enjoy this new feature! Feedback is appreciated.

rsyslog 8.7.0 (v8-stable) released

We have released rsyslog 8.7.0.

Version 8.7.0 contains various improvements and additions to a wide array of modules, like imfile, imptcp, improvements to RainerScript and mmnormalize (thanks to Singh Janmejay) and a couple of other improvements. But, the biggest addition is the new omkafka module that now allows direct writing to Apache Kafka.

This release also contains important bug fixes.

This is a recommended upgrade for all users.

 

ChangeLog:

http://www.rsyslog.com/changelog-for-8-7-0-v8-stable/

Download:

http://www.rsyslog.com/downloads/download-v8-stable/

As always, feedback is appreciated.

Best regards,
Florian Riedl

Changelog for 8.7.0 (v8-stable)

Version 8.7.0 [v8-stable] 2015-01-13

  • add message metadata “system” to msg object
    this permits to store metadata alongside the message
  • imfile: add support for “filename” metadata
    this is useful in cases where wildcards are used
  • imptcp: make stats counter names consistent with what imudp, imtcp uses
  • added new module “omkafka” to support writing to Apache Kafka
  • omfwd: add new “udp.senddelay” parameter
  • mmnormalize enhancements
    Thanks to Janmejay Singh for the patch.
  • RainerScript “foreach” iterator and array reading support
    Thanks to Janmejay Singh for the patch.
  • now requires liblognorm >= 1.0.2
  • add support for systemd >= 209 library names
  • BSD “ntp” facility (value 12) is now also supported in filter
    Thanks to Douglas K. Rand of Iteris, Inc. for the patch.
    Note: this patch was released under ASL 2.0 (see email-conversation).
  • bugfix: global(localHostName=”xxx”) was not respected in all modules
  • bugfix: emit correct error message on config-file-not-found
    closes https://github.com/rsyslog/rsyslog/issues/173
  • bugfix: impstats emitted invalid JSON format (if JSON was selected)
  • bugfix: (small) memory leak in omfile’s outchannel code
    Thanks to Koral Ilgun for reporting this issue.
  • bugfix: imuxsock did not deactivate some code not supported by platform
    Among potential other problemns, this caused build failure under Solaris.
    Note that this build problem just made a broader problem appear that so
    far always existed but was not visible.
    closes https://github.com/rsyslog/rsyslog/issues/185

LibLogging 1.0.5 released

liblogging 1.0.5 [download]

We have released liblogging 1.0.5.

This release has a important bugfix for a bug that caused the date stamp to be partially incorrect. The day part was totally off and this affected the “uxsock:” and “file:” drivers.

sha256sum: 310dc1691279b7a669d383581fe4b0babdc7bf75c9b54a24e51e60428624890b

—————————————————————————-
v1.0.5 2014-12-09
– cleanup for systemd-journal >= 209
  closes https://github.com/rsyslog/liblogging/issues/17
– bugfix: date stamp was incorrectly formatted
  The day part was totally off. This affected the “uxsock:” and “file:”
  drivers.
  closes https://github.com/rsyslog/liblogging/issues/21

rsyslog -devel packages are being removed soon

If you use rsyslog’s devel packages on your system, you will receive errors soon. Be sure to read the complete posting to avoid trouble!

As part of rsyslog’s new release schedule and version naming, devel releases will no longer be named according to the “normal” numbering scheme. This also means that the previous “devel” branches will disappear, as git master branch now is the always-current devel version.

Keep on your mind that we previously had a release cycle of 3 to 9 month for a new feature to appear in a stable version. That was because new feature releases were only done when a complete devel turnaround was done, and relatively many new features were added. For this reason, some people opted to run devel versions in production, and thus needed specific tarballs (and packages) for them.

With the new six week release cycle, we get new features rather quickly into the stable builds. So it usually should be no problem to wait for the next stable to use that recently-implemented new feature. As such, there is no need any longer for special devel releases, and thus no need for devel tarballs and packages.

Well… almost. One thing we would like to have is a “daily devel version”. The idea is that if the testbench runs are OK, a new tarball and a set of packages is generated automatically and posted to a special archive. In general, that archive should receive an update once a day. So people really interested in the [b]leading edge can simply install from that daily package archive — and report bugs quickly, so helping the development process. Unfortunately, time is precious and we don’t know when and if we can setup the required automation. Most probably not before January 2015, and how it works out then needs to be seen.

In the interim, we will begin to delete the -devel packages. The old -devel tarballs will remain available, at least for the time being. The problem with -devel packages is that folks may have set their system to use the -devel repro. If we would just keep it as is, those systems would never again receive any updates, neither security-releated nor others, simply because -devel versions no longer exist in the way they were. That would pose a potentially big security risk. As such, we will delete the -devel content, and begin to do so early next week. If you use the -devel packages, be sure to switch the v8-stable instead.

rsyslog 8.6.0 (v8-stable) released

We have released rsyslog 8.6.0.

This is the first stable release under a new release cycle and versioning scheme. This new scheme is important news in itself. For more details, please have a look here:

http://www.rsyslog.com/rsyslogs-new-release-cycle-and-versioning-scheme/

Version 8.6.0 contains important new features like the ability to monitor files via wildcards in imfile. It also contains new, experimental zero message queue modules (special thanks to team member Brian Knox), improvements to RainerScript and mmnormalize (thanks to Singh Janmejay) and a couple of other improvements.

This release also contains important bug fixes.

This is a recommended upgrade for all users.

 

ChangeLog:

http://www.rsyslog.com/changelog-for-8-6-0-v8-stable/

Download:

http://www.rsyslog.com/downloads/download-v8-stable/

As always, feedback is appreciated.

Best regards,
Florian Riedl

Scroll to top