Writing syslog messages to MariaDB, MySQL, PostgreSQL or any other supported Database¶

Written by Rainer Gerhardswith some additions by Marc Schiffbauer (2008-02-28)

Abstract¶

*In this paper, I describe how to write*syslog*messages to a*MariaDB <http://www.mariadb.org/>)_/MySQL *or*PostgreSQL*database.*Having syslog messages in a database is often handy, especially when you intend to set up a front-end for viewing them. This paper describes an approach withrsyslogd, an alternative enhanced syslog daemon natively supporting MySQL and PostgreSQL. I describe the components needed to be installed and how to configure them. Please note that as of this writing, rsyslog supports a variety of databases. While this guide is still MySQL- and PostgreSQL-focused, you can probably use it together with other ones too. You just need to modify a few settings.

Background¶

In many cases, syslog data is simply written to text files. This approach has some advantages, most notably it is very fast and efficient. However, data stored in text files is not readily accessible for real-time viewing and analysis. To do that, the messages need to be in a database. There are various ways to store syslog messages in a database. For example, some have the syslogd write text files which are later feed via a separate script into the database. Others have written scripts taking the data (via a pipe) from a non-database-aware syslogd and store them as they appear. Some others use database-aware syslogds and make them write the data directly to the database. In this paper, I use that “direct write” approach. I think it is superior, because the syslogd itself knows the status of the database connection and thus can handle it intelligently (well … hopefully ;)). I use rsyslogd to accomplish this, simply because I have initiated the rsyslog project with database-awareness as one goal.

One word of caution: while message storage in the database provides an excellent foundation for interactive analysis, it comes at a cost. Database i/o is considerably slower than text file i/o. As such, directly writing to the database makes sense only if your message volume is low enough to allow a) the syslogd, b) the network, and c) the database server to catch up with it. Some time ago, I have written a paper on optimizing syslog server performance. While this paper talks about Window-based solutions, the ideas in it are generic enough to apply here, too. So it might be worth reading if you anticipate medium high to high traffic. If you anticipate really high traffic (or very large traffic spikes), you should seriously consider forgetting about direct database writes - in my opinion, such a situation needs either a very specialized system or a different approach (the text-file-to-database approach might work better for you in this case).

Overall System Setup¶

In this paper, I concentrate on the server side. If you are thinking about interactive syslog message review, you probably want to centralize syslog. In such a scenario, you have multiple machines (the so-called clients) send their data to a central machine (called server in this context). While I expect such a setup to be typical when you are interested in storing messages in the database, I do not describe how to set it up. This is beyond the scope of this paper. If you search a little, you will probably find many good descriptions on how to centralize syslog. If you do that, it might be a good idea to do it securely, so you might also be interested in my paper on ssl-encrypting syslog message transfer.

No matter how the messages arrive at the server, their processing is always the same. So you can use this paper in combination with any description for centralized syslog reporting.

As I already said, I use rsyslogd on the server. It has intrinsic support for talking to the supported databases. For obvious reasons, we also need an instance of MySQL or PostgreSQL running. To keep us focused, the setup of the database itself is also beyond the scope of this paper. I assume that you have successfully installed the database and also have a front-end at hand to work with it (for example, phpMyAdmin or phpPgAdmin. Please make sure that this is installed, actually working and you have a basic understanding of how to handle it.

Setting up the system¶

You need to download and install rsyslogd first. Obtain it from the rsyslog site. Make sure that you disable stock syslogd, otherwise you will experience some difficulties. On some distributions (Fedora 8 and above, for example), rsyslog may already by the default syslogd, in which case you obviously do not need to do anything specific. For many others, there are prebuild packages available. If you use either, please make sure that you have the required database plugins for your database available. It usually is a separate package and typically not installed by default.

It is important to understand how rsyslogd talks to the database. In rsyslogd, there is the concept of “templates”. A template is a list of text fragments and message properties. When rendered, the property replacer expands the property references and rsyslog concatenates them with the literal text. For example, a template that contains only the msg property will be expanded to the syslog message content. Rsyslogd supports sending the resulting text as an SQL statement to the database. As such, the template must be a valid SQL statement. There is no limit in what the statement might be, but there are some obvious and not so obvious choices. For example, a template “drop table xxx” is possible, but does not make an awful lot of sense. In practice, you will always use an “insert” statement inside the template.

An example: if you would just like to store the msg part of the full syslog message, you have probably created a table “syslog” with a single column “message”. In such a case, a Rainerscript list template might look like this:

template(
    name="sqlInsertMessage"
    type="list"
) {
    constant(value="insert into syslog(message) values ('")
    property(name="msg" sql="on")
    constant(value="')")
}

When rendered for a message with the content “Test” the template above expands to insert into syslog(message) values('Test'). The action then sends that SQL statement to the database. It’s that easy, no special magic. The only thing you must ensure is that your template expands to a proper SQL statement and that this statement matches your database design.

Does that mean you need to create database schema yourself and also must fully understand rsyslogd’s properties? No, that’s not needed. Because we anticipated that folks are probably more interested in getting things going instead of designing them from scratch. So we have provided a default schema as well as built-in support for it. This schema also offers an additional benefit: rsyslog is part of Adiscon’s MonitorWare product line (which includes open source and closed source members). All of these tools share the same default schema and know how to operate on it. For this reason, the default schema is also called the “MonitorWare Schema”. If you use it, you can simply add phpLogCon, a GPLed syslog web interface, to your system and have instant interactive access to your database. So there are some benefits in using the provided schema.

The schema definition is contained in the file “createDB.sql”. It comes with the rsyslog package and one can be found for each supported database type (in the plugins directory). Review it to check that the database name is acceptable for you. Be sure to leave the table and field names unmodified, because otherwise you need to customize rsyslogd’s default sql template, which we do not do in this paper. Then, run the script with your favorite SQL client. Double-check that the table was successfully created.

It is important to note that the correct database encoding must be used so that the database will accept strings independent of the string encoding. This is an important part because it can not be guaranteed that all syslog messages will have a defined character encoding. This is especially true if the rsyslog-Server will collect messages from different clients and different products.

For example PostgreSQL may refuse to accept messages if you would set the database encoding to “UTF8” while a client is sending invalid byte sequences for that encoding.

Database support in rsyslog is integrated via loadable plugin modules. To use the database functionality, the database plugin must be enabled in the config file before the first database table action is used. With Rainerscript this is done via a module() statement. For MariaDB or MySQL place the following near the top of /etc/rsyslog.conf:

module(load="ommysql")

For PostgreSQL replace the module name with ompgsql. For other databases, use the plugin that matches your database backend (for example omlibdbi when routing through libdbi).

Next, instruct rsyslog to write data to the database. When you use the default schema you do not need to define a custom template. The built-in template is selected automatically. The Rainerscript fragment below forwards every received message to a MariaDB/MySQL database:

action(
    type="ommysql"
    server="database-server"
    db="database-name"
    uid="database-userid"
    pwd="database-password"
)

The parameters map 1:1 to the plugin options described in the ommysql: MariaDB/MySQL Database Output Module documentation. For PostgreSQL use type="ompgsql" and the equivalent parameter names. In many cases, the database will run on the local machine. In this case, you can simply use 127.0.0.1 for server. This can be especially advisable, if you do not need to expose the database to any process outside of the local machine. In this case, you can simply bind it to 127.0.0.1, which provides a quite secure setup. Of course, rsyslog also supports remote database instances. In that case, use the remote server name (e.g. mydb.example.com) or IP-address.

The db parameter defaults to Syslog. If you have modified the default, use your database name. uid and pwd are the credentials used to connect to the database. As they are stored in clear text in the configuration file, that user should have only the least possible privileges. It is sufficient to grant it INSERT privileges to the systemevents table only. As a side note, it is strongly advisable to make /etc/rsyslog.conf readable by root only - if you make it world-readable, everybody could obtain the password (and eventually other vital information from it). In our example, let’s assume you have created a database user named syslogwriter with a password of topsecret (just to say it bluntly: such a password is NOT a good idea…). If your database is on the local machine, your configuration could look like this:

action(
    type="ommysql"
    server="127.0.0.1"
    db="Syslog"
    uid="syslogwriter"
    pwd="topsecret"
)

Save the configuration and restart rsyslogd - you should see syslog messages being stored in the systemevents table!

The example action stores every message to the database. Especially if you have a high traffic volume, you will probably limit the amount of messages being logged. This is easy to accomplish: the “write database” action can be wrapped by Rainerscript filtering. If, for example, you are only interested in messages from the mail subsystem, you can use the following construct:

if prifilt("mail.*") then {
    action(
        type="ommysql"
        server="127.0.0.1"
        db="Syslog"
        uid="syslogwriter"
        pwd="topsecret"
    )
}

See Control Structures and prifilt() for the available filtering primitives.

You have now completed everything necessary to store syslog messages to the a database. If you would like to try out a front-end, you might want to look at phpLogCon, which displays syslog data in a browser. As of this writing, phpLogCon is not yet a powerful tool, but it’s open source, so it might be a starting point for your own solution.

On Reliability…¶

Rsyslogd writes syslog messages directly to the database. This implies that the database must be available at the time of message arrival. If the database is offline, no space is left or something else goes wrong - rsyslogd can not write the database record. If rsyslogd is unable to store a message, it performs one retry. This is helpful if the database server was restarted. In this case, the previous connection was broken but a reconnect immediately succeeds. However, if the database is down for an extended period of time, an immediate retry does not help.

Message loss in this scenario can easily be prevented with rsyslog. All you need to do is run the database writer in queued mode. This is now described in a generic way and I do not intend to duplicate it here. So please be sure to read Handling a massive syslog database insert rate with Rsyslog, which describes the scenario and also includes configuration examples that build on General Queue Parameters.

Conclusion¶

With minimal effort, you can use rsyslogd to write syslog messages to a database. You can even make it absolutely fail-safe and protect it against database server downtime. Once the messages are arrived there, you can interactively review and analyze them. In practice, the messages are also stored in text files for longer-term archival and the databases are cleared out after some time (to avoid becoming too slow). If you expect an extremely high syslog message volume, storing it in real-time to the database may outperform your database server. In such cases, either filter out some messages or used queued mode (which in general is recommended with databases).

The method outlined in this paper provides an easy to setup and maintain solution for most use cases.

References and Additional Material¶

www.rsyslog.com - the rsyslog site
Paper on Syslog Server Optimization

Support: rsyslog Assistant | GitHub Discussions | GitHub Issues: rsyslog source project

Contributing: Source & docs: rsyslog source project