Masking data in logs and RSYSLOG

As a mobile payments company, we at SumUp must follow many industry regulations, one of them being PCI DSS. Restricting access to credit card numbers is a clear need, and this means ensuring they never appear in the logs, which are used for various purposes and have a wider audience than the list of employees authorized to access sensitive data.

The PAN, or primary account number, is what card issuers use as the card number. It is unique, carries information about the issuer, and in the majority of cases can be validated with the Luhn algorithm. This is the number on your credit or debit card, and we must keep it secret.
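The Luhn check mentioned above can be sketched in a few lines of Python. The function name `luhn_valid` is ours and purely illustrative.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore separators such as spaces or dashes.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

A check like this could be applied to regular expression matches to discard digit runs that cannot be card numbers.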

One widely used approach is quality assurance of the logs throughout the development and deployment cycle. This is a necessary and valuable practice; however, it takes a lot of human resources, and it is a rather reactive approach when dealing with production systems. So we want something better, something mandatory that keeps us on the safe side if human error occurs somewhere in the chain. This is very important in our case, where we need to keep the log management system out of PCI scope. Of the four approaches offered by PCI DSS Requirement 3.4:

3.4 Render PAN unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches:

  • One-way hashes based on strong cryptography (hash must be of the entire PAN).
  • Truncation (hashing cannot be used to replace the truncated segment of PAN).
  • Index tokens and pads (pads must be securely stored).
  • Strong cryptography with associated key-management processes and procedures.

our natural choice for log messages is truncation. We want to truncate PAN data if it appears in the logs for any reason, for example when the log level is temporarily increased for an investigation. While our centralized log storage is in PCI scope, we want to transfer the logs in real time to an external location accessible to developers and BI, where they can find and use the information they need.
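To make truncation concrete, here is a minimal Python sketch. It assumes the common convention of keeping at most the first six and last four digits; the function name and the `*` filler are illustrative choices, not something prescribed by the text above.

```python
def truncate_pan(pan: str, keep_first: int = 6, keep_last: int = 4,
                 filler: str = "*") -> str:
    """Replace the middle digits of a PAN, keeping only its head and tail."""
    # Drop separators so "4111 1111 ..." and "4111-1111-..." behave the same.
    digits = "".join(ch for ch in pan if ch.isdigit())
    hidden = len(digits) - keep_first - keep_last
    if hidden <= 0:
        # Too short to keep anything safely: hide every digit.
        return filler * len(digits)
    return digits[:keep_first] + filler * hidden + digits[-keep_last:]
```

For instance, `truncate_pan("4111 1111 1111 1111")` gives `411111******1111`. The removed digits are discarded rather than hashed or encrypted, so they cannot be recovered from the log.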

Since we use rsyslog as our logging daemon, our next step was to get in touch with Adiscon, the company behind this brilliant piece of software. They were very interested when I explained the idea, and the work started. A little later we got a new message modification module called mmexternal. It sends the message to an external binary and expects a reply from it. More on the implementation here.

Let me give you an example: a snippet from the rsyslog config and a Python script that uses a regular expression to catch and replace, for example, VISA, MasterCard and AMEX card numbers. You may find a lot of useful regular expressions here:

rsyslog.conf

module(load="mmexternal")
action(type="mmexternal"
       binary="/usr/local/bin/external_python_cards_replace.py"
       interface.input="msg")

external_python_cards_replace.py
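A minimal sketch of such a script might look like the following. It assumes the external plugin interface as we understand it from the rsyslog documentation: with interface.input="msg", rsyslog writes one message per line to the script's stdin and reads one JSON reply per line from its stdout, where `{}` means "no change" and `{"msg": "..."}` replaces the message. The regular expression, the function names and the masking format are illustrative assumptions.

```python
#!/usr/bin/env python3
import json
import re
import sys

# Illustrative patterns only; real card ranges are broader than this.
CARD_RE = re.compile(
    r"\b(?:4\d{12}(?:\d{3})?"   # VISA: 13 or 16 digits, starts with 4
    r"|5[1-5]\d{14}"            # MasterCard: 16 digits, starts with 51-55
    r"|3[47]\d{13})\b"          # AMEX: 15 digits, starts with 34 or 37
)


def mask(match):
    """Keep the first six and last four digits of a matched number."""
    pan = match.group(0)
    return pan[:6] + "*" * (len(pan) - 10) + pan[-4:]


def process(line):
    """Return the one-line JSON reply for a single log message."""
    masked, count = CARD_RE.subn(mask, line)
    if count == 0:
        return "{}"  # nothing matched: leave the message untouched
    return json.dumps({"msg": masked})


def serve(stream_in, stream_out):
    """Answer every incoming line with exactly one reply line."""
    for line in iter(stream_in.readline, ""):
        stream_out.write(process(line.rstrip("\n")) + "\n")
        stream_out.flush()  # rsyslog waits for the reply, so do not buffer
```

When deployed as the mmexternal binary, the file would end with a call to `serve(sys.stdin, sys.stdout)` and needs to be executable.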

Please note that the above snippets are only examples. Using regular expressions, you will get many false positives, but in general this won’t be an issue. Also note that you can modify completely different parts of the logs, and you are not limited to any particular language or technique for doing so.

With the example above, resource consumption on the server where log modification is done is negligible. A synthetic test, which makes no claim to accuracy, shows around 5% CPU usage on a single-core 2.5GHz virtual CPU at 100 messages/s.

This is how we are doing it. All comments and suggestions are welcome!
