Store the system logs in MariaDB

       841 words, 4 minutes

I’ve used Elasticsearch on OpenBSD to store my system logs for quite long now. And if it does the job, there are a few things I don’t like so much with it.

I only used a single instance so I was warned about availability. But a sudden power outage had severe impact on my daily data. Way much more than what I expected from a Production-ready software. Rebuilding and re-indexing the data was a real pain in the ass. From time to time, I also get errors about indexing that seem to go away without doing nothing.

The latter is probably due to my low memory server. But I want to store logs for only a couple of boxes. And I don’t want to reserve 4GB of RAM just for this. This “gimme more RAM” manner really annoys me. And as I also need RAM for Logstash (to parse the data and send them to Elasticsearch), this leads to way too much resources consumption.

That said, I decided to test another way for storing the logs : using a RDBMS, namely MariaDB. I already have one running smooth. And I read Grafana was able to read data from it using SQL commands.

How it’ll work

Syslog and MariaDB overview

The stock syslogd(8) will be configured to send everything it gets to a local (or remote) syslog-ng daemon. The latter will parse, filter, format and store the logs into a (remote) MySQL / MariaDB instance.

Prepare the RDBMS

I’m using mariadb-server-10.0.34v1 on OpenBSD 6.3/amd64.

First of all, I want to be able to compress the (text) data from the logs. So I had to enable a few InnoDB related options.

# vi /etc/my.cnf
(...)
innodb_file_per_table = 1
innodb_file_format = barracuda
innodb_strict_mode = 1
(...)

# rcctl restart mysqld

Then, I simply created a database and the credentials that’d be used by syslog-ng.

# mysql -u root -p
(...)
> CREATE database logsink;
> GRANT ALL PRIVILEGES ON logsink.* \
  TO 'syslog-ng'@'%' IDENTIFIED BY 'changeme';
> FLUSH PRIVILEGES;

Install and configure Syslog-NG

There are drivers required by syslog-ng to store data into mysql.

# pkg_add syslog-ng libdbi-drivers-mysql

Syslog-NG will listen on all interfaces, UDP and TCP ports. This way, any other box can send its logs to him.

# vi /etc/syslog-ng/syslog-ng.conf
(...)
source s_net {
    udp(port(8514));
    tcp(port(8514));
};
(...)
destination d_mysql_compressed {
  sql(
    type(mysql)
    host("127.0.0.1") username("syslog-ng") password("changeme")
    database("logsink")
    table("_all")
    create-statement-append(ROW_FORMAT=COMPRESSED)
    columns(
      "seq        bigint(20) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY",
      "unixtime   bigint NOT NULL",
      "facility   varchar(16)",
      "priority   varchar(16)",
      "level      varchar(16)",
      "host       varchar(64) NOT NULL",
      "program    varchar(64) NOT NULL",
      "pid        smallint",
      "message    text",
      "tag        varchar(32)"
    )
    values(
      "", "${UNIXTIME}", "$FACILITY_NUM", "$PRIORITY", "$LEVEL_NUM",
      "${HOST}", "$PROGRAM", "${PID}", "${MSGONLY}", "$TAG"
    )
    indexes("unixtime", "host", "program", "tag")
    null("")
  );
};
(...)
log { source(s_net); filter(f_all); destination(d_mysql_compressed); };
(...)
# rcctl enable syslog_ng
# rcctl start syslog_ng

When this is done, configure syslogd(8).

# vi /etc/syslog.conf
(...)
*.* @127.0.0.1:8514

# rcctl restart syslogd

Explore the logs

From here, the logs should be stored in MariaDB / MySQL.

A first look at the tables shows “COMPRESSED” is better than the standard storage ; regarding disk usage.

> SELECT TABLE_NAME,ENGINE,ROW_FORMAT,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE \
  FROM information_schema.tables WHERE table_schema='logsink';
+----------------+--------+------------+------------+-------------+--------------+-----------+
| TABLE_NAME     | ENGINE | ROW_FORMAT | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH | DATA_FREE |
+----------------+--------+------------+------------+-------------+--------------+-----------+
| _all           | InnoDB | Compressed |      92409 |     6561792 |      5275648 |   2097152 |
| _all_compact   | InnoDB | Compact    |      93643 |    14172160 |     10551296 |   7340032 |
+----------------+--------+------------+------------+-------------+--------------+-----------+

From the filesystem POV, the gain is also clearly visible.

-rw-rw----   1 _mysql  _mysql   3.3K Aug  3 16:22 _all.frm
-rw-rw----   1 _mysql  _mysql  16.0M Aug  6 15:52 _all.ibd
-rw-rw----   1 _mysql  _mysql   3.3K Aug  3 16:22 _all_compact.frm
-rw-rw----   1 _mysql  _mysql  36.0M Aug  6 15:52 _all_compact.ibd

Have a look at the most verbose programs is just a matter of writing SQL sentence:

> SELECT program, COUNT(program) AS messages FROM _all GROUP BY program ORDER BY messages DESC;
+---------------------+----------+
| program             | messages |
+---------------------+----------+
| monit               |    47596 |
| smtpd               |    19689 |
| rspamd              |    12884 |
| doas                |     4546 |
| collectd            |     4265 |
| sshd                |     3018 |
| cron                |     2545 |
(...)

The logs can be accessed and rendered by Grafana. A simple query can print the last logs. Add alerting when some value appear and you have a nice event-based monitoring tool.

Organize storage

There are logs that I don’t want to store. And there are some that I want to store in a specific table. This can be done in Syslog-NG using filters.

<pre>filter f_all {
      not program("fetchmail");
  and not program("monit");
  and not filter(f_unbound);
  and not filter(f_apache);
};

This will not send messages from fetchmail or monit to the compressed table. Nor will it send messages that match the f_unbound and f_apache filters. Those two guys are used to store messages in a specific table with a dedicated schema. I’ll probably write about the details some day…

Now… send all your logs to Syslog-NG rather than Logstash. Count to 10 and get your RAM back! So far, MariaDB seem to handle it pretty well.