Cleaning InfluxDB measurements with mixed Tags and Fields

       463 words, 3 minutes

While experimenting with Telegraf, SNMP and InfluxDB, I ended up filling some measurements with tags and fields that have the same name. InfluxDB works nicely with such user-case by adding numbers (in format “_###”) to those keys. But that’s a mess to deal with when using Grafana. And a mess in general.

Polling SNMP into Telegraf is about writing blocs like this one:

[[inputs.snmp.table]]
  name = "snmp_hrStorageTable"
  oid = "HOST-RESOURCES-MIB::hrStorageTable"
  inherit_tags = [ "sysName" ]
  [[inputs.snmp.table.field]]
    name = "hrStorageDescr"
    oid = "HOST-RESOURCES-MIB::hrStorageDescr"
    is_tag = true
  [[inputs.snmp.table.field]]
    name = "hrStorageType"
    oid = "HOST-RESOURCES-MIB::hrStorageType"
    is_tag = true

And it works great when you do it properly. But… when you forgot the “is_tag” or decide to change its value, you end up with a tag and field that have the same name. In my case, the measurement turned that way:

> SHOW TAG KEYS FROM snmp_hrStorageTable ; SHOW FIELD KEYS FROM snmp_hrStorageTable
name: snmp_hrStorageTable
tagKey
------
agent_host
dc
host
hrStorageDescr
hrStorageIndex
hrStorageType
sysName

name: snmp_hrStorageTable
fieldKey fieldType
-------- ---------
hrStorageAllocationFailures integer
hrStorageAllocationUnits integer
hrStorageDescr string
hrStorageSize integer
hrStorageType string
hrStorageUsed integer

From the InfluxDB point of view, I got series like:

name: snmp_hrStorageTable
time                 hrStorageDescr  hrStorageDescr_1 hrStorageIndex hrStorageSize hrStorageType         hrStorageType_1       hrStorageUsed sysName
----                 --------------  ---------------- -------------- ------------- -------------         ---------------       ------------- -------
2018-08-31T13:29:01Z Physical memory Physical memory  1              16235108      .1.3.6.1.2.1.25.2.1.2 .1.3.6.1.2.1.25.2.1.2 15812904      syno
2018-09-06T13:29:01Z                 Physical memory  1              16235108                            .1.3.6.1.2.1.25.2.1.2 15869172      syno

Which explains I don’t get the data when querying “(…) WHERE “hrStorageDescr” = ‘Physical memory’ (…)” now that Telegraf (or InfluxDB) stores the data in hrStorageDescr_1 rather than hrStorageDescr.

There doesn’t seem to be a way to remove tags or fields from series. To sanitize my data, I gotta get the clean one and move them to a new measurement. In InfluxDB, this is done using the “INTO” clause.

The idea is simple : Grab the clean data from hrStorageDescr to tmp. Delete hrStorageDescr. Insert data from tmp to a clean hrStorageDescr.

> SELECT hrStorageAllocationFailures,hrStorageAllocationUnits,hrStorageSize,hrStorageUsed INTO tmp FROM snmp_hrStorageTable GROUP BY *
name: result
time written
---- -------
1970-01-01T00:00:00Z 7394290
> DROP MEASUREMENT snmp_hrStorageTable
> SELECT hrStorageAllocationFailures,hrStorageAllocationUnits,hrStorageSize,hrStorageUsed INTO snmp_hrStorageTable FROM tmp GROUP BY *
name: result
time                 written
----                 -------
1970-01-01T00:00:00Z 7394290
> SHOW TAG KEYS FROM snmp_hrStorageTable ; SHOW FIELD KEYS FROM snmp_hrStorageTable
name: snmp_hrStorageTable
tagKey
------
agent_host
dc
host
hrStorageDescr
hrStorageIndex
hrStorageType
sysName

name: snmp_hrStorageTable
fieldKey                    fieldType
--------                    ---------
hrStorageAllocationFailures integer
hrStorageAllocationUnits    integer
hrStorageSize               integer
hrStorageUsed               integer

Now, my measurement is clean. The only lost data are those written between the end of the SELECT INTO and the DROP. In my case, polling every minutes, I didn’t loose anything.

The important thing is to only select fields and “GROUP BY *”. This allows to keep the tags as tags. If not used, tags will become keys. And that’s precisely what I’m fighting against here.