=======================
== blog.shellcode.in ==
=======================
My corner of the Internet

Kubernetes Logging With Fluentd and Fluent Bit

k8s logging fluentd fluent-bit

Introduction

Part of observability is logging. Logging is heavily used to debug and understand what is happening in systems.

The end goal is to ship logs to an Elasticsearch cluster running inside Kubernetes, using fluent-bit to forward logs to a fluentd aggregator.

Fluent-bit

Fluent-bit is going to be used to grab the logs from pods, check whether they are JSON, parse them if so, and then forward them to fluentd. The reason for using both fluent-bit and fluentd is that we can aggregate the logs in fluentd. The aggregation allows us to buffer the logs and do more filtering. If we used only fluent-bit we would have many instances all sending logs directly to Elasticsearch, and it is harder to control the flow of logs when there are many nodes. Elasticsearch can be difficult to configure to get ingestion working just right, and with more clients all sending logs at once the ingest can be overloaded. Fluent-bit also does not have as many knobs to turn as fluentd.

A key feature is going to be an index per namespace. This lets us control logs at a lower level, keeps index sizes smaller, and helps with Elasticsearch's default limit of 1000 mapped fields per index. If all logs go to the same index we will reach the 1000 field limit quickly; with an index per namespace we are much less likely to hit it.
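For reference, the cap comes from Elasticsearch's `index.mapping.total_fields.limit` setting, which defaults to 1000. It can be raised per index or in a template (the index name below is just an example), though splitting indices by namespace avoids the need:

```
PUT /fluentd-k8s-example/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```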

Fluent-bit is a very capable program and can be used without aggregation to fluentd.

Below is almost the default configuration from the fluent-bit helm chart, with a few changes: the buffer in the kubernetes filter is raised to 1M and the output is simply a forward to our fluentd instance. Fluent-bit is deployed as a daemonset.

This config grabs the kubernetes metadata that is useful when searching logs.

config:
  service: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level info
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On

  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On

  filters: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Buffer_Size 1M
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On

  outputs: |
    [OUTPUT]
        Name forward
        Match kube.*
        Host fluentd
        Port 24224

Side note: below is an output config section for pushing logs directly to Elasticsearch. The only issue is that the Logstash_Prefix_Key is based on the kubernetes namespace. Setting up index templates to work with this is very difficult unless your namespaces already share a prefix. In Elasticsearch you glob indices with a prefix, so if every namespace does not share the same prefix you either need to create a lot of patterns or lose out on automatically assigning the right settings to each new index.

[OUTPUT]
    Name es
    Match kube.*
    Host eck-dev-es-ingest
    Port 9200
    HTTP_User elastic
    HTTP_Passwd ${ELASTICSEARCH_PASSWORD}
    Logstash_Format On
    Logstash_Prefix_Key $kubernetes['namespace_name']
    Replace_Dots On
    Trace_Error Off
    Retry_Limit 20
    tls On
    tls.verify Off

Fluentd

Fluentd is used to aggregate the logs and perform a few more modifications to them. Some of the filters applied in fluentd could be applied in fluent-bit instead; the reason for applying them in fluentd is that we keep fluent-bit as a simple log forwarder.

Fluentd is deployed as a statefulset so that it can have a persistent disk for buffering. If fluentd goes down then the buffer is still left on disk for when it comes back up.
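As a sketch of the persistence side, a StatefulSet requests its disk through a volumeClaimTemplate. The field names below are from the standard StatefulSet API, but the claim name, size, and mount path are assumptions; the actual helm chart values for fluentd may differ:

```yaml
# Sketch only: claim name and size are assumptions, not the chart's defaults.
volumeClaimTemplates:
  - metadata:
      name: fluentd-buffer        # mounted where the file buffer lives, e.g. /var/log/fluent
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```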

fileConfigs:
  01_sources.conf: |-
    <source>
      @type forward
      bind 0.0.0.0
      port 24224
    </source>

  02_filters.conf: |-
    <filter kube.**>
      @type             dedot
      de_dot            true
      de_dot_separator  _
      de_dot_nested     false
    </filter>

    <filter kube.**>
      @type rename_key
      rename_rule1 ^app$ app.kubernetes.io/name
      rename_rule2 ^chart$ helm.sh/chart
      rename_rule3 ^istio$ istio.io/istio
      rename_rule4 ^lenses.io/app.type$ lenses.io/app-type
    </filter>

    <match kube.**>
      @type relabel
      @label @DISPATCH
    </match>

  03_dispatch.conf: |-
    <label @DISPATCH>
      <filter **>
        @type prometheus
        <metric>
          name fluentd_input_status_num_records_total
          type counter
          desc The total number of incoming records
          <labels>
            tag ${tag}
            hostname ${hostname}
          </labels>
        </metric>
      </filter>

      <match **>
        @type relabel
        @label @OUTPUT
      </match>
    </label>

  04_outputs.conf: |-
    <label @NULL>
      <match **>
        @type null
      </match>
    </label>

    <label @OUTPUT>
      <match kube.**>
        @type elasticsearch_data_stream
        @id output_es_kube
        host eck-dev-es-ingest
        port 9200
        ssl_verify false
        scheme https
        path ""
        user elastic
        password "#{ENV['ELASTICSEARCH_PASSWORD']}"
        data_stream_name fluentd-k8s-${$.kubernetes.namespace_name}
        data_stream_template_name fluentd-k8s
        suppress_type_name true
        use_legacy_template false
        request_timeout 30s
        reload_connections false
        reconnect_on_error true
        reload_on_failure true
        <buffer tag, $.kubernetes.namespace_name>
          @type file
          path /var/log/fluent/kube.*.buffer
          chunk_limit_size 64MB
          flush_thread_count 5
          flush_interval 5s
        </buffer>
      </match>
    </label>

Breaking the configuration down into smaller parts, the first thing to look at is the filters. There are two: de_dot and rename_key.

De_dot is used to remove ‘.’ from the kubernetes metadata keys. It takes input like ‘kubernetes.labels.app.kubernetes.io/made.up.label’ and turns it into ‘kubernetes.labels.app.kubernetes.io/made_up_label’. This allows proper ingestion into Elasticsearch, because Elasticsearch parses the dots as keys in a JSON dictionary. Without this, Elasticsearch would nest ‘label’ under ‘up’ under ‘made’, which works until another key such as ‘kubernetes.labels.app.kubernetes.io/realappname’ arrives without dots and therefore does not nest the same way.

<filter kube.**>
  @type             dedot
  de_dot            true
  de_dot_separator  _
  de_dot_nested     false
</filter>
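To illustrate what the filter does to a record's keys, here is a rough Python equivalent of the de_dot behavior (a sketch, not the plugin's actual code; the real plugin also knows about nested records, which this flat version ignores):

```python
def de_dot(record, separator="_"):
    """Replace dots in top-level keys, as the dedot filter does."""
    return {key.replace(".", separator): value for key, value in record.items()}

labels = {"made.up.label": "demo"}
print(de_dot(labels))  # → {'made_up_label': 'demo'}
```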

Rename_key is used to fix kubernetes labels that do not fit the pattern of the rest of the labels. This allows the labels to be ingested rather than rejected because of mismatched field types.

<filter kube.**>
  @type rename_key
  rename_rule1 ^app$ app.kubernetes.io/name
  rename_rule2 ^chart$ helm.sh/chart
  rename_rule3 ^istio$ istio.io/istio
  rename_rule4 ^lenses.io/app.type$ lenses.io/app-type
</filter>
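As a rough illustration of how the rename rules behave (a Python sketch, not the plugin's implementation), each rule is a regex matched against a key, with the first match winning:

```python
import re

# Each rule is (pattern, replacement key), mirroring the rename_rule entries.
RULES = [
    (re.compile(r"^app$"), "app.kubernetes.io/name"),
    (re.compile(r"^chart$"), "helm.sh/chart"),
]

def rename_keys(record):
    """Return a copy of the record with matching keys renamed."""
    out = {}
    for key, value in record.items():
        for pattern, new_key in RULES:
            if pattern.match(key):
                key = new_key
                break
        out[key] = value
    return out

print(rename_keys({"app": "myapp", "other": 1}))
# → {'app.kubernetes.io/name': 'myapp', 'other': 1}
```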

The output to Elasticsearch uses a data stream instead of a normal index. With indices we need to bootstrap the index manually before writing logs to it, and after bootstrapping we also need to write to the alias so that the ILM rollover can happen behind the scenes. With a data stream we always write to the same name and there is no need to bootstrap, because index templates handle creation. This lets us automatically add new indices based on namespace, and a data stream also handles index rollover behind the scenes.

The data_stream_name is prefixed with ‘fluentd-k8s-’ so that we can identify where the index is being created from. The ${$.kubernetes.namespace_name} syntax on the end is fluentd placeholder syntax that lets us use any key that the buffer section is keyed on.

The ‘data_stream_template_name’ is the index template that we have defined already in Elasticsearch.

We set the request_timeout to 30s to reduce the number of timeouts. This should be adjusted based on your environment and what your Elasticsearch cluster can handle.

Inside the buffer section we set the chunk_limit_size to 64MB because Elasticsearch has a default ‘http.max_content_length’ of 100MB. Increasing ‘http.max_content_length’ is not recommended: it means your cluster has to handle larger payloads, which depending on the cluster may degrade performance or may not work at all.

The ‘flush_thread_count’ and ‘flush_interval’ should also be tuned to what your cluster can handle. The ‘flush_thread_count’ determines how many threads are used to flush, in other words how many concurrent HTTP requests are sent to Elasticsearch. The ‘flush_interval’ is how often the flush happens. You can increase these numbers to increase throughput, but the Elasticsearch cluster needs to be able to keep up, otherwise there will be a lot of failing flushes.
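As a rough back-of-the-envelope check (assuming every chunk fills completely, which in practice it usually does not), the settings above bound the flush throughput at roughly chunk size times thread count per interval:

```python
chunk_limit_mb = 64      # chunk_limit_size
flush_threads = 5        # flush_thread_count
flush_interval_s = 5     # flush_interval

# Worst-case payload pushed to Elasticsearch, in MB per second.
max_throughput = chunk_limit_mb * flush_threads / flush_interval_s
print(max_throughput)  # → 64.0
```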

<match kube.**>
  @type elasticsearch_data_stream
  @id output_es_kube
  host eck-dev-es-ingest
  port 9200
  ssl_verify false
  scheme https
  path ""
  user elastic
  password "#{ENV['ELASTICSEARCH_PASSWORD']}"
  data_stream_name fluentd-k8s-${$.kubernetes.namespace_name}
  data_stream_template_name fluentd-k8s
  suppress_type_name true
  use_legacy_template false
  request_timeout 30s
  reload_connections false
  reconnect_on_error true
  reload_on_failure true
  <buffer tag, $.kubernetes.namespace_name>
    @type file
    path /var/log/fluent/kube.*.buffer
    chunk_limit_size 64MB
    flush_thread_count 5
    flush_interval 5s
  </buffer>
</match>

Elasticsearch

Preparing the Elasticsearch cluster doesn’t take much effort. The two things we need to create are the index template and the ILM policy, both of which should be tuned to your cluster. There are other optional steps, like adding a Kibana index pattern if you want to search these indices from Kibana.

For the index template, the index_patterns is set to match the prefix in our fluentd config. The ‘data_stream’ object is empty, but its presence tells Elasticsearch to create a data stream whenever a new index matching our pattern is created. The other important setting is ‘lifecycle’, set to ‘fluentd-k8s’, which is the name of our ILM policy.

Index Template

PUT /_index_template/fluentd-k8s
{
  "index_patterns": [
    "fluentd-k8s-*"
  ],
  "data_stream": {},
  "priority": 100,
  "template": {
    "settings": {
      "lifecycle": {
        "name": "fluentd-k8s"
      },
      "number_of_shards": "1"
    }
  }
}

This ILM Policy is pretty standard and basic. This is something you should customize for your environment and requirements.

ILM Policy

PUT _ilm/policy/fluentd-k8s
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "5d",
        "actions": {
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "set_priority": {
            "priority": 0
          }
        }
      },
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

Conclusion

The most difficult part is making sure your Elasticsearch cluster can handle the load. Even a small kubernetes cluster can produce a large amount of logs.

For now this solution works for reducing the number of fields to be indexed, at the cost of creating many more indices. These indices still need to be managed, but index templates make that work easier.