Variables in this article are written {{myVar}}.


Setup the cluster's Audit Log

via commit 1c91ff1 (chore: change shortcodes format (HTML tag like)) by Gerkin

Note: Although this part is optional, you should not skip it on a dev environment, and you should really, really, REALLY not skip it in production. The audit log can contain useful debug information and security traces that show what is going on in your Kubernetes cluster, and even on your whole server(s).

This tutorial guides you through setting up an audit log policy, collecting the logs with Fluentd, shipping them to Elasticsearch, and displaying them with Kibana.

First, choose an audit log directory on the host, {{audit.sourceLogDir}}. This is the directory where Kubernetes will write its audit logs; it should be under /var/log. Then choose an audit log file name, {{audit.sourceLogFile}}, inside {{audit.sourceLogDir}}. The full audit log path is then {{audit.sourceLogDir}}/{{audit.sourceLogFile}}.
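As a quick illustration of how the two variables combine, here is a minimal sketch. The helper name `audit_log_path` and the example values are hypothetical, not part of the guide's templates.

```shell
# Hypothetical helper: join the audit log dir and file name into the full path.
# It refuses directories outside /var/log, following the recommendation above.
audit_log_path() {
    dir=${1%/}   # drop a trailing slash, if any
    file=$2
    case "$dir" in
        /var/log|/var/log/*) printf '%s/%s\n' "$dir" "$file" ;;
        *) echo "audit log dir should be under /var/log: $dir" >&2; return 1 ;;
    esac
}

# Example values (assumptions): replace them with your own choices.
audit_log_path /var/log/kubernetes audit.log
```

The same joined path is what the Fluentd source later reads.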

Fluentd will parse those audit logs and split them by tag (one tag per namespace) so they are easier to sort. It then writes the split logs to {{audit.destLogDir}}.

In order to pipe audit log messages to Elasticsearch, we need to install Fluentd on the Kubernetes master host.

FluentD job

Install fluentd (on the kubernetes master host)

Install Chrony

Start by installing Chrony for accurate timestamps

dnf install chrony
systemctl enable --now chronyd

You should be good to go.
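To confirm that the clock is actually synchronised, one option is to look at the leap status reported by `chronyc tracking`. This is a sketch: the helper name `clock_is_synced` is hypothetical, and it assumes the usual `Leap status     : Normal` line in the output.

```shell
# Reads `chronyc tracking` output on stdin.
# Succeeds when the leap status line says "Normal".
clock_is_synced() {
    grep -Eq '^Leap status[[:space:]]*:[[:space:]]*Normal'
}

# Usage on the host (needs chronyd running):
#   chronyc tracking | clock_is_synced && echo "clock synchronised"
```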

Configure other settings

Check the file descriptor limits for the root user (ulimit is a shell builtin, so run it in a root shell, for example via sudo -i):

ulimit -n
# » 1024
ulimit -Hn
# » 262144

If the soft limit is low (like 1024), you need to raise it in your system's limits configuration. Open your limits.conf file:

vim /etc/security/limits.conf

Set the following configurations:

root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536

Then reboot and check again in a root shell.

ulimit -n # should be 65536
ulimit -Hn # should be at least 65536
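
The check above can also be scripted. The following is a minimal sketch; `nofile_ok` is a hypothetical helper, and 65536 is simply the value configured in limits.conf above.

```shell
# Succeeds if the given nofile limit meets the minimum (default 65536).
nofile_ok() {
    limit=$1
    min=${2:-65536}
    # "unlimited" is always high enough
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$min" ]
}

# Report on the soft limit of the current shell.
if nofile_ok "$(ulimit -n)"; then
    echo "soft nofile limit is high enough"
else
    echo "soft nofile limit is too low: $(ulimit -n)"
fi
```

Run it in a root shell so that it reports the limit that applies to root.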

If the environment is expected to have a high load, follow this section of the guide.

Install FluentD & plugins

Add the td-agent repository & install it

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh
systemctl enable --now td-agent.service

Check if it works by posting a sample log

curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test
cat /var/log/td-agent/td-agent.log # should end with our test message above
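
Instead of reading the log by eye, you could grep for the sample event. This sketch assumes that the default td-agent configuration prints events as `<time> <tag>: <json>`; the helper name `log_has_test_event` is hypothetical.

```shell
# Succeeds if the log file contains the sample event posted to debug.test.
# Assumption: events are logged as "<time> debug.test: {...}".
log_has_test_event() {
    grep -Fq 'debug.test: {"json":"message"}' "$1"
}

# Usage on the host (log path taken from the step above):
#   log_has_test_event /var/log/td-agent/td-agent.log && echo "event received"
```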

Install required plugins with the following command:

td-agent-gem install fluent-plugin-forest fluent-plugin-rewrite-tag-filter

If you get errors here, see the Troubleshoot section at the end.
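
To verify that both plugins ended up installed, you can filter the output of `td-agent-gem list`. This is a minimal sketch; `missing_plugins` is a hypothetical helper, and it assumes the usual `name (version)` format of the gem list.

```shell
# Reads `td-agent-gem list` output on stdin and prints each required plugin
# that is missing. The exit status is non-zero if anything is missing.
missing_plugins() {
    installed=$(cat)
    status=0
    for plugin in fluent-plugin-forest fluent-plugin-rewrite-tag-filter; do
        if ! printf '%s\n' "$installed" | grep -q "^$plugin "; then
            echo "$plugin"
            status=1
        fi
    done
    return $status
}

# Usage on the host:
#   td-agent-gem list | missing_plugins || echo "some plugins are missing"
```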

Configure Fluentd

Install the td-agent/kube.conf template into /etc/td-agent/, include it in your main configuration, and create the log directories.

# From https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#log-collector-examples
# fluentd conf runs in the same host with kube-apiserver
<source>
    @type tail
    # audit log path of kube-apiserver
    path {{audit.sourceLogDir}}/{{audit.sourceLogFile}}
    pos_file {{audit.sourceLogDir}}/{{audit.sourceLogFile}}.pos
    format json
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%N%z
    tag audit
</source>

<filter audit>
    #https://github.com/fluent/fluent-plugin-rewrite-tag-filter/issues/13
    @type record_transformer
    enable_ruby
    <record>
     namespace ${record["objectRef"].nil? ? "none":(record["objectRef"]["namespace"].nil? ? "none":record["objectRef"]["namespace"])}
    </record>
</filter>

<match audit>
    # route audit according to namespace element in context
    @type rewrite_tag_filter
    <rule>
        key namespace
        pattern /^(.+)/
        tag ${tag}.$1
    </rule>
</match>

<filter audit.**>
   @type record_transformer
   remove_keys namespace
</filter>

<match audit.**>
    @type forest
    subtype file
    remove_prefix audit
    <template>
        time_slice_format %Y%m%d%H
        compress gz
        path {{audit.destLogDir}}/audit-${tag}.*.log
        format json
        include_time_key true
    </template>
</match>
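
To make the routing in this configuration concrete: an event in namespace kube-system is tagged audit.kube-system, the audit prefix is removed, and the remaining tag is substituted into the output path. The sketch below only mirrors that naming logic; `audit_dest_pattern` and the example directory are hypothetical, and Fluentd itself replaces the `*` with the time slice.

```shell
# Mirrors the routing in kube.conf: events without a namespace go to "none".
# Prints the output path pattern for one namespace.
audit_dest_pattern() {
    dest_dir=${1%/}
    namespace=${2:-none}
    printf '%s/audit-%s.*.log\n' "$dest_dir" "$namespace"
}

# Example destination directory (an assumption): use your {{audit.destLogDir}}.
audit_dest_pattern /var/log/kube-audit kube-system
audit_dest_pattern /var/log/kube-audit ""
```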

Install the template, include it, and create the log directories:

mv ./td-agent/kube.conf /etc/td-agent/kube.conf
# Include the Kubernetes configuration in the main configuration
echo "@include './kube.conf'" >> /etc/td-agent/td-agent.conf
# Create the log dir that will be mounted into the API server
mkdir -p {{audit.destLogDir}}
# If required, allow td-agent to read/write in it
chown -R root:td-agent {{audit.destLogDir}}
chmod -R g+w {{audit.destLogDir}}
# Restart the agent
systemctl restart td-agent.service
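
Appending with `>>` adds a duplicate line each time the step is repeated. A possible idempotent variant is sketched below; `ensure_include` is a hypothetical helper, not something td-agent provides.

```shell
# Appends the @include line to the given config file only if it is not there yet.
ensure_include() {
    conf=$1
    line="@include './kube.conf'"
    grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Usage on the host:
#   ensure_include /etc/td-agent/td-agent.conf
```

Running it twice leaves a single include line in the file.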

Setup the audit log

See the example audit log policy and the template audit log file.

# From https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/audit/audit-policy.yaml
# See https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy for more info

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

Move it into the /etc/kubernetes folder (because this is Kubernetes configuration).

mv ./kubernetes/audit-log-policy.yaml /etc/kubernetes/audit-log-policy.yaml
chown root:root /etc/kubernetes/audit-log-policy.yaml
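
Before pointing the API server at the file, a rough sanity check can catch a wrong or truncated copy. This is only a sketch: `looks_like_audit_policy` is a hypothetical helper that checks the two header fields and does not validate the rules.

```shell
# Succeeds if the file declares the audit.k8s.io/v1 API version and kind Policy.
looks_like_audit_policy() {
    grep -Eq '^apiVersion:[[:space:]]*audit\.k8s\.io/v1([[:space:]]|$)' "$1" &&
        grep -Eq '^kind:[[:space:]]*Policy([[:space:]]|$)' "$1"
}

# Usage on the host:
#   looks_like_audit_policy /etc/kubernetes/audit-log-policy.yaml && echo "header looks fine"
```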

Troubleshoot

Unable to download data from https://rubygems.org/ - timed out (https://api.rubygems.org/specs.4.8.gz)

The Rubygems repository seems to have issues with IPv6. Check with the commands below:

curl -v --head https://api.rubygems.org
curl -6 -v --head https://api.rubygems.org
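
The two probes above can be wrapped into a single check that does not hang forever. This is a sketch under assumptions: `ipv6_only_broken` is a hypothetical helper, and the 10 second timeout is an arbitrary choice.

```shell
# Succeeds only when the IPv4 probe works and the IPv6 probe fails,
# which is the situation described in this section.
ipv6_only_broken() {
    url=$1
    curl -4 --silent --head --max-time 10 --output /dev/null "$url" || return 1
    ! curl -6 --silent --head --max-time 10 --output /dev/null "$url"
}

# Usage on the host:
#   ipv6_only_broken https://api.rubygems.org && echo "IPv6 looks broken"
```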

If the first command works and the second hangs until it times out, you are having trouble with IPv6 and need to temporarily disable it.

sysctl -w net.ipv6.conf.default.disable_ipv6=1
sysctl -w net.ipv6.conf.all.disable_ipv6=1

After installing your plugins, re-enable IPv6:

sysctl -w net.ipv6.conf.default.disable_ipv6=0
sysctl -w net.ipv6.conf.all.disable_ipv6=0
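
To avoid forgetting the re-enable step, the disable, install, and re-enable sequence can be grouped in one function. This is a minimal sketch; `with_ipv6_disabled` is a hypothetical helper, it must run as root, and it does not handle interruption by a signal.

```shell
# Runs a command with IPv6 disabled, then re-enables IPv6 whether the
# command succeeded or failed, and returns the command's exit status.
with_ipv6_disabled() {
    sysctl -w net.ipv6.conf.default.disable_ipv6=1 >/dev/null
    sysctl -w net.ipv6.conf.all.disable_ipv6=1 >/dev/null
    rc=0
    "$@" || rc=$?
    sysctl -w net.ipv6.conf.default.disable_ipv6=0 >/dev/null
    sysctl -w net.ipv6.conf.all.disable_ipv6=0 >/dev/null
    return $rc
}

# Usage on the host:
#   with_ipv6_disabled td-agent-gem install fluent-plugin-forest fluent-plugin-rewrite-tag-filter
```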

Written by GerkinDev, fullstack developer on a journey to DevOps.

 