Variables in articles are noted {{myVar}}; replace them with your own values.


Monitoring: See what is going on


Well, things are getting real and are about to become quite complex. So we’ll set up (super unsafe) dashboards to easily see what is going on. After all, we have nothing critical for now, but we might run into trouble soon. And don’t worry, we’ll make it safe right after that.

1. Traefik dashboard: monitoring routes

The Traefik dashboard will help us diagnose our ingress routes and other Traefik-related issues. For this, we need to:

  • update our previously deployed ingress controller to enable the dashboard
  • and create routes to the dashboard.

Use the kubernetes/traefik/04-IngressController.yaml and kubernetes/traefik/06-IngressRoutes.yaml templates.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: traefik
  namespace: traefik
  labels:
    app: traefik
    component: ingress-controller

spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
      component: ingress-controller
  template:
    metadata:
      labels:
        app: traefik
        component: ingress-controller
    spec:
      serviceAccountName: traefik
      containers:
        - name: traefik
          image: traefik:v2.4
          args:
            - --api=true
            - --api.dashboard=true
            - --accesslog
            - --entrypoints.web.Address=:8000
            - --entrypoints.websecure.Address=:4443
            - --providers.kubernetescrd
            - --certificatesresolvers.myresolver.acme.tlschallenge
            - --certificatesresolvers.myresolver.acme.email=FILL@ME.COM
            - --certificatesresolvers.myresolver.acme.storage=acme.json
            # Please note that this is the staging Let's Encrypt server.
            # Once you get things working, you should remove that whole line altogether.
            - --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
          ports:
            - name: web
              containerPort: 8000
            - name: websecure
              containerPort: 4443
            - name: admin
              containerPort: 8080

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: dashboard
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`traefik.{{cluster.baseHostName}}`) && PathPrefix(`/dashboard`)
    kind: Rule
    services:
    - name: dashboard@internal
      kind: TraefikService
    middlewares:
    - name: dashboard-stripprefix
      namespace: traefik
  tls:
    certResolver: myresolver
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: api
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`traefik.{{cluster.baseHostName}}`) && PathPrefix(`/api`)
    kind: Rule
    services:
    - name: api@internal
      kind: TraefikService
  tls:
    certResolver: myresolver
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: dashboard-stripprefix
  namespace: traefik
spec:
  stripPrefix:
    prefixes:
      - /dashboard
      - /dashboard/

kubectl apply -f ./kubernetes/traefik/04-IngressController.yaml
kubectl apply -f ./kubernetes/traefik/06-IngressRoutes.yaml

Now, you should be able to reach the dashboard via https://traefik.{{cluster.baseHostName}}/dashboard/.
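
If the page does not come up, a few optional sanity checks (not part of the original templates) can help; -k is only needed while you are still on the Let’s Encrypt staging CA:

# Wait for the updated ingress controller to roll out
kubectl -n traefik rollout status deployment/traefik
# Look for certificate or routing errors
kubectl -n traefik logs deployment/traefik --tail=50
# Probe the dashboard route (-k: the staging certificate is not trusted)
curl -k -I https://traefik.{{cluster.baseHostName}}/dashboard/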

2. Kibana: harvest data from your cluster

Kibana is a super versatile tool to visualize data stored in Elasticsearch.

Elasticsearch is a database particularly well suited to search engines, with full-text search and scoring capabilities.

Together, they make the perfect combo to ingest all our cluster’s logs and run the searches, visualizations, and tracking you’ll need to understand what is going on in your cluster’s apps.

Elasticsearch can be quite resource-hungry, and your machines may not be sized to smoothly handle the substantial flow of data it is about to ingest. I strongly advise you to read the installation documentation to set things up correctly.

Reading a guide doesn’t dispense you from RTFMing.
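
For instance, Elasticsearch usually refuses to start if the kernel setting vm.max_map_count is too low. A minimal sketch of what you may need to run on each node able to host the pod (value taken from the Elasticsearch documentation, adjust to your setup):

# Raise the mmap count limit required by Elasticsearch
sudo sysctl -w vm.max_map_count=262144
# Persist it across reboots
echo "vm.max_map_count=262144" | sudo tee /etc/sysctl.d/99-elasticsearch.conf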

2.1. Pods logs

We’ll start by collecting our pods’ (container) logs. Deploy the following configuration files:

apiVersion: v1
kind: Namespace
metadata:
  name: kibana

# See https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/#elastic

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
  namespace: kibana
  labels:
    app: kibana
    component: elasticsearch
spec:
  selector:
    matchLabels:
      app: kibana
      component: elasticsearch
  template:
    metadata:
      labels:
        app: kibana
        component: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
        env:
        - name: discovery.type
          value: single-node
        ports:
        - containerPort: 9200
          name: http
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: kibana
  labels:
    app: kibana
    component: elasticsearch
spec:
  selector:
    app: kibana
    component: elasticsearch
  ports:
    - port: 9200
      targetPort: 9200
      protocol: TCP
      name: http

# See https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/#kibana

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kibana
  labels:
    app: kibana
    component: kibana
spec:
  selector:
    matchLabels:
      app: kibana
      component: kibana
  template:
    metadata:
      labels:
        app: kibana
        component: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.10.2
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch.kibana.svc.cluster.local:9200
        - name: XPACK_SECURITY_ENABLED
          value: "true"
        # - name: SERVER_NAME
        #   value: kibana.{{cluster.baseHostName}}
        ports:
        - containerPort: 5601
          name: http
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kibana
  labels:
    app: kibana
    component: kibana
spec:
  selector:
    app: kibana
    component: kibana
  ports:
    - port: 80
      targetPort: 5601
      protocol: TCP
      name: http

# See https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/#fluentd

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    app: kibana
    component: fluentd
spec:
  selector:
    matchLabels:
      app: kibana
      component: fluentd
  template:
    metadata:
      labels:
        app: kibana
        component: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.12-debian-elasticsearch7-1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: elasticsearch.kibana.svc.cluster.local
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
          # See https://github.com/fluent/fluentd-kubernetes-daemonset#disable-systemd-input
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: auditlog
          mountPath: /var/log/kubernetes/kube-apiserver-audit.log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: auditlog
        hostPath:
          path: "{{audit.sourceLogDir}}/{{audit.sourceLogFile}}"
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingress-secure
  namespace: kibana
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`kibana.{{cluster.baseHostName}}`)
    kind: Rule
    services:
    - name: kibana
      kind: Service
      namespace: kibana
      port: 80
  tls:
    certResolver: myresolver

kubectl apply -f ./kubernetes/kibana/01-Namespace.yaml
kubectl apply -f ./kubernetes/kibana/11-Elasticsearch.yaml
kubectl apply -f ./kubernetes/kibana/12-Kibana.yaml
kubectl apply -f ./kubernetes/kibana/13-Fluentd.yaml
kubectl apply -f ./kubernetes/kibana/21-Ingress.yaml

Once applied, you should be able to reach your Kibana dashboard via https://kibana.{{cluster.baseHostName}}/. Be patient: it may take a while for Elasticsearch and Kibana to initialize. Once they are up, let’s configure them!
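
To watch the startup, the following standard commands (names match the manifests above) can help:

# Watch the pods come up
kubectl -n kibana get pods -w
# Inspect the logs if something looks stuck
kubectl -n kibana logs deployment/elasticsearch --tail=50
kubectl -n kibana logs deployment/kibana --tail=50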

Go to the Kibana > Discover > Index patterns page. Kibana should ask you to create indices. Index the logs with the pattern logstash*.

Index pattern 1st screen

Then, set up the time field as @timestamp.

Index pattern 2nd screen

Finally, go back to the Discover page. You should see at least your pods’ logs!

Logs

I strongly recommend you inspect the logs carefully and clean up as many errors as possible. Yes, you should have been doing that all along, but I know that looking everywhere is painful. So do it now, while you don’t have a bunch of things polluting your streams.
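
For example, you can narrow things down in Discover with KQL queries: all logs of a single namespace, only lines containing “error”, or both at once. The field names below assume the default fluentd metadata enrichment used above; adjust them to what you actually see in your documents:

kubernetes.namespace_name : "traefik"
log : *error*
kubernetes.namespace_name : "traefik" and log : *error*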

2.2. Audit logs

For a reason I can’t explain, fluentd’s default settings for audit log parsing are incorrect. Moreover, I find the “all settings in a single file” pattern awful. So we are going to reconfigure fluentd to parse our logs correctly. Use the kubernetes/kibana/31-Fluentd.yaml & kubernetes/kibana/32-FluentdConfigMap.yaml templates. The first one overrides part of fluentd’s configuration to use our custom configs.

# See https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/#fluentd

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    app: kibana
    component: fluentd

spec:
  selector:
    matchLabels:
      app: kibana
      component: fluentd

  template:
    metadata:
      labels:
        app: kibana
        component: fluentd

    spec:
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.12-debian-elasticsearch7-1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: elasticsearch.kibana.svc.cluster.local
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
          # See https://github.com/fluent/fluentd-kubernetes-daemonset#disable-systemd-input
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        volumeMounts:
        - name: fluentd-config-kubernetes-conf
          mountPath: /fluentd/etc/kubernetes.conf
          subPath: kubernetes.conf
        - name: fluentd-config-conf-additional
          mountPath: /fluentd/etc/conf.d/
        - name: varlog
          mountPath: /var/log
        - name: auditlog
          mountPath: /var/log/kubernetes
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: fluentd-config-kubernetes-conf
        configMap:
          name: fluentd-config-kubernetes-conf
      - name: fluentd-config-conf-additional
        configMap:
          name: fluentd-config-conf-additional
      - name: varlog
        hostPath:
          path: /var/log
      - name: auditlog
        hostPath:
          path: /var/log/kubernetes
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

# Those configmaps are taken from https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/docker-image/v1.10/debian-elasticsearch6/conf/kubernetes.conf

apiVersion: v1
data:
  kubernetes.conf: |-
    <label @FLUENT_LOG>
      <match fluent.**>
        @type null
      </match>
    </label>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
      kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
      verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || true}"
      ca_file "#{ENV['KUBERNETES_CA_FILE']}"
      skip_labels "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_LABELS'] || 'false'}"
      skip_container_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA'] || 'false'}"
      skip_master_url "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL'] || 'false'}"
      skip_namespace_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA'] || 'false'}"
    </filter>    

kind: ConfigMap
metadata:
  name: fluentd-config-kubernetes-conf
  namespace: kube-system

---

apiVersion: v1
data:
  container-logs.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
      exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
      read_from_head true
      <parse>
        @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>    
  salt.conf: |-
    <source>
      @type tail
      @id in_tail_minion
      path /var/log/salt/minion
      pos_file /var/log/fluentd-salt.pos
      tag salt
      <parse>
        @type regexp
        expression /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
        time_format %Y-%m-%d %H:%M:%S
      </parse>
    </source>    
  startupscript.conf: |-
    <source>
      @type tail
      @id in_tail_startupscript
      path /var/log/startupscript.log
      pos_file /var/log/fluentd-startupscript.log.pos
      tag startupscript
      <parse>
        @type syslog
      </parse>
    </source>    
  docker.conf: |-
    <source>
      @type tail
      @id in_tail_docker
      path /var/log/docker.log
      pos_file /var/log/fluentd-docker.log.pos
      tag docker
      <parse>
        @type regexp
        expression /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=($<status_code>\d+))?/
      </parse>
    </source>    
  etcd.conf: |-
    <source>
      @type tail
      @id in_tail_etcd
      path /var/log/etcd.log
      pos_file /var/log/fluentd-etcd.log.pos
      tag etcd
      <parse>
        @type none
      </parse>
    </source>    
  kubelet.conf: |-
    <source>
      @type tail
      @id in_tail_kubelet
      multiline_flush_interval 5s
      path /var/log/kubelet.log
      pos_file /var/log/fluentd-kubelet.log.pos
      tag kubelet
      <parse>
        @type kubernetes
      </parse>
    </source>    
  kube-proxy.conf: |-
    <source>
      @type tail
      @id in_tail_kube_proxy
      multiline_flush_interval 5s
      path /var/log/kube-proxy.log
      pos_file /var/log/fluentd-kube-proxy.log.pos
      tag kube-proxy
      <parse>
        @type kubernetes
      </parse>
    </source>    
  kube-apiserver.conf: |-
    <source>
      @type tail
      @id in_tail_kube_apiserver
      multiline_flush_interval 5s
      path /var/log/kube-apiserver.log
      pos_file /var/log/fluentd-kube-apiserver.log.pos
      tag kube-apiserver
      <parse>
        @type kubernetes
      </parse>
    </source>    
  kube-controller-manager.conf: |-
    <source>
      @type tail
      @id in_tail_kube_controller_manager
      multiline_flush_interval 5s
      path /var/log/kube-controller-manager.log
      pos_file /var/log/fluentd-kube-controller-manager.log.pos
      tag kube-controller-manager
      <parse>
        @type kubernetes
      </parse>
    </source>    
  kube-scheduler.conf: |-
    <source>
      @type tail
      @id in_tail_kube_scheduler
      multiline_flush_interval 5s
      path /var/log/kube-scheduler.log
      pos_file /var/log/fluentd-kube-scheduler.log.pos
      tag kube-scheduler
      <parse>
        @type kubernetes
      </parse>
    </source>    
  rescheduler.conf: |-
    <source>
      @type tail
      @id in_tail_rescheduler
      multiline_flush_interval 5s
      path /var/log/rescheduler.log
      pos_file /var/log/fluentd-rescheduler.log.pos
      tag rescheduler
      <parse>
        @type kubernetes
      </parse>
    </source>    
  glbc.conf: |-
    <source>
      @type tail
      @id in_tail_glbc
      multiline_flush_interval 5s
      path /var/log/glbc.log
      pos_file /var/log/fluentd-glbc.log.pos
      tag glbc
      <parse>
        @type kubernetes
      </parse>
    </source>    
  autoscaler.conf: |-
    <source>
      @type tail
      @id in_tail_cluster_autoscaler
      multiline_flush_interval 5s
      path /var/log/cluster-autoscaler.log
      pos_file /var/log/fluentd-cluster-autoscaler.log.pos
      tag cluster-autoscaler
      <parse>
        @type kubernetes
      </parse>
    </source>    
  audit-log.conf: |-
    # Example:
    # {"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"xxxx","stage":"ResponseComplete","requestURI":"/apis/...?timeout=10s","verb":"update","user":{"username":"system:kube-scheduler","groups":["system:authenticated"]},"sourceIPs":["xxx.xxx.xxx.xxx"],"userAgent":"kube-scheduler/v1.19.3 (linux/amd64) kubernetes/1e11e4a/leader-election","objectRef":{"resource":"leases","namespace":"kube-system","name":"kube-scheduler","uid":"xxxx","apiGroup":"coordination.k8s.io","apiVersion":"v1","resourceVersion":"52124"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2020-10-29T16:26:44.967339Z","stageTimestamp":"2020-10-29T16:26:44.968796Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-scheduler\" of ClusterRole \"system:kube-scheduler\" to User \"system:kube-scheduler\""}}
    # {"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Request","auditID":"xxxx","stage":"ResponseComplete","requestURI":"/api/....?resourceVersion=0\u0026timeout=10s","verb":"get","user":{"username":"system:node:kube-slave-1","groups":["system:nodes","system:authenticated"]},"sourceIPs":["xxx.xxx.xxx.xxx"],"userAgent":"kubelet/v1.19.3 (linux/amd64) kubernetes/1e11e4a","objectRef":{"resource":"nodes","name":"kube-slave-1","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2020-10-29T16:26:45.099703Z","stageTimestamp":"2020-10-29T16:26:45.100167Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
    <source>
      @type tail
      @id in_tail_kube_apiserver_audit
      multiline_flush_interval 5s
      path /var/log/kubernetes/kube-apiserver-audit.log
      pos_file /var/log/kube-apiserver-audit.log.pos
      tag audit
      <parse>
        @type json
        time_key requestReceivedTimestamp
        time_type string
        time_format %Y-%m-%dT%T.%L%Z
      </parse>
    </source>    

kind: ConfigMap
metadata:
  name: fluentd-config-conf-additional
  namespace: kube-system

kubectl apply -f ./kubernetes/kibana/31-Fluentd.yaml
kubectl apply -f ./kubernetes/kibana/32-FluentdConfigMap.yaml
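
Note that fluentd only reads these files at startup. If you later change 32-FluentdConfigMap.yaml without touching the DaemonSet itself, force a restart so the pods pick up the new configuration (standard kubectl, not part of the original guide):

kubectl -n kube-system rollout restart daemonset/fluentd
kubectl -n kube-system rollout status daemonset/fluentd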

You should now be able to filter your logs by tag, and look at the audit logs.

Logs filter by tag

So, in this setup, your audit logs live in two places: directly on your server, and in Kibana. This redundancy is important IMO, because whatever happens to your cluster (even a full flush of all your pods), you should still be able to know who did what.
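
If you ever need the raw copy, it can be read directly on the control plane node; jq is only there to pretty-print the JSON lines:

sudo tail -n 20 {{audit.sourceLogDir}}/{{audit.sourceLogFile}} | jq .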

2.3. Make things persistent

Now that everything is set up, you might have noticed that every time the Elasticsearch pod restarts, the database is emptied. This is expected so far, because we don’t actually write any data to persistent storage. For now! But let’s solve that.
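
As a starting point, here is a minimal sketch of a PersistentVolumeClaim, plus the volume & mount to add to the elasticsearch Deployment above. It assumes your cluster has a default StorageClass; the name and size are illustrative, adapt them to your storage setup:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-data
  namespace: kibana
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---

# In the elasticsearch Deployment, mount the claim on the data directory:
#
#   containers:
#   - name: elasticsearch
#     volumeMounts:
#     - name: data
#       mountPath: /usr/share/elasticsearch/data
#   volumes:
#   - name: data
#     persistentVolumeClaim:
#       claimName: elasticsearch-data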

3. Kube dashboard: Web UI to administer the cluster

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    app: kube-dashboard
    component: kube-dashboard

  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 80
      targetPort: 9090
  selector:
    app: kube-dashboard
    component: kube-dashboard


---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: kube-dashboard
    component: kube-dashboard

  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: kube-dashboard
      component: kube-dashboard

  template:
    metadata:
      labels:
        app: kube-dashboard
        component: kube-dashboard

    spec:
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.0.1
          imagePullPolicy: Always
          ports:
            - containerPort: 9090
              # containerPort: 8443
              protocol: TCP
          args:
            - --insecure-port=9090 # ADDED
            - --enable-insecure-login # ADDED
            # - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            - --authentication-mode=token
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              # scheme: HTTPS
              path: /
              # port: 8443
              port: 9090 # ADDED
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            # allowPrivilegeEscalation: false 
            allowPrivilegeEscalation: true
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "beta.kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
    spec:
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.1
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
          - mountPath: /tmp
            name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "beta.kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ingressroute-dashboard
  namespace: kubernetes-dashboard
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`kube-dashboard.{{cluster.baseHostName}}`)
    kind: Rule
    services:
    - name: kubernetes-dashboard
      namespace: kubernetes-dashboard
      kind: Service
      port: 80
  tls:
    certResolver: myresolver

kubectl apply -f ./kubernetes/kube-dashboard/01-Dashboard.yaml
kubectl apply -f ./kubernetes/kube-dashboard/02-IngressRoutes.yaml

Then, for debugging purposes, we’ll set up a test service account that can only view and list items in the dashboard. This service account will be named watchdog.


apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: watchdog
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: watchdog
  namespace: kubernetes-dashboard
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    resourceNames: ["*"]
    verbs: ["get", "list", "watch"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: watchdog
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: watchdog
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: watchdog
subjects:
  - kind: ServiceAccount
    name: watchdog
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: watchdog
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: watchdog
subjects:
  - kind: ServiceAccount
    name: watchdog
    namespace: kubernetes-dashboard

# Create the role, cluster role and service account using them
kubectl apply -f ./kubernetes/kube-dashboard/03-ServiceAccount.yaml
# Get the secret's name
secret_name="$(kubectl get serviceaccount watchdog -n kubernetes-dashboard -o json | jq '.secrets[0].name' -r)"
# Get the secret's token
echo $(kubectl get secret $secret_name -n kubernetes-dashboard -o json | jq '.data.token' -r  | base64 --decode)
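
Note for newer clusters: since Kubernetes 1.24, service accounts no longer get a long-lived token Secret automatically, so the jq lookup above may return nothing. In that case you can request a short-lived token directly:

kubectl -n kubernetes-dashboard create token watchdog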

Now, navigate to https://kube-dashboard.{{cluster.baseHostName}} and log in using the token you got above.

Authentication screen

You should be able to see all resources in your cluster.

Resources listing

Yes, this is super unsafe. That’s why we are going to add authentication right away, and why I told you not to expose this publicly for now.


Hey, we’ve done important things here! Maybe it’s time to commit…

git add .
git commit -m "Monitoring: See what is going on

Following guide @ https://gerkindev.github.io/devblog/walkthroughs/kubernetes/06-monitoring/"