Logging, Monitoring & Alerting

Telemetry Data Collection and Display

Telemetry streams are collected from several sources, including:

  • Kubernetes telemetry data (pod telemetry),
  • KumoScale application telemetry, and
  • Other applications running in the KumoScale software cluster.

This data is stored in Prometheus, an open-source system monitoring and alerting toolkit. The volume capacity for storing the data is a configurable parameter that may be changed by the user. Open-source Grafana dashboards are used to visualize telemetry data.

The figure below illustrates the data flow and components for telemetry collection and display.

Figure 3. Overview of Telemetry Collection and Display

Log Data Collection and Display

Log data is collected from multiple sources, such as system logs and KumoScale applications (Appliance mode only), using Fluentd. Fluentd is an open-source data collector that provides a cloud-native logging solution to unify data collection and consumption. It collects the following logs:

  • Kubernetes container logs: /var/log/containers/*.log
  • Syslog logs
  • /var/log/messages
  • ks-engine logs: /data/ks/logs/*.*

The log streams are stored in Grafana Loki, a multi-tenant log aggregation system created at Grafana Labs. The volume capacity for storing the data is a configurable parameter that may be changed by the user. Open-source Grafana dashboards are used to visualize log data. This is documented further in the KumoScale Dashboard Guide.

The figure below illustrates the data flow and components in log collection and display.

Figure 4. Overview of Log Collection and Display

Grafana Dashboards

An example Grafana dashboard is included with KumoScale for Appliance mode to demonstrate the kinds of visualization that can be created from the Prometheus data feed.

To configure the Prometheus data feed for Grafana, see Configuring and Installing the Prometheus Stack and Grafana.

Details on how to customize the dashboard layout are documented in the KumoScale Dashboard Guide.

Configuring Custom Resources for Data Collection (Appliance Mode only)

This section describes the parameters to set in the custom resource files for the components used in data collection and display. It applies to Appliance mode only. Note that many of the services used for data collection cannot be configured in Highly Available (HA) mode, so their data is not replicated. Prometheus is one of the few services that does support replication, through the replicas parameter. We recommend that this parameter be set as follows:

  • When using one master node, set replicas to 1.
  • In other deployments, set replicas to 3.
    The applications will scale as you add nodes.
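
The recommendation above can be expressed as a small helper, for example when generating custom resource files from a script. This is only a sketch: the function name recommended_replicas and its argument are hypothetical and are not part of KumoScale.

```python
def recommended_replicas(master_nodes: int) -> int:
    """Return the replicas value recommended above for a number of master nodes.

    Hypothetical helper for illustration; not part of any KumoScale tooling.
    """
    if master_nodes < 1:
        raise ValueError("a cluster needs at least one master node")
    # One master node -> 1 replica; any other deployment -> 3 replicas.
    return 1 if master_nodes == 1 else 3


for nodes in (1, 3, 5):
    print(f"{nodes} master node(s) -> replicas: {recommended_replicas(nodes)}")
```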

Configuring and Installing the Prometheus Stack and Grafana

  • Configure the Prometheus Stack by specifying the appropriate parameter values in the Prometheus Stack custom resource file, kumoscale.kioxia.com_v1_prometheusstack_cr.yaml. The configuration parameters are described below.

| Prometheus Stack Parameter Name | Description | Optional/Required |
|---|---|---|
| retention | The amount of time to retain metrics. Possible values are a whole number followed by one of the units ms, s, m, h, d, w, or y. Default value is 1y. | Optional |
| replicas | The number of replicas for data collection. For single-node clusters, replicas should be 1. In other cases, replicas should be 3. Default value is 3. | Optional |
| storageClassName | The storage class of the volume. Default value is kumoscale-local-storage. Keep in mind that if you specify a name, you must use that name throughout other operations. | Optional |
| storage | The size of storage for the Prometheus service. Default value is 40Gi. | Optional |
| alertManager: enabled | Whether the alertManager is enabled. Possible values are true (enabled) or false (not enabled). Default value is true. | Optional |
| alertManager: retention | The amount of time to retain data. Possible values are a whole number followed by one of the units ms, s, m, or h. Default value is 8760h. | Optional |
| alertManager: storageClassName | The storage class of the volume. Default value is kumoscale-local-storage. | Optional |
| alertManager: storage | The size of storage for the alertManager service. Default value is 40Gi. | Optional |
| prometheus-node-exporter: enabled | Whether the Prometheus node exporter is enabled. Possible values are true (enabled) or false (not enabled). Default value is true. | Optional |
| kube-state-metrics | Whether the Kubernetes kube-state-metrics service is enabled. Possible values are true (enabled) or false (not enabled). Default value is true. | Optional |
| grafana: externalIPs | An IP for the Grafana web interface. A VIP is recommended. | Required |
| grafana: persistence.enabled | Enable Grafana persistence for persistent password and data sources. Default value is true. | Optional |
| grafana: storageClassName | The storage class of the volume. Default value is kumoscale-local-storage. | Optional |
| grafana: storage | The volume size of storage for the Grafana service. Default value is 1Gi. | Optional |

  • Install the Prometheus stack with the following command:
kubectl create -f prometheusstack.kumoscale.kioxia.com_v1_prometheusstack_cr.yaml

Note: The initial Grafana credentials are admin/ksAdmin. Log in to the UI only after completing all of the following steps.
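
The retention and alertManager: retention parameters accept values of the form [0-9]+(ms|s|m|h|d|w|y) and [0-9]+(ms|s|m|h), respectively. Before creating the custom resource, it may help to check a value against these patterns. The following is a minimal sketch in Python; the function name is_valid_retention is hypothetical, and the check covers only the format, not whether the value is sensible for a given deployment.

```python
import re

# Patterns copied from the parameter descriptions in this section.
PROMETHEUS_RETENTION = re.compile(r"[0-9]+(ms|s|m|h|d|w|y)")
ALERTMANAGER_RETENTION = re.compile(r"[0-9]+(ms|s|m|h)")


def is_valid_retention(value: str, pattern: re.Pattern = PROMETHEUS_RETENTION) -> bool:
    """Return True if the whole string matches the given retention pattern."""
    return pattern.fullmatch(value) is not None


# The documented defaults pass their respective checks.
print(is_valid_retention("1y"))                             # True
print(is_valid_retention("8760h", ALERTMANAGER_RETENTION))  # True
# "1y" is not accepted for alertManager: retention, which has no y unit.
print(is_valid_retention("1y", ALERTMANAGER_RETENTION))     # False
```

Using fullmatch rather than match rejects values with trailing characters, such as "30days".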

Configuring and Installing Loki

  • Configure Loki by specifying the appropriate parameter values in the Loki custom resource file, kumoscale.kioxia.com_v1_loki.yaml. The configuration parameters are described below.

| Loki Parameter Name | Description | Optional/Required |
|---|---|---|
| size | The size of the volume that saves the logs. Default value is 100Gi. | Optional |
| storageClassName | The storage class of the volume. Default value is kumoscale-local-storage. This storage class has protocol: Local and provisioningType: "thin". | Optional |

  • Install Loki with the following command:
kubectl create -f loki.kumoscale.kioxia.com_v1_loki_cr.yaml
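
For capacity planning, the default volume sizes listed in this section (40Gi for Prometheus, 40Gi for the alertManager, 1Gi for Grafana, and 100Gi for Loki) can be added up. The sketch below is illustrative only: the helper quantity_to_bytes is hypothetical, it handles only the binary suffixes shown, and it ignores replicas, because this section does not state whether each replica gets its own volume.

```python
# Binary suffixes used in Kubernetes-style storage quantities (subset only).
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '40Gi' to a number of bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


# Default sizes taken from the parameter descriptions in this section.
DEFAULT_VOLUMES = {
    "prometheus": "40Gi",
    "alertManager": "40Gi",
    "grafana": "1Gi",
    "loki": "100Gi",
}

total = sum(quantity_to_bytes(q) for q in DEFAULT_VOLUMES.values())
print(f"Total default volume capacity: {total // 2**30}Gi")
```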

Configuring and Installing Fluentd

  • Configure Fluentd by specifying the appropriate parameter values in the Fluentd custom resource file, kumoscale.kioxia.com_v1_fluentd.yaml. The configuration parameters are described below.

Note: Fluentd does not support TCP.

| Fluentd Parameter Name | Description | Optional/Required |
|---|---|---|
| clusterIP | The cluster IP. To assign a cluster IP for a service such as Fluentd, you need to know the address range from which to select the IP. The command for getting the range is shown below this table. | Required |
| name | Port name | Optional |
| Protocol | Syslog protocol | Optional |
| containerPort | Container port | Optional |

To get the address range for clusterIP, enter:
kubectl cluster-info dump | grep -m 1 service-cluster-ip-range
An example of the output returned is --service-cluster-ip-range=192.0.0.0/12

  • Install Fluentd with the following command:
kubectl create -f fluentd.kumoscale.kioxia.com_v1_fluentd_cr.yaml
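
When choosing a value for clusterIP, it can be useful to confirm that the candidate address falls inside the range reported by kubectl cluster-info dump. The following sketch uses Python's standard ipaddress module; the function names are hypothetical, the input line is the example output quoted in this section, and the check does not detect whether the address is already in use by another service.

```python
import ipaddress
import re


def parse_service_range(dump_line: str):
    """Extract the service cluster IP range from a line of the dump output."""
    match = re.search(r"service-cluster-ip-range=([0-9./]+)", dump_line)
    if match is None:
        raise ValueError("no service-cluster-ip-range found in input")
    return ipaddress.ip_network(match.group(1))


def ip_in_service_range(candidate: str, dump_line: str) -> bool:
    """Return True if the candidate IP lies inside the reported range."""
    return ipaddress.ip_address(candidate) in parse_service_range(dump_line)


# Example output line quoted in this section.
line = "--service-cluster-ip-range=192.0.0.0/12"
print(ip_in_service_range("192.0.0.50", line))  # True
print(ip_in_service_range("10.96.0.10", line))  # False
```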

 

 
