Storage Provisioning in KumoScale

KumoScale High-Level Architecture

KumoScale is a clustered, scale-out NVMe storage system designed for the modern data center. It consists of storage appliances (nodes) and a Provisioner service that interfaces with the data center’s orchestration framework. The following illustration shows the high-level architecture.

[Image: KumoScale high-level architecture overview]

Data center orchestration

The orchestrator requests volumes from KumoScale according to the various applications’ requirements.

KumoScale Provisioner

The Provisioner creates virtual clusters for different tenants and allocates volumes according to the applications' requirements and the appliances' current status and utilization. It then creates targets on the storage nodes for host connections, adds the volumes to those targets as namespaces, and tracks each application’s host connection status.
The following sections offer additional information on the KumoScale Provisioner.

KumoScale storage nodes

The KumoScale Provisioner allocates volumes on the storage nodes (also referred to as ‘appliances’ or ‘backends’). The storage nodes persist the configuration and manage volume virtualization. The nodes periodically send telemetry information to several types of telemetry servers.

Telemetry and logging

KumoScale sends telemetry data and statistics about the physical SSDs and virtual volumes.
It supports several types of time-series databases (TSDBs) and Syslog servers. In future versions, an analytics service will analyze this data to produce more accurate placement decisions, failure predictions, and more.

Hosts (application servers)

The hosts connect to their allocated volumes via NVMe connect. Telemetry and Syslog events are collected from them as well.

The following paragraphs detail the entities and objects involved in the storage provisioning process in KumoScale.

KumoScale Provisioner

The KumoScale Provisioner service is a distributed, stateless, resilient service that accepts requests for volumes, each accompanied by a specification of the volume’s requirements, called the Storage Class. The Provisioner returns the logical identity and network location where the requested volume can be accessed.

The Provisioner weighs a variety of factors to arrive at an optimal placement decision, such as resilience and topology requirements, capacity, and node utilization. It also takes into account the desired Quality of Service (QoS) parameters, which may be provided as part of a Storage Class. The current QoS implementation enforces an upper bound on IOPS and bandwidth, and accepts desired IOPS and bandwidth values, which are fulfilled on a best-effort basis.
The Provisioner also manages the allocation and budget of different tenants: it allows tenants to be configured and maintains their storage and performance budgets by managing Virtual Clusters (multi-tenancy).
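
To make the request and response shape concrete, the following is a minimal sketch of a placement decision, not the actual Provisioner logic. The names StorageClass, Node, Placement, and place_volume, along with all field names, are hypothetical and chosen for illustration. The sketch considers only capacity and node utilization; resilience, topology, and tenant budgets are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StorageClass:
    """Hypothetical volume specification; field names are illustrative only."""
    capacity_gib: int
    max_iops: Optional[int] = None                 # upper bound, enforced
    max_bandwidth_mibps: Optional[int] = None      # upper bound, enforced
    desired_iops: Optional[int] = None             # best effort
    desired_bandwidth_mibps: Optional[int] = None  # best effort


@dataclass
class Node:
    """Hypothetical view of a storage node as seen by the placement logic."""
    name: str
    address: str
    free_gib: int
    utilization: float  # 0.0 = idle, 1.0 = saturated


@dataclass(frozen=True)
class Placement:
    """What the caller gets back: a logical identity and a network location."""
    volume_id: str
    node_address: str


def place_volume(volume_id: str, sc: StorageClass, nodes: List[Node]) -> Placement:
    # Keep only nodes that can hold the requested capacity.
    candidates = [n for n in nodes if n.free_gib >= sc.capacity_gib]
    if not candidates:
        raise RuntimeError("no node has enough free capacity")
    # Prefer the least utilized node; break ties by the most free space.
    best = min(candidates, key=lambda n: (n.utilization, -n.free_gib))
    best.free_gib -= sc.capacity_gib
    return Placement(volume_id=volume_id, node_address=best.address)
```

A caller would build a StorageClass, pass it with the current node list, and use the returned node_address to connect. A real placement decision would also have to account for the other factors listed above.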

SSD Groups

The SSD Group feature enables grouping a number of physical SSDs within a KumoScale appliance into a defined group. Groups provide a means to physically isolate different customers or to segregate SSDs of different types. This allows a single KumoScale appliance to serve multiple tenants in cases where physical separation between users (tenants) is required. A group can hold any number of SSDs, from one to all of the SSDs in the appliance. A physical SSD within the appliance cannot be assigned to more than one group.

SSD groups can be expanded by adding new SSDs into the group, which serves as a pool of storage. Once new SSDs are inserted into the KumoScale appliance, they are associated with a group. The group then expands its capacity and the new capacity becomes available for mapping new volumes. See Adding and Removing an SSD to and from a Group for further explanations.
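
The two group rules described above (an SSD belongs to at most one group, and adding an SSD grows the group's capacity) can be sketched as a small data structure. This is an illustrative model only; the class SsdGroups and its methods are hypothetical and do not correspond to a KumoScale API.

```python
class SsdGroups:
    """Illustrative model of SSD groups within a single appliance."""

    def __init__(self):
        self._group_of = {}  # SSD serial number -> group name
        self._capacity = {}  # group name -> total capacity in GiB

    def add_ssd(self, group: str, serial: str, capacity_gib: int) -> None:
        # A physical SSD cannot be assigned to more than one group.
        owner = self._group_of.get(serial)
        if owner is not None:
            raise ValueError(f"SSD {serial} already belongs to group {owner}")
        self._group_of[serial] = group
        # The group's capacity expands by the capacity of the new SSD.
        self._capacity[group] = self._capacity.get(group, 0) + capacity_gib

    def capacity(self, group: str) -> int:
        return self._capacity.get(group, 0)
```

For example, adding two SSDs of 1000 GiB and 2000 GiB to the same group yields a group capacity of 3000 GiB, while trying to add one of them to a second group raises ValueError.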

Virtual Volumes

KumoScale maps virtual volumes to physical drives. In order to optimize the memory utilization within KumoScale and to enable the easiest and most intuitive integration with orchestration frameworks, KumoScale automatically places the volumes within the appliance, taking into account the Storage Class defined by the user.

Targets and Access Control List (ACL)

Once volumes are created, they are exposed over the Ethernet fabric. Hosts (e.g. application servers within the data center) establish an NVMe-oF connection to the volumes via a logical entity called a ‘target’.

Connectivity between hosts and targets is controlled by the Access Control List (ACL). When a volume is created, it is added as a namespace to a target with the corresponding ACL, thus enabling the host to connect to it. The ACL is set per host and is associated with the initiator–target pair. The permitted access types are Read Only and Read/Write.

An additional access control feature is a discovery ON/OFF setting. When discovery is set to OFF, the NVMe resource cannot be discovered over the fabric using the NVMe Discovery command. The only way to connect to it is to know in advance the globally unique NVMe Qualified Name (NQN) of that resource. This prevents hosts from discovering storage resources, even when they are connected to the data center’s trusted network.
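
The interaction between the ACL and the discovery setting can be illustrated with a small model. This is a sketch under stated assumptions: the Target class, the Access enum, and the discover function are hypothetical names, the NQN strings are made-up examples, and the model simplifies discovery to a plain visibility flag.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Access(Enum):
    READ_ONLY = "ro"
    READ_WRITE = "rw"


@dataclass
class Target:
    """Illustrative target with a per-host ACL and a discovery flag."""
    nqn: str
    discoverable: bool = True
    acl: Dict[str, Access] = field(default_factory=dict)  # host NQN -> access

    def may_connect(self, host_nqn: str) -> bool:
        # Only hosts listed in the ACL may connect at all.
        return host_nqn in self.acl

    def may_write(self, host_nqn: str) -> bool:
        return self.acl.get(host_nqn) is Access.READ_WRITE


def discover(targets: List[Target]) -> List[str]:
    # Targets with discovery OFF are never listed, even to permitted hosts.
    return [t.nqn for t in targets if t.discoverable]


host = "nqn.2014-08.com.example:host-a"  # example NQN, not a real host
hidden = Target("nqn.2014-08.com.example:target-1", discoverable=False,
                acl={host: Access.READ_ONLY})
```

In this model, the hidden target never appears in a discovery result, yet the host in its ACL can still connect if it already knows the target's NQN, and only with read-only access.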

The following figure demonstrates the relationships between KumoScale’s storage provisioning entities:

Figure 2 - Physical Volume Allocation vs. Virtual Volumes

The left side of the figure depicts the physical allocation of volumes for the various targets on a group of two SSDs, whereas the right side illustrates the volumes from the application hosts’ point of view. The arrows represent the host–target ACLs.

Target creation and ACL settings are handled internally by the KumoScale Provisioner’s Publish method, which derives them from the connecting hosts’ parameters, so they do not require user configuration.
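
As a rough illustration of what such a publish step might involve, the sketch below finds or creates a target for the connecting host, admits the host in the target's ACL, and attaches the volume as a namespace. It is not the Provisioner's actual Publish implementation: the StorageNode and Target classes, the one-target-per-host simplification, and the generated NQN format are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Target:
    nqn: str
    allowed_hosts: Set[str] = field(default_factory=set)      # the ACL
    namespaces: Dict[int, str] = field(default_factory=dict)  # NSID -> volume ID


class StorageNode:
    """Illustrative node that hides target and ACL setup behind publish()."""

    def __init__(self, name: str):
        self.name = name
        self._targets_by_host: Dict[str, Target] = {}

    def publish(self, volume_id: str, host_nqn: str) -> Tuple[str, int]:
        # Assumption: one target per connecting host, created on first use.
        target = self._targets_by_host.get(host_nqn)
        if target is None:
            index = len(self._targets_by_host) + 1
            nqn = f"nqn.2014-08.com.example:{self.name}:target-{index}"
            target = Target(nqn=nqn, allowed_hosts={host_nqn})
            self._targets_by_host[host_nqn] = target
        # Publishing the same volume twice returns the existing namespace.
        for nsid, existing in target.namespaces.items():
            if existing == volume_id:
                return target.nqn, nsid
        # NVMe namespace IDs start at 1.
        nsid = max(target.namespaces, default=0) + 1
        target.namespaces[nsid] = volume_id
        return target.nqn, nsid
```

The caller supplies only the volume and the host identity and receives the target NQN and namespace ID to connect to; the target and ACL bookkeeping stays inside publish().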