Set up High Availability (HA) Kubernetes on the Service Cluster

With Omnia, you can deploy a service Kubernetes cluster on the designated service nodes to distribute the workload and manage resources for telemetry data collection. This setup reduces the processing load on the OIM node and improves overall scalability. Each service_kube_node is responsible for collecting telemetry data from its assigned subset of compute nodes. This federated approach to telemetry data collection improves efficiency for large-scale clusters.

Prerequisites

  • To deploy Kubernetes on the service cluster, ensure that service_k8s is added under softwares in /opt/omnia/input/project_default/software_config.json. Refer to the sample configuration below:

    {
        "cluster_os_type": "rhel",
        "cluster_os_version": "10.0",
        "repo_config": "partial",
        "softwares": [
            {"name": "service_k8s", "version": "1.34.1", "arch": ["x86_64"]}
        ],
        "service_k8s": [
            {"name": "service_kube_control_plane_first"},
            {"name": "service_kube_control_plane"},
            {"name": "service_kube_node"}
        ]
    }
    
  • Omnia supports only Kubernetes version 1.34.1.

  • If you want to install the CSI PowerScale driver, ensure that you provide the required values. See Deploy CSI drivers for Dell PowerScale storage solutions for more information.

  • Ensure that there are at least three service_kube_control_plane entries and one service_kube_node entry in the pxe_mapping_file.csv for the Kubernetes controller HA scenario.

Note

The above requirement is the minimum needed to deploy the service Kubernetes cluster. High availability applies only to the control plane. For workload and pod failover, it is recommended to have at least two service_kube_node nodes, so that pods can be rescheduled automatically if one worker node fails.
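
The minimum node counts can be checked before you run discovery. The following is a minimal sketch, not part of Omnia: it assumes that the mapping file has a header row and a column naming each node's functional group. The column name FUNCTIONAL_GROUP_NAME, the HOSTNAME column, and the sample rows are assumptions for illustration, so adjust them to match your actual pxe_mapping_file.csv.

```python
import csv
import io
from collections import Counter

# Minimums taken from the prerequisite above.
REQUIRED_MINIMUMS = {
    "service_kube_control_plane": 3,
    "service_kube_node": 1,
}


def count_functional_groups(csv_text, column="FUNCTIONAL_GROUP_NAME"):
    """Count rows per functional group. The column name is an assumption."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip() for row in reader)


def missing_roles(counts, minimums=REQUIRED_MINIMUMS):
    """Return {role: shortfall} for every role below its minimum."""
    return {
        role: needed - counts.get(role, 0)
        for role, needed in minimums.items()
        if counts.get(role, 0) < needed
    }


# Hypothetical sample content; a real mapping file carries more columns.
SAMPLE = """FUNCTIONAL_GROUP_NAME,HOSTNAME
service_kube_control_plane,cp1
service_kube_control_plane,cp2
service_kube_node,worker1
"""

if __name__ == "__main__":
    # The sample has only two control plane entries, so one is reported missing.
    print(missing_roles(count_functional_groups(SAMPLE)))
```

An empty result means that the stated minimums are met. To check for worker failover as recommended in the note, raise the service_kube_node minimum to 2.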

  • Ensure that the NFS server is reachable from all the diskless nodes.

  • The nodes must be equipped with two active Network Interface Cards (NICs):

    • One dedicated to the admin network. It is used for internal cluster communication, Kubernetes deployment activities, and access to the Pulp repositories hosted on the OIM. This admin interface must be assigned an IP address from the admin network range and must be reachable from the OIM.

    • One dedicated to Internet access. If you want to install a CSI driver, ensure that the storage network is accessible through the Internet-facing NIC. This NIC must be configured via DHCP.

  • To use NFS for the service Kubernetes cluster, ensure that the following prerequisites are met:

    • The NFS share has 755 permissions, and the rw,sync,no_root_squash,no_subtree_check options are enabled on the mounted NFS share.

    • Edit the /etc/exports file on the NFS server to include the rw,sync,no_root_squash,no_subtree_check option for the server_share_path.

      /<your_server_share_path>  *(rw,sync,no_root_squash,no_subtree_check)
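
The export options can be verified with a short script. This is a minimal sketch that handles a single, simple /etc/exports entry (no quoted paths or comment lines); the share path /srv/share in the usage example is hypothetical.

```python
REQUIRED_OPTIONS = {"rw", "sync", "no_root_squash", "no_subtree_check"}


def parse_exports_line(line):
    """Split one exports line into (path, {client: set_of_options})."""
    path, *clients = line.split()
    parsed = {}
    for client in clients:
        # A client entry looks like "host(opt1,opt2)"; options are optional.
        host, _, rest = client.partition("(")
        parsed[host] = set(rest.rstrip(")").split(",")) if rest else set()
    return path, parsed


def missing_options(line, required=REQUIRED_OPTIONS):
    """Return, per client, the required options that are not set."""
    _, clients = parse_exports_line(line)
    return {
        host: required - options
        for host, options in clients.items()
        if required - options
    }


if __name__ == "__main__":
    # Hypothetical entry; replace it with the line from your NFS server.
    entry = "/srv/share *(rw,sync,no_root_squash,no_subtree_check)"
    print(missing_options(entry) or "all required options present")
```

An empty dictionary means that every client entry on the line carries all four options.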
      

Steps

  1. Run the local_repo.yml playbook to download the artifacts required to set up Kubernetes on the service cluster nodes.

  2. Fill in omnia_config.yml, high_availability_config.yml (for service cluster HA), and storage_config.yml. The nfs_name in storage_config.yml must match the nfs_storage_name of the service_k8s_cluster entries in omnia_config.yml for which deployment is set to true. See Input parameters for the cluster. The Kubernetes cluster uses the NFS share to mount necessary resources. See the following sample:

    nfs_client_params:
      - {
          server_ip: "", # Provide the IP of the NFS server
          server_share_path: "", # Provide the server share path of the NFS server
          client_share_path: /opt/omnia,
          client_mount_options: "nosuid,rw,sync,hard,intr",
          nfs_name: nfs_k8s
        }
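
The name-matching rule between the two files can be checked once both are loaded. The sketch below works on already-parsed data (plain dictionaries) so that it needs no YAML library. The top-level service_k8s_cluster list and the sample values are assumptions about the file layout, inferred from the description above; verify them against your own omnia_config.yml.

```python
def unmatched_nfs_names(omnia_config, storage_config):
    """Return nfs_storage_name values of deployed clusters that have no
    matching nfs_name in storage_config."""
    available = {
        entry.get("nfs_name")
        for entry in storage_config.get("nfs_client_params", [])
    }
    return [
        cluster.get("nfs_storage_name")
        for cluster in omnia_config.get("service_k8s_cluster", [])
        if cluster.get("deployment")
        and cluster.get("nfs_storage_name") not in available
    ]


# Hypothetical parsed contents of the two files.
STORAGE_CONFIG = {
    "nfs_client_params": [
        {"server_ip": "", "server_share_path": "", "nfs_name": "nfs_k8s"},
    ]
}
OMNIA_CONFIG = {
    "service_k8s_cluster": [
        {"cluster_name": "service_cluster", "deployment": True,
         "nfs_storage_name": "nfs_k8s"},
    ]
}

if __name__ == "__main__":
    print(unmatched_nfs_names(OMNIA_CONFIG, STORAGE_CONFIG))
```

An empty list means that every cluster with deployment set to true refers to an NFS entry that exists in storage_config.yml.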
    

Note

For CSI support, ensure that server_share_path is the same as the isiPath value in the values.yaml file, and that server_ip is the PowerScale NFS server IP.

Note

Ensure that the server_share_path and client_share_path are empty before you deploy Kubernetes. To delete existing content, go to server_share_path on the NFS server and remove the content available in the path.

Note

Ensure that the pod_external_ip_range defined in the omnia_config.yml file is reachable from the OpenManage Enterprise appliance and the SFM network.

omnia_config.yml

Variable (Mandatory/Optional): Details

cluster_name (Mandatory)

  • Type: String

  • Name of the cluster on which you want to deploy Kubernetes.

  • This input is case-sensitive. Do not add any special characters except _ (underscore) in the cluster name.

deployment (Mandatory)

  • Type: Boolean

  • Indicates whether Kubernetes will be deployed.

  • Accepted values: true or false

k8s_cni (Mandatory)

  • Type: String

  • Kubernetes SDN network.

  • Accepted values: calico

  • Default value: calico

pod_external_ip_range (Mandatory)

  • Type: String

  • These addresses will be used by the load balancer for assigning external IPs to Kubernetes services.

  • Ensure that the IP range provided is not assigned to any node in the cluster.

  • Ensure that the pod_external_ip_range defined in the omnia_config.yml file is reachable from the OpenManage Enterprise appliance and the SFM network.

  • Sample values: 172.16.107.170-172.16.107.200

k8s_service_addresses (Optional)

  • Type: String

  • Kubernetes internal network for services.

  • This network must be unused in your network infrastructure.

  • Default value: "10.233.0.0/18"

k8s_pod_network_cidr (Optional)

  • Type: String

  • Kubernetes pod network CIDR for the internal network. When used, IP addresses from this range are assigned to individual pods.

  • This network must be unused in your network infrastructure.

  • Default value: "10.233.64.0/18"

csi_powerscale_driver_secret_file_path (Optional)

  • Type: File path

  • If you want to deploy the CSI driver for PowerScale on your service cluster, add the file path of the secrets.yaml file to this variable.

csi_powerscale_driver_values_file_path (Optional)

  • Type: File path

  • If you want to deploy the CSI driver for PowerScale on your service cluster, add the file path of the values.yaml file to this variable.

nfs_storage_name (Mandatory)

  • Type: String

  • Use the same name as the corresponding nfs_name defined in storage_config.yml.

k8s_crio_storage_size (Mandatory)

  • Type: String

  • Specifies the disk size allocated for CRI-O container storage.
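
The network values in omnia_config.yml can be sanity-checked with Python's standard ipaddress module. This is a minimal sketch under stated assumptions: the node IP list is hypothetical input that you supply, and the overlap test between the service and pod networks is an extra check based on general Kubernetes practice, not a rule stated in the details above.

```python
import ipaddress


def parse_ip_range(text):
    """Parse 'start-end' (the sample format above) into two addresses."""
    start_text, end_text = text.split("-")
    start = ipaddress.ip_address(start_text.strip())
    end = ipaddress.ip_address(end_text.strip())
    if start > end:
        raise ValueError(f"range start {start} is after range end {end}")
    return start, end


def check_network_settings(pod_external_ip_range, service_cidr, pod_cidr, node_ips):
    """Return a list of problems found; an empty list means none."""
    problems = []
    start, end = parse_ip_range(pod_external_ip_range)
    service_net = ipaddress.ip_network(service_cidr)
    pod_net = ipaddress.ip_network(pod_cidr)
    if service_net.overlaps(pod_net):
        problems.append(f"{service_net} overlaps {pod_net}")
    for ip_text in node_ips:
        # The external IP range must not contain an address assigned to a node.
        if start <= ipaddress.ip_address(ip_text) <= end:
            problems.append(f"node IP {ip_text} is inside pod_external_ip_range")
    return problems


if __name__ == "__main__":
    # Documented sample and default values; the node IPs are hypothetical.
    print(check_network_settings(
        "172.16.107.170-172.16.107.200",
        "10.233.0.0/18",
        "10.233.64.0/18",
        ["172.16.107.10", "172.16.107.11"],
    ))
```

The script does not test reachability from the OpenManage Enterprise appliance or the SFM network; that still has to be confirmed on the network itself.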

high_availability_config.yml

Parameter: Details

cluster_name

  • Type: String

  • Captures the name of the service cluster on which HA will be set up.

  • Default value: service_cluster

enable_k8s_ha

  • Type: Boolean

  • Possible values: true

  • Default value: true

  • Indicates whether to enable HA for the Kubernetes (K8s) service node or not. Set to true to enable.

virtual_ip_address

  • Type: String

  • This is a mandatory and user-configurable parameter.

  • Captures the virtual IP address for the K8s service node HA setup. Ensure that the virtual_ip_address does not belong to the dynamic_range or static_range mentioned in the network_spec.yml.

  • Default value: 172.16.107.1
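
The rule that virtual_ip_address must stay outside dynamic_range and static_range can be checked as follows. This is a minimal sketch: the range values are hypothetical, and the start-end notation is an assumption about how network_spec.yml expresses ranges, so adapt the parsing if your file differs.

```python
import ipaddress


def ip_in_range(ip_text, range_text):
    """Return True if ip_text lies inside an inclusive 'start-end' range."""
    start_text, end_text = range_text.split("-")
    start = ipaddress.ip_address(start_text.strip())
    end = ipaddress.ip_address(end_text.strip())
    return start <= ipaddress.ip_address(ip_text) <= end


def virtual_ip_conflicts(virtual_ip, ranges):
    """Return the names of the ranges that contain the virtual IP."""
    return [
        name for name, range_text in ranges.items()
        if ip_in_range(virtual_ip, range_text)
    ]


if __name__ == "__main__":
    # Hypothetical ranges; copy the real ones from network_spec.yml.
    ranges = {
        "static_range": "172.16.107.201-172.16.107.250",
        "dynamic_range": "172.16.107.251-172.16.107.254",
    }
    print(virtual_ip_conflicts("172.16.107.1", ranges))
```

An empty list means that the virtual IP does not fall inside either range.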

  3. Run the build_image_x86_64.yml playbook to build diskless images for cluster nodes. See Build cluster node images.

  4. Run the discovery.yml playbook to discover the potential cluster nodes and to configure the boot script and cloud-init based on the functional groups. See Discover cluster nodes.

    After successfully running the discovery.yml playbook, you can either manually PXE boot the nodes or use the set_pxe_boot.yml playbook. PXE booting allows the nodes to load diskless images from the Omnia Infrastructure Manager (OIM). For detailed steps on using set_pxe_boot.yml, see Configure PXE Boot.

Additional Installations

After deploying Kubernetes, the following additional packages are installed on top of the Kubernetes stack on the service cluster:

  1. nfs-client-provisioner

    • NFS subdir external provisioner is an automatic provisioner that uses your existing, already configured external NFS server to support dynamic provisioning of Kubernetes Persistent Volumes via Persistent Volume Claims (PVC).

    • The nfs_name mentioned in storage_config.yml should match the nfs_storage_name of the entries for the service_k8s_cluster.

    • The path to PVC is mentioned under {{ nfs_server_share_path }}.

    Click here for more information.

  2. DOCA-OFED installation

    After running discovery.yml and PXE-booting the nodes, DOCA-OFED is installed on nodes that have Mellanox InfiniBand cards. A static IP is assigned to the InfiniBand interface only if the interface is up. If the interface is down, the user must bring it up to enable IP assignment.
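
Whether an InfiniBand interface is up can be read from the Linux sysfs operstate attribute. The following is a minimal sketch to run on a node; the interface name ib0 is an assumption, so substitute the name that your node reports.

```python
from pathlib import Path


def interface_operstate(name, sysfs_root="/sys/class/net"):
    """Return the kernel-reported operstate, or None if the interface is absent."""
    state_file = Path(sysfs_root) / name / "operstate"
    if not state_file.is_file():
        return None
    return state_file.read_text().strip()


def interface_is_up(name, sysfs_root="/sys/class/net"):
    """Return True only when the interface exists and reports 'up'."""
    return interface_operstate(name, sysfs_root) == "up"


if __name__ == "__main__":
    # "ib0" is a hypothetical interface name.
    state = interface_operstate("ib0")
    print("ib0 not found" if state is None else f"ib0 operstate: {state}")
```

If the interface reports down, it can typically be brought up with the standard ip link set <interface> up command, run as root.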

Next Step

To learn how to deploy the iDRAC telemetry containers on the service cluster, click here.
