Deploy Omnia on RHEL Platforms
Follow the steps below to deploy the Omnia container on RHEL-based platforms:
- Space Requirements for the OIM
- Omnia Minimum Cluster Node Requirements
- Complete Deployment Supported by Omnia (Slurm and Service Kubernetes Cluster) – x86_64 and aarch64
- Complete Deployment Supported by Omnia (Slurm and Service Kubernetes Cluster) – x86_64 Only
- Slurm-Only Deployment Supported by Omnia – x86_64
- Service Kubernetes Cluster-Only Deployment Supported by Omnia (iDRAC Telemetry) – x86_64
- Ports Used by the OIM
- Step 1: Deploy Omnia Core Container
- Step 2: Create Mapping File with Node Information
- Step 3: Provide Inputs to the Files in the project_default Directory
- Step 4: Provide Required Credentials for Omnia
- Step 5: Prepare the OIM
- Step 6: Verify Readiness of OIM
- Step 7: Configure OpenLDAP Proxy for Centralized Authentication
- Step 8: Configure Telemetry Requirements
- Step 9: Create Local Repositories for the Cluster
- Step 10: High Availability
- Step 11: Set up Slurm on nodes
- Automated CUDA and DCGM Provisioning
- Manual Recovery: CUDA Toolkit and DCGM Setup Failure
- Configuration merge control
- Default Slurm configuration
- Pulling container images on a Slurm cluster node
- HPC Benchmark Image Layer
- Backup Slurm configuration
- Cleanup Slurm configuration
- Rollback Slurm configuration
- Step 12: Build Cluster Node Images
- Step 13: Provision cluster nodes
- Step 14: Verify Slurm Cluster and Kubernetes on the Service Cluster
- Step 15: Initialize and Verify Telemetry
- Step 16: Verify Telemetry Services Deployed on the Cluster
- External Services Telemetry Integrations
- Integrate OpenManage Enterprise with Omnia Kafka Pipeline for Secure Telemetry Data Streaming
- Integrate Smart Fabric Manager (SFM) with VictoriaMetrics for Secure Telemetry Data Streaming
- Integrate NVIDIA Unified Fabric Manager (UFM) with Omnia Telemetry for Secure Metrics and Logs Streaming
- Integrate VAST Storage with Omnia Telemetry for Secure Metrics and Logs Streaming
- Collect Telemetry Data from External Client Nodes to Kafka
- Collect Telemetry Data from External Client Nodes to Victoria DB (Cluster Mode)
- Configure External Client Nodes to Send Logs to VictoriaLogs (Cluster Mode)
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.