AMD Resource Manager resource management dashboard

Dashboard

The dashboard gives users an overview of the onboarded clusters and the workloads running on them.

Clusters and nodes

This section shows the number of onboarded clusters, the number of GPU nodes in those clusters, the total number of GPUs across all clusters, and the number of GPUs allocated via quotas.

Cluster stat cards outline various consumption metrics.
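
To make the four summary figures concrete, the following minimal sketch shows how such counts could be derived from a cluster inventory. The `Node` and `Cluster` record types, their field names, and the sample data are hypothetical and chosen only for illustration. They do not reflect the product's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical record types for illustration only.
@dataclass
class Node:
    name: str
    gpu_count: int  # 0 for CPU-only nodes


@dataclass
class Cluster:
    name: str
    nodes: List[Node]
    quota_allocated_gpus: int  # GPUs assigned to projects through quotas


def summarize(clusters: List[Cluster]) -> Dict[str, int]:
    """Compute the four summary counts described in this section."""
    # Only nodes with at least one GPU count as GPU nodes.
    gpu_nodes = [n for c in clusters for n in c.nodes if n.gpu_count > 0]
    return {
        "clusters": len(clusters),
        "gpu_nodes": len(gpu_nodes),
        "total_gpus": sum(n.gpu_count for n in gpu_nodes),
        "allocated_gpus": sum(c.quota_allocated_gpus for c in clusters),
    }


# Made-up sample inventory.
clusters = [
    Cluster("cluster-a", [Node("a1", 8), Node("a2", 8), Node("a3", 0)], 12),
    Cluster("cluster-b", [Node("b1", 4)], 2),
]
summary = summarize(clusters)
print(summary)
```

In this sample, the CPU-only node `a3` is excluded from the GPU node count, so the summary reports three GPU nodes across two clusters.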

Allocations and workloads

This section displays current statistics for workloads, such as GPU utilization, the number of running workloads, and the number of pending workloads. It also lists resource usage by project.

Project GPU consumption is listed under a separate title.
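
As a rough illustration of the workload statistics described above, the sketch below counts workloads by status and totals GPU usage per project. The workload records, their keys (`project`, `status`, `gpus`), and the choice to count only running workloads toward project consumption are assumptions made for this example, not the dashboard's documented behavior.

```python
from collections import Counter, defaultdict

# Hypothetical workload records; field names and values are illustrative only.
workloads = [
    {"project": "vision", "status": "running", "gpus": 4},
    {"project": "vision", "status": "pending", "gpus": 2},
    {"project": "nlp", "status": "running", "gpus": 8},
]

# Number of workloads in each state (running, pending, ...).
status_counts = Counter(w["status"] for w in workloads)

# GPUs consumed per project; assumes only running workloads hold GPUs.
gpus_by_project = defaultdict(int)
for w in workloads:
    if w["status"] == "running":
        gpus_by_project[w["project"]] += w["gpus"]

print(dict(status_counts))
print(dict(gpus_by_project))
```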

GPU memory utilization and GPU device utilization are plotted over time. Users can change the time scale of these graphs using the buttons above them.

The graphs in this view let users follow resource consumption over time.