Workload Priority Classes#
Overview#
Priority classes allow you to assign different importance levels to your workloads, ensuring that critical workloads get access to resources when the cluster is fully utilized. When GPU resources are exhausted, higher priority workloads can preempt (temporarily suspend) lower priority workloads to claim the resources they need.
AMD Resource Manager implements priority class preemption, which operates independently from quota-based preemption. This gives you fine-grained control over workload scheduling within your projects.
Default Priority Classes#
AMD Resource Manager provides three default priority classes:
| Priority Class | Priority Value | Use Case |
|---|---|---|
| low | -100 | Batch workloads, non-urgent work, experimental workloads |
| medium | 0 | Default for most production workloads |
| high | 100 | Time-sensitive workloads, production inference, urgent tasks |
How Priority Class Preemption Works#
When you submit a workload to the cluster:
Resource availability: If GPU resources are available, your workload runs regardless of its priority class
Resource exhaustion: When the cluster is fully utilized, workloads compete for resources based on their priority
Preemption occurs: A higher priority workload can preempt (suspend) a lower priority workload to claim its resources
Suspension, not termination: Preempted workloads are suspended, not deleted. They automatically resume when resources become available
FIFO within priority: Workloads with the same priority class respect first-in-first-out ordering and do not preempt each other
Note
Priority class preemption only occurs when resources are exhausted. If idle resources are available, all workloads run regardless of priority. While the examples in this guide focus on GPU resources, priority classes apply to all resource types (CPU, memory, GPUs).
Preemption Scenarios#
Scenario 1: Higher priority preempts lower priority
Cluster is fully utilized with low priority workloads
You submit a high priority workload
Result: One or more low priority workloads are suspended to make room for the high priority workload
Scenario 2: Same priority respects FIFO
Cluster is fully utilized with medium priority workloads
You submit another medium priority workload
Result: Your workload waits in the queue. No preemption occurs.
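Scenario 1 could be reproduced with two manifests along the following lines. This is only a sketch: the Job names, image names, namespace, and GPU counts are placeholders, and it assumes a cluster where the first Job is enough to exhaust the available GPUs.

```yaml
# Hypothetical low priority Job, submitted first so that it occupies the GPUs.
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-experiment            # placeholder name
  namespace: my-project             # placeholder project namespace
  labels:
    kueue.x-k8s.io/priority-class: low
spec:
  template:
    spec:
      containers:
        - name: worker
          image: my-batch-image:latest    # placeholder image
          resources:
            requests:
              amd.com/gpu: "4"            # assumed to use all free GPUs
            limits:
              amd.com/gpu: "4"
      restartPolicy: Never
---
# Hypothetical high priority Job, submitted while the cluster is full.
# If no GPUs are free, the low priority Job above is a candidate for suspension.
apiVersion: batch/v1
kind: Job
metadata:
  name: urgent-evaluation           # placeholder name
  namespace: my-project
  labels:
    kueue.x-k8s.io/priority-class: high
spec:
  template:
    spec:
      containers:
        - name: worker
          image: my-eval-image:latest     # placeholder image
          resources:
            requests:
              amd.com/gpu: "2"
            limits:
              amd.com/gpu: "2"
      restartPolicy: Never
```

Changing both labels to medium would correspond to Scenario 2, where the second Job waits in the queue instead of preempting the first.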
Setting Priority Classes on Workloads#
Priority classes are set using the Kubernetes label kueue.x-k8s.io/priority-class on your workload resources (Jobs, Pods, Deployments, etc.). Simply add this label to your workload and deploy it to your project’s namespace.
Note
The queue configuration is automatically handled when you deploy to a project namespace. You only need to specify the priority class label.
Example#
apiVersion: batch/v1
kind: Job
metadata:
  name: my-training-job
  namespace: my-project
  labels:
    kueue.x-k8s.io/priority-class: medium
spec:
  template:
    spec:
      containers:
        - name: trainer
          image: my-training-image:latest
          resources:
            requests:
              amd.com/gpu: "2"
            limits:
              amd.com/gpu: "2"
      restartPolicy: Never
To use a different priority class, change the label value to low, medium, or high.
Warning
Use high priority sparingly. If all workloads are marked as high priority, the priority system becomes ineffective and workloads compete on a first-come-first-served basis.
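The same label can be applied to other workload types. As a minimal sketch, a production inference Deployment with high priority might look like the manifest below. The Deployment name, the app label, and the image are placeholders, and the sketch assumes that setting the label in the Deployment's own metadata is sufficient, as this guide describes.

```yaml
# Hypothetical inference Deployment marked as high priority.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server            # placeholder name
  namespace: my-project             # placeholder project namespace
  labels:
    kueue.x-k8s.io/priority-class: high
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inference-server         # placeholder selector label
  template:
    metadata:
      labels:
        app: inference-server
    spec:
      containers:
        - name: server
          image: my-inference-image:latest   # placeholder image
          resources:
            requests:
              amd.com/gpu: "1"
            limits:
              amd.com/gpu: "1"
```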
Priority Class vs Quota Preemption#
AMD Resource Manager uses two independent preemption mechanisms that can work together:
Priority Class Preemption (Workload-Level)#
What it controls: Workload importance within and across projects
How it works: Higher priority workloads preempt lower priority workloads when resources are exhausted
Scope: Individual workloads
Set via: kueue.x-k8s.io/priority-class label on workloads
Use case: Prioritize urgent workloads over batch workloads
Quota Preemption (Project-Level)#
What it controls: Project resource guarantees
How it works: Projects with allocated quotas can reclaim resources from projects that borrowed beyond their quota
Scope: Entire projects
Set via: Project quota settings in AMD Resource Manager UI
Use case: Ensure projects get their guaranteed resource allocation
How They Work Together#
When both mechanisms are active, the system applies them in this order:
First: Quota preemption ensures projects with guaranteed quotas can reclaim borrowed resources
Then: Priority class preemption determines which workloads within the available resources should run
Example scenario:
Project A has a GPU quota of 4, Project B has a quota of 0
Project B borrows idle resources and runs 8 low-priority GPU jobs
Project A submits a high-priority GPU job requiring 2 GPUs
Result: Project A reclaims 2 GPUs from Project B (quota preemption), and if needed, the high-priority job can further preempt low-priority jobs within Project A (priority class preemption)
Note
For detailed information about quota-based preemption, see Project Quotas.
Best Practices#
Choosing the Right Priority Class#
Use low priority for:
Batch training workloads that can be interrupted
Experimental or exploratory workloads
Workloads running during off-peak hours
Development and testing workloads
Use medium priority for:
Standard production training workloads
Most inference workloads
Scheduled workloads with flexible deadlines
Default choice when unsure
Use high priority for:
Production inference serving critical applications
Time-sensitive research experiments
Workloads with strict deadlines or SLA requirements
Workloads that must complete without interruption
Resource Planning Considerations#
Avoid priority inflation: If everyone marks their workloads as high priority, the system degrades to FIFO scheduling
Monitor preemption frequency: Frequent preemption of your workloads may indicate you need higher priority or increased project quota
Design for interruption: Low and medium priority workloads should checkpoint their progress to handle suspension gracefully
Coordinate with your team: Establish team guidelines for priority class usage to prevent conflicts
Workload Checkpointing#
Preempted workloads are suspended, not deleted, and resume when resources become available. Checkpointing is still recommended, because a resumed workload may not retain the in-progress state it held before suspension:
Workloads can save progress periodically
Upon resumption, workloads can continue from the last checkpoint
This minimizes wasted computation and speeds up completion
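One way to keep checkpoints across a suspension is to write them to persistent storage. The manifest below is a sketch under several assumptions: a PersistentVolumeClaim named training-checkpoints already exists in the namespace, and the placeholder training image reads a CHECKPOINT_DIR environment variable, saves there periodically, and loads the latest checkpoint on startup. None of these names come from AMD Resource Manager itself.

```yaml
# Hypothetical interruptible training Job that checkpoints to a mounted volume.
apiVersion: batch/v1
kind: Job
metadata:
  name: resumable-training          # placeholder name
  namespace: my-project             # placeholder project namespace
  labels:
    kueue.x-k8s.io/priority-class: low
spec:
  template:
    spec:
      containers:
        - name: trainer
          image: my-training-image:latest   # placeholder image
          env:
            - name: CHECKPOINT_DIR          # assumed to be read by the training code
              value: /checkpoints
          volumeMounts:
            - name: checkpoints
              mountPath: /checkpoints
          resources:
            requests:
              amd.com/gpu: "2"
            limits:
              amd.com/gpu: "2"
      volumes:
        - name: checkpoints
          persistentVolumeClaim:
            claimName: training-checkpoints # assumed pre-existing PVC
      restartPolicy: Never
```

Because the volume outlives individual Pods, a checkpoint written before suspension would still be available when the workload resumes.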
Supported Workload Types#
Priority classes work with all Kubernetes workload types monitored by AMD Resource Manager:
Pods
Jobs
CronJobs
Deployments
DaemonSets
StatefulSets
KaiwoJobs
KaiwoServices
Simply add the kueue.x-k8s.io/priority-class label to the workload’s metadata to enable priority-based scheduling.
Monitoring Priority Class Behavior#
You can monitor workload status and preemption events using kubectl:
# Check if a workload is suspended (preempted)
kubectl get job my-job -n my-project -o jsonpath='{.status.conditions[?(@.type=="Suspended")].status}'
# View all jobs and their status
kubectl get jobs -n my-project
# Describe a job to see events (including preemption)
kubectl describe job my-job -n my-project
Platform administrators can also monitor workload status and resource usage through the AMD Resource Manager UI.