Workload Priority Classes#

Overview#

Priority classes allow you to assign different importance levels to your workloads, ensuring that critical workloads get access to resources when the cluster is fully utilized. When GPU resources are exhausted, higher priority workloads can preempt (temporarily suspend) lower priority workloads to claim the resources they need.

AMD Resource Manager implements priority class preemption, which operates independently from quota-based preemption. This gives you fine-grained control over workload scheduling within your projects.

Default Priority Classes#

AMD Resource Manager provides three default priority classes:

Priority Class	Priority Value	Use Case
`low`	-100	Batch workloads, non-urgent work, experimental workloads
`medium`	0	Default for most production workloads
`high`	100	Time-sensitive workloads, production inference, urgent tasks

How Priority Class Preemption Works#

When you submit a workload to the cluster:

Resource availability: If GPU resources are available, your workload runs regardless of its priority class
Resource exhaustion: When the cluster is fully utilized, workloads compete for resources based on their priority
Preemption occurs: A higher priority workload can preempt (suspend) a lower priority workload to claim its resources
Suspension, not termination: Preempted workloads are suspended, not deleted. They automatically resume when resources become available
FIFO within priority: Workloads with the same priority class respect first-in-first-out ordering and do not preempt each other

Note

Priority class preemption only occurs when resources are exhausted. If idle resources are available, all workloads run regardless of priority. While the examples in this guide focus on GPU resources, priority classes apply to all resource types (CPU, memory, GPUs).

Preemption Scenarios#

Scenario 1: Higher priority preempts lower priority

Cluster is fully utilized with low priority workloads
You submit a high priority workload
Result: One or more low priority workloads are suspended to make room for the high priority workload

Scenario 2: Same priority respects FIFO

Cluster is fully utilized with medium priority workloads
You submit another medium priority workload
Result: Your workload waits in the queue. No preemption occurs.
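The rules and the two scenarios above can be sketched as a toy model. This is not the scheduler's implementation: the `Workload` class, the `submit` function, the 8-GPU cluster, and the choice of which lower priority workload to suspend first are all assumptions made for illustration. Only the priority values and the rules themselves (run when resources are idle, preempt only strictly lower priority, never preempt equal priority) come from this guide.

```python
from dataclasses import dataclass, field
from itertools import count

# Priority values from the default priority class table.
PRIORITY = {"low": -100, "medium": 0, "high": 100}

_arrival = count()  # hypothetical arrival counter, used only for ordering


@dataclass
class Workload:
    name: str
    priority_class: str
    gpus: int
    arrival: int = field(default_factory=lambda: next(_arrival))


def submit(new, running, capacity):
    """Return (admitted, suspended) for `new`; updates `running` if admitted."""
    free = capacity - sum(w.gpus for w in running)
    if new.gpus <= free:
        running.append(new)  # idle resources: priority is irrelevant
        return True, []

    # Only strictly lower priority workloads are preemption candidates,
    # so workloads of equal priority never preempt each other.
    candidates = [w for w in running
                  if PRIORITY[w.priority_class] < PRIORITY[new.priority_class]]
    # Assumption: suspend the lowest priority, most recently started first.
    candidates.sort(key=lambda w: (PRIORITY[w.priority_class], -w.arrival))

    suspended = []
    for victim in candidates:
        if free >= new.gpus:
            break
        suspended.append(victim)
        free += victim.gpus

    if free < new.gpus:
        return False, []  # preemption cannot free enough: wait in the queue

    for victim in suspended:
        running.remove(victim)
    running.append(new)
    return True, suspended


# Scenario 1: a full cluster of low priority workloads, then a high one.
running = [Workload(f"low-{i}", "low", 2) for i in range(4)]
admitted, suspended = submit(Workload("urgent", "high", 2), running, capacity=8)
print(admitted, [w.name for w in suspended])  # True ['low-3']

# Scenario 2: a full cluster of medium priority workloads, then another medium.
running = [Workload(f"med-{i}", "medium", 2) for i in range(4)]
admitted, suspended = submit(Workload("med-new", "medium", 2), running, capacity=8)
print(admitted, suspended)  # False []
```

In the model, a suspended workload is simply removed from `running`. In the real system it stays in the queue and resumes when resources become available.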

Setting Priority Classes on Workloads#

Set the priority class with the kueue.x-k8s.io/priority-class label in the metadata of your workload resource (Job, Pod, Deployment, and so on), then deploy the workload to your project’s namespace.

Note

The queue configuration is automatically handled when you deploy to a project namespace. You only need to specify the priority class label.

Example#

apiVersion: batch/v1
kind: Job
metadata:
  name: my-training-job
  namespace: my-project
  labels:
    kueue.x-k8s.io/priority-class: medium
spec:
  template:
    spec:
      containers:
      - name: trainer
        image: my-training-image:latest
        resources:
          requests:
            amd.com/gpu: "2"
          limits:
            amd.com/gpu: "2"
      restartPolicy: Never

To use a different priority class, change the label value to low or high.
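If you generate manifests programmatically, a small helper can apply the label and reject values outside the three default classes. This is a minimal sketch that assumes manifests are handled as plain dictionaries. The `with_priority_class` function and the `label_job.py` script name are hypothetical and not part of AMD Resource Manager.

```python
import copy
import json

LABEL = "kueue.x-k8s.io/priority-class"
DEFAULT_CLASSES = ("low", "medium", "high")


def with_priority_class(manifest: dict, priority_class: str) -> dict:
    """Return a copy of `manifest` with the priority class label set."""
    if priority_class not in DEFAULT_CLASSES:
        raise ValueError(f"unknown priority class: {priority_class!r}")
    labelled = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    labels = labelled.setdefault("metadata", {}).setdefault("labels", {})
    labels[LABEL] = priority_class
    return labelled


# The Job from the example above, without a priority class label.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "my-training-job", "namespace": "my-project"},
    "spec": {
        "template": {
            "spec": {
                "containers": [{
                    "name": "trainer",
                    "image": "my-training-image:latest",
                    "resources": {
                        "requests": {"amd.com/gpu": "2"},
                        "limits": {"amd.com/gpu": "2"},
                    },
                }],
                "restartPolicy": "Never",
            }
        }
    },
}

print(json.dumps(with_priority_class(job, "low"), indent=2))
```

Because kubectl accepts JSON manifests, the output can be piped straight into it, for example `python label_job.py | kubectl apply -f -`.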

Warning

Use high priority sparingly. If all workloads are marked as high priority, the priority system becomes ineffective and workloads compete on a first-come-first-served basis.

Priority Class vs Quota Preemption#

AMD Resource Manager uses two independent preemption mechanisms that can work together:

Priority Class Preemption (Workload-Level)#

What it controls: Workload importance within and across projects
How it works: Higher priority workloads preempt lower priority workloads when resources are exhausted
Scope: Individual workloads
Set via: kueue.x-k8s.io/priority-class label on workloads
Use case: Prioritize urgent workloads over batch workloads

Quota Preemption (Project-Level)#

What it controls: Project resource guarantees
How it works: Projects with allocated quotas can reclaim resources from projects that borrowed beyond their quota
Scope: Entire projects
Set via: Project quota settings in AMD Resource Manager UI
Use case: Ensure projects get their guaranteed resource allocation

How They Work Together#

When both mechanisms are active, the system applies them in this order:

First: Quota preemption ensures projects with guaranteed quotas can reclaim borrowed resources
Then: Priority class preemption determines which workloads within the available resources should run

Example scenario:

Project A has a GPU quota of 4, Project B has a quota of 0
Project B borrows idle resources and runs 8 low-priority GPU jobs
Project A submits a high-priority GPU job requiring 2 GPUs
Result: Project A reclaims 2 GPUs from Project B through quota preemption. If quota preemption alone cannot free enough resources, the high-priority job can also preempt low-priority jobs within Project A through priority class preemption.
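The ordering can be made concrete with some simple arithmetic. The sketch below is a simplified model and not the scheduler's algorithm. It assumes that each of Project B's jobs uses one GPU, that no GPUs are idle, and that priority class preemption is applied only within the submitting project, as in the example. The `plan_admission` function and its parameters are hypothetical.

```python
def plan_admission(request, idle, quota, usage, borrowed_by_others,
                   own_lower_priority):
    """Split a GPU request across the sources, in the order described above.

    Returns the GPUs taken from idle capacity, reclaimed through quota
    preemption, and freed through priority class preemption, plus any
    unmet remainder (a non-zero value means the workload has to wait).
    """
    from_idle = min(request, idle)
    remaining = request - from_idle

    # First: quota preemption, limited by the project's unused quota and by
    # what other projects have borrowed beyond their own quotas.
    unused_quota = max(quota - usage, 0)
    from_quota = min(remaining, unused_quota, borrowed_by_others)
    remaining -= from_quota

    # Then: priority class preemption of the project's lower priority work.
    from_priority = min(remaining, own_lower_priority)
    remaining -= from_priority

    return {"idle": from_idle, "quota": from_quota,
            "priority": from_priority, "unmet": remaining}


# The scenario above: Project A uses nothing yet, Project B borrowed 8 GPUs.
print(plan_admission(request=2, idle=0, quota=4, usage=0,
                     borrowed_by_others=8, own_lower_priority=0))
# {'idle': 0, 'quota': 2, 'priority': 0, 'unmet': 0}

# Variation: Project A already runs 3 GPUs of low-priority jobs, so only
# 1 GPU of quota is unused and the second GPU comes from its own jobs.
print(plan_admission(request=2, idle=0, quota=4, usage=3,
                     borrowed_by_others=5, own_lower_priority=3))
# {'idle': 0, 'quota': 1, 'priority': 1, 'unmet': 0}
```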

Note

For detailed information about quota-based preemption, see Project Quotas.

Best Practices#

Choosing the Right Priority Class#

Use low priority for:

Batch training workloads that can be interrupted
Experimental or exploratory workloads
Workloads running during off-peak hours
Development and testing workloads

Use medium priority for:

Standard production training workloads
Most inference workloads
Scheduled workloads with flexible deadlines
Default choice when unsure

Use high priority for:

Production inference serving critical applications
Time-sensitive research experiments
Workloads with strict deadlines or SLA requirements
Workloads that must complete without interruption

Resource Planning Considerations#

Avoid priority inflation: If everyone marks their workloads as high priority, the system degrades to FIFO scheduling
Monitor preemption frequency: Frequent preemption of your workloads may indicate you need higher priority or increased project quota
Design for interruption: Low and medium priority workloads should checkpoint their progress to handle suspension gracefully
Coordinate with your team: Establish team guidelines for priority class usage to prevent conflicts

Workload Checkpointing#

Preempted workloads are suspended rather than deleted, and they resume when resources become available. Checkpointing is still recommended, because it lets a resumed workload continue from saved progress instead of starting over:

Workloads can save progress periodically
Upon resumption, workloads can continue from the last checkpoint
This minimizes wasted computation and speeds up completion
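A minimal sketch of this pattern, using a JSON file as the checkpoint. The function names, the checkpoint format, and the `stop_after` parameter (which only simulates a suspension) are hypothetical. In a real workload the state would be model and optimizer data, and the checkpoint path should be on storage that outlives the pod.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state atomically so an interrupted write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # atomic rename within the same filesystem


def run(path, total_steps, checkpoint_every=10, stop_after=None):
    """Run a stand-in workload, resuming from `path` if a checkpoint exists."""
    state = load_checkpoint(path)
    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return state  # simulated suspension: unsaved progress is lost
        state["total"] += state["step"]  # placeholder for real work
        state["step"] += 1
        done_this_run += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "checkpoint.json")
    run(ckpt, total_steps=100, stop_after=37)  # last checkpoint is at step 30
    final = run(ckpt, total_steps=100)         # resumes from step 30
    print(final)  # {'step': 100, 'total': 4950}
```

Only the 7 steps after the last checkpoint are repeated on resumption, which is the trade-off that `checkpoint_every` controls.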

Supported Workload Types#

Priority classes work with all Kubernetes workload types monitored by AMD Resource Manager:

Pods
Jobs
CronJobs
Deployments
DaemonSets
StatefulSets
KaiwoJobs
KaiwoServices

Simply add the kueue.x-k8s.io/priority-class label to the workload’s metadata to enable priority-based scheduling.

Monitoring Priority Class Behavior#

You can monitor workload status and preemption events using kubectl:

# Check if a workload is suspended (preempted)
kubectl get job my-job -n my-project -o jsonpath='{.status.conditions[?(@.type=="Suspended")].status}'

# View all jobs and their status
kubectl get jobs -n my-project

# Describe a job to see events (including preemption)
kubectl describe job my-job -n my-project

Platform administrators can also monitor workload status and resource usage through the AMD Resource Manager UI.
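The same check can be scripted, for example to list every suspended Job in a namespace. This sketch assumes kubectl is on the PATH and configured for the cluster. `is_suspended` and `suspended_jobs` are hypothetical helper names. They read the same Suspended condition as the jsonpath query above.

```python
import json
import subprocess


def is_suspended(job: dict) -> bool:
    """Return True if the Job has a Suspended condition with status "True"."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(c.get("type") == "Suspended" and c.get("status") == "True"
               for c in conditions)


def suspended_jobs(namespace: str) -> list[str]:
    """List the names of suspended Jobs in a namespace (requires kubectl)."""
    out = subprocess.run(
        ["kubectl", "get", "jobs", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [job["metadata"]["name"]
            for job in json.loads(out)["items"] if is_suspended(job)]
```

Calling `suspended_jobs("my-project")` returns the names of the Jobs that are currently suspended in that namespace.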
