Workload Priority Classes

Overview

Priority classes allow you to assign different importance levels to your workloads, ensuring that critical workloads get access to resources when the cluster is fully utilized. When GPU resources are exhausted, higher priority workloads can preempt (temporarily suspend) lower priority workloads to claim the resources they need.

AMD Resource Manager implements priority class preemption, which operates independently from quota-based preemption. This gives you fine-grained control over workload scheduling within your projects.

Default Priority Classes

AMD Resource Manager provides three default priority classes:

Priority Class | Priority Value | Use Case
low            | -100           | Batch workloads, non-urgent work, experimental workloads
medium         | 0              | Default for most production workloads
high           | 100            | Time-sensitive workloads, production inference, urgent tasks

How Priority Class Preemption Works

When you submit a workload to the cluster:

  1. Resource availability: If GPU resources are available, your workload runs regardless of its priority class

  2. Resource exhaustion: When the cluster is fully utilized, workloads compete for resources based on their priority

  3. Preemption occurs: A higher priority workload can preempt (suspend) a lower priority workload to claim its resources

  4. Suspension, not termination: Preempted workloads are suspended, not deleted. They automatically resume when resources become available

  5. FIFO within priority: Workloads with the same priority class respect first-in-first-out ordering and do not preempt each other

Note

Priority class preemption only occurs when resources are exhausted. If idle resources are available, all workloads run regardless of priority. While the examples in this guide focus on GPU resources, priority classes apply to all resource types (CPU, memory, GPUs).

Preemption Scenarios

Scenario 1: Higher priority preempts lower priority

  • Cluster is fully utilized with low priority workloads

  • You submit a high priority workload

  • Result: One or more low priority workloads are suspended to make room for the high priority workload

Scenario 2: Same priority respects FIFO

  • Cluster is fully utilized with medium priority workloads

  • You submit another medium priority workload

  • Result: Your workload waits in the queue. No preemption occurs.
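The two scenarios can be reproduced with a small toy model. This is only a sketch of the rules described in this guide, not the scheduler itself: the ToyCluster and Workload names are hypothetical, the cluster is treated as a single GPU pool, and the choice of which lower priority workload to suspend first (lowest priority, then most recently submitted) is an assumption. Resumption of suspended workloads is not modeled.

```python
from dataclasses import dataclass, field
from itertools import count

# Priority values taken from the default priority classes.
PRIORITY = {"low": -100, "medium": 0, "high": 100}

_arrival = count()  # monotonically increasing submission order


@dataclass
class Workload:
    name: str
    priority_class: str
    gpus: int
    arrival: int = field(default_factory=lambda: next(_arrival))


class ToyCluster:
    """Hypothetical single-pool model; not the real scheduler."""

    def __init__(self, total_gpus):
        self.total_gpus = total_gpus
        self.running = []    # admitted workloads
        self.suspended = []  # preempted workloads, kept rather than deleted
        self.pending = []    # waiting workloads, in submission order

    def free_gpus(self):
        return self.total_gpus - sum(w.gpus for w in self.running)

    def submit(self, w):
        # Idle resources: the workload runs regardless of priority.
        if w.gpus <= self.free_gpus():
            self.running.append(w)
            return "running"
        # Exhausted: only strictly lower priority workloads may be suspended.
        # Assumption: lowest priority first, newest first within a class.
        victims = sorted(
            (r for r in self.running
             if PRIORITY[r.priority_class] < PRIORITY[w.priority_class]),
            key=lambda r: (PRIORITY[r.priority_class], -r.arrival),
        )
        freed, chosen = self.free_gpus(), []
        for v in victims:
            if freed >= w.gpus:
                break
            chosen.append(v)
            freed += v.gpus
        if freed < w.gpus:
            # Nothing of lower priority can make room: wait in the queue.
            self.pending.append(w)
            return "pending"
        for v in chosen:
            self.running.remove(v)
            self.suspended.append(v)
        self.running.append(w)
        return "running"


# Scenario 1: a high priority workload arrives on a cluster full of low priority work.
cluster = ToyCluster(total_gpus=4)
for i in range(4):
    cluster.submit(Workload(f"low-{i}", "low", gpus=1))
print(cluster.submit(Workload("urgent", "high", gpus=2)))  # running
print([w.name for w in cluster.suspended])                 # ['low-3', 'low-2']

# Scenario 2: same priority, so the new workload waits and nothing is suspended.
full = ToyCluster(total_gpus=2)
full.submit(Workload("medium-0", "medium", gpus=2))
print(full.submit(Workload("medium-1", "medium", gpus=1)))  # pending
```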

Setting Priority Classes on Workloads

Priority classes are set using the Kubernetes label kueue.x-k8s.io/priority-class on your workload resources (Jobs, Pods, Deployments, etc.). Simply add this label to your workload and deploy it to your project’s namespace.

Note

The queue configuration is automatically handled when you deploy to a project namespace. You only need to specify the priority class label.

Example

apiVersion: batch/v1
kind: Job
metadata:
  name: my-training-job
  namespace: my-project
  labels:
    # Priority class for this workload: low, medium, or high
    kueue.x-k8s.io/priority-class: medium
spec:
  template:
    spec:
      containers:
      - name: trainer
        image: my-training-image:latest
        resources:
          requests:
            amd.com/gpu: "2"
          limits:
            amd.com/gpu: "2"
      restartPolicy: Never

To use a different priority class, change the label value to low, medium, or high.
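If you generate manifests programmatically, the label can be applied by a small helper. The following is a minimal sketch, assuming the manifest is held as a plain Python dictionary; with_priority_class and VALID_PRIORITY_CLASSES are illustrative names, not part of any AMD Resource Manager or Kueue API. Only the label key and the three class names come from this guide.

```python
import copy
import json

VALID_PRIORITY_CLASSES = ("low", "medium", "high")
PRIORITY_LABEL = "kueue.x-k8s.io/priority-class"


def with_priority_class(manifest, priority_class):
    """Return a copy of a manifest dict with the priority class label set."""
    if priority_class not in VALID_PRIORITY_CLASSES:
        raise ValueError(f"unknown priority class: {priority_class!r}")
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    labels = updated.setdefault("metadata", {}).setdefault("labels", {})
    labels[PRIORITY_LABEL] = priority_class
    return updated


# Same Job as in the example, without a priority class label yet.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "my-training-job", "namespace": "my-project"},
    "spec": {"template": {"spec": {
        "containers": [{
            "name": "trainer",
            "image": "my-training-image:latest",
            "resources": {"requests": {"amd.com/gpu": "2"},
                          "limits": {"amd.com/gpu": "2"}},
        }],
        "restartPolicy": "Never",
    }}},
}

low_job = with_priority_class(job, "low")
print(json.dumps(low_job["metadata"]["labels"]))
```

Because kubectl accepts JSON as well as YAML, the output of json.dumps(low_job) can be written to a file and deployed with kubectl apply -f.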

Warning

Use high priority sparingly. If all workloads are marked as high priority, the priority system becomes ineffective and workloads compete on a first-come-first-served basis.

Priority Class vs Quota Preemption

AMD Resource Manager uses two independent preemption mechanisms that can work together:

Priority Class Preemption (Workload-Level)

  • What it controls: Workload importance within and across projects

  • How it works: Higher priority workloads preempt lower priority workloads when resources are exhausted

  • Scope: Individual workloads

  • Set via: kueue.x-k8s.io/priority-class label on workloads

  • Use case: Prioritize urgent workloads over batch workloads

Quota Preemption (Project-Level)

  • What it controls: Project resource guarantees

  • How it works: Projects with allocated quotas can reclaim resources from projects that borrowed beyond their quota

  • Scope: Entire projects

  • Set via: Project quota settings in AMD Resource Manager UI

  • Use case: Ensure projects get their guaranteed resource allocation

How They Work Together

When both mechanisms are active, the system applies them in this order:

  1. First: Quota preemption ensures projects with guaranteed quotas can reclaim borrowed resources

  2. Then: Priority class preemption determines which workloads within the available resources should run

Example scenario:

  • Project A has a GPU quota of 4, Project B has a quota of 0

  • Project B borrows idle resources and runs 8 low-priority GPU jobs

  • Project A submits a high-priority GPU job requiring 2 GPUs

  • Result: Project A reclaims 2 GPUs from Project B (quota preemption), and if needed, the high-priority job can further preempt low-priority jobs within Project A (priority class preemption)
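The quota arithmetic in this scenario can be worked through in a few lines. This is a rough sketch under stated assumptions, not the actual reclaim algorithm: it assumes an 8-GPU cluster with no idle GPUs, that each of Project B's jobs uses one GPU, and that Project A is running nothing when it submits its job. The Project class and reclaim function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    quota: int   # guaranteed GPUs
    in_use: int  # GPUs currently held

    @property
    def borrowed(self):
        # GPUs held beyond the project's own quota.
        return max(0, self.in_use - self.quota)

    @property
    def unused_guarantee(self):
        # Guaranteed GPUs the project is not currently using.
        return max(0, self.quota - self.in_use)


def reclaim(requester, lender, request_gpus, idle_gpus=0):
    """GPUs taken back from `lender` so `requester` can start `request_gpus`."""
    shortfall = max(0, request_gpus - idle_gpus)
    # Only the part of the request covered by the requester's guarantee can
    # be reclaimed, and only from GPUs the lender borrowed beyond its quota.
    return min(shortfall, requester.unused_guarantee, lender.borrowed)


a = Project("A", quota=4, in_use=0)
b = Project("B", quota=0, in_use=8)
print(reclaim(a, b, request_gpus=2))  # 2
```

Under the same assumptions, a 6-GPU request from Project A would reclaim only 4 GPUs, because that is all its quota guarantees.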

Note

For detailed information about quota-based preemption, see Project Quotas.

Best Practices

Choosing the Right Priority Class

Use low priority for:

  • Batch training workloads that can be interrupted

  • Experimental or exploratory workloads

  • Workloads running during off-peak hours

  • Development and testing workloads

Use medium priority for:

  • Standard production training workloads

  • Most inference workloads

  • Scheduled workloads with flexible deadlines

  • Default choice when unsure

Use high priority for:

  • Production inference serving critical applications

  • Time-sensitive research experiments

  • Workloads with strict deadlines or SLA requirements

  • Workloads that must complete without interruption

Resource Planning Considerations

  1. Avoid priority inflation: If everyone marks their workloads as high priority, the system degrades to FIFO scheduling

  2. Monitor preemption frequency: Frequent preemption of your workloads may indicate you need higher priority or increased project quota

  3. Design for interruption: Low and medium priority workloads should checkpoint their progress to handle suspension gracefully

  4. Coordinate with your team: Establish team guidelines for priority class usage to prevent conflicts

Workload Checkpointing

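Preempted workloads are suspended rather than deleted, so they resume automatically when resources become available. Checkpointing is still recommended, because a resumed workload does not necessarily continue from where it stopped: suspending a Kubernetes Job, for example, terminates its running Pods, and any progress held only in memory is lost. With checkpointing: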

  • Workloads can save progress periodically

  • Upon resumption, workloads can continue from the last checkpoint

  • This minimizes wasted computation and speeds up completion
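One way to structure a checkpointing loop is sketched below. It is a minimal illustration, not a recommended implementation for any particular framework: the function names, the JSON state format, and the stop_after parameter (used only to simulate a suspension) are all assumptions. In a real workload the checkpoint path would need to be on storage that survives the Pod, such as a mounted volume.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so an interruption mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename within the same filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0}


def train(path, total_steps, every=10, stop_after=None):
    """Run from the last checkpoint, saving progress every `every` steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if stop_after is not None and step >= stop_after:
            return state  # simulated suspension: unsaved steps are lost
        # ... one unit of real work would go here ...
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If a run is interrupted at step 25 with every=10, the next run starts from step 20, so at most one checkpoint interval of work is repeated.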

Supported Workload Types

Priority classes work with all Kubernetes workload types monitored by AMD Resource Manager:

  • Pods

  • Jobs

  • CronJobs

  • Deployments

  • DaemonSets

  • StatefulSets

  • KaiwoJobs

  • KaiwoServices

Simply add the kueue.x-k8s.io/priority-class label to the workload’s metadata to enable priority-based scheduling.

Monitoring Priority Class Behavior

You can monitor workload status and preemption events using kubectl:

# Check if a workload is suspended (preempted)
kubectl get job my-job -n my-project -o jsonpath='{.status.conditions[?(@.type=="Suspended")].status}'

# View all jobs and their status
kubectl get jobs -n my-project

# Describe a job to see events (including preemption)
kubectl describe job my-job -n my-project

Platform administrators can also monitor workload status and resource usage through the AMD Resource Manager UI.
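For scripted checks, the same status can be read from the JSON output of kubectl. The sketch below assumes kubectl is installed and configured for your cluster; is_suspended and suspended_jobs are illustrative helper names. A Suspended condition on its own does not say why a Job is suspended, so the Job's events are still the place to confirm that preemption was the cause.

```python
import json
import subprocess


def is_suspended(job):
    """True if a batch/v1 Job object (as a dict) has condition Suspended=True."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Suspended" and c.get("status") == "True"
        for c in conditions
    )


def suspended_jobs(namespace):
    """Names of suspended Jobs in a namespace, read via kubectl."""
    out = subprocess.run(
        ["kubectl", "get", "jobs", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [
        job["metadata"]["name"]
        for job in json.loads(out)["items"]
        if is_suspended(job)
    ]
```

Calling suspended_jobs("my-project") returns the names of the Jobs in that namespace that currently report Suspended=True.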