Enabling Enterprises to Scale AI in Production#
The sections below document composable AMD software components for running AI on Kubernetes, including optimized inference containers (AIMs), orchestration, operational tooling, and resource governance. Organizations and partners can adopt individual components or combine them with existing infrastructure to improve GPU utilization and accelerate deployment from experiment to production, with security and governance controls where required.
AMD Inference Microservices Ecosystem#
AMD Inference Microservices (AIMs) provide standardized, portable inference microservices for serving AI models on AMD hardware. Distributed as Docker images, AIMs abstract away the complexities involved in model serving through an intelligent orchestration layer that automatically configures runtime environments, detects available accelerators, and selects an optimized performance profile (configuration parameters for the inference engine). In addition, AIM exposes an OpenAI-compatible API for LLMs, making it easy to integrate with existing applications and services.
Visit the AIM Catalog to view available AIM containers for supported models.
Resource |
Link |
|---|---|
Documentation |
|
GitHub |
|
Getting started |
AIM Engine#
The AIM Engine is a Kubernetes operator that provides capabilities to orchestrate the full lifecycle of AI inference workloads. It bridges model artifacts and production-ready inference endpoints by coordinating Kubernetes-native components, including KServe InferenceService resources for simple yet powerful deployment with minimal configuration.
The operator maintains the AIM Catalog on your cluster with automatic discovery from container registries and selects optimized AIM images and profiles for detected accelerators. Additional capabilities include model caching, setting HTTP routing through the Gateway API, autoscaling with KEDA and OpenTelemetry metrics, and multi-tenancy.
Resource |
Link |
|---|---|
Documentation |
|
GitHub |
|
Getting started |
AMD AI Workbench#
AMD AI Workbench is a graphical interface that enables AI practitioners to run, manage and scale AI workloads, including fine-tuning, inference, and related jobs. Key capabilities include one-click deployment of AIMs from the AIM Catalog, plus bundled tools such as Jupyter Notebooks, Visual Studio Code and MLOps integrations including MLflow for experiment tracking and model lifecycle management.
Resource |
Link |
|---|---|
Documentation |
|
GitHub |
|
Getting started |
AMD Solution Blueprints#
AMD Solution Blueprints are ready-to-deploy, customizable Kubernetes reference applications built with AIMs. They serve as starting points and example implementations, allowing you to explore AIMs across a broad range of use cases, from standard chat interfaces to agentic frameworks. Each blueprint is packaged as a Helm chart and includes architecture diagrams and documentation.
Resource |
Link |
|---|---|
Documentation |
|
GitHub |
|
Getting started |
AMD Resource Manager#
The AMD Resource Manager is a graphical interface that enables the dynamic sharing of compute resources between teams working in a common cluster. Built on top of Kubernetes it achieves this, by utilizing open-source technologies such as Kueue. Its key capabilities include cluster management, monitoring, and maintaining teams’ access to computational resources.
Resource |
Link |
|---|---|
Documentation |
|
GitHub |
|
Getting started |