API Reference#

Packages#

aim.eai.amd.com/v1alpha1

aim.eai.amd.com/v1alpha1#

Package v1alpha1 contains API Schema definitions for the aim v1alpha1 API group.

Resource Types#

AIMArtifact#

AIMArtifact is the Schema for the artifacts API.

Appears in:

AIMArtifactList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMArtifact`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMArtifactSpec
`status` AIMArtifactStatus

AIMArtifactConfig#

AIMArtifactConfig controls artifact-level defaults that are not appropriate for individual services. These settings apply at namespace/cluster scope only.

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec

Field	Description	Default	Validation
`defaultRetentionPriority` integer	DefaultRetentionPriority sets the default retention priority for AIMArtifacts that do not specify one in their spec. When set, artifacts without an explicit retentionPriority become eligible for automatic eviction at this priority level. Lower values are evicted first. If not set, artifacts without an explicit retentionPriority are never automatically evicted.		Minimum: 0 Optional: {}
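
The interaction between an artifact's own retentionPriority and this default can be sketched as below. This is an illustrative Python model rather than controller code: the function names, the dictionary layout, and the tie-break by name are assumptions made for the example.

```python
from typing import Optional


def effective_priority(spec_priority: Optional[int],
                       default_priority: Optional[int]) -> Optional[int]:
    """Return the priority used for eviction, or None if never auto-evicted."""
    if spec_priority is not None:
        return spec_priority
    # May still be None when no defaultRetentionPriority is configured.
    return default_priority


def eviction_order(artifacts: list[dict],
                   default_priority: Optional[int]) -> list[str]:
    """List evictable artifact names, lowest priority (evicted first) first."""
    candidates = []
    for artifact in artifacts:
        priority = effective_priority(artifact.get("retentionPriority"),
                                      default_priority)
        if priority is not None:
            candidates.append((priority, artifact["name"]))
    # Assumption: ties are broken by name purely to keep the output stable.
    return [name for _, name in sorted(candidates)]
```

With no default configured, an artifact that omits retentionPriority never appears in the result; once a default is set, it is ranked at that level alongside the others.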

AIMArtifactList#

AIMArtifactList contains a list of AIMArtifact.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMArtifactList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMArtifact array

AIMArtifactMode#

Underlying type: string

AIMArtifactMode indicates the ownership mode of an artifact, derived from owner references.

Validation:

Enum: [Dedicated Shared]

Appears in:

AIMArtifactStatus

Field	Description
`Dedicated`	ArtifactModeDedicated indicates the cache has owner references and will be garbage collected when its owners are deleted.
`Shared`	ArtifactModeShared indicates the cache has no owner references and persists independently, available for sharing across services.
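
A minimal sketch of how the mode could be derived from object metadata, assuming the metadata is available as a plain dictionary with the standard Kubernetes `ownerReferences` key. The helper name `artifact_mode` is hypothetical.

```python
def artifact_mode(metadata: dict) -> str:
    """Return "Dedicated" if the object has owner references, else "Shared"."""
    return "Dedicated" if metadata.get("ownerReferences") else "Shared"
```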

AIMArtifactSpec#

AIMArtifactSpec defines the desired state of AIMArtifact.

Appears in:

AIMArtifact

Field	Description	Validation
`sourceUri` string	SourceURI specifies the source location of the model to download. Supported protocols: hf:// (Hugging Face) and s3:// (S3-compatible storage). This field uniquely identifies the artifact and is immutable after creation. Example: hf://meta-llama/Llama-3-8B	MinLength: 1 Pattern: `^(hf|s3)://[^ \t\r\n]+$`
`modelId` string	ModelID is the canonical identifier in {org}/{name} format. It determines the cache download path: /workspace/cache/{modelId}. For Hugging Face sources, this is typically derived from the URI (e.g., “meta-llama/Llama-3-8B”). For S3 sources, this must be explicitly provided (e.g., “my-team/fine-tuned-llama”). When not specified, it is derived from SourceURI for Hugging Face sources.	Pattern: `^[a-zA-Z0-9_-]+/[a-zA-Z0-9._-]+$` Optional: {}
`storageClassName` string	StorageClassName specifies the storage class for the cache volume. When not specified, uses the cluster default storage class.	Optional: {}
`size` Quantity	Size specifies the size of the cache volume	Optional: {}
`env` EnvVar array	Env lists the environment variables to use for authentication when downloading models. These variables are used for authentication with model registries (e.g., HuggingFace tokens).	Optional: {}
`modelDownloadImage` string	ModelDownloadImage specifies the container image used to download and initialize the artifact. This image runs as a job to download model artifacts from the source URI to the cache volume. When not specified, the controller uses its built-in default (matching the release version).	Optional: {}
`downloadFilter` AIMDownloadFilter	DownloadFilter controls which files are included or excluded when downloading from Hugging Face. Overrides any filter set in the runtime config’s storage.downloadFilter. When neither is set, subdirectory files are excluded by default (equivalent to `exclude: ["*/*"]`). To download all files including subdirectories, set this to an empty object: `downloadFilter: {}`. This field is immutable; to change the filter, recreate the artifact.	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets references secrets for pulling AIM container images.	Optional: {}
`retentionPriority` integer	RetentionPriority marks this artifact as eligible for automatic eviction when storage quota is exceeded. Lower values are evicted first. Artifacts without this field are only evictable if a defaultRetentionPriority is configured in the runtime config. Use the aim.eai.amd.com/eviction-protected annotation to exempt an artifact from eviction entirely.	Minimum: 0 Optional: {}
`runtimeConfigName` string	Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both as a namespace and a cluster runtime config, the values are merged together, the namespace config taking priority over the cluster config when there are conflicts. If this field is empty or set to `default`, the namespace / cluster runtime config with the name `default` is used, if it exists.	Optional: {}
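
The sourceUri and modelId rules can be checked client-side before a manifest is applied. The sketch below reuses the two validation patterns from the table. The derivation of modelId from an hf:// URI (dropping the scheme) is an assumption based on the documented example, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Patterns copied from the Validation column of the table.
SOURCE_URI_RE = re.compile(r"^(hf|s3)://[^ \t\r\n]+$")
MODEL_ID_RE = re.compile(r"^[a-zA-Z0-9_-]+/[a-zA-Z0-9._-]+$")


def resolve_model_id(source_uri: str, model_id: Optional[str] = None) -> str:
    """Validate sourceUri and return the explicit or derived modelId."""
    if not SOURCE_URI_RE.fullmatch(source_uri):
        raise ValueError(f"invalid sourceUri: {source_uri!r}")
    if model_id is None:
        if not source_uri.startswith("hf://"):
            raise ValueError("modelId must be set explicitly for S3 sources")
        # Assumption: the derived ID is the URI with its scheme removed.
        model_id = source_uri[len("hf://"):]
    if not MODEL_ID_RE.fullmatch(model_id):
        raise ValueError(f"invalid modelId: {model_id!r}")
    return model_id


def cache_path(model_id: str) -> str:
    """Return the cache download path documented for a modelId."""
    return f"/workspace/cache/{model_id}"
```

For example, `resolve_model_id("hf://meta-llama/Llama-3-8B")` yields the modelId from the documented example, while an s3:// URI without an explicit modelId is rejected.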

AIMArtifactStatus#

AIMArtifactStatus defines the observed state of AIMArtifact.

Appears in:

AIMArtifact

Field	Description	Default	Validation
`observedGeneration` integer
`conditions` Condition array	Conditions represent the latest available observations of the artifact’s state
`status` AIMStatus	Status represents the current status of the artifact	Pending	Enum: [Pending Progressing Ready Degraded Failed NotAvailable]
`progress` DownloadProgress	Progress represents the download progress when Status is Progressing		Optional: {}
`download` DownloadState	Download represents the current download attempt state, patched by the downloader pod. Shows which protocol is active, the current attempt number, and similar details.		Optional: {}
`displaySize` string	DisplaySize is the human-readable effective size (spec or discovered)		Optional: {}
`lastUsed` Time	LastUsed represents the last time a model was deployed that used this cache
`persistentVolumeClaim` string	PersistentVolumeClaim represents the name of the created PVC
`mode` AIMArtifactMode	Mode indicates the ownership mode of this artifact, derived from owner references. - Dedicated: Has owner references, will be garbage collected when owners are deleted. - Shared: No owner references, persists independently and can be shared.		Enum: [Dedicated Shared] Optional: {}
`discoveredSizeBytes` integer	DiscoveredSizeBytes is the model size discovered via check-size job. Populated when spec.size is not provided.		Optional: {}
`allocatedSize` Quantity	AllocatedSize is the actual PVC size requested (including headroom).		Optional: {}
`headroomPercent` integer	HeadroomPercent is the headroom percentage that was applied to the PVC size.		Optional: {}
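
The relationship between discoveredSizeBytes, headroomPercent, and allocatedSize is plausibly a simple percentage uplift. The sketch below assumes exactly that, with sizes in bytes and rounding up to a whole byte. The controller's actual rounding and unit handling are not documented here, so treat the formula as an assumption.

```python
def allocated_bytes(size_bytes: int, headroom_percent: int) -> int:
    """Apply headroom to a size in bytes, rounding up to a whole byte."""
    if size_bytes < 0 or headroom_percent < 0:
        raise ValueError("size and headroom must be non-negative")
    # Integer ceiling division avoids floating-point error on large sizes.
    return (size_bytes * (100 + headroom_percent) + 99) // 100
```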

AIMArtifactStorageQuota#

AIMArtifactStorageQuota configures storage limits for AIMArtifacts. These settings are only available on AIMClusterRuntimeConfig (cluster-scoped) because they enforce cluster-wide and cross-namespace policies.

Appears in:

AIMClusterRuntimeConfigSpec

Field	Description	Default	Validation
`clusterLimit` Quantity	ClusterLimit is the maximum total allocated storage for all AIMArtifacts cluster-wide. When the sum of all artifact PVC sizes across all namespaces would exceed this limit, new artifact PVCs are blocked until evictable artifacts are cleaned up or the limit is raised.		Optional: {}
`defaultNamespaceLimit` Quantity	DefaultNamespaceLimit is the default maximum allocated storage for AIMArtifacts per namespace. Can be overridden for individual namespaces via the aim.eai.amd.com/artifact-storage-quota annotation.		Optional: {}
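
A rough model of the admission decision described above, with all sizes as integer byte counts. Parsing of Kubernetes Quantity strings and of the aim.eai.amd.com/artifact-storage-quota annotation value is left out, and the function and parameter names are hypothetical.

```python
from typing import Optional


def would_exceed(allocated: int, requested: int, limit: Optional[int]) -> bool:
    """True if adding `requested` bytes would push `allocated` past `limit`."""
    return limit is not None and allocated + requested > limit


def admit_pvc(requested: int,
              namespace_allocated: int,
              cluster_allocated: int,
              cluster_limit: Optional[int] = None,
              default_namespace_limit: Optional[int] = None,
              namespace_override: Optional[int] = None) -> bool:
    """Return True if a new artifact PVC fits both the cluster and namespace limits."""
    # A per-namespace override (from the annotation) replaces the default limit.
    namespace_limit = (namespace_override if namespace_override is not None
                       else default_namespace_limit)
    if would_exceed(cluster_allocated, requested, cluster_limit):
        return False
    return not would_exceed(namespace_allocated, requested, namespace_limit)
```

A limit of `None` stands for an unset field, in which case that check is skipped.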

AIMCachingMode#

Underlying type: string

AIMCachingMode controls caching behavior for a service. Canonical values are Dedicated and Shared. Legacy values are accepted for backward compatibility:

Always maps to Shared
Auto maps to Shared
Never maps to Dedicated

Validation:

Enum: [Dedicated Shared Auto Always Never]

Appears in:

AIMServiceCachingConfig

Field	Description
`Dedicated`	CachingModeDedicated always creates service-owned dedicated caches/artifacts.
`Shared`	CachingModeShared reuses and creates shared caches/artifacts.
`Auto`	CachingModeAuto is a deprecated legacy value that maps to Shared.
`Always`	CachingModeAlways is a deprecated legacy value that maps to Shared.
`Never`	CachingModeNever is a deprecated legacy value that maps to Dedicated.
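
The legacy-to-canonical mapping is small enough to express as a lookup table. The helper below is an illustrative sketch; the name `canonical_caching_mode` is not part of the API.

```python
CANONICAL_MODES = {"Dedicated", "Shared"}
LEGACY_MODES = {"Always": "Shared", "Auto": "Shared", "Never": "Dedicated"}


def canonical_caching_mode(value: str) -> str:
    """Map any accepted AIMCachingMode value to Dedicated or Shared."""
    if value in CANONICAL_MODES:
        return value
    if value in LEGACY_MODES:
        return LEGACY_MODES[value]
    raise ValueError(f"unsupported caching mode: {value!r}")
```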

AIMClusterModel#

AIMClusterModel is a cluster-scoped model catalog entry for AIM container images.

Cluster-scoped models can be referenced by AIMServices in any namespace, making them ideal for shared model deployments across teams and projects. Like namespace-scoped AIMModels, cluster models trigger discovery jobs to extract metadata and generate service templates.

When both cluster and namespace models exist for the same container image, services will preferentially use the namespace-scoped AIMModel when referenced by image URI.

Appears in:

AIMClusterModelList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterModel`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMModelSpec
`status` AIMModelStatus

AIMClusterModelList#

AIMClusterModelList contains a list of AIMClusterModel.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterModelList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterModel array

AIMClusterModelSource#

AIMClusterModelSource automatically discovers and syncs AI model images from container registries.

Appears in:

AIMClusterModelSourceList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterModelSource`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMClusterModelSourceSpec
`status` AIMClusterModelSourceStatus

AIMClusterModelSourceList#

AIMClusterModelSourceList contains a list of AIMClusterModelSource.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterModelSourceList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterModelSource array

AIMClusterModelSourceSpec#

AIMClusterModelSourceSpec defines the desired state of AIMClusterModelSource.

Appears in:

AIMClusterModelSource

Field	Description	Default	Validation
`registry` string	Registry to sync from (e.g., docker.io, ghcr.io, gcr.io). Defaults to docker.io if not specified.	docker.io	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets contains references to secrets for authenticating to private registries. Secrets must exist in the operator namespace (typically aim-system). Used for both registry catalog listing and image metadata extraction.		Optional: {}
`filters` ModelSourceFilter array	Filters define which images to discover and sync. Each filter specifies an image pattern with optional version constraints and exclusions. Multiple filters are combined with OR logic (any match includes the image).		MaxItems: 100 MinItems: 1
`syncInterval` Duration	SyncInterval defines how often to sync with the registry. Defaults to 1h. Minimum recommended interval is 15m to avoid rate limiting. Format: duration string (e.g., “30m”, “1h”, “2h30m”).	1h	Optional: {}
`versions` string array	Versions specifies global semantic version constraints applied to all filters. Individual filters can override this with their own version constraints. Constraints use semver syntax: >=1.0.0, <2.0.0, ~1.2.0, ^1.0.0, etc. Non-semver tags (e.g., “latest”, “dev”) are silently skipped. Version ranges work on all registries (including ghcr.io, gcr.io) when combined with exact repository names (no wildcards). The controller uses the Tags List API to fetch all tags for the repository and filters them by the semver constraint. Example: registry=ghcr.io, filters=[{image: “silogen/aim-llama”}], versions=[“>=1.0.0”] will fetch all tags from ghcr.io/silogen/aim-llama and include only those >=1.0.0.		Optional: {}
`maxModels` integer	MaxModels is the maximum number of AIMClusterModel resources to create from this source. Once this limit is reached, no new models will be created, even if more matching images are discovered. Existing models are never deleted. This prevents runaway model creation from overly broad filters.	100	Maximum: 10000 Minimum: 1 Optional: {}
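
To make the tag-filtering behavior concrete, here is a deliberately simplified sketch. It handles a single constraint with a plain comparison operator (`>=`, `<=`, `>`, `<`, `=`), treats only bare MAJOR.MINOR.PATCH tags as semver, and caps the result at maxModels. Tilde and caret ranges, pre-release tags, and the way multiple `versions` entries combine are not modeled, and every name in the block is hypothetical.

```python
import operator
import re

_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")
_CONSTRAINT_RE = re.compile(r"(>=|<=|>|<|=)\s*(\d+\.\d+\.\d+)")
_OPS = {">=": operator.ge, "<=": operator.le,
        ">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_version(tag: str):
    """Return (major, minor, patch) for a plain semver tag, else None."""
    match = _VERSION_RE.fullmatch(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def matching_tags(tags: list[str], constraint: str, max_models: int = 100) -> list[str]:
    """Keep tags that satisfy the constraint, up to max_models entries."""
    match = _CONSTRAINT_RE.fullmatch(constraint.strip())
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    compare = _OPS[match.group(1)]
    bound = parse_version(match.group(2))
    kept = []
    for tag in tags:
        version = parse_version(tag)
        # Non-semver tags such as "latest" or "dev" are silently skipped.
        if version is not None and compare(version, bound):
            kept.append(tag)
    return kept[:max_models]
```

Versions are compared as integer tuples, so `1.10.0` correctly sorts after `1.9.0`, which a plain string comparison would get wrong.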

AIMClusterModelSourceStatus#

AIMClusterModelSourceStatus defines the observed state of AIMClusterModelSource.

Appears in:

AIMClusterModelSource

Field	Description	Validation
`status` string	Status represents the overall state of the model source.	Enum: [Pending Starting Progressing Ready Running Degraded NotAvailable Failed] Optional: {}
`lastSyncTime` Time	LastSyncTime is the timestamp of the last successful registry sync. Updated after each successful sync operation.	Optional: {}
`discoveredModels` integer	DiscoveredModels is the count of AIMClusterModel resources managed by this source. Includes both existing and newly created models.	Optional: {}
`availableModels` integer	AvailableModels is the total count of images discovered in the registry that match the filters. This may be higher than DiscoveredModels if maxModels limit was reached.	Optional: {}
`modelsLimitReached` boolean	ModelsLimitReached indicates whether the maxModels limit has been reached. When true, no new models will be created even if more matching images are discovered.	Optional: {}
`conditions` Condition array	Conditions represent the latest available observations of the source’s state. Standard conditions: Ready, Syncing, RegistryReachable.	Optional: {}
`observedGeneration` integer	ObservedGeneration reflects the generation of the most recently observed spec.	Optional: {}

AIMClusterRuntimeConfig#

AIMClusterRuntimeConfig is a cluster-scoped runtime configuration for AIM services, models, and templates.

Cluster-scoped runtime configs provide platform-wide defaults that apply to all namespaces, making them ideal for organization-level policies such as storage classes, discovery behavior, model creation scope, and routing configuration.

When both cluster and namespace runtime configs exist with the same name, the configs are merged, and the namespace-scoped AIMRuntimeConfig takes precedence for any field that is set in both.

Appears in:

AIMClusterRuntimeConfigList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterRuntimeConfig`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMClusterRuntimeConfigSpec
`status` AIMRuntimeConfigStatus
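
The merge rule for same-named cluster and namespace runtime configs can be pictured as a recursive dictionary merge in which the namespace value wins. This sketch assumes specs are plain dictionaries and that nested objects merge key by key. The merge granularity inside nested objects is an assumption, not something this reference states.

```python
def merge_runtime_config(cluster: dict, namespace: dict) -> dict:
    """Merge two runtime config specs; namespace values take precedence."""
    merged = dict(cluster)
    for key, ns_value in namespace.items():
        cluster_value = merged.get(key)
        if isinstance(ns_value, dict) and isinstance(cluster_value, dict):
            # Both sides set an object: merge its fields individually.
            merged[key] = merge_runtime_config(cluster_value, ns_value)
        else:
            merged[key] = ns_value
    return merged
```

Fields set only on the cluster config survive the merge unchanged, which is what lets a namespace override a single setting without restating the rest.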

AIMClusterRuntimeConfigList#

AIMClusterRuntimeConfigList contains a list of AIMClusterRuntimeConfig.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterRuntimeConfigList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterRuntimeConfig array

AIMClusterRuntimeConfigSpec#

AIMClusterRuntimeConfigSpec defines cluster-wide defaults for AIM resources.

Appears in:

AIMClusterRuntimeConfig

Field	Description	Validation
`storage` AIMStorageConfig	Storage configures storage defaults for this service’s PVCs and caches. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`routing` AIMRuntimeRoutingConfig	Routing controls HTTP routing configuration for this service. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. When set on AIMService, these take highest precedence in the merge hierarchy. When set on RuntimeConfig, these provide namespace/cluster-level defaults. Merge order (highest to lowest): Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env	Optional: {}
`model` AIMModelConfig	Model controls model creation and discovery defaults. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`artifact` AIMArtifactConfig	Artifact controls artifact-level defaults such as eviction policy. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`labelPropagation` AIMRuntimeConfigLabelPropagationSpec	LabelPropagation controls how labels from parent AIM resources are propagated to child resources. When enabled, labels matching the specified patterns are automatically copied from parent resources (e.g., AIMService, AIMTemplateCache) to their child resources (e.g., Deployments, Services, PVCs). This is useful for propagating organizational metadata like cost centers, team identifiers, or compliance labels through the resource hierarchy.	Optional: {}
`defaultStorageClassName` string	DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.DefaultStorageClassName is not set, the value will be automatically migrated.	Optional: {}
`pvcHeadroomPercent` integer	DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.PVCHeadroomPercent is not set, the value will be automatically migrated.	Optional: {}
`artifactStorageQuota` AIMArtifactStorageQuota	ArtifactStorageQuota configures storage limits for AIMArtifacts. These limits control how much total PVC storage artifacts may consume, both cluster-wide and per-namespace.	Optional: {}
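
The env merge order (Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env) amounts to layering lists of EnvVar entries by name. A minimal sketch, assuming each entry is a dictionary with a `name` key as in the Kubernetes EnvVar type; the function name is hypothetical.

```python
def merge_env(service=(), template=(), runtime_config=(), profile=()) -> list[dict]:
    """Merge EnvVar lists; for duplicate names the highest-precedence layer wins."""
    merged: dict[str, dict] = {}
    # Apply the lowest-precedence layer first so later layers overwrite it.
    for layer in (profile, runtime_config, template, service):
        for var in layer:
            merged[var["name"]] = var
    return list(merged.values())
```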

AIMClusterServiceTemplate#

AIMClusterServiceTemplate is a cluster-scoped template that defines runtime profiles for AIM services.

Cluster-scoped templates can be used by AIMServices in any namespace, making them ideal for platform-wide model configurations that should be shared across teams and projects. Unlike namespace-scoped AIMServiceTemplates, cluster templates do not support caching configuration and must be managed by cluster administrators, since caches themselves are namespace-scoped.

When both cluster and namespace templates exist with the same name, the namespace-scoped template takes precedence for services in that namespace.

Appears in:

AIMClusterServiceTemplateList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterServiceTemplate`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMClusterServiceTemplateSpec
`status` AIMServiceTemplateStatus
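
Unlike runtime configs, same-named templates are not merged: the namespace-scoped template simply shadows the cluster-scoped one. A small sketch of that lookup, using in-memory dictionaries as stand-ins for the API server; all names are hypothetical.

```python
from typing import Optional


def resolve_template(name: str,
                     namespace: str,
                     namespace_templates: dict,
                     cluster_templates: dict) -> Optional[dict]:
    """Return the namespace template if present, else the cluster template."""
    # namespace_templates is keyed by (namespace, name); cluster_templates by name.
    template = namespace_templates.get((namespace, name))
    if template is not None:
        return template
    return cluster_templates.get(name)
```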

AIMClusterServiceTemplateList#

AIMClusterServiceTemplateList contains a list of AIMClusterServiceTemplate.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMClusterServiceTemplateList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMClusterServiceTemplate array

AIMClusterServiceTemplateSpec#

AIMClusterServiceTemplateSpec defines the desired state of AIMClusterServiceTemplate (cluster-scoped).

A cluster-scoped template that selects a runtime profile for a given AIM model.

Appears in:

AIMClusterServiceTemplate

Field	Description	Validation
`modelName` string	ModelName is the model name. Matches `metadata.name` of an AIMModel or AIMClusterModel. Immutable. Example: `meta/llama-3-8b:1.1+20240915`	MinLength: 1
`metric` AIMMetric	Metric selects the optimization goal. - `latency`: prioritize low end‑to‑end latency - `throughput`: prioritize sustained requests/second	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision selects the numeric precision used by the runtime.	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for each replica. For GPU models, defines the GPU count and model types required for deployment. For CPU-only models, defines CPU resource requirements. This field is immutable after creation.	Optional: {}
`runtimeConfigName` string	Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both as a namespace and a cluster runtime config, the values are merged together, the namespace config taking priority over the cluster config when there are conflicts. If this field is empty or set to `default`, the namespace / cluster runtime config with the name `default` is used, if it exists.	Optional: {}
`aimId` string	AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). Required when customProfile is set; used to assemble the profile YAML aim_id field and to compute the custom profile ID for AIM_PROFILE_ID.	Optional: {}
`modelId` string	ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). Required when customProfile is set; used for profile YAML model_id field and for weight pre-caching via the discovery job.	Optional: {}
`customProfile` AIMCustomProfile	CustomProfile defines inline custom profile data for the inference engine. When set, the controller assembles a profile YAML from this data and template metadata, creates a ConfigMap, and mounts it into discovery and inference containers. Requires aimId, modelId, hardware, metric, and precision to also be set.	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets containing credentials for pulling container images. These secrets are used for discovery dry-run jobs that inspect the model container and for pulling the image for inference services. The secrets are merged with any model or runtime config defaults. For namespace-scoped templates, secrets must exist in the same namespace. For cluster-scoped templates, secrets must exist in the operator namespace.	Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. This includes discovery dry-run jobs and inference services created from this template. If empty, the default service account for the namespace is used.	Optional: {}
`resources` ResourceRequirements	Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults.	Optional: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources required to run this template. When provided, the discovery dry-run will be skipped and these sources will be used directly. This allows users to explicitly declare model dependencies without requiring a discovery job. If omitted, a discovery job will be run to automatically determine the required model sources.	Optional: {}
`profileId` string	ProfileId is the specific AIM profile ID that this template should use. When set, the discovery job will be instructed to use this specific profile.	Optional: {}
`type` AIMProfileType	Type indicates the optimization level of this template. - optimized: Template has been tuned for performance - preview: Template is experimental/pre-release - unoptimized: Default, no specific optimizations applied. When nil, the type is determined by discovery. When set, it overrides discovery.	Enum: [optimized preview unoptimized] Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. These variables are passed to the inference runtime and can be used to configure runtime behavior, authentication, or other settings.	Optional: {}
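
Because customProfile depends on several other fields, a pre-flight check can catch incomplete specs early. The sketch below encodes only the requirement stated in the table (aimId, modelId, hardware, metric, and precision must accompany customProfile). It treats the spec as a plain dictionary, and the helper name is hypothetical.

```python
REQUIRED_WITH_CUSTOM_PROFILE = ("aimId", "modelId", "hardware", "metric", "precision")


def missing_custom_profile_fields(spec: dict) -> list[str]:
    """Return the fields that customProfile requires but the spec leaves unset."""
    if not spec.get("customProfile"):
        return []
    return [field for field in REQUIRED_WITH_CUSTOM_PROFILE if not spec.get(field)]
```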

AIMCpuRequirements#

AIMCpuRequirements specifies CPU resource requirements.

Appears in:

AIMHardwareRequirements

Field	Description	Default	Validation
`requests` Quantity	Requests is the number of CPU cores to request. Required and must be > 0.		Required: {}
`limits` Quantity	Limits is the maximum number of CPU cores to allow.		Optional: {}

AIMCustomModelSpec#

AIMCustomModelSpec contains configuration for custom models. These fields are only used when modelSources is specified (custom models). For image-based models, these settings come from discovery.

Appears in:

AIMModelSpec

Field	Description	Default	Validation
`hardware` AIMHardwareRequirements	Hardware specifies default hardware requirements for all templates. Individual templates can override these defaults. Required when modelSources is set and customTemplates is empty.		Optional: {}
`type` AIMProfileType	Type specifies default type for all templates. Individual templates can override this default. When nil, templates default to “unoptimized”.		Enum: [optimized preview unoptimized] Optional: {}

AIMCustomProfile#

AIMCustomProfile defines inline custom profile data for user-provided inference engine configuration. When set on a template, the controller assembles a profile YAML, creates a ConfigMap, and mounts it into both the discovery job and the inference service container.

Appears in:

AIMClusterServiceTemplateSpec
AIMCustomTemplate
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon

Field	Description	Default	Validation
`engineArgs` JSON	EngineArgs contains inference engine arguments as a free-form JSON object. These are passed as CLI arguments to the inference engine (e.g., vLLM). Do not include “model” — it is injected separately by the runtime.		Schemaless: {} Optional: {}
`envVars` object (keys:string, values:string)	EnvVars contains environment variables applied to the inference engine process. These are written into the profile YAML and applied by the AIM runtime via os.execv, distinct from container-level Env which targets the AIM runtime container itself. Keys must match ^[A-Z0-9_]+$ (uppercase with underscores).		Optional: {}
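
The two constraints in this table, the envVars key pattern and the ban on a "model" engine argument, are easy to check before submitting a template. A sketch with a hypothetical helper name, treating the profile as a plain dictionary:

```python
import re

# Key pattern copied from the envVars description.
ENV_KEY_RE = re.compile(r"^[A-Z0-9_]+$")


def custom_profile_problems(profile: dict) -> list[str]:
    """Return human-readable problems found in an AIMCustomProfile-like dict."""
    problems = []
    if "model" in (profile.get("engineArgs") or {}):
        problems.append('engineArgs must not include "model"')
    for key in (profile.get("envVars") or {}):
        if not ENV_KEY_RE.fullmatch(key):
            problems.append(f"invalid envVars key: {key}")
    return problems
```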

AIMCustomTemplate#

AIMCustomTemplate defines a custom template configuration for a model. When modelSources are specified directly on AIMModel, customTemplates allow defining explicit hardware requirements and profiles, skipping the discovery job. This is an existing struct (not a CRD); it appears as an element of AIMModel.spec.customTemplates[].

Appears in:

AIMModelSpec

Field	Description	Default	Validation
`name` string	Name is the template name. If not provided, auto-generated from model name + profile.		MaxLength: 63 Optional: {}
`type` AIMProfileType	Type indicates the optimization status of this template. - optimized: Template has been tuned for performance - preview: Template is experimental/pre-release - unoptimized: Default, no specific optimizations applied	unoptimized	Enum: [optimized preview unoptimized] Optional: {}
`env` EnvVar array	Env specifies environment variable overrides when this template is selected. These are container-level env vars applied to the AIM runtime container.		MaxItems: 64 Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for this template. Optional when spec.hardware is set (inherits from spec). When both are set, values are merged field-by-field with template taking precedence.		Optional: {}
`profile` AIMTemplateProfile	Profile declares runtime profile variables for template selection. Used when multiple templates exist to select based on metric/precision.		Optional: {}
`aimId` string	AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). Required when customProfile is set.		Optional: {}
`modelId` string	ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). Required when customProfile is set.		Optional: {}
`customProfile` AIMCustomProfile	CustomProfile defines inline custom profile data for the inference engine. When set, the resulting template will have a custom profile ConfigMap mounted. Requires aimId, modelId, hardware, profile.metric, and profile.precision.		Optional: {}

AIMDiscoveryProfileMetadata#

AIMDiscoveryProfileMetadata describes the characteristics of a discovered deployment profile.

Appears in:

AIMDiscoveryProfile

Field	Description	Validation
`engine` string	Engine identifies the inference engine used for this profile (e.g., “vllm”, “tgi”).	Optional: {}
`gpu` string	GPU specifies the GPU model this profile is optimized for (e.g., “MI300X”, “MI325X”).	Optional: {}
`gpu_count` integer	GPUCount indicates how many GPUs are required per replica for this profile.	Optional: {}
`metric` AIMMetric	Metric indicates the optimization goal for this profile (“latency” or “throughput”).	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision specifies the numeric precision used in this profile (e.g., “fp16”, “fp8”).	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`type` AIMProfileType	Type specifies the optimization level of this profile (optimized, unoptimized, preview).	Enum: [optimized preview unoptimized] Optional: {}

AIMDownloadFilter#

AIMDownloadFilter controls which files are included or excluded during artifact downloads. Patterns use fnmatch-style glob syntax applied against relative file paths in the repository. Both the size estimator and downloader apply the same filter, ensuring PVC sizing matches the actual download.

Filter order (matching huggingface_hub behavior):

Include: if set, only files matching at least one include pattern are considered
Exclude: files matching any exclude pattern are then removed

When no filter is configured (neither on the artifact nor in the runtime config), subdirectory files are excluded by default (equivalent to `exclude: ["*/*"]`). To download all files including subdirectories, set an empty filter: `downloadFilter: {}`.

Appears in:

AIMArtifactSpec
AIMStorageConfig

Field	Description	Default	Validation
`include` string array	Include specifies glob patterns for files to download. Only files matching at least one pattern are considered. If empty, all files pass the include check. Patterns use fnmatch syntax (e.g., [“*.safetensors”, “config.json”]).		Optional: {}
`exclude` string array	Exclude specifies glob patterns for files to skip. Files matching any exclude pattern are removed after include filtering. Patterns use fnmatch syntax (e.g., `["*/*", "*.bin"]`). Use `["*/*"]` to exclude all files in subdirectories (the default when no filter is set).		Optional: {}
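
The include-then-exclude order can be reproduced with Python's standard `fnmatch` module. The sketch below assumes the default exclusion is the glob `*/*` and represents the filter as a dictionary, with `None` meaning "no filter configured anywhere". The function name is hypothetical, and `fnmatchcase` is used so the result does not depend on the host platform's case rules.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Assumed default when neither the artifact nor the runtime config sets a filter.
DEFAULT_EXCLUDE = ["*/*"]


def select_files(paths: list[str], download_filter: Optional[dict] = None) -> list[str]:
    """Apply include patterns first, then exclude patterns, to relative paths."""
    if download_filter is None:
        include, exclude = [], DEFAULT_EXCLUDE
    else:
        include = download_filter.get("include") or []
        exclude = download_filter.get("exclude") or []
    selected = []
    for path in paths:
        # An empty include list lets every file pass the include check.
        if include and not any(fnmatchcase(path, p) for p in include):
            continue
        if any(fnmatchcase(path, p) for p in exclude):
            continue
        selected.append(path)
    return selected
```

Note that in fnmatch syntax `*` also matches `/`, which is why `*/*` catches files at any subdirectory depth and why an empty filter (`{}`) is needed to download everything.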

AIMGpuRequirements#

AIMGpuRequirements specifies GPU resource requirements.

Appears in:

AIMHardwareRequirements

Field	Description	Default	Validation
`requests` integer	Requests is the number of GPUs to set as requests/limits. Set to 0 to target GPU nodes without consuming GPU resources (useful for testing).		Minimum: 0 Optional: {}
`model` string	Model limits deployment to a specific GPU model. Example: “MI300X” Cannot be combined with minVram.		MaxLength: 64 Optional: {}
`minVram` Quantity	MinVRAM limits deployment to GPUs having at least this much VRAM. Used for capacity planning when the model size is known but any GPU with sufficient VRAM is acceptable. Cannot be combined with model.		Optional: {}
`resourceName` string	ResourceName is the Kubernetes resource name for GPU resources. Defaults to “amd.com/gpu” if not specified.	amd.com/gpu	Optional: {}
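
The constraints in this table (model and minVram are mutually exclusive, requests is non-negative, resourceName defaults to amd.com/gpu) can be summarized in a small validator. This is an illustrative sketch over a plain dictionary; the function name is hypothetical and the real validation happens in the API server and controller.

```python
def gpu_resource_name(gpu: dict) -> str:
    """Validate an AIMGpuRequirements-like dict and return its resource name."""
    if gpu.get("model") and gpu.get("minVram"):
        raise ValueError("model and minVram cannot be combined")
    if gpu.get("requests", 0) < 0:
        raise ValueError("requests must be >= 0")
    if len(gpu.get("model", "")) > 64:
        raise ValueError("model must be at most 64 characters")
    # Fall back to the documented default resource name.
    return gpu.get("resourceName") or "amd.com/gpu"
```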

AIMHardwareRequirements#

AIMHardwareRequirements specifies compute resource requirements for custom models. Used in AIMModelSpec and AIMCustomTemplate to define GPU and CPU needs.

Appears in:

AIMClusterServiceTemplateSpec
AIMCustomModelSpec
AIMCustomTemplate
AIMRuntimeParameters
AIMServiceModelCustom
AIMServiceOverrides
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon
AIMServiceTemplateStatus

Field	Description	Default	Validation
`gpu` AIMGpuRequirements	GPU specifies GPU requirements. If not set, no GPUs are requested (CPU-only model).		Optional: {}
`cpu` AIMCpuRequirements	CPU specifies CPU requirements.		Optional: {}

AIMMetric#

Underlying type: string

AIMMetric enumerates the targeted service characteristic.

Validation:

Enum: [latency throughput]

Appears in:

AIMClusterServiceTemplateSpec
AIMDiscoveryProfileMetadata
AIMProfileMetadata
AIMRuntimeParameters
AIMServiceOverrides
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon
AIMTemplateProfile

Field	Description
`latency`
`throughput`

AIMModel#

AIMModel is the schema for namespace-scoped AIM model catalog entries.

Appears in:

AIMModelList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMModel`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMModelSpec
`status` AIMModelStatus

AIMModelConfig#

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec

Field	Description	Default	Validation
`autoDiscovery` boolean	AutoDiscovery controls whether models run discovery by default. When true, models run discovery jobs to extract metadata and auto-create templates. When false, discovery is skipped. Discovery failures are non-fatal and reported via conditions.		Optional: {}

AIMModelDiscoveryConfig#

AIMModelDiscoveryConfig controls discovery behavior for a model.

Appears in:

AIMModelSpec

Field	Description	Default	Validation
`extractMetadata` boolean	ExtractMetadata controls whether metadata extraction runs for this model. During metadata extraction, the controller connects to the image registry and extracts the image’s labels.	true	Optional: {}
`createServiceTemplates` boolean	CreateServiceTemplates controls whether (cluster) service templates are auto-created from the image metadata.	true	Optional: {}

AIMModelList#

AIMModelList contains a list of AIMModel.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMModelList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMModel array

AIMModelSource#

AIMModelSource describes a model artifact that must be downloaded for inference. Discovery extracts these from the container’s configuration to enable caching and validation.

Appears in:

AIMClusterServiceTemplateSpec
AIMModelSpec
AIMServiceModelCustom
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon
AIMServiceTemplateStatus
AIMTemplateCacheSpec

Field	Description	Validation
`modelId` string	ModelID is the canonical identifier in {org}/{name} format. It determines the cache mount path: /workspace/cache/{modelId}. For Hugging Face sources, this typically mirrors the URI path (e.g., meta-llama/Llama-3-8B). For S3 sources, users define their own organizational structure.	Pattern: `^[a-zA-Z0-9_-]+/[a-zA-Z0-9._-]+$` Required: {}
`sourceUri` string	SourceURI is the location from which the model should be downloaded. Supported schemes: hf://org/model (Hugging Face Hub model) and s3://bucket/key (S3-compatible storage).	Pattern: `^(hf|s3)://[^ \t\r\n]+$`
`size` Quantity	Size is the expected storage space required for this model artifact. Used for PVC sizing and capacity planning during cache creation. Optional - if not specified, the download job will discover the size automatically. Can be set explicitly to pre-allocate storage or override auto-discovery.	Optional: {}
`env` EnvVar array	Env specifies per-source credential overrides. These variables are used for authentication when downloading this specific source. Takes precedence over base-level env for the same variable name.	Optional: {}
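
As a rough illustration of the `modelId` and `sourceUri` patterns and of how the cache mount path follows from `modelId`, consider the sketch below. The helper names are hypothetical; the regular expressions are the ones from the Validation column, with the anchors replaced by `fullmatch`.

```python
import re

# Patterns from the Validation column; fullmatch() stands in for ^...$.
MODEL_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+/[a-zA-Z0-9._-]+")
SOURCE_URI_PATTERN = re.compile(r"(hf|s3)://[^ \t\r\n]+")


def cache_mount_path(model_id):
    """Hypothetical helper: derive the documented cache mount path from a modelId."""
    if MODEL_ID_PATTERN.fullmatch(model_id) is None:
        raise ValueError(f"modelId must be in {{org}}/{{name}} format: {model_id!r}")
    return f"/workspace/cache/{model_id}"


def is_supported_source_uri(uri):
    """Return True when the URI uses one of the supported schemes (hf or s3)."""
    return SOURCE_URI_PATTERN.fullmatch(uri) is not None
```

With these helpers, `cache_mount_path("meta-llama/Llama-3-8B")` yields `/workspace/cache/meta-llama/Llama-3-8B`, and a URI such as `https://example.com/model` is rejected.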

AIMModelSourceType#

Underlying type: string

AIMModelSourceType indicates how a model’s artifacts are sourced.

Validation:

Enum: [Image Custom]

Appears in:

AIMModelStatus

Field	Description
`Image`	AIMModelSourceTypeImage indicates the model is discovered from container image labels.
`Custom`	AIMModelSourceTypeCustom indicates the model uses explicit spec.modelSources.

AIMModelSpec#

AIMModelSpec defines the desired state of AIMModel.

Appears in:

AIMClusterModel
AIMModel

Field	Description	Validation
`image` string	Image is the container image URI for this AIM model. This image is inspected by the operator to select runtime profiles used by templates. Discovery behavior is controlled by the discovery field and runtime config’s AutoDiscovery setting.	MinLength: 1
`discovery` AIMModelDiscoveryConfig	Discovery controls discovery behavior for this model. When unset, uses runtime config defaults.	Optional: {}
`defaultServiceTemplate` string	DefaultServiceTemplate specifies the default AIMServiceTemplate to use when creating services for this model. When set, services that reference this model will use this template if no template is explicitly specified. If this is not set, a template will be automatically selected.	Optional: {}
`custom` AIMCustomModelSpec	Custom contains configuration for custom models (models with inline modelSources). Only used when modelSources are specified; ignored for image-based models.	Optional: {}
`customTemplates` AIMCustomTemplate array	CustomTemplates defines explicit template configurations for this model. These templates are created directly without running a discovery job. Can be used with or without modelSources to define custom deployment configurations. If omitted when modelSources is set, a single template is auto-generated using the custom.hardware requirements.	MaxItems: 16 Optional: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources to use for this model. When specified, these sources are used instead of auto-discovery from the container image. This enables pre-creating custom models with explicit model sources. The size field is optional - if not specified, it will be discovered by the download job. AIM runtime currently supports only one model source.	MaxItems: 1 Optional: {}
`runtimeConfigName` string	Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both as a namespace and a cluster runtime config, the values are merged together, the namespace config taking priority over the cluster config when there are conflicts. If this field is empty or set to `default`, the namespace / cluster runtime config with the name `default` is used, if it exists.	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets containing credentials for pulling the model container image. These secrets are used for OCI registry metadata extraction during discovery and for pulling the image for inference services. The secrets are merged with any runtime config defaults. For namespace-scoped models, secrets must exist in the same namespace. For cluster-scoped models, secrets must exist in the operator namespace.	Optional: {}
`env` EnvVar array	Env specifies environment variables for authentication during model discovery and metadata extraction. These variables are used for authentication with model registries (e.g., Hugging Face tokens).	Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the Kubernetes service account to use for workloads related to this model. This includes metadata extraction jobs and any other model-related operations. If empty, the default service account for the namespace is used.	Optional: {}
`resources` ResourceRequirements	Resources defines the default resource requirements for services using this model. Template- or service-level values override these defaults.	Optional: {}
`imageMetadata` ImageMetadata	ImageMetadata is the metadata that is used to determine which recommended service templates to create, and to provide clients with richer metadata about this particular model. In most cases the user does not need to set this field manually: for images that have the supported labels embedded in them, the `AIM(Cluster)Model.status.imageMetadata` field is filled automatically from the container image labels. This field is intended for situations such as restricted network access. If this field is set, remote extraction is not performed at all.
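
The resolution rule described for `runtimeConfigName` (namespace config wins over cluster config on conflicts, with `default` as the fallback name) might look like the sketch below. Everything here is an assumption for illustration: the function names are hypothetical, configs are modeled as plain dicts keyed by name, the merge is assumed to be recursive per field, and the storage field names and values in the usage example are guesses.

```python
def merge(cluster_spec, namespace_spec):
    """Assumed merge: namespace values override cluster values, recursing into dicts."""
    merged = dict(cluster_spec)
    for key, value in namespace_spec.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_runtime_config(name, namespace_configs, cluster_configs):
    """Hypothetical resolver returning (spec, scope), or (None, None) if nothing matches."""
    name = name or "default"  # an empty name falls back to "default"
    ns_spec = namespace_configs.get(name)
    cluster_spec = cluster_configs.get(name)
    if ns_spec is not None and cluster_spec is not None:
        return merge(cluster_spec, ns_spec), "Merged"
    if ns_spec is not None:
        return ns_spec, "Namespace"
    if cluster_spec is not None:
        return cluster_spec, "Cluster"
    return None, None
```

The returned scope strings mirror the AIMResolutionScope values `Namespace`, `Cluster`, and `Merged`.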

AIMModelStatus#

AIMModelStatus defines the observed state of AIMModel.

Appears in:

AIMClusterModel
AIMModel

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller
`status` AIMStatus	Status represents the overall status of the image based on its templates	Pending	Enum: [Pending Progressing Ready Degraded Failed NotAvailable]
`conditions` Condition array	Conditions represent the latest available observations of the model’s state
`resolvedRuntimeConfig` AIMResolvedReference	ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.		Optional: {}
`imageMetadata` ImageMetadata	ImageMetadata is the metadata extracted from an AIM image		Optional: {}
`sourceType` AIMModelSourceType	SourceType indicates how this model’s artifacts are sourced: “Image” means the model is discovered from container image labels; “Custom” means the model uses explicit spec.modelSources. Set by the controller based on whether spec.modelSources is populated.		Enum: [Image Custom] Optional: {}

AIMPrecision#

Underlying type: string

AIMPrecision enumerates supported numeric precisions

Validation:

Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]

Appears in:

AIMClusterServiceTemplateSpec
AIMDiscoveryProfileMetadata
AIMProfileMetadata
AIMRuntimeParameters
AIMServiceOverrides
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon
AIMTemplateProfile

Field	Description
`auto`
`fp4`
`fp8`
`fp16`
`fp32`
`bf16`
`int4`
`int8`

AIMProfile#

AIMProfile contains the cached discovery results for a template. This is the processed and validated version of AIMDiscoveryProfile that is stored in the template’s status after successful discovery.

The profile serves as a cache of runtime configuration, eliminating the need to re-run discovery for each service that uses this template. Services and caching mechanisms reference this cached profile for deployment parameters and model sources.

See discovery.go for AIMDiscoveryProfile (the raw discovery output) and the relationship between these types.

Appears in:

AIMServiceTemplateStatus

Field	Description	Validation
`engine_args` JSON	EngineArgs contains runtime-specific engine configuration as a free-form JSON object. The structure depends on the inference engine being used (e.g., vLLM, TGI). These arguments are passed to the runtime container to configure model loading and inference.	Schemaless: {}
`env_vars` object (keys:string, values:string)	EnvVars contains environment variables required by the runtime for this profile. These may include engine-specific settings, optimization flags, or hardware configuration.	Optional: {}
`metadata` AIMProfileMetadata	Metadata describes the characteristics of this cached deployment profile. See AIMProfileMetadata for its fields.
`originalDiscoveryOutput` JSON	OriginalDiscoveryOutput contains the raw discovery job JSON output. This preserves the complete discovery result from the dry-run container, including all fields that may not be mapped to structured fields above.	Schemaless: {} Optional: {}

AIMProfileMetadata#

AIMProfileMetadata describes the characteristics of a cached deployment profile. This is identical to AIMDiscoveryProfileMetadata but exists in the template status namespace.

Appears in:

AIMProfile

Field	Description	Validation
`engine` string	Engine identifies the inference engine used for this profile (e.g., “vllm”, “tgi”).	Optional: {}
`gpu` string	GPU specifies the GPU model this profile is optimized for (e.g., “MI300X”, “MI325X”).	Optional: {}
`gpuCount` integer	GPUCount indicates how many GPUs are required per replica for this profile.	Optional: {}
`metric` AIMMetric	Metric indicates the optimization goal for this profile (“latency” or “throughput”).	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision specifies the numeric precision used in this profile (e.g., “fp16”, “fp8”).	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`type` AIMProfileType	Type indicates the optimization level of this profile (optimized, preview, unoptimized).	Enum: [optimized preview unoptimized] Optional: {}

AIMProfileType#

Underlying type: string

AIMProfileType indicates the optimization level of a deployment profile.

Validation:

Enum: [optimized preview unoptimized]

Appears in:

AIMClusterServiceTemplateSpec
AIMCustomModelSpec
AIMCustomTemplate
AIMDiscoveryProfileMetadata
AIMProfileMetadata
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon

Field	Description
`optimized`	AIMProfileTypeOptimized indicates the profile has been fully optimized.
`preview`	AIMProfileTypePreview indicates the profile is in preview/beta state.
`unoptimized`	AIMProfileTypeUnoptimized indicates the profile has not been optimized.

AIMResolutionScope#

Underlying type: string

AIMResolutionScope describes the scope of a resolved reference.

Validation:

Enum: [Namespace Cluster Merged Unknown]

Appears in:

AIMResolvedReference

Field	Description
`Namespace`	AIMResolutionScopeNamespace denotes a namespace-scoped resource.
`Cluster`	AIMResolutionScopeCluster denotes a cluster-scoped resource.
`Merged`	AIMResolutionScopeMerged denotes that both cluster and namespace configs were merged.
`Unknown`	AIMResolutionScopeUnknown denotes that the scope could not be determined.

AIMResolvedArtifact#

Appears in:

AIMTemplateCacheStatus

Field	Description	Default	Validation
`uid` string	UID of the AIMArtifact resource
`name` string	Name of the AIMArtifact resource
`model` string	Model is the name of the model that is cached
`status` AIMStatus	Status of the artifact
`persistentVolumeClaim` string	PersistentVolumeClaim name if available
`mountPoint` string	MountPoint is the mount point for the artifact

AIMResolvedReference#

AIMResolvedReference captures metadata about a resolved reference.

Appears in:

AIMModelStatus
AIMServiceCacheStatus
AIMServiceStatus
AIMServiceTemplateStatus
AIMTemplateCacheStatus

Field	Description	Validation
`name` string	Name is the resource name that satisfied the reference.
`namespace` string	Namespace identifies where the resource was found when namespace-scoped. Empty indicates a cluster-scoped resource.
`scope` AIMResolutionScope	Scope indicates whether the resolved resource was namespace or cluster scoped.	Enum: [Namespace Cluster Merged Unknown]
`kind` string	Kind is the fully-qualified kind of the resolved reference, when known.	Optional: {}
`uid` UID	UID captures the unique identifier of the resolved reference, when known.	Optional: {}

AIMRuntimeConfig#

AIMRuntimeConfig is the schema for namespace-scoped AIM runtime configurations.

Appears in:

AIMRuntimeConfigList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMRuntimeConfig`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMRuntimeConfigSpec
`status` AIMRuntimeConfigStatus

AIMRuntimeConfigCommon#

AIMRuntimeConfigCommon captures configuration fields shared across cluster and namespace scopes. These settings apply to both AIMRuntimeConfig (namespace-scoped) and AIMClusterRuntimeConfig (cluster-scoped). It embeds AIMServiceRuntimeConfig which contains fields that can also be overridden at the service level.

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigSpec

Field	Description	Validation
`storage` AIMStorageConfig	Storage configures storage defaults for this service’s PVCs and caches. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`routing` AIMRuntimeRoutingConfig	Routing controls HTTP routing configuration for this service. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. When set on AIMService, these take highest precedence in the merge hierarchy. When set on RuntimeConfig, these provide namespace/cluster-level defaults. Merge order (highest to lowest): Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env	Optional: {}
`model` AIMModelConfig	Model controls model creation and discovery defaults. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`artifact` AIMArtifactConfig	Artifact controls artifact-level defaults such as eviction policy. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`labelPropagation` AIMRuntimeConfigLabelPropagationSpec	LabelPropagation controls how labels from parent AIM resources are propagated to child resources. When enabled, labels matching the specified patterns are automatically copied from parent resources (e.g., AIMService, AIMTemplateCache) to their child resources (e.g., Deployments, Services, PVCs). This is useful for propagating organizational metadata like cost centers, team identifiers, or compliance labels through the resource hierarchy.	Optional: {}
`defaultStorageClassName` string	DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.DefaultStorageClassName is not set, the value will be automatically migrated.	Optional: {}
`pvcHeadroomPercent` integer	DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.PVCHeadroomPercent is not set, the value will be automatically migrated.	Optional: {}
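
The merge order documented for `env` (Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env) can be sketched as below. This is a simplification under stated assumptions: `merge_env` is a hypothetical helper, every layer is modeled as a list of `{"name", "value"}` dicts, and merging is assumed to happen per variable name.

```python
def merge_env(service=(), template=(), runtime_config=(), profile=()):
    """Hypothetical merge of EnvVar-style lists, keyed by variable name."""
    merged = {}
    # Apply layers from lowest to highest precedence so later layers win.
    for layer in (profile, runtime_config, template, service):
        for var in layer:
            merged[var["name"]] = var
    return list(merged.values())
```

A variable set on the service therefore replaces the same variable from the runtime config, while variables defined only in lower layers are kept.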

AIMRuntimeConfigLabelPropagationSpec#

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec

Field	Description	Default	Validation
`enabled` boolean	Enabled, if true, allows propagating parent labels to all child resources it creates directly. Only label keys that match the ones in Match are propagated.	false	Optional: {}
`match` string array	Match is a list of label keys that will be propagated to any child resources created. Wildcards are supported, so for example `org.my/my-key-*` would match any label with that prefix.		Optional: {}
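
A minimal sketch of the label selection described above, assuming the wildcard in `match` behaves like a shell-style glob (here approximated with Python's `fnmatch`). `propagated_labels` is a hypothetical helper, not part of the API.

```python
from fnmatch import fnmatchcase


def propagated_labels(parent_labels, enabled=False, match=()):
    """Hypothetical helper: select the parent labels that would be copied to children."""
    if not enabled:
        # Propagation is off by default, so nothing is copied.
        return {}
    return {
        key: value
        for key, value in parent_labels.items()
        if any(fnmatchcase(key, pattern) for pattern in match)
    }
```

With `match` set to `["org.my/my-key-*"]`, a label such as `org.my/my-key-team` is propagated while `app` is not.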

AIMRuntimeConfigList#

AIMRuntimeConfigList contains a list of AIMRuntimeConfig.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMRuntimeConfigList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMRuntimeConfig array

AIMRuntimeConfigSpec#

AIMRuntimeConfigSpec defines namespace-scoped overrides for AIM resources.

Appears in:

AIMRuntimeConfig

Field	Description	Validation
`storage` AIMStorageConfig	Storage configures storage defaults for this service’s PVCs and caches. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`routing` AIMRuntimeRoutingConfig	Routing controls HTTP routing configuration for this service. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. When set on AIMService, these take highest precedence in the merge hierarchy. When set on RuntimeConfig, these provide namespace/cluster-level defaults. Merge order (highest to lowest): Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env	Optional: {}
`model` AIMModelConfig	Model controls model creation and discovery defaults. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`artifact` AIMArtifactConfig	Artifact controls artifact-level defaults such as eviction policy. This field only applies to RuntimeConfig/ClusterRuntimeConfig and is not available for services.	Optional: {}
`labelPropagation` AIMRuntimeConfigLabelPropagationSpec	LabelPropagation controls how labels from parent AIM resources are propagated to child resources. When enabled, labels matching the specified patterns are automatically copied from parent resources (e.g., AIMService, AIMTemplateCache) to their child resources (e.g., Deployments, Services, PVCs). This is useful for propagating organizational metadata like cost centers, team identifiers, or compliance labels through the resource hierarchy.	Optional: {}
`defaultStorageClassName` string	DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.DefaultStorageClassName is not set, the value will be automatically migrated.	Optional: {}
`pvcHeadroomPercent` integer	DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. For backward compatibility, if this field is set and Storage.PVCHeadroomPercent is not set, the value will be automatically migrated.	Optional: {}

AIMRuntimeConfigStatus#

AIMRuntimeConfigStatus records the resolved config reference surfaced to consumers.

Appears in:

AIMClusterRuntimeConfig
AIMRuntimeConfig

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the last reconciled generation.
`conditions` Condition array	Conditions communicate reconciliation progress.

AIMRuntimeParameters#

AIMRuntimeParameters contains the runtime configuration parameters shared across templates and services. Fields use pointers to allow optional usage in different contexts (required in templates, optional in service overrides).

Appears in:

AIMClusterServiceTemplateSpec
AIMServiceOverrides
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon

Field	Description	Validation
`metric` AIMMetric	Metric selects the optimization goal: `latency` prioritizes low end-to-end latency; `throughput` prioritizes sustained requests/second.	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision selects the numeric precision used by the runtime.	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for each replica. For GPU models, defines the GPU count and model types required for deployment. For CPU-only models, defines CPU resource requirements. This field is immutable after creation.	Optional: {}

AIMRuntimeRoutingConfig#

AIMRuntimeRoutingConfig configures HTTP routing defaults for inference services. These settings control how Gateway API HTTPRoutes are created and configured.

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec
AIMServiceRuntimeConfig
AIMServiceSpec

Field	Description	Validation
`enabled` boolean	Enabled controls whether HTTP routing is managed for inference services using this config. When true, the operator creates HTTPRoute resources for services that reference this config. When false or unset, routing must be explicitly enabled on each service. This provides a namespace or cluster-wide default that individual services can override.	Optional: {}
`gatewayRef` ParentReference	GatewayRef specifies the Gateway API Gateway resource that should receive HTTPRoutes. This identifies the parent gateway for routing traffic to inference services. The gateway can be in any namespace (cross-namespace references are supported). If routing is enabled but GatewayRef is not specified, service reconciliation will fail with a validation error.	Optional: {}
`pathTemplate` string	PathTemplate defines the HTTP path template for routes, evaluated using JSONPath expressions. The template is rendered against the AIMService object to generate unique paths. Example templates: `/{.metadata.namespace}/{.metadata.name}` (namespace and service name); `/{.metadata.namespace}/{.metadata.labels['team']}/inference` (with label); `/models/{.metadata.name}` (based on service name). The template must: use valid JSONPath expressions wrapped in {…}; reference fields that exist on the service; produce a path ≤ 200 characters after rendering; and result in valid URL path segments (lowercase, RFC 1123 compliant). If evaluation fails, the service enters Degraded state with PathTemplateInvalid reason. Individual services can override this template via spec.routing.pathTemplate.	Optional: {}
`requestTimeout` Duration	RequestTimeout defines the HTTP request timeout for routes. This sets the maximum duration for a request to complete before timing out. The timeout applies to the entire request/response cycle. If not specified, no timeout is set on the route. Individual services can override this value via spec.routing.requestTimeout.	Optional: {}
`annotations` object (keys:string, values:string)	Annotations defines default annotations to add to all HTTPRoute resources. Services can add additional annotations or override these via spec.routing.annotations. When both are specified, service annotations take precedence for conflicting keys. Common use cases include ingress controller settings, rate limiting, monitoring labels, and security policies that should apply to all services using this config.	Optional: {}
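
To make the `pathTemplate` rendering concrete, here is a rough sketch that evaluates a tiny subset of JSONPath (only `.field` and `['key']` steps) against a dict shaped like an AIMService. It is not the operator's evaluator: `render_path` and `lookup` are hypothetical, and the lowercase/RFC 1123 checks are not reproduced; only the 200-character limit is.

```python
import re

PLACEHOLDER = re.compile(r"\{([^{}]+)\}")
STEP = re.compile(r"\.([A-Za-z0-9_]+)|\['([^']+)'\]")
MAX_PATH_LENGTH = 200


def lookup(obj, expression):
    """Resolve a small JSONPath subset: .field and ['key'] steps only."""
    position = 0
    current = obj
    while position < len(expression):
        step = STEP.match(expression, position)
        if step is None:
            raise ValueError(f"unsupported expression: {expression}")
        key = step.group(1) or step.group(2)
        current = current[key]  # raises KeyError if the field does not exist
        position = step.end()
    return str(current)


def render_path(template, service):
    """Hypothetical renderer: substitute each {...} expression, then check the length."""
    path = PLACEHOLDER.sub(lambda m: lookup(service, m.group(1)), template)
    if len(path) > MAX_PATH_LENGTH:
        raise ValueError("rendered path exceeds 200 characters")
    return path
```

For a service named `llama` in namespace `ml-team`, the template `/{.metadata.namespace}/{.metadata.name}` renders to `/ml-team/llama`.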

AIMService#

AIMService manages a KServe-based AIM inference service for the selected model and template. Note: KServe uses {name}-{namespace} format which must not exceed 63 characters. This constraint is validated at runtime since CEL cannot access metadata.namespace.

Appears in:

AIMServiceList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMService`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMServiceSpec
`status` AIMServiceStatus
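
The {name}-{namespace} length constraint noted for AIMService can be checked ahead of time with a sketch like this one. `kserve_name` is a hypothetical helper; the actual runtime validation lives in the operator.

```python
MAX_KSERVE_NAME_LENGTH = 63


def kserve_name(name, namespace):
    """Hypothetical pre-flight check for the combined {name}-{namespace} length."""
    combined = f"{name}-{namespace}"
    if len(combined) > MAX_KSERVE_NAME_LENGTH:
        raise ValueError(
            f"{combined!r} is {len(combined)} characters; the limit is {MAX_KSERVE_NAME_LENGTH}"
        )
    return combined
```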

AIMServiceAutoScaling#

AIMServiceAutoScaling configures KEDA-based autoscaling with custom metrics. This enables automatic scaling based on metrics collected from OpenTelemetry.

Appears in:

AIMServiceSpec

Field	Description	Default	Validation
`metrics` AIMServiceMetricsSpec array	Metrics is a list of metrics to be used for autoscaling. Each metric defines a source (PodMetric) and target values.		Optional: {}

AIMServiceCacheStatus#

AIMServiceCacheStatus captures cache-related status for an AIMService.

Appears in:

AIMServiceStatus

Field	Description	Default	Validation
`templateCacheRef` AIMResolvedReference	TemplateCacheRef references the TemplateCache being used, if any.		Optional: {}
`retryAttempts` integer	RetryAttempts tracks how many times this service has attempted to retry a failed cache. Each service gets exactly one retry attempt. When a TemplateCache enters Failed state, this counter is incremented from 0 to 1 after deleting failed Artifacts. If the retry fails (cache enters Failed again with attempts == 1), the service degrades.		Optional: {}

AIMServiceCachingConfig#

AIMServiceCachingConfig controls caching behavior for a service.

Appears in:

AIMServiceSpec

Field	Description	Default	Validation
`mode` AIMCachingMode	Mode controls when to use caching. Canonical values: Shared (default) reuses or creates shared cache assets; Dedicated creates service-owned dedicated cache assets. Legacy values are accepted and normalized: Always -> Shared, Auto -> Shared, Never -> Dedicated.	Shared	Enum: [Dedicated Shared Auto Always Never] Optional: {}
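
The normalization of legacy caching modes can be expressed as a simple lookup. The sketch below uses a hypothetical `normalize_caching_mode` helper and the mapping stated in the table.

```python
CANONICAL_MODES = {"Shared", "Dedicated"}
LEGACY_MODES = {"Always": "Shared", "Auto": "Shared", "Never": "Dedicated"}


def normalize_caching_mode(mode=None):
    """Hypothetical helper: map a caching mode to its canonical value."""
    if not mode:
        return "Shared"  # documented default
    mode = LEGACY_MODES.get(mode, mode)
    if mode not in CANONICAL_MODES:
        raise ValueError(f"unsupported caching mode: {mode!r}")
    return mode
```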

AIMServiceList#

AIMServiceList contains a list of AIMService.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMServiceList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMService array

AIMServiceMetricTarget#

AIMServiceMetricTarget defines the target value for a metric. Specifies how the metric value should be interpreted and what target to maintain.

Appears in:

AIMServicePodMetricSource

Field	Description	Validation
`type` string	Type specifies how to interpret the metric value: “Value” is an absolute value target (use the Value field); “AverageValue” is an average value across all pods (use the AverageValue field); “Utilization” is a percentage utilization for resource metrics (use the AverageUtilization field).	Enum: [Value AverageValue Utilization]
`value` string	Value is the target value of the metric (as a quantity). Used when Type is “Value”. Example: “1” for 1 request, “100m” for 100 millicores.	Optional: {}
`averageValue` string	AverageValue is the target value of the average of the metric across all relevant pods (as a quantity). Used when Type is “AverageValue”. Example: “100m” for 100 millicores per pod.	Optional: {}
`averageUtilization` integer	AverageUtilization is the target value of the average of the resource metric across all relevant pods, represented as a percentage of the requested value of the resource for the pods. Used when Type is “Utilization”. Only valid for Resource metric source type. Example: 80 for 80% utilization.	Optional: {}
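
The pairing between `type` and the value field it selects can be sketched as follows. `target_value` is a hypothetical helper, and treating the paired field as required is an assumption; the table only states which field is used for each type.

```python
# Which field carries the target for each documented type.
TARGET_FIELD_BY_TYPE = {
    "Value": "value",
    "AverageValue": "averageValue",
    "Utilization": "averageUtilization",
}


def target_value(target):
    """Hypothetical helper: return the target that applies for the given type."""
    target_type = target.get("type")
    field = TARGET_FIELD_BY_TYPE.get(target_type)
    if field is None:
        raise ValueError(f"unsupported target type: {target_type!r}")
    if target.get(field) is None:
        # Assumption: the paired field must be set for the type to be meaningful.
        raise ValueError(f"{field} must be set when type is {target_type}")
    return target[field]
```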

AIMServiceMetricsSpec#

AIMServiceMetricsSpec defines a single metric for autoscaling. Specifies the metric source type and configuration.

Appears in:

AIMServiceAutoScaling

Field	Description	Default	Validation
`type` string	Type is the type of metric source. Valid values: “PodMetric” (per-pod custom metrics).		Enum: [PodMetric]
`podmetric` AIMServicePodMetricSource	PodMetric refers to a metric describing each pod in the current scale target. Used when Type is “PodMetric”. Supports backends like OpenTelemetry for custom metrics.		Optional: {}

AIMServiceModel#

AIMServiceModel specifies which model to deploy. Exactly one field must be set.

Appears in:

AIMServiceSpec

Field	Description	Validation
`name` string	Name references an existing AIMModel or AIMClusterModel by metadata.name. The controller looks for a namespace-scoped AIMModel first, then falls back to cluster-scoped AIMClusterModel. Example: `meta-llama-3-8b`	Optional: {}
`image` string	Image specifies a container image URI directly. The controller searches for an existing model with this image, or creates one if none exists. Auto-created models are namespace-scoped and can be reused by other services. Example: `ghcr.io/silogen/llama-3-8b:v1.2.0`	Optional: {}
`custom` AIMServiceModelCustom	Custom specifies a custom model configuration with explicit base image, model sources, and hardware requirements. The controller will search for an existing matching AIMModel or auto-create one if not found.	Optional: {}

AIMServiceModelCustom#

AIMServiceModelCustom specifies a custom model configuration with explicit base image, model sources, and hardware requirements. Used for ad-hoc custom model deployments.

Appears in:

AIMServiceModel

Field	Description	Validation
`baseImage` string	BaseImage is the container image URI for the AIM base image. This will be used as the image for the auto-created AIMModel. Example: `ghcr.io/silogen/aim-base:0.7.0`	Required: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources to use. The controller will search for or create an AIMModel with these sources. The size field is optional - if not specified, it will be discovered by the download job. AIM runtime currently supports only one model source.	MaxItems: 1 MinItems: 1 Required: {}
`hardware` AIMHardwareRequirements	Hardware specifies the GPU and CPU requirements for this custom model. GPU is optional - if not set, no GPUs are requested (CPU-only model).	Required: {}

AIMServiceOverrides#

AIMServiceOverrides allows overriding template parameters at the service level. All fields are optional. When specified, they override the corresponding values from the referenced AIMServiceTemplate.

Appears in:

AIMServiceSpec

Field	Description	Validation
`metric` AIMMetric	Metric selects the optimization goal: `latency` prioritizes low end-to-end latency; `throughput` prioritizes sustained requests/second.	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision selects the numeric precision used by the runtime.	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for each replica. For GPU models, defines the GPU count and model types required for deployment. For CPU-only models, defines CPU resource requirements. This field is immutable after creation.	Optional: {}

AIMServicePodMetric#

AIMServicePodMetric identifies the pod metric and its backend. Supports multiple metrics backends including OpenTelemetry.

Appears in:

AIMServicePodMetricSource

Field	Description	Default	Validation
`backend` string	Backend defines the metrics backend to use. If not specified, defaults to “opentelemetry”.	opentelemetry	Enum: [opentelemetry] Optional: {}
`serverAddress` string	ServerAddress specifies the address of the metrics backend server. If not specified, defaults to “keda-otel-scaler.keda.svc:4317” for OpenTelemetry backend.		Optional: {}
`metricNames` string array	MetricNames specifies which metrics to collect from pods and send to ServerAddress. Example: [“vllm:num_requests_running”]		Optional: {}
`query` string	Query specifies the query to run to retrieve metrics from the backend. The query syntax depends on the backend being used. Example: “vllm:num_requests_running” for OpenTelemetry.		Optional: {}
`operationOverTime` string	OperationOverTime specifies the operation used to aggregate metrics over time. Valid values: “last_one”, “avg”, “max”, “min”, “rate”, “count”. Default: “last_one”.		Optional: {}
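
The documented defaults for this type can be summarized in a short Python sketch. The helper `with_defaults` is hypothetical and only mirrors the defaults and valid values listed in the table; it is not the controller's defaulting code.

```python
# Defaults as documented in the table above.
DOCUMENTED_DEFAULTS = {
    "backend": "opentelemetry",
    "serverAddress": "keda-otel-scaler.keda.svc:4317",
    "operationOverTime": "last_one",
}
VALID_OPERATIONS = {"last_one", "avg", "max", "min", "rate", "count"}


def with_defaults(metric):
    """Fill unset fields with the documented defaults and check the operation."""
    resolved = {**DOCUMENTED_DEFAULTS, **metric}
    if resolved["operationOverTime"] not in VALID_OPERATIONS:
        raise ValueError(f"unsupported operationOverTime: {resolved['operationOverTime']}")
    return resolved


resolved = with_defaults({
    "metricNames": ["vllm:num_requests_running"],
    "query": "vllm:num_requests_running",
})
print(resolved["backend"], resolved["serverAddress"], resolved["operationOverTime"])
```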

AIMServicePodMetricSource#

AIMServicePodMetricSource defines pod-level metrics configuration. Specifies the metric identification and target values for pod-based autoscaling.

Appears in:

AIMServiceMetricsSpec

Field	Description	Default	Validation
`metric` AIMServicePodMetric	Metric contains the metric identification and backend configuration. Defines which metrics to collect and how to query them.
`target` AIMServiceMetricTarget	Target specifies the target value for the metric. The autoscaler will scale to maintain this target value.

AIMServiceRoutingStatus#

AIMServiceRoutingStatus captures observed routing details.

Appears in:

AIMServiceStatus

Field	Description	Default	Validation
`path` string	Path is the HTTP path prefix used when routing is enabled. Example: `/tenant/svc-uuid`		Optional: {}

AIMServiceRuntimeConfig#

AIMServiceRuntimeConfig contains runtime configuration fields that apply to services. This struct is shared between AIMService.spec (inlined) and AIMRuntimeConfigCommon, allowing services to override these specific runtime settings while inheriting defaults from namespace/cluster RuntimeConfigs.

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec
AIMServiceSpec

Field	Description	Validation
`storage` AIMStorageConfig	Storage configures storage defaults for this service’s PVCs and caches. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`routing` AIMRuntimeRoutingConfig	Routing controls HTTP routing configuration for this service. When set, these values override namespace/cluster runtime config defaults.	Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. When set on AIMService, these take highest precedence in the merge hierarchy. When set on RuntimeConfig, these provide namespace/cluster-level defaults. Merge order (highest to lowest): Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env	Optional: {}
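
The merge order for `env` can be sketched as follows. This is a minimal illustration that assumes variables are merged by `name` and that a higher-precedence layer replaces the whole entry; the function `merge_env` and the sample variables are hypothetical.

```python
def merge_env(service=(), template=(), runtime_config=(), profile=()):
    """Merge EnvVar lists by name: Service > Template > RuntimeConfig > Profile."""
    merged = {}
    # Apply the lowest-precedence layer first so higher layers overwrite by name.
    for layer in (profile, runtime_config, template, service):
        for var in layer:
            merged[var["name"]] = var
    return list(merged.values())


profile_env = [{"name": "LOG_LEVEL", "value": "info"}]
template_env = [{"name": "HF_HOME", "value": "/cache"}]
service_env = [{"name": "LOG_LEVEL", "value": "debug"}]
merged = merge_env(service=service_env, template=template_env, profile=profile_env)
print({var["name"]: var["value"] for var in merged})
```

In this example the service-level `LOG_LEVEL` wins over the profile-level value, while `HF_HOME` is carried through from the template.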

AIMServiceRuntimeStatus#

AIMServiceRuntimeStatus captures runtime status including replica counts from HPA.

Appears in:

AIMServiceStatus

Field	Description	Validation
`currentReplicas` integer	CurrentReplicas is the current number of replicas as reported by the HPA.
`desiredReplicas` integer	DesiredReplicas is the desired number of replicas as determined by the HPA.
`minReplicas` integer	MinReplicas is the minimum number of replicas configured for autoscaling.
`maxReplicas` integer	MaxReplicas is the maximum number of replicas configured for autoscaling.
`replicas` string	Replicas is a formatted display string for kubectl output. Shows “current” for fixed replicas or “current/desired (min-max)” for autoscaling.	Optional: {}
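
The two display formats for `replicas` can be sketched in Python. The function below is illustrative only; in particular, treating the presence of both min and max values as the signal for autoscaling is an assumption.

```python
def format_replicas(current, desired=None, min_replicas=None, max_replicas=None):
    """Build the display string described for the `replicas` field."""
    if min_replicas is None or max_replicas is None:
        # Fixed replica count: show only the current value.
        return str(current)
    # Autoscaling: show current/desired and the configured range.
    return f"{current}/{desired} ({min_replicas}-{max_replicas})"


print(format_replicas(2))
print(format_replicas(2, desired=3, min_replicas=1, max_replicas=5))
```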

AIMServiceSpec#

AIMServiceSpec defines the desired state of AIMService.

Binds a canonical model to an AIMServiceTemplate and configures replicas, caching behavior, and optional overrides. The template governs the base runtime selection knobs, while the overrides field allows service-specific customization.

Appears in:

AIMService

Field	Description	Default	Validation
`model` AIMServiceModel	Model specifies which model to deploy using one of the available reference methods. Use `name` to reference an existing AIMModel/AIMClusterModel by name, or use `image` to specify a container image URI directly (which will auto-create a model if needed).
`template` AIMServiceTemplateConfig	Template contains template selection and configuration. Use Template.Name to specify an explicit template, or omit to auto-select.		Optional: {}
`caching` AIMServiceCachingConfig	Caching controls caching behavior for this service. When nil, defaults to Shared mode.		Optional: {}
`cacheModel` boolean	DEPRECATED: Use Caching.Mode instead. This field will be removed in a future version. This field is no longer honored by the controller.		Optional: {}
`replicas` integer	Replicas specifies the number of replicas for this service. When not specified, defaults to 1 replica. This value overrides any replica settings from the template. For autoscaling, use MinReplicas and MaxReplicas instead.	1	Optional: {}
`minReplicas` integer	MinReplicas specifies the minimum number of replicas for autoscaling. Defaults to 1. Scale to zero is not supported. When specified with MaxReplicas, enables autoscaling for the service.		Minimum: 1 Optional: {}
`maxReplicas` integer	MaxReplicas specifies the maximum number of replicas for autoscaling. Required when MinReplicas is set or when AutoScaling configuration is provided.		Minimum: 1 Optional: {}
`autoScaling` AIMServiceAutoScaling	AutoScaling configures advanced autoscaling behavior using KEDA. Supports custom metrics from OpenTelemetry backend. When specified, MinReplicas and MaxReplicas should also be set.		Optional: {}
`runtimeConfigName` string	RuntimeConfigName is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, the values are merged, with the namespace config taking priority over the cluster config on conflict. If this field is empty or set to `default`, the namespace/cluster runtime config named `default` is used, if it exists.		Optional: {}
`storage` AIMStorageConfig	Storage configures storage defaults for this service’s PVCs and caches. When set, these values override namespace/cluster runtime config defaults.		Optional: {}
`routing` AIMRuntimeRoutingConfig	Routing controls HTTP routing configuration for this service. When set, these values override namespace/cluster runtime config defaults.		Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. When set on AIMService, these take highest precedence in the merge hierarchy. When set on RuntimeConfig, these provide namespace/cluster-level defaults. Merge order (highest to lowest): Service.Env > Template.Env > RuntimeConfig.Env > Profile.Env		Optional: {}
`resources` ResourceRequirements	Resources overrides the container resource requirements for this service. When specified, these values take precedence over the template and image defaults.		Optional: {}
`overrides` AIMServiceOverrides	Overrides allows overriding specific template parameters for this service. When specified, these values take precedence over the template values.		Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets references secrets for pulling AIM container images.		Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the Kubernetes service account to use for the inference workload. This service account is used by the deployed inference pods. If empty, the default service account for the namespace is used.		Optional: {}
`priorityClassName` string	PriorityClassName specifies the priority class for the inference pods. This maps directly to the Kubernetes PriorityClassName field on the pod spec. If empty, no priority class is set.		Optional: {}
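
The replica-related constraints in the table above can be expressed as a small pre-flight check. This is a sketch of the documented rules only (minimum of 1, and `maxReplicas` required when `minReplicas` or `autoScaling` is set); the function name and error messages are hypothetical, and the API server or controller may enforce additional rules.

```python
def validate_scaling(spec):
    """Return a list of violations of the documented replica constraints."""
    errors = []
    min_r = spec.get("minReplicas")
    max_r = spec.get("maxReplicas")
    for name, value in (("minReplicas", min_r), ("maxReplicas", max_r)):
        if value is not None and value < 1:
            errors.append(f"{name} must be at least 1")
    if (min_r is not None or "autoScaling" in spec) and max_r is None:
        errors.append("maxReplicas is required when minReplicas or autoScaling is set")
    return errors


print(validate_scaling({"replicas": 2}))
print(validate_scaling({"minReplicas": 1}))
print(validate_scaling({"minReplicas": 0, "maxReplicas": 3}))
```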

AIMServiceStatus#

AIMServiceStatus defines the observed state of AIMService.

Appears in:

AIMService

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller.
`conditions` Condition array	Conditions represent the latest observations of the service state.
`resolvedRuntimeConfig` AIMResolvedReference	ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.		Optional: {}
`resolvedModel` AIMResolvedReference	ResolvedModel captures metadata about the image that was resolved.		Optional: {}
`status` AIMStatus	Status represents the current high‑level status of the service lifecycle. Values: `Pending`, `Starting`, `Running`, `Degraded`, `Failed`.	Pending	Enum: [Pending Starting Running Degraded Failed]
`routing` AIMServiceRoutingStatus	Routing surfaces information about the configured HTTP routing, when enabled.		Optional: {}
`resolvedTemplate` AIMResolvedReference	ResolvedTemplate captures metadata about the template that satisfied the reference.
`cache` AIMServiceCacheStatus	Cache captures cache-related status for this service.		Optional: {}
`runtime` AIMServiceRuntimeStatus	Runtime captures runtime status including replica counts.		Optional: {}

AIMServiceTemplate#

AIMServiceTemplate is the Schema for namespace-scoped AIM service templates.

Appears in:

AIMServiceTemplateList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMServiceTemplate`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMServiceTemplateSpec
`status` AIMServiceTemplateStatus

AIMServiceTemplateConfig#

AIMServiceTemplateConfig contains template selection configuration for AIMService.

Appears in:

AIMServiceSpec

Field	Description	Default	Validation
`name` string	Name is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to use. The template selects the runtime profile and GPU parameters. When not specified, a template will be automatically selected based on the model.		Optional: {}
`allowUnoptimized` boolean	AllowUnoptimized, if true, will allow automatic selection of templates that resolve to an unoptimized profile.		Optional: {}

AIMServiceTemplateList#

AIMServiceTemplateList contains a list of AIMServiceTemplate.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMServiceTemplateList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMServiceTemplate array

AIMServiceTemplateScope#

Underlying type: string

AIMServiceTemplateScope is retained for backwards compatibility with existing consumers.

Validation:

Enum: [Namespace Cluster Unknown]

Appears in:

AIMTemplateCacheSpec

AIMServiceTemplateSpec#

AIMServiceTemplateSpec defines the desired state of AIMServiceTemplate (namespace-scoped).

A namespaced and versioned template that selects a runtime profile for a given AIM model (by canonical name). Templates are intentionally narrow: they describe runtime selection knobs for the AIM container and do not redefine the full Kubernetes deployment shape.

Appears in:

AIMServiceTemplate

Field	Description	Validation
`modelName` string	ModelName is the model name. Matches `metadata.name` of an AIMModel or AIMClusterModel. Immutable. Example: `meta/llama-3-8b:1.1+20240915`	MinLength: 1
`metric` AIMMetric	Metric selects the optimization goal. - `latency`: prioritize low end‑to‑end latency - `throughput`: prioritize sustained requests/second	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision selects the numeric precision used by the runtime.	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for each replica. For GPU models, defines the GPU count and model types required for deployment. For CPU-only models, defines CPU resource requirements. This field is immutable after creation.	Optional: {}
`runtimeConfigName` string	RuntimeConfigName is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, the values are merged, with the namespace config taking priority over the cluster config on conflict. If this field is empty or set to `default`, the namespace/cluster runtime config named `default` is used, if it exists.	Optional: {}
`aimId` string	AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). Required when customProfile is set; used to assemble the profile YAML aim_id field and to compute the custom profile ID for AIM_PROFILE_ID.	Optional: {}
`modelId` string	ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). Required when customProfile is set; used for profile YAML model_id field and for weight pre-caching via the discovery job.	Optional: {}
`customProfile` AIMCustomProfile	CustomProfile defines inline custom profile data for the inference engine. When set, the controller assembles a profile YAML from this data and template metadata, creates a ConfigMap, and mounts it into discovery and inference containers. Requires aimId, modelId, hardware, metric, and precision to also be set.	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets containing credentials for pulling container images. These secrets are used for: - Discovery dry-run jobs that inspect the model container - Pulling the image for inference services The secrets are merged with any model or runtime config defaults. For namespace-scoped templates, secrets must exist in the same namespace. For cluster-scoped templates, secrets must exist in the operator namespace.	Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. This includes discovery dry-run jobs and inference services created from this template. If empty, the default service account for the namespace is used.	Optional: {}
`resources` ResourceRequirements	Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults.	Optional: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources required to run this template. When provided, the discovery dry-run will be skipped and these sources will be used directly. This allows users to explicitly declare model dependencies without requiring a discovery job. If omitted, a discovery job will be run to automatically determine the required model sources.	Optional: {}
`profileId` string	ProfileId is the specific AIM profile ID that this template should use. When set, the discovery job will be instructed to use this specific profile.	Optional: {}
`type` AIMProfileType	Type indicates the optimization level of this template. - optimized: Template has been tuned for performance - preview: Template is experimental/pre-release - unoptimized: Default, no specific optimizations applied When nil, the type is determined by discovery. When set, overrides discovery.	Enum: [optimized preview unoptimized] Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. These variables are passed to the inference runtime and can be used to configure runtime behavior, authentication, or other settings.	Optional: {}
`caching` AIMTemplateCachingConfig	Caching configures model caching behavior for this namespace-scoped template. When enabled, models will be cached using the specified environment variables during download.	Optional: {}
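
The `customProfile` description lists several fields that must accompany it. A minimal Python sketch of that check follows; the helper name is hypothetical, and the empty `customProfile` object is a placeholder because the fields of AIMCustomProfile are not described in this section.

```python
REQUIRED_WITH_CUSTOM_PROFILE = ("aimId", "modelId", "hardware", "metric", "precision")


def missing_custom_profile_fields(spec):
    """List the documented companion fields that are absent when customProfile is set."""
    if "customProfile" not in spec:
        return []
    return [field for field in REQUIRED_WITH_CUSTOM_PROFILE if not spec.get(field)]


spec = {
    "modelName": "meta/llama-3-8b:1.1+20240915",
    "aimId": "meta-llama/Llama-3-8B",
    "modelId": "Qwen/Qwen3-32B-FP8",
    "customProfile": {},  # placeholder; real profile data goes here
}
print(missing_custom_profile_fields(spec))
```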

AIMServiceTemplateSpecCommon#

Appears in:

AIMClusterServiceTemplateSpec
AIMServiceTemplateSpec

Field	Description	Validation
`modelName` string	ModelName is the model name. Matches `metadata.name` of an AIMModel or AIMClusterModel. Immutable. Example: `meta/llama-3-8b:1.1+20240915`	MinLength: 1
`metric` AIMMetric	Metric selects the optimization goal. - `latency`: prioritize low end‑to‑end latency - `throughput`: prioritize sustained requests/second	Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision selects the numeric precision used by the runtime.	Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}
`hardware` AIMHardwareRequirements	Hardware specifies GPU and CPU requirements for each replica. For GPU models, defines the GPU count and model types required for deployment. For CPU-only models, defines CPU resource requirements. This field is immutable after creation.	Optional: {}
`runtimeConfigName` string	RuntimeConfigName is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, the values are merged, with the namespace config taking priority over the cluster config on conflict. If this field is empty or set to `default`, the namespace/cluster runtime config named `default` is used, if it exists.	Optional: {}
`aimId` string	AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). Required when customProfile is set; used to assemble the profile YAML aim_id field and to compute the custom profile ID for AIM_PROFILE_ID.	Optional: {}
`modelId` string	ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). Required when customProfile is set; used for profile YAML model_id field and for weight pre-caching via the discovery job.	Optional: {}
`customProfile` AIMCustomProfile	CustomProfile defines inline custom profile data for the inference engine. When set, the controller assembles a profile YAML from this data and template metadata, creates a ConfigMap, and mounts it into discovery and inference containers. Requires aimId, modelId, hardware, metric, and precision to also be set.	Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets lists secrets containing credentials for pulling container images. These secrets are used for: - Discovery dry-run jobs that inspect the model container - Pulling the image for inference services The secrets are merged with any model or runtime config defaults. For namespace-scoped templates, secrets must exist in the same namespace. For cluster-scoped templates, secrets must exist in the operator namespace.	Optional: {}
`serviceAccountName` string	ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. This includes discovery dry-run jobs and inference services created from this template. If empty, the default service account for the namespace is used.	Optional: {}
`resources` ResourceRequirements	Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults.	Optional: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources required to run this template. When provided, the discovery dry-run will be skipped and these sources will be used directly. This allows users to explicitly declare model dependencies without requiring a discovery job. If omitted, a discovery job will be run to automatically determine the required model sources.	Optional: {}
`profileId` string	ProfileId is the specific AIM profile ID that this template should use. When set, the discovery job will be instructed to use this specific profile.	Optional: {}
`type` AIMProfileType	Type indicates the optimization level of this template. - optimized: Template has been tuned for performance - preview: Template is experimental/pre-release - unoptimized: Default, no specific optimizations applied When nil, the type is determined by discovery. When set, overrides discovery.	Enum: [optimized preview unoptimized] Optional: {}
`env` EnvVar array	Env specifies environment variables for inference containers. These variables are passed to the inference runtime and can be used to configure runtime behavior, authentication, or other settings.	Optional: {}
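
The `runtimeConfigName` description says that same-named namespace and cluster runtime configs are merged, with the namespace config winning on conflict. The sketch below shows one way such a merge could behave. The recursive, per-key strategy is an assumption, as are the function name and the storage class name `fast`.

```python
def merge_runtime_config(cluster, namespace):
    """Overlay namespace values on cluster values, recursing into nested objects."""
    merged = dict(cluster)
    for key, value in namespace.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_runtime_config(merged[key], value)
        else:
            merged[key] = value  # namespace value takes priority on conflict
    return merged


cluster_cfg = {"storage": {"defaultStorageClassName": "fast", "pvcHeadroomPercent": 10}}
namespace_cfg = {"storage": {"pvcHeadroomPercent": 20}}
merged_cfg = merge_runtime_config(cluster_cfg, namespace_cfg)
print(merged_cfg)
```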

AIMServiceTemplateStatus#

AIMServiceTemplateStatus defines the observed state of AIMServiceTemplate.

Appears in:

AIMClusterServiceTemplate
AIMServiceTemplate

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller.
`conditions` Condition array	Conditions represent the latest observations of template state.
`resolvedRuntimeConfig` AIMResolvedReference	ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.		Optional: {}
`resolvedModel` AIMResolvedReference	ResolvedModel captures metadata about the image that was resolved.		Optional: {}
`resolvedCache` AIMResolvedReference	ResolvedCache captures metadata about which cache is used for this template.		Optional: {}
`resolvedHardware` AIMHardwareRequirements	ResolvedHardware contains the resolved hardware requirements for this template. These values are computed from discovery results and spec defaults, and represent what will actually be used when creating InferenceServices. Resolution order: discovery output > spec values > defaults.		Optional: {}
`resolvedNodeAffinity` NodeAffinity	ResolvedNodeAffinity contains the computed node affinity rules for GPU scheduling. This is derived from GPU model and minVRAM requirements, merged with any user-specified affinity from the spec. The service controller uses this directly when creating InferenceServices.		Optional: {}
`hardwareSummary` string	HardwareSummary is a human-readable display string for the hardware requirements. Format: “{count} x {model}” for GPU (e.g., “2 x MI300X”) or “CPU” for CPU-only. This is a computed field for display purposes only.		Optional: {}
`status` AIMStatus	Status represents the current high‑level status of the template lifecycle. Values: `Pending`, `Progressing`, `Ready`, `Degraded`, `Failed`, `NotAvailable`.	Pending	Enum: [Pending Progressing Ready Degraded Failed NotAvailable]
`modelSources` AIMModelSource array	ModelSources list the models that this template requires to run. These are the models that will be cached, if this template is cached.
`profile` AIMProfile	Profile contains the full discovery result profile as a free-form JSON object. This includes metadata, engine args, environment variables, and model details.
`discoveryJob` AIMResolvedReference	DiscoveryJob is a reference to the job that was run for discovery.
`discovery` DiscoveryState	Discovery contains state tracking for the discovery process, including retry attempts and backoff timing for the circuit breaker pattern.		Optional: {}
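
The `hardwareSummary` format can be illustrated with a short Python sketch. The function and its parameters are hypothetical; they stand in for the resolved GPU count and model, whose exact field names are defined by AIMHardwareRequirements.

```python
def hardware_summary(gpu_count=0, gpu_model=None):
    """Format hardware as '{count} x {model}' for GPU, or 'CPU' for CPU-only."""
    if gpu_count and gpu_model:
        return f"{gpu_count} x {gpu_model}"
    return "CPU"


print(hardware_summary(2, "MI300X"))
print(hardware_summary())
```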

AIMStorageConfig#

AIMStorageConfig configures storage defaults for artifacts and PVCs.

Appears in:

AIMClusterRuntimeConfigSpec
AIMRuntimeConfigCommon
AIMRuntimeConfigSpec
AIMServiceRuntimeConfig
AIMServiceSpec

Field	Description	Default	Validation
`defaultStorageClassName` string	DefaultStorageClassName specifies the storage class to use for artifacts and PVCs when the consuming resource (AIMArtifact, AIMTemplateCache, AIMServiceTemplate) does not specify a storage class. If this field is empty, the cluster’s default storage class is used.		Optional: {}
`pvcHeadroomPercent` integer	PVCHeadroomPercent specifies the percentage of extra space to add to PVCs for model storage. This accounts for filesystem overhead and temporary files during model loading. The value represents a percentage (e.g., 10 means 10% extra space). If not specified, defaults to 10%.	10	Minimum: 0 Optional: {}
`downloadFilter` AIMDownloadFilter	DownloadFilter controls which files are included or excluded during artifact downloads. When set here, applies as the default for all artifacts using this runtime config. Individual artifacts can override this with their own downloadFilter. When no filter is configured at any level, subdirectory files are excluded by default. Set to an empty object (downloadFilter: {}) to explicitly allow all files.		Optional: {}
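
To make the effect of `pvcHeadroomPercent` concrete, the following sketch computes a PVC size from a model size. It assumes the headroom is applied as a simple percentage on top of the model size and that the result is rounded up to a whole byte; the controller's actual rounding and unit handling may differ.

```python
def pvc_size_bytes(model_bytes, headroom_percent=10):
    """Return the model size plus headroom, rounded up to a whole byte."""
    if headroom_percent < 0:
        raise ValueError("pvcHeadroomPercent must be >= 0")
    total = model_bytes * (100 + headroom_percent)
    return (total + 99) // 100  # integer ceiling division avoids float rounding


print(pvc_size_bytes(1000))
print(pvc_size_bytes(1001, headroom_percent=10))
```

With the default of 10, a 1000-byte model yields a 1100-byte request.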

AIMTemplateCache#

AIMTemplateCache pre-warms artifacts for a specified template.

Appears in:

AIMTemplateCacheList

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMTemplateCache`
`metadata` ObjectMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`spec` AIMTemplateCacheSpec
`status` AIMTemplateCacheStatus

AIMTemplateCacheList#

AIMTemplateCacheList contains a list of AIMTemplateCache.

Field	Description	Default	Validation
`apiVersion` string	`aim.eai.amd.com/v1alpha1`
`kind` string	`AIMTemplateCacheList`
`metadata` ListMeta	Refer to Kubernetes API documentation for fields of `metadata`.
`items` AIMTemplateCache array

AIMTemplateCacheMode#

Underlying type: string

AIMTemplateCacheMode controls the ownership behavior of artifacts created by a template cache.

Validation:

Enum: [Dedicated Shared]

Appears in:

AIMTemplateCacheSpec

Field	Description
`Dedicated`	TemplateCacheModeDedicated means artifacts have owner references to the template cache. When the template cache is deleted, all its artifacts are garbage collected. Use this mode for service-specific caches that should be cleaned up with the service.
`Shared`	TemplateCacheModeShared means artifacts have no owner references. Artifacts persist independently of the template cache lifecycle and can be shared. This is the default mode for long-lived, reusable caches.

AIMTemplateCacheSpec#

AIMTemplateCacheSpec defines the desired state of AIMTemplateCache.

Appears in:

AIMTemplateCache

Field	Description	Default	Validation
`templateName` string	TemplateName is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to cache. The controller will first look for a namespace-scoped AIMServiceTemplate in the same namespace. If not found, it will look for a cluster-scoped AIMClusterServiceTemplate with the same name. Namespace-scoped templates take priority over cluster-scoped templates.		MinLength: 1
`templateScope` AIMServiceTemplateScope	TemplateScope indicates whether the template is namespace-scoped or cluster-scoped. This field is set by the controller during template resolution.		Enum: [Namespace Cluster Unknown] Required: {}
`env` EnvVar array	Env specifies environment variables to use for authentication when downloading models. These variables are used for authentication with model registries (e.g., Hugging Face tokens).		Optional: {}
`imagePullSecrets` LocalObjectReference array	ImagePullSecrets references secrets for pulling AIM container images.		Optional: {}
`storageClassName` string	StorageClassName specifies the storage class for cache volumes. When not specified, uses the cluster default storage class.		Optional: {}
`downloadImage` string	DownloadImage specifies the container image used to download and initialize artifacts. When not specified, the controller uses the default model download image.		Optional: {}
`modelSources` AIMModelSource array	ModelSources specifies the model sources to cache for this template. These sources are typically copied from the resolved template’s model sources.		Optional: {}
`runtimeConfigName` string	RuntimeConfigName is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, the values are merged, with the namespace config taking priority over the cluster config on conflict. If this field is empty or set to `default`, the namespace/cluster runtime config named `default` is used, if it exists.		Optional: {}
`mode` AIMTemplateCacheMode	Mode controls the ownership behavior of artifacts created by this template cache. - Dedicated: artifacts are owned by this template cache and garbage collected when it’s deleted. - Shared (default): artifacts have no owner references and persist independently. When a Shared template cache encounters artifacts with owner references, it promotes them to shared by removing the owner references, ensuring they persist for long-term use.	Shared	Enum: [Dedicated Shared] Optional: {}
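
The lookup order described for `templateName` can be sketched as follows. The sets of template names stand in for API lookups and the function name is hypothetical; only the namespace-before-cluster ordering comes from the field description.

```python
def resolve_template(name, namespaced_templates, cluster_templates):
    """Return (kind, scope) for the first match, namespace-scoped first, else None."""
    if name in namespaced_templates:
        return ("AIMServiceTemplate", "Namespace")
    if name in cluster_templates:
        return ("AIMClusterServiceTemplate", "Cluster")
    return None


namespaced = {"llama-latency"}
cluster_wide = {"llama-latency", "qwen-throughput"}
print(resolve_template("llama-latency", namespaced, cluster_wide))
print(resolve_template("qwen-throughput", namespaced, cluster_wide))
```

When the same name exists at both scopes, the namespace-scoped template is chosen.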

AIMTemplateCacheStatus#

AIMTemplateCacheStatus defines the observed state of AIMTemplateCache.

Appears in:

AIMTemplateCache

Field	Description	Default	Validation
`observedGeneration` integer	ObservedGeneration is the most recent generation observed by the controller.
`conditions` Condition array	Conditions represent the latest observations of the template cache state.
`resolvedRuntimeConfig` AIMResolvedReference	ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.		Optional: {}
`status` AIMStatus	Status represents the current high-level status of the template cache.	Pending	Enum: [Pending Progressing Ready Failed Degraded NotAvailable]
`resolvedTemplateKind` string	ResolvedTemplateKind indicates whether the template resolved to a namespace-scoped AIMServiceTemplate or cluster-scoped AIMClusterServiceTemplate. Values: “AIMServiceTemplate”, “AIMClusterServiceTemplate”
`artifacts` object (keys:string, values:AIMResolvedArtifact)	Artifacts maps model names to their resolved AIMArtifact resources.		Optional: {}

AIMTemplateCachingConfig#

AIMTemplateCachingConfig configures model caching behavior for namespace-scoped templates.

Appears in:

AIMServiceTemplateSpec

Field	Description	Default	Validation
`enabled` boolean	Enabled controls whether caching is enabled for this template. Defaults to `false`.	false
`env` EnvVar array	Env specifies environment variables to use when downloading the model for caching. These variables are available to the model download process and can be used to configure download behavior, authentication, proxies, etc. If not set, falls back to the template’s top-level Env field.		Optional: {}
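
The fallback described for `caching.env` can be sketched in a few lines of Python. Treating an empty list the same as an unset field is an assumption, and the helper name and sample variables are hypothetical.

```python
def download_env(template_spec):
    """Pick caching.env when set, otherwise fall back to the template's top-level env."""
    caching = template_spec.get("caching") or {}
    return caching.get("env") or template_spec.get("env", [])


spec_with_fallback = {
    "env": [{"name": "HF_TOKEN", "value": "example"}],
    "caching": {"enabled": True},
}
print(download_env(spec_with_fallback))
```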

AIMTemplateProfile#

AIMTemplateProfile declares profile variables for template selection. Used in AIMCustomTemplate to specify optimization targets.

Appears in:

AIMCustomTemplate

Field	Description	Default	Validation
`metric` AIMMetric	Metric specifies the optimization target (e.g., latency, throughput).		Enum: [latency throughput] Optional: {}
`precision` AIMPrecision	Precision specifies the numerical precision (e.g., fp8, fp16, bf16).		Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] Optional: {}

DiscoveryState#

DiscoveryState tracks the discovery process state for circuit breaker logic. This enables exponential backoff and prevents infinite retry loops when discovery jobs fail persistently.

Appears in:

AIMServiceTemplateStatus

Field	Description	Validation
`attempts` integer	Attempts is the number of discovery job attempts that have been made. This counter increments each time a new discovery job is created after a failure.	Optional: {}
`lastAttemptTime` Time	LastAttemptTime is the timestamp of the most recent discovery job creation. Used to calculate exponential backoff before the next retry.	Optional: {}
`lastFailureReason` string	LastFailureReason captures the reason for the most recent discovery failure. Used to classify failures as terminal vs transient.	Optional: {}
`specHash` string	SpecHash is a hash of the template spec fields that affect discovery. When the spec changes, the circuit breaker resets to allow fresh attempts.	Optional: {}
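
The way `attempts` and `lastAttemptTime` could feed an exponential backoff is sketched below. The base delay of 30 seconds and the one-hour cap are hypothetical values chosen for illustration; this reference does not state the controller's actual backoff parameters.

```python
from datetime import datetime, timedelta, timezone


def next_retry_time(attempts, last_attempt_time, base_seconds=30, max_seconds=3600):
    """Double the delay for each failed attempt, capped at max_seconds."""
    delay = min(base_seconds * 2 ** max(attempts - 1, 0), max_seconds)
    return last_attempt_time + timedelta(seconds=delay)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
for attempts in (1, 2, 3, 20):
    print(attempts, next_retry_time(attempts, start) - start)
```

A changed `specHash` would reset `attempts`, which restarts this schedule from the base delay.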

DownloadProgress#

DownloadProgress represents the download progress for an artifact.

Appears in:

AIMArtifactStatus

Field	Description	Validation
`totalBytes` integer	TotalBytes is the expected total size of the download in bytes.	Optional: {}
`downloadedBytes` integer	DownloadedBytes is the number of bytes downloaded so far.	Optional: {}
`percentage` integer	Percentage is the download progress as a percentage (0-100).	Maximum: 100 Minimum: 0 Optional: {}
`displayPercentage` string	DisplayPercentage is a human-readable progress string (e.g., “45 %”). This field is automatically populated from Progress.Percentage.	Optional: {}
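
The relationship between these fields can be sketched in Python. The function is illustrative; rounding down and clamping to 100 are assumptions, while the "45 %" display format follows the example in the table.

```python
def download_progress(downloaded_bytes, total_bytes):
    """Derive percentage and displayPercentage from byte counts."""
    if total_bytes <= 0:
        percentage = 0  # total size not yet known
    else:
        percentage = min(100, downloaded_bytes * 100 // total_bytes)
    return {
        "totalBytes": total_bytes,
        "downloadedBytes": downloaded_bytes,
        "percentage": percentage,
        "displayPercentage": f"{percentage} %",
    }


status = download_progress(450, 1000)
print(status["displayPercentage"])
```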

DownloadState#

DownloadState represents the current download attempt state, updated by the downloader pod.

Appears in:

AIMArtifactStatus

Field	Description	Validation
`protocol` string	Protocol is the download protocol currently in use (e.g., “XET”, “HF_TRANSFER”, “HTTP”)	Optional: {}
`attempt` integer	Attempt is the current attempt number (1-based)	Optional: {}
`totalAttempts` integer	TotalAttempts is the total number of attempts configured via AIM_DOWNLOADER_PROTOCOL	Optional: {}
`protocolSequence` string	ProtocolSequence is the configured protocol sequence (e.g., “HF_TRANSFER,XET”)	Optional: {}
`message` string	Message is a human-readable status message from the downloader	Optional: {}
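The relationship between `protocolSequence`, `attempt`, and `protocol` might look like the sketch below. It assumes that each entry in the comma-separated sequence corresponds to exactly one attempt, which this reference does not confirm; the helper names are hypothetical.

```python
def parse_protocol_sequence(raw: str) -> list:
    """Split a value such as "HF_TRANSFER,XET" into a list of protocol names."""
    return [part.strip().upper() for part in raw.split(",") if part.strip()]


def download_state(raw_sequence: str, attempt: int, message: str = "") -> dict:
    """Build a DownloadState-shaped dict for a 1-based attempt number."""
    sequence = parse_protocol_sequence(raw_sequence)
    if not 1 <= attempt <= len(sequence):
        raise ValueError(f"attempt {attempt} is outside 1..{len(sequence)}")
    return {
        # Assumption: attempt N uses the N-th protocol in the sequence.
        "protocol": sequence[attempt - 1],
        "attempt": attempt,
        "totalAttempts": len(sequence),
        "protocolSequence": ",".join(sequence),
        "message": message,
    }


if __name__ == "__main__":
    print(download_state("HF_TRANSFER,XET", 2, "retrying with fallback"))
```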

ImageMetadata#

ImageMetadata contains metadata extracted from or provided for a container image.

Appears in:

AIMModelSpec
AIMModelStatus

Field	Description	Validation
`model` ModelMetadata	Model contains AMD Silogen model-specific metadata.	Optional: {}
`oci` OCIMetadata	OCI contains standard OCI image metadata.	Optional: {}
`originalLabels` object (keys:string, values:string)	OriginalLabels contains the raw OCI image labels as a JSON object. This preserves all labels from the image, including those not mapped to structured fields.	Optional: {}

ModelMetadata#

ModelMetadata contains AMD Silogen model-specific metadata extracted from image labels.

Appears in:

ImageMetadata

Field	Description	Validation
`canonicalName` string	CanonicalName is the canonical model identifier (e.g., mistralai/Mixtral-8x22B-Instruct-v0.1). Extracted from: org.amd.silogen.model.canonicalName	Optional: {}
`source` string	Source is the URL where the model can be found. Extracted from: org.amd.silogen.model.source	Optional: {}
`tags` string array	Tags are descriptive tags (e.g., [“text-generation”, “chat”, “instruction”]). Extracted from: org.amd.silogen.model.tags (comma-separated)	Optional: {}
`versions` string array	Versions lists available versions. Extracted from: org.amd.silogen.model.versions (comma-separated)	Optional: {}
`variants` string array	Variants lists model variants. Extracted from: org.amd.silogen.model.variants (comma-separated)	Optional: {}
`hfTokenRequired` boolean	HFTokenRequired indicates if a Hugging Face token is required. Extracted from: org.amd.silogen.hfToken.required	Optional: {}
`title` string	Title is the Silogen-specific title for the model. Extracted from: org.amd.silogen.title	Optional: {}
`descriptionFull` string	DescriptionFull is the full description. Extracted from: org.amd.silogen.description.full	Optional: {}
`releaseNotes` string	ReleaseNotes contains release notes for this version. Extracted from: org.amd.silogen.release.notes	Optional: {}
`recommendedDeployments` RecommendedDeployment array	RecommendedDeployments contains recommended deployment configurations. Extracted from: org.amd.silogen.model.recommendedDeployments (parsed from JSON array)	Optional: {}
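To make the label-to-field mapping concrete, here is a minimal sketch that builds a ModelMetadata-shaped dict from raw image labels. The label keys are taken from the table above; the function name `parse_model_metadata` and the treatment of the boolean label as the string "true" are assumptions.

```python
import json

# Field name -> image label, as listed in the ModelMetadata table.
STRING_LABELS = {
    "canonicalName": "org.amd.silogen.model.canonicalName",
    "source": "org.amd.silogen.model.source",
    "title": "org.amd.silogen.title",
    "descriptionFull": "org.amd.silogen.description.full",
    "releaseNotes": "org.amd.silogen.release.notes",
}
LIST_LABELS = {
    "tags": "org.amd.silogen.model.tags",
    "versions": "org.amd.silogen.model.versions",
    "variants": "org.amd.silogen.model.variants",
}
TOKEN_LABEL = "org.amd.silogen.hfToken.required"
DEPLOYMENTS_LABEL = "org.amd.silogen.model.recommendedDeployments"


def parse_model_metadata(labels: dict) -> dict:
    """Extract model metadata from image labels (hypothetical helper)."""
    meta = {}
    for field, key in STRING_LABELS.items():
        if key in labels:
            meta[field] = labels[key]
    for field, key in LIST_LABELS.items():
        if key in labels:
            # Comma-separated label values become string arrays.
            meta[field] = [v.strip() for v in labels[key].split(",") if v.strip()]
    if TOKEN_LABEL in labels:
        # Assumption: the label carries "true" or "false" as text.
        meta["hfTokenRequired"] = labels[TOKEN_LABEL].strip().lower() == "true"
    if DEPLOYMENTS_LABEL in labels:
        # The label holds a JSON array of RecommendedDeployment objects.
        meta["recommendedDeployments"] = json.loads(labels[DEPLOYMENTS_LABEL])
    return meta


if __name__ == "__main__":
    sample = {
        "org.amd.silogen.model.canonicalName": "mistralai/Mixtral-8x22B-Instruct-v0.1",
        "org.amd.silogen.model.tags": "text-generation,chat,instruction",
    }
    print(parse_model_metadata(sample))
```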

ModelSourceFilter#

ModelSourceFilter defines a pattern for discovering images. Supports multiple formats:

Repository patterns: “org/repo*” - matches repositories with wildcards
Repository with tag: “org/repo:1.0.0” - exact tag match
Full URI: “ghcr.io/org/repo:1.0.0” - overrides registry and tag
Full URI with wildcard: “ghcr.io/org/repo*” - overrides registry, matches pattern

Appears in:

AIMClusterModelSourceSpec

Field	Description	Validation
`image` string	Image pattern with wildcard and full URI support. Supported formats: - Repository pattern: “amdenterpriseai/aim-*” - Repository with tag: “silogen/aim-llama:1.0.0” (overrides versions field) - Full URI: “ghcr.io/silogen/aim-google-gemma-3-1b-it:0.8.1-rc1” (overrides spec.registry and versions) - Full URI with wildcard: “ghcr.io/silogen/aim-*” (overrides spec.registry) When a full URI is specified (including registry like ghcr.io), only images from that registry will match. When a tag is included, it takes precedence over the versions field. Wildcard: * matches any sequence of characters.	MaxLength: 512
`exclude` string array	Exclude lists specific repository names to skip (exact match on repository name only, not registry). Useful for excluding base images or experimental versions. Examples: - [“amdenterpriseai/aim-base”, “amdenterpriseai/aim-experimental”] - [“silogen/aim-base”] - works with “ghcr.io/silogen/aim-*” (registry is not checked in exclusion) Note: Exclusions match against repository names (e.g., “silogen/aim-base”), not full URIs.	Optional: {}
`versions` string array	Versions specifies semantic version constraints for this filter. If specified, overrides the global Versions field. Only tags that parse as valid semver are considered (including prereleases like 0.8.1-rc1). Ignored if the Image field includes an explicit tag (e.g., “repo:1.0.0”). Examples: “>=1.0.0”, “<2.0.0”, “~1.2.0” (patch updates), “^1.0.0” (minor updates). Prerelease versions (e.g., 0.8.1-rc1) are supported and follow semver rules: - 0.8.1-rc1 matches “>=0.8.0” (prerelease is part of version 0.8.1) - Use “>=0.8.1-rc1” to match only that prerelease or higher - Leave empty to match all tags (including prereleases and non-semver tags)	Optional: {}
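The image pattern rules above can be sketched as a small parser and matcher. This is an illustration under stated assumptions, not the controller's code: the registry is detected with the common container-reference heuristic (the first path component contains a dot or colon, or is `localhost`), and the names `parse_image_filter`, `wildcard_match`, and `filter_matches` are hypothetical. Version constraints are left out.

```python
import re


def parse_image_filter(image: str):
    """Split an image pattern into (registry, repository pattern, tag)."""
    registry, rest = None, image
    first, sep, remainder = image.partition("/")
    # Assumed heuristic: a first component such as "ghcr.io" names a registry.
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, rest = first, remainder
    repository, tag = rest, None
    # Only a colon in the last path segment introduces a tag.
    if ":" in rest.rsplit("/", 1)[-1]:
        repository, tag = rest.rsplit(":", 1)
    return registry, repository, tag


def wildcard_match(pattern: str, value: str) -> bool:
    """Match where * stands for any sequence of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def filter_matches(image: str, exclude: list, registry: str, repository: str) -> bool:
    """Check one candidate (registry, repository) against a filter."""
    want_registry, pattern, _tag = parse_image_filter(image)
    if want_registry is not None and registry != want_registry:
        return False
    # Exclusions compare repository names only; the registry is not checked.
    if repository in exclude:
        return False
    return wildcard_match(pattern, repository)


if __name__ == "__main__":
    print(parse_image_filter("ghcr.io/silogen/aim-google-gemma-3-1b-it:0.8.1-rc1"))
    print(filter_matches("ghcr.io/silogen/aim-*", ["silogen/aim-base"],
                         "ghcr.io", "silogen/aim-llama"))
```

Building the regular expression from escaped literal parts keeps `*` as the only special character, which matches the wildcard rule stated for the `image` field.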

OCIMetadata#

OCIMetadata contains standard OCI image metadata extracted from image labels.

Appears in:

ImageMetadata

Field	Description	Validation
`title` string	Title is the human-readable title. Extracted from: org.opencontainers.image.title	Optional: {}
`description` string	Description is a brief description. Extracted from: org.opencontainers.image.description	Optional: {}
`licenses` string	Licenses is the SPDX license identifier(s). Extracted from: org.opencontainers.image.licenses	Optional: {}
`vendor` string	Vendor is the organization that produced the image. Extracted from: org.opencontainers.image.vendor	Optional: {}
`authors` string	Authors is contact details of the authors. Extracted from: org.opencontainers.image.authors	Optional: {}
`source` string	Source is the URL to the source code repository. Extracted from: org.opencontainers.image.source	Optional: {}
`documentation` string	Documentation is the URL to documentation. Extracted from: org.opencontainers.image.documentation	Optional: {}
`created` string	Created is the creation timestamp. Extracted from: org.opencontainers.image.created	Optional: {}
`revision` string	Revision is the source control revision. Extracted from: org.opencontainers.image.revision	Optional: {}
`version` string	Version is the image version. Extracted from: org.opencontainers.image.version	Optional: {}
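Because every OCIMetadata field maps to a label under the `org.opencontainers.image.` prefix, the extraction can be sketched in a few lines. The function name `parse_oci_metadata` is hypothetical, and omitting fields whose labels are absent is an assumption that mirrors the fields being optional.

```python
OCI_PREFIX = "org.opencontainers.image."
OCI_FIELDS = (
    "title", "description", "licenses", "vendor", "authors",
    "source", "documentation", "created", "revision", "version",
)


def parse_oci_metadata(labels: dict) -> dict:
    """Copy the standard OCI labels into an OCIMetadata-shaped dict."""
    return {
        field: labels[OCI_PREFIX + field]
        for field in OCI_FIELDS
        if OCI_PREFIX + field in labels
    }


if __name__ == "__main__":
    print(parse_oci_metadata({
        "org.opencontainers.image.title": "Example image",
        "org.opencontainers.image.licenses": "MIT",
        "unrelated.label": "ignored",
    }))
```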

RecommendedDeployment#

RecommendedDeployment describes a recommended deployment configuration for a model.

Appears in:

ModelMetadata

Field	Description	Validation
`gpuModel` string	GPUModel is the GPU model name (e.g., MI300X, MI325X)	Optional: {}
`gpuCount` integer	GPUCount is the number of GPUs required	Optional: {}
`precision` string	Precision is the recommended precision (e.g., fp8, fp16, bf16)	Optional: {}
`metric` string	Metric is the optimization target (e.g., latency, throughput)	Optional: {}
`profileId` string	ProfileId is the unique identifier of the AIM profile for this deployment. When set, templates created from this deployment will use this profile ID to deterministically select the correct runtime profile in the AIM container.	Optional: {}
`description` string	Description provides additional context about this deployment configuration	Optional: {}

RuntimeConfigRef#

Appears in:

AIMArtifactSpec
AIMClusterServiceTemplateSpec
AIMModelSpec
AIMServiceSpec
AIMServiceTemplateSpec
AIMServiceTemplateSpecCommon
AIMTemplateCacheSpec

Field	Description	Default	Validation
`runtimeConfigName` string	Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both as a namespace and a cluster runtime config, the values are merged together, the namespace config taking priority over the cluster config when there are conflicts. If this field is empty or set to `default`, the namespace / cluster runtime config with the name `default` is used, if it exists.		Optional: {}
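The lookup and merge behavior described for `runtimeConfigName` can be sketched as follows. This assumes a recursive merge of nested values, which the reference does not specify; the function names and the config keys in the usage example are hypothetical.

```python
def merge_configs(cluster: dict, namespace: dict) -> dict:
    """Merge two config dicts, with namespace values winning on conflict."""
    merged = dict(cluster)
    for key, value in namespace.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested objects are merged rather than replaced.
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_runtime_config(name: str, namespace_configs: dict, cluster_configs: dict):
    """Resolve the effective runtime config for a RuntimeConfigRef."""
    # An empty name falls back to the config called "default".
    effective_name = name or "default"
    namespace = namespace_configs.get(effective_name)
    cluster = cluster_configs.get(effective_name)
    if namespace is None and cluster is None:
        return None
    return merge_configs(cluster or {}, namespace or {})


if __name__ == "__main__":
    cluster_configs = {"default": {"routing": {"enabled": True, "gateway": "a"}}}
    namespace_configs = {"default": {"routing": {"gateway": "b"}}}
    print(resolve_runtime_config("", namespace_configs, cluster_configs))
```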
