API Reference#
Packages#
aim.eai.amd.com/v1alpha1#
Package v1alpha1 contains API Schema definitions for the aim v1alpha1 API group.
Resource Types#
AIMArtifact#
AIMArtifact is the Schema for the artifacts API
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
AIMArtifactConfig#
AIMArtifactConfig controls artifact-level defaults that are not appropriate for individual services. These settings apply at namespace/cluster scope only.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | DefaultRetentionPriority sets the default retention priority for AIMArtifacts | | Minimum: 0 |
AIMArtifactList#
AIMArtifactList contains a list of AIMArtifact
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMArtifactMode#
Underlying type: string
AIMArtifactMode indicates the ownership mode of an artifact, derived from owner references.
Validation:
Enum: [Dedicated Shared]
Appears in:
| Field | Description |
|---|---|
| | ArtifactModeDedicated indicates the cache has owner references and will be |
| | ArtifactModeShared indicates the cache has no owner references and persists |
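The derivation of the mode from owner references can be sketched as follows. This is a minimal illustration, not the controller's code: the helper name `artifact_mode` is hypothetical, and the only thing it relies on is the standard Kubernetes `metadata.ownerReferences` list.

```python
from typing import Any, Mapping


def artifact_mode(metadata: Mapping[str, Any]) -> str:
    """Return "Dedicated" when owner references exist, otherwise "Shared".

    Hypothetical helper: mirrors the description above, where the mode is
    derived purely from the presence of owner references.
    """
    owner_refs = metadata.get("ownerReferences") or []
    return "Dedicated" if owner_refs else "Shared"


owned = {"name": "cache-a", "ownerReferences": [{"kind": "AIMService", "name": "svc"}]}
unowned = {"name": "cache-b"}
print(artifact_mode(owned), artifact_mode(unowned))
```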
AIMArtifactSpec#
AIMArtifactSpec defines the desired state of AIMArtifact
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | SourceURI specifies the source location of the model to download. | | MinLength: 1 |
| | ModelID is the canonical identifier in {org}/{name} format. | | Pattern: |
| | StorageClassName specifies the storage class for the cache volume. | | Optional: {} |
| | Size specifies the size of the cache volume | | Optional: {} |
| | Env lists the environment variables to use for authentication when downloading models. | | Optional: {} |
| | ModelDownloadImage specifies the container image used to download and initialize the artifact. | | Optional: {} |
| | DownloadFilter controls which files are included or excluded when downloading from Hugging Face. | | Optional: {} |
| | ImagePullSecrets references secrets for pulling AIM container images. | | Optional: {} |
| | RetentionPriority marks this artifact as eligible for automatic eviction | | Minimum: 0 |
| | Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both | | Optional: {} |
AIMArtifactStatus#
AIMArtifactStatus defines the observed state of AIMArtifact
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | | | |
| | Conditions represent the latest available observations of the artifact’s state | | |
| | Status represents the current status of the artifact | Pending | Enum: [Pending Progressing Ready Degraded Failed NotAvailable] |
| | Progress represents the download progress when Status is Progressing | | Optional: {} |
| | Download represents the current download attempt state, patched by the downloader pod. | | Optional: {} |
| | DisplaySize is the human-readable effective size (spec or discovered) | | Optional: {} |
| | LastUsed represents the last time a model was deployed that used this cache | | |
| | PersistentVolumeClaim represents the name of the created PVC | | |
| | Mode indicates the ownership mode of this artifact, derived from owner references. | | Enum: [Dedicated Shared] |
| | DiscoveredSizeBytes is the model size discovered via check-size job. | | Optional: {} |
| | AllocatedSize is the actual PVC size requested (including headroom). | | Optional: {} |
| | HeadroomPercent is the headroom percentage that was applied to the PVC size. | | Optional: {} |
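The relationship between the discovered size, the headroom percentage, and the allocated size can be illustrated with a small calculation. The exact formula and rounding used by the controller are not stated in this reference, so the sketch below assumes the simplest reading (discovered size plus a percentage, rounded up); `allocated_bytes` is a hypothetical name.

```python
def allocated_bytes(discovered_size_bytes: int, headroom_percent: int) -> int:
    """Return the discovered size plus headroom, rounded up to a whole byte.

    Assumption: headroom is applied as a simple percentage of the discovered
    size. The real controller may round to storage units or apply minimums.
    """
    if discovered_size_bytes < 0 or headroom_percent < 0:
        raise ValueError("size and headroom must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    extra = (discovered_size_bytes * headroom_percent + 99) // 100
    return discovered_size_bytes + extra


print(allocated_bytes(1000, 10))
```

Under these assumptions, a 1000-byte model with 10 percent headroom would request 1100 bytes.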
AIMArtifactStorageQuota#
AIMArtifactStorageQuota configures storage limits for AIMArtifacts. These settings are only available on AIMClusterRuntimeConfig (cluster-scoped) because they enforce cluster-wide and cross-namespace policies.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ClusterLimit is the maximum total allocated storage for all AIMArtifacts cluster-wide. | | Optional: {} |
| | DefaultNamespaceLimit is the default maximum allocated storage for AIMArtifacts per namespace. | | Optional: {} |
AIMCachingMode#
Underlying type: string
AIMCachingMode controls caching behavior for a service. Canonical values are Dedicated and Shared. Legacy values are accepted for backward compatibility:
- Always maps to Shared
- Auto maps to Shared
- Never maps to Dedicated
Validation:
Enum: [Dedicated Shared Auto Always Never]
Appears in:
| Field | Description |
|---|---|
| | CachingModeDedicated always creates service-owned dedicated caches/artifacts. |
| | CachingModeShared reuses and creates shared caches/artifacts. |
| | CachingModeAuto is a deprecated legacy value that maps to Shared. |
| | CachingModeAlways is a deprecated legacy value that maps to Shared. |
| | CachingModeNever is a deprecated legacy value that maps to Dedicated. |
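The mapping from legacy to canonical values listed above can be expressed as a small lookup. This is an illustrative sketch; the function name `normalize_caching_mode` is hypothetical, and only the five enum values come from this reference.

```python
CANONICAL_MODES = {"Dedicated", "Shared"}

# Legacy values and their canonical equivalents, as listed in this section.
LEGACY_MODES = {
    "Always": "Shared",
    "Auto": "Shared",
    "Never": "Dedicated",
}


def normalize_caching_mode(mode: str) -> str:
    """Map any accepted caching mode to its canonical value."""
    if mode in CANONICAL_MODES:
        return mode
    if mode in LEGACY_MODES:
        return LEGACY_MODES[mode]
    raise ValueError(f"unsupported caching mode: {mode!r}")


for value in ("Dedicated", "Shared", "Auto", "Always", "Never"):
    print(value, "->", normalize_caching_mode(value))
```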
AIMClusterModel#
AIMClusterModel is a cluster-scoped model catalog entry for AIM container images.
Cluster-scoped models can be referenced by AIMServices in any namespace, making them ideal for shared model deployments across teams and projects. Like namespace-scoped AIMModels, cluster models trigger discovery jobs to extract metadata and generate service templates.
When both cluster and namespace models exist for the same container image, services will preferentially use the namespace-scoped AIMModel when referenced by image URI.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
AIMClusterModelList#
AIMClusterModelList contains a list of AIMClusterModel.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMClusterModelSource#
AIMClusterModelSource automatically discovers and syncs AI model images from container registries.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMClusterModelSourceList#
AIMClusterModelSourceList contains a list of AIMClusterModelSource.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMClusterModelSourceSpec#
AIMClusterModelSourceSpec defines the desired state of AIMClusterModelSource.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Registry to sync from (e.g., docker.io, ghcr.io, gcr.io). | docker.io | Optional: {} |
| | ImagePullSecrets contains references to secrets for authenticating to private registries. | | Optional: {} |
| | Filters define which images to discover and sync. | | MaxItems: 100 |
| | SyncInterval defines how often to sync with the registry. | 1h | Optional: {} |
| | Versions specifies global semantic version constraints applied to all filters. | | Optional: {} |
| | MaxModels is the maximum number of AIMClusterModel resources to create from this source. | 100 | Maximum: 10000 |
AIMClusterModelSourceStatus#
AIMClusterModelSourceStatus defines the observed state of AIMClusterModelSource.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Status represents the overall state of the model source. | | Enum: [Pending Starting Progressing Ready Running Degraded NotAvailable Failed] |
| | LastSyncTime is the timestamp of the last successful registry sync. | | Optional: {} |
| | DiscoveredModels is the count of AIMClusterModel resources managed by this source. | | Optional: {} |
| | AvailableModels is the total count of images discovered in the registry that match the filters. | | Optional: {} |
| | ModelsLimitReached indicates whether the maxModels limit has been reached. | | Optional: {} |
| | Conditions represent the latest available observations of the source’s state. | | Optional: {} |
| | ObservedGeneration reflects the generation of the most recently observed spec. | | Optional: {} |
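How the three counters relate to the maxModels limit can be sketched as below. This is an assumption-laden illustration: the function `summarize_sync`, the dictionary keys, and the choice to keep the first matching images are all hypothetical, and the reference does not say which images are kept when the limit is exceeded.

```python
from typing import Dict, Sequence, Union


def summarize_sync(matching_images: Sequence[str], max_models: int = 100) -> Dict[str, Union[int, bool]]:
    """Summarize one sync pass against the maxModels limit.

    Assumptions: models are created for the first `max_models` matching
    images, and the limit counts as reached once that many models exist.
    """
    managed = list(matching_images[:max_models])
    return {
        "available_models": len(matching_images),   # images matching the filters
        "discovered_models": len(managed),          # models managed by the source
        "models_limit_reached": len(managed) >= max_models,
    }


print(summarize_sync(["repo/a:1.0", "repo/b:1.0", "repo/c:1.0"], max_models=2))
```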
AIMClusterRuntimeConfig#
AIMClusterRuntimeConfig is a cluster-scoped runtime configuration for AIM services, models, and templates.
Cluster-scoped runtime configs provide platform-wide defaults that apply to all namespaces, making them ideal for organization-level policies such as storage classes, discovery behavior, model creation scope, and routing configuration.
When both cluster and namespace runtime configs exist with the same name, the configs are merged, and the namespace-scoped AIMRuntimeConfig takes precedence for any field that is set in both.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
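The merge rule described for this resource (cluster and namespace configs with the same name are merged, and the namespace value wins for any field set in both) can be sketched as a recursive dictionary merge. Whether the controller merges nested structures field by field is an assumption here, and the keys in the usage example are purely illustrative.

```python
from typing import Any, Dict, Optional


def merge_runtime_config(
    cluster: Optional[Dict[str, Any]],
    namespace: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge two config specs; namespace values win where both are set.

    Assumption: nested mappings are merged recursively and a value of None
    means "not set". The real controller may merge at a different depth.
    """
    merged: Dict[str, Any] = dict(cluster or {})
    for key, ns_value in (namespace or {}).items():
        if ns_value is None:
            continue  # unset in the namespace config, keep the cluster value
        cluster_value = merged.get(key)
        if isinstance(cluster_value, dict) and isinstance(ns_value, dict):
            merged[key] = merge_runtime_config(cluster_value, ns_value)
        else:
            merged[key] = ns_value
    return merged


# Illustrative keys only; they are not the serialized field names.
cluster_cfg = {"storage": {"class": "fast", "headroom": 10}, "routing": {"enabled": True}}
namespace_cfg = {"storage": {"headroom": 25}}
print(merge_runtime_config(cluster_cfg, namespace_cfg))
```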
AIMClusterRuntimeConfigList#
AIMClusterRuntimeConfigList contains a list of AIMClusterRuntimeConfig.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMClusterRuntimeConfigSpec#
AIMClusterRuntimeConfigSpec defines cluster-wide defaults for AIM resources.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Storage configures storage defaults for this service’s PVCs and caches. | | Optional: {} |
| | Routing controls HTTP routing configuration for this service. | | Optional: {} |
| | Env specifies environment variables for inference containers. | | Optional: {} |
| | Model controls model creation and discovery defaults. | | Optional: {} |
| | Artifact controls artifact-level defaults such as eviction policy. | | Optional: {} |
| | LabelPropagation controls how labels from parent AIM resources are propagated to child resources. | | Optional: {} |
| | DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. | | Optional: {} |
| | DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. | | Optional: {} |
| | ArtifactStorageQuota configures storage limits for AIMArtifacts. | | Optional: {} |
AIMClusterServiceTemplate#
AIMClusterServiceTemplate is a cluster-scoped template that defines runtime profiles for AIM services.
Cluster-scoped templates can be used by AIMServices in any namespace, making them ideal for platform-wide model configurations that should be shared across teams and projects. Unlike namespace-scoped AIMServiceTemplates, cluster templates do not support caching configuration and must be managed by cluster administrators, since caches themselves are namespace-scoped.
When both cluster and namespace templates exist with the same name, the namespace-scoped template takes precedence for services in that namespace.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
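The precedence rule for templates with the same name (the namespace-scoped template wins for services in that namespace) amounts to a two-step lookup. The sketch below is a hypothetical illustration using in-memory dictionaries in place of API lookups; the function name and data shapes are assumptions, and the returned scope strings reuse the Namespace and Cluster values of AIMResolutionScope.

```python
from typing import Any, Dict, Optional, Tuple

Template = Dict[str, Any]


def resolve_template(
    name: str,
    namespace: str,
    namespaced: Dict[Tuple[str, str], Template],
    cluster: Dict[str, Template],
) -> Optional[Tuple[str, Template]]:
    """Return (scope, template), preferring the namespace-scoped template.

    `namespaced` is keyed by (namespace, name); `cluster` is keyed by name.
    Returns None when no template with that name exists in either scope.
    """
    template = namespaced.get((namespace, name))
    if template is not None:
        return "Namespace", template
    template = cluster.get(name)
    if template is not None:
        return "Cluster", template
    return None


namespaced = {("team-a", "llama-latency"): {"metric": "latency", "owner": "team-a"}}
cluster = {"llama-latency": {"metric": "latency", "owner": "platform"}}
print(resolve_template("llama-latency", "team-a", namespaced, cluster))
print(resolve_template("llama-latency", "team-b", namespaced, cluster))
```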
AIMClusterServiceTemplateList#
AIMClusterServiceTemplateList contains a list of AIMClusterServiceTemplate.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMClusterServiceTemplateSpec#
AIMClusterServiceTemplateSpec defines the desired state of AIMClusterServiceTemplate (cluster-scoped).
A cluster-scoped template that selects a runtime profile for a given AIM model.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ModelName is the model name. Matches | | MinLength: 1 |
| | Metric selects the optimization goal. | | Enum: [latency throughput] |
| | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| | Hardware specifies GPU and CPU requirements for each replica. | | Optional: {} |
| | Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both | | Optional: {} |
| | AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). | | Optional: {} |
| | ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). | | Optional: {} |
| | CustomProfile defines inline custom profile data for the inference engine. | | Optional: {} |
| | ImagePullSecrets lists secrets containing credentials for pulling container images. | | Optional: {} |
| | ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. | | Optional: {} |
| | Resources defines the default container resource requirements applied to services derived from this template. | | Optional: {} |
| | ModelSources specifies the model sources required to run this template. | | Optional: {} |
| | ProfileId is the specific AIM profile ID that this template should use. | | Optional: {} |
| | Type indicates the optimization level of this template. | | Enum: [optimized preview unoptimized] |
| | Env specifies environment variables for inference containers. | | Optional: {} |
AIMCpuRequirements#
AIMCpuRequirements specifies CPU resource requirements.
Appears in:
AIMCustomModelSpec#
AIMCustomModelSpec contains configuration for custom models. These fields are only used when modelSources is specified (custom models). For image-based models, these settings come from discovery.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Hardware specifies default hardware requirements for all templates. | | Optional: {} |
| | Type specifies default type for all templates. | | Enum: [optimized preview unoptimized] |
AIMCustomProfile#
AIMCustomProfile defines inline custom profile data for user-provided inference engine configuration. When set on a template, the controller assembles a profile YAML, creates a ConfigMap, and mounts it into both the discovery job and the inference service container.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | EngineArgs contains inference engine arguments as a free-form JSON object. | | Schemaless: {} |
| | EnvVars contains environment variables applied to the inference engine process. | | Optional: {} |
AIMCustomTemplate#
AIMCustomTemplate defines a custom template configuration for a model. When modelSources are specified directly on AIMModel, customTemplates allow defining explicit hardware requirements and profiles, skipping the discovery job. This is an existing struct (not a CRD); it appears as an element of AIMModel.spec.customTemplates[].
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Name is the template name. If not provided, auto-generated from model name + profile. | | MaxLength: 63 |
| | Type indicates the optimization status of this template. | unoptimized | Enum: [optimized preview unoptimized] |
| | Env specifies environment variable overrides when this template is selected. | | MaxItems: 64 |
| | Hardware specifies GPU and CPU requirements for this template. | | Optional: {} |
| | Profile declares runtime profile variables for template selection. | | Optional: {} |
| | AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). | | Optional: {} |
| | ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). | | Optional: {} |
| | CustomProfile defines inline custom profile data for the inference engine. | | Optional: {} |
AIMDiscoveryProfileMetadata#
AIMDiscoveryProfileMetadata describes the characteristics of a discovered deployment profile.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Engine identifies the inference engine used for this profile (e.g., “vllm”, “tgi”). | | Optional: {} |
| | GPU specifies the GPU model this profile is optimized for (e.g., “MI300X”, “MI325X”). | | Optional: {} |
| | GPUCount indicates how many GPUs are required per replica for this profile. | | Optional: {} |
| | Metric indicates the optimization goal for this profile (“latency” or “throughput”). | | Enum: [latency throughput] |
| | Precision specifies the numeric precision used in this profile (e.g., “fp16”, “fp8”). | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| | Type specifies the optimization level of this profile (optimized, unoptimized, preview). | | Enum: [optimized preview unoptimized] |
AIMDownloadFilter#
AIMDownloadFilter controls which files are included or excluded during artifact downloads. Patterns use fnmatch-style glob syntax applied against relative file paths in the repository. Both the size estimator and downloader apply the same filter, ensuring PVC sizing matches the actual download.
Filter order (matching huggingface_hub behavior):
1. Include: if set, only files matching at least one include pattern are considered
2. Exclude: files matching any exclude pattern are then removed
When no filter is configured (neither on the artifact nor in the runtime config), subdirectory files are excluded by default (equivalent to exclude: ["*/*"]). To download all files including subdirectories, set an empty filter: downloadFilter: {}.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Include specifies glob patterns for files to download. | | Optional: {} |
| | Exclude specifies glob patterns for files to skip. | | Optional: {} |
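The filter order described in this section (include first, then exclude, with fnmatch-style globs on relative paths) can be sketched with Python's standard `fnmatch` module. This is an illustration rather than the downloader's implementation: the function name, the dictionary shape, and the assumption that the default exclusion corresponds to the pattern `*/*` are all hypothetical, and an empty or missing pattern list is treated as unset.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Mapping, Optional, Sequence

# Assumed default when no filter is configured: skip files in subdirectories.
DEFAULT_FILTER: Mapping[str, Sequence[str]] = {"exclude": ["*/*"]}


def select_files(
    paths: Iterable[str],
    download_filter: Optional[Mapping[str, Sequence[str]]] = None,
) -> List[str]:
    """Apply include patterns first, then exclude patterns.

    None means "no filter configured" (default applies); an empty mapping
    means "download everything". fnmatch's `*` also matches `/`.
    """
    if download_filter is None:
        download_filter = DEFAULT_FILTER
    include = download_filter.get("include")
    exclude = download_filter.get("exclude")
    selected = []
    for path in paths:
        if include and not any(fnmatchcase(path, p) for p in include):
            continue  # include is set and nothing matched
        if exclude and any(fnmatchcase(path, p) for p in exclude):
            continue  # removed by an exclude pattern
        selected.append(path)
    return selected


files = ["config.json", "model.safetensors", "original/consolidated.pth"]
print(select_files(files))        # default filter: top-level files only
print(select_files(files, {}))    # empty filter: everything
```

Because the size estimator and the downloader are described as applying the same filter, a single function like this would be called by both in such a design, so PVC sizing and the download agree.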
AIMGpuRequirements#
AIMGpuRequirements specifies GPU resource requirements.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Requests is the number of GPUs to set as requests/limits. | | Minimum: 0 |
| | Model limits deployment to a specific GPU model. | | MaxLength: 64 |
| | MinVRAM limits deployment to GPUs having at least this much VRAM. | | Optional: {} |
| | ResourceName is the Kubernetes resource name for GPU resources. | amd.com/gpu | Optional: {} |
AIMHardwareRequirements#
AIMHardwareRequirements specifies compute resource requirements for custom models. Used in AIMModelSpec and AIMCustomTemplate to define GPU and CPU needs.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | GPU specifies GPU requirements. If not set, no GPUs are requested (CPU-only model). | | Optional: {} |
| | CPU specifies CPU requirements. | | Optional: {} |
AIMMetric#
Underlying type: string
AIMMetric enumerates the targeted service characteristic
Validation:
Enum: [latency throughput]
Appears in:
Field |
Description |
|---|---|
|
|
|
AIMModel#
AIMModel is the schema for namespace-scoped AIM model catalog entries.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
AIMModelConfig#
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | AutoDiscovery controls whether models run discovery by default. | | Optional: {} |
AIMModelDiscoveryConfig#
AIMModelDiscoveryConfig controls discovery behavior for a model.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ExtractMetadata controls whether metadata extraction runs for this model. | true | Optional: {} |
| | CreateServiceTemplates controls whether (cluster) service templates are auto-created from the image metadata. | true | Optional: {} |
AIMModelList#
AIMModelList contains a list of AIMModel.
AIMModelSource#
AIMModelSource describes a model artifact that must be downloaded for inference. Discovery extracts these from the container’s configuration to enable caching and validation.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ModelID is the canonical identifier in {org}/{name} format. | | Pattern: |
| | SourceURI is the location from which the model should be downloaded. | | Pattern: |
| | Size is the expected storage space required for this model artifact. | | Optional: {} |
| | Env specifies per-source credential overrides. | | Optional: {} |
AIMModelSourceType#
Underlying type: string
AIMModelSourceType indicates how a model’s artifacts are sourced.
Validation:
Enum: [Image Custom]
Appears in:
| Field | Description |
|---|---|
| | AIMModelSourceTypeImage indicates the model is discovered from container image labels. |
| | AIMModelSourceTypeCustom indicates the model uses explicit spec.modelSources. |
AIMModelSpec#
AIMModelSpec defines the desired state of AIMModel.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Image is the container image URI for this AIM model. | | MinLength: 1 |
| | Discovery controls discovery behavior for this model. | | Optional: {} |
| | DefaultServiceTemplate specifies the default AIMServiceTemplate to use when creating services for this model. | | Optional: {} |
| | Custom contains configuration for custom models (models with inline modelSources). | | Optional: {} |
| | CustomTemplates defines explicit template configurations for this model. | | MaxItems: 16 |
| | ModelSources specifies the model sources to use for this model. | | MaxItems: 1 |
| | Name is the name of the runtime config to use for this resource. If a runtime config with this name exists as both a namespace and a cluster runtime config, the values are merged, with the namespace config taking priority. | | Optional: {} |
| | ImagePullSecrets lists secrets containing credentials for pulling the model container image. | | Optional: {} |
| | Env specifies environment variables for authentication during model discovery and metadata extraction. | | Optional: {} |
| | ServiceAccountName specifies the Kubernetes service account to use for workloads related to this model. | | Optional: {} |
| | Resources defines the default resource requirements for services using this model. | | Optional: {} |
| | ImageMetadata is the metadata that is used to determine which recommended service templates to create, | | |
AIMModelStatus#
AIMModelStatus defines the observed state of AIMModel.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ObservedGeneration is the most recent generation observed by the controller | | |
| | Status represents the overall status of the image based on its templates | Pending | Enum: [Pending Progressing Ready Degraded Failed NotAvailable] |
| | Conditions represent the latest available observations of the model’s state | | |
| | ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. | | Optional: {} |
| | ImageMetadata is the metadata extracted from an AIM image | | Optional: {} |
| | SourceType indicates how this model’s artifacts are sourced. | | Enum: [Image Custom] |
AIMPrecision#
Underlying type: string
AIMPrecision enumerates supported numeric precisions
Validation:
Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
Appears in:
Field |
Description |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
AIMProfile#
AIMProfile contains the cached discovery results for a template. This is the processed and validated version of AIMDiscoveryProfile that is stored in the template’s status after successful discovery.
The profile serves as a cache of runtime configuration, eliminating the need to re-run discovery for each service that uses this template. Services and caching mechanisms reference this cached profile for deployment parameters and model sources.
See discovery.go for AIMDiscoveryProfile (the raw discovery output) and the relationship between these types.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | EngineArgs contains runtime-specific engine configuration as a free-form JSON object. | | Schemaless: {} |
| | EnvVars contains environment variables required by the runtime for this profile. | | Optional: {} |
| | Refer to Kubernetes API documentation for fields of | | |
| | OriginalDiscoveryOutput contains the raw discovery job JSON output. | | Schemaless: {} |
AIMProfileMetadata#
AIMProfileMetadata describes the characteristics of a cached deployment profile. This is identical to AIMDiscoveryProfileMetadata but exists in the template status namespace.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Engine identifies the inference engine used for this profile (e.g., “vllm”, “tgi”). | | Optional: {} |
| | GPU specifies the GPU model this profile is optimized for (e.g., “MI300X”, “MI325X”). | | Optional: {} |
| | GPUCount indicates how many GPUs are required per replica for this profile. | | Optional: {} |
| | Metric indicates the optimization goal for this profile (“latency” or “throughput”). | | Enum: [latency throughput] |
| | Precision specifies the numeric precision used in this profile (e.g., “fp16”, “fp8”). | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| | Type indicates the optimization level of this profile (optimized, preview, unoptimized). | | Enum: [optimized preview unoptimized] |
AIMProfileType#
Underlying type: string
AIMProfileType indicates the optimization level of a deployment profile.
Validation:
Enum: [optimized preview unoptimized]
Appears in:
| Field | Description |
|---|---|
| | AIMProfileTypeOptimized indicates the profile has been fully optimized. |
| | AIMProfileTypePreview indicates the profile is in preview/beta state. |
| | AIMProfileTypeUnoptimized indicates the profile has not been optimized. |
AIMResolutionScope#
Underlying type: string
AIMResolutionScope describes the scope of a resolved reference.
Validation:
Enum: [Namespace Cluster Merged Unknown]
Appears in:
| Field | Description |
|---|---|
| | AIMResolutionScopeNamespace denotes a namespace-scoped resource. |
| | AIMResolutionScopeCluster denotes a cluster-scoped resource. |
| | AIMResolutionScopeMerged denotes that both cluster and namespace configs were merged. |
| | AIMResolutionScopeUnknown denotes that the scope could not be determined. |
AIMResolvedArtifact#
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | UID of the AIMArtifact resource | | |
| | Name of the AIMArtifact resource | | |
| | Model is the name of the model that is cached | | |
| | Status of the artifact | | |
| | PersistentVolumeClaim name if available | | |
| | MountPoint is the mount point for the artifact | | |
AIMResolvedReference#
AIMResolvedReference captures metadata about a resolved reference.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Name is the resource name that satisfied the reference. | | |
| | Namespace identifies where the resource was found when namespace-scoped. | | |
| | Scope indicates whether the resolved resource was namespace or cluster scoped. | | Enum: [Namespace Cluster Merged Unknown] |
| | Kind is the fully-qualified kind of the resolved reference, when known. | | Optional: {} |
| | UID captures the unique identifier of the resolved reference, when known. | | Optional: {} |
AIMRuntimeConfig#
AIMRuntimeConfig is the schema for namespace-scoped AIM runtime configurations.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
AIMRuntimeConfigCommon#
AIMRuntimeConfigCommon captures configuration fields shared across cluster and namespace scopes. These settings apply to both AIMRuntimeConfig (namespace-scoped) and AIMClusterRuntimeConfig (cluster-scoped). It embeds AIMServiceRuntimeConfig which contains fields that can also be overridden at the service level.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Storage configures storage defaults for this service’s PVCs and caches. | | Optional: {} |
| | Routing controls HTTP routing configuration for this service. | | Optional: {} |
| | Env specifies environment variables for inference containers. | | Optional: {} |
| | Model controls model creation and discovery defaults. | | Optional: {} |
| | Artifact controls artifact-level defaults such as eviction policy. | | Optional: {} |
| | LabelPropagation controls how labels from parent AIM resources are propagated to child resources. | | Optional: {} |
| | DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. | | Optional: {} |
| | DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. | | Optional: {} |
AIMRuntimeConfigLabelPropagationSpec#
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Enabled, if true, allows propagating parent labels to all child resources it creates directly | false | Optional: {} |
| | Match is a list of label keys that will be propagated to any child resources created. | | Optional: {} |
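The interaction between Enabled and Match can be sketched as a filter over the parent's labels. This is a rough illustration under stated assumptions: the function name is hypothetical, keys are compared by exact match (the reference does not say whether patterns are supported), and nothing is propagated when Enabled is false.

```python
from typing import Dict, Iterable, Mapping


def propagated_labels(
    parent_labels: Mapping[str, str],
    enabled: bool = False,
    match: Iterable[str] = (),
) -> Dict[str, str]:
    """Return the subset of parent labels to copy onto a child resource.

    Assumptions: propagation is off unless `enabled` is true, and only keys
    listed exactly in `match` are copied.
    """
    if not enabled:
        return {}
    wanted = set(match)
    return {key: value for key, value in parent_labels.items() if key in wanted}


labels = {"team": "research", "cost-center": "42", "internal/debug": "true"}
print(propagated_labels(labels, enabled=True, match=["team", "cost-center"]))
```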
AIMRuntimeConfigList#
AIMRuntimeConfigList contains a list of AIMRuntimeConfig.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMRuntimeConfigSpec#
AIMRuntimeConfigSpec defines namespace-scoped overrides for AIM resources.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Storage configures storage defaults for this service’s PVCs and caches. | | Optional: {} |
| | Routing controls HTTP routing configuration for this service. | | Optional: {} |
| | Env specifies environment variables for inference containers. | | Optional: {} |
| | Model controls model creation and discovery defaults. | | Optional: {} |
| | Artifact controls artifact-level defaults such as eviction policy. | | Optional: {} |
| | LabelPropagation controls how labels from parent AIM resources are propagated to child resources. | | Optional: {} |
| | DEPRECATED: Use Storage.DefaultStorageClassName instead. This field will be removed in a future version. | | Optional: {} |
| | DEPRECATED: Use Storage.PVCHeadroomPercent instead. This field will be removed in a future version. | | Optional: {} |
AIMRuntimeConfigStatus#
AIMRuntimeConfigStatus records the resolved config reference surfaced to consumers.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | ObservedGeneration is the last reconciled generation. | | |
| | Conditions communicate reconciliation progress. | | |
AIMRuntimeParameters#
AIMRuntimeParameters contains the runtime configuration parameters shared across templates and services. Fields use pointers to allow optional usage in different contexts (required in templates, optional in service overrides).
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Metric selects the optimization goal. | | Enum: [latency throughput] |
| | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| | Hardware specifies GPU and CPU requirements for each replica. | | Optional: {} |
AIMRuntimeRoutingConfig#
AIMRuntimeRoutingConfig configures HTTP routing defaults for inference services. These settings control how Gateway API HTTPRoutes are created and configured.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Enabled controls whether HTTP routing is managed for inference services using this config. | | Optional: {} |
| | GatewayRef specifies the Gateway API Gateway resource that should receive HTTPRoutes. | | Optional: {} |
| | PathTemplate defines the HTTP path template for routes, evaluated using JSONPath expressions. | | Optional: {} |
| | RequestTimeout defines the HTTP request timeout for routes. | | Optional: {} |
| | Annotations defines default annotations to add to all HTTPRoute resources. | | Optional: {} |
AIMService#
AIMService manages a KServe-based AIM inference service for the selected model and template. Note: KServe uses {name}-{namespace} format which must not exceed 63 characters. This constraint is validated at runtime since CEL cannot access metadata.namespace.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
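The name-length constraint noted for this resource (the KServe name in {name}-{namespace} format must not exceed 63 characters) can be checked ahead of time with a few lines of code. This is a client-side sketch with hypothetical function names; it only reproduces the length rule stated above and does not replicate any other validation the controller performs at runtime.

```python
MAX_KSERVE_NAME_LENGTH = 63  # limit stated in the AIMService description


def kserve_name(name: str, namespace: str) -> str:
    """Build the combined name in the {name}-{namespace} format."""
    return f"{name}-{namespace}"


def check_service_name(name: str, namespace: str) -> str:
    """Return the combined name, or raise if it exceeds the length limit."""
    combined = kserve_name(name, namespace)
    if len(combined) > MAX_KSERVE_NAME_LENGTH:
        raise ValueError(
            f"{combined!r} is {len(combined)} characters; "
            f"the limit is {MAX_KSERVE_NAME_LENGTH}"
        )
    return combined


print(check_service_name("llama-3-8b", "team-inference"))
```

Checking this before applying a manifest avoids discovering the problem only at reconcile time, since the constraint cannot be expressed in CEL.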
AIMServiceAutoScaling#
AIMServiceAutoScaling configures KEDA-based autoscaling with custom metrics. This enables automatic scaling based on metrics collected from OpenTelemetry.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | Metrics is a list of metrics to be used for autoscaling. | | Optional: {} |
AIMServiceCacheStatus#
AIMServiceCacheStatus captures cache-related status for an AIMService.
Appears in:
| Field | Description | Default | Validation |
|---|---|---|---|
| | TemplateCacheRef references the TemplateCache being used, if any. | | Optional: {} |
| | RetryAttempts tracks how many times this service has attempted to retry a failed cache. | | Optional: {} |
AIMServiceCachingConfig#
AIMServiceCachingConfig controls caching behavior for a service.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Mode controls when to use caching. |
Shared |
Enum: [Dedicated Shared Auto Always Never] |
AIMServiceList#
AIMServiceList contains a list of AIMService.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMServiceMetricTarget#
AIMServiceMetricTarget defines the target value for a metric. Specifies how the metric value should be interpreted and what target to maintain.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Type specifies how to interpret the metric value. |
Enum: [Value AverageValue Utilization] |
|
|
Value is the target value of the metric (as a quantity). |
Optional: {} |
|
|
AverageValue is the target value of the average of the metric across all relevant pods (as a quantity). |
Optional: {} |
|
|
AverageUtilization is the target value of the average of the resource metric across all relevant pods, |
Optional: {} |
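The relationship between Type and the three value fields can be illustrated with a small lookup. This is a sketch only: it assumes each Type selects exactly one field (as in the Kubernetes `autoscaling/v2` MetricTarget that this type resembles), and the camelCase keys are assumed serialized names that this reference does not confirm.

```python
# Assumed mapping from the documented Type enum to the field carrying the target.
# The dictionary keys on the right are hypothetical serialized field names.
TARGET_FIELD_BY_TYPE = {
    "Value": "value",
    "AverageValue": "averageValue",
    "Utilization": "averageUtilization",
}


def target_value(target: dict):
    """Return the target that matches target["type"], or raise ValueError."""
    metric_type = target.get("type")
    if metric_type not in TARGET_FIELD_BY_TYPE:
        raise ValueError(f"unsupported target type: {metric_type!r}")
    field = TARGET_FIELD_BY_TYPE[metric_type]
    if target.get(field) is None:
        raise ValueError(f"type {metric_type} expects '{field}' to be set")
    return target[field]
```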
AIMServiceMetricsSpec#
AIMServiceMetricsSpec defines a single metric for autoscaling. Specifies the metric source type and configuration.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Type is the type of metric source. |
Enum: [PodMetric] |
|
|
PodMetric refers to a metric describing each pod in the current scale target. |
Optional: {} |
AIMServiceModel#
AIMServiceModel specifies which model to deploy. Exactly one field must be set.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Name references an existing AIMModel or AIMClusterModel by metadata.name. |
Optional: {} |
|
|
Image specifies a container image URI directly. |
Optional: {} |
|
|
Custom specifies a custom model configuration with explicit base image, |
Optional: {} |
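The "exactly one field must be set" rule can be sketched as a small client-side check. The lowercase keys `name`, `image`, and `custom` are assumed serialized names for the Name, Image, and Custom fields, and the function is hypothetical; the operator's own validation may differ.

```python
# Assumed serialized names of the three mutually exclusive fields.
MODEL_FIELDS = ("name", "image", "custom")


def validate_service_model(model: dict) -> str:
    """Return the single field that is set, or raise ValueError."""
    set_fields = [key for key in MODEL_FIELDS if model.get(key) is not None]
    if len(set_fields) != 1:
        raise ValueError(
            f"exactly one of {MODEL_FIELDS} must be set, got {set_fields}"
        )
    return set_fields[0]
```

For example, `validate_service_model({"name": "my-model"})` returns `"name"`, while a dictionary that sets both `name` and `image` raises `ValueError`.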
AIMServiceModelCustom#
AIMServiceModelCustom specifies a custom model configuration with explicit base image, model sources, and hardware requirements. Used for ad-hoc custom model deployments.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
BaseImage is the container image URI for the AIM base image. |
Required: {} |
|
|
ModelSources specifies the model sources to use. |
MaxItems: 1 |
|
|
Hardware specifies the GPU and CPU requirements for this custom model. |
Required: {} |
AIMServiceOverrides#
AIMServiceOverrides allows overriding template parameters at the service level. All fields are optional. When specified, they override the corresponding values from the referenced AIMServiceTemplate.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Metric selects the optimization goal. |
Enum: [latency throughput] |
|
|
Precision selects the numeric precision used by the runtime. |
Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
|
|
Hardware specifies GPU and CPU requirements for each replica. |
Optional: {} |
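The override behavior described for AIMServiceOverrides can be pictured as a shallow merge over the template values. The sketch below assumes lowercase keys `metric`, `precision`, and `hardware`, and it replaces `hardware` as a whole; whether the operator merges hardware field by field is not stated in this reference.

```python
# Assumed serialized names of the overridable fields.
OVERRIDABLE_FIELDS = ("metric", "precision", "hardware")


def apply_overrides(template: dict, overrides: dict) -> dict:
    """Return a copy of template with any set override values applied."""
    merged = dict(template)
    for key in OVERRIDABLE_FIELDS:
        value = overrides.get(key)
        if value is not None:  # unset overrides keep the template value
            merged[key] = value
    return merged


if __name__ == "__main__":
    template = {"metric": "latency", "precision": "fp16"}
    print(apply_overrides(template, {"precision": "fp8"}))
```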
AIMServicePodMetric#
AIMServicePodMetric identifies the pod metric and its backend. Supports multiple metrics backends including OpenTelemetry.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Backend defines the metrics backend to use. |
opentelemetry |
Enum: [opentelemetry] |
|
ServerAddress specifies the address of the metrics backend server. |
Optional: {} |
|
|
MetricNames specifies which metrics to collect from pods and send to ServerAddress. |
Optional: {} |
|
|
Query specifies the query to run to retrieve metrics from the backend. |
Optional: {} |
|
|
OperationOverTime specifies the operation to aggregate metrics over time. |
Optional: {} |
AIMServicePodMetricSource#
AIMServicePodMetricSource defines pod-level metrics configuration. Specifies the metric identification and target values for pod-based autoscaling.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Metric contains the metric identification and backend configuration. |
||
|
Target specifies the target value for the metric. |
AIMServiceRoutingStatus#
AIMServiceRoutingStatus captures observed routing details.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Path is the HTTP path prefix used when routing is enabled. |
Optional: {} |
AIMServiceRuntimeConfig#
AIMServiceRuntimeConfig contains runtime configuration fields that apply to services. This struct is shared between AIMService.spec (inlined) and AIMRuntimeConfigCommon, allowing services to override these specific runtime settings while inheriting defaults from namespace/cluster RuntimeConfigs.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Storage configures storage defaults for this service’s PVCs and caches. |
Optional: {} |
|
|
Routing controls HTTP routing configuration for this service. |
Optional: {} |
|
|
Env specifies environment variables for inference containers. |
Optional: {} |
AIMServiceRuntimeStatus#
AIMServiceRuntimeStatus captures runtime status including replica counts from HPA.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
CurrentReplicas is the current number of replicas as reported by the HPA. |
||
|
DesiredReplicas is the desired number of replicas as determined by the HPA. |
||
|
MinReplicas is the minimum number of replicas configured for autoscaling. |
||
|
MaxReplicas is the maximum number of replicas configured for autoscaling. |
||
|
Replicas is a formatted display string for kubectl output. |
Optional: {} |
AIMServiceSpec#
AIMServiceSpec defines the desired state of AIMService.
Binds a canonical model to an AIMServiceTemplate and configures replicas, caching behavior, and optional overrides. The template governs the base runtime selection knobs, while the overrides field allows service-specific customization.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Model specifies which model to deploy using one of the available reference methods. |
||
|
Template contains template selection and configuration. |
Optional: {} |
|
|
Caching controls caching behavior for this service. |
Optional: {} |
|
|
DEPRECATED: Use Caching.Mode instead. This field will be removed in a future version. |
Optional: {} |
|
|
Replicas specifies the number of replicas for this service. |
1 |
Optional: {} |
|
MinReplicas specifies the minimum number of replicas for autoscaling. |
Minimum: 1 |
|
|
MaxReplicas specifies the maximum number of replicas for autoscaling. |
Minimum: 1 |
|
|
AutoScaling configures advanced autoscaling behavior using KEDA. |
Optional: {} |
|
|
Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both |
Optional: {} |
|
|
Storage configures storage defaults for this service’s PVCs and caches. |
Optional: {} |
|
|
Routing controls HTTP routing configuration for this service. |
Optional: {} |
|
|
Env specifies environment variables for inference containers. |
Optional: {} |
|
|
Resources overrides the container resource requirements for this service. |
Optional: {} |
|
|
Overrides allows overriding specific template parameters for this service. |
Optional: {} |
|
|
ImagePullSecrets references secrets for pulling AIM container images. |
Optional: {} |
|
|
ServiceAccountName specifies the Kubernetes service account to use for the inference workload. |
Optional: {} |
|
|
PriorityClassName specifies the priority class for the inference pods. |
Optional: {} |
AIMServiceStatus#
AIMServiceStatus defines the observed state of AIMService.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
ObservedGeneration is the most recent generation observed by the controller. |
||
|
Conditions represent the latest observations of service state. |
||
|
ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. |
Optional: {} |
|
|
ResolvedModel captures metadata about the image that was resolved. |
Optional: {} |
|
|
Status represents the current high-level status of the service lifecycle. |
Pending |
Enum: [Pending Starting Running Degraded Failed] |
|
Routing surfaces information about the configured HTTP routing, when enabled. |
Optional: {} |
|
|
ResolvedTemplate captures metadata about the template that satisfied the reference. |
||
|
Cache captures cache-related status for this service. |
Optional: {} |
|
|
Runtime captures runtime status including replica counts. |
Optional: {} |
AIMServiceTemplate#
AIMServiceTemplate is the Schema for namespace-scoped AIM service templates.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMServiceTemplateConfig#
AIMServiceTemplateConfig contains template selection configuration for AIMService.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Name is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to use. |
Optional: {} |
|
|
AllowUnoptimized, if true, will allow automatic selection of templates |
Optional: {} |
AIMServiceTemplateList#
AIMServiceTemplateList contains a list of AIMServiceTemplate.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMServiceTemplateScope#
Underlying type: string
AIMServiceTemplateScope is retained for backwards compatibility with existing consumers.
Validation:
Enum: [Namespace Cluster Unknown]
Appears in:
AIMServiceTemplateSpec#
AIMServiceTemplateSpec defines the desired state of AIMServiceTemplate (namespace-scoped).
A namespaced and versioned template that selects a runtime profile for a given AIM model (by canonical name). Templates are intentionally narrow: they describe runtime selection knobs for the AIM container and do not redefine the full Kubernetes deployment shape.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
ModelName is the model name. Matches |
MinLength: 1 |
|
|
Metric selects the optimization goal. |
Enum: [latency throughput] |
|
|
Precision selects the numeric precision used by the runtime. |
Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
|
|
Hardware specifies GPU and CPU requirements for each replica. |
Optional: {} |
|
|
Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both |
Optional: {} |
|
|
AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). |
Optional: {} |
|
|
ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). |
Optional: {} |
|
|
CustomProfile defines inline custom profile data for the inference engine. |
Optional: {} |
|
|
ImagePullSecrets lists secrets containing credentials for pulling container images. |
Optional: {} |
|
|
ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. |
Optional: {} |
|
|
Resources defines the default container resource requirements applied to services derived from this template. |
Optional: {} |
|
|
ModelSources specifies the model sources required to run this template. |
Optional: {} |
|
|
ProfileId is the specific AIM profile ID that this template should use. |
Optional: {} |
|
|
Type indicates the optimization level of this template. |
Enum: [optimized preview unoptimized] |
|
|
Env specifies environment variables for inference containers. |
Optional: {} |
|
|
Caching configures model caching behavior for this namespace-scoped template. |
Optional: {} |
AIMServiceTemplateSpecCommon#
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
ModelName is the model name. Matches |
MinLength: 1 |
|
|
Metric selects the optimization goal. |
Enum: [latency throughput] |
|
|
Precision selects the numeric precision used by the runtime. |
Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
|
|
Hardware specifies GPU and CPU requirements for each replica. |
Optional: {} |
|
|
Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both |
Optional: {} |
|
|
AimId is the AIM product family identifier (e.g., “meta-llama/Llama-3-8B”). |
Optional: {} |
|
|
ModelId is the specific model identifier / Hugging Face URI (e.g., “Qwen/Qwen3-32B-FP8”). |
Optional: {} |
|
|
CustomProfile defines inline custom profile data for the inference engine. |
Optional: {} |
|
|
ImagePullSecrets lists secrets containing credentials for pulling container images. |
Optional: {} |
|
|
ServiceAccountName specifies the Kubernetes service account to use for workloads related to this template. |
Optional: {} |
|
|
Resources defines the default container resource requirements applied to services derived from this template. |
Optional: {} |
|
|
ModelSources specifies the model sources required to run this template. |
Optional: {} |
|
|
ProfileId is the specific AIM profile ID that this template should use. |
Optional: {} |
|
|
Type indicates the optimization level of this template. |
Enum: [optimized preview unoptimized] |
|
|
Env specifies environment variables for inference containers. |
Optional: {} |
AIMServiceTemplateStatus#
AIMServiceTemplateStatus defines the observed state of AIMServiceTemplate.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
ObservedGeneration is the most recent generation observed by the controller. |
||
|
Conditions represent the latest observations of template state. |
||
|
ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. |
Optional: {} |
|
|
ResolvedModel captures metadata about the image that was resolved. |
Optional: {} |
|
|
ResolvedCache captures metadata about which cache is used for this template |
Optional: {} |
|
|
ResolvedHardware contains the resolved hardware requirements for this template. |
Optional: {} |
|
|
ResolvedNodeAffinity contains the computed node affinity rules for GPU scheduling. |
Optional: {} |
|
|
HardwareSummary is a human-readable display string for the hardware requirements. |
Optional: {} |
|
|
Status represents the current high-level status of the template lifecycle. |
Pending |
Enum: [Pending Progressing Ready Degraded Failed NotAvailable] |
|
ModelSources list the models that this template requires to run. These are the models that will be |
||
|
Profile contains the full discovery result profile as a free-form JSON object. |
||
|
DiscoveryJob is a reference to the job that was run for discovery |
||
|
Discovery contains state tracking for the discovery process, including |
Optional: {} |
AIMStorageConfig#
AIMStorageConfig configures storage defaults for artifacts and PVCs.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
DefaultStorageClassName specifies the storage class to use for artifacts and PVCs |
Optional: {} |
|
|
PVCHeadroomPercent specifies the percentage of extra space to add to PVCs |
10 |
Minimum: 0 |
|
DownloadFilter controls which files are included or excluded during artifact downloads. |
Optional: {} |
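The effect of PVCHeadroomPercent is simple arithmetic: the requested size is the content size plus the configured percentage. The sketch below uses the documented default of 10 and assumes the result is rounded up to a whole byte; the rounding rule and the function name are assumptions.

```python
DEFAULT_HEADROOM_PERCENT = 10  # documented default for PVCHeadroomPercent


def pvc_size_bytes(content_bytes: int,
                   headroom_percent: int = DEFAULT_HEADROOM_PERCENT) -> int:
    """Return content_bytes plus headroom, rounded up to a whole byte."""
    if content_bytes < 0 or headroom_percent < 0:
        raise ValueError("sizes and percentages must be non-negative")
    # Integer ceiling of content_bytes * (100 + headroom_percent) / 100.
    return (content_bytes * (100 + headroom_percent) + 99) // 100


if __name__ == "__main__":
    gib = 2 ** 30
    print(pvc_size_bytes(100 * gib) // gib)  # 100 GiB of content with 10% headroom
```

With the default of 10, 100 GiB of model data leads to a 110 GiB request.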
AIMTemplateCache#
AIMTemplateCache pre-warms artifacts for a specified template.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
|||
|
AIMTemplateCacheList#
AIMTemplateCacheList contains a list of AIMTemplateCache.
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
|
||
|
|
||
|
Refer to Kubernetes API documentation for fields of |
||
|
AIMTemplateCacheMode#
Underlying type: string
AIMTemplateCacheMode controls the ownership behavior of artifacts created by a template cache.
Validation:
Enum: [Dedicated Shared]
Appears in:
Field |
Description |
|---|---|
|
TemplateCacheModeDedicated means artifacts have owner references to the template cache. |
|
TemplateCacheModeShared means artifacts have no owner references. |
AIMTemplateCacheSpec#
AIMTemplateCacheSpec defines the desired state of AIMTemplateCache
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
TemplateName is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to cache. |
MinLength: 1 |
|
|
TemplateScope indicates whether the template is namespace-scoped or cluster-scoped. |
Enum: [Namespace Cluster Unknown] |
|
|
Env specifies environment variables to use for authentication when downloading models. |
Optional: {} |
|
|
ImagePullSecrets references secrets for pulling AIM container images. |
Optional: {} |
|
|
StorageClassName specifies the storage class for cache volumes. |
Optional: {} |
|
|
DownloadImage specifies the container image used to download and initialize artifacts. |
Optional: {} |
|
|
ModelSources specifies the model sources to cache for this template. |
Optional: {} |
|
|
Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both |
Optional: {} |
|
|
Mode controls the ownership behavior of artifacts created by this template cache. |
Shared |
Enum: [Dedicated Shared] |
AIMTemplateCacheStatus#
AIMTemplateCacheStatus defines the observed state of AIMTemplateCache
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
ObservedGeneration is the most recent generation observed by the controller. |
||
|
Conditions represent the latest observations of the template cache state. |
||
|
ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. |
Optional: {} |
|
|
Status represents the current high-level status of the template cache. |
Pending |
Enum: [Pending Progressing Ready Failed Degraded NotAvailable] |
|
ResolvedTemplateKind indicates whether the template resolved to a namespace-scoped |
||
|
Artifacts maps model names to their resolved AIMArtifact resources. |
Optional: {} |
AIMTemplateCachingConfig#
AIMTemplateCachingConfig configures model caching behavior for namespace-scoped templates.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Enabled controls whether caching is enabled for this template. |
false |
|
|
Env specifies environment variables to use when downloading the model for caching. |
Optional: {} |
AIMTemplateProfile#
AIMTemplateProfile declares profile variables for template selection. Used in AIMCustomTemplate to specify optimization targets.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Metric specifies the optimization target (e.g., latency, throughput). |
Enum: [latency throughput] |
|
|
Precision specifies the numerical precision (e.g., fp8, fp16, bf16). |
Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
DiscoveryState#
DiscoveryState tracks the discovery process state for circuit breaker logic. This enables exponential backoff and prevents infinite retry loops when discovery jobs fail persistently.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Attempts is the number of discovery job attempts that have been made. |
Optional: {} |
|
|
LastAttemptTime is the timestamp of the most recent discovery job creation. |
Optional: {} |
|
|
LastFailureReason captures the reason for the most recent discovery failure. |
Optional: {} |
|
|
SpecHash is a hash of the template spec fields that affect discovery. |
Optional: {} |
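To illustrate how Attempts and LastAttemptTime could drive exponential backoff, here is a minimal sketch. The base delay, the maximum delay, and the doubling schedule are hypothetical values chosen for illustration; this reference does not state the controller's actual timings.

```python
from datetime import datetime, timedelta, timezone

BASE_DELAY = timedelta(seconds=30)  # hypothetical first retry delay
MAX_DELAY = timedelta(hours=1)      # hypothetical upper bound on the delay


def next_attempt_time(attempts: int, last_attempt_time: datetime) -> datetime:
    """Return the earliest time a new discovery job may be created."""
    if attempts <= 0:
        return last_attempt_time  # nothing has failed yet, no wait needed
    # Double the delay per attempt; cap the exponent to avoid overflow.
    exponent = min(attempts - 1, 20)
    delay = min(BASE_DELAY * (2 ** exponent), MAX_DELAY)
    return last_attempt_time + delay


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for n in range(1, 5):
        print(n, next_attempt_time(n, start) - start)
```

Capping the delay keeps a persistently failing template from being retried too often while still allowing eventual recovery.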
DownloadProgress#
DownloadProgress represents the download progress for an artifact.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
TotalBytes is the expected total size of the download in bytes |
Optional: {} |
|
|
DownloadedBytes is the number of bytes downloaded so far |
Optional: {} |
|
|
Percentage is the download progress as a percentage (0-100) |
Maximum: 100 |
|
|
DisplayPercentage is a human-readable progress string (e.g., “45 %”) |
Optional: {} |
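The relationship between the byte counters, Percentage, and DisplayPercentage can be sketched as follows. Integer truncation and clamping to the 0-100 range are assumptions; only the range and the "45 %" display format come from the field descriptions.

```python
from typing import Tuple


def download_progress(downloaded_bytes: int, total_bytes: int) -> Tuple[int, str]:
    """Return (percentage, display string) for a download in progress."""
    if total_bytes <= 0:
        percentage = 0  # total size unknown, report no progress
    else:
        percentage = downloaded_bytes * 100 // total_bytes
        percentage = max(0, min(100, percentage))  # keep within 0-100
    return percentage, f"{percentage} %"


if __name__ == "__main__":
    print(download_progress(450, 1000))  # (45, '45 %')
```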
DownloadState#
DownloadState represents the current download attempt state, updated by the downloader pod
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Protocol is the download protocol currently in use (e.g., “XET”, “HF_TRANSFER”, “HTTP”) |
Optional: {} |
|
|
Attempt is the current attempt number (1-based) |
Optional: {} |
|
|
TotalAttempts is the total number of attempts configured via AIM_DOWNLOADER_PROTOCOL |
Optional: {} |
|
|
ProtocolSequence is the configured protocol sequence (e.g., “HF_TRANSFER,XET”) |
Optional: {} |
|
|
Message is a human-readable status message from the downloader |
Optional: {} |
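As an illustration of how ProtocolSequence and the 1-based Attempt number might relate, the sketch below parses a comma-separated sequence and looks up the protocol for a given attempt. It assumes one attempt per listed protocol, in order, which this reference does not confirm; both function names are hypothetical.

```python
from typing import List


def parse_protocol_sequence(raw: str) -> List[str]:
    """Split a value such as "HF_TRANSFER,XET" into protocol names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def protocol_for_attempt(sequence: List[str], attempt: int) -> str:
    """Return the protocol for a 1-based attempt number (assumed ordering)."""
    if not 1 <= attempt <= len(sequence):
        raise ValueError(f"attempt {attempt} is outside 1..{len(sequence)}")
    return sequence[attempt - 1]


if __name__ == "__main__":
    sequence = parse_protocol_sequence("HF_TRANSFER,XET")
    print(len(sequence), protocol_for_attempt(sequence, 2))
```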
ImageMetadata#
ImageMetadata contains metadata extracted from or provided for a container image.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Model contains AMD Silogen model-specific metadata. |
Optional: {} |
|
|
OCI contains standard OCI image metadata. |
Optional: {} |
|
|
OriginalLabels contains the raw OCI image labels as a JSON object. |
Optional: {} |
ModelMetadata#
ModelMetadata contains AMD Silogen model-specific metadata extracted from image labels.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
CanonicalName is the canonical model identifier (e.g., mistralai/Mixtral-8x22B-Instruct-v0.1). |
Optional: {} |
|
|
Source is the URL where the model can be found. |
Optional: {} |
|
|
Tags are descriptive tags (e.g., [“text-generation”, “chat”, “instruction”]). |
Optional: {} |
|
|
Versions lists available versions. |
Optional: {} |
|
|
Variants lists model variants. |
Optional: {} |
|
|
HFTokenRequired indicates if a Hugging Face token is required. |
Optional: {} |
|
|
Title is the Silogen-specific title for the model. |
Optional: {} |
|
|
DescriptionFull is the full description. |
Optional: {} |
|
|
ReleaseNotes contains release notes for this version. |
Optional: {} |
|
|
RecommendedDeployments contains recommended deployment configurations. |
Optional: {} |
ModelSourceFilter#
ModelSourceFilter defines a pattern for discovering images. Supports multiple formats:
Repository patterns: “org/repo*” - matches repositories with wildcards
Repository with tag: “org/repo:1.0.0” - exact tag match
Full URI: “ghcr.io/org/repo:1.0.0” - overrides registry and tag
Full URI with wildcard: “ghcr.io/org/repo*” - overrides registry, matches pattern
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Image pattern with wildcard and full URI support. |
MaxLength: 512 |
|
|
Exclude lists specific repository names to skip (exact match on repository name only, not registry). |
Optional: {} |
|
|
Versions specifies semantic version constraints for this filter. |
Optional: {} |
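The four pattern formats listed for ModelSourceFilter can be told apart with a small parser. The sketch below is an approximation under stated assumptions: the first path segment is treated as a registry when it contains a dot or colon or equals `localhost` (the common container-reference heuristic), and wildcards are matched with Python's `fnmatch`, which also interprets `?` and `[...]`. The operator's real matching rules may differ, and all names here are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class ParsedFilter(NamedTuple):
    registry: Optional[str]  # set only for full-URI patterns
    repository: str          # may contain a "*" wildcard
    tag: Optional[str]       # set only when an exact tag is given


def parse_filter(image: str) -> ParsedFilter:
    """Split an image pattern into registry, repository, and tag parts."""
    first, _, rest = image.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = None, image
    repository, sep, tag = remainder.rpartition(":")
    if not sep:  # no tag present
        repository, tag = remainder, None
    return ParsedFilter(registry, repository, tag)


def repository_matches(parsed: ParsedFilter, repository: str) -> bool:
    """Return True if repository matches the (possibly wildcard) pattern."""
    return fnmatchcase(repository, parsed.repository)


if __name__ == "__main__":
    for pattern in ("org/repo*", "org/repo:1.0.0",
                    "ghcr.io/org/repo:1.0.0", "ghcr.io/org/repo*"):
        print(pattern, parse_filter(pattern))
```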
OCIMetadata#
OCIMetadata contains standard OCI image metadata extracted from image labels.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Title is the human-readable title. |
Optional: {} |
|
|
Description is a brief description. |
Optional: {} |
|
|
Licenses is the SPDX license identifier(s). |
Optional: {} |
|
|
Vendor is the organization that produced the image. |
Optional: {} |
|
|
Authors contains contact details for the image authors. |
Optional: {} |
|
|
Source is the URL to the source code repository. |
Optional: {} |
|
|
Documentation is the URL to documentation. |
Optional: {} |
|
|
Created is the creation timestamp. |
Optional: {} |
|
|
Revision is the source control revision. |
Optional: {} |
|
|
Version is the image version. |
Optional: {} |
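The fields above line up with the predefined annotation keys of the OCI image specification (`org.opencontainers.image.*`). The sketch below shows how such labels could be collected into a dictionary; that the operator reads exactly these keys, and the lowercase output keys, are assumptions.

```python
from typing import Dict

OCI_PREFIX = "org.opencontainers.image."

# Annotation suffixes defined by the OCI image specification that
# correspond to the OCIMetadata fields listed above.
OCI_FIELDS = (
    "title", "description", "licenses", "vendor", "authors",
    "source", "documentation", "created", "revision", "version",
)


def extract_oci_metadata(labels: Dict[str, str]) -> Dict[str, str]:
    """Return the standard OCI annotations present in an image's labels."""
    return {
        field: labels[OCI_PREFIX + field]
        for field in OCI_FIELDS
        if OCI_PREFIX + field in labels
    }


if __name__ == "__main__":
    labels = {
        "org.opencontainers.image.title": "Example image",
        "org.opencontainers.image.licenses": "MIT",
        "com.example.custom": "ignored",
    }
    print(extract_oci_metadata(labels))
```

Labels outside the `org.opencontainers.image.` namespace are skipped here; per the ImageMetadata description, the raw labels remain available through OriginalLabels.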
RecommendedDeployment#
RecommendedDeployment describes a recommended deployment configuration for a model.
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
GPUModel is the GPU model name (e.g., MI300X, MI325X) |
Optional: {} |
|
|
GPUCount is the number of GPUs required |
Optional: {} |
|
|
Precision is the recommended precision (e.g., fp8, fp16, bf16) |
Optional: {} |
|
|
Metric is the optimization target (e.g., latency, throughput) |
Optional: {} |
|
|
ProfileId is the unique identifier of the AIM profile for this deployment. |
Optional: {} |
|
|
Description provides additional context about this deployment configuration |
Optional: {} |
RuntimeConfigRef#
Appears in:
Field |
Description |
Default |
Validation |
|---|---|---|---|
|
Name is the name of the runtime config to use for this resource. If a runtime config with this name exists both |
Optional: {} |