Available AIM containers
Cohere Labs
111B parameter language model with configurable reasoning and tool use capabilities.
Meta Llama
meta-llama/Llama-3.1-405B-Instruct (stable)
Multilingual 405B parameter instruction-tuned language model for dialogue use cases.
meta-llama/Llama-3.1-8B-Instruct (stable)
Multilingual 8B parameter instruction-tuned language model for dialogue use cases.
meta-llama/Llama-3.2-1B-Instruct (stable)
Multilingual 1B parameter instruction-tuned language model for dialogue and on-device use cases.
meta-llama/Llama-3.2-3B-Instruct (stable)
Multilingual 3B parameter instruction-tuned language model for dialogue and on-device use cases.
meta-llama/Llama-3.3-70B-Instruct (stable)
Multilingual 70B parameter instruction-tuned language model for dialogue use cases.
Mistral AI
14B parameter instruction-tuned language model with vision and function calling capabilities.
675B parameter granular MoE multimodal model with 41B active parameters and vision capabilities.
24B parameter instruction-tuned language model with vision and function calling capabilities.
Sparse MoE language model with 141B total parameters across 8 experts and function calling support.
mistralai/Mixtral-8x7B-Instruct-v0.1 (stable)
Sparse MoE language model with 47B total parameters across 8 experts.
OpenAI
openai/gpt-oss-120b (stable)
Open-weight 117B parameter MoE model with 5.1B active parameters and configurable reasoning.
openai/gpt-oss-20b (stable)
Open-weight 21B parameter MoE model with 3.6B active parameters for lower-latency use cases.
Qwen
Qwen/Qwen3-235B-A22B (stable)
235B parameter MoE language model with 22B active parameters and dual thinking modes.
Qwen/Qwen3-32B (stable)
32.8B parameter dense language model with dual thinking modes and multilingual support.
Qwen/Qwen3-Coder-Next (stable)
80B parameter MoE coding agent model with 3B active parameters and hybrid attention architecture.
Qwen/Qwen3-VL-235B-A22B-Instruct (stable)
235B parameter MoE vision-language model with 22B active parameters and multimodal capabilities.
Qwen/Qwen3-VL-235B-A22B-Thinking (stable)
235B parameter MoE vision-language model with reasoning-enhanced thinking capabilities.
deepseek-ai
deepseek-ai/DeepSeek-R1 (stable)
671B parameter MoE reasoning model with 37B active parameters and 128K context length.
deepseek-ai/DeepSeek-R1-0528 (stable)
671B parameter MoE reasoning model with 37B active parameters, updated version of DeepSeek-R1.
deepseek-ai/DeepSeek-V3.1 (stable)
671B parameter MoE model with 37B active parameters supporting thinking and non-thinking modes.
deepseek-ai/DeepSeek-V3.1-Terminus (stable)
671B parameter MoE model with 37B active parameters, refined for language consistency and agent tasks.
google
google/gemma-3-27b-it (stable)
Gemma 3 27B IT is a multimodal instruction-tuned model supporting text and image input with a 128K context window.
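AIM containers typically expose an OpenAI-compatible chat completions endpoint once running, and any model ID from the catalog above is passed verbatim as the `model` field. The sketch below builds such a request payload; the local endpoint URL and port are assumptions for illustration, not part of this catalog, so the actual HTTP send is left commented out.

```python
import json
from urllib import request

# Hypothetical local endpoint; the host and port are assumptions and will
# depend on how the container was launched.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

# Any model ID listed above can be used verbatim, e.g. the 8B Llama model.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [
        {"role": "user", "content": "Summarize MoE models in one sentence."}
    ],
    "max_tokens": 128,
}

# Serialize the payload as the request body.
body = json.dumps(payload).encode("utf-8")

# Sending is commented out so this sketch runs without a live container:
# req = request.Request(
#     ENDPOINT, data=body, headers={"Content-Type": "application/json"}
# )
# print(request.urlopen(req).read().decode())

print(payload["model"])
```

The same payload shape works for every container in this list; only the `model` string changes.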