LLMs
The ai() layer owns provider clients and model traffic. It keeps Ax programs focused on signatures while one provider surface handles chat, streaming, embeddings, media, usage normalization, thinking controls, routing, balancing, tracing, and provider-specific behavior.
import { AxAIOpenAIModel, ai } from '@ax-llm/ax';

const openai = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: AxAIOpenAIModel.GPT4OMini },
});
Provider Setup
Create provider clients near the application boundary, keep keys in environment variables, and pass the client into forward(), agents, flows, or optimizers.
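The examples in this section read keys with a non-null assertion (`!`), which hides a missing variable until the first request fails. A minimal sketch of a stricter read at the application boundary follows; `requireEnv` and the `Env` type are hypothetical helpers for illustration, not part of `@ax-llm/ax`.

```typescript
// Hypothetical helper (not part of @ax-llm/ax): read a required key from an
// environment-like record and fail early with a clear message if it is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Stand-in environment for the example; an application would pass process.env.
const fakeEnv: Env = { OPENAI_APIKEY: 'sk-example' };
const apiKey = requireEnv(fakeEnv, 'OPENAI_APIKEY');
```

The returned string can then be passed as `apiKey` when creating a client, so a misconfigured deployment fails at startup rather than mid-request.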
OpenAI
import { AxAIOpenAIModel, ai } from '@ax-llm/ax';

const openai = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: AxAIOpenAIModel.GPT4OMini },
});
OpenAI Responses
const responses = ai({
  name: 'openai-responses',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4.1-mini' },
});
Claude / Anthropic
import { AxAIAnthropicModel, ai } from '@ax-llm/ax';

const claude = ai({
  name: 'anthropic',
  apiKey: process.env.ANTHROPIC_APIKEY!,
  config: { model: AxAIAnthropicModel.Claude48Opus },
});
Gemini
import { AxAIGoogleGeminiModel, ai } from '@ax-llm/ax';

const gemini = ai({
  name: 'google-gemini',
  apiKey: process.env.GOOGLE_APIKEY!,
  config: { model: AxAIGoogleGeminiModel.Gemini25Flash },
});
OpenAI-Compatible Providers
Use apiURL when a provider shares the OpenAI wire shape but uses a different host or model naming scheme.
const compatible = ai({
  name: 'openai',
  apiKey: process.env.PROVIDER_API_KEY!,
  apiURL: 'https://provider.example/v1',
  config: { model: 'provider/model-name' },
});
Model Catalog
Use the model catalog before runtime when a UI or router needs model choices, costs, and capabilities. It can filter for text, code, embedding, and audio models.
import { axGetSupportedAIModels } from '@ax-llm/ax';

const textModels = axGetSupportedAIModels({ type: 'text' });
const audioModels = axGetSupportedAIModels({ type: 'audio' });
console.log(textModels[0]?.models[0]?.promptTokenCostPer1M);

flowchart LR
  A[Model catalog] --> B[Capability filter]
  B --> C[Text]
  B --> D[Embeddings]
  B --> E[Audio]
  C --> F[Route or select model]
  D --> F
  E --> F
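Per-million-token prices from the catalog can be turned into a per-request estimate for comparing models. The sketch below is a hedged illustration. The `ModelCost` shape is a local type, `completionTokenCostPer1M` is an assumed field (only `promptTokenCostPer1M` appears above), and the prices are made up.

```typescript
// Local illustrative type. Only promptTokenCostPer1M is shown in the text
// above; completionTokenCostPer1M is an assumption for this sketch.
interface ModelCost {
  name: string;
  promptTokenCostPer1M: number;
  completionTokenCostPer1M: number;
}

// Estimated cost of one request, in the catalog's currency unit.
function estimateCost(model: ModelCost, promptTokens: number, completionTokens: number): number {
  return (
    (promptTokens / 1e6) * model.promptTokenCostPer1M +
    (completionTokens / 1e6) * model.completionTokenCostPer1M
  );
}

// Pick the model with the lowest estimated cost for a given token profile.
function cheapestFor(
  models: ModelCost[],
  promptTokens: number,
  completionTokens: number
): ModelCost | undefined {
  let best: ModelCost | undefined;
  let bestCost = Infinity;
  for (const model of models) {
    const cost = estimateCost(model, promptTokens, completionTokens);
    if (cost < bestCost) {
      best = model;
      bestCost = cost;
    }
  }
  return best;
}

// Made-up prices for the example, not real provider pricing.
const modelA: ModelCost = { name: 'model-a', promptTokenCostPer1M: 2, completionTokenCostPer1M: 8 };
const modelB: ModelCost = { name: 'model-b', promptTokenCostPer1M: 0.5, completionTokenCostPer1M: 1.5 };
const choice = cheapestFor([modelA, modelB], 1000, 500);
```

A UI or router could run the same comparison over the filtered catalog entries once their actual field names are confirmed.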
Routing And Balancing
Routers choose a provider by capability, model key, or app policy. Balancers retry across services while preserving the Ax request shape. Use them when latency, quota, cost, rate limits, or provider outages matter.
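The routing and retry behavior described above can be sketched without any provider calls. Everything in this block is hypothetical: the `Route` type, `candidates`, and `callWithFallback` are illustrative names, not the Ax router or balancer API, and the call is synchronous only to keep the example short.

```typescript
// Hypothetical types for illustration; not the Ax router or balancer API.
type Capability = 'text' | 'embeddings' | 'audio';

interface Route {
  key: string; // app-level model key
  capabilities: Capability[];
  costRank: number; // lower is cheaper
  healthy: boolean; // e.g. switched off after repeated failures
}

// Routing step: keep healthy routes that have the capability, cheapest first.
function candidates(routes: Route[], need: Capability): Route[] {
  return routes
    .filter((r) => r.healthy && r.capabilities.some((c) => c === need))
    .sort((a, b) => a.costRank - b.costRank);
}

// Balancing step: try each candidate in order and return the first success.
function callWithFallback<T>(ordered: Route[], call: (route: Route) => T): T {
  let lastError: unknown = new Error('no candidate routes');
  for (const route of ordered) {
    try {
      return call(route);
    } catch (err) {
      lastError = err; // remember the failure and move to the next route
    }
  }
  throw lastError;
}

const routes: Route[] = [
  { key: 'smart', capabilities: ['text'], costRank: 3, healthy: true },
  { key: 'fast', capabilities: ['text'], costRank: 2, healthy: false },
  { key: 'cheap', capabilities: ['text'], costRank: 1, healthy: true },
  { key: 'embed', capabilities: ['embeddings'], costRank: 1, healthy: true },
];

// Simulate the cheapest route being rate limited; the next candidate answers.
const servedBy = callWithFallback(candidates(routes, 'text'), (route) => {
  if (route.key === 'cheap') throw new Error('rate limited');
  return route.key;
});
```

The request passed to `call` stays the same for every route, which mirrors the point that retries preserve the request shape.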
Embeddings
Embeddings live on the same provider client surface. Use them for retrieval indexes, memory search, context lookup, and similarity workflows while keeping embedding model selection separate from generation model selection.
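Similarity workflows over embedding vectors usually come down to ranking by cosine similarity. The sketch below is provider-independent and uses tiny hand-written vectors. `cosineSimilarity` and `rankBySimilarity` are illustrative helpers, not library functions.

```typescript
// Cosine similarity between two equal-length vectors; returns 0 when either
// vector has zero magnitude.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] as number;
    const y = b[i] as number;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface IndexedVector {
  id: string;
  vector: number[];
}

// Rank stored vectors against a query vector, most similar first.
function rankBySimilarity(query: number[], items: IndexedVector[]): { id: string; score: number }[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((a, b) => b.score - a.score);
}

// Toy 2-dimensional vectors; real embeddings have many more dimensions.
const ranked = rankBySimilarity([1, 0], [
  { id: 'orthogonal', vector: [0, 1] },
  { id: 'same-direction', vector: [2, 0] },
  { id: 'diagonal', vector: [1, 1] },
]);
```

Vectors returned for the query and for indexed texts must come from the same embedding model for the scores to be comparable.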
const { embeddings } = await openai.embed({
  texts: ['typed LLM programs', 'runtime agents'],
  embedModel: 'text-embedding-3-small',
});
Audio, Realtime, And Responses
Ax maps batch transcription, batch speech, conversational audio, OpenAI Responses audio, and realtime event folding where supported. Direct ax(...) programs can pass media to compatible models; agents usually transcribe audio before planner/executor/responder stages.
const transcript = await openai.transcribe({
  audio: { data: base64Wav, format: 'wav' },
  model: 'gpt-4o-mini-transcribe',
  language: 'en',
});
const speech = await openai.speak({
  text: transcript.text,
  model: 'gpt-4o-mini-tts',
  voice: 'alloy',
});
Thinking And Context Caching
Thinking controls expose provider-specific reasoning budgets through one Ax option. Context caching marks stable prompt regions so providers with prefix caching can reuse expensive context.
const res = await claude.chat(
  { chatPrompt: [{ role: 'user', content: 'Review this design.' }] },
  { thinkingTokenBudget: 'medium', showThoughts: true }
);
console.log(res.results[0]?.thought);

flowchart TB
  A[Stable context field] --> B[Cache breakpoint]
  C[User query] --> D[Generation]
  B --> D
  E[thinkingTokenBudget] --> D
  D --> F[Usage + trace]
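Prefix caching only helps when consecutive requests begin with the same content, which is why stable context belongs before the user query. The sketch below measures that shared prefix at message granularity. It is a simplification under stated assumptions: `Message` and `sharedPrefixLength` are hypothetical names, and providers generally match on prompt prefixes rather than whole messages.

```typescript
// Illustrative message shape; not a type from @ax-llm/ax.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Count how many leading messages are identical in two requests. Only this
// shared run is a candidate for reuse under prefix caching.
function sharedPrefixLength(a: Message[], b: Message[]): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit) {
    const x = a[i] as Message;
    const y = b[i] as Message;
    if (x.role !== y.role || x.content !== y.content) break;
    i++;
  }
  return i;
}

// Stable context first, changing query last.
const stableContext: Message[] = [
  { role: 'system', content: 'You review system designs.' },
  { role: 'user', content: 'Design document: ...' },
];
const firstRequest: Message[] = [...stableContext, { role: 'user', content: 'Review this design.' }];
const secondRequest: Message[] = [...stableContext, { role: 'user', content: 'List the main risks.' }];

// Both stable messages are shared, so a cache breakpoint after them is useful.
const reusable = sharedPrefixLength(firstRequest, secondRequest);

// Putting the query first leaves nothing to reuse.
const queryFirst = sharedPrefixLength(
  [{ role: 'user', content: 'Review this design.' }, ...stableContext],
  [{ role: 'user', content: 'List the main risks.' }, ...stableContext]
);
```

The comparison suggests a simple check when caching seems ineffective: log the shared prefix between consecutive requests and confirm it covers the expensive context.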
Production Notes
- Keep provider keys outside source code.
- Prefer model aliases like fast, smart, or cheap when app callers should not know provider model IDs.
- Trace request latency, retries, token usage, cost, route choice, media mode, and model key.
- Keep provider-api examples separate from no-key examples.
- Use OpenAI-compatible clients for generated-language package examples when that is the supported provider path.
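The alias note above can be sketched as a small lookup that keeps provider model IDs in one place. This is an illustration only: `aliasTable`, `resolveAlias`, and the provider and model strings are hypothetical placeholders, not Ax configuration.

```typescript
type Alias = 'fast' | 'smart' | 'cheap';

interface ModelTarget {
  provider: string;
  model: string;
}

// Hypothetical alias table. The provider names and model IDs are placeholders.
const aliasTable: Record<Alias, ModelTarget> = {
  fast: { provider: 'example-provider-a', model: 'example-fast-model' },
  smart: { provider: 'example-provider-b', model: 'example-smart-model' },
  cheap: { provider: 'example-provider-a', model: 'example-cheap-model' },
};

// Type guard that only accepts keys defined directly on the table.
function isAlias(name: string): name is Alias {
  return Object.prototype.hasOwnProperty.call(aliasTable, name);
}

// Resolve an app-level alias to a concrete provider and model ID.
function resolveAlias(name: string): ModelTarget {
  if (!isAlias(name)) {
    throw new Error(`Unknown model alias: ${name}`);
  }
  return aliasTable[name];
}

const target = resolveAlias('fast');
```

Callers pass only the alias, so swapping the model behind `fast` is a one-line change in the table.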
See ai() LLM models and ai() API.