LLMs

The ai() layer owns provider clients and model traffic. It keeps Ax programs focused on signatures while one provider surface handles chat, streaming, embeddings, media, usage normalization, thinking controls, routing, balancing, tracing, and provider-specific behavior.

TypeScript

import { AxAIOpenAIModel, ai } from '@ax-llm/ax';

const openai = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: AxAIOpenAIModel.GPT4OMini },
});

Provider Setup

Create provider clients near the application boundary, keep keys in environment variables, and pass the client into forward(), agents, flows, or optimizers.
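One way to keep key handling at the application boundary is a small guard that fails fast when a variable is missing, instead of relying on a non-null assertion. The sketch below is illustrative only: `requireEnv` and the `Env` type are hypothetical helpers, not part of Ax, and the environment is passed in as a plain record so the function stays easy to test.

```typescript
// Hypothetical helper (not an Ax API): read a required key or fail fast.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  // Treat unset and blank values the same way: both are configuration errors.
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// At the application boundary you would pass process.env, for example:
// const openai = ai({
//   name: 'openai',
//   apiKey: requireEnv(process.env, 'OPENAI_APIKEY'),
//   config: { model: AxAIOpenAIModel.GPT4OMini },
// });
```

With a guard like this, a missing key surfaces as a clear startup error rather than as a failed provider request later.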

OpenAI

TypeScript

import { AxAIOpenAIModel, ai } from '@ax-llm/ax';

const openai = ai({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: AxAIOpenAIModel.GPT4OMini },
});

OpenAI Responses

TypeScript

import { ai } from '@ax-llm/ax';

const responses = ai({
  name: 'openai-responses',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4.1-mini' },
});

Claude / Anthropic

TypeScript

import { AxAIAnthropicModel, ai } from '@ax-llm/ax';

const claude = ai({
  name: 'anthropic',
  apiKey: process.env.ANTHROPIC_APIKEY!,
  config: { model: AxAIAnthropicModel.Claude48Opus },
});

Gemini

TypeScript

import { AxAIGoogleGeminiModel, ai } from '@ax-llm/ax';

const gemini = ai({
  name: 'google-gemini',
  apiKey: process.env.GOOGLE_APIKEY!,
  config: { model: AxAIGoogleGeminiModel.Gemini25Flash },
});

OpenAI-Compatible Providers

Use apiURL when a provider shares the OpenAI wire shape but uses a different host or model naming scheme.

TypeScript

import { ai } from '@ax-llm/ax';

const compatible = ai({
  name: 'openai',
  apiKey: process.env.PROVIDER_API_KEY!,
  apiURL: 'https://provider.example/v1',
  config: { model: 'provider/model-name' },
});

Model Catalog

Use the model catalog before runtime when a UI or router needs model choices, costs, and capabilities. It can filter for text, code, embedding, and audio models.

TypeScript

import { axGetSupportedAIModels } from '@ax-llm/ax';

const textModels = axGetSupportedAIModels({ type: 'text' });
const audioModels = axGetSupportedAIModels({ type: 'audio' });
console.log(textModels[0]?.models[0]?.promptTokenCostPer1M);

flowchart LR
  A[Model catalog] --> B[Capability filter]
  B --> C[Text]
  B --> D[Embeddings]
  B --> E[Audio]
  C --> F[Route or select model]
  D --> F
  E --> F
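As a rough illustration of the "route or select model" step, a caller could pick the lowest-priced entry from a filtered catalog. The sketch below does not call Ax; it uses local stand-in types. Only `models` and `promptTokenCostPer1M` appear in the catalog example above, so the `provider` and `name` fields and the `cheapestModel` helper are assumptions made for illustration.

```typescript
// Local stand-in types. `provider` and `name` are assumed field names.
interface CatalogModel {
  name: string;
  promptTokenCostPer1M?: number;
}

interface CatalogProvider {
  provider: string;
  models: CatalogModel[];
}

interface ModelChoice {
  provider: string;
  model: string;
  promptTokenCostPer1M: number;
}

// Return the model with the lowest prompt cost, skipping unpriced entries.
function cheapestModel(catalog: CatalogProvider[]): ModelChoice | undefined {
  let best: ModelChoice | undefined;
  for (const entry of catalog) {
    for (const model of entry.models) {
      const cost = model.promptTokenCostPer1M;
      if (cost === undefined) continue;
      if (best === undefined || cost < best.promptTokenCostPer1M) {
        best = {
          provider: entry.provider,
          model: model.name,
          promptTokenCostPer1M: cost,
        };
      }
    }
  }
  return best;
}
```

Price is only one possible criterion; a real selector would likely also check the capabilities the request needs.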

Routing And Balancing

Routing has two distinct jobs. The multi-service router combines provider model lists and dispatches the model key the caller already chose; it does not select a model. AxBalancer handles equivalent services behind shared model aliases, preserving the Ax request shape while applying capability filters and provider failover.

Every supported language can opt into adaptive AxBalancer routing. It learns transient provider failure rate and successful latency, then weighs the probability of a failure or deadline miss against estimated request cost. This is operational provider selection, not semantic prompt-to-model selection, so every model behind an alias must be an acceptable substitute.


TypeScript

import { AxBalancer, AxInMemoryBalancerStatsStore } from '@ax-llm/ax';

// `openai` and `claude` are the provider clients created in Provider Setup.
const statsStore = new AxInMemoryBalancerStatsStore();
const routeKeys = new Map<string, string>([
  [openai.getId(), 'openai-primary'],
  [claude.getId(), 'anthropic-primary'],
]);

const llm = AxBalancer.create([openai, claude] as const, {
  strategy: {
    type: 'adaptive',
    deadlineMs: 6_000,
    badOutcomeCost: 0.02,
    expectedTokens: { promptTokens: 1_200, completionTokens: 300 },
    namespace: 'support-summary-v1',
    routeKey: (service) => {
      const key = routeKeys.get(service.getId());
      if (!key) throw new Error('Missing stable route key.');
      return key;
    },
    slice: ({ options }) =>
      options?.customLabels?.workflow ?? 'default-workflow',
    statsStore,
    onRoutingEvent: (event) => {
      if (event.type === 'selected' || event.type === 'fallback') {
        console.log('route:', event);
      }
    },
  },
});
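To make the cost trade-off concrete, the sketch below scores each route as estimated request cost plus the probability of a bad outcome times `badOutcomeCost`, then picks the lowest score. This is not AxBalancer's implementation: the formula, the `RouteStats` fields, and the per-token prices are all assumptions chosen to mirror the description of adaptive routing above.

```typescript
// Illustrative only. Field names and the scoring formula are assumptions.
interface RouteStats {
  routeKey: string;
  failureRate: number; // observed share of transient failures, 0..1
  successLatenciesMs: number[]; // latencies of recent successful calls
  promptCostPer1M: number; // assumed price per 1M prompt tokens
  completionCostPer1M: number; // assumed price per 1M completion tokens
}

interface RoutingInput {
  deadlineMs: number;
  badOutcomeCost: number;
  expectedTokens: { promptTokens: number; completionTokens: number };
}

function expectedCost(stats: RouteStats, input: RoutingInput): number {
  const samples = stats.successLatenciesMs.length;
  // Share of successful calls that met the deadline; optimistic with no data.
  const onTime =
    samples === 0
      ? 1
      : stats.successLatenciesMs.filter((ms) => ms <= input.deadlineMs).length /
        samples;
  // A request is "bad" if it fails or if it succeeds after the deadline.
  const pBad = 1 - (1 - stats.failureRate) * onTime;
  const requestCost =
    (input.expectedTokens.promptTokens * stats.promptCostPer1M +
      input.expectedTokens.completionTokens * stats.completionCostPer1M) /
    1_000_000;
  return requestCost + pBad * input.badOutcomeCost;
}

// Pick the route with the lowest expected cost, or undefined if none exist.
function pickRoute(
  routes: RouteStats[],
  input: RoutingInput
): RouteStats | undefined {
  let best: RouteStats | undefined;
  let bestScore = Infinity;
  for (const route of routes) {
    const score = expectedCost(route, input);
    if (score < bestScore) {
      best = route;
      bestScore = score;
    }
  }
  return best;
}
```

Under this model, raising `badOutcomeCost` shifts traffic toward reliable routes even when they cost more per token, while a value of zero reduces the choice to price alone.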

Embeddings

Embeddings live on the same provider client surface. Use them for retrieval indexes, memory search, context lookup, and similarity workflows while keeping embedding model selection separate from generation model selection.

TypeScript

const { embeddings } = await openai.embed({
  texts: ['typed LLM programs', 'runtime agents'],
  embedModel: 'text-embedding-3-small',
});
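For similarity workflows, the returned vectors are typically compared with cosine similarity. The sketch below is a plain TypeScript illustration that assumes each embedding is an array of numbers; `cosineSimilarity` and `rankBySimilarity` are hypothetical helpers, not Ax functions.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length.');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    // The `?? 0` guards keep the sketch valid under strict index checks.
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return document indices ordered from most to least similar to the query.
function rankBySimilarity(
  query: readonly number[],
  docs: readonly (readonly number[])[]
): number[] {
  return docs
    .map((doc, index) => ({ index, score: cosineSimilarity(query, doc) }))
    .sort((left, right) => right.score - left.score)
    .map((entry) => entry.index);
}
```

A linear scan like this is fine for small collections; larger retrieval indexes usually rely on a vector store instead.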

Audio, Realtime, And Responses

Ax maps batch transcription, batch speech, conversational audio, OpenAI Responses audio, and realtime event folding where supported. Direct ax(...) programs can pass media to compatible models; agents usually transcribe audio before planner/executor/responder stages.

TypeScript

// base64Wav is assumed to hold base64-encoded WAV audio loaded elsewhere.
const transcript = await openai.transcribe({
  audio: { data: base64Wav, format: 'wav' },
  model: 'gpt-4o-mini-transcribe',
  language: 'en',
});

const speech = await openai.speak({ text: transcript.text, model: 'gpt-4o-mini-tts', voice: 'alloy' });

Thinking And Context Caching

Thinking controls expose provider-specific reasoning budgets through one Ax option. Context caching marks stable prompt regions so providers with prefix caching can reuse expensive context.

TypeScript

const res = await claude.chat(
  { chatPrompt: [{ role: 'user', content: 'Review this design.' }] },
  { thinkingTokenBudget: 'medium', showThoughts: true }
);
console.log(res.results[0]?.thought);

flowchart TB
  A[Stable context field] --> B[Cache breakpoint]
  C[User query] --> D[Generation]
  B --> D
  E[thinkingTokenBudget] --> D
  D --> F[Usage + trace]

Production Notes

Keep provider keys outside source code.
Prefer model aliases like fast, smart, or cheap when app callers should not know provider model IDs.
Trace request latency, retries, token usage, cost, route choice, media mode, and model key.
Keep public provider examples separate from internal conformance fixtures.
Use OpenAI-compatible clients for generated-language package examples when that is the supported provider path.
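The alias recommendation above can be sketched as a small lookup table that maps caller-facing names to provider targets. This is a hypothetical illustration in plain TypeScript, not an Ax API: the `modelAliases` table, the `resolveAlias` helper, and the placeholder model IDs are all assumptions.

```typescript
// Hypothetical alias table (not an Ax API). The model IDs are placeholders.
type ModelAlias = 'fast' | 'smart' | 'cheap';

interface ModelTarget {
  provider: string;
  model: string;
}

const modelAliases: Record<ModelAlias, ModelTarget> = {
  fast: { provider: 'openai', model: 'provider/fast-model' },
  smart: { provider: 'anthropic', model: 'provider/smart-model' },
  cheap: { provider: 'google-gemini', model: 'provider/cheap-model' },
};

// Own-property check so inherited keys such as "toString" are rejected.
function isModelAlias(value: string): value is ModelAlias {
  return Object.prototype.hasOwnProperty.call(modelAliases, value);
}

// Resolve a caller-facing alias, rejecting raw provider model IDs.
function resolveAlias(alias: string): ModelTarget {
  if (!isModelAlias(alias)) {
    throw new Error(`Unknown model alias: ${alias}`);
  }
  return modelAliases[alias];
}
```

Keeping the table in one place means a provider or model swap changes a single entry rather than every call site.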

See ai() LLM models and ai() API.