# Getting Started with Ax AI Providers and Models
This guide helps beginners get productive with Ax quickly: pick a provider, choose a model, and send a request. You'll also learn how to define model presets and set common options.
### 1. Install and set up

```bash
npm i @ax-llm/ax
```
Set your API keys as environment variables:
- `OPENAI_APIKEY`
- `ANTHROPIC_APIKEY`
- `GOOGLE_APIKEY` (or Google Vertex setup)
### 2. Create an AI instance

Use the `ai()` factory with a provider name and your API key.
```ts
import { ai, AxAIGoogleGeminiModel } from "@ax-llm/ax";

const llm = ai({
  name: "google-gemini",
  apiKey: process.env.GOOGLE_APIKEY!,
  config: {
    model: AxAIGoogleGeminiModel.Gemini20Flash,
  },
});
```
Supported providers include: `openai`, `anthropic`, `google-gemini`, `mistral`, `groq`, `cohere`, `together`, `deepseek`, `ollama`, `huggingface`, `openrouter`, `azure-openai`, `reka`, and `x-grok`.
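For example, switching providers only changes the `name` and API key (a minimal sketch; set `config.model` the same way as above if you need a specific model):

```ts
import { ai } from "@ax-llm/ax";

// Same factory, different provider: only the name and key change.
const claude = ai({
  name: "anthropic",
  apiKey: process.env.ANTHROPIC_APIKEY!,
});
```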
### 3. Choose models using presets (recommended)

Define a `models` list with user-friendly keys. Each item describes a preset and can include provider-specific settings. When you pass a key as `model`, Ax maps it to the right backend model and merges the preset config.
```ts
import { ai, AxAIGoogleGeminiModel } from "@ax-llm/ax";

const gemini = ai({
  name: "google-gemini",
  apiKey: process.env.GOOGLE_APIKEY!,
  config: { model: "simple" },
  models: [
    {
      key: "tiny",
      model: AxAIGoogleGeminiModel.Gemini20FlashLite,
      description: "Fast + cheap",
      // Provider config merged automatically
      config: { maxTokens: 1024, temperature: 0.3 },
    },
    {
      key: "simple",
      model: AxAIGoogleGeminiModel.Gemini20Flash,
      description: "Balanced general-purpose",
      config: { temperature: 0.6 },
    },
  ],
});

// Use a preset by key
await gemini.chat({
  model: "tiny",
  chatPrompt: [{ role: "user", content: "Summarize this:" }],
});
```
What gets merged when you pick a key:
- Model mapping: the preset's `model` replaces the key
- Tuning: `maxTokens`, `temperature`, `topP`, `topK`, penalties, `stopSequences`, `n`, `stream`
- Provider extras (Gemini): `config.thinking.thinkingTokenBudget` is mapped to Ax's levels automatically; `includeThoughts` maps to `showThoughts`
You can still override per-request:
```ts
await gemini.chat(
  { model: "simple", chatPrompt: [{ role: "user", content: "Hi" }] },
  { stream: false, thinkingTokenBudget: "medium" },
);
```
### 4. Send your first chat
```ts
const res = await gemini.chat({
  chatPrompt: [
    { role: "system", content: "You are concise." },
    { role: "user", content: "Write a haiku about the ocean." },
  ],
});

console.log(res.results[0]?.content);
```
### 5. Common options

- `stream` (boolean): enable server-sent events; `true` by default if supported
- `thinkingTokenBudget` (Gemini/Claude-like): `'minimal' | 'low' | 'medium' | 'high' | 'highest' | 'none'`
- `showThoughts` (if the model supports it): include thoughts in the output
- `functionCallMode`: `'auto' | 'native' | 'prompt'`
- `debug`, `logger`, `tracer`, `rateLimiter`, `timeout`
Example with overrides:
```ts
await gemini.chat(
  { chatPrompt: [{ role: "user", content: "Plan a weekend trip" }] },
  { stream: false, thinkingTokenBudget: "high", showThoughts: true },
);
```
### 6. Embeddings (if supported)

```ts
const { embeddings } = await gemini.embed({
  texts: ["hello", "world"],
  embedModel: "text-embedding-005",
});
```
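A common next step is comparing the returned vectors. This sketch computes cosine similarity between the two embeddings above; it is plain arithmetic and assumes no extra Ax APIs:

```ts
// Cosine similarity between two embedding vectors.
const cosine = (a: number[], b: number[]) => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

console.log("similarity:", cosine(embeddings[0], embeddings[1]));
```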
### 7. Context Caching
Context caching reduces costs and latency by caching large prompt prefixes
(system prompts, function definitions, examples) for reuse across multiple
requests. This is especially valuable for multi-turn agentic flows.
#### Enabling Context Caching
Pass the `contextCache` option to `forward()` to enable caching:
```ts
import { ai, ax, AxMemory } from "@ax-llm/ax";

const llm = ai({
  name: "google-gemini",
  apiKey: process.env.GOOGLE_APIKEY!,
});

const codeReviewer = ax(
  `code:string, language:string -> review:string, suggestions:string[]`,
  { description: "You are an expert code reviewer..." } // Large system prompt
);

const mem = new AxMemory();

// Example inputs for this snippet (any code/language pair works)
const code = "function add(a, b) { return a + b }";
const language = "javascript";

// Enable context caching
const result = await codeReviewer.forward(llm, { code, language }, {
  mem,
  sessionId: "code-review-session",
  contextCache: {
    ttlSeconds: 3600, // Cache TTL (1 hour)
  },
});
```
#### How It Works
Google Gemini (Explicit Caching):
- Creates a separate cache resource with an ID
- Cache persists across requests using the same `sessionId` + content hash (see the sketch below)
- Automatic TTL refresh when the cache is near expiration
- Provides up to 90% cost reduction on cached tokens
- Minimum 2048 tokens required for caching
Anthropic (Implicit Caching):
- Uses `cache_control` markers in the request
- System prompts are automatically cached
- Function definitions and results are marked for caching
- No explicit cache management needed
- Provides up to 90% cost reduction on cached tokens
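To see this in practice, the sketch below reuses the `codeReviewer` setup from the example above and keeps the `sessionId` and large prompt prefix stable across two calls, so the second request can hit the cache (`revisedCode` is just an illustrative second input):

```ts
// First call: creates (or finds) the cache for this session + content hash.
const firstPass = await codeReviewer.forward(llm, { code, language }, {
  mem,
  sessionId: "code-review-session",
  contextCache: { ttlSeconds: 3600 },
});

// Second call in the same session: the large prefix (description, functions)
// should be served from the cache instead of being re-billed in full.
const revisedCode = code.replace("add", "sum"); // illustrative revision
const followUp = await codeReviewer.forward(llm, { code: revisedCode, language }, {
  mem,
  sessionId: "code-review-session",
  contextCache: { ttlSeconds: 3600 },
});
```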
#### Configuration Options
```ts
type AxContextCacheOptions = {
  // Explicit cache name (bypasses auto-creation)
  name?: string;
  // TTL in seconds (default: 3600)
  ttlSeconds?: number;
  // Minimum tokens to create cache (default: 2048)
  minTokens?: number;
  // Window before expiration to trigger refresh (default: 300)
  refreshWindowSeconds?: number;
  // External registry for serverless environments
  registry?: AxContextCacheRegistry;
};
```
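For reference, here is a call that sets each option explicitly, reusing the code-review generator from above (the values shown are the documented defaults; the commented-out `name` is purely illustrative):

```ts
const cached = await codeReviewer.forward(llm, { code, language }, {
  sessionId: "code-review-session",
  contextCache: {
    ttlSeconds: 3600,          // cache lifetime
    minTokens: 2048,           // skip caching below this prefix size
    refreshWindowSeconds: 300, // refresh when this close to expiry
    // name: "my-existing-cache", // pin an existing cache resource (illustrative)
  },
});
```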
#### Multi-Turn Function Calling with Caching
When using functions/tools, caching is automatically applied:
```ts
import { ai, ax, type AxFunction } from "@ax-llm/ax";

const tools: AxFunction[] = [
  {
    name: "calculate",
    description: "Evaluate a math expression",
    parameters: { type: "object", properties: { expression: { type: "string" } } },
    // Demo only: eval() is unsafe for untrusted input.
    func: ({ expression }) => eval(expression),
  },
];

const agent = ax("question:string -> answer:string", {
  description: "You are a helpful assistant...",
  functions: tools,
});

const llm = ai({ name: "google-gemini", apiKey: process.env.GOOGLE_APIKEY! });

// Tools and function results are automatically cached
const result = await agent.forward(llm, { question: "What is 2^10?" }, {
  contextCache: { ttlSeconds: 3600 },
});
```
#### External Cache Registry (Serverless)
For serverless environments where in-memory state is lost, use an external registry:
```ts
// Redis-backed registry example (assumes an ioredis-style client in `redis`)
const registry: AxContextCacheRegistry = {
  get: async (key) => {
    const data = await redis.get(`cache:${key}`);
    return data ? JSON.parse(data) : undefined;
  },
  set: async (key, entry) => {
    await redis.set(`cache:${key}`, JSON.stringify(entry), "EX", 3600);
  },
};

const result = await gen.forward(llm, input, {
  sessionId: "my-session",
  contextCache: {
    ttlSeconds: 3600,
    registry,
  },
});
```
#### Supported Models
Gemini (Explicit Caching):
- Gemini 3 Flash/Pro
- Gemini 2.5 Pro/Flash/Flash-Lite
- Gemini 2.0 Flash/Flash-Lite
Anthropic (Implicit Caching):
- All Claude models support implicit caching
### 8. Tips

- Prefer presets: they give you friendly names and consistent tuning across your app
- Start with fast/cheap models for iteration; switch keys later without code changes
- Use `stream: false` in tests for simpler assertions (see the sketch after this list)
- In the browser, set `corsProxy` if needed
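For the testing tip, here is a minimal sketch using Node's built-in test runner and the `gemini` instance from earlier (adapt to your test framework):

```ts
import { test } from "node:test";
import assert from "node:assert/strict";

test("returns a non-empty completion", async () => {
  // stream: false yields a single response object, which is easier to assert on.
  const res = await gemini.chat(
    { chatPrompt: [{ role: "user", content: "Say hello" }] },
    { stream: false },
  );
  assert.ok(res.results[0]?.content?.length);
});
```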
For more examples, see the examples directory and provider-specific docs.
## AWS Bedrock Provider

The `@ax-llm/ax-ai-aws-bedrock` package provides production-ready AWS Bedrock integration supporting Claude, GPT OSS, and Titan Embed models.

### Installation

```bash
npm install @ax-llm/ax @ax-llm/ax-ai-aws-bedrock
```
### Quick Start

```ts
import { AxAIBedrock, AxAIBedrockModel } from "@ax-llm/ax-ai-aws-bedrock";
import { ax } from "@ax-llm/ax";

const ai = new AxAIBedrock({
  region: "us-east-2",
  config: { model: AxAIBedrockModel.ClaudeSonnet4 },
});

const generator = ax("question:string -> answer:string");

const result = await generator.forward(ai, {
  question: "What is AWS Bedrock?",
});

console.log(result.answer);
```
### Configuration

```ts
const ai = new AxAIBedrock({
  region: "us-east-2", // Primary AWS region
  fallbackRegions: ["us-west-2", "us-east-1"], // Fallback regions for Claude
  gptRegion: "us-west-2", // Primary region for GPT models
  gptFallbackRegions: ["us-east-1"], // Fallback regions for GPT
  config: {
    model: AxAIBedrockModel.ClaudeSonnet4,
    maxTokens: 4096,
    temperature: 0.7,
    topP: 0.9,
  },
});
```
### Supported Models

Claude Models:

- `AxAIBedrockModel.ClaudeSonnet4` - Claude Sonnet 4
- `AxAIBedrockModel.ClaudeOpus4` - Claude Opus 4
- `AxAIBedrockModel.Claude35Sonnet` - Claude 3.5 Sonnet
- `AxAIBedrockModel.Claude35Haiku` - Claude 3.5 Haiku
- `AxAIBedrockModel.Claude3Opus` - Claude 3 Opus

GPT Models:

- `AxAIBedrockModel.Gpt41106` - GPT-4 1106 Preview
- `AxAIBedrockModel.Gpt4Mini` - GPT-4o Mini

Embedding Models:

- `AxAIBedrockEmbedModel.TitanEmbedV2` - Titan Embed V2
### Regional Failover
The provider automatically handles regional failover for high availability. If the primary region fails, it retries with fallback regions.
### AWS Authentication

Uses the AWS SDK's default credential chain:

- Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
- AWS credentials file (`~/.aws/credentials`)
- IAM roles (EC2/Lambda)
## Vercel AI SDK Integration

The `@ax-llm/ax-ai-sdk-provider` package provides seamless integration with the Vercel AI SDK v5.

### Installation

```bash
npm install @ax-llm/ax @ax-llm/ax-ai-sdk-provider ai
```
### Basic Usage

```ts
import { ai } from "@ax-llm/ax";
import { AxAIProvider } from "@ax-llm/ax-ai-sdk-provider";
import { generateText, streamText } from "ai";

// Create Ax AI instance
const axAI = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
});

// Create AI SDK v5 compatible provider
const model = new AxAIProvider(axAI);

// Use with AI SDK functions
const result = await generateText({
  model,
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(result.text);
```
### Streaming with React Server Components

```tsx
import { ai } from "@ax-llm/ax";
import { AxAIProvider } from "@ax-llm/ax-ai-sdk-provider";
import { streamUI } from "ai/rsc";

const axAI = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
});

const model = new AxAIProvider(axAI);

const result = await streamUI({
  model,
  messages: [{ role: "user", content: "Tell me a story" }],
  text: ({ content }) => <p>{content}</p>,
});
```
### Agent Provider

Use Ax agents with the AI SDK:

```tsx
import { ai, agent } from "@ax-llm/ax";
import { AxAgentProvider } from "@ax-llm/ax-ai-sdk-provider";

const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });

const myAgent = agent("userInput:string -> response:string", {
  name: "helper",
  description: "A helpful assistant",
  ai: llm,
});

const agentProvider = new AxAgentProvider({
  agent: myAgent,
  updateState: (msgs) => {
    /* handle state updates */
  },
  generate: (result) => <div>{result.response}</div>,
});
```
### Features

- AI SDK v5 `LanguageModelV2` compatible
- Full tool/function calling support
- Streaming with lifecycle events
- Multi-modal inputs (text, images, files)
- Full TypeScript support
## Ax Tools Package

The `@ax-llm/ax-tools` package provides additional tools for Ax, including MCP (Model Context Protocol) support and a JavaScript interpreter.

### Installation

```bash
npm install @ax-llm/ax @ax-llm/ax-tools
```
### MCP Stdio Transport

Connect to MCP servers via stdio:

```ts
import { AxMCPClient } from "@ax-llm/ax";
import { axCreateMCPStdioTransport } from "@ax-llm/ax-tools";

// Create transport for an MCP server
const transport = axCreateMCPStdioTransport({
  command: "npx",
  args: ["-y", "@anthropic/mcp-server-filesystem"],
  env: { HOME: process.env.HOME },
});

// Use with AxMCPClient
const client = new AxMCPClient(transport);
await client.init();

const tools = await client.getTools();
console.log("Available tools:", tools.map((t) => t.name));
```
### JavaScript Interpreter

A sandboxed JavaScript interpreter that can be used as a function tool:

```ts
import { ai, ax } from "@ax-llm/ax";
import {
  AxJSInterpreter,
  AxJSInterpreterPermission,
} from "@ax-llm/ax-tools";

// Create interpreter with specific permissions
const interpreter = new AxJSInterpreter({
  permissions: [
    AxJSInterpreterPermission.CRYPTO,
    AxJSInterpreterPermission.OS,
  ],
});

// Use as a function tool
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });

const codeRunner = ax("task:string -> result:string", {
  functions: [interpreter.toFunction()],
});

const result = await codeRunner.forward(llm, {
  task: "Calculate the factorial of 10",
});
```
#### Permissions

Control what the interpreter can access:

| Permission | Description |
|---|---|
| `FS` | File system access (`node:fs`) |
| `NET` | Network access (`http`, `https`) |
| `OS` | OS information (`node:os`) |
| `CRYPTO` | Cryptographic functions |
| `PROCESS` | Process information |
```ts
import { AxJSInterpreter, AxJSInterpreterPermission } from "@ax-llm/ax-tools";

const interpreter = new AxJSInterpreter({
  permissions: [
    AxJSInterpreterPermission.FS,
    AxJSInterpreterPermission.NET,
  ],
});
```