# Ax Framework Documentation
# Generated from project documentation
# Last updated: 2026-03-11T20:56:50.186Z
================================================================================
# Documentation
# Source: README.md
# Ax - Build Reliable AI Apps in TypeScript with DSPy
Ax brings DSPy's approach to TypeScript – describe what you want, and let the framework handle the rest. Production-ready, type-safe, works with all major LLMs.
[npm](https://www.npmjs.com/package/@ax-llm/ax) · [Twitter](https://twitter.com/dosco) · [Discord](https://discord.gg/DSHg3dU7dW)
## The Problem
Building with LLMs is painful. You write prompts, test them, they break. You switch providers, everything needs rewriting. You add validation, error handling, retries – suddenly you're maintaining infrastructure instead of shipping features.
## The Solution
Define what goes in and what comes out. Ax handles the rest.
```typescript
import { ai, ax } from "@ax-llm/ax";
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY });
const classifier = ax(
  'review:string -> sentiment:class "positive, negative, neutral"',
);
const result = await classifier.forward(llm, {
  review: "This product is amazing!",
});
console.log(result.sentiment); // "positive"
```
No prompt engineering. No trial and error. Works with GPT-4, Claude, Gemini, or any LLM.
## Why Ax
**Write once, run anywhere.** Switch between OpenAI, Anthropic, Google, or 15+ providers with one line. No rewrites.
**Ship faster.** Stop tweaking prompts. Define inputs and outputs. The framework generates optimal prompts automatically.
**Production-ready.** Built-in streaming, validation, error handling, observability. Used in production handling millions of requests.
**Gets smarter.** Train your programs with examples. Watch accuracy improve automatically. No ML expertise needed.
## Examples
### Extract structured data
```typescript
const extractor = ax(`
  customerEmail:string, currentDate:datetime ->
  priority:class "high, normal, low",
  sentiment:class "positive, negative, neutral",
  ticketNumber?:number,
  nextSteps:string[],
  estimatedResponseTime:string
`);
const result = await extractor.forward(llm, {
  customerEmail: "Order #12345 hasn't arrived. Need this resolved immediately!",
  currentDate: new Date(),
});
```
### Complex nested objects
```typescript
import { f, ax } from "@ax-llm/ax";
const productExtractor = f()
  .input("productPage", f.string())
  .output("product", f.object({
    name: f.string(),
    price: f.number(),
    specs: f.object({
      dimensions: f.object({
        width: f.number(),
        height: f.number()
      }),
      materials: f.array(f.string())
    }),
    reviews: f.array(f.object({
      rating: f.number(),
      comment: f.string()
    }))
  }))
  .build();
const generator = ax(productExtractor);
const result = await generator.forward(llm, { productPage: "..." });
// Full TypeScript inference
console.log(result.product.specs.dimensions.width);
console.log(result.product.reviews[0].comment);
```
### Validation and constraints
```typescript
const userRegistration = f()
  .input("userData", f.string())
  .output("user", f.object({
    username: f.string().min(3).max(20),
    email: f.string().email(),
    age: f.number().min(18).max(120),
    password: f.string().min(8).regex("^(?=.*[A-Za-z])(?=.*\\d)", "Must contain letter and digit"),
    bio: f.string().max(500).optional(),
    website: f.string().url().optional(),
  }))
  .build();
```
Available constraints: `.min(n)`, `.max(n)`, `.email()`, `.url()`, `.date()`, `.datetime()`, `.regex(pattern, description)`, `.optional()`
Validation runs on both input and output. Automatic retry with corrections on validation errors.
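The `.regex()` constraint uses standard JavaScript regular expression semantics, so a pattern like the password rule above can be sanity-checked on its own before wiring it into a signature:

```typescript
// The password pattern from the example: at least one letter and one digit,
// expressed as two lookaheads anchored at the start of the string
const passwordRule = /^(?=.*[A-Za-z])(?=.*\d)/;

console.log(passwordRule.test("abc12345")); // true: letters and a digit
console.log(passwordRule.test("password")); // false: no digit
console.log(passwordRule.test("12345678")); // false: no letter
```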
### Agents with tools (ReAct pattern)
```typescript
const assistant = ax(
  "question:string -> answer:string",
  {
    functions: [
      { name: "getCurrentWeather", func: weatherAPI },
      { name: "searchNews", func: newsAPI },
    ],
  },
);
const result = await assistant.forward(llm, {
  question: "What's the weather in Tokyo and any news about it?",
});
```
### AxAgent + RLM for long context
```typescript
import { agent, AxJSRuntime } from "@ax-llm/ax";
const analyzer = agent(
  "context:string, query:string -> answer:string, evidence:string[]",
  {
    name: "documentAnalyzer",
    description: "Analyze very long documents with recursive code + sub-queries",
    maxSteps: 20,
    rlm: {
      contextFields: ["context"],
      runtime: new AxJSRuntime(),
      maxSubAgentCalls: 40,
      maxRuntimeChars: 2_000, // Shared cap for llmQuery context + interpreter output
      maxBatchedLlmQueryConcurrency: 6,
      subModel: "gpt-4o-mini",
    },
  },
);
const result = await analyzer.forward(llm, {
  context: veryLongDocument,
  query: "What are the main arguments and supporting evidence?",
});
```
RLM mode keeps long context out of the root prompt, runs iterative analysis in a persistent runtime session, and uses bounded sub-queries for semantic extraction (typically targeting <=10k chars per sub-call).
### AxJSRuntime
`AxJSRuntime` is the built-in JavaScript runtime used by RLM and tool-style execution.
It works across:
- Node.js/Bun-style backends (worker_threads runtime path)
- Deno backends (module worker path)
- Browser environments (Web Worker path)
It supports:
- Persistent sessions via `createSession()`
- Function tool usage via `toFunction()`
- Sandbox permissions via `AxJSRuntimePermission`
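A minimal sketch of these entry points, assuming the names listed above; the exact option and method signatures may differ from the released API, so treat this as illustrative rather than authoritative:

```typescript
import { ai, ax, AxJSRuntime, AxJSRuntimePermission } from "@ax-llm/ax";

// Sketch only: `createSession()`, `toFunction()`, and the `permissions`
// option come from the capability list above.
const runtime = new AxJSRuntime({
  permissions: [AxJSRuntimePermission.NETWORK], // opt in to sandbox capabilities
});

// Persistent session: state survives across executions within one run
const session = runtime.createSession();

// Or expose the runtime as a function tool on any generator
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const runner = ax("task:string -> result:string", {
  functions: [runtime.toFunction()],
});
```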
### Multi-modal (images, audio)
```typescript
const analyzer = ax(`
  image:image, question:string ->
  description:string,
  mainColors:string[],
  category:class "electronics, clothing, food, other",
  estimatedPrice:string
`);
```
## Install
```bash
npm install @ax-llm/ax
```
Additional packages:
```bash
# AWS Bedrock provider
npm install @ax-llm/ax-ai-aws-bedrock
# Vercel AI SDK v5 integration
npm install @ax-llm/ax-ai-sdk-provider
# Tools: MCP stdio transport, JS runtime
npm install @ax-llm/ax-tools
```
## Features
- **15+ LLM Providers** – OpenAI, Anthropic, Google, Mistral, Ollama, and more
- **Type-safe** – Full TypeScript support with auto-completion
- **Streaming** – Real-time responses with validation
- **Multi-modal** – Images, audio, text in the same signature
- **Optimization** – Automatic prompt tuning with MiPRO, ACE, GEPA
- **Observability** – OpenTelemetry tracing built-in
- **Workflows** – Compose complex pipelines with AxFlow
- **RAG** – Multi-hop retrieval with quality loops
- **Agents** – Tools and multi-agent collaboration
- **RLM in AxAgent** – Long-context analysis with recursive runtime loops
- **Zero dependencies** – Lightweight, fast, reliable
## Documentation
**Get Started**
- [Quick Start Guide](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/QUICKSTART.md) – Set up in 5 minutes
- [Examples Guide](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/EXAMPLES.md) – Comprehensive examples
- [DSPy Concepts](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/DSPY.md) – Understanding the approach
- [Signatures Guide](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/SIGNATURES.md) – Type-safe signature design
**Deep Dives**
- [AI Providers](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/AI.md) – All providers, AWS Bedrock, Vercel AI SDK
- [AxFlow Workflows](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/AXFLOW.md) – Build complex AI systems
- [Optimization (MiPRO, ACE, GEPA)](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/OPTIMIZE.md) – Make programs smarter
- [AxAgent & RLM](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/AXAGENT.md) – Agents, child agents, tools, and RLM for long contexts
- [Advanced RAG](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/docs/AXRAG.md) – Production search and retrieval
## Run Examples
```bash
OPENAI_APIKEY=your-key npm run tsx ./src/examples/[example-name].ts
```
Core examples: `extract.ts`, `react.ts`, `agent.ts`, `streaming1.ts`, `multi-modal.ts`
Production patterns: `customer-support.ts`, `food-search.ts`, `rlm.ts`, `ace-train-inference.ts`, `ax-flow-enhanced-demo.ts`
[View all 70+ examples](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/)
## Community
- [Twitter](https://twitter.com/dosco) – Updates
- [Discord](https://discord.gg/DSHg3dU7dW) – Help and discussion
- [GitHub](https://github.com/ax-llm/ax) – Star the project
- [DeepWiki](https://deepwiki.com/ax-llm/ax) – AI-powered docs
## Production Ready
- Battle-tested in production
- Stable minor versions
- Comprehensive test coverage
- OpenTelemetry built-in
- TypeScript first
## Contributors
- Author: [@dosco](https://github.com/dosco)
- GEPA and ACE optimizers: [@monotykamary](https://github.com/monotykamary)
## License
Apache 2.0
---
```bash
npm install @ax-llm/ax
```
================================================================================
# Quick Start
# Source: QUICKSTART.md
# Get from zero to your first AI application in 5 minutes
# Quick Start Guide
This guide will get you from zero to your first AI application in 5 minutes.
## Prerequisites
- Node.js 20 or higher
- An API key from OpenAI, Anthropic, or Google (we'll use OpenAI in this guide)
## Installation
```bash
npm install @ax-llm/ax
```
### Additional Packages
```bash
# AWS Bedrock provider (Claude, GPT, Titan on AWS)
npm install @ax-llm/ax-ai-aws-bedrock
# Vercel AI SDK v5 integration
npm install @ax-llm/ax-ai-sdk-provider
# Tools: MCP stdio transport, JS interpreter
npm install @ax-llm/ax-tools
```
See the [AI Providers Guide](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AI.md) for detailed documentation on each package.
## Step 1: Set Up Your API Key
Create a `.env` file in your project root:
```bash
OPENAI_APIKEY=your-api-key-here
```
Or export it in your terminal:
```bash
export OPENAI_APIKEY=your-api-key-here
```
## Step 2: Your First AI Program
Create a file called `hello-ai.ts`:
```typescript
import { ai, ax } from "@ax-llm/ax";
// Initialize your AI provider
const llm = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!
});
// Create a simple classifier
const sentimentAnalyzer = ax(
  'reviewText:string -> sentiment:class "positive, negative, neutral"'
);
// Use it!
async function analyze() {
  const result = await sentimentAnalyzer.forward(llm, {
    reviewText: "This product exceeded all my expectations!"
  });
  console.log(`Sentiment: ${result.sentiment}`);
}
analyze();
```
## Step 3: Run Your Program
```bash
npx tsx hello-ai.ts
```
You should see:
```
Sentiment: positive
```
## What Just Happened?
1. **No prompt engineering** - You didn't write any prompts, just described what you wanted
2. **Type safety** - TypeScript knows that `result.sentiment` is one of your three classes
3. **Automatic optimization** - The framework generated an optimal prompt for you
4. **Provider agnostic** - This same code works with Claude, Gemini, or any other LLM
## Next: Add Streaming
Want to see results as they generate? Add one parameter:
```typescript
const result = await sentimentAnalyzer.forward(
  llm,
  { reviewText: "Great product!" },
  { stream: true } // ← Enable streaming
);
```
## Next: Multi-Modal (Images)
Work with images just as easily:
```typescript
import fs from "fs";
const imageAnalyzer = ax(
  'photo:image, question:string -> answer:string'
);
const imageData = fs.readFileSync("photo.jpg").toString("base64");
const result = await imageAnalyzer.forward(llm, {
  photo: { mimeType: "image/jpeg", data: imageData },
  question: "What's in this image?"
});
```
## Next: Complex Workflows
Build multi-step processes:
```typescript
const documentProcessor = ax(`
  documentText:string ->
  summary:string "2-3 sentences",
  keyPoints:string[] "main points",
  sentiment:class "positive, negative, neutral"
`);
const result = await documentProcessor.forward(llm, {
  documentText: "Your long document here..."
});
console.log(`Summary: ${result.summary}`);
console.log(`Key Points: ${result.keyPoints.join(", ")}`);
console.log(`Sentiment: ${result.sentiment}`);
```
## Next: Add Validation
Ensure data quality with built-in validators:
```typescript
import { f, ax } from "@ax-llm/ax";
const contactForm = f()
  .input("formData", f.string())
  .output("contact", f.object({
    name: f.string().min(2).max(100),
    email: f.string().email(),
    age: f.number().min(18).max(120),
    website: f.string().url().optional(),
    message: f.string().min(10).max(500)
  }))
  .build();
const generator = ax(contactForm);
const result = await generator.forward(llm, {
  formData: "Name: John Doe, Email: john@example.com, Age: 30..."
});
// All fields are automatically validated:
// - name: 2-100 characters
// - email: valid email format
// - age: between 18-120
// - website: valid URL if provided
// - message: 10-500 characters
```
**Available Constraints:**
- `.min(n)` / `.max(n)` - String length or number range
- `.email()` - Email format validation (or use `f.email()`)
- `.url()` - URL format validation (or use `f.url()`)
- `.date()` - Date format validation (or use `f.date()`)
- `.datetime()` - Datetime format validation (or use `f.datetime()`)
- `.regex(pattern, description)` - Custom regex pattern with human-readable description
- `.optional()` - Make field optional
**Note:** For email, url, date, and datetime, you can use either the validator syntax (`f.string().email()`) or the dedicated type syntax (`f.email()`). Both work consistently in all contexts!
Validation runs automatically:
- ✅ **Before LLM calls** - Input validation ensures clean data
- ✅ **After LLM responses** - Output validation with auto-retry on errors
- ✅ **During streaming** - Incremental validation as fields complete
## Using Different Providers
### OpenAI
```typescript
const llm = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: AxAIOpenAIModel.GPT4O } // Optional: specify model
});
```
### Anthropic Claude
```typescript
const llm = ai({
  name: "anthropic",
  apiKey: process.env.ANTHROPIC_APIKEY!,
  config: { model: AxAIAnthropicModel.Claude35Sonnet }
});
```
### Google Gemini
```typescript
const llm = ai({
  name: "google-gemini",
  apiKey: process.env.GOOGLE_APIKEY!,
  config: { model: AxAIGoogleGeminiModel.Gemini15Pro }
});
```
### Local Ollama
```typescript
const llm = ai({
  name: "ollama",
  config: { model: "llama3.2" }
});
```
## Field Types Reference
| Type | Example | Description |
|------|---------|-------------|
| `string` | `name:string` | Text input/output |
| `number` | `score:number` | Numeric values |
| `boolean` | `isValid:boolean` | True/false |
| `class` | `category:class "a,b,c"` | Enumeration |
| `string[]` | `tags:string[]` | Array of strings |
| `json` | `data:json` | Any JSON object |
| `image` | `photo:image` | Image input |
| `audio` | `recording:audio` | Audio input |
| `date` | `dueDate:date` | Date value |
| `?` | `notes?:string` | Optional field |
## Common Patterns
### Classification
```typescript
const classifier = ax(
  'text:string -> category:class "option1, option2, option3"'
);
```
### Extraction
```typescript
const extractor = ax(
  'document:string -> names:string[], dates:date[], amounts:number[]'
);
```
### Question Answering
```typescript
const qa = ax(
  'context:string, question:string -> answer:string'
);
```
### Translation
```typescript
const translator = ax(
  'text:string, targetLanguage:string -> translation:string'
);
```
## Error Handling
```typescript
try {
  const result = await gen.forward(llm, input);
} catch (error) {
  console.error("Generation failed:", error);
}
```
## Debug Mode
See what's happening under the hood:
```typescript
const llm = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!,
  options: { debug: true } // Enable debug logging
});
```
## What's Next?
Now that you have the basics:
1. **Explore Examples** - Check out the [examples directory](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/) for real-world patterns
2. **Learn DSPy Concepts** - Understand the [revolutionary approach](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/DSPY.md)
3. **Build Workflows** - Create complex systems with [AxFlow](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AXFLOW.md)
4. **Optimize Performance** - Make your programs smarter with [optimization](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/OPTIMIZE.md)
5. **Add Observability** - Monitor production apps with [telemetry](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/TELEMETRY.md)
## Need Help?
- 💬 [Join our Discord](https://discord.gg/DSHg3dU7dW)
- 📖 [Read the docs](https://github.com/ax-llm/ax)
- 🐦 [Follow on Twitter](https://twitter.com/dosco)
## 🔗 Integration with Vercel AI SDK v5
Ax provides seamless integration with the Vercel AI SDK through `@ax-llm/ax-ai-sdk-provider`:
### Installation
```bash
npm install @ax-llm/ax-ai-sdk-provider
```
### Basic Usage
```typescript
import { ai } from "@ax-llm/ax";
import { AxAIProvider } from "@ax-llm/ax-ai-sdk-provider";
import { streamUI } from "ai/rsc";
// Create Ax AI instance
const axAI = ai({
  name: "openai",
  apiKey: process.env.OPENAI_APIKEY!
});
// Create AI SDK v5 compatible provider
const model = new AxAIProvider(axAI);
// Use with AI SDK functions
const result = await streamUI({
  model,
  messages: [
    { role: "user", content: "Hello!" }
  ],
  text: ({ content }) => content,
});
```
### Features
- ✅ **AI SDK v5 Compatible**: Implements `LanguageModelV2` specification
- ✅ **Full Tool Support**: Function calling with proper serialization
- ✅ **Streaming**: Enhanced streaming with lifecycle events
- ✅ **Multi-modal**: Text, images, and file inputs
- ✅ **Type Safety**: Full TypeScript support
> **Note**: This allows you to use Ax's powerful AI provider ecosystem with any AI SDK v5 application, giving you access to 15+ LLM providers through a single interface.
---
## 🔗 AWS Bedrock Provider
Use Claude, GPT, and Titan models on AWS with `@ax-llm/ax-ai-aws-bedrock`:
### Installation
```bash
npm install @ax-llm/ax-ai-aws-bedrock
```
### Basic Usage
```typescript
import { AxAIBedrock, AxAIBedrockModel } from "@ax-llm/ax-ai-aws-bedrock";
import { ax } from "@ax-llm/ax";
const ai = new AxAIBedrock({
  region: "us-east-2",
  config: { model: AxAIBedrockModel.ClaudeSonnet4 }
});
const generator = ax("question:string -> answer:string");
const result = await generator.forward(ai, {
  question: "What is AWS Bedrock?"
});
console.log(result.answer);
```
### Features
- ✅ **Claude, GPT, Titan**: All major Bedrock models supported
- ✅ **Regional Failover**: Automatic failover across AWS regions
- ✅ **Embeddings**: Titan Embed V2 for vector embeddings
- ✅ **AWS Auth**: Uses standard AWS credential chain
---
## 🔗 Ax Tools Package
Additional tools for MCP and code execution with `@ax-llm/ax-tools`:
### Installation
```bash
npm install @ax-llm/ax-tools
```
### MCP Stdio Transport
```typescript
import { AxMCPClient } from "@ax-llm/ax";
import { axCreateMCPStdioTransport } from "@ax-llm/ax-tools";
const transport = axCreateMCPStdioTransport({
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem"]
});
const client = new AxMCPClient(transport);
await client.init();
const tools = await client.getTools();
```
### JavaScript Interpreter
```typescript
import { ai, ax, AxJSRuntime, AxJSRuntimePermission } from "@ax-llm/ax";
const runtime = new AxJSRuntime({
  permissions: [AxJSRuntimePermission.NETWORK]
});
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const codeRunner = ax("task:string -> result:string", {
  functions: [runtime.toFunction()]
});
```
`AxJSRuntime` is the Ax JS runtime used for sandboxed code
execution across Node.js/Bun-style backends, Deno, and browser environments.
---
Remember: **You're not writing prompts, you're declaring capabilities.** Let the framework handle the complexity while you focus on building.
================================================================================
# DSPy Concepts
# Source: DSPY.md
# The revolutionary approach to building with LLMs
# DSPy in TypeScript: The Future of Building with LLMs
## The Problem: LLMs Are Powerful but Unpredictable
Working with LLMs today feels like herding cats. You write prompts, tweak them endlessly, and still get inconsistent results. When you switch models or providers, everything breaks. Sound familiar?
**What if you could just describe what you want, and let the system figure out the best way to get it?**
## Enter DSPy: A Revolutionary Approach
DSPy (Demonstrate–Search–Predict) changes everything. Instead of writing prompts, you write **signatures** – simple declarations of what goes in and what comes out. The framework handles the rest.
Think of it like this:
- **Traditional approach**: "Please analyze the sentiment of this review, considering positive, negative, and neutral tones..."
- **DSPy approach**: `reviewText:string -> sentiment:class "positive, negative, neutral"`
That's it. The system generates optimal prompts, validates outputs, and even improves itself over time.
## See It in Action (30 Seconds)
```typescript
import { ai, ax } from "@ax-llm/ax";
// 1. Pick your LLM
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
// 2. Declare what you want
const classifier = ax('reviewText:string -> sentiment:class "positive, negative, neutral"');
// 3. Just use it
const result = await classifier.forward(llm, {
  reviewText: "This product exceeded my expectations!"
});
console.log(result.sentiment); // "positive"
```
**That's a complete, production-ready sentiment analyzer.** No prompt engineering. No trial and error.
## Why DSPy Will Change How You Build
### 1. 🎯 **Write Once, Run Anywhere**
Your code works with OpenAI, Google, Anthropic, or any LLM. Switch providers with one line. No rewrites.
### 2. ⚡ **Stream Everything**
Get results as they generate. Validate on-the-fly. Fail fast. Ship faster.
```typescript
const gen = ax("question:string -> answer:string");
// Stream responses in real-time
await gen.forward(llm, { question: "Hello" }, { stream: true });
```
### 3. 🛡️ **Built-in Quality Control**
Add assertions that run during generation. Catch issues before they reach users.
```typescript
const gen = ax("question:string -> answer:string, confidence:number");
// Method 1: Return error string for custom messages (recommended)
gen.addAssert(({ answer }) => {
  if (answer.length < 10) {
    return `Answer too short: ${answer.length} characters (minimum 10)`;
  }
  return true;
});
// Method 2: Return false with fallback message
gen.addAssert(
  ({ confidence }) => confidence > 0.7,
  "Confidence must be above 70%"
);
// Method 3: Throw for immediate failure
gen.addAssert(({ answer }) => {
  if (answer.includes('offensive-term')) {
    throw new Error('Content moderation failed');
  }
  return true;
});
// Streaming assertions for real-time validation
gen.addStreamingAssert('answer', (content, done) => {
  if (!done) return undefined; // Wait for complete content
  return content.length >= 10 ? true : 'Answer too brief';
});
```
### 4. 🚀 **Automatic Optimization**
Train your programs with examples. Watch them improve automatically.
```typescript
const optimizer = new AxMiPRO({ studentAI: llm, examples: trainingData });
const improved = await optimizer.compile(classifier, examples, metric);
// Your classifier just got 30% more accurate!
```
### 5. 🎨 **Multi-Modal Native**
Images, audio, text – all in the same signature. It just works.
```typescript
const vision = ax("photo:image, question:string -> description:string");
```
## Real-World Power: Build Complex Systems Simply
### Smart Customer Support in 5 Lines
```typescript
const supportBot = ax(`
  customerMessage:string ->
  category:class "billing, technical, general",
  priority:class "high, medium, low",
  suggestedResponse:string
`);
// That's it. You have intelligent ticket routing and response generation.
```
### Multi-Step Reasoning? Trivial.
```typescript
const researcher = ax(`
  question:string ->
  searchQueries:string[] "3-5 queries",
  analysis:string,
  confidence:number "0-1"
`);
```
## Beyond Simple Generation: Production Features
### Complete Observability
- OpenTelemetry tracing built-in
- Track every decision, optimization, and retry
- Monitor costs, latency, and quality in real-time
### Enterprise-Ready Workflows
AxFlow lets you compose signatures into complex pipelines with automatic parallelization:
```typescript
new AxFlow()
  .node("analyzer", "text:string -> sentiment:string")
  .node("summarizer", "text:string -> summary:string")
  .execute("analyzer", (state) => ({ text: state.text }))
  .execute("summarizer", (state) => ({ text: state.text }))
// Both run in parallel automatically!
```
### Advanced RAG Out of the Box
```typescript
const rag = axRAG(vectorDB, {
  maxHops: 3, // Multi-hop retrieval
  qualityTarget: 0.85, // Self-healing quality loops
});
// Enterprise RAG in 3 lines
```
## Start Now: From Zero to Production
### Install (30 seconds)
```bash
npm install @ax-llm/ax
```
### Your First Intelligent App (2 minutes)
```typescript
import { ai, ax } from "@ax-llm/ax";
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
// Create any AI capability with a signature
const translator = ax(`
  text:string,
  targetLanguage:string ->
  translation:string,
  confidence:number "0-1"
`);
const result = await translator.forward(llm, {
  text: "Hello world",
  targetLanguage: "French"
});
// { translation: "Bonjour le monde", confidence: 0.95 }
```
## The Bottom Line
**Stop fighting with prompts. Start building with signatures.**
DSPy isn't just another LLM library. It's a fundamental shift in how we build AI systems:
- **Deterministic** where it matters (structure, types, validation)
- **Flexible** where you need it (providers, models, optimization)
- **Production-ready** from day one (streaming, observability, scaling)
## Ready to Build the Future?
### Quick Wins
- [Simple Examples](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/) - Start here
- [Streaming Magic](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/streaming1.ts) - Real-time validation
- [Multi-Modal](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/multi-modal.ts) - Images + text together
### Level Up
- [Optimization Guide](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/OPTIMIZE.md) - Make your programs smarter
- [AxFlow Workflows](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AXFLOW.md) - Build complex systems
- [Advanced RAG](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AXRAG.md) - Production search & retrieval
### Join the Revolution
- 🐦 [Follow Updates](https://twitter.com/dosco)
- 💬 [Discord Community](https://discord.gg/DSHg3dU7dW)
- ⭐ [Star on GitHub](https://github.com/ax-llm/ax)
---
**Remember**: Every prompt you write today is technical debt. Every signature you write is an asset that gets better over time.
Welcome to the future of building with LLMs. Welcome to DSPy with Ax.
================================================================================
# Signatures Guide
# Source: SIGNATURES.md
# Complete guide to DSPy signatures - from basics to advanced patterns
# The Complete Guide to DSPy Signatures in Ax
## Introduction: Why Signatures Beat Prompts
Traditional prompt engineering is like writing assembly code – tedious, fragile, and requires constant tweaking. DSPy signatures are like high-level programming – you describe **what** you want, not **how** to get it.
### The Problem with Prompts
```typescript
// ❌ Traditional approach - fragile and verbose
const prompt = `You are a sentiment analyzer. Given a customer review,
analyze the sentiment and return exactly one of: positive, negative, or neutral.
Be sure to only return the sentiment word, nothing else.
Review: ${review}
Sentiment:`;
// Hope the LLM follows instructions...
```
### The Power of Signatures
```typescript
// ✅ Signature approach - clear and type-safe
const analyzer = ax('review:string -> sentiment:class "positive, negative, neutral"');
// Guaranteed structured output with TypeScript types!
const result = await analyzer.forward(llm, { review });
console.log(result.sentiment); // TypeScript knows this is "positive" | "negative" | "neutral"
```
## Understanding Signature Syntax
A signature defines the contract between your code and the LLM:
```
[description] input1:type, input2:type -> output1:type, output2:type
```
### Basic Structure
1. **Optional Description**: Overall purpose in quotes
2. **Input Fields**: What you provide to the LLM
3. **Arrow (`->`)**: Separates inputs from outputs
4. **Output Fields**: What the LLM returns
### Examples
```typescript
// Simple Q&A
'userQuestion:string -> aiAnswer:string'
// With description
'"Answer questions about TypeScript" question:string -> answer:string, confidence:number'
// Multiple inputs and outputs
'document:string, query:string -> summary:string, relevantQuotes:string[]'
```
## Field Types Reference
Ax supports a rich type system that maps directly to TypeScript types:
### Basic Types
| Type | Signature Syntax | TypeScript Type | Example |
|------|-----------------|-----------------|---------|
| String | `:string` | `string` | `userName:string` |
| Number | `:number` | `number` | `score:number` |
| Boolean | `:boolean` | `boolean` | `isValid:boolean` |
| JSON | `:json` | `any` | `metadata:json` |
### Date and Time Types
| Type | Signature Syntax | TypeScript Type | Example |
|------|-----------------|-----------------|---------|
| Date | `:date` | `Date` | `birthDate:date` |
| DateTime | `:datetime` | `Date` | `timestamp:datetime` |
### Media Types (Input Only)
| Type | Signature Syntax | TypeScript Type | Example |
|------|-----------------|-----------------|---------|
| Image | `:image` | `{mimeType: string, data: string}` | `photo:image` |
| Audio | `:audio` | `{format?: 'wav', data: string}` | `recording:audio` |
| File | `:file` | `{mimeType: string, data: string}` | `document:file` |
| URL | `:url` | `string` | `website:url` |
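As a sketch of how the `file` and `url` payload shapes from the table are passed to `forward` (the field names and the sample PDF here are illustrative, not from the Ax docs):

```typescript
import fs from "node:fs";
import { ai, ax } from "@ax-llm/ax";

// Illustrative signature exercising the `file` and `url` input types
const contractQA = ax("contract:file, source:url, question:string -> answer:string");

const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const result = await contractQA.forward(llm, {
  // `file` takes the same {mimeType, data} shape as `image`, base64-encoded
  contract: {
    mimeType: "application/pdf",
    data: fs.readFileSync("contract.pdf").toString("base64"),
  },
  // `url` is passed as a plain string
  source: "https://example.com/contract",
  question: "What is the termination clause?",
});
console.log(result.answer);
```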
### Special Types
| Type | Signature Syntax | TypeScript Type | Example |
|------|-----------------|-----------------|---------|
| Code | `:code` | `string` | `pythonScript:code` |
| Classification | `:class "opt1, opt2"` | `"opt1" \| "opt2"` | `mood:class "happy, sad, neutral"` |
## Arrays and Optional Fields
### Arrays
Add `[]` after any type to make it an array:
```typescript
// String array
'tags:string[] -> processedTags:string[]'
// Number array
'scores:number[] -> average:number, median:number'
// Classification array
'documents:string[] -> categories:class[] "news, blog, tutorial"'
```
### Optional Fields
Add `?` before the colon to make a field optional:
```typescript
// Optional input
'query:string, context?:string -> response:string'
// Optional output
'text:string -> summary:string, keywords?:string[]'
// Both optional
'message?:string -> reply?:string, confidence:number'
```
## Advanced Features
### Internal Fields (Output Only)
Use `!` to mark output fields as internal (for reasoning/chain-of-thought):
```typescript
// Internal fields are hidden from the final output but guide LLM reasoning
'problem:string -> reasoning!:string, solution:string'
```
### Classification Fields (Output Only)
Classifications provide type-safe enums:
```typescript
// Single classification
'email:string -> priority:class "urgent, normal, low"'
// Multiple options with pipe separator
'text:string -> sentiment:class "positive | negative | neutral"'
// Array of classifications
'reviews:string[] -> sentiments:class[] "positive, negative, neutral"'
```
## Creating Signatures: Three Approaches
### 1. String-Based (Recommended)
```typescript
import { ax, s } from '@ax-llm/ax';
// Direct generator creation
const generator = ax('input:string -> output:string');
// Create signature first, then generator
const sig = s('query:string -> response:string');
const gen = ax(sig.toString());
```
### 2. Pure Fluent Builder API
```typescript
import { f } from '@ax-llm/ax';
// Using the pure fluent builder - supports .optional(), .array(), .internal(), .cache()
const signature = f()
  .input('userMessage', f.string('User input'))
  .input('contextData', f.string('Additional context').optional())
  .input('tags', f.string('Keywords').array())
  .input('categories', f.string('Categories').optional().array())
  .output('responseText', f.string('AI response'))
  .output('confidenceScore', f.number('Confidence score 0-1'))
  .output('debugInfo', f.string('Debug information').internal())
  .build();
```
### 3. Hybrid Approach
```typescript
import { s, f } from '@ax-llm/ax';
// Start with string, add fields programmatically
const sig = s('base:string -> result:string')
  .appendInputField('extra', f.json('Metadata').optional())
.appendOutputField('score', f.number('Quality score'));
```
## Pure Fluent API Reference
The fluent API is purely fluent: modifiers are applied exclusively through method chaining with `.optional()`, `.array()`, `.internal()`, and `.cache()`. Nested function calls are no longer supported.
### ✅ Pure Fluent Syntax (Current)
```typescript
import { f } from '@ax-llm/ax';
// Basic field types
const stringField = f.string('description');
const numberField = f.number('description');
const booleanField = f.boolean('description');
// Array types - use .array() method chaining
const stringArray = f.string('array description').array();
const numberArray = f.number('array description').array();
const booleanArray = f.boolean('array description').array();
// Optional fields - use .optional() method chaining
const optionalString = f.string('optional description').optional();
const optionalArray = f.string('optional array').optional().array();
const arrayOptional = f.string('array optional').array().optional(); // Same as above
// Internal fields (output only) - use .internal() method chaining
const internalField = f.string('internal description').internal();
const internalArray = f.string('internal array').array().internal();
// Cached fields (input only) - use .cache() method chaining
// Marks input fields for LLM provider caching when contextCache is enabled
const cachedField = f.string('static context').cache();
const cachedOptional = f.string('optional cached').cache().optional();
const cachedArray = f.string('cached items').cache().array();
// Complex combinations
const complexField = f.string('complex field')
.optional() // Make it optional
.array() // Make it an array
.internal(); // Mark as internal (output only)
// Object descriptions
const objectField = f.object({
field: f.string()
}, 'Description of the object structure');
// Array of objects with distinct descriptions
const objectArray = f.object({
field: f.string()
}, 'Description of the individual item')
.array('Description of the list itself');
```
### ❌ Deprecated Nested Syntax (Removed)
```typescript
// These no longer work and will cause compilation errors
const badArray = f.array(f.string('description')); // ❌ Removed
const badOptional = f.optional(f.string('description')); // ❌ Removed
const badInternal = f.internal(f.string('description')); // ❌ Removed
```
### String vs Fluent API Equivalence
Both approaches create identical signatures:
```typescript
// String syntax
const stringSig = AxSignature.create(`
userMessages:string[] "User messages",
maxTokens?:number "Max tokens",
enableDebug:boolean "Debug mode",
categories?:string[] "Optional categories"
->
responseText:string "Response",
debugInfo!:string "Debug info"
`);
// Equivalent fluent syntax
const fluentSig = f()
.input('userMessages', f.string('User messages').array())
.input('maxTokens', f.number('Max tokens').optional())
.input('enableDebug', f.boolean('Debug mode'))
.input('categories', f.string('Optional categories').optional().array())
.output('responseText', f.string('Response'))
.output('debugInfo', f.string('Debug info').internal())
.build();
// Both create identical runtime structures and TypeScript types
console.log(stringSig.toString() === fluentSig.toString()); // true
```
### Type Inference and Arrays
The fluent API properly maps to TypeScript array types:
```typescript
// These all correctly infer TypeScript types
const sig = f()
.input('strings', f.string('strings').array()) // string[]
.input('numbers', f.number('numbers').array()) // number[]
.input('booleans', f.boolean('booleans').array()) // boolean[]
.input('optionalStrings', f.string('optional').optional().array()) // string[] | undefined
.output('responseText', f.string('response')) // string
.build();
// TypeScript knows the exact types at compile time
type InputType = {
  strings: string[];
  numbers: number[];
  booleans: boolean[];
  optionalStrings?: string[];
};
type OutputType = {
  responseText: string;
};
```
## Cached Input Fields
The `.cache()` method marks input fields for LLM provider caching when using `contextCache` in `AxPromptTemplate`. Cached fields are rendered in a separate message with `cache: true`, enabling prompt caching features (e.g., Anthropic's prompt caching, Google Gemini's context caching).
### Basic Usage
```typescript
import { f, ax, AxPromptTemplate } from '@ax-llm/ax';
// Mark static content for caching
const sig = f()
.input('staticContext', f.string('Context that rarely changes').cache())
.input('userQuery', f.string('Dynamic user query'))
.output('answer', f.string('Response'))
.build();
// Enable context caching in template
const template = new AxPromptTemplate(sig, {
contextCache: { ttlSeconds: 3600 }
});
// When rendered, staticContext appears in a cached message,
// userQuery appears in a regular message
```
### Chaining with Other Modifiers
The `.cache()` method can be chained with `.optional()` and `.array()` in any order:
```typescript
// All of these are equivalent
f.string('context').cache().optional().array()
f.string('context').optional().cache().array()
f.string('context').array().optional().cache()
```
### Key Points
- `.cache()` is only meaningful for **input fields**
- Requires `contextCache` option in `AxPromptTemplate` or `forward()` options to take effect
- Works with `cacheBreakpoint` configuration (`'system'`, `'after-functions'`, `'after-examples'`)
- Cached fields are sorted to appear first in the rendered prompt
- See [AI.md - Context Caching](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AI.md#7-context-caching) for full context caching documentation
## Field Naming Best Practices
Ax enforces descriptive field names to improve LLM understanding:
### ✅ Good Field Names
- `userQuestion`, `customerEmail`, `documentText`
- `analysisResult`, `summaryContent`, `responseMessage`
- `confidenceScore`, `categoryType`, `priorityLevel`
### ❌ Bad Field Names (Will Error)
- `text`, `data`, `input`, `output` (too generic)
- `a`, `x`, `val` (too short)
- `1field`, `123name` (start with a number)
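The rules above can be expressed as a small predicate. This is a hypothetical sketch of the kind of check Ax performs, using the illustrative rules listed here, not Ax's actual validator:

```typescript
// Hypothetical sketch of the naming rules above — illustrative only,
// not Ax's actual validation code.
const GENERIC_NAMES = new Set(["text", "data", "input", "output"]);

function isDescriptiveFieldName(name: string): boolean {
  if (/^\d/.test(name)) return false; // must not start with a number
  if (name.length < 4) return false; // too short ("a", "x", "val")
  if (GENERIC_NAMES.has(name)) return false; // too generic
  return true;
}

console.log(isDescriptiveFieldName("userQuestion")); // true
console.log(isDescriptiveFieldName("data")); // false
```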
## Validation and Constraints (New!)
Ax now supports Zod-like validation constraints for ensuring data quality and format compliance. These constraints work automatically across input validation, output validation, and streaming scenarios.
### Available Constraints
#### String Constraints
```typescript
import { f } from '@ax-llm/ax';
// String length constraints
f.string('username').min(3).max(20)
f.string('password').min(8).max(128)
// Format validation (two equivalent syntaxes available)
f.string('email').email() // Email format (or f.email())
f.string('website').url() // URL format (or f.url())
f.string('birthDate').date() // Date format (or f.date())
f.string('timestamp').datetime() // DateTime format (or f.datetime())
f.string('pattern').regex('^[A-Z0-9]') // Custom regex
// Alternative syntax using dedicated types
f.email('email') // Equivalent to f.string().email()
f.url('website') // Equivalent to f.string().url()
f.date('birthDate') // Equivalent to f.string().date()
f.datetime('timestamp') // Equivalent to f.string().datetime()
// Combinations
f.string('bio').max(500).optional()
f.string('contact').email().optional()
f.url('homepage').optional() // Also works with dedicated types
```
**Note:** For email, url, date, and datetime, you can use either:
- **Validator syntax**: `f.string().email()` - chain validators on strings
- **Dedicated type syntax**: `f.email()` - use dedicated type directly
Both syntaxes work consistently in all contexts (input fields, output fields, nested objects)!
#### Number Constraints
```typescript
// Numeric range constraints
f.number('age').min(18).max(120)
f.number('score').min(0).max(100)
f.number('price').min(0) // No maximum
f.number('rating').max(5) // No minimum
```
### Complete Example
```typescript
import { ax, f } from '@ax-llm/ax';
const userRegistration = f()
.input('formData', f.string('Raw form data'))
.output('user', f.object({
username: f.string('Username').min(3).max(20),
email: f.string('Email address').email(),
age: f.number('User age').min(18).max(120),
password: f.string('Password').min(8).regex('^(?=.*[A-Za-z])(?=.*\\d)'),
bio: f.string('Biography').max(500).optional(),
website: f.string('Personal website').url().optional(),
tags: f.string('Interest tag').min(2).max(30).array()
}, 'User profile information'))
.build();
const generator = ax(userRegistration);
const result = await generator.forward(llm, {
formData: 'Name: John, Email: john@example.com, Age: 25...'
});
// All fields are automatically validated:
// - username: 3-20 characters
// - email: valid email format
// - age: between 18-120
// - password: min 8 chars with letter and number
// - website: valid URL if provided
// - tags: each tag 2-30 characters
```
### Validation Behavior
**Input Validation (Before LLM):**
- Runs automatically before `forward()` calls
- Validates user-provided data against constraints
- Throws `ValidationError` if constraints are violated
- Includes nested object and array validation
**Output Validation (After LLM):**
- Runs automatically during response extraction
- Validates LLM-generated data against constraints
- Triggers automatic retry with correction instructions on validation errors
- Works seamlessly with streaming
**Streaming Validation:**
- Validates each field as it completes during streaming
- Provides incremental validation feedback
- Maintains type safety throughout the stream
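To make the behavior above concrete, here is a minimal sketch of the kind of per-field constraint check that runs on both input and output values. It is illustrative only; the real implementation lives inside Ax:

```typescript
// Illustrative sketch of a string-constraint check — not Ax's internal code.
type StringConstraint = { min?: number; max?: number; regex?: string };

function stringViolations(value: string, c: StringConstraint): string[] {
  const errors: string[] = [];
  if (c.min !== undefined && value.length < c.min) {
    errors.push(`must be at least ${c.min} characters`);
  }
  if (c.max !== undefined && value.length > c.max) {
    errors.push(`must be at most ${c.max} characters`);
  }
  if (c.regex !== undefined && !new RegExp(c.regex).test(value)) {
    errors.push(`must match ${c.regex}`);
  }
  return errors;
}

console.log(stringViolations("ab", { min: 3, max: 20 })); // ["must be at least 3 characters"]
```

On input, violations throw immediately; on output, the violation messages are what gets fed back to the LLM as correction instructions.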
### TypeScript Integration
Validation constraints are fully integrated with TypeScript's type system:
```typescript
const sig = f()
.output('contact', f.object({
email: f.string().email(),
age: f.number().min(0).max(150)
}))
.build();
const gen = ax(sig);
const result = await gen.forward(llm, input);
// TypeScript knows exact types with runtime guarantees
result.contact.email // string (guaranteed valid email format)
result.contact.age // number (guaranteed 0-150 range)
```
### Media Type Restrictions
Media types (`f.image()`, `f.audio()`, `f.file()`) have special restrictions:
```typescript
// ✅ Allowed: Top-level input fields
const validSig = f()
.input('photo', f.image('Profile picture'))
.input('recording', f.audio('Voice message'))
.output('result', f.string('Analysis result'))
.build();
// ❌ Not allowed: Media types in nested objects
const invalidSig = f()
.output('user', f.object({
photo: f.image('Profile') // TypeScript error + runtime error
}))
.build();
// ❌ Not allowed: Media types as outputs
const invalidSig2 = f()
.output('photo', f.image('Generated image')) // Runtime validation error
.build();
```
**Media types are restricted to top-level input fields only** for security and practical reasons. This is enforced at both:
- **Compile time**: TypeScript will show errors for invalid usage
- **Runtime**: Validation throws descriptive errors if restrictions are violated
### Advanced Validation Patterns
```typescript
// Complex nested validation
const productExtractor = f()
.input('productPage', f.string('HTML content'))
.output('product', f.object({
name: f.string('Product name').min(1).max(200),
price: f.number('Price in USD').min(0),
specs: f.object({
dimensions: f.object({
width: f.number('Width in cm').min(0),
height: f.number('Height in cm').min(0)
}),
materials: f.string('Material').min(1).array()
}),
reviews: f.object({
rating: f.number('Rating').min(1).max(5),
comment: f.string('Review text').min(10).max(1000)
}).array()
}))
.build();
// Validation runs recursively through all nested levels
```
### Error Handling
```typescript
try {
const result = await generator.forward(llm, { formData: 'invalid' });
} catch (error) {
if (error instanceof ValidationError) {
// Validation failed - error contains detailed constraint violation info
console.error('Validation failed:', error.message);
// Example: "Field 'email' failed validation: Invalid email format"
}
}
```
Validation errors automatically trigger retries with correction instructions, making the LLM aware of the constraint violations and improving output quality.
## Real-World Examples
### Email Classifier
```typescript
const emailClassifier = ax(`
emailSubject:string "Email subject line",
emailBody:string "Full email content" ->
category:class "sales, support, spam, newsletter" "Email category",
priority:class "urgent, normal, low" "Priority level",
summary:string "Brief summary of the email"
`);
const result = await emailClassifier.forward(llm, {
emailSubject: "Urgent: Server Down",
emailBody: "Our production server is experiencing issues..."
});
console.log(result.category); // "support"
console.log(result.priority); // "urgent"
```
### Document Analyzer with Chain-of-Thought
```typescript
const analyzer = ax(`
documentText:string "Document to analyze" ->
reasoning!:string "Step-by-step analysis",
mainTopics:string[] "Key topics discussed",
sentiment:class "positive, negative, neutral, mixed" "Overall tone",
readability:class "elementary, high-school, college, graduate" "Reading level",
keyInsights:string[] "Important takeaways"
`);
// The reasoning field guides the LLM but isn't returned
const result = await analyzer.forward(llm, {
documentText: "..."
});
// result.reasoning is undefined (internal field)
// result.mainTopics, sentiment, etc. are available
```
### Multi-Modal Analysis
```typescript
const imageAnalyzer = ax(`
imageData:image "Image to analyze",
question?:string "Specific question about the image" ->
description:string "What's in the image",
objects:string[] "Identified objects",
textFound?:string "Any text detected in the image",
answerToQuestion?:string "Answer if question was provided"
`);
```
### Data Extraction
```typescript
const extractor = ax(`
invoiceText:string "Raw invoice text" ->
invoiceNumber:string "Invoice ID",
invoiceDate:date "Date of invoice",
dueDate:date "Payment due date",
totalAmount:number "Total amount due",
lineItems:json[] "Array of {description, quantity, price}",
vendor:json "{ name, address, taxId }"
`);
```
### Pure Fluent API Example
```typescript
import { f, ax } from '@ax-llm/ax';
// Complex signature using pure fluent API
const contentAnalyzer = f()
.input('articleText', f.string('Article content to analyze'))
.input('authorInfo', f.json('Author metadata').optional())
.input('keywords', f.string('Target keywords').array())
.input('checkFactuality', f.boolean('Enable fact-checking'))
.output('mainThemes', f.string('Key themes').array())
.output('sentimentScore', f.number('Sentiment score -1 to 1'))
.output('readabilityLevel', f.class(['elementary', 'middle', 'high', 'college'], 'Reading level'))
.output('factChecks', f.json('Fact checking results').array().optional())
.output('processingTime', f.number('Analysis time in ms').internal())
.description('Comprehensive article analysis with optional fact-checking')
.build();
// Create generator from fluent signature
const generator = ax(contentAnalyzer.toString());
// Usage with typed inputs/outputs
const result = await generator.forward(llm, {
articleText: 'Sample article content...',
keywords: ['AI', 'machine learning', 'technology'],
checkFactuality: true,
// authorInfo is optional
});
// TypeScript knows exact types
console.log(result.mainThemes); // string[]
console.log(result.sentimentScore); // number
console.log(result.readabilityLevel); // 'elementary' | 'middle' | 'high' | 'college'
console.log(result.factChecks); // json[] | undefined (optional)
// result.processingTime is undefined (internal field)
```
## Streaming Support
All signatures support streaming by default:
```typescript
const storyteller = ax(`
prompt:string "Story prompt",
genre:class "fantasy, sci-fi, mystery, romance" ->
title:string "Story title",
story:string "The complete story",
wordCount:number "Approximate word count"
`);
// Stream the response
for await (const chunk of storyteller.streamingForward(llm, {
prompt: "A detective discovers their partner is a time traveler",
genre: "mystery"
})) {
if (chunk.delta.story) {
process.stdout.write(chunk.delta.story); // Real-time streaming
}
}
```
## Type Safety and IntelliSense
Signatures provide full TypeScript type inference:
```typescript
const typed = ax(`
userId:number,
includeDetails?:boolean ->
userName:string,
userEmail:string,
metadata?:json
`);
// TypeScript knows the exact types
const result = await typed.forward(llm, {
userId: 123, // ✅ number required
includeDetails: true // ✅ boolean optional
// userEmail: "..." // ❌ TypeScript error: not an input field
});
console.log(result.userName); // ✅ TypeScript knows this is string
console.log(result.metadata?.x); // ✅ TypeScript knows this is any | undefined
// console.log(result.userId); // ❌ TypeScript error: not an output field
```
## Common Patterns
### 1. Chain of Thought Reasoning
```typescript
// Use internal fields for reasoning steps
const reasoner = ax(`
problem:string ->
thoughts!:string "Internal reasoning process",
answer:string "Final answer"
`);
```
### 2. Structured Data Extraction
```typescript
// Extract structured data from unstructured text
const parser = ax(`
messyData:string ->
structured:json "Clean JSON representation"
`);
```
### 3. Multi-Step Classification
```typescript
// Hierarchical classification
const classifier = ax(`
text:string ->
mainCategory:class "technical, business, creative",
  subCategory:string "Subcategory based on the main category",
confidence:number "0-1 confidence score"
`);
```
### 4. Validation and Checking
```typescript
// Validate and explain
const validator = ax(`
code:code "Code to review",
language:string "Programming language" ->
isValid:boolean "Is the code syntactically correct",
errors?:string[] "List of errors if any",
suggestions?:string[] "Improvement suggestions"
`);
```
## Error Handling
Signatures provide clear, actionable error messages:
```typescript
// ❌ This will throw a descriptive error
try {
const bad = ax('text:string -> result:string');
} catch (error) {
// Error: Field name "text" is too generic.
// Use a more descriptive name like "inputText" or "documentText"
}
// ❌ Invalid type
try {
const bad = ax('userInput:str -> result:string');
} catch (error) {
// Error: Unknown type "str". Did you mean "string"?
}
// ❌ Duplicate field names
try {
const bad = ax('data:string, data:number -> result:string');
} catch (error) {
// Error: Duplicate field name "data" in inputs
}
```
## Migration from Traditional Prompts
### Before (Prompt Engineering)
```typescript
const prompt = `
Analyze the sentiment of the following review.
Rate it on a scale of 1-5.
Identify the main topics discussed.
Format your response as JSON with keys: rating, sentiment, topics
Review: ${review}
`;
const response = await llm.generate(prompt);
const parsed = JSON.parse(response); // Hope it's valid JSON...
```
### After (Signatures)
```typescript
const analyzer = ax(`
review:string ->
rating:number "1-5 rating",
sentiment:class "very positive, positive, neutral, negative, very negative",
topics:string[] "Main topics discussed"
`);
const result = await analyzer.forward(llm, { review });
// Guaranteed structure, no parsing needed!
```
## Performance Tips
1. **Use specific types**: `class` is more token-efficient than `string` for enums
2. **Leverage arrays**: Process multiple items in one call
3. **Optional fields**: Only request what you need
4. **Internal fields**: Use `!` for reasoning without returning it
## Conclusion
DSPy signatures in Ax transform LLM interactions from fragile prompt engineering to robust, type-safe programming. By describing what you want instead of how to get it, you can:
- Write more maintainable code
- Get guaranteed structured outputs
- Leverage TypeScript's type system
- Switch LLM providers without changing logic
- Build production-ready AI features faster
Start using signatures today and experience the difference!
================================================================================
# AI Providers
# Source: AI.md
# Complete guide to all supported AI providers and their features
## Getting Started with Ax AI Providers and Models
This guide helps beginners get productive with Ax quickly: pick a provider,
choose a model, and send a request. You’ll also learn how to define model
presets and common options.
### 1. Install and set up
```bash
npm i @ax-llm/ax
```
Set your API keys as environment variables:
- `OPENAI_APIKEY`
- `ANTHROPIC_APIKEY`
- `GOOGLE_APIKEY` (or Google Vertex setup)
### 2. Create an AI instance
Use the `ai()` factory with a provider name and your API key.
```ts
import { ai, AxAIGoogleGeminiModel } from "@ax-llm/ax";
const llm = ai({
name: "google-gemini",
apiKey: process.env.GOOGLE_APIKEY!,
config: {
model: AxAIGoogleGeminiModel.Gemini20Flash,
},
});
```
Supported providers include: `openai`, `anthropic`, `google-gemini`, `mistral`,
`groq`, `cohere`, `together`, `deepseek`, `ollama`, `huggingface`, `openrouter`,
`azure-openai`, `reka`, `x-grok`.
### 3. Choose models using presets (recommended)
Define a `models` list with user-friendly keys. Each item describes a preset and
can include provider-specific settings. When you use a key in `model`, Ax maps
it to the right backend model and merges the preset config.
```ts
import { ai, AxAIGoogleGeminiModel } from "@ax-llm/ax";
const gemini = ai({
name: "google-gemini",
apiKey: process.env.GOOGLE_APIKEY!,
config: { model: "simple" },
models: [
{
key: "tiny",
model: AxAIGoogleGeminiModel.Gemini20FlashLite,
description: "Fast + cheap",
// Provider config merged automatically
config: { maxTokens: 1024, temperature: 0.3 },
},
{
key: "simple",
model: AxAIGoogleGeminiModel.Gemini20Flash,
description: "Balanced general-purpose",
config: { temperature: 0.6 },
},
],
});
// Use a preset by key
await gemini.chat({
model: "tiny",
chatPrompt: [{ role: "user", content: "Summarize this:" }],
});
```
What gets merged when you pick a key:
- Model mapping: preset `model` replaces the key
- Tuning: `maxTokens`, `temperature`, `topP`, `topK`, penalties,
`stopSequences`, `n`, `stream`
- Provider extras (Gemini): `config.thinking.thinkingTokenBudget` is mapped to
Ax’s levels automatically; `includeThoughts` maps to `showThoughts`
You can still override per-request:
```ts
await gemini.chat(
{ model: "simple", chatPrompt: [{ role: "user", content: "Hi" }] },
{ stream: false, thinkingTokenBudget: "medium" },
);
```
### 4. Send your first chat
```ts
const res = await gemini.chat({
chatPrompt: [
{ role: "system", content: "You are concise." },
{ role: "user", content: "Write a haiku about the ocean." },
],
});
console.log(res.results[0]?.content);
```
### 5. Common options
- `stream` (boolean): enable server-sent events; `true` by default if supported
- `thinkingTokenBudget` (Gemini/Claude-like):
`'minimal' | 'low' | 'medium' | 'high' | 'highest' | 'none'`
- `showThoughts` (if model supports): include thoughts in output
- `functionCallMode`: `'auto' | 'native' | 'prompt'`
- `debug`, `logger`, `tracer`, `rateLimiter`, `timeout`
Example with overrides:
```ts
await gemini.chat(
{ chatPrompt: [{ role: "user", content: "Plan a weekend trip" }] },
{ stream: false, thinkingTokenBudget: "high", showThoughts: true },
);
```
### 6. Extended Thinking
Extended thinking allows models to reason internally before responding, improving
quality on complex tasks. Ax provides a unified `thinkingTokenBudget` interface
that works across providers (Anthropic, Google Gemini) while handling
provider-specific details automatically.
#### Usage
Pass `thinkingTokenBudget` and optionally `showThoughts` when making requests:
```ts
import { ai, AxAIAnthropicModel } from "@ax-llm/ax";
// Anthropic
const claude = ai({
name: "anthropic",
apiKey: process.env.ANTHROPIC_APIKEY!,
config: { model: AxAIAnthropicModel.Claude46Opus },
});
const res = await claude.chat(
{ chatPrompt: [{ role: "user", content: "Solve this step by step..." }] },
{ thinkingTokenBudget: "medium", showThoughts: true },
);
console.log(res.results[0]?.thought); // The model's internal reasoning
console.log(res.results[0]?.content); // The final answer
```
```ts
import { ai, AxAIGoogleGeminiModel } from "@ax-llm/ax";
// Google Gemini
const gemini = ai({
name: "google-gemini",
apiKey: process.env.GOOGLE_APIKEY!,
config: { model: AxAIGoogleGeminiModel.Gemini25Pro },
});
const res = await gemini.chat(
{ chatPrompt: [{ role: "user", content: "Analyze this complex problem..." }] },
{ thinkingTokenBudget: "high", showThoughts: true },
);
```
#### Budget levels
The string levels map to provider-specific token budgets:
| Level | Anthropic (tokens) | Gemini (tokens) |
| --- | --- | --- |
| `'none'` | disabled | minimal (Gemini 3+ can't fully disable) |
| `'minimal'` | 1,024 | 200 |
| `'low'` | 5,000 | 800 |
| `'medium'` | 10,000 | 5,000 |
| `'high'` | 20,000 | 10,000 |
| `'highest'` | 32,000 | 24,500 |
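The table above can be read as a simple lookup. The following sketch mirrors those default values (illustrative; Ax resolves these internally, and they can be overridden via `thinkingTokenBudgetLevels`):

```typescript
// Default token budgets from the table above, as a lookup (illustrative sketch).
type BudgetLevel = "minimal" | "low" | "medium" | "high" | "highest";

const defaultBudgets: Record<"anthropic" | "gemini", Record<BudgetLevel, number>> = {
  anthropic: { minimal: 1024, low: 5000, medium: 10000, high: 20000, highest: 32000 },
  gemini: { minimal: 200, low: 800, medium: 5000, high: 10000, highest: 24500 },
};

console.log(defaultBudgets.anthropic.medium); // 10000
console.log(defaultBudgets.gemini.highest); // 24500
```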
#### Anthropic model-specific behavior
Ax automatically selects the right wire format based on the Anthropic model:
- **Opus 4.6**: Uses adaptive thinking (`type: 'adaptive'`) + effort levels. No
explicit token budget is sent — the model decides how much to think. The
`thinkingTokenBudget` level controls the effort parameter instead.
- **Opus 4.5**: Uses explicit budget (`budget_tokens`) + effort levels. Effort
is capped at `'high'` (the `'max'` effort level is not supported).
- **Other thinking models** (Claude 3.7 Sonnet, Claude 4 Sonnet, etc.): Uses
budget tokens only, no effort parameter.
#### Effort levels
For Opus 4.5+ models, Ax automatically maps your `thinkingTokenBudget` level to
an Anthropic effort level (`low` / `medium` / `high` / `max`). You don't need
to set effort manually. The default mapping is:
| Budget level | Effort |
| --- | --- |
| `'minimal'` | `low` |
| `'low'` | `low` |
| `'medium'` | `medium` |
| `'high'` | `high` |
| `'highest'` | `max` |
You can customize this via the `effortLevelMapping` config (see below).
#### Customization
Override the default token budgets or effort mapping in your provider config:
```ts
const claude = ai({
name: "anthropic",
apiKey: process.env.ANTHROPIC_APIKEY!,
config: {
model: AxAIAnthropicModel.Claude46Opus,
thinkingTokenBudgetLevels: {
minimal: 2048,
low: 8000,
medium: 16000,
high: 25000,
highest: 40000,
},
effortLevelMapping: {
minimal: "low",
low: "medium",
medium: "high",
high: "high",
highest: "max",
},
},
});
```
#### Constraints
When thinking is enabled on Anthropic, the API restricts certain parameters:
- `temperature` is ignored (cannot be set)
- `topK` is ignored (cannot be set)
- `topP` is only sent if its value is >= 0.95
These restrictions are handled automatically — Ax omits the restricted
parameters from the request when thinking is active.
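The filtering described above amounts to the following logic. This is an illustrative sketch of the behavior, not Ax's actual request-building code:

```typescript
// Illustrative sketch: sampling parameters dropped when Anthropic thinking is on.
type SamplingParams = { temperature?: number; topK?: number; topP?: number };

function filterForThinking(
  params: SamplingParams,
  thinkingEnabled: boolean,
): SamplingParams {
  if (!thinkingEnabled) return params;
  // temperature and topK are always dropped; topP survives only when >= 0.95
  return params.topP !== undefined && params.topP >= 0.95
    ? { topP: params.topP }
    : {};
}

console.log(filterForThinking({ temperature: 0.7, topK: 40, topP: 0.99 }, true)); // { topP: 0.99 }
```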
### 7. Embeddings (if supported)
```ts
const { embeddings } = await gemini.embed({
texts: ['hello', 'world'],
embedModel: 'text-embedding-005',
})
```
### 8. Context Caching
Context caching reduces costs and latency by caching large prompt prefixes
(system prompts, function definitions, examples) for reuse across multiple
requests. This is especially valuable for multi-turn agentic flows.
#### Enabling Context Caching
Pass the `contextCache` option to `forward()` to enable caching:
```ts
import { ai, ax, AxMemory } from "@ax-llm/ax";
const llm = ai({
name: "google-gemini",
apiKey: process.env.GOOGLE_APIKEY!,
});
const codeReviewer = ax(
`code:string, language:string -> review:string, suggestions:string[]`,
{ description: "You are an expert code reviewer..." } // Large system prompt
);
const mem = new AxMemory();
// Enable context caching
const result = await codeReviewer.forward(llm, { code, language }, {
mem,
sessionId: "code-review-session",
contextCache: {
ttlSeconds: 3600, // Cache TTL (1 hour)
},
});
```
#### How It Works
**Google Gemini (Explicit Caching)**:
- Creates a separate cache resource with an ID
- Cache persists across requests using the same `sessionId` + content hash
- Automatic TTL refresh when cache is near expiration
- Provides up to 90% cost reduction on cached tokens
- Minimum 2048 tokens required for caching
**Anthropic (Implicit Caching)**:
- Uses `cache_control` markers in the request
- System prompts are automatically cached
- Function definitions and results are marked for caching
- No explicit cache management needed
- Provides up to 90% cost reduction on cached tokens
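The `sessionId` + content-hash scheme mentioned for Gemini above can be sketched as follows. This is an assumed key scheme for illustration, not Ax's internal implementation:

```typescript
// Hypothetical sketch of a sessionId + content-hash cache key (assumed scheme).
import { createHash } from "node:crypto";

function cacheKey(sessionId: string, cachedPrefix: string): string {
  const hash = createHash("sha256").update(cachedPrefix).digest("hex").slice(0, 16);
  return `${sessionId}:${hash}`;
}

// Same session + same prefix → same key, so the cache resource is reused;
// a changed prefix yields a new key and a new cache entry.
```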
#### Configuration Options
```ts
type AxContextCacheOptions = {
// Explicit cache name (bypasses auto-creation)
name?: string;
// TTL in seconds (default: 3600)
ttlSeconds?: number;
// Minimum tokens to create cache (default: 2048)
minTokens?: number;
// Window before expiration to trigger refresh (default: 300)
refreshWindowSeconds?: number;
// External registry for serverless environments
registry?: AxContextCacheRegistry;
// Controls where the cache breakpoint is set in the prompt prefix
// Prefix order: System → Functions → Examples → User Input
// - 'after-examples' (default): Cache includes system + functions + examples
// - 'after-functions': Cache system + functions only (use when examples are dynamic)
// - 'system': Cache only system prompt (use when functions are dynamic)
cacheBreakpoint?: 'system' | 'after-functions' | 'after-examples';
};
```
#### Dynamic Examples (Excluding from Cache)
When examples are dynamic (e.g., retrieved per-request from a vector database),
use `cacheBreakpoint: 'after-functions'` to exclude them from caching:
```ts
const result = await gen.forward(llm, input, {
contextCache: {
ttlSeconds: 3600,
cacheBreakpoint: 'after-functions', // Cache system + functions, but not examples
},
});
```
Similarly, if both examples and functions are dynamic, use `cacheBreakpoint: 'system'`
to cache only the system prompt.
#### Multi-Turn Function Calling with Caching
When using functions/tools, caching is automatically applied:
```ts
import { ai, ax, f, fn } from "@ax-llm/ax";
const tools = [
fn("calculate")
.description("Evaluate a math expression")
.arg("expression", f.string("Math expression"))
.returns(f.number("Calculated value"))
    .handler(({ expression }) => eval(expression)) // demo only; avoid eval in production
.build(),
];
const agent = ax("question:string -> answer:string", {
description: "You are a helpful assistant...",
functions: tools,
});
const llm = ai({ name: "google-gemini", apiKey: process.env.GOOGLE_APIKEY! });
// Tools and function results are automatically cached
const result = await agent.forward(llm, { question: "What is 2^10?" }, {
contextCache: { ttlSeconds: 3600 },
});
```
#### External Cache Registry (Serverless)
For serverless environments where in-memory state is lost, use an external
registry:
```ts
// Redis-backed registry example
const registry: AxContextCacheRegistry = {
get: async (key) => {
const data = await redis.get(`cache:${key}`);
return data ? JSON.parse(data) : undefined;
},
set: async (key, entry) => {
await redis.set(`cache:${key}`, JSON.stringify(entry), "EX", 3600);
},
};
const result = await gen.forward(llm, input, {
sessionId: "my-session",
contextCache: {
ttlSeconds: 3600,
registry,
},
});
```
#### Supported Models
**Gemini (Explicit Caching)**:
- Gemini 3 Flash/Pro
- Gemini 2.5 Pro/Flash/Flash-Lite
- Gemini 2.0 Flash/Flash-Lite
**Anthropic (Implicit Caching)**:
- All Claude models support implicit caching
### 9. Tips
- Prefer presets: gives friendly names and consistent tuning across your app
- Start with fast/cheap models for iteration; switch keys later without code changes
- Use `stream: false` in tests for simpler assertions
- In the browser, set `corsProxy` if needed
For more examples, see the examples directory and provider-specific docs.
---
## AWS Bedrock Provider
The `@ax-llm/ax-ai-aws-bedrock` package provides production-ready AWS Bedrock integration supporting Claude, GPT OSS, and Titan Embed models.
### Installation
```bash
npm install @ax-llm/ax @ax-llm/ax-ai-aws-bedrock
```
### Quick Start
```typescript
import { AxAIBedrock, AxAIBedrockModel } from "@ax-llm/ax-ai-aws-bedrock";
import { ax } from "@ax-llm/ax";
const ai = new AxAIBedrock({
region: "us-east-2",
config: { model: AxAIBedrockModel.ClaudeSonnet4 },
});
const generator = ax("question:string -> answer:string");
const result = await generator.forward(ai, {
question: "What is AWS Bedrock?",
});
console.log(result.answer);
```
### Configuration
```typescript
const ai = new AxAIBedrock({
region: "us-east-2", // Primary AWS region
fallbackRegions: ["us-west-2", "us-east-1"], // Fallback regions for Claude
gptRegion: "us-west-2", // Primary region for GPT models
gptFallbackRegions: ["us-east-1"], // Fallback regions for GPT
config: {
model: AxAIBedrockModel.ClaudeSonnet4,
maxTokens: 4096,
temperature: 0.7,
topP: 0.9,
},
});
```
### Supported Models
**Claude Models:**
- `AxAIBedrockModel.ClaudeSonnet4` - Claude Sonnet 4
- `AxAIBedrockModel.ClaudeOpus4` - Claude Opus 4
- `AxAIBedrockModel.Claude35Sonnet` - Claude 3.5 Sonnet
- `AxAIBedrockModel.Claude35Haiku` - Claude 3.5 Haiku
- `AxAIBedrockModel.Claude3Opus` - Claude 3 Opus
**GPT Models:**
- `AxAIBedrockModel.Gpt41106` - GPT-4 1106 Preview
- `AxAIBedrockModel.Gpt4Mini` - GPT-4o Mini
**Embedding Models:**
- `AxAIBedrockEmbedModel.TitanEmbedV2` - Titan Embed V2
### Regional Failover
The provider automatically handles regional failover for high availability. If the primary region fails, it retries with fallback regions.
### AWS Authentication
Uses AWS SDK's default credential chain:
- Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
- AWS credentials file (`~/.aws/credentials`)
- IAM roles (EC2/Lambda)
---
## Vercel AI SDK Integration
The `@ax-llm/ax-ai-sdk-provider` package provides seamless integration with the Vercel AI SDK v5.
### Installation
```bash
npm install @ax-llm/ax @ax-llm/ax-ai-sdk-provider ai
```
### Basic Usage
```typescript
import { ai } from "@ax-llm/ax";
import { AxAIProvider } from "@ax-llm/ax-ai-sdk-provider";
import { generateText, streamText } from "ai";
// Create Ax AI instance
const axAI = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
});
// Create AI SDK v5 compatible provider
const model = new AxAIProvider(axAI);
// Use with AI SDK functions
const result = await generateText({
model,
messages: [{ role: "user", content: "Hello!" }],
});
console.log(result.text);
```
### Streaming with React Server Components
```typescript
import { ai } from "@ax-llm/ax";
import { AxAIProvider } from "@ax-llm/ax-ai-sdk-provider";
import { streamUI } from "ai/rsc";
const axAI = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
});
const model = new AxAIProvider(axAI);
const result = await streamUI({
model,
messages: [{ role: "user", content: "Tell me a story" }],
text: ({ content }) => <div>{content}</div>,
});
```
### Agent Provider
Use Ax agents with the AI SDK:
```typescript
import { ai, agent } from "@ax-llm/ax";
import { AxAgentProvider } from "@ax-llm/ax-ai-sdk-provider";
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const myAgent = agent("userInput:string -> response:string", {
name: "helper",
description: "A helpful assistant",
ai: llm,
});
const agentProvider = new AxAgentProvider({
agent: myAgent,
updateState: (msgs) => {
/* handle state updates */
},
generate: (result) => <div>{result.response}</div>,
});
```
### Features
- AI SDK v5 `LanguageModelV2` compatible
- Full tool/function calling support
- Streaming with lifecycle events
- Multi-modal inputs (text, images, files)
- Full TypeScript support
---
## Ax Tools Package
The `@ax-llm/ax-tools` package provides additional tools for Ax including MCP (Model Context Protocol) support and a JavaScript runtime.
### Installation
```bash
npm install @ax-llm/ax @ax-llm/ax-tools
```
### MCP Stdio Transport
Connect to MCP servers via stdio:
```typescript
import { AxMCPClient } from "@ax-llm/ax";
import { axCreateMCPStdioTransport } from "@ax-llm/ax-tools";
// Create transport for an MCP server
const transport = axCreateMCPStdioTransport({
command: "npx",
args: ["-y", "@anthropic/mcp-server-filesystem"],
env: { HOME: process.env.HOME },
});
// Use with AxMCPClient
const client = new AxMCPClient(transport);
await client.init();
const tools = await client.getTools();
console.log("Available tools:", tools.map((t) => t.name));
```
### AxJSRuntime
A sandboxed JavaScript runtime that can be used as a function tool.
`AxJSRuntime` works across Node.js/Bun-style backends, Deno, and browsers.
```typescript
import { ai, ax } from "@ax-llm/ax";
import {
AxJSRuntime,
AxJSRuntimePermission,
} from "@ax-llm/ax-tools";
// Create runtime with specific permissions
const runtime = new AxJSRuntime({
permissions: [
AxJSRuntimePermission.NETWORK,
AxJSRuntimePermission.TIMING,
],
});
// Use as a function tool
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const codeRunner = ax("task:string -> result:string", {
functions: [runtime.toFunction()],
});
const result = await codeRunner.forward(llm, {
task: "Calculate the factorial of 10",
});
```
### Permissions
Control what the runtime can access:
| Permission | Description |
| ---------- | ----------- |
| `NETWORK` | Network APIs (`fetch`, `WebSocket`, etc.) |
| `STORAGE` | Client storage APIs (`indexedDB`, `caches`) |
| `CODE_LOADING` | Dynamic script loading (`importScripts`) |
| `COMMUNICATION` | Cross-context messaging (`BroadcastChannel`) |
| `TIMING` | High-resolution timing (`performance`) |
| `WORKERS` | Sub-worker spawning (`Worker`, `SharedWorker`) |
```typescript
import { AxJSRuntimePermission } from "@ax-llm/ax-tools";
const runtime = new AxJSRuntime({
permissions: [
AxJSRuntimePermission.NETWORK,
AxJSRuntimePermission.STORAGE,
],
});
```
================================================================================
# Optimization Guide
# Source: OPTIMIZE.md
# LLM Optimization Made Simple: A Beginner's Guide to Ax
**Goal**: Learn how to make your AI programs smarter, faster, and cheaper
through automatic optimization. **Time to first results**: 5 minutes
## 📋 Table of Contents
- [What is LLM Optimization?](#what-is-llm-optimization)
- [🚀 5-Minute Quick Start](#-5-minute-quick-start) ← **Start here!**
- [Step 6: Save Your Optimization Results 💾](#step-6-save-your-optimization-results-)
- [Step 7: Load and Use in Production 🚀](#step-7-load-and-use-in-production-)
- [📚 Understanding the Basics](#-understanding-the-basics)
- [🎯 Common Use Cases](#-common-use-cases-copy--paste-ready)
- [💰 Saving Money: Teacher-Student Setup](#-saving-money-teacher-student-setup)
- [🔧 Making It Better: Practical Tips](#-making-it-better-practical-tips)
- [🛠️ Troubleshooting Guide](#️-troubleshooting-guide)
- [🎓 Next Steps: Level Up Your Skills](#-next-steps-level-up-your-skills)
- [📖 Complete Working Example](#-complete-working-example)
- [🎯 Key Takeaways](#-key-takeaways)
## 📚 Detailed Optimizer Guides
For in-depth documentation on specific optimizers, see:
- **[MiPRO](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/MIPRO.md)** - Multi-Prompt Optimization (recommended for most use cases)
- **[GEPA](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/GEPA.md)** - Multi-objective optimization with Pareto frontiers
- **[ACE](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/ACE.md)** - Agentic Context Engineering for structured, evolving playbooks
---
## What is LLM Optimization?
Think of optimization like having a writing tutor for your AI. Instead of
manually tweaking prompts and examples, Ax automatically:
- **Writes better prompts** for your AI programs
- **Picks the best examples** to show your AI what you want
- **Saves you money** by making cheaper models work as well as expensive ones
- **Improves accuracy** without you having to be a prompt engineering expert
**Real example**: A sentiment analysis that goes from 70% accuracy to 90%
accuracy automatically, while reducing costs by 80%.
---
## 🚀 5-Minute Quick Start
### Step 1: Install and Setup
```bash
npm install @ax-llm/ax
```
```typescript
// Create a .env file with your OpenAI API key
// OPENAI_APIKEY=your_key_here
import { ai, ax, AxMiPRO, AxAIOpenAIModel } from "@ax-llm/ax";
```
**Important**: Ax optimizers depend on a Python optimization service (Optuna).
For MiPRO v2 and production-scale optimization, you must start the Python
service before running any optimization. See "Python Optimization Service
Integration" below. Quick start:
```bash
cd src/optimizer
uv sync
uv run ax-optimizer server start --debug
```
### Step 2: Create Your First Optimizable Program
```typescript
// This is a simple sentiment analyzer - we'll make it smarter!
const sentimentAnalyzer = ax(
'reviewText:string "Customer review" -> sentiment:class "positive, negative, neutral" "How the customer feels"',
);
// Set up your AI
const llm = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini }, // Start with the cheaper model
});
```
### Step 3: Provide Training Examples
```typescript
// Just 3-5 examples are enough to start!
const examples = [
{ reviewText: "I love this product!", sentiment: "positive" },
{ reviewText: "This is terrible quality", sentiment: "negative" },
{ reviewText: "It works fine, nothing special", sentiment: "neutral" },
{ reviewText: "Best purchase ever!", sentiment: "positive" },
{ reviewText: "Waste of money", sentiment: "negative" },
];
```
### Step 4: Define Success (Your Metric)
```typescript
// This tells the optimizer what "good" looks like
const metric = ({ prediction, example }) => {
// Simple: 1 point for correct answer, 0 for wrong
return prediction.sentiment === example.sentiment ? 1 : 0;
};
```
### Step 5: Run the Magic ✨
```typescript
// Create the optimizer
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
options: { verbose: true }, // Show progress
});
// Let it optimize (takes 1-2 minutes)
console.log("🔄 Optimizing your AI program...");
const result = await optimizer.compile(sentimentAnalyzer, examples, metric);
// Apply the improvements
if (result.demos) {
sentimentAnalyzer.setDemos(result.demos);
}
console.log(
`✅ Done! Improved from baseline to ${(result.bestScore * 100).toFixed(1)}% accuracy`,
);
```
### Step 6: Save Your Optimization Results 💾
**This is crucial for production!** The new unified `AxOptimizedProgram`
contains everything needed to reproduce your optimization:
```typescript
import { promises as fs } from "fs";
// Apply the optimized configuration using the unified approach
if (result.optimizedProgram) {
// Apply all optimizations in one clean call
sentimentAnalyzer.applyOptimization(result.optimizedProgram);
console.log(`✨ Applied optimized configuration:`);
console.log(` Score: ${result.optimizedProgram.bestScore.toFixed(3)}`);
console.log(` Optimizer: ${result.optimizedProgram.optimizerType}`);
console.log(
` Converged: ${result.optimizedProgram.converged ? "✅" : "❌"}`,
);
// Save the complete optimization result
await fs.writeFile(
"sentiment-analyzer-optimization.json",
JSON.stringify(
{
version: "2.0",
bestScore: result.optimizedProgram.bestScore,
instruction: result.optimizedProgram.instruction,
demos: result.optimizedProgram.demos,
modelConfig: result.optimizedProgram.modelConfig,
optimizerType: result.optimizedProgram.optimizerType,
optimizationTime: result.optimizedProgram.optimizationTime,
totalRounds: result.optimizedProgram.totalRounds,
converged: result.optimizedProgram.converged,
stats: result.optimizedProgram.stats,
timestamp: new Date().toISOString(),
},
null,
2,
),
);
console.log(
"✅ Complete optimization saved to sentiment-analyzer-optimization.json",
);
}
// What you just saved:
console.log("Saved data contains:");
console.log("- Optimized few-shot examples (demos)");
console.log("- Optimized instruction prompts");
console.log("- Model configuration (temperature, etc.)");
console.log("- Complete performance metrics");
console.log("- Optimization metadata and timing");
console.log(`- Performance score: ${result.bestScore}`);
```
### Step 7: Load and Use in Production 🚀
In your production code, recreate and apply the saved optimization:
```typescript
import { promises as fs } from "fs";
import { ax, AxOptimizedProgramImpl } from "@ax-llm/ax";
// Production app - load pre-optimized configuration
const sentimentAnalyzer = ax(
'reviewText:string "Customer review" -> sentiment:class "positive, negative, neutral" "How the customer feels"',
);
// Load the saved optimization results
const savedData = JSON.parse(
await fs.readFile("sentiment-analyzer-optimization.json", "utf8"),
);
// Recreate the optimized program
const optimizedProgram = new AxOptimizedProgramImpl({
bestScore: savedData.bestScore,
stats: savedData.stats,
instruction: savedData.instruction,
demos: savedData.demos,
modelConfig: savedData.modelConfig,
optimizerType: savedData.optimizerType,
optimizationTime: savedData.optimizationTime,
totalRounds: savedData.totalRounds,
converged: savedData.converged,
});
// Apply the complete optimization (demos, instruction, model config, etc.)
sentimentAnalyzer.applyOptimization(optimizedProgram);
console.log(`🚀 Loaded optimization v${savedData.version}`);
console.log(` Score: ${optimizedProgram.bestScore.toFixed(3)}`);
console.log(` Optimizer: ${optimizedProgram.optimizerType}`);
// Now your AI performs at the optimized level
const analysis = await sentimentAnalyzer.forward(llm, {
reviewText: "The product arrived quickly but the quality was disappointing",
});
console.log("Analysis:", analysis.sentiment); // Much more accurate!
```
### Step 8: Understanding What You Get 📊
The new unified optimization result provides comprehensive information in one
object:
```typescript
const result = await optimizer.compile(sentimentAnalyzer, examples, metric);
// New unified approach - everything in one place:
if (result.optimizedProgram) {
console.log({
// Performance metrics
bestScore: result.optimizedProgram.bestScore, // Best performance (0-1)
converged: result.optimizedProgram.converged, // Did optimization converge?
totalRounds: result.optimizedProgram.totalRounds, // Number of optimization rounds
optimizationTime: result.optimizedProgram.optimizationTime, // Time taken (ms)
// Program configuration
instruction: result.optimizedProgram.instruction, // Optimized prompt
demos: result.optimizedProgram.demos?.length, // Number of few-shot examples
modelConfig: result.optimizedProgram.modelConfig, // Model settings (temperature, etc.)
// Optimization metadata
optimizerType: result.optimizedProgram.optimizerType, // Which optimizer was used
stats: result.optimizedProgram.stats, // Detailed statistics
});
}
// The unified result contains everything:
// - Optimized few-shot examples (demos)
// - Optimized instruction text
// - Model configuration (temperature, maxTokens, etc.)
// - Complete performance statistics
// - Optimization metadata (type, time, convergence)
// - Everything needed to reproduce the performance
```
### Step 9: Production Best Practices 📁
**File Organization:**
```
your-app/
├── optimizations/
│ ├── sentiment-analyzer-v2.0.json ← Complete optimization (new format)
│ ├── email-classifier-v2.0.json ← Different task
│ └── product-reviewer-v2.0.json ← Another task
├── legacy-optimizations/
│ ├── sentiment-analyzer-demos.json ← Legacy demos (v1.0 format)
│ └── email-classifier-demos.json ← Old format
├── src/
│ ├── train-models.ts ← Training script
│ └── production-app.ts ← Production app
```
**Environment-specific Loading:**
```typescript
import { promises as fs } from "fs";
import { AxOptimizedProgramImpl } from "@ax-llm/ax";
// Load different optimizations for different environments
const optimizationFile = process.env.NODE_ENV === "production"
? "optimizations/sentiment-analyzer-prod-v2.0.json"
: "optimizations/sentiment-analyzer-dev-v2.0.json";
const savedData = JSON.parse(await fs.readFile(optimizationFile, "utf8"));
// Handle both new unified format and legacy format
if (savedData.version === "2.0") {
// New unified format
const optimizedProgram = new AxOptimizedProgramImpl(savedData);
sentimentAnalyzer.applyOptimization(optimizedProgram);
console.log(`🚀 Loaded unified optimization v${savedData.version}`);
} else {
// Legacy format (backward compatibility)
sentimentAnalyzer.setDemos(savedData.demos || savedData);
console.log("⚠️ Loaded legacy demo format - consider upgrading");
}
```
**Version Your Optimizations:**
```typescript
// The new format includes comprehensive versioning by default
const optimizationData = {
version: "2.0", // Format version
modelVersion: "1.3.0", // Your model version
created: new Date().toISOString(),
bestScore: result.optimizedProgram.bestScore,
instruction: result.optimizedProgram.instruction,
demos: result.optimizedProgram.demos,
modelConfig: result.optimizedProgram.modelConfig,
optimizerType: result.optimizedProgram.optimizerType,
optimizationTime: result.optimizedProgram.optimizationTime,
totalRounds: result.optimizedProgram.totalRounds,
converged: result.optimizedProgram.converged,
stats: result.optimizedProgram.stats,
environment: process.env.NODE_ENV || "development",
modelUsed: AxAIOpenAIModel.GPT4OMini,
trainingDataSize: examples.length,
};
await fs.writeFile(
"sentiment-analyzer-v1.3.0.json",
JSON.stringify(optimizationData, null, 2),
);
```
**🎉 Congratulations!** You now understand the complete unified optimization
workflow:
1. **Train** with examples and metrics
2. **Apply** optimization using
`program.applyOptimization(result.optimizedProgram)`
3. **Save** the complete optimization configuration (demos + instruction + model
config)
4. **Load** and recreate optimization in production using
`AxOptimizedProgramImpl`
5. **Version** and manage your optimizations with comprehensive metadata
---
## 📚 Understanding the Basics
### What Just Happened?
1. **The Optimizer** tried different ways to ask your AI the question
2. **It tested** each approach using your examples
3. **It kept** the best-performing version
4. **Your program** now uses the optimized prompt and examples
### Key Terms (Simple Explanations)
- **Student AI**: The model you want to optimize (usually cheaper/faster)
- **Teacher AI**: Optional expensive model that helps create better instructions
- **Examples**: Your training data showing correct answers
- **Metric**: How you measure if the AI is doing well
- **Demos**: The best examples the optimizer found to show your AI
### What Does Optimization Actually Produce? 🎯
**The main output is DEMOS** - these are not just "demo data" but **optimized
few-shot examples** that dramatically improve your AI's performance:
```typescript
// What demos contain:
{
"traces": [
{
"reviewText": "I love this product!", // Input that works well
"sentiment": "positive" // Expected output
},
{
"reviewText": "This is terrible quality", // Another good example
"sentiment": "negative" // Expected output
}
],
"instruction": "Analyze customer sentiment..." // Optimized prompt (MiPRO)
}
```
**Why demos are powerful:**
- ✅ **Portable**: Save as JSON, load anywhere
- ✅ **Fast**: No re-optimization in production
- ✅ **Effective**: Often 2-5x performance improvement
- ✅ **Cost-effective**: Reduce API calls by using cheaper models better
**The workflow:**
1. **Training**: `optimizer.compile()` → produces `result.demos`
2. **Save**: `JSON.stringify(result.demos)` → save to file/database
3. **Production**: Load demos → `program.setDemos(demos)` → improved performance
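Since demos are plain JSON, the save/load half of that workflow needs no Ax APIs at all. A minimal round-trip sketch (the file path and demo values are illustrative; `setDemos` on load is the step shown above):

```typescript
import { promises as fs } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Demos are plain JSON: persistence is just stringify/parse.
const demos = [
  { reviewText: "I love this product!", sentiment: "positive" },
  { reviewText: "This is terrible quality", sentiment: "negative" },
];

const file = join(tmpdir(), "demos.json");
await fs.writeFile(file, JSON.stringify(demos, null, 2));

const loaded = JSON.parse(await fs.readFile(file, "utf8"));
console.log(loaded.length); // 2
// In production: program.setDemos(loaded);
```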
### When to Use Optimization
> **🎯 Perfect for beginners**: Start with classification tasks like sentiment
> analysis, email categorization, or content moderation where you have clear
> right/wrong answers.
✅ **Great for:**
- Classification tasks (sentiment, categories, etc.)
- When you have some example data (even just 5-10 examples!)
- When accuracy matters more than speed
- When you want to save money on API calls
- Repetitive tasks you do often
❌ **Skip for now:**
- Simple one-off tasks
- When you have no training examples
- Creative writing tasks (poems, stories)
- When you need results immediately (optimization takes 1-5 minutes)
---
## 🎯 Common Use Cases (Copy & Paste Ready)
### 1. Email Classification
```typescript
const emailClassifier = ax(`
emailContent:string "Email text" ->
category:class "urgent, normal, spam" "Email priority",
needsReply:class "yes, no" "Does this need a response?"
`);
const examples = [
{
emailContent: "URGENT: Server is down!",
category: "urgent",
needsReply: "yes",
},
{
emailContent: "Thanks for your help yesterday",
category: "normal",
needsReply: "no",
},
{
emailContent: "You won a million dollars! Click here!",
category: "spam",
needsReply: "no",
},
];
const metric = ({ prediction, example }) => {
let score = 0;
if (prediction.category === example.category) score += 0.7;
if (prediction.needsReply === example.needsReply) score += 0.3;
return score;
};
// Same optimization pattern as before...
```
### 2. Customer Support Routing
```typescript
const supportRouter = ax(`
customerMessage:string "Customer inquiry" ->
department:class "billing, technical, general" "Which team should handle this",
urgency:class "low, medium, high" "How urgent is this"
`);
const examples = [
{
customerMessage: "I was charged twice for my subscription",
department: "billing",
urgency: "high",
},
{
customerMessage: "How do I reset my password?",
department: "technical",
urgency: "medium",
},
{
customerMessage: "What are your business hours?",
department: "general",
urgency: "low",
},
];
```
### 3. Content Moderation
```typescript
const contentModerator = ax(`
userPost:string "User-generated content" ->
safe:class "yes, no" "Is this content appropriate?",
reason:string "Why was this flagged (if unsafe)"
`);
const examples = [
{ userPost: "Great weather today!", safe: "yes", reason: "" },
{
userPost: "This product sucks and so do you!",
safe: "no",
reason: "Inappropriate language",
},
{ userPost: "Check out my new blog post", safe: "yes", reason: "" },
];
```
---
## 💰 Saving Money: Teacher-Student Setup
**The Problem**: GPT-4o is smart but expensive. GPT-4o-mini is cheap but
sometimes not as accurate.
**The Solution**: Use GPT-4o as a "teacher" to make GPT-4o-mini perform close
to GPT-4o, at a fraction of the cost!
### Simple Teacher-Student Setup
```typescript
// Teacher: Smart but expensive (only used during optimization)
const teacherAI = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4O }, // The expensive one
});
// Student: Fast and cheap (used for actual work)
const studentAI = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini }, // The cheap one
});
const optimizer = new AxMiPRO({
studentAI, // This is what gets optimized
teacherAI, // This helps create better instructions
examples,
options: { verbose: true },
});
// The magic: cheap model performs like expensive model!
const result = await optimizer.compile(program, examples, metric);
```
**Real savings**: Instead of paying ~$0.03 per 1K tokens for GPT-4o, you run the
optimized GPT-4o-mini at ~$0.0006 per 1K tokens - 50x cheaper!
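The arithmetic behind that factor, using the illustrative per-1K-token prices above (not live pricing):

```typescript
// Illustrative per-1K-token prices from the text above, not live pricing.
const teacherCostPer1K = 0.03; // e.g. GPT-4o
const studentCostPer1K = 0.0006; // e.g. GPT-4o-mini

const factor = Math.round(teacherCostPer1K / studentCostPer1K);
console.log(factor); // 50
```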
---
## 🔧 Making It Better: Practical Tips
### 1. Better Examples = Better Results
❌ **Bad examples** (too similar):
```typescript
const badExamples = [
{ text: "I love it", sentiment: "positive" },
{ text: "I like it", sentiment: "positive" },
{ text: "I enjoy it", sentiment: "positive" },
];
```
✅ **Good examples** (diverse):
```typescript
const goodExamples = [
{ text: "I love this product!", sentiment: "positive" },
{ text: "Terrible quality, broke immediately", sentiment: "negative" },
{ text: "It works fine, nothing special", sentiment: "neutral" },
{ text: "Best purchase ever made!", sentiment: "positive" },
{ text: "Completely useless waste of money", sentiment: "negative" },
];
```
### 2. Better Metrics = Better Optimization
❌ **Too simple**:
```typescript
const simpleMetric = ({ prediction, example }) => {
return prediction.category === example.category ? 1 : 0;
};
```
✅ **More nuanced**:
```typescript
const betterMetric = ({ prediction, example }) => {
let score = 0;
// Main task (80% of score)
if (prediction.category === example.category) {
score += 0.8;
}
// Bonus for confidence (20% of score)
if (prediction.confidence && prediction.confidence > 0.7) {
score += 0.2;
}
return score;
};
```
### 3. Start Small, Then Scale
**Phase 1**: Start with 5-10 examples
```typescript
const optimizer = new AxMiPRO({
studentAI,
examples: examples.slice(0, 10), // Just first 10
options: {
numTrials: 3, // Quick test
verbose: true,
},
});
```
**Phase 2**: Scale up if results are good
```typescript
const optimizer = new AxMiPRO({
studentAI,
teacherAI,
examples: allExamples, // All your data
options: {
numTrials: 8, // More thorough
verbose: true,
},
});
```
---
## 🛠️ Troubleshooting Guide
### "My optimization score is low!"
**Check your examples**:
```typescript
// Are they diverse enough?
console.log("Unique categories:", [
...new Set(examples.map((e) => e.category)),
]);
// Are they correct?
examples.forEach((ex, i) => {
console.log(`Example ${i}: ${ex.text} -> ${ex.category}`);
});
```
**Try a better metric**:
```typescript
// Add logging to see what's happening
const debugMetric = ({ prediction, example }) => {
const correct = prediction.category === example.category;
console.log(
`Predicted: ${prediction.category}, Expected: ${example.category}, Correct: ${correct}`,
);
return correct ? 1 : 0;
};
```
### "It's too expensive!"
**Set a budget**:
```typescript
import { AxDefaultCostTracker } from "@ax-llm/ax";
const costTracker = new AxDefaultCostTracker({
maxTokens: 10000, // Stop after 10K tokens
maxCost: 5, // Stop after $5
});
const optimizer = new AxMiPRO({
studentAI,
examples,
costTracker, // Automatic budget control
options: {
numTrials: 3, // Fewer trials
earlyStoppingTrials: 2, // Stop early if no improvement
},
});
```
### "It's taking too long!"
**Speed it up**:
```typescript
const optimizer = new AxMiPRO({
studentAI,
examples: examples.slice(0, 20), // Fewer examples
options: {
numCandidates: 3, // Fewer candidates to try
numTrials: 5, // Fewer trials
minibatch: true, // Process in smaller batches
verbose: true,
},
});
```
### "Results are inconsistent!"
**Make it reproducible**:
```typescript
const optimizer = new AxMiPRO({
studentAI: ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: {
model: AxAIOpenAIModel.GPT4OMini,
temperature: 0.1, // Lower = more consistent
},
}),
examples,
seed: 42, // Same results every time
options: { verbose: true },
});
```
---
## 🎓 Next Steps: Level Up Your Skills
### 1. Try Different Optimizers
**For few-shot learning** (when you have good examples):
```typescript
import { AxBootstrapFewShot } from "@ax-llm/ax";
const optimizer = new AxBootstrapFewShot({
studentAI,
examples,
options: {
maxDemos: 5, // Show 5 examples to AI
maxRounds: 3, // 3 rounds of improvement
verboseMode: true,
},
});
```
### 2. Agentic Context Engineering (ACE)
> **📖 Full Documentation**: See [ACE.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/ACE.md) for complete ACE guide
**The Problem**: Iteratively rewriting a giant system prompt causes brevity bias and context collapse—hard-won strategies disappear after a few updates. You need a way to grow and refine a durable playbook both offline and online.
**The Solution**: Use `AxACE`, an optimizer that mirrors the ACE paper's Generator → Reflector → Curator loop. It represents context as structured bullets, applies incremental deltas, and returns a serialized playbook you can save, load, and keep updating at inference time.
```typescript
import fs from "node:fs/promises";
import { ax, AxAI, AxACE, AxAIOpenAIModel } from "@ax-llm/ax";
const student = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini },
});
const teacher = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4O },
});
const classifier = ax(
'ticket:string "Support ticket text" -> severity:class "low, medium, high" "Incident severity"'
);
classifier.setDescription(
"Classify the severity of the support ticket and explain your reasoning."
);
const examples = [
{ ticket: "Billing portal returns 502 errors globally.", severity: "high" },
{ ticket: "UI misaligned on Safari but usable.", severity: "low" },
{ ticket: "Checkout intermittently drops vouchers.", severity: "medium" },
];
const metric = ({ prediction, example }) =>
prediction.severity === example.severity ? 1 : 0;
const optimizer = new AxACE(
{ studentAI: student, teacherAI: teacher, verbose: true },
{ maxEpochs: 2 }
);
const result = await optimizer.compile(classifier, examples, metric);
result.optimizedProgram?.applyTo(classifier);
// Save the structured playbook for future sessions
await fs.writeFile(
"ace-playbook.json",
JSON.stringify(result.artifact.playbook, null, 2)
);
// Later, load the playbook and keep adapting online
const loadedPlaybook = JSON.parse(
await fs.readFile("ace-playbook.json", "utf8")
);
const onlineOptimizer = new AxACE(
{ studentAI: student, teacherAI: teacher },
{ initialPlaybook: loadedPlaybook }
);
const onlineCuratorDelta = await onlineOptimizer.applyOnlineUpdate({
example: { ticket: "Orders failing globally", severity: "high" },
prediction: await classifier.forward(student, {
ticket: "Orders failing globally",
}),
feedback: "Production telemetry confirmed a P1 outage.",
});
```
**Why it matters**:
- **Structured memory** – Playbooks of tagged bullets persist across runs.
- **Incremental updates** – Curator operations apply as deltas, so context never collapses.
- **Offline + Online** – Same optimizer supports batch training and per-sample updates.
- **Unified artifacts** – `AxACEOptimizedProgram` extends `AxOptimizedProgramImpl`, so you can save/load/apply like MiPRO or GEPA.
> **📖 Full Example**: `src/examples/ace-train-inference.ts` demonstrates offline training plus an online adaptation pass.
### 3. Multi-Objective Optimization with GEPA and GEPA-Flow
> **📖 Full Documentation**: See [GEPA.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/GEPA.md) for complete GEPA guide
**The Problem**: Sometimes you care about multiple things at once - accuracy AND
speed AND cost. Traditional optimization only handles one objective at a time.
**The Solution**: Use `AxGEPA` (single-module) or `AxGEPAFlow` (multi-module)
with a multi-objective metric. Both use `compile(...)` and return a Pareto
frontier of trade-offs plus hypervolume metrics.
**NEW in v14.0.24+**: GEPA now returns the same unified `optimizedProgram` interface as MiPRO, enabling consistent save/load/apply workflows across all optimizers.
> Note: Pass `maxMetricCalls` in `compile` options to bound evaluation cost.
#### What is Pareto Optimization?
A solution is "Pareto optimal" if you can't improve one objective without making
another objective worse. The collection of all such solutions is called the
"Pareto frontier."
**Example**:
- Solution A: 90% accuracy, 100ms response time, $0.10 cost
- Solution B: 85% accuracy, 50ms response time, $0.05 cost
- Solution C: 80% accuracy, 200ms response time, $0.08 cost
Solutions A and B are both Pareto optimal (A is more accurate but slower and
pricier; B is faster and cheaper but less accurate). Solution C is dominated by
B, which beats it on accuracy, speed, and cost.
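The A/B/C example can be checked mechanically. A small sketch (the `Solution` shape and `dominates` helper are illustrative, not Ax APIs): one solution dominates another if it is at least as good on every objective and strictly better on at least one.

```typescript
// Illustrative types and helpers, not Ax APIs. Higher accuracy is better;
// lower latency and cost are better.
type Solution = {
  name: string;
  accuracy: number;
  latencyMs: number;
  cost: number;
};

// a dominates b if a is at least as good on every objective and strictly
// better on at least one.
const dominates = (a: Solution, b: Solution): boolean =>
  a.accuracy >= b.accuracy &&
  a.latencyMs <= b.latencyMs &&
  a.cost <= b.cost &&
  (a.accuracy > b.accuracy || a.latencyMs < b.latencyMs || a.cost < b.cost);

const solutions: Solution[] = [
  { name: "A", accuracy: 0.9, latencyMs: 100, cost: 0.1 },
  { name: "B", accuracy: 0.85, latencyMs: 50, cost: 0.05 },
  { name: "C", accuracy: 0.8, latencyMs: 200, cost: 0.08 },
];

// The Pareto frontier: every solution not dominated by any other.
const frontier = solutions.filter(
  (s) => !solutions.some((o) => o !== s && dominates(o, s)),
);

console.log(frontier.map((s) => s.name)); // ["A", "B"]
```

GEPA's frontier is computed over your metric's objectives in the same spirit, just across candidate prompts instead of hand-written solutions.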
#### When to Use GEPA / GEPA-Flow
✅ **Perfect for:**
- Content moderation (accuracy vs speed vs cost)
- Customer service routing (response time vs routing accuracy vs resource usage)
- Email classification (precision vs recall vs processing speed)
- Product recommendations (relevance vs diversity vs computation cost)
❌ **Skip for:**
- Single clear objective (use regular `AxMiPRO.compile`)
- When one objective is clearly most important
- Quick prototyping (multi-objective adds complexity)
#### Complete Working Example (GEPA)
> **📖 Full Example**: For a comprehensive multi-objective optimization demonstration, see `src/examples/gepa-quality-vs-speed-optimization.ts` which shows GEPA optimizing code review quality vs speed trade-offs with detailed Pareto frontier analysis.
```typescript
import { ai, ax, AxGEPA, AxOptimizedProgramImpl } from "@ax-llm/ax";
// Two-objective demo: accuracy (classification) + brevity (short rationale)
const moderator = ax(`
userPost:string "User content" ->
isSafe:class "safe, unsafe" "Safety",
rationale:string "One concise sentence"
`);
const train = [
{ userPost: "Great weather today!", isSafe: "safe" },
{ userPost: "This product sucks and the company is terrible!", isSafe: "unsafe" },
// ...
];
const val = [
{ userPost: "Reminder: submit timesheets", isSafe: "safe" },
{ userPost: "Data breach follow-up actions required", isSafe: "unsafe" },
// ...
];
// Multi-objective metric
const multiMetric = ({ prediction, example }: any) => {
const accuracy = prediction?.isSafe === example?.isSafe ? 1 : 0;
const rationale: string = typeof prediction?.rationale === 'string' ? prediction.rationale : '';
const len = rationale.length;
const brevity = len <= 30 ? 1 : len <= 60 ? 0.7 : len <= 100 ? 0.4 : 0.1;
return { accuracy, brevity } as Record<string, number>;
};
const student = ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY!, config: { model: 'gpt-4o-mini' } });
const optimizer = new AxGEPA({ studentAI: student, numTrials: 16, minibatch: true, minibatchSize: 6, seed: 42, verbose: true });
console.log("🔄 Finding Pareto trade-offs...");
const result = await optimizer.compile(
moderator as any,
train,
multiMetric as any,
{
validationExamples: val,
feedbackExamples: val,
feedbackFn: ({ prediction, example }) =>
prediction?.isSafe === example?.isSafe
? '✅ Matched label'
: [
`Expected: ${example?.isSafe ?? 'unknown'}`,
`Received: ${prediction?.isSafe ?? 'unknown'}`,
],
// Required to bound evaluation cost
maxMetricCalls: 200,
// Optional: provide a tie-break scalarizer for selection logic
// paretoMetricKey: 'accuracy',
// or
// paretoScalarize: (s) => 0.7*s.accuracy + 0.3*s.brevity,
}
);
console.log(`✅ Found ${result.paretoFrontSize} Pareto points`);
console.log(`📊 Hypervolume (2D): ${result.hypervolume ?? 'N/A'}`);
// Inspect a few points
for (const [i, p] of [...result.paretoFront].entries()) {
if (i >= 3) break;
console.log(` #${i+1}: acc=${(p.scores as any).accuracy?.toFixed(3)}, brev=${(p.scores as any).brevity?.toFixed(3)}, config=${JSON.stringify(p.configuration)}`);
}
// **NEW: GEPA now provides unified optimizedProgram interface**
if (result.optimizedProgram) {
// Apply optimization using the same pattern as MiPRO
moderator.applyOptimization(result.optimizedProgram);
console.log(`✨ Applied GEPA optimization:`);
console.log(` Score: ${result.optimizedProgram.bestScore.toFixed(3)}`);
console.log(` Optimizer: ${result.optimizedProgram.optimizerType}`); // "GEPA"
console.log(` Converged: ${result.optimizedProgram.converged ? "✅" : "❌"}`);
// Save the complete GEPA optimization (same as MiPRO format)
const fs = await import("node:fs/promises");
await fs.writeFile(
"gepa-optimization.json",
JSON.stringify({
version: "2.0",
bestScore: result.optimizedProgram.bestScore,
instruction: result.optimizedProgram.instruction,
demos: result.optimizedProgram.demos,
examples: result.optimizedProgram.examples, // GEPA includes training examples
modelConfig: result.optimizedProgram.modelConfig,
optimizerType: result.optimizedProgram.optimizerType,
optimizationTime: result.optimizedProgram.optimizationTime,
totalRounds: result.optimizedProgram.totalRounds,
converged: result.optimizedProgram.converged,
stats: result.optimizedProgram.stats,
timestamp: new Date().toISOString(),
}, null, 2)
);
// Load and apply later (same pattern as MiPRO)
// const savedData = JSON.parse(await fs.readFile('gepa-optimization.json', 'utf8'));
// const optimizedProgram = new AxOptimizedProgramImpl(savedData);
// moderator.applyOptimization(optimizedProgram);
} else {
// Fallback: choose a compromise by weighted sum
const weights = { accuracy: 0.7, brevity: 0.3 };
const best = result.paretoFront.reduce((best, cur) => {
const s = weights.accuracy * ((cur.scores as any).accuracy ?? 0) + weights.brevity * ((cur.scores as any).brevity ?? 0);
const b = weights.accuracy * ((best.scores as any).accuracy ?? 0) + weights.brevity * ((best.scores as any).brevity ?? 0);
return s > b ? cur : best;
});
console.log(`🎯 Chosen config: ${JSON.stringify(best.configuration)}`);
}
```
> 💡 **Feedback hook**: `feedbackFn` lets you surface rich guidance for each evaluation, whether it's a short string or multiple
> bullet points. The hook receives the raw `prediction` and original `example`, making it easy to emit reviewer-style comments
> alongside scores. Pair it with `feedbackExamples` to keep cost-efficient review sets separate from validation metrics.
#### GEPA-Flow (Multi-Module)
```typescript
import { AxGEPAFlow, flow, ai } from "@ax-llm/ax";
const pipeline = flow<{ emailText: string }>()
.n('classifier', 'emailText:string -> priority:class "high, normal, low"')
.n('rationale', 'emailText:string, priority:string -> rationale:string "One concise sentence"')
.e('classifier', (s) => ({ emailText: s.emailText }))
.e('rationale', (s) => ({ emailText: s.emailText, priority: s.classifierResult.priority }))
.m((s) => ({ priority: s.classifierResult.priority, rationale: s.rationaleResult.rationale }));
const optimizer = new AxGEPAFlow({ studentAI: ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY!, config: { model: 'gpt-4o-mini' } }), numTrials: 16 });
const result = await optimizer.compile(pipeline as any, train, multiMetric as any, { validationExamples: val, maxMetricCalls: 240 });
console.log(`Front size: ${result.paretoFrontSize}, Hypervolume: ${result.hypervolume}`);
```
#### Advanced Multi-Objective Patterns
**Cost-Quality Trade-off**:
```typescript
const multiMetric = ({ prediction, example }) => ({
accuracy: prediction.category === example.category ? 1 : 0,
cost: 1 / (estimateTokenCost(prediction) + 1), // Inverse cost (higher = cheaper)
speed: 1 / (prediction.responseTime || 1000), // Inverse time (higher = faster)
});
```
**Precision-Recall Optimization**:
```typescript
const multiMetric = ({ prediction, example }) => {
const truePositive =
prediction.category === "positive" && example.category === "positive"
? 1
: 0;
const falsePositive =
prediction.category === "positive" && example.category !== "positive"
? 1
: 0;
const falseNegative =
prediction.category !== "positive" && example.category === "positive"
? 1
: 0;
return {
precision: falsePositive === 0
? 1
: (truePositive / (truePositive + falsePositive)),
recall: falseNegative === 0
? 1
: (truePositive / (truePositive + falseNegative)),
};
};
```
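Because this metric is computed one example at a time, precision and recall collapse to 0/1 values that the optimizer then averages. For offline analysis you may also want corpus-level precision and recall, which accumulate counts across the whole set before dividing. A minimal sketch (hypothetical helper, not part of Ax):

```typescript
// Corpus-level precision/recall: accumulate counts over all examples,
// then divide once, instead of averaging per-example ratios.
type Labeled = { predicted: string; actual: string };

function precisionRecall(rows: Labeled[], positive = "positive") {
  let tp = 0, fp = 0, fn = 0;
  for (const { predicted, actual } of rows) {
    if (predicted === positive && actual === positive) tp++;
    else if (predicted === positive) fp++;
    else if (actual === positive) fn++;
  }
  return {
    precision: tp + fp === 0 ? 1 : tp / (tp + fp),
    recall: tp + fn === 0 ? 1 : tp / (tp + fn),
  };
}

precisionRecall([
  { predicted: "positive", actual: "positive" },
  { predicted: "positive", actual: "negative" },
  { predicted: "negative", actual: "positive" },
  { predicted: "negative", actual: "negative" },
]); // → { precision: 0.5, recall: 0.5 }
```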
**Customer Satisfaction vs Efficiency**:
```typescript
const multiMetric = ({ prediction, example }) => ({
customerSatisfaction: calculateSatisfactionScore(prediction, example),
resourceEfficiency: 1 / (prediction.processingSteps || 1),
resolutionSpeed: prediction.resolutionTime
? (1 / prediction.resolutionTime)
: 0,
});
```
#### Understanding the Results
```typescript
const result = await optimizer.compile(program, examples, multiMetric, { maxMetricCalls: 200 });
// Key properties of AxParetoResult:
console.log(`Pareto frontier size: ${result.paretoFrontSize}`);
console.log(`Best scalarized score on frontier: ${result.bestScore}`);
console.log(`Hypervolume (2D only): ${result.hypervolume}`);
console.log(`Total candidates evaluated: ${result.finalConfiguration?.candidates}`);
// Each frontier solution contains:
result.paretoFront.forEach((solution) => {
solution.scores; // Scores for each objective
solution.configuration; // Candidate identifier for this solution
solution.dominatedSolutions; // How many others this point dominates
});
```
#### Performance Considerations
- **Runtime**: GEPA/GEPA-Flow perform reflective evolution with Pareto sampling; time scales with `numTrials`, validation size, and `maxMetricCalls`.
- **Cost**: Bound evaluations with `maxMetricCalls`; consider minibatching.
- **Scalability**: Works best with 2–4 objectives; hypervolume reporting is 2D.
- **Determinism**: Provide `seed` for reproducibility; `tieEpsilon` resolves near-ties.
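For intuition on the hypervolume number: with two maximized objectives, it is the area of objective space dominated by the frontier, measured from a reference point. A minimal sketch (not the Ax implementation; assumes scores in [0, 1] and a reference point at the origin):

```typescript
// 2D hypervolume: area dominated by the front, relative to a reference
// point (here the origin), with both objectives maximized.
type Point2 = { x: number; y: number };

function hypervolume2D(front: Point2[], ref: Point2 = { x: 0, y: 0 }): number {
  // Sort by x descending; each point then contributes a rectangle whose
  // height is the amount its y exceeds the best y seen so far.
  const sorted = [...front].sort((a, b) => b.x - a.x);
  let area = 0;
  let prevY = ref.y;
  for (const p of sorted) {
    if (p.y > prevY) {
      area += (p.x - ref.x) * (p.y - prevY);
      prevY = p.y;
    }
  }
  return area;
}

hypervolume2D([
  { x: 1.0, y: 0.2 },
  { x: 0.5, y: 0.8 },
]); // → 1.0*0.2 + 0.5*0.6 = 0.5
```

A larger hypervolume means the frontier pushes further toward the ideal corner of the trade-off space, which is why it is useful as a single progress number for multi-objective runs.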
#### Tips for Success
1. **Start with 2-3 objectives**: More objectives make selection harder.
2. **Scale objectives similarly (0–1)** for fair comparison.
3. **Use `paretoMetricKey` or `paretoScalarize`** to guide selection/tie-breaks.
4. **Validate chosen trade-offs** on a holdout set aligned to business constraints.
5. **Keep validation small** to control cost; use `validationExamples` and `feedbackExamples` splits.
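For tip 2, a simple min-max rescaling is often enough when raw objective values (e.g. milliseconds vs. dollars) live on very different scales. A sketch (hypothetical helper, not part of Ax):

```typescript
// Min-max normalize raw objective values into [0, 1] so no single
// objective dominates the Pareto comparison by scale alone.
function normalize(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 0.5); // constant objective
  return values.map((v) => (v - min) / (max - min));
}

normalize([100, 300, 200]); // → [0, 1, 0.5]
```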
### 3. Chain Multiple Programs
```typescript
// First program: Extract key info
const extractor = ax(
'emailContent:string "Email content" -> keyPoints:string[] "Important points"',
);
// Second program: Classify based on extracted info
const classifier = ax(
'keyPoints:string[] "Key points" -> priority:class "low, medium, high" "Email priority"',
);
// Optimize them separately, then chain them
const extractResult = await extractOptimizer.compile(
extractor,
extractExamples,
extractMetric,
);
const classifyResult = await classifyOptimizer.compile(
classifier,
classifyExamples,
classifyMetric,
);
// Use them together
const emailContent = "Meeting moved to 3pm tomorrow, please confirm";
const keyPoints = await extractor.forward(llm, { emailContent });
const priority = await classifier.forward(llm, {
keyPoints: keyPoints.keyPoints,
});
```
---
## 📖 Complete Working Example
Here's a full example you can copy, paste, and run:
```typescript
import { ai, ax, AxMiPRO } from "@ax-llm/ax";
// 1. Define the task
const productReviewer = ax(`
productReview:string "Customer product review" ->
rating:class "1, 2, 3, 4, 5" "Star rating 1-5",
aspect:class "quality, price, shipping, service" "Main concern",
recommendation:class "buy, avoid, maybe" "Would you recommend?"
`);
// 2. Training examples
const examples = [
{
productReview: "Amazing quality, worth every penny!",
rating: "5",
aspect: "quality",
recommendation: "buy",
},
{
productReview: "Too expensive for what you get",
rating: "2",
aspect: "price",
recommendation: "avoid",
},
{
productReview: "Good product but took forever to arrive",
rating: "3",
aspect: "shipping",
recommendation: "maybe",
},
{
productReview: "Great value, fast delivery, happy customer!",
rating: "5",
aspect: "price",
recommendation: "buy",
},
{
productReview: "Customer service was rude when I had issues",
rating: "1",
aspect: "service",
recommendation: "avoid",
},
];
// 3. AI setup
const llm = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" },
});
// 4. Success metric
const metric = ({ prediction, example }) => {
let score = 0;
if (prediction.rating === example.rating) score += 0.5;
if (prediction.aspect === example.aspect) score += 0.3;
if (prediction.recommendation === example.recommendation) score += 0.2;
return score;
};
// 5. Optimize
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
options: { verbose: true },
});
console.log("🔄 Starting optimization...");
const result = await optimizer.compile(productReviewer, examples, metric);
console.log(
`✅ Optimization complete! Score improved to ${
(result.bestScore * 100).toFixed(1)
}%`,
);
// 6. Apply and save the optimization results using the unified approach
if (result.optimizedProgram) {
const fs = await import("fs/promises");
// Apply all optimizations at once
productReviewer.applyOptimization(result.optimizedProgram);
console.log(`✨ Applied optimized configuration:`);
console.log(` Score: ${result.optimizedProgram.bestScore.toFixed(3)}`);
console.log(` Optimizer: ${result.optimizedProgram.optimizerType}`);
console.log(
` Converged: ${result.optimizedProgram.converged ? "✅" : "❌"}`,
);
// Save complete optimization configuration
await fs.writeFile(
"product-reviewer-optimization.json",
JSON.stringify(
{
version: "2.0",
bestScore: result.optimizedProgram.bestScore,
instruction: result.optimizedProgram.instruction,
demos: result.optimizedProgram.demos,
modelConfig: result.optimizedProgram.modelConfig,
optimizerType: result.optimizedProgram.optimizerType,
optimizationTime: result.optimizedProgram.optimizationTime,
totalRounds: result.optimizedProgram.totalRounds,
converged: result.optimizedProgram.converged,
stats: result.optimizedProgram.stats,
created: new Date().toISOString(),
},
null,
2,
),
);
console.log(
"💾 Complete optimization saved to product-reviewer-optimization.json!",
);
}
// 7. Test the optimized version
const testReview =
"The item was okay but customer support was unhelpful when I had questions";
const analysis = await productReviewer.forward(llm, {
productReview: testReview,
});
console.log("Analysis:", analysis);
// Expected: rating: '2' or '3', aspect: 'service', recommendation: 'avoid' or 'maybe'
// 8. Later in production - load complete optimization:
// import { AxOptimizedProgramImpl } from '@ax-llm/ax';
// const savedData = JSON.parse(await fs.readFile('product-reviewer-optimization.json', 'utf8'));
// const optimizedProgram = new AxOptimizedProgramImpl(savedData);
// productReviewer.applyOptimization(optimizedProgram);
// console.log(`🚀 Loaded complete optimization v${savedData.version} with score ${savedData.bestScore.toFixed(3)}`);
```
---
## 🎯 Key Takeaways
1. **Start simple**: 5 examples and basic optimization can give you 20-30%
improvement
2. **Use the unified approach**:
`program.applyOptimization(result.optimizedProgram)` - one call does
everything!
3. **Save complete optimizations**: New v2.0 format includes demos, instruction,
model config, and metadata
4. **Load optimizations cleanly**: Use `AxOptimizedProgramImpl` to recreate
saved optimizations
5. **Teacher-student saves money**: Use expensive models to teach cheap ones
6. **Good examples matter more than lots of examples**: 10 diverse examples beat
100 similar ones
7. **Measure what matters**: Your metric defines what the AI optimizes for
8. **Version comprehensively**: Track optimization versions, scores,
convergence, and metadata
9. **Backward compatibility**: Legacy demo format still works, but upgrade for
better experience
10. **Production-ready**: The unified approach is designed for enterprise
production use
**Ready to optimize your first AI program?** Copy the examples above and start
experimenting!
**Questions?** Check the `src/examples/` folder for more real-world examples, or
refer to the troubleshooting section above.
---
## 📚 Quick Reference
### Essential Imports
```typescript
import { ai, ax, AxMiPRO, AxOptimizedProgramImpl } from "@ax-llm/ax";
```
### Basic Pattern (Copy This!)
```typescript
// 1. Define program
const program = ax(
'inputText:string "description" -> output:class "a, b" "description"',
);
// 2. Create AI
const llm = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" },
});
// 3. Add examples
const examples = [{ inputText: "example", output: "a" }];
// 4. Define metric
const metric = ({ prediction, example }) =>
prediction.output === example.output ? 1 : 0;
// 5. Optimize
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
options: { verbose: true },
});
const result = await optimizer.compile(program, examples, metric);
// 6. Apply optimization (unified approach)
if (result.optimizedProgram) {
program.applyOptimization(result.optimizedProgram);
}
```
### Common Field Types
- `fieldName:string "description"` - Text input/output
- `fieldName:class "option1, option2" "description"` - Classification
- `fieldName:number "description"` - Numeric values
- `fieldName:string[] "description"` - Lists
- `fieldName:boolean "description"` - True/false
### Budget Control
```typescript
import { AxDefaultCostTracker } from "@ax-llm/ax";
const costTracker = new AxDefaultCostTracker({ maxTokens: 10000, maxCost: 5 });
// Pass it to the optimizer: new AxMiPRO({ studentAI, examples, costTracker, ... })
```
### Teacher-Student (Cost Savings)
```typescript
const teacherAI = ai({ name: "openai", config: { model: "gpt-4o" } }); // Expensive
const studentAI = ai({
name: "openai",
config: { model: "gpt-4o-mini" },
}); // Cheap
// Use both in optimizer: { studentAI, teacherAI, ... }
```
### Unified Optimization (New in v14.0+)
```typescript
// Save complete optimization
const savedData = {
version: "2.0",
bestScore: result.optimizedProgram.bestScore,
instruction: result.optimizedProgram.instruction,
demos: result.optimizedProgram.demos,
modelConfig: result.optimizedProgram.modelConfig, // temperature, etc.
optimizerType: result.optimizedProgram.optimizerType,
// ... all other optimization data
};
// Load and apply in production
const optimizedProgram = new AxOptimizedProgramImpl(savedData);
program.applyOptimization(optimizedProgram); // One call does everything!
// Benefits:
// ✅ Single object contains all optimization data
// ✅ One method call applies everything
// ✅ Complete metadata tracking
// ✅ Backward compatibility with legacy demos
// ✅ Production-ready versioning and deployment
```
---
_💡 Remember: Optimization is like having a personal AI tutor. You provide the
examples and goals, and it figures out the best way to teach your AI. Start
simple, measure results, and gradually make it more sophisticated as you learn
what works!_
---
## 💾 Checkpointing (Fault Tolerance)
Long-running optimizations can be expensive and time-consuming. Ax provides
simple function-based checkpointing to save optimization progress and recover
from failures.
### Why Use Checkpointing?
- **Cost Protection**: Don't lose expensive optimization work due to crashes
- **Fault Tolerance**: Resume optimization after interruptions
- **Experimentation**: Save optimization state at different points for analysis
### How It Works
Implement two simple functions to save and load checkpoint data:
```typescript
import { type AxCheckpointSaveFn, type AxCheckpointLoadFn } from '@ax-llm/ax'
const checkpointSave: AxCheckpointSaveFn = async (checkpoint) => {
// JSON serialize the checkpoint and save it wherever you want:
// - Memory: map.set(id, checkpoint)
// - localStorage: localStorage.setItem(id, JSON.stringify(checkpoint))
// - Database: await db.create({ data: checkpoint })
// - Files: await fs.writeFile(`${id}.json`, JSON.stringify(checkpoint))
// - Cloud: await s3.putObject({ Key: id, Body: JSON.stringify(checkpoint) })
const id = `checkpoint_${Date.now()}`
// Your storage implementation here
return id
}
const checkpointLoad: AxCheckpointLoadFn = async (id) => {
  // Load and JSON-parse the checkpoint data for this id;
  // return null if no checkpoint is found
  const checkpoint = /* your storage implementation here */ undefined
  return checkpoint ?? null
}
// Use with any optimizer
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
checkpointSave,
checkpointLoad,
checkpointInterval: 10, // Save every 10 rounds
resumeFromCheckpoint: 'checkpoint_12345', // Resume from specific checkpoint
options: { numTrials: 50, verbose: true }
})
```
### Key Points
- **Simple**: Just two functions - save and load
- **Storage Agnostic**: Works with any storage (memory, files, databases, cloud)
- **JSON Serializable**: Checkpoint data is just JSON - store it anywhere
- **Complete State**: Contains all optimization progress (scores,
configurations, examples)
- **Browser Compatible**: No filesystem dependencies
The checkpoint contains complete optimization state, so you can resume exactly
where you left off, even after crashes or interruptions.
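To make the contract concrete, here is a minimal in-memory implementation of the save/load pair (a sketch for tests and prototyping; the checkpoint type is simplified to plain JSON rather than Ax's real checkpoint shape):

```typescript
// In-memory checkpoint store demonstrating the save/load contract.
// In real code, type these as AxCheckpointSaveFn / AxCheckpointLoadFn.
type Checkpoint = Record<string, unknown>;

const store = new Map<string, string>();

const checkpointSave = async (checkpoint: Checkpoint): Promise<string> => {
  const id = `checkpoint_${store.size + 1}`;
  store.set(id, JSON.stringify(checkpoint)); // JSON keeps it storage-agnostic
  return id;
};

const checkpointLoad = async (id: string): Promise<Checkpoint | null> => {
  const raw = store.get(id);
  return raw ? JSON.parse(raw) : null;
};

// Round trip: what was saved comes back intact
const id = await checkpointSave({ round: 10, bestScore: 0.82 });
const restored = await checkpointLoad(id); // → { round: 10, bestScore: 0.82 }
```

Swapping the `Map` for `fs/promises`, a database client, or S3 changes only the bodies of these two functions; the optimizer's `checkpointSave`/`checkpointLoad` options stay the same.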
---
## 🐍 Python Optimization Service Integration
For advanced optimization scenarios requiring sophisticated Bayesian
optimization, Ax integrates with a production-ready Python service built on
Optuna. The service is required for MiPRO v2 optimization with complex
parameter spaces.
### When to Use Python Service
✅ **Great for:**
- Complex parameter optimization (10+ parameters)
- Bayesian optimization with acquisition functions
- Long-running optimization jobs (100+ trials)
- Production deployments requiring fault tolerance
- Distributed optimization across multiple machines
- Advanced pruning and sampling strategies
❌ **Note:** MiPRO v2 requires the Python service; the local TypeScript
fallback is no longer supported.
### Quick Setup with uv
The Python service uses `uv` for fast, modern Python package management:
```bash
# 1. Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Navigate to optimizer directory
cd src/optimizer
# 3. Install and run (that's it!)
uv sync
uv run ax-optimizer server start --debug
```
The service runs with an in-memory queue by default - no Redis or configuration
needed!
#### Production Setup (With Redis for Scaling)
```bash
# Install with Redis support
uv sync --group redis
# Start Redis (in another terminal)
docker run -p 6379:6379 redis:7-alpine
# Start the service
uv run ax-optimizer server start --debug
```
### CLI Usage
The service provides a comprehensive CLI for all operations:
```bash
# Server management
uv run ax-optimizer server start --host 0.0.0.0 --port 8000 --debug
uv run ax-optimizer server status
uv run ax-optimizer server stop
# Create MiPro optimization configuration
uv run ax-optimizer mipro create-config --output mipro_config.json
# Start optimization job
uv run ax-optimizer optimize --config mipro_config.json --monitor
# Monitor existing job
uv run ax-optimizer monitor
# Get parameter suggestions (manual optimization loop)
uv run ax-optimizer suggest
# Report trial results
uv run ax-optimizer evaluate
# Get final results
uv run ax-optimizer results
# List all jobs
uv run ax-optimizer list --limit 20
```
### Docker Setup (Production)
For production deployments, use the provided Docker setup:
```bash
# Start all services (Redis, PostgreSQL, API, Workers)
cd src/optimizer
docker-compose up -d
# View logs
docker-compose logs -f
# Scale workers for performance
docker-compose up -d --scale worker=3
# Stop services
docker-compose down
```
### MiPro with Python Service
Here's how to use MiPro with the Python optimization service:
```typescript
import { ai, ax, type AxMetricFn, AxMiPRO } from "@ax-llm/ax";
// Email classification example
const emailClassifier = ax(
'emailText:string "Email content" -> priority:class "critical, normal, low" "Email priority"',
);
const examples = [
{ emailText: "URGENT: Server down!", priority: "critical" },
{ emailText: "Meeting reminder", priority: "normal" },
{ emailText: "Newsletter update", priority: "low" },
// ... more examples
];
const metric: AxMetricFn = ({ prediction, example }) => {
return (prediction as any).priority === (example as any).priority ? 1.0 : 0.0;
};
// Configure MiPro with Python service
const optimizer = new AxMiPRO({
studentAI: ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" },
}),
teacherAI: ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4" },
}),
examples,
// Python service configuration
optimizerEndpoint: "http://localhost:8000",
optimizerTimeout: 60000,
optimizerRetries: 3,
// Enhanced MiPro settings for Python service
numTrials: 100, // More trials with Python
bayesianOptimization: true,
acquisitionFunction: "expected_improvement",
explorationWeight: 0.15,
// Self-consistency (MiPRO v2)
// Ask the model for multiple independent samples and pick the best with a default majority-vote picker
sampleCount: 3,
// Progress tracking
onProgress: (update) => {
console.log(`Trial ${update.round}: ${update.currentScore.toFixed(3)}`);
},
});
// Run optimization
const result = await optimizer.compile(emailClassifier, examples, metric);
console.log(`Best score: ${result.bestScore}`);
```
#### Custom Result Picker (Advanced)
```ts
import { type AxResultPickerFunction } from "@ax-llm/ax";
// Example: prefer higher confidence, break ties by shortest explanation
const myPicker: AxResultPickerFunction = async (data) => {
if (data.type === "function") {
// Choose first non-error function execution
const ix = data.results.findIndex((r) => !r.isError);
return ix >= 0 ? ix : 0;
}
// Fields: choose highest confidence; tie-breaker shortest explanation
let bestIx = 0;
let bestScore = -Infinity;
for (const r of data.results) {
const sample = r.sample as { confidence?: number; explanation?: string };
const score = (sample.confidence ?? 0) -
(sample.explanation?.length ?? 0) / 1000;
if (score > bestScore) {
bestScore = score;
bestIx = r.index;
}
}
return bestIx;
};
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
// Use 5 samples per example and custom picker
sampleCount: 5,
resultPicker: myPicker,
});
```
When to use:
- Use a custom picker when your task has a clear selection heuristic (e.g.,
confidence, shortness, scoring rubric) or you want to implement an LLM-judge
selection.
- For classification tasks, the built-in majority-vote default often works well.
#### New/Updated Options (MiPRO v2)
- `sampleCount?: number` (default: 1)
- When > 1, MiPRO evaluates each example with multiple samples and uses a
default result picker to select the best output per example. Great for tasks
where self-consistency helps.
- Early stopping
- Controlled by `earlyStoppingTrials` and `minImprovementThreshold`. MiPRO
will stop if no trial improves the best score by at least the threshold for
the configured number of trials.
- Minibatch scheduling
- When `minibatch` is true, evaluations run on random minibatches. Every
`minibatchFullEvalSteps` trials, MiPRO runs a full evaluation to correct
drift from minibatch noise.
- Expanded logging
- Progress is emitted each trial with score and configuration; early stopping
is logged; final result includes score/configuration histories and accurate
`optimizationTime`.
Note: MiPRO now applies suggested `bootstrappedDemos` during evaluation so that
the optimizer can learn their true effect on your metric.
#### Hyperparameters vs. MiPRO
MiPRO primarily optimizes the program-level levers emphasized in DSPy/MiPRO
(instructions, few-shot demos, data-aware proposals). Model hyperparameters
(e.g., `temperature`, `topP`, penalties) can be included for practical gains;
tuning `temperature` often helps self-consistency. The original MiPRO work
focuses on program synthesis and demo selection rather than broad model
hyperparameter sweeps. If you decide to extend the search space:
- Prefer a small, impactful set (e.g., `temperature`, occasionally `topP`).
- Keep ranges conservative to avoid noisy evaluations.
- Measure costs: a larger hyperparameter space increases trials.
Optional: Include topP in MiPRO
```ts
const optimizer = new AxMiPRO({
studentAI: llm,
examples,
// Keep it off by default; turn on if diversity helps your task
optimizeTopP: true, // adds topP (0.7–1.0) to the optimizer search space
sampleCount: 3, // pairs well with self-consistency
});
```
### Environment Variables
Configure the service with environment variables:
```bash
# .env file for Python service
HOST=0.0.0.0
PORT=8000
DEBUG=false
REDIS_URL=redis://localhost:6379/0
DATABASE_URL=postgresql://user:password@localhost/optimizer
USE_MEMORY_STORAGE=true # Set to false for PostgreSQL persistence
MAX_TRIALS_PER_STUDY=1000
DEFAULT_TIMEOUT_SECONDS=3600
MAX_CONCURRENT_JOBS=10
```
### Production Features
The Python service includes enterprise-ready features:
**Fault Tolerance:**
- Automatic checkpointing and resumption
- Redis-based task queue with ARQ
- Background job processing
- Health checks and monitoring
**Scalability:**
- Horizontal scaling with multiple workers
- Database persistence with PostgreSQL
- Connection pooling and resource management
- Rate limiting and timeout controls
**Observability:**
- Comprehensive logging with structured output
- Metrics export for monitoring systems
- Job status tracking and history
- Error reporting and debugging tools
### Advanced Parameter Templates
The service includes optimized parameter templates for different scenarios:
```python
# Using the Python adapter directly
from app.mipro_adapter import MiProAdapter, MiProConfiguration
# Light optimization (fast, good for development)
config = MiProConfiguration(optimization_level="light")
adapter = MiProAdapter(config)
request = adapter.create_optimization_request(
study_name="email_classification",
parameter_sets=["instruction_generation", "demo_selection"]
)
# Medium optimization (balanced, good for most use cases)
config = MiProConfiguration(optimization_level="medium")
# Heavy optimization (thorough, good for production)
config = MiProConfiguration(optimization_level="heavy")
```
### Integration with TypeScript
Switch between local and Python optimization seamlessly:
```typescript
const optimizer = new AxMiPRO({
studentAI,
examples,
numTrials: 100,
optimizerEndpoint: process.env.OPTIMIZER_ENDPOINT || "http://localhost:8000",
bayesianOptimization: true,
acquisitionFunction: "expected_improvement",
onProgress: (update) => {
console.log(`Trial ${update.round}: ${update.currentScore.toFixed(3)}`);
},
});
```
### Development Workflow
1. **Start with TypeScript** for quick prototyping:
```bash
npm run tsx ./src/examples/mipro-python-optimizer.ts
```
2. **Scale to Python** for production optimization:
```bash
# Terminal 1: Start Python service
cd src/optimizer && uv run ax-optimizer server start
# Terminal 2: Run with Python service
USE_PYTHON_OPTIMIZER=true npm run tsx ./src/examples/mipro-python-optimizer.ts
```
3. **Deploy to production** with Docker:
```bash
cd src/optimizer && docker-compose up -d
```
This provides a smooth development path from prototype to production with the
same codebase!
================================================================================
# AxFlow Guide
# Source: AXFLOW.md
# AxFlow - Orchestration framework for building AI workflows with Ax
# AxFlow Documentation
**AxFlow** is a powerful workflow orchestration system for building complex AI
applications with automatic dependency analysis, parallel execution, and
flexible control flow patterns.
## Table of Contents
- [Quick Start](#quick-start)
- [Why AxFlow is the Future](#why-axflow-is-the-future)
- [Core Concepts](#core-concepts)
- [API Reference](#api-reference)
- [Control Flow Patterns](#control-flow-patterns)
- [Asynchronous Operations](#5-asynchronous-operations)
- [Advanced Features](#advanced-features)
- [Best Practices](#best-practices)
- [Examples](#examples)
## Quick Start
### Basic Flow
```typescript
import { ai, flow } from "@ax-llm/ax";
// Create AI instance
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
// Create a simple flow (factory function)
const wf = flow<{ userInput: string }, { responseText: string }>()
.node("testNode", "userInput:string -> responseText:string")
.execute("testNode", (state) => ({ userInput: state.userInput }))
.returns((state) => ({ responseText: state.testNodeResult.responseText }));
// Execute the flow
const result = await wf.forward(llm, { userInput: "Hello world" });
console.log(result.responseText);
```
### Factory Options
```typescript
// Basic factory
const wf = flow();
// With options
const wf = flow({ autoParallel: false });
// With explicit typing
const wf = flow<{ userInput: string }>();
// With options and typing
const wf = flow<{ userInput: string }>({
  autoParallel: true,
  batchSize: 5,
});
```
## Why AxFlow is the Future
**🚀 Automatic Performance Optimization:**
- **Zero-Config Parallelization**: Automatically runs independent operations in
parallel (1.5-3x speedup)
- **Intelligent Dependency Analysis**: AI-powered analysis of input/output
dependencies
- **Optimal Execution Planning**: Automatically groups operations into parallel
levels
- **Concurrency Control**: Smart resource management with configurable limits
- **Runtime Control**: Enable/disable auto-parallelization per execution as
needed
**🛡️ Production-Ready Resilience:**
- **Exponential Backoff**: Smart retry strategies with configurable delays
- **Graceful Degradation**: Fallback mechanisms for continuous operation
- **Error Isolation**: Prevent cascading failures across workflow components
- **Resource Monitoring**: Adaptive scaling based on system performance
**Compared to Traditional Approaches:**
- **10x More Compact**: Ultra-concise syntax with powerful aliases
- **Zero Boilerplate**: Automatic state management and context threading
- **Multi-Modal Ready**: Native support for text, images, audio, and streaming
- **Self-Optimizing**: Built-in compatibility with MiPRO and other advanced
optimizers
- **Enterprise Ready**: Circuit breakers, retries, and monitoring built-in
- **Production Hardened**: Used by startups scaling to millions of users
**Real-World Superpowers:**
- **Autonomous Agents**: Self-healing, self-improving AI workflows
- **Multi-Model Orchestration**: Route tasks to the perfect AI for each job
- **Adaptive Pipelines**: Workflows that evolve based on real-time feedback
- **Cost Intelligence**: Automatic optimization between speed, quality, and cost
- **Mission Critical**: Built for production with enterprise-grade reliability
> _"AxFlow doesn't just execute AI workflows—it orchestrates the future of
> intelligent systems with automatic performance optimization"_
**Ready to build the impossible?** AxFlow extends `AxProgramWithSignature`,
giving you access to the entire Ax ecosystem: optimization, streaming, tracing,
function calling, and more. The future of AI development is declarative,
adaptive, and beautiful.
## Core Concepts
### 1. Node Definition
Nodes define the available operations in your flow. You must define nodes before
executing them.
```typescript
// String signature (creates AxGen automatically)
flow.node("processor", "input:string -> output:string");
// With multiple outputs
flow.node("analyzer", "text:string -> sentiment:string, confidence:number");
// Complex field types
flow.node(
"extractor",
"documentText:string -> processedResult:string, entities:string[]",
);
```
### 2. State Evolution
State grows as you execute nodes, with results stored in `{nodeName}Result`
format:
```typescript
// Initial state: { userInput: "Hello" }
flow.execute("processor", (state) => ({ input: state.userInput }));
// State becomes: { userInput: "Hello", processorResult: { output: "Processed Hello" } }
flow.execute("analyzer", (state) => ({ text: state.processorResult.output }));
// State becomes: {
// userInput: "Hello",
// processorResult: { output: "Processed Hello" },
// analyzerResult: { sentiment: "positive", confidence: 0.8 }
// }
```
### 3. State Transformation
Use `map()` to transform state between operations:
```typescript
flow.map((state) => ({
...state,
processedInput: state.userInput.toLowerCase(),
timestamp: Date.now(),
}));
```
## API Reference
### Core Methods
#### `node(name: string, signature: string, options?: object)`
Define a node with the given signature.
```typescript
flow.node("summarizer", "documentText:string -> summary:string");
flow.node("classifier", "text:string -> category:string", { debug: true });
```
**Alias:** `n()` - Short alias for `node()`
```typescript
flow.n("summarizer", "documentText:string -> summary:string");
```
#### `nodeExtended(name: string, baseSignature: string | AxSignature, extensions: object)`
Create a node with an extended signature by adding input and/or output fields.
```typescript
// Add chain-of-thought reasoning
flow.nodeExtended("reasoner", "question:string -> answer:string", {
prependOutputs: [
{ name: "reasoning", type: f.internal(f.string("Step-by-step reasoning")) },
],
});
// Add context and confidence
flow.nodeExtended("analyzer", "input:string -> output:string", {
appendInputs: [{ name: "context", type: f.optional(f.string("Context")) }],
appendOutputs: [{ name: "confidence", type: f.number("Confidence score") }],
});
```
**Extension Options:**
- `prependInputs` - Add fields at the beginning of input signature
- `appendInputs` - Add fields at the end of input signature
- `prependOutputs` - Add fields at the beginning of output signature
- `appendOutputs` - Add fields at the end of output signature
**Alias:** `nx()` - Short alias for `nodeExtended()`
```typescript
flow.nx("reasoner", "question:string -> answer:string", {
prependOutputs: [
{ name: "reasoning", type: f.internal(f.string("Step-by-step reasoning")) },
],
});
```
#### `execute(nodeName: string, mapping: Function, options?: object)`
Execute a node with input mapping.
```typescript
flow.execute("summarizer", (state) => ({
documentText: state.document,
}));
// With AI override
flow.execute("processor", mapping, { ai: alternativeAI });
```
#### `map(transform: Function)`
Transform the current state synchronously or asynchronously.
```typescript
// Synchronous transformation
flow.map((state) => ({
...state,
upperCaseResult: state.processorResult.output.toUpperCase(),
}));
// Asynchronous transformation
flow.map(async (state) => {
const apiData = await fetchFromAPI(state.query);
return {
...state,
enrichedData: apiData,
};
});
// Parallel asynchronous transformations
flow.map([
async (state) => ({ ...state, result1: await api1(state.data) }),
async (state) => ({ ...state, result2: await api2(state.data) }),
async (state) => ({ ...state, result3: await api3(state.data) }),
], { parallel: true });
```
**Alias:** `m()` - Short alias for `map()` (supports both sync and async)
```typescript
// Async with alias
flow.m(async (state) => {
const processed = await processAsync(state.input);
return { ...state, processed };
});
```
#### `returns(transform: Function)` / `r(transform: Function)`
Terminal transformation that sets the final output type of the flow. Use this as
the last transformation to get proper TypeScript type inference for the flow
result.
```typescript
const typedFlow = flow<{ input: string }>()
.map((state) => ({ ...state, processed: true, count: 42 }))
.returns((state) => ({
result: state.processed ? "done" : "pending",
totalCount: state.count,
})); // TypeScript now properly infers the output type
// Result is typed as { result: string; totalCount: number }
const result = await typedFlow.forward(llm, { input: "test" });
console.log(result.result); // Type-safe access
console.log(result.totalCount); // Type-safe access
```
**Key Benefits:**
- **Proper Type Inference**: TypeScript automatically infers the correct return
type
- **Clear Intent**: Explicitly marks the final transformation of your flow
- **Type Safety**: Full autocomplete and type checking on the result object
**Aliases:**
- `returns()` - Full descriptive name
- `r()` - Short alias (matches `m()` pattern)
```typescript
// These are equivalent:
flow.returns((state) => ({ output: state.value }));
flow.r((state) => ({ output: state.value }));
```
**When to Use:**
- When you want proper TypeScript type inference for complex flows
- As the final step in flows that transform the state into a specific output
format
- When building reusable flows that need clear output contracts
#### `description(name: string, description: string)`
Set a flow-level name and description. The description is stored on the flow's
inferred signature, and the name/description are used by `toFunction()` for the
exported function metadata.
```typescript
const wf = flow<{ userQuestion: string }, { responseText: string }>()
.node("qa", "userQuestion:string -> responseText:string")
.description(
"Question Answerer",
"Answers user questions concisely using the configured AI model.",
);
```
#### `toFunction()`
Convert the flow into an AxFunction using the flow's inferred signature. The
function's `name` prefers the name set via `description(name, ...)` and falls
back to the first line of the signature description. The `parameters` are
generated from the inferred input fields as JSON Schema.
```typescript
const wf = flow<{ userQuestion: string }, { responseText: string }>()
.node("qa", "userQuestion:string -> responseText:string")
.description(
"Question Answerer",
"Answers user questions concisely using the configured AI model.",
);
const fn = wf.toFunction();
console.log(fn.name); // "questionAnswerer"
console.log(fn.parameters); // JSON Schema from inferred input
// You can call fn.func with args and { ai } to execute the flow
```
### Control Flow Methods
#### `while(condition: Function)` / `endWhile()`
Create loops that execute while the condition is true.
```typescript
flow
.map((state) => ({ ...state, counter: 0 }))
.while((state) => state.counter < 3)
.map((state) => ({ ...state, counter: state.counter + 1 }))
.execute("processor", (state) => ({ input: `iteration ${state.counter}` }))
.endWhile();
```
#### `branch(predicate: Function)` / `when(value)` / `merge()`
Conditional branching based on predicate evaluation.
```typescript
flow
.branch((state) => state.complexity)
.when("simple")
.execute("simpleProcessor", mapping)
.when("complex")
.execute("complexProcessor", mapping)
.merge()
.map((state) => ({
result: state.simpleProcessorResult?.output ||
state.complexProcessorResult?.output,
}));
```
#### `parallel(subFlows: Function[])` / `merge(key: string, mergeFunction: Function)`
Execute multiple sub-flows in parallel.
```typescript
flow
.parallel([
(subFlow) =>
subFlow.execute("analyzer1", (state) => ({ text: state.input })),
(subFlow) =>
subFlow.execute("analyzer2", (state) => ({ text: state.input })),
(subFlow) =>
subFlow.execute("analyzer3", (state) => ({ text: state.input })),
])
.merge("combinedResults", (result1, result2, result3) => ({
analysis1: result1.analyzer1Result.analysis,
analysis2: result2.analyzer2Result.analysis,
analysis3: result3.analyzer3Result.analysis,
}));
```
#### `label(name: string)` / `feedback(condition: Function, labelName: string, maxIterations?: number)`
Create labeled points for feedback loops.
```typescript
flow
.map((state) => ({ ...state, attempts: 0 }))
.label("retry-point")
.map((state) => ({ ...state, attempts: state.attempts + 1 }))
.execute("processor", (state) => ({ input: state.userInput }))
.execute("validator", (state) => ({ output: state.processorResult.output }))
.feedback(
(state) => !state.validatorResult.isValid && state.attempts < 3,
"retry-point",
);
```
### Advanced Methods
#### `derive(outputField: string, inputField: string, transform: Function, options?: object)`
Create derived fields from array or scalar inputs with parallel processing
support.
```typescript
// Derive from array with parallel processing
flow.derive(
"processedItems",
"items",
(item, index) => `processed-${item}-${index}`,
{
batchSize: 2,
},
);
// Derive from scalar
flow.derive("upperText", "inputText", (text) => text.toUpperCase());
```
## Control Flow Patterns
### 1. Sequential Processing
```typescript
const sequentialFlow = flow<{ input: string }, { finalResult: string }>()
.node("step1", "input:string -> intermediate:string")
.node("step2", "intermediate:string -> output:string")
.execute("step1", (state) => ({ input: state.input }))
.execute(
"step2",
(state) => ({ intermediate: state.step1Result.intermediate }),
)
.map((state) => ({ finalResult: state.step2Result.output }));
```
### 2. Conditional Processing
```typescript
const conditionalFlow = flow<
{ query: string; isComplex: boolean },
{ response: string }
>()
.node("simpleHandler", "query:string -> response:string")
.node("complexHandler", "query:string -> response:string")
.branch((state) => state.isComplex)
.when(true)
.execute("complexHandler", (state) => ({ query: state.query }))
.when(false)
.execute("simpleHandler", (state) => ({ query: state.query }))
.merge()
.map((state) => ({
response: state.complexHandlerResult?.response ||
state.simpleHandlerResult?.response,
}));
```
### 3. Iterative Processing
```typescript
const iterativeFlow = flow<
{ content: string },
{ finalContent: string }
>()
.node("processor", "content:string -> processedContent:string")
.node("qualityChecker", "content:string -> qualityScore:number")
.map((state) => ({ currentContent: state.content, iteration: 0 }))
.while((state) => state.iteration < 3 && (state.qualityScore || 0) < 0.8)
.map((state) => ({ ...state, iteration: state.iteration + 1 }))
.execute("processor", (state) => ({ content: state.currentContent }))
.execute(
"qualityChecker",
(state) => ({ content: state.processorResult.processedContent }),
)
.map((state) => ({
...state,
currentContent: state.processorResult.processedContent,
qualityScore: state.qualityCheckerResult.qualityScore,
}))
.endWhile()
.map((state) => ({ finalContent: state.currentContent }));
```
### 4. Parallel Processing with Auto-Parallelization
AxFlow automatically detects independent operations and runs them in parallel:
```typescript
const autoParallelFlow = flow<
{ text: string },
{ combinedAnalysis: string }
>()
.node("sentimentAnalyzer", "text:string -> sentiment:string")
.node("topicExtractor", "text:string -> topics:string[]")
.node("entityRecognizer", "text:string -> entities:string[]")
// These three execute automatically in parallel! ⚡
.execute("sentimentAnalyzer", (state) => ({ text: state.text }))
.execute("topicExtractor", (state) => ({ text: state.text }))
.execute("entityRecognizer", (state) => ({ text: state.text }))
// This waits for all three to complete
.map((state) => ({
combinedAnalysis: JSON.stringify({
sentiment: state.sentimentAnalyzerResult.sentiment,
topics: state.topicExtractorResult.topics,
entities: state.entityRecognizerResult.entities,
}),
}));
// Check execution plan
const plan = autoParallelFlow.getExecutionPlan();
console.log("Parallel groups:", plan.parallelGroups);
console.log("Max parallelism:", plan.maxParallelism);
```
### 5. Asynchronous Operations
AxFlow supports asynchronous transformations in map operations, enabling API
calls, database queries, and other async operations within your flow:
```typescript
const asyncFlow = flow<
{ userQuery: string },
{ enrichedData: string; apiCallTime: number }
>()
.node("processor", "enrichedData:string -> processedResult:string")
// Single async map - API enrichment
.map(async (state) => {
const startTime = Date.now();
const apiData = await fetchFromExternalAPI(state.userQuery);
const duration = Date.now() - startTime;
return {
...state,
enrichedData: apiData,
apiCallTime: duration,
};
})
// Execute AI processing on enriched data
.execute("processor", (state) => ({ enrichedData: state.enrichedData }))
// Parallel async operations
.map([
  async (state) => {
    // Node results live under {nodeName}Result, so the processor's output
    // is state.processorResult.processedResult
    const sentiment = await analyzeSentiment(state.processorResult.processedResult);
    return { ...state, sentiment };
  },
  async (state) => {
    const entities = await extractEntities(state.processorResult.processedResult);
    return { ...state, entities };
  },
  async (state) => {
    const summary = await generateSummary(state.processorResult.processedResult);
    return { ...state, summary };
  },
], { parallel: true })
.map((state) => ({
enrichedData: state.enrichedData,
apiCallTime: state.apiCallTime,
}));
```
#### Mixed Sync/Async Processing
You can mix synchronous and asynchronous operations seamlessly:
```typescript
const mixedFlow = flow<{ rawData: string }, { result: string }>()
// Sync preprocessing
.map((state) => ({
...state,
cleanedData: state.rawData.trim().toLowerCase(),
}))
// Async validation
.map(async (state) => {
const isValid = await validateWithAPI(state.cleanedData);
return { ...state, isValid };
})
// More sync processing
.map((state) => ({
...state,
timestamp: Date.now(),
}))
// Final async processing
.map(async (state) => {
if (state.isValid) {
const processed = await processData(state.cleanedData);
return { result: processed };
}
return { result: "Invalid data" };
});
```
#### Performance Considerations
- **Parallel async maps**: Multiple async operations run concurrently
- **Sequential async maps**: Each async operation waits for the previous one
- **Batch control**: Use `batchSize` option to control parallelism
```typescript
// Parallel execution (faster)
flow.map([asyncOp1, asyncOp2, asyncOp3], { parallel: true });
// Sequential execution (slower but controlled)
flow
.map(asyncOp1)
.map(asyncOp2)
.map(asyncOp3);
```
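The difference is easy to demonstrate with plain Promises, independent of AxFlow: concurrent awaits overlap their latencies, while sequential awaits add them up (a minimal sketch; the 50ms delays are arbitrary):

```typescript
// Plain-Promise illustration (no AxFlow): three 50ms tasks.
const delay = (ms: number) => new Promise<void>((res) => setTimeout(res, ms));

async function compare(): Promise<{ parallelMs: number; sequentialMs: number }> {
  const t0 = Date.now();
  // Concurrent: all three timers overlap, total is roughly one delay (~50ms)
  await Promise.all([delay(50), delay(50), delay(50)]);
  const parallelMs = Date.now() - t0;

  const t1 = Date.now();
  // Sequential: each await waits for the previous one, total ~150ms
  await delay(50);
  await delay(50);
  await delay(50);
  const sequentialMs = Date.now() - t1;

  return { parallelMs, sequentialMs };
}
```

The same trade-off applies to `map([...], { parallel: true })` versus chained `.map()` calls above.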
### 6. Self-Healing with Feedback Loops
```typescript
const selfHealingFlow = flow<{ input: string }, { output: string }>()
.node("processor", "input:string -> output:string, confidence:number")
.node("validator", "output:string -> isValid:boolean, issues:string[]")
.node("fixer", "output:string, issues:string[] -> fixedOutput:string")
.map((state) => ({ ...state, attempts: 0 }))
.label("process")
.map((state) => ({ ...state, attempts: state.attempts + 1 }))
.execute("processor", (state) => ({ input: state.input }))
.execute("validator", (state) => ({ output: state.processorResult.output }))
.feedback(
(state) => !state.validatorResult.isValid && state.attempts < 3,
"process",
)
// If still invalid after retries, try to fix
.branch((state) => state.validatorResult.isValid)
.when(false)
.execute("fixer", (state) => ({
output: state.processorResult.output,
issues: state.validatorResult.issues,
}))
.map((state) => ({
output: state.fixerResult.fixedOutput,
}))
.when(true)
.map((state) => ({
output: state.processorResult.output,
}))
.merge();
```
## Advanced Features
### Instrumentation and Optimization (v13.0.24+)
- Deprecation: prefer `flow()` factory over `new AxFlow()`.
- Tracing: pass a Tracer via `flow({ tracer })` or
`flow.forward(llm, input, { tracer, traceContext })`.
- A parent span is created at the flow boundary (if `tracer` is provided).
- The parent span context is propagated to all node `.forward()` calls via
`options.traceContext`.
- Pass an OpenTelemetry Context for `traceContext` (not a Span). Use
`@opentelemetry/api` `context.active()` or similar.
- Meter: pass `meter` the same way as `tracer`; it is propagated to node
forwards.
- **Program IDs & `namedPrograms()`**: Each node is registered with a
dot-separated ID. Use `namedPrograms()` to discover them:
```typescript
const wf = flow<{ input: string }>()
.node('summarizer', 'text:string -> summary:string')
.node('classifier', 'text:string -> category:string');
console.log(wf.namedPrograms());
// [
// { id: 'root.summarizer', signature: 'text:string -> summary:string' },
// { id: 'root.classifier', signature: 'text:string -> category:string' },
// ]
```
- **Demos/Examples routing**: `flow.setDemos(demos)` routes by `programId` to the
correct node. The flow maintains an internal `AxProgram`
and registers all child nodes; each node filters demos by its `programId`.
TypeScript narrows `programId` to registered node names — typos are caught at compile time:
```typescript
// OK
wf.setDemos([{ programId: 'root.summarizer', traces: [] }]);
// TypeScript error: 'root.summerizer' is not a valid node name
wf.setDemos([{ programId: 'root.summerizer', traces: [] }]);
```
At runtime, unknown programIds throw a descriptive error listing valid IDs.
- **Optimization**: `flow.applyOptimization(optimizedProgram)` applies to the flow's
internal program and all registered child nodes.
- Parallel map: `flow.map([...], { parallel: true })` merges all transform
outputs back into state.
**Example:**
```typescript
import { ai, flow } from "@ax-llm/ax";
import { context, trace } from "@opentelemetry/api";
const llm = ai({ name: "mock" });
const tracer = trace.getTracer("axflow");
const wf = flow<{ userQuestion: string }>()
.node("summarizer", "documentText:string -> summaryText:string")
.execute("summarizer", (s) => ({ documentText: s.userQuestion }))
.returns((s) => ({ finalAnswer: (s as any).summarizerResult.summaryText }));
const parentCtx = context.active();
const out = await wf.forward(llm, { userQuestion: "hi" }, {
tracer,
traceContext: parentCtx,
});
```
### 1. Auto-Parallelization
AxFlow automatically analyzes dependencies and runs independent operations in
parallel:
```typescript
// Disable auto-parallelization globally
const sequentialFlow = flow({ autoParallel: false });
// Disable for specific execution
const result = await flow.forward(llm, input, { autoParallel: false });
// Get execution plan information
const plan = flow.getExecutionPlan();
console.log(
`Will run ${plan.parallelGroups} parallel groups with max ${plan.maxParallelism} concurrent operations`,
);
```
### 2. Dynamic AI Context
Use different AI services for different nodes:
```typescript
flow
.execute("fastProcessor", mapping, { ai: speedAI })
.execute("powerfulAnalyzer", mapping, { ai: powerAI })
.execute("defaultProcessor", mapping); // Uses default AI from forward()
```
### 3. Batch Processing with Derive
```typescript
const batchFlow = flow<{ items: string[] }, { processedItems: string[] }>(
{
autoParallel: true,
batchSize: 3, // Process 3 items at a time
},
)
.derive("processedItems", "items", (item, index) => {
return `processed-${item}-${index}`;
}, { batchSize: 2 }); // Override batch size for this operation
```
### 4. Error Handling
```typescript
try {
const result = await flow.forward(llm, input);
} catch (error) {
console.error("Flow execution failed:", error);
}
```
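When a flow should fall back to a second model instead of failing outright, a small application-level helper is enough. This is plain TypeScript, not an AxFlow API; `fallbackLLM` below is an assumed second `ai()` instance:

```typescript
// Generic fallback: try the primary runner, use the fallback on any error.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch (err) {
    console.warn("Primary attempt failed, using fallback:", err);
    return await fallback();
  }
}

// Usage with a flow (llm / fallbackLLM assumed to be configured elsewhere):
// const result = await withFallback(
//   () => wf.forward(llm, input),
//   () => wf.forward(fallbackLLM, input),
// );
```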
### 5. Program Integration
AxFlow integrates with the wider Ax (DSPy for TypeScript) ecosystem:
```typescript
// Get signature
const signature = flow.getSignature();
// Set examples (if applicable)
flow.setExamples(examples);
// Get traces and usage
const traces = flow.getTraces();
const usage = flow.getUsage();
```
## Best Practices
### 1. Node Naming
Use descriptive names that clearly indicate the node's purpose:
```typescript
// ❌ Unclear
flow.node("proc1", signature);
// ✅ Clear
flow.node("documentSummarizer", signature);
flow.node("sentimentAnalyzer", signature);
```
### 2. State Management
Keep state flat and predictable:
```typescript
// ✅ Good - flat structure
flow.map((state) => ({
...state,
processedText: state.rawText.toLowerCase(),
timestamp: Date.now(),
}));
// ❌ Avoid - deep nesting
flow.map((state) => ({
data: {
processed: {
text: state.rawText.toLowerCase(),
},
},
}));
```
### 3. Error Prevention
Always define nodes before executing them:
```typescript
// ✅ Correct order
flow
.node("processor", signature)
.execute("processor", mapping);
// ❌ Will throw error
flow
.execute("processor", mapping) // Node not defined yet!
.node("processor", signature);
```
### 4. Loop Safety
Ensure loop conditions can change:
```typescript
// ✅ Safe - counter increments
flow
.map((state) => ({ ...state, counter: 0 }))
.while((state) => state.counter < 5)
.map((state) => ({ ...state, counter: state.counter + 1 })) // Condition changes
.execute("processor", mapping)
.endWhile();
// ❌ Infinite loop - condition never changes
flow
.while((state) => state.isProcessing) // This never changes!
.execute("processor", mapping)
.endWhile();
```
### 5. Parallel Design
Structure flows to maximize automatic parallelization:
```typescript
// ✅ Parallel-friendly - independent operations
flow
.execute("analyzer1", (state) => ({ text: state.input })) // Can run in parallel
.execute("analyzer2", (state) => ({ text: state.input })) // Can run in parallel
.execute("combiner", (state) => ({ // Waits for both
input1: state.analyzer1Result.output,
input2: state.analyzer2Result.output,
}));
// ❌ Sequential - unnecessary dependencies
flow
.execute("analyzer1", (state) => ({ text: state.input }))
.execute("analyzer2", (state) => ({
text: state.input,
context: state.analyzer1Result.output, // Creates dependency!
}));
```
## When to Use Each Feature
### Data shaping vs. AI calls: map() vs execute()
Use `map()` for synchronous/async data transformations without calling AI; use
`execute()` to invoke a previously defined AI node.
```typescript
const wf = flow<{ raw: string }, { answer: string }>()
// Preprocess user input (no AI call)
.map((s) => ({ cleaned: s.raw.trim().toLowerCase() }))
// Call an AI node you defined earlier (requires execute)
.node("qa", "question:string -> answer:string")
.execute("qa", (s) => ({ question: s.cleaned }))
.map((s) => ({ answer: s.qaResult.answer }));
```
### Conditional routing: branch()/when()/merge()
Use when you need different paths for different conditions, then converge.
```typescript
const wf = flow<{ query: string; expertMode: boolean }, { response: string }>()
.node("simple", "query:string -> response:string")
.node("expert", "query:string -> response:string")
.branch((s) => s.expertMode)
.when(true)
.execute("expert", (s) => ({ query: s.query }))
.when(false)
.execute("simple", (s) => ({ query: s.query }))
.merge()
.map((s) => ({
response: s.expertResult?.response ?? s.simpleResult?.response,
}));
```
### Iteration/retries: while()/endWhile()
Use to repeat work until a goal is met or a cap is hit.
```typescript
const wf = flow<{ draft: string }, { final: string }>()
.node("grader", "text:string -> score:number")
.node("improver", "text:string -> improved:string")
.map((s) => ({ current: s.draft, attempts: 0 }))
.while((s) => s.attempts < 3)
.execute("grader", (s) => ({ text: s.current }))
.branch((s) => s.graderResult.score >= 0.8)
.when(true)
  .map((s) => ({ ...s, attempts: 3 })) // Quality target met; exit the loop
  .when(false)
  .execute("improver", (s) => ({ text: s.current }))
  .map((s) => ({
    ...s,
    current: s.improverResult.improved,
    attempts: s.attempts + 1,
  }))
.merge()
.endWhile()
.map((s) => ({ final: s.current }));
```
### Extend node contracts: nx()
Use `nx()` to augment a node's inputs/outputs (e.g., add internal reasoning or
confidence) without changing the original signature text.
```typescript
const wf = flow<{ question: string }, { answer: string; confidence: number }>()
.nx("answerer", "question:string -> answer:string", {
appendOutputs: [{ name: "confidence", type: f.number("0-1") }],
})
.execute("answerer", (s) => ({ question: s.question }))
.map((s) => ({
answer: s.answererResult.answer,
confidence: s.answererResult.confidence,
}));
```
### Batch/array processing: derive()
Use to fan out work over arrays with built-in batching and merging.
```typescript
const wf = flow<{ items: string[] }, { processed: string[] }>({ batchSize: 3 })
.derive("processed", "items", (item) => item.toUpperCase(), { batchSize: 2 });
```
### Free speedups: auto-parallelization
Independent executes run in parallel automatically. Avoid creating unnecessary
dependencies.
```typescript
const wf = flow<{ text: string }, { combined: string }>()
.node("a", "text:string -> x:string")
.node("b", "text:string -> y:string")
.execute("a", (s) => ({ text: s.text }))
.execute("b", (s) => ({ text: s.text }))
.map((s) => ({ combined: `${s.aResult.x}|${s.bResult.y}` }));
```
### Multi-model strategy: dynamic AI context
Override AI per execute to route tasks to the best model.
```typescript
const wf = flow<{ text: string }, { out: string }>()
.node("fast", "text:string -> out:string")
.node("smart", "text:string -> out:string")
.execute("fast", (s) => ({ text: s.text }), { ai: ai({ name: "groq" }) })
.execute("smart", (s) => ({ text: s.text }), {
ai: ai({ name: "anthropic" }),
})
.map((s) => ({ out: s.smartResult?.out ?? s.fastResult.out }));
```
### Final typing and shape: returns()/r()
Use to lock in the exact output type and get full TypeScript inference.
```typescript
const wf = flow<{ input: string }>()
.map((s) => ({ upper: s.input.toUpperCase(), length: s.input.length }))
.returns((s) => ({ upper: s.upper, isLong: s.length > 20 }));
```
### Quality loops: label()/feedback()
Use to jump back to a label when a condition holds, with optional caps.
```typescript
const wf = flow<{ prompt: string }, { result: string }>()
  .node("gen", "prompt:string -> result:string, quality:number")
  .map((s) => ({ ...s, tries: 0 })) // Spread keeps `prompt` in state
  .label("retry")
  .map((s) => ({ ...s, tries: s.tries + 1 }))
  .execute("gen", (s) => ({ prompt: s.prompt }))
  .feedback((s) => s.genResult.quality < 0.9 && s.tries < 2, "retry")
  .map((s) => ({ result: s.genResult.result }));
```
### Parallel async transforms: map([...], { parallel: true })
Use to run multiple independent async transforms concurrently.
```typescript
const wf = flow<{ url: string }, { title: string; sentiment: string }>()
.map([
    async (s) => ({ ...s, html: await fetchHTML(s.url) }),
    async (s) => ({ ...s, meta: await fetchMetadata(s.url) }),
], { parallel: true })
.node("title", "html:string -> title:string")
.node("sent", "html:string -> sentiment:string")
.execute("title", (s) => ({ html: s.html }))
.execute("sent", (s) => ({ html: s.html }))
.returns((s) => ({
title: s.titleResult.title,
sentiment: s.sentResult.sentiment,
}));
```
## Examples
### Extended Node Patterns with `nx`
```typescript
import { f, flow } from "@ax-llm/ax";
// Chain-of-thought reasoning pattern
const reasoningFlow = flow<{ question: string }, { answer: string }>()
.nx("reasoner", "question:string -> answer:string", {
prependOutputs: [
{
name: "reasoning",
type: f.internal(f.string("Step-by-step reasoning")),
},
],
})
.execute("reasoner", (state) => ({ question: state.question }))
.map((state) => ({ answer: state.reasonerResult.answer }));
// Confidence scoring pattern
const confidenceFlow = flow<
{ input: string },
{ result: string; confidence: number }
>()
.nx("analyzer", "input:string -> result:string", {
appendOutputs: [
{ name: "confidence", type: f.number("Confidence score 0-1") },
],
})
.execute("analyzer", (state) => ({ input: state.input }))
.map((state) => ({
result: state.analyzerResult.result,
confidence: state.analyzerResult.confidence,
}));
// Contextual processing pattern
const contextualFlow = flow<
{ query: string; context?: string },
{ response: string }
>()
.nx("processor", "query:string -> response:string", {
appendInputs: [
{ name: "context", type: f.optional(f.string("Additional context")) },
],
})
.execute("processor", (state) => ({
query: state.query,
context: state.context,
}))
.map((state) => ({ response: state.processorResult.response }));
```
### Document Processing Pipeline
```typescript
const documentPipeline = flow<{ document: string }>()
.node("summarizer", "documentText:string -> summary:string")
.node("sentimentAnalyzer", "documentText:string -> sentiment:string")
.node("keywordExtractor", "documentText:string -> keywords:string[]")
// These run automatically in parallel
.execute("summarizer", (state) => ({ documentText: state.document }))
.execute("sentimentAnalyzer", (state) => ({ documentText: state.document }))
.execute("keywordExtractor", (state) => ({ documentText: state.document }))
// Use returns() for proper type inference
.returns((state) => ({
summary: state.summarizerResult.summary,
sentiment: state.sentimentAnalyzerResult.sentiment,
keywords: state.keywordExtractorResult.keywords,
}));
// TypeScript now knows the exact return type:
// { summary: string; sentiment: string; keywords: string[] }
const result = await documentPipeline.forward(llm, { document: "..." });
console.log(result.summary); // Fully typed
console.log(result.sentiment); // Fully typed
console.log(result.keywords); // Fully typed
```
### Quality-Driven Content Creation
```typescript
const contentCreator = flow<
{ topic: string; targetQuality: number },
{ finalContent: string; iterations: number }
>()
.node("writer", "topic:string -> content:string")
.node("qualityChecker", "content:string -> score:number, feedback:string")
.node("improver", "content:string, feedback:string -> improvedContent:string")
.map((state) => ({ ...state, currentContent: "", iteration: 0, bestScore: 0 }))
// Initial writing
.execute("writer", (state) => ({ topic: state.topic }))
.map((state) => ({
...state,
currentContent: state.writerResult.content,
iteration: 1,
}))
// Improvement loop
.while((state) =>
state.iteration < 5 && state.bestScore < state.targetQuality
)
.execute("qualityChecker", (state) => ({ content: state.currentContent }))
.branch((state) => state.qualityCheckerResult.score > state.bestScore)
.when(true)
.execute("improver", (state) => ({
content: state.currentContent,
feedback: state.qualityCheckerResult.feedback,
}))
.map((state) => ({
...state,
currentContent: state.improverResult.improvedContent,
bestScore: state.qualityCheckerResult.score,
iteration: state.iteration + 1,
}))
.when(false)
.map((state) => ({ ...state, iteration: 5 })) // Exit loop
.merge()
.endWhile()
.map((state) => ({
finalContent: state.currentContent,
iterations: state.iteration,
}));
```
### Async Data Enrichment Pipeline
```typescript
const enrichmentPipeline = flow<
{ userQuery: string },
{ finalResult: string; metadata: object }
>()
.node("analyzer", "enrichedData:string -> analysis:string")
// Parallel async data enrichment from multiple sources
.map([
async (state) => {
const userProfile = await fetchUserProfile(state.userQuery);
return { ...state, userProfile };
},
async (state) => {
const contextData = await fetchContextData(state.userQuery);
return { ...state, contextData };
},
async (state) => {
const historicalData = await fetchHistoricalData(state.userQuery);
return { ...state, historicalData };
},
], { parallel: true })
// Combine enriched data
.map(async (state) => {
const combinedData = await combineDataSources({
userProfile: state.userProfile,
context: state.contextData,
historical: state.historicalData,
});
return {
...state,
enrichedData: combinedData,
metadata: {
sources: ["userProfile", "contextData", "historicalData"],
timestamp: Date.now(),
},
};
})
// Process with AI
.execute("analyzer", (state) => ({ enrichedData: state.enrichedData }))
// Final async post-processing
.map(async (state) => {
const enhanced = await enhanceResult(state.analyzerResult.analysis);
return {
finalResult: enhanced,
metadata: state.metadata,
};
});
```
### Real-time Data Processing with Async Maps
```typescript
const realTimeProcessor = flow<
{ streamData: string[] },
{ processedResults: string[]; stats: object }
>()
// Async preprocessing of each item in parallel
.map(async (state) => {
const startTime = Date.now();
// Process each item in batches with async operations
const processedItems = await Promise.all(
state.streamData.map(async (item, index) => {
const enriched = await enrichDataItem(item);
const validated = await validateItem(enriched);
return { item: validated, index, timestamp: Date.now() };
}),
);
const processingTime = Date.now() - startTime;
return {
...state,
processedItems,
processingStats: {
totalItems: state.streamData.length,
processingTime,
itemsPerSecond: state.streamData.length / (processingTime / 1000),
},
};
})
// Parallel async quality checks
.map([
async (state) => {
const qualityScore = await calculateQualityScore(state.processedItems);
return { ...state, qualityScore };
},
async (state) => {
const anomalies = await detectAnomalies(state.processedItems);
return { ...state, anomalies };
},
async (state) => {
const trends = await analyzeTrends(state.processedItems);
return { ...state, trends };
},
], { parallel: true })
// Final aggregation
.map((state) => ({
processedResults: state.processedItems.map((item) => item.item),
stats: {
...state.processingStats,
qualityScore: state.qualityScore,
anomaliesFound: state.anomalies?.length || 0,
trendsDetected: state.trends?.length || 0,
},
}));
```
### Multi-Model Research System
```typescript
const researchSystem = flow<
{ query: string },
{ answer: string; sources: string[]; confidence: number }
>()
.node("queryGenerator", "researchQuestion:string -> searchQuery:string")
.node("retriever", "searchQuery:string -> retrievedDocument:string")
.node(
"answerGenerator",
"retrievedDocument:string, researchQuestion:string -> researchAnswer:string",
)
.execute("queryGenerator", (state) => ({ researchQuestion: state.query }))
.execute(
"retriever",
(state) => ({ searchQuery: state.queryGeneratorResult.searchQuery }),
)
.execute("answerGenerator", (state) => ({
retrievedDocument: state.retrieverResult.retrievedDocument,
researchQuestion: state.query,
}))
.map((state) => ({
answer: state.answerGeneratorResult.researchAnswer,
sources: [state.retrieverResult.retrievedDocument],
confidence: 0.85,
}));
```
## Troubleshooting
### Common Errors
1. **"Node 'nodeName' not found"**
- Ensure you call `.node()` before `.execute()`
2. **"endWhile() called without matching while()"**
- Every `.while()` needs a matching `.endWhile()`
3. **"when() called without matching branch()"**
- Every `.when()` needs to be inside a `.branch()` / `.merge()` block
4. **"merge() called without matching branch()"**
- Every `.branch()` needs a matching `.merge()`
5. **"Label 'labelName' not found"**
- Ensure the label exists before using it in `.feedback()`
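Errors 2–4 all come down to unbalanced builder calls. As a rough illustration of the rule (not Ax's actual validation code), the pairs behave like matching brackets:

```typescript
// Hypothetical bracket-style checker for builder call pairing:
// `endWhile` must close a `while`, and `merge` must close a `branch`.
const closes: Record<string, string> = { endWhile: "while", merge: "branch" };

function isBalanced(calls: string[]): boolean {
  const stack: string[] = [];
  for (const call of calls) {
    if (call === "while" || call === "branch") stack.push(call);
    else if (call in closes && stack.pop() !== closes[call]) return false;
  }
  return stack.length === 0;
}

console.log(isBalanced(["while", "branch", "merge", "endWhile"])); // true
console.log(isBalanced(["while", "merge"])); // false
```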
### Performance Issues
1. **Operations running sequentially instead of parallel**
- Check for unnecessary dependencies in your mappings
- Use `flow.getExecutionPlan()` to debug
2. **Memory issues with large datasets**
- Use `batchSize` option to control parallel execution
- Consider using `.derive()` for array processing
### Type Errors
1. **State property not found**
- Use `.map()` to ensure required properties exist
- Check the spelling of result field names (`{nodeName}Result`)
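For reference, the `{nodeName}Result` convention means each `.execute()` writes its output to state under a derived key. A minimal sketch of the naming rule:

```typescript
// Result field name derivation: node "writer" produces state.writerResult
const resultKey = (nodeName: string): string => `${nodeName}Result`;

console.log(resultKey("qualityChecker")); // "qualityCheckerResult"
```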
This documentation provides a comprehensive guide to AxFlow based on the actual
implementation and test cases. All examples have been verified against the test
suite to ensure accuracy.
================================================================================
# Telemetry Guide
# Source: TELEMETRY.md
# Observability and monitoring for Ax applications
# Telemetry Guide
**🎯 Goal**: Learn how to monitor, trace, and observe your AI applications with industry-standard OpenTelemetry integration.

**⏱️ Time to first results**: 5 minutes

**🔍 Value**: Understand performance, debug issues, and optimize costs with comprehensive observability
## 📋 Table of Contents
- [What is Telemetry in Ax?](#what-is-telemetry-in-ax)
- [🚀 5-Minute Quick Start](#-5-minute-quick-start) ← **Start here!**
- [📊 Metrics Overview](#-metrics-overview)
- [🔍 Tracing Overview](#-tracing-overview)
- [🎯 Common Observability Patterns](#-common-observability-patterns)
- [🏗️ Production Setup](#️-production-setup)
- [⚡ Advanced Configuration](#-advanced-configuration)
- [🛠️ Troubleshooting Guide](#️-troubleshooting-guide)
- [🎓 Best Practices](#-best-practices)
- [📖 Complete Examples](#-complete-examples)
- [🎯 Key Takeaways](#-key-takeaways)
---
## What is Telemetry in Ax?
Think of telemetry as **X-ray vision for your AI applications**. Instead of
guessing what's happening, you get:
- **Real-time metrics** on performance, costs, and usage
- **Distributed tracing** to follow requests through your entire AI pipeline
- **Automatic instrumentation** of all LLM operations, vector databases, and
function calls
- **Industry-standard OpenTelemetry** integration for any observability platform
- **Zero-configuration** setup that works out of the box
**Real example**: A production AI system that went from "it's slow sometimes" to
"we can see exactly which model calls are taking 3+ seconds and why."
---
## 🚀 5-Minute Quick Start
### Step 1: Basic Setup with Console Export
```typescript
import { ax, AxAI, f, AxAIOpenAIModel } from "@ax-llm/ax";
import { metrics, trace } from "@opentelemetry/api";
import {
BasicTracerProvider,
ConsoleSpanExporter,
SimpleSpanProcessor,
} from "@opentelemetry/sdk-trace-base";
import {
ConsoleMetricExporter,
MeterProvider,
PeriodicExportingMetricReader,
} from "@opentelemetry/sdk-metrics";
// Set up basic tracing
const tracerProvider = new BasicTracerProvider({
  spanProcessors: [new SimpleSpanProcessor(new ConsoleSpanExporter())],
});
trace.setGlobalTracerProvider(tracerProvider);
// Set up basic metrics
const meterProvider = new MeterProvider({
readers: [
new PeriodicExportingMetricReader({
exporter: new ConsoleMetricExporter(),
exportIntervalMillis: 5000,
}),
],
});
metrics.setGlobalMeterProvider(meterProvider);
// Get your tracer and meter
const tracer = trace.getTracer("my-ai-app");
const meter = metrics.getMeter("my-ai-app");
```
### Step 2: Create AI with Telemetry
```typescript
// Create AI instance with telemetry enabled
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini },
options: {
tracer,
meter,
debug: true, // Enable detailed logging
},
});
// Create a simple generator
const sentimentAnalyzer = ax`
reviewText:${f.string("Customer review")} ->
sentiment:${f.class(["positive", "negative", "neutral"], "Sentiment")}
`;
```
### Step 3: Run and Observe
```typescript
// This will automatically generate traces and metrics
const result = await sentimentAnalyzer.forward(ai, {
reviewText: "This product is amazing! I love it!",
});
console.log("Result:", result.sentiment);
```
**🎉 Congratulations!** You now have full observability. Check your console for:
- **Traces**: Complete request flow with timing and metadata
- **Metrics**: Performance counters, histograms, and gauges
- **Logs**: Detailed debug information
---
## 📊 Metrics Overview
Ax automatically tracks comprehensive metrics across all operations. Here's what
you get:
### 🤖 AI Service Metrics
**Request Metrics**
- `ax_llm_requests_total` - Total requests by service/model
- `ax_llm_request_duration_ms` - Request latency distribution
- `ax_llm_errors_total` - Error counts by type
- `ax_llm_error_rate` - Current error rate percentage
**Token Usage**
- `ax_llm_tokens_total` - Total tokens consumed
- `ax_llm_input_tokens_total` - Input/prompt tokens
- `ax_llm_output_tokens_total` - Output/completion tokens
- `ax_llm_thinking_budget_usage_total` - Thinking tokens used
**Cost & Performance**
- `ax_llm_estimated_cost_total` - Estimated costs in USD
- `ax_llm_request_size_bytes` - Request payload sizes
- `ax_llm_response_size_bytes` - Response payload sizes
- `ax_llm_context_window_usage_ratio` - Context window utilization
**Streaming & Functions**
- `ax_llm_streaming_requests_total` - Streaming request count
- `ax_llm_function_calls_total` - Function call counts
- `ax_llm_function_call_latency_ms` - Function call timing
### 🧠 AxGen Metrics
**Generation Flow**
- `ax_gen_generation_requests_total` - Total generation requests
- `ax_gen_generation_duration_ms` - End-to-end generation time
- `ax_gen_generation_errors_total` - Generation failures
**Multi-Step Processing**
- `ax_gen_multistep_generations_total` - Multi-step generations
- `ax_gen_steps_per_generation` - Steps taken per generation
- `ax_gen_max_steps_reached_total` - Max steps limit hits
**Error Correction**
- `ax_gen_validation_errors_total` - Validation failures
- `ax_gen_assertion_errors_total` - Assertion failures
- `ax_gen_error_correction_attempts` - Retry attempts
- `ax_gen_error_correction_success_total` - Successful corrections
**Function Integration**
- `ax_gen_functions_enabled_generations_total` - Function-enabled requests
- `ax_gen_function_call_steps_total` - Steps with function calls
- `ax_gen_functions_executed_per_generation` - Functions per generation
### 🔧 Optimizer Metrics
**Optimization Flow**
- `ax_optimizer_optimization_requests_total` - Total optimization requests
- `ax_optimizer_optimization_duration_ms` - End-to-end optimization time
- `ax_optimizer_optimization_errors_total` - Optimization failures
**Convergence Tracking**
- `ax_optimizer_convergence_rounds` - Rounds until convergence
- `ax_optimizer_convergence_score` - Current best score
- `ax_optimizer_convergence_improvement` - Score improvement from baseline
- `ax_optimizer_stagnation_rounds` - Rounds without improvement
- `ax_optimizer_early_stopping_total` - Early stopping events
**Resource Usage**
- `ax_optimizer_token_usage_total` - Total tokens used during optimization
- `ax_optimizer_cost_usage_total` - Total cost incurred
- `ax_optimizer_memory_usage_bytes` - Peak memory usage
- `ax_optimizer_duration_ms` - Optimization duration
**Teacher-Student Interactions**
- `ax_optimizer_teacher_student_usage_total` - Teacher-student interactions
- `ax_optimizer_teacher_student_latency_ms` - Interaction latency
- `ax_optimizer_teacher_student_score_improvement` - Score improvement from
teacher
**Checkpointing**
- `ax_optimizer_checkpoint_save_total` - Checkpoint saves
- `ax_optimizer_checkpoint_load_total` - Checkpoint loads
- `ax_optimizer_checkpoint_save_latency_ms` - Save operation latency
- `ax_optimizer_checkpoint_load_latency_ms` - Load operation latency
**Pareto Optimization**
- `ax_optimizer_pareto_optimizations_total` - Pareto optimization runs
- `ax_optimizer_pareto_front_size` - Size of Pareto frontier
- `ax_optimizer_pareto_hypervolume` - Hypervolume of Pareto frontier
- `ax_optimizer_pareto_solutions_generated` - Solutions generated
**Program Complexity**
- `ax_optimizer_program_input_fields` - Input fields in optimized program
- `ax_optimizer_program_output_fields` - Output fields in optimized program
- `ax_optimizer_examples_count` - Training examples used
- `ax_optimizer_validation_set_size` - Validation set size
### 📊 Database Metrics
**Vector Operations**
- `db_operations_total` - Total DB operations
- `db_query_duration_ms` - Query latency
- `db_upsert_duration_ms` - Upsert latency
- `db_vector_dimensions` - Vector dimensions
### 📈 Example Metrics Output
```json
{
"name": "ax_llm_request_duration_ms",
"description": "Duration of LLM requests in milliseconds",
"unit": "ms",
"data": {
"resourceMetrics": [{
"scopeMetrics": [{
"metrics": [{
"name": "ax_llm_request_duration_ms",
"histogram": {
"dataPoints": [{
"attributes": {
"operation": "chat",
"ai_service": "openai",
"model": "gpt-4o-mini"
},
"sum": 2450.5,
"count": 10,
"bounds": [1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000],
"bucketCounts": [0, 0, 0, 0, 0, 0, 8, 2, 0, 0, 0, 0]
}]
}
}]
}]
}]
}
}
```
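To read this output: each recorded value falls into the first bucket whose upper bound is greater than or equal to it, with one final overflow bucket above the last bound. A quick sketch of the bucketing rule:

```typescript
// Returns the bucket index for a value given explicit histogram bounds.
// Bucket i covers (bounds[i-1], bounds[i]]; index bounds.length is the overflow bucket.
function bucketIndex(value: number, bounds: number[]): number {
  const i = bounds.findIndex((upper) => value <= upper);
  return i === -1 ? bounds.length : i;
}

const bounds = [1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000];
console.log(bucketIndex(245, bounds)); // 6 -> the (100, 250] bucket
console.log(bucketIndex(9999, bounds)); // 11 -> overflow bucket
```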
---
## 🔍 Tracing Overview
Ax provides comprehensive distributed tracing following OpenTelemetry standards
and the new `gen_ai` semantic conventions.
### 🎯 Trace Structure
**Root Spans**
- `Chat Request` - Complete chat completion flow
- `AI Embed Request` - Embedding generation
- `AxGen` - AxGen generation pipeline
- `DB Query Request` - Vector database operations
**Child Spans**
- `API Call` - HTTP requests to AI providers
- `Function Call` - Tool/function execution
- `Validation` - Response validation
- `Extraction` - Value extraction from responses
### 📋 Standard Attributes
**LLM Attributes** (`gen_ai.*`)
```typescript
{
'gen_ai.system': 'openai',
'gen_ai.operation.name': 'chat',
'gen_ai.request.model': 'gpt-4o-mini',
'gen_ai.request.max_tokens': 500,
'gen_ai.request.temperature': 0.1,
'gen_ai.request.llm_is_streaming': false,
'gen_ai.usage.input_tokens': 150,
'gen_ai.usage.output_tokens': 200,
'gen_ai.usage.total_tokens': 350
}
```
**Database Attributes** (`db.*`)
```typescript
{
'db.system': 'weaviate',
'db.operation.name': 'query',
'db.table': 'documents',
'db.namespace': 'default',
'db.vector.query.top_k': 10
}
```
**Custom Ax Attributes**
```typescript
{
'signature': 'JSON representation of signature',
'examples': 'JSON representation of examples',
'provided_functions': 'function1,function2',
'thinking_token_budget': 'low',
'show_thoughts': true,
'max_steps': 5,
'max_retries': 3
}
```
### 📊 Standard Events
**Message Events**
- `gen_ai.user.message` - User input content
- `gen_ai.system.message` - System prompt content
- `gen_ai.assistant.message` - Assistant response content
- `gen_ai.tool.message` - Function call results
**Usage Events**
- `gen_ai.usage` - Token usage information
- `gen_ai.choice` - Response choices
### 📈 Example Trace Output
```json
{
"traceId": "ddc7405e9848c8c884e53b823e120845",
"name": "Chat Request",
"id": "d376daad21da7a3c",
"kind": "SERVER",
"timestamp": 1716622997025000,
"duration": 14190456.542,
"attributes": {
"gen_ai.system": "openai",
"gen_ai.operation.name": "chat",
"gen_ai.request.model": "gpt-4o-mini",
"gen_ai.request.max_tokens": 500,
"gen_ai.request.temperature": 0.1,
"gen_ai.request.llm_is_streaming": false,
"gen_ai.usage.input_tokens": 150,
"gen_ai.usage.output_tokens": 200,
"gen_ai.usage.total_tokens": 350
},
"events": [
{
"name": "gen_ai.user.message",
"timestamp": 1716622997025000,
"attributes": {
"content": "What is the capital of France?"
}
},
{
"name": "gen_ai.assistant.message",
"timestamp": 1716622997025000,
"attributes": {
"content": "The capital of France is Paris."
}
}
]
}
```
### 🛠️ Function/Tool Call Tracing
Ax traces each function/tool invocation as a child span of the active generation
(AxGen) span. This works for both native function calling and signature
prompt-mode tool routing.
What you get automatically when a tracer is configured:
- A child span per tool call with clear parent-child relationships
- Standardized attributes and events
- Content redaction that respects `excludeContentFromTrace`
Span naming and attributes
```typescript
// Span name: "Tool: <toolName>"
// Standard attributes (examples)
{
  'tool.name': '<toolName>',
  'tool.mode': 'native' | 'prompt',
  'function.id': '<functionId>',
  'session.id': '<sessionId>'
  // Optionally, inherited or added context like gen_ai.system/model from parent
}
```
Events
```typescript
// On success
{
name: 'gen_ai.tool.message',
attributes: {
args?: string, // omitted when excludeContentFromTrace=true
result?: string // omitted when excludeContentFromTrace=true
}
}
// On error
{
  name: 'function.error',
  attributes: {
    name: '<errorName>',
    message: '<errorMessage>',
    fixing_instructions?: '<fixingInstructions>' // when available
  }
}
```
Prompt-mode vs native
- Native: `tool.mode='native'` spans are created when the provider returns a
function call.
- Prompt-mode: `tool.mode='prompt'` spans are created when signature-injected
tool fields trigger execution.
Developer guidance
- Do not pass tracer or meter into your tool functions. Ax starts spans around
your tool handler and relies on OpenTelemetry context propagation.
- If you need correlation inside your function, use `trace.getActiveSpan()` or
  read the `extra.sessionId`/`extra.traceId` parameters your handler already
  receives.
Modern example with factory functions
```ts
import { ai, ax } from "@ax-llm/ax";
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const summarize = ax(
'documentText:string "Text to summarize" -> summaryText:string "Summary"',
);
// Provide tools; Ax will create a child span per call
const tools = [
{
name: "fetchUrl",
description: "Fetches content from a URL",
parameters: {
type: "object",
properties: {
url: { type: "string", description: "The URL to fetch" },
},
required: ["url"],
},
async func(args) {
const res = await fetch(args.url);
return await res.text();
},
},
];
const result = await summarize.forward(llm, { documentText: "..." }, {
functions: tools,
functionCallMode: "auto", // native when supported, prompt-mode otherwise
});
```
---
## 🎯 Common Observability Patterns
### 1. Performance Monitoring
```typescript
// Monitor latency percentiles
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: {
tracer,
meter,
// Custom latency thresholds
timeout: 30000,
},
});
// Set up alerts on high latency
// P95 > 5s, P99 > 10s
```
### 2. Cost Tracking
```typescript
// Track costs by model and operation
const costOptimizer = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" }, // Cheaper model
options: { tracer, meter },
});
// Monitor estimated costs
// Alert when daily spend > $100
```
### 3. Error Rate Monitoring
```typescript
// Track error rates by service
const reliableAI = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: {
tracer,
meter,
// Retry configuration
maxRetries: 3,
retryDelay: 1000,
},
});
// Set up alerts on error rate > 5%
```
### 4. Function Call Monitoring
```typescript
// Monitor function call success rates
const functionAI = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: { tracer, meter },
});
// Track function call latency and success rates
// Alert on function call failures
```
### 5. Streaming Performance
```typescript
// Monitor streaming response times
const streamingAI = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { stream: true },
options: { tracer, meter },
});
// Track time to first token
// Monitor streaming completion rates
```
---
## 🏗️ Production Setup
### 1. Jaeger Tracing Setup
```typescript
import { trace } from "@opentelemetry/api";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import {
  BasicTracerProvider,
  BatchSpanProcessor,
} from "@opentelemetry/sdk-trace-base";
import {
  defaultResource,
  resourceFromAttributes,
} from "@opentelemetry/resources";
// Start Jaeger locally
// docker run --rm --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/jaeger:2.6.0
const otlpExporter = new OTLPTraceExporter({
url: "http://localhost:4318/v1/traces",
});
const provider = new BasicTracerProvider({
spanProcessors: [new BatchSpanProcessor(otlpExporter)],
resource: defaultResource().merge(
resourceFromAttributes({
"service.name": "my-ai-app",
"service.version": "1.0.0",
}),
),
});
trace.setGlobalTracerProvider(provider);
```
### 2. Prometheus Metrics Setup
```typescript
import { PrometheusExporter } from "@opentelemetry/exporter-prometheus";
import { MeterProvider } from "@opentelemetry/sdk-metrics";
import { metrics } from "@opentelemetry/api";
const prometheusExporter = new PrometheusExporter({
  port: 9464,
  endpoint: "/metrics",
});
// PrometheusExporter is a pull-based MetricReader, so pass it to the
// provider directly rather than wrapping it in a PeriodicExportingMetricReader
const meterProvider = new MeterProvider({
  readers: [prometheusExporter],
});
metrics.setGlobalMeterProvider(meterProvider);
```
### 3. Cloud Observability Setup
```typescript
// For AWS X-Ray, Google Cloud Trace, Azure Monitor, etc.
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";
const cloudExporter = new OTLPTraceExporter({
url: "https://your-observability-endpoint.com/v1/traces",
headers: {
"Authorization": `Bearer ${process.env.OBSERVABILITY_API_KEY}`,
},
});
const cloudMetricsExporter = new OTLPMetricExporter({
url: "https://your-observability-endpoint.com/v1/metrics",
headers: {
"Authorization": `Bearer ${process.env.OBSERVABILITY_API_KEY}`,
},
});
```
### 4. Environment-Specific Configuration
```typescript
// config/telemetry.ts
import { trace } from "@opentelemetry/api";
import {
  BasicTracerProvider,
  BatchSpanProcessor,
  ConsoleSpanExporter,
  SimpleSpanProcessor,
} from "@opentelemetry/sdk-trace-base";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
export const setupTelemetry = (environment: "development" | "production") => {
if (environment === "development") {
// Console export for local development
const consoleExporter = new ConsoleSpanExporter();
const provider = new BasicTracerProvider({
spanProcessors: [new SimpleSpanProcessor(consoleExporter)],
});
trace.setGlobalTracerProvider(provider);
} else {
// Production setup with sampling and batching
const otlpExporter = new OTLPTraceExporter({
url: process.env.OTLP_ENDPOINT!,
});
const provider = new BasicTracerProvider({
spanProcessors: [
new BatchSpanProcessor(otlpExporter, {
maxQueueSize: 2048,
maxExportBatchSize: 512,
scheduledDelayMillis: 5000,
}),
],
});
trace.setGlobalTracerProvider(provider);
}
};
```
---
## ⚡ Advanced Configuration
### 1. Custom Metrics
```typescript
// Create custom business metrics
const customMeter = metrics.getMeter("business-metrics");
const customCounter = customMeter.createCounter("business_operations_total", {
description: "Total business operations",
});
// Record custom metrics
customCounter.add(1, {
operation_type: "sentiment_analysis",
customer_tier: "premium",
});
```
### 2. Custom Spans
```typescript
// Create custom spans for business logic
const tracer = trace.getTracer("business-logic");
const processOrder = async (orderId: string) => {
return await tracer.startActiveSpan(
"Process Order",
{
attributes: {
"order.id": orderId,
"business.operation": "order_processing",
},
},
async (span) => {
try {
// Your business logic here
const result = await ai.chat({/* ... */});
span.setAttributes({
"order.status": "completed",
"order.value": result.total,
});
return result;
} catch (error) {
span.recordException(error);
span.setAttributes({ "order.status": "failed" });
throw error;
} finally {
span.end();
}
},
);
};
```
### 3. Sampling Configuration
```typescript
// Configure sampling for high-traffic applications
import {
ParentBasedSampler,
TraceIdRatioBasedSampler,
} from "@opentelemetry/sdk-trace-base";
const sampler = new ParentBasedSampler({
root: new TraceIdRatioBasedSampler(0.1), // Sample 10% of traces
});
const provider = new BasicTracerProvider({
sampler,
spanProcessors: [new BatchSpanProcessor(otlpExporter)],
});
```
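The idea behind ratio-based sampling is a deterministic decision derived from the trace ID, so every span in a trace gets the same verdict. A rough, hypothetical sketch of the concept (the SDK's actual algorithm differs in detail):

```typescript
// Hypothetical sketch: map the leading bytes of the trace ID to a fraction
// in [0, 1) and keep the trace when that fraction falls below the ratio.
function shouldSample(traceId: string, ratio: number): boolean {
  const fraction = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  return fraction < ratio;
}

console.log(shouldSample("00000000aaaaaaaaaaaaaaaaaaaaaaaa", 0.1)); // true
console.log(shouldSample("ffffffffaaaaaaaaaaaaaaaaaaaaaaaa", 0.1)); // false
```

Because the decision depends only on the trace ID, child spans sampled in other services reach the same conclusion without coordination.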
### 4. Metrics Configuration
```typescript
// Configure metrics collection
import {
axUpdateMetricsConfig,
axUpdateOptimizerMetricsConfig,
} from "@ax-llm/ax";
// Configure DSPy metrics
axUpdateMetricsConfig({
enabled: true,
enabledCategories: [
"generation",
"streaming",
"functions",
"errors",
"performance",
],
maxLabelLength: 100,
samplingRate: 1.0, // Collect all metrics
});
// Configure optimizer metrics
axUpdateOptimizerMetricsConfig({
enabled: true,
enabledCategories: [
"optimization",
"convergence",
"resource_usage",
"teacher_student",
"checkpointing",
"pareto",
],
maxLabelLength: 100,
samplingRate: 1.0,
});
```
### 5. Optimizer Metrics Usage
```typescript
// Optimizer metrics are automatically collected when using optimizers
import { AxBootstrapFewShot } from "@ax-llm/ax";
const optimizer = new AxBootstrapFewShot({
studentAI: ai,
examples: trainingExamples,
validationSet: validationExamples,
targetScore: 0.9,
verbose: true,
options: {
maxRounds: 5,
},
});
// Metrics are automatically recorded during optimization
const result = await optimizer.compile(program, metricFn);
// Check optimization metrics
console.log("Optimization duration:", result.stats.resourceUsage.totalTime);
console.log("Total tokens used:", result.stats.resourceUsage.totalTokens);
console.log("Convergence info:", result.stats.convergenceInfo);
```
### 6. Global Telemetry Setup
```typescript
// Set up global telemetry for all Ax operations
import { axGlobals } from "@ax-llm/ax";
// Global tracer
axGlobals.tracer = trace.getTracer("global-ax-tracer");
// Global meter
axGlobals.meter = metrics.getMeter("global-ax-meter");
// Now all Ax operations will use these by default
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
// No need to specify tracer/meter - uses globals
});
```
### 7. Custom Labels for Metrics
Add custom labels to metrics for better filtering and grouping in your observability platform. Labels are merged from three sources (in order of precedence):
1. **Global labels** - Apply to all operations
2. **AI service labels** - Apply to all operations using that service
3. **Per-call labels** - Apply to a specific generation call
```typescript
import { axGlobals, AxAI, ax } from "@ax-llm/ax";
// 1. Global custom labels (lowest precedence)
axGlobals.customLabels = {
environment: "production",
service: "recommendation-engine",
};
// 2. AI service level labels
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: {
customLabels: {
team: "ml-ops",
cost_center: "ai-platform",
},
},
});
// 3. Per-call labels (highest precedence)
const gen = ax("question:string -> answer:string");
const result = await gen.forward(ai, { question: "..." }, {
customLabels: {
feature: "sentiment-analysis",
experiment_id: "exp-123",
},
});
```
**Resulting labels on metrics:**
```json
{
"environment": "production",
"service": "recommendation-engine",
"team": "ml-ops",
"cost_center": "ai-platform",
"feature": "sentiment-analysis",
"experiment_id": "exp-123"
}
```
**Use cases for custom labels:**
- **Cost allocation**: Track costs by team, project, or feature
- **A/B testing**: Tag generations with experiment IDs
- **Debugging**: Add request IDs or user context
- **Multi-tenancy**: Track usage by tenant or customer
- **Feature flags**: Monitor rollout of new capabilities
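The precedence order described above can be pictured as a plain object spread, with later sources overriding earlier ones:

```typescript
// Later spreads win: per-call labels override service labels,
// which override global labels.
const globalLabels = { environment: "production", team: "platform" };
const serviceLabels = { team: "ml-ops" };
const callLabels = { experiment_id: "exp-123" };

const merged = { ...globalLabels, ...serviceLabels, ...callLabels };
console.log(merged); // { environment: "production", team: "ml-ops", experiment_id: "exp-123" }
```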
---
## 🛠️ Troubleshooting Guide
### Common Issues
**1. No traces appearing**
```typescript
// Check if tracer is properly configured
console.log("Tracer:", trace.getTracer("test"));
console.log("Provider:", trace.getTracerProvider());
// Ensure spans are being created
const span = tracer.startSpan("test");
span.end();
```
**2. Metrics not updating**
```typescript
// Check meter configuration
console.log("Meter:", metrics.getMeter("test"));
console.log("Provider:", metrics.getMeterProvider());
// Verify metric collection
const testCounter = meter.createCounter("test_counter");
testCounter.add(1);
```
**3. High memory usage**
```typescript
// Reduce metric cardinality
axUpdateMetricsConfig({
maxLabelLength: 50, // Shorter labels
samplingRate: 0.1, // Sample 10% of metrics
});
// Use batch processing for spans
const batchProcessor = new BatchSpanProcessor(exporter, {
maxQueueSize: 1024, // Smaller queue
maxExportBatchSize: 256, // Smaller batches
});
```
**4. Slow performance**
```typescript
// Use async exporters
const asyncExporter = new OTLPTraceExporter({
url: "http://localhost:4318/v1/traces",
timeoutMillis: 30000,
});
// Configure appropriate sampling
const sampler = new TraceIdRatioBasedSampler(0.01); // Sample 1%
```
### Debug Mode
```typescript
// Enable debug mode for detailed logging
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: {
debug: true, // Detailed logging
tracer,
meter,
},
});
// Check debug output for telemetry information
```
---
## 🎓 Best Practices
### 1. Naming Conventions
```typescript
// Use consistent naming for tracers and meters
const tracer = trace.getTracer("my-app.ai-service");
const meter = metrics.getMeter("my-app.ai-service");
// Use descriptive span names
const span = tracer.startSpan("Sentiment Analysis Request");
```
### 2. Attribute Management
```typescript
// Use standard attributes when possible
span.setAttributes({
"gen_ai.system": "openai",
"gen_ai.operation.name": "chat",
"gen_ai.request.model": "gpt-4o-mini",
});
// Add business context
span.setAttributes({
"business.customer_id": customerId,
"business.operation_type": "sentiment_analysis",
});
```
### 3. Error Handling
```typescript
// Always record exceptions in spans
try {
const result = await ai.chat(request);
return result;
} catch (error) {
span.recordException(error);
span.setAttributes({ "error.type": error.name });
throw error;
} finally {
span.end();
}
```
### 4. Performance Optimization
```typescript
// Use batch processing for high-volume applications
const batchProcessor = new BatchSpanProcessor(exporter, {
maxQueueSize: 2048,
maxExportBatchSize: 512,
scheduledDelayMillis: 5000,
});
// Configure appropriate sampling
const sampler = new ParentBasedSampler({
root: new TraceIdRatioBasedSampler(0.1), // 10% sampling
});
```
### 5. Security Considerations
```typescript
// Exclude sensitive content from traces
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: {
excludeContentFromTrace: true, // Don't log prompt content
tracer,
},
});
// Use secure headers for cloud exporters
const secureExporter = new OTLPTraceExporter({
url: process.env.OTLP_ENDPOINT!,
headers: {
"Authorization": `Bearer ${process.env.API_KEY}`,
},
});
```
---
## 📖 Complete Examples
### 1. Full Production Setup
```typescript
// examples/production-telemetry.ts
import { ax, AxAI, f } from "@ax-llm/ax";
import { metrics, trace } from "@opentelemetry/api";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";
import {
  BasicTracerProvider,
  BatchSpanProcessor,
} from "@opentelemetry/sdk-trace-base";
import {
  MeterProvider,
  PeriodicExportingMetricReader,
} from "@opentelemetry/sdk-metrics";
// Production telemetry setup
const setupProductionTelemetry = () => {
// Tracing setup
const traceExporter = new OTLPTraceExporter({
url: process.env.OTLP_TRACE_ENDPOINT!,
headers: { "Authorization": `Bearer ${process.env.OTLP_API_KEY}` },
});
const traceProvider = new BasicTracerProvider({
spanProcessors: [new BatchSpanProcessor(traceExporter)],
});
trace.setGlobalTracerProvider(traceProvider);
// Metrics setup
const metricExporter = new OTLPMetricExporter({
url: process.env.OTLP_METRIC_ENDPOINT!,
headers: { "Authorization": `Bearer ${process.env.OTLP_API_KEY}` },
});
const meterProvider = new MeterProvider({
readers: [
new PeriodicExportingMetricReader({
exporter: metricExporter,
exportIntervalMillis: 10000,
}),
],
});
metrics.setGlobalMeterProvider(meterProvider);
};
// Initialize telemetry
setupProductionTelemetry();
// Create AI with telemetry
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" },
options: {
tracer: trace.getTracer("production-ai"),
meter: metrics.getMeter("production-ai"),
debug: process.env.NODE_ENV === "development",
},
});
// Create generator
const sentimentAnalyzer = ax`
reviewText:${f.string("Customer review")} ->
sentiment:${f.class(["positive", "negative", "neutral"], "Sentiment")},
confidence:${f.number("Confidence score 0-1")}
`;
// Usage with full observability
export const analyzeSentiment = async (review: string) => {
const result = await sentimentAnalyzer.forward(ai, { reviewText: review });
return result;
};
```
### 2. Multi-Service Tracing
```typescript
// examples/multi-service-tracing.ts
import { AxAI, AxFlow } from "@ax-llm/ax";
import { trace } from "@opentelemetry/api";
const tracer = trace.getTracer("multi-service");
// Create AI services
const fastAI = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o-mini" },
options: { tracer },
});
const powerfulAI = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: "gpt-4o" },
options: { tracer },
});
// Create multi-service workflow
const documentProcessor = new AxFlow<
{ document: string },
{ summary: string; analysis: string }
>()
.n("summarizer", "documentText:string -> summary:string")
.n("analyzer", "documentText:string -> analysis:string")
.e("summarizer", (s) => ({ documentText: s.document }), { ai: fastAI })
.e("analyzer", (s) => ({ documentText: s.document }), { ai: powerfulAI })
.m((s) => ({
summary: s.summarizerResult.summary,
analysis: s.analyzerResult.analysis,
}));
// Each step gets its own span with proper parent-child relationships
export const processDocument = async (document: string) => {
return await documentProcessor.forward(fastAI, { document });
};
```
### 3. Custom Business Metrics
```typescript
// examples/custom-business-metrics.ts
import { ax, AxAI, f } from "@ax-llm/ax";
import { metrics } from "@opentelemetry/api";
const meter = metrics.getMeter("business-metrics");
// Create custom business metrics
const customerSatisfactionGauge = meter.createGauge(
"customer_satisfaction_score",
{
description: "Customer satisfaction score",
},
);
const orderProcessingHistogram = meter.createHistogram(
"order_processing_duration_ms",
{
description: "Order processing time",
unit: "ms",
},
);
const ai = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
options: { meter },
});
const orderAnalyzer = ax`
orderText:${f.string("Order description")} ->
category:${f.class(["urgent", "normal", "low"], "Priority")},
estimatedTime:${f.number("Estimated processing time in hours")}
`;
export const processOrder = async (orderText: string) => {
const startTime = performance.now();
try {
const result = await orderAnalyzer.forward(ai, { orderText });
// Record business metrics
const processingTime = performance.now() - startTime;
orderProcessingHistogram.record(processingTime, {
category: result.category,
});
// Update satisfaction score based on processing time
const satisfactionScore = processingTime < 1000 ? 0.9 : 0.7;
customerSatisfactionGauge.record(satisfactionScore, {
order_type: result.category,
});
return result;
} catch (error) {
// Record error metrics
customerSatisfactionGauge.record(0.0, {
order_type: "error",
});
throw error;
}
};
```
---
## 🎯 Key Takeaways
### ✅ What You've Learned
1. **Complete Observability**: Ax provides comprehensive metrics and tracing out
of the box
2. **Industry Standards**: Uses OpenTelemetry and `gen_ai` semantic conventions
3. **Zero Configuration**: Works immediately with minimal setup
4. **Production Ready**: Scales from development to enterprise environments
5. **Cost Optimization**: Track usage and costs to optimize spending
### 🚀 Next Steps
1. **Start Simple**: Begin with console export for development
2. **Add Production**: Set up cloud observability for production
3. **Custom Metrics**: Add business-specific metrics
4. **Alerting**: Set up alerts on key metrics
5. **Optimization**: Use data to optimize performance and costs
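Step 5 often starts with simply mapping token usage to dollars. A minimal sketch, assuming a usage object with prompt/completion token counts; `estimateCostUSD` and the per-1K prices are illustrative placeholders, not real provider rates:

```typescript
// Hypothetical cost estimator for optimization work. The prices below are
// placeholders; substitute your provider's actual per-1K-token rates.
type Usage = { promptTokens: number; completionTokens: number };

const PRICE_PER_1K_USD = { prompt: 0.00015, completion: 0.0006 }; // placeholder rates

const estimateCostUSD = ({ promptTokens, completionTokens }: Usage): number =>
  (promptTokens / 1000) * PRICE_PER_1K_USD.prompt +
  (completionTokens / 1000) * PRICE_PER_1K_USD.completion;

// A value like this could be recorded as an attribute on your custom metrics
const cost = estimateCostUSD({ promptTokens: 2000, completionTokens: 500 });
console.log(cost.toFixed(4));
```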
### 📚 Resources
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [Gen AI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai/)
- [Ax Examples](https://github.com/ax-llm/ax/tree/main/src/examples)
- [Telemetry Example](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/telemetry.ts)
- [Metrics Export Example](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/metrics-export.ts)
### 🎉 You're Ready!
You now have the knowledge to build observable, production-ready AI applications
with Ax. Start with the quick setup, add production telemetry, and watch your AI
systems become transparent and optimizable!
---
_Need help? Check out the [Ax documentation](https://ax-llm.com) or join our
[community](https://github.com/ax-llm/ax/discussions)._
================================================================================
# AxRAG Guide
# Source: AXRAG.md
# Advanced RAG with multi-hop retrieval and self-healing quality loops
# Advanced RAG with AxFlow: `axRAG`
**`axRAG`** is a powerful, production-ready RAG (Retrieval-Augmented Generation) implementation built on AxFlow that provides advanced multi-hop retrieval, self-healing quality loops, and intelligent query refinement.
## Key Features
- **🔍 Multi-hop Retrieval**: Iteratively refines queries and accumulates context across multiple retrieval rounds
- **🧠 Intelligent Query Generation**: AI-powered query expansion and refinement based on previous context
- **🔄 Self-healing Quality Loops**: Automatically improves answers through quality assessment and iterative healing
- **⚡ Parallel Sub-query Processing**: Breaks down complex questions into parallel sub-queries for comprehensive coverage
- **🎯 Gap Analysis**: Identifies missing information and generates focused follow-up queries
- **🏥 Answer Healing**: Retrieves additional context to address quality issues and improve final answers
- **📊 Configurable Quality Thresholds**: Fine-tune performance vs. thoroughness trade-offs
- **🐛 Debug Mode**: Built-in logging to visualize the entire RAG pipeline execution
## Basic Usage
```typescript
import { ai, axRAG } from "@ax-llm/ax";
const llm = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY,
});
// Your vector database query function
const queryVectorDB = async (query: string): Promise<string> => {
// Connect to your vector database (Pinecone, Weaviate, etc.)
// Return relevant context for the query
return await yourVectorDB.query(query);
};
// Create a powerful RAG pipeline
const rag = axRAG(queryVectorDB, {
maxHops: 3, // Maximum retrieval iterations
qualityThreshold: 0.8, // Quality score threshold (0-1)
maxIterations: 2, // Max parallel sub-query iterations
qualityTarget: 0.85, // Target quality for healing loops
debug: true // Enable detailed logging
});
// Execute RAG with complex question
const result = await rag.forward(llm, {
originalQuestion: "How do machine learning algorithms impact privacy in financial services?"
});
console.log("Answer:", result.finalAnswer);
console.log("Quality:", result.qualityAchieved);
console.log("Sources:", result.retrievedContexts.length);
console.log("Hops:", result.totalHops);
console.log("Healing attempts:", result.healingAttempts);
```
## Advanced Configuration
```typescript
// Production-ready RAG with full configuration
const advancedRAG = axRAG(queryVectorDB, {
// Multi-hop retrieval settings
maxHops: 4, // More thorough retrieval
qualityThreshold: 0.75, // Lower threshold for faster execution
// Parallel processing settings
maxIterations: 3, // More sub-query iterations
// Self-healing settings
qualityTarget: 0.9, // Higher quality target
disableQualityHealing: false, // Enable healing loops
// Debug and monitoring
debug: true, // Detailed execution logging
logger: customLogger, // Custom logging function
});
// Execute with complex research query
const result = await advancedRAG.forward(llm, {
originalQuestion: "What are the latest developments in quantum computing for cryptography, including both opportunities and security risks?"
});
```
## RAG Pipeline Architecture
The `axRAG` implementation uses a sophisticated 4-phase approach:
**Phase 1: Multi-hop Context Retrieval**
- Generates intelligent search queries based on the original question
- Iteratively retrieves and contextualizes information
- Assesses completeness and refines queries for subsequent hops
- Accumulates comprehensive context across multiple retrieval rounds
**Phase 2: Parallel Sub-query Processing**
- Decomposes complex questions into focused sub-queries
- Executes parallel retrieval for comprehensive coverage
- Synthesizes evidence from multiple information sources
- Analyzes gaps and determines need for additional information
**Phase 3: Answer Generation**
- Generates comprehensive answers using accumulated context
- Leverages all retrieved information and synthesized evidence
- Produces initial high-quality responses
**Phase 4: Self-healing Quality Loops**
- Validates answer quality against configurable thresholds
- Identifies specific issues and areas for improvement
- Retrieves targeted healing context to address deficiencies
- Iteratively improves answers until quality targets are met
## AxFlow Pipeline Implementation
The `axRAG` implementation showcases the power of **AxFlow** to build complex, real-world LLM-powered pipelines that solve sophisticated problems. Below is the commented AxFlow pipeline code that demonstrates how intricate multi-hop RAG systems can be elegantly constructed using AxFlow's declarative approach:
```typescript
export const axRAG = (
queryFn: (query: string) => Promise<string>,
options?: {
maxHops?: number;
qualityThreshold?: number;
maxIterations?: number;
qualityTarget?: number;
disableQualityHealing?: boolean;
logger?: AxFlowLoggerFunction;
debug?: boolean;
}
) => {
// Extract configuration with sensible defaults
const maxHops = options?.maxHops ?? 3;
const qualityThreshold = options?.qualityThreshold ?? 0.8;
const maxIterations = options?.maxIterations ?? 2;
const qualityTarget = options?.qualityTarget ?? 0.85;
const disableQualityHealing = options?.disableQualityHealing ?? false;
return (
new AxFlow<
{ originalQuestion: string },
{
finalAnswer: string;
totalHops: number;
retrievedContexts: string[];
iterationCount: number;
healingAttempts: number;
qualityAchieved: number;
}
>({
logger: options?.logger,
debug: options?.debug,
})
// 🏗️ STEP 1: Define AI-powered processing nodes
// Each node represents a specialized AI task with typed inputs/outputs
.node(
'queryGenerator',
'originalQuestion:string, previousContext?:string -> searchQuery:string, queryReasoning:string'
)
.node(
'contextualizer',
'retrievedDocument:string, accumulatedContext?:string -> enhancedContext:string'
)
.node(
'qualityAssessor',
'currentContext:string, originalQuestion:string -> completenessScore:number, missingAspects:string[]'
)
.node(
'questionDecomposer',
'complexQuestion:string -> subQuestions:string[], decompositionReason:string'
)
.node(
'evidenceSynthesizer',
'collectedEvidence:string[], originalQuestion:string -> synthesizedEvidence:string, evidenceGaps:string[]'
)
.node(
'gapAnalyzer',
'synthesizedEvidence:string, evidenceGaps:string[], originalQuestion:string -> needsMoreInfo:boolean, focusedQueries:string[]'
)
.node(
'answerGenerator',
'finalContext:string, originalQuestion:string -> comprehensiveAnswer:string, confidenceLevel:number'
)
.node(
'queryRefiner',
'originalQuestion:string, currentContext:string, missingAspects:string[] -> refinedQuery:string'
)
.node(
'qualityValidator',
'generatedAnswer:string, userQuery:string -> qualityScore:number, issues:string[]'
)
.node(
'answerHealer',
'originalAnswer:string, healingDocument:string, issues?:string[] -> healedAnswer:string'
)
// 🚀 STEP 2: Initialize comprehensive pipeline state
// AxFlow maintains this state throughout the entire pipeline execution
.map((state) => ({
...state,
maxHops,
qualityThreshold,
maxIterations,
qualityTarget,
disableQualityHealing,
currentHop: 0,
accumulatedContext: '',
retrievedContexts: [] as string[],
completenessScore: 0,
searchQuery: state.originalQuestion,
shouldContinue: true,
iteration: 0,
allEvidence: [] as string[],
evidenceSources: [] as string[],
needsMoreInfo: true,
healingAttempts: 0,
currentQuality: 0,
shouldContinueHealing: true,
currentAnswer: '',
currentIssues: [] as string[],
}))
// 🔄 PHASE 1: Multi-hop Retrieval with Iterative Refinement
// AxFlow's .while() enables sophisticated looping with dynamic conditions
.while(
(state) =>
state.currentHop < state.maxHops &&
state.completenessScore < state.qualityThreshold &&
state.shouldContinue
)
// Increment hop counter for each iteration
.map((state) => ({
...state,
currentHop: state.currentHop + 1,
}))
// 🧠 Generate intelligent search query using AI
.execute('queryGenerator', (state) => ({
originalQuestion: state.originalQuestion,
previousContext: state.accumulatedContext || undefined,
}))
// 📚 Execute vector database retrieval using provided queryFn
.map(async (state) => {
const searchQuery = state.queryGeneratorResult.searchQuery as string;
const retrievedDocument = await queryFn(searchQuery);
return {
...state,
retrievalResult: {
retrievedDocument,
retrievalConfidence: 0.9,
},
};
})
// 🔗 Contextualize retrieved document with existing knowledge
.execute('contextualizer', (state) => ({
retrievedDocument: state.retrievalResult.retrievedDocument,
accumulatedContext: state.accumulatedContext || undefined,
}))
// 📊 Assess quality and completeness of current context
.execute('qualityAssessor', (state) => ({
currentContext: state.contextualizerResult.enhancedContext,
originalQuestion: state.originalQuestion,
}))
// 📈 Update state with enhanced context and quality metrics
.map((state) => ({
...state,
accumulatedContext: state.contextualizerResult.enhancedContext,
retrievedContexts: [
...state.retrievedContexts,
state.retrievalResult.retrievedDocument,
],
completenessScore: state.qualityAssessorResult.completenessScore as number,
searchQuery: state.queryGeneratorResult.searchQuery as string,
shouldContinue:
(state.qualityAssessorResult.completenessScore as number) < state.qualityThreshold,
}))
// 🎯 Conditional query refinement using AxFlow's branching
.branch(
(state) => state.shouldContinue && state.currentHop < state.maxHops
)
.when(true)
.execute('queryRefiner', (state) => ({
originalQuestion: state.originalQuestion,
currentContext: state.accumulatedContext,
missingAspects: state.qualityAssessorResult.missingAspects,
}))
.map((state) => ({
...state,
searchQuery: state.queryRefinerResult?.refinedQuery || state.searchQuery,
}))
.when(false)
.map((state) => state) // No refinement needed
.merge()
.endWhile()
// ⚡ PHASE 2: Advanced Parallel Sub-query Processing
// Initialize evidence collection from Phase 1 results
.map((state) => ({
...state,
allEvidence: state.retrievedContexts.length > 0 ? state.retrievedContexts : [],
}))
// 🔄 Iterative sub-query processing loop
.while(
(state) => state.iteration < state.maxIterations && state.needsMoreInfo
)
.map((state) => ({
...state,
iteration: state.iteration + 1,
}))
// 🧩 Question decomposition for first iteration
.branch((state) => state.iteration === 1)
.when(true)
.execute('questionDecomposer', (state) => ({
complexQuestion: state.originalQuestion,
}))
.map((state) => ({
...state,
currentQueries: state.questionDecomposerResult.subQuestions,
}))
.when(false)
// Use focused queries from gap analysis for subsequent iterations
.map((state) => ({
...state,
currentQueries: ((state as any).gapAnalyzerResult?.focusedQueries as string[]) || [],
}))
.merge()
// 🚀 Parallel retrieval execution for multiple queries
.map(async (state) => {
const queries = state.currentQueries || [];
const retrievalResults =
queries.length > 0
? await Promise.all(queries.map((query: string) => queryFn(query)))
: [];
return {
...state,
retrievalResults,
};
})
// 🧪 Synthesize evidence from multiple sources
.execute('evidenceSynthesizer', (state) => {
const evidence = [
...(state.allEvidence || []),
...(state.retrievalResults || []),
].filter(Boolean);
return {
collectedEvidence: evidence.length > 0 ? evidence : ['No evidence collected yet'],
originalQuestion: state.originalQuestion,
};
})
// 🔍 Analyze information gaps and determine next steps
.execute('gapAnalyzer', (state) => ({
synthesizedEvidence: state.evidenceSynthesizerResult.synthesizedEvidence,
evidenceGaps: state.evidenceSynthesizerResult.evidenceGaps,
originalQuestion: state.originalQuestion,
}))
// 📊 Update state with synthesized evidence and gap analysis
.map((state) => ({
...state,
allEvidence: [...state.allEvidence, ...state.retrievalResults],
evidenceSources: [...state.evidenceSources, `Iteration ${state.iteration} sources`],
needsMoreInfo: state.gapAnalyzerResult.needsMoreInfo,
synthesizedEvidence: state.evidenceSynthesizerResult.synthesizedEvidence,
}))
.endWhile()
// 📝 PHASE 3: Generate comprehensive initial answer
.execute('answerGenerator', (state) => ({
finalContext:
state.accumulatedContext ||
state.synthesizedEvidence ||
state.allEvidence.join('\n'),
originalQuestion: state.originalQuestion,
}))
// 🏥 PHASE 4: Self-healing Quality Validation and Improvement
// Conditional quality healing based on configuration
.branch((state) => !state.disableQualityHealing)
.when(true)
// Validate initial answer quality
.execute('qualityValidator', (state) => ({
generatedAnswer: state.answerGeneratorResult.comprehensiveAnswer,
userQuery: state.originalQuestion,
}))
.map((state) => ({
...state,
currentAnswer: state.answerGeneratorResult.comprehensiveAnswer as string,
currentQuality: state.qualityValidatorResult.qualityScore as number,
currentIssues: state.qualityValidatorResult.issues as string[],
shouldContinueHealing:
(state.qualityValidatorResult.qualityScore as number) < state.qualityTarget,
}))
// 🔄 Healing loop for iterative quality improvement
.while(
(state) => state.healingAttempts < 3 && state.shouldContinueHealing
)
.map((state) => ({
...state,
healingAttempts: state.healingAttempts + 1,
}))
// 🩹 Retrieve healing context to address specific issues
.map(async (state) => {
const healingQuery = `${state.originalQuestion} addressing issues: ${(state.currentIssues as string[])?.join(', ') || 'quality improvement'}`;
const healingDocument = await queryFn(healingQuery);
return {
...state,
healingResult: { healingDocument },
};
})
// 🔧 Apply healing to improve answer quality
.execute('answerHealer', (state) => ({
originalAnswer: state.currentAnswer,
healingDocument: state.healingResult.healingDocument,
issues: state.currentIssues,
}))
// ✅ Re-validate after healing application
.execute('qualityValidator', (state) => ({
generatedAnswer: state.answerHealerResult.healedAnswer,
userQuery: state.originalQuestion,
}))
.map((state) => ({
...state,
currentAnswer: state.answerHealerResult.healedAnswer as string,
currentQuality: state.qualityValidatorResult.qualityScore as number,
currentIssues: state.qualityValidatorResult.issues as string[],
shouldContinueHealing:
(state.qualityValidatorResult.qualityScore as number) < state.qualityTarget,
}))
.endWhile()
.when(false)
// Skip quality healing - use answer directly from Phase 3
.map((state) => ({
...state,
currentAnswer: state.answerGeneratorResult.comprehensiveAnswer,
currentQuality: 1.0, // Assume perfect quality when disabled
currentIssues: [] as string[],
shouldContinueHealing: false,
}))
.merge()
// 🎯 Final output transformation
.map((state) => ({
finalAnswer: state.currentAnswer,
totalHops: state.currentHop,
retrievedContexts: state.retrievedContexts,
iterationCount: state.iteration,
healingAttempts: state.healingAttempts,
qualityAchieved: state.currentQuality,
}))
);
};
```
### 🌟 Why This Demonstrates AxFlow's Power
This `axRAG` implementation showcases **AxFlow's unique capabilities** for building enterprise-grade LLM pipelines:
**🏗️ Complex State Management**: AxFlow seamlessly manages complex state transformations across 20+ pipeline steps, handling async operations, branching logic, and iterative loops without losing state consistency.
**🔄 Advanced Control Flow**: The pipeline uses AxFlow's `.while()`, `.branch()`, `.when()`, and `.merge()` operators to implement sophisticated control flow that would be complex and error-prone with traditional code.
**⚡ Automatic Parallelization**: AxFlow automatically parallelizes operations where possible, such as the parallel sub-query processing in Phase 2, maximizing performance without manual coordination.
**🧠 AI-Native Design**: Each `.node()` defines an AI task with typed signatures, making the pipeline self-documenting and enabling automatic prompt optimization and validation.
**🛡️ Production Reliability**: Built-in error handling, retry logic, state recovery, and comprehensive logging make this production-ready for real-world applications.
**📊 Observability**: The debug mode and logging capabilities provide complete visibility into the pipeline execution, essential for debugging and optimization.
This level of sophistication—multi-hop reasoning, self-healing quality loops, parallel processing, and intelligent branching—demonstrates how **AxFlow enables developers to build the kinds of advanced LLM systems that solve real-world problems** with enterprise reliability and maintainability.
## Debug Mode Visualization
Enable `debug: true` to see the complete RAG pipeline execution:
```bash
🔄 [ AXFLOW START ]
Input Fields: originalQuestion
Total Steps: 18
═══════════════════════════════════════
⚡ [ STEP 3 EXECUTE ] Node: contextualizer in 1.62s
New Fields: contextualizerResult
Result: {
"enhancedContext": "Machine learning in financial services raises privacy concerns through data collection, algorithmic bias, and potential for discrimination. Regulations like GDPR require explicit consent and data protection measures."
}
────────────────────────────────────────
⚡ [ STEP 9 EXECUTE ] Node: evidenceSynthesizer in 1.42s
New Fields: evidenceSynthesizerResult
Result: {
"synthesizedEvidence": "Comprehensive analysis of ML privacy implications including regulatory compliance, bias prevention, and consumer protection measures.",
"evidenceGaps": ["Technical implementation details", "Industry-specific case studies"]
}
────────────────────────────────────────
✅ [ AXFLOW COMPLETE ]
Total Time: 12.49s
Steps Executed: 15
═══════════════════════════════════════
```
## Integration with Vector Databases
```typescript
// Weaviate integration
import { axDB } from "@ax-llm/ax";
const weaviateDB = new axDB("weaviate", {
url: "http://localhost:8080",
});
const queryWeaviate = async (query: string): Promise<string> => {
const embedding = await llm.embed({ texts: [query] });
const results = await weaviateDB.query({
table: "documents",
values: embedding.embeddings[0],
limit: 5,
});
return results.map(r => r.metadata.text).join('\n');
};
// Pinecone integration
import { axDB } from "@ax-llm/ax";
const pineconeDB = new axDB("pinecone", {
apiKey: process.env.PINECONE_API_KEY,
environment: "us-west1-gcp",
});
const queryPinecone = async (query: string): Promise<string> => {
const embedding = await llm.embed({ texts: [query] });
const results = await pineconeDB.query({
table: "knowledge-base",
values: embedding.embeddings[0],
topK: 10,
});
return results.matches.map(m => m.metadata.content).join('\n');
};
```
## Performance Optimization
```typescript
// Optimized for speed
const fastRAG = axRAG(queryFn, {
maxHops: 2, // Fewer hops for speed
qualityThreshold: 0.7, // Lower quality threshold
maxIterations: 1, // Single iteration
disableQualityHealing: true, // Skip healing for speed
});
// Optimized for quality
const qualityRAG = axRAG(queryFn, {
maxHops: 5, // Thorough retrieval
qualityThreshold: 0.9, // High quality threshold
maxIterations: 3, // Multiple iterations
qualityTarget: 0.95, // Very high healing target
disableQualityHealing: false,
});
// Balanced configuration
const balancedRAG = axRAG(queryFn, {
maxHops: 3,
qualityThreshold: 0.8,
maxIterations: 2,
qualityTarget: 0.85,
});
```
## Simple RAG Alternative
For basic use cases, Ax also provides `axSimpleRAG`:
```typescript
import { axSimpleRAG } from "@ax-llm/ax";
// Simple single-hop RAG
const simpleRAG = axSimpleRAG(queryVectorDB);
const result = await simpleRAG.forward(llm, {
question: "What is renewable energy?"
});
console.log("Answer:", result.answer);
console.log("Context:", result.context);
```
## Why axRAG is Powerful
**🚀 Production-Ready Architecture:**
- Built on AxFlow's automatic parallelization and resilience features
- Self-healing quality loops prevent poor answers
- Configurable trade-offs between speed and thoroughness
- Comprehensive logging and debugging capabilities
**🧠 Advanced Intelligence:**
- Multi-hop reasoning that builds context iteratively
- Intelligent query refinement based on previous results
- Gap analysis to identify missing information
- Parallel sub-query processing for complex questions
**🔧 Enterprise Features:**
- Configurable quality thresholds and targets
- Support for any vector database through simple query function
- Built-in error handling and retry logic
- Comprehensive metrics and observability
**⚡ Performance Optimized:**
- Automatic parallelization where possible
- Intelligent caching and context reuse
- Configurable performance vs. quality trade-offs
- Efficient token usage through smart prompt management
> _"axRAG doesn't just retrieve and generate—it thinks, analyzes, and iteratively improves to deliver the highest quality answers possible"_
The `axRAG` function represents the future of RAG systems: intelligent, self-improving, and production-ready with enterprise-grade reliability built on AxFlow's powerful orchestration capabilities.
================================================================================
# Migration Guide
# Source: MIGRATION.md
# Complete migration guide for Ax v13.0.24+ API changes
# Migration Guide: Ax v13.0.24+ API Changes
This document provides comprehensive migration instructions for Ax v14.0.0+ API
changes. The framework introduces significant improvements for better type
safety, performance, and consistency.
## Overview of Changes
**Version 14.0.0+** deprecates several patterns that will be **completely
removed in v15.0.0**:
1. **Template literal syntax** for signatures and generators
2. **Constructor-based API** for core classes
3. **Legacy classes** like `AxChainOfThought` and `AxRAG`
**Important**: The `f.*()` field helper functions are **NOT deprecated** -
they remain available in the fluent signature creation API
(`f().input().output().build()`).
## What's Deprecated vs What's Not
### ❌ Deprecated (will be removed in v15.0.0)
1. **Template literal functions**:
- `` ax`template` `` → Use `ax('string')`
- `` s`template` `` → Use `s('string')`
2. **Constructor-based classes**:
- `new AxAI()` → Use `ai()` factory function
- `new AxAgent()` → Use `agent()` factory function
- `new AxFlow()` → Use the `AxFlow.create()` static method
- `new AxSignature()` → Use `s()` function or `AxSignature.create()`
3. **Legacy classes**:
- `AxChainOfThought` → Use modern thinking models (o1, etc.)
- `AxRAG` → Use `axRAG()` function built on AxFlow
### ✅ Still Available (NOT deprecated)
1. **Field helper functions in fluent API**:
- `f.string()`, `f.number()`, `f.class()`, etc. - still work in fluent
signatures
- `f().input().output().build()` pattern remains fully supported
- Pure fluent methods: `.optional()`, `.array()`, `.internal()`
2. **All current functionality** - just accessed through new patterns
### ❌ Removed in v14.0.0+
1. **Nested fluent helper functions**:
- `f.array(f.string())` → Use `f.string().array()`
- `f.optional(f.string())` → Use `f.string().optional()`
- `f.internal(f.string())` → Use `f.string().internal()`
## Detailed Migration Instructions
### 1. AI Instance Creation
```typescript
// ❌ DEPRECATED: Constructor
const ai = new AxAI({ name: "openai", apiKey: "..." });
// ✅ CURRENT: Factory function
const llm = ai({ name: "openai", apiKey: "..." });
```
**Why migrate**: Factory functions provide better type inference and
consistency.
### 2. Signature Creation
#### String-Based Signatures (Recommended)
```typescript
// ❌ DEPRECATED: Template literal
const sig = s`input:string -> output:string`;
// ✅ CURRENT: Function call
const sig = s("input:string -> output:string");
```
#### Fluent API (Still Fully Supported)
```typescript
// ✅ CURRENT: Fluent API with f.() helpers
const sig = f()
.input("userMessage", f.string("User input"))
.input("context", f.string("Background context").optional())
.output("response", f.string("Generated response"))
.output(
"sentiment",
f.class(["positive", "negative", "neutral"], "Sentiment"),
)
.build();
```
#### Static Methods (Alternative)
```typescript
// ✅ CURRENT: Static method
const sig = AxSignature.create("input:string -> output:string");
```
### 3. Generator Creation
```typescript
// ❌ DEPRECATED: Template literal
const gen = ax`input:string -> output:string`;
// ✅ CURRENT: Function call
const gen = ax("input:string -> output:string");
```
### 4. Agent Creation
```typescript
// ❌ DEPRECATED: Constructor
const agent = new AxAgent({
name: "helper",
signature: sig,
ai: llm,
});
// ✅ CURRENT: Factory function
const agentInstance = agent({
name: "helper",
signature: sig,
ai: llm,
});
// ✅ ALTERNATIVE: Static method
const agentInstance = AxAgent.create({
name: "helper",
signature: sig,
ai: llm,
});
```
### 5. Flow Creation
```typescript
// ❌ DEPRECATED: Constructor
const flow = new AxFlow();
// ✅ CURRENT: Static method
const flow = AxFlow.create();
// Note: the AxFlow constructor still works for now, but prefer AxFlow.create()
```
### 6. RAG Usage
```typescript
// ❌ DEPRECATED: AxRAG class
const rag = new AxRAG({ ai: llm, db: vectorDb });
// ✅ CURRENT: axRAG function (AxFlow-based)
// axRAG takes a vector-DB query function, not an { ai, db } object
const rag = axRAG(queryVectorDb); // queryVectorDb: (query: string) => Promise<string>
```
## Field Type Reference
### String-Based Field Syntax
When using `s()` or `ax()` functions, use string-based field definitions:
| Type | Syntax | Example |
| ------------------ | ---------------------------------------- | ------------------------------------------------- |
| **String** | `field:string "description"` | `userInput:string "User question"` |
| **Number** | `field:number "description"` | `score:number "Confidence 0-1"` |
| **Boolean** | `field:boolean "description"` | `isValid:boolean "Is input valid"` |
| **JSON** | `field:json "description"` | `metadata:json "Extra data"` |
| **Arrays** | `field:type[] "description"` | `tags:string[] "Keywords"` |
| **Optional** | `field?:type "description"` | `context?:string "Optional context"` |
| **Classification** | `field:class "opt1, opt2" "description"` | `category:class "urgent, normal, low" "Priority"` |
| **Date** | `field:date "description"` | `dueDate:date "Due date"` |
| **DateTime** | `field:datetime "description"` | `timestamp:datetime "Event time"` |
| **Code** | `field:code "description"` | `script:code "Python code"` |
| **Media** | `field:image/audio/file/url` | `photo:image "Profile picture"` |
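Several rows of the table above can be combined in a single signature string; a hypothetical example (the field names are invented for illustration):

```typescript
// A hypothetical signature string that combines several rows of the table
// above. This exact string is what you would pass to ax() or s().
const signature =
  'customerEmail:string "Email text", ' +
  'threadContext?:string "Optional prior thread" -> ' +
  'priority:class "urgent, normal, low" "Triage priority", ' +
  'tags:string[] "Keywords"';

console.log(signature);
```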
### Pure Fluent API (Updated in v14.0.0+)
The fluent API has been redesigned to be purely fluent, removing nested function calls:
```typescript
// ✅ Pure fluent syntax (current)
const sig = f()
.input("textInput", f.string("Input text"))
.input("optionsList", f.string("Option").array().optional())
.input("metadataInfo", f.json("Extra data"))
.output("processedResult", f.string("Processed result"))
.output("categoryType", f.class(["A", "B", "C"], "Classification"))
.output("confidenceScore", f.number("Confidence score"))
// ❌ Deprecated nested syntax (removed in v14.0.0+)
// .input("options", f.array(f.string("Option")).optional()) // No longer works
// .input("optional", f.optional(f.string("Field"))) // No longer works
// .output("internal", f.internal(f.string("Field"))) // No longer works
.build();
// Key differences:
// 1. f.array(f.string()) → f.string().array()
// 2. f.optional(f.string()) → f.string().optional()
// 3. f.internal(f.string()) → f.string().internal()
// 4. Method chaining works in any order: .optional().array() === .array().optional()
```
### Migration: Fluent API Nested Functions
**Before (v13.x - Nested Functions)**:
```typescript
const oldSig = f()
.input("items", f.array(f.string("Item description")))
.input("config", f.optional(f.json("Configuration")))
.output("result", f.string("Processing result"))
.output("debug", f.internal(f.string("Debug info")))
.build();
```
**After (v14.0+ - Pure Fluent)**:
```typescript
const newSig = f()
.input("itemsList", f.string("Item description").array())
.input("configData", f.json("Configuration").optional())
.output("processedResult", f.string("Processing result"))
.output("debugInfo", f.string("Debug info").internal())
.build();
```
**Migration Steps**:
1. Replace `f.array(f.TYPE())` with `f.TYPE().array()`
2. Replace `f.optional(f.TYPE())` with `f.TYPE().optional()`
3. Replace `f.internal(f.TYPE())` with `f.TYPE().internal()`
4. Update field names to be more descriptive (recommended)
5. Combine modifiers: `.optional().array()`, `.array().internal()`, etc.
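Steps 1 through 3 are regular enough to sketch as a quick regex codemod. This is a rough illustration only (an AST-based tool such as ts-morph is safer for real codebases), and `migrateNestedFluent` is a hypothetical helper:

```typescript
// Rewrite the removed nested fluent helpers (f.array/f.optional/f.internal
// wrapping another f.* call) into the pure fluent method-chain form.
// Handles the simple single-level case only; nested parentheses need an AST.
const migrateNestedFluent = (source: string): string =>
  source.replace(
    /f\.(array|optional|internal)\(\s*(f\.\w+\([^()]*\))\s*\)/g,
    (_match, modifier: string, inner: string) => `${inner}.${modifier}()`,
  );

const before = '.input("items", f.array(f.string("Item description")))';
console.log(migrateNestedFluent(before));
// -> .input("items", f.string("Item description").array())
```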
## Complete Migration Examples
### Example 1: Simple Text Processing
```typescript
// ❌ DEPRECATED
const ai = new AxAI({ name: "openai", apiKey: "..." });
const gen = ax`text:string -> summary:string`;
const result = await gen.forward(ai, { text: "Long text..." });
// ✅ CURRENT
const llm = ai({ name: "openai", apiKey: "..." });
const gen = ax("text:string -> summary:string");
const result = await gen.forward(llm, { text: "Long text..." });
```
### Example 2: Complex Agent
```typescript
// ❌ DEPRECATED
const ai = new AxAI({ name: "openai", apiKey: "..." });
const sig = s`question:string -> answer:string, confidence:number`;
const agent = new AxAgent({
name: "assistant",
signature: sig,
ai: ai,
});
// ✅ CURRENT
const llm = ai({ name: "openai", apiKey: "..." });
const sig = s("question:string -> answer:string, confidence:number");
const agentInstance = agent({
name: "assistant",
signature: sig,
ai: llm,
});
```
### Example 3: RAG Pipeline
```typescript
// ❌ DEPRECATED
const ai = new AxAI({ name: "openai", apiKey: "..." });
const rag = new AxRAG({ ai, db: vectorDb });
// ✅ CURRENT
const llm = ai({ name: "openai", apiKey: "..." });
const rag = axRAG({ ai: llm, db: vectorDb });
```
## Automated Migration
For large codebases, find-and-replace patterns can automate much of the
migration. Review the resulting diff before committing — regex rewrites miss
edge cases such as multi-line template literals:
### Template Literal Migration
```bash
# Replace ax template literals (GNU sed; on macOS/BSD sed use -i '')
find . -name "*.ts" -exec sed -i 's/ax`\([^`]*\)`/ax("\1")/g' {} \;
# Replace s template literals
find . -name "*.ts" -exec sed -i 's/s`\([^`]*\)`/s("\1")/g' {} \;
```
### Constructor Migration
```bash
# Replace AxAI constructor
find . -name "*.ts" -exec sed -i 's/new AxAI(/ai(/g' {} \;
# Replace AxAgent constructor
find . -name "*.ts" -exec sed -i 's/new AxAgent(/agent(/g' {} \;
# Replace AxRAG constructor
find . -name "*.ts" -exec sed -i 's/new AxRAG(/axRAG(/g' {} \;
```
### Import Updates
```bash
# Update imports to include factory functions
find . -name "*.ts" -exec sed -i 's/import { AxAI }/import { ai }/g' {} \;
find . -name "*.ts" -exec sed -i 's/import { AxAgent }/import { agent }/g' {} \;
```
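The same rewrites can also be sketched as a small Node codemod, which is easier to dry-run than in-place `sed`. The `migrateSource` helper below is illustrative only (it is not part of Ax) and shares the limitations of the regexes above — it won't handle multi-line template literals:

```typescript
// migrate.ts — the sed patterns above expressed as plain string rewrites.
// Transform source text in memory, diff it, then write back when satisfied.
const rules: Array<[RegExp, string]> = [
  [/ax`([^`]*)`/g, 'ax("$1")'], // ax`...`  -> ax("...")
  [/s`([^`]*)`/g, 's("$1")'],   // s`...`   -> s("...")
  [/new AxAI\(/g, "ai("],       // constructors -> factory functions
  [/new AxAgent\(/g, "agent("],
  [/new AxRAG\(/g, "axRAG("],
];

function migrateSource(code: string): string {
  return rules.reduce(
    (src, [pattern, replacement]) => src.replace(pattern, replacement),
    code
  );
}

// Example:
const before =
  'const gen = ax`text:string -> summary:string`;\n' +
  'const llm = new AxAI({ name: "openai" });';
console.log(migrateSource(before));
```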
## Benefits of Migration
### 1. Better Type Safety
- Full TypeScript inference for all field types
- Exact literal type inference for class fields
- Compile-time validation of signatures
### 2. Improved Performance
- No template literal processing overhead
- Faster signature parsing
- Reduced runtime validation
### 3. Cleaner Syntax
- More readable and consistent API patterns
- Better IntelliSense support
- Enhanced auto-completion
### 4. Future-Proof Architecture
- Aligned with framework's long-term vision
- Consistent patterns across all APIs
- Better extensibility for new features
## Timeline
- **v13.0.24+**: Deprecated patterns still work but show warnings
- **v15.0.0**: Deprecated patterns will be completely removed
- **Recommendation**: Migrate as soon as possible to take advantage of
improvements
## Common Migration Issues
### Issue 1: Template Literal Field Interpolation
```typescript
// ❌ PROBLEMATIC: Complex template literals
const dynamicType = "string";
const sig = s`input:${dynamicType} -> output:string`;
// ✅ SOLUTION: Use fluent API for dynamic fields
const sig = f()
.input("input", f[dynamicType as keyof typeof f]("Input field"))
.output("output", f.string("Output field"))
.build();
```
### Issue 2: Variable Naming Conflicts
```typescript
// ❌ PROBLEMATIC: Variable name conflicts
const ai = ai({ name: "openai", apiKey: "..." }); // ai conflicts with function name
// ✅ SOLUTION: Use recommended naming
const llm = ai({ name: "openai", apiKey: "..." }); // Clear naming
```
### Issue 3: Import Statement Updates
```typescript
// ❌ OLD: Constructor imports
import { AxAgent, AxAI } from "@ax-llm/ax";
// ✅ NEW: Factory function imports
import { agent, ai } from "@ax-llm/ax";
```
## Need Help?
If you encounter issues during migration:
1. Check the [examples directory](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/src/examples/) for updated patterns
2. Refer to the main [README.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/README.md) for current API usage
3. Join our [Discord community](https://discord.gg/DSHg3dU7dW) for support
4. Open an issue on [GitHub](https://github.com/ax-llm/ax/issues)
## Summary
The v13.0.24+ migration primarily involves:
1. **Replace template literals** with function calls: `` ax`...` `` →
`ax('...')`
2. **Replace constructors** with factory functions: `new AxAI()` → `ai()`
3. **Update variable names** to avoid conflicts: Use `llm` instead of `ai`
4. **Update imports** to include new factory functions
The `f` field helper functions (`f.string()`, `f.class()`, etc.) remain fully
supported in the fluent API and are **not deprecated**.
All deprecated patterns will be removed in v15.0.0, so migrate as soon as
possible to ensure compatibility and take advantage of the improved type safety
and performance.
================================================================================
# Examples Guide
# Source: EXAMPLES.md
# Comprehensive examples showcasing Ax framework capabilities
# Ax Examples Guide
A comprehensive collection of examples showcasing Ax framework capabilities, from basic signatures to production-ready patterns.
## Table of Contents
- [Getting Started](#getting-started)
- [Core Concepts](#core-concepts)
- [Advanced Features](#advanced-features)
- [Production Patterns](#production-patterns)
- [Optimization & Training](#optimization--training)
- [Multi-Modal & Vision](#multi-modal--vision)
- [Agent Systems](#agent-systems)
- [Workflow Orchestration](#workflow-orchestration)
## Getting Started
### 1. Basic Signature - Email Classification
The simplest way to start with Ax: define input → output, get type-safe results.
```typescript
import { ai, ax } from '@ax-llm/ax';
// Define your signature
const classifier = ax(
'email:string -> category:class "spam, important, normal" "Email category"'
);
// Choose your LLM
const llm = ai({
name: 'openai',
apiKey: process.env.OPENAI_APIKEY!
});
// Get results
const result = await classifier.forward(llm, {
email: "URGENT: You've won $1,000,000! Click here now!"
});
console.log(result.category); // "spam" - Type-safe!
```
**Key Concepts:**
- Signatures define the contract: `input -> output`
- `class` type ensures output is one of the specified values
- Full TypeScript type inference
### 2. Structured Extraction
Extract multiple structured fields from unstructured text in one call.
```typescript
import { ax, ai } from '@ax-llm/ax';
const extractor = ax(`
customerEmail:string, currentDate:datetime ->
subject:string "Email subject",
priority:class "high, normal, low",
sentiment:class "positive, negative, neutral",
ticketNumber?:number "Optional ticket number",
nextSteps:string[] "Action items",
estimatedResponseTime:string
`);
const result = await extractor.forward(ai({ name: 'openai' }), {
customerEmail: `
Subject: Order #12345 hasn't arrived
I ordered 2 weeks ago and still haven't received my package.
This is unacceptable! I need this resolved immediately or I want a refund.
The tracking shows it's been stuck for days.
`,
currentDate: new Date()
});
console.log(result);
// {
// subject: "Order #12345 hasn't arrived",
// priority: "high",
// sentiment: "negative",
// ticketNumber: 12345,
// nextSteps: ["Check tracking status", "Contact shipping carrier", "Offer refund or replacement"],
// estimatedResponseTime: "Within 24 hours"
// }
```
### 3. Adding Validation with Assertions
Ensure outputs meet your business rules with assertions. Assertions provide multiple ways to signal failures:
```typescript
import { ax, ai } from '@ax-llm/ax';
const gen = ax('startNumber:number -> next10Numbers:number[], summary:string');
// Method 1: Return false with a fallback message
gen.addAssert(
({ next10Numbers }) => next10Numbers?.length === 10,
'Must generate exactly 10 numbers'
);
// Method 2: Return a custom error string (recommended)
gen.addAssert(({ next10Numbers }) => {
if (!next10Numbers) return undefined; // Skip validation if undefined
if (next10Numbers.length !== 10) {
return `Generated ${next10Numbers.length} numbers, expected exactly 10`;
}
return true; // Pass validation
});
// Method 3: Throw AxAssertionError to trigger retry
import { AxAssertionError } from '@ax-llm/ax';
gen.addAssert(({ next10Numbers }) => {
if (next10Numbers?.some(n => n <= 0)) {
throw new AxAssertionError({
message: `Invalid numbers found: ${next10Numbers.filter(n => n <= 0)}`
});
}
return true;
});
// Method 4: Throw standard Error for immediate failure (no retries)
gen.addAssert(({ next10Numbers }) => {
if (!next10Numbers) {
throw new Error('Critical: Numbers missing entirely!');
}
return true;
});
// Method 5: Conditional validation with undefined return
gen.addAssert(({ summary }) => {
if (!summary) return undefined; // Skip if summary not provided
return summary.length >= 20; // Only validate if present
}, 'Summary must be at least 20 characters when provided');
// Ax will automatically retry if assertions fail (up to maxRetries)
const result = await gen.forward(ai({ name: 'openai' }), {
startNumber: 1
});
```
**Assertion Return Values:**
- `true`: Assertion passes, continue generation
- `false`: Assertion fails, triggers retry with provided message
- `string`: Assertion fails, triggers retry with returned string as error message
- `undefined`: Skip this assertion (useful for conditional validation)
- `throw new AxAssertionError(...)`: Assertion fails, triggers retry
- `throw new Error(...)`: Immediate failure; the error propagates to the caller with no retries
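The return-value contract above can be sketched as a plain dispatcher. The `interpretAssertion` helper is purely illustrative — it is not Ax's internal implementation — but it captures how each return value maps to retry behavior (throws are left to propagate, mirroring the table):

```typescript
type AssertResult = boolean | string | undefined;

type Outcome =
  | { kind: "pass" }
  | { kind: "skip" }
  | { kind: "retry"; message: string };

// Hypothetical helper: maps an assertion's return value to the behavior
// described in the list above.
function interpretAssertion(
  result: AssertResult,
  fallbackMessage: string
): Outcome {
  if (result === true) return { kind: "pass" };       // continue generation
  if (result === undefined) return { kind: "skip" };  // conditional validation
  if (result === false) return { kind: "retry", message: fallbackMessage };
  return { kind: "retry", message: result };          // custom error string
}
```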
**Streaming Assertions:**
```typescript
const streamingGen = ax('topic:string -> article:string, title:string');
// Validate streaming content as it's generated
streamingGen.addStreamingAssert('article', (content, done) => {
// Only validate complete content
if (!done) return undefined;
if (content.length < 100) {
return 'Article must be at least 100 characters long';
}
return true;
});
// Stream with validation
for await (const chunk of streamingGen.streamingForward(ai({ name: 'openai' }), {
topic: 'TypeScript best practices'
})) {
console.log(chunk.article || chunk.title || '');
}
```
### 3.5. Field Validation & Constraints (New!)
Ensure data quality with built-in Zod-like validation constraints. These run automatically on both inputs and outputs.
#### Data Quality with Built-in Validators
```typescript
import { ax, f, ai } from '@ax-llm/ax';
const userRegistration = f()
.input('formData', f.string('Raw registration form data'))
.output('user', f.object({
username: f.string('Username').min(3).max(20),
email: f.string('Email address').email(),
age: f.number('User age').min(18).max(120),
password: f.string('Password').min(8).regex('^(?=.*[A-Za-z])(?=.*\\d)'),
bio: f.string('User biography').max(500).optional(),
website: f.string('Personal website').url().optional(),
tags: f.string('Interest tag').min(2).max(30).array()
}))
.build();
const llm = ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY! });
const generator = ax(userRegistration);
const result = await generator.forward(llm, {
formData: `
Name: johndoe
Email: john@example.com
Age: 25
Password: secure123
Bio: Software developer passionate about TypeScript and AI
Website: https://johndoe.dev
Tags: typescript, ai, web development
`
});
console.log(result.user);
// {
// username: "johndoe",
// email: "john@example.com",
// age: 25,
// password: "secure123",
// bio: "Software developer passionate about TypeScript and AI",
// website: "https://johndoe.dev",
// tags: ["typescript", "ai", "web development"]
// }
// All constraints are validated:
// ✅ username: 3-20 characters
// ✅ email: valid email format
// ✅ age: between 18-120
// ✅ password: min 8 chars with letter and number
// ✅ website: valid URL format
// ✅ tags: each 2-30 characters
```
**Available Validators:**
- `.min(n)` / `.max(n)` - String length or number range
- `.email()` - Email format (or use `f.email()`)
- `.url()` - URL format (or use `f.url()`)
- `.date()` - Date format (or use `f.date()`)
- `.datetime()` - DateTime format (or use `f.datetime()`)
- `.regex(pattern, description)` - Custom regex pattern
- `.optional()` - Make field optional
**Note:** For email, url, date, and datetime, you can use either the validator syntax (`f.string().email()`) or the dedicated type syntax (`f.email()`). Both work consistently everywhere!
#### Contact Form with Regex Patterns
```typescript
import { ax, f } from '@ax-llm/ax';
const contactFormParser = f()
.input('formSubmission', f.string('Raw form data'))
.output('contact', f.object({
fullName: f.string('Full name').min(2).max(100),
email: f.string('Email address').email(),
phone: f.string('Phone number').regex('^\\+?[1-9]\\d{1,14}$'),
subject: f.string('Subject line').min(5).max(200),
message: f.string('Message content').min(20).max(2000),
urgency: f.string('Urgency level').optional()
}))
.build();
const result = await ax(contactFormParser).forward(llm, {
formSubmission: `
Name: Jane Smith
Email: jane.smith@company.com
Phone: +1234567890
Subject: Product inquiry about Enterprise plan
Message: I'm interested in learning more about your Enterprise plan for our team of 50 developers. Could you provide pricing and feature details?
Urgency: High
`
});
// Validation ensures:
// ✅ Phone matches international format
// ✅ Email is properly formatted
// ✅ Message has sufficient detail (20+ chars)
// ✅ Subject is descriptive (5-200 chars)
```
#### E-Commerce Product Validation
```typescript
import { ax, f } from '@ax-llm/ax';
const productExtractor = f()
.input('productPage', f.string('Product page HTML'))
.output('product', f.object({
name: f.string('Product name').min(1).max(200),
price: f.number('Price in USD').min(0),
specifications: f.object({
dimensions: f.object({
width: f.number('Width in cm').min(0),
height: f.number('Height in cm').min(0),
depth: f.number('Depth in cm').min(0)
}),
weight: f.number('Weight in kg').min(0),
materials: f.string('Material name').min(1).array()
}),
availability: f.object({
inStock: f.boolean('Stock status'),
quantity: f.number('Available quantity').min(0),
restockDate: f.string('Restock date').optional()
}),
images: f.object({
url: f.string('Image URL').url(),
alt: f.string('Alt text').min(1).max(100)
}).array(),
reviews: f.object({
rating: f.number('Rating').min(1).max(5),
comment: f.string('Review text').min(10).max(1000),
verified: f.boolean('Verified purchase')
}).array()
}))
.build();
const result = await ax(productExtractor).forward(llm, {
productPage: '...' // Real product page HTML
});
// Deep validation ensures:
// ✅ All dimensions are non-negative numbers
// ✅ Rating is between 1-5
// ✅ Image URLs are valid
// ✅ Review comments have meaningful length
// ✅ Nested object structure is correct
```
**Key Features:**
- **Automatic Input Validation**: Validates before sending to LLM
- **Automatic Output Validation**: Validates LLM responses
- **Auto-Retry**: ValidationError triggers retry with corrections
- **Streaming Support**: Incremental validation during streaming
- **Nested Validation**: Works recursively through objects and arrays
- **TypeScript Safety**: Full compile-time + runtime validation
## Core Concepts
### 4. Function Calling (ReAct Pattern)
Let your AI use tools to answer questions - the ReAct (Reasoning + Acting) pattern.
```typescript
import { ax, ai, type AxFunction } from '@ax-llm/ax';
// Define available functions
const functions: AxFunction[] = [
{
name: 'getCurrentWeather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' },
units: { type: 'string', enum: ['celsius', 'fahrenheit'] }
},
required: ['location']
},
func: async ({ location, units }) => {
// Real API call would go here
return { temp: 72, condition: 'sunny', location };
}
},
{
name: 'searchNews',
description: 'Search for recent news',
parameters: {
type: 'object',
properties: {
query: { type: 'string' },
limit: { type: 'number', default: 5 }
},
required: ['query']
},
func: async ({ query, limit }) => {
return [`Breaking: ${query} news item 1`, `Update: ${query} item 2`];
}
}
];
// Create signature with functions
const assistant = ax(
'question:string -> answer:string "Detailed answer using available tools"',
{ functions }
);
const result = await assistant.forward(
ai({ name: 'openai' }),
{ question: "What's the weather like in Tokyo and any news about it?" }
);
// AI will automatically call both functions and combine results
console.log(result.answer);
// "The current weather in Tokyo is 72°F and sunny. Recent news about Tokyo includes..."
```
### 4.5. Parallel Function Calling
Execute multiple tools in parallel for complex queries.
```typescript
import { ax, ai, type AxFunction, AxAIGoogleGeminiModel } from '@ax-llm/ax';
const functions: AxFunction[] = [
{
name: 'getCurrentWeather',
description: 'get the current weather for a location',
func: async ({ location }) => ({ temperature: '22C', condition: 'Sunny' }),
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
},
{
name: 'getCurrentTime',
description: 'get the current time for a location',
func: async ({ location }) => ({ time: '14:30' }),
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
}
];
const agent = ax(
'query:string -> report:string "Comprehensive report"',
{ functions }
);
const result = await agent.forward(
ai({ name: 'google-gemini', config: { model: AxAIGoogleGeminiModel.Gemini15Pro } }),
{ query: "Compare the weather and time in Tokyo, New York, and London." }
);
// The AI will call weather and time functions for all 3 cities in parallel
console.log(result.report);
```
### 5. Streaming Responses
Stream responses for real-time user feedback.
```typescript
import { ax, ai } from '@ax-llm/ax';
const gen = ax('topic:string -> article:string "500 word article"');
// Enable streaming
const stream = await gen.streamingForward(
ai({ name: 'openai' }),
{ topic: 'The future of TypeScript' }
);
// Process chunks as they arrive
for await (const chunk of stream) {
if (chunk.article) {
process.stdout.write(chunk.article); // Real-time output
}
}
```
### 6. Multi-Step Reasoning with Examples
Improve accuracy by providing examples - few-shot learning made simple.
```typescript
import { AxGen, ai } from '@ax-llm/ax';
const analyzer = new AxGen<
{ code: string },
{ hasVulnerability: boolean; type?: string; severity?: string; suggestion?: string }
>('code:string -> hasVulnerability:boolean, type?:string, severity?:string, suggestion?:string');
// Add examples to guide the AI
analyzer.setExamples([
{
code: 'const password = "admin123"',
hasVulnerability: true,
type: 'Hardcoded Credentials',
severity: 'critical',
suggestion: 'Use environment variables for sensitive data'
},
{
code: 'const add = (a: number, b: number) => a + b',
hasVulnerability: false
}
]);
const result = await analyzer.forward(
ai({ name: 'openai' }),
{ code: 'eval(userInput)' }
);
// { hasVulnerability: true, type: "Code Injection", severity: "critical", ... }
```
## Advanced Features
### 7. Multi-Modal Processing
Process images and text together seamlessly.
```typescript
import { ax, ai, image, AxAIOpenAIModel } from '@ax-llm/ax';
const analyzer = ax(`
image:image "Product photo",
question:string ->
description:string,
mainColors:string[],
category:class "electronics, clothing, food, other",
estimatedPrice:string
`);
const result = await analyzer.forward(
ai({ name: 'openai', config: { model: AxAIOpenAIModel.GPT4O } }),
{
image: image('./product.jpg'),
question: 'What product is this and what can you tell me about it?'
}
);
```
### 8. Smart Document Processing with Chain of Thought
Process complex documents with automatic reasoning steps.
```typescript
import { ax, ai, AxAIAnthropicModel } from '@ax-llm/ax';
const processor = ax(`
document:string "Full document text",
instructions:string ->
thinking:string "Step-by-step analysis",
summary:string "Executive summary",
keyInsights:string[] "Main takeaways",
risks:string[] "Identified risks",
opportunities:string[] "Identified opportunities",
recommendedActions:string[] "Concrete next steps",
confidence:number "0-100 confidence score"
`);
const result = await processor.forward(
ai({ name: 'anthropic', config: { model: AxAIAnthropicModel.Claude35Sonnet } }),
{
document: businessPlan,
instructions: "Analyze this business plan for investment potential"
}
);
// Access the reasoning process
console.log('Analysis:', result.thinking);
console.log('Summary:', result.summary);
console.log('Confidence:', result.confidence);
```
## Production Patterns
### 9. Customer Support Agent
Complete customer support system with routing, prioritization, and response generation.
```typescript
import { ax, ai } from '@ax-llm/ax';
const supportAgent = ax(`
customerMessage:string,
customerHistory?:string "Previous interactions",
knowledgeBase?:string "Relevant KB articles" ->
intent:class "question, complaint, feedback, request",
department:class "billing, technical, sales, general",
priority:class "urgent, high, normal, low",
sentiment:number "0-10 scale",
suggestedResponse:string,
internalNotes:string "For support team",
requiresHumanReview:boolean,
tags:string[]
`);
// Add validation rules
supportAgent.addAssert(
({ priority, sentiment }) =>
!(priority === 'low' && sentiment !== undefined && sentiment < 3),
'Low sentiment should not be low priority'
);
const result = await supportAgent.forward(
ai({ name: 'openai' }),
{
customerMessage: "I've been charged twice for my subscription and need a refund immediately!",
customerHistory: "Premium customer since 2020, previous billing issue in March"
}
);
console.log(`Route to: ${result.department} (Priority: ${result.priority})`);
console.log(`Response: ${result.suggestedResponse}`);
```
### 10. Restaurant Recommendation System
Multi-criteria recommendation with function calling.
```typescript
import { ax, ai, type AxFunction } from '@ax-llm/ax';
const searchRestaurants: AxFunction = {
name: 'searchRestaurants',
description: 'Search restaurants by criteria',
parameters: {
type: 'object',
properties: {
cuisine: { type: 'string' },
priceRange: { type: 'string', enum: ['$', '$$', '$$$', '$$$$'] },
location: { type: 'string' },
features: {
type: 'array',
items: { type: 'string' },
description: 'outdoor seating, delivery, etc'
}
}
},
func: async (params) => {
// Database query would go here
return mockRestaurantData.filter(r =>
r.cuisine === params.cuisine &&
r.priceRange === params.priceRange
);
}
};
const recommender = ax(`
preferences:string "User's dining preferences",
occasion:string,
groupSize:number,
location:string ->
thinking:string "Analysis of preferences",
recommendations:object[] "Top 3 restaurants with reasons",
bestMatch:object "Single best recommendation",
alternativeOptions:string "Other cuisines to consider"
`, {
functions: [searchRestaurants, getWeather, checkAvailability]
});
const result = await recommender.forward(
ai({ name: 'openai' }),
{
preferences: "I love spicy food and outdoor dining",
occasion: "anniversary dinner",
groupSize: 2,
location: "San Francisco"
}
);
```
## Optimization & Training
### 11. Automatic Prompt Optimization
Use Bootstrap Few-Shot optimization to improve accuracy automatically.
```typescript
import { ax, ai, AxBootstrapFewShot, type AxMetricFn } from '@ax-llm/ax';
// Define your task
const classifier = ax(
'email:string -> category:class "spam, important, normal", confidence:number'
);
// Provide training examples
const trainingData = [
{ email: "Meeting at 3pm", category: "normal", confidence: 0.9 },
{ email: "WINNER! Claim prize!", category: "spam", confidence: 0.95 },
{ email: "Server is down", category: "important", confidence: 0.85 },
// ... more examples
];
// Define success metric
const metric: AxMetricFn = ({ prediction, example }) => {
const correct = prediction.category === example.category;
const confidentAndCorrect = correct && prediction.confidence > 0.8;
return confidentAndCorrect ? 1 : correct ? 0.5 : 0;
};
// Run optimization
const optimizer = new AxBootstrapFewShot({
studentAI: ai({ name: 'openai' }),
teacherAI: ai({ name: 'anthropic' }), // Optional: use stronger model as teacher
metric,
options: {
maxRounds: 5,
maxDemos: 3,
maxExamples: 100
}
});
const optimized = await optimizer.compile(classifier, trainingData);
console.log(`Improved accuracy from 65% to ${optimized.bestScore * 100}%`);
// Use optimized program
const result = await optimized.program.forward(ai({ name: 'openai' }), {
email: "System maintenance tonight"
});
```
## Agent Systems
### 12. Multi-Agent Collaboration
Build systems where specialized agents work together.
```typescript
import { AxAgent, ai } from '@ax-llm/ax';
// Specialized researcher agent
const researcher = new AxAgent({
name: 'Researcher',
description: 'Expert at finding and analyzing information',
signature: 'question:string -> research:string "Detailed findings", sources:string[]'
});
// Specialized writer agent
const writer = new AxAgent({
name: 'Writer',
description: 'Expert at creating engaging content',
signature: 'research:string, tone:string -> article:string, title:string'
});
// Specialized editor agent
const editor = new AxAgent({
name: 'Editor',
description: 'Expert at improving clarity and correctness',
signature: 'article:string -> editedArticle:string, changes:string[]'
});
// Coordinator agent that orchestrates others
const coordinator = new AxAgent({
name: 'Content Creator',
description: 'Creates high-quality articles using specialized agents',
signature: 'topic:string, style:string -> finalArticle:string, metadata:object',
agents: [researcher, writer, editor]
});
// The coordinator will automatically delegate to the right agents
const result = await coordinator.forward(
ai({ name: 'openai' }),
{
topic: 'The future of TypeScript',
style: 'technical but accessible'
}
);
console.log(result.finalArticle); // Fully researched, written, and edited article
```
### 13. Agent with Memory and Tools
Build stateful agents that remember context and use tools.
```typescript
import { AxAgent, AxMemory, ai, type AxFunction } from '@ax-llm/ax';
// Create memory store
const memory = new AxMemory();
// Define agent tools
const tools: AxFunction[] = [
{
name: 'saveNote',
description: 'Save important information for later',
parameters: {
type: 'object',
properties: {
category: { type: 'string' },
content: { type: 'string' }
}
},
func: async ({ category, content }) => {
await memory.add(category, content);
return 'Saved to memory';
}
},
{
name: 'recall',
description: 'Recall previously saved information',
parameters: {
type: 'object',
properties: {
category: { type: 'string' },
query: { type: 'string' }
}
},
func: async ({ category, query }) => {
return await memory.search(category, query);
}
}
];
const assistant = new AxAgent({
name: 'Personal Assistant',
description: 'Helps manage tasks and remember important information',
signature: 'message:string, userId:string -> response:string, actionsTaken:string[]',
functions: tools,
memory
});
// First interaction
await assistant.forward(ai({ name: 'openai' }), {
message: "Remember that my favorite color is blue",
userId: "user123"
});
// Later interaction - agent remembers
const result = await assistant.forward(ai({ name: 'openai' }), {
message: "What's my favorite color?",
userId: "user123"
});
// "Your favorite color is blue"
```
## Workflow Orchestration
### 14. AxFlow - Complex Pipeline
Build sophisticated data processing pipelines with AxFlow.
```typescript
import { AxFlow, ax, ai } from '@ax-llm/ax';
// Create a content moderation pipeline
const pipeline = new AxFlow()
// Step 1: Analyze content
.addNode('analyzer', ax(`
content:string ->
hasPII:boolean "Contains personal information",
hasProfanity:boolean,
toxicityScore:number "0-100",
topics:string[]
`))
// Step 2: Redact sensitive info (only if needed)
.addNode('redactor', ax(`
content:string,
hasPII:boolean ->
redactedContent:string,
redactedItems:string[]
`))
// Step 3: Generate moderation decision
.addNode('moderator', ax(`
content:string,
toxicityScore:number,
hasProfanity:boolean ->
decision:class "approve, flag, reject",
reason:string,
suggestedAction:string
`))
// Define the flow
.flow(({ content }) => ({
analyzer: { content },
redactor: {
content,
hasPII: '{{analyzer.hasPII}}'
},
moderator: {
content: '{{redactor.redactedContent}}',
toxicityScore: '{{analyzer.toxicityScore}}',
hasProfanity: '{{analyzer.hasProfanity}}'
}
}));
const result = await pipeline.run(
ai({ name: 'openai' }),
{ content: "John Smith (SSN: 123-45-6789) posted offensive content" }
);
console.log(result.moderator.decision); // "reject"
console.log(result.redactor.redactedItems); // ["SSN: XXX-XX-XXXX"]
```
### 15. Parallel Processing with Map-Reduce
Process multiple items in parallel and aggregate results.
```typescript
import { AxFlow, ax, ai } from '@ax-llm/ax';
const flow = new AxFlow()
// Map: Process each item in parallel
.map('processor', ax(`
item:object ->
processed:object,
quality:number,
issues:string[]
`))
// Reduce: Aggregate all results
.reduce('aggregator', ax(`
results:object[] ->
summary:string,
totalQuality:number,
allIssues:string[],
recommendations:string[]
`));
const items = [
{ id: 1, data: 'Item 1 data' },
{ id: 2, data: 'Item 2 data' },
{ id: 3, data: 'Item 3 data' }
];
const result = await flow.run(
ai({ name: 'openai' }),
{ items }
);
console.log(`Processed ${items.length} items`);
console.log(`Average quality: ${result.totalQuality / items.length}`);
```
## Running Examples
All examples are in the `src/examples/` directory. To run any example:
```bash
# Set your API key
export OPENAI_APIKEY=your-key-here
# Or for other providers:
export ANTHROPIC_APIKEY=your-key
export GOOGLE_APIKEY=your-key
# Run an example
npm run tsx ./src/examples/summarize.ts
```
## Complete Examples Reference
Below is a comprehensive list of all available examples in [`src/examples/`](https://github.com/ax-llm/ax/tree/main/src/examples), organized by category.
### Basic Concepts
- **[chat.ts](https://github.com/ax-llm/ax/tree/main/src/examples/chat.ts)** - Simple chat interface demonstrating basic conversation flow
- **[simple-classify.ts](https://github.com/ax-llm/ax/tree/main/src/examples/simple-classify.ts)** - Basic classification example
- **[extract.ts](https://github.com/ax-llm/ax/tree/main/src/examples/extract.ts)** - Extract structured data from unstructured text
- **[extract-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/extract-test.ts)** - Testing extraction capabilities
- **[summarize.ts](https://github.com/ax-llm/ax/tree/main/src/examples/summarize.ts)** - Document summarization with key insights
- **[marketing.ts](https://github.com/ax-llm/ax/tree/main/src/examples/marketing.ts)** - Marketing content generation
- **[embed.ts](https://github.com/ax-llm/ax/tree/main/src/examples/embed.ts)** - Text embeddings and vector operations
### Signatures & Type Safety
- **[fluent-signature-example.ts](https://github.com/ax-llm/ax/tree/main/src/examples/fluent-signature-example.ts)** - Using the fluent API for signature definition
- **[signature-tool-calling.ts](https://github.com/ax-llm/ax/tree/main/src/examples/signature-tool-calling.ts)** - Combining signatures with tool calling
- **[structured_output.ts](https://github.com/ax-llm/ax/tree/main/src/examples/structured_output.ts)** - Complex structured output with validation
- **[debug_schema.ts](https://github.com/ax-llm/ax/tree/main/src/examples/debug_schema.ts)** - Debugging JSON schema generation
### Function Calling & Tools
- **[function.ts](https://github.com/ax-llm/ax/tree/main/src/examples/function.ts)** - Basic function calling (ReAct pattern)
- **[food-search.ts](https://github.com/ax-llm/ax/tree/main/src/examples/food-search.ts)** - Restaurant search with multi-step reasoning
- **[smart-home.ts](https://github.com/ax-llm/ax/tree/main/src/examples/smart-home.ts)** - Smart home control with multiple devices
- **[stop-function.ts](https://github.com/ax-llm/ax/tree/main/src/examples/stop-function.ts)** - Controlling function execution flow
- **[function-result-formatter.ts](https://github.com/ax-llm/ax/tree/main/src/examples/function-result-formatter.ts)** - Formatting function call results
- **[function-result-picker.ts](https://github.com/ax-llm/ax/tree/main/src/examples/function-result-picker.ts)** - Selecting best function results
- **[result-picker.ts](https://github.com/ax-llm/ax/tree/main/src/examples/result-picker.ts)** - Advanced result selection strategies
### Streaming
- **[streaming.ts](https://github.com/ax-llm/ax/tree/main/src/examples/streaming.ts)** - Basic streaming responses
- **[streaming-asserts.ts](https://github.com/ax-llm/ax/tree/main/src/examples/streaming-asserts.ts)** - Streaming with real-time validation
### Assertions & Validation
- **[asserts.ts](https://github.com/ax-llm/ax/tree/main/src/examples/asserts.ts)** - Using assertions for output validation
- **[sample-count.ts](https://github.com/ax-llm/ax/tree/main/src/examples/sample-count.ts)** - Controlling sampling and retries
### Agent Systems
- **[agent.ts](https://github.com/ax-llm/ax/tree/main/src/examples/agent.ts)** - Basic agent implementation
- **[agent-migration-example.ts](https://github.com/ax-llm/ax/tree/main/src/examples/agent-migration-example.ts)** - Migrating to the agent pattern
- **[customer-support.ts](https://github.com/ax-llm/ax/tree/main/src/examples/customer-support.ts)** - Complete customer support agent
- **[meetings.ts](https://github.com/ax-llm/ax/tree/main/src/examples/meetings.ts)** - Meeting assistant with scheduling
### Workflow Orchestration (AxFlow)
- **[ax-flow.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow.ts)** - Comprehensive AxFlow demonstration
- **[ax-flow-enhanced-demo.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-enhanced-demo.ts)** - Advanced flow patterns
- **[ax-flow-async-map.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-async-map.ts)** - Parallel processing with map
- **[ax-flow-auto-parallel.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-auto-parallel.ts)** - Automatic parallelization
- **[ax-flow-map-merge-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-map-merge-test.ts)** - Map-reduce patterns
- **[ax-flow-signature-inference.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-signature-inference.ts)** - Type inference in flows
- **[ax-flow-to-function.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-flow-to-function.ts)** - Converting flows to functions
- **[fluent-flow-example.ts](https://github.com/ax-llm/ax/tree/main/src/examples/fluent-flow-example.ts)** - Fluent API for flows
- **[flow-type-inference-demo.ts](https://github.com/ax-llm/ax/tree/main/src/examples/flow-type-inference-demo.ts)** - Type safety in flows
- **[flow-type-safe-output.ts](https://github.com/ax-llm/ax/tree/main/src/examples/flow-type-safe-output.ts)** - Type-safe flow outputs
- **[flow-logging-simple.ts](https://github.com/ax-llm/ax/tree/main/src/examples/flow-logging-simple.ts)** - Simple flow logging
- **[flow-verbose-logging.ts](https://github.com/ax-llm/ax/tree/main/src/examples/flow-verbose-logging.ts)** - Detailed flow debugging
### Optimization & Training
- **[teacher-student-optimization.ts](https://github.com/ax-llm/ax/tree/main/src/examples/teacher-student-optimization.ts)** - MiPRO teacher-student optimization
- **[mipro-python-optimizer.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mipro-python-optimizer.ts)** - MiPRO with Python backend
- **[gepa.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gepa.ts)** - GEPA optimizer basics
- **[gepa-flow.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gepa-flow.ts)** - GEPA with workflows
- **[gepa-train-inference.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gepa-train-inference.ts)** - GEPA training and inference
- **[gepa-quality-vs-speed-optimization.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gepa-quality-vs-speed-optimization.ts)** - Multi-objective optimization
- **[ace-train-inference.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ace-train-inference.ts)** - ACE optimizer demonstration
- **[simple-optimizer-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/simple-optimizer-test.ts)** - Basic optimizer testing
- **[optimizer-metrics.ts](https://github.com/ax-llm/ax/tree/main/src/examples/optimizer-metrics.ts)** - Optimization metrics tracking
- **[use-examples.ts](https://github.com/ax-llm/ax/tree/main/src/examples/use-examples.ts)** - Using examples for few-shot learning
### Multi-Modal & Vision
- **[multi-modal.ts](https://github.com/ax-llm/ax/tree/main/src/examples/multi-modal.ts)** - Basic multi-modal processing
- **[multi-modal-abstraction.ts](https://github.com/ax-llm/ax/tree/main/src/examples/multi-modal-abstraction.ts)** - Advanced multi-modal patterns
- **[image-arrays-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/image-arrays-test.ts)** - Processing multiple images
- **[image-arrays-multi-provider-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/image-arrays-multi-provider-test.ts)** - Multi-provider image handling
- **[audio-arrays-test.ts](https://github.com/ax-llm/ax/tree/main/src/examples/audio-arrays-test.ts)** - Audio processing
### RAG & Document Processing
- **[rag-docs.ts](https://github.com/ax-llm/ax/tree/main/src/examples/rag-docs.ts)** - Basic RAG implementation
- **[advanced-rag.ts](https://github.com/ax-llm/ax/tree/main/src/examples/advanced-rag.ts)** - Advanced RAG patterns
- **[vectordb.ts](https://github.com/ax-llm/ax/tree/main/src/examples/vectordb.ts)** - Vector database integration
- **[codingWithMemory.ts](https://github.com/ax-llm/ax/tree/main/src/examples/codingWithMemory.ts)** - Code generation with memory
### Provider-Specific Examples
#### Anthropic (Claude)
- **[anthropic-thinking-function.ts](https://github.com/ax-llm/ax/tree/main/src/examples/anthropic-thinking-function.ts)** - Extended thinking with function calls
- **[anthropic-thinking-separation.ts](https://github.com/ax-llm/ax/tree/main/src/examples/anthropic-thinking-separation.ts)** - Separating thinking from output
- **[anthropic-web-search.ts](https://github.com/ax-llm/ax/tree/main/src/examples/anthropic-web-search.ts)** - Web search with Claude
- **[test-anthropic-cache.ts](https://github.com/ax-llm/ax/tree/main/src/examples/test-anthropic-cache.ts)** - Prompt caching with Anthropic
#### Google Gemini
- **[gemini-file-support.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gemini-file-support.ts)** - File uploads with Gemini
- **[gemini-google-maps.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gemini-google-maps.ts)** - Google Maps integration
- **[gemini-empty-params-function.ts](https://github.com/ax-llm/ax/tree/main/src/examples/gemini-empty-params-function.ts)** - Functions without parameters
- **[vertex-auth-example.ts](https://github.com/ax-llm/ax/tree/main/src/examples/vertex-auth-example.ts)** - Vertex AI authentication
#### OpenAI
- **[openai-responses.ts](https://github.com/ax-llm/ax/tree/main/src/examples/openai-responses.ts)** - OpenAI response handling
- **[openai-web-search.ts](https://github.com/ax-llm/ax/tree/main/src/examples/openai-web-search.ts)** - Web search with OpenAI
- **[reasoning-o3-example.ts](https://github.com/ax-llm/ax/tree/main/src/examples/reasoning-o3-example.ts)** - O3 reasoning model
#### Other Providers
- **[grok-live-search.ts](https://github.com/ax-llm/ax/tree/main/src/examples/grok-live-search.ts)** - Grok with live search
- **[openrouter.ts](https://github.com/ax-llm/ax/tree/main/src/examples/openrouter.ts)** - OpenRouter integration
### MCP (Model Context Protocol)
- **[mcp-client-memory.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mcp-client-memory.ts)** - MCP memory server integration
- **[mcp-client-blender.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mcp-client-blender.ts)** - Blender MCP integration
- **[mcp-client-pipedream.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mcp-client-pipedream.ts)** - Pipedream MCP integration
- **[mcp-client-notion-http-oauth.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mcp-client-notion-http-oauth.ts)** - Notion MCP with HTTP OAuth
- **[mcp-client-notion-sse-oauth.ts](https://github.com/ax-llm/ax/tree/main/src/examples/mcp-client-notion-sse-oauth.ts)** - Notion MCP with SSE OAuth
### Advanced Patterns
- **[react.ts](https://github.com/ax-llm/ax/tree/main/src/examples/react.ts)** - ReAct (Reasoning + Acting) pattern
- **[prime.ts](https://github.com/ax-llm/ax/tree/main/src/examples/prime.ts)** - Prime number generation with reasoning
- **[fibonacci.ts](https://github.com/ax-llm/ax/tree/main/src/examples/fibonacci.ts)** - Fibonacci sequence generation
- **[show-thoughts.ts](https://github.com/ax-llm/ax/tree/main/src/examples/show-thoughts.ts)** - Displaying model reasoning
- **[checkpoint-recovery.ts](https://github.com/ax-llm/ax/tree/main/src/examples/checkpoint-recovery.ts)** - Checkpointing and recovery
- **[balancer.ts](https://github.com/ax-llm/ax/tree/main/src/examples/balancer.ts)** - Load balancing across models
- **[ax-multiservice-router.ts](https://github.com/ax-llm/ax/tree/main/src/examples/ax-multiservice-router.ts)** - Routing between multiple AI services
### Monitoring & Debugging
- **[debug-logging.ts](https://github.com/ax-llm/ax/tree/main/src/examples/debug-logging.ts)** - Debug logging configuration
- **[telemetry.ts](https://github.com/ax-llm/ax/tree/main/src/examples/telemetry.ts)** - Telemetry and observability
- **[metrics-export.ts](https://github.com/ax-llm/ax/tree/main/src/examples/metrics-export.ts)** - Exporting metrics
### Abort & Control Flow
- **[abort-simple.ts](https://github.com/ax-llm/ax/tree/main/src/examples/abort-simple.ts)** - Simple abort handling
- **[abort-patterns.ts](https://github.com/ax-llm/ax/tree/main/src/examples/abort-patterns.ts)** - Advanced abort patterns
### Web & Browser
- **[web-chat.html](https://github.com/ax-llm/ax/tree/main/src/examples/web-chat.html)** - Browser-based chat interface
- **[webllm-chat.html](https://github.com/ax-llm/ax/tree/main/src/examples/webllm-chat.html)** - WebLLM browser integration
- **[cors-proxy.js](https://github.com/ax-llm/ax/tree/main/src/examples/cors-proxy.js)** - CORS proxy for browser usage
### Deployment
- **[docker.ts](https://github.com/ax-llm/ax/tree/main/src/examples/docker.ts)** - Docker deployment example
## Best Practices
1. **Start Simple**: Begin with basic signatures, add complexity as needed
2. **Use Types**: Leverage TypeScript's type system for safety
3. **Add Assertions**: Validate outputs to ensure quality
4. **Provide Examples**: Few-shot examples dramatically improve accuracy
5. **Optimize When Needed**: Use BootstrapFewShot for production accuracy
6. **Handle Errors**: Always wrap in try-catch for production
7. **Stream for UX**: Use streaming for better user experience
8. **Monitor Performance**: Use built-in telemetry for observability
## Next Steps
- [Read the DSPy Concepts](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/DSPY.md) to understand the theory
- [Explore the API Reference](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/API.md) for detailed documentation
- [Join our Discord](https://discord.gg/DSHg3dU7dW) for help and discussions
- [Star us on GitHub](https://github.com/ax-llm/ax) if you find Ax useful!
================================================================================
# ACE Guide
# Source: ACE.md
# Advanced ACE framework capabilities
# Agentic Context Engineering (ACE)
ACE (Agentic Context Engineering) provides a structured approach to evolving AI program context through iterative refinement loops. Unlike traditional prompt optimization, ACE maintains a persistent, structured "playbook" that grows and adapts over time.
## Table of Contents
- [What is ACE?](#what-is-ace)
- [When to Use ACE](#when-to-use-ace)
- [How ACE Works](#how-ace-works)
- [Quick Start](#quick-start)
- [Online Adaptation](#online-adaptation)
- [Understanding ACE Components](#understanding-ace-components)
- [Customizing ACE Prompts](#customizing-ace-prompts)
- [Complete Working Example](#complete-working-example)
- [Best Practices](#best-practices)
## What is ACE?
**The Problem**: Iteratively rewriting a giant system prompt causes brevity bias and context collapse—hard-won strategies disappear after a few updates. You need a way to grow and refine a durable playbook both offline and online.
**The Solution**: Use `AxACE`, an optimizer that mirrors the ACE paper's Generator → Reflector → Curator loop. It represents context as structured bullets, applies incremental deltas, and returns a serialized playbook you can save, load, and keep updating at inference time.
## When to Use ACE
✅ **Perfect for:**
- Programs that need to learn from ongoing feedback
- Systems requiring structured, evolving knowledge bases
- Tasks where context needs to persist and grow over time
- Scenarios with incremental learning from production data
- Cases where prompt brevity bias is a concern
❌ **Skip for:**
- Simple classification tasks (use MiPRO instead)
- One-time optimizations without ongoing updates
- Tasks that don't benefit from structured memory
- Quick prototypes needing fast results
## How ACE Works
ACE implements a three-component loop:
1. **Generator**: Your program that performs the task
2. **Reflector**: Analyzes generator performance and identifies improvements
3. **Curator**: Updates the playbook with structured, incremental changes
The playbook is represented as structured bullets organized into sections, allowing for targeted updates without context collapse.
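To make the structure concrete, here is a minimal sketch of what a playbook object might look like. Only the `sections` map (an object of section names to bullet arrays) is grounded in this guide; the bullet field names (`id`, `content`, `tag`) are illustrative assumptions, not the exact AxACE serialization.

```typescript
// Hypothetical playbook shape; bullet fields are illustrative assumptions.
type Bullet = { id: string; content: string; tag?: string };
type Playbook = { sections: Record<string, Bullet[]> };

const playbook: Playbook = {
  sections: {
    "severity-rules": [
      { id: "b1", content: "Global outages are high severity", tag: "helpful" },
    ],
    "edge-cases": [
      { id: "b2", content: "VIP single-user incidents may still be high" },
    ],
  },
};

// Summarize growth per section.
for (const [section, bullets] of Object.entries(playbook.sections)) {
  console.log(`- ${section}: ${bullets.length} bullets`);
}
```

Because updates are appended as bullets to targeted sections rather than rewrites of one monolithic prompt, individual insights survive across many curator passes.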
## Quick Start
### Step 1: Define Your Program
```typescript
import { ax, AxAI, AxACE, type AxMetricFn, f, AxAIOpenAIModel } from "@ax-llm/ax";
const student = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini },
});
const teacher = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4O },
});
const classifier = ax(
'ticket:string "Support ticket text" -> severity:class "low, medium, high" "Incident severity"'
);
classifier.setDescription(
"Classify the severity of the support ticket and explain your reasoning."
);
```
### Step 2: Provide Training Examples
```typescript
const examples = [
{ ticket: "Billing portal returns 502 errors globally.", severity: "high" },
{ ticket: "UI misaligned on Safari but usable.", severity: "low" },
{ ticket: "Checkout intermittently drops vouchers.", severity: "medium" },
];
```
### Step 3: Define Success Metric
```typescript
const metric: AxMetricFn = ({ prediction, example }) =>
prediction.severity === example.severity ? 1 : 0;
```
### Step 4: Run ACE Optimization
```typescript
const optimizer = new AxACE(
{ studentAI: student, teacherAI: teacher, verbose: true },
{ maxEpochs: 2 }
);
console.log('🚀 Running ACE offline optimization...');
const result = await optimizer.compile(classifier, examples, metric);
// Apply the optimized playbook
result.optimizedProgram?.applyTo(classifier);
console.log(`✅ Optimization complete!`);
console.log(`Score: ${result.optimizedProgram?.bestScore.toFixed(3)}`);
```
### Step 5: Save and Load the Playbook
```typescript
import fs from "node:fs/promises";
// Save the structured playbook
await fs.writeFile(
"ace-playbook.json",
JSON.stringify(result.artifact.playbook, null, 2)
);
// Later, load the playbook
const loadedPlaybook = JSON.parse(
await fs.readFile("ace-playbook.json", "utf8")
);
const onlineOptimizer = new AxACE(
{ studentAI: student, teacherAI: teacher },
{ initialPlaybook: loadedPlaybook }
);
```
## Online Adaptation
ACE's key feature is online learning—updating the playbook based on real-world feedback.
```typescript
// New example from production
const newTicket = {
ticket: "VIP equities desk reports quote stream silent",
severity: "high",
};
// Get prediction
const prediction = await classifier.forward(student, newTicket);
// Apply online update with feedback
const curatorDelta = await optimizer.applyOnlineUpdate({
example: newTicket,
prediction,
feedback: "Escalation confirmed SEV-1. Reward guidance about VIP customer clauses.",
});
if (curatorDelta?.operations?.length) {
console.log(`Added ${curatorDelta.operations.length} new playbook bullets`);
}
```
## Understanding ACE Components
### Generator
ACE uses the program you pass into `optimizer.compile(...)` as the Generator. You own the signature and the base system instruction—ACE simply appends the evolving playbook when it calls `forward`.
```typescript
const generatorSig = f()
.input('ticket', f.string('Concise incident summary'))
.input('impact', f.string('Observed customer or business impact'))
.input('scope', f.string('Reported scope of the issue'))
.input('signals', f.string('Supporting telemetry or operational signals'))
.output('severity', f.class(['low', 'medium', 'high'], 'Incident severity label'))
.output('reasoning', f.string('Brief rationale referencing internal incident policy'))
.build();
const generator = ax(generatorSig);
generator.setDescription(`You are doing first-pass incident triage ...`);
```
At compile time, ACE stitches the playbook beneath whatever instruction you provide.
### Reflector
The reflector program is generated lazily inside `AxACE`. Its schema is:
```typescript
const reflector = ax(
`
question:string "Original task input serialized as JSON",
generator_answer:string "Generator output serialized as JSON",
generator_reasoning?:string "Generator reasoning trace",
playbook:string "Current context playbook rendered as markdown",
expected_answer?:string "Expected output when ground truth is available",
feedback?:string "External feedback or reward signal",
previous_reflection?:string "Most recent reflection JSON when running multi-round refinement" ->
reasoning:string "Step-by-step analysis of generator performance",
errorIdentification:string "Specific mistakes detected",
rootCauseAnalysis:string "Underlying cause of the error",
correctApproach:string "What the generator should do differently",
keyInsight:string "Reusable insight to remember",
bulletTags:json "Array of {id, tag} entries referencing playbook bullets"
`,
);
```
### Curator
The curator schema tracks the paper's delta-output contract:
```typescript
const curator = ax(
`
playbook:string "Current playbook serialized as JSON",
reflection:string "Latest reflection output serialized as JSON",
question_context:string "Original task input serialized as JSON",
token_budget?:number "Approximate token budget for curator response" ->
reasoning:string "Justification for the proposed updates",
operations:json "List of operations with type/section/content fields"
`,
);
```
## Customizing ACE Prompts
By default Ax synthesizes prompts from the signatures. If you want to drop in custom prompts (e.g., from the ACE paper's Appendix D), you can override them:
### Customizing Reflector Prompt
```typescript
const optimizer = new AxACE({ studentAI, teacherAI });
const reflector = (optimizer as any).getOrCreateReflectorProgram?.call(optimizer);
reflector?.setDescription(myCustomReflectorPrompt);
```
### Customizing Curator Prompt
```typescript
const curator = (optimizer as any).getOrCreateCuratorProgram?.call(optimizer);
curator?.setDescription(myCustomCuratorPrompt);
```
### Where to Hook In
Until dedicated setter methods land, reaching the underlying programs through the internal `getOrCreateReflectorProgram` / `getOrCreateCuratorProgram` methods (as shown above) is the recommended approach.
📌 **Tip:** The reflector and curator signatures live in `src/ax/dsp/optimizers/ace.ts`. Search for `getOrCreateReflectorProgram` and `getOrCreateCuratorProgram` if you need to track future changes.
## Complete Working Example
> **📖 Full Example**: `src/examples/ace-train-inference.ts` demonstrates offline training plus an online adaptation pass.
```typescript
import { ax, AxAI, AxACE, type AxMetricFn, f, AxAIOpenAIModel } from "@ax-llm/ax";
import fs from "node:fs/promises";
async function run() {
const student = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4OMini },
});
const teacher = new AxAI({
name: "openai",
apiKey: process.env.OPENAI_APIKEY!,
config: { model: AxAIOpenAIModel.GPT4O },
});
const signatureSource = f()
.input("ticket", f.string("Concise incident summary"))
.input("impact", f.string("Observed customer or business impact"))
.input("scope", f.string("Reported scope of the issue"))
.input("signals", f.string("Supporting telemetry or operational signals"))
.output("severity", f.class(["low", "medium", "high"], "Incident severity label"))
.output("reasoning", f.string("Brief rationale referencing internal incident policy"))
.build()
.toString();
const baseInstruction = `You are doing first-pass incident triage. Use the table below and do not deviate from it.
- single-user -> low
- regional -> medium
- global -> high
- internal -> low`;
const program = ax(signatureSource);
program.setDescription(baseInstruction);
const trainExamples = [
{
ticket: "Fraud rules flag 80% of card transactions in CA region",
impact: "Legitimate purchases blocked for many customers",
scope: "regional",
signals: "Chargeback rate flat, ruleset pushed 10 minutes ago",
severity: "high",
},
{
ticket: "Global search results delayed during planned reindex",
impact: "Catalog searchable but updates appear 20 minutes late",
scope: "global",
signals: "Maintenance ticket CAB-512 approved, no customer complaints",
severity: "medium",
},
];
const metric: AxMetricFn = ({ prediction, example }) =>
(prediction as any).severity === (example as any).severity ? 1 : 0;
const optimizer = new AxACE(
{ studentAI: student, teacherAI: teacher, verbose: true },
{ maxEpochs: 2, allowDynamicSections: true }
);
console.log("\n🚀 Running ACE offline optimization...");
const result = await optimizer.compile(program, trainExamples, metric, {
aceOptions: { maxEpochs: 2 },
});
const optimizedProgram = ax(signatureSource);
optimizedProgram.setDescription(baseInstruction);
result.optimizedProgram?.applyTo(optimizedProgram);
console.log(`✅ ACE produced ${result.artifact.history.length} curator updates`);
// Save playbook
await fs.writeFile(
"ace-playbook.json",
JSON.stringify(result.artifact.playbook, null, 2)
);
// Online update
const newTicket = {
ticket: "VIP equities desk reports quote stream silent",
impact: "Tier-1 customer cannot trade; contractual penalties kick in soon",
scope: "single-user",
signals: "Quote service returns 503 for client subnet",
severity: "high",
};
const prediction = await optimizedProgram.forward(student, newTicket);
console.log("\n🧠 Applying online update...");
const curatorDelta = await optimizer.applyOnlineUpdate({
example: newTicket,
prediction,
feedback: "Escalation confirmed SEV-1. Reward guidance about VIP clauses.",
});
if (curatorDelta?.operations?.length) {
console.log(`Added ${curatorDelta.operations.length} new playbook bullets`);
}
}
run().catch((error) => {
console.error("💥 ACE example failed", error);
process.exit(1);
});
```
## Best Practices
### 1. Start with Clear Base Instructions
Provide a clear, structured base instruction for your generator. ACE will augment it, not replace it.
```typescript
const baseInstruction = `You are doing first-pass incident triage. Use the table below:
- single-user -> low
- regional -> medium
- global -> high
- internal -> low`;
program.setDescription(baseInstruction);
```
### 2. Use Structured Examples
Provide diverse, well-structured training examples that cover edge cases.
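As a quick sanity check, it can help to verify that every output class is represented in your training set. The tickets and labels below are made up for this sketch:

```typescript
// Illustrative training set; tickets and labels are invented for this sketch.
const trainExamples = [
  { ticket: "Payments API down for all regions", severity: "high" },
  { ticket: "Single user cannot reset password", severity: "low" },
  { ticket: "Checkout slow in EU during peak hours", severity: "medium" },
  { ticket: "Internal dashboard typo", severity: "low" },
];

// Coverage check: every severity class should appear at least once.
const coverage = new Set(trainExamples.map((e) => e.severity));
console.log([...coverage].sort()); // [ 'high', 'low', 'medium' ]
```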
### 3. Meaningful Feedback for Online Updates
When doing online updates, provide clear, actionable feedback:
```typescript
const feedback = "Escalation confirmed SEV-1. Reward guidance about VIP customer clauses.";
```
### 4. Monitor Playbook Growth
Periodically review your playbook to ensure it's growing in useful directions:
```typescript
const playbook = result.artifact.playbook;
console.log("\n📘 Learned playbook sections:");
for (const [section, bullets] of Object.entries(playbook.sections)) {
console.log(`- ${section}: ${bullets.length} bullets`);
}
```
### 5. Save Playbooks for Future Sessions
Always save your optimized playbooks—they represent learned knowledge:
```typescript
await fs.writeFile(
"ace-playbook.json",
JSON.stringify(result.artifact.playbook, null, 2)
);
```
### 6. Combine Offline and Online Learning
Use offline optimization for initial training, then continue with online updates in production:
```typescript
// Offline: Initial training
const result = await optimizer.compile(program, trainExamples, metric);
// Online: Continuous improvement
const delta = await optimizer.applyOnlineUpdate({ example, prediction, feedback });
```
## Why ACE Matters
- **Structured memory**: Playbooks of tagged bullets persist across runs
- **Incremental updates**: Curator operations apply as deltas, so context never collapses
- **Offline + Online**: Same optimizer supports batch training and per-sample updates
- **Unified artifacts**: `AxACEOptimizedProgram` extends `AxOptimizedProgramImpl`, so you can save/load/apply like MiPRO or GEPA
## See Also
- [OPTIMIZE.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/OPTIMIZE.md) - Main optimization guide
- [MIPRO.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/MIPRO.md) - MiPRO optimizer documentation
- [GEPA.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/GEPA.md) - Multi-objective optimization
- `src/examples/ace-train-inference.ts` - Complete working example
================================================================================
# AxGen Guide
# Source: AXGEN.md
# The programmable unit of Ax for building AI workflows
# AxGen Guide
`AxGen` is the core programmable unit in Ax. It represents a single step in an
AI workflow, encapsulating a signature (input/output definition), a prompt
template, and execution logic (including retries, streaming, and assertions).
`AxGen` is designed to be composable, allowing you to build complex workflows by
chaining multiple `AxGen` instances together or using them within `AxFlow`.
## Creating an AxGen Instance
To create an `AxGen` instance, you need a **Signature**. A signature defines the
input fields and output fields for the generation task.
```typescript
import { ax } from "@ax-llm/ax";
const gen = ax(
`input:string -> output:string, reasoning:string`,
);
```
You can also use `s()` for reusable signatures:
```typescript
import { ax, s } from "@ax-llm/ax";
const sig = s(`question:string, context:string[] -> answer:string`);
const gen = ax(sig);
```
### Options
The `ax()` factory accepts an optional configuration object:
```typescript
const gen = ax("input:string -> output:string", {
  description: "A helpful assistant", // Description for the prompt
  maxRetries: 3, // Default retries for assertions/validation
  maxSteps: 10, // Max steps for multi-step generation
  temperature: 0.7, // Default model temperature (can be overridden)
  fastFail: false, // If true, fail immediately on error
  debug: false, // Enable debug logging
});
```
## Running AxGen
To run an `AxGen` instance, you use the `forward` method. This method sends the
request to the AI service and processes the response.
### Passing an AI Service
You must pass an AI service instance (from `ai()`) to `forward`.
```typescript
import { ai } from "@ax-llm/ax";
const llm = ai({
name: "openai",
apiKey: process.env.OPENAI_APIKEY,
config: { model: "gpt-4o" },
});
const result = await gen.forward(llm, { input: "Hello world" });
console.log(result.output);
```
### Options for `forward`
The `forward` method accepts an options object as the third argument, allowing
you to override defaults and configure per-request behavior.
```typescript
const result = await gen.forward(llm, { input: "..." }, {
// Execution Control
maxRetries: 5, // Override default max retries
stopFunction: "stop", // Custom stop function name
// AI Configuration
model: "gpt-4.1", // Override model for this call
modelConfig: {
temperature: 0.9,
maxTokens: 1000,
},
// Retry Configuration (Low-level)
retry: {
maxRetries: 3,
backoffFactor: 2,
maxDelayMs: 30000,
},
// Debugging
debug: true, // Enable debug logging for this call
traceLabel: "custom-trace",
});
```
## Stopping AxGen
`AxGen` supports two cancellation paths for in-flight `forward()` and `streamingForward()` calls:
- `stop()` on the generator instance
- `abortSignal` in per-call options
Both paths throw `AxAIServiceAbortedError` so you can handle cancellation consistently. `stop()` aborts all in-flight calls started from the same `AxGen` instance, including retry backoff waits.
```typescript
import { AxAIServiceAbortedError, ai, ax } from "@ax-llm/ax";
const llm = ai({ name: "openai", apiKey: process.env.OPENAI_APIKEY! });
const gen = ax("topic:string -> summary:string");
const timer = setTimeout(() => gen.stop(), 3_000);
try {
const result = await gen.forward(llm, { topic: "Long document" }, {
abortSignal: AbortSignal.timeout(10_000),
});
console.log(result.summary);
} catch (err) {
if (err instanceof AxAIServiceAbortedError) {
console.log("Generation was aborted");
} else {
throw err;
}
} finally {
clearTimeout(timer);
}
```
## Streaming
`AxGen` supports streaming responses, which is useful for real-time
applications.
### Using `streamingForward`
Use `streamingForward` to get an async generator that yields partial results.
```typescript
const stream = gen.streamingForward(llm, { input: "Write a long story" });
for await (const chunk of stream) {
// chunk contains partial deltas and the current accumulated state
if (chunk.delta.output) {
process.stdout.write(chunk.delta.output);
}
}
```
The `chunk` object contains:
- `delta`: The partial change in this update (e.g., newly generated tokens).
- `partial`: The full accumulated value so far.
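The invariant between the two fields is that concatenating every `delta` reproduces the final `partial`. The chunks below are mocked to illustrate this relationship; a real stream yields the same two fields per output key:

```typescript
// Mocked chunks illustrating the delta/partial relationship.
type Chunk = { delta: { output?: string }; partial: { output?: string } };

const chunks: Chunk[] = [
  { delta: { output: "Once " }, partial: { output: "Once " } },
  { delta: { output: "upon " }, partial: { output: "Once upon " } },
  { delta: { output: "a time" }, partial: { output: "Once upon a time" } },
];

let accumulated = "";
for (const chunk of chunks) {
  if (chunk.delta.output) accumulated += chunk.delta.output;
}
console.log(accumulated === chunks[chunks.length - 1].partial.output); // true
```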
## Structured Outputs
`AxGen` automatically handles structured outputs based on your signature. If
your output signature contains types other than string (like specific classes,
arrays, or JSON objects), `AxGen` will instruct the LLM to produce JSON and
strict type validation will be applied.
```typescript
const gen = new AxGen<
  { topic: string },
  { tags: string[]; sentiment: "pos" | "neg" }
>(
  `topic:string -> tags:string[], sentiment:class "pos, neg"`,
);
const result = await gen.forward(llm, { topic: "Ax Framework" });
// result.tags is string[]
// result.sentiment is 'pos' | 'neg'
```
## Assertions and Validation
You can add assertions to `AxGen` to validate the output. If an assertion fails,
`AxGen` can automatically retry with error feedback (self-correction).
```typescript
gen.addAssert(
(args) => args.output.length > 50,
"Output must be at least 50 characters long",
);
// Streaming assertions work on partial updates
gen.addStreamingAssert(
"output",
(text) => !text.includes("forbidden"),
"Output contains forbidden text",
);
```
## Field Processors
Field processors allow you to transform or process output field values during or
after generation. They are useful for post-processing, logging, or real-time
feedback.
### Post-Generation Field Processors
Use `addFieldProcessor` to transform a field value after generation completes:
```typescript
const gen = new AxGen("document:string -> summary:string, keywords:string[]");
// Transform the summary to uppercase
gen.addFieldProcessor("summary", (value, context) => {
return value.toUpperCase();
});
// Process keywords array
gen.addFieldProcessor("keywords", (value, context) => {
// Filter out short keywords
return value.filter((kw: string) => kw.length > 3);
});
```
The context object provides:
- `values`: All output field values
- `sessionId`: Current session ID (if provided)
- `done`: Whether generation is complete
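One common use of the context is gating a transformation on completion. The sketch below mocks the context value (in practice AxGen supplies it) and applies a transform only once `done` is true:

```typescript
// Sketch of a processor that finalizes a field only when generation is done.
// The context objects here are mocked for illustration.
type ProcessorContext = {
  values: Record<string, unknown>;
  sessionId?: string;
  done: boolean;
};

const finalizeSummary = (value: string, context: ProcessorContext) =>
  context.done ? value.trim().toUpperCase() : value;

console.log(finalizeSummary(" draft text ", { values: {}, done: false }));
console.log(finalizeSummary(" final text ", { values: {}, done: true }));
```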
### Streaming Field Processors
For real-time processing during streaming, use `addStreamingFieldProcessor`:
```typescript
const gen = new AxGen("topic:string -> content:string");
// Process content as it streams in
gen.addStreamingFieldProcessor("content", (partialValue, context) => {
// Log streaming progress
console.log(`Received ${partialValue.length} characters`);
// You can return a transformed value
return partialValue;
});
```
Streaming field processors only work with string fields (`string` or `code`
types).
## Error Handling and Retry Strategies
`AxGen` implements sophisticated error handling with automatic retries for
different error categories.
### Validation and Assertion Retries
When output validation or assertions fail, `AxGen` automatically retries with
corrective feedback:
```typescript
const gen = new AxGen("question:string -> answer:string", {
maxRetries: 5, // Retry up to 5 times on validation/assertion errors
});
gen.addAssert(
(result) => result.answer.length > 100,
"Answer must be detailed (at least 100 characters)",
);
// If the assertion fails, AxGen will:
// 1. Add error feedback to the conversation
// 2. Request a new response from the LLM
// 3. Repeat until success or maxRetries exhausted
```
### Infrastructure Error Retries
Network errors, timeouts, and server errors (5xx) are handled separately with
exponential backoff:
```typescript
const result = await gen.forward(ai, { question: "..." }, {
maxRetries: 3, // Also applies to infrastructure errors
retry: {
maxRetries: 3,
backoffFactor: 2, // Exponential backoff multiplier
maxDelayMs: 60000, // Maximum delay between retries (60s)
},
});
```
The retry sequence for infrastructure errors: 1s → 2s → 4s → 8s → ... (up to
`maxDelayMs`).
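The backoff schedule above can be expressed as a small function (an illustrative sketch of the documented behavior, not the framework's internal code):

```typescript
// Compute the delay before a given retry attempt, mirroring the
// documented sequence 1s -> 2s -> 4s -> 8s ... capped at maxDelayMs.
function backoffDelayMs(
  attempt: number, // 0-based retry attempt
  baseDelayMs = 1000,
  backoffFactor = 2,
  maxDelayMs = 60000,
): number {
  return Math.min(baseDelayMs * backoffFactor ** attempt, maxDelayMs);
}
```

For example, attempts 0 through 4 yield delays of 1000, 2000, 4000, 8000, and 16000 ms; beyond that, the cap at `maxDelayMs` takes over.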
### Error Types
`AxGen` provides detailed error information via `AxGenerateError`:
```typescript
import { AxGenerateError } from "@ax-llm/ax";
try {
const result = await gen.forward(ai, { input: "..." });
} catch (error) {
if (error instanceof AxGenerateError) {
console.log("Model:", error.details.model);
console.log("Max Tokens:", error.details.maxTokens);
console.log("Streaming:", error.details.streaming);
console.log("Signature:", error.details.signature);
console.log("Original Error:", error.cause);
}
}
```
## Function Calling
`AxGen` supports function calling (tool use) with three modes to accommodate
different LLM providers.
### Function Calling Modes
```typescript
const tools = [{
name: "search",
description: "Search for information",
parameters: {
type: "object",
properties: {
query: { type: "string" },
},
required: ["query"],
},
func: async ({ query }) => {
// Perform search
return `Results for: ${query}`;
},
}];
const result = await gen.forward(ai, { question: "..." }, {
functions: tools,
functionCallMode: "auto", // 'auto' | 'native' | 'prompt'
});
```
**Available modes:**
| Mode | Description |
| ---------- | ------------------------------------------------------------------------------------------------------------------ |
| `"auto"` | (Default) Uses native function calling if the provider supports it, otherwise falls back to prompt-based emulation |
| `"native"` | Forces native function calling. Throws error if provider doesn't support it |
| `"prompt"` | Emulates function calling via prompt injection. Works with any LLM |
### Stop Functions
You can specify functions that should terminate the generation loop when called:
```typescript
const result = await gen.forward(ai, { question: "..." }, {
functions: tools,
stopFunction: "finalAnswer", // Stop when this function is called
});
// Multiple stop functions
const result = await gen.forward(ai, { question: "..." }, {
functions: tools,
stopFunction: ["finalAnswer", "done", "complete"],
});
```
## Caching
`AxGen` supports two types of caching: response caching and context (prompt)
caching.
### Response Caching
Cache complete generation results to avoid redundant LLM calls:
```typescript
// Simple in-memory cache example
const cache = new Map();
const gen = new AxGen("question:string -> answer:string", {
cachingFunction: async (key, value?) => {
if (value !== undefined) {
// Store value
cache.set(key, value);
return undefined;
}
// Retrieve value
return cache.get(key);
},
});
// First call - hits LLM
const result1 = await gen.forward(ai, { question: "What is 2+2?" });
// Second call with same input - returns cached result
const result2 = await gen.forward(ai, { question: "What is 2+2?" });
```
The cache key is computed from:
- Signature hash
- All input field values (including nested objects and arrays)
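Conceptually, a stable key can be derived by serializing the inputs with sorted object keys so property order never changes the key (an illustrative sketch; the framework's actual hashing scheme may differ):

```typescript
// Serialize a value deterministically: object keys are sorted at every
// level, so logically equal inputs always produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => a.localeCompare(b))
      .map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Combine the signature hash with the serialized input values.
function cacheKey(signatureHash: string, inputs: Record<string, unknown>): string {
  return `${signatureHash}:${stableStringify(inputs)}`;
}
```

This is why two calls with the same inputs hit the cache even if the input object was constructed with properties in a different order.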
### Context Caching (Prompt Caching)
For providers that support prompt caching (Anthropic, OpenAI), you can configure
cache breakpoints:
```typescript
const result = await gen.forward(ai, { question: "..." }, {
contextCache: {
cacheBreakpoint: "after-examples", // or 'after-functions'
},
});
```
**Breakpoint options:**
- `"after-examples"`: Cache after examples/few-shot demonstrations (default)
- `"after-functions"`: Cache after function definitions
## Input Validation
`AxGen` validates input values against field constraints defined in your
signature.
### String Constraints
```typescript
// Using the Pure Fluent API (see SIGNATURES.md)
import { f, s } from "@ax-llm/ax";
const signature = s("", "")
.appendInputField("email", f.string("User email").email())
.appendInputField("username", f.string("Username").min(3).max(20))
.appendInputField("bio", f.string("Bio").max(500).optional())
.appendOutputField("result", f.string("Result"));
const gen = new AxGen(signature);
```
### Number Constraints
```typescript
const signature = s("", "")
.appendInputField("age", f.number("User age").min(0).max(150))
.appendInputField("score", f.number("Score").min(0).max(100))
.appendOutputField("result", f.string("Result"));
```
### URL and Date Validation
```typescript
const signature = s("", "")
.appendInputField("website", f.url("Website URL"))
.appendInputField("birthDate", f.date("Birth date"))
.appendInputField("createdAt", f.datetime("Creation timestamp"))
.appendOutputField("result", f.string("Result"));
```
Validation errors trigger the retry loop with corrective feedback.
## Sampling and Result Selection
Generate multiple samples in parallel and select the best result.
### Multiple Samples
```typescript
const result = await gen.forward(ai, { question: "..." }, {
sampleCount: 3, // Generate 3 samples in parallel
});
```
### Custom Result Picker
Use a `resultPicker` function to select the best sample:
```typescript
const result = await gen.forward(ai, { question: "..." }, {
sampleCount: 5,
resultPicker: async (samples) => {
// samples is an array of { delta: OUT, index: number }
// Example: Select the longest answer
let bestIndex = 0;
let maxLength = 0;
for (let i = 0; i < samples.length; i++) {
const len = samples[i].delta.answer?.length ?? 0;
if (len > maxLength) {
maxLength = len;
bestIndex = i;
}
}
return bestIndex;
},
});
```
## Multi-Step Processing
`AxGen` supports multi-step generation loops, useful for function calling
workflows.
### Configuration
```typescript
const gen = new AxGen("question:string -> answer:string", {
maxSteps: 25, // Maximum number of steps (default: 25)
});
```
### How It Works
In multi-step mode, `AxGen` continues generating until:
1. All output fields are filled without pending function calls
2. A stop function is called
3. `maxSteps` is reached
```typescript
const result = await gen.forward(ai, { question: "Search and summarize..." }, {
functions: [searchTool, summarizeTool],
maxSteps: 10,
stopFunction: "finalAnswer",
});
```
Each step is traced separately for debugging and can trigger function
executions.
## Extended Thinking
For models that support extended thinking (Claude, Gemini), you can configure
thinking behavior using string budget levels. See [AI.md](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AI.md) for full
details on budget levels, provider differences, and customization.
```typescript
const result = await gen.forward(ai, { question: "..." }, {
thinkingTokenBudget: "medium", // Budget level: 'minimal' | 'low' | 'medium' | 'high' | 'highest' | 'none'
showThoughts: true, // Include thinking in response
});
// Access the thought process
console.log(result.thought); // Contains the model's reasoning
```
### Custom Thought Field Name
```typescript
const gen = new AxGen("question:string -> answer:string", {
thoughtFieldName: "reasoning", // Default is 'thought'
});
const result = await gen.forward(ai, { question: "..." }, {
thinkingTokenBudget: "high",
showThoughts: true,
});
console.log(result.reasoning); // Thinking is in 'reasoning' field
```
## Step Hooks
Step hooks let you observe and control the multi-step generation loop from the
outside. They fire at well-defined points during each iteration and receive an
`AxStepContext` that exposes read-only state and mutation methods.
### Three Hook Points
| Hook | When it fires |
| ------------------------ | ----------------------------------------------------------- |
| `beforeStep` | Before the AI request is sent for this step |
| `afterStep` | After the step completes (response processed) |
| `afterFunctionExecution` | After function calls are executed (only when functions ran) |
### Basic Example
```typescript
const result = await gen.forward(ai, values, {
stepHooks: {
beforeStep: (ctx) => {
console.log(`Step ${ctx.stepIndex}, first: ${ctx.isFirstStep}`);
// Upgrade model after a specific function ran
if (ctx.functionsExecuted.has("complexanalysis")) {
ctx.setModel("smart");
ctx.setThinkingBudget("high");
}
},
afterStep: (ctx) => {
console.log(`Usage so far: ${ctx.usage.totalTokens} tokens`);
},
afterFunctionExecution: (ctx) => {
console.log(`Functions ran: ${[...ctx.functionsExecuted].join(", ")}`);
},
},
});
```
### AxStepContext Reference
**Read-only properties:**
| Property | Type | Description |
| ------------------- | ------------------------ | ---------------------------------------------------- |
| `stepIndex` | `number` | Current step number (0-based) |
| `maxSteps` | `number` | Maximum steps allowed |
| `isFirstStep` | `boolean` | True when `stepIndex === 0` |
| `functionsExecuted` | `ReadonlySet<string>` | Lowercased names of functions called this step |
| `lastFunctionCalls` | `AxFunctionCallRecord[]` | Detailed records (name, args, result) from this step |
| `usage` | `AxStepUsage` | Accumulated token usage across all steps |
| `state` | `Map<string, unknown>` | Custom state that persists across steps |
**Mutators (applied at the next step boundary):**
| Method | Description |
| --------------------------- | ------------------------------------------------------ |
| `setModel(model)` | Switch to a different model key |
| `setThinkingBudget(budget)` | Adjust reasoning depth (`'none'` to `'highest'`) |
| `setTemperature(temp)` | Change sampling temperature |
| `setMaxTokens(tokens)` | Change max output tokens |
| `setOptions(opts)` | Merge arbitrary AI service options |
| `addFunctions(fns)` | Add functions to the active set |
| `removeFunctions(...names)` | Remove functions by name |
| `stop(resultValues?)` | Terminate the loop, optionally providing result values |
Mutations use a **pending pattern**: changes are collected during a step and
applied at the top of the next iteration. This prevents mid-step
inconsistencies.
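The pending pattern itself can be sketched in a few lines (illustrative only, not the framework's internal implementation — `StepConfig` is a made-up type for the example):

```typescript
type StepConfig = { model: string; temperature: number };

// Mutations requested mid-step are only recorded; they are applied
// atomically at the top of the next step.
class PendingMutations {
  private pending: Partial<StepConfig> = {};

  // Called from hooks during a step: nothing changes yet
  setModel(model: string) { this.pending.model = model; }
  setTemperature(temperature: number) { this.pending.temperature = temperature; }

  // Called once at the next step boundary: all changes land together
  applyTo(config: StepConfig): StepConfig {
    const next = { ...config, ...this.pending };
    this.pending = {};
    return next;
  }
}
```

Because the active configuration is never touched mid-step, a hook that fires halfway through a step always observes a consistent snapshot.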
### Functions Also Receive Step Context
User-defined functions receive the step context via `extra.step`, enabling
programmatic loop control from within function handlers:
```typescript
const gen = new AxGen("question:string -> answer:string", {
functions: [{
name: "analyzeData",
description: "Analyze data",
parameters: {
type: "object",
properties: { query: { type: "string", description: "Query" } },
},
func: (args, extra) => {
// Read step state
const step = extra?.step;
console.log(`Running at step ${step?.stepIndex}`);
// Mutate for next step
step?.setThinkingBudget("high");
return analyzeData(args.query);
},
}],
});
```
## Self-Tuning
Self-tuning lets the LLM adjust its own generation parameters between steps.
When enabled, an `adjustGeneration` function is auto-injected that the model can
call alongside regular tool calls.
### Simple Usage
```typescript
// Boolean shorthand: enables model + thinkingBudget adjustment
const result = await gen.forward(ai, values, {
selfTuning: true,
});
```
### Granular Configuration
```typescript
const result = await gen.forward(ai, values, {
selfTuning: {
model: true, // Let LLM pick from available models
thinkingBudget: true, // Let LLM adjust reasoning depth
},
});
```
### Function Pool
Use `selfTuning.functions` to provide a pool of tools the LLM can activate or
deactivate on demand — useful for large toolboxes where you only want a subset
active at any time:
```typescript
const result = await gen.forward(ai, values, {
selfTuning: {
model: true,
thinkingBudget: true,
functions: [searchWeb, calculate, fetchDatabase, generateChart],
},
});
```
The LLM calls `adjustGeneration({ addFunctions: ['searchWeb'] })` to activate
tools, or `adjustGeneration({ removeFunctions: ['calculate'] })` to deactivate
them.
### How It Works
1. An `adjustGeneration` function is injected into the function list
2. The LLM can call it alongside other functions within the same step
3. Model selection uses the `models` list configured on the AI service (via
`AxAI` model keys)
4. Thinking budget uses a 6-level enum: `none`, `minimal`, `low`, `medium`,
`high`, `highest`
5. Mutations are applied at the next step boundary (same pending pattern as step
hooks)
Temperature is excluded by default because LLMs have limited intuition about
sampling parameters. Enable it explicitly with `temperature: true` if your use
case benefits from it.
================================================================================
# AxAgent Guide
# Source: AXAGENT.md
# Agent framework with child agents, tools, and RLM for long contexts
# AxAgent Guide
`AxAgent` is the agent framework in Ax. It wraps `AxGen` with support for child agents, tool use, and **RLM (Recursive Language Model)** mode for processing long contexts through runtime-backed code execution.
Use `AxAgent` when you need:
- Multi-step reasoning with tools
- Composing multiple agents into a hierarchy
- Processing long documents without context window limits (RLM mode)
For single-step generation without tools or agents, use [`AxGen`](https://raw.githubusercontent.com/ax-llm/ax/refs/heads/main/AXGEN.md) directly.
## Creating Agents
Use the `agent()` factory function with a string signature:
```typescript
import { agent, ai } from '@ax-llm/ax';
const llm = ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY! });
const myAgent = agent('userQuestion:string -> responseText:string', {
agentIdentity: {
name: 'helpfulAgent',
description: 'An agent that provides helpful responses to user questions',
},
});
const result = await myAgent.forward(llm, { userQuestion: 'What is TypeScript?' });
console.log(result.responseText);
```
The `agent()` function accepts both string signatures and `AxSignature` objects:
```typescript
import { agent, s } from '@ax-llm/ax';
const sig = s('userQuestion:string -> responseText:string');
const myAgent = agent(sig, {
agentIdentity: {
name: 'helpfulAgent',
description: 'An agent that provides helpful responses to user questions',
},
});
```
## Agent Options
The `agent()` factory accepts a configuration object:
```typescript
const myAgent = agent('input:string -> output:string', {
// Agent identity (required when used as a child agent)
agentIdentity: {
name: 'myAgent', // Agent name (converted to camelCase)
description: 'Does something useful and interesting with inputs',
namespace: 'team', // Optional child-agent module namespace (default: 'agents')
},
// Required when using context fields
contextFields: [
'largeDoc', // Runtime-only (legacy behavior)
{
field: 'chatHistory',
keepInPromptChars: 500,
reverseTruncate: true, // Keep the last 500 chars in the Actor prompt
},
],
// Optional
ai: llm, // Bind a specific AI service
debug: false, // Debug logging
// Child agents and sharing
agents: {
    local: [childAgent1, childAgent2], // Callable under <namespace>.* in this agent
shared: [utilityAgent], // Propagated one level to direct children
globallyShared: [loggerAgent], // Propagated recursively to all descendants
excluded: ['agentName'], // Agent names NOT to receive from parents
},
// Field sharing
fields: {
local: ['userId'], // Keep shared/global fields visible in this agent
shared: ['userId'], // Passed to direct child agents
globallyShared: ['sessionId'], // Passed to all descendants
excluded: ['field'], // Fields NOT to receive from parents
},
// Agent functions (namespaced JS runtime globals)
functions: {
discovery: true, // Optional: module discovery mode for runtime callables
local: [myAgentFn], // Flat AxAgentFunction[] OR grouped AxAgentFunctionGroup[]
shared: [sharedFn], // Flat or grouped; propagated one level to direct children
globallyShared: [globalFn], // Flat or grouped; propagated recursively to all descendants
excluded: ['fnName'], // Function names NOT to receive from parents
},
// RLM limits (see RLM section below)
maxSubAgentCalls: 50, // Sub-agent call cap (default: 50)
maxRuntimeChars: 5000, // Runtime payload size cap (default: 5000)
maxTurns: 10, // Actor loop turn cap (default: 10)
inputUpdateCallback: async (inputs) => ({ // Optional host-side per-turn input patch
query: inputs.query,
}),
actorOptions: { description: '...' }, // Extra guidance appended to Actor prompt
responderOptions: { description: '...' }, // Extra guidance appended to Responder prompt
recursionOptions: { maxDepth: 2 }, // llmQuery sub-agent options
});
```
### `agentIdentity`
Required when the agent is used as a child agent. Contains:
- `name` (converted to camelCase for the function name, e.g. `'Physics Researcher'` becomes `physicsResearcher`)
- `description` (shown to parent agents when they decide which child to call)
- `namespace` (optional module name used for child-agent calls in this agent's runtime; defaults to `agents`, example: `team.researcher(...)`)
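The camelCase conversion works roughly like this (an illustrative sketch; the framework's exact normalization rules may differ):

```typescript
// Convert an agent display name to a camelCase function name,
// e.g. "Physics Researcher" -> "physicsResearcher".
function toCamelCase(name: string): string {
  return name
    .split(/[^a-zA-Z0-9]+/) // split on spaces, punctuation, etc.
    .filter(Boolean)
    .map((word, i) =>
      i === 0
        ? word.toLowerCase()
        : word[0].toUpperCase() + word.slice(1).toLowerCase()
    )
    .join("");
}
```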
### `agents`
Grouped child agent configuration. `local` agents are callable under `<namespace>.*` in this agent's JS runtime (where `<namespace>` is `agentIdentity.namespace` if set, otherwise `agents`). See [Child Agents](#child-agents).
## Running Agents
### `forward()`
Run the agent and get the final result:
```typescript
const result = await myAgent.forward(llm, { userQuestion: 'Hello' });
console.log(result.responseText);
```
If the agent was created with `ai` bound, the parent AI is used as fallback:
```typescript
const myAgent = agent('input:string -> output:string', {
agentIdentity: {
name: 'myAgent',
description: 'An agent that processes inputs reliably',
},
ai: llm, // Bound AI service
});
// Can also pass a different AI to override
const result = await myAgent.forward(differentLlm, { input: 'test' });
```
### `streamingForward()`
Stream partial results as they arrive:
```typescript
const stream = myAgent.streamingForward(llm, { userQuestion: 'Write a story' });
for await (const chunk of stream) {
if (chunk.delta.responseText) {
process.stdout.write(chunk.delta.responseText);
}
}
```
### Forward Options
Both `forward` and `streamingForward` accept an options object as the third argument:
```typescript
const result = await myAgent.forward(llm, values, {
model: 'smart', // Override model
maxSteps: 10, // Override max steps
debug: true, // Enable debug logging
functions: [extraTool], // Additional tools (merged with agent's tools)
thinkingTokenBudget: 'medium',
abortSignal: controller.signal, // Cancel via AbortSignal
});
```
## Stopping Agents
`AxAgent`, `AxGen`, and `AxFlow` support two ways to stop an in-flight `forward()` or `streamingForward()` call. Both cause the call to throw `AxAIServiceAbortedError`, which you handle with try/catch.
### `stop()` method
Call `stop()` from any context — a timer, event handler, or another async task — to halt the multi-step loop. `stop()` aborts all in-flight calls started by the same `AxAgent` instance (including retry backoff waits), and the loop throws `AxAIServiceAbortedError`.
```typescript
const myAgent = agent('question:string -> answer:string', {
agentIdentity: {
name: 'myAgent',
description: 'An agent that answers questions thoroughly',
},
});
// Stop after 5 seconds
const timer = setTimeout(() => myAgent.stop(), 5_000);
try {
const result = await myAgent.forward(llm, { question: 'Explain quantum gravity' });
console.log(result.answer);
} catch (err) {
if (err instanceof AxAIServiceAbortedError) {
console.log('Agent was stopped');
} else {
throw err;
}
} finally {
clearTimeout(timer);
}
```
`stop()` is also available on `AxGen` and `AxFlow` instances:
```typescript
const gen = ax('topic:string -> summary:string');
setTimeout(() => gen.stop(), 3_000);
try {
const result = await gen.forward(llm, { topic: 'Climate change' });
} catch (err) {
if (err instanceof AxAIServiceAbortedError) {
console.log('Generation was stopped');
}
}
```
### Using `AbortSignal`
Pass an `abortSignal` in the forward options to cancel via the standard `AbortController` / `AbortSignal` API. The signal is checked between each step of the multi-step loop, not only during the HTTP call, so cancellation is detected promptly even when the agent is between LLM calls.
```typescript
// Time-based deadline
try {
const result = await myAgent.forward(llm, values, {
abortSignal: AbortSignal.timeout(10_000), // 10-second deadline
});
} catch (err) {
if (err instanceof AxAIServiceAbortedError) {
console.log('Timed out');
}
}
// Manual controller
const controller = new AbortController();
onUserCancel(() => controller.abort());
try {
const result = await myAgent.forward(llm, values, {
abortSignal: controller.signal,
});
} catch (err) {
if (err instanceof AxAIServiceAbortedError) {
console.log('Cancelled by user');
}
}
```
## Child Agents
Agents can compose other agents as children. The parent agent sees each child as a callable function and decides when to invoke it.
```typescript
const researcher = agent(
'question:string, physicsQuestion:string -> answer:string',
{
agentIdentity: {
name: 'Physics Researcher',
      description: 'A researcher that can answer questions about advanced physics',
},
}
);
const summarizer = agent(
'answer:string -> shortSummary:string',
{
agentIdentity: {
name: 'Science Summarizer',
description: 'Summarizer can write short summaries of advanced science topics',
},
contextFields: [],
actorOptions: {
description: 'Use numbered bullet points to summarize the answer in order of importance.',
},
}
);
const scientist = agent('question:string -> answer:string', {
agentIdentity: {
name: 'Scientist',
description: 'An agent that can answer advanced science questions',
},
contextFields: [],
agents: { local: [researcher, summarizer] },
});
const result = await scientist.forward(llm, {
question: 'Why is gravity not a real force?',
});
```
### Value Passthrough
When a parent and child agent share input field names, the parent automatically passes those values to the child. For example, if the parent has `question:string` and a child also expects `question:string`, the parent's value is injected automatically — the LLM doesn't need to re-type it.
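Conceptually, the passthrough works like the merge below (illustrative only — the framework performs this internally, and the precedence of LLM-supplied values over injected ones is an assumption for this sketch):

```typescript
// Build the input values for a child agent call: fields the parent
// already holds are injected, then merged with whatever arguments
// the LLM supplied explicitly.
function buildChildInputs(
  parentValues: Record<string, unknown>,
  childInputFields: string[],
  llmProvidedArgs: Record<string, unknown>,
): Record<string, unknown> {
  const injected: Record<string, unknown> = {};
  for (const field of childInputFields) {
    if (field in parentValues) injected[field] = parentValues[field];
  }
  // Assumed here: explicit LLM-provided arguments win over injected values
  return { ...injected, ...llmProvidedArgs };
}
```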
## Agent Functions (`AxAgentFunction`)
Agent functions are registered as namespaced globals in the JS runtime. Unlike child agents (which are called via `await <namespace>.<agentName>(...)`, where `<namespace>` defaults to `agents`), agent functions are rendered directly in the Actor prompt as callable JS APIs.
```typescript
import { agent, f, fn } from '@ax-llm/ax';
const search = fn('search')
.description('Search the product catalog')
.namespace('db') // callable as db.search(...) in JS runtime
.arg('query', f.string('Search query'))
.arg('limit', f.number('Maximum results').optional())
.returnsField('results', f.string('Result item').array())
.handler(async ({ query, limit = 5 }) => {
return { results: [`result for ${query}`] };
})
.build();
const shopAssistant = agent(
'userQuery:string, catalog:string[] -> answer:string',
{
agentIdentity: { name: 'shopAssistant', description: 'Answers product questions' },
contextFields: ['catalog'],
functions: { local: [search] },
}
);
```
For discovery mode, you can group functions by module and attach discovery metadata:
```typescript
import { agent, f, fn, type AxAgentFunctionGroup } from '@ax-llm/ax';
const dbTools: AxAgentFunctionGroup = {
namespace: 'db',
title: 'Scheduling Database',
selectionCriteria: 'Use for availability lookups or window resolution.',
description: 'Database helpers for schedule and availability data.',
functions: [
fn('search')
.description('Search the product catalog')
.arg('query', f.string('Search query'))
.arg('limit', f.number('Maximum results').optional())
.returnsField('results', f.string('Result item').array())
.handler(async ({ query, limit = 5 }) => {
return { results: [`result for ${query}`] };
})
.build(),
],
};
const shopAssistant = agent('userQuery:string -> answer:string', {
contextFields: [],
functions: { discovery: true, local: [dbTools] },
});
```
When an agent function is invoked from an active AxAgent actor runtime session, the handler also receives a call-scoped protocol capability on the `extra` argument. Use it to end the current actor turn from host-side code without changing the runtime globals:
```typescript
const complete = fn('complete')
.description('Finish the current actor turn')
.arg('answer', f.string('Final answer'))
.handler(async ({ answer }, extra) => {
extra?.protocol?.final(answer);
return answer;
})
.build();
```
`extra.protocol` is only defined for host-side function calls that originate from an active AxAgent actor runtime session. It is not part of discovery mode, is not a normal registered function, and remains unavailable in regular AxGen/AxFlow function-calling paths.
The Actor prompt will include:
````
### Available Functions
```javascript
// db namespace
// Search the product catalog
async function db.search({ query: string, limit?: number }): Promise<{ results: string[] }>
```
````
Key rules:
- Default namespace is `'utils'` if omitted (callable as `utils.fnName(...)`)
- Reserved namespaces: `agents`, `llmQuery`, `final`, `ask_clarification`, and the configured `agentIdentity.namespace` (if set)
- `parameters` is required; `returns` is optional but shown in the prompt
- Grouped function modules (`AxAgentFunctionGroup`) own the namespace and discovery metadata for every function in `functions`
- Functions inside a group must not define `namespace`
- Use `functions.shared` / `functions.globallyShared` to propagate to child agents
### Callable Discovery Mode
Set `functions.discovery: true` to avoid dumping full callable definitions into the Actor prompt.
- Prompt behavior: shows `### Available Modules` (for function namespaces and the child-agent module namespace)
- Runtime APIs:
- `await listModuleFunctions(modules: string | string[]) : string`
- `await getFunctionDefinitions(functions: string | string[]) : string`
- Both APIs return markdown.
- When multiple modules are needed, prefer one batched call such as `await listModuleFunctions(['timeRange', 'schedulingOrganizer'])`.
- When multiple callable definitions are needed, prefer one batched `await getFunctionDefinitions([...])` call.
- Treat discovery results as markdown sections to inspect or log directly; do not wrap them in JSON or custom objects.
- Do not fan out discovery work with `Promise.all(...)`.
- `listModuleFunctions(...)` only advertises modules that currently have callable entries.
- Grouped modules render in the Actor prompt as `<namespace> - <selectionCriteria>` when `selectionCriteria` is defined.
- `listModuleFunctions(...)` prints the module namespace, title, description, and the function names available in that module.
- If a requested module does not exist, `listModuleFunctions(...)` returns a per-module markdown error instead of failing the whole call.
- `getFunctionDefinitions(...)` includes argument comments from JSON Schema property descriptions.
- `getFunctionDefinitions(...)` includes fenced code examples from `AxAgentFunction.examples`.
- `getFunctionDefinitions` accepts fully-qualified names like `db.search` or `<namespace>.researcher`, where `<namespace>` is `agentIdentity.namespace` when set, otherwise `agents`.
- Bare names resolve to `utils.<fnName>` (for example `lookup` -> `utils.lookup`).
## Shared Fields and Agents
When composing agent hierarchies, you often need to pass data or utility agents to child agents without requiring the parent's LLM to explicitly route them. AxAgent provides grouped `fields`, `agents`, and `functions` options with three propagation levels each.
### `fields.shared` — Pass fields to direct children (one level)
Fields listed in `fields.shared` are automatically injected into direct child agents at runtime. By default, they bypass the parent's LLM.
```typescript
const childAgent = agent('question:string -> answer:string', {
agentIdentity: { name: 'Child', description: 'Answers questions' },
contextFields: [],
});
const parentAgent = agent('query:string, userId:string, knowledgeBase:string -> answer:string', {
contextFields: ['knowledgeBase'],
agents: { local: [childAgent] },
fields: { shared: ['userId'] }, // userId is injected into child agents automatically
});
```
- `userId` is removed from the parent's Actor/Responder prompts
- `userId` is automatically injected into every call to child agents
- Children can opt out via `fields: { excluded: ['userId'] }`
- Add `fields.local: ['userId']` to keep `userId` available in the parent too
### `fields.globallyShared` — Pass fields to ALL descendants (recursive)
Like `fields.shared`, but propagates through the entire agent tree — children, grandchildren, and beyond.
```typescript
const grandchild = agent('question:string -> answer:string', {
agentIdentity: { name: 'Grandchild', description: 'Answers questions' },
contextFields: [],
});
const child = agent('topic:string -> summary:string', {
agentIdentity: { name: 'Child', description: 'Summarizes topics' },
contextFields: [],
agents: { local: [grandchild] },
});
const parent = agent('query:string, sessionId:string -> answer:string', {
contextFields: [],
agents: { local: [child] },
fields: { globallyShared: ['sessionId'] }, // sessionId reaches child AND grandchild
});
```
### `agents.shared` — Add agents to direct children (one level)
Utility agents listed in `agents.shared` are added to every direct child agent's available agents list.
```typescript
const logger = agent('message:string -> logResult:string', {
agentIdentity: { name: 'Logger', description: 'Logs messages for debugging' },
contextFields: [],
});
const worker = agent('task:string -> result:string', {
agentIdentity: { name: 'Worker', description: 'Performs tasks' },
contextFields: [],
});
const parent = agent('query:string -> answer:string', {
contextFields: [],
agents: {
local: [worker],
shared: [logger], // worker can now call agents.logger(...)
},
});
```
### `agents.globallyShared` — Add agents to ALL descendants (recursive)
Like `agents.shared`, but propagates through the entire agent tree.
```typescript
const logger = agent('message:string -> logResult:string', {
agentIdentity: { name: 'Logger', description: 'Logs messages for debugging' },
contextFields: [],
});
const grandchild = agent('question:string -> answer:string', {
agentIdentity: { name: 'Grandchild', description: 'Answers questions' },
contextFields: [],
});
const child = agent('topic:string -> summary:string', {
agentIdentity: { name: 'Child', description: 'Summarizes topics' },
contextFields: [],
agents: { local: [grandchild] },
});
const parent = agent('query:string -> answer:string', {
contextFields: [],
agents: {
local: [child],
globallyShared: [logger], // both child AND grandchild can call agents.logger(...)
},
});
```
### `fields.excluded` / `agents.excluded` / `functions.excluded`
Any child agent can opt out of receiving specific shared fields, agents, or functions from parents:
```typescript
const sentiment = agent('text:string -> sentiment:string', {
agentIdentity: { name: 'Sentiment', description: 'Analyzes sentiment' },
contextFields: [],
fields: { excluded: ['userId'] }, // Does not receive userId from parents
agents: { excluded: ['loggerAgent'] }, // Does not receive logger from parents
functions: { excluded: ['searchFn'] }, // Does not receive searchFn from parents
});
```
## RLM Mode
**RLM (Recursive Language Model)** mode lets agents process arbitrarily long documents without hitting context window limits. Instead of stuffing the entire document into the LLM prompt, RLM loads it into a code interpreter session and gives the LLM tools to analyze it programmatically.
### The Problem
When you pass a long document to an LLM, you face:
- **Context window limits** — the document may not fit
- **Context rot** — accuracy degrades as context grows
- **Cost** — long prompts are expensive
### How It Works
1. **Context extraction** — Fields listed in `contextFields` are removed from the LLM prompt and loaded into a runtime session as variables.
2. **Actor/Responder split** — The agent uses two internal programs:
- **Actor** — A code generation agent that writes JavaScript to analyze context data. It NEVER generates final answers directly.
- **Responder** — An answer synthesis agent that produces the final answer from the Actor's `actorResult` payload. It NEVER generates code.
3. **Recursive queries** — Inside code, `llmQuery(...)` delegates semantic work to a sub-query (plain AxGen in simple mode, full AxAgent in advanced mode).
4. **Completion** — The Actor signals completion by calling `final(...args)` or asks for more user input with `ask_clarification(...args)`, then the Responder synthesizes the final answer.
The Actor writes JavaScript code to inspect, filter, and iterate over the document. It uses `llmQuery` for semantic analysis and can chunk data in code before querying.
### Configuration
```typescript
import { agent, ai } from '@ax-llm/ax';
const analyzer = agent(
'context:string, query:string -> answer:string, evidence:string[]',
{
agentIdentity: {
name: 'documentAnalyzer',
description: 'Analyzes long documents using code interpreter and sub-LM queries',
},
contextFields: [
'context', // Runtime-only context field
{
field: 'chatHistory',
keepInPromptChars: 500,
reverseTruncate: true, // Keep the last 500 chars in the Actor prompt
},
],
runtime: new AxJSRuntime(), // Code runtime (default: AxJSRuntime)
maxSubAgentCalls: 30, // Cap on sub-LM calls (default: 50)
maxRuntimeChars: 2_000, // Cap for llmQuery context + code output (default: 5000)
maxBatchedLlmQueryConcurrency: 6, // Max parallel batched llmQuery calls (default: 8)
maxTurns: 10, // Max Actor turns before forcing Responder (default: 10)
contextPolicy: { // Context replay + checkpoint policy
preset: 'adaptive', // Opinionated defaults for long runtime tasks
state: {
summary: true, // Include Live Runtime State in the actor prompt
inspect: true, // Expose inspect_runtime() to the actor
inspectThresholdChars: 2000,
maxEntries: 6,
},
checkpoints: {
enabled: true, // Summarize older successful turns into a checkpoint
triggerChars: 2000,
},
expert: {
pruneErrors: true, // Prune resolved errors after successful turns
rankPruning: { enabled: true, minRank: 2 },
tombstones: true, // Replace resolved errors with compact tombstones
},
},
actorFields: ['reasoning'], // Output fields produced by Actor instead of Responder
actorCallback: async (result) => { // Called after each Actor turn
console.log('Actor turn:', result);
},
mode: 'simple', // Sub-query mode: 'simple' = AxGen, 'advanced' = AxAgent (default: 'simple')
recursionOptions: {
model: 'gpt-4o-mini', // Forward options for recursive llmQuery agent calls
maxDepth: 2, // Maximum recursion depth
},
}
);
```
### AxJSRuntime
In AxAgent + RLM, `AxJSRuntime` is the default JS runtime for executing model-generated code. It is cross-runtime and works in:
- Node.js/Bun-style backends
- Deno backends
- Browser environments
It can be used both as:
- an `AxCodeRuntime` for RLM sessions (`createSession`)
- a function tool (`toFunction`) for non-RLM workflows
### Sandbox Permissions
By default, the `AxJSRuntime` sandbox blocks all dangerous Web APIs (network, storage, etc.). You can selectively grant access using the `AxJSRuntimePermission` enum:
```typescript
import { AxJSRuntime, AxJSRuntimePermission } from '@ax-llm/ax';
const runtime = new AxJSRuntime({
permissions: [
AxJSRuntimePermission.NETWORK,
AxJSRuntimePermission.STORAGE,
],
});
```
Node safety note:
- In Node runtime, `AxJSRuntime` uses safer defaults and hides host globals like `process` and `require`.
- You can opt into unsafe host access only when you trust generated code:
```typescript
const runtime = new AxJSRuntime({
allowUnsafeNodeHostAccess: true, // WARNING: model code can access host capabilities
});
```
Available permissions:
| Permission | Unlocked Globals | Description |
|---|---|---|
| `NETWORK` | `fetch`, `XMLHttpRequest`, `WebSocket`, `EventSource` | HTTP requests and real-time connections |
| `STORAGE` | `indexedDB`, `caches` | Client-side persistent storage |
| `CODE_LOADING` | `importScripts` | Dynamic script loading |
| `COMMUNICATION` | `BroadcastChannel` | Cross-tab/worker messaging |
| `TIMING` | `performance` | High-resolution timing |
| `WORKERS` | `Worker`, `SharedWorker` | Sub-worker spawning (see warning below) |
> **Warning**: Granting `WORKERS` allows code to spawn sub-workers that get fresh, unlocked globals. A child worker has full access to `fetch`, `indexedDB`, etc. regardless of the parent's permissions. Only grant `WORKERS` when you trust the executed code.
### Consecutive Execution Error Cutoff
`AxJSRuntime` can enforce a cutoff for consecutive execution failures. This is useful when generated code gets stuck in a failure loop.
```typescript
import { AxJSRuntime } from '@ax-llm/ax';
const runtime = new AxJSRuntime({
consecutiveErrorCutoff: 3,
});
```
Behavior:
- The runtime tracks consecutive execution failures.
- The counter resets on successful execution.
- When failures hit the configured cutoff, the runtime throws `AxRuntimeExecutionError` and exits the session.
- Preflight guardrail errors are not counted (for example blocked `"use strict"` and reserved-name reassignment checks).
You can manually reset the runtime-level counter:
```typescript
runtime.resetConsecutiveErrorCounter();
```
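The cutoff bookkeeping above can be sketched as a small counter. This is an illustrative model of the documented behavior only (the class name `ErrorCutoff` is hypothetical, not a library export): failures accumulate, success resets the counter, preflight guardrail errors are skipped, and hitting the cutoff throws.

```typescript
// Hypothetical sketch of the cutoff semantics described above;
// AxJSRuntime's real internals may differ.
class ErrorCutoff {
  private consecutive = 0;
  constructor(private readonly cutoff: number) {}

  recordSuccess(): void {
    this.consecutive = 0; // counter resets on successful execution
  }

  recordFailure(isPreflightGuardrail: boolean): void {
    if (isPreflightGuardrail) return; // preflight guardrail errors are not counted
    this.consecutive += 1;
    if (this.consecutive >= this.cutoff) {
      // Mirrors AxRuntimeExecutionError being thrown and the session exiting
      throw new Error('consecutive execution failure cutoff reached');
    }
  }

  reset(): void {
    this.consecutive = 0; // mirrors runtime.resetConsecutiveErrorCounter()
  }
}
```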
### Structured Context Fields
Context fields aren't limited to plain strings. You can pass structured data — objects and arrays with typed sub-fields — and the LLM will see their full schema in the code interpreter prompt.
```typescript
import { agent, f, s } from '@ax-llm/ax';
import { AxJSRuntime } from '@ax-llm/ax';
const sig = s('query:string -> answer:string, evidence:string[]')
.appendInputField('documents', f.object({
id: f.number('Document ID'),
title: f.string('Document title'),
content: f.string('Document body'),
}).array('Source documents'));
const analyzer = agent(sig, {
agentIdentity: {
name: 'structuredAnalyzer',
description: 'Analyzes structured document collections using RLM',
},
contextFields: ['documents'],
runtime: new AxJSRuntime(),
});
```
When the LLM enters the code interpreter, it sees the schema:
```
- `documents` (json array of object { id: number, title: string, content: string } items)
```
The LLM can then work with the data using property access and array methods:
```javascript
// Filter documents by title
const relevant = documents.filter(d => d.title.includes('climate'));
// Pass content to sub-LM — strings go directly, objects via JSON.stringify()
const summaries = await llmQuery(
relevant.map(d => ({ query: 'Summarize this document', context: d.content }))
);
```
Structured fields are loaded as native JavaScript objects in the interpreter, preserving their full structure for programmatic access.
### The Actor Loop
The Actor generates JavaScript code in a `javascriptCode` output field. Each turn:
1. The Actor emits `javascriptCode` containing JavaScript to execute
2. The runtime executes the code and returns the result
3. The result is appended to the action log
4. The Actor sees the updated action log and decides what to do next
5. When the Actor calls `final(...args)` or `ask_clarification(...args)`, the loop ends and the Responder takes over
Host applications can update inputs during this loop by setting `inputUpdateCallback`. The callback runs before each Actor turn, can return a partial patch of signature input fields, and those updates are applied to both prompt inputs and runtime `inputs.<field>` values.
The Actor's typical workflow:
```
1. Explore context structure (typeof, length, slice)
2. Plan a chunking strategy based on what it observes
3. Use code for structural work (filter, map, regex, property access)
4. Use llmQuery for semantic work (summarization, interpretation)
5. Build up answers in variables across turns
6. Signal completion by calling `final(...args)` (or `ask_clarification(...args)` to request user input)
```
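Steps 2 to 4 of that workflow can be sketched in actor-authored code. The `chunkString` helper is hypothetical, something the Actor itself might write in a turn; `llmQuery` is the runtime global documented below:

```typescript
// Hypothetical chunking helper the Actor might write: split a long
// context string into fixed-size pieces that each fit under the
// runtime character cap before handing them to llmQuery.
function chunkString(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// Each chunk could then go out in one batched llmQuery call, e.g.:
// const summaries = await llmQuery(
//   chunkString(context, 2000).map(c => ({ query: 'Summarize', context: c }))
// );
```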
### Actor Fields
By default, all output fields from the signature go to the Responder. Use `actorFields` to route specific output fields to the Actor instead. The Actor produces these fields each turn (alongside `javascriptCode`), and their values are included in the action log for context. The last Actor turn's values are merged into the final output.
```typescript
const analyzer = agent(
'context:string, query:string -> answer:string, reasoning:string',
{
contextFields: ['context'],
actorFields: ['reasoning'], // Actor produces 'reasoning', Responder produces 'answer'
}
);
```
### Actor Callback
Use `actorCallback` to observe each Actor turn. It receives the full Actor result (including `javascriptCode` and any `actorFields`) and fires every turn, including the `final(...)`/`ask_clarification(...)` turn.
```typescript
const analyzer = agent('context:string, query:string -> answer:string', {
contextFields: ['context'],
actorCallback: async (result) => {
console.log('Actor code:', result.javascriptCode);
},
});
```
### Input Update Callback
Use `inputUpdateCallback` to apply host-side input updates while `forward()` / `streamingForward()` is in progress. It runs before each Actor turn.
```typescript
let latestQuery = 'initial question';
const analyzer = agent('query:string -> answer:string', {
contextFields: [],
inputUpdateCallback: async (inputs) => {
if (latestQuery !== inputs.query) {
return { query: latestQuery };
}
return undefined; // no-op this turn
},
});
```
Updates from this callback are merged into current inputs (unknown keys are ignored), then synchronized into runtime `inputs.<field>` values and existing non-colliding top-level aliases via `AxCodeSession.patchGlobals(...)` before code execution. This host-side sync does not run through the Actor's `execute(code)` path.
### Actor/Responder Forward Options
Use `actorOptions` and `responderOptions` to set different forward options (model, thinking budget, etc.) for the Actor and Responder sub-programs. These are set at construction time and act as defaults that can still be overridden at forward time.
```typescript
const analyzer = agent('context:string, query:string -> answer:string', {
contextFields: ['context'],
actorOptions: {
model: 'fast-model',
thinkingTokenBudget: 1024,
},
responderOptions: {
model: 'smart-model',
thinkingTokenBudget: 4096,
},
});
```
Priority order (low to high): constructor base options < `actorOptions`/`responderOptions` < forward-time options.
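Assuming plain object merging, that precedence can be pictured as successive spreads where later spreads win (an illustrative sketch, not the framework's actual merge code):

```typescript
// Illustrative sketch of the low-to-high precedence described above.
type ForwardOptions = { model?: string; thinkingTokenBudget?: number };

function resolveActorOptions(
  base: ForwardOptions,         // constructor base options (lowest)
  actorOptions: ForwardOptions, // actorOptions / responderOptions
  forwardTime: ForwardOptions   // forward-time options (highest)
): ForwardOptions {
  // Later spreads override earlier ones: base < actorOptions < forward-time.
  return { ...base, ...actorOptions, ...forwardTime };
}
```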
### Recursive llmQuery Options
Use `recursionOptions` to set default forward options for recursive `llmQuery` sub-agent calls.
```typescript
const analyzer = agent('context:string, query:string -> answer:string', {
contextFields: ['context'],
recursionOptions: {
model: 'fast-model',
maxDepth: 2,
timeout: 60_000,
},
});
```
Each `llmQuery` call runs a sub-query with a fresh session and the same registered tool/agent globals. The child receives only the `context` value passed to `llmQuery(...)` — parent `contextFields` values are not forwarded. In simple mode (default), the child is a plain AxGen (direct LLM call). In advanced mode, the child is a full AxAgent with Actor/Responder and code runtime.
### Actor/Responder Descriptions
Use `actorOptions.description` and `responderOptions.description` to append additional instructions to the Actor or Responder system prompts. The base prompts are preserved; your text is appended after them.
```typescript
const analyzer = agent('context:string, query:string -> answer:string', {
contextFields: ['context'],
actorOptions: {
description: 'Focus on numerical data. Use precise calculations.',
},
responderOptions: {
description: 'Format answers as bullet points. Cite evidence.',
},
});
```
> **Note:** Signature-level descriptions (via `.description()` on the signature) are not supported on `AxAgent`. Use `actorOptions.description` / `responderOptions.description` instead to customize each sub-program independently.
### Few-Shot Demos
Use `setDemos()` to provide few-shot examples that guide the Actor and Responder. Demos are keyed by program ID — use `namedPrograms()` to discover available IDs.
Each demo trace must include at least one input field AND one output field. The Actor's input fields are `contextMetadata`, `actionLog`, and any non-context inputs from the original signature. The Responder's input fields are `contextMetadata`, `actorResult`, and any non-context inputs from the original signature.
Note: use `final(...)` (not `submit(...)`) in Actor demo traces to signal completion.
```typescript
analyzer.setDemos([
{
programId: 'root.actor',
traces: [
{
actionLog: '(no actions yet)',
javascriptCode: 'console.log(context.slice(0, 200))',
},
{
actionLog: 'Step 1 | console.log(context.slice(0, 200))\n→ Chapter 1: The Rise of...',
javascriptCode: 'const summary = await llmQuery("Summarize", context.slice(0, 500)); console.log(summary)',
},
{
actionLog: 'Step 1 | ...\nStep 2 | llmQuery(...)\n→ The document argues about...',
javascriptCode: 'final("analysis complete")',
},
],
},
{
programId: 'root.responder',
traces: [
{
query: 'What are the main arguments?',
answer: 'The document presents arguments about distributed systems.',
evidence: ['Chapter 1 discusses scalability', 'Chapter 2 covers CAP'],
},
],
},
]);
```
Demo values are validated against the target program's signature. Invalid values or missing input/output fields throw an error at `setDemos()` time.
### Available APIs in the Sandbox
Inside the code interpreter, these functions are available as globals:
| API | Description |
|-----|-------------|
| `await llmQuery(query, context?)` | Ask a sub-LM a question with optional context. Returns a string. On non-abort sub-query failures, the string may be `[ERROR] ...`. Oversized context is truncated to `maxRuntimeChars` |
| `await llmQuery({ query, context? })` | Single-object convenience form of `llmQuery`. Returns a string, including `[ERROR] ...` on non-abort sub-query failures |
| `await llmQuery([{ query, context }, ...])` | Run multiple sub-LM queries in parallel. Returns string[]. Failed items return `[ERROR] ...`; each query still counts toward the call limit |
| `final(...args)` | Stop Actor execution and pass payload args to Responder. Requires at least one argument |
| `ask_clarification(...args)` | Stop Actor execution and pass clarification payload args to Responder. Requires at least one argument |
| `await <ns>.<agentName>({...})` | Call a child agent by name (from `agents.local`). `<ns>` is `agentIdentity.namespace` when set, otherwise `agents`. Parameters match the agent's JSON schema. Returns a string |
| `await <namespace>.<functionName>({...})` | Call an agent function by namespace and name (from `functions.local`). Returns the typed result |
| `await listModuleFunctions(modules)` | Discovery mode only (`functions.discovery: true`). Returns markdown sections listing callable names for callable-backed modules, and per-module markdown errors for unknown requested modules. Prefer one batched array call when inspecting multiple modules |
| `await getFunctionDefinitions(functions)` | Discovery mode only (`functions.discovery: true`). Returns markdown sections with API descriptions and signatures for one or more callables. Prefer one batched array call when inspecting multiple callables |
| `print(...args)` | Available in `AxJSRuntime` when `outputMode: 'stdout'`; captured output appears in the function result |
| Context variables | All input fields are available as `inputs.<fieldName>` (including context fields). Non-colliding top-level aliases may also exist and are refreshed from `inputUpdateCallback` patches before each turn |
Errors from actor-authored child-agent or tool calls appear in `Action Log` as execution errors so the Actor can correct its code on the next turn. Abort/cancellation still stops execution.
Host-side function handlers can trigger the same completion flow through `extra.protocol.final(...)` or `extra.protocol.askClarification(...)`. Inside actor-authored JavaScript, continue using the runtime globals `final(...)` and `ask_clarification(...)`.
By default, `AxJSRuntime` uses `outputMode: 'stdout'`, where visible output comes from `console.log(...)`, `print(...)`, and other captured stdout lines.
### Testing Runtime Snippets
Use `agent.test(code, contextFieldValues?, options?)` to validate a JavaScript snippet against the same runtime globals the Actor would see, without running the full Actor/Responder loop.
```typescript
const analyzer = agent('query:string -> answer:string', {
contextFields: ['query'],
runtime: new AxJSRuntime(),
functions: {
local: [
{
name: 'uppercase',
namespace: 'tools',
description: 'Uppercase a string',
parameters: {
type: 'object',
properties: { value: { type: 'string' } },
required: ['value'],
},
func: async ({ value }) => String(value).toUpperCase(),
},
],
},
});
const output = await analyzer.test(
'console.log(await tools.uppercase({ value: query }))',
{ query: 'hello' }
);
console.log(output); // "HELLO"
```
`test(...)` creates a fresh runtime session per call. It seeds the runtime only with the optional values you provide for configured `contextFields`, returns captured runtime output as a string, and throws on runtime failures. It also throws if the snippet calls `final(...)` or `ask_clarification(...)`.
### Session State and `await`
`AxJSRuntime` state is session-scoped. Values survive across `execute()` calls only while you keep using the same session.
- The Actor loop runs in a persistent runtime session — variables survive across turns.
- `runtime.toFunction()` is different: it creates a new session per tool call, then closes it, so state does not persist across calls.
When code contains `await`, the runtime compiles it as an async function so top-level `await` works. In that async path, local declarations (`const`/`let`/`var`) are function-scoped and should not be relied on for cross-call state.
Prefer one of these patterns for durable state:
```javascript
// Pattern 1: Explicit global
globalThis.state = await getState();
globalThis.state.x += 1;
return globalThis.state;
```
```javascript
// Pattern 2: Shared object passed in globals/context
state.x += 1;
return state;
```
This may appear to work in some cases:
```javascript
state = await getState(); // no let/const/var
```
but `globalThis.state = ...` (or mutating a shared `state` object) is the recommended explicit pattern.
### Error Handling in the Code Interpreter
Errors thrown by code running inside `session.execute(code)` cross the worker boundary and can be caught on the host. Always `await` `session.execute()` inside a try/catch:
```typescript
import { AxRuntimeExecutionError } from '@ax-llm/ax';
try {
const result = await session.execute(code);
// use result
} catch (e) {
if (e instanceof AxRuntimeExecutionError) {
// Consecutive execution failures reached the cutoff; session was exited.
} else if (e instanceof Error && e.name === 'WaitForUserActionError') {
// handle domain-specific error
}
console.error(e instanceof Error ? e.message : String(e));
}
```
- `AxRuntimeExecutionError` is thrown when the runtime reaches the configured consecutive execution failure cutoff.
- The cutoff counter resets on successful execution.
- Preflight guardrail errors are not counted toward the cutoff.
- For custom errors thrown in the worker, use `e.name` checks if prototype identity is not preserved across the worker boundary.
### Custom Interpreters
The built-in `AxJSRuntime` uses Web Workers for sandboxed code execution. For other environments, implement the `AxCodeRuntime` interface:
```typescript
import type { AxCodeRuntime, AxCodeSession } from '@ax-llm/ax';
class MyBrowserInterpreter implements AxCodeRuntime {
getUsageInstructions?(): string {
return 'Runtime-specific guidance for writing code in this environment.';
}
createSession(globals?: Record<string, unknown>): AxCodeSession {
const scope = { ...globals };
const isPlainObject = (
value: unknown
): value is Record<string, unknown> => {
if (!value || typeof value !== 'object' || Array.isArray(value)) {
return false;
}
const proto = Object.getPrototypeOf(value);
return proto === Object.prototype || proto === null;
};
// Set up your execution environment with globals
return {
async execute(code: string) {
// Execute code and return result
},
async patchGlobals(nextGlobals: Record<string, unknown>) {
for (const [key, value] of Object.entries(nextGlobals)) {
const current = scope[key];
if (isPlainObject(current) && isPlainObject(value)) {
for (const existingKey of Object.keys(current)) {
if (!(existingKey in value)) {
delete current[existingKey];
}
}
Object.assign(current, value);
continue;
}
scope[key] = value;
}
},
close() {
// Clean up resources
},
};
}
}
```
When patching object-valued globals such as `inputs`, reconcile the existing object in place instead of blindly replacing the reference. That keeps previously saved references in the runtime session aligned with later host-side updates.
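To see why in-place reconciliation matters, here is a standalone sketch (the `reconcileInPlace` helper is hypothetical, mirroring the patch logic in the example above): a reference saved in an earlier turn keeps seeing updates because object identity is preserved.

```typescript
// Sketch: reconciling an object global in place keeps old references live.
function reconcileInPlace(
  current: Record<string, unknown>,
  next: Record<string, unknown>
): void {
  for (const key of Object.keys(current)) {
    if (!(key in next)) delete current[key]; // drop keys absent from the patch
  }
  Object.assign(current, next); // copy new and updated keys
}

const inputs: Record<string, unknown> = { query: 'old', stale: true };
const savedRef = inputs; // e.g. a reference captured in an earlier turn
reconcileInPlace(inputs, { query: 'new' });
// savedRef still points at the same object, so it sees query === 'new'
// and no longer has the removed 'stale' key.
```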
The `globals` object passed to `createSession` includes:
- All context field values (by field name)
- `llmQuery` function (supports both single and batched queries)
- `final(...args)` and `ask_clarification(...args)` completion functions
- Child-agent namespace object (`agentIdentity.namespace` if set, else `agents`) with child agent functions (e.g., `agents.summarize(...)`, `team.summarize(...)`)
- Agent functions under their namespaces (e.g., `utils.myFn(...)`, `db.search(...)`)
- `print` function when supported by the runtime (for `AxJSRuntime`, set `outputMode: 'stdout'`)
If provided, `getUsageInstructions()` is appended to the RLM system prompt as runtime-specific guidance. Use it for semantics that differ by runtime (for example state persistence or async execution behavior).
### RLM with Streaming
RLM mode does not support true streaming. When using `streamingForward`, RLM runs the full analysis and yields the final result as a single chunk.
### Runtime Character Cap
`maxRuntimeChars` is a hard ceiling for runtime payloads.
- **Hard cap:** `5_000` chars by default
- **Applies to:** `llmQuery` context and `codeInterpreter` output
- **Truncation behavior:** if data exceeds the cap, it is truncated with a `...[truncated N chars]` suffix
- **Manual chunking:** optional strategy you can implement in interpreter code; not done automatically by the runtime
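The truncation behavior above can be sketched as a small helper (the name `capRuntimePayload` is hypothetical, and the library's exact suffix formatting may differ):

```typescript
// Sketch of the cap: payloads over maxRuntimeChars are cut and
// annotated with how many characters were dropped.
function capRuntimePayload(text: string, maxRuntimeChars: number): string {
  if (text.length <= maxRuntimeChars) return text;
  const dropped = text.length - maxRuntimeChars;
  return `${text.slice(0, maxRuntimeChars)}...[truncated ${dropped} chars]`;
}
```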
## API Reference
### `AxJSRuntime`
```typescript
new AxJSRuntime({
timeout?: number;
permissions?: readonly AxJSRuntimePermission[];
outputMode?: 'return' | 'stdout';
captureConsole?: boolean;
allowUnsafeNodeHostAccess?: boolean;
nodeWorkerPoolSize?: number;
debugNodeWorkerPool?: boolean;
consecutiveErrorCutoff?: number; // Cutoff for consecutive execution failures
});
runtime.resetConsecutiveErrorCounter(): void; // Resets runtime-level consecutive failure counter
```
### `AxRuntimeExecutionError`
Thrown by `AxJSRuntime` when consecutive execution failures reach `consecutiveErrorCutoff`. When this happens, the active runtime session is exited. Preflight guardrail errors are not counted toward this cutoff.
### `AxRLMConfig`
```typescript
interface AxRLMConfig {
contextFields: string[]; // Input fields holding long context
runtime?: AxCodeRuntime; // Code runtime (default: AxJSRuntime)
maxSubAgentCalls?: number; // Cap on sub-LM calls (default: 50)
maxRuntimeChars?: number; // Cap for llmQuery context + code output (default: 5000)
maxBatchedLlmQueryConcurrency?: number; // Max parallel batched llmQuery calls (default: 8)
maxTurns?: number; // Max Actor turns before forcing Responder (default: 10)
contextPolicy?: AxContextPolicyConfig; // Context replay, checkpointing, and runtime-state policy
actorFields?: string[]; // Output fields produced by Actor instead of Responder
actorCallback?: (result: Record<string, unknown>) => void | Promise<void>; // Called after each Actor turn
mode?: 'simple' | 'advanced'; // Sub-query mode: 'simple' = AxGen, 'advanced' = AxAgent (default: 'simple')
}
type AxContextPolicyPreset = 'full' | 'adaptive' | 'lean';
// Preset meanings:
// - 'full': keep prior actions fully replayed with minimal compression
// - 'adaptive': keep live runtime state visible, preserve important recent actions,
// and collapse older successful work into checkpoint summaries as context grows
// - 'lean': prefer live runtime state plus compact summaries/checkpoints over full replay
// of older successful turns
// Practical rule:
// - use 'adaptive' for most long multi-turn tasks
// - use 'lean' when token pressure matters more than raw replay detail
// - use 'full' when debugging or when the actor must reread exact prior code/output
interface AxContextPolicyConfig {
preset?: AxContextPolicyPreset; // Compression profile: 'full' | 'adaptive' | 'lean'
state?: {
summary?: boolean; // Include Live Runtime State ahead of the action log
inspect?: boolean; // Expose inspect_runtime() to the actor
inspectThresholdChars?: number; // Large-context hint threshold
maxEntries?: number; // Max runtime-state entries to render
};
checkpoints?: {
enabled?: boolean; // Enable rolling checkpoint summaries
triggerChars?: number; // Generate a checkpoint when the prompt grows past this size
};
expert?: {
replay?: 'full' | 'adaptive' | 'minimal';
recentFullActions?: number;
pruneErrors?: boolean;
rankPruning?: { enabled?: boolean; minRank?: number };
tombstones?: boolean | Omit, 'functions'>;
};
}
```
### `AxCodeRuntime`
```typescript
interface AxCodeRuntime {
getUsageInstructions?(): string;
createSession(globals?: Record<string, unknown>): AxCodeSession;
}
```
### `AxCodeSession`
```typescript
interface AxCodeSession {
execute(
code: string,
options?: { signal?: AbortSignal; reservedNames?: readonly string[] }
): Promise<string>;
patchGlobals(
globals: Record<string, unknown>,
options?: { signal?: AbortSignal }
): Promise<void>;
close(): void;
}
```
### `AxAgentConfig`
```typescript
interface AxAgentConfig extends AxAgentOptions {
ai?: AxAIService;
agentIdentity?: { name: string; description: string; namespace?: string };
}
```
### `AxAgentFunction`
```typescript
type AxAgentFunction = {
name: string;
description: string;
parameters: AxFunctionJSONSchema; // required
returns?: AxFunctionJSONSchema; // optional output schema
namespace?: string; // default: 'utils'
examples?: {
code: string;
title?: string;
description?: string;
language?: string; // default render language: 'typescript'
}[];
func: AxFunctionHandler;
};
```
Agent functions are registered as namespaced globals in the JS runtime (e.g. `utils.search`, `db.query`). Reserved namespaces: `agents`, `llmQuery`, `final`, `ask_clarification`, and the configured `agentIdentity.namespace` when set.
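A minimal sketch of that reserved-name rule (a hypothetical validator, not the library's own check):

```typescript
// Hypothetical check mirroring the reserved-namespace rule above.
function isReservedNamespace(ns: string, agentNamespace?: string): boolean {
  const reserved = new Set(['agents', 'llmQuery', 'final', 'ask_clarification']);
  if (agentNamespace) reserved.add(agentNamespace); // agentIdentity.namespace, when set
  return reserved.has(ns);
}
```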
### `AxAgentFunctionGroup`
```typescript
type AxAgentFunctionGroup = {
namespace: string; // Discovery/runtime module name, such as 'db'
title: string; // Human-readable module title
selectionCriteria: string; // Short guidance shown in the Actor prompt module list
description: string; // Summary shown in discovery markdown
functions: Omit<AxAgentFunction, 'namespace'>[]; // namespace comes from the group
};
```
### `AxAgentOptions`
Extends `AxProgramForwardOptions` (without `functions` or `description`) with:
```typescript
{
debug?: boolean;
contextFields: readonly (
| string
| {
field: string;
promptMaxChars?: number; // Inline only when the full value is at or below the threshold
keepInPromptChars?: number; // Keep a truncated string excerpt in the Actor prompt
reverseTruncate?: boolean; // With keepInPromptChars, keep the last N chars instead of the first N
}
)[]; // Input fields loaded into JS runtime; object form can also expose prompt excerpts
agents?: {
local?: AxAnyAgentic[]; // Callable under .* in this agent
shared?: AxAnyAgentic[]; // Propagated one level to direct children
globallyShared?: AxAnyAgentic[]; // Propagated recursively to all descendants
excluded?: string[]; // Agent names NOT to receive from parents
};
fields?: {
local?: string[]; // Keep shared/global fields visible in this agent
shared?: string[]; // Fields passed to direct child agents
globallyShared?: string[]; // Fields passed to all descendants
excluded?: string[]; // Fields NOT to receive from parents
};
functions?: {
discovery?: boolean; // Enable module discovery APIs instead of prompt definition dump
local?: AxAgentFunction[] | AxAgentFunctionGroup[]; // Flat or grouped function modules in this agent's JS runtime
shared?: AxAgentFunction[] | AxAgentFunctionGroup[]; // Flat or grouped; propagated one level to direct children
globallyShared?: AxAgentFunction[] | AxAgentFunctionGroup[]; // Flat or grouped; propagated recursively to all descendants
excluded?: string[]; // Function names NOT to receive from parents
};
runtime?: AxCodeRuntime;
maxSubAgentCalls?: number;
maxRuntimeChars?: number;
maxBatchedLlmQueryConcurrency?: number;
maxTurns?: number;
contextPolicy?: AxContextPolicyConfig;
actorFields?: string[];
actorCallback?: (result: Record<string, unknown>) => void | Promise<void>;
inputUpdateCallback?: (currentInputs: Record<string, unknown>) => Promise<Record<string, unknown> | undefined> | Record<string, unknown> | undefined;
mode?: 'simple' | 'advanced';
recursionOptions?: Partial<AxProgramForwardOptions> & {
maxDepth?: number; // Maximum recursion depth for llmQuery sub-agent calls (default: 2)
};
actorOptions?: Partial<AxProgramForwardOptions>;
responderOptions?: Partial<AxProgramForwardOptions>;
}
```
### `stop()`
```typescript
public stop(): void
```
Available on `AxAgent`, `AxGen`, and `AxFlow`. Stops an in-flight `forward()` or `streamingForward()` call, causing it to throw `AxAIServiceAbortedError`. See [Stopping Agents](#stopping-agents).
================================================================================
# AxLearn: Self-Improving Agents
# Source: LEARN.md
# Zero-configuration optimization loop for Ax agents
# AxLearn: Self-Improving Agents
AxLearn provides a zero-configuration optimization loop that enables your Ax agents to automatically improve their prompts using production logs and teacher models.
## Quick Start
```typescript
import { ax, ai, AxLearn, AxAIOpenAIModel } from '@ax-llm/ax';
// 1. Define storage (simple functional interface)
// In a real app, save to a database.
const traces = new Map();
const storage = {
save: async (name, item) => name === 'traces' ? traces.set(item.id, item) : null,
load: async (name, query) => Array.from(traces.values())
};
// 2. Create your self-improving agent
// AxLearn handles logging, trace collection, and optimization automatically.
const agent = new AxLearn(ax('customer_query -> polite_reply'), {
name: 'support-chat-v1', // Unique identifier for versioning
storage, // Where to save traces
// Teacher for optimization
teacher: ai({
name: 'openai',
apiKey: process.env.OPENAI_APIKEY as string,
config: { model: AxAIOpenAIModel.GPT4O }
}),
budget: 20 // Max optimization rounds
});
// 3. Use in production
// Traces are automatically logged to your storage.
const llm = ai({ name: 'openai', apiKey: process.env.OPENAI_APIKEY as string });
const response = await agent.forward(llm, { customer_query: 'Where is my order?' });
// 4. Optimize offline
// Uses collected traces + synthetic data to improve the prompt.
await agent.optimize();
```
## How It Works
`AxLearn` is an all-in-one wrapper that manages the lifecycle of a self-improving agent.
1. **Production Logging**: When you call `agent.forward()`, it automatically logs input/output traces to your provided `storage`.
2. **Data Management**: During optimization, it loads these traces and mixes them with synthetic data generated by the teacher model.
3. **Automatic Tuning**: It runs an optimization loop (using GEPA) to iteratively improve the prompt and examples.
4. **Evaluation**: It uses an LLM-as-a-Judge to evaluate performance against the teacher model.
## Core Components
### AxStorage (Required)
You must provide a simple storage adapter to save traces and checkpoints.
**Interface:**
```typescript
type AxStorage = {
save: (name: string, item: AxTrace | AxCheckpoint) => Promise<void>;
load: (name: string, query: AxStorageQuery) => Promise<(AxTrace | AxCheckpoint)[]>;
};
```
**Example (In-Memory):**
```typescript
const traces = new Map();
const storage: AxStorage = {
save: async (name, item) => {
const list = traces.get(name) ?? [];
list.push(item);
traces.set(name, list);
},
load: async (name, query) => {
// A real adapter would filter by `query`; omitted here for brevity.
return traces.get(name) ?? [];
}
};
```
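For anything beyond a demo you will want traces to survive restarts. Below is a minimal sketch of a persistent adapter, assuming Node.js and one append-only JSON-lines file per `name`; the `fileStorage` helper and local `Storage` type are illustrative (in a real app you would import `AxStorage`, `AxTrace`, and `AxCheckpoint` from `@ax-llm/ax`).

```typescript
import { promises as fs } from 'node:fs';
import * as path from 'node:path';

// Structurally compatible with the AxStorage contract above; defined locally
// here so the sketch is self-contained.
type StoredItem = { id?: string } & Record<string, unknown>;
type Storage = {
  save: (name: string, item: StoredItem) => Promise<void>;
  load: (name: string, query?: unknown) => Promise<StoredItem[]>;
};

// One append-only JSON-lines file per agent name.
function fileStorage(dir: string): Storage {
  return {
    save: async (name, item) => {
      await fs.mkdir(dir, { recursive: true });
      await fs.appendFile(
        path.join(dir, `${name}.jsonl`),
        JSON.stringify(item) + '\n',
      );
    },
    load: async (name) => {
      try {
        const text = await fs.readFile(path.join(dir, `${name}.jsonl`), 'utf8');
        return text.split('\n').filter(Boolean).map((l) => JSON.parse(l));
      } catch {
        return []; // no traces recorded yet
      }
    },
  };
}
```

Append-only files keep writes cheap on the hot path; a database-backed adapter would follow the same two-method shape.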
### AxLearn Configuration
```typescript
const agent = new AxLearn(generator, {
name: 'my-agent',
storage: myStorage,
// Optimizer Settings
teacher: myTeacherLLM, // Strong model for teaching/judging
budget: 20, // Max optimization steps
// Data Settings
useTraces: true, // Use production traces? (default: true)
generateExamples: true, // Create synthetic data? (default: true)
synthCount: 50, // How many synthetic examples to generate
});
```
## Workflow
```
┌──────────────────────────────────────────────────────────────┐
│ Production Runtime │
│ ┌─────────┐ ┌──────────────┐ │
│ │ AxLearn │──(automatic logging)──────────▶│ AxStorage │ │
│ └─────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────┘
│ │
│ (agent.forward) │ (load traces)
▼ ▼
┌──────────────────────────────────────────────────────────────┐
│ Offline Optimization (agent.optimize) │
│ │
│ 1. Load Traces from Storage │
│ 2. Generate Synthetic Data (using Teacher) │
│ 3. Run Optimization Loop (adjust prompt/examples) │
│ 4. Evaluate (using Teacher as Judge) │
│ 5. Save Improved Checkpoint │
└──────────────────────────────────────────────────────────────┘
```
## Best Practices
1. **Start with traces**: Deploy your agent early to collect real-world data.
2. **Use user feedback**: Save traces with feedback to guide optimization.
```typescript
// Add feedback to a trace
trace.feedback = { score: 1, label: 'good' };
await storage.save('my-agent', trace);
```
3. **Optimize offline**: Run `agent.optimize()` periodically (e.g., nightly).
4. **Versioning**: Change the `name` (e.g., `'agent-v2'`) when you deploy a major update to keep traces separate.
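Periodic offline optimization can be wired up with a plain timer when no job runner is available. A hedged sketch: `scheduleOptimize` is a hypothetical helper, and `agent` is anything exposing `optimize()`, such as the AxLearn instance above.

```typescript
// Hypothetical helper: run agent.optimize() on a fixed interval, catching
// failures so one bad run does not kill the schedule.
type Optimizable = { optimize: () => Promise<void> };

function scheduleOptimize(agent: Optimizable, everyMs: number): () => void {
  const timer = setInterval(() => {
    agent.optimize().catch((err) => {
      console.error('optimize failed; will retry next cycle', err);
    });
  }, everyMs);
  return () => clearInterval(timer); // call the returned function to stop
}
```

In production a cron job or workflow engine is usually a better fit, since optimization runs are long and you will want logs and retries around them.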
================================================================================