AX AGENT FRAMEWORK
TYPESCRIPT-FIRST LLM INFRASTRUCTURE

Ax Framework

Build LLM-Powered Agents with TypeScript

> About Ax

The best framework to build LLM-powered agents

Building intelligent agents is a breeze with the Ax framework, inspired by the power of “Agentic workflows” and the Stanford DSPy paper. It seamlessly integrates with multiple LLMs and VectorDBs to build RAG pipelines or collaborative agents that can solve complex problems. Plus, it offers advanced features like streaming validation, multi-modal DSPy, etc.

Large language models (LLMs) are becoming really powerful and have reached a point where they can work as the backend for your entire product. However, there’s still a lot of complexity to manage: choosing the correct prompts and models, streaming, function calls, error correction, and much more. We aim to help manage this complexity via this easy-to-use library that can work with all state-of-the-art LLMs. Additionally, we use the latest research to add new capabilities like DSPy to the library.

Install

With NPM

npm install @ax-llm/ax

With Yarn

yarn add @ax-llm/ax

Features

  • Support for various LLMs and Vector DBs
  • Prompts auto-generated from simple signatures
  • Build Agents that can call other agents
  • Convert docs of any format to text
  • RAG, smart chunking, embedding, querying
  • Works with Vercel AI SDK
  • Output validation while streaming
  • Multi-modal DSPy supported
  • Automatic prompt tuning using optimizers
  • OpenTelemetry tracing / observability
  • Production-ready TypeScript code
  • Lightweight, zero dependencies

Quick Start

  1. Pick an AI to work with
// Pick an LLM
const ai = new AxOpenAI({ apiKey: process.env.OPENAI_APIKEY } as AxOpenAIArgs);
  2. Create a prompt signature based on your use case
// Signature defines the inputs and outputs of your prompt program
const cot = new AxGen(ai, `question:string -> answer:string`, { mem });
  3. Execute this new prompt program
// Pass in the input fields defined in the above signature
const res = await cot.forward({ question: 'Are we in a simulation?' });
  4. Or if you just want to directly use the LLM
const res = await ai.chat([
  { role: "system", content: "Help the customer with their questions" },
  { role: "user", content: "I'm looking for a MacBook Pro M2 with 96GB RAM?" }
]);

Reach out

https://twitter.com/dosco

> What's a prompt signature?

Prompt signatures are how you define the inputs and outputs to an Ax prompt.


Efficient type-safe prompts are auto-generated from a simple signature. A prompt signature is made up of a "task description" inputField:type "field description" -> outputField:type "field description". The idea behind prompt signatures is based on work done in the “Demonstrate-Search-Predict” paper.

You can have multiple input and output fields, and each field can be of the types string, number, boolean, date, datetime, class "class1, class2", JSON, or an array of any of these, e.g., string[]. When a type is not defined, it defaults to string. When the JSON type is used, the underlying AI is encouraged to generate valid JSON. A short example follows the table below.

Output Field Types

Type | Description | Usage | Example Output
string | A sequence of characters. | fullName:string | "example"
number | A numerical value. | price:number | 42
boolean | A true or false value. | isEvent:boolean | true, false
date | A date value. | startDate:date | "2023-10-01"
datetime | A date and time value. | createdAt:datetime | "2023-10-01T12:00:00Z"
class "class1,class2" | A classification of items. | category:class | ["class1", "class2", "class3"]
string[] | An array of strings. | tags:string[] | ["example1", "example2"]
number[] | An array of numbers. | scores:number[] | [1, 2, 3]
boolean[] | An array of boolean values. | permissions:boolean[] | [true, false, true]
date[] | An array of dates. | holidayDates:date[] | ["2023-10-01", "2023-10-02"]
datetime[] | An array of date and time values. | logTimestamps:datetime[] | ["2023-10-01T12:00:00Z", "2023-10-02T12:00:00Z"]
class[] "class1,class2" | Multiple classes | categories:class[] | ["class1", "class2", "class3"]

> Supported LLMs

Using various LLMs

Ax supports all the top LLM providers and models, along with their advanced capabilities, such as function calling, multi-modal, streaming, and JSON.

Our defaults, including default models, are selected to ensure solid agent performance.

OpenAI

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string,
  config: {
    model: AxAIOpenAIModel.GPT4Turbo,
    embedModel: AxAIOpenAIEmbedModel.TextEmbedding3Small,
    temperature: 0.1,
  }
});

Azure OpenAI

Azure requires you to set a resource name and a deployment name.

https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal

const ai = new AxAI({
  name: 'azure-openai',
  apiKey: process.env.AZURE_OPENAI_APIKEY as string,
  resourceName: 'test-resource',
  deploymentName: 'test-deployment'
});

Together

Together runs a diverse array of open-source models, each designed for a specific use case. This variety ensures that you can find the perfect model for your needs.

https://docs.together.ai/docs/inference-models

const ai = new AxAI({
  name: 'together',
  apiKey: process.env.TOGETHER_APIKEY as string,
  config: {
    model: 'Qwen/Qwen1.5-0.5B-Chat'
  }
});

Anthropic

const ai = new AxAI({
  name: 'anthropic',
  apiKey: process.env.ANTHROPIC_APIKEY as string
});

Groq

Groq uses specialized hardware to serve open-source models with the lowest latency. It supports a small number of good models.

const ai = new AxAI({
  name: 'groq',
  apiKey: process.env.GROQ_APIKEY as string
});

Google Gemini

An excellent model family with very long context lengths at the lowest price points. Gemini has built-in support for compute (code execution); their models can write and run code in the backend if needed.

const ai = new AxAI({
  name: 'google-gemini',
  apiKey: process.env.GOOGLE_GEMINI_APIKEY as string,
  options: { codeExecution: true }
});

Cohere

const ai = new AxAI({
  name: 'cohere',
  apiKey: process.env.COHERE_APIKEY as string
});

Mistral

const ai = new AxAI({
  name: 'mistral',
  apiKey: process.env.MISTRAL_APIKEY as string
});

Deepseek

Deepseek is an LLM provider from China that has excellent models.

const ai = new AxAI({
  name: 'deepseek',
  apiKey: process.env.DEEPSEEK_APIKEY as string
});

Ollama

Ollama is an engine for running open-source models locally on your laptop. We default to nous-hermes2 for inference and all-minilm for embedding.

const ai = new AxAI({
  name: 'ollama',
  apiKey: "not-set",
  url: 'http://localhost:11434/v1',
  config: { model: 'nous-hermes2', embedModel: 'all-minilm' }
});

Huggingface

const ai = new AxAI({
  name: 'huggingface',
  apiKey: process.env.HF_APIKEY as string
});

> Using Ax

A more detailed guide to using Ax

Pick an LLM

Ax is a zero-dependency framework. Every LLM API integration we build is solid, works well with Ax, and supports all required features, such as function calling, multi-modal, JSON, streaming, etc.

Currently we support "openai" | "azure-openai" | "together" | "anthropic" | "groq" | "google-gemini" | "cohere" | "huggingface" | "mistral" | "deepseek" | "ollama"

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});

The LLMs are pre-configured with sensible defaults, such as models and other configurations like topK, temperature, etc.

Prompting

Prompts are usually stressful and complex. You never know what the right prompt is, and blobs of text in your code are hard to deal with. We fix this by adopting the prompt signatures from the popular Stanford DSPy paper.

A prompt signature is a list of typed input and output fields along with a task description prefix. The following field types are supported: 'string' | 'number' | 'boolean' | 'json' | 'image' | 'audio'. Add [] to convert a field into an array field, e.g., string[], number[]. Additionally, a ? marks a field as optional, e.g., context?:string (a short example follows the signatures below).

Summarize some text

textToSummarize -> shortSummary "summarize in 5 to 10 words"

Answer questions using a multi-modal prompt that takes a question and an image

"answer biology questions about animals"
question:string, animalImage:image -> answer:string

A prompt that ensures the response is a numeric list

"Rate the quality of each answer on a scale of 1 to 10 against the question"
question:string, answers:string[] -> rating:number[]
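
The optional marker works the same way in any signature. A small sketch using the context?:string field mentioned above (illustrative only):

"Answer the question, using the extra context when it is provided"
question:string, context?:string -> answer:string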

Putting it all together

Use the above AI and a prompt to build an LLM-powered program to summarize the text.

// example.ts
import { AxAI, AxChainOfThought } from '@ax-llm/ax';

const textToSummarize = `
The technological singularity—or simply the singularity[1]—is a hypothetical 
future point in time at which technological growth becomes uncontrollable 
and irreversible, resulting in unforeseeable changes to human 
civilization.[2][3] ...`;

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});

const gen = new AxChainOfThought(`textToSummarize -> shortSummary "summarize in 5 to 10 words"`);
const res = await gen.forward(ai, { textToSummarize });

console.log(res);
tsx example.ts

{
    shortSummary: "The technological singularity refers to a
    hypothetical future scenario where technological..."
}

Build your first agent

Ax makes it really simple to build agents. An agent requires a name, description, and signature. It can optionally use functions and other agents.

Example: Stock Analyst Agent. The Stock Analyst Agent is an advanced AI-powered tool that provides comprehensive stock analysis and financial insights. It combines multiple specialized sub-agents and functions to deliver in-depth evaluations of stocks, market trends, and related financial data.

This is only an example, but it highlights the power of agentic workflows, where you can build agents who work with agents to handle complex tasks.

const agent = new AxAgent({
  name: 'Stock Analyst',
  description:
    'An AI agent specialized in analyzing stocks, market trends, and providing financial insights.',
  signature: `
    stockSymbol:string, 
    analysisType:string "fundamental, technical or sentiment" -> analysisReport`,
  functions: [
    getStockData,
    calculateFinancialRatios,
    analyzeTechnicalIndicators,
    performSentimentAnalysis
  ],
  agents: [
    financialDataCollector,
    marketTrendAnalyzer,
    newsAnalyzer,
    sectorAnalyst,
    competitorAnalyzer,
    riskAssessor,
    valuationExpert,
    economicIndicatorAnalyzer,
    insiderTradingMonitor,
    esgAnalyst
  ]
});

Example of agents working with other agents

// ./src/examples/agent.ts

const researcher = new AxAgent({
  name: 'researcher',
  description: 'Researcher agent',
  signature: `physicsQuestion "physics questions" -> answer "reply in bullet points"`
});

const summarizer = new AxAgent({
  name: 'summarizer',
  description: 'Summarizer agent',
  signature: `text "text to summarize" -> shortSummary "summarize in 5 to 10 words"`
});

const agent = new AxAgent({
  name: 'agent',
  description: 'An agent to research complex topics',
  signature: `question -> answer`,
  agents: [researcher, summarizer]
});

await agent.forward(ai, { question: 'How many atoms are there in the universe?' });

> RAG & Vector DBs

A guide on working with vector databases and Retrieval Augmented Generation (RAG) in Ax.

Vector databases are critical to building LLM workflows. We have clean abstractions over popular vector databases and our own quick in-memory vector database.

Provider | Tested
In Memory | 🟢 100%
Weaviate | 🟢 100%
Cloudflare | 🟡 50%
Pinecone | 🟡 50%
// Create embeddings from text using an LLM
const ret = await ai.embed({ texts: 'hello world' });

// Create an in memory vector db
const db = new axDB('memory');

// Insert into vector db
await db.upsert({
  id: 'abc',
  table: 'products',
  values: ret.embeddings[0]
});

// Query for similar entries using embeddings
const matches = await db.query({
  table: 'products',
  values: ret.embeddings[0]
});

Alternatively, you can use the AxDBManager, which handles smart chunking, embedding, and querying for you; it makes things almost too easy.

const manager = new AxDBManager({ ai, db });
await manager.insert(text);

const matches = await manager.query(
  'John von Neumann on human intelligence and singularity.'
);
console.log(matches);

RAG Documents

Using documents like PDF, DOCX, PPT, XLS, etc., with LLMs is a huge pain. We make it easy with Apache Tika, an open-source document processing engine.

Launch Apache Tika

docker run -p 9998:9998 apache/tika

Convert documents to text and embed them for retrieval using the AxDBManager, which also supports a reranker and query rewriter. Two default implementations, AxDefaultResultReranker and AxDefaultQueryRewriter, are available.

const tika = new AxApacheTika();
const text = await tika.convert('/path/to/document.pdf');

const manager = new AxDBManager({ ai, db });
await manager.insert(text);

const matches = await manager.query('Find some text');
console.log(matches);
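
The reranker and query rewriter mentioned above can be plugged into the manager. A hedged sketch, assuming AxDBManager accepts them through its config object and that both defaults can be constructed without arguments (check the AxDBManager options for the exact shape):

const manager = new AxDBManager({
  ai,
  db,
  config: {
    rewriter: new AxDefaultQueryRewriter(), // assumption: option name and no-arg constructor
    reranker: new AxDefaultResultReranker() // assumption: option name and no-arg constructor
  }
});

await manager.insert(text);
const matches = await manager.query('Find some text');
console.log(matches);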

> Multi-modal DSPy

Using multi-modal inputs like images and audio with DSPy pipelines and LLMs

When using models like GPT-4o and Gemini that support multi-modal prompts, we support using image fields, and this works with the whole DSP pipeline.

const image = fs
  .readFileSync('./src/examples/assets/kitten.jpeg')
  .toString('base64');

const gen = new AxChainOfThought(`question, animalImage:image -> answer`);

const res = await gen.forward(ai, {
  question: 'What family does this animal belong to?',
  animalImage: { mimeType: 'image/jpeg', data: image }
});

When using models like gpt-4o-audio-preview that support multi-modal prompts with audio support, we support using audio fields, and this works with the whole DSP pipeline.

const audio = fs
  .readFileSync('./src/examples/assets/comment.wav')
  .toString('base64');

const gen = new AxGen(`question, commentAudio:audio -> answer`);

const res = await gen.forward(ai, {
  question: 'What family does this animal belong to?',
  commentAudio: { format: 'wav', data: audio }
});

> Routing

Use multiple AI services through a single interface, automatically routing requests to the right service based on the model specified.

Multi-Service Router

The router lets you use multiple AI services through a single interface, automatically routing requests to the right service based on the model specified.

import { AxAI, AxMultiServiceRouter, AxAIOpenAIModel } from '@ax-llm/ax'

/// Setup OpenAI with model list
const openai = new AxAI({ 
  name: 'openai', 
  apiKey: process.env.OPENAI_APIKEY,
  models: [
    {
      key: 'basic',
      model: AxAIOpenAIModel.GPT4OMini,
      description: 'Model for very simple tasks such as answering quick short questions',
    },
    {
      key: 'medium',
      model: AxAIOpenAIModel.GPT4O,
      description: 'Model for semi-complex tasks such as summarizing text, writing code, and more',
    }
  ]
})

// Setup Gemini with model list
const gemini = new AxAI({ 
  name: 'google-gemini', 
  apiKey: process.env.GOOGLE_APIKEY,
  models: [
    {
      key: 'deep-thinker',
      model: 'gemini-2.0-flash-thinking',
      description: 'Model that can think deeply about a task, best for tasks that require planning',
    },
    {
      key: 'expert',
      model: 'gemini-2.0-pro',
      description: 'Model that is the best for very complex tasks such as writing large essays, complex coding, and more',
    }
  ]
})

const ollama = new AxAI({ 
  name: 'ollama', 
  config: { model: "nous-hermes2" }
})

const secretService = {
    key: 'sensitive-secret',
    service: ollama,
    description: 'Model for sensitive secrets tasks'
}

// Create a router with all services
const router = new AxMultiServiceRouter([openai, gemini, secretService])

// Route to Gemini's expert model
const openaiResponse = await router.chat({
  chatPrompt: [{ role: 'user', content: 'Hello!' }],
  model: 'expert'
})

// Or use the router with AxGen
const gen = new AxGen(`question -> answer`)
const res = await gen.forward(router, { question: 'Hello!' })

The load balancer is ideal for high availability, while the router is perfect when you need specific models for specific tasks. Both can be used with any of Ax’s features like streaming, function calling, and chain-of-thought prompting.

They can also be used together

You can also use the balancer and the router together: multiple balancers can be used with the router, or the router can be used with the balancer.
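
For instance, a minimal sketch of composing them, assuming AxBalancer takes a list of AI services (the exact constructor may differ):

// Balance the same workload across two providers
const balancer = new AxBalancer([openai, gemini])

// Use the balancer wherever a single AI service is expected,
// for example as one of the services handed to the router
const router = new AxMultiServiceRouter([balancer, ollama])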

Clear Use Cases

  • Balancer (AxBalancer):
    Use the balancer when you want to ensure high availability and load distribution across multiple AI services. It automatically retries requests on failures (using an exponential backoff mechanism) and chooses the service with optimal performance (for example, based on latency metrics).

  • Router (AxMultiServiceRouter):
    Use the router when you need explicit routing based on a model key. It aggregates models from both key–based and non–key–based services and delegates requests (chat or embed) to the underlying service that matches the provided key. This is especially useful when you have specialized models for different tasks.

Configuration and Options

  • Balancer Options:

    • Debug Mode: Toggle debug logs to see which service is being used and how retries occur.
    • Retry Settings: Options like initialBackoffMs, maxBackoffMs, and maxRetries allow you to control the retry behavior (see the sketch after this list).
    • Metric Comparator: The default comparator (based on mean latency) can be overridden with a custom comparator if you want to prioritize services differently.
  • Router Expectations:

    • Unique Model Keys: When adding services, make sure the model keys are unique. The router validates that no two services provide the same key (for non–key–based items, it aggregates the model list).
    • Delegation Logic: For key–based services, the router keeps the original request’s model key. For non–key–based services (where a model list is provided), it delegates using the provided key from the model list.
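
A hedged sketch of passing the options named above (whether they go in a second constructor argument is an assumption; consult the AxBalancer constructor for the exact shape):

const balancer = new AxBalancer([openai, gemini], {
  debug: true,            // log which service handles each request and how retries occur
  initialBackoffMs: 500,  // first retry delay
  maxBackoffMs: 8000,     // cap on exponential backoff
  maxRetries: 3           // give up after this many attempts
})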

Error Handling and Fallback

  • Balancer Fallback:
    The balancer uses error classification (for example, differentiating network, authentication, or timeout errors) to decide whether to retry the same service or to switch to another available service.

  • Router Errors:
    If the router cannot find a service for a given model key (or if there’s a conflict between key–based and non–key–based definitions), it throws an error immediately. This helps in catching configuration mistakes early; a minimal sketch follows.
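
A minimal sketch of surfacing such a configuration mistake early (the unknown key here is deliberate):

try {
  await router.chat({
    chatPrompt: [{ role: 'user', content: 'Hello!' }],
    model: 'no-such-model-key' // not registered with any service
  })
} catch (err) {
  console.error('Router configuration error:', err)
}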

> Model Context Protocol (MCP)

Model Context Protocol (MCP), allowing your agents to access external tools and resources through a standardized interface.

Ax provides seamless integration with the Model Context Protocol (MCP), allowing your agents to access external tools and resources through a standardized interface.

Using AxMCPClient

The AxMCPClient allows you to connect to any MCP-compatible server and use its capabilities within your Ax agents:

import { AxMCPClient, AxMCPStdioTransport } from '@ax-llm/ax'

// Initialize an MCP client with a transport
const transport = new AxMCPStdioTransport({
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-memory'],
})

// Create the client with optional debug mode
const client = new AxMCPClient(transport, { debug: true })

// Initialize the connection
await client.init()

// Use the client's functions in an agent
const memoryAgent = new AxAgent({
  name: 'MemoryAssistant',
  description: 'An assistant with persistent memory',
  signature: 'input, userId -> response',
  functions: [client], // Pass the client as a function provider
})

// Or use the client with AxGen
const memoryGen = new AxGen('input, userId -> response', {
    functions: [client]
})

> Streaming Outputs

Learn how to use streaming outputs in Ax, including streaming validation and assertions for output fields and function execution.

We support parsing output fields and executing functions while streaming, with end-to-end streaming that parses, validates, and calls functions as tokens arrive. This allows for fail-fast behavior and error correction without waiting for the whole output, saving tokens and costs and reducing latency. Assertions are a powerful way to ensure the output matches your requirements; they also work with streaming.

// setup the prompt program
const gen = new AxChainOfThought(
  ai,
  `startNumber:number -> next10Numbers:number[]`
);

// add an assertion to ensure that the number 5 is not in an output field
gen.addAssert(({ next10Numbers }: Readonly<{ next10Numbers: number[] }>) => {
  return next10Numbers ? !next10Numbers.includes(5) : undefined;
}, 'Number 5 is not allowed');

// run the program with streaming enabled
const res = await gen.forward({ startNumber: 1 }, { stream: true });

// or run the program with end-to-end streaming
const generator = await gen.streamingForward({ startNumber: 1 }, { stream: true });
for await (const res of generator) {}

The above example allows you to validate entire output fields as they are streamed in. This validation works both when streaming and when not, and is triggered once the whole field value is available. For true validation while streaming, check out the example below; it can massively improve performance and save tokens at scale in production.

// add an assertion to ensure all lines start with a number and a dot.
gen.addStreamingAssert(
  'answerInPoints',
  (value: string) => {
    const re = /^\d+\./;

    // split the value by lines, trim each line,
    // filter out empty lines and check if all lines match the regex
    return value
      .split('\n')
      .map((x) => x.trim())
      .filter((x) => x.length > 0)
      .every((x) => re.test(x));
  },
  'Lines must start with a number and a dot. Eg: 1. This is a line.'
);

// run the program with streaming enabled
const res = await gen.forward(
  {
    question: 'Provide a list of optimizations to speedup LLM inference.'
  },
  { stream: true, debug: true }
);

> Vercel AI SDK Integration

Learn how to integrate Ax with the Vercel AI SDK for building AI-powered applications using both the AI provider and Agent provider functionality.

npm i @ax-llm/ax-ai-sdk-provider

Then use it with the AI SDK; you can either use the AI provider or the Agent provider.

const ai = new AxAI({
    name: 'openai',
    apiKey: process.env['OPENAI_APIKEY'] ?? "",
});

// Create a model using the provider
const model = new AxAIProvider(ai);

export const foodAgent = new AxAgent({
  name: 'food-search',
  description:
    'Use this agent to find restaurants based on what the customer wants',
  signature,
  functions
})

// Get vercel ai sdk state
const aiState = getMutableAIState()

// Create an agent provider for a specific task
const foodAgentProvider = new AxAgentProvider(ai, {
    agent: foodAgent,
    updateState: (state) => {
         aiState.done({ ...aiState.get(), state })
    },
    generate: async ({ restaurant, priceRange }) => {
        return (
            <BotCard>
                <h1>{restaurant as string} {priceRange as string}</h1>
            </BotCard>
        )
    }
})

// Use with streamUI, a critical part of building chat UIs in the AI SDK
const result = await streamUI({
    model,
    initial: <SpinnerMessage />,
    messages: [
        // ...
    ],
    text: ({ content, done, delta }) => {
        // ...
    },
    tools: {
        // @ts-ignore
        'find-food': foodAgentProvider,
    }
})

> Prompt Tuning Basic

You can tune your prompts using a larger model to help them run more efficiently and give you better results.

You can tune your prompts using a larger model to help them run more efficiently and give you better results. This is done by using an optimizer like AxBootstrapFewShot with examples from the popular HotPotQA dataset. The optimizer generates demonstrations (demos) which, when used with the prompt, help improve its efficiency.

// Download the HotPotQA dataset from huggingface
const hf = new AxHFDataLoader({
  dataset: 'hotpot_qa',
  split: 'train'
});

const examples = await hf.getData<{ question: string; answer: string }>({
  count: 100,
  fields: ['question', 'answer']
});

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});

// Setup the program to tune
const program = new AxChainOfThought<{ question: string }, { answer: string }>(
  ai,
  `question -> answer "in short 2 or 3 words"`
);

// Setup a Bootstrap Few Shot optimizer to tune the above program
const optimize = new AxBootstrapFewShot<
  { question: string },
  { answer: string }
>({
  program,
  examples
});

// Setup an evaluation metric; em and f1 scores are a popular way to measure retrieval performance.
const metricFn: AxMetricFn = ({ prediction, example }) =>
  AxEvalUtil.emScore(prediction.answer as string, example.answer as string);

// Run the optimizer and remember to save the result to use later
const result = await optimize.compile(metricFn);

And to use the generated demos with the above ChainOfThought program

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});

// Setup the program to use the tuned data
const program = new AxChainOfThought<{ question: string }, { answer: string }>(
  ai,
  `question -> answer "in short 2 or 3 words"`
);

// load tuning data
program.loadDemos('demos.json');

const res = await program.forward({
  question: 'What castle did David Gregory inherit?'
});

console.log(res);

> Prompt Tuning MiPRO v2

MiPRO v2 is an advanced prompt optimization framework that uses Bayesian optimization to automatically find the best instructions, demonstrations, and examples for your LLM programs.

MiPRO v2 is an advanced prompt optimization framework that uses Bayesian optimization to automatically find the best instructions, demonstrations, and examples for your LLM programs. By systematically exploring different prompt configurations, MiPRO v2 helps maximize model performance without manual tuning.

Key Features

  • Instruction optimization: Automatically generates and tests multiple instruction candidates
  • Few-shot example selection: Finds optimal demonstrations from your dataset
  • Smart Bayesian optimization: Uses UCB (Upper Confidence Bound) strategy to efficiently explore configurations
  • Early stopping: Stops optimization when improvements plateau to save compute
  • Program and data-aware: Considers program structure and dataset characteristics

Basic Usage

import { AxAI, AxChainOfThought, AxMiPRO } from '@ax-llm/ax'

// 1. Setup your AI service
const ai = new AxAI({
  name: 'google-gemini',
  apiKey: process.env.GOOGLE_APIKEY
})

// 2. Create your program
const program = new AxChainOfThought(`input -> output`)

// 3. Configure the optimizer
const optimizer = new AxMiPRO({
  ai,
  program,
  examples: trainingData, // Your training examples
  options: {
    numTrials: 20,  // Number of configurations to try
    auto: 'medium'  // Optimization level
  }
})

// 4. Define your evaluation metric
const metricFn = ({ prediction, example }) => {
  return prediction.output === example.output
}

// 5. Run the optimization
const optimizedProgram = await optimizer.compile(metricFn, {
  valset: validationData  // Optional validation set
})

// 6. Use the optimized program
const result = await optimizedProgram.forward(ai, { input: "test input" })

Configuration Options

MiPRO v2 provides extensive configuration options:

Option | Description | Default
numCandidates | Number of instruction candidates to generate | 5
numTrials | Number of optimization trials | 30
maxBootstrappedDemos | Maximum number of bootstrapped demonstrations | 3
maxLabeledDemos | Maximum number of labeled examples | 4
minibatch | Use minibatching for faster evaluation | true
minibatchSize | Size of evaluation minibatches | 25
earlyStoppingTrials | Stop if no improvement after N trials | 5
minImprovementThreshold | Minimum score improvement threshold | 0.01
programAwareProposer | Use program structure for better proposals | true
dataAwareProposer | Consider dataset characteristics | true
verbose | Show detailed optimization progress | false

Optimization Levels

You can quickly configure optimization intensity with the auto parameter:

// Light optimization (faster, less thorough)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'light' })

// Medium optimization (balanced)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'medium' })

// Heavy optimization (slower, more thorough)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'heavy' })

Advanced Example: Sentiment Analysis

// Create sentiment analysis program
const classifyProgram = new AxChainOfThought<
  { productReview: string },
  { label: string }
>(`productReview -> label:string "positive" or "negative"`)

// Configure optimizer with advanced settings
const optimizer = new AxMiPRO({
  ai,
  program: classifyProgram,
  examples: trainingData,
  options: {
    numCandidates: 3,
    numTrials: 10,
    maxBootstrappedDemos: 2,
    maxLabeledDemos: 3,
    earlyStoppingTrials: 3,
    programAwareProposer: true,
    dataAwareProposer: true,
    verbose: true
  }
})

// Run optimization and save the result
const optimizedProgram = await optimizer.compile(metricFn, {
  valset: validationData
})

// Save configuration for future use
const programConfig = JSON.stringify(optimizedProgram, null, 2)
await fs.promises.writeFile('./optimized-config.json', programConfig)

How It Works

MiPRO v2 works through these steps:

  1. Generates various instruction candidates
  2. Bootstraps few-shot examples from your data
  3. Selects labeled examples directly from your dataset
  4. Uses Bayesian optimization to find the optimal combination
  5. Applies the best configuration to your program

By exploring the space of possible prompt configurations and systematically measuring performance, MiPRO v2 delivers optimized prompts that maximize your model’s effectiveness.

> DSPy Explained

What's DSPy, why it matters, and how to use it.

Demonstrate-Search-Predict, or DSPy, is a now-famous Stanford paper focused on optimizing the prompting of LLMs. The basic idea is to provide examples instead of instructions.

Ax supports DSPy and allows you to set examples on each prompt. It also allows you to run an optimizer, which runs the prompt using inputs from a test set and validates the outputs against the same test set. In short, the optimizer helps you capture good examples across the entire tree of prompts your workflow is built with.

Pick a prompt strategy

There are various prompts available in Ax; pick one based on your needs. A short instantiation sketch follows the list.

  1. Generate - Generic prompt that all other prompts inherit from.
  2. ChainOfThought - Increases performance by reasoning before providing the answer.
  3. RAG - Uses a vector database to add context and improve performance and accuracy.
  4. Agent - For agentic workflows.
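
A hedged sketch of instantiating the first, second, and fourth strategies (the signatures are illustrative; RAG additionally needs a vector database, so it is omitted here):

const gen = new AxGen(`question -> answer`)

const cot = new AxChainOfThought(`question -> answer "reply step by step"`)

const qaAgent = new AxAgent({
  name: 'qa',
  description: 'Answers general questions',
  signature: `question -> answer`
})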

Create a signature

A signature defines the task you want to do, the inputs you’ll provide, and the outputs you expect the LLM to generate.

const prompt = new AxGen(
`"Extract customer query details" customerMessage:string -> customerName, customerIssue, productName:string, troubleshootingAttempted?:string`)

The next optional but most important thing you can do to improve the performance of your prompts is to set examples. When we say “performance,” we mean the number of times the LLM does exactly what you expect correctly over the number of times it fails.

Examples are the best way to communicate to the LLM what you want it to do. The patterns you define in high-quality examples help the LLM far more than instructions do.

prompt.setExamples([
    {
        customerMessage: "Hello, I'm Jane Smith. I'm having trouble with my UltraPhone X. The screen remains black even after restarting multiple times. I have tried charging it overnight and using a different charger.",
        customerName: "Jane Smith",
        productName: "UltraPhone X",
        troubleshootingAttempted: "Charging it overnight and using a different charger.",
    },
    {
        customerMessage: "Hi, my name is Michael Johnson. My EcoPrinter Pro isn't connecting to Wi-Fi. I've restarted the printer and my router, and also tried connecting via Ethernet cable.",
        customerName: "Michael Johnson",
        productName: "EcoPrinter Pro",
        troubleshootingAttempted: "Restarted the printer and router, and tried connecting via Ethernet cable.",
    },
    {
        customerMessage: "Greetings, I'm Sarah Lee. I'm experiencing issues with my SmartHome Hub. It keeps losing connection with my smart devices. I have reset the hub, checked my internet connection, and re-paired the devices.",
        customerName: "Sarah Lee",
        productName: "SmartHome Hub",
        troubleshootingAttempted: "Reset the hub, checked the internet connection, and re-paired the devices.",
    }
])

Use this prompt

You are now ready to use this prompt in your workflows.

// Setup the AI
const ai = new AxAI({ name: 'openai', apiKey: process.env.OPENAI_APIKEY })

// Execute the prompt
const { customerName, productName, troubleshootingAttempted } = await prompt.forward(ai, { customerMessage })

Easy enough! This is all you need.

DSPy prompt tuning

What if I want more performance, or want to run this with a smaller model? I was told you can tune your prompts with DSPy. Yes, this is true, and you can do it here too. In short, you can use a big LLM to generate better examples for every prompt you use in your entire flow of prompts.

// Use the HuggingFace data loader or create one for your own data
const hf = new AxHFDataLoader({
  dataset: 'yixuantt/MultiHopRAG',
  split: 'train',
  config: 'MultiHopRAG',
  options: { length: 5 }
});

await hf.loadData();
// Fetch some rows, map the data columns to your prompt's inputs
const examples = await hf.getRows<{ question: string; answer: string }>({
  count: 20,
  fields: ['query', 'answer'],
  renameMap: { query: 'question', answer: 'answer' }
});
// Create your prompt
const prompt = new AxGen(`question -> answer`)
// Setup a Bootstrap Few Shot optimizer to tune the above prompt
const optimize = new AxBootstrapFewShot<
  { question: string },
  { answer: string }
>({
  program: prompt,
  examples
});
// Setup an evaluation metric; em and f1 scores are a popular way to measure retrieval performance.
const metricFn: AxMetricFn = ({ prediction, example }) => {
  return AxEvalUtil.emScore(
    prediction.answer as string,
    example.answer as string
  );
};
// Run the optimizer
const result = await optimize.compile(metricFn);

// Save the results to use later
await fs.promises.writeFile('./qna-tune-demos.json', JSON.stringify(result, null, 2));

// Use this tuning data in your workflow
const values = await fs.promises.readFile('./qna-tune-demos.json', 'utf8');
const demos = JSON.parse(values);

// You're done; now use this prompt
prompt.setDemos(demos);

> LLM Function Calling

How to create functions to use in Ax

In this guide, we’ll explain how to create functions, function classes, etc. that can be used in Ax. Creating focused functions with clear names and descriptions is critical to a solid workflow. Do not use too many functions in a prompt or make a single function do too much. Focused functions are better. If you need several functions, look into breaking the task down into multiple prompts or using agents.

Function definition simple

A function is an object with a name and description, along with a JSON schema of the function arguments and the function itself.

import axios from 'axios'

// The function
const googleSearchAPI = async (query: string) => {
    const res = await axios.get("http://google.com/?q=" + query)
    return res.data
}
// The function definition
const googleSearch: AxFunction = {
    name: 'googleSearch',
    description: 'Use this function to search google for links related to the query',
    func: googleSearchAPI,
    parameters: {
        type: 'object',
         properties: {
             query: {
                description: `The query to search for`,
                type: 'string'
            },
        }
    }
}

Function definition as a class

Another way to define functions is as a class with a toFunction method.

class GoogleSearch {
    private apiKey: string;

    constructor(apiKey: string) {
        this.apiKey = apiKey;
    }


    async query(query: string) {
        const res = await axios.get("http://google.com/?q=" + query)
        return res.data
    }

    async toFunction() {
        return {
            name: 'googleSearch',
            description: 'Use this function to search google for links related to the query',
            parameters: {
                type: 'object',
                properties: {
                    query: {
                        description: `The query to search for`,
                        type: 'string'
                    },
                }
            },
            func: (query: string) => this.query(query)
        }
    }
}

How to use these functions

Just set the function on the prompt

const prompt = new AxGen('inputs -> output', { functions: [ googleSearch ] })

Or in the case of function classes

const prompt = new AxGen('inputs -> output', { functions: [ new GoogleSearch(apiKey) ] })

Restaurant finding agent

Let’s create an agent to help find a restaurant based on the diner’s preferences. To do this, we’ll start by creating some dummy APIs specifically for this example. We’ll need a function to get the weather, and another one to look up places to eat at.

const choice = Math.round(Math.random());

const goodDay = {
  temperature: '27C',
  description: 'Clear Sky',
  wind_speed: 5.1,
  humidity: 56
};

const badDay = {
  temperature: '10C',
  description: 'Cloudy',
  wind_speed: 10.6,
  humidity: 70
};

// dummy weather lookup function
const weatherAPI = ({ location }: Readonly<{ location: string }>) => {
  const data = [
    {
      city: 'san francisco',
      weather: choice === 1 ? goodDay : badDay
    },
    {
      city: 'tokyo',
      weather: choice === 1 ? goodDay : badDay
    }
  ];

  return data
    .filter((v) => v.city === location.toLowerCase())
    .map((v) => v.weather);
};
// dummy opentable api
const opentableAPI = ({
  location
}: Readonly<{
  location: string;
  outdoor: string;
  cuisine: string;
  priceRange: string;
}>) => {
  const data = [
    {
      name: "Gordon Ramsay's",
      city: 'san francisco',
      cuisine: 'indian',
      rating: 4.8,
      price_range: '$$$$$$',
      outdoor_seating: true
    },
    {
      name: 'Sukiyabashi Jiro',
      city: 'san francisco',
      cuisine: 'sushi',
      rating: 4.7,
      price_range: '$$',
      outdoor_seating: true
    },
    {
      name: 'Oyster Bar',
      city: 'san francisco',
      cuisine: 'seafood',
      rating: 4.5,
      price_range: '$$',
      outdoor_seating: true
    },
    {
      name: 'Quay',
      city: 'tokyo',
      cuisine: 'sushi',
      rating: 4.6,
      price_range: '$$$$',
      outdoor_seating: true
    },
    {
      name: 'White Rabbit',
      city: 'tokyo',
      cuisine: 'indian',
      rating: 4.7,
      price_range: '$$$',
      outdoor_seating: true
    }
  ];

  return data
    .filter((v) => v.city === location?.toLowerCase())
    .sort((a, b) => {
      return a.price_range.length - b.price_range.length;
    });
};

The function parameters must be defined in JSON schema for the AI to read and understand.

// List of functions available to the AI
const functions: AxFunction[] = [
  {
    name: 'getCurrentWeather',
    description: 'get the current weather for a location',
    func: weatherAPI,
    parameters: {
      type: 'object',
      properties: {
        location: {
          description: 'location to get weather for',
          type: 'string'
        },
        units: {
          type: 'string',
          enum: ['imperial', 'metric'],
          description: 'units to use'
        }
      },
      required: ['location']
    }
  },
  {
    name: 'findRestaurants',
    description: 'find restaurants in a location',
    func: opentableAPI,
    parameters: {
      type: 'object',
      properties: {
        location: {
          description: 'location to find restaurants in',
          type: 'string'
        },
        outdoor: {
          type: 'boolean',
          description: 'outdoor seating'
        },
        cuisine: { type: 'string', description: 'cuisine type' },
        priceRange: {
          type: 'string',
          enum: ['$', '$$', '$$$', '$$$$'],
          description: 'price range'
        }
      },
      required: ['location', 'outdoor', 'cuisine', 'priceRange']
    }
  }
];

Let’s use this agent.

const customerQuery =
  "Give me an ideas for lunch today in San Francisco. I like sushi but I don't want to spend too much or other options are fine as well. Also if its a nice day I'd rather sit outside.";

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
});

const agent = new AxAgent({
  name: 'Restaurant search agent',
  description:
    'Search for restaurants to dine at based on the weather and food preferences',
  signature:
    `customerQuery:string -> restaurant:string, priceRange:string "use $ signs to indicate price range"`,
  functions
});

const res = await agent.forward(ai, { customerQuery });
console.log(res);
npm run tsx src/examples/food-search.ts

{
  restaurant: 'Sukiyabashi Jiro',
  priceRange: '$$'
}

> Advanced Prompt Tuning

Learn how to tune your prompts for better performance using Ax's optimization tools

Advanced Prompt Tuning

Prompt tuning is the process of automatically improving your prompts to get better, more consistent results from language models. Ax provides multiple optimization methods to enhance your prompt performance, reduce token usage, and enable smaller models to produce higher-quality results.

This guide will cover:

  • Why prompt tuning matters
  • Basic tuning with AxBootstrapFewShot
  • Advanced tuning with AxMiPRO (Model Instruction Program Optimization v2)
  • How to apply tuned prompts to your applications
  • Best practices for effective tuning

Why Tune Your Prompts?

Prompt tuning offers several key benefits:

  • Improved accuracy: Find optimal instructions and examples that help models understand your specific task
  • Reduced costs: Optimize prompts to use fewer tokens or run effectively on smaller, less expensive models
  • Consistency: Reduce variability in outputs by providing high-quality demonstrations
  • Domain adaptation: Tailor general-purpose models to your specific domain with minimal effort

Basic Tuning with AxBootstrapFewShot

The AxBootstrapFewShot optimizer is a straightforward way to improve your prompts through few-shot learning. It generates high-quality examples from your dataset that help the model better understand your task.

How It Works

  1. The optimizer takes your program and examples as input
  2. It uses a larger model to generate demonstrations for a subset of examples
  3. These demonstrations are evaluated using your metric function
  4. The best demonstrations are selected and combined to create an optimized prompt

Example: Optimizing a Question-Answering Prompt

import {
  AxAI,
  AxChainOfThought,
  AxBootstrapFewShot,
  AxEvalUtil,
  AxHFDataLoader,
  type AxMetricFn
} from '@ax-llm/ax'

// 1. Load your dataset (using HuggingFace data loader)
const hf = new AxHFDataLoader({
  dataset: 'hotpot_qa',
  split: 'train'
})

const examples = await hf.getData<{ question: string; answer: string }>({
  count: 100,
  fields: ['question', 'answer']
})

// 2. Create your AI service
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
})

// 3. Setup the program you want to tune
const program = new AxChainOfThought<{ question: string }, { answer: string }>(
  `question -> answer "in short 2 or 3 words"`
)

// 4. Configure the optimizer
const optimizer = new AxBootstrapFewShot<
  { question: string },
  { answer: string }
>({
  ai,
  program,
  examples
})

// 5. Define your evaluation metric
const metricFn: AxMetricFn = ({ prediction, example }) =>
  AxEvalUtil.emScore(prediction.answer as string, example.answer as string)

// 6. Run the optimizer and save the results
const result = await optimizer.compile(metricFn)
const values = JSON.stringify(result, null, 2)
await fs.promises.writeFile('./tuned-demos.json', values)

Using Your Tuned Prompt

After tuning, you can load and use your optimized prompt:

import fs from 'fs'
import { AxAI, AxChainOfThought } from '@ax-llm/ax'

// Load the AI service
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY as string
})

// Create your program
const program = new AxChainOfThought<{ question: string }, { answer: string }>(
  `question -> answer "in short 2 or 3 words"`
)

// Load the tuned demonstrations
const values = await fs.promises.readFile('./tuned-demos.json', 'utf8')
const demos = JSON.parse(values)
program.setDemos(demos)

// Use the optimized program
const result = await program.forward(ai, {
  question: 'What castle did David Gregory inherit?'
})
console.log(result) // Optimized answer

Advanced Tuning with AxMiPRO v2

While AxBootstrapFewShot is powerful, AxMiPRO (Model Instruction Program Optimization v2) takes prompt tuning to the next level. This Bayesian optimization framework systematically explores different combinations of instructions, demonstrations, and examples to find the optimal configuration.

Key Features of MiPRO v2

  • Instruction optimization: Automatically generates and tests multiple instruction candidates
  • Few-shot example selection: Finds optimal demonstrations from your dataset
  • Smart Bayesian optimization: Efficiently explores the configuration space
  • Early stopping: Halts optimization when improvements plateau to save compute
  • Program and data-aware: Considers program structure and dataset characteristics

Example: Sentiment Analysis Optimization

import fs from 'node:fs'
import {
  AxAI,
  AxChainOfThought,
  AxMiPRO,
  type AxMetricFn
} from '@ax-llm/ax'

// 1. Create your training data
const trainingData = [
  { productReview: 'This product is amazing!', label: 'positive' },
  { productReview: 'Completely disappointed by the quality.', label: 'negative' },
  { productReview: 'Best purchase ever.', label: 'positive' },
  { productReview: 'I really hate how this turned out.', label: 'negative' },
  // Additional examples...
]

const validationData = [
  { productReview: 'Very happy with the results.', label: 'positive' },
  { productReview: 'Terrible experience, not recommended.', label: 'negative' },
  // Additional validation examples...
]

// 2. Setup AI service
const ai = new AxAI({
  name: 'google-gemini',
  apiKey: process.env.GOOGLE_APIKEY
})

// 3. Create sentiment analysis program
const classifyProgram = new AxChainOfThought<
  { productReview: string },
  { label: string }
>(`productReview -> label:string "positive" or "negative"`)

// 4. Configure MiPRO optimizer
const optimizer = new AxMiPRO({
  ai,
  program: classifyProgram,
  examples: trainingData,
  options: {
    numCandidates: 3,       // Number of instruction candidates
    numTrials: 10,          // Number of optimization trials
    maxBootstrappedDemos: 2, // Maximum demos to bootstrap
    maxLabeledDemos: 3,     // Maximum labeled examples
    earlyStoppingTrials: 3, // Stop after 3 trials with no improvement
    programAwareProposer: true,
    dataAwareProposer: true,
    verbose: true
  }
})

// 5. Define evaluation metric
const metricFn: AxMetricFn = ({ prediction, example }) => {
  return prediction.label === example.label
}

// 6. Run the optimization
const optimizedProgram = await optimizer.compile(metricFn, {
  valset: validationData,
  auto: 'medium'  // Balanced optimization level
})

// 7. Save the optimized configuration
const programConfig = JSON.stringify(optimizedProgram, null, 2)
await fs.promises.writeFile('./mipro-optimized-config.json', programConfig)

MiPRO Configuration Options

MiPRO v2 offers extensive configuration options to tailor the optimization process:

Option | Description | Default
numCandidates | Number of instruction candidates to generate | 5
numTrials | Number of optimization trials | 30
maxBootstrappedDemos | Maximum number of bootstrapped demonstrations | 3
maxLabeledDemos | Maximum number of labeled examples | 4
minibatch | Use minibatching for faster evaluation | true
minibatchSize | Size of evaluation minibatches | 25
earlyStoppingTrials | Stop if no improvement after N trials | 5
minImprovementThreshold | Minimum score improvement threshold | 0.01
programAwareProposer | Use program structure for better proposals | true
dataAwareProposer | Consider dataset characteristics | true
verbose | Show detailed optimization progress | false

Optimization Levels

For convenience, MiPRO offers pre-configured optimization intensities using the auto parameter:

// 1. Light optimization (faster, less thorough)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'light' })

// 2. Medium optimization (balanced)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'medium' })

// 3. Heavy optimization (slower, more thorough)
const optimizedProgram = await optimizer.compile(metricFn, { auto: 'heavy' })

How MiPRO v2 Works

MiPRO v2 optimizes your prompts through a systematic process:

  1. Instruction Generation: Creates multiple candidate instructions based on program structure and dataset characteristics
  2. Few-Shot Bootstrapping: Generates high-quality example demonstrations from your data
  3. Example Selection: Strategically selects labeled examples from your dataset
  4. Bayesian Optimization: Systematically explores different combinations of instructions and examples
  5. Configuration Application: Applies the best-performing configuration to your program

This process finds the optimal balance of instructions and examples to maximize your model’s effectiveness for your specific task.

Best Practices for Prompt Tuning

1. Prepare Quality Training Data

  • Diversity: Include examples covering different aspects of your task
  • Balance: Ensure balanced representation of different classes or categories
  • Size: Aim for 20-100 examples for basic tuning, and more for complex tasks
  • Quality: Manually review examples to ensure they’re correct and representative

2. Choose the Right Evaluation Metric

Select a metric that truly measures success for your task:

  • Classification: Accuracy, F1 score, or precision/recall
  • Generation: BLEU, ROUGE, or semantic similarity scores
  • Question Answering: Exact match (EM) or F1 scores
  • Custom Metrics: Design task-specific metrics when standard ones don’t apply (a sketch follows this list)
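
For instance, a hedged sketch of a custom token-overlap F1 metric (it assumes your prediction and example expose an answer field, and that the metric may return a numeric score rather than a boolean):

import type { AxMetricFn } from '@ax-llm/ax'

const f1Metric: AxMetricFn = ({ prediction, example }) => {
  const tokens = (s: string) => s.toLowerCase().split(/\s+/).filter(Boolean)
  const pred = tokens(prediction.answer as string)
  const ref = tokens(example.answer as string)
  const overlap = pred.filter((t) => ref.includes(t)).length
  if (overlap === 0) return 0
  const precision = overlap / pred.length
  const recall = overlap / ref.length
  return (2 * precision * recall) / (precision + recall)
}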

3. Balance Compute and Quality

  • For quick improvements, use AxBootstrapFewShot with fewer examples
  • For production-critical applications, use AxMiPRO with the “heavy” optimization level
  • Consider running optimization overnight for complex tasks
  • Save and version your optimized configurations for reuse

4. Test on Diverse Validation Sets

  • Always test your tuned programs on held-out validation data
  • Ensure validation examples are representative of real-world use cases
  • Compare optimized vs. unoptimized performance to measure improvement

Conclusion

Prompt tuning is a powerful technique to improve the performance of your language model applications. Ax provides both simple and advanced optimization tools that can significantly enhance your results while potentially reducing costs.

Start with AxBootstrapFewShot for quick improvements, then explore AxMiPRO for more comprehensive optimization. By following the best practices outlined in this guide, you’ll be able to create prompts that maximize the effectiveness of your language models for your specific tasks.