The Complete Guide to DSPy Signatures in Ax

Introduction: Why Signatures Beat Prompts

Traditional prompt engineering is like writing assembly code – tedious, fragile, and requires constant tweaking. DSPy signatures are like high-level programming – you describe what you want, not how to get it.

The Problem with Prompts

// ❌ Traditional approach - fragile and verbose
const prompt = `You are a sentiment analyzer. Given a customer review, 
analyze the sentiment and return exactly one of: positive, negative, or neutral.
Be sure to only return the sentiment word, nothing else.

Review: ${review}
Sentiment:`;

// Hope the LLM follows instructions...

The Power of Signatures

// ✅ Signature approach - clear and type-safe
const analyzer = ax('review:string -> sentiment:class "positive, negative, neutral"');

// Guaranteed structured output with TypeScript types!
const result = await analyzer.forward(llm, { review });
console.log(result.sentiment); // TypeScript knows this is "positive" | "negative" | "neutral"

Understanding Signature Syntax

A signature defines the contract between your code and the LLM:

[description] input1:type, input2:type -> output1:type, output2:type

Basic Structure

Optional Description: Overall purpose in quotes
Input Fields: What you provide to the LLM
Arrow (->): Separates inputs from outputs
Output Fields: What the LLM returns

Examples

// Simple Q&A
'userQuestion:string -> aiAnswer:string'

// With description
'"Answer questions about TypeScript" question:string -> answer:string, confidence:number'

// Multiple inputs and outputs
'document:string, query:string -> summary:string, relevantQuotes:string[]'

Field Types Reference

Ax supports a rich type system that maps directly to TypeScript types:

Basic Types

Type	Signature Syntax	TypeScript Type	Example
String	`:string`	`string`	`userName:string`
Number	`:number`	`number`	`score:number`
Boolean	`:boolean`	`boolean`	`isValid:boolean`
JSON	`:json`	`any`	`metadata:json`

Date and Time Types

Type	Signature Syntax	TypeScript Type	Example
Date	`:date`	`Date`	`birthDate:date`
DateTime	`:datetime`	`Date`	`timestamp:datetime`

Media Types (Input Only)

Type	Signature Syntax	TypeScript Type	Example
Image	`:image`	`{mimeType: string, data: string}`	`photo:image`
Audio	`:audio`	`{format?: 'wav', data: string}`	`recording:audio`
File	`:file`	`{mimeType: string, data: string}`	`document:file`
URL	`:url`	`string`	`website:url`

Special Types

Type	Signature Syntax	TypeScript Type	Example
Code	`:code`	`string`	`pythonScript:code`
Classification	`:class "opt1, opt2"`	`"opt1" \| "opt2"`	`mood:class "happy, sad, neutral"`

Arrays and Optional Fields

Arrays

Add [] after any type to make it an array:

// String array
'tags:string[] -> processedTags:string[]'

// Number array
'scores:number[] -> average:number, median:number'

// Classification array
'documents:string[] -> categories:class[] "news, blog, tutorial"'

Optional Fields

Add ? before the colon to make a field optional:

// Optional input
'query:string, context?:string -> response:string'

// Optional output
'text:string -> summary:string, keywords?:string[]'

// Both optional
'message?:string -> reply?:string, confidence:number'

Advanced Features

Internal Fields (Output Only)

Use ! to mark output fields as internal (for reasoning/chain-of-thought):

// Internal fields are hidden from the final output but guide LLM reasoning
'problem:string -> reasoning!:string, solution:string'

Classification Fields (Output Only)

Classifications provide type-safe enums:

// Single classification
'email:string -> priority:class "urgent, normal, low"'

// Multiple options with pipe separator
'text:string -> sentiment:class "positive | negative | neutral"'

// Array of classifications
'reviews:string[] -> sentiments:class[] "positive, negative, neutral"'

Creating Signatures: Three Approaches

1. String-Based (Recommended)

import { ax, s } from '@ax-llm/ax';

// Direct generator creation
const generator = ax('input:string -> output:string');

// Create signature first, then generator
const sig = s('query:string -> response:string');
const gen = ax(sig.toString());

2. Pure Fluent Builder API

import { f } from '@ax-llm/ax';

// Using the pure fluent builder - only supports .optional(), .array(), .internal()
const signature = f()
  .input('userMessage', f.string('User input'))
  .input('contextData', f.string('Additional context').optional())
  .input('tags', f.string('Keywords').array())
  .input('categories', f.string('Categories').optional().array())
  .output('responseText', f.string('AI response'))
  .output('confidenceScore', f.number('Confidence score 0-1'))
  .output('debugInfo', f.string('Debug information').internal())
  .build();

3. Hybrid Approach

import { s, f } from '@ax-llm/ax';

// Start with string, add fields programmatically
const sig = s('base:string -> result:string')
  .appendInputField('extra', f.optional(f.json('Metadata')))
  .appendOutputField('score', f.number('Quality score'));

Pure Fluent API Reference

The fluent API has been redesigned to be purely fluent, meaning you can only use method chaining with .optional(), .array(), and .internal() methods. Nested function calls are no longer supported.

✅ Pure Fluent Syntax (Current)

import { f } from '@ax-llm/ax';

// Basic field types
const stringField = f.string('description');
const numberField = f.number('description');
const booleanField = f.boolean('description');

// Array types - use .array() method chaining
const stringArray = f.string('array description').array();
const numberArray = f.number('array description').array();
const booleanArray = f.boolean('array description').array();

// Optional fields - use .optional() method chaining
const optionalString = f.string('optional description').optional();
const optionalArray = f.string('optional array').optional().array();
const arrayOptional = f.string('array optional').array().optional(); // Same as above

// Internal fields (output only) - use .internal() method chaining
const internalField = f.string('internal description').internal();
const internalArray = f.string('internal array').array().internal();

// Complex combinations
const complexField = f.string('complex field')
  .optional()  // Make it optional
  .array()     // Make it an array
  .internal(); // Mark as internal (output only)

// Object descriptions
const objectField = f.object({
  field: f.string()
}, 'Description of the object structure');

// Array of objects with distinct descriptions
const objectArray = f.object({
  field: f.string()
}, 'Description of the individual item')
.array('Description of the list itself');

❌ Deprecated Nested Syntax (Removed)

// These no longer work and will cause compilation errors
const badArray = f.array(f.string('description'));      // ❌ Removed
const badOptional = f.optional(f.string('description')); // ❌ Removed  
const badInternal = f.internal(f.string('description')); // ❌ Removed

String vs Fluent API Equivalence

Both approaches create identical signatures:

// String syntax
const stringSig = AxSignature.create(`
  userMessages:string[] "User messages",
  maxTokens?:number "Max tokens",
  enableDebug:boolean "Debug mode",
  categories?:string[] "Optional categories"
  ->
  responseText:string "Response",
  debugInfo!:string "Debug info"
`);

// Equivalent fluent syntax  
const fluentSig = f()
  .input('userMessages', f.string('User messages').array())
  .input('maxTokens', f.number('Max tokens').optional())
  .input('enableDebug', f.boolean('Debug mode'))
  .input('categories', f.string('Optional categories').optional().array())
  .output('responseText', f.string('Response'))
  .output('debugInfo', f.string('Debug info').internal())
  .build();

// Both create identical runtime structures and TypeScript types
console.log(stringSig.toString() === fluentSig.toString()); // true

Type Inference and Arrays

The fluent API properly maps to TypeScript array types:

// These all correctly infer TypeScript types
const sig = f()
  .input('strings', f.string('strings').array())           // string[]
  .input('numbers', f.number('numbers').array())           // number[]
  .input('booleans', f.boolean('booleans').array())        // boolean[]
  .input('optionalStrings', f.string('optional').optional().array()) // string[] | undefined
  .output('responseText', f.string('response'))            // string
  .build();

// TypeScript knows the exact types at compile time
type InputType = {
  strings: string[];
  numbers: number[];  
  booleans: boolean[];
  optionalStrings?: string[];
  responseText: string;
};

Field Naming Best Practices

Ax enforces descriptive field names to improve LLM understanding:

✅ Good Field Names

userQuestion, customerEmail, documentText
analysisResult, summaryContent, responseMessage
confidenceScore, categoryType, priorityLevel

❌ Bad Field Names (Will Error)

text, data, input, output (too generic)
a, x, val (too short)
1field, 123name (starts with number)

Validation and Constraints (New!)

Ax now supports Zod-like validation constraints for ensuring data quality and format compliance. These constraints work automatically across input validation, output validation, and streaming scenarios.

Available Constraints

String Constraints

import { f } from '@ax-llm/ax';

// String length constraints
f.string('username').min(3).max(20)
f.string('password').min(8).max(128)

// Format validation (two equivalent syntaxes available)
f.string('email').email()               // Email format (or f.email())
f.string('website').url()               // URL format (or f.url())
f.string('birthDate').date()            // Date format (or f.date())
f.string('timestamp').datetime()        // DateTime format (or f.datetime())
f.string('pattern').regex('^[A-Z0-9]')  // Custom regex

// Alternative syntax using dedicated types
f.email('email')                        // Equivalent to f.string().email()
f.url('website')                        // Equivalent to f.string().url()
f.date('birthDate')                     // Equivalent to f.string().date()
f.datetime('timestamp')                 // Equivalent to f.string().datetime()

// Combinations
f.string('bio').max(500).optional()
f.string('contact').email().optional()
f.url('homepage').optional()            // Also works with dedicated types

Note: For email, url, date, and datetime, you can use either:

Validator syntax: f.string().email() - chain validators on strings
Dedicated type syntax: f.email() - use dedicated type directly

Both syntaxes work consistently in all contexts (input fields, output fields, nested objects)!

Number Constraints

// Numeric range constraints
f.number('age').min(18).max(120)
f.number('score').min(0).max(100)
f.number('price').min(0)        // No maximum
f.number('rating').max(5)       // No minimum

Complete Example

import { ax, f } from '@ax-llm/ax';

const userRegistration = f()
  .input('formData', f.string('Raw form data'))
  .output('user', f.object({
    username: f.string('Username').min(3).max(20),
    email: f.string('Email address').email(),
    age: f.number('User age').min(18).max(120),
    password: f.string('Password').min(8).regex('^(?=.*[A-Za-z])(?=.*\\d)'),
    bio: f.string('Biography').max(500).optional(),
    website: f.string('Personal website').url().optional(),
    tags: f.string('Interest tag').min(2).max(30).array()
  }, 'User profile information'))
  .build();

const generator = ax(userRegistration);
const result = await generator.forward(llm, {
  formData: 'Name: John, Email: john@example.com, Age: 25...'
});

// All fields are automatically validated:
// - username: 3-20 characters
// - email: valid email format
// - age: between 18-120
// - password: min 8 chars with letter and number
// - website: valid URL if provided
// - tags: each tag 2-30 characters

Validation Behavior

Input Validation (Before LLM):

Runs automatically before forward() calls
Validates user-provided data against constraints
Throws ValidationError if constraints are violated
Includes nested object and array validation

Output Validation (After LLM):

Runs automatically during response extraction
Validates LLM-generated data against constraints
Triggers automatic retry with correction instructions on validation errors
Works seamlessly with streaming

Streaming Validation:

Validates each field as it completes during streaming
Provides incremental validation feedback
Maintains type safety throughout the stream

TypeScript Integration

Validation constraints are fully integrated with TypeScript’s type system:

const sig = f()
  .output('contact', f.object({
    email: f.string().email(),
    age: f.number().min(0).max(150)
  }))
  .build();

const gen = ax(sig);
const result = await gen.forward(llm, input);

// TypeScript knows exact types with runtime guarantees
result.contact.email  // string (guaranteed valid email format)
result.contact.age    // number (guaranteed 0-150 range)

Media Type Restrictions

Media types (f.image(), f.audio(), f.file()) have special restrictions:

// ✅ Allowed: Top-level input fields
const validSig = f()
  .input('photo', f.image('Profile picture'))
  .input('recording', f.audio('Voice message'))
  .output('result', f.string('Analysis result'))
  .build();

// ❌ Not allowed: Media types in nested objects
const invalidSig = f()
  .output('user', f.object({
    photo: f.image('Profile')  // TypeScript error + runtime error
  }))
  .build();

// ❌ Not allowed: Media types as outputs
const invalidSig2 = f()
  .output('photo', f.image('Generated image'))  // Runtime validation error
  .build();

Media types are restricted to top-level input fields only for security and practical reasons. This is enforced at both:

Compile time: TypeScript will show errors for invalid usage
Runtime: Validation throws descriptive errors if restrictions are violated

Advanced Validation Patterns

// Complex nested validation
const productExtractor = f()
  .input('productPage', f.string('HTML content'))
  .output('product', f.object({
    name: f.string('Product name').min(1).max(200),
    price: f.number('Price in USD').min(0),
    specs: f.object({
      dimensions: f.object({
        width: f.number('Width in cm').min(0),
        height: f.number('Height in cm').min(0)
      }),
      materials: f.string('Material').min(1).array()
    }),
    reviews: f.object({
      rating: f.number('Rating').min(1).max(5),
      comment: f.string('Review text').min(10).max(1000)
    }).array()
  }))
  .build();

// Validation runs recursively through all nested levels

Error Handling

try {
  const result = await generator.forward(llm, { formData: 'invalid' });
} catch (error) {
  if (error instanceof ValidationError) {
    // Validation failed - error contains detailed constraint violation info
    console.error('Validation failed:', error.message);
    // Example: "Field 'email' failed validation: Invalid email format"
  }
}

Validation errors automatically trigger retries with correction instructions, making the LLM aware of the constraint violations and improving output quality.

Real-World Examples

Email Classifier

const emailClassifier = ax(`
  emailSubject:string "Email subject line",
  emailBody:string "Full email content" ->
  category:class "sales, support, spam, newsletter" "Email category",
  priority:class "urgent, normal, low" "Priority level",
  summary:string "Brief summary of the email"
`);

const result = await emailClassifier.forward(llm, {
  emailSubject: "Urgent: Server Down",
  emailBody: "Our production server is experiencing issues..."
});

console.log(result.category);  // "support"
console.log(result.priority);  // "urgent"

Document Analyzer with Chain-of-Thought

const analyzer = ax(`
  documentText:string "Document to analyze" ->
  reasoning!:string "Step-by-step analysis",
  mainTopics:string[] "Key topics discussed",
  sentiment:class "positive, negative, neutral, mixed" "Overall tone",
  readability:class "elementary, high-school, college, graduate" "Reading level",
  keyInsights:string[] "Important takeaways"
`);

// The reasoning field guides the LLM but isn't returned
const result = await analyzer.forward(llm, { 
  documentText: "..." 
});
// result.reasoning is undefined (internal field)
// result.mainTopics, sentiment, etc. are available

const imageAnalyzer = ax(`
  imageData:image "Image to analyze",
  question?:string "Specific question about the image" ->
  description:string "What's in the image",
  objects:string[] "Identified objects",
  textFound?:string "Any text detected in the image",
  answerToQuestion?:string "Answer if question was provided"
`);

Data Extraction

const extractor = ax(`
  invoiceText:string "Raw invoice text" ->
  invoiceNumber:string "Invoice ID",
  invoiceDate:date "Date of invoice",
  dueDate:date "Payment due date",
  totalAmount:number "Total amount due",
  lineItems:json[] "Array of {description, quantity, price}",
  vendor:json "{ name, address, taxId }"
`);

Pure Fluent API Example

import { f, ax } from '@ax-llm/ax';

// Complex signature using pure fluent API
const contentAnalyzer = f()
  .input('articleText', f.string('Article content to analyze'))
  .input('authorInfo', f.json('Author metadata').optional())
  .input('keywords', f.string('Target keywords').array())
  .input('checkFactuality', f.boolean('Enable fact-checking'))
  .output('mainThemes', f.string('Key themes').array())
  .output('sentimentScore', f.number('Sentiment score -1 to 1'))
  .output('readabilityLevel', f.class(['elementary', 'middle', 'high', 'college'], 'Reading level'))
  .output('factChecks', f.json('Fact checking results').array().optional())
  .output('processingTime', f.number('Analysis time in ms').internal())
  .description('Comprehensive article analysis with optional fact-checking')
  .build();

// Create generator from fluent signature
const generator = ax(contentAnalyzer.toString());

// Usage with typed inputs/outputs
const result = await generator.forward(llm, {
  articleText: 'Sample article content...',
  keywords: ['AI', 'machine learning', 'technology'],
  checkFactuality: true,
  // authorInfo is optional
});

// TypeScript knows exact types
console.log(result.mainThemes);        // string[]
console.log(result.sentimentScore);    // number
console.log(result.readabilityLevel);  // 'elementary' | 'middle' | 'high' | 'college'
console.log(result.factChecks);        // json[] | undefined (optional)
// result.processingTime is undefined (internal field)

Streaming Support

All signatures support streaming by default:

const storyteller = ax(`
  prompt:string "Story prompt",
  genre:class "fantasy, sci-fi, mystery, romance" ->
  title:string "Story title",
  story:string "The complete story",
  wordCount:number "Approximate word count"
`);

// Stream the response
for await (const chunk of storyteller.stream(llm, { 
  prompt: "A detective discovers their partner is a time traveler",
  genre: "mystery"
})) {
  if (chunk.story) {
    process.stdout.write(chunk.story); // Real-time streaming
  }
}

Type Safety and IntelliSense

Signatures provide full TypeScript type inference:

const typed = ax(`
  userId:number,
  includeDetails?:boolean ->
  userName:string,
  userEmail:string,
  metadata?:json
`);

// TypeScript knows the exact types
const result = await typed.forward(llm, {
  userId: 123,          // ✅ number required
  includeDetails: true  // ✅ boolean optional
  // userEmail: "..."   // ❌ TypeScript error: not an input field
});

console.log(result.userName);    // ✅ TypeScript knows this is string
console.log(result.metadata?.x); // ✅ TypeScript knows this is any | undefined
// console.log(result.userId);   // ❌ TypeScript error: not an output field

Common Patterns

1. Chain of Thought Reasoning

// Use internal fields for reasoning steps
const reasoner = ax(`
  problem:string ->
  thoughts!:string "Internal reasoning process",
  answer:string "Final answer"
`);

2. Structured Data Extraction

// Extract structured data from unstructured text
const parser = ax(`
  messyData:string ->
  structured:json "Clean JSON representation"
`);

3. Multi-Step Classification

// Hierarchical classification
const classifier = ax(`
  text:string ->
  mainCategory:class "technical, business, creative",
  subCategory:class "based on main category",
  confidence:number "0-1 confidence score"
`);

4. Validation and Checking

// Validate and explain
const validator = ax(`
  code:code "Code to review",
  language:string "Programming language" ->
  isValid:boolean "Is the code syntactically correct",
  errors?:string[] "List of errors if any",
  suggestions?:string[] "Improvement suggestions"
`);

Error Handling

Signatures provide clear, actionable error messages:

// ❌ This will throw a descriptive error
try {
  const bad = ax('text:string -> result:string');
} catch (error) {
  // Error: Field name "text" is too generic. 
  // Use a more descriptive name like "inputText" or "documentText"
}

// ❌ Invalid type
try {
  const bad = ax('userInput:str -> result:string');
} catch (error) {
  // Error: Unknown type "str". Did you mean "string"?
}

// ❌ Duplicate field names
try {
  const bad = ax('data:string, data:number -> result:string');
} catch (error) {
  // Error: Duplicate field name "data" in inputs
}

Migration from Traditional Prompts

Before (Prompt Engineering)

const prompt = `
Analyze the sentiment of the following review.
Rate it on a scale of 1-5.
Identify the main topics discussed.
Format your response as JSON with keys: rating, sentiment, topics

Review: ${review}
`;

const response = await llm.generate(prompt);
const parsed = JSON.parse(response); // Hope it's valid JSON...

After (Signatures)

const analyzer = ax(`
  review:string ->
  rating:number "1-5 rating",
  sentiment:class "very positive, positive, neutral, negative, very negative",
  topics:string[] "Main topics discussed"
`);

const result = await analyzer.forward(llm, { review });
// Guaranteed structure, no parsing needed!

Performance Tips

Use specific types: class is more token-efficient than string for enums
Leverage arrays: Process multiple items in one call
Optional fields: Only request what you need
Internal fields: Use ! for reasoning without returning it

Conclusion

DSPy signatures in Ax transform LLM interactions from fragile prompt engineering to robust, type-safe programming. By describing what you want instead of how to get it, you can:

Write more maintainable code
Get guaranteed structured outputs
Leverage TypeScript’s type system
Switch LLM providers without changing logic
Build production-ready AI features faster

Start using signatures today and experience the difference!

The Complete Guide to DSPy Signatures in Ax

Introduction: Why Signatures Beat Prompts

The Problem with Prompts

The Power of Signatures

Understanding Signature Syntax

Basic Structure

Examples

Field Types Reference

Basic Types

Date and Time Types

Media Types (Input Only)

Special Types

Arrays and Optional Fields

Arrays

Optional Fields

Advanced Features

Internal Fields (Output Only)

Classification Fields (Output Only)

Creating Signatures: Three Approaches

1. String-Based (Recommended)

2. Pure Fluent Builder API

3. Hybrid Approach

Pure Fluent API Reference

✅ Pure Fluent Syntax (Current)

❌ Deprecated Nested Syntax (Removed)

String vs Fluent API Equivalence

Type Inference and Arrays

Field Naming Best Practices

✅ Good Field Names

❌ Bad Field Names (Will Error)

Validation and Constraints (New!)

Available Constraints

String Constraints

Number Constraints

Complete Example

Validation Behavior

TypeScript Integration

Media Type Restrictions

Advanced Validation Patterns

Error Handling

Real-World Examples

Email Classifier

Document Analyzer with Chain-of-Thought

Multi-Modal Analysis

Data Extraction

Pure Fluent API Example

Streaming Support

Type Safety and IntelliSense

Common Patterns

1. Chain of Thought Reasoning

2. Structured Data Extraction

3. Multi-Step Classification

4. Validation and Checking

Error Handling

Migration from Traditional Prompts

Before (Prompt Engineering)

After (Signatures)

Performance Tips

Conclusion