Telemetry Guide

🎯 Goal: Learn how to monitor, trace, and observe your AI applications with industry-standard OpenTelemetry integration. ⏱️ Time to first results: 5 minutes
πŸ” Value: Understand performance, debug issues, and optimize costs with comprehensive observability

What is Telemetry in Ax?

Think of telemetry as X-ray vision for your AI applications. Instead of guessing what's happening, you get traces of every model call, metrics on latency, token usage, and cost, and the context you need to debug issues quickly.

Real example: A production AI system that went from "it's slow sometimes" to "we can see exactly which model calls are taking 3+ seconds and why."

🗺️ Learning Path

Beginner      β†’ Intermediate    β†’ Advanced       β†’ Production
     ↓              ↓               ↓                ↓
Quick Start  β†’ Metrics Setup   β†’ Custom Spans    β†’ Enterprise
(5 min)       (15 min)          (30 min)          (1+ hour)

🚀 5-Minute Quick Start

Step 1: Basic Setup with Console Export

import { AxAI, ax, f } from '@ax-llm/ax'
import { trace, metrics } from '@opentelemetry/api'
import {
  BasicTracerProvider,
  ConsoleSpanExporter,
  SimpleSpanProcessor,
} from '@opentelemetry/sdk-trace-base'
import {
  MeterProvider,
  ConsoleMetricExporter,
  PeriodicExportingMetricReader,
} from '@opentelemetry/sdk-metrics'

// Set up basic tracing
const tracerProvider = new BasicTracerProvider({
  spanProcessors: [new SimpleSpanProcessor(new ConsoleSpanExporter())],
})
trace.setGlobalTracerProvider(tracerProvider)

// Set up basic metrics
const meterProvider = new MeterProvider({
  readers: [
    new PeriodicExportingMetricReader({
      exporter: new ConsoleMetricExporter(),
      exportIntervalMillis: 5000,
    }),
  ],
})
metrics.setGlobalMeterProvider(meterProvider)

// Get your tracer and meter
const tracer = trace.getTracer('my-ai-app')
const meter = metrics.getMeter('my-ai-app')

Step 2: Create AI with Telemetry

// Create AI instance with telemetry enabled
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4o-mini' },
  options: {
    tracer,
    meter,
    debug: true, // Enable detailed logging
  },
})

// Create a simple generator
const sentimentAnalyzer = ax`
  reviewText:${f.string('Customer review')} -> 
  sentiment:${f.class(['positive', 'negative', 'neutral'], 'Sentiment')}
`

Step 3: Run and Observe

// This will automatically generate traces and metrics
const result = await sentimentAnalyzer.forward(ai, {
  reviewText: 'This product is amazing! I love it!'
})

console.log('Result:', result.sentiment)

🎉 Congratulations! You now have full observability. Check your console for span output from the ConsoleSpanExporter and periodic metric summaries from the ConsoleMetricExporter.


📊 Metrics Overview

Ax automatically tracks comprehensive metrics across all operations. Here’s what you get:

🤖 AI Service Metrics

- Request Metrics
- Token Usage
- Cost & Performance
- Streaming & Functions

🧠 AxGen Metrics

- Generation Flow
- Multi-Step Processing
- Error Correction
- Function Integration

🔧 Optimizer Metrics

- Optimization Flow
- Convergence Tracking
- Resource Usage
- Teacher-Student Interactions
- Checkpointing
- Pareto Optimization
- Program Complexity

📊 Database Metrics

- Vector Operations

📈 Example Metrics Output

{
  "name": "ax_llm_request_duration_ms",
  "description": "Duration of LLM requests in milliseconds",
  "unit": "ms",
  "data": {
    "resourceMetrics": [{
      "scopeMetrics": [{
        "metrics": [{
          "name": "ax_llm_request_duration_ms",
          "histogram": {
            "dataPoints": [{
              "attributes": {
                "operation": "chat",
                "ai_service": "openai",
                "model": "gpt-4o-mini"
              },
              "sum": 2450.5,
              "count": 10,
              "bounds": [1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000],
              "bucketCounts": [0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 2]
            }]
          }
        }]
      }]
    }]
  }
}

πŸ” Tracing Overview

Ax provides comprehensive distributed tracing following OpenTelemetry standards and the new gen_ai semantic conventions.

🎯 Trace Structure

- Root Spans: one per top-level operation, such as a chat request or a generation
- Child Spans: nested spans for individual LLM calls, steps, and function invocations (see the sketch below)
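Because Ax starts its spans through the standard OpenTelemetry API, wrapping a generation call in your own span makes Ax's spans appear as its children in your trace viewer. A minimal sketch, reusing `tracer`, `ai`, and `sentimentAnalyzer` from the quick start:

// Sketch: a custom root span; Ax's spans (LLM calls, steps) nest under it
const analyzeWithTrace = async (reviewText: string) =>
  tracer.startActiveSpan('Review Pipeline', async (span) => {
    try {
      // Runs inside the active span, so Ax parents its spans to it
      return await sentimentAnalyzer.forward(ai, { reviewText })
    } finally {
      span.end()
    }
  })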

📋 Standard Attributes

LLM Attributes (gen_ai.*)

{
  'gen_ai.system': 'openai',
  'gen_ai.operation.name': 'chat',
  'gen_ai.request.model': 'gpt-4o-mini',
  'gen_ai.request.max_tokens': 500,
  'gen_ai.request.temperature': 0.1,
  'gen_ai.request.llm_is_streaming': false,
  'gen_ai.usage.input_tokens': 150,
  'gen_ai.usage.output_tokens': 200,
  'gen_ai.usage.total_tokens': 350
}

Database Attributes (db.*)

{
  'db.system': 'weaviate',
  'db.operation.name': 'query',
  'db.table': 'documents',
  'db.namespace': 'default',
  'db.vector.query.top_k': 10
}

Custom Ax Attributes

{
  'signature': 'JSON representation of signature',
  'examples': 'JSON representation of examples',
  'provided_functions': 'function1,function2',
  'thinking_token_budget': 'low',
  'show_thoughts': true,
  'max_steps': 5,
  'max_retries': 3
}

📊 Standard Events

- Message Events (e.g. gen_ai.user.message and gen_ai.assistant.message, as in the example below)
- Usage Events

📈 Example Trace Output

{
  "traceId": "ddc7405e9848c8c884e53b823e120845",
  "name": "Chat Request",
  "id": "d376daad21da7a3c",
  "kind": "SERVER",
  "timestamp": 1716622997025000,
  "duration": 14190456.542,
  "attributes": {
    "gen_ai.system": "openai",
    "gen_ai.operation.name": "chat",
    "gen_ai.request.model": "gpt-4o-mini",
    "gen_ai.request.max_tokens": 500,
    "gen_ai.request.temperature": 0.1,
    "gen_ai.request.llm_is_streaming": false,
    "gen_ai.usage.input_tokens": 150,
    "gen_ai.usage.output_tokens": 200,
    "gen_ai.usage.total_tokens": 350
  },
  "events": [
    {
      "name": "gen_ai.user.message",
      "timestamp": 1716622997025000,
      "attributes": {
        "content": "What is the capital of France?"
      }
    },
    {
      "name": "gen_ai.assistant.message",
      "timestamp": 1716622997025000,
      "attributes": {
        "content": "The capital of France is Paris."
      }
    }
  ]
}

🎯 Common Observability Patterns

1. Performance Monitoring

// Monitor latency percentiles
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: {
    tracer,
    meter,
    // Custom latency thresholds
    timeout: 30000,
  },
})

// Set up alerts on high latency
// P95 > 5s, P99 > 10s
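If you also want an app-level latency histogram alongside the built-in `ax_llm_request_duration_ms`, here is a minimal sketch (the metric name and labels below are our own, not part of Ax):

// Sketch: custom end-to-end latency histogram around a generation call
const latencyHistogram = meter.createHistogram('app_request_duration_ms', {
  description: 'End-to-end latency of AI requests',
  unit: 'ms',
})

const timedAnalyze = async (reviewText: string) => {
  const start = performance.now()
  try {
    return await sentimentAnalyzer.forward(ai, { reviewText })
  } finally {
    // Recorded even on failure, so slow errors show up too
    latencyHistogram.record(performance.now() - start, { model: 'gpt-4o-mini' })
  }
}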

2. Cost Tracking

// Track costs by model and operation
const costOptimizer = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4o-mini' }, // Cheaper model
  options: { tracer, meter },
})

// Monitor estimated costs
// Alert when daily spend > $100
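To turn token counts into spend you can alert on, record an estimated cost metric yourself. A sketch with assumed per-million-token prices and a hypothetical `usage` shape (neither is part of the Ax API):

// Sketch: estimated-cost counter; pricing values and the `usage` shape
// are illustrative assumptions, not Ax APIs
const costCounter = meter.createCounter('app_estimated_cost_usd', {
  description: 'Estimated LLM spend in USD',
})

const recordCost = (usage: { inputTokens: number; outputTokens: number }) => {
  const estimated =
    (usage.inputTokens / 1_000_000) * 0.15 + // assumed input price per 1M tokens
    (usage.outputTokens / 1_000_000) * 0.6 // assumed output price per 1M tokens
  costCounter.add(estimated, { model: 'gpt-4o-mini' })
}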

3. Error Rate Monitoring

// Track error rates by service
const reliableAI = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: {
    tracer,
    meter,
    // Retry configuration
    maxRetries: 3,
    retryDelay: 1000,
  },
})

// Set up alerts on error rate > 5%
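To compute the error rate itself, count outcomes with a plain OpenTelemetry counter and divide errors by total requests in your backend. A minimal sketch reusing `sentimentAnalyzer` from the quick start:

// Sketch: success/error counters; error rate = errors / total downstream
const requestCounter = meter.createCounter('app_ai_requests_total', {
  description: 'AI requests by outcome',
})

const safeAnalyze = async (reviewText: string) => {
  try {
    const result = await sentimentAnalyzer.forward(reliableAI, { reviewText })
    requestCounter.add(1, { outcome: 'success' })
    return result
  } catch (error) {
    requestCounter.add(1, { outcome: 'error' })
    throw error
  }
}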

4. Function Call Monitoring

// Monitor function call success rates
const functionAI = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: { tracer, meter },
})

// Track function call latency and success rates
// Alert on function call failures

5. Streaming Performance

// Monitor streaming response times
const streamingAI = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { stream: true },
  options: { tracer, meter },
})

// Track time to first token
// Monitor streaming completion rates

πŸ—οΈ Production Setup

1. Jaeger Tracing Setup

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
import { defaultResource, resourceFromAttributes } from '@opentelemetry/resources'

// Start Jaeger locally
// docker run --rm --name jaeger -p 16686:16686 -p 4318:4318 jaegertracing/jaeger:2.6.0

const otlpExporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
})

const provider = new BasicTracerProvider({
  spanProcessors: [new BatchSpanProcessor(otlpExporter)],
  resource: defaultResource().merge(
    resourceFromAttributes({
      'service.name': 'my-ai-app',
      'service.version': '1.0.0',
    })
  ),
})

trace.setGlobalTracerProvider(provider)

2. Prometheus Metrics Setup

import { PrometheusExporter } from '@opentelemetry/exporter-prometheus'

const prometheusExporter = new PrometheusExporter({
  port: 9464,
  endpoint: '/metrics',
})

// PrometheusExporter is a pull-based MetricReader itself, so register it directly
const meterProvider = new MeterProvider({
  readers: [prometheusExporter],
})

metrics.setGlobalMeterProvider(meterProvider)
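Prometheus can now scrape your application at http://localhost:9464/metrics (the port and endpoint configured above); running `curl http://localhost:9464/metrics` is a quick way to verify the endpoint is live.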

3. Cloud Observability Setup

// For AWS X-Ray, Google Cloud Trace, Azure Monitor, etc.
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http'

const cloudExporter = new OTLPTraceExporter({
  url: 'https://your-observability-endpoint.com/v1/traces',
  headers: {
    'Authorization': `Bearer ${process.env.OBSERVABILITY_API_KEY}`,
  },
})

const cloudMetricsExporter = new OTLPMetricExporter({
  url: 'https://your-observability-endpoint.com/v1/metrics',
  headers: {
    'Authorization': `Bearer ${process.env.OBSERVABILITY_API_KEY}`,
  },
})

4. Environment-Specific Configuration

// config/telemetry.ts
export const setupTelemetry = (environment: 'development' | 'production') => {
  if (environment === 'development') {
    // Console export for local development
    const consoleExporter = new ConsoleSpanExporter()
    const provider = new BasicTracerProvider({
      spanProcessors: [new SimpleSpanProcessor(consoleExporter)],
    })
    trace.setGlobalTracerProvider(provider)
  } else {
    // Production setup with sampling and batching
    const otlpExporter = new OTLPTraceExporter({
      url: process.env.OTLP_ENDPOINT!,
    })
    
    const provider = new BasicTracerProvider({
      spanProcessors: [
        new BatchSpanProcessor(otlpExporter, {
          maxQueueSize: 2048,
          maxExportBatchSize: 512,
          scheduledDelayMillis: 5000,
        }),
      ],
    })
    
    trace.setGlobalTracerProvider(provider)
  }
}
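Call it once at process startup, before constructing any `AxAI` instances:

// Wire up telemetry before anything creates spans or metrics
setupTelemetry(process.env.NODE_ENV === 'production' ? 'production' : 'development')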

⚡ Advanced Configuration

1. Custom Metrics

// Create custom business metrics
const customMeter = metrics.getMeter('business-metrics')
const customCounter = customMeter.createCounter('business_operations_total', {
  description: 'Total business operations',
})

// Record custom metrics
customCounter.add(1, {
  operation_type: 'sentiment_analysis',
  customer_tier: 'premium',
})

2. Custom Spans

// Create custom spans for business logic
const tracer = trace.getTracer('business-logic')

const processOrder = async (orderId: string) => {
  return await tracer.startActiveSpan(
    'Process Order',
    {
      attributes: {
        'order.id': orderId,
        'business.operation': 'order_processing',
      },
    },
    async (span) => {
      try {
        // Your business logic here
        const result = await ai.chat({ /* ... */ })
        
        span.setAttributes({
          'order.status': 'completed',
          'order.value': result.total,
        })
        
        return result
      } catch (error) {
        span.recordException(error)
        span.setAttributes({ 'order.status': 'failed' })
        throw error
      } finally {
        span.end()
      }
    }
  )
}

3. Sampling Configuration

// Configure sampling for high-traffic applications
import { ParentBasedSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-base'

const sampler = new ParentBasedSampler({
  root: new TraceIdRatioBasedSampler(0.1), // Sample 10% of traces
})

const provider = new BasicTracerProvider({
  sampler,
  spanProcessors: [new BatchSpanProcessor(otlpExporter)],
})

4. Metrics Configuration

// Configure metrics collection
import { axUpdateMetricsConfig, axUpdateOptimizerMetricsConfig } from '@ax-llm/ax'

// Configure DSPy metrics
axUpdateMetricsConfig({
  enabled: true,
  enabledCategories: ['generation', 'streaming', 'functions', 'errors', 'performance'],
  maxLabelLength: 100,
  samplingRate: 1.0, // Collect all metrics
})

// Configure optimizer metrics
axUpdateOptimizerMetricsConfig({
  enabled: true,
  enabledCategories: [
    'optimization',
    'convergence', 
    'resource_usage',
    'teacher_student',
    'checkpointing',
    'pareto'
  ],
  maxLabelLength: 100,
  samplingRate: 1.0
})

5. Optimizer Metrics Usage

// Optimizer metrics are automatically collected when using optimizers
import { AxBootstrapFewShot } from '@ax-llm/ax'

const optimizer = new AxBootstrapFewShot({
  studentAI: ai,
  examples: trainingExamples,
  validationSet: validationExamples,
  targetScore: 0.9,
  verbose: true,
  options: {
    maxRounds: 5,
  },
})

// Metrics are automatically recorded during optimization
const result = await optimizer.compile(program, metricFn)

// Check optimization metrics
console.log('Optimization duration:', result.stats.resourceUsage.totalTime)
console.log('Total tokens used:', result.stats.resourceUsage.totalTokens)
console.log('Convergence info:', result.stats.convergenceInfo)

6. Global Telemetry Setup

// Set up global telemetry for all Ax operations
import { axGlobals } from '@ax-llm/ax'

// Global tracer
axGlobals.tracer = trace.getTracer('global-ax-tracer')

// Global meter
axGlobals.meter = metrics.getMeter('global-ax-meter')

// Now all Ax operations will use these by default
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  // No need to specify tracer/meter - uses globals
})

πŸ› οΈ Troubleshooting Guide

Common Issues

1. No traces appearing

// Check if tracer is properly configured
console.log('Tracer:', trace.getTracer('test'))
console.log('Provider:', trace.getTracerProvider())

// Ensure spans are being created
const span = tracer.startSpan('test')
span.end()

2. Metrics not updating

// Check meter configuration
console.log('Meter:', metrics.getMeter('test'))
console.log('Provider:', metrics.getMeterProvider())

// Verify metric collection
const testCounter = meter.createCounter('test_counter')
testCounter.add(1)

3. High memory usage

// Reduce metric cardinality
axUpdateMetricsConfig({
  maxLabelLength: 50, // Shorter labels
  samplingRate: 0.1, // Sample 10% of metrics
})

// Use batch processing for spans
const batchProcessor = new BatchSpanProcessor(exporter, {
  maxQueueSize: 1024, // Smaller queue
  maxExportBatchSize: 256, // Smaller batches
})

4. Slow performance

// Give the exporter a generous timeout so slow backends don't block exports
const asyncExporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
  timeoutMillis: 30000,
})

// Configure appropriate sampling
const sampler = new TraceIdRatioBasedSampler(0.01) // Sample 1%

Debug Mode

// Enable debug mode for detailed logging
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: {
    debug: true, // Detailed logging
    tracer,
    meter,
  },
})

// Check debug output for telemetry information

🎓 Best Practices

1. Naming Conventions

// Use consistent naming for tracers and meters
const tracer = trace.getTracer('my-app.ai-service')
const meter = metrics.getMeter('my-app.ai-service')

// Use descriptive span names
const span = tracer.startSpan('Sentiment Analysis Request')

2. Attribute Management

// Use standard attributes when possible
span.setAttributes({
  'gen_ai.system': 'openai',
  'gen_ai.operation.name': 'chat',
  'gen_ai.request.model': 'gpt-4o-mini',
})

// Add business context
span.setAttributes({
  'business.customer_id': customerId,
  'business.operation_type': 'sentiment_analysis',
})

3. Error Handling

// Always record exceptions in spans
// (assumes a `span` started earlier, e.g. via tracer.startActiveSpan)
try {
  const result = await ai.chat(request)
  return result
} catch (error) {
  span.recordException(error)
  span.setAttributes({ 'error.type': error.name })
  throw error
} finally {
  span.end()
}

4. Performance Optimization

// Use batch processing for high-volume applications
const batchProcessor = new BatchSpanProcessor(exporter, {
  maxQueueSize: 2048,
  maxExportBatchSize: 512,
  scheduledDelayMillis: 5000,
})

// Configure appropriate sampling
const sampler = new ParentBasedSampler({
  root: new TraceIdRatioBasedSampler(0.1), // 10% sampling
})

5. Security Considerations

// Exclude sensitive content from traces
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: {
    excludeContentFromTrace: true, // Don't log prompt content
    tracer,
  },
})

// Use secure headers for cloud exporters
const secureExporter = new OTLPTraceExporter({
  url: process.env.OTLP_ENDPOINT!,
  headers: {
    'Authorization': `Bearer ${process.env.API_KEY}`,
  },
})

📖 Complete Examples

1. Full Production Setup

// examples/production-telemetry.ts
import { AxAI, ax, f } from '@ax-llm/ax'
import { trace, metrics } from '@opentelemetry/api'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http'
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'

// Production telemetry setup
const setupProductionTelemetry = () => {
  // Tracing setup
  const traceExporter = new OTLPTraceExporter({
    url: process.env.OTLP_TRACE_ENDPOINT!,
    headers: { 'Authorization': `Bearer ${process.env.OTLP_API_KEY}` },
  })
  
  const traceProvider = new BasicTracerProvider({
    spanProcessors: [new BatchSpanProcessor(traceExporter)],
  })
  trace.setGlobalTracerProvider(traceProvider)
  
  // Metrics setup
  const metricExporter = new OTLPMetricExporter({
    url: process.env.OTLP_METRIC_ENDPOINT!,
    headers: { 'Authorization': `Bearer ${process.env.OTLP_API_KEY}` },
  })
  
  const meterProvider = new MeterProvider({
    readers: [
      new PeriodicExportingMetricReader({
        exporter: metricExporter,
        exportIntervalMillis: 10000,
      }),
    ],
  })
  metrics.setGlobalMeterProvider(meterProvider)
}

// Initialize telemetry
setupProductionTelemetry()

// Create AI with telemetry
const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4o-mini' },
  options: {
    tracer: trace.getTracer('production-ai'),
    meter: metrics.getMeter('production-ai'),
    debug: process.env.NODE_ENV === 'development',
  },
})

// Create generator
const sentimentAnalyzer = ax`
  reviewText:${f.string('Customer review')} -> 
  sentiment:${f.class(['positive', 'negative', 'neutral'], 'Sentiment')},
  confidence:${f.number('Confidence score 0-1')}
`

// Usage with full observability
export const analyzeSentiment = async (review: string) => {
  const result = await sentimentAnalyzer.forward(ai, { reviewText: review })
  return result
}

2. Multi-Service Tracing

// examples/multi-service-tracing.ts
import { AxAI, AxFlow } from '@ax-llm/ax'
import { trace } from '@opentelemetry/api'

const tracer = trace.getTracer('multi-service')

// Create AI services
const fastAI = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4o-mini' },
  options: { tracer },
})

const powerfulAI = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  config: { model: 'gpt-4o' },
  options: { tracer },
})

// Create multi-service workflow
const documentProcessor = new AxFlow<
  { document: string },
  { summary: string; analysis: string }
>()
  .n('summarizer', 'documentText:string -> summary:string')
  .n('analyzer', 'documentText:string -> analysis:string')
  
  .e('summarizer', s => ({ documentText: s.document }), { ai: fastAI })
  .e('analyzer', s => ({ documentText: s.document }), { ai: powerfulAI })
  
  .m(s => ({
    summary: s.summarizerResult.summary,
    analysis: s.analyzerResult.analysis,
  }))

// Each step gets its own span with proper parent-child relationships
export const processDocument = async (document: string) => {
  return await documentProcessor.forward(fastAI, { document })
}

3. Custom Business Metrics

// examples/custom-business-metrics.ts
import { AxAI, ax, f } from '@ax-llm/ax'
import { metrics } from '@opentelemetry/api'

const meter = metrics.getMeter('business-metrics')

// Create custom business metrics
const customerSatisfactionGauge = meter.createGauge('customer_satisfaction_score', {
  description: 'Customer satisfaction score',
})

const orderProcessingHistogram = meter.createHistogram('order_processing_duration_ms', {
  description: 'Order processing time',
  unit: 'ms',
})

const ai = new AxAI({
  name: 'openai',
  apiKey: process.env.OPENAI_APIKEY!,
  options: { meter },
})

const orderAnalyzer = ax`
  orderText:${f.string('Order description')} -> 
  category:${f.class(['urgent', 'normal', 'low'], 'Priority')},
  estimatedTime:${f.number('Estimated processing time in hours')}
`

export const processOrder = async (orderText: string) => {
  const startTime = performance.now()
  
  try {
    const result = await orderAnalyzer.forward(ai, { orderText })
    
    // Record business metrics
    const processingTime = performance.now() - startTime
    orderProcessingHistogram.record(processingTime, {
      category: result.category,
    })
    
    // Update satisfaction score based on processing time
    const satisfactionScore = processingTime < 1000 ? 0.9 : 0.7
    customerSatisfactionGauge.record(satisfactionScore, {
      order_type: result.category,
    })
    
    return result
  } catch (error) {
    // Record error metrics
    customerSatisfactionGauge.record(0.0, {
      order_type: 'error',
    })
    throw error
  }
}

🎯 Key Takeaways

✅ What You've Learned

  1. Complete Observability: Ax provides comprehensive metrics and tracing out of the box
  2. Industry Standards: Uses OpenTelemetry and gen_ai semantic conventions
  3. Zero Configuration: Works immediately with minimal setup
  4. Production Ready: Scales from development to enterprise environments
  5. Cost Optimization: Track usage and costs to optimize spending

🚀 Next Steps

  1. Start Simple: Begin with console export for development
  2. Add Production: Set up cloud observability for production
  3. Custom Metrics: Add business-specific metrics
  4. Alerting: Set up alerts on key metrics
  5. Optimization: Use data to optimize performance and costs

🎉 You're Ready!

You now have the knowledge to build observable, production-ready AI applications with Ax. Start with the quick setup, add production telemetry, and watch your AI systems become transparent and optimizable!


Need help? Check out the Ax documentation or join our community.