Ax Refine And BestOfN
Use this skill when writing or reviewing Ax bestOfN/refine code, reward functions, thresholds, native sample selection, serial attempts, generated advice, and attempt diagnostics.
Install
Install only this skill for TypeScript:
npx skills add https://ax-llm.github.io/ax/typescript/ --skill 'ax-refine'
Published skill file: ax-refine/SKILL.md.
Source
- Source: src/ax/skills/ax-refine.md
- Version: 22.0.3
Skill Instructions
Use bestOfN(...) when you can score complete outputs independently. Use refine(...) when failed rounds should produce feedback that changes the next attempt.
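The distinction can be illustrated with a library-free sketch. Every name below (Generate, Score, bestOfNSketch, refineSketch, the toy generator) is a hypothetical stand-in and not the Ax API; the point is only that best-of-N scores each candidate on its own, while refine-style rounds carry feedback from a failed round into the next one.

```typescript
// Library-free sketch; all names are hypothetical stand-ins, not Ax APIs.
type Generate = (advice: string[]) => string;
type Score = (output: string) => number;

// Best-of-N style: every candidate is produced and scored independently.
function bestOfNSketch(generate: Generate, score: Score, n: number): string {
  let best = generate([]);
  let bestReward = score(best);
  for (let i = 1; i < n; i++) {
    const candidate = generate([]);
    const reward = score(candidate);
    if (reward > bestReward) {
      best = candidate;
      bestReward = reward;
    }
  }
  return best;
}

// Refine style: a below-threshold round yields feedback for the next round.
function refineSketch(
  generate: Generate,
  score: Score,
  critique: (output: string) => string,
  rounds: number,
  threshold: number,
): string {
  const advice: string[] = [];
  let best = '';
  let bestReward = -Infinity;
  for (let round = 0; round < rounds; round++) {
    const candidate = generate(advice);
    const reward = score(candidate);
    if (reward > bestReward) {
      best = candidate;
      bestReward = reward;
    }
    if (reward >= threshold) break; // good enough, stop early
    advice.push(critique(candidate)); // feedback shapes the next attempt
  }
  return best;
}

// Toy generator: it only cites a source once it has been advised to.
const toyGenerate: Generate = (advice) =>
  advice.includes('cite a source') ? 'Paris [1]' : 'Paris';
const toyScore: Score = (output) => (output.includes('[1]') ? 1 : 0.5);

const independent = bestOfNSketch(toyGenerate, toyScore, 3);
const refined = refineSketch(toyGenerate, toyScore, () => 'cite a source', 3, 0.9);
```

In this toy setup, repeated independent sampling never improves the output, because nothing changes between attempts; the refine-style loop improves it on the second round once the critique is fed back.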
Validation And Assertions
Keep reward scoring, whole-output assertions, and streaming assertions separate:
- Use schema validation for shape, types, and field-level constraints.
- Use addAssert(...) for whole-output hard invariants. Failed assertions feed correction text into the normal retry loop.
- Use addStreamingAssert(...) for partial streaming hard invariants. It aborts the current stream attempt as soon as the partial field fails, then feeds correction text into the normal retry loop.
- Use bestOfN(...) for complete-candidate selection.
- Use refine(...) for reward-scored retry rounds with generated feedback.
APIs
import { bestOfN, refine } from '@ax-llm/ax';
const selected = bestOfN(program, {
  n: 4,
  threshold: 0.8,
  rewardFn: ({ input, prediction, traces, chatLog }) => score(prediction),
});
const improved = refine(program, {
  rounds: 3,
  samplesPerRound: 2,
  threshold: 0.85,
  rewardDescription: 'Prefer complete, grounded, concise answers.',
  rewardFn: ({ prediction }) => score(prediction),
});
Rules:
- forward(...) returns the selected prediction.
- streamingForward(...) is unsupported; score complete outputs instead.
- getUsage() aggregates usage across attempts.
- getTraces() and getChatLog() return the selected attempt's diagnostics.
- getAttempts() returns all attempt metadata, including reward, errors, and advice application.
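The split between aggregated usage and selected-attempt diagnostics can be sketched without the library. The AttemptRecord shape and the summarize helper below are assumptions made for illustration; the real records returned by getAttempts() may look different, and whether failed attempts count toward usage is assumed here rather than confirmed.

```typescript
// Hypothetical attempt record; the real shape returned by getAttempts() may differ.
type AttemptRecord = {
  reward: number;
  tokens: number;
  trace: string;
  error?: string;
};

function summarize(attempts: AttemptRecord[]): { selectedTrace: string; totalTokens: number } {
  // Only attempts that finished without an error compete for selection.
  const scored = attempts.filter((a) => a.error === undefined);
  if (scored.length === 0) throw new Error('no successful attempts');
  const selected = scored.reduce((best, a) => (a.reward > best.reward ? a : best));
  // Usage is summed over every attempt (assumption: failed ones included).
  const totalTokens = attempts.reduce((sum, a) => sum + a.tokens, 0);
  return { selectedTrace: selected.trace, totalTokens };
}

const summary = summarize([
  { reward: 0.4, tokens: 120, trace: 'attempt 1' },
  { reward: 0.9, tokens: 150, trace: 'attempt 2' },
  { reward: 0, tokens: 30, trace: 'attempt 3', error: 'timeout' },
]);
```

Here the diagnostics come from the single best attempt, while the token total covers all three, which is why cost estimates should be read from aggregated usage rather than from the selected trace.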
Reward Functions
Reward functions return a number. Higher is better. A threshold marks a good-enough candidate and can stop serial attempts early.
const rewardFn = ({ prediction }) => {
  const exact = prediction.answer === 'Paris' ? 1 : 0;
  const concise = prediction.answer.length < 80 ? 0.2 : 0;
  return exact + concise;
};
Use serial strategy when the reward needs traces, chat logs, tools, or full flow behavior.
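As a worked check of how the example reward interacts with a threshold, the following sketch restates it with explicit types. The Prediction shape and the meetsThreshold helper are assumptions for illustration; the threshold value is borrowed from the bestOfN example.

```typescript
// Typed restatement of the example reward; the Prediction shape is an assumption.
type Prediction = { answer: string };

const rewardFn = ({ prediction }: { prediction: Prediction }): number => {
  const exact = prediction.answer === 'Paris' ? 1 : 0;
  const concise = prediction.answer.length < 80 ? 0.2 : 0;
  return exact + concise;
};

const threshold = 0.8; // same value as in the bestOfN example

// Hypothetical helper: does a given answer count as good enough?
const meetsThreshold = (answer: string): boolean =>
  rewardFn({ prediction: { answer } }) >= threshold;

const right = rewardFn({ prediction: { answer: 'Paris' } }); // exact and concise
const wrong = rewardFn({ prediction: { answer: 'Lyon' } }); // concise only
```

The weights are chosen so that the secondary criterion alone cannot reach the threshold: a short but wrong answer earns only the conciseness bonus, while a correct short answer clears 0.8 comfortably.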
Strategies
- strategy: "auto" uses native samples for AxGen and serial attempts for composite programs.
- strategy: "native-samples" uses sampleCount and a reward-backed resultPicker; candidate context includes outputs, not full per-candidate traces.
- strategy: "serial" runs isolated full-program attempts with fresh memory/session IDs.
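The strategy rules above can be sketched as follows. This is a minimal, library-free illustration: resolveStrategy, runSerial, the ProgramKind type, and the attempt-N session ID format are all hypothetical and do not describe Ax internals. The run callback is synchronous only to keep the sketch short.

```typescript
// Hypothetical sketch; the names below are illustrative, not Ax internals.
type Strategy = 'auto' | 'native-samples' | 'serial';
type ProgramKind = 'AxGen' | 'composite';

// "auto" resolves by program kind; explicit strategies pass through unchanged.
function resolveStrategy(strategy: Strategy, kind: ProgramKind): 'native-samples' | 'serial' {
  if (strategy !== 'auto') return strategy;
  return kind === 'AxGen' ? 'native-samples' : 'serial';
}

type Attempt = { sessionId: string; output: string; reward: number };

// Serial attempts: each run gets a fresh session ID, and the loop stops
// as soon as one attempt reaches the threshold.
function runSerial(
  run: (sessionId: string) => string,
  score: (output: string) => number,
  n: number,
  threshold: number,
): Attempt[] {
  const attempts: Attempt[] = [];
  for (let i = 0; i < n; i++) {
    const sessionId = `attempt-${i + 1}`;
    const output = run(sessionId);
    const reward = score(output);
    attempts.push({ sessionId, output, reward });
    if (reward >= threshold) break;
  }
  return attempts;
}

// Toy program that returns a scripted output per call.
const outputs = ['draft', 'good answer', 'never reached'];
let calls = 0;
const attempts = runSerial(
  () => outputs[calls++] ?? '',
  (o) => (o === 'good answer' ? 0.9 : 0.3),
  3,
  0.8,
);
```

With a threshold of 0.8, the second attempt is good enough, so the third scripted output is never requested.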
Refine Advice
refine(...) generates advice after a below-threshold round. Advice is appended temporarily to matching kind: "instruction" components exposed by getOptimizableComponents() and applied through applyOptimizedComponents().
Rules:
- Original instruction values are restored in finally, on success and error.
- Programs without instruction components continue as best-of-N rounds and mark adviceApplied: false.
- Do not add DSPy-style hint_ signature fields; Ax uses instruction-component advice.
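The append-then-restore behaviour described in these rules can be sketched as below. The Component shape and the withTemporaryAdvice helper are hypothetical and only illustrate the pattern; they are not the signatures of getOptimizableComponents() or applyOptimizedComponents().

```typescript
// Hypothetical component shape; only illustrates the append-then-restore pattern.
type Component = { kind: string; value: string };

function withTemporaryAdvice<T>(
  components: Component[],
  advice: string,
  run: (adviceApplied: boolean) => T,
): T {
  // Remember the original text of every instruction component.
  const saved = components
    .filter((c) => c.kind === 'instruction')
    .map((component) => ({ component, original: component.value }));
  try {
    for (const { component } of saved) {
      component.value = `${component.value}\n${advice}`;
    }
    // With no instruction components, the round runs without advice.
    return run(saved.length > 0);
  } finally {
    // Runs on success and on error, so no advice leaks past the round.
    for (const { component, original } of saved) {
      component.value = original;
    }
  }
}

const components: Component[] = [{ kind: 'instruction', value: 'Answer briefly.' }];
let seenDuringRun = '';
try {
  withTemporaryAdvice(components, 'Cite a source.', () => {
    seenDuringRun = components[0].value;
    throw new Error('round failed');
  });
} catch {
  // The error still propagates; only the instruction text is restored.
}

// A program exposing no instruction components reports adviceApplied as false.
const appliedWithoutInstructions = withTemporaryAdvice(
  [{ kind: 'demo', value: 'x' }],
  'Cite a source.',
  (applied) => applied,
);
```

Restoring in finally rather than after the call keeps the program reusable even when a round throws, which matters when the same program instance is shared across callers.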
Streaming
Do not use refine(...) for streaming. For partial-output safety, use addStreamingAssert(fieldName, fn, message?) on AxGen. Streaming assertions fail fast within the current attempt with AxStreamingAssertionError, then retry with correction feedback when retries remain.
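The fail-fast behaviour can be pictured with a small, library-free sketch. The consumeStream function and the error name used here are hypothetical stand-ins for illustration; they are not the Ax implementation of addStreamingAssert or AxStreamingAssertionError, and the retry step is left out.

```typescript
// Hypothetical stand-in for fail-fast checking of a partial field; not the Ax implementation.
function consumeStream(
  chunks: string[],
  check: (partial: string) => boolean,
  message: string,
): string {
  let partial = '';
  for (const chunk of chunks) {
    partial += chunk;
    if (!check(partial)) {
      // Abort immediately instead of waiting for the full output.
      const error = new Error(message);
      error.name = 'StreamingAssertionSketchError';
      throw error;
    }
  }
  return partial;
}

let checks = 0;
const noSecret = (partial: string): boolean => {
  checks++;
  return !partial.includes('SECRET');
};

let failure = '';
try {
  consumeStream(['The answer', ' is SECRET', ' and more'], noSecret, 'Do not reveal secrets.');
} catch (err) {
  if (err instanceof Error && err.name === 'StreamingAssertionSketchError') {
    failure = err.message; // this text would become the correction feedback
  }
}
```

The check runs on the accumulated partial value after each chunk, so the third chunk is never processed once the second one violates the invariant.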