AxAgent RLM Runtime Rules (@ax-llm/ax)
This skill helps an LLM generate correct AxAgent RLM/runtime code using @ax-llm/ax. Use when the user asks about RLM code execution, AxJSRuntime, contextFields, contextPolicy, liveRuntimeState, promptLevel, stage prompt controls, executorModelPolicy, maxRuntimeChars, agent.test(…), llmQuery(…), recursionOptions, or long-running agent runtime behavior.
Install
Install only this skill for TypeScript:
npx skills add https://ax-llm.github.io/ax/typescript/ --skill 'ax-agent-rlm'Published skill file: ax-agent-rlm/SKILL.md.
Source
- Source: src/ax/skills/ax-agent-rlm.md
- Version:
22.0.3
Skill Instructions
Use this skill for code-runtime agents and llmQuery(...) semantic-helper behavior. For ordinary agent setup, child agents, tool namespaces, clarification, and bubbleErrors, use ax-agent. For callbacks and logs, use ax-agent-observability. For memories and skill loading, use ax-agent-memory-skills.
Use These Defaults
- Use
agent(...), notnew AxAgent(...). - In stdout-mode RLM, use one observable
console.log(...)step per non-final actor turn. - Default to
contextPolicy: { preset: 'checkpointed', budget: 'balanced' }for most RLM tasks. - Prefer
contextPolicy: { preset: 'adaptive', budget: 'balanced' }when older successful turns should collapse sooner while live runtime state stays visible. - Use
contextMapfor recurring long-context corpora when the distiller should start future runs with a small persisted orientation cache. - Prefer
promptLevel: 'default'for normal use. - Use
promptLevel: 'detailed'when you want extra anti-pattern examples and tighter teaching scaffolding in the actor prompt. - Prefer
executorModelPolicywhen the actor may need to upgrade after repeated error turns or discovery in specific namespaces without also upgrading the responder. - Use explicit child agents in
functions: [...]when the task needs specialist agents with their own tools/runtime. - Use
llmQuery(...)only for focused semantic questions over narrowed context; it does not spawn a tool-using child AxAgent. - Prefer
maxSubAgentCallsonly when you need an explicit cap onllmQuery(...)sub-query usage.
Mental Model
AxAgent is a three-stage pipeline. Each forward() call walks the stages in order:
distiller (RLM actor) -> executor (RLM actor) -> responder (synthesizer)- distiller always runs first. It sees all original inputs so it can understand and normalize the task; declared
contextFieldsstay runtime-only when present. It distils relevant evidence by writing runtime-language code in a multi-turn loop, then calls the runtime-exposedfinal(request, evidence)primitive. The request becomes the executor’sinputs.executorRequest; it must be self-contained and restate the concrete action, target, and constraints, not vague wording like “do it”. The distiller should expand the original user task with facts found in context, including follow-ups like “yes, do it”. When nocontextFieldsare configured, it still performs request normalization over the original inputs withcontextFields: []. The distiller has no tools and is not a capability gate. - executor always runs. It receives non-context inputs plus
inputs.executorRequestandinputs.distilledContextfrom the distiller’sfinal(request, evidence)payload. Raw context fields are not present in the executor stage. The executor owns tool use, decides whether to call its available functions or finish directly from distilled evidence, and reports actual tool results or failures. - responder always runs last. It synthesizes the user’s output signature from whichever upstream actor finished the run and must not contradict tool evidence gathered upstream.
Treat both actor stages as long-running code runtime sessions that the actor steers over multiple turns, not as fresh script generators on every turn. AxJSRuntime is the default; custom runtimes set language so the actor code field becomes <language>Code such as pythonCode while JavaScript keeps the legacy javascriptCode.
- Successful code leaves variables, functions, imports, and computed values available in the runtime session.
- The actor should continue from existing runtime state instead of recreating prior work.
actionLog,liveRuntimeState, and checkpoint summaries only control what the actor can see again in the prompt.- Rebuild state only after an explicit runtime restart notice or when you intentionally need to overwrite a value.
RLM Actor Code Rules
Use these rules when generating actor JavaScript for RLM in AxJSRuntime stdout mode. For custom runtimes, follow the runtime’s getUsageInstructions(), primitive overrides, and callable formatter instead.
- Treat each actor turn as exactly one observable step.
- Inspect what already exists before recomputing it. If a prior turn successfully created a value, prefer reusing that runtime value.
- If you need to inspect a value, compute it or read it,
console.log(...)it, and stop immediately after thatconsole.log(...). - On the next turn, continue from the existing runtime state and use the logged result from
Action Logonly as evidence for what happened. - If the prompt contains
Live Runtime State, treat it as the canonical view of current variables. - Errors from child-agent or tool calls appear in
Action Log; inspect them and fix the code on the next turn. - Non-final turns should contain exactly one
console.log(...). - Final turns should call
await final(outputGenerationTask, context)orawait askClarification(...)withoutconsole.log(...). - Do not write a complete multi-step program in one actor turn.
- Do not combine
console.log(...)withawait final(...)orawait askClarification(...)in the same actor turn. - Inside actor-authored JavaScript,
await final(...)andawait askClarification(...)end the current turn immediately; code after them is dead code. - Do not re-declare or recompute values just because older turns are summarized; only rebuild after an explicit runtime restart or when you intentionally want a new value.
- Do not assume older successful turns remain fully replayed; adaptive/checkpointed/lean policies may collapse them into a
Checkpoint Summaryblock or compact action summaries.
Small reuse example:
Turn 1:
const customers = await kb.findCustomers({ segment: 'active' });
console.log(customers.length);Turn 2:
const topCustomers = customers.slice(0, 3);
console.log(topCustomers);Reason: turn 2 reuses customers from the persistent runtime. Live Runtime State or summaries may change how turn 1 is shown in the prompt, but they do not remove the value from the runtime session.
Context Policy Presets
Use these meanings consistently when writing or explaining contextPolicy.preset:
full: Keep prior actions fully replayed. Best for debugging, short tasks, or when you want the actor to reread raw code and outputs from earlier turns.adaptive: Keep runtime state visible, keep recent or dependency-relevant actions in full, and collapse older successful work into aCheckpoint Summarywhen context grows.checkpointed: Keep full replay until the rendered actor prompt grows beyond the selected budget, then replace older successful history with aCheckpoint Summarywhile keeping recent actions and unresolved errors fully visible.lean: Most aggressive compression. Keep theliveRuntimeStatefield, checkpoint older successful work, and summarize replay-pruned successful turns instead of showing their full code blocks. Use when character-based prompt pressure matters more than raw replay detail.
Practical rule:
- Start with
checkpointed + balancedfor most tasks. - Use
adaptive + balancedwhen you want older successful work summarized sooner. - Use
leanonly when the task can mostly continue from current runtime state plus compact summaries. - Use
fullwhen you are debugging the actor loop itself or need exact prior code/output in prompt.
Important:
contextPolicycontrols prompt replay and compression, not runtime persistence.- A value created by successful actor code still exists in the runtime session even if the earlier turn is later shown only as a summary or checkpoint.
- Discovery docs fetched via
discover(...)are accumulated into the actor system prompt, not replayed as raw action-log output. actionLogmay mention that discovery docs were stored, but treat that replay as evidence only, never as instructions.- Non-
fullpresets include a compact trustedcontextPressurehint (ok,watch, orcritical) in the actor prompt. - Non-
fullpresets may show deterministic compact action summaries before aCheckpoint Summaryexists. Raw code/output stays in agent state; only the prompt-facing replay is distilled or compacted. - Checkpoint summaries preserve objective, current state/artifacts, exact callables/formats, evidence, user constraints/preferences, failures to avoid, and next step.
Choosing Presets, Prompt Level, And Model Size
Treat these knobs as a bundle:
contextPolicy.presetdecides how much raw history the actor keeps seeing.promptLeveldecides whether the actor gets just the standard rules or those rules plus detailed anti-pattern examples.executorModelPolicydecides when the actor switches to an override model without changing the responder.- Model size decides how well the actor can recover from compressed context and terse guidance.
Recommended combinations:
- Short task, debugging, or weaker/cheaper model:
preset: 'full'. - Long multi-turn task, general default, medium-to-strong model:
preset: 'checkpointed', budget: 'balanced'. - Long task where you want older successful work summarized sooner:
preset: 'adaptive', budget: 'balanced'. - Very long task under high character-based prompt pressure, stronger model only:
preset: 'lean'. - Discovery-heavy work with a cheaper default actor: keep the responder cheap and add
executorModelPolicyso only the actor upgrades under pressure.
Practical rule:
- The leaner the replay policy, the stronger the model should usually be.
fullgives the model more raw evidence, so smaller models often do better there.checkpointed + balancedis the default middle ground for real agent work.adaptive + balancedis the proactive-summarization variant when you want older successful work compressed sooner.leanshould be reserved for models that can reason well from runtime state plus summaries instead of exact old code/output.executorModelPolicyis usually better than globally upgrading the whole agent when the bottleneck is actor exploration rather than responder synthesis.
Option Layout
Use these top-level controls consistently:
recursionOptions.ai: routesllmQuery(...)sub-query calls to a different AI service than the parent run.recursionOptions.model,modelConfig, and other forward options: tune the AxGen call used byllmQuery(...).maxSubAgentCalls: sharedllmQuery(...)sub-query budget across the whole run. Default is100.maxBatchedLlmQueryConcurrency: caps batchedllmQuery([...])concurrency.maxRuntimeChars: runtime/output truncation ceiling for console logs, tool results, and interpreter output replay. The effective limit is computed dynamically each turn based on remaining context budget.summarizerOptions: default model/options for the internal checkpoint summarizer.contextPolicy: replay/checkpointing/compression policy.contextMap: optional persistent orientation cache injected into the distiller and updated once after each successful run.AxAgentContextMapevolves indefinitely by default; use{ infiniteEvolve: false, evolveSteps: N }on the map object for finite warmup followed by reuse.contextOptions: distiller-stage forward options.executorOptions: executor-stage forward options such asdescription,model,modelConfig,thinkingTokenBudget, andshowThoughts.executorModelPolicy: executor-only model override rules based on consecutive error turns or discovery fetches from listed namespaces.responderOptions: responder-stage forward options.judgeOptions: built-in judge options foragent.optimize(...); for tuning workflows useax-agent-optimize.
Canonical shape:
const researchAgent = agent('query:string -> answer:string', {
contextFields: ['query'],
runtime,
recursionOptions: {
model: 'gpt-5.4-mini',
},
maxRuntimeChars: 3000,
summarizerOptions: {
model: 'gpt-5.4-mini',
modelConfig: { temperature: 0.1, maxTokens: 180 },
},
contextPolicy: {
preset: 'checkpointed',
budget: 'balanced',
},
contextOptions: {
model: 'gpt-5.4-mini',
maxTurns: 3,
},
executorOptions: {
description: 'Use tools first and keep JS steps small.',
model: 'gpt-5.4-mini',
},
executorModelPolicy: [
{
model: 'gpt-5.4',
aboveErrorTurns: 2,
namespaces: ['db', 'kb'],
},
],
responderOptions: {
model: 'gpt-5.4-mini',
},
});Semantics:
maxRuntimeCharssets the truncation ceiling and is separate fromcontextPolicy.budget.summarizerOptionstunes only the internal checkpoint summarizer. It does not change actor or responder model selection.executorModelPolicyonly switches the actor model. It does not changeresponderOptions.model.llmQuery(...)usesrecursionOptions.aiwhen set, otherwise it falls back to the parent.forward(ai, ...)service.recursionOptionsconfigures the AxGen semantic sub-query used byllmQuery(...); it does not create a child AxAgent and cannot give the sub-query tools.executorModelPolicyentries are ordered from weaker to stronger. If multiple rules match, the last matching entry wins.- If one entry defines
namespaces, any successfuldiscover(...)function-definition fetch from one of those namespaces marks the rule as matched starting on the next actor turn. - Do not add
recursionOptionsunless the user needs different model/options forllmQuery(...).
Dynamic Output Truncation
Runtime output truncation is budget-proportional and type-aware:
- Early turns with little action-log pressure use the full
maxRuntimeCharsceiling. - As the action log fills toward
targetPromptChars, the limit decays linearly down to 15% of the ceiling, hard-floored at 400 chars. - Large arrays keep the first 3 and last 2 items, with the middle replaced by
... [N hidden items]. - Deep objects replace nested values beyond depth 3 with
[Object]or[Array(N)]. - Error stack traces keep the first 3 and last 1 stack frames.
- Simple values use standard
JSON.stringifypassthrough.
Users do not need to configure this behavior. maxRuntimeChars sets the upper bound; the dynamic system only reduces it.
Stage Prompt Controls
The pipeline has three peer stage-config bags: contextOptions (distiller), executorOptions (executor), and responderOptions (responder). Each accepts the same shape: description, model, modelConfig, excludeFields, plus other forward options.
Key fields:
contextOptions.description: append extra distiller-specific instructions.executorOptions.description: append extra executor-specific instructions; this is the typical place for tool-use guidance.responderOptions.description: append extra responder-specific instructions.contextOptions.model/executorOptions.model/responderOptions.model: split model choice across stages.contextOptions.ai/executorOptions.ai/responderOptions.ai: override the AI service for a specific stage.executorModelPolicy: auto-switch only the executor when the run is on a consecutive error streak or discovery fetches land in specific namespaces.
Good split-model pattern:
const researchAgent = agent('query:string -> answer:string', {
contextFields: ['query'],
runtime,
contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
executorOptions: {
model: 'gpt-5.4',
},
responderOptions: {
model: 'gpt-5.4-mini',
},
});Model guidance:
- Put the stronger model on the actor when the task depends on multi-turn exploration, discovery, runtime state reuse, or compressed replay.
- Put the stronger model on the responder only when the hard part is final synthesis/formatting rather than exploration.
- For cost-sensitive setups, a common pattern is stronger actor plus cheaper responder.
- Prefer
executorModelPolicyover globally upgrading the whole agent when the actor only needs help after context grows or the run starts thrashing.
Prompt/cache shape:
- Actor turns are compact observable turns, not replayed chat transcripts.
- Stable system prompt: role/stage rules, primitive descriptions, static module list, always-included callable signatures, output contract, and field definitions.
- Cached working inputs: task inputs, inline context,
contextMetadata,contextMap,memories,executorRequest,distilledContext,discoveredToolDocs,loadedSkills, andsummarizedActorLog. - Dynamic turn tail:
guidanceLog,actionLog,liveRuntimeState, andcontextPressure. - Prefer one compact inspection per non-final turn. Never combine inspection output with
final(...)oraskClarification(...).
Invalid actor turn:
await discover(['kb.findSnippets']);
const snippets = await kb.findSnippets({ topic: 'severity' });
await final("Summarize severity findings", { snippets });Reason: this mixes observation and follow-up work in one turn. discover(...) returns void; read the next prompt’s “Discovered Tool Docs” section before calling the function.
AxJSRuntime Security
Default new AxJSRuntime() is hardened: no network, no filesystem, no child process, dynamic import() blocked, intrinsics frozen, ShadowRealm locked to undefined, worker IPC locked in browser/Deno/Bun, Bun workers use smol: true, and on Node 20+ the OS Permission Model auto-engages where available.
Threat model: this is defense-in-depth for LLM-authored code, not a container or VM boundary. Host callbacks and granted runtime permissions remain the authority boundary; keep durable secrets and privileged effects in host-side functions.
Permission enum (AxJSRuntimePermission):
NETWORK, STORAGE, CODE_LOADING, COMMUNICATION, TIMING, WORKERS, FILESYSTEM, CHILD_PROCESS.
Options quick reference:
permissions?: readonly AxJSRuntimePermission[]: default[]; opt in capabilities.blockDynamicImport?: boolean: defaulttrue.allowedModules?: readonly string[]: default[]; narrow dynamic-import allowlist gate. Allowlisted specifiers are attempted, but full Node module namespace passthrough depends on Node vm semantics.freezeIntrinsics?: boolean: defaulttrue.blockShadowRealm?: boolean: defaulttrue.lockWorkerIPC?: boolean: defaulttrue.preventGlobalThisExtensions?: boolean: defaultfalse; opt-in and breaks top-level persistence.useNodePermissionModel?: boolean | 'auto': default'auto'.nodePermissionAllowlist?: { fsRead?; fsWrite?; childProcess?; addons?; wasi? }.resourceLimits?: { maxOldGenerationSizeMb?; maxYoungGenerationSizeMb?; codeRangeSizeMb?; stackSizeMb? }.allowDenoRemoteImport?: boolean: defaultfalse.allowUnsafeNodeHostAccess?: boolean: defaultfalse.
Recipes:
new AxJSRuntime();
new AxJSRuntime({ permissions: [AxJSRuntimePermission.NETWORK] });
new AxJSRuntime({
permissions: [AxJSRuntimePermission.FILESYSTEM],
allowedModules: ['node:fs', 'node:fs/promises', 'node:path'],
useNodePermissionModel: 'auto',
nodePermissionAllowlist: {
fsRead: ['/app/data'],
fsWrite: ['/app/data'],
},
});Rules for the LLM author:
- Default to
new AxJSRuntime()with no options unless the user asked for a specific capability. - When the user asks for
fetch, addpermissions: [AxJSRuntimePermission.NETWORK]. - When the user asks for filesystem access, prefer host-side tool functions. If direct runtime filesystem access is required, add
permissions: [AxJSRuntimePermission.FILESYSTEM], scope withnodePermissionAllowlistwhen the user names a directory, and treatallowedModulesas an import allowlist gate rather than a portability guarantee. - Do not disable
freezeIntrinsics,blockShadowRealm, orlockWorkerIPCunless the user explicitly asks. - Treat
allowUnsafeNodeHostAccess: trueas a red flag; only use it when the user is authoring trusted code in their own process. preventGlobalThisExtensions: truebreaks top-levelvar/let/constpersistence across turns; never set it for stdout-mode RLM where persistence is load-bearing.- On Deno,
blockDynamicImportis a no-op; the defense is the worker permission sandbox. PassallowDenoRemoteImport: trueonly if remote module loading is genuinely required.
Custom Code Runtimes
Implement AxCodeRuntime when the actor should write a language other than JavaScript.
- Set
languageto the model-facing language name. JavaScript aliases (JavaScript,js,ecmascript) keepjavascriptCode; other values derive lower-camel code fields such aspythonCodeorcSharpCode. - Keep execution inside
createSession(globals, options). AxAgent passesinputs,llmQuery,final,askClarification, progress callbacks, memory/discovery primitives, and namespaced tools as host globals; the runtime decides how those globals appear in the target language. - Put language syntax, output behavior, persistence semantics, and completion-call examples in
getUsageInstructions(). - Use
getPrimitiveOverrides()to describe language-native calls for built-in primitives, andformatCallable()to describe language-native calls for tools and child agents. - Implement
inspectGlobals()on sessions whencontextPolicyshould show live runtime state for non-JavaScript runtimes; otherwise AxAgent will not run JavaScript fallback inspection snippets.
RLM Test Harness
Use agent.test(code, contextFieldValues?, options?) when the user wants to validate runtime snippets against the actual AxAgent runtime environment without running the full actor/responder loop. With AxJSRuntime, those snippets are JavaScript.
import { AxJSRuntime, agent, f, fn } from '@ax-llm/ax';
const runtime = new AxJSRuntime();
const tools = [
fn('sum')
.description('Return the sum of the provided numeric values')
.namespace('math')
.arg('values', f.number('Value to add').array())
.returns(f.number('Sum of all values'))
.handler(async ({ values }) =>
values.reduce((total, value) => total + value, 0)
)
.build(),
];
const toolHarness = agent('query:string -> answer:string', {
contextFields: [],
runtime,
functions: tools,
contextPolicy: { preset: 'checkpointed', budget: 'balanced' },
});
const toolOutput = await toolHarness.test(
'console.log(await math.sum({ values: [3, 5, 8] }))'
);
console.log(toolOutput);Rules:
test(...)creates a fresh runtime session per call.- Context-field snippets run in the context/distiller runtime and expose
inputsplus non-colliding top-level aliases for configuredcontextFields. - Tool snippets should use an agent with no
contextFields, or test the executor stage directly, so namespaced functions, child agents, andllmQuery(...)are in scope. - In
AxJSRuntime, do not rely on callinginspectRuntime()from insidetest(...)snippets yet; prefer checking runtime globals directly inside the snippet. - It returns the formatted runtime output string.
- It throws on runtime failures instead of returning LLM-style error strings.
- Do not call
final(...)oraskClarification(...)insidetest(...)snippets. - Pass only
contextFieldsvalues totest(...); it is not a general way to inject arbitrary non-context inputs. - If the snippet uses
llmQuery(...), provide an AI service through the agent config oroptions.ai.
llmQuery(...) Rules
Available forms:
await llmQuery(query, context?)await llmQuery({ query, context? })await llmQuery([{ query, context }, ...])
Rules:
llmQuery(...)forwards only the explicitcontextargument.- Parent inputs, runtime variables, tool results, and discovered docs are not automatically available to
llmQuery(...); include any needed facts incontext. llmQuery(...)is a direct semantic helper backed by an AxGen sub-query. It does not create a child AxAgent, does not run an actor runtime session, and does not have access to tools or discovery.- Use batched
llmQuery([...])only for independent semantic questions. Use serial calls when later work depends on earlier results. - Pass compact named object context instead of huge raw parent payloads.
- Do not assume anything other than the returned string comes back from
llmQuery(...). maxSubAgentCallsis a shared budget forllmQuery(...)sub-queries across the top-level run.- Single-call
llmQuery(...)may return[ERROR] ...on non-abort failures. - Batched
llmQuery([...])returns per-item[ERROR] .... - If a result starts with
[ERROR], inspect or branch on it instead of assuming success.
Minimal example:
const summary = await llmQuery('Summarize this incident', inputs.context);
if (summary.startsWith('[ERROR]')) {
console.log(summary);
} else {
console.log(summary);
}Parallel semantic review example:
const narrowedIncidents = incidents.map((incident) => ({
id: incident.id,
timeline: incident.timeline,
notes: incident.notes.slice(0, 1200),
}));
const [severityReview, followupReview] = await llmQuery([
{
query:
'Use discovery and available tools to review severity policy alignment. Return compact findings.',
context: {
incidents: narrowedIncidents,
rubric: 'severity-policy',
},
},
{
query:
'Use discovery and available tools to review postmortem and follow-up obligations. Return compact findings.',
context: {
incidents: narrowedIncidents,
rubric: 'postmortem-followup',
},
},
]);
const merged = await llmQuery(
'Merge these delegated reviews into one manager-ready summary with next steps.',
{
severityReview,
followupReview,
audience: inputs.audience,
}
);Delegation decision guide:
- JS-only: deterministic logic such as filter, sort, count, regex, or date math -> do it inline.
- Single-shot semantic: needs LLM reasoning but no tools or multi-step exploration -> single
llmQuery(...)with narrow context. - Specialist/tool delegation: needs its own tools, discovery, runtime, or reusable role -> create a child
agent(...)and pass it infunctions: [...]. - Parallel semantic fan-out: two or more independent semantic-only subtasks -> batched
llmQuery([...]).
Context handling:
- Always narrow with JS before delegating. Never pass raw
inputs.*. - Name context keys semantically, e.g.
{ emails: filtered, rubric: 'classify-urgency' }. - Estimate total sub-query calls before fanning out.
maxSubAgentCallsis shared across the run.
Patterns:
- Fan-Out / Fan-In: JS narrows into categories ->
llmQuery([...])fans out per category -> JS or one morellmQuery(...)merges semantic results. - Pipeline: serial
llmQuery(...)calls where each depends on the prior result. - Specialist tool use: call child agents or tools via their namespaced function globals, e.g.
await team.writer({ draft }).
Examples
Fetch these for full working code:
- RLM - RLM basic
- RLM Long Task - RLM context policy
- RLM Discovery - discovery mode, grouped tools, child agents as functions, and semantic
llmQuery(...) - RLM Adaptive Replay - adaptive replay
- RLM Live Runtime State - structured runtime-state rendering
- RLM Clarification Resume - clarification exception plus
getState()/setState(...)
Do Not Generate
- Do not write a full multi-step RLM actor program in one turn.
- Do not combine
console.log(...)withfinal(...). - Do not assume old successful turns stay fully replayed under adaptive/checkpointed/lean policies.
- Do not rebuild runtime state just because a prior turn was summarized.
- Do not describe
llmQuery(...)as spawning a tool-using child AxAgent. - Do not assume parent inputs are available to
llmQuery(...)unless passed incontext. - Do not ignore
[ERROR] ...results fromllmQuery(...). - Do not grant
AxJSRuntimepermissions unless the user asked for the capability.