Scaffold a .rlmx/ config directory with templates.
rlmx init [--template <type>] [--dir <path>]
| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --template <type> | string | default | Template type: default or code |
| --dir <path> | string | . (cwd) | Directory to scaffold in |
Creates a .rlmx/ directory containing:
| File | Purpose |
| --- | --- |
| rlmx.yaml | Main configuration (model, budget, context, storage) |
| SYSTEM.md | System prompt used by the RLM loop |
| CRITERIA.md | Output criteria for quality checks |
| TOOLS.md | Custom Python tools exposed to the RLM |
RLMX v0.260331+ uses the .rlmx/ directory as the only config location. Projects using the old flat-file rlmx.yaml in the project root must re-run rlmx init to migrate.
# Scaffold with default template
rlmx init

# Scaffold with code analysis template
rlmx init --template code

# Scaffold in a specific directory
rlmx init --template default --dir ./my-project
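The migration note above can be turned into a quick pre-flight check. The sketch below is only an illustration: `needs_migration` is a hypothetical helper, not part of rlmx, and it assumes the old layout is recognizable by an rlmx.yaml in the project root with no .rlmx/ directory next to it.

```python
from pathlib import Path


def needs_migration(project_root: str) -> bool:
    """Return True when only the old flat-file layout is present.

    Assumption: the old layout is an rlmx.yaml directly in the project
    root, and the new layout is a .rlmx/ directory. This helper is
    hypothetical and does not call rlmx.
    """
    root = Path(project_root)
    has_flat_file = (root / "rlmx.yaml").is_file()
    has_config_dir = (root / ".rlmx").is_dir()
    return has_flat_file and not has_config_dir
```

If it returns True for a project, re-running rlmx init in that directory is the documented way to migrate.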
--context is required. Without --estimate, rlmx performs a single-iteration warmup run so the provider caches the prompt prefix for subsequent queries.
| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --context <path> | string | required | Path to context directory or file |
| --estimate | boolean | false | Print the token/cost estimate only and skip the warmup call |
| --ext <list> | string | from rlmx.yaml | File extensions when --context is a directory (comma-separated) |
If the context exceeds the provider’s token limit, the command exits with a non-zero status and an error message. See Cache Mode for the full caching workflow.
Run bulk queries from a questions file against a shared cached context. Each question is executed through the same RLM loop used by rlmx "query", with provider-level prompt caching always enabled so the first question pays full price and subsequent questions benefit from the cache.
rlmx batch <questions-file> [options]
Questions are read one per line. Blank lines and lines beginning with # are ignored.
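The filtering rule above is simple enough to reproduce when preparing or validating a questions file. This is a minimal sketch, not rlmx's own parser: `load_questions` is a hypothetical name, and treating lines with leading whitespace before the # as comments is an assumption.

```python
def load_questions(text: str) -> list[str]:
    """Return the questions in a questions-file body, one per line.

    Blank lines and lines beginning with '#' are skipped, as described
    above. Assumption: surrounding whitespace is stripped before the
    '#' check; rlmx itself may be stricter.
    """
    questions = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        questions.append(stripped)
    return questions


sample = "# Architecture\nHow does IPC work?\n\nWhere is auth defined?\n"
questions = load_questions(sample)
# questions == ["How does IPC work?", "Where is auth defined?"]
```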
| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| <questions-file> | path | required | Path to a text file of questions (one per line) |
| --context <path> | string | (none) | Shared context for every question |
| --max-iterations <n> | number | 30 | Maximum RLM iterations per question |
| --timeout <ms> | number | 300000 | Per-question timeout in milliseconds |
| --max-cost <n> | number | (none) | Stop after cumulative USD cost crosses this threshold |
| --max-tokens <n> | number | (none) | Per-question token cap |
| --max-depth <n> | number | (none) | Maximum recursive rlm_query depth |
| --parallel <n> | number | 1 | Concurrency hint (currently executes sequentially) |
| --batch-api | boolean | false | Opt into the Gemini Batch API path (requires provider: google) |
| --tools <level> | string | core | Tool level: core, standard, or full |
| --ext <list> | string | from rlmx.yaml | File extensions for directory context |
| --verbose | boolean | false | Show per-question progress on stderr |
--cache does not need to be passed — batch mode always runs with cache.enabled = true. If the context exceeds the provider’s token limit, rlmx falls back to pgserve storage mode when storage.enabled is auto or always.
# Run a questions file against a cached docs corpus
rlmx batch questions.txt --context ./docs/

# Stop if total spend crosses $1.00
rlmx batch questions.txt --context ./src/ --max-cost 1.00

# Gemini Batch API for 50% input/output token discount
rlmx batch questions.txt --context ./docs/ --batch-api
rlmx batch writes JSONL to stdout — one JSON object per question, followed by a final aggregate line:
{"question":"How does IPC work?","answer":"IPC uses...","stats":{"iterations":2,"inputTokens":42100,"outputTokens":520,"cost":0.0042}}
{"question":"Where is auth defined?","answer":"src/auth.ts...","stats":{"iterations":1,"inputTokens":820,"outputTokens":310,"cost":0.0009}}
{"type":"aggregate","total_questions":2,"completed":2,"total_cost":0.0051,"cache_savings":0.004}
Budget trips, cache fallbacks, and verbose progress are logged to stderr so the stdout stream stays valid JSONL for downstream pipelines. See Batch Mode for full details.
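Because stdout is plain JSONL, a downstream script can separate the per-question records from the final aggregate line with the standard library alone. The sketch below assumes the record shape shown in the sample output above; `split_batch_output` is a hypothetical helper name, and the sample lines are copied from that output rather than produced by a real run.

```python
import json


def split_batch_output(jsonl_text: str):
    """Split rlmx batch JSONL into (per-question records, aggregate).

    Assumption: the aggregate line is the one carrying
    "type": "aggregate", as in the sample output above.
    """
    answers = []
    aggregate = None
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate a trailing blank line
        record = json.loads(line)
        if record.get("type") == "aggregate":
            aggregate = record
        else:
            answers.append(record)
    return answers, aggregate


sample = "\n".join([
    '{"question":"How does IPC work?","answer":"IPC uses...","stats":{"iterations":2,"inputTokens":42100,"outputTokens":520,"cost":0.0042}}',
    '{"question":"Where is auth defined?","answer":"src/auth.ts...","stats":{"iterations":1,"inputTokens":820,"outputTokens":310,"cost":0.0009}}',
    '{"type":"aggregate","total_questions":2,"completed":2,"total_cost":0.0051,"cache_savings":0.004}',
])

answers, aggregate = split_batch_output(sample)
total_cost = sum(a["stats"]["cost"] for a in answers)
print(f"{len(answers)} answers, total cost ${total_cost:.4f}")
```

In practice the input would come from a file written with shell redirection, for example `rlmx batch questions.txt --context ./docs/ > answers.jsonl`, since progress and warnings go to stderr.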
Query run history and cost breakdowns from the rlmx observability database (pgserve at ~/.rlmx/data). Stats are populated automatically by every run that saves a session.
rlmx stats [options]
| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --run <id> | string | (none) | Show the event timeline for a specific session id |
| --costs | boolean | false | Show cost breakdown grouped by model |
| --tools | boolean | false | Show REPL tool usage grouped by session |
| --since <duration> | string | (none) | Limit to the recent window (30m, 24h, 7d) |
| --output json | literal | (none) | Emit structured JSON instead of the terminal table |
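The --since values shown above (30m, 24h, 7d) follow a number-plus-unit pattern. As a rough illustration of how such a window could be interpreted, the sketch below converts those strings to a `timedelta`. It is hypothetical: `parse_since` is not rlmx code, and only the three units that appear in the table are assumed to exist.

```python
import re
from datetime import timedelta

# Assumption: only the units shown in the table (m, h, d) are supported.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_since(value: str) -> timedelta:
    """Convert a duration such as '30m', '24h', or '7d' to a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

With this reading, `parse_since("24h")` and `parse_since("1d")` describe the same window.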
Without any flags, rlmx stats prints the 20 most recent sessions as a terminal table (id, query, model, iterations, cost, status, duration).
Stats require pgserve storage. If ~/.rlmx/data does not exist, the command prints "No stats yet. Run a query first." and exits cleanly. See Configuration for storage setup.
# Most recent 20 runs as a table
rlmx stats

# JSON for scripting / jq pipelines
rlmx stats --output json

# Cost by model over the last 24 hours
rlmx stats --costs --since 24h

# Tool usage over the last week
rlmx stats --tools --since 7d

# Full event timeline for a specific run
rlmx stats --run 0c3e2f1a-...-9f02
Default (sessions table) — plain-text columns written to stdout:
ID          Query               Model                       Iter  Cost     Status     Duration
--------------------------------------------------------------------------------------------------------------
0c3e2f1a..  How does IPC work?  google/gemini-3.1-flash...  3     $0.0042  completed  4.1s
f91d8a05..  Summarize paper.md  google/gemini-3.1-flash...  2     $0.0011  completed  1.8s
--run <id>: one row per event (llm_call, repl_exec, sub_call) with iteration, token counts, cost, duration, and kind-specific detail (model, code preview, request type).
--costs: one row per (session, model) pair with total calls, input/output tokens, cost, and average call duration.
--tools: one row per (session, request_type) with calls, errors, and average duration.
--output json: any of the above as a pretty-printed JSON array of rows.
Run benchmarks that compare the RLM loop against a direct LLM call on the same question. rlmx benchmark does not accept --context — each mode ships its own dataset.
Both modes print a box-drawn comparison table to stderr with per-question rows (Direct / RLM / Savings) and a TOTALS footer covering tokens, cost, latency, and average RLM iterations:
With rlmx benchmark cost --output json, the same results are emitted as a structured JSON document to stdout (timestamp, mode, model, per-question runs[], and totals). Every benchmark — table or JSON — is also persisted to ~/.rlmx/benchmarks/benchmark-<mode>-<timestamp>.json and the saved path is printed to stderr.
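Since every run is persisted under ~/.rlmx/benchmarks/, a script can pick up the most recent result without parsing stderr. The sketch below is one way to do that under stated assumptions: `latest_benchmark` is a hypothetical helper, and it orders files by modification time because the format of the `<timestamp>` part of the filename is not specified here.

```python
from pathlib import Path
from typing import Optional


def latest_benchmark(directory: Path, mode: str) -> Optional[Path]:
    """Return the newest benchmark-<mode>-*.json file, or None.

    Assumption: file modification time is a good proxy for run order;
    the <timestamp> portion of the filename is not parsed.
    """
    candidates = list(directory.glob(f"benchmark-{mode}-*.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

For the cost mode this would be called as `latest_benchmark(Path.home() / ".rlmx" / "benchmarks", "cost")`, and the returned file can then be loaded with `json.loads(path.read_text())`.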
Set a configuration value. Values are type-coerced: "true" becomes boolean, numeric strings become numbers.
rlmx config set GEMINI_API_KEY sk-abc123
rlmx config set model.provider google
rlmx config set model.model gemini-3.1-flash-lite-preview
rlmx config set budget.max_cost 0.50
rlmx config set gemini.thinking_level medium
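The coercion rule can be pictured with a small stand-alone function. This is a sketch of the described behavior, not rlmx's implementation: `coerce_value` is a hypothetical name, handling "false" the same way as "true" is an assumption, and the int/float split is a Python detail that may not match how rlmx stores numbers.

```python
import re


def coerce_value(raw: str):
    """Coerce a command-line string the way the rule above describes.

    "true" becomes a boolean and numeric strings become numbers;
    everything else stays a string. Treating "false" as a boolean is
    an assumption, since only "true" is documented.
    """
    if raw == "true":
        return True
    if raw == "false":  # assumption: symmetric with "true"
        return False
    if re.fullmatch(r"-?\d+", raw):
        return int(raw)
    if re.fullmatch(r"-?\d+\.\d+", raw):
        return float(raw)
    return raw
```

Under this reading, the examples above would store `budget.max_cost` as the number 0.5, while `sk-abc123`, `google`, and `medium` remain strings.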