CLI Reference

rlmx "query"

Run an RLM query against a context.
rlmx "query" [options]
The default command. Loads context into a Python REPL, then iterates with the LLM until it produces a final answer.

Options

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --context <path> | string | | Path to context directory or file |
| --output <mode> | string | text | Output mode: text, json, or stream |
| --verbose | boolean | false | Show iteration progress on stderr |
| --max-iterations <n> | number | 30 | Maximum RLM iterations before forced termination |
| --timeout <ms> | number | 300000 | Timeout in milliseconds (5 minutes) |
| --stats | boolean | false | Emit JSON stats to stderr (or include in --output json) |
| --log <path> | string | | Write structured JSONL log to file |
| --tools <level> | string | core | Tool level: core, standard, or full |
| --max-cost <n> | number | | Maximum USD spend per run |
| --max-tokens <n> | number | | Maximum total tokens per run |
| --max-depth <n> | number | | Maximum recursive rlm_query depth |
| --ext <list> | string | .md | File extensions for context dirs (comma-separated) |
| --thinking <level> | string | | Thinking level: minimal, low, medium, high (Gemini only) |
| --cache | boolean | false | Enable CAG mode (full context cached in system prompt) |

Context loading

| Input | Behavior |
|-------|----------|
| --context dir/ | Recursively reads files matching --ext as list[{path, content}] |
| --context file.md | Reads as a single string |
| --context file.json | Parses JSON as dict or list |
| stdin pipe | Reads as a single string |
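
The loading rules above can be sketched in Python. load_context here is a hypothetical helper written for illustration, not RLMX's actual code; it reproduces the documented shapes (a list of {path, content} dicts for directories, a plain string or parsed JSON for files):

```python
import json
from pathlib import Path

def load_context(path, exts=(".md",)):
    """Mimic the documented context-loading rules (illustrative sketch)."""
    p = Path(path)
    if p.is_dir():
        # Directory: recurse, keep files matching --ext, return list[{path, content}]
        return [
            {"path": str(f.relative_to(p)), "content": f.read_text()}
            for f in sorted(p.rglob("*"))
            if f.suffix in exts
        ]
    if p.suffix == ".json":
        # JSON file: parsed as dict or list
        return json.loads(p.read_text())
    # Any other file (or stdin text): a single string
    return p.read_text()
```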

Examples

# Basic query with directory context
rlmx "How does IPC work?" --context ./docs/

# JSON output with stats
rlmx "Summarize this" --context paper.md --output json --stats

# Code analysis with extended file types
rlmx "Analyze code" --context ./src/ --tools full --ext .ts,.js

# Budget-limited query
rlmx "Quick question" --max-cost 0.10 --max-tokens 5000

# Piped input with logging
echo "data" | rlmx "Analyze this" --log run.jsonl

# CAG mode for repeated queries
rlmx "First question" --context ./docs/ --cache
rlmx "Follow-up question" --context ./docs/ --cache

# Gemini thinking mode
rlmx "Complex analysis" --context ./src/ --thinking high

Output formats

With --output json, the result is a single JSON object:
{
  "answer": "The answer to your query...",
  "references": ["docs/start/create-project.md", "docs/concept/ipc.md"],
  "usage": {
    "inputTokens": 12500,
    "outputTokens": 3200,
    "llmCalls": 5
  },
  "iterations": 3,
  "model": "google/gemini-3.1-flash-lite-preview"
}
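
Scripts can consume this object directly from stdout. A minimal sketch, assuming only the fields documented above:

```python
import json

def summarize(result_json: str) -> str:
    """Pull the key fields out of an rlmx --output json result."""
    r = json.loads(result_json)
    usage = r["usage"]
    total = usage["inputTokens"] + usage["outputTokens"]
    return (f'{r["iterations"]} iterations, {usage["llmCalls"]} LLM calls, '
            f'{total} tokens, refs: {", ".join(r.get("references", []))}')

# Typical wiring:  rlmx "query" --context ./docs/ --output json | your-script
```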

rlmx init

Scaffold a new rlmx.yaml config file in the current directory.
rlmx init [--dir <path>]
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --dir <path> | string | . (cwd) | Directory to scaffold in |
Creates a fully commented rlmx.yaml with sensible defaults. Idempotent: if the file already exists, creation is skipped.
You can skip rlmx init entirely. RLMX auto-scaffolds on first query if no config is found.

Example

# Scaffold in current directory
rlmx init

# Scaffold in a specific directory
rlmx init --dir ./my-project

rlmx cache

Pre-warm the provider cache or estimate context size and costs.
rlmx cache [options]
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --context <path> | string | | Path to context directory or file |
| --estimate | boolean | false | Show cost estimate without making LLM calls |
| --ext <list> | string | .md | File extensions for context dirs |

Examples

# Estimate cache costs (no LLM calls)
rlmx cache --context ./docs/ --estimate

# Warm the provider cache
rlmx cache --context ./docs/

# Estimate with custom extensions
rlmx cache --context ./src/ --ext .ts,.js --estimate

Estimate output

The --estimate flag shows token counts, provider limits, and cost projections without making any API calls:
Context: ./docs/ (23 files, 145KB)
Estimated tokens: 43,500
Provider limit: 1,000,000 (google)
Cache retention: long
Estimated first-query cost: $0.003
Estimated cached-query cost: $0.0003 (90% savings)
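
Since no API calls are made, these numbers are heuristic projections. A back-of-the-envelope version; the characters-per-token ratio and per-million-token price used here are illustrative assumptions, not RLMX's actual constants:

```python
def estimate_cost(context_bytes: int, price_per_mtok: float = 0.07,
                  cached_discount: float = 0.90, chars_per_token: float = 4.0):
    """Rough token count and cost projection for a context of a given size."""
    tokens = int(context_bytes / chars_per_token)           # ~4 chars/token heuristic
    first = tokens / 1_000_000 * price_per_mtok             # uncached first query
    cached = first * (1 - cached_discount)                  # e.g. 90% savings when cached
    return tokens, first, cached
```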

rlmx batch

Run bulk queries from a questions file against cached context.
rlmx batch <file> [options]
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --context <path> | string | | Path to context directory or file |
| --max-iterations <n> | number | 30 | Maximum RLM iterations per question |
| --max-cost <n> | number | | Maximum total USD spend across all questions |
| --parallel <n> | number | 1 | Concurrent questions |
| --batch-api | boolean | false | Use Gemini Batch API for 50% cost reduction |
| --output <mode> | string | | Output mode |
| --verbose | boolean | false | Show progress |
Cache is always enabled for batch mode. See Batch Mode for full details.
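
The questions file is read as one question per line. The handling of blank lines and # comments in this sketch is an assumption for illustration, not documented behavior:

```python
def read_questions(path):
    """Read one question per line; skip blanks and # comment lines (assumed format)."""
    with open(path) as f:
        return [ln.strip() for ln in f
                if ln.strip() and not ln.lstrip().startswith("#")]
```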

Example

# Run questions from a file
rlmx batch questions.txt --context ./docs/

# With budget cap
rlmx batch questions.txt --context ./src/ --max-cost 1.00

# Low iteration count for quick answers
rlmx batch questions.txt --context ./docs/ --max-iterations 3

rlmx config

Manage global settings stored at ~/.rlmx/settings.json.

rlmx config set

rlmx config set <key> <value>
Set a configuration value. Values are type-coerced: "true" becomes boolean, numeric strings become numbers.
rlmx config set GEMINI_API_KEY AIzaSyExampleKey123
rlmx config set model.provider google
rlmx config set model.model gemini-3.1-flash-lite-preview
rlmx config set budget.max_cost 0.50
rlmx config set gemini.thinking_level medium
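
The coercion rule above can be sketched as:

```python
def coerce(value: str):
    """Coerce a CLI string the way config set is documented to (sketch)."""
    if value.lower() in ("true", "false"):
        return value.lower() == "true"     # "true"/"false" become booleans
    try:
        return int(value)                  # integer strings become ints
    except ValueError:
        pass
    try:
        return float(value)                # decimal strings become floats
    except ValueError:
        return value                       # everything else stays a string
```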

rlmx config get

rlmx config get <key>
Retrieve a setting value. API keys are masked in output.
$ rlmx config get model.provider
google

rlmx config list

rlmx config list
Show all configured settings. Sensitive keys (containing API_KEY, SECRET, TOKEN) are masked.

rlmx config delete

rlmx config delete <key>
Remove a setting.

rlmx config path

rlmx config path
Print the settings file path (~/.rlmx/settings.json).

Common keys

| Key | Description | Example |
|-----|-------------|---------|
| GEMINI_API_KEY | Google Gemini API key | AIza... |
| ANTHROPIC_API_KEY | Anthropic API key | sk-ant-... |
| OPENAI_API_KEY | OpenAI API key | sk-... |
| GROQ_API_KEY | Groq API key | gsk_... |
| XAI_API_KEY | xAI API key | xai-... |
| OPENROUTER_API_KEY | OpenRouter API key | sk-or-... |
| model.provider | LLM provider | google, anthropic, openai |
| model.model | Model ID | gemini-3.1-flash-lite-preview |
| model.sub_call_model | Model for llm_query() sub-calls | gemini-3.1-flash-lite-preview |
| budget.max_cost | Default max USD per run | 0.50 |
| budget.max_tokens | Default max tokens per run | 100000 |
| budget.max_depth | Default max recursion depth | 3 |
| tools_level | Default tool level | core, standard, full |
| cache.retention | Cache TTL strategy | short, long |
| gemini.thinking_level | Default thinking level | minimal, low, medium, high |
| gemini.google_search | Enable web search battery | true |
| gemini.code_execution | Enable server-side Python | true |

Priority order

Settings are resolved in this order (highest priority first):
  1. CLI flags (--max-cost 0.10)
  2. Project rlmx.yaml
  3. Global ~/.rlmx/settings.json
  4. Hardcoded defaults
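
A minimal sketch of this first-hit-wins resolution (the layer dictionaries here are illustrative):

```python
def resolve(key, cli_flags, project_yaml, global_settings, defaults):
    """Return the value from the highest-priority layer that sets the key."""
    for layer in (cli_flags, project_yaml, global_settings, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)
```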

Tool levels

The --tools flag controls which functions are available in the REPL:

core (default)

Paper-faithful RLM functions:
| Function | Description |
|----------|-------------|
| context | Injected context variable |
| llm_query(prompt) | Single LLM completion |
| llm_query_batched(prompts) | Concurrent LLM calls |
| rlm_query(prompt) | Recursive child RLM session |
| rlm_query_batched(prompts) | Parallel child RLM sessions |
| SHOW_VARS() | List all REPL variables |
| FINAL(answer) | Terminate with answer string |
| FINAL_VAR(name) | Terminate with variable value |
Plus any custom tools defined in rlmx.yaml.
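
A generated REPL program might combine these, fanning out one sub-query per file and terminating with FINAL. The sketch below replaces the injected llm_query_batched and FINAL with local stubs so it runs standalone; in a real session RLMX provides them:

```python
# Local stubs standing in for the injected RLM functions (assumption: the
# real ones dispatch to the LLM and terminate the session, respectively).
def llm_query_batched(prompts):
    return [f"summary of: {p[:20]}" for p in prompts]

class _Done(Exception):
    def __init__(self, answer):
        self.answer = answer

def FINAL(answer):
    raise _Done(answer)

# Example context in the documented directory shape: list[{path, content}].
context = [{"path": "a.md", "content": "IPC uses channels."},
           {"path": "b.md", "content": "Messages are typed."}]

# Fan out one concurrent sub-query per file, then finish with a combined answer.
summaries = llm_query_batched([f["content"] for f in context])
try:
    FINAL("\n".join(summaries))
except _Done as d:
    answer = d.answer
```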

standard (core + batteries)

All core functions plus utility batteries:
| Function | Description |
|----------|-------------|
| describe_context() | Metadata overview of loaded context |
| preview_context() | Content sample |
| search_context(query) | Semantic search over context |
| grep_context(pattern) | Regex search over context |
| chunk_context() | Split context into chunks |
| chunk_text(text) | Split arbitrary text by size |
| map_query(fn, items) | Distributed LLM calls |
| reduce_query(fn, items) | Aggregation queries |
With the Google provider, standard also includes the Gemini batteries:

| Function | Description |
|----------|-------------|
| web_search(query) | Google web search |
| fetch_url(url) | Fetch and summarize URL content |
| generate_image(prompt) | Image generation |
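
map_query and reduce_query support a divide-and-conquer pattern over chunked context: one distributed call per chunk, then an aggregation pass. The shape, with plain Python stand-ins for the LLM calls (not RLMX's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def map_reduce(items, map_fn, reduce_fn, workers=4):
    """Generic shape behind map_query/reduce_query (illustrative stand-in)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, items))  # one call per item, concurrently
    return reduce_fn(mapped)                    # aggregate the partial results

# With LLM calls swapped for string operations:
chunks = ["alpha beta", "gamma", "delta epsilon zeta"]
result = map_reduce(chunks,
                    lambda c: len(c.split()),   # "map": score each chunk
                    sum)                        # "reduce": combine scores
```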

full (standard + environment)

All standard functions plus auto-injected information about available Python packages and versions in the REPL environment.

Global flags

These flags work with any command:
| Flag | Description |
|------|-------------|
| --help, -h | Show help message |
| --version, -v | Show version |