CLI backends (fallback runtime)
FluffBuzz can run local AI CLIs as a text-only fallback when API providers are down, rate-limited, or temporarily misbehaving. This is intentionally conservative:
- FluffBuzz tools are not injected directly, but backends with bundleMcp: true can receive gateway tools via a loopback MCP bridge.
- JSONL streaming for CLIs that support it.
- Sessions are supported (so follow-up turns stay coherent).
- Images can be passed through if the CLI accepts image paths.
Beginner-friendly quick start
You can use Codex CLI without any config (the bundled OpenAI plugin registers a default backend). Any overrides you do need go under agents.defaults.cliBackends.
Using it as a fallback
Add a CLI backend to your fallback list so it only runs when primary models fail. Keep in mind:
- If you use agents.defaults.models (allowlist), you must include your CLI backend models there too.
- If the primary provider fails (auth, rate limits, timeouts), FluffBuzz will try the CLI backend next.
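The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not FluffBuzz's implementation: run_with_fallback, ProviderError, and the run_model callable are hypothetical names, and the allowlist handling is an assumption based on the note above.

```python
from typing import Callable, Iterable, Optional, Set, Tuple


class ProviderError(Exception):
    """Hypothetical error for auth failures, rate limits, or timeouts."""


def run_with_fallback(
    prompt: str,
    candidates: Iterable[str],
    run_model: Callable[[str, str], str],
    allowlist: Optional[Set[str]] = None,
) -> Tuple[str, str]:
    """Try each model ref in order; return (model_ref, reply) from the first success."""
    last_error: Optional[Exception] = None
    for model_ref in candidates:
        # With an allowlist, refs that are not listed are skipped entirely,
        # which is why CLI backend models must be included there too.
        if allowlist is not None and model_ref not in allowlist:
            continue
        try:
            return model_ref, run_model(model_ref, prompt)
        except ProviderError as err:
            last_error = err  # remember the failure and try the next candidate
    raise RuntimeError("no model in the fallback chain succeeded") from last_error
```

The CLI backend sits last in the candidate list, so it is only reached after every primary model has raised.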
Configuration overview
All CLI backends live under agents.defaults.cliBackends, keyed by provider id (e.g. codex-cli, my-cli).
The provider id becomes the left side of your model ref (for example codex-cli/...).
Example configuration
How it works
- Selects a backend based on the provider prefix (codex-cli/...).
- Builds a system prompt using the same FluffBuzz prompt + workspace context.
- Executes the CLI with a session id (if supported) so history stays consistent. The bundled claude-cli backend keeps a Claude stdio process alive per FluffBuzz session and sends follow-up turns over stream-json stdin.
- Parses output (JSON or plain text) and returns the final text.
- Persists session ids per backend, so follow-ups reuse the same CLI session.
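The first step, selecting a backend from the provider prefix, might look like the sketch below. The function names and the shape of the backends mapping are hypothetical; splitting at the first slash is an assumption about how model refs are parsed.

```python
from typing import Dict, Tuple


def split_model_ref(model_ref: str) -> Tuple[str, str]:
    """Split "provider/model" into (provider_id, model) at the first slash."""
    provider_id, sep, model = model_ref.partition("/")
    if not sep or not provider_id or not model:
        raise ValueError(f"expected 'provider/model', got {model_ref!r}")
    return provider_id, model


def select_backend(model_ref: str, backends: Dict[str, dict]) -> Tuple[dict, str]:
    """Return (backend_config, model) for the backend keyed by the provider prefix."""
    provider_id, model = split_model_ref(model_ref)
    if provider_id not in backends:
        raise KeyError(f"no CLI backend registered for provider {provider_id!r}")
    return backends[provider_id], model
```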
The bundled Anthropic claude-cli backend is supported again. Anthropic staff told us FluffBuzz-style Claude CLI usage is allowed again, so FluffBuzz treats claude -p usage as sanctioned for this integration unless Anthropic publishes a new policy.

The bundled codex-cli backend passes FluffBuzz’s system prompt through Codex’s model_instructions_file config override (-c model_instructions_file="..."). Codex does not expose a Claude-style --append-system-prompt flag, so FluffBuzz writes the assembled prompt to a temporary file for each fresh Codex CLI session.
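As a rough sketch of the temporary-file approach, the helper below writes a prompt to disk and builds the matching -c override. The function name, file prefix, and suffix are hypothetical, and cleanup is left to the caller; FluffBuzz's actual file handling may differ.

```python
import os
import tempfile
from typing import List


def codex_prompt_args(system_prompt: str) -> List[str]:
    """Write the prompt to a temp file and return the Codex config override args."""
    # delete=False so the file survives until the CLI process has read it;
    # the caller is responsible for removing it afterwards.
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", suffix=".md", prefix="fluffbuzz-prompt-", delete=False
    ) as handle:
        handle.write(system_prompt)
        path = handle.name
    # Quoting assumes the path contains no double quotes or backslashes.
    return ["-c", f'model_instructions_file="{path}"']
```

The returned pair would be appended to the Codex argument list for a fresh session only, since the text above ties the file to fresh sessions.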
The bundled Anthropic claude-cli backend receives the FluffBuzz skills snapshot
two ways: the compact FluffBuzz skills catalog in the appended system prompt, and
a temporary Claude Code plugin passed with --plugin-dir. The plugin contains
only the eligible skills for that agent/session, so Claude Code’s native skill
resolver sees the same filtered set that FluffBuzz would otherwise advertise in
the prompt. Skill env/API key overrides are still applied by FluffBuzz to the
child process environment for the run.
Claude CLI also has its own noninteractive permission mode. FluffBuzz maps that
to the existing exec policy instead of adding Claude-specific config: when the
effective requested exec policy is YOLO (tools.exec.security: "full" and
tools.exec.ask: "off"), FluffBuzz adds --permission-mode bypassPermissions.
Per-agent agents.list[].tools.exec settings override global tools.exec for
that agent. To force a different Claude mode, set explicit raw backend args
such as --permission-mode default or --permission-mode acceptEdits under
agents.defaults.cliBackends.claude-cli.args and matching resumeArgs.
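The mapping from exec policy to Claude's permission flag can be sketched as below. The function names are hypothetical, the key-by-key merge of per-agent over global settings is an assumption, and so is the rule that an explicit --permission-mode in the raw args suppresses the automatic one.

```python
from typing import List, Optional


def effective_exec_policy(global_exec: dict, agent_exec: Optional[dict] = None) -> dict:
    """Per-agent exec settings override the global ones key by key (assumed merge)."""
    return {**global_exec, **(agent_exec or {})}


def claude_permission_args(policy: dict, raw_args: List[str]) -> List[str]:
    """Return extra args for the Claude CLI given the effective exec policy."""
    # Assumption: an explicit --permission-mode in the raw backend args wins.
    if "--permission-mode" in raw_args:
        return []
    # YOLO policy: security "full" with ask "off".
    if policy.get("security") == "full" and policy.get("ask") == "off":
        return ["--permission-mode", "bypassPermissions"]
    return []
```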
Before FluffBuzz can use the bundled claude-cli backend, Claude Code itself must already be logged in on the same host. Set agents.defaults.cliBackends.claude-cli.command only when the claude binary is not already on PATH.
Sessions
- If the CLI supports sessions, set sessionArg (e.g. --session-id) or sessionArgs (placeholder {sessionId}) when the ID needs to be inserted into multiple flags.
- If the CLI uses a resume subcommand with different flags, set resumeArgs (replaces args when resuming) and optionally resumeOutput (for non-JSON resumes).
- sessionMode:
  - always: always send a session id (new UUID if none stored).
  - existing: only send a session id if one was stored before.
  - none: never send a session id.
- claude-cli defaults to liveSession: "claude-stdio", output: "jsonl", and input: "stdin" so follow-up turns reuse the live Claude process while it is active. Warm stdio is the default now, including for custom configs that omit transport fields. If the Gateway restarts or the idle process exits, FluffBuzz resumes from the stored Claude session id. Stored session ids are verified against an existing readable project transcript before resume, so phantom bindings are cleared with reason=transcript-missing instead of silently starting a fresh Claude CLI session under --resume.
- Stored CLI sessions are provider-owned continuity. The implicit daily session reset does not cut them; /reset and explicit session.reset policies still do.
- serialize: true keeps same-lane runs ordered.
- Most CLIs serialize on one provider lane.
- FluffBuzz drops stored CLI session reuse when the selected auth identity changes, including a changed auth profile id, static API key, static token, or OAuth account identity when the CLI exposes one. OAuth access and refresh token rotation does not cut the stored CLI session. If a CLI does not expose a stable OAuth account id, FluffBuzz lets that CLI enforce resume permissions.
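The three sessionMode values and the {sessionId} placeholder can be illustrated with a short sketch. The function names are hypothetical; only the mode names, the placeholder, and the "new UUID if none stored" rule come from the list above.

```python
import uuid
from typing import List, Optional


def resolve_session_id(session_mode: str, stored_id: Optional[str]) -> Optional[str]:
    """Decide which session id, if any, to send to the CLI."""
    if session_mode == "always":
        # Reuse the stored id, or mint a new UUID when nothing is stored.
        return stored_id or str(uuid.uuid4())
    if session_mode == "existing":
        return stored_id  # None when no id was stored before
    if session_mode == "none":
        return None
    raise ValueError(f"unknown sessionMode: {session_mode!r}")


def fill_session_args(templates: List[str], session_id: str) -> List[str]:
    """Substitute the {sessionId} placeholder in every argument template."""
    return [arg.replace("{sessionId}", session_id) for arg in templates]
```

For example, fill_session_args(["exec", "resume", "{sessionId}"], "abc") yields ["exec", "resume", "abc"].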
Images (pass-through)
If your CLI accepts image paths, set imageArg. When imageArg is set, those paths are passed as CLI args. If imageArg is missing, FluffBuzz appends the file paths to the prompt (path injection), which is enough for CLIs that auto-load local files from plain paths.
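The two image paths through the system can be sketched as follows. The function name is hypothetical. Repeating the flag once per image and joining injected paths with newlines are both assumptions; the text above only says that paths become CLI args or are appended to the prompt.

```python
from typing import List, Optional, Tuple


def apply_images(
    args: List[str],
    prompt: str,
    image_paths: List[str],
    image_arg: Optional[str] = None,
) -> Tuple[List[str], str]:
    """Return (args, prompt) with image paths attached one way or the other."""
    if not image_paths:
        return list(args), prompt
    if image_arg:
        extra: List[str] = []
        for path in image_paths:
            extra += [image_arg, path]  # assumed: one flag/value pair per image
        return list(args) + extra, prompt
    # Path injection: no imageArg, so the paths ride along in the prompt text.
    return list(args), prompt + "\n\n" + "\n".join(image_paths)
```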
Inputs / outputs
- output: "json" (default) tries to parse JSON and extract text + session id.
- For Gemini CLI JSON output, FluffBuzz reads reply text from response and usage from stats when usage is missing or empty.
- output: "jsonl" parses JSONL streams (for example Codex CLI --json) and extracts the final agent message plus session identifiers when present.
- output: "text" treats stdout as the final response.
- input: "arg" (default) passes the prompt as the last CLI arg.
- input: "stdin" sends the prompt via stdin.
- If the prompt is very long and maxPromptArgChars is set, stdin is used.
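The choice between arg and stdin input can be sketched as below. The function name is hypothetical, and treating "very long" as "longer than maxPromptArgChars" is an assumption about the threshold.

```python
from typing import List, Optional, Tuple


def build_invocation(
    base_args: List[str],
    prompt: str,
    input_mode: str = "arg",
    max_prompt_arg_chars: Optional[int] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return (args, stdin_text); stdin_text is None when the prompt is an arg."""
    use_stdin = input_mode == "stdin" or (
        max_prompt_arg_chars is not None and len(prompt) > max_prompt_arg_chars
    )
    if use_stdin:
        return list(base_args), prompt
    # Default: the prompt becomes the last CLI argument.
    return [*base_args, prompt], None
```

Falling back to stdin for long prompts avoids operating-system limits on argument length.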
Defaults (plugin-owned)
The bundled OpenAI plugin also registers a default for codex-cli:
- command: "codex"
- args: ["exec","--json","--color","never","--sandbox","workspace-write","--skip-git-repo-check"]
- resumeArgs: ["exec","resume","{sessionId}","-c","sandbox_mode=\"workspace-write\"","--skip-git-repo-check"]
- output: "jsonl"
- resumeOutput: "text"
- modelArg: "--model"
- imageArg: "--image"
- sessionMode: "existing"
There is also a bundled default for google-gemini-cli:
- command: "gemini"
- args: ["--output-format", "json", "--prompt", "{prompt}"]
- resumeArgs: ["--resume", "{sessionId}", "--output-format", "json", "--prompt", "{prompt}"]
- imageArg: "@"
- imagePathScope: "workspace"
- modelArg: "--model"
- sessionMode: "existing"
- sessionIdFields: ["session_id", "sessionId"]
This backend requires gemini on PATH (brew install gemini-cli or npm install -g @google/gemini-cli).
Gemini CLI JSON notes:
- Reply text is read from the JSON response field.
- Usage falls back to stats when usage is absent or empty.
- stats.cached is normalized into FluffBuzz cacheRead.
- If stats.input is missing, FluffBuzz derives input tokens from stats.input_tokens - stats.cached.
Override these defaults only if needed (for example, an absolute command path).
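The Gemini CLI JSON notes above can be sketched as a small parser. The function name and the "input" key in the returned usage are hypothetical; response, usage, stats, stats.cached, stats.input, stats.input_tokens, and cacheRead come from the notes. Treating a missing stats.cached as 0 is an assumption.

```python
import json


def parse_gemini_json(stdout: str) -> dict:
    """Extract reply text and normalized usage from Gemini CLI JSON output."""
    payload = json.loads(stdout)
    usage = payload.get("usage") or {}
    if not usage:
        # usage is absent or empty, so fall back to stats.
        stats = payload.get("stats") or {}
        cached = stats.get("cached", 0)
        input_tokens = stats.get("input")
        if input_tokens is None and "input_tokens" in stats:
            # Derive uncached input tokens from the total minus the cached part.
            input_tokens = stats["input_tokens"] - cached
        usage = {"input": input_tokens, "cacheRead": cached}
    return {"text": payload.get("response", ""), "usage": usage}
```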
Plugin-owned defaults
CLI backend defaults are now part of the plugin surface:
- Plugins register them with api.registerCliBackend(...).
- The backend id becomes the provider prefix in model refs.
- User config in agents.defaults.cliBackends.<id> still overrides the plugin default.
- Backend-specific config cleanup stays plugin-owned through the optional normalizeConfig hook.
In a backend's text rewrites (distinct from the input and output transport modes above), input rewrites the system prompt and user prompt passed to the CLI, and output rewrites streamed assistant deltas and parsed final text before FluffBuzz handles its own control markers and channel delivery.
For CLIs that emit Claude Code stream-json compatible JSONL, set
jsonlDialect: "claude-stream-json" on that backend’s config.
Bundle MCP overlays
CLI backends do not receive FluffBuzz tool calls directly, but a backend can opt into a generated MCP config overlay with bundleMcp: true.
Current bundled behavior:
- claude-cli: generated strict MCP config file
- codex-cli: inline config overrides for mcp_servers; the generated FluffBuzz loopback server is marked with Codex’s per-server tool approval mode so MCP calls cannot stall on local approval prompts
- google-gemini-cli: generated Gemini system settings file
When bundleMcp is enabled, FluffBuzz:
- spawns a loopback HTTP MCP server that exposes gateway tools to the CLI process
- authenticates the bridge with a per-session token (FLUFFBUZZ_MCP_TOKEN)
- scopes tool access to the current session, account, and channel context
- loads enabled bundle-MCP servers for the current workspace
- merges them with any existing backend MCP config/settings shape
- rewrites the launch config using the backend-owned integration mode from the owning extension
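The per-session token step can be illustrated with a small sketch. How FluffBuzz actually generates and checks the token is not documented here, so the function names, the token length, and the constant-time comparison are all assumptions; only the FLUFFBUZZ_MCP_TOKEN variable name comes from the list above.

```python
import hmac
import secrets


def new_bridge_env(base_env: dict) -> dict:
    """Return a copy of the child environment carrying a fresh bridge token."""
    env = dict(base_env)
    # A new random token per session, so one session cannot reuse another's bridge.
    env["FLUFFBUZZ_MCP_TOKEN"] = secrets.token_urlsafe(32)
    return env


def token_matches(expected: str, presented: str) -> bool:
    """Compare tokens in constant time to avoid leaking how many bytes matched."""
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))
```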
Limitations
- No direct FluffBuzz tool calls. FluffBuzz does not inject tool calls into the CLI backend protocol. Backends only see gateway tools when they opt into bundleMcp: true.
- Streaming is backend-specific. Some backends stream JSONL; others buffer until exit.
- Structured outputs depend on the CLI’s JSON format.
- Codex CLI sessions resume via text output (no JSONL), which is less structured than the initial --json run. FluffBuzz sessions still work normally.
Troubleshooting
- CLI not found: set command to a full path.
- Wrong model name: use modelAliases to map provider/model → CLI model.
- No session continuity: ensure sessionArg is set and sessionMode is not none (Codex CLI currently cannot resume with JSON output).
- Images ignored: set imageArg (and verify CLI supports file paths).
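For the wrong-model-name case, alias resolution might work along these lines. The function name is hypothetical, and the lookup order (full provider/model ref first, then the bare model name, then the name unchanged) is an assumption about how modelAliases is applied.

```python
from typing import Dict


def resolve_cli_model(model_ref: str, model_aliases: Dict[str, str]) -> str:
    """Map a provider/model ref to the model name the CLI expects."""
    if model_ref in model_aliases:
        return model_aliases[model_ref]
    # Assumed fallback: try the part after the provider prefix, else pass it through.
    _, _, model = model_ref.partition("/")
    bare = model or model_ref
    return model_aliases.get(bare, bare)
```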