
Agent Loop

The agent loop is the central execution primitive in every AI coding tool. It is the cycle that turns a user message into a sequence of actions: send a prompt, observe the model’s response, execute any requested tools, feed results back, and repeat until the model signals completion or a limit is reached. Getting this loop right determines whether the tool feels responsive, handles errors gracefully, and avoids runaway behavior.

This page traces the full loop implementation in Aider, Codex, OpenCode, and Claude Code, then proposes the OpenOxide design.


Pinned commit: b9050e1d

Aider’s loop lives entirely in aider/coders/base_coder.py (2485 lines). The architecture is a three-tier nesting: an outer input loop, a middle reflection loop, and an inner LLM call.

File: aider/coders/base_coder.py:876

The outermost loop is a simple REPL. It calls get_input() to read a user message via the prompt_toolkit-based prompt, then delegates to run_one(). This loops indefinitely until EOFError (Ctrl+D) or an explicit /exit command.

File: aider/coders/base_coder.py:924

run_one() handles one user message and its reflection cycles. Structure:

```python
def run_one(self, user_message, preproc):
    self.init_before_message()
    message = self.preproc_user_input(user_message) if preproc else user_message
    while message:
        self.reflected_message = None
        list(self.send_message(message))  # consume the generator
        if not self.reflected_message:
            break
        if self.num_reflections >= self.max_reflections:  # max_reflections = 3
            break
        self.num_reflections += 1
        message = self.reflected_message  # loop with reflection feedback
```

The preproc_user_input() step at line 912 handles slash commands (/add, /drop, /model, etc.) and URL extraction before the message enters the LLM path. If the input is a command, it is dispatched via commands.run() and the result (if any) becomes the message.

File: aider/coders/base_coder.py:1419

This is the core agent cycle. Each call to send_message() performs one full prompt-act-observe iteration:

Step 1 — Add user message to context (line 1425): The user message is appended to self.cur_messages as a {"role": "user", "content": inp} dict.

Step 2 — Format and assemble messages (line 1429): format_chat_chunks() (line 1226) builds a ChatChunks dataclass (aider/coders/chat_chunks.py) with eight segments assembled in order:

  1. system — system prompt with edit format instructions
  2. examples — few-shot examples for the edit format
  3. readonly_files — content of read-only files
  4. repo — repo map output (tree-sitter tags + PageRank)
  5. done — previous turns (summarized if over token budget)
  6. chat_files — current files in the editing context
  7. cur — the current turn’s messages
  8. reminder — optional system reminder

chunks.all_messages() concatenates all segments into a flat message list.
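The eight-segment assembly can be sketched as a minimal dataclass. This is a hypothetical illustration modeled on aider's ChatChunks, not its actual code; only the field names and ordering come from the list above.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the eight-segment prompt assembly; field names
# mirror the segments listed above, in their fixed order.
@dataclass
class ChatChunks:
    system: list = field(default_factory=list)
    examples: list = field(default_factory=list)
    readonly_files: list = field(default_factory=list)
    repo: list = field(default_factory=list)
    done: list = field(default_factory=list)
    chat_files: list = field(default_factory=list)
    cur: list = field(default_factory=list)
    reminder: list = field(default_factory=list)

    def all_messages(self):
        # Concatenate segments in the fixed order shown above
        return (self.system + self.examples + self.readonly_files
                + self.repo + self.done + self.chat_files
                + self.cur + self.reminder)

chunks = ChatChunks(
    system=[{"role": "system", "content": "You edit code."}],
    cur=[{"role": "user", "content": "fix the bug"}],
)
assert [m["role"] for m in chunks.all_messages()] == ["system", "user"]
```

Keeping each segment separate until the final concatenation is what lets aider summarize or drop individual segments (e.g. `done`) without touching the others.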

Step 3 — Token validation (line 1431): check_tokens(messages) verifies the assembled prompt fits within the model’s context window.

Step 4 — LLM call with retry (lines 1449-1487): A while True loop calls self.send(messages, functions=self.functions) with exponential backoff on transient errors:

```python
retry_delay = 0.125
while True:
    try:
        yield from self.send(messages, functions=self.functions)
        break
    except ContextWindowExceededError:
        exhausted = True
        break
    except retryable_error:
        retry_delay *= 2
        if retry_delay > RETRY_TIMEOUT:
            break
        time.sleep(retry_delay)
        continue
```

The send() method (line 1783) calls model.send_completion() which delegates to litellm.completion(). If self.stream is True, it yields from show_send_output_stream() (line 1900), which iterates over the streaming completion and accumulates self.partial_response_content token by token, calling self.live_incremental_response() for real-time markdown rendering.

Step 5 — Extract and apply edits (line 1585): apply_updates() (line 2296) calls the subclass-specific get_edits() to parse the response format, then apply_edits() to write changes to disk. If parsing fails with a ValueError, the error message is assigned to self.reflected_message, triggering a reflection cycle.

Step 6 — Auto-commit (line 1589): If git integration is enabled, auto_commit(edited) creates a commit with a model-generated message.

Step 7 — Lint feedback (lines 1599-1607): If self.auto_lint is enabled and files were edited, lint_edited() runs the linter. If errors are found, they are assigned to self.reflected_message and the method returns — triggering another iteration of the run_one() while loop.

Step 8 — Test feedback (lines 1616-1618): If self.auto_test is enabled, test results are captured and can trigger reflection.

The Coder base class defines the loop; subclasses override get_edits() and apply_edits() to handle different edit formats:

| Subclass | edit_format | Strategy |
|---|---|---|
| EditBlockCoder | diff | SEARCH/REPLACE blocks with exact text matching |
| UnifiedDiffCoder | udiff | Unified diff hunks with flexible search-and-replace |
| WholeFileCoder | whole | Complete file content between fence markers |
| AskCoder | ask | No edits, question-only mode |
| ArchitectCoder | architect | Two-model flow (see below) |

Subclass selection happens in Coder.create() (line 125), which takes edit_format as a parameter and returns the appropriate subclass instance. Mid-session format switching is supported via the from_coder parameter, which transfers chat history.
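The registry-plus-factory shape can be sketched as follows. This is a hypothetical illustration in the spirit of Coder.create(), not aider's real implementation; the registry mechanism and the `history` attribute are invented for the example.

```python
# Hypothetical sketch of edit-format dispatch: subclasses register
# themselves under an edit_format key, and create() looks them up.
class Coder:
    registry = {}

    def __init_subclass__(cls, edit_format=None, **kw):
        super().__init_subclass__(**kw)
        if edit_format:
            Coder.registry[edit_format] = cls

    @classmethod
    def create(cls, edit_format="diff", from_coder=None):
        coder = cls.registry[edit_format]()
        if from_coder is not None:
            # mid-session format switch: carry chat history across
            coder.history = list(getattr(from_coder, "history", []))
        return coder

class EditBlockCoder(Coder, edit_format="diff"): ...
class WholeFileCoder(Coder, edit_format="whole"): ...

assert isinstance(Coder.create("whole"), WholeFileCoder)
```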

File: aider/coders/architect_coder.py

ArchitectCoder extends AskCoder (no edits). When its reply_completed() hook fires:

  1. The architect model’s plan text is captured
  2. User confirms with “Edit the files?”
  3. An EditorCoder is spawned with the editor_model and editor_edit_format
  4. The editor runs in a fresh conversation: editor_coder.run(with_message=content, preproc=False)
  5. Results are merged back: self.move_back_cur_messages("I made those changes to the files.")

This is Aider’s only multi-model agent pattern. The editor gets a clean context with just the plan text — no accumulated chat history.

```python
self.cur_messages = []               # current turn
self.done_messages = []              # previous turns (may be summarized)
self.reflected_message = None        # set to trigger reflection
self.num_reflections = 0             # counter, max 3
self.max_reflections = 3             # hard limit
self.partial_response_content = ""   # accumulated LLM output
```

Pinned commit: 4ab44e2c5

Codex’s loop is fully async, built on tokio, and uses channels for all communication between the TUI and the core engine. The architecture separates submission handling from turn execution.

File: codex-rs/core/src/codex.rs:287

spawn() creates the session infrastructure:

  1. Creates a bounded channel (capacity = 64) for Submission inputs
  2. Creates an unbounded channel for Event outputs
  3. Initializes a watch::channel(AgentStatus::PendingInit) for status tracking
  4. Builds a Session::new() with configuration
  5. Spawns the background submission_loop() task — this is the core event dispatcher
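The channel plumbing above can be approximated with asyncio queues in place of tokio channels. This is a hypothetical sketch; the op names and the echoed events are illustrative, not Codex's actual types.

```python
import asyncio

# Hypothetical sketch of Codex-style session plumbing: a bounded input
# queue, an unbounded output queue, and a background dispatcher task.
def spawn():
    submissions = asyncio.Queue(maxsize=64)   # bounded input channel
    events = asyncio.Queue()                  # unbounded output channel

    async def submission_loop():
        while True:
            op = await submissions.get()
            if op["type"] == "shutdown":
                await events.put({"type": "session_closed"})
                return
            # dispatch to a handler; here we just acknowledge the op
            await events.put({"type": "handled", "op": op["type"]})

    task = asyncio.ensure_future(submission_loop())
    return submissions, events, task

async def demo():
    subs, evs, task = spawn()
    await subs.put({"type": "user_input"})
    await subs.put({"type": "shutdown"})
    await task
    out = []
    while not evs.empty():
        out.append((await evs.get())["type"])
    return out

assert asyncio.run(demo()) == ["handled", "session_closed"]
```

The bounded submission queue gives natural backpressure: a flood of inputs blocks the producer rather than growing memory, while events flow out unbounded so the UI never stalls the core.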

File: codex-rs/core/src/codex.rs:3196

The submission loop runs indefinitely, receiving Op submissions and dispatching to handlers:

| Op Type | Handler | Purpose |
|---|---|---|
| UserInput / UserTurn | user_input_or_turn() | New user message, spawns turn task |
| ExecApproval | exec_approval() | Approval decision for command execution |
| PatchApproval | patch_approval() | Approval decision for file patches |
| Interrupt | interrupt() | Abort current task via CancellationToken |
| Shutdown | shutdown() | Graceful session termination |
| Compact | compact() | Manual context compaction |

Handler: handlers::user_input_or_turn() (line 3433)

  1. Parse Op::UserInput or Op::UserTurn to extract items and settings updates
  2. Create TurnContext with per-turn config (model, approval policy, sandbox policy, collaboration mode)
  3. Attempt sess.steer_input(items, None) to inject into an active task (if one exists)
  4. If no active task: call sess.spawn_task(current_context, items, RegularTask) to start a new turn
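The steer-or-spawn decision in steps 3 and 4 can be sketched as follows. This is a hypothetical illustration; the data structures are invented stand-ins for Codex's Session and task types.

```python
# Hypothetical sketch of steer-or-spawn: inject input into an active
# task if one exists, otherwise start a new turn.
class Session:
    def __init__(self):
        self.active_task = None

    def steer_input(self, items):
        if self.active_task is None:
            return False  # nothing running; caller must spawn a turn
        self.active_task.setdefault("pending_input", []).extend(items)
        return True

    def spawn_task(self, items):
        self.active_task = {"items": list(items), "pending_input": []}

def user_input_or_turn(sess, items):
    if not sess.steer_input(items):
        sess.spawn_task(items)

sess = Session()
user_input_or_turn(sess, ["fix the tests"])     # no active task -> spawn
user_input_or_turn(sess, ["also update docs"])  # active task -> steer
assert sess.active_task["pending_input"] == ["also update docs"]
```

The key property is that mid-turn input redirects the running task instead of queueing behind it, which is what makes steering feel immediate to the user.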

Task Spawning: Session::spawn_task() (line 116 in codex-rs/core/src/tasks/mod.rs)

  • Aborts all previous tasks
  • Creates a CancellationToken for the new task
  • Spawns a background tokio task that calls task.run()
  • On completion, emits TurnComplete event and flushes the rollout

File: codex-rs/core/src/codex.rs:4318

This is where the prompt-act-observe cycle lives. Structure:

Phase 1 — Setup (lines 4325-4462):

  1. Emit TurnStarted event
  2. Run pre-sampling compaction if token budget is tight
  3. Load skills for the current working directory
  4. Collect available tools via ToolsConfig
  5. Record user prompt to history
  6. Start ghost snapshot task for undo support

Phase 2 — Main Loop (lines 4476-4657):

```rust
loop {
    // Check for pending user input (user typed while model was running)
    let pending = sess.get_pending_input().await;
    // Build sampling request from conversation history
    let input: Vec<ResponseItem> = sess.clone_history().await.for_prompt(...);
    // Call API with retry logic
    match run_sampling_request(...).await {
        Ok(result) => {
            let SamplingRequestResult { needs_follow_up, last_agent_message } = result;
            // Check token limits
            let total_tokens = sess.get_total_token_usage().await;
            let limit_reached = total_tokens >= auto_compact_limit;
            if limit_reached && needs_follow_up {
                run_auto_compact(&sess, &turn_context).await?;
                continue; // compact and retry
            }
            if !needs_follow_up {
                break; // model is done
            }
            // else: model wants to continue, loop again
        }
        Err(CodexErr::TurnAborted) => break,
        Err(e) => { send_error_event; break; }
    }
}
```

The loop continues as long as needs_follow_up is true — meaning the model emitted tool calls that need results fed back. If the token limit is approached mid-turn, auto-compaction runs and the loop continues, enabling arbitrarily long multi-step turns.

Function: run_sampling_request() (line 4897)

Wraps try_run_sampling_request() in a retry loop:

  • Retryable errors (stream failures, timeouts, connection errors): exponential backoff with transport fallback (WebSocket to HTTPS)
  • Non-retryable errors (context window exceeded, quota, invalid request): fail immediately
  • Max retries per provider via provider.stream_max_retries()

Function: try_run_sampling_request() (line 5505)

Streams the API response and processes events in real-time:

| SSE Event | Action |
|---|---|
| ResponseEvent::Created | No-op |
| ResponseEvent::OutputItemAdded(item) | Emit TurnItemStarted |
| ResponseEvent::OutputTextDelta(delta) | Emit AgentMessageContentDelta |
| ResponseEvent::OutputItemDone(item) | Extract tool call, dispatch via ToolCallRuntime |
| ResponseEvent::Completed | Update token usage, break stream loop |
| ResponseEvent::ReasoningSummaryDelta | Emit reasoning delta event |
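The event-dispatch shape of this stream loop can be sketched as follows. This is a hypothetical illustration: the event names loosely follow the table above, but the dict-based events and handlers are invented for the example.

```python
# Hypothetical sketch of streaming-event dispatch: accumulate text
# deltas, collect finished tool calls, and stop on completion.
def process_stream(events):
    text, tool_calls, usage = [], [], None
    for ev in events:
        kind = ev["type"]
        if kind == "output_text_delta":
            text.append(ev["delta"])        # incremental assistant text
        elif kind == "output_item_done" and ev["item"].get("tool"):
            tool_calls.append(ev["item"])   # would dispatch to tool runtime
        elif kind == "completed":
            usage = ev["usage"]             # update token accounting
            break                           # end of this sampling request
    return "".join(text), tool_calls, usage

text, calls, usage = process_stream([
    {"type": "output_text_delta", "delta": "Hel"},
    {"type": "output_text_delta", "delta": "lo"},
    {"type": "completed", "usage": {"total": 42}},
])
assert text == "Hello" and calls == [] and usage == {"total": 42}
```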

File: codex-rs/core/src/tools/parallel.rs:49

ToolCallRuntime::handle_tool_call() spawns a tokio task per tool call:

```rust
tokio::select! {
    _ = cancellation_token.cancelled() => {
        aborted_response(elapsed)
    }
    res = router.dispatch_tool_call(call, source) => {
        // read-lock if parallel, write-lock if serial
        res
    }
}
```

Tool calls are collected in a FuturesOrdered for parallel execution. After the stream completes, drain_in_flight() waits for all pending tool results and collects them as ResponseInputItem::FunctionCallOutput items, which feed into the next API request.
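The FuturesOrdered behavior, concurrent execution with results collected in submission order, can be approximated with asyncio.gather, which also preserves argument order. A hypothetical sketch with invented tool names:

```python
import asyncio

# Hypothetical sketch of FuturesOrdered-style collection: tool calls run
# concurrently, but results come back in the order they were submitted.
async def run_tool(name, delay):
    await asyncio.sleep(delay)  # stand-in for actual tool execution
    return f"{name}-result"

async def drain_in_flight(calls):
    # asyncio.gather preserves argument order, like FuturesOrdered
    tasks = [asyncio.create_task(run_tool(n, d)) for n, d in calls]
    return await asyncio.gather(*tasks)

# "grep" finishes first, but "read" still comes back first in the list
results = asyncio.run(drain_in_flight([("read", 0.02), ("grep", 0.01)]))
assert results == ["read-result", "grep-result"]
```

Order preservation matters because the collected FunctionCallOutput items must pair up with the tool calls in the conversation history when they are fed into the next API request.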

When a tool requires approval (determined by ExecPolicyManager):

  1. Tool handler emits EventMsg::ExecApprovalRequest with an approval_id
  2. TUI renders the approval overlay
  3. User responds with Op::ExecApproval { approval_id, decision }
  4. Handler notifies the waiting tool via a channel
  5. Tool proceeds or aborts based on decision

File: codex-rs/core/src/error.rs:195

CodexErr::is_retryable() categorizes errors:

  • Non-retryable: TurnAborted, ContextWindowExceeded, UsageLimitReached, InvalidRequest, Sandbox, ServerOverloaded
  • Retryable: Stream, Timeout, UnexpectedStatus, ResponseStreamFailed, ConnectionFailed, InternalServerError, Io, Json

```rust
pub(crate) struct TurnContext {
    pub sub_id: String,
    pub config: Arc<Config>,
    pub model_info: ModelInfo,
    pub approval_policy: AskForApproval,
    pub sandbox_policy: SandboxPolicy,
    pub collaboration_mode: CollaborationMode,
    pub tools_config: ToolsConfig,
    pub final_output_json_schema: Option<Value>,
    pub dynamic_tools: Vec<DynamicToolSpec>,
    // ...
}
```

Pinned commit: 7ed449974

OpenCode’s loop is TypeScript async/await built on the Vercel AI SDK. The architecture separates prompt orchestration (SessionPrompt) from stream processing (SessionProcessor).

File: packages/opencode/src/session/prompt.ts:158

Accepts a PromptInput (sessionID, parts, model, agent, format) and:

  1. Creates a user message via createUserMessage() (line 951) — processes file parts, directory listings, MCP resources
  2. Persists via Session.updateMessage() and Session.updatePart()
  3. Unless noReply: true, invokes loop() to start the turn lifecycle

File: packages/opencode/src/session/prompt.ts:274

The loop runs until the assistant finishes or an error occurs:

Step 1 — Load message stream (line 298): Fetches all non-compacted messages via MessageV2.filterCompacted(). Identifies lastUser, lastAssistant, lastFinished.

Exit condition (lines 318-325): If the last assistant message is finished and its finish reason is not "tool-calls" or "unknown", break.

Step 2 — Step tracking (lines 327-334): Increments a step counter. On step 1, starts async title generation via ensureTitle().

Step 3 — Subtask handling (lines 352-526): If a pending subtask part is found (from the Task tool), it executes the subtask tool directly, creates the result message, and continues the loop.

Step 4 — Compaction handling (lines 529-554): If a pending compaction part is found, runs SessionCompaction.process() and continues.

Step 5 — Normal processing (lines 556-714): This is the core agent cycle:

  1. Get agent config: Agent.get(lastUser.agent)
  2. Insert mode reminders (plan/build mode switching)
  3. Create SessionProcessor wrapping an assistant message
  4. Resolve tools via SessionPrompt.resolveTools() (line 602)
  5. Inject StructuredOutput tool if JSON schema format requested
  6. Build system prompt and session messages
  7. Call processor.process() — this streams the LLM response

Step 6 — Loop control (lines 705-714):

  • Result "stop" — break
  • Result "compact" — create compaction and continue
  • Otherwise (tool calls finished) — continue loop

Stream Processing: SessionProcessor.process()


File: packages/opencode/src/session/processor.ts:55

Iterates stream.fullStream (from Vercel AI SDK’s streamText()) and processes events:

| Event Type | Action |
|---|---|
| start | Set session status to “busy” |
| text-delta | Accumulate text, call Session.updatePartDelta() |
| reasoning-delta | Accumulate reasoning, emit delta |
| tool-input-start | Create ToolPart with status “pending” |
| tool-call | Update status to “running”, check doom loop guard |
| tool-result | Update status to “completed”, record output + timing |
| tool-error | Update status to “error”, check for permission rejection |
| finish-step | Record finish reason, token usage, check overflow |

File: packages/opencode/src/session/processor.ts:154

If the same tool with the same input is called 3 times consecutively, the processor triggers a doom loop permission check via PermissionNext.ask(). The user can approve (continue) or deny (stop the loop).
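The detection logic can be sketched as a small guard. This is a hypothetical illustration of the "3 identical consecutive calls" rule described above; the class and its API are invented.

```python
# Hypothetical sketch of doom-loop detection: flag the third
# consecutive identical (tool, input) call.
class DoomLoopGuard:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.last = None
        self.count = 0

    def check(self, tool, args):
        key = (tool, repr(sorted(args.items())))
        self.count = self.count + 1 if key == self.last else 1
        self.last = key
        return self.count >= self.threshold  # True -> ask for permission

guard = DoomLoopGuard()
assert not guard.check("grep", {"pattern": "foo"})
assert not guard.check("grep", {"pattern": "foo"})
assert guard.check("grep", {"pattern": "foo"})      # third identical call
assert not guard.check("grep", {"pattern": "bar"})  # different input resets
```

Any change in either the tool name or its arguments resets the counter, so only genuinely stuck repetition trips the permission check.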

OpenCode’s generic permission checks are enforced in the tool execution path, not in the processor’s tool-call handler. SessionPrompt.resolveTools() builds a Tool.Context with ctx.ask() (packages/opencode/src/session/prompt.ts:773), and individual tools (plus the MCP wrapper at line 852) call ctx.ask(...) before execution.

The processor itself uses PermissionNext.ask() at tool-call time only for doom-loop protection (3 repeated identical calls). If a tool-level permission is rejected, the stream emits tool-error, blocked is set, and the loop returns "stop".

When SessionCompaction.isOverflow() detects the token count approaching the context limit (with a 20k reserved buffer), the processor sets needsCompaction = true. On the next loop iteration:

  1. A compaction agent summarizes the conversation
  2. A synthetic “Continue if you have next steps…” message is injected
  3. The loop continues with a compressed context

File: packages/opencode/src/session/retry.ts

Retry decisions are made per-error:

  • ContextOverflowError — not retried (handled by compaction)
  • APIError with isRetryable: false — not retried
  • Rate limits, overloaded errors — retried with exponential backoff (2s initial, 2x factor, 30s cap)
  • Headers retry-after-ms or retry-after are respected when present
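The delay computation described above can be sketched as a pure function. This is a hypothetical illustration using the parameters stated in the text (2s initial, 2x factor, 30s cap, retry-after header override); the function name and signature are invented.

```python
# Hypothetical sketch of the retry delay policy: exponential backoff
# with a cap, overridden by server-provided retry-after headers.
def next_delay(attempt, headers=None):
    headers = headers or {}
    if "retry-after-ms" in headers:
        return int(headers["retry-after-ms"]) / 1000.0
    if "retry-after" in headers:
        return float(headers["retry-after"])
    return min(2.0 * (2 ** attempt), 30.0)  # 2s initial, 2x factor, 30s cap

assert next_delay(0) == 2.0
assert next_delay(1) == 4.0
assert next_delay(10) == 30.0                          # capped
assert next_delay(0, {"retry-after-ms": "1500"}) == 1.5  # header wins
```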

```typescript
// SessionPrompt.state() - per session
{
  abort: AbortController,               // cancellation
  callbacks: Array<{resolve, reject}>,  // waiting clients
}
// SessionProcessor - per turn
{
  toolcalls: Record<string, ToolPart>,  // active tool calls
  snapshot: string,                     // file state for undo
  blocked: boolean,                     // permission denied
  attempt: number,                      // retry counter
  needsCompaction: boolean,             // overflow flag
}
```

Source: Public documentation at code.claude.com/docs/ (closed source — architecture inferred from docs, not inspected code).

Claude Code is Anthropic’s production coding agent. Unlike the open-source references above, its internals are not available for inspection. What follows is inferred from public documentation and compared against the patterns we’ve already traced.

Claude Code describes its loop as three blended phases: gather context, take action, verify results. Critically, these are NOT discrete states or separate code paths. The documentation is explicit: “These phases blend together. Claude uses tools throughout.” The model dynamically decides what each step requires based on what it learned from the previous step.

The system is described as an “agentic harness” around Claude: it provides tools, context management, and an execution environment. Two core components drive the loop: models (reasoning) and tools (acting). Each tool use returns information that feeds back into the loop, informing the next decision.

This aligns most closely with Codex’s needs_follow_up pattern. There is no indication of Aider-style reflection loops or explicit reflection counters.

The loop exits when:

  1. The model signals completion (no more tool calls)
  2. The user interrupts (type a correction mid-loop and press Enter)
  3. A hard limit is reached (--max-turns or --max-budget-usd in headless mode)
  4. Context fills up — but this triggers compaction, not termination; the loop continues after compaction

No explicit reflection cap is mentioned. Errors are handled through the model observing structured tool results and adjusting, not through a separate reflection mechanism.

Claude Code explicitly supports mid-loop user interruption: “You can interrupt at any point to steer Claude in a different direction, provide additional context, or ask it to try a different approach. Claude will stop what it’s doing and adjust.” This maps to Codex’s steer_input() pattern where user input during an active turn is injected rather than queued.

Claude Code organizes tools into five categories plus orchestration:

| Category | Capabilities |
|---|---|
| File operations | Read, edit, create, rename, reorganize |
| Search | Find files by pattern, search content with regex, explore codebases |
| Execution | Shell commands, servers, tests, git |
| Web | Search the web, fetch documentation, look up error messages |
| Code intelligence | Type errors/warnings after edits, go to definition, find references |
| Orchestration | Spawn subagents, ask user questions, task management |

Code intelligence is delivered via plugins, not built-in. This is a deliberate extension point — the core loop doesn’t hardcode language-specific analysis.

What loads into the context window per session: conversation history, file contents, command outputs, CLAUDE.md instructions, loaded skills (descriptions only until invoked), and system instructions.

Compaction strategy when approaching the context limit:

  1. Clear older tool outputs first (cheapest, most recoverable)
  2. Summarize conversation if needed (LLM-based compaction)
  3. Preserve user requests and key code snippets
  4. Persistent instructions from early in conversation may be lost (put them in CLAUDE.md instead)

Users control compaction via:

  • A “Compact Instructions” section in CLAUDE.md (survives compaction)
  • /compact command with optional focus: /compact focus on the API changes
  • /context to visualize what’s consuming space

Key insight: Skills load on demand — Claude sees skill descriptions at session start, but full content loads only when invoked. Subagents get completely fresh context windows, separate from the main conversation. They return only a summary. This isolation is the primary scaling mechanism for long sessions.

Each subagent spawned by Claude Code runs in its own fresh context window. It does not inherit the parent conversation’s full history. When the subagent completes, it returns a summary to the parent. This prevents long sessions from degrading — complex subtasks run in clean contexts while the parent conversation stays compact.

This is architecturally distinct from all three open-source references:

  • Aider: Architect mode gives the editor a fresh conversation, but it’s a specific two-model pattern, not a general subagent mechanism
  • Codex: No subagent isolation; all work runs in the same session context
  • OpenCode: Agents run in the same session context with shared message history

On Opus 4.6, Claude Code uses adaptive reasoning: instead of a fixed thinking token budget, the model dynamically allocates thinking based on an effort level setting (low/medium/high). Other models use a fixed budget up to 31,999 tokens.

This means the loop must handle variable-length thinking phases. The MAX_THINKING_TOKENS environment variable can cap the budget (ignored on Opus 4.6 except when set to 0, which disables thinking entirely).

Before every file edit, Claude Code snapshots the current file contents. Users can rewind with Esc+Esc to restore to any previous checkpoint. Checkpoints are local to the session, separate from git. They only cover file changes — remote actions (databases, APIs, deployments) cannot be checkpointed.

This is similar to OpenCode’s snapshot mechanism and Codex’s ghost snapshots.

Three modes, cycled with Shift+Tab:

  1. Default: Asks before file edits and shell commands
  2. Auto-accept edits: Edits without asking, still asks for shell commands
  3. Plan mode: Read-only tools only; creates a plan for user approval before execution

Allowed commands are configurable in .claude/settings.json. Settings scope from organization-wide policies down to personal preferences. In headless mode, --permission-prompt-tool delegates permission decisions to an external MCP tool — enabling CI/CD pipelines where an external system handles approvals.

Key Architectural Differences from Open-Source References

| Aspect | Claude Code | Codex | OpenCode | Aider |
|---|---|---|---|---|
| Reflection mechanism | None — structured tool results only | None — structured tool results only | Doom loop detection (3x same call) | Explicit reflection loop (max 3) |
| Subagent isolation | Fresh context per subagent | No subagents | Shared session context | Architect mode only |
| Mid-turn compaction | Yes (clear tool outputs, then summarize) | Yes (auto-compact and continue) | Yes (overflow detection, compaction marker) | No (fails with error) |
| User mid-loop steering | Yes (interrupt and redirect) | Yes (steer_input) | Yes (AbortController) | Yes (KeyboardInterrupt) |
| Thinking budget | Adaptive per effort level (Opus 4.6) | Fixed | Fixed | Fixed |
| Permission delegation | MCP tool for headless mode | Approval channel | Permission promise | N/A (single-user) |

Aider caps reflections at 3. Without this, a malformed response that always fails parsing would loop forever. Codex avoids the problem by not having reflection — tool results are always fed back as structured data, not error strings. OpenCode has doom loop detection (3x same call) but no explicit reflection cap for parse errors since tool results are always structured.

In Codex, when a tool requires approval, the entire turn blocks on the approval channel. If the user is slow to respond, the API connection may time out. Codex mitigates this by holding the connection open, but it creates backpressure in the event loop. OpenCode’s approach is similar but uses async promises.

All three tools handle overflow differently:

  • Aider: Fails with ContextWindowExceededError and tells the user to reduce context
  • Codex: Auto-compacts mid-turn and continues the loop — most resilient
  • OpenCode: Detects overflow at finish-step, creates a compaction marker, and continues

Codex uses tokio::select! with CancellationToken on every spawned task, allowing clean abort mid-tool-execution. Aider uses KeyboardInterrupt (Python signal), which can leave partial state. OpenCode uses AbortController signals passed through the AI SDK.

Aider’s architect mode spawns a fresh editor coder with an empty conversation. If the plan text is ambiguous, the editor has no context to resolve it. Codex avoids this by not supporting multi-model loops. OpenCode has agent-based routing but each agent runs in the same session context.

Aider uses litellm’s token counters (tiktoken for OpenAI, approximations for others). Codex uses a 4-byte heuristic. OpenCode uses char / 4. All three can miscalculate, leading to unexpected context overflow. Only Codex’s mid-turn auto-compact provides a safety net.
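The two cheap heuristics can be contrasted directly. This is a hypothetical sketch; the function names are invented, and real tokenizers diverge from both estimates.

```python
# Hypothetical sketch contrasting the two cheap token heuristics
# mentioned above; both undercount code-heavy or non-ASCII text.
def estimate_tokens_chars(text):
    return len(text) // 4                  # OpenCode-style: chars / 4

def estimate_tokens_bytes(text):
    return len(text.encode("utf-8")) // 4  # Codex-style: bytes / 4

ascii_text = "let x = 1;" * 100
# For pure ASCII the two agree, since 1 char == 1 byte
assert estimate_tokens_chars(ascii_text) == estimate_tokens_bytes(ascii_text)
# Multi-byte characters make them diverge
assert estimate_tokens_bytes("émojis 🦀") > estimate_tokens_chars("émojis 🦀")
```

Because both estimates can undershoot the real token count, a hard check against the model's context window at request time (or a mid-turn compaction safety net) is still needed.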


OpenOxide adopts Codex’s channel architecture with OpenCode’s event granularity:

```rust
// Core agent loop crate
pub struct AgentLoop {
    rx_submission: Receiver<Submission>,
    tx_event: Sender<Event>,
    session: Arc<Session>,
}
```

The submission loop receives Op variants (user input, approval responses, interrupt, compact) and dispatches to handlers. Each turn spawns a background tokio task with a CancellationToken.

The run_turn() function follows Codex’s pattern:

```rust
loop {
    let request = session.build_sampling_request().await;
    match api_client.stream(request).await {
        Ok(stream) => {
            let result = process_stream(stream, &tool_runtime).await?;
            if !result.needs_follow_up { break; }
            if result.token_limit_reached {
                run_auto_compact(&session).await?;
            }
        }
        Err(e) if e.is_retryable() => {
            backoff_and_retry(&mut retries)?;
            continue;
        }
        Err(e) => { emit_error(e); break; }
    }
}
```

OpenOxide does NOT use Aider’s reflection pattern. Tool results are always fed back as structured FunctionCallOutput items. Parse errors in edit formats are returned as tool error strings, which the model can observe and retry without a separate reflection mechanism.

Follows Codex’s pattern: approval requests are emitted as events, the turn blocks on a oneshot::channel, and the TUI (or MCP client) sends the decision back. A configurable timeout prevents indefinite blocking.
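The one-shot approval flow with a timeout can be approximated with an asyncio future standing in for the oneshot channel. This is a hypothetical sketch; the broker class and its API are invented.

```python
import asyncio

# Hypothetical sketch of the blocking approval flow: the turn awaits a
# one-shot future, the UI resolves it, and a timeout bounds the wait.
class ApprovalBroker:
    def __init__(self):
        self.pending = {}

    def request(self, approval_id):
        fut = asyncio.get_running_loop().create_future()
        self.pending[approval_id] = fut
        return fut  # turn task awaits this

    def decide(self, approval_id, decision):
        self.pending.pop(approval_id).set_result(decision)

async def demo():
    broker = ApprovalBroker()
    fut = broker.request("exec-1")
    # UI side responds shortly; a slow user would trip the timeout instead
    asyncio.get_running_loop().call_later(0.01, broker.decide, "exec-1", "approve")
    return await asyncio.wait_for(fut, timeout=1.0)

assert asyncio.run(demo()) == "approve"
```

On timeout, asyncio.wait_for raises TimeoutError, which the loop can translate into a denial rather than blocking the turn indefinitely.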

Mid-turn auto-compaction as in Codex, with OpenCode’s two-phase approach:

  1. Prune: Remove old tool outputs beyond a 40k token protect window
  2. Summarize: LLM-based compaction using a dedicated compaction model
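The prune phase can be sketched as a single pass over the history, newest first. This is a hypothetical illustration of the 40k-token protect window described above; the message shape and the chars/4 estimator are invented for the example.

```python
# Hypothetical sketch of the prune phase: drop old tool outputs while
# protecting the most recent window of tokens (40k in the text above).
def prune(messages, protect_tokens=40_000,
          est=lambda m: len(m["content"]) // 4):
    kept, budget = [], 0
    for msg in reversed(messages):  # walk newest first
        budget += est(msg)
        if msg["role"] == "tool" and budget > protect_tokens:
            continue                # old tool output: drop it
        kept.append(msg)            # everything else survives
    return list(reversed(kept))

msgs = [
    {"role": "tool", "content": "x" * 200_000},  # old, large tool output
    {"role": "user", "content": "please fix"},
    {"role": "tool", "content": "ok" * 10},      # recent, inside window
]
pruned = prune(msgs)
assert pruned[0]["role"] == "user"  # old tool output dropped
assert len(pruned) == 2
```

Only tool outputs are eligible for pruning; user requests survive regardless of age, so the summarize phase still sees the full intent of the conversation.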

Adopt OpenCode’s pattern: track the last 3 tool calls, and if the same tool+input appears 3 times consecutively, inject a warning into the context and optionally halt.

| Crate | Responsibility |
|---|---|
| openoxide-loop | Core agent loop, submission handling, turn execution |
| openoxide-session | Session state, message history, compaction |
| openoxide-tools | Tool registry, dispatch, parallel execution |
| openoxide-exec | Command execution, sandbox integration |
| openoxide-provider | API client, streaming, retry logic |
  1. No reflection loops. Structured tool results only. Simpler, more predictable. Validated by both Codex and Claude Code — neither uses Aider-style reflection.
  2. Mid-turn compaction. Enables long multi-step tasks without manual intervention. Claude Code confirms this is the production-proven approach: clear tool outputs first, summarize conversation second.
  3. Parallel tool execution. FuturesOrdered with per-tool serialization control.
  4. Event-driven communication. TUI and MCP server are just event consumers — the loop is transport-agnostic.
  5. Cancellation at every await point. tokio::select! with CancellationToken throughout.
  6. User steering mid-turn. Accept user input during an active turn as a redirect, not just as an interrupt-and-restart. Claude Code and Codex both support this via their submission channels. Add a Steer variant to Op that injects into the active turn’s context.
  7. Subagent context isolation. Each subagent gets a fresh context window. Parent sends task description + relevant context; subagent returns a summary. This is Claude Code’s primary mechanism for keeping long sessions manageable. Implement as a spawn_subagent() method on Session that creates a child Session with an independent message history.
  8. Adaptive thinking budget support. The loop must not assume a fixed thinking token allocation. Support a configurable effort level (low/medium/high) that the provider translates into model-specific thinking parameters. Cap with MAX_THINKING_TOKENS env var.
  9. Permission delegation in headless mode. Support an external permission handler (MCP tool or callback) for CI/CD pipelines where no human is available to approve. Mirrors Claude Code’s --permission-prompt-tool pattern.