
Streaming Edits

The central question: when a model streams 4,000 tokens describing a file edit, does the tool start writing to disk as tokens arrive, or does it accumulate the full response first and then apply changes atomically?

The intuitive answer—stream edits as they arrive—turns out to be wrong for almost every architecture. Streaming application requires that every token boundary produce a valid, parseable partial state. In practice, edit-block formats (SEARCH/REPLACE, unified diff, patch) all have structural requirements that cannot be satisfied until the delimiter closing the edit block has arrived. Until the >>>>>>> REPLACE marker lands at, say, token 2,847 of 3,100, the preceding 2,846 tokens are unparseable as an edit.
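As a concrete illustration (a minimal sketch, not Aider's actual parser), a state machine over the three delimiter lines has nothing valid to return until the closing marker arrives:

```python
import re

# Delimiter patterns in the style of SEARCH/REPLACE blocks (5-9 marker chars).
HEAD = re.compile(r"^<{5,9} SEARCH\s*$")
DIVIDER = re.compile(r"^={5,9}\s*$")
UPDATED = re.compile(r"^>{5,9} REPLACE\s*$")

def parse_block(text: str) -> tuple[str, str]:
    """Return (search, replace), or raise ValueError on a partial block."""
    search, replace, state = [], [], "pre"
    for line in text.splitlines():
        if state == "pre" and HEAD.match(line):
            state = "search"
        elif state == "search" and DIVIDER.match(line):
            state = "replace"
        elif state == "replace" and UPDATED.match(line):
            state = "done"
        elif state == "search":
            search.append(line)
        elif state == "replace":
            replace.append(line)
    if state != "done":
        # Every prefix of the stream is invalid until the closing marker.
        raise ValueError(f"incomplete edit block (stopped in state {state!r})")
    return "\n".join(search), "\n".join(replace)
```

Any truncation point before the final `>>>>>>> REPLACE` line leaves the parser in a non-terminal state, which is exactly why the accumulated text cannot be applied mid-stream.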

There is a second problem: atomicity. If the model edits three files and the agent writes file A and file B successfully but fails mid-stream on file C, the working tree is left in a broken intermediate state. Atomic application—parse everything first, then write everything—makes rollback trivial.
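The parse-everything-then-write-everything discipline can be sketched as follows (illustrative only; `read_file` and `write_file` stand in for real I/O):

```python
def apply_atomically(edits, read_file, write_file):
    """edits: list of (path, old_text, new_text) tuples.

    Phase 1 validates every edit in memory; phase 2 writes only after
    all edits pass. A failure in phase 1 leaves the filesystem untouched.
    """
    staged = {}
    for path, old, new in edits:
        content = staged.get(path, read_file(path))
        if old not in content:
            raise ValueError(f"search text not found in {path}")
        staged[path] = content.replace(old, new, 1)
    for path, content in staged.items():
        write_file(path, content)
```

If the third of three edits fails validation, nothing has been written, so rollback is a no-op rather than a recovery procedure.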

The result is that all three reference tools apply edits atomically after the full response is received, even though they all display streaming output to the user in real time. Streaming is a UI concern; file modification is a post-stream concern.


Aider

Commit: b9050e1d5faf8096eae7a46a9ecc05a86231384b

Aider’s edit lifecycle spans three phases: token accumulation, finalization, and application. Only the last phase touches the filesystem.

base_coder.py:1783 — send() resets the accumulator:

self.partial_response_content = ""

base_coder.py:1806 — if streaming is enabled, control passes to show_send_output_stream():

if self.stream:
    yield from self.show_send_output_stream(completion)

Inside the streaming loop (base_coder.py:1903):

for chunk in completion:
    text = chunk.choices[0].delta.content  # one token
    self.partial_response_content += text  # accumulate in memory
    self.live_incremental_response(False)  # update TUI (display only)
    yield text                             # surface to caller for progress

The live_incremental_response(False) call updates the markdown stream renderer — it is a display-only operation. WholeFileCoder overrides render_incremental_response() to attempt a live diff display (wholefile_coder.py:16), but this diff is also display-only; no file is written.

After the streaming loop exits, send_message() enters a finally block (base_coder.py:1513):

self.live_incremental_response(True) # final=True → flush markdown display
self.partial_response_content = get_multi_response_content_in_progress(True)

get_multi_response_content_in_progress() (base_coder.py:2128) concatenates multi-part responses (used when supports_assistant_prefill is enabled for very long outputs). After finalization, add_assistant_reply_to_cur_messages() stores the complete response in chat history.

apply_updates() is called at base_coder.py:1585, only after reply_completed() passes and finalization is done:

edits = self.get_edits() # parse complete partial_response_content
edits = self.apply_edits_dry_run(edits) # validate: would replacements match?
edits = self.prepare_to_edit(edits) # further validation
self.apply_edits(edits) # write to disk

The per-format disk write locations:

| Format         | File               | Line | Disk Write Call                           |
|----------------|--------------------|------|-------------------------------------------|
| SEARCH/REPLACE | editblock_coder.py | 71   | self.io.write_text(full_path, new_content) |
| Whole file     | wholefile_coder.py | 128  | self.io.write_text(full_path, new_lines)   |
| Unified diff   | udiff_coder.py     | 112  | self.io.write_text(full_path, content)     |

apply_edits() in editblock_coder.py writes only on success (if new_content:). If the SEARCH block does not match, it falls through to the fuzzy-matching stack in search_replace.py before giving up and reporting an error — still without touching disk.

After apply_edits() returns, auto_commit() runs (base_coder.py:1589), then optional lint and test hooks.

Why SEARCH/REPLACE Is Most Sensitive to Streaming


The SEARCH/REPLACE format uses delimiter lines of 5–9 <, =, and > characters (editblock_coder.py HEAD/DIVIDER/UPDATED regex patterns). A partial block — for example, the <<<<<<< SEARCH marker without the closing >>>>>>> REPLACE — is syntactically invalid. find_original_update_blocks() raises on partial input. This is the strongest reason Aider cannot apply edits mid-stream: the parser requires a complete, balanced block structure.


Codex

Commit: 4ab44e2c5 (codex-rs workspace)

Codex uses a custom patch format called the Codex Patch Format (not standard unified diff). The model emits the entire patch as a tool call argument, and the apply-patch crate processes it atomically.

Defined in codex-rs/apply-patch/src/parser.rs:4–21:

*** Begin Patch
*** Update File: path/to/file.rs
@@ context line here
-old line
+new line
unchanged line
*** End Patch

Hunk types:

  • *** Add File: <path> — create a new file
  • *** Delete File: <path> — remove a file
  • *** Update File: <path> — modify existing file (with optional *** Move to: <path>)

Entry point: apply_patch() in codex-rs/apply-patch/src/lib.rs:174:

pub fn apply_patch(
    patch: &str,
    stdout: &mut impl std::io::Write,
    stderr: &mut impl std::io::Write,
) -> Result<(), ApplyPatchError> {
    let args = parse_patch(patch)?;  // parse entire patch text into Vec<Hunk>
    apply_hunks(&args)?;             // dispatch all hunks to filesystem
    print_summary(stdout, &args)?;
    Ok(())
}

apply_hunks_to_files() (lib.rs:270) dispatches by hunk type:

  • AddFile: std::fs::create_dir_all() then std::fs::write(path, contents) (lib.rs:280–291)
  • DeleteFile: std::fs::remove_file(path) (lib.rs:292–295)
  • UpdateFile: calls derive_new_contents_from_chunks() then std::fs::write(path, new_contents) (lib.rs:297–322)

derive_new_contents_from_chunks() (lib.rs:339):

  1. Reads original file: std::fs::read_to_string(path) (lib.rs:343)
  2. Calls compute_replacements() to build a sorted list of (start_idx, old_len, new_lines) tuples
  3. Calls apply_replacements() which applies them in reverse order (lib.rs:475) to prevent index shifting
  4. Returns AppliedPatch { original_contents, new_contents }
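The reverse-order trick in step 3 can be sketched in a few lines (illustrative Python, not the crate's Rust; replacements use the (start_idx, old_len, new_lines) shape described above):

```python
def apply_replacements(lines, replacements):
    """Apply (start_idx, old_len, new_lines) replacements to a list of lines.

    Iterating from the highest start index downward means earlier indices
    are never shifted by edits that have already been applied.
    """
    out = list(lines)
    for start, old_len, new_lines in sorted(replacements, reverse=True):
        out[start:start + old_len] = new_lines
    return out
```

Applying the same list in forward order would silently corrupt the output whenever a replacement changes the line count, since every later start index would point at the wrong line.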

seek_sequence() in seek_sequence.rs uses a four-pass strategy:

  1. Exact byte-for-byte match
  2. Trailing-whitespace trimmed match
  3. Full trim (both ends) match
  4. Unicode normalization (smart quotes, em-dashes → ASCII equivalents)

This mirrors Aider’s RelativeIndenter strategy: the model may emit slightly different whitespace than the file contains, so the seeker degrades gracefully.
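A minimal Python rendering of the four-pass idea (a sketch, not the actual seek_sequence.rs code; `_normalize` is a stand-in that covers only a few common characters):

```python
def _normalize(line: str) -> str:
    # Map smart quotes and dashes (U+2018/19, U+201C/1D, U+2013/14) to ASCII.
    table = {0x2018: "'", 0x2019: "'", 0x201C: '"', 0x201D: '"',
             0x2013: "-", 0x2014: "-"}
    return line.translate(table)

def seek_sequence(lines, pattern, start=0):
    """Return the index of the first occurrence of pattern at or after
    start, trying four successively looser comparison passes."""
    passes = [
        lambda s: s,                      # 1. exact byte-for-byte
        str.rstrip,                       # 2. trailing whitespace trimmed
        str.strip,                        # 3. both ends trimmed
        lambda s: _normalize(s.strip()),  # 4. trimmed + unicode-normalized
    ]
    for canon in passes:
        want = [canon(p) for p in pattern]
        for i in range(start, len(lines) - len(pattern) + 1):
            if [canon(x) for x in lines[i:i + len(pattern)]] == want:
                return i
    return -1
```

Each pass scans the whole range before the next, looser pass runs, so an exact match always wins over a whitespace-fuzzed one.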

Codex is fully atomic. The model tool call for apply_patch contains the complete patch text as a JSON string argument. No hunk is applied until the complete patch has been parsed and all hunks validated. If parse_patch() fails, no filesystem writes occur.


OpenCode

Commit: 7ed449974

OpenCode exposes three file-editing tools to the model: apply_patch, write, and multiedit. Each is a TypeScript module in packages/opencode/src/tool/.

Input schema (apply_patch.ts:17–19):

const PatchParams = z.object({
  patchText: z.string(),
})

OpenCode’s patch format is identical to Codex’s *** Begin Patch format. The packages/opencode/src/patch/index.ts module implements the same parser, including computeReplacements(), applyReplacements() (with reverse-order application, patch/index.ts:406), and the same four-pass seekSequence() fuzzy matcher.

Execution flow (apply_patch.ts:24–269):

  1. Parse all hunks from the complete patchText argument
  2. Compute new content for each file (reads current file from disk, applies replacements in memory)
  3. Collect diffs for all files (display-only at this point)
  4. Gate on permission: ctx.ask({ permission: "edit", ... }) blocks until user approves
  5. Write all files sequentially: fs.writeFile() per hunk
  6. Publish events: File.Event.Edited and FileWatcher.Event.Updated per modified file
  7. Notify LSP: LSP.touchFile(target, true) then await LSP.diagnostics()

The permission gate (step 4) is unique to OpenCode. Aider and Codex apply edits without an interactive approval step in the hot path.

The write tool replaces an entire file atomically (write.ts:45):

await Bun.write(filepath, params.content)

Before writing: reads existing content to generate a diff for the permission prompt. Uses FileTime assertion (line 32) to detect races — if the file has been modified since the session loaded it, the write is rejected.

The multiedit tool sequences multiple string-replacement edits on a single file (multiedit.ts:12–21):

edits: z.array(z.object({
  filePath: z.string(),
  oldString: z.string(),
  newString: z.string(),
  replaceAll: z.boolean().optional(),
}))

Each edit in the array is applied sequentially by calling EditTool.execute(). This is not a batch atomic operation — if edit 3 fails, edits 1 and 2 have already been applied to disk. This is intentional: the tool is designed for sequential, order-dependent edits within a single file.
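The failure mode can be sketched against an in-memory "disk" (illustrative; this `multiedit` is a hypothetical stand-in, not the real tool):

```python
def multiedit(fs: dict, path: str, edits):
    """Apply (old, new) string replacements to fs[path] sequentially.

    Each edit is written before the next one runs, so a failure at
    edit N leaves edits 1..N-1 already applied.
    """
    for old, new in edits:
        content = fs[path]
        if old not in content:
            raise ValueError(f"oldString not found: {old!r}")
        fs[path] = content.replace(old, new, 1)  # "disk" write before next edit
```

Because each edit sees the output of the previous one, later `oldString` values may legitimately reference text created by earlier edits, which is exactly the order-dependent use case the tool targets.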

packages/opencode/src/session/processor.ts:45 drives the model response loop:

for await (const value of stream.fullStream) {
  switch (value.type) {
    case "tool-input-start": // create tool part record
    case "tool-input-delta": // no-op (not streamed to disk)
    case "tool-input-end":   // finalize tool input
    case "tool-call":        // execute tool with complete input ← edits happen here
    case "tool-result":      // record output
  }
}

The tool-call event fires only when the complete tool input has been received. Tool execution is synchronous within the event handler. OpenCode is therefore atomic per tool call — no file write occurs until the model has finished emitting the tool’s JSON arguments.
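The accumulate-then-execute pattern behind that switch can be sketched as (illustrative; `drive` and `execute` are hypothetical stand-ins):

```python
def drive(events, execute):
    """events: iterable of (kind, payload) pairs, as in the stream above.

    Input deltas only grow a buffer; the tool runs exactly once, when the
    tool-call event signals the complete arguments have arrived.
    """
    buf = ""
    for kind, payload in events:
        if kind == "tool-input-delta":
            buf += payload  # accumulation only; no side effects
        elif kind == "tool-call":
            execute(buf)    # complete JSON arguments available here
            buf = ""
```

A stream that dies before the tool-call event simply discards the buffer, which is the per-tool-call atomicity described above.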

packages/opencode/src/snapshot/index.ts provides session-level undo but does not intercept writes. It operates on git trees:

  • track(): runs git write-tree to create a git snapshot before edits
  • patch(hash): computes git diff between two snapshots
  • revert(patches): runs git checkout to restore files from a snapshot

This is an undo/revert mechanism, not a write-intercepting VFS.


All three tools display tokens to the user in real time. This creates the impression that editing is happening live. In fact, the TUI is streaming the text representation of the edit — the actual file modification does not occur until the stream closes. Users who interrupt a session mid-stream (Ctrl-C) lose all pending edits.

If Aider’s SEARCH block is interrupted mid-stream (network error, process kill), find_original_update_blocks() raises rather than partially applying. This is intentional fail-closed behavior. However, if the error occurs after apply_edits() has already written file A but before writing file B, the working tree is inconsistent. The auto_commit() call immediately after apply_edits() mitigates this by snapshotting the partial state.

OpenCode Permission Gate as a Soft Boundary


OpenCode’s ctx.ask() permission gate is an approval gate, not an enforcement mechanism. A tool that bypasses the gate (by not calling ctx.ask()) can write freely. The gate is a UX feature, not a security boundary.

Reverse-Order Replacement

apply_replacements() applies replacements from the end of the file toward the beginning (lib.rs:475). This prevents index shifting — removing or inserting lines changes line numbers for all subsequent replacements. Getting this order wrong produces incorrect output silently (the writes succeed, the content is wrong). OpenCode’s patch/index.ts:406 has the same requirement and the same solution.

The four-pass fuzzy seeker can match the wrong location if the same context line appears multiple times in a file. In Codex, seek_sequence() returns the first match from the current position. If the model emits a context line that appears on lines 12, 45, and 78, and the intended target is line 45, the seeker will incorrectly match line 12. This is a known limitation of context-based seeking without explicit line numbers.
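The pitfall is easy to reproduce (illustrative sketch; `seek_first` is a hypothetical stand-in for the first-match scan):

```python
def seek_first(lines, context, start=0):
    """Return the index of the first line equal to context at or after start."""
    for i in range(start, len(lines)):
        if lines[i] == context:
            return i
    return -1

lines = ["pass"] * 100
lines[12] = lines[45] = lines[78] = "    return None"
# Intended target: line 45. Seeking from 0 matches line 12 instead.
assert seek_first(lines, "    return None") == 12
# Only a cursor advanced past earlier hunks disambiguates:
assert seek_first(lines, "    return None", start=13) == 45
```

This is why patch formats that thread a cursor forward through the file as hunks are applied (as the *** Begin Patch parsers do) are less ambiguous than formats that seek from the top for every edit.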


OpenOxide should use Codex’s apply-patch crate directly. It is already a Rust library with a clean public API, battle-tested fuzzy seeking, and the correct reverse-order replacement algorithm.

[dependencies]
apply-patch = { path = "../vendor/codex/codex-rs/apply-patch" }
# Or, once published:
# apply-patch = "0.1"

pub trait EditApplier: Send + Sync {
    /// Apply a batch of edits atomically. No filesystem write occurs
    /// until all edits have been validated.
    async fn apply(&self, patch: &str, ctx: &EditContext) -> Result<ApplyResult, EditError>;
}

pub struct ApplyResult {
    pub added: Vec<PathBuf>,
    pub modified: Vec<PathBuf>,
    pub deleted: Vec<PathBuf>,
    pub diagnostics: Vec<Diagnostic>,
}

Implement a PermissionGate that runs before any filesystem write. In autonomous mode, the gate auto-approves based on configured rules. In interactive mode, it suspends the coroutine and waits for a TUI event:

pub enum GateDecision {
    Allow,
    Deny(String),
    AskUser(oneshot::Sender<GateDecision>),
}

To give users incremental feedback without sacrificing atomicity:

  1. Parse the complete patch into a Vec<Hunk> immediately after the tool call closes
  2. Emit Event::PatchParsed { hunk_count } to the TUI
  3. For each hunk (before writing), emit Event::HunkPending { path, diff }
  4. Write all hunks atomically (or abort all on first failure)
  5. For each written hunk, emit Event::HunkApplied { path }

This provides per-file granularity in the progress display without any intermediate disk state.

After apply_patch completes, send workspace/didChangeWatchedFiles notifications to all active LSP servers for modified paths. Collect textDocument/publishDiagnostics events with a 500 ms timeout and surface errors in the session output. Use tokio::time::timeout to avoid blocking on a slow language server.

Use the git2 crate to snapshot the working tree state before any edit batch:

use git2::{build::CheckoutBuilder, Repository};

pub fn snapshot(repo: &Repository) -> Result<git2::Oid, git2::Error> {
    let mut index = repo.index()?;
    index.add_all(["*"], git2::IndexAddOption::DEFAULT, None)?;
    index.write()?;
    let oid = index.write_tree()?;
    Ok(oid)
}

pub fn revert(repo: &Repository, tree_oid: git2::Oid) -> Result<(), git2::Error> {
    let tree = repo.find_tree(tree_oid)?;
    // Force the checkout so locally modified files are actually restored;
    // the default safe checkout refuses to overwrite dirty files.
    let mut opts = CheckoutBuilder::new();
    opts.force();
    repo.checkout_tree(tree.as_object(), Some(&mut opts))?;
    Ok(())
}

Store the snapshot OID in the session record before each edit batch. The /undo command triggers revert().