# Edit Formats
Source attribution: The format descriptions and benchmark data in this page are drawn from the official Aider documentation:
- Edit Formats — aider.chat/docs/more/edit-formats.html
- Edit Format Leaderboard — aider.chat/docs/leaderboards/edit.html
- OpenAI apply_patch tool — platform.openai.com/docs/guides/tools-apply-patch
Implementation details are traced from `references/aider/` at commit `b9050e1d5faf8096eae7a46a9ecc05a86231384b` and `references/codex/` at commit `4ab44e2c5`.
## Feature Definition

Every AI coding agent faces the same fundamental problem: the LLM needs to express file changes in text, and the host application needs to parse that text back into filesystem operations. The obvious solution — “have the model return the whole updated file” — works for small files but becomes expensive, slow, and error-prone as files grow. A 1,000-line file edited to change three lines still sends 1,000 lines back across the wire. At $15/million output tokens, that adds up fast.
The deeper problem is reliability. LLMs follow the format you ask for most of the time, not all of the time. Any production edit system must handle malformed output — missing markers, wrong indentation, extra prose, partial responses cut off mid-block. The parser must be lenient enough to recover gracefully while being strict enough to avoid silently applying a wrong edit.
Edit formats exist at the intersection of three competing pressures:
- Token efficiency — How many output tokens does a change require? Whole-file is the most expensive; a targeted search/replace block for a 3-line change might use 20 tokens where whole-file uses 2,000.
- LLM compliance rate — Can the model reliably follow the format? Some formats are too syntactically complex for weaker models; others are so loose that parsers can’t extract edits cleanly.
- Failure recovery — When the match fails (because the model’s “SEARCH” block doesn’t exactly match the file), how gracefully can the system recover?
Aider has solved this problem across years of production use and codified the solution into five distinct formats, each suited to different model capabilities and file sizes. Understanding them is essential groundwork for implementing any AI coding agent.
## Aider Implementation

**Reference**: `references/aider/aider/coders/` | **Commit**: `b9050e1d5faf8096eae7a46a9ecc05a86231384b`
Aider implements edit formats as a class hierarchy. Every format is a concrete subclass of `Coder` (defined in `aider/coders/base_coder.py`), which couples a prompt bundle (the instructions sent to the LLM) with a matching parser and applier. The factory method `Coder.create(edit_format=...)` instantiates the right subclass.
```python
class Coder:
    edit_format = None   # overridden by each subclass
    gpt_prompts = None   # prompt bundle (system + user templates)

    @classmethod
    def create(cls, edit_format, ...):
        # dispatches to the right subclass
        ...
```

### Format 1: SEARCH/REPLACE Blocks (`diff`)

**Files**: `editblock_coder.py`, `editblock_prompts.py`, `search_replace.py`
This is Aider’s default for Claude and GPT-4 class models. The LLM is instructed to emit blocks using a merge-conflict-inspired syntax:
```
path/to/file.py
<<<<<<< SEARCH
def old_function():
    return 1
=======
def old_function():
    return 2
>>>>>>> REPLACE
```

The system prompt in `editblock_prompts.py` spells out seven numbered rules, including the requirement that the SEARCH block contain a “contiguous chunk of lines” from the existing file — no skipping, no ellipsis, no paraphrasing. The file path appears on the line immediately before the opening fence.
The parser (find_original_update_blocks() in editblock_coder.py) matches the block boundaries with regular expressions that allow 5–9 angle bracket characters to tolerate common LLM transcription errors:
```python
HEAD = r"^<{5,9} SEARCH>?\s*$"      # <<<<<<< SEARCH
DIVIDER = r"^={5,9}\s*$"            # =======
UPDATED = r"^>{5,9} REPLACE\s*$"    # >>>>>>> REPLACE
```

The critical piece is what happens when the SEARCH block doesn’t match the file. Rather than failing immediately, Aider runs a three-strategy fallback stack defined in `search_replace.py`:
```python
editblock_strategies = [
    (search_and_replace, all_preprocs),          # exact string match
    (git_cherry_pick_osr_onto_o, all_preprocs),  # git 3-way merge
    (dmp_lines_apply, all_preprocs),             # diff-match-patch line diff
]
```

Each strategy is tried with multiple preprocessors — combinations of stripping blank lines and rewriting to relative indentation. The `RelativeIndenter` class in `search_replace.py` normalizes absolute indentation levels to relative changes between lines, represented with the Unicode character ← for outdents. This lets the system match a block at eight spaces of indentation even when the SEARCH block was written at four spaces — a common failure mode when the LLM re-indents code in its “thinking” before generating the block.
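The relative-indentation idea can be illustrated with a short Python sketch. This is a simplified stand-in for Aider’s `RelativeIndenter` (which also handles outdent markers and round-trips the text), not its actual code:

```python
def to_relative_indent(block: str) -> list[tuple[int, str]]:
    """Encode each line as (indent change vs. previous line, stripped text).

    Two copies of the same block re-indented as a whole produce identical
    encodings, so a SEARCH block written at 4 spaces can still match the
    file at 8 spaces.
    """
    encoded, prev = [], None
    for line in block.splitlines():
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        delta = 0 if prev is None else indent - prev
        encoded.append((delta, stripped))
        prev = indent
    return encoded

four = "    if x:\n        return 1"
eight = "        if x:\n            return 1"
assert to_relative_indent(four) == to_relative_indent(eight)
```

Matching in this encoded space is what lets an exact comparison succeed even after a uniform re-indent.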
The dmp_lines_apply strategy converts text to line arrays and applies diff-match-patch patches — the same library Google uses internally for text comparison. It’s the most aggressive fuzzy match and can recover from cases where the file has drifted slightly from what the model saw (e.g., a whitespace-only change by an auto-formatter).
### Format 2: Unified Diff (`udiff`)

**Files**: `udiff_coder.py`, `udiff_prompts.py`
The LLM emits standard diff -U0 style hunks inside a fenced block:
```diff
--- mathweb/flask/app.py
+++ mathweb/flask/app.py
@@ ... @@
-def is_prime(x):
-    return False
+def is_prime(x):
+    return x > 1
```

The instruction deliberately omits line numbers from the `@@` header — Aider’s prompt says *Don’t include line numbers like `diff -U0` does*. This is because LLMs hallucinate line numbers confidently and incorrectly. Without numbers, the parser must locate the hunk by matching context lines.
The parser (find_diffs()) scans for ```diff fenced blocks and calls process_fenced_block() on each. Inside the hunk, each line’s first character is the operation marker:
- space = context (must match the file)
- `-` = delete this line
- `+` = insert this line
hunk_to_before_after() extracts the before/after text. The application then calls do_replace() with the same fallback stack as SEARCH/REPLACE.
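Conceptually, the extraction is a single pass over the hunk. A minimal Python sketch of the before/after split (illustrating the idea, not Aider’s exact implementation):

```python
def hunk_to_before_after(hunk_lines: list[str]) -> tuple[str, str]:
    """Split unified-diff hunk lines into the expected 'before' text
    and the resulting 'after' text. Context lines land in both."""
    before, after = [], []
    for line in hunk_lines:
        op, content = line[:1], line[1:]
        if op == "-":
            before.append(content)
        elif op == "+":
            after.append(content)
        else:  # context line
            before.append(content)
            after.append(content)
    return "".join(before), "".join(after)
```

With the before/after pair in hand, application reduces to the same find-and-substitute problem SEARCH/REPLACE already solves, which is why the fallback stack can be shared.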
Two error types can occur:
- `UnifiedDiffNoMatch` — the before-lines don’t appear in the file. The error message shows the expected lines and suggests nearby similar content.
- `UnifiedDiffNotUnique` — the before-lines appear in multiple locations. The fix is adding more context lines to disambiguate.
Udiff was designed specifically to suppress “lazy coding” in GPT-4 Turbo — older model versions had a tendency to write # ... rest of code unchanged ... inside diff blocks when they didn’t want to reproduce unchanged sections. Unified diff makes this impossible because every line must be explicitly marked.
### Format 3: Whole File (`whole`)

**Files**: `wholefile_coder.py`, `wholefile_prompts.py`
The simplest possible approach. The LLM returns the complete contents of every modified file, fenced by backticks:
````
path/to/file.py
```python
# entire updated file content
# ...
```
````

The system prompt explicitly forbids elision: `*NEVER* skip, omit or elide content from a file listing using "..." or by adding comments like "... rest of code..."`. The parser (`get_edits()`) splits the response on fence markers and extracts the filename from the line immediately preceding the opening fence.
Filename extraction uses multiple heuristics: strip markdown formatting (`**file**`, `` `file` ``, `#file`), reject paths over 250 characters, detect when the LLM prepended a bogus directory from the prompt example, and match by basename if the full path isn't in the chat file list.
Application (`apply_edits()`) is trivial — just `io.write_text(full_path, new_content)`.
This format is the fallback for models that can't reliably produce the structured syntax of diff or SEARCH/REPLACE. It's reliable but expensive. For a 500-line file where only 3 lines change, the LLM must output all 500 lines.
### Format 4: Diff-Fenced (`diff-fenced`)
**Files**: `editblock_fenced_coder.py`, `editblock_fenced_prompts.py`
A variant of SEARCH/REPLACE where the file path appears inside the fence rather than before it:
```python
<<<<<<< SEARCH path/to/file.py
old code
=======
new code
>>>>>>> REPLACE
```

Created specifically for Gemini models, which consistently mishandled the standard format’s requirement to put the filename before the opening fence. The parser is a thin wrapper around `find_original_update_blocks()` with a regex that extracts the path from the SEARCH line header.
### Format 5: Editor Formats (`editor-diff`, `editor-whole`)

**Files**: `editor_editblock_coder.py`, `editor_whole_coder.py`
Two streamlined variants designed for Aider’s “architect mode.” In architect mode, a high-capability model reasons about the changes needed and a separate “editor” model produces the actual syntax. The editor model receives a simpler prompt — just the task and the files, no elaborate format instructions — because its sole job is syntactic transformation, not reasoning.
- `editor-diff` is `editor_editblock_coder.py` — same SEARCH/REPLACE parsing, shorter prompt.
- `editor-whole` is `editor_whole_coder.py` — same whole-file parsing, shorter prompt.
### Fence Format Tolerance

All parsers accept multiple fence styles, because LLMs frequently deviate from the instructed style:

```python
all_fences = [
    ("`" * 3, "`" * 3),        # ```
    ("`" * 4, "`" * 4),        # ````
    wrap_fence("source"),      # <source></source>
    wrap_fence("code"),        # <code></code>
    wrap_fence("pre"),         # <pre></pre>
    wrap_fence("codeblock"),   # <codeblock></codeblock>
    wrap_fence("sourcecode"),  # <sourcecode></sourcecode>
]
```

### Format Selection Logic

Aider auto-selects the format per model. From the Aider docs: “Aider is configured to use the best edit format for the popular OpenAI and Anthropic models. For lesser known models aider will default to using the ‘whole’ editing format since it is the easiest format for an LLM to use.”
Concretely: GPT-4 class and Claude Sonnet/Opus class → diff (SEARCH/REPLACE). Gemini models → diff-fenced. Weaker or unknown models → whole. Users can override with --edit-format.
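The selection logic amounts to a prefix table with a conservative default. A minimal sketch (the model-name prefixes here are illustrative):

```python
MODEL_FORMAT_DEFAULTS = [
    ("claude-", "diff"),
    ("gpt-4", "diff"),
    ("gemini-", "diff-fenced"),
]

def default_edit_format(model: str) -> str:
    """Pick the default edit format for a model by name prefix."""
    for prefix, fmt in MODEL_FORMAT_DEFAULTS:
        if model.startswith(prefix):
            return fmt
    return "whole"  # easiest format for unknown or weaker models

assert default_edit_format("gemini-1.5-pro") == "diff-fenced"
assert default_edit_format("my-local-llm") == "whole"
```

Defaulting to `whole` for unknown models trades token cost for compliance, which is the safe direction to fail in.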
### Benchmark Results (from Aider Leaderboard)

From aider.chat/docs/leaderboards/edit.html:
| Model | Format | Score |
|---|---|---|
| o1, Claude 3.5 Sonnet (20241022) | diff | 84.2% |
| Gemini-exp-1206 | whole | 80.5% |
| o1-mini | whole | 70.7% |
The leaderboard measures two dimensions: task completion accuracy (did the code change solve the problem?) and format compliance (did the model follow the format without errors requiring human correction?). The diff format dominates the top scores because it uses fewer tokens and forces the model to be precise about what it’s changing.
## Codex CLI Implementation

**Reference**: `references/codex/codex-rs/apply-patch/` | **Commit**: `4ab44e2c5`
Codex (the CLI tool open-sourced by OpenAI) uses a custom text-based patch format called the Codex Patch Format, not standard unified diff. The format uses sentinel lines rather than embedded delimiters, making it unambiguous to parse:
```
*** Begin Patch
*** Add File: src/new_module.rs
+pub fn hello() {}
*** Update File: src/main.rs
@@ fn main() {
-    println!("old");
+    println!("new");
*** Delete File: src/deprecated.rs
*** End Patch
```

Three hunk types are supported within one patch:
- `*** Add File: <path>` — lines prefixed with `+` become the new file’s content
- `*** Update File: <path>` — `@@` provides context for seeking, `-`/`+` lines make changes
- `*** Delete File: <path>` — no diff payload; the file is removed
- `*** Move to: <path>` — optional directive inside an Update block to relocate the file
The @@ context line (without line numbers) is used by the seeker to locate the target region. The seek_sequence() function in codex-rs/apply-patch/src/seek_sequence.rs uses a four-pass fuzzy matching strategy: exact match, trailing-whitespace trim, full trim, then Unicode normalization (smart quotes and em-dashes converted to ASCII). Replacements are applied in reverse order to prevent index shifting.
The crate’s entry point is apply_patch(&str) in codex-rs/apply-patch/src/lib.rs:174. The complete patch string is parsed into a Vec<Hunk> before any filesystem writes occur — application is fully atomic.
A single *** Begin Patch block can contain multiple *** Update File / *** Add File / *** Delete File hunks, so one model response can surgically touch many files. The model is instructed to produce the format via a system prompt, not by native training.
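Because the sentinel headers cannot collide with diff content, splitting a patch into per-file operations is straightforward. A minimal Python sketch that extracts just the headers (the real parsers also collect each hunk’s body):

```python
def parse_patch_headers(patch: str) -> list[tuple[str, str]]:
    """Return (action, path) pairs from a sentinel-style patch."""
    lines = patch.splitlines()
    if lines[0] != "*** Begin Patch" or lines[-1] != "*** End Patch":
        raise ValueError("missing patch sentinels")
    ops = []
    for line in lines[1:-1]:
        for action in ("Add File", "Update File", "Delete File"):
            prefix = f"*** {action}: "
            if line.startswith(prefix):
                ops.append((action, line[len(prefix):]))
    return ops
```

Validating both sentinels before touching any hunk is what makes the all-or-nothing application model easy to enforce.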
This format is also used by OpenCode (packages/opencode/src/patch/index.ts) with an identical TypeScript parser — the same four-pass seek, same reverse-order apply, same *** Begin Patch sentinels.
## OpenAI Native apply_patch API Tool

**Source**: platform.openai.com/docs/guides/tools-apply-patch | **Supported models**: GPT-5.1, GPT-5.3 (Codex), and all subsequent OpenAI models
This is the most significant architectural departure in the space: OpenAI has baked the apply_patch tool directly into their models’ training weights. You do not define a schema for it. You do not write system prompt instructions explaining the format. You enable it with a single entry in the tools array and the model already knows exactly how to produce correct, structured patch operations:
```python
response = client.responses.create(
    model="gpt-5.1",
    input=RESPONSE_INPUT,
    tools=[{"type": "apply_patch"}],  # that's it — no schema, no instructions
)
```

Every other edit format in this document is a prompt-level convention: the application tells the model “please use this format” and the model tries to comply. OpenAI’s apply_patch is a training-level primitive: the model has internalized the format the same way it has internalized Python syntax. Compliance is not approximate — the model produces valid operation objects as reliably as it produces valid JSON when asked.
### What the Model Emits

The Response output contains one or more `apply_patch_call` objects. One response can carry operations against any number of files simultaneously — rename a symbol used across ten files, and the model emits ten `apply_patch_call` objects in a single turn:
```json
{
  "id": "apc_08f3d96c87a585390069118b594f7481a088b16cda7d9415fe",
  "type": "apply_patch_call",
  "status": "completed",
  "call_id": "call_Rjsqzz96C5xzPb0jUWJFRTNW",
  "operation": {
    "type": "update_file",
    "path": "lib/fib.py",
    "diff": "@@\n-def fib(n):\n+def fibonacci(n):\n     if n <= 1:\n         return n\n-    return fib(n-1) + fib(n-2)\n+    return fibonacci(n-1) + fibonacci(n-2)\n"
  }
}
```

### Operation Types

Three operations are available:
| Operation | Purpose | Payload |
|---|---|---|
| `create_file` | Create a new file at `path` | `diff` is a V4A diff of the full file contents (all `+` lines) |
| `update_file` | Modify an existing file at `path` | `diff` is a V4A diff with `@@` context, `-`/`+` lines |
| `delete_file` | Remove a file at `path` | No diff payload; file is deleted entirely |
### The V4A Diff Format

The diff embedded in each `operation.diff` field is V4A format — a simplified unified diff without file headers or line numbers:
```diff
@@
-def fib(n):
+def fibonacci(n):
     if n <= 1:
         return n
-    return fib(n-1) + fib(n-2)
+    return fibonacci(n-1) + fibonacci(n-2)
```

Key characteristics:
- `@@` is a section marker that begins a hunk — no `+N,M` line numbers
- Lines prefixed with `-` are removed from the file
- Lines prefixed with `+` are inserted
- Lines prefixed with a space (or no prefix) are unchanged context
- The file path is in `operation.path`, not embedded in the diff — there are no `--- a/file` / `+++ b/file` headers
- Multiple `@@` sections can appear in one diff to address separate locations in the same file
This design separates concerns cleanly: the structured JSON carries the metadata (path, operation type), and the diff carries only the change. Parsers never need to extract a filename from diff text.
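To make the division of labor concrete, here is a minimal Python applier for a single-hunk V4A diff. It does exact context matching only; a production applier adds the fuzzy seeking described elsewhere on this page:

```python
def apply_v4a(original: str, diff: str) -> str:
    """Apply a one-hunk V4A diff to a file's text (exact matching only)."""
    src = original.splitlines()
    before, after = [], []
    for line in diff.splitlines():
        if line.startswith("@@"):
            continue  # section marker; this sketch handles a single hunk
        tag, rest = line[:1], line[1:]
        if tag == "-":
            before.append(rest)
        elif tag == "+":
            after.append(rest)
        else:  # space (or no prefix) = context, present on both sides
            text = rest if tag == " " else line
            before.append(text)
            after.append(text)
    for i in range(len(src) - len(before) + 1):
        if src[i:i + len(before)] == before:
            return "\n".join(src[:i] + after + src[i + len(before):]) + "\n"
    raise ValueError("hunk context not found in file")
```

Note how the applier never touches a path: the harness already knows which file this diff belongs to from `operation.path`.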
### The Feedback Loop

After applying patches, the harness reports results back per `call_id`. The model uses this to recover from failures:
```
# successful application
{
  "type": "apply_patch_call_output",
  "call_id": "call_Rjsqzz96C5xzPb0jUWJFRTNW",
  "status": "completed",
  "output": "Applied patch to lib/fib.py"
}
```

```
# failed application
{
  "type": "apply_patch_call_output",
  "call_id": "call_cNWm41dB3RyQcLNOVTIPBWZU",
  "status": "failed",
  "output": "Could not apply patch to lib/foo.py — file not found on disk"
}
```

The model receives these outputs in the next turn and can reissue corrected operations, ask for more file context, or explain what it attempted. This structured error channel is qualitatively better than the Aider model of embedding error text in the conversation: the model gets a typed status code per file, not a blob of prose to parse.
### Why This Changes Everything

The format comparison so far in this document has been about how to get models to reliably follow a prompt-level convention. OpenAI’s approach sidesteps that problem entirely:
| Property | Prompt-level formats (SEARCH/REPLACE, udiff, whole) | OpenAI native apply_patch |
|---|---|---|
| Format compliance source | System prompt instructions | Model training weights |
| Compliance rate | Model-dependent, degrades with context length | Native — as reliable as JSON generation |
| Multi-file in one turn | Via multiple blocks in response text | Native — multiple apply_patch_call objects |
| Error feedback channel | Prose in conversation history | Typed apply_patch_call_output per call_id |
| Harness complexity | Parse text, fuzzy match, recover from malformed output | Parse JSON, apply V4A diff, return status |
| Model portability | Works with any text-generating model | Requires OpenAI GPT-5.1+ (or fine-tuned equivalent) |
| Schema definition required | Yes — elaborate system prompt | No — just {"type": "apply_patch"} in tools array |
The capability for surgical multi-file edits is particularly notable. When GPT-5.1 renames a function used across a codebase, it can emit one apply_patch_call per affected file in the same response. The harness applies them in whatever order it chooses (or in parallel), reports all statuses back, and the model knows exactly which files succeeded and which need retry. This is architecturally impossible with text-embedded formats like SEARCH/REPLACE, where the entire multi-file batch is one opaque text blob.
### Harness Implementation (Agents SDK)

The TypeScript Agents SDK exposes `applyDiff()` for interpreting V4A diffs, so the harness reduces to three filesystem operations:
```typescript
import { applyDiff, applyPatchTool, Agent, run } from "@openai/agents";

class WorkspaceEditor implements Editor {
  async createFile(op) {
    const content = applyDiff("", op.diff, "create");
    await fs.writeFile(op.path, content);
    return { status: "completed", output: `Created ${op.path}` };
  }
  async updateFile(op) {
    const current = await fs.readFile(op.path, "utf-8");
    const updated = applyDiff(current, op.diff);
    await fs.writeFile(op.path, updated);
    return { status: "completed", output: `Updated ${op.path}` };
  }
  async deleteFile(op) {
    await fs.unlink(op.path);
    return { status: "completed", output: `Deleted ${op.path}` };
  }
}
```

The `applyDiff()` function handles V4A parsing and context-seeking internally. For Python, the equivalent is available via the OpenAI Python Agents SDK.
### Model Availability

As of the time of writing (February 2026):

- **GPT-5.1**: first model with native `apply_patch` support
- **GPT-5.3 (Codex)**: OpenAI’s code-specialized model, also native — this is the model the Codex CLI was updated to use
- All subsequent OpenAI models in the GPT-5.x series support the tool natively
The Codex CLI (codex-rs) predates the API-level tool and uses the *** Begin Patch text format described in the previous section. The API-level apply_patch tool is the successor approach for applications calling the OpenAI API directly.
## OpenCode Implementation

**Reference**: `references/opencode/packages/opencode/src/session/` | **Commit**: `7ed449974864361bad2c1f1405769fd2c2fcdf42`
OpenCode delegates all file modification to a dedicated edit tool in its tool registry. The LLM is instructed to call the tool with path, old_string, and new_string parameters rather than producing structured text blocks. Tool calls go through the provider layer as JSON, so the parsing problem is handled by the model provider’s function-calling implementation.
When the LLM calls the edit tool, OpenCode’s VFS (virtual filesystem) layer applies the change, computes a diff for display in the TUI, and stages the result for potential undo. The “old_string must appear exactly once in the file” constraint is enforced at application time — if the old string appears zero times or multiple times, the tool returns an error to the model and asks it to retry with more context.
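The uniqueness contract is only a few lines of code. A Python sketch of the idea (OpenCode’s actual implementation is TypeScript):

```python
def apply_exact_edit(content: str, old_string: str, new_string: str):
    """Return (new_content, error). Enforces 'old_string appears exactly once'."""
    count = content.count(old_string)
    if count == 0:
        return None, "old_string not found; re-read the file and retry"
    if count > 1:
        return None, f"old_string matches {count} locations; include more surrounding context"
    return content.replace(old_string, new_string, 1), None
```

Returning the error text to the model, rather than raising, is what turns a failed match into a self-correcting retry loop.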
This is functionally identical to Codex’s approach: offload the format problem to structured output and let the LLM produce semantically correct tool calls. Both OpenCode and Codex avoid building fuzzy recovery layers at the cost of requiring models that support tool use.
## Pitfalls and Hard Lessons

### Exact Match is the Wrong Default

The first instinct — require the SEARCH block to match exactly — fails immediately in practice. Auto-formatters (Black, Prettier, rustfmt) can change whitespace in files between the model reading them and applying edits. Git operations can normalize line endings. Manual edits by the user during the session can shift lines. Any production implementation must have at least indentation-tolerant matching from day one, not as a future enhancement.
### The Ellipsis Problem

LLMs learn from code examples where `# ...` and `...` are common placeholders. When instructed to fill in a SEARCH block, weaker models will “helpfully” abbreviate unchanged sections with ellipsis markers. This breaks exact matching. Aider’s `try_dotdotdots()` function in `editblock_coder.py` handles the most common case — matching `...` as a wildcard for any content — but it’s a fragile heuristic that breaks on code files where `...` is a valid token (Python Ellipsis literal, TypeScript spread operators, Rust range expressions).
### Token Counting for the Prompt

Whole-file format is expensive in input tokens too, not just output. If a 2,000-line file is in the chat context and the user asks for a one-line change, the whole file appears in every request turn. SEARCH/REPLACE avoids this by only requiring the changed region — but the model must still “know” the file well enough to write a correct SEARCH block, which means the file should be in context at least once per session.
### Format Compliance Degrades with Context Length

In very long conversations, model compliance with format instructions degrades. The format instructions appear at the top of the system prompt, but models attend more to recent context. After 50 turns, a model that was producing perfect SEARCH/REPLACE blocks may start mixing formats or emitting partial blocks. Aider’s parser handles this by accepting multiple format variants simultaneously and falling back gracefully, but the root cause is unresolvable without periodic system prompt reinforcement.
### Function Calling Isn’t Free of Errors Either

Codex and OpenCode’s function-call approach trades text parsing errors for tool-call errors. The model can call the edit tool with an `old_string` that doesn’t appear in the file (same failure mode as a bad SEARCH block), or call it with an `old_string` that matches multiple locations. These errors are arguably cleaner to handle — the tool returns a typed error, not ambiguous text — but they still require retry logic and recovery heuristics.
## OpenOxide Blueprint

OpenOxide will implement SEARCH/REPLACE as the primary format with the same multi-strategy recovery stack as Aider, but with a plugin architecture that allows alternative formats to be selected per model. The implementation lives in `openoxide-edit`.
### Core Trait: `EditFormat`

```rust
pub trait EditFormat: Send + Sync {
    /// The string identifier ("diff", "udiff", "whole", "diff-fenced")
    fn name(&self) -> &'static str;

    /// Parse the LLM's raw text response into a list of pending edits.
    fn parse_response(&self, response: &str) -> Result<Vec<PendingEdit>, ParseError>;

    /// Apply a single edit to the given file content.
    fn apply_edit(
        &self,
        content: &str,
        edit: &PendingEdit,
    ) -> Result<String, ApplyError>;
}

pub struct PendingEdit {
    pub path: PathBuf,
    pub kind: EditKind,
}

pub enum EditKind {
    SearchReplace { search: String, replace: String },
    UnifiedDiff { hunks: Vec<Hunk> },
    WholeFile { content: String },
}
```

### SEARCH/REPLACE Parser: `SearchReplaceFormat`
The parser uses the same 5–9 angle-bracket tolerance as Aider:

```rust
static HEAD: Lazy<Regex> =
    Lazy::new(|| Regex::new(r"(?m)^<{5,9} SEARCH>?\s*$").unwrap());
static DIVIDER: Lazy<Regex> =
    Lazy::new(|| Regex::new(r"(?m)^={5,9}\s*$").unwrap());
static UPDATED: Lazy<Regex> =
    Lazy::new(|| Regex::new(r"(?m)^>{5,9} REPLACE\s*$").unwrap());
```

The parser emits (path, search_text, replace_text) triples. The file path is extracted from the 3 lines preceding the SEARCH marker, using the same “last non-fence, non-empty line” heuristic as Aider.
### Fallback Application Stack: `FuzzyApplier`

```rust
pub struct FuzzyApplier {
    strategies: Vec<Box<dyn ApplyStrategy>>,
}

impl FuzzyApplier {
    pub fn search_replace_default() -> Self {
        Self {
            strategies: vec![
                Box::new(ExactMatch),
                Box::new(RelativeIndentMatch),
                Box::new(DmpLinesApply),
            ],
        }
    }

    pub fn apply(&self, content: &str, search: &str, replace: &str) -> Option<String> {
        for strategy in &self.strategies {
            for preproc in [Preproc::None, Preproc::StripBlanks, Preproc::RelativeIndent] {
                if let Some(result) = strategy.try_apply(content, search, replace, preproc) {
                    return Some(result);
                }
            }
        }
        None
    }
}
```

`DmpLinesApply` uses the `dissimilar` crate (a Rust port of diff-match-patch) for line-level fuzzy application. `RelativeIndentMatch` implements the `RelativeIndenter` concept from Aider — convert to relative indentation, match, convert back.
### Format Registry

Formats are registered at startup and selected by model name:

```rust
pub struct FormatRegistry {
    formats: HashMap<&'static str, Arc<dyn EditFormat>>,
    model_defaults: HashMap<&'static str, &'static str>,
}

impl FormatRegistry {
    pub fn default_for_model(&self, model: &str) -> Arc<dyn EditFormat> {
        let key = self.model_defaults.get(model).copied().unwrap_or("whole");
        self.formats[key].clone()
    }
}
```

Default assignments:
- `gpt-5*` → `"apply_patch"` (OpenAI native tool — no schema needed, multi-file per turn)
- `claude-*`, `gpt-4*`, `o1*`, `o3*` → `"diff"` (SEARCH/REPLACE)
- `gemini-*` → `"diff-fenced"`
- Everything else → `"whole"`
Users can override via `--edit-format` in the CLI.
### Native apply_patch Adapter

For OpenAI models that support the native tool, add an `ApplyPatchFormat` variant that bypasses text parsing entirely. The `EditFormat` trait gets a new method with a default implementation:
```rust
pub trait EditFormat: Send + Sync {
    fn name(&self) -> &'static str;

    /// Returns Some(tool_definition) if this format uses an API-level tool
    /// rather than text parsing. None means prompt-based text format.
    fn api_tool(&self) -> Option<serde_json::Value> {
        None
    }

    fn parse_response(&self, response: &str) -> Result<Vec<PendingEdit>, ParseError>;
    fn apply_edit(&self, content: &str, edit: &PendingEdit) -> Result<String, ApplyError>;
}

pub struct ApplyPatchFormat;

impl EditFormat for ApplyPatchFormat {
    fn name(&self) -> &'static str {
        "apply_patch"
    }

    fn api_tool(&self) -> Option<serde_json::Value> {
        Some(json!({"type": "apply_patch"}))
    }

    fn parse_response(&self, _response: &str) -> Result<Vec<PendingEdit>, ParseError> {
        // Parsing happens at the API response level (apply_patch_call objects),
        // not from response text. This method is unused for native tool formats.
        Err(ParseError::UseApiParser)
    }

    fn apply_edit(&self, content: &str, edit: &PendingEdit) -> Result<String, ApplyError> {
        // V4A diff application — same seek_sequence logic as Codex apply-patch crate.
        // Assumes the PendingEdit produced by the API-level parser carries the raw V4A diff.
        apply_v4a_diff(content, &edit.diff)
    }
}
```

The session loop checks `format.api_tool()` and, if `Some`, injects the tool definition into the API request. The response handler reads `apply_patch_call` objects from `response.output` instead of parsing text. Per-call success/failure is returned via `apply_patch_call_output` events in the follow-up request.
### V4A Diff Application in Rust

The V4A parser is simpler than Codex’s `*** Begin Patch` format because paths are in the JSON, not the diff text:
```rust
pub fn apply_v4a_diff(original: &str, diff: &str) -> Result<String, ApplyError> {
    let hunks = parse_v4a_hunks(diff)?; // split on @@ markers
    let lines: Vec<&str> = original.lines().collect();
    let mut result = lines.clone();
    // Apply hunks in reverse order to preserve line indices
    for hunk in hunks.iter().rev() {
        let ctx_start = seek_context(&result, &hunk.context_lines)?;
        apply_hunk(&mut result, ctx_start, hunk)?;
    }
    Ok(result.join("\n"))
}
```

Reuse Codex’s `seek_sequence.rs` fuzzy-matching logic for `seek_context()` — same four-pass strategy (exact → rstrip → trim → unicode-normalize).
### Error Reporting

On application failure, `ApplyError` carries enough context to produce a meaningful error message for the LLM to retry:

```rust
pub enum ApplyError {
    NotFound { path: PathBuf, search_text: String, nearest: Option<String> },
    Ambiguous { path: PathBuf, search_text: String, match_count: usize },
    IoError(std::io::Error),
}
```

`nearest` is populated by a similarity search (using the `strsim` crate’s `jaro_winkler` metric) against the file’s lines, giving the LLM a hint about why the match failed.