flue: add generic code review alongside style-guide review by mvvmm · Pull Request #31562 · cloudflare/cloudflare-docs

mvvmm · 2026-06-18T16:50:47Z

Adds a generic engineering code review to the Flue bot. Both the code review and existing style-guide review now post to a single bot comment split into ### Code Review and ### Style Guide Review sections.

Architecture

Both reviewers run as separate specialist workflows (code-review-specialist, style-guide-specialist), each in their own Durable Object isolate with its own 128 MB memory budget. The orchestrator admits both and polls them concurrently; one failing degrades to an empty section rather than aborting the whole review.

Code review specialist

Reviews all changed text files (excluding lockfiles, generated output, binary/image files), capped at 20 files (largest-diff-first).
Trusted TypeScript pre-stages the data before any model call: added lines are parsed from the diff patch (accurate new-file line numbers), and the full file content is fetched at the head SHA. Both are passed directly as args.addedLines and args.fileContent, eliminating the mandatory setup turns the model previously spent reading manifests and fetching files.
read_repo_file (defaults to PR head SHA) and search_repo remain available as optional tools for cross-file lookups (callers, import sites, etc.).
The root AGENTS.md is fetched from the PR base ref and injected as agent instructions so the reviewer has repo conventions without a checkout. (Base ref, not head, to prevent a PR from poisoning the reviewer's instructions.)
Findings use a critical / warning / suggestion severity hierarchy with CR- IDs.

Style guide specialist

Reviews only src/content/(docs|partials|changelog)/**.mdx, capped at 20 files.
Reads the diff and full file content from the staged workspace via the code tool.
warning / suggestion severities only.

Memory and timeout

Each per-file agent session is deleted immediately after it finishes (session.delete()), bounding peak heap to ~concurrency live sessions rather than growing with the file count.
Per-file hard timeout (10 min, env-overridable via CODE_REVIEW_FILE_TIMEOUT_MS / STYLE_GUIDE_FILE_TIMEOUT_MS): on timeout, the CallHandle is aborted first (settling the operation) and then the session is deleted, preventing zombie session leaks.
Concurrency is env-overridable (CODE_REVIEW_CONCURRENCY, STYLE_GUIDE_CONCURRENCY) so local dev can run lower values without touching prod defaults.

Other

Reconciler handles each stream (code, style) independently against its prior findings + human comments; persists { code, style } to R2 (legacy bare-array shape tolerated).
Incremental diff mode (SHA-pinned from last reviewed head to current head) when a prior review exists; full diff otherwise. Specialists self-heal incremental → full if the base SHA is gone.
Cadence unchanged: automatic on open/synchronize/ready_for_review (up to 2-review cap), plus /review, /full-review, and /ignore-review-limit codeowner commands.
Beta disclaimer rendered under the Code Review heading.

Validated

tsc --noEmit, flue build, prettier. Verified end-to-end in log mode and comment mode against real PRs.

Adds a bonk-style generic engineering review that runs in the same orchestrator run as the existing style-guide review and posts findings in the same bot comment, now split into "### Code Review" and "### Style Guide Review" sections. - New code-review skill: reviews all changed files (including code examples in MDX) for bugs, error handling, security, dead code, and maintainability. No style/prose checks; defers CI-caught issues. - New in-process fan-out (lib/code-review-inproc.ts) mirroring the style-guide one: per-file sessions, concurrency 5, cap 20 files. - Code findings add a `critical` severity above warning/suggestion and use the CR- id namespace; reconcile skill preserves it. - Code-review agent gets full-file context via GitHub-API-backed tools (read_repo_file defaults to the PR head SHA, search_repo). - The repo root AGENTS.md is fetched at the head SHA and injected as agent instructions every run, since the Worker has no repo checkout to discover it from. - Orchestrator runs both reviews concurrently (one failing degrades to empty rather than aborting), reconciles per stream, persists { code, style } findings to R2 (legacy bare-array tolerated), and renders one unified comment. Cadence is unchanged: this rides the existing code-review-orchestrator triggers (auto on PRs, /review, /full-review).

github-actions · 2026-06-18T16:51:37Z

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

Pattern	Owners
`*`	`@cloudflare/product-owners`
`*.ts`	`@cloudflare/content-engineering`, `@kodster28`
`package.json`	`@cloudflare/content-engineering`

github-actions · 2026-06-18T17:18:45Z

Preview URL: https://df8c587c.preview.developers.cloudflare.com
Preview Branch URL: https://code-review.preview.developers.cloudflare.com

Fixes surfaced by the bot reviewing its own PR (#31562): - getRepoFileContent: decode base64 as UTF-8 via TextDecoder (atob alone produced mojibake for non-ASCII content such as em dashes in AGENTS.md); URL-encode path segments; return null only on 404 and throw on other non-2xx so "absent" is distinct from "failed to load". - Fetch the root AGENTS.md from the PR base ref instead of the head SHA. The content is injected into agent instructions, so reading it from the head let a PR poison AGENTS.md to inject instructions into the reviewer. - Render: a degraded (failed) review section now shows a failure note instead of "No issues found", and the status line notes the partial failure; carried-forward findings still render. Orchestrator passes the per-stream failure flags into renderComment. - Render: match acknowledged findings by stable id, not path:line:rule, so two findings sharing those fields are not both dropped. - github-repo-tools.ts: update stale file header (now shared by the Dependabot and code-review agents).

Running the code-review and style-guide fan-outs concurrently kept ~10 model sessions live in a single Durable Object isolate and over-ran its 128 MB limit, producing repeated "isolate exceeded its memory limit and was reset" errors (and downstream event-stream append failures). - Run the two fan-outs sequentially instead of via Promise.all, so one fan-out's sessions are reclaimed before the next starts (roughly halves peak heap). Per-stream failure isolation is preserved. - Lower code-review concurrency from 5 to 3, since its sessions are heavier (full-file reads + injected AGENTS.md) than style-guide ones. Trade-off: review wall-clock is now the sum of the two fan-outs rather than the max — acceptable for a background review bot.

Move the code-review and style-guide fan-outs out of the orchestrator DO into their own discovered workflows, so each runs in its own Durable Object isolate (own 128 MB budget). This removes the shared-isolate memory contention that caused "isolate exceeded its memory limit and was reset" errors and restores parallel execution. - New workflows code-review-specialist + style-guide-specialist (FlueCodeReviewSpecialistWorkflow / FlueStyleGuideSpecialistWorkflow). Each self-fetches its own diff once (no per-file fetch, no R2 diff staging), stages it into its own workspace, runs the existing in-process fan-out (reused verbatim), and returns findings as its run result. - Orchestrator now decides diff mode only (no file fetch), admits both specialists via admitWorkflow and polls them concurrently via pollRun (the spam-filter pattern). A timed-out/errored specialist degrades to an empty section with its failure flag set; both failing -> failure comment. - Specialists self-heal incremental -> full diff if the base SHA is gone. - Code-review concurrency restored to 5 (own isolate now). - New shared lib/review-specialist.ts: payload contract + helpers. - writeDiffToWorkspace accepts a minimal DiffPullRequest shape. - wrangler v5 migration adds the two new SQLite DO classes. Supersedes the sequential-fan-out workaround; the fan-outs no longer run in the orchestrator DO at all.

…memory limit

The code-review specialist timed out on large PRs. AI Gateway showed all model calls succeeding (0 failures) but slow (p90 ~47s, p99 ~103s), so the cause was DO memory pressure plus total wall-clock, not model errors: - A 13-file run at concurrency 5 hard-OOM'd the specialist isolate ("SQL query failed: ... memory limit") and never reached run_end. - Even without OOM, 13 multi-turn file sessions exceeded the orchestrator's 10-minute poll deadline. Fixes: - CODE_REVIEW_MAX_FILES 20 -> 10. Each code-review file is a multi-turn agent session, so this bounds both peak isolate memory and wall-clock. Covers virtually all docs PRs; larger PRs review their 10 largest files. - Orchestrator specialist poll timeout 10m -> 20m, since a legitimate large review is many slow model calls and the specialist runs durably in its own DO. - Concurrency stays at 3 (memory).

Delete each per-file agent session as soon as it finishes so harness memory is bounded to ~concurrency live sessions instead of growing with the changed-file count. With the leak fixed, restore the code-review cap to 20 files and concurrency to 5, and drop the dead reserveTokens hint.

Pre-stage added lines + file content into skill args (code-review-inproc.ts, code-review-specialist.ts, SKILL.md): trusted TypeScript now parses the diff and fetches the full file at head SHA before each session opens, eliminating the mandatory setup turns (read manifest → read patch → parse lines → read_repo_file). Cross-file tools remain available for optional lookups. Removes diff workspace staging from the code-review path entirely. Add external cron watchdog (review-watchdog.ts, review-watchdog-decide.ts, run-state.ts): a scheduled handler fires every 2 minutes, detects reviews stuck past a deadline via inflight R2 markers, and re-drives them. Workflows are non-resumable after DO interruptions; the watchdog is the only reliable recovery path. Inflight markers are written at review start and cleared on completion (code-review-state.ts, code-review-orchestrator.ts). Per-file hard timeout now aborts the CallHandle before deleting the session (code-review-inproc.ts, style-guide-inproc.ts): session.delete() rejects while an operation is active, so abort must settle the operation first. Fixes zombie session leak on timeout. Env-overridable concurrency and file timeout (lib/env.ts): both specialists read CODE_REVIEW_CONCURRENCY, CODE_REVIEW_FILE_TIMEOUT_MS, STYLE_GUIDE_CONCURRENCY, STYLE_GUIDE_FILE_TIMEOUT_MS so local dev can run lower values without touching prod defaults. Staged-diff cleanup in style-guide-specialist.ts: removeWorkspacePath in try/finally so the run-scoped SQLite rows are freed after each review. Added removeWorkspacePath to connectors/cloudflare-shell.ts. Wrangler: cron trigger every 2 min, DOCS_FLUE_BASE_URL var for watchdog. Tooling: flue:reset:local script, NODE_OPTIONS on flue:dev, fixed clear-r2-pr-data.ts to cover both state paths and inflight/ prefix.

The scheduled watchdog was never functional in prod: Flue drops non-HTTP handlers from app.ts (they require cloudflare.ts), so the cron fired every 2 minutes and logged "Handler does not export a scheduled() function" 30 times per hour without ever invoking runReviewWatchdog(). Reviews stopped getting stuck independently — the skill rewrite + per-file session.delete() + concurrency limits reduced orchestrator interruptions to near-zero. The watchdog's run-state detection and re-admission path were also untested end-to-end, so enabling it carried real risk. Removes: review-watchdog.ts, review-watchdog-decide.ts, run-state.ts, all inflight marker API from code-review-state.ts, all marker call sites from code-review-orchestrator.ts, triggers.crons + vars.DOCS_FLUE_BASE_URL from wrangler.jsonc. Reverts app.ts to canonical export default app. Keeps: markAutoReviewCompleted, renderFailureComment, bypassReviewLimit (all used independently). bin/clear-r2-pr-data.ts inflight/ handling kept for one-time cleanup of orphaned prod R2 markers.

An absent triggers key leaves existing cron triggers intact on deploy; triggers.crons=[] forces wrangler to PUT an empty schedule list, actually deregistering the leftover */2 cron.

1. Extract withConcurrency to lib/inproc-utils.ts; import from both code-review-inproc.ts and style-guide-inproc.ts instead of duplicating. 2. Fix read_repo_file tool UTF-8 decoding: use TextDecoder instead of bare atob() so non-ASCII content (em dashes, CJK, smart quotes) is not mojibake. Also encode path segments with encodeURIComponent, matching the getRepoFileContent fix already applied to github.ts. 3. Fix parseAddedLines: skip lines starting with backslash ('\ No newline at end of file') instead of treating them as context lines. Previously this incremented newLine, producing off-by-one line numbers for every subsequent added line in files without a trailing newline. 4. Fix stale comments: code-review-diff.ts, style-guide-inproc.ts, and wrangler.jsonc v4 comment still described the old in-orchestrator-DO architecture; updated to reflect the specialist-DO split.

…d-mode commands Routes code review to one of two strategies based on combined diff size: Fan-out (≤50 KB): per-file sessions, concurrency 5, cap 20 files. Good for smaller PRs; maximum per-file detail, parallel sessions, per-file degradation. Holistic (>50 KB): single session over the entire PR diff in one pass, cap 50 files (lib/code-review-holistic.ts). Enables cross-file reasoning, no per-file setup overhead, faster and more reliable on large PRs where fan-out wedges. New code-review-holistic skill: rich substance from the per-file skill, reframed for whole-PR review, model-owned navigation. Comment heading reflects which ran: 'Holistic Code Review' or 'Fan Out Code Review'. Both paths log inputTokens/totalTokens for calibration. Two new codeowner slash commands: /fan-out-review — forces fan-out regardless of diff size (with warning in the comment that it may be slow/fail on large PRs) /holistic-review — forces holistic regardless of diff size forceReviewMode threads through orchestrate → orchestrator → review-specialist → code-review-specialist, overriding size-based routing when set. Also: selectCodeReviewFiles takes an optional maxFiles param (holistic passes 50, fan-out uses the existing 20 default); per-file usage logging (inputTokens, totalTokens) added to reviewSingleFile; progress logs committed from the previous session also included here.

…; fix partial run consuming auto-review slot /fan-out-review and /holistic-review now check getPullRequest().user.login for dependabot[bot] before dispatching, matching the /review and /full-review pattern. Dependabot PRs route to /workflows/dependabot-review without forceReviewMode; regular PRs keep the forced mode. markAutoReviewCompleted now only fires when both codeOutcome.ok and styleOutcome.ok are true. Previously a partial run (one specialist degraded) still consumed an auto-review slot, which blocked the retry the comment promised would happen on the next push.

… workflow Remove the 20-minute specialist poll from the code-review orchestrator (root cause of stuck reviews when the orchestrator DO was interrupted mid-wait). Replace with an event-driven finalize pattern: - Orchestrator is now dispatch-only: limit check, placeholder, diffMode, write context.json to R2, admit both specialists fire-and-forget, return. - Each specialist wraps its review in try/catch (degraded result on error), writes its stream result to R2, then races for a conditional-PUT finalize lock (If-None-Match: *). Exactly one specialist wins. - New finalize-review workflow (FlueFinalizeReviewWorkflow, v6 migration): reads context + stream results from R2, head-guards against newer pushes, idempotency-guards against double-finalize, reconciles both streams, persists review JSON, renders and posts the comment, swaps reactions, marks auto-review slot consumed, cleans up the pending namespace. - reviewMode is carried through context.json so finalize always uses the same mode the orchestrator decided on (fixes log-vs-comment split in local dev where DO env views can differ). - dispatchId scoping isolates concurrent same-head dispatches (e.g. auto-review + /full-review arriving simultaneously). - Zero new custom Durable Object bindings beyond the standard Flue workflow class (FlueFinalizeReviewWorkflow in wrangler.jsonc v6).

…lize A specialist DO that is hard-evicted mid-review never reaches the writeStreamResult call at the end, leaving no code.json/style.json for the other specialist to detect and claim the finalize lock — exactly the stuck-review scenario seen on PR #31736. Fix: each specialist now writes { ok: false, degradedResult } to R2 BEFORE starting its review. If the DO is evicted, the placeholder is already there. The placeholder is overwritten with the real result on success. Either way finalize always has two stream results and will run, posting a degraded-section comment rather than staying stuck forever. Also fixes the unused FinalizeContext import lint error.

The placeholder writes added in the previous commit caused a race: both specialists write their degraded placeholders within ~1-2s, the first to write checks the sibling's placeholder (which looks like a valid result), claims the lock, and finalize fires immediately with two ok:false results before either review ran — posting an instant failure comment. Fix: - Add final:boolean to StreamResultPayload (false=placeholder, true=real) - tryClaimFinalize now checks final:true on the sibling — a placeholder (final:false) is not treated as 'done', preventing premature triggering - Orchestrator writes both crash-protection placeholders (final:false) BEFORE admitting specialists so a key exists even if a specialist DO is evicted immediately after admission - Specialists write final:true in their rendezvous tail (removed the redundant per-specialist placeholder writes added previously)

…refactors A. Robustness (stuck-review fixes): - Wrap all specialist setup (token, diff fetch) + review in one try/catch; any setup failure now writes a final:true degraded result and triggers finalize rather than throwing and leaving the placeholder stuck at final:false. Extract shared reportSpecialistResult() helper into lib/finalize-rendezvous.ts to eliminate the duplicated rendezvous tail across both specialists. - Orchestrator admit failures overwrite the placeholder with final:true degraded; if both fail, orchestrator itself claims the finalize lock and admits finalize. Drop redundant type casts on the narrowed admit union. - finalize-review post-comment failure now returns {finalized:false} and skips markAutoReviewCompleted and cleanupPending so the next push can retry without burning an auto-review slot on an undelivered comment. - Degraded streams (ok:false) skip reconciliation and carry previous findings forward as active, preventing false resolution of prior issues. B. Defensive R2 / JSON: - Wrap obj.json() in try/catch in readContext, readStreamResult, and tryClaimFinalize; treat corrupted/partial writes as null/not-final. C. Safety: - Guard req in orchestrator with an explicit throw instead of req!. - safeOrigin() helper returns '' on URL parse failure in both specialists. - parseReviewSpecialistPayload validates pr field shapes and DiffMode union (incremental requires fromSha+toSha strings); normalizes baseUrl to a valid absolute http(s) origin via new URL().origin, dropping invalid values. D. Correctness: - Holistic reviewedFiles now built from patched files only, matching the diff actually sent to the model. - clearTimeout called immediately after await handle resolves in both holistic and per-file reviewSingleFile, preventing a late timer from mislabeling a post-resolve error as a timeout. - Finding ID hash widened from 6 to 12 hex chars (24→48 bit) in both CR-/SG- namespaces; line number deliberately kept excluded for cross-commit stability. - withConcurrency clamps limit to Math.max(1, Math.floor(limit)). E. Dev script (bin/clear-r2-pr-data.ts): - WHERE clause tightened to 'diffs/pr-%' and 'inflight/pr-%' (was 'diffs/%'). - Switch execSync string interpolation to execFileSync with argument arrays. - Freshen stale header comment. F. Refactors: - Collapse four near-identical /review /full-review /fan-out-review /holistic-review handlers in orchestrate.ts into one shared helper. - Merge duplicate cloudflare-shell imports in style-guide-specialist.ts. G. Docs: - AGENTS.md: add finalize-review row to workflow table; rewrite request flow to describe dispatch-only orchestrator, R2 rendezvous, final:bool flag, and finalize-review steps; add v6 migration; update slash-command list.

…th spiral The FlueCodeReviewSpecialistWorkflow and FlueStyleGuideSpecialistWorkflow DOs accumulated SQLite state from previous crashed runs. Each OOM leaves orphaned session data in the DO's storage. After several crashes the accumulated state is large enough that even the Flue alarm recovery handler OOMs on startup (~900ms, repeating every ~30-60s), creating a death spiral where no future review can complete. Fix: two-step migration to reset the storage: v7 — rename bloated instances to *Old (abandons their storage) v8 — recreate the classes with empty storage under the original names The wrangler bindings still point to the original names (FlueCodeReview- SpecialistWorkflow / FlueStyleGuideSpecialistWorkflow), so no code changes are needed. Run history prior to v7 is lost.

… OOM death spiral" This reverts commit fa5e960.

… rendezvous, style-guide rule additions - Generalise finalize rendezvous from 2 streams to N streams (EXPECTED_STREAMS constant; tryClaimFinalize loops all siblings in parallel; stream type widened to string; expectedStreams threaded through context, payload, and options) - Add conventions specialist (light AI session, conventions-check skill): checks PR title format, Summary content against fetched template, and redirect checklist when docs files are renamed/deleted - Add redirect specialist (pure TypeScript, no model): derives old+new URLs from file paths, skips format-only moves (foo.mdx → foo/index.mdx), splat-aware __redirects matching, degrades gracefully if file unreadable - Redirects reconcile deterministically in finalize (no LLM); conventions force full diff mode; bothFailed gate remains code+style only - Add previous_filename to PullRequestFile for correct rename handling - Add marketing-language and time-sensitive-content suggestions to style-guide core-content reference; add component should-use suggestions to code-blocks - Render: 4 sections (Code → Conventions → Style → Redirects); pr path renders as PR label; no-findings messages use <sub> (smaller, no emoji); top-level status line unchanged - wrangler.jsonc v7: new_sqlite_classes for both new DO classes - Fix idempotency guard to skip in log mode (guard is comment-mode only)

Codeowners can comment /disable-auto-review on a PR to stop push-triggered reviews from running. The flag is stored in R2 at diffs/pr-<n>/auto-review-disabled.json (same pattern as ignore-review-limit). Manual /review and /full-review commands (bypassReviewLimit=true) still work normally. The command is acknowledged with a thumbs-up reaction.

Holistic mode timed out on large PRs (>50 KB diff, single model pass, 15-min hard timeout) producing zero findings. Fan-out is strictly better: bounded per-file sessions, graceful per-file degradation, guaranteed completion on any diff size. - Delete lib/code-review-holistic.ts and .agents/skills/code-review-holistic/ - code-review-specialist.ts: always fan-out; remove routing logic, holistic imports, env reads, forceReviewMode handling - lib/code-review-inproc.ts: remove CODE_REVIEW_HOLISTIC_* constants and re-export; remove reviewMode from both return sites - lib/code-review-results.ts: remove reviewMode field from CodeReviewResult - lib/review-specialist.ts: remove forceReviewMode from payload type + parser - code-review-orchestrator.ts: remove forceReviewMode from payload + specialistBody - orchestrate.ts: remove /fan-out-review and /holistic-review commands - code-review-render.ts: heading always 'Code Review'; remove codeMode from RenderReviewInput; remove /fan-out-review and /holistic-review command rows - finalize-review.ts: remove codeMode from renderComment call - AGENTS.md: remove holistic routing description and command bullets

mvvmm · 2026-06-27T20:54:20Z

/full-review

cloudflare-docs-bot · 2026-06-27T20:54:26Z

Review

Reviewing new changes (commit df8c587)…

⚠️ 23 warnings, 💡 5 suggestions found in commit 86b473d.

Code Review

This code review is in beta and may not always be helpful — use your judgment.

Warnings (22)

File	Issue
`.flue/workflows/finalize-review.ts` line 581	Incomplete payload validation — parsePayload casts payload to Partial and then reads .eventType. If payload is null, undefined, or a non-object, this throws a TypeError before the intended validation Error. It also accepts NaN because typeof NaN === 'number'. Fix: Guard that payload is a non-null object, then validate fields and ensure number is a finite positive integer.
`.flue/workflows/finalize-review.ts` line 186	Unhandled GitHub API failure — await getPullRequest(token, input.number) is unwrapped. A network or GitHub error aborts the workflow without running cleanupPending or consuming the auto-review slot, leaving a stale pending namespace. Fix: Wrap the call in try/catch, log the failure, run cleanupPending, and return { finalized: false, reason: 'pr_fetch_failed' }.
`.flue/workflows/finalize-review.ts` line 206	Unnecessary GitHub API call in log mode — getIssueComments is fetched unconditionally, but botComment is only used when reviewMode === 'comment'. In log mode this adds a needless GitHub dependency and can fail finalize when no comment would be posted. Fix: Move the getIssueComments call and botComment lookup inside the if (reviewMode === 'comment') block.
`.flue/lib/code-review-inproc.ts` line 67	Diff parser drops added lines starting with `++` — `if (raw.startsWith("+++") \|\| raw.startsWith("---")) continue;` runs before the added-line check. An added source line like `++i;` appears in the patch as `+++i;`, which starts with `+++` and is skipped entirely, so it never reaches `args.addedLines`. Fix: Match git file headers specifically, e.g. `if (raw.startsWith("+++ ") \|\| raw.startsWith("--- ")) continue;`, so only `+++ b/path`/`--- a/path` metadata lines are skipped.
`.flue/lib/code-review-inproc.ts` line 162	Duplicate IDs can silently drop findings — `findingsById.set(finding.id, finding)` deduplicates by ID, but `assignCodeReviewFindingIds` hashes `rule:path:evidence` and deliberately excludes `line`. Two distinct findings in the same file on different lines with the same rule and evidence text therefore receive the same ID, and the Map keeps only the last one. Fix: Include the line number in the finding ID hash, or append a disambiguating counter to duplicate IDs while keeping a separate stable group key for reconciliation.
`.flue/lib/finalize-rendezvous.ts` line 371	Workflow admission failure leaves dispatch stuck — In reportSpecialistResult, after tryClaimFinalize wins the conditional lock, admitWorkflow is awaited inside the try block. If admitWorkflow throws (network error, 5xx, missing runId), the catch swallows it and returns. The finalize.lock already exists, so no other specialist can claim it later, and finalize-review is never admitted for this dispatch. Fix: Move the admitWorkflow call (and finalization log) outside the catch-all try block so admission errors propagate to the caller and can be retried, or add targeted retries before giving up on the lock.
`.flue/workflows/redirect-specialist.ts` line 317	Unhandled rendezvous failure — `await reportSpecialistResult(...)` is outside the `try/catch` block, so a failure while writing the result or claiming the finalize lock rejects from `run` instead of returning the degraded result that the catch block prepared. Fix: Wrap the `reportSpecialistResult` call in its own `try/catch`, log the finalization failure, and still return `result` so the orchestrator receives the degraded state.
`.flue/workflows/redirect-specialist.ts` line 269	Malformed redirect suggestion for cross-area renames — For a renamed docs page whose new path is outside `src/content/docs/`, `pathToUrl(newPath)` returns `""` and the suggestion template produces an invalid `\`/old/ 301``line with an empty destination. Fix: Use a placeholder destination such as`/new-destination/`when`pathToUrl(newPath)` is empty, instead of interpolating an empty string into the redirect line.
`.flue/workflows/code-review-orchestrator.ts` line 387	Swallowed R2 write can deadlock finalize rendezvous — After a specialist admit fails, the orchestrator overwrites the stream placeholder with `final: true` but appends `.catch(() => {})`, swallowing any R2 write failure. If the write fails, the stream stays `final: false`, so `tryClaimFinalize` can never succeed and the review finalize workflow never runs. Fix: Remove the `.catch(() => {})` so the orchestrator fails visibly (and can be retried), or retry the write a bounded number of times before surfacing the error.
`.flue/workflows/code-review-orchestrator.ts` line 196	req validation after GitHub state mutation leaves stale comment — When `reviewMode === "comment"`, the orchestrator posts or updates the PR comment at lines 179-192 before checking `if (!req)` at line 196. If `req` is missing, the function throws after leaving a "review pending" comment that will not be finalized by any specialist workflow. Fix: Validate `req` and derive `baseUrl` before posting the placeholder comment so a missing request context never mutates GitHub state.
`.flue/lib/code-review-render.ts` line 240	Critical findings silently dropped outside code review — renderSection uses active.length (which includes critical findings) to decide whether to render content, but then only renders the critical table when includeCritical is true (line 248). For style/conventions/redirects, includeCritical is false, so a critical finding in those sections would cause the section heading to be emitted with no tables, and renderComment only counts critical findings from the code section (line 270), so the top-level status would also omit them. Fix: Either always render the critical table when critical.length > 0, or filter critical findings out of active when includeCritical is false so the section correctly shows "no issues" and the counts stay consistent.
`.flue/workflows/code-review-specialist.ts` line 182	Unhandled rejection in finalization — The `await reportSpecialistResult()` call at lines 182-195 is outside the try/catch block that wraps the review logic, so an R2, network, or finalization error causes the entire workflow to reject instead of returning the degraded result that was carefully set up. Fix: Move the rendezvous call inside the try block, or wrap it in its own try/catch that logs the error and continues, ensuring the degraded `result` is always returned as the function's output.
`.flue/lib/review-specialist.ts` line 107	Incomplete PR metadata validation — parseReviewSpecialistPayload validates pr.number, pr.title, pr.base, pr.head, and pr.labels, but ReviewSpecialistPrMeta also requires pr.body and pr.author to be strings. A malformed payload can pass validation with missing or non-string values. Fix: Add typeof checks for pr.body and pr.author, or normalize missing values to empty strings.
`.flue/lib/review-specialist.ts` line 160	Unchecked expectedStreams element types — expectedStreams is conditionally returned as string[] via an unchecked cast of input.expectedStreams. If array elements are not strings, downstream rendezvous logic may fail or behave unexpectedly. Fix: Validate every element with typeof x === "string" before returning the array.
`.flue/lib/style-guide-inproc.ts` line 294	Race condition in timeout detection — The setTimeout callback sets `timedOut = true` before calling `handle.abort()`. If a non-timeout error occurs just before the timer fires, the callback can still run and set the flag before the rejected `await handle` reaches the catch block, causing a genuine failure to be reported and rethrown as a timeout. Fix: Determine timeout from the abort/error itself (for example, only set `timedOut` if `handle.abort()` actually aborts a pending operation, or by inspecting the error type) instead of relying on a shared boolean flag that can be set after the operation has already failed.
`.flue/AGENTS.md` line 7	Architecture omits half the review pipeline — Line 7 claims the bot runs 'two independent reviews ... Code Review and ... Style Guide Review'. Lines 18-26 list only those two specialist workflows. workflows/code-review-orchestrator.ts admits four specialists (`/workflows/code-review-specialist`, `/workflows/style-guide-specialist`, `/workflows/conventions-specialist`, `/workflows/redirect-specialist`), finalize-review.ts reconciles all four streams, and lib/code-review-render.ts emits four sections (Code Review, Conventions, Style Guide Review, Redirects). Line 83 also omits migration v7, which added `FlueConventionsSpecialistWorkflow` and `FlueRedirectSpecialistWorkflow` per wrangler.jsonc. Fix: Update the architecture description, workflow table, request flow, R2 state shape (`{ code, style, conventions, redirects }`), rendered sections, and migration history to include conventions-specialist and redirect-specialist.
`.flue/workflows/orchestrate.ts` line 222	Unhandled promise rejection — addReactionToComment(..., 'eyes') is awaited without error handling in the /review and /full-review handler. If the reaction request fails, the entire slash command aborts and no review is dispatched. Fix: Wrap the reaction call in a try/catch or add a .catch so reaction failures do not block review dispatch.
`.flue/workflows/orchestrate.ts` line 311	State change before acknowledgment — setReviewLimitIgnored persists state before the +1 reaction is added. If addReactionToComment rejects, the review-limit flag remains set but the webhook returns a failure and may be retried. Fix: Either add the reaction first or catch reaction failures after state is persisted to avoid spurious webhook retries.
`.flue/workflows/orchestrate.ts` line 356	State change before acknowledgment — setAutoReviewDisabled persists state before the +1 reaction is added. If addReactionToComment rejects, auto-review remains disabled but the webhook returns a failure and may be retried. Fix: Either add the reaction first or catch reaction failures after state is persisted to avoid spurious webhook retries.
`.flue/bin/clear-r2-pr-data.ts` line 74	Unhandled external command failure — The `sqlite3` calls via `execFileSync` (count query and DELETE) have no error handling. If `sqlite3` is not installed, the database is locked, or the `_mf_objects` table does not exist, the thrown exception crashes the script. Fix: Wrap the `execFileSync` calls in try/catch and emit a clear, actionable error message (e.g., indicate sqlite3 is missing or the DB is locked).
`.flue/lib/code-review-results.ts` line 52	Non-unique finding IDs for distinct findings — The ID hash input is only `${f.rule}:${f.path}:${f.evidence.trim()}` (line 52), excluding the line number. Two distinct findings in the same file that share the same rule and evidence text will therefore produce the same `CR-${hex.slice(0, 12)}` ID (line 57), breaking the expectation that each finding has a unique ID. Fix: Include the line number in the hash input, or check for duplicate IDs after assignment and append a counter to duplicates, so distinct findings remain distinguishable while keeping IDs stable.
`.flue/lib/code-review-state.ts` line 125	Race condition in read-modify-write — markAutoReviewCompleted reads the counter from R2, mutates the count and shas array in memory, and then writes the whole object back with an unconditional bucket.put. Two concurrent review runs for the same PR can both read the same state, both pass the shas.includes(headSha) check, and both write back, causing the second put to overwrite the first. This can double-increment the count and/or lose a recorded SHA. Fix: Use a conditional R2 put (e.g., if-match ETag returned by bucket.get) and retry on conflict, or serialize updates through a Durable Object to make the read-modify-write atomic.

Suggestions (5)

File	Issue
`.flue/lib/finalize-rendezvous.ts` line 257	R2 list pagination not handled — cleanupPending calls bucket.list({ prefix }) once and deletes only listed.objects. R2 list responses are paginated; if a prefix ever contains more than the default page size, truncated results and the cursor are ignored, leaving keys behind despite the comment claiming it deletes every key under the prefix. Fix: Loop while listed.truncated is true, passing listed.cursor to subsequent bucket.list calls, and collect/delete all objects across pages.
`.flue/workflows/conventions-specialist.ts` line 115	Misleading variable name and duplicated path logic — 'renamedDocFiles' is built from both 'renamed' and 'removed' files, and the same previous_filename/filename branch is repeated in the .filter() predicate and the .map() transform. Fix: Rename the variable (and skill argument) to something like 'docsFilesNeedingRedirect' and compute the old path once, e.g. via map+filter or a helper, to avoid duplicating the branching logic.
`.flue/AGENTS.md` line 73	Slash command missing from docs — The slash commands section lists `/review`, `/full-review`, and `/ignore-review-limit`, and says 'All commands swap 👀 → 👍'. workflows/orchestrate.ts also handles `/disable-auto-review`, and lib/code-review-render.ts includes it in the rendered Commands block. Fix: Add `/disable-auto-review` to the slash commands list with a brief description.
`.flue/workflows/orchestrate.ts` line 283	Duplicated slash-command handler — The /ignore-review-limit and /disable-auto-review handlers duplicate comment id validation, sender validation, token setup, codeowner check, R2 bucket access, +1 reaction, logging, and return shape. Fix: Extract a shared helper that takes the command name and a state-mutator callback to reduce duplication and future drift.
`.flue/bin/clear-r2-pr-data.ts` line 42	Directory type not verified — `existsSync(stateDir)` only checks that the path exists, not that it is a directory. If a candidate path exists as a file, the subsequent `readdirSync(stateDir)` call will throw `ENOTDIR`. Fix: Use `statSync(stateDir).isDirectory()` instead of `existsSync(stateDir)`.

Conventions

Checks PR title, description, and redirect checklist.

Warnings (1)

File	Issue
PR	Summary section content — The PR description does not contain a `### Summary` section; the first heading is `## Architecture`. Fix: Add a `### Summary` section to the description with context about the purpose of these changes.

Style Guide Review

_{No style-guide issues found.}

Redirects

_{No missing redirect entries found.}

Commands

Only codeowners can run commands. Post a comment with the command to trigger it.

Command	Description
`/review`	Runs a review now. Incremental if a prior review exists, full if not.
`/full-review`	Re-reviews the entire PR diff from scratch, ignoring incremental history. Useful after a rebase, when you want a fresh review, or if the bot gets out of sync and reports issues that no longer exist.
`/ignore-review-limit`	Permanently lifts the 2-review automatic limit for this PR. Future pushes will trigger reviews as normal.
`/disable-auto-review`	Stops automatic reviews from triggering on future pushes to this PR. Codeowners can still run `/review` or `/full-review` manually.

Replace rigid format rules (title must start with [Product] or type:, description must have a ### Summary section) with lenient semantic checks: 1. Product/area identified — title OR description names the area; no tag required 2. Description explains the work — any human prose passes; only flag truly empty/placeholder descriptions 3. Scope accuracy — only flag material misrepresentation, not missing details Default to no finding in all three. Only flag clear, significant violations.

mvvmm · 2026-06-27T23:29:05Z

/disable-auto-review

mvvmm requested review from a team and kodster28 as code owners June 18, 2026 16:50

github-actions Bot added the size/xl label Jun 18, 2026

github-actions Bot assigned kodster28 Jun 18, 2026

mvvmm added 5 commits June 18, 2026 12:57

flue: remove unused DiffMode import in code-review-specialist

9e44887

flue: add beta disclaimer note under Code Review heading

dcd1aa6

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 18, 2026

flue: lower code-review specialist concurrency to 3 to stay under DO …

abf8906

…memory limit

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 18, 2026

mvvmm added 8 commits June 18, 2026 14:35

enable tracing

4fcb9e6

flue: document code-review architecture in AGENTS.md

99b7890

flue: explicitly clear crons in wrangler config

d5d0f85

An absent triggers key leaves existing cron triggers intact on deploy; triggers.crons=[] forces wrangler to PUT an empty schedule list, actually deregistering the leftover */2 cron.

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 26, 2026

cloudflare deleted a comment from ask-bonk Bot Jun 26, 2026

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 26, 2026

mvvmm added 4 commits June 26, 2026 14:07

[Flue] Use kimi-k2.7-code for reconciliation (drop glm-4.7-flash)

675d7b9

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 26, 2026

mvvmm added 4 commits June 26, 2026 18:27

Revert "[Flue] v7+v8 migrations: clear specialist DO storage to break…

540fb34

… OOM death spiral" This reverts commit fa5e960.

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 27, 2026

mvvmm added 2 commits June 27, 2026 14:32

[Flue] Remove unused CODE_REVIEW_MAX_FILES import in specialist

6b3d8cb

cloudflare deleted a comment from cloudflare-docs-bot Bot Jun 27, 2026

mvvmm added 2 commits June 27, 2026 18:21

[Flue] Tighten scope accuracy check — flag any unmentioned core change

df8c587

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

flue: add generic code review alongside style-guide review#31562

flue: add generic code review alongside style-guide review#31562
mvvmm wants to merge 30 commits into
productionfrom
code-review

mvvmm commented Jun 18, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

mvvmm commented Jun 27, 2026

Uh oh!

cloudflare-docs-bot Bot commented Jun 27, 2026 •

edited

Loading

Uh oh!

mvvmm commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

mvvmm commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Architecture

Code review specialist

Style guide specialist

Memory and timeout

Other

Validated

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mvvmm commented Jun 27, 2026

Uh oh!

cloudflare-docs-bot Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review

Code Review

Conventions

Style Guide Review

Redirects

Uh oh!

mvvmm commented Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mvvmm commented Jun 18, 2026 •

edited

Loading

github-actions Bot commented Jun 18, 2026 •

edited

Loading

github-actions Bot commented Jun 18, 2026 •

edited

Loading

cloudflare-docs-bot Bot commented Jun 27, 2026 •

edited

Loading