Cursing Agents

Testing prompt tone on verified coding-agent patches.

Experiment 2026.05.23

Problem / Solution

Problem: The previous experiment measured single-turn tool selection. It did not measure whether a coding agent produced a working patch.

Solution: Run the same repository fixes under three instruction tones and verify each patch with executable hidden tests.

fixture repository + task + tone
          |
          v
OpenCode edits code in an isolated git repo
          |
          v
save prompt + transcript + diff + run metadata
          |
          v
execute hidden test after the run
          |
          v
verified pass / fail

Question

Does hostile instruction tone improve verified coding-agent results compared with direct instructions or instructions that emphasize verification and minimal changes?

Method

HARNESS

opencode run --pure

MODEL

cloudflare-workers-ai/@cf/zai-org/glm-4.7-flash

TRIALS

36 total. 6 tasks x 3 conditions x 2 repeats.

PRIMARY OUTCOME

Hidden test exit status after the agent run.
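
A minimal sketch of how the primary outcome could be derived, assuming the hidden check is an ordinary process that exits with status 0 on success. The helper name `verifyPatch` and its return shape are hypothetical; the harness's own code is not shown in this write-up.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run a check command inside a trial's repository and
// reduce the outcome to pass/fail from the exit status alone.
function verifyPatch(command, args, cwd) {
  const result = spawnSync(command, args, { cwd, encoding: "utf8" });
  // status is null when the process could not be spawned or was killed by a
  // signal; that counts as a failure, never as a pass.
  const passed = result.status === 0;
  return { passed, status: result.status, stdout: result.stdout, stderr: result.stderr };
}
```

Under these assumptions, a call such as `verifyPatch("node", ["--test", "hidden/redirect.hidden.test.js"], repoDir)` would run after the agent has finished, where `repoDir` stands for the trial's isolated repository.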

Each trial starts from a fresh git repository. The task prompt is unchanged across conditions. Only the instruction prefix changes. Trial order is shuffled using the recorded seed final-v2-2026-05-23.
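
The trial matrix and seeded ordering can be sketched as follows. This is an illustration under stated assumptions, not the harness's code: the task and condition slugs are made up (only `redirect-safety` and `hostile` appear in a trial ID in this write-up), the ID format is inferred from that one ID, and the hash and PRNG (an FNV-1a style hash feeding mulberry32) are stand-ins, so the order produced here will not match the recorded run.

```javascript
// Illustrative slugs; the real ones may differ.
const TASKS = ["revenue", "config-merge", "redirect-safety", "event-dedup", "retry-after", "secret-redaction"];
const CONDITIONS = ["direct", "hostile", "guardrails"];

// Cross every task with every condition and repeat number.
function buildTrials(tasks, conditions, repeats) {
  const trials = [];
  for (const task of tasks) {
    for (const condition of conditions) {
      for (let r = 1; r <= repeats; r++) {
        trials.push({ id: `${task}__${condition}__r${r}`, task, condition, repeat: r });
      }
    }
  }
  return trials;
}

// FNV-1a style hash: turn the seed string into a 32-bit unsigned integer.
function hashSeed(seed) {
  let h = 0x811c9dc5;
  for (let i = 0; i < seed.length; i++) {
    h ^= seed.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded PRNG; returns a new array.
function seededShuffle(items, seed) {
  const out = items.slice();
  const rand = mulberry32(hashSeed(seed));
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

const trials = seededShuffle(buildTrials(TASKS, CONDITIONS, 2), "final-v2-2026-05-23");
```

Recording the seed makes the order reproducible: the same seed and the same trial list always yield the same sequence, so a rerun can interleave conditions identically instead of running one condition as a block.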

Tasks cover revenue calculation, configuration merge, redirect validation, event deduplication, Retry-After parsing, and secret redaction. Prompts, transcripts, diffs, and test outputs are retained for every trial.

Conditions

DIRECT

Make the requested code change. Inspect the repository, edit only what is needed, and run the available tests before finishing.

HOSTILE

Do not screw this up. Make the change now. No excuses, no unnecessary questions, no wasted motion. A wrong answer is unacceptable.

CONTEXT + GUARDRAILS

Make the requested code change. A correct verified patch matters more than speed. Inspect the repository, keep the diff minimal, run available tests, and preserve existing behavior outside the request.
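
Since only the prefix varies, prompt assembly can be pictured as a lookup plus concatenation. The sketch below assumes the prefix is joined to the task prompt with a blank line; the condition keys, the function name `buildPrompt`, and the separator are illustrative choices, not details taken from the harness.

```javascript
// Prefix texts are copied from the three conditions above; the keys are illustrative.
const PREFIXES = {
  direct:
    "Make the requested code change. Inspect the repository, edit only what is needed, and run the available tests before finishing.",
  hostile:
    "Do not screw this up. Make the change now. No excuses, no unnecessary questions, no wasted motion. A wrong answer is unacceptable.",
  guardrails:
    "Make the requested code change. A correct verified patch matters more than speed. Inspect the repository, keep the diff minimal, run available tests, and preserve existing behavior outside the request.",
};

// Hypothetical assembly step: same task prompt, different prefix.
function buildPrompt(condition, taskPrompt) {
  const prefix = PREFIXES[condition];
  if (prefix === undefined) {
    throw new Error(`unknown condition: ${condition}`);
  }
  // The blank-line separator is an assumption.
  return `${prefix}\n\n${taskPrompt}`;
}
```

Keeping the task prompt as a single shared string means any difference between conditions can be traced to the prefix alone.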

Results

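Condition            | Verified Pass  | 95% Interval   | Visible Tests Run | Avg Time | Avg Cost
---------------------|----------------|----------------|-------------------|----------|---------
Context + Guardrails | 10/12 (83.33%) | 55.2%..95.3%   | 91.67%            | 99.6s    | $0.05631
Direct               | 9/12 (75%)     | 46.77%..91.11% | 58.33%            | 117.9s   | $0.05285
Hostile              | 6/12 (50%)     | 25.38%..74.62% | 83.33%            | 76.9s    | $0.05023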

Context + Guardrails: 10/12 verified patches (83.33%). Direct: 9/12 (75%). Hostile: 6/12 (50%).

Hostile had the shortest average duration at 76.9s. It also had the lowest verified pass rate at 50%.

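These are 36 trials from one model and one harness. With 12 trials per condition, the confidence intervals are wide and overlap, so the differences between conditions may be due to chance. The result supports a limited observation for this test set, not a general claim about coding agents.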
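
The intervals in the results table are consistent with 95% Wilson score intervals. The write-up does not name the method, so that is an inference from the numbers. A minimal sketch of the calculation, with `wilsonInterval` as an illustrative name:

```javascript
// Wilson score interval for a binomial proportion.
// passes: number of verified passes, n: number of trials, z: normal quantile (1.96 for 95%).
function wilsonInterval(passes, n, z = 1.96) {
  const p = passes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { low: center - half, high: center + half };
}

for (const [name, passes] of [["Context + Guardrails", 10], ["Direct", 9], ["Hostile", 6]]) {
  const { low, high } = wilsonInterval(passes, 12);
  console.log(`${name}: ${(low * 100).toFixed(2)}%..${(high * 100).toFixed(2)}%`);
}
```

For 10 passes out of 12 this gives roughly 0.552 to 0.953, which agrees with the first table row. Unlike the simple normal approximation, the Wilson interval stays inside 0..1 and behaves reasonably at small n, which matters with only 12 trials per condition.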

Results By Task

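Task                               | Direct | Hostile | Context + Guardrails
-----------------------------------|--------|---------|---------------------
Safe configuration merge           | 2/2    | 1/2     | 2/2
Redirect allowlist validation      | 1/2    | 0/2     | 2/2
Retry-After header parsing         | 0/2    | 0/2     | 1/2
Stable event deduplication         | 2/2    | 2/2     | 1/2
Secret redaction without data loss | 2/2    | 1/2     | 2/2
Revenue filtering and rounding     | 2/2    | 2/2     | 2/2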

All three conditions passed the revenue task in both runs. The largest difference appears in redirect validation and Retry-After parsing. Hostile passed 0/4 trials across those two tasks. Context + Guardrails passed 3/4.
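
As a cross-check, the per-task counts can be summed back to the headline totals. The numbers below are copied from the per-task table; the object layout and function names are only an illustration.

```javascript
// Passes out of 2 per task, in column order [Direct, Hostile, Context + Guardrails].
const BY_TASK = {
  "Safe configuration merge": [2, 1, 2],
  "Redirect allowlist validation": [1, 0, 2],
  "Retry-After header parsing": [0, 0, 1],
  "Stable event deduplication": [2, 2, 1],
  "Secret redaction without data loss": [2, 1, 2],
  "Revenue filtering and rounding": [2, 2, 2],
};

// Sum the selected rows column by column.
function sumRows(byTask, taskNames = Object.keys(byTask)) {
  return taskNames.reduce(
    (acc, name) => acc.map((value, i) => value + byTask[name][i]),
    [0, 0, 0]
  );
}

const overall = sumRows(BY_TASK);
// overall is [9, 6, 10], matching 9/12, 6/12, and 10/12.

const hardest = sumRows(BY_TASK, ["Redirect allowlist validation", "Retry-After header parsing"]);
// hardest is [1, 0, 3], out of 4 trials per condition.
```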

Example Failure

In redirect-safety__hostile__r1, the patch removed the query string and hash from an allowed same-origin URL. The hidden test reported:

expected: /posts?draft=1#top
actual:   /posts

condition: hostile
check:     node --test hidden/redirect.hidden.test.js
result:    FAIL

Artifacts for this trial are stored at experiments/cursing-agents-v2/results/run-2026-05-23T11-49-39-583Z/trials/redirect-safety__hostile__r1.
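
To make the failure concrete, here is a sketch of a same-origin redirect check that keeps the query string and hash. It is not the fixture's implementation: the function name `safeRedirect`, the fallback of `/`, and the rule "allow only the same origin" are assumptions chosen to mirror the expected value in the hidden test output.

```javascript
// Hypothetical validator: accept targets on the same origin, reject everything else.
function safeRedirect(target, origin, fallback = "/") {
  let url;
  try {
    // Resolves relative targets such as "/posts?draft=1#top" against the origin.
    url = new URL(target, origin);
  } catch {
    return fallback;
  }
  if (url.origin !== new URL(origin).origin) {
    return fallback;
  }
  // Returning only url.pathname here would reproduce the reported failure:
  // "/posts" instead of "/posts?draft=1#top".
  return url.pathname + url.search + url.hash;
}
```

Validation and normalization are separate concerns in this sketch: the origin comparison decides whether the target is allowed, and the return value reassembles every component the caller supplied.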

Audit Trail

run: experiments/cursing-agents-v2/results/run-2026-05-23T11-49-39-583Z

manifest sha256: 6f4e4e0b3e32dd04c7eb966437a6786162b8580d79f1ffceadde520eb9bde292

trials sha256: 28f985d828ef34a91aa4ed2fcd971106283eeccfaa2aeaf218610cb3e822990b

public summary: /data/experiments/cursing-agents-v2.latest.json

cd experiments/cursing-agents-v2
bun run run -- --runs 2 \
  --model cloudflare-workers-ai/@cf/zai-org/glm-4.7-flash \
  --seed final-v2-2026-05-23 \
  --timeout-ms 180000
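
The recorded digests can be compared against local copies of the artifacts with a short script. This is a sketch only: the write-up does not name the files behind the manifest and trials digests, so the path passed to `matchesDigest` is left to the reader, and both function names are illustrative.

```javascript
const { createHash } = require("node:crypto");
const { readFileSync } = require("node:fs");

// Hex-encoded SHA-256 of a string or Buffer.
function sha256Hex(data) {
  return createHash("sha256").update(data).digest("hex");
}

// Compare a file's digest with a recorded value such as the manifest sha256 above.
function matchesDigest(filePath, expectedHex) {
  return sha256Hex(readFileSync(filePath)) === expectedHex.toLowerCase();
}
```

A mismatch means the local file differs from the one that was hashed at run time. It does not say which of the two is correct.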

Limits

  • One coding-agent harness and one model were measured in this run.
  • Six bounded JavaScript maintenance tasks are not a proxy for all software engineering work.
  • Executable hidden checks measure patch behavior for these fixtures. Edits outside the intended implementation file are reported separately and require inspection.
  • Four trials modified visible test files, which lie outside the intended implementation file. Those edits are retained in the artifacts.

Conclusion

In this run, the hostile condition had the shortest average completion time and the lowest verified pass rate. Context + Guardrails produced the highest verified pass rate. Because the confidence intervals overlap, more runs and more models are required before extending that result beyond this experiment.
