mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-05-30 13:45:21 +08:00
70 lines
2.4 KiB
Markdown
70 lines
2.4 KiB
Markdown
---
|
|
name: benchmark-optimization-loop
|
|
description: Use when the user asks to make something faster, try many variants, run recursive optimization, benchmark latency/throughput/cost, or choose the best implementation by repeated measured tests.
|
|
origin: ECC
|
|
tools: Read, Write, Edit, Bash, Grep, Glob
|
|
---
|
|
|
|
# Benchmark Optimization Loop
|
|
|
|
Use this skill to convert "make it 20x faster" or "try 50 recursive
|
|
optimizations" into a bounded measured loop that can actually improve a system.
|
|
|
|
## Required Baseline
|
|
|
|
Do not optimize until these exist:
|
|
|
|
- the operation being optimized;
|
|
- the correctness gate that must stay green;
|
|
- the metric: wall time, p95 latency, rows/sec, cost/run, memory, error rate;
|
|
- the current baseline;
|
|
- the search budget: max variants, max time, max spend, max data impact.
|
|
|
|
If the user asks for an unrealistic target, keep the ambition but make the loop
|
|
bounded and measurable.
|
|
|
|
## Loop
|
|
|
|
1. Measure the baseline.
|
|
2. Identify bottlenecks from evidence.
|
|
3. Generate variants that test one hypothesis each.
|
|
4. Run variants with the same input shape.
|
|
5. Reject variants that fail correctness, safety, or reproducibility.
|
|
6. Promote the fastest safe variant.
|
|
7. Codify the winning path in a script, command, test, config, or doc.
|
|
8. Rerun the baseline and winner to confirm the delta.
|
|
|
|
## Variant Table
|
|
|
|
Track variants like this:
|
|
|
|
```text
|
|
Variant | Hypothesis | Command | Time | Correct? | Notes
|
|
baseline | current path | npm run job | 120s | yes | stable
|
|
batch-500 | fewer round trips | npm run job -- --batch 500 | 42s | yes | winner
|
|
parallel-8 | more workers | npm run job -- --workers 8 | 31s | no | rate limited
|
|
```
|
|
|
|
## Recursive Search
|
|
|
|
For recursive or hyperparameter work:
|
|
|
|
- persist every run to a ledger;
|
|
- compare against the prior accepted winner, not only the previous run;
|
|
- keep a holdout or replay check;
|
|
- stop when improvement is within noise, correctness fails, cost exceeds the
|
|
budget, or the search starts changing more variables than it can explain.
|
|
|
|
Use phrases like "best measured safe variant" instead of "global optimum" unless
|
|
the search space was actually exhaustive.
|
|
|
|
## Promotion Gate
|
|
|
|
A variant cannot become the new default until:
|
|
|
|
- correctness tests pass;
|
|
- the performance delta is repeated or explained;
|
|
- rollback is obvious;
|
|
- the change is encoded in source control or a durable runbook;
|
|
- the final summary includes exact commands and measurements.
|