diff --git a/README.md b/README.md index 37b10bb4..062e86c7 100644 --- a/README.md +++ b/README.md @@ -325,6 +325,7 @@ everything-claude-code/ | |-- saas-nextjs-CLAUDE.md # Real-world SaaS (Next.js + Supabase + Stripe) | |-- go-microservice-CLAUDE.md # Real-world Go microservice (gRPC + PostgreSQL) | |-- django-api-CLAUDE.md # Real-world Django REST API (DRF + Celery) +| |-- rust-api-CLAUDE.md # Real-world Rust API (Axum + SQLx + PostgreSQL) (NEW) | |-- mcp-configs/ # MCP server configurations | |-- mcp-servers.json # GitHub, Supabase, Vercel, Railway, etc. @@ -883,18 +884,73 @@ These configs are battle-tested across multiple production applications. --- -## ⚠️ Important Notes +## Token Optimization + +Claude Code usage can be expensive if you don't manage token consumption. These settings significantly reduce costs without sacrificing quality. + +### Recommended Settings + +Add to `~/.claude/settings.json`: + +```json +{ + "model": "sonnet", + "env": { + "MAX_THINKING_TOKENS": "10000", + "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50" + } +} +``` + +| Setting | Default | Recommended | Impact | +|---------|---------|-------------|--------| +| `model` | opus | **sonnet** | ~60% cost reduction; handles 80%+ of coding tasks | +| `MAX_THINKING_TOKENS` | 31,999 | **10,000** | ~70% reduction in hidden thinking cost per request | +| `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | 95 | **50** | Compacts earlier — better quality in long sessions | + +Switch to Opus only when you need deep architectural reasoning: +``` +/model opus +``` + +### Daily Workflow Commands + +| Command | When to Use | +|---------|-------------| +| `/model sonnet` | Default for most tasks | +| `/model opus` | Complex architecture, debugging, deep reasoning | +| `/clear` | Between unrelated tasks (free, instant reset) | +| `/compact` | At logical task breakpoints (research done, milestone complete) | +| `/cost` | Monitor token spending during session | + +### Strategic Compaction + +The `strategic-compact` skill (included in this plugin) suggests `/compact` at logical breakpoints instead of relying on auto-compaction at 95% context. See `skills/strategic-compact/SKILL.md` for the full decision guide. + +**When to compact:** +- After research/exploration, before implementation +- After completing a milestone, before starting the next +- After debugging, before continuing feature work +- After a failed approach, before trying a new one + +**When NOT to compact:** +- Mid-implementation (you'll lose variable names, file paths, partial state) ### Context Window Management -**Critical:** Don't enable all MCPs at once. Your 200k context window can shrink to 70k with too many tools enabled. +**Critical:** Don't enable all MCPs at once. Each MCP tool description consumes tokens from your 200k window, potentially reducing it to ~70k. -Rule of thumb: -- Have 20-30 MCPs configured -- Keep under 10 enabled per project -- Under 80 tools active +- Keep under 10 MCPs enabled per project +- Keep under 80 tools active +- Use `disabledMcpServers` in project config to disable unused ones -Use `disabledMcpServers` in project config to disable unused ones. +### Agent Teams Cost Warning + +Agent Teams spawns multiple context windows. Each teammate consumes tokens independently. Only use for tasks where parallelism provides clear value (multi-module work, parallel reviews). For simple sequential tasks, subagents are more token-efficient. + +--- + +## ⚠️ Important Notes ### Customization