JongHyeok Park 7976e6faf2
feat(skills): make tdd-workflow test-runner aware (npm/pnpm/yarn/bun) (#2347)
* feat(skills): make tdd-workflow test-runner aware (npm/pnpm/yarn/bun)

Add "Step 0: Detect the Test Runner" so the RED/GREEN cycle no longer
hardcodes `npm test`. Distinguishes the package manager from the test
runner (a project can install with Bun yet run Jest/Vitest), adds a runner
command matrix, and warns about `bun test` (native bun:test runner) vs
`bun run test` (runs the package.json script) — a common ESM failure mode.
Adds a Bun native test pattern section and links the bun-runtime skill.

Applied to both the canonical skills/ copy and the .agents/skills/ Codex
subset (manual sync per CONTRIBUTING).

* docs(skills): apply <test>/<coverage> placeholders in tdd-workflow steps

Address review feedback on PR #2347: Step 0 instructs the agent to substitute
the detected runner command, but Steps 3/5/7, Run Coverage Report, Watch Mode,
Pre-Commit, and CI/CD still showed literal `npm test` / `npm run test:coverage`
— so an agent reaching those blocks could run npm test on a pnpm/bun project.
Replace them with the <test> / <test-watch> / <coverage> placeholders from
Step 0. Left untouched: the plan-handoff allowlist example and the Step 8
evidence-table samples (illustrative, not run-this instructions). Applied to
both the canonical and Codex-subset copies.

* docs(skills): make pre-commit lint runner-agnostic via <lint> placeholder

Follow-up to PR #2347 review (CodeRabbit): the pre-commit example still used
`npm run lint`, coupling it to npm after test/coverage were made runner-aware.
Add a `<lint>` column to the Step 0 runner matrix (npm run lint / pnpm lint /
yarn lint / bun run lint) and change the Pre-Commit Hook example to
`<test> && <lint>`. Applied to both the canonical and Codex-subset copies.

* chore: re-trigger CI (flaky windows/node20 npm cell)
2026-06-29 18:38:33 -07:00

467 lines
13 KiB
Markdown

---
name: tdd-workflow
description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.
---
# Test-Driven Development Workflow
This skill ensures all code development follows TDD principles with comprehensive test coverage.
## When to Activate
- Writing new features or functionality
- Fixing bugs or issues
- Refactoring existing code
- Adding API endpoints
- Creating new components
## Core Principles
### 1. Tests BEFORE Code
ALWAYS write tests first, then implement code to make tests pass.
### 2. Coverage Requirements
- Minimum 80% coverage (unit + integration + E2E)
- All edge cases covered
- Error scenarios tested
- Boundary conditions verified
### 3. Test Types
#### Unit Tests
- Individual functions and utilities
- Component logic
- Pure functions
- Helpers and utilities
#### Integration Tests
- API endpoints
- Database operations
- Service interactions
- External API calls
#### E2E Tests (Playwright)
- Critical user flows
- Complete workflows
- Browser automation
- UI interactions
## TDD Workflow Steps
### Step 0: Detect the Test Runner
Do not assume `npm test`. The commands in the steps and examples below use `<test>`, `<test-watch>`, and `<coverage>` as placeholders for the project's actual runner. Resolve them once before starting:
1. **Run the package-manager detector** (ships with ECC):
```bash
node scripts/setup-package-manager.js --detect
```
It resolves the package manager (npm / pnpm / yarn / bun) from, in order: `CLAUDE_PACKAGE_MANAGER`, `.claude/package-manager.json`, the `package.json` `packageManager` field, the lockfile, then global config.
2. **Distinguish the package manager from the test runner — they are not the same.** A project can use Bun to install dependencies yet still run Jest or Vitest. Inspect `package.json` `scripts.test` and the test files:
- `scripts.test` invokes `jest` / `vitest` -> run through the detected PM (`npm test`, `pnpm test`, `yarn test`, or `bun run test`).
- `scripts.test` is `bun test`, or test files `import { test, expect } from "bun:test"`, or there is no jest/vitest config but Bun is present -> use **Bun's native runner** (`bun test`). See [Bun Native Test Pattern](#bun-native-test-pattern-buntest) below.
Runner command matrix:
| Runner | `<test>` | `<test-watch>` | `<coverage>` | `<lint>` |
|--------|----------|----------------|--------------|----------|
| npm | `npm test` | `npm test -- --watch` | `npm run test:coverage` | `npm run lint` |
| pnpm | `pnpm test` | `pnpm test --watch` | `pnpm test:coverage` | `pnpm lint` |
| yarn | `yarn test` | `yarn test --watch` | `yarn test:coverage` | `yarn lint` |
| Bun (script runs jest/vitest) | `bun run test` | `bun run test --watch` | `bun run test:coverage` | `bun run lint` |
| Bun (native `bun:test`) | `bun test` | `bun test --watch` | `bun test --coverage` | `bun run lint` |
> `bun test` (Bun's built-in runner) is **not** the same as `bun run test` (which runs the `package.json` `test` script). Picking the wrong one is a common failure — e.g. invoking Jest through `npx`/`bun run` in an ESM-only project breaks, while `bun test` runs the suite natively. Confirm which the project expects before the RED gate, then substitute `<test>` / `<coverage>` everywhere `npm test` appears below.
### Step 1: Write User Journeys
```
As a [role], I want to [action], so that [benefit]
Example:
As a user, I want to search for markets semantically,
so that I can find relevant markets even without exact keywords.
```
### Step 2: Generate Test Cases
For each user journey, create comprehensive test cases:
```typescript
describe('Semantic Search', () => {
it('returns relevant markets for query', async () => {
// Test implementation
})
it('handles empty query gracefully', async () => {
// Test edge case
})
it('falls back to substring search when Redis unavailable', async () => {
// Test fallback behavior
})
it('sorts results by similarity score', async () => {
// Test sorting logic
})
})
```
### Step 3: Run Tests (They Should Fail)
```bash
<test>
# Tests should fail - we haven't implemented yet
```
### Step 4: Implement Code
Write minimal code to make tests pass:
```typescript
// Implementation guided by tests
export async function searchMarkets(query: string) {
// Implementation here
}
```
### Step 5: Run Tests Again
```bash
<test>
# Tests should now pass
```
### Step 6: Refactor
Improve code quality while keeping tests green:
- Remove duplication
- Improve naming
- Optimize performance
- Enhance readability
### Step 7: Verify Coverage
```bash
<coverage>
# Verify 80%+ coverage achieved
```
## Testing Patterns
### Unit Test Pattern (Jest/Vitest)
```typescript
import { render, screen, fireEvent } from '@testing-library/react'
import { Button } from './Button'
describe('Button Component', () => {
it('renders with correct text', () => {
render(<Button>Click me</Button>)
expect(screen.getByText('Click me')).toBeInTheDocument()
})
it('calls onClick when clicked', () => {
const handleClick = jest.fn()
render(<Button onClick={handleClick}>Click</Button>)
fireEvent.click(screen.getByRole('button'))
expect(handleClick).toHaveBeenCalledTimes(1)
})
it('is disabled when disabled prop is true', () => {
render(<Button disabled>Click</Button>)
expect(screen.getByRole('button')).toBeDisabled()
})
})
```
### Bun Native Test Pattern (`bun:test`)
When the project uses Bun's built-in runner (see [Step 0](#step-0-detect-the-test-runner)), import from `bun:test` and run with `bun test` — not `bun run test`. The API is Jest-like, so `describe` / `it` / `expect` and most matchers carry over. See the `bun-runtime` skill for runtime, install, and bundler details.
```typescript
import { describe, it, expect, mock } from 'bun:test'
import { searchMarkets } from './search'
describe('searchMarkets', () => {
it('returns an empty list for an empty query', async () => {
expect(await searchMarkets('')).toEqual([])
})
it('sorts results by similarity score', async () => {
const results = await searchMarkets('election')
expect(results).toEqual([...results].sort((a, b) => b.score - a.score))
})
})
```
```bash
bun test # run once (RED/GREEN gate)
bun test --watch # watch mode during development
bun test --coverage # coverage report
```
- Mock modules with `mock.module(...)` / `mock(...)` from `bun:test` instead of `jest.mock(...)`.
- Configure coverage thresholds in `bunfig.toml` under `[test]` (e.g. `coverageThreshold`) rather than the Jest `coverageThresholds` config block.
### API Integration Test Pattern
```typescript
import { NextRequest } from 'next/server'
import { GET } from './route'
describe('GET /api/markets', () => {
it('returns markets successfully', async () => {
const request = new NextRequest('http://localhost/api/markets')
const response = await GET(request)
const data = await response.json()
expect(response.status).toBe(200)
expect(data.success).toBe(true)
expect(Array.isArray(data.data)).toBe(true)
})
it('validates query parameters', async () => {
const request = new NextRequest('http://localhost/api/markets?limit=invalid')
const response = await GET(request)
expect(response.status).toBe(400)
})
it('handles database errors gracefully', async () => {
// Mock database failure
const request = new NextRequest('http://localhost/api/markets')
// Test error handling
})
})
```
### E2E Test Pattern (Playwright)
```typescript
import { test, expect } from '@playwright/test'
test('user can search and filter markets', async ({ page }) => {
// Navigate to markets page
await page.goto('/')
await page.click('a[href="/markets"]')
// Verify page loaded
await expect(page.locator('h1')).toContainText('Markets')
// Search for markets
await page.fill('input[placeholder="Search markets"]', 'election')
// Wait for debounce and results
await page.waitForTimeout(600)
// Verify search results displayed
const results = page.locator('[data-testid="market-card"]')
await expect(results).toHaveCount(5, { timeout: 5000 })
// Verify results contain search term
const firstResult = results.first()
await expect(firstResult).toContainText('election', { ignoreCase: true })
// Filter by status
await page.click('button:has-text("Active")')
// Verify filtered results
await expect(results).toHaveCount(3)
})
test('user can create a new market', async ({ page }) => {
// Login first
await page.goto('/creator-dashboard')
// Fill market creation form
await page.fill('input[name="name"]', 'Test Market')
await page.fill('textarea[name="description"]', 'Test description')
await page.fill('input[name="endDate"]', '2025-12-31')
// Submit form
await page.click('button[type="submit"]')
// Verify success message
await expect(page.locator('text=Market created successfully')).toBeVisible()
// Verify redirect to market page
await expect(page).toHaveURL(/\/markets\/test-market/)
})
```
## Test File Organization
```
src/
├── components/
│ ├── Button/
│ │ ├── Button.tsx
│ │ ├── Button.test.tsx # Unit tests
│ │ └── Button.stories.tsx # Storybook
│ └── MarketCard/
│ ├── MarketCard.tsx
│ └── MarketCard.test.tsx
├── app/
│ └── api/
│ └── markets/
│ ├── route.ts
│ └── route.test.ts # Integration tests
└── e2e/
├── markets.spec.ts # E2E tests
├── trading.spec.ts
└── auth.spec.ts
```
## Mocking External Services
### Supabase Mock
```typescript
jest.mock('@/lib/supabase', () => ({
supabase: {
from: jest.fn(() => ({
select: jest.fn(() => ({
eq: jest.fn(() => Promise.resolve({
data: [{ id: 1, name: 'Test Market' }],
error: null
}))
}))
}))
}
}))
```
### Redis Mock
```typescript
jest.mock('@/lib/redis', () => ({
searchMarketsByVector: jest.fn(() => Promise.resolve([
{ slug: 'test-market', similarity_score: 0.95 }
])),
checkRedisHealth: jest.fn(() => Promise.resolve({ connected: true }))
}))
```
### OpenAI Mock
```typescript
jest.mock('@/lib/openai', () => ({
generateEmbedding: jest.fn(() => Promise.resolve(
new Array(1536).fill(0.1) // Mock 1536-dim embedding
))
}))
```
## Test Coverage Verification
### Run Coverage Report
```bash
<coverage>
```
### Coverage Thresholds
```json
{
"jest": {
"coverageThresholds": {
"global": {
"branches": 80,
"functions": 80,
"lines": 80,
"statements": 80
}
}
}
}
```
## Common Testing Mistakes to Avoid
### FAIL: WRONG: Testing Implementation Details
```typescript
// Don't test internal state
expect(component.state.count).toBe(5)
```
### PASS: CORRECT: Test User-Visible Behavior
```typescript
// Test what users see
expect(screen.getByText('Count: 5')).toBeInTheDocument()
```
### FAIL: WRONG: Brittle Selectors
```typescript
// Breaks easily
await page.click('.css-class-xyz')
```
### PASS: CORRECT: Semantic Selectors
```typescript
// Resilient to changes
await page.click('button:has-text("Submit")')
await page.click('[data-testid="submit-button"]')
```
### FAIL: WRONG: No Test Isolation
```typescript
// Tests depend on each other
test('creates user', () => { /* ... */ })
test('updates same user', () => { /* depends on previous test */ })
```
### PASS: CORRECT: Independent Tests
```typescript
// Each test sets up its own data
test('creates user', () => {
const user = createTestUser()
// Test logic
})
test('updates user', () => {
const user = createTestUser()
// Update logic
})
```
## Continuous Testing
### Watch Mode During Development
```bash
<test-watch>
# Tests run automatically on file changes
```
### Pre-Commit Hook
```bash
# Runs before every commit
<test> && <lint>
```
### CI/CD Integration
```yaml
# GitHub Actions
- name: Run Tests
run: <coverage>
- name: Upload Coverage
uses: codecov/codecov-action@v3
```
## Best Practices
1. **Write Tests First** - Always TDD
2. **One Assert Per Test** - Focus on single behavior
3. **Descriptive Test Names** - Explain what's tested
4. **Arrange-Act-Assert** - Clear test structure
5. **Mock External Dependencies** - Isolate unit tests
6. **Test Edge Cases** - Null, undefined, empty, large
7. **Test Error Paths** - Not just happy paths
8. **Keep Tests Fast** - Unit tests < 50ms each
9. **Clean Up After Tests** - No side effects
10. **Review Coverage Reports** - Identify gaps
## Success Metrics
- 80%+ code coverage achieved
- All tests passing (green)
- No skipped or disabled tests
- Fast test execution (< 30s for unit tests)
- E2E tests cover critical user flows
- Tests catch bugs before production
---
**Remember**: Tests are not optional. They are the safety net that enables confident refactoring, rapid development, and production reliability.