name: bugfix description: 'Multi-layer bug investigation and fix (traces across services before patching). Triggers: "bugfix", "fix bug", "corrigir bug", "debug", "investigar bug". Skip for new feature work (use execute-task/specify).' argument-hint: "[description of the bug, error message, or screenshot path]" allowed-tools: - Read - Glob - Grep - Bash - Edit - Write - Agent - TaskCreate - TaskUpdate

Bug Fix Skill¶

Structured bug fix protocol designed to eliminate cascading fix-reveal-fix cycles by mapping the full data flow before touching any layer. Stack-agnostic — adapt the commands and layer names to the project you are working in.

Arguments¶

$ARGUMENTS should describe the bug: error message, observed behavior, expected behavior, or path to a screenshot.

Step 0: Determine Complexity¶

Assess bug scope before starting:

Complexity	Signals	Approach
Single-layer	Error in one file, one service	Sequential trace (Steps 1-8)
Multi-service	DTOs, enums, or events cross service boundaries	Parallel agent investigation (Step 3b)
Ghost bug	"Works on my machine", intermittent, post-deploy	Stale artifact focus (Step 2)

For multi-service bugs, create tasks to track progress across services.

Step 1: Understand the Bug¶

Read the error message or user description carefully
Identify which service/layer reported the error
If a screenshot was provided, analyze it
Ask which layer owns the responsibility before assuming where the fix goes
For frontend-reported bugs: verify whether the fix should be backend-side first — the most common wrong initial approach is fixing the frontend when the backend is the root cause

Step 2: Check for Stale Artifacts¶

Before any debugging, eliminate ghost bugs. Use the commands appropriate to the project stack — the pattern is "force rebuild + verify source matches running":

# Force rebuild (stack-specific)
# Examples:
#   Go:         go build ./...
#   Node:       rm -rf node_modules/.cache && npm run build
#   Rust:       cargo clean && cargo build
#   Java/JVM:   mvn clean compile
#   Python:     rm -rf __pycache__ dist build *.egg-info

# Verify dependencies haven't drifted
# Examples: go mod verify, npm ls, pip freeze, cargo tree

# Always check git state regardless of stack
git status --short
git log --oneline -3

# For production bugs, confirm deployed commit matches source

If stale artifacts found, warn the user before proceeding.

Step 3: Trace the Full Data Flow¶

CRITICAL: Do NOT fix any single layer yet. Map the complete path first.

For each affected entity/endpoint, trace through ALL layers. Adapt the layer names to the stack in use — the principle is "follow the data end-to-end":

Server / backend layers (typical)¶

Entry point / handler — request parsing, routing, response shape
DTO / request-response model — field names, types, serialization tags
Service / business logic — validation, state transitions, enum values
Repository / data access — queries, column names, schema/namespace prefixes
Schema / migration — table definition, constraints, enum types
Events / messaging — publisher payload matches consumer expectations

Client / frontend layers (typical)¶

API client — endpoint URL, request body shape, field names
Type definitions — types match server DTOs, case convention (camelCase vs snake_case)
State layer — query keys, cache invalidation, response transformations
View / component — data binding, error states, loading states

Cross-boundary checks¶

Inter-service clients — other services calling the affected endpoint
Event/message payloads — publish/consume schemas align
Shared enum values — match across all layers (server code, database constraints, client code)
Gateway / proxy — routing rules, auth interception, CORS

Step 3b: Parallel Agent Investigation (for multi-service bugs)¶

For complex bugs spanning 2+ services, launch parallel agents:

Agent 1: Backend contract audit
  - Read handlers, DTOs, migrations, enum CHECK constraints
  - List all field names and types at each boundary

Agent 2: Frontend contract audit
  - Read API service files, hooks, component types
  - List all field names and expected shapes

Agent 3: Event/messaging audit (if events involved)
  - Read publishers, consumers, event models
  - Verify routing keys and payload shapes

Synthesize findings from all agents BEFORE making any edits.

Step 4: Identify ALL Mismatches¶

Create a list of every discrepancy found. Common categories (adapt to stack):

Field name mismatches across case conventions (e.g., full_name vs fullName vs FullName)
Enum value mismatches between code, database constraints, and API contract
Missing fields in request/response models
Wrong types at boundaries (string vs UUID, number vs string, nullable vs required)
Missing namespace/schema prefix in database queries
Route ordering issues (static routes must come before dynamic params in most routers)
Serialization tag mismatches between server structs and client expectations
Case conversion in middleware (client sends snake_case, server expects camelCase, or vice versa)
Optional field handling (null vs omitted vs empty string)

Step 5: Plan the Fix¶

Present the complete fix plan to the user BEFORE implementing: - List every file that needs changes - Describe what changes in each file - Identify the correct order of changes - Flag any migration needs (show SQL, never run without approval)

Step 6: Implement¶

Apply changes in dependency order: 1. Database migrations (if needed) — show SQL, wait for approval 2. Backend domain/DTO changes 3. Backend service/repository changes 4. Backend handler changes 5. Frontend type changes 6. Frontend API/hook changes 7. Frontend component changes

STOP-AND-REMAP RULE: If implementing a fix reveals a new issue in another layer, STOP immediately. Do not chase the new issue. Go back to Step 3, re-map all remaining layers, update the fix plan, then continue. This prevents the cascading fix-reveal-fix cycle that wastes the most time.

Step 7: Verify¶

After implementing, run the stack-appropriate verification chain:

# Build (stack-specific)
#   Go:      go build ./...
#   Node:    npm run build   (or tsc --noEmit for TS)
#   Rust:    cargo build
#   Python:  python -m compileall .

# Tests (stack-specific)
#   Go:      go test ./... -count=1
#   Node:    npm test
#   Rust:    cargo test
#   Python:  pytest

# Lint / static analysis (stack-specific)
#   Go:      golangci-lint run ./... && go vet ./...
#   Node:    npm run lint && tsc --noEmit
#   Rust:    cargo clippy -- -D warnings
#   Python:  ruff check . && mypy .

# Grep for residual references to old code (any stack)
# Use Grep tool with the old identifier and appropriate file filter

Test-Driven Verification (for recurring or complex bugs)¶

When the bug is subtle or has regressed before: 1. Write a failing test that captures the exact bug scenario 2. Write edge case tests for correct expected behavior 3. Implement the minimal fix 4. Run tests — iterate until all pass 5. Run full lint/build verification

Step 8: Summarize¶

## Bug Fix Summary

**Bug**: {description}
**Root cause**: {what was actually wrong}
**Services affected**: {list}

### Changes
| File | Change |
|------|--------|
| path/to/file.go | Fixed field name from X to Y |
| ... | ... |

### Verify after deploy
- [ ] {what to check in staging/production}

Important Rules¶

NEVER fix just one layer and declare done — always trace the full path
NEVER run migrations or destructive DB operations without showing the SQL/commands and getting user approval
When a fix reveals another issue, STOP and re-map before continuing (Step 6 rule)
Always grep for residual references after any rename/refactor
For client-reported issues, check if the fix should be server-side first — this is the #1 wrong initial approach
Don't overwrite complete files with minimal stubs — preserve existing code
Check persistence-layer fields match code models — missing fields in scan/bind targets cause silent bugs
Verify enum values match across all layers (code, database constraints, client types) — mismatches are the most frequent multi-service bug

Gotchas¶

STOP-AND-REMAP when a fix reveals a new issue¶

If implementing a fix surfaces a problem in another layer, STOP. Do not chase the new issue. Go back to Step 3, re-map remaining layers, update the plan, then continue. Chasing emergent issues without remapping is the #1 cause of fix-reveal-fix cycles that burn hours.

Don't fix one layer and declare done¶

The temptation is strong: one edit in the handler, tests green, ship. But the bug almost always crosses layers. Always trace the full path (request → persistence → response) before declaring the fix complete.

For client-reported bugs, investigate server-side FIRST¶

The most common wrong approach: user reports "form shows wrong value", dev fixes client. Actual root cause: server response has wrong field name. Always ask "which layer owns this responsibility?" before editing.

Always grep for residual references after rename/refactor¶

The old name lives in a config file, a comment, a test fixture, a generated type you forgot. Run a grep with the old identifier across the entire repo — not just the language you edited.

NEVER run destructive DB operations without approval¶

Showing the SQL and waiting for explicit "go" is non-negotiable. Even in dev environments. A forgotten WHERE clause or a wrong schema name wipes data. The skill shows, approves, then runs — never the reverse.

Enum drift across layers is silent and deadly¶

Server defines status = "ACTIVE", database CHECK constraint allows "active", client type uses "Active". None throws a compile error, all three mismatch at runtime. When reviewing a bug that involves a state field, check all three spellings explicitly.

Route order matters in most routers¶

In many routers (Chi, Fiber, Express), static routes must be registered before dynamic-param routes — otherwise /users/me gets captured by /users/:id. If debugging a 404 on what should be a static route, check the registration order.

Don't overwrite files with minimal stubs¶

When editing, preserve existing code. Rewriting a 500-line handler with a 20-line stub "to simplify" destroys context the bug fix depends on. Use Edit for surgical changes; reserve Write for truly new files.

Check DB binding fields match the code model¶

A struct with 8 fields scanning a 10-column row will bind only 8 and silently drop the other 2. The bug is invisible until a user notices missing data. Verify field-to-column alignment in scan/bind sites.