From c219b95dbd443a1d63013b228f4c31ac39226c57 Mon Sep 17 00:00:00 2001 From: Ohad Livne Date: Thu, 12 Feb 2026 08:42:45 +0200 Subject: [PATCH] Add a skill for profiling and optimization --- .../skills/performance-optimization/SKILL.md | 120 ++++++++++++++++++ 1 file changed, 120 insertions(+) create mode 100644 .claude/skills/performance-optimization/SKILL.md diff --git a/.claude/skills/performance-optimization/SKILL.md b/.claude/skills/performance-optimization/SKILL.md new file mode 100644 index 0000000..39046da --- /dev/null +++ b/.claude/skills/performance-optimization/SKILL.md @@ -0,0 +1,120 @@ +--- +name: performance-optimization-expert +description: Scientific performance optimization workflow for backend and full-stack systems using profiling-first investigation, bottleneck isolation, targeted fixes, and before/after verification. Use when requests involve slow endpoints, high p95/p99 latency, low throughput, CPU saturation, memory growth, I/O stalls, DB query slowness, lock/contention issues, or explicit requests to benchmark/profile/optimize code. +--- + +# Performance Optimization Expert + +Use this workflow on the changed path first, then expand scope only if evidence requires it. + +## Operating Rules + +1. Measure before optimizing. +2. Optimize the biggest bottleneck first (Amdahl's Law). +3. Verify that "slow" code is on the hot path. +4. State tradeoffs (complexity, readability, risk, correctness). +5. Report concrete deltas (p95, throughput, memory, CPU, query time). + +## Workflow + +### 1) Define the performance problem + +Capture: +- Slow operation/endpoint/function. +- Current numbers (or explicit "not measured yet"). +- Trigger conditions (data size, concurrency, environment). +- Target/SLO. + +If unknown and measurable, gather measurements first. Do not guess. + +### 2) Build baseline + +Collect: +- Latency p50/p95/p99. +- Throughput (req/s or ops/s). +- CPU, RSS/heap, I/O wait, network. +- Error/timeout rate. + +If direct measurement is unavailable, estimate complexity and I/O/memory behavior from code and mark it as estimated. + +### 3) Classify bottleneck + +Classify one or more: +- CPU-bound. +- I/O-bound (DB/network/disk). +- Memory-bound. +- Contention-bound (locks/pools/starvation/rate limits). + +### 4) Locate root cause + +Trace hot path and name exact files/functions/queries causing most cost. +Prefer profiler traces, flamegraphs, or query plans over intuition. + +### 5) Apply targeted fix + +Pick highest impact-to-effort first: +- Algorithm/data structure improvement. +- Query/index/N+1 reduction. +- Caching with explicit invalidation and TTL. +- Async/batching/streaming for I/O. +- Allocation/copy reduction and memory-safe iteration. +- Concurrency model adjustment (pool sizing, lock granularity, asyncio vs multiprocessing). + +Avoid speculative micro-optimizations unless hot path evidence supports them. + +### 6) Verify and guard + +Measure with the same method and dataset as baseline. +Report before/after metrics and check: +- Correctness. +- Regression on related paths. +- Resource side effects (memory/CPU error rate). + +## Tool Selection + +Use tools that match the stack: +- Python CPU: `cProfile`, `py-spy`, `line_profiler`, flamegraphs. +- Python memory: `tracemalloc`, `memory_profiler`, `scalene`. +- DB: `EXPLAIN ANALYZE`, query logs, ORM query counting. +- FastAPI/async: event-loop blocking checks, sync I/O detection. +- Distributed systems: tracing/APM spans for cross-service latency. + +## Anti-Patterns to Flag + +- Premature optimization without baseline data. +- Micro-optimizing non-hot code while major I/O bottlenecks remain. +- Complex caching without robust invalidation. +- Ignoring database query plans and index strategy. +- Blocking synchronous calls inside async request handlers. +- Loading full datasets when pagination/streaming is sufficient. +- Recomputing expensive values instead of reuse/memoization. + +## Required Response Format + +Always structure findings as: + +```markdown +## Performance Analysis + +### Current State +- Metric: ... +- Target: ... +- Gap: ... + +### Bottleneck Identified +... + +### Root Cause +... + +### Recommended Fix +... + +### Implementation +... + +### Verification Plan +... +``` + +If profiling data is missing, explicitly say so and make "gather profile/baseline" the first action.