Compare commits

...

2 commits

SHA1        Message                                      Date
57e8b815be  Add a skill for profiling and optimization   2026-02-12 08:43:17 +02:00
6d96354652  Track Claude Code settings                   2026-02-12 08:43:16 +02:00
2 changed files with 156 additions and 0 deletions

.claude/settings.json Normal file

@@ -0,0 +1,36 @@
{
"permissions": {
"allow": [
"Bash(/home/agent/.local/bin/pylsp:*)",
"Bash(apt list:*)",
"Bash(bash -n:*)",
"Bash(claude mcp list)",
"Bash(claude plugin --help:*)",
"Bash(claude plugin install --help:*)",
"Bash(claude plugin list:*)",
"Bash(claude plugin marketplace --help:*)",
"Bash(claude plugin marketplace list)",
"Bash(claude plugin validate --help:*)",
"Bash(dpkg -l:*)",
"Bash(echo:*)",
"Bash(env)",
"Bash(git log:*)",
"Bash(grep:*)",
"Bash(head:*)",
"Bash(ls:*)",
"Bash(pylsp:*)",
"Bash(tree:*)",
"Bash(uv run mypy:*)",
"Bash(uv run tach:*)",
"Bash(uv tool list:*)",
"Bash(wc:*)",
"Bash(xargs cat:*)",
"WebFetch(domain:docs.cloud.google.com)",
"WebFetch(domain:docs.gauge.sh)",
"WebFetch(domain:docs.temporal.io)"
]
},
"enabledPlugins": {
"gopls-lsp@claude-plugins-official": true
}
}


@@ -0,0 +1,120 @@
---
name: performance-optimization-expert
description: Scientific performance optimization workflow for backend and full-stack systems using profiling-first investigation, bottleneck isolation, targeted fixes, and before/after verification. Use when requests involve slow endpoints, high p95/p99 latency, low throughput, CPU saturation, memory growth, I/O stalls, DB query slowness, lock/contention issues, or explicit requests to benchmark/profile/optimize code.
---
# Performance Optimization Expert
Use this workflow on the changed path first, then expand scope only if evidence requires it.
## Operating Rules
1. Measure before optimizing.
2. Optimize the biggest bottleneck first (Amdahl's Law).
3. Verify that "slow" code is on the hot path.
4. State tradeoffs (complexity, readability, risk, correctness).
5. Report concrete deltas (p95, throughput, memory, CPU, query time).
## Workflow
### 1) Define the performance problem
Capture:
- Slow operation/endpoint/function.
- Current numbers (or explicit "not measured yet").
- Trigger conditions (data size, concurrency, environment).
- Target/SLO.
If unknown and measurable, gather measurements first. Do not guess.
### 2) Build baseline
Collect:
- Latency p50/p95/p99.
- Throughput (req/s or ops/s).
- CPU, RSS/heap, I/O wait, network.
- Error/timeout rate.
If direct measurement is unavailable, estimate complexity and I/O/memory behavior from code and mark it as estimated.
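When no APM data is available, a baseline can be captured with the standard library alone. A minimal sketch (the `sorted` workload is a placeholder for the real operation under test):

```python
import statistics
import time

def measure_latency(fn, *args, runs=200):
    """Time repeated calls to fn and report p50/p95/p99 in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000)
    # quantiles(n=100) yields 99 cut points; indices 49/94/98 are p50/p95/p99.
    cuts = statistics.quantiles(samples, n=100)
    return {"p50_ms": cuts[49], "p95_ms": cuts[94], "p99_ms": cuts[98]}

baseline = measure_latency(sorted, list(range(10_000)))
print(baseline)
```

Record the harness parameters (runs, dataset, machine) with the numbers so the verification step can reuse them exactly.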
### 3) Classify bottleneck
Classify one or more:
- CPU-bound.
- I/O-bound (DB/network/disk).
- Memory-bound.
- Contention-bound (locks/pools/starvation/rate limits).
### 4) Locate root cause
Trace hot path and name exact files/functions/queries causing most cost.
Prefer profiler traces, flamegraphs, or query plans over intuition.
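For Python CPU hot paths, a `cProfile` run sorted by cumulative time names the exact functions. A minimal sketch (`hot` stands in for the suspect code):

```python
import cProfile
import io
import pstats

def hot(n):
    # Placeholder for the function suspected to dominate the hot path.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
hot(200_000)
profiler.disable()

# Sort by cumulative time so the most expensive call paths appear first.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```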
### 5) Apply targeted fix
Pick highest impact-to-effort first:
- Algorithm/data structure improvement.
- Query/index/N+1 reduction.
- Caching with explicit invalidation and TTL.
- Async/batching/streaming for I/O.
- Allocation/copy reduction and memory-safe iteration.
- Concurrency model adjustment (pool sizing, lock granularity, asyncio vs multiprocessing).
Avoid speculative micro-optimizations unless hot path evidence supports them.
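As an illustration of the first item, an O(n·m) membership scan on a hot path collapses to O(n + m) with a set (the functions and inputs below are hypothetical):

```python
def shared_ids_slow(a, b):
    # O(len(a) * len(b)): every `in` check scans the whole list b.
    return [x for x in a if x in b]

def shared_ids_fast(a, b):
    # O(len(a) + len(b)): one pass to build the set, one pass to filter.
    b_set = set(b)
    return [x for x in a if x in b_set]

a = list(range(0, 10_000, 2))   # hypothetical hot-path inputs
b = list(range(0, 10_000, 3))
assert shared_ids_slow(a[:300], b[:300]) == shared_ids_fast(a[:300], b[:300])
```

The equality assertion is the point: a data-structure swap must be verified against the original behavior, not just timed.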
### 6) Verify and guard
Measure with the same method and dataset as baseline.
Report before/after metrics and check:
- Correctness.
- Regression on related paths.
- Resource side effects (memory, CPU, error rate).
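A sketch of a before/after check using an identical method and dataset for both variants (the two functions are placeholders for the pre- and post-fix code):

```python
import timeit

data = list(range(50_000))

def before():
    # Pre-fix variant: manual accumulation loop.
    total = 0
    for x in data:
        total += x
    return total

def after():
    # Post-fix variant: built-in sum over the same data.
    return sum(data)

# Same dataset, same timing method, min-of-repeats to damp noise.
t_before = min(timeit.repeat(before, number=50, repeat=5))
t_after = min(timeit.repeat(after, number=50, repeat=5))
print(f"before: {t_before:.4f}s  after: {t_after:.4f}s")
assert before() == after()  # correctness check alongside the timing
```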
## Tool Selection
Use tools that match the stack:
- Python CPU: `cProfile`, `py-spy`, `line_profiler`, flamegraphs.
- Python memory: `tracemalloc`, `memory_profiler`, `scalene`.
- DB: `EXPLAIN ANALYZE`, query logs, ORM query counting.
- FastAPI/async: event-loop blocking checks, sync I/O detection.
- Distributed systems: tracing/APM spans for cross-service latency.
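For the memory tools above, `tracemalloc` attributes allocations to source lines with no extra dependencies. A minimal sketch (the list comprehension stands in for the suspect allocation site):

```python
import tracemalloc

tracemalloc.start()

# Stand-in for a suspect allocation: a full list where a generator might do.
big = [str(i) * 10 for i in range(100_000)]

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Top allocation sites by total size, grouped by source line.
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```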
## Anti-Patterns to Flag
- Premature optimization without baseline data.
- Micro-optimizing non-hot code while major I/O bottlenecks remain.
- Complex caching without robust invalidation.
- Ignoring database query plans and index strategy.
- Blocking synchronous calls inside async request handlers.
- Loading full datasets when pagination/streaming is sufficient.
- Recomputing expensive values instead of reusing or memoizing them.
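The blocking-call anti-pattern can be demonstrated and fixed with `asyncio.to_thread`. A sketch (`blocking_io` is a stand-in for a sync DB or HTTP call):

```python
import asyncio
import time

def blocking_io():
    # Stand-in for a sync DB/HTTP call; time.sleep blocks whatever runs it.
    time.sleep(0.1)
    return "done"

async def handler_bad():
    return blocking_io()  # stalls every other coroutine for 100 ms

async def handler_good():
    return await asyncio.to_thread(blocking_io)  # runs off the event loop

async def main():
    # Ten concurrent "requests": the off-loop version overlaps the sleeps.
    start = time.perf_counter()
    results = await asyncio.gather(*(handler_good() for _ in range(10)))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} requests in {elapsed:.2f}s")
    return elapsed

elapsed = asyncio.run(main())
```

Had the handlers called `blocking_io()` directly, the ten requests would serialize to roughly one second instead of overlapping.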
## Required Response Format
Always structure findings as:
```markdown
## Performance Analysis
### Current State
- Metric: ...
- Target: ...
- Gap: ...
### Bottleneck Identified
...
### Root Cause
...
### Recommended Fix
...
### Implementation
...
### Verification Plan
...
```
If profiling data is missing, explicitly say so and make "gather profile/baseline" the first action.