Claude Code hooks: the half of Claude Code nobody uses

I was halfway through writing this post when I decided to fact-check myself. Opened ~/.claude/settings.json, expecting three or four hooks I’d forgotten about. There was one. A Stop hook that plays a ding and says “your turn” when Claude finishes thinking. My hooks-to-skills ratio: 1 to 42.
I’m not picking on myself. This is the median. I checked seven of my own project configs after that: zero hooks each. Skills got the awesome-lists. Hooks got a footnote. And the silence is costing people money in tokens, missed incidents in security, and a control surface that ships in the box and never gets wired up.
TL;DR
- Claude Code has 25+ hook event types. The average user has configured zero or one. I checked seven of my own project configs: zero hooks. My user-level config: one Stop hook.
- Skills feel like adding capability, which is fun. Hooks feel like writing policy, which sounds like work. The work is the part that pays.
- One PreToolUse hook that swaps Grep for LSP cuts navigation tokens by 73-91% in the kit’s own benchmarks (nesaminua/claude-code-lsp-enforcement-kit, MIT).
- The enterprise hook playbook (SOC2 audit, SIEM integration, supply-chain scanning, org-wide token budgets) does not exist publicly yet. If your platform team writes one now, you’re early.
Why skills get all the attention and hooks get none
Skills fit a familiar mental model: drop a markdown file, write a description, the agent picks it up. They feel like adding capability. Hooks are different. You write a script that fires at an execution boundary and returns an exit code or JSON to block, modify, or augment what happens next. They feel like writing policy. I saw a Reddit thread that nailed this for me: skills change what the model can do, hooks change when it can do it.
That asymmetry shows up in adoption. Skills make Claude more capable; hooks make it more predictable. If you’re a solo dev optimizing for capability, you reach for skills. If you’re a team lead trying to keep five engineers from doing five different unsafe things, you reach for hooks. Most early adopters were solo, so the awesome-lists filled up with skills first.
There’s also a distribution problem. A skill is a markdown file. You can paste it in a Slack message. A hook is a settings.json entry (or a bundled file in a plugin or skill) pointing to a shell script that touches your filesystem. Sharing it requires trust, a setup ritual, and someone willing to chmod +x a stranger’s bash. That’s a real barrier.
Look at the curated lists on GitHub. Hook-only repos trail the mixed lists (skills, slash commands, MCP, hooks) by a wide margin, and broader awesome-claude-code lists treat hooks as a footnote. Even the post on how the creator of Claude Code uses it mentions hooks only in passing, and zero of the 26 comments noticed.
My audit: 42 user-level skills installed. 1 hook (the Stop notification mentioned above). Across 7 project configs: 0 hooks. The one that stings is my reddit-mcp repo, which gives Claude posting and deletion permissions on my Reddit account. Zero hooks there too. If I’m typical, the median ratio is something like 40 skills to 1 hook.
My deep dive on Claude Code’s leaked source showed the harness has a hooks/ directory with 104 files. That’s a lot of internal scaffolding for something almost nobody configures; certainly nobody in our org had.
The hook event reference (Anthropic’s official taxonomy)

There are at least 25 hook events documented in Anthropic’s hooks documentation. Most third-party tutorials cover four. Here’s the actual catalog, grouped by cadence, so you can see the full surface.
| Cadence | Events | Effect |
|---|---|---|
| Once per session | SessionStart, SessionEnd | SessionStart can inject context |
| Once per turn | UserPromptSubmit, Stop, StopFailure | UserPromptSubmit and Stop can block or modify |
| Per tool call | PreToolUse, PostToolUse, PostToolUseFailure, PermissionRequest, PermissionDenied | PreToolUse can block or modify |
| Async / lifecycle | WorktreeCreate, WorktreeRemove, Notification, ConfigChange, InstructionsLoaded, CwdChanged, FileChanged | Observation only |
| Agent team | SubagentStart, SubagentStop, TeammateIdle, TaskCreated, TaskCompleted | Observation only |
| Compaction | PreCompact, PostCompact | PreCompact can inject context |
| MCP | Elicitation, ElicitationResult | Elicitation can respond |
Configuration sits at four levels: user-wide (~/.claude/settings.json), project-shared (.claude/settings.json, commit it), project-private (.claude/settings.local.json, gitignored), and managed policy (org admin only, individual devs can’t disable). Most public configs use the first two. Managed policy is where org-wide control lives, and almost nobody ships templates for it.
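For reference, here’s the shape of a minimal hook entry in a project-shared .claude/settings.json. The hooks/matcher structure follows Anthropic’s documented schema, but verify the exact field names against your Claude Code version; guard-writes.sh is a hypothetical script name.

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/guard-writes.sh"
          }
        ]
      }
    ]
  }
}
```

Commit this file and every clone of the repo gets the same policy, which is the distribution story hooks have that skills don’t need.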
Handler types: command, http, prompt, agent. Most public examples use command. The http handler is the door to centralized policy, where one webhook enforces the same rule across every developer in the org. I haven’t seen it used in any public repo yet.
Exit codes are easy to get wrong. 0 means success. 2 is a blocking error, but behavior depends on the event: PreToolUse blocks the tool call, Stop prevents session end, other events treat it as non-fatal. Any other code is non-blocking. The model can’t see why a hook fired unless the hook writes to stderr; this is the most common reason hooks feel mysterious in practice.
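To make the stderr contract concrete, here’s a minimal PreToolUse handler sketch in Python. The stdin payload fields (tool_name, tool_input, file_path) follow the documented hook input shape, but treat them as assumptions to verify against your version:

```python
"""Minimal PreToolUse hook sketch: block agent writes to .env files,
and always say why on stderr so the model can self-correct.

Assumes the documented hook input shape:
  {"tool_name": "...", "tool_input": {"file_path": "..."}}
"""
import json
import sys


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, stderr_message). 0 allows, 2 blocks."""
    tool = event.get("tool_name", "")
    path = event.get("tool_input", {}).get("file_path", "")
    if tool in ("Write", "Edit") and path.endswith(".env"):
        # Exit 2 blocks the call; the stderr text is the only context
        # the model gets, so make it actionable, not silent.
        return 2, f"Blocked {tool} on {path}: .env files are off-limits to the agent."
    return 0, ""


# Entry point when invoked by Claude Code (reads the event from stdin):
#   code, message = decide(json.load(sys.stdin))
#   if message:
#       print(message, file=sys.stderr)
#   sys.exit(code)
```

The point is that every blocking branch returns a message the model can actually read; a bare exit 2 is the “mysterious hook” failure mode from above.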
The hook surface is wide. You can block, modify, observe, or augment almost any event in the agent’s lifecycle. The shape of what’s possible is “almost anything.” The shape of what’s actually configured on most machines is “nothing.”
10 hooks people are actually running
These are the hooks I’d put in front of my team. None are “block writes to .env” or “format on save.” Each one is published somewhere (a repo or a Reddit thread) and does something non-obvious. Ranked roughly from most surprising to most foundational.
1. LSP-over-Grep enforcement. PreToolUse on Grep, Glob, Bash, Read. nesaminua/claude-code-lsp-enforcement-kit (MIT). Blocks Grep calls containing code symbols and forces the agent to use LSP find_definition and find_references instead. Documented per-call savings: definition lookup drops from ~6,500 to ~580 tokens. Real workweek aggregate: 320k to 85k navigation tokens, a 73% reduction. Works with cclsp/Serena, supports TypeScript and 13+ other languages. It punishes a default behavior. The agent can still grep, but the cost is paid in friction, not silently in tokens.
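A stripped-down sketch of the pattern, not the kit’s actual code, and with a much cruder symbol heuristic than it ships:

```python
"""Sketch of LSP-over-Grep enforcement: refuse Grep calls whose pattern
looks like a code symbol and point the agent at LSP tools instead."""
import re

# Heuristic only: single identifiers like BlogPostLayout or use_auth_hook.
SYMBOL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def review_grep(event: dict) -> tuple[int, str]:
    """Return (exit_code, stderr_message) for a PreToolUse event."""
    if event.get("tool_name") != "Grep":
        return 0, ""
    pattern = event.get("tool_input", {}).get("pattern", "")
    if SYMBOL_RE.match(pattern) and (pattern[0].isupper() or "_" in pattern):
        return 2, (f"Grep for symbol '{pattern}' blocked. "
                   "Use LSP find_definition / find_references instead.")
    return 0, ""
```

Plain-text searches (anything with spaces or regex syntax) pass through untouched, so the friction lands only on symbol lookups, which is where the token savings are.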
2. Knowledge-graph compile of the project. Skill + hook combo. safishamsi/graphify. Karpathy-style: instead of re-reading raw files every session, compile the project into a structured wiki once, then query the wiki via skill. Hook installs the skill and registers /graphify as the entry point. Claimed 71.5x token reduction per query on a mixed corpus. Treats context as a build artifact, not a runtime cost.
3. The 4-hook workflow enforcement stack. SessionStart + PreToolUse on Edit + Stop + PostToolUse on git commit. From tacit7 in the hooks vs skills thread. SessionStart tells the agent to read the workflow skill. PreToolUse on Edit refuses if no task is registered. Stop refuses if the task isn’t annotated. PostToolUse on git commit logs the commit to an external app. Four hooks turn a probabilistic agent into a procedurally-compliant teammate, end to end.
4. “Fail twice, stop and ask.” PostToolUseFailure + Notification. Another one from the same thread. If the same tool fails twice with similar errors, the hook halts the session and pings the human. The most common Claude Code failure mode is the agent looping on a bad call. This rule catches it in maybe 20 lines of code.
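The whole guard really does fit in one small function. This sketch keeps state in a JSON file and uses difflib for the “similar errors” test; every name here is illustrative, not lifted from the thread:

```python
"""Sketch of the fail-twice guard: signal a halt when the same tool
fails twice in a row with similar-looking errors."""
import json
from difflib import SequenceMatcher
from pathlib import Path

SIMILAR = 0.8  # ratio above which two error strings count as "the same"


def record_failure(tool: str, error: str, state_path: Path) -> bool:
    """Record a PostToolUseFailure; return True if the session should halt."""
    prev = json.loads(state_path.read_text()) if state_path.exists() else {}
    last = prev.get(tool)
    state_path.write_text(json.dumps({**prev, tool: error}))
    if last is None:
        return False  # first failure for this tool: let it retry
    return SequenceMatcher(None, last, error).ratio() >= SIMILAR
```

When it returns True, the hook exits 2 and pings the human however you already get pinged; the document’s Stop-hook ding works fine.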
5. Per-file typecheck after every edit. PostToolUse on Write/Edit. From DevMoses in the “5 levels of Claude Code” post. Runs tsc --noEmit (or equivalent) on the single file Claude just edited, instead of flooding the agent with 200+ project-wide errors. Inverts the default. The agent gets a tight feedback loop on its own work without drowning in unrelated noise.
6. Dynamic permission control via hook-managed policy. PreToolUse + PermissionRequest. Also from the same thread. Hooks flip permissions on and off at runtime based on context (project, session source, current task). Claude Code’s permission model is mostly static. This hook makes it conditional, which matters for orgs with role-based access.
7. Session-conclusion guard. Stop + SessionEnd. connerohnesorge/conclaude. Refuses to end a session if there’s uncommitted state, in-progress work, or unmerged checkpoints. Stops the “I closed the terminal and lost work” failure mode at the harness level.
8. Filesystem offload of large tool outputs. PostToolUse on Read/Bash/WebFetch. sheeki03/Few-Word. When a tool returns more than N tokens, write the result to disk and return a short summary plus a path. Treats the filesystem as an extension of context. The agent can re-read the slice it actually needs instead of choking on a 50KB file.
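A sketch of the core decision, assuming a rough 4-characters-per-token estimate; the budget, spool layout, and names are mine, not Few-Word’s:

```python
"""Sketch of output offload: when a tool result exceeds a token budget,
write it to disk and return a short stub with the path."""
import hashlib
from pathlib import Path

CHARS_PER_TOKEN = 4   # crude heuristic, good enough for a threshold
TOKEN_BUDGET = 2000   # offload anything bigger than this


def offload(output: str, spool_dir: Path) -> str:
    """Return the original output, or a stub pointing at a spooled file."""
    if len(output) <= TOKEN_BUDGET * CHARS_PER_TOKEN:
        return output
    spool_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(output.encode()).hexdigest()[:12] + ".txt"
    path = spool_dir / name
    path.write_text(output)
    # The agent can Read the slice it needs from `path` later.
    return (f"[output ~{len(output) // CHARS_PER_TOKEN} tokens, saved to {path}]\n"
            f"First 400 chars:\n{output[:400]}")
```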
9. Cache-fix patches injected via hook. SessionStart. Rangizingo/cc-cache-fix. Patches a documented Claude Code bug where the db8 filter strips deferred_tools_delta records, breaking the prompt cache on resumed sessions. The author’s analysis claims it wasted ~250,000 API calls per day globally before being noticed. The hook applies the patch at session start. Hooks as a deployment mechanism for community fixes. No need to wait for Anthropic to ship a release. (I’d qualify the 250k figure as analysis-based, not independently confirmed by Anthropic.)
10. Lessons-learned hooks (“encode the mistake”). Pattern, not a single repo. From Aggressive-Sweet828 in the hooks vs skills thread. Every time the agent makes a mistake you don’t want repeated, turn it into a hook. Over time, your hooks become your team’s quality bar, written in code instead of whispered in code review. Reframes hooks as institutional memory, not just guardrails.
Of the ten, six are now sitting in my own settings.json: the LSP enforcement kit (#1), the graphify knowledge-graph compile (#2), the fail-twice loop guard (#4), the filesystem offload for large outputs (#8), the cache-fix patch (#9), and the lessons-learned pattern (#10). The LSP kit and the fail-twice guard are the two I have the most to say about so far. The other four are too new for me to have a real story yet, and I’ll update this section as they earn one.
LSP enforcement kit (#1): install was a git clone and a single bash install.sh. The installer is idempotent and merges into ~/.claude/settings.json without touching what’s already there. First session into the portfolio repo, the hook fired on the second tool call and refused a Grep for BlogPostLayout. The agent reached for find_definition instead and landed on the right file. The thing I didn’t expect was how often I write prompts that assume Grep (“find where we use X”). The agent now has to translate those into LSP, which takes a beat. I’m keeping it on this repo and waiting to see what the weekly token total actually does.
Fail-twice loop guard (#4): hand-rolled in about 20 lines of bash because the pattern is small enough I didn’t want a dependency. It hasn’t fired yet, probably because I haven’t kicked off anything ambitious enough since installing it. The version I wrote compares the last two PostToolUseFailure events for the same tool name and a similar error substring, and pings me via the same Stop-hook ding I already had. If it ever fires, I’ll update this section with what it caught.
The LSP entry is the most honest data point here. It’s widely cited, and the kit ships its own reproducible benchmarks. On the operations the agent does dozens of times a day, the per-call savings are large: a definition lookup drops from roughly 6,500 tokens to about 580.
The headline 80% savings number on the original Reddit post is anecdotal. The 91% per-call and 73% workweek-aggregate numbers come from the kit’s own benchmarks, which are reproducible. I’d treat the kit numbers as the reliable ones and the Reddit headline as directionally right.
Skills vs hooks: a decision table
Skills describe what to try. Hooks define what must happen. Pair them: a skill describes the workflow, a hook enforces the precondition.
| Dimension | Skills | Hooks |
|---|---|---|
| Mental model | Add capability (request) | Define policy (enforcement) |
| Distribution | Markdown file, frontmatter | settings.json entry pointing to script |
| Determinism | Probabilistic (model decides if/when) | Deterministic (fires every event match) |
| Token cost | Loaded on demand, ~free when inactive | Often saves tokens (LSP swap, output offload) |
| Observability | Model writes about using it in transcripts | Side effects + exit code; model often can’t see why a hook fired |
| Best for | Reusable workflows, domain expertise | Guardrails, audit, cost control, integration |
| Failure mode | Model forgets to use it | Hook breaks the session if poorly written |
| Sharing friction | Low (a .md file) | Higher (script + permission + JSON) |
A concrete pair: a “deploy” skill describes the deployment runbook (build, test, push, monitor). A PreToolUse hook on the deploy command verifies the test suite passed in the last 5 minutes and you’re on a release branch, and refuses otherwise. The skill teaches. The hook insists.
The insight that did the most for my own thinking: guardrails belong in hooks because blocks need to be deterministic, not described. A skill that says “don’t push to main without tests” is a polite request the model can ignore. A PreToolUse hook that returns exit code 2 with "decision": "block" cannot be ignored.
My comparison of ForgeCode and Claude Code called out hooks as one of Claude Code’s real differentiators. Re-reading it now, I underweighted them. ForgeCode being faster doesn’t matter if your team needs deterministic policy.
The enterprise patterns nobody is writing about
Anthropic shipped the primitives this past quarter: allowManagedHooksOnly, allowedHttpHookUrls, httpHookAllowedEnvVars, plus a drop-in managed-settings.d/ directory for stacking policy from multiple teams. What’s missing is the layer above: published end-to-end SIEM, SOC2, and audit playbooks built on those primitives. I can’t find a single public repo shipping an org-wide audit template I’d actually deploy.
The vendors closest to filling that gap are the ones with skin in the AI-supply-chain game. (Disclosure: JFrog is my employer; not paid to link this.) Supply Chain Attackers Are Coming for Your Agents walks the Shai-Hulud npm worm, the postmark-mcp exfiltration, and the LiteLLM compromise as cases where a PreToolUse hook on the install boundary would have caught the payload. JFrog AI Catalog Evolves to Detect Shadow AI and Govern MCPs covers the upstream gateway angle that pairs with client-side hooks. Neither is a hook playbook, but they’re closer than anything else I’ve found.
Security and policy enforcement. A PreToolUse hook on Write blocks anything outside approved paths (scope to Write|Edit|MultiEdit). A UserPromptSubmit hook scrubs known credentials (AWS access key prefix, GitHub PAT, JFrog tokens) before the prompt leaves the machine, returning JSON decision: "block" on a match. The session file where I found my database password (the one that started Claudoscope) was created because Claude Code read a .env and echoed the contents back. A PreToolUse hook on Read with a .env matcher would have refused that read. Claudoscope catches the credential after it lands in the JSONL; the hook prevents it from landing. The full story is in how I found a database password in a session file.
Compliance and audit. A PostToolUse hook with handler type http sends a structured event to a SIEM (Splunk, Datadog, Elastic) on every tool call: session ID, user, tool name, sanitized input, timestamp, project. SessionStart and SessionEnd book-end the audit log. Combine with managed policy and allowManagedHooksOnly: true so individual devs can’t disable the audit hook locally. This is the org-wide control surface the docs describe but no one’s shipping templates for.
Cost control. Per-developer token budget: a PreToolUse hook logs token estimates per tool call, totals them per day in a small SQLite file, and denies expensive calls once the budget is hit. Same pattern for model routing: rewrite the model selection or refuse a dispatch if the request is trivial enough that Opus is overkill.
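A sketch of the budget ledger, with a chars-based token estimate and an illustrative schema; the daily budget number is a placeholder:

```python
"""Sketch of a per-developer daily token budget backed by SQLite:
record estimated spend per tool call, deny once the budget is hit."""
import sqlite3
from datetime import date

DAILY_BUDGET = 500_000  # estimated tokens per developer per day (placeholder)


def charge(conn: sqlite3.Connection, user: str, est_tokens: int,
           budget: int = DAILY_BUDGET) -> bool:
    """Record the spend; return False (deny the call) if it would bust the budget."""
    conn.execute("CREATE TABLE IF NOT EXISTS spend (day TEXT, user TEXT, tokens INTEGER)")
    today = date.today().isoformat()
    used = conn.execute(
        "SELECT COALESCE(SUM(tokens), 0) FROM spend WHERE day=? AND user=?",
        (today, user)).fetchone()[0]
    if used + est_tokens > budget:
        return False
    conn.execute("INSERT INTO spend VALUES (?, ?, ?)", (today, user, est_tokens))
    conn.commit()
    return True
```

A PreToolUse hook calls charge with its token estimate and exits 2 on False; point every developer’s hook at the same database file (or an http handler) and the budget becomes org-wide.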
DevSecOps and supply chain. This is where my JFrog day job comes in. PreToolUse on Bash intercepts npm install, pip install, etc., and runs the package through a vulnerability scanner before allowing the install. PreToolUse on Write runs the staged change through JFrog Advanced Security and refuses the write if SAST finds an issue.
The conversation that keeps coming up on our side: whether agent-initiated package installs count as a developer action or a CI action under existing supply-chain policy. We don’t have a clean answer yet. A PreToolUse hook that routes installs through Xray would collapse the distinction; the same policy applies wherever the install happens.
Swap in your scanner of choice. Supply-chain controls don’t have to live in CI anymore. They can live at the agent’s tool-call boundary, closer to where the risk is introduced.
If you’re a platform team standing up Claude Code at scale, you’re filling that gap yourself, which is fine but probably not what you signed up for.
The footguns
Hooks are powerful because they run as the user. That’s also the danger. Four real failure modes I’ve watched people hit.
Infinite loop. A hook on PostToolUse triggers another tool call that triggers the same hook. Fix: add a sentinel (env var or marker file) and short-circuit on repeat invocation. This bites the first time you write a PostToolUse that does anything substantive.
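The sentinel fix is a few lines. This sketch uses an environment variable, which is inherited by anything the hook spawns; a marker file works the same way:

```python
"""Sketch of the sentinel short-circuit: a hook sets a marker before
doing its work and bails immediately if the marker is already set,
breaking the PostToolUse -> tool call -> PostToolUse loop."""
import os

SENTINEL = "CLAUDE_HOOK_ACTIVE"  # illustrative name


def run_once(work, env=os.environ):
    """Run `work` only if this hook isn't already on the stack."""
    if env.get(SENTINEL):
        return None  # re-entered: short-circuit instead of looping
    env[SENTINEL] = "1"
    try:
        return work()
    finally:
        del env[SENTINEL]  # clear even if `work` raises
```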
The hook breaks the session and Claude has no idea why. From the same thread: “if a hook fails the model has no idea why and can’t self-correct.” Mitigation: return useful text on stderr with exit code 2 so the model gets context, even when blocking. The default behavior of a silent block is the worst possible UX.
Permission deadlock. Also from the same thread: “I’ve already seen it deadlock a session when the hook permissions were set too tight.” Always test with --debug first. Always.
Shell startup pollution. Anthropic’s docs explicitly warn: shell profiles printing text on startup interfere with JSON parsing. A single echo line in your .bashrc will silently break every JSON-output hook on your machine. This one is hilarious until it happens to you.
Frequently asked questions
What is a Claude Code hook?
A user-defined command, HTTP endpoint, prompt, or agent invocation that fires at a specific lifecycle point: a tool call, session start, prompt submission, or one of 20+ other events. Defined in settings.json or bundled with a plugin or skill. It returns control via exit codes and structured JSON, and can block tool calls, modify their input, inject context, or just observe. (Anthropic docs, 2026)
What’s the difference between Claude Code hooks and skills?
Skills are markdown workflows the model decides whether to use. Hooks are deterministic execution-boundary callbacks: they fire every time the matched event happens, regardless of what the model decides. Use skills to teach a workflow; use hooks when the precondition is non-negotiable. They pair well together, since the skill describes the procedure and the hook makes sure the agent actually followed it.
Can Claude Code hooks reduce token usage?
Yes, significantly. The most-cited example: a PreToolUse hook that swaps Grep for LSP-based code navigation reduces per-call tokens from ~6,500 to ~580 (a 91% drop) for definition lookups, with a documented 73% real-world weekly aggregate reduction (nesaminua/claude-code-lsp-enforcement-kit, MIT). Other patterns (sandboxed tool output, knowledge-graph compilation) report 71.5x to 98% reductions in their respective scopes.
Where I landed
I came into this thinking I’d find a few clever hooks worth sharing. I came out wanting to spend a weekend writing the org-wide policy template that doesn’t exist.
If you’ve been shipping skills and ignoring hooks, you’ve taken the easier half. Skills are the part Claude can ignore. Hooks are the part it can’t. For a solo dev that distinction is small. For a team it’s most of the value.
Practical first move, if you’re still reading: write the UserPromptSubmit hook that scrubs your most likely credentials before they leave the machine. Maybe 30 lines of Python or bash. It will catch a real incident inside a month. I’d bet on it.
After that, audit your own .claude/settings.json. If the count is zero or one, you’ve got a lot of room. (And if you also want to see what your sessions are actually doing while you sort out which hooks to write, that’s what I built Claudoscope for.)