---
title: "Hooks"
description: "Lifecycle hooks let a plugin run code at fixed points during the Assistant's lifecycle."
canonical_url: "https://www.vellum.ai/docs/extensibility/hooks"
md_url: "https://www.vellum.ai/md/docs/extensibility/hooks"
related:
  - "/docs/extensibility"
  - "/docs/extensibility/distribution"
  - "/docs/extensibility/plugins"
  - "/docs/extensibility/skills"
  - "/docs/extensibility/tools"
---

# Hooks

Run your own code at fixed points in a turn. Hooks let a plugin read or transform what flows through the Assistant without forking the core loop.

**Plugins are in beta.** The plugin API (`@vellumai/plugin-api`) is not yet stable and can change between releases. Pin the `peerDependencies` range your plugin declares, and expect breaking changes until we cut a 1.0. Hook names and context shapes can change with it.

A hook is a function that the Assistant calls at a known boundary in its lifecycle. The harness owns the loop, and your code runs at named points along the way. Each hook lives in its own file under `hooks/<name>.ts`, and the filename is the hook name.

## The Agent Loop

The diagram below maps what we call the **Agent Loop**. The nodes are **lifecycle events**: the points in time that each turn of a conversation passes through. The connecting **hooks** are the places your code can run as the turn moves from one event to the next.

[Diagram: the Agent Loop. Legend: Hook (fires on this transition); Control flow.]

The loop can iterate several times within a single user turn: every tool result returns to a fresh model call, and a `post-model-call` hook can choose to continue rather than end the turn. Because of this, `pre-model-call`, `post-model-call`, and `post-tool-use` can each fire more than once per turn.

### The Assistant Lifecycle

A plugin can also hook into lifecycle events that sit outside the Agent Loop. The diagram below shows where these hooks sit and how they interact with the Server that manages the Agent Loop.

[Diagram: the Assistant Lifecycle. Legend: Hook (fires on this transition).]

## Hooks reference

These are the lifecycle hooks this guide covers. The full set of wired hook names lives in the [`HOOKS` constant](https://github.com/vellum-ai/vellum-assistant/blob/main/assistant/src/plugin-api/constants.ts). Each entry below lists the hook's Context API contract.

`init` (`PluginInitContext`)

**When:** Once, when the plugin is first registered (on boot or install).

**Use it to:** Validate config and credentials and open resources. Throwing aborts the plugin's load.

| Field              | Type                     | Access    | Description                                                                      |
| ------------------ | ------------------------ | --------- | -------------------------------------------------------------------------------- |
| `config`           | `unknown`                | Read-only | Parsed config for this plugin, validated against the manifest.                   |
| `credentials`      | `Record<string, string>` | Read-only | Resolved credential values, keyed by the plugin's requiresCredential entries.    |
| `pluginStorageDir` | `string`                 | Read-only | Absolute path to the plugin's writable data directory, created during bootstrap. |
| `assistantVersion` | `string`                 | Read-only | Assistant semver, for defensive runtime checks.                                  |
| `logger`           | `PluginLogger`           | Read-only | Pino-compatible logger scoped to the plugin.                                     |
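
As a rough illustration of the "validate, then throw to abort" pattern, here is a minimal sketch of an `init` hook. The local interfaces stand in for the real `PluginInitContext` and copy only the fields from the table above. The `apiBaseUrl` config key and the `MY_SERVICE_TOKEN` credential name are hypothetical and exist only for this example.

```typescript
// hooks/init.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// PluginInitContext from "@vellumai/plugin-api" instead.
interface PluginLogger {
  info(msg: string): void;
}
interface PluginInitContext {
  readonly config: unknown;
  readonly credentials: Record<string, string>;
  readonly pluginStorageDir: string;
  readonly assistantVersion: string;
  readonly logger: PluginLogger;
}

// ASSUMPTION: hypothetical config shape, for this example only.
interface MyConfig {
  apiBaseUrl: string;
}

// Narrow the `unknown` config to a typed value, or throw.
function parseConfig(config: unknown): MyConfig {
  if (
    typeof config === "object" &&
    config !== null &&
    typeof (config as { apiBaseUrl?: unknown }).apiBaseUrl === "string"
  ) {
    return { apiBaseUrl: (config as { apiBaseUrl: string }).apiBaseUrl };
  }
  throw new Error("config.apiBaseUrl must be a string");
}

export default async function init(ctx: PluginInitContext): Promise<void> {
  const config = parseConfig(ctx.config);
  // ASSUMPTION: hypothetical credential key. Throwing aborts the plugin's load.
  if (!ctx.credentials["MY_SERVICE_TOKEN"]) {
    throw new Error("missing credential MY_SERVICE_TOKEN");
  }
  ctx.logger.info(`plugin ready, using ${config.apiBaseUrl}`);
}
```

Keeping the narrowing in a small pure function makes the failure path easy to unit test without booting the Assistant.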

`user-prompt-submit` (`UserPromptSubmitContext`)

**When:** Once per user turn, after messages are assembled and before the agent loop runs.

**Use it to:** Read or rewrite the message list the model is about to see.

| Field              | Type                     | Access    | Description                                                                                       |
| ------------------ | ------------------------ | --------- | ------------------------------------------------------------------------------------------------- |
| `conversationId`   | `string`                 | Read-only | Conversation the prompt was submitted on.                                                         |
| `userMessageId`    | `string`                 | Read-only | Persisted id of the user message that triggered the turn.                                         |
| `requestId`        | `string`                 | Read-only | Stable id for the request driving this turn.                                                      |
| `modelProfileKey`  | `string \| null`         | Read-only | Active inference profile key, or null when unchanged since last announced.                        |
| `isNonInteractive` | `boolean`                | Read-only | True when no human is present to answer clarifications (scheduled or headless runs).              |
| `prompt`           | `string`                 | Read-only | Resolved text of the user prompt, after slash-command expansion.                                  |
| `originalMessages` | `ReadonlyArray<Message>` | Read-only | The user's original message list. Snapshot only, never mutate.                                    |
| `latestMessages`   | `Message[]`              | Mutable   | The working list that flows into the agent loop. Mutate in place or replace via the return value. |
| `logger`           | `PluginLogger`           | Read-only | Logger scoped to the current turn.                                                                |
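
One way to use the mutable `latestMessages` field is sketched below: on a non-interactive run, the hook returns a partial that replaces the working list with a copy carrying one extra note. The `Message` shape here is an assumption made for illustration (the real type comes from `@vellumai/plugin-api` and may differ), and the note text is hypothetical.

```typescript
// hooks/user-prompt-submit.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// UserPromptSubmitContext from "@vellumai/plugin-api" instead.
// ASSUMPTION: this Message shape is hypothetical and only for illustration.
interface Message {
  role: "user" | "assistant";
  content: string;
}
interface UserPromptSubmitContext {
  readonly isNonInteractive: boolean;
  latestMessages: Message[];
}

const HEADLESS_NOTE =
  "No human is available this turn. Do not ask clarifying questions.";

// Return a new list with the note appended, leaving the input untouched.
function withHeadlessNote(messages: ReadonlyArray<Message>): Message[] {
  return [...messages, { role: "user", content: HEADLESS_NOTE }];
}

export default async function userPromptSubmit(
  ctx: UserPromptSubmitContext,
): Promise<Partial<UserPromptSubmitContext> | void> {
  if (!ctx.isNonInteractive) {
    return; // interactive turn: leave the message list as is
  }
  // Returning a partial replaces only latestMessages.
  return { latestMessages: withHeadlessNote(ctx.latestMessages) };
}
```

The sketch never touches `originalMessages`, which the table marks as a snapshot.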

`post-compact` (`PostCompactContext`)

**When:** After the loop compacts a conversation mid-turn, before the turn resumes. It fires on a compaction event rather than a fixed turn boundary, so it branches off the loop rather than sitting on a turn edge.

**Use it to:** Re-apply context that compaction dropped (for example memory injections) onto the compacted history before the next model call.

| Field              | Type                  | Access    | Description                                                                                                                                                   |
| ------------------ | --------------------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `history`          | `Message[]`           | Mutable   | The compacted message history to re-inject onto. The loop resumes the turn from the settled value.                                                            |
| `requestId`        | `string`              | Read-only | Stable id of the request driving this turn. Forward it onto the injector so re-applied blocks are attributed to the originating request.                      |
| `conversationId`   | `string`              | Read-only | Conversation the turn being compacted is scoped to.                                                                                                           |
| `isNonInteractive` | `boolean`             | Read-only | True when no human is present to answer clarifications (scheduled, background, or headless runs).                                                             |
| `modelProfileKey`  | `string \| null`      | Read-only | Active inference profile key to surface in the re-injected context, or null when unchanged since last announced.                                              |
| `injectionMode`    | `"full" \| "minimal"` | Read-only | Volume of runtime injection to re-apply. 'full' restores the complete context, 'minimal' is the reduced volume overflow recovery selects. Defaults to 'full'. |
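
The sketch below shows one possible way to re-apply a block that compaction may have dropped. It assumes a simplified `Message` shape and a hypothetical memory string. Skipping the block in `"minimal"` mode is this example's own choice, not documented behavior.

```typescript
// hooks/post-compact.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// PostCompactContext from "@vellumai/plugin-api" instead.
// ASSUMPTION: this Message shape is hypothetical and only for illustration.
interface Message {
  role: "user" | "assistant";
  content: string;
}
interface PostCompactContext {
  history: Message[];
  readonly injectionMode: "full" | "minimal";
}

// ASSUMPTION: hypothetical block this plugin wants the model to keep seeing.
const MEMORY_BLOCK = "<memory>The user prefers metric units.</memory>";

// Append the block in place unless the compacted history still contains it.
function reinject(history: Message[], block: string): void {
  const present = history.some((m) => m.content.includes(block));
  if (!present) {
    history.push({ role: "user", content: block });
  }
}

export default async function postCompact(
  ctx: PostCompactContext,
): Promise<void> {
  if (ctx.injectionMode === "minimal") {
    return; // reduced volume requested: this example treats its block as optional
  }
  reinject(ctx.history, MEMORY_BLOCK);
}
```

Making the re-injection idempotent means a second compaction in the same turn does not duplicate the block.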

`pre-model-call` (`PreModelCallContext`)

**When:** Immediately before every provider call within a turn, including tool-result follow-ups.

**Use it to:** Edit the outbound request (for example the system prompt), route the call to a chosen inference profile, or defer this turn's live output stream.

| Field                  | Type                  | Access    | Description                                                                                                                                                                                                                                                                                                                                                    |
| ---------------------- | --------------------- | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `conversationId`       | `string`              | Read-only | Conversation the call belongs to.                                                                                                                                                                                                                                                                                                                              |
| `callSite`             | `LLMCallSite \| null` | Read-only | Which call site this serves (mainAgent for the user-facing reply), or null when not tied to a known site. Self-gate on it before acting.                                                                                                                                                                                                                       |
| `systemPrompt`         | `string \| null`      | Mutable   | The system prompt about to be sent. Replace it to edit the request; guard the null case.                                                                                                                                                                                                                                                                       |
| `modelProfile`         | `string \| null`      | Mutable   | The inference profile this call routes to. Set it to a profile key to send the call there (the lever a model-router hook uses to pick a profile per call), or leave it as is for the default resolution. Seeded from the call's resolved override, and null when none applies. Gate on callSite first, and discover the routable keys with getModelProfiles(). |
| `deferAssistantOutput` | `boolean`             | Mutable   | Set true to suppress the live token stream so a post-model-call hook can emit the final text instead.                                                                                                                                                                                                                                                          |
| `logger`               | `PluginLogger`        | Read-only | Logger scoped to the current turn.                                                                                                                                                                                                                                                                                                                             |
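
A model-router hook could look roughly like the sketch below, which sends calls with a very long system prompt to a different profile. The profile key `"long-context"` and the length threshold are hypothetical. The table mentions `getModelProfiles()` for discovering routable keys, but its signature is not shown on this page, so the sketch hard-codes a key instead of calling it.

```typescript
// hooks/pre-model-call.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// PreModelCallContext from "@vellumai/plugin-api" instead.
type LLMCallSite = string; // ASSUMPTION: the real type is likely a narrower union
interface PreModelCallContext {
  readonly callSite: LLMCallSite | null;
  systemPrompt: string | null;
  modelProfile: string | null;
}

// ASSUMPTIONS: hypothetical profile key and threshold, for illustration only.
const LONG_CONTEXT_PROFILE = "long-context";
const LONG_PROMPT_CHARS = 20000;

// Choose a profile from the prompt size, or keep the current resolution.
function pickProfile(
  systemPrompt: string | null,
  current: string | null,
): string | null {
  const length = systemPrompt?.length ?? 0;
  return length > LONG_PROMPT_CHARS ? LONG_CONTEXT_PROFILE : current;
}

export default async function preModelCall(
  ctx: PreModelCallContext,
): Promise<Partial<PreModelCallContext> | void> {
  // Gate on callSite first so background and subagent calls are not rerouted.
  if (ctx.callSite !== "mainAgent") {
    return;
  }
  return { modelProfile: pickProfile(ctx.systemPrompt, ctx.modelProfile) };
}
```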

`post-model-call` (`PostModelCallContext`)

**When:** At every model-call outcome: a finalized assistant message, or a provider rejection. Fires once per model call, before a finalized reply is persisted and streamed.

**Use it to:** Transform the reply's text blocks (leave `tool_use` intact), and own the continue decision. On a degenerate no-tool reply or a recoverable rejection, repair the history and set `decision` to `'continue'` to re-query the model.

| Field            | Type                    | Access    | Description                                                                                                                                                                                                                                                       |
| ---------------- | ----------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `conversationId` | `string`                | Read-only | Conversation the message belongs to.                                                                                                                                                                                                                              |
| `callSite`       | `LLMCallSite \| null`   | Read-only | Which call site this message serves, or null when not tied to a known site. Self-gate before acting.                                                                                                                                                              |
| `content`        | `ContentBlock[]`        | Mutable   | The finalized message content; empty on a provider rejection. Transform text blocks and leave tool\_use intact.                                                                                                                                                   |
| `messages`       | `Message[]`             | Mutable   | Full conversation history. When continuing, leave this as the history the next iteration should send (append a follow-up turn, or replace it with a repaired one).                                                                                                |
| `error`          | `Error \| undefined`    | Read-only | The provider rejection that ended the call, on a rejection outcome; absent on a finalized reply. Hooks that only act on a real reply should guard on it and return early.                                                                                         |
| `stopReason`     | `string \| null`        | Read-only | Provider-reported stop reason, or null when none was reported (also null on a rejection).                                                                                                                                                                         |
| `decision`       | `PostModelCallDecision` | Mutable   | Seeded to 'stop'. Set it to 'continue' to re-query the model. Honored only at actionable outcomes (a no-tool reply or a provider rejection); the loop does not gate it on call site, so self-gate via callSite to avoid re-querying background or subagent calls. |
| `logger`         | `PluginLogger`          | Read-only | Logger scoped to the current turn.                                                                                                                                                                                                                                |
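
Here is a minimal sketch of the "transform text, leave `tool_use` intact" rule, using email redaction as the transform. The `ContentBlock` union below is an assumed shape for illustration and the real type may carry more variants or fields. The hook guards on `error` and `callSite` before acting, as the table suggests.

```typescript
// hooks/post-model-call.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// PostModelCallContext from "@vellumai/plugin-api" instead.
// ASSUMPTION: this ContentBlock shape is hypothetical and only for illustration.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };
interface PostModelCallContext {
  readonly callSite: string | null;
  readonly error: Error | undefined;
  content: ContentBlock[];
}

// Loose pattern for anything that looks like an email address.
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.-]+/g;

// Rewrite text blocks only; every other block is passed through untouched.
function redactText(blocks: ContentBlock[]): ContentBlock[] {
  return blocks.map((block) =>
    block.type === "text"
      ? { ...block, text: block.text.replace(EMAIL, "[redacted]") }
      : block,
  );
}

export default async function postModelCall(
  ctx: PostModelCallContext,
): Promise<Partial<PostModelCallContext> | void> {
  // Act only on a real, user-facing reply.
  if (ctx.error !== undefined || ctx.callSite !== "mainAgent") {
    return;
  }
  return { content: redactText(ctx.content) };
}
```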

`post-tool-use` (`PostToolUseContext`)

**When:** After each tool returns, before the result rejoins the history sent to the provider.

**Use it to:** Transform the tool result, for example truncating oversized output to fit the context window.

| Field               | Type                     | Access    | Description                                                                                                                              |
| ------------------- | ------------------------ | --------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| `conversationId`    | `string`                 | Read-only | Conversation the tool ran on.                                                                                                            |
| `toolResponse`      | `ToolResultContent`      | Mutable   | The tool result block. Mutate its content in place or replace the block.                                                                 |
| `messages`          | `ReadonlyArray<Message>` | Read-only | History up to and including the assistant turn that issued the call. The result is not in it yet.                                        |
| `additionalContext` | `string \| null`         | Mutable   | Extra model-only guidance appended after the tool result, for example retry coaching. Defaults to null; set a string to append guidance. |
| `maxInputTokens`    | `number`                 | Read-only | The model's context-window size in tokens, for deriving a character budget.                                                              |
| `logger`            | `PluginLogger`           | Read-only | Logger scoped to the current turn.                                                                                                       |
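
The truncation use case might be sketched as follows. It assumes the tool result exposes its text as a `content` string, which may not match the real `ToolResultContent`. The four-characters-per-token ratio and the 25% share of the window are rough heuristics chosen for this example, not values from the plugin API.

```typescript
// hooks/post-tool-use.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// PostToolUseContext from "@vellumai/plugin-api" instead.
// ASSUMPTION: this ToolResultContent shape is hypothetical.
interface ToolResultContent {
  content: string;
}
interface PostToolUseContext {
  toolResponse: ToolResultContent;
  additionalContext: string | null;
  readonly maxInputTokens: number;
}

// ASSUMPTIONS: rough heuristics for this example only.
const CHARS_PER_TOKEN = 4;
const RESULT_SHARE = 0.25;

// Derive a character budget from the model's context-window size.
function charBudget(maxInputTokens: number): number {
  return Math.floor(maxInputTokens * CHARS_PER_TOKEN * RESULT_SHARE);
}

// Cut the text to the budget and say how much was dropped.
function truncateToBudget(text: string, maxChars: number): string {
  if (text.length <= maxChars) {
    return text;
  }
  const dropped = text.length - maxChars;
  return `${text.slice(0, maxChars)}\n[truncated ${dropped} characters]`;
}

export default async function postToolUse(
  ctx: PostToolUseContext,
): Promise<void> {
  const original = ctx.toolResponse.content;
  const trimmed = truncateToBudget(original, charBudget(ctx.maxInputTokens));
  if (trimmed !== original) {
    ctx.toolResponse.content = trimmed;
    // Model-only guidance appended after the tool result.
    ctx.additionalContext =
      "The tool output was truncated. Narrow the query if more detail is needed.";
  }
}
```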

`stop` (`StopContext`)

**When:** Once per run, when the loop has committed to ending the turn. Fires on every terminal exit (a no-tool reply, max tokens, a yield to the user, exhausted overflow recovery, an abort, or an error) and on a checkpoint handoff.

**Use it to:** Run teardown: release per-turn resources or clear per-turn state, knowing nothing will re-enter the loop this run. It cannot continue the loop; the retry decision lives in `post-model-call`.

| Field            | Type                     | Access    | Description                                                                                                                                                                             |
| ---------------- | ------------------------ | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `conversationId` | `string`                 | Read-only | Conversation the run belongs to.                                                                                                                                                        |
| `messages`       | `ReadonlyArray<Message>` | Read-only | Full conversation history at the terminal stop. Provided for inspection; mutating it has no effect, since the loop will not run again this turn.                                        |
| `exitReason`     | `AgentLoopExitReason`    | Read-only | Which terminal state the turn reached (for example no\_tool\_calls, max\_tokens\_reached, error, checkpoint\_handoff). A hook that should act only on a particular ending guards on it. |
| `error`          | `Error \| undefined`     | Read-only | The rejection that ended the turn, when it ended on one; absent on a clean stop.                                                                                                        |
| `logger`         | `PluginLogger`           | Read-only | Logger scoped to the current turn.                                                                                                                                                      |
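
A teardown hook could be as small as the sketch below. It assumes the plugin keeps some per-turn state in a module-level map (a hypothetical call counter here) and that `AgentLoopExitReason` is string-like. The `"error"` value is taken from the examples in the table.

```typescript
// hooks/stop.ts
// Local stand-ins so the sketch is self-contained; a real plugin would import
// StopContext from "@vellumai/plugin-api" instead.
interface PluginLogger {
  warn(msg: string): void;
}
type AgentLoopExitReason = string; // ASSUMPTION: the real type is likely a union
interface StopContext {
  readonly conversationId: string;
  readonly exitReason: AgentLoopExitReason;
  readonly error: Error | undefined;
  readonly logger: PluginLogger;
}

// ASSUMPTION: hypothetical per-turn state that other hooks in this plugin
// would fill in, for example a count of model calls keyed by conversation.
const modelCallsByConversation = new Map<string, number>();

// Drop the state for one conversation; returns whether anything was held.
function releaseTurnState(conversationId: string): boolean {
  return modelCallsByConversation.delete(conversationId);
}

export default async function stop(ctx: StopContext): Promise<void> {
  // Guard on the ending this hook cares about before logging.
  if (ctx.exitReason === "error" && ctx.error !== undefined) {
    ctx.logger.warn(`turn ended on an error: ${ctx.error.message}`);
  }
  releaseTurnState(ctx.conversationId);
}
```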

`shutdown` (`PluginShutdownContext`)

**When:** Once, when the Assistant tears down the plugin (process exit, unload).

**Use it to:** Best-effort cleanup. Do not rely on it for critical writes; persist durably during normal operation instead.

| Field              | Type     | Access    | Description                                        |
| ------------------ | -------- | --------- | -------------------------------------------------- |
| `assistantVersion` | `string` | Read-only | Assistant semver, for version-conditional cleanup. |
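
Best-effort cleanup can be sketched as a loop that swallows individual failures so one bad disposer does not block the rest. The `disposers` registry is a hypothetical pattern for this example. Nothing in the plugin API is assumed beyond the context type.

```typescript
// hooks/shutdown.ts
// Local stand-in so the sketch is self-contained; a real plugin would import
// PluginShutdownContext from "@vellumai/plugin-api" instead.
interface PluginShutdownContext {
  readonly assistantVersion: string;
}

type Disposer = () => void;

// ASSUMPTION: hypothetical registry that init or other hooks would add to.
const disposers: Disposer[] = [];

// Run every disposer, last registered first, and count the ones that throw.
function runCleanup(pending: Disposer[]): number {
  let failures = 0;
  while (pending.length > 0) {
    const dispose = pending.pop();
    try {
      dispose?.();
    } catch {
      failures += 1; // best effort: keep going
    }
  }
  return failures;
}

export default async function shutdown(
  _ctx: PluginShutdownContext,
): Promise<void> {
  runCleanup(disposers);
}
```

Because the hook may not run on every exit path, anything that must survive should already have been persisted during normal operation.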

When several plugins register hooks for the same boundary, they chain in registration order, each one seeing the previous plugin's changes. Built-in defaults register first, so they run ahead of your hooks.
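
To make the chaining order concrete, here is a small sketch of how a runner could thread one context through hooks in registration order. It illustrates the behavior described above and is not the Assistant's actual implementation. `runChain`, `DemoContext`, and the two sample hooks are hypothetical names.

```typescript
type PluginHookFn<TCtx> = (ctx: TCtx) => Promise<Partial<TCtx> | void>;

// ASSUMPTION: hypothetical context with a single field, for illustration.
interface DemoContext {
  systemPrompt: string | null;
}

// Call each hook in order; a returned partial is merged before the next hook runs.
async function runChain<TCtx extends object>(
  hooks: ReadonlyArray<PluginHookFn<TCtx>>,
  ctx: TCtx,
): Promise<TCtx> {
  for (const hook of hooks) {
    const patch = await hook(ctx);
    if (patch) {
      Object.assign(ctx, patch);
    }
  }
  return ctx;
}

// Registered first, so it runs first. Mutates in place and returns nothing.
const builtInDefault: PluginHookFn<DemoContext> = async (ctx) => {
  ctx.systemPrompt = "You are helpful.";
};

// Registered second, so it sees the default's change. Returns a partial.
const yourPlugin: PluginHookFn<DemoContext> = async (ctx) => {
  return { systemPrompt: `${ctx.systemPrompt ?? ""} Be concise.` };
};
```

Running `runChain<DemoContext>([builtInDefault, yourPlugin], { systemPrompt: null })` resolves to a context whose `systemPrompt` is `"You are helpful. Be concise."`.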

## Anatomy of a hook

Every hook has the same shape: it receives a typed context and either mutates it in place and returns nothing, or returns a **partial** context. A returned partial is merged onto the threaded context: only the keys it includes are overwritten and every other field is preserved, so a hook can edit just the fields it cares about without re-specifying the rest. The runtime threads the merged context to the next plugin and then to the Assistant.

```
type PluginHookFn<TCtx> = (ctx: TCtx) => Promise<Partial<TCtx> | void>;
```

Because an omitted key means “keep the existing value”, every context field is required and uses `| null` rather than `?` or `| undefined`. A present key always carries a concrete value, so leaving a field out of a returned partial can never be confused with clearing it.
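
The difference between omitting a key and setting it to `null` can be seen in a small sketch of the merge. `mergePartial` and `DemoContext` are hypothetical names, and the spread-based merge is an assumption about how the described semantics could be implemented, not the runtime's actual code.

```typescript
// ASSUMPTION: hypothetical two-field context, for illustration only.
interface DemoContext {
  systemPrompt: string | null;
  modelProfile: string | null;
}

// Overwrite only the keys present in the patch; keep every other field.
function mergePartial<TCtx extends object>(
  ctx: TCtx,
  patch: Partial<TCtx> | undefined,
): TCtx {
  return patch ? { ...ctx, ...patch } : ctx;
}

const ctx: DemoContext = { systemPrompt: "Be concise.", modelProfile: "default" };

// Omitted key: systemPrompt keeps its value, only modelProfile changes.
const kept = mergePartial(ctx, { modelProfile: "fast" });

// Explicit null: systemPrompt is cleared, modelProfile keeps its value.
const cleared = mergePartial(ctx, { systemPrompt: null });
```

Here `kept.systemPrompt` is still `"Be concise."`, while `cleared.systemPrompt` is `null`.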

One hook per file, default-exported. The filename becomes the hook key, so a `pre-model-call` hook is `hooks/pre-model-call.ts`:

```
// hooks/pre-model-call.ts
import type { PreModelCallContext } from "@vellumai/plugin-api";

export default async function preModelCall(
  ctx: PreModelCallContext,
): Promise<void> {
  // Only touch the user-facing reply, not background or subagent calls.
  if (ctx.callSite !== "mainAgent") {
    return;
  }
  ctx.systemPrompt = (ctx.systemPrompt ?? "") + "\nBe concise.";
}
```

Context types and constants come from [`@vellumai/plugin-api`](https://github.com/vellum-ai/vellum-assistant/tree/main/assistant/src/plugin-api), the only supported contract.