Run your own code at fixed points in a turn. Hooks let a plugin read or transform what flows through the Assistant without forking the core loop.
Plugins are in beta. The plugin API (@vellumai/plugin-api) is not yet stable and can change between releases. Pin the peerDependencies range your plugin declares, and expect breaking changes until we cut a 1.0. Hook names and context shapes can change with it.
A hook is a function that the Assistant calls at a known boundary in its lifecycle. The harness owns the loop, and your plugin's code runs at named points along the way. Each hook lives in its own file under hooks/<name>.ts, and the filename is the hook name.
The diagram below maps what we call the Agent Loop. The nodes are lifecycle events: the points in time a turn passes through. The connecting hooks are the places your code can run as the turn moves from one event to the next.
The loop can iterate several times within a single user turn: every tool result returns to a fresh model call, and a stop hook can choose to continue rather than end the turn. Because of this, pre-model-call, post-model-call, and post-tool-use can each fire more than once per turn.
init and shutdown sit outside the loop. They bracket the whole session and fire once each: when the plugin is first registered, and when the Assistant tears it down. The Agent Loop runs each turn during the active session in between.
These are the lifecycle hooks this guide covers. The full set of wired hook names lives in the HOOKS constant. Each entry below lists when the hook fires, what to use it for, and the fields of its context.
init (PluginInitContext)
When: Once, when the plugin is first registered (on boot or install).
Use it to: Validate config and credentials and open resources. Throwing aborts the plugin's load.
| Field | Type | Access | Description |
|---|---|---|---|
| config | unknown | Read-only | Parsed config for this plugin, validated against the manifest. |
| credentials | Record<string, string> | Read-only | Resolved credential values, keyed by the plugin's requiresCredential entries. |
| pluginStorageDir | string | Read-only | Absolute path to the plugin's writable data directory, created during bootstrap. |
| assistantVersion | string | Read-only | Assistant semver, for defensive runtime checks. |
| logger | PluginLogger | Read-only | Pino-compatible logger scoped to the plugin. |
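
As a rough illustration of the init contract above, the sketch below narrows the unknown config and checks one credential, throwing to abort the load. It is not taken from the plugin API: the local interfaces are stand-ins that mirror the table (a real plugin would import PluginInitContext from @vellumai/plugin-api), and the `endpoint` config field and the `API_TOKEN` credential key are hypothetical.

```typescript
// Local stand-ins mirroring the fields in the table above. In a real plugin,
// import PluginInitContext from "@vellumai/plugin-api" instead.
interface PluginLogger {
  info(msg: string): void;
}

interface PluginInitContext {
  readonly config: unknown;
  readonly credentials: Record<string, string>;
  readonly pluginStorageDir: string;
  readonly assistantVersion: string;
  readonly logger: PluginLogger;
}

// Hypothetical config shape, used only for this example.
interface ExampleConfig {
  endpoint: string;
}

// Narrow the unknown config to the shape this plugin expects.
function isExampleConfig(value: unknown): value is ExampleConfig {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { endpoint?: unknown }).endpoint === "string"
  );
}

// hooks/init.ts
export default async function init(ctx: PluginInitContext): Promise<void> {
  if (!isExampleConfig(ctx.config)) {
    // Throwing here aborts the plugin's load.
    throw new Error("config.endpoint must be a string");
  }
  // "API_TOKEN" is a hypothetical credential key.
  if (!ctx.credentials["API_TOKEN"]) {
    throw new Error("missing credential: API_TOKEN");
  }
  ctx.logger.info(`ready, storing data under ${ctx.pluginStorageDir}`);
}
```

Failing fast here keeps a misconfigured plugin from reaching the per-turn hooks at all.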
user-prompt-submit (UserPromptSubmitContext)
When: Once per user turn, after messages are assembled and before the agent loop runs.
Use it to: Read or rewrite the message list the model is about to see.
| Field | Type | Access | Description |
|---|---|---|---|
| conversationId | string | Read-only | Conversation the prompt was submitted on. |
| userMessageId | string | Read-only | Persisted id of the user message that triggered the turn. |
| requestId | string | Read-only | Stable id for the request driving this turn. |
| modelProfileKey | string \| null | Read-only | Active inference profile key, or null when unchanged since last announced. |
| isNonInteractive | boolean | Read-only | True when no human is present to answer clarifications (scheduled or headless runs). |
| prompt | string | Read-only | Resolved text of the user prompt, after slash-command expansion. |
| originalMessages | ReadonlyArray<Message> | Read-only | The user's original message list. Snapshot only, never mutate. |
| latestMessages | Message[] | Mutable | The working list that flows into the agent loop. Mutate in place or replace via the return value. |
| logger | PluginLogger | Read-only | Logger scoped to the current turn. |
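
One way to use this hook, sketched under assumptions: on headless runs, append a note to the last user message in latestMessages while leaving the originalMessages snapshot alone. The Message shape here (a role plus string content) is hypothetical and the real type may differ. The context interface is a local stand-in that lists only the fields the sketch uses.

```typescript
// Hypothetical message shape for this sketch; the real Message type may differ.
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Local stand-in with only the fields used below. In a real plugin, import
// UserPromptSubmitContext from "@vellumai/plugin-api" instead.
interface UserPromptSubmitContext {
  readonly isNonInteractive: boolean;
  readonly originalMessages: ReadonlyArray<Message>;
  latestMessages: Message[];
}

const HEADLESS_NOTE =
  "No human is available to answer questions. Make reasonable assumptions and state them.";

// hooks/user-prompt-submit.ts
export default async function userPromptSubmit(
  ctx: UserPromptSubmitContext,
): Promise<void> {
  if (!ctx.isNonInteractive) {
    return;
  }
  const lastIndex = ctx.latestMessages.length - 1;
  const last = ctx.latestMessages[lastIndex];
  if (last === undefined || last.role !== "user") {
    return;
  }
  // Replace the array element rather than editing the object, in case the
  // same object is also referenced by the originalMessages snapshot.
  ctx.latestMessages[lastIndex] = {
    ...last,
    content: `${last.content}\n\n${HEADLESS_NOTE}`,
  };
}
```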
post-compact (PostCompactContext)
When: After the loop compacts a conversation mid-turn, before the turn resumes. It fires on a compaction event rather than a fixed turn boundary, so it sits off to the side of the loop.
Use it to: Re-apply context that compaction dropped (for example memory injections) onto the compacted history before the next model call.
Note: post-compact is reserved. The name lives in the HOOKS constant and the built-in re-injection runs on every compaction, but it is not yet dispatched through the plugin hook chain, and PostCompactContext is not yet exported from @vellumai/plugin-api. Treat it as the least stable part of this beta surface.
pre-model-call (PreModelCallContext)
When: Immediately before every provider call within a turn, including tool-result follow-ups.
Use it to: Edit the outbound request (for example the system prompt), or defer this turn's live output stream.
| Field | Type | Access | Description |
|---|---|---|---|
| conversationId | string | Read-only | Conversation the call belongs to. |
| callSite? | LLMCallSite | Read-only | Which call site this serves (mainAgent for the user-facing reply). Self-gate on it before acting. |
| systemPrompt | string \| undefined | Mutable | The system prompt about to be sent. Replace it to edit the request; guard the undefined case. |
| deferAssistantOutput | boolean | Mutable | Set true to suppress the live token stream so a post-model-call hook can emit the final text instead. |
| logger | PluginLogger | Read-only | Logger scoped to the current turn. |
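
The deferAssistantOutput flag can be sketched as follows. This is a minimal example under assumptions: the context interface is a local stand-in listing only the fields used here, and LLMCallSite is modeled as a plain string. In a real plugin the type would come from @vellumai/plugin-api.

```typescript
// Local stand-in with only the fields used below; LLMCallSite is modeled as
// a plain string for this sketch.
interface PreModelCallContext {
  readonly callSite?: string;
  systemPrompt: string | undefined;
  deferAssistantOutput: boolean;
}

// hooks/pre-model-call.ts
export default async function preModelCall(
  ctx: PreModelCallContext,
): Promise<void> {
  // Self-gate: leave background and subagent calls alone.
  if (ctx.callSite !== "mainAgent") {
    return;
  }
  // Hold back the live token stream so a post-model-call hook can emit the
  // final, transformed text instead.
  ctx.deferAssistantOutput = true;
}
```

Deferring only makes sense when the same plugin also ships a post-model-call hook that changes the text. Otherwise the user simply loses live streaming.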
post-model-call (PostModelCallContext)
When: After each finalized assistant message, before it is persisted and streamed.
Use it to: Transform the text blocks of the reply. Leave tool_use and other non-text blocks intact.
| Field | Type | Access | Description |
|---|---|---|---|
| conversationId | string | Read-only | Conversation the message belongs to. |
| callSite? | LLMCallSite | Read-only | Which call site this message serves. Self-gate before acting. |
| content | ContentBlock[] | Mutable | The finalized message content. Transform text blocks and leave tool_use intact. |
| stopReason | string \| null \| undefined | Read-only | Provider-reported stop reason, when reported. |
| logger | PluginLogger | Read-only | Logger scoped to the current turn. |
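
A sketch of the "transform text, leave tool_use intact" rule follows. It collapses runs of blank lines in text blocks and passes every other block through by reference. The ContentBlock union is an assumption modeled on common provider formats, not the exported type, and the context interface is a local stand-in with only the fields used here.

```typescript
// Assumed block shapes for this sketch; the real ContentBlock may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Local stand-in with only the fields used below.
interface PostModelCallContext {
  readonly callSite?: string;
  content: ContentBlock[];
}

// hooks/post-model-call.ts
export default async function postModelCall(
  ctx: PostModelCallContext,
): Promise<void> {
  // Self-gate: only rewrite the user-facing reply.
  if (ctx.callSite !== "mainAgent") {
    return;
  }
  ctx.content = ctx.content.map((block) =>
    block.type === "text"
      ? // Collapse three or more newlines into a single blank line.
        { ...block, text: block.text.replace(/\n{3,}/g, "\n\n") }
      : // Non-text blocks (for example tool_use) pass through untouched.
        block,
  );
}
```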
post-tool-use (PostToolUseContext)
When: After each tool returns, before the result rejoins the history sent to the provider.
Use it to: Transform the tool result, for example truncating oversized output to fit the context window.
| Field | Type | Access | Description |
|---|---|---|---|
| conversationId | string | Read-only | Conversation the tool ran on. |
| toolResponse | ToolResultContent | Mutable | The tool result block. Mutate its content in place or replace the block. |
| messages | ReadonlyArray<Message> | Read-only | History up to and including the assistant turn that issued the call. The result is not in it yet. |
| additionalContext? | string | Mutable | Extra model-only guidance appended after the tool result, for example retry coaching. |
| maxInputTokens | number | Read-only | The model's context-window size in tokens, for deriving a character budget. |
| logger | PluginLogger | Read-only | Logger scoped to the current turn. |
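
The truncation use case might look like the sketch below, which derives a character budget from maxInputTokens and trims oversized output. Several things are assumptions: the ToolResultContent shape (a single string `content` field), the four-characters-per-token heuristic, and the 25% share of the window. Tune or replace them for a real plugin.

```typescript
// Hypothetical result shape for this sketch; the real ToolResultContent may differ.
interface ToolResultContent {
  content: string;
}

// Local stand-in with only the fields used below.
interface PostToolUseContext {
  toolResponse: ToolResultContent;
  additionalContext?: string;
  readonly maxInputTokens: number;
}

// Assumptions: a rough characters-per-token ratio and the share of the
// context window a single tool result may occupy.
const CHARS_PER_TOKEN = 4;
const MAX_SHARE = 0.25;

// hooks/post-tool-use.ts
export default async function postToolUse(
  ctx: PostToolUseContext,
): Promise<void> {
  const budget = Math.floor(ctx.maxInputTokens * CHARS_PER_TOKEN * MAX_SHARE);
  const text = ctx.toolResponse.content;
  if (text.length <= budget) {
    return;
  }
  // Keep the head of the output and tell the model what happened.
  ctx.toolResponse.content = text.slice(0, budget);
  ctx.additionalContext =
    `The tool output was truncated from ${text.length} to ${budget} characters.`;
}
```

Setting additionalContext lets the model know the result is partial, so it can retry with a narrower request instead of reasoning over a silently clipped result.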
stop (StopContext)
When: At the stop boundary, when the model returns a response with no tool calls.
Use it to: Decide whether to stop or continue. To continue, append a follow-up turn and the loop re-queries the model.
| Field | Type | Access | Description |
|---|---|---|---|
| conversationId | string | Read-only | Conversation the run belongs to. |
| messages | Message[] | Mutable | Full conversation history. Append a follow-up turn here when continuing. |
| responseContent | ReadonlyArray<ContentBlock> | Read-only | Content blocks of the assistant turn that triggered the stop (no tool_use). |
| stopReason | string \| null \| undefined | Read-only | Provider-reported stop reason (for example refusal, end_turn). |
| decision | StopDecision | Mutable | Seeded to 'stop'. Set it to 'continue' to force another loop iteration. |
| logger | PluginLogger | Read-only | Logger scoped to the current turn. |
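
A stop hook that continues the turn could be sketched like this: if the reply has no visible text, append one follow-up message and set decision to 'continue', with a cap so the loop cannot spin forever. The Message and ContentBlock shapes are hypothetical, StopDecision is modeled with only the two values named in the table, and the nudge text and cap are illustrative choices.

```typescript
// Stand-ins for this sketch; the real exported types may differ.
type StopDecision = "stop" | "continue";

interface Message {
  role: "user" | "assistant";
  content: string;
}

interface ContentBlock {
  type: string;
  text?: string;
}

interface StopContext {
  messages: Message[];
  readonly responseContent: ReadonlyArray<ContentBlock>;
  decision: StopDecision;
}

// Illustrative values, not part of the plugin API.
const NUDGE = "Your last reply was empty. Please answer the question.";
const MAX_NUDGES = 1;

// hooks/stop.ts
export default async function stop(ctx: StopContext): Promise<void> {
  const hasText = ctx.responseContent.some(
    (block) => block.type === "text" && (block.text ?? "").trim() !== "",
  );
  if (hasText) {
    return; // Leave the seeded 'stop' decision in place.
  }
  // Guard against an endless loop: count nudges already in the history.
  const nudges = ctx.messages.filter(
    (m) => m.role === "user" && m.content === NUDGE,
  ).length;
  if (nudges >= MAX_NUDGES) {
    return;
  }
  ctx.messages.push({ role: "user", content: NUDGE });
  ctx.decision = "continue";
}
```

Because continuing triggers another model call, any hook that sets 'continue' needs its own termination condition.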
shutdown (PluginShutdownContext)
When: Once, when the Assistant tears down the plugin (process exit, unload).
Use it to: Best-effort cleanup. Do not rely on it for critical writes; persist durably during normal operation instead.
| Field | Type | Access | Description |
|---|---|---|---|
| assistantVersion | string | Read-only | Assistant semver, for version-conditional cleanup. |
When several plugins register hooks for the same boundary, they chain in registration order, each one seeing the previous plugin's changes. Built-in defaults register first, so they run ahead of your hooks.
Every hook has the same shape: it receives a typed context and either mutates it in place and returns nothing, or returns a new context. The runtime forwards whichever the chain settles on to the next plugin and then to the Assistant.
```typescript
type PluginHookFn<TCtx> = (ctx: TCtx) => Promise<TCtx | void>;
```

One hook per file, default-exported. The filename becomes the hook key, so a pre-model-call hook is hooks/pre-model-call.ts:

```typescript
// hooks/pre-model-call.ts
import type { PreModelCallContext } from "@vellumai/plugin-api";

export default async function preModelCall(
  ctx: PreModelCallContext,
): Promise<void> {
  // Only touch the user-facing reply, not background or subagent calls.
  if (ctx.callSite !== "mainAgent") {
    return;
  }
  ctx.systemPrompt = (ctx.systemPrompt ?? "") + "\nBe concise.";
}
```

Context types and constants come from @vellumai/plugin-api, the only supported contract.
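
To make the chaining rules concrete, here is a small sketch of how a chain of hooks could be folded over a context: each hook runs in registration order, and a returned context replaces the current one while a void return keeps the (possibly mutated) current one. This is an illustration only, not the Assistant's actual runtime. `runHookChain` is a hypothetical helper name.

```typescript
type PluginHookFn<TCtx> = (ctx: TCtx) => Promise<TCtx | void>;

// Hypothetical helper, for illustration only: run hooks in registration
// order, forwarding whichever context each hook settles on.
async function runHookChain<TCtx>(
  hooks: ReadonlyArray<PluginHookFn<TCtx>>,
  initial: TCtx,
): Promise<TCtx> {
  let ctx = initial;
  for (const hook of hooks) {
    const returned = await hook(ctx);
    // A hook that returns nothing has mutated ctx in place (or done nothing).
    if (returned !== undefined) {
      ctx = returned;
    }
  }
  return ctx;
}

// Two example hooks: one mutates in place, one returns a new context.
type ExampleCtx = { systemPrompt: string };

const mutateInPlace: PluginHookFn<ExampleCtx> = async (ctx) => {
  ctx.systemPrompt += "A";
};

const returnNew: PluginHookFn<ExampleCtx> = async (ctx) => ({
  systemPrompt: ctx.systemPrompt + "B",
});
```

With these two hooks registered in that order, the second sees the first one's change, which is the property the chaining rule describes.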