
Introducing Vellum Workflows

Vellum Workflows help you quickly prototype, deploy, and manage complex chains of LLM calls


We’re excited to launch an entirely new product area within Vellum – one we’ve been teasing for quite some time… Workflows!

Workflows is a new product in Vellum's LLM dev platform that helps you quickly prototype, deploy, and manage complex chains of LLM calls and the business logic that ties them together. We solve the "whack-a-mole" problem encountered by companies that use popular open-source frameworks to build AI applications but are afraid to make changes for fear of introducing regressions in production.

The Problem

Many AI use-cases require chains of prompts, but experimentation and productionization of complex chains are hard.

We have helped dozens of customers take their AI prototypes to production by delivering tools for efficient prompt engineering, tightly integrated semantic search, prompt versioning, and performance monitoring. However, as the AI industry matures, we’ve found that more and more real-world use-cases require multi-step flows across actions like semantic search, multiple prompts/LLM calls, and bespoke business logic.

For example, if building a customer-support chatbot, you may want to (see the sketch after this list):

  1. Use a fast, low-cost model to categorize the incoming user question
  2. Depending on the category, query a different index in a vector store to retrieve relevant context for answering the question
  3. Feed that context into a prompt that’s been tuned to answer accurately on that topic
  4. Feed the output of that prompt into another prompt that rephrases the answer in your brand voice
  5. Finally, return the answer to your end user
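
To make the shape of this chain concrete, here is a minimal, hand-rolled sketch in Python. The helper functions (categorize_question, search_index, call_llm) are hypothetical stubs standing in for a classifier call, a vector-store query, and LLM completions; they are not Vellum APIs.

```python
# Hypothetical sketch of a hand-rolled support-bot chain.
# categorize_question, search_index, and call_llm are placeholder stubs,
# not Vellum APIs -- in a real app they would wrap your model provider
# and vector store of choice.

def categorize_question(question: str) -> str:
    """Step 1: a fast, low-cost model picks a category (stubbed here)."""
    return "billing" if "invoice" in question.lower() else "general"


def search_index(index_name: str, query: str) -> list[str]:
    """Step 2: query the vector-store index for that category (stubbed)."""
    return [f"doc snippet from '{index_name}' relevant to: {query}"]


def call_llm(prompt: str) -> str:
    """Steps 3-4: an LLM completion (stubbed)."""
    return f"LLM response to: {prompt[:60]}..."


def answer(question: str) -> str:
    category = categorize_question(question)               # 1. categorize the question
    context = search_index(f"{category}-index", question)  # 2. retrieve relevant context
    context_text = "\n".join(context)
    draft = call_llm(                                       # 3. topic-tuned answer prompt
        f"Context:\n{context_text}\n\nAnswer the question: {question}"
    )
    final = call_llm(f"Rephrase in our brand voice:\n{draft}")  # 4. brand-voice rewrite
    return final                                            # 5. return to the end user


if __name__ == "__main__":
    print(answer("Why was my invoice higher this month?"))
```

Even in this toy form, changing any one step means editing code, re-running the whole script, and reading logs to see what happened downstream.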

Unfortunately, existing tools and frameworks don’t make it easy to:

  1. Rapidly experiment with these chains both step-by-step and end-to-end – especially if you’re non-technical
  2. Make changes with confidence once in production and avoid regressions
  3. Gain visibility into the performance of the system both as a whole, and at each step in the chain

The Solution

A fully managed platform for experimenting with, deploying, and managing AI workflows that power your app

Vellum Workflows provides a low-code UI for experimenting with and deploying LLM workflows to power features in your app.

You can construct a workflow from different “Nodes,” define “Input Variables” for the workflow, set their values across different “Scenarios,” and run it with a single click to see the output at each step along the way.

Shown here is one of the workflows used in production by a customer of ours, Miri Health, for powering their health & wellness AI chatbot.

You get immediate feedback on whether your chain/prompts perform the way you expect without having to edit code, inspect console logs, or hop between browser tabs. You can validate that your workflow does what it should across a variety of scenarios / test cases.

Once you’re happy, you can deploy the Workflow directly in Vellum and invoke it through an API via Vellum’s Python and Node SDKs. Events for nodes that you subscribe to are streamed back using Server-Sent Events.

Invoke a workflow via a simple API and stream back the results. Use our officially supported Python and Node SDKs, or roll your own.
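
As a rough illustration of the invocation pattern, the snippet below consumes a Server-Sent Events stream using the requests library. The endpoint URL, authentication header, payload shape, and event schema are placeholders for illustration only, not Vellum's documented API; consult the official SDKs and docs for the real contract.

```python
# Illustrative only: the endpoint URL, auth header, payload shape, and event
# schema below are placeholders, not Vellum's documented API. The officially
# supported Python/Node SDKs handle this wiring for you.
import json

import requests

API_KEY = "your-api-key"  # assumption: API-key based auth
ENDPOINT = "https://api.example.com/v1/execute-workflow-stream"  # placeholder URL


def stream_workflow(inputs: dict) -> None:
    """POST the workflow inputs and print Server-Sent Events as they arrive."""
    with requests.post(
        ENDPOINT,
        headers={"X-API-KEY": API_KEY, "Accept": "text/event-stream"},
        json={"inputs": inputs},
        stream=True,
    ) as response:
        response.raise_for_status()
        for line in response.iter_lines(decode_unicode=True):
            # SSE data lines are prefixed with "data: "; other lines are keep-alives
            if line and line.startswith("data: "):
                event = json.loads(line[len("data: "):])
                print(event)  # e.g. output from a node you subscribed to


if __name__ == "__main__":
    stream_workflow({"user_question": "Why was my invoice higher this month?"})
```

In this sketch, each streamed event would correspond to output from a node you subscribed to, so your app can surface intermediate progress instead of waiting for the final result.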

By deploying your Workflow through Vellum, you can:

  • Mix and match models from different providers without having to integrate with each. Use the best prompt/model for the job!
  • Have a production-ready backend in minutes without having to write, maintain, and host complex code and orchestration logic
  • Version your Workflow, see changes over time, and revert with one click
  • Get full observability into the production system, viewing inputs, outputs, timestamps, and more for the workflow as a whole, as well as each Node along the way.
  • Use role-based access control to determine which team members are allowed to experiment vs update production deployments

Monitor how your workflows are performing in production, with the ability to inspect the inputs/outputs of the workflow as a whole, as well as each step in the chain.

Looking Ahead

This is just the beginning! Our beta customers are already asking for things like:

  • A/B testing workflows for live experimentation
  • Test suites for evaluating that workflows are doing what they should and don’t break after an “improvement” is made
  • Composability via nested workflows
  • More node types for executing code, making calls to 3rd party APIs, etc.

Why Vellum?

Our focus to date has been to provide robust building blocks for creating production-ready AI applications. We’ve seen our customers assemble Vellum-powered Prompts and Semantic Search to create incredible products, version control and debug them using Vellum Deployments, and validate them when making changes using Vellum Test Suites.

Now that we have the building blocks, we’re well-positioned to help you assemble them. Workflows has been in closed-beta for a few weeks now and we already have customers using them to power their entire AI backend in production.

Vellum Workflows give us the opportunity to really tailor different parts of our product to the end users’ needs without having to invest in tons of custom development, which has dramatically decreased our time to market. As a technical, but non-engineering stakeholder, I’m able to truly participate in the development of the product experience and help deliver personalized AI-powered experiences to customers faster than I could have ever imagined.

- Adam Daigian, Product Lead at Miri Health

We firmly believe that the best AI-powered products out there will be the result of close collaboration between technical and non-technical team members. We’ve repeatedly seen engineers set up the initial scaffolding, integrations, and guard-rails, while non-technical folks run experiments and tweak prompts/chains. No other platform facilitates this collaboration as well as Vellum.

Want to give Workflows a try?

Fill out this form, and we'll set up a custom demo for you.

ABOUT THE AUTHOR
Noa Flaherty
Co-founder & CTO

Noa Flaherty, CTO and co-founder at Vellum (YC W23), helps developers build, deploy, and evaluate LLM-powered apps. His diverse background in mechanical and software engineering, as well as marketing and business operations, gives him the technical know-how and business acumen needed to bring value to nearly any aspect of startup life. Prior to founding Vellum, Noa completed his undergrad at MIT and worked at three tech startups, including roles in MLOps at DataRobot and Product Engineering at Dover.
