How Marveri enabled lawyers to shape AI products without blocking developers

Learn how Marveri's lawyers use Vellum to build and evaluate AI workflows and save countless engineering hours.

Marveri is a legal technology company building an AI-powered document analysis platform to help attorneys review and interpret large sets of corporate legal documents faster and with greater accuracy.

Marveri’s team knew that combining brilliant AI engineering with expert input from lawyers was key to building the best product.

But lawyers communicate in ways that often don’t mesh with how developers work, creating silos between the two groups. This gap pushed the team to try multiple AI solutions, many of which ended up sending AI engineering teams down rabbit holes and stalling projects before they ever reached production. And despite having a trove of private test data and the legal expertise to interpret it, evaluating workflow performance was ad-hoc and inconsistent without engineering assistance.

As a result, Marveri used Vellum to bridge the gap between their lawyers and engineers.

By equipping their lawyers to prototype and evaluate AI workflows, Marveri gave its legal team a way to vet product ideas on their own and reserved full AI engineering effort for proven concepts.

With Vellum, Marveri can rapidly apply their curated trove of proprietary data, evals, and benchmarks to all of their existing products and quickly get new functionality up and running. We sat down with Marveri’s Director of Product, Michael Dockery, to hear how his team uses Vellum to move faster, save engineering hours, and deliver lawyer-approved AI features with confidence.

The challenge: Engineers chasing unvalidated ideas

Marveri’s target industry is bound by strict rules of professional conduct and very dependent on legal expertise, which meant every new product idea had to be vetted for compliance and benchmarked for accuracy before it could move forward.

Regulations bar Marveri from using end-user data to improve its products, so the company depends entirely on its own proprietary testing data. That made testing and iterating without spending valuable engineering time and effort a major challenge.

On top of this, their team struggled with:

  • Heavy reliance on engineers: Lawyers couldn’t prototype or benchmark workflows on their own, forcing engineers to code every idea for validation.
  • Lost months of work: AI engineering takes time, and engineers were chasing features that later proved unviable, costing Marveri significant time and progress.
  • Ad-hoc evaluations: Despite a treasure trove of private benchmarking data and the legal expertise to interpret it, accuracy was still judged through scripts, spreadsheets, or “vibe checks,” with no consistent way to measure substantive correctness.

To overcome these bottlenecks, Marveri partnered with Vellum to bridge the gap between their lawyers’ expertise and their engineers’ execution.

Before Vellum, we had to pull in our engineers, who already had plenty on their plates, to code out ideas before we even knew if they were practical. — Michael Dockery, Director of Product at Marveri

The goal: Validate ideas before AI engineering

It became clear to Marveri that their lawyers needed a way to sketch out and test workflow ideas directly, to ensure that critical AI engineering efforts were only focused on validated ideas.

With an AI development platform like Vellum, they wanted to:

  • Make it easy for lawyers to prototype: Enable the legal team to sketch out and test how an AI feature could work for their customers.
  • Run evaluations at scale: Allow lawyers to evaluate performance and accuracy directly and at scale, starting small with 20 examples, then expanding to 50 or 100+ test cases depending on the complexity of the feature.
  • Quantify accuracy: Replace “vibe checks” with concrete benchmark results they could trust internally and present to customers.
  • Filter ideas early: Discard concepts that didn’t meet accuracy standards before they reached full AI engineering.
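As a rough illustration of what "quantify accuracy" and "filter ideas early" can mean in practice, here is a minimal Python sketch. It is not Vellum's API or Marveri's method: the `TestCase` type, the `toy_workflow` keyword check, the exact-match scoring, and the 0.9 threshold are all hypothetical stand-ins. Real legal evaluations would likely need richer grading than a string comparison.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TestCase:
    document: str  # input text given to the workflow
    expected: str  # answer a reviewer marked as correct (hypothetical label)


def accuracy(workflow: Callable[[str], str], cases: List[TestCase]) -> float:
    """Fraction of cases where the workflow output matches the expected answer."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = sum(
        1
        for case in cases
        if workflow(case.document).strip().lower() == case.expected.strip().lower()
    )
    return correct / len(cases)


def passes_validation(
    workflow: Callable[[str], str],
    cases: List[TestCase],
    threshold: float = 0.9,  # hypothetical bar, not a figure from the case study
) -> bool:
    """True if the workflow meets the accuracy bar on this test set."""
    return accuracy(workflow, cases) >= threshold


def toy_workflow(document: str) -> str:
    """Stand-in for a prototyped workflow: flags indemnification language."""
    return "yes" if "indemnif" in document.lower() else "no"


CASES = [
    TestCase("The Seller shall indemnify the Buyer.", "yes"),
    TestCase("This Agreement is governed by Delaware law.", "no"),
    TestCase("Buyer is held harmless against all losses.", "yes"),
    TestCase("Indemnification obligations survive closing.", "yes"),
]

print(accuracy(toy_workflow, CASES))           # 0.75
print(passes_validation(toy_workflow, CASES))  # False
```

The same functions work unchanged whether the test set holds 20 cases or several hundred, which is the point of starting small and expanding: only the list of cases grows.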

The solution: Lawyer-led validation with Vellum

With Vellum, Marveri shifted prototyping and evaluation into the hands of their legal experts. Lawyers now block out and test initial AI workflows directly, while engineers only step in once an idea has been validated and is ready for full AI engineering and production development.

Benchmarking is also lawyer-driven, as the legal team can run workflows against private test data and interpret the results at scale without engineering help.

Here are the solutions that enabled Marveri to bridge the gap between legal expertise and engineering execution:

1/ No-code AI workflow prototyping

Vellum’s visual builder gave Marveri’s lawyers the tools to map out and test early ideas for AI products without touching code. In Vellum they:

  • Prototype entire workflows using their legal expertise to reflect how legal reviews are done manually
  • Draft and refine prompts for each step without touching a line of code
  • Validate ideas by testing which models can handle which steps in a legal workflow, and how accurate they are

Our lawyers can block out a workflow, refine the prompts, and quickly see if it’s practical and worth bringing to the full team. - Michael

Vellum’s easy-to-use workflow builder enabled the legal team to move fast on prototyping high-stakes legal AI tasks, like scanning contracts for critical provisions, and to validate ideas before handing them to the full AI engineering team to refine and build for production.

As a result, Marveri’s engineers understood the use cases more easily. They had working prototypes to guide them and a clear concept for building the complex AI systems required for deploying to legal professionals.

2/ Evaluation tooling for industry compliance & accuracy

Accuracy in legal workflows is dependent on substantive correctness. Since professional restrictions barred Marveri from using customer usage data to validate and iterate on products, Marveri focused on obtaining and developing its own proprietary datasets and evaluation protocols to benchmark its AI workflows.

Marveri spent years developing and compiling this proprietary dataset, but corralling and deploying the trove of benchmarks to evaluate its AI agents was a difficult and time-consuming process. Vellum’s Evaluation Sandbox became their go-to way of:

  • Compiling and deploying evaluation sets from their large set of private sample corporate legal documents that mirror real transactions.
  • Easily scaling their evaluation runs from 20 examples to hundreds of documents to match task complexity.
  • Quantifying performance & accuracy with concrete benchmarks that measure whether outputs stand up under legal scrutiny.
  • Reducing engineering overhead by letting lawyers run and check evaluations themselves, without relying on developers.

In terms of evaluations, we don’t even have to involve our engineering team anymore. We can set up those evaluation sets, run them, check them. It’s really driven by the lawyers now. - Michael

These solutions empowered Marveri’s whole team to collaborate while exercising their expertise to go from idea → validation → production for every AI function faster than ever before.

The results

With Vellum, Marveri completely transformed how their legal AI features get built and validated:

  • Majority of engineering hours saved by shifting prototyping and evaluation to lawyers instead of developers
  • Only the top 10% of AI ideas make it through validation, so engineers can spend their time building features against concepts that are already proven to be viable, instead of chasing dead ends.
  • Validated AI performance: The team runs hundreds of test cases to validate various AI features before they're presented to end users
  • Faster iteration cycles from idea → validation → AI development and production, enabling Marveri to ship lawyer-grade AI features at speed

These outcomes show how Vellum equipped Marveri with the tooling needed to build AI workflows that satisfy the high bar of corporate law, while freeing engineers from costly overhead.

Instead of building 10 features that don’t work, we find 9 that don’t, and then focus on the 1 that does. - Michael

Ship AI legal products faster with Vellum

Vellum is proud to partner with Marveri in redefining how legal AI comes to life.

By empowering lawyers to prototype, test, and validate AI functions independently, teams can break down the barriers between legal and engineering experts and make advanced AI more practical and impactful for legal work.

If your goal is to ship AI products faster, cut wasted engineering time, and deliver features backed by measurable accuracy, Vellum can help.

Book a demo with us here, and discover how Vellum can enable your team to build more reliable AI features, faster.

And if your legal team is looking for assistance in corralling large groups of legal documents for M&A, financings, and more, check out Marveri at marveri.com.

ABOUT THE AUTHOR
Nicolas Zeeb
Technical Content Lead

Nick is Vellum’s technical content lead, writing about practical ways to use both voice and text-based agents at work. He has hands-on experience automating repetitive workflows so teams can focus on higher-value work.

LAST UPDATED
Sep 8, 2025