How Marveri enabled lawyers to shape AI products without blocking developers

Learn how Marveri's lawyers use Vellum to build and evaluate AI workflows and save countless engineering hours.

Author
Nicolas Zeeb
Last updated:
Sep 8, 2025
Customer Stories
Legal tech

Marveri is a legal technology company building an AI-powered document analysis platform to help attorneys review and interpret large sets of corporate legal documents faster and with greater accuracy.

Marveri’s team knew that combining brilliant AI engineering with expert input from lawyers was key to building the best product.

But lawyers communicate in ways that often don’t mesh with how developers work, creating silos between the two groups. This gap pushed the team to try multiple AI solutions, many of which ended up sending AI engineering teams down rabbit holes and stalling projects before they ever reached production. And despite having a trove of private test data and the legal expertise to interpret it, evaluating workflow performance was ad-hoc and inconsistent without engineering assistance.

That’s why Marveri turned to Vellum to bridge the gap between their lawyers and engineers.

By equipping their lawyers to prototype and evaluate AI workflows, Marveri gave its legal team a way to vet product ideas on their own, so that full AI engineering effort is spent only on proven concepts.

With Vellum, Marveri can bring its curated trove of proprietary data, evals, and benchmarks to bear on all of its existing products and quickly get new functionality up and running. We sat down with Marveri’s Director of Product, Michael Dockery, to hear how his team uses Vellum to move faster, save engineering hours, and deliver lawyer-approved AI features with confidence.

The challenge: Engineers chasing unvalidated ideas

Marveri’s target industry is bound by strict rules of professional conduct and highly dependent on legal expertise, which meant every new product idea had to be vetted for compliance and benchmarked for accuracy before it could move forward.

Because regulations bar the use of end-user data, Marveri depends entirely on its own proprietary testing data to improve its products. That made testing and iterating without spending valuable engineering time and effort a major challenge.

On top of this, their team struggled with:

  • Heavy reliance on engineers: Lawyers couldn’t prototype or benchmark workflows on their own, forcing engineers to code every idea for validation.
  • Lost months of work: AI engineering takes time, and engineers were chasing features that later proved unviable, costing Marveri significant time and progress.
  • Ad-hoc evaluations: Despite a treasure trove of private benchmarking data and the legal expertise to interpret it, accuracy was still judged through scripts, spreadsheets, or “vibe checks,” with no consistent way to measure substantive correctness.

To overcome these bottlenecks, Marveri partnered with Vellum to bridge the gap between their lawyers’ expertise and their engineers’ execution.

Before Vellum, we had to pull in our engineers, who already had plenty on their plates, to code out ideas before we even knew if they were practical. — Michael Dockery, Director of Product at Marveri

The goal: Validate ideas before AI engineering

It became clear to Marveri that their lawyers needed a way to sketch out and test workflow ideas directly, to ensure that critical AI engineering efforts were only focused on validated ideas.

With an AI development platform like Vellum, they wanted to:

  • Make it easy for lawyers to prototype: Enable the legal team to sketch out and test how an AI feature could work for their customers.
  • Run evaluations at scale: Allow lawyers to evaluate performance and accuracy directly and at scale, starting small with 20 examples and expanding to 50 or 100+ test cases depending on the complexity of the feature.
  • Quantify accuracy: Replace “vibe checks” with concrete benchmark results they could trust internally and present to customers.
  • Filter ideas early: Discard concepts that didn’t meet accuracy standards before they reached full AI engineering.

The solution: Lawyer-led validation with Vellum

With Vellum, Marveri shifted prototyping and evaluation into the hands of their legal experts. Lawyers now block out and test initial AI workflows directly, while engineers only step in once an idea has been validated and is ready for full AI engineering and production development.

Benchmarking is also lawyer-driven: the legal team can run private test data against workflows at scale and interpret the results directly.

Here are the solutions that enabled Marveri to bridge the gap between legal expertise and engineering execution:

1/ No-code AI workflow prototyping

Vellum’s visual builder gave Marveri’s lawyers the tools to map out and test early ideas for AI products without touching code. In Vellum they:

  • Prototype entire workflows that reflect how legal reviews are done manually, drawing on their legal expertise
  • Draft and refine prompts for each step without writing a line of code
  • Validate ideas by testing which models can handle which steps in a legal workflow, and how accurate they are

Our lawyers can block out a workflow, refine the prompts, and quickly see if it’s practical and worth bringing to the full team. - Michael

Vellum’s easy-to-use workflow builder let the legal team move fast on prototyping high-stakes legal AI tasks, like scanning contracts for critical provisions, and validate them before passing the ideas to the full AI engineering team to refine and build for production.

As a result, Marveri’s engineers understood the use cases more easily. They had working prototypes to guide them and a clear concept for building the complex AI systems required for deploying to legal professionals.

2/ Evaluation tooling for industry compliance & accuracy

Accuracy in legal workflows hinges on substantive correctness. Since professional restrictions barred the use of customer data to validate and iterate on products, Marveri built its own proprietary datasets and evaluation protocols to benchmark its AI workflows.

Even after years of developing and compiling this proprietary dataset, corralling and deploying the trove of benchmarks to evaluate their AI agents remained difficult and time-consuming. Vellum’s Evaluation Sandbox became their go-to way of:

  • Compiling and deploying evaluation sets from its large set of private sample corporate legal documents that mirror real transactions
  • Easily scaling their evaluation runs from 20 examples to hundreds of documents to match task complexity.
  • Quantifying performance & accuracy with concrete benchmarks that measure whether outputs stand up under legal scrutiny.
  • Reducing engineering overhead by letting lawyers run and check evaluations themselves, without relying on developers.

In terms of evaluations, we don’t even have to involve our engineering team anymore. We can set up those evaluation sets, run them, check them. It’s really driven by the lawyers now. - Michael

These solutions let Marveri’s whole team collaborate, each contributing their own expertise, and move from idea → validation → production for every AI function faster than ever before.

The results

With Vellum, Marveri completely transformed how their legal AI features get built and validated:

  • Majority of engineering hours saved: prototyping and evaluation now sit with lawyers instead of developers
  • Only the top 10% of AI ideas make it through validation, so engineers spend their time building features from concepts that are already proven viable instead of chasing dead ends.
  • Validated AI performance: The team runs hundreds of test cases to validate various AI features before they're presented to end users
  • Faster iteration cycles from idea → validation → AI development and production, enabling Marveri to ship lawyer-grade AI features at speed

These outcomes show how Vellum equipped Marveri with the tooling needed to build AI workflows that satisfy the high bar of corporate law, while freeing engineers from costly overhead.

Instead of building 10 features that don’t work, we find 9 that don’t, and then focus on the 1 that does. - Michael

Ship AI legal products faster with Vellum

Vellum is proud to partner with Marveri in redefining how legal AI comes to life.

By empowering lawyers to prototype, test, and validate AI functions independently, Marveri is breaking down the barriers between domain experts and engineers, making advanced AI more practical and impactful for legal work.

If your goal is to ship AI products faster, cut wasted engineering time, and deliver features backed by measurable accuracy, Vellum can help.

Book a demo with us here, and discover how Vellum can enable your team to build more reliable AI features, faster.

And if your legal team is looking for assistance in corralling large groups of legal documents for M&A, financings, and more, check out Marveri at marveri.com.

ABOUT THE AUTHOR
Nicolas Zeeb
Technical Content Lead

Nick is Vellum’s technical content lead, writing about practical ways to use both voice and text-based agents at work. He has hands-on experience automating repetitive workflows so teams can focus on higher-value work.
