PDF Data Extraction to CSV

This agentic workflow extracts data from PDF files and converts it into structured CSV format. It processes each page of the PDF, generating separate CSV outputs for menu items, invoices, and product specifications.
Vellum Team
Created By
Anita Kirkovska
Last Updated
July 31, 2025
Categories
Document extraction

How It Works / How to Build It

  1. GetParseEachPage: This node takes a list of PDF file names as input and initiates the subworkflow to process each page of the PDFs.
  2. GetPage: This templating node retrieves each page of the PDF based on the input item.
  3. GetPage1: This search node queries the document index for the specific page content, applying weights for semantic similarity and keywords.
  4. ParseProcessedPDF: This inline prompt node processes the unstructured text data retrieved from the PDF and converts it into a structured CSV format.
  5. ProcessedPDF: This final output node captures the processed CSV output from the ParseProcessedPDF node.
  6. MenuCSVOutput, InvoiceCSVOutput, ProductSpecCSVOutput: These nodes output the structured data into separate CSV files for menu items, invoices, and product specifications.
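
The node sequence above can be sketched in plain Python to show the data flow: map over pages, parse each one into rows, then write one CSV per document type. This is only an illustration, not Vellum's SDK or the template's actual code. The column names in SCHEMAS, the "type on the first line, then pipe-separated rows" page format, and the function names are all assumptions made for the example; in the real workflow a prompt node handles the parsing step.

```python
import csv
import io

# Hypothetical column layouts; the template's real schemas are not shown in the source.
SCHEMAS = {
    "menu": ["item", "price"],
    "invoice": ["description", "quantity", "amount"],
    "product_spec": ["attribute", "value"],
}


def parse_page(page_text):
    """Stand-in for the ParseProcessedPDF prompt node (assumed input format).

    Expects a first line naming the document type, followed by 'a | b | c' rows.
    """
    lines = [ln.strip() for ln in page_text.strip().splitlines() if ln.strip()]
    doc_type = lines[0].lower()
    rows = [[cell.strip() for cell in ln.split("|")] for ln in lines[1:]]
    return doc_type, rows


def rows_to_csv(header, rows):
    """Serialize a header and rows to CSV text, quoting cells where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def process_pages(pages):
    """Map over pages, then group rows by type into one CSV string per type."""
    grouped = {name: [] for name in SCHEMAS}
    for page_text in pages:
        doc_type, rows = parse_page(page_text)
        if doc_type in grouped:  # pages of unknown type are skipped
            grouped[doc_type].extend(rows)
    return {name: rows_to_csv(SCHEMAS[name], rows) for name, rows in grouped.items()}


pages = [
    "menu\nMargherita | 9.50\nTiramisu | 6.00",
    "invoice\nConsulting, March | 10 | 1500.00",
]
outputs = process_pages(pages)
print(outputs["menu"])
```

Using the csv module rather than joining strings with commas matters here: a cell such as "Consulting, March" contains a comma and has to be quoted to stay in one column.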

What You Can Use This For

  • Automating the extraction of data from invoices for accounting teams.
  • Generating product specifications from product catalogs for marketing teams.
  • Creating menu item lists from restaurant PDFs for operations teams.

Prerequisites

  • Vellum account.
  • PDF files containing the data to be extracted.

How to Set It Up

  1. Clone the workflow template in your Vellum account.
  2. Upload your PDF files to the designated input field in the Inputs node.
  3. Connect the GetParseEachPage node to the MenuCSVOutput, InvoiceCSVOutput, and ProductSpecCSVOutput nodes.
  4. Configure any additional settings as needed for your specific use case.
  5. Run the workflow to generate the CSV outputs.
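
After a run, it can help to sanity-check each CSV output before handing it to a downstream team. The following is a minimal sketch using only Python's standard library; the check_csv helper and the column names passed to it are hypothetical, since the template's actual output columns are not listed here.

```python
import csv
import io


def check_csv(csv_text, required_columns):
    """Return a list of problems found in csv_text; an empty list means none were found."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in required_columns if c not in header]
    if missing:
        problems.append(f"missing columns: {missing}")
    for row_no, row in enumerate(reader, start=1):
        # DictReader stores surplus cells under the key None
        # and fills cells missing from short rows with the value None.
        if None in row or any(v is None for v in row.values()):
            problems.append(f"row {row_no}: wrong number of fields")
    return problems


# Example with assumed column names:
print(check_csv("item,price\nMargherita,9.50\nTiramisu\n", ["item", "price"]))
```

The check only covers structure (expected columns present, consistent field counts); whether the extracted values match the PDF still needs a manual spot check.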