
Tutorial: Setting Up OpenAI Function Calling with Chat Models

Learn how to use OpenAI function calling in your AI apps to enable reliable, structured outputs.


LLMs are great at complex language tasks but often produce unstructured and unpredictable responses, which creates challenges for developers who need structured data. Extracting information from unstructured text usually requires brittle workarounds such as regular expressions or careful prompt engineering, and that slows development.

To simplify this, OpenAI introduced function calling to ensure more reliable structured data output from their models.

After reading this tutorial, you'll understand how this feature works and how to implement various function calling techniques with OpenAI's API.

Let's get started.

<h1 id="what">What is Function Calling?</h1>

With function calling, you can get consistent, structured data from models.

Don't be misled by the name, though: this feature doesn't execute functions on your behalf. Instead, you describe the functions in the API call, and the model returns the name of the function to call along with the arguments for it as JSON. You then use those arguments to execute the function in your own code.

Now that we've cleared that up, let's show you how to set it up.

<h1 id="overview">OpenAI Function Calling example</h1>

In this tutorial, we'll show you how to dynamically generate arguments for two arbitrary weather forecast functions. We'll show you how to:

  • Use OpenAI's tools parameter to describe your functions;
  • Run the model to generate arguments for one or multiple functions;
  • Use those arguments to execute arbitrary functions in your code.

💡 This tutorial focuses on configuring function calling and does not cover setting up the OpenAI environment. We assume you already have that in place; if not, please refer to the OpenAI documentation. In the sections below, we'll detail each step and share the code we used. If you'd like to run the code while you read, feel free to use this Colab notebook.

<h1 id="fun">Describing your Functions</h1>

First, we need to describe our functions in the tools parameter of the Chat Completions API call.

For this example, we'll describe these two functions:

  • get_current_weather(): Obtains the weather of a given city at the time of the request.
    • location: A string indicating the city and state (e.g., San Francisco, CA).
    • format: A string enum specifying the temperature unit, either celsius or fahrenheit (the model will infer this from the location).
  • get_n_day_weather_forecast(): Returns the weather over n days at a given location. The function includes the parameters location and format, but also includes:
    • num_days: An integer indicating the number of days for the forecast.

This is what our schema looks like:

<pre><code class="language-python">
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                    },
                },
                "required": ["location", "format"],
            },
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                    },
                    "num_days": {
                        "type": "integer",
                        "description": "The number of days to forecast",
                    }
                },
                "required": ["location", "format", "num_days"]
            },
        }
    },
]
</code></pre>
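The model's arguments usually follow the schema, but nothing guarantees it, so it can help to check them before passing them to your own code. The sketch below is a hand-rolled check (required keys, enum values, basic types) against a trimmed copy of one tool definition shaped like the schema above. `validate_arguments` and `JSON_TYPES` are hypothetical names, not part of the OpenAI SDK, and a full JSON Schema validator would be more thorough.

```python
import json

# A trimmed copy of one tool definition, in the same shape as the schema above.
forecast_tool = {
    "type": "function",
    "function": {
        "name": "get_n_day_weather_forecast",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                "num_days": {"type": "integer"},
            },
            "required": ["location", "format", "num_days"],
        },
    },
}

# Maps JSON Schema type names to Python types (only the ones used here).
JSON_TYPES = {"string": str, "integer": int}


def validate_arguments(tool, raw_arguments):
    """Hypothetical helper: return a list of problems found in the arguments."""
    params = tool["function"]["parameters"]
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = JSON_TYPES.get(spec["type"])
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (not isinstance(value, expected) or isinstance(value, bool)):
            problems.append(f"{name} should be of type {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


print(validate_arguments(forecast_tool, '{"location": "Glasgow", "format": "kelvin"}'))
```

An empty list means the arguments passed every check; anything else can be logged or sent back to the model as an error message.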

Before using this schema, we'll introduce a helper function to make calling the Chat Completions API easier. It reduces code repetition, catches errors, and sets a default model. In our Colab notebook, we've defined GPT_MODEL as gpt-3.5-turbo-0613. Here's the helper function that we'll use throughout the following sections:

<pre><code class="language-python">
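# Helper function

from openai import OpenAI

GPT_MODEL = "gpt-3.5-turbo-0613"
client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable


def chat_completion_request(messages, tools=None, tool_choice=None, model=GPT_MODEL):
    """Call the Chat Completions API and return the response, or the exception on failure."""
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            tool_choice=tool_choice,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        # The exception is returned rather than raised, so callers should check the result
        return e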
</code></pre>
<h1 id="gen">Generating Function Arguments</h1>

Now let's see how this schema works when we pass a system message and a user message.

<pre><code class="language-python">
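# create_message() is not defined in this article; we assume it is a small
# helper that builds a message dict, along these lines:
def create_message(role, content):
    return {"role": role, "content": content}

# Define messages

messages = []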
messages.append(
    create_message(
        "system",
        "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    )
)
messages.append(create_message("user", "What's the weather like today?"))

# Submit response

chat_response = chat_completion_request(
    messages, tools=tools
)
messages.append(chat_response.choices[0].message)
print(chat_response.choices[0])
</code></pre>

In the example above, the system message instructs the model not to assume values for function parameters that the user hasn't provided. As a result, the model shouldn't generate a function call until it has all the necessary parameter details. For instance, if the user message is "What's the weather like today?", the model will ask the user for the location before it generates a function call:

<pre><code class="language-python">
# Output 

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Sure, could you please provide me with your current location?', role='assistant', function_call=None, tool_calls=None))
</code></pre>

When the model has all the required parameters that we defined in our schema, it returns the function name and arguments instead of a text reply. You can tell that the model has requested a function call by checking the finish_reason and tool_calls fields in the response.

In the snippet below, we append the user's reply to the messages list and send the request to the API again:

<pre><code class="language-python">
# Define messages

messages.append(create_message("user", "I'm in San Francisco, CA"))

# Submit response

chat_response = chat_completion_request(
    messages, tools=tools
)
messages.append(chat_response.choices[0].message)
print(chat_response.choices[0])
</code></pre>

Since we're providing the last missing piece of information, the model should now have enough to return a function call with arguments:

<pre><code class="language-python">
# Output

Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_npQlZt0Ef84rYiT6Dat8V1xO', function=Function(arguments='{\n  "location": "San Francisco, CA",\n  "format": "celsius"\n}', name='get_current_weather'), type='function')]))
</code></pre>
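One way to branch on the two kinds of response shown so far is to look at `message.tool_calls` directly. The sketch below uses `SimpleNamespace` stand-ins shaped like the printed `Choice` objects so that it runs without an API key; `summarize_choice` is a hypothetical helper, and the call id is made up for illustration.

```python
import json
from types import SimpleNamespace


def summarize_choice(choice):
    """Hypothetical helper: describe what the model returned in a Choice."""
    message = choice.message
    if message.tool_calls:
        # The model is asking us to call one or more functions.
        return [
            (call.function.name, json.loads(call.function.arguments))
            for call in message.tool_calls
        ]
    # Otherwise the model replied with plain text (for example, a clarifying question).
    return message.content


# Stand-ins that mimic the shape of the two outputs shown above.
question = SimpleNamespace(
    finish_reason="stop",
    message=SimpleNamespace(
        content="Sure, could you please provide me with your current location?",
        tool_calls=None,
    ),
)
tool_request = SimpleNamespace(
    finish_reason="tool_calls",
    message=SimpleNamespace(
        content=None,
        tool_calls=[
            SimpleNamespace(
                id="call_example",  # made-up id for illustration
                function=SimpleNamespace(
                    name="get_current_weather",
                    arguments='{"location": "San Francisco, CA", "format": "celsius"}',
                ),
            )
        ],
    ),
)

print(summarize_choice(question))
print(summarize_choice(tool_request))
```

With a real response, you would pass `chat_response.choices[0]` instead of the stand-ins.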

Notice that the model chose a function for this user message on its own.

That's because, when tools are provided, the tool_choice parameter defaults to auto, which lets the model decide whether to call a function and, if several are available, which one. If no tools are provided, tool_choice defaults to none.

To see this in action, let's change the user's request to:

<pre><code class="language-python">
...
messages.append(create_message("user", "What is the weather going to be like in San Francisco, CA over the next 5 days"))
...
</code></pre>

The model will know to suggest our get_n_day_weather_forecast() instead:

<pre><code class="language-python">
# Output (illustrative)

Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_...', function=Function(arguments='{\n  "location": "San Francisco, CA",\n  "format": "celsius",\n  "num_days": 5\n}', name='get_n_day_weather_forecast'), type='function')]))
</code></pre>

Forcing a model to choose one function

You can also force the model to call one specific function by passing its name in the tool_choice parameter:

<pre><code class="language-python">
...
chat_response = chat_completion_request(
    messages, tools=tools, tool_choice={"type": "function", "function": {"name": "get_n_day_weather_forecast"}}
)
...
</code></pre>

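And here's the output that we get from it. Note that finish_reason is 'stop' here even though the response contains a tool call, so it is safer to check tool_calls than to rely on finish_reason alone: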

<pre><code class="language-python">
Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_FQSl6U46sIG2HyXHQ1UnNMrm', function=Function(arguments='{\n  "location": "Toronto, Canada",\n  "format": "celsius",\n  "num_days": 1\n}', name='get_n_day_weather_forecast'), type='function')]))
</code></pre>
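The forms of tool_choice discussed in this section (auto, none, and a specific function) can be collected in one small helper. This is only a convenience sketch: `build_tool_choice` and its `allow_tools` flag are hypothetical names introduced for illustration.

```python
def build_tool_choice(function_name=None, allow_tools=True):
    """Hypothetical helper: build a value for the tool_choice parameter."""
    if not allow_tools:
        # "none": the model must answer in plain text.
        return "none"
    if function_name is None:
        # "auto": the model decides whether to call a function, and which one.
        return "auto"
    # Force a call to one specific function.
    return {"type": "function", "function": {"name": function_name}}


print(build_tool_choice())
print(build_tool_choice("get_n_day_weather_forecast"))
print(build_tool_choice(allow_tools=False))
```

The returned value can be passed as the tool_choice argument of the chat_completion_request() helper.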

<h1 id="call">Parallel Function Calling</h1>

In some cases, you'd like the model to return multiple function calls in a single response, so that they can be executed and resolved in parallel. Newer models such as gpt-4-1106-preview and gpt-3.5-turbo-1106 support this.

In our case, let's imagine that a user is asking for the weather in two locations:

<pre><code class="language-python">
messages = []
messages.append({"role": "system", "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."})
messages.append({"role": "user", "content": "What is the weather going to be like in San Francisco and Glasgow over the next 4 days"})
chat_response = chat_completion_request(
    messages, tools=tools, model='gpt-3.5-turbo-1106'
)

assistant_message = chat_response.choices[0].message.tool_calls
print(assistant_message)
</code></pre>

The model should return a list of two tool calls for the same function, each with different arguments:

<pre><code class="language-python">
[ChatCompletionMessageToolCall(id='call_oEWfcqY5wiBNAGw8Rb6xlymf', function=Function(arguments='{"location": "San Francisco, CA", "format": "celsius", "num_days": 4}', name='get_n_day_weather_forecast'), type='function'), ChatCompletionMessageToolCall(id='call_yBIdc8jb2m4c3Z2zB4NUEofO', function=Function(arguments='{"location": "Glasgow", "format": "celsius", "num_days": 4}', name='get_n_day_weather_forecast'), type='function')]
</code></pre>
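When a response contains several tool calls, each one needs its own result message, matched to the request by its id. The sketch below loops over the calls, runs a placeholder function for each, and builds one message with the "tool" role per call. The stand-in objects mimic the shape of the output above with made-up ids, and `run_tool_calls`, `AVAILABLE_FUNCTIONS`, and the placeholder forecast function are hypothetical.

```python
import json
from types import SimpleNamespace


def get_n_day_weather_forecast(location, format, num_days):
    # Placeholder implementation; a real one would query a weather service.
    return f"{num_days}-day forecast for {location} in {format}"


AVAILABLE_FUNCTIONS = {"get_n_day_weather_forecast": get_n_day_weather_forecast}


def run_tool_calls(tool_calls):
    """Hypothetical helper: execute every tool call and build the reply messages."""
    tool_messages = []
    for call in tool_calls:
        function = AVAILABLE_FUNCTIONS[call.function.name]
        args = json.loads(call.function.arguments)
        tool_messages.append(
            {
                "role": "tool",
                "tool_call_id": call.id,  # ties the result to the request
                "content": function(**args),
            }
        )
    return tool_messages


# Stand-ins shaped like the two tool calls in the output above (ids are made up).
calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(
            name="get_n_day_weather_forecast",
            arguments='{"location": "San Francisco, CA", "format": "celsius", "num_days": 4}',
        ),
    ),
    SimpleNamespace(
        id="call_2",
        function=SimpleNamespace(
            name="get_n_day_weather_forecast",
            arguments='{"location": "Glasgow", "format": "celsius", "num_days": 4}',
        ),
    ),
]

for tool_message in run_tool_calls(calls):
    print(tool_message)
```

In a real conversation, these messages would be appended after the assistant message that requested the calls, and the list sent back to the model.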

<h1 id="call">Calling Functions</h1>

Now that we know how to shape our API requests, it's time to use the output to call our functions.

To illustrate how this works, each function below simply returns a fixed string. We also define execute_function_call(), which uses if-else conditionals to check the LLM's output and call the matching function.

<pre><code class="language-python">
import json

def get_current_weather(location, format):
    return "Call successful from get_current_weather()."


def get_n_day_weather_forecast(location, format, num_days):
    return "Call successful from get_n_day_weather_forecast()"


def execute_function_call(message):
    # Only the first tool call is handled here
    tool_call = message.tool_calls[0]
    # The arguments arrive as a JSON string, so parse them into a dict
    args = json.loads(tool_call.function.arguments)
    if tool_call.function.name == "get_current_weather":
        results = get_current_weather(args["location"], args["format"])
    elif tool_call.function.name == "get_n_day_weather_forecast":
        results = get_n_day_weather_forecast(args["location"], args["format"], args["num_days"])
    else:
        results = f"Error: function {tool_call.function.name} does not exist"
    return results
</code></pre>
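As the number of functions grows, the if-else chain can be swapped for a dictionary lookup. The sketch below is an alternative to the approach above, not the tutorial's implementation: it takes the function name and the raw JSON string directly, and it also guards against malformed JSON and mismatched arguments. `FUNCTION_REGISTRY` and `dispatch` are hypothetical names.

```python
import json


def get_current_weather(location, format):
    return "Call successful from get_current_weather()."


def get_n_day_weather_forecast(location, format, num_days):
    return "Call successful from get_n_day_weather_forecast()"


# Maps the function names used in the schema to the Python callables.
FUNCTION_REGISTRY = {
    "get_current_weather": get_current_weather,
    "get_n_day_weather_forecast": get_n_day_weather_forecast,
}


def dispatch(name, raw_arguments):
    """Hypothetical helper: look up a function by name and call it safely."""
    function = FUNCTION_REGISTRY.get(name)
    if function is None:
        return f"Error: function {name} does not exist"
    try:
        args = json.loads(raw_arguments)
        return function(**args)
    except json.JSONDecodeError:
        return f"Error: arguments for {name} are not valid JSON"
    except TypeError as exc:
        # Typically raised when required arguments are missing or unexpected ones are present.
        return f"Error: bad arguments for {name}: {exc}"


print(dispatch("get_current_weather", '{"location": "San Francisco, CA", "format": "celsius"}'))
print(dispatch("get_stock_price", "{}"))
```

Returning the error as a string, as the original else branch does, lets you pass the failure back to the model instead of crashing the conversation loop.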

Piecing everything together, let's send one more request to the API. 

The code below:

  • Handles a user request
  • Submits the list of messages to the model via chat_completion_request()
  • Parses the model's response
  • Calls the corresponding function with execute_function_call()
  • Prints the conversation in a readable format
<pre><code class="language-python">
# Define messages

messages = []
messages.append(
    create_message(
        "system",
        "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    )
)
messages.append(create_message("user", "what is the weather going to be like in San Francisco, CA?"))

# Submit response

chat_response = chat_completion_request(messages, tools)

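# Parse response

msg = chat_response.choices[0].message

# Call corresponding function

if msg.tool_calls:
    # Record the model's function request so it appears in the printed conversation
    messages.append({"role": msg.role, "content": msg.tool_calls[0].function})
    results = execute_function_call(msg)
    # This list is only printed here; to send the result back to the API,
    # the message role should be "tool"
    messages.append({"role": "function",
                     "tool_call_id": msg.tool_calls[0].id,
                     "name": msg.tool_calls[0].function.name,
                     "content": results
                     })
else:
    # No function call: the model answered in plain text
    messages.append({"role": msg.role, "content": msg.content})

# pretty_print_conversation() is a display helper that is not shown in this article
pretty_print_conversation(messages)
</code></pre>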

And we get this final output:

<pre><code class="language-python">
System: Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.

User: what is the weather going to be like in San Francisco, CA?

Assistant: Function(arguments='{\n  "location": "San Francisco, CA",\n  "format": "celsius"\n}', name='get_current_weather')

function (get_current_weather): Call successful from get_current_weather().
</code></pre>
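The pretty_print_conversation() helper is not defined in this article. As a rough stand-in, the sketch below formats a list of message dicts in the same layout as the output above. `format_conversation` is a hypothetical function, and the notebook's own helper may differ (for example, by coloring each role).

```python
def format_conversation(messages):
    """Hypothetical stand-in for pretty_print_conversation(): returns the text."""
    lines = []
    for message in messages:
        role = message["role"]
        if role == "function":
            # Function results are labelled with the function name.
            lines.append(f"function ({message['name']}): {message['content']}")
        else:
            lines.append(f"{role.capitalize()}: {message['content']}")
    # Separate messages with a blank line, as in the output above.
    return "\n\n".join(lines)


example = [
    {"role": "system", "content": "Ask for clarification if a user request is ambiguous."},
    {"role": "user", "content": "what is the weather going to be like in San Francisco, CA?"},
    {
        "role": "function",
        "name": "get_current_weather",
        "content": "Call successful from get_current_weather().",
    },
]
print(format_conversation(example))
```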

<h1 id="conclusion">Conclusion</h1>

In summary, the OpenAI API's function calling feature lets you describe custom functions, and the model decides when one should be called and generates structured JSON containing the necessary arguments. This enables more dynamic and interactive applications, where your code performs specific tasks or retrieves information based on natural language input.

With this demo, you should be ready to implement function calling for your own use case. If you run into any trouble, feel free to DM me on Twitter.

If you want to get these insights in your inbox, subscribe to our newsletter here.

ABOUT THE AUTHOR
Anita Kirkovska
Founding Growth Lead

An AI expert with a strong ML background, specializing in GenAI and LLM education. A former Fulbright scholar, she leads Growth and Education at Vellum, helping companies build and scale AI products. She conducts LLM evaluations and writes extensively on AI best practices, empowering business leaders to drive effective AI adoption.

LAST UPDATED
Apr 23, 2024