!pip install openai -qqq
Introspection for Function Calling
Function calling is one of the key elements behind the current rise of agentic workflows. The simple idea that an LLM can decide to call one among the provided functions based on the user’s input and receive the results back is a very powerful pattern.
This pattern powers many of the features we see on ChatGPT itself, such as searching the web, running generated code in an interpreter, generating images using DALL-E, or storing memory based on what the user has said across sessions.
Most closed-source LLM providers, like OpenAI and Anthropic, expose this feature as a capability called function calling or tool use. They require us to provide the signature and parameters of our custom function to the API in a specific way as a JSON Schema. This is done because JSON Schema is language agnostic, and we can use the returned function name and parameters to call the actual implementation in any language of our choice.
For example, suppose we want to expose a simple add function to OpenAI.
def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b
First, we describe the function as a JSON Schema, as shown below. This includes the name of the function, a description of what it does, and the name and type of every parameter it accepts.
tools = [
    {
        "type": "function",
        "function": {
            "name": "add",
            "description": "Adds two integers together",
            "strict": True,
            "parameters": {
                "type": "object",
                "required": ["a", "b"],
                "properties": {
                    "a": {"type": "integer", "description": "The first integer to add"},
                    "b": {
                        "type": "integer",
                        "description": "The second integer to add",
                    },
                },
                "additionalProperties": False,
            },
        },
    }
]
Then, we can provide our schema as a list of tools and send a user query.
from openai import OpenAI

client = OpenAI()

messages = [{"role": "user", "content": "Add 2 and 3"}]
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
)
The model then decides that it wants to call the add function with the parameters a=2 and b=3
tool_call = completion.choices[0].message.tool_calls[0]
tool_call.function
Function(arguments='{"a":2,"b":3}', name='add')
tool_call.function.name
'add'
We can fetch the arguments to be passed to the function as shown below
import json
args = json.loads(tool_call.function.arguments)
args
{'a': 2, 'b': 3}
Then we call our function with those arguments and get a result
result = add(**args)
print(result)
5
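Once more than one tool is registered, the function name returned by the model has to be mapped back to an implementation. Below is a minimal sketch of that lookup, assuming a hand-written registry dictionary (`available_functions`) and a hypothetical helper `call_tool`, neither of which is part of the OpenAI SDK. The name and arguments are hard-coded to mirror the response above instead of being fetched from the API.

```python
import json


def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b


def multiply(a: int, b: int) -> int:
    """Multiplies two integers together"""
    return a * b


# Hypothetical registry mapping tool names to their Python implementations
available_functions = {"add": add, "multiply": multiply}


def call_tool(name: str, arguments: str):
    """Look up a tool by name and call it with JSON-encoded arguments."""
    if name not in available_functions:
        raise ValueError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments)
    return available_functions[name](**kwargs)


# Same shape as tool_call.function.name and tool_call.function.arguments
result = call_tool("add", '{"a":2,"b":3}')
print(result)
```

Rejecting unknown names explicitly is a deliberate choice here: the name comes from model output, so it should be treated as untrusted input.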
The result is sent back to the model as a separate tool message, and the model then generates a natural-language reply in the next turn.
messages.append(completion.choices[0].message)
messages.append(
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": str(result),
    }
)
completion_after_tool_call = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
)
completion_after_tool_call.choices[0].message.content
'The sum of 2 and 3 is 5.'
Now, the question becomes: how can we automatically convert Python functions into JSON Schemas?
In this post, I will go over various runtime introspection features that Python provides to extract pretty much everything about a function definition.
Object Introspection
Let’s understand the various introspection features step by step.
Extracting the parameters and type-annotations
To get the parameters of the function, we can use the signature function of the inspect module.
def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b
This will return the entire signature for both the input parameters and the return type.
import inspect

signature = inspect.signature(add)
signature
<Signature (a: int, b: int) -> int>
We can get a dictionary of the parameters of the function from the signature
signature.parameters
mappingproxy({'a': <Parameter "a: int">, 'b': <Parameter "b: int">})
We can access each parameter from the dictionary. It will return an object that has many useful properties
a = signature.parameters["a"]
a
<Parameter "a: int">
We can now easily access the name of the parameter, its default value as well as the type annotation.
print("Name of parameter: ", a.name)
print("Default value: ", a.default)
print("Type annotation: ", a.annotation)
Name of parameter: a
Default value: <class 'inspect._empty'>
Type annotation: <class 'int'>
This means that if a parameter has a default value of inspect._empty (also exposed publicly as inspect.Parameter.empty), it’s a required parameter.
a.default == inspect._empty
True
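To see the required/optional distinction on a function that actually has a default, here is a short sketch. The `greet` function is a made-up example, and the comparison uses `inspect.Parameter.empty`, the public name for the same sentinel.

```python
import inspect


def greet(name: str, greeting: str = "Hello") -> str:
    """Greets a person by name"""
    return f"{greeting}, {name}!"


params = inspect.signature(greet).parameters

# inspect.Parameter.empty is the public name for the inspect._empty sentinel
required = [n for n, p in params.items() if p.default is inspect.Parameter.empty]
optional = {
    n: p.default for n, p in params.items() if p.default is not inspect.Parameter.empty
}

print(required)
print(optional)
```

Here `name` ends up in `required`, while `greeting` is collected together with its default value.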
The type annotation is of particular interest to us. It will return the type directly
a.annotation
<class 'int'>
a.annotation == int
True
We can also get the annotation for the function’s return value from the signature itself.
signature.return_annotation
<class 'int'>
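One caveat worth knowing: annotations can be stored as strings, for example when they are written in quotes or when a module uses `from __future__ import annotations`. In that case the signature reports the string, not the type. The sketch below, which uses quoted annotations purely for illustration, shows how `typing.get_type_hints` resolves them back into real types.

```python
import inspect
import typing


def add(a: "int", b: "int") -> "int":
    """Adds two integers together"""
    return a + b


# With string annotations, the signature reports the raw strings
raw = inspect.signature(add).parameters["a"].annotation

# typing.get_type_hints evaluates them back into real types
hints = typing.get_type_hints(add)

print(repr(raw))
print(hints)
```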
Extracting the docstring
To get the docstring, we can use the __doc__ attribute in the function
def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b
add.__doc__
'Adds two integers together'
An alternative is to use inspect.getdoc, which also cleans up the docstring’s indentation.
import inspect
inspect.getdoc(add)
'Adds two integers together'
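The two approaches only differ for multi-line docstrings. As a small illustration (the extra docstring line is made up for this example), `inspect.getdoc` strips the surrounding blank lines and common indentation, while the raw attribute keeps the leading and trailing newlines.

```python
import inspect


def add(a: int, b: int) -> int:
    """
    Adds two integers together.

    Both arguments must be integers.
    """
    return a + b


# Raw docstring: the leading and trailing newlines are still there
raw_doc = add.__doc__

# Cleaned docstring: surrounding blank lines and common indentation removed
clean_doc = inspect.getdoc(add)

print(repr(raw_doc))
print(repr(clean_doc))
```

For a tool description sent to an LLM, the cleaned version is usually the better choice, since stray whitespace only adds tokens.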
Extracting the function name
This is relatively simple, as Python already provides a __name__ attribute on every function.
def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b

add.__name__
'add'
Extracting the parameter descriptions from the docstring
Docstring formats vary a lot (Google, NumPy, reST, and others), so instead of parsing them by hand we can use a third-party library called docstring_parser.
!pip install docstring_parser -qqq
def add(a: int, b: int) -> int:
    """
    Adds two integers together.
    Args:
        a (int): The first integer.
        b (int): The second integer.
    Returns:
        int: The sum of a and b.
    """
    return a + b
from docstring_parser import parse

doc = parse(add.__doc__)
{param.arg_name: param.description for param in doc.params}
{'a': 'The first integer.', 'b': 'The second integer.'}
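To get a feel for what such a parser has to do, here is a deliberately simplified, standard-library-only sketch that handles just the Google-style `Args:` section. The helper `parse_arg_descriptions` is hypothetical and is not how docstring_parser is implemented. It ignores multi-line descriptions and every other docstring style.

```python
import inspect
import re


def add(a: int, b: int) -> int:
    """
    Adds two integers together.

    Args:
        a (int): The first integer.
        b (int): The second integer.

    Returns:
        int: The sum of a and b.
    """
    return a + b


def parse_arg_descriptions(func) -> dict:
    """Hypothetical helper: read 'name (type): text' lines from an Args: section."""
    doc = inspect.getdoc(func) or ""
    descriptions = {}
    in_args = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
            continue
        # A blank line or any other "Section:" header ends the Args block
        if in_args and (not stripped or re.fullmatch(r"\w+:", stripped)):
            in_args = False
        if in_args:
            match = re.match(r"(\w+)\s*(?:\([^)]*\))?:\s*(.+)", stripped)
            if match:
                descriptions[match.group(1)] = match.group(2)
    return descriptions


print(parse_arg_descriptions(add))
```

The number of edge cases this sketch skips is a good argument for relying on a dedicated library in practice.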
Functions to JSON Schema
With the above background knowledge, we have everything needed to convert the function definition to JSON Schema.
Let’s see how this is applied in various popular agent libraries.
Approach 1: Pure Python
This is the approach implemented in the OpenAI Swarm library. Here, we use the introspection features discussed above to write the conversion function from scratch.
!pip install git+https://github.com/openai/swarm.git -qqq
def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b
Swarm has a utility function called function_to_json that converts a Python function into a JSON schema.
from swarm.util import function_to_json
function_to_json(add)
{ 'type': 'function', 'function': { 'name': 'add', 'description': 'Adds two integers together', 'parameters': { 'type': 'object', 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'] } } }
As the output shows, we first need a mapping that converts Python parameter types to the equivalent JSON Schema data types.
python | json_schema |
---|---|
str | string |
int | integer |
float | number |
bool | boolean |
list | array |
dict | object |
None | null |
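A limitation of a flat mapping like this is that parameterized annotations such as `list[int]` are not the same object as `list`, so a direct dictionary lookup misses them. One possible workaround, sketched below with a hypothetical helper `json_type` (not part of Swarm), is to reduce the annotation to its origin type with `typing.get_origin` first. The `list[int]` syntax assumes Python 3.9 or newer.

```python
import typing

type_map = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    list: "array",
    dict: "object",
    type(None): "null",
}


def json_type(annotation) -> str:
    """Hypothetical helper: map an annotation to a JSON Schema type name."""
    # list[int] has origin list, dict[str, int] has origin dict;
    # plain types such as int have no origin, so we keep them as they are
    base = typing.get_origin(annotation) or annotation
    return type_map.get(base, "string")


print(type_map.get(list[int], "string"))  # direct lookup misses the generic alias
print(json_type(list[int]))
print(json_type(dict[str, int]))
print(json_type(int))
```

This still discards the element type. A fuller version would also emit an `items` entry for arrays.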
Based on this, the implementation is quite simple and reuses all the concepts we discussed before.
We take the function signature and extract the type of each parameter, along with the function name and docstring. From these, we construct the JSON Schema at the end.
# Source: https://github.com/openai/swarm/blob/9db581cecaacea0d46a933d6453c312b034dbf47/swarm/util.py#L31
import inspect

def function_to_json(func) -> dict:
    # A mapping of types from python to JSON
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
        list: "array",
        dict: "object",
        type(None): "null",
    }
    # 1. Get the function signature
    try:
        signature = inspect.signature(func)
    except ValueError as e:
        raise ValueError(
            f"Failed to get signature for function {func.__name__}: {str(e)}"
        )
    # 2. Convert each type annotation to a JSON type, defaulting to "string"
    parameters = {}
    for param in signature.parameters.values():
        try:
            # dict.get never raises KeyError, so the except branch is only a safeguard
            param_type = type_map.get(param.annotation, "string")
        except KeyError as e:
            raise KeyError(
                f"Unknown type annotation {param.annotation} for parameter {param.name}: {str(e)}"
            )
        parameters[param.name] = {"type": param_type}
    # 3. Parameters without a default value are required
    required = [
        param.name
        for param in signature.parameters.values()
        if param.default == inspect._empty
    ]
    return {
        "type": "function",
        "function": {
            # 4. Extract the function name
            "name": func.__name__,
            # 5. Extract the docstring
            "description": func.__doc__ or "",
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": required,
            },
        },
    }
1. Get the function signature
2. For each parameter, convert the type annotation to a valid JSON type, defaulting to string if the user didn’t specify a type
3. Find out which parameters are required
4. Extract the function name
5. Extract the docstring
function_to_json(add)
{ 'type': 'function', 'function': { 'name': 'add', 'description': 'Adds two integers together', 'parameters': { 'type': 'object', 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'] } } }
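The `add` example exercises only the simplest path. To illustrate the two fallbacks (an unannotated parameter and a parameter with a default), here is a condensed, hypothetical variant of the same conversion called `to_tool_schema`, applied to a made-up `search` function. It is a sketch for experimentation, not Swarm's code.

```python
import inspect

TYPE_MAP = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    list: "array",
    dict: "object",
    type(None): "null",
}


def to_tool_schema(func) -> dict:
    """Condensed, hypothetical variant of the conversion shown above."""
    params = inspect.signature(func).parameters
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                # Unannotated parameters fall back to "string"
                "properties": {
                    name: {"type": TYPE_MAP.get(p.annotation, "string")}
                    for name, p in params.items()
                },
                # Parameters with a default value are left out of "required"
                "required": [
                    name
                    for name, p in params.items()
                    if p.default is inspect.Parameter.empty
                ],
            },
        },
    }


def search(query, limit: int = 10) -> list:
    """Searches for a query"""
    return []


tool_schema = to_tool_schema(search)
print(tool_schema["function"]["parameters"])
```

Note that `query` is typed as a string only because it has no annotation, and `limit` is absent from `required` because it has a default.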
Approach 2: Pydantic
This approach is implemented in popular libraries like LlamaIndex and LangChain under the hood.
I first came across this approach in Jeremy Howard’s talk, A Hackers’ Guide to Language Models.
Pydantic is already used for data validation and serialization in Python for structured data. As such, it can convert a Python class into a JSON schema directly.
For example, if we were to define a Pydantic model for our add function manually, it would look something like below.
from pydantic import BaseModel

class Add(BaseModel):
    a: int
    b: int

Add.model_json_schema()
{ 'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}, 'required': ['a', 'b'], 'title': 'Add', 'type': 'object' }
However, we want to create the Pydantic data model dynamically. This is possible via the create_model function provided by Pydantic.
It takes the name of the model as the first argument, followed by named parameters for the different fields in the model.
Here a=(int, ...) means that the field a is of type int and is required.
from pydantic import create_model

a = create_model("Add", a=(int, ...), b=(int, ...))
a.model_json_schema()
{ 'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}, 'required': ['a', 'b'], 'title': 'Add', 'type': 'object' }
Thus, if we can somehow create a dictionary of our function parameters, then we can pass that using the **kwargs trick and then get the JSON schema directly.
from pydantic import create_model

a = create_model("Add", **{"a": (int, ...), "b": (int, ...)})
a.model_json_schema()
{ 'properties': {'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}, 'required': ['a', 'b'], 'title': 'Add', 'type': 'object' }
Below, we see a function that uses this concept to convert a function into JSON Schema directly.
We use inspect.signature as before to get all the function parameters and then prepare a Pydantic model directly from it.
import inspect
from pydantic import create_model

def add(a: int, b: int) -> int:
    """Adds two integers together"""
    return a + b

def schema(f):
    kws = {
        name: (
            # Get the type annotation
            parameter.annotation,
            # Check if parameter is required or optional
            ... if parameter.default == inspect._empty else parameter.default,
        )
        for name, parameter in inspect.signature(f).parameters.items()
    }
    # Pass the function name and parameters to get a pydantic model
    p = create_model(f"`{f.__name__}`", **kws)

    # Convert to JSON Schema
    schema = p.model_json_schema()
    return {
        "type": "function",
        "function": {
            "name": f.__name__,
            "description": f.__doc__,
            "parameters": schema,
        },
    }
schema(add)
{ 'type': 'function', 'function': { 'name': 'add', 'description': 'Adds two integers together', 'parameters': { 'properties': { 'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'} }, 'required': ['a', 'b'], 'title': '`add`', 'type': 'object' } } }
Approach 3: Decorators
To make them easier to use, most agent libraries wrap approaches like the ones above as decorators (e.g. smolagents).
For example, we can make a decorator called tool, which, when applied to a function, adds a json_schema method to that function.
def tool(func):
    func.json_schema = lambda: function_to_json(func)
    return func
We can apply the decorator now
@tool
def add(a: int, b: int) -> int:
    """Adds two numbers"""
    return a + b
We can then call the json_schema method to get the schema directly and pass it downstream to the LLM API.
add.json_schema()
{ 'type': 'function', 'function': { 'name': 'add', 'description': 'Adds two numbers', 'parameters': { 'type': 'object', 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'] } } }
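A decorator is also a natural place to keep track of every tool that has been defined. The sketch below extends the idea with a hypothetical module-level registry (`TOOLS`) so that the full tool list and the name-based dispatch both come from one place. The inline schema builder is a simplified stand-in so the example runs on its own, and `shout` is a made-up second tool.

```python
import inspect

TOOLS = {}  # hypothetical registry: tool name -> decorated function
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(func):
    """Attach a json_schema() method and record the function in TOOLS."""

    def json_schema() -> dict:
        params = inspect.signature(func).parameters
        return {
            "type": "function",
            "function": {
                "name": func.__name__,
                "description": inspect.getdoc(func) or "",
                "parameters": {
                    "type": "object",
                    "properties": {
                        n: {"type": TYPE_MAP.get(p.annotation, "string")}
                        for n, p in params.items()
                    },
                    "required": [
                        n
                        for n, p in params.items()
                        if p.default is inspect.Parameter.empty
                    ],
                },
            },
        }

    func.json_schema = json_schema
    TOOLS[func.__name__] = func
    return func


@tool
def add(a: int, b: int) -> int:
    """Adds two numbers"""
    return a + b


@tool
def shout(text: str) -> str:
    """Uppercases a string"""
    return text.upper()


# The list that would be passed as tools=... in the API call
tools = [f.json_schema() for f in TOOLS.values()]
print([t["function"]["name"] for t in tools])

# Dispatch by the name the model returns
print(TOOLS["add"](a=2, b=3))
```

Because the decorator returns the original function unchanged, `add(2, 3)` still works as a normal Python call.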
Conclusion
In this post, we saw how Python’s runtime introspection enables automatic conversion of function definitions into JSON Schema.