Chapter 6: Tool Calling, Function Binding, and Custom Execution Loops

A common misconception is that LLMs can execute Python code, query databases, or browse the web directly. They cannot. An LLM is simply a text-prediction engine.

However, modern chat models support tool calling. When you bind tools to a model, you provide it with schemas (names, descriptions, parameters) of functions in your local codebase. If the model determines it needs external information, it stops generating prose and instead outputs a structured request naming the tool and its arguments. Your local code executes the function and sends the result back as a ToolMessage, and the model then uses that result to compose the final answer.
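
To make this exchange concrete, the sketch below mimics the three pieces of data involved using plain Python dictionaries: a schema describing a tool, a structured request of the kind a model might emit, and the local dispatch that runs the function. The dictionary layouts, the `get_weather` function, and the `registry` name are hypothetical simplifications for illustration; real providers and LangChain use richer formats.

```python
import json

# Hypothetical local function the model may ask us to run.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Schema sent to the model: it sees only this description, never the code.
weather_schema = {
    "name": "get_weather",
    "description": "Returns the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Structured request a model might emit instead of prose (simplified shape).
tool_request = {"name": "get_weather", "args": {"city": "Lisbon"}, "id": "call_1"}

# Our code, not the model, looks up and executes the function.
registry = {"get_weather": get_weather}
result = registry[tool_request["name"]](**tool_request["args"])

# The result goes back to the model tagged with the id of the request.
tool_reply = {"tool_call_id": tool_request["id"], "content": result}
print(json.dumps(tool_reply))
```

The id field is what lets the model match each result to the request that produced it when several tools are called in one turn.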

While frameworks like LangGraph automate this process, understanding how to handle tool calling manually is crucial for professional developers. In this chapter, we will build a complete, multi-turn tool execution loop from scratch using pure LCEL and Python.

6.1 Defining Tools in LangChain

In LangChain, there are two primary ways to declare a tool:

The @tool decorator: The fastest way to turn any Python function into a tool.
Subclassing BaseTool: Used for complex tools requiring custom initialization, configuration, or state.

Method 1: The `@tool` Decorator

The @tool decorator uses your function's signature, docstring, and type hints to generate the JSON Schema that is sent to the LLM.

Important: The docstring and type hints are not optional extras. The docstring becomes the tool description that the model reads to decide when to call the tool, and the type hints become the parameter types in the schema. A vague docstring makes it more likely that the model will skip the tool or call it with the wrong arguments.

from langchain_core.tools import tool

@tool
def calculate_compound_interest(
    principal: float, 
    rate: float, 
    years: int
) -> float:
    """
    Calculates the final amount of an investment with compounded interest.
    Rate must be a decimal fraction (e.g. 0.05 for 5%).
    Use this tool when users ask about investments, savings growth, or future values.
    """
    return principal * ((1 + rate) ** years)

Method 2: Subclassing `BaseTool`

For advanced tool structures (such as needing a database pool or API credentials injected at runtime), you subclass BaseTool and define a Pydantic argument schema:

from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Type

# 1. Define the input arguments schema
class QueryDatabaseInput(BaseModel):
    user_id: int = Field(description="The unique ID of the user.")
    field: str = Field(description="The profile field to query (e.g., 'email', 'status').")

# 2. Define the Tool Class
class UserDatabaseTool(BaseTool):
    name: str = "query_user_db"
    description: str = "Accesses the internal user database to retrieve profile fields."
    args_schema: Type[BaseModel] = QueryDatabaseInput
    
    # We can inject dependencies like database connections here during instantiation
    db_connection: str = "sqlite:///users.db" 

    def _run(self, user_id: int, field: str) -> str:
        # Mock database lookup logic
        return f"Database result for user {user_id} ({field}): Active"

6.2 Binding Tools to the LLM

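Once tools are defined, we attach them to the LLM using the .bind_tools() method. It returns a new model object that sends the tool schemas along with every request; the original model is left unchanged. Binding only tells the LLM which tools it may request. Executing them remains our code's job.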

from langchain_ollama import ChatOllama

# 1. Instantiate the model
model = ChatOllama(model="llama3.2", temperature=0.0)

# 2. Declare the tools list
tools_list = [calculate_compound_interest, UserDatabaseTool()]

# 3. Bind them to the model
model_with_tools = model.bind_tools(tools_list)

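If we call model_with_tools.invoke("What is the value of $10,000 at 5% interest after 10 years?"), a tool-capable model will typically not answer in conversational text. Instead it returns an AIMessage whose content is usually empty and whose tool_calls attribute holds the structured request. The model may still answer directly if it decides no tool is needed, so calling code should always check tool_calls before acting on it.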

6.3 Deconstructing the Tool Execution Loop

To build an agent loop manually, we must implement a control cycle:

Send Messages to LLM: Send the user prompt to model_with_tools.
Check for Tool Calls: If the returned AIMessage contains tool_calls, proceed. If not, return the response.
Execute Tools: Loop through each tool request, run the corresponding Python function, and wrap the output in a ToolMessage.
Resubmit to LLM: Append the AIMessage (containing the tool request) and all ToolMessages (containing the results) to the message list, and resubmit them to the LLM.
Loop: Repeat until the model generates a standard text response.

                  ┌────────────────────────┐
                  │   User Query / Input   │
                  └───────────┬────────────┘
                              │
                    ┌─────────▼─────────┐
                    │  Invoke LLM with  │◄─────────────────┐
                    │    Message List   │                  │
                    └─────────┬─────────┘                  │
                              │                            │
                     [Has Tool Calls?]                     │
                    /                 \                    │
                  Yes                  No                  │
                  /                     \                  │
        ┌────────▼────────┐     ┌────────▼────────┐        │
        │ Run Python Tool │     │ Return final    │        │
        │  Functions      │     │ AIMessage text  │        │
        └────────┬────────┘     └─────────────────┘        │
                 │                                         │
        ┌────────▼────────┐                                │
        │ Create and Add  │                                │
        │ ToolMessages    ├────────────────────────────────┘
        └─────────────────┘
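
Before wiring in a real model, the cycle above can be exercised with a scripted stand-in. The sketch below uses plain dictionaries instead of LangChain message classes; `scripted_model`, `add`, `TOOLS`, and `run_loop` are hypothetical names invented for this illustration.

```python
from typing import Callable

def add(a: int, b: int) -> int:
    """Toy tool used only for this sketch."""
    return a + b

TOOLS: dict[str, Callable] = {"add": add}

def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: requests a tool once, then answers with its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "ai", "content": f"The sum is {last['content']}.", "tool_calls": []}
    return {
        "role": "ai",
        "content": "",
        "tool_calls": [{"name": "add", "args": {"a": 2, "b": 3}, "id": "call_1"}],
    }

def run_loop(query: str, max_iterations: int = 5) -> str:
    messages = [{"role": "human", "content": query}]
    for _ in range(max_iterations):
        response = scripted_model(messages)   # invoke the model with the history
        messages.append(response)
        if not response["tool_calls"]:        # no tool calls: this is the answer
            return response["content"]
        for call in response["tool_calls"]:   # execute every requested tool
            result = TOOLS[call["name"]](**call["args"])
            # Record the result, tagged with the id of the request it answers.
            messages.append(
                {"role": "tool", "content": str(result), "tool_call_id": call["id"]}
            )
    raise RuntimeError("Loop did not finish within max_iterations.")

print(run_loop("What is 2 + 3?"))  # prints: The sum is 5.
```

The iteration cap matters even in a toy version: without it, a model that keeps requesting tools would loop forever.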

6.4 Hands-on Example: Pure-LCEL Multi-Turn Agent Loop

Let's build a functional CLI application demonstrating this loop in action. We'll give the LLM two tools: a system time reader and a compound interest calculator.

import asyncio
import datetime
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_core.tools import tool
from langchain_ollama import ChatOllama

# --- 1. Define Tools ---
@tool
def get_system_time() -> str:
    """Returns the current date and system time."""
    return datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

@tool
def calculate_compound_interest(principal: float, rate: float, years: int) -> float:
    """
    Calculates compound interest. Rate should be a decimal (e.g. 0.05 for 5%).
    Use this for savings calculations.
    """
    return principal * ((1 + rate) ** years)

# List and map tools for quick lookup by name
tools = [get_system_time, calculate_compound_interest]
# (loop variable is named t so it does not read like the imported @tool decorator)
tool_map = {t.name: t for t in tools}

# --- 2. The Custom Execution Loop ---
async def run_agent_loop(user_query: str, model_with_tools) -> str:
    # Initialize message history
    message_history = [HumanMessage(content=user_query)]
    
    max_iterations = 5
    iteration = 0
    
    while iteration < max_iterations:
        print(f"\n--- Iteration {iteration + 1} ---")
        print("Model is thinking...")
        
        # 1. Invoke the LLM with the complete history
        response = await model_with_tools.ainvoke(message_history)
        
        # Append the model's response to the running history
        message_history.append(response)
        
        # 2. Check if the model requested any tool calls
        if not response.tool_calls:
            print("Model finished execution.")
            return response.content
            
        print(f"Model requested {len(response.tool_calls)} tool calls:")
        
        # 3. Process each tool call request
        for tool_call in response.tool_calls:
            tool_name = tool_call["name"]
            tool_args = tool_call["args"]
            tool_id = tool_call["id"]
            
            print(f" -> Executing Tool '{tool_name}' with args {tool_args}")
            
            # Fetch and run the tool
            target_tool = tool_map.get(tool_name)
            if not target_tool:
                tool_result = f"Error: Tool '{tool_name}' not found."
            else:
                try:
                    # Use ainvoke so sync tools do not block the event loop
                    # and tools defined with async def are awaited properly
                    tool_result = await target_tool.ainvoke(tool_args)
                except Exception as e:
                    tool_result = f"Error executing tool: {e}"
            
            # 4. Format the result as a ToolMessage
            tool_message = ToolMessage(
                content=str(tool_result),
                name=tool_name,
                tool_call_id=tool_id
            )
            
            # Append tool result to history
            message_history.append(tool_message)
            
        iteration += 1
        
    raise TimeoutError("Agent execution exceeded maximum permitted loop iterations.")

# --- 3. Run the Program ---
async def main():
    # Setup Ollama model with tools support
    llm = ChatOllama(model="llama3.2", temperature=0.0)
    model_with_tools = llm.bind_tools(tools)
    
    query = (
        "Hello! Please tell me what time it is on the system. "
        "Also, if I invest $5,000 at a 6% interest rate for 12 years, what will it be worth?"
    )
    
    print(f"Starting execution for query: '{query}'")
    try:
        final_answer = await run_agent_loop(query, model_with_tools)
        print("\n========================================")
        print("FINAL AGENT ANSWER:")
        print("========================================")
        print(final_answer)
    except Exception as e:
        print(f"Agent failed: {e}")

if __name__ == "__main__":
    asyncio.run(main())

6.5 Why Understand the Manual Loop?

Writing your own tool loop reveals exactly how data flows.

State Visibility: You can inspect every message in the history.
Explicit Control: You can place security gates (e.g. asking the user before executing a specific tool) inside the `for tool_call in response.tool_calls:` loop.
Low Overhead: You do not have to install or learn LangGraph's state graph compilation syntax if you only need a basic tool-calling pipeline.
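
One way such a security gate might look is sketched below. `SENSITIVE_TOOLS`, `gate_tool_call`, and `ask_on_console` are hypothetical helpers written for this illustration and are not part of LangChain.

```python
from typing import Callable, Optional

# Hypothetical set of tool names that need human sign-off before running.
SENSITIVE_TOOLS = {"query_user_db"}

def gate_tool_call(tool_call: dict, approve: Callable[[dict], bool]) -> Optional[str]:
    """Return None if the call may proceed, or a refusal string to send back."""
    if tool_call["name"] not in SENSITIVE_TOOLS:
        return None
    if approve(tool_call):
        return None
    return f"Error: user declined to run tool '{tool_call['name']}'."

def ask_on_console(tool_call: dict) -> bool:
    """Interactive approver for a CLI; swap in any other policy as needed."""
    answer = input(f"Run {tool_call['name']} with {tool_call['args']}? [y/N] ")
    return answer.strip().lower() == "y"

# Demonstration with non-interactive approvers so the sketch runs unattended.
call = {"name": "query_user_db", "args": {"user_id": 7, "field": "email"}, "id": "call_1"}
print(gate_tool_call(call, approve=lambda c: False))
print(gate_tool_call(call, approve=lambda c: True))
```

Inside the execution loop, a non-None return value would be used as the tool result instead of invoking the tool, so the model still receives a ToolMessage for every request and can explain the refusal to the user.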

6.6 Summary

You now understand:

How to define tools using decorators or subclasses.
How to bind tools to models using .bind_tools().
How tool_calls are requested via AIMessage and returned via ToolMessage.
How to build a complete multi-turn tool execution loop from scratch.

In the next chapter, Conversational Memory: State Management with RunnableWithMessageHistory, we will learn how to wrap our chains so that message history is loaded from and saved to a database automatically.