
Chapter 4: Dynamic Chain Routing & Selection

In production applications, user inputs are highly unpredictable. If a user asks a general question like "What is your refund policy?", you want to route their query to a static retrieval chain. If they ask a complex logic question like "Help me debug this Python script", you want to route it to a high-end coding model. If they say "Hi", you want a lightweight conversational model to reply instantly.

Trying to solve this with a single monolithic prompt is slow, expensive, and error-prone. Instead, you need Dynamic Chain Routing.

In this chapter, we will learn how to inspect inputs at runtime and guide them down different execution paths. We will study why functional Python routing is superior to RunnableBranch and build a complete intent-based customer support router.


4.1 Functional Routing vs. RunnableBranch

LangChain historically provided a utility called RunnableBranch to handle conditional execution. While it still works, the recommended approach today is to use a plain Python function (wrapped in RunnableLambda) that returns another Runnable.

When a function wrapped in RunnableLambda returns a Runnable, LangChain does not just pass that object along; it automatically invokes the returned runnable with the same input the function received.

Here is the comparison:

The Old Way: RunnableBranch

from langchain_core.runnables import RunnableBranch

# Verbose, hard to debug, uses complex tuple matching syntax
branch = RunnableBranch(
    (lambda x: x["topic"] == "billing", billing_chain),
    (lambda x: x["topic"] == "technical", tech_chain),
    general_chain
)

The Modern Way: Functional Routing (Using RunnableLambda)

from langchain_core.runnables import RunnableLambda

def select_chain(info: dict):
    """
    Decides at runtime which runnable to return and execute.
    """
    topic = info["topic"].lower()
    if "billing" in topic:
        return billing_chain
    elif "technical" in topic:
        return tech_chain
    else:
        return general_chain

# Piped seamlessly inside LCEL
routing_chain = classifier | RunnableLambda(select_chain)

Why the functional approach is superior:

  • Readability: It uses standard Python if/else statements.
  • Traceability: You can place standard Python breakpoints (breakpoint()) or logs directly inside the decision function.
  • Flexibility: You can run arbitrary preprocessing, database checks, or external API checks within the selection function before choosing the next chain.

4.2 Dynamic Model Selection & Configuration

Dynamic selection is not limited to routing between different chains; you can also configure LLM instances or prompts dynamically.

For instance, you might want to run a cheap, fast model (like a local llama3.2 via Ollama) to classify user intent, and then swap to a heavier model (like OpenAI's gpt-4o or Google's gemini-1.5-pro) only if the input requires complex mathematical reasoning.

You can achieve this by declaring configurable alternatives on a model and selecting one at call time with the .with_config() method, or by wrapping model instantiation within a RunnableLambda.
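The selection step itself can be an ordinary function. The sketch below is one possible shape, not a LangChain API: the tier names, the "intent" key, the intent labels, and the temperatures are all assumptions made up for illustration. Only the model names come from the text above.

```python
# Hypothetical settings table: tier names and temperatures are illustrative.
MODEL_TIERS = {
    "light": {"provider": "ollama", "model": "llama3.2", "temperature": 0.0},
    "heavy": {"provider": "openai", "model": "gpt-4o", "temperature": 0.2},
}

# Hypothetical intent labels that justify paying for the heavier model.
HEAVY_INTENTS = {"math", "reasoning"}

def select_model_settings(info: dict) -> dict:
    """Pick the heavy tier only when the classifier flagged a complex input."""
    intent = info.get("intent", "").strip().lower()
    tier = "heavy" if intent in HEAVY_INTENTS else "light"
    return MODEL_TIERS[tier]

light = select_model_settings({"intent": "greeting", "question": "Hi"})
heavy = select_model_settings({"intent": " Math ", "question": "Integrate x^2 from 0 to 3."})
print(light["model"], heavy["model"])
```

Inside a RunnableLambda, the returned settings could be used to construct the chat model and return a prompt-plus-model chain, which LangChain would then execute like any other returned runnable.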


4.3 Hands-on Example: The Intent-Based Support Router

Let's build an automated customer support triage system. It will:

  1. Accept a user's support ticket.
  2. Run a fast, lightweight classification chain to determine the category (Billing, Technical, or General).
  3. Route the ticket to the appropriate sub-chain based on the classification.
  4. Execute the chosen sub-chain.

import asyncio
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_ollama import ChatOllama

# --- 1. Sub-Chains Definition ---
# Let's assume we have different models/prompts for different tasks.

# Fast local model for classification
classifier_model = ChatOllama(model="llama3.2", temperature=0.0)

# Specialized models or settings for sub-chains
billing_model = ChatOllama(model="llama3.2", temperature=0.1)
tech_model = ChatOllama(model="llama3.2", temperature=0.2) # or OpenAI/Groq for advanced coding
general_model = ChatOllama(model="llama3.2", temperature=0.5)

# Sub-Chain A: Billing
billing_chain = (
    ChatPromptTemplate.from_template(
        "You are a billing support agent. Help the customer with their billing question. "
        "Provide clear steps and state that billing changes take 2-3 business days.\n\n"
        "Customer Question: {question}"
    )
    | billing_model
    | StrOutputParser()
)

# Sub-Chain B: Technical Support
tech_chain = (
    ChatPromptTemplate.from_template(
        "You are a Senior Technical Support Engineer. Diagnose and guide the user through "
        "solving their bug or setup issue. Provide concise troubleshooting steps.\n\n"
        "Customer Question: {question}"
    )
    | tech_model
    | StrOutputParser()
)

# Sub-Chain C: General/FAQ
general_chain = (
    ChatPromptTemplate.from_template(
        "You are a helpful customer concierge. Answer general questions politely.\n\n"
        "Customer Question: {question}"
    )
    | general_model
    | StrOutputParser()
)

# --- 2. Classification Logic ---
class_prompt = ChatPromptTemplate.from_template(
    "Classify the following customer ticket into exactly one of these three categories: "
    "'BILLING', 'TECHNICAL', or 'GENERAL'. Return ONLY the word. Do not explain.\n\n"
    "Ticket: {question}"
)

# Classifier chain output will be a string like "BILLING"
classifier_chain = class_prompt | classifier_model | StrOutputParser()

# --- 3. Functional Router ---
def route_ticket(info: dict):
    """
    Looks at the classification result and returns the appropriate sub-chain.
    """
    category = info["category"].strip().upper()
    print(f"-> Classification Decided: {category}")
    
    if "BILLING" in category:
        return billing_chain
    elif "TECHNICAL" in category:
        return tech_chain
    else:
        return general_chain

# --- 4. Master Routing Chain ---
# - We use RunnablePassthrough to pass the question through
# - We assign the "category" key by running the classifier_chain
# - We pass the dictionary containing both keys to our router lambda
master_routing_chain = (
    RunnablePassthrough.assign(
        category=classifier_chain
    )
    | RunnableLambda(route_ticket)
)

# --- 5. Async Execution Run ---
async def test_router():
    tickets = [
        "I was charged twice on my credit card for this month's subscription. Please refund me.",
        "My database connection is throwing a ConnectionRefusedError: [Errno 111] in python.",
        "What are your business operating hours during the holidays?"
    ]
    
    for idx, ticket in enumerate(tickets):
        print(f"\n[Ticket #{idx+1}]: {ticket}")
        response = await master_routing_chain.ainvoke({"question": ticket})
        print(f"Response:\n{response}")
        print("="*60)

if __name__ == "__main__":
    asyncio.run(test_router())
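The router above depends on the classifier returning a clean label. A model can add punctuation or extra words despite the instruction, so it may help to normalize the raw text before matching. The helper below is a hypothetical addition (normalize_category and VALID_CATEGORIES are not part of the example above); it returns the first whole-word label it finds and falls back to GENERAL.

```python
import re

VALID_CATEGORIES = ("BILLING", "TECHNICAL", "GENERAL")

def normalize_category(raw: str) -> str:
    """Return the first valid label that appears as a whole word, else 'GENERAL'."""
    # Split the upper-cased text into alphabetic words, dropping punctuation.
    words = re.findall(r"[A-Z]+", raw.upper())
    for word in words:
        if word in VALID_CATEGORIES:
            return word
    return "GENERAL"

print(normalize_category("billing."))
print(normalize_category("Category: Technical\n"))
print(normalize_category("I am not sure."))
```

Calling this inside route_ticket, in place of the bare .strip().upper(), would keep the if/elif checks working on exact labels rather than substrings.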

4.4 Flow Analysis

Let's trace how the data flows when you invoke master_routing_chain:

  1. Input Entry: You pass {"question": "I was charged twice..."}.
  2. Assignment Step: RunnablePassthrough.assign() is triggered.
    • It clones the input dictionary.
    • It runs classifier_chain on {"question": "I was charged twice..."} (through ainvoke in the async example, invoke in a synchronous call).
    • The classification chain queries the LLM, which returns "BILLING".
  3. Resulting Schema: The dictionary becomes {"question": "I was charged twice...", "category": "BILLING"}.
  4. Lambda Route: This dictionary is passed to RunnableLambda(route_ticket).
  5. Chain Return & Execution: The route_ticket function matches "BILLING" and returns the billing_chain object. LangChain automatically invokes the returned chain with the same dictionary that route_ticket received, which now contains both question and category.
  6. Final Prompt Output: The billing_chain formats the prompt using the {question} field, queries the billing model, parses it to a string, and outputs the final message.

This demonstrates the power of LCEL: it passes the dictionary from step to step and invokes the returned chain for you, so the routing logic stays in one small function.


4.5 Summary

You now understand:

  1. How functional routing (using RunnableLambda) makes code simpler and cleaner than RunnableBranch.
  2. How returning a Runnable from a function inside a pipeline causes LangChain to execute that runnable automatically.
  3. How to build a complete multi-model, multi-prompt routing classifier.

In the next chapter, we will address output constraints: Structured Output Generation & JSON Parsing. We will study the .with_structured_output() syntax and learn how to compel models to return strictly typed Pydantic objects.

    Chapter 4: Dynamic Chain Routing & Selection — Mastering LangChain: From Basics to Stateful Agents | Krishna Tiwari