Chapter 5: Structured Output Generation & JSON Parsing

When building production software, you rarely want the LLM to reply with conversational text. If you are building a dashboard, a database pipeline, or a web frontend, you need the model to output structured data (like JSON) that matches a specific database or API schema.

If you simply prompt an LLM with "Return the result in JSON format", it will often add a conversational prefix such as "Here is your JSON:", wrap the JSON in a markdown code block (```json ... ```), or omit required fields.
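
To make this failure mode concrete, here is a minimal sketch that feeds a chatty reply straight into `json.loads`. The reply text is an invented example rather than captured model output, and `try_parse` is a hypothetical helper used only for illustration.

```python
import json

# The fence marker is built programmatically so the listing is easy to embed.
FENCE = "`" * 3

# Invented example of a chatty reply; not captured from a real model.
raw_reply = (
    "Here is your JSON:\n"
    + FENCE + "json\n"
    + '{"title": "Antigravity", "version": "2.4.1"}\n'
    + FENCE
)


def try_parse(text: str):
    """Return the parsed object, or None if the text is not pure JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# The prefix text and the code fence make the whole reply invalid JSON.
naive_result = try_parse(raw_reply)
print(naive_result)

# Only the bare object parses cleanly.
clean_result = try_parse('{"title": "Antigravity", "version": "2.4.1"}')
print(clean_result)
```

Anything that consumes the reply directly, such as a database insert or an API response, would fail on the first case, which is why the rest of the chapter focuses on enforcing or repairing the format.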

Modern LangChain provides two robust paradigms to enforce structured outputs:

.with_structured_output() (Native API Integration): The recommended standard for models supporting native function/tool calling.
JsonOutputParser (Text-Based Instruction & Parsing): The fallback standard for smaller, local, or legacy models that do not support native tool calling.

In this chapter, we will master both techniques and build a strict document metadata extraction pipeline.

5.1 Native Structuring: `.with_structured_output()`

The modern standard to enforce structured output is the .with_structured_output() method. It leverages the model's native API capabilities (such as OpenAI's JSON Mode or Tool Calling APIs).

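Instead of relying only on parsing free text after generation, this method passes the schema to the provider through that native API and converts the structured response into your schema type. When the provider enforces the schema during decoding (for example, in a strict structured-output mode), the output matches the schema very reliably; with plain tool calling, a reply can still occasionally fail validation, so keep error handling in place.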

Step 1: Define the Schema using Pydantic

We define our target schema using Pydantic, Python's premier data validation library.

from pydantic import BaseModel, Field
from typing import List, Optional

class ContactInfo(BaseModel):
    name: str = Field(description="The full name of the contact person.")
    email: str = Field(description="The email address of the contact.")
    phone: Optional[str] = Field(None, description="The phone number, if provided.")

class TechnicalSpecification(BaseModel):
    title: str = Field(description="The name of the software or system.")
    version: str = Field(description="The version number.")
    tags: List[str] = Field(description="A list of technical tags or keywords.")
    contacts: List[ContactInfo] = Field(description="List of contacts identified.")
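
It can help to look at what a schema like this turns into before it reaches the model. The sketch below assumes Pydantic v2 (where `model_json_schema()` is available) and uses a trimmed copy of the classes above to keep the output short; it prints the JSON Schema, including the field descriptions.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, Field


# Trimmed copies of the chapter's classes (fewer fields, same idea).
class ContactInfo(BaseModel):
    name: str = Field(description="The full name of the contact person.")
    email: str = Field(description="The email address of the contact.")
    phone: Optional[str] = Field(None, description="The phone number, if provided.")


class TechnicalSpecification(BaseModel):
    title: str = Field(description="The name of the software or system.")
    contacts: List[ContactInfo] = Field(description="List of contacts identified.")


# Assumes Pydantic v2; v1 used a different method name for this.
schema = TechnicalSpecification.model_json_schema()
print(json.dumps(schema, indent=2))
```

Note that `phone` does not appear among the required fields of `ContactInfo` because it has a default of `None`, which signals to the model that the field may be left out.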

Step 2: Bind the Schema to the Model

We call .with_structured_output() on our chat model instance, passing the Pydantic class:

from langchain_openai import ChatOpenAI

# 1. Initialize the model
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0)

# 2. Bind the schema
structured_llm = llm.with_structured_output(TechnicalSpecification)

# 3. Invoke directly
result = structured_llm.invoke("Contact John Doe ([email protected]) for Antigravity system v2.4.1. Core tags: AI, agent.")

# The result is NOT a string or AIMessage; it is an instance of TechnicalSpecification!
print(type(result))          # <class '__main__.TechnicalSpecification'>
print(result.contacts[0].name) # "John Doe"
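
Because the result is a Pydantic object, it can be serialized for a database or an API without any manual conversion. The sketch below assumes Pydantic v2 and uses a hand-built instance as a stand-in for what `structured_llm.invoke(...)` would return, so it runs without an API key; the email address is an invented placeholder.

```python
from typing import List

from pydantic import BaseModel


class ContactInfo(BaseModel):
    name: str
    email: str


class TechnicalSpecification(BaseModel):
    title: str
    version: str
    tags: List[str]
    contacts: List[ContactInfo]


# Hand-built stand-in for the object a structured model call would return.
result = TechnicalSpecification(
    title="Antigravity",
    version="2.4.1",
    tags=["AI", "agent"],
    contacts=[ContactInfo(name="John Doe", email="john.doe@example.com")],
)

as_dict = result.model_dump()        # nested plain dicts and lists
as_json = result.model_dump_json()   # JSON string, e.g. for an HTTP response

# Round trip: parse the JSON string back into a validated object.
restored = TechnicalSpecification.model_validate_json(as_json)
print(as_dict["contacts"][0]["name"])
print(restored == result)
```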

5.2 Text-Based Fallback: `JsonOutputParser`

Some models, particularly small local models such as llama3.2:1b run via Ollama, either do not support structured output APIs natively or support them unreliably. For these, we use a text-based parser: JsonOutputParser.

This parser does two things:

Injects formatting instructions into your prompt automatically using parser.get_format_instructions().
Parses the raw text response from the model, stripping markdown wrapper code blocks and converting it into a standard Python dictionary.
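
The parsing half of that job can be approximated in a few lines of standard-library Python. This is a simplified stand-in written for illustration, not LangChain's actual implementation; `parse_json_reply` is a hypothetical helper and the reply text is invented.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker, built programmatically

# Matches an optional "json" language tag and captures the fenced payload.
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_json_reply(text: str) -> dict:
    """Strip a markdown code fence if present, then parse the JSON payload."""
    match = _FENCED.search(text)
    payload = match.group(1) if match else text.strip()
    return json.loads(payload)


reply = (
    "Here is your JSON:\n"
    + FENCE + "json\n"
    + '{"title": "Antigravity", "tags": ["AI", "agent"]}\n'
    + FENCE
)

parsed = parse_json_reply(reply)
print(parsed)
```

This sketch still raises `json.JSONDecodeError` when a reply has surrounding prose but no fence, which is one reason to prefer the library parser over a hand-rolled one.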

Here is how to set up JsonOutputParser:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_ollama import ChatOllama

# 1. Setup local model
local_llm = ChatOllama(model="llama3.2", temperature=0.0)

# 2. Create the parser with the schema
parser = JsonOutputParser(pydantic_object=TechnicalSpecification)

# 3. Inject formatting instructions into prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract the requested information from the text.\n{format_instructions}"),
    ("human", "{text}")
])

# Use the .partial() method to inject formatting rules automatically
prompt_with_instructions = prompt.partial(format_instructions=parser.get_format_instructions())

# 4. Chain the components using LCEL
extraction_chain = prompt_with_instructions | local_llm | parser

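# Invoke returns a plain Python dict. JsonOutputParser parses the JSON but does not
# validate it against the schema; the Pydantic class only shapes the format instructions.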
result_dict = extraction_chain.invoke({
    "text": "Contact John Doe ([email protected]) for Antigravity system v2.4.1. Core tags: AI, agent."
})
print(type(result_dict)) # <class 'dict'>
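
Since the parser hands back a plain dict, a missing or mistyped field will go unnoticed unless we check it ourselves. One way to do that, assuming Pydantic v2, is to pass the dict through `model_validate`. The sketch below uses hand-written dicts in place of real model output, and `validate_result` is a hypothetical helper.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None


class TechnicalSpecification(BaseModel):
    title: str
    version: str
    tags: List[str]
    contacts: List[ContactInfo]


def validate_result(data: dict) -> Optional[TechnicalSpecification]:
    """Return a validated object, or None if the dict does not fit the schema."""
    try:
        return TechnicalSpecification.model_validate(data)
    except ValidationError as exc:
        print(f"Schema mismatch: {exc.error_count()} error(s)")
        return None


# Hand-written stand-ins for parser output (the email is an invented placeholder).
good = {
    "title": "Antigravity",
    "version": "2.4.1",
    "tags": ["AI", "agent"],
    "contacts": [{"name": "John Doe", "email": "john.doe@example.com"}],
}
bad = {"title": "Antigravity", "tags": "AI"}  # missing fields, tags is not a list

validated = validate_result(good)
rejected = validate_result(bad)
```

If you want parsing and validation in one step, LangChain also ships a `PydanticOutputParser` in `langchain_core.output_parsers` that returns a model instance rather than a dict.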

5.3 Hands-on Example: Document Metadata Extractor

Let's build a production-grade asynchronous pipeline that extracts medical summaries from clinical intake notes. We want to extract:

Patient's age and gender.
A list of symptoms.
Severity level (Enum: LOW, MEDIUM, HIGH).

We will write it so that it handles parsing errors gracefully.

import asyncio
from typing import List
from enum import Enum
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
# Alternately: from langchain_openai import ChatOpenAI

# 1. Define Pydantic Schema
class SeverityEnum(str, Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"

class PatientIntakeSummary(BaseModel):
    age: int = Field(description="The age of the patient in years.")
    gender: str = Field(description="The gender of the patient.")
    symptoms: List[str] = Field(description="List of primary medical symptoms mentioned.")
    severity: SeverityEnum = Field(description="Calculated triage severity level.")
    notes: str = Field(description="Additional clinical notes or context.")

async def main():
    # 2. Initialize Model
    # llama3.2 has basic tool-calling support, so we try native structured output first
    # and fall back to JsonOutputParser if the integration does not implement it.
    llm = ChatOllama(model="llama3.2", temperature=0.0)

    # 3-4. Build the extraction chain (native path first, parser fallback otherwise)
    try:
        # Native path: the structured model returns a PatientIntakeSummary instance
        structured_llm = llm.with_structured_output(PatientIntakeSummary)
        prompt = ChatPromptTemplate.from_messages([
            ("system", "You are an automated medical triage assistant. Analyze the text and extract the requested fields."),
            ("human", "{text}")
        ])
        extraction_chain = prompt | structured_llm
    except NotImplementedError:
        # Fallback path: the prompt carries the format instructions and the parser returns a dict
        print("Model does not support native structured outputs. Using JsonOutputParser fallback.")
        from langchain_core.output_parsers import JsonOutputParser
        parser = JsonOutputParser(pydantic_object=PatientIntakeSummary)
        prompt = ChatPromptTemplate.from_messages([
            ("system", "Extract patient triage information from the report.\n{format_instructions}"),
            ("human", "{text}")
        ]).partial(format_instructions=parser.get_format_instructions())
        extraction_chain = prompt | llm | parser

    # 5. Run the Extraction
    medical_report = """
    Patient is a 45-year-old male presenting with severe chest tightness, shortness of breath, 
    and mild sweating. Symptoms started approximately 2 hours ago. He rates the pain as 8 out of 10. 
    History of high blood pressure. Triage immediately.
    """
    
    print("Extracting structured patient record...")
    try:
        record = await extraction_chain.ainvoke({"text": medical_report})
        
        print("\n--- EXTRACTED DATABASE RECORD ---")
        if isinstance(record, PatientIntakeSummary):
            # Print attributes directly (Pydantic object)
            print(f"Age: {record.age}")
            print(f"Gender: {record.gender}")
            print(f"Symptoms: {record.symptoms}")
            print(f"Severity Level: {record.severity.value}")
            print(f"Notes: {record.notes}")
        else:
            # Print keys (dict from fallback parser)
            print(record)
            
    except Exception as e:
        # This handler catches every exception type, so report which one occurred
        print(f"Extraction failed: {type(e).__name__}: {e}")

if __name__ == "__main__":
    asyncio.run(main())
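
The pipeline above reports a failed extraction and stops. Since a malformed reply is often transient, one common way to handle it is to retry the call a few times before giving up. The sketch below is a generic asyncio helper, not part of LangChain; `ainvoke_with_retry` and `flaky_extract` are hypothetical names, and catching `ValueError` is an assumption that you should adjust to the exception types your chain actually raises.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def ainvoke_with_retry(
    call: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    delay: float = 0.0,
) -> Any:
    """Await call() up to `attempts` times, retrying on ValueError."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except ValueError as exc:  # assumed stand-in for parsing/validation errors
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise RuntimeError(f"Extraction failed after {attempts} attempts") from last_error


# Fake extractor that fails once and then succeeds, to exercise the helper.
calls = {"count": 0}


async def flaky_extract() -> dict:
    calls["count"] += 1
    if calls["count"] < 2:
        raise ValueError("malformed JSON")
    return {"age": 45, "severity": "HIGH"}


record = asyncio.run(ainvoke_with_retry(flaky_extract))
print(record, calls["count"])
```

In the pipeline, the callable would wrap the chain invocation, for example `lambda: extraction_chain.ainvoke({"text": medical_report})`.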

5.4 Best Practices for Structured Output

To get the highest accuracy in production:

Describe Fields Clearly: The description strings inside Field(description="...") are converted into JSON Schema descriptions which the LLM reads. If you write vague descriptions, the model will struggle to populate the fields correctly.
Use Enums: If a field must be one of a predefined set of options, use Python's Enum or Literal. The allowed values appear in the JSON Schema the model sees, and Pydantic rejects any value outside the permitted set during validation.
Use Coercion: If the model sometimes writes numbers as strings (e.g. "45" instead of 45), Pydantic's default (non-strict) validation converts the value to an integer when the field is typed as int.
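
The enum and coercion behavior can be checked without calling a model at all. The sketch below assumes Pydantic v2 in its default (non-strict) mode and uses a small hypothetical `Triage` model with hand-written input dicts.

```python
from enum import Enum

from pydantic import BaseModel, ValidationError


class Severity(str, Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


class Triage(BaseModel):
    age: int
    severity: Severity


# Coercion: the numeric string is converted to int, the label to an enum member.
coerced = Triage.model_validate({"age": "45", "severity": "HIGH"})
print(type(coerced.age).__name__, coerced.severity.value)

# Enum enforcement: a value outside the permitted set is rejected.
try:
    Triage.model_validate({"age": 45, "severity": "CRITICAL"})
    rejected = False
except ValidationError as exc:
    rejected = True
    failed_field = exc.errors()[0]["loc"]
    print(failed_field)
```

Coercion only covers values that convert cleanly; a string such as "forty-five" would still raise a validation error for `age`.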

5.5 Summary

You now understand:

How .with_structured_output() provides a robust native binding between models and schemas.
How to fall back to text-based JSON parsing using JsonOutputParser and .partial().
How to declare Pydantic schemas with type descriptions and enums.

In the next chapter, we will look at Tool Calling, Function Binding, and Custom Execution Loops. We will see how to bind custom Python tools to the model and implement a multi-turn reasoning loop ourselves, entirely using LCEL and pure Python.
