Why Pydantic-AI Failed with Gemini 2.5 Pro (But Direct OpenRouter Integration Succeeded)

The Problem We Faced

While building a construction specification extraction system, I encountered a critical issue: Pydantic-AI + OpenRouter + Gemini 2.5 Pro failed to produce valid, complete JSON in roughly 40% of our extractions, despite the promise of seamless structured output generation.

The code would run without errors, but the extracted data was unreliable: often incomplete, malformed, or missing critical fields. For production systems requiring consistent structured data extraction, this was a dealbreaker.

The Solution That Worked

By switching to the OpenAI Python client directly with OpenRouter + Gemini 2.5 Pro, structured outputs became 100% reliable in our extraction runs. Same model, same API gateway, but dramatically different results.

The Technical Deep Dive

Why Pydantic-AI Failed:

1. Intermediary Abstraction Misalignment

Pydantic-AI wraps model interactions in abstraction layers designed for multiple LLM providers
While this provides a unified interface, it introduces constraints that don’t align with Gemini 2.5 Pro’s native capabilities
The library’s schema enforcement mechanisms may conflict with how Gemini implements JSON Schema

2. JSON Schema Feature Support Gap

Pydantic-AI’s schema generation uses features like additionalProperties that weren’t fully supported by Gemini models through the abstraction layer at the time
This caused warnings and schema validation failures that corrupted structured outputs
The tool’s rigid validation requirements didn’t mesh well with Gemini’s output formatting
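One workaround for a keyword gap like this is to strip the unsupported keywords from the generated schema before it is sent. The sketch below is only an illustration: it is not what Pydantic-AI does internally, and it is not the fix we ended up shipping. The helper name strip_keywords and the keyword set are assumptions, and which keywords a given provider rejects should be checked against its documentation.

```python
from typing import Any, FrozenSet

# Assumption: the keywords the target model rejects. Adjust per provider docs.
UNSUPPORTED_KEYWORDS: FrozenSet[str] = frozenset({"additionalProperties"})


def strip_keywords(
    schema: Any,
    keywords: FrozenSet[str] = UNSUPPORTED_KEYWORDS,
    _names_are_fields: bool = False,
) -> Any:
    """Return a copy of a JSON Schema with the given keywords removed at every level."""
    if isinstance(schema, dict):
        cleaned = {}
        for key, value in schema.items():
            # Inside a "properties" map the keys are field names, not schema
            # keywords, so they are always kept.
            if not _names_are_fields and key in keywords:
                continue
            child_is_field_map = key == "properties" and not _names_are_fields
            cleaned[key] = strip_keywords(value, keywords, child_is_field_map)
        return cleaned
    if isinstance(schema, list):
        return [strip_keywords(item, keywords) for item in schema]
    return schema
```

The function returns a new dictionary and leaves its input untouched, so the original schema can still be used for local validation.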

3. Response Formatting Issues

Gemini 2.5 Pro sometimes wraps JSON in markdown code blocks or adds explanatory text
Pydantic-AI’s parser expects pure JSON, leading to parsing failures
The abstraction layer couldn’t gracefully handle these formatting variations
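A tolerant pre-parser is one way to absorb these formatting variations. The sketch below is not Pydantic-AI's behavior; extract_json is a hypothetical helper, and it assumes the payload is a single JSON object, possibly wrapped in a markdown fence or surrounded by explanatory text.

```python
import json
import re

# A markdown code fence, built from single characters so the pattern stays readable.
FENCE = "`" * 3

# Matches a reply that consists entirely of one fenced block, optionally tagged "json".
_FENCED = re.compile(
    r"^\s*" + FENCE + r"(?:json)?[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL | re.IGNORECASE,
)


def extract_json(text: str):
    """Parse JSON from a model reply that may be fenced or surrounded by prose."""
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span when prose surrounds the object.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(text[start:end + 1])
```

If no JSON object can be recovered, the original json.JSONDecodeError propagates, so a bad reply still fails loudly instead of being silently dropped.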

Why Direct OpenRouter Integration Succeeded:

1. Native JSON Schema Support

response_format={
    "type": "json_schema",
    "json_schema": {
        "name": "extraction_result",
        "schema": ExtractionResult.model_json_schema(),
        "strict": True
    }
}

OpenRouter implements the OpenAI-compatible API specification, including the response_format parameter
When you specify type: "json_schema", OpenRouter translates this to Gemini 2.5 Pro’s native structured output API
Gemini 2.5 Pro was recently enhanced with robust JSON Schema support including:
- anyOf, $ref, and additionalProperties keywords
- Implicit property ordering preservation
- Strict schema adherence mode

2. Direct Model-Level Enforcement

The schema is enforced at the model inference level, not in post-processing
Gemini 2.5 Pro’s decoder is constrained to only generate tokens that conform to the schema
This rules out formatting errors, missing required fields, and invalid structure, although it cannot guarantee that the extracted values themselves are correct

3. Zero Intermediary Abstraction

OpenRouter acts as an API gateway that translates the request for Gemini, not as a framework that adds its own schema handling
The request format goes directly to Gemini with minimal modification
No conflicting validation logic between tool and model

4. Enhanced Error Handling

OpenRouter provides clear error messages if the schema is incompatible
The model refuses to respond rather than generating invalid data
This fail-fast behavior prevents downstream corruption
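The same fail-fast idea can be applied on the client side, so that an unusable reply raises immediately instead of flowing into later stages. This is a minimal sketch under stated assumptions: ExtractionError and parse_or_fail are hypothetical names, and the list of required fields is whatever your own schema demands.

```python
import json
from typing import Iterable, Optional


class ExtractionError(RuntimeError):
    """Raised when a model reply cannot be used as structured output."""


def parse_or_fail(content: Optional[str], required: Iterable[str]) -> dict:
    """Return the parsed JSON object, or raise ExtractionError at the first problem."""
    if not content:
        raise ExtractionError("model returned no content")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ExtractionError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ExtractionError("expected a JSON object at the top level")
    missing = [key for key in required if key not in data]
    if missing:
        raise ExtractionError(f"missing required fields: {missing}")
    return data
```

With the OpenAI client, content would be completion.choices[0].message.content, which may be None when the request did not produce a message body.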

The Key Architectural Difference

❌ FAILED APPROACH:
Our Code → Pydantic-AI (schema abstraction) → OpenRouter (gateway) 
→ Gemini 2.5 Pro → Response → Pydantic-AI (validation) → Potential Failure

✅ WORKING APPROACH:
Our Code → OpenAI Client (standard interface) → OpenRouter (gateway) 
→ Gemini 2.5 Pro (native JSON schema enforcement) → Valid JSON Response

Real-World Impact

For our construction specification extraction system:

Before: ~40% failure rate with missing or malformed data
After: 100% success rate with fully structured, validated outputs
Extracted data from 6 CSI sections with complex nested structures:
- 35+ submittal requirements
- 120+ material specifications
- Perfect preservation of manufacturer names and technical specs
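For anyone who wants to reproduce this kind of before/after comparison, a failure rate can be computed by running a validity check over a batch of raw replies. The sketch below is one possible way to do it, not a record of how the figures above were measured; failure_rate and is_json_object are hypothetical helpers, and a real check would also validate against the full schema.

```python
import json
from typing import Callable, Iterable


def failure_rate(replies: Iterable[str], is_valid: Callable[[str], bool]) -> float:
    """Return the fraction of replies that fail the given validity check."""
    total = failures = 0
    for reply in replies:
        total += 1
        if not is_valid(reply):
            failures += 1
    if total == 0:
        raise ValueError("no replies to score")
    return failures / total


def is_json_object(reply: str) -> bool:
    """Weakest useful check: the reply parses as a JSON object."""
    try:
        return isinstance(json.loads(reply), dict)
    except json.JSONDecodeError:
        return False
```

Running the same batch of prompts through both integrations and scoring them with the same check keeps the comparison fair.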

Key Takeaways

Abstraction layers aren’t always beneficial - Sometimes going direct to the API yields better results
Model-native features > post-processing validation - Structured output at inference time is more reliable
API gateway ≠ abstraction framework - OpenRouter succeeds because it translates, not transforms
Match your tools to model capabilities - Use frameworks that align with your model’s native features

The Technical Stack That Works

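from openai import OpenAI
from pydantic import BaseModel

# YourModel stands in for your own BaseModel subclass describing the expected output.

# Point the standard OpenAI client at OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key"  # placeholder; keep the real key out of source control
)

completion = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[...],
    # Ask for output that conforms to the JSON Schema generated from the Pydantic model.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "your_schema",
            "schema": YourModel.model_json_schema(),
            "strict": True
        }
    }
)

# Validate the returned JSON against the same model that produced the schema.
result = YourModel.model_validate_json(completion.choices[0].message.content)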

Why This Matters

As we build more complex AI systems, understanding the actual integration mechanics, not just the happy-path documentation, becomes critical. The difference between a 40% failure rate and 100% reliability isn’t just about code quality; it’s about architectural decisions that align with how models actually work.

Based on production experience extracting structured data from construction specifications using Gemini 2.5 Pro via OpenRouter.