The Problem We Faced
While building a construction specification extraction system, I encountered a critical issue: Pydantic AI + OpenRouter + Gemini 2.5 Pro repeatedly failed to generate valid JSON outputs (roughly 40% of runs in our case), despite the promise of seamless structured output generation.
The code would run without errors, but the extracted data was unreliable: often incomplete, malformed, or missing critical fields. For production systems requiring consistent structured data extraction, this was a dealbreaker.
The Solution That Worked
By switching to the OpenAI Python client directly, still with OpenRouter + Gemini 2.5 Pro, structured outputs became 100% reliable in our runs. Same model, same API gateway, but dramatically different results.
The Technical Deep Dive
Why Pydantic-AI Failed:
1. Intermediary Abstraction Misalignment
- Pydantic-AI wraps model interactions in abstraction layers designed for multiple LLM providers
- While this provides a unified interface, it introduces constraints that don’t align with Gemini 2.5 Pro’s native capabilities
- The library’s schema enforcement mechanisms may conflict with how Gemini implements JSON Schema
2. JSON Schema Feature Support Gap
- Pydantic-AI’s schema generation uses features like additionalProperties that weren’t fully supported by Gemini models through the abstraction layer at the time
- This caused warnings and schema validation failures that corrupted structured outputs
- The tool’s rigid validation requirements didn’t mesh well with Gemini’s output formatting
3. Response Formatting Issues
- Gemini 2.5 Pro sometimes wraps JSON in markdown code blocks or adds explanatory text
- Pydantic-AI’s parser expects pure JSON, leading to parsing failures
- The abstraction layer couldn’t gracefully handle these formatting variations
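To make the formatting problem concrete, here is a minimal sketch of the kind of tolerant parsing a strict JSON parser lacks. It is not how Pydantic-AI or any other library works internally; `extract_json` is a hypothetical helper, and it only uses the standard library.

```python
import json
import re

# Build the fence marker programmatically so this snippet stays easy to embed.
TICKS = "`" * 3

# Matches a fenced block (optionally tagged "json") and captures its body.
_FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL | re.IGNORECASE)


def extract_json(text: str):
    """Best-effort recovery of a JSON value from a chatty model reply."""
    fenced = _FENCE.search(text)
    candidate = (fenced.group(1) if fenced else text).strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span when prose surrounds the object.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(candidate[start:end + 1])
```

This kind of cleanup is a workaround, not a guarantee: it can recover a well-formed object wrapped in extra text, but it cannot restore fields the model never produced.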
Why Direct OpenRouter Integration Succeeded:
1. Native JSON Schema Support
response_format={
    "type": "json_schema",
    "json_schema": {
        "name": "extraction_result",
        "schema": ExtractionResult.model_json_schema(),
        "strict": True
    }
}
- OpenRouter implements the OpenAI-compatible API specification, including the response_format parameter
- When you specify type: "json_schema", OpenRouter translates this to Gemini 2.5 Pro’s native structured output API
- Gemini 2.5 Pro was recently enhanced with robust JSON Schema support including:
  - anyOf, $ref, and additionalProperties keywords
  - Implicit property ordering preservation
  - Strict schema adherence mode
2. Direct Model-Level Enforcement
- The schema is enforced at the model inference level, not post-processing
- Gemini 2.5 Pro’s decoder is constrained to only generate tokens that conform to the schema
- In our runs, this eliminated formatting errors, missing fields, and invalid structure
3. Zero Intermediary Abstraction
- OpenRouter acts as an API gateway that translates the request format, not as a framework that layers its own validation on top
- The request format goes directly to Gemini with minimal modification
- No conflicting validation logic between tool and model
4. Enhanced Error Handling
- OpenRouter provides clear error messages if the schema is incompatible
- In our experience, an incompatible schema produces an error rather than a response containing invalid data
- This fail-fast behavior prevents downstream corruption
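The fail-fast idea can also be enforced on the client side. The sketch below assumes an OpenAI-compatible response that has already been converted to a plain dict, with a top-level "error" object on failure and a "finish_reason" on each choice; `ExtractionError` and `content_or_raise` are hypothetical names, not part of any SDK.

```python
import json


class ExtractionError(RuntimeError):
    """Raised when a response cannot be trusted as structured output."""


def content_or_raise(response: dict):
    """Return the parsed JSON payload, or raise instead of passing bad data on."""
    if "error" in response:
        err = response["error"]
        raise ExtractionError(f"gateway error {err.get('code')}: {err.get('message')}")
    choices = response.get("choices") or []
    if not choices:
        raise ExtractionError("response contained no choices")
    choice = choices[0]
    if choice.get("finish_reason") != "stop":
        # For example, "length" means the output was cut off, possibly mid-JSON.
        raise ExtractionError(f"unexpected finish_reason: {choice.get('finish_reason')}")
    content = (choice.get("message") or {}).get("content")
    if not content:
        raise ExtractionError("empty message content")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        raise ExtractionError(f"content is not valid JSON: {exc}") from exc
```

With the OpenAI Python client, the completion object is a Pydantic model, so something like `content_or_raise(completion.model_dump())` should fit, though the exact error shape you see from the gateway may differ from what is assumed here.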
The Key Architectural Difference
❌ FAILED APPROACH:
Our Code → Pydantic-AI (schema abstraction) → OpenRouter (gateway)
→ Gemini 2.5 Pro → Response → Pydantic-AI (validation) → Potential Failure
✅ WORKING APPROACH:
Our Code → OpenAI Client (standard interface) → OpenRouter (gateway)
→ Gemini 2.5 Pro (native JSON schema enforcement) → Valid JSON Response
Real-World Impact
For our construction specification extraction system:
- Before: ~40% failure rate with missing or malformed data
- After: 100% success rate with fully structured, validated outputs
- Extracted data from 6 CSI sections with complex nested structures:
- 35+ submittal requirements
- 120+ material specifications
- Perfect preservation of manufacturer names and technical specs
Key Takeaways
- Abstraction layers aren’t always beneficial - Sometimes going direct to the API yields better results
- Model-native features > post-processing validation - Structured output at inference time is more reliable
- API gateway ≠ abstraction framework - OpenRouter succeeds because it translates, not transforms
- Match your tools to model capabilities - Use frameworks that align with your model’s native features
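One way to act on the last takeaway is to audit which JSON Schema keywords your schema actually uses before sending it anywhere. The sketch below walks a schema dict and collects its keywords; `keywords_used` is a hypothetical helper, the sample schema is hand-written to resemble what a nested model typically produces, and `ASSUMED_SUPPORTED` is a made-up allow-list for illustration, not Gemini's or OpenRouter's real one.

```python
# Keys whose children are user-chosen names rather than schema keywords.
NAME_MAPS = {"properties", "$defs", "definitions", "patternProperties"}
# Keys whose values are plain data and should not be scanned for keywords.
DATA_KEYS = {"enum", "const", "default", "examples"}


def keywords_used(node, found=None):
    """Collect JSON Schema keywords appearing anywhere in a schema dict."""
    found = set() if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            if key in DATA_KEYS:
                continue
            if key in NAME_MAPS and isinstance(value, dict):
                # Skip the names themselves and descend into each subschema.
                for sub in value.values():
                    keywords_used(sub, found)
            else:
                keywords_used(value, found)
    elif isinstance(node, list):
        for item in node:
            keywords_used(item, found)
    return found


schema = {
    "type": "object",
    "properties": {
        "section": {"type": "string"},
        "materials": {"type": "array", "items": {"$ref": "#/$defs/Material"}},
        "notes": {"anyOf": [{"type": "string"}, {"type": "null"}]},
    },
    "required": ["section", "materials"],
    "additionalProperties": False,
    "$defs": {
        "Material": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        }
    },
}

# Illustrative allow-list only; replace with what your provider documents.
ASSUMED_SUPPORTED = {"type", "properties", "required", "items", "enum", "description"}
unsupported = sorted(keywords_used(schema) - ASSUMED_SUPPORTED)
```

Anything left in `unsupported` is a candidate for the kind of warnings and schema failures described earlier, and is worth checking against the provider's documentation.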
The Technical Stack That Works
from openai import OpenAI
from pydantic import BaseModel

# OpenRouter exposes an OpenAI-compatible endpoint, so the stock client works.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your-openrouter-key",
)

# YourModel is your own pydantic BaseModel subclass describing the output.
completion = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "your_schema",
            "schema": YourModel.model_json_schema(),
            "strict": True,
        },
    },
)
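Even with strict mode on, it is cheap to double-check the returned content before it reaches downstream code. With Pydantic you would typically pass `completion.choices[0].message.content` to `YourModel.model_validate_json(...)`; the dependency-free sketch below shows the same idea for required fields only. `missing_required` is a hypothetical helper, it does not resolve $ref, and the sample schema and data are invented for illustration.

```python
import json


def missing_required(data, schema, path=""):
    """List dotted paths of required properties that are absent from data."""
    missing = []
    if isinstance(data, dict):
        for name in schema.get("required", []):
            if name not in data:
                missing.append(f"{path}{name}")
        for name, sub in schema.get("properties", {}).items():
            if name in data:
                missing += missing_required(data[name], sub, f"{path}{name}.")
    elif isinstance(data, list) and isinstance(schema.get("items"), dict):
        for i, item in enumerate(data):
            missing += missing_required(item, schema["items"], f"{path}{i}.")
    return missing


# Invented example schema: one spec section with a list of materials.
sample_schema = {
    "type": "object",
    "required": ["section", "materials"],
    "properties": {
        "section": {"type": "string"},
        "materials": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "manufacturer"],
                "properties": {
                    "name": {"type": "string"},
                    "manufacturer": {"type": "string"},
                },
            },
        },
    },
}

# Stand-in for completion.choices[0].message.content.
content = '{"section": "03 30 00", "materials": [{"name": "A", "manufacturer": "B"}, {"name": "C"}]}'
problems = missing_required(json.loads(content), sample_schema)
```

An empty `problems` list means every required field is present; anything else is a reason to retry or flag the document rather than store partial data.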
Why This Matters
As we build more complex AI systems, understanding the actual integration mechanics, not just the happy-path documentation, becomes critical. The difference between a 40% failure rate and 100% reliability isn’t just about code quality; it’s about architectural decisions that align with how models actually work.
Based on production experience extracting structured data from construction specifications using Gemini 2.5 Pro via OpenRouter.