Google Gemini powers all AI agents in the system through LangChain, providing natural language understanding, intent classification, and conversational responses.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/KevinhosUTP/Automatizacion-Lurwis/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The platform uses Google Gemini (formerly PaLM API) for:- Intent classification - Routing customers to correct agent
- Order processing - Understanding menu requests and building orders
- Reservation handling - Managing table and venue bookings
- General inquiries - Answering questions about hours, location, etc.
- Order detection - Distinguishing modification vs. status queries
Model Selection
The system uses two types of models:| Model Type | Use Cases | Response Time | Cost |
|---|---|---|---|
| Fast Chat Models | Classifier, Detector, General inquiries | ~1-2s | Low |
| Thinking Models | Order processing (complex logic) | ~3-5s | Medium |
“Thinking models” are larger Gemini models (like gemini-1.5-pro) that perform better on complex reasoning tasks like building order JSONs.
API Setup
Get Google AI API key
- Go to Google AI Studio
- Click Get API key
- Select or create a Google Cloud project
- Copy the generated API key
Enable Gemini API
In Google Cloud Console:
- Navigate to APIs & Services → Library
- Search for “Generative Language API”
- Click Enable
n8n Configuration
Store Gemini credentials in n8n:Agent Configurations
Each agent uses LangChain’s Google Gemini Chat Model:Classifier Agent
Purpose: Route messages to correct specialist agentOrder Agent
Purpose: Handle food orders with menu database toolsDetector Agent
Purpose: Determine if customer wants to modify or query existing orderGeneral Agent
Purpose: Answer FAQs (hours, location, contact)Reservation Agents
Purpose: Handle table and venue reservationsTemperature Settings
Temperature controls response randomness:| Temperature | Use Case | Agents |
|---|---|---|
| 0.0 | Deterministic, exact classification | Detector |
| 0.1 | Consistent categorization | Classifier |
| 0.2 | Structured outputs (JSON) | Orders |
| 0.3 | Natural conversation | General, Reservations |
Lower temperature = More deterministic. Higher = More creative but less predictable.
LangChain Tools Integration
AI agents use PostgreSQL Tools to query the menu:Prompt Engineering
Key system prompt patterns used:Role Definition
Critical Rules
Context Injection
Token Usage Optimization
Limit context window
Use smaller
contextWindowLength for simple agents:- Classifier: 10 messages
- General: 10 messages
- Orders: 25 messages (needs full conversation)
Rate Limits & Quotas
Free Tier
- 60 requests per minute (RPM)
- 1,500 requests per day (RPD)
- Best for development/testing
Paid Tier
- 1,000+ RPM (depends on plan)
- Unlimited daily requests
- Priority access during high load
Error Handling
Implement graceful fallbacks:Monitoring & Logging
Track AI performance:Cost Estimation
Gemini pricing (approximate):| Model | Input | Output |
|---|---|---|
| gemini-1.5-flash | $0.075 / 1M tokens | $0.30 / 1M tokens |
| gemini-1.5-pro | $1.25 / 1M tokens | $5.00 / 1M tokens |
- 1,000 orders/day
- Avg 500 input tokens + 200 output tokens per order
- Using gemini-1.5-pro:
Switch to gemini-1.5-flash for non-order agents to reduce costs by ~90%.
Troubleshooting
Agent responses are inconsistent
Agent responses are inconsistent
- Lower temperature (try 0.1 or 0.0)
- Make system prompt more explicit
- Add examples in prompt
- Use structured output format (JSON)
Agent hallucinates prices/menu items
Agent hallucinates prices/menu items
Rate limit errors (429)
Rate limit errors (429)
- Implement exponential backoff retry
- Upgrade to paid tier
- Reduce concurrent requests
- Cache common queries
Slow response times (>5s)
Slow response times (>5s)
- Use gemini-1.5-flash instead of pro
- Reduce maxOutputTokens
- Reduce context window length
- Check if model is overloaded (try different region)
Agent ignores tools
Agent ignores tools
- Ensure tools are properly connected in n8n workflow
- Make prompt explicitly mention: “Use consultar_platos to check menu”
- Test tool independently
- Check PostgreSQL credentials are valid
Best Practices
Use appropriate models
- Classification: Fast models (flash)
- Orders: Thinking models (pro)
- General Q&A: Fast models (flash)
Optimize prompts
- Be explicit about expected format
- Use tags like
<ROL>,<REGLAS>for structure - Include examples for complex outputs
- Keep system prompts under 2000 tokens
Memory management
- Match context window to conversation complexity
- Clear old sessions periodically in MongoDB
- Monitor memory collection sizes
Related Resources
Google AI Studio
Test prompts and get API keys
LangChain Docs
LangChain Google AI integration
Order Service
See AI agents in action
Procesador Workflow
Complete agent orchestration