# LLM Providers

🤖 Configure and optimize different LLM providers with ReasonKit.

ReasonKit supports multiple LLM providers, each with different strengths, pricing, and capabilities.
## Supported Providers
| Provider | Models | Best For | Pricing |
|---|---|---|---|
| Anthropic | Claude Opus 4, Sonnet 4, Haiku 3.5 | Best quality, safety | $$$ |
| OpenAI | GPT-4, GPT-4 Turbo | Broad compatibility | $$$ |
| OpenRouter | 300+ models | Variety, cost optimization | $ - $$$ |
| Ollama | Llama, Mistral, etc. | Privacy, free | Free |
| Google | Gemini Pro, Flash | Long context | $$ |
## Provider Configuration

### Anthropic (Recommended)
Claude models provide the best reasoning quality for ThinkTools.

```bash
# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Use explicitly
rk-core think "question" --provider anthropic --model claude-sonnet-4-20250514
```
Config file:

```toml
[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"  # Use env var
model = "claude-sonnet-4-20250514"
max_tokens = 4096
```
Available models:

| Model | Context | Speed | Quality |
|---|---|---|---|
| claude-opus-4-20250514 | 200K | Slow | Best |
| claude-sonnet-4-20250514 | 200K | Fast | Excellent |
| claude-haiku-3-5-20241022 | 200K | Fastest | Good |
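If authentication fails, it can help to test the key against the Anthropic API directly, bypassing ReasonKit entirely; this is a standard Anthropic Messages API call, not a ReasonKit feature:

```bash
# Minimal request to confirm the key and model are valid:
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 16,
       "messages": [{"role": "user", "content": "ping"}]}'
```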
### OpenAI

```bash
export OPENAI_API_KEY="sk-..."
rk-core think "question" --provider openai --model gpt-4-turbo
```
Config file:

```toml
[providers.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4-turbo"
organization_id = "org-..."  # Optional
base_url = "https://api.openai.com/v1"  # For proxies
```
Available models:

| Model | Context | Speed | Quality |
|---|---|---|---|
| gpt-4-turbo | 128K | Fast | Excellent |
| gpt-4 | 8K | Medium | Excellent |
| gpt-3.5-turbo | 16K | Fastest | Good |
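"Model not found" errors (see Troubleshooting below) usually mean the ID isn't available to your key; you can confirm against the OpenAI API directly (standard API call, not a ReasonKit command):

```bash
# List the model IDs your key can access:
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" | grep '"id"'
```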
### OpenRouter

Access 300+ models through a single API. Great for cost optimization and experimentation.

```bash
export OPENROUTER_API_KEY="sk-or-..."
rk-core think "question" --provider openrouter --model anthropic/claude-sonnet-4
```
Config file:

```toml
[providers.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
site_url = "https://yourapp.com"  # For rankings
site_name = "Your App"
```
Popular models:

| Model | Provider | Quality | Price |
|---|---|---|---|
| anthropic/claude-sonnet-4 | Anthropic | Excellent | $$ |
| openai/gpt-4-turbo | OpenAI | Excellent | $$ |
| google/gemini-pro | Google | Good | $ |
| mistralai/mistral-large | Mistral | Good | $ |
| meta-llama/llama-3-70b | Meta | Good | $ |
### Ollama (Local)

Run models locally for privacy and zero API costs.

```bash
# Start Ollama
ollama serve

# Pull a model
ollama pull llama3.2

# Use with ReasonKit
rk-core think "question" --provider ollama --model llama3.2
```
Config file:

```toml
[providers.ollama]
host = "http://localhost:11434"
model = "llama3.2"
```
Recommended models:

| Model | Size | Quality | RAM Required |
|---|---|---|---|
| llama3.2 | 3B | Good | 4GB |
| llama3.1:70b | 70B | Excellent | 48GB |
| mistral | 7B | Good | 8GB |
| mixtral | 8x7B | Excellent | 32GB |
| deepseek-coder | 33B | Good (code) | 24GB |
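"Connection refused" from the Ollama provider almost always means the server isn't running. Both checks below are standard Ollama usage, not ReasonKit commands:

```bash
# Is the server listening, and which models are pulled?
curl -s http://localhost:11434/api/tags

# Or via the Ollama CLI:
ollama list
```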
### Google Gemini

```bash
export GOOGLE_API_KEY="..."
rk-core think "question" --provider google --model gemini-pro
```
Config file:

```toml
[providers.google]
api_key = "${GOOGLE_API_KEY}"
model = "gemini-pro"
```
## Provider Selection

### Automatic Selection
By default, ReasonKit auto-selects a provider based on available API keys:

```bash
# Priority order:
# 1. ANTHROPIC_API_KEY
# 2. OPENAI_API_KEY
# 3. OPENROUTER_API_KEY
# 4. GOOGLE_API_KEY
# 5. Ollama (if running)
rk-core think "question"  # Uses first available
```
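To see which keys auto-selection will find, a plain shell check works; this loop is ordinary bash, not a ReasonKit command:

```bash
# Print which provider keys are set in the current environment:
for var in ANTHROPIC_API_KEY OPENAI_API_KEY OPENROUTER_API_KEY GOOGLE_API_KEY; do
  [ -n "${!var}" ] && echo "$var is set"
done
```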
### Per-Profile Provider

Configure different providers for different profiles:

```toml
[profiles.quick]
provider = "ollama"
model = "llama3.2"

[profiles.balanced]
provider = "anthropic"
model = "claude-sonnet-4-20250514"

[profiles.deep]
provider = "anthropic"
model = "claude-opus-4-20250514"
```
### Cost Optimization

```toml
# Use cheaper models for simple tasks
[profiles.quick]
provider = "openrouter"
model = "mistralai/mistral-7b-instruct"  # Very cheap

[profiles.balanced]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"  # Good balance

[profiles.paranoid]
provider = "anthropic"
model = "claude-opus-4-20250514"  # Best quality
```
## Advanced Configuration

### Timeouts

```toml
[providers.anthropic]
timeout_secs = 120
connect_timeout_secs = 10
```
### Retries

```toml
[providers.anthropic]
max_retries = 3
retry_delay_ms = 1000
retry_multiplier = 2.0  # Exponential backoff
```
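Assuming the multiplier compounds once per attempt (the usual reading of exponential backoff), the wait before retry n is retry_delay_ms × retry_multiplier^(n-1), so the three retries fire after 1 s, 2 s, and 4 s. A quick sketch of that schedule:

```bash
# Backoff schedule implied by retry_delay_ms = 1000 and retry_multiplier = 2.0:
delay=1000
for attempt in 1 2 3; do
  echo "retry $attempt after ${delay} ms"
  delay=$((delay * 2))
done
# retry 1 after 1000 ms, retry 2 after 2000 ms, retry 3 after 4000 ms
```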
### Rate Limiting

```toml
[providers.anthropic]
requests_per_minute = 50
tokens_per_minute = 100000
```
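Assuming these are client-side throttles, the caps work out to at most one request every 1.2 seconds and roughly 1,667 tokens per second sustained:

```bash
# 50 req/min and 100,000 tok/min, expressed per second:
awk 'BEGIN { printf "min spacing: %.1f s, sustained rate: %.0f tok/s\n", 60 / 50, 100000 / 60 }'
```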
### Custom Endpoints

For proxies or enterprise deployments:

```toml
[providers.openai]
base_url = "https://your-proxy.com/v1"
api_key = "${PROXY_API_KEY}"
```
### Temperature and Sampling

```toml
[providers.anthropic]
temperature = 0.7  # 0.0-1.0, lower = more deterministic
top_p = 0.9  # Nucleus sampling
top_k = 40  # Top-k sampling
```
## Provider-Specific Features

### Anthropic Extended Thinking

Enable extended thinking for complex analysis:

```toml
[providers.anthropic]
extended_thinking = true
thinking_budget = 16000  # Max thinking tokens
```
### OpenAI Function Calling

```toml
[providers.openai]
function_calling = true
```
### OpenRouter Fallbacks

```toml
[providers.openrouter]
model = "anthropic/claude-sonnet-4"
fallback_models = [
  "openai/gpt-4-turbo",
  "google/gemini-pro",
]
```
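This corresponds to OpenRouter's own model-routing feature, where a `models` array lists fallbacks tried in order. You can exercise it directly with the standard OpenRouter API (whether rk-core uses this exact mechanism internally is an assumption):

```bash
# Ask OpenRouter to fall back across models in order:
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["anthropic/claude-sonnet-4", "openai/gpt-4-turbo"],
    "messages": [{"role": "user", "content": "ping"}]
  }'
```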
## Monitoring and Debugging

### Token Usage

```bash
# Show token usage after each analysis
rk-core think "question" --verbose

# Output includes:
# Tokens: 1,234 prompt + 567 completion = 1,801 total
# Cost: ~$0.0054
```
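The cost line is plain per-token arithmetic. As an illustration only (the flat $3-per-million rate here is an assumption, not ReasonKit's actual pricing table), the figures above reproduce like this:

```bash
# 1,234 prompt + 567 completion tokens at an assumed flat $3 per million:
awk 'BEGIN { printf "~$%.4f\n", (1234 + 567) * 3 / 1e6 }'   # prints ~$0.0054
```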
### Request Logging

```bash
# Log all API requests (for debugging)
export RK_DEBUG_API=true
rk-core think "question"
```
### Provider Health Check

```bash
# Check if a provider is working
rk-core provider test anthropic
rk-core provider test openai
rk-core provider test ollama
```
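If rk-core follows the usual convention of exiting non-zero on failure (an assumption), you can sweep every provider in one pass:

```bash
# Test each provider and flag the ones that fail:
for p in anthropic openai openrouter google ollama; do
  rk-core provider test "$p" || echo "$p: FAILED" >&2
done
```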
## Switching Providers

### Migration Checklist

When switching providers:
- Test compatibility — Run the same prompts and compare quality
- Adjust timeouts — Different providers have different latencies
- Check token limits — Models have different context windows
- Update rate limits — Different quotas per provider
- Review costs — Pricing varies significantly
### Quality Comparison

```bash
# Run the same analysis with different providers
rk-core think "question" --provider anthropic --output json > anthropic.json
rk-core think "question" --provider openai --output json > openai.json
rk-core think "question" --provider ollama --output json > ollama.json

# Compare results
diff anthropic.json openai.json
```
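A raw diff flags every wording difference. To compare only the conclusions, diff a single field instead; the `.answer` path below is a guess at rk-core's JSON schema, so substitute whatever field the output actually uses:

```bash
# Compare just the answer fields (requires jq):
diff <(jq -r '.answer' anthropic.json) <(jq -r '.answer' openai.json)
```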
## Troubleshooting

### Common Issues
| Issue | Cause | Solution |
|---|---|---|
| “API key invalid” | Wrong/expired key | Regenerate API key |
| “Rate limited” | Too many requests | Add retry logic, reduce frequency |
| “Model not found” | Wrong model ID | Check provider’s model list |
| “Context too long” | Input exceeds limit | Use model with larger context |
| “Connection refused” | Ollama not running | ollama serve |
### Error Codes
| Code | Meaning | Action |
|---|---|---|
| 401 | Unauthorized | Check API key |
| 429 | Rate limited | Wait and retry |
| 500 | Server error | Retry or switch provider |
| 503 | Service unavailable | Try fallback provider |
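For transient 429/5xx failures, a thin wrapper around the CLI can implement the wait-and-retry advice above. This assumes rk-core exits non-zero on API errors; adjust if it signals failures differently:

```bash
# Retry with exponential backoff on any non-zero exit (hypothetical helper):
run_with_retry() {
  local tries=3 delay=2 i
  for ((i = 1; i <= tries; i++)); do
    rk-core think "$1" && return 0
    if ((i < tries)); then
      echo "attempt $i failed; retrying in ${delay}s" >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
  done
  return 1
}

run_with_retry "question"
```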
## Related
- Configuration — General configuration
- Environment Variables — API key setup
- Architecture — Provider layer internals