LLM Providers

🤖 Configure and optimize different LLM providers with ReasonKit.

ReasonKit supports multiple LLM providers, each with different strengths, pricing, and capabilities.

Supported Providers

| Provider | Models | Best For | Pricing |
|---|---|---|---|
| Anthropic | Claude 4, Sonnet, Haiku | Best quality, safety | $$$ |
| OpenAI | GPT-4, GPT-4 Turbo | Broad compatibility | $$$ |
| OpenRouter | 300+ models | Variety, cost optimization | $ - $$$ |
| Ollama | Llama, Mistral, etc. | Privacy, free | Free |
| Google | Gemini Pro, Flash | Long context | $$ |

Provider Configuration

Anthropic

Claude models provide the best reasoning quality for ThinkTools.

# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Use explicitly
rk-core think "question" --provider anthropic --model claude-sonnet-4-20250514

Config file:

[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"  # Use env var
model = "claude-sonnet-4-20250514"
max_tokens = 4096

Available models:

| Model | Context | Speed | Quality |
|---|---|---|---|
| claude-opus-4-20250514 | 200K | Slow | Best |
| claude-sonnet-4-20250514 | 200K | Fast | Excellent |
| claude-haiku-3-5-20241022 | 200K | Fastest | Good |

OpenAI

export OPENAI_API_KEY="sk-..."

rk-core think "question" --provider openai --model gpt-4-turbo

Config file:

[providers.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4-turbo"
organization_id = "org-..."  # Optional
base_url = "https://api.openai.com/v1"  # For proxies

Available models:

| Model | Context | Speed | Quality |
|---|---|---|---|
| gpt-4-turbo | 128K | Fast | Excellent |
| gpt-4 | 8K | Medium | Excellent |
| gpt-3.5-turbo | 16K | Fastest | Good |

OpenRouter

Access 300+ models through a single API. Great for cost optimization and experimentation.

export OPENROUTER_API_KEY="sk-or-..."

rk-core think "question" --provider openrouter --model anthropic/claude-sonnet-4

Config file:

[providers.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4"
site_url = "https://yourapp.com"  # For rankings
site_name = "Your App"

Popular models:

| Model | Provider | Quality | Price |
|---|---|---|---|
| anthropic/claude-sonnet-4 | Anthropic | Excellent | $$ |
| openai/gpt-4-turbo | OpenAI | Excellent | $$ |
| google/gemini-pro | Google | Good | $ |
| mistralai/mistral-large | Mistral | Good | $ |
| meta-llama/llama-3-70b | Meta | Good | $ |

Ollama (Local)

Run models locally for privacy and zero API costs.

# Start Ollama
ollama serve

# Pull a model
ollama pull llama3.2

# Use with ReasonKit
rk-core think "question" --provider ollama --model llama3.2

Config file:

[providers.ollama]
host = "http://localhost:11434"
model = "llama3.2"

Recommended models:

| Model | Size | Quality | RAM Required |
|---|---|---|---|
| llama3.2 | 8B | Good | 8GB |
| llama3.2:70b | 70B | Excellent | 48GB |
| mistral | 7B | Good | 8GB |
| mixtral | 8x7B | Excellent | 32GB |
| deepseek-coder | 33B | Good (code) | 24GB |
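
The RAM figures above follow a rough rule of thumb: parameter count times bytes per weight, plus runtime overhead. A sketch assuming 4-bit quantization (about 0.5 bytes per weight) and ~20% overhead; these constants are illustrative, not Ollama's exact numbers:

```shell
# Estimate RAM for a 70B-parameter model at 4-bit quantization
# (assumed 0.5 bytes/weight and a 1.2x overhead factor)
awk 'BEGIN {
  params = 70e9; bytes_per_weight = 0.5; overhead = 1.2
  printf "~%.0fGB\n", params * bytes_per_weight * overhead / 1e9
}'
```

This lands near the 48GB listed for llama3.2:70b; exact requirements vary with quantization level and context length.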

Google Gemini

export GOOGLE_API_KEY="..."

rk-core think "question" --provider google --model gemini-pro

Config file:

[providers.google]
api_key = "${GOOGLE_API_KEY}"
model = "gemini-pro"

Provider Selection

Automatic Selection

By default, ReasonKit auto-selects based on available API keys:

# Priority order:
# 1. ANTHROPIC_API_KEY
# 2. OPENAI_API_KEY
# 3. OPENROUTER_API_KEY
# 4. GOOGLE_API_KEY
# 5. Ollama (if running)

rk-core think "question"  # Uses first available
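
The priority order can be sketched as a simple cascade over environment variables, falling back to a local Ollama health check. This is an illustrative sketch, not ReasonKit's actual implementation; the Ollama check uses its standard /api/tags endpoint:

```shell
# Pick the first available provider in priority order (illustrative sketch)
pick_provider() {
  if   [ -n "${ANTHROPIC_API_KEY:-}" ];  then echo anthropic
  elif [ -n "${OPENAI_API_KEY:-}" ];     then echo openai
  elif [ -n "${OPENROUTER_API_KEY:-}" ]; then echo openrouter
  elif [ -n "${GOOGLE_API_KEY:-}" ];     then echo google
  elif curl -sf http://localhost:11434/api/tags >/dev/null; then echo ollama
  else echo "no provider available" >&2; return 1
  fi
}
```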

Per-Profile Provider

Configure different providers for different profiles:

[profiles.quick]
provider = "ollama"
model = "llama3.2"

[profiles.balanced]
provider = "anthropic"
model = "claude-sonnet-4-20250514"

[profiles.deep]
provider = "anthropic"
model = "claude-opus-4-20250514"

Cost Optimization

# Use cheaper models for simple tasks
[profiles.quick]
provider = "openrouter"
model = "mistralai/mistral-7b-instruct"  # Very cheap

[profiles.balanced]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"  # Good balance

[profiles.paranoid]
provider = "anthropic"
model = "claude-opus-4-20250514"  # Best quality

Advanced Configuration

Timeouts

[providers.anthropic]
timeout_secs = 120
connect_timeout_secs = 10

Retries

[providers.anthropic]
max_retries = 3
retry_delay_ms = 1000
retry_multiplier = 2.0  # Exponential backoff
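
With these settings, each retry waits twice as long as the last. The delays the config above produces:

```shell
# Delays implied by retry_delay_ms = 1000 and retry_multiplier = 2.0
delay=1000
for attempt in 1 2 3; do
  echo "retry $attempt after ${delay}ms"
  delay=$((delay * 2))    # exponential backoff: 1000, 2000, 4000
done
```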

Rate Limiting

[providers.anthropic]
requests_per_minute = 50
tokens_per_minute = 100000
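
A per-minute request cap translates to a minimum spacing between requests. A quick calculation for the value above (client-side pacing sketch, not ReasonKit's internal limiter):

```shell
# Minimum request spacing implied by requests_per_minute = 50
rpm=50
echo "one request every $((60000 / rpm))ms"
```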

Custom Endpoints

For proxies or enterprise deployments:

[providers.openai]
base_url = "https://your-proxy.com/v1"
api_key = "${PROXY_API_KEY}"

Temperature and Sampling

[providers.anthropic]
temperature = 0.7        # 0.0-1.0, lower = more deterministic
top_p = 0.9             # Nucleus sampling
top_k = 40              # Top-k sampling
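
To see what temperature does, here is a toy two-token softmax: lowering the temperature sharpens the distribution toward the higher-logit token. The logit values are arbitrary examples:

```shell
# Probability of the higher-logit token as temperature drops (toy example)
awk 'BEGIN {
  l1 = 2.0; l2 = 1.0                          # arbitrary example logits
  for (T = 1.0; T >= 0.25; T /= 2) {
    p = exp(l1 / T) / (exp(l1 / T) + exp(l2 / T))
    printf "T=%.2f -> p(top)=%.2f\n", T, p
  }
}'
```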

Provider-Specific Features

Anthropic Extended Thinking

Enable extended thinking for complex analysis:

[providers.anthropic]
extended_thinking = true
thinking_budget = 16000  # Max thinking tokens

OpenAI Function Calling

[providers.openai]
function_calling = true

OpenRouter Fallbacks

[providers.openrouter]
model = "anthropic/claude-sonnet-4"
fallback_models = [
    "openai/gpt-4-turbo",
    "google/gemini-pro",
]
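
The fallback list means each model is tried in order until one succeeds. A sketch of that behavior, where try_model is a hypothetical stand-in for a real provider call:

```shell
# Try each model in order until one succeeds (try_model is hypothetical)
try_model() { [ "$1" = "openai/gpt-4-turbo" ]; }   # pretend only this model succeeds
for model in anthropic/claude-sonnet-4 openai/gpt-4-turbo google/gemini-pro; do
  if try_model "$model"; then echo "fell back to $model"; break; fi
done
```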

Monitoring and Debugging

Token Usage

# Show token usage after each analysis
rk-core think "question" --verbose

# Output includes:
# Tokens: 1,234 prompt + 567 completion = 1,801 total
# Cost: ~$0.0054
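
The cost line is prompt and completion tokens priced separately, since providers bill input and output at different rates. A sketch of the arithmetic with placeholder per-token rates (not real pricing for any provider):

```shell
# Cost = prompt_tokens * input_rate + completion_tokens * output_rate
# The $/token rates here are placeholders, not actual provider pricing
awk 'BEGIN {
  prompt = 1234; completion = 567
  in_rate = 3 / 1e6; out_rate = 15 / 1e6
  printf "Tokens: %d total, Cost: ~$%.4f\n",
         prompt + completion, prompt * in_rate + completion * out_rate
}'
```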

Request Logging

# Log all API requests (for debugging)
export RK_DEBUG_API=true
rk-core think "question"

Provider Health Check

# Check if provider is working
rk-core provider test anthropic
rk-core provider test openai
rk-core provider test ollama

Switching Providers

Migration Checklist

When switching providers:

  1. Test compatibility — Run same prompts, compare quality
  2. Adjust timeouts — Different providers have different latencies
  3. Check token limits — Models have different context windows
  4. Update rate limits — Different quotas per provider
  5. Review costs — Pricing varies significantly
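
For item 3, a common pre-flight heuristic is roughly four characters per token for English text. The character count and context limit below are arbitrary example numbers:

```shell
# Estimate tokens as chars/4 and compare against a model's context window
prompt_chars=40000
est_tokens=$((prompt_chars / 4))
limit=8192                          # e.g. an 8K-context model
if [ "$est_tokens" -gt "$limit" ]; then
  echo "likely too long: ~${est_tokens} tokens > ${limit}"
fi
```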

Quality Comparison

# Run same analysis with different providers
rk-core think "question" --provider anthropic --output json > anthropic.json
rk-core think "question" --provider openai --output json > openai.json
rk-core think "question" --provider ollama --output json > ollama.json

# Compare results
diff anthropic.json openai.json

Troubleshooting

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| "API key invalid" | Wrong/expired key | Regenerate API key |
| "Rate limited" | Too many requests | Add retry logic, reduce frequency |
| "Model not found" | Wrong model ID | Check provider's model list |
| "Context too long" | Input exceeds limit | Use model with larger context |
| "Connection refused" | Ollama not running | Run ollama serve |

Error Codes

| Code | Meaning | Action |
|---|---|---|
| 401 | Unauthorized | Check API key |
| 429 | Rate limited | Wait and retry |
| 500 | Server error | Retry or switch provider |
| 503 | Service unavailable | Try fallback provider |