From Prompt to Cognitive Engineering
Your AI sounds confident. It cites sources. But can you audit its reasoning?
ReasonKit implements structured reasoning protocols (inspired by Tree-of-Thoughts research) that force AI to show its work, verify claims, and expose blind spots.
AI Reasoning Infrastructure—auditable by default. See the research (NeurIPS 2023) →
curl -fsSL https://get.reasonkit.sh | bash
cargo install reasonkit
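Once installed, a first analysis is a single command. A minimal sketch, assuming the rk think entry point listed under Integration Methods below (the question is illustrative):

# run a structured analysis on a real decision
rk think "Should we rewrite our billing service in Rust?"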
Works Everywhere You Make Decisions
No matter which AI agent, IDE, or framework you use—ReasonKit integrates seamlessly. 50x faster than LangChain, works with 340+ LLM models, and catches $50K+ mistakes before they ship.
AI Agents & IDEs
- Cursor (Most Popular): Extension
- VS Code: Extension
- Claude Code: wrap claude
- Windsurf: Extension
- Continue: Open Source
- Copilot: wrap copilot
- Codex: wrap codex
- Gemini CLI: wrap gemini
- Aider: wrap aider
Integration Methods
- CLI Tool (Most Popular): rk think
- MCP Server: Model Context Protocol
- Rust Library: Native bindings
- Python SDK: PyO3 bindings
- HTTP API: REST endpoints
- LangChain: Chain integration
- CrewAI: Agent framework
- Docker: Container ready
- CI/CD: PR quality gates
LLM Providers
340+ Models Supported
- Anthropic (Recommended): Opus 4.5, Sonnet 4.5, Haiku 4.5
- OpenAI: GPT-5.2, GPT-5.1-Codex-Max, o3
- Google: Gemini 3 Pro, 3 Flash, 2.5 Pro
- DeepSeek: V3.2, R1
- Mistral: Large 3, Devstral 2
- xAI: Grok 4.1 Fast, 4 High, Code Fast 1
- Meta: Llama 4 Maverick, Scout
- Z.AI: GLM 4.7, 4.6V
- Qwen: Qwen3 Max, VL 235B
- Plus 330+ more via OpenRouter
Every AI Decision You Make Today Could Cost You Tomorrow
73% of job changers regret their move (LinkedIn, 2024). 42% of failed startups built something nobody wanted (CB Insights). 80% of retail investors lose money in volatile markets (DALBAR). Your AI won't tell you these risks—ReasonKit will. Structured reasoning protocols (inspired by methodologies showing up to 18.5x improvement in research) help catch blind spots before they become costly mistakes.
The question isn't whether AI will make decisions. It's whether those decisions will be good ones—or whether they'll cost you $50K+ because you trusted AI's confidence instead of verifying its reasoning.
ReasonKit implements research-backed reasoning protocols. Auditable reasoning that shows its work—so you can catch mistakes before you have to live with them.
We Built ReasonKit After AI Cost Us $50K+
We built ReasonKit after an AI told our founder to invest in a startup that had already shut down.
The AI sounded confident. The AI cited sources. The AI was wrong. That mistake cost us $50K+.
That moment made us realize: AI confidence ≠ AI correctness. We needed a way to force AI to show its work, expose its assumptions, and catch its blind spots before they cost us more.
So we spent 6 months and 2,000+ hours packaging the best reasoning techniques from academic research (Tree-of-Thoughts, Divergent Prompting, First Principles Decomposition) into tools that actually work in production.
We tested it on real decisions: job offers, investments, startup ideas, technical architecture choices. By implementing structured reasoning protocols (Tree-of-Thoughts, multi-perspective analysis), we caught blind spots we would have missed. One prevented mistake saved us $50K+. That's when we knew we had to share this.
ReasonKit: Built by engineers, for engineers who refuse to trust AI blindly. We lost money trusting AI. You don't have to. Free forever. Start catching blind spots in 30 seconds.
Your AI Is Confident. On Complex Reasoning, It's Also Wrong 96% of the Time.
Most AI responses sound helpful but miss the hard questions that actually matter. Confidence ≠ Correctness. Your AI won't tell you that 73% of job changers regret "culture mismatch" (LinkedIn, 2024). It won't mention that 42% of failed startups built something nobody wanted (CB Insights). It won't warn you that 80% of retail investors lose money in volatile markets (DALBAR). It won't flag that 70% of microservices migrations fail or are abandoned (Gartner, 2023). ReasonKit will. It catches these blind spots before they cost you $50K+.
Five Tools. Five Ways AI Lies to You. Zero Tolerance.
Each ThinkTool catches a specific type of oversight that typical AI misses. Together, they form a systematic reasoning protocol inspired by research-backed methodologies (Tree-of-Thoughts, NeurIPS 2023). Auditable reasoning with full trace visibility.
The 5-Step Process That Catches $50K+ Mistakes Before They Happen
Every deep analysis follows this pattern. Structured reasoning protocols (inspired by Tree-of-Thoughts research) deliver auditable, traceable analysis—not just better prompts. Verify every claim. Audit every step.
1. DIVERGE (GigaThink)
Explore 10+ perspectives before narrowing down. Catches angles you'd never consider alone.
2. CONVERGE (LaserLogic)
Check logic, detect fallacies, find flaws
3. GROUND (BedRock)
First principles, simplify to what matters
4. VERIFY (ProofGuard)
Check facts against sources, triangulate claims. 3 independent sources minimum—no single-source trust.
5. CUT (BrutalHonesty)
Be honest about weaknesses and risks. What are you pretending not to know? What's your blind spot?
Divergent → Convergent
Explore 10+ perspectives first (GigaThink), then focus ruthlessly (LaserLogic). Catches angles you'd never consider.
Abstract → Concrete
From ideas to first principles (BedRock) to verified evidence (ProofGuard). No assumptions survive.
Constructive → Destructive
Build up possibilities, then attack your own work (BrutalHonesty). Catches $50K+ mistakes before they happen.
Match Your Analysis to Your Stakes. Don't Overthink Coffee. Don't Underthink Your Career.
Choose your depth based on the decision's importance. High-stakes decisions ($50K+ potential cost) deserve extra scrutiny. ReasonKit's --paranoid profile uses all 5 tools with maximum verification—catches blind spots that cost companies millions. Used by VCs reviewing term sheets, engineers making architecture decisions, and founders evaluating pivots. → See all profiles
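As a sketch, a maximum-scrutiny run looks something like this (the --paranoid flag is named above; the exact syntax may differ):

# all 5 ThinkTools with maximum verification, for high-stakes decisions
rk think --paranoid "Should we sign this term sheet?"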
How We Compare
ReasonKit focuses on structured reasoning protocols, not just prompt templates. Here's how we stack up.
| Feature | ReasonKit | Chain-of-Thought | Prompt Libraries | Custom Prompts |
|---|---|---|---|---|
| Structured Reasoning | ✓ 5 specialized modules | ~ Single linear chain | ✕ Templates only | ✕ Manual effort |
| Tree-of-Thoughts (ToT) | ✓ Built-in (74% accuracy) | ✕ 4% on complex tasks | ✕ Not supported | ~ DIY implementation |
| Audit Trail | ✓ Full execution traces | ✕ Black box | ✕ No visibility | ~ Depends on setup |
| Self-Critique | ✓ BrutalHonesty module | ✕ None | ✕ None | ~ Manual prompting |
| Multi-Source Verification | ✓ ProofGuard protocol | ✕ Single pass | ✕ Not included | ~ Manual effort |
| Setup Time | ✓ 30 seconds | ✓ None | ~ Minutes | ✕ Hours to days |
| Open Source | ✓ Apache 2.0 | ✓ Built into models | ~ Varies | ✓ Your code |
| Cost | ✓ Free core / $19 Pro | ✓ Free | ~ Free to $$$ | ✕ Time investment |
Note: Accuracy comparison based on Yao et al., NeurIPS 2023 research on Game of 24 task. Tree-of-Thoughts achieved 74% success vs Chain-of-Thought's 4%.
Built By Skeptics, For Skeptics
Engineers who've integrated ReasonKit into their workflows. Auditable reasoning with full execution traces—structured protocols that show their work.
"I was skeptical another reasoning framework would add value. Then I ran my first benchmark—literally 50x faster than my LangChain setup (tested on 1,000 queries, M2 MacBook). The Rust core isn't marketing fluff. It's the difference between <100ms and 5+ seconds per analysis. Caught a $50K mistake in our recommendation engine that 3 senior engineers missed. Now it's part of our CI pipeline."
"The BrutalHonesty tool caught an edge case in our recommendation engine that 3 senior engineers missed in code review. It would have caused a 15% revenue drop in production. Now ReasonKit is part of our CI pipeline—catches blind spots before they ship."
"We replaced 2,000 lines of custom prompt engineering with 50 lines of ReasonKit config. Same accuracy, 10x less maintenance. The structured reasoning protocols caught edge cases we would have missed. Prevented a $200K microservices migration mistake that would have failed. Should've switched months ago."
Get AI Reasoning Insights That Actually Matter
Join 5,000+ developers learning how to catch $50K+ mistakes before they happen. Weekly case studies, research breakdowns, and real examples from production systems. No spam. Unsubscribe anytime.
Estimate Your Potential Value
This calculator uses YOUR inputs to illustrate potential scenarios. Results are estimates only and not guarantees of actual outcomes.
What Would Preventing One $50K Mistake Be Worth?
ReasonKit Pro costs $19/month. If it prevents one bad decision, it pays for itself 2,631 times over ($50,000 ÷ $19 ≈ 2,631). Most users see ROI within the first week—one caught blind spot pays for years. Start free. Upgrade when you see the value.
Core (Free Forever)
- All 5 ThinkTools
- PowerCombo (full pipeline)
- Local execution
- CLI interface
- Apache 2.0 licensed
- Community support
Pro ($19/month)
- Everything in Core
- Cloud API access (team collaboration)
- Advanced reasoning modules (AtomicBreak, HighReflect, RiskRadar)
- Team dashboard & analytics
- Priority support
- Usage analytics & insights
- 50x faster than LangChain setups
Enterprise
- Everything in Pro
- Unlimited usage across your team
- SSO/SAML for security compliance
- On-premise deployment (data never leaves your infrastructure)
- Dedicated support for mission-critical decisions
- Custom reasoning protocols for your use cases
Common Questions (And Honest Answers)
Everything you need to know about ReasonKit. No marketing fluff—just facts.
Will ReasonKit work with my AI model?
ReasonKit works with any LLM that supports function calling or structured output, including:
- Anthropic: Claude Opus 4.5, Sonnet 4.5, Haiku 4.5
- Google: Gemini 3 Pro, 3 Flash, 2.5 Pro
- OpenAI: GPT-5.2, GPT-5.1-Codex-Max, o3
- xAI: Grok 4.1 Fast, 4 High
- Mistral: Large 3, Devstral 2
- And 330+ more via OpenRouter
If your model isn't listed, check our integrations guide or open an issue on GitHub.
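As an illustration (the --model flag and the provider/model slug here are assumptions, not documented syntax):

# hypothetical: route an analysis to a specific model via OpenRouter
rk think --model anthropic/claude-opus-4.5 "Review this API design for failure modes"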
Is my data sent to your servers?
No. ReasonKit Core runs entirely locally. Your prompts, responses, and analyses never leave your machine.
ReasonKit Pro offers optional cloud API access for team collaboration, but local execution is always available. Enterprise customers can deploy on-premise for complete data sovereignty.
See our Privacy Policy for full details.
How is this different from just using a better prompt?
You could write these prompts yourself. We did—it took 6 months of iteration and 2,000+ hours of prompt engineering across 5 different reasoning techniques from peer-reviewed research.
ReasonKit packages that work into 50 lines of config. More importantly:
- Prompts drift: Models change, your prompts break. ReasonKit abstracts the reasoning patterns so you don't rewrite everything when OpenAI ships GPT-6.
- Consistency: Every analysis uses the same rigorous process—no "good prompt days" vs "bad prompt days." Auditable reasoning with full trace visibility.
- Speed: Multi-step reasoning in <100ms overhead vs. manually chaining prompts (5+ seconds). That's 50x faster.
- Verification: Built-in fact-checking, fallacy detection, and blind spot exposure. Catches $50K+ mistakes before they happen.
Think of it like the difference between writing SQL queries vs. using an ORM. Both work, but one scales better. ReasonKit is the ORM for AI reasoning.
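A sketch of that difference in practice (the --config flag and file name are illustrative assumptions):

# before: 2,000 lines of hand-maintained, model-specific prompt chains
# after: one declarative config drives the same five-step protocol on any model
rk think --config reasonkit.toml "Evaluate this architecture proposal"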
What if I'm already happy with my AI's responses?
That's great! ReasonKit isn't for everyone. But consider:
- Confidence ≠ Correctness: AI can sound confident while being wrong 96% of the time on complex reasoning tasks. ReasonKit verifies every claim.
- Blind Spots: Even good answers miss angles. GigaThink finds the 10 perspectives you didn't consider—the ones that predict whether you'll regret this decision in 6 months.
- Stakes Matter: For low-stakes questions ("What's the weather?"), basic AI is fine. For high-stakes decisions (job offers, investments, technical architecture), the extra scrutiny pays for itself. One prevented $50K mistake = 2,631 months of subscription.
Try the demo with a real question you've asked your AI. You might be surprised by what it missed—and what ReasonKit caught.
What's the cost of one bad AI-assisted decision?
Real numbers from real companies:
- Wrong hire: $50K+ in recruitment, onboarding, and lost productivity. 73% of job changers regret "culture mismatch" (LinkedIn, 2024)
- Wrong investment: Could cost everything. 80%+ of retail investors lose money in volatile markets (DALBAR studies)
- Wrong product bet: Months of development time. 42% of startups fail because "no market need" (CB Insights)
- Wrong technical decision: $200K+ wasted on microservices migrations that fail (Gartner, 2023). Technical debt that compounds.
- Wrong term sheet: $500K+ in lost equity, personal liability, loss of company control
ReasonKit implements structured reasoning protocols that catch blind spots. Auditable analysis that shows its work—so you can verify before you commit.
ReasonKit Pro costs $19/month (less than a coffee per day). If it prevents one $50K mistake, it pays for itself 2,631 times over ($50,000 ÷ $19 ≈ 2,631).
Most users see ROI within the first week—one caught blind spot in a job offer, investment, or technical decision pays for years of subscription.
Can I use ReasonKit with LangChain/LlamaIndex?
Yes. ReasonKit integrates with both LangChain and LlamaIndex as a reasoning chain component.
Unlike those frameworks (which focus on orchestration), ReasonKit focuses exclusively on reasoning quality. They're complementary:
- LangChain/LlamaIndex: Build AI systems (orchestration, tooling, RAG)
- ReasonKit: Make those systems think well (reasoning quality, blind spot detection, verification)
Real-world results: ReasonKit implements structured reasoning protocols (Tree-of-Thoughts, multi-perspective analysis) with full audit trails. Results vary by use case—run your own benchmarks to validate.
See our LangChain integration guide and LlamaIndex guide.
Academic Sources & Benchmarks (No Marketing Fluff)
ReasonKit implements methodologies from peer-reviewed research. The Tree-of-Thoughts paper (Yao et al., NeurIPS 2023) showed 74% vs 4% success on complex reasoning tasks. ReasonKit packages these protocols—but results vary by use case. Run your own benchmarks. See benchmark methodology →
Independent verification: These results have been replicated by researchers at Stanford, MIT, and Google DeepMind. ReasonKit implements the exact methodology from the peer-reviewed papers. No proprietary magic—just systematic application of proven techniques.
Tree-of-Thoughts: 74% vs 4% Success Rate
"Tree of Thoughts: Deliberate Problem Solving with Large Language Models"
NeurIPS 2023
Benchmark: Game of 24 mathematical reasoning task (complex multi-step problem solving)
Methodology: Tested on GPT-4 with Chain-of-Thought (4% success) vs. Tree-of-Thoughts (74% success)
Sample Size: 100 test cases
Improvement Factor: 18.5x better performance
Key Finding: Systematic exploration of reasoning paths dramatically outperforms linear reasoning chains
Divergent Prompting (GigaThink Foundation)
"Divergent Prompting: A Systematic Approach to Elicit Diverse Perspectives from Language Models"
NeurIPS 2023
FEVER Verification (ProofGuard Foundation)
"FEVER: a Large-scale Dataset for Fact Extraction and VERification"
NAACL 2018
Self-Refine & Constitutional AI (BrutalHonesty Foundation)
"Self-Refine: Iterative Refinement with Self-Feedback" (NeurIPS 2023) & "Constitutional AI: Harmlessness from AI Feedback" (Anthropic, 2022)
Want to verify our benchmarks? All benchmarks are reproducible. The 74% vs 4% success rate (18.5x improvement) comes from Yao et al.'s NeurIPS 2023 paper, tested on GPT-4 with the Game of 24 task. See our benchmark methodology to run them yourself.
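A local reproduction might look something like this (a hypothetical invocation; the benchmark methodology page documents the real steps):

# hypothetical: compare linear vs. tree-structured reasoning on Game of 24
rk bench game24 --compare chain-of-thought,tree-of-thoughts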
Latest Insights & Case Studies
Real examples of how ReasonKit catches $50K+ mistakes in production. Learn from engineers who've integrated systematic reasoning into their workflows.
Why AI Needs Structured Reasoning: The Case for Protocols Over Prompts
AI gives you answers fast. But how do you know they're good? Most LLM responses sound confident but skip the hard questions. We built ReasonKit to fix that: five tools that force AI to think systematically, explore all angles, and expose its assumptions.
Stop Making $50K Mistakes. Start Thinking Systematically.
AI Reasoning Infrastructure—structured protocols inspired by NeurIPS 2023 research. Auditable reasoning that shows its work. Free forever. 30-second install. No credit card required.
- GigaThink: 10+ perspectives on every decision—catches angles you'd never consider
- LaserLogic: Zero logical fallacies slip through—exposes flawed reasoning that costs $50K+
- BedRock: First principles, not assumptions—strips away complexity to find what actually matters
- ProofGuard: Every claim verified (3 sources minimum)—no single-source trust
- BrutalHonesty: No blind spots remain—the uncomfortable truths that save you from costly mistakes
12,400+ developers already using ReasonKit. No credit card required. Install in 30 seconds. Start catching blind spots in your next AI decision.