Overview

We tested whether accumulated engineering context helps AI agents perform better. It does: agents with oobo memory resolved 60% more bugs in our SWE-bench evaluation. Oobo captures context from every commit: not just the code diff, but why the change was made, what was tried, and which patterns work. This memory is available to agents through MCP tools.

Results

60% More Bugs Fixed

Agents with oobo memory resolve 60% more real-world bugs on SWE-bench tasks

75% Win Rate

When memory makes a difference, oobo wins 3 out of 4 contested cases

2.5x More Accurate

On codebase-specific questions, agents answer correctly 2.5x more often with context

Zero False Alarms

When oobo warns about a risky modification, it’s always right — 0% false positive rate

How We Tested

Setup:
  • Same agent (Claude Sonnet 4) with identical tools in both conditions
  • Only difference: one gets oobo’s memory context, one doesn’t
  • Real engineering experiences from 12 major open-source Python repositories
  • Evaluated on tasks the agent has never seen (strict no-leakage protocol)
Benchmark: SWE-bench, a set of real GitHub issues with verified gold-standard solutions.
Evaluation: Full agentic loop. Agents iterate with file reading, code search, editing, and bash commands to produce patches. Compared A/B with an LLM judge plus factual verification.
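
An A/B comparison of this kind can be summarized with two numbers: the relative change in resolved tasks, and the win rate over contested tasks (those where exactly one condition succeeded). The sketch below is a minimal illustration, assuming one boolean outcome per task and condition. The record shape, the function name, and the toy data are hypothetical; they are not oobo's evaluation code or its measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedOutcome:
    """One task run under both conditions (hypothetical record shape)."""
    task_id: str
    baseline_resolved: bool  # agent without memory context
    memory_resolved: bool    # same agent with memory context


def summarize(outcomes: list[PairedOutcome]) -> dict[str, float]:
    """Compute relative improvement and contested win rate from paired outcomes."""
    baseline = sum(o.baseline_resolved for o in outcomes)
    memory = sum(o.memory_resolved for o in outcomes)
    # A task is contested when exactly one condition resolved it.
    memory_wins = sum(o.memory_resolved and not o.baseline_resolved for o in outcomes)
    baseline_wins = sum(o.baseline_resolved and not o.memory_resolved for o in outcomes)
    contested = memory_wins + baseline_wins
    return {
        "relative_improvement": (memory - baseline) / baseline if baseline else float("nan"),
        "contested_win_rate": memory_wins / contested if contested else float("nan"),
    }


# Toy data for illustration only; not taken from the benchmark.
TOY = [
    PairedOutcome("t1", True, True),
    PairedOutcome("t2", False, True),
    PairedOutcome("t3", False, True),
    PairedOutcome("t4", True, False),
    PairedOutcome("t5", False, False),
    PairedOutcome("t6", False, True),
]

if __name__ == "__main__":
    print(summarize(TOY))
```

Tasks that both conditions resolve, or that both fail, do not affect the contested win rate. That is why it can differ sharply from the overall change in resolved tasks.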

What This Means

Every commit your team makes becomes searchable intelligence. When an agent encounters something similar months later, oobo provides the context: which files to look at, what patterns worked, what broke last time.
Agents without memory waste iterations exploring dead ends. With oobo context, agents navigate directly to the relevant code and produce fixes they would otherwise miss entirely, at near-zero cost overhead.
Zero false alarms on regression warnings. When oobo surfaces a risk, it’s always based on real past experience — not heuristics or guesses.
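
One way to read the "zero false alarms" claim is as a statement about the warnings that were issued: of all warnings raised, none turned out to be wrong. The sketch below computes that ratio, assuming each warning is later labeled as confirmed or not confirmed. The function name and the labeling scheme are hypothetical, and the page does not state exactly how the rate was computed.

```python
def false_alarm_rate(warning_confirmed: list[bool]) -> float:
    """Fraction of issued warnings that were NOT confirmed by a real regression.

    Each element is True if the warned-about risk was real, False otherwise.
    """
    if not warning_confirmed:
        raise ValueError("no warnings were issued, so the rate is undefined")
    false_alarms = sum(not confirmed for confirmed in warning_confirmed)
    return false_alarms / len(warning_confirmed)


if __name__ == "__main__":
    # Illustrative labels only: three warnings, all confirmed.
    print(false_alarm_rate([True, True, True]))
```

This ratio covers only the warnings that were raised. It does not measure how many risky modifications went unwarned.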

Methodology

| Parameter | Value |
|---|---|
| Dataset | SWE-bench (2,294 tasks, 12 Python repositories) |
| Model | Claude Sonnet 4 |
| Embedding | OpenAI text-embedding-3-large |
| Search | pgvector cosine similarity |
| Protocol | Strict train/test split, no data leakage |
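
The Search parameter refers to ranking stored memories by the cosine similarity between embedding vectors. The sketch below shows that ranking in plain Python, with tiny hand-written vectors standing in for real embeddings. It does not call pgvector or the OpenAI API, and the memory records and function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query: list[float],
          memories: list[tuple[str, list[float]]],
          k: int = 2) -> list[tuple[float, str]]:
    """Return the k memories most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical memories with made-up 3-dimensional "embeddings".
MEMORIES = [
    ("fix off-by-one in paginator", [1.0, 0.0, 0.0]),
    ("retry logic for flaky uploads", [0.0, 1.0, 0.0]),
    ("paginator cursor refactor", [0.7, 0.7, 0.0]),
]
QUERY = [1.0, 0.05, 0.0]

if __name__ == "__main__":
    for score, text in top_k(QUERY, MEMORIES):
        print(f"{score:.3f}  {text}")
```

In pgvector, the `<=>` operator returns cosine distance, which is 1 minus cosine similarity. Ordering by ascending distance there corresponds to the descending-similarity sort used here.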

The Oobo Difference

Without oobo: Every agent session starts from zero. It reads files, searches code, tries approaches, hits dead ends, backtracks.
With oobo: The agent inherits accumulated knowledge. Past solutions, known pitfalls, and architectural decisions are surfaced automatically at the start of each task.
Every commit enriches the system. No configuration. No training. No manual tagging.