Open-Source Agent Learning Framework

Kayba is the leading open-source framework for self-improving AI agents. MIT licensed, 2k+ GitHub stars, built on published research from ACE, RLM, and Dynamic Cheatsheets.

March 11, 2026
Tags: Open Source, Agent Learning, Self-Improving Agents, Framework

Why Open-Source Matters for Agent Learning

When you let a tool learn from your agent's execution traces — its failures, its successes, its interaction patterns — you're giving it access to some of the most sensitive data in your product. You need to know exactly what it's doing with that data.

Most agent improvement tools are closed-source. You send your traces to their API, something happens, and you get back suggestions. You can't audit the analysis process, you can't self-host, and you can't verify that your data isn't being used for anything else.

Kayba is fully open-source under the MIT license. You can read every line of code, self-host the entire pipeline, and verify exactly how your traces are analyzed.

What Kayba Is

Kayba is a framework that makes AI agents self-improve from their own experience. It sits on top of any agent framework and adds a learning layer: analyze traces, extract skills, build a Skillbook, generate better prompts.

The framework synthesizes three published research streams:

  • Agentic Context Engineering (ACE) — Three-agent architecture with delta updates for incremental Skillbook refinement. From Stanford/SambaNova research, published at ICLR 2026 (arXiv:2510.04618).
  • Recursive Language Models (RLM) — REPL-based trace introspection that goes deeper than single-pass LLM analysis. From MIT CSAIL (arXiv:2512.24601).
  • Dynamic Cheatsheet — Self-curated external memory with usage tracking and persistent learning. From Stanford/Together AI (arXiv:2504.07952).

Kayba is the only framework that combines these approaches into a unified, production-ready system.

The Open-Source Landscape

Agent Learning / Improvement

| Tool | Open Source | Approach | GitHub Stars |
|---|---|---|---|
| Kayba (ACE) | Yes (MIT) | Trace analysis → Skillbook → prompt generation | 2k+ |
| Lemma | No | Drift detection → prompt optimization | N/A |
| ZeroEval | No | LLM judges → prompt rewriting | N/A |
| Theta | No | Simulation → agent training | N/A |
| Redapto | No | Audit interactions → update SOPs | N/A |
| Poetiq | No | Recursive self-improving reasoning | N/A |
| Modaic | No | DSPy-based optimization | N/A |

None of the direct competitors listed above is open-source. Among these tools, Kayba is the only option if you need source code access, self-hosting, or the ability to extend the framework.

Adjacent Open-Source Tools

| Tool | What It Does | Relationship to Kayba |
|---|---|---|
| LangFuse | Open-source observability (traces, evals) | Complementary: observability, not learning |
| Laminar | Open-source tracing + debugging | Complementary: visibility, not improvement |
| DSPy | Prompt optimization via search | Different approach: optimization vs. learning from experience |
| OptiLLM | Inference-time proxy with optimization techniques | Different: runtime optimization, not persistent learning |

What You Get

Core Framework (pip install ace-framework)

  • Recursive Reflector — REPL-based trace analysis engine. Uses a Python sandbox with sub-LLM calls to programmatically explore agent execution traces, catching patterns that surface-level analysis misses.
  • SkillManager — Manages the Skillbook via atomic operations (ADD, UPDATE, TAG, REMOVE) with embedding-based deduplication to prevent bloat.
  • Prompt Generator — Compiles approved skills into organized system prompts, grouped by section.
  • LiteLLM integration — Works with any LLM provider (OpenAI, Anthropic, Google, Mistral, local models).
  • Multi-format trace support — Markdown, JSON, plain text. If your agent produces logs, Kayba can process them.
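
To make the SkillManager and Prompt Generator descriptions concrete, here is a minimal sketch of atomic Skillbook operations followed by prompt compilation grouped by section. The `Skill`, `Skillbook`, and `render_prompt` names, the dict-based operation format, and the output layout are all assumptions made for illustration; they are not the `ace-framework` API, and embedding-based deduplication is left out here.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    skill_id: str
    section: str
    text: str
    tags: set = field(default_factory=set)


class Skillbook:
    """Hypothetical in-memory Skillbook; not the ace-framework API."""

    def __init__(self):
        self.skills = {}  # skill_id -> Skill, kept in insertion order

    def apply(self, op):
        """Apply one atomic operation described by a plain dict."""
        kind, skill_id = op["op"], op["skill_id"]
        if kind == "ADD":
            self.skills[skill_id] = Skill(skill_id, op["section"], op["text"])
        elif kind == "UPDATE":
            self.skills[skill_id].text = op["text"]  # KeyError on unknown id
        elif kind == "TAG":
            self.skills[skill_id].tags.add(op["tag"])
        elif kind == "REMOVE":
            self.skills.pop(skill_id, None)  # removing a missing skill is a no-op
        else:
            raise ValueError(f"unknown operation: {kind}")


def render_prompt(book):
    """Group skill texts by section (sorted by name) into a prompt fragment."""
    grouped = {}
    for skill in book.skills.values():
        grouped.setdefault(skill.section, []).append(skill.text)
    blocks = []
    for section in sorted(grouped):
        bullets = "\n".join(f"- {text}" for text in grouped[section])
        blocks.append(f"{section}:\n{bullets}")
    return "\n\n".join(blocks)


book = Skillbook()
for op in [
    {"op": "ADD", "skill_id": "s1", "section": "Navigation",
     "text": "Wait for the page to load before clicking."},
    {"op": "ADD", "skill_id": "s2", "section": "Forms",
     "text": "Fill required fields before submitting."},
    {"op": "UPDATE", "skill_id": "s1",
     "text": "Wait for the network to go idle before clicking."},
    {"op": "TAG", "skill_id": "s2", "tag": "helpful"},
]:
    book.apply(op)

print(render_prompt(book))
```

Because each operation touches a single skill, a batch of operations changes only the entries it names and leaves the rest of the Skillbook untouched.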

Key Technical Features

  • Delta updates — Incremental Skillbook modifications instead of full rewrites. Prevents information loss during adaptation.
  • Provenance tracking — Every skill records which trace produced it, enabling audit and debugging.
  • Helpful/harmful counters — Skills track their impact over time. Reinforced when helpful, flagged when harmful.
  • Embedding-based deduplication — Semantic similarity detection prevents duplicate skills from accumulating.
  • TOON encoding — Tab-delimited Skillbook serialization saving 16-62% tokens vs markdown in production.
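
The features above can be sketched together in a few lines. The example below is an illustration under stated assumptions, not Kayba's implementation: the class and method names are hypothetical, the embeddings are hand-written toy vectors rather than output from a real embedding model, and the 0.9 similarity threshold and the "harmful exceeds helpful" flagging rule are arbitrary choices.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Skill:
    text: str
    embedding: list
    source_traces: list = field(default_factory=list)  # provenance
    helpful: int = 0
    harmful: int = 0


class DedupSkillbook:
    """Hypothetical sketch: dedup on add, provenance, and impact counters."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cutoff, tune per embedding model
        self.skills = []

    def add(self, text, embedding, trace_id):
        """Merge into an existing near-duplicate, or store a new skill."""
        for skill in self.skills:
            if cosine(skill.embedding, embedding) >= self.threshold:
                if trace_id not in skill.source_traces:
                    skill.source_traces.append(trace_id)
                return skill
        skill = Skill(text, list(embedding), [trace_id])
        self.skills.append(skill)
        return skill

    def record_outcome(self, skill, helped):
        """Bump the helpful or harmful counter after a run that used the skill."""
        if helped:
            skill.helpful += 1
        else:
            skill.harmful += 1

    def flagged(self):
        """Skills whose harmful count exceeds their helpful count."""
        return [s for s in self.skills if s.harmful > s.helpful]


book = DedupSkillbook()
first = book.add("Wait for page load before clicking", [1.0, 0.0, 0.1], "trace-1")
again = book.add("Click only after the page has loaded", [0.98, 0.05, 0.12], "trace-2")
other = book.add("Escalate refunds over the limit", [0.0, 1.0, 0.0], "trace-3")
book.record_outcome(other, helped=False)
```

In this run the second `add` merges into the first skill, so that skill records both trace IDs, while the third skill is stored separately and ends up flagged after one harmful outcome.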

Hosted Dashboard (Optional)

For teams that want a visual interface: the hosted dashboard at use.kayba.ai provides Skillbook management, analysis pipelines, and prompt generation through a web UI. $29/month with bring-your-own API key.

The framework works entirely standalone — the dashboard is a convenience, not a dependency.

Built on Published Research

Most core concepts in Kayba trace back to published research; the remainder are Kayba's own production engineering:

| Concept | Source | What It Contributes |
|---|---|---|
| Three-agent architecture (Generator, Reflector, Curator) | ACE paper (ICLR 2026) | Structured pipeline for agent improvement |
| Delta updates | ACE paper | Incremental learning without information loss |
| REPL-based trace analysis | RLM paper (MIT CSAIL) | Deep, programmatic analysis beyond LLM context limits |
| Self-curated external memory | Dynamic Cheatsheet paper | Persistent skill storage with usage tracking |
| Embedding-based deduplication | Kayba implementation | Production optimization for Skillbook management |
| TOON encoding | Kayba implementation | Token-efficient Skillbook serialization |

This isn't a wrapper around an API. It's a framework built on specific research contributions, extended with production engineering (deduplication, encoding, provenance tracking).
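
As a rough illustration of why a tab-delimited encoding can be more compact than Markdown, the sketch below serializes the same records both ways. It is not the actual TOON encoder: the field names, the Markdown layout, and the rule that fields may not contain tabs or newlines are assumptions made for this example.

```python
FIELDS = ("id", "section", "text", "helpful", "harmful")  # assumed schema


def to_markdown(skills):
    """Verbose rendering: labels and markup are repeated on every row."""
    return "\n".join(
        f"- **{s['id']}** [{s['section']}] {s['text']} "
        f"(helpful: {s['helpful']}, harmful: {s['harmful']})"
        for s in skills
    )


def to_tab_delimited(skills):
    """Compact rendering: field names appear once, in a header row."""
    rows = ["\t".join(FIELDS)]
    for s in skills:
        values = [str(s[name]) for name in FIELDS]
        if any("\t" in v or "\n" in v for v in values):
            raise ValueError("fields must not contain tabs or newlines")
        rows.append("\t".join(values))
    return "\n".join(rows)


def from_tab_delimited(payload):
    """Parse the header plus rows back into dicts, restoring the int counters."""
    header, *rows = payload.split("\n")
    names = header.split("\t")
    skills = []
    for row in rows:
        record = dict(zip(names, row.split("\t")))
        record["helpful"] = int(record["helpful"])
        record["harmful"] = int(record["harmful"])
        skills.append(record)
    return skills


skills = [
    {"id": "s1", "section": "Navigation",
     "text": "Wait for the page to load before clicking.",
     "helpful": 3, "harmful": 0},
    {"id": "s2", "section": "Forms",
     "text": "Fill required fields before submitting.",
     "helpful": 1, "harmful": 2},
]

encoded = to_tab_delimited(skills)
print(len(to_markdown(skills)), len(encoded))
```

Character counts are only a proxy for token counts, so the printed lengths do not reproduce the savings figure quoted earlier; measuring that would require the tokenizer of the model in use.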

Who Uses It

Kayba is used by teams building:

  • Coding agents — Learning from code review failures, codebase conventions, test patterns
  • Customer support agents — Learning from policy violations, escalation mistakes, resolution patterns
  • Browser/computer-use agents — Learning from navigation failures, form-filling errors (30% → 100% success rate, 82% fewer steps, up to 2x consistency improvement on τ2-bench)
  • Internal tooling agents — Learning from operational patterns and team-specific workflows

Kayba is framework-agnostic: it works with LangChain, CrewAI, OpenAI Agents SDK, browser-use, AutoGen, or custom implementations.

Getting Started

pip install ace-framework

The quickest path:

  1. Install the framework
  2. Point it at your agent's execution traces
  3. Run analysis — the Recursive Reflector extracts skills automatically
  4. Review the Skillbook — approve, edit, or reject skills
  5. Generate an improved system prompt
  6. Deploy and repeat
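
Steps 4 and 5 above amount to a review gate in front of prompt generation. The sketch below shows one way to model that gate; every name in it (`Candidate`, `approve`, `reject`, `build_system_prompt`) is hypothetical and does not come from the `ace-framework` package, and the automated extraction in step 3 is replaced by hand-written candidates.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Candidate:
    text: str
    source_trace: str  # which trace the skill was extracted from
    status: Status = Status.PENDING


def approve(candidate, edited_text=None):
    """Approve a candidate, optionally replacing its text with an edit."""
    if edited_text is not None:
        candidate.text = edited_text
    candidate.status = Status.APPROVED


def reject(candidate):
    candidate.status = Status.REJECTED


def build_system_prompt(base_prompt, candidates):
    """Append only approved skills; pending and rejected ones are skipped."""
    approved = [c.text for c in candidates if c.status is Status.APPROVED]
    if not approved:
        return base_prompt
    bullets = "\n".join(f"- {text}" for text in approved)
    return f"{base_prompt}\n\nLearned skills:\n{bullets}"


# Stand-ins for what an analysis run might propose (step 3).
candidates = [
    Candidate("Confirm the order ID before issuing a refund.", "trace-17"),
    Candidate("Always apologize twice.", "trace-21"),
    Candidate("Escalate when the customer mentions legal action", "trace-30"),
]

approve(candidates[0])
reject(candidates[1])
approve(candidates[2], edited_text="Escalate to a human when legal action is mentioned.")

print(build_system_prompt("You are a support agent.", candidates))
```

Keeping the trace ID on each candidate means a reviewer can open the originating trace before deciding.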

Resources:

  • Documentation: Setup guides, API reference, examples
  • GitHub: Full source code, issues, discussions
  • Dashboard: Optional hosted interface
  • Discord: Community support