Agents & Subagents.
Agents that delegate, not just chat
Design multi-agent hierarchies, keep context focused, and evaluate every level with built-in evals.
From idea to agent in 4 steps
Create a task for each role in your system—orchestrator, researcher, analyst—with a prompt and schema.
Connect MCP tools, Kiln search tools, or other Kiln tasks as callable tools from the project Tools screen.
Pick tools and subagents per run, then save the config for repeat use.
Execute the agent, inspect every tool call and subagent invocation, then iterate on prompts and tool selection.
Agents in Kiln
Subagents: Levels of Autonomy
An agent is a Kiln task that loops autonomously — reasoning, calling tools, deciding when it's done. A subagent is any Kiln task turned into a tool. The parent delegates a focused job, the subagent runs in its own context. The same subagent can be reused across many parent workflows.
Focused Context Windows
Long-running agents accumulate context fast — web pages, API responses, intermediate reasoning. Quality degrades, costs spike. Kiln subagents fix this structurally: each runs in its own context window.
~4× cheaper context for web-research workloads
Evaluate Each Level
Every agent and subagent can be evaluated independently. Tool-use evals check tool calls and parameters. Evals measure output quality. Run configurations lock model, prompt, and tools per subagent — swap one variable, measure the impact across the whole system.
Everything you need for production agents
Turn any task into a callable subagent for multi-actor orchestration.
Connect local or remote MCP servers—APIs, databases, web search.
Agents loop through reasoning and tool calls until the job is done.
Lock model, prompt, and tools per subagent for reproducible runs.
Interleave reasoning and tool calls.
Inspect every tool call, subagent invocation, and message in one view.
Evaluate whether agents call the right tools with the right parameters.
MIT-licensed Python library, source-available app. Your project files stay on your machine.
Multi-agent systems before and after Kiln
- Wire together LangChain or custom glue code to orchestrate multiple agents—then maintain it.
- Watch context windows bloat with irrelevant tool output until quality degrades and costs spike.
- Evaluate the final answer and hope the intermediate steps are working correctly.
- Compose orchestrators and specialists by turning any Kiln task into a callable subagent.
- Subagent isolation manages context automatically—irrelevant data is dropped when a subtask ends.
- Evaluate every level of the hierarchy independently with tool-use evals and spec-based scoring.
Frequently asked
What is the difference between an agent and a subagent in Kiln?
An agent is a Kiln task that loops autonomously — reasoning and calling tools until it's done. A subagent is any Kiln task turned into a tool another task can call. The same task can be both: agent in one workflow, subagent in another.
How does context management work with subagents?
Each subagent runs in its own context window. When it completes, the full message history is discarded — only the final message returns to the parent. The parent's context stays small and focused, avoiding overflow and runaway costs.
Do I need to write code to build multi-agent systems?
No. Tasks, tools, and agent hierarchies are all built in the Kiln desktop app and saved to agent configuration files. The Python library runs the same configurations in production.
Can I connect agents I already built in LangChain, CrewAI, or custom code?
Yes. Wrap it as an MCP server (SDKs exist for Python, TypeScript, Go, Rust, and more) and Kiln can call it, eval it, and compare against Kiln-native agents — no rewrite required.
What tools can my agents use?
Any MCP-compatible tool server (local or remote), Kiln Search Tools for RAG, and other Kiln tasks as subagents. Connect any custom MCP server through the UI.
How do I evaluate agent quality?
Tool-use evals verify the right tools were called with the right parameters. LLM-as-Judge evals measure output quality. Because every subagent is a standalone Kiln task, you can evaluate any level of the hierarchy.
Ship agents you can trace and evaluate.
Compose multi-agent hierarchies, manage context automatically, and evaluate every level.