How serious teams build AI that actually works.
Everything teams need to build and optimize AI systems: evals, optimization, fine-tuning & more.
One Workbench.
Everything You Need.
Index, chunk, retrieve.
Reusable capabilities.
Compose tools, sub-agents.
Hand off and delegate.
Score every change.
Translate intent → Evals.
Auto-tune prompts to evals.
Distill into smaller models.
Generate, filter, label.
Chat to create experiments and optimize.
Datasets versioned in your repo.
Reviews from the whole team.
Python, MIT-licensed.
Solve a Problem Once.
It Stays Solved.
Our evals platform tracks quality across every axis, while our optimizers find the optimal AI configuration.
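As a rough illustration of the idea (not Kiln's API), the sketch below scores a few candidate configurations against a small golden dataset and keeps the best one. The names `GOLDEN`, `accuracy`, `best_config`, and both stand-in candidates are hypothetical, and exact-match accuracy is only one possible eval.

```python
import operator
from typing import Callable

# Hypothetical golden data: (input, expected output) pairs.
GOLDEN = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def always_four(question: str) -> str:
    """Stand-in for a weak configuration: ignores the input."""
    return "4"


def calculator(question: str) -> str:
    """Stand-in for a stronger configuration: evaluates 'a<op>b'."""
    for symbol, fn in OPS.items():
        if symbol in question:
            left, right = question.split(symbol)
            return str(fn(int(left), int(right)))
    return ""


def accuracy(candidate: Callable[[str], str], golden: list) -> float:
    """Fraction of golden examples the candidate answers exactly."""
    hits = sum(1 for question, expected in golden if candidate(question) == expected)
    return hits / len(golden)


def best_config(candidates: dict, golden: list) -> tuple:
    """Score every candidate on the golden data and return (name, score) of the best."""
    scores = {name: accuracy(fn, golden) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


CANDIDATES = {"always_four": always_four, "calculator": calculator}

if __name__ == "__main__":
    print(best_config(CANDIDATES, GOLDEN))
```

Because the score is recomputed from the same golden data on every change, a regression shows up as a lower number rather than as a vague impression.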
Built for the whole team.
Deploy anywhere with our open-source Python library.
Try new models or dispatch experiments in seconds. Replace vague specs with evals and golden data.
Contribute to quality without coding — feedback, ratings, evals and data generation.
AI that builds AI.
Kiln Assistant can run experiments and optimize your AI systems through conversation.
Every model.
Cloud or local.
Skip the guesswork — we've tested every model's capabilities.
Browse model library
An app for the team.
A library for production.
Work in our user-friendly app. Deploy Kiln tasks using our open-source Python library.
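# Import paths below are assumed from the kiln_ai package layout
from kiln_ai.adapters.adapter_registry import adapter_for_task
from kiln_ai.datamodel import Task

# Load a task definition from disk, along with its default run configuration
task = Task.load_from_file(TASK_PATH)
run_config = task.default_run_config()
# An adapter can run a task
adapter = adapter_for_task(
    task,
    run_config_properties=run_config.run_config_properties,
)
# Call this from inside an async function; `input` is the task input you supply
task_run, run_output = await adapter.invoke_returning_run_output(
    input,
)
You're in good company.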
People in our Community:
Kiln has increasingly become our go-to tool for new research. The fine-tuning, synthetic data, and evals have been invaluable.
Ship AI you can actually trust.
Free download, one-click install.