Documentation
The handbook.
Status pills tell you what is shippable today. Tanvrit Automator is in active development; some references forward-link to abstractions being built in parallel.
Concepts
How the agent thinks. Read these first; everything else builds on them.
Perceive · Plan · Execute · Verify · Record
Ready: The five-phase loop that drives every agent run. State machine, transition rules, retry caps.
Perception strategies
Ready: DOM-first via the Playwright a11y tree; vision fallback via Qwen2.5-VL + Tesseract OCR.
Bench corpus
Ready: How regressions are detected. The gate for any PR touching agent / llm / perception / execution.
Installation
Get the desktop app running and pull a planner model. Five minutes end-to-end.
Configuration
AppPreferences JSON shape, environment variables, and per-project overrides.
AppPreferences (~/.automator/prefs.json)
Draft: plannerModel, embeddingModel, visionModel, browserEngine, browserDevice, headless.
Environment variables
Draft: AUTOMATOR_LLM_PROVIDER, OLLAMA_URL, OLLAMA_MODEL, ANTHROPIC_API_KEY, AGENT_SMART_AUTOFILL.
Per-project overrides
In progress: A .automator.json file in your project root overrides global preferences for runs from that directory.
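As a rough illustration of how a per-project file might layer over the global preferences, here is a minimal Kotlin sketch. Only the six field names come from the list above. The types, the `PrefsOverride` class, the `merge` function, and every example value are assumptions made for illustration, not the actual AppPreferences API.

```kotlin
// Hypothetical model of ~/.automator/prefs.json; field types are assumed.
data class Prefs(
    val plannerModel: String,
    val embeddingModel: String,
    val visionModel: String,
    val browserEngine: String,
    val browserDevice: String,
    val headless: Boolean,
)

// Hypothetical model of .automator.json; null means "not overridden".
data class PrefsOverride(
    val plannerModel: String? = null,
    val embeddingModel: String? = null,
    val visionModel: String? = null,
    val browserEngine: String? = null,
    val browserDevice: String? = null,
    val headless: Boolean? = null,
)

// Project values win wherever they are present; global values fill the rest.
fun Prefs.merge(o: PrefsOverride): Prefs = copy(
    plannerModel = o.plannerModel ?: plannerModel,
    embeddingModel = o.embeddingModel ?: embeddingModel,
    visionModel = o.visionModel ?: visionModel,
    browserEngine = o.browserEngine ?: browserEngine,
    browserDevice = o.browserDevice ?: browserDevice,
    headless = o.headless ?: headless,
)

fun main() {
    // All values below are placeholders, not shipped defaults.
    val global = Prefs("planner-a", "embed-a", "vision-a", "chromium", "desktop", true)
    val project = PrefsOverride(browserEngine = "firefox", headless = false)
    println(global.merge(project))
}
```

A field-by-field merge keeps an override file small: a project states only what differs from the global file.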
Agent loop & states
The internals. Read after you have run the sample flow.
AgentState enum
Draft: IDLE → PERCEIVING → PLANNING → EXECUTING → VERIFYING → RECORDING → IDLE. Plus FAILED, STUCK, COMPLETE.
AgentAction sealed class
Draft: Every action the planner can emit. Click, Type, Navigate, Scroll, Hover, Drag, Screenshot, Wait, Done.
Verifier semantics
Draft: URL + DOM hash + visible text comparison. When does a step count as success?
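The three pieces above can be sketched together in a few lines of Kotlin. The state names and action names are taken from this page; everything else is assumed for illustration. That includes the `next` function, treating FAILED, STUCK, and COMPLETE as terminal, the action fields, `PageSnapshot`, `Expectation`, and the "all specified checks must match" success rule. The linked pages define the real behaviour.

```kotlin
enum class AgentState {
    IDLE, PERCEIVING, PLANNING, EXECUTING, VERIFYING, RECORDING, FAILED, STUCK, COMPLETE
}

// Happy-path successor of each state; terminal states (assumed) have none.
fun next(state: AgentState): AgentState? = when (state) {
    AgentState.IDLE -> AgentState.PERCEIVING
    AgentState.PERCEIVING -> AgentState.PLANNING
    AgentState.PLANNING -> AgentState.EXECUTING
    AgentState.EXECUTING -> AgentState.VERIFYING
    AgentState.VERIFYING -> AgentState.RECORDING
    AgentState.RECORDING -> AgentState.IDLE
    AgentState.FAILED, AgentState.STUCK, AgentState.COMPLETE -> null
}

// Action names come from the handbook; the fields are hypothetical.
sealed class AgentAction {
    data class Click(val selector: String) : AgentAction()
    data class Type(val selector: String, val text: String) : AgentAction()
    data class Navigate(val url: String) : AgentAction()
    data class Scroll(val deltaY: Int) : AgentAction()
    data class Hover(val selector: String) : AgentAction()
    data class Drag(val from: String, val to: String) : AgentAction()
    object Screenshot : AgentAction()
    data class Wait(val millis: Long) : AgentAction()
    object Done : AgentAction()
}

// Hypothetical verifier inputs: what the page looks like vs. what was expected.
data class PageSnapshot(val url: String, val domHash: String, val visibleText: String)
data class Expectation(
    val url: String? = null,
    val domHash: String? = null,
    val textContains: String? = null,
)

// Assumed rule: a step succeeds when every check that was specified matches.
fun verify(actual: PageSnapshot, expected: Expectation): Boolean {
    val urlOk = expected.url?.let { it == actual.url } ?: true
    val hashOk = expected.domHash?.let { it == actual.domHash } ?: true
    val textOk = expected.textContains?.let { actual.visibleText.contains(it) } ?: true
    return urlOk && hashOk && textOk
}

fun main() {
    // Walk one full pass of the loop, starting and ending at IDLE.
    var state = AgentState.IDLE
    val visited = mutableListOf(state)
    do {
        state = next(state) ?: break
        visited += state
    } while (state != AgentState.IDLE)
    println(visited.joinToString(" → "))

    val action: AgentAction = AgentAction.Navigate("https://example.test/home")
    val page = PageSnapshot("https://example.test/home", "abc123", "Welcome back")
    println("$action verified: ${verify(page, Expectation(textContains = "Welcome"))}")
}
```

Modelling actions as a sealed class lets a `when` over `AgentAction` be checked for exhaustiveness at compile time, which is one plausible reason for that design.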
Scenarios DSL
Describe what you want the agent to do. Forward-references commonMain abstractions being built in parallel.
Goal-driven scenarios
In progress: One natural-language goal. The planner figures out the steps. Best for exploratory testing.
Pinned-action scenarios
In progress: Lock specific actions to specific steps. Best for deterministic regression replay.
Pre / post conditions
Not started: Setup actions before the run starts; teardown actions after. Failure modes.
MCP server
Expose the agent's 31 tools to any MCP-compatible client (Claude Desktop, Cursor, Zed, custom).
Bench corpus
Lock in good behaviour. AssertionEvaluator compares new runs to the canonical bench.
API reference
For library consumers. Public Kotlin API exposed via commonMain interfaces.
Examples
Worked examples, end-to-end. Clone, run, modify.
GitHub login flow
Ready: The default sample. Two-factor handled, success verified by URL hash.
SPA with mostly-canvas DOM
Draft: When DOM perception returns nothing useful and vision fallback takes over.
Cross-browser parity
In progress: Same scenario, run against Chromium, Firefox, and WebKit. Diff the results.
Something missing?
The docs are evolving with the product. Tell us what would help.
Send docs feedback