LLM as Compiler Architecture¶
The core architectural insight behind Trailblaze — treating the LLM as a compiler rather than a chatbot.
Background¶
Traditional UI test frameworks require developers to write explicit, imperative test code. We want to enable natural language test authoring while maintaining deterministic execution.
What we decided¶
Trailblaze treats the LLM as a compiler that transforms natural language test cases into deterministic tool sequences.
The Compiler Metaphor¶
Natural Language → LLM + Agent + Tools → Trail Recording
(Source) (Compiler) (Output/IR)
| Concept | Traditional Compiler | Trailblaze |
|---|---|---|
| Source | Code (.c, .kt) | Natural language test steps |
| Compiler | gcc, kotlinc | LLM + Trailblaze Agent |
| IR/Output | Assembly, bytecode | Trail YAML (tool sequence) |
| Runtime | CPU, JVM | Device + Maestro/Tools |
Compilation Flow¶
Test Case Steps → LLM interprets steps → Execute tools on device
↓ ↓ ↓
Natural Language Agent orchestration Success/Failure
↓ ↓ ↓
On failure: retry Record successful run
with context as .trail.yaml
Key Properties¶
- Compilation happens once: First successful run is recorded
- Replay is deterministic: Subsequent runs use recording, no LLM needed
- Self-healing on failure: LLM can adapt and retry when UI changes
- Recompilation on demand: Force AI mode to generate new recording
Agent Loop¶
- LLM receives test step + current screen state
- LLM selects and invokes tools
- Tools execute via Maestro/device drivers
- On success → record tool invocation
- On failure → provide error context, retry
- After all steps → save complete
.trail.yaml
What changed¶
Positive: Natural language authoring, deterministic replay, self-healing capability, familiar mental model for engineers.
Negative: Initial “compilation” requires LLM (cost/latency); recordings may need “recompilation” when UI changes significantly.