Architecture

This document describes the software architecture and key design decisions of the Trailblaze framework.

Technology Stack

Technology                Purpose
Kotlin                    Primary language for all modules (Kotlin Multiplatform where applicable)
Gradle (Kotlin DSL)       Build system with version catalogs (libs.versions.toml)
Kotlin Serialization      JSON serialization for tools, logs, and API communication
Kotlin Coroutines         Async execution for tool calls and LLM interactions
Compose Multiplatform     Desktop application UI
Compose Material 3        UI component library and theming
Ktor                      HTTP client for LLM API calls, server for log aggregation

Why Kotlin?

  • Multiplatform support - Share code between Android on-device and JVM host modes
  • Coroutines - Clean async APIs for LLM calls and tool execution
  • Serialization - First-class support for JSON serialization with annotations
  • Type safety - Catch errors at compile time, especially important for tool definitions
  • Android ecosystem - Native integration with Android instrumentation testing

Kotlin Multiplatform Strategy

We aim to write as much code as possible in commonMain source sets for maximum reusability:

src/
├── commonMain/     # Preferred - platform-agnostic code
├── jvmMain/        # JVM-specific (desktop, host mode)
└── androidMain/    # Android-specific (on-device mode)

Why commonMain? Code in commonMain can be shared across all targets, reducing duplication and ensuring consistent behavior.

JVM Constraints: Some code must be JVM-specific due to dependencies:

  • Maestro - The device interaction library is JVM-only, so driver implementations live in jvmMain or androidMain
  • File I/O - Some file operations use JVM-specific APIs
  • Desktop UI - Compose Desktop targets JVM

When adding new code, prefer commonMain unless there’s a specific platform requirement.
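
As a hedged illustration of how platform-specific code is isolated, Kotlin's expect/actual mechanism lets commonMain declare an API that each target implements. The function below is a hypothetical example, not part of the Trailblaze codebase:

// commonMain: platform-agnostic declaration (hypothetical example)
expect fun currentTimeMillis(): Long

// jvmMain and androidMain: platform-specific implementations
actual fun currentTimeMillis(): Long = System.currentTimeMillis()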

UI with Compose Material 3

The Desktop application uses Compose Multiplatform with Material 3:

  • Material 3 Components - Buttons, cards, dialogs, navigation, etc.
  • Dynamic Theming - Support for light/dark modes
  • Material Icons Extended - Comprehensive icon library
  • Multiplatform Markdown Renderer - For displaying test documentation

UI code lives primarily in trailblaze-ui (shared components) and trailblaze-desktop (application shell).

Build Structure

The project uses Gradle with Kotlin DSL and a multi-module structure:

opensource/
├── build.gradle.kts              # Root build configuration
├── gradle/libs.versions.toml     # Centralized dependency versions
├── trailblaze-agent/             # Core agent logic
├── trailblaze-android/           # Android on-device driver
├── trailblaze-common/            # Shared utilities and tools
├── trailblaze-desktop/           # Desktop application
├── trailblaze-host/              # Host-mode driver
├── trailblaze-models/            # Data models (Kotlin Multiplatform)
├── trailblaze-report/            # Reporting utilities
├── trailblaze-server/            # Log server
└── trailblaze-ui/                # Shared UI components
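
Each module declares its dependencies through the shared version catalog. A hedged sketch of what a module's build.gradle.kts might look like; the catalog aliases below are illustrative and may not match the actual entries in libs.versions.toml:

plugins {
    alias(libs.plugins.kotlin.multiplatform)
}

kotlin {
    sourceSets {
        val commonMain by getting {
            dependencies {
                // Versions are resolved from gradle/libs.versions.toml
                implementation(libs.kotlinx.serialization.json)
                implementation(libs.kotlinx.coroutines.core)
            }
        }
    }
}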

Overview

Trailblaze uses a layered architecture that separates AI reasoning from platform-specific execution:

flowchart TB
    subgraph agent["Agent Layer"]
        LLM["LLM Provider"]
        SP["System Prompt"]
        TC["Tool Calling"]
    end
    subgraph syntax["Test Syntax Layer"]
        YAML[".trail.yaml Parser"]
        PROMPT["Prompt Steps"]
        ACTION["Action Steps"]
    end
    subgraph tools["Tool Layer"]
        ET["Executable Tools"]
        DT["Delegating Tools"]
        CT["Custom Tools"]
    end
    subgraph drivers["Platform Driver Layer"]
        AD["Android Driver"]
        ID["iOS Driver"]
    end
    agent --> syntax
    syntax --> tools
    tools --> drivers

Design Principles

1. Platform Agnostic Agent

The Trailblaze agent is designed to be platform-agnostic. It reasons about UI interactions using natural language and a standardized tool interface, without knowledge of the underlying platform implementation.

Why? This allows the same test logic to work across different platforms and enables future extensibility.

2. Tool-Based Execution

All device interactions are expressed as tools that the agent can invoke. Tools provide a clean abstraction between the agent’s intent and the platform-specific implementation.

Why? Tools are:

  • Composable and reusable
  • Easy to extend with custom implementations
  • Testable in isolation
  • Self-documenting via annotations

3. Deterministic + AI Hybrid

Tests can combine deterministic actions (always execute the same way) with AI-driven steps (adapt to screen state). This provides both reliability and flexibility.

Why? Pure AI tests can be unpredictable; pure scripted tests are brittle. The hybrid approach gives test authors control over the trade-off.

Key Components

MaestroTrailblazeAgent

The abstract base class for all Trailblaze agents. It provides:

  • Tool execution framework
  • Memory for storing data between steps
  • Logging integration
  • Support for both blocking and suspending execution

abstract class MaestroTrailblazeAgent : TrailblazeAgent {
    abstract suspend fun executeMaestroCommands(commands: List<Command>, traceId: TraceId?): TrailblazeToolResult
    val memory = AgentMemory()
    fun runTrailblazeTools(tools: List<TrailblazeTool>): RunTrailblazeToolsResult
}

Implementations:

  • AndroidMaestroTrailblazeAgent - On-device Android execution
  • HostMaestroTrailblazeAgent - Host-mode execution for Android/iOS

Tool System

Tools are the primary abstraction for device interactions. There are two main types:

ExecutableTrailblazeTool

Tools that directly perform actions:

interface ExecutableTrailblazeTool : TrailblazeTool {
    suspend fun execute(toolExecutionContext: TrailblazeToolExecutionContext): TrailblazeToolResult
}

Examples: tapOnElementWithText, inputText, assertVisible

DelegatingTrailblazeTool

Tools that expand into multiple executable tools:

interface DelegatingTrailblazeTool : TrailblazeTool {
    fun toExecutableTrailblazeTools(context: TrailblazeToolExecutionContext): List<ExecutableTrailblazeTool>
}

Why delegating tools? They allow complex, multi-step operations to be exposed as single tool calls to the agent, reducing token usage and improving reliability.
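
A hedged sketch of a delegating tool; the class name and the executable tools it expands into (InputTextTool, TapOnElementWithTextTool) are hypothetical, not part of the framework API:

@Serializable
@TrailblazeToolClass("loginFlow")
@LLMDescription("Log in with the provided email and password")
data class LoginFlowTool(
    val email: String,
    val password: String
) : DelegatingTrailblazeTool {
    override fun toExecutableTrailblazeTools(
        context: TrailblazeToolExecutionContext
    ): List<ExecutableTrailblazeTool> = listOf(
        // Each executable tool runs in order, as if the agent had called it directly.
        // InputTextTool and TapOnElementWithTextTool are illustrative placeholders.
        InputTextTool(text = email),
        InputTextTool(text = password),
        TapOnElementWithTextTool(text = "Log in")
    )
}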

Custom Tools

Applications can define custom tools for app-specific interactions:

@Serializable
@TrailblazeToolClass("signInWithEmailAndPassword")
@LLMDescription("Sign in using email and password")
data class SignInTool(val email: String, val password: String) : MapsToMaestroCommands {
    override fun toMaestroCommands(): List<Command> = listOf(/* commands */)
}

Why? Custom tools let teams encode domain knowledge and reduce test verbosity.

Extensibility

Trailblaze is designed to be extended without modifying the core framework. There are several extension points:

Custom Tools

The primary extension mechanism. Define tools as Kotlin data classes with annotations:

@Serializable
@TrailblazeToolClass("myCustomAction")
@LLMDescription("Description shown to the LLM")
data class MyCustomTool(
    val param1: String,
    val param2: Int
) : ExecutableTrailblazeTool {
    override suspend fun execute(context: TrailblazeToolExecutionContext): TrailblazeToolResult {
        // Custom implementation
        return TrailblazeToolResult.Success
    }
}

Tools are automatically:

  • Serialized to JSON for LLM function calling (see the sketch after this list)
  • Documented via @LLMDescription annotations
  • Validated at compile time via Kotlin’s type system
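
As a rough illustration of the serialization point, the MyCustomTool definition above round-trips through kotlinx.serialization as plain JSON; the exact wire format Trailblaze sends to the LLM may differ:

import kotlinx.serialization.json.Json

fun main() {
    val tool = MyCustomTool(param1 = "hello", param2 = 42)

    // kotlinx.serialization generates the serializer from the @Serializable annotation
    val json = Json.encodeToString(MyCustomTool.serializer(), tool)
    println(json) // {"param1":"hello","param2":42}

    // The same schema is used to decode tool calls returned by the LLM
    val decoded = Json.decodeFromString(MyCustomTool.serializer(), json)
    check(decoded == tool)
}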

Test Rules

Create reusable test base classes that configure the agent:

class MyAppTrailblazeTest(
    testTimeoutMinutes: Int = 10
) : TrailblazeTestRule(
    customTools = listOf(MyCustomTool::class),
    // ... other configuration
)

Platform Drivers

Implement MaestroTrailblazeAgent to add support for new platforms:

class WebTrailblazeAgent : MaestroTrailblazeAgent() {
    override suspend fun executeMaestroCommands(
        commands: List<Command>,
        traceId: TraceId?
    ): TrailblazeToolResult {
        // Translate commands to web automation, then report the outcome
        return TrailblazeToolResult.Success
    }
}

LLM Providers

The agent communicates with LLMs through an abstraction that can be swapped:

  • Default: OpenAI-compatible API
  • Extensible to other providers (Anthropic, local models, etc.)
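
A hypothetical sketch of what a swappable provider abstraction could look like; the names and shapes below are illustrative, not the actual Trailblaze API:

// Illustrative only: the real Trailblaze abstraction may differ.
data class ChatMessage(val role: String, val content: String)

sealed interface LlmResponse {
    data class Text(val content: String) : LlmResponse
    data class ToolCall(val toolName: String, val argumentsJson: String) : LlmResponse
}

interface LlmProvider {
    // Sends the conversation (and available tool definitions) to the model
    // and returns either a text reply or a tool call to execute.
    suspend fun chat(messages: List<ChatMessage>): LlmResponse
}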

Test Syntax (.trail.yaml)

Tests are authored in .trail.yaml format, which supports:

Prompt Steps (AI-driven)

- prompts:
    - step: Navigate to the Settings screen and enable notifications

Action Steps (Deterministic)

- maestro:
    - tapOn: "Settings"
    - assertVisible: "Notifications"

Recording Hints

- prompts:
    - step: Tap the login button
      recording:
        tools:
        - tapOnElementWithText:
            text: "Login"

Recording hints guide the agent toward specific tool calls while still allowing adaptation.

Execution Modes

On-Device Mode (Android)

flowchart LR
    subgraph device["Android Device"]
        TEST["Test Process"]
        AGENT["Trailblaze Agent"]
        DRIVER["Android Driver"]
        APP["App Under Test"]
    end
    LLM["LLM API"]

    TEST --> AGENT
    AGENT <--> LLM
    AGENT --> DRIVER
    DRIVER --> APP

The agent runs inside the test process on the device. This enables:

  • Integration with cloud device farms (Firebase Test Lab, AWS Device Farm)
  • No host machine required
  • Parallel execution at scale

Host Mode

flowchart LR
    subgraph host["Host Machine"]
        DESKTOP["Desktop App"]
        AGENT["Trailblaze Agent"]
    end
    subgraph device["Device"]
        DRIVER["Platform Driver"]
        APP["App Under Test"]
    end
    LLM["LLM API"]

    DESKTOP --> AGENT
    AGENT <--> LLM
    AGENT --> DRIVER
    DRIVER --> APP

The agent runs on a host machine and controls devices remotely. This enables:

  • Cross-platform testing (Android + iOS)
  • Interactive test authoring via Desktop app
  • Local development workflow

Platform Drivers

Platform drivers translate tool calls into device-specific actions.

Current Implementation

For mobile platforms, Trailblaze uses Maestro internally as the device interaction layer. However:

  • This is an implementation detail - users interact with Trailblaze tools, not Maestro directly
  • Not all Maestro features are exposed - only the subset needed for Trailblaze tools
  • The driver is replaceable - the architecture supports alternative implementations

Driver Interface

Drivers must implement command execution:

abstract suspend fun executeMaestroCommands(
    commands: List<Command>,
    traceId: TraceId?
): TrailblazeToolResult

Logging & Reporting

All tool executions are logged with:

  • Tool name and parameters
  • Execution duration
  • Success/failure status
  • Trace IDs for correlation
  • Session information

Logs are sent to the Trailblaze server for aggregation and reporting.

Memory System

The AgentMemory class allows tests to store and retrieve data:

// Store a value
memory.set("username", "test@example.com")

// Retrieve in a later step
val username = memory.get("username")

Why? Memory enables multi-step tests where later steps depend on data from earlier steps (e.g., storing a generated ID).

Code Conventions

Kotlin Style

  • Follow Kotlin coding conventions
  • Use coroutines for async operations (avoid callbacks)
  • Prefer val over var (immutability)
  • Use data classes for DTOs and tool definitions
  • Use sealed classes for restricted hierarchies (e.g., TrailblazeToolResult); see the sketch after this list
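
A hedged sketch of a sealed result hierarchy in the spirit of TrailblazeToolResult; the actual members in the framework may differ:

sealed class ToolResult {
    object Success : ToolResult()
    data class Error(val message: String) : ToolResult()
}

// Exhaustive when-expression: the compiler verifies every case is handled
fun describe(result: ToolResult): String = when (result) {
    ToolResult.Success -> "ok"
    is ToolResult.Error -> "failed: ${result.message}"
}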

Tool Definitions

  • Always annotate with @Serializable, @TrailblazeToolClass, and @LLMDescription
  • Keep tool parameters simple (primitives, strings, enums)
  • Write clear @LLMDescription text - the LLM uses this to decide when to call the tool
  • Return TrailblazeToolResult.Success or TrailblazeToolResult.Error with a helpful message

Testing

  • Use JUnit 4 for tests (org.junit.Test)
  • Create fake/mock agents for unit testing tools
  • Integration tests should use real devices or emulators

Logging

  • Use TrailblazeLogger for structured logging
  • Include traceId for correlation across tool calls
  • Log tool execution start, completion, and errors

Future Considerations

The architecture is designed to support:

  • Additional platform drivers - Web, desktop, or custom UI frameworks
  • Multiple LLM providers - Currently supports OpenAI, extensible to others
  • Enhanced recording - Capture and replay interactions with higher fidelity
  • Distributed execution - Coordinate agents across multiple devices