Trailblaze Decision 047: on-device tool-callback channel (Option B)¶
Consumer: app-specific broadcast-tool migration.
Problem¶
A TypeScript tool bundled on-device can invoke its own handler (that
wiring has landed). It cannot compose other Trailblaze tools from inside
that handler — ctx.client.callTool("mobile_listInstalledApps", {…})
has nowhere to dispatch to. The host path sends the request to the
daemon’s /scripting/callback endpoint (see the 2026-04-22 envelope-
migration devlog). On-device there is no daemon and no HTTP server.
Options considered¶
A — embed a minimal HTTP server in the instrumentation process.
Stand up Ktor (or a hand-rolled HttpServer) on-device, mount
/scripting/callback, populate _meta.trailblaze.baseUrl with
http://localhost:<port>. TS SDK unchanged.
- Pro: zero TS branching; single wire shape; reuses the endpoint code.
- Con: HTTP server inside instrumentation. Port binding is iffy on locked-down device farms. Lifecycle ceremony (bind on rule start, unbind on teardown). Serializing a JSON body through the OS loopback interface to dispatch a call that already lives in the same JVM is silly.
B — direct QuickJS binding, no HTTP.
QuickJsBridge grows a __trailblazeCallback async function binding
next to the existing __trailblazeDeliverToKotlin. The TS SDK’s
client.callTool detects the in-process runtime via
ctx.runtime === "ondevice" and calls the binding; Kotlin dispatches
through the same JsScriptingInvocationRegistry the HTTP endpoint uses.
- Pro: no HTTP surface; zero syscalls; best fit for same-process architecture. Reuses the registry + depth gate.
- Con: TS SDK grows a branch. Must keep wire semantics identical to the HTTP path (error shapes, depth gate, timeouts) or authors see environment-dependent behavior.
C — adb reverse back to a host daemon.
Works only when a laptop is running ./trailblaze. Doesn’t cover the
cloud device-farm shard — which is the most common on-device test
environment and the one we need to make CI-green. Non-starter as a
primary.
Decision — Option B¶
Pick B. The cost of the SDK branch (≈30 lines, fully covered by the
integration test) is cheaper than running an HTTP server inside an
Android instrumentation process to serialize a call back into the same
JVM. Option A was tempting because it keeps the TS SDK simple, but the
complexity it adds lives in a worse place (OS sockets, port
lifecycle, device-farm networking edge cases) than the complexity B
adds (one binding and one if branch on a well-understood envelope
field).
Implementation shape¶
- Envelope:
_meta.trailblazegains an optionalruntimefield."ondevice"when Kotlin-side is the:trailblaze-scripting-bundleruntime; absent on subprocess / daemon paths (TS SDK treats absent as the HTTP path for backward compat with every existing host envelope). - TS SDK:
client.callToolinspectsctx.runtime. If"ondevice", serialize the sameJsScriptingCallbackRequestthat the HTTP path builds,await globalThis.__trailblazeCallback(json), parse theJsScriptingCallbackResponse, surface errors the same way. No other path differences —JsScriptingCallbackResult.Error/CallToolResult(success: false)/ etc. all map to the same thrownErrorshape. - Kotlin — shared dispatcher: extract the session-match / depth
gate / dispatch core out of
ScriptingCallbackEndpointinto aJsScriptingCallbackDispatcherobject in:trailblaze-common. The HTTP endpoint keeps its HTTP-specific concerns (loopback gate, body-size cap, content-type); the on-device binding callsJsScriptingCallbackDispatcherdirectly. Single source of truth for the dispatch semantics. - Kotlin — binding:
QuickJsBridgeregisters__trailblazeCallback(requestJson: string): Promise<string>. The binding parsesJsScriptingCallbackRequest, callsJsScriptingCallbackDispatcher, serializes theJsScriptingCallbackResponseand resolves. Errors that escape (malformed JSON, etc.) resolve with aJsScriptingCallbackResult.Errorenvelope — the TS client surfaces them the same way the HTTP path does for non-2xx responses. - Bundle tool wiring:
BundleTrailblazeToolgains an optionalJsScriptingCallbackContext(toolRepo)(nobaseUrl— the binding replaces it). Atexecute()time, register withJsScriptingInvocationRegistry, build_meta.trailblazewithruntime: "ondevice"+invocationId, close the handle infinally. Same register-close shapeSubprocessTrailblazeTooluses.
What stays the same¶
- The
JsScriptingCallbackContractwire types (JsScriptingCallbackRequest,JsScriptingCallbackAction.CallTool,JsScriptingCallbackResult,JsScriptingCallbackResponse) are unchanged. Adding a new transport, not a new contract. JsScriptingInvocationRegistryis unchanged. It’s already process-wide — the on-device binding registers into the same singleton the HTTP endpoint reads from.- Depth gate + timeout knobs (
JsScriptingCallbackDispatchDepth,MAX_CALLBACK_DEPTH,trailblaze.callback.timeoutMs) carry over verbatim via the shared dispatcher.
Non-goals¶
- Bundler automation is still separate. This work lands with
a hand-crafted test fixture JS that mirrors what the bundler will
eventually emit, same pattern as the existing
OnDeviceBundleRoundTripTest. - No new MCP primitives. The TS surface (
ctx.client.callTool) is unchanged; we’re only teaching it a second transport. - No retry / backoff on the binding call. Binding failures are programmer bugs (malformed JSON, registry miss) and should fail loudly, not silently retry.
Acceptance¶
A bundled TS tool that calls ctx.client.callTool("otherTool", {…})
from inside its handler produces the same composed result on-device as
on the host subprocess path. The uitest-sample-app CI step
exercises this every PR; a regression fails the required check.