
Runtime

define_runtime(...) produces a validated RuntimeSpec; create_runtime(...) hands the spec to the Rust runtime (flowai-runtime) and returns a native Runtime handle.

import asyncio

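from flowai_harness import Runtime, TestingConfig, create_runtime, define_runtime, define_tenant

# coordinator, planner, executor, specialist, and ProductSet are assumed to be
# defined elsewhere (see Agents and References & Glimpses).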

tenant = define_tenant("acme", "v1")

runtime_spec = define_runtime(
    tenant=tenant,
    agents=[coordinator, planner, executor, specialist],
    references=[ProductSet],
    providers={"anthropic": {"apiKeyEnv": "ANTHROPIC_API_KEY"}},
)

runtime: Runtime = create_runtime(
    runtime_spec,
    testing=TestingConfig(mock_response="hello from the Rust runtime"),
)

async def main() -> None:
    async for event in runtime.query("Say hello", thread_id="thread-1"):
        print(event)

asyncio.run(main())

define_runtime

define_runtime collects tenant identity, agents, references, plans, toolkits, storage descriptors, and provider config into one pure data spec. Plans, toolkits, and tool bindings declared on agents are attached to the spec automatically, so they do not need to be passed again.

Validation rules enforced at construction time:

  • Agent names must be unique.
  • At most one coordinator.
  • Every routes target must reference a registered agent.
  • An agent cannot route to itself, and cannot list a route twice.
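
The four rules above can be sketched in plain Python. This is an illustration only, not the flowai_harness implementation: AgentSketch, its role and routes fields, and the choice of ValueError are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AgentSketch:
    """Hypothetical stand-in for an agent definition (not the library type)."""

    name: str
    role: str = "specialist"  # assumed role label, e.g. "coordinator"
    routes: list[str] = field(default_factory=list)


def validate_agents(agents: list[AgentSketch]) -> None:
    """Raise ValueError on the first violated rule; return None if all pass."""
    names = [agent.name for agent in agents]
    if len(names) != len(set(names)):
        raise ValueError("agent names must be unique")

    coordinators = [agent.name for agent in agents if agent.role == "coordinator"]
    if len(coordinators) > 1:
        raise ValueError(f"at most one coordinator allowed, got {coordinators}")

    registered = set(names)
    for agent in agents:
        seen: set[str] = set()
        for target in agent.routes:
            if target == agent.name:
                raise ValueError(f"{agent.name!r} routes to itself")
            if target in seen:
                raise ValueError(f"{agent.name!r} lists route {target!r} twice")
            if target not in registered:
                raise ValueError(f"{agent.name!r} routes to unknown agent {target!r}")
            seen.add(target)
```

Checking everything up front means a bad routing graph fails when the spec is built rather than partway through a run.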

create_runtime

create_runtime constructs the native handle. Python supplies callback adapters only; the agent loop, provider routing, approval gates, plan lifecycle, reference storage, and stream generation stay in the runtime.

The runtime tenant comes from runtime_spec.tenant.resource_id. There is no per-call tenant override.

runtime = create_runtime(
    runtime_spec,
    interpreter="anthropic",
)

The interpreter argument selects one of three interpreters:

  • "noop" — the default. It does not call a model; the loop starts and finishes immediately. Useful for validating specs and wiring.
  • "scripted" — replays tool calls scripted in the prompt through real tool dispatch, without contacting a provider.
  • "anthropic" — real model calls; requires an anthropic entry in define_runtime(..., providers=...).

Separately, testing=TestingConfig(mock_response=...) configures the deterministic testing interpreter, which streams the canned response back. testing= is mutually exclusive with a non-default interpreter=: passing both makes create_runtime raise a ValueError. See Testing for the full testing workflow.
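
The interaction between interpreter= and testing= can be sketched as a small argument check. The function name resolve_interpreter and the "testing" return label are hypothetical; only the three interpreter names, the "noop" default, and the ValueError come from the text above.

```python
KNOWN_INTERPRETERS = {"noop", "scripted", "anthropic"}


def resolve_interpreter(interpreter="noop", testing=None):
    """Return the name of the interpreter that would run (illustrative only)."""
    if interpreter not in KNOWN_INTERPRETERS:
        raise ValueError(f"unknown interpreter: {interpreter!r}")
    if testing is not None:
        # testing= may only be combined with the default interpreter.
        if interpreter != "noop":
            raise ValueError("testing= cannot be combined with a non-default interpreter=")
        return "testing"  # hypothetical label for the deterministic testing interpreter
    return interpreter
```

Under these assumptions, resolve_interpreter(testing={"mock_response": "hi"}) selects the testing interpreter, while resolve_interpreter("anthropic", testing={"mock_response": "hi"}) raises.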

Built-in toolkits such as catalog receive their Rust-side dependencies through the data_environment argument.

runtime = create_runtime(
    runtime_spec,
    data_environment={
        "target_database_url": "sqlite:/path/to/acme.db",
        "catalog": {"kind": "inline", "entries": []},
        "catalog_search": {"index_path": "/path/to/catalog-index"},
    },
)

Python tool handlers receive host application services through services. This is the customer-owned dependency-injection path for objects that cannot be serialized into the Rust runtime spec, such as SDK clients, repositories, or domain services.

runtime = create_runtime(
    runtime_spec,
    services={"product_catalog": product_catalog_client},
)

Inside a Python tool handler, the same object is available as ctx.product_catalog, ctx["product_catalog"], and ctx.services["product_catalog"]. Service names must be non-empty strings and cannot use reserved runtime keys such as tool_use_id, services, or references.
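
The three access paths and the naming rules can be modelled with a small context object. This is a sketch and not the real handler context: ToolContextSketch is hypothetical, and RESERVED_KEYS contains only the three names listed above, so the actual reserved set may be larger.

```python
RESERVED_KEYS = frozenset({"tool_use_id", "services", "references"})


class ToolContextSketch:
    """Hypothetical context exposing services by attribute, item, and mapping."""

    def __init__(self, services, tool_use_id="tool-use-demo"):
        for name in services:
            if not isinstance(name, str) or not name:
                raise ValueError(f"service name must be a non-empty string: {name!r}")
            if name in RESERVED_KEYS:
                # A service called "services" would shadow the mapping itself.
                raise ValueError(f"service name {name!r} is reserved")
        self.tool_use_id = tool_use_id
        self.services = dict(services)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        services = self.__dict__.get("services", {})
        if name in services:
            return services[name]
        raise AttributeError(name)

    def __getitem__(self, key):
        return self.services[key]
```

With ctx = ToolContextSketch({"product_catalog": client}), the expressions ctx.product_catalog, ctx["product_catalog"], and ctx.services["product_catalog"] all return the same client object.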

Runtime reference APIs

The runtime owns typed reference storage. This is the canonical host-side API surface; References & Glimpses covers the concept. Host code can create a reference, resolve the full payload, or read only the cached glimpse without going through an LLM tool call:

ref = await runtime.create_reference(ProductSet, payload)
payload = await runtime.resolve_reference(ref)
glimpse = await runtime.reference_glimpse(ref)

Passing a ReferenceSpec to create_reference(...) runs its Python glimpse callback once before storing. Passing a string kind is also supported; callers can supply glimpse=... explicitly, otherwise the stored glimpse defaults to {}.
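
The glimpse behaviour described above can be illustrated with an in-memory stand-in. ReferenceSpecSketch, ReferenceStoreSketch, and the "kind:id" reference format are hypothetical; only the three method names and the {} default come from the text.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ReferenceSpecSketch:
    """Hypothetical spec: a kind name plus a Python glimpse callback."""

    kind: str
    glimpse: Callable[[Any], dict]


class ReferenceStoreSketch:
    """In-memory model of create / resolve / glimpse (not the Rust store)."""

    def __init__(self) -> None:
        self._payloads: dict[str, Any] = {}
        self._glimpses: dict[str, dict] = {}

    async def create_reference(self, spec_or_kind, payload, glimpse=None) -> str:
        if isinstance(spec_or_kind, ReferenceSpecSketch):
            kind = spec_or_kind.kind
            stored = spec_or_kind.glimpse(payload)  # callback runs once, at creation
        else:
            kind = spec_or_kind
            stored = glimpse if glimpse is not None else {}
        ref = f"{kind}:{uuid.uuid4().hex}"  # assumed reference format
        self._payloads[ref] = payload
        self._glimpses[ref] = stored
        return ref

    async def resolve_reference(self, ref: str) -> Any:
        return self._payloads[ref]

    async def reference_glimpse(self, ref: str) -> dict:
        # Served from the cached value; the callback is not invoked again.
        return self._glimpses[ref]
```

Because the glimpse is computed at creation time, reading it later is cheap even when the full payload is large.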

MCP tool serving

The runtime handle can construct MCP tool servers for a selected agent. The MCP path reuses the same tool composition, toolkit dependencies, and Python callback bridge as normal runtime dispatch.

from flowai_harness import mcp

runtime = mcp.create_mcp_runtime(tools=[search_products])
tools = mcp.list_tools(runtime, agent="mcp")

By default, direct MCP serving exposes only tools. Approval-gated tools return a tool error unless the host uses a noninteractive policy; the server does not wait indefinitely for interactive approval. Recursive agent-tool exposure is reserved for a later phase, so the runtime-generated agents toolkit is not yet supported by direct MCP serving.

Driving the runtime

runtime.query(prompt, thread_id, resume=None) returns an async-iterable event stream. Each iteration yields a dict in the wire-shape event format produced by the runtime.

async for event in runtime.query("Draft a tiny pricing scenario.", thread_id="thread-1"):
    print(event)

When the runtime emits an approval event, resume the run by responding with the approval id, an outcome ("approve", "reject", or "revise"), and optional feedback.

await runtime.respond_to_approval(
    approval_id,
    "approve",
    feedback="Approved by the host application.",
)
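
A host loop that answers approvals while consuming the stream might look like the sketch below. It runs against a stub so it is self-contained: StubRuntime, the decide policy, and the event keys "type", "approval_id", and "tool" are all assumptions for illustration; the real event vocabulary is documented under Streaming.

```python
import asyncio

VALID_OUTCOMES = {"approve", "reject", "revise"}


class StubRuntime:
    """Hypothetical stand-in that mimics only the two method names used here."""

    def __init__(self):
        self._pending = {}

    async def query(self, prompt, thread_id, resume=None):
        future = asyncio.get_running_loop().create_future()
        self._pending["appr-1"] = future
        # Assumed event shape; the real wire format may differ.
        yield {"type": "approval", "approval_id": "appr-1", "tool": "update_price"}
        outcome = await future  # paused until the host responds
        yield {"type": "done", "outcome": outcome}

    async def respond_to_approval(self, approval_id, outcome, feedback=None):
        if outcome not in VALID_OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        self._pending[approval_id].set_result(outcome)


def decide(event, allowed=frozenset({"update_price"})):
    """Toy host policy: approve allow-listed tools, reject everything else."""
    if event["tool"] in allowed:
        return "approve", "Approved by host policy."
    return "reject", "Tool is not on the host allow-list."


async def drive(runtime):
    seen = []
    async for event in runtime.query("Draft a tiny pricing scenario.", thread_id="thread-1"):
        seen.append(event)
        if event.get("type") == "approval":
            outcome, feedback = decide(event)
            await runtime.respond_to_approval(
                event["approval_id"], outcome, feedback=feedback
            )
    return seen
```

Calling asyncio.run(drive(StubRuntime())) walks through one approval and then the stub's final event. Keeping the decision in a separate function makes the policy easy to test without a runtime.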

Specialists can also be dispatched directly:

async for event in runtime.run_specialist(
    "product_insights",
    "What's the average price of our enterprise SKUs?",
    thread_id="thread-1",
):
    print(event)

See also

  • Streaming — the full event vocabulary yielded by runtime.query(...).
  • Tenant — the identity that scopes every runtime handle.
  • Agents — the roles registered in define_runtime(agents=...).