Runtime

Native-backed runtime handle, spec constructors, approval and storage policy values, and testing / data-environment configuration.

Runtime

Native Flow AI runtime handle returned by create_runtime(...).

Owns agent orchestration, provider routing, approval gates, plan lifecycle, reference storage, eval execution, and MCP tool serving. Python supplies callback adapters only. Exported as flowai_harness.Runtime.

query

query(self, /, prompt, thread_id, resume=None)

| Parameter | Type | Default |
| --- | --- | --- |
| prompt | Any | required |
| thread_id | Any | required |
| resume | Any | None |

Run a coordinator turn and return an async-iterable event stream.

Args:
- prompt: User prompt for this turn.
- thread_id: Conversation thread identifier.
- resume: Optional resume token continuing an interrupted run.

Returns: An async-iterable stream yielding runtime event dicts.
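
A minimal sketch of draining the event stream, assuming only the signature documented above. `StubRuntime` is a hypothetical stand-in so the snippet runs without the native module, and the event keys (`type`, `text`) are illustrative assumptions rather than a documented schema.

```python
import asyncio
from typing import Any, AsyncIterator


class StubRuntime:
    """Hypothetical stand-in for the native Runtime (illustration only)."""

    def query(self, prompt: Any, thread_id: Any, resume: Any = None) -> AsyncIterator[dict]:
        async def _stream() -> AsyncIterator[dict]:
            # Event shapes are assumptions, not the real runtime schema.
            yield {"type": "started", "thread_id": thread_id}
            yield {"type": "text", "text": f"echo: {prompt}"}
            yield {"type": "finished", "thread_id": thread_id}

        return _stream()


async def collect_events(runtime: Any, prompt: str, thread_id: str) -> list[dict]:
    """Drain one coordinator turn into a list of event dicts."""
    events = []
    async for event in runtime.query(prompt, thread_id):
        events.append(event)
    return events


events = asyncio.run(collect_events(StubRuntime(), "hello", "thread-1"))
```

Collecting into a list is convenient for tests; a host that renders output incrementally would act on each event inside the `async for` loop instead.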

run_specialist

run_specialist(self, /, specialist, prompt, thread_id=None)

| Parameter | Type | Default |
| --- | --- | --- |
| specialist | Any | required |
| prompt | Any | required |
| thread_id | Any | None |

Dispatch a specialist agent directly, bypassing the coordinator.

Args:
- specialist: Name of the registered specialist agent.
- prompt: User prompt for the specialist.
- thread_id: Optional conversation thread identifier.

Returns: An async-iterable stream yielding runtime event dicts.

run_eval

run_eval(self, /, eval_request)

| Parameter | Type | Default |
| --- | --- | --- |
| eval_request | Any | required |

Run an eval to completion and return the eval artifact.

Args: eval_request: EvalRequest model, mapping, or JSON string.

Returns: Awaitable resolving to the eval artifact dict (validate with EvalArtifact).

Raises:
- ValueError: If the request cannot be parsed.
- RuntimeError: If the eval run fails.
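
The two documented failure modes can be handled separately by the caller. The sketch below assumes only that `run_eval` is awaitable and raises `ValueError` or `RuntimeError` as described; `StubRuntime`, the `suite` request key, and the artifact keys are hypothetical.

```python
import asyncio
import json
from typing import Any, Optional, Tuple


class StubRuntime:
    """Hypothetical stand-in for the native Runtime (illustration only)."""

    async def run_eval(self, eval_request: Any) -> dict:
        if isinstance(eval_request, str):
            try:
                eval_request = json.loads(eval_request)
            except json.JSONDecodeError as exc:
                raise ValueError("eval request is not valid JSON") from exc
        if not isinstance(eval_request, dict):
            raise ValueError("eval request must be a mapping or JSON string")
        # Artifact keys are assumptions, not the EvalArtifact schema.
        return {"status": "completed", "request": eval_request}


async def run_eval_safely(runtime: Any, eval_request: Any) -> Tuple[Optional[dict], Optional[str]]:
    """Return (artifact, None) on success or (None, message) on failure."""
    try:
        return await runtime.run_eval(eval_request), None
    except ValueError as exc:
        # Malformed request: retrying unchanged will not help.
        return None, f"bad request: {exc}"
    except RuntimeError as exc:
        # The run itself failed.
        return None, f"eval failed: {exc}"


ok_artifact, ok_error = asyncio.run(run_eval_safely(StubRuntime(), '{"suite": "smoke"}'))
bad_artifact, bad_error = asyncio.run(run_eval_safely(StubRuntime(), "{not json"))
```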

stream_eval

stream_eval(self, /, eval_request)

| Parameter | Type | Default |
| --- | --- | --- |
| eval_request | Any | required |

Run an eval and stream progress event envelopes.

Args: eval_request: EvalRequest model, mapping, or JSON string.

Returns: An async-iterable stream yielding eval event envelope dicts (validate with HarnessEvalEventEnvelope).

Raises: ValueError: If the request cannot be parsed.

get_trace

get_trace(self, /, trace_id)

| Parameter | Type | Default |
| --- | --- | --- |
| trace_id | Any | required |

Return one recorded trace by id, or None when not found.

Args: trace_id: Trace identifier from an eval artifact or event.

Returns: The trace dict, or None when no trace has that id.

list_traces

list_traces(self, /, eval_run_id=None, test_case_id=None, thread_id=None)

| Parameter | Type | Default |
| --- | --- | --- |
| eval_run_id | Any | None |
| test_case_id | Any | None |
| thread_id | Any | None |

List recorded traces, optionally filtered.

Args:
- eval_run_id: Only traces recorded for this eval run.
- test_case_id: Only traces recorded for this test case.
- thread_id: Only traces recorded for this thread.

Returns: A list of trace dicts matching every supplied filter.
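
"Matching every supplied filter" reads as a logical AND over the filters that are not None. The sketch below models that rule in plain Python; the trace dict keys are assumptions chosen to mirror the parameter names, and the real filtering happens inside the native runtime.

```python
from typing import Any, Optional


def filter_traces(
    traces: list[dict],
    eval_run_id: Optional[Any] = None,
    test_case_id: Optional[Any] = None,
    thread_id: Optional[Any] = None,
) -> list[dict]:
    """Keep traces that match every filter that was actually supplied."""
    supplied = {
        "eval_run_id": eval_run_id,
        "test_case_id": test_case_id,
        "thread_id": thread_id,
    }
    # A filter left at None places no constraint on the result.
    active = {key: value for key, value in supplied.items() if value is not None}
    return [
        trace
        for trace in traces
        if all(trace.get(key) == value for key, value in active.items())
    ]


traces = [
    {"id": "t1", "eval_run_id": "run-1", "test_case_id": "case-a", "thread_id": "th-1"},
    {"id": "t2", "eval_run_id": "run-1", "test_case_id": "case-b", "thread_id": "th-2"},
    {"id": "t3", "eval_run_id": "run-2", "test_case_id": "case-a", "thread_id": "th-1"},
]
run1 = filter_traces(traces, eval_run_id="run-1")
run1_case_a = filter_traces(traces, eval_run_id="run-1", test_case_id="case-a")
```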

create_reference

create_reference(self, /, reference, value, glimpse=None)

| Parameter | Type | Default |
| --- | --- | --- |
| reference | Any | required |
| value | Any | required |
| glimpse | Any | None |

Store a value and return its typed reference envelope.

Args:
- reference: ReferenceSpec or reference kind name. A spec's Python glimpse callback runs once before storing.
- value: JSON-serializable payload to store.
- glimpse: Explicit glimpse value; defaults to {} when neither a callback nor a value is supplied.

Returns: Awaitable resolving to the reference envelope dict.

resolve_reference

resolve_reference(self, /, reference)

| Parameter | Type | Default |
| --- | --- | --- |
| reference | Any | required |

Resolve a reference envelope to its full stored payload.

Args: reference: Reference envelope previously returned by create_reference(...) or emitted by the runtime.

Returns: Awaitable resolving to the stored payload.

reference_glimpse

reference_glimpse(self, /, reference)

| Parameter | Type | Default |
| --- | --- | --- |
| reference | Any | required |

Return the cached glimpse for a reference without resolving it.

Args: reference: Reference envelope previously returned by create_reference(...) or emitted by the runtime.

Returns: Awaitable resolving to the cached glimpse value.
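
The three reference methods form a round trip: store a payload, read its lightweight glimpse, and resolve the full payload only when needed. The sketch below imitates that flow with an in-memory stand-in; `StubReferenceStore`, the envelope shape (`kind`, `id`), and the `report` kind name are hypothetical.

```python
import asyncio
from typing import Any


class StubReferenceStore:
    """Hypothetical in-memory stand-in for the runtime's reference methods."""

    def __init__(self) -> None:
        self._payloads: dict[str, Any] = {}
        self._glimpses: dict[str, Any] = {}

    async def create_reference(self, reference: Any, value: Any, glimpse: Any = None) -> dict:
        ref_id = f"ref-{len(self._payloads) + 1}"
        self._payloads[ref_id] = value
        # Documented default: {} when no glimpse value or callback is supplied.
        self._glimpses[ref_id] = {} if glimpse is None else glimpse
        return {"kind": reference, "id": ref_id}  # envelope shape is an assumption

    async def resolve_reference(self, reference: dict) -> Any:
        return self._payloads[reference["id"]]

    async def reference_glimpse(self, reference: dict) -> Any:
        return self._glimpses[reference["id"]]


async def round_trip() -> tuple:
    store = StubReferenceStore()
    envelope = await store.create_reference(
        "report", {"rows": [1, 2, 3]}, glimpse={"row_count": 3}
    )
    bare = await store.create_reference("report", {"rows": []})
    return (
        await store.reference_glimpse(envelope),  # cheap summary
        await store.resolve_reference(envelope),  # full payload
        await store.reference_glimpse(bare),      # default glimpse
    )


glimpse, payload, default_glimpse = asyncio.run(round_trip())
```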

respond_to_approval

respond_to_approval(self, /, approval_id, outcome, feedback=None, partial=None)

| Parameter | Type | Default |
| --- | --- | --- |
| approval_id | Any | required |
| outcome | Any | required |
| feedback | Any | None |
| partial | Any | None |

Resolve a pending approval gate.

Args:
- approval_id: Approval id from the approval event.
- outcome: "approve", "reject", or "revise".
- feedback: Optional reviewer feedback forwarded to the agent.
- partial: Optional partial revision payload.

Raises:
- ValueError: If outcome is not a supported value.
- RuntimeError: If the approval id is unknown or the gate cannot be resolved.
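
A host can check the outcome before it reaches the runtime, for example when the value comes from a UI form. The helper below is a hypothetical sketch that builds keyword arguments for `respond_to_approval`; only the three outcome strings come from the reference above, and the approval id is made up.

```python
from typing import Any, Optional

VALID_OUTCOMES = ("approve", "reject", "revise")


def approval_kwargs(
    approval_id: str,
    outcome: str,
    feedback: Optional[str] = None,
    partial: Optional[Any] = None,
) -> dict:
    """Build keyword arguments for respond_to_approval, rejecting bad outcomes early."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"outcome must be one of {VALID_OUTCOMES}, got {outcome!r}")
    kwargs: dict = {"approval_id": approval_id, "outcome": outcome}
    if feedback is not None:
        kwargs["feedback"] = feedback
    if partial is not None:
        kwargs["partial"] = partial
    return kwargs


revise = approval_kwargs("approval-123", "revise", feedback="Narrow the date range.")
# A host would then pass these on: runtime.respond_to_approval(**revise)

try:
    approval_kwargs("approval-123", "maybe")
    rejected = False
except ValueError:
    rejected = True
```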

create_runtime

create_runtime(spec: 'RuntimeSpec | Mapping[str, Any]', *, tool_bindings: 'list[ToolSpec] | None' = None, services: 'Mapping[str, Any] | None' = None, approval_predicates: 'Mapping[str, Callable[..., bool]] | None' = None, action_dispatcher: 'Callable[..., Any] | None' = None, event_hooks: 'list[Callable[..., Any]] | None' = None, data_environment: 'DataEnvironmentConfig | Mapping[str, Any] | None' = None, target_database_url: 'str | None' = None, testing: 'TestingConfig | None' = None, interpreter: "Literal['noop', 'scripted', 'anthropic']" = 'noop') -> 'Any'

| Parameter | Type | Default |
| --- | --- | --- |
| spec | RuntimeSpec \| Mapping[str, Any] | required |
| tool_bindings | list[ToolSpec] \| None | None |
| services | Mapping[str, Any] \| None | None |
| approval_predicates | Mapping[str, Callable[..., bool]] \| None | None |
| action_dispatcher | Callable[..., Any] \| None | None |
| event_hooks | list[Callable[..., Any]] \| None | None |
| data_environment | DataEnvironmentConfig \| Mapping[str, Any] \| None | None |
| target_database_url | str \| None | None |
| testing | TestingConfig \| None | None |
| interpreter | Literal['noop', 'scripted', 'anthropic'] | 'noop' |

Returns: Any

Construct a native Rust runtime handle from a validated spec.

Python supplies callback adapters only: tool handlers, action dispatch, event hooks, and dynamic approval predicates are registered with the embedded Rust runtime. Agent orchestration and approval gating stay in flowai-runtime. The runtime tenant comes from spec.tenant.resource_id; there is no per-call tenant override.

Args:
- spec: RuntimeSpec or mapping validated as one.
- tool_bindings: Additional ToolSpec values with Python handlers. Agent-attached tools are registered automatically; every tool bound to an agent must carry a handler.
- services: Host service objects exposed to Python tool handlers via the tool context (ctx.<name>, ctx["<name>"]). Keys must be non-empty strings and must not use the reserved names tool_use_id, services, or references.
- approval_predicates: Dynamic approval predicates keyed by predicate id, for tools whose approval is {"kind": "dynamic"} without an attached approval_handler.
- action_dispatcher: Callable that receives executor business actions for host-side dispatch.
- event_hooks: Callables invoked for each runtime event during streaming.
- data_environment: Rust-owned data dependencies (kv store, target database, catalog, catalog search) consumed by built-in toolkits. See DataEnvironmentConfig.
- target_database_url: Shorthand for data_environment["target_database_url"]. Conflicts with an explicit target_database descriptor or a differing target_database_url value.
- testing: TestingConfig with mock_response. Runs the deterministic mock interpreter; mutually exclusive with a non-default interpreter.
- interpreter: Model interpreter key: "noop" (default, no provider), "scripted" (deterministic scripted replay), or "anthropic" (live provider).

Returns: A native Runtime handle exposing query, run_specialist, eval, reference, approval, trace, and MCP-serving methods.

Raises:
- ValueError: If a dynamic approval predicate is not registered, an agent tool binding has no Python handler, testing is combined with a non-default interpreter, the testing config is malformed, target_database_url conflicts with data_environment, or a service key is reserved.
- TypeError: If services or data-environment values have invalid types.
- pydantic.ValidationError: If spec or data_environment fail validation.
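
The rules for services keys (non-empty strings, none of the reserved names) can be checked by the host before calling create_runtime(...). The helper below is a hypothetical sketch that reports problems as strings rather than raising, since the exact exception chosen for each case is the runtime's decision; the service names are made up.

```python
from typing import Any, Mapping

# Reserved names as listed in the services argument description.
RESERVED_SERVICE_KEYS = frozenset({"tool_use_id", "services", "references"})


def service_key_problems(services: Mapping[Any, Any]) -> list[str]:
    """Return one message per key that breaks the documented rules."""
    problems = []
    for key in services:
        if not isinstance(key, str) or not key:
            problems.append(f"{key!r}: keys must be non-empty strings")
        elif key in RESERVED_SERVICE_KEYS:
            problems.append(f"{key!r}: reserved name")
    return problems


clean = service_key_problems({"billing": object(), "search": object()})
flagged = service_key_problems({"services": object(), "": object(), "billing": object()})
```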

define_runtime

define_runtime(tenant: 'TenantIdentity | Mapping[str, Any]', *, agents: 'list[AgentSpec | Mapping[str, Any]] | None' = None, references: 'list[ReferenceSpec | Mapping[str, Any]] | None' = None, plans: 'list[PlanSpec | Mapping[str, Any]] | None' = None, toolkits: 'list[ToolkitSpec | Mapping[str, Any]] | None' = None, approval_policies: 'ApprovalPolicies | Mapping[str, Any] | None' = None, approval_overrides: 'ApprovalOverrides | Mapping[str, Any] | None' = None, storage_factories: 'StorageFactories | Mapping[str, Any] | None' = None, providers: 'Mapping[str, Any] | None' = None, tool_bindings: 'list[ToolSpec] | None' = None) -> 'RuntimeSpec'

| Parameter | Type | Default |
| --- | --- | --- |
| tenant | TenantIdentity \| Mapping[str, Any] | required |
| agents | list[AgentSpec \| Mapping[str, Any]] \| None | None |
| references | list[ReferenceSpec \| Mapping[str, Any]] \| None | None |
| plans | list[PlanSpec \| Mapping[str, Any]] \| None | None |
| toolkits | list[ToolkitSpec \| Mapping[str, Any]] \| None | None |
| approval_policies | ApprovalPolicies \| Mapping[str, Any] \| None | None |
| approval_overrides | ApprovalOverrides \| Mapping[str, Any] \| None | None |
| storage_factories | StorageFactories \| Mapping[str, Any] \| None | None |
| providers | Mapping[str, Any] \| None | None |
| tool_bindings | list[ToolSpec] \| None | None |

Returns: RuntimeSpec

Create a validated Flow AI runtime spec value.

Collects tenant identity, agents, references, plans, toolkits, approval policy, storage descriptors, and provider config into one pure data spec. Plans, toolkits, and tool bindings declared on agents are auto-attached, and toolkit/agent tool rows are merged into each agent's prompt.

Args:
- tenant: TenantIdentity or mapping with resource_id and version.
- agents: AgentSpec values or mappings validated as such.
- references: ReferenceSpec values or mappings.
- plans: PlanSpec values or mappings. Plans attached to agents are appended automatically when not listed.
- toolkits: ToolkitSpec values or mappings. Toolkit ids referenced by agents are appended automatically when not listed.
- approval_policies: Runtime-wide approval floor. When omitted, it is derived from the coordinator's approval patch applied on top of the defaults (plans always, tools never).
- approval_overrides: Per-agent/per-tool approval overrides. When omitted, they are collected from each agent's approval and tool_approvals declarations.
- storage_factories: Host-provided store factory descriptors.
- providers: Provider configuration keyed by provider name.
- tool_bindings: Runtime-level ToolSpec bindings. Agent-attached tools are appended automatically.

Returns: A frozen, validated RuntimeSpec.

Raises: pydantic.ValidationError: On duplicate agent names, more than one coordinator, unknown / duplicate / self-referencing routes, approval overrides naming unknown agents, or more than one coordinator supplying approval_policies.
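
Several of these validation rules concern the agent graph: unique names, at most one coordinator, and routes that are known, unique, and not self-referencing. The sketch below restates those rules over plain agent mappings so a host can lint a configuration early. It is an illustration, not the library's validator; the function name and sample agents are hypothetical.

```python
from collections import Counter
from typing import Any, Mapping, Sequence


def agent_graph_problems(agents: Sequence[Mapping[str, Any]]) -> list[str]:
    """Collect violations of the documented agent-graph rules."""
    problems = []
    names = [agent["name"] for agent in agents]
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate agent name: {name}")

    coordinators = [a["name"] for a in agents if a.get("role") == "coordinator"]
    if len(coordinators) > 1:
        problems.append("more than one coordinator: " + ", ".join(coordinators))

    known = set(names)
    for agent in agents:
        for route, count in Counter(agent.get("routes", [])).items():
            if route == agent["name"]:
                problems.append(f"{agent['name']}: self-referencing route")
            elif route not in known:
                problems.append(f"{agent['name']}: unknown route {route}")
            if count > 1:
                problems.append(f"{agent['name']}: duplicate route {route}")
    return problems


agents = [
    {"name": "lead", "role": "coordinator", "routes": ["planner", "planner", "ghost"]},
    {"name": "planner", "role": "planner", "routes": ["planner"]},
]
problems = agent_graph_problems(agents)
```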

RuntimeSpec

RuntimeSpec(*, tenant: flowai_harness.tenant.TenantIdentity, agents: list[AgentSpec] = <factory>, references: list[flowai_harness.references.ReferenceSpec] = <factory>, plans: list[flowai_harness.plans.PlanSpec] = <factory>, toolkits: list[ToolkitSpec] = <factory>, approvalPolicies: ApprovalPolicies = <factory>, approvalOverrides: ApprovalOverrides = <factory>, storageFactories: StorageFactories = <factory>, providers: dict[str, typing.Any] = <factory>, toolBindings: tuple[flowai_harness.tools.ToolSpec, ...] = <factory>) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| tenant | flowai_harness.tenant.TenantIdentity | required |
| agents | list | `<factory>` |
| references | list | `<factory>` |
| plans | list | `<factory>` |
| toolkits | list | `<factory>` |
| approvalPolicies | flowai_harness.runtime.ApprovalPolicies | `<factory>` |
| approvalOverrides | flowai_harness.runtime.ApprovalOverrides | `<factory>` |
| storageFactories | flowai_harness.runtime.StorageFactories | `<factory>` |
| providers | dict | `<factory>` |
| toolBindings | tuple | `<factory>` |

Returns: None

Canonical pure runtime specification consumed by flowai-runtime.

AgentSpec

AgentSpec(*, name: Annotated[str, MinLen(min_length=1)], role: Literal['coordinator', 'planner', 'executor', 'specialist'], stateful: bool, model: ModelSpec, systemPrompt: str, routes: list[str] = <factory>, toolkits: list[str] = <factory>, maxTurns: Annotated[int | None, Ge(ge=1)] = None, plan: flowai_harness.plans.PlanSpec | None = None, tools: tuple[flowai_harness.tools.ToolSpec, ...] = <factory>, approvalPolicies: ApprovalPolicyPatch | None = None, toolApprovalPolicies: dict[str, dict[str, typing.Any]] = <factory>, promptCacheKey: str | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| name | typing.Annotated | required |
| role | typing.Literal | required |
| stateful | bool | required |
| model | flowai_harness.runtime.ModelSpec | required |
| systemPrompt | str | required |
| routes | list | `<factory>` |
| toolkits | list | `<factory>` |
| maxTurns | typing.Annotated | None |
| plan | flowai_harness.plans.PlanSpec \| None | None |
| tools | tuple | `<factory>` |
| approvalPolicies | flowai_harness.runtime.ApprovalPolicyPatch \| None | None |
| toolApprovalPolicies | dict | `<factory>` |
| promptCacheKey | str \| None | None |

Returns: None

Agent registration compiled by flowai-runtime into an orchestrator agent.

ModelSpec

ModelSpec(*, id: str, provider: str | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| id | str | required |
| provider | str \| None | None |

Returns: None

Per-agent model selection.

A plain model id string is accepted anywhere a ModelSpec is expected and is coerced to ModelSpec(id=...).
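
The coercion rule can be pictured with a small stand-in. `ModelSpecSketch` and `coerce_model` below are hypothetical and use a dataclass in place of the real model; the model and provider names are placeholders.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ModelSpecSketch:
    """Hypothetical stand-in mirroring the two documented ModelSpec fields."""

    id: str
    provider: Optional[str] = None


def coerce_model(value: Any) -> ModelSpecSketch:
    """Accept a spec, a plain model id string, or a mapping of fields."""
    if isinstance(value, ModelSpecSketch):
        return value
    if isinstance(value, str):
        # A bare string is treated as the model id, with no provider pinned.
        return ModelSpecSketch(id=value)
    if isinstance(value, Mapping):
        return ModelSpecSketch(**value)
    raise TypeError(f"cannot coerce {type(value).__name__} to a model spec")


from_string = coerce_model("example-model")
from_mapping = coerce_model({"id": "example-model", "provider": "example-provider"})
```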

ToolkitSpec

ToolkitSpec(*, id: Annotated[str, MinLen(min_length=1)], config: typing.Any | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| id | typing.Annotated | required |
| config | Any \| None | None |

Returns: None

Toolkit declaration by stable identifier.

ApprovalPolicies

ApprovalPolicies(*, plans: dict[str, typing.Any] = <factory>, tools: dict[str, typing.Any] = <factory>) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| plans | dict | `<factory>` |
| tools | dict | `<factory>` |

Returns: None

Runtime-level approval policy floor.

Each channel accepts "never", "always", or {"kind": "dynamic", "value": predicate_id} and is normalized to the wire shape {"kind": ...}.
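
The normalization can be sketched as a small function that maps each accepted form to a dict with a kind key. This is an illustration under the assumption that only the three listed forms are accepted; `normalize_channel` and the `needs_review` predicate id are hypothetical.

```python
from typing import Any

STATIC_KINDS = ("never", "always")


def normalize_channel(value: Any) -> dict:
    """Map an accepted policy form to a {"kind": ...} dict."""
    if isinstance(value, str) and value in STATIC_KINDS:
        return {"kind": value}
    if isinstance(value, dict) and value.get("kind") == "dynamic" and "value" in value:
        # Dynamic policies carry the predicate id alongside the kind.
        return {"kind": "dynamic", "value": value["value"]}
    raise ValueError(f"unsupported approval policy: {value!r}")


normalized = {
    "plans": normalize_channel("always"),
    "tools": normalize_channel({"kind": "dynamic", "value": "needs_review"}),
}
```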

ApprovalPolicyPatch

ApprovalPolicyPatch(*, plans: dict[str, typing.Any] | None = None, tools: dict[str, typing.Any] | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| plans | dict[str, Any] \| None | None |
| tools | dict[str, Any] \| None | None |

Returns: None

Partial agent-level approval override.

Missing channels inherit from the runtime-level approval policy. Each channel accepts "never", "always", or {"kind": "dynamic", "value": predicate_id}.
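
Inheritance here amounts to overlaying the patch on the runtime-level policy, channel by channel. The sketch below shows that merge with plain dicts; `effective_policy` is a hypothetical helper, and the starting policy uses the defaults named in the define_runtime description (plans always, tools never).

```python
from typing import Any, Mapping


def effective_policy(runtime_policy: Mapping[str, Any], patch: Mapping[str, Any]) -> dict:
    """Overlay an agent-level patch; channels missing from the patch are inherited."""
    effective = dict(runtime_policy)
    for channel in ("plans", "tools"):
        override = patch.get(channel)
        if override is not None:
            effective[channel] = override
    return effective


runtime_policy = {"plans": {"kind": "always"}, "tools": {"kind": "never"}}
patched = effective_policy(runtime_policy, {"tools": {"kind": "always"}})
unpatched = effective_policy(runtime_policy, {})
```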

ApprovalOverrides

ApprovalOverrides(*, agents: dict[str, ApprovalPolicyPatch] = <factory>, tools: dict[str, dict[str, dict[str, typing.Any]]] = <factory>) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| agents | dict | `<factory>` |
| tools | dict | `<factory>` |

Returns: None

Hierarchical approval overrides scoped by agent and tool.

Every agent named in agents or tools must be registered in the same runtime spec; RuntimeSpec validation rejects unknown names.

StorageFactorySpec

StorageFactorySpec(*, kind: str, config: typing.Any | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| kind | str | required |
| config | Any \| None | None |

Returns: None

Host-provided store factory descriptor.

StorageFactories

StorageFactories(*, kv: StorageFactorySpec | None = None, plans: StorageFactorySpec | None = None, memory: StorageFactorySpec | None = None) -> None

| Parameter | Type | Default |
| --- | --- | --- |
| kv | flowai_harness.runtime.StorageFactorySpec \| None | None |
| plans | flowai_harness.runtime.StorageFactorySpec \| None | None |
| memory | flowai_harness.runtime.StorageFactorySpec \| None | None |

Returns: None

Store factory descriptions supplied by the host language facade.

TestingConfig

Deterministic native runtime test configuration.

Passing testing=TestingConfig(mock_response=...) to create_runtime(...) runs the deterministic mock interpreter, which emits mock_response as the model output instead of calling a provider. testing is mutually exclusive with a non-default interpreter.

Keys: mock_response: Text emitted by the mock interpreter for every model turn. Required.

DataEnvironmentConfig

Rust data dependencies attached to built-in toolkit dispatch.

All keys are optional and accept snake_case or camelCase spellings.

Keys:
- tenant_id: Tenant id the environment is pinned to. When set it must match the runtime tenant resource_id.
- workspace_id: Workspace id scope for stored data.
- kv: KV store descriptor. Supported kinds: memory, sqlite, postgres, redis.
- target_database: Target database descriptor for agent data queries. Supported kinds: sqlite, postgres. Mutually exclusive with target_database_url.
- target_database_url: Connection URL shorthand for the target database.
- target_database_schema: Schema name used for target database introspection.
- catalog: Data catalog store descriptor. Supported kinds: empty, inline, sqlite, postgres.
- catalog_search: Catalog fuzzy-search index configuration with index_path and optional rebuild_on_start / write_through flags.
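
Two of these rules lend themselves to a host-side check: keys may be spelled in snake_case or camelCase, and target_database excludes target_database_url. The sketch below normalizes keys to snake_case and enforces the exclusion. The helper names, the `{"kind": ...}` descriptor shape, and the URL format are assumptions for illustration.

```python
import re
from typing import Any, Mapping


def to_snake_case(key: str) -> str:
    """Convert camelCase to snake_case; snake_case input is returned unchanged."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()


def normalize_data_environment(config: Mapping[str, Any]) -> dict:
    """Normalize key spellings and reject the documented conflicting pair."""
    normalized: dict = {}
    for key, value in config.items():
        snake = to_snake_case(key)
        if snake in normalized:
            raise ValueError(f"duplicate key after normalization: {snake}")
        normalized[snake] = value
    if "target_database" in normalized and "target_database_url" in normalized:
        raise ValueError("target_database and target_database_url are mutually exclusive")
    return normalized


env = normalize_data_environment(
    {
        "tenantId": "tenant-1",
        "kv": {"kind": "memory"},  # descriptor shape is an assumption
        "targetDatabaseUrl": "sqlite:///example.db",  # URL format is an assumption
    }
)

try:
    normalize_data_environment(
        {"target_database": {"kind": "sqlite"}, "targetDatabaseUrl": "sqlite:///example.db"}
    )
    conflict = False
except ValueError:
    conflict = True
```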