flowai-harness
Opinionated Python harness for building agents on the Flow AI runtime.
What it is
flowai-harness builds multi-step, tool-using data agents: they plan a
sequence of actions, call your tools to carry them out, pause for human
approval on sensitive steps, and stream progress back to you. You describe the
agent in plain Python (Pydantic specs + callbacks); the harness runs the
agent loop, plan lifecycle, approval gates, and provider routing, with
defaults tuned for data-heavy agents.
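To make that cycle concrete, here is a toy sketch in plain Python of planning, approval gating, tool execution, and streamed progress. It is not the flowai-harness API: `Step`, `run_plan`, and the event dictionaries are hypothetical names invented for illustration, and in the real package this loop runs inside the runtime rather than in your code.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass
class Step:
    """One planned action (hypothetical shape, for illustration only)."""
    tool: str
    argument: str
    sensitive: bool = False


def run_plan(
    plan: list[Step],
    tools: dict[str, Callable[[str], str]],
    approve: Callable[[Step], bool],
) -> Iterator[dict]:
    """Execute steps in order, gating sensitive ones on approval."""
    for index, step in enumerate(plan):
        if step.sensitive and not approve(step):
            # A rejected step stops the run; later steps never execute.
            yield {"type": "rejected", "step": index, "tool": step.tool}
            return
        result = tools[step.tool](step.argument)
        yield {"type": "step_done", "step": index, "result": result}
    yield {"type": "finished"}


tools = {
    "lookup": lambda name: f"row for {name}",
    "delete": lambda name: f"deleted {name}",
}
plan = [Step("lookup", "acme"), Step("delete", "acme", sensitive=True)]

# Approve nothing: the lookup runs, the sensitive delete is blocked.
events = list(run_plan(plan, tools, approve=lambda step: False))
for event in events:
    print(event)
```

The generator yields one event per step, which is the same consume-as-you-go shape the hello-world example uses with `async for`.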
When to use it
Reach for it when an agent needs multiple tool-driven steps, human-in-the-loop approval, or a typed plan you can review before it runs. For a single LLM call or a plain chatbot, a model SDK is lighter. The public surface is Python; the runtime is Rust, embedded in the wheel.
flowai-harness is the public Python facade for the embedded Flow AI runtime. The runtime engine
itself is written in Rust and ships inside the wheel as a private flowai_harness._internal
extension module. Python is a thin spec-construction and callback layer: you describe the agent
topology with validated Pydantic models, attach handlers for tools and approvals, and hand the
result to the native runtime.
The agent loop, plan lifecycle, approval gates, provider routing, and stream generation all live
in Rust. Python supplies inputs and consumes events. The package ships with a py.typed marker
and a .pyi stub for the native module, so IDEs and type checkers see the full public surface.
Install
Private preview access
flowai-harness is not currently available on the public PyPI registry.
To get access to the preview release, contact aaro@flow-ai.com or
karolus@flow-ai.com before running the install command below.
pip install flowai-harness
Before you begin
- Use Python 3.11 or newer.
- Pin the alpha package version in production environments.
- The hello-world example below uses TestingConfig, so it does not require provider credentials. Live model runs need the provider environment variables referenced by your runtime spec, such as ANTHROPIC_API_KEY.
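The checklist above can be turned into a small preflight script. This is a minimal sketch using only the standard library; `preflight` is a hypothetical helper, not part of flowai-harness, and the set of environment variables to check depends on the providers your own runtime spec references.

```python
import os
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in this guide


def preflight(version_info=None, environ=None, required_env=()):
    """Return a list of human-readable problems; an empty list means ready."""
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {version_info[0]}.{version_info[1]} found; "
            "3.11 or newer is required"
        )
    for name in required_env:
        if not environ.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems


# TestingConfig runs need no credentials, so required_env stays empty.
print(preflight())
# For live runs, list whichever variables your runtime spec references.
print(preflight(required_env=("ANTHROPIC_API_KEY",)))
```

The optional `version_info` and `environ` parameters exist only so the check can be exercised without changing the real interpreter or environment.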
Alpha release
Version 1.0.0a1 is the Python package version for the v1.0.0-alpha.1
private preview tag. The public API is stable enough to build against, but
breaking changes are still possible during the alpha cycle.
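Because breaking changes are possible, an exact pin such as `flowai-harness==1.0.0a1` in your requirements is the safest choice during the alpha. The sketch below checks at startup that the installed distribution matches the pin. It relies only on the standard library's `importlib.metadata`; `installed_version` and `PINNED` are hypothetical names, and the distribution name is assumed to match the one used in the install command.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PINNED = "1.0.0a1"  # the alpha version named in this section


def installed_version(distribution: str = "flowai-harness") -> Optional[str]:
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("flowai-harness is not installed")
elif found != PINNED:
    print(f"expected {PINNED}, found {found}")
else:
    print(f"flowai-harness {found} matches the pin")
```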
Hello world
The following snippet defines a coordinator and a specialist, builds a runtime with the deterministic testing interpreter, and prints events from a single query.
import asyncio

from flowai_harness import (
    TestingConfig,
    create_runtime,
    define_coordinator,
    define_runtime,
    define_specialist,
    define_tenant,
)


async def main() -> None:
    tenant = define_tenant("acme", "v1")

    # The specialist does the actual work.
    specialist = define_specialist(
        name="greeter",
        model="claude-haiku-4-5",
        prompt="You greet the user politely.",
    )

    # The coordinator routes requests to the specialists it names.
    coordinator = define_coordinator(
        name="hello_coordinator",
        model="claude-sonnet-4-6",
        routes=["greeter"],
        prompt="Route greeting requests to the greeter specialist.",
    )

    runtime_spec = define_runtime(
        tenant=tenant,
        agents=[coordinator, specialist],
        # Placeholder key: the testing interpreter below needs no credentials.
        providers={"anthropic": {"apiKey": "unused"}},
    )

    # TestingConfig selects the deterministic testing interpreter.
    runtime = create_runtime(
        runtime_spec,
        testing=TestingConfig(mock_response="hello from the Rust runtime"),
    )

    # Events are streamed as the query runs.
    async for event in runtime.query("Say hello", thread_id="thread-1"):
        print(event)


asyncio.run(main())

Expected output is a short stream of runtime event dictionaries. With the
deterministic testing interpreter, the final text event includes
hello from the Rust runtime.
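If you want only the final text rather than every event, you can filter the stream as it arrives. The sketch below is runnable without the package: `fake_query` is a stand-in for `runtime.query(...)`, and the `"type"` and `"text"` keys are assumptions about the event dictionaries, not a documented schema. Check the Reference for the real event shapes before relying on them.

```python
import asyncio
from typing import AsyncIterator, Optional


async def fake_query() -> AsyncIterator[dict]:
    """Stand-in for runtime.query(...); the event shapes are assumed."""
    yield {"type": "status", "message": "routing to greeter"}
    yield {"type": "text", "text": "hello from the Rust runtime"}


async def last_text(events: AsyncIterator[dict]) -> Optional[str]:
    """Return the text of the final text event, or None if there is none."""
    final = None
    async for event in events:
        # "type" and "text" are hypothetical keys used for illustration.
        if event.get("type") == "text":
            final = event.get("text")
    return final


print(asyncio.run(last_text(fake_query())))
```

Because `last_text` accepts any async iterator of dictionaries, the same function could be handed the real stream once the key names are confirmed.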
Where to next
Quickstart
Create a minimal coordinator and specialist runtime without provider credentials.
Tutorial
Build the full Acme scenario-planning example end to end: tenant identity, references, plans, tools, layered prompts, approvals, and the four agent roles.
Concepts
Mental model and per-primitive deep dives: tenant identity, agents, plans, references, tools, prompts, and the runtime handle.
Reference
Generated API reference for every public symbol, including Pydantic spec values and the native Runtime handle.
Studio
Run a local browser UI for chat, data inspection, tests, evals, runs, and traces.
