Prompts

layered_prompt renders a deterministic system prompt from explicit layers and returns a LayeredPrompt value with a SHA-256 cache key derived from the rendered text.

from flowai_harness import define_tool, layered_prompt

domain_knowledge = {
    "entities": ["product", "segment", "channel"],
    "metrics": ["revenue", "margin"],
}

@define_tool(
    name="search_products",
    description="Search products by query.",
    input_schema={"query": str, "limit": int},
    approval="never",
)
async def search_products(args, ctx):
    ...

coordinator_prompt = layered_prompt(
    identity="You coordinate scenario planning for Acme customers.",
    communication="Be concise and surface approval points explicitly.",
    operational_rules=[
        "Route plan-building work to the planner.",
        "Route approved materialization work to the executor.",
    ],
    tools=[search_products],
    domain_knowledge=domain_knowledge,
    safety=["Never execute side-effecting tools without approval."],
    output_format={"events": ["plan_proposed", "approval_required", "tool_result"]},
    examples=[{"user": "Branch Q3 pricing into a 5% promotion for top EU SKUs."}],
)

Why layering

  • Identity is per-agent.
  • Domain knowledge is shared structured prompt content, owned by customer/domain code.
  • Operational rules are behavioral instructions separate from domain facts.
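  • Tools are auto-derived from tool specs and rendered consistently.

Because each layer is explicit, a planner and an executor can share the same domain_knowledge while keeping their own identity and rules:

from flowai_harness.agents import define_executor, define_planner

# scenario_plan is the plan definition shared by both agents; it is defined elsewhere.
planner = define_planner(
    name="scenario_planner",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    prompt=layered_prompt(
        identity="You produce typed scenario plans.",
        domain_knowledge=domain_knowledge,
    ),
)

executor = define_executor(
    name="scenario_executor",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    tools=[search_products],  # runtime binding: makes the tool executable
    prompt=layered_prompt(
        identity="You execute approved scenario plans action by action.",
        operational_rules=["Execute only approved plans."],
        domain_knowledge=domain_knowledge,
        tools=[search_products],  # prompt table: describes the tool to the model
    ),
)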

Section order

layered_prompt always renders sections in this fixed order, omitting empty sections:

  1. Identity — who the agent is. The only required section.
  2. Communication — tone, format, conventions.
  3. Operational Rules — bullet list of behavioral rules.
  4. Tools — auto-derived Markdown table of tool name, description, and approval policy.
  5. Domain Knowledge — any structured value, rendered as JSON.
  6. Safety — bullet list of guardrails.
  7. Output Format — structured description of expected output.
  8. Examples — structured examples.
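The ordering rule above can be illustrated with a small stand-alone sketch. It is not the flowai_harness implementation: `SECTION_ORDER`, `render_sections`, the `- ` bullet marker, and the blank line between sections are all assumptions made for illustration, and the sketch only handles strings and sequences of strings (no tool table, no JSON rendering). The `# Heading` form follows the `systemPrompt` value shown in the wire JSON later on this page.

```python
# Hypothetical sketch of fixed-order rendering; not the library's code.
SECTION_ORDER = [
    ("identity", "Identity"),
    ("communication", "Communication"),
    ("operational_rules", "Operational Rules"),
    ("tools", "Tools"),
    ("domain_knowledge", "Domain Knowledge"),
    ("safety", "Safety"),
    ("output_format", "Output Format"),
    ("examples", "Examples"),
]


def render_sections(**layers):
    """Render non-empty layers in the fixed order, one '# Heading' per section."""
    if not layers.get("identity"):
        raise ValueError("identity is the only required section")
    unknown = set(layers) - {key for key, _ in SECTION_ORDER}
    if unknown:
        raise TypeError(f"unknown sections: {sorted(unknown)}")

    parts = []
    for key, heading in SECTION_ORDER:
        body = layers.get(key)
        if not body:  # None, "", and [] all count as empty and are omitted
            continue
        if not isinstance(body, str):
            # Sequences of strings become a bullet list (assumed "- " marker).
            body = "\n".join(f"- {item}" for item in body)
        parts.append(f"# {heading}\n{body}")
    return "\n\n".join(parts)
```

The order in which keyword arguments are passed has no effect on the output: `render_sections(safety=[...], identity="...")` still places Identity first and Safety after it.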

Tenant identity is not rendered automatically. If a prompt needs domain facts, pass them through domain_knowledge; if it needs operating constraints, pass them through operational_rules.

When define_runtime(...) assembles agents, it also merges runtime-visible tools into the same # Tools section:

  • explicit layered_prompt(..., tools=...) rows;
  • agent-bound Python tools passed to define_coordinator(...), define_planner(...), define_executor(...), or define_specialist(...) via tools=[...];
  • enabled built-in toolkit tools from toolkits=[...], including narrowed toolkit configs such as ToolkitSpec(id="catalog", config={"tools": ["execute_query"]}).

Duplicate tool names render once. Explicit prompt rows win so customer-authored descriptions are not overwritten by generated toolkit descriptions.
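A minimal sketch of that precedence rule, assuming each tool row is a plain dict with `name` and `description` keys. The helper `merge_tool_rows` and the row shape are hypothetical; the runtime's actual merge logic is not shown on this page.

```python
def merge_tool_rows(explicit_rows, runtime_rows):
    """Merge tool rows by name; explicit prompt rows take precedence."""
    merged = {}
    for row in list(explicit_rows) + list(runtime_rows):
        # setdefault keeps the first row seen for a name. Explicit rows are
        # visited first, so a later runtime row with the same name is ignored.
        merged.setdefault(row["name"], row)
    return list(merged.values())


explicit = [{"name": "execute_query", "description": "Run approved SQL only."}]
runtime = [
    {"name": "execute_query", "description": "Generated toolkit description."},
    {"name": "search_products", "description": "Search products by query."},
]
rows = merge_tool_rows(explicit, runtime)
```

Here `rows` contains `execute_query` once, with the customer-authored description, followed by `search_products`. The output ordering (explicit rows first) is a choice made in this sketch.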

Accepted section types

| Section | Type |
| --- | --- |
| identity | str (required) |
| communication, operational_rules, safety | str or sequence of strings (rendered as bullet list) |
| tools | iterable of ToolSpec, str, or mapping |
| domain_knowledge, output_format, examples | any JSON-serializable value, Pydantic model, or string |

Structured sections serialize with sorted keys, so dict ordering does not affect the cache key.
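The effect of sorted-key serialization on the cache key can be checked with the standard library alone. This is a sketch under assumptions, not the library's code: `render_structured` and `cache_key` are hypothetical names, and the indent level and hex encoding of the digest are guesses.

```python
import hashlib
import json


def render_structured(value):
    """Serialize a structured section with sorted keys (assumed formatting)."""
    return json.dumps(value, sort_keys=True, indent=2)


def cache_key(rendered_text):
    """SHA-256 hex digest of the rendered prompt text."""
    return hashlib.sha256(rendered_text.encode("utf-8")).hexdigest()


a = {"entities": ["product", "segment"], "metrics": ["revenue", "margin"]}
b = {"metrics": ["revenue", "margin"], "entities": ["product", "segment"]}

# Same content, different insertion order: identical text and identical key.
assert render_structured(a) == render_structured(b)
assert cache_key(render_structured(a)) == cache_key(render_structured(b))

# sort_keys only reorders dict keys. In this sketch, list order still matters.
c = {"entities": ["segment", "product"], "metrics": ["revenue", "margin"]}
assert cache_key(render_structured(a)) != cache_key(render_structured(c))
```

Because the key is derived from the rendered text, any change that alters the text (a new rule, a reworded identity) produces a different key.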

How a prompt reaches the model

Prompt construction is split into two different concerns:

  • Python owns the ergonomic prompt authoring API and validates the runtime spec.
  • Rust owns runtime assembly, agent orchestration, framework primitives, tool dispatch, and provider calls.

The system prompt text itself is not rebuilt at every layer. It is rendered once by Python, copied into the native runtime spec, copied into the framework agent registration, extracted into a ChatProgram, and finally passed to Rig as the provider preamble:

Python: layered_prompt(...) renders the prompt text
    -> define_coordinator / define_planner / define_executor / define_specialist
    -> AgentSpec.system_prompt
    -> define_runtime(...) -> RuntimeSpec
    -> create_runtime(...) serializes RuntimeSpec as camelCase JSON
    -> PyO3: _internal.create_runtime parses flowai_runtime::RuntimeSpec
    -> flowai-runtime: AgentSpec::to_registration
    -> agent-fw-agent: AgentRegistration.system_prompt
    -> Runtime::query -> AgentOrchestrator::invoke
    -> agent-fw-agent: ChatMessage::system + parse_conversation
    -> agent-fw-agent: ChatProgram.system_prompt
    -> agent-fw-interpreter: RigAnthropicChatInterpreter::interpret
    -> Rig: AgentBuilder.preamble(system_prompt)
    -> LLM request

Python: agent definition and runtime spec

Agent constructors live in the harness module flowai_harness.agents. Each define_* function accepts either a plain str or a LayeredPrompt and normalizes the prompt with the same helper:

def _prompt_text_and_cache_key(prompt):
    if isinstance(prompt, LayeredPrompt):
        return prompt.text, prompt.cache_key
    if isinstance(prompt, str):
        return prompt, None
    raise TypeError("prompt must be a string or LayeredPrompt")

The normalized text is stored as AgentSpec.system_prompt — at this point the system prompt is just a field on a frozen Pydantic model. There is no provider, no tool dispatcher, and no LLM call. One subtle point: prompt_cache_key is retained on the Python AgentSpec, but it is excluded from the native runtime wire shape. The Rust runtime receives the system prompt text, not the Python cache key.

define_runtime(...) collects agents, references, plans, toolkits, approval policies, storage descriptors, providers, and tool bindings into a pure RuntimeSpec. When create_runtime(...) is called, Python serializes that spec to camelCase JSON:

json.dumps(runtime_spec.model_dump(by_alias=True, mode="json"))

The wire JSON contains each agent's systemPrompt field, along with the role's stateful default. The optional maxTurns field is omitted when unset:

{
  "agents": [
    {
      "name": "scenario_planner",
      "role": "planner",
      "stateful": true,
      "model": {"id": "claude-sonnet-4-6", "provider": null},
      "systemPrompt": "# Identity\nYou produce typed scenario plans.",
      "routes": [],
      "toolkits": []
    }
  ]
}
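The wire-shape rules described above (camelCase keys, maxTurns omitted when unset, prompt_cache_key kept on the Python side only) can be sketched without Pydantic. `AgentSpecSketch`, `to_camel`, and `to_wire` are hypothetical stand-ins using dataclasses; the real AgentSpec is a frozen Pydantic model, and the model, routes, and toolkits fields are left out here for brevity.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

PYTHON_ONLY = {"prompt_cache_key"}  # retained in Python, never sent to Rust
OMIT_WHEN_NONE = {"max_turns"}      # optional field dropped when unset


def to_camel(name):
    """Convert snake_case to camelCase, e.g. system_prompt -> systemPrompt."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass(frozen=True)
class AgentSpecSketch:
    name: str
    role: str
    system_prompt: str
    stateful: bool = True
    max_turns: Optional[int] = None
    prompt_cache_key: Optional[str] = None


def to_wire(spec):
    """Build the camelCase dict that would be serialized for the runtime."""
    wire = {}
    for f in fields(spec):
        value = getattr(spec, f.name)
        if f.name in PYTHON_ONLY:
            continue
        if value is None and f.name in OMIT_WHEN_NONE:
            continue
        wire[to_camel(f.name)] = value
    return wire


spec = AgentSpecSketch(
    name="scenario_planner",
    role="planner",
    system_prompt="# Identity\nYou produce typed scenario plans.",
    prompt_cache_key="example-key",
)
payload = json.dumps(to_wire(spec))
```

The resulting payload carries `systemPrompt` and `stateful` but neither `maxTurns` nor any cache-key field, which matches the shape of the JSON example above.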

create_runtime(...) also passes host tool callback maps, approval predicates, event hooks, interpreter selection, and data environment config. Those are runtime dependencies; they do not rewrite systemPrompt.

The PyO3 bridge

The private PyO3 module is the harness extension crate (py-flowai-harness's src/lib.rs). Its create_runtime(...) function parses the JSON into flowai_runtime::RuntimeSpec, builds RuntimeDeps, attaches Python callback tools as Rust ToolHandlers, selects an interpreter, and calls Runtime::new(spec, deps). The PyO3 layer is a boundary adapter: it does not compose prompt text. Its job is to move the already-rendered systemPrompt from Python JSON into Rust types, then attach effectful dependencies around that pure spec.

Rust runtime assembly

The native runtime spec is defined in the crates/flowai-runtime crate root. Its AgentSpec has the following fields:

pub struct AgentSpec {
    pub name: String,
    pub role: AgentRole,
    pub stateful: bool,
    pub model: ModelSpec,
    pub system_prompt: String,
    pub routes: Vec<String>,
    pub toolkits: Vec<String>,
    pub max_turns: Option<u32>,
}

Runtime assembly converts each harness AgentSpec into the generic framework AgentRegistration:

fn to_registration(&self) -> AgentRegistration {
    AgentRegistration {
        name: self.name.clone(),
        model: ModelId::new(self.model.id.clone()),
        system_prompt: self.system_prompt.clone(),
        role: Some(self.role.to_agent_label()),
        stateful: self.stateful,
    }
}

This is where the harness stops owning prompt semantics. From here on, the generic framework sees an agent name, model, optional role label, statefulness flag, and system prompt string.

Request-time orchestration

Runtime::query(...) and Runtime::run_specialist(...) are implemented in crates/flowai-runtime's runtime::query module. For a normal query, the runtime finds the registered coordinator, builds a per-request AgentOrchestrator, and calls orchestrator.invoke(SubAgentRequest::new(entry_agent, prompt)). The per-request orchestrator wires request-scoped dependencies — a ChannelEventSink for streaming events back to Python, per-agent dispatchers for built-in toolkits and host tools, approval, cancellation, tracing, KV, catalog, target database, and sub-agent invocation extensions, and per-agent interpreters selected from the runtime provider config. None of that changes the system prompt; it determines the execution environment in which the prompt will run.

AgentOrchestrator::invoke(...) lives in crates/agent-fw-agent's orchestrator module. For each agent call, it builds framework conversation values. Stateful agents also load their persisted non-system message history before the current prompt (error mapping elided here):

let mut messages = Vec::new();
messages.push(ChatMessage::system(&registration.system_prompt));
if registration.stateful {
    let history = self.memory.load(&agent_tenant, &registration.name).await?;
    messages.extend(history);
}
messages.push(ChatMessage::user(&request.prompt));

let conversation = parse_conversation(messages)?;

let program = ChatProgram::new(
    conversation,
    registration.model.clone(),
    agent_tenant.clone(),
);

The key framework types are in crates/agent-fw-agent's conversation module:

  • SystemPrompt(String): a thin typed wrapper around the system prompt text.
  • Conversation: validated messages plus the current user prompt.
  • ChatProgram: a pure value containing Conversation, SystemPrompt, ModelId, and TenantContext.

parse_conversation(...) extracts the first system message into Conversation.system. ChatProgram::new(...) then copies that value into ChatProgram.system_prompt. System messages are never included in Conversation.history(), and stored memory never contains system messages — the system prompt always comes from the agent registration. This gives the framework a clean split:

  • conversation().prompt() is the current user task.
  • conversation().history() is non-system conversational history.
  • system_prompt() is the agent's behavioral instruction.
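The three-way split can be modelled in a few lines of Python. This mirrors the behaviour described above but is not a port of the Rust code: `Message`, `ConversationSketch`, and `parse_conversation_sketch` are hypothetical, and the validation rules (last message must be a user message, a system message must exist, extra system messages are dropped) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    role: str  # "system", "user", or "assistant"
    content: str


@dataclass(frozen=True)
class ConversationSketch:
    system: str     # the agent's behavioral instruction
    history: tuple  # prior non-system messages
    prompt: str     # the current user task


def parse_conversation_sketch(messages):
    """Split messages into system prompt, non-system history, and prompt."""
    if not messages or messages[-1].role != "user":
        raise ValueError("the last message must be the current user prompt")
    # Take the first system message as the system prompt.
    system = next((m.content for m in messages if m.role == "system"), None)
    if system is None:
        raise ValueError("a system message is required")
    non_system = [m for m in messages if m.role != "system"]
    return ConversationSketch(
        system=system,
        history=tuple(non_system[:-1]),
        prompt=non_system[-1].content,
    )


conversation = parse_conversation_sketch([
    Message("system", "You produce typed scenario plans."),
    Message("user", "Draft a Q3 plan."),
    Message("assistant", "Here is a draft."),
    Message("user", "Add a 5% promotion branch."),
])
```

In this example `conversation.history` holds the first user turn and the assistant reply, `conversation.prompt` is the last user message, and the system text appears only in `conversation.system`.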

Interpreter to LLM

The framework interpreter trait is in crates/agent-fw-agent's interpreter module:

pub trait ChatInterpreter: Send + Sync {
    fn interpret(
        &self,
        program: ChatProgram,
        cancel: CancellationToken,
    ) -> Pin<Box<dyn Stream<Item = StreamPart> + Send>>;
}

The shipped Anthropic implementation is in crates/agent-fw-interpreter's rig_chat module. It extracts the pieces from ChatProgram:

let prompt = program.conversation().prompt().as_str().to_string();
let history = conversation_to_rig_history(program.conversation());
let system_prompt = program.system_prompt().as_str().to_string();
let model = program.model().as_str().to_string();

Then the provider-specific runner gives Rig the system prompt as the agent preamble:

let mut builder = AgentBuilder::new(completion_model)
    .preamble(&system_prompt)
    .default_max_turns(max_turns);

let agent = builder.tools(dispatcher_rig_tools(dispatcher)).build();
agent.stream_chat(&prompt, history).with_hook(hook).await

For Anthropic, OpenAI-compatible, and Bedrock Rig paths, the same conceptual mapping applies:

  • System prompt -> Rig preamble(...).
  • Current user prompt -> stream_chat(&prompt, history).
  • Prior non-system messages -> Rig chat history.
  • Tools -> Rig tool definitions from the dispatcher, not string concatenation.

The provider client is the first place an LLM request can happen. Everything before ChatInterpreter::interpret(...) is spec construction, runtime assembly, or pure framework value construction. The framework deliberately keeps this boring: the system prompt is immutable configuration for an agent invocation. The text is rendered once in Python, then copied unchanged from RuntimeSpec to AgentRegistration to ChatProgram until the interpreter hands it to the provider as the preamble. Prompt text is data until the interpreter consumes the ChatProgram; all effects sit at the interpreter and tool-dispatch layers.

Tool descriptions versus executable tools

There are two tool surfaces that are easy to confuse:

| Surface | Where it appears | Purpose |
| --- | --- | --- |
| Prompt tool table | layered_prompt(tools=[...]) | Human-readable instructions inside the system prompt. |
| Runtime tool binding | define_*(..., tools=[...]), define_runtime(...), toolkits | Executable tool schema and handler registration. |

Including a tool in the prompt table does not register it with the runtime. Registering a tool with the runtime does not automatically insert a tool table into your prompt. In typical agents you do both: register the tool for execution, and include a concise prompt description when the agent needs policy or domain guidance about when to use it.
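Since the two surfaces are maintained separately, a project may want a check that they have not drifted apart. The helper below is a hypothetical sketch that works on plain lists of tool names; it is not part of flowai_harness, and how you obtain those names from your prompt and runtime definitions is left to your own code.

```python
def compare_tool_surfaces(prompt_tools, runtime_tools):
    """Report tool names that appear on only one of the two surfaces."""
    described = set(prompt_tools)
    registered = set(runtime_tools)
    return {
        # Described to the model, but no executable handler is registered.
        "described_but_not_registered": sorted(described - registered),
        # Executable, but the prompt gives no guidance on when to use it.
        "registered_but_not_described": sorted(registered - described),
    }


report = compare_tool_surfaces(
    prompt_tools=["search_products", "execute_query"],
    runtime_tools=["search_products", "export_csv"],
)
```

For the inputs above, `report` flags `execute_query` as described but not registered and `export_csv` as registered but not described. Whether either case is a problem depends on the agent; the second can be intentional when a tool needs no extra policy guidance.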

See also