
# Prompts

`layered_prompt` renders a deterministic system prompt from explicit layers and returns a
`LayeredPrompt` value with a SHA-256 cache key derived from the rendered text.

```python
from flowai_harness import define_tool, layered_prompt

domain_knowledge = {
    "entities": ["product", "segment", "channel"],
    "metrics": ["revenue", "margin"],
}

@define_tool(
    name="search_products",
    description="Search products by query.",
    input_schema={"query": str, "limit": int},
    approval="never",
)
async def search_products(args, ctx):
    ...

coordinator_prompt = layered_prompt(
    identity="You coordinate scenario planning for Acme customers.",
    communication="Be concise and surface approval points explicitly.",
    operational_rules=[
        "Route plan-building work to the planner.",
        "Route approved materialization work to the executor.",
    ],
    tools=[search_products],
    domain_knowledge=domain_knowledge,
    safety=["Never execute side-effecting tools without approval."],
    output_format={"events": ["plan_proposed", "approval_required", "tool_result"]},
    examples=[{"user": "Branch Q3 pricing into a 5% promotion for top EU SKUs."}],
)
```

## Why layering

- **Identity** is per-agent.
- **Domain knowledge** is shared structured prompt content, owned by customer/domain code.
- **Operational rules** are behavioral instructions separate from domain facts.
- **Tools** are auto-derived from tool specs and rendered consistently.

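```python
# Continues the example above. `scenario_plan` is assumed to be defined elsewhere.
from flowai_harness.agents import define_executor, define_planner

planner = define_planner(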
    name="scenario_planner",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    prompt=layered_prompt(
        identity="You produce typed scenario plans.",
        domain_knowledge=domain_knowledge,
    ),
)
executor = define_executor(
    name="scenario_executor",
    model="claude-sonnet-4-6",
    plan=scenario_plan,
    tools=[search_products],
    prompt=layered_prompt(
        identity="You execute approved scenario plans action by action.",
        operational_rules=["Execute only approved plans."],
        domain_knowledge=domain_knowledge,
        tools=[search_products],
    ),
)
```

## Section order

`layered_prompt` always renders sections in this fixed order, omitting empty sections:

1. **Identity** — who the agent is. The only required section.
2. **Communication** — tone, format, conventions.
3. **Operational Rules** — bullet list of behavioral rules.
4. **Tools** — auto-derived Markdown table of tool name, description, and approval policy.
5. **Domain Knowledge** — any structured value, rendered as JSON.
6. **Safety** — bullet list of guardrails.
7. **Output Format** — structured description of expected output.
8. **Examples** — structured examples.

Tenant identity is not rendered automatically. If a prompt needs domain facts, pass them through
`domain_knowledge`; if it needs operating constraints, pass them through `operational_rules`.
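
The fixed ordering and the omission of empty sections can be pictured with a small stand-in renderer. This is a simplified sketch, not the library's implementation: `render_sections` and `SECTION_ORDER` are hypothetical names, the `# Title` heading style is assumed from the `# Identity` and `# Tools` headings shown on this page, and the tools table is left out.

```python
import json

# Fixed section order from the list above (Tools omitted for brevity).
SECTION_ORDER = [
    ("identity", "Identity"),
    ("communication", "Communication"),
    ("operational_rules", "Operational Rules"),
    ("domain_knowledge", "Domain Knowledge"),
    ("safety", "Safety"),
    ("output_format", "Output Format"),
    ("examples", "Examples"),
]


def render_sections(**layers):
    """Hypothetical stand-in: render non-empty layers in SECTION_ORDER."""
    if not layers.get("identity"):
        raise ValueError("identity is the only required section")
    blocks = []
    for key, title in SECTION_ORDER:
        value = layers.get(key)
        if not value:  # omit empty or missing sections
            continue
        if isinstance(value, str):
            body = value
        elif key in ("communication", "operational_rules", "safety"):
            body = "\n".join(f"- {item}" for item in value)
        else:
            body = json.dumps(value, sort_keys=True, indent=2)
        blocks.append(f"# {title}\n{body}")
    return "\n\n".join(blocks)


# Keyword order at the call site does not matter; SECTION_ORDER decides.
text = render_sections(
    safety=["Never execute side-effecting tools without approval."],
    identity="You coordinate scenario planning.",
    communication="",
)
print(text)
```

In this sketch the empty `communication` layer produces no heading at all, and `Identity` still renders before `Safety` even though `safety` was passed first.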

When `define_runtime(...)` assembles agents, it also merges runtime-visible tools into the same
`# Tools` section:

- explicit `layered_prompt(..., tools=...)` rows;
- agent-bound Python tools passed to `define_coordinator(...)`,
  `define_planner(...)`, `define_executor(...)`, or `define_specialist(...)`
  through `tools=[...]`;
- enabled built-in toolkit tools from `toolkits=[...]`, including narrowed toolkit configs such as
  `ToolkitSpec(id="catalog", config={"tools": ["execute_query"]})`.

Duplicate tool names render once. Explicit prompt rows win so customer-authored descriptions are
not overwritten by generated toolkit descriptions.
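
The precedence rule can be sketched as a first-writer-wins merge keyed by tool name. `merge_tool_rows` and the `(name, description)` row shape are hypothetical and only illustrate the rule; the relative order of agent-bound and toolkit rows is an assumption.

```python
def merge_tool_rows(explicit, agent_bound, toolkit):
    """Hypothetical sketch: keep the first row seen for each tool name.

    Explicit prompt rows are visited first, so their descriptions win.
    """
    merged = {}
    for source in (explicit, agent_bound, toolkit):
        for name, description in source:
            merged.setdefault(name, description)
    return list(merged.items())


rows = merge_tool_rows(
    explicit=[("execute_query", "Run read-only catalog SQL for pricing questions.")],
    agent_bound=[("search_products", "Search products by query.")],
    toolkit=[("execute_query", "Generated toolkit description.")],
)
for name, description in rows:
    print(f"{name}: {description}")
```

Here `execute_query` appears once and keeps the customer-authored description, while the generated toolkit description is dropped.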

## Accepted section types

| Section | Type |
| --- | --- |
| `identity` | `str` (required) |
| `communication`, `operational_rules`, `safety` | `str`, or a sequence of strings (a sequence renders as a bullet list) |
| `tools` | iterable of `ToolSpec`, `str`, or mapping |
| `domain_knowledge`, `output_format`, `examples` | any JSON-serializable value, Pydantic model, or string |

Structured sections serialize with sorted keys, so dict ordering does not affect the cache key.
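
To see why sorted keys matter, the sketch below hashes a stand-in rendering of one structured section using only the standard library. `section_cache_key` is a hypothetical helper; the real cache key is derived from the full rendered prompt text, not from a single section.

```python
import hashlib
import json


def section_cache_key(value):
    """Hypothetical helper: SHA-256 over a sorted-key JSON rendering."""
    rendered = json.dumps(value, sort_keys=True)
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


a = {"entities": ["product", "segment"], "metrics": ["revenue", "margin"]}
b = {"metrics": ["revenue", "margin"], "entities": ["product", "segment"]}

# Same content, different insertion order: the keys match.
print(section_cache_key(a) == section_cache_key(b))

# List order is content, not key order, so reordering a list changes the key.
c = {"entities": ["segment", "product"], "metrics": ["revenue", "margin"]}
print(section_cache_key(a) == section_cache_key(c))
```

Sorting normalizes mapping keys only. If list order is not meaningful in your domain data, sort the list before passing it in.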

## How a prompt reaches the model

Prompt construction is split across two concerns:

- Python owns the ergonomic prompt authoring API and validates the runtime
  spec.
- Rust owns runtime assembly, agent orchestration, framework primitives, tool
  dispatch, and provider calls.

The system prompt text itself is not rebuilt at every layer. It is rendered
once by Python, copied into the native runtime spec, copied into the framework
agent registration, extracted into a `ChatProgram`, and finally passed to Rig
as the provider preamble:

```text
Python: layered_prompt(...) renders the prompt text
    -> define_coordinator / define_planner / define_executor / define_specialist
    -> AgentSpec.system_prompt
    -> define_runtime(...) -> RuntimeSpec
    -> create_runtime(...) serializes RuntimeSpec as camelCase JSON
    -> PyO3: _internal.create_runtime parses flowai_runtime::RuntimeSpec
    -> flowai-runtime: AgentSpec::to_registration
    -> agent-fw-agent: AgentRegistration.system_prompt
    -> Runtime::query -> AgentOrchestrator::invoke
    -> agent-fw-agent: ChatMessage::system + parse_conversation
    -> agent-fw-agent: ChatProgram.system_prompt
    -> agent-fw-interpreter: RigAnthropicChatInterpreter::interpret
    -> Rig: AgentBuilder.preamble(system_prompt)
    -> LLM request
```

### Python: agent definition and runtime spec

Agent constructors live in the harness module `flowai_harness.agents`. Each
`define_*` function accepts either a plain `str` or a `LayeredPrompt` and
normalizes the prompt with the same helper:

```python
def _prompt_text_and_cache_key(prompt):
    if isinstance(prompt, LayeredPrompt):
        return prompt.text, prompt.cache_key
    if isinstance(prompt, str):
        return prompt, None
    raise TypeError("prompt must be a string or LayeredPrompt")
```

The normalized text is stored as `AgentSpec.system_prompt` — at this point the
system prompt is just a field on a frozen Pydantic model. There is no provider,
no tool dispatcher, and no LLM call. One subtle point: `prompt_cache_key` is
retained on the Python `AgentSpec`, but it is excluded from the native runtime
wire shape. The Rust runtime receives the system prompt text, not the Python
cache key.

`define_runtime(...)` collects agents, references, plans, toolkits, approval
policies, storage descriptors, providers, and tool bindings into a pure
`RuntimeSpec`. When `create_runtime(...)` is called, Python serializes that
spec to camelCase JSON:

```python
json.dumps(runtime_spec.model_dump(by_alias=True, mode="json"))
```

The wire JSON contains each agent's `systemPrompt` field, along with the
role's `stateful` default. The optional `maxTurns` field is omitted when
unset:

```json
{
  "agents": [
    {
      "name": "scenario_planner",
      "role": "planner",
      "stateful": true,
      "model": {"id": "claude-sonnet-4-6", "provider": null},
      "systemPrompt": "# Identity\nYou produce typed scenario plans.",
      "routes": [],
      "toolkits": []
    }
  ]
}
```
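
The wire shape can be approximated with a standard-library sketch. It is not the Pydantic model the harness uses: `AgentSpecSketch`, `to_camel`, and `to_wire` are hypothetical names that only illustrate the three behaviors described in this section (camelCase field names, `prompt_cache_key` kept off the wire, and `maxTurns` omitted when unset).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


def to_camel(name):
    """Convert a snake_case field name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass(frozen=True)
class AgentSpecSketch:
    """Hypothetical stand-in for the Python-side agent spec."""

    name: str
    role: str
    stateful: bool
    system_prompt: str
    prompt_cache_key: Optional[str] = None  # Python-side only
    max_turns: Optional[int] = None
    routes: list = field(default_factory=list)

    def to_wire(self):
        data = asdict(self)
        data.pop("prompt_cache_key")   # never sent to the native runtime
        if data["max_turns"] is None:  # optional field omitted when unset
            data.pop("max_turns")
        return {to_camel(key): value for key, value in data.items()}


spec = AgentSpecSketch(
    name="scenario_planner",
    role="planner",
    stateful=True,
    system_prompt="# Identity\nYou produce typed scenario plans.",
    prompt_cache_key="abc123",
)
wire = spec.to_wire()
print(json.dumps(wire, indent=2))
```

The printed object carries `systemPrompt` but neither `promptCacheKey` nor `maxTurns`, matching the description above.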

`create_runtime(...)` also passes host tool callback maps, approval predicates,
event hooks, interpreter selection, and data environment config. Those are
runtime dependencies; they do not rewrite `systemPrompt`.

### The PyO3 bridge

The private PyO3 module lives in the harness extension crate (`src/lib.rs` of
`py-flowai-harness`). Its `create_runtime(...)` function parses the JSON into
`flowai_runtime::RuntimeSpec`, builds `RuntimeDeps`, attaches Python callback
tools as Rust `ToolHandler`s, selects an interpreter, and calls
`Runtime::new(spec, deps)`. The PyO3 layer is a boundary adapter: it does not
compose prompt text. Its job is to move the already-rendered `systemPrompt`
from Python JSON into Rust types, then attach effectful dependencies around
that pure spec.

### Rust runtime assembly

The native runtime spec is defined in the `crates/flowai-runtime` crate root.
Its `AgentSpec` has these fields:

```rust
pub struct AgentSpec {
    pub name: String,
    pub role: AgentRole,
    pub stateful: bool,
    pub model: ModelSpec,
    pub system_prompt: String,
    pub routes: Vec<String>,
    pub toolkits: Vec<String>,
    pub max_turns: Option<u32>,
}
```

Runtime assembly converts each harness `AgentSpec` into the generic framework
`AgentRegistration`:

```rust
fn to_registration(&self) -> AgentRegistration {
    AgentRegistration {
        name: self.name.clone(),
        model: ModelId::new(self.model.id.clone()),
        system_prompt: self.system_prompt.clone(),
        role: Some(self.role.to_agent_label()),
        stateful: self.stateful,
    }
}
```

This is where the harness stops owning prompt semantics. From here on, the
generic framework sees an agent name, model, optional role label, statefulness
flag, and system prompt string.

### Request-time orchestration

`Runtime::query(...)` and `Runtime::run_specialist(...)` are implemented in
`crates/flowai-runtime`'s `runtime::query` module. For a normal query, the
runtime finds the registered coordinator, builds a per-request
`AgentOrchestrator`, and calls
`orchestrator.invoke(SubAgentRequest::new(entry_agent, prompt))`.

The per-request orchestrator wires request-scoped dependencies:

- a `ChannelEventSink` for streaming events back to Python;
- per-agent dispatchers for built-in toolkits and host tools;
- approval, cancellation, tracing, KV, catalog, target database, and sub-agent
  invocation extensions;
- per-agent interpreters selected from the runtime provider config.

None of that changes the system prompt; it determines the execution
environment in which the prompt will run.

`AgentOrchestrator::invoke(...)` lives in `crates/agent-fw-agent`'s
`orchestrator` module. For each agent call, it builds framework conversation
values. Stateful agents also load their persisted non-system message history
before the current prompt (error mapping elided here):

```rust
let mut messages = Vec::new();
messages.push(ChatMessage::system(&registration.system_prompt));
if registration.stateful {
    let history = self.memory.load(&agent_tenant, &registration.name).await?;
    messages.extend(history);
}
messages.push(ChatMessage::user(&request.prompt));

let conversation = parse_conversation(messages)?;

let program = ChatProgram::new(
    conversation,
    registration.model.clone(),
    agent_tenant.clone(),
);
```

The key framework types are in `crates/agent-fw-agent`'s `conversation`
module:

- `SystemPrompt(String)`: a thin typed wrapper around the system prompt text.
- `Conversation`: validated messages plus the current user prompt.
- `ChatProgram`: a pure value containing `Conversation`, `SystemPrompt`,
  `ModelId`, and `TenantContext`.

`parse_conversation(...)` extracts the first system message into
`Conversation.system`. `ChatProgram::new(...)` then copies that value into
`ChatProgram.system_prompt`. System messages are never included in
`Conversation.history()`, and stored memory never contains system messages —
the system prompt always comes from the agent registration. This gives the
framework a clean split:

- `conversation().prompt()` is the current user task.
- `conversation().history()` is non-system conversational history.
- `system_prompt()` is the agent's behavioral instruction.
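
The split can be modeled in a few lines of Python. This mirrors the described behavior of the Rust `parse_conversation`, not its actual code: `split_conversation` and the `(role, text)` tuples are hypothetical, and the sketch assumes that the last message is the current user prompt and that a missing system message is an error.

```python
def split_conversation(messages):
    """Hypothetical model: return (system, history, prompt).

    `messages` is a list of (role, text) tuples ending in the current user prompt.
    """
    if not messages or messages[-1][0] != "user":
        raise ValueError("conversation must end with the current user prompt")
    # Take the first system message as the system prompt.
    system = next((text for role, text in messages if role == "system"), None)
    if system is None:  # assumption: a missing system message is an error
        raise ValueError("conversation needs a system message")
    # History excludes every system message and the trailing prompt.
    history = [(role, text) for role, text in messages[:-1] if role != "system"]
    return system, history, messages[-1][1]


system, history, prompt = split_conversation([
    ("system", "You produce typed scenario plans."),
    ("user", "Draft a Q3 pricing branch."),
    ("assistant", "Here is a draft plan."),
    ("user", "Add a 5% promotion for top EU SKUs."),
])
print(system)
print(len(history))
print(prompt)
```

Because the system text is taken from the registration on every call, changing an agent's prompt takes effect on the next invocation without rewriting stored history.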

### Interpreter to LLM

The framework interpreter trait is in `crates/agent-fw-agent`'s `interpreter`
module:

```rust
pub trait ChatInterpreter: Send + Sync {
    fn interpret(
        &self,
        program: ChatProgram,
        cancel: CancellationToken,
    ) -> Pin<Box<dyn Stream<Item = StreamPart> + Send>>;
}
```

The shipped Anthropic implementation is in `crates/agent-fw-interpreter`'s
`rig_chat` module. It extracts the pieces from `ChatProgram`:

```rust
let prompt = program.conversation().prompt().as_str().to_string();
let history = conversation_to_rig_history(program.conversation());
let system_prompt = program.system_prompt().as_str().to_string();
let model = program.model().as_str().to_string();
```

Then the provider-specific runner gives Rig the system prompt as the agent
preamble:

```rust
let mut builder = AgentBuilder::new(completion_model)
    .preamble(&system_prompt)
    .default_max_turns(max_turns);

let agent = builder.tools(dispatcher_rig_tools(dispatcher)).build();
agent.stream_chat(&prompt, history).with_hook(hook).await
```

For Anthropic, OpenAI-compatible, and Bedrock Rig paths, the same conceptual
mapping applies:

- System prompt -> Rig `preamble(...)`.
- Current user prompt -> `stream_chat(&prompt, history)`.
- Prior non-system messages -> Rig chat history.
- Tools -> Rig tool definitions from the dispatcher, not string concatenation.

The provider client is the first place an LLM request can happen. Everything
before `ChatInterpreter::interpret(...)` is spec construction, runtime
assembly, or pure framework value construction. The framework deliberately
keeps this boring: the system prompt is immutable configuration for an agent
invocation. The text is rendered once in Python, then copied unchanged from
`RuntimeSpec` to `AgentRegistration` to `ChatProgram` until the interpreter
hands it to the provider as the preamble. Prompt text is data until the
interpreter consumes the `ChatProgram`; all effects sit at the interpreter and
tool-dispatch layers.

## Tool descriptions versus executable tools

There are two tool surfaces that are easy to confuse:

| Surface | Where it appears | Purpose |
| --- | --- | --- |
| Prompt tool table | `layered_prompt(tools=[...])` | Human-readable instructions inside the system prompt. |
| Runtime tool binding | `define_*(..., tools=[...])`, `define_runtime(...)`, toolkits | Executable tool schema and handler registration. |

Including a tool in the prompt table does not register it with the runtime.
Registering a tool with the runtime does not automatically insert a tool table
into your prompt. In typical agents you do both: register the tool for
execution, and include a concise prompt description when the agent needs policy
or domain guidance about when to use it.

## See also

- [`layered_prompt` reference](/docs/reference/prompts#flowai_harness.prompts.layered_prompt)
- [Debugging System Prompts](/docs/guides/system-prompts) — find the stage where a prompt went missing or stale.
- [Tenant](/docs/concepts/tenant) — runtime identity is separate from prompt content.
- [Tools](/docs/concepts/tools) — the `tools` argument auto-derives the Markdown table.